Google is building what is described as the world's largest database of facts, assembled without any form of human editing or intervention. The database is named the "Knowledge Vault". It will expand and enhance itself automatically by pulling in information from all over the web, and it is meant to give Google the information required to process queries based on entities and their relationships. This will take Google a step closer to becoming an answer engine. Conversational queries like:
"When was Shakespeare born?"
"How far is London from New York?"
"Who is the author of Harry Potter?"
could be answered directly from the information contained in the Knowledge Vault.
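To make the idea of entities and relationships concrete, here is a minimal Python sketch of how a fact store might answer such queries. The `FACTS` dictionary, the predicate names, and the `answer` function are illustrative assumptions, not Google's actual data model.

```python
# Toy fact store keyed by (entity, relationship). All names are hypothetical.
FACTS = {
    ("William Shakespeare", "born_in_year"): "1564",
    ("Harry Potter", "author"): "J. K. Rowling",
}


def answer(entity, relationship):
    """Return the stored value for an entity/relationship pair, or None."""
    return FACTS.get((entity, relationship))


if __name__ == "__main__":
    # "Who is the author of Harry Potter?" becomes a lookup on
    # the entity "Harry Potter" and the relationship "author".
    print(answer("Harry Potter", "author"))
```

The point of the sketch is only that a conversational question reduces to an entity plus a relationship, which can then be looked up as a stored fact.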
The database that Google currently uses to answer such queries is the Knowledge Graph. It is based on human-edited sources such as Wikipedia and Freebase.
Knowledge Vault has pulled in 1.6 billion facts to date, drawn from text, tabular data, page structure, and human annotations. It is composed of 3 major components:
Extractors - These extract triples (subject, predicate, object) from the web and assign a confidence score to each one.
Priors - These learn the prior probability of each possible triple from facts that are already known.
Knowledge Fusion - This combines the extractor scores and the priors to compute the final probability that a triple is true.
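The three components above can be sketched roughly in Python. This is only an illustration: the `Triple` class, the `fuse` function, and its weights are hypothetical, and a simple logistic combination stands in for whatever learned model Knowledge Vault actually uses.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A candidate fact: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


def fuse(extractor_confidences, prior, w_extract=2.0, w_prior=1.0, bias=-1.5):
    """Combine extractor scores and a prior into a single probability.

    The weights are made up for illustration; a real system would learn them.
    """
    # Use the strongest extractor signal; 0.0 if no extractor saw the triple.
    best = max(extractor_confidences, default=0.0)
    z = w_extract * best + w_prior * prior + bias
    return 1.0 / (1.0 + math.exp(-z))  # logistic function, always in (0, 1)


if __name__ == "__main__":
    fact = Triple("Harry Potter", "author", "J. K. Rowling")
    # Two hypothetical extractors found this triple; the prior also favours it.
    probability = fuse([0.9, 0.6], prior=0.8)
    print(fact, round(probability, 3))
```

In this sketch a triple that several extractors report with high confidence, and that the prior also considers plausible, ends up with a higher fused probability than one supported by weak evidence.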
Knowledge Vault also uses the path ranking algorithm (PRA) to help compute the priors, and LCWA (Local Closed World Assumption) labels to train and evaluate the system.
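As a rough illustration of LCWA labelling: a candidate triple is treated as true if it already exists in a trusted knowledge base, as false if the knowledge base holds other values for the same subject and predicate but not this one, and as unknown otherwise. The sketch below assumes a hypothetical `kb` dictionary and `lcwa_label` function; it is not code from Knowledge Vault.

```python
def lcwa_label(triple, kb):
    """Label a candidate (subject, predicate, object) triple under the LCWA.

    kb maps (subject, predicate) to the set of objects already known.
    Returns True, False, or None (unknown, so excluded from training).
    """
    subject, predicate, obj = triple
    known_objects = kb.get((subject, predicate), set())
    if obj in known_objects:
        return True   # the knowledge base already contains this fact
    if known_objects:
        return False  # other values are known, so assume this one is wrong
    return None       # nothing known for this subject/predicate pair


if __name__ == "__main__":
    # Hypothetical trusted knowledge base.
    kb = {("Barack Obama", "born_in"): {"Honolulu"}}
    for candidate in [
        ("Barack Obama", "born_in", "Honolulu"),
        ("Barack Obama", "born_in", "Kenya"),
        ("Barack Obama", "spouse", "Michelle Obama"),
    ]:
        print(candidate, lcwa_label(candidate, kb))
```

The "local" part is that the knowledge base is assumed complete only for subject and predicate pairs it already covers; everywhere else the label stays unknown rather than false.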
Sources and citations:
http://searchenginewatch.com/article/2362128/Move-Over-Google-Knowledge-Graph-Here-Comes-Knowledge-Vault
http://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf
http://www.newscientist.com/article/mg22329832.700-googles-factchecking-bots-build-vast-knowledge-bank.html
Also See:
How Does Google Applies Semantic Search?
Latent Semantic Indexing
Facebook Graph Search Optimization
Getting Listed on Search Engines
Universal Analytics
"When was Shakespeare born?"
"How far is London from New York?"
"Who is the author of Harry Potter" etc.
would be answered perfectly by the information contained in the Knowledge Vault.
The current database that Google uses to answer queries is the Knowledge Graph. It is based upon human edited databases like Wikipedia,Freebase and many such sources.
Knowledge Vault has pulled in 1.6 billion facts to date and is built on text, tabular data, page structure, and human annotations. It is composed of 3 major components:
Extractors - Helps to extracts triplets from the web and assigns a confidence score to it.
Priors - These help to learn the prior probability of possible triples.
Knowledge Fusion - It determines if the probability extracted by the extractor and the priors are true.
It also uses the path ranking algorithm approach and LCWA (Local Closed World Assumption) labels.
Sources and citations:
http://searchenginewatch.com/article/2362128/Move-Over-Google-Knowledge-Graph-Here-Comes-Knowledge-Vault
http://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf
http://www.newscientist.com/article/mg22329832.700-googles-factchecking-bots-build-vast-knowledge-bank.html
Also See:
How Does Google Applies Semantic Search?
Latent Semantic Indexing
Facebook Graph Search Optimization
Getting Listed on Search Engines
Universal Analytics
Google Indepth Articles
Google Query Processing by Identifying Entities
How Google Identifies Substitute Terms of a Query?
Google Patent to Identify Erroneous Business Listings
How Google Identifies Spam in Information Collected From a Source?
Google Patent Named Ranking Documents to Penalize Spammers
Taxonomic Classification While Finding Context of Search Query
Google Granted Patent for Detecting Hidden Texts and Hidden Links