{"id":156790,"date":"2009-04-01T00:00:00","date_gmt":"2009-04-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/exploiting-web-search-engines-to-search-structured-information\/"},"modified":"2018-10-16T21:10:32","modified_gmt":"2018-10-17T04:10:32","slug":"exploiting-web-search-engines-to-search-structured-information","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/exploiting-web-search-engines-to-search-structured-information\/","title":{"rendered":"Exploiting Web Search Engines to Search Structured Information"},"content":{"rendered":"
\n

Web search engines leverage information from structured databases to answer queries. For example, many product related queries on search engines (Amazon, Google, Yahoo, Live Search) are answered by searching underlying product databases containing names, descriptions, specifications, and reviews of products. However, these vertical search engines are \u201csilo-ed\u201d in that their results are independent of those from web search. This often leads to empty or incomplete results, as query terms are matched against the information in the underlying database only. In order to overcome this problem, we propose an approach that first identifies relationships between web documents and items in structured databases. This allows us to subsequently exploit results from web search engines in combination with these relationships to obtain the structured data items relevant for a much wider range of queries. We propose an architecture that implements the integrated search functionality efficiently, adding very little additional overhead to query processing and is fully integrated with the search engine architecture. We demonstrate the quality of our techniques through an extensive experimental study.<\/p>\n<\/div>\n

<\/p>\n","protected":false},"excerpt":{"rendered":"

Web search engines leverage information from structured databases to answer queries. For example, many product related queries on search engines (Amazon, Google, Yahoo, Live Search) are answered by searching underlying product databases containing names, descriptions, specifications, and reviews of products. However, these vertical search engines are \u201csilo-ed\u201d in that their results are independent of those […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13555],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-156790","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-search-information-retrieval","msr-locale-en_us"],"msr_publishername":"Association for Computing Machinery, Inc.","msr_edition":"18th International World Wide Web Conference (WWW 2009)","msr_affiliation":"","msr_published_date":"2009-04-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"207754","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"www09_final_structured_db_search.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/www09_final_structured_db_search.pdf","id":207754,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":207754,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/www09_final_structured_db_search.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"sagrawal","user_id":33490,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=sagrawal"},{"type":"user_nicename","value":"surajitc","user_id":33764,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=surajitc"},{"type":"user_nicename","value":"kaushik","user_id":32503,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=kaushik"},{"type":"text","value":"Venkatesh Ganti","user_id":0,"rest_url":false},{"type":"user_nicename","value":"dongxin","user_id":31666,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=dongxin"},{"type":"user_nicename","value":"chrisko","user_id":31427,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=chrisko"},{"type":"text","value":"Arnd Christian K\u00f6nig","user_id":0,"rest_url":false},{"type":"user_nicename","value":"vganti","user_id":34554,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=vganti"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[957177],"msr_project":[170672,169514],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":170672,"post_title":"Entity Search and Query Portals","post_name":"entity-search-and-query-portals","post_type":"msr-project","post_date":"2011-03-07 19:19:42","post_modified":"2017-06-08 16:59:10","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/entity-search-and-query-portals\/","post_excerpt":"The goal of entity search is to return entities (e.g., people, products, locations) relevant to a keyword query. The goal of Query Portals is to go one step further and return not only the names of relevant entities but a rich set of information associated with each entity. Often, users issuing keyword searches are not looking for documents but for entities\u00a0residing in a\u00a0structured database. Consider a user searching for products (product search), people (expert search\/celebrity…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/170672"}]}},{"ID":169514,"post_title":"Data Exploration","post_name":"data-exploration","post_type":"msr-project","post_date":"2004-06-08 15:56:40","post_modified":"2017-06-06 10:57:58","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/data-exploration\/","post_excerpt":"This is a project area rather than a specific project. This project area focuses on novel ways to query, browse, extract, explore, mine and manage various kinds of data residing within the enterprise and on the web: structured data in relational databases, tabular data embedded in web pages, enterprise documents and spreadsheets as well as unstructured data in query logs, text documents and social media. Our research is relevant to both enterprise and consumer scenarios…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169514"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/156790","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/156790\/revisions"}],"predecessor-version":[{"id":533526,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/156790\/revisions\/533526"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=156790"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=156790"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=156790"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=156790"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=156790"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=156790"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=156790"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=156790"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=156790"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=156790"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=156790"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=156790"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=156790"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=156790"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=156790"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=156790"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}