{"id":184238,"date":"2005-02-21T00:00:00","date_gmt":"2009-10-31T13:31:20","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/controlling-overlap-in-content-oriented-xml-retrieval\/"},"modified":"2016-09-09T10:00:42","modified_gmt":"2016-09-09T17:00:42","slug":"controlling-overlap-in-content-oriented-xml-retrieval","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/controlling-overlap-in-content-oriented-xml-retrieval\/","title":{"rendered":"Controlling Overlap in Content-Oriented XML Retrieval"},"content":{"rendered":"
\n

The representation of documents in XML provides an opportunity for information retrieval systems to take advantage of document structure, returning individual document components when appropriate, rather than complete documents in all circumstances. In response to a user query, an XML information retrieval system might return a mixture of paragraphs, sections, articles, bibliographic entries and other components. This facility is of particular benefit when a collection contains very long documents, such as product manuals or books, where the user should be directed to the most relevant portions of these documents.<\/p>\n

The direct application of standard ranking techniques to retrieve individual elements from a collection of XML documents often produces a result set in which the top ranks are dominated by a large number of elements taken from a small number of highly relevant documents. This paper presents and evaluates an algorithm that re-ranks this result set, with the aim of minimizing redundant content while preserving the benefits of element retrieval, including the benefit of identifying topic-focused components contained within relevant documents. Test collections developed by the INitiative for the Evaluation of XML Retrieval (INEX) form the basis for the evaluation.<\/p>\n<\/div>\n

","protected":false},"excerpt":{"rendered":"

The representation of documents in XML provides an opportunity for information retrieval systems to take advantage of document structure, returning individual document components when appropriate, rather than complete documents in all circumstances. In response to a user query, an XML information retrieval system might return a mixture of paragraphs, sections, articles, bibliographic entries and other […]<\/p>\n","protected":false},"featured_media":290642,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"research-area":[],"msr-video-type":[],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-184238","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/Wt5xzvmf-hQ","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/184238"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/184238\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/290642"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=184238"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=184238"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=184238"},{"taxonomy":"msr-locale","embeddable"
:true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=184238"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=184238"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=184238"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}