{"id":306086,"date":"2010-07-19T09:00:54","date_gmt":"2010-07-19T16:00:54","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=306086"},"modified":"2016-10-15T18:41:14","modified_gmt":"2016-10-16T01:41:14","slug":"quest-quality-searches","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/quest-quality-searches\/","title":{"rendered":"The Quest for Quality Searches"},"content":{"rendered":"

By Janie Chang, Writer, Microsoft Research<\/em><\/p>\n

When the Association for Computing Machinery\u2019s (ACM\u2019s) Special Interest Group on Information Retrieval (SIGIR<\/a>) holds a conference, it must be difficult for participants to decide which sessions to attend, because creating easy, effective search experiences these days involves challenges potentially as diverse as dealing with multimedia, social media, relevance judgments, unstructured searches, and massive scalability. The 33rd annual ACM SIGIR Conference<\/a>, being held at the University of Geneva from July 19-23, features a busy schedule of tutorials, workshops, and presentations of research papers that explore these topics.<\/p>\n

The increasingly multidisciplinary nature of this subject is reflected in the eighty-seven papers accepted for this year\u2019s conference. Fifteen submissions from Microsoft alone represent 10 groups from four research facilities\u2014Microsoft Research Redmond<\/a>, Microsoft Research Cambridge<\/a>, Microsoft Research Asia<\/a>, and Microsoft Research India<\/a>\u2014as well as the Internet Services Research Center<\/a> and Bing<\/a>.<\/p>\n

Image Search by Concept Map<\/em><\/a>\u2014by Hao Xu of the University of Science and Technology of China and Jingdong Wang<\/a>, Xian-Sheng Hua, and Shipeng Li of Microsoft Research Asia\u2014is an example of how the use of multimedia has increased the complexity of information-retrieval problems. Digital images are, after text, the second-most prevalent media on the Web. The challenge for these researchers was to devise a more intuitive way for users to query for images.<\/p>\n

Hua, lead researcher with the Media Computing Group<\/a>, wants to overcome the limitations of existing image-search engines, which depend on the metadata of web images, and that metadata rarely contains spatial information. Although the Image Search by Color Sketch<\/a> feature in Bing addresses spatial relationships between colors in an image, his team wanted to convey semantic intention.<\/p>\n

\u201cThis is a totally new way of searching web images,\u201d Hua says, \u201cwhen compared to text-box-based image searches. In this model, we allow users to specify the spatial positions of the query terms. The typed keywords indicate the desired visual concepts, or objects, within the image. The spatial relation of the keywords indicates the desired layout of the visual contents. We translate from a concept map to a visual instance map.\u201d<\/p>\n


This sample query illustrates how a user would search for \u201cimages containing a butterfly on the top left of a flower.\u201d<\/p><\/div>\n

Given his interest in multimedia search, it\u2019s not surprising that Hua is also part of a team researching music-information retrieval. Along with fellow researchers Jialie Shen and HweeHwa Pang of Singapore Management University, Meng Wang of Microsoft Research Asia, and Shuicheng Yan of the National University of Singapore, Hua has co-authored a paper titled Effective Music Tagging Through Advanced Statistical Modeling<\/em><\/a>. The work addresses the challenge of managing large music archives through knowledge representation of music documents.<\/p>\n

Music search and recommendation require compact but comprehensive textual annotations, or tags, to describe a musical piece\u2019s content and semantics. As with all searches, retrieved results are only as good as the tags that describe the item\u2019s content. Manual tagging of music files in a large collection is expensive and time-consuming, making automated music tagging an important research area. High-level semantic concepts such as genre and mood are difficult to derive from the physical properties of music, however, so the researchers combined advanced musical feature-extraction techniques with high-level semantic-concept modeling.<\/p>\n


Xian-Sheng Hua<\/p><\/div>\n

\u201cWe propose a multilayer approach to modeling the content of music,\u201d Hua says. \u201cIt bridges the gap between music content and semantic tags. Musical content is very rich, and we need to capture features such as timbral texture, harmony, rhythm structure, instrument, and pitch. Low-level acoustic characteristics are too simplistic for accurate representation.\u201d<\/p>\n

The results showed that their approach delivered substantially more accurate and more robust annotation than existing methods. Even so, Hua concludes that this is a challenging line of research that still has a long way to go.<\/p>\n

\u201cSeveral tags involve domain knowledge, such as the instrument and mood,\u201d he explains. \u201cEven when you employ professional musicians to label a data set manually, they may need to listen to the music multiple times. In some cases, the labelers may need to have a discussion before establishing the final tags of a piece.\u201d<\/p>\n

Search and retrieval by color, image content, spatial relationships, and musical mood and genre are only a few of the technical challenges that SIGIR 2010 will address, and these difficult topics are what drive some of the most novel work in computing research today.<\/p>\n

A Track Record of Support<\/h2>\n

This prestigious conference showcases the most innovative thinking in information retrieval and draws significant support from Microsoft Research, a gold sponsor of the event. In addition to the 15 accepted papers, Gary Flake, technical fellow at Microsoft, will deliver the first keynote address of SIGIR 2010. Microsoft researchers also have committed time to SIGIR in other capacities, chairing three of the technical-paper sessions, giving four tutorials, and organizing three workshops.<\/p>\n

Kuansan Wang<\/a>, principal researcher at Microsoft Research Redmond, is pleased to have the opportunity to run the Web N-gram Workshop, organized with colleagues Chengxiang Zhai from the University of Illinois at Urbana-Champaign, David Yarowsky of Johns Hopkins University, Evelyne Viegas<\/a> of Microsoft Research Redmond, and Stephan Vogel of Carnegie Mellon University.<\/p>\n

Wang hopes the workshop will encourage researchers to use the Web N-gram Services<\/a> hosted by Microsoft Research, which comprise algorithms, implementations, and petabytes of regularly updated data.<\/p>\n

\u201cThe SIGIR Web N-gram Workshop,\u201d he says, \u201cwill feature some of the research coming from using the Microsoft Web N-gram Services which went in public beta worldwide on April 28. The service-based distribution model enables us to update the data to keep up with the fast pace at which the web is changing. At the workshop we will announce several new features about the service.\u201d<\/p>\n

The effort that Microsoft Research devotes to advancing the state of the art in information retrieval is reflected in the fact that four of this year\u2019s eight nominations for the SIGIR best-paper award were written in whole or in part by Microsoft Research scientists.<\/p>\n

Papers from Microsoft Research accepted for SIGIR 2010 (* best-paper nominee):<\/p>\n

Adaptive Near-Duplicate Detection via Similarity Learning<\/em><\/strong> Hannaneh Hajishirzi, University of Illinois at Urbana-Champaign; Wen-tau Yih, Microsoft Research Redmond; and Aleksander Kolcz, Microsoft<\/p>\n

Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs<\/em><\/strong><\/a>*<\/em><\/strong> Ryen White, Microsoft Research Redmond; and Jeff Huang, University of Washington<\/p>\n

Collecting High Quality Overlapping Labels at Low Cost<\/em><\/strong><\/a> Hui Yang, Carnegie Mellon University; Anton Mityagin, Microsoft; Krysta Svore, Microsoft Research Redmond; and Sergey Markov, Microsoft<\/p>\n

Comparing the Sensitivity of Information Retrieval Metrics<\/em><\/strong><\/a>*<\/em><\/strong> Filip Radlinski, Microsoft Research Cambridge; and Nick Craswell, Microsoft Research Redmond<\/p>\n

Context-Aware Ranking in Web Search<\/em><\/strong><\/a> Biao Xiang, University of Science and Technology of China; Daxin Jiang, Microsoft Research Asia; Jian Pei, Simon Fraser University; Xiaohui Sun, Microsoft; Enhong Chen, University of Science and Technology of China; and Hang Li, Microsoft Research Asia<\/p>\n

Effective Music Tagging Through Advanced Statistical Modeling<\/em><\/strong><\/a> Jialie Shen, Singapore Management University; Meng Wang, Microsoft Research Asia; Shuicheng Yan, National University of Singapore; HweeHwa Pang, Singapore Management University; and Xian-Sheng Hua, Microsoft Research Asia<\/p>\n

Extending Average Precision to Graded Relevance Judgments<\/em><\/strong><\/a>*<\/em><\/strong> Stephen Robertson, Microsoft Research Cambridge; Evangelos Kanoulas, University of Sheffield; and Emine Yilmaz, Microsoft Research Cambridge<\/p>\n

How Good Is a Span of Terms?\u00a0 Exploiting Proximity to Improve Web Retrieval<\/em><\/strong><\/a> Krysta Svore, Microsoft Research Redmond; Pallika Kanani, University of Massachusetts Amherst; and Nazan Khan, Microsoft<\/p>\n

Image Search by Concept Map<\/em><\/strong><\/a> Hao Xu, University of Science and Technology of China; Jingdong Wang, Microsoft Research Asia; Xian-Sheng Hua, Microsoft Research Asia; and Shipeng Li, Microsoft Research Asia<\/p>\n

Incorporating Post-Click Behaviors Into a Click Model<\/em><\/strong><\/a> Feimin Zhong, Tsinghua University; Dong Wang, Tsinghua University; Gang Wang, Microsoft Research Asia; Weizhu Chen, Microsoft Research Asia; Yuchen Zhang, Microsoft Research Asia; Zheng Chen, Microsoft Research Asia; and Haixun Wang, Microsoft Research Asia<\/p>\n

Multi-Style Language Model for Web Scale Information Retrieval<\/em><\/strong><\/a>*<\/em><\/strong> Kuansan Wang, Microsoft Research Redmond; Jianfeng Gao, Microsoft Research Redmond; and Xiaolong Li, Microsoft Research Redmond<\/p>\n

Studying Trailfinding Algorithms for Enhanced Web Search<\/em><\/strong><\/a> Adish Singla, Microsoft; Ryen White, Microsoft Research Redmond; and Jeff Huang, University of Washington<\/p>\n

The Good, the Bad, and the Random: An Eye-Tracking Study of Ad Quality in Web Search<\/em><\/strong><\/a> Georg Buscher, Deutsches Forschungszentrum f\u00fcr K\u00fcnstliche Intelligenz; Susan Dumais, Microsoft Research Redmond; and Edward Cutrell, Microsoft Research India<\/p>\n

Understanding Web Browsing Behaviors Through Weibull Analysis of Dwell Time<\/em><\/strong><\/a> Chao Liu, Microsoft Research Redmond; Ryen White, Microsoft Research Redmond; and Susan Dumais, Microsoft Research Redmond<\/p>\n

Visual Summarization of Web Pages<\/em><\/strong> Binxing Jiao, University of Science and Technology of China; Linjun Yang, Microsoft Research Asia; Jizheng Xu, Microsoft Research Asia; and Feng Wu, Microsoft Research Asia<\/p>\n","protected":false},"excerpt":{"rendered":"

By Janie Chang, Writer, Microsoft Research When the Association for Computing Machinery\u2019s (ACM\u2019s) Special Interest Group on Information Retrieval (SIGIR) holds a conference, it must be difficult for participants to decide which sessions to attend, because creating easy, effective search experiences these days involves challenges potentially as diverse as dealing with multimedia, social media, relevance […]<\/p>\n","protected":false},"author":39507,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[194460],"tags":[194561,194719,186604,214520,186898,214523,196575,214517,203829,203961],"research-area":[13555],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-306086","post","type-post","status-publish","format-standard","hentry","category-search-and-information-retrieval","tag-acm","tag-association-for-computing-machinery","tag-bing","tag-concept-map","tag-image-search","tag-information-retrieval-problems","tag-multimedia","tag-search-experiences","tag-sigir","tag-special-interest-group-on-information-retrieval","msr-research-area-search-information-retrieval","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199560,199561,199562,199565],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144848],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"July 19, 2010","formattedExcerpt":"By Janie Chang, Writer, Microsoft Research When the Association for Computing 
Machinery\u2019s (ACM\u2019s) Special Interest Group on Information Retrieval (SIGIR) holds a conference, it must be difficult for participants to decide which sessions to attend, because creating easy, effective search experiences these days involves challenges…","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/306086"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/39507"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=306086"}],"version-history":[{"count":5,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/306086\/revisions"}],"predecessor-version":[{"id":306119,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/306086\/revisions\/306119"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=306086"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=306086"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=306086"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=306086"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=306086"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=306086"},{"taxonomy":"msr-locale
","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=306086"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=306086"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=306086"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=306086"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=306086"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}