{"id":163453,"date":"2011-08-01T00:00:00","date_gmt":"2011-08-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/bootstrapping-domain-detection-using-query-click-logs-for-new-domains\/"},"modified":"2018-10-16T22:03:03","modified_gmt":"2018-10-17T05:03:03","slug":"bootstrapping-domain-detection-using-query-click-logs-for-new-domains","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/bootstrapping-domain-detection-using-query-click-logs-for-new-domains\/","title":{"rendered":"Bootstrapping Domain Detection Using Query Click Logs for New Domains"},"content":{"rendered":"
\n

Domain detection in spoken dialog systems is usually treated as a multi-class, multi-label classification problem, and training of domain classifiers requires collection and manual annotation of example utterances. In order to extend a dialog system to new domains in a way that is seamless for users, domain detection should be able to handle utterances from the new domain as soon as it is introduced. In this work, we propose using web search query logs, which include queries entered by users and the links they subsequently click on, to bootstrap domain detection for new domains. While sampling user queries from the query click logs to train new domain classifiers, we introduce two types of measures based on the behavior of the users who entered a query and the form of the query. We show that both types of measures result in reductions in the error rate as compared to randomly sampling training queries. In controlled experiments over five domains, we achieve the best gain from the combination of the two types of sampling criteria.<\/p>\n<\/div>\n

<\/p>\n","protected":false},"excerpt":{"rendered":"

Domain detection in spoken dialog systems is usually treated as a multi-class, multi-label classification problem, and training of domain classifiers requires collection and manual annotation of example utterances. In order to extend a dialog system to new domains in a way that is seamless for users, domain detection should be able to handle utterances from […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13545],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-163453","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_publishername":"International Speech Communication Association","msr_edition":"Proceedings of Interspeech","msr_affiliation":"","msr_published_date":"2011-08-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"Proceedings of Interspeech","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"220090","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"Dilek-IS11.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2011\/08\/Dilek-IS11.pdf","id":220090,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":220090,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2011\/08\/Dilek-IS11.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"dilekha","user_id":31630,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=dilekha"},{"type":"user_nicename","value":"gokhant","user_id":31896,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=gokhant"},{"type":"user_nicename","value":"lheck","user_id":32659,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=lheck"},{"type":"user_nicename","value":"elshribe","user_id":31734,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=elshribe"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[171393,171150],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":171393,"post_title":"Knowledge Graphs and Linked Big Data Resources for Conversational Understanding","post_name":"knowledge-graphs-and-linked-big-data-resources-for-conversational-understanding","post_type":"msr-project","post_date":"2014-08-13 20:10:32","post_modified":"2017-06-19 11:05:46","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/knowledge-graphs-and-linked-big-data-resources-for-conversational-understanding\/","post_excerpt":"Interspeech 2014 Tutorial Web Page State-of-the-art statistical spoken language processing typically requires significant manual effort to construct domain-specific schemas (ontologies) as well as manual effort to annotate training data against these schemas. At the same time, a recent surge of activity and progress on semantic web-related concepts from the large search-engine companies represents a potential alternative to the manually intensive design of spoken language processing systems. Standards such as schema.org have been established for schemas…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171393"}]}},{"ID":171150,"post_title":"Spoken Language Understanding","post_name":"spoken-language-understanding","post_type":"msr-project","post_date":"2013-05-01 11:46:32","post_modified":"2019-08-19 14:48:51","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/spoken-language-understanding\/","post_excerpt":"Spoken language understanding (SLU) is an emerging field in between the areas of speech processing and natural language processing. The term spoken language understanding has largely been coined for targeted understanding of human speech directed at machines. This project covers our research on SLU tasks such as domain detection, intent determination, and slot filling, using data-driven methods. Projects Deeper Understanding: Moving\u00a0beyond shallow targeted understanding towards building domain independent SLU models. Scaling SLU: Quickly bootstrapping SLU…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171150"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/163453","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/163453\/revisions"}],"predecessor-version":[{"id":541605,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/163453\/revisions\/541605"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=163453"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=163453"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=163453"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=163453"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=163453"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=163453"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=163453"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=163453"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=163453"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=163453"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=163453"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=163453"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=163453"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=163453"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=163453"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=163453"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}