{"id":357914,"date":"2017-01-25T14:40:26","date_gmt":"2017-01-25T22:40:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=357914"},"modified":"2018-10-16T20:14:54","modified_gmt":"2018-10-17T03:14:54","slug":"achieving-human-parity-conversational-speech-recognition-3","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/achieving-human-parity-conversational-speech-recognition-3\/","title":{"rendered":"Achieving Human Parity in Conversational Speech Recognition"},"content":{"rendered":"
The Switchboard-1 Telephone Speech Corpus was originally collected by Texas Instruments in 1990-91, under DARPA sponsorship, and marked the beginning of over 25 years of intensive effort in conversational speech recognition. Recently, we have measured the ability of professional transcribers to transcribe this sort of data, and found that our latest systems have achieved the same level of performance. In this talk, I will describe the key technological advances that have made this possible: the systematic use of CNN and LSTM models in both acoustic and language modeling, as well as the extensive use of system combination. The talk will also provide an analysis of the errors made by people and computers, which show substantially similar error patterns, with the exception of confusions between backchannel acknowledgments and hesitations.<\/span><\/span><\/p>\n","protected":false},"excerpt":{"rendered":" The Switchboard-1 Telephone Speech Corpus was originally collected by Texas Instruments in 1990-91, under DARPA sponsorship, and marked the beginning of over 25 years of intensive effort in conversational speech recognition. 
Recently, we have measured the ability of professional transcribers to transcribe this sort of data, and found that our latest systems have achieved the […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13545],"msr-publication-type":[193724],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-357914","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_publishername":"","msr_edition":"Invited talk, IEEE SLT Workshop","msr_affiliation":"","msr_published_date":"2016-12-14","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"IEEE Spoken Language Technology Workshop","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"Invited talk, IEEE SLT Workshop","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"357944","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"HumanParity","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/HumanParity.pdf","id":357944,"label_id":0}],"msr_related_uploader":"","msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Geoff 
Zweig","user_id":0,"rest_url":false},{"type":"user_nicename","value":"weixi","user_id":34811,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=weixi"},{"type":"user_nicename","value":"jdroppo","user_id":32211,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=jdroppo"},{"type":"user_nicename","value":"xdh","user_id":34869,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=xdh"},{"type":"user_nicename","value":"fseide","user_id":31826,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=fseide"},{"type":"user_nicename","value":"mseltzer","user_id":33017,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=mseltzer"},{"type":"user_nicename","value":"anstolck","user_id":31054,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=anstolck"},{"type":"user_nicename","value":"dongyu","user_id":31667,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=dongyu"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[664548],"msr_project":[350126],"publication":[],"video":[],"download":[],"msr_publication_type":"miscellaneous","related_content":{"projects":[{"ID":350126,"post_title":"Human Parity in Speech Recognition","post_name":"human-parity-speech-recognition","post_type":"msr-project","post_date":"2017-01-10 11:44:06","post_modified":"2019-08-19 10:12:03","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/human-parity-speech-recognition\/","post_excerpt":"This ongoing project aims to drive the state of the art in speech recognition toward \u00a0matching, and ultimately surpassing, humans, with a focus on 
unconstrained conversational speech.\u00a0\u00a0 The goal is a moving target as the scope of the task is broadened from high signal-to-noise speech between strangers (like in the Switchboard corpus) to\u00a0include\u00a0scenarios that make\u00a0recognition more challenging, such\u00a0as:\u00a0 conversation\u00a0among familiar speakers, multi-speaker meetings, and speech captured in noisy or distant-microphone environments. Related DataSkeptic podcast interview…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/350126"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/357914","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/357914\/revisions"}],"predecessor-version":[{"id":525397,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/357914\/revisions\/525397"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=357914"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=357914"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=357914"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=357914"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=357914"},{"taxono
my":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=357914"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=357914"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=357914"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=357914"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=357914"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=357914"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=357914"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=357914"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=357914"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=357914"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=357914"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}