This ongoing project aims to drive the state of the art in speech recognition toward matching, and ultimately surpassing, human performance, with a focus on unconstrained conversational speech. The goal is a moving target, as the scope of the task is broadened from high signal-to-noise speech between strangers (as in the Switchboard corpus) to include scenarios that make recognition more challenging, such as conversation among familiar speakers, multi-speaker meetings, and speech captured in noisy or distant-microphone environments.
Related
DataSkeptic podcast interview on human versus machine transcription
People
Wayne Xiong
Partner Group Engineering Manager