Acoustical Pre-Processing for Robust Spoken Language Systems
- Alex Acero
Proc. of the International Conference on Spoken Language Systems
Published by International Speech Communication Association
In this paper we report our initial efforts to make SPHINX, the CMU continuous-speech speaker-independent recognition system, robust to changes in the environment. To deal with differences in noise level and spectral tilt between close-talking and desktop microphones, we propose two novel methods based on additive corrections in the cepstral domain. In the first algorithm, the additive correction depends on the instantaneous SNR of the signal. In the second technique, EM techniques are used to best match the cepstral vectors of the input utterances to the ensemble of codebook entries. Use of these algorithms dramatically improves recognition accuracy when the system is tested on a microphone other than the one on which it was trained.
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.