This paper reports recent efforts to apply the speaker-independent SPHINX-II system to the DARPA Wall Street Journal continuous speech recognition task. In SPHINX-II, we incorporated additional dynamic and speaker-normalized features, replaced discrete models with sex-dependent semi-continuous hidden Markov models, augmented within-word triphones with between-word triphones, and extended generalized triphone models to shared-distribution models. The configuration of SPHINX-II used for this task includes sex-dependent, semi-continuous, shared-distribution hidden Markov models and left-context-dependent between-word triphones. In applying our technology to this task, we addressed issues that were not previously of concern owing to the (relatively) small size of the Resource Management task.
Applying SPHINX-II to the DARPA Wall Street Journal CSR task
- Fil Alleva
- Hsiao-Wuen Hon
- Xuedong Huang
- Mei-Yuh Hwang
- R. Rosenfeld
- Robert Weide
HLT '91 Proceedings of the workshop on Speech and Natural Language