Directory retrieval using voice form-filling

2007 IEEE International Conference on Acoustics, Speech and Signal Processing

Published by IEEE | Organized by IEEE

Accurate retrieval of entries from large directories is a difficult task. Practical systems attempt to achieve acceptable performance by using dialog to restrict the size of the directory. For instance, knowledge of the city and state can be used to restrict the entries in a telephone number retrieval application. This paper shows that it is advantageous to use a voice form-filling paradigm in which the user speaks all the field entries (first name, last name, city, and state) in a single utterance. A recently presented two-pass method for form-filling (S. Parthasarathy et al., 2005) is evaluated on the directory retrieval task. A delayed network expansion and pruning method is proposed to improve the efficiency of short-list generation in the form-filling algorithm. Experimental results demonstrate that sentence accuracies greater than 85% can be achieved on directory sizes of up to 8 million entries, with modest computing requirements.
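The idea of short-list generation can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a first recognition pass has already produced scored hypotheses for each field, and it simply keeps the directory entries that are consistent with those hypotheses. The names `short_list` and `field_hyps`, the log-score values, and the tiny directory are all hypothetical.

```python
from typing import Dict, List, Tuple

# A directory entry: (first name, last name, city, state).
Entry = Tuple[str, str, str, str]
FIELDS = ("first", "last", "city", "state")


def short_list(
    directory: List[Entry],
    field_hyps: Dict[str, Dict[str, float]],
    k: int = 3,
) -> List[Entry]:
    """Toy first pass: keep entries whose every field value appears among
    the per-field hypotheses, ranked by the sum of the field log-scores."""
    scored = []
    for entry in directory:
        total = 0.0
        for name, value in zip(FIELDS, entry):
            score = field_hyps[name].get(value)
            if score is None:
                break  # one field has no matching hypothesis: drop the entry
            total += score
        else:
            scored.append((total, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:k]]


# Hypothetical data: a four-entry directory and made-up log-scores.
directory = [
    ("john", "smith", "austin", "texas"),
    ("jon", "smith", "boston", "massachusetts"),
    ("john", "smyth", "austin", "texas"),
    ("joan", "smith", "austin", "texas"),
]
field_hyps = {
    "first": {"john": -0.2, "jon": -0.9, "joan": -1.5},
    "last": {"smith": -0.1, "smyth": -1.2},
    "city": {"austin": -0.3, "boston": -1.0},
    "state": {"texas": -0.1},
}

candidates = short_list(directory, field_hyps, k=2)
print(candidates)
```

In a two-pass scheme, a second pass would then rescore the utterance against only the short-listed entries. That step is omitted here because the abstract does not describe it.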