Language modeling: Attention mechanisms for extending the context-awareness of LSTMs

Language models for automatic speech recognition (ASR) are traditionally trained on sentence-level corpora, so each sentence is modeled in isolation. In this internship, we explore exploiting context beyond the current sentence for next-word prediction. We show that augmenting an LSTM language model with an attention mechanism allows it to model such long-range context.
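
As a rough illustration of the idea, the sketch below augments an LSTM language model with attention over all earlier hidden states, so the prediction at each position can draw on context well before the current word. This is a minimal sketch assuming a PyTorch implementation; the dot-product attention variant, module names, and dimensions are illustrative and not the project's actual architecture.

```python
# Illustrative sketch only: an LSTM language model whose prediction at
# each step attends over all previous hidden states. The attention
# variant and dimensions are assumptions, not the internship's design.
import torch
import torch.nn as nn


class AttentionLSTMLM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Fuse [current hidden state; attention context] before the
        # softmax over the vocabulary.
        self.combine = nn.Linear(2 * hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer word ids
        h, _ = self.lstm(self.embed(tokens))      # (batch, seq, hidden)
        # Dot-product scores of each step against every step.
        scores = h @ h.transpose(1, 2)            # (batch, seq, seq)
        # Causal mask: a position may only attend to strictly earlier ones.
        seq_len = tokens.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=tokens.device))
        scores = scores.masked_fill(mask, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        # The first position has no history (all -inf row -> NaN);
        # replace with a zero context vector.
        weights = torch.nan_to_num(weights)
        context = weights @ h                     # (batch, seq, hidden)
        fused = torch.tanh(self.combine(torch.cat([h, context], dim=-1)))
        return self.out(fused)                    # logits over the vocabulary


if __name__ == "__main__":
    model = AttentionLSTMLM(vocab_size=1000)
    logits = model(torch.randint(0, 1000, (2, 20)))
    print(logits.shape)  # torch.Size([2, 20, 1000])
```

In this toy version the attention window is limited to the current batch of tokens; extending the attended history across sentence boundaries (e.g. by carrying hidden states over from preceding sentences) is the part that gives access to context beyond the current sentence.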