Language modeling: Attention mechanisms for extending the context-awareness of LSTMs

Language models for automatic speech recognition (ASR) are traditionally trained on sentence-level corpora, so each sentence is modeled in isolation. In this internship, we explore exploiting context beyond the current sentence for next-word prediction. We show that augmenting an LSTM language model with an attention mechanism allows it to model such long-range context.
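
As a rough illustration of the idea, the sketch below augments an LSTM language model with attention over all earlier hidden states, so the prediction at each position can draw on context well before the current word. This is a minimal sketch assuming a PyTorch implementation; the dot-product attention variant, module names, and dimensions are illustrative and not the project's actual architecture.

```python
# Illustrative sketch only: an LSTM language model whose prediction at
# each step attends over all previous hidden states. The attention
# variant and dimensions are assumptions, not the internship's design.
import torch
import torch.nn as nn


class AttentionLSTMLM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Fuse [current hidden state; attention context] before the
        # softmax over the vocabulary.
        self.combine = nn.Linear(2 * hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer word ids
        h, _ = self.lstm(self.embed(tokens))      # (batch, seq, hidden)
        # Dot-product scores of each step against every step.
        scores = h @ h.transpose(1, 2)            # (batch, seq, seq)
        # Causal mask: a position may only attend to strictly earlier ones.
        seq_len = tokens.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=tokens.device))
        scores = scores.masked_fill(mask, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        # The first position has no history (all -inf row -> NaN);
        # replace with a zero context vector.
        weights = torch.nan_to_num(weights)
        context = weights @ h                     # (batch, seq, hidden)
        fused = torch.tanh(self.combine(torch.cat([h, context], dim=-1)))
        return self.out(fused)                    # logits over the vocabulary


if __name__ == "__main__":
    model = AttentionLSTMLM(vocab_size=1000)
    logits = model(torch.randint(0, 1000, (2, 20)))
    print(logits.shape)  # torch.Size([2, 20, 1000])
```

In this toy version the attention window is limited to the current batch of tokens; extending the attended history across sentence boundaries (e.g. by carrying hidden states over from preceding sentences) is the part that gives access to context beyond the current sentence.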