Publication Speech LLMs are Contextual Reasoning Transcribers Keqi Deng, Ruchao Fan, Bo Ren, Yiming Wang, Jinyu Li April 2026
Publication RESPOND: Responsive Engagement Strategy for Predictive Orchestration and Dialogue Meng-Chen Lee, Costas Panay, Javier Hernandez, Sean Andrist, Dan Bohus, Anatoly Churikov, Andrew D. Wilson March 2026 Project
Publication Sirens’ Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs Zijian Ling, Pingyi Hu, Xiuyong Gao, Xiaojing Ma, Man Zhou, Jun Feng, Songfeng Lu, Dongmei Zhang, Bin Benjamin Zhu March 2026
Publication VibeVoice: Expressive Podcast Generation with Next-Token Diffusion Zhiliang Peng, Jianwei Yu, Wenhui Wang, Yaoyao Chang, Yutao Sun, Li Dong, Yi Zhu, Weijiang Xu, Hangbo Bao, Zehua Wang, Shaohan Huang, Yan Xia, Furu Wei ICLR 2026 | February 2026
Publication Aurelius: Relation Aware Text-to-Audio Generation At Scale Yuhang He, He Liang, Yash Jain, Andrew Markham, Vibhav Vineet ICLR | February 2026
Publication EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning Dingdong Wang, Shujie Liu, Tianhua Zhang, Youjun Chen, Jinyu Li, Helen M. Meng ICLR 2026 | January 2026
Publication SALAD-VAE: Semantic Audio Compression with Language-Audio Distillation Sebastian Braun, Hannes Gamper, Dimitra Emmanouilidou 2026 International Conference on Acoustics, Speech, and Signal Processing | January 2026
Publication Towards Real-Time Generative Speech Restoration with Flow-Matching Tsun-An Hsieh, Sebastian Braun 2026 International Conference on Acoustics, Speech, and Signal Processing | January 2026 Project
Publication Sci-Phi: A Large Language Model Spatial Audio Descriptor Xilin Jiang, Sebastian Braun, Hannes Gamper IEEE Open Journal of Signal Processing | January 2026 Project