A Pitch-Aware Approach to Single-Channel Speech Separation
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Music Source Separation and Spatial Audio | Poster Area E
Ke Wang, Frank Soong, Lei Xie
A Sparsity Measure for Echo Density Growth in General Environments
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Acoustic Environments and Music Analysis | Poster Area D
Helena Peic Tukuljac, Ville Pulkki, Hannes Gamper, Keith Godin, Ivan Tashev, Nikunj Raghuvanshi
Blind Room Volume Estimation from Single-Channel Noisy Speech
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Acoustic Environments and Music Analysis | Poster Area D
Andrea Genovese, Hannes Gamper, Ville Pulkki, Nikunj Raghuvanshi, Ivan Tashev
Improving Binaural Ambisonics Decoding by Spherical Harmonics Domain Tapering and Coloration Compensation
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Music Source Separation and Spatial Audio | Poster Area E
Christoph Hold, Hannes Gamper, Ville Pulkki, Nikunj Raghuvanshi, Ivan Tashev
Static and Dynamic State Predictions for Acoustic Model Combination
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Deep Learning Applications I | Auditorium 2
Kshitiz Kumar, Yifan Gong
Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Language Modeling, ASR and Punctuation Prediction | Poster Area C
Max W.Y. Lam, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng
Investigation of Sampling Techniques for Maximum Entropy Language Modeling Training
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Language Modeling, ASR and Punctuation Prediction | Poster Area C
Xie Chen, Jun Zhang, Tasos Anastasakos, Fil Alleva
Recurrent Neural Network Language Model Training Using Natural Gradient
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Language Modeling, ASR and Punctuation Prediction | Poster Area C
Jianwei Yu, Max W.Y. Lam, Xie Chen, Shoukang Hu, Songxiang Liu, Xixin Wu, Xunying Liu, Helen Meng
Towards Code-Switching ASR for End-to-End CTC Models
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Multi-lingual Speech Recognition | Poster Area A
Ke Li, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong
Adversarial Speaker Verification
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Features and Robustness for Speaker Identification | Poster Area B
Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong
Attention in Recurrent Neural Networks for Ransomware Detection
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Deep Learning III | Poster Area G
Rakshit Agrawal, Jack W. Stokes, Karthik Selvaraj, Mady Marinescu
Encrypted Speech Recognition Using Deep Polynomial Networks
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Novel Architectures and Training Strategies for ASR | Auditorium 1
Shixiong Zhang, Yifan Gong, Dong Yu
Single-Channel Speech Extraction Using Speaker Inventory and Attention Network
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Source Separation and Speech Enhancement I | Meeting Room 1
Xiong Xiao, Zhuo Chen, Takuya Yoshioka, Hakan Erdogan, Changliang Liu, Dimitrios Dimitriadis, Jasha Droppo, Yifan Gong
Universal Acoustic Modeling Using Neural Mixture Models
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Novel Architectures and Training Strategies for ASR | Auditorium 1
Amit Das, Jinyu Li, Changliang Liu, Yifan Gong
Adversarial Speaker Adaptation
Wednesday, May 15, 2019 | 1:30 PM–3:30 PM | Feature Learning and Adaptation for ASR | Auditorium 1
Zhong Meng, Jinyu Li, Yifan Gong
Detecting Cyber Attacks Using Anomaly Detection with Explanations and Expert Feedback
Wednesday, May 15, 2019 | 1:30 PM–3:30 PM | Learning Theory and Methods I | Auditorium 2
Md Amran Siddiqui, Jack W. Stokes, Christian Seifert, Evan Argyle, Robert McCann, Joshua Neil, Justin Carroll
Directional Interference Suppression Using a Spatial Relative Transfer Function Feature
Wednesday, May 15, 2019 | 4:00 PM–6:00 PM | Quality Measures and Sensor Array Processing | Poster Area D
NN-Based Ordinal Regression for Assessing Fluency of ESL Speech
Wednesday, May 15, 2019 | 4:00 PM–6:00 PM | Training Regimes for Emotion and Sentiment Analysis | Poster Area C
Shaoguang Mao, Zhiyong Wu, Jingshuai Jiang, Peiyun Liu, Frank Soong
Non-Intrusive Speech Quality Assessment Using Neural Networks
Wednesday, May 15, 2019 | 4:00 PM–6:00 PM | Quality Measures and Sensor Array Processing | Poster Area D
Anderson R. Avila, Hannes Gamper, Chandan Reddy, Ross Cutler, Ivan Tashev, Johannes Gehrke
Conditional Teacher-Student Learning
Thursday, May 16, 2019 | 8:00 AM–10:00 AM | ASR Training Strategies and Toolkits | Poster Area A
Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
Decoding Homomorphically Encrypted FLAC Audio Without Decryption
Thursday, May 16, 2019 | 8:00 AM–10:00 AM | Audio Security and Source Separation | Poster Area D
Yuanyuan Tang, Bin Zhu, Xiaojing Ma, P. Takis Mathiopoulos, Xia Xie, Hong Huang
Improving Layer Trajectory LSTM with Future Context Frames
Thursday, May 16, 2019 | 1:00 PM–3:00 PM | New Features, Models and Representations/Audio Visual ASR | Poster Area A
Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong
Contextual Out-of-Domain Utterance Handling with Counterfeit Data Augmentation
Thursday, May 16, 2019 | 3:30 PM–5:30 PM | Dialogue | Syndicate 1
Sungjin Lee, Igor Shalyminov
Dilated Residual Network with Multi-Head Self-Attention for Speech Emotion Recognition
Thursday, May 16, 2019 | 3:30 PM–5:30 PM | Architectures for Emotion and Sentiment Analysis | Poster Area B
Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, Helen Meng
Attentive Adversarial Learning for Domain-Invariant Training
Thursday, May 16, 2019 | 6:00 PM–8:00 PM | Robust Speech Recognition | Poster Area A
Zhong Meng, Jinyu Li, Yifan Gong
Speech Super Resolution Generative Adversarial Network
Thursday, May 16, 2019 | 6:00 PM–8:00 PM | Audio and Speech Applications | Poster Area G
Sefik Emre Eskimez, Kazuhito Koishida
Word Characters and Phone Pronunciation Embedding for ASR Confidence Classifier
Thursday, May 16, 2019 | 6:00 PM–8:00 PM | Signal Processing for Emerging and Practical Applications | Poster Area E
Session Chair: Ivan Tashev
Kshitiz Kumar, Tasos Anastasakos, Yifan Gong
Acoustic and Lexical Sentiment Analysis for Customer Service Calls
Friday, May 17, 2019 | 8:30 AM–10:30 AM | Using Multiple Perspectives in Emotion and Sentiment Analysis | Syndicate 3
Bryan Li, Dimitrios Dimitriadis, Andreas Stolcke
Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech
Friday, May 17, 2019 | 8:30 AM–10:30 AM | Artificial Intelligence Based Human-Machine Conversation Technology for Interactive Education | Syndicate 1
Session Chairs: Yao Qian, Helen Meng, Frank K. Soong
Jingyong Hou, Pengcheng Guo, Sining Sun, Frank K. Soong, Wenping Hu, Lei Xie
Learning Latent Representations for Style Control and Transfer in End-to-End Speech Synthesis
Friday, May 17, 2019 | 8:30 AM–10:30 AM | Speech Synthesis II | Poster Area B
Ya-Jie Zhang, Shifeng Pan, Lei He, Zhen-Hua Ling
Low-Latency Speaker-Independent Continuous Speech Separation
Friday, May 17, 2019 | 1:30 PM–3:30 PM | Speech Separation, Enhancement and Denoising | Poster Area A
Takuya Yoshioka, Zhuo Chen, Changliang Liu, Xiong Xiao, Hakan Erdogan, Dimitrios Dimitriadis
Cross Modal Audio Search and Retrieval with Joint Embeddings Based on Text and Audio
Friday, May 17, 2019 | 4:00 PM–6:00 PM | Multimedia Analysis | Poster Area C
Benjamin Elizalde, Shuayb Zarar, Bhiksha Raj