{"id":642510,"date":"2020-03-24T09:03:44","date_gmt":"2020-03-24T16:03:44","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-event&p=642510"},"modified":"2020-04-30T13:40:27","modified_gmt":"2020-04-30T20:40:27","slug":"icassp-2020","status":"publish","type":"msr-event","link":"https:\/\/www.microsoft.com\/en-us\/research\/event\/icassp-2020\/","title":{"rendered":"Microsoft @ ICASSP 2020"},"content":{"rendered":"

Website: ICASSP 2020

Microsoft is proud to be a silver sponsor of the 45th International Conference on Acoustics, Speech and Signal Processing (ICASSP). Stop by our booth to chat with our experts, see demos of our latest research and find out about career opportunities with Microsoft.

Sessions

Tuesday, May 5

11:30 – 13:30 CEST
MLSP-P2: Applications in Speech and Audio
Multi-Label Sound Event Retrieval Using a Deep Learning-Based Siamese Structure with a Pairwise Presence Matrix
Jianyu Fan, Eric Nichols, Daniel Tompkins, Ana Elisa Méndez Méndez, Benjamin Elizalde, Philippe Pasquier

11:50 – 12:10 CEST
SPE-L1: End-to-End Speech Recognition I: Streaming
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR
Hirofumi Inaguma, Yashesh Gaur, Liang Lu, Jinyu Li, Yifan Gong

16:30 – 18:30 CEST
SPE-P3: Machine Learning for Speech Synthesis I
Improving Prosody with Linguistic and BERT-Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS
Yujia Xiao, Lei He, Huaiping Ming, Frank K. Soong

17:30 – 17:50 CEST
AUD-L2: Deep Learning for Source Separation
Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation
Yi Luo, Zhuo Chen, Takuya Yoshioka

Wednesday, May 6

9:00 – 11:00 CEST
AUD-P4: Feedback, Noise, and Reverberation
Joint Beamforming and Reverberation Cancellation Using a Constrained Kalman Filter with Multichannel Linear Prediction
Sahar Hashemgeloogerdi, Sebastian Braun

AUD-P4: Feedback, Noise, and Reverberation
Predicting Word Error Rate for Reverberant Speech
Hannes Gamper, Dimitra Emmanouilidou, Sebastian Braun, Ivan Tashev

SPE-P5: Deep Speaker Recognition Models
Improving Deep CNN Networks with Long Temporal Context for Text-Independent Speaker Verification
Yong Zhao, Tianyan Zhou, Zhuo Chen, Jian Wu

9:20 – 9:40 CEST
SPE-L6: Speech Enhancement II: Single Channel
Low-Latency Single Channel Speech Enhancement Using U-Net Convolutional Neural Networks
Ahmet E. Bulut, Kazuhito Koishida

11:30 – 13:30 CEST
SAM-P3: Sparsity, Super-Resolution and Imaging
Low-Rank Toeplitz Matrix Estimation via Random Ultra-Sparse Rulers
Hannah Lawrence, Jerry Li, Cameron Musco, Christopher Musco

SPE-P8: Robust Speech Recognition
A Practical Two-Stage Training Strategy for Multi-Stream End-to-End Speech Recognition
Ruizhi Li, Gregory Sell, Xiaofei Wang, Shinji Watanabe, Hynek Hermansky

16:30 – 16:50 CEST
IFS-L2: Privacy, Biometrics and Information Security
Privacy-Preserving Phishing Web Page Classification via Fully Homomorphic Encryption
Edward Chou, Arun Gururajan, Kim Laine, Nitin Kumar Goel, Anna Bertiger, Jack W. Stokes

16:30 – 18:30 CEST
HLT-P1: Spoken Language Understanding and Dialogue I
Fast Domain Adaptation for Goal-Oriented Dialogue Using a Hybrid Generative-Retrieval Transformer
Igor Shalyminov, Alessandro Sordoni, Adam Atkinson, Hannes Schulz

SPE-P9: End-to-End Speech Recognition III: General Topics
Exploring Pre-Training with Alignments for RNN Transducer Based End-to-End Speech Recognition
Hu Hu, Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong

Thursday, May 7

9:00 – 11:00 CEST
HLT-P2: Speech and Language Analysis
Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings
Dave Makhervaks, William Hinthorn, Dimitrios Dimitriadis, Andreas Stolcke

10:20 – 10:40 CEST
AUD-L6: Acoustic Environments and Spatial Audio II
Fast Acoustic Scattering Using Convolutional Neural Networks
Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi

10:40 – 11:00 CEST
SPE-L11: Speech Separation and Extraction I: Single Channel
An Online Speaker-Aware Speech Separation Approach Based on Time-Domain Representation
Hui Wang, Yan Song, Zeng-Xi Li, Ian McLoughlin, Li-Rong Dai

11:30 – 13:30 CEST
SPE-P12: Machine Learning for Speech Synthesis II
Improving LPCNet-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network
Min-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto, Frank Soong, Hong-Goo Kang

SPE-P13: Speech Separation and Extraction III
Continuous Speech Separation: Dataset and Analysis
Zhuo Chen, Takuya Yoshioka, Liang Lu, Tianyan Zhou, Zhong Meng, Yi Luo, Jian Wu, Xiong Xiao, Jinyu Li

12:10 – 12:30 CEST
SPE-L12: Speech Separation and Extraction II: Multi-Channel
End-to-End Microphone Permutation and Number Invariant Multi-Channel Speech Separation
Yi Luo, Zhuo Chen, Nima Mesgarani, Takuya Yoshioka

16:30 – 18:30 CEST
MMSP-P3: Multimedia Signal Processing
Supervised Deep Hashing for Efficient Audio Event Retrieval
Arindam Jati, Dimitra Emmanouilidou

MMSP-P3: Multimedia Signal Processing
Multimodal Active Speaker Detection and Virtual Cinematography for Video Conferencing
Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle

SPE-P15: Speech Recognition: Adaptation
L-Vector: Neural Label Embedding for Domain Adaptation
Zhong Meng, Hu Hu, Jinyu Li, Changliang Liu, Yan Huang, Yifan Gong, Chin-Hui Lee

SPE-P15: Speech Recognition: Adaptation
Acoustic Model Adaptation for Presentation Transcription and Intelligent Meeting Assistant Systems
Yan Huang, Yifan Gong

SPE-P15: Speech Recognition: Adaptation
Using Personalized Speech Synthesis and Neural Language Generator for Rapid Speaker Adaptation
Yan Huang, Lei He, Wenning Wei, William Gale, Jinyu Li, Yifan Gong

SS-P1: Signal Processing Education: Trends and Innovations
A Dataset for Measuring Reading Levels in India at Scale
Dolly Agarwal, Jayant Gupchup, Nishant Baghel

17:30 CEST
IDSP-L2: Industry Session on Large-Scale Distributed Learning Strategies
Parallelizing Adam Optimizer with Blockwise Model-Update Filtering
Kai Chen, Haisong Ding, Qiang Huo

Friday, May 8

8:00 – 10:00 CEST
IFS-P1: Information Hiding, Biometrics and Security
Texception: A Character/Word-Level Deep Learning Model for Phishing URL Detection
Farid Tajaddodianfar, Jack W. Stokes, Arun Gururajan

SAM-P6: Detection, Estimation and Classification
Static Visual Spatial Priors for DOA Estimation
Pawel Swietojanski, Ondrej Miksik

SPE-P16: Word Spotting
Adaptation of RNN Transducer with Text-to-Speech Technology for Keyword Spotting
Eva Sharma, Guoli Ye, Wenning Wei, Rui Zhao, Yao Tian, Jian Wu, Lei He, Ed Lin, Yifan Gong

SPE-P17: Speech Enhancement IV
AV(SE)²: Audio-Visual Squeeze-Excite Speech Enhancement
Michael Iuzzolino, Kazuhito Koishida

8:20 – 8:40 CEST
HLT-L2: Language Modeling
Low-Bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers
Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Mei-Ling Meng

9:40 – 10:00 CEST
MLSP-L10: Deep Neural Network Structures
Neural Attentive Multiview Machines
Oren Barkan, Ori Katz, Noam Koenigstein

11:45 – 13:45 CEST
AUD-P11: Signal Enhancement and Restoration II
Geometrically Constrained Independent Vector Analysis for Directional Speech Enhancement
Li Li, Kazuhito Koishida

AUD-P11: Signal Enhancement and Restoration II
Weighted Speech Distortion Losses for Neural-Network-Based Real-Time Speech Enhancement
Yangyang Xia, Sebastian Braun, Chandan Reddy, Harishchandra Dubey, Ross Cutler, Ivan Tashev

HLT-P5: Multilingual Processing of Language
Addressing Accent Mismatch in Mandarin-English Code-Switching Speech Recognition
Zhili Tan, Xinghua Fan, Hui Zhu, Ed Lin

IFS-P2: Anonymization, Security and Privacy
Detection of Malicious VBScript Using Static and Dynamic Analysis with Recurrent Deep Learning
Jack W. Stokes, Rakshit Agrawal, Geoff McDonald

SPE-P19: Machine Learning for Speech Synthesis III
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan

SPE-P20: Speech Recognition: Acoustic Modelling II
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model
Jinyu Li, Rui Zhao, Eric Sun, Jeremy Wong, Amit Das, Zhong Meng, Yifan Gong

12:25 – 12:45 CEST
SPE-L16: Speaker Diarization
Speaker Diarization with Session-Level Speaker Embedding Refinement Using Graph Neural Networks
Jixuan Wang, Xiong Xiao, Jian Wu, Ranjani Ramamurthy, Frank Rudzicz, Michael Brudno

13:05 – 13:25 CEST
SPE-L16: Speaker Diarization
A Memory Augmented Architecture for Continuous Speaker Identification in Meetings
Nikolaos Flemotomos, Dimitrios Dimitriadis

15:15 – 17:15 CEST
SPE-P21: Voice Conversion
An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data
Feng-Long Xie, Xin-Hui Li, Bo Liu, Yi-Bin Zheng, Li Meng, Li Lu, Frank K. Soong

16:15 – 16:30 CEST
MLSP-L11: Attention Needs
Attentive Item2vec: Neural Attentive User Representations
Oren Barkan, Avi Caciularu, Ori Katz, Noam Koenigstein