Audio and Acoustics Research Group

The Audio and Acoustics group conducts research in audio processing and speech enhancement, 3D audio perception and technologies, devices for audio capture and rendering, array processing, and information extraction from audio signals.

The mission of the Audio and Acoustics Group is to develop state-of-the-art algorithms and designs for audio processing, speech enhancement, and 3D audio capture and rendering. We also work on improving the acoustical design of audio devices such as microphones and loudspeakers, and we conduct research in information retrieval from audio signals, including speaker identification and emotion detection. Our goal is to create technologies that enable natural interaction with computers through speech and audio, while also shaping Microsoft's current and future offerings in these areas.

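To give a flavor of the array processing mentioned above, here is a minimal sketch of a textbook delay-and-sum beamformer in Python/NumPy. It is illustrative only, not the group's code: the function name, the 4-microphone linear array with 4 cm spacing, the 16 kHz sample rate, and the 30-degree steering angle are all assumed values.

import numpy as np

def delay_and_sum(mic_signals, mic_positions, angle_deg, fs, c=343.0):
    """Textbook delay-and-sum beamformer for a linear microphone array.

    mic_signals: (num_mics, num_samples) time-domain channels.
    mic_positions: microphone coordinates in meters along the array axis.
    angle_deg: steering angle; 0 degrees is broadside.
    """
    num_mics, num_samples = mic_signals.shape
    # Plane-wave arrival delay of each microphone relative to the origin.
    delays = mic_positions * np.sin(np.deg2rad(angle_deg)) / c
    # Apply the (fractional) delays as phase shifts in the frequency domain,
    # advancing each channel so the look direction adds coherently.
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    spectra = np.fft.rfft(mic_signals, axis=1)
    aligned = np.fft.irfft(
        spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None]),
        n=num_samples, axis=1)
    # Averaging the time-aligned channels reinforces sound from the look
    # direction and attenuates noise and interference from other directions.
    return aligned.mean(axis=0)

# Hypothetical setup: 4 microphones spaced 4 cm apart, 16 kHz audio,
# steered 30 degrees off broadside; random noise stands in for real capture.
fs = 16000
mic_positions = np.arange(4) * 0.04
captured = np.random.randn(4, fs)
enhanced = delay_and_sum(captured, mic_positions, angle_deg=30.0, fs=fs)

Delaying each channel so the target direction adds in phase is the simplest form of spatial filtering; the beamforming, dereverberation, and speech enhancement projects listed below go well beyond it.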
The contact for the Audio and Acoustics Research Group is Ivan Tashev.

Interns

2024
Ali Vosoughi, University of Rochester, New York, USA. Audio-Video Learning from Unlabeled Data by Leveraging Multimodal LLMs.
Benjamin Stahl, University of Music and Performing Arts Graz, Austria. Distilling Self-Supervised-Learning-Based Speech Quality Assessment into Compact Models.
Elisabeth Heremans, KU Leuven, Belgium. Shining light on the learning brain: Estimating mental workload in a simulated flight task using optical fNIRS signals.
Gene-Ping Yang, University of Edinburgh, UK. Distributed asynchronous device speech enhancement using microphone permutation and number invariant windowed cross attention.
Haibin Wu, National Taiwan University, Taiwan. Towards ultra-low latency speech enhancement – A comprehensive study.
Jinhua Liang, Queen Mary University of London, UK. Audio-Visual Representation Learning and Generation in the Latent Space.
Shivam Mehta, KTH Royal Institute of Technology, Stockholm, Sweden. Make some noise: Teaching the language of audio to an LLM using sound tokens.

2023
Ard Kastrati, ETH Zurich, Switzerland. Decoding Neurophysiological Responses for Improving Predictive Text Systems using Brain-Computer Interfaces.
Azalea Gui, University of Toronto, Canada. Improving Frechet Audio Distance for Generative Music Evaluation.
Eloi Moliner Juanpere, Aalto University, Espoo, Finland. Unsupervised Speech Reverberation Control with Diffusion Implicit Bridges.
Michele Mancusi, Sapienza – University of Rome, Italy. Unsupervised Speech Separation Using Adversarial Loss and Additional Separation Losses.
Ruihan Yang, University of California – Irvine, USA. Synchronized Audio-Visual Generation with a Joint Generative Diffusion Model and Contrastive Loss.
Tanmay Srivastava, Stony Brook University, USA. Private and Accessible Speech Commands in Head-Worn Devices.
Yuanchao Li, University of Edinburgh, UK. A Comparative Study of Audio Encoders for Emotion in Real and Synthesized Music: Advancing Realistic Emotion Generation.

2022
Haleh Akrami, University of Southern California, USA. Semi-supervised multi-task learning for acoustic parameter estimation.
Jeremy Hyrkas, University of California, USA. Binaural spatial audio positioning in video calls.
Julian Neri, McGill University, Montreal, Canada. Real-Time Single-Channel Speech Separation in Noisy and Reverberant Environments.
Justin Kilmarx, The University of Texas at Austin, USA. Mapping the neural representational similarity of multiple object categories during visual imagery.
Khandokar Md. Nayem, Indiana University, Bloomington, IN, USA. Unified Speech Enhancement Approach for Speech Degradations and Noise Suppression.
Sandeep Reddy Kothinti, The Johns Hopkins University, USA. Automated Audio Captioning: Methods and Metrics for Natural Language Description of Sounds.
Sophia Mehdizadeh, Georgia Tech, USA. Improving text prediction accuracy using neurophysiology.
Tan Gemicioglu, Georgia Tech, USA. Tongue Gesture Recognition in Head-Mounted Displays.

2021
Wei-Cheng Lin, University of Texas at Dallas, USA. Toxic Speech and Speech Emotions: Investigations of Audio-based Modeling Methodology and Intercorrelations.
Shoken Kaneko, University of Maryland, USA. DIABLo: a Deep Individual-Agnostic Binaural Localizer.
Justin Kilmarx, University of Texas at Austin, USA. Developing a Brain-Computer Interface Based on Visual Imagery.
Viet Anh Trinh, City University of New York (CUNY), USA. Unsupervised Speech Enhancement.
Abu-Zaher Faridee, University of Maryland, USA. Non-Intrusive Multi-Task Speech Quality Assessment.

2020
Ali Aroudi, University of Oldenburg, Germany. Geometry-constrained Beamforming Network for End-to-End Farfield Sound Source Separation.
Kuan-Jung Chiang, University of California – San Diego, USA. A Closed-loop Adaptive Brain-computer Interface Framework.
Midia Yousefi, University of Texas at Dallas, USA. Audio-based Toxic Language Detection.
Shoken Kaneko, University of Maryland, College Park, USA. Forest Sound Scene Simulation and Bird Localization with Distributed Microphone Arrays.
Wenkang An, Carnegie Mellon University, USA. Decoding Music Attention from "EEG headphones": a User-friendly Auditory Brain-computer Interface.

2019
Arindam Jati, University of Southern California (USC), Los Angeles, USA. Supervised Deep Hashing for Efficient Audio Retrieval.
Benjamin Martinez Elizalde, Carnegie Mellon University, USA. Sound event recognition for video-content analysis.
Fabian Brinkmann, Technical University of Berlin, Germany. Efficient and Perceptually Plausible 3-D Sound for Virtual Reality.
Hakim Si Mohammed, INRIA Rennes, France. Improving the Ergonomics and User-Friendliness of SSVEP-based BCIs in Virtual Reality.
Md Tamzeed Islam, University of North Carolina at Chapel Hill, USA. Anthropometric Feature Estimation using Sensors on Headphone for HRTF Personalization.
Morayo Ogunsina, Penn State Erie, USA. Hearing AI App for Sound-Based User Surrounding Awareness.
Nicholas Huang, Johns Hopkins University, USA. Decoding Auditory Attention Via the Auditory Steady-State Response for Use in A Brain-Computer Interface.
Sahar Hashemgeloogerdi, University of Rochester, USA. Integrating Beamforming and Multichannel Linear Prediction for Dereverberation and Denoising.
Wenkang An, Carnegie Mellon University, USA. Decoding Multisensory Attention from Electroencephalography for Use in a Brain-Computer Interface.
Yangyang (Raymond) Xia, Carnegie Mellon University, USA. Real-time Single-channel Speech Enhancement with Recurrent Neural Networks.

2018
Anderson Avila, Institut National de la Recherche Scientifique (INRS-EMT), Canada. Deep Neural Network Models for Audio Quality Assessment.
Andrea Genovese, New York University Steinhardt, USA. Blind Room Parameter Estimation in Real Time from Single-Channel Audio Signals in Noisy Conditions.
Benjamin Martinez Elizalde, Carnegie Mellon University, USA. A Cross-modal Audio Search Engine based on Joint Audio-Text Embeddings.
Chen Song, University at Buffalo, the State University of New York, USA. Sensor Fusion for Learning-based Motion Estimation in VR.
Christoph F. Hold, Technische Universität Berlin, Germany. Improvements on Higher Order Ambisonics Reproduction in the Spherical Harmonics Domain Under Real-time Constraints.
Harishchandra Dubey, University of Texas at Dallas, USA. MSR-Freesound: Advancing Audio Event Detection & Classification through Efficient Deep Learning Approaches.
Sebastian Braun, Friedrich-Alexander University of Erlangen-Nuremberg (FAU), Germany. Speech Enhancement Using Linear and Non-linear Spatial Filtering for Head-mounted Displays.

2017
Etienne Thuillier, Aalto University, Finland. Spatial Audio Feature Discovery Using a Neural Network Classifier.
Xuesu Xiao, Texas A&M University, USA. Articulated Human Pose Tracking with Inertial Sensors.
Srinivas Parthasarathy, University of Texas at Dallas, USA. Speech Emotion Recognition with Convolutional Neural Networks.
Han Zhao, Carnegie Mellon University, USA. High-Accuracy Neural-Network Models for Speech Enhancement.
Jong Hwan Ko, Georgia Institute of Technology, USA. Efficient Neural-Network Design for Real-Time Speech Enhancement.
Rasool Fakoor, University of Texas at Arlington, USA. Speech Enhancement With and Without Gradient Descent.
Yan-hui Tu, University of Science and Technology of China, P. R. China. Regression Based Speech Enhancement with Neural Networks.

2016
Amit Das, University of Illinois at Urbana-Champaign, USA. Ultrasound Based Gesture Recognition.
Vani Rajendran, University of Oxford, UK. Simple Effects that Enhance the Elevation Perception in Spatial Sound.
Zhong-Qiu Wang, Ohio State University, USA. Emotion, gender, and age recognition from speech utterances using neural networks.

2015
Archontis Politis, Aalto University, Finland. Applications of 3-Dimensional Spherical Transforms to Acoustics and Personalization of Head-related Transfer Functions (HRTFs).
Supreeth Krishna Rao, Worcester Polytechnic Institute, USA. Ultrasound Doppler Radar.
Seyedmahdad Mirsamadi, University of Texas at Dallas, USA. DNN-based Online Speech Enhancement Using Multitask Learning and Suppression Rule Estimation.
Long Le, University of Illinois at Urbana-Champaign, USA. Spatial Probability for Sound Source Localization.

2014
Jinkyu Lee, Yonsei University, Korea. Emotion Detection from Speech Signals.
Felicia Lim, Imperial College London, UK. Blind Estimation of Reverberation Parameters.

2013
Ivan Dokmanic, EPFL, Switzerland. Ultrasound Depth Imaging.
Piotr Bilinski, INRIA, France. HRTF Personalization Using Anthropometric Features.
Kun Han, Ohio State University, USA. Emotion Detection from Speech Signals.

2012
Keith Godin, University of Texas at Dallas, USA. Open-set Speaker Identification on Noisy, Short Utterances.
Jason Wung, Georgia Tech, USA. Next Steps in Multi-Channel Acoustic Echo Reduction for Xbox Kinect.
Xing Li, University of Washington, USA. Dynamic Loudness Control for In-Car Audio.

2011
Keith Godin, University of Texas at Dallas, USA. Binaural Sound Source Localization.

2010
Hoang Do, Brown University, USA. A Step Towards NUI: Speaker Verification for Gaming Scenarios.

Team Life

Taste and smell: complementary senses. March 22, 2024.
On the salmon road. June 30, 2023.
Clicking sounds and pulsating noise. March 24, 2023.
Horses and tulips. April 1, 2022.
Surface hydroacoustics. August 1, 2019.
Underwater and Underground Sounds. April 17, 2019.
Wave propagation. June 15, 2018.
Frozen sound. February 26, 2018.
Sounds in the Dust. July 28, 2017.
Horseback Riding on Orcas Island. August 1, 2016.
White water rafting summer trip. June 24, 2015.
Whale watching summer trip. July 28, 2014.
Audio on ski … or skiing Audio. March 13, 2014.

People

Team members: Sebastian Braun, Dimitra Emmanouilidou, Hannes Gamper, David Johnston, Ivan Tashev.
Collaborators and affiliates: Ed Cutrell, Mary Czerwinski, Johannes Gehrke, Eric Horvitz, Rico Malvar, Nikunj Raghuvanshi, Andy Wilson.

In the News

Hearing in 3D with Dr. Ivan Tashev. Microsoft Research Podcast, November 14, 2018.
Beyond Tomorrow – A vision study by Brüel & Kjær (Ivan Tashev is on the expert panel). Beyond Tomorrow, December 4, 2017.
Is Sound the Secret Sauce for Making Immersive Experiences? VRScout, October 24, 2017.
Listeners Seeing What They Hear: Virtual Reality & 3D Acoustics Integration. The Science Times, June 28, 2017.
3D sound to let you hear Walking Dead zombies first. The Times, June 27, 2017.
Researchers use head related transfer functions to personalize audio in mixed and virtual reality. Phys.org, June 26, 2017.
Creating a personalized, immersive audio environment. ScienceDaily, June 26, 2017.
3D audio is the secret to HoloLens' convincing holograms. Engadget, November 2, 2016.
Be There: 3D Audio Virtual Presence (video). TechFest, March 25, 2015.
Headset provides '3D soundscape' to help blind people navigate cities. The Guardian, November 7, 2014.
How 3D audio technology could 'unlock' cities for blind people. The Telegraph, November 6, 2014.
Cities Unlocked: Lighting up the world through sound (video). Microsoft UK, November 6, 2014.
Virtual Reality May Become the Next Great Media Platform—But Can It Fool All Five Senses? Singularity Hub, September 28, 2014.
What's Missing from Virtual Reality? Immersive 3D Soundscapes. Singularity Hub, July 6, 2014.
Microsoft's "3-D Audio" Gives Virtual Objects a Voice. MIT Technology Review, June 4, 2014.
Microsoft 3D audio tech makes virtual sounds sound real. Windows Central, June 4, 2014.
Audio Advances Help Xbox One Determine Signal from Noise. Microsoft Research Blog, October 16, 2013.
Ivan Tashev Helps Make Microsoft Sound Great (video). Microsoft Research Luminaries, October 16, 2013.
Keynote: Ivan Tashev Optimizing Kinect: Audio and Acoustics (video). ITA 2012, February 14, 2012.
Tellme and the Voice of Kinect. Microsoft – The AI Blog, August 1, 2011.
Kinect Audio: Preparedness Pays Off. Microsoft Research Blog, April 14, 2011.
MSR NUI Panel with Curtis Wong & Ivan Tashev (video). Channel 9 Live at MIX11, April 13, 2011.
Audio for Kinect: From Idea to "Xbox, Play!" (video). MIX11, March 15, 2011.