The effect of room acoustics on audio event classification


[Figure: Classification accuracy as a function of impulse response parameters.]

Abstract: The increasing availability of large-scale annotated databases, together with advances in data-driven learning and deep neural networks, has pushed the state of the art in computer-aided detection problems such as audio scene analysis and event classification. However, the wide variety of acoustic environments encountered in practice, and their differing acoustic properties, pose a great challenge for such tasks and can compromise the robustness of general-purpose classifiers when they are tested in unseen conditions or deployed in real-life applications. In this work, we perform a quantitative analysis of the effect of room acoustics on general audio event detection scenarios. We study the impact of mismatches between training and testing conditions, expressed in terms of acoustical parameters including the reverberation time (T60) and the direct-to-reverberant ratio (DRR), on audio classification accuracy and class separability. The results of this study may serve as guidance for practitioners building more robust frameworks for audio event classification tasks.
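As background on the two acoustical parameters named in the abstract, the sketch below shows one common way to estimate T60 (via Schroeder backward integration of the energy decay curve) and DRR (via a short window around the direct-path peak) from a room impulse response. This is an illustrative assumption, not the paper's methodology: the function names, the 2.5 ms direct-sound window, the -5 dB to -25 dB fitting range, and the synthetic RIR are all hypothetical choices.

```python
import numpy as np

def estimate_t60(rir, fs):
    """Estimate T60 by Schroeder backward integration: fit a line to the
    energy decay curve (in dB) over the -5 dB to -25 dB range and
    extrapolate to a 60 dB decay."""
    energy = rir ** 2
    edc = np.cumsum(energy[::-1])[::-1]          # backward integration
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(len(rir)) / fs
    mask = (edc_db <= -5.0) & (edc_db >= -25.0)  # roughly linear region
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)
    return -60.0 / slope                          # seconds to decay 60 dB

def estimate_drr(rir, fs, direct_ms=2.5):
    """Direct-to-reverberant ratio (dB): energy within +/- direct_ms of
    the main peak vs. all later energy (a common windowing convention)."""
    peak = int(np.argmax(np.abs(rir)))
    w = int(direct_ms * 1e-3 * fs)
    direct = np.sum(rir[max(0, peak - w):peak + w + 1] ** 2)
    reverb = np.sum(rir[peak + w + 1:] ** 2)
    return 10.0 * np.log10(direct / reverb)

# Synthetic RIR: white noise under an exponential envelope whose energy
# drops 60 dB at t = t60_true (a standard toy model, not measured data).
fs, t60_true = 16000, 0.5
rng = np.random.default_rng(0)
t = np.arange(int(fs * t60_true * 1.5)) / fs
rir = rng.standard_normal(len(t)) * 10.0 ** (-3.0 * t / t60_true)

t60_est = estimate_t60(rir, fs)   # expect roughly 0.5 s for this RIR
drr_est = estimate_drr(rir, fs)
```

In a mismatch study like the one described, such estimators let one bin training and testing RIRs by T60 and DRR and measure classification accuracy across bins.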