Understanding PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

The paper introduces Pretrained Audio Neural Networks (PANNs) for audio pattern recognition tasks, leveraging the large-scale AudioSet dataset. PANNs are trained using various convolutional neural networks (CNNs) and are evaluated on AudioSet tagging, achieving a mean average precision (mAP) of 0.439, surpassing previous state-of-the-art systems. The authors propose a Wavegram-Logmel-CNN architecture that combines log-mel spectrograms and waveform inputs, enhancing performance. PANNs are also transferred to other audio tasks, including acoustic scene classification, music classification, and speech emotion classification, demonstrating state-of-the-art performance in several cases. The paper discusses the trade-offs between performance and computational complexity, and provides detailed experimental results and comparisons with previous methods. The source code and pre-trained models are released for further research.
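Since the headline result is reported as mean average precision, a small sketch of how such a score can be computed for multi-label tagging may help. This is an illustration only, not the authors' evaluation code: the function names `average_precision` and `mean_average_precision` and the toy arrays are assumptions made for this example. It computes uninterpolated average precision per class and then averages over classes.

```python
import numpy as np


def average_precision(y_true, y_score):
    """Uninterpolated AP for one class (hypothetical helper, not from the paper).

    Clips are ranked by descending score; AP is the mean of the precision
    values measured at the rank of each positive clip.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    order = np.argsort(-y_score, kind="stable")  # highest score first
    hits = y_true[order]
    if not hits.any():
        return float("nan")  # AP is undefined for a class with no positives
    ranks = np.arange(1, len(hits) + 1)
    precision_at_k = np.cumsum(hits) / ranks
    return float(precision_at_k[hits].mean())


def mean_average_precision(Y_true, Y_score):
    """Average the per-class AP over classes, skipping undefined classes."""
    Y_true = np.asarray(Y_true)
    Y_score = np.asarray(Y_score)
    aps = [
        average_precision(Y_true[:, c], Y_score[:, c])
        for c in range(Y_true.shape[1])
    ]
    return float(np.nanmean(aps))


# Toy data (assumed for illustration): 4 clips, 2 sound classes.
Y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
Y_score = np.array([[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]])
print(mean_average_precision(Y_true, Y_score))
```

In the toy data, the first class ranks its two positives at positions 1 and 3 (AP = (1 + 2/3) / 2), while the second class ranks both positives first (AP = 1), so the mean is 11/12. Tied scores are broken by input order here, which may differ from other implementations.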

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

23 Aug 2020 | Qiuqiang Kong (Student Member, IEEE), Yin Cao (Member, IEEE), Turab Iqbal, Yuxuan Wang, Wenwu Wang (Senior Member, IEEE), and Mark D. Plumbley (Fellow, IEEE)