openSMILE – The Munich Versatile and Fast Open-Source Audio Feature Extractor

October 25–29, 2010, Firenze, Italy | Florian Eyben, Martin Wöllmer, Björn Schuller
The paper introduces openSMILE, an open-source audio feature extraction toolkit designed to unite feature extraction algorithms from the speech processing and Music Information Retrieval (MIR) communities. It supports a wide range of low-level descriptors, including CHROMA, CENS, loudness, Mel-frequency cepstral coefficients, perceptual linear predictive cepstral coefficients, linear predictive coefficients, line spectral frequencies, fundamental frequency, and formant frequencies. Delta regression and a variety of statistical functionals can be applied to these descriptors.
Implemented in C++ with no third-party dependencies, openSMILE is fast, runs on Unix and Windows, and has a modular architecture that can be extended via plugins. It supports both online incremental processing and offline batch processing, and unit tests are used to ensure numeric compatibility. The paper discusses related tools, describes openSMILE's architecture, lists the available feature extractors, reports performance benchmarks, and outlines planned developments. openSMILE is already used in several research projects, including emotion recognition and speaker height classification, and is under active development, with plans to integrate more features and improve multithreading support.
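To make "delta regression" and "statistical functionals" concrete, here is a minimal Python sketch of both ideas applied to a single descriptor contour (for example, loudness per frame). It is not openSMILE code and does not use its API: the function names `delta_regression` and `functionals`, the window size, the edge padding, and the choice of functionals are all assumptions made for illustration. The delta formula is the regression formula commonly used for delta coefficients in speech processing; the summary above does not state which formula openSMILE uses.

```python
from statistics import mean, pstdev


def delta_regression(contour, window=2):
    """Delta coefficients of a 1-D contour via the common regression formula.

    d[t] = sum_{i=1..W} i * (x[t+i] - x[t-i]) / (2 * sum_{i=1..W} i^2)

    Frames beyond the edges are padded by repeating the first/last value
    (an assumption of this sketch).
    """
    n = len(contour)
    norm = 2 * sum(i * i for i in range(1, window + 1))
    deltas = []
    for t in range(n):
        acc = 0.0
        for i in range(1, window + 1):
            ahead = contour[min(t + i, n - 1)]   # clamp at the last frame
            behind = contour[max(t - i, 0)]      # clamp at the first frame
            acc += i * (ahead - behind)
        deltas.append(acc / norm)
    return deltas


def functionals(contour):
    """Map a variable-length contour to a fixed set of summary statistics."""
    lo, hi = min(contour), max(contour)
    return {
        "mean": mean(contour),
        "stddev": pstdev(contour),  # population standard deviation
        "min": lo,
        "max": hi,
        "range": hi - lo,
    }


if __name__ == "__main__":
    # Hypothetical per-frame descriptor values, not real measurements.
    loudness = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    print(delta_regression(loudness))
    print(functionals(loudness))
    print(functionals(delta_regression(loudness)))
```

The point of the sketch is the two-stage shape: a frame-level contour is first extended with its local slope, and functionals then collapse each contour into a fixed-length vector regardless of the input duration.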