Optimizing the configuration of deep learning models for music genre classification

17 January 2024 | Teng Li
This paper proposes a novel deep learning approach to music genre classification that combines Mel Frequency Cepstral Coefficient (MFCC) and Short-Time Fourier Transform (STFT) features, followed by two Convolutional Neural Networks (CNNs) whose hyperparameters are tuned with the Black Hole Optimization (BHO) algorithm. The method has four stages: preprocessing the input signals, extracting MFCC and STFT features, optimizing the CNN hyperparameters with BHO to minimize training error, and classifying music genres with a SoftMax classifier.
The approach is validated on the GTZAN and Extended-Ballroom datasets, where it reaches 95.2% and 95.7% accuracy, respectively. Compared with existing techniques such as BAG and the approach of Yang et al., it achieves higher accuracy and better performance in most cases. The results indicate that combining MFCC and STFT features with BHO-optimized CNNs provides a more effective solution for music genre classification.
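The summary does not give the paper's exact BHO formulation or its hyperparameter search space, so the following is only a minimal sketch of the basic black hole search loop. It assumes the commonly described form of the algorithm: candidate solutions ("stars") drift toward the best solution found so far (the "black hole"), and stars that fall inside the event horizon are replaced by new random candidates. The function names, the toy objective, and the two hyperparameters (log learning rate and dropout rate) are hypothetical stand-ins for illustration; in a real setup the objective would train a CNN and return its training or validation error.

```python
import math
import random


def black_hole_optimize(objective, bounds, n_stars=20, n_iter=100, seed=0):
    """Minimize `objective` inside the box `bounds` with a basic black hole search.

    Assumes the objective returns non-negative values (e.g. an error rate),
    because the event-horizon radius is computed as a ratio of fitness values.
    """
    rng = random.Random(seed)

    def random_star():
        # Draw one candidate uniformly from the search box.
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    stars = [random_star() for _ in range(n_stars)]
    fitness = [objective(s) for s in stars]
    best = min(range(n_stars), key=lambda i: fitness[i])
    bh_pos, bh_fit = list(stars[best]), fitness[best]

    for _ in range(n_iter):
        # Step 1: pull every star a random fraction of the way to the black hole.
        for i in range(n_stars):
            r = rng.random()
            stars[i] = [x + r * (b - x) for x, b in zip(stars[i], bh_pos)]
            fitness[i] = objective(stars[i])
            if fitness[i] < bh_fit:
                # A star that beats the black hole becomes the new black hole.
                bh_pos, bh_fit = list(stars[i]), fitness[i]

        # Step 2: stars inside the event horizon are swallowed and replaced.
        total = sum(fitness)
        radius = bh_fit / total if total > 0 else 0.0
        for i in range(n_stars):
            if math.dist(stars[i], bh_pos) < radius:
                stars[i] = random_star()
                fitness[i] = objective(stars[i])
                if fitness[i] < bh_fit:
                    bh_pos, bh_fit = list(stars[i]), fitness[i]

    return bh_pos, bh_fit


def toy_validation_error(params):
    """Hypothetical surrogate for CNN error; NOT taken from the paper."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Hypothetical search space: log10(learning rate) and dropout rate.
search_space = [(-5.0, -1.0), (0.0, 0.8)]
best_params, best_error = black_hole_optimize(toy_validation_error, search_space)
print("best hyperparameters:", best_params, "error:", best_error)
```

Because the black hole is only ever replaced by a strictly better candidate, the returned error never gets worse as the number of iterations grows. With a real CNN each objective call is a full training run, so the population size and iteration count would need to be kept small.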