[slides] Music-Genre Classification using Bidirectional Long Short-Term Memory and Mel-Frequency Cepstral Coefficients

This research proposes a method for music genre classification using Bidirectional Long Short-Term Memory (BiLSTM) and Mel-Frequency Cepstral Coefficients (MFCC) extraction features. The method was tested on the GTZAN and ISMIR2004 datasets, with the ISMIR2004 dataset being preprocessed to match the duration of the GTZAN dataset (30 seconds). Preprocessing steps included removing silent parts and stretching the audio to ensure normalized input. The proposed method achieved an accuracy of 93.10% on the GTZAN dataset and 93.69% on the ISMIR2004 dataset. The study compared the performance of the BiLSTM model with other models such as LSTM and MCLNN, demonstrating that the BiLSTM model outperformed these models in terms of validation accuracy. The research also evaluated the model using confusion matrices and classification reports, showing high precision, recall, F1-score, and accuracy. The results indicate that the BiLSTM model is effective in classifying music genres, even on imbalanced datasets. Future research could explore different feature extraction methods and use more recent datasets to improve the model's relevance.This research proposes a method for music genre classification using Bidirectional Long Short-Term Memory (BiLSTM) and Mel-Frequency Cepstral Coefficients (MFCC) extraction features. The method was tested on the GTZAN and ISMIR2004 datasets, with the ISMIR2004 dataset being preprocessed to match the duration of the GTZAN dataset (30 seconds). Preprocessing steps included removing silent parts and stretching the audio to ensure normalized input. The proposed method achieved an accuracy of 93.10% on the GTZAN dataset and 93.69% on the ISMIR2004 dataset. The study compared the performance of the BiLSTM model with other models such as LSTM and MCLNN, demonstrating that the BiLSTM model outperformed these models in terms of validation accuracy. The research also evaluated the model using confusion matrices and classification reports, showing high precision, recall, F1-score, and accuracy. The results indicate that the BiLSTM model is effective in classifying music genres, even on imbalanced datasets. Future research could explore different feature extraction methods and use more recent datasets to improve the model's relevance.

Music-Genre Classification using Bidirectional Long Short-Term Memory and Mel-Frequency Cepstral Coefficients

January, 9th 2024 | Nantalira Niar Wijaya, De Rosal Ignatius Moses Setiadi, Ahmad Rofiqul Muslih