A systematic study of the class imbalance problem in convolutional neural networks


2018 | Mateusz Buda, Atsuto Maki, Maciej A. Mazurowski
This study systematically investigates the impact of class imbalance on the classification performance of convolutional neural networks (CNNs) and compares methods for addressing it. Class imbalance is a common problem in machine learning, and in deep learning it can significantly harm both training and generalization. The study uses three benchmark datasets (MNIST, CIFAR-10, and ImageNet) to evaluate the effects of imbalance and to compare oversampling, undersampling, two-phase training, and thresholding. The main evaluation metric is the area under the receiver operating characteristic curve (ROC AUC), adjusted for multi-class tasks, since overall accuracy is not a suitable metric for imbalanced data.

Key findings:
1. Class imbalance has a detrimental effect on classification performance.
2. Oversampling is the most effective method in almost all scenarios, consistently improving performance over the baseline.
3. Undersampling generally performs poorly and often degrades performance.
4. Two-phase training methods show mixed results, with performance varying depending on the baseline method.
5. Thresholding is useful for compensating for prior class probabilities when the number of correctly classified cases matters.

The study concludes that oversampling should be applied up to the level that fully eliminates the imbalance, while the optimal undersampling ratio depends on the extent of imbalance. Thresholding is recommended when the number of correctly classified cases is a concern. Sketches of the two recommended techniques, and of a multi-class ROC AUC evaluation, follow below.
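The oversampling the paper recommends is random oversampling: minority-class samples are duplicated until every class reaches the majority-class count. Below is a minimal sketch of that idea in plain NumPy; the function name and signature are illustrative, not from the paper.

```python
import numpy as np

def oversample_to_balance(features, labels, seed=0):
    """Random oversampling: resample each class (with replacement)
    until every class matches the largest class's sample count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()  # majority-class count
    picked = [
        rng.choice(np.flatnonzero(labels == c), size=target, replace=True)
        for c in classes
    ]
    # Shuffle so duplicated samples are not grouped by class.
    order = rng.permutation(np.concatenate(picked))
    return features[order], labels[order]

# Usage: X_bal, y_bal = oversample_to_balance(X_train, y_train)
```

In a typical deep-learning pipeline the same effect is achieved with a weighted sampler in the data loader rather than by materializing duplicates; the array version above just makes the mechanism explicit.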
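Thresholding compensates for the class priors learned from the imbalanced training set by rescaling the network's outputs. A common form, and a reasonable reading of the paper's approach, is to divide each posterior estimate by the class prior estimated from the training data; the sketch below assumes `probs` is an (N, C) array of softmax outputs.

```python
import numpy as np

def threshold_by_prior(probs, train_labels):
    """Divide softmax outputs by class priors estimated from the
    training labels, renormalize, and return predicted classes."""
    classes, counts = np.unique(train_labels, return_counts=True)
    priors = counts / counts.sum()            # p(c) on the training set
    adjusted = probs / priors                  # posterior / prior
    adjusted /= adjusted.sum(axis=1, keepdims=True)
    return adjusted.argmax(axis=1)             # predicted class indices
```

Unlike resampling, this changes only the decision rule at test time, which is why the paper recommends it when the raw number of correctly classified cases is what matters.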
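For evaluation, the study scores models with a multi-class extension of ROC AUC rather than accuracy. The snippet below uses scikit-learn's one-vs-rest, macro-averaged variant as a stand-in; the paper defines its own multi-class adjustment, so treat this as an assumption about a comparable metric, not the authors' exact formula.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy stand-ins: 3-class integer labels and softmax-like scores
# (each row of y_score sums to 1, as roc_auc_score requires here).
y_true = np.array([0, 1, 2, 1, 0, 2])
logits = np.random.default_rng(0).normal(size=(6, 3))
y_score = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(f"multi-class ROC AUC (OvR, macro): {auc:.3f}")
```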