Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

27 Oct 2019 | Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma
This paper proposes two methods to improve performance on imbalanced datasets: a label-distribution-aware margin (LDAM) loss and a deferred re-balancing optimization schedule. The LDAM loss encourages larger margins for minority classes by incorporating the label distribution into the loss function, extending the existing soft-margin loss. It is theoretically motivated by minimizing a margin-based generalization bound and is shown to optimize a uniform-label generalization error bound. The deferred re-balancing schedule defers re-weighting until after an initial training stage, allowing the model to learn a good initial representation while avoiding some of the complications associated with re-weighting or re-sampling; it is shown to improve both the optimization and the generalization of a generic re-weighting scheme. Experiments show that either method alone already improves over existing techniques, and their combination yields further gains. The methods achieve significant improvements on several benchmark vision tasks, including artificially imbalanced CIFAR and Tiny ImageNet and the real-world large-scale imbalanced dataset iNaturalist 2018.
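To make the LDAM idea concrete, below is a minimal PyTorch sketch of the loss. It enforces a per-class margin proportional to n_j^(-1/4) (where n_j is the number of training examples in class j) by subtracting that margin from the true-class logit before a scaled cross-entropy. The margin normalization via `max_m` and the logit scale `s` follow common settings for this loss; the specific hyperparameter values here are illustrative assumptions, not prescriptions from the paper.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class LDAMLoss(nn.Module):
    """Sketch of a label-distribution-aware margin (LDAM) loss."""

    def __init__(self, cls_num_list, max_m=0.5, s=30.0, weight=None):
        super().__init__()
        # Per-class margins Delta_j proportional to n_j^(-1/4),
        # rescaled so the largest margin equals max_m.
        counts = np.asarray(cls_num_list, dtype=np.float64)
        m_list = 1.0 / np.sqrt(np.sqrt(counts))
        m_list = m_list * (max_m / np.max(m_list))
        self.register_buffer("m_list", torch.tensor(m_list, dtype=torch.float32))
        self.s = s            # logit scale, as in cosine-margin-style losses
        self.weight = weight  # optional per-class weights (used for re-weighting)

    def forward(self, logits, target):
        # Subtract the class-dependent margin from the true-class logit only.
        batch_m = self.m_list[target]                                    # (batch,)
        one_hot = F.one_hot(target, num_classes=logits.size(1)).float()  # (batch, C)
        logits_m = logits - batch_m.unsqueeze(1) * one_hot
        return F.cross_entropy(self.s * logits_m, target, weight=self.weight)
```

Usage is the same as any criterion, e.g. `criterion = LDAMLoss(cls_num_list=[5000, 500, 50])` followed by `loss = criterion(model(x), y)`, where `cls_num_list` holds the per-class training counts.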
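The deferred re-balancing schedule can likewise be sketched as a two-stage loop: train with uniform class weights first, then switch on per-class re-weighting for the remaining epochs (the paper defers class-balanced "effective number" weights in the style of Cui et al.). In this sketch, the 160/200-epoch split, `beta = 0.9999`, and the `train_one_epoch`, `model`, `loader`, and `optimizer` names are assumed for illustration.

```python
def drw_weights(cls_num_list, beta=0.9999):
    # Class-balanced weights from "effective number" of samples:
    # w_j proportional to (1 - beta) / (1 - beta^{n_j}), normalized.
    counts = np.asarray(cls_num_list, dtype=np.float64)
    effective_num = 1.0 - np.power(beta, counts)
    weights = (1.0 - beta) / effective_num
    weights = weights / weights.sum() * len(cls_num_list)
    return torch.tensor(weights, dtype=torch.float32)

for epoch in range(200):
    if epoch < 160:
        # Stage 1: plain LDAM, no re-weighting, to learn the representation.
        criterion = LDAMLoss(cls_num_list)
    else:
        # Stage 2: deferred re-weighting on top of the same margin loss.
        criterion = LDAMLoss(cls_num_list, weight=drw_weights(cls_num_list))
    train_one_epoch(model, loader, criterion, optimizer)  # assumed helper
```

The key design point is that re-weighting is applied only after the model has learned a reasonable initial representation, which avoids the optimization difficulties that early re-weighting or re-sampling can cause.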