30 Oct 2018 | Bo Han, Quanming Yao, Xingrui Yu, Gang Niu, Miao Xu, Weihua Hu, Ivor W. Tsang, Masashi Sugiyama
This paper proposes a new deep learning paradigm called "Co-teaching" for training deep neural networks robustly in the presence of extremely noisy labels. The key idea is to maintain two networks simultaneously and have them teach each other: in each mini-batch, each network selects its small-loss instances and passes them to its peer, which updates its parameters only on that selection. Because the two networks learn differently, they filter out different errors, so noisy labels are less likely to accumulate in either model. The method is evaluated on noisy versions of MNIST, CIFAR-10, and CIFAR-100, where it outperforms state-of-the-art baselines across noise levels and is especially strong under high noise rates. Training uses stochastic gradient descent with momentum and exploits the memorization effect of deep networks, which tend to fit clean patterns before overfitting noisy labels, to decide how many small-loss instances to keep as training progresses. The paper also discusses theoretical and practical implications of Co-teaching, including its adaptability to different network architectures and its potential for future research in robust learning with noisy labels.
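As a rough sketch of the core cross-update (not the authors' released code), one mini-batch step might look like the following. The names `net1`, `net2`, `opt1`, `opt2`, and `remember_rate` are illustrative assumptions; `remember_rate` stands in for the paper's fraction of small-loss instances kept at the current epoch.

```python
import torch
import torch.nn.functional as F

def co_teaching_step(net1, net2, opt1, opt2, x, y, remember_rate):
    """Hedged sketch of one Co-teaching update on a mini-batch (x, y).

    Each network ranks instances by its own loss, keeps the fraction
    `remember_rate` with the smallest loss, and its peer is updated on
    that selection.
    """
    # Per-instance losses for both networks (no reduction).
    loss1 = F.cross_entropy(net1(x), y, reduction='none')
    loss2 = F.cross_entropy(net2(x), y, reduction='none')

    num_remember = max(1, int(remember_rate * len(y)))

    # Indices of the small-loss ("likely clean") instances picked by each network.
    idx1 = torch.argsort(loss1)[:num_remember]
    idx2 = torch.argsort(loss2)[:num_remember]

    # Cross update: net1 learns from net2's selection, and vice versa.
    opt1.zero_grad()
    F.cross_entropy(net1(x[idx2]), y[idx2]).backward()
    opt1.step()

    opt2.zero_grad()
    F.cross_entropy(net2(x[idx1]), y[idx1]).backward()
    opt2.step()
```

In the paper, the kept fraction starts near 1 and is gradually decreased over the first epochs, reflecting the memorization effect: early in training the networks mostly fit clean labels, so more instances can be trusted, while later a smaller fraction is kept to avoid memorizing noisy ones.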