27 Dec 2017 | Aritra Ghosh, Himanshu Kumar, P. S. Sastry
This paper addresses the issue of label noise in deep neural networks, a common problem in classifier learning where training data is often noisy due to human errors, measurement issues, or subjective biases. The authors propose a novel approach to robust loss functions that are inherently tolerant to label noise, focusing on multiclass classification problems. They derive sufficient conditions for a loss function to be noise-tolerant, generalizing existing results for binary classification. Specifically, they show that the mean absolute error (MAE) loss function satisfies these conditions, making it robust to various types of label noise. Empirical results on both image and text datasets demonstrate the effectiveness of MAE in maintaining high accuracy even under high levels of label noise, while common losses such as categorical cross-entropy (CCE) show significant degradation. The paper also discusses the theoretical foundations and provides a detailed analysis of the robustness of different loss functions, highlighting the advantages and limitations of each.
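The key sufficient condition is symmetry: summing the loss over all class labels should give a constant independent of the classifier's output. The following minimal sketch (an illustration, not the paper's code) checks this numerically for MAE and CCE on a softmax output, treating MAE as the L1 distance between the one-hot label and the predicted distribution:

```python
import numpy as np

def mae_loss(probs, label):
    # L1 distance between the one-hot label e_y and the predicted
    # distribution f(x); equals 2 * (1 - probs[label]) for a probability vector
    onehot = np.eye(len(probs))[label]
    return np.abs(onehot - probs).sum()

def cce_loss(probs, label):
    # categorical cross-entropy: -log f_y(x)
    return -np.log(probs[label])

probs = np.array([0.7, 0.2, 0.1])  # an arbitrary softmax output, K = 3 classes

mae_sum = sum(mae_loss(probs, j) for j in range(3))
cce_sum = sum(cce_loss(probs, j) for j in range(3))

print(mae_sum)  # 2 * (K - 1) = 4.0, independent of probs -> symmetric
print(cce_sum)  # depends on probs -> not symmetric, hence not noise-tolerant
```

Because the MAE sum is the constant 2(K − 1) for any probability vector, randomly flipped labels shift the expected risk by a term that does not depend on the classifier, leaving the minimizer unchanged; the CCE sum varies with the prediction, which is the root of its degradation under noise.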