17 Nov 2017 | Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang
This paper proposes a generalized large-margin softmax (L-Softmax) loss for convolutional neural networks (CNNs) to learn more discriminative features. The L-Softmax loss explicitly encourages intra-class compactness and inter-class separability, improving performance on visual classification and verification tasks. In the standard softmax loss, the score for each class depends on the inner product W_j^T x = ||W_j|| ||x|| cos(θ_j); L-Softmax introduces an integer margin parameter m that multiplies the angle to the ground-truth class, requiring cos(mθ) in place of cos(θ) and thereby enforcing a larger angular separation between classes. This yields a stricter decision boundary and, in turn, better feature learning. The L-Softmax loss can be optimized with standard stochastic gradient descent and delivers consistent gains on benchmark datasets such as MNIST, CIFAR10, CIFAR100, and LFW. Because the margin makes the learning task harder, the loss also helps reduce overfitting. The paper provides a geometric interpretation of L-Softmax, showing how it narrows the feasible angular region for each class and widens the margin between classes. Experiments demonstrate that L-Softmax outperforms the standard softmax loss and other state-of-the-art methods in both classification and verification, with m controlling the difficulty of the learning task. The loss integrates easily into existing CNN architectures and improves feature discriminativeness, leading to better performance in visual recognition tasks.
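To make the margin mechanism concrete, below is a minimal NumPy sketch of the large-margin softmax computation for a single sample, assuming the commonly cited formulation ψ(θ) = (-1)^k cos(mθ) − 2k on the interval [kπ/m, (k+1)π/m], which keeps the modified target score monotonically decreasing in θ. The function names l_softmax_loss and psi, and the random-data usage at the end, are illustrative rather than taken from the paper or its released code.

```python
import numpy as np

def psi(theta, m):
    """Piecewise extension of cos(m*theta) that stays monotonically
    decreasing on [0, pi]; k indexes the interval [k*pi/m, (k+1)*pi/m]."""
    k = np.floor(theta * m / np.pi)
    return ((-1.0) ** k) * np.cos(m * theta) - 2.0 * k

def l_softmax_loss(x, W, y, m=2):
    """Large-margin softmax loss for one feature vector (hypothetical sketch).

    x : (d,) feature vector of a single sample
    W : (d, C) weight matrix of the last fully connected layer (no bias)
    y : int, ground-truth class index
    m : integer angular margin (m=1 recovers the standard softmax loss)
    """
    logits = W.T @ x                       # ||W_j|| * ||x|| * cos(theta_j)
    w_norms = np.linalg.norm(W, axis=0)
    x_norm = np.linalg.norm(x)
    cos_theta_y = logits[y] / (w_norms[y] * x_norm + 1e-12)
    theta_y = np.arccos(np.clip(cos_theta_y, -1.0, 1.0))
    # Replace only the target-class logit with its margin-enlarged version.
    logits_margin = logits.copy()
    logits_margin[y] = w_norms[y] * x_norm * psi(theta_y, m)
    # Standard cross-entropy on the modified logits.
    logits_margin -= logits_margin.max()   # numerical stability
    log_probs = logits_margin - np.log(np.exp(logits_margin).sum())
    return -log_probs[y]

# Usage with random data: 5 classes, 64-dimensional features.
rng = np.random.default_rng(0)
x = rng.normal(size=64)
W = rng.normal(size=(64, 5))
print(l_softmax_loss(x, W, y=3, m=4))
```

Since only the ground-truth logit is shrunk (ψ(θ) ≤ cos(θ) for θ in [0, π]), increasing m tightens the feasible angular region the sample must fall into before it is classified confidently, which is the geometric picture described above.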