A Light CNN for Deep Face Representation with Noisy Labels

VOL. 14, NO. 8, AUGUST 2017 | Xiang Wu, Ran He, Senior Member, IEEE, Zhenan Sun*, Member, IEEE, and Tieniu Tan, Fellow, IEEE
This paper presents a Light CNN framework for learning a compact, efficient face representation from large-scale datasets with noisy labels. The key contributions are:

1. **Max-Feature-Map (MFM) Operation**: An activation function inspired by maxout. MFM sets up a competitive relationship between two feature maps and keeps the stronger response, which separates informative signals from noisy ones and acts as feature selection. It also reduces the number of parameters and the computational cost.
2. **Light CNN Architectures**: Three networks (Light CNN-4, Light CNN-9, and Light CNN-29) are designed to reach better performance with fewer parameters and less computation. They combine MFM, small convolution filters, and Network in Network layers.
3. **Semantic Bootstrapping Method**: Noisy labeled images are handled by re-labeling the training data with a pre-trained deep network. The method balances the network's prediction against the original label, which makes the resulting labels more consistent.
4. **Experimental Results**: The framework achieves state-of-the-art results on several face benchmarks without fine-tuning. A single model with a 256-D representation outperforms other methods on large-scale, video-based, cross-age, heterogeneous, and cross-view face recognition tasks.
5. **Computational Efficiency**: The Light CNN models are significantly faster and more efficient than other published CNN methods, which makes them suitable for real-time applications.

The paper also examines how effective MFM is in different CNN architectures and evaluates the proposed methods on a range of datasets to show their robustness and generalization.
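To make the MFM idea concrete, here is a minimal sketch in plain Python. It assumes the input has an even number of channels, that channel k is paired with channel k + n (where n is half the channel count), and that the output keeps the element-wise maximum of each pair. The name `max_feature_map` and the nested-list layout are illustrative choices, not taken from the paper.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, stored as rows x cols


def max_feature_map(channels: List[FeatureMap]) -> List[FeatureMap]:
    """Element-wise max over the two halves of the channel axis (2n -> n channels).

    Assumption: channel k competes with channel k + n.
    """
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(channels) // 2
    first, second = channels[:half], channels[half:]
    return [
        [[max(a, b) for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(map_a, map_b)]
        for map_a, map_b in zip(first, second)
    ]


# Four 2x2 input channels collapse to two 2x2 output channels.
x = [
    [[1.0, -2.0], [0.5, 3.0]],
    [[0.0, 0.0], [4.0, -1.0]],
    [[-1.0, 5.0], [0.5, 2.0]],
    [[2.0, -3.0], [1.0, -1.0]],
]
y = max_feature_map(x)
print(y)  # [[[1.0, 5.0], [0.5, 3.0]], [[2.0, 0.0], [4.0, -1.0]]]
```

Unlike ReLU, which thresholds each response against zero, this operation always passes one of the two competing responses through, and the output has half as many channels as the input. That halving is one way the layer can cut the parameters needed by the layers that follow.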
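The re-labeling step can be sketched in the same spirit. The summary does not state the exact decision rule, so the one below is purely an assumption for illustration: keep the original label unless the pre-trained network predicts a different identity with a probability at or above a threshold. The function name `relabel` and the `accept` threshold are hypothetical.

```python
from typing import List


def relabel(original: int, probs: List[float], accept: float = 0.9) -> int:
    """Pick a training label from the original annotation and model probabilities.

    Hypothetical rule: trust the model only when it disagrees confidently.
    """
    predicted = max(range(len(probs)), key=probs.__getitem__)
    if predicted != original and probs[predicted] >= accept:
        return predicted  # confident disagreement: take the model's label
    return original       # agreement or low confidence: keep the annotation


agree = relabel(0, [0.70, 0.20, 0.10])
confident_flip = relabel(0, [0.02, 0.95, 0.03])
unsure = relabel(0, [0.40, 0.60])
```

Here `agree` and `unsure` stay at the original label 0, while `confident_flip` becomes 1. Raising `accept` shifts the balance toward the original labels, and lowering it shifts the balance toward the network's predictions.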