This paper presents DeepID2+, a high-performance deep convolutional network for face recognition that achieves state-of-the-art results on the LFW and YouTube Faces benchmarks. The network is trained with joint identification-verification supervision; increasing the dimension of the hidden representations and adding supervisory signals to early convolutional layers yields the new state of the art. Through empirical studies, three key properties of its deep neural activations are identified: sparsity, selectiveness, and robustness.
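The identification-verification supervision can be sketched as two complementary signals: a softmax cross-entropy loss that classifies each face into one of the training identities, and a contrastive-style verification loss that pulls features of the same identity together and pushes different identities at least a margin apart. The sketch below is a minimal NumPy illustration of that idea; the feature dimensions, margin value, and function names are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def identification_loss(logits, label):
    # Softmax cross-entropy over identity classes (identification signal).
    e = np.exp(logits - logits.max())
    p = e / e.sum()
    return -np.log(p[label])

def verification_loss(f_i, f_j, same_identity, m=1.0):
    # Verification signal: squared L2 distance for same-identity pairs,
    # hinged margin penalty for different-identity pairs.
    d = np.sum((f_i - f_j) ** 2)
    if same_identity:
        return 0.5 * d
    return 0.5 * max(0.0, m - np.sqrt(d)) ** 2

# Toy usage: one identity prediction plus one same-identity feature pair.
logits = np.array([2.0, 0.5, -1.0])
f1, f2 = np.array([1.0, 0.0]), np.array([0.9, 0.1])
total = identification_loss(logits, 0) + verification_loss(f1, f2, True)
```

In training, the two losses would be summed with a weighting coefficient and backpropagated through the shared feature layers.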
Sparsity refers to the moderate activation of neurons in the top hidden layer: for each input face image, only about half of the neurons are activated. This moderate sparsity maximizes the network's discriminative power as well as the distance between images of different identities. Selectiveness indicates that higher-layer neurons respond strongly to particular identities and identity-related attributes: individual neurons are consistently activated or inhibited for specific identities or attributes, and this selectivity is learned implicitly, without any explicit training on attribute labels. Robustness refers to the network's ability to maintain performance under occlusions, even though occlusion patterns are not included in the training data.
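The sparsity measurement described above amounts to counting, per image, the fraction of top-layer neurons that fire after the rectified-linear nonlinearity. A minimal sketch, using random Gaussian values as a hypothetical stand-in for one image's pre-activations (a real DeepID2+ model would produce these from a face crop):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-activations of the top hidden layer for one face image;
# 512 is an assumed layer width, not the paper's architecture.
pre_activations = rng.normal(size=512)
activations = np.maximum(pre_activations, 0.0)  # ReLU

# Sparsity: fraction of neurons that fire (non-zero after ReLU).
fraction_active = np.count_nonzero(activations) / activations.size
```

For roughly zero-centered pre-activations, about half the neurons fire, which matches the "about half activated" behavior the paper reports for its learned features.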
The network's deep structure gives rise to these properties naturally, without additional regularization. The study further shows that binary activation patterns matter more than activation magnitudes in deep neural networks. DeepID2+ features are also more robust to image corruption than handcrafted features such as LBP. Evaluated on multiple benchmarks, the network shows significant improvements in face verification and identification accuracy. The results demonstrate that DeepID2+ features are not only highly discriminative but also robust to occlusions and other corruptions, and the study provides valuable insights into the intrinsic properties of deep networks and their applications in face recognition.
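The claim that binary activation patterns carry more identity information than magnitudes can be illustrated with a toy simulation: perturb the magnitudes of two hypothetical feature vectors that share the same set of firing neurons, and compare their similarity before and after binarization. Everything here (dimensions, noise model) is an assumption for illustration, not the paper's experiment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 512-d deep features for two images of the same person:
# the same neurons fire, but magnitudes vary between the two images.
base = np.maximum(rng.normal(size=512), 0.0)
f_a = base * rng.uniform(0.5, 1.5, size=512)
f_b = base * rng.uniform(0.5, 1.5, size=512)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Binarize: keep only which neurons fire, discarding magnitudes.
b_a = (f_a > 0).astype(float)
b_b = (f_b > 0).astype(float)

sim_raw = cosine(f_a, f_b)   # degraded by magnitude noise
sim_bin = cosine(b_a, b_b)   # identical firing pattern -> similarity 1.0
```

Because the firing pattern is unchanged, the binarized features match perfectly while the raw features do not, mirroring the paper's observation that thresholding DeepID2+ activations to binary codes preserves most of their discriminative power.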