Bag of Tricks for Image Classification with Convolutional Neural Networks


5 Dec 2018 | Tong He Zhi Zhang Hang Zhang Zhongyue Zhang Junyuan Xie Mu Li
This paper presents a collection of training refinements for convolutional neural networks (CNNs) that together significantly improve image classification accuracy. The authors examine techniques spanning data augmentation, optimization, and model architecture, and measure the contribution of each through ablation studies. Combined, the refinements raise ResNet-50's ImageNet top-1 validation accuracy from 75.3% to 79.29%, and the gains carry over to transfer learning on downstream tasks such as object detection and semantic segmentation.

The key techniques discussed are: large-batch training with linear learning-rate scaling and warmup; low-precision (FP16) training; model tweaks such as modified convolutional layers and input stems; and training refinements including cosine learning-rate decay, label smoothing, knowledge distillation, and mixup training. Evaluated across several CNN architectures, including ResNet, Inception-V3, and MobileNet, these techniques yield consistent accuracy improvements. For instance, the improved ResNet-50 achieves higher accuracy on object detection and semantic segmentation than the baseline. The paper concludes that the refinements are effective across different CNN architectures and datasets, and that the benefits extend to broader domains where classification-based models are used.
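Two of the schedule-related tricks, linear warmup and cosine decay, can be sketched as a single learning-rate function. This is an illustrative implementation, not the paper's code; the function and parameter names are assumptions.

```python
import math

def learning_rate(batch, warmup_batches, total_batches, base_lr):
    """Schedule described in the paper: ramp linearly from near 0 to
    base_lr over the warmup period, then decay to 0 along a cosine curve.
    Names here are illustrative, not taken from the authors' code.
    """
    if batch < warmup_batches:
        # Linear warmup: scale the rate up over the first few batches.
        return base_lr * (batch + 1) / warmup_batches
    # Cosine decay over the remaining batches (progress goes 0 -> 1).
    progress = (batch - warmup_batches) / (total_batches - warmup_batches)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```

Compared with step decay, the cosine schedule keeps the learning rate relatively high through the middle of training and only approaches zero at the end, which the paper finds helps final accuracy.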
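Label smoothing replaces the hard one-hot training target with a softened distribution: the true class receives probability 1 - epsilon and the remaining epsilon is spread uniformly over the other classes. A minimal sketch (function name and default epsilon are illustrative):

```python
def smooth_labels(num_classes, true_class, epsilon=0.1):
    """Build a smoothed target distribution: 1 - epsilon on the true
    class, epsilon / (K - 1) on each of the other K - 1 classes."""
    off_value = epsilon / (num_classes - 1)
    target = [off_value] * num_classes
    target[true_class] = 1.0 - epsilon
    return target
```

Softening the targets discourages the network from producing extremely confident logits, which the paper shows acts as a regularizer.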
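Mixup training forms each training example as a convex combination of two examples and of their (one-hot or smoothed) labels, with the mixing weight drawn from a Beta(alpha, alpha) distribution. A sketch on plain Python lists, with illustrative names; in practice this would operate on image tensors:

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two (input, label) pairs with a Beta(alpha, alpha) weight,
    as in mixup training. Inputs and labels are flat lists of floats."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

Because the labels are mixed with the same weight as the inputs, the loss is computed against a soft target, which encourages linear behaviour between training examples.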