EfficientNetV2: Smaller Models and Faster Training

23 Jun 2021 | Mingxing Tan, Quoc V. Le
This paper introduces EfficientNetV2, a new family of convolutional neural networks designed to achieve faster training speed and better parameter efficiency than previous models. The authors combine training-aware neural architecture search (NAS) with scaling to jointly optimize training speed and parameter efficiency. The search space is enriched with new operations such as Fused-MBConv, which improves training speed and reduces parameter count. The paper also proposes an improved method of progressive learning, which adaptively increases regularization (e.g., data augmentation) as the image size grows during training, preventing accuracy drops.

Experiments show that EfficientNetV2 models train up to 4x faster than prior models while being up to 6.8x smaller in parameter count. On ImageNet, EfficientNetV2 achieves 87.3% top-1 accuracy, outperforming recent Vision Transformers by 2.0% while training 5x-11x faster. The paper also demonstrates EfficientNetV2's effectiveness on other datasets such as CIFAR, Cars, and Flowers, highlighting its superior training speed and parameter efficiency.
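The core idea of improved progressive learning is that regularization strength should scale with image size: small images early in training get weak regularization, and larger images later get stronger regularization. A minimal sketch of such a schedule is below; the stage count, size range, and regularization ranges are illustrative placeholders, not the paper's exact training settings.

```python
def progressive_schedule(stage, num_stages=4,
                         size_min=128, size_max=300,
                         drop_min=0.1, drop_max=0.3,
                         aug_min=5.0, aug_max=15.0):
    """Linearly interpolate image size and regularization for one stage.

    Illustrative only: EfficientNetV2 jointly increases image size and
    regularization (dropout rate, augmentation magnitude) across training
    stages; the exact values here are assumptions for demonstration.
    """
    t = stage / (num_stages - 1)  # progress in [0, 1]
    image_size = int(size_min + t * (size_max - size_min))
    dropout = drop_min + t * (drop_max - drop_min)
    aug_magnitude = aug_min + t * (aug_max - aug_min)
    return image_size, dropout, aug_magnitude


if __name__ == "__main__":
    for s in range(4):
        size, drop, aug = progressive_schedule(s)
        print(f"stage {s}: image_size={size}, dropout={drop:.2f}, "
              f"aug_magnitude={aug:.1f}")
```

Each training stage would then build its data pipeline with the returned image size and augmentation magnitude, so regularization ramps up in lockstep with input resolution rather than staying fixed.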