29 May 2019 | Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le
MnasNet is a platform-aware neural architecture search (NAS) approach for mobile devices, designed to find mobile convolutional neural networks (CNNs) that achieve a good balance between accuracy and inference latency. Unlike previous methods that use FLOPS as a proxy for latency, MnasNet directly measures real-world inference latency by executing models on mobile phones. This approach ensures that the search process considers actual performance on mobile hardware, leading to more accurate and efficient models.
The paper introduces a novel factorized hierarchical search space that encourages layer diversity while maintaining a balance between flexibility and search space size. This search space allows different layer architectures in different blocks, enabling the model to adapt to varying computational demands. The search process is based on reinforcement learning, where the controller learns to maximize a reward function that considers both accuracy and latency.
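The combined objective can be sketched as a small function. This follows the soft-constraint reward described in the MnasNet paper, reward = ACC(m) × [LAT(m)/T]^w, where T is the target latency and the exponent w switches between two constants α and β depending on whether the model meets the target (the paper uses α = β = −0.07 in its main experiments); the function and argument names here are illustrative, not from the paper's code.

```python
def mnasnet_reward(accuracy, latency_ms, target_ms=78.0,
                   alpha=-0.07, beta=-0.07):
    """Soft-constraint reward from the MnasNet paper:

        reward = ACC(m) * (LAT(m) / T)^w,
        where w = alpha if LAT(m) <= target else beta.

    With alpha = beta = -0.07, a model faster than the target earns a
    small bonus and a slower model a smooth penalty, so the controller
    trades accuracy against latency rather than treating latency as a
    hard cutoff.
    """
    w = alpha if latency_ms <= target_ms else beta
    return accuracy * (latency_ms / target_ms) ** w
```

At the target latency the reward reduces to the raw accuracy; shaving latency below the target raises the reward slightly, which is what lets the search discover models that are both faster and more accurate than the baseline.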
Experimental results show that MnasNet outperforms state-of-the-art mobile CNN models on multiple tasks, including ImageNet classification and COCO object detection. On ImageNet, MnasNet achieves 75.2% top-1 accuracy with 78ms latency on a Pixel phone, which is 1.8× faster than MobileNetV2 with 0.5% higher accuracy and 2.3× faster than NASNet with 1.2% higher accuracy. On COCO, MnasNet achieves better mAP quality than MobileNets.
The paper also discusses the impact of different latency constraints and search space configurations on model performance. It shows that MnasNet can achieve better accuracy-latency trade-offs by exploring a diverse range of architectures. The results demonstrate that MnasNet is effective in finding mobile CNN models that are both accurate and efficient, making it a valuable tool for mobile application development.