2018 | Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han
The paper "AMC: AutoML for Model Compression and Acceleration on Mobile Devices" by Yihui He et al. introduces a novel approach called AutoML for Model Compression (AMC) that leverages reinforcement learning to automate the process of model compression for mobile devices. Traditional model compression techniques rely on hand-crafted heuristics and rule-based policies, which are often sub-optimal and time-consuming. AMC aims to address these issues by using reinforcement learning to find the optimal compression policy, resulting in higher compression ratios, better accuracy preservation, and reduced human labor.
The authors propose two compression policy search protocols: resource-constrained compression and accuracy-guaranteed compression. Resource-constrained compression seeks the best accuracy given a hard limit on hardware resources (e.g., FLOPs or latency), while accuracy-guaranteed compression seeks the smallest model that incurs minimal accuracy loss. The AMC engine automates the compression process with learned policies rather than rule-based heuristics designed by experienced engineers.
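The resource-constrained protocol can be illustrated with a minimal sketch: an agent proposes per-layer keep ratios, candidates that exceed the FLOPs budget are rejected, and the best-scoring feasible candidate is kept. Note the hedges: the paper uses a DDPG reinforcement-learning agent that evaluates the actual pruned network, whereas this sketch substitutes random search and a stand-in reward; the layer FLOP counts and the `proxy_accuracy` function are hypothetical, not from the paper.

```python
import random

# Hypothetical per-layer FLOP counts (millions); not taken from the paper.
LAYER_FLOPS = [120.0, 240.0, 240.0, 60.0]
# Resource constraint: keep at most 50% of the original FLOPs.
FLOP_BUDGET = 0.5 * sum(LAYER_FLOPS)

def total_flops(keep_ratios):
    """FLOPs of the pruned model, assuming cost scales with kept channels."""
    return sum(f * r for f, r in zip(LAYER_FLOPS, keep_ratios))

def proxy_accuracy(keep_ratios):
    """Stand-in reward: pruning a layer harder costs more accuracy.
    A real AMC agent would instead evaluate the pruned network."""
    return sum(r ** 0.5 for r in keep_ratios) / len(keep_ratios)

def search(episodes=2000, seed=0):
    """Random search over per-layer keep ratios under the FLOPs budget
    (the paper uses a DDPG agent with continuous actions instead)."""
    rng = random.Random(seed)
    best_ratios, best_reward = None, float("-inf")
    for _ in range(episodes):
        # One "episode": propose a keep ratio in [0.1, 1.0] for each layer.
        ratios = [rng.uniform(0.1, 1.0) for _ in LAYER_FLOPS]
        if total_flops(ratios) > FLOP_BUDGET:
            continue  # infeasible: violates the resource constraint
        reward = proxy_accuracy(ratios)
        if reward > best_reward:
            best_ratios, best_reward = ratios, reward
    return best_ratios, best_reward

if __name__ == "__main__":
    best, reward = search()
    print("keep ratios:", [round(r, 2) for r in best])
    print("FLOPs used: %.0f / %.0f" % (total_flops(best), FLOP_BUDGET))
```

The accuracy-guaranteed protocol flips the roles: the reward would combine model size with an accuracy term, and the hard budget check would be dropped in favor of penalizing accuracy loss directly.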
The paper evaluates AMC on multiple neural networks, including VGG, ResNet, and MobileNet, demonstrating significant improvements over hand-crafted heuristic policies. For example, AMC achieved a 5× compression ratio for ResNet-50 on ImageNet with no loss of accuracy, and a 2× reduction in FLOPs for MobileNet with only 0.1% loss of Top-1 accuracy. Additionally, AMC achieved a 1.81× speedup in inference latency on an Android phone and a 1.43× speedup on a Titan XP GPU.
The authors also compare AMC with existing channel reduction methods and show that AMC outperforms them by more than 0.9%, even surpassing human experts by 0.3% without any human labor. AMC's effectiveness is further demonstrated in object detection tasks, where it achieved 0.7% better mAP compared to the best hand-crafted pruning methods under the same compression ratio.
Overall, AMC provides a more efficient and effective solution for model compression and acceleration on mobile devices, facilitating the deployment of deep neural networks with limited computational resources and power budgets.