This paper introduces Quantized Convolutional Neural Networks (Q-CNN), a unified framework for accelerating and compressing convolutional neural networks (CNNs) on mobile devices. The authors quantize both the filter kernels of convolutional layers and the weighting matrices of fully-connected layers, learning the quantization so as to minimize the estimation error of each layer's response; this yields a 4-6× speed-up and 15-20× compression with minimal loss in classification accuracy. Evaluated on the ILSVRC-12 benchmark, Q-CNN substantially improves test-phase efficiency and memory consumption, and an implementation on mobile devices classifies an image within one second. The main contributions are the unified Q-CNN framework, an effective training scheme that suppresses the accumulation of quantization errors across layers, and the demonstration of substantial acceleration and compression with minimal performance degradation.
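
To make the quantization idea concrete, below is a minimal numpy sketch of product quantization applied to a fully-connected weight matrix, with the layer response recovered from small per-subspace look-up tables. The function names, subspace/codeword counts, and the plain k-means objective are illustrative assumptions on my part; the paper's actual scheme optimizes the codebooks against the estimation error of each layer's response (and fine-tunes to suppress cumulative error), not the weight-reconstruction error used here.

```python
import numpy as np

def quantize_fc_weights(W, num_subspaces=4, num_codewords=16, iters=20, seed=0):
    """Product-quantize a fully-connected weight matrix W of shape (d_in, d_out).

    The input dimension is split into `num_subspaces` equal chunks; each chunk
    of every output column is replaced by its nearest codeword from a
    per-subspace codebook learned with plain k-means. All sizes and names are
    illustrative, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    d_in, d_out = W.shape
    assert d_in % num_subspaces == 0 and num_codewords <= d_out
    sub = d_in // num_subspaces
    codebooks, assignments = [], []
    for m in range(num_subspaces):
        # Sub-vectors of all output columns in subspace m: shape (d_out, sub).
        X = W[m * sub:(m + 1) * sub, :].T
        # Naive k-means on the sub-vectors (Q-CNN instead minimizes the
        # response error, weighting the objective by the input statistics).
        C = X[rng.choice(d_out, num_codewords, replace=False)]
        for _ in range(iters):
            d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            labels = d2.argmin(1)
            for k in range(num_codewords):
                if (labels == k).any():
                    C[k] = X[labels == k].mean(0)
        codebooks.append(C)
        assignments.append(labels)
    return codebooks, assignments

def quantized_response(x, codebooks, assignments):
    """Approximate x @ W: one dot product per codeword, then table look-ups."""
    sub = codebooks[0].shape[1]
    y = np.zeros(assignments[0].shape[0])
    for m, (C, labels) in enumerate(zip(codebooks, assignments)):
        lut = C @ x[m * sub:(m + 1) * sub]  # inner products with codewords
        y += lut[labels]                    # shared across all output units
    return y

# Demo: relative error of the quantized response on random data.
W = np.random.default_rng(1).normal(size=(64, 256))
x = np.random.default_rng(2).normal(size=64)
cb, asg = quantize_fc_weights(W)
exact = x @ W
rel_err = np.linalg.norm(exact - quantized_response(x, cb, asg)) / np.linalg.norm(exact)
print(f"relative response error: {rel_err:.3f}")
```

The look-up-table structure is where the speed-up comes from: each input sub-vector is multiplied with only `num_codewords` codewords instead of all `d_out` columns, and the results are reused across every output unit that shares a codeword, while storing codeword indices instead of full-precision weights gives the compression.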