COMPRESSING DEEP CONVOLUTIONAL NETWORKS USING VECTOR QUANTIZATION

18 Dec 2014 | Yunchao Gong, Liu Liu, Ming Yang, Lubomir Bourdev
This paper proposes compressing deep convolutional neural networks (CNNs) with vector quantization to reduce their storage requirements. Deep CNNs achieve strong results on object recognition and image classification, but their large parameter counts make them impractical to deploy on resource-limited hardware. The authors investigate vector quantization of CNN parameters and find that it outperforms existing matrix factorization methods, especially for the densely connected (fully-connected) layers, which account for most of a CNN's storage.

The paper systematically compares several quantization schemes: binarization, scalar quantization with k-means, product quantization, and residual quantization. Applying k-means clustering to the weights, or applying product quantization, strikes a good balance between model size and accuracy. Structured methods such as product quantization yield additional compression by exploiting redundancy across groups of parameters, and product quantization performs significantly better than the alternatives. On the ImageNet classification task, the authors achieve 16-24 times compression of a state-of-the-art CNN with only about 1% loss of accuracy. Evaluations on image retrieval tasks further show that the compressed models retain good performance.

The results corroborate earlier empirical findings that CNNs are heavily over-parameterized, with useful parameters accounting for roughly 5% of the total. The authors conclude that vector-quantized CNNs can be safely applied to tasks beyond image classification.
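To make the scalar k-means scheme concrete, the sketch below (an illustrative implementation, not the authors' code; function and parameter names are hypothetical) clusters the entries of a weight matrix into k centroids, so each 32-bit weight is replaced by a log2(k)-bit index into a small shared codebook:

```python
import numpy as np

def kmeans_quantize(W, k=16, iters=20, seed=0):
    """Scalar k-means quantization of a weight matrix (illustrative sketch).

    All weights are clustered into k scalar centroids; each weight is then
    stored as a small integer index plus a shared codebook.
    """
    rng = np.random.default_rng(seed)
    w = W.ravel()
    # initialize centroids by sampling k distinct weights
    centroids = rng.choice(w, size=k, replace=False)
    for _ in range(iters):
        # assign each weight to its nearest centroid
        idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        # update each centroid as the mean of its cluster
        for j in range(k):
            members = w[idx == j]
            if members.size:
                centroids[j] = members.mean()
    idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
    W_hat = centroids[idx].reshape(W.shape)      # quantized reconstruction
    codes = idx.astype(np.uint8).reshape(W.shape)
    return W_hat, centroids, codes

# Compression ratio: 32-bit floats -> log2(k)-bit codes plus a tiny codebook.
W = np.random.default_rng(1).standard_normal((64, 32)).astype(np.float32)
W_hat, codebook, codes = kmeans_quantize(W, k=16)
orig_bits = W.size * 32
comp_bits = W.size * 4 + codebook.size * 32    # 4 bits per code for k=16
print(orig_bits / comp_bits)
```

With k=16 the codes need only 4 bits each, so the storage cost is dominated by the index array and the ratio approaches 8x; larger k trades compression for reconstruction accuracy.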
The study highlights the potential of vector quantization for compressing deep CNNs, enabling their deployment on embedded systems and mobile devices.
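The product quantization idea that performed best in the comparison can be sketched as follows (again an illustrative implementation under assumed names, not the paper's code): each weight row is split into m sub-vectors, and a separate k-means codebook is learned per sub-space, so every sub-vector is stored as a single log2(k)-bit code.

```python
import numpy as np

def product_quantize(W, m=4, k=16, iters=20, seed=0):
    """Product quantization sketch: split each row of W into m sub-vectors
    and learn an independent k-means codebook for each sub-space."""
    rng = np.random.default_rng(seed)
    n, d = W.shape
    assert d % m == 0, "row dimension must split evenly into m sub-vectors"
    s = d // m                                    # sub-vector dimension
    codebooks, codes = [], []
    for i in range(m):
        sub = W[:, i * s:(i + 1) * s]             # (n, s) sub-vectors
        C = sub[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(iters):                    # plain k-means per sub-space
            d2 = ((sub[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            a = d2.argmin(1)
            for j in range(k):
                if (a == j).any():
                    C[j] = sub[a == j].mean(0)
        d2 = ((sub[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        codebooks.append(C)
        codes.append(d2.argmin(1))                # one code per row, per sub-space
    # reconstruct the approximation from codebooks + codes
    W_hat = np.hstack([codebooks[i][codes[i]] for i in range(m)])
    return W_hat, codebooks, codes

W = np.random.default_rng(1).standard_normal((128, 32))
W_hat, books, codes = product_quantize(W, m=4, k=16)
```

Because each sub-space has its own codebook, the effective number of representable rows is k^m, which is why structured quantization captures more parameter redundancy than scalar k-means at the same bit budget.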