4 Nov 2016 | Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer
SqueezeNet is a compact convolutional neural network (CNN) architecture that achieves AlexNet-level accuracy on ImageNet with 50 times fewer parameters and, when combined with model compression, a model size of less than 0.5MB. The architecture is designed for deployment on memory-limited hardware such as FPGAs and embedded systems. SqueezeNet is built from a novel "Fire" module, which combines 1x1 and 3x3 convolutional filters to reduce the parameter count while maintaining accuracy. Each Fire module consists of a squeeze layer of 1x1 filters, which reduces the number of input channels fed to the 3x3 filters, followed by an expand layer that mixes 1x1 and 3x3 filters. The architecture also downsamples late in the network, so that convolutional layers operate on large activation maps, which can improve classification accuracy.
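The parameter savings of the Fire module can be illustrated with a short back-of-the-envelope sketch. The helper functions and the layer sizes below (16 squeeze filters, 64+64 expand filters, 128 input channels) are illustrative choices in the spirit of the paper's modules, not a reproduction of the exact network:

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a k x k convolution (biases ignored)."""
    return in_ch * out_ch * k * k

def fire_params(in_ch, s1x1, e1x1, e3x3):
    """Parameter count of a Fire module: a squeeze layer of s1x1
    1x1 filters, then an expand layer mixing e1x1 1x1 filters and
    e3x3 3x3 filters (all sizes here are illustrative)."""
    squeeze = conv_params(in_ch, s1x1, 1)
    expand = conv_params(s1x1, e1x1, 1) + conv_params(s1x1, e3x3, 3)
    return squeeze + expand

# Hypothetical comparison: 128 input channels, 128 output filters.
plain = conv_params(128, 128, 3)      # a conventional all-3x3 layer
fire = fire_params(128, 16, 64, 64)   # a Fire module of the same width
print(plain, fire)  # 147456 vs 12288 -> ~12x fewer parameters
```

The squeeze layer is what makes the 3x3 filters cheap: each 3x3 filter sees only 16 channels instead of 128.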
SqueezeNet was evaluated against model compression techniques applied to AlexNet, such as SVD, network pruning, and Deep Compression. It achieves a 50x reduction in model size relative to AlexNet while matching or exceeding AlexNet's accuracy. When further compressed with quantization and pruning, SqueezeNet's model size drops below 0.5MB, making it well suited to deployment in resource-constrained environments.
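The size figures can be sanity-checked with simple arithmetic. The parameter counts below are the approximate values reported for AlexNet (~60M) and SqueezeNet (~1.25M); the quantization bit-width and sparsity at the end are illustrative assumptions, not the paper's exact compression recipe:

```python
ALEXNET_PARAMS = 60_000_000     # ~60M parameters (approximate)
SQUEEZENET_PARAMS = 1_250_000   # ~1.25M parameters (approximate)

def size_mb(params, bits_per_weight=32):
    """Storage for the weights alone, ignoring format overhead."""
    return params * bits_per_weight / 8 / 1e6

print(size_mb(ALEXNET_PARAMS))     # 240.0 MB as 32-bit floats
print(size_mb(SQUEEZENET_PARAMS))  # 5.0 MB -> roughly 50x smaller
# With aggressive compression (e.g., ~6-bit weights and ~33% of
# weights kept, illustrative numbers), the stored size falls
# under 0.5MB:
print(size_mb(SQUEEZENET_PARAMS, bits_per_weight=6) * 0.33)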
The paper also explores the design space of CNN architectures, covering both microarchitectural decisions (per-module choices such as the ratio of squeeze filters to expand filters) and macroarchitectural decisions (how modules are organized into a full network). It investigates how these choices affect model size and accuracy, and demonstrates that SqueezeNet can be further improved with bypass connections and other architectural modifications. The results show that SqueezeNet is not only efficient in model size but also effective at maintaining high accuracy, making it a promising candidate for applications that require small models. The architecture has been implemented in multiple frameworks and used for tasks such as autonomous driving and image recognition.
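The "simple bypass" variant mentioned above adds each Fire module's input to its output, which requires the module to preserve the channel count. A minimal NumPy sketch, with a stand-in for the Fire module itself (the `fire_stub` function is a hypothetical placeholder, not the real squeeze/expand computation):

```python
import numpy as np

def fire_stub(x, out_ch):
    """Stand-in for a Fire module: any op mapping the input's
    channels to out_ch channels. Here, a random 1x1 conv + ReLU."""
    in_ch = x.shape[0]
    w = np.random.randn(out_ch, in_ch) * 0.01
    return np.maximum(np.einsum('oc,chw->ohw', w, x), 0)

def fire_with_simple_bypass(x):
    """Simple bypass: add the module's input to its output.
    The element-wise add forces out_ch == in_ch."""
    return fire_stub(x, x.shape[0]) + x

x = np.random.randn(128, 14, 14)   # (channels, height, width)
y = fire_with_simple_bypass(x)
print(y.shape)  # (128, 14, 14) -- shape is preserved
```

Because the add constrains input and output channels to match, the paper also considers "complex bypass" connections that insert a 1x1 convolution on the skip path where the channel counts differ.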