Tensorizing Neural Networks

20 Dec 2015 | Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov
This paper introduces TensorNet, a neural network architecture that uses the Tensor-Train (TT) format to compress fully-connected layers while preserving their expressive power. The TT format represents dense weight matrices in a compact multilinear form, dramatically reducing the number of parameters: in the Very Deep VGG network, the dense weight matrix of a fully-connected layer is compressed by up to 200,000 times, for a total network compression factor of 7.

The TT-layer is a fully-connected layer whose weight matrix is stored in the TT format, which makes it possible to use a very large number of hidden units with only a moderate number of parameters. The TT format supports efficient forward computation and back-propagation, so TT-layers are compatible with existing training algorithms.

Experiments on MNIST, CIFAR-10, and ImageNet show that TT-layers match the performance of their uncompressed counterparts while using far fewer parameters and much less memory, with faster inference. On CIFAR-10, a wide and shallow TensorNet with 262,144 hidden units outperforms other non-convolutional networks. On ImageNet, the TT-layer reduces the number of parameters in the largest fully-connected layer by a factor of 194,622, with only a slight increase in error; for a 25088 × 4096 fully-connected layer, weight storage drops from 392 MB to 0.766 MB when the TT-layer is used.
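To make the parameter savings concrete, here is a minimal counting sketch. A TT-matrix whose k-th core has shape r_{k-1} × m_k × n_k × r_k stores sum_k r_{k-1}·m_k·n_k·r_k values instead of the (prod_k m_k)(prod_k n_k) entries of the dense matrix. The mode factorization and TT-ranks below are illustrative choices for a 25088 × 4096 layer, not the exact settings reported in the paper.

```python
import numpy as np

def tt_matrix_params(row_modes, col_modes, ranks):
    """Parameter count of a TT-matrix whose k-th core has shape
    (r_{k-1}, m_k, n_k, r_k); ranks must include r_0 = r_d = 1."""
    return sum(ranks[k] * m * n * ranks[k + 1]
               for k, (m, n) in enumerate(zip(row_modes, col_modes)))

# Illustrative factorization of a 25088 x 4096 fully-connected weight matrix
# (hypothetical modes/ranks; the paper's exact settings may differ).
row_modes = [2, 7, 8, 8, 7, 4]      # product = 25088
col_modes = [4, 4, 4, 4, 4, 4]      # product = 4096
ranks     = [1, 4, 4, 4, 4, 4, 1]   # hypothetical TT-ranks

dense = int(np.prod(row_modes)) * int(np.prod(col_modes))   # 102,760,448 weights
tt    = tt_matrix_params(row_modes, col_modes, ranks)

print(f"dense params: {dense:,} ({dense * 4 / 2**20:.1f} MB at float32)")
print(f"TT params:    {tt:,} ({tt * 4 / 2**20:.4f} MB at float32)")
print(f"compression:  {dense / tt:,.0f}x")
```

At float32 precision the dense layer indeed occupies 392 MB, matching the figure quoted above; the TT parameter count depends entirely on the chosen modes and ranks.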
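The TT-layer applies its weight matrix to the input without ever materializing the dense matrix, contracting the input tensor with one TT-core at a time. The following is a minimal NumPy sketch of that matrix-by-vector product under the core shape convention (r_{k-1}, m_k, n_k, r_k); it is an illustration of the idea, not the authors' implementation (which also handles mini-batches and the backward pass).

```python
import numpy as np

def tt_matvec(cores, x, col_modes):
    """y = W @ x for a TT-matrix W.

    cores[k] has shape (r_{k-1}, m_k, n_k, r_k) with r_0 = r_d = 1;
    W has prod_k m_k rows and prod_k n_k columns.
    """
    n_rest = int(np.prod(col_modes))
    # z keeps shape (row_modes_done, rank, col_modes_left):
    # no row modes processed yet, rank r_0 = 1, all column modes remaining.
    z = x.reshape(1, 1, n_rest)
    for core in cores:
        r_prev, m_k, n_k, r_k = core.shape
        n_rest //= n_k
        # peel off the current column mode j_k
        z = z.reshape(z.shape[0], r_prev, n_k, n_rest)
        # contract rank a_{k-1} and column mode j_k with the core,
        # producing row mode i_k and the next rank a_k
        z = np.einsum('pajn,aijb->pibn', z, core)
        # fold the new row mode into the "row modes done" axis
        z = z.reshape(z.shape[0] * m_k, r_k, n_rest)
    return z.reshape(-1)          # length prod_k m_k

# Tiny check against the dense matrix the cores represent.
rng = np.random.default_rng(0)
row_modes, col_modes, ranks = [3, 4], [2, 5], [1, 3, 1]
cores = [rng.standard_normal((ranks[k], row_modes[k], col_modes[k], ranks[k + 1]))
         for k in range(2)]
W = np.einsum('aijb,bklc->ikjl', cores[0], cores[1]).reshape(12, 10)
x = rng.standard_normal(10)
assert np.allclose(tt_matvec(cores, x, col_modes), W @ x)
```

Because the contraction touches each core only once, the cost scales with the core sizes rather than with the full dense matrix, which is what makes very wide TT-layers practical.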
The paper discusses the potential of TT-layers for real-time applications and mobile devices, as well as future work on further optimizing the TT-format for larger networks. The TT-decomposition framework is shown to be effective in reducing redundancy in neural network parametrization, enabling efficient training and inference with significantly fewer parameters.