FastText.zip: Compressing Text Classification Models

12 Dec 2016 | Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou & Tomas Mikolov
This paper addresses the challenge of building compact text classification models that fit within limited memory constraints. The authors propose a method based on product quantization (PQ) to store word embeddings, which significantly reduces memory usage while maintaining accuracy. They adapt the PQ technique to mitigate quantization artifacts, achieving a two-orders-of-magnitude reduction in memory compared to fastText, with only a slight decrease in accuracy. Experiments on various benchmarks demonstrate that the approach outperforms state-of-the-art methods on the trade-off between memory usage and accuracy. The paper also discusses related work, including compression techniques for language models and similarity estimation, and explains the proposed method in detail, covering feature pruning, quantization, hashing, and retraining. The authors plan to release the code as an extension of the fastText library to facilitate reproducibility and encourage further research in this area.
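To make the core idea concrete, here is a minimal sketch of product quantization applied to an embedding matrix, assuming NumPy and SciPy. The names and parameters (`pq_train`, `pq_decode`, `n_subvectors`, `n_centroids`) are illustrative, not the fastText implementation, which additionally handles norm quantization, feature pruning, hashing, and retraining.

```python
# Minimal product-quantization sketch (illustrative, not the fastText code).
# Each embedding is split into subvectors; each subvector is replaced by the
# index of its nearest k-means centroid, stored in one byte.
import numpy as np
from scipy.cluster.vq import kmeans2

def pq_train(embeddings, n_subvectors=4, n_centroids=256):
    """Learn one k-means codebook per contiguous subvector block."""
    n, d = embeddings.shape
    assert d % n_subvectors == 0, "dimension must split evenly into blocks"
    assert n_centroids <= 256, "so each code fits in a single byte"
    sub_d = d // n_subvectors
    codebooks, codes = [], []
    for i in range(n_subvectors):
        block = embeddings[:, i * sub_d:(i + 1) * sub_d]
        centroids, labels = kmeans2(block, n_centroids, minit='points')
        codebooks.append(centroids)
        codes.append(labels.astype(np.uint8))  # 1 byte per subvector
    return codebooks, np.stack(codes, axis=1)  # codes has shape (n, n_subvectors)

def pq_decode(codebooks, codes):
    """Reconstruct approximate embeddings from the stored byte codes."""
    return np.concatenate(
        [codebooks[i][codes[:, i]] for i in range(codes.shape[1])], axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((1000, 300)).astype(np.float32)
    books, codes = pq_train(emb)
    approx = pq_decode(books, codes)
    print(codes.shape, approx.shape)  # (1000, 4) (1000, 300)
```

With these illustrative settings, each 300-dimensional float32 vector (1,200 bytes) is stored as 4 code bytes, roughly the two-orders-of-magnitude reduction the paper reports, before accounting for the small codebook overhead and the paper's additional tricks.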