OneBit: Towards Extremely Low-bit Large Language Models
2024-11-29 | Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che
This paper introduces OneBit, a 1-bit quantization framework for large language models (LLMs) that substantially reduces the storage and computational overhead of deployment. The framework combines two components: a novel Linear layer built on Sign-Value-Independent Decomposition (SVID), which represents each weight matrix with approximately 1-bit values, and an SVID-based parameter initialization derived from matrix decomposition, which speeds up and stabilizes convergence during quantization-aware training.

Evaluated on perplexity and zero-shot accuracy across OPT, LLaMA, and LLaMA2 models of various sizes, OneBit retains at least 81% of the non-quantized performance on LLaMA models while using only 1-bit weight matrices, a significant improvement over existing extreme low-bit quantization methods. Extensive experiments further confirm the framework's efficiency and robustness, showing a favorable trade-off between model size and performance that makes OneBit suitable for deployment on devices with limited resources.
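To make the two components concrete, here is a minimal PyTorch sketch, assuming the paper's decomposition W ≈ sign(W) ⊙ (a bᵀ) and a layer of the form y = ((x ⊙ g) W±1ᵀ) ⊙ h. The names `svid` and `OneBitLinear`, the SVD-based rank-1 step, and all shapes are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the OneBit ideas; names and details are illustrative.
import torch
import torch.nn as nn


def svid(w: torch.Tensor):
    """Sign-Value-Independent Decomposition (sketch): split W into a +-1 sign
    matrix and a rank-1 approximation of |W|, so W ~= sign(W) * (a @ b.T)."""
    w_sign = torch.sign(w)  # +-1 matrix, storable at 1 bit per weight
    # Rank-1 approximation of the magnitudes via SVD (the paper also
    # discusses NMF; SVD is used here for simplicity).
    u, s, vh = torch.linalg.svd(w.abs(), full_matrices=False)
    a = u[:, 0] * s[0].sqrt()   # (out_features,) value vector, kept in FP16
    b = vh[0, :] * s[0].sqrt()  # (in_features,)  value vector, kept in FP16
    return w_sign, a, b


class OneBitLinear(nn.Module):
    """1-bit linear layer sketch: y = ((x * g) @ W_sign.T) * h, where g and h
    are full-precision value vectors initialized from SVID."""

    def __init__(self, weight: torch.Tensor):
        super().__init__()
        w_sign, a, b = svid(weight)
        self.register_buffer("w_sign", w_sign)  # +-1 weight matrix
        self.g = nn.Parameter(b)  # scales the input,  shape (in_features,)
        self.h = nn.Parameter(a)  # scales the output, shape (out_features,)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Equivalent to x @ (w_sign * outer(h, g)).T, but keeps the
        # matrix multiply purely on the +-1 weights.
        return (x * self.g) @ self.w_sign.T * self.h


# Quick check that SVID initialization roughly reproduces the FP layer.
w = torch.randn(64, 128)
layer = OneBitLinear(w)
x = torch.randn(4, 128)
print(torch.nn.functional.mse_loss(layer(x), x @ w.T))
```

The factored form matters in practice: only the two small value vectors stay in full precision, while the dominant matrix multiplication runs on ±1 weights, which is what drives the memory savings reported in the paper.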