Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

23 Jan 2024 | Xiaoding Lu, Zongyi Liu, Adian Liusie, Vyas Raina, Vineet Mudupalli, Yuwen Zhang, William Beauchamp
This paper introduces Blended, a simple and effective method for combining multiple chat AIs that can outperform a single large model. At each conversation turn, a response model is randomly selected from a group of smaller models, yielding a more engaging and diverse chat AI. The study shows that blending three moderately sized models (6B/13B parameters) can rival or even surpass a much larger model such as ChatGPT (175B+ parameters). This is demonstrated through 30 days of A/B testing with a large user base on the Chai research platform, where the blended system achieved significantly higher user retention and engagement than ChatGPT. The blended approach requires only a fraction of the compute and memory of a large model, making it a cost-effective alternative. The study highlights the potential of model collaboration over simple parameter scaling for improving chat AI performance, suggesting that blending smaller models can produce more engaging and diverse conversations without increasing inference cost. The paper also reviews related work on chat AIs and generative system combination, and outlines future directions for improving the Blended approach.
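The selection rule behind Blended can be sketched in a few lines: at each turn, one constituent model is sampled uniformly at random and generates the next reply, conditioned on the full conversation history (which may include turns produced by the other models). The model names and `generate` interface below are hypothetical stand-ins, not the paper's actual implementation.

```python
import random

def make_stub_model(name):
    """Stand-in for a chat model: returns a canned reply.
    (A real system would call an actual 6B/13B model here.)"""
    def generate(history):
        return f"[{name}] reply to: {history[-1]}"
    return generate

# Hypothetical ensemble of three moderately sized chat models.
models = {
    "chat-6b-a": make_stub_model("chat-6b-a"),
    "chat-6b-b": make_stub_model("chat-6b-b"),
    "chat-13b": make_stub_model("chat-13b"),
}

def blended_reply(history, rng=random):
    """Uniformly sample one constituent model for this turn and let it
    generate the response, conditioned on the full history."""
    name = rng.choice(sorted(models))
    return models[name](history)

# Over a conversation, different turns come from different models,
# which is what produces the blended, more diverse dialogue.
history = ["Hi there!"]
for _ in range(3):
    history.append(blended_reply(history))
```

Because every model sees the whole history, including replies written by its peers, the conversation implicitly combines the strengths of all constituents while never running more than one model per turn.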