1 Jul 2024 | Nguyen Nhat Minh, Andrew Baker, Andreas Kirsch, Clement Neo
Min-p sampling is a dynamic truncation method that balances creativity and coherence in text generation, particularly at high temperatures. Rather than keeping a fixed nucleus of probability mass as top-p does, it sets a minimum probability threshold for token selection that scales with the probability of the top candidate token. The filtering threshold therefore adapts to model confidence: when the model is confident, the threshold is high and sampling concentrates on a few high-probability tokens; when the model is uncertain, the threshold drops and more diverse candidates survive. This lets min-p maintain coherence while producing more creative and diverse outputs than top-p and other sampling methods.

Experiments on benchmarks such as GPQA, GSM8K, and AlpacaEval Creative Writing show that min-p improves text quality and creativity without sacrificing coherence, outperforming top-p in both output diversity and coherence at higher temperatures. Its practical utility is further validated by its adoption in multiple open-source LLM implementations and by community testing.

Key contributions include introducing min-p as a dynamic truncation method that balances quality and diversity, demonstrating its advantages over top-p on LLM benchmarks, and providing a simple, effective sampling method that requires no additional techniques. Min-p is thus a viable alternative to top-p, enabling high-temperature settings for enhanced creativity without sacrificing coherence.
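To make the mechanism concrete, here is a minimal sketch of min-p sampling in Python. The function and argument names, the use of NumPy, and the placement of temperature scaling before filtering are illustrative assumptions, not the authors' reference implementation; the logic follows the description above: compute the softmax distribution, discard tokens whose probability falls below `p_base` times the top token's probability, renormalize, and sample.

```python
import numpy as np

def min_p_sample(logits, p_base=0.1, temperature=1.0, rng=None):
    """Sketch of min-p sampling (illustrative names, not the paper's code).

    Tokens whose probability falls below p_base * p_max are filtered out,
    where p_max is the probability of the most likely token, so the cutoff
    adapts to how confident the model is at this step.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)

    # Temperature scaling, then a numerically stable softmax.
    scaled = logits / temperature
    scaled -= scaled.max()
    probs = np.exp(scaled) / np.exp(scaled).sum()

    # Dynamic threshold: scales with the top token's probability.
    threshold = p_base * probs.max()

    # Zero out tokens below the threshold and renormalize the survivors.
    filtered = np.where(probs >= threshold, probs, 0.0)
    filtered /= filtered.sum()

    return rng.choice(len(probs), p=filtered)
```

The adaptive behavior falls out of the threshold directly: with a confident distribution (say p_max ≈ 0.9 and p_base = 0.1), only tokens above probability 0.09 survive, while with a flat distribution (p_max ≈ 0.1) the cutoff drops to 0.01 and many more candidates remain in play.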