The Effect of Sampling Temperature on Problem Solving in Large Language Models

14 Jun 2024 | Matthew Renze, Erhan Guven
This study investigates the effect of sampling temperature on the performance of Large Language Models (LLMs) in problem-solving tasks. The researchers created a multiple-choice question-and-answer (MCQA) exam by sampling problems from standard LLM benchmarks, then tested nine popular LLMs with five prompt-engineering techniques across a range of sampling temperatures (0.0 to 1.6). The results showed that changes in temperature from 0.0 to 1.0 had no statistically significant impact on LLM performance in problem-solving tasks, and these results generalized across LLMs, prompt-engineering techniques, and problem domains.

Sampling temperature controls the randomness of the model's output. Lower temperatures produce more deterministic, focused outputs, while higher temperatures increase randomness and creativity. However, higher temperatures also increase the risk of hallucination, where the model generates statistically probable but factually incorrect responses. The study found that while higher temperatures can lead to more creative outputs, they do not significantly improve problem-solving accuracy.

The researchers evaluated performance with several metrics, including correct-answer accuracy and text similarity. Accuracy remained stable across temperatures from 0.0 to 1.0 for all LLMs and problem domains, while text similarity decreased as temperature increased, indicating more varied outputs. However, accuracy dropped significantly at higher temperatures, especially above 1.0.
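To make the temperature mechanism described above concrete, here is a minimal sketch of standard temperature-scaled softmax sampling over a toy logit vector. The vocabulary size and logit values are invented for illustration, and real decoders typically layer top-k or top-p filtering on top of this; it is not the study's own code.

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng=None):
    """Sample a token index from raw logits after temperature scaling."""
    if rng is None:
        rng = np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    if temperature == 0.0:
        # Limiting case: greedy (argmax) decoding, fully deterministic.
        return int(np.argmax(logits))
    scaled = logits / temperature          # t < 1 sharpens, t > 1 flattens
    scaled -= scaled.max()                 # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

# Toy 4-token vocabulary with fixed (invented) logits.
logits = [2.0, 1.0, 0.5, -1.0]
rng = np.random.default_rng(0)
for t in (0.0, 0.5, 1.0, 1.6):
    picks = [sample_with_temperature(logits, t, rng) for _ in range(1000)]
    print(f"t={t}: token frequencies {np.bincount(picks, minlength=4) / 1000}")
```

At t = 0.0 every draw is the argmax token, while at t = 1.6 the empirical frequencies spread noticeably toward the lower-logit tokens, which mirrors the randomness-versus-determinism trade-off discussed above.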
The study recommends setting the sampling temperature to 0.0 for problem-solving tasks: this maximizes reproducibility without compromising accuracy and avoids the performance drop observed at higher temperatures. The authors also call for further research into the effects of sampling temperature in more complex problem-solving scenarios and across different problem domains.
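In practice, this recommendation amounts to pinning the temperature parameter in the decoding call. The sketch below uses the OpenAI Python client purely as one example of such an API; the model name and MCQA prompt are placeholders, not items from the study.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name, not from the study
    temperature=0.0,      # near-greedy decoding: maximizes reproducibility
    messages=[
        {"role": "system",
         "content": "Answer with only the letter of the correct choice."},
        {"role": "user",
         "content": "Q: What is 7 * 8?\nA) 54  B) 56  C) 58  D) 64"},
    ],
)
print(response.choices[0].message.content)  # e.g. "B"
```

Per the study's findings, any value in the 0.0 to 1.0 range should yield comparable accuracy; 0.0 is preferred because it also makes runs as repeatable as the serving stack allows.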