20 Oct 2022 | Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
This paper investigates the role of demonstrations in in-context learning (ICL), where large language models (LLMs) perform new tasks by conditioning on a few input-label pairs. The study challenges the assumption that ground truth labels in the demonstrations are essential: across a range of classification and multiple-choice tasks, replacing the labels with random ones barely hurts performance, suggesting that the model does not rely on the input-label mapping in the demonstrations. Instead, other aspects of the demonstrations, namely the label space, the distribution of the input text, and the overall sequence format, drive performance.
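To make the random-label manipulation concrete, here is a minimal sketch of how one might construct the two prompt variants being compared. The sentiment task, example reviews, prompt template, and function names are illustrative assumptions, not the paper's exact setup.

```python
import random

# Hypothetical labeled examples for a sentiment classification task
# (illustrative; the paper evaluates many classification and multiple-choice datasets).
demonstrations = [
    ("The acting was superb and the plot gripping.", "positive"),
    ("A dull, forgettable film with wooden dialogue.", "negative"),
    ("An absolute joy from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]
label_space = ["positive", "negative"]

def build_prompt(pairs, test_input, randomize_labels=False, seed=0):
    """Concatenate input-label pairs into an ICL prompt.

    If randomize_labels is True, each demonstration keeps its input text and
    format but gets a label drawn uniformly at random from the label space,
    mimicking the random-label condition studied in the paper.
    """
    rng = random.Random(seed)
    lines = []
    for text, label in pairs:
        if randomize_labels:
            label = rng.choice(label_space)
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {test_input}\nSentiment:")
    return "\n\n".join(lines)

gold_prompt = build_prompt(demonstrations, "A beautifully shot, moving story.")
random_prompt = build_prompt(demonstrations, "A beautifully shot, moving story.",
                             randomize_labels=True)
# Feeding both prompts to the same LLM and comparing test accuracy is the core
# comparison: the paper reports the gap between the two is surprisingly small.
```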
The analysis reveals that the model recovers most of ICL's performance gains as long as the demonstrations convey the label space and the distribution of the input text, even when the labels themselves are wrong. The format of the demonstrations also plays a crucial role: prompts that preserve the input-label pair structure perform better than those that drop it. Meta-training with an in-context learning objective amplifies these effects, with meta-trained models relying even more on simpler aspects of the demonstrations, such as format, rather than on the input-label mapping. (A schematic of these ablation conditions follows below.)
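The sketch below enumerates, under the same assumed template as above, the kinds of ablation conditions this analysis contrasts; the example sentences, label words, and condition names are made up for illustration and do not reproduce the paper's exact prompts.

```python
import random

rng = random.Random(0)

# Illustrative components; the actual datasets and prompts in the paper differ.
inputs = [
    "The service was quick and friendly.",
    "My order arrived cold and two hours late.",
]
gold_labels = ["positive", "negative"]
label_space = ["positive", "negative"]
ood_inputs = [  # text drawn from an unrelated corpus (breaks the input distribution)
    "Photosynthesis converts light into chemical energy.",
    "The treaty was signed in 1648.",
]
random_english_words = ["banana", "cloud"]  # labels outside the task's label space

def format_pairs(xs, ys):
    """Keep the input-label *format* fixed across conditions."""
    return "\n\n".join(f"Input: {x}\nLabel: {y}" for x, y in zip(xs, ys))

conditions = {
    "gold_labels":        format_pairs(inputs, gold_labels),                                 # everything intact
    "random_labels":      format_pairs(inputs, [rng.choice(label_space) for _ in inputs]),   # breaks input-label mapping
    "ood_inputs":         format_pairs(ood_inputs, gold_labels),                             # breaks input distribution
    "random_word_labels": format_pairs(inputs, random_english_words),                        # breaks label space
    "inputs_only":        "\n\n".join(f"Input: {x}" for x in inputs),                        # breaks pair format
}
# Comparing a model's accuracy under each condition isolates which property
# (label correctness, input distribution, label space, format) drives ICL gains.
```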
The study also highlights that in-context learning retains most of its gains over the zero-shot baseline without relying on ground truth labels. This suggests that LLMs may have already acquired the relevant input-label correspondences through language modeling, so the demonstrations mainly help the model locate and apply that knowledge rather than teach it a new mapping. However, the effectiveness of in-context learning still depends on the format and structure of the demonstrations, and it may not work for tasks whose input-label correspondence is not already captured by the model.
The findings have implications for understanding how LLMs learn and how to improve in-context learning. They suggest that future work should focus on better ways to extract input-label mappings, refine the LM objective, or use explicit supervision to enhance performance. The study also raises questions about the generalizability of in-context learning and its potential limitations in certain tasks. Overall, the research provides new insights into the mechanisms of in-context learning and opens up new avenues for further exploration.