Revisiting Demonstration Selection Strategies in In-Context Learning

23 Jun 2024 | Keqin Peng, Liang Ding, Yancheng Yuan, Xuebo Liu, Min Zhang, Yuanxin Ouyang, Dacheng Tao
This paper investigates the factors that influence demonstration selection in in-context learning (ICL) from the model perspective. The authors find that demonstration choice is both data- and model-dependent, and conjecture that effective demonstrations are those that enhance the inference model's understanding of the test sample. Based on this conjecture, they propose TopK + ConE, a data- and model-dependent demonstration selection method. Empirical results show that the method consistently improves performance on both language understanding (NLU) and generation (NLG) tasks across different model scales, achieving state-of-the-art results. Further analysis confirms that the method offers a unified explanation for the effectiveness of previous selection strategies. The authors also study the impact of hyperparameters and show that the approach is robust and generalizable across models and domains. Overall, the study highlights the importance of considering both data and model factors in ICL and offers a new perspective on improving ICL performance through better demonstration selection. The code is publicly available for further research.
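To make the idea concrete, a demonstration selection pipeline in the spirit of TopK + ConE can be sketched as a two-stage procedure: first retrieve the K candidates most similar to the test input with an off-the-shelf embedding model (TopK), then rerank them by how well each candidate reduces the inference model's uncertainty about the test input, measured as conditional cross-entropy (ConE). The sketch below is a minimal, hypothetical Python illustration; the model choices (`all-MiniLM-L6-v2`, `gpt2`), function names, and exact scoring details are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of a TopK + ConE-style demonstration selector.
# Assumptions (not the authors' code): TopK retrieves candidates by embedding
# similarity; ConE reranks them by the conditional entropy (average token-level
# cross-entropy) the inference model assigns to the test input when conditioned
# on each candidate demonstration.

import torch
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForCausalLM, AutoTokenizer

retriever = SentenceTransformer("all-MiniLM-L6-v2")     # embedding model (assumed choice)
tokenizer = AutoTokenizer.from_pretrained("gpt2")       # inference model (assumed choice)
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()


def topk_candidates(test_input: str, pool: list[str], k: int = 8) -> list[str]:
    """Stage 1 (TopK): retrieve the k demonstrations most similar to the test input."""
    test_emb = retriever.encode(test_input, convert_to_tensor=True)
    pool_emb = retriever.encode(pool, convert_to_tensor=True)
    scores = util.cos_sim(test_emb, pool_emb)[0]
    top_idx = torch.topk(scores, k=min(k, len(pool))).indices.tolist()
    return [pool[i] for i in top_idx]


@torch.no_grad()
def conditional_entropy(demonstration: str, test_input: str) -> float:
    """Average cross-entropy (nats/token) of the test input given the demonstration."""
    context_ids = tokenizer(demonstration + "\n", return_tensors="pt").input_ids
    test_ids = tokenizer(test_input, return_tensors="pt").input_ids
    input_ids = torch.cat([context_ids, test_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : context_ids.shape[1]] = -100   # score only the test-input tokens
    return lm(input_ids, labels=labels).loss.item()


def select_demonstrations(test_input: str, pool: list[str], k: int = 8, n: int = 4) -> list[str]:
    """Stage 2 (ConE-style): keep the n candidates with the lowest conditional entropy."""
    candidates = topk_candidates(test_input, pool, k)
    ranked = sorted(candidates, key=lambda d: conditional_entropy(d, test_input))
    return ranked[:n]
```

Ranking by the cross-entropy the inference model itself assigns to the test input is what makes such a selector model-dependent rather than purely retrieval-based, which matches the paper's observation that good demonstrations depend on both the data and the model.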