25 Jan 2024 | Yanda Chen¹², Chandan Singh², Xiaodong Liu², Simiao Zuo², Bin Yu³, He He⁴, Jianfeng Gao²
This paper introduces explanation-consistency finetuning (EC-finetuning), a method for improving the consistency of natural-language explanations generated by large language models (LLMs). LLMs often produce inconsistent explanations for related questions, which hinders their use in applications that require trustworthy interpretation. EC-finetuning addresses this by finetuning LLMs on synthetic data constructed to contain consistent explanations: starting from an initial question, answer, and explanation, follow-up questions are generated from the explanation and then answered in a way that remains consistent with it. Training on this data improves explanation consistency on a variety of question-answering datasets, yielding a 10.0% relative improvement on four training datasets and generalizing with a 4.5% relative improvement on seven out-of-distribution datasets. The results suggest that more consistent explanations can help users build accurate mental models of LLM behavior. The paper also discusses related work and evaluation metrics, and its analysis shows that EC-finetuning improves explanation consistency more on correct predictions than on incorrect ones. The study concludes that EC-finetuning is a promising approach for improving the consistency of LLM explanations and suggests directions for future research.
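To make the described data-generation pipeline concrete, below is a minimal sketch of the follow-up generation and consistent-answering loop, assuming a hypothetical `query_llm` helper for calling a model; the function names and prompt wording are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of EC-finetuning synthetic data generation (not the authors' code).
# `query_llm` is a hypothetical helper that sends a prompt to an LLM and returns text.

def query_llm(prompt: str) -> str:
    """Placeholder for an API call to the model used to produce the synthetic data."""
    raise NotImplementedError


def generate_ec_finetuning_examples(question: str, n_followups: int = 3) -> list[dict]:
    # 1. Get an initial answer and explanation for the seed question.
    initial = query_llm(
        f"Question: {question}\nAnswer the question and explain your reasoning."
    )

    # 2. Generate follow-up questions whose answers should be implied by that explanation.
    followups_text = query_llm(
        f"Explanation: {initial}\n"
        f"Write {n_followups} related follow-up questions whose answers "
        "are implied by this explanation, one per line."
    )
    followup_questions = [q.strip() for q in followups_text.splitlines() if q.strip()]

    # 3. Answer each follow-up while conditioning on the initial explanation,
    #    so the new answers stay consistent with it.
    examples = []
    for fq in followup_questions:
        consistent_answer = query_llm(
            f"Explanation: {initial}\n"
            f"Follow-up question: {fq}\n"
            "Answer the follow-up question so that your answer is consistent "
            "with the explanation above."
        )
        examples.append({"question": fq, "target": consistent_answer})

    # The resulting (follow-up question, consistent answer) pairs are then used
    # as supervised finetuning data for the LLM.
    return examples
```

In this sketch, consistency comes from conditioning every follow-up answer on the initial explanation, so the finetuning targets never contradict it; the exact prompts and the number of follow-ups per seed question are design choices not specified here.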