9 May 2024 | Alexandra Zytek, Sara Pidò, Kalyan Veeramachaneni
This paper explores the use of Large Language Models (LLMs) to enhance Explainable Artificial Intelligence (XAI) by transforming machine learning (ML) explanations into natural, human-readable narratives. The authors propose several research directions for improving the interpretability and usability of XAI explanations: defining evaluation metrics, designing prompts, comparing LLMs, further training and fine-tuning, and integrating external data. Initial experiments and a user study suggest that LLMs can effectively enhance XAI by generating more understandable and context-aware explanations.
The authors investigate how well LLMs can generate explanation narratives without additional training, using GPT-3.5 and GPT-4. They take SHAP explanations as inputs and experiment with different prompts to guide the LLMs toward accurate and useful narratives. To evaluate the quality of the generated narratives, they define metrics for completeness, soundness, fluency, context-awareness, and length. In their experiments, GPT-4 outperforms GPT-3.5 on soundness, completeness, and context-awareness.
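Because the pipeline here is concrete (a SHAP explanation goes in, a narrative comes out), a minimal sketch may help. The following Python is an illustrative assumption, not the authors' code: it assumes a scikit-learn model plus the `shap` and `openai` packages, the prompt wording is hypothetical, and the final "completeness" check is only a crude proxy for the paper's metric.

```python
import shap
from openai import OpenAI
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

# Train a small model and compute a SHAP explanation for one prediction.
data = fetch_california_housing(as_frame=True)
X, y = data.data[:500], data.target[:500]
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
explanation = shap.Explainer(model, X)(X.iloc[:1])

# Flatten the explanation into (feature, value, contribution) triples,
# ordered by absolute contribution so the prompt leads with what matters.
triples = sorted(
    zip(X.columns, explanation.data[0], explanation.values[0]),
    key=lambda t: abs(t[2]),
    reverse=True,
)
shap_text = "\n".join(f"{f} = {v:.2f} contributed {c:+.3f}" for f, v, c in triples)

# Hypothetical prompt wording; the paper compares several prompt designs.
prompt = (
    "You are explaining a house-price prediction to a non-expert. "
    "Rewrite the following SHAP feature contributions as a short, "
    "plain-language narrative:\n" + shap_text
)
client = OpenAI()  # reads OPENAI_API_KEY from the environment
narrative = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
).choices[0].message.content
print(narrative)

# Crude stand-in for the "completeness" metric: did the narrative
# mention each of the top contributing features?
top = [f for f, _, _ in triples[:3]]
print("completeness proxy:", sum(f in narrative for f in top) / len(top))
```

Note that qualities like soundness and fluency do not reduce to string checks like the proxy above; they are graded judgments about the narrative, made by humans or by another model, which is part of why the paper calls for better-defined evaluation metrics.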
A user study was conducted to assess how people perceive narrative-based explanations versus traditional explanations. Participants generally preferred narrative-based explanations, finding them easier to understand and more informative. The findings suggest that LLM-based narrative explanations have the potential to improve user understanding of ML outputs.
The authors conclude that LLMs can significantly enhance XAI by generating more interpretable and usable explanations. They highlight GPT-4's potential in this regard and suggest future research directions, including further investigation of prompt design, evaluation of additional LLMs, and the integration of training data and external guides to create more context-aware explanations. The study contributes to the ongoing effort to make AI systems more transparent, interpretable, and usable, fostering trust and understanding in AI technologies.