3 Feb 2019 | Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter and Lalana Kagal
This paper provides an overview of interpretability and explainability in machine learning, highlighting the importance of explanations for ensuring algorithmic fairness, identifying potential biases, and verifying that algorithms perform as expected. It discusses the challenges of explaining deep neural networks (DNNs), including their susceptibility to adversarial examples and the difficulty of achieving both interpretability and completeness in an explanation. The paper defines and distinguishes key terms such as explanation, interpretability, and explainability. It reviews existing approaches to explainable AI, including proxy models, decision trees, and automatic rule extraction, as well as salience-mapping methods and explanation-producing systems. The paper also presents a taxonomy of explanation methods, categorizing them by whether they explain a network's processing, explain its internal representations, or are designed to produce explanations of their own behavior. It discusses the need for standardized evaluation metrics and suggests future research directions in explainable AI. The paper concludes that while interest in interpretability and explainability is growing, more research and cross-disciplinary collaboration are needed to develop effective and trustworthy explanations for complex machine learning systems.
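The survey itself contains no code, but as a rough illustration of the salience-mapping methods it reviews, the sketch below computes a simple gradient-based saliency map with PyTorch. The tiny model, the input shape, and the variable names are hypothetical placeholders chosen for this example, not anything taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in classifier; in practice this would be a trained DNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

# A single illustrative input "image" (1 x 1 x 28 x 28) that tracks gradients.
x = torch.rand(1, 1, 28, 28, requires_grad=True)

# Forward pass: take the score of the top-scoring class.
logits = model(x)
score = logits.max()

# Backward pass: gradient of that class score with respect to the input pixels.
score.backward()

# Saliency map: pixels with large absolute gradients are those whose
# perturbation would most change the class score.
saliency = x.grad.abs().squeeze()
print(saliency.shape)  # torch.Size([28, 28])
```

The resulting per-pixel map is one of the simplest processing-focused explanations discussed in the survey; more elaborate salience methods refine the same gradient signal rather than replace it.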