Techniques for Interpretable Machine Learning

19 May 2019 | Mengnan Du, Ninghao Liu, Xia Hu
Interpretable machine learning addresses the challenge of understanding how complex models make decisions. Despite the many approaches that have been proposed, a comprehensive understanding of their achievements and remaining challenges is still lacking. This survey summarizes existing techniques for improving model interpretability and discusses key issues for future research, such as user-friendly explanations and evaluation metrics.

The survey divides interpretable machine learning into intrinsic and post-hoc categories. Intrinsically interpretable models, such as decision trees and linear models, are understandable by construction; post-hoc methods build additional models to explain an already-trained one. Orthogonally, global interpretability provides an overview of a model's overall behavior, while local interpretability focuses on individual predictions. Intrinsic approaches therefore include globally interpretable models, such as decision trees, and locally interpretable ones, such as models that use attention mechanisms.

Post-hoc global explanations aim to characterize a model's overall behavior with techniques such as feature importance and activation maximization. For traditional machine learning, these explanations often build on engineered features (for example, via permutation importance), whereas for deep learning they focus on understanding the learned neural network representations. Post-hoc local explanations, such as attribution methods, identify how much each input feature contributes to a single prediction and are generated with techniques such as back-propagation, mask perturbation, and investigation of deep representations. Minimal sketches of a few of these techniques are given below.

Applications of interpretable machine learning include model validation, model debugging, and knowledge discovery. Open challenges include designing effective explanation methods and evaluating their faithfulness; future directions involve creating explanations that are understandable and genuinely helpful for end-users.
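As an illustration of a post-hoc global explanation for traditional models, the sketch below computes permutation importance: a feature's importance is the drop in held-out accuracy after its values are randomly shuffled, which breaks its relationship with the target. The dataset, model, and metric here are illustrative choices, not prescribed by the survey.

```python
# Hedged sketch: permutation importance as a model-agnostic, post-hoc
# global explanation. Model, dataset, and metric are placeholders.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline = accuracy_score(y_test, model.predict(X_test))

rng = np.random.default_rng(0)
importances = []
for j in range(X_test.shape[1]):
    drops = []
    for _ in range(10):                                  # repeat shuffles to reduce variance
        X_perm = X_test.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])     # break the feature-target link
        drops.append(baseline - accuracy_score(y_test, model.predict(X_perm)))
    importances.append(np.mean(drops))                   # larger drop => more important feature

for j, imp in enumerate(importances):
    print(f"feature {j}: importance {imp:.3f}")
```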
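For post-hoc local explanation of deep models, the simplest back-propagation-based attribution is vanilla gradient saliency: the gradient of the predicted-class score with respect to the input indicates how sensitive the prediction is to each feature. The tiny MLP and random input below are stand-ins for illustration; the survey does not tie the technique to any particular architecture.

```python
# Hedged sketch: vanilla gradient (back-propagation) saliency as a
# post-hoc local attribution method for a neural network.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in network: a tiny MLP over 8 input features with 3 output classes.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()

x = torch.randn(1, 8, requires_grad=True)    # the single instance to explain
logits = model(x)
target = int(logits.argmax(dim=1))           # explain the predicted class

# Back-propagate the target logit to the input; the gradient magnitude
# reflects each input feature's influence on this prediction.
logits[0, target].backward()
saliency = x.grad.abs().squeeze(0)

print("predicted class:", target)
print("per-feature saliency:", [round(v, 4) for v in saliency.tolist()])
```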
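Mask perturbation takes a complementary, model-agnostic route to local explanation: occlude each feature in turn and record how much the predicted-class probability drops. The black-box classifier below and the choice of masking with the training mean are assumptions made for this sketch.

```python
# Hedged sketch: mask (occlusion) perturbation as a post-hoc local explanation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_train, y_train)

x = X_test[0]                                               # instance to explain
pred = model.predict(x.reshape(1, -1))[0]
base_prob = model.predict_proba(x.reshape(1, -1))[0, pred]  # confidence for predicted class

# Mask each feature in turn (replace it with the training mean) and
# measure how much the predicted-class probability drops.
baseline_values = X_train.mean(axis=0)
contributions = []
for j in range(x.shape[0]):
    x_masked = x.copy()
    x_masked[j] = baseline_values[j]
    masked_prob = model.predict_proba(x_masked.reshape(1, -1))[0, pred]
    contributions.append(base_prob - masked_prob)           # large drop => important feature

top = np.argsort(contributions)[::-1][:5]
print("predicted class:", pred)
print("top contributing features:", top.tolist())
```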