A Survey of Methods for Explaining Black Box Models


August 2018 | RICCARDO GUIDOTTI, ANNA MONREALE, SALVATORE RUGGIERI, and FRANCO TURINI, KDDLab, University of Pisa, Italy; FOSCA GIANNOTTI, KDDLab, ISTI-CNR, Italy; DINO PEDRESCHI, KDDLab, University of Pisa, Italy
The article "A Survey of Methods for Explaining Black Box Models" by Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi provides a comprehensive overview of the challenges and approaches in explaining the decisions made by black box models. Black box models, which hide their internal logic from users, pose both practical and ethical issues, particularly in sensitive areas like credit scoring, insurance, and health status prediction. The authors highlight the need for interpretable models to address these issues, ensuring transparency and accountability. The paper discusses the motivations for requiring interpretable models, emphasizing the risks of black box systems, such as the potential for discrimination and the lack of trust. It reviews various types of interpretable models, including decision trees, rules, and linear models, and their advantages and limitations. The authors also explore the dimensions of interpretability, such as global and local interpretability, time limitation, and user expertise, and the desiderata for an interpretable model, including accuracy, fidelity, and fairness. The article categorizes the problems addressed in the literature into three main categories: model explanation, outcome explanation, and model inspection. Model explanation aims to provide a global understanding of the black box model, while outcome explanation focuses on explaining specific predictions. Model inspection involves understanding specific properties of the black box model. The authors provide a detailed taxonomy of these problems and discuss various methods for addressing them, including post-hoc interpretability techniques and the design of transparent classifiers. The paper concludes by discussing open research questions and future directions, emphasizing the importance of balancing accuracy and interpretability in black box models.The article "A Survey of Methods for Explaining Black Box Models" by Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi provides a comprehensive overview of the challenges and approaches in explaining the decisions made by black box models. Black box models, which hide their internal logic from users, pose both practical and ethical issues, particularly in sensitive areas like credit scoring, insurance, and health status prediction. The authors highlight the need for interpretable models to address these issues, ensuring transparency and accountability. The paper discusses the motivations for requiring interpretable models, emphasizing the risks of black box systems, such as the potential for discrimination and the lack of trust. It reviews various types of interpretable models, including decision trees, rules, and linear models, and their advantages and limitations. The authors also explore the dimensions of interpretability, such as global and local interpretability, time limitation, and user expertise, and the desiderata for an interpretable model, including accuracy, fidelity, and fairness. The article categorizes the problems addressed in the literature into three main categories: model explanation, outcome explanation, and model inspection. Model explanation aims to provide a global understanding of the black box model, while outcome explanation focuses on explaining specific predictions. Model inspection involves understanding specific properties of the black box model. 
The paper concludes by discussing open research questions and future directions, emphasizing the importance of balancing accuracy and interpretability in black box models.