Interpretable machine learning: definitions, methods, and applications

14 Jan 2019 | W. James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, and Bin Yu
Interpretable machine learning (IML) is a growing field that aims to make machine learning models more transparent and understandable. This paper defines IML and introduces the Predictive, Descriptive, Relevant (PDR) framework for evaluating interpretations. The PDR framework comprises three desiderata: predictive accuracy, descriptive accuracy, and relevancy. Predictive accuracy is the degree to which the model captures the underlying relationships in the data. Descriptive accuracy is the degree to which an interpretation method captures the relationships the model has actually learned. Relevancy is the degree to which the interpretation provides useful insight for a particular audience.

The paper discusses two main types of interpretation methods: model-based and post hoc. Model-based methods construct models that are themselves easy to understand, while post hoc methods analyze an already-trained model to extract information about the relationships it has learned; a minimal sketch contrasting the two appears below. The paper also discusses techniques for improving model-based interpretability, including sparsity, simulatability, modularity, and domain-based feature engineering. It highlights the central role of human audiences in discussions of interpretability, and it connects interpretability to related research areas such as causal inference and stability. The paper concludes by discussing future work in the field, including the need for better evaluation methods and the development of more interpretable models.
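As a minimal sketch of the model-based versus post hoc distinction (not from the paper; it assumes Python with scikit-learn and a synthetic dataset), the example below fits a sparse lasso model, whose few nonzero coefficients serve directly as a model-based interpretation, and then applies permutation importance as one post hoc probe of a less transparent random forest:

# Minimal sketch, assuming scikit-learn; the dataset and models are
# illustrative, not the paper's experiments.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Lasso

# Toy regression problem: 10 features, only 3 of which matter.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=0.5, random_state=0)

# Model-based interpretability: the L1 penalty drives most coefficients
# to exactly zero, so the surviving coefficients ARE the interpretation.
sparse_model = Lasso(alpha=1.0).fit(X, y)
print("Nonzero lasso coefficients:",
      {i: round(c, 2) for i, c in enumerate(sparse_model.coef_) if c != 0})

# Post hoc interpretability: fit an opaque model, then probe it.
# Permutation importance scores each feature by how much shuffling its
# values degrades the fitted model's predictions.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
print("Permutation importances:", np.round(result.importances_mean, 3))

Sparsity is what makes the lasso interpretable here: with only three informative features, most coefficients shrink to zero, so a human can simulate the model's behavior by inspecting the handful that remain.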