Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

January 27–30, 2020, Barcelona, Spain | Ramaravind K. Mothilal, Amit Sharma, Chenhao Tan
The paper "Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations" by Mothilal, Sharma, and Tan addresses the importance of post-hoc explanations for machine learning models, which help users understand and act on algorithmic predictions. The authors propose a framework for generating and evaluating diverse counterfactual explanations: hypothetical examples that show how a different prediction could be obtained. The framework aims to satisfy two key properties: feasibility of the counterfactual actions given the user's context and constraints, and diversity among the presented counterfactuals.
The authors formulate an optimization problem based on determinantal point processes to generate a set of counterfactuals that are both diverse and close to the original input. They also provide metrics to evaluate the actionability of counterfactuals and to compare them with other local explanation methods. Experiments on four real-world datasets show that the method generates more diverse counterfactuals than prior approaches, and that these counterfactuals approximate local decision boundaries as well as LIME, a popular local explanation method. The paper further discusses the trade-offs and causal implications of optimizing for counterfactuals and provides an implementation of the framework at https://github.com/microsoft/DICE.
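To illustrate the determinantal-point-process idea mentioned above, here is a minimal sketch of a diversity score for a candidate set of counterfactuals. It assumes a kernel of the form K[i, j] = 1 / (1 + dist(c_i, c_j)), whose determinant grows as the counterfactuals spread apart and collapses to zero when they duplicate each other; the function name and distance choice (L1 norm) are illustrative, not the authors' exact implementation.

```python
import numpy as np


def dpp_diversity(counterfactuals):
    """DPP-style diversity score for a set of counterfactual feature vectors.

    Builds a similarity kernel K with K[i, j] = 1 / (1 + dist(c_i, c_j))
    and returns det(K). A higher determinant means a more diverse set;
    identical counterfactuals give a singular kernel and a score of 0.
    """
    k = len(counterfactuals)
    K = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            # L1 distance between two candidate counterfactuals
            dist = np.linalg.norm(counterfactuals[i] - counterfactuals[j], ord=1)
            K[i, j] = 1.0 / (1.0 + dist)
    return float(np.linalg.det(K))
```

In a full objective, a term like this would be maximized jointly with a prediction loss (pushing each counterfactual across the decision boundary) and a proximity penalty (keeping each counterfactual close to the original input), with weights trading off the three goals.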