This paper presents a sensitivity analysis-based method for explaining prediction models, applicable to any classification or regression model. The method considers all subsets of input features, accounting for interactions and redundancies between them, and is equivalent to model-specific methods for additive models. The method is illustrated with examples on artificial and real-world data, together with an empirical analysis of running times. A controlled experiment with 122 participants showed that the method improved their understanding of the model.
Prediction models are important in decision support systems, with applications ranging from credit scoring to financial auditing. Model interpretability is often as important as prediction accuracy. Some models are harder to interpret and require post-processing to improve understanding and user trust. Most explanation methods are model-specific, but general approaches treat models as black boxes, changing inputs and observing output changes. These methods are applicable to any model type, facilitating model comparisons and eliminating the need to replace explanation methods when models change.
The key component of general explanations is the contribution of individual input features. A prediction is explained by assigning a number to each feature indicating its influence. Contributions can be aggregated to plot the feature's average contribution against its value, providing an overview of the model. This is similar to plotting the marginal effect for additive models.
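The overview described above can be sketched in code. The snippet below is a minimal illustration, not the paper's implementation: the model, data, and names are hypothetical. For each candidate value of a feature, it averages the black-box model's output over the data with that feature fixed to the value, then subtracts the overall average prediction, tracing the feature's average contribution across its range, analogous to a marginal-effect plot for an additive model.

```python
import numpy as np

rng = np.random.default_rng(1)

def model(X):
    # stand-in black-box model: nonlinear in feature 0, linear in feature 1
    return np.sin(X[:, 0]) + 0.5 * X[:, 1]

X = rng.uniform(-3, 3, size=(2000, 2))   # background data
baseline = model(X).mean()               # overall average prediction

values = np.linspace(-3, 3, 13)
avg_contrib = []
for v in values:
    Xv = X.copy()
    Xv[:, 0] = v                         # fix feature 0 to v, others vary
    avg_contrib.append(model(Xv).mean() - baseline)

# Plotting avg_contrib against values gives the feature's average
# contribution curve; here it approximately recovers sin(v).
```

For the additive model used here, the curve for feature 0 recovers its marginal effect (a sine shape) up to a constant offset, which is exactly the correspondence the text notes.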
The paper begins with a simple example of a linear regression model. The situational importance of a feature is the difference between the feature's contribution when its value is x_i and its expected contribution. It determines whether a feature contributes positively, negatively, or not at all to a particular prediction. Situational importance can be plotted to show how different feature values contribute to predictions and can be used to compute predictions semi-graphically for any instance. The approach applies to any additive model and has been used in developing several model-specific explanation methods.
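The linear-model case can be made concrete with a short worked example (a sketch with hypothetical coefficients, not the paper's code). For f(x) = b + Σ_j w_j x_j, the contribution of feature i is w_i x_i and its expected contribution is w_i E[X_i], so the situational importance is w_i (x_i − E[X_i]); the importances sum to the difference between the prediction and the average prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

w = np.array([2.0, -1.0, 0.5])   # hypothetical linear-model coefficients
b = 3.0                          # hypothetical intercept

def f(x):
    return b + w @ x

# Background data used to estimate the expected feature values E[X_i]
X = rng.normal(loc=1.0, scale=1.0, size=(10_000, 3))
mean_x = X.mean(axis=0)

x = np.array([2.0, 0.0, 1.0])    # instance to explain
situational = w * (x - mean_x)   # situational importance of each feature

# The importances decompose the prediction around the average prediction:
# f(x) = f(E[X]) + sum_i w_i * (x_i - E[X_i])
```

A positive entry in `situational` means the feature pushes this prediction above the average prediction, a negative entry pushes it below, and a near-zero entry means the feature's value is close to its expectation.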