25 Sep 2018 | Vitali Petsiuk, Abir Das, Kate Saenko
RISE: Randomized Input Sampling for Explanation of Black-box Models
This paper introduces RISE, a method for explaining black-box models by estimating the importance of input image regions for the model's prediction. RISE generates an importance map indicating how salient each pixel is. Unlike white-box approaches that estimate pixel importance from gradients or other internal network states, RISE treats the model as a black box: it estimates importance empirically by probing the model with randomly masked versions of the input image and recording the corresponding outputs. Because it needs no access to the network's internals, RISE can produce a saliency map for an arbitrary model. The key idea is to sub-sample the input image via random binary masks and record the base model's response to each masked image; the final importance map is a linear combination of the masks, with combination weights given by the output probabilities the base model predicts on the masked images.
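The mask-and-weight procedure can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `model` is a hypothetical callable returning the target-class probability for one image, and all parameter names and defaults are assumptions (the paper additionally smooths and randomly shifts the upsampled masks).

```python
import numpy as np

def rise_saliency(model, image, n_masks=1000, grid=7, p_keep=0.5, seed=0):
    """RISE-style saliency sketch for a black-box `model`.

    `image` is assumed to be an (H, W, C) array and `model` a hypothetical
    callable mapping one image to the target-class probability.
    """
    rng = np.random.default_rng(seed)
    H, W = image.shape[:2]
    # Cell size so that a grid x grid coarse mask covers the whole image.
    ch, cw = int(np.ceil(H / grid)), int(np.ceil(W / grid))
    saliency = np.zeros((H, W))
    for _ in range(n_masks):
        # Sample a coarse binary mask and upsample it to image resolution.
        coarse = (rng.random((grid, grid)) < p_keep).astype(float)
        mask = np.kron(coarse, np.ones((ch, cw)))[:H, :W]
        # Probe the black box with the masked image.
        score = model(image * mask[..., None])
        # Accumulate the mask weighted by the model's output probability.
        saliency += score * mask
    # Normalize by the expected number of times each pixel is kept.
    return saliency / (n_masks * p_keep)
```

Pixels that, when visible, tend to co-occur with high output probabilities accumulate large weights, so the returned map is an empirical estimate of per-pixel importance.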
RISE is evaluated using two automatic metrics: deletion and insertion. The deletion metric measures the drop in the probability of a class as the pixels ranked most important are gradually removed from the image; a sharp drop, and hence a low area under the curve (AUC), indicates a good explanation. The insertion metric measures the rise in probability as pixels are gradually introduced, with a higher AUC indicating a better explanation. RISE is compared to state-of-the-art importance-extraction methods using both the automatic deletion/insertion metrics and a pointing metric based on human-annotated object segments. Extensive experiments on several benchmark datasets show that RISE matches or exceeds the performance of other methods, including white-box approaches.
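Both causal metrics can be sketched with one loop that flips pixels in saliency order and averages the resulting probability curve. The `model` callable, function name, and `step` parameter are illustrative assumptions, and the insertion variant here starts from a zero image for simplicity (the paper starts from a blurred copy):

```python
import numpy as np

def causal_metric(model, image, saliency, mode="del", step=100):
    """Deletion/insertion metric sketch.

    mode="del": start from the image, zero out pixels most-important first
                (a good explanation gives a fast drop, i.e. low AUC).
    mode="ins": start from a blank image, reveal pixels most-important first
                (a good explanation gives a fast rise, i.e. high AUC).
    `model` is a hypothetical callable returning the class probability.
    """
    H, W = image.shape[:2]
    order = np.argsort(saliency.ravel())[::-1]  # most important first
    img = image.copy() if mode == "del" else np.zeros_like(image)
    fill = np.zeros_like(image) if mode == "del" else image
    flat_img = img.reshape(H * W, -1)    # writable view into img
    flat_fill = fill.reshape(H * W, -1)
    scores = [model(img)]
    for i in range(0, H * W, step):
        # Flip the next batch of pixels and re-query the model.
        idx = order[i:i + step]
        flat_img[idx] = flat_fill[idx]
        scores.append(model(img))
    # Mean of the probability curve approximates the area under it.
    return float(np.mean(scores))
```

For a faithful saliency map, the insertion score should be well above the deletion score, since the probability rises quickly when important pixels are revealed and collapses quickly when they are removed.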
RISE is a true black-box explanation approach, conceptually different from mainstream white-box saliency methods such as Grad-CAM, and it generalizes to base models of any architecture. On the automatic deletion/insertion metrics, RISE outperforms other state-of-the-art explanation methods, including the white-box Grad-CAM; on the human-centric pointing metric it remains competitive despite being a black-box method. The paper concludes that RISE is a promising approach for explaining black-box models.