13 Jun 2017 | Mukund Sundararajan, Ankur Taly, Qiqi Yan
The paper "Axiomatic Attribution for Deep Networks" addresses the problem of attributing the prediction of a deep network to its input features. The authors identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods should satisfy. They argue that most existing methods violate at least one of these axioms, which they consider a fundamental weakness. To address this, they propose a new method called Integrated Gradients, which is simple to implement and requires no modification to the original network. Integrated Gradients combines the implementation invariance of gradients with the sensitivity of methods like DeepLift and LRP. The method is applied to several deep networks, including image, text, and chemistry models, where the authors use it to debug networks, extract rules, and help users understand model predictions. The paper also discusses the uniqueness of Integrated Gradients among path methods and its symmetry-preserving property, which the authors argue make it a canonical choice for attribution.
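To make the "simple to implement" claim concrete, here is a minimal sketch of Integrated Gradients in plain Python. It approximates the path integral of gradients along the straight line from a baseline to the input with a midpoint Riemann sum. The toy logistic model, its weights, the all-zero baseline, and the names `model`, `model_grad`, and `integrated_gradients` are assumptions made for illustration; they are not taken from the paper's experiments, and a real network would obtain gradients from an autodiff framework rather than a hand-written derivative.

```python
import math

# Hypothetical toy model: a single logistic unit F(x) = sigmoid(w . x + b).
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = 0.1


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x):
    """Scalar model output F(x)."""
    return sigmoid(sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS)


def model_grad(x):
    """Analytic gradient of F with respect to each input feature."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, grad_fn, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    Attribution for feature i is (x_i - baseline_i) times the average
    gradient along the straight-line path from baseline to x.
    """
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_fn(point)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]  # assumed baseline; the right choice is task-dependent
attributions = integrated_gradients(x, baseline, model_grad)

# The attributions should sum (approximately) to F(x) - F(baseline).
gap = sum(attributions) - (model(x) - model(baseline))
```

Only gradient evaluations of the unmodified model are needed, which is why the method requires no changes to the network. The `steps` value trades accuracy for cost; checking that `gap` is close to zero is a cheap way to tell whether the approximation has enough steps.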