Learning Important Features Through Propagating Activation Differences


12 Oct 2019 | Avanti Shrikumar, Peyton Greenside, Anshul Kundaje
DeepLIFT (Deep Learning Important FeaTures) is a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons to every feature of the input. It compares the activation of each neuron to its 'reference activation' and assigns contribution scores based on the difference. By optionally considering positive and negative contributions separately, DeepLIFT can reveal dependencies missed by other approaches. Scores are computed efficiently in a single backward pass.

DeepLIFT explains the difference of the output from a 'reference' output in terms of the difference of the input from a 'reference' input. This addresses limitations of gradients: importance is propagated even where the gradient is zero, and the discontinuities that bias terms can cause are avoided. Contribution scores are propagated through the network using 'multipliers' and a chain rule analogous to the one for partial derivatives, assigning each neuron a contribution to its immediate inputs.

The separate treatment of positive and negative contributions improves DeepLIFT's ability to identify dependencies. The RevealCancel rule, which can be viewed as an improved approximation of Shapley values, helps in cases where the simpler Rescale rule gives misleading results: it considers the impact of the positive and negative terms in the absence of each other, reducing errors caused by cancellation. In this way DeepLIFT also handles the saturation and thresholding failure modes illustrated in the paper's figures, yielding more accurate importance scores.

For softmax outputs, contributions are computed with respect to the linear layer preceding the final nonlinearity to avoid attenuation, and these contributions are then normalized so that classes can be compared fairly.

In experiments on MNIST and simulated genomic data, DeepLIFT with the RevealCancel rule outperformed other methods, including gradient-based approaches, at identifying the pixels important for digit classification and at detecting regulatory DNA motifs.
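The multiplier-and-chain-rule machinery can be illustrated on a one-layer toy network. This is a minimal sketch, not the paper's implementation: the function names are invented, the Rescale rule is used for the nonlinearity (multiplier = Δoutput/Δinput), and the linear layer's multipliers are simply its weights.

```python
# Sketch of DeepLIFT with the Rescale rule on a tiny network
# y = relu(w1*x1 + w2*x2 + b). All names here are illustrative.

def relu(x):
    return max(0.0, x)

def rescale_multiplier(x, x_ref, f):
    """Rescale rule: multiplier m = (f(x) - f(x_ref)) / (x - x_ref)."""
    dx = x - x_ref
    if dx == 0:
        return 0.0  # the paper suggests the gradient here; 0 keeps the sketch simple
    return (f(x) - f(x_ref)) / dx

w1, w2, b = 2.0, -1.0, -1.0
x1, x2 = 2.0, 1.0          # actual input
r1, r2 = 0.0, 0.0          # reference input

z, z_ref = w1*x1 + w2*x2 + b, w1*r1 + w2*r2 + b
y, y_ref = relu(z), relu(z_ref)

# Chain rule for multipliers: m_{x_i -> y} = m_{x_i -> z} * m_{z -> y},
# where m_{x_i -> z} = w_i for a linear layer.
m_zy = rescale_multiplier(z, z_ref, relu)
c1 = w1 * m_zy * (x1 - r1)   # contribution of x1
c2 = w2 * m_zy * (x2 - r2)   # contribution of x2

# Summation-to-delta: contributions account for the full output difference.
assert abs((c1 + c2) - (y - y_ref)) < 1e-9
```

Note that the single backward pass mentioned above corresponds to computing all multipliers once and reading off every feature's contribution, rather than perturbing inputs one at a time.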
Across these tasks it provided more accurate and reliable importance scores than gradient-based methods, which often fail on saturation and thresholding. DeepLIFT is particularly useful where gradients are zero, such as in recurrent neural networks with saturating activations, and offers a more robust and interpretable way to understand neural network decisions.
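The cancellation problem that motivates RevealCancel can be sketched directly from its definition: the impact of the positive part Δx⁺ (and of the negative part Δx⁻) is averaged over being applied before and after the other part. The helper below is an illustrative sketch, assuming the positive/negative split of Δx is already known.

```python
# Sketch of the RevealCancel rule for a nonlinearity f. Names are illustrative.

def relu(x):
    return max(0.0, x)

def reveal_cancel(f, x_ref, dx_pos, dx_neg):
    """Return (dy_pos, dy_neg): the impact of dx_pos and dx_neg on f,
    each averaged over being applied before and after the other term."""
    f0 = f(x_ref)
    dy_pos = 0.5 * ((f(x_ref + dx_pos) - f0) +
                    (f(x_ref + dx_neg + dx_pos) - f(x_ref + dx_neg)))
    dy_neg = 0.5 * ((f(x_ref + dx_neg) - f0) +
                    (f(x_ref + dx_pos + dx_neg) - f(x_ref + dx_pos)))
    return dy_pos, dy_neg

# Cancellation example: positive and negative inputs to a ReLU cancel exactly.
dy_pos, dy_neg = reveal_cancel(relu, x_ref=0.0, dx_pos=3.0, dx_neg=-3.0)
# Rescale sees a total input difference of 0 and assigns nothing; RevealCancel
# exposes the opposing effects: dy_pos = +1.5, dy_neg = -1.5, summing to 0.
```

The two impacts still sum to the total output difference, so the summation-to-delta property is preserved while the hidden dependency is revealed.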
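The cross-class normalization of pre-softmax contributions mentioned above can be sketched as mean-centering: since the softmax output is unchanged when a constant is added to every logit, a feature's contribution to one class's logit is only meaningful relative to its average contribution across all classes. The function name and the contribution matrix below are made up for illustration.

```python
# Sketch: mean-center per-feature contributions to the pre-softmax logits
# across classes. Illustrative data, not from the paper's experiments.

def normalize_contribs(contribs):
    """contribs[c][f] = contribution of feature f to the class-c logit.
    Returns contributions with the per-feature mean over classes removed."""
    n_classes = len(contribs)
    n_feats = len(contribs[0])
    means = [sum(contribs[c][f] for c in range(n_classes)) / n_classes
             for f in range(n_feats)]
    return [[contribs[c][f] - means[f] for f in range(n_feats)]
            for c in range(n_classes)]

raw = [[2.0, 1.0],   # class 0
       [0.0, 1.0],   # class 1
       [1.0, 1.0]]   # class 2
norm = normalize_contribs(raw)
# Feature 1 contributes equally to all logits, so after normalization its
# contribution is 0 for every class: norm == [[1.0, 0.0], [-1.0, 0.0], [0.0, 0.0]]
```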