Attention-based Deep Multiple Instance Learning

This paper introduces an attention-based deep multiple instance learning (MIL) approach that improves both the interpretability and the flexibility of MIL models. The MIL problem is formulated as learning the Bernoulli distribution of the bag label, with the bag label probability parameterized by neural networks. A key contribution is the use of an attention mechanism as a permutation-invariant aggregation operator, which makes it possible to identify the key instances that contribute to the bag label. The attention mechanism is implemented as a trainable weighted average whose weights are produced by a two-layer neural network. As a result, the model gives interpretable output by highlighting important instances or regions of interest (ROIs).
The proposed method is evaluated on benchmark MIL datasets, an MNIST-based MIL dataset, and two real-life histopathology datasets. It performs comparably to the best MIL methods on the benchmark datasets and outperforms other methods on the MNIST-based and histopathology datasets. By highlighting key instances, the model also gives meaningful insight into its decision-making process, which is particularly valuable in medical imaging applications.
The method builds on the Fundamental Theorem of Symmetric Functions, which provides a general framework for modeling the bag label probability. The approach involves three steps: (i) transforming instances into a low-dimensional embedding, (ii) applying a permutation-invariant aggregation function, and (iii) transforming the aggregated result into a bag probability. Because every transformation is a neural network, the model can be trained end-to-end and gains flexibility. Unlike max and mean pooling, which are fixed operators, the attention-based MIL pooling operator is trainable, which makes it more flexible and adaptive. It also provides interpretability by assigning higher weights to instances that are more likely to contribute to the bag label.
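The aggregation and classification steps described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`attention_weights`, `attention_pool`, `bag_probability`), the parameter names (`V`, `w`, `u`, `b`), the tanh hidden layer, and the single logistic output layer are all assumptions made for illustration. Step (i) is skipped, so the instance embeddings are taken as given.

```python
import math
import random


def attention_weights(embeddings, V, w):
    """Softmax-normalised attention weights, one per instance.

    embeddings: K instance embeddings, each a list of M floats.
    V: L x M matrix (assumed first layer, followed by tanh).
    w: length-L vector (assumed second layer, producing one score).
    """
    scores = []
    for h in embeddings:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(w_j * t_j for w_j, t_j in zip(w, hidden)))
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, V, w):
    """Step (ii): trainable weighted average of the instance embeddings."""
    a = attention_weights(embeddings, V, w)
    dim = len(embeddings[0])
    z = [sum(a_k * h[d] for a_k, h in zip(a, embeddings)) for d in range(dim)]
    return z, a


def bag_probability(z, u, b):
    """Step (iii): map the bag representation to a probability in (0, 1).

    A single logistic layer is an assumption made for this sketch.
    """
    logit = sum(u_d * z_d for u_d, z_d in zip(u, z)) + b
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    rng = random.Random(0)
    M, L, K = 4, 3, 5  # embedding size, attention hidden size, bag size
    bag = [[rng.uniform(-1, 1) for _ in range(M)] for _ in range(K)]
    V = [[rng.uniform(-1, 1) for _ in range(M)] for _ in range(L)]
    w = [rng.uniform(-1, 1) for _ in range(L)]
    u = [rng.uniform(-1, 1) for _ in range(M)]

    z, a = attention_pool(bag, V, w)
    print("attention weights:", [round(x, 3) for x in a])
    print("bag probability:", round(bag_probability(z, u, 0.0), 3))
```

The weights returned by `attention_pool` sum to one and can be read as per-instance importance scores. Reordering the instances in the bag reorders the weights but leaves the pooled vector `z` unchanged up to floating-point rounding, which is the permutation invariance the summary refers to.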
Across the classical MIL datasets, MNIST-BAGS, and the histopathology datasets, the method is effective both at bag classification and at identifying key instances or ROIs. It performs well in both small and large sample size regimes and is particularly effective in medical imaging applications, where interpretability is crucial: the attention mechanism identifies key instances, which can be used to highlight ROIs in histopathology images. The method also outperforms other MIL approaches in classification accuracy and recall, which matters in medical applications where false negatives can have severe consequences. The approach is fully parameterized by neural networks and is capable of modeling arbitrary permutation-invariant score functions.
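For readers who prefer formulas, the ideas summarized above can be written compactly. The notation is an assumption of this note, since the summary defines no symbols: X is a bag of K instances x_1, ..., x_K, Y is the bag label, f and g are the instance-level and bag-level transformations, h_k = f(x_k) is an instance embedding, and a_k is an attention weight.

```latex
% Bag label modeled as a Bernoulli variable with parameter \theta(X)
Y \sim \mathrm{Bernoulli}\big(\theta(X)\big), \qquad \theta(X) \in [0, 1]

% Decomposition of a symmetric (permutation-invariant) score function
S(X) = g\Big( \sum_{x \in X} f(x) \Big)

% Attention pooling as a weighted average of instance embeddings
z = \sum_{k=1}^{K} a_k h_k, \qquad a_k \ge 0, \qquad \sum_{k=1}^{K} a_k = 1
```

Because the weights a_k are computed from the embeddings themselves and normalized over the bag, reordering the instances does not change z, so the aggregation stays permutation-invariant while remaining trainable.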

2018 | Maximilian Ilse, Jakub M. Tomczak, Max Welling