25 Jul 2024 | Wenhao Tang, Fengtao Zhou, Sheng Huang, Xiang Zhu, Yi Zhang, Bo Liu
This paper proposes the Re-embedded Regional Transformer (R²T) to improve multiple instance learning (MIL) in computational pathology. Conventional MIL methods rely on features from a pre-trained feature extractor, which cannot adapt to the specific downstream task. R²T re-embeds these instance features online, capturing fine-grained local features and establishing connections across different regions. It is designed as a portable module that can be integrated into mainstream MIL models.

R²T has two novel components: Cross-region MSA (CR-MSA) and the Embedded Position Encoding Generator (EPEG). CR-MSA fuses information across different regions, while EPEG combines the benefits of relative and convolutional position encodings to encode positional information more effectively.

Experimental results show that R²T raises the performance of MIL models based on ResNet-50 features to the level of foundation model features, and further improves performance when foundation model features are used, so re-embedding yields substantial gains even on high-quality features. R²T-MIL, an R²T-enhanced AB-MIL, outperforms other recent methods by a large margin and achieves state-of-the-art performance on computational pathology benchmarks covering cancer diagnosis, sub-typing, and survival prediction. R²T is applicable to various MIL frameworks, and its local self-attention is both effective and efficient. The paper also discusses the importance of re-embedding in MIL-based computational pathology and validates the method through extensive experiments.
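To make the idea of online re-embedding with local self-attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and rests on several simplifying assumptions: regions are contiguous 1-D chunks of the instance sequence rather than spatial regions of the slide, attention is single-head with identity query/key/value projections (no learned weights), CR-MSA and EPEG are omitted entirely, and the pooling step is a simplified linear attention score standing in for an AB-MIL-style aggregator. The function names `regional_self_attention` and `attention_pool` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a new vector."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def regional_self_attention(features, region_size):
    """Re-embed each instance by attending only to instances in its own region.

    Assumption: a region is a contiguous chunk of `region_size` instances,
    and Q = K = V = the input features (no learned projections).
    """
    re_embedded = []
    for start in range(0, len(features), region_size):
        region = features[start:start + region_size]
        scale = math.sqrt(len(region[0]))
        for query in region:
            weights = softmax([dot(query, key) / scale for key in region])
            mixed = weighted_sum(weights, region)
            # Residual connection: keep the original feature, add local context.
            re_embedded.append([q + m for q, m in zip(query, mixed)])
    return re_embedded


def attention_pool(features, scoring_vector):
    """Aggregate instance features into one bag-level vector.

    Assumption: a linear score per instance, a simplification of the
    learned attention used by attention-based MIL aggregators.
    """
    weights = softmax([dot(scoring_vector, f) for f in features])
    return weighted_sum(weights, features), weights


if __name__ == "__main__":
    random.seed(0)
    # A toy "slide": 8 instances (patches) with 4-dimensional offline features.
    bag = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
    re_embedded = regional_self_attention(bag, region_size=4)
    slide_vector, weights = attention_pool(re_embedded, [0.1, -0.2, 0.3, 0.4])
    print(len(re_embedded), len(slide_vector))
```

Because attention is restricted to each region, the cost grows with the region size rather than with the total number of instances in the slide, which is the usual motivation for local self-attention on very long instance sequences. In this sketch the re-embedding step sits between the fixed feature extractor and the aggregator, which mirrors how the summary describes R²T as a module that plugs into existing MIL models.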