[slides and audio] Multimodal contrastive learning for spatial gene expression prediction using histology images

This paper proposes mclSTExp, a multimodal contrastive learning approach for predicting spatial gene expression from histology images. The method uses the self-attention mechanism of a Transformer encoder to integrate spot features with their spatial context, then uses contrastive learning to bring in image features. The source code is available on GitHub.

The model is evaluated on three datasets: HER2+, cSCC, and Alex+10x. Across these experiments it outperforms existing methods in Pearson correlation coefficient (PCC), both for all considered genes (ACG) and for the top 50 highly expressed genes (HEG). Beyond predicting expression accurately, mclSTExp interprets cancer-specific overexpressed genes, identifies immune-related genes, and detects specialized spatial domains annotated by pathologists. The results suggest that integrating multimodal information from histology images and spatial transcriptomics data improves the accuracy of gene expression prediction, and that the approach can offer insights into cancer biology with potential applications in cancer therapy.
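To make the two evaluation metrics concrete, here is a minimal sketch of how a per-gene PCC could be summarized over ACG and HEG. It is not the authors' code. The function names `pearson` and `evaluate_pcc` and the parameter `top_k` are hypothetical, and two choices are assumptions rather than details taken from the paper: the per-gene PCCs are averaged, and the highly expressed genes are picked by mean observed expression.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a gene with constant values counts as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def evaluate_pcc(observed, predicted, top_k=50):
    """Summarize per-gene PCC across spots.

    observed, predicted: one row per spot, one column per gene.
    Returns (mean PCC over all genes, mean PCC over the top_k genes
    ranked by mean observed expression).
    """
    n_spots = len(observed)
    n_genes = len(observed[0])
    per_gene = []
    mean_expr = []
    for g in range(n_genes):
        obs_g = [row[g] for row in observed]
        pred_g = [row[g] for row in predicted]
        per_gene.append(pearson(obs_g, pred_g))
        mean_expr.append(sum(obs_g) / n_spots)

    acg = sum(per_gene) / n_genes
    # Assumption: "highly expressed" means highest mean observed expression.
    top = sorted(range(n_genes), key=lambda g: mean_expr[g], reverse=True)[:top_k]
    heg = sum(per_gene[g] for g in top) / len(top)
    return acg, heg


# Toy data: 4 spots, 3 genes (values are made up for illustration).
observed = [[1, 10, 5], [2, 20, 5], [3, 30, 5], [4, 40, 5]]
predicted = [[2, 4, 1], [4, 3, 2], [6, 2, 3], [8, 1, 4]]
acg, heg = evaluate_pcc(observed, predicted, top_k=1)
print(acg, heg)
```

In the toy data, the first gene is predicted with perfect positive correlation, the second with perfect negative correlation, and the third is constant, so the all-gene average hides how poorly the single most expressed gene is predicted. That contrast is why reporting both ACG and HEG is informative.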

Multimodal contrastive learning for spatial gene expression prediction using histology images

2024 | Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang