Accurate Spatial Gene Expression Prediction by Integrating Multi-Resolution Features

25 Apr 2024 | Youngmin Chung, Ji Hun Ha, Kyeong Chan Im, Joo Sang Lee
TRIPLEX is a deep learning framework that predicts spatial gene expression from Whole Slide Images (WSIs) by integrating multi-resolution features. The framework captures cellular morphology at individual spots, local context around these spots, and global tissue organization, and fuses these features to predict gene expression. In benchmark studies on three public spatial transcriptomics (ST) datasets and Visium data from 10X Genomics, TRIPLEX outperforms existing models in terms of Mean Squared Error (MSE), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (PCC). The model's predictions align closely with ground truth gene expression profiles and tumor annotations, highlighting its potential in advancing cancer diagnosis and treatment.

TRIPLEX integrates three types of features: the target spot image, the neighbor view, and the global view. These features capture different levels of biological information, from detailed cell morphology in the target spot image to the surrounding tissue phenotype in the neighbor view and the overall tissue microenvironment across the WSI in the global view. Separate encoders extract these features from WSIs, each focusing on its assigned resolution. For the neighbor and global views, pre-extracted features are used to reduce computational cost, while for the target spot images the encoder is fully updated during training to extract fine-grained information. A fusion layer then integrates the three feature types for gene expression prediction.

The study sets a new benchmark in spatial gene expression prediction by comparing TRIPLEX with five prior studies under uniform experimental conditions. Internal evaluations were conducted on three public ST datasets, and external validations used higher-resolution Visium data from 10X Genomics. TRIPLEX surpasses the existing models on MSE, MAE, and PCC in both internal and external evaluations.
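
The three metrics named above can be computed for a single gene as in the following minimal sketch. The function names and the toy values are illustrative assumptions, and the summary does not state how the paper aggregates scores across genes, spots, or slides, so this shows only the per-gene definitions.

```python
import math


def mse(measured, predicted):
    """Mean Squared Error between measured and predicted expression values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def mae(measured, predicted):
    """Mean Absolute Error between measured and predicted expression values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def pcc(measured, predicted):
    """Pearson Correlation Coefficient; returns NaN if either input is constant."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    denom = math.sqrt(var_m * var_p)
    return cov / denom if denom > 0 else float("nan")


# Toy values for one gene across four spots (made up for illustration).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]

print(mse(measured, predicted))  # lower is better
print(mae(measured, predicted))  # lower is better
print(pcc(measured, predicted))  # closer to 1 is better
```

MSE and MAE measure how far predictions fall from the measured values, while PCC measures whether the predicted spatial pattern rises and falls with the measured one, which is why the three are reported together.
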
Visualizations of gene expression distributions for a specific cancer-associated gene show that TRIPLEX's predictions align more closely with the measured gene expression data and tumor annotations than those of the compared models. Key contributions of TRIPLEX include: (1) an approach to predicting spatial gene expression levels from WSIs by integrating multiple biological contexts; (2) a framework that integrates multi-resolution features using a feature extraction strategy, various types of transformers, and a fusion loss technique; and (3) comprehensive experiments on three public ST datasets, with additional external evaluations on three Visium datasets, establishing a new benchmark in spatial gene expression prediction. The results consistently show that the proposed method outperforms all existing models included in the comparative analysis.
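
To make the idea of combining target, neighbor, and global features concrete, here is a minimal sketch in plain Python. It is not the authors' fusion layer: the summary does not describe that layer's internals, so simple mean pooling followed by concatenation and one linear layer is swapped in as a stand-in. All names (`mean_pool`, `linear`, `predict_spot`), the feature sizes, and the weights are hypothetical.

```python
def mean_pool(tokens):
    """Average a list of equal-length feature vectors into a single vector."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def linear(x, weights, bias):
    """Dense layer: one output value (one gene) per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def predict_spot(target_feat, neighbor_feats, global_feats, weights, bias):
    """Assumed late fusion: concatenate the three views, then apply a linear head."""
    fused = list(target_feat) + mean_pool(neighbor_feats) + mean_pool(global_feats)
    return linear(fused, weights, bias)


# Toy 2-dimensional features (made up for illustration).
target_feat = [1.0, 0.0]                              # target spot image
neighbor_feats = [[0.0, 2.0], [2.0, 0.0]]             # patches around the spot
global_feats = [[1.0, 1.0], [3.0, 1.0], [2.0, 4.0]]   # spots across the slide

# Hypothetical head predicting two genes from the 6-dimensional fused vector.
weights = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
]
bias = [0.0, 1.0]

prediction = predict_spot(target_feat, neighbor_feats, global_feats, weights, bias)
print(prediction)
```

In a real model the three input vectors would come from learned encoders and the head would be trained; the sketch only shows how features at three resolutions can feed one prediction per spot.
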