02 March 2024 | Zhixing Zhong, Junchen Hou, Zhixian Yao, Lei Dong, Feng Liu, Junqiu Yue, Tiantian Wu, Junhua Zheng, Gaoliang Ouyang, Chaoyong Yang & Jia Song
Cancer-Finder is a domain generalization-based deep learning algorithm that enables accurate and efficient annotation of malignant cells in single-cell and spatial transcriptomics data. The algorithm achieves an average accuracy of 95.16% in identifying malignant cells in single-cell data and can be extended to spatial transcriptomics data by replacing the single-cell training data with spatial transcriptomic datasets. When applied to 5 clear cell renal cell carcinoma (ccRCC) spatial transcriptomic samples, Cancer-Finder successfully identifies a gene signature of 10 genes that are significantly co-localized at the tumor-normal interface and are strongly correlated with the prognosis of ccRCC patients.
The algorithm addresses the challenge of accurately annotating malignant cells in diverse tumor microenvironments, where current methods lack generalization and accuracy. Cancer-Finder uses a deep neural network with feature extraction and classification modules to distinguish malignant from non-malignant cells. It employs risk extrapolation for domain generalization, optimizing the model to reduce risk differences across training domains and improve generalization performance. The model is trained on multiple datasets with varying distributions and can be applied to various data types, including single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) data.
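To make the risk extrapolation idea concrete, here is a minimal sketch of one common variant, variance-penalized risk extrapolation (V-REx), in which the training objective is the sum of per-domain risks plus a penalty on their variance. This is an illustration and not Cancer-Finder's actual implementation: the function names, the use of binary cross-entropy as the per-domain risk, and the penalty weight `beta` are all assumptions made for the example.

```python
import math
from statistics import pvariance


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over one domain (used here as that domain's risk).

    probs  : predicted probability that each cell is malignant
    labels : 1 for malignant, 0 for non-malignant
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def vrex_objective(domain_risks, beta=1.0):
    """Sum of per-domain risks plus beta times their (population) variance.

    A larger beta (hypothetical hyperparameter) pushes the optimizer toward
    solutions whose risk is similar in every training domain.
    """
    return sum(domain_risks) + beta * pvariance(domain_risks)


# Toy example: two training domains (e.g. two datasets) with made-up predictions.
risk_a = binary_cross_entropy([0.9, 0.2, 0.8], [1, 0, 1])
risk_b = binary_cross_entropy([0.6, 0.5, 0.4], [1, 0, 1])
print(vrex_objective([risk_a, risk_b], beta=10.0))
```

With equal per-domain risks the variance term vanishes and the objective reduces to the plain sum of risks; the penalty only grows when the model fits some training domains much better than others, which is the behavior the paragraph above describes as reducing risk differences across domains.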
Cancer-Finder outperforms existing methods in accuracy and stability, achieving 98.30% accuracy on gold standard datasets and 90.89% similarity on silver standard datasets. It also demonstrates high inference speed and low memory consumption, making it suitable for large-scale datasets. When applied to spatial transcriptomics data, Cancer-Finder can rapidly identify malignant spots on spatial slides without reference data. The algorithm's performance was validated on multiple datasets, including 10 external validation datasets, and it showed consistent results across different cancer types and technologies.
In addition, Cancer-Finder was applied to analyze intertumor heterogeneity in ccRCC spatial transcriptomics data. It identified a gene signature of 10 genes that are enriched at the tumor-normal interface and are associated with the prognosis of ccRCC patients. These genes are involved in processes such as epithelial-mesenchymal transition (EMT) and tumor invasion. The algorithm's ability to accurately annotate malignant cells and identify key genes provides valuable insights into the tumor microenvironment and could lead to better understanding and treatment of cancer. Overall, Cancer-Finder is an efficient and extensible tool for malignant cell annotation in single-cell and spatial transcriptomics data.