Toward universal cell embeddings: integrating single-cell RNA-seq datasets across species with SATURN

16 February 2024 | Yanay Rosen, Maria Brbić, Yusuf Roohani, Kyle Swanson, Ziang Li, Jure Leskovec
The article introduces SATURN (Species Alignment Through Unification of Rna and proteiNs), a deep learning method for integrating single-cell RNA-seq datasets from different species. SATURN uses protein language models to encode the biological properties of genes, which lets it learn universal cell embeddings that bridge interspecies differences. By coupling protein embeddings with RNA expression, SATURN integrates datasets from diverse organisms and enables the detection of functionally related genes that are coexpressed across species. The method is applied to whole-organism atlases from three species and to frog and zebrafish embryogenesis datasets, where it transfers annotations across species even when they are evolutionarily distant. SATURN also performs multispecies differential expression analysis, revealing gene programs shared across datasets. The article highlights SATURN's effectiveness in integrating large-scale single-cell datasets, reannotating cell types, and identifying divergent gene functions between species.
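
The idea of coupling protein embeddings with RNA expression can be illustrated with a small sketch. This is not SATURN's implementation: the actual method learns its representation with a deep network, whereas the code below uses fixed cosine similarities and a softmax. The gene names, the three-dimensional "protein embeddings", the two shared feature axes, and the expression counts are all invented for illustration. The sketch only shows why genes from different species, once placed in a common protein-embedding space, allow cells with no shared gene names to be compared.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(scores, temperature=0.1):
    """Turn similarity scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gene_weights(protein_embeddings, shared_axes):
    """Softly assign each gene to the shared axes by protein-embedding similarity."""
    return {
        gene: softmax([cosine(emb, axis) for axis in shared_axes])
        for gene, emb in protein_embeddings.items()
    }

def embed_cell(expression, weights, n_axes):
    """Project one cell's gene counts onto the shared axes (log1p-scaled)."""
    vec = [0.0] * n_axes
    for gene, count in expression.items():
        for k in range(n_axes):
            vec[k] += math.log1p(count) * weights[gene][k]
    return vec

# Hypothetical protein embeddings (a real protein language model would
# produce vectors with hundreds or thousands of dimensions).
human_genes = {"hGeneA": [1.0, 0.0, 0.1], "hGeneB": [0.0, 1.0, 0.0]}
frog_genes = {"fGeneA": [0.9, 0.1, 0.0], "fGeneB": [0.1, 0.9, 0.1]}

# Hypothetical shared axes living in the same protein-embedding space.
shared_axes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

human_w = gene_weights(human_genes, shared_axes)
frog_w = gene_weights(frog_genes, shared_axes)

# Toy cells: counts per gene, with no gene names shared between species.
human_cell_a = embed_cell({"hGeneA": 10, "hGeneB": 0}, human_w, 2)
human_cell_b = embed_cell({"hGeneA": 0, "hGeneB": 12}, human_w, 2)
frog_cell_a = embed_cell({"fGeneA": 8, "fGeneB": 1}, frog_w, 2)

sim_same = cosine(human_cell_a, frog_cell_a)  # functionally matching cells
sim_diff = cosine(human_cell_b, frog_cell_a)  # functionally different cells
print(f"matching: {sim_same:.2f}, different: {sim_diff:.2f}")
```

In this toy setting the human and frog cells that express functionally similar genes end up close together in the shared space, even though the two species have no gene identifiers in common. In the sketch, that property comes entirely from comparing genes through their protein embeddings rather than through their names.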