Dictionary learning for integrative, multimodal, and scalable single-cell analysis

February 26, 2022 | Yuhan Hao, Tim Stuart, Madeline Kowalski, Saket Choudhary, Paul Hoffman, Austin Hartman, Avi Srivastava, Gesmira Molla, Shaista Madad, Carlos Fernandez-Granda, Rahul Satija
The paper introduces "bridge integration," a method that harmonizes single-cell datasets measured in different modalities by using a multi-omic dataset as a molecular bridge. The method relies on dictionary learning, a technique from representation learning: each cell in the multi-omic dataset serves as a dictionary element used to reconstruct the unimodal datasets and transform them into a shared space. This allows accurate annotation and comparison across molecular features such as gene expression, chromatin accessibility, histone modifications, and protein levels. The authors demonstrate bridge integration by mapping scATAC-seq data onto scRNA-seq references, where it annotates cells accurately and identifies rare cell types.
They also introduce "atomic sketch integration," which improves computational scalability by integrating a representative subset of cells from each dataset, making the integration of large-scale datasets feasible. The method is applied to integrate scRNA-seq and CyTOF datasets, highlighting its potential for community-wide integration and cross-modality analysis. The authors conclude that bridge integration and atomic sketch integration can broaden the utility of single-cell reference datasets and facilitate the comparison of diverse molecular modalities.
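The bridge idea summarized above can be illustrated with a small toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are simulated, the bridge measurements are noise-free, the coefficients are fitted with plain ridge least squares, and labels are transferred with a single nearest neighbor. The function names `dictionary_codes` and `latent_states` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def dictionary_codes(X, atoms, ridge=1e-2):
    """Coefficients C such that X ~= C @ atoms, via ridge least squares.

    X:     (n_cells, n_features) dataset measured in one modality
    atoms: (n_bridge_cells, n_features) bridge cells in the SAME modality
    Assumption: ridge regression stands in for the paper's actual procedure.
    """
    gram = atoms @ atoms.T + ridge * np.eye(atoms.shape[0])
    # Solves C = X @ atoms.T @ inv(gram); gram is symmetric.
    return np.linalg.solve(gram, atoms @ X.T).T


def latent_states(n_cells, rng):
    """Simulate three cell types as clusters in a 3-d latent space (toy data)."""
    labels = np.arange(n_cells) % 3
    z = 4.0 * np.eye(3)[labels] + 0.3 * rng.standard_normal((n_cells, 3))
    return z, labels


rng = np.random.default_rng(0)

# Hypothetical linear maps from latent state to each modality's features.
W_rna = rng.standard_normal((3, 50))    # "gene expression" features
W_atac = rng.standard_normal((3, 80))   # "chromatin accessibility" features

# Bridge: the same 40 cells measured in BOTH modalities (multi-omic).
z_bridge, _ = latent_states(40, rng)
bridge_rna = z_bridge @ W_rna
bridge_atac = z_bridge @ W_atac

# Reference: RNA only, with labels. Query: ATAC only, labels held out.
z_ref, ref_labels = latent_states(200, rng)
ref_rna = z_ref @ W_rna + 0.05 * rng.standard_normal((200, 50))
z_query, query_labels = latent_states(100, rng)
query_atac = z_query @ W_atac + 0.05 * rng.standard_normal((100, 80))

# Express each dataset in terms of the bridge cells, using the bridge
# measurements from the matching modality. Both results have one column
# per bridge cell, so they live in the same space.
ref_codes = dictionary_codes(ref_rna, bridge_rna)
query_codes = dictionary_codes(query_atac, bridge_atac)

# Transfer labels from the nearest reference cell in the shared space.
d2 = ((query_codes[:, None, :] - ref_codes[None, :, :]) ** 2).sum(axis=2)
predicted = ref_labels[d2.argmin(axis=1)]
accuracy = (predicted == query_labels).mean()
print(f"label-transfer accuracy on toy data: {accuracy:.2f}")
```

The point of the sketch is that the RNA reference and the ATAC query share no features, yet both end up described by coefficients over the same 40 bridge cells, which makes them directly comparable. Real data would need normalization, dimensionality reduction, and a more robust mapping step than a single nearest neighbor.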