Deep Generative Modeling for Single-cell Transcriptomics

Single-cell Variational Inference (scVI) is a scalable probabilistic framework for analyzing gene expression in single cells. It uses deep neural networks and stochastic optimization to model technical noise and batch effects, and it supports batch correction, visualization, clustering, and differential expression within a single principled model.

scVI is a hierarchical Bayesian model whose conditional distributions are specified by deep neural networks. It encodes each cell's transcriptome into a low-dimensional latent vector of normal random variables, then decodes that vector into a posterior estimate of the parameters of each gene's expression distribution. Counts are modeled with a zero-inflated negative binomial (ZINB) distribution, which captures the over-dispersion and limited sensitivity of single-cell data. The model also explicitly represents two key nuisance factors in scRNA-seq data: library size and batch effects.

scVI was evaluated on several datasets, including mouse brain cells, retinal bipolar neurons, and hematopoietic differentiation data. In these evaluations it outperformed state-of-the-art methods at imputation, clustering, and differential expression. It can also generate unseen data by sampling from the latent space, and its latent representation is flexible enough to capture hierarchical structure, continuous cell states, and structureless noise.

By treating technical variability within a probabilistic framework, scVI addresses the problem of modeling bias and uncertainty in single-cell data. It remains computationally efficient at scale, which enables accurate downstream analysis.
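
To make the zero-inflated negative binomial likelihood mentioned above more concrete, here is a minimal sketch of its log-probability for a single count. It is an illustration under stated assumptions, not the scVI implementation: the function names and the mean/inverse-dispersion parameterization (`mu`, `theta`) with a dropout probability `pi` are choices made for this example, and in scVI these parameters would come from neural networks rather than being passed in by hand.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial.

    Assumed parameterization: mean mu > 0 and inverse dispersion theta > 0,
    so the variance is mu + mu**2 / theta (over-dispersed relative to Poisson).
    """
    log_theta_mu = math.log(theta + mu)
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * (math.log(theta) - log_theta_mu)
        + x * (math.log(mu) - log_theta_mu)
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    pi (0 <= pi < 1) is the probability that the count is forced to zero,
    standing in for limited sensitivity (dropout).
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero arises either from dropout or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    # A positive count can only come from the NB component.
    return math.log(1.0 - pi) + nb


# Illustrative values only: zero inflation raises the probability of a zero.
p_zero_nb = math.exp(nb_log_pmf(0, mu=2.0, theta=1.5))
p_zero_zinb = math.exp(zinb_log_pmf(0, mu=2.0, theta=1.5, pi=0.3))
print(p_zero_nb, p_zero_zinb)
```

The two components map onto the two properties named in the text: `theta` controls over-dispersion, and `pi` adds extra zeros on top of those the negative binomial already produces.
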
The model's ability to handle large datasets and its integration of normalization and probabilistic modeling make it a valuable tool for single-cell RNA sequencing analysis. scVI is also applicable to other forms of single-cell data analysis, such as lineage inference and cell-state annotation. The software is publicly available on GitHub, and all code for reproducing results is deposited in Zenodo.
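
One way to picture how normalization can live inside the model is to write a cell's expected counts as its library size multiplied by a set of per-gene frequencies that sum to one. The sketch below is a toy illustration of that idea, not scVI's decoder: a single linear layer with a per-batch offset stands in for the neural network, and every name and number in it (`expected_counts`, `weights`, `batch_offsets`, the toy values) is hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive frequencies that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_counts(z, batch, library_size, weights, batch_offsets):
    """Expected count per gene for one cell.

    z             : latent vector for the cell
    batch         : integer batch index of the cell
    library_size  : scaling factor for the cell's total counts
    weights       : one row per gene, each row the same length as z (toy decoder)
    batch_offsets : batch_offsets[batch][g] shifts gene g's score in that batch
    """
    scores = [
        sum(w * zi for w, zi in zip(row, z)) + batch_offsets[batch][g]
        for g, row in enumerate(weights)
    ]
    frequencies = softmax(scores)  # library-size-free expression profile
    return [library_size * f for f in frequencies]


# Toy parameters: 3 genes, 2 latent dimensions, 2 batches.
weights = [[0.5, -1.0], [1.2, 0.3], [-0.7, 0.8]]
batch_offsets = [[0.0, 0.0, 0.0], [0.4, -0.2, 0.1]]
z = [0.3, -0.5]

shallow = expected_counts(z, 0, 1000.0, weights, batch_offsets)
deep = expected_counts(z, 0, 5000.0, weights, batch_offsets)
print(shallow)
print(deep)
```

In this toy setup the same latent vector yields the same gene frequencies at any sequencing depth, so library size only rescales the expected counts, while the batch index changes the frequencies themselves. Keeping the two nuisance factors in separate inputs is what lets the latent vector be read as the biological signal.
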

2018 December | Romain Lopez, Jeffrey Regier, Michael B. Cole, Michael I. Jordan, Nir Yosef