2018 December; 15(12): 1053–1058 | Romain Lopez, Jeffrey Regier, Michael B. Cole, Michael I. Jordan, and Nir Yosef
The paper introduces scVI (Single-cell Variational Inference), a scalable framework for probabilistic representation and analysis of gene expression in single cells. scVI uses stochastic optimization and deep neural networks to aggregate information across similar cells and genes, approximating the distributions that underlie the observed expression values while accounting for batch effects and limited sensitivity.

The model is a hierarchical Bayesian model whose conditional distributions are specified by deep neural networks, which allows efficient training even on large datasets. It explicitly models key nuisance factors such as library size and batch effects, and a single generative model supports a range of downstream tasks. The authors evaluate scVI on batch correction, visualization, clustering, and differential expression, demonstrating its accuracy and scalability relative to state-of-the-art methods. The paper also discusses the scalability of the training procedure, how well the latent space captures biological structure, and how technical variability is handled.

scVI is publicly available and is presented as a principled and inclusive solution for analyzing single-cell transcriptomes. Overall, it provides a flexible and robust tool for single-cell transcriptomics analysis.
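
The summary describes a hierarchical generative model in which neural networks map a latent cell state and a batch label to the parameters of a count distribution, with library size treated as a nuisance factor. The sketch below illustrates that structure only and is not the authors' implementation. It assumes a zero-inflated negative binomial likelihood built from a Gamma-Poisson mixture; toy random linear maps stand in for the neural networks; and every name and constant (`simulate`, `sample_cell`, `theta`, the library-size parameters) is hypothetical. The variational inference step that fits the model to data is not shown.

```python
import math
import random


def softmax(values):
    """Normalize scores into positive fractions that sum to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_cell(rng, weights_rho, weights_drop, batch, n_batches, theta=2.0):
    """Draw one cell's counts from the assumed generative process."""
    n_latent = len(weights_rho) - n_batches
    n_genes = len(weights_rho[0])

    # Latent cell state z ~ Normal(0, I) and one-hot batch label s.
    z = [rng.gauss(0.0, 1.0) for _ in range(n_latent)]
    s = [1.0 if b == batch else 0.0 for b in range(n_batches)]
    # Library-size scaling factor, log-normal (parameters are arbitrary here).
    library = rng.lognormvariate(math.log(50.0), 0.3)

    inputs = z + s

    def linear(weights):
        # Toy linear "decoder": a stand-in for a neural network.
        return [sum(x * row[g] for x, row in zip(inputs, weights))
                for g in range(n_genes)]

    rho = softmax(linear(weights_rho))  # expected expression fractions
    dropout = [1.0 / (1.0 + math.exp(-a)) for a in linear(weights_drop)]

    counts = []
    for g in range(n_genes):
        mean = library * rho[g]
        # Gamma-Poisson mixture: negative binomial with this mean and
        # inverse dispersion theta.
        rate = rng.gammavariate(theta, mean / theta)
        y = sample_poisson(rng, rate)
        # Zero inflation: with probability dropout[g] the count is not observed.
        counts.append(0 if rng.random() < dropout[g] else y)
    return z, library, counts


def simulate(n_cells=4, n_genes=6, n_latent=2, n_batches=2, seed=0):
    """Simulate a few cells with shared toy decoder weights."""
    rng = random.Random(seed)
    dim = n_latent + n_batches
    weights_rho = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(dim)]
    weights_drop = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(dim)]
    return [
        sample_cell(rng, weights_rho, weights_drop, n % n_batches, n_batches)
        for n in range(n_cells)
    ]


if __name__ == "__main__":
    for z, library, counts in simulate():
        print(round(library, 1), counts)
```

Because the batch label enters the decoder alongside the latent state, batch-specific variation can be absorbed by the decoder instead of the latent representation, which is one way to read the summary's claim that batch effects are modeled explicitly.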