2024 | Alok Sharma, Artem Lysenko, Shangrui Jia, Keith A. Boroevich, Tatsuhiko Tsunoda
The field of omics, driven by advances in high-throughput sequencing, faces a data explosion that offers unprecedented opportunities for predictive modeling in precision medicine. Traditional machine learning (ML) techniques have been partially successful but are limited in their ability to capture complex relationships in the data. This review explores the revolutionary shift towards deep learning (DL), specifically convolutional neural networks (CNNs). Once tabular omics data are transformed into image-like representations, CNNs can be applied to them, enhancing predictive power and allowing transfer learning to reduce computational time and improve performance. However, integrating CNNs into predictive omics data analysis presents challenges such as model interpretability, data heterogeneity, and data size. Addressing these challenges requires interdisciplinary collaboration among ML experts, bioinformatics researchers, biologists, and medical doctors. The review highlights the complexities and future research directions involved in unlocking the full predictive potential of CNNs in omics data analysis and related fields. Key methodologies such as DeepInsight and DeepFeature are discussed, along with their applications in various domains, including cancer drug efficacy prediction and single-cell RNA sequencing (scRNA-seq) data analysis. The review emphasizes the importance of interpretability, biological relevance, and rigorous benchmarking to ensure the clinical validity and generalizability of these models.
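To make the idea of turning tabular omics data into image-like representations concrete, the following is a minimal sketch under stated assumptions, not the DeepInsight implementation. It assumes features are placed on a 2D grid by a plain PCA of the transposed data matrix, and that features landing on the same pixel are averaged. The function names (`feature_coordinates`, `to_pixels`, `tabular_to_images`) and the grid size are hypothetical choices made for illustration only.

```python
import numpy as np


def feature_coordinates(X):
    """Place each feature in 2D via PCA on the transposed data matrix.

    X has shape (n_samples, n_features) with at least 2 samples; each
    feature is treated as a point described by its values across samples.
    """
    F = X.T - X.T.mean(axis=0, keepdims=True)  # (n_features, n_samples), centered
    U, S, _ = np.linalg.svd(F, full_matrices=False)
    return U[:, :2] * S[:2]                    # scores on the first two components


def to_pixels(coords, size):
    """Rescale 2D coordinates to integer pixel indices in [0, size - 1]."""
    lo = coords.min(axis=0)
    span = coords.max(axis=0) - lo
    span[span == 0] = 1.0                      # guard against division by zero
    return np.round((coords - lo) / span * (size - 1)).astype(int)


def tabular_to_images(X, size=8):
    """Turn each row of X into a size x size image.

    Features mapped to the same pixel are averaged; empty pixels stay 0.
    """
    pix = to_pixels(feature_coordinates(X), size)
    flat = pix[:, 0] * size + pix[:, 1]        # flattened pixel index per feature
    counts = np.bincount(flat, minlength=size * size)
    images = np.zeros((X.shape[0], size * size))
    for i, row in enumerate(X):
        sums = np.bincount(flat, weights=row, minlength=size * size)
        images[i] = sums / np.maximum(counts, 1)
    return images.reshape(-1, size, size)


# Illustrative usage with synthetic data (5 samples, 20 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 20))
imgs_demo = tabular_to_images(X_demo, size=6)
```

Each row of the table becomes one small image whose pixel layout is shared across samples, so features that behave similarly across samples end up near each other. Arrays of this shape could then be passed to a CNN; the choice of projection method and grid resolution would need to be tuned for real omics data.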