January 24, 2019 | Kishore Jaganathan, Sofia Kyriazopoulou Panagiotopoulou, Jeremy F. McRae, Siavash Fazel Darbandi, David Knowles, Yang I. Li, Jack A. Kosmicki, Juan Arbelaez, Wenwu Cui, Grace B. Schwartz, Eric D. Chow, Efstathios Kanterakis, Hong Gao, Amirali Kia, Serafim Batzoglou, Stephan J. Sanders, Kyle Kai-How Farh
A deep neural network, SpliceAI, accurately predicts splicing from pre-mRNA sequence, enabling precise identification of noncoding cryptic splice mutations. SpliceAI is a 32-layer deep neural network that predicts splice junctions from 10,000 nucleotides of flanking sequence. It was trained on GENCODE-annotated pre-mRNA sequences from a subset of chromosomes and tested on the remaining, held-out chromosomes, where it achieves 95% top-k accuracy in predicting splice junctions. Its predictions correlate strongly with RNA-seq data, and validation on RNA-seq shows high accuracy in identifying functional cryptic splice variants. Cryptic splice variants are frequently associated with alternative splicing, and those predicted by the network are strongly deleterious in the human population. Cryptic splice mutations are enriched in patients with autism and intellectual disability compared to healthy controls, and may account for up to 10% of pathogenic variants in neurodevelopmental disorders. These results indicate that cryptic splice mutations are a major cause of rare genetic disorders. The findings also suggest that deep learning models can provide biological insights into splicing mechanisms, predict the effects of de novo mutations, help identify novel disease genes, and improve the diagnosis of genetic disorders. The study provides a resource for the scientific community to understand the role of cryptic splice mutations in genetic variation and disease.
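The summary reports "95% top-k accuracy" without defining the metric. A minimal sketch of how such a figure can be computed is given here, assuming the usual definition for splice-site prediction: if a test set contains k true sites, take the k highest-scoring positions and report the fraction of them that are true sites. The function name `top_k_accuracy` and the toy scores and labels are hypothetical illustrations, not the authors' code or data.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of the k highest-scoring positions that are true sites,
    where k is the number of true sites (label == 1) in `labels`.

    Assumed definition; the summary itself does not spell the metric out.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    k = sum(1 for y in labels if y == 1)
    if k == 0:
        raise ValueError("labels contain no positive positions")
    # Rank positions by predicted score, highest first, and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    # Count how many of the top-k positions are annotated as true sites.
    hits = sum(1 for i in top_k if labels[i] == 1)
    return hits / k


# Hypothetical toy data: 8 positions, 3 of which are annotated splice sites.
toy_scores = [0.91, 0.05, 0.40, 0.88, 0.10, 0.02, 0.75, 0.30]
toy_labels = [1, 0, 1, 1, 0, 0, 0, 0]
toy_result = top_k_accuracy(toy_scores, toy_labels)
print(f"top-k accuracy: {toy_result:.3f}")
```

In the toy data, two of the three highest-scoring positions are annotated sites, so the metric is 2/3. Because k is tied to the number of true sites, the metric needs no hand-picked score threshold, which is useful when true sites are a tiny fraction of all positions.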