09 February 2024 | Jinsen Li, Tsu-Pei Chiu, Remo Rohs
This study introduces Deep DNAshape, a deep learning method that predicts 3D DNA structural parameters (DNA shape features) for any DNA sequence. Its predecessor, the pentamer-based DNAshape method, relied on a fixed query table and could account for at most 2 bp of flanking sequence on either side of the central base pair. Deep DNAshape instead predicts shape features for sequences of any length while accounting for longer-range flanking regions. It uses a specialized deep learning architecture that handles variable-length DNA sequences and computes the effects of flanking regions layer by layer. The model is trained on DNA shape features derived from Monte Carlo (MC) simulations and validated against tetramer query tables from molecular dynamics (MD) simulations.
Deep DNAshape outperforms the pentamer-based method in predicting DNA shape features, particularly in capturing the influence of extended flanking regions on DNA structure. The improved accuracy is demonstrated through two applications: predicting transcription factor (TF)-DNA binding specificity and analyzing DNA shape preferences at transcription start sites across different Drosophila species. The method provides a versatile tool for studying DNA structure-related phenomena, including protein-DNA binding mechanisms.
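To make the limitation of a pentamer query table concrete, the following minimal sketch shows a sliding-window lookup in which each position is assigned the table value of the pentamer centered on it. The function name `pentamer_lookup` and the contents of `toy_table` are hypothetical and chosen only for illustration; the numbers are made up and are not real DNA shape values, and this is not the DNAshape or Deep DNAshape implementation.

```python
from typing import Dict, List, Optional


def pentamer_lookup(seq: str, table: Dict[str, float]) -> List[Optional[float]]:
    """Assign each position the table value of the pentamer centered on it.

    Positions within 2 bp of either end have no complete pentamer and stay None,
    as do positions whose pentamer is missing from the table.
    """
    seq = seq.upper()
    values: List[Optional[float]] = [None] * len(seq)
    for i in range(2, len(seq) - 2):
        # The window covers the central base plus 2 bp of flank on each side.
        values[i] = table.get(seq[i - 2 : i + 3])
    return values


# Hypothetical toy table: made-up values, not real shape features.
toy_table = {"AACGT": 1.0, "ACGTT": 2.0, "CGTTA": 3.0}

same_core_a = pentamer_lookup("AACGTTA", toy_table)
same_core_b = pentamer_lookup("GACGTTC", toy_table)

# Both sequences share the pentamer ACGTT at index 3, so the lookup returns
# the same value there even though the outer flanks differ.
print(same_core_a[3], same_core_b[3])
```

In this sketch, any base more than 2 bp away from a position cannot change that position's value, which is the kind of restriction a model with a wider sequence context is meant to lift.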