iDNA-OpenPrompt: OpenPrompt learning model for identifying DNA methylation

16 April 2024 | Xia Yu, Jia Ren, Haixia Long, Rao Zeng, Guoqiang Zhang, Anas Bilal and Yani Cui
The paper introduces iDNA-OpenPrompt, a model for identifying DNA methylation sites that is built on the OpenPrompt learning framework. The model combines a prompt template, a prompt verbalizer, and a pre-trained language model (PLM) into a prompt learning framework for DNA methylation sequences. A DNA vocabulary library, a BERT tokenizer, and task-specific label words are also integrated to improve accuracy. The study evaluates the model on 17 benchmark datasets covering various species and three types of DNA methylation modification (4mC, 5hmC, and 6mA). The authors report that iDNA-OpenPrompt consistently outperforms existing methods in accuracy, reliability, and consistency.
They attribute the model's effectiveness to its ability to learn biological contextual semantics and to its robustness across species. The paper also examines how the DNA vocabulary and the label words affect accuracy, and identifies the optimal lengths for both components. Overall, the results position iDNA-OpenPrompt as a useful tool for methylation site identification in genomic research and disease studies.
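The way a template, a verbalizer, and a DNA vocabulary fit together can be illustrated with a small sketch. It is not the authors' implementation and does not use the OpenPrompt library: the overlapping k-mer tokenization, the template wording, the label words, and the function names `kmer_tokens`, `apply_template`, and `verbalize` are all assumptions chosen for illustration, and the PLM is replaced by a hand-written dictionary of scores at the mask position.

```python
from typing import Dict, List

MASK = "[MASK]"  # placeholder the language model would be asked to fill in


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers (assumed stand-in for a DNA vocabulary)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def apply_template(seq: str, k: int = 3) -> str:
    """Wrap the tokenized sequence in a cloze-style prompt with one mask slot.

    The template wording is hypothetical, not taken from the paper.
    """
    return f"{' '.join(kmer_tokens(seq, k))} . This site is {MASK} ."


def verbalize(mask_scores: Dict[str, float],
              label_words: Dict[str, List[str]]) -> str:
    """Pick the class whose label words have the highest mean score at the mask."""
    class_scores = {
        label: sum(mask_scores.get(word, 0.0) for word in words) / len(words)
        for label, words in label_words.items()
    }
    return max(class_scores, key=class_scores.get)


# Hypothetical label words for a binary methylation task.
LABEL_WORDS = {
    "methylated": ["methylated", "modified"],
    "unmethylated": ["unmethylated", "normal"],
}

# Made-up scores standing in for a PLM's predictions at the mask position.
EXAMPLE_SCORES = {"methylated": 0.6, "modified": 0.2,
                  "unmethylated": 0.1, "normal": 0.1}

if __name__ == "__main__":
    print(apply_template("ACGTAC"))
    print(verbalize(EXAMPLE_SCORES, LABEL_WORDS))
```

With these made-up scores the verbalizer selects `methylated`, because that class's label words average 0.4 against 0.1 for the other class. In a real prompt learning setup the scores would come from the PLM's output distribution over its vocabulary at the mask token, and the template and label words would be tuned, which is the kind of sensitivity the paper reports studying.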