DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing

08 July 2023 | Peng Ni, Fan Nie, Zeyu Zhong, Jinrui Xu, Neng Huang, Jun Zhang, Haochen Zhao, You Zou, Yuanfeng Huang, Jinchen Li, Chuan-Le Xiao, Feng Luo, Jianxin Wang
This study introduces ccsmeth, a deep-learning method for detecting DNA 5-methylcytosine at CpG sites (5mCpG) using PacBio circular consensus sequencing (CCS) reads. ccsmeth uses bidirectional Gated Recurrent Unit (BiGRU) and attention neural networks to predict methylation states at both the read level and the site level. The method is trained on CCS reads from PCR-treated (unmethylated) and M.SssI-treated (methylated) human DNA, and achieves an accuracy of 0.90 and an Area Under the Curve (AUC) of 0.97 for 5mCpG detection at single-molecule resolution.
At the genome-wide site level, ccsmeth achieves correlations >0.90 with bisulfite sequencing (BS-seq) and nanopore sequencing using only 10× read coverage. The authors also develop ccsmethphase, a Nextflow pipeline that detects haplotype-aware methylation from CCS reads, and validate it on a Chinese family trio. The study further assesses PacBio CCS for methylation detection and phasing in repetitive genomic regions, showing that it can detect the methylation states of CpGs in RepeatMasker repeats, segmental duplications, and peri/centromeric satellites with high accuracy. Overall, ccsmeth and ccsmethphase provide robust and accurate tools for 5mCpG detection and methylation phasing using PacBio CCS data.
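To make the relationship between read-level and site-level results concrete, the following is a minimal sketch of how per-read methylation probabilities could be aggregated into per-site methylation frequencies and then compared with a reference such as BS-seq using Pearson correlation. It is an illustration under stated assumptions, not ccsmeth's actual implementation: the function names, the 0.5 probability cutoff, the simple count-based aggregation, and the toy data are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def site_frequencies(read_calls, prob_cutoff=0.5, min_coverage=1):
    """Aggregate read-level calls into site-level methylation frequencies.

    read_calls: iterable of (chrom, pos, prob_methylated), one per read-level call.
    A call counts as methylated when its probability is >= prob_cutoff
    (an assumed cutoff, chosen only for illustration).
    """
    counts = defaultdict(lambda: [0, 0])  # site -> [methylated calls, total calls]
    for chrom, pos, prob in read_calls:
        entry = counts[(chrom, pos)]
        entry[0] += 1 if prob >= prob_cutoff else 0
        entry[1] += 1
    return {
        site: methylated / total
        for site, (methylated, total) in counts.items()
        if total >= min_coverage
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy read-level calls (hypothetical values, not data from the study).
read_calls = [
    ("chr1", 100, 0.95), ("chr1", 100, 0.80), ("chr1", 100, 0.10), ("chr1", 100, 0.90),
    ("chr1", 250, 0.05), ("chr1", 250, 0.20),
    ("chr1", 400, 0.60), ("chr1", 400, 0.40),
]

# Toy reference frequencies, standing in for a BS-seq track (hypothetical).
reference = {("chr1", 100): 0.8, ("chr1", 250): 0.1, ("chr1", 400): 0.5}

freqs = site_frequencies(read_calls)

# Compare only the sites present in both call sets.
shared = sorted(set(freqs) & set(reference))
r = pearson([freqs[s] for s in shared], [reference[s] for s in shared])
print(f"sites compared: {len(shared)}, Pearson r = {r:.3f}")
```

In practice, a coverage filter (here the `min_coverage` parameter) matters for this kind of comparison, because frequencies estimated from very few reads are noisy; the summary's 10× figure refers to the read coverage at which the reported correlations were obtained.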