CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model

2013, Vol. 41, No. 6 | Liguo Wang, Hyun Jung Park, Surendra Dasari, Shengqin Wang, Jean-Pierre Kocher, Wei Li
The Coding-Potential Assessment Tool (CPAT) is an alignment-free method that rapidly distinguishes coding from noncoding transcripts in deep transcriptome sequencing data. CPAT uses a logistic regression model built on four sequence features: open reading frame (ORF) size, ORF coverage, the Fickett TESTCODE statistic, and hexamer usage bias. It outperforms state-of-the-art alignment-based methods such as Coding-Potential Calculator (CPC) and Phylo Codon Substitution Frequencies (PhyloCSF) in both sensitivity and specificity. It is also far faster, processing thousands of transcripts in seconds where those methods take hours or days. The software accepts FASTA- or BED-formatted input files, and a web interface is available for instant predictions. This combination of accuracy and speed makes CPAT a valuable genome annotation tool for the growing RNA-seq community.
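To make the four-feature model more concrete, here is a minimal Python sketch of how two of the features (ORF size and ORF coverage) might be computed and how a logistic function could combine all four into a coding probability. It is an illustration under stated assumptions, not CPAT's actual code: the function names, the forward-strand ATG-to-stop ORF definition, and the coefficient values are all hypothetical, and the Fickett and hexamer scores are simply passed in as precomputed numbers.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Length in nt of the longest ATG-to-stop ORF on the forward strand.

    Simplification (assumption): only complete ORFs are counted, the stop
    codon is included in the length, and the reverse strand is ignored.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # look for the next ORF in this frame
    return best


def orf_coverage(seq):
    """Fraction of the transcript covered by its longest ORF."""
    return longest_orf(seq) / len(seq) if seq else 0.0


def coding_probability(orf_size, coverage, fickett, hexamer, coef):
    """Logistic model: p = 1 / (1 + exp(-z)), z = linear combination."""
    z = (coef["intercept"]
         + coef["orf_size"] * orf_size
         + coef["orf_coverage"] * coverage
         + coef["fickett"] * fickett
         + coef["hexamer"] * hexamer)
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder coefficients for illustration only; NOT CPAT's fitted values.
EXAMPLE_COEF = {
    "intercept": -3.0,
    "orf_size": 0.01,
    "orf_coverage": 1.0,
    "fickett": 2.0,
    "hexamer": 3.0,
}

if __name__ == "__main__":
    transcript = "CCATGAAATTTTAGGG"
    size = longest_orf(transcript)
    cov = orf_coverage(transcript)
    # Fickett and hexamer scores are assumed to be computed elsewhere.
    p = coding_probability(size, cov, fickett=0.9, hexamer=0.1,
                           coef=EXAMPLE_COEF)
    print(size, cov, round(p, 3))
```

In a real workflow the coefficients would be fitted on labeled coding and noncoding transcripts, and a probability cutoff chosen on held-out data would separate the two classes.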