2006 | Mario Stanke*, Oliver Keller¹, Irfan Gunduz², Alec Hayes², Stephan Waack¹ and Burkhard Morgenstern
AUGUSTUS is a gene prediction tool for eukaryotes based on a Generalized Hidden Markov Model (GHMM). The original version predicted one transcript per gene, ignoring alternative splicing. The authors present an extended version that can predict multiple splice variants per gene, which makes AUGUSTUS the first ab initio gene finder capable of this. The tool also includes a motif search that accepts user-defined regular expressions. The web server and the standalone program are freely available.
The paper discusses the importance of gene prediction tools in bioinformatics, noting that existing tools have limitations. AUGUSTUS was evaluated in the EGASP workshop and performed well in ab initio gene prediction. Its predictions can be improved further by using BLAST hits and alignments. The new version lets users control the number of predicted splice variants per gene, trading sensitivity against specificity. The tool also includes a motif search feature for user-defined patterns.
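To make the motif search concrete, here is a minimal Python sketch of a regular-expression search over protein sequences in FASTA format. It is not the AUGUSTUS implementation: the function names `parse_fasta` and `find_motif`, the record names, and the example pattern are all hypothetical, and the sketch assumes that patterns are matched against the predicted protein sequences.

```python
import re


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def find_motif(proteins, pattern):
    """Return (record name, 1-based start, matched text) for every hit."""
    regex = re.compile(pattern)
    hits = []
    for name, seq in proteins.items():
        for match in regex.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    # Hypothetical predicted proteins; the names are illustrative only.
    fasta = ">g1.t1 example\nMKCAAC\nLL\n>g2.t1\nMPPPP\n"
    # Hypothetical pattern: two cysteines separated by any two residues.
    for hit in find_motif(parse_fasta(fasta), r"C.{2}C"):
        print(hit)
```

Note that `re.finditer` reports non-overlapping matches only; a search that must report overlapping motif occurrences would need a lookahead-based pattern instead.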
The method estimates posterior probabilities for exons, introns, transcripts, and genes from a random sample of parses of the input sequence. Predicted transcripts are then filtered by posterior probability and by overlap. The web server offers four settings for the number of alternative transcripts: single, few, medium, and many.
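The sampling idea can be illustrated with a short Python sketch. It assumes that a posterior probability is estimated as the fraction of sampled parses that contain a given exon or transcript, and it applies only a probability threshold; the overlap criterion is left out. The data model (a parse as a list of transcripts, a transcript as a tuple of `(start, end)` exons), the function names, and the threshold values are hypothetical and are not taken from the AUGUSTUS source code.

```python
from collections import Counter


def posterior_estimates(sampled_parses):
    """Estimate posteriors as relative frequencies over the sampled parses.

    Each parse is an iterable of transcripts; each transcript is a tuple of
    (start, end) exons. This data model is an assumption for illustration.
    """
    n = len(sampled_parses)
    exon_counts = Counter()
    transcript_counts = Counter()
    for parse in sampled_parses:
        transcripts = set(parse)
        # Count every exon and transcript at most once per parse.
        exons = {exon for transcript in transcripts for exon in transcript}
        exon_counts.update(exons)
        transcript_counts.update(transcripts)
    exon_post = {exon: count / n for exon, count in exon_counts.items()}
    transcript_post = {t: count / n for t, count in transcript_counts.items()}
    return exon_post, transcript_post


def filter_transcripts(transcript_post, min_prob):
    """Keep transcripts with estimated posterior >= min_prob, best first."""
    kept = [t for t, p in transcript_post.items() if p >= min_prob]
    return sorted(kept, key=lambda t: -transcript_post[t])


if __name__ == "__main__":
    # Two hypothetical splice variants that differ in their second exon.
    t1 = ((100, 200), (300, 400))
    t2 = ((100, 200), (350, 400))
    parses = [[t1], [t1], [t1], [t2]]
    exon_post, transcript_post = posterior_estimates(parses)
    print(exon_post)
    # A stricter threshold keeps fewer alternative transcripts.
    print(filter_transcripts(transcript_post, min_prob=0.5))
    print(filter_transcripts(transcript_post, min_prob=0.2))
```

Lowering `min_prob` keeps more alternative transcripts per gene, which is one plausible way to picture the move from the single setting toward the many setting; the actual cutoffs used by the web server are not given here.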
The tool's output includes graphical and text formats, with results in GFF and FASTA. The pattern-searching option allows users to search for conserved protein patterns. The tool was tested on the EGASP dataset, showing improved sensitivity when predicting multiple transcripts. The results show that predicting more transcripts increases gene-level sensitivity, though it may reduce specificity at lower levels. The tool is useful for finding at least one correct splice variant for a gene. Funding was provided by BMBF and DAAD.