AUGUSTUS: a web server for gene finding in eukaryotes

2004 | Mario Stanke, Rasmus Steinkamp, Stephan Waack and Burkhard Morgenstern
AUGUSTUS is a web server for gene finding in eukaryotic genomes. It is based on a generalized Hidden Markov Model (GHMM) with a new method for modeling the intron length distribution, which improves the accuracy of gene prediction, especially for longer sequences. The server is available at http://augustus.gobics.de and lets users submit DNA sequences either by uploading a file in FASTA format or by pasting them into a web form. It supports human and Drosophila, with parameter sets tailored to the average GC content of the input sequence.
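To make the GC-content step concrete, the following is a minimal sketch of how a FASTA input could be read and assigned to a GC class before a parameter set is chosen. It is not the server's code: the function names and, in particular, the class boundaries in `HYPOTHETICAL_GC_BOUNDS` are assumptions made for illustration only.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases A, C, G, T."""
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


# Hypothetical boundaries between GC classes; the thresholds used by
# any real parameter set are not given in the text above.
HYPOTHETICAL_GC_BOUNDS = [0.43, 0.51, 0.57]


def gc_class(gc, bounds=HYPOTHETICAL_GC_BOUNDS):
    """Return the index of the GC class that the value gc falls into."""
    for index, upper in enumerate(bounds):
        if gc < upper:
            return index
    return len(bounds)


# Example: classify each record of a small in-memory FASTA input.
example = StringIO(">seq1\nACGTGGCC\n>seq2\nATATATGC\n")
for name, seq in read_fasta(example):
    gc = gc_content(seq)
    print(name, round(gc, 2), gc_class(gc))
```

Ambiguous bases such as N are excluded from the denominator here so that masked regions do not bias the estimate; that is a design choice of this sketch, not a documented behaviour of the server.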
The server provides a user-friendly interface for gene prediction, with options for predicting either complete or partial genes. It also includes an 'expert option' to ignore conflicts between gene structures on opposite strands, which helps avoid false positives. Results are returned in both graphical and text form, with the text output in the General Feature Format (GFF). AUGUSTUS has been evaluated on human and Drosophila sequences and was more accurate than existing gene-finding tools, particularly on longer sequences. It models intron lengths by combining explicit length modeling with a geometric distribution, which yields a more realistic length distribution than a geometric distribution alone. The program's performance is further enhanced by integrating extrinsic information from protein and EST databases, and future work includes incorporating comparative gene-finding methods and homology information from DIALIGN.
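The idea of combining explicit length modeling with a geometric distribution can be sketched as a mixture: short intron lengths get individually specified probabilities, and all longer lengths share a geometric tail. The code below is an illustrative assumption, not the parameterization used by AUGUSTUS; the function name and the arguments `short_counts`, `tail_weight` and `q` are hypothetical.

```python
def make_intron_length_pmf(short_counts, tail_weight, q):
    """Build a length distribution with an explicit head and a geometric tail.

    short_counts: observed counts for lengths 1..d (the explicit part)
    tail_weight:  total probability assigned to lengths greater than d
    q:            per-base continuation probability of the geometric tail
    """
    if not 0.0 <= tail_weight <= 1.0:
        raise ValueError("tail_weight must lie in [0, 1]")
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = sum(short_counts)
    if total <= 0:
        raise ValueError("short_counts must contain a positive count")

    d = len(short_counts)
    # Explicit part: relative frequencies scaled to sum to 1 - tail_weight.
    explicit = [(1.0 - tail_weight) * c / total for c in short_counts]

    def pmf(length):
        if length < 1:
            return 0.0
        if length <= d:
            return explicit[length - 1]
        # Geometric part: the probabilities for lengths above d
        # sum to tail_weight.
        return tail_weight * (1.0 - q) * q ** (length - d - 1)

    return pmf


# Example with made-up counts for lengths 1..4 and a slowly decaying tail.
pmf = make_intron_length_pmf([1, 3, 4, 2], tail_weight=0.4, q=0.99)
print(pmf(2), pmf(5), pmf(500))
```

A purely geometric distribution always assigns its highest probability to the shortest length, which real introns do not follow; giving the short lengths explicit probabilities removes that constraint while the geometric tail keeps long introns cheap to model.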