The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes

5/14/08 | F. Meyer¹,²,*, D. Paarmann², M. D'Souza², R. Olson¹, E. M. Glass¹, M. Kubal², T. Paczian¹, R. Stevens¹,², A. Wilke², J. Wilkening¹, R. A. Edwards¹,³
The metagenomics RAST server is a public resource for the automatic phylogenetic and functional analysis of metagenomes. It provides a high-throughput pipeline that processes metagenome sequence data and assigns functions automatically by comparing the sequences against protein and nucleotide databases. The server generates phylogenetic and functional summaries of each metagenome and includes tools for comparative metagenomics, including comparison of metabolism and annotations between metagenomes and genomes, as well as comprehensive search capabilities.
Access is controlled to protect data privacy, while the collaborative environment allows data to be shared among multiple users. After registering, users can access their data sets, delegate authorization to others, and release data to the public. All users retain full control of their data, and everything is available for download in various formats.
The server is an open-source system based on the SEED framework for comparative genomics and built from open-source components, including NCBI BLAST, SQLite, and Sun Grid Engine. The pipeline is modular so that new analysis methods can be added. Users upload raw sequence data in FASTA format; the pipeline accepts 454 reads, Sanger sequences, and assembled sequences. A normalization step generates unique internal IDs and removes duplicate sequences. Sequences are then screened for potential protein-encoding genes using BLASTX and compared against several databases, including rDNA databases and boutique databases. From the matches to these external databases, the server computes derived data, including phylogenomic reconstructions and functional classifications, and generates summaries automatically.
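The normalization step (unique internal IDs plus duplicate removal) can be illustrated with a minimal Python sketch. This is not the server's own code. The function names, the `seq` ID prefix, the zero-padded ID format, and the choice to treat only exact, case-insensitive sequence matches as duplicates are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def normalize(
    records: Iterable[Tuple[str, str]], prefix: str = "seq"
) -> Tuple[List[Tuple[str, str]], Dict[str, str]]:
    """Assign sequential internal IDs and drop exact duplicate sequences.

    Assumption: two reads are duplicates only if their upper-cased
    sequences are identical. Returns the unique records and a map from
    each internal ID to the original header of its first occurrence.
    """
    seen = set()
    unique: List[Tuple[str, str]] = []
    id_map: Dict[str, str] = {}
    for header, seq in records:
        seq = seq.upper()
        # Hash the sequence so the lookup set stays small for long reads.
        digest = hashlib.sha1(seq.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        internal_id = f"{prefix}{len(unique) + 1:07d}"  # hypothetical ID format
        unique.append((internal_id, seq))
        id_map[internal_id] = header
    return unique, id_map
```

Keeping the map from internal ID to original header lets results be reported under the names the user uploaded, even though the pipeline works with its own identifiers.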
The server provides a web-based interface for browsing and analyzing data, downloading results, and adjusting the parameters used for functional, metabolic, and phylogenetic reconstructions. Comparative metagenomics tools let users compare their data against other metagenomes or complete genomes and highlight the differences through heatmaps and taxonomic profiles. The server handles both assembled and unassembled data, and each approach has its own advantages. The analytical methods integrated into the pipeline provide core annotation and analysis tools for comparing diverse metagenomes. The subsystems-based functional analysis has been validated with 90 samples from nine major biomes, which separated clearly by functional composition. The service is available to all users after registration and provides results in several formats, including GFF3, GenBank, and flat text. It is open source and supports future developments in data mining and in the analysis of 16S-based metagenome datasets.
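To make the heatmap-style comparison of functional profiles concrete, here is a minimal Python sketch. It converts per-category hit counts into relative abundances and arranges them as a matrix with one row per category and one column per metagenome, the shape a heatmap would render. The function names, the sample names, and the category labels are hypothetical, and the source does not state how the server itself scales counts, so the simple fraction-of-total scaling is an assumption.

```python
from typing import Dict, List, Tuple


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw hit counts into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


def profile_matrix(
    samples: Dict[str, Dict[str, int]]
) -> Tuple[List[str], List[str], List[List[float]]]:
    """Build a category-by-sample matrix of relative abundances.

    Rows follow the sorted category names; columns follow the order in
    which samples were given. Categories missing from a sample count as 0.
    """
    sample_names = list(samples)
    categories = sorted({c for counts in samples.values() for c in counts})
    fractions = {s: relative_abundance(samples[s]) for s in sample_names}
    matrix = [
        [fractions[s].get(c, 0.0) for s in sample_names] for c in categories
    ]
    return categories, sample_names, matrix


# Usage with made-up counts (illustrative only, not real data).
example = {
    "soil": {"Carbohydrates": 30, "Respiration": 10},
    "marine": {"Carbohydrates": 10, "Photosynthesis": 10},
}
rows, cols, values = profile_matrix(example)
```

Scaling by the sample total keeps metagenomes of very different sequencing depth comparable, which is why the sketch works with fractions instead of raw counts.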