The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

The SEED and RAST are integrated systems for microbial genome annotation. The SEED provides a platform for consistent and accurate annotation across thousands of genomes, while RAST is an automated annotation server built on subsystems technology. The SEED database contains subsystems, which are collections of functionally related protein families, and the FIGfams derived from them, which RAST uses for annotation. When a new genome is submitted to RAST, its genes are identified and annotated by comparison against the FIGfam collection. If the genome is public, it is added to the SEED and its proteins contribute to the FIGfam collection. This cycle keeps genome annotation robust and scalable.

The SEED continuously integrates several kinds of genomic data: public genomes annotated by RAST, expert annotations, metabolic modeling data, expression data, and links to other databases. The SEED website offers tools for genome annotation and comparison, including the 'Compare Regions View', which lets users compare genomic regions across genomes, the 'Sequence Based Comparison Tool', and the 'Function Based Comparison Tool'. The SEED also supports several online resources, including NMPDR, PATRIC, PhAnToMe, Model SEED, and the U.S. Department of Energy KBase project.

RAST lets users submit genomes and receive annotations, which are then integrated into the SEED database. It has been used to annotate over 60,000 distinct genomes, and its use has increased significantly over the past 16 years. The RAST pipeline involves several steps, including identifying selenoproteins and pyrrolysoproteins, estimating phylogenetic neighbors, identifying tRNA and rRNA genes, and assigning functions to genes based on k-mers and BLAST similarities.
The pipeline also includes iterative retraining of gene-calling algorithms and the identification of missed genes. The SEED and RAST systems are interconnected, with annotations from RAST being integrated into the SEED database and vice versa. The SEED and RAST systems are continually updated and improved, with new tools and features being added to enhance genome annotation and analysis. The SEED provides a robust framework for integrating and analyzing genomic data, while RAST offers an automated system for genome annotation. Together, they form a powerful platform for microbial genome research and analysis.
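
To make the idea of k-mer-based function assignment concrete, here is a minimal Python sketch. It is not RAST's actual implementation: the k-mer length, the `min_hits` threshold, the function names, and the toy sequences are all assumptions chosen for illustration. It indexes k-mers that occur under exactly one function label and then assigns a query protein the function that collects the most matching k-mers.

```python
from collections import Counter

K = 8  # assumed k-mer length for this toy example


def kmers(seq, k=K):
    """Yield every overlapping k-mer of a protein sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_signature_index(families, k=K):
    """Map each k-mer to a function label, keeping only k-mers
    that appear under exactly one function (signature k-mers)."""
    seen = {}
    for function, seqs in families.items():
        for seq in seqs:
            for km in kmers(seq, k):
                seen.setdefault(km, set()).add(function)
    return {km: next(iter(funcs)) for km, funcs in seen.items() if len(funcs) == 1}


def assign_function(seq, index, k=K, min_hits=2):
    """Return the function with the most signature k-mer hits,
    or None if no function reaches min_hits (hypothetical threshold)."""
    votes = Counter(index[km] for km in kmers(seq, k) if km in index)
    if not votes:
        return None
    function, hits = votes.most_common(1)[0]
    return function if hits >= min_hits else None


# Made-up sequences and labels, standing in for a family collection.
TOY_FAMILIES = {
    "toy function A": ["MSDKIIHLTDDSFDTDVLKA"],
    "toy function B": ["MKRLLVTGAAGFIGSHLVER"],
}
index = build_signature_index(TOY_FAMILIES)

print(assign_function("DKIIHLTDDSFDT", index))  # fragment of the first toy sequence
print(assign_function("AAAAAAAAAAAA", index))   # shares no k-mers with either family
```

In this sketch a query with too few signature hits is left unassigned, which is where a similarity-based fallback such as BLAST would take over in a fuller pipeline.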


2014 | Ross Overbeek, Robert Olson, Gordon D. Pusch, Gary J. Olsen, James J. Davis, Terry Disz, Robert A. Edwards, Svetlana Gerdes, Bruce Parrello, Maulik Shukla, Veronika Vonstein, Alice R. Wattam, Fangfang Xia and Rick Stevens