2014, Vol. 42, Database issue | Ross Overbeek, Robert Olson, Gordon D. Pusch, Gary J. Olsen, James J. Davis, Terry Disz, Robert A. Edwards, Svetlana Gerdes, Bruce Parrello, Maulik Shukla, Veronika Vonstein, Alice R. Wattam, Fangfang Xia and Rick Stevens
The SEED (http://pubseed.theseed.org/) is a comprehensive platform for microbial genome annotation that integrates genomic data, a genome database, a web front end, an API, and server scripts. It serves as a resource for predicting gene functions and discovering new pathways. The SEED houses subsystems and the FIGfams derived from them, which are central to the RAST (Rapid Annotation using Subsystems Technology) annotation engine. When a new genome is submitted to RAST, its genes are annotated by comparison against the FIGfam collection. If the genome is made public, it is added to the SEED, and its proteins populate the FIGfam collection. This cycle has proven robust and scalable as the number of genomes to annotate grows: over 12,000 users have annotated over 60,000 distinct genomes using RAST. The SEED and RAST are interconnected, with RAST annotations integrated back into the SEED for curation. The RAST pipeline includes steps for identifying special-case genes, determining phylogenetic neighbors, identifying tRNA and rRNA genes, testing gene candidates, and assigning functions. The SEED also supports manual improvement of RAST-annotated genomes and provides high-performance web services for computation against SEED data. Future developments include improved performance, accuracy, and user interface, as well as specialized tools for recognizing specific genome features.
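The annotate-then-feed-back cycle described above can be illustrated with a toy Python sketch. Everything in it is a hypothetical stand-in rather than SEED or RAST code: the `Family` class, the `annotate` and `publish` functions, the k-mer Jaccard score, the 0.5 threshold, and the example sequences are all assumptions chosen only to make the loop concrete. It shows a protein family collection being used to assign functions to a new genome, and the annotated proteins of a public genome then being added back to that collection.

```python
from dataclasses import dataclass, field


def kmers(seq: str, k: int = 3) -> set:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


@dataclass
class Family:
    """Toy stand-in for a protein family: one function plus member proteins."""
    function: str
    members: list = field(default_factory=list)

    def score(self, seq: str) -> float:
        """Best Jaccard similarity of k-mer sets between seq and any member."""
        query = kmers(seq)
        best = 0.0
        for member in self.members:
            ref = kmers(member)
            union = query | ref
            if union:
                best = max(best, len(query & ref) / len(union))
        return best


def annotate(proteins: dict, families: dict, threshold: float = 0.5) -> dict:
    """Map each protein ID to its best-scoring family ID, or None if too weak."""
    calls = {}
    for pid, seq in proteins.items():
        best_id, best_score = None, 0.0
        for fam_id, fam in families.items():
            s = fam.score(seq)
            if s > best_score:
                best_id, best_score = fam_id, s
        calls[pid] = best_id if best_score >= threshold else None
    return calls


def publish(proteins: dict, calls: dict, families: dict) -> None:
    """Add the annotated proteins of a public genome to their families."""
    for pid, fam_id in calls.items():
        if fam_id is not None and proteins[pid] not in families[fam_id].members:
            families[fam_id].members.append(proteins[pid])


# Hypothetical family collection and a two-protein "genome".
families = {
    "FAM1": Family("hypothetical kinase", ["MKTAYIAKQR"]),
    "FAM2": Family("hypothetical transporter", ["MLLSVPLLLG"]),
}
genome = {"p1": "MKTAYIAKQK", "p2": "GGGGGGGGGG"}

calls = annotate(genome, families)   # step 1: annotate against the collection
publish(genome, calls, families)     # step 2: public genome feeds the collection
```

In this sketch, `p1` is assigned to `FAM1` and then becomes a member of it, while `p2` matches no family and stays unannotated. The design point is only the feedback loop: each published genome enlarges the collection that the next submission is compared against.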