IMG: the integrated microbial genomes database and comparative analysis system


2012 | Victor M. Markowitz, I-Min A. Chen, Krishna Palaniappan, Ken Chu, Ernest Szeto, Yuri Grechkin, Anna Ratner, Biju Jacob, Jinghua Huang, Peter Williams, Marcel Huntemann, Iain Anderson, Konstantinos Mavromatis, Natalia N. Ivanova and Nikos C. Kyrpides
The Integrated Microbial Genomes (IMG) system is a comprehensive resource for comparative analysis of publicly available microbial genomes. It integrates draft and complete genomes from all three domains of life, along with numerous plasmids and viruses. IMG provides tools for analyzing and reviewing gene and genome annotations in a comparative context. Since its initial release in March 2005, IMG has continuously expanded its data content and analytical capabilities. It is available at http://img.jgi.doe.gov, with companion systems such as IMG/ER for expert review, IMG/EDU for educational purposes, and IMG/HMP for analysis of Human Microbiome Project-related genomes.

IMG integrates data from NCBI's RefSeq as its main source of public genome sequence data, and includes primary annotations consisting of predicted genes and protein products. It also integrates metadata from GOLD and fills in additional information such as CRISPR repeats, signal peptides, and transmembrane helices. Missing RNAs are identified using various tools.

Genes are associated with secondary functional annotations and lists of related genes. IMG's annotations include protein family and domain characterizations based on COG clusters, Pfam, TIGRfam, InterPro, Gene Ontology, and KEGG Orthology terms. KEGG pathways are associated with IMG genomes based on the assignment of KEGG Orthology (KO) terms to IMG genes. The MetaCyc collection of pathways is also available in IMG. Genes are characterized using IMG terms and pathways defined by domain experts. Transporter genes are linked to the Transport Classification Database based on their assignment to COG, Pfam, or TIGRfam domains or IMG terms. For each gene, IMG provides lists of related genes based on sequence similarities. The system identifies gene fusions and conserved gene cassettes.

It also includes tools for genome data analysis, such as genome, scaffold, gene, and function 'carts' that handle lists of genomes, scaffolds, genes, and functions.
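
The summary above notes that KEGG pathways are associated with genomes through the KO terms assigned to their genes. A minimal sketch of that kind of two-step lookup is shown here. It is not IMG's implementation: the function name `associate_pathways`, the dictionary layout, and every identifier in the sample data are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def associate_pathways(gene_to_kos, ko_to_pathways):
    """Map each pathway to the set of genes whose KO terms belong to it.

    gene_to_kos:     dict of gene ID -> list of KO terms assigned to that gene
    ko_to_pathways:  dict of KO term -> list of pathways containing that KO
    """
    pathway_to_genes = defaultdict(set)
    for gene, kos in gene_to_kos.items():
        for ko in kos:
            # A KO term with no known pathway contributes nothing.
            for pathway in ko_to_pathways.get(ko, ()):
                pathway_to_genes[pathway].add(gene)
    return dict(pathway_to_genes)


# Placeholder identifiers, not real IMG gene IDs or KEGG accessions.
gene_to_kos = {
    "gene_1": ["KO_a"],
    "gene_2": ["KO_a", "KO_b"],
    "gene_3": [],  # a gene without any KO assignment
}
ko_to_pathways = {
    "KO_a": ["pathway_X"],
    "KO_b": ["pathway_X", "pathway_Y"],
}

result = associate_pathways(gene_to_kos, ko_to_pathways)
for pathway in sorted(result):
    print(pathway, sorted(result[pathway]))
```

In this sketch a gene without a KO assignment, such as `gene_3`, does not appear under any pathway, which reflects the dependence of the pathway association on KO assignment described in the text.
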
Data selection tools allow users to select genomes, genes, and functions using browsers, search tools, and BLAST searches. IMG also includes tools for protein expression data analysis, such as 'Protein Expression Studies' and 'Protein Expression Experiments'. These tools allow users to examine expressed genes in the context of pathways and compare gene expression levels between samples.

Comparative analysis tools include the 'Phylogenetic Profiler', 'Abundance Profile Overview', and 'Function Profile' for comparing gene content and functional capabilities. IMG also provides tools for comparing genomes in terms of sequence conservation, such as VISTA, Artemis, and Dotplot tools. IMG/ER provides tools for identifying and correcting annotation anomalies and filling annotation gaps. The system also includes a pangenome framework for analyzing genomic data, with five pangenomes and analysis tools for exploring and comparing pangenomes and genomes. IMG's data content has grown significantly, with over 300