2017 | Minoru Kanehisa¹, Miho Furumichi¹, Mao Tanabe¹, Yoko Sato² and Kanae Morishima¹
KEGG is an integrated database resource that provides information on genes, genomes, pathways, diseases, and drugs. It aims to assign functional meanings to genes and genomes at both molecular and higher levels. The KEGG database includes the KO (KEGG Orthology) database, which stores molecular-level functions as functional orthologs of genes and proteins. Higher-level functions are represented through KEGG pathway maps, BRITE hierarchies, and KEGG modules. The KO database has been expanded and improved, with new content added regardless of whether KOs appear in the molecular network databases. The GENES database now includes an addendum category for experimentally characterized proteins, and the DISEASE and DRUG databases have been improved through systematic analysis of drug labels to better integrate diseases and drugs with KEGG molecular networks.
KEGG was originally developed in 1995 as an integrated database for biological interpretation of completely sequenced genomes. It has since expanded to include various databases, such as PATHWAY, GENES, COMPOUND, and ENZYME. Pathway mapping has evolved along the way: BRITE and MODULE were introduced, and KO took over the role that ENZYME had played in KEGG pathway mapping. KEGG is now used for analyzing not only genomics data but also transcriptomics, proteomics, glycomics, metabolomics, and other high-throughput data.
In 2016, KEGG was updated to become a more comprehensive knowledge base for assisting biological interpretations of large-scale molecular datasets. The focus has shifted to improving and expanding the KO database, with existing KOs linked to experimentally characterized protein sequence data and new KOs defined based on published reports. The DISEASE and DRUG databases have also been improved through systematic analysis of drug labels to better integrate diseases and drugs with KEGG molecular networks.
KEGG consists of fifteen manually curated databases and a computationally generated database in four categories. The databases in the systems information category include PATHWAY, BRITE, and MODULE, which constitute the reference knowledge base for understanding higher-level systemic functions of the cell and organism. The genomic information category includes the KO database, which organizes knowledge of molecular-level functions with the concept of functional orthologs. The chemical information category includes COMPOUND, GLYCAN, REACTION, RCLASS, and ENZYME, collectively called KEGG LIGAND. The health information category includes DISEASE, DRUG, DGROUP, and ENVIRON, as well as two outside databases for drug labels: Japanese drug labels and FDA drug labels.
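The category layout described above can be sketched as a small lookup table. This is only an illustration: the names `KEGG_CATEGORIES`, `CATEGORY_OF`, and `category_of` are hypothetical, and the table holds only the databases this text names explicitly, so it does not reach the full count of sixteen.

```python
# Category layout as listed in the text; names here are illustrative only.
KEGG_CATEGORIES = {
    "systems": ["PATHWAY", "BRITE", "MODULE"],
    "genomic": ["KO"],  # only the database the text names for this category
    "chemical": ["COMPOUND", "GLYCAN", "REACTION", "RCLASS", "ENZYME"],  # KEGG LIGAND
    "health": ["DISEASE", "DRUG", "DGROUP", "ENVIRON"],
}

# Reverse index: database name -> category name.
CATEGORY_OF = {db: cat for cat, dbs in KEGG_CATEGORIES.items() for db in dbs}


def category_of(database: str) -> str:
    """Return the category of a database name (case-insensitive).

    Raises KeyError for names that are not in the table.
    """
    return CATEGORY_OF[database.upper()]


if __name__ == "__main__":
    for name in ("module", "KO", "rclass", "DGROUP"):
        print(f"{name.upper():8s} -> {category_of(name)}")
```

Keeping a reverse index alongside the category table makes a per-database lookup a single dictionary access, which is convenient when annotating large result sets.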
KEGG identifiers are unique identifiers for each KEGG object, including genes, proteins, small molecules, reactions, pathways, diseases, and drugs. The KEGG website architecture has been updated to simplify its overall structure without losing content. The KEGG home page is directly linked to main databases and software tools.
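Because each KEGG object carries its own identifier, the identifier alone is often enough to tell which database an entry belongs to. The sketch below assumes the commonly used prefix-plus-five-digit formats (for example `K00001` for a KO entry or `C00001` for a compound). These formats are not stated in the text above and are an assumption here, and the function name `classify` is hypothetical.

```python
import re
from typing import Optional

# Assumed identifier formats (prefix + five digits); not taken from the text.
_PATTERNS = [
    (re.compile(r"K\d{5}"), "KO"),
    (re.compile(r"C\d{5}"), "COMPOUND"),
    (re.compile(r"G\d{5}"), "GLYCAN"),
    (re.compile(r"R\d{5}"), "REACTION"),
    (re.compile(r"RC\d{5}"), "RCLASS"),
    (re.compile(r"D\d{5}"), "DRUG"),
    (re.compile(r"DG\d{5}"), "DGROUP"),
    (re.compile(r"H\d{5}"), "DISEASE"),
    (re.compile(r"M\d{5}"), "MODULE"),
    (re.compile(r"map\d{5}"), "PATHWAY"),
]


def classify(identifier: str) -> Optional[str]:
    """Guess the KEGG database for an identifier, or None if unrecognized."""
    for pattern, database in _PATTERNS:
        # fullmatch requires the whole string to match, so "RC00004"
        # is not mistaken for an "R" or "C" entry.
        if pattern.fullmatch(identifier):
            return database
    return None


if __name__ == "__main__":
    for ident in ("K00001", "RC00004", "map00010", "not-an-id"):
        print(ident, "->", classify(ident))
```

Organism-specific gene identifiers and EC numbers follow other conventions and are deliberately left out of this sketch.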