2009 | Aron Marchler-Bauer*, John B. Anderson, Farideh Chitsaz, Myra K. Derbyshire, Carol DeWeese-Scott, Jessica H. Fong, Lewis Y. Geer, Renata C. Geer, Noreen R. Gonzales, Marc Gwadz, Siqian He, David I. Hurwitz, John D. Jackson, Zhaoxi Ke, Christopher J. Lanczycki, Cynthia A. Liebert, Chunlei Liu, Fu Lu, Shennan Lu, Gabriele H. Marchler, Mikhail Mullokandov, James S. Song, Asba Tasneem, Narmada Thanki, Roxanne A. Yamashita, Dachuan Zhang and Stephen H. Bryant
The Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived search models that represent protein domains conserved through molecular evolution. It is part of NCBI's Entrez system and can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. It allows retrieval of precalculated domain annotations for protein sequences in NCBI's Entrez system and querying of novel protein sequences via CD-Search at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Starting with version 2.14, CDD summarizes redundant and homologous domain models at a superfamily level, and domain annotations are flagged as either 'specific' or 'non-specific'.
CDD integrates data from Pfam, SMART, COGs, and other sources to provide comprehensive coverage of protein databases. It attempts to reconcile protein sequence conservation with domain 3D structure and represents many domain families as structured hierarchies. CDD also annotates functional sites in protein domain families, allowing these annotations to be transferred computationally onto protein sequences. CDD provides web interfaces and software tools for interpreting domain annotations and classifying user sequences within existing NCBI-curated domain hierarchies.
Domain models imported from external sources are processed to fit into the CDD framework. The content of the models is determined by the providers, but sequence alignments are processed automatically upon import. The import process has been discussed in a previous manuscript. Recently, CDD introduced three major changes: clustering domain models into superfamilies, referring to superfamilies instead of best-scoring models for annotation, and labeling annotations as 'specific' or 'non-specific' based on match criteria.
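The superfamily-level reporting and the 'specific' versus 'non-specific' flag can be pictured with a small sketch. The match criteria are not spelled out here, so the per-model score cutoff below is an assumption made purely for illustration, as are the `Hit` record, the `annotate` function, and all model and superfamily names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    model: str         # hypothetical domain model name
    superfamily: str   # hypothetical superfamily cluster the model belongs to
    bit_score: float   # score of the query against this model
    threshold: float   # ASSUMED per-model cutoff for calling a match 'specific'


def annotate(hits):
    """Collapse hits to one annotation per superfamily and label each one.

    Returns a list of (superfamily, best_model, label) tuples sorted by
    superfamily name.
    """
    best = {}
    for hit in hits:
        current = best.get(hit.superfamily)
        # Keep only the highest-scoring model within each superfamily.
        if current is None or hit.bit_score > current.bit_score:
            best[hit.superfamily] = hit

    annotations = []
    for superfamily in sorted(best):
        hit = best[superfamily]
        # ASSUMPTION: a match is 'specific' when it reaches the model's cutoff.
        label = "specific" if hit.bit_score >= hit.threshold else "non-specific"
        annotations.append((superfamily, hit.model, label))
    return annotations


hits = [
    Hit("model_a", "sf_1", 120.0, 100.0),
    Hit("model_b", "sf_1", 90.0, 80.0),
    Hit("model_c", "sf_2", 45.0, 60.0),
]
result = annotate(hits)
```

In this toy input the two redundant hits in `sf_1` collapse into a single annotation, which is the point of reporting at the superfamily level rather than listing every matching model.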
CDD curators have recorded functionally conserved sites on protein domain models where evidence is available. Functional site annotation is now available for proteins in NCBI's Entrez system and is visualized via CD-Search. CDD also provides a new version of CDTree/Cn3D, which enables users to examine NCBI-curated domain hierarchies in detail.
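One way to picture how a functional site recorded on a domain model reaches an individual protein is to map annotated model positions through the model-to-query alignment. The sketch below assumes the alignment is available as a list of (model position, query position) pairs; that representation, the function name `transfer_sites`, and the example coordinates are hypothetical and not taken from CDD.

```python
def transfer_sites(site_positions, aligned_pairs):
    """Map functional-site positions on a domain model onto a query sequence.

    site_positions: positions on the model annotated as functional sites.
    aligned_pairs:  (model_pos, query_pos) tuples for aligned residues
                    (hypothetical representation of the alignment).
    Returns {model_pos: query_pos} for sites covered by the alignment;
    sites falling outside the aligned region are left out.
    """
    model_to_query = dict(aligned_pairs)
    return {
        pos: model_to_query[pos]
        for pos in site_positions
        if pos in model_to_query
    }


aligned = [(1, 10), (2, 11), (3, 12), (7, 17), (8, 18)]
mapped = transfer_sites([3, 7, 12], aligned)
```

Here model position 12 is not covered by the alignment, so no site is placed on the query for it.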
CDD is a database in NCBI's Entrez system and can be searched by keyword. It is linked to other resources in Entrez, and explicit 'Conserved Domains' links are available for most protein sequences. Precomputed domain annotations are updated several times a day as the protein sequence database grows. CDD can also be searched with a protein query sequence through CD-Search, which uses the RPS-BLAST algorithm to compare query sequences against position-specific scoring matrices derived from the model collection in CDD. CDD is updated several times a year, with the current version, v2.14, containing 26660 models.
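To illustrate what comparing a query against a position-specific scoring matrix (PSSM) means, the sketch below slides a toy matrix along a sequence and reports the best ungapped window. This is not RPS-BLAST, which is a heuristic search that also handles gaps and statistical significance; the matrix values, the `DEFAULT_SCORE` penalty, and the function name are invented for the example.

```python
DEFAULT_SCORE = -2  # ASSUMED penalty for residues not listed in a column

# Toy PSSM: one dict per model column, mapping residue -> score.
toy_pssm = [
    {"G": 5, "A": 1},
    {"K": 4, "R": 2},
    {"S": 3, "T": 3},
]


def best_ungapped_score(query, pssm):
    """Return (best_score, start) over all ungapped placements of the PSSM."""
    width = len(pssm)
    if len(query) < width:
        raise ValueError("query is shorter than the PSSM")
    best_score, best_start = None, None
    for start in range(len(query) - width + 1):
        # Sum the column-specific score of each residue in this window.
        score = sum(
            pssm[i].get(query[start + i], DEFAULT_SCORE) for i in range(width)
        )
        if best_score is None or score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


score, start = best_ungapped_score("MAGKSL", toy_pssm)
```

Unlike a single substitution matrix, each column carries its own scores, so the same residue can be rewarded at one position and penalized at another.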