CDD/SPARCLE: functional classification of proteins via subfamily domain architectures

2017, Vol. 45, Database issue | Aron Marchler-Bauer*, Yu Bo, Lianyi Han, Jane He, Christopher J. Lanczycki, Shennan Lu, Farideh Chitsaz, Myra K. Derbyshire, Renata C. Geer, Noreen R. Gonzales, Marc Gwadz, David I. Hurwitz, Fu Lu, Gabriele H. Marchler, James S. Song, Narmada Thanki, Zhouxi Wang, Roxanne A. Yamashita, Dachuan Zhang, Chanjuan Zheng, Lewis Y. Geer and Stephen H. Bryant
The article describes the Conserved Domain Database (CDD), a resource for annotating biomolecular sequences with the footprints of evolutionarily conserved protein domains and with functional sites. CDD maintains an archive of pre-computed domain annotations and offers live search services. It curates a comprehensive collection of protein domain and family models, including models from external providers and families curated in-house, organized into hierarchical classifications.
CDD supports comparative analyses of protein families based on conserved domain architectures. Recent work has focused on functionally characterizing distinct subfamily architectures with SPARCLE (Subfamily Protein Architecture Labeling Engine). The current version described in the article, v3.15, contains 48,963 protein and domain models; v3.16, scheduled for release in late 2016, contains 50,369 models. CDD is integrated into NCBI's Entrez system and cross-linked with other databases. It annotates approximately 250 million sequences and 96% of structure-derived protein sequences longer than 30 residues. The article also covers the availability and data sharing of CDD services, including the RPS-BLAST program and the CDART service, the collaboration with InterPro to integrate domain signatures, and the development of SPARCLE for labeling subfamily domain architectures.