2008, Vol. 36, Database issue | Antonina Andreeva, Dave Howorth, John-Marc Chandonia, Steven E. Brenner, Tim J. P. Hubbard, Cyrus Chothia and Alexey G. Murzin
The Structural Classification of Proteins (SCOP) database, which categorizes proteins based on their evolutionary and structural relationships, has undergone significant updates to accommodate the rapid growth of new structural data. The SCOP hierarchy includes levels such as Species, Protein, Family, Superfamily, Fold, and Class. To address the increasing volume of data, the SCOP team has introduced a new update protocol that supports batch classification of new protein structures at the Family and Superfamily levels, rather than by release date. This protocol includes a pre-classification step using sequence clustering and database searches, followed by manual curation. A preview version of SCOP, called pre-SCOP, provides early access to new relationships. The impact of worldwide Structural Genomics initiatives on the discovery and growth of protein families and superfamilies is also discussed, highlighting the contribution of these initiatives to the classification of new structures. The article details the changes in SCOP domain boundary definitions, integrated taxonomy, curated relationships to sequence databases, and future developments, including the redefinition of the Fold level to enable more comprehensive classifications. The growth of structurally characterized protein families has facilitated the discovery of new protein relationships, particularly at the Superfamily level, and has provided valuable insights into the structural and functional diversity of proteins.
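The batch idea behind the update protocol can be illustrated with a minimal sketch. It is not the SCOP team's implementation: the `NewStructure` record, its field names, the `batch_by_relationship` function, and the sample identifiers are all hypothetical. It assumes a pre-classification step has already attached a tentative Superfamily and Family label to each new structure where one could be detected. Structures are then grouped by those labels instead of being queued by release date, and anything without a detected relationship is set aside for manual curation.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class NewStructure:
    """Hypothetical record for a newly released structure."""
    pdb_id: str
    released: date
    superfamily: Optional[str]  # tentative label from pre-classification, if any
    family: Optional[str]       # tentative label from pre-classification, if any


def batch_by_relationship(entries):
    """Group entries by (superfamily, family) rather than by release date.

    Returns the batches plus a list of entries with no detected
    superfamily, which would go to manual curation.
    """
    batches = defaultdict(list)
    unassigned = []
    for entry in entries:
        if entry.superfamily is None:
            unassigned.append(entry)
        else:
            batches[(entry.superfamily, entry.family)].append(entry)
    # Inside a batch, release order is kept only as a stable tie-breaker.
    for members in batches.values():
        members.sort(key=lambda e: e.released)
    return dict(batches), unassigned


# Illustrative data only: the identifiers and labels below are made up.
entries = [
    NewStructure("1aaa", date(2007, 1, 10), "c.37.1", "c.37.1.8"),
    NewStructure("2bbb", date(2007, 3, 2), None, None),
    NewStructure("3ccc", date(2007, 2, 5), "c.37.1", "c.37.1.8"),
    NewStructure("4ddd", date(2007, 1, 20), "a.4.5", None),
]
batches, unassigned = batch_by_relationship(entries)
```

In this sketch the two structures sharing a tentative Family end up in one batch even though another structure was released between them, which is the contrast with release-date ordering that the summary describes.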