[slides] The Universal Protein Resource (UniProt): an expanding universe of protein information

The Universal Protein Resource (UniProt) is a central resource for protein sequences and functional annotation. It consists of three main components: the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). UniProtKB contains manually and automatically annotated protein sequences with extensive cross-references and functional annotation. UniRef compresses sequence space by merging highly similar sequences into clusters, which facilitates similarity searches. UniParc stores all publicly available protein sequences, providing comprehensive coverage and a historical record. The UniProt databases continue to grow, and new features include downloads of major releases, taxonomic divisions, and complete proteome sets. A bibliography mapping service links entries to both curated and computationally mapped references. Recent changes cover database contents, formats, and controlled vocabularies: the UniProtKB format has been revised to improve readability and make proteins and species easier to identify, and feature keys have been added and redefined to enhance annotation. In terms of naming, "UniProt" now refers to the resource as a whole, "UniProtKB" to the knowledgebase, and "UniProt Consortium" to the organization. Upcoming developments include preservation of an annotation archive, an ID mapping service that converts between common gene and protein IDs and UniProt identifiers, and caBIG grid enablement for data sharing. UniProt also promotes interaction with the scientific community and access to related data through external links to molecular databases and resources.
UniProt databases are accessible online and via FTP, with ongoing support from various funding sources.
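
The idea of compressing sequence space by merging sequences can be illustrated with a toy sketch. The Python example below is a simplification, not UniRef's actual procedure: it merges only exactly identical sequences, whereas the summary describes merging by high similarity. The function name `cluster_identical` and the entry identifiers and sequences are hypothetical, made up for this illustration.

```python
from collections import defaultdict


def cluster_identical(records):
    """Group (identifier, sequence) pairs so identical sequences share a cluster.

    Toy illustration only: this is an assumption-laden sketch and does not
    reproduce how UniRef is built, which relies on sequence similarity
    rather than exact matches alone.
    """
    clusters = defaultdict(list)
    for identifier, sequence in records:
        # Normalise case and surrounding whitespace before comparing.
        clusters[sequence.strip().upper()].append(identifier)
    # The first identifier seen for a sequence is used as the representative.
    return {
        ids[0]: {"sequence": seq, "members": ids}
        for seq, ids in clusters.items()
    }


# Hypothetical identifiers and sequences, invented for this example.
records = [
    ("entry_A", "MKTAYIAKQR"),
    ("entry_B", "MKTAYIAKQR"),
    ("entry_C", "MSLNNQWE"),
]
clusters = cluster_identical(records)
print(len(records), "sequences ->", len(clusters), "clusters")
```

A similarity search would then need to scan only one representative per cluster instead of every member, which is the benefit the summary attributes to UniRef.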

The Universal Protein Resource (UniProt): an expanding universe of protein information

2006 | Cathy H. Wu, Rolf Apweiler, Amos Bairoch, Darren A. Natale, Winona C. Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria J. Martin, Raja Mazumder, Claire O'Donovan, Nicole Redaschi and Baris Suzek