Activities at the Universal Protein Resource (UniProt)

2013 | The UniProt Consortium
The Universal Protein Resource (UniProt) is a comprehensive, high-quality, and freely accessible database of protein sequences and functional annotations. It integrates and standardizes data from the literature and from other resources to provide the most comprehensive catalog of protein information. The central activities of UniProt are the biocuration of the UniProt Knowledgebase (UniProtKB) and the dissemination of these data through its website and web services. UniProt is produced by the UniProt Consortium, which comprises groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics, and the Protein Information Resource (PIR). It is updated every 4 weeks and can be searched online or downloaded. Its mission is to facilitate scientific discovery by organizing biological knowledge and enabling researchers to rapidly comprehend complex areas of biology.

The four UniProt databases are optimized for different uses. UniProtKB consists of two sections: a reviewed section of manually annotated records (UniProtKB/Swiss-Prot) and an unreviewed section of automatically annotated records (UniProtKB/TrEMBL). The UniProt Archive (UniParc) is a comprehensive sequence repository. The UniProt Reference Clusters (UniRef) merge closely related sequences to speed up sequence similarity searches. The UniProt Metagenomic and Environmental Sequences database (UniMES) covers metagenomic data.

UniProt's biocuration combines manual and automatic annotation of protein sequences. UniProt leads in providing full and comprehensive curation of experimental data from the literature. This curation is not only added to UniProtKB but is also reused by other resources: UniProt's protein nomenclature is used in NCBI's Reference Sequence collection and in the INSDC submission guidelines. Literature-based expert curation provides high-quality information for experimentally characterized proteins in a standardized way.
UniProt focuses on annotating experimental data from the literature for reference proteome records. For automatic annotation, UniProt has developed two complementary systems: UniRule and the Statistical Automatic Annotation System (SAAS). Both systems pair each annotation with the conditions under which it applies. The InterPro hierarchy is used for protein classification, and two of the UniProt partners create signatures that are integrated into InterPro for use by the UniRule system.

The Gene Ontology (GO) annotation project has grown with the inclusion of five new sources of manual annotation. UniProt has started producing all of its species-specific and multispecies annotation files in the new GO Consortium Gene Product Association Data (GPAD) and Gene Product Information (GPI) formats. This is of particular benefit to users of the UniProt multispecies annotation file.

Developments in UniRef include improved intra-cluster coherency through the introduction of an 80% overlap threshold for UniRef90 and UniRef50 clusters. This reduces redundancy and improves computation time. A new calendar of full and incremental updates was also adopted.

UniProt continues to grow rapidly, with release 2013_10 containing 45,288,084
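
The 80% overlap threshold for UniRef90 and UniRef50 can be illustrated with a minimal sketch. The summary above does not say how overlap is measured, so the code assumes it is the aligned length divided by the length of the cluster's longest (seed) sequence. The function names and member labels are hypothetical, and the sequence-identity criterion that the clustering applies separately is not modeled.

```python
from typing import Dict, List


def meets_overlap(aligned_length: int, seed_length: int,
                  threshold_percent: int = 80) -> bool:
    """Return True if the aligned region covers at least threshold_percent
    of the seed sequence.

    Assumption: overlap = aligned_length / seed_length, where the seed is
    the longest sequence in the cluster. Integer arithmetic avoids
    floating-point surprises exactly at the threshold.
    """
    if seed_length <= 0:
        raise ValueError("seed_length must be positive")
    if not 0 <= aligned_length <= seed_length:
        raise ValueError("aligned_length must lie between 0 and seed_length")
    return aligned_length * 100 >= threshold_percent * seed_length


def filter_by_overlap(seed_length: int, aligned_lengths: Dict[str, int],
                      threshold_percent: int = 80) -> List[str]:
    """Keep the candidate members whose alignment to the seed passes the
    overlap threshold, preserving input order."""
    return [
        name
        for name, aligned in aligned_lengths.items()
        if meets_overlap(aligned, seed_length, threshold_percent)
    ]


if __name__ == "__main__":
    # Hypothetical candidates aligned against a 300-residue seed sequence.
    candidates = {"member_a": 300, "member_b": 240, "member_c": 239}
    print(filter_by_overlap(300, candidates))
```

Under this assumption, a short fragment that matches only a small part of the seed is excluded from the cluster even if the matching region is highly similar, which is one way such a threshold could improve intra-cluster coherency.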