UniProt: a hub for protein information

2015 | The UniProt Consortium
UniProt is a comprehensive database of protein sequences and their annotations. It has grown substantially, reaching 80 million sequences in the past year. To accommodate this growth, the accession number format has been extended from 6 to 10 characters. A new proteome identifier uniquely identifies a particular assembly of a species, which helps in tracking the provenance of sequences. The website has been redesigned following a user-centered approach, improving navigation, usability, and the visibility of annotation data. Annotation scores have been introduced to represent the level of knowledge about each protein, helping users identify the best-characterized proteins for comparative analysis. All data are freely available online.

UniProt has two main sections: UniProtKB/Swiss-Prot, which contains manually curated (reviewed) entries, and UniProtKB/TrEMBL, which contains unreviewed sequences. It also provides non-redundant sequence sets (UniRef100, UniRef90, UniRef50) and the UniParc archive, which contains all known sequences. UniProt cross-references over 150 databases and serves as a central hub for protein information.

The growth in proteomes and sequence data has been driven by the increased submission of complete genomes, particularly bacterial ones. To manage this growth, UniProt has introduced reference proteomes, which are selected on the basis of annotation quality and cover a broad range of species. These proteomes are annotated both manually and automatically to provide the best-characterized protein sets.

Manual curation is a core activity of UniProt and provides high-quality annotations for experimentally characterized proteins. In 2013, over 8400 papers were curated, resulting in over 3300 new entries. Curation focuses on enzymes, particularly orphan enzymes, which are enzyme activities that lack an associated amino acid sequence.
Curating these enzymes is essential for the discovery of new enzymes and for understanding enzyme function.

Automatic annotation is supported by two rule-based systems, UniRule and SAAS. Both use the InterPro classification to annotate proteins and functional domains. The UniRule pipeline relies on manual curation to validate its rules and so to keep annotations accurate.

The redesigned UniProt website offers improved navigation and search, customizable search results, improved entry views, more visible annotation data, and new pages for proteome data.

UniProt is widely used in the scientific community, and its publications are cited in numerous journals. The database is an essential resource for protein sequence analysis, functional annotation, and comparative studies. UniProt continues to develop new features and improve its website to better serve the research community.
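
The move from 6- to 10-character accession numbers described above can be illustrated with a short format check. The Python sketch below uses the accession pattern given in the UniProt help pages, as far as I am aware; the function name accession_kind and its return labels are illustrative assumptions and are not part of any UniProt tool. It checks format only and does not confirm that an accession exists in the database.

```python
import re

# Accession pattern as documented in the UniProt help pages.
# The first alternative covers the original 6-character form starting with O, P or Q;
# the second covers 6-character (one group) and 10-character (two groups) forms.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def accession_kind(accession: str) -> str:
    """Classify a string as a 'six' or 'ten' character accession, or 'invalid'.

    Hypothetical helper for illustration; the labels are our own.
    """
    if not ACCESSION_RE.fullmatch(accession):
        return "invalid"
    return "six" if len(accession) == 6 else "ten"


if __name__ == "__main__":
    for candidate in ["P12345", "A0A024R161", "p12345", "12345"]:
        print(candidate, accession_kind(candidate))
```

Using fullmatch rather than match keeps a valid 6-character prefix from being accepted when trailing characters make the whole string invalid.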