UniProt: the universal protein knowledgebase

2017, Vol. 45, Database issue | The UniProt Consortium
The UniProt knowledgebase is a comprehensive resource for protein sequences and detailed annotations, containing over 60 million sequences. Since its last update in 2014, the database has more than doubled the number of reference proteomes to 5,631, enhancing taxonomic coverage. To address redundancy, a pipeline was implemented to remove highly similar proteomes, reducing the number of sequences by 47 million. UniProt now offers pan proteomes, which provide a representative set of sequences from highly related organisms, and reference proteomes that are manually curated and serve as a basis for both manual and automatic annotation.
The database includes over 550,000 curated entries in UniProtKB/Swiss-Prot and 60 million uncurated entries in UniProtKB/TrEMBL. Expert curation remains a cornerstone, focusing on experimental data and post-translational modifications (PTMs). Automatic annotation systems, such as UniRule and SAAS, are used to annotate unreviewed sequences with high accuracy. UniProt also provides a SPARQL endpoint for complex queries and an enhanced website with features like the ProtVista feature viewer and a peptide search tool. The database is continuously updated and accessible via its website and FTP sites.
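As a rough illustration of the kind of query the SPARQL endpoint supports, the sketch below builds a request that lists a few reviewed (Swiss-Prot) entries for one organism. The endpoint URL, the `up:` and `taxon:` vocabulary terms, and the helper names `build_query`, `build_request`, and `run_query` are assumptions made for this example and are not taken from the text above; check them against the current UniProt documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed public endpoint URL; not stated in the summary above.
SPARQL_ENDPOINT = "https://sparql.uniprot.org/sparql"


def build_query(taxon_id: int, limit: int = 5) -> str:
    """Return a SPARQL query for reviewed entries of one organism.

    The up: and taxon: prefixes are assumed from the UniProt RDF vocabulary.
    """
    return f"""
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?protein
WHERE {{
  ?protein a up:Protein ;
           up:reviewed true ;
           up:organism taxon:{taxon_id} .
}}
LIMIT {limit}
""".strip()


def build_request(query: str) -> Request:
    """Wrap the query in an HTTP GET request that asks for JSON results."""
    url = SPARQL_ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(query: str, timeout: float = 30.0) -> list[str]:
    """Send the query (requires network access) and return the protein IRIs."""
    with urlopen(build_request(query), timeout=timeout) as response:
        payload = json.load(response)
    return [row["protein"]["value"] for row in payload["results"]["bindings"]]
```

Calling `run_query(build_query(9606))` would, under these assumptions, return the IRIs of a handful of reviewed human entries. Keeping query construction separate from the network call makes the query text easy to inspect or paste into the endpoint's web form.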