UniProt: the universal protein knowledgebase in 2021

The UniProt Knowledgebase (UniProtKB) aims to provide a comprehensive, high-quality, and freely accessible set of protein sequences annotated with functional information. Over the past two years, the number of sequences has grown to approximately 190 million, despite continued efforts to reduce redundancy. New methods for assessing proteome completeness and quality have been adopted, and the UniProt Proteome portal now provides detailed information on proteomes, including BUSCO and CPD scores. Detailed annotations from the literature are added to both reviewed and unreviewed entries, and the Association-Rule-Based Annotator (ARBA) has been implemented to enhance automated annotation. A credit-based publication submission interface, available through the 'Community submission' page, allows researchers to contribute relevant publications and annotations. UniProt responded to the COVID-19 pandemic by rapidly curating relevant entries and making them available through a dedicated portal. UniProt also integrates large-scale datasets, such as clinical variation sources and mass spectrometry proteomics data, and offers several options for computational access. The UniProt website is being redesigned to improve the user experience and enhance programmatic access. The database continues to evolve to meet new challenges, focusing on capturing all available protein sequence data and curating functional data from the scientific literature. UniProt resources are available under a CC BY 4.0 license.

UniProt: the universal protein knowledgebase in 2021

2021 | The UniProt Consortium