Understanding The Universal Protein Resource (UniProt)

The Universal Protein Resource (UniProt) is a comprehensive, centralized database of protein sequences and functional information that combines the Swiss-Prot, TrEMBL, and PIR protein databases. It consists of three main databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt), and the UniProt Reference (UniRef) databases. UniParc stores all publicly available protein sequence data non-redundantly and cross-references each sequence to its sources. The Knowledgebase provides detailed protein entries, both manually curated (UniProt/Swiss-Prot) and computationally analyzed (UniProt/TrEMBL), with a focus on accurate and consistent annotation and ongoing improvements in annotation quality and standardization. The UniRef databases offer non-redundant sequence collections for efficient sequence similarity searches.

UniProt includes extensive cross-references to other databases, such as nucleotide sequence, structural, and disease databases. It also supports evidence attribution, allowing users to trace the source of each annotation. New features include the TOXIC DOSE comment line for storing toxicity information and enhanced documentation for strains and synonyms.

UniProt provides tools for sequence searches, including searches against the UniRef non-redundant databases, and offers access to its data through online portals and FTP. It accepts submissions of new sequences and annotations and is updated with a new release every two weeks. UniProt serves as a central resource for protein sequence and function, integrating reliable automated annotation with expert manual curation to provide consistent, non-redundant, and comprehensive protein information.
It is supported by various funding sources, including the National Institutes of Health and the European Bioinformatics Institute.
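
The non-redundancy and cross-referencing described for UniParc can be illustrated with a minimal Python sketch. It is not UniProt's actual data model or API: the class name SequenceArchive, its methods, the example sequences, and the accession strings are all hypothetical, chosen only to show how identical sequences from different source databases can collapse into a single record that keeps every source reference.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def _normalize(sequence: str) -> str:
    """Strip whitespace and upper-case so identical sequences compare equal."""
    return "".join(sequence.split()).upper()


class SequenceArchive:
    """Toy non-redundant archive (hypothetical): one record per distinct sequence."""

    def __init__(self) -> None:
        # Maps a normalized sequence to the set of
        # (source database, source accession) pairs that reported it.
        self._xrefs: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add(self, sequence: str, source_db: str, accession: str) -> None:
        """Record a sequence; a repeat only adds a new cross-reference."""
        self._xrefs[_normalize(sequence)].add((source_db, accession))

    def sources(self, sequence: str) -> List[Tuple[str, str]]:
        """Return the sorted cross-references for a sequence (empty if unknown)."""
        return sorted(self._xrefs.get(_normalize(sequence), set()))

    def __len__(self) -> int:
        """Number of distinct sequences stored."""
        return len(self._xrefs)


if __name__ == "__main__":
    archive = SequenceArchive()
    # Accessions below are made up for illustration.
    archive.add("MKTAYIAK", "Swiss-Prot", "ACC_0001")
    archive.add("mktayiak", "TrEMBL", "ACC_0002")  # same sequence, other source
    archive.add("MSSHEGGK", "PIR", "ACC_0003")
    print(len(archive))
    print(archive.sources("MKTAYIAK"))
```

Keying the store on the sequence itself, rather than on a source accession, is what makes the collection non-redundant while still letting a user trace each sequence back to every database that supplied it.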

The Universal Protein Resource (UniProt)

2005 | Amos Bairoch, Rolf Apweiler, Cathy H. Wu, Winona C. Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria J. Martin, Darren A. Natale, Claire O'Donovan, Nicole Redaschi and Lai-Su L. Yeh