Proteoform: a single term describing protein complexity

2013 March | Lloyd M Smith, Neil L Kelleher, and The Consortium for Top Down Proteomics
The Human Genome Project revealed that the number of genes is lower than expected, leading to the realization that protein variation, not gene number, drives biological complexity. Protein diversity arises from allelic variation, alternative splicing, and post-translational modifications.

Two-dimensional gel electrophoresis first revealed this complexity, but newer proteomic technologies, particularly mass spectrometry, offer more precise data. Two approaches exist: 'bottom-up' and 'top-down' proteomics. Bottom-up proteomics analyzes peptides produced by digesting proteins, whereas the top-down approach identifies intact proteins directly, which provides richer data but is more complex to execute.

Current terminology for protein variation, such as 'protein forms', 'isoforms', 'species', and 'variants', is inadequate. 'Isoform' is widely used but refers only to genetic differences, not protein-level variations. 'Protein species' is also problematic, as it does not distinguish between proteins from different genes.

The term 'proteoform' is proposed to describe all molecular forms of the protein product of a single gene, including those arising from genetic variation, alternative splicing, and post-translational modifications. It encompasses all post-translational modifications except those classified as reagent-derivatized or isotope-labeled residues. It is compatible with a gene-centric approach and avoids ambiguity. The term is intuitive and aesthetically pleasing, and it has been adopted by UniProt, the Protein Ontology, and the wider community. It improves readability and comprehension in proteomics publications. The authors recommend that 'proteoform' replace the existing terms for clarity and precision.