13 September 2011 | George A. Khoury, Richard C. Baliban & Christodoulos A. Floudas
This study presents a comprehensive analysis of post-translational modifications (PTMs) across the Swiss-Prot database, focusing on their frequency and curation. The researchers analyzed high-quality, manually curated proteome-wide data to determine the relative abundance of each PTM. They found that fewer than one-fifth of proteins are glycosylated, challenging the previously held belief that more than half of proteins are glycosylated. Phosphorylation accounted for the largest number of experimentally observed PTMs, while N-linked glycosylation accounted for the largest number of putative PTMs. The study also highlights the importance of PTMs in systems biology, proteomics, protein design, and the origins of life, and discusses what PTM statistics imply for protein design, systems biology, and the understanding of the combinatorial histone code.
The researchers developed a resource (http://selene.princeton.edu/PTMCuration) that provides updated PTM statistics for use by the academic community. They describe the challenges of curating PTM data, including the need for manual curation and the potential for errors in the database, and emphasize that open access to proteome-wide results is important for a full understanding of cellular function. The methodology for generating and curating the statistics is given in detail as a workflow: accessing the Swiss-Prot database, preprocessing PTM IDs, populating experimental and putative statistics, sorting and categorizing the results, and manually checking them. The researchers conclude that the resource gives the academic community a valuable means of assessing the frequency and distribution of PTMs in the proteome.
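The counting steps of the workflow summarized above (reading Swiss-Prot entries, separating experimental from putative annotations, and sorting the tallies) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes the legacy Swiss-Prot flat-file layout, in which PTMs appear on `FT` feature lines under the keys `MOD_RES`, `CARBOHYD`, `LIPID`, and `CROSSLNK`, and in which non-experimental annotations carry the qualifiers "Potential", "Probable", or "By similarity". The function name `tally_ptms` and the sample records are hypothetical and were invented for illustration only; newer UniProt releases encode evidence differently, and the manual checking step is not something code can replace.

```python
from collections import Counter

# Feature keys assumed to carry PTM annotations in a Swiss-Prot flat file.
PTM_KEYS = {"MOD_RES", "CARBOHYD", "LIPID", "CROSSLNK"}
# Qualifiers assumed to mark a non-experimental (putative) annotation.
PUTATIVE_MARKERS = ("potential", "probable", "by similarity")


def tally_ptms(lines):
    """Count PTM annotations, split into experimental and putative tallies."""
    experimental, putative = Counter(), Counter()
    for line in lines:
        if not line.startswith("FT"):
            continue
        # Expected fields: "FT", feature key, start, end, description.
        fields = line.split(None, 4)
        if len(fields) < 5 or fields[1] not in PTM_KEYS:
            continue
        description = fields[4].strip()
        # Keep only the modification name, dropping qualifiers and notes.
        name = description.split("(")[0].split(";")[0].strip(" .")
        is_putative = any(m in description.lower() for m in PUTATIVE_MARKERS)
        (putative if is_putative else experimental)[name] += 1
    return experimental, putative


# Invented sample records, for illustration only (not real Swiss-Prot data).
SAMPLE = """\
FT   MOD_RES      15     15       Phosphoserine.
FT   MOD_RES      42     42       Phosphoserine (By similarity).
FT   CARBOHYD     77     77       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    101    101       N-linked (GlcNAc...) (Potential).
FT   CHAIN         1    250       Hypothetical protein.
"""

experimental, putative = tally_ptms(SAMPLE.splitlines())
# most_common() sorts each tally from most to least frequent.
print("experimental:", experimental.most_common())
print("putative:", putative.most_common())
```

Keeping two separate counters mirrors the experimental/putative split in the summary: a modification such as phosphoserine can appear in both tallies, and each list can be ranked independently before any manual review.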