13 September 2011 | George A. Khoury, Richard C. Baliban & Christodoulos A. Floudas
This study presents a comprehensive analysis of post-translational modifications (PTMs) in the Swiss-Prot database, aiming to quantify and curate the frequency of various PTMs. The authors report that among 530,264 proteins, only 87,308 have experimentally identified PTMs and 234,938 have putative PTMs, significantly fewer than previously estimated. They find that fewer than one-fifth of proteins are glycosylated, challenging the widely held belief that more than half of proteins are glycoproteins. Phosphorylation is the most common PTM, with 139,582 instances across the 530,264 proteins. The study also highlights the importance of D-alanine isomers, which are among the top 15 experimentally found PTMs.
The authors provide a continuously updated resource (http://selene.princeton.edu/PTMCuration) to facilitate further research in systems biology, proteomics, and protein design. The method used involves accessing the latest Swiss-Prot database, preprocessing PTM IDs, populating experimental and putative statistics, sorting and categorizing results, manually checking and updating problematic IDs, and sending the data to a web interface. This resource aims to enhance the understanding of PTMs and their roles in cellular function.
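The counting and sorting steps of the method described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes the older Swiss-Prot flat-file layout, in which PTM annotations appear on `FT` lines under feature keys such as `MOD_RES`, `CARBOHYD`, `LIPID`, and `CROSSLNK`, and it assumes that an annotation counts as putative when its description carries one of the qualifiers "Potential", "Probable", or "By similarity". The function name `tally_ptms`, the chosen set of feature keys, and the sample entry `EXAMPLE_HUMAN` are hypothetical and serve only as illustration.

```python
import re
from collections import Counter

# Assumption: these feature keys mark PTM annotations in the flat file.
PTM_FEATURE_KEYS = {"MOD_RES", "CARBOHYD", "LIPID", "CROSSLNK"}

# Assumption: these qualifiers flag a non-experimental (putative) annotation.
NON_EXPERIMENTAL = re.compile(
    r"\((Potential|Probable|By similarity)\)", re.IGNORECASE
)


def tally_ptms(flat_file_text):
    """Count PTM annotations, split into experimental and putative."""
    experimental, putative = Counter(), Counter()
    for line in flat_file_text.splitlines():
        # Expected fields: line code, feature key, start, end, description.
        fields = line.split(None, 4)
        if len(fields) < 5 or fields[0] != "FT":
            continue
        if fields[1] not in PTM_FEATURE_KEYS:
            continue
        description = fields[4].rstrip(".")
        is_putative = bool(NON_EXPERIMENTAL.search(description))
        # Drop the qualifier and any trailing note such as "; by PKA".
        name = NON_EXPERIMENTAL.sub("", description).split(";")[0].strip()
        (putative if is_putative else experimental)[name] += 1
    return experimental, putative


# Hypothetical entry fragment, invented for illustration only.
SAMPLE = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
FT   MOD_RES      15     15       Phosphoserine.
FT   MOD_RES      22     22       Phosphoserine (By similarity).
FT   MOD_RES      30     30       Phosphothreonine; by PKA.
FT   CARBOHYD     42     42       N-linked (GlcNAc...) (Potential).
FT   DOMAIN       50     90       Protein kinase.
"""

experimental, putative = tally_ptms(SAMPLE)

# Sort each category by frequency, most common first.
for label, counts in (("experimental", experimental), ("putative", putative)):
    for name, count in counts.most_common():
        print(label, name, count)
```

Keeping two separate counters mirrors the distinction the summary draws between experimentally identified and putative PTMs. A real run would read the full database file rather than an inline string, and would still need the manual check of problematic IDs that the method mentions.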