Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles

April 1999 | MATTEO PELLEGRINI*, EDWARD M. MARCOTTE*, MICHAEL J. THOMPSON, DAVID EISENBERG, AND TODD O. YEATES†
This paper presents a method for assigning protein functions by comparative genome analysis using phylogenetic profiles. The method rests on the assumption that proteins that function together in a pathway or structural complex evolve in a correlated manner, so their homologs tend to occur in the same subset of organisms. Recording the presence or absence of a protein across a set of sequenced genomes yields its phylogenetic profile: a string of bits, one per genome, in which each bit indicates whether the protein is present in that genome. Proteins with identical or similar profiles are likely to be functionally linked.

The method was tested on the proteins of Escherichia coli, whose profiles were built by comparison against other fully sequenced genomes. For example, proteins whose profiles match or nearly match that of the ribosomal protein RL7 or the flagellar structural protein FlgL tend to be functionally related to that protein. Proteins in the same metabolic pathway, such as histidine biosynthesis, also have similar profiles. The method was further validated on groups of proteins that share SwissProt keyword annotations or EcoCyc classes: proteins within these groups have more similar phylogenetic profiles than randomly chosen proteins, which supports the idea that functionally linked proteins tend to have similar profiles.

The method was also used to predict the function of uncharacterized proteins. When the neighbors of a characterized protein (the proteins with similar profiles) were examined, a significant proportion shared keywords with it, which suggests that an uncharacterized protein is likely to share the function of its characterized neighbors. The results indicate that phylogenetic profiles are a useful tool for identifying protein function and for understanding the structure and function of metabolic pathways and structural complexes.
As more genomes are sequenced, the method becomes more powerful: each profile gains bits, which allows more detailed comparisons between proteins.
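
The profile comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The genome names, protein names, and presence/absence data are hypothetical, the step that detects homologs in each genome is assumed to have been done already, and the `max_diff` cutoff used to define "neighbors" is an assumption chosen for illustration.

```python
# Hypothetical input: for each protein, the set of genomes in which a
# homolog was detected. All names and values are illustrative only.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D", "genome_E"]

HOMOLOG_HITS = {
    "P1": {"genome_A", "genome_C", "genome_D"},
    "P2": {"genome_A", "genome_C", "genome_D"},
    "P3": {"genome_A", "genome_C"},
    "P4": {"genome_B", "genome_E"},
}


def build_profile(hits, genomes):
    """Return a tuple of bits: 1 if a homolog is present in that genome, else 0."""
    return tuple(1 if g in hits else 0 for g in genomes)


def hamming(a, b):
    """Count the positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def neighbors(query, profiles, max_diff=1):
    """Return proteins whose profile differs from the query's by at most
    max_diff bits (max_diff is an assumed cutoff, not taken from the paper)."""
    q = profiles[query]
    return sorted(
        name
        for name, p in profiles.items()
        if name != query and hamming(q, p) <= max_diff
    )


# One profile per protein, with bits ordered as in GENOMES.
profiles = {name: build_profile(hits, GENOMES) for name, hits in HOMOLOG_HITS.items()}

if __name__ == "__main__":
    for name, bits in profiles.items():
        print(name, "".join(map(str, bits)), neighbors(name, profiles))
```

In this toy data set, `P1` and `P2` have identical profiles and `P3` differs from them by one bit, so all three would be treated as candidates for a functional link, while `P4` has no neighbors. With only five genomes many unrelated proteins would share a profile by chance, which is why longer profiles from more genomes make the comparison more informative.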