27 March 2024 | Gabriel Monteiro da Silva, Jennifer Y. Cui, David C. Dalgarno, George P. Lisi & Brenda M. Rubenstein
This paper presents an innovative approach for predicting the relative populations of protein conformations using AlphaFold 2 (AF2), an AI-powered method that has revolutionized biology by enabling accurate prediction of protein structures. While AF2 is designed to predict proteins' ground state conformations, this study demonstrates how AF2 can directly predict the relative populations of different protein conformations by subsampling multiple sequence alignments (MSAs). The method was tested on two proteins with drastically different amounts of available sequence data: Abl1 kinase and granulocyte-macrophage colony-stimulating factor (GMCSF). The results showed that AF2 predicted changes in their relative state populations with over 80% accuracy. The subsampling approach worked best for qualitatively predicting the effects of mutations or evolution on the conformational landscape and well-populated states of proteins. It offers a fast and cost-effective way to predict the relative populations of protein conformations at even single-point mutation resolution, making it useful for pharmacology, analysis of experimental results, and predicting evolution.
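As a rough illustration of the subsampling idea summarized above, the following Python sketch draws random subsets of an MSA, runs a structure predictor on each subset, and reports the fraction of predictions that fall into each conformational state. It is a minimal sketch under stated assumptions, not the authors' pipeline: `subsample_msa`, `estimate_populations`, `predict_state`, `max_seqs`, and `n_runs` are hypothetical names. `predict_state` stands in for an AF2 run followed by whatever rule assigns the predicted structure to a state.

```python
import random
from collections import Counter


def subsample_msa(msa, max_seqs, rng):
    """Keep the query (first row) plus a random subset of the other rows.

    `msa` is a list of aligned sequences; `max_seqs` (hypothetical name)
    is the total number of rows to keep, query included.
    """
    query, rest = msa[0], msa[1:]
    k = max(0, min(max_seqs - 1, len(rest)))
    return [query] + rng.sample(rest, k)


def relative_populations(labels):
    """Fraction of predictions assigned to each state label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def estimate_populations(msa, predict_state, n_runs=32, max_seqs=16, seed=0):
    """Predict a state from many independent MSA subsamples.

    `predict_state` is a user-supplied callable (an assumption of this
    sketch) that maps a subsampled MSA to a state label, for example by
    running a structure predictor and classifying its output.
    """
    rng = random.Random(seed)
    labels = [
        predict_state(subsample_msa(msa, max_seqs, rng))
        for _ in range(n_runs)
    ]
    return relative_populations(labels)


if __name__ == "__main__":
    # Toy alignment and toy classifier with no biological meaning;
    # they only show how the pieces fit together.
    toy_msa = ["MKTAYIAK"] + [f"MKTAYIA{c}" for c in "ACDEFGHILMNPQRSTVWY"]
    toy_predict = lambda sub: "state_A" if len(set(sub)) % 2 else "state_B"
    print(estimate_populations(toy_msa, toy_predict, n_runs=10, max_seqs=5))
```

In this framing, comparing the output for a wild-type alignment with the output for a mutant alignment would give the predicted direction of a population shift, which matches the qualitative use the summary describes.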
Proteins are essential biomolecules that carry out a wide range of functions in living organisms. Understanding their three-dimensional structures is critical for elucidating their functions and designing drugs that target them. Historically, experimental techniques such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy have been used to determine protein structures. However, these methods can be time-consuming, technically challenging, and expensive, and may not work for all proteins. To meet this challenge, ab initio structure prediction methods, which use computational algorithms to predict protein structures from their amino acid sequences, have been developed. For many years, ab initio structure prediction methods have relied on physics-based algorithms to predict stable protein structures. Although successful, these methods are challenged by larger and more complex proteins.
The recent development of machine learning algorithms has significantly improved the speed of protein structure prediction. One of the most remarkable achievements in this area is the AlphaFold 2 (AF2) engine developed by DeepMind, which uses a deep neural network to predict ground state protein structures from amino acid sequences. AlphaFold 2 was trained using large amounts of experimental data and incorporates co-evolutionary information from massive metagenomic databases. Its accuracy has revolutionized the field of protein structure prediction, opening up new possibilities for drug discovery and basic research with clear consequences for human health.
However, a series of studies have found that the default AF2 algorithm is limited in its capacity to predict alternative protein conformations and the effects of sequence variants. Although AF2's inability to predict multiple conformations is unsurprising given its initial scope, the capacity to make predictions of different conformations would be as revolutionary as the capacity to accurately predict ground states. Phenomena that involve different conformations such as fold-switching and order-disorder transitions are ubiquitous across the proteome and are directly tied to the activity of many macromolecules.