Highly accurate protein structure prediction for the human proteome

26 August 2021 | Kathryn Tunyasuvunakool, Jonas Adler, Zachary Wu, Tim Green, Michal Zielinski, Augustin Žídek, Alex Bridgland, Andrew Cowie, Clemens Meyer, Agata Laydon, Sameer Velankar, Gerard J. Kleywegt, Alex Bateman, Richard Evans, Alexander Pritzel, Michael Figurnov, Olaf Ronneberger, Russ Bates, Simon A. A. Kohl, Anna Potapenko, Andrew J. Ballard, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain, Ellen Clancy, David Reiman, Stig Petersen, Andrew W. Senior, Koray Kavukcuoglu, Ewan Birney, Pushmeet Kohli, John Jumper, Demis Hassabis
This study substantially expands the structural coverage of the human proteome by applying AlphaFold, a state-of-the-art machine learning method, to predict structures for 98.5% of human proteins. In the resulting dataset, 58% of residues have a confident prediction, including 36% of all residues predicted with very high confidence. The authors introduce several metrics, including pLDDT and pTM, to help interpret the dataset and to identify well-packed multi-domain predictions and regions likely to be disordered. Case studies on glucose-6-phosphatase, diacylglycerol O-acyltransferase 2, and wolframin illustrate how high-quality predictions can generate biological hypotheses. The predictions are freely available, and the authors anticipate that routine, large-scale, high-accuracy structure prediction will become an important tool for addressing new questions from a structural perspective.
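To make the residue-level coverage figures concrete, the following is a minimal sketch of how such fractions could be computed from per-residue pLDDT scores. It assumes the commonly cited cut-offs of pLDDT > 70 for "confident" and pLDDT > 90 for "very high confidence"; the function name `confidence_coverage` and the sample scores are hypothetical and are not taken from the study's data or code.

```python
from typing import Dict, Sequence

# Assumed cut-offs on the 0-100 pLDDT scale (not stated in the summary above).
CONFIDENT = 70.0
VERY_HIGH = 90.0


def confidence_coverage(plddt: Sequence[float]) -> Dict[str, float]:
    """Return the fraction of residues above each assumed pLDDT cut-off."""
    if not plddt:
        raise ValueError("need at least one per-residue score")
    n = len(plddt)
    confident = sum(1 for score in plddt if score > CONFIDENT)
    very_high = sum(1 for score in plddt if score > VERY_HIGH)
    # "very_high" is a subset of "confident", mirroring how the
    # 36% figure is a subset of the 58% figure.
    return {"confident": confident / n, "very_high": very_high / n}


# Hypothetical scores for an eight-residue toy example.
scores = [95.2, 91.0, 88.4, 72.1, 65.0, 48.3, 30.9, 93.7]
coverage = confidence_coverage(scores)
print(coverage)
```

Applied across every residue in a proteome-scale set of predictions, the same calculation would yield dataset-wide figures of the kind quoted above.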