[slides and audio] Verification of protein structures: Patterns of nonbonded atomic interactions

A novel method is described for distinguishing correctly from incorrectly determined regions of protein structures on the basis of characteristic atomic interactions. Different types of atoms are distributed nonrandomly with respect to one another in proteins, and errors in model building produce more randomized distributions that statistical methods can detect. Atoms are classified into three categories, carbon (C), nitrogen (N), and oxygen (O), which gives six types of noncovalently bonded interactions (CC, CN, CO, NN, NO, and OO). A quadratic error function characterizes the set of pairwise interactions within nine-residue sliding windows, using a database of 96 reliable, high-resolution protein structures selected according to specific criteria. Statistical methods, including a Gaussian error function, are used to classify observations, so that mistraced or misregistered regions of a candidate structure, which need adjustment in a preliminary model, can be identified from their pattern of nonbonded interactions.

The method is sensitive to errors in backbone position on the order of 1.5 Å. It was tested on several protein structures, including HIV-1 protease, rubisco, EcoRI, and PRAI-IGPS, and successfully identified the incorrect regions in each. It has limitations: its results depend on the refinement method used, it does not distinguish well between small and severe errors, and it is not effective in identifying structures refined without experimental constraints. The method is also sensitive to surface polarity, which is an important criterion for distinguishing misfolded structures.

The FORTRAN program ERRAT is available for use. Analysis of a 300-residue protein takes less than 2 seconds of CPU time.
The method is useful for model-building and structure verification.
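
The windowed pair counting and the quadratic error function described above can be illustrated with a minimal Python sketch. It is not the ERRAT program itself. The function names, the 3.5 Å distance cutoff, the use of residue separation as a stand-in for excluding covalently bonded atoms, and the choice to score five of the six fractions (the six sum to one) are all assumptions made for illustration. The `mean` and `inv_cov` arguments stand for statistics that would have to be estimated from a reference database and are not supplied here.

```python
from itertools import combinations
from math import dist

PAIR_TYPES = ("CC", "CN", "CO", "NN", "NO", "OO")


def pair_fractions(atoms, start, window=9, cutoff=3.5, min_sep=2):
    """Fractions of the six pair types for one sliding window.

    atoms: list of (residue_index, element, (x, y, z)) tuples.
    cutoff and min_sep are illustrative assumptions, not published values.
    """
    counts = dict.fromkeys(PAIR_TYPES, 0)
    end = start + window
    for (ri, ei, pi), (rj, ej, pj) in combinations(atoms, 2):
        # Only C, N and O atoms are classified.
        if ei not in "CNO" or ej not in "CNO":
            continue
        # Keep pairs with at least one atom inside the window.
        if not (start <= ri < end or start <= rj < end):
            continue
        # Crude proxy for "nonbonded": skip atoms in the same or adjacent residues.
        if abs(ri - rj) < min_sep:
            continue
        if dist(pi, pj) <= cutoff:
            # Sorting the two element letters maps e.g. "OC" onto "CO".
            counts["".join(sorted(ei + ej))] += 1
    total = sum(counts.values())
    return {t: (n / total if total else 0.0) for t, n in counts.items()}


def quadratic_error(fractions, mean, inv_cov, types=PAIR_TYPES[:5]):
    """Quadratic form d^T A d, where d is the deviation from reference means.

    mean maps pair type -> reference fraction; inv_cov is a square matrix
    (list of lists) ordered like `types`. Both are hypothetical inputs.
    """
    d = [fractions[t] - mean[t] for t in types]
    n = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))


# Toy input: four atoms, two of which contact the first within the cutoff.
toy_atoms = [
    (0, "C", (0.0, 0.0, 0.0)),
    (2, "O", (3.0, 0.0, 0.0)),
    (5, "N", (0.0, 3.0, 0.0)),
    (20, "C", (100.0, 0.0, 0.0)),
]
toy_fractions = pair_fractions(toy_atoms, start=0)
```

A window whose fractions deviate strongly from the reference means yields a large quadratic error, which is the sense in which a region would be flagged for inspection. Sliding `start` along the chain gives one value per window.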

Verification of protein structures: Patterns of nonbonded atomic interactions

1993 | CHRIS COLOVOS AND TODD O. YEATES