A method and server for predicting damaging missense mutations

April 2010 | Ivan A. Adzhubei, Steffen Schmidt, Leonid Peshkin, Vasily E. Ramensky, Anna Gerasimova, Peer Bork, Alexey S. Kondrashov, and Shamil R. Sunyaev
This paper introduces PolyPhen-2, a new method and software tool for predicting the damaging effects of missense mutations. Unlike its predecessor, PolyPhen, PolyPhen-2 uses eight sequence-based and three structure-based predictive features, selected automatically by an iterative greedy algorithm. Most of these features compare a property of the wild-type allele with the corresponding property of the mutant allele, which together define the amino acid replacement. The alignment pipeline uses clustering to select homologous sequences and then constructs multiple alignments. The method uses the UniRef100 and Swiss-Prot databases for homology searches.
Predictions are made using a Naïve Bayes classifier. PolyPhen-2 was trained on two datasets: HumDiv, built from damaging alleles that cause Mendelian diseases, and HumVar, which contains disease-causing mutations together with variants treated as non-damaging. PolyPhen-2 outperformed PolyPhen and the other tools tested (SIFT, SNAP, and SNPs3D) in prediction accuracy, although those three tools had limitations in applying to HumDiv. At a false positive rate of 20%, PolyPhen-2 achieved a true positive rate of 92% on HumDiv and 73% on HumVar. The lower accuracy on HumVar is attributed to some of the variants treated as non-damaging being in fact mildly deleterious. HumVar-trained PolyPhen-2 is therefore suited to tasks that require distinguishing mutations with drastic effects from mild ones, while HumDiv-trained PolyPhen-2 is better suited to evaluating rare alleles in complex traits, where even mildly deleterious alleles count as damaging. For each mutation, the tool calculates the posterior probability that it is damaging and reports estimates of the false positive and true positive rates. It also classifies each mutation as benign, possibly damaging, or probably damaging. The tool is available online. The study was supported by NIH grant R01 GM078598. References to prior studies are included.
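The scoring and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the PolyPhen-2 implementation: the feature names, the likelihood tables, the toy scores, and the class cutoffs are all hypothetical values chosen for illustration. It shows how a Naïve Bayes posterior is formed from per-feature likelihoods, how that posterior could be mapped to the three qualitative classes, and how a true positive rate at a fixed false positive rate can be read off a set of scores.

```python
import math

def naive_bayes_posterior(features, likelihoods, prior_damaging=0.5):
    """Return P(damaging | features), treating features as independent.

    likelihoods[name][value] is a pair:
    (P(value | damaging), P(value | non-damaging)).
    """
    log_dam = math.log(prior_damaging)
    log_ben = math.log(1.0 - prior_damaging)
    for name, value in features.items():
        p_dam, p_ben = likelihoods[name][value]
        log_dam += math.log(p_dam)
        log_ben += math.log(p_ben)
    # Normalise in log space so many small factors do not underflow.
    top = max(log_dam, log_ben)
    dam = math.exp(log_dam - top)
    ben = math.exp(log_ben - top)
    return dam / (dam + ben)

def classify(posterior, possibly_cutoff, probably_cutoff):
    """Map a posterior to one of three classes (cutoffs are hypothetical)."""
    if posterior >= probably_cutoff:
        return "probably damaging"
    if posterior >= possibly_cutoff:
        return "possibly damaging"
    return "benign"

def tpr_at_fpr(scores_damaging, scores_benign, max_fpr):
    """Best true positive rate among thresholds whose FPR is <= max_fpr.

    A variant is called damaging when its score is >= the threshold.
    """
    best = 0.0
    for t in sorted(set(scores_damaging) | set(scores_benign)):
        fpr = sum(s >= t for s in scores_benign) / len(scores_benign)
        if fpr <= max_fpr:
            tpr = sum(s >= t for s in scores_damaging) / len(scores_damaging)
            best = max(best, tpr)
    return best

# Hypothetical likelihood tables for two made-up discrete features.
TOY_LIKELIHOODS = {
    "conservation": {"high": (0.8, 0.2), "low": (0.2, 0.8)},
    "buried": {"yes": (0.6, 0.3), "no": (0.4, 0.7)},
}

if __name__ == "__main__":
    p = naive_bayes_posterior({"conservation": "high", "buried": "yes"},
                              TOY_LIKELIHOODS)
    print(round(p, 3), classify(p, possibly_cutoff=0.5, probably_cutoff=0.85))
    print(tpr_at_fpr([0.95, 0.9, 0.8, 0.7, 0.4],
                     [0.6, 0.5, 0.3, 0.2, 0.1], max_fpr=0.2))
```

In this sketch the cutoffs are fixed posterior values for simplicity. A tool that reports false positive rates alongside each prediction could equally define its classes by false positive rate thresholds estimated on the training data.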