2011 | David Habier, Rohan L Fernando, Kadir Kizilkaya, Dorian J Garrick
This study introduces two Bayesian model averaging methods, BayesC $ \pi $ and BayesD $ \pi $, to address the limitations of BayesA and BayesB in genomic selection. These methods treat the prior probability $ \pi $ that a SNP has zero effect as unknown, improving the accuracy of genomic estimated breeding values (GEBVs) and providing insights into the genetic architecture of quantitative traits.
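To make the role of $ \pi $ concrete, the following is a minimal sketch of a mixture prior of the BayesC $ \pi $ type: each SNP effect is zero with probability $ \pi $ and otherwise drawn from a normal distribution with a common variance. It also shows one conditional draw of $ \pi $ given which SNPs are currently in the model. The uniform prior on $ \pi $, the function names, and all numeric values are assumptions made for illustration, not details taken from this summary.

```python
import random

def draw_snp_effects(n_snps, pi, sigma2_a, rng):
    """Draw SNP effects from the mixture prior.

    Each effect is 0 with probability pi, otherwise N(0, sigma2_a).
    sigma2_a is a common effect variance (illustrative value below).
    """
    sd = sigma2_a ** 0.5
    in_model = [rng.random() >= pi for _ in range(n_snps)]  # True with prob. 1 - pi
    effects = [rng.gauss(0.0, sd) if included else 0.0 for included in in_model]
    return effects, in_model

def sample_pi(in_model, rng):
    """One conditional draw of pi given the inclusion indicators.

    Assumes a Uniform(0, 1) prior on pi, which gives a Beta conditional:
    Beta(number of excluded SNPs + 1, number of included SNPs + 1).
    """
    n_snps = len(in_model)
    n_in = sum(in_model)
    return rng.betavariate(n_snps - n_in + 1, n_in + 1)

rng = random.Random(0)
effects, in_model = draw_snp_effects(10_000, pi=0.95, sigma2_a=0.01, rng=rng)
pi_draw = sample_pi(in_model, rng)
print(f"SNPs in model: {sum(in_model)}, conditional draw of pi: {pi_draw:.3f}")
```

In a full sampler the indicators would themselves be updated from the data in every iteration, so the draw of $ \pi $ reflects how many SNPs the data support rather than a value fixed in advance.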
BayesC $ \pi $ and BayesD $ \pi $ were compared with BayesA, BayesB, and ridge regression using simulated scenarios and real data from North American Holstein bulls. BayesC $ \pi $ provided more accurate estimates of $ \pi $ and better inferred the number of QTL, especially when the number of QTL was large. BayesD $ \pi $, on the other hand, overestimated the number of QTL in some scenarios. The accuracy of GEBVs was similar across the Bayesian methods, and BayesA performed well with the real data, although it required longer computing times than BayesC $ \pi $.
The study found that the number of QTL and the size of their effects varied across traits, with milk yield and fat yield having QTL with larger effects than protein yield and somatic cell score. The accuracy of GEBVs was mainly influenced by linkage disequilibrium (LD) rather than additive genetic relationships, especially when training data was limited.
BayesC $ \pi $ and BayesD $ \pi $ showed different behaviors in estimating the number of SNPs and QTL, with BayesC $ \pi $ generally providing more accurate estimates. The study also highlighted that the prior probability $ \pi $ is crucial for the shrinkage of SNP effects, and treating $ \pi $ as unknown improves the accuracy of GEBVs.
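The link between $ \pi $ and shrinkage can be illustrated with the conditional probability that a single SNP is included in the model. The sketch below uses one common formulation for mixture models of this kind, in which the SNP effect is integrated out and the two hypotheses (effect zero, effect normally distributed) are compared on the SNP's adjusted right-hand side. It is not necessarily the authors' implementation; the function name, arguments, and numeric values are hypothetical.

```python
import math

def inclusion_probability(rhs, xtx, pi, sigma2_a, sigma2_e):
    """Probability that a SNP is in the model, given everything else.

    rhs      : x'(y adjusted for all other effects) for this SNP
    xtx      : x'x for this SNP's genotype column
    pi       : prior probability that the SNP effect is zero
    sigma2_a : common variance of nonzero SNP effects (assumed known here)
    sigma2_e : residual variance (assumed known here)
    """
    var0 = xtx * sigma2_e                         # Var(rhs) if the effect is zero
    var1 = xtx * xtx * sigma2_a + xtx * sigma2_e  # Var(rhs) if the effect is nonzero
    log_f0 = -0.5 * (math.log(var0) + rhs * rhs / var0) + math.log(pi)
    log_f1 = -0.5 * (math.log(var1) + rhs * rhs / var1) + math.log(1.0 - pi)
    return 1.0 / (1.0 + math.exp(log_f0 - log_f1))

# Same data for the SNP, three different values of pi (illustrative numbers).
probs = [inclusion_probability(30.0, 1000.0, p, 0.001, 1.0) for p in (0.5, 0.9, 0.99)]
for p, prob in zip((0.5, 0.9, 0.99), probs):
    print(f"pi = {p:.2f} -> inclusion probability = {prob:.4f}")
```

With identical data, a larger $ \pi $ lowers the inclusion probability, so the SNP spends more iterations at zero and its posterior mean effect is shrunk more strongly.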
In conclusion, BayesC $ \pi $ is recommended for routine applications due to its ability to account for the number of QTL and provide more accurate GEBVs, especially when the number of QTL is large. The study also suggests that as SNP density increases, the overestimation of QTL numbers is expected to decrease, as LD between SNPs and QTL will be higher, leading to fewer SNPs being modeled per QTL.