2010 April | Hyun Min Kang, Jae Hoon Sul, Susan K Service, Noah A Zaitlen, Sit-yee Kong, Nelson B Freimer, Chiara Sabatti, Eleazar Eskin
A variance component approach called EMMAX (efficient mixed-model association eXpedited) was developed to correct for sample structure in genome-wide association studies (GWASs). The method uses a linear mixed model with an empirically estimated relatedness matrix to account for genetic relatedness and population stratification. EMMAX reduces the time needed to analyze large GWAS datasets from years to hours. It was applied to two human GWAS datasets: the 1966 Northern Finland Birth Cohort (NFBC66) and the Wellcome Trust Case Control Consortium (WTCCC). EMMAX outperformed principal component analysis (PCA) and genomic control in correcting for sample structure.
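To make the mixed-model idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the model y = mu + g*beta + u + e with Var(u) = sigma_g2 * K and Var(e) = sigma_e2 * I, and it assumes the two variance components have already been estimated once and are reused for every marker. The function name `gls_scan` and all of its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def gls_scan(y, G, K, sigma_g2, sigma_e2):
    """Test each SNP column of G against phenotype y under a fixed covariance.

    Assumed model: y = mu + g*beta + u + e, with Var(u) = sigma_g2 * K and
    Var(e) = sigma_e2 * I. The variance components are taken as given
    (hypothetical inputs); estimating them is not shown here.
    Returns one t statistic per marker.
    """
    y = np.asarray(y, dtype=float)
    G = np.asarray(G, dtype=float)
    n = y.shape[0]

    # Phenotypic covariance implied by the relatedness matrix K.
    V = sigma_g2 * K + sigma_e2 * np.eye(n)
    L = np.linalg.cholesky(V)  # V = L @ L.T

    # Whitening: multiplying by inv(L) makes the residuals uncorrelated,
    # so ordinary least squares on the transformed data equals GLS.
    y_w = np.linalg.solve(L, y)
    ones_w = np.linalg.solve(L, np.ones(n))
    G_w = np.linalg.solve(L, G)

    stats = np.empty(G.shape[1])
    for j in range(G.shape[1]):
        X = np.column_stack([ones_w, G_w[:, j]])  # intercept + one marker
        XtX_inv = np.linalg.inv(X.T @ X)
        beta = XtX_inv @ X.T @ y_w
        resid = y_w - X @ beta
        s2 = resid @ resid / (n - 2)              # residual variance
        stats[j] = beta[1] / np.sqrt(s2 * XtX_inv[1, 1])
    return stats
```

Because the covariance is factorized once and then shared across all markers, the per-marker cost is a small least-squares fit, which illustrates where the computational saving comes from.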
Sample structure in GWASs includes population stratification and hidden relatedness, which can lead to spurious associations. EMMAX addresses these issues by modeling the correlation between phenotypes using a relatedness matrix derived from high-density markers. This approach accounts for both population stratification and hidden relatedness, which are often not fully captured by PCA or genomic control. EMMAX was tested on various traits, including quantitative traits from the NFBC66 and common diseases from the WTCCC. It showed improved performance in reducing inflation of test statistics and identifying true associations.
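As an illustration of how a relatedness matrix can be derived from high-density markers, here is a minimal sketch. It uses one common estimator (the average cross-product of standardized genotypes), which is an assumption made for this example and not necessarily the estimator used in the study. The function name `relatedness_matrix` and the 0/1/2 genotype coding are likewise assumptions.

```python
import numpy as np

def relatedness_matrix(G):
    """Estimate pairwise relatedness from genotypes.

    G: (n_individuals, n_markers) array of allele counts coded 0/1/2,
       with no missing values (an assumption of this sketch).
    Returns an (n_individuals, n_individuals) symmetric matrix.
    """
    G = np.asarray(G, dtype=float)
    p = G.mean(axis=0) / 2.0            # allele frequency per marker
    keep = (p > 0) & (p < 1)            # drop monomorphic markers
    p = p[keep]
    # Center by the expected allele count and scale by its standard deviation.
    Z = (G[:, keep] - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    # Average the per-marker cross-products over all retained markers.
    return Z @ Z.T / Z.shape[1]
```

Large off-diagonal entries flag pairs of individuals who share more alleles than expected by chance, which is the signal a mixed model uses to capture both hidden relatedness and population stratification.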
EMMAX also provides marker-specific inflation factors, which are more accurate than the single global inflation factor used in genomic control. These factors help in identifying and correcting for sample structure more effectively. The method was also found to require substantially less computational time than traditional approaches, and it was shown to improve the ranking of SNPs and reduce false positives.
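For contrast with marker-specific factors, the global inflation factor used in genomic control can be sketched as follows. This is the standard median-based definition for 1-degree-of-freedom chi-square statistics, not code from the study, and the function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Global inflation factor: observed median over the expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

def genomic_control(chi2_stats):
    """Divide every statistic by the same global factor (never below 1)."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [s / lam for s in chi2_stats]
```

Every marker is rescaled by the same number here, which is the limitation that marker-specific factors are meant to address.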
In the WTCCC data, EMMAX corrected for sample structure and identified significant associations that were not detected by other methods. It also showed better performance in handling complex genetic relationships and reducing overdispersion of test statistics. The method was found to be effective in identifying true associations, even in cases where other methods failed. Overall, EMMAX provides a more accurate and efficient approach for correcting sample structure in GWASs, improving the reliability of association findings.