2010 April; 42(4): 348–354. doi:10.1038/ng.548. | Hyun Min Kang, Jae Hoon Sul, Susan K Service, Noah A Zaitlen, Sit-yee Kong, Nelson B Freimer, Chiara Sabatti, and Eleazar Eskin
The paper introduces a variance component model, implemented in the software EMMA eXpanded (EMMAX), to account for sample structure in genome-wide association studies (GWAS). Sample structure, which includes population stratification and hidden relatedness, can inflate test statistics and produce spurious associations. Standard corrections such as principal component analysis (PCA) and genomic control may not fully correct for sample structure, while earlier variance component approaches are computationally impractical for large datasets. EMMAX uses a linear mixed model to model the correlation between the phenotypes of sample subjects, and it reduces the computational time for analyzing large GWAS datasets from years to hours. The method is evaluated on two human GWAS datasets: the Northern Finland Birth Cohort (NFBC66) and the Wellcome Trust Case Control Consortium (WTCCC). EMMAX outperforms both PCA and genomic control in correcting for sample structure, providing more accurate association results. The study highlights the importance of accounting for marker-specific inflation factors and the potential problems with applying a single global correction factor in meta-analyses and multistage analyses.
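To make the idea of a single global correction factor concrete, the sketch below shows how genomic control is commonly computed: one inflation factor λ is estimated as the ratio of the observed median test statistic to the median expected under the null, and every marker's statistic is divided by that same λ. This is a minimal illustration, not code from the paper or from EMMAX. It assumes 1-degree-of-freedom chi-square statistics, and the function names and toy values are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the square of the 75th percentile of a standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation_factor(chi2_stats):
    """Ratio of the observed median statistic to its expected null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide every statistic by one global factor.

    By the usual convention the factor is floored at 1, so statistics
    are never inflated by the correction.
    """
    lam = max(genomic_inflation_factor(chi2_stats), 1.0)
    return [s / lam for s in chi2_stats]


# Toy statistics, made up for illustration only.
toy_stats = [0.2, 0.5, 0.91, 1.3, 4.0]
lam = genomic_inflation_factor(toy_stats)
corrected = genomic_control(toy_stats)
```

Because every marker is divided by the same λ, markers that track sample structure more or less strongly than average would be under- or over-corrected, which is the concern behind the summary's emphasis on marker-specific inflation factors.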