Correcting for Population Structure and Kinship Using the Linear Mixed Model: Theory and Extensions

Overview of attention for article published in PLOS ONE, October 2013

Mentioned by
7 X users

Citations
80 Dimensions

Readers on
254 Mendeley

Title
Correcting for Population Structure and Kinship Using the Linear Mixed Model: Theory and Extensions
Published in
PLOS ONE, October 2013
DOI 10.1371/journal.pone.0075707
Pubmed ID
Authors

Gabriel E. Hoffman

Abstract

Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.
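The contrast the abstract draws between fixed and random effects can be illustrated with a minimal sketch. It rests on assumptions not taken from the article: effective degrees of freedom is taken here to be the trace of the hat matrix of a linear smoother, which is one common definition and may differ from the article's; kinship is estimated as a simple cross-product of standardized genotypes; and `delta` stands for the ratio of residual to genetic variance. The function names are hypothetical.

```python
import numpy as np


def kinship_eigen(genotypes):
    """Eigendecompose K = Z Z^T / m, where Z holds column-standardized genotypes.

    Assumption: this simple cross-product kinship estimate is for illustration only.
    """
    z = genotypes - genotypes.mean(axis=0)
    sd = z.std(axis=0)
    keep = sd > 0                      # drop monomorphic markers
    z = z[:, keep] / sd[keep]
    k = z @ z.T / z.shape[1]
    eigvals, eigvecs = np.linalg.eigh(k)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    # Clip tiny negative eigenvalues caused by floating-point error.
    return np.clip(eigvals[order], 0.0, None), eigvecs[:, order]


def df_fixed_pcs(n_pcs):
    """Hat matrix of regression on n_pcs orthonormal PCs is a projection: trace = n_pcs."""
    return float(n_pcs)


def df_random_effect(eigvals, delta):
    """Trace of K (K + delta I)^{-1}, i.e. sum_i s_i / (s_i + delta).

    Assumption: delta = residual variance / genetic variance, with delta > 0.
    """
    return float(np.sum(eigvals / (eigvals + delta)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes = rng.binomial(2, 0.3, size=(100, 500)).astype(float)  # simulated data
    eigvals, _ = kinship_eigen(genotypes)
    print("fixed effects, 10 PCs:", df_fixed_pcs(10))
    for delta in (0.1, 1.0, 10.0):
        print("random effect, delta =", delta, "->", df_random_effect(eigvals, delta))
```

Under these assumptions, a fixed-effects correction spends one whole degree of freedom on each included principal component, while the random-effects correction shrinks every eigen-direction by `s_i / (s_i + delta)`, so its complexity varies continuously with `delta`.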

X Demographics

Data were collected from the profiles of the 7 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 254 Mendeley readers of this research output.

Geographical breakdown

Country         Count  As %
United States   6      2%
Germany         3      1%
United Kingdom  3      1%
Chile           1      <1%
Brazil          1      <1%
France          1      <1%
New Zealand     1      <1%
Australia       1      <1%
Unknown         237    93%

Demographic breakdown

Readers by professional status  Count  As %
Student > Ph. D. Student        76     30%
Researcher                      46     18%
Student > Master                35     14%
Student > Bachelor              15     6%
Student > Doctoral Student      14     6%
Other                           46     18%
Unknown                         22     9%

Readers by discipline                         Count  As %
Agricultural and Biological Sciences          119    47%
Biochemistry, Genetics and Molecular Biology  41     16%
Mathematics                                   17     7%
Computer Science                              16     6%
Medicine and Dentistry                        6      2%
Other                                         23     9%
Unknown                                       32     13%