Parsimonious Mixed Models

26 May 2018 | Douglas Bates, Reinhold Kliegl, Shravan Vasishth, R. Harald Baayen
The analysis of experimental data with mixed-effects models requires decisions about the specification of an appropriate random-effects structure. Recent recommendations suggest fitting 'maximal' models that include all possible random-effect components. However, estimation of a maximal model may fail to converge because the model is too complex for the data, regardless of the estimation method, and overparameterization can render a model uninterpretable. Diagnostic tools are available to detect overparameterization and to guide model simplification.

Linear mixed models (LMMs) are increasingly used in psychology and linguistics to analyze data with both subjects and items as random factors, because they support statistical inference about experimental effects and their interactions within a single framework. Selecting a proper random-effects structure is crucial: LMMs can capture between-subject and between-item variance in the fixed effects and their interactions, as well as correlations between random intercepts and random slopes. The simplest case, a two-level within-subject manipulation with subject as the only random factor, illustrates the basic structure (see the sketch below).
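As a minimal sketch of that simplest case, the model can be written in R with the lme4 package (which the first author maintains). The data frame and variable names here (dat, rt, condition, subject) are hypothetical placeholders, not taken from the paper:

```r
library(lme4)

## Hypothetical data: one response 'rt' per trial, a two-level
## within-subject factor 'condition' coded as a numeric contrast
## (e.g., -0.5 / +0.5), and 'subject' as the only random factor.

## Maximal model for this design: by-subject random intercept,
## random slope for condition, and their correlation
## (3 covariance parameters plus the residual variance).
m_max <- lmer(rt ~ 1 + condition + (1 + condition | subject), data = dat)

## Zero-correlation version: the double-bar syntax drops the
## intercept-slope correlation, leaving only the two variance
## components. With a numeric contrast this works as intended.
m_zcp <- lmer(rt ~ 1 + condition + (1 + condition || subject), data = dat)
```

Even in this two-term random-effects structure, the correlation parameter is estimable only if the data carry enough information; the contrast between m_max and m_zcp is the smallest instance of the simplification question the paper addresses.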
As model complexity increases, so does the difficulty of estimation: LMMs and generalized linear mixed models (GLMMs) become hard to fit when the random-effects structure is high-dimensional, and the number of covariance parameters grows rapidly with the number of experimental factors. For example, in a design with three two-level within-subject and within-item factors, the maximal model has eight random-effect terms per grouping factor (the intercept, three main effects, three two-way interactions, and one three-way interaction), which implies 8 variances and 8 × 7 / 2 = 28 correlations, i.e., 36 covariance parameters each for subjects and for items. Such models frequently fail to converge or yield degenerate (singular) covariance matrices.

Iterative reduction of model complexity is therefore recommended: (1) check for overparameterization with a principal components analysis (PCA) of the random-effects covariance matrices; (2) remove variance components not supported by the data; (3) test the significance of the remaining variance components with likelihood-ratio tests; and (4) reintroduce correlation parameters only if the data support them. This process yields a parsimonious model that the data can actually support.

In a reanalysis of Kronmüller and Barr (2007), the maximal model proved too complex for the data: PCA revealed that only a few variance components were needed, and removing non-significant components while testing the remainder led to a simpler model that still supported reliable inferences. Similarly, a reanalysis of Kliegl et al. (2015) showed the maximal model to be overparameterized and identified a simpler alternative. A Bayesian analysis, whose credible intervals can be compared with frequentist confidence intervals, supports the same conclusion: the maximal model is not needed, and the simpler model is sufficient for reliable inference.

In conclusion, maximal models are not always necessary; a parsimonious model supported by the data is preferable, because overparameterization can produce uninterpretable models and incorrect inferences. Iterative reduction of model complexity, guided by PCA and likelihood-ratio tests, yields a model that is both supported by the data and interpretable.
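To make the iterative-reduction workflow concrete, here is a hedged sketch in R, continuing the hypothetical m_max and m_zcp objects from the earlier example. rePCA() is provided by the authors' RePsychLing package (recent versions of lme4 also export a rePCA() function); everything else is standard lme4 usage:

```r
## Step 1: PCA of the estimated random-effects covariance matrix.
## Principal components with (near-)zero variance indicate
## overparameterization: the data support fewer dimensions than
## the model tries to estimate.
summary(rePCA(m_max))

## Steps 2-3: compare the maximal model with the reduced
## zero-correlation model via a likelihood-ratio test
## (anova() refits both models with ML for the comparison).
anova(m_zcp, m_max)

## Step 4: if the reduced model is not significantly worse,
## adopt it; correlation parameters can be reintroduced and
## retested individually where theory or diagnostics suggest.
```

In the 2×2×2 case described above the same loop applies, just with more terms: drop components whose principal-component variance is essentially zero, retest, and stop when every remaining parameter is supported by the data.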