Abstract
The analysis of highly structured data requires models with unobserved components (random effects) able to adequately account for the patterns of variances and correlations. The specification of the unobserved components is a key and challenging task. In this paper, we first review the literature about the consequences of misspecifying the distribution of the random effects and the related diagnostic tools; we then outline the main alternatives and generalizations, also considering some issues arising in Bayesian inference. The relevance of suitably structuring the unobserved components is illustrated by means of an application exploiting a model with heteroscedastic random effects.
Additional information
This research was supported by the FIRB 2012 project “Mixture and latent variable models for causal inference and analysis of socio-economic data”, Grant No. RBFR12SHVV_003.
Cite this article
Grilli, L., Rampichini, C. Specification of random effects in multilevel models: a review. Qual Quant 49, 967–976 (2015). https://doi.org/10.1007/s11135-014-0060-5
DOI: https://doi.org/10.1007/s11135-014-0060-5