9 Effect Size: Variance Explained

Note that these versions of \(R^2\) are becoming more common, but they are not yet universally agreed upon or standard. You will not be able to calculate them directly in standard software. Instead, you need to calculate the components and program the calculation yourself. Importantly, if you choose to report one or both of them, you should not only identify which one you are using, but also provide a brief interpretation and cite the source article.

The Analysis Factor, R-Squared for Mixed Effects Models, by Kim Love

Marginal vs. Conditional \(R^2\):

A general and simple method for obtaining R2 from generalized linear mixed‐effects models

2012: Journal Article

Authors: Shinichi Nakagawa and Holger Schielzeth
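As a rough illustration of "calculating the components and programming the calculation", the sketch below assumes a linear mixed model with a single random intercept. Marginal \(R^2\) is taken as \(\sigma^2_f / (\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon)\) and conditional \(R^2\) as \((\sigma^2_f + \sigma^2_\alpha) / (\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon)\), where \(\sigma^2_f\) is the variance of the fixed-effect predictions, \(\sigma^2_\alpha\) the random-intercept variance, and \(\sigma^2_\varepsilon\) the residual variance. The function name `r2_lmm` and the numeric inputs are hypothetical, chosen only for illustration; in practice the three components would be extracted from a fitted model.

```python
def r2_lmm(var_fixed, var_random, var_residual):
    """Return (marginal, conditional) R^2 from three variance components.

    Assumes a random-intercept linear mixed model:
      var_fixed    -- variance of the fixed-effect predictions
      var_random   -- random-intercept variance
      var_residual -- residual (within-group) variance
    """
    if min(var_fixed, var_random, var_residual) < 0:
        raise ValueError("variance components must be non-negative")
    total = var_fixed + var_random + var_residual
    if total == 0:
        raise ValueError("total variance is zero")
    # Marginal: variance explained by the fixed effects alone.
    marginal = var_fixed / total
    # Conditional: variance explained by fixed and random effects together.
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components, made up for illustration only.
marginal, conditional = r2_lmm(var_fixed=2.0, var_random=1.0, var_residual=1.0)
print(f"marginal R^2 = {marginal:.2f}, conditional R^2 = {conditional:.2f}")
```

Because the random-intercept variance appears only in the numerator of the conditional measure, the conditional \(R^2\) is never smaller than the marginal \(R^2\). Whichever one is reported should be named explicitly.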

Somewhat technical, but good figures:

Quantifying Explained Variance in Multilevel Models: An Integrative Framework for Defining R-Squared Measures

2019: manuscript, Psychological Methods

Authors: Jason D. Rights and Sonya K. Sterba, Vanderbilt University

9.1 MLM or LMM

Multilevel Model (MLM) or Linear Mixed Model (LMM)

Social and Behavioral Science Focus:

R-squared measures in Multilevel Modeling: The undesirable property of negative R-squared values

2018: First Year Paper, Research Master in Social and Behavioral Science

Student: Edoardo Costantini Supervisor: prof. dr. M.A.L.M. van Assen

Education Focus:

Effect size measures for multilevel models: definition, interpretation, and TIMSS example

2018: Manuscript, Large-scale Assessments in Education

Author: Julie Lorah

Psychology Focus:

Effect Size Measures for Multilevel Models in Clinical Child and Adolescent Research: New R-Squared Methods and Recommendations

2018: manuscript, Journal of Clinical Child & Adolescent Psychology

Authors: Jason D. Rights and David A. Cole

Online Appendix: r2MLMcomp

Software - R functions

r2MLM computes and outputs R-squared measures and analytic decompositions of variance for multilevel models
r2MLMcomp computes R-squared differences between pairs of multilevel models under comparison
r2MLM3 computes all measures relevant for a three-level model and provides a barchart graphic

9.2 GLMM

Generalized Linear Mixed Models

Ecology Focus: for MLM/LMM and GLMM

R squared for mixed models

2013: Blog post by Phil Martin

R squared for mixed models – the easy way

2013: Blog post by Phil Martin

Nakagawa and Schielzeth (2013)

https://stats.stackexchange.com/questions/111150/calculating-r2-in-mixed-models-using-nakagawa-schielzeths-2013-r2glmm-me

I am answering by pasting Douglas Bates’s reply in the R-Sig-ME mailing list, on 17 Dec 2014 on the question of how to calculate an \(R^2\) statistic for generalized linear mixed models, which I believe is required reading for anyone interested in such a thing. Bates is the original author of the lme4 package for R and co-author of nlme, as well as co-author of a well-known book on mixed models, and CV will benefit from having the text in an answer, rather than just a link to it.

I must admit to getting a little twitchy when people speak of the “\(R^2\) for GLMMs”. \(R^2\) for a linear model is well-defined and has many desirable properties. For other models one can define different quantities that reflect some but not all of these properties. But this is not calculating an \(R^2\) in the sense of obtaining a number having all the properties that the \(R^2\) for linear models does. Usually there are several different ways that such a quantity could be defined. Especially for GLMs and GLMMs before you can define “proportion of response variance explained” you first need to define what you mean by “response variance”. The whole point of GLMs and GLMMs is that a simple sum of squares of deviations does not meaningfully reflect the variability in the response because the variance of an individual response depends on its mean.

Confusion about what constitutes \(R^2\) or degrees of freedom of any of the other quantities associated with linear models as applied to other models comes from confusing the formula with the concept. Although formulas are derived from models the derivation often involves quite sophisticated mathematics. To avoid a potentially confusing derivation and just “cut to the chase” it is easier to present the formulas. But the formula is not the concept. Generalizing a formula is not equivalent to generalizing the concept. And those formulas are almost never used in practice, especially for generalized linear models, analysis of variance and random effects. I have a “meta-theorem” that the only quantity actually calculated according to the formulas given in introductory texts is the sample mean.

It may seem that I am being a grumpy old man about this, and perhaps I am, but the danger is that people expect an “\(R^2\)-like” quantity to have all the properties of an \(R^2\) for linear models. It can’t. There is no way to generalize all the properties to a much more complicated model like a GLMM.

I was once on the committee reviewing a thesis proposal for Ph.D. candidacy. The proposal was to examine I think 9 different formulas that could be considered ways of computing an \(R^2\) for a nonlinear regression model to decide which one was “best”. Of course, this would be done through a simulation study with only a couple of different models and only a few different sets of parameter values for each. My suggestion that this was an entirely meaningless exercise was not greeted warmly.
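Bates's remark that the variance of an individual response depends on its mean, and that "response variance" must be defined before any proportion can be computed, can be made concrete with a small sketch. The first function shows the mean-variance link for a binary response. The second shows one possible definition, following the latent-scale approach of Nakagawa and Schielzeth (2013) for a logit link, in which the observation-level variance is fixed at \(\pi^2/3\); any additive overdispersion term is ignored here. The function names and input values are hypothetical, and this is only one of several ways such a quantity could be defined.

```python
import math


def bernoulli_variance(p):
    """Variance of a binary (0/1) response with mean p: it depends on p."""
    return p * (1.0 - p)


def r2_logit_latent(var_fixed, var_random):
    """Return (marginal, conditional) R^2 on the latent logit scale.

    Assumption: the observation-level ("distribution-specific") variance
    is taken as pi^2 / 3, the variance of the standard logistic
    distribution, and no additive overdispersion term is included.
    """
    var_dist = math.pi ** 2 / 3
    total = var_fixed + var_random + var_dist
    return var_fixed / total, (var_fixed + var_random) / total


# The raw response variance changes with the mean, so a single sum of
# squared deviations does not summarise variability the way it does
# for a linear model.
for p in (0.1, 0.5, 0.9):
    print(f"mean = {p:.1f}, variance = {bernoulli_variance(p):.2f}")

# Hypothetical latent-scale variance components, for illustration only.
marginal, conditional = r2_logit_latent(var_fixed=1.0, var_random=0.5)
print(f"marginal = {marginal:.3f}, conditional = {conditional:.3f}")
```

A different choice for the observation-level variance would give different values from the same fitted model, which is why the definition used should always be stated alongside the number.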