Pseudo-R-squared

Pseudo-R-squared values are used when the outcome variable is nominal or ordinal, so that the coefficient of determination $R^2$ cannot be applied as a measure of goodness of fit, and when a likelihood function is used to fit the model.

In linear regression, the squared multiple correlation $R^2$ is used to assess goodness of fit, as it represents the proportion of variance in the criterion that is explained by the predictors.^[1] In logistic regression analysis there is no agreed-upon analogous measure, but there are several competing measures, each with limitations.^[1]^[2]

Four of the most commonly used indices and one less commonly used one are examined in this article:

Likelihood ratio $R_{\text{L}}^{2}$
Cox and Snell $R_{\text{CS}}^{2}$
Nagelkerke $R_{\text{N}}^{2}$
McFadden $R_{\text{McF}}^{2}$
Tjur $R_{\text{T}}^{2}$

$R_{\text{L}}^{2}$ by Cohen

$R_{\text{L}}^{2}$ is given by Cohen:^[1]

R_{\text{L}}^{2}={\frac {D_{\text{null}}-D_{\text{fitted}}}{D_{\text{null}}}}.

Here $D_{\text{null}}$ and $D_{\text{fitted}}$ are the deviances of the null (intercept-only) model and the fitted model, respectively. This is the most analogous index to the squared multiple correlation in linear regression.^[3] It represents the proportional reduction in the deviance, where the deviance is treated as a measure of variation analogous, but not identical, to the variance in linear regression analysis.^[3] One limitation of the likelihood ratio $R^2$ is that it is not monotonically related to the odds ratio,^[1] meaning that it does not necessarily increase as the odds ratio increases and does not necessarily decrease as the odds ratio decreases.
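The deviance ratio can be sketched numerically as follows. This is a minimal illustration, assuming ungrouped 0/1 outcomes, for which the deviance equals $-2\ln L$; the data, the predicted probabilities (which are not the output of an actual model fit), and the function names are all hypothetical.

```python
import math

def bernoulli_deviance(y, p):
    """Deviance (-2 * log-likelihood) for 0/1 outcomes y and predicted probabilities p."""
    return -2.0 * sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

def r2_likelihood_ratio(d_null, d_fitted):
    """Proportional reduction in deviance relative to the null model."""
    return (d_null - d_fitted) / d_null

# Hypothetical outcomes and hypothetical predicted probabilities.
y = [1, 1, 1, 0, 0, 0, 1, 0]
p_fitted = [0.9, 0.8, 0.6, 0.3, 0.2, 0.4, 0.7, 0.1]

# The null model predicts the observed base rate for every observation.
base_rate = sum(y) / len(y)
p_null = [base_rate] * len(y)

d_null = bernoulli_deviance(y, p_null)
d_fitted = bernoulli_deviance(y, p_fitted)
r2_l = r2_likelihood_ratio(d_null, d_fitted)
print(f"D_null={d_null:.3f}  D_fitted={d_fitted:.3f}  R2_L={r2_l:.3f}")
```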

$R_{\text{CS}}^{2}$ by Cox and Snell

$R_{\text{CS}}^{2}$ is an alternative index of goodness of fit related to the $R^2$ value from linear regression.^[2] It is given by:

{\begin{aligned}R_{\text{CS}}^{2}&=1-\left({\frac {L_{0}}{L_{M}}}\right)^{2/n}\\[5pt]&=1-\exp \left({\frac {2}{n}}(\ln(L_{0})-\ln(L_{M}))\right)\end{aligned}}

where $L_M$ and $L_0$ are the likelihoods for the model being fitted and the null model, respectively, and $n$ is the sample size. The Cox and Snell index corresponds to the standard $R^2$ in the case of a linear model with normal errors. In certain situations, $R_{\text{CS}}^{2}$ may be problematic because its maximum value is $1-L_{0}^{2/n}$, which is less than 1. For example, for logistic regression, the upper bound is $R_{\text{CS}}^{2}\leq 0.75$ for a symmetric marginal distribution of events, and it decreases further for an asymmetric distribution of events.^[2]
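The upper bound can be checked numerically. The sketch below assumes a binary outcome whose null model predicts the base rate for every observation; the sample size, the fitted log-likelihood, and the function names are hypothetical values chosen for illustration.

```python
import math

def r2_cox_snell(loglik_null, loglik_model, n):
    """Cox and Snell index from the two log-likelihoods and the sample size."""
    return 1.0 - math.exp((2.0 / n) * (loglik_null - loglik_model))

def r2_cox_snell_max(loglik_null, n):
    """Largest attainable value, reached when the fitted likelihood equals 1."""
    return 1.0 - math.exp((2.0 / n) * loglik_null)

def null_loglik(base_rate, n):
    """Log-likelihood of an intercept-only model for a binary outcome."""
    return n * (base_rate * math.log(base_rate)
                + (1.0 - base_rate) * math.log(1.0 - base_rate))

n = 100                      # hypothetical sample size
loglik_model = -50.0         # hypothetical fitted log-likelihood

# Symmetric marginal distribution: half of the observations are events.
ll0_symmetric = null_loglik(0.5, n)
r2_cs = r2_cox_snell(ll0_symmetric, loglik_model, n)
max_symmetric = r2_cox_snell_max(ll0_symmetric, n)

# Asymmetric marginal distribution: 10% of the observations are events.
max_asymmetric = r2_cox_snell_max(null_loglik(0.1, n), n)

print(f"R2_CS={r2_cs:.3f}  max(50/50)={max_symmetric:.3f}  max(10/90)={max_asymmetric:.3f}")
```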

$R_{\text{N}}^{2}$ by Nagelkerke

$R_{\text{N}}^{2}$, proposed by Nico Nagelkerke in a highly cited Biometrika paper,^[4] provides a correction to the Cox and Snell $R^2$ so that the maximum value is equal to 1. Nevertheless, the Cox and Snell and likelihood ratio $R^2$ values show greater agreement with each other than either does with the Nagelkerke $R^2$.^[1] This might not hold for values exceeding 0.75, as the Cox and Snell index is capped at this value. The likelihood ratio $R^2$ is often preferred to the alternatives because it is most analogous to $R^2$ in linear regression, is independent of the base rate (both the Cox and Snell and the Nagelkerke $R^2$ increase as the proportion of cases increases from 0 to 0.5), and varies between 0 and 1.
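The text above does not state the formula for the correction. In its usual form, the Nagelkerke index divides the Cox and Snell index by its maximum attainable value, $1-L_{0}^{2/n}$. A minimal sketch under that assumption, with hypothetical log-likelihood values and function names:

```python
import math

def r2_nagelkerke(loglik_null, loglik_model, n):
    """Cox and Snell index rescaled by its maximum so that the upper limit is 1."""
    r2_cs = 1.0 - math.exp((2.0 / n) * (loglik_null - loglik_model))
    r2_cs_max = 1.0 - math.exp((2.0 / n) * loglik_null)
    return r2_cs / r2_cs_max

n = 100                              # hypothetical sample size
loglik_null = n * math.log(0.5)      # null model with a 50/50 split of events

# Hypothetical fitted log-likelihood.
r2_n = r2_nagelkerke(loglik_null, -50.0, n)

# A model that predicts every outcome perfectly has likelihood 1 (log-likelihood 0).
r2_n_perfect = r2_nagelkerke(loglik_null, 0.0, n)

print(f"R2_N={r2_n:.3f}  R2_N(perfect fit)={r2_n_perfect:.3f}")
```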

$R_{\text{McF}}^{2}$ by McFadden

The pseudo $R^2$ by McFadden (sometimes called the likelihood ratio index^[5]) is defined as

R_{\text{McF}}^{2}=1-{\frac {\ln(L_{M})}{\ln(L_{0})}},

and is preferred over $R_{\text{CS}}^{2}$ by Allison.^[2] Substituting $\ln(L_{M})=(1-R_{\text{McF}}^{2})\ln(L_{0})$ into the Cox and Snell definition shows that the two indices $R_{\text{McF}}^{2}$ and $R_{\text{CS}}^{2}$ are related by

{\begin{matrix}R_{\text{CS}}^{2}=1-L_{0}^{\frac {2R_{\text{McF}}^{2}}{n}}\\[1.5em]R_{\text{McF}}^{2}={\dfrac {n}{2}}\cdot {\dfrac {\ln(1-R_{\text{CS}}^{2})}{\ln L_{0}}}\end{matrix}}
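The link between the two indices can be checked directly from their definitions. Because $\ln L_{M}=(1-R_{\text{McF}}^{2})\ln L_{0}$, the Cox and Snell definition gives $1-R_{\text{CS}}^{2}=\exp(2R_{\text{McF}}^{2}\ln L_{0}/n)$. The sketch below verifies this numerically; the log-likelihood values and function names are hypothetical.

```python
import math

def r2_mcfadden(loglik_null, loglik_model):
    """McFadden index: one minus the ratio of the two log-likelihoods."""
    return 1.0 - loglik_model / loglik_null

def r2_cox_snell(loglik_null, loglik_model, n):
    """Cox and Snell index from the two log-likelihoods and the sample size."""
    return 1.0 - math.exp((2.0 / n) * (loglik_null - loglik_model))

n = 100                              # hypothetical sample size
loglik_null = n * math.log(0.5)      # null model with a 50/50 split of events
loglik_model = -50.0                 # hypothetical fitted log-likelihood

r2_mcf = r2_mcfadden(loglik_null, loglik_model)
r2_cs = r2_cox_snell(loglik_null, loglik_model, n)

# Convert each index into the other using only ln(L_0) and n.
r2_cs_from_mcf = 1.0 - math.exp((2.0 / n) * r2_mcf * loglik_null)
r2_mcf_from_cs = (n / 2.0) * math.log(1.0 - r2_cs) / loglik_null

print(f"R2_McF={r2_mcf:.4f}  recovered from R2_CS: {r2_mcf_from_cs:.4f}")
print(f"R2_CS={r2_cs:.4f}  recovered from R2_McF: {r2_cs_from_mcf:.4f}")
```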

$R_{\text{T}}^{2}$ by Tjur

Allison^[2] prefers $R_{\text{T}}^{2}$, a relatively new measure developed by Tjur.^[6] It can be calculated in two steps:

For each level of the dependent variable (event and non-event), find the mean of the predicted probabilities of an event.
Take the absolute value of the difference between these means.
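The two steps can be sketched as follows, assuming a binary dependent variable coded 0/1. The data and predicted probabilities are hypothetical and do not come from an actual model fit.

```python
def r2_tjur(y, p):
    """Absolute difference between the mean predicted probability of an event
    among observed events and among observed non-events."""
    p_events = [pi for yi, pi in zip(y, p) if yi == 1]
    p_nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    # Step 1: mean predicted probability within each level of the outcome.
    mean_events = sum(p_events) / len(p_events)
    mean_nonevents = sum(p_nonevents) / len(p_nonevents)
    # Step 2: absolute value of the difference between the two means.
    return abs(mean_events - mean_nonevents)

# Hypothetical outcomes and predicted probabilities of an event.
y = [1, 1, 1, 0, 0, 0, 1, 0]
p = [0.9, 0.8, 0.6, 0.3, 0.2, 0.4, 0.7, 0.1]

r2_t = r2_tjur(y, p)
print(f"R2_T={r2_t:.3f}")
```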

Interpretation

A word of caution is in order when interpreting pseudo-$R^2$ statistics. These indices of fit are referred to as pseudo-$R^2$ because they do not represent the proportionate reduction in error as the $R^2$ in linear regression does.^[1] Linear regression assumes homoscedasticity, that is, that the error variance is the same for all values of the criterion. Logistic regression will always be heteroscedastic: the error variances differ for each value of the predicted score. For each value of the predicted score there would be a different value of the proportionate reduction in error. Therefore, it is inappropriate to think of $R^2$ as a proportionate reduction in error in a universal sense in logistic regression.^[1]
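The heteroscedasticity can be seen from the variance of a binary outcome, which is $p(1-p)$ when the probability of an event is $p$. A short sketch with hypothetical predicted probabilities:

```python
def bernoulli_variance(p):
    """Variance of a 0/1 outcome whose event probability is p."""
    return p * (1.0 - p)

# Hypothetical predicted probabilities spanning the unit interval.
predicted = [0.1, 0.3, 0.5, 0.7, 0.9]
variances = [bernoulli_variance(p) for p in predicted]

for p, v in zip(predicted, variances):
    print(f"predicted p={p:.1f}  error variance={v:.2f}")
```

The variance is largest at $p=0.5$ and shrinks toward zero as the predicted probability approaches 0 or 1, so no single error variance applies across all predicted scores.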

References

[1] Cohen, Jacob; Cohen, Patricia; West, Steven G.; Aiken, Leona S. (2002). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd ed.). Routledge. p. 502. ISBN 978-0-8058-2223-6.
[2] Allison, Paul D. "Measures of fit for logistic regression" (PDF). Statistical Horizons LLC and the University of Pennsylvania.
[3] Menard, Scott W. (2002). Applied Logistic Regression (2nd ed.). SAGE. ISBN 978-0-7619-2208-7.
[4] Nagelkerke, N. J. D. (1991). "A Note on a General Definition of the Coefficient of Determination". Biometrika. 78 (3): 691–692. doi:10.2307/2337038.
[5] Hardin, J. W.; Hilbe, J. M. (2007). Generalized Linear Models and Extensions. USA: Taylor & Francis. p. 60.
[6] Tjur, Tue (2009). "Coefficients of determination in logistic regression models". American Statistician. 63 (4): 366–372. doi:10.1198/tast.2009.08210. S2CID 121927418.
