To better understand how to interpret polynomial terms, let’s look at three plausible examples:
##
## Call:
## lm(formula = health ~ exercise + I(exercise^2), data = d1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.92735 -0.59522 0.00662 0.74671 2.22454
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.13438 0.29274 3.875 0.000194 ***
## exercise 1.84092 0.17757 10.367 < 2e-16 ***
## I(exercise^2) -0.17177 0.02182 -7.873 4.98e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.02 on 97 degrees of freedom
## Multiple R-squared: 0.6524, Adjusted R-squared: 0.6452
## F-statistic: 91.02 on 2 and 97 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = fail_exam ~ anxiety + I(anxiety^2), data = d2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.9224 -3.8807 0.0049 3.6602 11.9476
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.81460 1.85273 10.15 < 2e-16 ***
## anxiety -4.97617 0.80918 -6.15 1.73e-08 ***
## I(anxiety^2) 1.02500 0.07529 13.62 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.08 on 97 degrees of freedom
## Multiple R-squared: 0.9221, Adjusted R-squared: 0.9205
## F-statistic: 574.3 on 2 and 97 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = happiness ~ love_R + I(love_R^2), data = d3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.4666 -3.8456 -0.0085 3.7124 14.2427
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.16588 1.43464 2.207 0.0297 *
## love_R 3.39067 0.68967 4.916 3.59e-06 ***
## I(love_R^2) 0.47874 0.06727 7.117 1.91e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.209 on 97 degrees of freedom
## Multiple R-squared: 0.9592, Adjusted R-squared: 0.9583
## F-statistic: 1139 on 2 and 97 DF, p-value: < 2.2e-16
To interpret these models, we have three main options:
To understand these options, we’ll apply each one to each of the examples above.
##
## Call:
## lm(formula = health ~ exercise + I(exercise^2), data = d1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.92735 -0.59522 0.00662 0.74671 2.22454
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.13438 0.29274 3.875 0.000194 ***
## exercise 1.84092 0.17757 10.367 < 2e-16 ***
## I(exercise^2) -0.17177 0.02182 -7.873 4.98e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.02 on 97 degrees of freedom
## Multiple R-squared: 0.6524, Adjusted R-squared: 0.6452
## F-statistic: 91.02 on 2 and 97 DF, p-value: < 2.2e-16
To interpret this one, we can:
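One option is to work directly from the printed coefficients. The sketch below is a minimal illustration, not part of the original analysis: the names `b0`, `b1`, `b2`, `slope_at`, and `turning_point` are ours, and the exercise values passed to `slope_at()` are assumed to fall within the observed range. It uses the derivative of the quadratic to get the slope at a given level of exercise and the level at which the fitted curve peaks.

```r
# Coefficients copied from the summary output above
b0 <- 1.13438   # (Intercept)
b1 <- 1.84092   # exercise
b2 <- -0.17177  # I(exercise^2)

# Slope of the fitted curve at a given exercise level:
# d(health) / d(exercise) = b1 + 2 * b2 * exercise
slope_at <- function(x) b1 + 2 * b2 * x

# Exercise level where the slope is zero (a peak, because b2 is negative)
turning_point <- -b1 / (2 * b2)

slope_at(c(1, 5, 8))  # positive at low exercise, negative past the peak
turning_point         # roughly 5.4
```

Under this model, more exercise is associated with better health up to about 5.4 units, after which predicted health declines. It is worth checking that the turning point lies inside the observed range of exercise before reading much into it.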
##
## Call:
## lm(formula = fail_exam ~ anxiety + I(anxiety^2), data = d2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.9224 -3.8807 0.0049 3.6602 11.9476
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.81460 1.85273 10.15 < 2e-16 ***
## anxiety -4.97617 0.80918 -6.15 1.73e-08 ***
## I(anxiety^2) 1.02500 0.07529 13.62 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.08 on 97 degrees of freedom
## Multiple R-squared: 0.9221, Adjusted R-squared: 0.9205
## F-statistic: 574.3 on 2 and 97 DF, p-value: < 2.2e-16
To interpret this one, we can:
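One option is to compute predicted values over a grid of anxiety scores using the printed coefficients. This is a rough sketch: the grid `0:10` is an assumed range (the actual range of anxiety in `d2` is not shown here), and `predict_fail` and `lowest_point` are illustrative names.

```r
# Coefficients copied from the summary output above
b0 <- 18.81460  # (Intercept)
b1 <- -4.97617  # anxiety
b2 <- 1.02500   # I(anxiety^2)

# Predicted outcome at a given anxiety score
predict_fail <- function(anxiety) b0 + b1 * anxiety + b2 * anxiety^2

# Assumed grid of anxiety scores (hypothetical range)
anxiety_grid <- 0:10
data.frame(anxiety = anxiety_grid,
           predicted = predict_fail(anxiety_grid))

# Anxiety score with the lowest predicted value (a minimum, because b2 > 0)
lowest_point <- -b1 / (2 * b2)
lowest_point  # roughly 2.4
```

Because the squared term is positive, the curve is U-shaped: predictions fall slightly at low anxiety, bottom out near 2.4, and then rise at an increasing rate.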
##
## Call:
## lm(formula = happiness ~ love_R + I(love_R^2), data = d3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.4666 -3.8456 -0.0085 3.7124 14.2427
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.16588 1.43464 2.207 0.0297 *
## love_R 3.39067 0.68967 4.916 3.59e-06 ***
## I(love_R^2) 0.47874 0.06727 7.117 1.91e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.209 on 97 degrees of freedom
## Multiple R-squared: 0.9592, Adjusted R-squared: 0.9583
## F-statistic: 1139 on 2 and 97 DF, p-value: < 2.2e-16
To interpret this one, we can:
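One option is to check where the turning point of the curve falls. In this sketch (names are illustrative, and we assume `love_R` takes non-negative values, which the output above does not confirm), the turning point turns out to be negative, so over non-negative values the slope is always positive and keeps growing.

```r
# Coefficients copied from the summary output above
b1 <- 3.39067  # love_R
b2 <- 0.47874  # I(love_R^2)

# Turning point of the quadratic; negative here, about -3.5
turning_point <- -b1 / (2 * b2)

# Slope of the fitted curve at a given value of love_R
slope_at <- function(x) b1 + 2 * b2 * x

turning_point
slope_at(c(0, 5, 10))  # positive everywhere on this grid, and increasing
```

If the assumption about the range holds, the model describes an accelerating positive relationship rather than a curve that reverses direction.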
To better understand how to interpret interactions, let’s look at four plausible examples:
##
## Call:
## lm(formula = health ~ hike * diet, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3915 -0.4668 -0.0226 0.6060 1.9511
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.9989 0.1993 5.012 2.46e-06 ***
## hike1 -1.1947 0.2913 -4.101 8.61e-05 ***
## diet1 0.9322 0.2679 3.480 0.000755 ***
## hike1:diet1 1.9081 0.4025 4.741 7.39e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9965 on 96 degrees of freedom
## Multiple R-squared: 0.519, Adjusted R-squared: 0.504
## F-statistic: 34.53 on 3 and 96 DF, p-value: 3.173e-15
##
## Call:
## lm(formula = health ~ exercise * diet, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.09249 -0.61860 -0.04806 0.57464 2.35702
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.12427 0.24837 4.527 1.72e-05 ***
## exercise -1.02349 0.05904 -17.337 < 2e-16 ***
## diet1 -0.92545 0.35957 -2.574 0.0116 *
## exercise:diet1 2.01677 0.08832 22.834 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8964 on 96 degrees of freedom
## Multiple R-squared: 0.9468, Adjusted R-squared: 0.9452
## F-statistic: 570 on 3 and 96 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = health ~ exercise * hours_sleep, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.04429 -0.51506 0.01136 0.53872 2.85592
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.48299 0.64278 2.307 0.0232 *
## exercise -8.05161 0.14115 -57.044 <2e-16 ***
## hours_sleep -1.05469 0.07477 -14.106 <2e-16 ***
## exercise:hours_sleep 2.00446 0.01646 121.764 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8871 on 96 degrees of freedom
## Multiple R-squared: 0.9989, Adjusted R-squared: 0.9989
## F-statistic: 2.997e+04 on 3 and 96 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = health ~ hours_sleep * location, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.40304 -0.65048 -0.07422 0.52715 2.19050
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.00735 0.85453 -2.349 0.0210 *
## hours_sleep 3.14136 0.09839 31.928 < 2e-16 ***
## locationSubmarine -2.43410 1.13202 -2.150 0.0342 *
## locationurban -9.58839 1.15442 -8.306 8.30e-13 ***
## locationrural -13.38270 0.99426 -13.460 < 2e-16 ***
## hours_sleep:locationSubmarine 0.97182 0.13658 7.115 2.38e-10 ***
## hours_sleep:locationurban 4.94159 0.13435 36.781 < 2e-16 ***
## hours_sleep:locationrural 6.88204 0.11771 58.467 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9059 on 92 degrees of freedom
## Multiple R-squared: 0.9986, Adjusted R-squared: 0.9985
## F-statistic: 9460 on 7 and 92 DF, p-value: < 2.2e-16
To interpret these models, we have two main options:
To understand these options, we’ll apply each one to each of the examples above.
##
## Call:
## lm(formula = health ~ hike * diet, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3915 -0.4668 -0.0226 0.6060 1.9511
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.9989 0.1993 5.012 2.46e-06 ***
## hike1 -1.1947 0.2913 -4.101 8.61e-05 ***
## diet1 0.9322 0.2679 3.480 0.000755 ***
## hike1:diet1 1.9081 0.4025 4.741 7.39e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9965 on 96 degrees of freedom
## Multiple R-squared: 0.519, Adjusted R-squared: 0.504
## F-statistic: 34.53 on 3 and 96 DF, p-value: 3.173e-15
Overall, there is a significant interaction between hiking and dieting on health (b = 1.91, p < .001): the effect of hiking on health depends on whether the individual is also dieting.
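With two binary predictors, the interaction is easiest to read as four predicted group means. The sketch below rebuilds them from the printed coefficients; it assumes that `hike1` and `diet1` indicate level 1 of each factor against a reference level of 0, and the object names are ours.

```r
# Coefficients copied from the summary output above
b <- c(intercept = 0.9989, hike = -1.1947, diet = 0.9322, hike_diet = 1.9081)

# All four combinations of hiking (0/1) and dieting (0/1)
cells <- expand.grid(hike = c(0, 1), diet = c(0, 1))
cells$predicted <- with(cells,
  b[["intercept"]] + b[["hike"]] * hike + b[["diet"]] * diet +
    b[["hike_diet"]] * hike * diet)
cells

# Effect of hiking within each diet group
hike_effect_no_diet <- b[["hike"]]                     # about -1.19
hike_effect_diet    <- b[["hike"]] + b[["hike_diet"]]  # about 0.71
```

Read this way, hiking is associated with lower predicted health among non-dieters and higher predicted health among dieters, which is what the positive interaction term encodes.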
##
## Call:
## lm(formula = health ~ exercise * diet, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.09249 -0.61860 -0.04806 0.57464 2.35702
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.12427 0.24837 4.527 1.72e-05 ***
## exercise -1.02349 0.05904 -17.337 < 2e-16 ***
## diet1 -0.92545 0.35957 -2.574 0.0116 *
## exercise:diet1 2.01677 0.08832 22.834 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8964 on 96 degrees of freedom
## Multiple R-squared: 0.9468, Adjusted R-squared: 0.9452
## F-statistic: 570 on 3 and 96 DF, p-value: < 2.2e-16
Overall, there is a significant interaction between exercise and diet on health (b = 2.02, p < .001): the slope of exercise differs between those who diet and those who do not.
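When a continuous predictor interacts with a binary one, each group gets its own regression line. As a sketch built only from the printed coefficients (assuming `diet1` marks dieters against a reference group of non-dieters; the names below are illustrative), the two lines are:

```r
# Coefficients copied from the summary output above
b_intercept   <- 1.12427
b_exercise    <- -1.02349
b_diet        <- -0.92545
b_interaction <- 2.01677

# One intercept and one exercise slope per diet group
simple_lines <- data.frame(
  diet      = c(0, 1),
  intercept = c(b_intercept, b_intercept + b_diet),
  slope     = c(b_exercise, b_exercise + b_interaction)
)
simple_lines
```

The exercise slope is about -1.02 for non-dieters and about 0.99 for dieters, so the direction of the exercise effect flips between the two groups.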
##
## Call:
## lm(formula = health ~ exercise * hours_sleep, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.04429 -0.51506 0.01136 0.53872 2.85592
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.48299 0.64278 2.307 0.0232 *
## exercise -8.05161 0.14115 -57.044 <2e-16 ***
## hours_sleep -1.05469 0.07477 -14.106 <2e-16 ***
## exercise:hours_sleep 2.00446 0.01646 121.764 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8871 on 96 degrees of freedom
## Multiple R-squared: 0.9989, Adjusted R-squared: 0.9989
## F-statistic: 2.997e+04 on 3 and 96 DF, p-value: < 2.2e-16
## Call:
## probemod::jn(model = mod1, dv = "health", iv = "exercise", mod = "hours_sleep")
##
## Conditional effects of exercise on health at values of hours_sleep
## hours_sleep Effect se t p llci ulci
## 1 -6.0471 0.1255 -48.2031 0.0000 -6.2962 -5.7981
## 2 -4.0427 0.1100 -36.7583 0.0000 -4.2610 -3.8243
## 3 -2.0382 0.0948 -21.4904 0.0000 -2.2265 -1.8499
## 4 -0.0338 0.0802 -0.4208 0.6749 -0.1930 0.1255
## 5 1.9707 0.0665 29.6411 0.0000 1.8387 2.1027
## 6 3.9752 0.0543 73.2402 0.0000 3.8674 4.0829
## 7 5.9796 0.0449 133.2640 0.0000 5.8905 6.0687
## 8 7.9841 0.0403 198.1920 0.0000 7.9041 8.0641
The model shows that there is a significant interaction between exercise and sleep on health (p < .001).
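The `Effect` column in the table above can be recovered from the model coefficients, because the conditional effect of exercise is the exercise coefficient plus the interaction coefficient times hours of sleep. A minimal sketch (function and object names are ours; the standard errors in the table additionally need the coefficient covariance matrix, which is not reproduced here):

```r
# Coefficients copied from the summary output above
b_exercise    <- -8.05161
b_interaction <- 2.00446

# Conditional effect of exercise on health at a given amount of sleep
conditional_effect <- function(hours_sleep) {
  b_exercise + b_interaction * hours_sleep
}

# Should agree with the Effect column up to rounding of the coefficients
conditional_effect(1:8)

# Hours of sleep at which the effect of exercise changes sign
crossover <- -b_exercise / b_interaction
crossover  # roughly 4
```

This matches the table: the effect of exercise is negative below about 4 hours of sleep, not distinguishable from zero at 4 hours (p = .67), and positive above that.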
##
## Call:
## lm(formula = health ~ hours_sleep * location, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.40304 -0.65048 -0.07422 0.52715 2.19050
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.00735 0.85453 -2.349 0.0210 *
## hours_sleep 3.14136 0.09839 31.928 < 2e-16 ***
## locationSubmarine -2.43410 1.13202 -2.150 0.0342 *
## locationurban -9.58839 1.15442 -8.306 8.30e-13 ***
## locationrural -13.38270 0.99426 -13.460 < 2e-16 ***
## hours_sleep:locationSubmarine 0.97182 0.13658 7.115 2.38e-10 ***
## hours_sleep:locationurban 4.94159 0.13435 36.781 < 2e-16 ***
## hours_sleep:locationrural 6.88204 0.11771 58.467 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9059 on 92 degrees of freedom
## Multiple R-squared: 0.9986, Adjusted R-squared: 0.9985
## F-statistic: 9460 on 7 and 92 DF, p-value: < 2.2e-16
There is a significant interaction: the effect of sleep on health depends on the location in which the individual lives (all interaction p values < .001).
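To see how much the effect of sleep differs by location, the slope for each location can be computed by adding its interaction term to the `hours_sleep` coefficient. In this sketch the reference location is labeled `reference` because the output above does not name it; the other names come from the coefficient labels.

```r
# Slope of hours_sleep in the reference location (from the output above)
b_sleep <- 3.14136

# Interaction terms: how each location's slope differs from the reference
interaction_terms <- c(Submarine = 0.97182, urban = 4.94159, rural = 6.88204)

# Sleep slope within each location
slopes <- c(reference = b_sleep, b_sleep + interaction_terms)
slopes
```

Each additional hour of sleep is associated with roughly 3.1 more units of health in the reference location, about 4.1 in the submarine group, 8.1 in the urban group, and 10.0 in the rural group.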
Whenever a variable has a non-linear effect (a polynomial, an interaction, or the soon-to-be-discussed GLMs), average marginal effects (AMEs) can simplify interpretation by expressing the effect on the same scale as a regular regression coefficient.
For example, the average effect of exercise can be estimated as follows:
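library(margins)
library(magrittr)  # provides the %>% pipe used below
# Quadratic model of health on exercise
mod <- lm(health ~ exercise + I(exercise^2),
          data = d1)
# Average marginal effect of exercise with its standard error and interval
margins(mod) %>%
  summary()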
## factor AME SE z p lower upper
## exercise 0.5944 0.0465 12.7789 0.0000 0.5032 0.6856
This means that, averaged across the observed values of exercise, a one-unit increase in exercise is associated with a 0.59-unit increase in health (p < .001).
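For a quadratic model, the AME can also be computed by hand, which shows what the number represents: the slope of the fitted curve is evaluated at every observed value of the predictor and then averaged. The sketch below uses simulated data as a stand-in, since the original `d1` is not reproduced here; the data-generating values, the seed, and all object names are assumptions for illustration only.

```r
set.seed(1)

# Simulated stand-in for d1 (hypothetical values, not the original data)
sim <- data.frame(exercise = runif(100, min = 0, max = 8))
sim$health <- 1 + 1.8 * sim$exercise - 0.17 * sim$exercise^2 + rnorm(100)

fit <- lm(health ~ exercise + I(exercise^2), data = sim)
b <- coef(fit)

# Marginal effect for each observation: derivative of the fitted quadratic
marginal_effects <- b[["exercise"]] + 2 * b[["I(exercise^2)"]] * sim$exercise

# Average marginal effect: the mean of those observation-level slopes
ame <- mean(marginal_effects)
ame
```

Because the derivative is linear in exercise, this average equals the slope evaluated at the mean of exercise. On the same fitted model, the value should agree closely with the AME reported by `summary(margins(fit))`, which relies on numerical derivatives.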