Introduction

Chapter 13 and 14 cover interactions (also known as moderation), including their use, interpretation, and limitations. The use in R is straightforward as we’ll highlight below.

Interactions (Moderation)

  1. Let’s start by loading the tidyverse package, the furniture package, and the rlm package.
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────────── tidyverse 1.2.1.9000 ──
## ✔ ggplot2 2.2.1.9000     ✔ purrr   0.2.5     
## ✔ tibble  1.4.2          ✔ dplyr   0.7.6     
## ✔ tidyr   0.8.1          ✔ stringr 1.3.1     
## ✔ readr   1.1.1          ✔ forcats 0.3.0
## Warning: package 'dplyr' was built under R version 3.5.1
## ── Conflicts ────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(furniture)
## ── furniture 1.7.12 ────────────────────────────────────────────────────────── learn more at tysonbarrett.com ──
## ✔ furniture attached
## ✖ The furniture::table1() function has the same name as tidyr::table1 (tbl_df)
##    Consider using `furniture::` for each function call.
library(rlm)
## ── rlm 0.1.0 ───────────────────────────────────────────────────────────────── learn more at tysonbarrett.com ──
## ✔ rlm attached
## ✔ No potential conflicts found
  1. Import the hosp data set into R.
data(hosp)
  1. Because when a data set was imported from SPSS (as this one originally was), there are labelled variable types, we need to adjust these using a little loop based on whether that variable is labelled.
hosp[] <- hosp %>%
  map_if(labelled::is.labelled, ~as_factor(.x))
  1. Interactions are often either theory-driven or data-driven. Of the two, theory-driven is often the most respected reason to test and report on an interaction. Herein, we’ll explore some relationships with visualizations that we’ll then test (since we don’t have any good theories about this data set).
hosp %>%
  ggplot(aes(exhaust, safety, group = sex, color = sex)) +
    geom_point(alpha = .6) +
    geom_smooth(method = "lm")

  1. There looks like there could be an interaction between exhaust and sex on safety. To test this, let’s run a linear regression model with the cross product of safety and sex. R is smart enough that if we put exhaust * sex in the formula, it will include all the right effects for us.
hosp %>%
  lm(safety ~ exhaust * sex, data = .) %>%
  summary()
## 
## Call:
## lm(formula = safety ~ exhaust * sex, data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.23439 -0.61719  0.00054  0.74033  2.61092 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      3.32847    0.27543  12.085  < 2e-16 ***
## exhaust          0.13674    0.07956   1.719  0.08673 .  
## sexmale         -0.45969    0.40175  -1.144  0.25345    
## exhaust:sexmale  0.31860    0.11456   2.781  0.00576 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9816 on 296 degrees of freedom
## Multiple R-squared:  0.1812, Adjusted R-squared:  0.1729 
## F-statistic: 21.83 on 3 and 296 DF,  p-value: 8.46e-13

Interpretation

  1. If exhaust is how exhausted the individual feels, safety is the amount the individual works around safety protocols (avoids performing the full safety protocol) and sex is the biological sex of the individual (0 = female, 1 = male), what is the interpretation of each coefficient?
  2. Does this interpretation match the figure we created before? Why or why not?
  3. Let’s include some covariates in the model; namely, tenure (the job experience the individual has) and age (the individual’s age in years).
hosp %>%
  lm(safety ~ exhaust * sex + tenure + age, data = .) %>%
  summary()
## 
## Call:
## lm(formula = safety ~ exhaust * sex + tenure + age, data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.26295 -0.60779 -0.00117  0.66936  2.17351 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      5.377140   0.504865  10.651  < 2e-16 ***
## exhaust          0.067240   0.077642   0.866 0.387184    
## sexmale         -1.293346   0.422770  -3.059 0.002424 ** 
## tenure          -0.096794   0.027183  -3.561 0.000431 ***
## age             -0.022167   0.008041  -2.757 0.006201 ** 
## exhaust:sexmale  0.401353   0.111166   3.610 0.000359 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9423 on 294 degrees of freedom
## Multiple R-squared:  0.2505, Adjusted R-squared:  0.2377 
## F-statistic: 19.65 on 5 and 294 DF,  p-value: < 2.2e-16
  1. First, is the interaction still significant? Did it decrease or increase with the covariates added?
  2. Next, what does the coefficient on exhaust mean here? What about sex?
  3. Finally, interpret both tenure and age.

Conditional Effects

  1. Given that the coefficient in the previous model on sex was interpretted when exhaust = 0 (something not actually seen in the data), let’s look at the difference between males and females when exhaust is at its mean value.
hosp %>%
  mutate(exhaust = exhaust - mean(exhaust, na.rm=TRUE)) %>%
  lm(safety ~ exhaust * sex + tenure + age, data = .) %>%
  summary()
## 
## Call:
## lm(formula = safety ~ exhaust * sex + tenure + age, data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.26295 -0.60779 -0.00117  0.66936  2.17351 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      5.603221   0.391249  14.321  < 2e-16 ***
## exhaust          0.067240   0.077642   0.866 0.387184    
## sexmale          0.056135   0.171171   0.328 0.743183    
## tenure          -0.096794   0.027183  -3.561 0.000431 ***
## age             -0.022167   0.008041  -2.757 0.006201 ** 
## exhaust:sexmale  0.401353   0.111166   3.610 0.000359 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9423 on 294 degrees of freedom
## Multiple R-squared:  0.2505, Adjusted R-squared:  0.2377 
## F-statistic: 19.65 on 5 and 294 DF,  p-value: < 2.2e-16
  1. What does the coefficient on sex mean now? Compare it to the previously run model. What is different?

Probing Interactions

  1. Most aspects of probing an interaction are not necessary but you may come across reviewers that are… let’s say… less than helpful that may ask for some probing. We will use the Johnson-Neyman Technique for this as it should apply in several situations. The jn() function that rlm re-exports for us can do this. For this method to be useful, we need an interaction between two continuous variables. Here, we are using the book’s example of exhaust * tenure. Note that we use this same interaction model for the average marginal effects below.
interaction_model <- lm(safety ~ exhaust * tenure + age + sex, data = hosp)
jn(interaction_model, dv = "safety", iv = "exhaust", mod = "tenure")
## Call:
## jn(model = interaction_model, dv = "safety", iv = "exhaust", 
##     mod = "tenure")
## 
## Conditional effects of  exhaust  on  safety  at values of  tenure 
##  tenure  Effect     se       t      p    llci    ulci
##       1  0.5605 0.0956  5.8652 0.0000  0.3724  0.7485
##       2  0.4986 0.0828  6.0220 0.0000  0.3356  0.6615
##       3  0.4367 0.0715  6.1097 0.0000  0.2960  0.5774
##       4  0.3748 0.0624  6.0067 0.0000  0.2520  0.4976
##       5  0.3129 0.0566  5.5239 0.0000  0.2014  0.4244
##       6  0.2510 0.0553  4.5414 0.0000  0.1422  0.3598
##       7  0.1891 0.0586  3.2283 0.0014  0.0738  0.3044
##       8  0.1272 0.0659  1.9314 0.0544 -0.0024  0.2569
##       9  0.0653 0.0760  0.8597 0.3907 -0.0843  0.2150
##      10  0.0035 0.0880  0.0393 0.9687 -0.1698  0.1767
##      11 -0.0584 0.1012 -0.5772 0.5642 -0.2577  0.1408
##      12 -0.1203 0.1152 -1.0442 0.2973 -0.3471  0.1065
##      13 -0.1822 0.1298 -1.4041 0.1613 -0.4376  0.0732
##      14 -0.2441 0.1447 -1.6872 0.0926 -0.5288  0.0406
##      15 -0.3060 0.1599 -1.9141 0.0566 -0.6206  0.0086
##      16 -0.3679 0.1752 -2.0993 0.0366 -0.7128 -0.0230
##      17 -0.4298 0.1908 -2.2528 0.0250 -0.8052 -0.0543
##      18 -0.4917 0.2064 -2.3819 0.0179 -0.8979 -0.0854
  1. For what values of tenure is exhaust significantly different than zero? Where does the effect of exhaust turn negative?
  2. Do these results agree with the following figure? Why or Why not?
hosp %>%
  mutate(tenure_groups = cut_number(tenure, 6)) %>%
  ggplot(aes(exhaust, safety, group = tenure_groups, color = tenure_groups)) +
    geom_point(alpha = .6) +
    geom_smooth(method = "lm",
                se = FALSE,
                fullrange = TRUE) +
    labs(color = "Tenure") +
    scale_color_hue() +
    theme_bw()