Introduction

Chapter 15 discussed mediation analysis, a way of understanding the process of an effect from one variable to another. Mediation is built on linear regression as we’ve discussed but combines paths to create a single conceptual model.

  1. Let’s start by loading the tidyverse package, the furniture package, the lavaan and the MarginalMediation package.
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────────── tidyverse 1.2.1.9000 ──
## ✔ ggplot2 2.2.1.9000     ✔ purrr   0.2.5     
## ✔ tibble  1.4.2          ✔ dplyr   0.7.6     
## ✔ tidyr   0.8.1          ✔ stringr 1.3.1     
## ✔ readr   1.1.1          ✔ forcats 0.3.0
## Warning: package 'dplyr' was built under R version 3.5.1
## ── Conflicts ────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(furniture)
## ── furniture 1.7.12 ────────────────────────────────────────────────────────── learn more at tysonbarrett.com ──
## ✔ furniture attached
## ✖ The furniture::table1() function has the same name as tidyr::table1 (tbl_df)
##    Consider using `furniture::` for each function call.
library(lavaan)
## This is lavaan 0.6-1
## lavaan is BETA software! Please report any bugs.
library(MarginalMediation)
## ── MarginalMediation 0.6.0 ─────────────────────────────────────────────────── learn more at tysonbarrett.com ──
## ✔ MarginalMediation attached
## ✔ No potential conflicts found
  1. Import the FacialBurns data set found in lavaan into R.
data(FacialBurns)
str(FacialBurns)
## 'data.frame':    77 obs. of  6 variables:
##  $ Selfesteem: int  35 40 38 30 27 35 35 28 29 26 ...
##  $ HADS      : int  6 2 2 4 11 5 4 13 19 16 ...
##  $ Age       : int  23 61 34 29 46 18 64 20 47 41 ...
##  $ TBSA      : num  3.5 6 8 6 19.2 ...
##  $ RUM       : num  5 4 4 5 13 8 6 8 13 8 ...
##  $ Sex       : int  1 1 1 1 1 1 1 1 1 1 ...

Mediation

  1. We hypothesize that anxiety (HADS) mediates the effect of TBSA on Selfesteem. Let’s test this out first with lavaan. We will create the model object with two regression models and then we define the indirect effect and the total effect from the coefficients of the two regressions.
med_model <- "
   ## a path
   HADS ~ a*TBSA
   ## b and c' paths
   Selfesteem ~ b*HADS + c*TBSA

   ## Define Indirect and Total Effects
   ind := a * b
   tot := a * b + c
"
sem(med_model, data = FacialBurns, se = "bootstrap", bootstrap = 500) %>% 
  summary()
## lavaan (0.6-1) converged normally after  13 iterations
## 
##   Number of observations                            77
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
##   Minimum Function Value               0.0000000000000
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                            Bootstrap
##   Number of requested bootstrap draws              500
##   Number of successful bootstrap draws             500
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   HADS ~                                              
##     TBSA       (a)    0.281    0.109    2.577    0.010
##   Selfesteem ~                                        
##     HADS       (b)   -0.436    0.106   -4.125    0.000
##     TBSA       (c)   -0.180    0.063   -2.834    0.005
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .HADS             47.174    9.605    4.911    0.000
##    .Selfesteem       18.698    3.353    5.577    0.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     ind              -0.122    0.047   -2.625    0.009
##     tot              -0.302    0.063   -4.817    0.000
  1. The output gives us the a, b, and c paths (as we defined them) along with the defined parameters ind and tot. Is the indirect path (from TBSA -> HADS -> Self-esteem) significant? What does it mean here?
  2. Let’s get the standardized indirect and total effects.
sem(med_model, data = FacialBurns) %>% 
  summary(standardized = TRUE)
## lavaan (0.6-1) converged normally after  13 iterations
## 
##   Number of observations                            77
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
##   Minimum Function Value               0.0000000000000
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   HADS ~                                                                
##     TBSA       (a)    0.281    0.108    2.600    0.009    0.281    0.284
##   Selfesteem ~                                                          
##     HADS       (b)   -0.436    0.072   -6.075    0.000   -0.436   -0.548
##     TBSA       (c)   -0.180    0.071   -2.537    0.011   -0.180   -0.229
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .HADS             47.174    7.603    6.205    0.000   47.174    0.919
##    .Selfesteem       18.698    3.013    6.205    0.000   18.698    0.576
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ind              -0.122    0.051   -2.390    0.017   -0.122   -0.156
##     tot              -0.302    0.083   -3.655    0.000   -0.302   -0.385
  1. Interpet the standardized indirect effect. How much of the total effect is explained by the process through anxiety?
  2. We can get similar results using the MarginalMediation package that uses linear regression, although this provides us with confidence intervals instead of p-values. Notably, the standardized results will be slightly different given that lavaan standardizes all the outcomes and predictors whereas MarginalMediation only standardizes the outcome (since there can also be categorical predictors in many mediation models).
bcpath <- glm(Selfesteem ~ HADS + TBSA, data = FacialBurns)
apath <- glm(HADS ~ TBSA, data = FacialBurns)

mma(bcpath, apath,
    ind_effects = c("TBSA-HADS"))
## 
## calculating a paths... b and c paths... Done.
                                                                                 
## ┌───────────────────────────────┐
## │  Marginal Mediation Analysis  │
## └───────────────────────────────┘
## A marginal mediation model with:
##    1 mediators
##    1 indirect effects
##    1 direct effects
##    500 bootstrapped samples
##    95% confidence interval
##    n = 77 
## 
## Formulas:
##    ◌ Selfesteem ~ HADS + TBSA
##    ◌ HADS ~ TBSA 
## 
## Unstandardized Effects
## ⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺
## ── Indirect Effects ──
##            A-path   B-path Indirect    Lower    Upper
## TBSA-HADS 0.28052 -0.43583 -0.12226 -0.25526 -0.04199
## 
## ── Direct Effects ──
##        Direct    Lower    Upper
## TBSA -0.17974 -0.29681 -0.02496
## 
## 
## Standardized Effects
## ⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺
## ── Indirect Effects ──
##           Indirect    Lower    Upper
## TBSA-HADS -0.02132 -0.04451 -0.00732
## 
## ── Direct Effects ──
##        Direct    Lower    Upper
## TBSA -0.03134 -0.05176 -0.00435
## -----
  1. Because each part of the mediation is a linear model, we can assess the assumptions as we’ve done before.
par(mfrow = c(2,2))
plot(bcpath)

rlm::diagnostics(bcpath) %>% head(20)
##    Selfesteem HADS  TBSA     pred        resid       dfb.1_     dfb.HADS
## 1          35    6  3.50 34.58895   0.41105324  0.012960985 -0.001657402
## 2          40    2  6.00 35.88290   4.11709786  0.139472409 -0.091399317
## 3          38    2  8.00 35.52342   2.47658489  0.073303811 -0.060058633
## 4          30    4  6.00 35.01125  -5.01124506 -0.151252519  0.072416542
## 5          27   11 19.25 29.57884  -2.57884369  0.025545079  0.004737331
## 6          35    5  6.00 34.57542   0.42458349  0.011919758 -0.004453942
## 7          35    4 12.00 33.93278   1.06721604  0.018873615 -0.022195323
## 8          28   13  2.50 31.71789  -3.71789049 -0.079671896 -0.090183049
## 9          29   19  9.50 27.84471   1.15528536 -0.004831601  0.046257560
## 10         26   16  7.25 29.55662  -3.55662317 -0.021272391 -0.108797783
## 11         36    3  5.50 35.53695   0.46305465  0.015200321 -0.008174825
## 12         40    3  7.50 35.17746   4.82254168  0.139428137 -0.096119911
## 13         31    8  7.50 32.99832  -1.99831562 -0.039154996  0.001341013
## 14         36    7 10.50 32.89491   3.10508639  0.047501510 -0.023835686
## 15         22   10  7.00 32.21653 -10.21653030 -0.181039780 -0.079114940
## 16         35    3  9.00 34.90784   0.09215695  0.002359607 -0.001970088
## 17         34    2  6.50 35.79303  -1.79303038 -0.058572870  0.040558407
## 18         21    6 14.00 32.70164 -11.70163985 -0.122962711  0.188215760
## 19         36    0  1.50 37.56341  -1.56340504 -0.073987147  0.039585768
## 20         34    5  2.10 35.27642  -1.27641622 -0.046473281  0.008157308
##         dfb.TBSA        dffit     cov.r       cook.d        hat
## 1  -0.0082986979  0.014243753 1.0653855 6.854635e-05 0.02262020
## 2  -0.0266148316  0.152496937 1.0305333 7.763038e-03 0.02539166
## 3   0.0028418597  0.089949930 1.0540467 2.722001e-03 0.02464283
## 4   0.0432691601 -0.164963371 1.0073201 9.031548e-03 0.02015430
## 5  -0.0886660174 -0.114235195 1.0648924 4.388357e-03 0.03580649
## 6  -0.0040859180  0.013198028 1.0607397 5.885052e-05 0.01836152
## 7   0.0150202199  0.036888778 1.0630506 4.594355e-04 0.02249764
## 8   0.1196086195 -0.166256149 1.0488780 9.246874e-03 0.03626584
## 9  -0.0137434190  0.055665627 1.0841584 1.046023e-03 0.04198454
## 10  0.0609048341 -0.147574853 1.0465958 7.292129e-03 0.03155786
## 11 -0.0043460774  0.016222360 1.0658027 8.890977e-05 0.02309770
## 12 -0.0088921220  0.164095403 1.0126964 8.948519e-03 0.02150303
## 13  0.0144031402 -0.054337744 1.0477025 9.948722e-04 0.01413372
## 14  0.0159114404  0.085075508 1.0353685 2.429048e-03 0.01428797
## 15  0.1193167935 -0.305047639 0.8412928 2.912689e-02 0.01577810
## 16  0.0003527832  0.003125899 1.0647655 3.301679e-06 0.02171649
## 17  0.0081330079 -0.065553319 1.0610504 1.448709e-03 0.02500310
## 18 -0.2393857279 -0.421261505 0.7836450 5.413253e-02 0.02206803
## 19  0.0338439597 -0.074374436 1.0804135 1.865802e-03 0.04097890
## 20  0.0313116956 -0.049326428 1.0677176 8.211853e-04 0.02780632
  1. Do you see any issues or extreme values from the figures? What about the diagnostic statistics (for space, we are only showing the first 20 observations)?

Mediation for Categorical Outcomes (Marginal Mediation Analysis)

  1. A new approach to when the the mediator and/or outcome is categorical is called Marginal Mediation Analysis. To use this method, we will use the function mma() from MarginalMediation. To use it, we fit two separate models using glm(). But first, we are going to dichotomize the anxiety variable (just to demonstrate the method–don’t do this in practice).
FacialBurns <- FacialBurns %>%
  mutate(HADS_binary = cut_number(HADS, 2))

bcpath <- glm(Selfesteem ~ HADS_binary + TBSA, data = FacialBurns)
apath <- glm(HADS_binary ~ TBSA, data = FacialBurns, family = binomial(link = "logit"))

mma(bcpath, apath,
    ind_effects = c("TBSA-HADS_binary"))
## 
## calculating a paths... b and c paths... Done.
                                                                                 
## ┌───────────────────────────────┐
## │  Marginal Mediation Analysis  │
## └───────────────────────────────┘
## A marginal mediation model with:
##    1 mediators
##    1 indirect effects
##    1 direct effects
##    500 bootstrapped samples
##    95% confidence interval
##    n = 77 
## 
## Formulas:
##    ◌ Selfesteem ~ HADS_binary + TBSA
##    ◌ HADS_binary ~ TBSA 
## 
## Unstandardized Effects
## ⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺
## ── Indirect Effects ──
##                   A-path   B-path Indirect    Lower    Upper
## TBSA-HADS_binary 0.02194 -5.94592 -0.13047 -0.24003 -0.03645
## 
## ── Direct Effects ──
##        Direct    Lower   Upper
## TBSA -0.17823 -0.30025 -0.0306
## 
## 
## Standardized Effects
## ⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺
## ── Indirect Effects ──
##                  Indirect    Lower    Upper
## TBSA-HADS_binary -0.02275 -0.04186 -0.00636
## 
## ── Direct Effects ──
##        Direct    Lower    Upper
## TBSA -0.03108 -0.05236 -0.00534
## -----
  1. Both the unstandardized and Standardized effects are reported. Interpret the standardized indirect effect here.

Moderated-Mediation (Conditional Process Analysis)

  1. There are times that a combination of moderation and mediation are of interest. This can be a challenge but it is possible. Let’s use lavaan to see if sex moderates the effect from anxiety to self-esteem.
FacialBurns <- FacialBurns %>%
  mutate(HADS_Sex = HADS * Sex)
med_model <- "
   ## a path
   HADS ~ a*TBSA
   ## b and c' paths
   Selfesteem ~ b*HADS + c*TBSA + b2 * HADS_Sex

   ## Define Indirect effects for each sex
   ind_sex1 := a * b
   ind_sex2 := a * b + a * b2
"
sem(med_model, data = FacialBurns, se = "bootstrap", bootstrap = 500) %>% 
  summary()
## lavaan (0.6-1) converged normally after  14 iterations
## 
##   Number of observations                            77
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                     113.187
##   Degrees of freedom                                 1
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                            Bootstrap
##   Number of requested bootstrap draws              500
##   Number of successful bootstrap draws             500
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   HADS ~                                              
##     TBSA       (a)    0.281    0.105    2.675    0.007
##   Selfesteem ~                                        
##     HADS       (b)   -0.561    0.248   -2.260    0.024
##     TBSA       (c)   -0.187    0.069   -2.697    0.007
##     HADS_Sex  (b2)    0.087    0.133    0.650    0.516
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .HADS             47.174    9.803    4.812    0.000
##    .Selfesteem       18.477    3.291    5.614    0.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     ind_sex1         -0.157    0.086   -1.830    0.067
##     ind_sex2         -0.133    0.056   -2.383    0.017
  1. Did sex moderate the b path (see the p-value associated with the b2 path)?
  2. For more information on moderated mediation see nickmichalak.com

Conclusion

Mediation analysis is an important part of our field. It helps us understand how one variable affects another. Because each part of a mediation is essentially a linear regression (or a GLM), the things we have learned throughout the course apply to each piece of the entire conceptual model.