Introduction

Chapter 15 discussed mediation analysis, a way of understanding the process of an effect from one variable to another. Mediation is built on linear regression as we’ve discussed but combines paths to create a single conceptual model.

  1. Let’s start by loading the tidyverse package, the furniture package, the lavaan and the MarginalMediation package.
library(tidyverse)
library(furniture)
library(lavaan)
library(MarginalMediation)
  1. Import the FacialBurns data set found in lavaan into R.
data(FacialBurns)
str(FacialBurns)
## 'data.frame':    77 obs. of  6 variables:
##  $ Selfesteem: int  35 40 38 30 27 35 35 28 29 26 ...
##  $ HADS      : int  6 2 2 4 11 5 4 13 19 16 ...
##  $ Age       : int  23 61 34 29 46 18 64 20 47 41 ...
##  $ TBSA      : num  3.5 6 8 6 19.2 ...
##  $ RUM       : num  5 4 4 5 13 8 6 8 13 8 ...
##  $ Sex       : int  1 1 1 1 1 1 1 1 1 1 ...

Mediation

  1. We hypothesize that anxiety (HADS) mediates the effect of TBSA on Selfesteem. Let’s test this out first with lavaan. We will create the model object with two regression models and then we define the indirect effect and the total effect from the coefficients of the two regressions.
med_model <- "
   ## a path
   HADS ~ a*TBSA
   ## b and c' paths
   Selfesteem ~ b*HADS + c*TBSA

   ## Define Indirect and Total Effects
   ind := a * b
   tot := a * b + c
"
sem(med_model, data = FacialBurns, se = "bootstrap", bootstrap = 500) %>% 
  summary()
## lavaan 0.6-5 ended normally after 13 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                          5
##                                                       
##   Number of observations                            77
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws              500
##   Number of successful bootstrap draws             500
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   HADS ~                                              
##     TBSA       (a)    0.281    0.110    2.539    0.011
##   Selfesteem ~                                        
##     HADS       (b)   -0.436    0.100   -4.372    0.000
##     TBSA       (c)   -0.180    0.063   -2.859    0.004
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .HADS             47.174    9.856    4.786    0.000
##    .Selfesteem       18.698    3.231    5.788    0.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     ind              -0.122    0.046   -2.633    0.008
##     tot              -0.302    0.062   -4.887    0.000
  1. The output gives us the a, b, and c paths (as we defined them) along with the defined parameters ind and tot. Is the indirect path (from TBSA -> HADS -> Self-esteem) significant? What does it mean here?
  2. Let’s get the standardized indirect and total effects.
sem(med_model, data = FacialBurns) %>% 
  summary(standardized = TRUE)
## lavaan 0.6-5 ended normally after 13 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                          5
##                                                       
##   Number of observations                            77
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   HADS ~                                                                
##     TBSA       (a)    0.281    0.108    2.600    0.009    0.281    0.284
##   Selfesteem ~                                                          
##     HADS       (b)   -0.436    0.072   -6.075    0.000   -0.436   -0.548
##     TBSA       (c)   -0.180    0.071   -2.537    0.011   -0.180   -0.229
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .HADS             47.174    7.603    6.205    0.000   47.174    0.919
##    .Selfesteem       18.698    3.013    6.205    0.000   18.698    0.576
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ind              -0.122    0.051   -2.390    0.017   -0.122   -0.156
##     tot              -0.302    0.083   -3.655    0.000   -0.302   -0.385
  1. Interpet the standardized indirect effect. How much of the total effect is explained by the process through anxiety?
  2. We can get similar results using the MarginalMediation package that uses linear regression, although this provides us with confidence intervals instead of p-values. Notably, the standardized results will be slightly different given that lavaan standardizes all the outcomes and predictors whereas MarginalMediation only standardizes the outcome (since there can also be categorical predictors in many mediation models).
bcpath <- glm(Selfesteem ~ HADS + TBSA, data = FacialBurns)
apath <- glm(HADS ~ TBSA, data = FacialBurns)

mma(bcpath, apath,
    ind_effects = c("TBSA-HADS"))
## 
## calculating a paths... b and c paths... Done.
                                                                                 
## ┌───────────────────────────────┐
## │  Marginal Mediation Analysis  │
## └───────────────────────────────┘
## A marginal mediation model with:
##    1 mediators
##    1 indirect effects
##    1 direct effects
##    500 bootstrapped samples
##    95% confidence interval
##    n = 77 
## 
## Formulas:
##    ◌ Selfesteem ~ HADS + TBSA
##    ◌ HADS ~ TBSA 
## 
## Regression Models: 
## 
##      Selfesteem ~ 
##                          Est      SE   Est/SE P-Value
##         (Intercept) 37.83302 0.94718 39.94283 0.00000
##         HADS        -0.43583 0.07319 -5.95501 0.00000
##         TBSA        -0.17974 0.07227 -2.48700 0.01514
## 
##      HADS ~ 
##                         Est      SE  Est/SE P-Value
##         (Intercept) 6.07293 1.31966 4.60188 0.00002
##         TBSA        0.28052 0.10933 2.56579 0.01229
## 
## Unstandardized Mediated Effects: 
## 
##    Indirect Effects: 
## 
##      Selfesteem ~ 
##                      Indirect    Lower    Upper
##         TBSA => HADS -0.12226 -0.24347 -0.02311
## 
##    Direct Effects: 
## 
##      Selfesteem ~ 
##                Direct    Lower    Upper
##         TBSA -0.17974 -0.28505 -0.04651
## 
## 
## Standardized Mediated Effects: 
## 
##    Indirect Effects: 
## 
##      Selfesteem ~ 
##                      Indirect    Lower    Upper
##         TBSA => HADS -0.02132 -0.04246 -0.00403
## 
##    Direct Effects: 
## 
##      Selfesteem ~ 
##                Direct    Lower    Upper
##         TBSA -0.03134 -0.04971 -0.00811
  1. Because each part of the mediation is a linear model, we can assess the assumptions as we’ve done before.
par(mfrow = c(2,2))
plot(bcpath)

educ7610::diagnostics(bcpath) %>% head(20)
##    Selfesteem HADS  TBSA     pred        resid       dfb.1_     dfb.HADS
## 1          35    6  3.50 34.58895   0.41105324  0.012960985 -0.001657402
## 2          40    2  6.00 35.88290   4.11709786  0.139472409 -0.091399317
## 3          38    2  8.00 35.52342   2.47658489  0.073303811 -0.060058633
## 4          30    4  6.00 35.01125  -5.01124506 -0.151252519  0.072416542
## 5          27   11 19.25 29.57884  -2.57884369  0.025545079  0.004737331
## 6          35    5  6.00 34.57542   0.42458349  0.011919758 -0.004453942
## 7          35    4 12.00 33.93278   1.06721604  0.018873615 -0.022195323
## 8          28   13  2.50 31.71789  -3.71789049 -0.079671896 -0.090183049
## 9          29   19  9.50 27.84471   1.15528536 -0.004831601  0.046257560
## 10         26   16  7.25 29.55662  -3.55662317 -0.021272391 -0.108797783
## 11         36    3  5.50 35.53695   0.46305465  0.015200321 -0.008174825
## 12         40    3  7.50 35.17746   4.82254168  0.139428137 -0.096119911
## 13         31    8  7.50 32.99832  -1.99831562 -0.039154996  0.001341013
## 14         36    7 10.50 32.89491   3.10508639  0.047501510 -0.023835686
## 15         22   10  7.00 32.21653 -10.21653030 -0.181039780 -0.079114940
## 16         35    3  9.00 34.90784   0.09215695  0.002359607 -0.001970088
## 17         34    2  6.50 35.79303  -1.79303038 -0.058572870  0.040558407
## 18         21    6 14.00 32.70164 -11.70163985 -0.122962711  0.188215760
## 19         36    0  1.50 37.56341  -1.56340504 -0.073987147  0.039585768
## 20         34    5  2.10 35.27642  -1.27641622 -0.046473281  0.008157308
##         dfb.TBSA        dffit     cov.r       cook.d        hat
## 1  -0.0082986979  0.014243753 1.0653855 6.854635e-05 0.02262020
## 2  -0.0266148316  0.152496937 1.0305333 7.763038e-03 0.02539166
## 3   0.0028418597  0.089949930 1.0540467 2.722001e-03 0.02464283
## 4   0.0432691601 -0.164963371 1.0073201 9.031548e-03 0.02015430
## 5  -0.0886660174 -0.114235195 1.0648924 4.388357e-03 0.03580649
## 6  -0.0040859180  0.013198028 1.0607397 5.885052e-05 0.01836152
## 7   0.0150202199  0.036888778 1.0630506 4.594355e-04 0.02249764
## 8   0.1196086195 -0.166256149 1.0488780 9.246874e-03 0.03626584
## 9  -0.0137434190  0.055665627 1.0841584 1.046023e-03 0.04198454
## 10  0.0609048341 -0.147574853 1.0465958 7.292129e-03 0.03155786
## 11 -0.0043460774  0.016222360 1.0658027 8.890977e-05 0.02309770
## 12 -0.0088921220  0.164095403 1.0126964 8.948519e-03 0.02150303
## 13  0.0144031402 -0.054337744 1.0477025 9.948722e-04 0.01413372
## 14  0.0159114404  0.085075508 1.0353685 2.429048e-03 0.01428797
## 15  0.1193167935 -0.305047639 0.8412928 2.912689e-02 0.01577810
## 16  0.0003527832  0.003125899 1.0647655 3.301679e-06 0.02171649
## 17  0.0081330079 -0.065553319 1.0610504 1.448709e-03 0.02500310
## 18 -0.2393857279 -0.421261505 0.7836450 5.413253e-02 0.02206803
## 19  0.0338439597 -0.074374436 1.0804135 1.865802e-03 0.04097890
## 20  0.0313116956 -0.049326428 1.0677176 8.211853e-04 0.02780632
  1. Do you see any issues or extreme values from the figures? What about the diagnostic statistics (for space, we are only showing the first 20 observations)?

Mediation for Categorical Outcomes (Marginal Mediation Analysis)

  1. A new approach to when the the mediator and/or outcome is categorical is called Marginal Mediation Analysis. To use this method, we will use the function mma() from MarginalMediation. To use it, we fit two separate models using glm(). But first, we are going to dichotomize the anxiety variable (just to demonstrate the method–don’t do this in practice).
FacialBurns <- FacialBurns %>%
  mutate(HADS_binary = cut_number(HADS, 2))

bcpath <- glm(Selfesteem ~ HADS_binary + TBSA, data = FacialBurns)
apath <- glm(HADS_binary ~ TBSA, data = FacialBurns, family = binomial(link = "logit"))

mma(bcpath, apath,
    ind_effects = c("TBSA-HADS_binary"))
## 
## calculating a paths... b and c paths... Done.
                                                                                 
## ┌───────────────────────────────┐
## │  Marginal Mediation Analysis  │
## └───────────────────────────────┘
## A marginal mediation model with:
##    1 mediators
##    1 indirect effects
##    1 direct effects
##    500 bootstrapped samples
##    95% confidence interval
##    n = 77 
## 
## Formulas:
##    ◌ Selfesteem ~ HADS_binary + TBSA
##    ◌ HADS_binary ~ TBSA 
## 
## Regression Models: 
## 
##      Selfesteem ~ 
##                                Est      SE   Est/SE P-Value
##         (Intercept)       36.69491 0.90177 40.69212  0.0000
##         HADS_binary(7,32] -5.94592 1.08740 -5.46800  0.0000
##         TBSA              -0.17823 0.07464 -2.38785  0.0195
## 
##      HADS_binary ~ 
##                          Est      SE   Est/SE P-Value
##         (Intercept) -1.10733 0.43573 -2.54130 0.01104
##         TBSA         0.09759 0.04002  2.43849 0.01475
## 
## Unstandardized Mediated Effects: 
## 
##    Indirect Effects: 
## 
##      Selfesteem ~ 
##                             Indirect    Lower    Upper
##         TBSA => HADS_binary -0.13047 -0.23795 -0.04039
## 
##    Direct Effects: 
## 
##      Selfesteem ~ 
##                Direct    Lower    Upper
##         TBSA -0.17823 -0.29307 -0.04388
## 
## 
## Standardized Mediated Effects: 
## 
##    Indirect Effects: 
## 
##      Selfesteem ~ 
##                             Indirect    Lower    Upper
##         TBSA => HADS_binary -0.02275 -0.04149 -0.00704
## 
##    Direct Effects: 
## 
##      Selfesteem ~ 
##                Direct    Lower    Upper
##         TBSA -0.03108 -0.05111 -0.00765
  1. Both the unstandardized and Standardized effects are reported. Interpret the standardized indirect effect here.

Moderated-Mediation (Conditional Process Analysis)

  1. There are times that a combination of moderation and mediation are of interest. This can be a challenge but it is possible. Let’s use lavaan to see if sex moderates the effect from anxiety to self-esteem.
FacialBurns <- FacialBurns %>%
  mutate(HADS_Sex = HADS * Sex)
med_model <- "
   ## a path
   HADS ~ a*TBSA
   ## b and c' paths
   Selfesteem ~ b*HADS + c*TBSA + b2 * HADS_Sex

   ## Define Indirect effects for each sex
   ind_sex1 := a * b
   ind_sex2 := a * b + a * b2
"
sem(med_model, data = FacialBurns, se = "bootstrap", bootstrap = 500) %>% 
  summary()
## lavaan 0.6-5 ended normally after 14 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                          6
##                                                       
##   Number of observations                            77
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                               113.187
##   Degrees of freedom                                 1
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws              500
##   Number of successful bootstrap draws             500
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   HADS ~                                              
##     TBSA       (a)    0.281    0.111    2.525    0.012
##   Selfesteem ~                                        
##     HADS       (b)   -0.561    0.250   -2.248    0.025
##     TBSA       (c)   -0.187    0.064   -2.909    0.004
##     HADS_Sex  (b2)    0.087    0.131    0.663    0.507
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .HADS             47.174    9.881    4.774    0.000
##    .Selfesteem       18.477    3.345    5.524    0.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     ind_sex1         -0.157    0.085   -1.843    0.065
##     ind_sex2         -0.133    0.057   -2.343    0.019
  1. Did sex moderate the b path (see the p-value associated with the b2 path)?
  2. For more information on moderated mediation see nickmichalak.com

Conclusion

Mediation analysis is an important part of our field. It helps us understand how one variable affects another. Because each part of a mediation is essentially a linear regression (or a GLM), the things we have learned throughout the course apply to each piece of the entire conceptual model.