8 Repeated Measures ANOVA
Required Packages
library(tidyverse) # Loads several very helpful 'tidy' packages
library(furniture) # Nice tables (by our own Tyson Barrett)
library(afex) # needed for ANOVA, emmeans is loaded automatically.
library(multcomp) # for advanced control for multiple testing/Type 1 error
8.1 Tutorial - Fitting RM ANOVA Models with afex::aov_4()
The aov_4()
function from the afex
package fits ANOVA models (oneway, two-way, repeated measures, and mixed design). It needs at least two arguments:
formula:
continuous_var ~ 1 + (RM_var|id_var)
one observation per subject for each level of theRMvar
, so eachid_var
has multiple lines for each subjectdataset:
data = .
we use the period to signify that the datset is being piped from above
Here is an outline of what your syntax should look like when you fit and save a RM ANOVA. Of course you will replace the dataset name and the variable names, as well as the name you are saving it as.
NOTE: The aov_4()
function works on
data in LONG format only. Each observation needs to be on its one line
or row with seperate variables for the group membership (categorical
factor or fct
) and the continuous measurement (numberic or
dbl
).
# RM ANOVA: fit and save
<- data_name %>%
aov_name ::aov_4(continuous_var ~ 1 + (RM_var|id_var),
afexdata = .)
By running the name you saved you model under, you will get a brief set of output, including a measure of Effect Size.
NOTE: The ges
is the generalized
eta squared. In a one-way ANOVA, the eta-squared effect size is the
same value, ie. generalized \(\eta_g\)
and partial \(\eta_p\) are the
same.
# Display basic ANOVA results (includes effect size)
aov_name
To fully fill out a standard ANOVA table and compute other effect sizes, you will need a more complete set of output, including the Sum of Squares components, you will need to add summary()
piped at the end of the model name before running it or after the model with a pipe.
NOTE: IGNORE the first line that starts with
(Intercept)
! Also, the ‘mean sum of squares’ are not
included in this table, nor is the Total line at the
bottom of the standard ANOVA table. You will need to manually compute
these values and add them on the homework page. Remember that
Sum of Squares (SS)
and
degrees of freedom (df)
add up, but
Mean Sum of Squreas (MS)
do not add up. Also:
MS = SS/df
for each term.
This also runs and displays the results of Mauchly Tests for Sphericity, as well as the Greenhouse-Geisser (GG) and Huynh-Feldt (HF) Corrections to the p-value. If the Mauchly’s p-value is bigger than .05, do not use the corrections. If Mauchly’s p-value is less than .05, then apply the epsilon (eps
or \(\epsilon\)) to multiply the degree’s of freedom. Yes, the df will be decimal numbers.
# Display fuller ANOVA results (sphericity tests)
summary(aov_name)
To see all the Sumes-of-Squared residuals for ALL of the model comoponents, you add $aov
at the end of the model name.
# Display all the sum of squares
$aov aov_name
Repeated Measures MANOVA Tests (Pillai test statistic) is computed is you add $Anova
at the end of the model name. This is a so called ‘Multivariate Test’. This is NOT what you want to do!
# Display fuller ANOVA results (includes sum of squares)
$Anova aov_name
If you only need to obtain the omnibus (overall) F-test without a correction for violation of sphericity, you can add an option for correction = "none"
. You can also request both the generalized and partial \(\eta^2\) effect sizes with es = c("ges", "pes")
.
# RM ANOVA: no correction, both effect sizes
%>%
data_name ::aov_4(continuous_var ~ 1 + (RM_var|id_var),
afexdata = .,
anova_table = list(correction = "none",
es = c("ges", "pes")))
Post Hoc tests may be ran the same way as the 1 and 2-way ANOVAs from the last unit.
NOTE: Use Fisher’s LSD
(adjust = “none”
) if the omnibus F-test is significant AND
there are THREE measurements per subject or block. Tukey’s HSD
(adjust = “tukey”
) may be used even if the F-test is not
significant or if there are four or more repeated measures.
# RM ANOVA: post hoc all pairwise tests with Fisher's LSD correction
%>%
aov_name ::emmeans(~ RM_var) %>%
emmeanspairs(adjust = "none")
# RM ANOVA: post hoc all pairwise tests with Tukey's HSD correction
%>%
aov_name ::emmeans(~ RM_var) %>%
emmeanspairs(adjust = "tukey")
A means plot (model based) can help you write up your results. This zooms in on just the means and will make all differences seem significant, so make sure to interpret it in conjunction with the ANOVA and post hoc tests.
# RM ANOVA: means plot
%>%
aov_name ::emmip(~ RM_var) emmeans
8.2 Words Recalled Data Example (Chapter 15, section A)
8.2.1 Data Prep
I input the data as a tribble
which saves it as a data.frame
and then cleaned up a few of the important variables.
<- tibble::tribble(
d ~ID, ~word_type, ~words_recalled,
1, 1, 20,
2, 1, 16,
3, 1, 8,
4, 1, 17,
5, 1, 15,
6, 1, 10,
1, 2, 21,
2, 2, 18,
3, 2, 7,
4, 2, 15,
5, 2, 10,
6, 2, 4,
1, 3, 17,
2, 3, 11,
3, 3, 4,
4, 3, 18,
5, 3, 13,
6, 3, 10) %>%
mutate(word_type = factor(word_type,
labels = c("Neutral", "Positive", "Negative"))) %>%
mutate(fake_id = row_number())
d
# A tibble: 18 × 4
ID word_type words_recalled fake_id
<dbl> <fct> <dbl> <int>
1 1 Neutral 20 1
2 2 Neutral 16 2
3 3 Neutral 8 3
4 4 Neutral 17 4
5 5 Neutral 15 5
6 6 Neutral 10 6
7 1 Positive 21 7
8 2 Positive 18 8
9 3 Positive 7 9
10 4 Positive 15 10
11 5 Positive 10 11
12 6 Positive 4 12
13 1 Negative 17 13
14 2 Negative 11 14
15 3 Negative 4 15
16 4 Negative 18 16
17 5 Negative 13 17
18 6 Negative 10 18
8.2.2 One-Way Independent ANOVA
First, let’s ignore the fact that we know this has repeated measures. As such, we will assume that each word type group is independent. Let’s look at what happens:
<- d %>%
ind_anova ::aov_4(words_recalled ~ word_type + (1|fake_id),
afexdata = .)
$Anova ind_anova
Anova Table (Type III tests)
Response: dv
Sum Sq Df F value Pr(>F)
(Intercept) 3042.00 1 101.4752 4.538e-08 ***
word_type 16.33 2 0.2724 0.7652
Residuals 449.67 15
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If we ignored that the word_type
groups are not independent, we get an F-statistic = 0.272 and p = 0.765.
What do you think will happen if we account for the repeated measures? Will the F-statistic increase or decrease?
8.2.3 One-Way RM ANOVA
Now, let’s look at the repeated measures. We do this by using afex::aov_4()
and then the summary()
functions as shown below.
<- d %>%
oneway ::aov_4(words_recalled ~ 1 + (word_type|ID),
afexdata = .)
summary(oneway)
Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 3042.00 1 381.33 5 39.8864 0.001466 **
word_type 16.33 2 68.33 10 1.1951 0.342453
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Mauchly Tests for Sphericity
Test statistic p-value
word_type 0.2134 0.045538
Greenhouse-Geisser and Huynh-Feldt Corrections
for Departure from Sphericity
GG eps Pr(>F[GG])
word_type 0.55972 0.3282
HF eps Pr(>F[HF])
word_type 0.6077293 0.3309148
Here, we see a number of pieces of information, including the sums of squares, F-statistic, and p-value. The F-statistic is now 1.195 and the p-value (although still not signfiicant) is .342. So what happened to the F-statistic? It decreased! So by using the information that these are repeated measures, we have more power. Why is that?
If we look at the output for the two ANOVAs above, both have the sums of squares for word_type
at 16.33 so that didn’t change at all. So what did change? Well, it comes down to understanding what is happening to the error term. Although not shown explicitly in the tables above, consider that:
\[ \text{Independent ANOVA: } SS_{total} = SS_{bet} + SS_w \]
\[ \text{RM ANOVA: } SS_{total} = SS_{RM} + SS_{sub} + SS_{inter} \]
\(SS_{total}\) is the same in both and \(SS_{RM} = SS_{RM}\) so what we are doing is splitting up the \(SS_w\) into \(SS_{sub} + SS_{inter}\) where only the \(SS_{inter}\) is the error term now.
This means we have more power with the same amount of data.
Let’s now plot this using a spaghetti plot.
%>%
d ggplot(aes(word_type, words_recalled, group = ID)) +
geom_line() +
geom_point()
The output provides us with a bit of an understanding of why there is not a significant effect of word_type
. In addition to a spaghetti plot, it is often useful to show what the overall repeated measure factor is doing, not the individuals (especially if your sample size is larger than 20). To do that, we can use:
%>%
oneway ::emmip(~ word_type) emmeans
Although there is a pattern here, we need to consider the scale. Looking at the spaghetti plot, we have individuals that range from 5 to 20 so a difference of 2 or 3 is not large. However, it is clear that a pattern may exist and so we should probably investigate this further, possibly with a larger sample size.
8.3 Another Example - Weight Loss
Contrived data on weight loss and self esteem over three months, for three groups of individuals: Control, Diet and Diet + Exercise. The data constitute a double-multivariate design.
8.3.1 Restructure the data from wide to long format
Wide format
head(carData::WeightLoss, n = 6)
group wl1 wl2 wl3 se1 se2 se3
1 Control 4 3 3 14 13 15
2 Control 4 4 3 13 14 17
3 Control 4 3 1 17 12 16
4 Control 3 2 1 11 11 12
5 Control 5 3 2 16 15 14
6 Control 6 5 4 17 18 18
Restructure
<- carData::WeightLoss %>%
WeightLoss_long ::mutate(id = row_number() %>% factor()) %>%
dplyr::gather(key = var,
tidyrvalue = value,
starts_with("wl"), starts_with("se")) %>%
::separate(var,
tidyrsep = 2,
into = c("measure", "month")) %>%
::spread(key = measure,
tidyrvalue = value) %>%
::select(id, group, month, wl, se) %>%
dplyr::mutate_at(vars(id, month), factor) %>%
dplyr::arrange(id, month) dplyr
Long format
head(WeightLoss_long, n = 20)
id group month wl se
1 1 Control 1 4 14
2 1 Control 2 3 13
3 1 Control 3 3 15
4 2 Control 1 4 13
5 2 Control 2 4 14
6 2 Control 3 3 17
7 3 Control 1 4 17
8 3 Control 2 3 12
9 3 Control 3 1 16
10 4 Control 1 3 11
11 4 Control 2 2 11
12 4 Control 3 1 12
13 5 Control 1 5 16
14 5 Control 2 3 15
15 5 Control 3 2 14
16 6 Control 1 6 17
17 6 Control 2 5 18
18 6 Control 3 4 18
19 7 Control 1 6 17
20 7 Control 2 5 16
Summary table
%>%
WeightLoss_long ::group_by(group, month) %>%
dplyr::table1(wl, se) furniture
───────────────────────────────────────────────────────────────────────────────────────────────────────
group, month
Control-1 Diet-1 DietEx-1 Control-2 Diet-2 DietEx-2
n = 12 n = 12 n = 10 n = 12 n = 12 n = 10
wl
4.5 (1.0) 5.3 (1.4) 6.2 (2.3) 3.3 (1.1) 3.9 (1.4) 6.1 (1.4)
se
14.8 (1.9) 14.8 (2.4) 15.2 (1.3) 14.3 (1.9) 13.8 (2.8) 13.3 (1.9)
Control-3 Diet-3 DietEx-3
n = 12 n = 12 n = 10
2.1 (1.2) 2.2 (1.1) 2.2 (1.2)
15.1 (2.4) 16.2 (2.5) 17.6 (1.0)
───────────────────────────────────────────────────────────────────────────────────────────────────────
::WeightLoss %>%
carData::group_by(group) %>%
dplyr::table1(wl1, wl2, wl3,
furniture se1, se2, se3)
──────────────────────────────────────
group
Control Diet DietEx
n = 12 n = 12 n = 10
wl1
4.5 (1.0) 5.3 (1.4) 6.2 (2.3)
wl2
3.3 (1.1) 3.9 (1.4) 6.1 (1.4)
wl3
2.1 (1.2) 2.2 (1.1) 2.2 (1.2)
se1
14.8 (1.9) 14.8 (2.4) 15.2 (1.3)
se2
14.3 (1.9) 13.8 (2.8) 13.3 (1.9)
se3
15.1 (2.4) 16.2 (2.5) 17.6 (1.0)
──────────────────────────────────────
Raw data plot (exploratory)
%>%
WeightLoss_long ::gather(key = measure,
tidyrvalue = value,
%>%
wl, se) ::mutate(measure = fct_recode(measure,
dplyr"Weight Loss, lb" = "wl",
"Self Esteem Rating" = "se")) %>%
ggplot(aes(x = month,
y = value %>% as.numeric %>% jitter,
group = id)) +
facet_grid(measure ~ group,
scale = "free_y",
switch = "y") +
geom_line() +
theme_bw() +
labs(x = "Time, months",
y = "Measurement Value",
title = "Person Profile Plot",
subtitle = "Raw Data")
8.3.2 Does weight long change over time?
Fit the model
<- WeightLoss_long %>%
fit_wl ::aov_4(wl ~ 1 + (month|id),
afexdata = .)
Brief output - uses the Greenhouse-Geisser correction for violations of sphericity (by default)
fit_wl
Anova Table (Type 3 tests)
Response: wl
Effect df MSE F ges p.value
1 month 2.00, 65.94 1.08 80.31 *** .427 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Sphericity correction method: GG
Lots more output
summary(fit_wl)
Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 1584.35 1 162.314 33 322.115 < 2.2e-16 ***
month 173.88 2 71.451 66 80.308 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Mauchly Tests for Sphericity
Test statistic p-value
month 0.99908 0.98539
Greenhouse-Geisser and Huynh-Feldt Corrections
for Departure from Sphericity
GG eps Pr(>F[GG])
month 0.99908 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
HF eps Pr(>F[HF])
month 1.063446 2.090759e-18
To force no GG correction:
<- WeightLoss_long %>%
fit_wl_noGG ::aov_4(wl ~ 1 + (month|id),
afexdata = .,
anova_table = list(correction = "none"))
fit_wl_noGG
Anova Table (Type 3 tests)
Response: wl
Effect df MSE F ges p.value
1 month 2, 66 1.08 80.31 *** .427 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
To request BOTH effect sizes: partial eta squared and generalized eta squared
<- WeightLoss_long %>%
fit_wl_2es ::aov_4(wl ~ 1 + (month|id),
afexdata = .,
anova_table = list(es = c("ges", "pes")))
fit_wl_2es
Anova Table (Type 3 tests)
Response: wl
Effect df MSE F ges pes p.value
1 month 2.00, 65.94 1.08 80.31 *** .427 .709 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Sphericity correction method: GG
To force no GG correction AND request BOTH effect sizes
<- WeightLoss_long %>%
fit_wl_noGG ::aov_4(wl ~ 1 + (month|id),
afexdata = .,
anova_table = list(correction = "none",
es = c("ges", "pes")))
fit_wl_noGG
Anova Table (Type 3 tests)
Response: wl
Effect df MSE F ges pes p.value
1 month 2, 66 1.08 80.31 *** .427 .709 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Estimated marginal means
%>%
fit_wl ::emmeans(~ month) emmeans
month emmean SE df lower.CL upper.CL
X1 5.29 0.288 33 4.71 5.88
X2 4.35 0.295 33 3.75 4.95
X3 2.18 0.196 33 1.78 2.57
Confidence level used: 0.95
pairwise post hoc
%>%
fit_wl ::emmeans(~ month) %>%
emmeanspairs(adjust = "none")
contrast estimate SE df t.ratio p.value
X1 - X2 0.941 0.250 33 3.771 0.0006
X1 - X3 3.118 0.256 33 12.178 <.0001
X2 - X3 2.176 0.251 33 8.656 <.0001
means plot
%>%
fit_wl ::emmip( ~ month) emmeans