You will need the Rmd file for today’s lab - you can download it here.
Be sure to have the following packages installed and loaded:
library(tidyverse) # for plotting and data wrangling
library(rio) # for importing data
library(psych) # for descriptives
library(lsr) # for eta squared functions
library(emmeans) # for marginal means and simple effects
library(sjPlot) # for plotting model results
library(apaTables) # for tables of means
library(car) # for testing model assumptions
library(broom) # for tidying model output
# suppress scientific notation
options(scipen = 999)
Purpose
Factorial ANOVA (aka multiple regression with categorical predictors)
refers to a special case of the general linear model in which two or
more categorical variables (i.e., factors) and their interaction
predict an outcome. A factorial design is used when there is an
interest in how two or more variables (or factors) affect some outcome
variable. Rather than conducting a separate one-way ANOVA for each factor,
all of the factors are included in one analysis. Today we will review how to run
factorial ANOVA models in R and how to interpret and visualize the
results.
Research scenario
Based on subjects’ self-reports of rejection sensitivity (N = 80), a
researcher divides subjects into two equal groups (low RS and high RS).
Half of the subjects in each group interact with a partner who
displays a happy emotional expression during the interaction, whereas the other
half of the subjects in each group interact with a partner who displays
a neutral emotional expression. After the
interaction, subjects are asked to rate the statement, “My interaction
partner likes me”, on a scale from 1 (strongly disagree) to 7 (strongly
agree).
Factor 1: Rejection Sensitivity
Factor 2: Partner’s Emotional Expression
Dependent Variable: Perceived Liking
Hypothesis Testing
With a factorial ANOVA, three different null hypotheses can be tested
regarding whether there is 1) a main effect of factor 1, 2) a main
effect of factor 2, and 3) an interaction between factor 1 and factor
2.
Imagine a table with means like the following:
|              | Low RS | High RS | Row Means |
|--------------|--------|---------|-----------|
| Neutral      | x.xx   | x.xx    | x.xx      |
| Happy        | x.xx   | x.xx    | x.xx      |
| Column Means | x.xx   | x.xx    | x.xx      |
Null Hypothesis 1 (Main Effect of Rejection Sensitivity): \[H_0: \mu_{Low} = \mu_{High}\]
Null Hypothesis 2 (Main Effect of Partner’s Emotional Expression): \[H_0: \mu_{Neutral} = \mu_{Happy}\]
Null Hypothesis 3 (Interaction Effect): H0: There is no interaction between rejection sensitivity and partner’s emotional expression.
Import & Inspect the Data
reject <- import("https://raw.githubusercontent.com/uopsych/psy612/master/labs/lab-8/data/reject.csv")
str(reject)
## 'data.frame': 80 obs. of 3 variables:
## $ rs : chr "Low" "Low" "Low" "Low" ...
## $ partner: chr "Neutral" "Neutral" "Neutral" "Neutral" ...
## $ liking : int 4 4 4 5 5 5 5 5 5 5 ...
head(reject)
- It looks like rs and partner are both being read in as character variables. Let’s go ahead and change those to factors.
reject <- reject %>%
mutate(rs = as.factor(rs),
partner = as.factor(partner))
- Check the structure again.
str(reject)
## 'data.frame': 80 obs. of 3 variables:
## $ rs : Factor w/ 2 levels "High","Low": 2 2 2 2 2 2 2 2 2 2 ...
## $ partner: Factor w/ 2 levels "Happy","Neutral": 2 2 2 2 2 2 2 2 2 2 ...
## $ liking : int 4 4 4 5 5 5 5 5 5 5 ...
Notice that by default R orders factor levels alphabetically. In our
case, this means that High will be the reference group of rejection
sensitivity and Happy will be the reference group of interaction
partner’s emotional expression. However, it might be more intuitive to
have Low and Neutral be the reference groups, respectively.
- To accomplish this, we can simply re-order the levels of our factor
variables with fct_relevel().
reject <- reject %>%
mutate(rs = fct_relevel(rs, c("Low", "High")),
partner = fct_relevel(partner, c("Neutral", "Happy")))
str(reject) # Notice the re-ordering of levels
## 'data.frame': 80 obs. of 3 variables:
## $ rs : Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
## $ partner: Factor w/ 2 levels "Neutral","Happy": 1 1 1 1 1 1 1 1 1 1 ...
## $ liking : int 4 4 4 5 5 5 5 5 5 5 ...
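As an aside (not part of the original lab), base R’s relevel() can accomplish the same thing for a two-level factor by moving the chosen level to the front:

# base R alternative to fct_relevel() for setting the reference level
reject$rs      <- relevel(reject$rs, ref = "Low")
reject$partner <- relevel(reject$partner, ref = "Neutral")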
- We can also see how the factors are being coded using contrasts().
contrasts(reject$rs)
## High
## Low 0
## High 1
contrasts(reject$partner)
## Happy
## Neutral 0
## Happy 1
Notice that each factor is being dummy coded. It’s very important to be
aware of how the factors in your regression model are coded so that you can
accurately interpret the regression coefficients in your output.
Speaking of regression, note that we are going to conduct the analysis
using the lm() function. The following is how you could write the
factorial ANOVA as the regression equation we’re estimating:
\[Liking_i = \beta_0 + \beta_1RejectionSensitivity_i + \beta_2PartnerEmotion_i + \beta_3(RejectionSensitivity_i \times PartnerEmotion_i) + \epsilon_i\]
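If you want to see exactly how those dummy codes enter the model, one quick check (an aside, not part of the original lab) is to look at the design matrix R builds from the formula:

# each row shows the 0/1 codes, and their product, for one subject
head(model.matrix(~ rs * partner, data = reject))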
Descriptive Statistics
Obtaining sample size per condition.
reject %>%
group_by(rs, partner) %>%
summarize(n = n())
## `summarise()` has grouped output by 'rs'. You can override using the `.groups`
## argument.
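As an aside, dplyr’s count() returns the same per-condition sample sizes without the grouping message; this is an alternative, not the lab’s required approach:

# n per combination of rs and partner
reject %>%
  count(rs, partner)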
Table of Means
The results of a factorial ANOVA are often represented in a table of
means, which contains the means of each combination of conditions (the
cell means) and means for the levels of each variable ignoring the other
variables (the marginal means).
Table of Means using apa.2way.table()
- The apa.2way.table() function from the apaTables package is a very
convenient way to get our cell means and marginal means. This function
works for any type of 2-way ANOVA, regardless of the number of levels of
your factors, e.g. it would work for a 3 x 3 ANOVA. All you need to do
is indicate what the IV’s (aka factors) and DV are and specify
show.marginal.means = TRUE
.
table_of_means <- apa.2way.table(iv1 = rs,
iv2 = partner,
dv = liking,
data = reject,
show.marginal.means = TRUE)
table_of_means
##
##
## Means and standard deviations for liking as a function of a 2(rs) X 2(partner) design
##
## partner
## Neutral Happy Marginal
## rs M SD M SD M SD
## Low 5.55 1.05 6.00 0.73 5.78 0.92
## High 1.80 0.70 6.45 0.69 4.12 2.45
## Marginal 3.67 2.09 6.22 0.73
##
## Note. M and SD represent mean and standard deviation, respectively.
## Marginal indicates the means and standard deviations pertaining to main effects.
We can also export the output of the apa.2way.table() function to a Word document by supplying a filename.
apa.2way.table(iv1 = rs,
iv2 = partner,
dv = liking,
data = reject,
show.marginal.means = TRUE,
filename = "table_of_means.doc")
##
##
## Means and standard deviations for liking as a function of a 2(rs) X 2(partner) design
##
## partner
## Neutral Happy Marginal
## rs M SD M SD M SD
## Low 5.55 1.05 6.00 0.73 5.78 0.92
## High 1.80 0.70 6.45 0.69 4.12 2.45
## Marginal 3.67 2.09 6.22 0.73
##
## Note. M and SD represent mean and standard deviation, respectively.
## Marginal indicates the means and standard deviations pertaining to main effects.
Table of Means using dplyr
The following can be used to obtain cell means.
reject %>%
group_by(rs, partner) %>% # Group by both rs and partner
summarize(mean = mean(liking, na.rm = TRUE), # Obtain the mean for each condition
sd = sd(liking, na.rm = TRUE)) %>% # Obtain the SD for each condition
knitr::kable(digits = c(NA, NA, 2, 2, 2), # Specifying number of digits to report
caption = "Cell Means & SD")
## `summarise()` has grouped output by 'rs'. You can override using the `.groups`
## argument.
Cell Means & SD
| rs   | partner | mean | sd   |
|------|---------|------|------|
| Low  | Neutral | 5.55 | 1.05 |
| Low  | Happy   | 6.00 | 0.73 |
| High | Neutral | 1.80 | 0.70 |
| High | Happy   | 6.45 | 0.69 |
The following can be used to obtain marginal means for rejection
sensitivity.
# Marginal Means for Rejection Sensitivity
reject %>%
group_by(rs) %>% # Just group by rejection sensitivity
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(NA, 2, 2, 2),
caption = "Marginal Means & SD for Rejection Sensitivity")
Marginal Means & SD for Rejection Sensitivity
| rs   | mean | sd   |
|------|------|------|
| Low  | 5.78 | 0.92 |
| High | 4.12 | 2.45 |
And the marginal means for partner’s emotional expression.
# Marginal Means for Partner's Emotional Expression
reject %>%
group_by(partner) %>% # Just group by partner
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(NA, 2, 2, 2),
caption = "Marginal Means & SD for Partner's Emotional Expression")
Marginal Means & SD for Partner’s Emotional Expression

| partner | mean | sd   |
|---------|------|------|
| Neutral | 3.67 | 2.09 |
| Happy   | 6.22 | 0.73 |
The following can be used to obtain the grand mean (the overall mean
on the DV).
reject %>% # note that we don't need group_by for this one
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(2, 2, 2),
caption = "Grand Mean")
Question: Which means are being compared in the main
effect of rejection sensitivity?
Question: Which means are being compared in the main
effect of interaction partner?
Question: Which means are involved in the
interaction?
Conduct the Factorial ANOVA
Using Regression: lm()
Factorial ANOVA is the method by which we can examine whether two (or
more) categorical IVs have joint effects on a continuous outcome of
interest. Like all general linear models, factorial ANOVA is a specific
case of multiple regression. However, we may choose to use an ANOVA
framework for the sake of interpretability.
Write the lm() code that performs the factorial ANOVA analysis.
## your code here
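A minimal sketch of one way to fill this in (the same call appears later in the traditional ANOVA section):

# rs*partner expands to rs + partner + rs:partner
model <- lm(liking ~ rs*partner, data = reject)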
Results in regression framework
## your code here
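One possible way to view the results in a regression framework, assuming the model object sketched above (summary() is base R; tidy() comes from the broom package loaded earlier):

summary(model) # coefficients, t-tests, and R-squared
tidy(model)    # the same coefficient table as a tidy data frame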
Question: Write the full model with parameter
estimates filled in below:
\[Liking_i = \]
Question: What does each of the regression coefficient estimates mean?
It may be helpful to refer back to the table of means.
table_of_means
##
##
## Means and standard deviations for liking as a function of a 2(rs) X 2(partner) design
##
## partner
## Neutral Happy Marginal
## rs M SD M SD M SD
## Low 5.55 1.05 6.00 0.73 5.78 0.92
## High 1.80 0.70 6.45 0.69 4.12 2.45
## Marginal 3.67 2.09 6.22 0.73
##
## Note. M and SD represent mean and standard deviation, respectively.
## Marginal indicates the means and standard deviations pertaining to main effects.
Recoding the Factors
Obviously, the means being compared by b1 and b2 do not represent our
main effects. The main effect of RS would be a comparison of the
marginal means for the low and high conditions. The main effect of
partner would be a comparison of the marginal means for the neutral and
happy conditions.
With the way we have rs and partner coded in the model, the
regression coefficient estimates don’t correspond simply to the mean
difference between conditions. We can recode the factors so that the
parameter estimates, b1 and b2, correspond to a test of the main
effects.
Question: How should we code the levels of rejection
sensitivity and partner emotion so that the parameter estimates, b1 and
b2, correspond to a test of their main effects?
Assign new codes to the levels of each factor.
## your code here
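One common option is effects (deviation) coding, sketched below: with -0.5/0.5 codes each factor is centered, so in this balanced design b1 and b2 compare the marginal means. This is an assumption about the intended recoding, not the only valid choice.

# centered codes for each two-level factor
contrasts(reject$rs)      <- c(-0.5, 0.5) # Low = -0.5, High = 0.5
contrasts(reject$partner) <- c(-0.5, 0.5) # Neutral = -0.5, Happy = 0.5
contrasts(reject$rs)                      # check the new codes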
Re-run the analysis with newly coded factors.
## your code here
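A sketch of the refit under the coding above (model_effects is an illustrative name):

# refit with the recoded factors; b1 and b2 now test the main effects
model_effects <- lm(liking ~ rs*partner, data = reject)
summary(model_effects)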
Using Traditional ANOVA: anova()
You can also perform a traditional ANOVA and get straightforward results by passing the model to the anova() function.
model <- lm(liking ~ rs*partner, data = reject)
anova(model)
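As an aside, the car package loaded above also provides Anova(), which lets you choose the type of sums of squares; with this balanced design the results match anova(), but the distinction matters when cell sizes are unequal. A sketch:

# Type II sums of squares (Type III would additionally require sum-to-zero contrasts)
Anova(model, type = 2)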
Question: How would you interpret these results?
Effect Sizes
Effect size is really important to report. It provides an idea of the
size of the effect of a predictor variable on an outcome
variable. Reporting effect size in publications is also very helpful for
other researchers wanting to conduct a priori power analyses.
We can get effect sizes by passing our model to the etaSquared() function.
etaSquared(model)
## eta.sq eta.sq.part
## rs 0.1692045 0.5258329
## partner 0.4041330 0.7259280
## rs:partner 0.2740833 0.6423889
Question: What is the difference between eta-squared
and partial eta-squared?
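For reference, here is a sketch of how both quantities could be computed by hand from the anova() table of the model above (the object names are illustrative):

aov_table <- anova(model)  # sums of squares for each term plus residuals
ss        <- aov_table[["Sum Sq"]]
ss_error  <- ss[length(ss)] # residual SS is the last entry
eta_sq         <- ss[-length(ss)] / sum(ss)                       # SS_effect / SS_total
partial_eta_sq <- ss[-length(ss)] / (ss[-length(ss)] + ss_error)  # SS_effect / (SS_effect + SS_error)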
Visualization
Visualizing the results is helpful for understanding your results and
interpreting significant (or non-significant) effects. Especially when
there is a significant interaction effect, a plot can help illustrate
what the interaction effect is.
Main effects
- Remember that main effects correspond to differences in marginal
means. To plot main effects, we can use the plot_model() function from
the sjPlot package. This function takes three arguments: 1) the fitted
model object, 2) because we want to plot marginal means, we specify
type = emm
, and 3) Specify in the terms =
argument which variable(s) you want the marginal means for.
Plotting the Main Effect of Rejection Sensitivity
plot_model(model, type = "emm", terms = "rs") # bars correspond to 95% CIs
## NOTE: Results may be misleading due to involvement in interactions
Plotting the Main Effect of Partner’s Emotional Expression
plot_model(model, type = "emm", terms = "partner")
## NOTE: Results may be misleading due to involvement in interactions
Plotting the Interaction Effect
plot_model(model, type = "emm", terms = c("rs", "partner"))
Switch how the interaction is visualized by switching the order of terms.
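For example, reversing the order puts partner on the x-axis with separate lines for each level of rs:

plot_model(model, type = "emm", terms = c("partner", "rs"))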
Question: How would you describe in a paper what the
significant interaction effect means?
Simple effects
Simple effects are the effect of some factor (e.g., interaction
partner’s expression) at each level of another factor (e.g., at high and
low RS separately).
We can get these simple effects using the emmeans() function and calculating the differences between every pair of cell means.
# Comparing cell means
emmeans(model, pairwise~rs*partner, adjust = "bonferroni")
## $emmeans
## rs partner emmean SE df lower.CL upper.CL
## Low Neutral 5.55 0.18 76 5.19 5.91
## High Neutral 1.80 0.18 76 1.44 2.16
## Low Happy 6.00 0.18 76 5.64 6.36
## High Happy 6.45 0.18 76 6.09 6.81
##
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## Low Neutral - High Neutral 3.75 0.254 76 14.754 <.0001
## Low Neutral - Low Happy -0.45 0.254 76 -1.770 0.4840
## Low Neutral - High Happy -0.90 0.254 76 -3.541 0.0041
## High Neutral - Low Happy -4.20 0.254 76 -16.524 <.0001
## High Neutral - High Happy -4.65 0.254 76 -18.294 <.0001
## Low Happy - High Happy -0.45 0.254 76 -1.770 0.4840
##
## P value adjustment: bonferroni method for 6 tests
# for info on types of p-value adjustments you can choose from:
?summary.emmGrid # scroll to p-value adjustments
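Another way to get the simple effects directly is to condition one factor on the other with emmeans’ | syntax (a sketch, not part of the original output):

# simple effect of partner's expression at each level of rs
emmeans(model, pairwise ~ partner | rs)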
Question: Does partner emotion have an effect on
perceived liking for people low on rejection sensitivity? For people
high on rejection sensitivity?
Question: Does rejection sensitivity have an effect
on perceived liking when partner shows neutral emotions? When partner
shows happy emotions?
You could also use the emmeans() function to compare marginal means for each factor individually.
# Comparing marginal means for partner emotion
emmeans(model, pairwise~partner, adjust = "bonferroni")
## NOTE: Results may be misleading due to involvement in interactions
## $emmeans
## partner emmean SE df lower.CL upper.CL
## Neutral 3.67 0.127 76 3.42 3.93
## Happy 6.22 0.127 76 5.97 6.48
##
## Results are averaged over the levels of: rs
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## Neutral - Happy -2.55 0.18 76 -14.188 <.0001
##
## Results are averaged over the levels of: rs
# Comparing marginal means for rejection sensitivity
emmeans(model, pairwise~rs, adjust = "bonferroni")
## NOTE: Results may be misleading due to involvement in interactions
## $emmeans
## rs emmean SE df lower.CL upper.CL
## Low 5.78 0.127 76 5.52 6.03
## High 4.12 0.127 76 3.87 4.38
##
## Results are averaged over the levels of: partner
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## Low - High 1.65 0.18 76 9.180 <.0001
##
## Results are averaged over the levels of: partner