You can download the rmd file here.
To quickly navigate to the desired section, click one of the following links:
Today we will be analyzing data from Fox and Guyer’s (1978) anonymity and cooperation study again. The data is included in the {carData}
package and you can see information about the data set using ?Guyer
. Twenty groups of four participants each played 30 trials of the the prisoner’s dilemma game. The number of cooperative choices (cooperation
) made by each group were scored out of 120 (i.e., cooperative choices made by 4 participants over 30 trials). The groups either made decisions publicly or privately (condition
) and groups were either comprised of all women or all men (sex
).
Run the following code to load the data into your global environment.
# load data
data <- Guyer
# take a look at the data
head(Guyer)
## cooperation condition sex
## 1 49 public male
## 2 64 public male
## 3 37 public male
## 4 52 public male
## 5 68 public male
## 6 54 public female
Last time we wanted to compare the mean cooperation level to an experimenter specified level (50% cooperation).
Now, let’s ask a new research question: does whether the decision was made publicly or anonymously have an effect on people’s cooperation levels? For this case, we will use an independent samples t-test, which is used to compare the means of two independent samples.
The null hypothesis would be:
\[H_{0}: \mu_1 = \mu_2\]
The alternative hypothesis would be:
\[H_{1}: \mu_1 \neq \mu_2\]
The assumptions include..
Another way of stating the null hypothesis is that, if the two population means are equal, than the difference between the two population means (population mean 1 minus population mean 2) is zero, as shown below:
\[H_{0}: \mu_1 - \mu_2 = 0\]
Remember the sampling distribution of means is used to represent the results we would expect to obtain if the null hypothesis is true.
For an independent samples t-test, the sampling distribution is a t distribution that represents all the possible sample mean differences we could expect to obtain if we randomly obtained samples from two populations with equal population means and calculated the difference between each pair of sample means and plotted those sample mean differences in a distribution.
We’ll firrst talk about how to calculate standard error of the sampling distribution if performing a student’s t-test, which assumes that the population standard deviations are equal.
The standard error of the difference, aka the standard deviation of the sampling distribution of differences between means, is calculated as:
\[\sigma_{d} = \hat{\sigma_{p}} \sqrt{(\frac{1}{N_{1}} + \frac{1}{N_{2}})}\] For student’s t-test (which assumes the standard deviations of the two populations are equal), \(\hat{\sigma_{p}}\) is the pooled estimate of the population standard deviation and is calculated as:
\[\hat{\sigma_{p}} = \sqrt{\frac{{{(N_{1} - 1)\hat{\sigma_{1}^{2}} + (N_{2} - 1)\hat{\sigma_{2}^{2}}}}} {N_{1} + N_{2} - 2}}\]
For an independent samples t-test, the null hypothesis states that the means of the two populations being compared are equal. Another assumption that student’s version of the independent samples t-test makes is that the standard deviation of the two populations are equal. The pooled estimate of the population standard deviation is a weighted average of the two samples’ estimates of their population standard deviations.
The degrees of freedom for student’s t-test are N1 + N2 - 2.
If you have reason to think that the homogeneity of variances assumption has been violated, you should instead perform welch’s t-test, which does not make this assumption. The main difference between welch’s t-test and student’s t-test is in how the standard error of the difference is calculated. For welch’s t-test, you calculate it as:
\[\sigma_{d} = \sqrt{(\frac{\hat{\sigma_{1}^{2}}}{N_{1}} + \frac{\hat{\sigma_{2}^{2}}}{N_{2}})}\] This measure of standard error will be slightly larger than the measure of standard error using student’s t-test, and thus this is a more conservative test (because it will produce a smaller t-test).
The degrees of freedom for welch’s t-test are:
The t-statistic for the independent samples t-test is the difference between our two sample means divided by the standard error of the difference.
\[t = \frac{\bar X_{1} - \bar X_{2}}{\sigma_{d}}\]
Which will express, in standard error units, how far away our actually obtained sample mean difference is from the mean of our sampling distribution of differences between means (that represents all the sample mean differences we would expect if the null were true), which lets us see how unlikely our results would be if the null hypothesis is true.
First, let’s get descriptive statistics for each condition to see what’s going on. We can get the descriptive statistics separately for the two conditions being compared by using the group_by()
function.
descriptives <- data %>%
group_by(condition) %>%
summarise(n = n(),
mean = mean(cooperation, na.rm = TRUE),
sd = sd(cooperation, na.rm = TRUE))
descriptives
## # A tibble: 2 x 4
## condition n mean sd
## <fct> <int> <dbl> <dbl>
## 1 anonymous 10 40.9 9.42
## 2 public 10 55.7 14.8
As with the one-sample and paired t-tests, we can use the t.test()
function from the the built-in {stats}
package to conduct an independent samples t-test.
t.test(cooperation ~ condition, data = data, var.equal = TRUE) # Student's t-test
##
## Two Sample t-test
##
## data: cooperation by condition
## t = -2.6615, df = 18, p-value = 0.0159
## alternative hypothesis: true difference in means between group anonymous and group public is not equal to 0
## 95 percent confidence interval:
## -26.482755 -3.117245
## sample estimates:
## mean in group anonymous mean in group public
## 40.9 55.7
The syntax being used here is similar to how you will perform regression models in R. Whatever comes on the left of the tilde (~
) is the dependent variable (in this case, cooperation
) and whatever comes on the right of the tilde is the independent variable (in this case, condition
).
Since we specified var.equal = TRUE
in the t.test()
function, it calculated a Student’s t-test. If we had specified var.equal = FALSE
(the default for both t.test()
and independentSamplesTTest()
) we would have calculated a Welch’s t-test. Unlike Student’s t-test, Welch’s t-test does not assume equal variances between groups. To test whether you have violated the homogeneity of variances assumption, you can use Levene’s test, as shown below.
Let’s first perform Levene’s test to see if we have violated the homogeneity of variances assumption. A significant Levene’s test indicates a violation.
leveneTest(cooperation ~ condition, data = data, center = "mean")
## Levene's Test for Homogeneity of Variance (center = "mean")
## Df F value Pr(>F)
## group 1 1.9346 0.1812
## 18
Have we violated the homogeneity of variances assumption?
Now, let’s perform Welch’s t-test by setting var.equal = FALSE
.
t.test(cooperation ~ condition, data = data, var.equal = FALSE) # Welch's t-test
##
## Welch Two Sample t-test
##
## data: cooperation by condition
## t = -2.6615, df = 15.237, p-value = 0.0176
## alternative hypothesis: true difference in means between group anonymous and group public is not equal to 0
## 95 percent confidence interval:
## -26.636502 -2.963498
## sample estimates:
## mean in group anonymous mean in group public
## 40.9 55.7
In this case, the t-test was the same for both tests because the homogeneity of variances assumption was not violated.
We can also use the independentSamplesTTest
function in the {lsr}
package to get the output with Cohen’s d included.
Student’s t-test can be used by setting var.equal = TRUE
.
independentSamplesTTest(formula = cooperation ~ condition,
data = data,
var.equal = TRUE)
##
## Student's independent samples t-test
##
## Outcome variable: cooperation
## Grouping variable: condition
##
## Descriptive statistics:
## anonymous public
## mean 40.900 55.700
## std dev. 9.422 14.848
##
## Hypotheses:
## null: population means equal for both groups
## alternative: different population means in each group
##
## Test results:
## t-statistic: -2.661
## degrees of freedom: 18
## p-value: 0.016
##
## Other information:
## two-sided 95% confidence interval: [-26.483, -3.117]
## estimated effect size (Cohen's d): 1.19
Welch’s t-test can be used by setting var.equal = FALSE
.
independentSamplesTTest(formula = cooperation ~ condition,
data = data,
var.equal = FALSE)
##
## Welch's independent samples t-test
##
## Outcome variable: cooperation
## Grouping variable: condition
##
## Descriptive statistics:
## anonymous public
## mean 40.900 55.700
## std dev. 9.422 14.848
##
## Hypotheses:
## null: population means equal for both groups
## alternative: different population means in each group
##
## Test results:
## t-statistic: -2.661
## degrees of freedom: 15.237
## p-value: 0.018
##
## Other information:
## two-sided 95% confidence interval: [-26.637, -2.963]
## estimated effect size (Cohen's d): 1.19
Your turn! What if the research question was ‘do cooperation levels vary by sex?’
#Your code here
Let’s assign the output of the independent samples t-test to an object so we can get APA-style formatted output from it.
Unfortunately, the apa_print()
function only works with the t.test()
output, and not with the independentSamplesTTest()
output. Thus, there will be a few things missing from the output (i.e., M and SD for each condition, measure of effect size).
ist_results <- t.test(cooperation ~ condition, data = data, var.equal = TRUE) # Student's t-test
apa_print(ist_results)
## $estimate
## [1] "$\\Delta M = -14.80$, 95\\% CI $[-26.48$, $-3.12]$"
##
## $statistic
## [1] "$t(18) = -2.66$, $p = .016$"
##
## $full_result
## [1] "$\\Delta M = -14.80$, 95\\% CI $[-26.48$, $-3.12]$, $t(18) = -2.66$, $p = .016$"
##
## $table
## NULL
##
## attr(,"class")
## [1] "apa_results" "list"
A proper write-up for our Independent Sample t-test would be:
Cooperation in the public condition (M = 55.70, SD = 14.84) was much greater than cooperation in the anonymous condition (M = 40.90, SD = 9.42), t(18) = 2.66, p = .018, 95% CI [3.12, 26.64], d = 1.19.
Or, if you want to use the apa_print() output:
Cooperation in the public condition (M = 55.70, SD = 14.84) was much greater than cooperation in the anonymous condition (M = 40.90, SD = 9.42), \(t(18) = -2.66\), \(p = .016\), d = 1.19.
We can quickly plot our means and 95% confidence interval using the ggerrorplot()
function in the {ggpubr}
package.
# create plot
ggerrorplot(data,
x = "condition",
y = "cooperation",
desc_stat = "mean_ci",
color = "condition",
ylab = "Cooperation") +
# add results of the t.test
stat_compare_means(method = "t.test")
Change the ‘desc_stat’ parameter if you want to plot a different type of error bar. Make sure to add a caption, so it is clear what you are plotting.
# create plot
ggerrorplot(data,
x = "condition",
y = "cooperation",
desc_stat = "mean_se", #change to standard error
color = "condition",
ylab = "Cooperation") +
# add results of the t.test
stat_compare_means(method = "t.test") +
#add caption
labs(caption = 'Error bars indicate the SEM.')
Before when we used ggplot, we provided the data and canvas we wanted to plot and then added layers of geoms:
ggplot(data, aes(x=condition, y=cooperation)) + #create canvas
geom_point() #add geom
Sometimes you also want to plot some sort of transformation of your data. One way to do that is with stats layers. Here we want to plot summary statistics (mean and CI), so we’ll use ‘stat_summary’:
ggplot(data, aes(x = condition, y = cooperation)) + #create canvas
stat_summary(aes(color = condition), fun = mean, #specify a function to run on your data
geom="point") + #here is where you specify what geom to use
stat_summary(aes(color = condition),
fun.data = mean_ci, #specify a function to run on your data
geom="errorbar") #here is where you specify what geom to use
In order to calculate power, one way is to use the ‘pwr’ package. This package has several similar functions for different statistical tests that follow the same general logic. Today we’ll be using ‘pwr.t.test’. This function takes a sample size (n), an effect size estimate (d), an alpha level (sig.level), and a power (power). The key thing is to fill in a value for 1 of those 4 things and then leave the 4th one out to calculate it. You can then specify the type of t-test (type), and whether it is a two-sided or directional test (alternative).
Here I will make a post-hoc power calculation.
pwr.t.test(n = 10, #sample size (per group)
d = 1.19, #Cohen's d
sig.level = .05, # alpha level
power = NULL, # Set what you want to calculate to NULL or leave out entirely
type = 'two.sample', #independent sample t-test
alternative = 'two.sided' #two-tailed test
)
##
## Two-sample t test power calculation
##
## n = 10
## d = 1.19
## sig.level = 0.05
## power = 0.7112416
## alternative = two.sided
##
## NOTE: n is number in *each* group
If that is a good estimate of effect size, then our power was .711, even with such a small sample size.
In cases where your Ns are unequal, use ‘pwr.t2n.test’
pwr.t2n.test(n1=10, #n of group 1
n2=10, #n of group 2
d = 1.19, #Cohen's d
sig.level = .05, # alpha level
power = NULL, # Set what you want to calculate to NULL or leave out entirely
alternative = 'two.sided' #two-tailed test
)
##
## t test power calculation
##
## n1 = 10
## n2 = 10
## d = 1.19
## sig.level = 0.05
## power = 0.7112416
## alternative = two.sided
Your turn! What sample size do we need to replicate this same effect size with a power of .95?
#Your code here
One more! What effect size do we need to achieve a power of .9, with 30 participants per group, and an alpha level of .01?
#Your code here
An intrepid researcher submits a paper on the difference in coding abilities between Mac and PC users. They argue that PC users are better at coding than Mac users. Coding ability was operationalized with a continuous ‘ability’ metric. Despite a growing fear that you are becoming Reviewer 2, you acquire their data through GitHub to check their analyses.
Run the following lines of code to load the data:
# set seed for reproducability
set.seed(42)
# load data
data_os <- data.frame("id" = 1:1e5,
"os" = c(rep("pc", 2e4),
rep("mac", 8e4)),
"ability" = c(rnorm(2e4, 15, 90),
rnorm(8e4, 13.9, 1)))
#Your code here
#Your code here
#Your code here
ggerrorplot()
. Does this support your answer to question 3?#Your code here
#Your code here
#Your code here
#Your code here
#Your code here
#Your code here
#Your code here
My advisor told me that I had multiple errors in my code. She told me the line numbers where the errors are, but she thought it would be a good learning experience for me to try to solve the errors myself. I need your help. Fix the errors in the following chunks of code.
Error 1:
I am trying to calculate post-hoc power, but it keeps giving me a strange error message???
pwr.t2n.test(n1=30, #n of group 1
n2=20, #n of group 2
d = .5, #Cohen's d
sig.level = .05, # alpha level
power = NULL, # Set what you want to calculate to NULL or leave out entirely
type = 'two.sample', #independent sample t-test
alternative = 'two.sided' #two-tailed test
)
Error 2:
I prefer t.test()
over independentSamplesTTest()
because I’m an R purist, and I wanted to calculate a Cohen’s d value using the t2d()
function in the {psych}
package. I don’t think it is working properly. I have a t-statistic of 5.12
and the sample sizes for my two groups are 15
and 22
.
# load psych
library(psych)
# calculate cohen's d
t2d(t = 5.12, n = 15, n1 = 22)
Error 3
I was trying to plot my data from the first minihack, but I can only see part of my error bars, I want to see all of it.
ggplot(data_os, aes(x=os, y=ability)) + #create canvas
stat_summary(fun.data = mean_ci, #specify a function to run on your data
geom="errorbar") + #here is where you specify what geom to use
stat_summary(fun = mean, #specify a function to run on your data
geom="bar", #here is where you specify what geom to use
fill='blue') +
# add results of t.test
stat_compare_means(method = "t.test") +
coord_cartesian(ylim=c(12,17))