Today we will be analyzing data from Fox and Guyer’s (1978) anonymity and cooperation study again. The data is included in the {carData} package, and you can see information about the data set using ?Guyer. Twenty groups of four participants each played 30 trials of the prisoner’s dilemma game. The number of cooperative choices (cooperation) made by each group was scored out of 120 (i.e., cooperative choices made by 4 participants over 30 trials). The groups either made decisions publicly or privately (condition), and the groups were comprised of either all women or all men (sex).
Run the following code to load the data into your global environment.
# load data (requires the {carData} package)
library(carData)
data <- Guyer
Last time, we wanted to compare the mean cooperation level to an experimenter-specified level. But what if we wanted to compare the mean level of cooperation when decisions were made publicly versus the mean level of cooperation when decisions were made anonymously? In that case, we would use an independent samples t-test, which compares the means of, well, two independent samples. For instance, I might suspect that mean cooperation will be greater when decisions about cooperation are made publicly than when they are made anonymously. The alternative hypothesis (\(H_{1}\)), in that case, would be that the mean cooperation in the public group is not equal to the mean cooperation in the anonymous group (\(\mu_1 \neq \mu_2\)). The null hypothesis (\(H_{0}\)) would be that the means of the two groups are equal (\(\mu_1 = \mu_2\)).
To calculate an independent samples t-test by hand, we will want to start by splitting our data frame into two data frames according to whether the groups made decisions anonymously or publicly.
# split the data by condition ({dplyr} provides %>% and filter())
data_public <- data %>%
  filter(condition == "public")
data_anonymous <- data %>%
  filter(condition == "anonymous")
Then, as we did for the one-sample t-test, we want to calculate descriptive statistics for both of the conditions.
# calculate group1 descriptives
group1_mean <- mean(data_public$cooperation)
group1_sd <- sd(data_public$cooperation)
group1_var <- group1_sd^2
group1_n <- length(data_public$cooperation)
# calculate group2 descriptives
group2_mean <- mean(data_anonymous$cooperation)
group2_sd <- sd(data_anonymous$cooperation)
group2_var <- group2_sd^2
group2_n <- length(data_anonymous$cooperation)
# print the descriptive statistics
list("public" = c("mean" = group1_mean,
"sd" = group1_sd,
"var" = group1_var,
"n" = group1_n),
"anonymous" = c("mean" = group2_mean,
"sd" = group2_sd,
"var" = group2_var,
"n" = group2_n))
## $public
## mean sd var n
## 55.70000 14.84775 220.45556 10.00000
##
## $anonymous
## mean sd var n
## 40.900000 9.421606 88.766667 10.000000
As we can see from the statistics, the number of groups in each condition was equal (\(n_1 = 10\); \(n_2 = 10\)). The mean (55.70) and standard deviation (14.85) of cooperation in the public condition were both larger than the mean (40.90) and standard deviation (9.42) of cooperation in the anonymous condition.
To calculate the t-statistic, we will use a similar equation to the one used for the one-sample t-test, but the numerator will be the difference between the condition means (\(\bar X_1 - \bar X_2\)) and the denominator will be the standard error of the difference between the means (\(\hat \sigma_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}\)).
\[t = \frac{\bar X_1 - \bar X_2}{\hat \sigma_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}}\]
The standard error of the difference between the means is the pooled standard deviation (\(\hat \sigma_p\)) multiplied by a sample size correction (\(\sqrt{\frac{1}{N_1} + \frac{1}{N_2}}\)). The pooled standard deviation is the square root of the weighted average of the two group variances:
\[\hat \sigma_p = \sqrt{\frac{((N_1 - 1) \times \hat \sigma_1^2) + ((N_2 - 1) \times \hat \sigma_2^2)}{(N_1 - 1) + (N_2 - 1)}}\]
Below is the code for calculating the t-statistic according to the equations above.
# calculate the t-statistic
mean_diff <- group1_mean - group2_mean
group1_w <- group1_n - 1
group2_w <- group2_n - 1
sd_pooled <- sqrt(((group1_w * group1_var) + (group2_w * group2_var)) / (group1_w + group2_w))
n_correction <- sqrt((1 / group1_n) + (1 / group2_n))
se <- sd_pooled * n_correction
t_stat <- mean_diff / se
# print the t-statistic
c("t_stat" = t_stat)
## t_stat
## 2.661499
Next, we will calculate a p-value for the t-statistic. The code for calculating the p-value is exactly the same as before, but we will use \(\mbox{df} = N-2\) for the degrees of freedom instead of \(\mbox{df} = N-1\) because we estimated two means from the data.
# calculate degrees of freedom
df <- group1_n + group2_n - 2
# print out the degrees of freedom
c("df" = df)
## df
## 18
Now that we have our t-statistic and degrees of freedom, we can calculate the p-value.
# calculate p-value
p_val <- pt(q = abs(t_stat), df = df, lower.tail = FALSE) * 2
# print p-value
c("p_val" = p_val)
## p_val
## 0.01589758
Looks like the probability of obtaining a difference in means of this size or greater, when the null hypothesis is true, is 1.59%.
The code for calculating Cohen’s d is a bit more complex than what we used above, but, again, we are simply expressing the difference in means in standardized units.
# calculate cohen's d
d_val <- mean_diff / sqrt((group1_sd^2 + group2_sd^2) / 2)
# print cohen's d
c("d_val" = d_val)
## d_val
## 1.190259
Looks like cooperation when decisions were made publicly was far higher than cooperation when decisions were made anonymously.
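As a quick check on our hand calculation, we could use the cohensD() function from the {lsr} package (which we will use again below). Its default pooled-standard-deviation formula matches our calculation here because the two groups are the same size.
# verify our hand-calculated Cohen's d with {lsr}
library(lsr)
cohensD(cooperation ~ condition, data = data)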
We can also calculate the 95% confidence interval for this difference.
# calculate upper and lower limits of 95% CI
ci_low <- mean_diff + (se * qt(.025, df = df))
ci_up <- mean_diff + (se * qt(.975, df = df))
# print 95% confidence interval
c("95% CI Lower" = ci_low,
"95% CI Upper" = ci_up)
## 95% CI Lower 95% CI Upper
## 3.117245 26.482755
Again, 95% confidence means that if we repeated this procedure many times, 95% of the intervals we constructed would contain the true population difference between the means.
As with the one-sample and paired t-tests, we can use the t.test() function from the built-in {stats} package to conduct an independent samples t-test.
t.test(data_public$cooperation, data_anonymous$cooperation, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_public$cooperation and data_anonymous$cooperation
## t = 2.6615, df = 18, p-value = 0.0159
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3.117245 26.482755
## sample estimates:
## mean of x mean of y
## 55.7 40.9
Since we specified var.equal = TRUE in the t.test() function, it calculated a Student’s t-test. If we had specified var.equal = FALSE (the default for both t.test() and independentSamplesTTest()), we would have calculated a Welch’s t-test. Unlike Student’s t-test, Welch’s t-test does not assume equal variances between groups. However, when the variances and sample sizes are equal, Welch’s t-test will produce a t-statistic that is nearly indistinguishable from the one produced by Student’s t-test. There is really no need to ever use Student’s t-test (other than in cases with very small sample sizes).
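For example (a minimal sketch reusing the two vectors from above), simply omitting var.equal runs Welch’s test by default:
# Welch's t-test (var.equal defaults to FALSE)
t.test(data_public$cooperation, data_anonymous$cooperation)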
We can also use the independentSamplesTTest() function in the {lsr} package to get the output with Cohen’s d included.
independentSamplesTTest(formula = cooperation ~ condition,
                        data = data,
                        var.equal = TRUE)
##
## Student's independent samples t-test
##
## Outcome variable: cooperation
## Grouping variable: condition
##
## Descriptive statistics:
## anonymous public
## mean 40.900 55.700
## std dev. 9.422 14.848
##
## Hypotheses:
## null: population means equal for both groups
## alternative: different population means in each group
##
## Test results:
## t-statistic: -2.661
## degrees of freedom: 18
## p-value: 0.016
##
## Other information:
## two-sided 95% confidence interval: [-26.483, -3.117]
## estimated effect size (Cohen's d): 1.19
The formula argument for independentSamplesTTest() may look unfamiliar to some of you. It uses formula syntax, which you will use a lot when you start analyzing data using multiple regression models. In brief, whatever comes on the left of the tilde (~) is the dependent variable (in this case, cooperation) and whatever comes on the right of the tilde is the independent variable (in this case, condition).
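The same formula syntax also works with t.test(), so the Student’s t-test above could equivalently be written as below (note that the sign of the t-statistic depends on which group R treats as the first factor level):
# the same test, using the formula interface
t.test(cooperation ~ condition, data = data, var.equal = TRUE)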
Your turn! What is the code to compare cooperation scores based on sex?
independentSamplesTTest(formula = cooperation ~ sex,
                        data = data,
                        var.equal = TRUE)
##
## Student's independent samples t-test
##
## Outcome variable: cooperation
## Grouping variable: sex
##
## Descriptive statistics:
## female male
## mean 48.800 47.800
## std dev. 15.455 13.839
##
## Hypotheses:
## null: population means equal for both groups
## alternative: different population means in each group
##
## Test results:
## t-statistic: 0.152
## degrees of freedom: 18
## p-value: 0.881
##
## Other information:
## two-sided 95% confidence interval: [-12.782, 14.782]
## estimated effect size (Cohen's d): 0.068
A proper write-up for our independent samples t-test would be:
"Cooperation in the public condition (M = 55.70, SD = 14.85) was significantly greater than cooperation in the anonymous condition (M = 40.90, SD = 9.42), t(18) = 2.66, p = .016, 95% CI [3.12, 26.48], d = 1.19."
We can quickly plot our means and 95% confidence intervals using the ggerrorplot() function in the {ggpubr} package.
# create plot
ggerrorplot(data,
            x = "condition",
            y = "cooperation",
            desc_stat = "mean_ci",
            color = "condition",
            ylab = "Cooperation") +
  # add results of the t.test
  stat_compare_means(method = "t.test")
Change the ‘desc_stat’ argument if you want to plot a different type of error bar. Make sure to add a caption so it is clear what you are plotting.
# create plot
ggerrorplot(data,
            x = "condition",
            y = "cooperation",
            desc_stat = "mean_se", # change to standard error
            color = "condition",
            ylab = "Cooperation") +
  # add results of the t.test
  stat_compare_means(method = "t.test") +
  # add caption
  labs(caption = 'Error bars indicate the SEM.')
Previously, when we used ggplot(), we provided the data and the canvas we wanted to plot on, and then added layers of geoms:
ggplot(data, aes(x=condition, y=cooperation)) + #create canvas
  geom_point() #add geom
Sometimes you also want to plot some sort of transformation of your data. One way to do that is with stat layers. Here we want to plot summary statistics (the mean and its CI), so we’ll use ‘stat_summary’:
ggplot(data, aes(x=condition, y=cooperation)) + #create canvas
  stat_summary(fun = mean, #specify a function to run on your data
               geom="point") + #here is where you specify what geom to use
  stat_summary(fun.data = mean_ci, #specify a function to run on your data
               geom="errorbar") #here is where you specify what geom to use
One way to calculate power is to use the {pwr} package. This package has several similar functions for different statistical tests that follow the same general logic. Today we’ll be using ‘pwr.t.test’. This function takes a sample size (n), an effect size estimate (d), an alpha level (sig.level), and a power (power). The key is to fill in values for three of those four arguments and leave the fourth one out (or set it to NULL) so the function can calculate it. You can also specify the type of t-test (type) and whether it is a two-sided or directional test (alternative).
Here I will perform a post-hoc power calculation.
pwr.t.test(n = 10, #sample size (per group)
           d = 1.19, #Cohen's d
           sig.level = .05, # alpha level
           power = NULL, # Set what you want to calculate to NULL or leave out entirely
           type = 'two.sample', #independent sample t-test
           alternative = 'two.sided' #two-tailed test
)
##
## Two-sample t test power calculation
##
## n = 10
## d = 1.19
## sig.level = 0.05
## power = 0.7112416
## alternative = two.sided
##
## NOTE: n is number in *each* group
If that is a good estimate of effect size, then our power was .711, even with such a small sample size.
In cases where your Ns are unequal, use ‘pwr.t2n.test’, which takes each group’s n separately:
pwr.t2n.test(n1 = 10, #n of group 1
             n2 = 10, #n of group 2
             d = 1.19, #Cohen's d
             sig.level = .05, # alpha level
             power = NULL, # Set what you want to calculate to NULL or leave out entirely
             alternative = 'two.sided' #two-tailed test
)
##
## t test power calculation
##
## n1 = 10
## n2 = 10
## d = 1.19
## sig.level = 0.05
## power = 0.7112416
## alternative = two.sided
Your turn! What sample size do we need to detect this same effect size with a power of .95?
pwr.t.test(n = NULL, #sample size (per group)
           d = 1.19, #Cohen's d
           sig.level = .05, # alpha level
           power = .95, #power
           type = 'two.sample', #independent sample t-test
           alternative = 'two.sided' #two-tailed test
)
##
## Two-sample t test power calculation
##
## n = 19.36879
## d = 1.19
## sig.level = 0.05
## power = 0.95
## alternative = two.sided
##
## NOTE: n is number in *each* group
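Since n came out to about 19.4 and we cannot recruit a fraction of a participant, we would round up and collect 20 participants per group.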
One more! What effect size do we need to achieve a power of .9, with 30 participants per group, and an alpha level of .01?
pwr.t.test(n = 30, #sample size (per group)
           sig.level = .01, # alpha level
           power = .9, #power
           type = 'two.sample', #independent sample t-test
           alternative = 'two.sided' #two-tailed test
)
##
## Two-sample t test power calculation
##
## n = 30
## d = 1.025559
## sig.level = 0.01
## power = 0.9
## alternative = two.sided
##
## NOTE: n is number in *each* group
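In other words, with 30 participants per group and an alpha level of .01, we would only have 90% power to detect true effects of about d = 1.03 or larger.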
An intrepid researcher submits a paper on the difference in coding ability between Mac and PC users. They argue that PC users are better at coding than Mac users. Coding ability was operationalized with a continuous ‘ability’ metric. Despite a growing fear that you are becoming Reviewer 2, you acquire their data through GitHub to check their analyses.
Run the following lines of code to load the data:
# set seed for reproducibility
set.seed(42)
# load data
data_os <- data.frame("id" = 1:1e5,
                      "os" = c(rep("pc", 2e4),
                               rep("mac", 8e4)),
                      "ability" = c(rnorm(2e4, 15, 90),
                                    rnorm(8e4, 13.9, 1)))
independentSamplesTTest(ability ~ os, data = data_os, var.equal = TRUE)
##
## Student's independent samples t-test
##
## Outcome variable: ability
## Grouping variable: os
##
## Descriptive statistics:
## mac pc
## mean 13.896 14.528
## std dev. 1.002 90.623
##
## Hypotheses:
## null: population means equal for both groups
## alternative: different population means in each group
##
## Test results:
## t-statistic: -1.971
## degrees of freedom: 99998
## p-value: 0.049
##
## Other information:
## two-sided 95% confidence interval: [-1.26, -0.004]
## estimated effect size (Cohen's d): 0.016
independentSamplesTTest(ability ~ os, data = data_os, var.equal = FALSE)
##
## Welch's independent samples t-test
##
## Outcome variable: ability
## Grouping variable: os
##
## Descriptive statistics:
## mac pc
## mean 13.896 14.528
## std dev. 1.002 90.623
##
## Hypotheses:
## null: population means equal for both groups
## alternative: different population means in each group
##
## Test results:
## t-statistic: -0.986
## degrees of freedom: 20000.22
## p-value: 0.324
##
## Other information:
## two-sided 95% confidence interval: [-1.888, 0.624]
## estimated effect size (Cohen's d): 0.01
# The Cohen's d is about .02. It is a very small effect.
Plot the data using the ggerrorplot() function. Does this support your answer to question 3?
# create plot
ggerrorplot(data_os,
            x = "os",
            y = "ability",
            desc_stat = "mean_ci",
            color = "os",
            ylab = "Ability") +
  # add results of t.test
  stat_compare_means(method = "t.test")
ggplot(data_os, aes(x=os, y=ability)) + #create canvas
  stat_summary(fun = mean, #specify a function to run on your data
               geom="point") + #here is where you specify what geom to use
  stat_summary(fun.data = mean_ci, #specify a function to run on your data
               geom="errorbar") + #here is where you specify what geom to use
  # add results of t.test
  stat_compare_means(method = "t.test") +
  coord_cartesian(ylim=c(12,17))
# proportion of overlap between two normal distributions separated by d = .02
(overlap <- 2 * pnorm(-abs(.02)/2))
## [1] 0.9920213
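In other words, with an effect this small the two distributions overlap almost completely (about 99.2%).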
# keep only the PC users
dat_pc <- data_os %>%
  filter(os == 'pc')
# one-sample t-test: does the PC users' mean ability differ from 14?
t.test(dat_pc$ability, mu = 14)
##
## One Sample t-test
##
## data: dat_pc$ability
## t = 0.82379, df = 19999, p-value = 0.4101
## alternative hypothesis: true mean is not equal to 14
## 95 percent confidence interval:
## 13.27186 15.78392
## sample estimates:
## mean of x
## 14.52789
# n per group for 80% power, using the mean of three effect size values (d = .3)
pwr.t.test(d=mean(c(.1,.6,.2)), power=.8)
##
## Two-sample t test power calculation
##
## n = 175.3847
## d = 0.3
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
# the same calculation for a paired design
pwr.t.test(d=mean(c(.1,.6,.2)), power=.8, type ='paired')
##
## Paired t test power calculation
##
## n = 89.14938
## d = 0.3
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number of *pairs*
pow_out <- pwr.t.test(d=mean(c(.1,.6,.2)), power=.8, type ='paired')
# round the required n up to a whole number of pairs
ceiling(pow_out$n)
To achieve a power of .8, I would need 90 pairs of plants.
My advisor told me that I had multiple errors in my code. She told me the line numbers where the errors are, but she thought it would be a good learning experience for me to try to solve the errors myself. I need your help. Fix the errors in the following chunks of code.
Error 1:
I am trying to calculate post-hoc power, but it keeps giving me a strange error message???
pwr.t2n.test(n1=30, #n of group 1
             n2=20, #n of group 2
             d = .5, #Cohen's d
             sig.level = .05, # alpha level
             power = NULL, # Set what you want to calculate to NULL or leave out entirely
             type = 'two.sample', #independent sample t-test
             alternative = 'two.sided' #two-tailed test
)
Error 2:
I prefer t.test() over independentSamplesTTest() because I’m an R purist, and I wanted to calculate a Cohen’s d value using the t2d() function in the {psych} package. I don’t think it is working properly. I have a t-statistic of 5.12 and the sample sizes for my two groups are 15 and 22.
# load psych
library(psych)
# calculate cohen's d
t2d(t = 5.12, n = 15, n1 = 22)
Error 3
I was trying to plot my data from the first minihack, but I can only see part of my error bars; I want to see all of them.
ggplot(data_os, aes(x=os, y=ability)) + #create canvas
  stat_summary(fun.data = mean_ci, #specify a function to run on your data
               geom="errorbar") + #here is where you specify what geom to use
  stat_summary(fun = mean, #specify a function to run on your data
               geom="bar", #here is where you specify what geom to use
               fill='blue') +
  # add results of t.test
  stat_compare_means(method = "t.test") +
  coord_cartesian(ylim=c(12,17))