Today’s lab will guide you through the process of conducting a paired samples t-test. As we did with one-sample and independent samples t-tests, we will first cover how to conduct a paired samples t-test by hand. Then we will use functions from the {stats} and {lsr} packages to conduct the analysis. We will also discuss how to interpret the results and plot the data using {ggpubr}. At the end of today’s lab, you will be asked to apply what you learned to a new data set in the minihacks.
# load required packages
library(tidyverse) # includes dplyr and ggplot2 functions
library(lsr) # includes t-test functions
library(stats) # includes different t-test functions
library(ggpubr) # for plotting
library(rio) # for importing data
library(psych) # for descriptives
library(pwr) # for conducting a power analysis
To illustrate how paired-samples t-tests work, we are going to walk through an example from your textbook. In this example, the data comes from Dr. Chico’s introductory statistics class. Students in the class take two tests over the course of the semester. Dr. Chico gives notoriously difficult exams with the intention of motivating her students to work hard in the class and thus learn as much as possible. Dr. Chico’s theory is that the first test will serve as a “wake up call” for her students, such that when they realize how difficult the class actually is they will be motivated to study harder and earn a higher grade on the second test than they got on the first test.
You can load in the data from this example by running the following code:
# wide format
chico_wide <- import("https://raw.githubusercontent.com/uopsych/psy611/master/labs/resources/lab9/data/chico_wide.csv")
# long format
chico_long <- import("https://raw.githubusercontent.com/uopsych/psy611/master/labs/resources/lab9/data/chico_long.csv")
Note: You should now have 2 versions of the same data set loaded into your global environment. The only difference in these versions of the data is their “shape” – one is “wide” and the other is “long”. In the wide form, every row corresponds to a unique person; in the long form, every row corresponds to a unique observation or measurement.
head(chico_wide)
## id grade_test1 grade_test2
## 1 student1 42.9 44.6
## 2 student2 51.8 54.0
## 3 student3 71.7 72.3
## 4 student4 51.6 53.4
## 5 student5 63.5 63.8
## 6 student6 58.0 59.3
head(chico_long)
## id time grade
## 1 student1 test1 42.9
## 2 student2 test1 51.8
## 3 student3 test1 71.7
## 4 student4 test1 51.6
## 5 student5 test1 63.5
## 6 student6 test1 58.0
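If you ever have only one of these shapes, you can convert between them with functions from {tidyr} (loaded as part of {tidyverse}). The following is a minimal sketch, not part of the original analysis, that reshapes chico_wide into the same long format as chico_long:
# reshape the wide data into long format: one row per (student, test) observation
chico_wide %>%
  pivot_longer(cols = c(grade_test1, grade_test2),
               names_to = "time",        # which test the row refers to
               names_prefix = "grade_",  # strip the prefix, leaving "test1" / "test2"
               values_to = "grade")
The reverse transformation (long to wide) can be done with pivot_wider().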
For now, we will work with the wide-format data in chico_wide. Later on in the lab we will discuss more about how to deal with data that is in long format. Let’s take a closer look at the data before we actually run a t-test to see what might be going on…
describe(chico_wide)
## vars n mean sd median trimmed mad min max range skew
## id* 1 20 NaN NA NA NaN NA Inf -Inf -Inf NA
## grade_test1 2 20 56.98 6.62 57.7 56.92 7.71 42.9 71.7 28.8 0.05
## grade_test2 3 20 58.38 6.41 59.7 58.35 6.45 44.6 72.3 27.7 -0.05
## kurtosis se
## id* NA NA
## grade_test1 -0.35 1.48
## grade_test2 -0.39 1.43
Question: What do you notice about the means of the two groups? Can we conclude anything?
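One way to look for a pattern is to plot each student’s Test 1 grade against their Test 2 grade. This is a sketch of one such plot (the exact plot shown in lab may differ); the dashed line marks where the two grades would be equal:
# scatterplot of test 1 vs. test 2 grades, with a dashed line where the two grades are equal
ggplot(chico_wide, aes(x = grade_test1, y = grade_test2)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  labs(x = "Grade on Test 1", y = "Grade on Test 2")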
Question: What do you notice about the pattern of points in this plot?
chico_wide <- chico_wide %>%
mutate(diff = grade_test2 - grade_test1)
head(chico_wide)
## id grade_test1 grade_test2 diff
## 1 student1 42.9 44.6 1.7
## 2 student2 51.8 54.0 2.2
## 3 student3 71.7 72.3 0.6
## 4 student4 51.6 53.4 1.8
## 5 student5 63.5 63.8 0.3
## 6 student6 58.0 59.3 1.3
In our example above, we created a new variable, chico_wide$diff, that represents the difference between each student’s score on Test 2 and their score on Test 1 (i.e., grade_test2 - grade_test1), so positive values indicate improvement.
More generally, if \(X_{i1}\) is the score that the \(i\)-th participant obtained on the first variable, and \(X_{i2}\) is the score that the same person obtained on the second one, then the difference score is:
\[ \Delta_i = X_{i1} - X_{i2}\]
Notice that the difference is variable 1 minus variable 2 and not the other way around, so if we want improvement to correspond to a positive difference, we need to treat Test 2 as our “variable 1” (which is exactly what our diff variable does). The population mean of the difference scores is then:
\[ \mu_\Delta = \mu_1 - \mu_2\]
\[ H_0: \mu_\Delta = 0 \] \[ H_1: \mu_\Delta \neq 0 \]
\[t = \frac{\bar \Delta}{\hat \sigma_\Delta / \sqrt{N}}\]
To run a paired samples t-test by hand, we essentially run a one-sample t-test; the only difference is that the mean and standard deviation we plug in are the mean and standard deviation of the difference scores.
# calculate mean and sd
chico_mean <- mean(chico_wide$diff)
chico_sd <- sd(chico_wide$diff)
# print the mean and sd
c("mean" = chico_mean,
"sd" = chico_sd)
## mean sd
## 1.4050000 0.9703363
As discussed in class, we can also calculate the standard deviation of the difference scores from the standard deviations of the two sets of scores and the correlation between them, using the following equation:
\[\sqrt{\hat\sigma_{M1}^2 + \hat\sigma_{M2}^2 - 2r(\hat\sigma_{M1}\hat\sigma_{M2})}\]
# calculate standard deviation of both groups
g1_sd <- sd(chico_wide$grade_test1)
g2_sd <- sd(chico_wide$grade_test2)
# calculate the correlation of both groups
g_cor <- cor(chico_wide$grade_test1, chico_wide$grade_test2)
# calculate the standard deviation
chico_sd_alt <- sqrt(g1_sd^2 + g2_sd^2 - (2 * g_cor * (g1_sd * g2_sd)))
# print the sd calculated using the alternative method
c("sd" = chico_sd_alt)
## sd
## 0.9703363
Both methods should produce the same value for the standard deviation of the difference scores.
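As a quick sanity check (a sketch, not a step from the original lab), you can confirm that the two estimates agree:
# compare the two estimates of the standard deviation of the difference scores
all.equal(chico_sd, chico_sd_alt)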
Next we can calculate the sample size and the degrees of freedom for our test. Keep in mind that our sample size is not the total number of observations (40). Instead, it is the total number of participants (20).
# calculate sample size and degrees of freedom
chico_n <- length(chico_wide$diff)
chico_df <- chico_n - 1
# print the sample size and the degrees of freedom
c("n" = chico_n,
"df" = chico_df)
## n df
## 20 19
For the degrees of freedom, we only subtract one because, as with a one-sample t-test, we are only dealing with one mean (i.e., the mean of the difference scores).
The final step before we calculate our t-statistic is to calculate the standard error.
# calculate the standard error
chico_se <- chico_sd / sqrt(chico_n)
# print the standard error
c("se" = chico_se)
## se
## 0.2169738
Now all we have to do to calculate our t-statistic is divide the mean of the difference scores by its standard error.
# calculate the t-statistic
chico_t <- chico_mean / chico_se
# print the t-statistic
c("t" = chico_t)
## t
## 6.475436
And, as always, we can calculate a p-value for our t-statistic using the pt() function.
# calculate the p-value
chico_p <- pt(q = abs(chico_t), df = chico_df, lower.tail = FALSE) * 2
# print the p-value
c("p" = chico_p)
## p
## 0.00000332067
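Equivalently, you could compare the observed t-statistic to the critical value of the t distribution with 19 degrees of freedom. This is a sketch for comparison rather than a step from the original lab:
# two-tailed critical value for alpha = .05 with df = 19 (about 2.09)
t_crit <- qt(p = .975, df = chico_df)
# the observed t (about 6.48) falls well beyond the critical value
c("t_crit" = t_crit, "t_obs" = chico_t)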
We can also use the same code we used for our one-sample t-test and our independent samples t-test to calculate the confidence interval around the mean of the difference scores.
# calculate the confidence interval
chico_ci_low <- chico_mean + chico_se * qt(p = .025, df = chico_df, lower.tail = TRUE)
chico_ci_up <- chico_mean + chico_se * qt(p = .975, df = chico_df, lower.tail = TRUE)
# print the confidence interval
c("95% CI Lower" = chico_ci_low,
"95% CI Upper" = chico_ci_up)
## 95% CI Lower 95% CI Upper
## 0.9508686 1.8591314
And we calculate Cohen’s d in much the same way as for the one-sample and independent samples t-tests:
\[ d = \frac{\bar \Delta}{\hat \sigma_\Delta} \]
If we divide the mean of the difference scores by the standard deviation of the difference scores, we get Cohen’s d (i.e., the standardized mean of the difference scores) at the within-subjects level.
# calculate cohen's d
chico_d <- chico_mean / chico_sd
# print cohen's d
c("d" = chico_d)
## d
## 1.447952
Looks like there was a large effect! On average, students showed a large improvement on the second test.
We can conduct our t-test using one-sample t-test functions. The example immediately below uses the t.test() function in the {stats} package to conduct the one-sample t-test of the difference scores.
t.test(x = chico_wide$diff, mu = 0)
##
## One Sample t-test
##
## data: chico_wide$diff
## t = 6.4754, df = 19, p-value = 0.000003321
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 0.9508686 1.8591314
## sample estimates:
## mean of x
## 1.405
This next bit of code uses the oneSampleTTest() function from the {lsr} package to conduct the one-sample t-test of the difference scores.
oneSampleTTest(x = chico_wide$diff, mu = 0)
##
## One sample t-test
##
## Data variable: chico_wide$diff
##
## Descriptive statistics:
## diff
## mean 1.405
## std dev. 0.970
##
## Hypotheses:
## null: population mean equals 0
## alternative: population mean not equal to 0
##
## Test results:
## t-statistic: 6.475
## degrees of freedom: 19
## p-value: <.001
##
## Other information:
## two-sided 95% confidence interval: [0.951, 1.859]
## estimated effect size (Cohen's d): 1.448
The t.test() function from the {stats} package also allows you to input the raw scores (i.e., not the difference scores) and run a paired samples t-test using the paired = TRUE argument. The results will be exactly the same as running the one-sample t-test on the difference scores, except that the sign of the t-statistic and the mean difference flips, because here t.test() computes Test 1 minus Test 2.
t.test(x = chico_wide$grade_test1,
y = chico_wide$grade_test2,
paired = TRUE)
##
## Paired t-test
##
## data: chico_wide$grade_test1 and chico_wide$grade_test2
## t = -6.4754, df = 19, p-value = 0.000003321
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.8591314 -0.9508686
## sample estimates:
## mean of the differences
## -1.405
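If you want the sign of the test to match our diff variable (Test 2 minus Test 1), you can simply swap the order of the arguments, since t.test() computes the differences as x minus y for paired data. A minimal sketch:
# same test, with grade_test2 entered first so the mean difference comes out positive
t.test(x = chico_wide$grade_test2,
       y = chico_wide$grade_test1,
       paired = TRUE)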
We can also use the pairedSamplesTTest() function from the {lsr} package.
pairedSamplesTTest(formula = ~ grade_test2 + grade_test1, # one-sided formula
data = chico_wide) # wide format
##
## Paired samples t-test
##
## Variables: grade_test2 , grade_test1
##
## Descriptive statistics:
## grade_test2 grade_test1 difference
## mean 58.385 56.980 1.405
## std dev. 6.406 6.616 0.970
##
## Hypotheses:
## null: population means equal for both measurements
## alternative: different population means for each measurement
##
## Test results:
## t-statistic: 6.475
## degrees of freedom: 19
## p-value: <.001
##
## Other information:
## two-sided 95% confidence interval: [0.951, 1.859]
## estimated effect size (Cohen's d): 1.448
Note that in the example above, using the pairedSamplesTTest() function, we used the wide format data. When using wide data with pairedSamplesTTest(), you enter a one-sided formula that contains your two repeated measures conditions (e.g., ~ grade_test2 + grade_test1).
The pairedSamplesTTest() function can also be used with long data. In this case, you must use a two-sided formula: outcome ~ group. You also need to specify the name of the ID variable. Note that the grouping variable must also be a factor.
# grouping variable (time) must be a factor
chico_long <- chico_long %>%
mutate(time = as.factor(time))
pairedSamplesTTest(formula = grade ~ time, # two-sided formula
data = chico_long, # long format
id = "id") # name of the id variable
##
## Paired samples t-test
##
## Outcome variable: grade
## Grouping variable: time
## ID variable: id
##
## Descriptive statistics:
## test1 test2 difference
## mean 56.980 58.385 -1.405
## std dev. 6.616 6.406 0.970
##
## Hypotheses:
## null: population means equal for both measurements
## alternative: different population means for each measurement
##
## Test results:
## t-statistic: -6.475
## degrees of freedom: 19
## p-value: <.001
##
## Other information:
## two-sided 95% confidence interval: [-1.859, -0.951]
## estimated effect size (Cohen's d): 1.448
In addition to being produced automatically when you run pairedSamplesTTest(), the within-subjects Cohen’s d can be produced by using the cohensD() function from {lsr} with the argument method = "paired".
lsr::cohensD(x = chico_wide$grade_test1,
y = chico_wide$grade_test2,
method = "paired")
## [1] 1.447952
We can also calculate a Cohen’s d that standardizes the mean difference by the pooled (between-conditions) standard deviation, rather than by the standard deviation of the difference scores, using the method = "pooled" argument.
lsr::cohensD(x = chico_wide$grade_test1,
y = chico_wide$grade_test2,
method = "pooled")
## [1] 0.2157646
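To see where this value comes from, here is a sketch of the same calculation by hand (assuming equal group sizes, as in our data): the raw mean difference is divided by the pooled standard deviation of the two sets of scores rather than by the standard deviation of the difference scores.
# pooled sd of the two sets of scores (with equal n, a simple average of the variances)
sd_pooled <- sqrt((g1_sd^2 + g2_sd^2) / 2)
# mean difference divided by the pooled sd
c("d_pooled" = chico_mean / sd_pooled)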
A proper write-up for our paired samples t-test would be:
“A paired samples t-test was used to compare scores on Dr. Chico’s first and second exam. The students scored substantially higher on the second test (M = 58.39, SD = 6.41) than on the first test (M = 56.98, SD = 6.62), t(19) = 6.48, p < .001, 95% CI [0.95, 1.86], d = 1.45.”
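Rather than transcribing these numbers by hand, you can also pull them out of the object returned by t.test(), which is a standard htest list. A minimal sketch:
# store the test result and extract the pieces reported in the write-up
chico_test <- t.test(x = chico_wide$diff, mu = 0)
chico_test$statistic   # t-statistic
chico_test$parameter   # degrees of freedom
chico_test$p.value     # p-value
chico_test$conf.int    # 95% confidence interval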
When plotting paired samples data, we want some way to clearly represent the repeated measures structure of the data. One way to do this is to draw a line between each pair of data points. This can be done with the ggpaired() function from {ggpubr}.
# wide format
ggpaired(chico_wide,
cond1 = "grade_test1",
cond2 = "grade_test2",
color = "condition",
line.color = "gray",
line.size = 0.4,
palette = "jco")
# long format
ggpaired(chico_long,
x = "time",
y = "grade",
color = "time",
line.color = "gray",
line.size = 0.4,
palette = "jco")
You are welcome to work with a partner or in a small group of 2-3 people. Please feel free to ask the lab leader any questions you might have!
A clinical psychologist wants to know whether a new cognitive-behavioral therapy (CBT) program helps alleviate anxiety. He enrolls 12 individuals diagnosed with an anxiety disorder in a 6-week CBT program. Participants are given an Anxiety Scale before they begin and after they complete treatment.
Import the data by running the following code:
cbt_data <- import("https://raw.githubusercontent.com/uopsych/psy611/master/labs/resources/lab9/data/cbt_data.csv")
# your code here
# your code here
Plot the data using ggpaired().
# your code here
pwr.t.test(n = ???,
d = ???,
sig.level = .05,
power = NULL,
type = "paired",
alternative = "two.sided")
# your code here
# your code here
pwr.t.test(n = NULL,
d = ???,
sig.level = .05,
power = ???,
type = "paired",
alternative = "two.sided")
# your code here
You are reviewing a paper that argues people are more cynical after reading the Watchmen graphic novel than before reading the novel. Participant cynicism was operationalized as a participant’s score on the Views subscale of the Mach-IV (scored out of 5). You decide to rerun the authors’ analyses. Use the values below to conduct a paired-samples t-test.
\(\hat{\mu}_{Time1}\) = 1.24
\(\hat{\mu}_{Time2}\) = 4.87
\(\hat{\sigma}_{Time1}\) = 1.22
\(\hat{\sigma}_{Time2}\) = 1.30
\(r_{Time1Time2}\) = .30
\(N_{TOTAL}\) = 20
# your code here
# your code here
# your code here