Today’s lab will guide you through the process of conducting a One-Sample t-test and a Paired Samples t-test. For each test, we will first go through the process of conducting the tests using arithmetic and probability distributions in R. Then we will compare and contrast t-test functions from the {stats} and {lsr} packages.

To quickly navigate to the desired section, click one of the following links:

  1. One-Sample t-test
  2. Paired Samples t-test
  3. Minihacks

As always, there are minihacks to test your knowledge.


One-Sample t-test

The Data

In our first example, we will be analyzing data from Fox and Guyer's (1978) anonymity and cooperation study. The data is included in the {carData} package, and you can see information about the data set using ?Guyer. Twenty groups of four participants each played 30 trials of the prisoner's dilemma game. The number of cooperative choices (cooperation) made by each group was scored out of 120 (i.e., cooperative choices made by 4 participants over 30 trials).

Run the following code to load the data into your global environment.

# load the {carData} package and the data
library(carData)
dat <- Guyer

What is a One-Sample t-test?

A One-Sample t-test tests whether some obtained sample mean is significantly different from a value specified by the researcher. Looking at the Guyer data, we might hypothesize that groups would cooperate on more than 50% of the trials (i.e., groups would cooperate on more than 60 trials). The alternative hypothesis (\(H_{1}\)) would therefore be that the population mean of cooperation is not equal to 60 (\(\mu \neq 60\)), whereas the null hypothesis (\(H_{0}\)) would be that the population mean of cooperation is equal to 60 (\(\mu = 60\)).

You might have noticed that our alternative hypothesis was that the mean level of cooperation was not equal to 60 (\(\mu \neq 60\)) rather than that the mean level of cooperation was greater than 60 (\(\mu > 60\)). This is because we plan to run a two-tailed significance test. A two-tailed significance test allows us to declare significance if the mean cooperation is sufficiently greater than 60 or sufficiently less than 60. Two-tailed tests are far more common than one-tailed tests, and you should generally default to running a two-tailed test.
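If you did want a one-tailed test, the t.test() function (formally introduced later in this lab) accepts an alternative argument. A minimal sketch, shown for illustration only, since we will stick with the two-tailed test:

# one-tailed test of whether mean cooperation is greater than 60
t.test(x = dat$cooperation, mu = 60, alternative = "greater")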

Using Arithmetic

Most statistical tests look at a ratio of signal to noise. The one-sample t-test is no different.

Let’s look at the equation.

\[t = \frac{\bar X - \mu}{\hat \sigma / \sqrt{N}}\]

The numerator (\(\bar X - \mu\)) is the “signal”, or the difference between our sample mean and the value we are comparing it against, and the denominator (\(\hat \sigma / \sqrt{N}\)) is the “noise”, or the inaccuracy (standard error) of our estimate. If we have more “signal” in our data, our t-statistic increases. If we have more “noise” in our data, our t-statistic decreases.

To make our lives easier, we will first calculate some descriptive statistics: (1) the mean of cooperation in our sample (coop_mean), (2) the standard deviation of cooperation in our sample (coop_sd), and (3) the number of observations in our data (coop_n). We should also calculate the degrees of freedom for our data (coop_df). For a one-sample t-test, the degrees of freedom is equal to \(N - 1\).

# calculate descriptives
coop_mean <- mean(dat$cooperation)
coop_sd   <- sd(dat$cooperation)
coop_n    <- length(dat$cooperation)

# calculate degrees of freedom
coop_df   <- coop_n - 1

# look at the values
c("mean" = coop_mean,
  "sd"   = coop_sd,
  "n"    = coop_n,
  "df"   = coop_df)
##     mean       sd        n       df 
## 48.30000 14.28691 20.00000 19.00000

As we can see, the mean value of cooperation is 48.30.

It looks like the average cooperation was actually less than 60, but this could simply be due to chance. Let's calculate the probability of obtaining a result at least as extreme as ours if the null hypothesis were, in fact, true.

To do so, we will first calculate the t-statistic, which is as easy as inserting our acquired descriptive statistics into the equation above.

\[\frac{48.3 - 60}{14.29 / \sqrt{20}} = -3.66\]

Of course, we can also calculate this in R using the following code:

# calculate the t-statistic
coop_t <- (coop_mean - 60) / (coop_sd / sqrt(coop_n))

# print the t-statistic
c("t_stat" = coop_t)
##    t_stat 
## -3.662373

The t-statistic (-3.662373) is negative because our acquired mean was less than the mean it was tested against. If the acquired mean were greater than the value it was tested against, the t-statistic would be positive. It is completely fine to have a negative t-statistic, and you can treat it exactly the same as a positive t-statistic (in fact, some researchers will drop the negative sign when reporting a negative t-statistic).

Next, we can use the pt() function to calculate the probability of getting a t-statistic of -3.662373 or less from a t-distribution with 19 degrees of freedom when the null hypothesis is true. The pt() function takes three arguments: (1) q (the t-statistic), (2) df (the degrees of freedom), and (3) lower.tail (whether we want to find the cumulative probability for the upper or lower part of the distribution). Below I've taken the absolute value of our t-statistic (using abs()) so that we can use lower.tail = FALSE regardless of whether our t-statistic is positive or negative.

To calculate a two-tailed p-value using our t-statistic, we would run the following code:

# calculate p-value
coop_p <- pt(q = abs(coop_t), df = coop_df, lower.tail = FALSE) * 2

#print p-value
c("p_val" = coop_p)
##       p_val 
## 0.001655805

We get a p-value of .001655805. In other words, if the null hypothesis were true, there would be an approximately .17% chance of getting a t-statistic at least as extreme as the one we observed.

The result is multiplied by 2 because we wanted a two-tailed significance test, one that allows us to claim significance irrespective of whether the sample mean fell sufficiently above or sufficiently below 60.
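To see the doubling logic concretely, here is a small sketch (not part of the original lab code) that computes both one-tailed probabilities and doubles the smaller one; it reproduces the p-value above:

# probability in each tail for our observed t-statistic
p_lower <- pt(coop_t, df = coop_df, lower.tail = TRUE)  # P(t <= -3.66)
p_upper <- pt(coop_t, df = coop_df, lower.tail = FALSE) # P(t >= -3.66)

# doubling the smaller tail gives the two-tailed p-value (matches coop_p)
min(p_lower, p_upper) * 2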

We could have run a one-sided t-test, but then we could only have claimed significance if the sample mean fell in the hypothesized direction; a result in the opposite direction, no matter how extreme, would have been inconclusive.

We can also calculate an effect size for our statistic. Cohen's d is a very popular measure of effect size for the t-test, and it tells you the size of your effect (the difference between the acquired mean and the value it is compared against) in standardized units. To calculate Cohen's d, we simply divide the difference between the means by the standard deviation.

# calculate Cohen's d
coop_d <- (coop_mean - 60) / coop_sd

# print Cohen's d
coop_d
## [1] -0.8189315

Typically, we present a Cohen’s d value as a positive number. As with the t-statistic, Cohen’s d being positive or negative just reflects whether the acquired mean was greater than the specified mean or less than the specified mean.

The thresholds for a small, medium, and large effect size are shown below.

d value   interpretation
-------   --------------
0.2       small
0.5       medium
0.8       large

Interpreting our Cohen’s d value using the threshold from the table, we might say that there was a large difference between our acquired mean (48.30) and the mean we were comparing it against (60.00).
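If you will be applying these thresholds often, you could wrap them in a small helper function. The interpret_d() function below is purely hypothetical (it is not from any package), and the cutoffs are rough rules of thumb rather than strict boundaries:

# hypothetical helper mapping |d| onto the conventional thresholds above
interpret_d <- function(d) {
  cut(abs(d),
      breaks = c(0, 0.2, 0.5, 0.8, Inf),
      labels = c("negligible", "small", "medium", "large"),
      right  = FALSE)
}

interpret_d(coop_d) # "large"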

Finally, we can also derive the 95% confidence interval for our sample mean by running the following code:

# calculate 95% confidence interval
coop_low  <- coop_mean + ((coop_sd / sqrt(coop_n)) * qt(.025, df = coop_df))
coop_up   <- coop_mean + ((coop_sd / sqrt(coop_n)) * qt(.975, df = coop_df))

# print 95% confidence interval
c("95% CI Lower" = coop_low,
  "95% CI Upper" = coop_up)
## 95% CI Lower 95% CI Upper 
##     41.61352     54.98648

The code may seem like a bit of a mess, but we are just taking our acquired mean (coop_mean) and adding to it the standard error of the mean (coop_sd / sqrt(coop_n)) multiplied by the critical t-statistics for a t-distribution with 19 degrees of freedom (qt(.025, df = coop_df) and qt(.975, df = coop_df), respectively). The correct interpretation of the 95% confidence interval is about the procedure, not about any single interval: if we derived 95% confidence intervals for 100 different samples, we would expect about 95 of them to contain the true population mean. Strictly speaking, it is not correct to say that there is a 95% chance that this particular interval contains the true population mean.

The image above shows our acquired mean with a 95% confidence interval.
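If the coverage interpretation feels abstract, a quick simulation can make it concrete. The sketch below is not part of the original lab, and the population values it assumes are arbitrary:

# simulate 95% CI coverage for samples of 20 from a known population
set.seed(123)

true_mean <- 60

covered <- replicate(10000, {
  x  <- rnorm(20, mean = true_mean, sd = 14)
  se <- sd(x) / sqrt(20)
  ci <- mean(x) + se * qt(c(.025, .975), df = 19)
  ci[1] <= true_mean & true_mean <= ci[2]
})

# proportion of intervals containing the true mean (should be close to .95)
mean(covered)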

Using Functions

There are two useful functions for conducting a one-sample t-test in R. The first is called t.test(), and it is automatically loaded as part of the {stats} package when you first open R. To run a one-sample t-test using t.test(), you provide the function the column of the data you are interested in (e.g., x = dat$cooperation) and the mean value you want to compare the data against (e.g., mu = 60).

t.test(x = dat$cooperation, mu = 60)
## 
##  One Sample t-test
## 
## data:  dat$cooperation
## t = -3.6624, df = 19, p-value = 0.001656
## alternative hypothesis: true mean is not equal to 60
## 95 percent confidence interval:
##  41.61352 54.98648
## sample estimates:
## mean of x 
##      48.3

As part of the output, you are provided the mean of cooperation, the t-statistic, the degrees of freedom, the p-value, and the 95% confidence interval. The values output by t.test() should be exactly the same as the values we calculated by hand. Unfortunately, we did not get a measure of effect size.
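Before we address the missing effect size, one thing worth knowing: t.test() returns a list (an object of class "htest"), so you can pull out individual values with $ if you need them for later calculations. A minimal sketch:

# t.test() returns an "htest" list; extract pieces with $
fit <- t.test(x = dat$cooperation, mu = 60)

fit$statistic # the t-statistic
fit$p.value   # the p-value
fit$conf.int  # the 95% confidence interval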

The oneSampleTTest() function from the {lsr} package includes Cohen's d automatically, but the downside is that you have to load the package separately.

# load the {lsr} package
library(lsr)

oneSampleTTest(x = dat$cooperation, mu = 60)
## 
##    One sample t-test 
## 
## Data variable:   dat$cooperation 
## 
## Descriptive statistics: 
##             cooperation
##    mean          48.300
##    std dev.      14.287
## 
## Hypotheses: 
##    null:        population mean equals 60 
##    alternative: population mean not equal to 60 
## 
## Test results: 
##    t-statistic:  -3.662 
##    degrees of freedom:  19 
##    p-value:  0.002 
## 
## Other information: 
##    two-sided 95% confidence interval:  [41.614, 54.986] 
##    estimated effect size (Cohen's d):  0.819

As you can see from the output, oneSampleTTest() provides you all of the information that t.test() did, but it also includes Cohen’s d.

Interpretation and Write-Up

You can use the apa_print() function from the {papaja} package to efficiently write up your results in APA format.

# load {papaja}
library(papaja)

# First I'll run my t-test again, but this time I'll save it to a variable:
toutput <- t.test(x = dat$cooperation, mu = 60)

#Let's look at our output options
apa_print(toutput)
## $estimate
## [1] "$M = 48.30$, 95\\% CI $[41.61$, $54.99]$"
## 
## $statistic
## [1] "$t(19) = -3.66$, $p = .002$"
## 
## $full_result
## [1] "$M = 48.30$, 95\\% CI $[41.61$, $54.99]$, $t(19) = -3.66$, $p = .002$"
## 
## $table
## NULL
## 
## attr(,"class")
## [1] "apa_results" "list"

Depending on what object you give to apa_print(), it may give you different commonly reported statistics from that test. You can access these with $. Here is an example of how I might report this:

"Average cooperation (\(M = 48.30\), 95% CI \([41.61\), \(54.99]\)) was significantly less than 60, \(t(19) = -3.66\), \(p = .002\).

Visualization

To plot this, we’re just plotting one variable. You already know how to do this! Here is one option. I’ve plotted a boxplot below.

# the data set
dat %>%
  # the canvas
  ggplot(aes(y = cooperation)) +
    # the geometric elements
       geom_boxplot() +
    # set theme
    theme_bw() +
    # add labels 
    labs(title    = "Average Cooperation Scores",
         y        = "Cooperation")

Or I might want to plot a histogram. I used the additional geom, geom_vline(), to plot a line representing the theorized population mean.

# the data set
dat %>%
  # the canvas
  ggplot(aes(x = cooperation)) +
    # the geometric elements
       geom_histogram(
                   fill   = "purple",
                   colour = "black",
                   alpha  = .6,
                   bins   = 5) +
  # add vertical line
  geom_vline(xintercept = 60,
             color      = "blue",
             linetype   = "dashed") +
    # set theme
    theme_bw() +
    # add labels 
    labs(title    = "Histogram of Cooperation Scores",
         x        = "Cooperation")


Paired Samples t-test

To illustrate how paired-samples t-tests work, we are going to walk through an example from your textbook. In this example, the data comes from Dr. Chico’s introductory statistics class. Students in the class take two tests over the course of the semester. Dr. Chico gives notoriously difficult exams with the intention of motivating her students to work hard in the class and thus learn as much as possible. Dr. Chico’s theory is that the first test will serve as a “wake up call” for her students, such that when they realize how difficult the class actually is they will be motivated to study harder and earn a higher grade on the second test than they got on the first test.

You can load in the data from this example by running the following code:

# wide format
chico_wide <- import("https://raw.githubusercontent.com/uopsych/psy611/master/labs/resources/lab9/data/chico_wide.csv")

# long format
chico_long <- import("https://raw.githubusercontent.com/uopsych/psy611/master/labs/resources/lab9/data/chico_long.csv")

Note: You should now have 2 versions of the same data set loaded into your global environment. The only difference in these versions of the data is their “shape” – one is “wide” and the other is “long”. In the wide form, every row corresponds to a unique person; in the long form, every row corresponds to a unique observation or measurement.

  • To get a sense of the differences between these two versions of the data, see below:
head(chico_wide)
##         id grade_test1 grade_test2
## 1 student1        42.9        44.6
## 2 student2        51.8        54.0
## 3 student3        71.7        72.3
## 4 student4        51.6        53.4
## 5 student5        63.5        63.8
## 6 student6        58.0        59.3
head(chico_long)
##         id  time grade
## 1 student1 test1  42.9
## 2 student2 test1  51.8
## 3 student3 test1  71.7
## 4 student4 test1  51.6
## 5 student5 test1  63.5
## 6 student6 test1  58.0

We will work with both versions of this dataset today.
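In case you ever need to reshape the data yourself, here is a sketch using the {tidyr} package (my choice of tool here, not the lab's; note that the resulting column names will differ slightly from those in the files you imported):

# load {tidyr} for reshaping
library(tidyr)

# wide -> long: one row per observation
chico_long2 <- chico_wide %>%
  pivot_longer(cols      = c(grade_test1, grade_test2),
               names_to  = "time",
               values_to = "grade")

# long -> wide: one row per person
chico_wide2 <- chico_long %>%
  pivot_wider(names_from  = time,
              values_from = grade)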

Data exploration

Let’s take a closer look at the data before we actually run a t-test to see what might be going on…

describe(chico_wide)
##             vars  n  mean   sd median trimmed  mad  min  max range  skew
## id*            1 20 10.50 5.92   10.5   10.50 7.41  1.0 20.0  19.0  0.00
## grade_test1    2 20 56.98 6.62   57.7   56.92 7.71 42.9 71.7  28.8  0.05
## grade_test2    3 20 58.38 6.41   59.7   58.35 6.45 44.6 72.3  27.7 -0.05
##             kurtosis   se
## id*            -1.38 1.32
## grade_test1    -0.35 1.48
## grade_test2    -0.39 1.43

Question: What do you notice about the means of the two groups? Can we conclude anything?

What if we look at the data graphically?

One way this is often plotted is with 95% confidence intervals. One easy way to do that is with the ggerrorplot() function from the {ggpubr} package.

# create plot
ggerrorplot(chico_long, 
            x         = "time", 
            y         = "grade", 
            desc_stat = "mean_ci", 
            color     = "time", 
            ylab      = "Grade")

  • Based on the descriptive statistics and the plot above with widely overlapping confidence intervals, it might seem intuitive to think that the improvement in test scores from Test 1 to Test 2 is due entirely to chance. However, if we look at the data in a different way, we will see why this impression may not be correct…
ggplot(chico_wide, aes(x = grade_test1, y = grade_test2)) +
  # add points
  geom_point() +
   # set theme
    theme_bw() +
    # add labels 
    labs(title    = "Scatterplot of Testing Across 2 Timepoints",
         x        = "Time 1",
         y        = "Time 2")

Question: What do you notice about the pattern of points in this plot?

  • Let’s create a new variable to represent the improvement for each student from Test 1 to Test 2.
chico_wide <- chico_wide %>% 
  mutate(diff = grade_test2 - grade_test1)

head(chico_wide)
##         id grade_test1 grade_test2 diff
## 1 student1        42.9        44.6  1.7
## 2 student2        51.8        54.0  2.2
## 3 student3        71.7        72.3  0.6
## 4 student4        51.6        53.4  1.8
## 5 student5        63.5        63.8  0.3
## 6 student6        58.0        59.3  1.3
  • Now let’s look at a histogram of these improvement scores:
# the data set
chico_wide %>%
  # the canvas
  ggplot(aes(x = diff)) +
    # the geometric elements
       geom_histogram(
                   fill   = "red",
                   colour = "black") +
    # set theme
    theme_bw() +
    # add labels 
    labs(title    = "Histogram of Improvement",
         x        = "Improvement")

  • Notice that the vast majority of students have an improvement score above 0; in other words, virtually all students improved their scores to some extent from Test 1 to Test 2.


What is a paired samples t-test?

  • In our example above, we created a new variable, chico_wide$diff, that represents the difference between each student’s score on Test 1 and Test 2.

  • More generally, if \(X_{i1}\) is the score that the \(i\)-th participant obtained on the first variable, and \(X_{i2}\) is the score that the same person obtained on the second one, then the difference score is:

\[ \Delta_i = X_{i1} - X_{i2}\]

  • The population mean of difference scores could be represented as:

\[ \mu_\Delta = \mu_1 - \mu_2\]

  • Fundamentally, paired samples t-tests operate at the level of difference scores. Accordingly, we can state the null and alternative hypotheses for a paired-samples t-test as follows:

\[ H_0: \mu_\Delta = 0 \] \[ H_1: \mu_\Delta \neq 0 \]

  • So far this looks really similar to a one-sample t-test; in this case, we are testing whether the mean of some set of difference scores is different from 0. As such, the t-statistic for a paired samples t-test looks similar to the one we have already learned for a one-sample t-test:

\[t = \frac{\bar \Delta}{\hat \sigma_\Delta / \sqrt{N}}\]

Using arithmetic

To run a paired t-test by hand, we essentially run a one-sample t-test. The main difference is that our mean and standard deviation are the mean and standard deviation of the difference scores.

# calculate mean and sd
chico_mean   <- mean(chico_wide$diff)
chico_sd     <- sd(chico_wide$diff)

# print the mean and sd
c("mean" = chico_mean,
  "sd"   = chico_sd)
##      mean        sd 
## 1.4050000 0.9703363

As discussed in class, we can also calculate the standard deviation of the difference scores from the standard deviations of the two sets of scores and the correlation between them, using the following equation:

\[\sqrt{\hat\sigma_{M1}^2 + \hat\sigma_{M2}^2 - 2r(\hat\sigma_{M1}\hat\sigma_{M2})}\]

# calculate standard deviation of both groups
g1_sd <- sd(chico_wide$grade_test1)
g2_sd <- sd(chico_wide$grade_test2)

# calculate the correlation of both groups
g_cor <- cor(chico_wide$grade_test1, chico_wide$grade_test2)

# calculate the standard deviation
chico_sd_alt <- sqrt(g1_sd^2 + g2_sd^2 - (2 * g_cor * (g1_sd * g2_sd)))

# print the sd calculated using the alternative method
c("sd" = chico_sd_alt)
##        sd 
## 0.9703363

Both methods should produce the same value for the standard deviation of the difference scores.
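You can verify this directly:

# sanity check: the two computations agree (up to floating-point error)
all.equal(chico_sd, chico_sd_alt)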

Next we can calculate the sample size and the degrees of freedom for our test. Keep in mind that our sample size is not the total number of observations (40). Instead, it is the total number of participants (20).

# calculate sample size and degrees of freedom
chico_n      <- length(chico_wide$diff)
chico_df     <- chico_n - 1

# print the sample size and the degrees of freedom
c("n"  = chico_n,
  "df" = chico_df)
##  n df 
## 20 19

For the degrees of freedom, we only subtract one because, as with a one-sample t-test, we are only dealing with one mean (i.e., the mean of the difference scores).

The final step before we calculate our t-statistic is to calculate the standard error.

# calculate the standard error
chico_se <- chico_sd / sqrt(chico_n)

# print the standard error
c("se" = chico_se)
##        se 
## 0.2169738

Now all we have to do to calculate our t-statistic is divide the mean of the difference scores by the standard error of the mean of the difference scores.

# calculate the t-statistic
chico_t <- chico_mean / chico_se

# print the t-statistic
c("t" = chico_t)
##        t 
## 6.475436

And, as always, we can calculate a p-value for our t-statistic using the pt() function.

# calculate the p-value
chico_p <- pt(q = abs(chico_t), df = chico_n - 1, lower.tail = FALSE) * 2

# print the p-value
c("p" = chico_p)
##             p 
## 0.00000332067

We can also use the same code we used for our one-sample t-test to calculate the confidence interval around the mean of the difference scores.

# calculate the confidence interval
chico_ci_low <- chico_mean + chico_se * qt(p = .025, df = chico_df, lower.tail = TRUE)
chico_ci_up  <- chico_mean + chico_se * qt(p = .975, df = chico_df, lower.tail = TRUE)

# print the 95% confidence interval
c("95% CI Lower" = chico_ci_low,
  "95% CI Upper" =  chico_ci_up)
## 95% CI Lower 95% CI Upper 
##    0.9508686    1.8591314

And we calculate Cohen's d in exactly the same way as for the one-sample t-test:

\[ d = \frac{\bar \Delta}{\hat \sigma_\Delta} \]

If we divide the mean of the difference scores by the standard deviation of the difference scores, we get Cohen's d (i.e., the standardized mean of the difference scores) at the within-subjects level.

# calculate cohen's d
chico_d <- chico_mean / chico_sd

# print cohen's d
c("d" = chico_d)
##        d 
## 1.447952

Looks like there was a large effect! On average, students showed a large improvement on the second test.

Using functions

One-sample t-test of difference scores

We can conduct our t-test using one-sample t-test functions. The example immediately below uses the t.test() function in the {stats} package to conduct the one-sample t-test of the difference scores.

t.test(x = chico_wide$diff, mu = 0)
## 
##  One Sample t-test
## 
## data:  chico_wide$diff
## t = 6.4754, df = 19, p-value = 0.000003321
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.9508686 1.8591314
## sample estimates:
## mean of x 
##     1.405

This next bit of code uses the oneSampleTTest() function from the {lsr} package to conduct the one-sample t-test of the difference scores.

oneSampleTTest(x = chico_wide$diff, mu = 0)
## 
##    One sample t-test 
## 
## Data variable:   chico_wide$diff 
## 
## Descriptive statistics: 
##              diff
##    mean     1.405
##    std dev. 0.970
## 
## Hypotheses: 
##    null:        population mean equals 0 
##    alternative: population mean not equal to 0 
## 
## Test results: 
##    t-statistic:  6.475 
##    degrees of freedom:  19 
##    p-value:  <.001 
## 
## Other information: 
##    two-sided 95% confidence interval:  [0.951, 1.859] 
##    estimated effect size (Cohen's d):  1.448

Paired samples t-test

The t.test() function from the {stats} package also allows you to input raw scores (i.e., not the difference scores) and run a paired samples t-test using the paired = TRUE argument. The results will be exactly the same as running the one sample t-test on the difference scores.

t.test(x = chico_wide$grade_test1,
       y = chico_wide$grade_test2,
       paired = TRUE)
## 
##  Paired t-test
## 
## data:  chico_wide$grade_test1 and chico_wide$grade_test2
## t = -6.4754, df = 19, p-value = 0.000003321
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.8591314 -0.9508686
## sample estimates:
## mean of the differences 
##                  -1.405

We can also use the pairedSamplesTTest() function from the {lsr} package.

pairedSamplesTTest(formula = ~ grade_test2 + grade_test1, # one-sided formula
                   data = chico_wide # wide format
                  )
## 
##    Paired samples t-test 
## 
## Variables:  grade_test2 , grade_test1 
## 
## Descriptive statistics: 
##             grade_test2 grade_test1 difference
##    mean          58.385      56.980      1.405
##    std dev.       6.406       6.616      0.970
## 
## Hypotheses: 
##    null:        population means equal for both measurements
##    alternative: different population means for each measurement
## 
## Test results: 
##    t-statistic:  6.475 
##    degrees of freedom:  19 
##    p-value:  <.001 
## 
## Other information: 
##    two-sided 95% confidence interval:  [0.951, 1.859] 
##    estimated effect size (Cohen's d):  1.448

Long vs. wide format

Note that in the example above, using the pairedSamplesTTest() function, we used the wide format data. When using wide data with pairedSamplesTTest(), you enter a one-sided formula that contains your two repeated-measures conditions (e.g., ~ grade_test2 + grade_test1).

The pairedSamplesTTest() function can also be used with long data. In this case, you must use a two-sided formula: outcome ~ group. You also need to specify the name of the ID variable. Note that the grouping variable must also be a factor.

# grouping variable (time) must be a factor
chico_long <- chico_long %>% 
  mutate(time = as.factor(time))

pairedSamplesTTest(formula = grade ~ time, # two-sided formula
                   data = chico_long, # long format
                   id = "id" # name of the id variable
                   )
## 
##    Paired samples t-test 
## 
## Outcome variable:   grade 
## Grouping variable:  time 
## ID variable:        id 
## 
## Descriptive statistics: 
##              test1  test2 difference
##    mean     56.980 58.385     -1.405
##    std dev.  6.616  6.406      0.970
## 
## Hypotheses: 
##    null:        population means equal for both measurements
##    alternative: different population means for each measurement
## 
## Test results: 
##    t-statistic:  -6.475 
##    degrees of freedom:  19 
##    p-value:  <.001 
## 
## Other information: 
##    two-sided 95% confidence interval:  [-1.859, -0.951] 
##    estimated effect size (Cohen's d):  1.448

Calculating Cohen's d using functions

In addition to being produced automatically when you run pairedSamplesTTest(), the within-subjects Cohen’s d can be produced by using the cohensD() function from {lsr} with the argument method = "paired".

lsr::cohensD(x      = chico_wide$grade_test1,
             y      = chico_wide$grade_test2, 
             method = "paired")
## [1] 1.447952

We can also calculate a Cohen's d that scales the mean difference by the pooled standard deviation of the two sets of scores (rather than by the standard deviation of the difference scores) using the method = "pooled" argument.

lsr::cohensD(x      = chico_wide$grade_test1,
             y      = chico_wide$grade_test2, 
             method = "pooled")
## [1] 0.2157646
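To see where this much smaller value comes from, here is a sketch (not from the lab itself) that reproduces it by hand. With equal group sizes, the pooled standard deviation is just the square root of the average of the two variances:

# reproduce the pooled d by hand: mean difference / pooled SD
sd_pooled <- sqrt((g1_sd^2 + g2_sd^2) / 2)

chico_mean / sd_pooled # ~0.216, matching cohensD(method = "pooled")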

Interpretation and Write-Up

# First I'll run my t-test again, but this time I'll save it to a variable:
pairedtoutput <- t.test(x = chico_wide$grade_test1,
                        y = chico_wide$grade_test2,
                        paired = TRUE)

#Let's look at our output options
apa_print(pairedtoutput)
## $estimate
## [1] "$M_d = -1.40$, 95\\% CI $[-1.86$, $-0.95]$"
## 
## $statistic
## [1] "$t(19) = -6.48$, $p < .001$"
## 
## $full_result
## [1] "$M_d = -1.40$, 95\\% CI $[-1.86$, $-0.95]$, $t(19) = -6.48$, $p < .001$"
## 
## $table
## NULL
## 
## attr(,"class")
## [1] "apa_results" "list"

A paired samples t-test was used to compare scores on Dr. Chico's first and second exams. The students scored significantly higher on the second test than on the first test, \(M_d = -1.40\), 95% CI \([-1.86\), \(-0.95]\), \(t(19) = -6.48\), \(p < .001\).

Plotting paired-samples data

When plotting paired samples data, we want some way to clearly represent the repeated measures structure of the data. One way to do this is to draw a line between each pair of data points. This can be done with the ggpaired() function from {ggpubr}.

# wide format
ggpaired(chico_wide, 
         cond1      = "grade_test1", 
         cond2      = "grade_test2",
         color      = "condition", 
         line.color = "gray", 
         line.size  = 0.4,
         palette  = "jco")

  • If you want to make the plot with the long format of the data, you can do the following:
# long format
ggpaired(chico_long, 
         x          = "time", 
         y          = "grade",
         color      = "time", 
         line.color = "gray", 
         line.size  = 0.4,
         palette    = "jco")

Minihacks

Minihack 1:

You are developing a scale to quantify how delicious a food is. You design it such that 0 is as delicious as an olive, 25 is as delicious as cement, and 50 is as delicious as a lemon. You aren't sure exactly where pancakes should fall, but it is an important validation of this measure that a pancake would be drawn from a population that is more delicious than a lemon. You order takeout pancakes from 17 local diners and score each one.

Generate the data by running the following code:

# set seed for reproducibility
set.seed(42)

# load data
dat_pancake <- data.frame("id"      = 1:17,
                          "Delicious" =  rnorm(17, 55, 10))
  1. Run an appropriate test based on your research question.
(toutput  <- t.test(x = dat_pancake$Delicious, mu = 50))
## 
##  One Sample t-test
## 
## data:  dat_pancake$Delicious
## t = 3.996, df = 16, p-value = 0.001041
## alternative hypothesis: true mean is not equal to 50
## 95 percent confidence interval:
##  54.45060 64.50847
## sample estimates:
## mean of x 
##  59.47953
  2. Report and interpret your results using embedded R code.

We found that the average deliciousness of the pancakes (\(M = 59.48\), 95% CI \([54.45\), \(64.51]\)) was significantly higher than where we placed a lemon on our scale, \(t(16) = 4.00\), \(p = .001\). More work needs to be done, but this is an important initial validation of our measure.

  3. Plot a histogram of your data. Include vertical lines, with different colors and line types, for the meaningful points on the scale. Optional Bonus: Add a legend with labels for those lines.
# the data set
dat_pancake %>%
  # the canvas
  ggplot(aes(x = Delicious)) +
    # the geometric elements
       geom_histogram(
                   fill   = "brown",
                   colour = "yellow") +
  #add vertical line
  geom_vline(aes(xintercept=50,
                 color="lemon"),
             linetype = "solid") +
  geom_vline(aes(xintercept=25,
                 color="cement"),
             linetype = "dashed") +
  geom_vline(aes(xintercept=0,
                 color="olive"),
             linetype = "dotdash") +
  scale_color_manual(name = "SCALE", values = c(lemon = "orange", cement = "gray", olive = "green")) +
    # set theme
    theme_bw() +
    # add labels 
    labs(title    = "Histogram of Pancake Deliciousness",
         x        = "Deliciousness")

  4. Show a different visualization and put this new plot and the previous plot on separate tabs. Make sure to hide your code.

My 2 visualizations (rendered as two tabs, "plot 1" and "plot 2", with the code hidden)


Minihack 2:

A clinical psychologist wants to know whether a new cognitive-behavioral therapy (CBT) program helps alleviate anxiety. He enrolls 12 individuals diagnosed with an anxiety disorder in a 6-week CBT program. Participants are given an Anxiety Scale before they begin and after they complete treatment.

Import the data by running the following code:

cbt_data <- import("https://raw.githubusercontent.com/uopsych/psy611/master/labs/resources/lab9/data/cbt_data.csv")
  1. Run a paired-samples t-test to determine whether participants’ anxiety scores changed from before to after the CBT treatment.
(toutput <- t.test(cbt_data$after, cbt_data$before, paired = TRUE))
## 
##  Paired t-test
## 
## data:  cbt_data$after and cbt_data$before
## t = -5.8495, df = 11, p-value = 0.000111
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -5.16101 -2.33899
## sample estimates:
## mean of the differences 
##                   -3.75
  2. Verify your results using a one-sample t-test on the difference scores.
cbt_data$DIFF <- cbt_data$after - cbt_data$before

t.test(cbt_data$DIFF)
## 
##  One Sample t-test
## 
## data:  cbt_data$DIFF
## t = -5.8495, df = 11, p-value = 0.000111
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -5.16101 -2.33899
## sample estimates:
## mean of x 
##     -3.75
  3. Report and interpret your results using R code embedded in text. Optional Bonus: Also add a measure of effect size.
pst <- pairedSamplesTTest(formula = ~ after + before, # one-sided formula
                   data = cbt_data # wide format
                  )

Anxiety symptoms were significantly reduced from before to after treatment, \(M_d = -3.75\), 95% CI \([-5.16\), \(-2.34]\), \(t(11) = -5.85\), \(p < .001\), \(d = 1.69\).
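The effect size above can also be computed directly; a quick sketch using the cohensD() function from {lsr}:

# within-subjects Cohen's d for the CBT data (should be ~1.69)
lsr::cohensD(x      = cbt_data$after,
             y      = cbt_data$before,
             method = "paired")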

  4. Create a table to display the descriptive statistics from this data.
library(kableExtra)
kbl(describe(cbt_data[2:3]),
    format  = "simple",
    digits  = 2, # round all values to two decimal places
    caption = "Table 1. Descriptive Statistics from our study.")
Table 1. Descriptive Statistics from our study.

        vars  n  mean   sd median trimmed  mad min max range skew kurtosis   se
before     1 12 10.75 4.05     11    10.6 5.19   5  18    13 0.14    -1.26 1.17
after      2 12  7.00 2.37      7     7.0 2.22   3  11     8 0.00    -1.14 0.69
  5. Plot the data using ggpaired(). Optional Bonus: Try to recreate this plot using ggplot.
ggpaired(cbt_data, cond1 = "before", cond2 = "after",
         color = "condition", line.color = "gray", line.size = 0.4,
         palette = "jco")

ggplot version:

# make into long format
cbt_long <- cbt_data %>%
  gather(time, anxiety, 2:3)

# basic version of plot
ggplot(cbt_long, aes(x = time, y = anxiety, color = time)) +
  geom_boxplot() +
  geom_point() +
  geom_line(aes(group = id),
            color = "black",
            alpha = .6)
            alpha=.6)