Critiques of NHST
Misinterpretation of p-values
Promotes binary thinking
Easily gamed in the pursuit of incentives
The open science movement (re:statistics)
Sometimes labelled “the reform movement” – (psychological) scientists trying to address problems in the field identified as part of the replication crisis.
(Note, these are not distinct periods of history, nor are either of them considered over.)
Within psychology, much of the force behind this movement has been driven by social and personality psychologists.
Open because one of the primary problems of the replication crisis was the lack of transparency. Stapel, Bem, everyone did work in private; kept data secret; buried, hid, lost key aspects of research. Opaque was normal.
New methods of conducting science on science – often developed to find fraud, but super useful for detecting errors
Granularity-related inconsistency of means (GRIM) test – is the reported mean mathematically possible given the sample size? (A minimal sketch follows this list.)
Sample Parameter Reconstruction via Iterative TEchniques (SPRITE) – given a set of parameters and constraints, generate many possible samples and examine them for plausibility.
StatCheck – upload PDFs and Word documents to check that reported statistics, degrees of freedom, and p-values are internally consistent.
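A minimal sketch of the GRIM idea in R, assuming the underlying scale items are integers; `grim_consistent` is a hypothetical helper for illustration, not the published implementation:

```r
# GRIM check (sketch): could a reported mean have come from n integer scores?
grim_consistent <- function(reported_mean, n, decimals = 2) {
  # Any sum of n integer scores is itself an integer, so try the two
  # integer sums closest to n * reported_mean
  candidate_sums <- floor(n * reported_mean) + c(0, 1)
  # Recompute the means those sums would produce, rounded as reported
  possible_means <- round(candidate_sums / n, decimals)
  # Consistent if either candidate reproduces the reported (rounded) mean
  any(abs(possible_means - round(reported_mean, decimals)) < 1e-8)
}

grim_consistent(5.18, n = 28)  # TRUE: 145/28 rounds to 5.18
grim_consistent(5.19, n = 28)  # FALSE: no 28 integer scores average to 5.19
```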
Conceptually, psychologists have struggled with the interpretation of p-values.
Even worse, we seemed to forget some of the most basic assumptions of our statistical tests.
Disillusionment with p-values led many to propose radical changes.
But the field is ruled by researchers who made their name using NHST (and very likely, many of them also used these fishing methods to build their careers). So a softer approach was more widely favored.
Cumming (2014) proposes the “new statistics” which emphasize effect sizes, confidence intervals, and meta-analyses.
Except…
Confidence intervals are based on the same underlying probability distributions as our p-values.
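A quick simulated illustration of this point (the data here are made up):

```r
# The 95% CI and the p < .05 decision are two views of the same t distribution,
# so for a two-sided one-sample t-test they always agree.
set.seed(42)
x <- rnorm(30, mean = 0.4, sd = 1)            # hypothetical sample
fit <- t.test(x, mu = 0)

fit$p.value < .05                             # is p "significant"?
fit$conf.int[1] > 0 | fit$conf.int[2] < 0     # does the CI exclude 0?
# Both lines return the same answer: reporting CIs instead of p-values
# does not change the underlying sampling model.
```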
And what’s the problem with meta-analyses?
The Reproducibility Project (2015) finds that most findings in psychological science don’t replicate.
Also, new methods for approaching meta-analyses are studied and popularized.
Use of scripts – data analysis is reproducible
Software is open-source
Combination of two languages: R and Markdown.
Markdown is a way of writing without a WYSIWYG editor – instead, little bits of code tell the text editor how to format the document.
Increased flexibility: Markdown can be used to create
- presentations (this one!)
- manuscripts
- CVs
- books
- websites
By combining Markdown with R…
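For example, a minimal R Markdown file might look something like this (the `scores.csv` data file is hypothetical):

````markdown
---
title: "A reproducible report"
output: html_document
---

```{r data}
# Load the data; this chunk re-runs every time the document is knit
scores <- read.csv("scores.csv")$score
```

Participants' mean score was `r round(mean(scores), 2)` points.

```{r histogram}
hist(scores)
```
````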
Git is a version control system. Think of it as Microsoft Word's Track Changes, but for your code.
GitHub is one site that facilitates the use of Git. (Others exist. But they don’t have OctoCat.)
Repositories can be private or public – public repositories let you share your work with others so they can reproduce it
GitHub also plays well with the Markdown language, which is what you’re using for your homework assignments.
You can link GitHub repositories to R Projects for near seamless integration.
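One common route uses the usethis package; this sketch assumes you already have a GitHub account and a personal access token configured:

```r
# From inside an R Project, usethis can set up Git and GitHub for you
# install.packages("usethis")
library(usethis)

use_git()      # put the current project under local Git version control
use_github()   # create a matching GitHub repository and push the project to it
```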
Pair GitHub and R to make websites!
(Interested? Have 4 hours to kill? I recommend looking through Alison Hill’s workshop on blogdown and the tutorial prepared by Dani Cosme and Sam Chavez).
The Open Science Framework (OSF) is another repository, and it also includes version control
Doesn't require code or a terminal to update files
Also great for collaborations
Easy to navigate
Can be paired with applications you (should) already use
preprint = the pre-copyedited version of your manuscript
Journals have different policies regarding what you can post. It's always a good idea to check.
OSF also allows you to preregister a project.
Preregistration is creating a time-stamped, publicly available, frozen document of your research plan prior to executing that plan.
Goal: Make it harder to p-hack and HARK (hypothesize after results are known)
Goal: distinguish data-driven choices from theory-based choices
Goal: correctly identify confirmatory and exploratory research
A note: not everything needs to be preregistered. Only confirmatory work.
It’s ok to deviate from preregistrations if you transparently document those deviations. (See this very helpful tutorial by Willroth & Atherton (2024).)
Registered reports (RR) are a special kind of journal article in which the study plan is peer-reviewed, and accepted in principle, before the data are collected.
image credit: Dorothy Bishop
More psychologists have embraced machine learning.
These methods largely move us away from traditional p-values; decision making becomes predicated instead on the ability of theoretical models to predict new sets of data.
How accurate does your model need to be for you to accept a relationship?
Most machine learning techniques include safeguards against over-fitting, which is essentially capitalizing on the random error in your sample
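A small simulation sketch of that idea (all data here are random noise):

```r
# A model with many irrelevant predictors looks good on the data it was fit to,
# but that apparent accuracy is over-fitting: it does not carry over to new data.
set.seed(123)
n <- 200
noise <- as.data.frame(matrix(rnorm(n * 20), n, 20))   # 20 pure-noise predictors
dat <- cbind(y = rnorm(n), noise)                      # outcome unrelated to them

train <- dat[1:100, ]
test  <- dat[101:200, ]
fit   <- lm(y ~ ., data = train)

cor(train$y, predict(fit, train))^2   # in-sample R^2: inflated by random error
cor(test$y, predict(fit, test))^2     # out-of-sample R^2: near zero, the honest answer
```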
Sensitivity analyses and multiverse analyses test the robustness of a finding. These often include testing every possible set of covariates or subgroups to see whether a particular finding depends on one unique combination of variables and people.
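A toy sketch of a miniature multiverse (the variable names here are hypothetical):

```r
# Fit the same focal model under every combination of two optional covariates
# and ask whether the focal estimate depends on the specification.
set.seed(1)
dat <- data.frame(
  outcome   = rnorm(150),
  predictor = rnorm(150),
  age       = rnorm(150),
  gender    = rbinom(150, 1, .5)
)

covariate_sets <- list(character(0), "age", "gender", c("age", "gender"))

estimates <- sapply(covariate_sets, function(covs) {
  rhs <- paste(c("predictor", covs), collapse = " + ")
  fit <- lm(as.formula(paste("outcome ~", rhs)), data = dat)
  coef(fit)[["predictor"]]
})

round(estimates, 3)  # a robust finding should not hinge on one specification
```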
Certainly there are some who argue that open science methods are broadly harmful: they stifle creativity, slow down research, incentivize ad hominem attacks by “methodological terrorists”, and encourage “data parasites.”
It is worth admitting that, unless incentive structures change, harm can be done through the adoption of these methods. But part of the goal is to change the overall system.
A more legitimate criticism is that adopting these methods can fool researchers into believing that all research using them is “good” and all research not using them is “bad.”
An example: Szollosi et al. (Oct 31, 2019), “Preregistration is redundant, at best.”
And it’s worth adding that this kind of work can take longer.
One-sample tests