From the library:
The library also offers free short workshops. This term, we are offering workshops on R, Python, GitHub, using Zotero to manage your citations, and more. Our Friday lunch chats (Coffee + Data && Code) offer a great opportunity to meet folks interested in data and code from across the university.
An essential aid to “signal detection”
A universal language for communicating what we find.
Required for competent evaluation of others’ work.
Advanced skill in quantitative methods carries with it the responsibility to use those skills carefully and ethically.
Today, we’ll discuss methodological issues present in statistics.
It can be tempting to use statistics to fix poor research design.
These issues cannot be fixed quantitatively (even when it looks like they can).
After today, we focus on what happens after you collect data. But it is still your job to study research design, data collection, and theoretical logic.
Our basic goal in science is to make inferences about the causal relations between constructs.
We can’t do that directly, so we rely on proxies for those constructs.
In order to infer that A –> B, we have to make three assumptions:
When the first two assumptions are true, the relation between X and Y will provide a good estimate of the relation between A and B.
What threatens our ability to carry out this seemingly simple task?
How do quantitative methods help us solve these problems?
Threats to four kinds of validity undermine our ability to make valid causal inferences. Solving each problem either directly requires quantitative methods or makes use of principles that are central to quantitative methods.
Statistical conclusion validity
Internal validity
External validity
Construct validity
Definition: the validity of the inference that X and Y are related
low statistical power
violations of assumptions of statistical tests
fishing and the error rate problem
unreliable measures
restricted range
unreliable treatment implementation
extraneous variance in the experimental setting
heterogeneity of units
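The "fishing and the error rate problem" threat can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes the tests are independent, that each is run at the same alpha, and that every null hypothesis is true. The function name `familywise_error_rate` is made up for this example.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:>2} tests at alpha = .05 -> {rate:.2f}")
```

Under these assumptions the chance of at least one spurious "finding" is about 23% with 5 tests and about 64% with 20 tests.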
Definition: the validity of the inference that X and Y are causally related.
Given that X and Y are correlated, can we validly infer that the relation is causal?
Traditional view: direction, correlation, confounds (John Stuart Mill, 1843)
Counterfactual view: The effect is the difference between what did happen and what would have happened to the same people, at the same time, had they not received the treatment.
ambiguous temporal precedence
selection
attrition
history
maturation
regression
testing
instrumentation
Temporal precedence can be established in an experiment because treatment precedes outcome.
But when manipulating the treatment is not possible, logic and common sense can sometimes establish temporal precedence.
Selection refers to any systematic differences between groups that might account for an observed effect.
Ex: Test scores of students who visit the Psychology tutoring center vs. students who do not visit the tutoring center.
How to combat this?
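One way to see why selection matters is a small simulation of the tutoring example. Everything here is an assumption made for illustration: the 5-point true effect, the unmeasured "motivation" variable, and the way motivated students seek out tutoring. It is a sketch of the logic, not a model of any real tutoring center.

```python
import random
from statistics import mean

random.seed(1)

TRUE_EFFECT = 5.0  # assumed benefit of tutoring, in test-score points


def simulate(n: int, self_selected: bool) -> float:
    """Return the observed tutored-minus-untutored difference in mean scores."""
    tutored, untutored = [], []
    for _ in range(n):
        motivation = random.gauss(0, 1)  # unmeasured student characteristic
        if self_selected:
            # More motivated students are more likely to seek tutoring.
            gets_tutoring = motivation + random.gauss(0, 1) > 0
        else:
            # Random assignment: a coin flip unrelated to motivation.
            gets_tutoring = random.random() < 0.5
        score = 70 + 8 * motivation + TRUE_EFFECT * gets_tutoring + random.gauss(0, 5)
        (tutored if gets_tutoring else untutored).append(score)
    return mean(tutored) - mean(untutored)


biased = simulate(20_000, self_selected=True)
randomized = simulate(20_000, self_selected=False)
print(f"self-selected groups: {biased:.1f}")
print(f"random assignment:    {randomized:.1f}")
```

With self-selection, the group difference mixes the tutoring effect with pre-existing differences in motivation, so it should come out well above the assumed 5 points. With random assignment, the difference should land close to the assumed effect.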
Attrition refers to participants dropping out of the study. Even if random assignment is used, dropout can produce unequal groups, which raises the same inferential problems as selection.
How could the design be modified and statistics used to help rule out selection and attrition confounds?
History refers to any event that occurs between the beginning of treatment and the measurement of outcome that might have produced the observed effect.
Ex: A marketing campaign intended to increase beer sales happens to coincide with other events that might have the same effect: a particularly hot period of weather, a long losing streak by the Detroit Tigers, or the Republican National Convention. How would these threats be eliminated?
Maturation refers to changes in the organism that occur regardless of treatment and that may look like a treatment effect.
Ex: A school-wide educational intervention is predicted to increase achievement test scores. The entire school must get the same curriculum, so a control group in the school is not possible. How can the threat be reduced?
Regression (to the mean) occurs when participants are selected because of their extreme scores and those scores are unreliable. The scores will regress toward the mean at the second assessment.
Ex: Sports Illustrated cover jinx
How might this problem be reduced?
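Regression to the mean is easy to reproduce in a simulation with no treatment at all. The sketch below assumes each observed score is a stable true score plus independent measurement error of equal size (so reliability is about .50), and it selects the top 5% of scorers at the first assessment. All numbers are illustrative assumptions.

```python
import random
from statistics import mean

random.seed(2)

N = 10_000
true_ability = [random.gauss(100, 10) for _ in range(N)]
# Each assessment = true ability + independent measurement error.
time1 = [t + random.gauss(0, 10) for t in true_ability]
time2 = [t + random.gauss(0, 10) for t in true_ability]

# Select the top 5% of scorers at the first assessment.
cutoff = sorted(time1)[int(0.95 * N)]
selected = [i for i in range(N) if time1[i] >= cutoff]

selected_t1 = mean(time1[i] for i in selected)
selected_t2 = mean(time2[i] for i in selected)
overall_t2 = mean(time2)

print(f"selected group, time 1: {selected_t1:.1f}")
print(f"selected group, time 2: {selected_t2:.1f}")
print(f"everyone,       time 2: {overall_t2:.1f}")
```

The selected group still scores above average at the second assessment, but noticeably lower than at the first, even though nothing happened to them in between. Shrinking the measurement error in the simulation (for example, by averaging several assessments before selecting) shrinks the drop.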
Testing refers to the possible change that may occur just because participants have been previously measured. These are often called practice or fatigue effects.
Ex: Students do better on the first half of a test compared to the second.
Ex: Students do better in the second half of the term compared to the first.
Without adding a control group, how might this threat be reduced?
Instrumentation refers to change that occurs because the measurement itself changes over time, perhaps becoming more or less reliable.
Instrumentation reflects changes in the measurement; testing reflects changes in the object of measurement.
“When a measure becomes a target, it ceases to be a good measure.” (Goodhart)
The key point with internal validity is that something else besides the treatment is a plausible alternative explanation for any apparent treatment effect.
Solving threats to internal validity is a research design problem, not a statistics problem. Nonetheless, quantitative methods play a key role in making the case for internal validity.
If the “other variables” can be measured, their influence can be statistically controlled so that the hypothesized relation can be detected more accurately.
However:
Statistical control is best thought of as a method of last resort, to be used when design controls are not available or have failed.
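To make "statistically controlled" concrete, here is a minimal sketch using simulated data. It assumes a single measured third variable `z` that influences both `x` and `y`, linear relations, and a true effect of `x` on `y` of 0.5. Control is done by residualizing: the part of `x` and `y` predictable from `z` is removed, and the leftovers are related to each other. The helper names `slope` and `residuals` are made up for this example.

```python
import random
from statistics import mean

random.seed(3)


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Part of y that is not linearly predictable from x."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


N = 20_000
z = [random.gauss(0, 1) for _ in range(N)]   # measured "other variable"
x = [zi + random.gauss(0, 1) for zi in z]    # predictor, partly caused by z
# Outcome: assumed true effect of x is 0.5; z also affects y directly.
y = [0.5 * xi + 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

naive = slope(x, y)
# Control for z by removing it from both x and y, then relating what is left.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"slope ignoring z:        {naive:.2f}")
print(f"slope controlling for z: {adjusted:.2f}")
```

The adjusted slope recovers the assumed effect here only because `z` is the sole confound, is measured without error, and has a linear influence. When any of those conditions fails, statistical control can leave bias in place, which is one reason design controls are preferred.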
Definition: The validity of the inference that a causal relation between operations generalizes to other units, treatments, observations, or settings.
interaction of the causal relation with units
interaction of the causal relation with treatment variations
interaction of the causal relation with observations
interaction of the causal relation with settings
context-dependent mediation
Quantitative methods provide a powerful way to demonstrate moderation (interactions) and mediation effects.
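As a rough illustration of moderation, the sketch below simulates the same causal relation in two settings and compares the slopes. The setting labels ("lab" and "field"), the effect sizes, and the helper names are all assumptions made for this example. With a two-level moderator, the difference between the two slopes is the interaction.

```python
import random
from statistics import mean

random.seed(4)


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_setting(n, effect):
    """Simulate n units in one setting where x affects y by `effect`."""
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [effect * xi + random.gauss(0, 1) for xi in x]
    return x, y


lab_x, lab_y = simulate_setting(5_000, effect=0.8)        # assumed lab effect
field_x, field_y = simulate_setting(5_000, effect=0.2)    # assumed field effect

lab_slope = slope(lab_x, lab_y)
field_slope = slope(field_x, field_y)
interaction = lab_slope - field_slope  # how much the relation depends on setting

print(f"lab slope:   {lab_slope:.2f}")
print(f"field slope: {field_slope:.2f}")
print(f"difference:  {interaction:.2f}")
```

A difference in slopes that is clearly nonzero suggests the causal relation does not generalize unchanged across settings. In practice the same comparison is usually made by adding a product term to a single regression model, which also gives a standard error for the interaction.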
Now’s a great time to pause and reflect on validity in general. Some kinds of validity are essential: statistical conclusion, construct.
Other types of validity (internal, external) are not necessary for every study, but should be the goals of a program of research.
What is representativeness? What is it useful for?
When is representativeness counterproductive?
For each of the following, would you say that representativeness (1) improves the study, (2) weakens the study, or (3) has no effect on the study?
Takeaways
Representativeness may or may not be useful. It depends!
Often, it’s better to either recruit a homogeneous sample (to minimize differences between groups or reduce noise)…
… or overrecruit from smaller groups to test for differences.
But the biggest takeaway should be that these principles do not apply equally to all studies! You should consider what the goals of YOUR study are and how best to meet those goals.
Definition: The validity of the inference that a given operationalization of units, treatments, observations, or settings represents well the construct of which it is assumed to be an instance.
inadequate explication of constructs
construct confounding
confounding constructs with levels of constructs
reactive self-report changes
reactivity to the experimental situation
experimenter expectancy
novelty and disruption effects
Variables, measurement, constructs
Reminder: