Measurement

Last time

Validity (4 types)

  • Statistical Conclusion
  • Internal
  • External
  • Construct(?)

Today

More about construct validity and its role in scale development.

Conceptual clarity (Bringmann, Elmer, & Eronen, 2022)

  • Identifying and characterizing the concept of study
    • This is independent of measurement and must happen before measurement
  • Why is conceptual clarity important?

Quantitative fallacy

See this article.

Applicant selection for medical residency positions.

  • Subtest of medical licensing exam is best predictor of residency success.
  • Great!
  • Right?

Fallacy:

  1. Measure whatever can be easily measured.
  2. Disregard things that cannot be measured easily.
  3. Presume things that cannot be measured easily are not important.
  4. Presume that things that are not measured easily do not exist.

Once a concept has been clarified, the next step is to measure it.

Classical test theory states that:

\[ X = T + E \]

X: Observed Score

T: True Score

E: error (random and unpredictable)

Error is random and unpredictable.

  • Classical test theory assumes that error has a mean of 0, so the average error approaches 0 as the number of measurements increases.

Thus, if we measure people enough “times”, we can get a good sense of their true score.

  • How many times?
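To get a feel for "how many times", here is a minimal simulation sketch of X = T + E. The true score of 50, the error standard deviation of 10, and the normal shape of the error are all assumptions made for illustration; classical test theory itself only requires that the error averages out to 0.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_measurements, seed=0):
    """Return n observed scores X = T + E, with E drawn from Normal(0, error_sd).

    The normal distribution is an assumption of this sketch, not of
    classical test theory.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_measurements)]


true_score = 50.0  # hypothetical true score T
error_sd = 10.0    # hypothetical spread of the error E

for n in (1, 10, 100, 10_000):
    scores = simulate_observed_scores(true_score, error_sd, n)
    mean_x = statistics.fmean(scores)
    print(f"n = {n:>6}: mean observed score = {mean_x:.2f}, "
          f"distance from T = {abs(mean_x - true_score):.2f}")
```

With more measurements, the mean observed score tends to sit closer to T, because the standard error of the mean shrinks with the square root of the number of measurements.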

Measuring people multiple times is often impractical and inefficient (and potentially theoretically wrong), so how can we remove measurement error during a single assessment?

How many items do we need?

Reliability (Quantitude podcast)

Measure of the consistency of a single test or scale. (Like measuring the person many times in a single session.)

Not validity! You can have a reliable measure that is not valid.
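One classical way to approach the "how many items" question is the Spearman-Brown prophecy formula, which predicts reliability after a test is lengthened by some factor. The sketch below assumes the added items are parallel to the existing ones (same true score, same error variance). The starting reliability of 0.30 and the target of 0.80 are made-up numbers, and the function names are illustrative.

```python
import math


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_needed(current, target):
    """Factor by which to lengthen a test to move from current to target reliability."""
    return (target * (1 - current)) / (current * (1 - target))


single_item_reliability = 0.30  # hypothetical reliability of one item
target_reliability = 0.80       # hypothetical target for the full scale

factor = length_factor_needed(single_item_reliability, target_reliability)
n_items = math.ceil(factor)  # round up to a whole number of items
print(f"Items needed: {n_items}")
print(f"Predicted reliability with {n_items} items: "
      f"{spearman_brown(single_item_reliability, n_items):.2f}")
```

The formula answers the reliability question only. A longer scale built from irrelevant items can be very consistent and still not measure the construct.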

Which items?

It wasn’t assigned, but please read a great article by Len Simms (2008). This article focuses on measurement using survey methods, but the logic extends to any measurement in which you aggregate multiple responses or scores.

I’m going to use this article as a template for developing a new scale with you.

There are many more articles worth reading on this (see Bonus Materials), and frankly, you should look for a class in measurement/psychometrics.

Phases of scale development

As written by Jane Loevinger (1957) and summarized by Len Simms (2008)

Let’s develop a scale for “academic stress.”

What are some specific aspects of academic stress?

Write the item pool

  • Items should be relevant and representative
  • Seek to be over-inclusive

Add items here: https://shorturl.at/rGg7m

Guidelines for good items:

  • Simple and straightforward
  • No double-barrels
  • Avoid slang
  • Phrase items generally
  • Phrase items about sensitive topics using matter-of-fact and nonpejorative language
  • Choose response format carefully
    • Dichotomous vs polytomous
    • Number of options
    • Response categories
    • Phrasing of items should be consistent with response format

Next steps:

  • Take scale
  • Discuss items
  • Basic descriptives (mean and variance)
  • Item-total correlations
  • Alpha/Omega
  • Factor analysis
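The descriptive steps above can be sketched with the standard library alone. The response matrix below is made up (6 respondents, 4 items on a 1 to 5 scale) and the function names are illustrative. Alpha is computed with the usual formula, k / (k - 1) × (1 − sum of item variances / variance of total scores). Omega and factor analysis require a fitted factor model and are left out of this sketch.

```python
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(responses, item_index):
    """Correlate one item with the sum of all the other items."""
    item = [row[item_index] for row in responses]
    rest = [sum(row) - row[item_index] for row in responses]
    return pearson(item, rest)


def cronbach_alpha(responses):
    """Cronbach's alpha from a respondents-by-items matrix."""
    k = len(responses[0])
    item_vars = [statistics.variance(col) for col in zip(*responses)]
    total_var = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: rows are respondents, columns are items.
responses = [
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 3],
]

for j, col in enumerate(zip(*responses)):
    print(f"item {j + 1}: mean = {statistics.fmean(col):.2f}, "
          f"variance = {statistics.variance(col):.2f}, "
          f"item-total r = {corrected_item_total(responses, j):.2f}")
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

The item-total correlation here is the "corrected" version: each item is compared with the total of the remaining items, so it is not correlated with itself.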

Takeaways:

  • Be able to perform psychometric analysis from memory.
  • (Kidding)
  • Classical test theory
  • Scale development takes time, thought, iteration, and data.

Next time…

Describing data