Classical Test Theory & Item Response Theory

Psychometrics is the study of developing tests and measurements. In this lesson, we'll talk about two different theories of how psychologists can create good tests and measurements: classical test theory and item response theory. Updated: 03/24/2021

Psychometric Tests

When you think of the word ''test,'' you probably imagine yourself sitting in a classroom about to take an exam. This might bring back stressful memories of school! But in psychology, the word ''test'' has a particular meaning, and it's a little more in-depth than simply some questions you need to answer. In this lesson we're talking about psychometric tests, which are scientific and systematic ways to test someone's ability to do a job or to measure their personality or a mental ability (such as math or critical thinking).

Psychometrics means the study of developing measurements. In other words, there's an entire field of study dedicated to how we write things like exams. Psychometric tests are standardized, and they are designed to assess a particular variable. The people who write them try to make them objective and unbiased. In this lesson, we'll talk about two test theories: classical test theory and item response theory. Think of these as theories about how psychologists create tests and measures.

Classical Test Theory

Classical test theory (CTT) in psychometrics is all about reliability. We often use the word ''reliable'' or ''reliability'' in everyday language. Your friend who is always on time is reliable, for instance. But in psychology, reliability refers to how consistent a test or measure is. In other words, if you took the same well-designed test several times under the same conditions, you should get about the same score each time.
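One common way to put a number on this kind of consistency is to correlate the scores from two sittings of the same test. The sketch below is only an illustration: the scores are made up, the names `first_sitting`, `second_sitting`, and `pearson_r` are hypothetical, and the lesson itself does not prescribe a particular reliability statistic.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up scores for five people who took the same test twice.
first_sitting = [85, 70, 92, 60, 78]
second_sitting = [83, 72, 90, 63, 77]

# A value close to 1 means people scored about the same both times.
test_retest_r = pearson_r(first_sitting, second_sitting)
print(f"Test-retest correlation: {test_retest_r:.2f}")
```

A correlation near 1 suggests the test is behaving consistently; a much lower value would suggest that something other than the trait being measured is moving the scores around.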

There are three ideas we need to keep in mind when we're talking about CTT: test score, error, and true score. The test score is what we call the observed score. So, if you take a math exam and get an 85, that's your test score.

Error refers to, well, exactly what it sounds like! It's the amount of error that is found in a test or measure. This might be a mistake in the test, or it might also refer to things in the external environment that we can't totally control but that impact testing. Let's say you're taking your history exam, and there's construction going on in the building next door. Hammering isn't great for concentration, is it? This is a form of error because the terrible noise might impact your score.

Then, we have the true score. This is the score you would have achieved if there were absolutely no errors in the measurement. Alas, this isn't really possible. But psychometrics assumes everyone has, in theory, a true score. We can't observe the true score directly, but we can relate it to the other two ideas with a simple equation: test score = true score + error.
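The three ideas fit together in the basic CTT equation: test score = true score + error. As a rough sketch, the simulation below assumes the error is random, averages out to zero, and follows a normal distribution. The true score of 85, the error spread of 5 points, and the function name `observed_score` are all made up for illustration.

```python
import random

def observed_score(true_score, error_sd, rng):
    """One test sitting: observed score = true score + random error."""
    return true_score + rng.gauss(0, error_sd)

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_SCORE = 85.0         # hypothetical true score (never seen in real life)
ERROR_SD = 5.0            # hypothetical spread of the random error

# Imagine the same person could sit the same test many times.
sittings = [observed_score(TRUE_SCORE, ERROR_SD, rng) for _ in range(10_000)]
average_observed = sum(sittings) / len(sittings)

print(f"Lowest sitting:  {min(sittings):.1f}")
print(f"Highest sitting: {max(sittings):.1f}")
print(f"Average of all sittings: {average_observed:.1f}")
```

Any single sitting can land well above or below 85, but under these assumptions the average over many sittings settles close to the true score. That is why CTT treats the true score as something we estimate rather than read off a single exam.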

Charles Spearman, a psychologist and statistician, thought that we could reduce random error as much as possible, thereby making tests better. Spearman is widely considered one of the founders of CTT. So the important takeaway of CTT is that it's a theory that tries to explain and deal with error so our tests are more reliable.

Item Response Theory

Item response theory (IRT) is all about your performance on an exam and how it relates to individual items or questions on a test. IRT is an example of what psychologists call a latent trait model. These models try to figure out whether there's an underlying trait that accounts for your performance on a test. The mathematical models used in IRT are much more sophisticated than those in CTT, and they typically need a large sample of test takers to estimate the properties of each item. In return, those item properties are, in principle, less tied to the particular group of people who took the test.
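To make the idea of a latent trait model more concrete, here is a minimal sketch of one widely used IRT model, the two-parameter logistic model. The lesson does not name a specific model, so this choice is an assumption, as are the function name `prob_correct` and the example item values. The model gives the chance that a person with a given ability answers a given item correctly.

```python
from math import exp

def prob_correct(ability, difficulty, discrimination=1.0):
    """Two-parameter logistic model: chance of answering one item correctly.

    ability        -- the person's level on the latent trait
    difficulty     -- the ability level at which the chance is 50%
    discrimination -- how sharply the item separates low from high ability
    """
    return 1.0 / (1.0 + exp(-discrimination * (ability - difficulty)))

# Two hypothetical items: one easy, one hard.
easy_item = {"difficulty": -1.0, "discrimination": 1.2}
hard_item = {"difficulty": 1.5, "discrimination": 1.2}

for ability in (-2.0, 0.0, 2.0):
    p_easy = prob_correct(ability, **easy_item)
    p_hard = prob_correct(ability, **hard_item)
    print(f"ability {ability:+.1f}: easy item {p_easy:.2f}, hard item {p_hard:.2f}")
```

In this sketch each question has its own difficulty and discrimination, which is the sense in which IRT works at the level of individual items rather than the total test score.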

