Cognitive Reflection Test

7 questions designed to trigger an intuitive wrong answer. The test measures your ability to override the first answer that comes to mind and think more carefully.

Key figures: 17% score 7/7; 1.2/3 average on the original 3 questions; 2.2/3 average for MIT grads; 3.4M+ scores recorded.

The Cognitive Reflection Test

The Cognitive Reflection Test was introduced by Shane Frederick in a landmark 2005 paper in the Journal of Economic Perspectives. Frederick designed just three questions, each carefully crafted to trigger an immediate, intuitive wrong answer, the kind of response Daniel Kahneman attributes to "System 1". Participants who answer too quickly tend to give the wrong answer. The correct answer requires deliberate "System 2" thinking: slowing down, checking your intuition, and re-examining the problem.

The original three questions have since become among the most studied items in behavioral economics. They predict decision-making quality, patience in intertemporal choice, risk preferences, and susceptibility to cognitive biases, in some studies more strongly than standard intelligence measures. Across Frederick's full sample of roughly 3,400 respondents, only 17% answered all three correctly; mean scores at the sites he surveyed ranged from 0.57 to 2.18 out of 3, with MIT students at the top of that range.

The CRT is conceptually related to executive function tests like the Trail Making Test Part B, which also measures the ability to override habitual responses and apply deliberate cognitive control.

The Three Original Questions

Frederick's original three questions are deceptively simple. Each exploits a specific cognitive shortcut that most people's brains automatically apply, producing a confident but incorrect answer. Understanding why they work reveals the architecture of human reasoning:

The bat-and-ball problem

The problem: a bat and a ball cost $1.10 in total, and the bat costs $1.00 more than the ball. How much does the ball cost? The intuitive answer ($0.10) comes from decomposing $1.10 into $1.00 and $0.10. But this ignores the constraint that the bat costs $1.00 more than the ball: if the ball is 10 cents, the bat is $1.10 and the total is $1.20, not $1.10. The correct reasoning sets up an equation: ball = x, bat = x + 1.00, total = 2x + 1.00 = 1.10, so 2x = 0.10 and x = $0.05.

Exploits: part-whole decomposition heuristic
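The algebra above can be checked mechanically. The following is a minimal Python sketch, not part of the test itself; the function name solve_bat_and_ball is made up for illustration, and exact fractions are used so that cents do not pick up floating-point noise.

```python
from fractions import Fraction


def solve_bat_and_ball(total, difference):
    """Return (ball, bat) given their combined price and the bat-minus-ball gap."""
    # ball = x and bat = x + difference, so total = 2x + difference.
    ball = (total - difference) / 2
    bat = ball + difference
    return ball, bat


total = Fraction("1.10")
difference = Fraction("1.00")

ball, bat = solve_bat_and_ball(total, difference)
print(float(ball), float(bat))  # 0.05 1.05

# Substituting the intuitive answer back in exposes the error:
# a 10-cent ball forces a $1.10 bat, so the pair would cost $1.20.
intuitive_ball = Fraction("0.10")
intuitive_total = intuitive_ball + (intuitive_ball + difference)
print(float(intuitive_total))  # 1.2
```

Substituting a candidate answer back into the original constraints, as the last lines do, is the same verification habit the test rewards.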

The widget factory problem

The problem: if it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets? The intuitive answer (100 minutes) comes from proportional reasoning: more machines, more time. But the question describes a fixed rate: 1 machine makes 1 widget in 5 minutes. Adding machines doesn't change that rate; it parallelizes the work. 100 machines each make 1 widget in the same 5 minutes, so the answer is 5 minutes.

Exploits: proportionality bias / overextension of linear reasoning
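The rate argument can be made explicit in a few lines. This is an illustrative Python sketch that assumes the standard wording of the problem (5 machines, 5 minutes, 5 widgets); the helper minutes_needed is a hypothetical name, not something from the test.

```python
from fractions import Fraction


def minutes_needed(machines, widgets, rate_per_machine):
    """Minutes for `machines` working in parallel to produce `widgets`."""
    total_rate = machines * rate_per_machine  # widgets per minute, all machines
    return widgets / total_rate


# 5 machines make 5 widgets in 5 minutes, so each machine
# produces 1/5 of a widget per minute.
rate = Fraction(5, 5 * 5)

print(minutes_needed(5, 5, rate))      # 5
print(minutes_needed(100, 100, rate))  # 5
print(minutes_needed(1, 100, rate))    # 500
```

The time only grows when the number of widgets per machine grows, which is why a single machine making 100 widgets takes 500 minutes while 100 machines still take 5.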

The lily pad problem

The problem: a patch of lily pads doubles in size every day and covers the entire lake on day 48. On what day did it cover half the lake? The intuitive answer (24 days) comes from halving 48. But doubling is exponential growth: if the lake is full on day 48 and the patch doubles daily, then the day before (day 47) it must have been half full. Applying linear intuitions to exponential processes is a pervasive cognitive bias, with major consequences for understanding pandemics, compound interest, and climate change.

Exploits: exponential growth blindness / linear extrapolation
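Working backward from the full lake shows how far off the linear guess is. The Python sketch below is illustrative only; coverage_on is a hypothetical helper, and it assumes the lake is exactly full on day 48 with exact daily doubling.

```python
from fractions import Fraction


def coverage_on(day, full_day=48):
    """Fraction of the lake covered on `day`, given daily doubling until `full_day`."""
    # Each day earlier than full_day halves the coverage once more.
    return Fraction(1, 2 ** (full_day - day))


print(coverage_on(48))  # 1
print(coverage_on(47))  # 1/2
print(coverage_on(24))  # 1/16777216
```

On day 24, the intuitive answer, the patch covers less than one ten-millionth of the lake.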

Score Distribution

CRT score distribution (7-question version)

The distribution is heavily skewed toward low scores: in general population samples, 0 is the most common single score, and most people answer no more than one or two questions correctly. Only 17% score perfectly on the 7-question version, confirming that the intuitive-wrong-answer trap is highly effective even for educated adults.

CRT and Decision Making

CRT scores are among the strongest predictors of rational decision-making quality across a wide range of domains. People who score higher on the CRT show better calibrated probability estimates, less susceptibility to the sunk cost fallacy, more consistent time-discounting (patience), and better performance on Bayesian reasoning tasks. For these outcomes, the reported effect sizes are often larger than those for traditional IQ measures.

Financial decisions

Higher CRT scorers choose higher-expected-value gambles, are less prone to excessive loss aversion, and make better long-term financial plans. Frederick's original paper showed CRT predicted financial patience after controlling for cognitive ability.

Risk assessment

CRT correlates with better risk calibration. High scorers are less influenced by framing effects (loss vs gain), less susceptible to the availability heuristic, and more accurate in estimating base rates.

Logical reasoning

CRT performance predicts performance on syllogistic reasoning, conditional logic, and Wason selection tasks even after controlling for verbal and numerical ability. It captures something about the disposition to think carefully, not just ability.

Belief updating

High CRT scorers update their beliefs more accurately when presented with new evidence. They are less susceptible to confirmation bias and more willing to revise initial judgments. This has implications for learning and scientific reasoning.

CRT vs IQ

The CRT is not an IQ test. Traditional IQ tests like Raven's Matrices or the WAIS measure fluid intelligence — the raw capacity to process and reason about novel problems. The CRT measures something more dispositional: the tendency to engage deliberate "System 2" thinking when an automatic System 1 response is available. This is sometimes called cognitive miserliness — the degree to which a person conserves mental effort by accepting the first plausible answer.

CRT and IQ correlate (typically r = 0.25–0.40), but the CRT predicts decision quality in domains where IQ does not, and vice versa. High-IQ individuals fail CRT questions at substantial rates — the MIT data (mean 2.18/3) shows that even highly intelligent people are susceptible to System 1 traps unless they deliberately slow down. This is why the Mini IQ Test and CRT are best treated as complementary measures.

Dimension	CRT	IQ Test
What it measures	Cognitive disposition / reflectiveness	Fluid & crystallized intelligence
Speed relevant?	No — reflection is rewarded	Often timed
Predicts	Decision quality, bias resistance	Academic performance, novel problem solving
Trainable?	Yes — through habit and awareness	Modestly — fluid IQ is relatively stable

How to Think More Reflectively

Deliberately slow down on important decisions

The core skill the CRT measures is choosing to apply deliberate reasoning when the automatic system has already produced an answer. Practicing this, by actively asking "wait, is my first answer actually right?", is the most direct way to improve CRT-style performance. Experimental studies suggest that even a brief pause of about 10 seconds before finalizing a significant judgment can improve decision quality.

Study your own cognitive biases

Awareness of specific biases, such as base rate neglect, the proportionality heuristic, and exponential growth blindness, makes them easier to catch in the wild. Kahneman's Thinking, Fast and Slow is the definitive introduction. Research by Morewedge et al. (2015) showed that a single debiasing training session reduced bias susceptibility by 29% at a three-month follow-up.

Practice checking your arithmetic and logic

Many CRT errors involve simple arithmetic that is never verified. Developing the habit of checking calculations — even when they feel obviously correct — directly targets the core skill being measured. Estimation techniques (back-of-envelope calculations, order-of-magnitude checks) are especially valuable for catching exponential growth problems.

Engage with formal logic and probability

Regular engagement with logic puzzles, probability problems, and mathematical reasoning exercises the deliberate System 2 reasoning that overrides intuitive errors. Fermi estimation, Bayesian reasoning exercises, and conditional probability problems are particularly effective. These skills may also transfer to performance on the Raven's Matrices pattern recognition test.
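As one example of the kind of exercise meant here, the Python sketch below works through a base-rate problem with Bayes' rule. The scenario and all numbers (a 1% base rate, a 90% hit rate, a 9% false-positive rate) are invented for illustration, and posterior is a hypothetical helper name.

```python
from fractions import Fraction


def posterior(prior, hit_rate, false_positive_rate):
    """P(condition | positive result) via Bayes' rule."""
    true_positives = prior * hit_rate
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers: 1% of people have the condition, the test catches
# 90% of real cases, and it wrongly flags 9% of people without it.
p = posterior(Fraction(1, 100), Fraction(9, 10), Fraction(9, 100))

print(p)                   # 10/109
print(round(float(p), 3))  # 0.092
```

The intuitive answer is usually close to the 90% hit rate, but with a 1% base rate a positive result implies only about a 9% chance of having the condition.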