In 2011, social psychologist Daryl Bem published a paper in a prestigious journal claiming to have found evidence for precognition — the ability to sense future events. The paper passed peer review. It used standard statistical methods. And it was, by broad scientific consensus, almost certainly wrong. But the methods Bem used were the same methods used throughout psychology and other social sciences. If those methods could produce evidence for psychic powers, what else had they falsely supported? The answer, it turned out, was a disturbing proportion of published research. In 2015, the Open Science Collaboration attempted to replicate 100 studies published in top psychology journals. Only 36 produced statistically significant results the second time. The effect sizes — even for studies...
Popular framing: The crisis was caused by sloppy scientists cutting corners, or by outright fraud, and can be fixed by demanding more rigorous statistics and penalizing dishonest researchers. The Goodhart's Law reading: once 'publication' became the target, 'science' ceased to be the outcome.
Structural analysis: The replication crisis is the predictable output of a system in which every node (researchers, institutions, journals, funders) is rewarded for impressive novel claims rather than for truth. Individually rational behavior (publishing significant results, avoiding costly replications, inflating effect sizes) aggregates into a collectively irrational literature. No individual actor is to blame, because no individual actor controls the selection pressures that shape what gets funded, published, cited, and rewarded. Compounding this is a 'skin in the game' problem: being publicly wrong in academia carries no personal cost.
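The aggregation step can be made concrete with a toy simulation. The sketch below is illustrative only and is not drawn from any study mentioned here: the true effect (0.2), the sample size (20 per group), the known standard deviation of 1, and the rule that a journal publishes only results with z > 1.96 in the predicted direction are all assumptions, and `simulate_literature` is a hypothetical name. Every simulated researcher is honest, yet the published record overstates the effect.

```python
import math
import random
import statistics


def simulate_literature(true_effect=0.2, n_per_group=20, n_studies=5000, seed=0):
    """Simulate many small, honest studies of one modest true effect.

    Assumed setup (hypothetical): outcomes are normal with known SD 1, and a
    study is "published" only if its z statistic exceeds 1.96 in the
    predicted direction.
    """
    rng = random.Random(seed)
    se = math.sqrt(2 / n_per_group)  # standard error of the difference in means
    all_estimates, published = [], []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        estimate = statistics.fmean(treated) - statistics.fmean(control)
        all_estimates.append(estimate)
        if estimate / se > 1.96:  # the publication filter
            published.append(estimate)
    return all_estimates, published


all_estimates, published = simulate_literature()
print("true effect:            0.20")
print(f"mean of all studies:    {statistics.fmean(all_estimates):.2f}")
print(f"mean of published only: {statistics.fmean(published):.2f}")
print(f"share published:        {len(published) / len(all_estimates):.1%}")
```

Under these assumptions a study can only clear the filter if its estimate exceeds roughly 0.62, about three times the true effect, so the published mean is inflated by construction. A faithful replication in this toy world would on average land near 0.2 and would often miss significance, with no misconduct anywhere in the pipeline.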
Focusing on individual misconduct or statistical techniques obscures the principal-agent dynamics that make these behaviors adaptive. Reforms that target researcher behavior without restructuring journal incentives, funding criteria, and tenure metrics will be captured and neutralized. This is a classic Goodhart's Law pattern: new measures (pre-registration, a stricter p < 0.005 threshold) become new targets to game rather than genuine proxies for reliable science.