Summary of "Noise: A Flaw in Human Judgment"

Core Idea

Noise is unwanted variability in judgments that should be identical; bias is systematic error in one direction. The book argues that noise is a pervasive but underrecognized source of bad decisions.
The central claim is that in many domains, especially when truth is hidden or delayed, people and institutions should treat judgment like measurement and ask how much scatter there is around the target, not just whether the average is off.
Noise matters because it produces unfairness, inconsistency, and error in settings where similar cases should be treated similarly, from sentencing and hiring to medicine, forecasting, and asylum.

How Noise Works

The authors use the shooting-range metaphor: even if the bull’s-eye is unknown, repeated shots can reveal whether judgments are clustered tightly or scattered widely.
They distinguish level noise (different average severity), pattern noise (different reactions to particular cases), and occasion noise (the same judge changing with mood, fatigue, weather, or time of day).
A key result is the error equation, MSE = Bias² + Noise², where MSE is mean squared error: bias and noise contribute to overall error independently and on equal terms, so reducing noise matters as much for accuracy as reducing bias.
In a noisy system, errors made on different cases do not cancel out: an overpriced insurance policy does not offset an underpriced one. Both are costly, and the errors add up across the system.
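
The error equation can be checked numerically on a small sample. The sketch below is illustrative only: the forecasts and the true value are made up, the function name is hypothetical, and bias is taken as the mean error and noise as the standard deviation of the errors.

```python
from statistics import fmean, pstdev

def error_decomposition(judgments, true_value):
    """Return (mse, bias, noise) for repeated judgments of one known target."""
    errors = [j - true_value for j in judgments]
    mse = fmean(e * e for e in errors)  # mean squared error
    bias = fmean(errors)                # average signed error
    noise = pstdev(errors)              # population standard deviation of errors
    return mse, bias, noise

# Hypothetical forecasts of a quantity whose true value turned out to be 100.
forecasts = [104, 110, 98, 107, 101]
mse, bias, noise = error_decomposition(forecasts, true_value=100)

# The two printed numbers agree up to floating-point rounding.
print(mse, bias**2 + noise**2)
```

Because the identity holds for any set of numbers, the same accuracy gain is available from shrinking either term.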
The book argues that many professionals live with an illusion of agreement, assuming colleagues would decide as they do because they share language, norms, and institutional culture.
Noise audits expose this illusion by sending the same cases to multiple judges and comparing answers; in insurance, executives expected differences of around 10 percent, but the median gap between two underwriters’ quotes for the same case was 55 percent.
The book repeatedly notes that noise stays hidden in organizations because people notice only the obviously bad decisions, not the variability among otherwise acceptable ones.
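
One way to score such an audit is to take every pair of judges who saw the same case, express their gap as a fraction of the pair's average, and report the median. The sketch below assumes that measure; the function names, case labels, and premiums are hypothetical.

```python
from itertools import combinations
from statistics import median

def relative_difference(a, b):
    """Absolute gap between two judgments as a fraction of their average."""
    return abs(a - b) / ((a + b) / 2)

def noise_index(audit):
    """Median pairwise relative difference across all cases.

    `audit` maps a case id to the list of judgments that different
    professionals gave for that same case.
    """
    gaps = [
        relative_difference(a, b)
        for judgments in audit.values()
        for a, b in combinations(judgments, 2)
    ]
    return median(gaps)

# Hypothetical premiums (in dollars) quoted by three underwriters per case.
audit = {
    "case_A": [9000, 15000, 12000],
    "case_B": [20000, 22000, 30000],
}
print(f"median pairwise gap: {noise_index(audit):.0%}")
```

Using the median rather than the mean keeps a single extreme pair from dominating the result.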

Evidence, Domains, and Mechanisms

Sentencing is the book’s canonical example: studies found dramatic disparities for similar cases, and irrelevant factors such as hunger, time of day, and even local football losses affected severity.
The U.S. Sentencing Reform Act of 1984 and sentencing guidelines reduced interjudge disparity, but when the guidelines became advisory, disparities and ideological effects rose again.
The same logic appears in insurance, where identical cases handled by different underwriters or adjusters produced huge judgment gaps; similar variability showed up in stock analysis and other professional tasks.
Occasion noise is real even in one person: repeated estimates, free throws, and rejudgments vary, and mood can push people toward greater cooperation, gullibility, stereotyping, or harsher moral choices.
Groups can amplify noise through informational cascades, social pressure, and group polarization; deliberating juries can become noisier than aggregated independent judgments.
The book emphasizes that noisy systems are not just random chaos: some variability is stable pattern noise, reflecting enduring differences in how judges see and weight cases.
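
A judges-by-cases table can be split into these components. The sketch below assumes the additive decomposition in which squared system noise equals squared level noise plus squared pattern noise. A single audit cannot separate occasion noise, so here it is folded into the pattern term. The sentences and the function name are hypothetical.

```python
from statistics import fmean

def decompose_system_noise(matrix):
    """Split system noise into level and pattern components.

    `matrix[j][c]` is judge j's judgment of case c. Returns variances
    (squared noise) as a tuple: (system, level, pattern).
    """
    n_judges, n_cases = len(matrix), len(matrix[0])
    judge_means = [fmean(row) for row in matrix]
    case_means = [
        fmean(matrix[j][c] for j in range(n_judges)) for c in range(n_cases)
    ]
    grand_mean = fmean(judge_means)

    # System noise: average, over cases, of the variance across judges.
    system = fmean(
        (matrix[j][c] - case_means[c]) ** 2
        for j in range(n_judges) for c in range(n_cases)
    )
    # Level noise: variance of the judges' average severity.
    level = fmean((m - grand_mean) ** 2 for m in judge_means)
    # Pattern noise: what remains after removing judge and case effects.
    pattern = fmean(
        (matrix[j][c] - judge_means[j] - case_means[c] + grand_mean) ** 2
        for j in range(n_judges) for c in range(n_cases)
    )
    return system, level, pattern

# Hypothetical sentences in years: 3 judges (rows) x 4 cases (columns).
sentences = [
    [2, 5, 9, 4],
    [4, 6, 12, 6],
    [3, 9, 8, 4],
]
system, level, pattern = decompose_system_noise(sentences)

# The two printed numbers agree up to floating-point rounding.
print(system, level + pattern)
```

A large pattern term relative to the level term would indicate that judges differ less in overall severity than in how they respond to particular cases.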
In predictive judgment, simple mechanical models often outperform clinical intuition; Meehl’s work, later reviews, and the model-of-the-judge finding (a simple statistical model of a judge’s own past decisions tends to outperform that judge) all support the claim that formulas are usually more accurate and cheaper.
Equal-weight and other frugal models often rival regression because regression can fit flukes in the sample; cross-validated, out-of-sample performance is what matters.
Machine learning and AI extend this logic by using large data sets and noise-free rules, sometimes improving accuracy, reducing detention, and even lowering racial disparity relative to human judges.
But algorithms are not automatically fair: trained on biased historical data, they can reproduce human bias, so the book evaluates them against multiple criteria rather than treating them as neutral by default.
The book also stresses objective ignorance: some forecasting error comes from the future’s inherent unknowability, not from bias or noise alone.
Tetlock’s forecasting work, the Good Judgment Project, and the idea of superforecasters show that better prediction is possible, but the limits are real and humans often mistake confidence for knowledge.
Social and causal thinking make events feel more explainable than they were; in what the book calls the valley of the normal, hindsight creates coherent stories that hide how unpredictable singular events really were.

Reducing Noise

The book’s improvement agenda begins with a noise audit, then asks whether better judges, debiasing, rules, algorithms, or decision hygiene can reduce variance without creating worse problems.
Better judges matter: true expertise requires actual predictive success when outcomes are verifiable, while respect-experts are trusted mainly because they sound coherent, intelligent, and professionally credible.
Cognitive style matters too: measures like the Cognitive Reflection Test (CRT) and actively open-minded thinking predict better judgment, especially when people seek contrary evidence and update carefully.
The book’s signature procedural remedy is decision hygiene, which means structuring judgment to reduce both bias and noise before the final decision is made.
Core hygiene tools include: use the outside view; break complex judgments into independent parts; delay intuition until evidence is gathered; collect independent judgments; aggregate them; and use relative scales rather than vague absolute ones.
Structural reforms work because averaging n independent judgments divides noise by the square root of n, while scales with clearer anchors reduce arbitrary spread.
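
The effect of averaging can be illustrated with a small simulation. The sketch below assumes unbiased, independent, normally distributed errors with a standard deviation of 10; the function name and all parameters are hypothetical. It compares the simulated noise of the average with the noise of a single judge divided by the square root of the number of judges.

```python
import random
from statistics import fmean, pstdev

def noise_of_average(n_judges, noise_sd, trials=5000, seed=0):
    """Estimate the SD of the mean of `n_judges` independent noisy judgments."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = [
        fmean(rng.gauss(0.0, noise_sd) for _ in range(n_judges))
        for _ in range(trials)
    ]
    return pstdev(averages)

for n in (1, 4, 9):
    simulated = noise_of_average(n, noise_sd=10.0)
    predicted = 10.0 / n ** 0.5  # single-judge noise divided by sqrt(n)
    print(n, round(simulated, 2), round(predicted, 2))
```

The reduction depends on independence: if judges influence one another before answering, their errors become correlated and averaging removes less noise.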
The book illustrates this with structured interviews, performance ratings, forensic sequencing rules, diagnostic checklists such as Apgar and BI-RADS, and protocols that separate evidence collection from final holistic judgment.
A major forensic lesson is that sequencing matters: revealing context too early can trigger confirmation bias, taint subsequent analysis, and produce cascading error, as in fingerprint identification and the Brandon Mayfield case.
The authors argue that unstructured interviews, unguided ratings, and loosely anchored evaluations are noisy partly because they invite first impressions, coherence-making, and inconsistent weighting of cues.
Their preferred organizational approach is not blind faith in discretion but a mix of selection, training, independent assessment, aggregation, and carefully designed rules.

What To Take Away

Bias and noise are different, and the book insists that noise is often the larger, more neglected problem.
The practical standard is not “Did the decision feel reasonable?” but “How variable would the judgment be if the same case were seen again by another competent person, or by the same person later?”
Mechanical rules, algorithms, and structured procedures often beat intuition, but the book also warns that noise reduction can bring costs, rigidity, and sometimes new forms of bias.
The best systems are therefore not the most impressionistic ones; they are the ones that make judgments more independent, more comparable, and less hostage to whim.