Noise: A Flaw in Human Judgment

Noise: A Flaw in Human Judgment is a nonfiction book by professors Daniel Kahneman, Olivier Sibony and Cass Sunstein. It was first published on May 18, 2021. The book concerns 'noise' in human judgment and decision-making. The authors define noise in human judgment as "undesirable variability in judgments of the same problem" and focus on the statistical properties and psychological perspectives of the issue.

Examples they give include their own finding at an insurance company that the median premiums set by underwriters independently for the same five fictive customers varied by 55%, five times as much as expected by most underwriters and their executives. Another example is that two psychiatrists who independently diagnosed 426 state hospital patients agreed on which mental illness the patient suffered from in only half of the cases. A third is a finding that French court judges were more lenient if it happened to be the defendant's birthday.

Kahneman, Sibony and Sunstein argue that noise is a prevalent and insufficiently addressed problem in matters of judgment. They write that noise arises because of factors such as cognitive biases, mood, group dynamics and emotional reactions. While contrasting statistical bias with noise, they describe cognitive bias as a significant factor giving rise to both statistical bias and noise.

The authors write that noise can lead to gross injustices, unacceptable health hazards, and loss of time and wealth. They argue that organizations should be more committed to reducing noise, and they promote noise audits and decision hygiene as strategies to detect, measure, and prevent noise. Noise: A Flaw in Human Judgment became a New York Times bestseller and received generally positive reviews from critics. Common critiques of efforts to reduce noise are that such efforts dehumanize those affected by the judgments and that they can lead to discrimination. Some commentators also questioned the authors' claims about the novelty of the noise concept.

Noise in human judgment
Noise: A Flaw in Human Judgment was authored by psychologist and Nobel Prize in Economics laureate Daniel Kahneman, management consultant and professor Olivier Sibony, and law professor and Holberg Prize laureate Cass Sunstein. They write that 'noise' in human judgment presents itself in several forms: disagreement between judges, disagreement within judges, and even in judgments made only once by a person or group, since such a judgment can be viewed as only one possible outcome in a cloud of possible judgments that the judge in question could have arrived at. (Note that the term "judge" in the book denotes any person making an assessment of some kind.)

The reasons given by the authors for why noise arises include cognitive biases, differences in skill, differences in 'taste' (preferences) and emotional reactions, mood in the moment, level of fatigue, and group dynamics. The authors consider noise in predictions and evaluations but not in thought processes such as habits and unconscious decisions.

They write that although noise is desirable for certain types of judgments, such as matters of taste concerning cultural works (for example film reviews), society should do much to decrease noise in matters of judgment. This is because noise has sizeable consequences for, among other things, fairness, health, safety, and costs in terms of time and money.

Noise, statistical bias, and cognitive bias
Kahneman, Sibony and Sunstein use a shooting range as an analogy to illustrate noise and statistical bias, and how cognitive bias affects them both. Fig. 1 is an adaptation of the same illustration in the book, comparing how noise and bias affect the accuracy of judgments made by a team of judges. (The original illustration comes from an academic article on noise by Kahneman and his colleagues.)


 * Target a): Since all shots here are in or close to the bullseye, the collective judgment is accurate.
 * Target b): Here there is less accuracy, because of statistical bias: the shots are systematically off in one direction. Put differently, on average there is an error.
 * Target c): Here there is no error on average and thus no bias, because the errors from each shot cancel each other out. However, there is much noise, because the shots differ much from each other.
 * Target d): The error is largest here, since both bias and noise are present. Reducing error is most difficult here.
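
The four targets can be reproduced numerically. The following is a minimal Python sketch, not taken from the book. It assumes that each shot can be reduced to a signed one-dimensional error (a simplification of the two-dimensional target), takes bias to be the mean error and noise to be the standard deviation of the errors, and uses made-up shot values.

```python
from statistics import mean, pstdev


def bias_and_noise(errors):
    """Return (bias, noise) for signed errors (judgment minus true value)."""
    bias = mean(errors)      # average error: the systematic offset
    noise = pstdev(errors)   # spread of the errors around their own mean
    return bias, noise


# Hypothetical shots, expressed as signed distance from the bullseye.
targets = {
    "a (accurate)": [-0.1, 0.0, 0.1, 0.0],
    "b (biased)": [1.9, 2.0, 2.1, 2.0],
    "c (noisy)": [-2.0, 2.0, -1.0, 1.0],
    "d (biased and noisy)": [0.0, 4.0, 1.0, 3.0],
}

for name, errors in targets.items():
    bias, noise = bias_and_noise(errors)
    print(f"target {name}: bias={bias:.2f}, noise={noise:.2f}")
```

In this sketch the noise figure depends only on how the shots differ from each other, while the bias figure requires knowing where the bullseye is.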

Examples of noise
In the book, Kahneman, Sibony and Sunstein provide many examples of noise in human judgment. Beyond the areas listed below, the book also examines noise in, for example, performance evaluation and business strategy. A recurring statement in the book is that "wherever there is judgment, there is noise, and more of it than you think", and the authors accordingly argue that there is noise in most areas of human decision-making.

Criminal law
A study of 208 criminal judges showed that their independently given punitive recommendations on 16 fictive cases varied greatly in harshness. For example, the judges unanimously recommended imprisonment in only three of the cases, and while the recommended number of prison years in one case was 1.1 years on average, one recommendation was as high as 15 years.

Education
A study looked at 682 real decisions by college admissions officers and found that the officers awarded the academic strengths of applicants more importance on cloudy days and, conversely, favored nonacademic strengths on sunny days.

Medicine
One study showed that whereas some radiologists never produced false negatives (missed real breast cancer) when examining mammograms, other radiologists did so half the time. For false positives, the range was 1–64%.

Recruitment
A meta-analysis showed that a quarter of the time, two separate recruitment interviewers disagreed on which job candidate was the best fit for the job. This was despite the interviewers sitting on the same panel, thus having seen the candidates in the exact same circumstances.

Typology
Kahneman, Sibony and Sunstein propose the following typology/components of noise: level noise (arises due to different holistic views on the decision task among judges), stable pattern noise (arises due to permanent or semi-permanent differences between judges on how they react to certain circumstances in what is being judged), and occasion noise (arises due to factors temporarily affecting judgment, having little to do with the judgment/case itself).

The authors describe two 'lotteries' that determine how one's case will be judged. 'The first lottery' concerns level noise and stable pattern noise, and thus which judge is assigned to make the judgment. 'The second lottery' concerns occasion noise, and thus whether the judge happens to make the judgment on an occasion that is favorable or unfavorable to those affected by it.
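
One way to make the typology concrete is a two-way decomposition of a table of judgments. The following Python sketch is an illustration under stated assumptions, not the authors' procedure. It assumes that every judge rates every case exactly once, so occasion noise cannot be separated out and remains inside the pattern term. The function name `decompose_noise` and the sentence lengths are hypothetical.

```python
from math import sqrt
from statistics import mean


def decompose_noise(judgments):
    """Split the variability in a judges-by-cases table.

    judgments[j][c] is judge j's judgment of case c (one occasion each).
    Returns (system_noise, level_noise, pattern_noise) as standard deviations.
    """
    n_judges, n_cases = len(judgments), len(judgments[0])
    cells = [(j, c) for j in range(n_judges) for c in range(n_cases)]
    grand = mean(judgments[j][c] for j, c in cells)
    judge_means = [mean(row) for row in judgments]
    case_means = [mean(row[c] for row in judgments) for c in range(n_cases)]

    # Level noise: how much judges differ in their average severity.
    level_var = mean((m - grand) ** 2 for m in judge_means)
    # Pattern noise: judge-by-case deviations left after removing both averages.
    pattern_var = mean(
        (judgments[j][c] - judge_means[j] - case_means[c] + grand) ** 2
        for j, c in cells
    )
    # System noise: variability across judges within the same case.
    system_var = mean((judgments[j][c] - case_means[c]) ** 2 for j, c in cells)
    return sqrt(system_var), sqrt(level_var), sqrt(pattern_var)


# Hypothetical sentences in years: three judges (rows), three cases (columns).
sentences = [[2.0, 4.0, 6.0], [4.0, 6.0, 8.0], [3.0, 8.0, 4.0]]
system, level, pattern = decompose_noise(sentences)
print(f"system={system:.2f}, level={level:.2f}, pattern={pattern:.2f}")
```

With these population variances the squared system noise equals the sum of the squared level noise and the squared pattern noise. Separating stable pattern noise from occasion noise would require the same judges to rate the same cases on more than one occasion.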

Relative contribution
Kahneman, Sibony and Sunstein write that there is typically more noise than statistical bias. Within the noise, there is typically more pattern noise than level noise. Within the pattern noise, there is typically more stable pattern noise than occasion noise, which is often just a small share of the noise; stable pattern noise is typically larger than level noise on its own. This is all illustrated in Fig 2.

With this said, the authors stress that this is only a tentative understanding of the typical relative sizes of the different types of noise and, moreover, that there are clear exceptions. For instance, they note that noise in asylum cases almost certainly consists mainly of level noise.

Difficult to detect
According to the authors, noise has received far too little attention. They speculate that one reason is that the human brain tends to be better at spotting and understanding patterns than randomness (noise), since humans in general are pattern seekers. People constantly search for causal explanations and are often satisfied with shallow ones that they do not even attempt to falsify. The authors call this the causal mode of thinking (which is part of what Kahneman has dubbed the brain's System 1), in contrast to the statistical mode of thinking (which is part of what Kahneman calls System 2).

Beyond the trouble the human mind has with understanding and seeing randomness as compared to seeing bias, Kahneman, Sibony and Sunstein put forth several possible further explanations for why noise is relatively hard to spot. These explanations include the following:


 * People do not want to believe that there could be so much unwanted variability between judges/judgments.
 * The human mind always wants to blame a bias.
 * People often think that other people reason as they themselves do and hence may not suspect that there could be noise.
 * People often consider bad decisions to be rare exceptions or outliers made by "bad apples", rather than being legitimate data points to consider.

They write that detecting and measuring noise requires deliberate efforts, since "Noise is inherently statistical: it becomes visible only when we think statistically about an ensemble of similar judgments." Of the three types of noise, only level noise can sometimes be detected without expending substantial effort. At the same time, Kahneman, Sibony and Sunstein argue that measuring noise has one advantage over measuring statistical bias: noise can be measured even when the true value of the judgment task is unknown (the bullseye in Fig 1 is not required).

Noise audit
Kahneman, Sibony and Sunstein propose a method to detect noise that they dub a noise audit. Such an audit can be performed in organizations where different judges routinely make judgments on many similar cases. A noise audit as outlined by the authors is a carefully designed study within the organization that comprises collecting anonymous and independent decisions by employees/teams on carefully prepared fictive cases. The amount of noise is then calculated. Beforehand, judges are asked to state their confidence in their judgments and executives to estimate how much noise there will be, which the authors write can further increase the likelihood of eye-opening moments when the results are presented.

Among other caveats, however, the authors caution that a noise audit requires substantial internal buy-in and resources in order to measure noise objectively. Independence between the employees/teams is an important criterion.
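
The calculation step of such an audit can be sketched as follows. This Python example is a hedged illustration rather than the authors' exact method. It assumes that the difference between two judges is expressed as the absolute difference divided by the average of the two judgments, and that the pairwise figures for one case are summarized by their median. The function names and the premium amounts are hypothetical.

```python
from itertools import combinations
from statistics import mean, median


def relative_difference(a, b):
    """Absolute difference between two judgments as a share of their average."""
    return abs(a - b) / mean([a, b])


def median_relative_difference(judgments):
    """Median relative difference over all pairs of judges for one case."""
    return median(relative_difference(a, b) for a, b in combinations(judgments, 2))


# Hypothetical premiums set independently by four underwriters for one case.
premiums = [9500, 16700, 12000, 14000]
print(f"{median_relative_difference(premiums):.0%}")
```

Because every pair of judges is compared on the same case, the measure needs no knowledge of what the correct premium would have been.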

Noise and bias reduction
Kahneman, Sibony and Sunstein's stated hope in the book is to create a new sub-science focused on analyzing and reducing noise in organizations and judgments of all kinds. They argue that the fact that noise has historically gone largely unnoticed and unchecked is not only a tragedy but also an opportunity, since big improvements in the accuracy of judgments, and thus in their fairness, among other things, are within relatively close reach.

Furthermore, the authors write that noise reduction often goes hand in hand with bias reduction and therefore can help address long-known but still persistent problems of bias as well. This is both because bias can be decreased as a direct result of noise reduction and because reducing noise will make bias easier to spot, since less variability in the data means less masking of the bias error.

Leaning on a mathematical proof, the authors also state that a reduction in noise gives the same reduction in the total error as an equally large reduction in bias. Since noise often constitutes the larger part of the total error, they argue that noise reduction should often be the first choice (unless there is a good reason not to), as it is then the easiest way to realize large reductions in error.
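
The argument rests on what the book calls the error equation, in which the mean squared error of a set of judgments equals the squared bias plus the squared noise. The following Python sketch illustrates the symmetry with made-up numbers. The starting values for bias and noise are assumptions chosen so that noise is the larger component.

```python
def mean_squared_error(bias, noise):
    """Error equation: overall error (MSE) = bias squared + noise squared."""
    return bias ** 2 + noise ** 2


# Hypothetical starting point in which noise is the larger component.
bias, noise = 2.0, 4.0

baseline = mean_squared_error(bias, noise)
halve_bias = mean_squared_error(bias / 2, noise)    # only the bias is halved
halve_noise = mean_squared_error(bias, noise / 2)   # only the noise is halved

print(f"baseline MSE:        {baseline}")
print(f"after halving bias:  {halve_bias}")
print(f"after halving noise: {halve_noise}")
```

Bias and noise enter the equation in exactly the same way, so swapping their values leaves the total unchanged. Halving whichever component is larger removes more error, which in this made-up case favors noise reduction.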

Decision hygiene
Kahneman, Sibony and Sunstein use the term decision hygiene to describe the use of various techniques that can reduce noise in human judgment. The reason they favor the hygiene metaphor is that both the causes of noise in human judgment and the noise itself are difficult to see with the naked eye. They write that threats can harm you even though they are invisible, giving the analogy that if something goes wrong in the operating room it is not necessarily because of scalpel misuse; it could be because the highly skilled surgeon forgot to perform proper hand hygiene before entering. Just as washing one's hands protects against unidentified enemies we don't even see, decision hygiene can eliminate or diminish sources of noise we don't even think about.

In the book, the authors present various decision hygiene techniques through different case study chapters. These techniques include both relatively small changes to the choice architecture (the physical and psychological environment in which the judgment is made) such as nudges (a concept made famous by Sunstein together with Richard Thaler) and larger changes. The latter include debiasing, use of algorithms/rules, use of guidelines, use of relative scales, use of base rates, aggregation of judgments, structured and carefully sequenced decision-making processes, or even simply finding better judges. In other words, one path of decision hygiene is to aid judges in various ways, for example in choosing which factors to look at, how to weigh the different factors and how to use the scale in question. The other decision hygiene path is to wholly replace human judgment with algorithms, hard rules or better judges. Examples given of the rule-based approach are algorithms for making fairer and more accurate bail decisions concerning flight risk and the rules and procedures doctors use to quantify tendon degeneration.
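
Of the techniques listed, aggregation of judgments is the simplest to illustrate numerically. The Python simulation below is a sketch under stated assumptions: the judgments are independent, their errors are normally distributed with no bias, and the noise of a single judge is set to an arbitrary value of 10. Under these assumptions the noise of an average of n judgments is expected to shrink roughly in proportion to the square root of n. The function name `noise_of_average` is hypothetical.

```python
import random
from statistics import pstdev


def noise_of_average(n_judges, noise_sd, trials=5000, seed=0):
    """Estimate the noise (standard deviation) of the mean of n independent judgments."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    averages = []
    for _ in range(trials):
        errors = [rng.gauss(0.0, noise_sd) for _ in range(n_judges)]
        averages.append(sum(errors) / n_judges)
    return pstdev(averages)


for n in (1, 4, 16):
    print(f"{n:>2} judges: noise of the average is about {noise_of_average(n, 10.0):.1f}")
```

Averaging leaves any shared bias untouched, and the benefit is smaller when judges influence each other, which is why independence matters for aggregation.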

The type of decision hygiene technique most suitable for the given situation depends on the type of judgment, the authors write. For example, evaluations performed by state officials are different in many respects from predictions made by political advisors.

Kahneman, Sibony and Sunstein also weave several of these decision hygiene techniques together into a decision hygiene procedure they call the mediating assessments protocol (MAP). They argue that organizations of many different kinds should use MAP for group decisions that require considering and weighting multiple dimensions. This would not make the decision easier, they concede, but the emphasis on good process would make the decision better.

Zero noise is undesirable
Although they argue for substantial noise reduction, Kahneman, Sibony and Sunstein also caution that the goal should not always be zero noise in human judgment. First, reaching zero noise may not be feasible, not least because of costs and because some noise will always remain for many types of judgment. Second, reaching zero noise may not be desirable since it would reduce flexibility, especially if the decision hygiene technique considered has to do with rules. Here, using standards can be a better option. Third, it is often important to let judges have a sense of agency and feel fulfilment after having reached a decision, as evidenced by the abandonment of the federal sentencing guidelines in the US, which had been successful in reducing noise. Fourth, potentially important information possessed by individual judges can be lost, again especially if the decision hygiene technique considered has to do with rules.

For these reasons, there will always be room and need for intuition, consideration of particulars and discretionary judgment, but preferably not until noise reduction has been performed, the authors argue. The idea is not to rely on human judgment all the way, and especially not on snap judgment. Instead, carefully informed intuition that is less prone to error, thanks to decision hygiene, should be used.

Critical response
Noise: A Flaw in Human Judgment received generally positive reviews from critics, who praised the concept, its importance and the depth with which it was analyzed. However, common critiques of the book were directed towards its accessibility, inconsistency of style, and length.

In the American press, The Washington Post reviewed the book as "well-researched, convincing, and practical," and only criticized a lack of consistency in style: "some sentences and sections read like a psychology or statistics textbook, others like a scholarly article, and still others like the Harvard Business Review." The New York Times called it a "tour de force of scholarship and clear writing". The Financial Times described it as a "humbling lesson in inaccuracy" and compared it to Kahneman's earlier work Thinking, Fast and Slow. They also pointed out that Noise: A Flaw in Human Judgment may be harder for readers to get through than Thinking, Fast and Slow because the former concerns a narrower problem and therefore has a difficult time reaching the same level of entertainment. The Wall Street Journal Magazine listed the book in its article "The Nine Best Books of Spring".

In the British press, The Times considered the book "a rigorous approach to an important topic" and wrote that "anyone who has found the literature on cognitive biases important will find this a valuable addition to their knowledge". However, they also criticized its consistency, accessibility, and length. The Sunday Times, a sister paper to The Times, argued that the book is an "outstanding study" and "a monumental, gripping book" but also wrote that the book is "a hard read if you don’t happen to be a statistician". The Economist credited the book's positive tone but criticized that "despite the book’s title, the authors struggled to extract the signal from the noise" and wrote that "a tighter argument would have enhanced the ideas they present". The Guardian wrote that the book is "blunt, half-baked" and said that it "could have been half the length and it would have been a far better book for it".

Book industry magazines Publishers Weekly and Kirkus Reviews praised the book's practical analysis of noise and bias, but they found the language somewhat over-complicated and dense.

Side effects
Irish Minister for Finance Paschal Donohoe commented that decreasing the space for noise to arise can increase rigidity, which may demoralize judges and squelch their creativity and sense of professional pride and agency. Donohoe praised the book in general but in this regard wrote that "The risk is that this is at the expense of the diversity of thought and outlook that is central to human agency" and that "guiding decisions towards a sterile central average can also create vulnerabilities".

Anticipating this critique (and several others, including that people may feel dehumanized if not judged by a fellow human being), Kahneman, Sibony and Sunstein wrote in the book that this fear of what algorithmic decision making can lead to is indeed valid, but mainly for certain types of decision hygiene; other types can in fact increase the role for human judgment by freeing up time for thought and deliberation and by unearthing more perspectives on a question.

Writer Michael Blastland wondered whether noise reduction can be overly risky when it comes to evaluative judgments. He noted that the authors themselves say, although not as clearly as would be desired, that if a set of judgments is biased toward the side of the true value that has asymmetrically large consequences, and noise reduction lowers the variability of the judgments, then a higher share of the judgments will fall on the wrong side of the true value. This raises the likelihood of catastrophes, such as a business making a faulty decision that ends up ruining the company. Noise reduction could therefore be a bad idea for such judgments, he argued.

Anticipating this critique, Kahneman, Sibony and Sunstein write in the book that noise reduction should ideally be followed by decision makers using the now-better judgment data together with their values and potential risk-avoidance criteria to make the optimal choice.

Novelty
Some commentators interpreted how Kahneman, Sibony and Sunstein express themselves in the book, in interviews and in similar forums as claiming that the concept of noise in human judgment is something new or at least has been "discovered" by them. The above-mentioned Blastland as well as professor of statistics and political science Andrew Gelman challenged this notion, especially with reference to the existing and similar use of the concept within statistics. Professor of experimental and behavioural economics Andreas Ortmann did so with reference to the previous use of the concept within psychology. Blastland also argued that noise can be seen as simply another word for subjectivity.

Book development
Kahneman and Sibony worked on the concept and the drafting process intermittently for several years before bringing Sunstein in, and even conducted a study on noise at an insurance company in the shape of what they would later call a noise audit. They discussed the project with Sunstein in a meeting in New York in 2018. In the meeting they reportedly all realized the high potential of collaborating on the subject, and asked him to join that same day. The authors finalized the manuscript during the COVID-19 pandemic. Kahneman has said that the restrictions and difficulties associated with the pandemic, and the resulting switch from meeting in person to writing over video calls, made the writing more productive, which meant that they were able to finish the book faster than would otherwise have been the case: "We might not have finished the book if it hadn’t been for the virus." The writing process was coordinated by decision-making scientist Linnea Gandhi, who also did much of the editing. Gandhi also co-wrote the Harvard Business Review article that introduced the concept of noise in human judgment in 2016. In the Acknowledgements section of the book, the authors thank many fellow social scientists for their input during the writing of the book, not least on the statistical parts.