The Book of Why

The Book of Why: The New Science of Cause and Effect is a 2018 nonfiction book by computer scientist Judea Pearl and writer Dana Mackenzie. The book explores the subject of causality and causal inference from statistical and philosophical points of view for a general audience.

Summary
The book consists of ten chapters and an introduction.

Introduction: Mind over Data
The introduction describes the inadequacy of early 20th-century statistical methods for making statements about causal relationships between variables. The authors then describe what they term 'The Causal Revolution', which started in the middle of the 20th century and provided new conceptual and mathematical tools for describing causal relationships.

Chapter 1: The Ladder of Causation
Chapter 1 introduces the 'ladder of causation', a diagram used to illustrate the three levels of causal reasoning. The first level, named 'Association', concerns associations between variables. Questions such as 'is variable X associated with variable Y?' can be answered at this level, but causality is not invoked. An example of reasoning on this first level is the observation that a crowing rooster is associated with the sunrise. Such reasoning cannot describe causal relations: it cannot say whether the sunrise causes the rooster to crow, or whether the rooster causes the sun to rise. Many of the early 20th-century statistical tools, such as correlation and regression, operate on this level.

The second level (or 'rung') on the ladder of causation is labelled 'Intervention'. Reasoning on this level answers questions of the form 'if I make the intervention X, how will this affect the probability of the outcome Y?'. For example, the question 'does smoking increase my chance of lung cancer?' exists on the second level of the ladder of causation. This kind of reasoning invokes causality and can be used to investigate more questions than the reasoning of the first rung.

The third rung of the ladder of causation is labelled 'Counterfactuals' and involves answering questions which ask what might have been, had circumstances been different. Such reasoning invokes causality to a greater degree than the previous level. An example counterfactual question given in the book is 'Would Kennedy be alive if Oswald had not killed him?'

Chapter 2: From Buccaneers to Guinea Pigs: The Genesis of Causal Inference
Chapter 2 starts with a brief summary of the contributions of Francis Galton and Karl Pearson to the development of statistics in the late 19th and early 20th centuries. The authors blame Galton for keeping the study of statistics on the first rung of the ladder of causation and discouraging any discussion of causality in statistics. Causal analysis using path diagrams is then introduced through an account of the work of Sewall Wright.

Chapter 3: From Evidence to Causes: Reverend Bayes meets Mr Holmes
Chapter 3 provides an introduction to Bayes' theorem and then introduces Bayesian networks. Finally, the links between Bayesian networks and causal diagrams are discussed.
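
As a minimal illustration of Bayes' theorem, the sketch below computes the probability of a condition given a positive test result. The function name and all numbers (a 1% prior, 90% sensitivity, 10% false-positive rate) are assumptions chosen for illustration and are not taken from the book.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem.

    prior               -- P(condition)
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    true_positive = sensitivity * prior
    false_positive = false_positive_rate * (1.0 - prior)
    # P(positive) is the sum over both ways of testing positive.
    return true_positive / (true_positive + false_positive)


# Illustrative (assumed) numbers: a rare condition and an imperfect test.
p = posterior(prior=0.01, sensitivity=0.9, false_positive_rate=0.1)
print(f"P(condition | positive) = {p:.4f}")
```

With these assumed numbers the posterior stays below 10% even though the test is fairly accurate, because the condition is rare to begin with.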

Chapter 4: Confounding and Deconfounding, or, Slaying the Lurking Variable
This chapter introduces the idea of confounding and describes how causal diagrams can be used to identify confounding variables and determine their effect. Pearl explains that randomized controlled trials (RCTs) can be used to nullify the effect of confounders, but shows that, provided one has a causal model of confounding, an RCT does not necessarily have to be performed to get results.
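
The idea of removing confounding without an RCT can be sketched with the standard adjustment formula, P(Y | do(X)) = sum over z of P(Y | X, z) P(z), which applies when Z is assumed to be the only confounder of X and Y. The probability tables and function names below are hypothetical, invented for illustration; they are not data from the book.

```python
# Hypothetical model: Z confounds treatment X and outcome Y (all binary).
P_Z = {0: 0.5, 1: 0.5}                  # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}         # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {                       # P(Y=1 | X=x, Z=z), keyed by (x, z)
    (0, 0): 0.1, (1, 0): 0.2,
    (0, 1): 0.5, (1, 1): 0.6,
}


def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary x."""
    return P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]


def observed(x):
    """P(Y=1 | X=x): what purely observational data would show."""
    p_x = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    joint = sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return joint / p_x


def adjusted(x):
    """P(Y=1 | do(X=x)) via adjustment for Z (assumes Z is the only confounder)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)


naive_effect = observed(1) - observed(0)
causal_effect = adjusted(1) - adjusted(0)
print(f"naive difference:    {naive_effect:.2f}")
print(f"adjusted difference: {causal_effect:.2f}")
```

In this made-up model the treatment raises the outcome probability by 0.1 within each stratum of Z, and the adjusted estimate recovers that value, whereas the naive comparison is inflated because Z drives both X and Y.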

Chapter 5: The Smoke-filled Debate: Clearing the Air
This chapter takes a historical approach to the question 'does smoking cause lung cancer?', focusing on the arguments made by Abraham Lilienfeld, Jacob Yerushalmy, Ronald Fisher and Jerome Cornfield. The authors explain that, though cigarette smoking was clearly correlated with lung cancer, some, such as Fisher and Yerushalmy, believed that the two variables were confounded and argued against the hypothesis that cigarettes caused the cancer. The authors then explain how causal reasoning (as developed in the rest of the book) can be used to argue that cigarettes do indeed cause cancer.

Chapter 6: Paradoxes Galore!
This chapter examines several paradoxes, including the Monty Hall Problem, Simpson's paradox, Berkson's paradox and Lord's paradox. The authors show how these paradoxes can be resolved using causal reasoning.
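
The Monty Hall problem can be checked by exact enumeration. The sketch below assumes the standard rules (the host always opens a goat door that the player did not pick, choosing uniformly when two such doors exist); the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def monty_hall_probabilities():
    """Return (P(win | stay), P(win | switch)) under the standard rules."""
    doors = range(3)
    stay = Fraction(0)
    switch = Fraction(0)
    # Car location and the player's first pick are independent and uniform.
    for car, pick in product(doors, doors):
        p_world = Fraction(1, 9)
        # The host may only open a door that is neither the pick nor the car.
        host_options = [d for d in doors if d != car and d != pick]
        for opened in host_options:
            p = p_world / len(host_options)
            # Switching means taking the one door that is still closed.
            switched_to = next(d for d in doors if d != pick and d != opened)
            if pick == car:
                stay += p
            if switched_to == car:
                switch += p
    return stay, switch


stay, switch = monty_hall_probabilities()
print(f"stay: {stay}, switch: {switch}")  # prints "stay: 1/3, switch: 2/3"
```

The result depends on the assumption that the host knows where the car is and never reveals it; under a different host policy the probabilities change.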

Chapter 7: Beyond Adjustment: The Conquest of Mount Intervention
This chapter looks at the 'second rung' of the ladder of causation introduced in Chapter 1. The authors describe how to use causal diagrams to ascertain the causal effect of performing interventions (e.g. smoking) on outcomes (such as lung cancer). The 'front-door criterion' and the 'do-calculus' are introduced as tools for doing this. The chapter finishes with two examples that introduce the use of instrumental variables to estimate causal relationships. The first is John Snow's discovery that cholera is caused by unsanitary water supplies. The second is the relationship between cholesterol levels and the likelihood of a heart attack.
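
For reference, the front-door adjustment is usually written as follows. The symbols are assumptions introduced here: X is the intervention, Y the outcome, and M a mediator that carries the whole effect of X on Y and is shielded from the unobserved confounder of X and Y.

```latex
P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid x', m)\, P(x')
```

Every term on the right-hand side is an ordinary conditional or marginal probability, so the causal effect can be estimated from observational data even though the confounder itself is never measured.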

Chapter 8: Counterfactuals: Mining worlds that could have been
This chapter examines the third rung of the ladder of causation: counterfactuals. The chapter introduces 'structural causal models', which allow reasoning about counterfactuals in a way that traditional (non-causal) statistics does not. Then, the applications of counterfactual reasoning are explored in the areas of climate science and the law.
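
A counterfactual query on a structural causal model is commonly answered in three steps: abduction (infer the unobserved background factors from what was observed), action (set the variable of interest to its hypothetical value) and prediction (recompute the outcome). The sketch below applies these steps to a deliberately simple, hypothetical linear model Y = a·X + U; the coefficient and the observed values are assumptions, not an example from the book.

```python
def counterfactual_y(x_obs, y_obs, x_cf, a=2.0):
    """Counterfactual outcome in the assumed model Y = a * X + U.

    x_obs, y_obs -- what was actually observed for one individual
    x_cf         -- the hypothetical value X would have taken instead
    a            -- structural coefficient (assumed known)
    """
    # Abduction: recover this individual's background factor U.
    u = y_obs - a * x_obs
    # Action and prediction: set X to x_cf, keep U fixed, recompute Y.
    return a * x_cf + u


# Observed X=1 and Y=3; what would Y have been had X been 0?
y_cf = counterfactual_y(x_obs=1.0, y_obs=3.0, x_cf=0.0)
print(f"counterfactual Y = {y_cf}")  # prints "counterfactual Y = 1.0"
```

Keeping U fixed is what distinguishes this from an intervention on a fresh individual: the question is about the same unit under different circumstances.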

Chapter 9: Mediation: The Search for Mechanism
This chapter discusses mediation: the mechanism by which a cause leads to an effect. The authors discuss the work of Barbara Stoddard Burks on the causes of intelligence of children, the 'algebra for all' policy by Chicago public schools, and the use of tourniquets to treat combat wounds.

Chapter 10: Big Data, Artificial Intelligence and the Big Questions
The final chapter discusses the use of causal reasoning in big data and artificial intelligence (AI) and the philosophical problem that AI would have to reflect on its own actions, which requires counterfactual (and therefore causal) reasoning.

Reviews
Scientific background, excerpts, errata, and a list of 37 reviews of The Book of Why are provided on Judea Pearl's web page.

The Book of Why was reviewed by Jonathan Knee in The New York Times. The review was positive, with Knee calling the book "illuminating". However, he describes some parts of the book as "challenging", stating that the book is "not always fully accessible to readers who do not share the author's fondness for equations".

Tim Maudlin gave the book a mixed review in The Boston Review, calling it a "splendid overview of the state of the art in causal analysis". However, Maudlin criticizes the inclusion of "counterfactuals" as a separate rung on the "ladder of causation", stating "[c]ounterfactuals are so closely entwined with causal claims that it is not possible to think causally but not counterfactually". Maudlin also criticizes the section on free will for its "imprecision and lack of familiarity with the philosophical literature". Finally, he points to the work of several scientists (including Clark Glymour) who developed ideas similar to Pearl's, and claims that Pearl "could have saved himself literally years of effort had he been apprised of this work". In a rebuttal, Pearl notes that he was intimately familiar with this work at the time.

Zoe Hackett, writing in Chemistry World, gave The Book of Why a positive review, with the caveat that "[i]t requires concentration, and a studious effort to work through the mind-bending statistical problems posited in the text". The review concludes by stating that "[t]his book is a must for any serious student of philosophy of science, and should be required reading for any first-year undergraduate statistics class".

Lisa R. Goldberg wrote a detailed, technical review in Notices of the American Mathematical Society.