How to Read Numbers

How to Read Numbers: A Guide to Statistics in the News (and Knowing When to Trust Them) is a 2021 British book by Tom and David Chivers. It describes misleading uses of statistics in the news, with contemporary examples drawn from the COVID-19 pandemic, healthcare, politics and crime. The book was conceived by the authors, who are cousins, in early 2020. It received positive reviews for its readability, engaging style, accessibility to non-mathematicians and applicability to journalistic writing.

Background
Tom and David Chivers, cousins, wrote a proposal for the book in the first months of 2020 after complaining to each other about a news story that interpreted numerical data poorly. The proposal used a case study of deaths at a university, which was cut from the final book, and briefly mentioned the emerging COVID-19 pandemic. At the time of writing, Tom Chivers was a science editor for UnHerd, a winner of the Royal Statistical Society's Statistical Excellence in Journalism Awards in 2018 and 2020, and the author of one previous book, The Rationalist's Guide to the Galaxy. David Chivers was an assistant professor of economics at the University of Durham. Tom Chivers viewed journalists as more literate than numerate and incentivised to make information sound dramatic; David Chivers said the "publish or perish" incentive in academia could have a similar effect.

The authors believed statistics could be given more prominence in school curricula and that numerical understanding should be treated as a form of literacy. Tom Chivers received feedback from school and university teachers that they had used the book in their teaching. David Chivers said it was common to view maths as calculation rather than as the interpretation of what numerical information means in context.

The book was released in March 2021. It concludes with a "statistical style guide", recommended for journalists. The authors presented this at the Significance lecture in 2021.

Synopsis
An introduction outlines why the authors believe interpreting statistics is an important skill, with COVID-19 pandemic information to illustrate this. Each chapter covers a misleading use of statistics that can be found in the news:
 * 1) Simpson's paradox, a type of ecological fallacy, means that an average like the basic reproduction number of SARS-CoV-2 can disguise a different trend in subgroups.
 * 2) Anecdotal evidence can guide individual decision-making but extraordinary anecdotes are more likely to be reported, such as for claimed effectiveness of alternative medicine.
 * 3) Studies with small sample sizes drawn from normal distributions can only reliably detect large effect sizes.
 * 4) Biased samples can be reweighted to be representative, but polls are often not reweighted.
 * 5) Hypothesis testing shows statistical significance when the hypothesis is determined before data collection; p-hacking, such as that of Brian Wansink, misuses this framework.
 * 6) Studies with small effect sizes should not be used to make major lifestyle changes, though they may be important to scientific understanding.
 * 7) Confounders must be controlled for to determine causality: while ice cream and deaths by drowning positively correlate, neither causes the other. Some studies found "sensation seeking" as a confounder for vaping and smoking marijuana.
 * 8) Observational studies show correlation, not causation. Randomised controlled trials can measure causality; natural experiments or instrumental variables may be used where this is infeasible.
 * 9) Large numbers do not indicate high frequency without knowledge of a population size, as in misleading reports of cycling deaths, murders committed by undocumented migrants or money sent to the European Union when the UK was a member state.
 * 10) The false positive paradox, a consequence of Bayes' theorem, yields unexpected conclusions: for instance, a person's likelihood of having SARS-CoV-2 after a positive test result varies according to its prevalence in the population.
 * 11) Relative risk can sound more alarming than absolute risk, which is often omitted from news reports: for instance, an 18% increase in seizures could be an increase from 0.024% to 0.028%.
 * 12) Measurement changes can cause inaccurate perception of trends: for example, widening of DSM criteria increased the proportion of the population diagnosed with autism, and crime statistics can rise despite falling prevalence if reporting rates increase.
 * 13) With GDP and PISA rankings as case studies, changes in ranking position are not always statistically significant.
 * 14) Individual studies should be contextualised, as in literature reviews and meta-analyses, but are often reported in isolation in the news. This can make scientific consensus on health appear more changeable than it is. The Lancet MMR autism fraud was amplified by a lack of contextualisation.
 * 15) Publication bias – which can be detected with a funnel plot – leads to overrepresentation of studies that report a correlation or large effect size. Daryl Bem's claimed evidence of effect preceding cause is used as an example.
 * 16) For data that regularly fluctuates, such as the weather, extreme values as starting points can be used to disguise trends or create false trends.
 * 17) Confidence intervals measure uncertainty and Brier scores measure how useful a forecast is over many predictions, rather than over a single prediction that may seem wrong—such as rain occurring when a forecast said there was a 5% chance of rain.
 * 18) It is important to know the assumptions made by a model, as some can drastically change the resultant forecast. For instance, predictions of COVID-19 pandemic deaths differ based on whether they model unchanged human behaviour or compliance with lockdowns.
 * 19) The Texas sharpshooter fallacy can make one prediction seem incredibly accurate in hindsight after many diverse predictions are made, such as of the 2007–2008 financial crisis and the 2017 UK hung parliament result.
 * 20) Identifying patterns by selecting for the dependent variable is survivorship bias, such as concluding what makes a company successful by studying only successful companies.
 * 21) Collider variables, opposite to confounders, can yield false results if controlled for. If entrance to a college is predicated on either high academic scores or sporting achievement then the student population may show a negative correlation between academic and sporting success where none exists in the population.
 * 22) Goodhart's law—"when a measure becomes a target, it ceases to be a good measure"—can be seen in healthcare, politics and education.
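The averaging effect described in point 1 can be shown with a short worked example. The counts below are illustrative only, chosen to produce the reversal, and are not taken from the book:

```python
# Simpson's paradox: treatment A has the higher success rate within BOTH
# subgroups, yet treatment B has the higher success rate once the subgroups
# are pooled. All counts are hypothetical.

data = {  # treatment -> subgroup -> (successes, patients)
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

for sub in ("small", "large"):
    a, b = rate(*data["A"][sub]), rate(*data["B"][sub])
    print(f"{sub}: A {a:.0%} vs B {b:.0%}")  # A is higher in each subgroup

pooled = {
    t: rate(sum(s for s, _ in d.values()), sum(n for _, n in d.values()))
    for t, d in data.items()
}
print(f"pooled: A {pooled['A']:.0%} vs B {pooled['B']:.0%}")  # B is higher overall
```

Because the harder "large" cases dominate treatment A's caseload, pooling weights the two treatments' rates differently and reverses the within-subgroup comparison.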

The authors end with a recommended "statistical style guide" for journalists.
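The false positive paradox from point 10 can be made concrete with Bayes' theorem. The sensitivity, specificity and prevalence figures below are assumptions for illustration, not values from the book:

```python
# False positive paradox: the same test gives very different positive
# predictive values at different prevalences. All figures are hypothetical.

def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(infected | positive test), via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same test (95% sensitive, 99% specific), two different prevalences:
rare = positive_predictive_value(0.95, 0.99, 0.001)   # 1 in 1,000 infected
common = positive_predictive_value(0.95, 0.99, 0.10)  # 1 in 10 infected

print(f"PPV at 0.1% prevalence: {rare:.1%}")   # under 10%
print(f"PPV at 10% prevalence: {common:.1%}")  # over 90%
```

When the condition is rare, the few true positives are swamped by false positives from the much larger uninfected population, so most positive results are wrong even though the test itself is accurate.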

Reception
In a nomination for Chalkdust's 2021 Book of the Year, a reviewer lauded the "readable and enjoyable" brevity of chapters, the clarity and conciseness of explanations and the utility for non-mathematicians. Writing in The Big Issue, Stephen Bush approved of its light tone, informativeness and separation of expository mathematical material into optional sections. Vivek Kaul of Mint praised its simplicity and the importance of the final chapter.

Martin Chilton recommended the book in The Independent as informative and enjoyable, saying that the Chivers "make sense of dense material and offer engrossing insights". In The Times, Manjit Kumar wrote that "the authors do a splendid job of stringing words together so smartly that even difficult concepts are explained and understood with deceptive ease". Rainer Hank of Frankfurter Allgemeine Zeitung said that he had learned much from the book and that such engaging educational material, requiring little mathematical knowledge, could lead to better journalism.