Abstract

Book review published as: Aronow, Peter M. and Fredrik Sävje (2020), "The Book of Why: The New Science of Cause and Effect." Journal of the American Statistical Association, 115: 482-485.

Peter M. Aronow and Fredrik Sävje

Yale University, 115 Prospect Street, New Haven, CT 06520

The Book of Why: The New Science of Cause and Effect

Judea Pearl and Dana Mackenzie. New York: Basic Books, 2018. ISBN 978-0-465-09761-6. ix+432 pp.

Judea Pearl is a giant in the field of causal inference, whose many contributions, including the discovery of the d-separation criterion, have been immeasurably valuable. He, along with science writer Dana Mackenzie, has written an important book that relates Pearl’s work to a broad audience and makes an argument for its place in the scientific canon.

The book recounts the history of the Causal Revolution. The reader is told that causal inference was in a sorry state of affairs for most of the twentieth century. The scientific community was unable to tackle even the most basic causal inquiry, with grave consequences. For example, when discussing how scientists could not reach an agreement about whether smoking causes lung cancer, the book notes that “millions of lives were lost or shortened because scientists did not have an adequate language or methodology for answering causal questions” (p. 19). However, during this age of darkness, a small light of hope was burning in the form of Sewall Wright and a few other brave men. The light was the spark of the Causal Revolution in the 1990s. The revolution consisted of the introduction of a graphical causal representation in the form of directed acyclic graphs (DAGs), and a set of associated tools, by Pearl and his coauthors. Scientists were finally given the language and methodology they needed to conduct serious causal investigations, and apart from a few pockets of resistance, the scientific community rejoiced:

“There is now an almost universal consensus, at least among epidemiologists, philosophers and social scientists, that (1) confounding needs and has a causal solution, and (2) causal diagrams provide a complete and systematic way of finding that solution.” (p. 141)

As an autobiographical and expositional text for those working in the field, the book is both informative and entertaining.
We are, however, concerned that the presentation will mislead readers whose first acquaintance with the subject is this book. The goals of this review are twofold: to highlight instances where naive readers, especially policy-makers, might be led to unrealistically optimistic conclusions; and to discuss alternative models that are overlooked in Pearl’s account of the causal revolution.

Unfounded optimism about causal models

The book paints a picture of a field that has come to its conclusion. Causal inference, for most intents and purposes, is solved. The optimism is appealing: the world is just waiting for scientists to uncover its mysteries. A scientific revolution will follow Pearl’s causal revolution, at least among those that adopt his language and methodology.

We believe this optimism is unfounded. The central problem that scientists face, especially in the social sciences, is not how to express or analyze causal models but how to pick one that is valid or at least reasonable. The book does not claim to solve this problem. For example, discussing Henry Niles’ critique of Sewall Wright’s research, Pearl writes:

“Many people still make Niles’ mistake of thinking that the goal of causal analysis is to prove that X is a cause of Y or else to find the cause of Y from scratch. That is the problem of causal discovery . . . In contrast, the focus of Wright’s research, as well as this book, is representing plausible causal knowledge in some mathematical language, combining it with empirical data and answering causal queries that are of practical value. Wright understood from the very beginning that causal discovery was much more difficult and perhaps impossible.” (pp. 79–80)

The topic of the book is causal analysis, not causal discovery. In Pearl’s model and calculus, the underlying causal structure is assumed to be known. The issue is that scientists often disagree on this structure. Pearl’s approach may help clarify exactly where the disagreement lies, but it will not provide adjudication.

The smoking-lung cancer debate, recounted in Chapter 5, is a good illustration. Following Pearl’s recipe, the scientific community in the 1950s should have represented the current consensus about the plausible causes of cancer (and all other relevant aspects of the causal nexus) in a DAG.
After applying Pearl’s calculus, the causal relationships of interest could have been investigated empirically, and the debate would have been resolved. The recipe fails, however, already on the first step. As captured in the exchange between Ronald Fisher and Jerome Cornfield, the central question of the debate was which causal model was the plausible one. Fisher held it possible (or plausible) that a gene was causing both smoking and lung cancer, and Cornfield disagreed. Diagrams 5.1 and 5.2 in the book (p. 176) provide a graphical representation of their positions. The disagreement was, however, not founded in a misunderstanding, so the clarification provided by the DAGs would have been of little use.

The heart of the problem is that large parts of the scientific community share Pearl’s skepticism about the prospects of causal discovery. In our experience, the modes of inquiry that adjudicate debates avoid the problem altogether. Cornfield’s sensitivity analysis, which played an important role in resolving the smoking and lung cancer debate, is one such approach. Pearl acknowledges its value, but the approach is outside of his dichotomy of causal analysis and causal discovery. The causal structure (e.g., as encoded in a DAG) is not presumed to be known here, nor is that structure the target of our inferences, as in causal discovery. Instead, sensitivity analyses can remain somewhat agnostic about the underlying structure. This agnosticism is one reason they are useful; sensitivity analyses can adjudicate debates because scientists can agree on their validity without reaching full agreement over what constitutes plausible causal knowledge.

The book’s treatment of randomized controlled trials (RCTs) also fails to demonstrate an appreciation for this central problem. Pearl writes:

“Once we have understood why RCTs work, there is no need to put them on a pedestal and treat them as the gold standard of causal analysis, which all other methods should emulate. Quite the opposite: we will see that the so-called gold standard in fact derives its legitimacy from more basic principles.” (p. 140)

For the purposes of causal analysis in Pearl’s model, there is no distinction between a randomized experiment and any similarly unconfounded treatment assignment. However, this neglects that the successful implementation of an experiment ensures the assumption of unconfoundedness in a manner that can survive scrutiny from even the most determined skeptic. We can move from a metaphysical discussion about the correct causal model to a practical discussion about the experimental protocol and whether it was followed. Experiments allow us to be largely agnostic about the causal structure.

The problem with these more agnostic modes of inference is their limited applicability. As Pearl notes, we can do less without a rich causal model. No one doubts the usefulness of Pearl’s framework in situations where it can be applied. What naive readers might miss is that these situations are rare, particularly in the social sciences. And in the cases where reasonable consensus about the causal structure can be reached, Pearl’s calculus will often tell us that it is not possible to draw inferences. For example, as noted in the book, Fisher’s claim that genetic disposition confounds the smoking-lung cancer relationship was correct. A graph alone cannot encode that this confounding is weak, and Pearl’s calculus would have told us that progress was not possible until these genes were identified and measured.

These realizations are sobering: contrary to the impression given to readers, causal inquiry cannot be reduced to a mathematical exercise nor automatized. Causal inference is possible, but it is a daunting task best served by modesty and humility.

Pluralism in causal inference

Pearl’s self-described “Whig history” of causal inference is selective and narrow. Besides his own contributions, the book focuses on the contributions of his intellectual ancestors.
Antagonists are occasionally brought on the stage, but only for the purpose of being proven wrong. Readers will easily be under the impression that the field has seen a slow but inevitable progression towards enlightenment, despite misguided resistance from the establishment, culminating with Pearl as a singular figure.

This account is misleading. No consensus, not even an emerging one, exists about the superiority of DAGs. Causal inference has its roots in many disciplines, and several conceptual frameworks and methodological approaches exist and thrive. Pearl reduces this pluralism to “cultural resistance” (p. 394). Scientists who resist DAGs are, however, not stubborn monks using their quills to defend a last stand against the printing press. The pluralism is instead a reflection of the range of challenges they face. The reason Pearl’s model is not used more widely is that many scientists do not find it useful.

An extensive survey of the field is beyond the scope of this review. We will instead provide a few examples of alternative causal models and explain why scientists might prefer them. These approaches differ from Pearl’s model in that they impose different amounts or different types of structure on the causal problem.

Robins (1986) developed a causal model that is closely related to Pearl’s model and predates it. Both models are nonparametric structural equation models. The sole distinction is that Pearl’s model invokes additional independence assumptions. These assumptions make his model more powerful, but they do not have testable implications. Pearl is quick to point out the first part while neglecting the second. For example, Robins’ model does not allow for identification of natural mediation effects, and Pearl notes that thanks to his model the “age-old quest for a mediation mechanism has been reduced to an algebraic exercise” (p. 20). The reason Robins did not make these assumptions is not because of a lack of imagination.
He was reluctant to do so because they are too strong for the applications he has in mind, and because they cannot be verified even through experimentation. These are exactly the considerations scientists face. The additional [...]

Footnote: If such a survey were to be written, it could address the two most notable omissions in the book. The first is the contributions of a large group of scholars who were central to the development of modern causal inference. A partial list of this group is: Angrist, Ashenfelter, Campbell, Card, Heckman, Imbens, Manski, Murphy, Robins, and Rosenbaum. The second omission is estimation from data, which the book has a tendency to trivialize. Readers are not made aware of the fundamental difficulties in this enterprise. Under Pearl’s model, it is impossible to estimate causal effects well without additional statistical assumptions. Statistical theory in this setting remains an active area of research.

Footnote: Robins (1986) also introduces notation and a calculus for causal effects, called g-notation and g-computation, that have direct parallels with Pearl’s do-operator and do-calculus. In particular, Robins’ g-notation includes Pearl’s do-operator as a special case.

[...] how different variables are related. Scientists often find this useful when their substantive knowledge or theories suggest certain functional forms. An economist might, for instance, be comfortable assuming that an increase in income will not cause a reduction in total consumption. The usual trade-off applies, of course, and more elaborate models may introduce conceptual ambiguities and lack of robustness.

Pearl does not hide the fact that his calculus cannot exploit information about characteristics of causal relations, but he does not explain to what degree this limits its usefulness. An illustrative example lies with Pearl’s discussion of the local average treatment effect interpretation of the instrumental variable method.
The approach relies on a monotonicity assumption, which states that a causal effect is in the same direction for all units in the population. The assumption concerns the characteristics of an effect rather than its existence, so it cannot be encoded in a DAG. In Pearl’s words:

“In do-calculus we make no assumptions whatsoever regarding the nature of the functions in the causal model. But if we can justify an assumption like monotonicity or linearity on scientific grounds, then a more special-purpose tool like instrumental variables estimation is worth considering.” (p. 257)

The statement is correct but misleading. A casual reader would be under the impression that these assumptions and the associated “special-purpose tools” are on the fringes of causal inference. On the contrary, instrumental variable methods are immensely popular in the social sciences. So are regression discontinuity and difference-in-differences designs, which are other methods relying on functional form assumptions (continuity and additivity, respectively). These methods are omitted from Pearl’s account of the causal revolution. If readers were made aware of their existence and popularity, they might question whether “causal diagrams provide a complete and systematic way of finding a solution [to confounding].”

Peter M. Aronow and Fredrik Sävje

Yale University

References

Robins, James (1986), “A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect,”