Purely Bayesian counterfactuals versus Newcomb's paradox
Lê Nguyên Hoang, EPFL
Abstract
This paper proposes a careful separation between an entity's epistemic system and their decision system. Crucially, Bayesian counterfactuals are estimated by the epistemic system, not by the decision system. Based on this remark, I prove the existence of Newcomb-like problems for which an epistemic system necessarily expects the entity to make a counterfactually bad decision.

I then address (a slight generalization of) Newcomb's paradox. I solve the specific case where the player believes that the predictor applies Bayes rule with a superset of all the data available to the player. I prove that the counterfactual optimality of the one-box strategy depends on the player's prior on the predictor's additional data. If these additional data are not expected to sufficiently reduce the predictor's uncertainty about the player's decision, then the player's epistemic system will counterfactually prefer to two-box. But if the predictor's data is believed to make them quasi-omniscient, then one-boxing will be counterfactually preferred. Implications of the analysis are then discussed.

More generally, I argue that, to better understand or design an entity, it is useful to clearly separate the entity's epistemic and decision systems, but also its data collection, reward and maintenance systems, whether the entity is human, algorithmic or institutional.

Newcomb's paradox is an iconic paradox of decision theory. Introduced by Nozick [1969], the problem involves a player, call her Alice, and a predictor that we will name Omega. Omega predicts Alice's behavior and, essentially, determines her reward based on this prediction. This then puts Alice into a seemingly impossible dilemma.

On one hand, a seemingly causal argument suggests that Alice should ignore her predictability, especially once Omega's prediction has already been made. This argument suggests that Alice should then adopt a strategy called two-box. On the other hand, by modifying her behavior, and assuming that Omega's prediction is really reliable, Alice seems able to bias Omega's prediction so that it becomes aligned with Alice's interest. This suggests that Alice may be able to hack her predictability to gain more rewards, a strategy known as one-box. More thorough details are provided in Section 3.1.

Remarkably, scholars are extremely divided on what strategy Alice should adopt. In fact, Newcomb's paradox seems to be unveiling a fundamental gap in our understanding of decision theory, and thus of related topics such as free will, game theory and algorithm design. Because of this, over the last half-century, Newcomb's paradox has received a lot of attention from philosophers (Schlesinger [1974], Locke [1978], Horwich [1985], Nozick [1994, 1997], to name a few), mathematicians (Gardner [1974], Wijayatunga [2019], Giacopelli [2019]), economists (Broome [1989], Sugden [1991], Weber [2016]), political scientists (Brams [1975], Frydman et al. [1982]), computer scientists (Aaronson [2013], Everitt [2018]) and psychologists (Bar-Hillel and Margalit [1972]).

In addition to connections to fundamental philosophical problems, Newcomb's paradox has been linked to practical problems, such as the voting dilemma (see Elster [1987]). Assuming that Alice's vote has a negligible effect, but is time-costly for her, the equivalent of two-boxing would intuitively be to argue that Alice's optimal selfish strategy is to not take the time to vote. However, it has been argued that, by going to vote, Alice changes what she would predict about the behavior of individuals similar to her, thereby significantly increasing the probability that Alice's favorite candidate will get elected, which is in Alice's selfish interest. An argument analogous to one-boxing thus suggests that Alice should vote.

In this paper, we present a novel analysis of Newcomb's paradox. The core idea of the analysis is a clear separation between an individual's epistemic system and their decision system. The epistemic system will be assumed to be purely Bayesian, which means that all of its thoughts result from data and the laws of probability. In particular, the epistemic system can thereby engage in counterfactual reasoning, to determine which decisions yield the largest counterfactual expected rewards.

As any good Bayesian, at any point in time, the epistemic system must also assign a probability to any future event. This evidently includes the individual's future decision. Now, in principle, such a probability could take the values 0 or 1. However, we stress the fact that an epistemic system with such perfect knowledge of the individual's decision cannot engage in counterfactual reasoning. Under perfect knowledge, and under the laws of probability, Bayesian counterfactual reasoning becomes nonsensical.

As a result, most of this paper discusses the arguably more realistic case of imperfect knowledge of the decision system. This means that the epistemic system fails to know with full certainty the algorithm executed by the decision system, and the decision system's inputs. In such a case, in Section 2, we prove Theorem 1, which says that no decision system can be expected to guarantee counterfactual optimization. We then discuss consequences of the theorem for decision theory.

In Section 3, we then tackle a generalization of Newcomb's paradox, under the assumption that a Bayesian Omega has collected a superset of Alice's data. We show that the counterfactual optimality of a decision strongly depends on the extent to which Alice believes the predictor Omega to know more than her about her decision system. Indeed, if Alice does not believe that Omega knows more than her, then two-box is the only counterfactually optimal decision. But if Alice expects Omega to know a lot more, to the point of being quasi-omniscient, then one-box becomes the only counterfactually optimal decision. These findings are formalized by Theorem 2, and by its diverse corollaries (see Section 3.3).

Section 4 will then discuss our results and raise further questions. First, we analyze the impact of relaxing the assumptions of the previous section. Second, we discuss implications of our analysis for practical Newcomb-like problems. Third, we generalize the main idea of the paper, namely the separation of epistemic and decision systems, to a further separation of the different systems of any information processing entity. Finally, Section 5 concludes.
In this section, we show that no entity can guarantee counterfactual optimization. To understand this claim, we first insist on the distinction between an entity's epistemic system and their decision system. We then stress the fact that counterfactual optimization is not an algorithm, but a property, which results from the interaction between the epistemic system, the decision system and the problem at hand. We then state and prove the impossibility of counterfactual optimization, and discuss consequences.
In this paper, we study how an information processing entity, like a human, a machine or an organization, infers a world view and makes a decision. Confusingly, these two different tasks seem entangled. On one hand, it seems important to first infer a world view before making a decision. But as highlighted by Newcomb-like paradoxes, it seems that a decision can sometimes inform a world view.

To disentangle views and decisions, we propose to clarify the distinction between these two tasks. In fact, we will assume that each task is handled by a specific system of the entity. More specifically, in this paper, we focus on an entity that we call Alice, which possesses both an epistemic and a decision system. We call them respectively Emma and Dan. Emma will infer the state of the world based on her data. Dan will compute decisions based on his data.

Except in Section 4.2.2, we assume Emma to be a pure Bayesian. In other words, Emma will apply solely the laws of probability to compute credences and expectations. In particular, Emma's credence in a theory T of the state of the world, given the data D_E she collected, will be given by Bayes rule:

P[T | D_E] = P[D_E | T] P[T] / P[D_E].   (1)

Note that we adopt here the purely Bayesian interpretation of probabilities. Namely, as discussed by Laplace [1840], such probabilities describe an entity's knowledge and uncertainty, and apply even in a deterministic universe.

In this paper, we will discuss only Emma's and Omega's credences, mostly under the assumption that they have common priors, and that Emma's data D_E are common knowledge. As a result, without loss of generality, we omit the conditioning by D_E in our notations.
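To make the update rule (1) concrete, here is a minimal numerical sketch in Python; the two candidate theories and all the numbers are purely illustrative assumptions, not taken from the paper.

```python
def posterior(prior, likelihood):
    """Emma's Bayes-rule update: P[T | D_E] = P[D_E | T] P[T] / P[D_E],
    where P[D_E] is obtained by summing over the candidate theories."""
    evidence = sum(prior[t] * likelihood[t] for t in prior)
    return {t: prior[t] * likelihood[t] / evidence for t in prior}

# Illustrative example: two rival theories and one observed piece of data D_E.
prior = {"T": 0.2, "not-T": 0.8}        # Emma's prior credences
likelihood = {"T": 0.9, "not-T": 0.3}   # P[D_E | theory]
print(posterior(prior, likelihood))     # {'T': 0.428..., 'not-T': 0.571...}
```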
In our paper, we analyze Dan's decision through the lens of purely Bayesian counterfactuals.

Definition 1. Emma counterfactually prefers a decision X if, for any alternative decision Y, Emma estimates the counterfactual expected rewards to be larger assuming X than assuming Y, i.e.

E[R | X] ≥ E[R | Y].   (2)

Recall that the expectations are implicitly also conditioned on Emma's data D_E. This contrasts with some versions of causal decision theory that invoke Pearl and Mackenzie [2018]'s "do-operator". This operator is added on top of Bayesianism, and also depends on some causal modeling of the problem. As a result, we do not consider it to be purely Bayesian.

Counterfactual optimality is often invoked to argue in favor of one decision over another. In the case of the COVID-19 pandemic, for instance, some observers have argued that lockdown was not actually as bad for the economy as one might expect. Their argument was not that the economy did not suffer during the lockdown; rather, they argued that the counterfactual world without a lockdown decision would have seen a similar blow to the economy. By then also taking public health into consideration, such observers argued that lockdown was very probably counterfactually optimal. In other words, these observers have essentially argued that

E[Welfare | Lockdown] ≥ E[Welfare | No-Lockdown],   (3)

where Welfare refers to some sort of public good, which involves both public health and financial safety. Note that these counterfactual expectations should also be conditioned on all the data available at the moment of the lockdown decision, including the dangerous spread of the COVID-19 disease. Interestingly, some opponents agreed with the general approach to determine whether Lockdown was a good decision, but disagreed with the estimation of the counterfactual expectations.

Similarly, in Newcomb's problem, one-boxers will typically argue that E[R | one-box] = R, where R is the content of the opaque box if the predictor Omega predicts one-box, while the counterfactual expected rewards of two-boxing are E[R | two-box] = r < R, since Omega would then leave the opaque box empty. Two-boxers will instead argue that E[R | two-box] = r + E[R | one-box]. Clearly, both cannot be right simultaneously. Section 3 will aim to clarify what is going on.
One feature of Bayesianism is that, if P[D] = 0, then P[T | D] is ill-defined. In a sense, a Bayesian cannot consider events that they have completely discarded. But this raises a serious technical issue if we assume that Emma knows for sure Dan's decision X. Indeed, Emma would then be completely discarding the possibility that Dan makes an alternative decision Y. In other words, Emma may believe P[Y] = 0. But then, the counterfactual expectation E[R | Y] would be ill-defined, which makes counterfactual optimization nonsensical. To resolve this issue, in this paper, we will only consider Bayesians with imperfect knowledge.

Definition 2. A Bayesian has imperfect knowledge about X if, for any possible value x of the event, the Bayesian assigns a strictly positive probability to X = x.

In the case of Alice, it seems actually reasonable to assume that, especially in practice, Emma cannot guarantee that Dan will execute a given decision algorithm. After all, even if Emma carefully inspected Dan before Dan makes a decision, and even if Emma knows exactly the data given to Dan, Dan may still slightly change before the computation is executed. In other words, Emma may precisely know what Dan was like seconds ago, but she cannot know for sure what Dan will be doing in a few seconds, when Dan actually runs his computation to deliver a decision. It thus seems reasonable to assume that Emma has an imperfect knowledge of Dan's decision.
Finally, we can state the main result of this section. It asserts that no entity can guarantee counterfactual optimization.
Theorem 1.
There exists a decision problem with n options for which any entity with imperfect knowledge assigns a probability at least 1 − 1/n to counterfactually bad decisions.

Proof. Consider n opaque boxes. Dan must choose one of the boxes. Denote Box-i the event that Dan chooses the i-th box. Now assume that Emma is convinced that a predictor Omega has the same data as her and knows her prior. As a result, Emma believes with probability 1 that Omega can compute Emma's credence P[Box-i] in Dan deciding Box-i. Suppose also that Emma believes with probability 1 that Omega decides the content of box i as follows. Omega computes the smallest value i* of i such that P[Box-i] ≤ 1/n. (Such an i* exists, since the n credences P[Box-i] sum to at most 1.) Omega then sets R_{i*} = 1, and R_j = 0 for j ≠ i*.

Then, from Emma's perspective, given that she knows the rewards R_i, any decision Box-j different from Box-i* is counterfactually bad. However, we also know that Emma assigns a probability at most 1/n to Box-i*. Thus, according to Emma, there is a probability at least 1 − 1/n that Dan's decision is counterfactually bad.

Note that our proof assumes that Emma knows for sure that Omega knows Emma's data and prior. Interestingly, it is however robust to relaxing this condition, and to assuming that Emma only strongly believes that Omega knows her data and prior. This then yields a probability of at least 1 − 1/n − o(1) of a counterfactually bad decision, where o(1) denotes a term that can be made arbitrarily small by considering that Emma's credences are arbitrarily close to 1.
It is noteworthy that Theorem 1 does not apply to the framework of AIXI, a counterfactually optimal decision algorithm introduced by Hutter [2001, 2004]. In this framework, an entity called AIXI interacts with its environment by making decisions based on its observed past data, and the environment responds to AIXI's decision by providing new data and a reward. The environment is assumed to reply according to a computable probability distribution that depends on past data and decisions, and on the latest decision.

Essentially, AIXI escapes our impossibility theorem because the AIXI framework prevents external entities from making AIXI's reward depend on AIXI's uncertainty about its decision. More precisely, AIXI's expected rewards given AIXI's decision X are assumed to be independent from AIXI's uncertainty about deciding X. Formally, this assumption enforces the equality E[R | X ∧ P[X] = p] = E[R | X ∧ P[X] = q] for any values of p and q. This is formalized by an environment µ whose outputs, given the past and AIXI's decision, cannot depend on the policy π of AIXI; this allows AIXI's counterfactual rewards to be a linear function of π, where π is regarded as a probability distribution. Conversely, our proof of Theorem 1 designed a problem where the rewards given a decision X highly depend on π. As a result, AIXI's credence P[X] in the fact that it will decide X cannot be exploited by the environment to bias AIXI's rewards. This contrasts with our proof of Theorem 1 in which, if P[X] is large, then the reward given X would be designed to be small.

Unfortunately, the restricted framework of AIXI has been argued to make AIXI unrealistic. In particular, the fact that the environment cannot exploit its understanding of AIXI seems incompatible with embedded agency, as discussed by Demski and Garrabrant [2019]. Embedded agency assumes that any entity is itself part of the environment it interacts with. It raises numerous other challenges, such as the uncomputability of Bayes rule proved by Solomonoff [2009], its computational hardness (see Aaronson [2012]), as well as the risk of wireheading the reward system (see Everitt [2018]); on the positive side, it allows improvement by a maintenance system, as discussed in Section 4.3.3. Given embedded agency, it then seems that any entity can be analyzed by some other entity Omega of the environment. Omega can then exploit its analysis to make counterfactual optimization impossible, as proved by Theorem 1.
Perhaps the most important take-away of Theorem 1 is that counterfactual optimization cannot be a decision algorithm. Rather, it should be regarded as a property that no decision algorithm always satisfies. At best, a decision algorithm should be designed to often satisfy counterfactual optimization. However, the more general question of what ought to be expected from a "good" decision algorithm seems still far from being resolved.

Theorem 1 may share similarities with the no-free-lunch theorems in learning theory (Wolpert [1996], Joyce and Herrmann [2018]). These theorems suggest that there is no canonical property that identifies the "good" learning algorithms, which are arguably what a "good" epistemic system should implement. Instead, the defense of Bayesianism, like in Hoang [2020], rests upon a myriad of desirable properties, some of which can be proved to be unique to Bayesianism, such as robustness to Dutch book arguments (Teller [1973], Skyrms [1987]), compatibility with logic (Cox [1946, 1963], Jaynes [2003]) and statistical admissibility (Wald [1947], Robert [2007]).

In fact, just as Bayesianism actually includes a large family of epistemologies, each derived from a given prior, the right path forward in decision theory might consist of proving that some desirable property can only be satisfied by decision algorithms taken from a certain family. Unfortunately, this research direction is out of the scope of the present paper.
In this section, we analyze Newcomb's paradox under the lens of Bayesian counterfactuals. But first, we need to consider a slight generalization of Newcomb's paradox, which more adequately fits the Bayesian framework.
In this paper, we consider a slight variation on Newcomb's problem to make it more realistic. In particular, we will care about the data that enables the predictor to make its prediction. Let us describe the problem we consider. Alice enters a room with two boxes A and B.

• Box A is opaque. Alice cannot see what is inside. But she knows how the content of Box A was decided, which is discussed below.
• Box B is transparent. Alice sees that Box B contains a reward r > 0.

Alice is then told that she must decide between two strategies.

• The one-box strategy consists of only taking Box A.
• The two-box strategy consists of taking both Box A and Box B.

What makes Newcomb's paradox interesting is the way the content of Box A is decided. In some classical versions of Newcomb's paradox, some omniscient entity Omega makes a prediction. If Omega predicts that Alice will one-box, then Omega puts a large reward R in Box A. Otherwise, if Omega predicts that Alice will two-box, then Omega leaves Box A empty.

However, the omniscience assumption is reasonably criticized for being too unrealistic. In this paper, we will instead assume that Omega is an information processing system which exploits huge amounts of data about Alice and about the world. Based on this large database D_Ω, Omega infers a probability ω ≜ P[one-box | D_Ω] that Alice will one-box. Now, given this probability guess ω, Omega will throw a biased coin, which has a probability ω to land on heads. If the coin lands on heads, Omega will put the reward R in Box A. Otherwise, Omega leaves Box A empty.

Note that the classical version of Newcomb's paradox is retrieved by assuming that Omega knows Alice's decision process and all the inputs to this process. Indeed, Omega can then simply simulate this decision process to determine Alice's decision. Our variant is thus a natural generalization of the classical paradox.

Now, Alice knows how Omega operates. She knows that Omega applies Bayes rule and she knows that Omega has processed a huge amount of data that Alice cannot access. Alice wants to maximize her expected (counterfactual) rewards. Alice now has to choose between the one-box and the two-box strategy. What should she do? (To fix ideas, one can imagine r = $1,000 and R = $1,000,000; the analysis, however, holds for any values of r and R.)
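To illustrate the procedure just described, here is a minimal Python sketch of one round of the generalized problem, using the illustrative figures above; the Beta distribution standing in for Omega's data-driven guess ω is purely an assumption made for the example.

```python
import random

def omega_fills_box_a(omega_guess, large_reward):
    """Omega's randomized rule: put the large reward R in Box A
    with probability omega_guess = P[one-box | D_Omega]."""
    return large_reward if random.random() < omega_guess else 0

omega_guess = random.betavariate(2, 2)  # stand-in for P[one-box | D_Omega]
box_a = omega_fills_box_a(omega_guess, large_reward=1_000_000)  # R
box_b = 1_000                                                   # r, always visible
print(omega_guess, box_a, box_b)
```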
The odd feature of Newcomb's paradox is the assumption that some external observer Omega can better know Alice than she knows herself. At first sight at least, it may seem that no one can better guess what Alice will decide than Alice herself, right before she makes her decision.

However, Libet [1993]'s famous experiment suggests that this may not be the case. An algorithm processing some magnetic resonance imaging of Alice's brain may then be better able to predict some of Alice's decisions, at least a fraction of a second before the decision is made. More generally, it seems that, in practice, a natural way for Omega to predict Alice's decision is to collect large amounts of data, not only about Alice, but also about entities similar to Alice. Typically, a philosopher who posed Newcomb's paradox to generations of students might have gained a remarkable capability to predict their current students' intuitions for Newcomb's paradox. Similarly, by analyzing all sorts of data available on social media, an algorithm could detect patterns that would allow it to reliably predict what a given social media user may decide. In fact, Wang and Kosinski [2018] designed one such algorithm, later replicated by Leuner [2019], that achieved superhuman performances at predicting sexual orientations from human faces.
Let us focus on the case where Emma believes that Omega has the same prior as her, and has a superset of her data. We denote p ≜ P[one-box] Emma's prior on one-box, and σ² ≜ V[ω] her prior variance on Omega's prediction. Assuming that Emma has imperfect knowledge of Dan then corresponds to 0 < p < 1. The following theorem characterizes Emma's counterfactual preferences based solely on the variables p and σ².

Theorem 2. Assume that Emma has imperfect knowledge of Dan. Suppose also that Emma believes that Omega is Bayesian, has the same prior as her and knows a superset of her data. Then, Emma counterfactually prefers one-box to two-box if and only if

r/R ≤ σ²/(p(1 − p)).

Unsurprisingly, the larger R is relative to r, the more one-box will tend to be preferable. But perhaps what is more interesting is the right-hand side. Note that the quantity p(1 − p) is essentially the variance, according to Emma's prior, of what Dan will do. Thus, the right-hand side compares the variance of Omega's prediction to the variance of Dan's decision. Or, put differently, it is a measure of how much Emma expects Omega to know more than her about Dan. Intuitively, the more Emma knows about Dan, the more she will counterfactually prefer two-box to one-box. But the more she feels that Omega knows more than her, the more she will counterfactually prefer one-box to two-box.

Thus, at its core, Newcomb's paradox really seems to be about how much an entity believes that an observer can better know them than they know themselves. This may seem extremely confusing at first glance. But separating our epistemic system from our decision system arguably helps to clarify the origin of the paradox. Indeed, Newcomb's paradox is actually rather about how much the observer can better predict the output of the entity's decision system than what the entity's epistemic system can predict. In particular, if this epistemic system is unreliable, then it may not be that hard to outperform it.
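To get an order of magnitude, take the illustrative figures r = $1,000 and R = $1,000,000, and assume, purely for illustration, that Emma's prior is p = 1/2. Then r/R = 10⁻³ and p(1 − p) = 1/4, so Theorem 2 makes one-box counterfactually preferable as soon as σ² ≥ 10⁻³ × 1/4 = 2.5 × 10⁻⁴, i.e., as soon as Emma expects Omega's prediction ω to have a standard deviation of roughly 0.016 or more. With such a large ratio R/r, even a modest informational advantage for Omega suffices to tip Emma's counterfactual preference.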
Before proving Theorem 2, let us first note the following lemma, which will be critical in the analysis of Newcomb's problem. The lemma states the validity of the argument of authority, when the authority is a more knowledgeable honest Bayesian.

Lemma 1 (Argument of authority). If Emma believes that Omega is an honest Bayesian, that Omega has the same prior as her, and that Omega has a superset of her data, then Emma should believe whatever Omega says. More formally, we have

P[T | P[T | D_Ω] = p] = p.   (4)

Proof. To clarify, denote D_p the set of data D_Ω such that P[T | D_Ω] = p; D_p excludes all data which do not satisfy this equality. By the law of total probability, P[T | D_p] is then necessarily an average of terms P[T | D_Ω], for D_Ω ∈ D_p. But all such terms equal p, hence the lemma.

The proof of Theorem 2 yields some insights into the mechanisms at play. In fact, it rests on the following four interesting observations on what Emma predicts based on her uncertainty about Omega's prediction. All the lemmas implicitly make the same assumptions as Theorem 2.
Lemma 2. Emma's prior p on Dan deciding to one-box is equal to the expectation of her prior on Omega's prediction, i.e. p = E[ω].

Proof. According to the law of total probability, P[one-box] = E_ω[P[one-box | ω]]. By Lemma 1, if Emma learned Omega's prediction ω = P[one-box | D_Ω], then she too would assign a probability ω to Dan deciding to one-box. Therefore, P[one-box | ω] = ω. Thus, p = P[one-box] = E[ω].

Lemma 3. Emma assigns a prior probability p to Box A containing the large reward R, i.e. P[A = R] = p.

Proof. According to the law of total probability, P[A = R] = E_ω[P[A = R | ω]]. By the decision algorithm of Omega (see Section 3.1), we have P[A = R | ω] = ω. Thus P[A = R] = E[ω] = p.

Lemma 4. Emma's posterior belief that Box A contains the large reward R, given a one-box decision by Dan, is given by P[A = R | one-box] = p + σ²/p. Interestingly, this quantity is larger than the prior probability, especially if the variance σ² of Omega's prediction is large and if Omega is expected to essentially predict two-box.

Proof. By the law of total probability, we have

P[A = R | one-box] = Σ_ω P[A = R | one-box ∧ ω] P[ω | one-box]   (5)
= Σ_ω (P[A = R ∧ one-box | ω] / P[one-box | ω]) · (P[one-box | ω] P[ω] / P[one-box])   (6)
= Σ_ω P[A = R | ω] P[one-box | ω] P[ω] / p   (7)
= (1/p) Σ_ω ω² P[ω] = E[ω²]/p = (p² + σ²)/p = p + σ²/p.   (8)

In the second line, we used Bayes rule and the definition of conditional probability. In the third line, we used the conditional independence of the events A = R and one-box given ω. In the fourth line, we exploited the fact that the content of Box A is decided by a coin of bias ω, as well as Lemma 1, which implies P[one-box | ω] = ω; the final equalities follow from Lemma 2 and the definition of σ².

Lemma 5. Emma's posterior belief that Box A contains the large reward R, given a two-box decision by Dan, is given by P[A = R | two-box] = p − σ²/(1 − p). Perhaps not surprisingly given Lemma 4, this posterior is smaller than the prior, especially for large values of σ² and when Omega is expected to essentially predict one-box.

Proof. Recall that, by Lemmas 2 and 3, we have P[A = R] = p = P[one-box]. Now, Bayes rule yields

P[A = R | two-box] = P[two-box | A = R] P[A = R] / P[two-box]   (9)
= (1 − P[one-box | A = R]) p / (1 − p)   (10)
= (1 − P[A = R | one-box] P[one-box] / P[A = R]) · p/(1 − p)   (11)
= (1 − p − σ²/p) · p/(1 − p) = p − σ²/(1 − p),   (12)

where, in the third line, we used Bayes rule again, and, in the fourth line, Lemma 4 together with P[one-box] = P[A = R] = p.

Theorem 2 then follows from our four previous lemmas.

Proof of Theorem 2.
From Lemmas 4 and 5, we have

E[A | one-box] = R · P[A = R | one-box] = (p + σ²/p) R, and   (13)
E[A + B | two-box] = R (p − σ²/(1 − p)) + r.   (14)

Now, Emma counterfactually prefers one-box to two-box if and only if the former conditional expectation is at least as large as the latter, which is equivalent to σ²/p ≥ r/R − σ²/(1 − p). Rearranging the terms yields the theorem.
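As a sanity check on Lemmas 4 and 5 (and thus on the proof above), here is a small Monte Carlo sketch of the generalized Newcomb setup; the Beta prior chosen for ω and the sample size are illustrative assumptions, not part of the proof.

```python
import random

def simulate(prior_a=2.0, prior_b=5.0, trials=500_000):
    """Draw omega from an (assumed) Beta prior, fill Box A with probability omega,
    and let Dan one-box with probability omega (as implied by Lemma 1)."""
    counts = {"one": [0, 0], "two": [0, 0]}  # [decisions, decisions with A = R]
    omegas = []
    for _ in range(trials):
        omega = random.betavariate(prior_a, prior_b)
        omegas.append(omega)
        box_a_full = random.random() < omega   # Omega's biased coin
        one_boxes = random.random() < omega    # Dan's decision, from Emma's viewpoint
        key = "one" if one_boxes else "two"
        counts[key][0] += 1
        counts[key][1] += box_a_full

    p = sum(omegas) / trials                              # estimate of E[omega] = p
    var = sum(w * w for w in omegas) / trials - p * p     # estimate of sigma^2
    print("empirical P[A=R | one-box]:", counts["one"][1] / counts["one"][0])
    print("Lemma 4 prediction        :", p + var / p)
    print("empirical P[A=R | two-box]:", counts["two"][1] / counts["two"][0])
    print("Lemma 5 prediction        :", p - var / (1 - p))

simulate()
```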
In this section, we discuss the corollaries of Theorem 2. These corollaries yield greater insight into the key variables of Newcomb's paradox.

Corollary 1. For any rewards R > r, if Emma has imperfect knowledge of Dan and if she expects Omega to have the same prior and data as her, then she will counterfactually prefer two-box to one-box.

Proof. This is the special case where σ² = 0. Indeed, if Omega has exactly Emma's data, then Emma knows ω = p for sure, so σ² = 0 and r/R > 0 = σ²/(p(1 − p)).

In particular, if Emma is almost sure that Omega has the same prior and data as her, and if she is almost sure that Dan will two-box, then she is almost sure that Dan's decision is counterfactually optimal.
In this section, we show that σ² is an increasing function of the size of the data D_Ω that Emma suspects Omega to have. In other words, the more Emma thinks that Omega has more data than her, the more she will lean towards counterfactually preferring one-box to two-box.

Lemma 6.
Denote ω(D) ≜ P[one-box | D] the prediction made based on data D about Dan. If we know for sure that the data D⁺_Ω contain the data D_Ω, even though both are unknown, then

V[ω(D⁺_Ω)] = V[ω(D_Ω)] + E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω]].   (15)

Proof. Note that

V[ω(D⁺_Ω)] = E[ω(D⁺_Ω)²] − p²
= E_{D_Ω}[E[ω(D⁺_Ω)² | D_Ω]] − p²   (16)
= E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω] + E[ω(D⁺_Ω) | D_Ω]²] − p²   (17)
= E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω]] + E_{D_Ω}[ω(D_Ω)²] − p²   (18)
= E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω]] + V[ω(D_Ω)],   (19)

where, in the third line, we used the definition of the conditional variance, and, in the fourth line, the fact that, if D⁺_Ω contains D_Ω, then ω(D_Ω) = E[ω(D⁺_Ω) | D_Ω].

As a corollary, we have V[ω(D⁺_Ω)] ≥ V[ω(D_Ω)]. This means that, for a Newcomb paradox with a predictor Omega⁺ that has more data than Omega, the variance σ² of the predictor's prediction is larger. In particular, as Omega collects more and more data, Emma will tend more and more towards one-box.
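To illustrate Lemma 6 with a deliberately extreme, purely illustrative example: suppose D_Ω is vacuous while D⁺_Ω reveals Dan's decision itself. Then ω(D_Ω) = p with V[ω(D_Ω)] = 0, whereas ω(D⁺_Ω) equals 1 with probability p and 0 otherwise, so that V[ω(D⁺_Ω)] = p(1 − p). Any intermediate amount of data yields a variance between these two extremes; this is also the key observation behind the proof of Corollary 2 below.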
The case r ≥ R

Unsurprisingly, if the reward r of Box B is at least as large as the potential reward R of Box A, then two-box is counterfactually preferable.

Corollary 2. If r ≥ R, if Emma has imperfect knowledge of Dan, and if she expects Omega to have the same prior and more data, then Emma counterfactually prefers two-box to one-box.

Proof of Corollary 2. Lemma 6, applied with a data set D⁺_Ω consisting of D_Ω together with Dan's actual decision, implies that p(1 − p) ≥ σ², since ω(D⁺_Ω) then equals 0 or 1 and has variance p(1 − p). If r ≥ R, then we have r/R ≥ 1 ≥ σ²/(p(1 − p)).

One particularly interesting case is that of a quasi-omniscient Omega, which knows a lot more about Dan than Emma does. We formalize quasi-omniscience as follows.
Definition 3. Emma believes Omega to be quasi-omniscient about Dan if Emma rules out an uncertain prediction by Omega. Formally, Omega is δ-omniscient if P[ω ∈ (δ, 1 − δ)] = 0.

Note that δ-omniscience is a property that depends on Emma's prior on the data D_Ω that Omega has access to. In fact, it is arguably not a fundamental property of Omega itself.

One thing that makes this case interesting is that, for a fixed prior p, especially in the limit δ → 0, the variance σ² gets maximized. In particular, throughout this section, we assume 0 < δ < min{p, 1 − p}, which guarantees that the conditional expectations we will consider are well-defined.

Lemma 7. If Emma believes that Omega knows more and is δ-omniscient, then σ² ≥ p(1 − p) − (1 + 2p)δ − o(δ).

Proof. First, recall that, by Lemma 2, E[ω] = p. Now, denote q ≜ P[ω ≥ 1 − δ]. By the law of total probability,

p = E[ω | ω ≤ δ] (1 − q) + E[ω | ω ≥ 1 − δ] q ≤ δ + q,   (20)

and thus q ≥ p − δ. Therefore, σ² = E[ω²] − p². Now, note that

E[ω²] = E[ω² | ω ≤ δ] P[ω ≤ δ] + E[ω² | ω ≥ 1 − δ] P[ω ≥ 1 − δ]   (21)
≥ (1 − δ)² q ≥ (1 − δ)² (p − δ) ≥ p − (1 + 2p)δ − o(δ).   (22)

Therefore, we have σ² ≥ p(1 − p) − (1 + 2p)δ − o(δ).

We can then state the following insightful corollary of Theorem 2.

Corollary 3. If Emma has imperfect information about Dan, if she believes Omega knows more than her, and if she believes Omega to be δ-omniscient, then, for any sufficiently small δ, Emma counterfactually prefers one-box to two-box.

Proof. By Lemma 7, σ²/(p(1 − p)) ≥ 1 − (1 + 2p)δ/(p(1 − p)) − o(δ). Since r/R < 1, this lower bound exceeds r/R for δ small enough. We conclude by applying Theorem 2.

In other words, for any given values of R > r, and for any given prior uncertainty from Emma about Dan's decision, there is a sufficiently omniscient Omega such that Emma will counterfactually prefer one-box. Or, put differently, if Alice is pretty sure that she will one-box, and if she is pretty sure that Omega is pretty-pretty-pretty sure of what Alice will decide, then Alice will be pretty sure to make a counterfactually optimal decision.

This conclusion may clarify some of the disagreements about Newcomb's paradox. If one believes that Alice fully knows what she decides, then it seems meaningless to consider that Omega could know a lot better than Alice what she will decide. Similarly, by considering that Omega is fully omniscient, one may be tempted to exclude some scenarios, such as Omega being wrong. But such scenarios are arguably not quite Bayesian, nor realistic. By more carefully considering imperfect knowledge, purely Bayesian counterfactuals in Newcomb-like problems arguably seem less mysterious.
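For a concrete, purely illustrative order of magnitude, take again p = 1/2 and r/R = 10⁻³, and suppose Emma believes Omega to be δ-omniscient with δ = 0.01. Lemma 7 then gives σ² ≥ 0.25 − 2 × 0.01 ≈ 0.23 (neglecting the o(δ) term), so σ²/(p(1 − p)) ≥ 0.92, which far exceeds r/R = 10⁻³; by Theorem 2, Emma then counterfactually prefers one-box.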
While our theorems closed a few problems, they arguably raised even more questions. In this section, we discuss take-aways and future research directions.
The results of Section 3 assumed that Emma believes that Omega has the same prior as her, as well as a superset of her data. In this section, we discuss the challenges of removing this assumption.

4.1.1 If Emma knows more

If Emma has a superset of Omega's data, then she would know Omega's prediction ω. It can be shown that, as a result, if Emma has imperfect knowledge of Dan, then so does Omega. But this also implies that Omega's actual decision to put R in Box A is not fully known to Omega. It may thus also be imperfectly known to Emma.

Unfortunately, this last bit of uncertainty may be where a Newcomb-like paradox kicks in. Typically, Omega's coin toss might be further biased by another entity Omega⁺ that does have more data than Emma.

Note though that if Emma believes that the coin toss is independent from Dan's decision given Emma's data, then Emma can be proved to counterfactually prefer two-box to one-box. Similarly, if Omega's uncertainty on the content of Box A disappears, for instance if Omega now decides to put R in Box A if and only if its credence in one-box satisfies P[one-box | D_Ω] ≥ 1/2, then Emma will also counterfactually prefer two-box to one-box.

The most general and realistic case is evidently when Emma and Omega have access to very different data. In such a case, Lemma 1 does not apply. Unfortunately, as a result, Emma's counterfactual preferences are then much harder to analyze.

Nevertheless, it seems that the global take-away of our analysis still applies. Essentially, the more data Emma has compared to Omega, the more probably two-box seems counterfactually preferable to her. Conversely, the more data Omega has, the more it seems that Emma will counterfactually prefer one-box to two-box. However, we leave open the problem of proving this mathematically.
The other important assumption we have made throughout our analysis is that Emma believes that Omega has the same Bayesian prior as her. One way to weaken this assumption is to suppose instead that Emma has a prior on Omega's possible priors. By making this prior on priors explicit, it is then only a computation for Emma to determine what decision is counterfactually preferable.

Again, it seems intuitive that, as Omega gathers more and more data, Emma will tend to counterfactually prefer one-box to two-box. We conjecture that, even in this case, if Emma assigns a strictly positive probability to Omega's actual prior, and if she believes Omega to be sufficiently quasi-omniscient, then she will counterfactually prefer one-box to two-box. This has yet to be proved though.

Recall that the probabilities we consider here are Bayesian. They describe an entity's ignorance. Therefore, they are not assumed to describe some fundamental randomness within the laws of physics.

4.2 Practical lessons

Our generalization of Newcomb's paradox arguably made it more realistic. However, there still seems to be a gap between our analysis and practical Newcomb-like problems.
At its heart, Newcomb's paradox is about the possibility for an external observer to better predict what we will decide than we can predict ourselves. Crucially, we showed that such an external observer does not need to be some supernatural omniscient entity. Any observer with vastly more data than us could potentially predict our future decisions better than we can, and use this to trick us.

Arguably, billions of dollars are currently being invested to create such algorithmic observers. By leveraging the huge amounts of data provided by their users, social media algorithms are constantly trying to predict how likely we are to click on the contents that they decide to recommend to us. Meanwhile, the way we use these social media, often without paying our utmost attention to our decision systems, means that we may make decisions without even realizing it. In such a case, it seems reasonable to argue that algorithms actually already know many of our future decisions better than we do ourselves.

It may be interesting for future work to investigate how Newcomb's paradox can inform us on what we ought to do in such contexts; or at least, on what decisions would be least counterfactually wrong.
We assumed that Emma and Omega are pure Bayesians. This hypothesis turned out to be extremely useful to gain insights into Newcomb's paradox. However, it is noteworthy that, in general, Bayesianism requires unreasonable computing resources, and thus cannot be applied exactly in practice, as explained by Solomonoff [2009]. This leaves us with the question of the robustness of our results if we now consider non-Bayesian entities.

Note that the assumption that Omega is (believed to be) Bayesian is not critical. The critical feature of our analysis is rather that Omega is expected to have more data than Emma, which allows Omega to better predict what Dan will decide. In fact, it suffices that Omega's prediction ω remains strongly correlated with Dan's decision, even given Emma's data.

The case of Emma is slightly more problematic, as Emma needs to apply the laws of probability to compute counterfactuals. However, again, by assuming that she can reasonably well approximate the computation of the counterfactuals, it is sensible to consider that she is engaging in approximate Bayesian counterfactual reasoning. The general take-aways of our analysis would then apply.

4.2.3 Logical non-omniscience
Even if epistemic systems know exactly their decision algorithms and the inputs of these algorithms, they may be incapable of deriving the decision unless they themselves perform the computations of the decision algorithms. This postulate was called computational irreducibility by Wolfram [2002]. A consequence of this postulate is that computational limits actually add another source of uncertainty, which cannot be taken into account by the Bayesian framework alone, as explained by Hoang [2020].

More generally, as argued by Aaronson [2012], computational complexity theory may be critical to understand diverse philosophical paradoxes. Its implications for Newcomb's paradox have yet to be analyzed.
The core idea of our analysis was a clear separation between an entity's epistemic system and their decision system. In particular, we exploited the fact that the epistemic system should have some epistemic uncertainty about their decision system, which we argued to be critical to understand Newcomb's paradox. However, the decomposition of any entity into different systems with different tasks can be taken further, as argued by Hoang [2019].
Interestingly, our analysis highlights the importance of data in Newcomb's paradox. Indeed, we saw that Emma will consider that one-box is counterfactually preferable if and only if she expects Omega to have access to sufficiently more data D_Ω, so that Omega's uncertainty about Dan's decision is greatly reduced. Note, however, that we did not discuss how Omega would access such data.

More generally, data is arguably critical for any entity, both for the epistemic and the decision systems. Therefore, it seems critical for the entity, or for any analysis of the entity, to carefully design or understand their data collection system. Arguably, in the case of the COVID-19 crisis, in at least some parts of the world, this system was defective, typically because of a lack of tests or because of data misreporting. In practice, improving data collection systems is a key aspect of improving an entity.

In particular, especially for large-scale critical applications, designing a quality data collection system seems to require features such as data authentication, data storage and data communication. Results in the fields of cryptography, differential privacy or distributed computing may add insights into Newcomb-like problems.

In this paper, we assumed that the rewards were given by Omega. In fact, Omega was essentially Emma's reward system. In practice, most entities' rewards are much more complex to compute. Humans may care about happiness, glory or flourishing, or a certain combination of all three. Organizations may want to advance their cause, publish papers, protect their nations, save lives or make their business sustainable. Finally, many algorithms have been designed to maximize profits, clicks or user attention.

In most cases, rewards are arguably what determine these entities' decisions the most, and thus their impacts on the world. In particular, a watch-time maximizing algorithm may flood the Internet with clickbait, virulent and unreliable videos, which may then encourage climate denialism (Allgaier [2019]), promote dangerous health recommendations (Johnson et al. [2020]) or normalize radicalization (Ribeiro et al. [2020]), while a profit-maximizing company may disregard its externalities, and a drug addict may neglect the long-term effects of their drug consumption. Note that, in these examples, the rewards to be maximized are not harmful in themselves; but they are oblivious to major negative side effects. Carefully understanding and designing entities' reward systems seems critical.

One important feature of reward systems is that they too need to rely on a data collection system and an epistemic system. To illustrate, the rewards to be given to a public health organization should arguably depend on the actual health of the population. However, if this population does not get tested, or if the mental health of the population is not properly inferred from the data, then the public health organization's rewards may be misleading. This defect of the computation of the rewards may then create flawed motivations for the entity, which can lead to poor decisions by the decision system, and to poor assessments by the epistemic system.

More generally, as argued by Hoang [2019], the careful understanding and design of entities' reward systems is arguably the most critical feature to understand these entities, and to make them more robustly beneficial for the future of our planet. Perhaps Newcomb-like paradoxes can shed more light on what can and cannot be achieved from the interaction between the reward system and the other systems of an entity.

One last component discussed in Hoang [2019] is the maintenance system. Given that any entity almost surely has imperfect data collection, reward, epistemic and decision systems, it seems critical that the entity is aware of this, and actively aims to combat the defects of these components.

One example of a defect within humans, and arguably within organizations too, is the confirmation bias. A decision made by our decision systems can sometimes harm the capabilities of our epistemic systems, by irrationally favoring justifications of our decisions, even when they are poor (see the survey by Nickerson [1998]).

Another example of defect is known as the law of Goodhart [1975], which says that "when a measure becomes a target, it ceases to be a good measure". Recently, El-Mhamdi and Hoang [2020] even showed that a fat-tail discrepancy between a deployed imperfect reward system and its ideal version could make reward maximization infinitely bad, according to the ideal reward system.

It thus seems critical for any robustly beneficial entity evolving in complex environments with feedback loops to have a maintenance system that keeps track of the imperfections of their epistemic, decision, data collection and reward systems, and that actively tries to fix and improve these systems if needed.
This paper proposed to more clearly distinguish an entity's epistemic system from its decision system. I argued that this distinction allows a better understanding of Newcomb-like problems. In particular, based on this insight, I proved the impossibility of guaranteed counterfactual optimization. Moreover, I showed that Newcomb's problem can be reformulated in terms of the capability of some observer to better predict the decision of an entity than the epistemic system of the entity can. Finally, I discussed numerous further research challenges to better understand information processing entities, and to better design entities that make robustly beneficial decisions.
Acknowledgement
The author is thankful for insightful discussions with El Mahdi El Mhamdi, among others.
References
Scott Aaronson. Why philosophers should care about computational complexity. In Computability: Gödel, Turing, Church, and Beyond. Citeseer, 2012.
Scott Aaronson. Quantum Computing since Democritus. Cambridge University Press, 2013.
Joachim Allgaier. Science and environmental communication via online video: strategically distorted communications on climate change and climate engineering on YouTube. Frontiers in Communication, 4:36, 2019.
Maya Bar-Hillel and Avishai Margalit. Newcomb's paradox revisited. The British Journal for the Philosophy of Science, 23(4):295–304, 1972.
Steven J Brams. Newcomb's problem and prisoners' dilemma. Journal of Conflict Resolution, 19(4):596–612, 1975.
John Broome. An economic Newcomb problem. Analysis, 49(4):220–222, 1989.
Richard T Cox. Probability, frequency and reasonable expectation. American Journal of Physics, 14(1):1–13, 1946.
Richard T Cox. The algebra of probable inference. American Journal of Physics, 31(1):66–67, 1963.
Abram Demski and Scott Garrabrant. Embedded agency. arXiv preprint arXiv:1902.09469, 2019.
El-Mahdi El-Mhamdi and Lê Nguyên Hoang. On Goodhart's law, with an application to value alignment. arXiv preprint, 2020.
Jon Elster. The Multiple Self. Cambridge University Press, 1987.
Tom Everitt. Towards Safe Artificial General Intelligence. PhD thesis, 2018.
Roman Frydman, Gerald P O'Driscoll, and Andrew Schotter. Rational expectations of government policy: an application of Newcomb's problem. Southern Economic Journal, pages 311–319, 1982.
Martin Gardner. Reflections on Newcomb's problem: a prediction and free-will dilemma. Scientific American, 230(3):102, 1974.
Giuseppe Giacopelli. Studying topology of time lines graph leads to an alternative approach to the Newcomb's paradox. arXiv preprint arXiv:1910.09311, 2019.
C Goodhart. Problems of monetary management: the UK experience. Papers in Monetary Economics, 1, 1975.
Lê Nguyên Hoang. Towards robust end-to-end alignment. In Huáscar Espinoza, Seán Ó hÉigeartaigh, Xiaowei Huang, José Hernández-Orallo, and Mauricio Castillo-Effen, editors, Workshop on Artificial Intelligence Safety 2019, co-located with the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, January 27, 2019, volume 2301 of CEUR Workshop Proceedings. CEUR-WS.org, 2019. URL http://ceur-ws.org/Vol-2301/paper_1.pdf.
Lê Nguyên Hoang. The Equation of Knowledge: From Bayes' Rule to a Unified Philosophy of Science. CRC Press, 2020.
Paul Horwich. Decision theory in light of Newcomb's problem. Philosophy of Science, 52(3):431–450, 1985.
Marcus Hutter. Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decisions. In European Conference on Machine Learning, pages 226–238. Springer, 2001.
Marcus Hutter. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer Science & Business Media, 2004.
Edwin T Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, 2003.
Neil F Johnson, Nicolas Velásquez, Nicholas Johnson Restrepo, Rhys Leahy, Nicholas Gabriel, Sara El Oud, Minzhang Zheng, Pedro Manrique, Stefan Wuchty, and Yonatan Lupu. The online competition between pro- and anti-vaccination views. Nature, pages 1–4, 2020.
Thomas Joyce and J Michael Herrmann. A review of no free lunch theorems, and their implications for metaheuristic optimisation. In Nature-Inspired Algorithms and Applied Optimization, pages 27–51. Springer, 2018.
Pierre Simon Laplace. Essai philosophique sur les probabilités. Bachelier, 1840.
John Leuner. A replication study: Machine learning models are capable of predicting sexual orientation from facial images. arXiv preprint arXiv:1902.10739, 2019.
Benjamin Libet. Unconscious cerebral initiative and the role of conscious will in voluntary action. In Neurophysiology of Consciousness, pages 269–306. Springer, 1993.
Don Locke. How to make a Newcomb choice. Analysis, 38(1):17–23, 1978.
Raymond S Nickerson. Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2):175–220, 1998.
Robert Nozick. Newcomb's problem and two principles of choice. In Essays in Honor of Carl G. Hempel, pages 114–146. Springer, 1969.
Robert Nozick. The Nature of Rationality. Princeton University Press, 1994.
Robert Nozick. Socratic Puzzles. Harvard University Press, 1997.
Judea Pearl and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
Manoel Horta Ribeiro, Raphael Ottoni, Robert West, Virgílio AF Almeida, and Wagner Meira Jr. Auditing radicalization pathways on YouTube. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 131–141, 2020.
Christian P Robert. Admissibility and complete classes. In The Bayesian Choice, pages 391–426. Springer, 2007.
George Schlesinger. The unpredictability of free choices. The British Journal for the Philosophy of Science, 25(3):209–221, 1974.
Brian Skyrms. Dynamic coherence and probability kinematics. Philosophy of Science, 54(1):1–20, 1987.
Ray J Solomonoff. Algorithmic probability: Theory and applications. In Information Theory and Statistical Learning, pages 1–23. Springer, 2009.
Robert Sugden. Rational choice: a survey of contributions from economics and philosophy. The Economic Journal, 101(407):751–785, 1991.
Paul Teller. Conditionalization and observation. Synthese, 26(2):218–258, 1973.
Abraham Wald. An essentially complete class of admissible decision functions. The Annals of Mathematical Statistics, pages 549–555, 1947.
Yilun Wang and Michal Kosinski. Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology, 114(2):246, 2018.
Thomas A Weber. A robust resolution of Newcomb's paradox. Theory and Decision, 81(3):339–356, 2016.
Priyantha Wijayatunga. Resolution to four probability paradoxes: Two-envelope, wallet-game, sleeping beauty and Newcomb's. In The 34th International Workshop on Statistical Modelling 2019, Guimarães, Portugal, volume 2, pages 252–257, 2019.
Stephen Wolfram. A New Kind of Science, volume 5. Wolfram Media, Champaign, IL, 2002.
David H Wolpert. The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7):1341–1390, 1996.