Purely Bayesian counterfactuals versus Newcomb's paradox
Lê Nguyên Hoang, EPFL
Abstract
This paper proposes a careful separation between an entity's epistemic system and their decision system. Crucially, Bayesian counterfactuals are estimated by the epistemic system, not by the decision system. Based on this remark, I prove the existence of Newcomb-like problems for which an epistemic system necessarily expects the entity to make a counterfactually bad decision.

I then address (a slight generalization of) Newcomb's paradox. I solve the specific case where the player believes that the predictor applies Bayes rule with a superset of all the data available to the player. I prove that the counterfactual optimality of the one-box strategy depends on the player's prior on the predictor's additional data. If these additional data are not expected to sufficiently reduce the predictor's uncertainty about the player's decision, then the player's epistemic system will counterfactually prefer to two-box. But if the predictor's data is believed to make them quasi-omniscient, then one-boxing will be counterfactually preferred. Implications of the analysis are then discussed.

More generally, I argue that, to better understand or design an entity, it is useful to clearly separate the entity's epistemic and decision systems, but also its data collection, reward and maintenance systems, whether the entity is human, algorithmic or institutional.

Newcomb's paradox is an iconic paradox of decision theory. Introduced by Nozick [1969], the problem involves a player, call her Alice, and a predictor that we will name Omega. Omega predicts Alice's behavior and, essentially, determines her reward based on this prediction. This then puts Alice into a seemingly impossible dilemma.

On one hand, a seemingly causal argument suggests that Alice should ignore her predictability, especially once Omega's prediction has already been made. This argument suggests that Alice should then adopt a strategy called two-box. On the other hand, by modifying her behavior, and assuming that Omega's prediction is really reliable, Alice seems able to bias Omega's prediction so that it becomes aligned with Alice's interest. This suggests that Alice may be able to hack her predictability to gain more rewards, a strategy known as one-box. More thorough details are provided in Section 3.1.

Remarkably, scholars are extremely divided on what strategy Alice should adopt. In fact, Newcomb's paradox seems to be unveiling a fundamental gap in our understanding of decision theory, and thus of related topics such as free will, game theory and algorithm design. Because of this, over the last half-century, Newcomb's paradox has received a lot of attention from philosophers (Schlesinger [1974], Locke [1978], Horwich [1985], Nozick [1994, 1997], to name a few), mathematicians (Gardner [1974], Wijayatunga [2019], Giacopelli [2019]), economists (Broome [1989], Sugden [1991], Weber [2016]), political scientists (Brams [1975], Frydman et al. [1982]), computer scientists (Aaronson [2013], Everitt [2018]) and psychologists (Bar-Hillel and Margalit [1972]).

In addition to connections to fundamental philosophical problems, Newcomb's paradox has been linked to practical problems, such as the voting dilemma (see Elster [1987]). Assuming that Alice's vote has a negligible effect, but is time-costly for her, the equivalent of two-boxing would intuitively be to argue that Alice's optimal selfish strategy is to not take the time to vote. However, it has been argued that, by going to vote, Alice changes what she would predict about the behavior of individuals similar to her, thereby significantly increasing the probability that Alice's favorite candidate will get elected, which is in Alice's selfish interest. An argument analogous to one-boxing thus suggests that Alice should vote.

In this paper, we present a novel analysis of Newcomb's paradox. The core idea of the analysis is a clear separation between an individual's epistemic system and their decision system. The epistemic system will be assumed to be purely Bayesian, which means that all of its thoughts result from data and the laws of probability. In particular, the epistemic system can thereby engage in counterfactual reasoning, to determine which decisions yield the largest counterfactual expected rewards.

As any good Bayesian, at any point in time, the epistemic system must also assign a probability to any future event. This evidently includes the individual's future decision. Now, in principle, such a probability could take the values 0 or 1. However, we stress the fact that an epistemic system with such perfect knowledge of the individual's decision cannot engage in counterfactual reasoning. Under perfect knowledge, and under the laws of probability, Bayesian counterfactual reasoning becomes nonsensical.

As a result, most of this paper discusses the arguably more realistic case of imperfect knowledge of the decision system. This means that the epistemic system fails to know with full certainty the algorithm executed by the decision system, and the decision system's inputs. In such a case, in Section 2, we prove Theorem 1, which says that no decision system can be expected to guarantee counterfactual optimization. We then discuss consequences of the theorem for decision theory.

In Section 3, we then tackle a generalization of Newcomb's paradox, under the assumption that a Bayesian Omega has collected a superset of Alice's data. We show that the counterfactual optimality of a decision strongly depends on the extent to which Alice believes the predictor Omega to know more than her about her decision system. Indeed, if Alice does not believe that Omega knows more than her, then two-box is the only counterfactually optimal decision. But if Alice expects Omega to know a lot more, to the point of being quasi-omniscient, then one-box becomes the only counterfactually optimal decision. These findings are formalized by Theorem 2, and by its diverse corollaries (see Section 3.3).

Section 4 will then discuss our results and raise further questions. First, we analyze the impact of relaxing the assumptions of the previous section. Second, we discuss implications of our analysis for practical Newcomb-like problems. Third, we generalize the main idea of the paper, namely the separation of epistemic and decision systems, to a further separation of the different systems of any information processing entity. Finally, Section 5 concludes.
In this section, we show that no entity can guarantee counterfactual optimization. To understand this claim, we first insist on the distinction between an entity's epistemic system and their decision system. We then stress the fact that counterfactual optimization is not an algorithm, but a property, which results from the interaction between the epistemic system, the decision system and the problem at hand. We then state and prove the impossibility of counterfactual optimization, and discuss consequences.
In this paper, we study how an information processing entity, like a human, a machine or an organization, infers a world view and makes a decision. Confusingly, these two different tasks seem entangled. On one hand, it seems important to first infer a world view before making a decision. But as highlighted by Newcomb-like paradoxes, it seems that a decision can sometimes inform a world view.

To disentangle views and decisions, we propose to clarify the distinction between these two tasks. In fact, we will assume that each task is handled by a specific system of the entity. More specifically, in this paper, we focus on an entity that we call Alice, which possesses both an epistemic and a decision system. We call them respectively Emma and Dan. Emma will infer the state of the world based on her data. Dan will compute decisions based on his data.

Except in Section 4.2.2, we assume Emma to be a pure Bayesian. In other words, Emma will apply solely the laws of probability to compute credences and expectations. In particular, Emma's credence in a theory T of the state of the world, given the data D_E she collected, will be given by Bayes rule:

P[T | D_E] = P[D_E | T] P[T] / P[D_E].   (1)

Note that we adopt here the purely Bayesian interpretation of probabilities. Namely, as discussed by Laplace [1840], such probabilities describe an entity's knowledge and uncertainty, and apply even in a deterministic universe.

In this paper, we will discuss only Emma's and Omega's credences, mostly under the assumption that they have common priors, and that Emma's data D_E are common knowledge. As a result, without loss of generality, we omit the conditioning by D_E in our notations.
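To make the update rule (1) concrete, here is a minimal numerical sketch in Python; the two candidate theories and all the numbers are purely illustrative assumptions, not taken from the paper.

```python
def posterior(prior, likelihood):
    """Emma's Bayes-rule update: P[T | D_E] = P[D_E | T] P[T] / P[D_E],
    where P[D_E] is obtained by summing over the candidate theories."""
    evidence = sum(prior[t] * likelihood[t] for t in prior)
    return {t: prior[t] * likelihood[t] / evidence for t in prior}

# Illustrative example: two rival theories and one observed piece of data D_E.
prior = {"T": 0.2, "not-T": 0.8}        # Emma's prior credences
likelihood = {"T": 0.9, "not-T": 0.3}   # P[D_E | theory]
print(posterior(prior, likelihood))     # {'T': 0.428..., 'not-T': 0.571...}
```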
In our paper, we analyze Dan's decision through the lens of purely Bayesian counterfactuals.

Definition 1. Emma counterfactually prefers a decision X if, for any alternative decision Y, Emma estimates the counterfactual expected rewards to be larger assuming X than assuming Y, i.e.

E[R | X] ≥ E[R | Y].   (2)

Recall that the expectations are implicitly also conditioned on Emma's data D_E. This contrasts with some versions of causal decision theory that invoke Pearl and Mackenzie [2018]'s "do-operator". This operator is added on top of Bayesianism, and also depends on some causal modeling of the problem. As a result, we do not consider it to be purely Bayesian.

Counterfactual optimality is often invoked to argue in favor of one decision over another. In the case of the COVID-19 pandemic, for instance, some observers have argued that lockdown was not actually as bad for the economy as one might expect. Their argument was not that the economy did not suffer during the lockdown; rather, they argued that the counterfactual world without a lockdown decision would have seen a similar blow to the economy. By then also taking public health into consideration, such observers argued that lockdown was very probably counterfactually optimal. In other words, these observers have essentially argued that

E[Welfare | Lockdown] ≥ E[Welfare | No-Lockdown],   (3)

where Welfare refers to some sort of public good, which involves both public health and financial safety. Note that these counterfactual expectations should also be conditioned on all the data available at the moment of the lockdown decision, including the dangerous spread of the COVID-19 disease. Interestingly, some opponents agreed with the general approach to determine whether Lockdown was a good decision, but disagreed with the estimation of the counterfactual expectations.

Similarly, in Newcomb's problem, one-boxers will typically argue that E[R | one-box] = R, where R is the content of the opaque box if the predictor Omega predicts one-box, while the counterfactual expected rewards of two-boxing are E[R | two-box] = r < R, since Omega would then leave the opaque box empty. Two-boxers will instead argue that E[R | two-box] = r + E[R | one-box]. Clearly, both cannot be right simultaneously. Section 3 will aim to clarify what is going on.
One feature of Bayesianism is that, if P[D] = 0, then P[T | D] is ill-defined. In a sense, a Bayesian cannot consider events that they have completely discarded. But this raises a serious technical issue if we assume that Emma knows for sure Dan's decision X. Indeed, Emma would then be completely discarding the possibility that Dan makes an alternative decision Y. In other words, Emma may believe P[Y] = 0. But then, the counterfactual expectation E[R | Y] would be ill-defined, which makes counterfactual optimization nonsensical. To resolve this issue, in this paper, we will only consider Bayesians with imperfect knowledge.

Definition 2. A Bayesian has imperfect knowledge about X if, for any possible value x of the event, the Bayesian assigns a strictly positive probability to X = x.

In the case of Alice, it seems actually reasonable to assume that, especially in practice, Emma cannot guarantee that Dan will execute a given decision algorithm. After all, even if Emma carefully inspected Dan before Dan makes a decision, and even if Emma knows exactly the data given to Dan, Dan may still slightly change before the computation is executed. In other words, Emma may precisely know what Dan was like seconds ago, but she cannot know for sure what Dan will be doing in a few seconds, when Dan actually runs his computation to deliver a decision. It thus seems reasonable to assume that Emma has an imperfect knowledge of Dan's decision.
Finally, we can state the main result of this section. It asserts that no entity can guarantee counterfactual optimization.
Theorem 1.
There exists a decision problem with n options for which any entity with imperfect knowledge assigns a probability at least 1 − 1/n to counterfactually bad decisions.

Proof. Consider n opaque boxes. Dan must choose one of the boxes. Denote Box-i the event that Dan chooses the i-th box. Now assume that Emma is convinced that a predictor Omega has the same data as her and knows her prior. As a result, Emma believes with probability 1 that Omega can compute Emma's credence P[Box-i] in Dan deciding Box-i. Suppose also that Emma believes with probability 1 that Omega decides the content of box i as follows. Omega computes the smallest value i* of i such that P[Box-i] ≤ 1/n. (Such an i* exists, since the n credences P[Box-i] sum to at most 1.) Omega then sets R_{i*} = 1, and R_j = 0 for j ≠ i*.

Then, from Emma's perspective, given that she knows the rewards R_i, any decision Box-j different from Box-i* is counterfactually bad. However, we also know that Emma assigns a probability at most 1/n to Box-i*. Thus, according to Emma, there is a probability at least 1 − 1/n that Dan's decision is counterfactually bad.

Note that our proof assumes that Emma knows for sure that Omega knows Emma's data and prior. Interestingly, it is however robust to relaxing this condition, and to assuming that Emma only strongly believes that Omega knows her data and prior. This then yields a probability of at least 1 − 1/n − o(1) of a counterfactually bad decision, where o(1) denotes a term that can be made arbitrarily small by considering that Emma's credences are arbitrarily close to 1.
It is noteworthy that Theorem 1 does not apply to the framework of AIXI, a counterfactually optimal decision algorithm introduced by Hutter [2001, 2004]. In this framework, an entity called AIXI interacts with its environment by making decisions based on its observed past data, and the environment responds to AIXI's decision by providing new data and a reward. The environment is assumed to reply according to a computable probability distribution that depends on past data and decisions, and on the latest decision.

Essentially, AIXI escapes our impossibility theorem because the AIXI framework prevents external entities from making AIXI's reward depend on AIXI's uncertainty about its decision. More precisely, AIXI's expected rewards given AIXI's decision X are assumed to be independent from AIXI's uncertainty about deciding X. Formally, this assumption enforces the equality E[R | X ∧ P[X] = p] = E[R | X ∧ P[X] = q] for any values of p and q. This is formalized by an environment µ whose outputs, given the past and AIXI's decision, cannot depend on the policy π of AIXI; this allows AIXI's counterfactual rewards to be a linear function of π, where π is regarded as a probability distribution. Conversely, our proof of Theorem 1 designed a problem where the rewards given a decision X highly depend on π. As a result, AIXI's credence P[X] in the fact that it will decide X cannot be exploited by the environment to bias AIXI's rewards. This contrasts with our proof of Theorem 1 in which, if P[X] is large, then the reward given X would be designed to be small.

Unfortunately, the restricted framework of AIXI has been argued to make AIXI unrealistic. In particular, the fact that the environment cannot exploit its understanding of AIXI seems incompatible with embedded agency, as discussed by Demski and Garrabrant [2019]. Embedded agency assumes that any entity is itself part of the environment it interacts with. It raises numerous other challenges, such as the uncomputability of Bayes rule proved by Solomonoff [2009], its computational hardness (see Aaronson [2012]), as well as the risk of wireheading the reward system (see Everitt [2018]); on the positive side, it allows improvement by a maintenance system, as discussed in Section 4.3.3. Given embedded agency, it then seems that any entity can be analyzed by some other entity Omega of the environment. Omega can then exploit its analysis to make counterfactual optimization impossible, as proved by Theorem 1.
Perhaps the most important take-away of Theorem 1 is that counterfactual optimization cannot be a decision algorithm. Rather, it should be regarded as a property that no decision algorithm always satisfies. At best, a decision algorithm should be designed to often satisfy counterfactual optimization. However, the more general question of what ought to be expected from a "good" decision algorithm seems still far from being resolved.

Theorem 1 may share similarities with the no-free-lunch theorems in learning theory (Wolpert [1996], Joyce and Herrmann [2018]). These theorems suggest that there is no canonical property that identifies the "good" learning algorithms, which are arguably what a "good" epistemic system should implement. Instead, the defense of Bayesianism, like in Hoang [2020], rests upon a myriad of desirable properties, some of which can be proved to be unique to Bayesianism, such as robustness to Dutch book arguments (Teller [1973], Skyrms [1987]), compatibility with logic (Cox [1946, 1963], Jaynes [2003]) and statistical admissibility (Wald [1947], Robert [2007]).

In fact, just as Bayesianism actually includes a large family of epistemologies, each derived from a given prior, the right path forward in decision theory might consist of proving that some desirable property can only be satisfied by decision algorithms taken from a certain family. Unfortunately, this research direction is out of the scope of the present paper.
In this section, we analyze Newcomb's paradox under the lens of Bayesian counterfactuals. But first, we need to consider a slight generalization of Newcomb's paradox, which more adequately fits the Bayesian framework.
In this paper, we consider a slight variation on Newcomb's problem to make it more realistic. In particular, we will care about the data that enables the predictor to make its prediction. Let us describe the problem we consider. Alice enters a room with two boxes A and B.

• Box A is opaque. Alice cannot see what is inside. But she knows how the content of Box A was decided, which is discussed below.
• Box B is transparent. Alice sees that Box B contains a reward r > 0.

Alice is then told that she must decide between two strategies.

• The one-box strategy consists of only taking Box A.
• The two-box strategy consists of taking both Box A and Box B.

What makes Newcomb's paradox interesting is the way the content of Box A is decided. In some classical versions of Newcomb's paradox, some omniscient entity Omega makes a prediction. If Omega predicts that Alice will one-box, then Omega puts a large reward R in Box A. Otherwise, if Omega predicts that Alice will two-box, then Omega leaves Box A empty.

However, the omniscience assumption is reasonably criticized for being too unrealistic. In this paper, we will instead assume that Omega is an information processing system which exploits huge amounts of data about Alice and about the world. Based on this large database D_Ω, Omega infers a probability ω ≜ P[one-box | D_Ω] that Alice will one-box. Now, given this probability guess ω, Omega will throw a biased coin, which has a probability ω to land on heads. If the coin lands on heads, Omega will put the reward R in Box A. Otherwise, Omega leaves Box A empty.

Note that the classical version of Newcomb's paradox is retrieved by assuming that Omega knows Alice's decision process and all the inputs to this process. Indeed, Omega can then simply simulate this decision process to determine Alice's decision. Our variant is thus a natural generalization of the classical paradox.

Now, Alice knows how Omega operates. She knows that Omega applies Bayes rule and she knows that Omega has processed a huge amount of data that Alice cannot access. Alice wants to maximize her expected (counterfactual) rewards. Alice now has to choose between the one-box and the two-box strategy. What should she do? (To fix ideas, one can imagine r = $1,000 and R = $1,000,000; the analysis, however, holds for any values of r and R.)
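To illustrate the procedure just described, here is a minimal Python sketch of one round of the generalized problem, using the illustrative figures above; the Beta distribution standing in for Omega's data-driven guess ω is purely an assumption made for the example.

```python
import random

def omega_fills_box_a(omega_guess, large_reward):
    """Omega's randomized rule: put the large reward R in Box A
    with probability omega_guess = P[one-box | D_Omega]."""
    return large_reward if random.random() < omega_guess else 0

omega_guess = random.betavariate(2, 2)  # stand-in for P[one-box | D_Omega]
box_a = omega_fills_box_a(omega_guess, large_reward=1_000_000)  # R
box_b = 1_000                                                   # r, always visible
print(omega_guess, box_a, box_b)
```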
The odd feature of Newcomb's paradox is the assumption that some external observer Omega can better know Alice than she knows herself. At first sight at least, it may seem that no one can better guess what Alice will decide than Alice herself, right before she makes her decision.

However, Libet [1993]'s famous experiment suggests that this may not be the case. An algorithm processing some magnetic resonance imaging of Alice's brain may then be better able to predict some of Alice's decisions, at least a fraction of a second before the decision is made. More generally, it seems that, in practice, a natural way for Omega to predict Alice's decision is to collect large amounts of data, not only about Alice, but also about entities similar to Alice. Typically, a philosopher who posed Newcomb's paradox to generations of students might have gained a remarkable capability to predict their current students' intuitions for Newcomb's paradox. Similarly, by analyzing all sorts of data available on social media, an algorithm could detect patterns that would allow it to reliably predict what a given social media user may decide. In fact, Wang and Kosinski [2018] designed one such algorithm, later replicated by Leuner [2019], that achieved superhuman performances at predicting sexual orientations from human faces.
Let us focus on the case where Emma believes that Omega has the same prior as her, and has a superset of her data. We denote p ≜ P[one-box] Emma's prior on one-box, and σ² ≜ V[ω] her prior variance on Omega's prediction. Assuming that Emma has imperfect knowledge of Dan then corresponds to 0 < p < 1. The following theorem characterizes Emma's counterfactual preferences based solely on the variables p and σ².

Theorem 2. Assume that Emma has imperfect knowledge of Dan. Suppose also that Emma believes that Omega is Bayesian, has the same prior as her and knows a superset of her data. Then, Emma counterfactually prefers one-box to two-box if and only if

r/R ≤ σ²/(p(1 − p)).

Unsurprisingly, the larger R is relative to r, the more one-box will tend to be preferable. But perhaps what is more interesting is the right-hand side. Note that the quantity p(1 − p) is essentially the variance, according to Emma's prior, of what Dan will do. Thus, the right-hand side compares the variance of Omega's prediction to the variance of Dan's decision. Or, put differently, it is a measure of how much Emma expects Omega to know more than her about Dan. Intuitively, the more Emma knows about Dan, the more she will counterfactually prefer two-box to one-box. But the more she feels that Omega knows more than her, the more she will counterfactually prefer one-box to two-box.

Thus, at its core, Newcomb's paradox really seems to be about how much an entity believes that an observer can better know them than they know themselves. This may seem extremely confusing at first glance. But separating our epistemic system from our decision system arguably helps to clarify the origin of the paradox. Indeed, Newcomb's paradox is actually rather about how much the observer can better predict the output of the entity's decision system than what the entity's epistemic system can predict. In particular, if this epistemic system is unreliable, then it may not be that hard to outperform it.
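To get an order of magnitude, take the illustrative figures r = $1,000 and R = $1,000,000, and assume, purely for illustration, that Emma's prior is p = 1/2. Then r/R = 10⁻³ and p(1 − p) = 1/4, so Theorem 2 makes one-box counterfactually preferable as soon as σ² ≥ 10⁻³ × 1/4 = 2.5 × 10⁻⁴, i.e., as soon as Emma expects Omega's prediction ω to have a standard deviation of roughly 0.016 or more. With such a large ratio R/r, even a modest informational advantage for Omega suffices to tip Emma's counterfactual preference.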
Before proving Theorem 2, let us first note the following lemma, which will be critical in the analysis of Newcomb's problem. The lemma states the validity of the argument of authority, when the authority is a more knowledgeable honest Bayesian.

Lemma 1 (Argument of authority). If Emma believes that Omega is an honest Bayesian, that Omega has the same prior as her, and that Omega has a superset of her data, then Emma should believe whatever Omega says. More formally, we have

P[T | P[T | D_Ω] = p] = p.   (4)

Proof. To clarify, denote D_p the set of data D_Ω such that P[T | D_Ω] = p; D_p excludes all data which do not satisfy this equality. By the law of total probability, P[T | D_p] is then necessarily an average of terms P[T | D_Ω], for D_Ω ∈ D_p. But all such terms equal p, hence the lemma.

The proof of Theorem 2 yields some insights into the mechanisms at play. In fact, it rests on the following four interesting observations on what Emma predicts based on her uncertainty about Omega's prediction. All the lemmas implicitly make the same assumptions as Theorem 2.
Lemma 2. Emma's prior p on Dan deciding to one-box is equal to the expectation of her prior on Omega's prediction, i.e. p = E[ω].

Proof. According to the law of total probability, P[one-box] = E_ω[P[one-box | ω]]. By Lemma 1, if Emma learned Omega's prediction ω = P[one-box | D_Ω], then she too would assign a probability ω to Dan deciding to one-box. Therefore, P[one-box | ω] = ω. Thus, p = P[one-box] = E[ω].

Lemma 3. Emma assigns a prior probability p to Box A containing the large reward R, i.e. P[A = R] = p.

Proof. According to the law of total probability, P[A = R] = E_ω[P[A = R | ω]]. By the decision algorithm of Omega (see Section 3.1), we have P[A = R | ω] = ω. Thus P[A = R] = E[ω] = p.

Lemma 4. Emma's posterior belief that Box A contains the large reward R, given a one-box decision by Dan, is given by P[A = R | one-box] = p + σ²/p. Interestingly, this quantity is larger than the prior probability, especially if the variance σ² of Omega's prediction is large and if Omega is expected to essentially predict two-box.

Proof. By the law of total probability, we have

P[A = R | one-box] = Σ_ω P[A = R | one-box ∧ ω] P[ω | one-box]   (5)
= Σ_ω (P[A = R ∧ one-box | ω] / P[one-box | ω]) · (P[one-box | ω] P[ω] / P[one-box])   (6)
= Σ_ω P[A = R | ω] P[one-box | ω] P[ω] / p   (7)
= (1/p) Σ_ω ω² P[ω] = E[ω²]/p = (p² + σ²)/p = p + σ²/p.   (8)

In the second line, we used Bayes rule and the definition of conditional probability. In the third line, we used the conditional independence of the events A = R and one-box given ω. In the fourth line, we exploited the fact that the content of Box A is decided by a coin of bias ω, as well as Lemma 1, which implies P[one-box | ω] = ω; the final equalities follow from Lemma 2 and the definition of σ².

Lemma 5. Emma's posterior belief that Box A contains the large reward R, given a two-box decision by Dan, is given by P[A = R | two-box] = p − σ²/(1 − p). Perhaps not surprisingly given Lemma 4, this posterior is smaller than the prior, especially for large values of σ² and when Omega is expected to essentially predict one-box.

Proof. Recall that, by Lemmas 2 and 3, we have P[A = R] = p = P[one-box]. Now, Bayes rule yields

P[A = R | two-box] = P[two-box | A = R] P[A = R] / P[two-box]   (9)
= (1 − P[one-box | A = R]) p / (1 − p)   (10)
= (1 − P[A = R | one-box] P[one-box] / P[A = R]) · p/(1 − p)   (11)
= (1 − p − σ²/p) · p/(1 − p) = p − σ²/(1 − p),   (12)

where, in the third line, we used Bayes rule again, and, in the fourth line, Lemma 4 together with P[one-box] = P[A = R] = p.

Theorem 2 then follows from our four previous lemmas.

Proof of Theorem 2.
From Lemmas 4 and 5, we have

E[A | one-box] = R · P[A = R | one-box] = (p + σ²/p) R, and   (13)
E[A + B | two-box] = R (p − σ²/(1 − p)) + r.   (14)

Now, Emma counterfactually prefers one-box to two-box if and only if the former conditional expectation is at least as large as the latter, which is equivalent to σ²/p ≥ r/R − σ²/(1 − p). Rearranging the terms yields the theorem.
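As a sanity check on Lemmas 4 and 5 (and thus on the proof above), here is a small Monte Carlo sketch of the generalized Newcomb setup; the Beta prior chosen for ω and the sample size are illustrative assumptions, not part of the proof.

```python
import random

def simulate(prior_a=2.0, prior_b=5.0, trials=500_000):
    """Draw omega from an (assumed) Beta prior, fill Box A with probability omega,
    and let Dan one-box with probability omega (as implied by Lemma 1)."""
    counts = {"one": [0, 0], "two": [0, 0]}  # [decisions, decisions with A = R]
    omegas = []
    for _ in range(trials):
        omega = random.betavariate(prior_a, prior_b)
        omegas.append(omega)
        box_a_full = random.random() < omega   # Omega's biased coin
        one_boxes = random.random() < omega    # Dan's decision, from Emma's viewpoint
        key = "one" if one_boxes else "two"
        counts[key][0] += 1
        counts[key][1] += box_a_full

    p = sum(omegas) / trials                              # estimate of E[omega] = p
    var = sum(w * w for w in omegas) / trials - p * p     # estimate of sigma^2
    print("empirical P[A=R | one-box]:", counts["one"][1] / counts["one"][0])
    print("Lemma 4 prediction        :", p + var / p)
    print("empirical P[A=R | two-box]:", counts["two"][1] / counts["two"][0])
    print("Lemma 5 prediction        :", p - var / (1 - p))

simulate()
```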
In this section, we discuss the corollaries of Theorem 2. These corollaries yield greater insight into the key variables of Newcomb's paradox.

Corollary 1. For any rewards R > r, if Emma has imperfect knowledge of Dan and if she expects Omega to have the same prior and data as her, then she will counterfactually prefer two-box to one-box.

Proof. This is the special case where σ² = 0. Indeed, if Omega has exactly Emma's data, then Emma knows ω = p for sure, so σ² = 0 and r/R > 0 = σ²/(p(1 − p)).

In particular, if Emma is almost sure that Omega has the same prior and data as her, and if she is almost sure that Dan will two-box, then she is almost sure that Dan's decision is counterfactually optimal.
In this section, we show that σ² is an increasing function of the size of the data D_Ω that Emma suspects Omega to have. In other words, the more Emma thinks that Omega has more data than her, the more she will lean towards counterfactually preferring one-box to two-box.

Lemma 6.
Denote ω(D) ≜ P[one-box | D] the prediction made based on data D about Dan. If we know for sure that the data D⁺_Ω contain the data D_Ω, even though both are unknown, then

V[ω(D⁺_Ω)] = V[ω(D_Ω)] + E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω]].   (15)

Proof. Note that

V[ω(D⁺_Ω)] = E[ω(D⁺_Ω)²] − p²
= E_{D_Ω}[E[ω(D⁺_Ω)² | D_Ω]] − p²   (16)
= E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω] + E[ω(D⁺_Ω) | D_Ω]²] − p²   (17)
= E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω]] + E_{D_Ω}[ω(D_Ω)²] − p²   (18)
= E_{D_Ω}[V[ω(D⁺_Ω) | D_Ω]] + V[ω(D_Ω)],   (19)

where, in the third line, we used the definition of the conditional variance, and, in the fourth line, the fact that, if D⁺_Ω contains D_Ω, then ω(D_Ω) = E[ω(D⁺_Ω) | D_Ω].

As a corollary, we have V[ω(D⁺_Ω)] ≥ V[ω(D_Ω)]. This means that, for a Newcomb paradox with a predictor Omega⁺ that has more data than Omega, the variance σ² of the predictor's prediction is larger. In particular, as Omega collects more and more data, Emma will tend more and more towards one-box.
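To illustrate Lemma 6 with a deliberately extreme, purely illustrative example: suppose D_Ω is vacuous while D⁺_Ω reveals Dan's decision itself. Then ω(D_Ω) = p with V[ω(D_Ω)] = 0, whereas ω(D⁺_Ω) equals 1 with probability p and 0 otherwise, so that V[ω(D⁺_Ω)] = p(1 − p). Any intermediate amount of data yields a variance between these two extremes; this is also the key observation behind the proof of Corollary 2 below.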
The case r ≥ R

Unsurprisingly, if the reward r of Box B is at least as large as the potential reward R of Box A, then two-box is counterfactually preferable.

Corollary 2. If r ≥ R, if Emma has imperfect knowledge of Dan, and if she expects Omega to have the same prior and more data, then Emma counterfactually prefers two-box to one-box.

Proof of Corollary 2. Lemma 6, applied with a data set D⁺_Ω consisting of D_Ω together with Dan's actual decision, implies that p(1 − p) ≥ σ², since ω(D⁺_Ω) then equals 0 or 1 and has variance p(1 − p). If r ≥ R, then we have r/R ≥ 1 ≥ σ²/(p(1 − p)).

One particularly interesting case is that of a quasi-omniscient Omega, which knows a lot more about Dan than Emma does. We formalize quasi-omniscience as follows.
Definition 3. Emma believes Omega to be quasi-omniscient about Dan if Emma rules out an uncertain prediction by Omega. Formally, Omega is δ-omniscient if P[ω ∈ (δ, 1 − δ)] = 0.

Note that δ-omniscience is a property that depends on Emma's prior on the data D_Ω that Omega has access to. In fact, it is arguably not a fundamental property of Omega itself.

One thing that makes this case interesting is that, for a fixed prior p, especially in the limit δ → 0, the variance σ² gets maximized. In particular, throughout this section, we assume 0 < δ < min{p, 1 − p}, which guarantees that the conditional expectations we will consider are well-defined.

Lemma 7. If Emma believes that Omega knows more and is δ-omniscient, then σ² ≥ p(1 − p) − (1 + 2p)δ − o(δ).

Proof. First, recall that, by Lemma 2, E[ω] = p. Now, denote q ≜ P[ω ≥ 1 − δ]. By the law of total probability,

p = E[ω | ω ≤ δ] (1 − q) + E[ω | ω ≥ 1 − δ] q ≤ δ + q,   (20)

and thus q ≥ p − δ. Therefore, σ² = E[ω²] − p². Now, note that

E[ω²] = E[ω² | ω ≤ δ] P[ω ≤ δ] + E[ω² | ω ≥ 1 − δ] P[ω ≥ 1 − δ]   (21)
≥ (1 − δ)² q ≥ (1 − δ)² (p − δ) ≥ p − (1 + 2p)δ − o(δ).   (22)

Therefore, we have σ² ≥ p(1 − p) − (1 + 2p)δ − o(δ).

We can then state the following insightful corollary of Theorem 2.

Corollary 3. If Emma has imperfect information about Dan, if she believes Omega knows more than her, and if she believes Omega to be δ-omniscient, then, for any sufficiently small δ, Emma counterfactually prefers one-box to two-box.

Proof. By Lemma 7, σ²/(p(1 − p)) ≥ 1 − (1 + 2p)δ/(p(1 − p)) − o(δ). Since r/R < 1, this lower bound exceeds r/R for δ small enough. We conclude by applying Theorem 2.

In other words, for any given values of R > r, and for any given prior uncertainty from Emma about Dan's decision, there is a sufficiently omniscient Omega such that Emma will counterfactually prefer one-box. Or, put differently, if Alice is pretty sure that she will one-box, and if she is pretty sure that Omega is pretty-pretty-pretty sure of what Alice will decide, then Alice will be pretty sure to make a counterfactually optimal decision.

This conclusion may clarify some of the disagreements about Newcomb's paradox. If one believes that Alice fully knows what she decides, then it seems meaningless to consider that Omega could know a lot better than Alice what she will decide. Similarly, by considering that Omega is fully omniscient, one may be tempted to exclude some scenarios, such as Omega being wrong. But such scenarios are arguably not quite Bayesian, nor realistic. By more carefully considering imperfect knowledge, purely Bayesian counterfactuals in Newcomb-like problems arguably seem less mysterious.
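For a concrete, purely illustrative order of magnitude, take again p = 1/2 and r/R = 10⁻³, and suppose Emma believes Omega to be δ-omniscient with δ = 0.01. Lemma 7 then gives σ² ≥ 0.25 − 2 × 0.01 ≈ 0.23 (neglecting the o(δ) term), so σ²/(p(1 − p)) ≥ 0.92, which far exceeds r/R = 10⁻³; by Theorem 2, Emma then counterfactually prefers one-box.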
While our theorems closed a few problems, they arguably raised even more questions. In this section, we discuss take-aways and future research directions.
The results of Section 3 assumed that Emma believes that Omega has the same prior as her, as well as a superset of her data. In this section, we discuss the challenges of removing this assumption.

4.1.1 If Emma knows more

If Emma has a superset of Omega's data, then she would know Omega's prediction ω. It can be shown that, as a result, if Emma has imperfect knowledge of Dan, then so does Omega. But this also implies that Omega's actual decision to put R in Box A is not fully known to Omega. It may thus also be imperfectly known to Emma.

Unfortunately, this last bit of uncertainty may be where a Newcomb-like paradox kicks in. Typically, Omega's coin toss might be further biased by another entity Omega⁺ that does have more data than Emma.

Note though that if Emma believes that the coin toss is independent from Dan's decision given Emma's data, then Emma can be proved to counterfactually prefer two-box to one-box. Similarly, if Omega's uncertainty on the content of Box A disappears, for instance if Omega now decides to put R in Box A if and only if its credence in one-box satisfies P[one-box | D_Ω] ≥ 1/2, then Emma will also counterfactually prefer two-box to one-box.

The most general and realistic case is evidently when Emma and Omega have access to very different data. In such a case, Lemma 1 does not apply. Unfortunately, as a result, Emma's counterfactual preferences are then much harder to analyze.

Nevertheless, it seems that the global take-away of our analysis still applies. Essentially, the more data Emma has compared to Omega, the more probably two-box seems counterfactually preferable to her. Conversely, the more data Omega has, the more it seems that Emma will counterfactually prefer one-box to two-box. However, we leave open the problem of proving this mathematically.
The other important assumption we have made throughout our analysis is that Emma believes that Omega has the same Bayesian prior as her. One way to weaken this assumption is to suppose instead that Emma has a prior on Omega's possible priors. By making this prior on priors explicit, it is then only a computation for Emma to determine what decision is counterfactually preferable.

Again, it seems intuitive that, as Omega gathers more and more data, Emma will tend to counterfactually prefer one-box to two-box. We conjecture that, even in this case, if Emma assigns a strictly positive probability to Omega's actual prior, and if she believes Omega to be sufficiently quasi-omniscient, then she will counterfactually prefer one-box to two-box. This has yet to be proved though.

Recall that the probabilities we consider here are Bayesian. They describe an entity's ignorance. Therefore, they are not assumed to describe some fundamental randomness within the laws of physics.

4.2 Practical lessons

Our generalization of Newcomb's paradox arguably made it more realistic. However, there still seems to be a gap between our analysis and practical Newcomb-like problems.
At its heart, Newcomb's paradox is about the possibility for an external observer to better predict what we will decide than we can predict ourselves. Crucially, we showed that such an external observer does not need to be some supernatural omniscient entity. Any observer with vastly more data than us could potentially predict our future decisions better than we can, and use this to trick us.

Arguably, billions of dollars are currently being invested to create such algorithmic observers. By leveraging the huge amounts of data provided by their users, social media algorithms are constantly trying to predict how likely we are to click on the contents that they decide to recommend to us. Meanwhile, the way we use these social media, often without paying our utmost attention to our decision systems, means that we may make decisions without even realizing it. In such a case, it seems reasonable to argue that algorithms actually already know many of our future decisions better than we do ourselves.

It may be interesting for future work to investigate how Newcomb's paradox can inform us on what we ought to do in such contexts; or at least, on what decisions would be least counterfactually wrong.
We assumed that Emma and Omega are pure Bayesians. This hypothesis turned out to be extremely useful to gain insights into Newcomb's paradox. However, it is noteworthy that, in general, Bayesianism requires unreasonable computing resources, and thus cannot be applied exactly in practice, as explained by Solomonoff [2009]. This leaves us with the question of the robustness of our results if we now consider non-Bayesian entities.

Note that the assumption that Omega is (believed to be) Bayesian is not critical. The critical feature of our analysis is rather that Omega is expected to have more data than Emma, which allows Omega to better predict what Dan will decide. In fact, it suffices that Omega's prediction ω remains strongly correlated with Dan's decision, even given Emma's data.

The case of Emma is slightly more problematic, as Emma needs to apply the laws of probability to compute counterfactuals. However, again, by assuming that she can reasonably well approximate the computation of the counterfactuals, it is sensible to consider that she is engaging in approximate Bayesian counterfactual reasoning. The general take-aways of our analysis would then apply.

4.2.3 Logical non-omniscience
Even if epistemic systems know exactly their decision algorithms and the inputs of these algorithms, they may be incapable of deriving the decision unless they themselves perform the computations of the decision algorithms. This postulate was called computational irreducibility by Wolfram [2002]. A consequence of this postulate is that computational limits actually add another source of uncertainty, which cannot be taken into account by the Bayesian framework alone, as explained by Hoang [2020].

More generally, as argued by Aaronson [2012], computational complexity theory may be critical to understand diverse philosophical paradoxes. Its implications for Newcomb's paradox have yet to be analyzed.
The core idea of our analysis was a clear separation between an entity's epistemic system and their decision system. In particular, we exploited the fact that the epistemic system should have some epistemic uncertainty about their decision system, which we argued to be critical to understand Newcomb's paradox. However, the decomposition of any entity into different systems with different tasks can be taken further, as argued by Hoang [2019].
Interestingly, our analysis highlights the importance of data in Newcomb's paradox. Indeed, we saw that Emma will consider that one-box is counterfactually preferable if and only if she expects Omega to have access to sufficiently more data D_Ω, so that Omega's uncertainty about Dan's decision is greatly reduced. Note, however, that we did not discuss how Omega would access such data.

More generally, data is arguably critical for any entity, both for the epistemic and the decision systems. Therefore, it seems critical for the entity, or for any analysis of the entity, to carefully design or understand their data collection system. Arguably, in the case of the COVID-19 crisis, in at least some parts of the world, this system was defective, typically because of a lack of tests or because of data misreporting. In practice, improving data collection systems is a key aspect of improving an entity.

In particular, especially for large-scale critical applications, designing a quality data collection system seems to require features such as data authentication, data storage and data communication. Results in the fields of cryptography, differential privacy or distributed computing may add insights into Newcomb-like problems.

In this paper, we assumed that the rewards were given by Omega. In fact, Omega was essentially Emma's reward system. In practice, most entities' rewards are much more complex to compute. Humans may care about happiness, glory or flourishing, or a certain combination of all three. Organizations may want to advance their cause, publish papers, protect their nations, save lives or make their business sustainable. Finally, many algorithms have been designed to maximize profits, clicks or user attention.

In most cases, rewards are arguably what determine these entities' decisions the most, and thus their impacts on the world. In particular, a watch-time maximizing algorithm may flood the Internet with clickbait, virulent and unreliable videos, which may then encourage climate denialism (Allgaier [2019]), promote dangerous health recommendations (Johnson et al. [2020]) or normalize radicalization (Ribeiro et al. [2020]), while a profit-maximizing company may disregard its externalities, and a drug addict may neglect the long-term effects of their drug consumption. Note that, in these examples, the rewards to be maximized are not harmful in themselves; but they are oblivious to major negative side effects. Carefully understanding and designing entities' reward systems seems critical.

One important feature of reward systems is that they too need to rely on a data collection system and an epistemic system. To illustrate, the rewards to be given to a public health organization should arguably depend on the actual health of the population. However, if this population does not get tested, or if the mental health of the population is not properly inferred from the data, then the public health organization's rewards may be misleading. This defect of the computation of the rewards may then create flawed motivations for the entity, which can lead to poor decisions by the decision system, and to poor assessments by the epistemic system.

More generally, as argued by Hoang [2019], the careful understanding and design of entities' reward systems is arguably the most critical feature to understand these entities, and to make them more robustly beneficial for the future of our planet. Perhaps Newcomb-like paradoxes can shed more light on what can and cannot be achieved from the interaction between the reward system and the other systems of an entity.

One last component discussed in Hoang [2019] is the maintenance system. Given that any entity almost surely has imperfect data collection, reward, epistemic and decision systems, it seems critical that the entity is aware of this, and actively aims to combat the defects of these components.

One example of a defect within humans, and arguably within organizations too, is the confirmation bias. A decision made by our decision systems can sometimes harm the capabilities of our epistemic systems, by irrationally favoring justifications of our decisions, even when they are poor (see the survey by Nickerson [1998]).

Another example of defect is known as the law of Goodhart [1975], which says that "when a measure becomes a target, it ceases to be a good measure". Recently, El-Mhamdi and Hoang [2020] even showed that a fat-tail discrepancy between a deployed imperfect reward system and its ideal version could make reward maximization infinitely bad, according to the ideal reward system.

It thus seems critical for any robustly beneficial entity evolving in complex environments with feedback loops to have a maintenance system that keeps track of the imperfections of their epistemic, decision, data collection and reward systems, and that actively tries to fix and improve these systems if needed.
This paper proposed to more clearly distinguish an entity's epistemic system from its decision system. I argued that this distinction allows a better understanding of Newcomb-like problems. In particular, based on this insight, I proved the impossibility of guaranteed counterfactual optimization. Moreover, I showed that Newcomb's problem can be reformulated in terms of the capability of some observer to better predict the decision of an entity than the epistemic system of the entity can. Finally, I discussed numerous further research challenges to better understand information processing entities, and to better design entities that make robustly beneficial decisions.
Acknowledgement
The author is thankful for insightful discussions with El Mahdi El Mhamdi, among others.
References
Scott Aaronson. Why philosophers should care about computational complexity. In Computability: Gödel, Turing, Church, and Beyond. Citeseer, 2012.
Scott Aaronson. Quantum Computing since Democritus. Cambridge University Press, 2013.
Joachim Allgaier. Science and environmental communication via online video: strategically distorted communications on climate change and climate engineering on YouTube. Frontiers in Communication, 4:36, 2019.
Maya Bar-Hillel and Avishai Margalit. Newcomb's paradox revisited. The British Journal for the Philosophy of Science, 23(4):295–304, 1972.
Steven J Brams. Newcomb's problem and prisoners' dilemma. Journal of Conflict Resolution, 19(4):596–612, 1975.
John Broome. An economic Newcomb problem. Analysis, 49(4):220–222, 1989.
Richard T Cox. Probability, frequency and reasonable expectation. American Journal of Physics, 14(1):1–13, 1946.
Richard T Cox. The algebra of probable inference. American Journal of Physics, 31(1):66–67, 1963.
Abram Demski and Scott Garrabrant. Embedded agency. arXiv preprint arXiv:1902.09469, 2019.
El-Mahdi El-Mhamdi and Lê Nguyên Hoang. On Goodhart's law, with an application to value alignment. arXiv preprint, 2020.
Jon Elster. The Multiple Self. Cambridge University Press, 1987.
Tom Everitt. Towards Safe Artificial General Intelligence. PhD thesis, 2018.
Roman Frydman, Gerald P O'Driscoll, and Andrew Schotter. Rational expectations of government policy: an application of Newcomb's problem. Southern Economic Journal, pages 311–319, 1982.
Martin Gardner. Reflections on Newcomb's problem: a prediction and free-will dilemma. Scientific American, 230(3):102, 1974.
Giuseppe Giacopelli. Studying topology of time lines graph leads to an alternative approach to the Newcomb's paradox. arXiv preprint arXiv:1910.09311, 2019.
C Goodhart. Problems of monetary management: the UK experience. Papers in Monetary Economics, 1, 1975.
Lê Nguyên Hoang. Towards robust end-to-end alignment. In Huáscar Espinoza, Seán Ó hÉigeartaigh, Xiaowei Huang, José Hernández-Orallo, and Mauricio Castillo-Effen, editors, Workshop on Artificial Intelligence Safety 2019, co-located with the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, January 27, 2019, volume 2301 of CEUR Workshop Proceedings. CEUR-WS.org, 2019. URL http://ceur-ws.org/Vol-2301/paper_1.pdf.
Lê Nguyên Hoang. The Equation of Knowledge: From Bayes' Rule to a Unified Philosophy of Science. CRC Press, 2020.
Paul Horwich. Decision theory in light of Newcomb's problem. Philosophy of Science, 52(3):431–450, 1985.
Marcus Hutter. Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decisions. In European Conference on Machine Learning, pages 226–238. Springer, 2001.
Marcus Hutter. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer Science & Business Media, 2004.
Edwin T Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, 2003.
Neil F Johnson, Nicolas Velásquez, Nicholas Johnson Restrepo, Rhys Leahy, Nicholas Gabriel, Sara El Oud, Minzhang Zheng, Pedro Manrique, Stefan Wuchty, and Yonatan Lupu. The online competition between pro- and anti-vaccination views. Nature, pages 1–4, 2020.
Thomas Joyce and J Michael Herrmann. A review of no free lunch theorems, and their implications for metaheuristic optimisation. In Nature-Inspired Algorithms and Applied Optimization, pages 27–51. Springer, 2018.
Pierre Simon Laplace. Essai philosophique sur les probabilités. Bachelier, 1840.
John Leuner. A replication study: Machine learning models are capable of predicting sexual orientation from facial images. arXiv preprint arXiv:1902.10739, 2019.
Benjamin Libet. Unconscious cerebral initiative and the role of conscious will in voluntary action. In Neurophysiology of Consciousness, pages 269–306. Springer, 1993.
Don Locke. How to make a Newcomb choice. Analysis, 38(1):17–23, 1978.
Raymond S Nickerson. Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2):175–220, 1998.
Robert Nozick. Newcomb's problem and two principles of choice. In Essays in Honor of Carl G. Hempel, pages 114–146. Springer, 1969.
Robert Nozick. The Nature of Rationality. Princeton University Press, 1994.
Robert Nozick. Socratic Puzzles. Harvard University Press, 1997.
Judea Pearl and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
Manoel Horta Ribeiro, Raphael Ottoni, Robert West, Virgílio AF Almeida, and Wagner Meira Jr. Auditing radicalization pathways on YouTube. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 131–141, 2020.
Christian P Robert. Admissibility and complete classes. In The Bayesian Choice, pages 391–426. Springer, 2007.
George Schlesinger. The unpredictability of free choices. The British Journal for the Philosophy of Science, 25(3):209–221, 1974.
Brian Skyrms. Dynamic coherence and probability kinematics. Philosophy of Science, 54(1):1–20, 1987.
Ray J Solomonoff. Algorithmic probability: Theory and applications. In Information Theory and Statistical Learning, pages 1–23. Springer, 2009.
Robert Sugden. Rational choice: a survey of contributions from economics and philosophy. The Economic Journal, 101(407):751–785, 1991.
Paul Teller. Conditionalization and observation. Synthese, 26(2):218–258, 1973.
Abraham Wald. An essentially complete class of admissible decision functions. The Annals of Mathematical Statistics, pages 549–555, 1947.
Yilun Wang and Michal Kosinski. Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology, 114(2):246, 2018.
Thomas A Weber. A robust resolution of Newcomb's paradox. Theory and Decision, 81(3):339–356, 2016.
Priyantha Wijayatunga. Resolution to four probability paradoxes: Two-envelope, wallet-game, sleeping beauty and Newcomb's. In The 34th International Workshop on Statistical Modelling 2019, Guimarães, Portugal, volume 2, pages 252–257, 2019.
Stephen Wolfram. A New Kind of Science, volume 5. Wolfram Media, Champaign, IL, 2002.
David H Wolpert. The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7):1341–1390, 1996.