A Canon of Probabilistic Rationality

Abstract

We prove that a random choice rule satisfies Luce's Choice Axiom if and only if its support is a choice correspondence that satisfies the Weak Axiom of Revealed Preference, thus it consists of alternatives that are optimal according to some preference, and random choice then occurs according to a tie breaking among such alternatives that satisfies Renyi's Conditioning Axiom. Our result shows that the Choice Axiom is, in a precise formal sense, a probabilistic version of the Weak Axiom. It thus supports Luce's view of his own axiom as a "canon of probabilistic rationality."

Simone Cerreia-Vioglio, Fabio Maccheroni, Massimo Marinacci

Università Bocconi and IGIER

Aldo Rustichini

University of Minnesota

July 23, 2020

Abstract

We prove that a random choice rule satisfies Luce's Choice Axiom if and only if its support (the set of alternatives that can be chosen) is a choice correspondence that satisfies the Weak Axiom of Revealed Preference, and random choice occurs according to a stochastic tie-breaking among optimizers that satisfies Renyi's Conditioning Axiom.

Our result shows that the Choice Axiom is, in a precise formal sense, a probabilistic version of the Weak Axiom. It thus supports Luce's view of his own axiom as a "canon of probabilistic rationality."

Part of the material of this paper was first circulated in 2016 as IGIER WP 593.

Introduction

In 1977, twenty years after proposing it, Duncan Luce commented as follows about his celebrated Choice Axiom:

"Perhaps the greatest strength of the choice axiom, and one reason it continues to be used, is as a canon of probabilistic rationality. It is a natural probabilistic formulation of K. J. Arrow's famed principle of the independence of irrelevant alternatives, and as such it is a possible underpinning for rational, probabilistic theories of social behavior." (Luce, 1977, p. 229; emphasis added)

This claim already appears in his 1957 and 1959 works that popularized the axiom and the resulting stochastic choice model (see Luce, 1957, p. 6, and Luce, 1959, p. 9). The conceptual proximity of Arrow's principle, typically identified with the set-theoretic version of the Weak Axiom of Revealed Preference (WARP), and Luce's Choice Axiom is indeed often invoked. (Arrow himself put forth this version of Samuelson's WARP in his 1948 and 1959 works; see the discussion of Peters and Wakker, 1991, p. 1789, and Wakker, 2010, p. 373.) As is well known, the former plays a key role in deterministic choice theory, the latter in stochastic choice theory.

Yet, the formal relation between these two independence of irrelevant alternatives (IIA) assumptions has remained elusive so far. For instance, in analyzing several different IIA axioms, Ray (1973) writes:

"Obviously IIA (Luce) falls in a different category altogether [relative to IIA (Arrow)], being concerned with probabilistic choices."

This note provides the missing link by showing that a random choice rule satisfies Luce's Choice Axiom if and only if:

1. its support, the set of alternatives that can be chosen, is a rational choice correspondence à la Arrow (1948, 1959), so it consists of alternatives that are optimal according to some preference;

2. tie-breaking among the optimal alternatives is consistent in the sense of conditional probability à la Renyi (1955, 1956).

In this way, our analysis formally supports Luce's "canonical rationality" claim for his Choice Axiom via a lexicographic composition of two standard concepts of rationality: deterministic rationality (WARP) and stochastic consistency (Renyi's Conditioning Axiom).

Let 𝒜 be the collection of all non-empty finite subsets of a universal set X of possible alternatives. The elements of 𝒜 are called choice sets and denoted by A, B and C.

A map Γ : 𝒜 → 𝒜 such that Γ(A) ⊆ A for all choice sets A is called a choice correspondence. It is rational when

$$B \subseteq A \ \text{and}\ \Gamma(A) \cap B \neq \emptyset \implies \Gamma(B) = \Gamma(A) \cap B \tag{WARP}$$

This is the set-theoretic form of WARP considered by Arrow (1948, 1959). Its IIA nature is best seen when Γ is a function:

$$B \subseteq A \ \text{and}\ \Gamma(A) \in B \implies \Gamma(B) = \Gamma(A)$$

In words, adding suboptimal alternatives is irrelevant for choice behavior.

We denote by Δ(X) the set of all finitely supported probability measures on X and, for each A ⊆ X, by Δ(A) the subset of Δ(X) consisting of the measures assigning mass 1 to A.

Definition 1 A random choice rule is a function

$$p : \mathcal{A} \to \Delta(X), \qquad A \mapsto p_A$$

such that p_A ∈ Δ(A) for all A ∈ 𝒜.

Given any alternative a ∈ A, we interpret p_A({a}), also denoted by p(a, A), as the probability that an agent chooses a when the set of available alternatives is A. (Formally, x ↦ p(x, A) for all x ∈ X is the discrete density of p_A, but with an abuse of notation p_A(·) is identified with p(·, A); we also write p_A(a) instead of p_A({a}).) More generally, if B is a subset of A, we denote by p_A(B) or p(B, A) the probability that the selected element lies in B. This probability can be viewed as the frequency with which an element in B is chosen. In particular, the set of alternatives that can be chosen from A is the support of p_A, given by

$$\operatorname{supp} p_A = \{ a \in X : p(a, A) > 0 \}$$

The assumption p_A(A) = 1 guarantees that it is a non-empty subset of A, so that the support correspondence

$$\operatorname{supp} p : \mathcal{A} \to \mathcal{A}, \qquad A \mapsto \operatorname{supp} p_A$$

is a choice correspondence.

Finally, the standard way of comparing the probabilities of choices in two different sets B and C is the odds in favor of B over C, that is,

$$r_A(B, C) = \frac{p_A(B)}{p_A(C)} = \frac{B \text{ is chosen}}{C \text{ is chosen}}$$

for all B, C ⊆ A. As usual, given any b and c in X, we set p(b, c) = p(b, {b, c}) and

$$r(b, c) = \frac{p(b, c)}{p(c, b)}$$

The classical assumptions of Luce (1959) on p are:

Positivity p(a, b) > 0 for all a, b ∈ X.

Choice Axiom p(a, A) = p(a, B) p(B, A) for all B ⊆ A in 𝒜 and all a ∈ B.

The latter axiom says that the probability of choosing an alternative a from the choice set A is the probability of first selecting B from A, then choosing a from B (provided a belongs to B). As observed by Luce, formally this assumption corresponds to the fact that {p_A : A ∈ 𝒜} is a conditional probability system in the sense of Renyi (1955, 1956); see Lemma 2 of Luce (1959) and Lemma 5 in the appendix. Remarkably, Luce's Choice Axiom is also equivalent to:

Odds Independence

$$\frac{p(a, b)}{p(b, a)} = \frac{p(a, A)}{p(b, A)} \tag{OI}$$

for all A ∈ 𝒜 and all a, b ∈ A such that p(a, A)/p(b, A) is well defined, that is, different from 0/0.

See Lemma 3 of Luce (1959) when Positivity holds and Lemma 5 in the appendix for the general case. This axiom says that the odds for a against b are independent of the other available alternatives. For this reason, this axiom too often goes under the IIA name. To avoid confusion, we use a less popular label.

Theorem 1 (Luce) A random choice rule p : 𝒜 → Δ(X) satisfies Positivity and the Choice Axiom if and only if there exists α : X → ℝ such that

$$p(a, A) = \frac{e^{\alpha(a)}}{\sum_{b \in A} e^{\alpha(b)}} \tag{LM}$$

for all A ∈ 𝒜 and all a ∈ A.

This fundamental result in random choice theory also shows that, under the Choice Axiom, Positivity is equivalent to the stronger assumption that p_A has full support for all choice sets A.

Full Support supp p_A = A for all A ∈ 𝒜.

From a choice-theoretic perspective, this axiom is unduly restrictive and may permit the choice of "dominated" actions. This note shows what happens when removing from the Luce analysis this extra baggage.

Finally, when X is a separable metric space we may introduce a continuity axiom.

Continuity Given any x, y ∈ X, if {x_n}_{n∈ℕ} converges to x, then

$$p(x_n, y) > 0 \ \text{for all } n \in \mathbb{N} \implies p(x, y) > 0$$

$$p(y, x_n) > 0 \ \text{for all } n \in \mathbb{N} \implies p(y, x) > 0$$

In words, if x_n can always be chosen (rejected) in the binary comparison with y, and x_n converges to x, then x can be chosen (rejected) in the binary comparison with y. Continuity is automatically satisfied under Full Support as well as when X is countable and endowed with the discrete metric.

Main result

The next result generalizes Luce’s Theorem 1 by getting rid of the Full Supportassumption.

Theorem 2 The following conditions are equivalent for a random choice rule p : 𝒜 → Δ(X):

(i) p satisfies the Choice Axiom;

(ii) there exist a function α : X → ℝ and a rational choice correspondence Γ : 𝒜 → 𝒜 such that

$$p(a, A) = \begin{cases} \dfrac{e^{\alpha(a)}}{\sum_{b \in \Gamma(A)} e^{\alpha(b)}} & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases} \tag{CA}$$

for all A ∈ 𝒜 and all a ∈ A.

In this case, Γ is unique and given by Γ(A) = supp p_A for all A ∈ 𝒜.

Since Γ is a rational choice correspondence, the relation ≻ defined by

$$a \succ b \iff a \neq b \ \text{and}\ \Gamma(\{a, b\}) = \{a\} \iff b \notin \Gamma(\{a, b\})$$

is a strict preference (see Kreps, 1988) and the corresponding weak preference

$$b \succsim a \iff a \nsucc b \iff b \in \Gamma(\{a, b\}) \iff p(b, a) > 0$$

satisfies Γ(A) = {a ∈ A : a ≿ b for all b ∈ A}. When X is countable, ≿ is automatically represented by a utility function u and so we have

$$\Gamma(A) = \arg\max_A u$$

In general, some additional conditions are needed, as next we show.

Proposition 3 If X is a separable metric space, then the random choice rule p in Theorem 2 satisfies Continuity if and only if there exists a continuous u : X → ℝ such that Γ(A) = arg max_A u for all A ∈ 𝒜.

Random choice under the Choice Axiom thus proceeds in two stages: first, selection of the optimal alternatives in A via maximization of the preference ≿ (or utility u); then, Lucean tie-breaking to choose among the optimal alternatives.

While the optimization structure of the first stage is clear, more can be said about the tie-breaking structure of the second stage, in that Theorem 2 describes only its functional form. To this end, recall that a random choice rule p is based on a Random Preference Model if there is a (measurable) collection {≻_ω}, indexed by ω in a probability space (Ω, F, Pr), of strict preferences such that, for all a ∈ A ∈ 𝒜,

$$p(a, A) = \Pr\left( \omega \in \Omega : a \succ_\omega b \ \ \forall b \in A \setminus \{a\} \right)$$

In particular, a Random Preference Model is Lucean if p(·, A) has the Luce form (LM).

A piece of terminology: the lexicographic composition of two binary relations ≻ and ≻′ is the binary relation ≻ ∘ ≻′ defined by

$$a \succ \circ \succ' b \iff a \succ b, \ \text{or}\ a \sim b \ \text{and}\ a \succ' b$$

For instance, >_1 ∘ >_2 is the usual lexicographic preference on the Cartesian plane, where >_i is defined by (a_1, a_2) >_i (b_1, b_2) ⟺ a_i > b_i. We can now state the announced characterization.

Proposition 4 The following conditions are equivalent for a random choice rule p : 𝒜 → Δ(X):

(i) p satisfies the Choice Axiom;

(ii) supp p : 𝒜 → 𝒜 is a rational choice correspondence and

$$p_B(a) = \frac{p_A(a)}{p_A(B)} \tag{COND}$$

for all B ⊆ A ∈ 𝒜 and all a ∈ B ∩ supp p_A;

(iii) there exist a strict preference ≻ on X and a Lucean Random Preference Model {≻_ω} on (Ω, F, Pr) such that p is based on the Random Preference Model {≻ ∘ ≻_ω} on (Ω, F, Pr).

To interpret (ii), note that WARP requires that if an alternative a can be chosen from A (i.e., p_A(a) > 0) and belongs to B ⊆ A, then it can also be chosen from B. But this axiom is silent about the relation between the frequencies of choice in the two sets A and B. Formula (COND) requires them to be related by the Conditioning Axiom of Renyi (1955, 1956), a classical probabilistic consistency condition. In particular, (COND) per se is weaker than Luce's Choice Axiom, which imposes p_A(a) = p_B(a) p_A(B) for all a ∈ B ⊆ A, not just for the elements a in B that can be chosen from A.

To interpret (iii), note that the first stage preference ≻ determines the support of p, while the second stage Random Preference Model {≻_ω} is the formal description of the Lucean tie-breaking among optimizers that we previously discussed.

Finally, (iii) also says that, when X is countable, random choice rules that satisfy the Choice Axiom are random utility models, something not obvious from the definition. This opens the way to the study of general compositions of strict preferences and random utility models. Such a study is the object of current research and goes beyond the scope of this note.
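The two-stage structure described by Theorem 2 and Proposition 4 can be checked numerically on a small example. The following is a minimal sketch, not the authors' code: the four-element universe, the utility `u`, the weights `alpha`, and all function names are hypothetical choices made for illustration. It builds p from the form (CA) with Γ(A) = arg max_A u, then verifies by brute force over all menus the Choice Axiom, WARP for the support, and (COND).

```python
import math
from itertools import combinations

# Hypothetical primitives, chosen only for illustration (not taken from the paper):
# u ranks the alternatives, alpha drives the tie-breaking among u-maximizers.
u = {"a": 2, "b": 2, "c": 1, "d": 0}
alpha = {"a": 0.0, "b": math.log(3.0), "c": 5.0, "d": -1.0}
X = frozenset(u)

def gamma(A):
    """Rational choice correspondence: the maximizers of u on the menu A."""
    best = max(u[x] for x in A)
    return frozenset(x for x in A if u[x] == best)

def p(a, A):
    """Form (CA): Luce weights e^alpha restricted to gamma(A), zero elsewhere."""
    G = gamma(A)
    if a not in G:
        return 0.0
    return math.exp(alpha[a]) / sum(math.exp(alpha[b]) for b in G)

def prob(B, A):
    """p(B, A): probability that the choice from A lies in B."""
    return sum(p(x, A) for x in A if x in B)

def supp(A):
    """Support of p_A: the alternatives chosen from A with positive probability."""
    return frozenset(x for x in A if p(x, A) > 0)

def menus(S):
    """All non-empty subsets of S."""
    S = sorted(S)
    return [frozenset(c) for k in range(1, len(S) + 1) for c in combinations(S, k)]

def satisfies_choice_axiom(tol=1e-12):
    # p(a, A) = p(a, B) p(B, A) for all B in A and all a in B.
    return all(abs(p(a, A) - p(a, B) * prob(B, A)) <= tol
               for A in menus(X) for B in menus(A) for a in B)

def satisfies_warp():
    # supp(A) meets B  =>  supp(B) = supp(A) & B.
    return all(supp(B) == supp(A) & B
               for A in menus(X) for B in menus(A) if supp(A) & B)

def satisfies_cond(tol=1e-12):
    # p_B(a) = p_A(a) / p_A(B) for all a in B that can be chosen from A.
    return all(abs(p(a, B) - p(a, A) / prob(B, A)) <= tol
               for A in menus(X) for B in menus(A) for a in B & supp(A))

print(satisfies_choice_axiom(), satisfies_warp(), satisfies_cond())
print(sorted(supp(X)))
```

In this example the support of the full menu is {a, b}, so Full Support fails even though the Choice Axiom holds; the large weight on c plays no role until a and b are removed from the menu.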

1. By considering a random choice rule p_A to describe the frequency with which elements are chosen from A, we make the standard interpretation of the choice correspondence Γ(A) = supp p_A as the set of alternatives that can be chosen from A (cf. Sen, 1993) operational and formally meaningful. Here, "can be chosen" means chosen with positive frequency.

2. The second stage of randomization, disciplined by α, can be interpreted in the spirit of Salant and Rubinstein (2008) as capturing observable information which is irrelevant in the rational assessment of the alternatives, but nonetheless affects choice and may reveal how previous experiences and mental associations affect the selection from the optimal Γ(A).

3. The distinct roles of u and α become clear once our result is related to the random utility representation of the Luce model (see the seminal McFadden (1973) as well as Ben-Akiva and Lerman (1985) and Train (2009) for textbook treatments). In fact, u corresponds to the systematic component of the agent utility, and α to the alternative-specific bias in the Multinomial Logit Model. Specifically, Theorem 2 shows that a random choice rule p has the form (CA) if and only if, given any A ∈ 𝒜 and any a ∈ A,

$$p_A(a) = \lim_{\lambda \to 0} \Pr\left( \omega \in \Omega : u(a) + \lambda \epsilon_a(\omega) > u(b) + \lambda \epsilon_b(\omega) \ \ \forall b \in A \setminus \{a\} \right)$$

where u is a utility function that rationalizes Γ, {ε_x}_{x∈X} is a collection of independent errors with type I extreme value distribution, specific mean α(a), common variance π²/6, and λ is the noise level.

In this setting, our analysis shows that, when noise vanishes, optimal choice results and tie-breaking among optimal alternatives is stochastically driven by alternative-specific biases.

4. A similar interpretation arises when adopting the perspective of Matejka and McKay (2015) on the Multinomial Logit Model as the outcome of an optimal information acquisition problem. In this case, u is the true (initially unknown) payoff of alternatives, α captures a prior belief on payoffs held before engaging in experimentation, and λ is the cost of one unit of information.

Here our analysis shows that, when the cost of information vanishes, optimal alternatives are selected without error, and prior beliefs only govern the tie-breaking among such alternatives.

The study of the relations between axiomatic decision theory and stochastic choice has recently been an active field of research. Horan (2020) and Ok and Tserenjigmid (2020) are the most recent works that we are aware of. The former also provides an insightful review of the state of the art. The latter expands on the main conceptual topic of this note: the relation between deterministic and probabilistic "rationality."

Horan (2020) considers random choice rules of the form

$$p(a, A) = \begin{cases} \dfrac{e^{\alpha(a)}}{\sum_{b \in \Gamma(A)} e^{\alpha(b)}} & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases} \tag{GLM}$$

where Γ is a utility correspondence based on α.
Specifically, in Horan, Γ describes the degree of imperfection in the discrimination of the α-values of alternatives; on the contrary, in this note α and Γ are independent, with the former tie-breaking the optimizers identified by the latter.

Horan also compares and provides alternative axiomatizations of several "General Luce Models" (the name is due to Echenique and Saito, 2019) of the form (GLM), which correspond to different specifications of the properties of Γ: Ahumada and Ulku (2018), Dogan and Yildiz (2019), Echenique and Saito (2019), Lindberg (2012), and McCausland (2009).

Among these works, the manuscript of Lindberg is the most related to the present note. (We thank Sean Horan for this reference, which we learned of after the first version of our analysis circulated.) Like us, Lindberg investigates and characterizes the Choice Axiom "in purity." Unlike us, he does not obtain a representation with a single (rational) choice function Γ and a single α, but rather focuses on a lexicographic representation in the spirit of Renyi (1956). The main overlap between this paper and Lindberg (2012) is the observation that, when X is countable, random choice rules that satisfy the Choice Axiom are random utility models (that is, the sufficiency of (i) for (iii) in Proposition 4). This result is stated without proof at the end of Lindberg (2012) and appears in Horan (2020) as Lemma 11.

Dogan and Yildiz (2019) and Horan (2020) provide alternative characterizations of (CA): the former based on supermodularity of odds, the latter on the product rule and a transitivity condition of Fishburn (1978). These results, together with our characterizations of (CA) through the Choice Axiom alone, or WARP and conditioning, provide a full perspective on "rational choice" followed by "rational tie-breaking."

Like us, Ok and Tserenjigmid (2020) regard the support of a random choice rule as a deterministic choice correspondence, and they analyze the rationality of the support of p and of its subset consisting of the alternatives that are chosen with highest frequency (rather than with positive frequency).

A preference on X can be given in either strict form, ≻, or weak form, ≿.

• In the first case, ≻ is required to be asymmetric and negatively transitive, and ≿ is defined by

$$a \succsim b \ \text{if and only if}\ \neg (b \succ a) \tag{1}$$

• In the second case, ≿ is required to be complete and transitive, and ≻ is defined by

$$b \succ a \ \text{if and only if}\ \neg (a \succsim b) \tag{2}$$

These approaches are well known to be interchangeable (see Kreps, 1988, p. 11), and for this reason we call weak order both ≻ and ≿, with the understanding that they are related by the equivalent (1) or (2).

Lemma 5

Let p : 𝒜 → Δ(X) be a random choice rule. The following conditions are equivalent:

(i) p is such that p_A(C) = p_B(C) p_A(B) for all C ⊆ B ⊆ A in 𝒜;

(ii) p satisfies the Choice Axiom;

(iii) p is such that p(b, B) p(a, A) = p(a, B) p(b, A) for all B ⊆ A in 𝒜 and all a, b ∈ B;

(iv) p satisfies Odds Independence;

(v) p is such that p(Y ∩ B, A) = p(Y, B) p(B, A) for all B ⊆ A in 𝒜 and all Y ⊆ X.

Moreover, in this case, p satisfies Positivity if and only if it satisfies Full Support.

Proof (i) implies (ii). Choose as C the singleton {a} appearing in the statement of the axiom.

(ii) implies (iii). Given any B ⊆ A in 𝒜 and any a, b ∈ B, by the Choice Axiom, p(a, A) = p(a, B) p(B, A), but then

$$p(b, B)\, p(a, A) = p(a, B)\, p(b, B)\, p(B, A) = p(a, B)\, p(b, A)$$

where the second equality follows from another application of the Choice Axiom.

(iii) implies (iv). Let A ∈ 𝒜 and arbitrarily choose a, b ∈ A such that p(a, A)/p(b, A) ≠ 0/0. By (iii),

$$p(b, a)\, p(a, A) = p(b, \{a, b\})\, p(a, A) = p(a, \{a, b\})\, p(b, A) = p(a, b)\, p(b, A)$$

Three cases have to be considered:

• p(b, a) ≠ 0 and p(b, A) ≠ 0, then p(a, A)/p(b, A) = p(a, b)/p(b, a);

• p(b, a) = 0, then p(a, b) p(b, A) = 0, but p(a, b) ≠ 0 (because p(a, b)/p(b, a) ≠ 0/0), thus p(b, A) = 0 and p(a, A) ≠ 0 (because p(a, A)/p(b, A) ≠ 0/0); therefore

$$\frac{p(a, b)}{p(b, a)} = \infty = \frac{p(a, A)}{p(b, A)}$$

• p(b, A) = 0, then p(b, a) p(a, A) = 0, but p(a, A) ≠ 0 (because p(a, A)/p(b, A) ≠ 0/0), thus p(b, a) = 0 and p(a, b) ≠ 0 (because p(a, b)/p(b, a) ≠ 0/0); therefore

$$\frac{p(a, A)}{p(b, A)} = \infty = \frac{p(a, b)}{p(b, a)}$$

(iv) implies (iii). Given any B ⊆ A in 𝒜 and any a, b ∈ B:

• If p(a, A)/p(b, A) ≠ 0/0 and p(a, B)/p(b, B) ≠ 0/0, then by (OI)

$$\frac{p(a, A)}{p(b, A)} = \frac{p(a, b)}{p(b, a)} = \frac{p(a, B)}{p(b, B)}$$

  ◦ If p(b, A) ≠ 0, then p(b, B) ≠ 0 and p(b, B) p(a, A) = p(a, B) p(b, A).

  ◦ Else p(b, A) = 0, then p(b, B) = 0 and again p(b, B) p(a, A) = p(a, B) p(b, A).

• Else, either p(a, A)/p(b, A) = 0/0 or p(a, B)/p(b, B) = 0/0, and in both cases p(b, B) p(a, A) = p(a, B) p(b, A).

(iii) implies (v). Given any B ⊆ A in 𝒜 and any Y ⊆ X, since p(B, B) = 1, it follows that p(Y, B) = p(Y ∩ B, B). Therefore

$$\begin{aligned}
p(Y \cap B, A) &= \sum_{y \in Y \cap B} p(y, A) = \sum_{y \in Y \cap B} \Big( \sum_{x \in B} p(x, B) \Big) p(y, A) = \sum_{y \in Y \cap B} \Big( \sum_{x \in B} p(x, B)\, p(y, A) \Big) \\
&= \sum_{y \in Y \cap B} \Big( \sum_{x \in B} p(y, B)\, p(x, A) \Big) \qquad \text{[by (iii)]} \\
&= \sum_{y \in Y \cap B} p(y, B) \Big( \sum_{x \in B} p(x, A) \Big) = \sum_{y \in Y \cap B} p(y, B)\, p(B, A) \\
&= p(Y \cap B, B)\, p(B, A) = p(Y, B)\, p(B, A)
\end{aligned}$$

(v) implies (i). Take Y = C.

Finally, let p satisfy the Choice Axiom. Assume, per contra, that Positivity holds and p(a, A) = 0 for some A ∈ 𝒜 and some a ∈ A. Then A ≠ {a} and, for all b ∈ A \ {a}, the Choice Axiom implies

$$0 = p(a, A) = p(a, \{a, b\})\, p(\{a, b\}, A) = p(a, b) \left( p(a, A) + p(b, A) \right) = p(a, b)\, p(b, A)$$

whence p(b, A) = 0 (because p(a, b) ≠ 0), contradicting p(A, A) = 1. Therefore Positivity implies Full Support. The converse is trivial. ∎

If p : 𝒜 → Δ(X) is a random choice rule, denote by σ_p(A) the support of p_A, for all A ∈ 𝒜.

Lemma 6 If p : 𝒜 → Δ(X) is a random choice rule that satisfies the Choice Axiom, then σ_p : 𝒜 → 𝒜 is a rational choice correspondence.

Proof Clearly, ∅ ≠ σ_p(A) ⊆ A for all A ∈ 𝒜, so σ_p : 𝒜 → 𝒜 is a choice correspondence. Let A, B ∈ 𝒜 be such that B ⊆ A and assume that σ_p(A) ∩ B ≠ ∅. We want to show that σ_p(A) ∩ B = σ_p(B).

Since p satisfies the Choice Axiom, if a ∈ σ_p(A) ∩ B, then 0 < p(a, A) = p(a, B) p(B, A). It follows that p(a, B) > 0, that is, a ∈ σ_p(B). Thus, σ_p(A) ∩ B ⊆ σ_p(B).

As to the converse inclusion, let a ∈ σ_p(B), that is, p(a, B) > 0. By contradiction, assume that a ∉ σ_p(A) ∩ B. Since a ∈ B, it must be the case that a ∉ σ_p(A), that is, p(a, A) = 0. Since p satisfies the Choice Axiom, we then have 0 = p(a, A) = p(a, B) p(B, A). Since p(a, B) > 0, it must be the case that p(B, A) = 0, that is, σ_p(A) ∩ B = ∅. This contradicts σ_p(A) ∩ B ≠ ∅; therefore, a belongs to σ_p(A) ∩ B. Thus, σ_p(B) ⊆ σ_p(A) ∩ B. ∎

Lemma 7

The following conditions are equivalent for a function p : 𝒜 → Δ(X):

(i) p is a random choice rule that satisfies the Choice Axiom;

(ii) p is a random choice rule such that σ_p is a rational choice correspondence, and

$$p_B(a) = \frac{p_H(a)}{p_H(B)} \tag{3}$$

for all B ⊆ H ∈ 𝒜 and all a ∈ σ_p(H) ∩ B;

(iii) there exist a function v : X → (0, ∞) and a rational choice correspondence Γ : 𝒜 → 𝒜 such that, for all x ∈ X and A ∈ 𝒜,

$$p(x, A) = \begin{cases} \dfrac{v(x)}{\sum_{b \in \Gamma(A)} v(b)} & \text{if } x \in \Gamma(A) \\ 0 & \text{else} \end{cases} \tag{4}$$

In this case, Γ is unique and coincides with σ_p.

Proof (iii) implies (i). Let p be given by (4) with Γ a rational choice correspondence and v : X → (0, ∞). It is easy to check that p is a well defined random choice rule, that the support correspondence supp p coincides with Γ, and that

$$p(Y, A) = \frac{\sum_{y \in Y \cap \Gamma(A)} v(y)}{\sum_{d \in \Gamma(A)} v(d)}$$

for all Y ⊆ X and all A ∈ 𝒜.

Let A, B ∈ 𝒜 be such that B ⊆ A and a ∈ B. We have two cases:

• If Γ(A) ∩ B ≠ ∅, since Γ satisfies WARP, Γ(A) ∩ B = Γ(B).

  ◦ If a ∈ Γ(B), then a ∈ Γ(A) and p(a, B) = v(a) / Σ_{b∈Γ(B)} v(b); it follows that

$$p(a, A) = \frac{v(a)}{\sum_{d \in \Gamma(A)} v(d)} = \frac{v(a)}{\sum_{b \in \Gamma(B)} v(b)} \cdot \frac{\sum_{b \in \Gamma(A) \cap B} v(b)}{\sum_{d \in \Gamma(A)} v(d)} = p(a, B)\, p(B, A)$$

  ◦ Else a ∉ Γ(B), and since a ∈ B, it must be the case that a ∉ Γ(A), so p(a, A) = 0 = p(a, B) = p(a, B) p(B, A).

• Else Γ(A) ∩ B = ∅. It follows that a ∉ Γ(A) and p(B, A) = 0 = p(a, A); again, we have p(a, A) = p(a, B) p(B, A).

These cases prove that p satisfies the Choice Axiom.

(i) implies (ii). Let p : 𝒜 → Δ(X) be a random choice rule that satisfies the Choice Axiom. Then, by Lemma 6, σ_p : 𝒜 → 𝒜 is a rational choice correspondence. Moreover, if B ⊆ H and a ∈ σ_p(H) ∩ B, then

$$p(a, H) = p(a, B)\, p(B, H)$$

but p(B, H) ≥ p(a, H) > 0 because a ∈ B and a ∈ σ_p(H), and (3) follows.

(ii) implies (iii). Let p : 𝒜 → Δ(X) be a random choice rule such that σ_p is a rational choice correspondence, and that satisfies (3). Since σ_p is a rational choice correspondence, the relation

$$a \succsim b \iff a \in \sigma_p(\{a, b\}) \iff p(a, b) > 0$$

is a weak order on X, and its symmetric part ∼ is an equivalence relation such that

$$a \sim b \iff p(a, b) > 0 \ \text{and}\ p(b, a) > 0 \iff r(a, b) \in (0, \infty)$$

Moreover, by Theorem 3 of Arrow (1959), it follows that

$$\sigma_p(A) = \{ a \in A : a \succsim b \ \ \forall b \in A \} \qquad \forall A \in \mathcal{A} \tag{5}$$

in particular, all elements of σ_p(A) are equivalent with respect to ∼, and

$$\sigma_p(S) = S \tag{6}$$

for all S ∈ 𝒜 consisting of equivalent elements.

Let {X_i : i ∈ I} be the family of all equivalence classes of ∼ in X. Choose a_i ∈ X_i for all i ∈ I. For each x ∈ X, there exists one and only one i = i_x such that x ∈ X_i; set

$$v(x) = r(x, a_i) \tag{7}$$

Since x ∼ a_i, then r(x, a_i) ∈ (0, ∞); and so v : X → (0, ∞) is well defined.

Consider any x ∼ y in X and any S ∈ 𝒜 consisting of equivalent elements and containing x and y. Notice that, by (6), σ_p(S) = S, hence x ∈ σ_p(S) ∩ {x, y}; then by (3) with H = S and B = {x, y},

$$p(x, y) = \frac{p_S(x)}{p_S(\{x, y\})}$$

therefore

$$0 < p(x, S) = p(x, y)\, p(\{x, y\}, S)$$

and analogously

$$0 < p(y, S) = p(y, x)\, p(\{x, y\}, S)$$

yielding that p(x, y), p(y, x), p(x, S), p(y, S) > 0 and

$$\frac{p(x, S)}{p(y, S)} = \frac{p(x, y)}{p(y, x)} = r(x, y) \tag{8}$$

We are ready to conclude our proof, that is, to show that (4) holds with Γ = σ_p. Let a ∈ X and A ∈ 𝒜. If a ∉ σ_p(A), then p(a, A) = 0 because σ_p(A) is the support of p_A. Else, a ∈ σ_p(A), and, by (5), all the elements in σ_p(A) are equivalent with respect to ∼ and therefore they are equivalent to some a_i with i ∈ I. Set S = σ_p(A) ∪ {a_i}. It follows that S ∈ 𝒜 and S ⊆ X_i. By (6), we have that σ_p(S) = S, that is, p(x, S) > 0 for all x ∈ S, and p(σ_p(A), S) > 0. By (3) with H = A and B = σ_p(A), since a ∈ σ_p(A) ∩ B, it follows that

$$p(a, \sigma_p(A)) = \frac{p(a, A)}{p(\sigma_p(A), A)}$$

Since p(σ_p(A), A) = 1, then

$$p(a, A) = p(a, \sigma_p(A))$$

By (3) again, with H = S and B = σ_p(A), since a ∈ σ_p(S) ∩ σ_p(A), then

$$p(a, \sigma_p(A)) = \frac{p(a, S)}{p(\sigma_p(A), S)} = \frac{p(a, S) / p(a_i, S)}{p(\sigma_p(A), S) / p(a_i, S)}$$

Applying (8) to the pairs (x, y) = (a, a_i) and (x, y) = (b, a_i), with b ∈ σ_p(A), in S = σ_p(A) ∪ {a_i} ⊆ X_i, we can conclude that

$$\frac{p(a, S) / p(a_i, S)}{p(\sigma_p(A), S) / p(a_i, S)} = \frac{p(a, S) / p(a_i, S)}{\sum_{b \in \sigma_p(A)} p(b, S) / p(a_i, S)} = \frac{r(a, a_i)}{\sum_{b \in \sigma_p(A)} r(b, a_i)} = \frac{v(a)}{\sum_{b \in \sigma_p(A)} v(b)}$$

as wanted.

As for the uniqueness part, we already observed that (iii) implies Γ = σ_p. ∎

Theorem 2 immediately follows by setting α = log v.
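The construction in the proof of Lemma 7 is effectively an identification procedure: Γ is read off as the support σ_p, and v is read off from binary odds against one representative per indifference class, as in (7). The sketch below illustrates this on a finite universe; it is an illustration under stated assumptions rather than the authors' procedure. The data-generating rule (`u_true`, `w_true`) and all function names are hypothetical, and p is treated as known exactly rather than estimated from choice frequencies.

```python
from itertools import combinations

# Hypothetical data-generating rule of the form (4), used only to produce p.
u_true = {"a": 2, "b": 2, "c": 1, "d": 1}
w_true = {"a": 1.0, "b": 3.0, "c": 2.0, "d": 0.5}
X = sorted(u_true)

def p(x, A):
    """Choice probability p(x, A) generated by the hypothetical rule above."""
    best = max(u_true[y] for y in A)
    G = [y for y in A if u_true[y] == best]
    return w_true[x] / sum(w_true[y] for y in G) if x in G else 0.0

def sigma(A):
    """Support of p_A, which plays the role of Gamma(A) in the proof."""
    return frozenset(x for x in A if p(x, A) > 0)

def equivalent(x, y):
    """x ~ y if and only if both can be chosen from the binary menu {x, y}."""
    return p(x, {x, y}) > 0 and p(y, {x, y}) > 0

def recover_v():
    """Equation (7): v(x) = r(x, a_i), with a_i the representative of x's class."""
    reps, v = [], {}
    for x in X:
        rep = next((a for a in reps if equivalent(x, a)), None)
        if rep is None:          # x opens a new equivalence class
            rep = x
            reps.append(x)
        v[x] = p(x, {x, rep}) / p(rep, {x, rep})
    return v

def rebuilt(x, A, v):
    """Right-hand side of (4) with Gamma = sigma and the recovered v."""
    G = sigma(A)
    return v[x] / sum(v[y] for y in G) if x in G else 0.0

v = recover_v()
all_menus = [frozenset(c) for k in range(1, len(X) + 1) for c in combinations(X, k)]
max_gap = max(abs(rebuilt(x, A, v) - p(x, A)) for A in all_menus for x in A)
print(v, max_gap)
```

The recovered v need not equal the weights that generated the data: within each indifference class it is pinned down only up to a positive factor, fixed here by normalizing each representative to v(a_i) = 1. Representation (4) is unaffected, since Γ(A) always lies inside a single class.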

Proof of Proposition 3 In Theorem 2, Γ is a rational choice correspondence and the corresponding weak order is

$$a \succsim b \iff a \in \Gamma(\{a, b\}) \iff p(a, b) > 0$$

Therefore, Continuity is equivalent to the following condition: given any x, y ∈ X, if {x_n}_{n∈ℕ} converges to x, then

$$x_n \succsim y \ \text{for all } n \in \mathbb{N} \implies x \succsim y$$

$$y \succsim x_n \ \text{for all } n \in \mathbb{N} \implies y \succsim x$$

This concludes the proof, because on a separable metric space, a weak order admits a continuous utility if and only if its upper and lower level sets are closed (see, e.g., Kreps, 1988, p. 27). ∎

The set W of all weak orders on X is endowed with the σ-algebra 𝒲 generated by the sets of the form

$$W_{ab} = \{ \succ \, : a \succ b \} \qquad \forall a, b \in X$$

Given ≻ and ≻′ in W, the lexicographic composition ≻ ∘ ≻′ of ≻ and ≻′ is routinely seen to be a weak order too (see, e.g., Fishburn, 1974).

Lemma 8 For each ≻ in W, the map

$$f = f_\succ : W \to W, \qquad \succ' \mapsto \succ \circ \succ'$$

is measurable with respect to 𝒲.

Proof Arbitrarily choose a, b ∈ X, and study

$$f^{-1}(W_{ab}) = f^{-1}(\{ \succ'' : a \succ'' b \}) = \{ \succ' : f(\succ') \in \{ \succ'' : a \succ'' b \} \} = \{ \succ' : a \, f(\succ') \, b \} = \{ \succ' : a \succ \circ \succ' b \}$$

• if a ≺ b, then there is no ≻′ in W such that a ≻ ∘ ≻′ b, that is, {≻′ : a ≻ ∘ ≻′ b} = ∅, which is measurable (because ∅ ∈ 𝒲),

• else if a ≻ b, then a ≻ ∘ ≻′ b for all ≻′ in W, that is, {≻′ : a ≻ ∘ ≻′ b} = W, which is measurable (because W ∈ 𝒲),

• else, it must be the case that a ∼ b and a ≻ ∘ ≻′ b if and only if a ≻′ b, that is, {≻′ : a ≻ ∘ ≻′ b} = {≻′ : a ≻′ b} = W_ab, which is measurable (because W_ab ∈ 𝒲).

Therefore f is measurable since the counterimage of a class of generators of 𝒲 is contained in 𝒲. ∎

A Random Preference Model (RPM) is a measurable function

$$P : (\Omega, \mathcal{F}, \Pr) \to W, \qquad \omega \mapsto P(\omega)$$

It is common practice to write ≻_ω instead of P(ω). The Random Selector (RS) p based on the RPM P is given by

$$p(a, A) = \Pr\left( \omega \in \Omega : a \succ_\omega b \ \ \forall b \in A \setminus \{a\} \right) \qquad \forall a \in A \in \mathcal{A}$$

This probability is well defined because

$$\{ \omega \in \Omega : a \succ_\omega b \ \ \forall b \in A \setminus \{a\} \} = \{ \omega \in \Omega : P(\omega) \in W_{ab} \ \ \forall b \in A \setminus \{a\} \} = \Big\{ \omega \in \Omega : P(\omega) \in \bigcap_{b \in A \setminus \{a\}} W_{ab} \Big\} = P^{-1}\Big( \bigcap_{b \in A \setminus \{a\}} W_{ab} \Big) \in \mathcal{F}$$

since P is measurable. Moreover, depending on P, the RS p might not define a random choice rule. For instance, if P is constantly equal to the trivial weak order according to which all alternatives are indifferent, then p(a, A) = 0 for all a ∈ A ∈ 𝒜 such that |A| ≥ 2.

We now consider the composition of f_≻ and P. First, such a composition defines a random preference model, because

$$f_\succ \circ P : (\Omega, \mathcal{F}, \Pr) \to W, \qquad \omega \mapsto f_\succ(P(\omega)) = \, \succ \circ \succ_\omega$$

being a composition of measurable functions, is measurable.

Second, the random selector based on the random preference model f_≻ ∘ P is a lexicographic version of P, which first selects the maximizers of ≻, then breaks the ties according to P.

In order to state these results formally, we denote by Γ = Γ_≻ the rational choice correspondence induced by ≻, that is, Γ_≻(A) = {a ∈ A : a ≿ b for all b ∈ A}; also recall that a ≿ b if and only if a ⊀ b.

Lemma 9 Let ≻ be a weak order, P = {≻_ω}_{ω∈Ω} be a RPM, and p be the RS based on P. Then f_≻ ∘ P = {≻ ∘ ≻_ω}_{ω∈Ω} is a RPM and the RS based on it is given by

$$p_\succ(a, A) = \begin{cases} p(a, \Gamma(A)) & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases} \tag{9}$$

for all a ∈ A ∈ 𝒜.

Proof We already observed that f_≻ ∘ P = {≻ ∘ ≻_ω}_{ω∈Ω} is a RPM. By definition of random selector based on a RPM,

$$p_\succ(a, A) = \Pr\left( \omega \in \Omega : a \succ \circ \succ_\omega b \ \ \forall b \in A \setminus \{a\} \right)$$

We have to verify that this formula coincides with (9) for all a ∈ A ∈ 𝒜. For each A ∈ 𝒜 and each a ∈ Γ(A), set

$$J_A(a) = \{ \omega \in \Omega : a \succ_\omega c \ \text{for all } c \in \Gamma(A) \setminus \{a\} \} = J$$

$$K_A(a) = \{ \omega \in \Omega : a \succ \circ \succ_\omega b \ \text{for all } b \in A \setminus \{a\} \} = K$$

Next we check that J = K.

If ω ∈ J, then a ≻_ω c for all c ∈ Γ(A) \ {a}; take any b ∈ A \ {a},

• if b is such that b ∉ Γ(A), then a ≻ b and hence a ≻ ∘ ≻_ω b,

• else b ∈ Γ(A), then a ∼ b and a ≻_ω b, so again a ≻ ∘ ≻_ω b;

then a ≻ ∘ ≻_ω b for all b ∈ A \ {a}, thus ω ∈ K.

Conversely, if ω ∈ K, then a ≻ ∘ ≻_ω b for all b ∈ A \ {a}. Thus, for all b ∈ Γ(A) \ {a}, since a ∼ b, it must be the case that a ≻_ω b. Therefore ω is such that a ≻_ω b for all b ∈ Γ(A) \ {a}, and ω ∈ J.

Summing up, for all A ∈ 𝒜 and a ∈ Γ(A),

$$p(a, \Gamma(A)) = \Pr J_A(a) = \Pr K_A(a) = p_\succ(a, A)$$

and the first line of (9) is true.

Let A ∈ 𝒜 and a ∈ A with a ∉ Γ(A); then there exists b̄ ∈ A \ {a} such that a ≺ b̄, and for no ω does a ≻ ∘ ≻_ω b̄ hold, that is,

$$K_A(a) = \{ \omega \in \Omega : a \succ \circ \succ_\omega b \ \text{for all } b \in A \setminus \{a\} \} = \emptyset$$

therefore p_≻(a, A) = Pr K_A(a) = 0, and the second line of (9) is true too. ∎

Proof of Proposition 4

The equivalence between points (i) and (ii) corresponds to the equivalence between the points with the same names in Lemma 7.

(i) implies (iii). By Theorem 2, there exist a function α : X → ℝ and a rational choice correspondence Γ : 𝒜 → 𝒜 such that

$$p(a, A) = \begin{cases} \dfrac{e^{\alpha(a)}}{\sum_{b \in \Gamma(A)} e^{\alpha(b)}} & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases}$$

for all a ∈ A ∈ 𝒜. Denote by ≻ the weak order that corresponds to Γ. As shown by McFadden (1973), the Lucean random choice rule

$$q(a, A) = \frac{e^{\alpha(a)}}{\sum_{b \in A} e^{\alpha(b)}} \qquad \forall a \in A \in \mathcal{A}$$

is based on a (Lucean) RPM P = {≻_ω}_{ω∈Ω}. By Lemma 9, it follows that f_≻ ∘ P = {≻ ∘ ≻_ω}_{ω∈Ω} is a RPM and the RS based on it is given by

$$q_\succ(a, A) = \begin{cases} q(a, \Gamma(A)) & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases} \; = \; p(a, A) \qquad \forall a \in A \in \mathcal{A}$$

Therefore, there exist a Lucean Random Preference Model {≻_ω} on (Ω, F, Pr) and a weak order ≻ on X such that p is based on {≻ ∘ ≻_ω}.

(iii) implies (i). Suppose there exist a Lucean Random Preference Model {≻_ω} on (Ω, F, Pr) and a weak order ≻ on X such that p is based on {≻ ∘ ≻_ω}. In particular, there exists α : X → ℝ such that

$$\Pr\left( \omega \in \Omega : a \succ_\omega b \ \ \forall b \in A \setminus \{a\} \right) = \frac{e^{\alpha(a)}}{\sum_{b \in A} e^{\alpha(b)}} \qquad \forall a \in A \in \mathcal{A}$$

Denoting by

$$q(a, A) = \frac{e^{\alpha(a)}}{\sum_{b \in A} e^{\alpha(b)}} \qquad \forall a \in A \in \mathcal{A}$$

the RS based on {≻_ω}, by Lemma 9 the RS based on {≻ ∘ ≻_ω} is

$$q_\succ(a, A) = \begin{cases} q(a, \Gamma(A)) & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases} \; = \; \begin{cases} \dfrac{e^{\alpha(a)}}{\sum_{b \in \Gamma(A)} e^{\alpha(b)}} & \text{if } a \in \Gamma(A) \\ 0 & \text{else} \end{cases}$$

But, by assumption (iii), q_≻ coincides with p (p is based on {≻ ∘ ≻_ω}), and Γ is a rational choice correspondence because ≻ is a weak order. Then Theorem 2 guarantees that p satisfies the Choice Axiom. ∎

References

Ahumada, A., and Ulku, L. (2018). Luce rule with limited consideration. Mathematical Social Sciences, 93, 52-56.

Arrow, K. J. (1948). The possibility of a universal social welfare function. RAND Document P-41.

Arrow, K. J. (1959). Rational choice functions and orderings. Economica, 26, 121-127.

Ben-Akiva, M. E., and Lerman, S. R. (1985). Discrete choice analysis: theory and application to travel demand. MIT Press.

Dogan, S., and Yildiz, K. (2019). Odds supermodularity and the Luce rule. Manuscript.

Echenique, F., and Saito, K. (2019). General Luce Model. Economic Theory, 68, 811-826.

Fishburn, P. C. (1974). Lexicographic orders, utilities and decision rules: A survey. Management Science, 20, 1442-1471.

Fishburn, P. C. (1978). Choice probabilities and choice functions. Journal of Mathematical Psychology, 18, 205-219.

Horan, S. (2020). Stochastic semi-orders. Manuscript.

Kreps, D. M. (1988). Notes on the theory of choice. Westview.

Lindberg, P. O. (2012). The Luce Choice Axiom revisited. Manuscript.

Luce, R. D. (1957). A theory of individual choice behavior. Bureau of Applied Social Research records, Columbia University Library.

Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. Wiley.

Luce, R. D. (1977). The choice axiom after twenty years. Journal of Mathematical Psychology, 15, 215-233.

Matejka, F., and McKay, A. (2015). Rational inattention to discrete choices: A new foundation for the multinomial logit model. American Economic Review, 105, 272-298.

McCausland, W. J. (2009). Random consumer demand. Economica, 76, 89-107.

McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In Zarembka, P. (Ed.), Frontiers in econometrics (pp. 105-142). Academic Press.

Ok, E. A., and Tserenjigmid, G. (2020). Deterministic rationality of stochastic choice behavior. Manuscript.

Peters, H., and Wakker, P. (1991). Independence of irrelevant alternatives and revealed group preferences. Econometrica, 59, 1787-1801.

Ray, P. (1973). Independence of irrelevant alternatives. Econometrica, 41, 987-991.

Renyi, A. (1955). On a new axiomatic theory of probability. Acta Mathematica Academiae Scientiarum Hungaricae, 6, 285-335.

Renyi, A. (1956). On conditional probability spaces generated by a dimensionally ordered set of measures. Theory of Probability and Its Applications, 1, 55-64.

Salant, Y., and Rubinstein, A. (2008). (A, f): choice with frames. Review of Economic Studies, 75, 1287-1296.

Samuelson, P. A. (1938). A note on the pure theory of consumer's behaviour. Economica, 5, 61-71.

Sen, A. (1993). Internal consistency of choice. Econometrica, 61, 495-521.

Train, K. E. (2009). Discrete choice methods with simulation. Cambridge University Press.

Wakker, P. P. (2010).