The Decision-Conflict and Multicriteria Logit

Abstract

We study two tractable random non-forced choice models that explain behavioural patterns suggesting that the choice-deferral outside option is often selected when people find it hard to decide between the market alternatives available to them, even when these are few and desirable. The *decision-conflict logit* extends the logit model with an outside option by assigning a menu-dependent value to that option. This value captures the degree of complexity/decision difficulty at the relevant menu and allows for the choice probability of the outside option to either increase or decrease when the menu is expanded, depending on *how many* as well as *how attractive* options are added to it. The *multicriteria logit* is a special case of this model and introduces multiple utility functions that jointly predict behaviour in a multiplicative-logit way. Every multicriteria logit admits a simple discrete-choice formulation.


Georgios Gerasimou ∗† University of St Andrews

First uploaded: May 26, 2020. This draft: September 29, 2020.


Keywords: Decision difficulty; choice delay; indecisiveness; choice overload; reason-based choice; discrete choice.

∗ Email address: [email protected].
† Acknowledgements to be added.

Introduction

It is a well-established fact that people often opt for the choice-delay outside option when they find it hard to compare the active-choice alternatives available to them, even when each of these alternatives is individually considered “good enough” to be chosen (Tversky and Shafir, 1992; Anderson, 2003; Bhatia and Mullett, 2016). Real-world examples of such behaviour include:

1. Employees who operated within an “active decision” pension-savings environment and did not sign up for one of the plans that were available to them within, say, a day, week or month of first notice, possibly even opting for indefinite non-enrolment. As Carroll et al. (2009) noted, such “an active decision mechanism compels agents to struggle with a potentially time-consuming decision”.
2. An online retailer’s uniquely identifiable website visitors who, by the end of their visit, had purchased none of the products previously presented to them, possibly due to “conflicts, ambivalence and hesitation” (Huang, Korfiatis, and Chang, 2018).
3. Patients who, instead of choosing “immediately” one of the active treatments that were recommended to them against a medical condition, delayed making such a choice (often at a health cost), due to “facing a treatment dilemma” (Knops et al., 2013).
4. Doctors who were willing to prescribe the single available drug to treat a medical condition but were not prepared to prescribe anything when they had to decide from the expanded set that contained one more drug. As noted by Redelmeier and Shafir (1995), “(a)pparently, the difficulty in deciding between the two medications led some physicians to recommend not starting either”.

An analyst might be interested in understanding the preferences of the average employee, consumer, patient or doctor in such datasets at the time when the decision problem was first presented to them.
In line with common practice in the discrete-choice literature, the analyst’s starting point may be to define the grand choice set as the collection of all alternatives and the outside option. Then, following Luce (1959), she might check if there is some utility function on this expanded set that explains the data by means of the standard logit formula. If Luce’s Independence of Irrelevant Alternatives (IIA) axiom is violated, she would conclude that such an exact representation is impossible; and if these violations are severe, estimation via logit regression (McFadden, 1973) would not provide a good fit either. A question that arises naturally is whether in such cases one can do better with some more general but still disciplined model that would impose the logit assumptions only on market alternatives that correspond to the agents’ active choices, while at the same time allowing for the outside option to be chosen.

We answer this question affirmatively and show that if IIA is satisfied by the choice probabilities of market alternatives, then such data may admit two novel and tractable representations that allow for recovering a considerable amount of information about consumer indecisiveness and the generally incomplete preferences of the average decision maker at the time of first presentation. In line with the above examples, and depending also on the richness of the available data (e.g. do they also include response times?) and the analyst’s objective (e.g. are they analyzing only the final active choices or are they also interested in understanding any initial decision difficulties?), choosing the outside option might be defined, for example, either as the act of choosing no market alternative from the menu in question or as an “excessive” delay in choosing such an alternative. The latter definition may be particularly relevant if additional information about the alternatives has been received by the agent in the interim stage.
Under this flexible interpretation, these models are applicable in a variety of datasets that may originate in the market, field or lab.

Formally, upon letting the grand choice set $X$ comprise only market alternatives, we show that the first piece of novel preference-relevant information in such cases can be revealed from the existence of a (pseudo-)utility function $u$ that is defined on $X$ and of a decision complexity function $D$ that is defined on the set of all menus that can be derived from $X$, in such a way that for every menu $A \subseteq X$ and market alternative $a \in A$, the choice probability of $a$ at $A$ when the opt-out option $o \notin X$ is also feasible is given by

$$\rho(a, A \cup \{o\}) = \frac{u(a)}{\sum_{b \in A} u(b) + D(A)}, \tag{1}$$

where $D$ is zero at singleton menus and strictly positive otherwise, and the pair $(u, D)$ is unique up to a common positive linear transformation. We will refer to (1) as the decision-conflict logit model. Like the Luce model with an outside option, market alternatives in (1) are assigned menu-independent (pseudo-)utility values. Unlike that model, the utility of the outside option is menu-dependent and captured by the value of $D$ at the relevant menu. Thus, the appeal of the outside option at a menu is strictly increasing in decision complexity. Importantly, however, the complexity function $D$ is not increasing with menu expansion in general. This is consistent with evidence and intuition suggesting that adding alternatives to a menu may or may not increase decision difficulty and the appeal of delaying making an active choice, depending on which as well as on how many market alternatives are added. Indeed, as documented in the meta-analyses of Scheibehenne, Greifeneder, and Todd (2010) and Chernev, Böckenholt, and Goodman (2015), an active choice from the bigger menu often becomes more likely than it was originally if a sufficiently attractive option is introduced.
This feature of the model, therefore, captures the intuitive channel through which choice-overload effects are known to arise as well as disappear with menu expansion.

We further show that additional information from observable data of this kind may in special cases also be recoverable by means of their multicriteria logit representation whereby there exist $k$ strictly positive utility functions $u_l$ on $X$ such that

$$\rho(a, A \cup \{o\}) = \prod_{l=1}^{k} \frac{u_l(a)}{\sum_{b \in A} u_l(b)}, \tag{2}$$

where the collection $(u_i)_{i=1}^{k}$ is unique up to a joint ratio-scale transformation in the sense that, if $(v_i)_{i=1}^{k}$ is another such representation of $\rho$, then each $u_i$ is a positive linear transformation of a unique $v_j$. The distinct utility functions in this model can be thought of as capturing different cardinal criteria according to which the market alternatives are ranked by the average decision maker in the sample, with these rankings generally in conflict with each other (e.g. due to product price vs quality trade-offs; house comfort level vs location; treatment efficacy vs side effects). Interpreting the criteria, for example, as specifying different signal distributions whose realizations favour the respective alternatives, the average decision maker can be thought of as simultaneously requiring strictly favourable draws from every such distribution before choosing a market alternative or, in the absence of such unanimity, as requiring that sufficiently many favourable signals are received that are strong enough to offset any unfavourable ones.
Such multiplicity of cardinal criteria appears to be new in the stochastic choice literature but is conceptually analogous to those found in the analysis of multicriteria games (Shapley, 1959; Roemer, 1999) and especially in multi-utility representations of incomplete preferences under risk and uncertainty (Shapley and Baucells, 1998; Dubra, Maccheroni, and Ok, 2004; Ok, Ortoleva, and Riella, 2012; Galaabaatar and Karni, 2013; Hara, Ok, and Riella, 2019; Aumann, 1962). By analogy to such representations, functions $u_l$ and $u_h$ in this model can be thought of as separately representing the average decision maker’s preferences on attributes $l$ and $h$. Unlike those models, however, which do not feature a specific decision rule but do predict that $a$ is strictly preferred to $b$ (hence, implicitly, that $a$ is always chosen over $b$ at $\{a, b\}$) whenever the former is better than the latter according to all criteria, here it is not just the existence of unanimous dominance $u_l(a) > u_l(b)$ for all $l \le k$, but also its degree that dictates how likely it is that $a$ will be chosen over $b$ or whether the outside option will be chosen instead. This degree is increasing in the utility difference $u_l(a) - u_l(b)$ corresponding to each $l \le k$.

We show, finally, that any multicriteria logit admits a tractable discrete-choice formulation under the distributional assumption of type-1 extreme-value errors that are independent across alternatives and utility criteria, and also under the behavioural assumption of maximally dominant choice with incomplete preferences. The former is an extension of a standard assumption in discrete-choice modelling. The latter amounts to the average decision maker being portrayed as having preferences that are captured by a possibly incomplete preorder and as making an active choice if and only if a most preferred alternative exists (Gerasimou, 2011; 2018, Section 2; Costa-Gomes et al., 2020).
This simple model is the only model of choice with incomplete preferences that predicts fully consistent active choices, and therefore imposes rich sets of restrictions. On the descriptive side, it explains Buridan-paradox types of behaviour such as the one reported in Shafir, Simonson, and Tversky (1993): “At the bookstore, [Thomas Schelling] was presented with two attractive encyclopedias and, finding it difficult to choose between the two, ended up buying neither – this, despite the fact that had only one encyclopaedia been available he would have happily bought it.” We conclude, finally, with the presentation of two special cases of (1) and (2) that apply to the empirically relevant domains of binary choice with an outside option and provide complementary insights on preferences and decision difficulty in such environments.
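The statement that the multicriteria logit is a special case of the decision-conflict logit can be made concrete with a short worked identity. The block below is a sketch rather than the paper's own proof: it rewrites (2) in the form of (1), and the explicit expression for $D(A)$ is obtained here simply by matching denominators.

```latex
% Sketch: equation (2) rewritten in the form of equation (1).
\[
\rho(a, A \cup \{o\})
  = \prod_{l=1}^{k} \frac{u_l(a)}{\sum_{b \in A} u_l(b)}
  = \frac{\prod_{l=1}^{k} u_l(a)}{\prod_{l=1}^{k} \sum_{b \in A} u_l(b)}
  = \frac{u(a)}{\sum_{b \in A} u(b) + D(A)},
\]
% where u and D are defined from the k criteria by
\[
u(a) := \prod_{l=1}^{k} u_l(a),
\qquad
D(A) := \prod_{l=1}^{k} \sum_{b \in A} u_l(b) \;-\; \sum_{b \in A} \prod_{l=1}^{k} u_l(b).
\]
```

Under this construction $D(A) = 0$ at singleton menus. For $k \ge 2$ and $|A| \ge 2$, expanding the product of sums leaves strictly positive cross terms after the "diagonal" products are subtracted, so $D(A) > 0$, as (1) requires. With $k = 1$ the same formula gives $D \equiv 0$.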

Example 1

Let $X := \{a, b, c\}$ and consider the decision-conflict and dual-logit models on $X$ where

$u(a) = 72$, $u(b) = 72$, $u(c) = 68.$…,
$D(\{a, b\}) = 145$, $D(\{a, c\}) = 141.$…, $D(\{b, c\}) = 141.$…, $D(\{a, b, c\}) = 427.$…,
$u_1(a) = 8$, $u_1(b) = 9$, $u_1(c) = 8.$…,
$u_2(a) = 9$, $u_2(b) = 8$, $u_2(c) = 8.$….

The three alternatives here are very similar and there is no obvious dominance relationship between any two of them. The information in both representations suggests that decision difficulty is increased and an active choice is less likely in the ternary menu (Figure 1). Such a similarity-driven choice-overload effect in turn is consistent with the findings in recent empirical studies that suggest a positive link between similarity of the alternatives and decision complexity (Sela, Berger, and Liu, 2009; Scheibehenne, Greifeneder, and Todd, 2010; Bhatia and Mullett, 2018).

[Figure 1: Similarity-driven choice overload (Example 1). Two panels, for the menus $\{a, b\}$ and $\{a, b, c\}$, place the alternatives on Attribute / Criterion axes and report $\rho(a, \cdot)$ and $\rho(o, \cdot)$ at each menu.]

Example 2

In the opposite direction, the meta-analyses in Scheibehenne, Greifeneder, and Todd (2010) and Chernev, Böckenholt, and Goodman (2015) point to the conclusion that choice-overload effects are mitigated or even disappear when the larger menu includes a clearly superior market alternative. To see how the proposed modelling framework predicts such an effect, consider the decision-conflict and dual-logit $\rho$ where

$u(a) = 100$, $u(b) = 1$, $u(c) = 1$,
$D(\{a, b\}) = 25$, $D(\{a, c\}) = 25$, $D(\{b, c\}) = 4.$…, $D(\{a, b, c\}) = 54.$…,
$u_1(a) = 10$, $u_1(b) = 2$, $u_1(c) = 0.$…,
$u_2(a) = 10$, $u_2(b) = 0.$…, $u_2(c) = 2$.

[Figure 2: Mitigation of choice overload when a sufficiently attractive option is introduced (Example 2). Two panels, for the menus $\{b, c\}$ and $\{a, b, c\}$, place the alternatives on Attribute / Criterion axes and report $\rho(o, \cdot)$ at each menu.]

Alternatives $b$ and $c$ are similarly attractive while $a$ is far superior to both of them. Consistent with the reason-based arguments in Tversky and Shafir (1992) and Shafir, Simonson, and Tversky (1993), the double dominance of $a$ at $\{a, b, c\}$ makes an active choice more likely at that menu than at $\{b, c\}$ where no dominance exists. Consistent with this intuition, the model-predicted probability of an active choice at $\{b, c\}$ is 0.32 in this example, whereas at $\{a, b, c\}$ it rises to 0.65 (Figure 2).

Example 3

Dhar and Simonson (2003) were the first to report a weakening of the compromise effect (Simonson, 1989) when choice is not forced, whereby a considerable proportion of subjects opted for the choice-deferral outside option in the target ternary menu that included the two extreme and one compromise option. This was contrasted with their finding that the attraction effect (Huber, Payne, and Puto, 1982) was actually strengthened in such non-forced choice settings. The authors suggested reason-based choice explanations for these findings that included the presence or absence of an objective dominance relation within the menu (cf Example 2). While the models proposed here cannot capture genuine asymmetric dominance effects because they retain the IIA axiom on active-choice probabilities, they do predict such a weakening of the compromise effect that is driven by this kind of reason-based logic.

To illustrate, consider the decision-conflict logit where

$u(a) = 18$, $u(b) = 18$, $u(c) = 18$,
$D(\{a, b\}) = 18$, $D(\{a, c\}) = 18$, $D(\{b, c\}) = 18$, $D(\{a, b, c\}) = 24$.

[Figure 3: Weakening of the compromise effect when choice is not forced (Example 3). Two panels, for the menus $\{a, b\}$ and $\{a, b, c\}$, place the alternatives on Attribute / Criterion axes and report $\rho(b, \cdot)$ and $\rho(o, \cdot)$ at each menu.]

Although this model does not admit a dual-logit representation, we assume for simplicity (and without loss of generality) that the alternatives can be represented in two dimensions as in Figure 3. With $D$ here being strictly monotonic in menu expansion and the three alternatives similarly attractive, the model’s predictions are in the spirit of the Dhar and Simonson (2003) findings (cf their Table 2).
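The numbers in Examples 2 and 3 can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, the conflict value implied by a multicriteria logit is computed as "product of sums minus sum of products" (which follows from equating (1) and (2)), and the two truncated dual-logit entries of Example 2 are assumed to equal 0.5, a choice that reproduces the stated $D(\{a, b\}) = 25$ and the reported active-choice probabilities 0.32 and 0.65.

```python
from fractions import Fraction as F
from math import prod


def multicriteria_prob(a, menu, criteria):
    """Equation (2): product over criteria of u_l(a) / sum_{b in menu} u_l(b)."""
    return prod(u[a] / sum(u[b] for b in menu) for u in criteria)


def implied_conflict(menu, criteria):
    """D(A) that makes (1), with u = product of the u_l, coincide with (2)."""
    product_of_sums = prod(sum(u[b] for b in menu) for u in criteria)
    sum_of_products = sum(prod(u[b] for u in criteria) for b in menu)
    return product_of_sums - sum_of_products


def conflict_prob(a, menu, u, d):
    """Equation (1) for a single menu whose conflict value is d."""
    return u[a] / (sum(u[b] for b in menu) + d)


def active_choice(menu, criteria):
    """Probability that some market alternative (not the outside option) is chosen."""
    return sum(multicriteria_prob(a, menu, criteria) for a in menu)


# Example 2. ASSUMPTION: the truncated entries u_1(c) and u_2(b) are taken to be 1/2.
criteria = [
    {"a": F(10), "b": F(2), "c": F(1, 2)},
    {"a": F(10), "b": F(1, 2), "c": F(2)},
]
print(implied_conflict(["a", "b"], criteria))            # 25
print(float(active_choice(["b", "c"], criteria)))        # 0.32
print(float(active_choice(["a", "b", "c"], criteria)))   # 0.6528

# Example 3, using the (u, D) values stated in the text.
u3 = {"a": F(18), "b": F(18), "c": F(18)}
print(conflict_prob("b", ["a", "b"], u3, 18))        # 1/3
print(conflict_prob("b", ["a", "b", "c"], u3, 24))   # 3/13
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparisons with the stated values are not affected by floating-point rounding.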
Standard discrete choice models treat the outside option like any other alternative and predict that it is chosen when its Luce utility is higher than that of all feasible market alternatives (see Anderson, de Palma, and Thisse (1992) or Matzkin (2019), for example). To contrast this decision rule with the above-mentioned maximally dominant choice model with incomplete preferences and a distinct behavioural model of overload-constrained utility maximization, we provided an axiomatization of this kind of utility maximization with an outside option in a general deterministic framework in Gerasimou (2018, Section 3). Starting with Manzini and Mariotti (2014), several random choice models of limited attention have also required or allowed for the inclusion of an outside option recently (Brady and Rehbeck, 2016; Dardanoni et al., 2020; Abaluck and Adams, 2020; Cattaneo et al., 2020; Barseghyan et al., 2019; Aguiar et al., 2018). The outside option in these models is chosen when no attention is paid to any of the market alternatives and its choice probability is always decreasing as menus become bigger. Horan (2019) has recently shown, however, that removing the outside option is inconsequential for these models’ general features and primary purpose, which is to explain active-choice decision making subject to cognitive constraints. Our approach is distinct from, and complementary to, all the above, being relevant in situations where every feasible alternative is both desirable and paid attention to. In the proposed framework the average agent is preference-maximizing but her active choices are potentially hard due to comparison difficulties. Unlike the above models, this allows for the intuitive (non-)monotonicities that were illustrated in Section 3.

Also distinct from (1) and (2) are the perception-adjusted logit model in Echenique, Saito, and Tserenjigmid (2018) and the nested stochastic choice model in Kovach and Tserenjigmid (2019).
In the former, choice probabilities are influenced by the alternatives’ position in a priority ordering. As the authors showed, the choice probability of the outside option is weakly higher in that model than what it would have been in the corresponding logit model, with its utility being the sum of its menu-independent utility and an additional, menu-dependent component. The axiomatization and generalization of the nested logit model (Ben-Akiva, 1973; Train, 2009, Chapter 4) in Kovach and Tserenjigmid (2019) also allows for the inclusion of an outside option where one nest comprises all market alternatives and the second nest only contains the outside option (see Koujianou-Goldberg (1995) for an influential application of such a model). Unlike a decision-conflict or multicriteria logit, this specification assigns a menu-independent utility to the outside option and therefore cannot easily explain the behavioural phenomena that were mentioned and illustrated above. The model studied in Kovach and Ülkü (2020) predicts that the outside option is chosen if none of the feasible market alternatives is preferred to a randomly specified threshold. The authors showed that this is a special case of a random utility model where the outside option is treated like a market alternative. The multicriteria logit, moreover, differs from the rank-ordered logit (Beggs, Cardell, and Hausman, 1981), which estimates the probabilities of complete orderings (as opposed to choice probabilities of market alternatives) with a functional form that is similar to (2).

Conceptually related but distinct from the proposed models are also the logit models with costly information sampling and rational inattention in Matějka and McKay (2015), Caplin, Dean, and Leahy (2019) and their extension in dynamic environments that was analysed in Steiner, Stewart, and Matějka (2017).
In a different strand of work, moreover, Fudenberg and Strzalecki (2015) characterized a generalization of dynamic logit under uncertainty that, in addition to a preference for flexibility, allows for choice-overload-like effects whereby removing items from a menu could increase that menu’s valuation. In our decision-conflict logit such an effect manifests itself through a decrease in the menu-dependent utility of the outside option. Working within a forced- and non-forced general choice environment, respectively, Frick (2016) and Gerasimou (2018, Section 4) studied threshold general choice models that also predict choice-overload effects by means of menu complexity functions that are monotonic in set inclusion and lead to increased levels of choice inconsistency and choice deferrals, respectively, as menus expand. Deb and Zhou (2018) studied choice overload with an attribute-based model of reference-dependent preferences. Buturak and Evren (2017) took menus of lotteries as the primitive and modelled the decision maker as making an active choice at a menu if and only if the Sarver (2008) regret-inclusive expected utility of the most preferred feasible alternative exceeds the menu-independent cut-off utility of the outside option. For models of indecisiveness aversion, commitment and flexibility in the literature of preferences over menus of lotteries, finally, we refer the reader to Danan, Guerdjikova, and Zimper (2012) and Pejsachowicz and Toussaert (2017).

Let $X$ be the grand choice set of finitely many and at least two market alternatives with generic elements $a, b \in X$. Let $o \notin X$ be the outside option. Define further the augmented choice set $X \cup \{o\}$ and let $\mathcal{D} := \{A \cup \{o\} : \emptyset \neq A \subseteq X\}$ be the collection of all decision problems, where a nonempty $A \subseteq X$ is a menu and $\mathcal{M} := \{A \subseteq X : A \neq \emptyset\}$ is the collection of all menus.
The generic element of a decision problem $A \cup \{o\}$ will be denoted by $x$ and may stand for a market alternative in menu $A$ or for the outside option. A random non-forced choice model on $X$ is a function $\rho : (X \cup \{o\}) \times \mathcal{D} \to \mathbb{R}_+$ such that $\rho(x, A \cup \{o\}) \in [0, 1]$ for all $A \cup \{o\} \in \mathcal{D}$ and $x \in A \cup \{o\}$; $\rho(x, A \cup \{o\}) = 0$ for all $x \notin A \cup \{o\}$; and $\sum_{x \in A \cup \{o\}} \rho(x, A \cup \{o\}) = 1$. This definition is formally equivalent to that of a random choice model with an outside option and reserves a special role for that option by allowing for it to be chosen and yet to also be treated separately from market alternatives. Compared to the papers cited earlier, our slightly different formulation, terminology and notation aims to highlight this option’s special nature (indeed, it is the only one that is feasible in every menu) and also to make it more transparent that we distinguish here between active-choice behaviour that pertains to choice of market alternatives and the avoidant/deferring behaviour that is reflected in the choice of that option.

Like Luce (1959) and many papers since, we assume that the domain of a random non-forced choice model $\rho$ includes all decision problems. The minimal structure on such a $\rho$ that we will be imposing throughout consists of the following three axioms, to be supplemented by additional ones in the sequel:

Desirability. If $a \in X$, then $\rho(a, \{a, o\}) = 1$.

Positivity. If $|A| \ge 2$ and $x \in A \cup \{o\}$, then $\rho(x, A \cup \{o\}) > 0$.

Active-Choice Luce (ACL). If $a, b \in X$ and $A, B \supseteq \{a, b\}$, then

$$\frac{\rho(a, A \cup \{o\})}{\rho(b, A \cup \{o\})} = \frac{\rho(a, B \cup \{o\})}{\rho(b, B \cup \{o\})}.$$

Desirability, which is also implicit in the original Luce (1959) model without an outside option, implies that every market alternative is sufficiently good to always be chosen when it is the only feasible one.
Thus, when the outside option is chosen in other menus, this is due to reasons other than the potential unattractiveness of market alternatives, which is a different and often valid explanation of such behaviour that lies outside the scope of our analysis. The most general of our results below does afford a generalization that relaxes Desirability, assuming that one is willing to accept the interpretation within the model captured in (1) that singleton menus also generate decision conflict. While this may indeed be the case in some real binary decisions, retaining the Desirability axiom throughout allows for not confounding the indecisiveness channel toward choice-avoidant behaviour with the distinct channel of undesirability, although, of course, both channels may be simultaneously present in practice.

Positivity is a standard axiom in the literature and allows for a crisper illustration of the key novel ideas that are put forward here. In view of Desirability, we restrict its scope of application to non-singleton menus only. Implicitly already assuming Positivity in its statement, ACL imposes the standard kind of IIA-consistency only in the choice probabilities of market alternatives, while allowing the choice probabilities pertaining to pairs $\{o, a\}$ that comprise the outside option and a market alternative to deviate from it. In recent work (Gerasimou, 2011; 2018; Costa-Gomes et al., 2020) we have argued on conceptual, theoretical and empirical grounds that observable active choices are more likely to conform to no-cycle principles of consistency such as the Weak Axiom of Revealed Preference (WARP) when these choices are not forced upon decision makers than when they are.
To the extent that this tends to be true in general, the intuitive appeal of ACL in a non-forced choice environment is increased compared to that of the standard IIA axiom in a forced-choice setting, considering also the formal analogy between WARP and the more general Luce Axiom (which also applies without Positivity) that was recently established in Cerreia-Vioglio et al. (2020).

Proposition 1

The following are equivalent for a random non-forced choice model $\rho$ on $X$:

1. $\rho$ satisfies Desirability, Positivity and Active-Choice Luce.
2. $\rho$ is a decision-conflict logit.

Before proceeding further, we note in passing that a decision-conflict logit $\rho$ also satisfies Active-Choice Strong Stochastic Transitivity, defined as the weakening of the standard version of the axiom whereby, for all $a, b, c \in X$, $\rho(a, \{a, b, o\}) \ge$ […] and $\rho(b, \{b, c, o\}) \ge$ […] implies $\rho(a, \{a, c, o\}) \ge \max\{\rho(a, \{a, b, o\}), \rho(b, \{b, c, o\})\}$.

Let us now turn to the implications that (not) imposing some additional structure has on the model’s predictions. To this end, we first define a decision-conflict logit $\rho = (u, D)$ to be (strictly) monotonic if $D$ is monotonic in the sense that $D(A) \ge D(B)$ ($D(A) > D(B)$) whenever $A \supset B$. That is, decision difficulty in such a model is (strictly) increasing in menu inclusion. Next, let us introduce a weakening of the fundamental Regularity axiom. The standard version of this axiom requires the choice probability of any alternative to decrease when additional items are introduced in a menu, and is implied by all random-utility models (Block and Marschak, 1960). Similar to ACL, our version of the axiom requires that this be true for all alternatives but the outside option.

Active-Choice Regularity (ACR). If $a \in B \subset A$, then $\rho(a, B \cup \{o\}) \ge \rho(a, A \cup \{o\})$.

The next axiom, finally, describes behaviour that goes in the opposite direction to ACR as far as choice of the outside option is concerned. Although this is in line with the general direction pointed to by choice-overload findings (cf Example 1), it rules out the documented pattern that was mentioned previously (cf Example 2) according to which the addition of a sufficiently superior option often makes an active choice more likely despite the resulting menu expansion.

Deferral Anti-Regularity (DA). If $A \supset B$, then $\rho(o, A \cup \{o\}) \ge \rho(o, B \cup \{o\})$.

Observation 1

1. A decision-conflict logit may violate Active-Choice Regularity.
2. A monotonic decision-conflict logit satisfies Active-Choice Regularity.
3. A monotonic decision-conflict logit may violate Deferral Anti-Regularity.
4. A decision-conflict logit that satisfies Deferral Anti-Regularity is monotonic.

The second and last points are straightforward to prove, while examples illustrating the first and third points are easy to construct (for the third point, see Example 2).
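The third point of Observation 1 can be checked numerically on Example 2. The following is a minimal sketch with hypothetical helper names. It assumes that the truncated values of $D(\{b, c\})$ and $D(\{a, b, c\})$ are 4.25 and 54.25, which is consistent with the active-choice probabilities 0.32 and 0.65 reported for that example.

```python
from fractions import Fraction as F
from itertools import combinations


def all_menus(X):
    """All nonempty subsets of X, as frozensets."""
    return [frozenset(c) for r in range(1, len(X) + 1)
            for c in combinations(sorted(X), r)]


def rho(a, A, u, D):
    """Equation (1): probability of market alternative a at menu A."""
    return u[a] / (sum(u[b] for b in A) + D[A])


def rho_out(A, u, D):
    """Probability of the outside option at menu A."""
    return D[A] / (sum(u[b] for b in A) + D[A])


def nested_pairs(X):
    """All pairs (A, B) of menus with B a proper subset of A."""
    menus = all_menus(X)
    return [(A, B) for A in menus for B in menus if B < A]


def is_monotonic(X, D):
    return all(D[A] >= D[B] for A, B in nested_pairs(X))


def satisfies_acr(X, u, D):
    return all(rho(a, B, u, D) >= rho(a, A, u, D)
               for A, B in nested_pairs(X) for a in B)


def satisfies_da(X, u, D):
    return all(rho_out(A, u, D) >= rho_out(B, u, D)
               for A, B in nested_pairs(X))


# Example 2. ASSUMPTION: D({b,c}) = 4.25 and D({a,b,c}) = 54.25 (truncated in the text).
X = {"a", "b", "c"}
u = {"a": F(100), "b": F(1), "c": F(1)}
D = {A: F(0) for A in all_menus(X)}          # D is zero at singleton menus
D[frozenset("ab")] = F(25)
D[frozenset("ac")] = F(25)
D[frozenset("bc")] = F(17, 4)
D[frozenset("abc")] = F(217, 4)

print(is_monotonic(X, D))      # True
print(satisfies_acr(X, u, D))  # True
print(satisfies_da(X, u, D))   # False
```

Under these assumed values the model is monotonic and satisfies ACR, yet DA fails because the outside option is chosen with probability 0.68 at $\{b, c\}$ and only about 0.35 at $\{a, b, c\}$.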

A random non-forced choice model $\rho$ on $X$ admits a Luce (1959)/logit representation with an outside option if there exists a function $u : X \cup \{o\} \to \mathbb{R}_{++}$ such that, for all menus $A \subseteq X$ and $a \in A$,

$$\rho(a, A \cup \{o\}) = \frac{u(a)}{\sum_{b \in A} u(b) + u(o)}, \tag{3}$$

where $u(o)$ is the menu-independent utility of the outside option. Next, by a random forced-choice model $\rho$ on $X$ we will refer to one that satisfies the properties of a non-forced choice model and, in addition, is such that $\sum_{a \in A} \rho(a, A \cup \{o\}) \equiv \sum_{a \in A} \rho(a, A) = 1$ for all $A \subseteq X$. Thus, the outside option is either infeasible or never chosen in such a model. A forced-choice $\rho$ admits a logit/Luce representation without an outside option if there exists some $u : X \to \mathbb{R}_{++}$ such that, for all $A$ and $a \in A$,

$$\rho(a, A) = \frac{u(a)}{\sum_{b \in A} u(b)}. \tag{4}$$

Clearly, the decision-conflict and multicriteria logit models become the logit model with an outside option, (3), if $o \in X$ and the Luce-IIA axiom applies to all alternatives in this $X$. Then, $D(\cdot) \equiv u(o) > 0$ […] $k = 1$ in (2), which also features an additional $u(o) > 0$ […].

Next, towards clarifying the relationship between (1) and (4) we introduce the following axiom that effectively amounts to $\rho$ being a random forced-choice model.

Active Choices

For every menu $A$, $\rho(o, A \cup \{o\}) = 0$.

This axiom implies Desirability. In its presence, moreover, Positivity reduces to $\rho(a, A \cup \{o\}) > 0$ for all $|A| \ge 2$ and $a \in A$, while ACL becomes the standard Luce-IIA condition. Therefore, the next result emerges as a direct implication of Proposition 1.

Corollary 1

The following are equivalent:

1. $\rho$ is a decision-conflict logit such that $D(A) = 0$ for every menu $A$.
2. $\rho$ is a logit without an outside option.
3. $\rho$ satisfies Active Choices, Positivity and Luce-IIA.

The equivalence between the last two statements is due to Luce (1959).
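One way to build intuition for Proposition 1 and Corollary 1 is to recover a pair $(u, D)$ from a table of choice probabilities. The sketch below is not the paper's proof. It assumes a hypothetical normalisation $u(a) := \rho(a, X \cup \{o\})$ on the grand menu and then sets $D(A) := \sum_{b \in A} u(b) \cdot \rho(o, A \cup \{o\}) / (1 - \rho(o, A \cup \{o\}))$, which reproduces (1) whenever Active-Choice Luce holds. All function names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

OUT = "o"  # label used here for the outside option


def all_menus(X):
    """All nonempty subsets of X, as frozensets."""
    return [frozenset(c) for r in range(1, len(X) + 1)
            for c in combinations(sorted(X), r)]


def conflict_logit(X, u, D):
    """Tabulate rho from equation (1), keyed by (alternative, menu)."""
    rho = {}
    for A in all_menus(X):
        denom = sum(u[b] for b in A) + D[A]
        for a in A:
            rho[(a, A)] = u[a] / denom
        rho[(OUT, A)] = D[A] / denom
    return rho


def recover(X, rho):
    """Recover one (u, D) pair from rho, normalising u on the grand menu."""
    grand = frozenset(X)
    u = {a: rho[(a, grand)] for a in X}
    D = {}
    for A in all_menus(X):
        p_out = rho[(OUT, A)]  # strictly below 1 under Desirability and Positivity
        D[A] = sum(u[b] for b in A) * p_out / (1 - p_out)
    return u, D


# Example 3 parameters, as stated in the text.
X = {"a", "b", "c"}
u = {x: F(18) for x in X}
D = {A: F(0) if len(A) == 1 else F(18) for A in all_menus(X)}
D[frozenset(X)] = F(24)

u_rec, D_rec = recover(X, conflict_logit(X, u, D))
scale = u_rec["a"] / u["a"]
print(scale)                                               # 1/78
print(all(D_rec[A] == scale * D[A] for A in all_menus(X)))  # True

# Forced-choice case: D identically zero gives rho(o, .) = 0 and recovers D = 0.
D0 = {A: F(0) for A in all_menus(X)}
_, D0_rec = recover(X, conflict_logit(X, u, D0))
print(all(value == 0 for value in D0_rec.values()))        # True
```

The recovered pair differs from the original one only by the common factor 1/78, in line with the uniqueness statement that accompanies (1).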

As was previously mentioned, recent empirical studies have suggested that decision difficulty is often increasing in the similarity of the available choice alternatives (Sela, Berger, and Liu, 2009; Scheibehenne, Greifeneder, and Todd, 2010; Bhatia and Mullett, 2018), although experimental evidence and theoretical models pointing in the opposite direction also exist (see, for example, Tversky and Russo, 1969 and Natenzon, 2019, respectively). To incorporate this insight into our analysis more concretely, and to make the latter potentially more applicable, we will now impose some additional structure on the baseline decision-conflict logit.

To this end, we first assume that the alternatives can be described in terms of a total of $m$ observable attributes. Following this, we can rewrite the grand choice set as $X \subseteq \{0, 1\}^m$, so that $a \in X$ is now understood to be an $m$-tuple of 1-0 binary values that indicate, respectively, whether that alternative possesses a certain attribute or not. With $m = 5$, for example, this approach might be relevant towards modelling the possibly bounded-rational average consumer who may perceive and try to compare multi-attribute products in a binary manner on the basis of whether these products possess one or more of the following observable (and assumed desirable) attributes: “price below $[…]”; “four-star customer rating”; “top brand name”; “weight below 100g”; “one-day delivery”.

Next, we consider for simplicity the well-known and routinely computable Jaccard similarity index (Jaccard, 1912) that quantifies the similarity between $a$ and $b$ by

$$S(a, b) := \frac{\sum_{j=1}^{m} a_j b_j}{\sum_{j=1}^{m} a_j + \sum_{j=1}^{m} b_j - \sum_{j=1}^{m} a_j b_j}.$$
The value of this symmetric index at the pair (a, b) is the number of attributes that the two alternatives have in common as a proportion of the total number of attributes between them. It is a special case of the Tversky (1977) similarity index, which is not necessarily symmetric. Also unlike the more general Tversky index, the mapping defined by (a, b) ↦ 1 − S(a, b) is a proper dissimilarity metric on X (Levandowsky and Winter, 1971).

Writing a = (a_1, …, a_m) for a ∈ X, to complete the specification we finally define

$$u(a) := \left(\frac{1}{m}\sum_{s=1}^{m} a_s\right)^{p}, \tag{5}$$

$$D(A) := \frac{\sum_{a,b\in A} S(a,b) - |A|}{|A|^{2} - |A|}, \tag{6}$$

where p ≥ 1. For any such p, u(a) attains the maximum possible value of 1 if and only if a is “ideal” in the sense that a_s = 1 for all s ≤ m. On the other hand, decision difficulty at a menu is identified with the average similarity between the distinct alternatives at that menu (recall that S(a, a) = 1 for all a ∈ X, which is why |A| is subtracted in the numerator of (6)). Under this specification, therefore, the common range of u and D is the unit interval. We will refer to (1) when u and D are defined by (5)–(6) as the similarity-conflict logit.

Observation 2

The similarity-conflict logit allows for violations of monotonicity and Active-Choice Regularity.
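The ingredients of the similarity-conflict logit are simple enough to compute by hand, and a short Python sketch may help fix ideas. It assumes that the choice probability in (1) takes the form u(a) divided by the sum of all utilities in the menu plus D(A), and it computes D as the average Jaccard similarity over pairs of distinct alternatives. The attribute profiles below are hypothetical and chosen only for illustration (two alternatives with identical profiles and one sharing no attribute with them); the function names are likewise illustrative.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two binary attribute tuples."""
    common = sum(x * y for x, y in zip(a, b))
    union = sum(a) + sum(b) - common
    return common / union  # assumes at least one attribute is present

def utility(a, p=1.0):
    """Share of attributes possessed by a, raised to the power p >= 1."""
    return (sum(a) / len(a)) ** p

def difficulty(menu):
    """Average similarity over pairs of distinct alternatives (needs |menu| >= 2)."""
    pairs = list(combinations(range(len(menu)), 2))
    return sum(jaccard(menu[i], menu[j]) for i, j in pairs) / len(pairs)

def choice_prob(i, menu, p=1.0):
    """Assumed form of (1): u(a) / (sum of utilities in the menu + D(menu))."""
    total = sum(utility(b, p) for b in menu) + difficulty(menu)
    return utility(menu[i], p) / total

# Hypothetical profiles with m = 4 attributes (illustration only).
a, b, c = (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0)
small, large = [a, c], [a, b, c]

print(difficulty(small), difficulty(large))          # average similarity falls
print(choice_prob(0, small), choice_prob(0, large))  # a becomes more likely
```

In this sketch, adding the dissimilar alternative lowers average similarity by more than the utility it contributes, so the probability of choosing a rises when the menu expands.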

It is easily seen that violations of ACR with respect to menus A, B with A ⊃ B and alternative a ∈ B can occur if and only if

$$D(B) - D(A) > \sum_{b \in A\setminus B} u(b).$$

In the context of the present model this may be interpreted as saying that, for the choice of a market alternative to become more likely when more items are added to a menu, the marginal benefit to decision-difficulty alleviation that is brought about by the lower average similarity in the expanded menu must exceed the utility benefit from the additional alternatives. To illustrate the intuition for this and also for the non-monotonicity of D, consider an example with three alternatives a, b and c, and the menus B := {a, c} and A := {a, b, c}. While distinct, the alternatives a and c are identical in terms of their common attributes, hence perfectly similar: D(B) = 1. Alternative b on the other hand does not share any attributes with a or c, and is therefore perfectly dissimilar to them. Thus, adding b to the menu decreases average similarity: D(A) = 1/3. For any parameter p ≥ 1, finally, the model predicts ρ(a, A ∪ {o}) > ρ(a, B ∪ {o}), in violation of ACR. Indeed, we have D(B) − D(A) = 2/3 > (0.5)^p = u(b), for all p ≥ 1. This example also highlights the potential relevance of binary-attribute modelling in bounded-rational decision making.

Towards analysing the multicriteria logit defined in (2), let us first note that models admitting such a representation belong to the more general polynomial logit class that is defined by

$$\rho(a, A\cup\{o\}) = \prod_{l=1}^{k}\left(\frac{u_l(a)}{\sum_{b\in A} u_l(b)}\right)^{p}, \tag{7}$$

where p ≥ 1. Within this class: (i) the logit without an outside option emerges when k = p = 1; (ii) the multicriteria logit emerges when p = 1 and k ≥ 1; and (iii) the power logit, also a random non-forced choice model, corresponds to those cases where k = 1 and p > 1. Extending an interpretation for the multicriteria logit that was mentioned in the introduction, the power logit might be thought of as portraying the average decision maker as drawing from a unique signal distribution but, due to a potential lack of confidence in that distribution, also as simultaneously requiring more than one favourable draw before choosing a given market alternative.

Now, if (u_i)_{i=1}^k are strictly positive functions on X that comprise a multicriteria representation of a random non-forced model ρ in the sense of (2), this collection also defines the decision-conflict logit (u, D) where, for every menu A and alternative a ∈ A,

$$u(a) := \prod_{l=1}^{k} u_l(a), \tag{8}$$

$$D(A) := \prod_{l=1}^{k}\sum_{b\in A} u_l(b) - \sum_{b\in A}\prod_{l=1}^{k} u_l(b). \tag{9}$$

We first identify the uniqueness structure of the multicriteria logit. To this end, given such a non-forced choice model ρ = (u_i)_{i=1}^k, we will refer to another collection of strictly positive functions (v_i)_{i=1}^k on X as a joint ratio-scale transformation of (u_i)_{i=1}^k if there is a permutation π on {1, …, k} such that, for all i ≤ k and all j, l ≤ |X|,

$$\frac{u_i(a_j)}{u_i(a_l)} = \frac{v_{\pi(i)}(a_j)}{v_{\pi(i)}(a_l)}. \tag{10}$$

That is, each u_j in (u_i)_{i=1}^k is a positive linear transformation of a unique v_h in (v_i)_{i=1}^k.

Proposition 2

A multicriteria logit is unique up to a joint ratio-scale transformation.

Next, we define the special class of strictly monotonic decision-conflict logit models ρ = (u, D) that comprises those where the complexity function D is additive in the sense that, for every menu A,

$$D(A) = \sum_{a,b\in A} D(\{a,b\}). \tag{11}$$

Assuming, as we do throughout this paper, that full attention is paid to all feasible alternatives (e.g. because their number is relatively small and/or the decision's importance is high), it is intuitive that the average agent's decision difficulty at a menu is a strictly increasing function of the pairs of distinct alternatives in it. Indeed, it is the comparisons at binary menus that ultimately determine the best alternative overall, if one exists, and how difficult it is to find one. The additivity condition disciplines the structure of this monotonic relationship in an analytically convenient way that, as will be shown below, also provides a bridge between the general decision-conflict and dual-logit models. Analogously, we refer to (u, D) and D as super-additive if, for every menu A,

$$D(A) > \sum_{a,b\in A} D(\{a,b\}). \tag{12}$$

We will further say that (u, D) and D are convex if, for all menus A, S and T such that A = S ∪ T, S ∩ T = ∅ and max{|S|, |T|} ≥ 2,

$$D(A) > D(S) + D(T). \tag{13}$$

Such a model predicts that introducing new pairs of options in a menu increases decision difficulty very rapidly. Despite this fact, however, and consistent with the previous discussion, the DA axiom may still be violated because the degree of decision difficulty at a menu, as captured by D, is measured on the same cardinal scale as the degree of attractiveness of the various market alternatives, as captured by u. Therefore, ρ(o, A) < ρ(o, B) may well be possible when A ⊃ B if one of the alternatives in A \ B is sufficiently attractive, although convexity of D makes such an event less likely.

Proposition 3

The following are true for a multicriteria logit ρ = (u_i)_{i=1}^k on X:

1. ρ is a convex decision-conflict logit.
2. ρ is an additive decision-conflict logit when k = 2 and super-additive when k > 2.
3. ρ satisfies Active-Choice Regularity but may violate Deferral Anti-Regularity.

Regarding the last point, that such a model does not imply DA in general was already seen in Example 2.
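The mapping from a multicriteria representation to its decision-conflict counterpart in (8)–(9), and the additivity claims of Proposition 3, can be checked numerically. The following Python sketch does so for illustrative criterion values (the third criterion in particular is hypothetical), assuming that the decision-conflict probability has the form u(a) divided by the sum of utilities in the menu plus D(A), and reading the sum in (11)–(12) as running over unordered pairs of distinct alternatives.

```python
from itertools import combinations
from math import prod

def to_decision_conflict(criteria, menu):
    """Return (u, D(menu)) built from the criteria as in (8)-(9).

    criteria: list of dicts mapping each alternative to a positive number.
    """
    u = {a: prod(c[a] for c in criteria) for a in menu}
    D = prod(sum(c[b] for b in menu) for c in criteria) - sum(u.values())
    return u, D

def multicriteria_prob(a, criteria, menu):
    """Product over criteria of the criterion-specific logit share of a."""
    return prod(c[a] / sum(c[b] for b in menu) for c in criteria)

def pairwise_sum(criteria, menu):
    """Sum of D({a, b}) over unordered pairs of distinct alternatives."""
    return sum(to_decision_conflict(criteria, pair)[1]
               for pair in combinations(menu, 2))

# Illustrative criterion values; u3 is purely hypothetical.
u1 = {"a": 4.0, "b": 1.0, "c": 1.0}
u2 = {"a": 2.0, "b": 2.0, "c": 1.0}
u3 = {"a": 1.0, "b": 3.0, "c": 2.0}
menu = ("a", "b", "c")

u, D = to_decision_conflict([u1, u2], menu)
print(D, pairwise_sum([u1, u2], menu))          # equal when k = 2 (additive)
print(to_decision_conflict([u1, u2, u3], menu)[1],
      pairwise_sum([u1, u2, u3], menu))         # strictly larger when k = 3
print(multicriteria_prob("a", [u1, u2], menu),
      u["a"] / (sum(u.values()) + D))           # the two representations agree
```

With two criteria the menu-level value of D coincides with the sum of its binary-menu values, whereas with three criteria it exceeds that sum, in line with the second statement of the proposition.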

If choice probabilities are representable as in the standard logit models (with or without an outside option) in (3) and (4), the analyst can infer the complete preference ranking of the average decision maker by comparing the Luce utility values of the different alternatives. Given the way in which these utilities are defined, such comparisons can equivalently be thought of as comparisons between the relative choice probabilities of the different alternatives. Now consider a multicriteria logit ρ = (u_l)_{l=1}^k with decision-conflict representation (u, D). We can define the average consumer's generally incomplete revealed preference relations ≻_DC and ≻_MC on X by

$$a \succ_{DC} b \iff u(a) > u(b) \ \text{ and } \ u(a) \geq D(\{a,b\}), \tag{14}$$

$$a \succ_{MC} b \iff u_l(a) \geq u_l(b) \ \text{ for all } l \leq k, \text{ with at least one strict inequality}. \tag{15}$$

Proposition 4

If ρ is a multicriteria logit, then

$$a \succ_{DC} b \implies a \succ_{MC} b. \tag{16}$$

The converse is not true in general.
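Both revealed preference relations are straightforward to compute once a representation is available. The Python sketch below uses the two-criterion values from the three-alternative example that accompanies Table 1, and assumes that D at binary menus is given by the dual-logit formula u_1(a)u_2(b) + u_1(b)u_2(a); the function names are illustrative only.

```python
def prefers_dc(a, b, u, D):
    """a >_DC b as in (14): u(a) > u(b) and u(a) >= D({a, b})."""
    return u[a] > u[b] and u[a] >= D[frozenset((a, b))]

def prefers_mc(a, b, criteria):
    """a >_MC b as in (15): weakly better on every criterion, strictly on one."""
    weak = all(c[a] >= c[b] for c in criteria)
    strict = any(c[a] > c[b] for c in criteria)
    return weak and strict

u1 = {"a": 4, "b": 1, "c": 1}
u2 = {"a": 2, "b": 2, "c": 1}
X = ["a", "b", "c"]

# Decision-conflict representation implied by the two criteria.
u = {x: u1[x] * u2[x] for x in X}
D = {frozenset((x, y)): u1[x] * u2[y] + u1[y] * u2[x]
     for x in X for y in X if x < y}

dc = [(x, y) for x in X for y in X if x != y and prefers_dc(x, y, u, D)]
mc = [(x, y) for x in X for y in X if x != y and prefers_mc(x, y, [u1, u2])]
print(dc)  # [('a', 'c')]
print(mc)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

Every pair ranked by the decision-conflict relation is also ranked by the multicriteria relation, while the latter ranks additional pairs, which is the pattern described by (16).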

Under these definitions, therefore, the multicriteria logit does indeed allow for additional information to be recovered about the average consumer's preferences whenever the data admit both representations, and this additional information does not contradict the more conservative conclusions arrived at through the decision-conflict logit representation. To illustrate, consider the example choice probabilities in Table 1. These data admit both a decision-conflict logit representation (u, D) and a dual-logit representation (u_1, u_2):

Table 1: Model-based preference recovery from dual-logit representable choice probabilities: Example. (Columns: the menus {a, o}, {b, o}, {c, o}, {a, b, o}, {a, c, o}, {b, c, o}, {a, b, c, o}; rows: the choice probabilities of a, b, c and o.)

$$u(a) = 8,\quad u(b) = 2,\quad u(c) = 1,\qquad D(\{a,b\}) = 10,\quad D(\{a,c\}) = 6,\quad D(\{b,c\}) = 3;$$

$$u_1(a) = 4,\quad u_1(b) = 1,\quad u_1(c) = 1,\qquad u_2(a) = 2,\quad u_2(b) = 2,\quad u_2(c) = 1.$$

Therefore, one can infer here that a ≻_DC c and a ≻_MC b ≻_MC c, consistent with (16).

We remark that the above definition of ≻_DC can be formulated equivalently as

$$a \succ_{DC} b \iff u(a) > u(b) + \delta_{a,b},$$

where δ_{a,b} ≥ 0 is given by δ_{a,b} = max{D({a, b}) − u(b), 0}. Thus, ≻_DC can be thought of as a stochastic interval order. For a detailed analysis of stochastic semi-orders where δ_{a,b} is constant for all a, b ∈ X we refer the reader to Horan (2020).

We now study the dual-logit special case of (2) in more detail. In this model there are two criteria u_1, u_2 : X → R_{++}, so that ρ satisfies

$$\rho(a, A\cup\{o\}) = \frac{u_1(a)}{\sum_{b\in A} u_1(b)}\cdot\frac{u_2(a)}{\sum_{b\in A} u_2(b)} \tag{17}$$

and (8), (9) reduce to the simpler decision-conflict logit (u, D) defined by

$$u(a) := u_1(a)\,u_2(a), \tag{18}$$

$$D(\{a,b\}) := u_1(a)\,u_2(b) + u_1(b)\,u_2(a). \tag{19}$$

The argument in the proof of Proposition 1, which extends that in Luce (1959), effectively provides an algorithm for constructing a decision-conflict logit (u, D) from a dataset that conforms to that model's axioms. In the remaining part of this section our aim is to develop a similar algorithm for checking if a (u, D) representation of some ρ also admits a dual logit (u_1, u_2) representation. We will do so by formulating the dual-logit existence problem as one of solving a system of bilinear equations. Toward that end, we apply and adapt the general solution theory for such systems that was recently developed in the multilinear algebra literature by Johnson, Šmigoc, and Yang (2014).

It will be convenient to let X := {a_1, …, a_{|X|}} throughout this section.
For any h ≤ |X| and any distinct h, l ≤ |X|, define the symmetric and Boolean |X| × |X| matrices E^{hh} and E^{hl} by

$$E^{hh}_{ij} := \begin{cases} 1, & \text{if } i = j = h,\\ 0, & \text{otherwise,}\end{cases} \qquad E^{hl}_{ij} := \begin{cases} 1, & \text{if } i = h \text{ and } j = l,\\ 1, & \text{if } i = l \text{ and } j = h,\\ 0, & \text{otherwise.}\end{cases} \tag{20}$$

Given the postulated additive model ρ = (u, D), we wish to recover the functions u_1, u_2 : X → R_{++} by solving the system of bilinear equations

$$u_1 E^{s} u_2 = (u, D)_s, \qquad s = 1, 2, \ldots, m(|X|), \tag{21}$$

where u_1 := (u_1(a_1), …, u_1(a_{|X|})) and u_2 := (u_2(a_1), …, u_2(a_{|X|})) are the system's unknown |X|-vectors that list the values of u_1 and u_2; each E^s is an |X| × |X| matrix that belongs to one of the two types in (20); and, for all s, (u, D)_s > 0 is the value of either u(a_i) or D({a_i, a_j}) at some a_i ∈ X or binary menu {a_i, a_j} ⊆ X, respectively. To illustrate this formulation of the problem when |X| = 3, for example, notice that (21) can be expanded as follows in this special case:

$$u_1\begin{pmatrix}1&0&0\\0&0&0\\0&0&0\end{pmatrix}u_2 = u(a_1),\qquad u_1\begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix}u_2 = u(a_2),\qquad u_1\begin{pmatrix}0&0&0\\0&0&0\\0&0&1\end{pmatrix}u_2 = u(a_3),$$

$$u_1\begin{pmatrix}0&1&0\\1&0&0\\0&0&0\end{pmatrix}u_2 = D(\{a_1,a_2\}),\qquad u_1\begin{pmatrix}0&0&1\\0&0&0\\1&0&0\end{pmatrix}u_2 = D(\{a_1,a_3\}),\qquad u_1\begin{pmatrix}0&0&0\\0&0&1\\0&1&0\end{pmatrix}u_2 = D(\{a_2,a_3\}).$$

Importantly, the assumed additivity of D implies that, in addition to the values of u on X, all information of interest in this problem is contained in the values of D at the binary menus alone. Thus, the bilinear system of m(|X|) equations in 2|X| unknowns that is specified in (21) does not feature any obvious linear dependencies, by which we mean that some of its matrices E^s are linearly dependent. Although the absence of such obvious linear relationships does not by itself imply that the matrices in (21) are, in fact, linearly independent, this necessary condition for the system's solvability (Johnson, Šmigoc, and Yang, 2014) is indeed satisfied in (21).

Lemma 1

The matrices E^s in (21) are linearly independent.

While a necessary condition for the solvability of (21), however, linear independence is not sufficient in general.
Also, the specific and easily testable sufficient conditions that were given in Johnson, Šmigoc, and Yang (2014, Theorems 5.1 & 5.3) are violated by (21). Therefore, we build instead on their more general but less direct solvability method (Theorem 3.1). The general idea of that method (see also Johnson and Link, 2009) is that a particular linear system that is derived from the bilinear one is solved first, and a search is then carried out over all solutions to that linear system in order to find one for which the matrix obtained by columnising the corresponding solution vector is of rank 1 and, therefore, admits a decomposition as the outer product of two vectors that are solutions to the original bilinear system. This search, however, can be hard. Applying their algorithm by exploiting the features of the dual-logit model, we will arrive at a test that allows for a fast solvability decision and, if a solution exists, for its recovery.

Step 1. Start by ordering the equations in (21) as follows (see above for |X| = 3):

$$E^{1} = E^{11},\ \ldots,\ E^{|X|} = E^{|X||X|},\quad E^{|X|+1} = E^{12},\ E^{|X|+2} = E^{13},\ \ldots,\ E^{m(|X|)} = E^{(|X|-1)|X|}.$$

Now define the |X|² × m(|X|) matrix

$$E := \left(\mathrm{vec}(E^{1}), \ldots, \mathrm{vec}(E^{m(|X|)})\right).$$

(Notation: if A is a p × p real matrix, denote by vec(A) the p²-vector that results from A by putting the columns of that matrix in a single column, starting with column 1 and continuing in ascending order.) By Theorem 3.1 in Johnson, Šmigoc, and Yang (2014), if u_1, u_2 solve the bilinear system (21), then they also solve the linear system

$$E^{T}\cdot \mathrm{vec}(u_1\cdot u_2) = D_{|X|}, \tag{22}$$

where vec(u_1 · u_2) is the unknown |X|²-vector in (22) that contains product terms u_1(a_i)u_2(a_j) for i, j ≤ |X|, and D_{|X|} is the m(|X|)-vector of constants defined by

$$D_{|X|} := \left(u(a_1), \ldots, u(a_{|X|}),\ D(\{a_1,a_2\}), \ldots, D(\{a_1,a_{|X|}\}),\ D(\{a_2,a_3\}), \ldots, D(\{a_{|X|-1},a_{|X|}\})\right).$$

Lemma 2

The linear system (22) has infinitely many solutions.
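Step 1 can be made concrete with a few lines of Python. The sketch below builds the Boolean matrices of (20) in the stated order, stacks their columns, and checks two facts for |X| = 3: the stacked vectors have pairwise disjoint supports (which is sufficient for linear independence), and the linearised system has fewer equations than unknowns. The helper names are illustrative, and the criterion values used in the last two lines are those of the example accompanying Table 1.

```python
def basis_matrices(n):
    """E^{hh} for h = 1..n followed by E^{hl} for h < l, as defined in (20)."""
    diag = [[[int(i == j == h) for j in range(n)] for i in range(n)]
            for h in range(n)]
    off = [[[int({i, j} == {h, l}) for j in range(n)] for i in range(n)]
           for h in range(n) for l in range(h + 1, n)]
    return diag + off

def vec(M):
    """Stack the columns of a square matrix, first column first."""
    return [M[i][j] for j in range(len(M)) for i in range(len(M))]

def bilinear(u1, E, u2):
    """Value of the bilinear form u1 E u2."""
    return sum(u1[i] * E[i][j] * u2[j]
               for i in range(len(u1)) for j in range(len(u2)))

n = 3
mats = basis_matrices(n)
supports = [{r for r, x in enumerate(vec(E)) if x} for E in mats]
# Pairwise disjoint, non-empty supports imply linear independence.
disjoint = all(s.isdisjoint(t)
               for k, s in enumerate(supports) for t in supports[k + 1:])
print(len(mats), n * n, disjoint)  # 6 9 True

u1, u2 = [4, 1, 1], [2, 2, 1]
print([bilinear(u1, E, u2) for E in mats])  # [8, 2, 1, 10, 6, 3]
```

The last line reproduces the right-hand side of (21) for the example: the three values of u followed by the three values of D at the binary menus.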

Step 2. Define the square matrix

$$U := \begin{pmatrix} u_1(a_1)u_2(a_1) & u_1(a_1)u_2(a_2) & \cdots & u_1(a_1)u_2(a_{|X|})\\ u_1(a_2)u_2(a_1) & u_1(a_2)u_2(a_2) & \cdots & u_1(a_2)u_2(a_{|X|})\\ \vdots & \vdots & \ddots & \vdots\\ u_1(a_{|X|})u_2(a_1) & u_1(a_{|X|})u_2(a_2) & \cdots & u_1(a_{|X|})u_2(a_{|X|}) \end{pmatrix} \tag{23}$$

that is obtained by columnising the general solution vec(u_1 · u_2) to (22) after applying the unvec linear transformation that maps R^{m²} into the space of m × m real matrices R^{m×m} and is defined by unvec(vec(A)) := A.

Notice that the diagonal entries of U are determined uniquely in the general solution to (22). In addition, U contains (|X|² − |X|)/2 strictly positive free variables that correspond to the upper-diagonal terms u_1(a_i)u_2(a_j) with i < j, and it also follows from (19) that the strictly positive lower-diagonal terms are then pinned down uniquely by

$$u_1(a_j)\,u_2(a_i) = D(\{a_i,a_j\}) - u_1(a_i)\,u_2(a_j). \tag{24}$$

If U is of rank 1 under some admissible solution to the linear system, then it follows from the Singular Value Decomposition Theorem (see Theorem 3.7.5 in Horn and Johnson, 1985, for example) that there exist |X|-vectors u_1, u_2 such that

$$U = u_1 \cdot u_2.$$

That is, U can be decomposed as the outer product of the two solution vectors. We can now make a stronger formal statement of this fact.

Lemma 3

The system (21) is solvable if and only if U is of rank 1 for some solution to (22).

Step 3. As noted earlier, checking if a rank-1 matrix U exists for some solution to the linearized counterpart of some bilinear problem is generally difficult (see pp. 1557-58 in Johnson, Šmigoc, and Yang, 2014). Exploiting the specific structure of our problem that features the dependence relations (24) between the symmetric off-diagonal terms, to check if U in (23) is of rank 1 under some solution to (22) we must ultimately solve the following (generally overdetermined) system of symmetric polynomial equations of degree 2 in the 2|X| original unknowns u_1(a_1), …, u_1(a_{|X|}), u_2(a_1), …, u_2(a_{|X|}):

$$\begin{pmatrix} u_1(a_1)u_2(a_1)\\ \vdots\\ u_1(a_{|X|})u_2(a_{|X|})\\ u_1(a_1)u_2(a_2) + u_1(a_2)u_2(a_1)\\ u_1(a_1)u_2(a_3) + u_1(a_3)u_2(a_1)\\ \vdots\\ u_1(a_{|X|-1})u_2(a_{|X|}) + u_1(a_{|X|})u_2(a_{|X|-1}) \end{pmatrix} = \begin{pmatrix} u(a_1)\\ \vdots\\ u(a_{|X|})\\ D(\{a_1,a_2\})\\ D(\{a_1,a_3\})\\ \vdots\\ D(\{a_{|X|-1},a_{|X|}\}) \end{pmatrix} \tag{25}$$

The next result is now immediate.

Lemma 4

The matrix U is of rank 1 for some solution to (22) if and only if (25) is solvable.

In light of Steps 1–3 and Lemmas 1–4, we can finally state the following.
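One simple way to decide solvability of (25) in small examples is sketched below in Python. It is not the procedure developed in the text but a brute-force illustration under stated assumptions: u_1 is normalised to 1 at the first alternative, each pair (a_1, a_j) then yields a quadratic equation with at most two positive roots for u_1(a_j), and every combination of roots is checked against the remaining equations. The search therefore inspects at most 2^(|X|−1) candidates and terminates. All function names are hypothetical.

```python
from itertools import product
from math import isclose, sqrt

def ratio_roots(u_i, u_j, d_ij, tol=1e-12):
    """Positive roots x of u_j x^2 - d_ij x + u_i = 0 (x plays the role of t_i / t_j)."""
    disc = d_ij ** 2 - 4 * u_i * u_j
    if disc < -tol:
        return []
    r = sqrt(max(disc, 0.0))
    return sorted({(d_ij - r) / (2 * u_j), (d_ij + r) / (2 * u_j)})

def recover_dual_logit(u, D, tol=1e-9):
    """Search for (u1, u2) solving (25), with u1 normalised to 1 at the first alternative.

    u: list of values u(a_i); D: dict {(i, j): D({a_i, a_j})} for i < j.
    Returns all solutions found as (u1, u2) pairs; an empty list means none exists.
    """
    n = len(u)
    options = [ratio_roots(u[0], u[j], D[(0, j)]) for j in range(1, n)]
    solutions = []
    for xs in product(*options):
        t = [1.0] + [1.0 / x for x in xs]       # candidate values of u1
        v = [u[i] / t[i] for i in range(n)]     # implied values of u2
        if all(isclose(t[i] * v[j] + t[j] * v[i], D[(i, j)], rel_tol=tol)
               for i in range(n) for j in range(i + 1, n)):
            solutions.append((t, v))
    return solutions

# Values of u and D from the example accompanying Table 1.
u = [8, 2, 1]
D = {(0, 1): 10, (0, 2): 6, (1, 2): 3}
for u1, u2 in recover_dual_logit(u, D):
    print(u1, u2)
```

For the example values the search returns two solutions, which are rescalings of the two possible labellings of the pair of criteria. A negative discriminant at any pair is enough to rule out a dual-logit representation.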

Proposition 5

There exists a finitely terminating algorithm that decides if an additive decision-conflict logit (u, D) is a dual logit (u_1, u_2) and, when this is the case, recovers the latter from the former.

We remark that existing numerical methods (e.g. as implemented in Mathematica) allow for instantaneous pass/fail applications of this algorithm and solution recovery to (25) and, in view of the previous steps, also to (17).

Notice that, with (u, D) being held fixed on the right hand side of (25), a pair (u_1, u_2) that solves (25) is unique up to multiplication by (α, 1/α) for any α > 0. This does not contradict the richer joint ratio-scale uniqueness property of a dual logit representation that is implied by Proposition 2. Notice further that, although the analyst can in practice proceed directly to (25), the previous steps are not redundant because they show that the reduced system in (25) contains all the relevant information for a generally much larger system; they place the existence of a dual logit in the class of bilinear-system problems; and they demonstrate that there is no obviously faster way towards checking if (21) is solvable.

After normalizing an arbitrary but unique u_l(a_i) on the left hand side of (25) we observe that this becomes an overdetermined polynomial system whenever |X| > 2, and its extra equations increase quadratically in the number of alternatives (see proof of Lemma 2). Not surprisingly, therefore, the Gröbner-basis computer-algebraic method that is often used to identify conditions for the existence of solutions to symbolic polynomial systems (see, for example, Buchberger, 1998) produces in our case sets of quadratic equations on the values of u and D whose number is increasing non-linearly in the cardinality of X. Since, by the proof of Proposition 1, these ultimately translate into as many quadratic equations in the choice probabilities at binary menus, this means that, despite the simple characterization of the model in the binary-choice special case of Proposition 6 that we establish in Section 7, the dual logit is not finitely axiomatizable in the sense of Scott and Suppes (1958). That is, there is no fixed set of sentences of first-order logic that can characterize the model on arbitrary finite domains. As first proved in Fudenberg, Iijima, and Strzalecki (2015) with their Acyclicity axiom, the “Fechnerian” random choice model where ρ(a, {a, b}) = F(u(a) − u(b)) for some u : X → R and strictly increasing F : R → R is not finitely axiomatizable either, the restrictions in that case comprising domain-dependent systems of linear inequalities. A similarly compact axiom that would summarize the varying sets of quadratic equations that are necessary and sufficient for dual logit representations is currently elusive.

We finally note that, by analogy to the bilinear-system identification of the dual logit, an algorithm that decides whether a given ρ = (u, D) representation also admits a multicriteria logit representation with k > 2 would require solving a multilinear system of equations.
Although such problems can be formalised using tensors, and despite recent advances in solving problems within this class (see, for example, Brazell, Li, Navasca, and Tamon, 2013), it appears that no general tensor-inversion method is presently available that can lead to a general decidability/solvability algorithm for the general multicriteria logit like the one introduced for the dual logit, which relied heavily on the additivity property of that special case (cf. Proposition 3).

Using choice-probability data alone, the explanatory power and goodness-of-fit performance of the decision-conflict and multicriteria logit models can in principle be assessed non-parametrically by stochastic-choice adaptations of the well-known Houtman and Maks (1985) method in revealed preference analysis or the maximal separation method recently introduced in Apesteguia and Ballester (2020). In addition to such non-parametric analyses, however, of potential interest in empirical applications is the question of whether these models can also be embedded in some generalized discrete-choice microeconometric framework. In particular, if more data are available about the consumers and/or product characteristics, is it possible in principle to use these in conjunction with those models to estimate the incomplete consumer preferences in a sample when there is reason to believe that the no-choice outside option may have been chosen due to decision difficulty rather than due to the available market alternatives not being sufficiently desirable or the consumers not paying sufficient attention to them? In this section we state the assumptions under which the answer to this question is positive for the multicriteria logit model.

Building on and extending the additive random utility methodology pioneered in Marschak (1960) and McFadden (1973), we exploit the fact that, in the multicriteria logit ρ = (u_l)_{l=1}^k, the utility values across all criteria are menu-independent.
This allows for incorporating the well-established analytical framework of logit models by imposing an additive and type-1 extreme value error structure on each u_l and assuming that errors are independently and identically distributed across all alternatives and utility criteria. However, because the underlying preference relation is generally incomplete and, as a consequence, the consumer's decision rule is unclear, to complete the discrete-choice specification of our model we also need to introduce a behavioural assumption.

To this end, we build on some of our previous work (Gerasimou, 2018; Section 2) and assume that the simple deterministic choice rule followed by the average consumer is that of maximally dominant choice, whereby the consumer maximizes a transitive and menu-independent but possibly incomplete preference relation ≿ over X in the sense that, for every menu A ⊆ X, the set C(A) of choosable alternatives in A is such that

$$C(A) \neq \emptyset \iff \text{there is } a \in A \text{ such that } a \succsim b \text{ for all } b \in A,$$

$$C(A) = \emptyset \iff \text{for all } a \in A \text{ there is } b \in A \text{ such that } a \not\succsim b.$$

Once it is clear that the preference preorder ≿ is possibly incomplete and hence that C(A) is potentially empty, the above two expressions can be summarized compactly in the more familiar one

$$C(A) = \{a \in A : a \succsim b \text{ for all } b \in A\}. \tag{26}$$

Maximally dominant choice therefore predicts that a market alternative is chosen from a menu if and only if it is preferred to all other feasible alternatives, and that the choice-delay outside option is chosen instead when no such alternative exists. This model, which is supported empirically in the experimental data analysed in Costa-Gomes et al. (2020), dictates active choices that are consistent with WARP and the other revealed-preference axioms.

Next, as was shown independently in Evren and Ok (2011) and Kochov (2020), any preorder ≿ on any domain X is representable by a set of (in our case finitely many) functions u_1, …, u_k : X → R in the sense that

$$a \succsim b \iff u_l(a) \geq u_l(b) \text{ for all } l \leq k.$$

Indeed, applying these authors' argument to our environment by writing X := {a_1, …, a_{|X|}} and U_≿{a_i} := {a_j ∈ X : a_j ≿ a_i}, to obtain such a representation it suffices to define, for each l ≤ |X|, u_l(a_i) := 1 if a_i ≿ a_l and u_l(a_i) := 0 otherwise. Given this fact, we can now rewrite (26) as

$$C(A) = \{a \in A : u_l(a) \geq u_l(b) \text{ for all } b \in A \text{ and all } l \leq k\}. \tag{27}$$

Under any such representation, the strict preference relation ≻ that is defined as the asymmetric part of ≿ must satisfy a ≻ b iff u_l(a) ≥ u_l(b) for all l ≤ k and with at least one inequality strict. This is analogous to the definition of the revealed preference relation ≻_MC in the previous section. Note, however, that requiring weak, semi-strict or strict inequalities is irrelevant for the purposes of this section, as will be clear below.

We assume now that the average decision maker's unobserved utility from alternative a_i under criterion u_l, l ≤ k, is written as

$$U^{l}_{ni} := V^{l}_{ni} + \epsilon^{l}_{ni},$$

where V^l_{ni} ≡ V^l_{ni}(x_{ni}; β^l_n) := β^l · x_{ni} captures her representative utility, x_{ni} is a vector of product and/or consumer characteristics, β^l, l ≤ k, is a vector of parameters to be estimated, and ε^l_{ni} is a random variable that is independently and identically distributed across all i and l according to the type-1 extreme-value probability density function

$$f(\epsilon^{l}_{ni}) = e^{-\epsilon^{l}_{ni}}\, e^{-e^{-\epsilon^{l}_{ni}}}.$$
In conjunction with the maximally dominant choice assumption (26)-(27), this specification allows for writing

$$\begin{aligned}
\rho_n(a_i, A\cup\{o\}) &= \Pr\left(U^{l}_{ni} \geq U^{l}_{nj}\ \ \forall j \neq i\ \ \forall l \leq k\right)\\
&= \Pr\left(V^{l}_{ni} + \epsilon^{l}_{ni} \geq V^{l}_{nj} + \epsilon^{l}_{nj}\ \ \forall j \neq i\ \ \forall l \leq k\right)\\
&= \int_{\epsilon} I\left(\epsilon^{l}_{nj} \leq \epsilon^{l}_{ni} + V^{l}_{ni} - V^{l}_{nj}\ \ \forall j \neq i\ \ \forall l \leq k\right) f(\epsilon_n)\, d\epsilon_n\\
&= \prod_{l=1}^{k}\int_{-\infty}^{\infty}\left(\prod_{j\neq i} e^{-e^{-(\epsilon^{l}_{ni} + V^{l}_{ni} - V^{l}_{nj})}}\right) e^{-\epsilon^{l}_{ni}}\, e^{-e^{-\epsilon^{l}_{ni}}}\, d\epsilon^{l}_{ni}\\
&= \prod_{l=1}^{k}\frac{e^{\beta^{l}\cdot x_{ni}}}{\sum_{j=1}^{|A|} e^{\beta^{l}\cdot x_{nj}}},
\end{aligned} \tag{28}$$

where I(·) is the indicator function. The step from the third to the fourth equation makes use of the above distributional and independence assumption on ε^l_{ni}, so that

$$\Pr\left(U^{l}_{ni} \geq U^{l}_{nj}\ \ \forall j \neq i\ \ \forall l \leq k\right) = \prod_{l=1}^{k}\Pr\left(U^{l}_{ni} \geq U^{l}_{nj}\ \ \forall j \neq i\right).$$

The last step follows from the derivation of McFadden's conditional logit model when errors are type-1 extreme value (see pp. 74-75 in Train, 2009).

Holding now the choice menu A := {a_1, …, a_m} fixed throughout the rest of this section, for each i ≤ m we write

$$p_{ni} := \rho_n(a_i, A\cup\{o\}),$$

where p_{no} := ρ_n(o, A ∪ {o}) = 1 − Σ_{i=1}^m p_{ni} > 0. Next, letting y_n denote the n-th individual's observed choice from A, define the m binary variables y_{ni} by

$$y_{ni} := \begin{cases} 1, & \text{if } y_n = a_i,\\ 0, & \text{otherwise.}\end{cases}$$

The multinomial density for a given choice by agent n can then be written as

$$f(y_n) = \prod_{i=1}^{m} p_{ni}^{\,y_{ni}}.$$

Assuming an exogenous sample and covariates x_{ni} for agent n and alternative i, the log-likelihood function that results from the independent choices of the N agents is now given by

$$\begin{aligned}
\mathcal{L}(\beta^{1},\ldots,\beta^{k}) &= \sum_{n}\sum_{i} y_{ni}\ln p_{ni}\\
&= \sum_{n}\sum_{i} y_{ni}\ln\left(\prod_{l=1}^{k}\frac{e^{\beta^{l}\cdot x_{ni}}}{\sum_{j} e^{\beta^{l}\cdot x_{nj}}}\right)\\
&= \sum_{n}\sum_{i} y_{ni}\sum_{l}\left(\beta^{l}\cdot x_{ni}\right) - \sum_{n}\sum_{i} y_{ni}\sum_{l}\ln\left(\sum_{j} e^{\beta^{l}\cdot x_{nj}}\right).
\end{aligned}$$

The first-order conditions for its maximization are

$$\frac{d\mathcal{L}}{d\beta^{l}} = \sum_{n}\sum_{i} y_{ni}x_{ni} - \sum_{n}\sum_{i} x_{ni}\lambda^{l}_{ni} = 0, \qquad l = 1,\ldots,k,$$

where

$$\lambda^{l}_{ni} := \frac{e^{\widehat{\beta}^{l}\cdot x_{ni}}}{\sum_{j} e^{\widehat{\beta}^{l}\cdot x_{nj}}} > 0, \qquad \sum_{i}\lambda^{l}_{ni} = 1, \qquad \text{and} \qquad \prod_{l}\lambda^{l}_{ni} = p_{ni} \quad \text{for all } i, n, l,$$

with β̂^l the maximum-likelihood estimator of the l-th criterion. Similar to the first-order condition that is associated with the standard conditional logit with a single criterion, rearranging and dividing through by N we get

$$\frac{1}{N}\sum_{n}\sum_{i} y_{ni}x_{ni} = \frac{1}{N}\sum_{n}\sum_{i} x_{ni}\lambda^{l}_{ni}, \qquad l = 1,\ldots,k. \tag{29}$$

Compared to the case of the standard logit, the left hand side of (29) remains interpretable as the sample average x̄ of the covariates x_{ni} over those alternatives that were actually chosen by the N agents. The right hand side, however, is the average x̂ of these variables that is predicted by the choice probabilities of the model's l-th criterion (which are additive over the set of market alternatives) and not by the actual model-predicted choice probabilities (which are not).
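The choice probabilities in (28) and the log-likelihood above are easy to evaluate numerically. The following Python sketch is one minimal way to do so; the data layout, function names and parameter values are all hypothetical, and agents who deferred are simply skipped, mirroring the fact that the sum in the log-likelihood runs over the indicators y_{ni} of market alternatives only.

```python
from math import exp, log

def criterion_shares(beta, X_n):
    """Logit shares under one criterion: exp(beta.x_i) / sum_j exp(beta.x_j)."""
    scores = [exp(sum(b * x for b, x in zip(beta, x_i))) for x_i in X_n]
    total = sum(scores)
    return [s / total for s in scores]

def choice_probs(betas, X_n):
    """p_ni as in (28): product over criteria of the criterion-specific shares."""
    probs = [1.0] * len(X_n)
    for beta in betas:
        probs = [p * s for p, s in zip(probs, criterion_shares(beta, X_n))]
    return probs

def log_likelihood(betas, data):
    """Sum of ln p_ni over agents who chose a market alternative.

    data: list of (X_n, choice), where X_n lists the covariate vectors of the
    alternatives and choice is an index into X_n, or None for deferral.
    """
    ll = 0.0
    for X_n, choice in data:
        if choice is not None:
            ll += log(choice_probs(betas, X_n)[choice])
    return ll

# Hypothetical covariates for three alternatives and two criteria.
X_n = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
betas = [[0.5, -1.0], [1.0, 0.2]]
p = choice_probs(betas, X_n)
print(p, 1.0 - sum(p))  # market probabilities and the outside-option residual
print(log_likelihood(betas, [(X_n, 0), (X_n, None), (X_n, 2)]))
```

With a single criterion the market probabilities sum to one, so the outside option receives zero probability; with two or more criteria the residual is strictly positive whenever the menu has at least two alternatives.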
But since (29) requires that the above optimality condition be satisfied by the estimator of each of the k criteria, it follows that, if the log-likelihood function has a unique maximizing point, the multicriteria logit collapses to the power logit (see Section 5.1) where there is a single vector β̂ such that

$$p_{ni} = \left(\frac{e^{\widehat{\beta}\cdot x_{ni}}}{\sum_{j} e^{\widehat{\beta}\cdot x_{nj}}}\right)^{k} = \frac{e^{k\widehat{\beta}\cdot x_{ni}}}{\left(\sum_{j} e^{\widehat{\beta}\cdot x_{nj}}\right)^{k}}. \tag{30}$$

The remaining task in this case is to use the log-likelihood function alongside some goodness-of-fit measure such as the Akaike or Bayes-Schwarz Information Criteria to estimate the data-optimal exponent k in (30). If on the other hand there are multiple maximizers, then the possibly distinct criteria in the resulting multicriteria logit can be taken to coincide with the combination of those maximizers that provide the best fit under some appropriate measure. Finally, the estimated model's explanatory value relative to the standard logit with an outside option or some other similarly evaluable model that allows for choice delay can also be assessed along these lines.

We finally focus on the empirically important special case of binary (active) choice where |X| = 2 and the outside option is also feasible. In particular we study two generally distinct but overlapping classes of decision-conflict logit models in such domains that allow for complementary insights to be offered regarding the shape of preferences and the nature of decision difficulty in a given dataset.

Going back to the dual-logit model in the present binary-choice environment, we introduce the following novel axiom that is easily seen to be satisfied by all dual-logit models irrespective of the total number of alternatives.

Constrained Binary Symmetry (CBS)

For all a, b ∈ X,

$$\frac{\rho(o,\{a,b,o\})}{2\,\rho(a,\{a,b,o\})} \geq \frac{2\,\rho(b,\{a,b,o\})}{\rho(o,\{a,b,o\})}. \tag{31}$$

An important implication of bounding the two likelihood ratios as in (31) is that when the two market alternatives are sufficiently choice equi-probable, their choice probabilities are “low” and the outside option is the strictly most likely choice outcome. This intuitive implication of CBS is a disciplined formalization of the idea that decision conflict is increased when the feasible market alternatives are more or less equally attractive. It is illustrated in Figure 4, the solid part of which depicts the region in the simplex that contains the permissible probability distributions (ρ(o), ρ(a), ρ(b)) that arise at a binary menu {a, b} and are compatible with (31).

Figure 4: Permissible choice probability distributions under the binary dual logit (meshed region).

Proposition 6

The following are equivalent for a random non-forced choice model ρ on X with |X| = 2:

1. ρ satisfies Desirability, Positivity and Constrained Binary Symmetry (CBS).
2. ρ is a dual logit (u_1, u_2), and u_1 = u_2 if and only if CBS holds with equality.

As was anticipated in Section 5.3, the binary dual logit is the only case where a simple and easily interpretable axiomatization is possible for this model. The characterization further clarifies that the quadratic power logit model that corresponds to a special case of (7) emerges in this binary choice environment as the special case of the dual logit where CBS holds with equality. This special case is depicted on the boundary of the region highlighted in Figure 4.
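The restriction that a binary dual logit places on choice probabilities can be checked numerically. The Python sketch below relies on a fact that follows from the arithmetic-geometric mean inequality applied to the two cross terms u_1(a)u_2(b) and u_1(b)u_2(a): under a dual logit, ρ(o)² − 4ρ(a)ρ(b) is non-negative, and it is zero exactly when the two criteria are proportional at the menu. The function names and numerical values are illustrative assumptions.

```python
def binary_dual_logit(u1, u2):
    """Choice probabilities (rho_a, rho_b, rho_o) at the menu {a, b, o}.

    u1, u2: pairs (value at a, value at b) of strictly positive numbers.
    """
    total = (u1[0] + u1[1]) * (u2[0] + u2[1])
    rho_a = u1[0] * u2[0] / total
    rho_b = u1[1] * u2[1] / total
    return rho_a, rho_b, 1.0 - rho_a - rho_b

def symmetry_slack(rho_a, rho_b, rho_o):
    """rho_o^2 - 4 * rho_a * rho_b, which is non-negative under a dual logit."""
    return rho_o ** 2 - 4.0 * rho_a * rho_b

distinct = binary_dual_logit((4.0, 1.0), (2.0, 2.0))   # two different criteria
identical = binary_dual_logit((3.0, 1.0), (3.0, 1.0))  # a single repeated criterion
print(distinct, symmetry_slack(*distinct))    # strictly positive slack
print(identical, symmetry_slack(*identical))  # zero slack: quadratic power logit
```

When the two market alternatives are equally likely in this sketch, the bound forces the outside option to be at least twice as likely as either of them.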

A binary random non-forced model $\rho = (u, D)$ is a parabolic decision-conflict logit if there is $\alpha > 0$ such that

$$D(\{a, b\}) = \frac{\alpha}{|u(a) - u(b)|}. \tag{32}$$

This model therefore predicts that the outside option's appeal is decreasing in the absolute utility-difference between market alternatives, with $\alpha$ capturing the sensitivity of such a monotonic relation. The intuition here is that as the absolute utility difference becomes larger, one of the two market alternatives becomes sufficiently more attractive for an active choice to be increasingly more likely.

Figure 5: Permissible choice probability distributions under the parabolic decision-conflict logit for three lower bounds on $\alpha$, the last being $\alpha \geq 1$ (meshed regions).
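The following sketch illustrates the comparative static just described. It reads (32) as $D(\{a,b\}) = \alpha / |u(a) - u(b)|$ and uses the decision-conflict logit formula $\rho(a, \{a,b,o\}) = u(a) / (u(a) + u(b) + D(\{a,b\}))$; both readings, the function names `parabolic_probs` and `implied_alpha`, and the numerical values are assumptions made for illustration.

```python
def parabolic_probs(u_a, u_b, alpha):
    """Choice probabilities at {a, b, o} when D({a,b}) = alpha / |u(a) - u(b)|."""
    if u_a <= 0 or u_b <= 0 or alpha <= 0:
        raise ValueError("utilities and alpha must be strictly positive")
    if u_a == u_b:
        raise ValueError("the representation requires u(a) != u(b)")
    d = alpha / abs(u_a - u_b)  # value of the outside option at {a, b}
    total = u_a + u_b + d
    return {"a": u_a / total, "b": u_b / total, "o": d / total}


def implied_alpha(rho):
    """Back out alpha from probabilities with rho(a) != rho(b).

    Uses the normalisation u(b) = 1, so that u(a) = rho(a)/rho(b) and
    D({a,b}) = rho(o)/rho(b).
    """
    u_a = rho["a"] / rho["b"]
    d = rho["o"] / rho["b"]
    return d * abs(u_a - 1.0)


if __name__ == "__main__":
    # Deferral becomes less likely as the utility gap widens (alpha fixed).
    for u_a in (1.5, 2.0, 4.0):
        print(u_a, parabolic_probs(u_a, 1.0, alpha=1.0)["o"])
```

Under these assumptions the deferral probability falls as $u(a)$ moves away from $u(b)$, and `implied_alpha` recovers the sensitivity parameter whenever the data were generated with $u(b) = 1$.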

It is straightforward that the following simple condition, together with Desirability and Positivity, is necessary and sufficient for such a representation.

Asymmetry

$\rho(a, \{a, b, o\}) \neq \rho(b, \{a, b, o\})$.

We state this simple equivalence without proof.

Proposition 7

The following are equivalent for a random non-forced choice model $\rho$ when $|X| = 2$:

1. $\rho$ satisfies Desirability, Positivity and Asymmetry.
2. $\rho$ is a parabolic decision-conflict logit.

Figure 5 depicts collections of choice probability distributions that admit a parabolic decision-conflict representation for some $\alpha' \geq \alpha$ under various values of $\alpha$. While the range of such distributions expands as $\alpha \to 0$, Asymmetry prevents it from covering the interior of the entire simplex.

As can be readily seen by comparing Figures 4 and 5, this model is distinct from the binary dual logit in general, but it becomes a special case of the latter when $\alpha$ is sufficiently large, with the cut-off $\alpha \simeq 1$. The dual logit is more relevant descriptively either when the market alternatives are similarly attractive and this similarity translates into high decision conflict or when one of them is the unambiguously superior option. By contrast, given some $\alpha >$ […] $\alpha$ for which the model has explanatory as well as predictive power.

Understanding the "easy" and "hard" parts of individuals' preferences as revealed by their active-choice or choice-delay decisions at the time when they were first presented with a decision problem is important from a methodological and also from a policy and effective choice-architecture point of view. This paper contributes to this goal by proposing a class of tractable stochastic choice models in which the choice-deferral outside option is more likely to be chosen when a clearly superior feasible alternative does not exist. This prediction is consistent with the conclusions in recent meta-analyses of choice-overload phenomena and reason-based approaches to decision making that have been proposed in psychology and marketing research, and differs from the predictions made by existing random choice models of limited attention or other bounded-rational behaviour. Therefore, the new models complement existing ones in empirically relevant ways.

Unlike the standard logit with an outside option and random choice models of limited attention, the proposed models retain the IIA axiom for pairs of market alternatives but not when the outside option is involved. The fact that the multicriteria logit was shown to admit a discrete choice formulation under reasonable distributional and behavioural assumptions makes it applicable not only in empirical work but, upon extending Anderson, de Palma, and Thisse (1992) and other contributions in the industrial organization literature with product differentiation, also in monopolistic or oligopolistically competitive markets where consumers are potentially choice-overloaded by menu or attribute complexity.
Modelling the effects of choice overload in market outcomes is an active area of research, and different ideas have recently been explored in Kamenica (2008), Gerasimou and Papi (2018), Hefti, Liu, and Schmutzler (2020) and Nocke and Rey (2020).

We conclude by emphasizing that choice modelling and preference elicitation in the presence of limited comparability and decision difficulty is a challenging task that necessitates a variety of methodological approaches. As in other parts of this research programme (Gerasimou, 2016, 2018; Costa-Gomes et al., 2020), our focus here has been on the observed active choices when the choice-delay outside option is assumed to be feasible to the decision maker and observable to the analyst, as has also been the case with several recent models of random choice. Other approaches that are more appropriate when these conditions are not satisfied include those developed in the literatures of preference imprecision (Cubitt, Navarro-Martinez, and Starmer, 2015); preference for randomization (Agranov and Ortoleva, 2017, 2020); preference for flexibility (Kreps, 1979; Dekel, Lipman, and Rustichini, 2001; Danan and Ziegelmeyer, 2006); and undominated (Schwartz, 1976; Eliaz and Ok, 2006) or cyclical forced choices (Mandler, 2005, 2009; Evren, Nishimura, and Ok, 2019).

References

Abaluck, J., and A. Adams (2020): “What Do Consumers Consider Before They Choose? Identification from Asymmetric Demand Responses,” Quarterly Journal of Economics, forthcoming.

Agranov, M., and P. Ortoleva (2017): “Stochastic Choice and Preferences for Randomization,” Journal of Political Economy, 125, 40–68.

Agranov, M., and P. Ortoleva (2020): “Ranges of Preferences and Randomization,” Working Paper.

Aguiar, V. H., M. J. Boccardi, N. Kashaev, and J. Kim (2018): “Does Random Consideration Explain Behavior when Choice is Hard? Evidence from a Large-scale Experiment,” Working Paper.

Anderson, C. J. (2003): “The Psychology of Doing Nothing: Forms of Decision Avoidance Result from Reason and Emotion,” Psychological Bulletin, 129, 139–167.

Anderson, S. P., A. de Palma, and J.-F. Thisse (1992): Discrete Choice Theory of Product Differentiation. Cambridge, MA: MIT Press.

Apesteguia, J., and M. Ballester (2020): “Separating Predicted Randomness from Residual Behavior,” Journal of the European Economic Association, forthcoming.

Aumann, R. J. (1962): “Utility Theory without the Completeness Axiom,” Econometrica, 30, 445–462.

Barseghyan, L., F. Molinari, and M. Thirkettle (2019): “Discrete Choice under Risk with Limited Consideration,” Working Paper.

Beggs, S., S. Cardell, and J. Hausman (1981): “Assessing the Potential Demand for Electric Cars,” Journal of Econometrics, 17, 1–19.

Ben-Akiva, M. E. (1973): “Structure of Passenger Travel Demand Models,” Ph.D. thesis, MIT.

Bhatia, S., and T. L. Mullett (2016): “The Dynamics of Deferred Decision,” Cognitive Psychology, 86, 112–151.

Bhatia, S., and T. L. Mullett (2018): “Similarity and Decision Time in Preferential Choice,” Quarterly Journal of Experimental Psychology, 71, 1276–1280.

Block, H. D., and J. Marschak (1960): “Random Orderings and Stochastic Theories of Response,” in Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, ed. by I. Olkin, pp. 97–132. Stanford, CA: Stanford University Press.

Brady, R. L., and J. Rehbeck (2016): “Menu-Dependent Stochastic Feasibility,” Econometrica, 84, 1203–1223.

Brazell, M., N. Li, C. Navasca, and C. Tamon (2013): “Solving Multilinear Systems via Tensor Inversion,” SIAM Journal on Matrix Analysis and Applications, 34, 542–570.

Buchberger, B. (1998): “Introduction to Gröbner Bases,” in Gröbner Bases and Applications, ed. by B. Buchberger, pp. 3–31. Cambridge: Cambridge University Press.

Buturak, G., and O. Evren (2017): “Choice Overload and Asymmetric Regret,” Theoretical Economics, 12, 1029–1056.

Caplin, A., M. Dean, and J. Leahy (2019): “Rationally Inattentive Behavior: Characterizing and Generalizing Shannon Entropy,” Working Paper (first version: 2013).

Carroll, G. D., J. J. Choi, D. Laibson, B. C. Madrian, and A. Metrick (2009): “Optimal Defaults and Active Decisions,” Quarterly Journal of Economics, 124, 1639–1674.

Cattaneo, M. D., X. Ma, Y. Masatlioglu, and E. Suleymanov (2020): “A Random Attention Model,” Journal of Political Economy, 128, 2796–2836.

Cerreia-Vioglio, S., F. Maccheroni, M. Marinacci, and A. Rustichini (2020): “A Canon of Probabilistic Rationality,” Working Paper.

Chernev, A., U. Böckenholt, and J. Goodman (2015): “Choice Overload: A Conceptual Review and Meta-Analysis,” Journal of Consumer Psychology, 25, 333–358.

Costa-Gomes, M., C. Cueva, G. Gerasimou, and M. Tejiščák (2020): “Choice, Deferral and Consistency,” Working Paper (first version: 2014).

Cubitt, R. P., D. Navarro-Martinez, and C. Starmer (2015): “On Preference Imprecision,” Journal of Risk and Uncertainty, 50, 1–34.

Danan, E., A. Guerdjikova, and A. Zimper (2012): “Indecisiveness Aversion and Preference for Commitment,” Theory and Decision, 72, 1–13.

Danan, E., and A. Ziegelmeyer (2006): “Are Preferences Complete? An Experimental Measurement of Indecisiveness Under Risk,” Working Paper.

Dardanoni, V., P. Manzini, M. Mariotti, and C. J. Tyson (2020): “Inferring Cognitive Heterogeneity from Aggregate Choices,” Econometrica, 88, 1269–1296.

Deb, J., and J. Zhou (2018): “Reference Dependence and Choice Overload,” Working Paper.

Dekel, E., B. L. Lipman, and A. Rustichini (2001): “Representing Preferences with a Unique Subjective State Space,” Econometrica, 69, 891–934.

Dhar, R., and I. Simonson (2003): “The Effect of Forced Choice on Choice,” Journal of Marketing Research, 40, 146–160.

Dubra, J., F. Maccheroni, and E. A. Ok (2004): “Expected Utility Theory without the Completeness Axiom,” Journal of Economic Theory, 115, 118–133.

Echenique, F., K. Saito, and G. Tserenjigmid (2018): “The Perception-Adjusted Luce Model,” Mathematical Social Sciences, 93, 67–76.

Eliaz, K., and E. A. Ok (2006): “Indifference or Indecisiveness? Choice-theoretic Foundations of Incomplete Preferences,” Games and Economic Behavior, 56, 61–86.

Evren, O., H. Nishimura, and E. A. Ok (2019): “Top-Cycles and Revealed Preference Structures,” Working Paper.

Evren, O., and E. A. Ok (2011): “On the Multi-Utility Representation of Preference Relations,” Journal of Mathematical Economics, 47, 554–563.

Frick, M. (2016): “Monotone Threshold Representations,” Theoretical Economics, 11, 757–772.

Fudenberg, D., R. Iijima, and T. Strzalecki (2015): “Stochastic Choice and Revealed Perturbed Utility,” Econometrica, 83, 2371–2409.

Fudenberg, D., and T. Strzalecki (2015): “Dynamic Logit with Choice Aversion,” Econometrica, 83, 651–691.

Galaabaatar, T., and E. Karni (2013): “Subjective Expected Utility with Incomplete Preferences,” Econometrica, 81, 255–284.

Gerasimou, G. (2011): “Essays on the Theory of Choice, Rationality and Indecision,” Ph.D. thesis, University of Cambridge.

Gerasimou, G. (2016): “Asymmetric Dominance, Deferral and Status Quo Bias in a Behavioral Model of Choice,” Theory and Decision, 80, 295–312.

Gerasimou, G. (2018): “Indecisiveness, Undesirability and Overload Revealed Through Rational Choice Deferral,” Economic Journal, 128, 2450–2479.

Gerasimou, G., and M. Papi (2018): “Duopolistic Competition with Choice-Overloaded Consumers,” European Economic Review, 101, 330–353.

Hara, K., E. A. Ok, and G. Riella (2019): “Coalitional Expected Multi-Utility Theory,” Econometrica, 87, 933–980.

Hefti, A., S. Liu, and A. Schmutzler (2020): “Preferences, Confusion and Competition,” Working Paper.

Horan, S. (2019): “Random Consideration and Choice: A Case Study of Default Options,” Mathematical Social Sciences, 102, 73–84.

Horan, S. (2020): “Stochastic Semi-Orders,” Working Paper.

Horn, R. A., and C. R. Johnson (1985): Matrix Analysis. Cambridge: Cambridge University Press.

Houtman, M., and J. A. Maks (1985): “Determining All Maximal Data Subsets Consistent with Revealed Preference,” Kwantitatieve Methoden, 19, 89–104.

Huang, G.-H., N. Korfiatis, and C.-T. Chang (2018): “Mobile Shopping Cart Abandonment: The Roles of Conflicts, Ambivalence, and Hesitation,” Journal of Business Research, 85, 165–174.

Huber, J., J. W. Payne, and C. Puto (1982): “Adding Asymmetrically Dominated Alternatives: Violations of Regularity and the Similarity Hypothesis,” Journal of Consumer Research, 9, 90–98.

Jaccard, P. (1912): “The Distribution of the Flora in the Alpine Zone,” New Phytologist, 11, 37–50.

Johnson, C. R., and J. A. Link (2009): “Solution Theory for Complete Bilinear Systems of Equations,” Numerical Linear Algebra with Applications, 16, 929–934.

Johnson, C. R., H. Šmigoc, and D. Yang (2014): “Solution Theory for Systems of Bilinear Equations,” Linear and Multilinear Algebra, 62, 1553–1566.

Kamenica, E. (2008): “Contextual Inference in Markets: On the Informational Content of Product Lines,” American Economic Review, 98, 2127–2149.

Knops, A. M., D. T. Ubbink, D. A. Legemate, L. J. Stalpers, and P. M. Bossuyt (2013): “Interpreting Patient Decisional Conflict Scores: Behavior and Emotions in Decisions about Treatment,” Medical Decision Making, 33, 78–84.

Kochov, A. (2020): “Subjective States without the Completeness Axiom,” in Mathematical Topics on Representations of Ordered Structures and Utility Theory, ed. by G. Bosi, M. Campión, J. Candeal, and E. Indurain, pp. 255–266. Springer, Cham. Working paper date: 2007.

Koujianou-Goldberg, P. (1995): “Product Differentiation and Oligopoly in International Markets: The Case of the U.S. Automobile Industry,” Econometrica, 63, 891–951.

Kovach, M., and G. Tserenjigmid (2019): “Behavioral Foundations of Nested Stochastic Choice and Nested Logit,” Working Paper.

Kovach, M., and L. Ülkü (2020): “Satisficing with a Variable Threshold,” Journal of Mathematical Economics, 87, 67–76.

Kreps, D. M. (1979): “A Representation Theorem for ‘Preference for Flexibility’,” Econometrica, 47, 565–577.

Levandowsky, M., and D. Winter (1971): “Distance Between Sets,” Nature, 234, 34–35.

Luce, R. D. (1959): Individual Choice Behavior: A Theoretical Analysis. New York, NY: Wiley.

Mandler, M. (2005): “Incomplete Preferences and Rational Intransitivity of Choice,” Games and Economic Behavior, 50, 255–277.

Mandler, M. (2009): “Indifference and Incompleteness Distinguished by Rational Trade,” Games and Economic Behavior, 67, 300–314.

Manzini, P., and M. Mariotti (2014): “Stochastic Choice and Consideration Sets,” Econometrica, 82, 1153–1176.

Marschak, J. (1960): “Binary Choice Constraints on Random Utility Indications,” in Stanford Symposium on Mathematical Methods in the Social Sciences, ed. by K. Arrow, pp. 312–329. Stanford: Stanford University Press.

Matějka, F., and A. McKay (2015): “Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model,” American Economic Review, 105, 272–298.

Matzkin, R. (2019): “Constructive Identification in Some Non-separable Discrete Choice Models,” Journal of Econometrics, 211, 83–103.

McFadden, D. (1973): “Conditional Logit Analysis of Qualitative Choice Behavior,” in Frontiers in Econometrics, ed. by P. Zarembka, pp. 105–142. New York: Academic Press.

Natenzon, P. (2019): “Random Choice and Learning,” Journal of Political Economy, 127, 419–457.

Nocke, V., and P. Rey (2020): “Consumer Search and Choice Overload,” Working Paper.

Ok, E., P. Ortoleva, and G. Riella (2012): “Incomplete Preferences under Uncertainty: Indecisiveness in Beliefs Versus Tastes,” Econometrica, 80, 1791–1808.

Pejsachowicz, L., and S. Toussaert (2017): “Choice Deferral, Indecisiveness and Preference for Flexibility,” Journal of Economic Theory, 170, 417–425.

Redelmeier, D. A., and E. Shafir (1995): “Medical Decision Making in Situations that Offer Multiple Alternatives,” Journal of the American Medical Association, 273, 302–305.

Roemer, J. E. (1999): “The Democratic Political Economy of Progressive Income Taxation,” Econometrica, 67, 1–19.

Sarver, T. (2008): “Anticipating Regret: Why Fewer Options May Be Better,” Econometrica, 76, 263–305.

Scheibehenne, B., R. Greifeneder, and P. M. Todd (2010): “Can There Ever Be Too Many Options? A Meta-Analytic Review of Choice Overload,” Journal of Consumer Research, 37, 409–425.

Schwartz, T. (1976): “Choice Functions, ‘Rationality’ Conditions, and Variations of the Weak Axiom of Revealed Preference,” Journal of Economic Theory, 13, 414–427.

Scott, D., and P. Suppes (1958): “Foundational Aspects of Theories of Measurement,” Journal of Symbolic Logic, 23, 113–128.

Sela, A., J. Berger, and W. Liu (2009): “Variety, Vice, and Virtue: How Assortment Size Influences Option Choice,” Journal of Consumer Research, 35, 941–951.

Shafir, E., I. Simonson, and A. Tversky (1993): “Reason-Based Choice,” Cognition, 49, 11–36.

Shapley, L. S. (1959): “Equilibrium Points in Games with Vector Payoffs,” Naval Research Logistics Quarterly, 6, 57–61.

Shapley, L. S., and M. Baucells (1998): “Multiperson Utility,” Working Paper No 779, Department of Economics, UCLA.

Simonson, I. (1989): “Choice Based on Reasons: The Case of the Attraction and Compromise Effects,” Journal of Consumer Research, 16, 158–174.

Steiner, J., C. Stewart, and F. Matějka (2017): “Rational Inattention Dynamics: Inertia and Delay in Decision Making,” Econometrica, 85, 521–553.

Train, K. E. (2009): Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, 2nd edn.

Tversky, A. (1977): “Features of Similarity,” Psychological Review, 84, 327–352.

Tversky, A., and J. E. Russo (1969): “Similarity and Substitutability in Binary Choices,” Journal of Mathematical Psychology, 6, 1–12.

Tversky, A., and E. Shafir (1992): “Choice under Conflict: The Dynamics of Deferred Decision,” Psychological Science, 3, 358–361.

Appendix

Proof of Proposition 1.

Recalling the maintained assumption that $D$ is zero at singletons and strictly positive elsewhere, it is immediate that the second statement implies the first. For the converse implication, note first that using ACL alongside Positivity, and by suitably adapting the arguments in Luce (1959), we readily obtain the existence of a function $u: X \to \mathbb{R}_{++}$ such that, for every $A \subseteq X$ and $a \in A$,

$$\rho(a, A \cup \{o\}) = \bigl(1 - \rho(o, A \cup \{o\})\bigr) \cdot \frac{u(a)}{\sum_{b \in A} u(b)},$$

where

$$u(a) := \alpha\, \frac{\rho(a, \{a, z, o\})}{\rho(z, \{a, z, o\})}$$

for arbitrary and fixed $\alpha > 0$ and $z \in X$, and with the notational convention $\{z, z, o\} \equiv \{z, o\}$ in place. By Positivity and Desirability we also get $\rho(o, A \cup \{o\}) = 0 \Leftrightarrow |A| = 1$. It follows now that for every $A$ there is a unique $\beta_A \geq 0$ such that

$$\rho(a, A \cup \{o\}) = \bigl(1 - \rho(o, A \cup \{o\})\bigr) \cdot \frac{u(a)}{\sum_{b \in A} u(b)} = \frac{u(a)}{\sum_{b \in A} u(b) + \beta_A}. \tag{33}$$

In particular, defining

$$\beta_A := \frac{\rho(o, A \cup \{o\})}{1 - \rho(o, A \cup \{o\})} \cdot \sum_{b \in A} u(b)$$

makes (33) identically true. From (33) and the above remark we get $\beta_A = 0 \Leftrightarrow |A| = 1$ and $\beta_A > 0 \Leftrightarrow |A| > 1$. Defining $D: 2^X \setminus \{\emptyset\} \to \mathbb{R}_+$ by $D(A) := \beta_A$ yields (1) with $D(A) = 0 \Leftrightarrow |A| = 1$. Finally, $(u, D)$ and $(u', D')$ represent the same $\rho$ if and only if $u = \alpha u'$ and $D = \alpha D'$ for some $\alpha > 0$. ∎

Proof of Proposition 2.

Suppose $\rho = (u_i)_{i=1}^k$ is a multicriteria logit and let $(v_{\pi(i)})_{i=1}^k$ be a joint ratio-scale transformation of $(u_i)_{i=1}^k$, with $\pi$ the corresponding permutation on $\{1, \dots, k\}$. By definition, for each $i \leq k$ there is $\alpha_i > 0$ such that $u_i := \alpha_i v_{\pi(i)}$. Thus, for any menu $A \subseteq X$ and $a \in A$,

$$\begin{aligned}
\frac{v_{\pi(1)}(a)}{\sum_{b \in A} v_{\pi(1)}(b)} \cdots \frac{v_{\pi(k)}(a)}{\sum_{b \in A} v_{\pi(k)}(b)}
&= \frac{\alpha_1 \cdot u_1(a)}{\alpha_1 \cdot \sum_{b \in A} u_1(b)} \cdots \frac{\alpha_k \cdot u_k(a)}{\alpha_k \cdot \sum_{b \in A} u_k(b)} \\
&= \frac{u_1(a)}{\sum_{b \in A} u_1(b)} \cdots \frac{u_k(a)}{\sum_{b \in A} u_k(b)} \\
&= \rho(a, A \cup \{o\}).
\end{aligned}$$

Conversely, suppose $(u_i)_{i=1}^k$ and $(v_i)_{i=1}^k$ are multicriteria representations of the same $\rho$. First, assume to the contrary that, for all $i \leq k$,

$$v_i \neq \alpha u_{\pi(i)} \tag{34}$$

for all $\alpha > 0$ and every permutation $\pi$ on $\{1, \dots, k\}$. By assumption, for every menu $A \subseteq X$, $a \in A$ and $\alpha_1, \dots, \alpha_k > 0$,

$$\begin{aligned}
\rho(a, A \cup \{o\}) &= \frac{u_1(a)}{\sum_{b \in A} u_1(b)} \cdots \frac{u_k(a)}{\sum_{b \in A} u_k(b)} \\
&= \frac{v_1(a)}{\sum_{b \in A} v_1(b)} \cdots \frac{v_k(a)}{\sum_{b \in A} v_k(b)} \\
&\neq \frac{\alpha_{\pi(1)} u_{\pi(1)}(a)}{\alpha_{\pi(1)} \sum_{b \in A} u_{\pi(1)}(b)} \cdot \frac{\alpha_{\pi(2)} u_{\pi(2)}(a)}{\alpha_{\pi(2)} \sum_{b \in A} u_{\pi(2)}(b)} \cdots \frac{\alpha_{\pi(k)} u_{\pi(k)}(a)}{\alpha_{\pi(k)} \sum_{b \in A} u_{\pi(k)}(b)} \\
&= \frac{u_1(a)}{\sum_{b \in A} u_1(b)} \cdots \frac{u_k(a)}{\sum_{b \in A} u_k(b)},
\end{aligned}$$

which is a contradiction. By cancelling out from the above equations any $u_j$, $v_j$ such that $v_j = \alpha u_{\pi(j)}$ under some permutation $\pi$, the same argument can be used repeatedly to rule out the case where there is an arbitrary number of indices $j \leq k$ for which (34) is supposedly true. ∎

Proof of Proposition 3.

It is immediate that any multicriteria logit satisfies Desirability, Positivity and ACL; hence, that it is a decision-conflict logit.

1. Suppose $A = S \cup T$, $S \cap T = \emptyset$ and $\max\{|S|, |T|\} \geq$ […], and suppose to the contrary that $D(A) \leq D(S) + D(T)$. This is equivalent to

$$\prod_{l=1}^{k} \sum_{a \in A} u_l(a) - \sum_{a \in A} \prod_{l=1}^{k} u_l(a) \leq \prod_{l=1}^{k} \sum_{a \in S} u_l(a) - \sum_{a \in S} \prod_{l=1}^{k} u_l(a) + \prod_{l=1}^{k} \sum_{a \in T} u_l(a) - \sum_{a \in T} \prod_{l=1}^{k} u_l(a).$$

Since $\sum_{a \in A} \prod_{l=1}^{k} u_l(a) = \sum_{a \in S} \prod_{l=1}^{k} u_l(a) + \sum_{a \in T} \prod_{l=1}^{k} u_l(a)$ because $A = S \cup T$, the above is equivalent to

$$\prod_{l=1}^{k} \sum_{a \in A} u_l(a) \leq \prod_{l=1}^{k} \sum_{a \in S} u_l(a) + \prod_{l=1}^{k} \sum_{a \in T} u_l(a). \tag{35}$$

Now observe that

$$\prod_{l=1}^{k} \sum_{a \in S} u_l(a) < \prod_{l=1}^{k} \sum_{a \in A} u_l(a) \tag{36}$$

and

$$\prod_{l=1}^{k} \sum_{a \in T} u_l(a) < \prod_{l=1}^{k} \sum_{a \in A} u_l(a). \tag{37}$$

Indeed, letting $S := \{a^S_1, \dots, a^S_{|S|}\}$ and $T := \{a^T_1, \dots, a^T_{|T|}\}$, and recalling that $A = S \cup T$, $S \cap T = \emptyset$ while each $u_l(\cdot)$ is strictly positive, rewriting (36) and (37) as

$$\begin{aligned}
&\bigl(u_1(a^S_1) + \dots + u_1(a^S_{|S|})\bigr) \cdots \bigl(u_k(a^S_1) + \dots + u_k(a^S_{|S|})\bigr) < \\
&\bigl(u_1(a^S_1) + \dots + u_1(a^S_{|S|}) + u_1(a^T_1) + \dots + u_1(a^T_{|T|})\bigr) \cdots \bigl(u_k(a^S_1) + \dots + u_k(a^S_{|S|}) + u_k(a^T_1) + \dots + u_k(a^T_{|T|})\bigr)
\end{aligned} \tag{38}$$

and

$$\begin{aligned}
&\bigl(u_1(a^T_1) + \dots + u_1(a^T_{|T|})\bigr) \cdots \bigl(u_k(a^T_1) + \dots + u_k(a^T_{|T|})\bigr) < \\
&\bigl(u_1(a^S_1) + \dots + u_1(a^S_{|S|}) + u_1(a^T_1) + \dots + u_1(a^T_{|T|})\bigr) \cdots \bigl(u_k(a^S_1) + \dots + u_k(a^S_{|S|}) + u_k(a^T_1) + \dots + u_k(a^T_{|T|})\bigr)
\end{aligned} \tag{39}$$

makes it obvious that (36) and (37) hold because each expanded product term on the left hand side sum in (38) and (39) is also an expanded product term on the right hand side sum, but not vice versa. In particular, (38) and (39) also clarify that

$$\prod_{l=1}^{k} \sum_{a \in A} u_l(a) \geq \prod_{l=1}^{k} \sum_{a \in S} u_l(a) + \prod_{l=1}^{k} \sum_{a \in T} u_l(a). \tag{40}$$

Therefore, (35) and (40) hold with equality. Now take arbitrary $a \in A \setminus S$ and $b \in A \setminus T$ and notice, for example, that the strictly positive term $u_1(a) u_2(a) \cdots u_{k-1}(a) u_k(b)$ is included in $\prod_{l=1}^{k} \sum_{a \in A} u_l(a)$ but not in $\prod_{l=1}^{k} \sum_{a \in S} u_l(a) + \prod_{l=1}^{k} \sum_{a \in T} u_l(a)$. This contradicts the postulated equality.

2. Suppose first that $k = 2$ and define $u: X \to \mathbb{R}_{++}$ by

$$u(a) := u_1(a) u_2(a),$$

and, for $\{a, b\} \subseteq X$, define $D(\{a, b\})$ by

$$D(\{a, b\}) := u_1(a) u_2(b) + u_1(b) u_2(a).$$

By definition,

$$\rho(a, A \cup \{o\}) = \frac{u_1(a)}{\sum_{b \in A} u_1(b)} \cdot \frac{u_2(a)}{\sum_{b \in A} u_2(b)} = \frac{u(a)}{\sum_{b \in A} u(b) + \sum_{a, b \in A} D(\{a, b\})} = \frac{u(a)}{\sum_{b \in A} u(b) + D(A)},$$

where

$$D(A) := \sum_{a, b \in A} D(\{a, b\}).$$

Hence, $D$ satisfies the additivity property (11).

Now suppose that $k > 2$ and that there is $A$ such that

$$D(A) \leq \sum_{a, b \in A} D(\{a, b\}).$$

This implies

$$\sum_{a, b \in A} \left( \prod_{l=1}^{k} \bigl(u_l(a) + u_l(b)\bigr) - \prod_{l=1}^{k} u_l(a) - \prod_{l=1}^{k} u_l(b) \right) \geq \prod_{l=1}^{k} \sum_{b \in A} u_l(b) - \sum_{b \in A} \prod_{l=1}^{k} u_l(b),$$

which simplifies to

$$\sum_{a, b \in A} \prod_{l=1}^{k} \bigl(u_l(a) + u_l(b)\bigr) \geq \prod_{l=1}^{k} \sum_{b \in A} u_l(b).$$

Arguing in a similar fashion as in the proof of the first claim, and recalling that each $u_l(\cdot)$ is strictly positive, we conclude that this inequality is impossible.

3. Suppose $a \in B \subset A$ and assume, per contra, that $\rho(a, A \cup \{o\}) \geq \rho(a, B \cup \{o\})$. By (2), this is equivalent to

$$\frac{\prod_{l=1}^{k} u_l(a)}{\prod_{l=1}^{k} \sum_{b \in A} u_l(b)} \geq \frac{\prod_{l=1}^{k} u_l(a)}{\prod_{l=1}^{k} \sum_{b \in B} u_l(b)},$$

which implies $\prod_{l=1}^{k} \sum_{b \in A} u_l(b) \leq \prod_{l=1}^{k} \sum_{b \in B} u_l(b)$. Since $A \supset B$ and $u_l(\cdot)$ is strictly positive for all $l \leq k$, this is a contradiction. ∎

Proof of Proposition 4.

Suppose $a \succ^{DC} b$, so that $u(a) > u(b)$ and $u(a) \geq D(\{a, b\})$. By (8) and (9),

$$u(a) = \prod_{i=1}^{k} u_i(a), \qquad u(b) = \prod_{i=1}^{k} u_i(b), \qquad D(\{a, b\}) = \prod_{i=1}^{k} \bigl(u_i(a) + u_i(b)\bigr) - \prod_{i=1}^{k} u_i(a) - \prod_{i=1}^{k} u_i(b). \tag{41}$$

Suppose to the contrary that $a \not\succ^{MC} b$. This implies that there exists $i \leq k$ such that $u_i(b) > u_i(a)$. Since $u(a) \geq D(\{a, b\})$ by assumption, we have

$$\begin{aligned}
u_1(a) u_2(a) \cdots u_k(a) \geq{} & u_1(a) u_2(a) \cdots u_{k-1}(a) u_k(b) \\
& + u_1(a) u_2(a) \cdots u_{k-1}(b) u_k(a) \\
& \;\;\vdots \\
& + u_1(a) u_2(a) \cdots u_i(a) u_{i+1}(b) u_{i+2}(a) \cdots u_k(a) \\
& + u_1(a) u_2(a) \cdots u_{i-1}(a) u_i(b) u_{i+1}(a) \cdots u_k(a) \\
& + u_1(a) u_2(a) \cdots u_{i-2}(a) u_{i-1}(b) u_i(a) \cdots u_k(a) \\
& \;\;\vdots \\
& + u_1(a) u_2(b) u_3(a) \cdots u_k(a) \\
& + u_1(b) u_2(a) u_3(a) \cdots u_k(a) \\
& + O(a, b),
\end{aligned}$$

where $O(a, b) > 0$ collects the terms of $D(\{a, b\})$ other than the ones written explicitly. Rearranging this expression and recalling that $u_i(a) < u_i(b)$, we have

$$\begin{aligned}
u_1(a) u_2(a) \cdots u_{i-1}(a) u_{i+1}(a) \cdots u_k(a) \bigl[u_i(a) - u_i(b)\bigr] \geq{} & u_1(a) u_2(a) \cdots u_{k-1}(a) u_k(b) \\
& + u_1(a) u_2(a) \cdots u_{k-1}(b) u_k(a) \\
& \;\;\vdots \\
& + u_1(a) u_2(a) \cdots u_i(a) u_{i+1}(b) u_{i+2}(a) \cdots u_k(a) \\
& + u_1(a) u_2(a) \cdots u_{i-2}(a) u_{i-1}(b) u_i(a) \cdots u_k(a) \\
& \;\;\vdots \\
& + u_1(a) u_2(b) u_3(a) \cdots u_k(a) \\
& + u_1(b) u_2(a) u_3(a) \cdots u_k(a) \\
& + O(a, b).
\end{aligned}$$

But given that $u_j(z) > 0$ for all $j \leq k$, $z \in \{a, b\}$, and since $u_i(a) < u_i(b)$ holds by assumption, the term on the left hand side is strictly negative while that on the right hand side is strictly positive, leading to a contradiction. ∎

Proof of Proposition 5.

With the proof of Lemma 4 being immediate, it suﬃces to prove Lemmas 1 – 3.

Proof of Lemma 1.

Suppose to the contrary that there exist real numbers $\alpha_1, \dots, \alpha_p$, $p \leq m(|X|)$, and some $s \leq m(|X|)$, such that

$$E_s = \sum_{i=1}^{p} \alpha_i E_i. \tag{42}$$

From the definition of $E_{hh}$ and $E_{hl}$, any two distinct matrices $E_s$ and $E_{s'}$ in (21) have their non-zero entries at distinct positions $i, j$ in $|X| \times |X|$. Therefore, (42) is impossible, and we arrive at a contradiction. ∎

Proof of Lemma 2.

First, upon noting the difference equation

$$m(|X|) = \begin{cases} 3, & \text{if } |X| = 2, \\ |X| + m(|X| - 1), & \text{if } |X| > 2, \end{cases}$$

and its general solution $m(|X|) = \frac{|X|^2 + |X|}{2}$, it is readily seen that $|X|^2 - m(|X|) > 0$ for all $|X| \geq 2$. Therefore, (22) consists of $m(|X|)$ equations in $|X|^2 > m(|X|)$ unknowns. Recalling now that $E^T$ is an $m(|X|) \times |X|^2$ matrix, we must show next that

$$\operatorname{rank}(E^T) = \operatorname{rank}(E^T \,|\, D_{|X|}), \tag{43}$$

where $E^T \,|\, D_{|X|}$ is the augmented $m(|X|) \times (|X|^2 + 1)$ matrix that results when the vector $D_{|X|}$ is inserted next to the last column of $E^T$. To this end, notice first that $D_{|X|} \gg 0$ by assumption. Next, by construction of the matrices $E_{hh}$ and $E_{hl}$, $E^T$ features $m(|X|) < |X|^2$ rows with 0–1 entries whose non-zero entries lie at $m(|X|)$ distinct positions, and which are therefore linearly independent. Hence, since $D_{|X|} \gg 0$, it follows that $D_{|X|}$ is a linear combination of these $m(|X|)$ rows, so that (43) holds. ∎

Proof of Lemma 3.

Suppose (21) is solvable, and recall that $u_1(x_i), u_2(x_i) > 0$ for all $i = 1, \dots, |X|$. Given this fact, and denoting by $r_i$ the $i$-th column of the matrix $U$ that corresponds to the postulated solution, observe that for any distinct $i, j \leq |X|$ there exist real numbers $\alpha_i, \alpha_j$ such that $\alpha_i r_i + \alpha_j r_j = 0$, thereby establishing that $U$ is of rank 1. In particular, defining $\alpha_i := u(x_i)$ and $\alpha_j := -u(x_j)$ proves the assertion. The argument for the converse implication appears in the main text before the statement of the lemma. ∎

Proof of Proposition 6.

1 ⇒ 2. Let $\rho = (u, D)$ be a decision-conflict logit on $X := \{a, b\}$ and assume that it also satisfies CBS. The polynomial system (25) now reduces to

$$\begin{pmatrix} u_1(a) u_2(a) \\ u_1(b) u_2(b) \\ u_1(a) u_2(b) + u_1(b) u_2(a) \end{pmatrix} = \begin{pmatrix} u(a) \\ u(b) \\ D(\{a, b\}) \end{pmatrix}.$$

Normalizing $u_1(a) := 1$, this system in turn reduces to

$$u_1(a) = 1, \qquad u_1(b) = \frac{u(b)}{u_2(b)}, \qquad u_2(a) = u(a), \qquad u_2(b) = D(\{a, b\}) - u_1(b) u_2(a).$$

Solving for $u_1(b)$, $u_2(b)$ we get

$$u_1(b) = \kappa, \qquad u_2(b) = D(\{a, b\}) - u(a)\kappa,$$

where

$$\kappa := \frac{D(\{a, b\}) \pm \bigl(D(\{a, b\})^2 - 4 u(a) u(b)\bigr)^{1/2}}{2 u(a)}$$

takes the two values implied by the quadratic equation

$$u_2(b) = D(\{a, b\}) - \frac{u(a) u(b)}{u_2(b)} \tag{44}$$

that emerges from this system. Such solutions exist in the real line if and only if $D(\{a, b\})^2 - 4 u(a) u(b) \geq 0$. Recalling now the relevant argument in the proof of Proposition 1, without loss of generality we may write $u(a) = \frac{\rho(a, \{a, b, o\})}{\rho(b, \{a, b, o\})}$, $u(b) = 1$ and $D(\{a, b\}) = \frac{\rho(o, \{a, b, o\})}{\rho(b, \{a, b, o\})}$. It is now easy to verify that the inequality above is satisfied if and only if the CBS condition (31) holds.

We verify next that each of the two solutions (let us call them $\hat v$ and $\tilde v$) is a joint ratio-scale transformation of the other. Let $\tau := \bigl(D(\{a, b\})^2 - 4 u(a) u(b)\bigr)^{1/2}$. We have

$$\begin{aligned}
\hat v_1(a) &= 1, & \tilde v_1(a) &= 1, \\
\hat v_1(b) &= \frac{D(\{a, b\}) + \tau}{2 u(a)}, & \tilde v_1(b) &= \frac{D(\{a, b\}) - \tau}{2 u(a)}, \\
\hat v_2(a) &= u(a), & \tilde v_2(a) &= u(a), \\
\hat v_2(b) &= \frac{D(\{a, b\}) - \tau}{2}, & \tilde v_2(b) &= \frac{D(\{a, b\}) + \tau}{2}.
\end{aligned}$$

Observe now that $\hat v_1 = \alpha \tilde v_2$ and $\tilde v_1 = \alpha \hat v_2$, where $\alpha := \frac{1}{u(a)} > 0$. The conclusion follows.

Finally, it remains to be verified that CBS holds with equality if and only if $u_1 = u_2$. But this is obvious because CBS holds with equality if and only if the discriminant associated with (44) is zero, which in turn is true if and only if there is a unique solution to (44).

2 ⇒ 1. Let $(u_1, u_2)$ be a dual logit representation of $\rho$ on $X := \{a, b\}$. We have

$$\rho(a, \{a, b, o\}) = \frac{u_1(a) u_2(a)}{\omega}, \qquad \rho(b, \{a, b, o\}) = \frac{u_1(b) u_2(b)}{\omega}, \qquad \rho(o, \{a, b, o\}) = \frac{u_1(a) u_2(b) + u_1(b) u_2(a)}{\omega},$$

where $\omega := u_1(a) u_2(a) + u_1(b) u_2(b) + u_1(a) u_2(b) + u_1(b) u_2(a)$. Suppose to the contrary that CBS is violated, so that $\rho(o, \{a, b, o\})^2 < 4 \rho(a, \{a, b, o\}) \rho(b, \{a, b, o\})$. Given the above equations, this implies $\bigl(u_1(a) u_2(b) - u_1(b) u_2(a)\bigr)^2 < 0$, which is a contradiction. It also follows that setting $u_1 = u_2$ implies $\rho(o, \{a, b, o\})^2 = 4 \rho(a, \{a, b, o\}) \rho(b, \{a, b, o\})$, hence that CBS holds with equality. ∎
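The constructive step of the first implication can be checked numerically. The sketch below is an illustration only: it normalises $u(b) = 1$ and $u_1(a) = 1$ as in the proof, solves the quadratic for the two candidate representations, and confirms that each reproduces the original probabilities. The function names and the example probabilities are hypothetical, and the formulas follow one reading of the system in the proof above.

```python
import math


def dual_logit_from_probs(rho):
    """Return the two dual-logit representations (u1, u2) of rho at {a, b, o},
    normalised so that u1(a) = 1, or None when CBS fails."""
    u_a = rho["a"] / rho["b"]          # u(a), with u(b) normalised to 1
    u_b = 1.0
    d = rho["o"] / rho["b"]            # D({a, b}) under the same normalisation
    disc = d * d - 4.0 * u_a * u_b     # discriminant of the quadratic
    if disc < 0:
        return None                    # CBS violated: no real solution
    tau = math.sqrt(disc)
    reps = []
    for sign in (+1.0, -1.0):
        kappa = (d + sign * tau) / (2.0 * u_a)
        u1 = {"a": 1.0, "b": kappa}
        u2 = {"a": u_a, "b": d - u_a * kappa}
        reps.append((u1, u2))
    return reps


def dual_logit_probs(u1, u2):
    """Choice probabilities at {a, b, o} implied by a dual logit (u1, u2)."""
    pa = u1["a"] * u2["a"]
    pb = u1["b"] * u2["b"]
    po = u1["a"] * u2["b"] + u1["b"] * u2["a"]
    omega = pa + pb + po
    return {"a": pa / omega, "b": pb / omega, "o": po / omega}


if __name__ == "__main__":
    rho = {"o": 0.5, "a": 0.3, "b": 0.2}   # satisfies rho(o)^2 >= 4 rho(a) rho(b)
    for u1, u2 in dual_logit_from_probs(rho):
        print(u1, u2, dual_logit_probs(u1, u2))
```

When the input violates CBS, for instance $\rho(o) = 0.2$ and $\rho(a) = \rho(b) = 0.4$, the discriminant is negative and the function returns `None`, mirroring the non-existence of real solutions noted in the proof.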