A Recursive Logit Model with Choice Aversion and Its Application to Transportation Networks

Abstract

We introduce a route choice model that incorporates the notion of choice aversion in transportation networks. Formally, we propose a recursive logit model which incorporates a penalty term that accounts for the dimension of the choice set at each node of the network. We make three contributions. First, we show that our model overcomes the correlation problem between routes, a common pitfall of traditional logit models. In particular, our approach can be seen as an alternative to the class of models known as Path Size Logit (PSL). Second, we show how our model can generate violations of regularity in the path choice probabilities. In particular, we show that removing edges in the network can decrease the probability of some existing paths. Finally, we show that under the presence of choice aversion, adding edges to the network can make users worse off. In other words, a type of Braess's paradox can emerge even in the case of uncongested networks. We show that these phenomena can be characterized in terms of a parameter that measures users' degree of choice aversion.


A RECURSIVE LOGIT MODEL WITH CHOICE AVERSION AND ITS APPLICATION TO ROUTE CHOICE ANALYSIS

AUSTIN KNIES EMERSON MELO

Abstract.

We introduce a route choice model that incorporates the notion of choice aversion in transportation networks. Formally, we propose a recursive logit model which incorporates a penalty term that accounts for the dimension of the choice set at each node of the network. We make three contributions. First, we show that our model overcomes the correlation problem between routes, a common pitfall of traditional logit models. In particular, our approach can be seen as an alternative to the class of models known as Path Size Logit (PSL). Second, we show how our model can generate violations of regularity in the path choice probabilities. In particular, we show that removing edges in the network can decrease the probability of some existing paths. Finally, we show that under the presence of choice aversion, adding edges to the network can increase the total cost of the system. In other words, a type of Braess's paradox can emerge even in the case of uncongested networks. We show that these phenomena can be characterized in terms of a parameter that measures users' degree of choice aversion.

Keywords: choice aversion, recursive logit, IIA, directed networks, transportation networks.

JEL classification: D001, C00, C51, C61

Date: October 7, 2020.

Please address correspondence to: Emerson Melo, Department of Economics, Indiana University, 307 Wylie Hall, 100 S Woodlawn Ave, Bloomington, IN 47408, U.S.A. Email: [email protected]. Austin Knies, Department of Economics, Indiana University-Bloomington, [email protected].

1. Introduction

Discrete choice models have been used extensively to understand the behavior of participants (whom we will refer to as users) in transportation networks (McFadden (1978, 1981) and Ben-Akiva and Lerman (1985)). In this context, users choose a path that minimizes their total cost of traveling. A prominent model that has arisen from this literature is the Multinomial Logit Model (MNL), whose main advantages are its tractability and closed-form choice probabilities.

Despite its popularity, the MNL presents some drawbacks when applied to transportation networks. In particular, the MNL can predict unrealistic choice probabilities for paths sharing common edges in the network. The root of this problem traces back to the Independence of Irrelevant Alternatives (IIA) axiom that is required to derive the MNL.

To explain the severity of the overlapping paths problem, consider the simple transportation network displayed in Figure 1.

Figure 1. Network with nodes $s$, $i$, $t$ and edges $a_1, a_2, a_3, a_4$. Paths $(a_1, a_2)$ and $(a_1, a_3)$ share edge $a_1$.

In this network we have three paths: $(a_1, a_2)$, $(a_1, a_3)$, and $(a_4)$. Assume the cost of each path is 1. At the edge level, assume that the cost of edge $a_1$ is $1 - \epsilon$, the cost of edges $a_2$ and $a_3$ is $\epsilon$, and, finally, the cost of edge $a_4$ is 1.

For the setting described above, the MNL will predict that each path is chosen with probability $1/3$. However, these choice probabilities are unrealistic when paths lack distinctiveness or independence from one another. In particular, an assignment in which path $(a_4)$ is chosen with probability $1/2$ and paths $(a_1, a_2)$ and $(a_1, a_3)$ are each chosen with probability $1/4$ is more sensible. More explicitly, this latter solution takes into account the fact that paths $(a_1, a_2)$ and $(a_1, a_3)$ share the common edge $a_1$ and that, in terms of total cost, they are equivalent to each other and to path $(a_4)$. The MNL cannot accommodate situations where path costs are correlated because it relies on the IIA property. As the discussion above shows, the IIA property is hardly satisfied even in simple transportation networks.

Recognizing this pitfall, the transportation literature has proposed several corrections to the MNL. This class of extensions is known as Path Size Logit (PSL). The idea of this class of models is to correct the problem of overlapping paths by adding an extra correlation-penalizing term to path costs. Thus, when the choice probabilities are generated through a standard MNL, the correction will account for the degree of overlapping between different paths.

While its usefulness as a correction is clear, the PSL class has two problems. First, the types of correction employed by the different models do not have a theoretical justification in terms of users' behavior. In particular, the parameters describing the corrections do not have a direct interpretation from an economic viewpoint. Second, it is not clear how PSL models can be used to carry out welfare analysis in transportation networks. For instance, it is hard to interpret and predict the changes in welfare when edges are added to, or severed from, the network.

In this paper, we propose the use of a recursive logit model which incorporates the idea of choice aversion (choice overload) in users' behavior.
In doing so we follow Lorca and Melo (2020), who adapt the approach of Fudenberg and Strzalecki (2015) to the context of directed graphs. Simply put, the choice aversion hypothesis states that an increase in the number of alternatives to choose from may lead to adverse consequences, such as lesser motivation to actually choose or lower satisfaction ex post (cf. Sheena and Lepper (2000) and Scheibehenne et al. (2010)).

Footnote: An in-depth discussion on path correlation and IIA can be found in McFadden (1974), which introduces the canonical red bus/blue bus problem.

Formally, we consider a transportation network with source node $s$ and designated sink node $t$. In this setting, we model users' behavior as a sequential choice process: when assessing an edge $a$ at some node $i \neq t$, users evaluate both the flow cost and the appropriate continuation value associated with such an edge. Following Fudenberg and Strzalecki, we introduce a term that penalizes the size of each choice set that stems subsequently from every current edge under scrutiny. In particular, when considering an edge $a$ at node $i$, users will penalize the number of outgoing edges at $i$. In other words, when facing a set of alternatives in order to depart from a specific node, users incorporate the size of the ensuing choice set when they appraise the continuation value of each outgoing edge. Formally, at each node $i \neq t$, we consider the penalty $\kappa \log |A_i^+|$, where $|A_i^+|$ is the cardinality of the set of outgoing edges at node $i$ and $\kappa \geq 0$. [...] $\kappa$ plays a critical role in the form of the correction. We show that our model performs as well as the recent Adaptive PSL model introduced by Duncan et al. (2020). However, our correction has two main advantages. First, it is a simple correction based on users' optimal behavior.
Second, the parameter $\kappa$ has a clear interpretation in terms of users' attitude with respect to the size of choice sets.

In our second contribution, we show how our model captures violations of regularity (Luce and Suppes (1965)). Formally, we show that removing an edge at a particular node can decrease the choice probabilities of some paths in the network. To grasp how this result works, we note that removing an edge $a$ at node $i$ is equivalent to removing the set of all paths in which $a$ is a member. When removing an edge in the traditional MNL (i.e., the choice aversion model with $\kappa = 0$), the choice probabilities of the remaining paths increase proportionally. However, in our model, removing the edge $a$ not only reduces the set of available paths (passing through node $i$) but also decreases the choice aversion costs associated with these paths. This latter reduction makes the set of paths passing through node $i$ comparatively less expensive than those not using node $i$. As a consequence, the choice probabilities of paths passing through node $i$ increase due to both the reduction in the set of available paths and the choice aversion cost reduction; since probabilities sum to one, paths that avoid node $i$ can lose probability, which is the violation of regularity.

We formalize the failure of regularity in terms of a precise relationship between the parameter $\kappa$ and the choice probabilities of paths using the node where the edge is removed. As far as we know, this result is new to the literature on recursive models in transportation networks.

In our final contribution, we study how our model can capture a Braess's-like paradox. In particular, we show how adding edges to the network can decrease users' welfare.
Similarly to the regularity failure, we also characterize this result in terms of the degree of choice aversion $\kappa$. Unlike existing models and extensions, the behavioral foundation of the choice aversion model and its formulation allow for a tradeoff between instantaneous route costs and choice aversion penalization such that decreases in welfare can be observed even in the absence of congestion. To our knowledge, this is also a novel result that can shed light on the design of transportation networks and its effects on welfare.

1.1. Related Literature.

Recursive logit models have been studied by Baillon and Cominetti (2008), Fosgerau et al. (2013), and Mai et al. (2015), among others. Our paper differs from theirs in at least two dimensions. First, we extend their recursive approach to incorporate the notion of choice aversion. Second, we show how our model can handle the problem of path overlapping, violations of regularity, and Braess's-like paradoxes. With respect to PSL models, the existing literature is extensive (Ben-Akiva and Bierlaire (1999) and Frejinger and Bierlaire (2007)), and Duncan et al. (2020) present an up-to-date discussion of the PSL approach. In addition, they propose an alternative correction called the Adaptive Path Size Logit (APSL) model. Our results differ from theirs in the type of behavioral foundation we use.

Footnote: For an up-to-date survey on recursive models in traffic networks we refer the reader to Zimmermann and Frejinger (2020).

From a behavioral standpoint, a similar approach to this paper is found in Fosgerau and Jiang (2019) and Jiang et al. (2020), who incorporate a rational inattention model into the context of transportation networks. While choice aversion and rational inattention are interconnected in terms of information processing, we model a particular variant of costly decision-making in the form of aversion to increasing choices, rather than mutual information through observing a signal. This modeling choice allows us to generate clear path choice predictions, violations of regularity, and Braess's-like paradox phenomena.

The rest of the paper is organized as follows. Section 2 introduces the recursive logit model with choice aversion. Section 3 explores the use of choice aversion in the path choice model and compares it to existing Path Size Logit (PSL) models. Section 4 discusses the failure of regularity. Section 5 discusses a type of Braess's paradox observed as a consequence of choice aversion. Section 6 concludes.

2. Recursive logit in Directed Networks

In this section we propose a recursive discrete choice model in directed networks. Formally, we model a set of users as solving a dynamic programming problem over a directed, acyclic graph. In a noticeable departure from previous literature, we adapt the choice aversion formulation of Fudenberg and Strzalecki (2015) to the context of directed graphs, and then we analyze the consequences for equilibria and welfare.

Footnote: Throughout the text we use the term choice overload to refer to choice aversion.

2.1. Directed graphs.

Consider a directed acyclic graph $G = (N, A)$, where $N$ is the set of nodes and $A$ the set of edges. We denote the set of ingoing edges to node $i$ by $A_i^-$, and the set of outgoing edges from node $i$ by $A_i^+$. We refer accordingly to the out-degree of node $i$ as $|A_i^+|$.

Without loss of generality, we assume that $G$ has a single source-sink pair, where $s$ and $t$ stand for the source (origin) and sink (destination) nodes, respectively. Let $j_a$ be the node $j$ that has been reached through edge $a$. We therefore define a path as a sequence of edges $(a_1, \ldots, a_K)$ with $a_{k+1} \in A^+_{j_{a_k}}$ for all $k < K$.

The set of paths connecting nodes $s$ and $t$ is denoted by $\mathcal{R}$. The set of paths connecting nodes $s$ and $i \neq t$ is denoted by $\mathcal{R}_{si}$. Similarly, the set of all paths connecting nodes $i \neq s$ and $t$ is denoted by $\mathcal{R}_{it}$. Let $\mathcal{R}_i$ denote the set of paths passing through $i$. Finally, let $\mathcal{R}_i^c$ denote the set of paths not passing through node $i$.

A deterministic cost component $c_a > 0$ is associated with each edge $a \in A_i^+$ for all $i \neq t$. Path costs are assumed to be edge additive, that is, for a path $r = (a_1, \ldots, a_K) \in \mathcal{R}$ its associated cost is given by $\sum_{k=1}^{K} c_{a_k}$.

We assume that at node $s$ there is a unitary mass of network users who must choose a path from the set $\mathcal{R}$. For the sake of exposition, the mass of users is summarized by the canonical vector $e_s$, which has a 1 in the position of node $s$ and zero elsewhere. The dimension of $e_s$ is $|N| - 1$.

2.2. Choice aversion.

We now develop a recursive logit choice model over $G$ that incorporates choice overload by means of a specific kind of penalty on ensuing choice sets stemming from each edge appraisal. In particular, we adapt the choice aversion approach from Fudenberg and Strzalecki (2015) to the environment described by $G$ as follows: for each $a \in A_i^+$ we associate a collection of i.i.d. random variables $\{\epsilon_a\}_{a \in A_i^+}$ such that the recursive cost associated with edge $a$ is defined as:

$$V_a = c_a + \mathbb{E}\left(\min_{a' \in A^+_{j_a}} \left\{ V_{a'} + \epsilon_{a'} + \kappa \log |A^+_{j_a}| \right\}\right) \quad \text{for all } a \in A_i^+, \tag{1}$$

where $c_a$ denotes the instantaneous cost associated with edge $a$ and the term

$$\mathbb{E}\left(\min_{a' \in A^+_{j_a}} \left\{ V_{a'} + \epsilon_{a'} + \kappa \log |A^+_{j_a}| \right\}\right) = \mathbb{E}\left(\min_{a' \in A^+_{j_a}} \left\{ V_{a'} + \epsilon_{a'} \right\}\right) + \kappa \log |A^+_{j_a}|$$

is the adjusted continuation value associated with the selection of $a$. Notice that the latter term includes the factor $\kappa \log |A^+_{j_a}|$, which is a penalty term that captures the size of the set $A^+_{j_a}$, where $\kappa \geq 0$.

Following Fudenberg and Strzalecki (2015), we impose the following assumption on the random variables $\{\epsilon_a\}_{a \in A_i^+}$.

Assumption 1 (Logit choice rule). At each node $i \neq t$ the collection of random variables $\{\epsilon_a\}_{a \in A_i^+}$ follows a Gumbel distribution with scale parameter $\mu = 1$.

Under this assumption, Eq. (1) can be expressed as:

$$V_a = c_a - \log\left(\sum_{a' \in A^+_{j_a}} e^{-V_{a'}}\right) + \kappa \log |A^+_{j_a}|, \tag{2}$$

where $-\log\left(\sum_{a' \in A^+_{j_a}} e^{-V_{a'}}\right) + \kappa \log |A^+_{j_a}|$ provides a closed-form expression for the adjusted continuation value. Let us define $\varphi_{j_a}(V) \triangleq -\log\left(\sum_{a' \in A^+_{j_a}} e^{-V_{a'}}\right)$ for all $j_a \neq t$. Accordingly, Eq. (2) can be rewritten as:

$$V_a = c_a + \varphi_{j_a}(V) + \kappa \log |A^+_{j_a}|. \tag{3}$$
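The recursion in Eq. (3) can be evaluated by working backwards from the sink. The following is a minimal sketch rather than the authors' implementation: the function name `recursive_costs`, the edge-dictionary layout, and the terminal convention $V_a = c_a$ for edges entering $t$ (no continuation value and no penalty at the sink) are assumptions made for illustration. The example network is the one in Figure 1 with the assumed value $\epsilon = 0.1$.

```python
import math


def recursive_costs(edges, costs, sink, kappa):
    """Evaluate Eq. (3) on a DAG by memoized backward recursion.

    edges: dict mapping edge name -> (tail node, head node)
    costs: dict mapping edge name -> instantaneous cost c_a
    Assumes every non-sink node has at least one outgoing edge.
    """
    outgoing = {}
    for a, (tail, _head) in edges.items():
        outgoing.setdefault(tail, []).append(a)

    V = {}

    def value(a):
        if a in V:
            return V[a]
        head = edges[a][1]
        if head == sink:
            # Assumed terminal convention: nothing left to choose at t.
            V[a] = costs[a]
        else:
            successors = outgoing[head]
            # phi_{j_a}(V) = -log(sum of exp(-V_a')) over outgoing edges
            phi = -math.log(sum(math.exp(-value(b)) for b in successors))
            V[a] = costs[a] + phi + kappa * math.log(len(successors))
        return V[a]

    for a in edges:
        value(a)
    return V


# Figure 1 network: a1 runs s -> i, a2 and a3 run i -> t, a4 runs s -> t.
edges = {"a1": ("s", "i"), "a2": ("i", "t"), "a3": ("i", "t"), "a4": ("s", "t")}
costs = {"a1": 0.9, "a2": 0.1, "a3": 0.1, "a4": 1.0}  # epsilon = 0.1 (assumed)

V_example = recursive_costs(edges, costs, sink="t", kappa=1.0)
print(V_example)
```

In this example the recursion gives $V_{a_1} = 1 + (\kappa - 1)\log 2$, so with $\kappa = 1$ the two edges leaving $s$ have the same recursive cost, whereas with $\kappa = 0$ edge $a_1$ looks cheaper simply because two edges follow it.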
The previous expression deserves some remarks. First, the continuation value in Eq. (3) captures the complexity of the choice sets $A^+_{j_a}$, as measured by $\kappa \log |A^+_{j_a}|$, with $\kappa \geq 0$. Intuitively, $\kappa \log |A^+_{j_a}|$ penalizes the size of the choice sets at different nodes, where the parameter $\kappa \geq 0$ scales the penalty on the size of $A^+_{j_a}$. In particular, $V_a$ is an increasing function of $\kappa$ and $\log |A^+_{j_a}|$.

Footnote: Fudenberg and Strzalecki (2015) study a recursive logit model in the context of intertemporal choice. In doing so, they consider a discount factor $\delta \in (0, 1)$. In contrast, we work on the graph $G$ without discounting.

Footnote: For the closed-form expression under the Gumbel assumption, see Train (2009), Chapter 3.

Second, when $\kappa = 0$, Eq. (3) boils down to a traditional recursive logit model in which users are choice-loving in the sense that they always prefer to add additional items to the menu, as in the "preference for flexibility" of Kreps (1979). To see this, note that when $\kappa = 0$ the function $\varphi_{j_a}(V)$ is decreasing in $|A^+_{j_a}|$. As a consequence, the recursive cost $V_a$ is decreasing in the size of $A^+_{j_a}$. This latter feature implies that traditional recursive logit models in transportation networks (e.g. Baillon and Cominetti, 2008 and Fosgerau et al., 2013) can be associated with an intrinsic taste for plentiful options.

On the other hand, the case of $\kappa \in (0, 1)$ may be interpreted, from an economic standpoint, as a situation where the users prefer to include additional alternatives in the menu, provided the new options are not too much worse than the current average. Finally, the case $\kappa \geq 1$ [...] $\kappa = 1$ captures a situation where the users want to remove choices that are worse than the average: they worry about choosing such additional alternatives by accident, given appraisal costs (such as Ortoleva (2013)'s thinking aversion) that may offset the benefits of the corresponding random draw.

In sum, the parameter $\kappa$ encapsulates the scale of penalties on the set of ensuing actions arising from each nonterminal node, which has direct consequences for users' attitudes towards marginally increasing the set of edges. Following Fudenberg and Strzalecki (2015), we identify $\kappa$ as the users' choice aversion parameter. We point out that all of our analysis extends to the case of node-specific choice aversion parameters $\kappa_i$ for all $i \in N$. We stress the relevance of allowing for heterogeneous $\{\kappa_i\}_{i \in N}$ in §2.4.1.

2.3. Flow allocation.

Each user is looking for an optimal path connecting $s$ and $t$. When they reach node $i \neq t$, they observe the realization of the random costs $V_a + \epsilon_a$ for all $a \in A_i^+$, and consequently choose the alternative $a \in A_i^+$ with the lowest cost.

This process is repeated at each subsequent node, giving rise to a recursive discrete choice model, where the expected flow entering node $i \neq t$ splits among the alternatives $a \in A_i^+$ according to the choice probability:

$$\mathbb{P}(a \mid A_i^+) = \mathbb{P}\left(V_a + \epsilon_a \leq V_{a'} + \epsilon_{a'} \;\; \forall a' \neq a \in A_i^+\right) \quad \forall i \neq t. \tag{4}$$

Due to Assumption 1, Eq. (4) can be rewritten as:

$$\mathbb{P}(a \mid A_i^+) = \frac{e^{-\left(c_a + \varphi_{j_a}(V) + \kappa \log |A^+_{j_a}|\right)}}{\sum_{a' \in A_i^+} e^{-\left(c_{a'} + \varphi_{j_{a'}}(V) + \kappa \log |A^+_{j_{a'}}|\right)}} \quad \forall i \neq t. \tag{5}$$

As $\kappa$ increases, the edge choice probability $\mathbb{P}(a \mid A_i^+)$ is increasingly penalized by the size of the choice set $|A^+_{j_a}|$, reflecting the effect of choice overload on a user's edge cost from nodes with large choice sets. This is a fundamental difference from the traditional recursive logit model, which assumes $\kappa = 0$ as we mentioned before.

Mathematically, the recursive process just described induces a Markov chain over the graph $G$, where the transition probabilities are given by Eq. (5). Let $x_i$ be the expected flow entering node $i$ towards sink node $t$. Then the flow received by edge $a$ is given by:

$$f_a = x_i \, \mathbb{P}(a \mid A_i^+) \quad \forall a \in A_i^+, \tag{6}$$

with $f = (f_a)_{a \in A}$ denoting the expected flow vector.

In addition, let $\hat{P} = (P_{ij})_{i,j \neq t}$ denote the restriction of the transition matrix to the set of nodes $N \setminus \{t\}$. Then the expected demand vector $x = (x_i)_{i \neq t}$ may be expressed as $x = e_s + \hat{P}^{\top} x$, which generates the following stochastic flow conservation equations:

$$x_i = \sum_{a \in A_i^-} f_a \quad \text{for all } i \neq t. \tag{7}$$

A flow vector $f$ satisfying (7) is called feasible.
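Equations (5) to (7) can be illustrated with a short sketch that turns recursive costs into edge choice probabilities and then pushes the unit mass of users through the graph in topological order. This is an assumption-laden illustration, not the paper's code: the function names `edge_probabilities` and `expected_flows` are hypothetical, the source is assumed to be the only node without incoming edges, and the values of $V_a$ below were worked out by hand from Eq. (3) for the Figure 1 network with $\epsilon = 0.1$ and $\kappa = 1$, taking $V_a = c_a$ for edges entering $t$.

```python
import math
from collections import defaultdict, deque


def edge_probabilities(edges, V):
    """Eq. (5): logit split over the outgoing edges of each node."""
    outgoing = defaultdict(list)
    for a, (tail, _head) in edges.items():
        outgoing[tail].append(a)
    P = {}
    for alternatives in outgoing.values():
        z = sum(math.exp(-V[a]) for a in alternatives)
        for a in alternatives:
            P[a] = math.exp(-V[a]) / z
    return P


def expected_flows(edges, P, source):
    """Eqs. (6)-(7): propagate one unit of demand from the source."""
    outgoing = defaultdict(list)
    indegree = defaultdict(int)
    for a, (tail, head) in edges.items():
        outgoing[tail].append(a)
        indegree[head] += 1
    x = defaultdict(float)
    x[source] = 1.0  # unitary mass of users at s
    f = {}
    queue = deque([source])  # assumes s is the only node with no incoming edge
    while queue:
        i = queue.popleft()
        for a in outgoing[i]:
            f[a] = x[i] * P[a]            # Eq. (6)
            head = edges[a][1]
            x[head] += f[a]               # Eq. (7)
            indegree[head] -= 1
            if indegree[head] == 0:       # all inflow to head is known
                queue.append(head)
    return dict(x), f


edges = {"a1": ("s", "i"), "a2": ("i", "t"), "a3": ("i", "t"), "a4": ("s", "t")}
V = {"a1": 1.0, "a2": 0.1, "a3": 0.1, "a4": 1.0}  # hand-computed for kappa = 1

P = edge_probabilities(edges, V)
x, f = expected_flows(edges, P, source="s")
print(P)
print(f)
```

The dictionary `x` also records the flow arriving at the sink, which is a convenient sanity check (it should equal the unit mass) even though the text defines $x$ only for $i \neq t$.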
It is worth mentioning that there exists a unique flow vector $x^*$ satisfying the flow constraints (7). In fact, using Baillon and Cominetti (2008, Lemma 1) it is possible to show that $[I - \hat{P}^{\top}]^{-1}$ is well defined. Then $x^*$ is the unique vector that satisfies $x^* = [I - \hat{P}^{\top}]^{-1} e_s$ and $f_a = x_i^* \, \mathbb{P}(a \mid A_i^+)$ for all $a \in A_i^+$, $i \neq t$.

2.4. Path choice and choice aversion.

In this section we note that the solution of our recursive choice model can be equivalently written in terms of path choice probabilities. To do so, assume that for each path $r \in \mathcal{R}$ the cost associated with it is a random variable defined as

$$\tilde{C}_r = C_r + \epsilon_r \quad \forall r \in \mathcal{R}, \tag{8}$$

where $C_r = \sum_{a \in r} \left(c_a + \kappa \log |A^+_{j_a}|\right) = \sum_{a \in r} c_a + \kappa \sum_{a \in r} \log |A^+_{j_a}|$ and $\{\epsilon_r\}_{r \in \mathcal{R}}$ is a collection of absolutely continuous random variables satisfying Assumption 1.

Under these conditions, the probability of choosing path $r$ is defined as:

$$\mathbb{P}_r \triangleq \mathbb{P}\left(r = \arg\min_{r' \in \mathcal{R}} \{C_{r'} + \epsilon_{r'}\}\right) \quad \forall r \in \mathcal{R}. \tag{9}$$

Equations (8) and (9) jointly define a path choice model over $\mathcal{R}$, where we again refer to the Gumbel assumption to obtain:

$$\mathbb{P}_r = \frac{e^{-C_r}}{\sum_{r' \in \mathcal{R}} e^{-C_{r'}}} \quad \forall r \in \mathcal{R}. \tag{10}$$

It is well known that the path choice probability $\mathbb{P}_r$ can be decomposed in terms of the edge probabilities (e.g., Fosgerau et al. (2013) and Lorca and Melo (2020)). Formally, for each path $r = (a_1, \ldots, a_K) \in \mathcal{R}$ with $K \geq 2$, the following equality holds:

$$\mathbb{P}_r = \mathbb{P}(a_1 \mid A_s^+) \prod_{k=1}^{K-1} \mathbb{P}(a_{k+1} \mid A^+_{j_{a_k}}). \tag{11}$$

The previous characterization will play a key role in the next sections. Intuitively, Eq. (11) establishes that $\mathbb{P}_r$ can be decomposed in terms of the recursive choice probabilities. This equivalence allows us to highlight the role and effect of the terms $\kappa \log |A^+_{j_a}|$ in the path choice probabilities $\mathbb{P}_r$.

2.4.1. Heterogeneous $\kappa$. As we mentioned earlier, holding $\kappa$ fixed across nodes is not a necessity in our setup. In fact, to model user behavior accurately, it is sensible to reevaluate the assumption of a homogeneous choice aversion parameter when considering a variety of transportation network contexts. In many cases, a user might be more sensitive to choice overload at one particular node relative to another. For instance, in a context where nodes represent locations along routes where a user must choose which direction to turn, nodes may differ across characteristics like visibility, intersection type, or lane width. If we allow the choice aversion parameter to be node-specific, then we update Eq. (8) such that $C_r = \sum_{a \in r} \left(c_a + \kappa_{j_a} \log |A^+_{j_a}|\right) = \sum_{a \in r} c_a + \sum_{a \in r} \kappa_{j_a} \log |A^+_{j_a}|$. As a result, the path choice probability $\mathbb{P}_r$ for each path $r \in \mathcal{R}$ takes the form

$$\mathbb{P}_r = \frac{e^{-c_r - \rho_r}}{\sum_{r' \in \mathcal{R}} e^{-c_{r'} - \rho_{r'}}} \quad \text{for all } r \in \mathcal{R}, \tag{12}$$

where $c_r = \sum_{a \in r} c_a$ and $\rho_r \triangleq \sum_{a \in r} \kappa_{j_a} \log |A^+_{j_a}|$. From here, we note that the path choice probability above simplifies to Eq. (10) in the special case of $\kappa_{j_a} = \kappa \geq 0$ for all $j_a \neq t$.

3. Choice aversion and the IIA property

In the context of path choice models, it is well known that the traditional MNL model is restricted by the IIA property, which does not hold in the context of route choice due to the overlapping paths problem. The main implication of overlapping paths is that the traditional MNL produces unrealistic path choice probabilities (Ben-Akiva and Ramming and Ben-Akiva and Bierlaire (1999)).

Footnote (to §2.4.1): It is important that we leave out traffic congestion as a characteristic to justify heterogeneous choice aversion parameters across nodes. In this paper we are only studying uncongested networks with fixed costs along each edge. However, it is certainly sensible that heavily congested intersections induce greater anxiety for many real-world drivers, and choice overload may indeed influence drivers to make routing choices that steer clear of these intersections. We note this in Section 6.

In order to solve this problem, the route choice literature has proposed the Path Size Logit (PSL) approach. In simple terms, PSL models extend the MNL by adding a correction term to path costs which accounts for the degree of overlapping among paths. For instance, Ben-Akiva and Lerman (1985), Ben-Akiva and Bierlaire (1999), Frejinger and Bierlaire (2007), and recently Duncan et al. (2020) propose different corrections to the MNL in order to solve the overlapping problem. From a behavioral point of view, PSL models try to correct the fact that the IIA property should be relaxed in contexts where paths are not distinct or independent.

In this section, we show how choice aversion can be seen as a natural mechanism that overcomes the problem of overlapping paths in transportation networks.

Assuming $\kappa_i = \kappa$ for all $i \neq s, t$, let us rewrite Eq. (11) in §2.4 as

$$\mathbb{P}_r = \frac{e^{-c_r - \kappa \gamma_r}}{\sum_{r' \in \mathcal{R}} e^{-c_{r'} - \kappa \gamma_{r'}}} \quad \text{for all } r \in \mathcal{R}, \tag{13}$$

where $c_r = \sum_{a \in r} c_a$, $\gamma_r \triangleq \sum_{a \in r} \log |A^+_{j_a}|$, and $\kappa \geq 0$.

Footnote: Note that equation (12) simplifies to (13) when the choice aversion parameters are fixed across nodes. In this case, $\rho_r = \kappa \gamma_r$.

It is easy to see that for $\kappa > 0$, the term $\kappa \gamma_r$ can be seen as a penalty term that accounts for the size of the choice set at each of the nodes accessed along path $r$. Formally, the term $\kappa \gamma_r$ accounts for the degree of overlapping among different paths. In particular, the presence of $\kappa \gamma_r$ accounts for the fact that, in our recursive model, the IIA property does not hold.

Footnote: Note that paths passing through a common node $i$ would share the penalty $\kappa \log |A_i^+|$.

To see how our model works, consider two paths $r$ and $r'$ with associated probabilities $\mathbb{P}_r$ and $\mathbb{P}_{r'}$ respectively. Computing the probability ratio between $r$ and $r'$ we get:

$$\frac{\mathbb{P}_r}{\mathbb{P}_{r'}} = \frac{e^{-c_r}}{e^{-c_{r'}}} \times \frac{e^{-\kappa \gamma_r}}{e^{-\kappa \gamma_{r'}}}. \tag{14}$$

Expression (14) shows that the ratio $\mathbb{P}_r / \mathbb{P}_{r'}$ is the product of two factors: the ratio $e^{-c_r} / e^{-c_{r'}}$, which depends only on the costs of paths $r$ and $r'$, and the ratio $e^{-\kappa \gamma_r} / e^{-\kappa \gamma_{r'}}$. This latter factor incorporates information about the choice sets at each node crossed by $r$ and $r'$, captured by the terms $\kappa \gamma_r$ and $\kappa \gamma_{r'}$.

From (14) it follows that adding or deleting links at a particular node crossed by $r$ (or $r'$) will affect the ratio $e^{-\kappa \gamma_r} / e^{-\kappa \gamma_{r'}}$, and as a consequence $\mathbb{P}_r / \mathbb{P}_{r'}$ will be modified. In other words, $\mathbb{P}_r / \mathbb{P}_{r'}$ depends not only on $r$ and $r'$, but also on other paths passing through the same nodes as $r$ and $r'$ do. Note that when $\kappa = 0$, then $\mathbb{P}_r / \mathbb{P}_{r'} = e^{-c_r} / e^{-c_{r'}}$ and we recover the IIA property of the MNL model. Similarly, if paths $r$ and $r'$ pass through the same nodes, we get $\kappa \gamma_r = \kappa \gamma_{r'}$, so that $\mathbb{P}_r / \mathbb{P}_{r'} = e^{-c_r} / e^{-c_{r'}}$.
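The closed form (13) and the ratio (14) are easy to check numerically. The sketch below is illustrative only: the function name `path_probabilities` and the path labels `r1`, `r2`, `r3` are hypothetical, the network is the one in Figure 1 of the Introduction (each path has cost 1), and $\gamma_r$ is computed under the assumption that an edge entering the sink contributes nothing to the sum, so only node $i$ with its two outgoing edges is penalized.

```python
import math


def path_probabilities(path_costs, gammas, kappa):
    """Eq. (13): logit over paths with cost c_r + kappa * gamma_r."""
    weights = {
        r: math.exp(-path_costs[r] - kappa * gammas[r]) for r in path_costs
    }
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}


# Hypothetical labels: r1 = (a1, a2), r2 = (a1, a3), r3 = (a4).
path_costs = {"r1": 1.0, "r2": 1.0, "r3": 1.0}
# r1 and r2 cross node i, which has two outgoing edges; r3 goes straight to t.
gammas = {"r1": math.log(2), "r2": math.log(2), "r3": 0.0}

mnl = path_probabilities(path_costs, gammas, kappa=0.0)
averse = path_probabilities(path_costs, gammas, kappa=1.0)

# Ratio in Eq. (14) between an overlapping path and the direct path.
ratio_13 = averse["r1"] / averse["r3"]
# Paths through the same nodes keep an IIA-style ratio for any kappa.
ratio_12 = averse["r1"] / averse["r2"]

print(mnl)
print(averse)
print(ratio_13, ratio_12)
```

With $\kappa = 0$ the three paths are equally likely, which is the MNL outcome criticized in the Introduction. With $\kappa = 1$ the ratio between an overlapping path and the direct path is $e^{-\log 2} = 1/2$, which yields the allocation $(1/4, 1/4, 1/2)$, while the ratio between the two overlapping paths stays at 1.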
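The factor $e^{-\kappa \gamma_r} / e^{-\kappa \gamma_{r'}}$ in Eq. (14) thus captures the degree of overlapping between different paths.

In order to see how our model overcomes the overlapping problem, we study a concrete case. Let us reintroduce the network in Figure 2, where the set of paths is given by $\mathcal{R} = \{r_1, r_2, r_3\}$ with $r_1 = (a_1, a_2)$, $r_2 = (a_1, a_3)$, and $r_3 = (a_4)$. For this example, we assume $c_{a_1} = 1.9$, $c_{a_2} = c_{a_3} = 0.1$, and $c_{a_4} = 2$.

Figure 2. Logit path choice. Network with nodes $s$, $i$, $t$ and edges $a_1, a_2, a_3, a_4$.

As we have previously discussed for this network structure, paths $r_1$ and $r_2$ overlap, sharing the common edge $a_1$. Under this parameterization of instantaneous costs, it follows that $c_{r_1} = c_{r_2} = c_{r_3} = 2$, and, consequently, with $\kappa = 0$ the logit choice rule (13) assigns one third of the flow to each path. In other words, with $\kappa = 0$, we get $\mathbb{P}_{r_1} = \mathbb{P}_{r_2} = \mathbb{P}_{r_3} = \frac{1}{3}$. However, since paths $r_1$ and $r_2$ are identical in cost and differ only in their final edge, the assignment $\mathbb{P}_{r_1} = \mathbb{P}_{r_2} = \frac{1}{4}$ and $\mathbb{P}_{r_3} = \frac{1}{2}$ is a more appropriate allocation.

As $\kappa \to 1$, the logit model with choice aversion predicts a flow allocation approaching $\left(\frac{1}{4}, \frac{1}{4}, \frac{1}{2}\right)$. To understand this, we look at the probability ratios between paths $r_1$, $r_2$, and $r_3$:

$$\frac{\mathbb{P}_{r_1}}{\mathbb{P}_{r_2}} = 1 \quad \text{and} \quad \frac{\mathbb{P}_{r_1}}{\mathbb{P}_{r_3}} = \frac{\mathbb{P}_{r_2}}{\mathbb{P}_{r_3}} = e^{-\kappa \log 2}. \tag{15}$$

From the previous expression, it follows that the value of $\kappa$ will affect the ratios $\mathbb{P}_{r_1} / \mathbb{P}_{r_3}$ and $\mathbb{P}_{r_2} / \mathbb{P}_{r_3}$, but not $\mathbb{P}_{r_1} / \mathbb{P}_{r_2}$. In particular, Eq. (15) shows that the ratios $\mathbb{P}_{r_1} / \mathbb{P}_{r_3}$ and $\mathbb{P}_{r_2} / \mathbb{P}_{r_3}$ are decreasing in $\kappa$. In other words, as the degree of choice aversion increases, the probabilities $\mathbb{P}_{r_1}$ and $\mathbb{P}_{r_2}$ decrease while the probability associated with $r_3$ increases. At $\kappa = 1$ both ratios equal $\frac{1}{2}$, which, together with the probabilities summing to one, gives exactly $\left(\frac{1}{4}, \frac{1}{4}, \frac{1}{2}\right)$.

Figure 3 shows how the route choice probabilities in Figure 2 respond as $\kappa$ increases from 0 under choice aversion.

Now assume that a new edge $\hat{a}$ is added at node $i$. This implies that the new set of paths is $\hat{\mathcal{R}} = \mathcal{R} \cup \{\hat{a}\}$. In terms of Eq. (15), adding $\hat{a}$ implies that:

$$\frac{\mathbb{P}_{r_1}}{\mathbb{P}_{r_2}} = 1 \quad \text{and} \quad \frac{\mathbb{P}_{r_1}}{\mathbb{P}_{r_3}} = \frac{\mathbb{P}_{r_2}}{\mathbb{P}_{r_3}} = e^{-\kappa \log 3}.$$

This latter expression makes explicit the fact that changes in $\mathcal{R}$ will change the probability ratio between different paths.

3.1. Choice aversion compared to PSL models.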

What the analysis just laid out shows (and it applies to the general case of directed networks) is that, from the vantage point of path selection, choice aversion is a robust way to derive path choice probabilities, even when different routes overlap. This robustness feature makes our approach similar to the class of PSL models, which is widely used in applied work (e.g. Duncan et al. (2020)).

Figure 3. Route Choice Probabilities for Figure 2.

In this section we compare our approach with some of the best-known PSL models. Figure 4 displays a network topology originally featured in Fosgerau et al. (2013). We test the performance of the choice aversion model on this network in calculating path choice probabilities. We assume that this is a directed acyclic graph, where the set of paths is given by $\mathcal{R} = \{r_1, r_2, r_3, r_4\}$ with $r_1 = (12, \ldots)$, $r_2 = (12, \ldots, \ldots)$, $r_3 = (12, \ldots)$, and $r_4 = (15)$.

Footnote: For ease of exposition we provide the details of the PSL models discussed in this section in Appendix A.

Figure 4. Complex network example from Fosgerau et al. (2013).

This graph represents a more complex uncongested network topology where the costs of all routes $r_i \in \mathcal{R}$ are equal. Thus, the only difference among routes 1 through 4 is the choice sets at each node along the path. The MNL model (equivalent to $\kappa = 0$ in the choice aversion model) predicts equal path choice probabilities, i.e., $\mathbb{P}_{r_i} = \frac{1}{4}$ for $i = 1, 2, 3, 4$. Figure 5(A) shows the path choice probabilities of the choice aversion model as $\kappa$ increases from 0 to 10. As the choice aversion penalization grows larger, $\mathbb{P}_{r_4}$ approaches 1, since $r_4$ is the only route with no choice set for the user after node 1. On the other hand, among the remaining three routes, two have equivalent choice aversion terms, while the third has the advantage of lacking an additional downstream choice; for $\kappa > 0$ the third is therefore chosen with strictly higher probability than the other two, which are equally likely.

Figure 5. Route choice probabilities for complex network example. (A) Choice aversion model. (B) Adaptive PSL model.

The prediction in Figure 5(A) differs from route choice probabilities generated by many PSL models and extensions, including the models discussed in Fosgerau et al. (2013) and the Adaptive PSL model proposed by Duncan et al. (2020). This latter model is shown in Figure 5(B). For these models, the path correction cost is based on the correlation of routes through link-path incidence rather than a penalization for the size of choice sets along the path. As a result, for most PSL models, including the Adaptive PSL model, the pair of routes that receive equal probability for all values of $\beta$ differs from the pair that receive equal probability in the choice aversion model.

The behavioral nature of the choice aversion model allows for a different type of correction than most PSL models. The choice aversion model and PSL models work in a similar way in the sense of overcoming the overlapping path problem. However, the choice aversion model has a simple and clear behavioral interpretation. This feature sets the choice aversion model apart from PSL models both in terms of interpretation and performance.

3.2. RNL and IIA.

In this section we compare our approach with the Recursive Nested Logit (RNL) model (Mai et al. (2015)). Similar to our approach, the RNL does not impose the IIA property. In particular, in order to allow for correlation among paths, Mai et al. (2015) extend the recursive logit model by allowing the scale parameter $\mu_a$ of the Gumbel-distributed random variables $\{\epsilon_a\}_{a \in A}$ to be link-specific (in contrast with our Assumption 1). Under this more general assumption, Mai et al. (2015) show that the RNL allows for situations where the IIA property does not hold and that it generates more realistic path choice probabilities.

Formally, the main difference between the recursive logit and the RNL model is that the continuation values $\hat{\varphi}$ are defined as

$$\hat{\varphi}_{j_b}(V) = \mathbb{E}\left( \min_{a \in A^+_{j_b}} \left\{ c_a + \hat{\varphi}_{j_a} + \mu_{j_b} \epsilon_a \right\} \right) \quad \forall b \in A. \tag{16}$$

From (16) it is easy to see that the RNL allows for heterogeneous scale parameters $\mu_{j_b}$ that are edge-specific. This modification allows for correlation between alternatives and payoffs. Mai et al. (2015) show that the IIA property holds for paths within the same nest but not for paths in different nests. More importantly, they show that the RNL can be seen as a solution to the overlapping paths problem.

Given the structure of the RNL, we can extend our choice aversion model to this context. To see this, note that Eq. (16) can be rewritten as:

$$\hat{\varphi}_{j_b}(V) = \mathbb{E}\left( \min_{a \in A^+_{j_b}} \left\{ c_a + \hat{\varphi}_{j_a} + \mu_{j_b} \epsilon_a \right\} \right) + \kappa \log |A^+_{j_b}| \quad \forall b \in A.$$

Thus choice aversion can be combined with the RNL approach in a simple way.

However, there are at least two important differences between the choice aversion and the RNL models. First, in our approach, the behavioral mechanism that allows for situations where IIA does not hold is the idea of choice aversion (or choice overload). Second, as we shall see in Section 4, our model allows for the failure of the regularity property in the path choice probabilities. In this sense, our approach can be seen as more flexible than the RNL model.

4. Choice aversion and the failure of regularity

In the standard MNL ($\kappa = 0$), adding an alternative to the choice set cannot increase the probability that an existing action is selected (and, conversely, removing an alternative cannot decrease it). This is known as the regularity property in discrete choice models (Luce and Suppes (1965, Def. 26)).

In this section, we show that there exists a critical value of $\kappa$ which allows us to understand how varying the network $G$ can generate violations of regularity. In order to gain some intuition, we discuss a simple network that shows how regularity may break down. In particular, we study one of the nested network structures considered in Mai et al. (2015). Figure 6 replicates their Figure 3. This network has four nodes $\{A, B, C, D\}$ and eight links between the source node $A$ and sink node $D$. There are six possible paths from $o$ to $d$: $(o, a, a_1, d)$, $(o, a, a_2, d)$, $(o, a, a_3, d)$, $(o, b, b_1, d)$, $(o, b, b_2, d)$, and $(o, b, b_3, d)$. We denote these paths by $r_1$, $r_2$, $r_3$, $r_4$, $r_5$, and $r_6$, respectively.

Figure 6. Nested network structure from Mai et al. (2015).

Tables 1 and 2 display route choice probabilities when links $a_1$, $a_2$, $b_1$, and $b_2$ are removed from the nested network in Figure 6. Table 1 shows the results when choice parameters are homogeneous (i.e., $\kappa_B = \kappa_C = \kappa = 1$), and Table 2 shows the results when $\kappa_C = 2$, all else held equal. In both tables, the last four columns report route choice probabilities when the indicated edge is removed, with the percentage change relative to the first column in parentheses.

| Route | Homogeneous $\kappa$ | $a_1$ removed | $a_2$ removed | $b_1$ removed | $b_2$ removed |
|---|---|---|---|---|---|
| $r_1 = (o, a, a_1, d)$ | 0.4485 | - | 0.6174 (38%) | 0.4185 (-7%) | 0.4429 (-1%) |
| $r_2 = (o, a, a_2, d)$ | 0.1650 | 0.3726 (126%) | - | 0.1539 (-7%) | 0.1629 (-1%) |
| $r_3 = (o, a, a_3, d)$ | 0.0607 | 0.1371 (126%) | 0.0836 (38%) | 0.0566 (-7%) | 0.0599 (-1%) |
| $r_4 = (o, b, b_1, d)$ | 0.0607 | 0.0914 (51%) | 0.0557 (-8%) | - | 0.0899 (48%) |
| $r_5 = (o, b, b_2, d)$ | 0.1001 | 0.1506 (51%) | 0.0918 (-8%) | 0.1401 (40%) | - |
| $r_6 = (o, b, b_3, d)$ | 0.1650 | 0.2484 (51%) | 0.1514 (-8%) | 0.2309 (40%) | 0.2444 (48%) |

Table 1. Route Choice Probabilities for Figure 6 with $\kappa_B = \kappa_C = \kappa = 1$ and Edges Removed.

| Route | $\kappa_C = 2$ | $a_1$ removed | $a_2$ removed | $b_1$ removed | $b_2$ removed |
|---|---|---|---|---|---|
| $r_1 = (o, a, a_1, d)$ | 0.5730 | - | 0.7712 (35%) | 0.5138 (-10%) | 0.5317 (-7%) |
| $r_2 = (o, a, a_2, d)$ | 0.2108 | 0.5535 (163%) | - | 0.1890 (-10%) | 0.1956 (-7%) |
| $r_3 = (o, a, a_3, d)$ | 0.0775 | 0.2036 (163%) | 0.1044 (35%) | 0.0695 (-10%) | 0.0720 (-7%) |
| $r_4 = (o, b, b_1, d)$ | 0.0258 | 0.0453 (75%) | 0.0232 (-10%) | - | 0.0540 (109%) |
| $r_5 = (o, b, b_2, d)$ | 0.0426 | 0.0746 (75%) | 0.0382 (-10%) | 0.0860 (102%) | - |
| $r_6 = (o, b, b_3, d)$ | 0.0703 | 0.1230 (75%) | 0.0630 (-10%) | 0.1417 (102%) | 0.1467 (109%) |

Table 2. Route Choice Probabilities for Figure 6 with $\kappa_C = 2$ and Edges Removed.

From Table 1, we note that when link $a_1$ is removed, the choice probabilities of the remaining paths $\{r_2, \ldots, r_6\}$ increase. The probabilities of paths $r_2$ and $r_3$ increase in the same proportion (126%). Similarly, the probabilities of paths $r_4$, $r_5$, and $r_6$ also increase in the same proportion (51%). Each increase is even more pronounced in Table 2, where $\kappa_C = 2$. However, in both tables, the increase is not proportional across paths crossing different nodes. This feature comes from the IIA property, which holds within nodes (nests) but not across them.

Now, consider the case of removing edge $a_2$. In Table 1, the probabilities of paths $r_1$ and $r_3$ increase proportionally (38%). However, the probabilities of paths $r_4$, $r_5$, and $r_6$ actually decrease. This counterintuitive result is a consequence of the fact that removing edge $a_2$ not only reduces the number of available paths but also decreases the cost associated with paths $r_1$ and $r_3$. In other words, the effect can be decomposed into two parts. First, the IIA property implies that the probabilities of $r_1$, $r_3$, $r_4$, $r_5$, and $r_6$ will increase upon removing edge $a_2$. Second, removing edge $a_2$ reduces the choice set faced when taking paths $r_1$ and $r_3$. For a choice averse user, this latter effect implies that $\kappa \log 3$ reduces to $\kappa \log 2$, which makes paths $r_1$ and $r_3$ relatively more attractive than $r_4$, $r_5$, and $r_6$, such that the probabilities of $r_4$, $r_5$, and $r_6$ decrease. When this second effect dominates, we observe the failure of regularity associated with removing edge $a_2$.

It may be tempting to think that the effect on path choices described above is driven by the assumption that $\kappa$ is homogeneous. However, Table 2 shows that a similar pattern occurs when we consider $\kappa_B = 1$ and $\kappa_C = 2$.
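The two effects described above can be checked numerically against Table 1. The following is a minimal sketch, not the authors' code: it assumes the baseline probabilities reported in the homogeneous column of Table 1, multiplies the weight of the two surviving paths through the affected node by $(3/2)^{\kappa}$ (the penalty falling from $\kappa \log 3$ to $\kappa \log 2$), and renormalizes. The function name `remove_edge` and the dictionary layout are illustrative choices.

```python
# Baseline path probabilities from the homogeneous column of Table 1 (kappa = 1).
baseline = {"r1": 0.4485, "r2": 0.1650, "r3": 0.0607,
            "r4": 0.0607, "r5": 0.1001, "r6": 0.1650}


def remove_edge(probs, removed, same_node, kappa, n_before):
    """Path probabilities after deleting the edge used by path `removed`.

    `same_node` holds the surviving paths through the affected node; their
    choice aversion penalty drops from kappa*log(n_before) to
    kappa*log(n_before - 1). All other paths keep their original weight.
    """
    boost = (n_before / (n_before - 1)) ** kappa
    weights = {r: p * (boost if r in same_node else 1.0)
               for r, p in probs.items() if r != removed}
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}


# Deleting the edge of r2: r1 and r3 still pass through the same node.
after = remove_edge(baseline, "r2", {"r1", "r3"}, kappa=1.0, n_before=3)
for r in sorted(after):
    change = 100 * (after[r] / baseline[r] - 1)
    print(f"{r}: {after[r]:.4f} ({change:+.0f}%)")
```

Under these assumptions the output agrees, up to rounding of the baseline values, with the column of Table 1 in which the second edge is removed: the two surviving paths through the affected node gain probability, while the three paths through the other node lose it.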
With heterogeneous parameters, the failure of regularity is even more pronounced, with relatively larger changes to the remaining path probabilities.

We formalize the previous intuition in Proposition 1 below. Recall that $\mathcal{R}_i$ is the set of paths passing through node $i$. Similarly, the set of paths not passing through node $i$ is denoted by $\mathcal{R}_i^c$. In addition, define $\mathcal{R}_{ia}$ as the set of paths passing through node $i$ after removing edge $a$ at node $i$. We note that $\mathcal{R}_{ia} \subseteq \mathcal{R}_i$ and that the set $\mathcal{R}_i^c$ is the same before and after removing an edge.

After the edge $a$ at node $i$ is removed, the cost of paths in $\mathcal{R}_{ia}$ can be expressed as

$$\bar{C}_r = C_r + \Delta_i \quad \forall r \in \mathcal{R}_{ia},$$

where $\Delta_i \triangleq \kappa_i \left( \log(|A_i^+| - 1) - \log |A_i^+| \right)$. Note that $\Delta_i$ is constant across all paths in $\mathcal{R}_{ia}$. Let $P(\mathcal{R}_i) = \sum_{r \in \mathcal{R}_i} P_r$ and $P(\mathcal{R}_{ia}) = \sum_{r \in \mathcal{R}_{ia}} P_r$.

Proposition 1. The probability of choosing a path $r \in \mathcal{R}_i^c$ decreases after removing an edge $a \in A_i^+$ if

$$\kappa_i > \frac{\log\left( \frac{P(\mathcal{R}_i)}{P(\mathcal{R}_{ia})} \right)}{\log\left( \frac{|A_i^+|}{|A_i^+| - 1} \right)}. \tag{17}$$

Proof. Without loss of generality, fix a path $r \in \mathcal{R}_i^c$. Let $P_r$ and $\bar{P}_r$ be the probability of choosing path $r$ before and after removing edge $a$ at node $i$, respectively. We want to show under what conditions we have $\bar{P}_r - P_r < 0$ for $r \in \mathcal{R}_i^c$.

Note that $\bar{P}_r$ can be written as:

$$\bar{P}_r = \frac{e^{-C_r}}{\sum_{l \in \mathcal{R}_{ia}} e^{-C_l - \Delta_i} + \sum_{k \in \mathcal{R}_i^c} e^{-C_k}} \quad \forall r \in \mathcal{R}_i^c.$$

Dividing the numerator and denominator by $\sum_{l \in \mathcal{R}} e^{-C_l}$, we find that:

$$\bar{P}_r = \frac{P_r}{\sum_{l \in \mathcal{R}_{ia}} P_l e^{-\Delta_i} + \sum_{k \in \mathcal{R}_i^c} P_k} \quad \forall r \in \mathcal{R}_i^c.$$

From the previous expression, it follows that $\bar{P}_r - P_r$ can be expressed as:

$$\bar{P}_r - P_r = P_r \left( \frac{1}{\sum_{l \in \mathcal{R}_{ia}} P_l e^{-\Delta_i} + \sum_{k \in \mathcal{R}_i^c} P_k} - 1 \right).$$

Since $P_r > 0$, it follows that $\bar{P}_r - P_r$ is negative iff

$$\frac{1}{\sum_{l \in \mathcal{R}_{ia}} P_l e^{-\Delta_i} + \sum_{k \in \mathcal{R}_i^c} P_k} - 1 < 0.$$

Rearranging this expression, we get

$$1 - \sum_{k \in \mathcal{R}_i^c} P_k < \sum_{l \in \mathcal{R}_{ia}} P_l e^{-\Delta_i}.$$

Using the fact that $P(\mathcal{R}_i) = \sum_{l \in \mathcal{R}_i} P_l = 1 - \sum_{k \in \mathcal{R}_i^c} P_k$, we get

$$\frac{P(\mathcal{R}_i)}{P(\mathcal{R}_{ia})} < e^{-\Delta_i}.$$

Noting that $e^{-\Delta_i} = e^{\kappa_i \log\left( |A_i^+| / (|A_i^+| - 1) \right)} = \left( \frac{|A_i^+|}{|A_i^+| - 1} \right)^{\kappa_i}$, we conclude:

$$\kappa_i > \frac{\log\left( \frac{P(\mathcal{R}_i)}{P(\mathcal{R}_{ia})} \right)}{\log\left( \frac{|A_i^+|}{|A_i^+| - 1} \right)}. \qquad \square$$

Some remarks are in order. First, Proposition 1 provides a simple condition to know when removing an edge at node $i$ can decrease the probability of paths not crossing node $i$. Condition (17) captures the fact that removing an edge at node $i$ modifies not only the set of available paths but also the choice aversion cost. Concretely, condition (17) establishes a lower bound on the parameter $\kappa_i$ in terms of the path choice probabilities $P(\mathcal{R}_i)$ and $P(\mathcal{R}_{ia})$ and the magnitudes of $|A_i^+|$ and $|A_i^+| - 1$. To the best of our knowledge, this result is new to the literature on recursive discrete choice models in transportation networks.

Second, we note from a practical point of view that in order to test whether condition (17) is satisfied, we only need the information contained in the original network. For instance, we can use the estimated probabilities $P_r$ for each $r \in \mathcal{R}_i$ (before removing edge $a$) to understand how users react to changes in the topology.

Third, Proposition 1 predicts that removing edge $a \in A_i^+$ can decrease the probability of paths passing through nodes different from $i$. We have identified this phenomenon as a failure of regularity in the sense of Luce and Suppes (1965). Behind this result is the fact that reducing the cardinality of $A_i^+$ reduces the cost associated with choice aversion.
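Condition (17) can be evaluated directly from baseline probabilities. The sketch below is illustrative rather than a reproduction of the authors' computations: it takes the homogeneous column of Table 1 as given and evaluates the right-hand side of (17) for the four edge removals reported there, identifying each removed edge by the path it closes. The helper name `kappa_threshold` and the grouping of paths by intermediate node are assumptions made for the example.

```python
import math

# Baseline probabilities (homogeneous column of Table 1, kappa = 1).
baseline = {"r1": 0.4485, "r2": 0.1650, "r3": 0.0607,
            "r4": 0.0607, "r5": 0.1001, "r6": 0.1650}
# Paths grouped by the intermediate node they cross (three outgoing edges each).
groups = [("r1", "r2", "r3"), ("r4", "r5", "r6")]


def kappa_threshold(probs, group, removed):
    """Right-hand side of condition (17) when the edge of `removed` is deleted."""
    p_before = sum(probs[r] for r in group)                 # P(R_i)
    p_after = sum(probs[r] for r in group if r != removed)  # P(R_ia)
    n = len(group)                                          # |A_i^+|
    return math.log(p_before / p_after) / math.log(n / (n - 1))


for group in groups:
    for removed in group[:2]:  # the four removals reported in Table 1
        t = kappa_threshold(baseline, group, removed)
        verdict = "met" if 1.0 > t else "not met"
        print(f"edge of {removed}: kappa > {t:.3f} ({verdict} at kappa = 1)")
```

With the Table 1 values, the condition is met at $\kappa = 1$ for three of the four removals and not for the edge of $r_1$, in line with the signs of the changes in paths through the other node reported in Table 1.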
The reduction in the cost of choice aversion can outweigh the impact of reducing the number of paths available. In the context of rational inattention, Matějka and McKay (2015) have shown that regularity may fail in the logit model. Our result is different in two respects. First, we study a recursive logit model with choice aversion in the context of transportation networks. Second, our result highlights the role of choice aversion by providing a specific condition on $\kappa_i$, whereas Matějka and McKay (2015) use the idea of information acquisition in order to derive their result. Finally, we mention that a particular case of Proposition 1 is when $\kappa_i = \kappa$ for all $i$.

In order to see how Proposition 1 applies in the concrete case of the network in Figure 6, Table 3 summarizes the conditions obtained after removing edges $a_1$, $a_2$, $b_1$, and $b_2$, respectively. The main message from this table is the simplicity of checking Proposition 1.

| Edge Removed | Condition |
|---|---|
| $a_1$ | $\kappa >$ … |
| $a_2$ | $\kappa >$ … |
| $b_1$ | $\kappa >$ … |
| $b_2$ | $\kappa >$ … |

Table 3. Conditions for regularity failure in Figure 6.

5. Welfare Analysis and Braess's Paradox

In previous sections we have shown how the recursive choice aversion model corrects the problem of predicting routing behavior when there are overlapping paths. Similarly, we have shown how this model may generate violations of regularity when some edge of the network is removed.

In this section, we show how choice aversion can capture changes to users' welfare when the network topology is modified. Formally, we make two contributions. First, we show that, under the presence of choice aversion, adding edges to the network can decrease users' welfare. In particular, we show how a type of Braess's paradox (Braess (1968) and Braess et al. (2005)) can emerge even in the case of uncongested networks. Second, we compare our approach with the APSL model in terms of its ability to capture changes in welfare.

Footnote: We note that the analysis of Matějka and McKay (2015) has been extended to the general class of additive random utility models by Fosgerau et al.

5.1. Welfare.

Following McFadden (1981, Ch. 5), we define the users' welfare as follows:

$$C(\kappa) \triangleq \mathbb{E}\left( \min_{r \in \mathcal{R}} \{ C_r + \epsilon_r \} \right) = -\log\left( \sum_{r \in \mathcal{R}} e^{-C_r} \right), \tag{18}$$

where the last equality follows from Assumption 1. Notice that this definition exploits the equivalence in Eq. (11) and makes explicit the dependence of user welfare on the choice aversion parameter $\kappa$.

Following the literature on discrete choice models, expression (18) can be interpreted as the inclusive value of paths in $\mathcal{R}$, which is equivalent to saying that $C(\kappa)$ measures the inclusive value of the source node $s$. In particular, $C(\kappa)$ represents the expected cost faced by the network users.

Since $\kappa$ enters each path cost $C_r$ through a nonnegative penalty, it is easy to show that $C(\kappa)$ is increasing in $\kappa$ (see also Proposition 3 below). Similarly, it can be shown that $C(\kappa)$ is increasing in $c_a$ for $a \in A$. In particular, Lorca and Melo (2020, Prop. 4) show that $\frac{dC(\kappa)}{dc_a} = x_i P(a \mid A_i^+)$.

The goal of this section is to use $C(\kappa)$ to quantify changes in welfare in response to adding or deleting edges in the network $G$. To that end, we now connect $\kappa$ with changes in $C(\kappa)$ when the network $G$ is modified. Formally, we have the following:

Proposition 2.

Fix a node $i \neq s, t$. Suppose that a new link $a'$ is added to node $i$. Then $C(\kappa)$ decreases if and only if the following condition holds:

$$\kappa < \frac{\log\left( 1 - P(a' \mid A_i^+ \cup \{a'\}) \right)}{\log\left( \frac{|A_i^+|}{|A_i^+| + 1} \right)}. \tag{19}$$

The previous result is a restatement of Lorca and Melo (2020, Thm. 1). Its relevance comes from the fact that it gives a clear way to understand when adding an edge to the network is welfare-improving as a function of the value of $\kappa$. In particular, for a given network, Eq. (19) can be easily checked.

Consequently, the choice aversion model predicts that there exists a range of values for $\kappa$ where the addition of costless edges can lead to a decrease in welfare, even if the cost of the newly created routes is lower than that of existing routes. Thus, Braess's Paradox-like phenomena may emerge even in the case of uncongested transportation networks.

From an empirical point of view, the right-hand side of Eq. (19) can be estimated, providing a simple test of whether modifications to the network are welfare improving.

5.1.1. Welfare and PSL models.

In addition to computing changes in welfare using the choice aversion model, we also compute changes in welfare using several PSL models. For this class of models the welfare is defined as follows:

$$\hat{C}(\theta) \triangleq \mathbb{E}\left( \min_{r \in \mathcal{R}} \{ \hat{C}_r + \epsilon_r \} \right) = -\log\left( \sum_{r \in \mathcal{R}} e^{-\hat{C}_r} \right), \tag{20}$$

where $\theta$ is a parameter vector describing the specific PSL model and $\hat{C}_r$ represents the adjusted cost after applying the respective correction.

It is worth pointing out that the PSL models are not designed to capture changes in welfare, so Eq. (20) should be interpreted as an adapted welfare measure. The reason to consider these measures is to compare the welfare changes computed with traditional PSL models against those computed with the choice aversion model.

5.2. Braess's Paradox.

Traditionally, Braess's paradox is studied as the result of a congestion game in a transportation network. In this context, Braess's paradox predicts that introducing additional edges with zero cost can actually lead to a greater total network cost, and therefore lower welfare, than without the additional edges. This counterintuitive result relies on the fact that users in the transportation network are selfish (Roughgarden (2016)). As a consequence, adding edges to the network can make everybody worse off in the system.

Figure 7. Braess's paradox ensuing from choice aversion. (A) Two unconnected paths. (B) Both paths connected by $a_5$.

However, the choice aversion model reveals that welfare decreases can arise naturally even when networks are uncongested, through increasing choice set cardinality. To see this more explicitly, consider the parameterization of Figure 7(A) in which two of the four links have cost $x$, with $x \in [0, \ldots]$, and the other two have cost 1. This figure depicts the case of a simple parallel serial link network where the set of paths is given by $\mathcal{R}_A = \{r_1, r_2\}$, each path consisting of two links.

Footnote: Note that in the case of uncongested networks, users do not get involved in strategic interaction.

Figure 7(B) shows the network with edge $a_5$ added to the directed acyclic graph, connecting the two intermediate nodes. For the purpose of observing Braess's Paradox, we set $c_{a_5} = 0$. The set of paths is given by $\mathcal{R}_B = \mathcal{R}_A \cup \{r_3\}$, where $r_3$ is the three-link path that uses $a_5$. We calculate the welfare according to (18) for 7(A) and 7(B) and compare the difference.

According to Proposition 2, we can compute the threshold for $\kappa$ that determines when adding links is welfare improving. In particular, adding a link is welfare-improving when

$$\kappa < \frac{\log\left( 1 - \frac{e^{-x}}{e^{-1} + e^{-x}} \right)}{\log(1/2)}.$$

After some algebra, we can express a relationship between $\kappa$ and $x$ as follows:

$$\kappa < -\log\left( \frac{e^{x-1}}{e^{x-1} + 1} \right) / \log 2. \tag{21}$$

From (21) it is easy to see that when $x = 0$ adding an edge is welfare improving when $\kappa < 1.89$, approximately. Similarly, when $x = 1$, welfare increases when $\kappa < 1$. In particular, Eq. (21) shows that there is an inverse relationship between $\kappa$ and $x$.

Based on this observation, we show how $C(\kappa)$ changes when $x$ varies for a constant $\kappa$. Figure 8 shows the welfare change for the choice aversion model with $\kappa \in \{1, 2\}$ as well as for the MNL model and the other PSL models and extensions mentioned in Section 3.

The choice aversion model with $\kappa = 1$ matches this characterization, reflecting a welfare improvement for $x < 1$ and a welfare loss for $x > 1$.

Figure 8.

Welfare change from network Figure 7(A) to 7(B) for various logit models.

In contrast, the MNL model only reflects a nonnegative change in welfare from adding edge $a_5$, a symptom of a model where additional gains in welfare are realized from any additional route added to the choice set of the agent. From a theoretical standpoint, this feature of the MNL model captures the preference for flexibility in users' preferences (Kreps (1979)).

Interestingly, most PSL models and extensions shown in Figure 8 reflect welfare changes similar to each other: an initial welfare gain which decreases in $x$, ultimately leading to a decrease in welfare.

However, the Adaptive PSL model of Duncan et al. (2020) breaks from the welfare change pattern observed in other models. For this model, the difference in welfare is positive and decreasing until $x = 1$, but for $x > 1$ the change in welfare is zero. This is a compelling result which speaks to the strengths of this PSL extension.

Footnote: To understand why the APSL differs from traditional PSL models, we note that $c_{r_3} \le c_{r_1} = c_{r_2}$ when $x \le 1$, with the equality holding only for $x = 1$. Here, it is reasonable that we would observe an increase in welfare when $a_5$ is added to the network: for an uncongested network, we are adding a cheaper route choice for users, and it is intuitive that this would improve welfare. Indeed, this result is consistent with most models, as shown in Figure 8. However, for $x > 1$, where $c_{r_3} > c_{r_1} = c_{r_2}$, the APSL model reflects no decrease in welfare. Again, this is intuitive: when $x > 1$, adding the costless edge $a_5$ does not provide users with a cheaper route choice. Since users would incur lower costs from choosing routes $r_1$ or $r_2$, the APSL model considers $r_3$ to be an irrelevant addition to the choice set, one that neither contributes to nor detracts from welfare.

As pointed out before, an important property of the choice aversion model is its microfoundation in the behavioral concept of choice overload, which provides justification for the penalization term in the cost function as well as for the observed outcomes. While the welfare difference observed for the choice aversion model with $\kappa = 1$ is similar to that of other PSL models and extensions, this microfoundation sets the choice aversion model apart from other models.

It is important to note the contrast in welfare change between $\kappa = 1$ and $\kappa = 2$ for the choice aversion model. For $\kappa = 2$, Figure 8 clearly shows that there is no value of $x > 0$ for which adding the edge improves welfare at this $\kappa$. This contrast is important and draws our attention to how welfare changes as $\kappa$ varies for a fixed $x$.

(A) Choice aversion model. (B) Adaptive PSL model.
Figure 9.

Welfare change as $\kappa$ ($\beta$) varies.

Figure 9(A) shows how the difference in welfare calculated by the choice aversion model responds to an increase in $\kappa$ from 0 to 10 for several fixed values of $x$. For each value of $x$ displayed, we first see a positive difference in welfare for low values of $\kappa$. As $\kappa$ increases, the difference in welfare decreases until the welfare change becomes negative. Moreover, the threshold $\kappa$ (i.e., the $\kappa$ at which the welfare change from adding edge $a_5$ is no longer positive) decreases in $x$. This stems from the fact that for low values of $\kappa$, the instantaneous cost of each route dominates the choice aversion term in the users' cost functions. Thus, for low values of $x$, the welfare change is initially positive, but as $\kappa$ increases, the choice aversion term receives increasing weight until the user's aversion to choice dominates any reduction in cost from the additional route created by $a_5$.

While the choice aversion model predicts this welfare decrease as a result of its behavioral motivation, other logit models may not provide the same outcome. For example, the APSL model does not show a welfare decrease for any value of $x$, as shown by Figure 9(B). Rather, this model seems to predict that the welfare change is nonnegative for all values of $\beta \in [0, 10]$ and positive only for $\beta$ below approximately 1. While this may be sensible for $x > 1$, we would expect a model looking to explain decision-making behavior accurately to display a continuous tradeoff between route costs and aversion to increasing choice set cardinality. However, we see that the threshold $\beta$, where the welfare change is no longer positive, is approximately 1 for all values of $x$ featured, most notably for $x < 1$. The APSL model does not seem to be capable of taking into account the tradeoff mentioned above, perhaps as a result of its corrective nature and intentional design. While the APSL model and other PSL models may have a strength in producing more intuitive route choice probabilities, there appears to be a weakness in predicting reasonable welfare changes from a behavioral point of view.

To summarize, while PSL models and extensions are developed to correct the outcome of a standard logit model and provide what may be deemed more reasonable choice probabilities for a given network topology, what they often lack is a behavioral foundation for the penalties or adjustments in the cost function of the network user, which would justify the outcomes they provide. Their designs also limit their capability to model changes in welfare with respect to internal parameters. This is not to say that the existing PSL models are inferior; indeed, these models are powerful and useful in various applications at providing reasonable predictions of network flow allocations. We simply wish to speak to the strengths of the choice aversion model in transportation network applications where choice overload is a factor in users' decision-making, as well as to encourage a convergence of the PSL literature with choice aversion models in such applications.

Footnote: Despite the difference in behavioral motivation, $\kappa$ in the choice aversion model and $\beta$ in most PSL models, including the APSL model, serve a similar purpose and can be compared equivalently.

Footnote: For instance, Duncan et al. (2020) analyze an example similar to Figure 2. Their conclusions, while being quantitatively different, agree with the predictions made by the choice aversion model.

Node-Specific Choice Aversion Parameter.

We can also modify our welfare analysis by incorporating node-specific choice aversion parameters as previously defined, with $\kappa_{j_a} \ge 0$ for all $a \in A^+_{j_a}$ and all $j_a \neq t$.

Accordingly, we update the users' welfare in Eq. (18) under node-specific choice aversion parameters such that

$$C(\kappa) = -\log\left( \sum_{r \in \mathcal{R}} e^{-C_r} \right) = -\log\left( \sum_{r \in \mathcal{R}} e^{-c_r - \rho_r} \right) \tag{22}$$

where $\kappa = \{\kappa_i\}_{i \in N}$, and $c_r$ and $\rho_r$ follow Eq. (12). With updated rules for choice probability and welfare, we can examine how heterogeneous choice aversion parameters may be used in practice.

In particular, the impact of changes in a specific $\kappa_i$ on $C(\kappa)$ can be formalized as follows:

Proposition 3.

Let $\mathcal{R}_i$ be the set of all paths passing through node $i$. Similarly, let $\mathcal{R}_i^c$ be the set of all paths not passing through node $i$. Then:

i) $C(\kappa)$ is increasing in $\kappa_i$. In particular,

$$\frac{dC(\kappa)}{d\kappa_i} = \sum_{r \in \mathcal{R}_i} P_r \log |A_i^+| > 0.$$

ii) Fix a node $i \neq s, t$ and suppose that a new edge $a'$ is added to node $i$. Then $C(\kappa)$ decreases if and only if the following condition holds:

$$\kappa_i < \frac{\log\left( 1 - P(a' \mid A_i^+ \cup \{a'\}) \right)}{\log\left( \frac{|A_i^+|}{|A_i^+| + 1} \right)} \quad \forall i \neq s, t. \tag{23}$$

Proof. i) Using the definitions of $\mathcal{R}_i$ and $\mathcal{R}_i^c$, $C(\kappa)$ can be written as:

$$C(\kappa) = -\log\left( \sum_{r \in \mathcal{R}_i} e^{-c_r - \rho_r} + \sum_{r' \in \mathcal{R}_i^c} e^{-c_{r'} - \rho_{r'}} \right).$$

Differentiating with respect to $\kappa_i$, we find:

$$\frac{dC(\kappa)}{d\kappa_i} = \frac{1}{\sum_{r \in \mathcal{R}} e^{-c_r - \rho_r}} \left( \sum_{r \in \mathcal{R}_i} e^{-c_r - \rho_r} \right) \log |A_i^+|.$$

Noting that $\frac{1}{\sum_{r \in \mathcal{R}} e^{-c_r - \rho_r}} \left( \sum_{r \in \mathcal{R}_i} e^{-c_r - \rho_r} \right) = \sum_{r \in \mathcal{R}_i} P_r$, we conclude that

$$\frac{dC(\kappa)}{d\kappa_i} = \sum_{r \in \mathcal{R}_i} P_r \log |A_i^+| > 0.$$

ii) Follows the same argument used in proving Proposition 2. $\square$

As an example of application, we borrow the complex network example from Figure 4 and initialize $\kappa_i = 1$ for all $i \in N$. To see how node-specific choice aversion parameterization can change path choice probabilities and welfare, we let $\kappa_2$ and $\kappa_3$ (that is, the choice aversion parameter $\kappa_i$ for node $i \in \{2, 3\}$) vary from 0 to 5 independently. Figure 10 displays the responses of choice probabilities and welfare.

(A) Choice probabilities as $\kappa_2$ varies. (B) Welfare as $\kappa_2$ varies. (C) Choice probabilities as $\kappa_3$ varies. (D) Welfare as $\kappa_3$ varies.
Figure 10.

Choice probability and welfare responses as $\kappa_2$ / $\kappa_3$ varies across nodes for Figure 4 (example from Fosgerau et al. (2013)).

In this example, a user will reach node 2 when taking $r_1$, $r_2$, or $r_3$. However, node 3, the only remaining node in this example with a choice set cardinality greater than 1, is reached only by two of these routes. When $\kappa_2$ is small and $\kappa_3 = 1$, Figure 10(A) illustrates that the two routes through node 3 are equally likely and less likely than the third route through node 2, whose probability is in turn no greater than $P_{r_4}$, with equality holding at $\kappa_2 = 0$. As $\kappa_2$ increases, the probability of this third route declines further away from $P_{r_4}$ and converges to the probabilities of the two routes through node 3, as the common $\kappa_2$ parameter dominates the cost functions of the three routes.

Conversely, when $\kappa_2 = 1$ and $\kappa_3$ is small, as shown in Figure 10(C), the path choice probabilities for $r_1$, $r_2$, and $r_3$ are close in value, but as $\kappa_3$ increases, the probability of the route that avoids node 3 increases until the probabilities of the two routes through node 3 are sufficiently close to zero. However, its probability remains below $P_{r_4}$ for all $\kappa_3 \in [0, 5]$, since the user is averse to the choice set at node 2 with a fixed $\kappa_2 = 1$.

Figures 10(B) and 10(D) show the changes to welfare as $\kappa_2$ and $\kappa_3$ increase from 0 to 5, respectively. There is a marked difference in the level of welfare variation as $\kappa_2$ increases and as $\kappa_3$ increases. Because node 2 is included in all routes in the network except for $r_4$, an increase or a decrease in $\kappa_2$ has a larger effect on welfare than a change to $\kappa_3$, which is only accounted for along the two routes through node 3.

Heterogeneous specification of choice aversion at different nodes provides a more precise way of modeling and predicting changes to user welfare. While some transportation contexts might warrant similar degrees of choice aversion across nodes, we anticipate that other applications will benefit from the use of node-specific parameterization and the more nuanced welfare predictions that follow.

6. Final Remarks

The recursive choice model with choice aversion is a highly tractable extension of the standard MNL that can be used to predict reasonable route choice probabilities and provide welfare interpretations in transportation networks. When tested against existing PSL models, our model provides reasonable corrections and predictions with the benefit of a microfoundation in choice overload.

In addition, we explore how the choice aversion model allows for a break in the regularity condition typically preserved by path choice probabilities in logit models and extensions. In doing so, we show that, conditional on the degree of choice aversion, removing edges in the network can lead to a decrease in the choice probability of certain existing paths.

We also simulate the welfare implications of the choice aversion model and find a novel prediction: even in uncongested networks, a decrease in welfare akin to Braess's Paradox can arise when costless edges are added. Here, we also provide a simple characterization of welfare changes conditional on choice aversion which is testable in empirical settings.

It is worth remarking that, given its simplicity, our model can be estimated following the methodology proposed by Fosgerau et al. (2013). Exploiting these techniques allows one to test the hypothesis $\kappa_i = \kappa$ for all nodes $i \neq t$.

An important extension of this work is to consider the role of choice aversion in the context of congested traffic networks and the respective modeling approach. One way to study this question is to extend the model in Baillon and Cominetti (2008) by introducing choice aversion in their recursive approach.

Finally, we remark that there is much work to be done regarding the empirical support of choice aversion in transportation networks. Future experiments on the behavior of participants in transportation networks would help establish a better understanding of the significance of choice overload in making routing choices.
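The welfare characterization discussed in these remarks can be illustrated numerically for the Braess-type example of Section 5.2. The sketch below is an illustration under stated assumptions rather than the authors' simulation code: it assumes path costs $1 + x$ for the two original paths and $2x$ for the new path through the costless edge, a penalty $\kappa \log |A_i^+|$ at every node along a path, and the log-sum welfare of Eq. (18). The function names are hypothetical.

```python
import math


def expected_cost(path_costs):
    """Log-sum expected cost, as in Eq. (18): -log(sum_r exp(-C_r))."""
    return -math.log(sum(math.exp(-c) for c in path_costs))


def welfare_change(x, kappa):
    """Expected cost before minus after adding the costless edge.

    Positive values mean users are better off with the new edge.
    Assumed layout: the source has two outgoing links, and the new edge
    gives one intermediate node a second outgoing link.
    """
    pen = kappa * math.log(2)      # penalty for a binary choice set
    before = [1 + x + pen, 1 + x + pen]
    after = [1 + x + 2 * pen,      # old path through the node with the new edge
             2 * x + 2 * pen,      # new path using the costless edge
             1 + x + pen]          # old path that avoids that node
    return expected_cost(before) - expected_cost(after)


def kappa_threshold(x):
    """Right-hand side of Eq. (21), rewritten as log2(1 + e^(1 - x))."""
    return math.log(1 + math.exp(1 - x)) / math.log(2)


for x in (0.0, 0.5, 1.0):
    k = kappa_threshold(x)
    print(f"x = {x}: threshold = {k:.3f}, "
          f"change just below = {welfare_change(x, k - 0.1):+.4f}, "
          f"change just above = {welfare_change(x, k + 0.1):+.4f}")
```

Under these assumptions the threshold is about 1.89 at $x = 0$ and equals 1 at $x = 1$, matching the values discussed in Section 5.2, and the sign of the welfare change flips exactly at the threshold.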
Appendix A. Path size logit models

PSL models include correction terms to penalize routes that share links with other routes, so that the deterministic cost of route $r \in \mathcal{R}$ is $C_r = c_r + \mu_r$, where $\mu_r$ is the correction term for route $r \in \mathcal{R}$. The probability that a user chooses path $r$ is given by:

$$\hat{P}_r = \frac{e^{-c_r + \mu_r}}{\sum_{r' \in \mathcal{R}} e^{-c_{r'} + \mu_{r'}}} \quad \forall r \in \mathcal{R}. \tag{24}$$

Following Ben-Akiva and Bierlaire (1999), Path Size Logit (PSL) models adopt the form $\mu_r = \beta \ln(\gamma_r)$, where $\beta \ge 0$ and $\gamma_r \in (0, 1]$ is the path size term for route $r \in \mathcal{R}$. A distinct route with no shared links has a path size term equal to 1, resulting in no penalization. Less distinct routes have smaller path size terms and incur greater penalization. The probability that a user chooses route $r \in \mathcal{R}$ is:

$$\hat{P}_r = \frac{e^{-c_r + \beta \ln(\gamma_r)}}{\sum_{r' \in \mathcal{R}} e^{-c_{r'} + \beta \ln(\gamma_{r'})}} = \frac{(\gamma_r)^{\beta} e^{-\theta c_r}}{\sum_{r' \in \mathcal{R}} (\gamma_{r'})^{\beta} e^{-\theta c_{r'}}} = \frac{1}{\sum_{r' \in \mathcal{R}} \left( \frac{\gamma_{r'}}{\gamma_r} \right)^{\beta} e^{-\theta (c_{r'} - c_r)}},$$

where $\theta > 0$ scales the travel cost, with $\theta = 1$ in the first expression.

The Path Size Logit (PSL) model was first proposed by Ben-Akiva and Ramming, and states that the PSL path size term for route $r \in \mathcal{R}$, $\gamma_r^{PS}$, is defined as follows:

$$\gamma_r^{PS} = \sum_{a \in r} \frac{c_a}{c_r} \frac{1}{\sum_{r' \in \mathcal{R}} \delta_{ar'}} \tag{25}$$

where $\delta_{ar'} = 1$ if edge $a$ belongs to path $r'$ and $\delta_{ar'} = 0$ otherwise.

In Eq. (25) each link $a$ in route $r$ is penalized (in terms of decreasing the path size term and increasing the cost of the path) according to the number of paths in the choice set that also use that link, $\sum_{r' \in \mathcal{R}} \delta_{ar'}$, and the significance of the penalization is weighted according to how prominent edge $a$ is in route $r$, i.e., the cost of edge $a$ in relation to the total cost of path $r$, $\frac{c_a}{c_r}$.

A.1. Generalized Path Size Logit (GPSL).

Ben-Akiva and Bierlaire (1999) formulate an alternative PSL model (PSL') that attempts to reduce the contributions of excessively expensive routes to the path size terms of more realistic routes in the choice set. The PSL' model states that the path size term for route $r \in \mathcal{R}$, $\gamma_r^{PSL'}$, is defined as follows:

$$\gamma_r^{PSL'} = \sum_{a \in r} \frac{c_a}{c_r} \frac{1}{\sum_{r' \in \mathcal{R}} \left( \frac{\min(c_{r''} : r'' \in \mathcal{R})}{c_{r'}} \right) \delta_{ar'}} \tag{26}$$

where $\delta_{ar'} = 1$ if edge $a$ is in path $r'$ and $\delta_{ar'} = 0$ otherwise.

In Eq. (26) the contribution of route $r'$ to path size terms is weighted according to the ratio between the cost of the cheapest route in the choice set and the cost of route $r'$, $\frac{\min(c_{r''} : r'' \in \mathcal{R})}{c_{r'}}$, and hence the contributions of routes that are costly compared to the cheapest alternative are reduced.

As Ramming describes, however, when a route is completely distinct its path size term is not always equal to 1, which results in an undesired penalization of the utility of that route. To combat this, Ramming proposes the Generalized Path Size Logit (GPSL) model. The GPSL model states that the GPSL path size term for route $r \in \mathcal{R}$, $\gamma_r^{GPSL}$, is defined as follows:

$$\gamma_r^{GPSL} = \sum_{a \in r} \frac{c_a}{c_r} \frac{1}{\sum_{r' \in \mathcal{R}} \left( \frac{c_r}{c_{r'}} \right)^{\lambda} \delta_{ar'}} \tag{27}$$

where $\delta_{ar'} = 1$ if edge $a$ is in path $r'$ and $\delta_{ar'} = 0$ otherwise, and $\lambda \ge 0$. It is easy to see that the GPSL model is equivalent to the PSL model when $\lambda = 0$. In Eq. (27) the contribution of route $r'$ to the path size term of route $r$ (the path size contribution factor) is weighted according to the cost ratio between the routes, $\left( \frac{c_r}{c_{r'}} \right)^{\lambda}$, and hence the contributions of high costing routes to the path size terms of low costing routes are reduced.

Adaptive Path Size Logit Model (APSL).

In an attempt to improve on existing PSL models and extensions, Duncan et al. (2020) propose an internally consistent PSL model in which all components assess the feasibility of routes according to their relative attractiveness due to travel cost and distinctiveness. Formally, their correction can be defined as follows:

Definition 1. The APSL choice probabilities, $P^*$ (for a choice set of size $R$), are a solution to the fixed-point problem $P^* = G\left( g\left( \gamma^{APSL}(P^*) \right) \right)$, where:

$$G_r\left( g_r\left( \gamma^{APSL}(P^*) \right) \right) = \tau + (1 - R\tau) \cdot g_r\left( \gamma^{APSL}(P^*) \right) \tag{28}$$

$$g_r\left( \gamma^{APSL}(P^*) \right) = \frac{\left( \gamma^{APSL}_r(P^*) \right)^{\beta} e^{-\theta c_r}}{\sum_{r' \in \mathcal{R}} \left( \gamma^{APSL}_{r'}(P^*) \right)^{\beta} e^{-\theta c_{r'}}} \tag{29}$$

$$\gamma^{APSL}_r(P^*) = \sum_{a \in r} \frac{c_a}{c_r} \cdot \frac{1}{\sum_{r' \in \mathcal{R}} \left( \frac{P^*_{r'}}{P^*_r} \right) \delta_{ar'}} \tag{30}$$

$\forall r \in \mathcal{R}$, $\forall P^* \in D(\tau)$, $\theta > 0$, $\beta \geq 0$, $0 < \tau \leq \frac{1}{R}$, with

$$D(\tau) = \left\{ P^* \in \mathbb{R}^{R}_{++} : \tau \leq P^*_r \leq 1 - (R - 1)\tau, \ \forall r \in \mathcal{R}, \ \sum_{r' \in \mathcal{R}} P^*_{r'} = 1 \right\}$$

Although the APSL choice probabilities have no closed-form representation, the APSL model corrects many of the internal consistency issues in route cost and distinctiveness that trouble other PSL models. This makes its recent introduction in the literature particularly useful for predicting route choice probabilities more appropriately.
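Because Definition 1 has no closed form, the fixed point has to be approximated numerically. The sketch below applies plain fixed-point iteration to Eqs. (28)-(30) on a hypothetical three-route network. The network, the edge costs, the parameter values, and the function names are all illustrative assumptions, and simple iteration is an assumed solution method here rather than the procedure of Duncan et al. (2020); it is not guaranteed to converge for arbitrary $\beta$.

```python
import math

def apsl_gamma(P, routes, edge_cost, cost):
    """Eq. (30): path size terms weighted by probability ratios P[k] / P[r]."""
    gamma = {}
    for r, edges in routes.items():
        gamma[r] = sum(
            (edge_cost[a] / cost[r])
            / sum(P[k] / P[r] for k, k_edges in routes.items() if a in k_edges)
            for a in edges
        )
    return gamma

def apsl_map(P, routes, edge_cost, cost, theta, beta, tau):
    """One evaluation of G(g(gamma(P))), i.e. Eqs. (28)-(30)."""
    gamma = apsl_gamma(P, routes, edge_cost, cost)
    w = {r: gamma[r] ** beta * math.exp(-theta * cost[r]) for r in routes}
    z = sum(w.values())
    n = len(routes)  # size of the choice set
    return {r: tau + (1.0 - n * tau) * w[r] / z for r in routes}

def solve_apsl(routes, edge_cost, theta=1.0, beta=0.5, tau=1e-8,
               tol=1e-12, max_iter=1000):
    """Iterate P <- G(g(gamma(P))) from the uniform distribution."""
    cost = {r: sum(edge_cost[a] for a in edges) for r, edges in routes.items()}
    P = {r: 1.0 / len(routes) for r in routes}  # uniform start lies in D(tau)
    gap = float("inf")
    for _ in range(max_iter):
        P_new = apsl_map(P, routes, edge_cost, cost, theta, beta, tau)
        gap = max(abs(P_new[r] - P[r]) for r in routes)
        P = P_new
        if gap < tol:
            break
    return P, gap

# Hypothetical network: r1 and r2 share edge "a"; r3 is completely distinct.
routes = {"r1": ["a", "b"], "r2": ["a", "c"], "r3": ["d"]}
edge_cost = {"a": 1.0, "b": 1.0, "c": 1.5, "d": 2.0}
P, gap = solve_apsl(routes, edge_cost)
print(P, gap)
```

In this example r1 and r3 have the same cost, but r3 receives the larger probability because its path size term equals 1, while r1 is penalized for sharing edge "a" with r2.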

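The closed-form path size terms of Eqs. (25) and (27) can be illustrated on a small example. The sketch below computes the GPSL term, which reduces to the PSL term when $\lambda = 0$, and the resulting choice probabilities $\hat{P}_r$. The three-route network, its edge costs, and the function names are hypothetical and serve only as an illustration.

```python
import math

def path_size_terms(routes, edge_cost, lam=0.0):
    """GPSL path size terms, Eq. (27); lam = 0 gives the PSL terms of Eq. (25)."""
    cost = {r: sum(edge_cost[a] for a in edges) for r, edges in routes.items()}
    gamma = {}
    for r, edges in routes.items():
        total = 0.0
        for a in edges:
            # Sum over all routes k that use edge a (delta_{ak} = 1).
            overlap = sum((cost[r] / cost[k]) ** lam
                          for k, k_edges in routes.items() if a in k_edges)
            total += (edge_cost[a] / cost[r]) / overlap
        gamma[r] = total
    return gamma, cost

def choice_probabilities(gamma, cost, theta=1.0, beta=1.0):
    """Path size logit probabilities: gamma^beta * exp(-theta * c), normalized."""
    weights = {r: gamma[r] ** beta * math.exp(-theta * cost[r]) for r in gamma}
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}

# Hypothetical network: r1 and r2 share edge "a"; r3 is completely distinct.
# All three routes have total cost 2.
routes = {"r1": ["a", "b"], "r2": ["a", "c"], "r3": ["d"]}
edge_cost = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}

gamma, cost = path_size_terms(routes, edge_cost)        # PSL (lam = 0)
probs = choice_probabilities(gamma, cost)
gamma_gpsl, _ = path_size_terms(routes, edge_cost, lam=2.0)
print(gamma, probs)
```

Here the overlapping routes r1 and r2 each have a path size term of 0.75, while the distinct route r3 keeps a term of 1 under both PSL and GPSL. Since all costs are equal, the probabilities are proportional to the path size terms when $\beta = 1$.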
References

Baillon, J. B. and Cominetti, R. (2008). Markovian traffic equilibrium. Math Program. Ser. B, 111:33–56.

Ben-Akiva, M. and Bierlaire, M. (1999). Discrete choice methods and their applications to short term travel decisions. Handbook of Transportation Science, pages 5–34.

Ben-Akiva, M. and Lerman, S. (1985). Discrete Choice Analysis: Theory and Application to Travel Demand. MIT Press, Cambridge, 1 edition.

Ben-Akiva, M. and Ramming, M. S. Lecture notes: discrete choice models of traveler behavior in networks. Prepared for Advanced Methods for Planning and Management of Transportation Networks.

Braess, D. (1968). Über ein Paradoxon aus der Verkehrsplanung. Unternehmensforschung, 12(1):258–268.

Braess, D., Nagurney, A., and Wakolbinger, T. (2005). On a paradox of traffic planning. Transportation Science, 39(4):446–450.

Duncan, L. C., Watling, D. P., Connors, R. D., Rasmussen, T. K., and Nielsen, O. A. (2020). Path size logit route choice models: Issues with current models, a new internally consistent approach, and parameter estimation on a large-scale network with GPS data. Transportation Research Part B: Methodological, 135:1–40.

Fosgerau, M., Frejinger, E., and Karlstrom, A. (2013). A link based network route choice model with unrestricted choice set. Transportation Research Part B: Methodological, 56:70–80.

Fosgerau, M. and Jiang, G. (2019). Travel time variability and rational inattention. Transportation Research Part B: Methodological, 120:1–14.

Fosgerau, M., Melo, E., de Palma, A., and Shum, M. Discrete choice and rational inattention: A general equivalence result. International Economic Review, n/a(n/a).

Frejinger, E. and Bierlaire, M. (2007). Capturing correlation with subnetworks in route choice models. Transportation Research Part B, pages 363–378.

Fudenberg, D. and Strzalecki, T. (2015). Dynamic logit with choice aversion. Econometrica, 83(2):651–691.

Jiang, G., Fosgerau, M., and Lo, H. K. (2020). Route choice, travel time variability, and rational inattention. Transportation Research Part B: Methodological, 132:188–207. 23rd International Symposium on Transportation and Traffic Theory (ISTTT 23).

Kreps, D. (1979). A representation theorem for "preference for flexibility". Econometrica, 47(3):565–577.

Lorca, J. and Melo, E. (2020). Choice aversion in directed networks. Working Papers.

Luce, D. R. and Suppes, P. (1965). Preference, utility, and subjective probability. New York: John Wiley & Sons, 1 edition.

Mai, T., Fosgerau, M., and Frejinger, E. (2015). A nested recursive logit model for route choice analysis. Transportation Research Part B: Methodological, 75:100–112.

Matějka, F. and McKay, A. (2015). Rational inattention to discrete choices: A new foundation for the multinomial logit model. American Economic Review, 105(1):272–98.

McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics, pages 105–142.

McFadden, D. (1978). Behavioural travel modelling, chapter 13: Quantitative methods for analyzing travel behaviour of individuals: some recent developments, pages 279–318. Croom Helm London.

McFadden, D. (1981). Structural Analysis of Discrete Data with Econometric Applications, chapter Econometric Models of Probabilistic Choice, pages 198–272. Cambridge: MIT.

Ortoleva, P. (2013). The price of flexibility: Towards a theory of thinking aversion. Journal of Economic Theory, 148(3):903–934.

Ramming, M. S. Network knowledge and route choice. Ph.D. Thesis, Massachusetts Institute of Technology.

Roughgarden, T. (2016). Twenty Lectures on Algorithmic Game Theory. Cambridge University Press, 1 edition.

Scheibehenne, B., Greifeneder, R., and Todd, P. M. (2010). Can There Ever Be Too Many Options? A Meta-Analytic Review of Choice Overload. Journal of Consumer Research, 37(3):409–425.

Sheena, I. and Lepper, M. R. (2000). When choice is demotivating: can one desire too much of a good thing? Journal of Personality and Social Psychology, 79(6):995–1006.

Train, K. (2009). Discrete Choice Methods with Simulation. Cambridge University Press, second edition.

Zimmermann, M. and Frejinger, E. (2020). A tutorial on recursive models for analyzing and predicting path choice behavior.