Incentive-based Decentralized Routing for Connected and Autonomous Vehicles using Information Propagation


Chaojie Wang a, Srinivas Peeta a,b,*, Jian Wang c,*
a School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, 30318, U.S.A.
b H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, 30318, U.S.A.
c School of Transportation, Southeast University, Nanjing, 211189, China

Abstract

Routing strategies under the aegis of dynamic traffic assignment have been proposed in the literature to optimize system performance. However, challenges have persisted in their deployment ability and effectiveness due to inherent strong assumptions on traveler behavior and availability of network-level real-time traffic information, and the high computational burden associated with computing network-wide flows in real-time. To address these gaps, this study proposes an incentive-based decentralized routing strategy to nudge the network performance closer to the system optimum for the context where all vehicles are connected and autonomous vehicles (CAVs). The strategy consists of three stages. The first stage incorporates a local route switching dynamical system to approximate the system optimal route flow in a local area based on vehicles’ knowledge of local traffic information. This system is decentralized in the sense that it only updates the local route choices of vehicles in this area rather than route choices of all vehicles in the network, which circumvents the high computational burden associated with computing the flows on the entire network. The second stage optimizes the route for each CAV by considering individual heterogeneity in traveler preferences (e.g., the value of time) to maximize the utilities of all travelers in the local area. Constraints are also incorporated to ensure that these routes can achieve the approximated local system optimal flow of the first stage. The third stage leverages an expected envy-free incentive mechanism to ensure that travelers in the local area can accept the optimal routes determined in the second stage. The study analytically discusses the convergence of the local route switching dynamical system.
We also show that the proposed incentive mechanism is expected individual rational and budget-balanced, which ensure that travelers are willing to participate and guarantee the balance between payments and compensations, respectively. Further, the conditions for the expected incentive compatibility of the incentive mechanism are analyzed and proved, ensuring behavioral honesty in disclosing information. Thereby, the proposed incentive-based decentralized routing strategy can enhance network performance and user satisfaction under fully connected and autonomous environments.

Keywords: decentralized routing; incentive mechanism; connected and autonomous vehicle

Introduction

The determination of origin-destination (OD) routes is a key decision for travelers, and can also entail en route decision-making to account for dynamics in network conditions, especially under information availability. Over the past three decades, dynamic traffic assignment (DTA) has been identified as the methodological engine to determine OD routes that satisfy some individual traveler objectives/constraints and/or system-wide goals in traffic networks (Peeta and Ziliaskopoulos, 2001). Advances in DTA models related to routing strategies have focused on enhancing realism and prediction accuracy. Despite these advances, challenges have persisted in a deployment context due to the complexity of the problem arising from the multiple dimensions that characterize it. First, accurate real-time information on network-level traffic conditions and/or route characteristics (such as travel time) is presumed to be available either a priori for the entire time horizon of interest or at the current time. While some studies (e.g., Du et al., 2012; Du et al., 2013) account for randomness in the link travel time distributions due to measurement errors or the fusion of information from multiple sources, an underlying assumption in most DTA studies is the availability of traffic data on all links of a network. While even today the presumption of ubiquitous data availability at the network level in itself is rather optimistic due to the lack of sensor coverage on all network links, the assumption of its seamless availability to all travelers in real-time is unrealistic in a deployment context. Hence, the availability of massive amounts of information required for DTA should not be presumed to be trivial or seamless, especially when the reliability of the predictions depends heavily on the accuracy and timeliness of the information.

* Corresponding author. E-mail address: [email protected] (C. Wang); [email protected] (S. Peeta); [email protected] (J. Wang)
Second, the heterogeneity in traveler characteristics in the context of making routing decisions is not modeled satisfactorily. For example, the use of user classes (Peeta and Mahmassani, 1995a) masks differences in responses among the individuals of a user class. Further, there is randomness in the fractions of different user classes in the traffic stream with time. Hence, while the consideration of multiple user classes may be useful in identifying routing strategies for planning purposes, the accuracy of state prediction is lacking in a real-time deployment context. This motivates the consideration of real-time network traffic management strategies that directly capture individual heterogeneity related to traveler characteristics. Third, computational tractability has always been a challenge for the real-time deployment of DTA models, especially for real-world urban networks. Due to the complexity of the DTA problem, analytical solutions with adequate levels of modeling realism are lacking in a deployment context. Even for models with simplified assumptions, the computational time of analytical solutions escalates rapidly with network size. Hence, simulation-based algorithms (Peeta and Mahmassani, 1995b) have been adopted in real-world implementations. However, the computational burden for deploying these algorithms can be significant when determining solutions in a centralized manner, leading to mitigation strategies such as the use of mesoscopic traffic simulators, and the deployment of distributed/decentralized control-based solutions (Hawas and Mahmassani, 1997; Pavlis and Papageorgiou, 1998). Over the years, various studies have leveraged DTA models to additionally explore mechanisms to influence travelers’ routing decisions so as to push the network performance closer to the system optimal solution. Brueckner et al. 
(2001) recommend demand-side solutions to mitigate traffic congestion rather than the use of expensive supply-side mechanisms such as the construction of new lanes or roads. In this context, incentive-based approaches are gaining attention as mechanisms to leverage the heterogeneity in individual behavior to enhance system performance. However, many practical incentive-based approaches (Merugu et al., 2009) to influence individual travelers require monetary investment from one or more stakeholders. Further, existing incentive mechanisms primarily aim to influence macro travel decisions like travel mode choice (Kazhamiakin et al., 2015) and departure time choice. However, incentive mechanisms to influence micro travel decisions (such as real-time route choice), for which opportunities exist frequently in real-world networks, are rare. Partly, this can be attributed to safety concerns associated with providing real-time incentives, arising from distraction, cognitive burden, and limited attention span of drivers due to the inherently multitasking environment of congested traffic networks. Hence, such incentives have only been proposed in the routing context for pre-trip route choice (Hu et al., 2015). The problem is also challenging because micro travel decisions occur in network traffic conditions characterized by dynamics and randomness. Further, centralized incentive mechanisms that would potentially target individuals for route-related incentives in a coordinated manner require understanding/knowledge of their behavioral characteristics, which is challenging in a deployment context. Moreover, there is a lack of theoretical analysis on the participation willingness and behavioral honesty in existing incentive mechanisms, which are both critical concerns in practical implementation.
The emerging disruptive and transformative technologies of automation and connectivity provide several enablers to foster the development of a new generation of incentive mechanisms that target micro travel decisions by individuals to enhance system performance. A connected transportation environment consisting of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications enables vehicles to have access to real-time traffic information by leveraging vehicles’ ability to communicate their travel experience (for example, link travel time, location, speed, etc.) to each other (through V2V) and the system (through V2I), thereby obviating the need for ubiquitous network-wide installation of sensors (e.g., Bagloee et al., 2017). Some studies (Kim et al., 2016; Kim et al., 2017; Wang et al., 2018; Wang et al., 2019; Kim and Peeta, 2019) have proposed models related to information propagation to alleviate the computational burden and scalability issues with centralized information communication-based strategies. However, information propagation suffers from accumulated communication delay. For the timeliness and accuracy of the traffic information, information propagation is more likely to be meaningfully applied in local information updating. While the leveraging of connected technologies through V2V communications can enable seamless access to local traffic information for vehicles, and mitigate computational burden issues through the fostering of decentralized routing strategies, technologies associated with automation and autonomy provide additional capabilities to address the other DTA deployment limitations that were identified earlier. Hence, while V2V-based information propagation enables efficient communications between vehicles, automation allows vehicles to react and respond instantly to the information received. 
Thereby, it is possible to leverage complex incentive mechanisms, which factor the individual heterogeneity, to develop decentralized incentive-based routing strategies for CAVs. This study proposes a new incentive-based routing strategy to leverage connectivity and automation technologies to address the aforementioned gaps related to existing routing strategies using DTA models. First, vehicular communication-based information propagation enables local information availability without incurring computational burden for a central server. Second, the proposed routing strategy fully factors the individual heterogeneity in the behaviors of vehicles through individual route evaluation functions rather than modeling user class behavior like conventional DTA models to estimate the likely route decisions of vehicles. Hence, the routing decisions are more representative as the characteristics and preferences of each vehicle (traveler) are fully reflected through individual evaluation functions. Third, the decentralized nature of the routing strategy makes it both flexible and computationally tractable compared with current predetermined or centralized incentive schemes. Also, the cooperation enabled through the decentralized route assignment model leads to better system performance, unlike under V2V communication-enabled uncoordinated routing decisions by individual vehicles. Overall, the proposed incentive-based decentralized routing strategy for CAVs can enhance system performance and promote user satisfaction under realistic information availability assumptions while factoring individual heterogeneity and being computationally tractable.

Fig. 1. Flow chart of the proposed decentralized routing strategy.

Fig. 1 illustrates the conceptual flow chart of the routing strategy. The time horizon of interest is divided into small time intervals, within which only local traffic information is updated in real-time, while that in the rest of the network is considered unchanged. In each time interval, vehicle groups generated according to their origin-destination (OD) pair information will calculate the optimal routes that cooperatively enhance system performance, assign vehicles to corresponding routes, and develop optimal incentives that nudge agents† to follow the route suggestions. At the beginning of the next interval, due to the updated boundary of the local area and possibly updated information related to further areas, the optimal routes will likely change. Therefore, the routing strategy will be executed iteratively. At the beginning of each time interval 𝜏 , vehicles of the vehicle group corresponding to an OD pair will start a decentralized local route flow assignment protocol. Following the protocol, the vehicle group will update the route flows according to the local traffic information and disseminate them using local information propagation to update the local traffic information. This procedure is repeated until a stable state is reached. We will show that the route flows of the stable state are optimal route flows, which can achieve good network-level system performance. The optimal route flows will be the inputs for the vehicle route assignment model, which assigns vehicles in the group to the different routes to achieve the optimal route flows by considering individual route evaluation functions. The individual route evaluation function captures the personalized characteristics and preferences over different routes, which enables the vehicle route assignment model to maximize the group utility with the full consideration of individual heterogeneity. 
The optimal vehicle route assignments are the inputs to the incentive mechanism, which determines the incentives for agents in the vehicle group that ensure that vehicle route assignments are fair/satisfactory for each agent. We will show that for some specific situations, optimal incentives developed by the mechanism are: (i) budget-balanced, which indicates no monetary investment is required in the long term; (ii) expected individual rational, which implies that agents are willing to participate repeatedly; and (iii) expected incentive compatible, implying agents will behave honestly for long-term benefit under the proposed mechanism. Therefore, the optimal incentives will nudge drivers to follow the optimal vehicle route assignment, thus achieving the optimal route flows in the time interval 𝜏. At the end of interval 𝜏, since the boundary of the local area changes and the traffic information related to further areas may have been updated, this strategy is repeated in the next time interval 𝜏 + 1 with the updated conditions.

† Following the tradition in economics, we will use the term agent instead of vehicle/driver/traveler when describing utility-related models.

This paper is organized as follows. Section 2 describes the problem and justifies the iterative local routing strategy. In Section 3, we present the decentralized local route assignment model and prove the system properties it can achieve. The vehicle route assignment model is discussed in Section 4. In Section 5, we present the fair incentive mechanism, and illustrate that it is budget-balanced, expected individual rational, and expected incentive compatible under specific situations. We conclude the paper in Section 6 with some comments.

Problem Description

Consider a dynamic traffic network $G(N, E)$, where $N$ is the node set and $E$ is the directed link set. This study seeks to assign time-dependent routes to vehicles by considering heterogeneous individual preferences to achieve objectives at both the system and individual levels. The time horizon of interest is divided into small equal time intervals. In each time interval $\tau$, the route updating scheme contains three stages. In the first stage, a decentralized dynamical system is used to determine the approximated system optimal route flows in the local area based on vehicles’ knowledge of the local traffic information. In the second stage, each vehicle is assigned a route based on its self-reported individual preferences over the routes to achieve the approximated optimal route flows obtained in the first stage. The last stage incorporates an incentive mechanism to compensate or charge the individuals based on individual heterogeneities to ensure everyone can accept the assigned routes. To keep the mechanism budget-balanced and preclude the need for external funding, the benefiting vehicles compensate the sacrificing vehicles.

Fig. 2. Real-time traffic information availability range

Fig. 2 shows the context of this study. Suppose that at the beginning of time interval $\tau$, a stream of vehicles has just arrived at node $r$. As the traffic flows can change dynamically, the traffic conditions far away from node $r$ at the current time may not be identical to the conditions that vehicles at node $r$ will experience in the future at those locations. Thereby, this study assumes that only the real-time traffic information in the local area (the area within the green circle) will be provided to these vehicles at node $r$ through infrastructure-to-vehicle communications. The traffic information outside of the local area is not updated for vehicles at node $r$ during the time interval $\tau$. Specifically, during the time interval $\tau$, the vehicle group (indicated by the little circle in Fig. 2) will disseminate their travel information (number of vehicles, destinations, individual evaluation functions, temporary routes, etc.) to the roadside units within the local area (black squares within the green circle in Fig. 2). The roadside units will then predict the future traffic states based on the received information on temporary route flows and then distribute the updated route and payment/compensation plans to all of these vehicles.

In Fig. 2, the local area related to the origin node $r$ is included in the green circle, while the destination node $s$ is outside the local area. $\omega$ and $\kappa$ are nodes close to the boundary of the local area. We will label $(r, \omega)$ and $(r, \kappa)$ the local OD pairs and $(r, s)$ the global OD pair. Let $d$ be the range that the local area information covers. The value of $d$ depends on the information propagation speed, roadside unit coverage, etc. Further, the vehicles just arriving at node $r$ at the beginning of time interval $\tau$ also send information to the roadside units, such as the chosen route, the origins, the destinations, and travelers’ characteristics (e.g., the value of time). The roadside units will then determine the optimal route for each vehicle in this stream based on real-time traffic information and projection of future traffic conditions based on each vehicle’s information. It should be noted that new vehicles may pass node $r$ during the time when the optimal schemes are being computed; these vehicles will not be included in the global OD demand at the beginning of $\tau$ to determine their optimal route choices because they may not be able to change their route choices accordingly.

Let $R \times S$ be the set of OD pairs in the network, and let $(r, s) \in R \times S$ denote a global OD pair. Let $\Omega^{rs} = \{\omega, \kappa, \cdots\}$ be the set of nodes close to the boundary that are used by routes of the global OD pair $(r, s)$. For simplicity, we will refer to $\Omega^{rs}$ as the set of local destinations. Denote $P^{\omega}$ as the set of local routes for local OD pair $(r, \omega)$. As the local routes are part of the corresponding global routes, the number of routes in the local OD pairs is smaller than that in the global OD pair. For example, in Fig. 2, the global route set for global OD pair $(r, s)$ contains 12 routes, while the local route sets for local OD pairs $(r, \omega)$ and $(r, \kappa)$ contain only 5 routes. Note that the difference would be more significant if a smaller part of the trip is within the green circle because the number of routes will increase rapidly with the increase in the number of en route bifurcations, which implies that our local route assignment dynamical system consists of far fewer state variables compared with existing approaches. Also, only vehicles in the local area are considered in the route switching dynamical system to compute the local system optimal flows. Thereby, it is highly efficient computationally by avoiding the inclusion of all vehicles in the network for computing the network-level system optimal flows.

It should be noted that the length of time interval $\tau$ determines the interval for updating the vehicles’ routes; that is, vehicles will update their routes in each time interval $\tau$. To enhance traffic performance, improved routes are provided to vehicles based on real-time traffic conditions. Thereby, the value of $\tau$ should be such that vehicles can update their routes before reaching the local destination using the method proposed above.

Route Flow Assignment

This section describes the details of the route switching dynamical system to approximate the system optimal route flow in the local area for each time interval $\tau$. Denote $x^{rs,\tau}_{\omega i}(t)$ as the flow of route $i$ for local OD pair $(r, \omega)$ at time $t$ in the dynamical system in time interval $\tau$ for the global OD pair $(r, s)$. $\boldsymbol{x}^{\tau}(t)$ is the vector of all $x^{rs,\tau}_{\omega i}(t)$. Since we focus on analyzing the flow in one time interval, for simplicity, the time interval indicator will be removed, i.e., we will use $x^{rs}_{\omega i}$ and $\boldsymbol{x}$ to denote $x^{rs,\tau}_{\omega i}(t)$ and $\boldsymbol{x}^{\tau}(t)$, respectively, hereafter. As discussed earlier, the sum of local OD demands that are associated with the same global OD pair (e.g., $(r, \omega)$ and $(r, \kappa)$ in Fig. 2 are both associated with the global OD pair $(r, s)$) does not change over the time unit $t$ in the dynamical system. However, different from general static traffic assignment problems, the local OD demands do not remain unchanged in the process as vehicles heading to $s$ can switch between $(r, \omega)$ and $(r, \kappa)$.

Let $C^{\omega}_{i} \equiv C^{\omega}_{i}(t)$ be the marginal travel time of the local route $i$ for local OD pair $(r, \omega)$. When the system reaches a local system optimal (LSO) solution, the marginal travel time of used local routes for the same local OD pair is the same. However, differences exist between the marginal travel time of used local routes for different local OD pairs because the marginal travel times of used routes for OD pairs beyond the local area are assumed to be fixed. For example, in Fig. 2, the marginal travel times of used local routes for OD pair $(r, \omega)$ are identical at LSO, as are those for local OD pair $(r, \kappa)$. But the marginal travel times of local routes for OD pair $(r, \omega)$ are different from those of OD pair $(r, \kappa)$. The difference is equal to the difference of the fixed marginal travel time of routes for OD pairs $(\kappa, s)$ and $(\omega, s)$. In other words, at the LSO state, global routes of a global OD pair have the same marginal travel time given the limited information outside the local area, which makes the global system state a system optimal (SO) one.

We assume that the average marginal travel time difference between $(\kappa, s)$ and $(\omega, s)$, $\Delta_{\omega\kappa}$, is known at the beginning of $\tau$, and is not updated during the solution procedure of the dynamical system since it is related to road segments outside the local area. Note that the traffic information of the network beyond the local area cannot be predicted accurately due to the potential for high variance. Therefore, it is reasonable to pursue an approximated local system optimal (ALSO) solution rather than a strict LSO. Let $\delta^{rs}$, $\delta^{rs} \ge 0$, be the tolerance of the marginal travel time of used routes for global OD pair $(r, s)$, i.e., ALSO is reached if the difference in the marginal travel times of two arbitrary used routes for global OD pair $(r, s)$ is within $[-\delta^{rs}, \delta^{rs}]$.

Route Switching Model

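This section proposes a route-switching dynamical model to obtain the route flows for the ALSO by leveraging the work of Smith (1984). If we introduce the rectifier function:

$$\Gamma(x) = \max(x, 0), \tag{1}$$

we can represent the route switching logic for local route $i$ of local OD pair $(r, \omega)$ as

$$x^{rs}_{\omega(j \to i)} = \begin{cases} \Gamma\left(C^{\omega}_{j} - C^{\omega}_{i} - \delta^{rs}\right) x^{rs}_{\omega j}, & \text{if } j \in P^{\omega} \text{ and } j \neq i; \\ \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) x^{rs}_{\kappa j}, & \text{if } j \in P^{\kappa} \text{ and } \kappa \in \Omega^{rs} \setminus \{\omega\}. \end{cases} \tag{2}$$

Equation (2) describes the route flow switching rate from route $j$ to route $i$ under different conditions of $j$. The logic is similar to Smith’s model other than the incorporation of $\Delta_{\kappa\omega}$ and $\delta^{rs}$ to factor the aforementioned properties of the local route assignment problem. Note that $\Delta_{\omega\omega} = 0$. According to Equation (2), the total change rate of the flow on local route $i$ for OD pair $(r, \omega)$ is

$$\dot{x}^{rs}_{\omega i} = \sum_{\kappa \in \Omega^{rs}} \sum_{\substack{j \in P^{\kappa} \\ j \neq i}} \left[ \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) x^{rs}_{\kappa j} - \Gamma\left(C^{\omega}_{i} - C^{\kappa}_{j} - \Delta_{\omega\kappa} - \delta^{rs}\right) x^{rs}_{\omega i} \right] \tag{3}$$

Considering all global OD pairs $(r, s) \in R \times S$, the dynamical system that describes the local route switching process can be represented as:

$$\dot{\boldsymbol{x}} = \Phi(\boldsymbol{x}) \cdot \boldsymbol{x} = \begin{bmatrix} \Phi^{r_1 s_1}(\boldsymbol{x}) & 0 & \cdots & \cdots & 0 \\ 0 & \Phi^{r_1 s_2}(\boldsymbol{x}) & 0 & \cdots & \cdots \\ \cdots & 0 & \ddots & 0 & \cdots \\ \cdots & \cdots & 0 & \Phi^{r_2 s_1}(\boldsymbol{x}) & 0 \\ 0 & \cdots & \cdots & 0 & \ddots \end{bmatrix} \cdot \boldsymbol{x}, \tag{4a}$$

where

$$\Phi^{rs}(\boldsymbol{x}) = \begin{bmatrix} \Psi^{rs}_{\omega_1 \omega_1} & \Psi^{rs}_{\omega_1 \omega_2} & \cdots \\ \Psi^{rs}_{\omega_2 \omega_1} & \Psi^{rs}_{\omega_2 \omega_2} & \cdots \\ \cdots & \cdots & \ddots \end{bmatrix} \tag{4b}$$

is the coefficient matrix related to the route switching process of the global OD pair $(r, s)$,

$$\Psi^{rs}_{\omega\omega} = \begin{bmatrix} \Psi^{rs}_{\omega\omega}[1,1] & \Gamma\left(C^{\omega}_{2} - C^{\omega}_{1} - \delta^{rs}\right) & \cdots & \Gamma\left(C^{\omega}_{i} - C^{\omega}_{1} - \delta^{rs}\right) & \cdots \\ \Gamma\left(C^{\omega}_{1} - C^{\omega}_{2} - \delta^{rs}\right) & \Psi^{rs}_{\omega\omega}[2,2] & \cdots & \Gamma\left(C^{\omega}_{i} - C^{\omega}_{2} - \delta^{rs}\right) & \cdots \\ \cdots & \cdots & \ddots & \cdots & \cdots \\ \Gamma\left(C^{\omega}_{1} - C^{\omega}_{i} - \delta^{rs}\right) & \Gamma\left(C^{\omega}_{2} - C^{\omega}_{i} - \delta^{rs}\right) & \cdots & \Psi^{rs}_{\omega\omega}[i,i] & \cdots \\ \cdots & \cdots & \cdots & \cdots & \ddots \end{bmatrix} \tag{4c}$$

is the coefficient matrix related to the flow switching within the local route set $P^{\omega}$,

$$\Psi^{rs}_{\omega\omega}[i,i] = - \sum_{j \in P^{\omega} \setminus \{i\}} \Gamma\left(C^{\omega}_{i} - C^{\omega}_{j} - \delta^{rs}\right) - \sum_{\kappa \in \Omega^{rs} \setminus \{\omega\}} \sum_{j \in P^{\kappa}} \Gamma\left(C^{\omega}_{i} - C^{\kappa}_{j} - \Delta_{\omega\kappa} - \delta^{rs}\right) \tag{4d}$$

collects the total rate at which flow leaves route $i$, which is the coefficient of $x^{rs}_{\omega i}$ in Equation (3), and, for $\kappa \neq \omega$,

$$\Psi^{rs}_{\omega\kappa} = \begin{bmatrix} \Gamma\left(C^{\kappa}_{1} - C^{\omega}_{1} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \Gamma\left(C^{\kappa}_{2} - C^{\omega}_{1} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \cdots & \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{1} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \cdots \\ \Gamma\left(C^{\kappa}_{1} - C^{\omega}_{2} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \Gamma\left(C^{\kappa}_{2} - C^{\omega}_{2} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \cdots & \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{2} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \cdots \\ \cdots & \cdots & \ddots & \cdots & \cdots \\ \Gamma\left(C^{\kappa}_{1} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \Gamma\left(C^{\kappa}_{2} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \cdots & \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) & \cdots \\ \cdots & \cdots & \cdots & \cdots & \ddots \end{bmatrix} \tag{4e}$$

is the coefficient matrix related to flow switching from local routes in $P^{\kappa}$ to local routes in $P^{\omega}$.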
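As a rough illustration of Equations (1) to (3), the following Python sketch evaluates the switching rates on a small hypothetical instance with two local destinations. The three routes, the linear link travel time t(x) = a + b·x (which gives the marginal travel time a + 2·b·x), the fixed beyond-area marginal times, the tolerance value, and all function names are assumptions made for this example and are not taken from the study.

```python
def rectifier(v):
    """Gamma(v) = max(v, 0), Equation (1)."""
    return max(v, 0.0)


# Hypothetical instance: (local destination, a, b) for each local route.
# Each route is a single link with travel time t(x) = a + b * x.
ROUTES = [("omega", 10.0, 0.5), ("omega", 12.0, 0.25), ("kappa", 8.0, 0.5)]
# Assumed fixed marginal travel times from each local destination to s.
BEYOND = {"omega": 5.0, "kappa": 9.0}
TOL = 0.1  # tolerance delta^{rs} (assumed value)


def marginal_times(x):
    """Marginal travel time d[x * t(x)]/dx = a + 2 * b * x of every route."""
    return [a + 2.0 * b * xi for (_, a, b), xi in zip(ROUTES, x)]


def delta(dest_from, dest_to):
    """Delta_{from,to}: fixed marginal-time difference beyond the local area.

    It is zero when both routes end at the same local destination.
    """
    return BEYOND[dest_to] - BEYOND[dest_from]


def switching_rate(x, j, i):
    """Rate at which flow moves from route j to route i, Equation (2)."""
    c = marginal_times(x)
    gap = c[j] - c[i] - delta(ROUTES[j][0], ROUTES[i][0]) - TOL
    return rectifier(gap) * x[j]


def flow_derivative(x):
    """Net change rate of every route flow, Equation (3)."""
    n = len(x)
    return [
        sum(switching_rate(x, j, i) - switching_rate(x, i, j)
            for j in range(n) if j != i)
        for i in range(n)
    ]


x0 = [20.0, 0.0, 0.0]  # all 20 vehicles start on the first route
xdot = flow_derivative(x0)
```

With all vehicles on the first route, both alternatives gain flow at the same rate (about 358 vehicles per unit time each) even though they lead to different local destinations, because the fixed difference beyond the local area offsets the lower local marginal time of the route to the second destination.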
System Properties

This section presents two important properties of the proposed dynamical system (4): flow non-negativity and flow conservation. The former property ensures that the route flow is always non-negative during the switching process, and the latter property ensures the total demand of all local OD pairs equals the demand of the corresponding global OD pairs.

Theorem 1.

If the initial flow of all routes is non-negative, the local route flows determined by the dynamical system in Equation (4) are always non-negative.

Proof.

Suppose there exists a route whose flow becomes negative in the switching process. As the dynamical system is continuous, the route flow must reach 0 before becoming negative. Without loss of generality, let the route be $i$, i.e., $x^{rs}_{\omega i} = 0$. As all other route flows are non-negative, we have $x^{rs}_{\kappa j} \ge 0$, $i \in P^{\omega}$, $j \in P^{\kappa}$, $j \neq i$, $\omega, \kappa \in \Omega^{rs}$, $(r, s) \in R \times S$. According to Equation (3),

$$\dot{x}^{rs}_{\omega i} = \sum_{\kappa \in \Omega^{rs}} \sum_{\substack{j \in P^{\kappa} \\ j \neq i}} \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) x^{rs}_{\kappa j} \ge 0 \tag{5}$$

Eq. (5) indicates that $x^{rs}_{\omega i}$ will never be negative if the initial states of all route flows are non-negative. □

As discussed earlier, the sum of local route flows for the same local OD pair is not conserved. However, the summation of the demand of all local OD pairs belonging to the same global OD pair is conserved. It should be equal to the OD demand for the corresponding global OD pair. The following theorem discusses this fact.

Theorem 2.

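Let $D^{rs,\tau}$ be the demand at node $r$ destined to node $s$ at the beginning of $\tau$. Then, $\sum_{\omega \in \Omega^{rs}} \sum_{i \in P^{\omega}} x^{rs}_{\omega i} = D^{rs,\tau}$ always holds for the dynamical system defined in Equation (4).

Proof.

From Equation (4), we have:

$$\sum_{\omega \in \Omega^{rs}} \sum_{i \in P^{\omega}} \dot{x}^{rs}_{\omega i} = \sum_{\omega \in \Omega^{rs}} \sum_{i \in P^{\omega}} \sum_{\kappa \in \Omega^{rs}} \sum_{\substack{j \in P^{\kappa} \\ j \neq i}} \Gamma\left(C^{\kappa}_{j} - C^{\omega}_{i} - \Delta_{\kappa\omega} - \delta^{rs}\right) x^{rs}_{\kappa j} - \sum_{\omega \in \Omega^{rs}} \sum_{i \in P^{\omega}} \sum_{\kappa \in \Omega^{rs}} \sum_{\substack{j \in P^{\kappa} \\ j \neq i}} \Gamma\left(C^{\omega}_{i} - C^{\kappa}_{j} - \Delta_{\omega\kappa} - \delta^{rs}\right) x^{rs}_{\omega i} = 0. \tag{6}$$

The two sums cancel because exchanging the roles of $(\omega, i)$ and $(\kappa, j)$ maps each term of the first sum to a term of the second. Since the total flow equals $D^{rs,\tau}$ initially and its rate of change is zero, $\sum_{\omega \in \Omega^{rs}} \sum_{i \in P^{\omega}} x^{rs}_{\omega i} = D^{rs,\tau}$ always holds. □

Theorems 1 and 2 imply that the feasible state vector set is positively invariant under the dynamical system.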
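Both properties can also be read off a coefficient matrix of the form used in $\dot{\boldsymbol{x}} = \Phi(\boldsymbol{x}) \cdot \boldsymbol{x}$: non-negative off-diagonal entries mean a route with zero flow can only gain flow, and zero column sums mean the total flow does not change. The Python sketch below checks this on hypothetical numbers. The marginal times (assumed to already include the fixed beyond-area part), the tolerance, and the function names are illustrative assumptions, not values from the study.

```python
def rectifier(v):
    """Gamma(v) = max(v, 0)."""
    return max(v, 0.0)


def coefficient_matrix(costs, tol):
    """Build Phi so that xdot = Phi * x for one global OD pair.

    costs[k] is the marginal travel time of route k including the fixed
    part beyond the local area (an assumption that folds Delta into the
    cost so that all routes can be compared directly).
    """
    n = len(costs)
    phi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Rate coefficient of flow moving from route j to route i.
                phi[i][j] = rectifier(costs[j] - costs[i] - tol)
    for i in range(n):
        # Diagonal: minus the total rate at which flow leaves route i.
        phi[i][i] = -sum(phi[j][i] for j in range(n) if j != i)
    return phi


costs = [35.0, 17.0, 17.0]   # hypothetical marginal times
flows = [20.0, 0.0, 0.0]     # hypothetical route flows
n = len(flows)

phi = coefficient_matrix(costs, tol=0.1)
xdot = [sum(phi[i][j] * flows[j] for j in range(n)) for i in range(n)]
column_sums = [sum(phi[i][j] for i in range(n)) for j in range(n)]
off_diagonal = [phi[i][j] for i in range(n) for j in range(n) if i != j]
```

In this instance the two routes with zero flow have non-negative rates, and the rates of the three routes sum to zero, which is what Theorems 1 and 2 state for the general system.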

Convergence Analysis

We show that for arbitrary initial route flows, the route flows determined by Equation (4) will always converge to the ALSO solution.
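Before the formal argument, the claim can be checked numerically on a toy case. The Python sketch below integrates the switching dynamics with a plain forward Euler scheme from several initial flows and measures how far the final state is from ALSO. The instance (three single-link routes with linear travel times), the step size, the number of steps, and the residual measure are all assumptions chosen for this illustration; the sketch is not the solution procedure of the study.

```python
def rectifier(v):
    """Gamma(v) = max(v, 0)."""
    return max(v, 0.0)


# Hypothetical instance: (a, b, beyond) per local route, where the link
# travel time is t(x) = a + b * x and `beyond` is the assumed fixed
# marginal travel time from the route's local destination to s.
ROUTES = [(10.0, 0.5, 5.0), (12.0, 0.25, 5.0), (8.0, 0.5, 9.0)]
TOL = 0.1  # tolerance delta^{rs} (assumed value)


def global_marginal_times(x):
    """Local marginal time a + 2*b*x plus the fixed beyond-area part."""
    return [a + 2.0 * b * xi + m for (a, b, m), xi in zip(ROUTES, x)]


def flow_derivative(x):
    """Net inflow minus outflow of every route under the switching rule."""
    g = global_marginal_times(x)
    n = len(x)
    return [
        sum(rectifier(g[j] - g[i] - TOL) * x[j]
            - rectifier(g[i] - g[j] - TOL) * x[i]
            for j in range(n) if j != i)
        for i in range(n)
    ]


def residual(x):
    """Largest remaining switching rate; zero exactly at an ALSO state."""
    g = global_marginal_times(x)
    n = len(x)
    return max(rectifier(g[i] - g[j] - TOL) * x[i]
               for i in range(n) for j in range(n) if i != j)


def integrate(x, step=0.005, n_steps=20000):
    """Forward Euler integration of the route switching dynamics."""
    for _ in range(n_steps):
        xdot = flow_derivative(x)
        x = [xi + step * di for xi, di in zip(x, xdot)]
    return x


starts = [[20.0, 0.0, 0.0], [0.0, 20.0, 0.0],
          [0.0, 0.0, 20.0], [5.0, 5.0, 10.0]]
finals = [integrate(x0) for x0 in starts]
residuals = [residual(x) for x in finals]
```

Different starting points may end at different flow vectors, since any state whose used routes differ by at most the tolerance is a rest point; what the runs share is a vanishing residual, an unchanged total of 20 vehicles, and non-negative flows.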

Theorem 3. (LaSalle's theorem).

Let $\Omega \subset D$ be a compact set that is positively invariant with respect to the autonomous system $\dot{x} = f(x)$. Let $V: D \to \mathbb{R}$ be a continuously differentiable function such that $\dot{V} \le 0$ in $\Omega$. Let $E$ be the set of all points in $\Omega$ where $\dot{V}(x) = 0$. Let $M$ be the largest invariant set in $E$. Then, every solution starting in $\Omega$ approaches $M$ as $t \to \infty$ (Khalil, 1996).

Lemma 1.

The dynamical system defined by Equation (4) is an autonomous (time-invariant) system.

Proof.

As discussed earlier, the marginal travel time $C^{\omega}_{i}$ is obtained through the updated real-time local route flows, which are the state variables $\boldsymbol{x}$. Therefore, the ODEs in Equation (4) do not explicitly depend on $t$. That is to say, the dynamical system is a time-invariant system. □

We now investigate the candidate Lyapunov function:

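$$V(\boldsymbol{x}) = \sum_{(r,s) \in R \times S} \sum_{\omega \in \Omega^{rs}} \sum_{\kappa \in \Omega^{rs}} \sum_{i \in P^{\omega}} \sum_{j \in P^{\kappa}} \Gamma^{2}\left(C^{\omega}_{i} - C^{\kappa}_{j} - \Delta_{\omega\kappa} - \delta^{rs}\right) x^{rs}_{\omega i} \tag{7}$$

which is a differentiable distance measure from $\boldsymbol{x}$ to ALSO.

Lemma 2.

If the link travel time on each link is continuously differentiable and non-decreasing, $\dot{V} \le 0$ for the dynamical system defined in Equation (4), and the equality holds only when $\dot{\boldsymbol{x}} = 0$.

Proof.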

Note that 𝜕𝑉𝜕𝑥 $%!" = X X X X X 2Γ t𝐶 % ( $ ( − 𝐶 ) ( ’ ( − Δ $ ( ’ ( − 𝛿 ! ( " ( u 𝑥 $ ( % ( ! ( " ( v𝜕𝐶 % ( $ ( 𝜕𝑥 $%!" − 𝜕𝐶 % ( $ ( 𝜕𝑥 $%!" w ) ( ∈- !( % ( ∈- ’( ’ ( ∈3 "( $ ( ∈3 "( (! ( ," ( )∈;×= + X X Γ L𝐶 %$ − 𝐶 )’ − Δ $’ − 𝛿 !" M )∈- ! ’∈3 " (8) Denote 𝑱 as the Jacobian matrix of the vector of the marginal travel time function. If we use [𝑖, 𝜔, 𝑟, 𝑠] as the index for the element corresponding to the index of 𝑥 $%!" in 𝒙 , we have: L(𝒙̇ > )𝑱M[𝑖, 𝜔, 𝑟, 𝑠] (9) = X X X 𝑥̇ $ ( % ( ! ( " ( 𝜕𝐶 % ( $ ( 𝜕𝑥 $%!"% ( ∈- ’( $ ( ∈3 "( (! ( ," ( )∈;×= = X X X X X Γ t𝐶 ) ( ’ ( − 𝐶 % ( $ ( − Δ ’ ( $ ( − 𝛿 ! ( " ( u 𝑥 ’ ( ) ( ! ( " ( 𝜕𝐶 % ( $ ( 𝜕𝑥 $%!") ( ∈- !( % ( ∈- ’( ’ ( ∈3 "( $ ( ∈3 "( (! ( ," ( )∈;×= − X X X X X Γ t𝐶 % ( $ ( − 𝐶 ) ( ’ ( − Δ $ ( ’ ( − 𝛿 ! ( " ( u 𝑥 $ ( % ( ! ( " ( 𝜕𝐶 % ( $ ( 𝜕𝑥 $%!") ( ∈- !( % ( ∈- ’( ’ ( ∈3 "( $ ( ∈3 "( (! ( ," ( )∈;×= = − X X X X X Γ t𝐶 % ( $ ( − 𝐶 ) ( ’ ( − Δ $ ( ’ ( − 𝛿 ! ( " ( u 𝑥 $ ( % ( ! ( " ( v𝜕𝐶 % ( $ ( 𝜕𝑥 $%!" − 𝜕𝐶 % ( $ ( 𝜕𝑥 $%!" w ) ( ∈- !( % ( ∈- ’( ’ ( ∈3 "( $ ( ∈3 "( (! ( ," ( )∈;×= Combining Equations (8) and (9), we have

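$$\dot V = \nabla V(\boldsymbol{x})\cdot\dot{\boldsymbol{x}} = -2\,\dot{\boldsymbol{x}}^{\top}\boldsymbol{J}\,\dot{\boldsymbol{x}} + \sum_{(r,s)\in R\times S}\ \sum_{\omega\in\Omega_{rs}}\ \sum_{\kappa\in\Omega_{rs}}\ \sum_{i\in P_{\omega}}\ \sum_{j\in P_{\kappa}} \Gamma^{2}\!\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\delta^{rs}\right)\dot x_i^{\omega,rs} \tag{10}$$

Since the link travel time of each link is continuously differentiable and non-decreasing, the marginal travel time of each link is also continuously differentiable and non-decreasing. Thereby, $\boldsymbol{J}$ is positive semi-definite. We have

$$\begin{aligned}
\dot V &\le \sum_{(r,s)}\sum_{\omega}\sum_{\kappa}\sum_{i\in P_{\omega}}\sum_{j\in P_{\kappa}} \Gamma^{2}\!\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\delta^{rs}\right)\dot x_i^{\omega,rs} \\
&= \sum_{(r,s)}\sum_{\omega}\sum_{\kappa}\sum_{i\in P_{\omega}}\sum_{j\in P_{\kappa}}\sum_{\kappa'\in\Omega_{rs}}\sum_{j'\in P_{\kappa'}} \Gamma^{2}\!\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\delta^{rs}\right)\Gamma\!\left(C_{j'}^{\kappa'}-C_i^{\omega}-\Delta^{\kappa'\omega}-\delta^{rs}\right)x_{j'}^{\kappa',rs} \\
&\quad - \sum_{(r,s)}\sum_{\omega}\sum_{\kappa}\sum_{i\in P_{\omega}}\sum_{j\in P_{\kappa}}\sum_{\kappa'\in\Omega_{rs}}\sum_{j'\in P_{\kappa'}} \Gamma^{2}\!\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\delta^{rs}\right)\Gamma\!\left(C_i^{\omega}-C_{j'}^{\kappa'}-\Delta^{\omega\kappa'}-\delta^{rs}\right)x_i^{\omega,rs}
\end{aligned} \tag{11}$$

Denote $\Gamma_{ij} = \Gamma\!\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\delta^{rs}\right)$ for $i\in P_{\omega}$, $j\in P_{\kappa}$, $\omega,\kappa\in\Omega_{rs}$, $(r,s)\in R\times S$. Equation (11) can be rewritten as

$$\begin{aligned}
\dot V &\le \sum_{(r,s)}\sum_{\omega}\sum_{\kappa}\sum_{i\in P_{\omega}}\sum_{j\in P_{\kappa}}\sum_{\kappa'\in\Omega_{rs}}\sum_{j'\in P_{\kappa'}} \Gamma_{ij}^{2}\left(\Gamma_{j'i}\,x_{j'}^{\kappa',rs} - \Gamma_{ij'}\,x_i^{\omega,rs}\right) \\
&= \sum_{(r,s)}\sum_{\omega}\sum_{\kappa}\sum_{i\in P_{\omega}}\sum_{j\in P_{\kappa}}\sum_{\kappa'\in\Omega_{rs}}\sum_{j'\in P_{\kappa'}} \left(\Gamma_{ij}^{2} - \Gamma_{j'j}^{2}\right)\Gamma_{j'i}\,x_{j'}^{\kappa',rs}
\end{aligned} \tag{12}$$

When $\Gamma_{j'i} > 0$, we have $C_{j'}^{\kappa'} - C_i^{\omega} - \Delta^{\kappa'\omega} - \delta^{rs} > 0$. Therefore,

$$\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\delta^{rs}\right) - \left(C_{j'}^{\kappa'}-C_j^{\kappa}-\Delta^{\kappa'\kappa}-\delta^{rs}\right) = -\left(C_{j'}^{\kappa'}-C_i^{\omega}-\Delta^{\kappa'\omega}\right) < -\delta^{rs} \le 0 \tag{13}$$

Because $\Gamma$ is non-negative and, by Equation (13), the argument of $\Gamma_{ij}$ is smaller than that of $\Gamma_{j'j}$, there are three possible situations for $\Gamma_{ij}$ and $\Gamma_{j'j}$:

(i) $\Gamma_{ij} = 0$, $\Gamma_{j'j} = 0$; then $\left(\Gamma_{ij}^{2} - \Gamma_{j'j}^{2}\right)\Gamma_{j'i}\,x_{j'}^{\kappa',rs} = 0$;

(ii) $\Gamma_{ij} = 0$, $\Gamma_{j'j} > 0$; then $\left(\Gamma_{ij}^{2} - \Gamma_{j'j}^{2}\right)\Gamma_{j'i}\,x_{j'}^{\kappa',rs} < 0$;

(iii) $\Gamma_{ij} > 0$, $\Gamma_{j'j} > 0$; then $\left(\Gamma_{ij}^{2} - \Gamma_{j'j}^{2}\right)\Gamma_{j'i}\,x_{j'}^{\kappa',rs} = -\left(C_{j'}^{\kappa'}-C_i^{\omega}-\Delta^{\kappa'\omega}\right)\left(\Gamma_{ij}+\Gamma_{j'j}\right)\Gamma_{j'i}\,x_{j'}^{\kappa',rs} < -\delta^{rs}\left(\Gamma_{ij}+\Gamma_{j'j}\right)\Gamma_{j'i}\,x_{j'}^{\kappa',rs} < 0$.

Therefore, we have

$$\frac{d}{dt}V(\boldsymbol{x}) = \nabla V(\boldsymbol{x})\cdot\dot{\boldsymbol{x}} < 0 \tag{14}$$

except at the ALSO state, that is, $\boldsymbol{x}\in E$, where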

$$E = \left\{\boldsymbol{x}\ \middle|\ \Gamma_{ij}\,x_i^{\omega,rs} = 0,\ \forall i\in P_{\omega},\ \forall j\in P_{\kappa},\ \forall \omega,\kappa\in\Omega_{rs},\ \forall (r,s)\in R\times S\right\} \tag{15}$$

□

Theorem 4. The dynamical system defined in Equation (4) always converges to the ALSO state from arbitrary initial route flows if the link travel time is continuously differentiable and non-decreasing.

Proof. According to Theorems 1, 2, and Lemma 1,

$$M = \left\{\boldsymbol{x}\ \middle|\ \sum_{\omega\in\Omega_{rs}}\ \sum_{i\in P_{\omega}} x_i^{\omega,rs} = D_{\tau}^{rs}\ \text{and}\ x_i^{\omega,rs}\ge 0,\ \forall i\in P_{\omega},\ \forall j\in P_{\kappa},\ \forall \omega,\kappa\in\Omega_{rs},\ \forall (r,s)\in R\times S\right\} \tag{16}$$

is a compact set that is positively invariant with respect to the autonomous system defined in Equation (4). The candidate Lyapunov function in Equation (7) is continuously differentiable and $\dot V \le 0$ in $M$. According to Lemma 2, the ALSO state set $E$ itself is the largest invariant set in $E$. Therefore, by LaSalle's theorem, initial states with arbitrary route flows, which are always in $M$, converge to the ALSO set $E$ as $t\to\infty$. □

Discrete Route Assignment Dynamical System

The dynamical system described in Equation (4) is a steady state model. For implementation, the dynamical system is discretized such that the difference between the state vectors in the $(k+1)$th iteration and the $k$th iteration is proportional to the local route flow changing rate in Equation (4):

$$\boldsymbol{X}_{k+1} - \boldsymbol{X}_{k} = \alpha_k\,\Phi(\boldsymbol{X}_k)\cdot\boldsymbol{X}_k \tag{17}$$

where $\alpha_k$ is the step size at the $k$th iteration. Many methods have been proposed to determine the step size in discrete day-to-day route switching models so as to guarantee convergence (Powell and Sheffi, 1982; Smith and Wisten, 1995; Mounce and Carey, 2015). In this study, the step size $\alpha_k$ is determined using the method proposed by Wang et al. (2019) to enhance convergence performance.

Numerical Illustration of Convergence Property

The Sioux Falls network is used to demonstrate the convergence performance of the proposed decentralized route flow assignment method. Suppose that, in the local area shown in Fig. 3(a), 4000 vehicles have just passed node 13 (on link 39) and are heading to destination node 16. The local area is enclosed by the red dashed circle and contains three local destination nodes: 23, 22, and 21. Based on the discussion of the distributed route flow assignment method, the local network can be equivalently depicted as in Fig. 3(b).

Fig. 3. Example network; (a) Local area in the Sioux Falls network; (b) Equivalent network of the local network.

The following BPR function is used to estimate the link travel time:

$$t_a = t_a^{0}\left[1+\left(\frac{v_a}{c_a}\right)^{4}\right],\quad \forall a \tag{18}$$

where $t_a$ is the travel time of link $a$; $t_a^{0}$ is the free-flow travel time of link $a$; and $v_a$ and $c_a$ are the flow and capacity of link $a$, respectively. The travel times of links 8, 9, and 10 are fixed at 11, 8, and 10, respectively, since these links lie beyond the local network. The other inputs for the BPR function are shown in Table 1.
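As a minimal sketch of how the link performance function in Equation (18) can be evaluated, the Python snippet below computes link and route travel times. The function names are hypothetical, and the exponent is exposed as a parameter whose default of 4 is an assumption (the conventional BPR value); the fixed times for links 8, 9, and 10 are the ones stated above.

```python
# Links beyond the local network have fixed travel times (from the text).
FIXED_TIMES = {8: 11.0, 9: 8.0, 10: 10.0}


def bpr_travel_time(flow, capacity, free_flow_time, power=4.0):
    """Link travel time t_a = t_a^0 * (1 + (v_a / c_a) ** power).

    `power` is an assumption of this sketch (conventional BPR exponent).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + (flow / capacity) ** power)


def route_travel_time(links, flows, capacities, free_flow_times, power=4.0):
    """Sum link travel times along a route given as a list of link ids."""
    total = 0.0
    for a in links:
        if a in FIXED_TIMES:
            # Link outside the local network: constant travel time.
            total += FIXED_TIMES[a]
        else:
            total += bpr_travel_time(flows[a], capacities[a],
                                     free_flow_times[a], power)
    return total
```

For instance, `bpr_travel_time(2.0, 4.0, 2.0)` evaluates a link with free-flow time 2 loaded to half its capacity.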

Table 1. Inputs of parameters in the BPR function for links in the local network.

Link       1  2  3  4  5  6  7
$t_a^{0}$  3  6  5  2  2  3  3
$c_a$

To measure the solution quality, we define the convergence indicator (denoted as $G$) as follows:

$$G = \frac{\displaystyle\sum_{(r,s)\in R\times S}\ \sum_{\omega\in\Omega_{rs}}\ \sum_{\kappa\in\Omega_{rs}}\ \sum_{i\in P_{\omega}}\ \sum_{j\in P_{\kappa},\,j\ne i} \Gamma\!\left(C_i^{\omega}-C_j^{\kappa}-\Delta^{\omega\kappa}-\min\!\left(C_k^{w},\ \forall k\in P_{w},\ w\in\Omega_{rs}\right)-\delta^{rs}\right)x_{i,\tau}^{\omega,rs}}{\displaystyle\sum_{(r,s)\in R\times S}\ \sum_{\omega\in\Omega_{rs}}\ \sum_{\kappa\in\Omega_{rs}}\ \sum_{i\in P_{\omega}} C_i^{\omega}\,x_{i,\tau}^{\omega,rs}}$$

The convergence indicator implies that the closer the path flow solution is to the approximated LSO state, the closer $G$ is to 0. For simplicity, assume $\Delta^{\omega\kappa}\equiv 0$, $\forall \omega,\kappa\in\Omega_{rs}$, and $\delta^{rs} = 0.1$. Fig. 4 shows the convergence performance of the proposed distributed route flow assignment method. It demonstrates that the method can obtain a solution with the convergence criterion (i.e., the value of $G$) lower than GE in 561 iterations. Thereby, this method can effectively solve the SO problem on the local network.

Fig. 4. Convergence performance of the proposed distributed route flow assignment method.

Table 2. LSO link flows.

Link ID  1     2       3       4      5  6  7      8       9       10
Flow     4000  1954.4  2045.6  659.3  0  0  598.9  1446.7  1258.2  1295.1

Vehicle Route Assignment

The route switching dynamical system provides an optimal local route flow solution at the ALSO state set. In this section, a vehicle route assignment problem is formulated to assign each vehicle to a specific route so as to achieve the optimal local route flows. To capture individual heterogeneity, the vehicle route assignment problem incorporates individual evaluation functions that define each individual's perception of the utility of a route. The problem seeks the route assigned to each individual that maximizes the sum of the utilities of all individuals. Throughout this section, we assume there is one driver/passenger in each vehicle, so the terms vehicle and individual are used interchangeably.

Suppose at time $\tau$ the initial route choices of all vehicles from origin node $r$ to destination node $s$ are denoted by $\boldsymbol{\mu} = (\mu_1, \mu_2, \dots, \mu_n)$, where $n = |D_{\tau}^{rs}|$ is the demand of the global OD pair $(r,s)$. Let $\boldsymbol{\eta} = (\eta_1, \eta_2, \dots, \eta_n)$ be the new routes assigned to the $n$ vehicles, respectively, by the vehicle route assignment problem. Let $v_i$ be individual $i$'s evaluation function, which measures the utility of a route for individual $i$. For example, $v_i(\rho) = -\lambda_i T(\rho)$ is the monetary value of the time lost by individual $i$ if he/she chooses route $\rho$, where $\lambda_i$ is the value of time of individual $i$ and $T(\rho)$ is the travel time of route $\rho$. We do not constrain the form of $v_i$; we only assume that the monetary loss is proportional to both the route travel time and the individual's value of time, i.e., $v_i(\rho) \propto -\lambda_i T(\rho)$. Let $\boldsymbol{v} = (v_1, v_2, \dots, v_n)$ be the vector of all individual evaluation functions.

To ensure each individual can accept the assigned route, let $p_i$ ($i = 1, 2, \dots, n$) be the payment of individual $i$, where $p_i$ can be negative or positive, meaning that the individual receives money (negative value) or pays money (positive value) for the assigned route. Let $\boldsymbol{p} = (p_1, p_2, \dots, p_n)$. Based on the above discussion, if individual $i$ switches from the initial route $\mu_i$ to the new route $\eta_i$, the utility of this change (denoted as $u_i$) is

$$u_i = v_i(\eta_i) - v_i(\mu_i) - p_i \tag{19}$$

We seek the optimal new route for each individual (i.e., $\boldsymbol{\eta}^{*}$) such that the sum of the utilities across individuals is maximized, i.e.,

$$\boldsymbol{\eta}^{*} = \arg\max_{\boldsymbol{\eta}} \sum_{i=1}^{n} u_i = \arg\max_{\boldsymbol{\eta}} \left(\sum_{i=1}^{n}\bigl(v_i(\eta_i) - v_i(\mu_i)\bigr) - C\right) = \arg\max_{\boldsymbol{\eta}} \sum_{i=1}^{n} v_i(\eta_i) \tag{20}$$

where $\sum_{i=1}^{n} p_i = C$ is a constant. Equation (20) indicates that the optimal route solution $\boldsymbol{\eta}^{*}$ does not depend on $p_i$, $\forall i$. Let $A = \{1, 2, \dots, n\}$ and $P_{rs}$ be the set of all individuals and the set of all local routes for global OD pair $(r,s)$, respectively. Denote $\rho$ as an arbitrary route in $P_{rs}$. The vehicle route assignment problem can be formulated as the following linear integer program:

$$\max_{b_{i\rho}} \sum_{(i,\rho)\in A\times P_{rs}} b_{i\rho}\,v_i(\rho) \tag{21a}$$

subject to

$$\sum_{i\in A} b_{i\rho} = x_{\rho}^{rs} \quad \text{for } \rho\in P_{rs}, \tag{21b}$$

$$\sum_{\rho\in P_{rs}} b_{i\rho} = 1 \quad \text{for } i\in A, \tag{21c}$$

$$b_{i\rho}\in\{0,1\} \quad \text{for } (i,\rho)\in A\times P_{rs} \tag{21d}$$

where $b_{i\rho}$ is a binary indicator: $b_{i\rho} = 1$ if vehicle $i$ is assigned to route $\rho$, and $b_{i\rho} = 0$ otherwise. The objective (21a) maximizes the sum of the utilities of the agents. Equation (21b) ensures that the vehicle route assignments lead to the approximated optimal route flow given by the route switching dynamical system (4). Equation (21c) indicates that each vehicle is assigned to exactly one route.

Theorem 5 (Heller and Tompkins, 1956).

Let $A$ be an $m$ by $n$ matrix whose rows can be partitioned into two disjoint sets $B$ and $C$. Then, the following four conditions together are sufficient for $A$ to be totally unimodular:

• Every entry in $A$ is $0$, $+1$, or $-1$;
• Every column of $A$ contains at most two non-zero (i.e., $+1$ or $-1$) entries;
• If two non-zero entries in a column of $A$ have the same sign, then the row of one is in $B$, and the other is in $C$;
• If two non-zero entries in a column of $A$ have opposite signs, then the rows of both are in $B$, or both are in $C$.

Let $\boldsymbol{y} = [y_1\ y_2\ \cdots]$ be a column vector of binary decision variables with dimension equal to the number of columns of matrix $A$. Let $b$ be a column vector with each entry being an integer. The following theorem is useful for obtaining the solution to problem (21).

Theorem 6 (Guzelsoy and Ralphs, 2007).

If the constraint matrix $A$ and the right-hand side vector $b$ of a mixed-integer program are totally unimodular and integer, respectively, then the linear integer programming problem constructed upon constraints $\{A\boldsymbol{y} = b,\ y_i = 0 \text{ or } 1,\ \forall i\}$ can be relaxed as the corresponding linear programming problem with constraints $\{A\boldsymbol{y} = b,\ 0\le y_i\le 1,\ \forall i\}$, i.e., the optimal solution of the relaxed linear program must be integer-valued.

It is not hard to prove that the constraint matrix of Equation (21) is totally unimodular: each column, which corresponds to one pair $(i,\rho)$, contains exactly two entries equal to $+1$, one in a row of (21b) and one in a row of (21c), so the conditions of Theorem 5 hold with $B$ the rows of (21b) and $C$ the rows of (21c). Note that the right-hand sides of Equations (21b) and (21c) are all integers. Using Theorem 6, we can replace the binary constraint defined in Equation (21d) with $b_{i\rho}\in[0,1]$ for $(i,\rho)\in A\times P_{rs}$. Thereby, the linear integer programming problem can be converted into a relaxed linear programming problem, which can be solved using, for example, the simplex method.

Note that agents' payments are related to the incentives they can obtain according to the incentive mechanism in the next section. However, Equation (20) shows that as long as the sum of incentives for the vehicle group is fixed, the optimal vehicle route assignments derived from group utility maximization do not change with the incentives received by each agent. In other words, the optimal incentives generated by the incentive mechanism do not affect the vehicle route assignment model.

The local network in Fig. 3(b) is used to illustrate the procedure to determine the vehicle route assignments. As we want to specify the route assigned to each vehicle, the number of vehicles in the group is limited to 20. The inputs for the parameters in the link performance function are shown in Table 3. Table 4 shows the computed LSO route flows.

Table 3. Inputs of parameters in the BPR function for links in the local network.

Link       1  2  3  4  5  6  7
$t_a^{0}$  3  6  5  2  2  3  3
$c_a$      8  4  4  4  4  4  4

Table 4. LSO route flows.

Route  Links    Travel time  Flow
1      1-2-10   289.96       8
2      1-2-4-9  290.09       2
3      1-3-7-9  331.61       1
4      1-3-8    331.50       9
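The paper solves problem (21) with the simplex method. As an illustrative alternative, and only for evaluation functions of the special form $v_i(\rho) = -\lambda_i T(\rho)$, the sketch below pairs the vehicles with the highest values of time with the fastest route slots, which maximizes objective (21a) by the rearrangement inequality. The function name is hypothetical; the route data come from Table 4 and the values of time are the ones listed in Table 5.

```python
# Table 4: route -> (travel time, integer LSO flow).
ROUTE_TIMES = {1: 289.96, 2: 290.09, 3: 331.61, 4: 331.50}
ROUTE_FLOWS = {1: 8, 2: 2, 3: 1, 4: 9}

# Values of time lambda_i for vehicles 1..20, as listed in Table 5.
VALUES_OF_TIME = dict(enumerate(
    [0.80, 0.91, 0.45, 0.46, 0.72, 0.64, 0.54, 0.84, 0.61, 0.42,
     0.60, 1.00, 0.40, 0.43, 0.87, 0.76, 0.23, 0.71, 0.49, 0.15], start=1))


def assign_routes(values_of_time, route_times, route_flows):
    """Sorted pairing for problem (21) when v_i(rho) = -lambda_i * T(rho).

    This is NOT the simplex method used in the paper; it is a shortcut
    that is valid only for this special form of the evaluation function.
    Returns a dict {vehicle_id: route_id}.
    """
    if sum(route_flows.values()) != len(values_of_time):
        raise ValueError("route flows must sum to the number of vehicles")
    # One slot per unit of flow (constraint (21b)), fastest route first.
    slots = []
    for route in sorted(route_times, key=route_times.get):
        slots.extend([route] * route_flows[route])
    # Highest value of time first; ties broken by vehicle id.
    vehicles = sorted(values_of_time, key=lambda i: (-values_of_time[i], i))
    # Each vehicle receives exactly one slot (constraint (21c)).
    return dict(zip(vehicles, slots))


assignment = assign_routes(VALUES_OF_TIME, ROUTE_TIMES, ROUTE_FLOWS)
```

With these inputs, the eight vehicles with the largest $\lambda_i$ receive Route 1 and the vehicle with the smallest $\lambda_i$ receives Route 3, which agrees with the $\eta_i$ column of Table 5.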

Suppose $v_i(\rho) = -\lambda_i T(\rho)$, where the value-of-time parameters $\lambda_i$ are randomly generated. The vehicle route assignment problem (21) is solved using the simplex method. The optimal vehicle route assignment results are shown in Table 5 in Section 5.1.

Incentive Mechanism

Similar to the route-swapping dynamical system, the vehicle route assignment problem (21) is also formulated from the system point of view, seeking to determine optimal routes for each individual to maximize the sum of utilities of the vehicle group. While the vehicle route assignment problem considers heterogeneity in individuals’ preferences by incorporating individual evaluation functions, its solutions do not depend on the individual-level payments. This indicates that the travel costs of the routes assigned to the individuals are different. Thereby, the vehicle route assignments may not be acceptable to some agents as these assignments do not fully consider fairness and individual heterogeneity. Next, we will propose an incentive mechanism to promote the acceptance of the vehicle route assignments. The incentive mechanism ensures that the vehicle route assignments are envy-free, implying each individual is satisfied with the route assigned to him/her. However, envy-freeness is achieved only if all agents are involved with the incentive mechanism voluntarily. To enable this, we provide additional group compensations to ensure each agent is expected to benefit from this incentive mechanism. Another concern is the honesty of the agents, i.e., whether agents report their information in the individual evaluation functions (e.g., the value of time) honestly. This concern will be addressed by the expectation incentive compatibility. We will show analytically that the agents will benefit the most from this incentive mechanism if they behave honestly. Through the aforementioned properties, the proposed incentive mechanism relaxes the obligation assumptions (honesty, compliance) in traveler behavior that constrain most existing studies. Individual travelers are not assumed to follow the recommended routes or behave honestly but are nudged to do so willingly.

Expected Envy-free Incentive Mechanism

We label vehicle route assignments as envy-free if the utility obtained by any agent through selecting another route is not higher than that of the one assigned to him/her, i.e.,

$$u_i(\eta_i, p_i) \ge u_i(\eta_j, p_j),\quad \forall i,j\in A \tag{22}$$

where $\eta_i$ is the route assigned to agent $i$ by the vehicle route assignment problem, and $\eta_j$ is another route (i.e., the one assigned to agent $j$). Denote $e_{ij}$ as agent $i$'s evaluation of the new route assigned to agent $j$:

$$e_{ij} = v_i(\eta_j) - v_i(\mu_j) + a_j,\quad i,j\in A \tag{23}$$

where $a_j$ is the adjustment incentive given to agent $j$. Haake et al. (2002) proposed a compensation procedure that eliminates the envy of the initial utility assignment for $n = |A|$ agents. Following the procedure, we can guarantee envy-freeness in $n-1$ compensation steps, which means we can determine $a_i$, $i\in A$, such that

$$e_{ii} \ge e_{ij},\quad \forall i,j\in A \tag{24}$$

Since we are pursuing budget balance, the sum of incentives $\sum_{i} a_i$ should be paid equally by all agents, which maintains envy-freeness. Taking this into account, each agent pays an additional $\frac{1}{n}\sum_{i} a_i$, and Equation (24) becomes Equation (22). In this way, agent $i$'s total payment can be derived as

$$p_i = \frac{1}{n}\sum_{j\in A} a_j - a_i \tag{25}$$

Note that $a_i$ and $p_i$ are related to the initial route choice $\mu_i$. It is reasonable to assume that, in the long term, the initial state at the beginning of each time interval is random. The adjustment procedure defined above compensates more the agents who are less satisfied with their initial routes. It is envy-free under a specific initial route choice distribution $\boldsymbol{\mu}$, but agents who have better initial routes would envy agents with worse initial routes. Since the optimal vehicle route assignments generated in the second stage are not related to the evaluation of the initial routes, agents with worse initial routes are envied more because they receive a higher compensation.
Therefore, to address the unfairness related to the randomness of the initial route choice distribution, we define agent $i$'s expected evaluation of the new route assigned to agent $j$ as the expected utility of agent $i$ if he/she were assigned to agent $j$'s new route:

$$E_{\boldsymbol{\mu}}(e_{ij}) = v_i(\eta_j) - E_{\boldsymbol{\mu}}\bigl[v_i(\mu_j)\bigr] + p_j,\quad i,j\in A \tag{26}$$

where $E_{\boldsymbol{\mu}}\bigl[v_i(\mu_j)\bigr] = \frac{1}{n}\sum_{k=1}^{n} v_i(\mu_k)$ is agent $i$'s expected evaluation of agent $j$'s initial route, which is the average of agent $i$'s evaluations of all initial routes. The expected envy-freeness is represented as

$$E_{\boldsymbol{\mu}}(e_{ii}) \ge E_{\boldsymbol{\mu}}(e_{ij}),\quad \forall i,j\in A \tag{27}$$

Based on this definition, the expected envy-free compensation procedure can be described as follows. Note that it is a computational procedure, which implies that there will not be an $n$-round monetary transfer or an extra charging step. Each agent is notified only of the incentive (positive or negative) associated with his/her route choice update.

Step 1. In the first round, there will always be at least one agent who experiences no envy (see Theorem 1 in Haake et al. (2002)). Therefore, no adjustment incentives are provided in the first round.

Step 2. Calculate $E_{\boldsymbol{\mu}}(e_{ij})$, $i,j\in A$. If all agents are expected envy-free, go to Step 4.

Step 3. Perform a new round of compensations: identify all agents whose maximum envy is directed at a non-envious agent, and give each of them an adjustment incentive equal to their maximum envy difference. Go to Step 2.

Step 4. Sum up all the adjustment incentives computed in all rounds, split the sum equally, and charge all agents.

Following the above steps, we compute the optimal incentives ($a_i$) and payments ($p_i$) for 20 vehicles, which are shown in Table 5 along with the route assignments $\eta_i$. Vehicle 1 is assigned to Route 1 in Table 4 (1-2-10), which is the route with the shortest travel time. However, it needs to pay its share of the sum of incentives, 12.401. Vehicle 3 is assigned to Route 4 in Table 4 (1-3-8), which has the second highest travel time. It is then compensated by 24.786 to ensure acceptance of this route. With its share of the sum of incentives also being 12.401, its total payment is 12.401 - 24.786 = -12.385. The expected evaluation table corresponding to the vehicle route assignments and payments is shown in Table 6, where the value in row $i$ and column $j$ is agent $i$'s expected evaluation of the new route assigned to agent $j$ as defined in Equation (26). The values on the diagonal (shaded in the original table) are the $E_{\boldsymbol{\mu}}(e_{ii})$. Note that these values are larger than the other values in the same column. Thereby, Equation (27) holds for all $i,j\in\{1,\dots,20\}$. This implies that each agent perceives the utility of the route assigned to him/her as being higher than that of the other routes. Therefore, the proposed incentive mechanism is expected envy-free.

Table 5. Vehicle route assignments and incentives.

Vehicle ID  $\lambda_i$  $\eta_i$  $a_i$    $p_i$
1           0.80         1         0.000    12.401
2           0.91         1         0.000    12.401
3           0.45         4         24.786   -12.385
4           0.46         4         24.786   -12.385
5           0.72         1         0.000    12.401
6           0.64         2         0.080    12.321
7           0.54         4         24.786   -12.385
8           0.84         1         0.000    12.401
9           0.61         2         0.080    12.321
10          0.42         4         24.786   -12.385
11          0.60         4         24.786   -12.385
12          1.00         1         0.000    12.401
13          0.40         4         24.786   -12.385
14          0.43         4         24.786   -12.385
15          0.87         1         0.000    12.401
16          0.76         1         0.000    12.401
17          0.23         4         24.786   -12.385
18          0.71         1         0.000    12.401
19          0.49         4         24.786   -12.385
20          0.15         3         24.788   -12.387
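The payment column of Table 5 can be cross-checked against the budget-balanced rule $p_i = \frac{1}{n}\sum_{j\in A} a_j - a_i$ in Equation (25). The sketch below is a minimal check under that reading; the function name is hypothetical, and the adjustment incentives are copied from the $a_i$ column of Table 5.

```python
def payments_from_incentives(adjustments):
    """Equation (25): p_i = (1/n) * sum_j a_j - a_i.

    adjustments: dict {vehicle_id: a_i}. Returns dict {vehicle_id: p_i}.
    """
    n = len(adjustments)
    equal_share = sum(adjustments.values()) / n
    return {i: equal_share - a for i, a in adjustments.items()}


# Adjustment incentives a_i from Table 5, keyed by the assigned route.
A_BY_ROUTE = {1: 0.000, 2: 0.080, 3: 24.788, 4: 24.786}
# Route assignments eta_i from Table 5 for vehicles 1..20.
ETA = [1, 1, 4, 4, 1, 2, 4, 1, 2, 4, 4, 1, 4, 4, 1, 1, 4, 1, 4, 3]

incentives = {i: A_BY_ROUTE[route] for i, route in enumerate(ETA, start=1)}
payments = payments_from_incentives(incentives)
```

Rounded to three decimals, the resulting payments are 12.401 for vehicles on Route 1, 12.321 on Route 2, -12.385 on Route 4, and -12.387 on Route 3, and they sum to zero across the group, which is the budget-balance property.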

Table 6. Expected evaluation table $E_{\boldsymbol{\mu}}(e_{ij})$.

     1       2       3       4       5       6       7       8       9       10      11      12      13      14      15      16      17      18      19      20
1    4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
2    4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
3    -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
4    -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
5    4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
6    4.124   6.449   -3.108  -2.713  2.545   0.958   -1.231  4.970   0.378   -3.631  0.015   8.319   -4.064  -3.359  5.592   3.372   -7.534  2.418   -2.235  -9.136
7    -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
8    4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
9    4.124   6.449   -3.108  -2.713  2.545   0.958   -1.231  4.970   0.378   -3.631  0.015   8.319   -4.064  -3.359  5.592   3.372   -7.534  2.418   -2.235  -9.136
10   -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
11   -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
12   4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
13   -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
14   -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
15   4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
16   4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
17   -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
18   4.143   6.482   -3.132  -2.735  2.555   0.958   -1.244  4.994   0.375   -3.659  0.009   8.363   -4.094  -3.385  5.620   3.387   -7.585  2.427   -2.254  -9.197
19   -4.105  -6.436  3.147   2.751   -2.522  -0.931  1.265   -4.953  -0.349  3.672   0.015   -8.311  4.106   3.399   -5.577  -3.351  7.585   -2.394  2.271   9.192
20   -4.112  -6.445  3.143   2.747   -2.529  -0.936  1.260   -4.961  -0.354  3.669   0.010   -8.321  4.103   3.395   -5.586  -3.358  7.584   -2.401  2.268   9.192

Expected Individual Rationality with Budget Balance

Individual rationality ensures that CAVs want to participate because they are at least as well off by participating as by not participating:

$$u_i = v_i(\eta_i) - v_i(\mu_i) - p_i \ge 0,\quad i\in A \tag{28}$$

Note that an incentive mechanism that increases the utilities of all agents may not exist, as some agents may be assigned a very good route initially. To address this problem, we introduce the definition of expected individual rationality. The incentive mechanism is expected individual rational if it does not decrease the expected utility of any agent, i.e.,

$$E_{\boldsymbol{\mu}}(u_i) = v_i(\eta_i) - E_{\boldsymbol{\mu}}\bigl(v_i(\mu_i)\bigr) - p_i \ge 0,\quad i\in A \tag{29}$$

Theorem 7. Suppose the expected evaluations of the new routes of all agents are higher than those of their old ones, i.e., $E_{\boldsymbol{\eta}}(v_i) \ge E_{\boldsymbol{\mu}}(v_i)$, $i\in A$. Then the incentive mechanism defined above is expected individual rational.

Proof. From Equation (27), we have $v_i(\eta_i) + a_i \ge v_i(\eta_j) + a_j$, $\forall i,j\in A$. Summing over $j\in A$ and writing $E_{\boldsymbol{\eta}}(v_i) = \frac{1}{n}\sum_{j\in A} v_i(\eta_j)$, we have

$$n\bigl(v_i(\eta_i) + a_i\bigr) \ge \sum_{j\in A}\bigl(v_i(\eta_j) + a_j\bigr)$$

$$v_i(\eta_i) + a_i - \frac{1}{n}\sum_{j\in A} a_j \ge E_{\boldsymbol{\eta}}(v_i)$$

$$u_i + v_i(\mu_i) \ge E_{\boldsymbol{\eta}}(v_i)$$

Therefore,

$$E_{\boldsymbol{\mu}}(u_i) \ge E_{\boldsymbol{\eta}}(v_i) - E_{\boldsymbol{\mu}}(v_i) \ge 0$$

□

Theorem 7 implies that vehicles are willing to participate in the mechanism if they perceive that the new route options developed using the route swapping dynamical system are better on average. However, the route swapping dynamical system cannot guarantee shorter travel times for the vehicles of every global OD pair, as it seeks to reach the ALSO state. In other words, the travel times of some vehicles will increase if they follow the routes provided by the vehicle route assignment problem (21). Thereby, they may not be willing to participate. To address this problem, we add a group compensation process that charges the vehicles that benefit from the route switching and compensates those whose travel times increase after route switching. The objective of the compensation is to achieve expected individual rationality for everyone. Consider the situation where $v_i(\rho) = -\lambda_i T(\rho)$. The condition for expected individual rationality in Theorem 7 can be written as

$$\lambda_i\left(\bar T_{\boldsymbol{\mu}} - \bar T_{\boldsymbol{\eta}}\right) \ge 0 \tag{30}$$

where $\bar T_{\boldsymbol{\mu}}$ and $\bar T_{\boldsymbol{\eta}}$ are the average travel times of the initial route options $\boldsymbol{\mu}$ and the new route options $\boldsymbol{\eta}$, respectively. For each agent in the vehicle group, we provide an additional group compensation, $\lambda_i\left(\bar T_{\boldsymbol{\eta}} - \bar T_{\boldsymbol{\mu}} + \epsilon\right)$, where $\epsilon \ge 0$.
Then, agent $i$'s expected evaluation of agent $j$'s new route becomes

$$E_{\boldsymbol{\mu}}(e_{ij}) = -\lambda_i T(\eta_i) + \lambda_i \bar T_{\boldsymbol{\mu}} + a_j + \lambda_i\left(\bar T_{\boldsymbol{\eta}} - \bar T_{\boldsymbol{\mu}} + \epsilon\right) = -\lambda_i T'(\eta_i) + \lambda_i \bar T_{\boldsymbol{\mu}} + a_j \tag{31}$$

where $T'(\eta_i) = T(\eta_i) + \bar T_{\boldsymbol{\mu}} - \bar T_{\boldsymbol{\eta}} - \epsilon$ is the adjusted travel time of $\eta_i$. Replacing $\bar T_{\boldsymbol{\eta}}$ with $\bar T_{\boldsymbol{\eta}'} = \frac{1}{n}\sum_{i=1}^{n} T'(\eta_i)$ in Equation (30), the additional group compensation ensures the expected individual rationality condition:

$$\lambda_i\left(\bar T_{\boldsymbol{\mu}} - \bar T_{\boldsymbol{\eta}'}\right) = \lambda_i\,\epsilon \ge 0 \tag{32}$$

The sum of all payments by agents in Equation (20) in the vehicle route assignment model can be formulated as

$$\sum_{i=1}^{n} p_i = -\left(\sum_{i=1}^{n}\lambda_i\right)\left(\bar T_{\boldsymbol{\eta}} - \bar T_{\boldsymbol{\mu}} + \epsilon\right) = -n\bar\lambda\left(\bar T_{\boldsymbol{\eta}} - \bar T_{\boldsymbol{\mu}} + \epsilon\right) \tag{33}$$

Note that the sum of all payments is a constant; thus, the optimal vehicle route assignments do not change after we introduce the additional group compensations. When $\epsilon = 0$, $\sum_{i=1}^{n} p_i$ is positive if $\bar T_{\boldsymbol{\eta}} < \bar T_{\boldsymbol{\mu}}$, and negative if $\bar T_{\boldsymbol{\eta}} > \bar T_{\boldsymbol{\mu}}$. This indicates that the traffic manager collects money from the vehicle groups whose average travel time decreases and pays the vehicle groups whose average travel time increases. Let us assume that the average value of time $\bar\lambda$ is the same for all vehicle groups. Then, the traffic manager can achieve expected individual rationality and can make a profit in the long term if the total travel time of all vehicle groups decreases (when $\epsilon = 0$, the expected individual utility is the same for participating and non-participating agents). We can adjust $\epsilon$ to a positive value to realize budget balance and to ensure that each agent participating in the proposed incentive mechanism benefits (i.e., the utility of switching to the assigned route is positive).

Expected Incentive Compatibility

In economics, an incentive compatibility constraint motivates agents to behave in a manner consistent with the optimal solution. In our incentive mechanism, the objective is that the utilities agents obtain by reporting their true route evaluation functions (i.e., $v_i$, $\forall i$) are no smaller than those obtained by reporting arbitrary route evaluation functions (denoted as $v_i'$), so as to promote honest behavior. That is, the incentive compatibility constraint seeks to ensure

$$u_i\bigl((v_i,\ \boldsymbol{v} - v_i)\bigr) \ge u_i\bigl((v_i',\ \boldsymbol{v} - v_i)\bigr),\quad \forall i \tag{34}$$

where $\boldsymbol{v} - v_i$ denotes the combination of the route evaluation functions of all vehicles except agent $i$. However, Green and Laffont (1979) have shown that non-manipulability is incompatible with envy-freeness and the budget balance constraint. Andersson et al. (2014) compromise on non-manipulability by seeking the least-manipulable mechanisms among the envy-free and budget-balanced ones. They define a new measure of minimal manipulability as the number of agents who can manipulate the rule at a given preference profile and show that optimal fair allocation rules can be obtained as agents-counting-minimally manipulable rules. Another approach is to weaken or abandon the budget balance constraint. Cohen et al. (2010) prove that the payments satisfying the envy-freeness and incentive compatibility constraints can be determined separately by finding the shortest paths in a weighted directed graph. Moreover, they show that by removing some of the edges in the graph, the shortest paths can represent payments that are both envy-free and incentive compatible. Sun and Yang (2003) replace the budget-balance constraint with a maximum payment limit for each indivisible object. They develop an allocation mechanism that fairly assigns the objects and always elicits honest preferences over both objects and money (incentive compatibility).
In our study, we modify the incentive compatibility defined in Equation (34) into the expected incentive compatibility, which indicates that agents will have no better expected utilities by manipulating their preferences. Suppose $v_i(\rho) = -\lambda_i T(\rho)$, $\lambda_i\in[\lambda^{L}, \lambda^{U}]$, where $\lambda_i > 0$ is the value of time of agent $i$, $T(\rho)$ is the travel time of path $\rho$, and $\lambda^{L}$, $\lambda^{U}$ are the lower bound and upper bound of $\lambda_i$, respectively. Next, we show that agents cannot benefit by reporting a manipulated $\lambda_i$ under the proposed expected envy-free compensation mechanism.

Lemma 3. If $v_i(\rho) = -\lambda_i T(\rho)$, $\forall i$, then the optimal sum of utilities defined by Equation (20) is

$$\max_{\boldsymbol{\eta}} \sum_{i=1}^{n} u_i = -\sum_{r=1}^{n} \lambda_{(r)}\,T_{(r)} + F(\boldsymbol{v}, \boldsymbol{\mu}) \tag{35}$$

where $\lambda_{(r)}$ is the $r$th largest $\lambda$ among all agents, $T_{(r)}$ is the $r$th shortest travel time among all agent routes, and $F(\boldsymbol{v}, \boldsymbol{\mu})$ is a constant denoting the sum of the evaluations of agents' initial routes.

Proof. Note that $\sum_{i=1}^{n} a_i b_i$ is maximum when both $\{a_i\}$ and $\{b_i\}$ are sorted in ascending order or both in descending order. Therefore, the maximum sum of utilities is achieved when the $r$th largest $\lambda$ is paired with the $r$th shortest travel time. □

Lemma 4. If the optimal vehicle route assignment problem assigns the agent with the value of time $\lambda_{(r)}$ to the route with travel time $T_{(r)}$, then the payment of the agent can be formulated as:

$$p_{(1)} = \frac{1}{n}\sum_{j=2}^{n}\sum_{l=2}^{j} \lambda_{(l)}\left(T_{(l)} - T_{(l-1)}\right),\qquad p_{(i)} = \frac{1}{n}\sum_{j=2}^{n}\sum_{l=2}^{j} \lambda_{(l)}\left(T_{(l)} - T_{(l-1)}\right) - \sum_{l=2}^{i} \lambda_{(l)}\left(T_{(l)} - T_{(l-1)}\right) \tag{36}$$

Proof.

For notational simplicity, we re-number the agents from the largest 𝜆 to the minimum. We denote the adjustment incentive for the 𝑗 ?@ agent in the 𝑘 ?@ round as 𝑎 )[A] . In the first round, 𝑎 )[4] = 0, 𝑗 ∈ 𝐴 , we have 𝐸 𝝁 (𝑒 ) = 𝜆 (4) L𝑇–(𝝁) − 𝑇 (4)

M + 𝑎 ≥ 𝜆 (4)

L𝑇–(𝝁) − 𝑇 ())

M + 𝑎 )[4] = 𝐸 𝝁 L𝑒 M, 𝑗 ∈ 𝐴

Therefore, agent is expected envy-free. In round 2, agent 𝑗(𝑗 > 1) is envious of agent 1 the most. Therefore, 𝑎 )[5] =𝜆 ()) L𝑇 ()) − 𝑇 (4) M, 𝑗 ≥ 2 , we have 𝐸 𝝁 (𝑒 ) = 𝜆 (5) L𝑇–(𝝁) − 𝑇 (5)

M + 𝑎 ≥ 𝜆 (5)

L𝑇–(𝝁) − 𝑇 (4)

M = 𝐸 𝝁 (𝑒 ); 𝐸 𝝁 (𝑒 ) = 𝜆 (5) L𝑇–(𝝁) − 𝑇 (5)

M + 𝑎 ≥ 𝜆 (5)

L𝑇–(𝝁) − 𝑇 ())

M + 𝑎 )[5] = 𝐸 𝝁 L𝑒 M, 𝑗 ≥ 2.

Agent 2 becomes expected envy-free. While 𝐸 𝝁 (𝑒 ) = 𝜆 (4) L𝑇–(𝝁) − 𝑇 (4)

M + 𝑎 ≥ 𝜆 (4)

L𝑇–(𝝁) − 𝑇 ())

M + 𝜆 ()) L𝑇 ()) − 𝑇 (4) M = 𝐸 𝝁 L𝑒 M, 𝑗 ∈ 𝐴 which indicates agent remains expected envy-free. Similarly, in Round 𝑘(𝑘 > 2) , agent 𝑗(𝑗 ≥ 𝑘) is envious of agent 𝑘 − 1 the most. Therefore, 𝑎 )[A] = 𝑎 AG4[AG4] +𝜆 ()) L𝑇 ()) − 𝑇 (AG4) M = ∑ 𝜆 (U) L𝑇 (U) − 𝑇 (UG4) M + 𝜆 ()) L𝑇 ()) − 𝑇 (AG4) M AG4UK5 , 𝑗 ≥ 𝑘 , we have 𝐸 𝝁 (𝑒 AA ) = 𝜆 (A) L𝑇–(𝝁) − 𝑇 (A)

M + 𝑎

A[A] = 𝜆 (A)

L𝑇–(𝝁) − 𝑇 (A)

M + X 𝜆 (U) L𝑇 (U) − 𝑇 (UG4) M + 𝑎 )[)]AUK)B4 ≥ 𝜆 (A)

L𝑇–(𝝁) − 𝑇 (A)

M + 𝜆 (A)

X L𝑇 (U) − 𝑇 (UG4)

M + 𝑎 )[)]AUK)B4 = 𝜆 (A)

L𝑇–(𝝁) − 𝑇 ())

M + 𝑎 )[)] = 𝐸 𝝁 L𝑒 A) M, ∀ 𝑗 < 𝑘; 𝐸 𝝁 (𝑒 AA ) = 𝜆 (A) L𝑇–(𝝁) − 𝑇 (A)

M + 𝑎

A[A] ≥ 𝜆 (A)

L𝑇–(𝝁) − 𝑇 ())

M + 𝑎 )[A] = 𝐸 𝝁 L𝑒 A) M, ∀ 𝑗 ≥ 𝑘.

Agent $k$ becomes expected envy-free. Meanwhile, for $j < k$ and $s \ge k$,

$$\begin{aligned}
E_{\boldsymbol{\mu}}(e_{jj}) &= \lambda^{(j)}\left(\bar{T}(\boldsymbol{\mu}) - T^{(j)}\right) + a_j^{[j]} \\
&= \lambda^{(j)}\left(\bar{T}(\boldsymbol{\mu}) - T^{(s)}\right) + a_j^{[j]} + \lambda^{(j)}\sum_{l=j+1}^{s}\left(T^{(l)} - T^{(l-1)}\right) \\
&= \lambda^{(j)}\left(\bar{T}(\boldsymbol{\mu}) - T^{(s)}\right) + a_j^{[j]} + \lambda^{(j)}\sum_{l=j+1}^{k-1}\left(T^{(l)} - T^{(l-1)}\right) + \lambda^{(j)}\left(T^{(s)} - T^{(k-1)}\right) \\
&\ge \lambda^{(j)}\left(\bar{T}(\boldsymbol{\mu}) - T^{(s)}\right) + a_j^{[j]} + \lambda^{(j)}\sum_{l=j+1}^{k-1}\left(T^{(l)} - T^{(l-1)}\right) + \lambda^{(s)}\left(T^{(s)} - T^{(k-1)}\right) \\
&\ge \lambda^{(j)}\left(\bar{T}(\boldsymbol{\mu}) - T^{(s)}\right) + a_s^{[k]} = E_{\boldsymbol{\mu}}(e_{js}),
\end{aligned}$$

which indicates that agent $j$ ($j < k$) stays expected envy-free. Therefore, after $k$ rounds, agents $1, 2, \ldots, k$ become expected envy-free. At the end of round $n$, we have the final adjustment incentives $a_i$ which make all agents expected envy-free:

$$a_1 = a_1^{[1]} = 0; \qquad a_i = a_i^{[i]} = \sum_{l=2}^{i}\lambda^{(l)}\left(T^{(l)} - T^{(l-1)}\right), \quad i \ge 2. \tag{37}$$

Then, we can derive the payments from $p_i = \frac{1}{n}\sum_{j=1}^{n} a_j - a_i$, $i \in A$:

$$p_1 = \frac{1}{n}\sum_{j=2}^{n}\sum_{l=2}^{j}\lambda^{(l)}\left(T^{(l)} - T^{(l-1)}\right), \qquad p_i = \frac{1}{n}\sum_{j=2}^{n}\sum_{l=2}^{j}\lambda^{(l)}\left(T^{(l)} - T^{(l-1)}\right) - \sum_{l=2}^{i}\lambda^{(l)}\left(T^{(l)} - T^{(l-1)}\right). \qquad \square$$

Fig. 5. Relationship between $a_i$ and $p_i$.

The result of Lemma 4 is also illustrated in Fig. 5. The green line represents the adjustment incentives that eliminate the expected envy, while the red line shows the equal share that each agent needs to pay for the incentives. Combining them, the blue lines in the figure represent the total payment each agent makes (blue lines below the red line are positive payments, while those above are negative). Now, we can analyze the additional utility an agent can gain through manipulation.
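Before turning to manipulation, the payments of Lemma 4 can be made concrete with a small numerical sketch. The code below is an illustration only, not the authors' implementation: the function names, the four sample values of time, and the four travel times are hypothetical, travel times are treated as fixed numbers (the expectation over $\boldsymbol{\mu}$ is ignored), and it assumes that agents are already sorted from the largest to the smallest $\lambda$ and routes from the shortest to the longest travel time. It computes the adjustment incentives of Equation (37), the payments $p_i = \frac{1}{n}\sum_j a_j - a_i$, and an agent's utility taken as $-\lambda_i T - p$.

```python
from itertools import accumulate


def adjustment_incentives(lams, times):
    """Equation (37): a_1 = 0 and a_i = sum_{l=2}^{i} lam_l * (T_l - T_{l-1}).

    Assumes lams is sorted in decreasing order and times in increasing order.
    """
    steps = [0.0] + [
        lams[l] * (times[l] - times[l - 1]) for l in range(1, len(lams))
    ]
    return list(accumulate(steps))


def payments(lams, times):
    """Payment of each agent: equal share of all incentives minus own incentive."""
    a = adjustment_incentives(lams, times)
    share = sum(a) / len(a)
    return [share - a_i for a_i in a]


def utility(lam, time, payment):
    """Utility of an agent with value of time lam for a (route, payment) bundle."""
    return -lam * time - payment


# Hypothetical example: four agents, four routes.
lams = [0.9, 0.6, 0.5, 0.2]        # values of time, largest first
times = [50.0, 52.0, 55.0, 59.0]   # travel times, shortest first
a = adjustment_incentives(lams, times)
p = payments(lams, times)

for i, (lam, t, pay) in enumerate(zip(lams, times, p), start=1):
    print(f"agent {i}: lambda={lam}, T={t}, payment={pay:+.3f}")
```

In this sketch the payments sum to zero (budget balance) and no agent prefers another agent's route-and-payment bundle to his or her own, which is the deterministic counterpart of the expected envy-freeness argued in the proof.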

Lemma 5. (Proof in Appendix A).

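Let $T^{(0)} = T^{(1)}$ for simplicity. If the agent with the $r$-th largest $\lambda$ reports a fake value of time $\lambda_f$ which ranks $k$-th among all $\lambda$s, the additional utility $\Delta u_{(r)}^{(k)}(\lambda_f)$ he/she gains is as follows.

If $r > k$:

$$\Delta u_{(r)}^{(k)}(\lambda_f) = \frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=k}^{r-1}\left(\lambda^{(j)} - \lambda^{(r)}\right)\left(T^{(j+1)} - T^{(j)}\right) - \sum_{j=k}^{r-1}\frac{n-j}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right); \tag{38a}$$

if $r = k$:

$$\Delta u_{(r)}^{(k)}(\lambda_f) = \frac{r-1}{n}\left(\lambda_f - \lambda^{(r)}\right)\left(T^{(r)} - T^{(r-1)}\right); \tag{38b}$$

if $r < k$:

$$\Delta u_{(r)}^{(k)}(\lambda_f) = -\left[\lambda^{(r)} - \frac{k-1}{n}\lambda_f - \frac{n+1-k}{n}\lambda^{(k)}\right]\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=r}^{k-1}\left(\lambda^{(r)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right) + \sum_{j=r}^{k-1}\frac{n-j+1}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right). \tag{38c}$$

From Equation (38), we note that $\Delta u_{(r)}^{(k)}(\lambda_f)$ is also related to the distribution of $\lambda^{(j)}$ ($j \ne r$) and $T^{(j)}$ ($j \in A$), which are unknown to the agent with the $r$-th largest $\lambda$. Therefore, different from the definitions of expected envy-freeness and expected individual rationality, the definition of expected incentive compatibility is related not only to the distribution of $\boldsymbol{\mu}$ but also to the distributions of $\boldsymbol{\eta}$ and $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \ldots, \lambda_n)$. Denote $\lambda_t$ as the real value of time for the agent. Note that in Equation (38), $\lambda^{(r)}$ is actually $\lambda_t$, which is also a variable when defining expected incentive compatibility. Therefore, the expected incentive compatibility can be defined as:

$$E_{\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\eta}}\left[\Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\right] \le 0 \tag{39}$$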
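The additional utility from misreporting can also be evaluated by direct simulation rather than through the closed forms. The sketch below is a hedged illustration under stated assumptions, not the authors' code: all function names are hypothetical, travel times are treated as fixed, the $k$-th ranked report is assumed to receive the $k$-th shortest travel time, payments follow $p_i = \frac{1}{n}\sum_j a_j - a_i$, and utility is taken as $-\lambda_t T - p$ evaluated with the true value of time. The helper `average_gain` averages the gain over random draws of the other agents' values of time and of the route travel times; its default ranges are the $U(0.1, 0.9)$ and $U(50, 60)$ used in the numerical experiment of this section.

```python
import random


def assign_and_pay(reports, sorted_times):
    """Rank agents by reported value of time (largest first), give the k-th
    ranked agent the k-th shortest travel time, and charge mean(a) - a_i."""
    n = len(reports)
    order = sorted(range(n), key=lambda i: -reports[i])
    a = [0.0]  # adjustment incentive per rank
    for rank in range(1, n):
        gap = sorted_times[rank] - sorted_times[rank - 1]
        a.append(a[-1] + reports[order[rank]] * gap)
    share = sum(a) / n
    time_of = [0.0] * n
    pay_of = [0.0] * n
    for rank, agent in enumerate(order):
        time_of[agent] = sorted_times[rank]
        pay_of[agent] = share - a[rank]
    return time_of, pay_of


def additional_utility(true_lams, agent, fake, sorted_times):
    """Utility when `agent` reports `fake` minus utility when reporting truthfully."""
    lam = true_lams[agent]
    t_h, p_h = assign_and_pay(true_lams, sorted_times)
    reports = list(true_lams)
    reports[agent] = fake
    t_f, p_f = assign_and_pay(reports, sorted_times)
    return (-lam * t_f[agent] - p_f[agent]) - (-lam * t_h[agent] - p_h[agent])


def average_gain(lam_t, lam_f, n, reps,
                 lam_range=(0.1, 0.9), time_range=(50.0, 60.0), seed=0):
    """Monte Carlo average of the additional utility over random other agents."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        lams = [lam_t] + [rng.uniform(*lam_range) for _ in range(n - 1)]
        times = sorted(rng.uniform(*time_range) for _ in range(n))
        total += additional_utility(lams, 0, lam_f, times)
    return total / reps


# Hypothetical example: the agent with the 3rd largest value of time (0.5)
# reports 0.55, which keeps the same rank (r = k = 3, n = 4).
true_lams = [0.9, 0.6, 0.5, 0.2]
sorted_times = [50.0, 52.0, 55.0, 59.0]
gain_same_rank = additional_utility(true_lams, 2, 0.55, sorted_times)
formula_38b = (3 - 1) / 4 * (0.55 - 0.5) * (55.0 - 52.0)
print(gain_same_rank, formula_38b)
```

For a rank-preserving misreport the simulated gain agrees with Equation (38b), and a truthful report yields a gain of exactly zero. The sketch makes no claim about the sign of `average_gain` for other reports; that is the subject of Theorem 8.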

Theorem 8. (Proof in Appendix A).

Suppose $\lambda_i \sim U(\lambda_l, \lambda_u)$, $i \in A$, where $\lambda_l$ and $\lambda_u$ are the lower bound and upper bound of $\lambda_i$, respectively. If an agent with $\lambda_t$ reports a fake value of time $\lambda_f$, the additional expected utility he/she will gain is non-positive if $\lambda_f - \lambda_t \le 0$ or $\lambda_f - \lambda_t \ge \frac{2}{n}(\lambda_u - \lambda_l)$, where $n$ is the number of agents.

Theorem 8 implies that, under the above assumptions, an agent can only have a chance to gain positive expected additional utility by reporting a $\lambda_f$ greater than his/her real value of time $\lambda_t$ but less than $\lambda_t + \frac{2}{n}(\lambda_u - \lambda_l)$. It is reasonable to assume that in practice there is a minimum division value in the value of time settings (akin to a minimum division value of $0.01 when making a transfer). If the minimum division value is greater than or equal to $\frac{2}{n}(\lambda_u - \lambda_l)$, no value of time manipulation would be beneficial for any agent.

Corollary 1.

If agents have to report their value of time from the feasible value of time set $\left\{ \frac{j\lambda_l + (N-j)\lambda_u}{N} \;\middle|\; j = 0, 1, \ldots, N \right\}$, then, when $N$ is large, the proposed incentive mechanism is expected incentive compatible for $n \ge 2N$, which implies that agents will always report the feasible value that is larger than, but closest to, their true value of time.

Proof. The minimum division value of the above feasible value of time set is $\frac{\lambda_u - \lambda_l}{N}$. When $n \ge 2N$, the minimum division value is greater than or equal to $\frac{2}{n}(\lambda_u - \lambda_l)$. For a large $N$, the reported value of time $\lambda_i$ still approximately follows the uniform distribution; thus, Theorem 8 holds. Therefore, the proposed mechanism is expected incentive compatible under this condition. $\square$

Fig. 6. Average additional utility that the agent with $\lambda_t$ gains by reporting $\lambda_f$.

In practice, the number of vehicles in the vehicle group with the same global OD pair is large enough to satisfy the condition in Corollary 1 in most cases. We now illustrate the expected incentive compatibility with a numerical experiment by calculating the average additional utility that agents can gain by reporting different $\lambda_f$. Since we seek to illustrate that the average additional utility achieves its maximum value, 0, when $\lambda_f = \lambda_t$, the unit does not matter in the verification. We assume that agents' values of time follow the uniform distribution $U(0.1, 0.9)$, and the minimum division value is 0.01. According to Corollary 1, the number of vehicles in the vehicle group is set as $n = 2 \times \frac{0.9 - 0.1}{0.01} = 160$. We randomly generate 160 route travel times from $U(50, 60)$. For each $(\lambda_t, \lambda_f)$ pair, we generate 159 other $\lambda$s randomly from $U(0.1, 0.9)$, follow the vehicle route assignment model and the incentive mechanism to calculate the utility of the agent when he/she reports $\lambda_t$ honestly and the utility when he/she reports $\lambda_f$, and compute the additional utility gain. We repeat the procedure 100 times to calculate the average additional utility for each $(\lambda_t, \lambda_f)$ pair and plot the result in Fig. 6. It shows that agents cannot gain positive average additional utility by reporting a manipulated $\lambda_f$; when $\lambda_f = \lambda_t$ (the agent is behaving honestly), the average additional utility reaches its maximum, 0.

Concluding Comments

This study proposes an incentive-based decentralized routing strategy for CAVs using information propagation in a local area. The dynamic traffic network is decomposed into deterministic network problems in small time intervals. In each time interval, following a decentralized three-stage scheme, vehicles update their route choice and take the corresponding incentives. In the first stage, a decentralized route assignment model based on a local route switching dynamical system is developed to obtain an optimal route flow solution for the ALSO state. Next, a vehicle route assignment problem is formulated to assign each vehicle a route to achieve the ALSO state and to maximize the sum of the utilities based on individual evaluation functions. Finally, we propose an expected envy-free incentive mechanism to charge or compensate the agents so that everyone can accept the assigned route determined by the optimal vehicle route assignment problem. We further analyze the expected individual rationality, budget balance, and expected incentive compatibility of the incentive mechanism. The budget balance constraint is used in this study to highlight incentive strategies that do not require external funding. Instead, they entail the exchange of payments among CAV-based individual travelers (agents) whose behavioral preferences are input into their vehicles a priori. Payments are thereby exchanged seamlessly by the CAVs through V2V and V2I communications in real time. To the best of our knowledge, this is the first attempt to bridge the gap between heterogeneous individual-level objectives and system-level objectives in the context of vehicular routing decisions for OD travel while ensuring practical realism. The proposed decentralized routing strategy can enhance system performance and ensure individual satisfaction simultaneously by incorporating a route assignment model with an incentive mechanism.
Application of the routing strategy requires the availability of local real-time traffic information, which can be realized in connected and autonomous driving environments. Moreover, the proposed routing strategy can be solved analytically and implemented in a fully decentralized manner to circumvent the computational issue that limits most existing approaches in practice. Further, we theoretically prove that the proposed incentive mechanism satisfies the expected individual rationality constraint, the budget-balance constraint, and the expected incentive compatibility constraint simultaneously, which enhances its practical applicability and realism. In summary, the proposed decentralized routing strategy is deployable in terms of both computational tractability and behavioral realism. The study opens a new avenue related to incentive/pricing strategies that leverage emerging connectivity and automation technologies. There are opportunities for future enhancements related to this study, including: (i) integrating the ALSO-based routing strategies and the incentive mechanism to analyze system performance; (ii) proposing a modeling framework to optimize the values of the cover range $d$ and the time interval $\tau$ to improve system performance; (iii) extending the context to a multimodal traffic environment; (iv) analyzing the expected individual rationality and expected incentive compatibility for general individual evaluation functions instead of the specific form $v_i(\rho) = -\lambda_i T(\rho)$ used in this study; (v) using a cell or link transmission model instead of the BPR function to better characterize the travel time in both uncongested and congested traffic flow environments; and (vi) incorporating a reputation system to further strengthen the incentive compatibility (to ensure that agents do take the routes determined for them rather than just report their preferences honestly).

Acknowledgements

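This study is supported by funding from the National Science Foundation (1662692-CMMI) and Georgia Institute of Technology to the second author. Additional support is provided to the third author from the Natural Science Foundation of China (52002191) and the Natural Science Foundation of Zhejiang Province (LQ20E08004). Any errors or omissions remain the sole responsibility of the authors.

Appendix A. Proofs of Lemma 5 and Theorem 8

Lemma 5.

Let $T^{(0)} = T^{(1)}$ for simplicity. If the agent with the $r$-th largest $\lambda$ reports a fake value of time $\lambda_f$ which ranks $k$-th among all $\lambda$s, the additional utility $\Delta u_{(r)}^{(k)}(\lambda_f)$ he/she gains is as follows.

If $r > k$:

$$\Delta u_{(r)}^{(k)}(\lambda_f) = \frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=k}^{r-1}\left(\lambda^{(j)} - \lambda^{(r)}\right)\left(T^{(j+1)} - T^{(j)}\right) - \sum_{j=k}^{r-1}\frac{n-j}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right); \tag{38a}$$

if $r = k$:

$$\Delta u_{(r)}^{(k)}(\lambda_f) = \frac{r-1}{n}\left(\lambda_f - \lambda^{(r)}\right)\left(T^{(r)} - T^{(r-1)}\right); \tag{38b}$$

if $r < k$:

$$\Delta u_{(r)}^{(k)}(\lambda_f) = -\left[\lambda^{(r)} - \frac{k-1}{n}\lambda_f - \frac{n+1-k}{n}\lambda^{(k)}\right]\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=r}^{k-1}\left(\lambda^{(r)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right) + \sum_{j=r}^{k-1}\frac{n-j+1}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right). \tag{38c}$$

Proof.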

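The utility difference consists of three parts: the evaluation difference between $T^{(r)}$ and $T^{(k)}$, the adjustment incentive difference for the $r$-th largest and $k$-th largest $\lambda$, and the difference in $\sum_{j=1}^{n} a_j$ under $(\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(r)}, \ldots, \lambda^{(n)})$ and $(\lambda^{(1)}, \ldots, \lambda_f, \ldots, \lambda^{(n)})$.

Fig. 7. Adjustment incentive difference for (a) $r > k$; (b) $r < k$.

For $r > k$,

$$\begin{aligned}
\Delta u_{(r)}^{(k)}(\lambda_f) &= \left[\lambda^{(r)}\left(T^{(r)} - T^{(k)}\right)\right] + \left\{\lambda_f\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=k}^{r}\lambda^{(j)}\left(T^{(j)} - T^{(j-1)}\right)\right\} \\
&\quad - \frac{1}{n}\left\{\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right)(n+1-k) + \sum_{j=k}^{r-1}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right)(n-j)\right\} \\
&= \frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=k}^{r-1}\left(\lambda^{(j)} - \lambda^{(r)}\right)\left(T^{(j+1)} - T^{(j)}\right) \\
&\quad - \sum_{j=k}^{r-1}\frac{n-j}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right);
\end{aligned}$$

For $r = k$,

$$\Delta u_{(r)}^{(k)}(\lambda_f) = 0 + \left[\left(\lambda_f - \lambda^{(r)}\right)\left(T^{(r)} - T^{(r-1)}\right)\right] - \frac{1}{n}\left[\left(\lambda_f - \lambda^{(r)}\right)\left(T^{(r)} - T^{(r-1)}\right)(n+1-r)\right] = \frac{r-1}{n}\left(\lambda_f - \lambda^{(r)}\right)\left(T^{(r)} - T^{(r-1)}\right);$$

For $r < k$,

$$\begin{aligned}
\Delta u_{(r)}^{(k)}(\lambda_f) &= \left[\lambda^{(r)}\left(T^{(r)} - T^{(k)}\right)\right] + \left\{\lambda_f\left(T^{(k)} - T^{(k-1)}\right) + \sum_{j=r}^{k-1}\lambda^{(j+1)}\left(T^{(j)} - T^{(j-1)}\right) - \lambda^{(r)}\left(T^{(r)} - T^{(r-1)}\right)\right\} \\
&\quad + \frac{1}{n}\left\{\left(\lambda^{(k)} - \lambda_f\right)\left(T^{(k)} - T^{(k-1)}\right)(n+1-k) + \sum_{j=r}^{k-1}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right)(n-j+1)\right\} \\
&= -\left[\lambda^{(r)} - \frac{k-1}{n}\lambda_f - \frac{n+1-k}{n}\lambda^{(k)}\right]\left(T^{(k)} - T^{(k-1)}\right) - \sum_{j=r}^{k-1}\left(\lambda^{(r)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right) \\
&\quad + \sum_{j=r}^{k-1}\frac{n-j+1}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right). \qquad \square
\end{aligned}$$

Theorem 8.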

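Suppose $\lambda_i \sim U(\lambda_l, \lambda_u)$, $i \in A$, where $\lambda_l$ and $\lambda_u$ are the lower bound and upper bound of $\lambda_i$, respectively. If an agent with $\lambda_t$ reports a fake value of time $\lambda_f$, the additional expected utility he/she will gain is non-positive if $\lambda_f - \lambda_t \le 0$ or $\lambda_f - \lambda_t \ge \frac{2}{n}(\lambda_u - \lambda_l)$, where $n$ is the number of agents.

Proof. When $\lambda_f = \lambda_t = \lambda^{(r)}$, $\Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t) = 0$, and we have

$$E_{\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\eta}}\left[\Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\right] = 0. \tag{40}$$

When $\lambda_f < \lambda_t = \lambda^{(r)}$, the agent does not know the distribution of $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$, and hence does not know how $\lambda_f$ and $\lambda_t$ rank among all agents. Thus,

$$E_{\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\eta}}\left[\Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\right] = \int_{\boldsymbol{\eta}}\int_{\boldsymbol{\lambda}}\sum_{r=1}^{n}\sum_{k=r}^{n} P_{r,k}^{n-1}\, P_{\lambda|r,k}^{n-1}\, P_{\boldsymbol{\eta}}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\, d\boldsymbol{\lambda}\, d\boldsymbol{\eta} \tag{41}$$

where $P_{r,k}^{n-1} = P\left(\lambda_f \in (\lambda^{(k-1)}, \lambda^{(k)}],\ \lambda_t \in (\lambda^{(r-1)}, \lambda^{(r+1)}]\right)$ is the probability that $\lambda_f$ is larger than or equal to $\lambda^{(k)}$ but smaller than $\lambda^{(k-1)}$ while $\lambda_t$ is larger than or equal to $\lambda^{(r+1)}$ but smaller than $\lambda^{(r-1)}$ when there are $n - 1$ other agents, $P_{\lambda|r,k}^{n-1}$ is the probability distribution of $\lambda$ conditioned on $\lambda_f \in (\lambda^{(k-1)}, \lambda^{(k)}]$ and $\lambda_t \in (\lambda^{(r-1)}, \lambda^{(r+1)}]$, and $P_{\boldsymbol{\eta}}$ is the probability distribution of $\boldsymbol{\eta}$. Re-combining the terms in Equation (41), we have

$$\begin{aligned}
\sum_{r=1}^{n}\sum_{k=r}^{n} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t = \lambda^{(r)}) &= -\sum_{r=1}^{n}\sum_{k=r}^{n} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\left[\lambda_t - \frac{k-1}{n}\lambda_f - \frac{n+1-k}{n}\lambda^{(k)}\right]\left(T^{(k)} - T^{(k-1)}\right) \\
&\quad - \sum_{r=1}^{n-1}\sum_{k=r+1}^{n} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\left\{\sum_{j=r}^{k-1}\left(\lambda_t - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right) - \sum_{j=r}^{k-1}\frac{n-j+1}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j)} - T^{(j-1)}\right)\right\} \le 0
\end{aligned} \tag{42}$$

When $\lambda_f > \lambda_t$,

$$E_{\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\eta}}\left[\Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\right] = \int_{\boldsymbol{\eta}}\int_{\boldsymbol{\lambda}}\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1}\, P_{\lambda|r,k}^{n-1}\, P_{\boldsymbol{\eta}}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\, d\boldsymbol{\lambda}\, d\boldsymbol{\eta} \tag{43}$$

Since $\lambda_i$ follows a uniform distribution, $\lambda_i \sim U(\lambda_l, \lambda_u)$, $i \in A$, we have

$$P_{r,k}^{n-1} = P\left(\lambda_f \in (\lambda^{(k-1)}, \lambda^{(k)}],\ \lambda_t \in (\lambda^{(r-1)}, \lambda^{(r+1)}]\right) = \binom{n-1}{k-1}\binom{n-k}{r-k}\left(\frac{\lambda_u - \lambda_f}{\lambda_u - \lambda_l}\right)^{k-1}\left(\frac{\lambda_f - \lambda_t}{\lambda_u - \lambda_l}\right)^{r-k}\left(\frac{\lambda_t - \lambda_l}{\lambda_u - \lambda_l}\right)^{n-r} \tag{44}$$

Re-combining the terms in Equation (43), we have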

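$$\begin{aligned}
\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t = \lambda^{(r)}) &= \sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\,\frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) \\
&\quad - \sum_{r=1}^{n}\sum_{k=1}^{r-1} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\left\{\sum_{j=k}^{r-1}\left(\lambda^{(j)} - \lambda_t\right)\left(T^{(j+1)} - T^{(j)}\right) + \sum_{j=k}^{r-1}\frac{n-j}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right)\right\}.
\end{aligned}$$

The $k = 1$ terms of the first double sum vanish because of the factor $k - 1$. Re-indexing the second double sum with $k \to k - 1$ then gives

$$\begin{aligned}
\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t) = \sum_{r=2}^{n}\sum_{k=2}^{r}\Bigg[ & P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\,\frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) \\
& - P_{r,k-1}^{n-1} P_{\lambda|r,k-1}^{n-1}\left\{\sum_{j=k-1}^{r-1}\left(\lambda^{(j)} - \lambda_t\right)\left(T^{(j+1)} - T^{(j)}\right) + \sum_{j=k-1}^{r-1}\frac{n-j}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right)\right\}\Bigg].
\end{aligned}$$

From Equation (44), $P_{r,k}^{n-1} = \frac{r-k+1}{k-1}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t}\, P_{r,k-1}^{n-1}$. Therefore, we have

$$\begin{aligned}
\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t) = \sum_{r=2}^{n}\sum_{k=2}^{r} P_{r,k-1}^{n-1}\Bigg[ & P_{\lambda|r,k}^{n-1}\,\frac{r-k+1}{k-1}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t}\,\frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) \\
& - P_{\lambda|r,k-1}^{n-1}\left\{\sum_{j=k-1}^{r-1}\left(\lambda^{(j)} - \lambda_t\right)\left(T^{(j+1)} - T^{(j)}\right) + \sum_{j=k-1}^{r-1}\frac{n-j}{n}\left(\lambda^{(j)} - \lambda^{(j+1)}\right)\left(T^{(j+1)} - T^{(j)}\right)\right\}\Bigg] \\
\le \sum_{r=2}^{n}\sum_{k=2}^{r} P_{r,k-1}^{n-1}\Bigg[ & P_{\lambda|r,k}^{n-1}\,\frac{r-k+1}{k-1}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t}\,\frac{k-1}{n}\left(\lambda_f - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right) \\
& - P_{\lambda|r,k-1}^{n-1}\left(\left(\lambda^{(k-1)} - \lambda_t\right)\left(T^{(k)} - T^{(k-1)}\right) + \frac{n-k+1}{n}\left(\lambda^{(k-1)} - \lambda^{(k)}\right)\left(T^{(k)} - T^{(k-1)}\right)\right)\Bigg]
\end{aligned}$$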

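Note that since $\lambda_i$ follows a uniform distribution, $\lambda_i \sim U(\lambda_l, \lambda_u)$, $i \in A$,

$$\int_{\boldsymbol{\lambda}} P_{\lambda|r,k}^{n-1}\left(\lambda_f - \lambda^{(k)}\right) d\boldsymbol{\lambda} = \frac{\lambda_f - \lambda_t}{r-k+1}; \quad \int_{\boldsymbol{\lambda}} P_{\lambda|r,k-1}^{n-1}\left(\lambda^{(k-1)} - \lambda_t\right) d\boldsymbol{\lambda} = \frac{(r-k+1)(\lambda_f - \lambda_t)}{r-k+2}; \quad \int_{\boldsymbol{\lambda}} P_{\lambda|r,k-1}^{n-1}\left(\lambda^{(k-1)} - \lambda^{(k)}\right) d\boldsymbol{\lambda} = \frac{\lambda_f - \lambda_t}{r-k+2}.$$

Therefore, we have

$$\int_{\boldsymbol{\lambda}}\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\, d\boldsymbol{\lambda} \le \sum_{r=2}^{n}\sum_{k=2}^{r} P_{r,k-1}^{n-1}\left(\lambda_f - \lambda_t\right)\left(T^{(k)} - T^{(k-1)}\right)\left(\frac{1}{n}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t} + \frac{1}{n}\,\frac{k-1}{r-k+2} - 1\right) \tag{45}$$

$$\begin{aligned}
&\le \left[\sum_{r=2}^{n}\sum_{k=2}^{r} P_{r,k-1}^{n-1}\left(\lambda_f - \lambda_t\right)\left(\frac{1}{n}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t} + \frac{1}{n}\,\frac{k-1}{r-k+2} - 1\right)\right]\cdot\left[\sum_{r=2}^{n}\sum_{k=2}^{r}\left(T^{(k)} - T^{(k-1)}\right)\right] \\
&= \left(\lambda_f - \lambda_t\right)\left[\frac{1}{n}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t} + \frac{1}{n}\sum_{r=2}^{n}\sum_{k=2}^{r}\frac{k-1}{r-k+2}\, P_{r,k-1}^{n-1} - 1\right]\sum_{r=2}^{n}\left(T^{(r)} - T^{(1)}\right)
\end{aligned}$$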
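The rank probability in Equation (44) and the three uniform order-statistic integrals used to reach Equation (45) can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the function names and the sample values ($n = 6$, $\lambda_l = 0.1$, $\lambda_u = 0.9$, $\lambda_t = 0.4$, $\lambda_f = 0.6$) are hypothetical. `rank_probability` evaluates Equation (44) and is summed over all $(r, k)$ with $k \le r$; `mean_gaps` draws $m$ values uniformly between $\lambda_t$ and $\lambda_f$ and estimates the mean distance from $\lambda_f$ to the largest draw, from the largest draw to $\lambda_t$, and from the largest draw to the next value below it.

```python
from math import comb
import random


def rank_probability(n, r, k, lam_f, lam_t, lam_l, lam_u):
    """Equation (44) for lam_f > lam_t: of the n - 1 other agents, k - 1 lie
    above lam_f, r - k lie between lam_t and lam_f, and n - r lie below lam_t."""
    width = lam_u - lam_l
    above = (lam_u - lam_f) / width
    between = (lam_f - lam_t) / width
    below = (lam_t - lam_l) / width
    return (comb(n - 1, k - 1) * comb(n - k, r - k)
            * above ** (k - 1) * between ** (r - k) * below ** (n - r))


def mean_gaps(m, lam_t, lam_f, samples=20000, seed=0):
    """Monte Carlo means for m i.i.d. uniform draws on (lam_t, lam_f)."""
    rng = random.Random(seed)
    top = bottom = spacing = 0.0
    for _ in range(samples):
        pts = sorted(rng.uniform(lam_t, lam_f) for _ in range(m))
        largest = pts[-1]
        runner_up = pts[-2] if m > 1 else lam_t
        top += lam_f - largest          # lam_f minus the largest draw
        bottom += largest - lam_t       # largest draw minus lam_t
        spacing += largest - runner_up  # largest draw minus the next value
    return top / samples, bottom / samples, spacing / samples


n, lam_l, lam_u, lam_t, lam_f = 6, 0.1, 0.9, 0.4, 0.6
pairs = [(r, k) for r in range(1, n + 1) for k in range(1, r + 1)]
total = sum(rank_probability(n, r, k, lam_f, lam_t, lam_l, lam_u)
            for r, k in pairs)
expected_above = sum((k - 1) * rank_probability(n, r, k, lam_f, lam_t, lam_l, lam_u)
                     for r, k in pairs)
top, bottom, spacing = mean_gaps(3, lam_t, lam_f)
print(total, expected_above, top, bottom, spacing)
```

With $d = \lambda_f - \lambda_t$, the estimates for $m$ draws should be close to $d/(m+1)$, $m\,d/(m+1)$, and $d/(m+1)$. Setting $m = r - k$ gives the first integral, and setting $m = r - k + 1$ gives the second and third. The probabilities sum to one, and the mean number of other agents ranked above $\lambda_f$ is $(n-1)(\lambda_u - \lambda_f)/(\lambda_u - \lambda_l)$ when there are $n - 1$ other agents.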

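Note that

$$\begin{aligned}
\sum_{r=2}^{n}\sum_{k=2}^{r}\frac{k-1}{r-k+2}\, P_{r,k-1}^{n-1} &= \sum_{r=2}^{n}\sum_{k=2}^{r}\frac{k-1}{r-k+2}\,\frac{(n-1)!}{(k-2)!\,(r-k+1)!\,(n-r)!}\,\frac{\left(\lambda_u - \lambda_f\right)^{k-2}\left(\lambda_f - \lambda_t\right)^{r-k+1}\left(\lambda_t - \lambda_l\right)^{n-r}}{\left(\lambda_u - \lambda_l\right)^{n-1}} \\
&= \frac{\lambda_u - \lambda_l}{n\left(\lambda_f - \lambda_t\right)}\sum_{r=2}^{n}\sum_{k=2}^{r}(k-1)\, P_{r,k-1}^{n}
\end{aligned} \tag{46}$$

where $P_{r,k-1}^{n}$ is the probability that $\lambda_f$ is larger than or equal to $\lambda^{(k)}$ but smaller than $\lambda^{(k-1)}$ while $\lambda_t$ is larger than or equal to $\lambda^{(r+1)}$ but smaller than $\lambda^{(r-1)}$ when there are $n$ other agents. Note that $\sum_{r=1}^{n}\sum_{k=2}^{r+1}(k-1)\, P_{r,k-1}^{n}$ can be interpreted as the expected order of the agent with $\lambda_f$ among all $n$ agents. Hence,

$$\sum_{r=2}^{n}\sum_{k=2}^{r}(k-1)\, P_{r,k-1}^{n} \le \sum_{r=1}^{n}\sum_{k=2}^{r+1}(k-1)\, P_{r,k-1}^{n} = \frac{\lambda_u - \lambda_f}{\lambda_u - \lambda_l}\, n \tag{47}$$

Combining Equations (46) and (47), we have

$$\sum_{r=2}^{n}\sum_{k=2}^{r}\frac{k-1}{r-k+2}\, P_{r,k-1}^{n-1} \le \frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t} \tag{48}$$

Inserting Equation (48) back into Equation (45), we have

$$\begin{aligned}
\int_{\boldsymbol{\lambda}}\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1} P_{\lambda|r,k}^{n-1}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\, d\boldsymbol{\lambda} &\le \left(\lambda_f - \lambda_t\right)\left[\frac{1}{n}\,\frac{\lambda_u - \lambda_f}{\lambda_f - \lambda_t} + \frac{1}{n}\sum_{r=2}^{n}\sum_{k=2}^{r}\frac{k-1}{r-k+2}\, P_{r,k-1}^{n-1} - 1\right]\sum_{r=2}^{n}\left(T^{(r)} - T^{(1)}\right) \\
&\le \left[\frac{2}{n}\left(\lambda_u - \lambda_f\right) - \left(\lambda_f - \lambda_t\right)\right]\sum_{r=2}^{n}\left(T^{(r)} - T^{(1)}\right) \\
&\le \left[\frac{2}{n}\left(\lambda_u - \lambda_l\right) - \left(\lambda_f - \lambda_t\right)\right]\sum_{r=2}^{n}\left(T^{(r)} - T^{(1)}\right)
\end{aligned} \tag{49}$$

Therefore, when $\lambda_f - \lambda_t \ge \frac{2}{n}(\lambda_u - \lambda_l)$,

$$E_{\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\eta}}\left[\Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\right] = \int_{\boldsymbol{\eta}}\int_{\boldsymbol{\lambda}}\sum_{r=1}^{n}\sum_{k=1}^{r} P_{r,k}^{n-1}\, P_{\lambda|r,k}^{n-1}\, P_{\boldsymbol{\eta}}\, \Delta u_{(r)}^{(k)}(\lambda_f, \lambda_t)\, d\boldsymbol{\lambda}\, d\boldsymbol{\eta} \le \left[\frac{2}{n}\left(\lambda_u - \lambda_l\right) - \left(\lambda_f - \lambda_t\right)\right]\int_{\boldsymbol{\eta}}\sum_{r=2}^{n}\left(T^{(r)} - T^{(1)}\right) d\boldsymbol{\eta} \le 0. \qquad \square$$

References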