Cooperative Autonomous Vehicle Speed Optimization near Signalized Intersections

Abstract

Road congestion in urban environments, especially near signalized intersections, has been a major cause of significant fuel and time waste. Various solutions have been proposed to solve the problem of increasing idling times and number of stops of vehicles at signalized intersections, ranging from infrastructure to vehicle-based techniques. However, all the solutions introduced to solve the problem have approached the problem from a single vehicle point of view. This research introduces a game-theoretic cooperative speed optimization framework to minimize vehicles' idling times and number of stops at signalized intersections. This framework consists of three modules to cover individual autonomous vehicle speed optimization; conflict recognition; and cooperative speed decision making. A time token allocation algorithm is introduced through the proposed framework to allow the vehicles to cooperate and agree on certain speed actions such that the average idling times and number of stops are minimized. Simulation to test and validate the proposed framework is conducted and results are reported.


Mahmoud Faraj, Baris Fidan, and Vincent Gaudet
Faculty of Engineering, University of Waterloo, ON, Canada
Email: {msfaraj, fidan, vcgaudet}@uwaterloo.ca

In the U.S. every year, 4.8 billion hours are wasted by traffic congestion [1]. Some automobile companies have made leaps toward manufacturing Autonomous Vehicles (AVs), which can implement different levels of automatic functions as a means of achieving a safer and more efficient transportation system [2–6]. Some research has been conducted on optimal speed computation near Signalized Intersections (SIs) to minimize the vehicles' idling times and number of stops. [7] has proposed an algorithm called Green Light Optimal Speed Advisory (GLOSA) to minimize the number of stops at SIs through a journey. The impacts of this algorithm on traffic efficiency and average trip time were reported. The performance of the same algorithm was investigated again in [8] using the metrics of average fuel consumption and average stop time at SIs.

[9] has introduced an approach for single-vehicle optimal speed computation to reduce fuel consumption and CO2 emissions. In this approach, though, the case of a driver cruising his/her vehicle to pass a Traffic Light (TL) is not considered. [10] has used the approach proposed in [9] to reduce CO2 emissions. Since [10] pays attention to the reduction of CO2 by reducing stop-and-go driving, the case where the driver may cruise his/her vehicle to pass the TL is taken into account. [11] has investigated the impacts of Vehicle-to-Infrastructure (V2I) communication, namely TL-to-vehicle communication, on fuel and emission reductions. [12] has posed a speed advisory model that computes a fuel-optimal speed profile during deceleration and acceleration phases at SIs.

Game theory was introduced in the early years of the twentieth century in [13] and [14]. It has been applied to some transportation problems, such as shortest path and road congestion problems. [15] has proposed a shortest-path game with transferable utility, focusing on the allocation of profits generated by the coalitions of players.
[16] has presented a shortest-path game in which players own road segments in a network. Each player in the game receives a non-negative reward if he/she transports a good from the source to the destination. [17] has introduced a model in which players who share resources (i.e., routes) can form coalitions to selfishly compete against each other to maximize their values. [18] has discussed the similarities between cooperative congestion games and their non-cooperative counterparts to demonstrate important issues, such as the existence of and the convergence to a pure strategy Nash Equilibrium (NE).

Furthermore, game theory has been applied to the dynamic TL signal timing control problem. [19] has introduced a model for TL system control based on a Markov Chain game with the objective of minimizing the queue lengths at multiple SIs. [20] has proposed a two-player cooperation game for TL signal timing control applied to a two-phase SI. Research similar to [19] is presented in [21], where a non-cooperative game for the TL signal timing control problem is introduced and modeled as a finite controlled Markov Chain. However, the TL model in [21] is applied to a single SI. [22] has presented a game theory model based on Cournot's Oligopoly game. [23] has proposed a novel game theory optimization algorithm for TL signal timing control, in which Nash Bargaining (NB) is used to find the optimal strategy of the TL signal timing.

Noticeably, all the techniques proposed for minimizing the idling times and number of stops of vehicles approaching an SI have been limited to optimal speed computation in which the vehicles individually and independently compute and update their optimal speeds. In this research work, a Cooperative Speed Optimization Framework (CSOF) is proposed. The proposed CSOF is designed to function on AVs of the highest level of autonomy (i.e., the AV is completely autonomous and the driver does not take control of the AV at any point in time).
The cooperative notion of the CSOF is to allow the AVs to interact with each other and agree on implementing certain speed actions when approaching SIs. It relies on linear programming and game theory, consisting of three modules to cover individual vehicle speed optimization, conflict recognition, and speed optimization decision making. A time token allocation algorithm is proposed to be embedded in the TL such that AVs are able to cooperate with each other and with the TL. Thus, the average idling times and number of stops at SIs are minimized.

The rest of the paper is organized as follows: Section 2 states the problem of minimizing the idling times and number of stops of AVs approaching an SI and addresses the game theory formulation toward the proposed solution. Section 3 introduces the game theory cooperative framework, CSOF, proposed to minimize the average idling times and number of stops of AVs. Section 4 poses the concept of cooperative bargaining. Section 5 presents the simulation environment and reports results investigating the performance of the CSOF. Finally, concluding remarks and future work recommendations are discussed in Section 6.

Consider as a simple example the illustration of a two-lane, four-roadway SI (Fig. 1). For simplicity, assume that the TL control system has a two-phase static cycle where East and West roadways are one phase and North and South roadways are the other phase. Each phase has a signal design of Green-Yellow-Red; however, for simplicity, the yellow-light time is assumed to be part of the green-light-time duration. The key TL parameters are the green-light-time duration $T_g$, the red-light-time duration $T_r$, and the TL cycle duration $T_c = T_g + T_r$. These parameters are assumed to be constant, e.g., $T_g =$ […] sec, $T_r =$ […] sec, and $T_c =$ […] sec.
Assume that there is V2I communication such that the vehicles heading toward the TL can receive signal timing information and that every AV is conducting speed optimization re-planning to have a chance of meeting the green-light time.

Definition 1. Speed optimization re-planning is a game in which each AV performs speed optimization every time step $t$. There is a probability $p$ that the AV will proceed according to the previous strategy at time step $t-1$ and a probability $(1-p)$ that it will move to a different strategy (i.e., adopt a different speed).

To clarify the complexity of the problem, assume that for a certain cycle, the arrival and maximum departure rates of each roadway at the TL are $\lambda =$ […] veh/sec and $\mu =$ […] veh/sec, respectively [24]. Therefore, in this particular cycle, the number of AVs arriving from each roadway during the red time is $N_{arr} = \lambda T_r =$ […] veh, while the maximum number of AVs that can depart the TL from each roadway during the green time and whole cycle is $N_{dep} = \mu T_g =$ […] veh.

Figure 1: A simple example of a traffic light scenario.

Given this setting, assume that the TL has just turned green for the East-West directions. Consider the case of two AVs travelling on the West roadway performing speed optimization re-planning. Taking the queue size into account, according to the computations by these AVs, each of them can pass within the current green light. Since only one AV can pass through, the other will experience an unexpected delay, waiting for the next green light. Hence, the AVs negatively impact each other's objectives.
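The arithmetic behind this example can be sketched in a few lines of Python. The rates and durations used below are hypothetical placeholders, not the values of the paper, and `cycle_balance` is an illustrative name. The sketch evaluates $N_{arr} = \lambda T_r$ and $N_{dep} = \mu T_g$ and reports how many passing slots remain for approaching AVs once the queue built up during the red time has been served.

```python
def cycle_balance(arrival_rate, departure_rate, t_green, t_red):
    """Return (N_arr, N_dep, spare) for one roadway over one TL cycle.

    N_arr = lambda * T_r : vehicles queued during the red time
    N_dep = mu * T_g     : vehicles that can depart during the green time
    spare                : slots left for approaching AVs after the queue clears
    """
    n_arr = arrival_rate * t_red
    n_dep = departure_rate * t_green
    return n_arr, n_dep, n_dep - n_arr


# Hypothetical values, chosen only for illustration.
n_arr, n_dep, spare = cycle_balance(
    arrival_rate=0.25, departure_rate=0.5, t_green=20.0, t_red=20.0
)
print(f"N_arr={n_arr}, N_dep={n_dep}, spare slots={spare}")
```

Under these assumed numbers only `spare` approaching AVs can pass within the current green light. If more AVs than that independently plan to pass, at least one of them is delayed to the next green light, which is the situation of the two AVs on the West roadway.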

Consider a group of AVs travelling within a locality with $m$ TLs. Each AV, indexed $AV_i$, contemplates speed optimization to minimize its idling times at TLs from an initial location $p_i$ to a final destination $p_i^f$. The trip from $p_i$ to $p_i^f$ is made on a path, $P(p_i, p_i^f)$, constructed from a set of road segments ending with TLs, $L = \{L_1, L_2, \ldots, L_m\}$. The speed $v_i(t)$ of each $AV_i$ belongs to a set of feasible speeds, $V = \{\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_f\}$. The cost of the trip for $AV_i$ on a road segment $L_j$, where $j = 1, 2, \ldots, m$, denoted by $C^{L_j}_{sv}(i)$, explicitly models the idling times of AVs. For $AV_i$, the cost of travelling over road segment $L_j$ incurred by choosing a time-indexed sequence of velocities, $sv$, is defined as follows:

$$C^{L_j}_{sv}(i) = \begin{cases} t_i & \text{if stop} \\ 0 & \text{if no stop} \end{cases} \qquad (1)$$

where $t_i$ is the idling time of $AV_i$ at the TL positioned at the end of road segment $L_j$. The total cost for $AV_i$, incurred over a path $P(p_i, p_i^f)$ that is composed of the road segments $L^P_1, \ldots, L^P_{N_P} \in L$ (sequentially), is the summation of idling times at all TLs along the path, i.e.,

$$C^{P}_{sv}(i) = \sum_{j=1}^{N_P} C^{L^P_j}_{sv_j}(i) \qquad (2)$$

where $N_P$ is the number of TLs on the path $P$, and $sv_j$ denotes the sequence of velocities for road segment $L^P_j$; $sv$ is the concatenation of $sv_1, \ldots, sv_{N_P}$. To provide a sub-optimal solution to the above overall task, we follow a decentralized approach and consider each road segment $L_j \in L$ of the locality separately. We propose a "Time Token Allocation Algorithm" for the TL and a cooperative distributed conflict resolution scheme for the vehicles in each such $L_j$.

For player $AV_i$, the optimal speed value in the set of possible speeds may lead to a time token within the green light, allocated by the TL, $\tau_i$. The time token $\tau_i$ is the index of a time window assigned by the TL using the Time Token Allocation Algorithm.
This means that for player $AV_i$, the cost associated with the time token $\tau_i$ is the minimum (e.g., player $AV_i$ will pass through the TL without stopping, $C^{L_j}_{sv}(i) = 0$).

Consider the speed optimization game $G$. In this game, AVs with conflicting allocated time tokens agree to take certain speed actions to resolve the conflict. Thus, for each player $AV_i$, there is a finite non-empty set of speed actions $V$. There is an idling time cost, $C^{L_j}_{sv}(i)$, associated with each sequence of actions $sv$. Action sequences are associated with a preference relationship such that $C^{L_j}_{sv^*}(i) < C^{L_j}_{sv}(i)$ means $sv^* \succ sv$ (i.e., the sequence of actions $sv^*$ is preferred over $sv$ as it incurs less cost). As such, the speed optimization game is represented as $G = (n, V, C)$.

In the speed optimization re-planning game (Definition 1), the chosen actions of AVs are considered pure strategies, and therefore, there always exists a pure equilibrium where no AV wishes to unilaterally change its optimal solution. However, this is true only for non-strictly competitive games. As the game progresses, AVs compete to gain resources such that one AV's gain is another AV's loss.

A mixed strategy game, which always has a mixed equilibrium, is a game in which the strategies available to the players are not deterministic but are regulated by probabilistic rules [25]. Thus, from Definition 1, it is concluded that there is a probability distribution over all the strategies available to every AV in the game. Hence, the speed optimization re-planning game, as described in Definition 1, is a mixed strategy game for which a mixed equilibrium always exists.

Figure 2: Schematic depiction of the cooperative speed optimization framework.
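As an illustration of the idling-time cost in (1) and (2) and of the preference relation between action sequences, the following Python sketch may help. The function names and the sample idling times are assumptions made for this example only; a stop is represented by a positive idling time in seconds and a pass without stopping by zero.

```python
def segment_cost(idle_time):
    """Cost on one road segment, as in (1): the idling time t_i if the AV
    stops at the TL ending the segment, and 0 if it passes without stopping."""
    return idle_time if idle_time > 0 else 0.0


def path_cost(idle_times):
    """Cost over a path, as in (2): sum of the segment costs at all TLs."""
    return sum(segment_cost(t) for t in idle_times)


def preferred(cost_by_sequence):
    """Return the action sequence with the least cost (sv* is preferred
    over sv when its cost is strictly lower)."""
    return min(cost_by_sequence, key=cost_by_sequence.get)


# Hypothetical idling times (seconds) at four TLs for two candidate sequences.
costs = {
    "sv_a": path_cost([0.0, 12.5, 0.0, 4.0]),
    "sv_b": path_cost([0.0, 0.0, 0.0, 4.0]),
}
print(costs, "->", preferred(costs))
```

Here `sv_b` is preferred because it incurs the lower total idling time along the path.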

A schematic of the Cooperative Speed Optimization Framework (CSOF) we propose to solve the speed optimization re-planning game is provided in Fig. 2. This framework consists of three modules to cover issues of (i) AV rational speed optimization, (ii) information and conflict recognition, and (iii) cooperative speed optimization decision making. However, before proceeding to the core of the proposed scheme, we address a few issues with respect to the safety constraints of consecutive AVs on the roadway.

The CSOF is designed to function on multiple-lane roadways only. A car-following model is defined with two essential rules. First, AVs using the CSOF in free motion can smoothly overtake each other on the roadway to comply with certain speed actions resulting from their interaction and cooperation. Second, under certain traffic conditions, such as when overtaking is not possible, a safe following distance between consecutive AVs is maintained. Consecutive AVs are modeled to maintain a minimum time gap of two seconds in order to avoid collision [26]. All the AVs are identical in length, 5 meters each. The reaction time of AVs to the sudden deceleration of the traffic ahead is assumed to be 1.[…] seconds [27].

The objective of this module is to provide each player (vehicle), $AV_i$, with the optimal speed at every time step $t$. Based on the Time to Intersection $TTI_i$ and using a Time Token Allocation Algorithm, the TL may allocate $\tau_i$ to $AV_i$, where $\tau_i$ is an integer value indicating the index of a time window during which $AV_i$ can pass the intersection smoothly. The speed $v_i(t)$ of $AV_i$ at time step $t$ is a function of the traffic density $D(L_j)$ on road segment $L_j$. [28] justified the linearity of the relation between traffic density and speed under mild generic assumptions, concluding that as traffic concentration/density increases, speed decreases. As such, some definitions are stated.

• The maximum speed $AV_i$ can travel at, $v_{max}$, will only occur when there are no other vehicles on the roadway.
• In general, the speed of $AV_i$ goes to zero as the road reaches the maximum density: $v_i(t)$ converges to 0 as $D(L_j)$ converges to $D_{max}(L_j)$.

Therefore, considering a linear relation between the traffic density and speed, the speed $v_i(t)$ of $AV_i$ with respect to the traffic density $D(L_j)$ on road segment $L_j$ is

$$v_i(t) = v_{max}\left(1 - \frac{D(L_j)}{D_{max}(L_j)}\right) \qquad (3)$$

$AV_i$ is allocated a token $\tau_i$ only if $TTI_i$ falls within the upcoming green-light time, i.e., $TTI_i \le R_g$ or $R_r < TTI_i \le R_r + T_g$, where $R_g$ and $R_r$ are the remaining green-light and red-light times respectively. For $AV_i$ approaching a TL, the speed that minimizes $AV_i$'s idling time cost is found as follows:

As $AV_i$ receives upcoming signal information from the TL, indicating that the current light is green, there are three possible cases in terms of $TTI_i$ and $R_g$:

• Case 1: $TTI_i \le R_g$. In this case, using the current speed, $AV_i$ will be able to pass through within the remaining green-light time. The TL allocates a time token $\tau_i$ to $AV_i$. Thereby, $AV_i$ maintains its speed to pass during the assigned time token.

$$\begin{aligned} s_i = {} & v_i(t) \\ \text{subject to: } & v_i(t) \ge d_i(t)/a_i \\ & v_i(t) \le d_i(t)/b_i \\ & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

where $TTI_i = d_i(t)/v_i(t)$, $v_i(t)$ is the speed of $AV_i$ at time step $t$, $d_i(t)$ is the distance of $AV_i$ to the stop line of the TL at time step $t$, $s_i$ is the optimal speed of $AV_i$ at time step $t+1$, and $v_{max}$ and $v_{min}$ are the maximum and minimum speed limits on the road segment respectively, while $a_i$ and $b_i$ represent the lower and upper boundaries of the allocated time token respectively.

$$a_i = \frac{\tau_i - 1}{\mu} \qquad (4)$$

$$b_i = \frac{\tau_i}{\mu} \qquad (5)$$

where $\mu$ is the departure rate in veh/sec.

• Case 2: $R_g + T_r \ge TTI_i > R_g$. In this case, the vehicle is not allocated a time token, and the speed of the vehicle is optimized over the distance to the TL so that $TTI_i$ is sufficient to meet the next green light.

$$\begin{aligned} s_i = {} & \min v_i(t) \\ \text{subject to: } & v_i(t) \ge d_i(t)/(R_g + T_r + T_q) \\ & v_i(t) \le d_i(t)/(R_g + T_r + T_g) \\ & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

where $T_q$ is the time needed to clear all the vehicles in the queue, and it is found as follows:

$$T_q = \frac{n(t)}{\mu} \qquad (6)$$

where $n(t)$ denotes the number of vehicles currently in the queue. In addition, if the current speed does not allow $AV_i$ to be part of the green-light time but the maximum speed of the roadway does, the speed optimization system will accelerate $AV_i$ such that it is allocated a token.

$$\begin{aligned} s_i = {} & \max v_i(t) \\ \text{subject to: } & v_i(t) \ge d_i(t)/a_i \\ & v_i(t) \le d_i(t)/b_i \\ & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

• Case 3: $R_g + T_r + T_g \ge TTI_i > R_g + T_r$. In this case, $AV_i$ will maintain its current speed as $TTI_i$ leads $AV_i$ to be part of the green-light time of the next cycle; however, $AV_i$ will not yet be allocated a time token.

$$\begin{aligned} s_i = {} & v_i(t) \\ \text{subject to: } & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

3.2.2 Light is Red

If the information received by the vehicle from the TL indicates that the current light is red, there are three possible cases in terms of $TTI_i$ and $R_r$:

• Case 1: $TTI_i < R_r$. In this case, $AV_i$ will not be allocated a time token, and its speed is optimized such that it will meet the next green light.

$$\begin{aligned} s_i = {} & \min v_i(t) \\ \text{subject to: } & v_i(t) \ge d_i(t)/(R_r + T_q) \\ & v_i(t) \le d_i(t)/(R_r + T_g) \\ & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

• Case 2: $R_r < TTI_i \le R_r + T_g$. In this case, $AV_i$ is allocated a time token within the upcoming green-light time.

$$\begin{aligned} s_i = {} & v_i(t) \\ \text{subject to: } & v_i(t) \ge d_i(t)/(R_r + a_i) \\ & v_i(t) \le d_i(t)/(R_r + b_i) \\ & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

• Case 3: $TTI_i > R_r + T_g$. In this case, $AV_i$ is not allocated a time token and its speed is optimized to meet the green-light time of the next cycle.

$$\begin{aligned} s_i = {} & \min v_i(t) \\ \text{subject to: } & v_i(t) \ge d_i(t)/(R_r + T_g + T_r + T_q) \\ & v_i(t) \le d_i(t)/(R_r + T_r + T_g) \\ & v_i(t) \ge v_{min} \\ & v_i(t) \le v_{max} \end{aligned}$$

All the AVs involved in the cooperative process are assumed to be Electric Autonomous Vehicles (EAVs); therefore, the energy consumption model presented in [29] has been modified to compute the instant energy consumed by every AV at time step $t$. The total energy cost consumed by $AV_i$ at time step $t$ consists of multiple sub-costs as follows:

• Potential Consumed/Gained Energy: the potential energy $EC^{P}_i(t)$ at time step $t$ is consumed from the battery during the uphill travel and is gained into the battery during the downhill travel. The potential consumed and gained energies are

$$EC^{PC}_i(t) = \eta\,[m\,g\,u(t)] \qquad (7)$$

$$EC^{PG}_i(t) = -\eta\,[m\,g\,u(t)] \qquad (8)$$

where $\eta$ is the efficiency of $AV_i$, $m$ is the mass of $AV_i$, $g$ is the gravity factor, and $u(t)$ is the elevation of the road segment at time step $t$.

• Loss of Energy: the loss of energy at time step $t$, which is always consumed from the battery, occurs due to aerodynamic and rolling resistances.

$$EC^{loss}_i(t) = \eta\,[f_r\,m\,g\,v_i(t) + \rho\,A\,d_r\,v_i(t)] \qquad (9)$$

where $f_r$ is the friction coefficient, $\rho$ is the air density coefficient, $A$ is the cross-sectional area of $AV_i$, and $d_r$ is the air drag coefficient.

• Acceleration/Deceleration Energy: the acceleration energy $EC^{ac}_i(t)$ at time step $t$ is consumed from the battery as $AV_i$ accelerates to a higher speed, while the deceleration energy $EC^{dc}_i(t)$ at time step $t$ is recuperated and stored into the battery as $AV_i$ comes to a lower speed.

$$EC^{ac}_i(t) = \eta\,P^{wr}_i\,\frac{d_i(t)}{v_i(t) - v_i(t-1)} \qquad (10)$$

$$EC^{dc}_i(t) = -\eta\,P^{wr}_i\,\frac{d_i(t)}{v_i(t) - v_i(t-1)} \qquad (11)$$

where $P^{wr}_i$ is the power of the electric motor of $AV_i$.
• Energy Consumed by On-Board Electric Devices: this energy is not path related and is consumed directly from the battery at time step $t$ by the on-board electric devices such as the air conditioner, windshield wipers, etc.

$$EC^{ed}_i(t) = \sum_{j=1}^{n} P^{ed}_i(j)\,t^{ed}_i(j) \qquad (12)$$

where $P^{ed}_i(j)$ is the power withdrawn at time step $t$ by the electric device $j$ and $t^{ed}_i(j)$ is the time that device $j$ is in use.

Therefore, the total energy cost consumed by $AV_i$ at time step $t$ is computed as

$$EC^{T}_i(t) = [EC^{PC}_i(t) + EC^{PG}_i(t) + EC^{loss}_i(t) + EC^{ac}_i(t) + EC^{dc}_i(t)] \qquad (13)$$

3.3 Information and Conflict Recognition Module

In this module, if two or more players have been allocated the same time token, the TL informs them that they have a conflict. Players with conflicting tokens communicate with each other to share their strategies and associated costs. Consequently, they start to negotiate to find a binding agreement based on which they can cooperate and agree on certain speed actions. Once an agreement is reached, all the players abide by the rules to apply those actions.
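The conflict check performed by the TL can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in the paper: the light is assumed to be green with no queue, every $TTI_i$ is assumed to fall within the remaining green-light time, and the token index is taken as the window $[(\tau - 1)/\mu,\ \tau/\mu]$ from (4) and (5) that contains $TTI_i$. The names `token_for_tti` and `find_conflicts` and the sample values are hypothetical.

```python
import math
from collections import defaultdict


def token_for_tti(tti, departure_rate):
    """Index tau (>= 1) of the window [(tau-1)/mu, tau/mu] containing tti."""
    return max(1, math.ceil(tti * departure_rate))


def find_conflicts(tti_by_av, departure_rate):
    """Group AVs by token and return only the tokens held by two or more AVs."""
    holders = defaultdict(list)
    for av_id, tti in tti_by_av.items():
        holders[token_for_tti(tti, departure_rate)].append(av_id)
    return {tau: avs for tau, avs in holders.items() if len(avs) > 1}


# Hypothetical TTIs in seconds; mu = 0.5 veh/sec gives 2-second windows.
conflicts = find_conflicts({"AV1": 3.0, "AV2": 3.5, "AV3": 7.0}, departure_rate=0.5)
print(conflicts)
```

In this example `AV1` and `AV2` both map to token 2 and would be notified of a conflict, while `AV3` holds token 4 alone.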

The cooperative game notion in this module is based on the assumption that players can reach a binding agreement with which they commit to apply certain strategic actions. As players are assumed to be rational, if the idling time cost to a player is greater than what it would have been without cooperation, then the rationality assumption is violated. Therefore, the following rationality axiom is stated.

Axiom 1. At time step $t$, there exists an optimal strategy $v_k$ for player $AV_i$ such that $C^{L_j}_{v_k}(i) \le C^{L_j}_{v_j}(i)$, $\forall k, j \in V$ (i.e., the cost associated with this optimal strategy is less than or equal to that associated with any other strategy player $AV_i$ can take). However, player $AV_i$ is free to choose any other strategy that might yield a higher cost, but only in exchange for a reward.

The TL allocates time tokens to the players using Algorithm 1. In Algorithm 1, $VIN$ is the AV identification number, $N_q$ is the number of vehicles currently in the queue, $T_{sd}$ is the slot duration (i.e., time token duration), and $T_{slot}$ is the time token location in the TL memory. According to this algorithm, the TL gives priority in allocating tokens to the queued AVs. The rest of the green-light time is segmented as tokens and offered to the approaching AVs. Players with conflicting tokens will list the costs caused by the conflict rather than the expected ones.

Algorithm 1: Time Token Allocation Algorithm for a Single Roadway

    Input: VIN_i, TTI_i; Output: τ_i
    if Light is Green then
        if (TTI_i ≤ R_g) then
            for j = N_q + 1 to N_dep do
                a = T_sd * (j − 1); b = T_sd * j;
                if (TTI_i ≥ a && TTI_i ≤ b) then
                    T_slot([…], j) = VIN_i; τ_i = j; break;
                end if
            end for
        else
            τ_i = […];
        end if
    else
        if (TTI_i < R_r) then
            τ_i = […];
        else if (TTI_i > R_r + (T_sd * N_q) && TTI_i ≤ R_r + T_g) then
            for j = N_q + 1 to N_dep do
                a = R_r + T_sd * (j − 1); b = R_r + T_sd * j;
                if (TTI_i ≥ a && TTI_i ≤ b) then
                    T_slot([…], j) = VIN_i; τ_i = j; break;
                end if
            end for
        else
            τ_i = […];
        end if
    end if

It is assumed that each AV has a mode property, which may take one of three values at a time: Rush Mode, Normal Mode, or Relaxed Mode.

• Rush Mode: this mode is used by the AV for urgent and emergency situations (e.g., must be in the hospital shortly).
• Normal Mode: this mode is used by the AV when there is no emergency; the AV may yield the road to others.
• Relaxed Mode: this mode is used by the AV when there is plenty of time. The AV would yield the road to other vehicles comfortably.

Each mode is represented by a scalar value. For instance, the mode values may be 0, 1, and 2 for modes Relaxed, Normal, and Rush respectively. Players will first play the game on their mode values, with the winner determined by

$$AV_{winner} = \max(M(AV_1), M(AV_2)) \qquad (14)$$

If both players are using the same mode, they will decide the winner based on the credit points that they have. The one with the most will eventually win the game. Again, a credit point is deducted from the winner, and a credit point is granted to the loser. In this case, the winner is determined using the following formula.

$$AV_{winner} = \max(CP(AV_1), CP(AV_2)) \qquad (15)$$

If it happens that both players have the same mode value and number of credit points, a random number-generation procedure between the TL and players is conducted to resolve the issue. Basically, each of the players as well as the TL will generate a random number. The one whose generated number is closer to that of the TL will win the current time token but lose a credit point. The other will gain a credit point and request a different token. The winner of the game is determined using the following formula.

$$RN_1 = |RN(TL) - RN(AV_1)|, \qquad RN_2 = |RN(TL) - RN(AV_2)|$$

$$AV_{winner} = \begin{cases} AV_1 & \text{if } RN_1 < RN_2 \\ AV_2 & \text{otherwise} \end{cases} \qquad (16)$$

To further clarify the cooperative speed optimization game, an example of two AVs approaching a TL is presented next.

Example 1. Recall the problem statement example (Fig. 1) and assume that the TL has just turned green for the East-West directions. After communicating with the TL, the two AVs, approaching the TL from the West, have been allocated, at time step $t$, the same and only-remaining time token. Both vehicles will have only two strategies to choose from. For instance, player $AV_1$'s available strategies are $V_{AV_1} = \{v^{AV_1}_1(t), v^{AV_1}_2(t)\}$, where using strategy $v^{AV_1}_1(t)$ corresponds to using the current time token and using strategy $v^{AV_1}_2(t)$ corresponds to minimizing its own speed and requesting a token within the next green light. Table 1 has been constructed to clarify the game setting.

Table 1: Action and response table of a two-player game

| Players/Strategies | $v^{AV_2}_1(t)$ | $v^{AV_2}_2(t)$ |
| $v^{AV_1}_1(t)$ | $(C^{L_j}_{v_1(t)}(1), C^{L_j}_{v_1(t)}(2))$ | $(C^{L_j}_{v_1(t)}(1), C^{L_j}_{v_2(t)}(2))$ |
| $v^{AV_1}_2(t)$ | $(C^{L_j}_{v_2(t)}(1), C^{L_j}_{v_1(t)}(2))$ | $(C^{L_j}_{v_2(t)}(1), C^{L_j}_{v_2(t)}(2))$ |

When both players choose the same strategy, resulting in the strategy profile $(C^{L_j}_{v_1(t)}(1), C^{L_j}_{v_1(t)}(2))$ or $(C^{L_j}_{v_2(t)}(1), C^{L_j}_{v_2(t)}(2))$, the cost is high for both of them. When either of the two players chooses the optimal strategy while the other chooses the second preferred strategy, resulting in the strategy profile $(C^{L_j}_{v_1(t)}(1), C^{L_j}_{v_2(t)}(2))$ or $(C^{L_j}_{v_2(t)}(1), C^{L_j}_{v_1(t)}(2))$, the game is stable. In this case, the strategy profile is an NE and Pareto Optimal (PO). Hence, the binding agreement between the players would enforce the idea that they should choose different strategies. A numerical representation of such a game may be shown as in Table 2.

Table 2: Action and response table of a two-player game

| Players/Strategies | $v^{AV_2}_1(t)$ | $v^{AV_2}_2(t)$ |
| $v^{AV_1}_1(t)$ | ([…], […]) | ([…], […]) |
| $v^{AV_1}_2(t)$ | ([…], […]) | ([…], […]) |

When more than two vehicles are allocated the same token, a multi-phase cooperative procedure is implemented to resolve the conflict. The multi-phase game is composed of multiple two-player sub-games. In each sub-game, only two players cooperate to find their acceptable joint strategies. Then, the winners of the two-player sub-games will play another sub-game and so on until the winner of the only available time token is determined.
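The winner rules in (14) to (16) and the multi-phase procedure can be sketched in Python as follows. This is an illustration under assumptions rather than the authors' implementation: the `Player` class, the pairing of neighbours in list order, the bye given to an unpaired player, and the transfer of one credit point after a mode-based win are choices made for this example. The mode values 0, 1, and 2 follow the example values given in the text.

```python
import random
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    mode: int     # 0 = Relaxed, 1 = Normal, 2 = Rush (example values from the text)
    credits: int  # credit points currently held


def play(a, b, rng):
    """One two-player sub-game; returns the winner of the contested token."""
    if a.mode != b.mode:                      # rule (14): higher mode wins
        winner, loser = (a, b) if a.mode > b.mode else (b, a)
    elif a.credits != b.credits:              # rule (15): more credit points wins
        winner, loser = (a, b) if a.credits > b.credits else (b, a)
    else:                                     # rule (16): closest to the TL's number
        tl, ra, rb = rng.random(), rng.random(), rng.random()
        winner, loser = (a, b) if abs(tl - ra) < abs(tl - rb) else (b, a)
    winner.credits -= 1                       # winner pays one credit point
    loser.credits += 1                        # loser is granted one credit point
    return winner


def resolve(players, rng=None):
    """Multi-phase game: winners of sub-games play again until one remains."""
    rng = rng or random.Random()
    contenders = list(players)
    while len(contenders) > 1:
        nxt = [play(contenders[k], contenders[k + 1], rng)
               for k in range(0, len(contenders) - 1, 2)]
        if len(contenders) % 2:               # assumption: unpaired player gets a bye
            nxt.append(contenders[-1])
        contenders = nxt
    return contenders[0]


avs = [Player("AV1", 1, 3), Player("AV2", 2, 1), Player("AV3", 2, 4)]
print(resolve(avs).name, [(p.name, p.credits) for p in avs])
```

In this run `AV2` beats `AV1` on mode, then loses to `AV3` on credit points, so `AV3` takes the token and the credit balances shift by one point per sub-game.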

A cooperative game is a tuple of two elements $(N, f)$, where $N = \{1, 2, \ldots, n\}$ is a finite set of AVs willing to trade credit points, and $f$ is a function that maps subsets of $N$ to numbers. If $S$ is a subset of $N$, such that $S \subseteq N$, then $f(S)$ is the total value induced when the members of $S$ come together to trade credit points. For further clarification, an example is presented next.

Example 2. Assume that there are three AVs, $N = \{AV_1, AV_2, AV_3\}$, heading toward a TL, where $AV_1$ is a seller of a credit point, while $AV_2$ and $AV_3$ are two buyers. Consider the case that $AV_1$ has only one credit point to sell at \$3 and each of the buyers contemplates buying at most one credit point. $AV_2$ is willing to pay \$5, while $AV_3$ is willing to pay \$8. The characteristic function, $f$, of this game is defined as follows:

$$\begin{aligned} f(AV_1) &= f(AV_2) = f(AV_3) = 0 \\ f(AV_1, AV_2) &= 5 - 3 = 2 \\ f(AV_1, AV_3) &= 8 - 3 = 5 \\ f(AV_2, AV_3) &= 0 \\ f(AV_1, AV_2, AV_3) &= 8 - 3 = 5 \end{aligned}$$

As introduced by [30], the marginal contribution concept provides the analytical reasoning of bargaining. Let $N \setminus AV_i$ be the subset of $N$ that contains all the AVs except $AV_i$. The marginal contribution of $AV_i$ is $f(N) - f(N \setminus AV_i)$, denoted by $MC_{AV_i}$. For example, the marginal contributions of the previously defined game are

$$\begin{aligned} MC_{AV_1} &= f(N) - f(N \setminus AV_1) = 5 - 0 = 5 \\ MC_{AV_2} &= f(N) - f(N \setminus AV_2) = 5 - 5 = 0 \\ MC_{AV_3} &= f(N) - f(N \setminus AV_3) = 5 - 2 = 3 \end{aligned}$$

Definition 2. An allocation, $(x_{av_1}, x_{av_2}, \ldots, x_{av_n})$, which is a collection of numbers representing the division of the overall value, where $x_{av_i}$ indicates the value received by $AV_i$, is individually rational if $x_{av_i} \ge f(AV_i)$, $\forall i \in \{1, 2, \ldots, n\}$.

Definition 3. An allocation, $(x_{av_1}, x_{av_2}, \ldots, x_{av_n})$, is efficient if $\sum_{i=1}^{n} x_{av_i} = f(N)$.

Definition 4. An individually rational and efficient allocation, $(x_{av_1}, x_{av_2}, \ldots, x_{av_n})$, satisfies the Marginal Contribution Principle if $x_{av_i} \le MC_{AV_i}$, $\forall i \in \{1, 2, \ldots, n\}$.

The core is the solution concept of coalitional games, containing all the possible allocations (i.e., divisions of the overall value) [33]. Let $x(S)$ be the sum of the values received by the AVs in the subset $S$, such that

$$x(S) = \sum_{i \in S} x_{av_i} \qquad (17)$$

In addition, let the marginal contribution of a subset $S$ of $N$ be $MC_S = f(N) - f(N \setminus S)$. According to [30], the core has two main conditions.

Theorem 1. An allocation, $(x_{av_1}, x_{av_2}, \ldots, x_{av_n})$, is part of the core if it is efficient and for every subset $S$ of $N$, $x(S) \ge f(S)$ is satisfied.

Proof 1. An allocation that belongs to the core of the game is individually rational. Let $S = \{AV_i\}$ for $i = 1, 2, \ldots, n$. Noticeably, $x(\{AV_i\})$ and $x_{av_i}$ both represent the value received by $AV_i$. Therefore, the condition $x(S) \ge f(S)$ is in fact the individual rationality condition $x_{av_i} \ge f(AV_i)$. $\square$

Theorem 2. An allocation, $(x_{av_1}, x_{av_2}, \ldots, x_{av_n})$, is part of the core if it is efficient and for every subset $S$ of $N$, $x(S) \le MC_S$ is satisfied.

Proof 2. An allocation that belongs to the core of the game satisfies $x(S) \le MC_S$. Applying the condition $x(S) \ge f(S)$ to the subset $N \setminus S$ gives

$$x(N \setminus S) \ge f(N \setminus S) \qquad (18)$$

$$x(N \setminus S) = x(N) - x(S) \qquad (19)$$

By efficiency, we have

$$x(N) = f(N). \qquad (20)$$

Substituting (19) into (18),

$$x(N) - x(S) \ge f(N \setminus S) \qquad (21)$$

Substituting (20) into (21),

$$x(S) \le f(N) - f(N \setminus S) = MC_S \qquad (22)$$

$\square$

Therefore, the core of the cooperative credit point bargaining game is defined as follows:

$$\left\{ (x_{av_1}, x_{av_2}, \ldots, x_{av_n}) : \sum_{i \in N} x_{av_i} = f(N), \text{ and } x(S) \ge f(S),\ \forall S \subseteq N \right\} \qquad (23)$$

To find the core elements, we propose that the problem is formulated as a Constraint Satisfaction Problem (CSP). The most common CSP solving techniques are Backtracking Search and Local Search [35]. For instance, the feasible allocations in Example 2 are the points $(x_{av_1}, x_{av_2}, x_{av_3})$ such that

$$\begin{aligned} & x_{av_1} + x_{av_2} + x_{av_3} = 5 \\ \text{subject to: } & x_{av_1} + x_{av_2} \ge 2 \\ & x_{av_1} + x_{av_3} \ge 5 \\ & x_{av_2} + x_{av_3} \ge 0 \\ & x_{av_1} \ge 0,\ x_{av_2} \ge 0,\ x_{av_3} \ge 0 \end{aligned}$$

The domains of $x_{av_1}$, $x_{av_2}$, and $x_{av_3}$ are

dom($x_{av_1}$) = {any value between […] and […]}
dom($x_{av_2}$) = {any value between […] and […]}
dom($x_{av_3}$) = {any value between […] and […]}

By solving this problem as a CSP, the core of the game is

$$\text{Core} = \left\{ (x_{av_1}, x_{av_2}, x_{av_3}) : \sum_{i=1}^{3} x_{av_i} = f(N), \text{ and } x(S) \ge f(S),\ \forall S \subseteq N \right\}$$

$$\text{Core} = \{ (\$2, \$0, \$3), (\$3, \$0, \$2), (\$4, \$0, \$1), (\$5, \$0, \$0) \}.$$

Simulation was conducted to test and validate the performance of the CSOF. The simulation was performed in MATLAB using Object-Oriented Programming (OOP). A two-lane roadway sub-network containing three SIs in Waterloo, ON, Canada, was chosen for the simulation (Fig. 3). The SIs are as follows: SI1, Westmount Road North with Columbia Street West; SI2, Westmount Road North with Bearinger Road; and SI3, Northfield Drive West with Weber Street North. Every SI has a static TL system: TL1, TL2, and TL3 for SI1, SI2, and SI3, respectively. Each TL control system has a two-phase cycle in which the East-West roadways are one phase and the South-North roadways are the other. Each phase has a Green-Yellow-Red signal design; however, for simplicity, the yellow-light time is assumed to be part of the green-light duration. To enhance safety, one second of red-light time is given to all the roadways between every two consecutive phases.

Figure 3: A sub-network with three signalized intersections.

To average out random effects and capture realistic traffic behaviour, the simulation was run for more than three hours. The maximum and minimum speed limits on any roadway in the network are denoted $v_{max}$ and $v_{min}$ (in km/hour), respectively, and the highest volume of traffic a road segment may have is assumed to be 85 percent of the maximum density, $D_{max}(L_j)$. AVs are generated randomly into the network according to a Poisson distribution. The generated AVs travel at an average speed of 50 km/hour until they come within the activation distance (i.e., the distance at which the vehicles enter the V2I communication range and start cooperating). The activation distance was fixed at 500 meters.
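The Poisson vehicle generation described above can be sketched as follows. This is a minimal Python illustration, not the MATLAB code used in the simulation. It assumes a homogeneous Poisson process, generated by summing exponential inter-arrival times. The function name `poisson_arrival_times`, the seed, and the example rate of 900 veh/hour over a three-hour horizon are assumptions chosen for illustration.

```python
import random


def poisson_arrival_times(rate_veh_per_hour, horizon_s, seed=None):
    """Return increasing arrival times (in seconds) on [0, horizon_s).

    A Poisson process with rate lambda has independent exponential
    inter-arrival times with mean 1 / lambda.
    """
    rng = random.Random(seed)
    rate_per_s = rate_veh_per_hour / 3600.0  # veh/hour -> veh/second
    arrivals = []
    t = rng.expovariate(rate_per_s)
    while t < horizon_s:
        arrivals.append(t)
        t += rng.expovariate(rate_per_s)
    return arrivals


# Assumed example: 900 veh/hour for three hours of simulated time.
arrivals = poisson_arrival_times(900, 3 * 3600, seed=1)
print(len(arrivals), "vehicles generated")
```

The expected number of generated vehicles is the rate multiplied by the horizon (here 2700), and individual runs fluctuate around that value, which is one reason a long simulation horizon helps stabilize the averages.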
The performance of the CSOF is compared to a Non-Cooperative Speed Optimization (NCSO) algorithm (i.e., the vehicles individually compute their optimal speeds). Once they are within range, AVs start conducting speed optimization based on the technique in use (i.e., CSOF or NCSO) to have a chance of meeting the green light when they arrive at the TL.

Figs. 4, 5, and 6 report the total average idling time, total average number of stops, and total average energy consumption at SI1, SI2, and SI3, comparing the CSOF to the NCSO algorithm. The CSOF outperformed the NCSO by achieving lower average idling times and average number of stops. This is because the conflicting passing times of vehicles through the intersections are resolved by the CSOF. All the AVs using the CSOF that are meant to arrive during the green-light time are allocated time tokens before reaching the intersections. As such, when they arrive, they are able to pass through smoothly during their allocated times. In addition, due to the road and signal timing constraints, the AVs that could only arrive at the intersections during the red-light times were not allocated time tokens before reaching the intersections. These AVs joined the queues with shorter waiting times. As mentioned previously, the time needed to clear the queue is excluded from that available as time tokens to the approaching AVs. The result figures show that, for both techniques, as the number of AVs approaching the intersections increases, the average idling times and number of stops become greater, and the improvement achieved by the CSOF in minimizing the average idling times becomes more significant. The reductions in average idling times achieved by the CSOF relative to the NCSO for SI1, SI2, and SI3 are summarized in Tables 3, 4, and 5, respectively.

Figure 4: Total average idling time at SI1, SI2, and SI3.

Figure 5: Total average number of stops at SI1, SI2, and SI3.

In addition, the CSOF has achieved lower average energy consumption. In the case of the CSOF, as soon as a vehicle is allocated a time token, it maintains its speed so that it passes the intersection smoothly during its allocated token. Hence, in general there are fewer speed variations among the AVs using the CSOF. To illustrate this, Fig. 7 captures the speed trajectories of six AVs approaching SI1 from the Westmount Road North direction, from the moment they entered the activation distance of the CSOF until they passed the stop line of the intersection. It is clear that, on average, the reductions in speed variations of AVs using the CSOF are not significant compared to those using the NCSO algorithm. As a result, the energy savings of AVs using the CSOF have not been significant, as reported in Fig. 6.

Figure 6: Total average energy consumption of vehicles approaching SI1, SI2, and SI3.

Figure 7: Speed trajectories of six vehicles approaching intersection 1: (a) vehicles using the CSOF, (b) vehicles using the NCSO.

Furthermore, the total average idling time and number of stops achieved by the CSOF were investigated with respect to the V2I activation distance. SI2 was chosen to conduct the investigation. It is assumed that V2I radio communication is available up to 800 meters from the intersection, so the activation distance was varied from 300 meters to 800 meters in steps of 100 meters. Fig. 8 depicts the total average idling time and average number of stops achieved by the CSOF. The optimal activation point is found near 500 meters. At shorter distances, some AVs are forced to arrive during the red-light time because the time available to the AVs to be allocated tokens (or reallocated tokens after playing a game) and to adjust their speeds accordingly is not enough to yield low average idling time and number of stops. At longer activation distances, the average idling time and number of stops increase slightly but remain near the optimal level.

Figure 8: Activation distance analysis for intersection 2: (a) total average idling time, (b) total average number of stops.

Table 3: Reduction in average idling time at signalized intersection 1.

| AVs (veh/hour) | 300 | 600 | 900 | 1200 | 1500 | 1800 |
|---|---|---|---|---|---|---|
| NCSO (sec) | 0.829 | 3.831 | 4.301 | 5.55 | 5.967 | 6.262 |
| CSOF (sec) | 0.05 | 0.441 | 1.227 | 1.688 | 2.521 | 2.933 |
| Reduction (%) | 94 | 88 | 71 | 70 | 58 | 53 |

Table 4: Reduction in average idling time at signalized intersection 2.

| AVs (veh/hour) | 300 | 600 | 900 | 1200 | 1500 | 1800 |
|---|---|---|---|---|---|---|
| NCSO (sec) | 0.388 | 0.888 | 2.299 | 2.754 | 3.614 | 4.632 |
| CSOF (sec) | 0.035 | 0.176 | 0.347 | 0.603 | 0.725 | 1.08 |
| Reduction (%) | 91 | 80 | 85 | 78 | 80 | 77 |

Table 5: Reduction in average idling time at signalized intersection 3.

| AVs (veh/hour) | 300 | 600 | 900 | 1200 | 1500 | 1800 |
|---|---|---|---|---|---|---|
| NCSO (sec) | 0.711 | 1.826 | 2.694 | 4.649 | 6.208 | 6.412 |
| CSOF (sec) | 0.044 | 0.159 | 0.483 | 1.084 | 1.727 | 2.878 |
| Reduction (%) | 94 | 91 | 82 | 77 | 72 | 55 |
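The reduction rows in Tables 3, 4, and 5 can be checked with a short calculation. The Python sketch below assumes the reduction is defined as (NCSO - CSOF) / NCSO, expressed as a percentage and rounded to the nearest integer. The text does not state this formula explicitly, so it is an assumption; the names `IDLING` and `reduction_percent` are introduced here for illustration. The idling times are copied from the tables.

```python
# Average idling times (seconds) from Tables 3-5, for
# 300, 600, 900, 1200, 1500, and 1800 veh/hour.
IDLING = {
    "SI1": {"NCSO": [0.829, 3.831, 4.301, 5.55, 5.967, 6.262],
            "CSOF": [0.05, 0.441, 1.227, 1.688, 2.521, 2.933]},
    "SI2": {"NCSO": [0.388, 0.888, 2.299, 2.754, 3.614, 4.632],
            "CSOF": [0.035, 0.176, 0.347, 0.603, 0.725, 1.08]},
    "SI3": {"NCSO": [0.711, 1.826, 2.694, 4.649, 6.208, 6.412],
            "CSOF": [0.044, 0.159, 0.483, 1.084, 1.727, 2.878]},
}


def reduction_percent(ncso, csof):
    """Assumed definition: relative reduction w.r.t. NCSO, in percent."""
    return round(100.0 * (ncso - csof) / ncso)


reductions = {
    si: [reduction_percent(n, c) for n, c in zip(rows["NCSO"], rows["CSOF"])]
    for si, rows in IDLING.items()
}

for si, values in reductions.items():
    print(si, values)
```

Under this assumed definition, the computed values agree with the reported reduction rows for all three intersections.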

This research has addressed the problem of minimizing the total average idling times and number of stops for Electric Autonomous Vehicles (EAVs) approaching Signalized Intersections (SIs). A Cooperative Speed Optimization Framework (CSOF) was proposed. The proposed framework consists of three modules that address individual speed optimization, conflict recognition, and cooperative speed decision making. Simulation was conducted to investigate the performance of the CSOF in comparison with a Non-Cooperative Speed Optimization (NCSO) algorithm. The simulation tested the performance of the two techniques under various traffic conditions, showing that the CSOF outperformed the NCSO in terms of minimizing the total average idling times, number of stops, and energy consumption. Furthermore, the performance of the CSOF was investigated in terms of the V2I communication range. It was concluded that the EAVs using the CSOF could achieve the minimum values of total average idling times and number of stops at an activation distance near 500 meters. Future work should investigate the scalability of the CSOF under various geometrical designs of SIs and TL phase/cycle settings. Moreover, the cooperation of a dynamic traffic light system with the EAVs using the CSOF should be investigated to achieve further minimization of average idling times, number of stops, and energy consumption.

References

[1] G. Silberg and R. Wallace, "Self-driving Cars: The Next Revolution," KPMG LLP and the Center for Automotive Research (CAR), USA, Tech. Rep., 2012.

[2] A. Pawsey and C. Nath, "Autonomous Road Vehicles," The Parliamentary Office of Science and Technology, London, UK, PostNote No. 443, Sep. 2013.

[3] T. Litman, "Autonomous Vehicle Implementation Predictions: Implications for Transport Planning," Todd Alexander Litman: Victoria Transport Policy Institute, Victoria, BC, Tech. Rep., 2013-2017.

[4] M. Oonk and J. Svensson, "Roadmap: Automation in Road Transport," iMobility Forum, Brussels, Tech. Rep. Version 1.0, May 2013.

[5] D. Fagnant and K. Kockelman, "Preparing a Nation for Autonomous Vehicles: Opportunities, Barriers and Policy Recommendations," Transportation Research Part A: Policy and Practice

Wireless Communications and Mobile Computing, vol. 11, pp. 1657-1667, Nov. 2011.

[8] K. Katsaros, R. Kernchen, M. Dianati and D. Rieck, "Performance study of a Green Light Optimized Speed Advisory (GLOSA) application using an integrated cooperative ITS simulation platform," Istanbul, 2011, pp. 918-923.

[9] M. Alsabaan, K. Naik, T. Khalifa and A. Nayak, "Applying Vehicular Networks for Reduced Vehicle Fuel Consumption and CO2 Emissions," in Intelligent Transportation Systems, A. Abdel-Rahim, Ed., ISBN: 978-953-51-0347-9, InTech, Mar. 2012, pp. 1-21.

[10] C. Li and S. Shimamoto, "An Open Traffic Light Control Model for Reducing Vehicles' CO2 Emissions Based on ETC Vehicles," IEEE Transactions on Vehicular Technology, vol. 61, no. 1, pp. 97-110, Jan. 2012.

[11] T. Tielert, M. Killat, H. Hartenstein, R. Luz, S. Hausberger and T. Benz, "The impact of traffic-light-to-vehicle communication on fuel consumption and emissions," Tokyo, 2010, pp. 1-8.

[12] H. Rakha, R. Kishore Kamalanathsharma, and K. Ahn, "AERIS: Eco-Vehicle Speed Control at Signalized Intersections Using I2V Communication," United States Department of Transportation, New Jersey, USA, Tech. Rep. FHWA-JPO-12-063, 2012.

[13] E. Zermelo, "Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels," in Proc. of the Fifth International Congress of Mathematicians, 1913, pp. 501-504.

[14] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behaviour. Princeton, NJ, US: Princeton University Press, 1944, pp. 1-625.

[15] V. Fragnelli, I. Garcia-Jurado, and L. Mendez-Naya, "On shortest path games," Mathematical Methods of Operations Research, 52(2):251-264, 2000.

[16] M. Voorneveld and S. Grahn, "Cost allocation in shortest path games," Mathematical Methods of Operations Research, 56(2):323-340, 2002.

[17] D. Fotakis, S. Kontogiannis, and P. Spirakis, "Atomic Congestion Games among Coalitions," ACM Transactions on Algorithms, vol. 4, no. 4, Article 52, pp. 1-27, Aug. 2008.

[18] A. Hayrapetyan, E. Tardos, and T. Wexler, "The effect of collusion in congestion games," in Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, 2006, pp. 89-98.

[19] S. Moya and A. Poznyak, "Stackelberg-Nash concept applied to the traffic control problem with a dominating intersection," Mexico City, 2008, pp. 137-142.

[20] T. Linglong, Z. Xiaohua, H. Dunli, S. Yanzhang and W. Ren, "A Study of Single Intersection Traffic Signal Control Based on Two-player Cooperation Game Model," Beidaihe, Hebei, 2010, pp. 322-327.

[21] I. Alvarez and A. Poznyak, "Game theory applied to urban traffic control problem," ICCAS 2010, Gyeonggi-do, 2010, pp. 2164-2169.

[22] M. Khanjary, "Using game theory to optimize traffic light of an intersection," IEEE 14th International Symposium on Computational Intelligence and Informatics (CINTI), Budapest, 2013, pp. 249-253.

[23] H. M. Abdelghaffar, H. Yang and H. A. Rakha, "Isolated traffic signal control using a game theoretic framework," IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), Rio de Janeiro, 2016, pp. 1496-1501.

[24] B. Ronoh, K. Nyogesa, A. Otieno, and B. Korir, "Central Moments of Traffic Delay at a Signalized Intersection," Mathematical Theory and Modeling, vol. 4, no. 9, pp. 31-52, 2014.

[25] M. Osborne and A. Rubinstein, A Course in Game Theory. Massachusetts Institute of Technology, Cambridge, Massachusetts: The MIT Press, 1994, pp. 9-43.

[26] S. T. Police, Basic Theory of Driving: The Official Handbook, Ninth Edition. Singapore Traffic Police, 2017, pp. 1-100.

[27] P. Lertworawanich, "Safe Following Distances Based on the Car Following Models," PIARC International Seminar on Intelligent Transport System (ITS) in Road Network Operations, Kuala Lumpur, Malaysia, 2006, pp. 1-19.

[28] B. Greenshields, "The photographic method of studying traffic behaviour," Proceedings of the 13th Annual Meeting of the Highway Research Board, Washington, DC, 1933, pp. 382-399.

[29] M. Faraj and O. Basir, "Optimal energy/time routing in battery-powered vehicles," IEEE Transportation Electrification Conference and Expo (ITEC), Dearborn, MI, 2016, pp. 1-6.

[30] A. Brandenburger, "Cooperative Game Theory: Characteristic Functions, Allocations, Marginal Contribution," 2007.

[31] J. Nash, "The Bargaining Problem," The Econometric Society, vol. 18, pp. 155-162, Apr. 1950.

[32] J. Nash, "Two-Person Cooperative Games," The Econometric Society, vol. 21, pp. 128-140, Jan. 1953.

[33] R. Gilles, The Cooperative Game Theory of Networks and Hierarchies. Berlin Heidelberg: Springer Berlin Heidelberg, 2010, pp. 29-68.

[34] H. Wiese, Class Lecture, Topic: "Applied cooperative game theory," University of Leipzig, Apr. 2010.

[35] S. Russell and P. Norvig.