On the Control of Agents Coupled through Shared Unit-demand Resources
arXiv preprint [cs.SY]
Syed Eqbal Alam∗, Robert Shorten†, Fabian Wirth‡, and Jia Yuan Yu∗

Abstract: We consider a control problem involving several agents coupled through multiple unit-demand resources. Such resources are indivisible, and each agent's consumption is modeled as a Bernoulli random variable. Controlling the number of such agents in a probabilistic manner, subject to capacity constraints, is ubiquitous in smart cities. For instance, such agents can be humans in a feedback loop who respond to a price signal, or automated decision-support systems that strive toward system-level goals. In this paper, we consider both a single feedback loop corresponding to a single resource and multiple coupled feedback loops corresponding to multiple resources consumed by the same population of agents. For example, when a network of devices allocates resources to deliver several services, these services are coupled through capacity constraints on the resources. We propose a new algorithm with fundamental guarantees of convergence and optimality, and present an example illustrating its performance.
Keywords: distributed optimization, optimal control, multi-resource allocation, unit-demand resources, smart city, electric vehicle charging
I. INTRODUCTION
Classical control has much to offer in a smart-city context. However, many problems arising in the context of smart cities reveal subtle constraints that are relatively unexplored by the control community. At a high level, both classical control and smart-city control deal with regulation problems. Nevertheless, in many (perhaps most) smart-city applications, control involves orchestrating the aggregate effect of a number of agents who respond to a signal (sometimes called a price) in a probabilistic way. A fundamental difference between classical control and smart-city control is the need to study the effect of control signals on the statistical properties of the populations that we wish to influence, while at the same time ensuring that the control is in some sense optimal. This fundamental difference concerns the need for ergodic feedback systems. Even though this problem is rarely studied in control, it is perhaps the most pressing issue in real-life applications, since the need for predictability at the level of individual agents underpins an operator's ability to write economic contracts.

∗ Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Quebec, Canada
† School of Electrical, Electronic and Communications Engineering, University College Dublin, Dublin, Ireland
‡ Faculty of Computer Science and Mathematics, University of Passau, Passau, Germany
The work is partly supported by Natural Sciences and Engineering Research Council of Canada grant no. RGPIN-2018-05096.

Our starting point for this work is the previous papers [1], [2], and the observation that many problems in smart cities can be cast in a framework where a large number of agents, such as people, cars, or machines, often with unknown objectives, compete for a limited resource. It is a challenge to allocate a resource in a manner that utilizes it optimally and gives a guaranteed level of service to each of the agents competing for that resource. For example, allocating parking spaces [3], [4], [5], regulating cars competing for shared road space [6], or allocating shared bikes [7], [8] are cases in which resource utilization should be maximized, while delivering a certain quality of service (QoS) to individual agents is a paramount constraint. As we have noted in [2], [9], [10], at a high level, these are primarily optimal control problems but with the added objective of controlling the microscopic properties of the agent population. Thus, the design of feedback systems for deployment in smart cities must combine notions of regulation, optimization, and the existence of a unique invariant measure [11].

Specifically, in this paper, we consider the problem of controlling a number of agents coupled through multiple shared resources, where each agent demands the resources in a probabilistic manner. This work builds strongly on the previous work [1], in which the optimal control and ergodic control of a single population of agents are considered. As we have mentioned, controlling networks of agents which demand resources in a probabilistic manner is ubiquitous in smart cities. In many smart-city applications, the probabilistic intent of agents can be natural (where humans are in a feedback loop and respond, for example, to a price signal), or designed (implemented in a decision support system) so that the network achieves system-level goals. Often, such feedback loops are coupled together, as agents contribute to or participate in multiple services. For example, when a network of devices allocates resources to deliver several services, these services are coupled through the consumption of multiple shared resources. We call such resources unit-demand resources: an agent is either allocated one unit of the resource or not allocated it at all.
A concrete manifestation of such a system is the IBM Research project on parked cars as a service delivery platform [12]. Here, networks of parked cars collaborate to offer services to city managers. Examples of services include wifi coverage, finding missing objects, and gas leak detection and localization. Vehicle owners allocate parts of their resource stochastically to contribute to different services, each of which is managed by a feedback loop. The allocation between services is usually coupled via a nonlinear function that represents the trade-off between resource allocation (energy, sensors) and the reward for participating in delivering a particular service. We shall give a concrete example of such a system later in the paper. It is our firm belief that such systems are ubiquitous in smart cities and represent a new class of problems in feedback systems.

Our main contribution in this paper is to establish stochastic schemes for a practically important class of problems in which several agents are coupled through multiple unit-demand shared resources in coupled feedback loops. Each agent demands the unit-demand shared resources in a probabilistic manner based on its private cost function and constraints; the constraints are based on multiple unit-demand shared resources. This scheme is a generalization of the single unit-demand resource allocation algorithm proposed in [1] and operates under more relaxed constraints than [9]. Furthermore, the results on convergence and optimality are derived for networks with a single unit-demand resource and then extended to multiple unit-demand resources.

II. PRELIMINARIES
Suppose that $n$ agents are coupled through $m$ resources $R^1, R^2, \ldots, R^m$ and each agent has a cost function that depends on the allocation of these resources in the closed coupled feedback loop. Let the desired value or capacity of resources $R^1, R^2, \ldots, R^m$ be $C^1, C^2, \ldots, C^m$, respectively. We denote $\mathcal{N} \triangleq \{1, 2, \ldots, n\}$, $\mathcal{M} \triangleq \{1, 2, \ldots, m\}$, and use $i \in \mathcal{N}$ as an index for agents and $j \in \mathcal{M}$ to index the resources. Let $\xi_i^j(k)$ denote an independent Bernoulli random variable which represents the instantaneous allocation of resource $R^j$ of agent $i$ at time step $k$. Furthermore, let $y_i^j(k) \in [0, 1]$ denote the average allocation of resource $R^j$ of agent $i$ at time step $k$. We define $y_i^j(k)$ as follows,

$$ y_i^j(k) \triangleq \frac{1}{k+1} \sum_{\ell=0}^{k} \xi_i^j(\ell), \tag{1} $$

for $i = 1, 2, \ldots, n$, and $j = 1, 2, \ldots, m$. We assume that agent $i$ has a cost function $g_i : (0, 1]^m \to \mathbb{R}_+$ which associates a cost to a certain allotment of resources to the agent. We assume that $g_i$ is twice continuously differentiable, convex, and increasing in all variables, for all $i$. We also assume that the agents do not share their cost functions or allocation information with other agents. Then, instead of defining the resource allocation problem in terms of the instantaneous allocation $\xi_i^j(k) \in \{0, 1\}$, for all $i$, $j$ and $k$, we define the objective and constraints in terms of averages as follows,

$$ \begin{aligned} \min_{y_1^1, \ldots, y_n^m} \quad & \sum_{i=1}^{n} g_i(y_i^1, \ldots, y_i^m), \\ \text{subject to} \quad & \sum_{i=1}^{n} y_i^j = C^j, \quad j = 1, \ldots, m, \\ & y_i^j \geq 0, \quad i = 1, \ldots, n, \text{ and } j = 1, \ldots, m. \end{aligned} \tag{2} $$

Let $y^* = (y_1^{*1}, \ldots, y_n^{*m}) \in (0, 1]^{nm}$ denote the solution to (2). Let $\mathbb{N}$ denote the set of natural numbers, and let $k \in \mathbb{N}$ denote the time steps. Our objective is to propose a distributed iterative algorithm that determines the instantaneous allocation $\{\xi_i^j(k)\}$ and ensures that the long-term average allocation, as defined in (1), converges to the optimal allocation as follows (treated in Section IV),

$$ \lim_{k \to \infty} y_i^j(k) = y_i^{*j}, \quad \text{for } i = 1, 2, \ldots, n, \text{ and } j = 1, 2, \ldots, m, $$

thereby achieving the minimum social cost in the sense of long-term averages. By compactness of the constraint set, optimal solutions exist. The assumption that the cost function $g_i$ is strictly convex leads to strict convexity of $\sum_{i=1}^{n} g_i$, from which it follows that the optimal solution is unique.

A. Optimality conditions
Let $L : \mathbb{R}^{nm} \times \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$, and let $\mu = (\mu^1, \mu^2, \ldots, \mu^m)$ and $\lambda = (\lambda^1, \lambda^2, \ldots, \lambda^m)$ be the Lagrange multipliers of the resources. Then we define the Lagrangian of Problem (2) as follows,

$$ L(y, \mu, \lambda) \triangleq \sum_{i=1}^{n} g_i(y_i^1, \ldots, y_i^m) - \sum_{j=1}^{m} \mu^j \Big( \sum_{i=1}^{n} y_i^j - C^j \Big) + \sum_{j=1}^{m} \sum_{i=1}^{n} \lambda^j y_i^j. $$

Recall that $y_i^{*1}, \ldots, y_i^{*m} \in (0, 1]$ are the optimal allocations of agent $i$ of Problem (2), for $i = 1, 2, \ldots, n$. Now, let $\nabla_j g_i$ denote the (partial) derivative of the cost function $g_i$ with respect to resource $R^j$, for $j = 1, 2, \ldots, m$. Then, following an analysis similar to [9], we find that the derivatives of the cost functions of all agents competing for a particular resource reach consensus at the optimal average allocations. That is, the following holds true, for $i, u \in \mathcal{N}$, and $j \in \mathcal{M}$:

$$ \nabla_j g_i(y_i^{*1}, \ldots, y_i^{*m}) = \nabla_j g_u(y_u^{*1}, \ldots, y_u^{*m}). \tag{3} $$

Furthermore, the Karush-Kuhn-Tucker (KKT) conditions, which are necessary and sufficient conditions of optimality for the optimization Problem (2), are satisfied by the consensus of derivatives (cf. (3)) of the cost functions. A similar analysis is done in [9], [14], [15]; readers can find further details of the KKT conditions in Chap. 5.5.3 of [13]. In this paper, we use this principle to show that the proposed algorithm reaches optimal values asymptotically. The consensus of derivatives of cost functions is also used in [14], [16], [17] (single resource), [9], [15] (multi-resource, stochastic), and [18] (multi-resource, derandomized) to show the convergence of allocations to optimal values.

III. ALLOCATING SINGLE UNIT-DEMAND RESOURCE THROUGH A FEEDBACK LOOP
In this section, we consider the single-resource case of [1] and briefly describe the proposed distributed, iterative, and stochastic allocation algorithm. We also prove its convergence and optimality properties under a few assumptions.

With a single resource, we can simplify notation by dropping the index $j$. Each agent $i$ has a strictly convex cost function $g_i : (0, 1] \to \mathbb{R}_+$. The binary random variable $\xi_i(k) \in \{0, 1\}$ denotes the allocation of the unit resource for agent $i$ at time step $k$. Let $y_i(k)$ be the average allocation up to time step $k$ of agent $i$, that is, $y_i(k) \triangleq \frac{1}{k+1} \sum_{\ell=0}^{k} \xi_i(\ell)$. Let $\xi(k) \in \{0, 1\}^n$ and $y(k) \in [0, 1]^n$ denote the vectors with entries $\xi_i(k)$ and $y_i(k)$, respectively, for $i = 1, 2, \ldots, n$.

The idea is to choose the probability for the random variable $\xi_i$ so as to ensure convergence to the socially optimum value, and to adjust overall consumption to the desired level $C$ by applying a normalization factor $\Omega$ to the probability, for all $i$. When an agent joins the network at time step $k \in \mathbb{N}$, it receives the normalization factor $\Omega(k)$. At each time step $k$, the central agency updates $\Omega(k)$ using a gain parameter $\tau$, the past utilization of the resource, and its capacity; then it broadcasts the new value to all agents in the network,

$$ \Omega(k+1) \triangleq \Omega(k) - \tau \Big( \sum_{i=1}^{n} \xi_i(k) - C \Big), \tag{4} $$

where

$$ \tau \in \Big( 0, \Big( \max_{y \in [0,1]^n} \sum_{i=1}^{n} \frac{y_i}{g_i'(y_i)} \Big)^{-1} \Big). \tag{5} $$

After receiving this signal, agent $i$ responds in a random fashion based on its available information. The probability function $\sigma_i(\cdot)$, which uses the average allocation of the resource to agent $i$ and the derivative $g_i'$ of the cost function $g_i$, is given by

$$ \sigma_i(\Omega(k), y_i(k)) \triangleq \Omega(k) \frac{y_i(k)}{g_i'(y_i(k))}, \quad \text{for } i \in \mathcal{N}. \tag{6} $$

Agent $i$ updates its resource demand at each time step, either by demanding one unit of the resource or by not demanding it, as follows,

$$ \xi_i(k+1) = \begin{cases} 1 & \text{with probability } \sigma_i(\Omega(k), y_i(k)); \\ 0 & \text{with probability } 1 - \sigma_i(\Omega(k), y_i(k)). \end{cases} $$

We point out that the above formulation requires assumptions on the cost function $g_i$ and the admissible value of $\Omega$, because the scheme requires that (6) does, in fact, define a probability. For ease of notation, we define $v_i(z) \triangleq z / g_i'(z)$, where $z \in [0, 1]$, and $v(y)$ to be the vector with components $v_i(y_i)$, for $i = 1, 2, \ldots, n$, where $y \in [0, 1]^n$.

Definition 3.1 (Admissibility):
Let $n \in \mathbb{N}$, and let $g_i : [0, 1] \to \mathbb{R}_+$ be continuously differentiable and strictly convex, for $i = 1, \ldots, n$. We call the set $\{g_i, i = 1, \ldots, n\}$ and $\Omega > 0$ admissible, if
(i) $v_i$ is well defined on $[0, 1]$, for $i = 1, \ldots, n$,
(ii) there are constants $0 < a < b < 1$, such that $\sigma_i(\Omega, z) = \Omega v_i(z) \in [a, b]$, for $i = 1, \ldots, n$, and $z \in [0, 1]$.

The definition of admissibility imposes several restrictions on the possible cost function $g_i$, similar to those imposed in [14]. See this reference for a detailed discussion and possible relaxations. When $\Omega$ is a constant, that is, $\Omega$ does not depend on the time step $k \in \mathbb{N}$ and (4) is therefore not active, the convergence of the scheme follows using tools from classical stochastic approximation [19].

Theorem 3.2: [19, Theorem 2.2] Let $x(k) \in \mathbb{R}^n_+$, for $k \in \mathbb{N}$, be given by

$$ x(k+1) = x(k) + a(k) \big[ h(x(k)) + M(k+1) \big], \tag{7} $$

for a fixed $x(0)$. If Assumptions 3.3 (i) to (iv) are satisfied, then $\{x(k)\}$ converges almost surely to a connected chain-transitive set of the differential equation $\dot{x}(t) = h(x(t))$, for $t \geq 0$.

Assumption 3.3:
(i) The map $h$ is Lipschitz.
(ii) The step-size satisfies $a(k) > 0$, for $k \in \mathbb{N}$, and
$$ \sum_{\ell=0}^{\infty} a(\ell) = \infty, \quad \text{and} \quad \sum_{\ell=0}^{\infty} \big( a(\ell) \big)^2 < \infty. $$
(iii) $\{M(k)\}$ is a martingale difference sequence with respect to the $\sigma$-algebra $\mathcal{F}_k$ generated by the events up to time step $k$. Also, for the $l_2$-norm $\|\cdot\|$, the martingale difference sequence $\{M(k)\}$ is square-integrable, that is,
$$ \mathbb{E}\big( \|M(k+1)\|^2 \,\big|\, \mathcal{F}_k \big) \leq \eta \big( 1 + \|x(k)\|^2 \big), \quad \text{almost surely, for } k \in \mathbb{N}, \text{ and } \eta > 0. $$
(iv) The sequence $\{x(k)\}$ is almost surely bounded.

The theorem on the convergence of the average allocation $y(k)$ is stated as follows.

Theorem 3.4 (Convergence of average allocations):
Let $n \in \mathbb{N}$. Assume that the cost function $g_i : [0, 1] \to \mathbb{R}_+$ is strictly convex, continuously differentiable and strictly increasing in each variable, for $i = 1, \ldots, n$. Let $\Omega > 0$, and assume that $\{g_i, i = 1, \ldots, n\}$ and $\Omega$ are admissible. Then almost surely, $\lim_{k \to \infty} y(k) = y^*$, where $y^*$ is characterized by the condition

$$ \Omega = g_i'(y_i^*), \quad \text{for } i = 1, \ldots, n. \tag{8} $$

Proof:
By definition, we have

$$ y(k+1) = \frac{k}{k+1} y(k) + \frac{1}{k+1} \xi(k+1). \tag{9} $$

Let $\sigma(\Omega, y(k))$ denote the vector with entries $\sigma_i(\Omega, y_i(k))$, for $i = 1, 2, \ldots, n$, and $k = 0, 1, 2, \ldots$ Thus, (9) may be reformulated as

$$ y(k+1) = y(k) + \frac{1}{k+1} \Big[ \big( \sigma(\Omega, y(k)) - y(k) \big) + \big( \xi(k+1) - \sigma(\Omega, y(k)) \big) \Big]. \tag{10} $$

Furthermore, let $\xi(k+1) - \sigma(\Omega, y(k))$ be denoted by $M(k+1)$, and the step-size $\frac{1}{k+1}$ be denoted by $a(k)$, for $k \in \mathbb{N}$. Also, let $\sigma(\Omega, y(k)) - y(k)$ be denoted by $h(y(k))$. Then (10) has the same form as (7).

We can verify that Assumptions 3.3 (i) to (iv) are satisfied for formulation (10). Recall that $h(y(k)) = \sigma(\Omega, y(k)) - y(k)$; thus, the map $h : y \mapsto \sigma(\Omega, y) - y = \Omega v(y) - y$ is Lipschitz, which satisfies Assumption 3.3 (i). Also, the step-size $a(k) = \frac{1}{k+1}$ is positive, for $k = 0, 1, 2, \ldots$, and we can derive that

$$ \sum_{\ell=0}^{\infty} a(\ell) = \infty, \quad \text{and} \quad \sum_{\ell=0}^{\infty} \big( a(\ell) \big)^2 < \infty, $$

which satisfies Assumption 3.3 (ii). Additionally, we note that the expectation satisfies

$$ \mathbb{E}\big( \xi(k+1) - \sigma(\Omega, y(k)) \,\big|\, \mathcal{F}_k \big) = 0, \tag{11} $$

where $\mathcal{F}_k$ is the $\sigma$-algebra generated by the events up to time step $k$. This follows immediately from the definition of the probability $\sigma_i(\cdot)$. By (11), $\{M(k)\}$ is a martingale difference sequence with respect to $\mathcal{F}_k$. Moreover, the sequence $\{\xi(k+1) - \sigma(\Omega, y(k))\}$ is bounded, so with a little manipulation we can show that the martingale difference sequence $\{M(k)\}$ is square-integrable, which satisfies Assumption 3.3 (iii). Finally, the iterate $y(k) \in [0, 1]^n$ is bounded almost surely, which satisfies Assumption 3.3 (iv). Thus, it follows that almost surely $\{y(k)\}$ converges to a connected chain-transitive set of the differential equation

$$ \dot{y} = \Omega v(y) - y. \tag{12} $$

It remains to show that the differential equation has an asymptotically stable fixed point whose domain of attraction contains the set $[0, 1]^n$, as this then determines the unique possible limit point of $\{y(k)\}$. We note first that the differential equation is given by $n$ decoupled equations

$$ \dot{y}_i = \Omega v_i(y_i) - y_i, \quad \text{for } i = 1, 2, \ldots, n. $$

The fixed points for each of these 1-dimensional equations are characterized by the condition $\Omega y_i^* / g_i'(y_i^*) - y_i^* = 0$. We have by Definition 3.1 (ii) that

$$ \Omega v_i(0) > 0 \quad \text{and} \quad \Omega v_i(1) - 1 < 0, \quad \text{for } i = 1, 2, \ldots, n. \tag{13} $$

This shows that $y_i^* \in (0, 1)$, and so a little manipulation shows that the fixed points are characterized by

$$ \Omega = g_i'(y_i^*), \quad \text{for } i = 1, 2, \ldots, n. $$

As $g_i$ is strictly convex, $g_i'$ is strictly increasing and so the fixed point for each of the decoupled equations is unique. Now, (13) together with sign considerations shows asymptotic stability and the desired property of the domain of attraction. The proof is complete.

Notice that the proof of convergence is based on a constant $\Omega$; it is an open problem to prove convergence of the average allocation with $\Omega(k)$ that varies with the time step $k$ (cf. (4)).

Remark 3.5 (Optimality):
We note that the fixed point condition (8) can be interpreted as an optimality condition, as established in (3). If we define $C^* \triangleq \sum_{i=1}^{n} y_i^*$, then (8) shows that $y^*$ is the unique optimal point of the optimization problem:

$$ \min_{y_1, \ldots, y_n} \sum_{i=1}^{n} g_i(y_i), \quad \text{subject to } \sum_{i=1}^{n} y_i = C^*, \quad y_i \geq 0, \text{ for } i = 1, 2, \ldots, n. $$

Furthermore, the equation shows that $\Omega$ may be used to adjust the fixed points, and thus the constraints. As the cost function $g_i$ is strictly convex and increasing in each variable, the derivative $g_i'$ is positive and increasing. Therefore, increasing $\Omega$ increases each $y_i^*(\Omega)$, and thus the total constraint $C^*(\Omega)$, while decreasing $\Omega$ has the opposite effect. The simple PI controller for $\Omega$ in (4) thus has the purpose of adjusting to the right level of resource consumption. The full proof of convergence of the scheme with the PI controller in the loop is beyond the scope of the present paper.

IV. ALLOCATING MULTIPLE UNIT-DEMAND RESOURCES THROUGH COUPLED FEEDBACK LOOPS
We turn our attention in this section to the case of multiple resources shared by the same population of agents. We present a new algorithm that generalizes the single-resource algorithm of the previous section to multiple unit-demand resources. The agents are coupled through these shared resources.

Before presenting the algorithm, we introduce the following additional notions. Suppose that there exists $\delta > 0$, such that $\mathcal{G}_\delta$ is a set of continuously differentiable, convex and increasing functions, and $g_1, g_2, \ldots, g_n \in \mathcal{G}_\delta$. We assume that $\mathcal{G}_\delta$ is common knowledge to the control unit, and that each cost function $g_i$ is private and should be kept private. Although $\mathcal{G}_\delta$ is common knowledge, the large number of cost functions in $\mathcal{G}_\delta$ makes it difficult for the control unit to guess the cost function $g_i$ of a particular agent $i$; this is true for every agent in the network.

Each agent in the network runs the distributed unit-demand multi-resource allocation algorithm. Let $\tau^j$ be the gain parameter, let $\Omega^j(k)$ denote the normalization factor (signal of the controller) of the feedback loop, and let $C^j$ represent the desired value (capacity) of resource $R^j$, for all $j$. We use the term control unit instead of controller here. The control unit updates $\Omega^j(k)$ according to (15) at each time step and broadcasts it to all agents in the network, for all $j$ and $k$. When an agent joins the network at time step $k$, it receives the parameter $\Omega^j(k)$ for resource $R^j$, for all $j$. Every agent's algorithm updates its resource demand at each time step, either by demanding one unit of the resource or by not demanding it. The normalization factor $\Omega^j(k)$ depends on its value at the previous time step, $\tau^j$, the capacity $C^j$, and the total utilization of resource $R^j$ at the previous time step, for all $j$ and $k$. After receiving this signal, agent $i$'s algorithm responds in a probabilistic manner.
It calculates its probability $\sigma_i^j(k)$ using its average allocation $y_i^j(k)$ of resource $R^j$ and the derivative of its cost function, for all $j$ and $k$, as described in (16). Agent $i$ finds out the outcome of a Bernoulli trial for resource $R^j$, outcome 1 with probability $\sigma_i^j(k)$ and outcome 0 with probability $1 - \sigma_i^j(k)$; based on the value 0 or 1, the algorithm decides whether to demand one unit of the resource $R^j$ or not. If the value is 1, then the algorithm demands one unit of the resource; otherwise, it does not demand the resource. The same is done for all the resources. This process repeats over time. We present the proposed unit-demand multi-resource allocation algorithm for the control unit in Algorithm 1 and the algorithm for each agent in Algorithm 2.

Footnote: We initialize the normalization factor with a positive real number for each resource.

Algorithm 1: Algorithm of control unit
  Input: $C^1, \ldots, C^m$, $\tau^1, \ldots, \tau^m$, $\xi_i^1(k), \ldots, \xi_i^m(k)$, for $k \in \mathbb{N}$ and $i \in \mathcal{N}$.
  Output: $\Omega^1(k+1), \Omega^2(k+1), \ldots, \Omega^m(k+1)$, for $k \in \mathbb{N}$.
  Initialization: $\Omega^j(0) \leftarrow$ . , for $j \in \mathcal{M}$,
  foreach $k \in \mathbb{N}$ do
    foreach $j \in \mathcal{M}$ do
      calculate $\Omega^j(k+1)$ according to (15) and broadcast in the network;
    end
  end

Algorithm 2: Unit-demand multi-resource allocation algorithm of agent $i$
  Input: $\Omega^1(k), \Omega^2(k), \ldots, \Omega^m(k)$, for $k \in \mathbb{N}$.
  Output: $\xi_i^1(k+1), \xi_i^2(k+1), \ldots, \xi_i^m(k+1)$, for $k \in \mathbb{N}$.
  Initialization: $\xi_i^j(0) \leftarrow$  and $y_i^j(0) \leftarrow \xi_i^j(0)$, for $j \in \mathcal{M}$.
  foreach $k \in \mathbb{N}$ do
    foreach $j \in \mathcal{M}$ do
      $\sigma_i^j(k) \leftarrow \Omega^j(k) \, y_i^j(k) / \nabla_j g_i(y_i^1(k), \ldots, y_i^m(k))$;
      generate an independent Bernoulli random variable $b_i^j(k)$ with the parameter $\sigma_i^j(k)$;
      if $b_i^j(k) = 1$ then
        $\xi_i^j(k+1) \leftarrow 1$;
      else
        $\xi_i^j(k+1) \leftarrow 0$;
      end
      $y_i^j(k+1) \leftarrow \frac{k+1}{k+2} y_i^j(k) + \frac{1}{k+2} \xi_i^j(k+1)$;
    end
  end

After introducing the algorithms, we describe here how to calculate the different factors. Let $x_1^1, \ldots, x_n^m \in [0, 1]$ be the deterministic values of average allocations; then the control unit calculates the gain parameter $\tau^j$ with common knowledge of $\mathcal{G}_\delta$, for all $j$, as follows,

$$ \tau^j \in \Big( 0, \Big( \sup_{x_1^1, \ldots, x_n^m \in \mathbb{R}_+,\ g_1, \ldots, g_n \in \mathcal{G}_\delta} \sum_{i=1}^{n} \frac{x_i^j}{\nabla_j g_i(x_i^1, \ldots, x_i^m)} \Big)^{-1} \Big). \tag{14} $$

Now, we define $\Omega^j(k+1)$, which is based on the utilization of resource $R^j$ at time step $k$ and the common knowledge $\mathcal{G}_\delta$, as follows,

$$ \Omega^j(k+1) \triangleq \Omega^j(k) - \tau^j \Big( \sum_{i=1}^{n} \xi_i^j(k) - C^j \Big). \tag{15} $$

We call $\Omega^j(k)$ the normalization factor, used by the control unit. After receiving the normalization factor $\Omega^j(k)$ from the control unit at time step $k$, agent $i$ responds with probability $\sigma_i^j(k)$ in the following manner to demand resource $R^j$ at the next time step, for all $i$, $j$ and $k$:

$$ \sigma_i^j(k) \triangleq \Omega^j(k) \frac{y_i^j(k)}{\nabla_j g_i(y_i^1(k), y_i^2(k), \ldots, y_i^m(k))}. \tag{16} $$

Notice that $\Omega^j(k)$ is used to bound the probability $\sigma_i^j(k) \in (0, 1)$, for all $i$, $j$ and $k$.
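The interplay of the control-unit update (15) and the agent response (16) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the cost $g_i(y) = \sum_j (a_j y_j^2/2 + b_j y_j^3/3)$, the coefficient ranges, the initial value $\xi_i^j(0) = 1$, the clamping of the probability to $[0, 1]$, and all function and parameter names (`simulate`, `update_omega`, `demand_probability`, `grad_cost`) are hypothetical choices made for this sketch.

```python
import random


def update_omega(omega, tau, total_demand, capacity):
    """Control-unit step, cf. (15): Omega(k+1) = Omega(k) - tau * (sum_i xi_i(k) - C)."""
    return omega - tau * (total_demand - capacity)


def demand_probability(omega, y_avg, grad):
    """Agent response, cf. (16): sigma = Omega * y / grad_j g_i(y).

    Clamping to [0, 1] is a safeguard added for this sketch only.
    """
    return min(1.0, max(0.0, omega * y_avg / grad))


def grad_cost(a, b, y):
    """Gradient of the hypothetical cost g_i(y) = sum_j (a_j*y_j**2/2 + b_j*y_j**3/3)."""
    return [a_j * y_j + b_j * y_j ** 2 for a_j, b_j, y_j in zip(a, b, y)]


def simulate(n, capacities, taus, omega0, steps, seed=0):
    """Run the coupled feedback loops for a fixed number of time steps."""
    rng = random.Random(seed)
    m = len(capacities)
    # Private, illustrative cost coefficients for every agent and resource.
    a = [[rng.uniform(1.0, 2.0) for _ in range(m)] for _ in range(n)]
    b = [[rng.uniform(1.0, 2.0) for _ in range(m)] for _ in range(n)]
    xi = [[1] * m for _ in range(n)]    # xi_i^j(0) = 1 (assumed initial value)
    y = [[1.0] * m for _ in range(n)]   # y_i^j(0) = xi_i^j(0), so y stays positive
    omega = list(omega0)
    for k in range(steps):
        # Agents draw xi(k+1) from Omega(k) and y(k).
        new_xi = []
        for i in range(n):
            grad = grad_cost(a[i], b[i], y[i])
            new_xi.append([
                1 if rng.random() < demand_probability(omega[j], y[i][j], grad[j]) else 0
                for j in range(m)
            ])
        # Control unit computes Omega(k+1) from the aggregate demand at step k.
        omega = [
            update_omega(omega[j], taus[j], sum(row[j] for row in xi), capacities[j])
            for j in range(m)
        ]
        xi = new_xi
        # Running averages: y(k+1) = (k+1)/(k+2) * y(k) + xi(k+1)/(k+2).
        for i in range(n):
            for j in range(m):
                y[i][j] = (k + 1) / (k + 2) * y[i][j] + xi[i][j] / (k + 2)
    return omega, y, xi
```

A call such as `simulate(n=20, capacities=[5, 8], taus=[0.001, 0.001], omega0=[0.5, 0.5], steps=200)` returns the final normalization factors, the average allocations, and the last instantaneous demands. Note that the control unit only ever sees the column sums of `xi`, never the individual costs or probabilities.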
Furthermore, note that for simplicity of notation we use $\sigma_i^j(k)$ instead of $\sigma_i^j(\Omega^j(k), y_i^j(k))$ in this section.

Let $\xi^j(k) \in \{0, 1\}^n$ and $y^j(k) \in [0, 1]^n$ denote the vectors with entries $\xi_i^j(k)$ and $y_i^j(k)$, respectively, and let $\sigma^j(k)$ denote the vector with entries $\sigma_i^j(k)$, for $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, and $k = 0, 1, 2, \ldots$ Then, similar to the single-resource case, we can restate the definition of admissibility as in Definition 3.1 and the theorem of convergence of the average allocation $y^j(k)$ for a constant $\Omega^j$, for $j = 1, 2, \ldots, m$, as in Theorem 3.4. We state the generalized theorem of convergence of average allocations for multiple resources as follows.

Theorem 4.1 (Convergence of average allocations):
Let $n \in \mathbb{N}$. Assume that the cost function $g_i : [0, 1]^m \to \mathbb{R}_+$ is strictly convex, continuously differentiable and strictly increasing in each variable, for $i = 1, \ldots, n$. Furthermore, let $\Omega^j > 0$, and assume that $\{g_i, i = 1, \ldots, n\}$ and $\Omega^j$ are admissible, for $j = 1, \ldots, m$. Then almost surely, $\lim_{k \to \infty} y^j(k) = y^{*j}$, where $y^{*j}$ is characterized by the condition

$$ \Omega^j = \nabla_j g_i(y_i^{*1}, y_i^{*2}, \ldots, y_i^{*m}), \tag{17} $$

for $i = 1, \ldots, n$, and $j = 1, \ldots, m$.

Proof:
We write the average allocation $y^j(k)$ as:

$$ y^j(k+1) = \frac{k}{k+1} y^j(k) + \frac{1}{k+1} \xi^j(k+1), $$

for $j = 1, 2, \ldots, m$. This may be reformulated, for $j = 1, \ldots, m$, as:

$$ y^j(k+1) = y^j(k) + \frac{1}{k+1} \Big[ \big( \sigma^j(k) - y^j(k) \big) + \big( \xi^j(k+1) - \sigma^j(k) \big) \Big]. \tag{18} $$

Notice that (18) is similar to (10); thus, the proof follows the single-resource case.

Readers may note that the proof of convergence with $\Omega^j(k)$ that varies with the time step $k \in \mathbb{N}$ (cf. (15)), for $j = 1, \ldots, m$, is an open problem.

Analogous to Remark 3.5, with a similar assumption on $C^{*j}$, for $j = 1, \ldots, m$, the fixed point condition (17) can be interpreted as an optimality condition, as established in (3). Also, $y^{*j}$, for $j = 1, \ldots, m$, is the unique optimal point of the optimization Problem (2).

Remark 4.2 (Privacy of an agent):
The control unit only knows the aggregate utilization $\sum_{i=1}^{n} \xi_i^j(k)$ of resource $R^j$ at time step $k$, which ensures the privacy of the probability and cost function of an agent.

Furthermore, notice that the network has very little communication overhead. Suppose that $\Omega^j(k)$ takes floating point values represented by $\mu$ bits. If there are $m$ unit-demand resources in the network, then the communication overhead in the network will be $\mu m$ bits per time unit. Moreover, the communication complexity is independent of the number of agents participating in the network.

We briefly present [9] here. Let us assume that there are two unit-demand resources $R^1$ and $R^2$ in a network of $n$ agents. Agent $i$ desires to receive on average an amount $T_i$ of the unit-demand resources in the long run, for $i = 1, 2, \ldots, n$. Although that paper follows the same update scheme for the normalization factors $\Omega^1(k)$ and $\Omega^2(k)$ (cf. (15)), the goal of its scheme is different from that of this paper. The authors aim to achieve

$$ \lim_{k \to \infty} y_i^1(k) + y_i^2(k) = T_i, \quad \text{for } i = 1, \ldots, n, $$

and

$$ \lim_{k \to \infty} y_i^1(k) = \beta_i T_i, \quad \text{and} \quad \lim_{k \to \infty} y_i^2(k) = (1 - \beta_i) T_i, $$

where $\beta_i \in [0, 1]$, for $i = 1, 2, \ldots, n$.

V. APPLICATION TO ELECTRIC VEHICLE CHARGING
In this section, we use Algorithms 1 and 2 to regulate the number of electric vehicles that share a limited number of level 1 and level 2 charging points. We illustrate through numerical results that the utilization of charging points (level 1 or level 2) is concentrated around its desired value (capacity). Moreover, agents receive the optimal charging points in long-term averages; we verify this using the consensus of derivatives of the cost functions of agents, which satisfies all the KKT conditions for the optimization Problem (2), as described in Section II.

As background, the transportation sector in the US contributed around % of greenhouse gas (GHG) emission in in which light-duty vehicles like cars have % contribution. Furthermore, the share of carbon dioxide is . % of all GHG gases from the transportation sector [20]. To put it in context, currently, we have more than billion vehicles (electric (EV) as well as internal combustion engine (ICE)) on the road worldwide [21]; the number is increasing very rapidly, which will result in increased CO2 emission in the future. Therefore, strategies are needed to reduce CO2 emission. Though electric-only vehicles produce zero emission, the electricity generating units produce GHG emission at source depending on the power generation technique used, for example, thermal-electric, hydro-electric, wind power, nuclear power, etc. The US Department of Energy [22] states that annual CO2 emission by an electric vehicle (EV) is , . kg (share of CO2 emission in producing electricity for charging an EV) and by an ICE is , . kg.

Now, consider a situation where a city sets aside several free (no monetary cost) electric vehicle supply equipment (EVSE) units which support level 1 and level 2 chargers at a public EV charging station to serve the residents or to promote usage of electric vehicles or both.
Level chargerworks at – Volt (V) AC, – Ampere (A) and ittakes around – hours to charge the battery of an EVfully, whereas level charger works at V AC, – Aand it takes around – hours to charge the battery fully—depending on the battery capacity, onboard charger capacity,and a few other factors [23]. The voltage and current ratingof chargers vary, details of ranges can be found in [24], [25].Furthermore, suppose that the city has installed C EVSEswhich support level chargers and C EVSEs which support level chargers. Let n electric cars are coupled through level and level charging points. Now, the city must decidewhether to allocate level charging point or level chargingpoint to an electric car to regulate the number of cars utilizingcharging points. Clearly, in such a situation, charging pointsshould be allocated in a distributed manner that preservesthe privacy of individual car users, but also maximizes thebenefit to the city. We use the proposed distributed stochasticalgorithm which ensures the privacy of electric car usersand allocates charging points optimally to maximize socialwelfare, for example, to minimize total electricity cost or CO emission.According to [26], on average . kg of CO is pro-duced to generate and distribute kWh of electric energy inthe European Union with mix energy sources. Let I be thecurrent flowing in the circuit and V be the voltage rating ofthe circuit, let E CO be the rate of CO emission per kWh.If an EV is charged for t hours at a charging point then itstotal share of CO emission, say T CO ( t ) for generation anddistribution of I × V × t kWh electric energy is calculatedas T CO ( t ) = I × V × t × E CO , we use E CO = 0 . kg.Table I illustrates the total CO emission in kg by level and level chargers in four-hours duration. We use this datato formulate the cost function g i of (electric) car i , for allthe cars. Charger type power (kW) CO emission in four hoursLevel . – .
40 2 . – . kgLevel . – .
60 8 . – . kg TABLE I: CO emission in generation and distribution ofelectricitySuppose that each car user has private cost function g i which depends on the average allocations y i ( k ) and y i ( k ) of level and level charging points, respectively, for i = 1 , , . . . , n . We assume that the city agency (controlunit) broadcasts the normalization factors Ω ( k ) and Ω ( k ) to each competing electric car after every hours, here wechose a duration of hours because of charging rate of level chargers. Note that an EV user can unplug the vehicle inthe middle of charging without fully charging the battery.Now, suppose that the cost functions are classified into fourclasses based on—the type of vehicle, its battery capacity,onboard charger capacity, and a few other factors. We assumethat a set of vehicles belonging to each class. Based on thevalues in Table I, we let the constants a = 2 . , b = 8 . , andlet f i , f i be uniformly distributed random variables, where f i ∈ [1 , . , f i ∈ [1 , , for all i . The cost function g i islisted in (19), where first and second terms represent CO emission at a basic assumed rate of charging of the battery,whereas third and subsequent terms are CO emission dueto different charging losses or factors. We observe that noallocation of charging points produce zero CO emission,
[Figure 1, three panels plotted against iterations: (a) average allocation of charge points (legend: Level 1, charger 22; Level 2, charger 22; Level 1, charger 981; Level 2, charger 981); (b) and (c) derivative of $g_i$, $\nabla g_i(\cdot)$.]
Fig. 1: (a) Evolution of average allocation of charging points, (b) evolution of profile of derivatives of $g_i$ of all the electric cars with respect to level 1 chargers, (c) evolution of profile of derivatives of $g_i$ of all the electric cars with respect to level 2 chargers.
[Figure 2, two panels plotted against iterations: (a) total average allocations; (b) utilization (legend in both panels: Level 1 charger; Level 2 charger).]
Fig. 2: (a) Evolution of the sum of average allocation of charging points, (b) utilization of charging points over the last time steps; capacities of level 1 and level 2 chargers are $C^1 = 400$ and $C^2 = 500$, respectively.

the cost functions are as follows,

$g_i(y_i^1, y_i^2)$ =
  (i)   $a y_i^1 + b y_i^2$ + af i ( y i ) + bf i ( y i ) ,
  (ii)  $a y_i^1 + b y_i^2$ + af i ( y i ) / bf i ( y i ) ,
  (iii) $a y_i^1 + b y_i^2$ + af i ( y i ) / af i ( y i ) + bf i ( y i ) ,
  (iv)  $a y_i^1 + b y_i^2$ + af i ( y i ) + bf i ( y i ) .      (19)

Now, let the number of electric cars that use level 1 and level 2 chargers be $n = 1200$. We classify these cars as follows: cars to belong to class , cars to belong to class , cars to belong to class , and cars to belong to class . Each class has a set of cost functions; the cost functions of class 1 are presented in (19)(i), and analogously for the other classes. Let $C^1 = 400$ and $C^2 = 500$. The parameters of the algorithms are initialized with the following values: $\Omega^1(0)$ = 0 . , $\Omega^2(0)$ = 0 . , $\tau^1$ = 0 . , and $\tau^2$ = 0 . . We use the proposed Algorithm 1 and Algorithm 2 to allocate charging points to $n$ electric cars that are coupled through level 1 and level 2 charging points. If a car user is looking for a free charging point, then the user sends a request to the city agency in a probabilistic manner based on the private cost function $g_i$ and the previous average allocation of level 1 and level 2 charging points. Based on the request, the city agency allocates one of the charging points, both, or none. Furthermore, the car users do not share their cost functions or the history of their allocations with other car users or with the city agency. Note a limitation of this application: following the proposed algorithm, a car user can in some cases receive access to both level 1 and level 2 charging points for a single car, which may not be desired in real-life scenarios.

We present simulation results of automatic allocation of charging points here.
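The probabilistic request step described above can be sketched as follows. This is a minimal illustration and not the paper's Algorithm 1 or Algorithm 2: it assumes that a car's request for one type of charging point is a single Bernoulli draw with a given probability, and that the average allocation is the running mean of the past draws. The function names `draw_request` and `update_average`, the fixed probability 0.3, and the number of steps are hypothetical.

```python
import random


def draw_request(prob, rng):
    """Return 1 with probability `prob`, else 0 (a Bernoulli draw).

    Assumption: the car's request for one type of charging point is
    modeled as a single Bernoulli random variable.
    """
    return 1 if rng.random() < prob else 0


def update_average(y_prev, steps_so_far, xi):
    """Fold the newest draw `xi` into the running mean of past draws.

    `y_prev` is the mean over the first `steps_so_far` draws.
    """
    return (steps_so_far * y_prev + xi) / (steps_so_far + 1)


# Hypothetical usage: one car, one charger type, fixed request probability.
rng = random.Random(0)
y = 0.0
for k in range(1000):
    xi = draw_request(0.3, rng)
    y = update_average(y, k, xi)
# y is now the empirical request frequency over the 1000 draws.
```

In the setting of this section the probability would not be fixed; it would be recomputed at every step from the car's cost function and its current average allocation.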
We observe that the electric car users receive optimal allocations of both types of charging points and minimize the overall CO₂ emission. Moreover, we observe in Figure 1(a) that the long-term average allocations of charging points of electric cars converge to their respective optimal values.

As described earlier in (3), to show the optimality of the solution, the derivatives of the cost functions of all the cars with respect to a particular type of charger should reach consensus. The profile of derivatives of cost functions of the cars with respect to level 1 and level 2 chargers for a single simulation is illustrated in Figures 1(b) and 1(c), respectively. We observe that they converge with time and hence reach consensus, which meets the KKT conditions for optimality. Note that we use the third and subsequent terms of (19) to calculate the derivative $\nabla_j g_i$, which shifts its value by the constants $a$ or $b$ without affecting the KKT points, but it provides faster convergence in the simulation. The empirical results thus obtained show the convergence of the long-term average allocations of charging points to their respective optimal values using the consensus of derivatives of the cost functions, which results in the optimum emission of CO₂. We also observed that $\sigma_i^j(k)$ is in $(0, 1)$ most of the time with the current values of $\Omega^j(k)$ and $\tau^j$, with a few initial overshoots. To overcome the overshoots of the probability $\sigma_i^j(k)$, we use

$$\sigma_i^j(k) = \min\left\{ \Omega^j(k) \frac{y_i^j(k)}{\nabla_j g_i\big(y_i^1(k), y_i^2(k), \ldots, y_i^m(k)\big)},\; 1 \right\},$$

for all $i$, $j$ and $k$.

Figure 2(a) illustrates the sum of the average allocations $\sum_{i=1}^{n} y_i^j(k)$ over time. We observe that the sum of the average allocations of charging points converges to the respective capacity over time; that is, for large $k$, $\sum_{i=1}^{n} y_i^j(k) \approx C^j$, for all $j$. We further illustrate the utilization of charging points for the last time steps in Figure 2(b).
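The capped request probability used above to handle overshoots can be written as a small helper. This is a minimal sketch, assuming the derivative passed in is strictly positive; the function name `request_probability` and the numeric values are hypothetical and are not taken from the simulation.

```python
def request_probability(omega, y_avg, grad):
    """Capped probability min{omega * y_avg / grad, 1}.

    omega : normalization factor broadcast by the city agency
    y_avg : the car's current average allocation of this charger type
    grad  : derivative of the car's cost with respect to this charger type
            (assumed strictly positive in this sketch)
    """
    if grad <= 0:
        raise ValueError("this sketch assumes a positive derivative")
    return min(omega * y_avg / grad, 1.0)


# Hypothetical values: the first stays below 1, the second is capped at 1.
p_small = request_probability(omega=0.5, y_avg=0.5, grad=2.0)
p_capped = request_probability(omega=10.0, y_avg=0.9, grad=1.0)
```

Capping at 1 only changes the steps where the ratio overshoots; whenever the ratio already lies in (0, 1), the value is returned unchanged.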
It is observed that most of the time the total allocation of charging points is concentrated around its capacity. To reduce the overshoot of the total allocation of level $j$ charging points, we assume a constant $\gamma^j < 1$ and modify the algorithm of the city agency (control unit) to calculate $\Omega^j(k+1)$ (cf. (15)) in the following manner:

$$\Omega^j(k+1) = \Omega^j(k) - \tau^j \Big( \sum_{i=1}^{n} \xi_i^j(k) - \gamma^j C^j \Big),$$

for $j = 1, 2$ and all $k$.

VI. CONCLUSION
We proposed a new algorithm to solve a class of multivariate resource allocation problems. The solution approach is distributed among the agents and requires no communication between agents and little communication with a central agent. Each agent can, therefore, keep its cost function private. This generalizes the unit-demand single-resource allocation algorithm of [1]. In the single-resource case, based on a constant normalization factor, we showed that the long-term average allocations of a unit-demand resource converge to optimal values; the multiple (unit-demand) resource case follows from this result. Additionally, experiments show that the long-term average allocations converge rapidly to optimum values in the multi-resource case.

Open problems are to prove convergence with a time-varying normalization factor for the single-resource as well as the multi-resource case. Another open problem is to analyze the rate of convergence. In terms of applications, our proposed approach can be used to allocate resources such as Internet-of-Things (IoT) devices in hospitals and smart grids, to name a few. It can also be used to allocate virtual machines to users in cloud computing.

REFERENCES
[1] W. M. Griggs, J. Y. Yu, F. R. Wirth, F. Hausler, and R. Shorten, “On the design of campus parking systems with QoS guarantees,”
IEEE Trans. Intelligent Transportation Systems, vol. 17, no. 5, pp. 1428–1437, 2016.
[2] A. R. Fioravanti, J. Marecek, R. N. Shorten, M. Souza, and F. R. Wirth, “On classical control and smart cities,” in IEEE Conference on Decision and Control, pp. 1413–1420, Dec 2017.
[3] R. Arnott and J. Rowse, “Modeling parking,” Journal of Urban Economics, vol. 45, no. 1, pp. 97–124, 1999.
[4] D. Teodorovic and P. Lucic, “Intelligent parking systems,” European Journal of Operational Research, vol. 175, no. 3, pp. 1666–1681, 2006.
[5] T. Lin, H. Rivano, and F. L. Moul, “A survey of smart parking solutions,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 12, pp. 3229–3253, Dec 2017.
[6] I. Jones, “Road space allocation: the intersection of transport planning, governance and infrastructure,” 2014.
[7] T. Raviv and O. Kolka, “Optimal inventory management of a bike-sharing station,” IIE Transactions, vol. 45, no. 10, pp. 1077–1093, 2013.
[8] P. DeMaio, “Bike-sharing: History, impacts, models of provision, and future,” Journal of Public Transportation, vol. 12, no. 4, pp. 41–56, 2009.
[9] S. E. Alam, R. Shorten, F. Wirth, and J. Y. Yu, “Distributed algorithms for Internet-of-Things-enabled prosumer markets: A control theoretic perspective,” Analytics for the Sharing Economy: Mathematics, Engineering and Business Perspectives, (forthcoming), Springer, 2019, (preprint: arXiv:1812.07636 [cs.SY]).
[10] E. Crisostomi, R. Shorten, and F. Wirth, “Smart cities: A golden age for control theory? [industry perspective],” IEEE Technology and Society Magazine, vol. 35, no. 3, pp. 23–24, Sep 2016.
[11] A. R. Fioravanti, J. Marecek, R. N. Shorten, M. Souza, and F. R. Wirth, “On the ergodic control of ensembles,” arXiv:1807.03256 [math.OC], 2018.
[12] R. Cogill, O. Gallay, W. Griggs, C. Lee, Z. Nabi, R. Ordonez, M. Rufli, R. Shorten, T. Tchrakian, R. Verago, F. Wirth, and S. Zhuk, “Parked cars as a service delivery platform,” in International Conference on Connected Vehicles and Expo, pp. 138–143, Nov 2014.
[13] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[14] F. Wirth, S. Stuedli, J. Y. Yu, M. Corless, and R. Shorten, “Nonhomogeneous place-dependent Markov chains, unsynchronised AIMD, and network utility maximization,” accepted in the Journal of the ACM, 2019, (preprint: arXiv:1404.5064 [math.OC]).
[15] S. E. Alam, R. Shorten, F. Wirth, and J. Y. Yu, “Communication-efficient distributed multi-resource allocation,” in IEEE International Smart Cities Conference, pp. 1–8, Sep 2018.
[16] H. Nabati and J. Y. Yu, “Distributed, private, and derandomized allocation algorithm for EV charging,” in IEEE International Smart Cities Conference, pp. 1–8, Sep 2018.
[17] K. Chaturvedi, J. Y. Yu, and S. Rao, “Distributed and efficient resource balancing among many suppliers and consumers,” in IEEE International Conference on Systems, Man, and Cybernetics, pp. 3584–3589, Oct 2018.
[18] S. E. Alam, R. Shorten, F. Wirth, and J. Y. Yu, “Derandomized distributed multi-resource allocation with little communication overhead,” in Allerton Conference on Communication, Control, and Computing, pp. 84–91, Oct 2018.
[19] V. S. Borkar, Stochastic approximation. Cambridge University Press, 2008.
[20] “Fast facts: U.S. transportation sector GHG emissions 1990–2015,” United States Environmental Protection Agency, EPA-420-F-17-013, July 2017.
[21] J. Sousanis, “World vehicle population tops 1 billion units,” Aug 2015.
[22] “Emissions from hybrid and plug-in electric vehicles,” U.S. Department of Energy - Energy Efficiency and Renewable Energy, Alternative Fuels Data Center, Feb 2018.
[23] S. Schey, “Canadian electric vehicle infrastructure deployment guidelines,” 2014.
[24] M. Yilmaz and P. T. Krein, “Review of battery charger topologies, charging power levels, and infrastructure for plug-in electric and hybrid vehicles,” IEEE Transactions on Power Electronics, vol. 28, no. 5, pp. 2151–2169, May 2013.
[25] Q. Wang, X. Liu, J. Du, and F. Kong, “Smart charging for electric vehicles: A survey from the algorithmic perspective,”