A Decentralized Multi-Objective Optimization Algorithm
Maude J. Blondin · Matthew Hale
Abstract
During the past two decades, multi-agent optimization problems have drawn increased attention from the research community. When multiple objective functions are present among agents, many works optimize the sum of these objective functions. However, this formulation implies a decision regarding the relative importance of each objective function. In fact, optimizing the sum is a special case of a multi-objective problem in which all objectives are prioritized equally. In this paper, a distributed optimization algorithm that explores Pareto optimal solutions for non-homogeneously weighted sums
of objective functions is proposed. This exploration is performed through a new rule based on agents' priorities that generates edge weights in agents' communication graph. These weights determine how agents update their decision variables with information received from other agents in the network. Agents initially disagree on the priorities of the objective functions, though they are driven to agree upon them as they optimize. As a result, agents still reach a common solution. The network-level weight matrix is (non-doubly) stochastic, which contrasts with many works on the subject in which it is doubly stochastic. New theoretical analyses are therefore developed to ensure convergence of the proposed algorithm. This paper provides a gradient-based optimization algorithm, proof of convergence to solutions, and convergence rates of the proposed algorithm. It is shown that agents' initial priorities influence the convergence rate of the proposed algorithm and that these initial choices affect its long-run behavior. Numerical results performed with different numbers of agents illustrate the performance and efficiency of the proposed algorithm.

Maude J. Blondin (corresponding author)
Université de Sherbrooke, Sherbrooke, Canada
[email protected]

Matthew Hale
University of Florida, Gainesville, Florida
matthewhale@ufl.edu
Keywords
Multi-agent systems · Distributed Optimization · Pareto Front · Multi-objective Optimization
1 Introduction

Over the last two decades, multi-agent systems have attracted significant interest [1]-[3]. In particular, the study of the consensus problem, where agents have to agree on a common value, has been motivated by emerging applications such as formation control [4]. The consensus problem has been extended to multi-agent optimization, i.e., agents collectively work towards minimizing a sum of objective functions by minimizing a local objective and repeatedly averaging their iterates to reach agreement on a final answer. One common approach in problems with many objectives is optimizing their sum, with each agent independently optimizing only one of the objective functions [6]-[19]. However, optimizing the sum carries an implicit decision about the problem formulation, namely that all the objectives have the same priority and that all agents agree on these priorities.

Equal prioritization among functions represents a special case of a multi-objective problem, and applications in which objectives may have different importance are easy to envision. For instance, in a fleet of self-driving cars, agents may have different priorities in trajectory planning, such as minimizing fuel usage vs. travel time, or in a collection of smart buildings, agents may have different preferences regarding the management of their energy [20].

A large body of work on multi-objective optimization to solve problems of this kind has emerged for centralized cases. The Tchebycheff method, the weighting method, and the ε-constraint method [21] are examples of algorithms for centralized multi-objective optimization problems. More algorithms of this category are surveyed in [21][22]. Such algorithms explore the Pareto optimal set using different prioritizations of the objective functions of the problem. With regard to these techniques, minimizing the sum of objective functions leads to a single element of the Pareto Front. Further exploring this front can provide additional optimal solutions in different senses.
For multi-agent systems, exploring the Pareto Front would provide a larger range of operating conditions for systems based on agents' needs, which can be encoded in heterogeneous weights on objectives. To the best of our knowledge, such methods remain largely unexplored in a multi-agent context.

This paper proposes a distributed algorithm for multi-agent multi-objective set-constrained problems, and the proposed algorithm enables the exploration of the Pareto Front. In particular, a team of m agents optimizes the weighted sum of convex cost functions f(x) = ∑_{i=1}^{m} w_i f_i(x), where agent i minimizes f_i. A common convex set constrains the agents. At the beginning of the optimization process, agents have an initial vector of priorities encoded as weights and an initial vector of decision variables. The proposed algorithm performs four steps at each iteration: i) agent i updates its vector of priorities using those received from other agents in the network, ii) the vectors of priorities are used to generate the matrix of information weights for the decision variable update, iii) agent i updates its vector of decision variables with the generated matrix and the decision variables received from its neighbors, and iv) agent i takes a gradient descent step and projects its estimate onto the constraint set.

The proposed algorithm belongs to a class of averaging-based distributed optimization algorithms, e.g., [6][8]-[13]. The existing literature predominantly considers problems with doubly-stochastic weights on agents' information exchanges. Indeed, many works rely on the doubly-stochasticity assumption in their models to provide convergence rates and proofs of convergence [6][7][8][14]-[17].
Computing the infinite product of doubly-stochastic matrices simplifies the analysis of agents' computations, and there exist several rules that ensure the information matrix is doubly stochastic, such as Metropolis-based weights [23] and the equal-neighbor model [24][25]. These rules restrict communication among agents and do not allow agents to individually prioritize information received from other agents in the network. In addition, these rules require coordination among agents to select admissible information weights, which can be difficult to achieve if communicating is difficult or costly. The proposed algorithm addresses the limitations related to the doubly-stochasticity assumption in addition to giving agents increased flexibility in their choices. In particular, the following aspects distinguish our algorithm from the existing literature:

– Agents independently prioritize the information received from their neighbors. The sum of each agent's preferences must be 1. While individual agents can easily ensure that their preferences sum to 1, this implies that agents need not know or consider other agents' preferences. Therefore, the preferences of all agents for a particular objective function need not sum to 1.

– This independence regarding prioritization of objective functions leads to a network-level information exchange matrix that is (non-doubly) stochastic.

– While the agents are reaching an agreement on preferences, they explore the Pareto Front of objective functions. This front exploration leads to optimal
solutions in different senses, which provides broader operating conditions for systems in conformity with agents' needs/preferences.

Because of these distinctions, new theoretical analysis is required to ensure algorithm convergence. In this paper, the proposed algorithm operates over an undirected graph with time-varying weights, and the constraint set is the same for all agents. Theoretical analysis shows that the proposed algorithm drives agents to a common solution. Agents simultaneously reach an agreement on their preferences and compute the optimum with respect to these preferences. Also, we develop convergence rates for the proposed algorithm, which are shown to be significantly influenced by agents' preferences. Numerical simulations show the convergence of the proposed algorithm to the optimal solution along with its convergence rate. The agents' agreement on preferences is also illustrated. Simulations further show that agents' initial preferences directly influence the final results of their computations. This paper is an extension of [26]; it adds proofs of convergence and convergence rates, in addition to new simulation results.

The rest of the paper is organized as follows. Section 2 presents background on graph theory and multi-agent interactions. The multi-agent optimization model and the proposed distributed optimization algorithm are provided in Section 3. Section 4 provides proofs of convergence and convergence rates of the proposed algorithm. Section 5 presents numerical results, and Section 6 concludes the paper.
2 Background on Graph Theory and Multi-Agent Interactions

In this paper, agents' interactions are represented by a connected and undirected graph G = (V, E), where V = [m] := {1, 2, ..., m} is the set of agents and E ⊂ V × V is the set of edges. An edge exists between agents i and j, i.e., (i, j) ∈ E, if agent i communicates with agent j. By convention, (i, i) ∉ E for all i. The degree of agent i is the total number of agents that agent i communicates with, denoted deg(i). The degree matrix, denoted Δ(G), is a diagonal m × m matrix with deg(i) on its diagonal for i = 1, ..., m. The maximum vertex degree of G is Δ_max = max_{i ∈ [m]} deg(i).

The adjacency matrix is an m × m matrix denoted H(G), whose (i, j) entry h_ij is defined as h_ij = 1 if (i, j) ∈ E and h_ij = 0 otherwise. Since G is an undirected graph without self-loops, H(G) is symmetric with zeros on its main diagonal. The Laplacian matrix associated with G is also symmetric, and is defined as

L(G) = Δ(G) − H(G). (1)

In this paper, we consider an arbitrary graph G, and, because G is unambiguous, we will simply write its Laplacian as L.
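The construction above can be sketched numerically; the 4-agent edge set below is a hypothetical example, not one used in the paper:

```python
import numpy as np

# Build the degree, adjacency, and Laplacian matrices for a small
# undirected graph, following L(G) = Delta(G) - H(G). The 4-agent
# edge set is a hypothetical example.
m = 4
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]  # undirected, no self-loops

H = np.zeros((m, m))            # adjacency matrix
for i, j in edges:
    H[i, j] = H[j, i] = 1       # symmetric since G is undirected

Delta = np.diag(H.sum(axis=1))  # degree matrix, deg(i) on the diagonal
L = Delta - H                   # graph Laplacian

# For an undirected graph, L is symmetric and its rows sum to zero.
assert np.allclose(L, L.T)
assert np.allclose(L.sum(axis=1), 0)
```

The zero row sums reflect that the all-ones vector is always in the null space of the Laplacian, which is what the averaging step in Section 3 exploits.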
3 Multi-Agent Optimization Model and Algorithm

In this section, we formally define the class of problems to be solved. Then we propose a multi-objective multi-agent update law for solving them.

3.1 Problem Formulation

In this paper, problems in which agents minimize a prioritized sum of convex objective functions are considered. Agent i minimizes only the function f_i, about which we make the following assumption.

Assumption 1
For all i ∈ {1, ..., m}, the function f_i : R^n → R is continuously differentiable and convex. △

All agents' decision variables are constrained to lie in the set X, about which we assume the following.

Assumption 2
X is non-empty, compact, and convex. △

We next consider the following optimization problem.
Problem 3.1
Given functions {f_i}_{i ∈ {1,...,m}} satisfying Assumption 1,

minimize ∑_{i=1}^{m} w_i f_i(x), (2)
subject to x ∈ X,

where x is the vector of decision variables, w_i is a priority assigned to f_i, ∑_{i=1}^{m} w_i = 1, and 0 < w_i < 1 for all i. Agent i knows only its objective function f_i. The constraint set X is identical for all agents. ⋄

For centralized problems, the priorities {w_i}_{i ∈ {1,...,m}} are fixed. Therefore, a standard convex optimization method could solve Problem 3.1 in a centralized way. However, for decentralized cases, agents may choose different priorities. Agent i may choose {w_il}_{l ∈ {1,...,m}} while agent j chooses {w_jl}_{l ∈ {1,...,m}}, with w_il ≠ w_jl for some l.

As this occurs, these priorities provide each agent with the flexibility to have preferences. For instance, mobile autonomous agents generating a trajectory may want to optimize fuel usage and travel time, and each agent can prioritize these two objectives according to its own needs. If agents' priorities differ, agents are solving different problems because they minimize different overall objective functions. As a result, reaching a common solution requires devising an optimization algorithm that drives agent priorities to a common value.

Changing agents' priorities from their initial values implies that no single agent's preferences are obeyed exactly. However, the net change across all agents can be done fairly. One such way is to drive all agents' priorities to their average value. While one could envision first computing the average priorities and then optimizing, this is undesirable because it requires solving two separate problems sequentially. Instead, we devise an update law that drives
Also, this interlacing enables agents to continuously modifytheir preferences based on the task at hand.3.2 Proposed Update LawAt iteration k , agent i updates its priority vector w i . Agent i assigns a priorityto all agents (corresponding to the objective function updated by that agent),even though agent i does not communicate with all agents. This provides agent i with a way to influence all final priorities, and, as will be shown below, affectthe final results agents attain. Agent i also updates its decision vector x i byadding the weighted estimates received from its neighbors, then minimizingits objective function f i through gradient step, and then projecting its newestimate on its constraint set X . We have w i ( k + 1) = w i ( k ) + c n X j =1 h ji ( w j ( k ) − w i ( k )) (3) a ij ( k + 1) = q ij w ij ( k ) + m X j = ij =1 w ij ˜ q ij (4) v i ( k ) = n X j =1 a ij ( k ) x j ( k ) (5) x i ( k + 1) = P X [ v i ( k ) − α k d i ( k )] , (6) Decentralized Multi-Objective Optimization Algorithm 11 where 0 < c < /∆ max , h ij ( k ) is the j th i th entry of H ( G ), a ij ( k ) is the weightthat agent i assigns to the data provided by agent j at iteration k , q ij is the j th i th entry of Q , where Q = H + I and I is the identity matrix, ˜ q ij = 1 − q ij , α k isthe gradient step size for all agents at time k , P X i is the projection operation,and d i is the gradient vector of agent i at x i ( k ). Formally, d i ( k ) = ∇ f i ( x i ( k )).The equivalent network-level representation of (3)-(10) is W ( k + 1) = P W ( k ) , (7)where P = I − cL ( G ) and W ( k ) is the column matrix of agents’ priorities,along with A ( k ) in which its column vectors are a i ( k ) for i = { , . . . , m } , and V ( k ) in which its column vectors are v i ( k ) for i = { , . . . , m } . A ( k ) = Q ◦ W ( k ) + ( W ( k ) ◦ ˜ Q ) J ◦ I, (8) V ( k ) = A ( k ) X ( k ) , (9) X ( k + 1) = P X c [ V ( k ) − α k D ( k )] . 
where ∘ denotes the Hadamard product, J is the all-ones matrix, P_X^c is the projection operation that projects each column of V(k) − α_k D(k) individually, D(k) is the matrix collecting d_i(k) for i = 1, ..., m, and X(k) is the matrix collecting x_i(k) for i = 1, ..., m. In line with the multi-objective optimization concept, our algorithm uses the priority vectors w_i, i = 1, ..., m, to quantify the importance of the information received to update x_i. Contrary to most existing works, the matrix A(k) is a function of W(k), and A(k) is stochastic instead of doubly stochastic. This occurs because an agent can ensure that its own weights sum to 1, though different agents' weights for a particular objective need not sum to 1. This implies that A(k)'s row sums need not equal 1.

In (8), Q ∘ W(k) computes the Hadamard product between Q and W(k); the resulting matrix contains w_ij(k) for (i, j) ∈ E, w_ii(k) for all i, and zeros elsewhere. Therefore, if agent i does not communicate with agent j, a zero weight is assigned to agent j. Regarding the second term of (8), ((W(k) ∘ Q̃)J) ∘ I creates a diagonal matrix whose diagonal entries are the row sums of W(k) ∘ Q̃. Summing the first and second terms in (8) means that agent i assigns to itself the weights w_ij for (i, j) ∉ E and assigns a zero value to the (i, j) entries with (i, j) ∉ E.

The next lemma pertains to the weights of the A matrix and the communication between agents.

Lemma 3.1
Since a_ij(k) is obtained from (3) and (4), we have:
1. a_ii(k) ≥ min_{j ∈ [m]} min_{i ∈ [m]} w_ij(0) for all k ≥ 0 and all i.
2. a_ij(k) ≥ min_{j ∈ [m]} min_{i ∈ [m]} w_ij(0) for all k ≥ 0 and all (i, j) ∈ E.
3. a_ij(k) = 0 for all k if (i, j) ∉ E.

Proof See Appendix.

To simplify the forthcoming development, Eq. (10) can be written as follows [5]:
x_i(k+1) = v_i(k) − α_k d_i(k) + φ_i(k), (11)

φ_i(k) = P_X [v_i(k) − α_k d_i(k)] − v_i(k) + α_k d_i(k). (12)

For all i and for all k and s with k > s, the above equivalent form allows us to express the decision variable update over time as

x_i(k+1) = ∑_{j=1}^{m} [Φ(k, s)]_ij x_j(s) − ∑_{r=s}^{k−1} ∑_{j=1}^{m} [Φ(k, r+1)]_ij α_r d_j(r) − α_k d_i(k) + ∑_{r=s}^{k−1} ∑_{j=1}^{m} [Φ(k, r+1)]_ij φ_j(r) + φ_i(k), (13)

where the transition matrix Φ(k, s) = A(k) A(k−1) ⋯ A(s) [6].

4 Convergence Analysis

This section provides the convergence analysis for the proposed algorithm (3)-(10). The following well-known lemma confirms that the priority update (3) does indeed compute average priorities.
Lemma 4.1 lim_{k→∞} w_i(k) = w̄ = (∑_{j=1}^{m} w_j(0))/m for all i = 1, ..., m. At the network level, lim_{k→∞} W(k) = W̄, where W̄ = 1 w̄^⊺.

Proof See [27][28].
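The priority update (3) in its network form (7) can be sketched as follows; the 3-agent path graph and initial priorities are hypothetical examples, and the sketch takes row i of W to be agent i's priority vector:

```python
import numpy as np

# Sketch of the priority-averaging step (3) in network form (7):
# W(k+1) = P W(k) with P = I - c L(G) and 0 < c < 1/Delta_max.
# Graph and initial priorities are hypothetical examples.
H = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])          # 3-agent path graph
L = np.diag(H.sum(axis=1)) - H       # Laplacian L = Delta - H
c = 0.9 / H.sum(axis=1).max()        # step satisfies c < 1/Delta_max

W = np.array([[0.8, 0.1, 0.1],       # each agent's priorities sum to 1,
              [0.1, 0.8, 0.1],       # but the weights on a given objective
              [0.2, 0.3, 0.5]])      # across agents need not sum to 1
w_bar = W.mean(axis=0)               # predicted consensus (Lemma 4.1)

P = np.eye(3) - c * L
for _ in range(500):
    W = P @ W                        # W(k+1) = P W(k)

# Every agent's priority vector converges to the initial average.
assert np.allclose(W, np.tile(w_bar, (3, 1)), atol=1e-8)
```

Because P is symmetric and stochastic with the all-ones vector as an eigenvector, repeated multiplication averages the initial priority vectors, exactly as Lemma 4.1 states.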
From Assumption 1, the gradient of each f_i is continuous, and from Assumption 2, X is compact. Therefore, there is an L such that ||d_i(k)|| ≤ L for all i and k. From this statement, Lemma 4.2 follows.

Lemma 4.2
The errors φ_i(k) satisfy ||φ_i(k)|| ≤ α_k L for all i and k.

Proof See [5].

The next lemma describes the convergence behavior of Φ(k, s).

Lemma 4.3
From Lemma 3.1, the convergence of Φ(k, s) is geometric according to

|[Φ(k, s)]_ji − γ_j(s)| ≤ C β^{k−s}, (14)

where B = m − 1, m is the number of agents, γ_j(s) = lim_{k→∞} [Φ(k, s)]_ji,

C = 2 (1 + (min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0))^{−B}) / (1 − (min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0))^{B}),

and β = (1 − (min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0))^{B})^{1/B}.

Proof See Lemma 3 and Lemma 4 in [6] and Lemma 3.1 above.

To prove the convergence results, we use the following lemmas [5].
Lemma 4.4
Assume 0 < ρ < 1, let {λ_k}_{k ∈ N} be a positive scalar sequence, and let lim_{k→∞} λ_k = 0. Then

lim_{k→∞} ∑_{l=0}^{k} ρ^{k−l} λ_l = 0.

Moreover, if ∑_{k} λ_k < ∞, we have

∑_{k=1}^{∞} ∑_{l=0}^{k} ρ^{k−l} λ_l < ∞.
Proof
See proof for Lemma 7 in [5].
Lemma 4.5
Assume that X is a nonempty closed convex set in R^n. Then, for any x ∈ R^n,

||P_X[x] − y||² ≤ ||x − y||² − ||P_X[x] − x||² for all y ∈ X.

Proof See the proof of Lemma 1(b) in [5].
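A quick numerical check of this projection inequality, using the box X = [−1, 1]^n, whose Euclidean projection is a coordinate-wise clip (the random test points are hypothetical data):

```python
import numpy as np

# Check the projection inequality of Lemma 4.5 on a box constraint set
# X = [-1, 1]^5, whose Euclidean projection is a coordinate-wise clip.
rng = np.random.default_rng(0)

def proj(x):
    return np.clip(x, -1.0, 1.0)   # projection onto [-1, 1]^5

for _ in range(1000):
    x = rng.normal(scale=3.0, size=5)       # arbitrary point in R^5
    y = rng.uniform(-1.0, 1.0, size=5)      # point inside X
    px = proj(x)
    lhs = np.sum((px - y) ** 2)
    rhs = np.sum((x - y) ** 2) - np.sum((px - x) ** 2)
    assert lhs <= rhs + 1e-12               # Lemma 4.5 holds
```

This inequality is what lets the analysis below trade a projection error ||P_X[x] − x||² against the distance to any feasible point.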
Lemma 4.6
Let x_i(k) be generated by (9)-(10). For any z ∈ X and all k ≥ 0, we have

∑_{i=1}^{m} ||x_i(k+1) − z||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − z||² + α_k² ∑_{i=1}^{m} ||d_i(k)||² − 2α_k ∑_{i=1}^{m} (f_i(v_i(k)) − f_i(z)) − ∑_{i=1}^{m} ||φ_i(k)||².

Proof
See Appendix.

The following lemma demonstrates that disagreements between agents go to 0, namely that ||x_i(k) − x_j(k)|| → 0 as k → ∞. To assess agent disagreements, we consider agents' disagreements with the average of their decision variables,

y(k) = (1/m) ∑_{j=1}^{m} x_j(k). (15)

In view of (9) and (11), we have

y(k+1) = (1/m) ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) x_j(k) − (α_k/m) ∑_{i=1}^{m} d_i(k) + (1/m) ∑_{i=1}^{m} φ_i(k). (16)

Lemma 4.7
Let the iterates x_i(k) be generated by the algorithm (9)-(10) and consider {y(k)}_{k ∈ N} defined in (15).

(a) If the stepsize is decreasing with lim_{k→∞} α_k = 0, then lim_{k→∞} ||x_i(k) − y(k)|| = 0 for all i.

(b) If ∑_{k=1}^{∞} α_k² < ∞, then ∑_{k=1}^{∞} α_k ||x_i(k) − y(k)|| < ∞ for all i.

Proof
See Appendix.

From Lemma 4.7(a), the following theorem is obtained regarding the convergence rate of ||x_i(k) − y(k)||. As agents' disagreements go to 0 as k → ∞ (Lemma 4.7(a)), this theorem shows the rate at which agents reach agreement on their decision variables.

Theorem 4.1
Following Assumption 2, there is an M such that ∑_{j=1}^{m} ||x_j(0)|| ≤ M. Let ε > 0 be given and let K be the first time that α_k ≤ ε. Let C and β be defined as in Lemma 4.3. If 0 < min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0) < 1, α_k is decreasing, and lim_{k→∞} α_k = 0, then for all k ≥ K + 3 we have

||x_i(k) − y(k)|| ≤ mCM β^{k−1} + 4mCL α_1 β^{k−K}/(1 − β) + 4 α_{k−1} L + 4mCL ε/(1 − β).
Proof
Recall (56) and β = (1 − (min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0))^{B})^{1/B}:

||x_i(k) − y(k)|| ≤ mCβ^{k−1} ∑_{j=1}^{m} ||x_j(0)|| + 4mCL ∑_{r=0}^{k−1} β^{k−r} α_r + 4α_{k−1}L.

Then, we have

||x_i(k) − y(k)|| ≤ mCMβ^{k−1} + 4mCL ∑_{r=0}^{k−1} β^{k−r} α_r + 4α_{k−1}L. (17)

Suppose we have an arbitrary ε > 0 and let K be defined so that α_r ≤ ε for all r ≥ K (such a K exists since α_r → 0). We therefore have

∑_{r=0}^{k−1} β^{k−r} α_r ≤ ∑_{r=0}^{K} β^{k−r} α_r + ε ∑_{r=K+1}^{k−1} β^{k−r} ≤ max_{0 ≤ t ≤ K} α_t ∑_{r=0}^{K} β^{k−r} + ε ∑_{r=K+1}^{k−1} β^{k−r}. (18)

Because ∑_{r=K+1}^{k−1} β^{k−r} ≤ 1/(1 − β), we obtain

∑_{r=0}^{k−1} β^{k−r} α_r ≤ max_{0 ≤ t ≤ K} α_t ∑_{r=0}^{K} β^{k−r} + ε/(1 − β). (19)

Similarly, since ∑_{r=0}^{K} β^{k−r} ≤ β^{k−K}/(1 − β), we obtain for all k ≥ K + 3,

∑_{r=0}^{k−1} β^{k−r} α_r ≤ max_{0 ≤ t ≤ K} α_t β^{k−K}/(1 − β) + ε/(1 − β). (20)

Inserting (20) into (17), we get for k ≥ K + 3,

||x_i(k) − y(k)|| ≤ mCMβ^{k−1} + 4mCL [max_{0 ≤ t ≤ K} α_t β^{k−K}/(1 − β) + ε/(1 − β)] + 4α_{k−1}L. (21)

Because α_t is decreasing, we obtain for all k ≥ K + 3,

||x_i(k) − y(k)|| ≤ mCMβ^{k−1} + 4mCLα_1 [β^{k−K}/(1 − β) + ε/(1 − β)] + 4α_{k−1}L. (22) □

The convergence rate is affected by the value of β. Recall β = (1 − (min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0))^{B})^{1/B}, meaning the value of β is a function of the minimum initial priority and the number of agents. The convergence rate slows down as the minimum initial agent weight decreases and as the number of agents increases. Agents should therefore carefully choose their preferences. A small initial priority would make the convergence rate very slow, which can harm algorithm performance. This suggests that agents' priorities must balance the need for attaining a high-quality final result against a reasonable convergence rate. Along the same lines, an extremely large team of agents slows the convergence rate: as the number of agents increases, the preferences associated to the objective functions tend to be smaller, since each agent's preferences sum to 1.

Based on Lemmas 4.6 and 4.7, the next theorem presents the asymptotic convergence of the proposed algorithm. In distinction to [5], it is shown that the iterates x_i(k) converge to an optimal solution for an information exchange matrix A that is (non-doubly) stochastic, whose weights are obtained from agents' priorities via (8).
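As a rough numerical illustration of the rate's dependence on the minimum initial priority, one can evaluate β = (1 − η^B)^{1/B} from Lemma 4.3, writing η for min_{j,i} w_ji(0); the sampled values of η and m below are hypothetical:

```python
# Effect of the minimum initial priority (eta) and team size m on the
# geometric rate beta = (1 - eta**B)**(1/B), B = m - 1, from Lemma 4.3.
# Larger beta (closer to 1) means slower convergence.
def beta(eta, m):
    B = m - 1
    return (1 - eta ** B) ** (1.0 / B)

for m in (3, 10, 50):
    for eta in (0.3, 0.05, 0.001):
        # For large teams, beta is numerically indistinguishable from 1.
        print(f"m={m:3d}, eta={eta:6.3f} -> beta={beta(eta, m):.6f}")
```

Decreasing η or increasing m pushes β toward 1, matching the discussion above: small minimum priorities and large teams slow agreement.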
Theorem 4.2
Let the iterates x_i(k) be generated by (7)-(10) with stepsizes satisfying the conditions of Lemma 4.7. Assume that the set of optimal solutions X* is nonempty. Then there exists an optimal point x* ∈ X* such that

lim_{k→∞} ||x_i(k) − x*|| = 0 for all i.

Proof
From Lemma 4.6, we have

∑_{i=1}^{m} ||x_i(k+1) − z||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − z||² + α_k² ∑_{i=1}^{m} ||d_i(k)||² − 2α_k ∑_{i=1}^{m} (f_i(v_i(k)) − f_i(z)) − ∑_{i=1}^{m} ||φ_i(k)||².

Using the gradient bound and removing the last nonpositive term on the right-hand side, we get

∑_{i=1}^{m} ||x_i(k+1) − z||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − z||² + α_k² mL² − 2α_k ∑_{i=1}^{m} (f_i(v_i(k)) − f_i(y(k))) − 2α_k (f(y(k)) − f(z)). (23)

Considering the gradient boundedness and the stochasticity of the weights, we have

|f_i(v_i(k)) − f_i(y(k))| ≤ L ||v_i(k) − y(k)|| ≤ L ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − y(k)||. (24)

Summing (24) over m and using it in (23), we obtain

∑_{i=1}^{m} ||x_i(k+1) − z||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − z||² + α_k² mL² + 2α_k L ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(k) ||x_j(k) − y(k)|| − 2α_k (f(y(k)) − f(z)). (25)

Considering z = x* ∈ X* and restructuring the terms, we get

∑_{i=1}^{m} ||x_i(k+1) − x*||² + 2α_k (f(y(k)) − f(x*)) ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − x*||² + α_k² mL² + 2α_k L ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − y(k)||. (26)

By summing (26) over an arbitrary window from some positive integer K to N with K < N, we obtain

∑_{i=1}^{m} ||x_i(N+1) − x*||² + 2 ∑_{k=K}^{N} α_k (f(y(k)) − f(x*)) ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(K)) ||x_j(K) − x*||² + mL² ∑_{k=K}^{N} α_k² + 2L ∑_{k=K}^{N} α_k ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − y(k)||. (27)

With K = 1 and N → ∞ in (27), using ∑_{k=1}^{∞} α_k² < ∞ and ∑_{k=1}^{∞} α_k ∑_{j=1}^{m} ||x_j(k) − y(k)|| < ∞, which is a result of Lemma 4.7, we have

∑_{k=1}^{∞} α_k (f(y(k)) − f(x*)) < ∞.
Because x_j(k) ∈ X for all j, y(k) ∈ X for all k. Given that x* ∈ X*, f(y(k)) − f* ≥ 0 for all k. As a result of this relation, the assumption that ∑_{k=1}^{∞} α_k = ∞, and ∑_{k=1}^{∞} α_k (f(y(k)) − f(x*)) < ∞, we obtain

lim inf_{k→∞} (f(y(k)) − f(x*)) = 0. (28)

The forthcoming development demonstrates that agents converge to the optimal point x*. The nonnegative term on the left-hand side of (27) can be removed. Therefore, we have

∑_{i=1}^{m} ||x_i(N+1) − x*||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(K)) ||x_j(K) − x*||² + mL² ∑_{k=K}^{N} α_k² + 2L ∑_{k=K}^{N} α_k ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − y(k)||. (29)

Given that ∑_k α_k² < ∞ and ∑_{k=1}^{∞} α_k ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − y(k)|| < ∞, it follows that x_i(k) is bounded for each i, and

lim sup_{N→∞} ∑_{i=1}^{m} ||x_i(N+1) − x*||² ≤ lim inf_{K→∞} ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(K)) ||x_j(K) − x*||².

This implies that the scalar sequence ∑_{i=1}^{m} ||x_i(k) − x*||² converges for every x* ∈ X*.

Given that lim_{k→∞} ||x_i(k) − y(k)|| = 0 (Lemma 4.7), {y(k)}_{k ∈ N} is bounded and the scalar sequence ||y(k) − x*|| is convergent for x* ∈ X*. Because y(k) is bounded, y(k) has a limit point. From (28), we have lim inf_{k→∞} f(y(k)) = f*. Considering the previous equality and the continuity of f, one of the limit points of {y(k)} must be in X*; denote it by x*. Therefore, ||y(k) − x*|| is convergent. Thus, lim_{k→∞} y(k) = x* and lim_{k→∞} ||x_i(k) − y(k)|| = 0, which implies that each sequence {x_i(k)} converges to the same x* ∈ X*. □

From Theorem 4.2 and Lemma 4.3, the following convergence rate is obtained.
Theorem 4.3
Let ε > 0 be given and let K be the first time that α_k ≤ ε. Using Lemma 4.3 and Theorem 4.1, we have for s ≥ K + 3,

∑_{i=1}^{m} ||x_i(k+1) − x*||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} q_ij ω_max^{k+1} ||x_j(s) − x*||² + ∑_{r=s}^{k} ∑_{i=1}^{m} ∑_{j=1}^{m} q_ij ω_max^{k+1−r} α_r L [α_r L + 4mCM β^{r−1} + 8mCLα_1 (β^{r−K}/(1 − β) + ε/(1 − β)) + 8 α_{r−1} L]. (30)

Proof
From (25), we have

∑_{i=1}^{m} ||x_i(k+1) − z||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − z||² + α_k² mL² + 2α_k L ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(k) ||x_j(k) − y(k)|| − 2α_k (f(y(k)) − f(z)). (31)

Dropping the last nonpositive term, we find

∑_{i=1}^{m} ||x_i(k+1) − x*||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) ||x_j(k) − x*||² + α_k² mL² + 2α_k L ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(k) ||x_j(k) − y(k)||. (32)

Re-arranging the terms, we have

∑_{i=1}^{m} ||x_i(k+1) − x*||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} a_ij(w_i(k)) [||x_j(k) − x*||² + α_k² L² + 2α_k L ||x_j(k) − y(k)||]. (33)

Define ω(k) = max_{j ∈ [m]} max_{i ∈ [m]} a_ij(k). The maximum value that ω(k) can take is ω_max = 1 − min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0). We therefore obtain

∑_{i=1}^{m} ||x_i(k+1) − x*||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} q_ij ω_max [||x_j(k) − x*||² + α_k² L² + 2α_k L ||x_j(k) − y(k)||]. (34)

Iterating (34) from s to k and using Theorem 4.1 to bound ||x_j(r) − y(r)||, we obtain

∑_{i=1}^{m} ||x_i(k+1) − x*||² ≤ ∑_{i=1}^{m} ∑_{j=1}^{m} q_ij ω_max^{k+1} ||x_j(s) − x*||² + ∑_{r=s}^{k} ∑_{i=1}^{m} ∑_{j=1}^{m} q_ij ω_max^{k+1−r} α_r L [α_r L + 4mCM β^{r−1} + 8mCLα_1 (β^{r−K}/(1 − β) + ε/(1 − β)) + 8 α_{r−1} L]. (35) □

The convergence rate is determined by ω_max. Since ω_max = 1 − min_{j ∈ [m]} min_{i ∈ [m]} w_ji(0), the agents' initial weights influence the convergence rate. If the smallest initial weight is extremely small, it can be detrimental to algorithm performance, as it significantly slows the convergence rate. Agents should consider balancing their need for reaching a high-quality final result against a reasonable convergence rate. Agents should avoid extreme differences between their highest and lowest priorities.

5 Numerical Results

Three simulation scenarios are run to illustrate the performance of the proposed algorithm.
The numerical studies consider quadratic functions defined as

f_i(x) = (1/2) x^⊺ Q_i x + r_i^⊺ x + c_i, (36)

where x ∈ R^n is the decision vector, Q_i ∈ R^{n×n} is a symmetric positive definite matrix, r_i ∈ R^n, and c_i ∈ R, for i = 1, ..., m. The matrix Q_i, the vector r_i, and the scalar c_i are generated randomly and are unique to each agent. Agent i knows exclusively the objective function f_i. The agents' goal is to solve the following problem using (3)-(10):

minimize ∑_{i=1}^{m} w_i f_i(x), (37)
subject to x ∈ [−1000, 1000]^n.

For all scenarios, the initial gradient step size is α_1 = 0.2 and we let α_k = α_1/k.
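A single agent's local computation in (6) for the quadratic objective (36), where ∇f_i(x) = Q_i x + r_i and the projection onto the box is a coordinate-wise clip, can be sketched as follows (the random problem data, sizes, and iteration count are hypothetical):

```python
import numpy as np

# One agent's projected gradient step (6) on a random quadratic (36):
# grad f_i(x) = Q_i x + r_i, projection onto [-1000, 1000]^n via clip.
# All problem data below are hypothetical, mirroring the setup above.
rng = np.random.default_rng(1)
n = 10
A = rng.normal(size=(n, n))
Q_i = A @ A.T + n * np.eye(n)         # symmetric positive definite
r_i = rng.normal(size=n)
c_i = rng.normal()

grad_f_i = lambda x: Q_i @ x + r_i    # gradient of (36)

x = rng.uniform(-1000, 1000, size=n)  # current estimate v_i(k)
for k in range(1, 2001):
    alpha_k = 0.2 / k                 # diminishing stepsize alpha_1 / k
    x = np.clip(x - alpha_k * grad_f_i(x), -1000, 1000)

# The iterates approach the unconstrained minimizer -Q_i^{-1} r_i,
# which for this data lies well inside the box.
x_star = -np.linalg.solve(Q_i, r_i)
assert np.linalg.norm(x - x_star) < 1.0
```

The clip implements the projection P_X for a box constraint set; in the full algorithm this step is applied to v_i(k), the weighted combination of neighbors' estimates, rather than to a lone agent's iterate.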
Fig. 1 f_2(x) in terms of f_1(x) for a team of 2 agents. It can be seen that the proposed algorithm reaches various optimal solutions. Some of the agents' priorities are shown: (0.8404, 0.1596), (0.5826, 0.4174), (0.3354, 0.6646). As the priority of f_1 diminishes, the value of f_1 increases and the value of f_2 decreases.

Table 1
Initial priorities of agents 1, 2, and 3 and the consensus priorities w̄.

The first scenario considers a team of 3 agents, i.e., m = 3, and the decision vector has 10 variables, i.e., n = 10. The team minimizes quadratic functions as defined by (36) to solve (37). The network exchanges information 100,000 times. Table 1 shows the initial agent preferences w_i and the consensus value of the priority vector, w̄. The sum of each agent's priorities equals 1, i.e., ∑_{i=1}^{m} w_ji = 1 for each agent j.

Table 2 presents the results obtained by the proposed algorithm. The first three columns correspond to the initial decision vector of each agent. The fourth column presents the final average estimate reached by the agents, i.e., y(k) for k → ∞. The last column shows the optimal solution.

The results obtained by the proposed algorithm closely approach the optimal value x*. Fig. 2 presents the algorithm's convergence rate calculated with (35) and K = 1.
Table 2
Results obtained by the proposed algorithm for Scenario 1

x_1(0)     x_2(0)     x_3(0)     x̂           x*
-728.77    -284.03    -981.79    16.72        16.71
-94.90     406.18     951.03     -0.53        -0.53
429.65     792.26     -532.88    -9.03        -9.02
14.82      -360.08    73.41      5.52         5.52
846.91     -797.02    147.99     -4.74        -4.74
-789.88    986.15     -602.51    8.16         8.15
-285.74    723.87     -584.77    1.01         1.01
-820.97    39.10      -30.25     -5.07        -5.07
634.15     -431.47    888.04     -13.59       -13.58
-352.03    361.52     -10.59     -8.68        -8.67
f(x)                             -6.1094e+03  -6.1094e+03
Convergence rate of the proposed algorithm - Team of 3 agents. The agents convergetowards the optimal solution. -2 Fig. 3
Convergence rate of the proposed algorithm for a team of 100 agents. During thefirst iterations, agents’ decision variables are close to the boundary of the constraint set,which explains the values P mi =1 || x i ( k + 1) − x ∗ || obtained during the first iterations. Ittakes several iterations to move agents away from the boundaries. However, once agents arefar from the boundaries, agents converge quickly towards the optimal solution. variables. The team consists of 100 agents with quadratic functions definedby (36) of 100 variables. Therefore, the agent teams solve Problem 3.1 where m = 100. The set of constraints is the same as scenario 1 and the quadraticfunctions are also created randomly. Fig. 3 displays P mi =1 || x i ( k + 1) − x ∗ || over the course of the algorithm. As k → ∞ , the P mi =1 || x i ( k + 1) − x ∗ || →
0, which means the agent team approximately reach the optimal solution.Indeed, f ( x ∗ ) = − .
11 and f (ˆ x ) = − . x i is subject to the constraint set [ − ,
000 to 1 , Decentralized Multi-Objective Optimization Algorithm 29 jected onto the limits of the constraint set. It takes several iterations before asignificant number of agents move away from the boundary of the constraintset. However, once this number is reached, the algorithm converges quicklytowards the optimal solution.
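The boundary effect seen in Fig. 3 can be reproduced with a minimal consensus-plus-projected-gradient sketch. This is not the paper's exact update (3)-(10): the mixing matrix is uniform rather than priority-based, the sizes are toy values, and the random generation is an assumption. Agents start at corners of the box, so early iterates are repeatedly clipped onto the boundary before the team error decreases.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 5, 4            # toy sizes for illustration (the paper uses up to 100)
lo, hi = -1000.0, 1000.0

# Hypothetical random quadratics f_i(x) = 0.5 x^T Q_i x + r_i^T x.
Qs, rs = [], []
for _ in range(m):
    M = rng.standard_normal((n, n))
    Qs.append(M @ M.T + n * np.eye(n))
    rs.append(100.0 * rng.standard_normal(n))

# Uniform row-stochastic mixing and equal priorities -- illustration only.
A = np.full((m, m), 1.0 / m)
Q_bar = sum(Qs) / m
r_bar = sum(rs) / m
x_star = np.linalg.solve(Q_bar, -r_bar)   # lies well inside the box here

# Agents start at corners of the box, so early iterates are clipped back
# onto the boundary, as in Fig. 3.
X = rng.choice([lo, hi], size=(m, n))
alpha0, errs = 0.2, []
for k in range(1, 2001):
    V = A @ X                                                # consensus step
    G = np.array([Q @ v + r for Q, r, v in zip(Qs, rs, V)])  # local gradients
    X = np.clip(V - (alpha0 / k) * G, lo, hi)                # projected step
    errs.append(np.linalg.norm(X - x_star, axis=1).sum())

print(errs[-1] < 0.1 * errs[0])  # team error shrinks toward the optimum
```

Projection onto the box is coordinate-wise clipping, which is why early iterates sit exactly on the limits $\pm 1{,}000$ until the diminishing steps pull them inside.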
In this paper, a distributed algorithm to optimize a prioritized sum of convex objective functions was proposed. The algorithm allows agents to have different priorities regarding other agents' objective functions. These priorities enable exploration of the Pareto front, which provides optimal solutions in different senses. A rule based on agents' priorities generates the information-exchange matrix used to update agents' estimates. In the proposed algorithm, this matrix is stochastic, whereas in most other distributed algorithms the information-exchange matrix is doubly stochastic. New theoretical analyses were therefore needed because of this difference in the network-level set-up. It was proved that the proposed algorithm converges towards the optimal solution. Convergence rates were also obtained, and they are influenced by agents' initial weights. Numerical results illustrated the performance of the proposed algorithm. Future work includes time-varying topologies and implementing the algorithm on a team of robots.
Acknowledgements
Maude J. Blondin would like to thank the Fonds de recherche Nature et technologies for its support through a postdoctoral fellowship.
Appendix
This appendix contains the proofs for some lemmas presented in the paper.
Proof of Lemma 3.1 [26]. Define $\mu(k) := \min_{j \in [m]} \min_{i \in [m]} w_{ji}(k)$. Then $W(k+1) = P W(k)$ can be expressed as
$$
\begin{pmatrix} p_{11} & \cdots & p_{m1} \\ \vdots & \ddots & \vdots \\ p_{1i} & \cdots & p_{mi} \\ \vdots & \ddots & \vdots \\ p_{1m} & \cdots & p_{mm} \end{pmatrix}
\begin{pmatrix} \mu(k)+\delta_{11}(k) & \cdots & \mu(k)+\delta_{m1}(k) \\ \vdots & \ddots & \vdots \\ \mu(k)+\delta_{1i}(k) & \cdots & \mu(k)+\delta_{mi}(k) \\ \vdots & \ddots & \vdots \\ \mu(k)+\delta_{1m}(k) & \cdots & \mu(k)+\delta_{mm}(k) \end{pmatrix}
=
\begin{pmatrix} w_{11}(k+1) & \cdots & w_{m1}(k+1) \\ \vdots & \ddots & \vdots \\ w_{1i}(k+1) & \cdots & w_{mi}(k+1) \\ \vdots & \ddots & \vdots \\ w_{1m}(k+1) & \cdots & w_{mm}(k+1) \end{pmatrix}, \tag{38}
$$
where $\delta_{ji}(k) = w_{ji}(k) - \mu(k) \ge 0$ for $i, j \in \{1, \ldots, m\}$. Then, we have
$$ w_{ji}(k+1) = \sum_{n=1}^m p_{ni}\left[\mu(k) + \delta_{jn}(k)\right] = \mu(k)\sum_{n=1}^m p_{ni} + \sum_{n=1}^m p_{ni}\,\delta_{jn}(k). \tag{39} $$
By definition, we know that $\sum_{n=1}^m p_{ni} = 1$. Therefore, we get
$$ w_{ji}(k+1) = \mu(k) + \sum_{n=1}^m p_{ni}\,\delta_{jn}(k). \tag{40} $$
Since $\delta_{jn} \ge 0$ and $p_{ni} \ge 0$ for $i, j, n \in \{1, \ldots, m\}$, we have $w_{ji}(k+1) \ge \mu(k) = \min_{j \in [m]} \min_{i \in [m]} w_{ji}(k)$ for $i, j \in \{1, \ldots, m\}$ and all $k$. This establishes that the minimum of $W(k)$ is non-decreasing: no entry can go below the previous minimum at the next time step. Therefore, since (8) defines $A(k)$, the smallest non-zero element of $A(k)$, denoted $\min^+_{i \in [m]} \min^+_{j \in [m]} [A(k)]_{ji}$, is at least $\min_{i \in [m]} \min_{j \in [m]} w_{ji}(k)$. This directly implies that the lower bound can be set as $\min_{j \in [m]} \min_{i \in [m]} w_{ji}(0)$. $\square$
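The conclusion of Lemma 3.1 can be checked numerically: with nonnegative, stochastic mixing, each entry of the product is a convex combination of existing entries, so the smallest entry of $W(k)$ never decreases. The sketch below uses a row-stochastic `P` with `P @ W`; the proof's convention $\sum_{n=1}^m p_{ni} = 1$ expresses the same convex-combination structure with the opposite index order.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 4

# Stochastic mixing: each entry of P @ W is a convex combination of the
# entries in a column of W, so the smallest entry of W can never decrease.
P = rng.random((m, m))
P /= P.sum(axis=1, keepdims=True)  # rows sum to 1, entries nonnegative

W = rng.random((m, m))
mins = [W.min()]
for _ in range(50):
    W = P @ W
    mins.append(W.min())

# The sequence of minima is non-decreasing, matching Lemma 3.1.
print(all(b >= a - 1e-12 for a, b in zip(mins, mins[1:])))
```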
Proof of Lemma 4.6
From Lemma 4.5 and since $x_i(k+1) = P_{X_i}[v_i(k) - \alpha_k d_i(k)]$, we have
$$ \|x_i(k+1) - z\|^2 \le \|v_i(k) - \alpha_k d_i(k) - z\|^2 - \big\|P_{X_i}[v_i(k) - \alpha_k d_i(k)] - \big(v_i(k) - \alpha_k d_i(k)\big)\big\|^2. $$
From the definition of $\phi_i(k)$ in (12), the previous relation becomes
$$ \|x_i(k+1) - z\|^2 \le \|v_i(k) - \alpha_k d_i(k) - z\|^2 - \|\phi_i(k)\|^2. $$
By expanding $\|v_i(k) - \alpha_k d_i(k) - z\|^2$, we have
$$ \|v_i(k) - \alpha_k d_i(k) - z\|^2 = \|v_i(k) - z\|^2 + \alpha_k^2 \|d_i(k)\|^2 - 2\alpha_k\, d_i(k)'(v_i(k) - z). \tag{41} $$
Because $d_i(k)$ is the gradient of $f_i(x)$ at $x = v_i(k)$, we obtain from convexity that
$$ d_i(k)'(v_i(k) - z) \ge f_i(v_i(k)) - f_i(z). \tag{42} $$
By bringing together (41) and (42), we get
$$ \|x_i(k+1) - z\|^2 \le \|v_i(k) - z\|^2 + \alpha_k^2 \|d_i(k)\|^2 - 2\alpha_k \big[f_i(v_i(k)) - f_i(z)\big] - \|\phi_i(k)\|^2. \tag{43} $$
Given the definition of $v_i(k)$, using the convexity of the squared norm and the stochasticity of $a_i(w^i(k))$, we find that
$$ \|v_i(k) - z\|^2 \le \sum_{j=1}^m a_{ij}(w^i(k)) \|x_j(k) - z\|^2. \tag{44} $$
It then follows from (43) and (44) that
$$ \|x_i(k+1) - z\|^2 \le \sum_{j=1}^m a_{ij}(w^i(k)) \|x_j(k) - z\|^2 + \alpha_k^2 \|d_i(k)\|^2 - 2\alpha_k \big[f_i(v_i(k)) - f_i(z)\big] - \|\phi_i(k)\|^2. \tag{45} $$
By summing (45) over $i = 1, \ldots, m$, we obtain the desired relation:
$$ \sum_{i=1}^m \|x_i(k+1) - z\|^2 \le \sum_{i=1}^m \sum_{j=1}^m a_{ij}(w^i(k)) \|x_j(k) - z\|^2 + \alpha_k^2 \sum_{i=1}^m \|d_i(k)\|^2 - 2\alpha_k \sum_{i=1}^m \big[f_i(v_i(k)) - f_i(z)\big] - \sum_{i=1}^m \|\phi_i(k)\|^2. \qquad \square $$

Proof of Lemma 4.7

(a) From (13), we have
$$ x_i(k) = \sum_{j=1}^m [\Phi(k-1,s)]_{ij}\, x_j(s) - \sum_{r=s}^{k-2} \sum_{j=1}^m [\Phi(k-1,r+1)]_{ij}\, \alpha_r d_j(r) - \alpha_{k-1} d_i(k-1) + \sum_{r=s}^{k-2} \sum_{j=1}^m [\Phi(k-1,r+1)]_{ij}\, \phi_j(r) + \phi_i(k-1). \tag{46} $$
Using the transition matrices
$$ \Phi(k,s) = A(W(k))\, A(W(k-1)) \cdots A(W(s)) \tag{47} $$
and following the same logic used to obtain (13) [6], (16) can be rewritten for all $k$ and $s$ with $k > s$ as
$$ y(k) = \frac{1}{m} \sum_{i=1}^m \sum_{j=1}^m [\Phi(k-1,s)]_{ij}\, x_j(s) - \frac{1}{m} \sum_{i=1}^m \sum_{r=s}^{k-2} \sum_{j=1}^m [\Phi(k-1,r+1)]_{ij}\, \alpha_r d_j(r) + \frac{1}{m} \sum_{i=1}^m \sum_{r=s}^{k-2} \sum_{j=1}^m [\Phi(k-1,r+1)]_{ij}\, \phi_j(r) - \frac{\alpha_{k-1}}{m} \sum_{i=1}^m d_i(k-1) + \frac{1}{m} \sum_{i=1}^m \phi_i(k-1). \tag{48} $$
By subtracting (48) from (46), we obtain
$$ x_i(k) - y(k) = \sum_{j=1}^m \Big( [\Phi(k-1,s)]_{ij} - \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,s)]_{lj} \Big) x_j(s) - \sum_{r=s}^{k-2} \sum_{j=1}^m \Big( [\Phi(k-1,r+1)]_{ij} - \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,r+1)]_{lj} \Big) \alpha_r d_j(r) - \alpha_{k-1} d_i(k-1) + \frac{\alpha_{k-1}}{m} \sum_{l=1}^m d_l(k-1) + \sum_{r=s}^{k-2} \sum_{j=1}^m \Big( [\Phi(k-1,r+1)]_{ij} - \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,r+1)]_{lj} \Big) \phi_j(r) + \phi_i(k-1) - \frac{1}{m} \sum_{l=1}^m \phi_l(k-1). \tag{49} $$
Taking the norm of (49) and applying the triangle inequality, we get
$$ \|x_i(k) - y(k)\| \le \sum_{j=1}^m \Big| [\Phi(k-1,s)]_{ij} - \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,s)]_{lj} \Big| \|x_j(s)\| + \sum_{r=s}^{k-2} \sum_{j=1}^m \Big| [\Phi(k-1,r+1)]_{ij} - \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,r+1)]_{lj} \Big| \alpha_r \|d_j(r)\| + \alpha_{k-1} \|d_i(k-1)\| + \frac{\alpha_{k-1}}{m} \sum_{l=1}^m \|d_l(k-1)\| + \sum_{r=s}^{k-2} \sum_{j=1}^m \Big| [\Phi(k-1,r+1)]_{ij} - \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,r+1)]_{lj} \Big| \|\phi_j(r)\| + \|\phi_i(k-1)\| + \frac{1}{m} \sum_{l=1}^m \|\phi_l(k-1)\|. \tag{50} $$
Set $s = 0$. By Lemma 4.3, the first right-hand term of (50) satisfies
$$ \sum_{j=1}^m \Big( \big| [\Phi(k-1,0)]_{ij} - \gamma_j(0) \big| + \Big| \frac{1}{m} \sum_{l=1}^m [\Phi(k-1,0)]_{lj} - \gamma_j(0) \Big| \Big) \|x_j(0)\| \le 2mC\beta^{k-1} \sum_{j=1}^m \|x_j(0)\|. \tag{51--52} $$
Using Lemma 4.3 and the gradient bound $L$, the second term is bounded by $2mCL \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r$ (53); using Lemma 4.2 and the gradient bound, the third and fourth terms are together bounded by $2\alpha_{k-1} L$ (54). By the same arguments, the projection-error terms are bounded by $2mCL \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r + 2\alpha_{k-1} L$ (55). We therefore obtain
$$ \|x_i(k) - y(k)\| \le 2mC\beta^{k-1} \sum_{j=1}^m \|x_j(0)\| + 4mCL \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r + 4\alpha_{k-1} L. \tag{56} $$
Since $0 < \beta < 1$, $\beta^{k-1} \to 0$ as $k \to \infty$. Assuming $\alpha_k \to 0$, we obtain for each $i$
$$ \limsup_{k \to \infty} \|x_i(k) - y(k)\| \le 4mCL \limsup_{k \to \infty} \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r. \tag{57} $$
By Lemma 4.4, we have $\lim_{k \to \infty} \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r = 0$. Therefore, $\lim_{k \to \infty} \|x_i(k) - y(k)\| = 0$ for all $i$.

(b) By multiplying (56) with $\alpha_k$, we get
$$ \alpha_k \|x_i(k) - y(k)\| \le 2mC \alpha_k \beta^{k-1} \sum_{j=1}^m \|x_j(0)\| + 4mCL \sum_{r=0}^{k-2} \beta^{k-r} \alpha_k \alpha_r + 4 \alpha_k \alpha_{k-1} L. $$
Using $2\alpha_k \alpha_r \le \alpha_k^2 + \alpha_r^2$ and $\alpha_k \beta^{k-1} \le \alpha_k^2 + \beta^{k-1}$ for any $k$ and $r$, we obtain
$$ \alpha_k \|x_i(k) - y(k)\| \le 2mC\beta^{k-1} \sum_{j=1}^m \|x_j(0)\| + 2mC\alpha_k^2 \sum_{j=1}^m \|x_j(0)\| + 2mCL\alpha_k^2 \sum_{r=0}^{k-2} \beta^{k-r} + 2mCL \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r^2 + 2L(\alpha_k^2 + \alpha_{k-1}^2). $$
Since $\sum_{r=0}^{k-2} \beta^{k-r} \le \frac{1}{1-\beta}$, we have
$$ \alpha_k \|x_i(k) - y(k)\| \le 2mC\beta^{k-1} \sum_{j=1}^m \|x_j(0)\| + 2mC\alpha_k^2 \sum_{j=1}^m \|x_j(0)\| + \frac{2mCL\alpha_k^2}{1-\beta} + 2mCL \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r^2 + 2L(\alpha_k^2 + \alpha_{k-1}^2). $$
By summing from $k = 1$ to $k = \infty$, we obtain
$$ \sum_{k=1}^{\infty} \alpha_k \|x_i(k) - y(k)\| \le 2mC \sum_{k=1}^{\infty} \beta^{k-1} \sum_{j=1}^m \|x_j(0)\| + 2mC \sum_{k=1}^{\infty} \alpha_k^2 \sum_{j=1}^m \|x_j(0)\| + \frac{2mCL}{1-\beta} \sum_{k=1}^{\infty} \alpha_k^2 + 2mCL \sum_{k=1}^{\infty} \sum_{r=0}^{k-2} \beta^{k-r} \alpha_r^2 + 2L \sum_{k=1}^{\infty} (\alpha_k^2 + \alpha_{k-1}^2). \tag{58} $$
In (58), the first term is summable since $0 < \beta < 1$. The second, third, and fifth terms are summable since $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$. By Lemma 4.4, the fourth term is summable. Thus, $\sum_{k=1}^{\infty} \alpha_k \|x_i(k) - y(k)\| < \infty$ for all $i$. $\square$

References
1. J. Qin, Q. Ma, Y. Shi and L. Wang, Recent advances in consensus of multi-agent systems: A brief survey. IEEE Transactions on Industrial Electronics, 64(6):4972-4983, 2016.
2. A. Filotheou, A. Nikou and D. V. Dimarogonas, Decentralized control of uncertain multi-agent systems with connectivity maintenance and collision avoidance. 2018 European Control Conference (ECC), Limassol, pp. 8-13, 2018.
3. X. Wang, H. Su, X. Wang and G. Chen, An overview of coordinated control for multi-agent systems subject to input saturation. Perspectives in Science, 7:133-139, 2016.
4. K.-K. Oh, M.-C. Park and H.-S. Ahn, A survey of multi-agent formation control. Automatica, 53:424-440, 2015.
5. A. Nedić, A. Ozdaglar and P. A. Parrilo, Constrained consensus. arXiv preprint arXiv:0802.3922, 2008.
6. A. Nedić and A. Ozdaglar, Distributed subgradient methods for multi-agent optimization. IEEE Transactions on Automatic Control, 54(1):48-61, 2009.
7. A. Nedić and A. Ozdaglar, Cooperative distributed multi-agent optimization. In Convex Optimization in Signal Processing and Communications, p. 340, 2010.
8. A. Nedić, A. Ozdaglar and P. A. Parrilo, Constrained consensus and optimization in multi-agent networks. IEEE Transactions on Automatic Control, 55(4):922-938, 2010.
9. K. I. Tsianos, S. Lawlor and M. G. Rabbat, Consensus-based distributed optimization: Practical issues and applications in large-scale machine learning. 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1543-1550, 2012.
10. J. C. Duchi, A. Agarwal and M. J. Wainwright, Dual averaging for distributed optimization: Convergence analysis and network scaling. IEEE Transactions on Automatic Control, 57(3):592-606, 2011.
11. J. Wang and N. Elia, Control approach to distributed optimization. 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 557-561, 2010.
12. A. Agarwal and J. C. Duchi, Distributed delayed stochastic optimization. Advances in Neural Information Processing Systems, pp. 873-881, 2011.
13. Q. Liu and J. Wang, A second-order multi-agent network for bound-constrained distributed optimization. IEEE Transactions on Automatic Control, 60(12):3310-3315, 2015.
14. Y. Zhang, Y. Lou and Y. Hong, An approximate gradient algorithm for constrained distributed convex optimization. IEEE/CAA Journal of Automatica Sinica, 1(1):61-67, 2014.
15. B. Touri and A. Nedić, On backward product of stochastic matrices. Automatica, 48(8):1477-1488, 2012.
16. S. S. Ram, A. Nedić and V. V. Veeravalli, Distributed stochastic subgradient projection algorithms for convex optimization. Journal of Optimization Theory and Applications, 147(3):516-545, 2010.
17. P. Bianchi, G. Fort, W. Hachem and J. Jakubowicz, Performance analysis of a distributed Robbins-Monro algorithm for sensor networks. 2011 19th European Signal Processing Conference, pp. 1030-1034, 2011.
18. I. Lobel, A. Ozdaglar and D. Feijer, Distributed multi-agent optimization with state-dependent communication. Mathematical Programming, 129(2):255-284, 2011.
19. A. Nedić and D. P. Bertsekas, Incremental subgradient methods for nondifferentiable optimization. SIAM Journal on Optimization, 12(1):109-138, 2001.
20. B. Kim and O. Lavrova, Optimal power flow and energy-sharing among multi-agent smart buildings in the smart grid. 2013 IEEE Energytech, Cleveland, OH, pp. 1-5, 2013, doi: 10.1109/EnergyTech.2013.6645336.
21. K. M. Miettinen, Nonlinear Multiobjective Optimization. Kluwer Academic Publishers, 1999.
22. Y. Collette and P. Siarry, Multiobjective Optimization: Principles and Case Studies. Springer, 2004.
23. L. Xiao, S. Boyd and S. J. Kim, Distributed average consensus with least-mean-square deviation. Journal of Parallel and Distributed Computing, 67(1):33-46, 2007.
24. A. Olshevsky and J. N. Tsitsiklis, Convergence speed in distributed consensus and averaging. SIAM Review, 53(4):747-772, 2011.
25. V. D. Blondel, J. M. Hendrickx, A. Olshevsky and J. N. Tsitsiklis, Convergence in multi-agent coordination, consensus, and flocking. Proceedings of the 44th IEEE Conference on Decision and Control, pp. 2996-3000, 2005.
26. Blondin, Maude J., and Matthew Hale.