A distributed service-matching coverage via heterogeneous mobile agents
11 A distributed service-matching coverage viaheterogeneous mobile agents
Yi-Fan Chung and Solmaz S. Kia,
Senior Member, IEEE
Abstract —We propose a distributed deployment solution for agroup of mobile agents that should provide a service for a denseset of targets. The agents are heterogeneous in a sense thattheir quality of service (QoS), modeled as a spatial Gaussiandistribution, is different. To provide the best service, the objectiveis to deploy the agents such that their collective QoS distributionis as close as possible to the density distribution of the targets. Wepropose a distributed consensus-based expectation-maximization(EM) algorithm to estimate the target density distribution,modeled as a Gaussian mixture model (GMM). The GMMnot only gives an estimate of the targets distribution, but alsopartitions the area to subregions, each of which is represented byone of the GMMs Gaussian bases. We use the Kullback-Leiblerdivergence (KLD) to evaluate the similarity between the QoSdistribution of each agent and each Gaussian basis/subregion.Then, a distributed assignment problem is formulated and solvedas a discrete optimal mass transport problem that allocates eachagent to a subregion by taking the KLD as the assignment cost.We demonstrate our results by a sensor deployment for eventdetection where the sensor’s QoS is modeled as an anisotropicGaussian distribution.
Over the last decade, deploying a group of networked mo-bile agents to cover a region with a service objective suchmonitoring, data collection, and wireless communication haveattracted considerable attention, see for a few examples [1]–[3]. The deployment strategy commonly includes partitioningthe environment and allocating agents to those partitions. Thatis, the area of interest is partitioned into subregions and eachagent is allocated to a location in the subregion such that somecoverage metric is optimized.The classic Voronoi-based deployment strategy [4]–[15] isa prime example of multi-agent deployment for area cover-age. [4] as one of the initial work in this area develops adeployment algorithm based on the Llyod method to computethe Voronoi partition and allocate the agents to the CentroidalVoronoi configuration which is well-known as the optimalconfiguration of a class of locational optimization cost func-tion [5]. The original Voronoi-based deployment strategy isdeveloped for homogeneous agents. To reach the optimal cov-erage with heterogeneous agents whose service capabilities aredifferent, [7]–[9] employ the weighted Voronoi diagram wherethe weightings account for heterogeneity among the agents.The works mention above assume the footprint of the serviceprovided by an agent is disk-shaped, i.e., the distributionof QoS is isotropic. However, an anisotropic service modelis more realistic because sensory systems such as cameras,directional antenna, and radars are anisotropic. [10], [11] forwedge-shape and [12] for elliptic footprint adapt an anisotropicservice model by the modifying Voronoi diagrams to match
The authors are with the Department of Mechanical and AerospaceEngineering, University of California, Irvine, Irvine, CA 92697, { yfchung,solmaz } @uci.edu . This work is supported by NSFaward IIS-SAS-1724331. the features of the anisotropy of the sensors. But these meth-ods increase the complexity of the Voronoi partition, whichmake the design of distributed optimal deployment strategiesvery challenging. The heterogeneity in deployment algorithmdesign can also be due to non-uniformity in area of interest.To deal with such scenarios, a priority (sensory) function ofthe position is introduced to indicate the importance level overthe area, where a location needs higher QoS if the value ofthe priority function is higher at that location. The work [4]–[12] mentioned above assume the priority function is known toeach agent. This assumption may not be realistic for every ap-plication. [13] uses the parameterized basis functions to modelthe priority distribution, and [14] models the distribution by azero-mean Gaussian random field. Then, in both [13] and [14],the agents gradually fit their model to the true distributionusing their local sensor measurement while exploring the area.In [15], the authors assume the unknown priority functionis a function of the position of some unknown targets. Thesearch agents aim to detect the targets while exploring the area,and then, broadcast their information about the environmentto the service agents so the service agents can focus on thedeployment problem.In this paper, we propose a novel deployment strategy formobile agents to cover a collection of dense targets withtheir heterogeneous anisotropic services. We model the agents’QoS distributions by Gaussian distributions with differencecovariance matrix reflecting the difference in the capabilitiesof the agents. Since the footprint of Gaussian distribution iselliptic, the agents’ QoS are heterogeneous and anisotropic.Furthermore, the density distribution of the targets is consid-ered as the priority function which is unknown a priori. Ourdeployment objective is deploying the agents such that theresulting QoS distribution of agents is similar to the densitydistribution of the targets. Hence, the agents’ service efficientlycovers the targets; that is, the place containing more (resp.fewer) targets is served with higher (resp. lower) QoS.We model the unknown density distribution of the targetsby a GMM. We propose a distributed consensus-based EMalgorithm to enable the agents to learn the parameters of theGMM. With the proposed distributed EM, we do not requireeach agent to measure the targets locally. Only a subset ofthe agents can measure the targets and their information isthen propagated through the consensus protocol to all agents.Moreover, the GMM intrinsically partitions the area into aset of subregions, each of which represents a Gaussian basis.Therefore, after estimating the target density distribution, theagents also complete the area partitioning task. We note thatunlike the distributed Voronoi partition that requires the agentsto be able to communicate to their Voronoi neighbors, which attimes can be unrealistic because the physical distance between a r X i v : . [ ee ss . S Y ] S e p Voronoi neighbors may but them outside of the communicationrange of each other, our approach only requires the commu-nication graph among the agents to be connected. We use theKLD measure to assess the similarity of the QoS provided bythe agents and the targets’ density distributions. We propose toobtain the optimal deployment pose (position and orientation)of the service agents by minimizing this KLD measure. Sincethis KLD measure is highly coupled and computing a dis-tributed solution for it is challenging, we propose a suboptimaldeployment solution in the form of an optimal mass transportproblem to allocate each agent to a Gaussian basis subregionof the GMM used to estimate the targets’ distribution, wherethe cost of transporting the agent to a subregion is the KLDvalue between the agent’s QoS distribution and the Gaussianbasis distribution. We show that this assignment problem is adistributed linear programming that can be solved efficientlyby the distributed simplex algorithm of [16]. We illustrate ourresults via an application in the deployments of sensor networkfor event detection.I. N
OTATIONS AND P RELIMINARIES
We let R , R > , R ≥ , Z , Z > and Z ≥ denote the set ofreal, positive real, non-negative real, integer, positive inte-ger, and non-negative integer, respectively. For s ∈ R d , (cid:107) s (cid:107) = √ s (cid:62) s denotes the standard Euclidean norm. Welet n (resp. n ) denote the vector of n ones (resp. n zeros), and I n denote the n × n identity matrix. Given twocontinuous probability density distributions p ( x ) and q ( x ) , x ∈ X , the KullbackLeibler divergence (KLD) is defined as D KL (cid:0) p ( x ) || q ( x ) (cid:1) = (cid:82) x ∈ X p ( x ) ln p ( x ) q ( x ) d x . KLD is a measureof similarity (dissimilarity) between two probability distribu-tions p ( x ) and q ( x ) , where the smaller the value the moresimilar two distributions are. KLD is zero if and only if the twodistribution are identical [17, p.34]. For Gaussian distributions, p ( x ) = N ( µ , Σ ) and q ( x ) = N ( µ , Σ ) , the KLD has aclosed form expression [18, eq. (2)] D KL (cid:16) p ( x ) || q ( x ) (cid:1) = 12 (cid:0) ln | Σ || Σ | + ( µ − µ ) (cid:62) Σ − ( µ − µ )+ tr( Σ − Σ ) − n (cid:17) , (1)where n is the dimension of the distributions.We follow [19] for Our graph theoretic notations and def-initions. A graph , is a triplet G = ( V , E , A ) , where V = { , . . . , N } is the node set and E ⊆ V × V is the edge set ,and A ∈ R N × N is a adjacency matrix such that a ij = if ( i, j ) ∈ E and a ij = 0 , otherwise. An edge ( i, j ) from i to j means that agents i and j can communicate. A path is asequence of nodes connected by edges. A connected graph isan undirected graph in which for every pair of nodes there isa path connecting them.To develop our distributed density estimator in Section IV,we rely on the dynamics active weighted average consensusalgorithm that is shown in Algorithm 1. In a dynamics activeweighted average consensus at any time, only a subset ofthe agents are active, meaning that only a subset of agentscollects measurements r i . The objective then is to enable allthe agents, both active and passive, to obtain the weighted Algorithm 1
Active weighted average consensus algo-rithm [20] [ y i , z i , v i ] ← Con( η i , r i , z i , v i ) Require:
Weight η i , reference r i , number of loops L , a small enough number δ c > . Initialization: z i (1) = z i and v i (1) = v i for l = 1 : L do y i ( l ) = z i ( l ) + η i ( l ) r i ( l ) , z i ( l + 1) = z i ( l ) − δ c η i ( l )( y i ( l ) − r i ( l )) − δ c N (cid:88) j =1 a ij ( y i ( l ) − y j ( l )) − δ c N (cid:88) j =1 a ij ( v i ( l ) − v j ( l )) , v i ( l + 1) = v i ( l ) + δ c (cid:88) Nj =1 a ij ( y i ( l ) − y j ( l )) . end forreturn y i ( l ) , z i ( l + 1) , v i ( l + 1) average of the collected measurements, (cid:80) i η i ( l ) r i ( l ) (cid:80) i η i ( l ) , withoutknowing the set of active agents. Here, η i ( l ) = 0 if i is passiveat time step l and η i ( l ) ∈ R > if i is active. [20] shows thatAlgorithm 1, starting at any z i , y i (0) ∈ R , makes y i ( l ) trackthe time varying weighted average signal (cid:80) i η i ( l ) r i ( l ) (cid:80) i η i ( l ) with abounded tracking error as l → ∞ . Moreover, if the weightsand reference signals are static, the tracking error vanisheswith time, i.e., lim l →∞ y i ( l ) = (cid:80) i η i r i (cid:80) i η i .II. P ROBLEM D EFINITION AND O BJECTIVE
We consider a deployment problem for a group of mobileagents over a set of dense targets { x nt } Mn =1 ⊂ R on aplanar ground with the objectives such as event detection,wireless communication or monitoring, which we refer to it ingeneral term as providing a ‘service’. The probability densitydistribution p ( x ) , x ∈ R , which is unknown to the agents,represents the density distribution of the targets. The mobileagents communicating over a connected undirected graph G = ( V , E , A ) consist of two types. There are a set V a ⊆ V of active agents that have the capability to actively detect thetargets and a set V s = { , · · · , N } ⊆ V service agents thatare deployed to provide the targets with a service, see Fig 1.Unlike some existing literature like [15], V a and V s do not haveto be mutually exclusive. We assume that the active agentshave partitioned the area such that each target is detected onlyby one active agent, i.e., no overlapping detection.We let ( x i s , θ i s ) ∈ R × [0 , π ] be the pose (position andorientation) of service agent i ∈ V s . The QoS distribution Q i ( x | x i s , θ i s ) provided by a service agent i ∈ V s is modeledby a scaled Gaussian probability distribution Q i ( x | x i s , θ i s ) = z i N ( x | x i s , Σ( θ i s )) , x ∈ R , where z i ∈ R > is the scaleconstant and N ( x | x i s , Σ( θ i s )) is the Gaussian distribution withthe mean x i s and the covariance matrix Σ( θ i s ) . We define thenormalized collective QoS provided by the service agents bythe probability density distribution q ( x |{ x i s , θ i s } i ∈V s ) = (cid:80) i ∈V s Q i (cid:82) x ∈ R (cid:80) i ∈V s Q i d x = (cid:80) i ∈V s z i N ( x | x i s , Σ( θ i s )) (cid:80) i ∈V s z i = (cid:88) i ∈V s ω i s N ( x | x i s , Σ( θ i s )) , (2) Fig. 1:
A multi-agent system with active agents and service agents. where ω i s = z i (cid:80) i ∈V s z i represents the relative service capabilityof agent i among V s .Our objective in this paper is to first enable all the agents, bothactive and service agents, obtain an estimate mixture model ˆ p ( x ) of the density distribution of the targets in a distributedmanner. Then, design a distributed deployment strategy to re-position the service agents in a way that their collective QoSserves the targets in an efficient manner. In other words, weseek locations and orientations for service agents such that thecollective QoS distribution q is as much similar to as possibleto the estimated target density distribution ˆ p . The optimalsolution for the deployment objective can be obtained from { x i s , θ i s } i ∈V s = arg min D KL (cid:0) ˆ p ( x ) || q ( x ) (cid:1) . (3)We note that ˆ p ( x ) and q ( x ) are mixture distributions for whichobtaining a closed-from for their KLD is quite challenging. Inpractice, KLD for mixture models are usually estimated byusing costly Monte-Carlo sampling simulations [21]. More-over, the collective QoS distribution q ( x ) contributed by eachagent’s QoS distribution, ω i s N ( x | x i s , Σ( θ i s )) , i ∈ V s , is a globalinformation. Accordingly, designing a distributed solver for (3)is challenging. Therefore, in this paper, we seek a suboptimalsolution for (3) that can be implemented in a distributedmanner and has low computational complexity.III. O VERVIEW OF THE P ROPOSED M OBILE A GENT D EPLOYMENT S OLUTION
Our proposed distributed solution to meet our objective statedin Section II is the two-stage process depicted in Fig. 2. Inthe first stage, we use a GMM with N Gaussian bases tomodel the target density distribution. The active agents V a detect the positions of the targets, considered as the sampleddata from the unknown distribution p ( x ) . Then, a distributedEM algorithm, which uses a set of active weighted averageconsensus algorithms, is used to enable both active and serviceagents obtain a coherent estimate of the parameters of the N Gaussian bases of the GMM. The Gaussian bases of theGMM partition the target area into N subregions each ofwhich corresponds to a Gaussian basis. The second stage ofour solution is an agent allocation process which follows anoptimal mass transport framework. In this allocation process,first each service agent i ∈ V s computes the KLD betweenits QoS distribution, ω i s N ( x | x i s , Σ( θ i s )) , and each subregion’sGaussian basis obtained in stage 1. Then, a distributed as-signment problem is formulated with the KLDs as the cost Fig. 2: The proposed two-stage distributed deployment solution. of deploying the agent to the each respective subregion. As aresult, each agent is paired with a subregion and the summationof the divergences corresponding to each paired agent’s QoSdistribution and subregion’s Gaussian basis is minimized. Thelast step in this stage is a transportation process in whichwe implement a local controller to drive the agents to theirassigned destinations in finite time. For dynamic targets, theprocess repeats. We present the details of each stage in thefollowing sections.IV. S
TAGE DISTRIBUTED TARGET DENSITYDISTRIBUTION ESTIMATION
GMM is characterized by finite sum of Gaussian bases withdifferent weights, means and covariance matrices. Let x ∈ R be the observed target’s position drawn from a mixture of N Gaussian bases with the distribution N ( x | µ k , Σ k ) , where µ k ∈ R is the mean and Σ k ∈ R × is the covariance matrixfor k ∈ K = { , · · · , N } . Let z ∈ R be the indicator whichindicates the variable x belongs to k th Gaussian basis when z = k . The variable z is not observed so z is also calledhidden variable or latent variable. The probability of drawinga variable from the k th Gaussian basis is denoted as π k :=Pr( z = k ) . The distribution of x given the k th mixture basisis Gaussian, i.e., ˆ p ( x | z = k ) = N ( x | µ k , Σ k ) . Therefore, themarginal probability distribution for x is given by ˆ p ( x ) = (cid:88) Nk =1 π k N ( x | µ k , Σ k ) (4)The parameters that should be determined to obtain the esti-mate ˆ p ( x ) are the set { π k , µ k , Σ k } Nk =1 . Next, we employ theEM algorithm to obtain these parameters [22].The EM algorithm obtains the maximum likelihood estimatesof { π k , µ k , Σ k } Nk =1 given M independent detected targets’positions { x n t } Mn =1 . It is an iterative method that alternatesbetween an expectation (E) step and a maximization (M) step.Given a detected target x nt , n ∈ { , · · · , M } , E-step computesthe posterior probability γ kn := Pr( z = k | x nt ) = π k N ( x nt | µ k , Σ k ) (cid:80) Nj =1 π j N ( x nt | µ j , Σ j ) , (5)using the current value of { π k , µ k , Σ k } Nk =1 . Then, M-stepupdates the parameter set { π k , µ k , Σ k } Nk =1 by the followingequations using the current γ kn : π k = (cid:80) Mn =1 γ kn M , (6a) µ k = (cid:80) Mn =1 γ kn x nt (cid:80) Mn =1 γ kn , (6b) Σ k = (cid:80) Mn =1 γ kn ( x nt − µ k )( x nt − µ k ) (cid:62) (cid:80) Mn =1 γ kn , (6c)for k ∈ { , · · · , N } . M-step needs the global information toupdate the parameter set { π k , µ k , Σ k } Nk =1 because the sum-mations in (6) are over all detected targets n ∈ { , · · · , M } .However, the information of the targets’ positions { x nt } Mn =1 isdistributed among the active agents V a . We observe that theright hand side quantities of (6) are in the form of (weighted)average. Therefore, we propose a distributed implementationof the EM algorithm, which invokes a set of active weightedaverage consensus algorithms such that all the agents, V = V a ∪ V s obtain an approximate value of (6) by locally ex-changing the information with their neighbors. Suppose eachagent i ∈ V maintains a local copy of the parameter set ofthe Gaussian bases { π ik , µ ik , Σ ik } Nk =1 . At the E-step, everyactive agent i ∈ V a computes γ kn for k ∈ { , · · · , N } and n ∈ V it where V it is the set of targets detected by active agent i ∈ V a . Then, in the M-step, every agent i ∈ V executesConsensus Algorithm 1 with proper setting its weight η i andreference r i to estimate the update of { π ik , µ ik , Σ ik } Nk =1 . It isclear that by setting η i = |V it | and r i = (cid:80) n ∈V it γ kn |V it | if i ∈ V a ,otherwise, η i = 0 and r i = 0 , the consensus variable y i in Algorithm 1 asymptotically converges to (6a). Similarly,letting η i = (cid:80) n ∈V it γ kn and r i = (cid:80) n ∈V it γ kn x n (cid:80) n ∈V it γ kn if i ∈ V a ,otherwise, η i = 0 and r i = 0 , y i converges to (6b); letting η i = (cid:80) n ∈V it γ kn and r i = (cid:80) n ∈V it γ kn ( x n − µ ik )( x n − µ ik ) (cid:62) (cid:80) n ∈V it γ kn if i ∈ V a , otherwise, η i = 0 and r i = 0 , y i converges to (6c).The proposed consensus based distributed EM algorithm issummarized in Algorithm 2.Because the algorithm is terminated in a finite time. It isexpected that ˆ p i ( x ) of each agent i be slightly different thanother agents. In what follows we let, ˆ p i ( x ) = (cid:88) Nk =1 π ik N ( x | µ ik , Σ ik ) , (7)be the local final estimate of agent i ∈ V . A. Numerical demonstration
We demonstrate a numerical simulation to show the perfor-mance of the proposed distributed EM algorithm. Consider agroup of mobile agents where V a = { , , , } are the activeagents that monitor the targets to enable the service agents V s = V = { , , , , , } to obtain an estimate of the densitydistribution p ( x ) of a group of M = 1000 targets. The agents V communicate over a connected ring graph whose adjacencymatrix is A = − − − − − − −
10 0 − − − − − − − . The numbers of the targets detected by agent i ∈ V a are |V t | = 100 , |V t | = 250 , |V t | = 450 , and |V t | = 200 . Algorithm 2
Consensus-based distributed EM algorithm for GMM
Require:
Detected targets set { x nt } n ∈V it by agent i , number of Gaussianbases N , number of loops T Initialization: { π ik , z iπ,k , v iπ,k } Nk =1 , { µ ik , z iµ,k , v iµ,k } Nk =1 , { Σ ik , z i Σ ,k , v i Σ ,k } Nk =1 . for t = 1 : T doif i ∈ V a then (cid:46) E-stepCompute γ kn in (5) using the current value of { π ik µ ik , Σ ik } for k = { , · · · , N } and n ∈ V it . end iffor k = 1 : N do (cid:46) M-step if i ∈ V a then [ π ik , z iπ,k , v iπ,k ] ← Con( |V it | , (cid:80) n ∈V it γ kn |V it | , z iπ,k , v iπ,k )[ µ ik , z iµ,k , v iµ,k ] ← Con( (cid:88) n ∈V it γ kn , (cid:80) n ∈V it γ kn x n (cid:80) n ∈V it γ kn , z iµ,k , v iµ,k )[ Σ ik , z i Σ ,k , v i Σ ,k ] ← Con( (cid:88) n ∈V it γ kn , (cid:80) n ∈V it γ kn ( x n − µ tk )( x n − µ tk ) (cid:62) (cid:80) n ∈V it γ kn , z i Σ ,k , v i Σ ,k ) else [ π ik , z iπ,k , v iπ,k ] ← Con(0 , , z iπ,k , v iπ,k )[ µ ik , z iµ,k , v iµ,k ] ← Con(0 , , z iµ,k , v iµ,k )[ Σ ik , z i Σ ,k , v i Σ ,k ] ← Con(0 , , z i Σ ,k , v i Σ ,k ) end ifend forend forreturn { π ik , µ ik , Σ ik } Nk =1 The agents, both active and service, execute the distributedEM Algorithm 2 to estimate the parameters of the Gaussianbases of GMM for the target density distribution. In thesimulation, the number of the iteration-loops of the consensusalgorithm and EM algorithm are L = 20 and T = 50 ,respectively. Agents’ estimation results are illustrated in Fig.3 where the black circle’s represent the targets, the ellipticfootprints are the 3- σ uncertainty ellipses of the Gaussianbases N ( x | µ ik , Σ ik ) , k ∈ { , · · · , } estimated by agent i andthe thickness of the elliptic footprint represents π ik . The resultshows that with the proposed distributed EM algorithm, theagents successfully estimate the parameter of GMM for thetarget’s distribution though agent and do not detect anytarget. We note that with the help of the consensus algorithmall agents get the approximately same estimation results.V. S TAGE
2: D
ISTRIBUTED D EPLOYMENT OF S ERVICE A GENTS
In stage 1, the target density distribution is modeled and esti-mated by a GMM. The result of GMM intrinsically partitionsthe area into a set of subregions each of which represents aGaussian basis. Our suboptimal solution to the deploymentproblem (3) is to deploy each service agent i ∈ V s = { , · · · , N } to optimally cover an assigned subregion k ∈ K = { , · · · , N } . The service agent assignment is based on thesimilarity of the agent’s QoS distribution, ω i s N ( x | x i s , Σ( θ i s )) , Fig. 3:
The estimate of GMM of each agents of the demonstrationin section IV-A. to the Gaussian basis subregion, π ik N ( x | µ ik , Σ ik ) , such thatthe summation of the KLD of each assigned agent-subregionpair is minimized. This objective can be formalized as follows.For any service agent i ∈ V s let C ik ( x i s , θ i s ) = D KL (cid:0) π ik N ( x | µ ik , Σ ik ) || ω i s N ( x | x i s , Σ( θ i s ))= π ik (cid:16) ln π ik ω i s + D KL (cid:0) N ( x | µ ik , Σ ik ) ||N ( x | x i s , Σ( θ i s )) (cid:1)(cid:17) , (8)for k ∈ K . We note that C ik in (8) is a continuous functionof the service agent’s pose ( x i s , θ i s ) . We introduce a binarydecision variable Z ik ∈ { , } , which is if agent i is assignedto region k and otherwise. With the right notation at handthen, our suboptimal deployment solution is given by { x i (cid:63) s ,θ i (cid:63) s , { Z (cid:63)ik } k ∈K } i ∈V s = arg min (cid:88) i ∈V s (cid:88) k ∈K C ik ( x i s , θ i s ) Z ik , (9) Z ik ∈ { , } , i ∈ V s , k ∈ K , (cid:88) k ∈K Z ik = 1 , ∀ i ∈ V s , (cid:88) i ∈V s Z ik = 1 , ∀ k ∈ K . Next, we introduce a set of manipulations that allows us toarrive at a distributed solution for solving (9). For each serviceagent i ∈ V s , we start by defining C (cid:63)ik = min x i s ,θ i s C ik ( x i s , θ i s ) , k ∈ K . (10)Given (10) and the fact that C ik depends on the pose of agent i only, it is straightforward to show that (9) can be written inthe equivalent form of Z (cid:63)ik = arg min (cid:88) i ∈V s (cid:88) k ∈K C (cid:63)ik Z ik , (11) Z ik ∈ { , } , i ∈ V s , k ∈ K , (cid:88) k ∈K Z ik = 1 , ∀ i ∈ V s , (cid:88) i ∈V s Z ik = 1 , ∀ k ∈ K . where ( x i (cid:63) s , θ i (cid:63) s ) for each service agent i ∈ V s is equal tominimizer ( x ik (cid:63) s , θ ik (cid:63) s ) of the k th (10) that corresponds to Z (cid:63)ik = 1 . The equivalent optimization representation (11) castsour suboptimal service agent assignment problem in the formof a discrete optimal mass transport problem [23]. In thisoptimal mass transport problem, the minimum value of (8)given in (10) can be viewed as the cost of assigning agent i to the k th subregion/basis of the GMM. In Section V-B, weshow that the mixed integer programming problem (11), in factcan be cast as a linear programming in continuous space, andthen solved in a distributed manner using existing optimizationalgorithms. Once each service agent obtains its assigned pose,we transport the agents to their assigned region by a finite-timeminimum energy control. In what follows, before presentingour equivalent linear programming representation of (11), wediscuss how we can obtain the minimizers of (10). Morespecifically, in Section V-A we show that the minimum valuefor each C ik happens at x ik (cid:63) s = µ k and the orientation θ ik (cid:63) s that makes the principal axis of the uncertainty ellipsesof the service distribution and the corresponding Gaussiandistribution are in parallel. A. Similarity assessment for Gaussian QoS distribution toGaussian basis subregion
Given the QoS distribution provided by agent i ∈ V s tobe ω i s N ( x | x i s , Σ i ( θ i s )) , where the mean of the Gaussiandistribution is at the agent’s location x i s and the covari-ance matrix is with principal (major) axis at angle θ i s , seeFig. 4. Hence, the covariance matrix can be decomposed into Σ i ( θ i s ) = R ( θ i s ) Λ i R ( θ i s ) (cid:62) , where R ( θ i s ) = (cid:104) cos θ i s − sin θ i s sin θ i s cos θ i s (cid:105) and Λ i = (cid:104) σ ix σ iy (cid:105) , in which σ ix , σ iy ∈ R > with σ ix ≥ σ iy are known service parameters determines the ‘shape’ of theservice agent i . Similarly, agent i ’s estimated covariancematrix Σ ik , for the k th subregion/basis of its estimated ˆ p ( x ) ,see (7), can be written as Σ ik ( θ ik ) = R ( θ ik ) Λ ik R ( θ ik ) (cid:62) , where Λ ik = (cid:20) σ ik,x σ ik,y (cid:21) , in which θ ik is the angle of principal(major) axis of the covariance matrix and σ ik,x , σ ik,y ∈ R > with σ ik,x ≥ σ ik,y are the variances in the major axis and minoraxis direction, respectively, see Fig. 4. With the right notationat hand, the theorem below gives a closed-form solution forthe minimizer ( x ik (cid:63) s , θ ik (cid:63) s ) of (10). Theorem 1.
Consider the optimization problem (10) . Then,one of the global minimizer of optimization (10) is ( x ik (cid:63) s , θ ik (cid:63) s ) = ( µ ik , θ ik ) , where θ ik is the angle of the principalaxis of Σ ik . Moreover, C (cid:63)ik = π ik (cid:16) ln π ik ω i s + 12 (cid:0) ln σ ix σ iy σ ik,x σ ik,y + σ ik,x σ iy + σ ik,y σ ix σ ix σ iy − (cid:1)(cid:17) . (12) Fig. 4:
The principal axis angle of (a) agent i ’s QoS Gaussiandistribution and (b) the k the subregion/basis of ˆ p i ( x ) . Proof.
We first note that since π ik and ω i s arefixed parameters, (10) is equivalent to minimize D KL (cid:0) N ( x | µ ik , Σ ik ) ||N ( x | x i s , Σ i ( θ i s )) (cid:1) . Invoking (1) wecan write D KL (cid:0) N ( x | µ ik , Σ ik ) ||N ( x | x i s , Σ i ( θ i s )) (cid:1)(cid:17) = 12 (cid:0) ln | Σ i ( θ i s ) || Σ ik | (cid:124) (cid:123)(cid:122) (cid:125) ( a ) + ( x i s − µ ik ) (cid:62) Σ i ( θ i s ) − ( x i s − µ ik ) (cid:124) (cid:123)(cid:122) (cid:125) ( b ) + tr( Σ i ( θ i s ) − Σ ik ) (cid:124) (cid:123)(cid:122) (cid:125) ( c ) − (cid:1) . (13)We note that, in (13), ( a ) = ln | R ( θ i s ) Λ i R (cid:62) ( θ i s ) || R ( θ ik ) Λ k R (cid:62) ( θ ik ) | = ln | Λ i || Λ k | = ln σ ix σ iy σ ik,x σ ik,y , thus ( a ) is a fix term and does not depend on the decisionvariable θ i s . Next, we note that ( b ) is the only term in (13)that depends on µ ik and x i s . For any value other than x i s = µ ik , ( b ) returns a positive value, which means that the minimumof (13) happens at x ik (cid:63) s = µ ik . Lastly, we note that ( c ) in (13)reads also as ( c ) = tr( R ( θ i s )( Λ i ) − R ( − θ i s + θ ik ) Λ k R ( θ ik ))= tr( R ( θ i s − θ ik )( Λ i ) − R ( − θ i s + θ ik ) Λ k ) . Now, let ¯ θ = θ i s − θ ik , s ¯ θ = sin(¯ θ ) and c ¯ θ = cos(¯ θ ) . Then, wecan write (c) as ( c ) = tr (cid:16) (cid:20) c ¯ θ − s ¯ θ s ¯ θ c ¯ θ (cid:21) (cid:34) σ ix σ iy (cid:35) (cid:20) c ¯ θ s ¯ θ − s ¯ θ c ¯ θ (cid:21) (cid:20) σ ik,x σ ik,y (cid:21) (cid:17) = ( σ ik,x σ iy + σ ik,y σ ix ) c ¯ θ + ( σ ik,x σ ix + σ ik,y σ iy ) s ¯ θσ ix σ iy . Let α = σ ik,x σ iy + σ ik,y σ ix and β = σ ik,x σ ix + σ ik,y σ iy . Then, ( c ) reduces ( c ) = α + ( β − α ) s ¯ θσ ix σ iy . Because σ ik,x ≥ σ ik,y and σ ix ≥ σ iy , we have β ≥ α and ( β − α ) s ¯ θ is non-negative. Hence, the global minimum of ( c ) is ασ ix σ iy = σ ik,x σ iy + σ ik,y σ ix σ ix σ iy which happens at ¯ θ (cid:63) = n π , n ∈{ , , · · · } , i.e., θ ik (cid:63) s = θ ik + n π , n ∈ { , , · · · } . To completethe proof, we note that n = 0 leads to one of the globalminimums θ ik (cid:63) s = θ ik . Given Theorem 1, if the optimization problem (11) allocatesservice agent i to the k th subregion/basis of ˆ p i ( x ) , the corre-sponding final pose of agent i will be x i (cid:63) s = µ ik , θ i (cid:63) s = θ ik . B. Distributed multi-agent assignment problem
The assignment optimization problem (11) is an integer op-timization problem. As it is known in the discrete optimalmass transport literature [23], by the convex relaxation [24],the integer optimization (11) can be transferred to the linearprogramming problem stated as follows: min Z ik ≥ (cid:88) i ∈V s (cid:88) k ∈K C (cid:63)ik Z ik (14)s.t. (cid:88) k ∈K Z ik = 1 , ∀ i ∈ V s , (cid:88) i ∈V s Z ik = 1 , ∀ k ∈ K . Since only agent i knows its own cost C (cid:63)ik for k ∈ K ,we are interested in solving optimization problem (14) in adistributed way. In general, problem (14) may exist severaloptimal solutions Z ik(cid:63) . We also require the agents agreedon the same optimal assignment plan. A distributed simplexalgorithm proposed by [16] can achieve this aim. We rewrite(14) to the standard form of linear programming min Z C (cid:63) T Z (15)s.t. AZ = b , Z ≥ . where b = N , Z = [ Z , · · · , Z N , Z , · · · , Z ,N , · · · , Z N · · · , Z NN ] (cid:62) , C (cid:63) = [ C (cid:63) , · · · , C (cid:63) N , C (cid:63) , · · · , C (cid:63) ,N , · · · , C (cid:63)N · · · , C (cid:63)NN ] (cid:62) , A = [ A , · · · , A N , A , · · · , A ,N , · · · , A N · · · , A NN ] , in which, A ik ∈ R N is a column vector with i -th and ( N + k ) -th entries are , and others are . A column of problem (15)is a vector h ik ∈ R N defined as h ik = [ C (cid:63)ik A (cid:62) ik ] (cid:62) .The set of all columns is denote by H = { h ik } i ∈V s ,k ∈K .Thus, the linear program (15) is fully characterized by thepair ( H , b ) . The information of H is distributed in the serviceagents. Let P i = { h ik } k ∈K is the problem column setknown by agent i ∈ V s , which satisfies H = ∪ Ni =1 P i and P i ∩ P j = ∅ , ∀ ( i, j ) ∈ V s . We assume the communicationgraph G s ( V s , E s ) of the service agents is connected. Hence thetuple ( G s , ( H , b ) , {P i } i ∈V s ) forms a distributed linear programthat can be solved by the distributed simplex algorithm [16].The result of the optimization problem (14) is the optimal plan Z (cid:63)ik , where Z (cid:63)ik = 1 means assigning the agent i to the k thsubregion with the optimal pose x i s (cid:63) = x ik s (cid:63) and θ i s (cid:63) = θ ik s (cid:63) .The last step in Stage 2 of our deployment solution isagents transportation to their corresponding assigned pose. Inpractice, local controllers are expected to complete this task.One such local controller can be the well-known minimumenergy control [25, page 138] that can transport the agents totheir respective assigned pose in finite time τ ∈ R > whilealso enabling the agents to save on transportation energy.Let the local dynamics of agent i (linearized) be given by Fig. 5:
The distribution of QoS of agent ˙ χ i ( t ) = A i χ i ( t ) + B i u i ( t ) , where u i ( t ) is the controlvector, and χ i is the state vector of agent i , which containsthe pose and possibly other states. We assume that ( A i , B i ) is controllable. Starting at an initial condition χ i ( t ) , theminimum energy control is given by u i ( t ) = B i (cid:62) e A i (cid:62) ( t + τ − t ) G i − ( χ i(cid:63) − e A i τ χ i ( t )) (16)for t ∈ [ t , t + τ ] , where χ i(cid:63) is agent i ’s desired statewhose pose component is set to ( x i s (cid:63) , θ i s (cid:63) ) , and G i is thecontrollability Gramian. Control (16) is a finite time con-trol that guarantees to drive the agent from it initial state χ i ( t ) to it desired state χ i(cid:63) in finite transportation time τ ,i.e., χ i ( t + τ ) = χ i (cid:63) . Stage 2 of our distributed deploymentis finished at t + τ . If the targets are dynamic, our two-stagedeployment process can repeat to re-position the service agentsin accordance to the changes in targets distribution.VI. D EMONSTRATIONS
Consider the the same setting in Section IV-A where a groupof agents with the connected communication graph G s used a distributed EM Algorithm 1 to obtain the parameters { π ik µ ik , Σ ik } k =1 of the Gaussian bases of the targets distribu-tion. The agents are equipped with a wireless sensor whichis used to detect events of interest that occurred with targets.A commonly used sensor model is a probabilistic functionconditioned on the sensor location and the event location [26],[27], i.e., Pr(
Detected | x i s , x t ) . For example in [27], given asensor location at x i s , i ∈ V s and an event happening at x t , theprobability of the sensor detecting the event is expressed as Pr(
Detected | x i s , x t ) = β i e − α i ( x i s − x t ) (cid:62) ( x i s − x t ) γi , where α i , β i , γ i are sensor i ’s parameters. In this case, the QoSof the sensor i at location x i s over the 2-D space x ∈ R can bedefined as Q ( x | x i s ) = Pr( Detected | x i s , x ) = z i N ( x | x i s , Λ i ) ,where z i = (cid:113) π | Λ i | β i , Λ i = (cid:104) σ i σ i (cid:105) , σ i = γ i α i . Inthis example, we consider a more general sensor model withanisotropic sensory capability, i.e. QoS is Q ( x | x i s , θ i s ) = z i N ( x | x i s , Σ i ( θ i s )) , with Σ i ( θ i s ) = R ( θ i s ) Λ i R (cid:62) ( θ i s ) , Λ i = (cid:104) σ ix σ iy (cid:105) , and θ i s is the orientation of sensor i . Lastly, thecollective density distribution of QoS provided by the 6sensors is q ( x |{ x i s , Σ i ( θ i s ) } i ∈V s ) = (cid:88) i ∈V s ω i s N ( x | x i s , Σ i ( θ i s )) where ω i s = z i (cid:80) Ni =1 z i . We set ω s = 0 . , σ x = 70 , σ y = 25 , ω s = 0 . , σ x = 30 , σ y = 15 , ω s = 0 . , σ x = 80 , σ y = 30 , ω s = 0 . , σ x = 30 , σ y = 30 , ω s = 0 . , σ x = 60 , σ y = 40 ,and ω s = 0 . , σ x = 30 , σ y = 30 . See Fig. 5 for a graphicaldemonstration of QoS of agent .Each agent i ∈ V s evaluates its costs C (cid:63)ik for all k ∈ K by (12).Then, the agents cooperatively solve the distributed multi-agent assignment problem (15) by the means of distributedsimplex algorithm [16]. The optimal assignment plan of (15)is Z (cid:63) = 1 , Z (cid:63) = 1 , Z (cid:63) = 1 , Z (cid:63) = 1 , Z (cid:63) = 1 , and Z (cid:63) = 1 .Suppose the service agents i ∈ V s , motion is described by aunicycle dynamics ˙ x i s ,x = v i cos θ i s , ˙ x i s ,y = v i sin θ i s , ˙ θ i s = ω i , (17)where [ x i s ,x x i s ,y ] (cid:62) = x i s , v i ∈ R and ω i ∈ R are are linearvelocity and angular velocity of each agent i , respectively.The agents execute the feedback linearization procedure [28]to achieve the equivalent linear dynamics ˙ χ i ˙ χ i ˙ χ i ˙ χ i = (cid:20) (cid:21)(cid:124) (cid:123)(cid:122) (cid:125) A χ i χ i χ i χ i + (cid:20) (cid:21)(cid:124) (cid:123)(cid:122) (cid:125) B (cid:104) u i u i (cid:105) , (18)by the change of variables χ i = x i s ,x , χ i = v i cos θ i s , χ i = x i s ,y , χ i = v i sin θ i s and the compensator ˙ v i = u i cos θ i s + u i sin θ i s ,ω i = u i cos θ i s − u i sin θ i s v i . (19)Finally, control (16) is applied to to drive the agents to theirdesired state χ i (cid:63) = [ x i (cid:63) s ,x v i (cid:63) cos θ i (cid:63) s x i (cid:63) s ,y v i (cid:63) sin θ i (cid:63) s ] (cid:62) corresponding to their assigned subregions and the optimalpose, i.e., x i (cid:63) s = µ ik and θ i (cid:63) s = θ ik if Z (cid:63)ik = 1 , where v i (cid:63) > is the arrival velocity which we can assign.The density distribution of QoS provided by the agents(sensors) after deployment is illustrated in Fig. 6, wherethe black circles are the targets, the colored dots representthe 6 agents and the blue color map is the collective QoSdistribution. The darker blue indicates the better QoS. We cansee that the collective QoS distribution is similar to the targetsdistribution. Hence, with the proposed two-stage deploymentstrategy, the distribution of QoS efficiently covers the targets.VII. C ONCLUSION
We proposed a two-stage distributed deployment solution formobile agents to efficiently cover a group of dense targets withtheir service, i.e., the resulting collective QoS distribution is asmuch similar to as possible to the target density distribution.In the first stage, we proposed the distributed consensus-based EM algorithm to enable all agents to estimate the targetdensity distribution by a GMM. In the second stage, we firstassessed the smallest KLD of each pair of agent’s GaussianQoS distribution and Gaussian basis subregion. Then, weformulated a distributed multi-agent assignment problem toallocate each service agent to a subregion by taking the
Fig. 6:
The QoS distribution provided by the 6 sensors. smallest KLD between the agent and the subregion as theassignment cost. As a result, the summation of the KLD ofeach assigned agent-subregion pair is minimal. We illustratedthe effectiveness of our proposed deployment solution usingan application example for event detection via mobile sensors.R
EFERENCES[1] M. Kumar, K. Cohen, and B. HomChaudhuri, “Cooperative controlof multiple uninhabited aerial vehicles for monitoring and fightingwildfires,”
Journal of Aerospace Computing, Information, and Commu-nication , vol. 8, no. 1, pp. 1–16, 2011.[2] A. Faustine, A. N. Mvuma, H. J. Mongi, M. C. Gabriel, A. J. Tenge,and S. B. Kucel, “Wireless sensor networks for water quality monitoringand control within lake victoria basin,” 2014.[3] S. Susca, F. Bullo, and S. Martinez, “Monitoring environmental bound-aries with a robotic sensor network,”
IEEE Transactions on ControlSystems Technology , vol. 16, no. 2, pp. 288–296, 2008.[4] J. Cortes, S. Martinez, T. Karatas, and F. Bullo, “Coverage controlfor mobile sensing networks,”
IEEE Transactions on robotics andAutomation , vol. 20, no. 2, pp. 243–255, 2004.[5] F. Bullo, J. Cortes, and S. Martinez,
Distributed control of roboticnetworks: a mathematical approach to motion coordination algorithms ,vol. 27. Princeton University Press, 2009.[6] G. Wang, G. Cao, and T. F. La Porta, “Movement-assisted sensordeployment,”
IEEE Transactions on Mobile Computing , vol. 5, no. 6,pp. 640–652, 2006.[7] L. C. Pimenta, V. Kumar, R. C. Mesquita, and G. A. Pereira, “Sensingand coverage for a network of heterogeneous robots,” in , pp. 3947–3952, IEEE, 2008.[8] A. Pierson, L. C. Figueiredo, L. C. Pimenta, and M. Schwager, “Adapt-ing to performance variations in multi-robot coverage,” in , pp. 415–420, IEEE, 2015.[9] O. Arslan and D. E. Koditschek, “Voronoi-based coverage control of het-erogeneous disk-shaped robots,” in , pp. 4259–4266, IEEE, 2016.[10] K. Laventall and J. Cort´es, “Coverage control by multi-robot networkswith limited-range anisotropic sensory,”
International Journal of Con-trol , vol. 82, no. 6, pp. 1113–1121, 2009.[11] F. Farzadpour, X. Zhang, X. Chen, and T. Zhang, “On performancemeasurement for a heterogeneous planar field sensor network,” in , pp. 166–171, IEEE, 2017.[12] A. Gusrialdi, S. Hirche, T. Hatanaka, and M. Fujita, “Voronoi basedcoverage control with anisotropic sensors,” in , pp. 736–741, IEEE, 2008.[13] M. Schwager, D. Rus, and J.-J. Slotine, “Decentralized, adaptive cover-age control for networked robots,”
The International Journal of RoboticsResearch , vol. 28, no. 3, pp. 357–375, 2009. [14] A. Carron, M. Todescato, R. Carli, L. Schenato, and G. Pillonetto,“Multi-agents adaptive estimation and coverage control using gaussianregression,” in , pp. 2490–2495, IEEE, 2015.[15] F. Sharifi, M. Mirzaei, Y. Zhang, and B. W. Gordon, “Cooperativemulti-vehicle search and coverage problem in an uncertain environment,”
Unmanned systems , vol. 3, no. 01, pp. 35–47, 2015.[16] M. B¨urger, G. Notarstefano, F. Bullo, and F. Allg¨ower, “A distributedsimplex algorithm for degenerate linear programs and multi-agent as-signments,”
Automatica , vol. 48, no. 9, pp. 2298–2304, 2012.[17] D. J. MacKay,
Information theory, inference and learning algorithms .Cambridge university press, 2003.[18] J. R. Hershey and P. A. Olsen, “Approximating the kullback leibler di-vergence between gaussian mixture models,” in ,vol. 4, pp. IV–317, IEEE, 2007.[19] F. Bullo, J. Cort´es, and S. Mart´ınez,
Distributed Control of RoboticNetworks . Applied Mathematics Series, Princeton University Press,2009.[20] Y.-F. Chung and S. Kia, “Dynamic active average consensus and itsapplication in containment control,” arXiv:2008.05722 , 2020.[21] J. R. Hershey and P. A. Olsen, “Approximating the kullback leibler di-vergence between gaussian mixture models,” in ,vol. 4, pp. IV–317, IEEE, 2007.[22] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihoodfrom incomplete data via the em algorithm,”
Journal of the RoyalStatistical Society: Series B , vol. 39, no. 1, pp. 1–22, 1977.[23] G. Peyr´e and M. Cuturi, “Computational optimal transport,”
Foundationsand Trends in Machine Learning , vol. 11, no. 5-6, pp. 355–607, 2019.[24] D. P. Bertsekas,
Network optimization: continuous and discrete models .Athena Scientific Belmont, MA, 1998.[25] F. Lewis, D. Vrabie, and V. Syrmos,
Optimal Control . Wiley, 2012.[26] F. Bourgault, T. Furukawa, and H. F. Durrant-Whyte, “Optimal searchfor a lost target in a bayesian world,” in
Field and service robotics ,pp. 209–222, Springer, 2003.[27] H. Xiao, R. Cui, and D. Xu, “A sampling-based bayesian approach forcooperative multiagent online search with resource constraints,”
IEEETransactions on Cybernetics , vol. 48, no. 6, pp. 1773–1785, 2017.[28] G. Oriolo, A. D. Luca, and M. Vendittelli, “WMR control via dynamicfeedback linearization: design, implementation, and experimental valida-tion,”