[PDF] Multi-Sensor Control for Multi-Object Bayes Filters

Abstract

Sensor management in multi-object stochastic systems is a theoretically and computationally challenging problem. This paper presents a novel approach to the multi-target multi-sensor control problem within the partially observed Markov decision process (POMDP) framework. We model the multi-object state as a labeled multi-Bernoulli random finite set (RFS), and use the labeled multi-Bernoulli filter in conjunction with minimizing a task-driven control objective function: posterior expected error of cardinality and state (PEECS). A major contribution is a guided search for multi-dimensional optimization in the multi-sensor control command space, using coordinate descent method. In conjunction with the Generalized Covariance Intersection method for multi-sensor fusion, a fast multi-sensor algorithm is achieved. Numerical studies are presented in several scenarios where numerous controllable (mobile) sensors track multiple moving targets with different levels of observability. The results show that our method works significantly faster than the approach taken by a state of art method, with similar tracking errors.

Full PDF

XXXXXXX, VOL. XX, NO. XX, MMMM YYYY 1

Multi-Sensor Control for Multi-Object BayesFilters

Xiaoying Wang, Reza Hoseinnezhad ∗ , Amirali K. Gostar, Tharindu Rathnayake, Benlian Xu andAlireza Bab-Hadiashar Abstract

Sensor management in multi-object stochastic systems is a theoretically and computationally challenging problem.This paper presents a novel approach to the multi-target multi-sensor control problem within the partially observedMarkov decision process (POMDP) framework. We model the multi-object state as a labeled multi-Bernoulli randomﬁnite set (RFS), and use the labeled multi-Bernoulli ﬁlter in conjunction with minimizing a task-driven controlobjective function: posterior expected error of cardinality and state (PEECS). A major contribution is a guided searchfor multi-dimensional optimization in the multi-sensor control command space, using coordinate descent method. Inconjunction with the Generalized Covariance Intersection method for multi-sensor fusion, a fast multi-sensor algorithmis achieved. Numerical studies are presented in several scenarios where numerous controllable (mobile) sensors trackmultiple moving targets with different levels of observability. The results show that our method works signiﬁcantlyfaster than the approach taken by a state of art method, with similar tracking errors.

Index Terms partially observed Markov decision process, multi-target tracking, random ﬁnite sets, labeled multi-Bernoulli ﬁlter,coordinate descent.

I. I

NTRODUCTION

Multi-object multi-sensor management/control is a challenging optimal nonlinear control problem focused ondirecting multiple sensors to obtain most informative measurements for the purpose of multi-object ﬁltering [1]. Thisproblem is different from classical control problems as the overall controlled system is a highly complex stochasticmulti-object system, where not only the number of objects vary randomly in time, but also the measurements returnedby each sensor are subject to missed detections and false alarms. Indeed, the multi-object state and multi-objectobservations are inherently ﬁnite-set-valued, and standard optimal control techniques are not directly applicable.In stochastic multi-object systems, we can still cast the multi-object multi-sensor control problem as a partiallyobserved Markov decision process (POMDP), where the states and observations are instead ﬁnite-set-valued, and ∗ Corresponding Author; email: [email protected]. Wang and B. Xu are with Changshu Institute of Technology, Changshu, Soochow, P. R. of China. X. Wang is also a visiting researcherat School of Engineering, RMIT University, Victoria, Australia.R. Hoseinnezhad, A. K. Gostar, T. Rathnayake and A. Bab-Hadiashar are with School of Engineering, RMIT University, Victoria, Australia.Manuscript received mmmm dd, yyyy.

October 11, 2018 DRAFT a r X i v : . [ c s . S Y ] F e b XXXXX, VOL. XX, NO. XX, MMMM YYYY 2 control vectors are drawn from a set of admissible sensor actions based on the current information states, whichare then assessed against the values of an objective function associated with each multi-sensor action [2]. In thisframework, a solution would include three major steps: (1) modeling the overall system as a stochastic multi-objectsystem, (2) devising a tractable (accurate or approximate) way to propagate the multi-object posterior, and (3)solving an optimization problem to ﬁnd the multi-sensor control command, according to an objective function. Thispaper presents a formulation of the multi-sensor control problem as a POMDP with ﬁnite-set-valued states andmeasurements, a labeled random set ﬁlter used to propagate the multi-object posterior, and a task-driven objective(cost) function.To our knowledge, the problem of multi-sensor control for labeled random set ﬁlters is only recently consideredby Meng et al. [3]. In this method, local Vo-Vo ﬁlters are operating at each sensor node, and the resulting Vo-Vodensities (posteriors) are fused using the Generalized Covariance Intersection (GCI) rule as formulated in [6]. Theapproach opted by Meng et al. [3] to solve the multi-sensor control problem is an exhaustive search scheme, inwhich the objective function is computed for all possible combinations of sensor control actions. This approachworks well for a few sensors only, but in presence of numerous sensors, may become computationally intractable.The major contribution of this paper is the introduction of a guided search to solve the multi-dimensionaldiscrete optimization problem embedded in multi-sensor control. We avoid the curse of dimensionality by usingan accelerated scheme inspired by the coordinate descent method [7]. This leads to signiﬁcant improvement inthe runtime of the algorithm and its real-time feasibility, especially in presence of numerous sensors. Anothercontribution is the detailed sequential Monte-Carlo (SMC) implementation of the proposed multi-sensor controlframework with Labeled Multi-Bernoulli (LMB) ﬁlters running in each sensor node. The novel idea inherent inthe proposed SMC implementation is that sensor control and the actual ﬁlters are all implemented using the sameparticles, hence substantial savings are achieved in terms of memory and computational requirements. We alsoexperimentally analyse the computational complexity of the proposed method and demonstrate that it varies almostquadratically with the number of controlled sensors (polynomial complexity). This is while an exhaustive searchsimilar to the one used in [3] has exponential (hence, non-polynomial) complexity.Extensive simulation studies involving numerous controllable sensors demonstrate that our method returns ac-ceptable tracking results quantiﬁed in terms of OSPA error values [8]. Indeed, in comparison to the state of art(running exhaustive search in an approach similar to [3]), the proposed multi-sensor control method returns similartracking errors but converges signiﬁcantly faster.The organization of the paper is as follows. Section II presents a formalized statement of the multi-sensor controlproblem in POMDP framework and sets out the background and design requirements for various components ofthe framework. The proposed multi-sensor control solution is then presented in section III, outlining the generalframework and proposed choices for its components, as well as a step-by-step algorithm for the SMC implementation.Simulation results are presented in section IV. Section V concludes the paper. The authors of [4] originally called their ﬁlter the delta-Generalized Multi-Bernoulli ( δ − GLMB) ﬁlter. In this work, we follow the simplername suggested by R. Mahler in his book [5].

October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 3

II. P

ROBLEM S TATEMENT

Consider a stochastic multi-object system, in which at any discrete (or sampling) time k , the multi-object state X k is a labeled random ﬁnite set (RFS) comprised of a random number of single-object states, X k = { ( x ,k , (cid:96) ) , . . . , ( x n k ,k , (cid:96) n k ) } ∈ F ( X × L ) (1)where X and L denote the state and label spaces, respectively, and F ( · ) means “all ﬁnite subsets of.”The system is modeled as a one-step-ahead Markovian process which is characterized by a transition density f ( X k | X k − ) . A practical approximation for the process can be formulated based on assuming that while transitingfrom time k − to time k , each existing object x independently continues to exist with a survival probability p S ( x ) and single-object transition density f k | k − ( ·| x ) , and a number of new objects are born according to a given RFSdensity.At each time k , the multi-object state is partially observed by a network of n s sensors, each returning a setof measurements (called detections or point measurements). Let Z i be the measurement set returned by the i -thsensor, s i , ( i = 1 : n s ) . Denoting the space of point measurements by Z , the space of measurement sets will be Z = F ( Z ) . Each sensor s i can be controlled (e.g. translated, rotated) according to a sensor command u i ∈ U where U is a ﬁnite space of sensor commands. The cumulated measurement is an n s -tuple of measurement sets, Z k = ( Z , . . . , Z n s ) ∈ Z n s . (2)The relationship between the multi-sensor measurement and the multi-object state is stochastically modeled by themulti-object likelihood function g ( Z k | X k , u ) , where u = ( u , · · · , u n s ) ∈ U n s is the multi-sensor command. The likelihood function is usually modeled in terms of a single-object likelihood g ( z | x , u ) , a state-dependent detection probability p D ( x ) and assuming a Poisson process for the number of falsealarms which together are modeled as a Poisson RFS characterized by an intensity function κ ( · ) .The multi-sensor control problem can be formally cast in the framework of the following 6-tuple discrete-timePOMDP: Ψ = { X , U n s , f ( ·|· ) , Z n s , g ( ·|· , u ) , ν ( u ; · ) } (3)where ν ( u ; · ) is an objective function that associates a reward or cost with a choice of multi-sensor control command u = ( u , . . . , u n s ) given the recent multi-object state X k − or its statistical characteristics. In a one-step-aheadmulti-sensor control solution, the aim is to ﬁnd the multi-sensor command, u ∗ = ( u ∗ , . . . , u ∗ n s ) that satisﬁes ( u ∗ , . . . , u ∗ n s ) = arg min / max ( u ,...,u ns ) ∈ U ns ν ( u , . . . , u n s ; X k − ) . (4) October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 4

A. Multi-target Bayes ﬁlter within POMDP

Given the POMDP with the components given in (3), the probability density of multi-object state of the systemcan be recursively estimated by a multi-target Bayes ﬁlter. Let us denote the density of multi-object state at time k by π k ( X k | Z k ) , where Z k denotes the ensemble of all multi-sensor measurements accumulated up to time k .In a Bayesian ﬁltering scheme, the density is recursively propagated through two steps: prediction and update [4],[5]. The predicted density is computed by the multi-object Chapman-Kolmogorov equation: π k | k − ( X k | Z k − ) = (cid:82) π k − ( X k − | Z k − ) f ( X k | X k − ) δ X k − . (5)With the arrival of new observations Z k = ( Z , . . . , Z n s ) from the sensors controlled by a multi-sensor action u k ,a posterior density is obtained using multi-object Bayes’ rule: π k ( X k | Z k ) = g ( Z k | X k , u k ) π k | k − ( X k ) (cid:82) g ( Z k | X , u k ) π k | k − ( X ) δ X . (6) Remark 1:

Given the posterior recursion (5) and (6), the objective function component of the POMDP in (3), isusually deﬁned as a function of the probability density of the multi-object state, and the optimization componentof the multi-sensor control framework is expressed as u ∗ = arg min / max u ∈ U ns ν ( u ; π k − ( X k − ) . (7) Remark 2:

The integrals in (5) and (6) are set integrals as deﬁned in [5]. The recursion (5) and (6) has no analyticsolution in general. An SMC implementation of the Bayes multi-object ﬁlter (with RFS states without labels) isgiven in [9]. However, this technique is computationally prohibitive which at best is able to accommodate a smallnumber of targets. This SMC implementation of the multi-object Bayes ﬁlter was employed by the multi-targetsensor control algorithm proposed in [10].Due to general intractability of propagation of the full posterior density given by (5) and (6), several alternativeshave been proposed which are designed to propagate important statistics or parameters instead of the full posterior.Well-known examples of such ﬁlters are probability hypothesis density (PHD) ﬁlter and its cardinalized version(CPHD) [5], and the multi-Bernoulli ﬁlter and its cardinality-balanced version (CB-MeMBer) [11]. In a series ofworks [12], [13], [14], [15], [16], [17], various implementations of these ﬁlters such as SMC and track-before-detect (TBD) were introduced, as well as a robust version of multi-Bernoulli ﬁlter. These methods can not generatetarget tracks (using labels) in a rigorously mathematical way, and are usually applied in conjunction with a labelmanagement strategy [16], [14].Since 2010, a series of random set ﬁlters have been developed, in which the multi-object random state includeslabel. The labeled random ﬁnite sets were shown to admit conjugacy of a particular form of prior density (theVo-Vo density) with the general multiple point measurement set likelihood [18]. Following this result, the Vo-Voﬁlter was introduced [4], [19]. Variants of the Vo-Vo ﬁlter such as the labeled multi-Bernoulli (LMB) ﬁlter [20]and M- δ -GLMB ﬁlter [21] were also proposed and applied in various applications. October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 5

The proposed multi-sensor control framework can be implemented with different multi-object ﬁlters. For the sakeof completion and presenting a step-by-step pseudocode, we have chosen to implement our method with the LMBﬁlter.

B. Objective function

The choice of objective function ν ( u ; π k − ( · )) is a critical part of the control solution design task. The objectivefunctions commonly used in sensor control solutions in the stochastic signal processing and control literature, canbe generally divided into two types: information-driven and task-driven. The information-driven reward functionquantiﬁes the expected information gain from prior to posterior after a hypothesized sensor control action. Forexample, R´enyi divergence was usedby Ristic et al. [10], [22] for sensor control with random set ﬁlters in general [10]and PHD ﬁlters in particular [22]. Recently, in a number of works, the Cauchy-Schwarz divergence has been adoptedas the reward function [23], [24], [25].The task-driven cost functions are usually formulated in terms of the expected error of estimation. Examplesof such cost functions include the MAP estimate of cardinality variance [26], statistical mean of cardinalityvariance [27], posterior expected error of cardinality and states (PEECS) [28], [29], [30] and statistical meanof the OSPA error [31]. A general discussion and comparison between task-driven and information-driven objectivefunctions for sensor management is presented in [32].In the multi-sensor control framework proposed in this paper, we use PEECS as the objective (cost) function.The rationale behind this choice is that while computing PEECS can be faster than the common divergencefunctions, comparable or better tracking accuracies can be achieved via minimizing PEECS as the sensor controlcost function [29], [30]. C. Sensor fusion and optimal control

In presence of multiple sensors (or sensor nodes in a sensor network), usually a multi-object Bayes ﬁlter runsat each node and the local posteriors need to be fused. The Generalized Covariance Intersection (GCI) rule hasbeen widely used for consensus-based fusion of multiple multi-object densities of various forms. Examples includethe fusion of Poisson multi-object posteriors of multiple local PHD ﬁlters [33], i.d.d. clusters densities of severallocal CPHD ﬁlters [34], multi-Bernoulli densities of local multi-Bernoulli ﬁlters [35], and LMB or Vo-Vo densitiesof several local LMB or Vo-Vo ﬁlters [6]. The problem of multi-sensor control for labeled random set ﬁlters isrecently considered by Meng et al. [3]. In this method, local Vo-Vo ﬁlters are operating at each sensor node, andthe resulting Vo-Vo densities (posteriors) are fused using the GCI-rule (as formulated in [6]).The common underlying assumption for solving the multi-sensor control problem is that in an exhaustive search scheme, the objective function is computed for all possible combinations of sensor control actions u = ( u , . . . , u n s ) .This approach works well for a relatively small number of sensors. For instance, the case study presented in thework of Meng et al. [3] involves only two sensors. In presence of numerous sensors, their combined controlbecomes computationally intractable if implemented via an exhaustive search. Indeed, the computational cost ofoverall multi-sensor control procedure will grow exponentially with the number of sensors. Our framework includes October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 6 a guided search method that solves the optimization problem without the need for an exhaustive search and can beutilized to simultaneously control numerous sensors.III. M ULTI -S ENSOR C ONTROL F RAMEWORK

Given the system model presented in section II, an effective design for a multi-sensor control framework ispresented in this section. We ﬁrst outline an overview of the general components and steps involved in our proposedapproach. Having the big picture in mind, we then present the details of various components as implemented inour experiments.Let us assume that at each time k , the fused prior from the previous step, ˜ π k − , is processed through the predictionstep of the Bayes ﬁlter. A multi-object set estimate, ˆ X k | k − is then extracted from the predicted density and usedto compute predicted ideal measurement sets (PIMS) [28], [29], [5] for each sensor node and each possible controlcommand applied to that node, denoted by { Z i ( u ) } u ∈ U for sensor i .In the next step, at each sensor node, a pseudo update is performed using each PIMS associated with a controlcommand. The resulting pseudo posteriors are then processed by an optimization module to output an optimal setof control commands. The control actions are then applied to the sensors (for instance, they are displaced or rotatedaccording to the chosen action command) following which, the measurement sets Z , . . . , Z n s are acquired from thesensors. Using those measurement sets, the predicted multi-object density is locally updated in each sensor node,then the local posteriors are fused using a fusion rule such as the GCI-rule. The fused posterior is post-processed( e.g. low weight components are pruned or particles are resampled). The resulting posterior is then used as priorin the next time step. A. Labeled Multi-Bernoulli ﬁlter

The notion of Labeled Multi-Bernoulli (LMB) RFS was introduced for the ﬁrst time in [18], with the LMBﬁlter recursion further developed in [20]. The LMB distribution is completely described by its components π = { ( r ( (cid:96) ) , p ( (cid:96) ) ( · )) } (cid:96) ∈ L where r ( (cid:96) ) is the probability of existence of an object with label (cid:96) ∈ L , and p ( (cid:96) ) ( x ) is theprobability density of the object’s state x ∈ X conditional on its existence. The LMB RFS density is given by π ( X ) = ∆( X ) w ( L ( X )) [ p ] X , (8)where L ( X ) is the set of all labels extracted from labeled states in X , and ∆( X ) (cid:44)  | X | = |L ( X ) | , (9)in which | · | means “the cardinality of”, and [ p ] X (cid:44) (cid:89) ( x,(cid:96) ) ∈ X p ( (cid:96) ) ( x ) , (10)and w ( L ) = (cid:89) i ∈ L (cid:16) − r ( i ) (cid:17) (cid:89) (cid:96) ∈ L L ( (cid:96) ) r ( (cid:96) ) (1 − r ( (cid:96) ) ) (11) October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 7 is the probability of joint existence of all objects with labels (cid:96) ∈ L and non-existence of all other labels [20].In a Bayes multi-object ﬁlter, suppose that the prior is an LMB with parameters { ( r ( (cid:96) ) , p ( (cid:96) ) ( · )) } (cid:96) ∈ L . In an SMCimplementation, the density function of each component with label ( (cid:96) ) is approximated by J ( (cid:96) ) particles and weights, p ( (cid:96) ) ( x ) ≈ J ( (cid:96) ) (cid:88) j =1 w ( (cid:96) ) j δ ( x − x ( (cid:96) ) j ) (12)where δ ( · ) is the Dirac delta function.In the prediction step of an LMB ﬁlter, the LMB prior is turned into the following new LMB density with evolvedparticles and probabilities of existence including the LMB birth components: π + = (cid:110)(cid:16) r ( (cid:96) )+ ,S , p ( (cid:96) )+ ,S (cid:17)(cid:111) (cid:96) ∈ L ∪ (cid:110)(cid:16) r ( (cid:96) ) B , p ( (cid:96) ) B (cid:17)(cid:111) (cid:96) ∈ B (13)where r ( (cid:96) )+ ,S = η S ( (cid:96) ) r ( (cid:96) ) (14) p ( (cid:96) )+ ,S = (cid:104) p S ( · , (cid:96) ) f ( x |· , (cid:96) ) , p ( (cid:96) ) ( · ) (cid:105) /η S ( (cid:96) ) (15)and η S ( (cid:96) ) = (cid:104) p S ( · , (cid:96) ) , p ( (cid:96) ) ( · ) (cid:105) . (16)Let us denote the predicted LMB parameters by { r ( (cid:96) )+ , { w ( (cid:96) )+ j , x ( (cid:96) )+ j } J ( (cid:96) )+ j =1 } (cid:96) ∈ L + where L + = L ∪ B . Note that in aboveequations, (cid:104) f, g (cid:105) (cid:44) (cid:90) X f ( x ) g ( x ) dx denotes inner product of two functions. Remark 3:

As part of the multi-sensor control framework, a multi-object state estimate needs to be computedfrom the predicted density. A maximum a posteriori (MAP) estimate for the number of objects can be found fromcardinality distribution, ˆ n = arg max n ρ ( n )= arg max n ρ (0) (cid:80) L ⊆ L , | L | = n (cid:18)(cid:81) (cid:96) ∈ L r ( (cid:96) )+ − r ( (cid:96) )+ (cid:19) . (17)where ρ (0) = (cid:81) (cid:96) ∈ L (1 − r ( (cid:96) )+ ) . Given the number of objects, we ﬁnd the ˆ n labels with highest probabilities ofexistence. For each label, an expected a posteriori (EAP) state estimate is given by ˆ x ( (cid:96) ) pseudo = J ( (cid:96) )+ (cid:88) j =1 w ( (cid:96) )+ j x ( (cid:96) )+ j (18)and the set of all estimates is denoted by ˆ X pseudo . The subscript “pseudo” is used because the estimates are resultedfrom the predicted, and not the updated, density.Assume that at a sensor node i , the control command u ∈ U is applied, and a measurement set denoted by Z i is acquired. Let us denote the updated LMB by π i,u ( ·| Z i ) = (cid:110)(cid:16) r ( (cid:96) ) i,u , p ( (cid:96) ) i,u ( · ) (cid:17)(cid:111) (cid:96) ∈ L + . October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 8

According to LMB update equations derived in [20], the parameters of the above density are given by: r ( (cid:96) ) i,u = (cid:88) ( I + ,θ ) ∈F ( L + ) × Θ I + w ( I + ,θ ) ( Z ) 1 I + ( (cid:96) ) (19) p ( (cid:96) ) i,u ( x ) = 1 r ( (cid:96) ) (cid:88) ( I + ,θ ) ∈F ( L + ) × Θ I + w ( I + ,θ ) ( Z ) 1 I + ( (cid:96) ) p ( θ ) ( x, (cid:96) ) (20)where w ( I + ,θ ) ( Z ) ∝ w + ( I + )[ η ( θ ) Z ] I + (21) p ( θ ) ( x, (cid:96) ) = p ( (cid:96) )+ ( x ) ψ Z ( x, (cid:96) ; θ ) η ( θ ) Z ( (cid:96) ) (22) η ( θ ) Z ( (cid:96) ) = (cid:104) p ( (cid:96) )+ ( x ) , ψ Z ( x, (cid:96) ; θ ) (cid:105) (23) ψ Z ( x, (cid:96) ; θ ) =  p D ( x,(cid:96) ) g ( z θ ( (cid:96) ) | x,(cid:96) ) κ ( z θ ( (cid:96) ) ) , if θ ( (cid:96) ) > − p D ( x, (cid:96) ) , if θ ( (cid:96) ) = 0 (24)and Θ I + is the space of mappings θ : I + → { , , . . . , | Z |} such that θ ( i ) = θ ( i (cid:48) ) > implies i = i (cid:48) , and theweight term, w + ( I + ) , is given by: w + ( I + ) = (cid:89) i ∈ L + (cid:16) − r ( i )+ (cid:17) (cid:89) (cid:96) ∈ I + L + ( (cid:96) ) r ( (cid:96) )+ − r ( (cid:96) )+ . (25) B. Sensor fusion

During the update step of LMB ﬁlter, the particles do not change, and only their weights evolve. Hence, x ( (cid:96) ) i,u,j = x (cid:96) + j . In other words, all the updated LMB posteriors will have the same particles but with different weights andexistence probabilities. This makes the fusion of the posteriors generated at each sensor straightforward.For sensor fusion purposes, we use the GCI-rule as derived in [6], [3] for fusion of multiple LMB densities.For each multi-sensor command candidate u = ( u , . . . , u n s ) ∈ U n s , the corresponding posteriors are LMB’swith parameters (cid:110) { ( r ( (cid:96) ) i,u i , p ( (cid:96) ) i,u i ( · )) } (cid:96) ∈ L (cid:111) n s i =1 where each density is approximated by the same particles but differentweights, p ( (cid:96) ) i,u i ( x ) ≈ J ( (cid:96) )+ (cid:88) j =1 w ( (cid:96) ) i,u i ,j δ ( x − x ( (cid:96) )+ ,j ) . (26)The GCI-rule returns the following fused existence probabilities and densities: r ( (cid:96) ) u = (cid:82) (cid:81) n s i =1 (cid:16) r ( (cid:96) ) i,u i p ( (cid:96) ) i,u i ( x ) (cid:17) ω i dx (cid:81) n s i =1 (cid:16) − r ( (cid:96) ) i,u i (cid:17) ω i + (cid:82) (cid:81) n s i =1 (cid:16) r ( (cid:96) ) i,u i p ( (cid:96) ) i,u i ( x ) (cid:17) ω i dx (27) p ( (cid:96) ) u ( x ) = (cid:81) n s i =1 (cid:16) p ( (cid:96) ) i,u i ( x ) (cid:17) ω i (cid:82) (cid:81) n s i =1 (cid:16) p ( (cid:96) ) i,u i ( x ) (cid:17) ω i dx (28)where ω i is a constant weight indicating the strength of our emphasis on sensor s i in the fusion process. Theseweights should be normalized, i.e. (cid:80) n s i =1 ω i = 1 . In our simulation studies, we assumed that all sensor nodes haveequal priority, and used the values ω i = n s . October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 9

Substituting each density with its particle approximation (26) turns the integrals to weighted sums over theparticles. It is here that sharing the same particles between all the densities becomes instrumental for computationof fused parameters. The fused existence probability is given by: r ( (cid:96) ) u = (cid:80) J ( (cid:96) )+ j =1 (cid:81) n s i =1 (cid:16) r ( (cid:96) ) i,u i w ( (cid:96) ) i,u i (cid:17) ω i (cid:81) n s i =1 (cid:16) − r ( (cid:96) ) i,u i (cid:17) ω i + (cid:80) J ( (cid:96) )+ j =1 (cid:81) n s i =1 (cid:16) r ( (cid:96) ) i,u i w ( (cid:96) ) i,u i (cid:17) ω i . (29)The fused densities also take the form of weighted sum of Dirac deltas: p ( (cid:96) ) u ( x ) = J ( (cid:96) )+ (cid:88) j =1 w ( (cid:96) ) u ,j δ ( x − x ( (cid:96) )+ j ) (30)where w ( (cid:96) ) u ,j = (cid:81) n s i =1 (cid:16) w ( (cid:96) ) i,u i ,j (cid:17) ω i (cid:80) J ( (cid:96) )+ j =1 (cid:81) n s i =1 (cid:16) w ( (cid:96) ) i,u i ,j (cid:17) ω i (31)is the fused weight of each particle in the fused pseudo-posterior. Remark 4:

To maintain tractability, LMB components with extremely small existence probabilities should bepruned. This is performed after GCI-fusion of the posteriors.

C. Objective function

For the objective function, we chose the task-driven cost function termed PEECS in [29]. It returns a linearcombination of the cardinality and state estimation errors which are quantiﬁed by computing the variance ofcardinality and weighted sum of single-object variances, respectively. Consider the fused LMB posterior parametrizedby π u = (cid:110) ( r ( (cid:96) ) u , p ( (cid:96) ) u ( · )) (cid:111) (cid:96) ∈ L + where p ( (cid:96) ) u ( x ) = (cid:80) J ( (cid:96) )+ j =1 w ( (cid:96) ) u ,j δ ( x − x ( (cid:96) )+ j ) . The PEECS cost associated with the multi-sensor control choice u is then given by: ν ( u ; π u ) = η(cid:15) | X | + (1 − η ) (cid:15) X (32)where η is a user-deﬁned parameter representing the level of emphasis on desired accuracy of number of targetsversus accuracy of state estimates, and (cid:15) | X | = (cid:88) (cid:96) ∈ L + r ( (cid:96) ) u (1 − r ( (cid:96) ) u ) ; (cid:15) X = (cid:80) (cid:96) ∈ L + r ( (cid:96) ) u σ X ( (cid:96) ) (cid:80) (cid:96) ∈ L + r ( (cid:96) ) u (33)in which we have: σ X ( (cid:96) ) = J ( (cid:96) )+ (cid:88) j =1 w ( (cid:96) ) u ,j (cid:16) x ( (cid:96) )+ j (cid:17) −  J ( (cid:96) )+ (cid:88) j =1 w ( (cid:96) ) u ,j x ( (cid:96) )+ j  . (34) D. Optimization for multi-sensor control

The ﬁnal step to solve the control problem in POMDP framework is to ﬁnd the optimum point of the objectivefunction. The common approach, which is usually tractable with few sensors, is based on an exhaustive grid searchin the discrete multi-sensor control command space. In this approach, for all possible n s -tuples u = ( u , . . . , u n s ) ∈ U n s October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 10 an ideal measurement set (PIMS) [36] is synthetically generated from the prediction at each sensor node, and usingthe PIMS, a local LMB update is run to create a pseudo-posterior. For each possible multi-sensor control command u , the corresponding pseudo posteriors are fused and the objective function is computed. The optimal sensor controldecision is then given by: ( u ∗ , . . . , u ∗ n s ) = arg min ( u ,...,u ns ) ∈ U ns ν ( u , . . . , u n s ; π k | k − ) (35)where ν ( u , . . . , u n s ; π k | k − ) denotes the objective function computed from the fused pseudo-posterior via updatingthe predicted density π k | k − if control actions ( u , . . . , u n s ) are applied. Note that in the above equation (and therest of this paper), the objective function is assumed to be a cost. If it is a reward, its optimization would requiremaximization.The above search requires the “fusion of pseudo-posteriors followed by computation of the objective function”to be repeated for | U | n s times where | U | denotes the cardinality of single sensor control commands space U . Thecomputational cost increases exponentially with the number of sensors, and becomes intractable when a relativelylarge number of sensors are involved.We propose a guided search routine inspired by the coordinate descent method that signiﬁcantly accelerates theoptimization process and makes it suitable for real time implementation. Our guided search is an iterative coordinatedescent type method with random initializations. Coordinate descent algorithms are well-known for their simplicity,computational efﬁciency and scalability. An overview of coordinate descent algorithms for various optimizationproblems with different constraints is presented in [37]. These algorithms are derivative-free and perform a linesearch along one coordinate direction at the current point in each iteration and use different coordinate directionscyclically to ﬁnd a local optimum point.Coordinate descent provides a sub-optimal solution with non-differentiable objective functions. [7] considersconvergence of coordinate descent methods to a stationary, but not necessarily minimum, point for objective functionsthat include a non-differentiable and separate contribution (also called “non-smooth part”). In a later work, Spall [38]analyzes the convergence of more general seasaw processes for optimization and identiﬁcation, showing that underreasonable conditions, the cyclic scheme converges to the optimal joint value for the full vector of unknownparameters (sensor commands, in the context of our work).To ﬁnd the best n s -tuple of control commands u = ( u , . . . , u n s ) in the n s -dimensional command space U n s ,our guided search starts with random initialization of control commands, denoted by u = ( u , . . . , u n s ) . We thensolve the optimization problem u = arg min u ν ( u, u , . . . , u n s ; π k | k − ) (36)via exhaustive search in the space of coordinates associated with sensor 1. We replace the candidate multi-sensorcontrol action with ( u , u , . . . , u n s ) . Repeating the one-dimensional search for all the other coordinates associatedwith different sensors, our candidate turns into u = ( u , . . . , u n s ) . We repeat this cycle over to obtain the nextcandidates u , u , . . . until convergence, i.e. until we ﬁnd n for which u n − = u n . When used in such an iterative(cyclic) routine, the search is proven to converge in ﬁnite time [39]. October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 11

The converged n s -tuple can be a local optimum. Hence, we need to repeat the process with multiple randominitializations and choose the best candidate as the multi-sensor control command, u ∗ . The required number ofrepeated convergence with random initializations depends on the number of local optima and the desired chanceof success. If there are M local optima, in the worst case scenario they all have basins of attraction with the samehyper-volume, i.e. each basin of attraction is comprised of M of all the points in U n s . Thus, the chance of arandomly initialized search converging to the global optimum will be M . With N random initializations, the totalchance of success is P success = 1 − (1 − M ) N . Hence, the required number of random initializations is given by N = (cid:24) log(1 − P success )log(1 − M ) (cid:25) (37)where (cid:100)·(cid:101) means rounding up to the next integer.In our experiments, choosing the number of local optima at M = 2 n s led to sufﬁcient random initializations forsatisfactory results. Based on equation (37), with a probability of success of 95%, for n s =

5, 10 and 20 sensors, wewould require N =

29, 59 and 119 random initializations which need far less computation than exhaustive searchin the multi-dimensional space U n s . Interestingly, the required number of initializations in equation (37) does notdepend on | U | , i.e. it does not increase with the resolution of the sensor command space. E. Step-by-step Algorithm

Algorithm 1 shows a complete step-by-step pseudocode for multi-sensor control within the LMB ﬁlter, thatoutputs a fused posterior. Starting with an LMB prior (which is the fused LMB posterior from previous time),the function P

REDICT ( · ) implements the LMB prediction step. Multiple object states are then estimated from thepredicted LMB density by calling the function E STIMATE ( · ) which implements equations (17) and (18).The coordinate descent guided search for multi-sensor control is implemented through the line numbers 3–24in Algorithm 1. Before the search begins, for every sensor, s i and every possible action command u , a PIMS iscomputed. Using that set of ideal measurements, the LMB density is then updated by calling function U PDATE ( · ) ,and its parameters (existence probabilities r ( (cid:96) ) i,u , particles x ( (cid:96) ) i,u,j and their weights w ( (cid:96) ) i,u,j ) are recorded.The function C OST computes the PEECS cost value for each set of local posteriors associated with multiplesensor control commands. Both within the cost computation steps, and at the conclusion of the Algorithm 1, weneed to apply the GCI-rule to fuse the locally (pseudo-)updated LMB posteriors. The function GCI

FUSION performsthis task. IV. S

IMULATION R ESULTS

We conducted an extensive set of experiments involving various scenarios with different numbers of targets,sensors, target motion models and sensor detection proﬁle models. In each experiment, we compared the performanceof the proposed multi-sensor control solution with the exhaustive search-based method (similar to [3]), in termsof both accuracy and computational cost. This section includes representative set of our simulation results. Those Usually, the basin of attraction for the global optimum is larger and the chance of success in each round of random initialization is greaterthan M . October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 12

Algorithm 1

Step-by-step pseudocode for a single iteration of ﬁltering, fusion and multi-sensor control. Algorithm 1

Step-by-step pseudocode for one time iteration of ﬁltering, fusion and multi-sensor control.I

NPUTS :– label space L – current LMB existence probabilities { r ( ‘ ) } ‘ ∈ L – current LMB densities approximated by particles {{ w ( ‘ ) j , x ( ‘ ) j } J ( ‘ ) j =1 } ‘ ∈ L – birth model LMB parameters B = { r ( i ) B , { w ( i ) j,B , x ( i ) j,B } J ( i ) B j =1 } i ∈ B – survival probability function p S ( x, ‘ ) – single-target transition density f ( x |· , ‘ ) – single-target likelihood g ( z | x, ‘ ) – detection probability p D ( x, ‘ ) and clutter intensity κ ( · ) – number of sensors n s and current sensors’ states { x s i } n s i =1 – sensor control command space U – required number of consecutive local convergences for global optimization, m max O UTPUTS : the fused posterior parameters to be propagated to next time, parametrized by { ˜ r ( ‘ ) , { ˜ w ( ‘ ) j , ˜ x ( ‘ ) j } ˜ J ( ‘ ) j =1 } ‘ ∈ L { r ( ‘ )+ , { w ( ‘ )+ j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + ← P REDICT ( { r ( ‘ ) , { w ( ‘ ) j , x ( ‘ ) j } J ( ‘ ) j =1 } ‘ ∈ L , B , p S ( · , · ) , f ( ·|· , · ) ) . Prediction step according to (13)–(16). ˆ X pseudo ← E STIMATE ( { r ( ‘ )+ , { w ( ‘ )+ j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + ) . Extracting pseudo-estimates from the predicted LMB distribution, according to (17) and (18). for i = 1 : n s do for u ∈ U do Z ← PIMS ( | ˆ X pseudo | , ˆ X pseudo , x s i + u, g ( z | x, ‘ )) . Using the PIMS function in (Gostar et al.,2016) with controlled sensor states. { r ( ‘ ) i,u , { w ( ‘ ) i,u,j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + ← U PDATE ( { r ( ‘ )+ , { w ( ‘ )+ j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + , Z, p D ( · , · ) , g ( ·|· , · ) , κ ( · )) . Updating the LMB based on equations (19)–(25) and saving the posterior parameters and particles. . Note that particles themselves, and the number of particles, don’t change. Only their weights do, i.e. . ∀ i, u, x ( ‘ ) i,u,j = x ( ‘ )+ j and J ( ‘ ) i,u = J ( ‘ )+ . end for end for m ← . Number of local convergences initialized. while m < m max do Randomly initialize the sensor control commands ( u , . . . , u n s ) ∈ U n s do u old ← ( u , . . . , u n s ) for i = 1 : n s do u ∗ i ← arg min u i ∈ U C OST ( (cid:8) { r ( ‘ ) i,u i , { w ( ‘ ) i,u i ,j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + (cid:9) n s i =1 ) u i ← u ∗ i end for while ( u , . . . , u n s ) = u old . Local minimum reached. Saving the point and cost ... m ← m + 1 u m ← u old C m ← C OST ( (cid:8) { r ( ‘ ) i,u i , { w ( ‘ ) i,u i ,j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + (cid:9) n s i =1 ) end while . Done with m max times of random initialisations and searches. . Now saving the best result ... m ∗ ← arg min m C m u ∗ ← u m ∗ . Sensor control completed. Best commands decided.

Apply sensor control commands u ∗ to sensors, and acquire measurements { Z i } n s i =1 for i = 1 : n s do { r ( ‘ ) i , { w ( ‘ ) i,j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + ← U PDATE ( { r ( ‘ )+ , { w ( ‘ )+ j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + , Z i , p D ( · , · ) , g ( ·|· , · ) , κ ( · )) . Updating the LMB based on equations (19)–(25). end for { ˜ r ( ‘ ) , { ˜ w ( ‘ ) j , ˜ x ( ‘ ) j } ˜ J ( ‘ ) j =1 } ‘ ∈ L ← GCI

FUSION ( (cid:8) { r ( ‘ ) i , { w ( ‘ ) i,j , x ( ‘ )+ j } J ( ‘ )+ j =1 } ‘ ∈ L + (cid:9) n s i =1 ) October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 13 show the advantages of the proposed method, particularly, in terms of computation time. All scenarios share thefollowing parameters.The targets maneuver in an area of m × m. The single target state x is comprised of its label andunlabeled state. The label is formed as (cid:96) = ( k B , i B ) where k B is the birth time of the target and i B is an indexto distinguish targets born at the same time. The unlabeled state is four-dimensional and includes the Cartesiancoordinates of the target and its speed in those directions, denoted by x = [ p x ˙ p x p y ˙ p y ] (cid:62) . Each target movesaccording to the Nearly-Constant Velocity (NCV) model with its variance parameter denoted by σ w . With thismodel, the transition density is f k | k − ( x k | x k − , (cid:96) ) = N ( x k ; F k x k − , Q k ) , where F k =  T T  , Q k = σ w  T T T T T T T T  (38)and T is the sampling interval (in our experiments T = 1 s). The probability of survival is ﬁxed at p S ( x, (cid:96) ) = 0 . .For each sensor s i , with its location denoted by [ x s i y s i ] (cid:62) , each target (if detected) leads to a bearing and rangemeasurement vector with the measurement noise density given by N ( · ; [0 0] (cid:62) , R ) in which R = diag ( σ θ , σ r ) with σ θ = 2 π/ rad and σ r = 10 m being the scales of range and bearing noise. Thus, the single target likelihoodfunction is g ( z i | x, (cid:96) ) = N ( z ; µ i ( x ) , R ) , where µ i ( x ) = (cid:20) arctan (cid:18) p x − x s i p y − y s i (cid:19) (cid:113) ( p x − x s i ) + ( p y − y s i ) (cid:21) (cid:62) . (39)Each measurement set acquired from each sensor also includes Poisson distributed clutter with the ﬁxed clutterrate of λ c = 5 . In all scenarios, the density p ( (cid:96) ) ( · ) of each labeled Bernoulli component in the ﬁlter is approximatedby J ( (cid:96) ) = 1000 particles. All simulation experiments were coded using MATLAB R2015b and ran on an Intel Corei7-4770 CPU @3.40GHz, and 8 GB memory. Scenario 1: Pseudo-stationary targets

In this scenario, we tried the commonly used case study in which ﬁve targets move with relatively smalldisplacements (are pseudo-stationary). To realize such movements, we applied the NCV motion model with thevery small variance σ w = 5 × − m / s borrowed from similar simulations reported in [22], [28], [29]. In thisscenario, the detection proﬁle of each sensor is range-dependent. The detection probability of target with the state x by sensor s i is given by: p iD ( x, (cid:96) ) =  d i ( x ) (cid:54) R − d i ( x ) − R η if R < d i ( x ) (cid:54) R + η (40)where R = 200 m, and η = 1000 m denotes the maximum range of detection, and d i ( x ) denote the sensor-targetdistance given by: d i ( x ) = (cid:113) ( p x − x s i ) + ( p y − y s i ) . (41) October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 14 y x x s y s u u u u u u u u u Fig. 1. Nine possible sensor displacements. Note that U = { u j } j =0:8 where u denotes zero displacement. The detection probability decreases with increasing sensor-target distance. Because of this, and considering that thetargets stay almost in the same distance away from each other all the time, the sensor control is intuitively expectedto drive all the sensors towards the center of the pseudo-stationary targets.The birth process is modeled by an LMB density with | B | = 5 components. Each component has the sameexistence probability of r ( i ) B = 0 . , and a Gaussian density p ( i ) B = N ( x ; m ( i ) B , P B ) , where the mean and covarianceof Gaussians are m (1) B = [800 0 600 0] (cid:62) ; m (2) B = [650 0 500 0] (cid:62) ; m (3) B = [620 0 700 0] (cid:62) ; m (4) B = [750 0 800 0] (cid:62) ; m (5) B = [700 0 400 0] (cid:62) ; P B = diag (cid:0) , × − , , × − (cid:1) . Each sensor s i can be displaced by the multi-sensor control to one of the following possible displacementcommands u ∈ U : U =   ∪  ∆ R cos( j ∆ θ )∆ R sin( j ∆ θ )  j =0 ,...,N θ − where ∆ R = 50 m , N θ = 8 and ∆ θ = 2 π/N θ . Thus, nine control actions are possible at each time step as shownin Fig. 1.Figures 2 (a) and 2 (b) show the sensor movements in cases with three and four sensors, respectively. As expected,our proposed multi-sensor control method drives all the sensors towards the center of the ﬁve targets.To quantify and compare the performance of our method, we ran 200 Monte Carlo repetitions, and computedthe average OSPA errors with order and cut-off parameters p = 2 and c = 100 (see [8, Eqs. (3)-(4)] for deﬁnitionof OSPA and its parameters). We also ran the same Monte Carlo experiments but with exhaustive search-basedsensor control (same approach in the state of art [3]). Figures 3(a) and 3(b) demonstrate that in terms of multi-targettracking errors, the proposed method is comparable with the state of art. The run times for each time-step (averagedover 200 Monte Carlo runs and shown in Figs. 3(c) and 3(d), however, demonstrate that our method runs three October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 15

200 400 600 800 10002004006008001000 Target 1Target 2Target 3 Target 4Target 5Sensor 1 Sensor 2Sensor 3Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5 x coordinate (m) y c oo r d i na t e ( m ) (a)

200 400 600 800 10002004006008001000 Target 1Target 2Target 3 Target 4Target 5Sensor 1 Sensor 2Sensor 3Sensor 4 Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5Target 1Target 2Target 3 Target 4Target 5 x coordinate (m) y c oo r d i na t e ( m ) (b)Fig. 2. Sensor movements in scenario 1 for (a) three sensors and (b) four sensors. The start and end points for each sensor movement aredenoted by colored (cid:4) and ∗ , respectively. These ﬁgures are best viewed in color. times faster than the exhaustive search method used in [3] in presence of three sensors, and 17 times faster whenthere are four sensors.We also tried the experiment with 10 sensors but the exhaustive search-based method turned out to be intractablein our system (only up to ﬁve sensors are feasible). Our accelerated solution however, succeeded with sensorsmoving generally towards the center of targets as expected (See Fig. 4), and the OSPA errors were reasonablysmall—similar to the results shown in Figs. 3(a) and 3(b).We also applied our method to control various numbers of sensors in the same multi-target tracking scenario.We tried up to 36 sensors, and for each case, recorded the run times as plotted in Fig. 5. The results show thatwith increasing the number of sensors, the run time increases almost quadratically (the best quadratic ﬁt is alsodisplayed). Indeed the computational complexity of our method is almost O ( n s ) , which is signiﬁcantly lower thanexhaustive search-based multi-sensor control, i.e. O ( | U | n s ) . Scenario 2: Maneuvering targets

In this section, we present the results of multi-sensor control for targets that maneuver with a relatively highspeed. In such cases, the sensors are expected to follow and possibly approach the center of the moving targets. Wepresent the results of six sensors controlled to track ﬁve targets. For the purpose of visualization of the sensor controlperformance, we tuned the motion model parameters of the targets in such a way that they move approximately in

October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 16 O SPA ( m )( c = , p = ) Our MethodExhaustive (a) O SPA ( m )( c = , p = ) Our MethodExhaustive (b) e Carlo k R un t i m e ( s ) Our MethodExhaustive (c) Mont e Carlo k R un t i m e ( s ) Our MethodExhaustive (d)Fig. 3. (a) Average OSPA errors in scenario 1 for three sensors and (b) four sensors. (c) Run times for each time-step (averaged over 200Monte Carlo runs) for three sensors and (d) four sensors.

October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 17

400 600 800 1000 x coordinate (m) y c oo r d i na t e ( m ) Target 1Target 2Target 3 Target 4Target 5Sensor 1Sensor 2Sensor 3 Sensor 4Sensor 5Sensor 6Sensor 7 Sensor 8Sensor 9 Sensor 10Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5 Target 1Target 2Target 3 Target 4Target 5

Fig. 4. Controlled movements of 10 sensors in scenario 1: The sensors generally approach the center of targets as expected. Best viewed incolor. number of sensors r un t i m e ( s ) recorded run times quadratic fit Fig. 5. Recorded run times for scenario 1 in presence of various numbers of sensors. the same direction with the same speed. To achieve such target maneuvers, we used the NCV motion model butwith the following covariance matrix: Q k =  B B  ; B =  . . . .  . (42)Also the birth model parameters were different from scenario 1, as listed below: m (1) B = [1100 0 200 0] (cid:62) ; m (2) B = [1200 0 300 0] (cid:62) ; m (3) B = [1100 0 300 0] (cid:62) ; m (4) B = [1200 0 400 0] (cid:62) ; m (5) B = [1200 0 200 0] (cid:62) ; P B = 50 I October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 18

200 400 600 800 1000 1200−2000200400600800 Sensor 1Sensor 2Sensor 3Sensor 4Sensor 5Sensor 6 x coordinate (m) y c oo r d i na t e ( m ) Fig. 6. Target and controlled sensor paths in scenario 2 for displacement sensor control actions. y x s e n s o r ’ s a x i s θφ Fig. 7. Schematics of notations used to formulate control action space and detection probability in scenario 3 with spinning control actions. where I denotes the four-dimensional identity matrix.We examined the performance of our proposed multi-sensor control method, ﬁrst with six sensors in a similarfashion to scenario 1 with parameters of state-dependent detection probability to be R = 320 m; η = 4000 m . Asnapshot of the ﬁnal target locations and their paths as well as the controlled sensors and their paths are shownin Fig. 6. It clearly shows how the sensors move and converge to follow the targets. A video of the simulation isavailable as supplementary material.

Scenario 3: Maneuvering targets and spinning sensors

The proposed multi-sensor control method is not limited to displacement sensor control actions only. The controlactions can have other forms. For instance, the sensors can be spun to control angles. We ran a simulation withsix sensors that could spin in the interval of 0 ◦ to +180 ◦ . With these sensors, the detection proﬁle is angle-related. For each sensor, the control action command is an axis angle to which the sensor would spin whenthe control action is applied. Denoting the angle of direction by θ , we considered the action command space of U = (cid:8)(cid:0) j (cid:1) × ◦ (cid:9) j =0:16 . The detection probability was assumed to vary with the relative angle of the target withrespect to the sensor’s axis direction, denoted by φ as shown in Fig. 7. The variations were modeled as follows: p D ( φ ) = 99 . × (cid:18) − mod ( | φ | , ◦ )2000 ◦ (cid:19) . (43) October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 19

200 400 600 800 1000 1200 x coordinate (m) -2000200400600800 y c oo r d i na t e ( m ) Fig. 8. A snapshot of the maneuvering target locations and the controlled angles of sensors in scenario 3.

Figure 8 shows a snapshot of the targets and how the sensors’ axes have been controlled to point towards them.A video of the simulation is available as supplementary material that demonstrates the continuous spinning of thesensors in such a way that in general they all point towards the group of targets moving in the scene.V. C

ONCLUSIONS

A complete POMDP framework for devising multi-sensor control solutions in stochastic multi-object systems wasintroduced, and a suitable set of choices for various components of the proposed POMDP were outlined. Detailsof one possible implementation were presented in which the multi-object state is modeled as an LMB RFS, andthe SMC implementation of the LMB ﬁlter is employed. The proposed framework makes use of a novel guidedsearch approach for multi-dimensional optimization in the multi-sensor control command space, for minimizationof a task-driven control objective function. It also utilizes Generalized Covariance Intersection (GCI) method formulti-sensor fusion. A step-by-step algorithm was detailed for SMC implementation of the proposed method withLMB ﬁlters running at each sensor node.Numerical studies were presented for several scenarios where numerous controllable (mobile) sensors track mul-tiple moving targets with different levels of observability. The results demonstrated good performance in controllingnumerous sensors (in terms of OSPA errors). They also showed that our proposed method runs substantially fasterthan the traditional exhaustive search-based technique. Indeed we showed that while the computational cost oftraditional methods grow exponentially with increasing the number of sensors, our method has only second ordercomputational complexity. A

CKNOWLEDGMENT

This project was supported by the Australian Research Council through ARC Discovery grant DP160104662, aswell as National Nature Science Foundation of China grants 61673075.R

EFERENCES[1] R. P. S. Mahler and T. R. Zajic, “Probabilistic objective functions for sensor management,” in

Proceedings of SPIE, Signal Processing,Sensor Fusion, and Target Recognition , vol. 5429, Orlando, 2004, Conference Proceedings, pp. 233–244.[2] D. A. Castan´on and L. Carin,

Stochastic control theory for sensor management . Springer, 2008.

October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 20 [3] M. Jiang, W. Yi, and L. Kong, “Multi-sensor control for multi-target tracking using Cauchy-Schwarz divergence,”

ArXiv e-prints , 2016,http://arxiv.org/abs/1603.08355.[4] B. N. Vo, B. T. Vo, and D. Phung, “Labeled random ﬁnite sets and the Bayes multi-target tracking ﬁlter,”

IEEE Transactions on SignalProcessing , vol. 62, no. 24, pp. 6554–6567, 2014.[5] R. P. S. Mahler,

Advances in Statistical Multisource-Multitarget Information Fusion . Artech House, 2014.[6] C. Fantacci, B. N. Vo, B. T. Vo, G. Battistelli, and L. Chisci, “Consensus labeled random ﬁnite set ﬁltering for distributed multi-objecttracking,”

ArXiv e-prints , 2016, http://arxiv.org/abs/1501.01579.[7] P. Tseng, “Convergence of a block coordinate descent method for nondifferentiable minimization,”

Journal of Optimization Theory andApplications , vol. 109, no. 3, pp. 475–494, 2001.[8] D. Schuhmacher, B. T. Vo, and B. N. Vo, “A consistent metric for performance evaluation of multi-object ﬁlters,”

IEEE Transactions onSignal Processing , vol. 56, no. 8 I, pp. 3447–3457, 2008.[9] B. N. Vo, S. Singh, and A. Doucet, “Sequential Monte Carlo methods for multi-target ﬁltering with random ﬁnite sets,”

IEEE Transactionson Aerospace and Electronic Systems , vol. 41, no. 4, pp. 1224–1245, 2005.[10] B. Ristic and B. N. Vo, “Sensor control for multi-object state-space estimation using random ﬁnite sets,”

Automatica , vol. 46, no. 11, pp.1812–1818, 2010.[11] B. T. Vo, B. N. Vo, and A. Cantoni, “The cardinality balanced multi-target multi-Bernoulli ﬁlter and its implementations,”

IEEE Transactionson Signal Processing , vol. 57, no. 2, pp. 409–423, 2009.[12] B. N. Vo, B. T. Vo, N. T. Pham, and D. Suter, “Joint detection and estimation of multiple objects from image observations,”

IEEETransactions on Signal Processing , vol. 58, no. 10, pp. 5129–5141, 2010.[13] J. Wong, B. T. Vo, B.-N. Vo, and R. Hoseinnezhad, “Multi-Bernoulli based track-before-detect with road constraints,” in

Fusion 2012 ,Singapore, July 2012, pp. 840–846.[14] R. Hoseinnezhad, B. N. Vo, and B. T. Vo, “Visual tracking in background subtracted image sequences via multi-Bernoulli ﬁltering,”

IEEETransactions on Signal Processing , vol. 61, no. 2, pp. 392–397, 2013.[15] R. Hoseinnezhad, B.-N. Vo, D. Suter, and B.-T. Vo, “Multi-object ﬁltering from image sequence without detection,” in

ICASSP 2010 ,Dallas, TX, 2010, Conference Proceedings, pp. 1154–1157.[16] R. Hoseinnezhad, B.-N. Vo, B.-T. Vo, and D. Suter, “Visual tracking of numerous targets via multi-Bernoulli ﬁltering of image data,”

Pattern Recognition , vol. 45, no. 10, pp. 3625–3635, 2012.[17] B. T. Vo, B. N. Vo, R. Hoseinnezhad, and R. P. S. Mahler, “Robust multi-Bernoulli ﬁltering,”

IEEE Journal of Selected Topics in SignalProcessing , vol. 7, no. 3, pp. 399–409, 2013.[18] B. T. Vo and B. N. Vo, “Labeled random ﬁnite sets and multi-object conjugate priors,”

IEEE Transactions on Signal Processing , vol. 61,no. 13, pp. 3460–3475, 2013.[19] F. Papi, B. N. Vo, B. T. Vo, C. Fantacci, and M. Beard, “Generalized labeled multi-Bernoulli approximation of multi-object densities,”

IEEE Transactions on Signal Processing , vol. 63, no. 20, pp. 5487–5497, 2015.[20] S. Reuter, B. T. Vo, B. N. Vo, and K. Dietmayer, “The labeled multi-Bernoulli ﬁlter,”

IEEE Transactions on Signal Processing , vol. 62,no. 12, pp. 3246–3260, 2014.[21] C. Fantacci and F. Papi, “Scalable multisensor multitarget tracking using the marginalized δ –GLMB density,” IEEE Signal ProcessingLetters , vol. 23, no. 6, pp. 863–867, 2016.[22] B. Ristic, B. N. Vo, and D. Clark, “A note on the reward function for PHD ﬁlters with sensor control,”

IEEE Transactions on Aerospaceand Electronic Systems , vol. 47, no. 2, pp. 1521–1529, 2011.[23] H. G. Hoang, B. N. Vo, B. T. Vo, and R. P. S. Mahler, “The Cauchy-Schwarz divergence for poisson point processes,”

IEEE Transactionson Information Theory , vol. 61, no. 8, pp. 4475–4485, Aug 2015.[24] Y. Liu and H. G. Hoang, “Sensor selection for multi-target tracking via closed form Cauchy-Schwarz divergence,” in , Seoul,2014, Conference Proceedings, pp. 93–98.[25] M. Beard, B. T. Vo, B. N. Vo, and S. Arulampalam, “Sensor control for multi-target tracking using Cauchy-Schwarz divergence,” in

Fusion2015 , Washton, D.C, 2015, Conference Proceedings, pp. 937–944.[26] H. G. Hoang and B. T. Vo, “Sensor management for multi-target tracking via multi-Bernoulli ﬁltering,”

Automatica , vol. 50, no. 4, pp.1135–1142, 2014.[27] A. K. Gostar, R. Hoseinnezhad, and A. Bab-Hadiashar, “Multi-Bernoulli sensor control for multi-target tracking,” in

ISSNIP 2013 ,Melbourne, 2013, pp. 312–317.

October 11, 2018 DRAFTXXXXX, VOL. XX, NO. XX, MMMM YYYY 21 [28] ——, “Robust multi-bernoulli sensor selection for multi-target tracking in sensor networks,”

IEEE Signal Processing Letters , vol. 20,no. 12, pp. 1167–1170, 2013.[29] ——, “Multi-Bernoulli sensor-selection for multi-target tracking with unknown clutter and detection proﬁles,”

Signal Processing , vol. 119,pp. 28–42, 2016.[30] ——, “Sensor control for multi-object tracking using labeled multi-Bernoulli ﬁlter,” in

Fusion 2014 , Salamanca, 2014, ConferenceProceedings, pp. 1–8.[31] ——, “Multi-Bernoulli sensor control via minimization of expected estimation errors,”

IEEE Transactions on Aerospace and ElectronicSystems , vol. 51, no. 3, pp. 1762–1773, July 2015.[32] C. Kreucher, A. O. Hero Iii, and K. Kastella, “A comparison of task driven and information driven sensor management for target tracking,”in

Proceedings of CDC-ECC ’05 , Seville, 2005, Conference Proceedings, pp. 4004–4009.[33] G. Battistelli, L. Chisci, C. Fantacci, A. Farina, and R. P. S. Mahler, “Distributed fusion of multitarget densities and consensus PHD/CPHDﬁlters,” in

Proceedings of SPIE, Signal Processing, Sensor/Information Fusion, and Target Recognition , vol. 9474, Baltimore, 2015,Conference Proceedings, pp. 1–15.[34] G. Battistelli, L. Chisci, C. Fantacci, A. Farina, and A. Graziano, “Consensus CPHD ﬁlter for distributed multitarget tracking,”

IEEEJournal of Selected Topics in Signal Processing , vol. 7, no. 3, pp. 508–520, 2013.[35] W. Bailu, Y. Wei, R. Hoseinnezhad, L. Suqi, K. Lingjiang, and Y. Xiaobo, “Distributed fusion with multi-Bernoulli ﬁlter based ongeneralized covariance intersection,”

IEEE Transactions on Signal Processing , vol. 65, pp. 242–255, jan 2017.[36] R. P. S. Mahler,

Multitarget sensor management of dispersed mobile sensors , ser. Series on Computers and Operations Research. WorldScientiﬁc, 2004.[37] S. J. Wright, “Coordinate descent algorithms,”

Mathematical Programming , vol. 151, no. 1, pp. 3–34, 2015.[38] J. C. Spall, “Cyclic seesaw process for optimization and identiﬁcation,”

Journal of Optimization Theory and Applications , vol. 154, pp.187–208, 2012.[39] A. Saha and A. Tewari, “On the ﬁnite time convergence of cyclic coordinate descent methods,”

ArXiv e-prints , 2010, http://arxiv.org/abs/1005.2146., 2010, http://arxiv.org/abs/1005.2146.