Multiple Object Tracking in Unknown Backgrounds with Labeled Random Finite Sets

Abstract

This paper proposes an on-line multiple object tracking algorithm that can operate in an unknown background. In the majority of multiple object tracking applications, model parameters for background processes such as clutter and detection are unknown and vary with time; hence the ability of the algorithm to adaptively learn these parameters is essential in practice. In this work, we detail how the Generalized Labeled Multi-Bernoulli (GLMB) filter, a tractable and provably Bayes optimal multi-object tracker, can be tailored to learn clutter and detection parameters on the fly while tracking. Provided that these background model parameters do not fluctuate rapidly compared to the data rate, the proposed algorithm can adapt to the unknown background, yielding better tracking performance.



Yuthika Punchihewa, Ba-Tuong Vo, Ba-Ngu Vo and Du Yong Kim


Index Terms: Random finite sets, generalized labeled multi-Bernoulli, multi-object tracking, data association, optimal assignment, ranked assignment, Gibbs sampling

I. INTRODUCTION

In a multi-object scenario the number of objects and their individual states evolve in time, compounded by false detections, misdetections and measurement origin uncertainty [1]–[4]. For example, in the video dataset KITTI-17 from the KITTI datasets [5] (see Fig. 1), the number of objects varies with time due to objects coming in and out of the scene, and the detector (e.g. background subtraction, foreground modelling [6]) used to convert each image into point measurements invariably misses objects in the scene as well as generating false measurements or clutter.

Knowledge of parameters for uncertainty sources such as clutter and detection profile is of critical importance in Bayesian multi-object filtering, arguably more so than the measurement noise model. Most multi-object tracking techniques are built on the assumption that multi-object system model parameters are known a priori, which is generally not the case in practice [1]–[4]. Significant mismatches in clutter and detection model parameters inevitably result in erroneous estimates. For the video tracking example in Fig. 1 the clutter rate and detection profile are not known and have to be guessed before a multi-object tracker can be applied. Comparing the tracking performance of the Bayes optimal multi-object tracking filter [7], [8] under the guessed clutter rate with that under the 'true' clutter rate (which varies with time as shown in Fig. 2) shows significant degradation when the rate is guessed.

Except for a few applications, the clutter rate and detection profile of the sensor are not available. Usually these parameters are either estimated from training data or manually tuned. However, a major problem in many applications is the time-varying nature of the misdetection and clutter processes; see Fig. 2 for example. Consequently, there is no guarantee that the model parameters chosen from training data will be sufficient for the multi-object filter at subsequent frames.
Thus, current multi-object tracking algorithms are far from being a 'plug-and-play' technology, since their application still requires cumbersome and error-prone user configuration.

This paper proposes an online multi-object tracker that learns the clutter and detection model parameters while tracking. Such capability is essential for applications where the clutter rate and detection profile vary with time. Specifically, we detail a GLMB filter for Jump Markov systems (JMS), which is applicable to tracking multiple maneuvering objects as well as joint tracking and classification of multiple objects. Using the JMS-GLMB filter, we develop a multi-object tracker that can adaptively learn clutter rate and detection profile while tracking, provided that the detection profile and clutter background do not change too rapidly compared to the measurement-update rate. We present an efficient implementation of the proposed filter, and experiments confirm markedly improved performance over existing multi-object filters for unknown background such as the λ-CPHD filter [9]. Preliminary results have been reported in [10], which outlines a GLMB filter for the jump-Markov system model.

We remark that robust Bayesian approaches to problems with model mismatch in the literature such as [11]–[16] are too computationally intensive for an on-line multi-object tracker. A Sequential Monte Carlo technique for calibration of time-invariant multi-object model parameters was proposed in [17]. While this approach is quite general it is not directly applicable to time-varying clutter rate and detection profile, and is also too computationally intensive for an on-line tracker. Previous work on CPHD/PHD, multi-Bernoulli and multi-target Bayes filters for unknown clutter rate and detection profile [9], [18]–[23] does not output object tracks. Further, the CPHD/PHD and multi-Bernoulli filters require more drastic approximations than the GLMB filter.

The remainder of the paper is organized as follows.
Section II provides background material on multi-object tracking and the GLMB filter. Section III details two versions of the GLMB filter for a general multi-object JMS model and a non-interacting multi-object JMS model. Section IV presents an efficient implementation of the non-interacting JMS-GLMB filter for tracking in unknown clutter rate and detection profile. Numerical studies are presented in Section V and concluding remarks are given in Section VI.

arXiv [stat.OT], Jul.

Fig. 1: Frames 16, 48 of the image sequence from [5] and object detections obtained using the detector in [35]. The number of objects varies with time due to objects coming in and out of the scene. Object estimates (marked by blue boxes) using the standard GLMB filter for guessed clutter rate of 60 (left column) and 'true' clutter rate (right column). Tracking using 'true' clutter rate accurately estimated several objects that were missed in the frames on the left.

Fig. 2: 'True' clutter rate for the first 60 frames of the dataset [5]. Note that it is not possible to know the true clutter rate for real video data. For illustration we assume that the clutter rate varies slowly and use the average clutter count over a moving 10-frame window as the 'true' clutter rate.

II. BACKGROUND

This section reviews relevant background on the random finite set (RFS) formulation of multi-object tracking and the GLMB filter. Throughout the article, we adopt the following notations. For a given set $S$, $|S|$ denotes its cardinality (number of elements), $1_S(\cdot)$ denotes the indicator function of $S$, and $\mathcal{F}(S)$ denotes the class of finite subsets of $S$. We denote the inner product $\int f(x)g(x)\,dx$ by $\langle f, g\rangle$, the list of variables $X_m, X_{m+1}, ..., X_n$ by $X_{m:n}$, the product $\prod_{x\in X} f(x)$ (with $f^{\emptyset}=1$) by $f^{X}$, and a generalization of the Kronecker delta that takes arbitrary arguments such as sets, vectors, integers etc., by

$$\delta_Y[X] \triangleq \begin{cases} 1, & \text{if } X = Y \\ 0, & \text{otherwise.} \end{cases}$$

A. Multi-object State

At time $k$, an existing object is described by a vector $x_k \in \mathbb{X}$. To distinguish different object trajectories, each object is identified by a unique label $\ell_k$ that consists of an ordered pair $(t, i)$, where $t$ is the time of birth and $i$ is the index of individual objects born at time $t$ [7]. The trajectory of an object is given by the sequence of states with the same label.

Formally, the state of an object at time $k$ is a vector $\mathbf{x}_k = (x_k, \ell_k) \in \mathbb{X}\times\mathbb{L}_k$, where $\mathbb{L}_k$ denotes the label space for objects at time $k$ (including those born prior to $k$). Note that $\mathbb{L}_k$ is given by $\mathbb{B}_k \cup \mathbb{L}_{k-1}$, where $\mathbb{B}_k$ denotes the label space for objects born at time $k$ (and is disjoint from $\mathbb{L}_{k-1}$).

In the RFS approach to multi-object tracking [3], [4], the collection of object states, referred to as the multi-object state, is naturally represented as a finite set [24]. Suppose that there are $N_k$ objects at time $k$, with states $\mathbf{x}_{k,1}, ..., \mathbf{x}_{k,N_k}$, then the multi-object state is defined by the finite set

$$\mathbf{X}_k = \{\mathbf{x}_{k,1}, ..., \mathbf{x}_{k,N_k}\} \in \mathcal{F}(\mathbb{X}\times\mathbb{L}_k).$$

We denote the set $\{\ell : (x,\ell)\in\mathbf{X}\}$ of labels of $\mathbf{X}$ by $\mathcal{L}(\mathbf{X})$. Note that since the label is unique, no two objects have the same label, i.e. $\delta_{|\mathbf{X}|}[|\mathcal{L}(\mathbf{X})|] = 1$. Hence $\Delta(\mathbf{X}) \triangleq \delta_{|\mathbf{X}|}[|\mathcal{L}(\mathbf{X})|]$ is called the distinct label indicator.

A labeled RFS is a random variable on $\mathcal{F}(\mathbb{X}\times\mathbb{L})$ such that each realization has distinct labels. The distinct label property ensures that at any time no two tracks can share any common points. For the rest of the paper, we follow the convention that single-object states are represented by lower-case letters (e.g. $x$, $\mathbf{x}$), while multi-object states are represented by upper-case letters (e.g. $X$, $\mathbf{X}$), symbols for labeled states and their distributions are bold-faced (e.g. $\mathbf{x}$, $\mathbf{X}$, $\boldsymbol{\pi}$, etc.), and spaces are represented by blackboard bold (e.g. $\mathbb{X}$, $\mathbb{Z}$, $\mathbb{L}$, etc.). For notational compactness, we drop the time subscript $k$, and use the subscript '$+$' for time $k+1$.

B. Standard multi-object system model

Given the multi-object state $\mathbf{X}$ at time $k$, each state $(x,\ell)\in\mathbf{X}$ either survives with probability $P_S(x,\ell)$ and evolves to a new state $(x_+,\ell_+)$ at time $k+1$ with probability density $f_+(x_+|x,\ell)\,\delta_{\ell}[\ell_+]$ or dies with probability $1-P_S(x,\ell)$. The set $\mathbf{B}_+$ of new objects born at time $k+1$ is distributed according to the labeled multi-Bernoulli (LMB) density

$$\Delta(\mathbf{B}_+)\left[1_{\mathbb{B}_+}\, r_{B,+}\right]^{\mathcal{L}(\mathbf{B}_+)}\left[1-r_{B,+}\right]^{\mathbb{B}_+-\mathcal{L}(\mathbf{B}_+)}\, p_{B,+}^{\mathbf{B}_+}, \tag{1}$$

where $r_{B,+}(\ell)$ is the probability that a new object with label $\ell$ is born, $p_{B,+}(\cdot,\ell)$ is the distribution of its kinematic state, and $\mathbb{B}_+$ is the label space of new born objects [7]. The multi-object state $\mathbf{X}_+$ (at time $k+1$) is the superposition of surviving objects and new born objects. Note that the label space of all objects at time $k+1$ is the disjoint union $\mathbb{L}_+ = \mathbb{L}\uplus\mathbb{B}_+$. It is assumed that, conditional on $\mathbf{X}$, objects move, appear and die independently of each other.

For a given multi-object state $\mathbf{X}$, each $(x,\ell)\in\mathbf{X}$ is either detected with probability $P_D(x,\ell)$ and generates a detection $z\in Z$ with likelihood $g(z|x,\ell)$ or missed with probability $1-P_D(x,\ell)$. The multi-object observation is the superposition of the observations from detected objects and Poisson clutter with (positive) intensity $\kappa$.
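The observation model just described can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: it assumes 2-D position states, an isotropic Gaussian likelihood, a constant detection probability and spatially uniform clutter, and the names `sample_poisson` and `simulate_observation` are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_observation(labeled_states, p_detect, noise_std, clutter_rate, region, rng):
    """Return the list of 2-D detections for one scan.

    labeled_states: dict mapping label -> (x, y) position (assumed state layout)
    region: ((x_min, x_max), (y_min, y_max)) observation window
    """
    (x_min, x_max), (y_min, y_max) = region
    detections = []
    for _label, (x, y) in labeled_states.items():
        if rng.random() < p_detect:  # detected with probability P_D
            detections.append((rng.gauss(x, noise_std), rng.gauss(y, noise_std)))
        # otherwise the object is missed with probability 1 - P_D
    for _ in range(sample_poisson(clutter_rate, rng)):  # Poisson clutter, uniform in space
        detections.append((rng.uniform(x_min, x_max), rng.uniform(y_min, y_max)))
    rng.shuffle(detections)  # the tracker does not know which detection came from where
    return detections


rng = random.Random(0)
states = {(0, 1): (10.0, 20.0), (0, 2): (30.0, 40.0)}
region = ((0.0, 100.0), (0.0, 100.0))
scan = simulate_observation(states, 0.9, 0.5, 5.0, region, rng)
```

The final shuffle reflects measurement origin uncertainty: the filter receives an unordered set and must consider every admissible association of labels to detections.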
Assuming that, conditional on $\mathbf{X}$, detections are independent of each other and clutter, the multi-object likelihood function is given by [7], [8]

$$g(Z|\mathbf{X}) \propto \sum_{\theta\in\Theta} 1_{\Theta(\mathcal{L}(\mathbf{X}))}(\theta) \prod_{(x,\ell)\in\mathbf{X}} \psi_{Z}^{(\theta(\ell))}(x,\ell) \tag{2}$$

where: $\Theta$ is the set of positive 1-1 maps $\theta:\mathbb{L}\to\{0{:}|Z|\}$, i.e. maps such that no two distinct arguments are mapped to the same positive value, $\Theta(I)$ is the set of positive 1-1 maps with domain $I$; and

$$\psi_{\{z_{1:M}\}}^{(j)}(x,\ell) = \begin{cases} \dfrac{P_D(x,\ell)\,g(z_j|x,\ell)}{\kappa(z_j)}, & \text{if } j = 1{:}M \\ 1-P_D(x,\ell), & \text{if } j = 0. \end{cases} \tag{3}$$

The map $\theta$ specifies which objects generated which detections, i.e. object $\ell$ generates detection $z_{\theta(\ell)}\in Z$, with undetected objects assigned to $0$. The positive 1-1 property means that $\theta$ is 1-1 on $\{\ell:\theta(\ell)>0\}$, the set of labels that are assigned positive values, and ensures that any detection in $Z$ is assigned to at most one object.

For the special case with zero clutter, i.e. $\kappa$ is identically zero, the multi-object likelihood function still takes the same form, but with $P_D(x,\ell)\,g(z_j|x,\ell)/\kappa(z_j)$ replaced by $P_D(x,\ell)\,g(z_j|x,\ell)$, see [3], [4]. To cover both positive and identically-zero clutter intensities we write

$$\psi_{\{z_{1:M}\}}^{(j)}(x,\ell) = \begin{cases} \dfrac{P_D(x,\ell)\,g(z_j|x,\ell)}{\kappa(z_j)+\delta_0[\kappa(z_j)]}, & \text{if } j = 1{:}M \\ 1-P_D(x,\ell), & \text{if } j = 0. \end{cases} \tag{4}$$

C. Generalized Labeled Multi-Bernoulli

A Generalized Labeled Multi-Bernoulli (GLMB) filtering density, at time $k$, is a multi-object density that can be written in the form

$$\boldsymbol{\pi}(\mathbf{X}) = \Delta(\mathbf{X}) \sum_{\xi\in\Xi,\, I\subseteq\mathbb{L}} \omega^{(I,\xi)}\,\delta_I[\mathcal{L}(\mathbf{X})]\left[p^{(\xi)}\right]^{\mathbf{X}}, \tag{5}$$

where each $\xi\in\Xi\triangleq\Theta_0\times...\times\Theta_k$ represents a history of association maps $\xi=(\theta_{0:k})$, each $p^{(\xi)}(\cdot,\ell)$ is a probability density on $\mathbb{X}$, and each $\omega^{(I,\xi)}$ is non-negative with $\sum_{\xi\in\Xi}\sum_{I\subseteq\mathbb{L}}\omega^{(I,\xi)}=1$. The cardinality distribution of a GLMB is given by

$$\Pr(|\mathbf{X}|=n) = \sum_{\xi\in\Xi,\, I\subseteq\mathbb{L}} \delta_n[|I|]\,\omega^{(I,\xi)}, \tag{6}$$

while the existence probability and probability density of track $\ell\in\mathbb{L}$ are respectively

$$r(\ell) = \sum_{\xi\in\Xi,\, I\subseteq\mathbb{L}} 1_I(\ell)\,\omega^{(I,\xi)}, \tag{7}$$

$$p(x,\ell) = \frac{1}{r(\ell)} \sum_{\xi\in\Xi,\, I\subseteq\mathbb{L}} 1_I(\ell)\,\omega^{(I,\xi)}\,p^{(\xi)}(x,\ell). \tag{8}$$

Given the GLMB density (5), an intuitive multi-object estimator is the multi-Bernoulli estimator, which first determines the set of labels $L\subseteq\mathbb{L}$ with existence probabilities above a prescribed threshold, and second the mode/mean estimates from the densities $p(\cdot,\ell)$, $\ell\in L$, for the states of the objects. A popular estimator is a suboptimal version of the Marginal Multi-object Estimator [3], which first determines the pair $(L,\xi)$ with the highest weight $\omega^{(L,\xi)}$ such that $|L|$ coincides with the mode cardinality estimate, and second the mode/mean estimates from $p^{(\xi)}(\cdot,\ell)$, $\ell\in L$, for the states of the objects.

For the standard multi-object system model the GLMB density is a conjugate prior, and is also closed under the Chapman-Kolmogorov equation [7]. Moreover, the GLMB posterior can be tractably computed to any desired accuracy in the sense that, given any $\epsilon>0$, an approximate GLMB within $\epsilon$ from the actual GLMB in $L_1$ distance can be computed (in polynomial time) [8].
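The cardinality distribution (6), the existence probabilities (7) and the component selection used by the suboptimal Marginal Multi-object Estimator depend only on the component weights and label sets, so they can be sketched in a few lines. This is an illustrative sketch under an assumed data layout: each GLMB component is a tuple `(label_set, history_id, weight)`, and the function names and the toy weights are hypothetical.

```python
from collections import defaultdict


def cardinality_distribution(components):
    """Eq. (6): Pr(|X| = n) is the total weight of components with |I| = n."""
    dist = defaultdict(float)
    for label_set, _history, weight in components:
        dist[len(label_set)] += weight
    return dict(dist)


def existence_probabilities(components):
    """Eq. (7): r(l) is the total weight of components whose label set contains l."""
    r = defaultdict(float)
    for label_set, _history, weight in components:
        for label in label_set:
            r[label] += weight
    return dict(r)


def best_component_at_mode_cardinality(components):
    """Highest-weight component among those whose |I| equals the mode cardinality."""
    dist = cardinality_distribution(components)
    n_mode = max(dist, key=dist.get)
    candidates = [c for c in components if len(c[0]) == n_mode]
    return max(candidates, key=lambda c: c[2])


# Toy GLMB with four components; labels are (birth time, index) pairs.
components = [
    (frozenset(), "xi0", 0.10),
    (frozenset({(1, 1)}), "xi1", 0.30),
    (frozenset({(1, 1), (1, 2)}), "xi1", 0.35),
    (frozenset({(1, 1), (1, 2)}), "xi2", 0.25),
]
card = cardinality_distribution(components)
r = existence_probabilities(components)
best = best_component_at_mode_cardinality(components)
```

In this toy example the mode cardinality is 2, and the selected component is the one with history `"xi1"`; state estimates would then be read off the densities attached to that history, which the sketch does not model.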
The GLMB filtering density can be propagated forward in time via a prediction step and an update step as in [8] or in one single step as in [25]. Since the number of components grows exponentially in the predicted/filtered densities during prediction/update stages, truncation of hypotheses with low weights is essential during implementation. Polynomial complexity schemes for truncation of insignificant weights were given in [8] and [25], via Murty's algorithm with a quartic (or at best cubic) complexity, or via Gibbs sampling with a linear complexity, where the complexity is given in the number of measurements.

III. JUMP MARKOV SYSTEM GLMB FILTERING

We first derive from the GLMB recursion a multi-object filter for Jump Markov systems (JMS) in subsection III-A, which is applicable to tracking multiple maneuvering objects as well as joint tracking and classification of multiple objects. When the modes of the multi-object JMS do not interact, the JMS-GLMB recursion reduces to a more tractable form, which is presented in subsection III-B. This special case is then used to develop a multi-object tracker that can operate in unknown background in section IV.

A. GLMB filter for Jump Markov Systems

A Jump Markov System (JMS) consists of a set of parameterised state space models, whose parameters evolve with time according to a finite state Markov chain. A JMS can be specified in terms of the standard system parameters for each mode or class as follows.

Let $\mathbb{M}$ be the (discrete) index set of modes in the system. Suppose that mode $m$ is in effect at time $k$, then the state transition density from $\zeta$, at time $k$, to $\zeta_+$, at time $k+1$, is denoted by $f_+^{(m)}(\zeta_+|\zeta)$, and the likelihood of $\zeta$ generating the measurement $z$ is denoted by $g^{(m)}(z|\zeta)$ [26], [27], [28]. Moreover, the joint transition of the state and mode assumes the form

$$f_+(\zeta_+, m_+|\zeta, m) = f_+^{(m_+)}(\zeta_+|\zeta)\,\vartheta_+(m_+|m), \tag{9}$$

where $\vartheta_+(m_+|m)$ denotes the probability of switching from mode $m$ to $m_+$ (and satisfies $\sum_{m_+\in\mathbb{M}}\vartheta_+(m_+|m)=1$). Note that by defining the augmented state as $x=(\zeta,m)\in\mathbb{X}\times\mathbb{M}$, a JMS model can be expressed as a standard state space model with transition density (9) and measurement likelihood function $g(z|\zeta,m)=g^{(m)}(z|\zeta)$.

In a multi-object system, each object is identified by a label $\ell$ that remains unchanged throughout its life, hence the JMS state equations for such an object are written as

$$f_+(\zeta_+, m_+|\zeta, m, \ell) = f_+^{(m_+)}(\zeta_+|\zeta,\ell)\,\vartheta_+(m_+|m), \tag{10}$$

$$g(z|\zeta, m, \ell) = g^{(m)}(z|\zeta,\ell). \tag{11}$$

Additionally, to emphasize the dependence on the mode, the birth, survival and detection parameters are, respectively, denoted as

$$p_{B,+}^{(m_+)}(\zeta_+,\ell_+) \triangleq p_{B,+}(\zeta_+, m_+, \ell_+),$$
$$P_S^{(m)}(\zeta,\ell) \triangleq P_S(\zeta, m, \ell),$$
$$P_D^{(m)}(\zeta,\ell) \triangleq P_D(\zeta, m, \ell).$$

Substituting these parameters and the JMS state equations (10)-(11) into the GLMB recursion in [25] yields the so-called JMS-GLMB recursion.
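The factorisation in (9) says that a JMS transition can be drawn in two stages: the mode switches according to $\vartheta_+(\cdot|m)$, and the kinematic state then evolves under the model of the new mode $m_+$. The sketch below illustrates this ordering with a hypothetical two-mode, 1-D constant-velocity model; the mode names, transition probabilities and noise levels are assumptions made for the example and do not come from the paper.

```python
import random

# Hypothetical mode transition matrix vartheta_+(m_+ | m); each row sums to 1.
MODE_TRANSITION = {
    "steady": {"steady": 0.9, "maneuver": 0.1},
    "maneuver": {"steady": 0.3, "maneuver": 0.7},
}
# Hypothetical mode-dependent acceleration noise (standard deviation).
PROCESS_NOISE_STD = {"steady": 0.1, "maneuver": 2.0}


def sample_mode(mode, rng):
    """Draw m_+ from vartheta_+(. | m) by inverting the cumulative distribution."""
    u, cumulative = rng.random(), 0.0
    for next_mode, prob in MODE_TRANSITION[mode].items():
        cumulative += prob
        if u < cumulative:
            return next_mode
    return next_mode  # guard against floating-point round-off in the row sum


def sample_jms_transition(state, mode, dt, rng):
    """Draw (zeta_+, m_+) from the joint transition density factored as in (9)."""
    next_mode = sample_mode(mode, rng)  # first the mode switch
    position, velocity = state
    accel = rng.gauss(0.0, PROCESS_NOISE_STD[next_mode])  # noise of the new mode m_+
    next_state = (
        position + velocity * dt + 0.5 * accel * dt ** 2,
        velocity + accel * dt,
    )
    return next_state, next_mode


rng = random.Random(1)
state, mode = (0.0, 1.0), "steady"
trajectory = []
for _ in range(50):
    state, mode = sample_jms_transition(state, mode, 1.0, rng)
    trajectory.append((state, mode))
```

Note that the state transition uses the parameters of $m_+$, not $m$, which matches the superscript $(m_+)$ on $f_+$ in (9).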

Proposition 1.

If the filtering density at time $k$ is the GLMB (5), then the filtering density at time $k+1$ is the GLMB

$$\boldsymbol{\pi}(\mathbf{X}_+|Z_+) \propto \Delta(\mathbf{X}_+) \sum_{I,\xi,I_+,\theta_+} \omega^{(I,\xi)}\,\omega_{Z_+}^{(I,\xi,I_+,\theta_+)}\,\delta_{I_+}[\mathcal{L}(\mathbf{X}_+)]\left[p_{Z_+}^{(\xi,\theta_+)}\right]^{\mathbf{X}_+} \tag{12}$$

where $I\in\mathcal{F}(\mathbb{L})$, $\xi\in\Xi$, $I_+\in\mathcal{F}(\mathbb{L}_+)$, $\theta_+\in\Theta_+$,

$$\omega_{Z_+}^{(I,\xi,I_+,\theta_+)} = 1_{\Theta_+(I_+)}(\theta_+)\left[1-\bar{P}_S^{(\xi)}\right]^{I-I_+}\left[\bar{P}_S^{(\xi)}\right]^{I\cap I_+} \times \left[1-r_{B,+}\right]^{\mathbb{B}_+-I_+}\, r_{B,+}^{\mathbb{B}_+\cap I_+}\left[\bar{\psi}_{Z_+}^{(\xi,\theta_+)}\right]^{I_+} \tag{13}$$

$$\bar{P}_S^{(\xi)}(\ell) = \sum_{m\in\mathbb{M}} \bar{P}_S^{(\xi)}(m,\ell), \tag{14}$$

$$\bar{P}_S^{(\xi)}(m,\ell) = \left\langle p^{(\xi)}(\cdot,m,\ell),\, P_S^{(m)}(\cdot,\ell)\right\rangle, \tag{15}$$

$$\bar{\psi}_{Z_+}^{(\xi,\theta_+)}(\ell) = \sum_{m_+\in\mathbb{M}} \bar{\psi}_{Z_+}^{(\xi,\theta_+)}(m_+,\ell), \tag{16}$$

$$\bar{\psi}_{Z_+}^{(\xi,\theta_+)}(m_+,\ell) = \left\langle \bar{p}_+^{(\xi)}(\cdot,m_+,\ell),\, \psi_{Z_+}^{(\theta_+(\ell))}(\cdot,m_+,\ell)\right\rangle \tag{17}$$

$$\bar{p}_+^{(\xi)}(\zeta_+,m_+,\ell) = 1_{\mathbb{B}_+}(\ell)\,p_B^{(m_+)}(\zeta_+,\ell) + 1_{\mathbb{L}}(\ell)\,\frac{\sum_{m\in\mathbb{M}}\left\langle P_S^{(m)}(\cdot,\ell)\,f_+^{(m_+)}(\zeta_+|\cdot,\ell),\, p^{(\xi)}(\cdot,m,\ell)\right\rangle \vartheta(m_+|m)}{\bar{P}_S^{(\xi)}(\ell)} \tag{18}$$

$$p_{Z_+}^{(\xi,\theta_+)}(\zeta_+,m_+,\ell) = \frac{\bar{p}_+^{(\xi)}(\zeta_+,m_+,\ell)\,\psi_{Z_+}^{(\theta_+(\ell))}(\zeta_+,m_+,\ell)}{\bar{\psi}_{Z_+}^{(\xi,\theta_+)}(m_+,\ell)} \tag{19}$$

$$\psi_{\{z_{1:|Z|}\}}^{(j)}(\zeta,m,\ell) = \begin{cases} \dfrac{P_D^{(m)}(\zeta,\ell)\,g^{(m)}(z_j|\zeta,\ell)}{\kappa(z_j)+\delta_0[\kappa(z_j)]}, & \text{if } j\in\{1,...,|Z|\} \\ 1-P_D^{(m)}(\zeta,\ell), & \text{if } j=0 \end{cases} \tag{20}$$

Notice that the above expression is in $\delta$-GLMB form since it can be written as a sum over $I_+,\xi,\theta_+$ with weights

$$\omega_{Z_+}^{(I_+,\xi,\theta_+)} \propto \sum_{I} \omega^{(I,\xi)}\,\omega_{Z_+}^{(I,\xi,I_+,\theta_+)}.$$

This special case of the GLMB recursion is particularly useful for tracking multiple maneuvering objects and joint multi-object tracking and classification.
Indeed, the application of the JMS-GLMB recursion to multiple maneuvering object tracking has been reported in our preliminary work [10], where separate prediction and update steps were introduced. The same result was independently reported in [29].
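The single-object factor $\psi^{(j)}$ in (20) is the building block of every association weight in the recursion, and its denominator $\kappa(z_j)+\delta_0[\kappa(z_j)]$ is what lets one expression cover both positive and identically-zero clutter. The sketch below is illustrative only: it assumes a scalar measurement with a Gaussian likelihood and a state-independent detection probability, and the function names are hypothetical.

```python
import math


def gaussian_likelihood(z, predicted_z, noise_std):
    """Scalar Gaussian g(z | zeta); a stand-in for the mode-dependent likelihood."""
    residual = (z - predicted_z) / noise_std
    return math.exp(-0.5 * residual * residual) / (noise_std * math.sqrt(2.0 * math.pi))


def psi(j, measurements, predicted_z, p_detect, noise_std, clutter_intensity):
    """Evaluate (20) for association index j (0 = missed, 1..|Z| = measurement index)."""
    if j == 0:
        return 1.0 - p_detect
    z_j = measurements[j - 1]
    kappa = clutter_intensity(z_j)
    # delta_0[kappa] is 1 only when kappa is exactly zero, so the denominator
    # is 1 in the zero-clutter case and kappa otherwise.
    denominator = kappa + (1.0 if kappa == 0.0 else 0.0)
    return p_detect * gaussian_likelihood(z_j, predicted_z, noise_std) / denominator


zs = [1.0, 4.0]
missed = psi(0, zs, 1.2, 0.9, 0.5, lambda z: 0.0)
no_clutter = psi(1, zs, 1.2, 0.9, 0.5, lambda z: 0.0)
with_clutter = psi(1, zs, 1.2, 0.9, 0.5, lambda z: 0.5)
```

The zero-clutter branch is the one used later in the hybrid object-clutter model, where the Poisson clutter intensity is set identically to zero and false detections are explained by clutter generators instead.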

B. Multi-Class GLMB

The JMS-GLMB recursion can be applied to the joint multi-object tracking and classification problem by using the mode as the class label (not to be confused with the object label). What distinguishes this problem from generic JMS-GLMB filtering is that the modes do not interact with each other in the following sense:

1) All possible states of a new object with the same object label share a common mode (class label);
2) An object cannot switch between different modes from one time step to the next.

Let $\mathbb{B}^{(m)}$ denote the set of labels of all elements in $\mathbb{X}\times\mathbb{M}\times\mathbb{B}$ with mode $m$. Then condition 1 implies that the label sets $\mathbb{B}^{(m)}$ and $\mathbb{B}^{(m')}$ for different modes $m$ and $m'$ are disjoint (otherwise there exists a label $\ell$ in both $\mathbb{B}^{(m)}$ and $\mathbb{B}^{(m')}$, which means there are states in $\mathbb{X}\times\mathbb{M}\times\mathbb{B}$ with different modes $m$ and $m'$ that share a common label $\ell$). Furthermore, the sets $\mathbb{B}^{(m)}$, $m\in\mathbb{M}$ cover $\mathbb{B}$, i.e. $\mathbb{B}=\bigcup_{m\in\mathbb{M}}\mathbb{B}^{(m)}$, and thus form a partition of the space $\mathbb{B}$. A new object is classified as class $m$ (and has mode $m$) if and only if its label falls into $\mathbb{B}^{(m)}$. Thus for an LMB birth model, condition 1 means

$$r_{B,+}(\ell_+) = \sum_{m_+\in\mathbb{M}} r_{B,+}^{(m_+)}\,1_{\mathbb{B}_+^{(m_+)}}(\ell_+), \tag{21}$$

$$p_{B,+}^{(m_+)}(\zeta_+,\ell_+) = p_{B,+}^{(m_+)}(\zeta_+)\,1_{\mathbb{B}_+^{(m_+)}}(\ell_+). \tag{22}$$

Note that $r_{B,+}^{(m_+)}$ and $p_{B,+}^{(m_+)}(\zeta_+)$ are respectively the existence probability and probability density of the kinematics $\zeta_+$ of a new object given mode $m_+$, while $1_{\mathbb{B}_+^{(m_+)}}(\ell_+)$ is the probability of mode $m_+$ given label $\ell_+$.

Condition 2 means that the mode transition probability is

$$\vartheta(m_+|m) = \delta_m[m_+], \tag{23}$$

which implies that each object belongs to exactly one of the classes in $\mathbb{M}$ for its entire life.
Consequently, the non-interacting mode condition means that at time $k$, the label space for all class $m$ objects is $\mathbb{L}^{(m)} = \biguplus_{t=0}^{k}\mathbb{B}_t^{(m)}$, and the set of all possible labels is given by the disjoint union $\mathbb{L} = \biguplus_{m\in\mathbb{M}}\mathbb{L}^{(m)}$.

For a multi-object JMS system with non-interacting modes, the JMS-GLMB recursion reduces to a form where the weights and multi-object exponentials can be separated according to classes. We call this form the multi-class GLMB.

Proposition 2.

Let $\mathbf{X}^{(m)}$ denote the subset of $\mathbf{X}$ with mode $m$, and hence $\mathbf{X} = \biguplus_{m\in\mathbb{M}}\mathbf{X}^{(m)}$. Suppose that the hybrid multi-object density at time $k$ is a GLMB of the form

$$\boldsymbol{\pi}(\mathbf{X}) = \sum_{\xi,I} 1_{\Theta(I)}(\xi\perp\Theta) \prod_{m\in\mathbb{M}} \boldsymbol{\pi}^{(I^{(m)},\xi^{(m)})}(\mathbf{X}^{(m)}) \tag{24}$$

where $\xi\in\Xi$, $I\subseteq\mathbb{L}$, $\xi\perp\Theta$ denotes the projection of $\xi$ into the space $\Theta$, $I^{(m)}\triangleq I\cap\mathbb{L}^{(m)}$, $\xi^{(m)} = \xi|_{\mathbb{L}_0^{(m)}\times...\times\mathbb{L}_k^{(m)}}$ (i.e. the map $\xi$ restricted to $\mathbb{L}_0^{(m)}\times...\times\mathbb{L}_k^{(m)}$), and

$$\boldsymbol{\pi}^{(I,\xi)}(\mathbf{X}) \triangleq \Delta(\mathbf{X})\,w^{(I,\xi)}\,\delta_I[\mathcal{L}(\mathbf{X})]\left[p^{(\xi)}\right]^{\mathbf{X}} \tag{25}$$

Then the hybrid multi-object filtering density at time $k+1$ is the GLMB

$$\boldsymbol{\pi}_{Z_+}(\mathbf{X}_+) \propto \sum_{\xi,I,\theta_+,I_+} 1_{\Theta_+(I_+)}(\theta_+) \prod_{m\in\mathbb{M}} \boldsymbol{\pi}_{Z_+}^{(m,I^{(m)},\xi^{(m)},I_+^{(m)},\theta_+^{(m)})}(\mathbf{X}_+^{(m)}) \tag{26}$$

where $I_+\in\mathcal{F}(\mathbb{L}_+)$, $\theta_+\in\Theta_+$, $I_+^{(m)} = I_+\cap\mathbb{L}_+^{(m)}$, $\theta_+^{(m)} = \theta_+|_{\mathbb{L}_+^{(m)}}$,

$$\boldsymbol{\pi}_{Z_+}^{(m,I,\xi,I_+,\theta_+)}(\mathbf{X}_+) = \Delta(\mathbf{X}_+)\,w_{Z_+}^{(m,I,\xi,I_+,\theta_+)}\,w^{(I,\xi)}\,\delta_{I_+}[\mathcal{L}(\mathbf{X}_+)]\left[p_{Z_+}^{(\xi,\theta_+)}\right]^{\mathbf{X}_+} \tag{27}$$

$$w_{Z_+}^{(m,I,\xi,I_+,\theta_+)} = \left[\bar{\psi}_{Z_+}^{(\xi,\theta_+)}(m,\cdot)\right]^{I_+}\left[1-r_{B,+}\right]^{\mathbb{B}_+^{(m)}-I_+}\, r_{B,+}^{\mathbb{B}_+^{(m)}\cap I_+} \times \left[1-\bar{P}_S^{(\xi)}(m,\cdot)\right]^{I-I_+}\left[\bar{P}_S^{(\xi)}(m,\cdot)\right]^{I\cap I_+} \tag{28}$$

$$\bar{P}_S^{(\xi)}(m,\ell) = \left\langle p^{(\xi)}(\cdot,m,\ell),\, P_S^{(m)}(\cdot,\ell)\right\rangle, \tag{29}$$

$$\bar{\psi}_{Z_+}^{(\xi,\theta_+)}(m,\ell) = \left\langle \bar{p}_+^{(\xi)}(\cdot,m,\ell),\, \psi_{Z_+}^{(\theta_+(\ell))}(\cdot,m,\ell)\right\rangle, \tag{30}$$

$$\bar{p}_+^{(\xi)}(\zeta,m,\ell) = 1_{\mathbb{L}^{(m)}}(\ell)\,\frac{\left\langle P_S^{(m)}(\cdot,\ell)\,f_+^{(m)}(\zeta|\cdot,\ell),\, p^{(\xi)}(\cdot,m,\ell)\right\rangle}{\bar{P}_S^{(\xi)}(m,\ell)} + 1_{\mathbb{B}_+^{(m)}}(\ell)\,p_B^{(m)}(\zeta,\ell) \tag{31}$$

$$p_{Z_+}^{(\xi,\theta_+)}(\zeta,m,\ell) = \frac{\bar{p}_+^{(\xi)}(\zeta,m,\ell)\,\psi_{Z_+}^{(\theta_+(\ell))}(\zeta,m,\ell)}{\bar{\psi}_{Z_+}^{(\xi,\theta_+)}(m,\ell)} \tag{32}$$

$$\psi_{\{z_{1:|Z|}\}}^{(j)}(\zeta,m,\ell) = \begin{cases} \dfrac{P_D^{(m)}(\zeta,\ell)\,g^{(m)}(z_j|\zeta,\ell)}{\kappa(z_j)+\delta_0[\kappa(z_j)]}, & \text{if } j\in\{1,...,|Z|\} \\ 1-P_D^{(m)}(\zeta,\ell), & \text{if } j=0 \end{cases} \tag{33}$$

Proof.

Note that the $\mathbb{L}_0^{(m)}\times...\times\mathbb{L}_k^{(m)}$, $m\in\mathbb{M}$ form a partition of $\mathbb{L}_0\times...\times\mathbb{L}_k$, and since each $\xi^{(m)}$ was defined as a restriction of $\xi$ over $\mathbb{L}_0^{(m)}\times...\times\mathbb{L}_k^{(m)}$, $\xi$ is completely characterized by the $\xi^{(m)}$, $m\in\mathbb{M}$. By defining

$$\omega^{(I,\xi)} = 1_{\Theta(I)}(\xi\perp\Theta) \prod_{m\in\mathbb{M}} w^{(I^{(m)},\xi^{(m)})} \tag{34}$$

$$p^{(\xi)}(\zeta,m,\ell) = \left[p^{(\xi^{(m)})}(\zeta,m,\ell)\right]^{1_{\mathbb{L}^{(m)}}(\ell)} \tag{35}$$

it can be seen that (24) is a GLMB of the form (5) since

$$\delta_I[\mathcal{L}(\mathbf{X})] = \prod_{m\in\mathbb{M}} \delta_{I^{(m)}}[\mathcal{L}(\mathbf{X}^{(m)})],$$

$$\left[p^{(\xi)}\right]^{\mathbf{X}} = \left[p^{(\xi)}\right]^{\biguplus_{m\in\mathbb{M}}\mathbf{X}^{(m)}} = \prod_{m\in\mathbb{M}} \left[p^{(\xi^{(m)})}\right]^{\mathbf{X}^{(m)}}.$$

Thus by applying Proposition 1, the hybrid multi-object filtering density at time $k+1$ is given by (12)-(20). Substituting (34), (35), (21)-(23) into (12)-(20), decomposing

$$\mathbf{X}_+ = \biguplus_{m\in\mathbb{M}} \mathbf{X}_+^{(m)} \tag{36}$$

$$\omega_{Z_+}^{(I,\xi,I_+,\theta_+)} = 1_{\Theta_+(I_+)}(\theta_+) \prod_{m\in\mathbb{M}} w_{Z_+}^{(m,I^{(m)},\xi^{(m)},I_+^{(m)},\theta_+^{(m)})} \tag{37}$$

$$p_{Z_+}^{(\xi,\theta_+)}(\zeta,m,\ell) = \left[p_{Z_+}^{(\xi^{(m)},\theta_+^{(m)})}(\zeta,m,\ell)\right]^{1_{\mathbb{L}_+^{(m)}}(\ell)} \tag{38}$$

and rearranging yields (26). Note that (23) ensures that $m_+ = m$.

Given a GLMB filtering density of the multi-class form (24), the GLMB filtering density for class $c\in\mathbb{M}$ can be obtained by marginalizing the other classes according to the following proposition.

Proposition 3.

For the multi-class GLMB (24), the marginal GLMB for class $c$ is given by

$$\boldsymbol{\pi}\left(\mathbf{X}^{(c)}\right) = \Delta(\mathbf{X}^{(c)}) \sum_{\xi,I} \omega^{(I,\xi)}\,\delta_{I^{(c)}}[\mathcal{L}(\mathbf{X}^{(c)})]\left[p^{(\xi^{(c)})}\right]^{\mathbf{X}^{(c)}}$$

Proof.

Note that

$$\int \boldsymbol{\pi}^{(I^{(m)},\xi^{(m)})}(\mathbf{X}^{(m)})\,\delta\mathbf{X}^{(m)} = \int \Delta(\mathbf{X}^{(m)})\,w^{(I^{(m)},\xi^{(m)})}\,\delta_{I^{(m)}}[\mathcal{L}(\mathbf{X}^{(m)})]\left[p^{(\xi)}\right]^{\mathbf{X}^{(m)}}\delta\mathbf{X}^{(m)} = w^{(I^{(m)},\xi^{(m)})}.$$

Since the $\mathbf{X}^{(m)}$, $m\in\mathbb{M}$ are disjoint,

$$\begin{aligned}
\boldsymbol{\pi}(\mathbf{X}^{(c)}) &= \int \boldsymbol{\pi}\Big(\biguplus_{m\in\mathbb{M}}\mathbf{X}^{(m)}\Big)\,\delta\Big(\biguplus_{m\in\mathbb{M}-\{c\}}\mathbf{X}^{(m)}\Big) \\
&= \int \sum_{\xi,I} 1_{\Theta(I)}(\xi\perp\Theta) \prod_{m\in\mathbb{M}} \boldsymbol{\pi}^{(I^{(m)},\xi^{(m)})}(\mathbf{X}^{(m)})\,\delta\Big(\biguplus_{m\in\mathbb{M}-\{c\}}\mathbf{X}^{(m)}\Big) \\
&= \sum_{\xi,I} 1_{\Theta(I)}(\xi\perp\Theta)\,\boldsymbol{\pi}^{(I^{(c)},\xi^{(c)})}(\mathbf{X}^{(c)}) \prod_{m\in\mathbb{M}-\{c\}} \int \boldsymbol{\pi}^{(I^{(m)},\xi^{(m)})}(\mathbf{X}^{(m)})\,\delta\mathbf{X}^{(m)} \\
&= \sum_{\xi,I} 1_{\Theta(I)}(\xi\perp\Theta)\,\boldsymbol{\pi}^{(I^{(c)},\xi^{(c)})}(\mathbf{X}^{(c)}) \prod_{m\in\mathbb{M}-\{c\}} w^{(I^{(m)},\xi^{(m)})} \\
&= \Delta(\mathbf{X}^{(c)}) \sum_{\xi,I} \omega^{(I,\xi)}\,\delta_{I^{(c)}}[\mathcal{L}(\mathbf{X}^{(c)})]\left[p^{(\xi^{(c)})}\right]^{\mathbf{X}^{(c)}}.
\end{aligned}$$

IV. GLMB FILTERING WITH UNKNOWN BACKGROUND

Clutter or false detections are generally understood as detections that do not correspond to any object [1]–[4]. Since the number of false detections and their values are random, clutter is usually modelled by RFSs in the literature [3], [4], [30]. The simplest and the most commonly used clutter model is the Poisson RFS [30], in most cases with a uniform intensity over the surveillance region. Alternatively, clutter can be treated as detections originating from clutter generators, i.e. objects that are not of interest to the tracker [9], [18]–[20].

In [9] a CPHD recursion was derived to propagate separate intensity functions for clutter generators and objects of interest, together with the cardinality distribution of the hybrid multi-object state. Similarly, in [20] analogous multi-Bernoulli recursions were derived to propagate the disjoint union of objects of interest and clutter generators. In this work we show that the multi-class GLMB filter is an effective multi-object tracker that can operate under unknown background by learning the clutter and detection model on-the-fly.

This section details an on-line multi-object tracker that operates in unknown clutter rate and detection profile. In particular we propose a GLMB clutter model in subsection IV-A by treating clutter as a special class of objects with completely uncertain dynamics, and describe a dedicated GLMB recursion for propagating the joint filtering density of clutter generators and objects of interest. Implementation details are given in subsection IV-B. Extension of the proposed algorithm to accommodate unknown detection profile is described in subsection IV-F.

A. GLMB Joint Object-Clutter Model

We propose to model the finite set of clutter generators and objects of interest as two non-interacting classes of objects, and propagate this so-called hybrid multi-object filtering density forward in time via the multi-class GLMB recursion. The GLMB filtering density of the hybrid multi-object state captures all relevant statistical information on the objects of interest as well as the clutter generators. What distinguishes the objects of interest from clutter generators is that the former have relatively predictable dynamics whereas the latter have completely random dynamics.

In the hybrid multi-object model, the Poisson clutter intensity $\kappa$ is identically $0$ and each detection is generated from either a clutter generator or an object of interest, which constitute, respectively, the two modes (or classes) 0 and 1 of the mode space $\mathbb{M}=\{0,1\}$. Since the classes are non-interacting, there is no switching between objects of interest and clutter generators. Moreover, the label space for new born clutter generators $\mathbb{B}^{(0)}$ and the label space for new born objects of interest $\mathbb{B}^{(1)}$ are disjoint and the LMB birth parameters are given by

$$r_{B,+}(\ell_+) = r_{B,+}^{(0)}\,1_{\mathbb{B}_+^{(0)}}(\ell_+) + r_{B,+}^{(1)}\,1_{\mathbb{B}_+^{(1)}}(\ell_+),$$

$$p_{B,+}^{(m_+)}(\zeta_+,\ell_+) = p_{B,+}^{(m_+)}(\zeta_+)\,1_{\mathbb{B}_+^{(m_+)}}(\ell_+).$$

Since clutter generators are distinguishable from targets by their completely random dynamics, each clutter generator has a transition density independent of the previous state and a uniform measurement likelihood in the observation region with volume $V$:

$$f_+^{(0)}(\zeta_+|\zeta,\ell) = s(\zeta_+),$$

$$g^{(0)}(z|\zeta,\ell) = u(z)\,V^{-1}.$$

Note that the labels of clutter generators can effectively be ignored since it is implicit that their labels are distinct but are otherwise uninformative. Further, for Gaussian implementations it is assumed that the survival and detection probabilities for clutter generators are state independent:

$$P_S^{(0)}(\zeta,\ell) = P_S^{(0)},$$

$$P_D^{(0)}(\zeta,\ell) = P_D^{(0)}.$$

Applying the multi-class GLMB recursion to this model, it can be easily seen that all clutter generators are functionally identical (from birth through prediction and update),

$$p_B^{(0)}(\zeta,\ell) = \bar{p}_+^{(\xi^{(0)})}(\zeta,0,\ell) = p_{Z_+}^{(\xi^{(0)},\theta_+^{(0)})}(\zeta,0,\ell) = s(\zeta),$$

and that the weight update for clutter generators reduces to

$$\begin{aligned}
w_{Z_+}^{(0,I^{(0)},\xi^{(0)},I_+^{(0)},\theta_+^{(0)})} ={}& \left[1-P_S^{(0)}\right]^{|I^{(0)}-I_+^{(0)}|}\left[P_S^{(0)}\right]^{|I^{(0)}\cap I_+^{(0)}|} \\
&\times \left[1-r_{B,+}^{(0)}\right]^{|\mathbb{B}_+^{(0)}-I_+^{(0)}|}\left[r_{B,+}^{(0)}\right]^{|\mathbb{B}_+^{(0)}\cap I_+^{(0)}|} \\
&\times \left[1-P_{D,+}^{(0)}\right]^{|\{\ell\in I_+^{(0)}:\,\theta_+^{(0)}(\ell)=0\}|}\left[P_{D,+}^{(0)}\,V^{-1}\right]^{|\{\ell\in I_+^{(0)}:\,\theta_+^{(0)}(\ell)>0\}|}
\end{aligned} \tag{39}$$

Thus propagation of clutter generators within each GLMB component reduces to propagation of their weights

$$w_{Z_+}^{(0,I_+^{(0)},\xi^{(0)},\theta_+^{(0)})} = \sum_{I^{(0)}} w^{(I^{(0)},\xi^{(0)})}\,w_{Z_+}^{(0,I^{(0)},\xi^{(0)},I_+^{(0)},\theta_+^{(0)})}.$$

B. Implementation

The key challenge in the implementation of the multi-class GLMB filter is the propagation of the GLMB components, which involves, for each parent GLMB component $(I,\xi)$, searching the space $\mathcal{F}(\mathbb{L}_+)\times\Theta_+$ to find a set of $(I_+,\theta_+)$ such that the children components $(I,\xi,I_+,\theta_+)$ have significant weights $\omega_{Z_+}^{(I,\xi,I_+,\theta_+)}$. In [25], the set of $(I_+,\theta_+)$ is generated from a Gibbs sampler whose stationary distribution is constructed so that only valid children components have positive probabilities, and those with high weights are more likely to be sampled than those with low weights. A direct application of this approach to generate new children would, however, be expensive, for the following reasons.

Let $P=|I|$, $P^{(0)}=|I^{(0)}|$, $P^{(1)}=|I^{(1)}|$ and $M=|Z_+|$. According to [25] the complexity of the joint prediction and update via Gibbs sampling with $T$ iterations is $O(TP^2M)$. Since the present formulation treats clutter as objects, the total number of hypothesized objects satisfies $P\ge P^{(0)}\ge M$, and hence the complexity is at least $O(TM^3)$, which is cubic in the number of measurements and results in a relatively inefficient implementation. This occurs because the majority of the computational effort is spent on clutter generators even though they are not of interest. This problem is exacerbated as the clutter rate increases.

In the following we propose a more efficient implementation by focusing on the filtering density of the objects of interest instead of the hybrid multi-object filtering density. Observe that given any $(I_+^{(1)},\theta_+^{(1)})\in\mathcal{F}(\mathbb{L}_+^{(1)})\times\Theta_+^{(1)}$ and $(I_+^{(0)},\theta_+^{(0)})\in\mathcal{F}(\mathbb{L}_+^{(0)})\times\Theta_+^{(0)}$, where $\Theta_+^{(m)}$ denotes the space of positive 1-1 maps from $\mathbb{L}_+^{(m)}$ to $\{0,1,...,M\}$, we can uniquely define

$$(I_+,\theta_+) \triangleq \left(I_+^{(1)}\uplus I_+^{(0)},\; 1_{\mathbb{L}_+^{(1)}}\theta_+^{(1)} + 1_{\mathbb{L}_+^{(0)}}\theta_+^{(0)}\right). \tag{40}$$

Further, the weight of the resulting component $(I,\xi,I_+,\theta_+)$ is

$$\omega_{Z_+}^{(I,\xi,I_+,\theta_+)} = 1_{\Theta(I_+)}(\theta_+)\; w_{Z_+}^{(0,I^{(0)},\xi^{(0)},I_+^{(0)},\theta_+^{(0)})}\; w_{Z_+}^{(1,I^{(1)},\xi^{(1)},I_+^{(1)},\theta_+^{(1)})} \tag{41}$$

see Proposition 2 and (37). Note that if $\theta_+$ is not a valid association map then $1_{\Theta(I_+)}(\theta_+)=0$, and hence the weight is zero.

For each parent GLMB component $(I,\xi)$, rather than searching for $(I_+,\theta_+)$ with significant $\omega_{Z_+}^{(I,\xi,I_+,\theta_+)}$ in the space $\mathcal{F}(\mathbb{L}_+)\times\Theta_+$, we:

1) seek $(I_+^{(1)},\theta_+^{(1)})$ with significant $w_{Z_+}^{(1,I^{(1)},\xi^{(1)},I_+^{(1)},\theta_+^{(1)})}$ from the smaller space $\mathcal{F}(\mathbb{L}_+^{(1)})\times\Theta_+^{(1)}$;
2) for each such $(I_+^{(1)},\theta_+^{(1)})$ find the $(I_+^{(0)},\theta_+^{(0)})$ with the best $w_{Z_+}^{(0,I^{(0)},\xi^{(0)},I_+^{(0)},\theta_+^{(0)})}$, subject to the constraint

$$1_{\mathbb{L}_+^{(1)}}\theta_+^{(1)} + 1_{\mathbb{L}_+^{(0)}}\theta_+^{(0)} \in \Theta(I_+^{(1)}\uplus I_+^{(0)}); \tag{42}$$

3) construct $(I_+,\theta_+)$ from $(I_+^{(1)},\theta_+^{(1)})$ and $(I_+^{(0)},\theta_+^{(0)})$ via (40) and compute the corresponding weight via (41).

Due to the constraint (42), $1_{\Theta(I_+)}(\theta_+)=1$, and hence it follows from (41) that the resulting GLMB component $(I,\xi,I_+,\theta_+)$ also has significant weight.

The advantage of this strategy is twofold:

• searching over a much smaller space $\mathcal{F}(\mathbb{L}_+^{(1)})\times\Theta_+^{(1)}$ results in a complexity $O(T(P^{(1)})^2M)$ that is linear in the number of measurements, since typically $P^{(1)}\ll M$;
• finding $(I_+^{(0)},\theta_+^{(0)})$ with the best weight subject to the constraint $\theta_+\in\Theta(I_+)$ is straightforward and requires minimal computation.

C. Propagating Objects of Interest

One way to generate significant $(I_+^{(1)}, \theta_+^{(1)})$ is to design a Gibbs sampler with stationary distribution $w_{Z_+}^{(1, I^{(1)}, \xi^{(1)}, I_+^{(1)}, \theta_+^{(1)})}$. However, this approach requires computing the hybrid multi-object density, which we try to avoid in the first place.

A much more efficient alternative is to treat the multi-Bernoulli clutter as Poisson with matching intensity, and apply the standard GLMB filter (the JMS-GLMB filter (12) with a single mode), where the Gibbs sampler [32] (or Murty's algorithm [31]) can be used to obtain significant $(I_+^{(1)}, \theta_+^{(1)})$ [25]. Since there are $|I^{(0)}|$ clutter generators from the previous time with survival probability $P_S^{(0)}$, and $|B_+^{(0)}|$ clutter births with probability $r_{B,+}^{(0)}$, the predicted clutter intensity is given by

$$\hat{\kappa}_+ = \left(P_S^{(0)} |I^{(0)}| + r_{B,+}^{(0)} |B_+^{(0)}|\right) P_{D,+}^{(0)} V^{-1}.$$

Note that a Poisson RFS has larger variance on the number of clutter points than a multi-Bernoulli with matching intensity. Hence, in treating clutter as a Poisson RFS, we are effectively tempering the clutter model to induce the Gibbs sampler (or Murty's algorithm) to generate more diverse components [25].

Following [25], let us enumerate $Z_+ = \{z_{1:M}\}$, $I^{(1)} = \{\ell_{1:R}\}$, and $B_+^{(1)} = \{\ell_{R+1:P}\}$. The $(I_+^{(1)}, \theta_+^{(1)}) \in \mathcal{F}(\mathbb{L}_+^{(1)}) \times \Theta(I_+^{(1)})$ at time $k+1$ with significant weights are determined by solving a ranked assignment problem with cost matrix $[\eta_i^{(\xi^{(1)})}(j)]$, $i = 1:P$, $j = -1:M$, where

$$\eta_i^{(\xi^{(1)})}(j) = \begin{cases} 1 - \bar{P}_S^{(\xi^{(1)})}(1, \ell_i), & \ell_i \in I^{(1)},\; j < 0 \\ \bar{P}_S^{(\xi^{(1)})}(1, \ell_i)\, \bar{\psi}_{Z_+}^{(\xi^{(1)}, \theta_+^{(1)})}(1, \ell_i), & \ell_i \in I^{(1)},\; j \geq 0 \\ 1 - r_{B,+}(\ell_i), & \ell_i \in B_+^{(1)},\; j < 0 \\ r_{B,+}(\ell_i)\, \bar{\psi}_{Z_+}^{(\xi^{(1)}, \theta_+^{(1)})}(1, \ell_i), & \ell_i \in B_+^{(1)},\; j \geq 0 \end{cases}$$

$$\bar{\psi}_{Z_+}^{(\xi^{(1)}, \theta_+^{(1)})}(1, \ell) = \left\langle \bar{p}_+^{(\xi^{(1)})}(\cdot, 1, \ell),\; \psi_{Z_+}^{(\theta_+^{(1)}(\ell))}(\cdot, 1, \ell) \right\rangle$$

$$\psi_{Z_+}^{(j)}(\zeta, 1, \ell) = \begin{cases} \dfrac{P_{D,+}^{(1)}(\zeta, \ell)\, g_+^{(1)}(z_j \mid \zeta, \ell)}{\hat{\kappa}_+}, & \text{if } j \in \{1, ..., M\} \\ 1 - P_{D,+}^{(1)}(\zeta, \ell), & \text{if } j = 0 \end{cases}$$

Such a ranked assignment problem can be solved by Murty's algorithm or the Gibbs sampler given in Section III-D [25].
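The matching-intensity argument can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function names and the example numbers (30 previous generators, 30 birth generators, an observation volume of 4e6) are assumptions made for illustration only. It computes the predicted clutter intensity and compares the variance of the clutter count under the multi-Bernoulli model with that of a Poisson model of equal mean.

```python
def predicted_clutter_intensity(n_prev, n_birth, p_s, r_b, p_d, volume):
    """Poisson intensity matching the predicted multi-Bernoulli clutter.

    n_prev  : number of clutter generators at the previous time, |I^(0)|
    n_birth : number of clutter birth components, |B_+^(0)|
    p_s, r_b, p_d : survival, birth and detection probabilities of generators
    volume  : volume V of the observation space (hypothetical value below)
    """
    expected_detections = (p_s * n_prev + r_b * n_birth) * p_d
    return expected_detections / volume


def cardinality_variances(n_prev, n_birth, p_s, r_b, p_d):
    """Return (mean, multi-Bernoulli variance, Poisson variance) of the clutter count."""
    q_surv = p_s * p_d    # a previous generator survives and is detected
    q_birth = r_b * p_d   # a birth generator exists and is detected
    mean = n_prev * q_surv + n_birth * q_birth
    var_mb = (n_prev * q_surv * (1.0 - q_surv)
              + n_birth * q_birth * (1.0 - q_birth))
    var_poisson = mean    # a Poisson count has variance equal to its mean
    return mean, var_mb, var_poisson


if __name__ == "__main__":
    kappa = predicted_clutter_intensity(30, 30, 0.9, 0.5, 0.9, volume=4.0e6)
    mean, var_mb, var_poisson = cardinality_variances(30, 30, 0.9, 0.5, 0.9)
    print("intensity:", kappa)
    print("mean:", mean, "multi-Bernoulli var:", var_mb, "Poisson var:", var_poisson)
```

Because each term $q(1-q)$ is smaller than $q$, the multi-Bernoulli variance is always below the Poisson variance at equal mean, which is the sense in which the Poisson approximation spreads the sampler over more diverse hypotheses.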

D. Propagating Clutter Generators

Given $(I_+^{(1)}, \theta_+^{(1)})$ pertaining to the objects of interest, we proceed to determine the $(I_+^{(0)}, \theta_+^{(0)})$ pertaining to clutter generators that maximizes $w_{Z_+}^{(0, I^{(0)}, \xi^{(0)}, I_+^{(0)}, \theta_+^{(0)})}$, where $I_+^{(0)} \subseteq I^{(0)} \cup B_+^{(0)}$ and $\theta_+^{(0)} : I_+^{(0)} \to \{0:M\}$, subject to constraint (42).

Denote by $Z_+^{(1)} \subseteq Z_+$ the set of measurements assigned to $I_+^{(1)}$ by $\theta_+^{(1)}$, and by $Z_+^{(0)}$ the remaining set of measurements $Z_+ - Z_+^{(1)}$, which are due to clutter generators. Recall that clutter generators are functionally identical except in label and that their propagation reduces to calculating their corresponding weights (39). Let $N_S^{(0)} = |I^{(0)} \cap I_+^{(0)}|$ and $N_{B,+}^{(0)} = |B_+^{(0)} \cap I_+^{(0)}|$ denote the counts of surviving and newly born clutter generators, respectively. Then $|I^{(0)} - I_+^{(0)}| = |I^{(0)}| - N_S^{(0)}$ and $|B_+^{(0)} - I_+^{(0)}| = |B_+^{(0)}| - N_{B,+}^{(0)}$. Observe that the count $|Z_+^{(0)}|$ of clutter must equal the number of detections of clutter generators according to $(I_+^{(0)}, \theta_+^{(0)})$, i.e. $|Z_+^{(0)}| = |\{\ell \in I_+^{(0)} : \theta_+^{(0)}(\ell) > 0\}|$, and hence the count of misdetections of clutter generators according to $(I_+^{(0)}, \theta_+^{(0)})$ is $N_S^{(0)} + N_{B,+}^{(0)} - |Z_+^{(0)}| = |\{\ell \in I_+^{(0)} : \theta_+^{(0)}(\ell) = 0\}|$. Consequently the weight (39) can be rewritten as

$$w_{Z_+}^{(0, I^{(0)}, \xi^{(0)}, I_+^{(0)}, \theta_+^{(0)})} = \left[1 - P_S^{(0)}\right]^{|I^{(0)}| - N_S^{(0)}} \left[P_S^{(0)}\right]^{N_S^{(0)}} \left[1 - r_{B,+}^{(0)}\right]^{|B_+^{(0)}| - N_{B,+}^{(0)}} \times \left[r_{B,+}^{(0)}\right]^{N_{B,+}^{(0)}} \left[1 - P_{D,+}^{(0)}\right]^{N_S^{(0)} + N_{B,+}^{(0)} - |Z_+^{(0)}|} \left[P_{D,+}^{(0)} V^{-1}\right]^{|Z_+^{(0)}|}$$

$$\propto \left[\frac{P_S^{(0)} (1 - P_{D,+}^{(0)})}{1 - P_S^{(0)}}\right]^{N_S^{(0)}} \left[\frac{r_{B,+}^{(0)} (1 - P_{D,+}^{(0)})}{1 - r_{B,+}^{(0)}}\right]^{N_{B,+}^{(0)}}$$

Thus seeking the best $(I_+^{(0)}, \theta_+^{(0)})$ subject to constraint (42) reduces to seeking the best $(N_S^{(0)}, N_{B,+}^{(0)})$ subject to the constraints $0 \leq N_S^{(0)} \leq |I^{(0)}|$, $0 \leq N_{B,+}^{(0)} \leq |B_+^{(0)}|$ and $N_S^{(0)} + N_{B,+}^{(0)} \geq |Z_+^{(0)}|$, the last of which ensures that the misdetection count above is non-negative.
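The reduced search over the two counts can be sketched by plain enumeration. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `best_clutter_counts` is hypothetical, all probabilities are assumed to lie strictly between 0 and 1, and the total number of clutter generators is required to be at least the number of clutter measurements so that the misdetection count is non-negative.

```python
import math


def best_clutter_counts(n_prev, n_birth, n_clutter_meas, p_s, r_b, p_d):
    """Find (N_S, N_B) maximizing a**N_S * b**N_B over the feasible counts.

    a = P_S (1 - P_D) / (1 - P_S) and b = r_B (1 - P_D) / (1 - r_B) are the
    two bracketed ratios in the proportional form of the clutter weight.
    """
    log_a = math.log(p_s * (1.0 - p_d) / (1.0 - p_s))
    log_b = math.log(r_b * (1.0 - p_d) / (1.0 - r_b))
    best = None
    for n_s in range(n_prev + 1):            # 0 <= N_S <= |I^(0)|
        for n_b in range(n_birth + 1):       # 0 <= N_B <= |B_+^(0)|
            if n_s + n_b < n_clutter_meas:   # every clutter point needs a generator
                continue
            score = n_s * log_a + n_b * log_b
            if best is None or score > best[0]:
                best = (score, n_s, n_b)
    if best is None:
        raise ValueError("more clutter measurements than clutter generators")
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical numbers: 5 previous generators, 10 births, 7 clutter points.
    print(best_clutter_counts(5, 10, 7, p_s=0.9, r_b=0.5, p_d=0.9))
```

The search space has only $(|I^{(0)}|+1)(|B_+^{(0)}|+1)$ points, and because the objective is log-linear in the two counts the optimum always sits on the boundary of the feasible region, so a closed-form rule could replace the double loop.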
E. Linear Gaussian Update Parameters

Let $\mathcal{N}(\cdot; \bar{\zeta}, P)$ denote a Gaussian density with mean $\bar{\zeta}$ and covariance $P$. Consider a linear Gaussian multi-object model for the objects of interest, with $P_S^{(1)}(\zeta, \ell) = P_S^{(1)}$, $P_D^{(1)}(\zeta, \ell) = P_D^{(1)}$, $f_+^{(1)}(\zeta_+ \mid \zeta, \ell) = \mathcal{N}(\zeta_+; F\zeta, Q)$, $g^{(1)}(z \mid \zeta, \ell) = \mathcal{N}(z; H\zeta, R)$, and $p_{B,+}^{(1)}(\zeta_+) = \mathcal{N}(\zeta_+; \bar{\zeta}_+^{(1)}, P_+^{(1)})$, where $F$ is the transition matrix, $Q$ is the process noise covariance, $H$ is the observation matrix, $R$ is the observation noise covariance, and $\bar{\zeta}_+^{(1)}$ and $P_+^{(1)}$ are the mean and covariance of the kinematic state of a new object of interest. If each current density of an object of interest is a Gaussian of the form

$$p^{(\xi^{(1)})}(\zeta, 1, \ell) = \mathcal{N}(\zeta; \bar{\zeta}^{(\xi^{(1)})}(\ell), P^{(\xi^{(1)})}(\ell)) \tag{43}$$

then the terms (30), (31), (32) can be computed analytically using the following identities:

$$\int \mathcal{N}(\zeta; \bar{\zeta}, P)\, \mathcal{N}(\zeta_+; F\zeta, Q)\, d\zeta = \mathcal{N}(\zeta_+; F\bar{\zeta}, FPF^T + Q),$$

$$\mathcal{N}(\zeta; \bar{\zeta}, P)\, \mathcal{N}(z; H\zeta, R) = q(z)\, \mathcal{N}(\zeta; \bar{\zeta} + K(z - H\bar{\zeta}), [I - KH]P),$$

$$q(z) = \mathcal{N}(z; H\bar{\zeta}, HPH^T + R),$$

$$K = PH^T \left[HPH^T + R\right]^{-1}.$$

F. Extension to Unknown Detection Probability

Following the approach in [9], to jointly estimate an unknown detection probability, we augment the state with a variable $a \in [0, 1]$, i.e. $x = (\zeta, m, a, \ell)$, so that

$$P_D^{(m)}(\zeta, a, \ell) = a. \tag{44}$$

Additionally, in this model $g^{(m)}(z \mid \zeta, a, \ell) = g^{(m)}(z \mid \zeta, \ell)$, $P_S^{(m)}(\zeta, a, \ell) = P_S^{(m)}$, $p_{B,+}^{(1)}(\zeta_+, a_+) = p_{B,+}^{(1)}(\zeta_+)\, p_{B,+}^{(1)}(a_+)$, and the transition density is given by

$$f_+^{(m)}(\zeta_+, a_+ \mid \zeta, a, \ell) = f_+^{(m)}(\zeta_+ \mid \zeta, \ell)\, f_+^{(\Delta)}(a_+ \mid a). \tag{45}$$

The unknown detection probability is then modelled by a Beta distribution $\beta(\cdot; s, t)$, where $s$ and $t$ are positive shape parameters, and the single-object state density is modelled by a Beta-Gaussian density:

$$p^{(\xi^{(1)})}(\zeta, 1, a, \ell) = \beta(a; s^{(\xi^{(1)})}(\ell), t^{(\xi^{(1)})}(\ell))\, \mathcal{N}(\zeta; \bar{\zeta}^{(\xi^{(1)})}(\ell), P^{(\xi^{(1)})}(\ell)).$$

Note that in practice, we only use the Beta model for the unknown detection probability of the objects of interest. For clutter generators, we use a fixed detection probability between 0.5 and 1. Values close to 0.5 result in a large variance on the clutter cardinality and a faster response to changes in the clutter parameters, while the converse is true for values close to 1.

Analytic computation of the terms (30), (31), (32) can be performed separately for the Gaussian part (which has been given in the previous subsection) and the Beta part using [9]:

$$\beta(a_+; s_+, t_+) = \int \beta(a; s, t)\, f_+^{(\Delta)}(a_+ \mid a)\, da$$

where

$$s_+ = \left(\frac{\mu_\beta (1 - \mu_\beta)}{\sigma_{\beta,+}^2} - 1\right) \mu_\beta, \qquad t_+ = \left(\frac{\mu_\beta (1 - \mu_\beta)}{\sigma_{\beta,+}^2} - 1\right) (1 - \mu_\beta),$$

$$\mu_\beta = \frac{s}{s + t}, \qquad \sigma_\beta^2 = \frac{st}{(s + t)^2 (s + t + 1)},$$

and $\sigma_{\beta,+}^2$ denotes the predicted variance (note that $\beta(\cdot; s_+, t_+)$ has the same mean $\mu_\beta$ as $\beta(\cdot; s, t)$ but a variance $\sigma_{\beta,+}^2$ larger than $\sigma_\beta^2$), and

$$(1 - a)\, \beta(a; s, t) = \frac{B(s, t + 1)}{B(s, t)}\, \beta(a; s, t + 1),$$

$$a\, \beta(a; s, t) = \frac{B(s + 1, t)}{B(s, t)}\, \beta(a; s + 1, t),$$

where $B(s, t) = \int_0^1 a^{s-1} (1 - a)^{t-1}\, da$.

V. NUMERICAL STUDIES
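The Beta identities of Section IV-F, which drive the detection-probability estimates reported in this section, can be sanity-checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the variance inflation factor `inflation` used in the prediction step is an assumed tuning parameter rather than a value taken from the paper. The prediction keeps the Beta mean while enlarging its variance, and the update multiplies the density by $a$ (detection) or $1 - a$ (misdetection).

```python
import math


def beta_moments(s, t):
    """Mean and variance of a Beta(s, t) density."""
    mean = s / (s + t)
    var = s * t / ((s + t) ** 2 * (s + t + 1.0))
    return mean, var


def beta_predict(s, t, inflation=1.1):
    """Keep the mean, enlarge the variance by an assumed factor > 1."""
    mean, var = beta_moments(s, t)
    var_pred = inflation * var
    scale = mean * (1.0 - mean) / var_pred - 1.0
    if scale <= 0.0:
        raise ValueError("predicted variance too large for a Beta density")
    return scale * mean, scale * (1.0 - mean)


def log_beta_function(s, t):
    """log B(s, t) computed through log-gamma for numerical stability."""
    return math.lgamma(s) + math.lgamma(t) - math.lgamma(s + t)


def beta_update(s, t, detected):
    """Return (s', t', factor) with factor * Beta(a; s', t') = a * Beta(a; s, t)
    when detected, or (1 - a) * Beta(a; s, t) when missed."""
    s_new, t_new = (s + 1.0, t) if detected else (s, t + 1.0)
    factor = math.exp(log_beta_function(s_new, t_new) - log_beta_function(s, t))
    return s_new, t_new, factor


if __name__ == "__main__":
    s, t = 9.0, 1.0   # birth parameters with mean detection probability 0.9
    print(beta_predict(s, t))
    print(beta_update(s, t, detected=True))
    print(beta_update(s, t, detected=False))
```

The normalising factors reduce to $s/(s+t)$ for a detection and $t/(s+t)$ for a misdetection, i.e. the prior mean detection and misdetection probabilities.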

A. Simulations

The following simulation scenario is used to test the proposed robust multi-object filter. The target state vector $[x, y, \dot{x}, \dot{y}]^T$ consists of Cartesian coordinates and velocities. Objects of interest move according to a constant velocity model, with zero-mean Gaussian process noise of covariance

$$Q_f = v_f^2 \begin{bmatrix} T^4/4 & T^3/2 & 0 & 0 \\ T^3/2 & T^2 & 0 & 0 \\ 0 & 0 & T^4/4 & T^3/2 \\ 0 & 0 & T^3/2 & T^2 \end{bmatrix}$$

where $v_f = 5\ \mathrm{m\,s^{-2}}$ and $T = 1$ s. Objects of interest are born from a labeled multi-Bernoulli distribution with four components of 0.03 birth probability, and birth densities N ( · , [0 , , , T , P γ ) , N ( · , [400 , − , , T , P γ ) , N ( · , [ − , − , , T , P γ ) , N ( · , [ − , , , T , P γ ) , where P γ = diag([50 , , , . The probability of survival is set at 0.99.

Objects of interest enter and leave the observation region [ − , m × [ − , m at different times, reaching a maximum of ten targets. The measurements are the object positions obtained through a sensor located at coordinate (0 , . Measurement noise is assumed to be zero-mean Gaussian with covariance

$$Q_r = v_r^2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

where $v_r = 3$. The detection model parameters for all newly born objects of interest are set at $s = 9$ and $t = 1$, resulting in a mean of 0.9 for the detection probability. At the initial timestep, clutter generators are born from a (labeled) multi-Bernoulli distribution with 120 components, each with 0.5 birth probability and uniform birth density. At subsequent timesteps, clutter generators are born from a (labeled) multi-Bernoulli distribution with 30 components, each with 0.5 birth probability and uniform birth density. The probability of survival and the probability of detection of the clutter generators are both set at 0.9.

Four scenarios corresponding to four different pairings of average (unknown) clutter rate and detection probability (see Table I) are studied.

TABLE I: Simulation parameters unknown to the filter

| Scenario ID | Clutter Rate | Detection Probability |
| 1 | 10 | 0.97 |
| 2 | 10 | 0.85 |
| 3 | 70 | 0.97 |
| 4 | varying between 25-35 | 0.95 |

Fig. 3(a) shows the OSPA [33] errors obtained from 100 Monte Carlo runs (OSPA $c = 300$, $p = 1$) for the proposed GLMB filter in comparison with the λ-CPHD filter [9] for scenario 1. The clutter rates and detection probabilities estimated by the two filters are shown in Fig. 3(b), while estimated tracks for objects of interest taken from a single run are shown in Fig. 3(c). It can be seen that, for the given parameters, the GLMB filter performs far better than the λ-CPHD filter in terms of clutter rate, detection probability and track estimation for objects of interest.

We further investigate the performance of the proposed algorithm by varying the background parameters in scenarios 2 and 3. The average detection probability in scenario 2 is lower than that of scenario 1, while the average clutter rate in scenario 3 is higher than that of scenario 1. Note from Fig. 3 that the λ-CPHD filter already begins to fail in scenario 1. The OSPA errors for 100 Monte Carlo runs and the estimates of the clutter rate and detection probabilities for the more challenging scenarios 2 and 3 are given in Fig. 4 and Fig. 5, respectively; in these scenarios the λ-CPHD filter completely breaks down.
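For readers unfamiliar with the error measure, the OSPA distance [33] with cutoff $c = 300$ and order $p = 1$ used in these comparisons can be sketched as follows. This is a minimal illustration, not the evaluation code behind the figures: the function name `ospa` is hypothetical, the base distance is assumed to be Euclidean, and the optimal assignment is found by brute-force enumeration, which is only practical for very small sets (an assignment solver would be used in practice).

```python
import itertools
import math


def ospa(X, Y, c=300.0, p=1.0):
    """OSPA distance between two finite sets of points (lists of tuples)."""
    if not X and not Y:
        return 0.0
    if len(X) > len(Y):              # make X the smaller set
        X, Y = Y, X
    m, n = len(X), len(Y)

    def cut_dist(x, y):
        # Euclidean base distance, cut off at c
        return min(c, math.dist(x, y))

    # Best assignment of the m points of X to distinct points of Y.
    best = min(
        sum(cut_dist(x, Y[j]) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Unassigned points of Y each incur the cutoff penalty c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (100.0, 50.0)]
    estimate = [(3.0, 4.0)]
    print(ospa(truth, estimate))
```

The cutoff bounds the penalty for any single localisation or cardinality error, so one missed or false track contributes at most $c/n$ to the distance when $p = 1$.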
Unlike the λ-CPHD filter, the proposed GLMB filter is capable of accurately tracking the objects of interest as well as estimating the unknown clutter and detection parameters. The fourth scenario features a fluctuating clutter rate and again includes a comparison with the λ-CPHD filter. Fig. 6 shows that the proposed filter outperforms the λ-CPHD filter and converges quickly to the shifted clutter rate.

B. Video Data

The proposed filter for jointly unknown clutter rate and detection probability is tested on two image sequences: S2.L1 from the PETS2009 dataset [34] and KITTI-17 from the KITTI dataset [5]. The detections are obtained using the detection algorithm in [35].

Dataset 1:

The state vector consists of the target $x, y$ positions and the velocities in each direction. The process noise is assumed to be zero-mean Gaussian with covariance $Q_f$, where $v_f = 2$ pixels. Actual targets are assumed to be born from a labeled multi-Bernoulli distribution with seven components of 0.03 birth probability and Gaussian birth densities $\mathcal{N}(\cdot, [260; 260; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [740; 370; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [10; 200; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [280; 80; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [750; 130; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [650; 270; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [500; 200; 0; 0]^T, P_\gamma)$, where $P_\gamma = \mathrm{diag}([10; 10; 3; 3])$. The observation space is a × pixel image frame. Actual target measurements contain the $x, y$ positions, with measurement noise assumed to be zero-mean Gaussian with covariance $Q_r$, where $v_r = 3$ pixels. Clutter targets are born from a multi-Bernoulli distribution with 30 birth components in the first time step and 12 components in subsequent time steps, each with 0.5 birth probability and uniform birth density. The probabilities of survival and detection for clutter targets are both set at 0.9.

Fig. 7 shows tracking results at frames 20, 40 and 100, respectively. True and estimated clutter cardinality statistics are given in Fig. 8. From these figures it can be observed that the filter successfully outputs object tracks and that the estimated clutter rate nearly overlays the true clutter rate.

Fig. 3: Scenario 1. (a) OSPA error. (b) Estimated clutter and detection parameters. (c) Track estimates. The bumps in the OSPA error for GLMB in 3(a) appear close to time steps where a birth or a death of an object of interest occurs.

Fig. 4: Scenario 2. (a) OSPA error. (b) Estimated clutter and detection parameters. Comparison with λ-CPHD is not included as it completely fails at this juncture.

Dataset 2:

The detection results from this dataset (KITTI-17) contain a higher number of false measurements than those from the PETS2009 S2.L1 dataset. The state vector consists of the target $x, y$ positions and the velocities in each direction. Actual targets are assumed to be born from a labeled multi-Bernoulli distribution with three components of 0.05 birth probability and birth densities $\mathcal{N}(\cdot, [550; 200; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [1200; 250; 0; 0]^T, P_\gamma)$, $\mathcal{N}(\cdot, [500; 250; 0; 0]^T, P_\gamma)$, where $P_\gamma = \mathrm{diag}([10; 10; 1; 1])$. The state transition function for actual targets is based on a constant velocity model with a 0.99 probability of survival. The process noise is assumed to be zero-mean Gaussian with covariance $Q_f$, where $v_f = 2$ pixels per frame. The observation space is a 1220 × 350 pixel image frame. Actual target measurements contain the $x, y$ positions, with measurement noise assumed to be zero-mean Gaussian with covariance $Q_r$, where $v_r = 3$ pixels. Clutter targets are born from 60 identical and uniformly distributed birth regions in the first time step and 20 birth regions in subsequent time steps, each with a birth probability of 0.5. The probabilities of survival and detection for clutter targets are both set at 0.9.

The frames on the left of Fig. 9 show tracking results for frames 15, 35 and 50 obtained from the standard GLMB filter with a guessed clutter rate of 60. The frames on the right of Fig. 9 show tracking results for the same frames using the proposed filter. Comparing each frame pair, it can be noted that some objects that were missed by the standard algorithm with the guessed clutter rate have been picked up by the proposed algorithm. The comparison between true and estimated clutter cardinality statistics given in Fig. 10 demonstrates that the estimated clutter rate is close enough to the true clutter rate to achieve a similar performance if fed back to the standard algorithm [8].

Fig. 5: Scenario 3. (a) OSPA error. (b) Estimated clutter and detection parameters. Comparison with λ-CPHD is not included as it completely fails at this juncture.

Fig. 6: Scenario 4. (a) OSPA error. (b) Estimated clutter and detection parameters.

Fig. 7: Tracking results for frames 20, 40, 100 in dataset 1.

Fig. 8: Estimated clutter rate for dataset 1.

VI. CONCLUSION

In this paper we have proposed a tractable algorithm, based on the GLMB filter, for tracking multiple objects in environments with unknown model parameters such as clutter rate and detection probability. Specifically, objects of interest and clutter objects are treated as non-interacting classes of objects, and a GLMB recursion for propagating the joint filtering density of these classes is derived, along with an efficient implementation. Simulations and applications to video data demonstrate that the proposed filter has good tracking performance in the presence of unknown background and outperforms the λ-CPHD filter. Moreover, it can also estimate the clutter rate and detection probability parameters while tracking.

Fig. 9: Tracking results for frames 15, 35, 50 with guessed clutter rate 60 (left) and the proposed filter (right) for dataset 2.

Fig. 10: Estimated clutter rate for dataset 2.

REFERENCES

[1] Y. Bar-Shalom and T. E. Fortmann, Tracking and Data Association. San Diego: Academic Press, 1988.
[2] S. S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems, ser. Artech House radar library. Artech House, 1999.
[3] R. Mahler, Statistical Multisource-Multitarget Information Fusion. Artech House, 2007.
[4] R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion. Artech House, 2014.
[5] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite,” Proc. 19th Conference on Computer Vision and Pattern Recognition, 2012.
[6] A. Elgammal, R. Duraiswami, D. Harwood, and L. S. Davis, “Background and foreground modeling using nonparametric kernel density estimation for visual surveillance,” Proceedings of the IEEE, vol. 90, no. 7, pp. 1151–1163, 2002.
[7] B.-T. Vo and B.-N. Vo, “Labeled random finite sets and multi-object conjugate priors,” IEEE Trans. Signal Process., vol. 61, no. 13, pp. 3460–3475, 2013.
[8] B.-N. Vo, B.-T. Vo, and D. Phung, “Labeled random finite sets and the Bayes multi-target tracking filter,” IEEE Trans. Signal Process., vol. 62, no. 24, pp. 6554–6567, 2014.
[9] R. Mahler, B.-T. Vo, and B.-N. Vo, “CPHD filtering with unknown clutter rate and detection profile,” IEEE Trans. Signal Process., vol. 59, no. 8, pp. 3497–3513, 2011.
[10] Y. Punchihewa, B.-N. Vo, and B.-T. Vo, “A Generalized Labeled Multi-Bernoulli Filter for Maneuvering Targets,” Proc. 19th Int. Conf. Inf. Fusion, pp. 980–986, July 2016. Available: https://arxiv.org/pdf/1603.04565.pdf
[11] F. G. Cozman, “A brief introduction to the theory of sets of probability measures,” Tech. Rep. CMU-RI-TR 97-24, Robotics Institute, Carnegie Mellon University, 1999.
[12] B. Noack, V. Klumpp, D. Brunn, and U. Hanebeck, “Nonlinear Bayesian estimation with convex sets of probability densities,” FUSION 2008, Cologne.
[13] P. Walley, Statistical Reasoning with Imprecise Probabilities, Monographs on Statistics and Applied Probability, vol. 42. London: Chapman and Hall, 1991.
[14] S. Basu, “Ranges of posterior probabilities over a distribution band,” Journal of Statistical Planning and Inference, vol. 44, pp. 149–166, 1995.
[15] J. Berger, D. Insua, and F. Ruggeri, “Robust Bayesian Analysis,” Lecture Notes in Statistics, vol. 152, pp. 1–32, Springer, 2000.
[16] M. Berliner, “Hierarchical Bayesian time series models,” in Maximum Entropy and Bayesian Methods, K. Hanson and R. Silver, Eds. Kluwer, 1996, pp. 15–22.
[17] S. Singh, N. Whiteley, and S. Godsill, “An approximate likelihood method for estimating the static parameters in multi-target tracking models,” Tech. Rep. CUED/F-INFENG/TR.606, Dept. of Eng., University of Cambridge.
[18] R. Mahler and A. El-Fallah, “CPHD filtering with unknown probability of detection,” in I. Kadar (ed.), Sign. Proc., Sensor Fusion, and Targ. Recogn. XIX, SPIE Proc. vol. 7697, 2010.
[19] R. Mahler and A. El-Fallah, “CPHD and PHD filters for unknown backgrounds, III: Tractable multitarget filtering in dynamic clutter,” in O. Drummond (ed.), Sign. and Data Proc. of Small Targets 2010, SPIE Proc. vol. 7698, 2010.
[20] B.-T. Vo, B.-N. Vo, R. Hoseinnezhad, and R. Mahler, “Robust Multi-Bernoulli filtering,” IEEE Journal on Selected Topics in Signal Processing, vol. 7, no. 3, pp. 399–409, 2013.
[21] R. Mahler and B.-T. Vo, “An improved CPHD filter for unknown clutter backgrounds,” in Proc. Int. Soc. Optics and Photonics, SPIE Defense+Security, pp. 90910B-90910B, June 2014.
[22] J. Correa and M. Adams, “Estimating detection statistics within a Bayes-closed multi-object filter,” Proc. 19th Ann. Conf. Inf. Fusion, pp. 811–819, Heidelberg, Germany, 2016.
[23] S. Rezatofighi, S. Gould, B.-T. Vo, B.-N. Vo, K. Mele, and R. Hartley, “Multi-target tracking with time-varying clutter rate and detection profile: Application to time-lapse cell microscopy sequences,” IEEE Trans. Med. Imag., vol. 34, no. 6, pp. 1336–1348, 2015.
[24] B.-N. Vo, B.-T. Vo, N.-T. Pham, and D. Suter, “Joint detection and estimation of multiple objects from image observations,” IEEE Trans. Signal Process., vol. 58, no. 10, pp. 5129–5141, 2010.
[25] B.-N. Vo, B.-T. Vo, and H. Hoang, “An Efficient Implementation of the Generalized Labeled Multi-Bernoulli Filter,” IEEE Trans. Signal Process., vol. 65, no. 8, pp. 1975–1987, 2017.
[26] X. R. Li, “Engineer’s guide to variable-structure multiple-model estimation for tracking,” Chapter 10, in Multitarget-Multisensor Tracking: Applications and Advances, Volume III, Ed. Y. Bar-Shalom and W. D. Blair, pp. 449–567, Artech House, 2000.
[27] X. R. Li and V. P. Jilkov, “A survey of maneuvering target tracking, Part V: Multiple-Model methods,” IEEE Trans. Aerospace & Electronic Systems, vol. 41, no. 4, pp. 1255–1321, 2005.
[28] B. Ristic, S. Arulampalam, and N. J. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications. Artech House, 2004.
[29] M. Jiang, W. Yi, R. Hoseinnezhad, and L. Kong, “Adaptive Vo-Vo filter for maneuvering targets with time-varying dynamics,” Proc. 19th Int. Conf. Inf. Fusion, pp. 666–672, July 2016.
[30] R. Mahler, “Multitarget Bayes filtering via first-order multitarget moments,” IEEE Trans. Aerosp. Electron. Syst., vol. 39, no. 4, pp. 1152–1178, 2003.
[31] K. G. Murty, “An algorithm for ranking all the assignments in order of increasing cost,” Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[32] G. Casella and E. I. George, “Explaining the Gibbs sampler,” The American Statistician, vol. 46, no. 3, pp. 167–174, 1992.
[33] D. Schuhmacher, B.-T. Vo, and B.-N. Vo, “A consistent metric for performance evaluation of multi-object filters,” IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3447–3457, 2008.
[34] J. Ferryman and A. Shahrokni, “PETS2009: Dataset and challenge,” in Proc. IEEE 12th Int. Performance Eval. Tracking Surveillance, Dec. 2009.
[35] P. Dollar, R. Appel, S. Belongie, and P. Perona, “Fast feature pyramids for object detection,”