Robust Particle Filtering via Bayesian Nonparametric Outlier Modeling
Bin Liu
School of Computer Science
Jiangsu Key Lab of Big Data Security & Intelligent Processing
Nanjing University of Posts and Telecommunications
Nanjing, China
[email protected]
Abstract—This paper is concerned with the online estimation of a nonlinear dynamic system from a series of noisy measurements. The focus is on cases wherein outliers are present among the normal noises. We assume that the outliers follow an unknown generating mechanism that deviates from that of the normal noises, and we model the outliers using a Bayesian nonparametric model called the Dirichlet process mixture (DPM). A sequential particle-based algorithm is derived for posterior inference of the outlier model as well as the state of the system to be estimated. The resulting algorithm is termed DPM-based robust PF (DPM-RPF). The nonparametric feature allows the data to “speak for itself” in determining the complexity and structure of the outlier model. Simulation results show that the proposed algorithm performs remarkably better than two state-of-the-art methods, especially when outliers appear frequently over time.
Index Terms—Bayesian nonparametrics, Dirichlet process mixture, particle filtering, robust state filtering, outliers
I. INTRODUCTION
This paper deals with the online estimation of states in nonlinear and non-Gaussian dynamic systems based on noisy measurements polluted by outliers. Particle filters (PFs), also known as sequential Monte Carlo (SMC) methods, are mainly used for state estimation in nonlinear and non-Gaussian systems [1]-[3]. However, most existing PF methods in the literature adopt a pre-determined parametric model, e.g., a zero-mean Gaussian, to characterize the statistical properties of the measurement noise. This simple treatment leads to a significant degradation in filtering performance when the actual measurements contain outliers. To mitigate the model mismatch caused by the presence of outliers, the common practice is to resort to the multiple model strategy (MMS), namely employing multiple pre-set models together to characterize the statistical properties of normal noises together with those of outliers [4]-[8]. An efficient approach to handling the model uncertainty incurred by employing multiple models is Bayesian model averaging [4].

A limitation of the aforementioned MMS-based methods is that, to use them, one has to specify a set of candidate models beforehand, even if no prior knowledge is available for model specification. To address this limitation, an incremental learning assisted particle filtering (ILAPF) algorithm was proposed [9]. The basic idea underlying ILAPF is to learn an outlier model online instead of specifying a set of candidate models offline. The ILAPF algorithm is simple yet efficient, but its drawback is that it only uses a uniform distribution to roughly characterize the statistical pattern of the outliers. The uniform distribution is unsatisfactory when the true distribution of the outliers is much more complex and far from uniform. This observation motivates us to develop a more powerful learning-assisted PF algorithm that is able to reveal, and then make use of, any possibly complex patterns in the outliers' distribution. We propose using the Bayesian nonparametric DPM to model the generative mechanism of the outliers. We show that our algorithm allows the data to “speak for itself” in determining the complexity and structure of the outlier model, thus sidestepping the issues of pre-specifying candidate models and model selection.

The DPM model was recently introduced to deal with switching linear dynamical models in, e.g., [10]-[12], which assume that the state transition prior is uncertain. In contrast with such previous work, this work assumes that the state transition prior is precisely known, and focuses on taking advantage of the DPM in modeling the measurement noise.

The remainder of the paper is organized as follows. Section II succinctly describes our model. Section III presents the proposed algorithm in detail. Section IV reports the simulation results, and finally, Section V concludes.

(This work was partly supported by the National Natural Science Foundation (NSF) of China (No. 61571238), the Scientific Research Foundation of Nanjing University of Posts and Telecommunications (No. NY218072) and a research fund from Yancheng Big Data Research Institute.)

II. MODEL
We consider a state-space model as follows:

x_t = f(x_{t-1}) + u_t,   (1)
y_t = h(x_t) + n_t,   (2)

where t denotes the discrete time index, x_t ∈ R^{d_x} the state of interest to be estimated, y_t ∈ R^{d_y} the measurement observed, f the state transition function, h the measurement function, and u_t and n_t are independent and identically distributed (i.i.d.) process noise and measurement noise, respectively. The probability density function (pdf) of u_t is precisely known. n_t may be a standard measurement noise or an outlier. In the former case, n_t ∼ N(μ^(0), Σ^(0)); in the latter, n_t ∼ F(·), where F(·) denotes an unknown outlier distribution. The symbol ∼ means “distributed according to”, and N(μ, Σ) denotes a Gaussian with mean μ and covariance Σ. Considering an outlier set O, whose elements o_(1), ..., o_(I) are statistically exchangeable, we express F(·) as a DPM model as follows:

G ∼ DP(H, α),   (3)
θ_(i) | G ∼ G,
o_(i) | θ_(i) ∼ g(·|θ_(i)),

where DP(H, α) is a Dirichlet process (DP) parameterized by a concentration parameter α > 0 and a base distribution H [13], [14], G is a random distribution drawn from the DP, and θ_(i) ∈ Θ is the parameter of the cluster to which o_(i) belongs. Here and in what follows, the notation (i) in a subscript represents the index of a data item in a set, where the bracket discriminates it from the time index. By integrating over G, we obtain a marginal representation of the prior distribution of θ_(i+1) as follows:

θ_(i+1) | θ_(1), ..., θ_(i) ∼ (1/(α + i)) (α H + Σ_{j=1}^{i} δ_{θ_(j)}),   (4)

where δ_θ denotes the delta-mass function located at θ. This representation is often known as the Blackwell-MacQueen urn scheme [15]. The DP can also be represented by a Chinese Restaurant Process (CRP), which describes a partition of the θ_(i)'s when G is marginalized out [16], [17].
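As an illustration of the urn scheme in Eqn.(4) (not a component the paper specifies in code), the prior draws can be simulated in a few lines; the scalar base distribution H = N(0, 1) and α = 1 below are arbitrary choices made only for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def polya_urn_draws(n, alpha, base_sampler):
    """Draw theta_(1), ..., theta_(n) from a DP(H, alpha) prior via the
    Blackwell-MacQueen urn scheme of Eqn.(4)."""
    thetas = []
    for i in range(n):
        # With probability alpha/(alpha + i), draw a fresh atom from H;
        # otherwise reuse one of the i previous draws uniformly at random,
        # which reproduces the sum of delta masses in Eqn.(4).
        if rng.random() < alpha / (alpha + i):
            thetas.append(base_sampler())
        else:
            thetas.append(thetas[rng.integers(i)])
    return thetas

# Illustrative scalar base distribution H = N(0, 1).
draws = polya_urn_draws(100, alpha=1.0, base_sampler=lambda: rng.normal())
```

Because previously drawn atoms are reused with probability proportional to their multiplicity, the draws concentrate on a small number of distinct values; this clustering property is exactly what the DPM exploits when grouping outliers.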
According to the CRP, the first outlier is assigned to the first cluster, and the i-th outlier is assigned to the k-th cluster with probability

p(z_(i) = k) = n_k / (i - 1 + α), for k ≤ K,   (5)
p(z_(i) = k) = α / (i - 1 + α), for k = K + 1,

where z is a membership indicator, namely z_(i) = k means o_(i) belongs to cluster k, and n_k is the number of outliers included in cluster k. Each cluster, say cluster k, is defined by a parametric pdf g(·|θ⋆_(k)) and a prior on θ⋆_(k). Set g(·|θ⋆_(k)) ≜ N(·|μ_(k), Σ_(k)), θ⋆_(k) ≜ (μ_(k), Σ_(k)), and employ a conjugate Normal-Inverse-Wishart (NIW) prior for θ⋆_(k):

Σ_(k) | κ, W ∼ IW(·|κ, W^{-1}),   (6)
μ_(k) | Σ_(k), μ⋆, ρ ∼ N(·|μ⋆, Σ_(k)/ρ),

where IW(·|κ, W^{-1}) denotes an inverse-Wishart (IW) pdf parameterized by the degrees of freedom κ and the scale matrix W, and μ⋆ and ρ are the other hyper-parameters of this NIW prior. Due to the conjugacy of the NIW and the Gaussian, the posterior of θ⋆_(k) based on O and Z = [z_(1), ..., z_(I)] is also NIW distributed [18]:

p(θ⋆_(k)) ∝ NIW(μ⋆, ρ, κ, W) Π_{i: z_(i)=k} g(o_(i) | θ⋆_(k))   (7)
          = NIW(μ_(k), ρ_(k), κ_(k), W_(k)),

where

μ_(k) = (ρ/(ρ + n_k)) μ⋆ + (n_k/(ρ + n_k)) ō_(k),   (8)
ρ_(k) = ρ + n_k,
κ_(k) = κ + n_k,
W_(k) = W + R_(k) + (ρ n_k/(ρ + n_k)) (ō_(k) - μ⋆)(ō_(k) - μ⋆)^T,

with R_(k) = Σ_{i: z_(i)=k} (o_(i) - ō_(k))(o_(i) - ō_(k))^T and ō_(k) = (1/n_k) Σ_{i: z_(i)=k} o_(i).

In the above model, μ^(0) and Σ^(0) are deterministic and known; α, κ, W, μ⋆ and ρ are hyper-parameters preset by the user. The other parameters are inferred online by the algorithm described in the next section.
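The NIW update of Eqn.(8) is a closed-form recomputation of the four hyper-parameters from the outliers currently assigned to a cluster. The following function is an illustrative sketch of that update (variable names follow the text; it is not code from the paper):

```python
import numpy as np

def niw_posterior(mu_star, rho, kappa, W, O_k):
    """Update the NIW hyper-parameters of one cluster from its assigned
    outliers O_k (one outlier per row), following Eqn.(8)."""
    O_k = np.atleast_2d(np.asarray(O_k, dtype=float))
    n_k = O_k.shape[0]
    o_bar = O_k.mean(axis=0)                         # cluster sample mean
    R_k = (O_k - o_bar).T @ (O_k - o_bar)            # scatter matrix R_(k)
    mu_k = (rho * mu_star + n_k * o_bar) / (rho + n_k)
    rho_k = rho + n_k
    kappa_k = kappa + n_k
    d = o_bar - mu_star
    W_k = W + R_k + (rho * n_k / (rho + n_k)) * np.outer(d, d)
    return mu_k, rho_k, kappa_k, W_k
```

For example, with a one-dimensional prior (μ⋆ = 0, ρ = 1, κ = 10, W = I) and two assigned outliers at 2 and 4, the posterior mean moves to 2, a weighted compromise between the prior mean and the cluster sample mean.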
III. ALGORITHM

In this section, we present our algorithm, DPM-RPF, for sequential inference of the state of interest x_t based on the model presented above. The task is to provide a recursive solution for computing p(x_t | y_{1:t}) (in short, p_{t|t}), which denotes the posterior of x_t given the measurements observed up to time t. Indeed, p_{t|t} can be computed from p_{t-1|t-1} recursively as follows [1]:

p_{t|t} = p(y_t | x_t) ∫ p(x_t | x_{t-1}) p_{t-1|t-1} dx_{t-1} / p(y_t | y_{1:t-1}).   (9)

The DPM-RPF algorithm starts by initializing the hyper-parameters of the DPM model, specifying the particle size J of the PF, drawing a set of equally weighted random samples (also called particles) {x_0^j, ω_0^j}_{j=1}^J from the prior p_{0|0} ≜ p(x_0), and initializing the outlier set O to be empty. A pseudo-code implementation of DPM-RPF is shown in Algorithm 1.

Suppose that the computations of DPM-RPF at time t-1 have been completed. We now have at hand a set of weighted samples {x_{t-1}^j, ω_{t-1}^j}_{j=1}^J that satisfies

p_{t-1|t-1} ≈ Σ_{j=1}^J ω_{t-1}^j δ_{x_{t-1}^j},   (10)

and a DPM-based outlier model that has K active mixing components. We show in what follows how to leverage the recursion in Eqn.(9) to update the particle set and obtain a Monte Carlo approximation to p_{t|t}. The posterior of the DPM model is also updated whenever a new outlier is found.

A. Importance Sampling under Model Uncertainty
To begin with, following the importance sampling principle, we draw particles x̂_t^j, j = 1, ..., J, from a proposal distribution q(x_t | x_{t-1}, y_t) and then calculate the unnormalized importance weight

ω̂_t^j = ω_{t-1}^j p(x̂_t^j | x_{t-1}^j) p(y_t | x̂_t^j) / q(x̂_t^j | x_{t-1}^j, y_t).   (11)

Setting q(x_t | x_{t-1}, y_t) = p(x_t | x_{t-1}), as in the bootstrap filter [3], leads to

ω̂_t^j = ω_{t-1}^j p(y_t | x̂_t^j).   (12)

From Eqn.(2), we see that the likelihood in Eqn.(12), namely p(y_t | x̂_t^j), is defined by the pdf of n_t. We consider K + 2 candidate pdfs of n_t, namely N(·|μ_(k), Σ_(k)), k = 0, ..., K + 1, each corresponding to a hypothesis on the likelihood function that should be used in Eqn.(12). Let l denote the hypothesis indicator, and set

p_l(y_t | x̂_t^j) = N(y_t - h(x̂_t^j) | μ_(l), Σ_(l)), l = 0, ..., K + 1.   (13)

Here l = 0 indicates the standard measurement noise hypothesis. If 1 ≤ l ≤ K, it represents the hypothesis that n_t is drawn from one of the active mixing components of the DPM outlier model. l = K + 1 means that n_t is drawn from a new mixing component of the DPM that may become active later; the parameter value of this new mixing component is drawn from the NIW prior presented in Eqn.(6). For each hypothesis l, its marginal likelihood is

L(l) ≜ p(y_t | l, y_{1:t-1}) = Σ_{j=1}^J ω̂_{t,l}^j,   (14)

where ω̂_{t,l}^j = ω_{t-1}^j p_l(y_t | x̂_t^j). (Note that ω_{t-1}^j is an output of the algorithm at time t-1 and does not depend on l; see the next subsection on how ω_t^j is calculated.) The prior of hypothesis l, denoted p(l), is proportional to the number of data points allocated to cluster l. Then, using Bayes' theorem, we obtain the posterior probability of hypothesis l:

π(l) = p(l) L(l) / Σ_{k=0}^{K+1} p(k) L(k), l = 0, ..., K + 1.   (15)
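For scalar measurements, the computation of Eqns.(13)-(15) reduces to a few vectorized operations. The sketch below is illustrative and makes one assumption the text leaves implicit: following the CRP logic of Section II, the prospective new component (l = K + 1) receives a pseudo-count of α in the hypothesis prior p(l).

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Scalar Gaussian pdf N(x | mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def hypothesis_posterior(w_prev, residuals, means, stds, counts, alpha):
    """pi(l) of Eqn.(15) for scalar noise models.
    residuals[j] = y_t - h(x_hat_t^j); means/stds hold the noise parameters
    of hypotheses l = 0..K+1 (the last entry is the fresh component drawn
    from the NIW prior); counts holds data counts of hypotheses l = 0..K."""
    priors = np.append(np.asarray(counts, dtype=float), alpha)  # p(l), unnormalized
    # L(l) = sum_j w_{t-1}^j * p_l(y_t | x_hat_t^j)   (Eqn.(14))
    L = np.array([np.sum(w_prev * gauss_pdf(residuals, m, s))
                  for m, s in zip(means, stds)])
    post = priors * L
    return post / post.sum()
```

For instance, with residuals near zero, a standard-noise hypothesis centered at 0 dominates a candidate outlier component centered far away, so π(0) is close to one.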
B. Model Selection and Resampling

Now we sample a hypothesis m from the posterior by setting m = l with probability π(l), l = 0, ..., K + 1. Based on hypothesis m, we normalize the importance weights as follows:

ω_t^j = ω̂_t^j / Σ_{a=1}^J ω̂_t^a, j = 1, ..., J,   (16)

where ω̂_t^j = ω_{t-1}^j p_m(y_t | x̂_t^j). An optimal estimate of n_t in terms of the minimum mean squared error (MMSE) is

n̂_t = y_t - h(Σ_{j=1}^J ω_t^j x̂_t^j).   (17)

We allocate n̂_t to cluster m and increment the size of cluster m by 1. If m > 0, we add n̂_t to O and update Z accordingly. If m = K + 1, we activate the new mixing component, with its parameter value drawn from the NIW prior in Eqn.(6), and then increment K by 1. To prevent particle degeneracy, a resampling procedure is adopted, which discards particles with low weights and duplicates those with high weights. In our experiments, we selected the residual resampling method [19]-[21].
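The residual resampling method cited above ([19]-[21]) admits a compact implementation. The sketch below returns resampled particle indices (after which all weights are reset to 1/J); it is an illustration of the standard scheme, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def residual_resample(weights):
    """Residual resampling: keep floor(J * w_j) copies of particle j
    deterministically, then draw the remaining particles multinomially
    from the normalized residual weights."""
    J = len(weights)
    counts = np.floor(J * weights).astype(int)   # deterministic copies
    residual = J * weights - counts              # leftover mass per particle
    n_rest = J - counts.sum()
    if n_rest > 0:
        residual /= residual.sum()
        extra = rng.choice(J, size=n_rest, p=residual)
        counts += np.bincount(extra, minlength=J)
    return np.repeat(np.arange(J), counts)       # resampled particle indices
```

The deterministic part removes most of the sampling variance, which is why residual resampling typically yields lower Monte Carlo variance than plain multinomial resampling.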
Algorithm 1: A pseudo-code to implement DPM-RPF

Initialization: Configure the hyper-parameters α, κ, W, μ⋆ and ρ of the DPM model; set K = 0; specify the particle size J of the PF; draw x_0^j ∼ p(x_0) and set ω_0^j = 1/J, ∀j ∈ {1, ..., J}; initialize O and Z to be empty; initialize A and B for the model refinement procedure.

for t = 1, 2, ... do
  Draw x̂_t^j ∼ p(x_t | x_{t-1}^j), ∀j;
  Calculate p_l(y_t | x̂_t^j) by Eqn.(13), ∀l ∈ {0, ..., K+1};
  Calculate L(l) by Eqn.(14), ∀l ∈ {0, ..., K+1};
  Calculate π(l) by Eqn.(15), ∀l ∈ {0, ..., K+1};
  Sample m ∼ Σ_{l=0}^{K+1} π(l) δ_l, i.e., set m = l with probability π(l);
  Calculate ω_t^j, ∀j, by Eqn.(16);
  Calculate the MMSE estimate of x_t: x̄_t = Σ_{j=1}^J ω_t^j x̂_t^j;
  Calculate n̂_t by Eqn.(17);
  Allocate n̂_t to cluster m and increment the size of cluster m by 1;
  If m > 0, add n̂_t to O and update Z accordingly; if m = K + 1, activate the new mixing component with its parameter value drawn from the NIW prior (Eqn.(6)) and increment K by 1;
  Given {x̂_t^j, ω_t^j}_{j=1}^J, perform the resampling procedure, obtaining an updated particle set {x̂_t^j, ω_t^j}_{j=1}^J in which ω_t^j = 1/J, ∀j;
  Check the size of O; if it is a multiple of A, perform the model refinement procedure of subsection III-C;
  Output: x̄_t.
end for

C. Model Refinement
The final building block of the DPM-RPF algorithm is termed model refinement. The model refinement operation is performed only if a new mixing component of the DPM model has become active and, meanwhile, the size of the updated O has become a multiple of A at the current time step.

The model refinement procedure consists of running B iterations of Gibbs sampling to sample from the posterior of the model parameters based on O and Z, as follows [22]:

- Sample each z_(i) from

p(z_(i) | Z_{-i}, π, θ⋆, O) ∝ Σ_{k=1}^K [π_k p(o_(i) | θ⋆_(k)) I_{z_(i),k}],   (18)

where Z_{-i} = [z_(1), ..., z_(i-1), z_(i+1), ..., z_(I)], and I_{a,b} takes value 1 if a = b and 0 otherwise.

- Sample π from

p(π | Z, θ⋆, O) ∝ Dirichlet(n_1 + α/K, ..., n_K + α/K).   (19)

- Sample each θ⋆_(k) from the NIW posterior based on Z and O; see Eqns.(7)-(8).

A and B are constants preset by the user. The sample yielded at the last iteration is taken as the output parameter configuration, which is used at the next time step.
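For intuition, one sweep of the refinement sampler can be sketched for scalar outliers. This is a deliberate simplification of Eqns.(18)-(19): the per-cluster variances are held fixed, and the NIW draw of θ⋆ (the third step) is replaced by a posterior-mean refresh of the cluster means, so it illustrates the structure of the sweep rather than the full sampler:

```python
import numpy as np

rng = np.random.default_rng(2)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def gibbs_sweep(O, z, mus, sigmas, alpha, K):
    """One refinement sweep for scalar outliers O with current labels z.
    The three conditional updates are cyclic, so their order within a
    sweep is immaterial for this sketch."""
    counts = np.bincount(z, minlength=K)
    # Eqn.(19): sample mixing weights pi | Z
    pis = rng.dirichlet(counts + alpha / K)
    # Eqn.(18): sample each membership indicator z_(i) | pi, theta
    for i in range(len(O)):
        probs = pis * gauss_pdf(O[i], mus, sigmas)
        probs /= probs.sum()
        z[i] = rng.choice(K, p=probs)
    # Simplified theta step: refresh each cluster mean from its members
    for k in range(K):
        members = O[z == k]
        if members.size:
            mus[k] = members.mean()
    return z, pis, mus
```

Run on two well-separated groups of outliers, a few sweeps suffice for the labels to settle into the two underlying clusters.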
IV. PERFORMANCE EVALUATION

We used simulated experiments to evaluate the performance of the presented algorithm. We also considered the heterogeneous mixture model based robust PF (HMM-RPF) [4] and the ILAPF [9] as competitors for performance comparison.
A. Experimental Setting
We consider a modified version of the time-series experiment presented in [23]. The state transition function is

x_{t+1} = 1 + sin(0.04π · mod(t+1, T)) + 0.5 x_t + u_t, 0 ≤ t < T,   (20)

where x_0 is fixed at 1, u_t ∼ Gamma(3, 2), T denotes the length of the simulated time series, and mod(a, b) returns the remainder after the division of a by b. The measurement function is specified as follows:

y_t = 0.2 x_t^2 + n_t, if mod(t, T) ≤ T/2,
y_t = 0.5 x_t - 2 + n_t, otherwise.   (21)

In the simulation, to generate a measurement at time t, a realization of n_t is drawn with probability P_o from a two-component Gaussian mixture F, and with probability 1 - P_o from the a priori known standard measurement noise distribution N(·|μ^(0), Σ^(0)). Here F represents the generative distribution of the outliers. The arrival times of the outliers and their generative distribution are invisible to the algorithms under test.

In the experiments, the hyper-parameters of DPM-RPF are initialized as follows: α = 1, μ⋆ = 21, κ = 10, W = 5, ρ = 1, A = 10, B = 20. The ILAPF algorithm is initialized with l̂b = 10, ûb = 90, which represents the initial guess of the outliers' value range. The free parameter I in ILAPF is set to 20, the same as in [9]. The HMM-RPF algorithm is initialized in the same way as in [4]. The particle size J is fixed at 200 for every algorithm involved.
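For concreteness, the noise-generation protocol described above can be sketched as follows. The mixture means, variances and weights in this snippet are illustrative placeholders, not the exact values used in the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(3)

def draw_noise(P_o, mix_means=(20.0, 40.0), mix_stds=(2.0, 2.0),
               mix_weights=(0.5, 0.5), std0=1.0):
    """Draw one measurement-noise realization: an outlier from the
    two-component Gaussian mixture F with probability P_o, otherwise
    standard noise N(0, std0^2). All mixture constants are illustrative."""
    if rng.random() < P_o:
        k = rng.choice(2, p=mix_weights)          # pick a mixture component
        return rng.normal(mix_means[k], mix_stds[k])
    return rng.normal(0.0, std0)

# A stream of 1000 noise realizations with outlier probability 0.5.
noises = np.array([draw_noise(P_o=0.5) for _ in range(1000)])
```

With outlier components located far from zero, roughly a fraction P_o of the realizations fall outside the range of the standard noise, which is the contamination pattern the filter must cope with.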
B. Experimental Results

First, we assessed the ability of DPM-RPF to discover the clustering patterns hidden in the outliers. We simulated 480 outliers drawn from F and ran the DPM-based sequential outlier-model inference part of the DPM-RPF algorithm 30 times. Fig. 1 shows the Kullback-Leibler (KL) distance between the estimated and the true F over time for each Monte Carlo run. A sharp decrease in the KL distance happens at a very early stage, after which the KL distance decreases gradually as more outliers appear over time. This demonstrates that the posterior estimate of F approaches the real F as more and more outliers enter the inference procedure.

Fig. 1. The recorded KL distances at each time step between the posterior estimate of the outlier distribution and the real answer, based on 30 independent Monte Carlo runs of the DPM-based sequential outlier-model inference procedure. The thick solid line represents the mean of the KL distances over those 30 runs. Note that a base-10 logarithmic scale is used for the y-axis.

TABLE I
The mean running time (in seconds) calculated over independent runs for the cases P_o = 0.1 and P_o = 0.9, for the algorithms HMM-RPF, ILAPF and DPM-RPF.

Then we compared DPM-RPF with HMM-RPF and ILAPF in terms of the mean squared error (MSE) of the state estimates. We calculated the mean and variance of the MSE over 100 independent runs of each algorithm. The result is plotted in Fig. 2. As shown, when outliers rarely appear (the case P_o = 0.1), DPM-RPF performs comparably with ILAPF and slightly better than HMM-RPF. As outliers appear more and more frequently, the superiority of DPM-RPF over its competitors in terms of MSE becomes increasingly remarkable.

Fig. 3 shows a snapshot of the estimated trajectory of the system state yielded by an example run of the algorithms for a frequent-outlier case, corresponding to P_o = 0.9. Although the outliers appear intensively over time in the measurements (indicated by the large value of P_o), the DPM-RPF algorithm still accurately tracks the fluctuations of the state trajectory, whereas ILAPF only follows the true trajectory roughly and HMM-RPF performs worst.

A running-time comparison among the involved algorithms is presented in Table I. It shows that, for the case P_o = 0.1, the computational cost of DPM-RPF lies between those of ILAPF and HMM-RPF, while for the case P_o = 0.9, the running time of DPM-RPF becomes larger than the others. The reason follows from an analysis of the complexity of DPM-RPF: due to the DPM outlier-modeling procedure, the complexity of the algorithm increases as more outliers appear.

V. CONCLUSIONS
In this paper, we presented a Bayesian nonparametrics based robust PF algorithm termed DPM-RPF. We applied the DPM model to characterize the unknown generative mechanism of the outliers and then derived the DPM-RPF algorithm for sequential posterior inference of the outlier model as well as the system state of interest.

The experimental results provide strong evidence of the superiority of the presented algorithm in discovering the mixture patterns underlying the outliers. They also show that the more frequently the outliers appear, the more obvious the advantage of DPM-RPF in terms of filtering accuracy. The complexity cost of the proposed algorithm was studied empirically (see Table I): the complexity of DPM-RPF depends on the number of outliers, increasing as outliers appear more frequently, and vice versa.

A further rigorous theoretical study, as well as more realistic application studies in scenarios such as multi-target tracking in clutter [24] and filtering with imprecisely time-stamped measurements [25], can be conducted as future work. In addition, how to configure the hyper-parameters of the model in a smarter way is also worth investigating.

Fig. 2. The mean and variance of the state estimation MSE calculated over 100 independent runs of each algorithm, for cases P_o = 0.1, 0.3, 0.5, 0.7 and 0.9.

Fig. 3. A snapshot of the filtering result for the last 60 time steps for a frequent-outlier case in which P_o = 0.9.

REFERENCES

[1] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking,”
IEEE Trans. on Signal Processing, vol. 50, no. 2, pp. 174–188, 2002.
[2] J. Carpenter, P. Clifford, and P. Fearnhead, “Improved particle filter for nonlinear problems,” IEE Proceedings - Radar, Sonar and Navigation, vol. 146, no. 1, pp. 2–7, 1999.
[3] N. Gordon, D. Salmond, and A. F. M. Smith, “Novel approach to nonlinear/non-Gaussian Bayesian state estimation,” IEE Proceedings F (Radar and Signal Processing), vol. 140, no. 2, pp. 107–113, 1993.
[4] B. Liu, “Robust particle filter by dynamic averaging of multiple noise models,” in Proc. of the 42nd IEEE Int’l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2017, pp. 4034–4038.
[5] Y. Dai and B. Liu, “Robust video object tracking via Bayesian model averaging-based feature fusion,” Optical Engineering, vol. 55, no. 8, pp. 083102(1)–083102(11), 2016.
[6] B. Liu, “Instantaneous frequency tracking under model uncertainty via dynamic model averaging and particle filtering,” IEEE Trans. on Wireless Communications, vol. 10, no. 6, pp. 1810–1819, 2011.
[7] C. C. Drovandi, J. McGree, and A. N. Pettitt, “A sequential Monte Carlo algorithm to incorporate model uncertainty in Bayesian sequential design,” Journal of Computational and Graphical Statistics, vol. 23, no. 1, pp. 3–24, 2014.
[8] I. Urteaga, M. F. Bugallo, and P. M. Djurić, “Sequential Monte Carlo methods under model uncertainty,” IEEE, 2016, pp. 1–5.
[9] B. Liu, “ILAPF: Incremental learning assisted particle filtering,” in Proc. of the IEEE Int’l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2018, pp. 4284–4288.
[10] F. Caron, M. Davy, A. Doucet, E. Duflos, and P. Vanheeghe, “Bayesian inference for linear dynamic models with Dirichlet process mixtures,” IEEE Trans. on Signal Processing, vol. 56, no. 1, pp. 71–84, 2008.
[11] C. Magnant, A. Giremus, E. Grivel, L. Ratton, and B. Joseph, “Dirichlet-process-mixture-based Bayesian nonparametric method for Markov switching process estimation,” IEEE, 2015, pp. 1969–1973.
[12] E. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “Bayesian nonparametric inference of switching dynamic linear models,” IEEE Trans. on Signal Processing, vol. 59, no. 4, pp. 1569–1585, 2011.
[13] T. S. Ferguson, “A Bayesian analysis of some nonparametric problems,” The Annals of Statistics, vol. 1, no. 2, pp. 209–230, 1973.
[14] Y. W. Teh, “Dirichlet process,” in Encyclopedia of Machine Learning, pp. 280–287. Springer, 2011.
[15] D. Blackwell and J. B. MacQueen, “Ferguson distributions via Pólya urn schemes,” The Annals of Statistics, vol. 1, no. 2, pp. 353–355, 1973.
[16] P. Orbanz and Y. W. Teh, “Bayesian nonparametric models,” in Encyclopedia of Machine Learning, pp. 81–89. Springer, 2011.
[17] D. J. Aldous, “Exchangeability and related topics,” in École d’Été de Probabilités de Saint-Flour XIII - 1983, pp. 1–198. Springer, 1985.
[18] K. P. Murphy, “Conjugate Bayesian analysis of the Gaussian distribution,” Tech. Rep., Department of Computer Science, UBC, January 2007.
[19] R. Douc and O. Cappé, “Comparison of resampling schemes for particle filtering,” in Proc. of the 4th Int’l Symp. on Image and Signal Processing and Analysis (ISPA), 2005, pp. 64–69.
[20] T. Li, M. Bolic, and P. M. Djuric, “Resampling methods for particle filtering: classification, implementation, and strategies,” IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 70–86, 2015.
[21] J. D. Hol, T. B. Schön, and F. Gustafsson, “On resampling algorithms for particle filters,” in Proc. of the IEEE Nonlinear Statistical Signal Processing Workshop (NSSPW), 2006, pp. 79–82.
[22] R. M. Neal, “Markov chain sampling methods for Dirichlet process mixture models,” Journal of Computational and Graphical Statistics, vol. 9, no. 2, pp. 249–265, 2000.
[23] R. Van Der Merwe, A. Doucet, N. De Freitas, and E. Wan, “The unscented particle filter,” in NIPS, 2000, pp. 584–590.
[24] B. Liu, C. Ji, Y. Zhang, C. Hao, and K. Wong, “Multi-target tracking in clutter with sequential Monte Carlo methods,” IET Radar, Sonar & Navigation, vol. 4, no. 5, pp. 662–672, 2010.
[25] L. M. Millefiori, P. Braca, K. Bryan, and P. Willett, “Adaptive filtering of imprecisely time-stamped measurements with application to AIS networks,” in 2015 18th Int’l Conf. on Information Fusion (Fusion), 2015.