Bayesian Fusion of Data Partitioned Particle Estimates
Caleb Miller, Michael D. Schneider, Jem N. Corcoran, Jason Bernstein
Abstract.
We present a Bayesian data fusion method to approximate a posterior distribution from an ensemble of particle estimates that only have access to subsets of the data. Our approach relies on approximate probabilistic inference of model parameters through Monte Carlo methods, followed by an update and resample scheme related to multiple importance sampling to combine information from the initial estimates. We show the method is convergent in the particle limit and directly suited to application on multi-sensor data fusion problems by demonstrating efficacy on a multi-sensor Keplerian orbit determination problem and a bearings-only tracking problem.
Key words.
Monte Carlo Methods, Multiple Importance Sampling, Resampling, Data Fusion
AMS subject classifications.
1. Introduction.
Data fusion is the process through which different sources of information are combined to form a joint estimate of a process, target, or distribution of interest. There are many methods for data fusion, especially in the areas of multi-sensor measurements and target tracking [3, 14]. Several data fusion techniques exist to combine analytic descriptions or numerical estimates of probability density functions (PDFs) [7]. Optimal Bayesian fusion of PDFs [4] follows Bayes' rule for conditional density inference in multiplying individual PDFs followed by division by the joint marginal distribution of the data. The normalised weighted geometric mean rule [1], an extension to non-Gaussian PDFs of the covariance intersection rule [11], avoids division by the marginal data density by direct computation of numerical weights on each function involved in the product of individual measurement PDFs.

To be applicable to non-linear and non-Gaussian problems, many methods must approximate distributions through sampling techniques. Sampling methods complicate the aforementioned fusion strategies as additional steps of estimation are then required to make PDF multiplication feasible. If the sampling-based approximations of the PDFs are Dirac mixture measures, multiplication must be applied after a process such as kernel density estimation to return non-zero estimates of the PDF products [15]. We present cross-pollination, an alternative Bayesian sampling-based strategy that uses multiple importance sampling ideas. Schneider et al. [19] adapted the multiple importance sampling work of Elvira et al. [6] to a sensor data fusion task through a multiple importance sampling framework.
We improve this method via guarantees of convergence in the particle limit and validation of consistency of different weighting schemes, thus providing a data fusion strategy that accurately quantifies uncertainty while avoiding the need for multiplication of PDF approximations by working directly with the existing particle estimates.

∗ Submitted to the editors October 26, 2020.
Funding:
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Funding for this work was provided by LLNL Laboratory Directed Research and Development grant 19-SI-004.
† Department of Applied Mathematics, CU, Boulder, CO ([email protected], [email protected]).
‡ Lawrence Livermore National Lab ([email protected], [email protected]).

2. Method.

2.1. Objective, Set-Up, and Requirements. The goal of cross-pollination is to approximate a posterior distribution $P(\theta \mid D)$, conditioned on a set of data or observations $D$, from an ensemble of existing particle estimates of marginal posterior distributions $P(\theta \mid D_j)$, where the $D_j$ ($j = 1, \dots, M$) form a partition of $D$.

The random variable of interest $\theta$ can take various forms, e.g., an unknown value as in subsection 3.2, a set of parameters which drives a deterministic motion model as in subsection 3.3, or a time-indexed collection of states which estimates a full trajectory as in subsection 3.4. The initial particle estimates can be obtained online through importance sampling-resampling, or be pre-processed samples that contain the information from $D_j$ through other Monte Carlo or Markov chain Monte Carlo techniques. Cross-pollination can be applied in batches or sequentially, through repeated applications of the main steps of our weighting scheme and resampling.

There are a few requirements to apply cross-pollination. The first requirement is that a common prior is used across all the estimates of the random variable of interest. The second is that likelihood functions must be available for all observations. These requirements are essential for Bayesian calculations and are included here for completeness.
The third requirement is that the measurements must be independent; this is a common assumption in sampling-resampling frameworks as it allows joint likelihood functions to be expressed as products of individual likelihood functions [20]. The fourth requirement, which we will refer to as the core requirement, is that particles obtained from any of the estimates must be able to map into the domain of the likelihood functions of the complement observations. This requirement is what allows our weighting scheme to occur. We will see that in the inference of parameters in a deterministic motion model (subsection 3.3) the core requirement is easily satisfied, while in the case of a stochastic motion model more work is required to satisfy the core requirement (subsection 3.4).

In the discussion of the method that follows we will assert the data $D = (z_1, z_2, \dots, z_M)$ was partitioned into single observations, $D_j = z_j$, and that the initial samples were obtained through importance sampling. Both of these choices are made only for the purpose of clarity; we will divert from this set-up in the applications in section 3. To speak accurately about updating the samples we turn to a measure-theoretic viewpoint.

Let $\pi$ be the probability measure corresponding to the posterior distribution $P(\theta \mid D)$ and $\pi_{j,0}$ be the probability measure corresponding to the initial marginal posterior distribution $P(\theta \mid D_j)$. We denote the probability measures that estimate these measures as $\pi^N$ and $\pi^N_{j,0}$ respectively, noting that the superscript $N$ indicates that these estimates are each comprised of $N$ particles approximately sampled from their respective distributions.

Let $\theta$ be a random variable of interest with prior $P(\theta)$ and observations $D := (z_1, z_2, \dots, z_M)$ with measurement uncertainty which is encapsulated by a likelihood function.
For each member of the partitioned data $D_j$, a Dirac mixture measure $\pi^N_{j,0}$ is formed from importance sampling using a likelihood function associated with the observation, $g_j(\theta)$, where $P(D_j \mid \theta) \propto g_j(\theta)$. Specifically,

(2.1) $\pi^N_{j,0} = \frac{1}{N} \sum_{i=1}^N \delta_{\theta_j^{(i)}} \qquad (j = 1, \dots, M),$

where the particles $\theta_j^{(i)}$ ($i = 1, \dots, N$) are obtained through importance sampling with the common prior $P(\theta)$ as the importance distribution and the likelihood function of the observation used to weight and resample. We denote the Dirac function $\delta(x - x^*)$ as $\delta_{x^*}$. We are to weight the particles of our initial measure $\pi^N_{j,0}$, which is built from a single observation, to form an empirical measure $\pi^N_j$ that estimates $\pi$. We use the Radon-Nikodym derivative to obtain the weights needed to match expectations between the two measures. As $\pi$ is absolutely continuous with respect to $\pi_{j,0}$ we have that

(2.2) $\frac{d\pi}{d\pi_{j,0}}(\theta) = \frac{P(D_{-j} \mid \theta)}{P(D_{-j} \mid D_j)} \propto \prod_{k \neq j} g_k(\theta),$

where $D_{-j} = (z_1, \dots, z_{j-1}, z_{j+1}, \dots, z_M)$ and the final product is due to the independence of the measurements. Each particle is now updated with a weight that corresponds to the value in the product of likelihood functions of the complementary data. We refer to this step as cross-epoch weighting:

(2.3) $\pi^N_j = \sum_{i=1}^N w_j^{(i)} \delta_{\theta_j^{(i)}}, \qquad w_j^{(i)} = \frac{\widetilde{w}_j^{(i)}}{\sum_{i'=1}^N \widetilde{w}_j^{(i')}}, \qquad \widetilde{w}_j^{(i)} = \prod_{k \neq j} g_k\big(\theta_j^{(i)}\big).$

Notice that the structure of the weights is reflected in an application of Bayes' rule working directly with the probability distribution functions,

(2.4) $P(\theta \mid D) = P(\theta \mid D_{-j}, D_j) = \frac{P(D_{-j} \mid \theta, D_j)}{P(D_{-j} \mid D_j)} P(\theta \mid D_j) = \frac{P(D_{-j} \mid \theta)}{P(D_{-j} \mid D_j)} P(\theta \mid D_j).$

Now, we combine the empirical measures through pooling and resampling to obtain $N$ equally weighted particles.
One way to achieve this is to combine all of the measures with a summation and divide by the total number of measures,

(2.5) $\pi^{MN} = \frac{1}{M} \sum_{j=1}^M \pi^N_j = \frac{1}{M} \sum_{j=1}^M \sum_{i=1}^N w_j^{(i)} \delta_{\theta_j^{(i)}},$

and then employ a resampling technique, say multinomial, to obtain our final approximation

(2.6) $\pi^N = \frac{1}{N} \sum_{i=1}^N \delta_{\theta^{(i)}},$

where $\theta^{(i)}$ ($i = 1, \dots, N$) are independently and identically distributed from the measure $\pi^{MN}$.

We can fuse the particles in another way; instead of normalizing the weights by data partition, as in (2.3), we can go straight to a fused measure by normalizing all the weights together,

(2.7) $\pi^{MN}_t = \sum_{j=1}^M \sum_{i=1}^N w_j^{(i)} \delta_{\theta_j^{(i)}}, \qquad w_j^{(i)} = \frac{\widetilde{w}_j^{(i)}}{\sum_{j'=1}^M \sum_{i'=1}^N \widetilde{w}_{j'}^{(i')}}, \qquad \widetilde{w}_j^{(i)} = \prod_{k \neq j} g_k\big(\theta_j^{(i)}\big).$

Then, in the same fashion we would employ a resampling technique to get to our final estimate of $N$ particles,

(2.8) $\pi^N_t = \frac{1}{N} \sum_{i=1}^N \delta_{\theta^{(i)}},$

where $\theta^{(i)}$ ($i = 1, \dots, N$) are independently and identically distributed from the measure $\pi^{MN}_t$. We refer to the methods as 'norming-apart' and 'norming-together', respectively. The norming-apart process is diagrammed in Figure 1. Both methods are proved to converge under a root mean square distance in the particle limit when the initial estimates are formed using importance sampling. The proof for the norming-apart method is presented in Appendix A. The ease of these formulations suggests straightforward algorithms for both methods; pseudo-code is provided in Appendix B.
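As a concrete illustration, the cross-epoch weighting of (2.3) followed by either pooling rule can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cross_pollinate` and the flat-prior toy check are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_pollinate(particles, likelihoods, norm="apart", rng=rng):
    """Fuse M particle sets via cross-epoch weighting and multinomial
    resampling.  particles[j] is an (N, d) array approximating the marginal
    posterior P(theta | D_j); likelihoods[k] maps an (N, d) batch of
    particles to an (N,) array of likelihood values g_k."""
    M, N = len(particles), particles[0].shape[0]
    log_w = np.zeros((M, N))
    for j in range(M):                        # cross-epoch weighting (2.3)
        for k in range(M):
            if k != j:                        # complementary likelihoods only
                log_w[j] += np.log(likelihoods[k](particles[j]))
    if norm == "apart":
        # normalize within each set, then pool with equal 1/M weights (2.5)
        w = np.exp(log_w - log_w.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        w /= M
    else:
        # "together": one normalization over all M*N weights (2.7)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
    pool = np.concatenate(particles)          # (M*N, d)
    idx = rng.choice(M * N, size=N, p=w.ravel())
    return pool[idx]

# Toy check with an (assumed) flat prior and two Gaussian "sensors": each
# initial set is then proportional to its own likelihood, and the fused
# posterior is proportional to g1 * g2, i.e. N(0, 1/2).
g1 = lambda th: np.exp(-0.5 * (th[:, 0] - 1.0) ** 2)
g2 = lambda th: np.exp(-0.5 * (th[:, 0] + 1.0) ** 2)
p1 = rng.normal(1.0, 1.0, size=(5000, 1))
p2 = rng.normal(-1.0, 1.0, size=(5000, 1))
fused = cross_pollinate([p1, p2], [g1, g2], norm="together")
```

Working in log space before exponentiating keeps the products of many likelihoods from underflowing, which matters once $M$ grows.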
Importance sampling is a well-known technique of sampling from a known distribution $q(\theta)$ and adjusting the samples with importance weights to approximate a target distribution $P(\theta)$ [18]. The typical form of the importance weight for a sample $\theta^{(i)}$ is $w(\theta^{(i)}) = P(\theta^{(i)})/q(\theta^{(i)})$. Multiple importance sampling expands this idea by introducing a number of proposal distributions $q_j(\theta)$ ($j = 1, \dots, M$) that can be adjusted in various ways to estimate the desired distribution. Both the weighting and the ordering of selection from proposal distributions can affect the final estimate [6].

The connection to cross-pollination is formed by considering the importance sampling target distribution to be the posterior conditioned on all the data, $P(\theta) = P(\theta \mid D)$, and the importance sampling proposal distributions to be the marginal posteriors, $q_j(\theta) = P(\theta \mid D_j)$, which makes the initial measures $\pi^N_{j,0}$ the collections of samples obtained from each of our proposal distributions. By using the data-dependent proposals we have created a data fusion algorithm that can make use of existing multiple importance sampling theory. Both the norming-apart and norming-together cross-pollination schemes resemble aspects of generic diverse population Monte Carlo methods discussed by Elvira et al. [5]. Schneider's original work on cross-pollination [19] used the deterministic mixture weights proposed by Elvira et al. [6]. In this case, one would have

(2.9) $\pi^{MN}_{dm} = \sum_{j=1}^M \sum_{i=1}^N w_j^{(i)} \delta_{\theta_j^{(i)}}, \qquad w_j^{(i)} = \frac{\widetilde{w}_j^{(i)}}{\sum_{j'=1}^M \sum_{i'=1}^N \widetilde{w}_{j'}^{(i')}}, \qquad \widetilde{w}_j^{(i)} = \frac{\prod_{k=1}^M g_k\big(\theta_j^{(i)}\big)}{\sum_{k=1}^M g_k\big(\theta_j^{(i)}\big)}.$

Figure 1.
Cross-Pollination Process: Particles (circles) have initially only gained the statistical information (squares) of their own data (shared color between circle and square). The process continues by imparting the statistical information that the other particles have not been exposed to yet (different color between circles and squares) and imparting weights (size of particle, larger size indicating larger weight). The particles are then pooled and resampled based on their weights.

Through sampling one would then obtain

(2.10) $\pi^N_{dm} = \frac{1}{N} \sum_{i=1}^N \delta_{\theta^{(i)}},$

where $\theta^{(i)}$ ($i = 1, \dots, N$) are independently and identically distributed from the measure $\pi^{MN}_{dm}$. This weight formulation is of particular interest as Elvira et al. [6] catalog it as the scheme having the minimum variance of the multiple importance sampling schemes they examined. However, this scheme is only shown to be consistent, in the limit of the number of proposals, when the normalizing constant is known. This makes the deterministic mixture weights scheme less applicable to the problem at hand, which demands a fixed number of proposal distributions depending on how the data has been partitioned and has no knowledge of the normalizing constant, which is typically estimated by the sum of the weights in importance sampling theory.
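For comparison, the unnormalized deterministic-mixture weights of (2.9) can be computed from the same ingredients. This is our own sketch (the function name is hypothetical); the joint normalization and resampling are left to the caller.

```python
import numpy as np

def dm_weights(particles, likelihoods):
    """Unnormalized deterministic-mixture weights of eq. (2.9): for each
    particle, the full product of the M likelihoods divided by their sum.
    particles[j] is an (N_j, d) array; likelihoods[k] maps such a batch
    to an (N_j,) array of g_k values."""
    weights = []
    for theta_j in particles:
        G = np.stack([g(theta_j) for g in likelihoods])   # (M, N_j) values
        weights.append(G.prod(axis=0) / G.sum(axis=0))
    return np.concatenate(weights)   # normalize jointly before resampling
```

Unlike (2.3) and (2.7), every particle here is weighted by its own likelihood as well, with the mixture sum in the denominator playing the role of the proposal density.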
3. Applications.

3.1. Set-up. We demonstrate the algorithms on three examples: the first is an example with an analytic posterior distribution that serves as an introduction to the application of the method as well as a verification of the convergence rate; the second is an orbit determination problem where particle estimates borne of data from two different sensors are fused via the norming-together method; the third is a bearings-only tracking example that shows one way the ideas of cross-pollination may be carried into a sequential problem with a stochastic motion model. For simplicity in the latter two examples, we consider normal or multivariate normal likelihood functions for measurements. We denote the PDF of a normal distribution with mean $\mu$ and variance $\sigma^2$ evaluated at $x$ as $N(x; \mu, \sigma^2)$. Similarly, we take $MVN(x; \mu, \Sigma)$ to be the PDF of a multivariate normal with mean $\mu$ and covariance matrix $\Sigma$ evaluated at $x$.

Our theory argues for a convergence rate of $1/\sqrt{N}$ of our estimator $\pi^N$ to the probability measure $\pi$ that corresponds to the posterior distribution. We verify this on an example with Gamma distributions. We denote that $x$ is distributed from a Gamma distribution with shape parameter $k$ and scale parameter $\theta$ by $x \sim \mathcal{G}(k, \theta)$. First we give our random variable of interest a Gamma prior distribution $x \sim \mathcal{G}(k_0, \theta_0)$. Next we have three observations, each with a Gamma distribution, so that $y_j \mid x \sim \mathcal{G}(k_j, x^{-1})$ ($j = 1, 2, 3$), which yields the analytic posterior

$x \mid y_1, y_2, y_3 \sim \mathcal{G}\big(k_0 + k_1 + k_2 + k_3,\ (\theta_0^{-1} + y_1 + y_2 + y_3)^{-1}\big).$

For this experiment we chose $y_1 = 1$, $y_2 = 2$, $y_3 = 3$, $k_1 = 4$, $k_2 = 10$, $k_3 = 25$, with fractional values for $k_0$ and $\theta_0$. The sets of initial particles corresponding to $\pi^N_{j,0}$ ($j = 1, 2, 3$) were obtained by one round of standard importance sampling. The particles were then cross-epoch weighted and resampled under both the norming-together and norming-apart methods of cross-pollination. We conducted 1000 Monte Carlo trials over an increasing sequence of particle counts beginning at $N = 10$.

To reiterate, we formed the particles from the observations (labelled Obs 1 particles, etc., in Figure 2a) based on drawing particles from the prior and importance sampling with the single Gamma likelihood. The particles were then put through the cross-pollination process of cross-epoch weighting and resampling to form the final estimate of particles (labelled Cross-pol particles in Figure 2a). Our theory is supported by Figure 2b, as we see the mean absolute error decreases at a rate proportional to the square root of the number of particles for each moment. The proportionality constant changes depending on which moment is being estimated; as one would expect, it is higher for the higher order moments.

We included this example to provide an introduction to the application of the method and to check convergence rates where the posterior distribution has an analytic form. We believe cross-pollination can be a useful tool in multi-sensor data fusion applications, as demonstrated in the following applications.

Space traffic management is becoming an increasingly challenging remote sensing and data fusion problem as the number of satellites and debris around the Earth continues to grow. A key aspect of space traffic management problems is orbit determination. Multi-sensor data fusion for an orbit determination problem has been explored using the geometric mean density (GMD) fusion rule [15].

Figure 2. (a) Blue, purple, and cyan histograms show the initial sets of particles for a trial, created from observations 1, 2, and 3 respectively. The grey histogram is of particles that have been selected after a single run of cross-pollination. The solid pink line is the posterior PDF.
(b) The black dotted lines follow reference curves proportional to $1/\sqrt{N}$ (top and bottom). Solid red lines from darkest to lightest plot the mean over Monte Carlo trials of the absolute error of the estimated first four moments (mean, variance, skew, kurtosis) against the true values of the moments for the norming-apart method. The green dashed lines similarly describe the norming-together method.
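The Gamma experiment can be reproduced in outline as follows. Since the exact prior hyperparameters are not recoverable above, we assume $k_0 = 2.5$ and $\theta_0 = 2.0$ purely for illustration; the observation values and shapes are those of the experiment, and norming-apart is used for the fusion.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed prior hyperparameters (the paper's exact k0, theta0 differ);
# observations y_j and shapes k_j follow the experiment.
k0, th0 = 2.5, 2.0
y = np.array([1.0, 2.0, 3.0])
k = np.array([4.0, 10.0, 25.0])
N = 20_000

def log_g(j, x):
    # log-likelihood of y_j | x ~ Gamma(shape k_j, rate x), up to a
    # constant in y_j
    return k[j] * np.log(x) - y[j] * x

# Initial estimates pi_{j,0}: importance sampling-resampling from the prior.
parts = []
for j in range(3):
    prop = rng.gamma(k0, th0, size=N)      # prior draws (shape, scale)
    lw = log_g(j, prop)
    w = np.exp(lw - lw.max()); w /= w.sum()
    parts.append(rng.choice(prop, size=N, p=w))

# Cross-epoch weighting (2.3) with the complementary likelihoods, then
# norming-apart pooling (2.5) and multinomial resampling (2.6).
log_w = np.array([sum(log_g(kk, parts[j]) for kk in range(3) if kk != j)
                  for j in range(3)])
w = np.exp(log_w - log_w.max(axis=1, keepdims=True))
w /= 3 * w.sum(axis=1, keepdims=True)
fused = rng.choice(np.concatenate(parts), size=N, p=w.ravel())

# Analytic posterior: Gamma(k0 + sum k_j, scale 1/(1/th0 + sum y_j)).
post_mean = (k0 + k.sum()) / (1.0 / th0 + y.sum())
```

Comparing `fused.mean()` against `post_mean` over repeated trials reproduces the kind of moment-error check plotted in Figure 2b.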
In an orbit determination scenario where ground-based optical angles-only measurements are accessible online and orbital measurements are lagged, practitioners may desire making ground-based estimates immediately and fusing the orbital estimates when they become available at a later time. We simulate this situation below for Keplerian orbits.

Keplerian orbits trace closed ellipses in space and are described in a six-dimensional state space, which is typically parameterized by the six orbital elements: semi-major axis, eccentricity, inclination, argument of periapsis, right ascension of the ascending node, and the true anomaly. Equivalently, these orbits can be identified by a three-dimensional position and a three-dimensional velocity at a given time. We let $\theta = (x, y, z, \dot{x}, \dot{y}, \dot{z})$ at time $t_0$, $P(\theta)$ be the prior, and consider two sets of three right ascension (RA) and declination (DEC) measurements: one taken from a sensor in low Earth orbit (LEO), $D_1$, over three minutes, and the other from a ground-based observer, $D_2$, over a four minute period taking place six minutes after the final orbiting-sensor measurement. Sensor measurements are modeled as multivariate Gaussian with mean given by the true RADEC and standard deviations given by $\sigma_{LEO} = 2$ arcseconds and $\sigma_{ground} = 20$ arcseconds. Therefore, the likelihood function for measurement $j$ of the LEO observer is

(3.2) $g_j\big(\theta^{(i)}\big) = MVN\big(H_j\big(\theta^{(i)}\big);\ z_j,\ \Sigma_{LEO}\big),$

where the non-linear measurement function $H_j$ maps the particles into the correct measurement space by propagating the orbit specified by $\theta^{(i)}$ to the correct observation time $t_j$ and retrieving the RADEC measurements that would have been obtained from the $j$th sensor, and the covariance matrix is $\Sigma_{LEO} = \sigma_{LEO}^2 I$. Of course, the ground-based likelihoods take the same form but with $\sigma_{ground}$.
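A sketch of the per-measurement log-likelihood in (3.2), assuming a diagonal covariance. The propagation step depends on an orbit propagator we do not reproduce, so `propagate_to_radec` is a hypothetical callable standing in for $H_j$.

```python
import numpy as np

ARCSEC = np.pi / (180 * 3600)          # radians per arcsecond

def log_likelihood(theta, z_j, t_j, sigma, propagate_to_radec):
    """Gaussian RA/DEC log-likelihood of eq. (3.2) for one measurement,
    up to an additive constant.

    theta              : (6,) position-velocity state at the epoch t0
    z_j                : (2,) observed [RA, DEC] in radians at time t_j
    sigma              : per-axis noise standard deviation in radians
    propagate_to_radec : hypothetical stand-in for H_j, returning the
                         (2,) predicted [RA, DEC] at time t_j
    """
    resid = propagate_to_radec(theta, t_j) - z_j
    # wrap the RA residual into (-pi, pi] to avoid branch-cut artifacts
    resid[0] = (resid[0] + np.pi) % (2 * np.pi) - np.pi
    return -0.5 * np.sum(resid ** 2) / sigma ** 2
```

Evaluating this for every particle against the complementary sensor's measurements gives exactly the cross-epoch log-weights used in the fusion below.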
This set-up satisfies the requirements of cross-pollination. We will apply cross-pollination to the estimates that comprise $\pi^N_{1,0}$ and $\pi^N_{2,0}$, which consist of 50,000 samples each, obtained with an MCMC sampler to produce initial orbit determinations based on the data provided. We fuse these samples using norming-together cross-pollination and perturb the samples to achieve our final estimate $\pi^N$. We diagram this process in Figure 3.

Figure 3.
From left to right: a random sample of the particle estimates' traced-out orbits (grey) and the true orbit (red) for the LEO sensor, the ground-based sensor, and the estimates after being refined through the norming-together cross-pollination process.

From Figure 3 we can see that the refined orbits from the norming-together cross-pollination process provide an improved picture of the situation and its uncertainty. For this example the root mean square error of the positional portion of the orbital elements for the samples before cross-pollination was 280 km and 681 km for the ground and LEO samples respectively; after cross-pollination this was improved to a root mean square error of 32 km. Similarly, the root mean square errors of the velocity portion improved from 104 m/s and 2243 m/s to 42 m/s. We can see in Figure 3 a distinct improvement in the uncertainty: the ground sensor MCMC process allowed many samples from a degenerate orbital plane, due to a small observation window and large uncertainty in the measurements, that have been eliminated in the fused results. Furthermore, the distinct "banana shaped uncertainty" [10] that emerges in orbit determination problems is seen in the rightmost image of Figure 3, indicating uncertainty realism.

We have applied cross-pollination to two situations where the core requirement was easily satisfied, in that likelihood functions could be immediately applied to particles. This was due to fusing three measurements at a single epoch in subsection 3.2 and utilizing a deterministic motion model in subsection 3.3. Not all situations will follow these formats. In this application we demonstrate an adaptation of the cross-pollination ideas for a sequential smoothing problem with a stochastic motion model. Say we desire a smoothing distribution $P(x \mid y)$ for a tracking problem where an object follows stochastic dynamics.
Furthermore, suppose that the estimates obtained are from processes run on two sensors with differing cadences, so that we desire to fuse the trajectories (particles) of $\pi^N_1$, corresponding to $P(x_{t_1}, x_{t_3}, x_{t_5} \mid D_1 = (y_1, y_3, y_5))$, and $\pi^N_2$, corresponding to $P(x_{t_2}, x_{t_4}, x_{t_6} \mid D_2 = (y_2, y_4, y_6))$. If the $y_i$ took place at distinct times $t_i$ and the times are increasing, $t_1 < t_2 < t_3 < \dots$, it is clear that a likelihood that corresponds to $t_2$ cannot be applied to a trajectory from $\pi^N_1$, as the state at time $t_2$ has not been recorded. We note that this corresponds to the problem of having to incorporate out-of-sequence measurements (OOSM), which has been discussed in the tracking and data fusion literature [2]. An attractive way to implement cross-pollination would be to consider the Bayesian updates used in the OOSM particle filter [16]. In the following example we mitigate this problem by saving the state of the particles wherever any observation occurred or will occur. Depending on sensor architecture this may not be possible, and the other mentioned methods could be employed so that likelihood functions can be applied at all necessary times.

We apply cross-pollination to a bearings-only tracking problem inspired by Gordon et al. [9]. Several other methodologies have been developed to perform well on the bearings-only tracking problem, for example the resample-move filter [8]. We choose this problem not to compete with the other methods but to demonstrate the versatility of cross-pollination on a multi-sensor sequential problem with a stochastic motion model. We will be estimating the state vectors of positions and velocities, up to time $t$, which are driven through a linear motion model with additive Gaussian noise. That is,

(3.3) $\theta_t = (x_1, \dots, x_t), \qquad x_j = \Phi x_{j-1} + \Gamma w_j,$

(3.4) $x_j = (x, \dot{x}, y, \dot{y})^T_j, \qquad w_j = (w_x, w_y)^T_j, \qquad w_j \sim N(0, \sigma_q^2 I),$

(3.5) $\Phi = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} 0.5 & 0 \\ 1 & 0 \\ 0 & 0.5 \\ 0 & 1 \end{pmatrix}.$

Bearing observations of the process will be obtained through two different sensors with additive Gaussian noise. The $j$th sensor is located at $(x_{S_j}, y_{S_j})$, and its likelihood is

(3.6) $g_j\big(\theta^{(i)}\big) = N\big(H\big(\theta^{(i)}\big);\ z_j,\ \sigma_r^2\big), \qquad H\big(\theta^{(i)}\big) = \arctan\left(\frac{y_i - y_{S_j}}{x_i - x_{S_j}}\right).$

The standard deviations of the sensor and model noise are $\sigma_r = 0.005$ and $\sigma_q$, respectively. The prior is $x_0 \sim MVN(\bar{x}_p, \Sigma_p)$ with diagonal $\Sigma_p$, and the initial true state is $x_0$. This starting state is propagated through the motion model for 20 time steps to form the true trajectory.

To demonstrate further flexibility of the ideas of cross-pollination we partition the data completely, so that $D_j = y_j$ with $D = (y_1, y_2, \dots, y_{20})$. We obtain the sample trajectories $\theta_j^{(i)}$ that comprise the empirical measure $\pi^N_{j,0}$ corresponding to $P(\theta_j \mid D_j = y_j)$ by propagating prior samples forward until time $j$, saving the state at each time $(1, 2, \dots, j)$, and then performing importance sampling and resampling of the whole trajectory based on the $j$th likelihood function.

To start the process of sequential cross-pollination we begin with $\pi^N_{1,0}$ and motion the trajectories to $t_2$. The forward-motioned trajectories of $\pi^N_{1,0}$ are then fused with the trajectories of $\pi^N_{2,0}$ by weighting the motioned trajectories with the likelihood $g_2$ and weighting the trajectories of $\pi^N_{2,0}$ with the "backward in time" likelihood $g_1$. After the cross-epoch weighting, the trajectories can be pooled and resampled using the norming-apart or norming-together processes, as they have seen all the relevant statistical information up to the second epoch. To continue the process, the cross-pollinated trajectories, denoted $\pi^N_{2,*}$, are motioned forward to $t_3$ and then cross-pollinated with $\pi^N_{3,0}$. Notice that the trajectories of $\pi^N_{3,0}$ would now be weighted with two likelihoods, $g_1$ and $g_2$. Extending this notion, the samples of $\pi^N_{j,0}$ would be weighted with a product of $j - 1$ likelihoods. The process continues until $t = 20$, so that $\pi^N_{20,*}$ estimates the measure that corresponds to $P(\theta_{20} \mid D)$. Pseudo-code for this process is provided in Appendix B.
Due to particle sparsity issues raised by the application of several backward in time likelihoods, we implemented this process with log likelihoods, resampled the particles during the application of the cross-epoch weight whenever the effective sample size fell below a threshold, and perturbed the particles that survived the resampling. Notice that if the initial particles are obtained via importance sampling-resampling and the resample step accepts only particles originating from the measure $\pi^N_{j,*}$ at each step, then this algorithm is essentially the sequential importance resampling particle filter [9]. The additional steps provide new possible trajectories that could help reduce particle sparsity. In Figure 4 we see wide swaths of initial particles that comprise the empirical measures $\pi^N_{j,0}$ at particular epochs $j$.
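The motion model (3.3)-(3.5), the bearing likelihood (3.6), and a single cross-epoch exchange between the two sensors can be sketched as below. This is an illustrative sketch, not the authors' code: the $\Phi$ and $\Gamma$ entries follow the standard constant-velocity form, `arctan2` is used for quadrant safety, and $\sigma_q$, $\sigma_r$, and the sensor locations are assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Constant-velocity model of (3.3)-(3.5); sigma_q, sigma_r, and the sensor
# locations below are assumed for illustration, not the paper's values.
Phi = np.array([[1., 1., 0., 0.],
                [0., 1., 0., 0.],
                [0., 0., 1., 1.],
                [0., 0., 0., 1.]])
Gam = np.array([[0.5, 0.0],
                [1.0, 0.0],
                [0.0, 0.5],
                [0.0, 1.0]])
sigma_q, sigma_r = 0.001, 0.005
sensors = [(-1.0, 1.0), (1.0, -1.0)]

def step(states, rng):
    """One motion-model step for an (N, 4) batch of states (x, xdot, y, ydot)."""
    w = rng.normal(0.0, sigma_q, size=(len(states), 2))
    return states @ Phi.T + w @ Gam.T

def log_bearing_lik(states, z, sensor):
    """Log of the bearing likelihood (3.6), up to an additive constant."""
    xs, ys = sensor
    h = np.arctan2(states[:, 2] - ys, states[:, 0] - xs)
    d = (h - z + np.pi) % (2 * np.pi) - np.pi    # wrap the angle residual
    return -0.5 * d ** 2 / sigma_r ** 2

def fuse(states1, states2, z1, z2, rng):
    """One cross-epoch exchange: each sensor's particles are weighted by
    the *other* sensor's likelihood, then pooled and resampled
    (norming-together)."""
    lw = np.concatenate([log_bearing_lik(states1, z2, sensors[1]),
                         log_bearing_lik(states2, z1, sensors[0])])
    w = np.exp(lw - lw.max()); w /= w.sum()
    pool = np.concatenate([states1, states2])
    return pool[rng.choice(len(pool), size=len(states1), p=w)]
```

Chaining `step` and `fuse` over successive epochs, with the saved states supplying the "backward in time" likelihoods, gives the sequential scheme described above.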
4. Discussion.
We have presented two methods capable of fusing estimates with guarantees in the particle limit. Applying the methods to different examples has led us to some empirical insights about them. In our experience, norming-together is more robust to particle degeneracy issues. For example, consider the scenario of fusing two batches of particles from two different sensors where one sensor has far more uncertainty than the other. Applying norming-apart with too few particles can be detrimental in this scenario if the uncertainty from the "bad" sensor is so wide that a single particle receives high weight. When pooled together with reasonably weighted particles from the other set, this "best of the bad" particle would be replicated many times in the resampling framework and could lead to biased estimates. Computations should be performed with sufficient numbers of particles so that this problem is less likely to occur. In this same scenario it may be the case that norming-together
only selects particles from the sensor with the reduced uncertainty in the resampling step, while still refining the estimates. This demonstrates that information from more uncertain sensors is still valuable and can be used to improve estimates through cross-pollination. The sequential cross-pollination example demonstrated the applicability of cross-pollination ideas to stochastic motion models; however, we believe that cross-pollination is most readily applicable in data-starved situations, which correspond to fewer applications of likelihood functions.

Many ideas must be considered in order to improve the method. Different weighting and resampling schemes for multiple importance sampling lead to improved variances of the estimators [5, 6]. For large sets of data the weights obtained from cross-epoch weighting may render many particles degenerate. Sequential resampling may be used to combat this degeneracy so that fewer likelihoods are considered at a time [9]. In the batch case, this leads to questions about the order in which the likelihoods should be applied to the weights. For instance, if the order of application of likelihoods runs from the most to the least uncertain data, improvements may be possible [17].

We are interested in applications and extensions of cross-pollination in the areas of multiple target tracking, distributed sensing, and optimal control. In multiple target tracking, track estimates may be cross-pollinated if they are associated to the same object.

Figure 4. Sequential cross-pollination applied to a two-sensor bearings-only tracking problem. Depicted are the sensor locations (purple and blue diamonds), initial particles (purple and blue circles, color matched to the sensor from which the particles have seen data), cross-pollinated particles (black circles), and true locations (red diamonds) for eight epochs. Time progresses from top to bottom, so that the first epoch is at the top of the figure and the last is at the bottom.
As both methods pool the sets of particles together, they could be considered centralized fusion methods. Norming-apart is decentralized before the particles are pooled together: only the likelihoods from the other sensors are required to update the individual sets of particles for accurate posterior estimates. The ideas in the norming-apart approach could therefore have interesting applications in distributed sensing. Rather than passing state vectors or model parameters from sensor to sensor, likelihood functions could be passed between sensors as a way to refine estimates. This is especially interesting in the case of limited communication bandwidth between sensors. For optimal control, it is possible to identify optimal paths to reach a specified target and associated optimal actions from Monte Carlo sampling of paths under an uncontrolled (i.e., open loop) stochastic motion model [12]. In some application instances, information about candidate paths can be available from different sensors or agents, which might then be combined with cross-pollination resampling to enable optimal control solutions with greater computational efficiency.

Appendix A. Theory.
We now prove that the final resampled measure from the norming-apart cross-pollination process, $\pi^N$, converges to $\pi$ in the particle limit. In particular, we prove this convergence when the initial particles were sourced via importance sampling. We make use of the following "root mean square" distance, notation, and operators on the set of probability measures presented by Law et al. [13, p. 87-93]:

(A.1) $d(\mu, \nu) = \sup_{\|f\|_\infty \leq 1} \sqrt{\mathbb{E}\,|\mu(f) - \nu(f)|^2},$

where

(A.2) $\mu(f) = \int_{\mathbb{R}^n} f(v)\, \mu(dv),$

(A.3) $(L_j \mu)(d\nu) = \frac{g_j(\nu)\, \mu(d\nu)}{\int_{\mathbb{R}^n} g_j(\nu)\, \mu(d\nu)},$

and

(A.4) $\big(S^N \mu\big)(d\nu) = \frac{1}{N} \sum_{n=1}^N \delta_{\nu^{(n)}}(d\nu), \qquad \nu^{(n)} \sim \mu \text{ i.i.d.}$

Here, $g_j$ is the likelihood function corresponding to the $j$th data point $D_j$. We also make use of two lemmas proved by Law et al. [13, p. 87-93]:

Lemma A.1.
The sampling operator satisfies

(A.5) $\sup_{\mu \in P(\mathbb{R}^n)} d(S^N \mu, \mu) \leq \frac{1}{\sqrt{N}}.$
Lemma A.2.
If there exists $\kappa \in (0, 1]$ such that for all $\nu \in \mathbb{R}^n$ and $j \in \mathbb{N}$, $\kappa \leq g_j(\nu) \leq \kappa^{-1}$, then

(A.6) $d(L_j \nu, L_j \mu) \leq 2\kappa^{-2}\, d(\nu, \mu).$

We can now frame the norming-apart process as a sequence of applications of $S^N$ and $L_j$. We will use $L_{-j}$ to denote transforming a measure, which has incorporated the $j$th data, using the Radon-Nikodym derivative to incorporate the remaining data (via a product of likelihood functions), yielding an appropriate approximate measure of the full posterior. We can restate the empirical measure equations, (2.1)-(2.6), using this notation:

(A.7) $\pi^N_{j,0} = S^N L_j S^N P,$

(A.8) $\pi^N_j = L_{-j} \pi^N_{j,0} = L_{-j} S^N L_j S^N P,$

(A.9) $\pi^{MN} = \frac{1}{M} \sum_{j=1}^M \pi^N_j = \frac{1}{M} \sum_{j=1}^M L_{-j} S^N L_j S^N P,$

(A.10) $\pi^N = S^N \pi^{MN},$

where $P$ is the prior distribution of $\theta$. Armed with this theory we can prove our result with some applications of the triangle inequality. We are loose with our application of the likelihood lemma (A.6) in our proof, summarizing the result as $d(L_j \nu, L_j \mu) \leq C_j\, d(\nu, \mu)$.

Theorem A.3.
Assumptions: those of Lemma A.2 and independent measurements. Then

(A.11) $d(\pi^N, \pi) \le \dfrac{C_M}{\sqrt{N}}$.

Proof.

$d(\pi^N, \pi) \le d(\pi^N, \pi^{MN}) + d(\pi^{MN}, \pi)$

$\quad = d(S^N \pi^{MN}, \pi^{MN}) + d(\pi^{MN}, \pi)$

$\quad \le \dfrac{1}{\sqrt{N}} + d\!\left(\dfrac{1}{M}\sum_{j=1}^{M} \pi^N_j,\; \pi\right)$

$\quad \le \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} d(\pi^N_j, \pi)$

$\quad = \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} d(L_{-j}\pi^N_{j,0},\, L_{-j} L_j P)$

$\quad \le \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} C_{-j}\, d(\pi^N_{j,0},\, L_j P)$

$\quad = \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} C_{-j}\, d(S^N L_j S^N P,\, L_j P)$

$\quad \le \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} C_{-j}\left( d(S^N L_j S^N P,\, L_j S^N P) + d(L_j S^N P,\, L_j P) \right)$

$\quad \le \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} C_{-j}\left( \dfrac{1}{\sqrt{N}} + C_j\, d(S^N P, P) \right)$

$\quad \le \dfrac{1}{\sqrt{N}} + \dfrac{1}{\sqrt{M}} \sum_{j=1}^{M} C_{-j}\left( \dfrac{1}{\sqrt{N}} + \dfrac{C_j}{\sqrt{N}} \right)$

$\quad = \dfrac{1 + \frac{1}{\sqrt{M}} \sum_{j=1}^{M} \left( C_{-j} + C_{-j} C_j \right)}{\sqrt{N}}.$

We have shown convergence of the norming-apart method in the particle limit. The fourth line of the proof is not trivial: it was achieved by bounding the cross terms of the squared sum by the sum of the two terms squared, that is, $ab \le (a^2 + b^2)/2$. For the norming-together method, the pooled measure is instead

$\pi^{MN}_t = \sum_{j=1}^{M} K_j \pi^N_j$, $\qquad K_j = \dfrac{\sum_{i=1}^{N} w^{(i)}_j}{\sum_{j'=1}^{M} \sum_{i=1}^{N} w^{(i)}_{j'}}$,

and we may employ a similar proof to show that the norming-together method also converges in the particle limit.

Appendix B. Pseudo-code.
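As a runnable companion to the listings that follow, the norming-apart pipeline (A.7)-(A.10) (equivalently, Algorithm B.1) can be exercised end to end on a toy conjugate-Gaussian problem whose exact posterior is known in closed form. Everything in this sketch is an illustrative assumption rather than the paper's experimental setup: the model, the partition of one observation per ensemble, and all variable names.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: theta ~ N(0, 1); D_j | theta ~ N(theta, 1), one datum per partition.
# Exact posterior given all M data: N(sum(D) / (M + 1), 1 / (M + 1)).
M, N = 5, 50_000
D = rng.normal(0.7, 1.0, size=M)

def log_g(j, theta):
    """Log-likelihood g_j of the j-th data point."""
    return -0.5 * (D[j] - theta) ** 2

def normalise(log_w):
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

# pi_{j,0} = S^N L_j S^N P: importance sample each partition posterior from the prior.
ensembles = []
for j in range(M):
    prior_draws = rng.normal(0.0, 1.0, size=N)                      # S^N P
    keep = rng.choice(N, size=N, p=normalise(log_g(j, prior_draws)))
    ensembles.append(prior_draws[keep])                             # S^N L_j S^N P

# Cross-epoch weighting L_{-j}: weight ensemble j by the *other* likelihoods and
# normalise each ensemble apart, pool with the 1/M factor of (A.9), resample (A.10).
T = np.concatenate(ensembles)
W = np.concatenate([normalise(sum(log_g(k, pts) for k in range(M) if k != j)) / M
                    for j, pts in enumerate(ensembles)])
posterior = T[rng.choice(len(T), size=N, p=W)]

exact_mean, exact_var = D.sum() / (M + 1), 1.0 / (M + 1)
print(posterior.mean(), exact_mean)   # the two means should agree closely
print(posterior.var(), exact_var)
```

Normalising each ensemble's weights separately before pooling is what distinguishes this norming-apart variant from the norming-together variant, where a single normalisation is applied across all $M \times N$ pooled weights.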
Algorithm B.1 Norming-Apart Cross-Pollination

Input: Acquire the initial samples that comprise $\pi^N_{j,0}$.
for $j = 1$ to $M$ do
    Obtain $\theta^{(i)}_j$ ($i = 1, \ldots, N$)
end for
Cross-Epoch Weighting: Form $\pi^N_j$.
for $j = 1$ to $M$ do
    Calculate weights $w^{(i)}_j$ ($i = 1, \ldots, N$) according to equation (2.3).
end for
Pool Particles and Weights: Form $\pi^{MN}$.
Form $T = \bigcup_{j=1}^{M} \bigcup_{i=1}^{N} \theta^{(i)}_j$ and $W = \bigcup_{j=1}^{M} \bigcup_{i=1}^{N} \left( \frac{1}{M}\, w^{(i)}_j \right)$.
Resample: Form $\pi^N$.
Perform multinomial sampling on the set of weights, $W$, to obtain $N$ (from $M \times N$ total) particles $\theta^{(i)}$ ($i = 1, \ldots, N$) from $T$.

Algorithm B.2 Norming-Together Cross-Pollination

Input: Acquire the initial samples that comprise $\pi^N_{j,0}$.
for $j = 1$ to $M$ do
    Obtain $\theta^{(i)}_j$ ($i = 1, \ldots, N$)
end for
Cross-Epoch Weighting: Form $\pi^N_j$.
for $j = 1$ to $M$ do
    Calculate weights $w^{(i)}_j$ ($i = 1, \ldots, N$) according to equation (2.7).
end for
Pool Particles and Weights: Form $\pi^{MN}_t$.
Form $T = \bigcup_{j=1}^{M} \bigcup_{i=1}^{N} \theta^{(i)}_j$ and $W = \bigcup_{j=1}^{M} \bigcup_{i=1}^{N} w^{(i)}_j$.
Resample: Form $\pi^N$.
Perform multinomial sampling on the set of weights, $W$, to obtain $N$ (from $M \times N$ total) particles $\theta^{(i)}$ ($i = 1, \ldots, N$) from $T$.

Algorithm B.3 Sequential Cross-Pollination
Initialization: Obtain $\theta^{(i)}_1$ ($i = 1, \ldots, N$) that comprise $\pi^N_1$, corresponding to $P(\theta_1 \mid z_1)$.
Set $\theta^{(i)}_{1,*} = \theta^{(i)}_1$.
Iteration Cross-Pollination:
for $j = 2$ to $M$ do
    Obtain $\theta^{(i)}_j$ ($i = 1, \ldots, N$) that comprise $\pi^N_{j,0}$, corresponding to $P(\theta_j \mid z_j)$
    Compute unnormalized weights $\widetilde{w}^{(i)}_j = \prod_k \cdots$

Acknowledgments. The authors thank Ian Grooms for valuable discussions.

REFERENCES

[1] T. Bailey, S. Julier, and G. Agamennoni, On conservative fusion of information with unknown non-Gaussian dependence, in 2012 15th International Conference on Information Fusion, IEEE, 2012, pp. 1876-1883, https://ieeexplore.ieee.org/document/6290529 (accessed 2020-10-26).
[2] Y. Bar-Shalom, H. Chen, and M. Mallick, One-step solution for the multistep out-of-sequence-measurement problem in tracking, IEEE Transactions on Aerospace and Electronic Systems, 40 (2004), pp. 27-37.
[3] Y. Bar-Shalom, P. K. Willett, and X. Tian, Tracking and Data Fusion, vol. 11, YBS Publishing, Storrs, CT, USA, 2011.
[4] J. Dezert, Optimal Bayesian fusion of multiple unreliable classifiers, https://www.onera.fr/sites/default/files/u523/C014-Dezert-Fusion2001.pdf (accessed 2020-10-26).
[5] V. Elvira, L. Martino, D. Luengo, and M. F. Bugallo, Population Monte Carlo schemes with reduced path degeneracy, in 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, 2017, pp. 1-5.
[6] V. Elvira, L. Martino, D. Luengo, M. F. Bugallo, et al., Generalized multiple importance sampling, Statistical Science, 34 (2019), pp. 129-155.
[7] C. Genest, J. V. Zidek, et al., Combining probability distributions: A critique and an annotated bibliography, Statistical Science, 1 (1986), pp. 114-135.
[8] W. R. Gilks and C. Berzuini, Following a moving target—Monte Carlo inference for dynamic Bayesian models, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63 (2001), pp. 127-146.
[9] N. J. Gordon, D. J. Salmond, and A. F. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation, in IEE Proceedings F (Radar and Signal Processing), vol. 140, IET, 1993, pp. 107-113.
[10] J. T. Horwood and A. B. Poore, Gauss von Mises distribution for improved uncertainty realism in space situational awareness, SIAM/ASA Journal on Uncertainty Quantification, 2 (2014), pp. 276-304.
[11] S. J. Julier and J. K. Uhlmann, A non-divergent estimation algorithm in the presence of unknown correlations, in Proceedings of the 1997 American Control Conference (Cat. No. 97CH36041), vol. 4, IEEE, 1997, pp. 2369-2373.
[12] H. J. Kappen, Path integrals and symmetry breaking for optimal control theory, Journal of Statistical Mechanics: Theory and Experiment, 2005 (2005), p. P11011.
[13] K. Law, A. Stuart, and K. Zygalakis, Data Assimilation, Springer, Cham, Switzerland, 2015.
[14] R. P. Mahler, Statistical Multisource-Multitarget Information Fusion, Artech House, Inc., 2007.
[15] J. S. McCabe and K. J. DeMars, Fusion methodologies for orbit determination with distributed sensor networks, in 2018 21st International Conference on Information Fusion (FUSION), IEEE, 2018, pp. 1323-1330.
[16] M. Orton and A. Marrs, A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements, in IEE Target Tracking: Algorithms and Applications (Ref. No. 2001/174), IET, 2002, pp. 15-1.
[17] L. Y. Pao and L. Trailovic, The optimal order of processing sensor information in sequential multisensor fusion algorithms, IEEE Transactions on Automatic Control, 45 (2000), pp. 1532-1536.
[18] C. Robert and G. Casella, Monte Carlo Statistical Methods, Springer Science & Business Media, 2013.
[19] M. D. Schneider and W. A. Dawson, Synthesis of disparate optical imaging data for space domain awareness, arXiv preprint arXiv:1609.07157, (2016), https://amostech.com/TechnicalPapers/2016/Astrodynamics/Schneider.pdf (accessed 2020-10-26).
[20] L. D. Stone, R. L. Streit, T. L. Corwin, and K. L. Bell, Bayesian Multiple Target Tracking, Artech House, 2013, https://us.artechhouse.com/Bayesian-Multiple-Target-Tracking-Second-Edition-P1802.aspx (accessed 2020-10-26).