Multi-Robot Informative Path Planning for Active Sensing of Environmental Phenomena: A Tale of Two Algorithms
Nannan Cao and Kian Hsiang Low
Department of Computer Science, National University of Singapore, Republic of Singapore
{nncao, lowkh}@comp.nus.edu.sg
John M. Dolan
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected]
ABSTRACT
A key problem of robotic environmental sensing and monitoring is that of active sensing: How can a team of robots plan the most informative observation paths to minimize the uncertainty in modeling and predicting an environmental phenomenon? This paper presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria for in situ active sensing of an important broad class of widely-occurring environmental phenomena called anisotropic fields. Our proposed algorithms are novel in addressing a trade-off between active sensing performance and time efficiency. An important practical consequence is that our algorithms can exploit the spatial correlation structure of Gaussian process-based anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance. We analyze the time complexity of our algorithms and prove analytically that they scale better than state-of-the-art algorithms with increasing planning horizon length. We provide theoretical guarantees on the active sensing performance of our algorithms for a class of exploration tasks called transect sampling, which, in particular, can be improved with longer planning time and/or lower spatial correlation along the transect. Empirical evaluation on real-world anisotropic field data shows that our algorithms can perform better or at least as well as the state-of-the-art algorithms while often incurring a few orders of magnitude less computational time, even when the field conditions are less favorable.
Categories and Subject Descriptors
G.3 [Probability and Statistics]: Stochastic processes; I.2.9 [Robotics]: Autonomous vehicles
General Terms
Algorithms, Performance, Experimentation, Theory
Keywords
Multi-robot exploration and mapping, adaptive sampling, active learning, Gaussian process, non-myopic path planning
1. INTRODUCTION
Appears in: Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2013), Ito, Jonker, Gini, and Shehory (eds.), May 6–10, 2013, Saint Paul, Minnesota, USA.

Research in environmental sensing and monitoring has recently gained significant attention and practical interest, especially in supporting environmental sustainability efforts worldwide. A key direction of this research aims at sensing, modeling, and predicting the various types of environmental phenomena spatially distributed over our natural and built-up habitats so as to improve our knowledge and understanding of their economic, environmental, and health impacts and implications. This is non-trivial to achieve due to a trade-off between the quantity of sensing resources (e.g., number of deployed sensors, energy consumption, mission time) and the uncertainty in predictive modeling. In the case of deploying a limited number of mobile robotic sensing assets, such a trade-off motivates the need to plan the most informative resource-constrained observation paths to minimize the uncertainty in modeling and predicting a spatially varying environmental phenomenon, which constitutes the active sensing problem to be addressed in this paper.

A wide multitude of natural and urban environmental phenomena is characterized by spatially correlated field measurements, which raises the following fundamental issue faced by the active sensing problem: How can the spatial correlation structure of an environmental phenomenon be exploited to improve the active sensing performance and computational efficiency of robotic path planning? The works of [11, 12, 13] have tackled this issue specifically in the context of an environmental hotspot field by studying how its spatial correlation structure affects the performance advantage of adaptivity in path planning: If the field is large with a few small hotspots exhibiting extreme measurements and much higher spatial variability than the rest of the field, then adaptivity can provide better active sensing performance.
On the other hand, non-adaptive sampling techniques [2, 8, 14] suffice for smoothly-varying fields.

In this paper, we will investigate the above issue for another important broad class of environmental phenomena called anisotropic fields that exhibit a (often much) higher spatial correlation along one direction than along its perpendicular direction. Such fields occur widely in natural and built-up environments and include (a) ocean and freshwater phenomena like plankton density [6], fish abundance [23], temperature and salinity [22]; (b) soil and atmospheric phenomena like peat thickness [25], surface soil moisture [26], rainfall [18]; (c) mineral deposits like radioactive ore [19]; (d) pollutant and contaminant concentration like air pollutants [1] and heavy metals [16]; and (e) ecological abundance like vegetation density [9].

The geostatistics community has examined a related issue of how the spatial correlation structure of an anisotropic field can be exploited to improve the predictive performance of a sampling design for a static sensor network. To resolve this, the following heuristic design [25] is commonly used for sampling the anisotropic fields described above: Arrange and place the static sensors in a rectangular grid such that one axis of the grid is aligned along the direction of lowest spatial correlation (i.e., highest spatial variability) and the grid spacing along this axis as compared to that along its perpendicular axis is proportional to the ratio of their respective spatial correlations. In the case of path planning for k robots, one may consider the sampling locations of the rectangular grid as cities to be visited in a k-traveling salesman problem so as to minimize the total distance traveled or mission time [15]. However, since the resulting observation paths are constrained by the heuristic sampling design, they are suboptimal in solving the active sensing problem (i.e., minimizing the predictive uncertainty).
This drawback is exacerbated when the robots are capable of sampling at a higher resolution along their paths (e.g., due to high sensor sampling rate) than that of the grid, hence gathering suboptimal observations while traversing between grid locations.

This paper presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria (respectively, Sections 3 and 4) for in situ active sensing of environmental phenomena. In contrast to the existing methods described above, our proposed path planning algorithms are novel in addressing a trade-off between active sensing performance and computational efficiency. An important practical consequence is that our algorithms can exploit the spatial correlation structure of anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance. The specific contributions of our work in this paper include:
• Analyzing the time complexity of our proposed algorithms and proving analytically that they scale better than state-of-the-art information-theoretic path planning algorithms [8, 13] with increasing length of planning horizon (Sections 3 and 4);
• Providing theoretical guarantees on the active sensing performance of our proposed algorithms (Sections 3 and 4) for a class of exploration tasks called the transect sampling task (Section 2.1), which, in particular, can be improved with longer planning time and/or lower spatial correlation along the transect;
• Empirically evaluating the time efficiency and active sensing performance of our proposed algorithms on real-world temperature and plankton density field data (Section 5).
2. BACKGROUND

2.1 Transect Sampling Task
In a transect sampling task [14, 24], a team of k robots is tasked to explore and sample an environmental phenomenon spatially distributed over a transect (Fig. 1) that is discretized into a r × n grid of sampling locations, where the number n of columns is assumed to be much larger than the number r of sampling locations in each column, r is expected to be small in a transect, and k ≤ r. The columns are indexed in an increasing order from left to right. The k robots are constrained to simultaneously explore forward one column at a time from the leftmost column '1' to the rightmost column 'n' such that each robot samples one location per column for a total of n locations. Hence, each robot, given its current location, can move to any of the r locations in the adjacent column on its right.

Figure 1: Transect sampling task with robots on a temperature field (measured in °C) spatially distributed over a transect discretized into a grid of sampling locations (white dots) (Image courtesy of [ ]).

In practice, the transect sampling task is especially appropriate for and widely performed by mobile robots with limited maneuverability (e.g., unmanned aerial vehicles, autonomous surface and underwater vehicles (AUVs) [21]) because it involves less complex path maneuvers that can be achieved more reliably using less sophisticated on-board control algorithms. In terms of practical applicability, transect sampling is a particularly useful exploration task to be performed during the transit from the robot's current location to a distant planned waypoint [10, 24] to collect the most informative observations. For active sensing of ocean and freshwater phenomena, the transect can span a spatial feature of interest such as a harmful algal bloom or pollutant plume to be explored and sampled by a fleet of AUVs being deployed off a ship vessel.

2.2 Gaussian Process-Based Environmental Phenomenon

An environmental phenomenon is defined to vary as a realization of a rich class of Bayesian non-parametric models called the
Gaussian process (GP) [20] that can formally characterize its spatial correlation structure and be refined with an increasing number of observations. More importantly, GP can provide formal measures of predictive uncertainty (e.g., based on an entropy or mutual information criterion) for directing the robots to explore the highly uncertain areas of the phenomenon.

Let D be a set of sampling locations representing the domain of the environmental phenomenon such that each location x ∈ D is associated with a realized (random) measurement z_x (Z_x) if x is sampled/observed (unobserved). Let {Z_x}_{x∈D} denote a GP, that is, every finite subset of {Z_x}_{x∈D} has a multivariate Gaussian distribution [20]. The GP is fully specified by its prior mean μ_x ≜ E[Z_x] and covariance σ_{xx'} ≜ cov[Z_x, Z_{x'}] for all x, x' ∈ D. In the experiments (Section 5), we assume that the GP is second-order stationary, i.e., it has a constant prior mean and a stationary prior covariance structure (i.e., σ_{xx'} is a function of x − x' for all x, x' ∈ D), both of which are assumed to be known. In particular, its covariance structure is defined by the widely-used squared exponential covariance function

σ_{xx'} ≜ σ_s² exp{ −(1/2)(x − x')ᵀ M⁻²(x − x') } + σ_n² δ_{xx'}   (1)

where σ_s² and σ_n² are, respectively, the signal and noise variances controlling the intensity and noise of the measurements, M is a diagonal matrix with length-scale components ℓ₁ and ℓ₂ controlling the degree of spatial correlation or "similarity" between measurements along (i.e., horizontal direction) and perpendicular to (i.e., vertical direction) the transect, respectively, and δ_{xx'} is a Kronecker delta of value 1 if x = x', and 0 otherwise. For anisotropic fields, ℓ₁ ≠ ℓ₂.

An advantage of using GP to model the environmental phenomenon is its probabilistic regression capability: Given a vector s of sampled locations and a column vector z_s of corresponding measurements, the joint distribution of the measurements at any vector u of κ unobserved locations remains Gaussian with the following posterior mean vector and covariance matrix

μ_{u|s} = μ_u + Σ_{us} Σ_{ss}⁻¹ (z_s − μ_s)   (2)
Σ_{uu|s} = Σ_{uu} − Σ_{us} Σ_{ss}⁻¹ Σ_{su}   (3)

where μ_u (μ_s) is a column vector with mean components μ_x for every location x of u (s), Σ_{us} (Σ_{ss}) is a covariance matrix with covariance components σ_{xx'} for every pair of locations x of u (s) and x' of s, and Σ_{su} is the transpose of Σ_{us}. The posterior mean vector μ_{u|s} (2) is used to predict the measurements at vector u of κ unobserved locations. The uncertainty of these predictions can be quantified using the posterior covariance matrix Σ_{uu|s} (3), which is independent of the measurements z_s, in two ways: (a) the trace of Σ_{uu|s} yields the sum of posterior variances Σ_{xx|s} over every location x of u; (b) the determinant of Σ_{uu|s} is used in calculating the Gaussian posterior joint entropy

H(Z_u | Z_s) ≜ (1/2) log( (2πe)^κ |Σ_{uu|s}| ).   (4)

Unlike the first measure of predictive uncertainty, which assumes conditional independence between measurements at vector u of unobserved locations, the entropy-based measure (4) accounts for their correlation, thereby not overestimating their uncertainty. Hence, we will focus on using the entropy-based measure of uncertainty in this paper.
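As a concrete illustration of (1)–(4), the following sketch (assuming numpy; the function names and toy hyperparameter defaults are our own, not from the paper) computes the squared exponential covariance, the GP posterior, and the Gaussian posterior joint entropy:

```python
import numpy as np

def sq_exp_cov(X1, X2, ls=(40.45, 16.0), sig_s2=1.0, sig_n2=0.01):
    # Squared exponential covariance (1): sig_s2 * exp(-0.5 d^T M^-2 d) + sig_n2 * delta.
    Minv2 = np.diag(1.0 / np.asarray(ls, float) ** 2)
    d = X1[:, None, :] - X2[None, :, :]
    K = sig_s2 * np.exp(-0.5 * np.einsum('ijk,kl,ijl->ij', d, Minv2, d))
    return K + sig_n2 * np.all(d == 0, axis=2)  # Kronecker delta term

def gp_posterior(Xu, Xs, zs, mu=0.0, **kw):
    # Posterior mean (2) and covariance (3) at unobserved Xu given samples (Xs, zs).
    Kss = sq_exp_cov(Xs, Xs, **kw)
    Kus = sq_exp_cov(Xu, Xs, **kw)
    Kuu = sq_exp_cov(Xu, Xu, **kw)
    mu_post = mu + Kus @ np.linalg.solve(Kss, zs - mu)
    cov_post = Kuu - Kus @ np.linalg.solve(Kss, Kus.T)
    return mu_post, cov_post

def joint_entropy(cov):
    # Gaussian posterior joint entropy (4): 0.5 * log((2*pi*e)^kappa * |cov|).
    kappa = cov.shape[0]
    return 0.5 * (kappa * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])
```

Note that the posterior covariance, and hence the entropy, does not depend on the sampled values z_s, mirroring the remark after (3).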
3. ENTROPY-BASED PATH PLANNING
Notations.
Each planning stage i is associated with column i of the transect for i = 1, …, n. In each stage i, the team of k robots samples from column i a total of k observations (each of which comprises a pair of a location and its measurement) that are denoted by a pair of vectors x_i of k locations and Z_{x_i} of the corresponding random measurements. Let X_i denote the set of all possible robots' sampling locations x_i in stage i. It can be observed that χ ≜ |X_1| = … = |X_n| = C(r, k). We assume that the robots can deterministically (i.e., no stochasticity in motion) move from their current locations x_{i−1} in column i−1 to any locations x_i in column i. Let x_{i:j} and Z_{x_{i:j}} denote vectors concatenating the robots' sampling locations x_i, …, x_j and concatenating corresponding random measurements Z_{x_i}, …, Z_{x_j} over stages i to j, respectively, and X_{i:j} denote the set of all possible x_{i:j}.

Maximum Entropy Path Planning (MEPP).
The work of [13] has proposed planning non-myopic observation paths x*_{1:n} with maximum entropy (i.e., highest uncertainty):

x*_{1:n} = arg max_{x_{1:n} ∈ X_{1:n}} H(Z_{x_{1:n}})   (5)

that, as proven in an equivalence result, minimize the posterior entropy/uncertainty remaining in the unobserved locations of the transect. Computing the maximum entropy paths x*_{1:n} incurs O(χⁿ(kn)³) time, which is exponential in the length n of the planning horizon. To mitigate this computational difficulty, an anytime heuristic search algorithm [7] is used to compute (5) approximately. However, its performance cannot be guaranteed. Furthermore, as reported in [14], when χ or n is large, its computed paths perform poorly even after incurring a huge amount of search time and space.

Approximate MEPP(m). To establish a trade-off between active sensing performance and computational efficiency, the key idea is to exploit a property of the covariance function (1): the spatial correlation of measurements between any two locations decreases exponentially with increasing distance between them. Intuitively, such a property makes the measurements Z_{x_i} to be observed next in column i near-independent of the past distant measurements Z_{x_{1:i−m−1}} observed from columns 1 to i−m−1 (i.e., distant from column i) for a sufficiently large m by conditioning on the closer measurements Z_{x_{i−m:i−1}} observed in columns i−m to i−1 (i.e., close to column i). Consequently, H(Z_{x_i} | Z_{x_{1:i−1}}) can still be closely approximated by H(Z_{x_i} | Z_{x_{i−m:i−1}}) after assuming an m-th order Markov property, thus yielding the following approximation of the joint entropy H(Z_{x_{1:n}}) in (5):

H(Z_{x_{1:n}}) = H(Z_{x_{1:m}}) + Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{1:i−1}})
              ≈ H(Z_{x_{1:m}}) + Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{i−m:i−1}}).   (6)

The first equality is due to the chain rule for entropy [3]. Using (6), MEPP (5) can be approximated by the following stage-wise dynamic programming equations, which we call MEPP(m):

V_i(x_{i−m:i−1}) = max_{x_i ∈ X_i} H(Z_{x_i} | Z_{x_{i−m:i−1}}) + V_{i+1}(x_{i−m+1:i})
V_n(x_{n−m:n−1}) = max_{x_n ∈ X_n} H(Z_{x_n} | Z_{x_{n−m:n−1}})   (7)

for stage i = m+1, …, n−1, each of which induces a corresponding optimal vector x^E_i of k locations given the optimal vector x^E_{i−m:i−1} obtained from the previous stages i−m to i−1. Let the optimal observation paths of MEPP(m) be denoted by x^E_{1:n} that concatenates

x^E_{1:m} = arg max_{x_{1:m} ∈ X_{1:m}} H(Z_{x_{1:m}}) + V_{m+1}(x_{1:m})   (8)

for the first m stages and x^E_{m+1}, …, x^E_n derived using (7) for the subsequent stages m+1 to n. Our proposed MEPP(m) algorithm generalizes that of [14], which is essentially MEPP(1).

Theorem 1 (Time Complexity).
Deriving x^E_{1:n} of MEPP(m) requires O(χ^{m+1}[n + (km)³]) time.

The proof of Theorem 1 is given in Appendix A.1. Unlike MEPP, which scales exponentially in the planning horizon length n, our MEPP(m) algorithm scales linearly in n.

Let ω₁ and ω₂ be the horizontal and vertical separation widths between adjacent grid locations, respectively, ℓ'₁ ≜ ℓ₁/ω₁ and ℓ'₂ ≜ ℓ₂/ω₂ denote the normalized horizontal and vertical length-scale components, respectively, and η ≜ σ_n²/σ_s². The following result bounds the loss in active sensing performance of the MEPP(m) algorithm (i.e., (7) and (8)) relative to that of MEPP (5):

Theorem 2 (Performance Guarantee).
The paths x^E_{1:n} are ϵ-optimal in achieving the maximum entropy criterion, i.e., H(Z_{x*_{1:n}}) − H(Z_{x^E_{1:n}}) ≤ ϵ where

ϵ ≜ [k(n − m)] log( 1 + exp{−(m + 1)²/(2ℓ'₁²)} / (η(1 + η)) ).

The proof of Theorem 2 is given in Appendix A.3. Theorem 2 reveals that the active sensing performance of MEPP(m)¹ can be improved by decreasing ϵ, which is achieved using a higher noise-to-signal ratio η (i.e., noisy, less intense fields), smaller number k of robots, shorter planning horizon length n, larger m, and/or lower spatial correlation ℓ'₁ along the transect. Two important implications result: (a) increasing m trades off computational efficiency (Theorem 1) for better active sensing performance, and (b) if the spatial correlation of the anisotropic field along the transect is sufficiently low to maintain a relatively tight bound ϵ such that only a small m is needed, then MEPP(m) can exploit this spatial correlation structure to gain time efficiency while preserving near-optimal active sensing performance. In practice, it is often possible to obtain prior knowledge of a direction of low spatial correlation (refer to the ocean and freshwater phenomena in Section 1 for examples) and align it with the horizontal axis of the transect.

¹ In fact, solving MEPP(m) (7) yields a policy that, in each stage i, induces an optimal vector for every possible vector x_{i−m:i−1} (including possible diverged paths from x^E_{i−m:i−1} due to external forces) obtained from the previous m stages.
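The stage-wise recursion (7)–(8) can be prototyped compactly. The sketch below is our own illustrative code, not the authors' implementation: `cond_entropy` is a hypothetical callback standing in for the GP conditional entropy H(Z_{x_i} | Z_{x_{i−m:i−1}}), and for brevity the first m stages of (8) are folded into the same recursion by letting the callback condition on however much history exists. It enumerates the χ = C(r, k) selections per column and memoizes the value-to-go on the last m columns' selections:

```python
from itertools import combinations
from functools import lru_cache

def mepp_m(n, r, k, m, cond_entropy):
    # MEPP(m)-style stage-wise dynamic programming, cf. (7)-(8).
    # cond_entropy(i, xi, hist): entropy of the column-i selection xi (a k-tuple
    # of row indices) given hist, the tuple of selections from the last m columns.
    choices = list(combinations(range(r), k))  # all chi = C(r, k) per-column selections

    @lru_cache(maxsize=None)
    def V(i, hist):
        # Value-to-go from stage i onward; hist holds at most the last m selections.
        if i > n:
            return 0.0, ()
        best = (-float('inf'), ())
        for xi in choices:
            value, tail = V(i + 1, (hist + (xi,))[-m:])
            best = max(best, (cond_entropy(i, xi, hist) + value, (xi,) + tail))
        return best

    total, path = V(1, ())
    return total, list(path)
```

The memoization key mirrors Theorem 1: there are O(χ^m) distinct histories per stage and χ actions each, giving the χ^{m+1} factor, with the horizon contributing only linearly.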
4. MUTUAL INFORMATION-BASED PATH PLANNING
Notations.
Recall that the team of k robots selects k locations x_i to be sampled from column i of the transect for i = 1, …, n. Let u_i denote a vector of the remaining r − k unobserved locations in column i and Z_{u_i} denote a vector of the corresponding random measurements. Let u_{i:j} and Z_{u_{i:j}} denote vectors concatenating the remaining unobserved locations u_i, …, u_j and concatenating corresponding random measurements Z_{u_i}, …, Z_{u_j} over stages i to j, respectively.

Maximum Mutual Information Path Planning (M²IPP).
An alternative to MEPP is to plan non-myopic observation paths x⋆_{1:n} that share the maximum mutual information with the remaining unobserved locations u⋆_{1:n} of the transect:

x⋆_{1:n} = arg max_{x_{1:n} ∈ X_{1:n}} I(Z_{x_{1:n}}; Z_{u_{1:n}})
I(Z_{x_{1:n}}; Z_{u_{1:n}}) ≜ H(Z_{u_{1:n}}) − H(Z_{u_{1:n}} | Z_{x_{1:n}}).   (9)

From (9), I(Z_{x_{1:n}}; Z_{u_{1:n}}) measures the reduction in entropy/uncertainty of the measurements Z_{u_{1:n}} at the remaining unobserved locations u_{1:n} of the transect by observing the measurements Z_{x_{1:n}} to be sampled along the paths x_{1:n}. So, the path planning of M²IPP (9) is equivalent to the selection of remaining unobserved locations with the largest entropy reduction (i.e., determining u⋆_{1:n}). This may be mistakenly perceived as the selection of remaining unobserved locations with the lowest uncertainty (i.e., minimizing the posterior entropy term H(Z_{u_{1:n}} | Z_{x_{1:n}}) in (9)), which is exactly what the path planning of MEPP (5) achieves, as mentioned in Section 3. Note, however, that the maximum mutual information paths (9) planned by M²IPP can in fact induce a very large prior entropy H(Z_{u_{1:n}}) but not necessarily the smallest posterior entropy H(Z_{u_{1:n}} | Z_{x_{1:n}}). Consequently, MEPP and M²IPP exhibit different path planning behaviors and resulting active sensing performances, as shown empirically in Section 5.

Similar to MEPP, M²IPP incurs exponential time in the length of the planning horizon. To relieve this computational burden, we will describe an approximation algorithm for planning maximum mutual information paths next.
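Since both terms in (9) are Gaussian entropies, the mutual information objective can be evaluated directly from a joint covariance matrix over the observed and unobserved locations. A minimal sketch (assuming numpy; `gaussian_mi` is our own helper, not from the paper):

```python
import numpy as np

def gaussian_entropy(cov):
    # H(Z) = 0.5 * log((2*pi*e)^kappa * |cov|) for a Gaussian with covariance cov.
    kappa = cov.shape[0]
    return 0.5 * (kappa * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])

def gaussian_mi(cov, idx_a, idx_b):
    # I(Z_A; Z_B) = H(Z_A) + H(Z_B) - H(Z_A, Z_B), which matches (9) since
    # H(Z_u | Z_x) = H(Z_u, Z_x) - H(Z_x) for jointly Gaussian measurements.
    H = lambda idx: gaussian_entropy(cov[np.ix_(idx, idx)])
    return H(idx_a) + H(idx_b) - H(list(idx_a) + list(idx_b))
```

For perfectly uncorrelated locations the mutual information is zero, and it grows with the correlation between the two sets, in line with the interpretation of (9) as the entropy reduction at the unobserved locations.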
Approximate M²IPP(m). We will exploit the same property of the covariance function (1) as that used by MEPP(m) (Section 3) to establish a trade-off between active sensing performance and computational efficiency for our M²IPP(m) algorithm. However, this is not as straightforward to achieve as that to derive MEPP(m), where an m-th order Markov property can simply be imposed on each posterior entropy term in (6). To illustrate this, using the chain rule for mutual information [3],

I(Z_{x_{1:n}}; Z_{u_{1:n}}) = I(Z_{x_{1:m}}; Z_{u_{1:n}}) + Σ_{i=m+1}^{n−m−1} I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{1:i−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{1:n}} | Z_{x_{1:n−m−1}}),

after which an m-th order Markov property is assumed to yield the following approximation:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) ≈ I(Z_{x_{1:m}}; Z_{u_{1:n}}) + Σ_{i=m+1}^{n−m−1} I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{i−m:i−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{1:n}} | Z_{x_{n−2m:n−m−1}}).   (10)

From (10), note that each conditional mutual information term I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{i−m:i−1}}) cannot be evaluated individually because the remaining unobserved locations u_{1:n} of the transect (specifically, u_{1:i−m−1} and u_{i+1:n} in the respective columns 1 to i−m−1 and i+1 to n) cannot be determined simply by knowing the robots' past and current sampling locations x_{i−m:i−1} and x_i in columns i−m to i.

To resolve this, we exploit the same property of the covariance function (1) as that used by MEPP(m) (Section 3) again: It makes the measurements Z_{x_i} to be observed next in column i near-independent of the distant unobserved measurements Z_{u_{1:i−m−1}} and Z_{u_{i+m+1:n}} in the respective columns 1 to i−m−1 and i+m+1 to n (i.e., far from column i) for a sufficiently large m by conditioning on the closer measurements Z_{x_{i−m:i−1}} and Z_{u_{i−m:i+m}} in columns i−m to i+m (i.e., closer to column i). As a result, each term I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{i−m:i−1}}) in (10) can be closely approximated by I(Z_{x_i}; Z_{u_{i−m:i+m}} | Z_{x_{i−m:i−1}}) for i = m+1, …, n−m−1:

I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{i−m:i−1}})
= H(Z_{x_i} | Z_{x_{i−m:i−1}}) − H(Z_{x_i} | Z_{x_{i−m:i−1}}, Z_{u_{1:n}})
≈ H(Z_{x_i} | Z_{x_{i−m:i−1}}) − H(Z_{x_i} | Z_{x_{i−m:i−1}}, Z_{u_{i−m:i+m}})
= I(Z_{x_i}; Z_{u_{i−m:i+m}} | Z_{x_{i−m:i−1}})

where the approximation follows from the above-mentioned conditional independence assumption and the equalities are due to the definition of conditional mutual information [3]. Similarly, I(Z_{x_{1:m}}; Z_{u_{1:n}}) and I(Z_{x_{n−m:n}}; Z_{u_{1:n}} | Z_{x_{n−2m:n−m−1}}) in (10) are, respectively, approximated by I(Z_{x_{1:m}}; Z_{u_{1:2m}}) and I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}}). Then,

I(Z_{x_{1:n}}; Z_{u_{1:n}}) ≈ I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + Σ_{i=m+1}^{n−m−1} I(Z_{x_i}; Z_{u_{i−m:i+m}} | Z_{x_{i−m:i−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}})
= I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + Σ_{i=2m+1}^{n−1} I(Z_{x_{i−m}}; Z_{u_{i−2m:i}} | Z_{x_{i−2m:i−m−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}}).   (11)

Using (11), M²IPP (9) can be approximated by the following stage-wise dynamic programming equations, which we call M²IPP(m):

U_i(x_{i−2m:i−1}) = max_{x_i ∈ X_i} I(Z_{x_{i−m}}; Z_{u_{i−2m:i}} | Z_{x_{i−2m:i−m−1}}) + U_{i+1}(x_{i−2m+1:i})
U_n(x_{n−2m:n−1}) = max_{x_n ∈ X_n} I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}})   (12)

for stage i = 2m+1, …, n−1, each of which induces a corresponding optimal vector x^M_i of k locations given the optimal vector x^M_{i−2m:i−1} obtained from the previous stages i−2m to i−1. Note that the term I(Z_{x_{i−m}}; Z_{u_{i−2m:i}} | Z_{x_{i−2m:i−m−1}}) in each stage i can now be evaluated because the remaining unobserved locations u_{i−2m:i} in columns i−2m to i can be determined since the robots' past and current sampling locations x_{i−2m:i−1} and x_i in the same columns are given (i.e., as input to U_i and under the max operator, respectively). Let the optimal observation paths of M²IPP(m) be denoted by x^M_{1:n} that concatenates

x^M_{1:2m} = arg max_{x_{1:2m} ∈ X_{1:2m}} I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + U_{2m+1}(x_{1:2m})   (13)

for the first 2m stages and x^M_{2m+1}, …, x^M_n derived using (12) for the subsequent stages 2m+1 to n.

Theorem 3 (Time Complexity).
Deriving x^M_{1:n} of M²IPP(m) requires O(χ^{2m+1}[n + 2(r(2m+1))³]) time.

The proof of Theorem 3 is given in Appendix B.1. Unlike M²IPP, which scales exponentially in the planning horizon length n, our M²IPP(m) algorithm scales linearly in n. The following result bounds the loss in active sensing performance of the M²IPP(m) algorithm (i.e., (12) and (13)) relative to that of M²IPP (9):
Theorem 4 (Performance Guarantee).
The paths x^M_{1:n} are ε-optimal in achieving the maximum mutual information criterion, i.e., I(Z_{x⋆_{1:n}}; Z_{u⋆_{1:n}}) − I(Z_{x^M_{1:n}}; Z_{u^M_{1:n}}) ≤ ε where

ε ≜ k(n − m) [ rn + (1/2) k(n − m) ] log( 1 + exp{−(m + 1)²/(2ℓ'₁²)} / (η(1 + η)) ).

The proof of Theorem 4 is given in Appendix B.3. As shown in Theorem 4, decreasing ε improves the active sensing performance of M²IPP(m); this can be achieved in a similar manner to that for decreasing the loss bound ϵ of MEPP(m) (see the paragraph after Theorem 2) since the two loss bounds ε and ϵ are similar. In addition, a smaller number r of sampling locations in each column decreases ε. M²IPP(m) shares the same implications as those of MEPP(m): (a) increasing m trades off time efficiency (Theorem 3) for improved active sensing performance, and (b) M²IPP(m) can exploit a low spatial correlation ℓ'₁ of the anisotropic field along the transect to improve time efficiency (i.e., only requiring a small m) while preserving near-optimal active sensing performance (i.e., still maintaining a relatively tight bound ε).
5. EXPERIMENTS AND DISCUSSION
This section evaluates the active sensing performance and computational efficiency of the MEPP(m) (i.e., (7) and (8)) and M²IPP(m)² (i.e., (12) and (13)) algorithms empirically on two real-world datasets: (a) May 2009 temperature field data of Panther Hollow Lake in Pittsburgh, PA, spatially distributed over a 25 m by 150 m transect that is discretized into a 5 × 30 grid [17], and (b) June 2009 plankton density field data of Chesapeake Bay spatially distributed over a 314 m by 1765 m transect that is discretized into a 8 × 45 grid [5]. These environmental phenomena are modeled by GPs with hyperparameters (i.e., horizontal and vertical length-scales, signal and noise variances) (Section 2.2) learned using maximum likelihood estimation (MLE) [20]: (a) ℓ₁ = 40.45 m, ℓ₂ = 16.00 m, σ_s² = 0.…, σ_n² = 0.… for the temperature field; (b) ℓ₁ = 27.53 m, ℓ₂ = 134.64 m, σ_s² = 2.…, σ_n² = 0.041 for the plankton density field. It can be observed that the temperature and plankton density fields have low noise-to-signal ratios η of 0.023 and 0.0…, respectively; hence, larger values of m are expected to be needed by MEPP(m) and M²IPP(m) under these less favorable field conditions. To further investigate our algorithms' trade-off behaviors under different horizontal and vertical spatial correlations, the corresponding length-scales ℓ₁ and ℓ₂ of the original temperature field (Fig. 2d) are reduced and fixed to produce three other modified fields (Figs. 2a, 2b, 2c) with the signal and noise variances σ_s² and σ_n² learned using MLE.

² Similar to MEPP(m), solving M²IPP(m) (12) yields a policy that, in each stage i, induces an optimal vector for every possible vector x_{i−2m:i−1} (including possible diverged paths from x^M_{i−2m:i−1}) obtained from the previous 2m stages.

Figure 2: Temperature fields (measured in °C) discretized into 5 × 30 grids with varying horizontal and vertical length-scales: (a) ℓ₁ = 5 m, ℓ₂ = 5 m; (b) ℓ₁ = 5 m, ℓ₂ = 16 m; (c) ℓ₁ = 40.45 m, ℓ₂ = 5 m; (d) ℓ₁ = 40.45 m, ℓ₂ = 16 m.

Comparison with Active Sensing Algorithms.
The performance of our proposed algorithms is compared to that of state-of-the-art information-theoretic path planning algorithms for active sensing. The work of [13] has proposed the following greedy maximum entropy path planning (gMEPP) algorithm:

V^g_i(x_{1:i−1}) = max_{x_i ∈ X_i} H(Z_{x_i} | Z_{x_{1:i−1}})   (14)

for stage i = 1, …, n, each of which induces a corresponding optimal vector x^E_i of k locations given the optimal vector x^E_{1:i−1} obtained from the previous stages 1 to i−1. A greedy maximum mutual information path planning (gM²IPP) algorithm is devised by [8] as follows:

U^g_i(x_{1:i−1}) = max_{x_i ∈ X_i} I(Z_{x_i}; Z_{x̄_i})   (15)

for stage i = 1, …, n, each of which induces a corresponding optimal vector x^M_i of k locations given the optimal vector x^M_{1:i−1} obtained from the previous stages 1 to i−1, where x̄_i denotes a vector of all sampling locations in the domain D excluding those of x_i. As mentioned earlier in Section 3, the work of [14] has developed MEPP(1), which is a special case of our MEPP(m) algorithm.

In contrast to our MEPP(m) and M²IPP(m) algorithms that scale linearly in the length n of the planning horizon (Theorems 1 and 3), deriving x^E_{1:n} of gMEPP and x^M_{1:n} of gM²IPP incurs quartic time in n. Hence, if the required value of m is sufficiently small, then MEPP(m) and M²IPP(m) can be more efficient than the greedy algorithms, as shown below.
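The greedy recursion (14) admits a one-loop implementation. A sketch (our own illustrative code; `cond_entropy(i, xi, past)` is a hypothetical callback returning the GP conditional entropy H(Z_{x_i} | Z_{x_{1:i−1}})):

```python
from itertools import combinations

def greedy_mepp(n, r, k, cond_entropy):
    # gMEPP (14): at each stage, pick the k-subset of rows maximizing the
    # conditional entropy given ALL selections made in previous stages.
    path = []
    for i in range(1, n + 1):
        best = max(combinations(range(r), k),
                   key=lambda xi: cond_entropy(i, xi, tuple(path)))
        path.append(best)
    return path
```

Because each stage conditions on the entire history rather than a bounded window, the per-stage entropy evaluations grow with i, which is what drives the quartic-in-n cost noted above.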
The tested algorithms are evaluated using three different metrics: the (a) entropy metric EN$(x_n) \triangleq H(Z_{u_n}\mid Z_{x_n})$ and (b) mutual information metric MI$(x_n) \triangleq I(Z_{x_n}; Z_{u_n})$ measure, respectively, the posterior entropy/uncertainty and the reduction in entropy/uncertainty at the remaining unobserved locations $u_n$ of the transect given the observation paths $x_n$. The difference between the entropy and mutual information metrics has been explained in the paragraph after (9) in Section 4. The (c) ER$(x_n) \triangleq \|z_{u_n} - \mu_{u_n|x_n}\|^2/\{\bar{\mu}^2\, n(r-k)\}$ metric measures the mean-squared relative prediction error resulting from using the posterior mean $\mu_{u_n|x_n}$ (2) to predict the measurements at the remaining $n(r-k)$ unobserved locations $u_n$ of the transect given the measurements sampled along the observation paths $x_n$, where $\bar{\mu} = \mathbf{1}^\top z_{u_n}/\{n(r-k)\}$. It has an advantage over the two information-theoretic metrics of using ground truth measurements to evaluate whether the phenomenon is being predicted accurately. However, unlike the EN$(x_n)$ and MI$(x_n)$ metrics that account for the spatial correlation between measurements at the unobserved locations $u_n$, the ER$(x_n)$ metric assumes conditional independence between them. In contrast to the ER$(x_n)$ metric, the EN$(x_n)$ and MI$(x_n)$ metrics consequently do not overestimate their uncertainty.

Table 1: Comparison of EN(x_n), MI(x_n), and ER(x_n) (×10^- ) performance for the different temperature fields shown in Fig. 2 with varying number of robots. For the proposed M2IPP(m) and MEPP(m) algorithms, one row is reported per value of m used; a blank cell indicates that the performance in that column has already stabilized at a smaller m.

EN(x_n), 1 robot:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]    -64.4    -123.9    -173.3    -182.2
gMEPP:  x^E_n [13]   -64.8    -128.4    -173.3    -182.4
M2IPP(1): x^M_n      -64.5    -123.9    -167.2    -182.0
M2IPP(2)                                -173.2
MEPP(1): x^E_n       -64.8    -128.4    -161.2    -180.4
MEPP(2)              -64.9              -167.2    -182.4
MEPP(3)                                 -171.6
MEPP(4)                                 -173.4

MI(x_n), 1 robot:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]     27.9     48.4      46.0      39.5
gMEPP:  x^E_n [13]    26.5     44.7      46.0      39.5
M2IPP(1): x^M_n       27.9     48.4      39.6      39.4
M2IPP(2)                                 45.8
MEPP(1): x^E_n        23.9     44.7      33.2      36.9
MEPP(2)               26.3               39.6      39.5
MEPP(3)                                  44.2
MEPP(4)                                  46.1

ER(x_n), 1 robot:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]     1.764    0.581     0.088     0.042
gMEPP:  x^E_n [13]    2.792    0.572     0.077     0.037
M2IPP(1): x^M_n       1.764    0.581     0.488     0.049
M2IPP(2)                                 0.110     0.042
M2IPP(3)                                 0.034
MEPP(1): x^E_n        5.115    0.572     3.765     0.757
MEPP(2)               2.315              0.501     0.026
MEPP(3)               2.080              0.241
MEPP(4)                                  0.068

EN(x_n), 2 robots:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]    -57.8    -100.5    -132.9    -138.0
gMEPP:  x^E_n [13]   -59.8    -112.2    -132.9    -138.8
M2IPP(1): x^M_n      -57.8    -100.5    -132.9    -138.2
MEPP(1): x^E_n       -59.8    -113.0    -129.3    -138.4
MEPP(2)              -60.0              -132.9

MI(x_n), 2 robots:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]     41.7     62.0      45.8      36.9
gMEPP:  x^E_n [13]    41.2     55.8      45.9      36.2
M2IPP(1): x^M_n       41.2     62.0      45.9      36.9
M2IPP(2)              41.8
MEPP(1): x^E_n        41.6     56.4      41.8      36.9
MEPP(2)                                  45.9

ER(x_n), 2 robots:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]     1.153    0.265     0.019     0.016
gMEPP:  x^E_n [13]    0.521    0.439     0.033     0.018
M2IPP(1): x^M_n       0.605    0.265     0.020     0.018
M2IPP(2)                                 0.014
MEPP(1): x^E_n        0.662    0.378     0.286     0.012
MEPP(2)                                  0.018

EN(x_n), 3 robots:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]    -46.5    -80.5     -89.5     -92.8
gMEPP:  x^E_n [13]   -46.3    -80.6     -89.5     -93.2
M2IPP(1): x^M_n      -46.5    -72.0     -89.4     -92.1
M2IPP(2)                                -89.5
MEPP(1): x^E_n       -45.9    -81.3     -89.4     -93.5
MEPP(2)              -46.5

MI(x_n), 3 robots:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]     40.8     61.3      41.4      31.6
gMEPP:  x^E_n [13]    40.5     60.6      41.3      28.6
M2IPP(1): x^M_n       40.8     60.0      38.8      32.0
M2IPP(2)                                 41.3
MEPP(1): x^E_n        40.2     61.6      38.7      28.2
MEPP(2)               40.8
MEPP(3)                                            28.6
MEPP(4)                                  41.1      29.0

ER(x_n), 3 robots:
Algorithm             a        b         c         d
gM2IPP: x^M_n [8]     0.272    0.012     0.018     0.008
gMEPP:  x^E_n [13]    0.257    0.024     0.017     0.009
M2IPP(1): x^M_n       0.272    0.123     0.016     0.008
M2IPP(2)              0.229    0.014
MEPP(1): x^E_n        0.231    0.014     0.013     0.007

Table 1 shows the EN(x_n), MI(x_n), and ER(x_n) performance of the tested algorithms for temperature fields with different horizontal and vertical length-scales (Fig. 2) and with a varying number of robots. For our proposed M2IPP(m) and MEPP(m) algorithms, the results are reported in increasing order of m until the performance has stabilized.
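For concreteness, the three metrics can be computed directly from the GP posterior, as in the following sketch. This is a simplified, hypothetical setup (zero-mean GP, anisotropic squared-exponential covariance, toy length-scales and noise variance), not the exact field model used in the experiments:

```python
import numpy as np

def sq_exp_cov(X1, X2, sigma_s2=1.0, ls=(2.0, 0.5)):
    # Anisotropic squared-exponential covariance (hypothetical length-scales).
    d = (X1[:, None, :] - X2[None, :, :]) / np.asarray(ls)
    return sigma_s2 * np.exp(-0.5 * np.sum(d * d, axis=-1))

def en_mi_er(X_obs, z_obs, X_un, z_un, sigma_n2=0.1):
    # GP posterior of the unobserved measurements Z_u given noisy samples Z_x.
    Koo = sq_exp_cov(X_obs, X_obs) + sigma_n2 * np.eye(len(X_obs))
    Kuu = sq_exp_cov(X_un, X_un) + sigma_n2 * np.eye(len(X_un))
    Kuo = sq_exp_cov(X_un, X_obs)
    mu = Kuo @ np.linalg.solve(Koo, z_obs)            # posterior mean
    Kpost = Kuu - Kuo @ np.linalg.solve(Koo, Kuo.T)   # posterior covariance
    # EN: Gaussian posterior entropy H(Z_u | Z_x).
    EN = 0.5 * np.linalg.slogdet(2 * np.pi * np.e * Kpost)[1]
    # MI: I(Z_x; Z_u) = H(Z_u) - H(Z_u | Z_x).
    MI = 0.5 * np.linalg.slogdet(2 * np.pi * np.e * Kuu)[1] - EN
    # ER: mean-squared relative prediction error against ground truth z_u.
    ER = np.mean((z_un - mu) ** 2) / np.mean(z_un) ** 2
    return EN, MI, ER
```

For a fixed set of unobserved locations, conditioning on more observations can only lower EN and raise MI (the "information never hurts" property for Gaussians), which is the monotonicity the appendix proofs rely on.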
It can be observed from Table 1 that MEPP(m) with m > 1 and M2IPP(m) often outperform MEPP(1) [14] in the three metrics, as discussed and explained later. Note that every increment of m increases the length of the history of sampling locations considered in each stage by two for M2IPP(m) instead of by one for MEPP(m); this can be seen from the inputs to U_i (12) and V_i (7), respectively. The observations on the results are detailed in the rest of this subsection.

EN(x_n)

As expected, the entropy-based MEPP(m) and gMEPP algorithms generally perform better than or at least as well as the mutual information-based M2IPP(m) and gM2IPP algorithms in this metric.

[Figure 3: Graphs of time incurred by the different active sensing algorithms vs. m for the temperature fields with varying number of robots: (a) 1 robot; (b) 2 robots; (c) 3 robots.]

For fields a, b, and d (i.e., of small ℓ'₁ or large ℓ'₂) with any number of robots, MEPP(m) can produce EN(x^E_n) values lower than or comparable to those achieved by gMEPP and gM2IPP using small values of m (i.e., m = 1 or 2), hence incurring 1 to 4 orders of magnitude less computational time, as shown in Fig. 3. This can be explained by one of the following reasons: (a) a low spatial correlation along the transect cannot be exploited by gMEPP and gM2IPP, which consider the entire history of past measurements to improve active sensing performance; (b) a high correlation perpendicular to the transect can be exploited by MEPP(m) for better active sensing performance; and (c) unlike the greedy gMEPP and gM2IPP algorithms, MEPP(m) is capable of non-myopic planning to improve active sensing performance.

For field c (i.e., of large ℓ'₁ and small ℓ'₂) with 1 robot, MEPP(m) cannot exploit the low spatial correlation perpendicular to the transect for improving active sensing performance.
Therefore, it needs to raise the value of m up to 4 in order to better exploit the high spatial correlation along the transect. Consequently, MEPP(m) can achieve EN(x^E_n) performance comparable to that achieved by gMEPP and gM2IPP while incurring computational time similar to gMEPP and about 2 orders of magnitude less time than gM2IPP. Increasing the number of robots allows MEPP(m) to achieve EN(x^E_n) performance comparable to that of gMEPP and gM2IPP using smaller values of m (i.e., m = 1 or 2), hence incurring 1 to 4 orders of magnitude less time.

[Figure 4: Plankton density (chl-a) field (measured in mg m^-3) discretized into a ... × ... grid.]

MI(x_n)

The mutual information-based M2IPP(m) and gM2IPP algorithms often perform better than or at least as well as the entropy-based MEPP(m) and gMEPP algorithms in this metric.

For fields a, b, and d (i.e., of small ℓ'₁ or large ℓ'₂) with any number of robots, M2IPP(m) can generally yield MI(x^M_n) values higher than or comparable to those achieved by gM2IPP and gMEPP using a small m value of 1, hence incurring less computational time (in particular, about 2 orders of magnitude less time than gM2IPP), as shown in Fig. 3. This can be explained by the same reasons as those discussed previously in Section 5.1.1.

For field c (i.e., of large ℓ'₁ and small ℓ'₂) with 1 or 3 robots, M2IPP(m) cannot exploit the low spatial correlation perpendicular to the transect for improving active sensing performance. So, it has to increase the value of m to 2 in order to better exploit the high correlation along the transect. As a result, M2IPP(m) can achieve MI(x^M_n) performance comparable to that achieved by gM2IPP and gMEPP while incurring less time with 1 robot and slightly more time with 3 robots than gM2IPP. With 2 robots, m = 1 suffices for M2IPP(m) to achieve MI(x^M_n) performance comparable to that achieved by gM2IPP and gMEPP while incurring less time (Fig. 3).
A computationally cheaper alternative for active sensing of field c is to use MEPP(m) with larger m: when the values of m are raised to 4, 2, and 4 for the respective 1-, 2-, and 3-robot cases, it can produce MI(x^E_n) performance comparable to that achieved by gM2IPP and gMEPP while incurring similar or less time.
ER(x_n)

For field c (i.e., of large ℓ'₁ and small ℓ'₂) with any number of robots, MEPP(m) and M2IPP(m) cannot exploit the low spatial correlation perpendicular to the transect for improving active sensing performance. Hence, their values of m need to be raised in order to exploit the high correlation along the transect. Compared to M2IPP(m), it is computationally cheaper (Fig. 3) and offers greater performance improvement (Table 1) to increase the value of m of MEPP(m), which can then produce ER(x^E_n) values lower than those achieved by gMEPP and gM2IPP while incurring computational time similar to gMEPP and about 2 orders of magnitude less time than gM2IPP with 1 robot, and 1 to 4 orders of magnitude less time than both with 2 or 3 robots.

For field d (i.e., of large ℓ'₁ and large ℓ'₂) with any number of robots, MEPP(m) can now exploit the high spatial correlation perpendicular to the transect for better active sensing performance. As a result, MEPP(m) can yield better ER(x^E_n) performance than gMEPP and gM2IPP using smaller values of m (i.e., m = 1 or 2), hence incurring 1 to 4 orders of magnitude less time.

For fields a and b (i.e., of small ℓ'₁) with 1 or 2 robots, M2IPP(m) can produce ER(x^M_n) values lower than or comparable to those achieved by gM2IPP and gMEPP using a small m value of 1, hence incurring less time (in particular, about 2 orders of magnitude less time than gM2IPP),
as shown in Fig. 3. Increasing to 3 robots allows MEPP(m) to achieve ER(x^E_n) performance better than or comparable to that of gMEPP and gM2IPP using a small m value of 1, hence incurring 3 to 4 orders of magnitude less time (Fig. 3). These can be explained by the same reasons as those discussed previously in Section 5.1.1.

Table 2: Comparison of EN(x_n), MI(x_n), and ER(x_n) (×10^- ) performance for the plankton density field shown in Fig. 4 with varying number of robots k.

                      EN(x_n)            MI(x_n)             ER(x_n)
Algorithm             k=1   k=2   k=3    k=1   k=2   k=3     k=1    k=2    k=3
gM2IPP: x^M_n [8]     124   55    28     83    162   201     0.65   0.09   0.01
gMEPP:  x^E_n [13]    117   42    -6     65    126   184     1.35   0.44   0.04
M2IPP(1): x^M_n       124   55    28     83    162   201     0.65   0.09   0.01
MEPP(1):  x^E_n       117   41    -8     65    128   187     1.35   0.41   0.01

Table 2 shows the EN(x_n), MI(x_n), and ER(x_n) performance of the tested algorithms for the plankton density field (Fig. 4) with varying number of robots. For our proposed M2IPP(m) and MEPP(m) algorithms, the results are only reported for m = 1, at which their performance has already stabilized. As mentioned earlier in the first paragraph of Section 5, the plankton density field exhibits low and high spatial correlations, respectively, along and perpendicular to the transect, which resemble those of temperature field b.

The observations are as follows: With any number of robots, MEPP(1) can produce EN(x^E_n) values lower than those achieved by gMEPP and gM2IPP while incurring 2 to 5 orders of magnitude less time, as shown in Fig. 5. On the other hand, M2IPP(1) can yield MI(x^M_n) and ER(x^M_n) performance better than or comparable to that achieved by gM2IPP and gMEPP while incurring less time (in particular, about 2 orders of magnitude less time than gM2IPP) (Fig. 5). These can be explained by the same reasons as those discussed previously in Section 5.1.1.
The observations from the above results are summarized below: For anisotropic fields with low spatial correlation along the transect (e.g., temperature fields a and b and the plankton density field), MEPP(m) can perform better than or at least as well as gMEPP and gM2IPP in the prediction error (i.e., with 3 robots) and entropy metrics using small m values of 1 or 2, hence incurring 1 to 4 orders of magnitude less time. M2IPP(m) can generally perform likewise in the prediction error (i.e., with 1 or 2 robots) and mutual information metrics using a small m value of 1, hence incurring less time as well (in particular, 2 orders of magnitude less time than gM2IPP). These observations are explained previously in Section 5.1.1. Note that they corroborate the second implications of Theorems 2 and 4 on the performance guarantees of MEPP(m) and M2IPP(m).

For anisotropic fields with high spatial correlation along the transect (e.g., temperature fields c and d), a larger m value is needed in order for MEPP(m) and M2IPP(m) to exploit it if the correlation perpendicular to the transect is low (i.e., field c). Compared to M2IPP(m), it is computationally cheaper to increase the value of m of MEPP(m) such that it performs better than or at least as well as gMEPP and gM2IPP in all three metrics while incurring time similar to gMEPP and about 2 orders of magnitude less time than gM2IPP with 1 robot, and often 1 to 4 orders of magnitude less time than both with 2 or 3 robots. If the correlation perpendicular to the transect is high (i.e., field d) instead, it can be exploited by MEPP(m) and M2IPP(m) to improve active sensing performance and consequently allow m to be reduced to small values of 1 or 2: MEPP(m) can perform better than or, if not, at least as well as gMEPP and gM2IPP in the prediction error and entropy metrics while incurring 1 to 4 orders of magnitude less time. M2IPP(m) can perform likewise in the mutual information metric while incurring less time (in particular, 2 orders of magnitude less time than gM2IPP).

[Figure 5: Graphs of time incurred by the different active sensing algorithms vs. m for the plankton density field with varying number of robots: (a) 1 robot; (b) 2 robots; (c) 3 robots.]
6. CONCLUSION
This paper describes two principled information-theoretic path planning algorithms based on the entropy and mutual information criteria (respectively, MEPP(m) and M2IPP(m)) for active sensing of GP-based anisotropic fields. Two important practical implications result from the theoretical guarantees on the active sensing performance of our algorithms (Theorems 2 and 4): increasing m trades off computational efficiency (Theorems 1 and 3) for better active sensing performance, and our algorithms can exploit a low spatial correlation along the transect to improve time efficiency (i.e., only needing a small m) while preserving near-optimal active sensing performance. This motivates the use of prior knowledge, if available, on a direction of low spatial correlation in order to align it with the horizontal axis of the transect. Empirical evaluation on real-world anisotropic temperature and plankton density field data reveals that our algorithms can perform better than or at least as well as gMEPP and gM2IPP while often incurring a few orders of magnitude less time. In particular, it can be observed that anisotropic fields with low spatial correlation along the transect or high correlation perpendicular to the transect allow our algorithms to perform well using small values of m, thus yielding significant computational gains over gMEPP and gM2IPP. To perform well in a field with high correlation along the transect and low correlation perpendicular to the transect (i.e., less favorable conditions), our algorithms have to increase the value of m or the number of robots, but can still achieve comparable or better time efficiency than gMEPP and gM2IPP.
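To make the stage-wise decomposition underlying the above algorithms concrete, here is a minimal single-robot sketch in the spirit of MEPP(m), under hypothetical 1-D candidate locations, a squared-exponential covariance, and toy noise parameters (not the implementation evaluated above). It memoizes the value function over length-m histories and, on the same truncated-history objective, matches brute-force enumeration of all paths:

```python
import itertools
import math
import numpy as np

def cov(a, b, s2=1.0, ls=1.0, n2=0.05):
    # Squared-exponential covariance of noisy measurements at 1-D locations a, b.
    c = s2 * math.exp(-0.5 * (a - b) ** 2 / ls ** 2)
    return c + n2 if a == b else c

def cond_entropy(xi, hist):
    # H(Z_{x_i} | Z_hist) = 0.5 log(2*pi*e * posterior variance) for a Gaussian.
    if hist:
        K = np.array([[cov(a, b) for b in hist] for a in hist])
        kv = np.array([cov(xi, b) for b in hist])
        var = cov(xi, xi) - kv @ np.linalg.solve(K, kv)
    else:
        var = cov(xi, xi)
    return 0.5 * math.log(2 * math.pi * math.e * var)

def mepp(stages, m):
    # Stage-wise DP: V_i(last-m history) = max_{x_i} H(Z_{x_i}|Z_hist) + V_{i+1}.
    n, memo = len(stages), {}
    def V(i, hist):
        if i == n:
            return 0.0
        if (i, hist) not in memo:
            memo[(i, hist)] = max(
                cond_entropy(x, list(hist)) + V(i + 1, (hist + (x,))[-m:])
                for x in stages[i])
        return memo[(i, hist)]
    return max(cond_entropy(x, []) + V(1, (x,)) for x in stages[0])

def brute(stages, m):
    # Exhaustive maximization of the same truncated-history objective.
    best = -math.inf
    for path in itertools.product(*stages):
        val = sum(cond_entropy(path[i], list(path[max(0, i - m):i]))
                  for i in range(len(path)))
        best = max(best, val)
    return best
```

Here `stages[i]` lists the χ candidate waypoints of stage i. The DP touches O(χ^{m+1}) histories per stage rather than the χ^n whole paths of the brute force, mirroring the trade-off analyzed in Theorems 1 and 3.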
7. REFERENCES

[1] J. B. Boisvert and C. V. Deutsch. Modeling locally varying anisotropy of CO2 emissions in the United States. Stoch. Environ. Res. Risk Assess., 25:1077-1084, 2011.
[2] J. Chen, K. H. Low, C. K.-Y. Tan, A. Oran, P. Jaillet, J. M. Dolan, and G. S. Sukhatme. Decentralized data fusion and active sensing with mobile sensors for modeling and predicting spatiotemporal traffic phenomena. In Proc. UAI, pages 163-173, 2012.
[3] T. Cover and J. Thomas. Elements of Information Theory. John Wiley & Sons, NY, 1991.
[4] A. Das and D. Kempe. Algorithms for subset selection in linear regression. In Proc. STOC, pages 45-54, 2008.
[5] J. M. Dolan, G. Podnar, S. Stancliff, K. H. Low, A. Elfes, J. Higinbotham, J. C. Hosler, T. A. Moisan, and J. Moisan. Cooperative aquatic sensing using the telesupervised adaptive ocean sensor fleet. In Proc. SPIE Conference on Remote Sensing of the Ocean, Sea Ice, and Large Water Regions, volume 7473, 2009.
[6] D. Kitsiou, G. Tsirtsis, and M. Karydis. Developing an optimal sampling design: A case study in a coastal marine ecosystem. Environmental Monitoring and Assessment, 71(1):1-12, 2001.
[7] R. Korf. Real-time heuristic search. Artif. Intell., 42(2-3):189-211, 1990.
[8] A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. JMLR, 9:235-284, 2008.
[9] P. Legendre and M.-J. Fortin. Spatial pattern and ecological analysis. Vegetatio, 80:107-138, 1989.
[10] N. E. Leonard, D. Paley, F. Lekien, R. Sepulchre, D. M. Fratantoni, and R. Davis. Collective motion, sensor networks and ocean sampling. Proc. IEEE, 95(1):48-74, 2007.
[11] K. H. Low, J. Chen, J. M. Dolan, S. Chien, and D. R. Thompson. Decentralized active robotic exploration and mapping for probabilistic field classification in environmental sensing. In Proc. AAMAS, pages 105-112, 2012.
[12] K. H. Low, J. M. Dolan, and P. Khosla. Adaptive multi-robot wide-area exploration and mapping. In Proc. AAMAS, pages 23-30, 2008.
[13] K. H. Low, J. M. Dolan, and P. Khosla. Information-theoretic approach to efficient adaptive path planning for mobile robotic environmental sensing. In Proc. ICAPS, pages 233-240, 2009.
[14] K. H. Low, J. M. Dolan, and P. Khosla. Active Markov information-theoretic path planning for robotic environmental sensing. In Proc. AAMAS, pages 753-760, 2011.
[15] K. H. Low, G. J. Gordon, J. M. Dolan, and P. Khosla. Adaptive sampling for multi-robot wide-area exploration. In Proc. IEEE ICRA, 2007.
[16] D. McGrath, C. Zhang, and O. T. Carton. Geostatistical analyses and hazard assessment on soil lead in Silvermines area, Ireland. Environmental Pollution, 127:239-248, 2004.
[17] G. Podnar, J. M. Dolan, K. H. Low, and A. Elfes. Telesupervised remote surface water quality sensing. In Proc. IEEE Aerospace Conference, 2010.
[18] C. Prudhomme and D. W. Reed. Mapping extreme rainfall in a mountainous region using geostatistical techniques: A case study in Scotland. Int. J. Climatol., 19:1337-1356, 1999.
[19] N. Rabesiranana, M. Rasolonirina, A. F. Solonjara, and R. Andriambololona. Investigating the spatial anisotropy of soil radioactivity in the region of Vinaninkarena, Antsirabe - Madagascar. In Proc. 4th High-Energy Physics International Conference, 2009.
[20] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
[21] D. L. Rudnick, R. E. Davis, C. C. Eriksen, D. Fratantoni, and M. J. Perry. Underwater gliders for ocean research. Mar. Technol. Soc. J., 38:73-84, 2004.
[22] S. Sokolov and S. R. Rintoul. Some remarks on interpolation of nonstationary oceanographic fields. J. Atmos. Oceanic Technol., 16:1434-1449, 1999.
[23] J. C. Taylor, J. S. Thompson, P. S. Rand, and M. Fuentes. Sampling and statistical considerations for hydroacoustic surveys used in estimating abundance of forage fishes in reservoirs. North American Journal of Fisheries Management, 25:73-85, 2005.
[24] D. R. Thompson and D. Wettergreen. Intelligent maps for autonomous kilometer-scale science survey. In Proc. i-SAIRAS, 2008.
[25] R. Webster and M. Oliver. Geostatistics for Environmental Scientists. John Wiley & Sons, Inc., NY, 2nd edition, 2007.
[26] J. G. Zhang, H. S. Chen, Y. R. Su, X. L. Kong, W. Zhang, Y. Shi, H. B. Liang, and G. M. Shen. Spatial variability and patterns of surface soil moisture in a field plot of karst area in southwest China. Plant Soil Environ., 57(9):409-417, 2011.
APPENDIX

Notations. Let $\sigma^2_x \triangleq \sigma_{xx}$ and $\sigma^2_{x|s} \triangleq \Sigma_{xx|s}$ in (3) for any location $x$. Let $\xi \triangleq \exp\{-\omega^2(m+1)^2/(2\ell_1'^2)\}$ where $\ell_1'$ is the length-scale along the transect, and let $\eta \triangleq \sigma^2_n/\sigma^2_s$.

A. ENTROPY-BASED PATH PLANNING

A.1 Proof of Theorem 1
Given each vector $x_{i-m:i-1}$, the time needed to evaluate the posterior entropy $H(Z_{x_i}\mid Z_{x_{i-m:i-1}})$ over all possible $x_i \in \mathcal{X}_i$ is $\chi \times O((km)^3) = O(\chi(km)^3)$. The time needed to perform this over all $\chi^m$ possible vectors $x_{i-m:i-1}$ in each stage $i$ is $\chi^m \times O(\chi(km)^3) = O(\chi^{m+1}(km)^3)$. Since the covariance function is stationary (i.e., it only depends on the difference between locations), the entropies calculated in one stage are the same as those in every other stage. The time needed to propagate the optimal values from stage $n$ back to stage $m+1$ is $O(\chi^{m+1}(n-m-1))$. To obtain the optimal vector $x^E_m$, the joint entropy $H(Z_{x_m})$ has to be evaluated over all possible vectors $x_m$; hence, the time needed to solve for $x^E_m$ is $O(\chi^m(km)^3)$. As a result, the time complexity of the MEPP(m) algorithm is $O(\chi^{m+1}[(n-m-1) + (km)^3] + \chi^m(km)^3) = O(\chi^{m+1}[n + (km)^3])$.

A.2 Proof of Some Lemmas
Before giving the proof of Theorem 2, the following lemmas are needed.
Lemma 5. For any observation paths $x_n$,
$$H(Z_{x^E_m}) + \sum_{i=m+1}^{n} H(Z_{x^E_i}\mid Z_{x^E_{i-m:i-1}}) \;\ge\; H(Z_{x_m}) + \sum_{i=m+1}^{n} H(Z_{x_i}\mid Z_{x_{i-m:i-1}})\,.$$

Proof. Using (7),
$$\begin{aligned}
V_{m+1}(x_m) &= \max_{x_{m+1}\in\mathcal{X}_{m+1}} H(Z_{x_{m+1}}\mid Z_{x_m}) + V_{m+2}(x_{2:m+1})\\
&= \max_{x_{m+1}\in\mathcal{X}_{m+1}} H(Z_{x_{m+1}}\mid Z_{x_m}) + \max_{x_{m+2}\in\mathcal{X}_{m+2}} H(Z_{x_{m+2}}\mid Z_{x_{2:m+1}}) + V_{m+3}(x_{3:m+2})\\
&\;\;\vdots\\
&= \max_{x_{m+1}\in\mathcal{X}_{m+1},\dots,x_n\in\mathcal{X}_n} \sum_{i=m+1}^{n} H(Z_{x_i}\mid Z_{x_{i-m:i-1}})\,. \qquad (16)
\end{aligned}$$
Given $x_m$, the vectors $x_{m+1},\dots,x_n$ that maximize the term $\sum_{i=m+1}^{n} H(Z_{x_i}\mid Z_{x_{i-m:i-1}})$ in (16) can be obtained. Using (8), the observation paths $x_n$ that maximize $H(Z_{x_m}) + \sum_{i=m+1}^{n} H(Z_{x_i}\mid Z_{x_{i-m:i-1}})$ can be obtained. Therefore, Lemma 5 holds.

Lemma 6. In a GP, given an unobserved location $y$ and a vector $A$ of sampled locations, $\sigma^2_{y|A} \ge \sigma^2_n$.

Proof.
We know that $\sigma^2_{y|A} \ge 0$. So, if $\sigma^2_n = 0$, then Lemma 6 holds trivially. Suppose instead that $\sigma^2_n > 0$. From (3),
$$\sigma^2_{y|A} = \sigma^2_s + \sigma^2_n - \Sigma_{yA}\Sigma_{AA}^{-1}\Sigma_{Ay}\,. \qquad (17)$$
Define $\Sigma_{BB} \triangleq \Sigma_{AA} - \sigma^2_n I$, the noise-free covariance matrix over the locations of $A$. Since $\sigma^2_s - \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay}$ is the posterior variance of the corresponding noise-free GP,
$$\sigma^2_s - \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay} \ge 0\,. \qquad (18)$$
Let $A \triangleq \Sigma_{AA}$, $B \triangleq \Sigma_{BB}$, $E \triangleq \sigma^2_n I$ (so that $A = B + E$), $Y \triangleq \Sigma_{Ay}$, $Y^\top = \Sigma_{yA}$, and $W \triangleq \Sigma_{AA}^{-1}\Sigma_{Ay} = A^{-1}Y$. Then $Y = AW$ and
$$W^\top E W + W^\top E^\top B^{-1} E W \ge 0 \qquad (19)$$
$$\Rightarrow\; W^\top B^\top B^{-1} E W + W^\top E^\top B^{-1} E W \ge 0 \qquad (20)$$
$$\Rightarrow\; W^\top (B+E)^\top B^{-1} E W \ge 0 \;\Rightarrow\; W^\top A^\top B^{-1} E W + W^\top A^\top W \ge W^\top A^\top W$$
$$\Rightarrow\; W^\top A^\top (B^{-1}E + I) W \ge W^\top A^\top W \;\Rightarrow\; W^\top A^\top B^{-1}(E + B) W \ge W^\top A^\top A^{-1} A W$$
$$\Rightarrow\; (AW)^\top B^{-1}(AW) \ge (AW)^\top A^{-1}(AW) \;\Rightarrow\; Y^\top B^{-1} Y \ge Y^\top A^{-1} Y$$
$$\Rightarrow\; \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay} \ge \Sigma_{yA}\Sigma_{AA}^{-1}\Sigma_{Ay}\,. \qquad (21)$$
To derive (19), note that since $W$ is a vector and $E = \sigma^2_n I$, $W^\top E W \ge 0$; and since $B$ is a covariance matrix that is invertible and positive semi-definite, $B^{-1}$ is positive semi-definite, so $W^\top E^\top B^{-1} E W \ge 0$. Since $B$ is symmetric, $B^\top = B$ and $W^\top E W = W^\top B^\top B^{-1} E W$; hence, (20) can be obtained from (19). The rest of the derivation from (20) to (21) is straightforward. From (17),
$$\sigma^2_{y|A} = \sigma^2_s + \sigma^2_n - \Sigma_{yA}\Sigma_{AA}^{-1}\Sigma_{Ay} \ge \sigma^2_s + \sigma^2_n - \Sigma_{yA}\Sigma_{BB}^{-1}\Sigma_{Ay} \qquad (22)$$
$$\ge \sigma^2_n \qquad (23)$$
where (22) and (23) follow from (21) and (18), respectively. Therefore, Lemma 6 holds.

Lemma 7.
$$H(Z_{x_{i-m-1}}\mid Z_{x_{i-m:i-1}}) - H(Z_{x_{i-m-1}}\mid Z_{x_{i-m:i-1}}, Z_{x_i}) \le \frac{k^2}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof.
We will first prove the single-robot case; this result will be used later for the multi-robot case. Let $x_A \triangleq x_{i-m:i-1}$ and $x_p \triangleq x_{i-m-1}$. Then,
$$\begin{aligned}
\sigma^2_{x_p|x_A} - \sigma^2_{x_p|(x_A,x_i)} &\le \sigma^2_{x_p} - \sigma^2_{x_p|x_i} = \sigma^2_{x_p} - \Big(\sigma^2_{x_p} - \frac{\sigma_{x_p x_i}\,\sigma_{x_i x_p}}{\sigma^2_{x_i}}\Big)\\
&\le \frac{\big(\sigma^2_s \exp\{-\omega^2(m+1)^2/(2\ell_1'^2)\}\big)^2}{\sigma^2_s + \sigma^2_n} = \frac{\sigma^2_s\,\xi^2}{1+\eta}\,. \qquad (24)
\end{aligned}$$
The first inequality follows from the property that variance reduction is submodular [4] in many practical cases (e.g., when further conditioning on $Z_{x_A}$ does not make $Z_{x_p}$ and $Z_{x_i}$ more correlated). To understand this notion of submodularity intuitively: observing a new location $x_i$ reduces the variance at location $x_p$ more if few or no observations have been made, and less if many observations have already been taken (e.g., at locations $x_A$). The second equality is due to (3). The second inequality follows from the fact that the distance between any two locations from stage $i$ and stage $i-m-1$ is at least $\omega(m+1)$, so $\sigma_{x_p x_i} \le \sigma^2_s \exp\{-\omega^2(m+1)^2/(2\ell_1'^2)\}$. Hence,
$$H(Z_{x_p}\mid Z_{x_A}) - H(Z_{x_p}\mid Z_{x_A}, Z_{x_i}) = \frac{1}{2}\log\frac{\sigma^2_{x_p|x_A}}{\sigma^2_{x_p|(x_A,x_i)}} \qquad (25)$$
$$\le \frac{1}{2}\log\frac{\sigma^2_{x_p|(x_A,x_i)} + \sigma^2_s\xi^2/(1+\eta)}{\sigma^2_{x_p|(x_A,x_i)}} \qquad (26)$$
$$= \frac{1}{2}\log\Big\{1 + \frac{\sigma^2_s\,\xi^2}{\sigma^2_{x_p|(x_A,x_i)}(1+\eta)}\Big\} \le \frac{1}{2}\log\Big\{1 + \frac{\sigma^2_s\,\xi^2}{\sigma^2_n(1+\eta)}\Big\} \qquad (27)$$
$$= \frac{1}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,. \qquad (28)$$
Using (4), (25) can be obtained. Inequality (26) results from (24). Inequality (27) can be obtained using Lemma 6.

We will now prove the $k$-robot case where $k > 1$; the vectors $x_i$ and $x_p$ then comprise $k$ locations each. Let $x^j_i$ ($x^j_p$) denote the $j$-th location component in vector $x_i$ ($x_p$), and let $x^{j:t}_i$ ($x^{j:t}_p$) denote a vector comprising the $j$-th to $t$-th location components in vector $x_i$ ($x_p$). Using the chain rule for entropy [3],
$$H(Z_{x_p}\mid Z_{x_A}) - H(Z_{x_p}\mid Z_{x_A}, Z_{x_i}) = \big[H(Z_{x^1_p}\mid Z_{x_A}) - H(Z_{x^1_p}\mid Z_{x_A}, Z_{x_i})\big] \qquad (29)$$
$$\;+\; \sum_{j=2}^{k}\big[H(Z_{x^j_p}\mid Z_{x^{1:j-1}_p}, Z_{x_A}) - H(Z_{x^j_p}\mid Z_{x^{1:j-1}_p}, Z_{x_A}, Z_{x_i})\big]\,. \qquad (30)$$
For (29), repeatedly applying the symmetry of conditional mutual information gives
$$\begin{aligned}
&H(Z_{x^1_p}\mid Z_{x_A}) - H(Z_{x^1_p}\mid Z_{x_A}, Z_{x^{1:k}_i})\\
&= H(Z_{x^1_p}\mid Z_{x_A}) - H(Z_{x^1_p}\mid Z_{x_A}, Z_{x^1_i}) + \sum_{t=2}^{k}\big[H(Z_{x^t_i}\mid Z_{x_A}, Z_{x^{1:t-1}_i}) - H(Z_{x^t_i}\mid Z_{x^1_p}, Z_{x_A}, Z_{x^{1:t-1}_i})\big]\\
&\le \frac{k}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,. \qquad (31)
\end{aligned}$$
The inequality follows from a derivation similar to (28) applied to each of the $k$ terms. Let $x_{A_j}$ denote a vector concatenating $x^{1:j-1}_p$ and $x_A$. For (30), a similar decomposition of each summand into $k$ single-location terms yields
$$\sum_{j=2}^{k}\big[H(Z_{x^j_p}\mid Z_{x_{A_j}}) - H(Z_{x^j_p}\mid Z_{x_{A_j}}, Z_{x_i})\big] \le \frac{k(k-1)}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,. \qquad (32)$$
Combining (31) and (32), Lemma 7 results.

Corollary 8. For $t = 1, \dots, i-m-1$,
$$H(Z_{x_t}\mid Z_{x_{t+1:i-1}}) - H(Z_{x_t}\mid Z_{x_{t+1:i-1}}, Z_{x_i}) \le \frac{k^2}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof.
The proof is similar to that of Lemma 7.
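Lemma 6 is easy to sanity-check numerically. The sketch below (a hypothetical squared-exponential covariance with toy parameters) confirms that the posterior variance of a noisy GP measurement never drops below the noise variance $\sigma^2_n$, no matter how many locations are sampled:

```python
import numpy as np

rng = np.random.default_rng(0)
s2, n2, ls = 1.0, 0.2, 1.5   # sigma_s^2, sigma_n^2, length-scale (hypothetical)

def K(a, b):
    # Squared-exponential covariance between 1-D location arrays a and b.
    return s2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def post_var(y, A):
    # sigma^2_{y|A} = sigma_s^2 + sigma_n^2 - Sigma_yA Sigma_AA^{-1} Sigma_Ay, as in Eq. (17).
    Kaa = K(A, A) + n2 * np.eye(len(A))   # Sigma_AA: noise on the diagonal
    kya = K(np.array([y]), A)[0]          # Sigma_yA: independent noise, no n2 term
    return s2 + n2 - kya @ np.linalg.solve(Kaa, kya)

# Lemma 6: sigma^2_{y|A} >= sigma_n^2 for any y and any sampled locations A.
for _ in range(200):
    A = rng.uniform(-3.0, 3.0, size=int(rng.integers(1, 8)))
    assert post_var(rng.uniform(-3.0, 3.0), A) >= n2 - 1e-9
```

Far from all samples the posterior variance approaches the prior variance $\sigma^2_s + \sigma^2_n$; near a sample it approaches (but never crosses) the noise floor $\sigma^2_n$.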
Lemma 9.
$$H(Z_{x_i}\mid Z_{x_{i-m:i-1}}) - H(Z_{x_i}\mid Z_{x_{1:i-1}}) \le (i-m-1)\,\frac{k^2}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof. Using the chain rule for entropy [3],
$$H(Z_{x_{1:i-m-1}}, Z_{x_i}\mid Z_{x_{i-m:i-1}}) = H(Z_{x_i}\mid Z_{x_{i-m:i-1}}) + H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}, Z_{x_i})\,, \qquad (33)$$
$$\begin{aligned}
H(Z_{x_{1:i-m-1}}, Z_{x_i}\mid Z_{x_{i-m:i-1}}) &= H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}) + H(Z_{x_i}\mid Z_{x_{1:i-m-1}}, Z_{x_{i-m:i-1}})\\
&= H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}) + H(Z_{x_i}\mid Z_{x_{1:i-1}})\,. \qquad (34)
\end{aligned}$$
From (33) and (34),
$$H(Z_{x_i}\mid Z_{x_{i-m:i-1}}) - H(Z_{x_i}\mid Z_{x_{1:i-1}}) = H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}) - H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}, Z_{x_i})\,. \qquad (35)$$
Applying the chain rule for entropy [3] to (35),
$$H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}) - H(Z_{x_{1:i-m-1}}\mid Z_{x_{i-m:i-1}}, Z_{x_i}) = \sum_{t=1}^{i-m-1}\big[H(Z_{x_t}\mid Z_{x_{t+1:i-1}}) - H(Z_{x_t}\mid Z_{x_{t+1:i-1}}, Z_{x_i})\big] \qquad (36)$$
$$\le (i-m-1)\,\frac{k^2}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,. \qquad (37)$$
Inequality (37) follows from Corollary 8. So, Lemma 9 holds.

A.3 Proof of Theorem 2

Let
$$\theta \triangleq H(Z_{x^E_m}) + \sum_{i=m+1}^{n} H(Z_{x^E_i}\mid Z_{x^E_{i-m:i-1}}) - \Big\{H(Z_{x^*_m}) + \sum_{i=m+1}^{n} H(Z_{x^*_i}\mid Z_{x^*_{i-m:i-1}})\Big\}\,. \qquad (38)$$
From Lemma 5, $\theta \ge$
0. By the chain rule for entropy [3],
$$H(Z_{x^*_n}) - H(Z_{x^E_n}) = H(Z_{x^*_m}) + \sum_{i=m+1}^{n} H(Z_{x^*_i}\mid Z_{x^*_{1:i-1}}) - \Big\{H(Z_{x^E_m}) + \sum_{i=m+1}^{n} H(Z_{x^E_i}\mid Z_{x^E_{1:i-1}})\Big\}\,. \qquad (39)$$
Let $\Delta^*_i \triangleq H(Z_{x^*_i}\mid Z_{x^*_{i-m:i-1}}) - H(Z_{x^*_i}\mid Z_{x^*_{1:i-1}})$ and $\Delta^E_i \triangleq H(Z_{x^E_i}\mid Z_{x^E_{i-m:i-1}}) - H(Z_{x^E_i}\mid Z_{x^E_{1:i-1}})$ for $i = m+1, \dots, n$. Then, (39) can be re-written as
$$\begin{aligned}
H(Z_{x^*_n}) - H(Z_{x^E_n}) &= H(Z_{x^*_m}) + \sum_{i=m+1}^{n}\big[H(Z_{x^*_i}\mid Z_{x^*_{i-m:i-1}}) - \Delta^*_i\big] - \Big\{H(Z_{x^E_m}) + \sum_{i=m+1}^{n}\big[H(Z_{x^E_i}\mid Z_{x^E_{i-m:i-1}}) - \Delta^E_i\big]\Big\}\\
&= \sum_{i=m+1}^{n}\big[\Delta^E_i - \Delta^*_i\big] - \theta \qquad (40)\\
&\le \sum_{i=m+1}^{n}\big[\Delta^E_i - \Delta^*_i\big] \qquad (41)\\
&\le \sum_{i=m+1}^{n}\Delta^E_i\,. \qquad (42)
\end{aligned}$$
Since $\theta \ge 0$, (41) results. Since $\Delta^*_i \ge 0$ for $i = m+1, \dots, n$ (conditioning on additional observations cannot increase entropy), (42) follows. By Lemma 9,
$$\Delta^E_i \le (i-m-1)\,\frac{k^2}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}$$
for $i = m+1, \dots, n$. Then, Theorem 2 follows.

B. MUTUAL INFORMATION-BASED PATH PLANNING

B.1 Proof of Theorem 3
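The pivotal step (35) in Lemma 9 is an exact Gaussian identity (the symmetry of conditional mutual information), so it can be verified numerically on an arbitrary well-conditioned joint covariance. The following sketch uses hypothetical indices standing in for stage vectors, not any particular field:

```python
import numpy as np

def cond_entropy(K, idx, cond):
    # Gaussian conditional entropy H(Z_idx | Z_cond) from a joint covariance K.
    idx, cond = list(idx), list(cond)
    Kii = K[np.ix_(idx, idx)]
    if cond:
        Kic = K[np.ix_(idx, cond)]
        Kcc = K[np.ix_(cond, cond)]
        Kii = Kii - Kic @ np.linalg.solve(Kcc, Kic.T)
    return 0.5 * np.linalg.slogdet(2 * np.pi * np.e * Kii)[1]

rng = np.random.default_rng(1)
n, i, m = 7, 6, 2                  # indices 0..6, "current stage" i, history length m
G = rng.normal(size=(n, n))
K = G @ G.T + 0.2 * np.eye(n)      # random PSD joint covariance (jitter keeps it stable)
hist = list(range(i - m, i))       # plays the role of x_{i-m:i-1}
past = list(range(0, i - m))       # plays the role of x_{1:i-m-1}
# Identity (35): H(Z_i|Z_hist) - H(Z_i|Z_{1:i-1}) = H(Z_past|Z_hist) - H(Z_past|Z_hist, Z_i)
lhs = cond_entropy(K, [i], hist) - cond_entropy(K, [i], range(i))
rhs = cond_entropy(K, past, hist) - cond_entropy(K, past, hist + [i])
assert abs(lhs - rhs) < 1e-8
```

Both sides equal the conditional mutual information $I(Z_{x_i}; Z_{x_{1:i-m-1}} \mid Z_{x_{i-m:i-1}})$, which is why the left-hand side can be bounded term-by-term via Corollary 8.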
Given each vector $x_{i-m:i-1}$, the time needed to evaluate $I(Z_{x_{i-m}}; Z_{u_{i-m:i}}\mid Z_{x_{i-2m:i-m-1}})$ over all possible $x_i \in \mathcal{X}_i$ is $\chi \times O([r(2m+1)]^3) = O(\chi[r(2m+1)]^3)$. The time needed to perform this over all $\chi^m$ possible vectors $x_{i-m:i-1}$ in each stage $i$ is $\chi^m \times O(\chi[r(2m+1)]^3) = O(\chi^{m+1}[r(2m+1)]^3)$. Similar to the MEPP(m) algorithm, the conditional mutual information terms calculated in one stage are the same as those in every other stage. The time needed to propagate the optimal values from stage $n$ back to stage $m+1$ is $O(\chi^{m+1}(n-m-1))$. Evaluating $I(Z_{x_{n-m:n}}; Z_{u_{n-m:n}}\mid Z_{x_{n-2m:n-m-1}})$ over all possible $x_n \in \mathcal{X}_n$ and all $\chi^m$ possible vectors $x_{n-m:n-1}$ in stage $n$ takes $O(\chi^{m+1}[r(2m+1)]^3)$ time. To obtain the optimal vector $x^M_m$, $I(Z_{x_m}; Z_{u_m})$ has to be evaluated over all possible $x_m$; hence, the time needed to solve for $x^M_m$ is $O(\chi^m[r(2m)]^3)$. As a result, the time complexity of the M2IPP(m) algorithm is $O(\chi^{m+1}[(n-m-1) + [r(2m+1)]^3] + \chi^{m+1}[r(2m+1)]^3 + \chi^m[r(2m)]^3) = O(\chi^{m+1}(n + 2[r(2m+1)]^3))$.

B.2 Proof of Some Lemmas
Before giving the proof of Theorem 4, the following lemmas are needed.
Lemma 10. For any observation paths $x_n$,
$$\begin{aligned}
& I(Z_{x^M_m}; Z_{u^M_m}) + \sum_{i=2m+1}^{n-1} I(Z_{x^M_{i-m}}; Z_{u^M_{i-m:i}}\mid Z_{x^M_{i-2m:i-m-1}}) + I(Z_{x^M_{n-m:n}}; Z_{u^M_{n-m:n}}\mid Z_{x^M_{n-2m:n-m-1}})\\
&\ge I(Z_{x_m}; Z_{u_m}) + \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-m:i}}\mid Z_{x_{i-2m:i-m-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-m:n}}\mid Z_{x_{n-2m:n-m-1}})\,.
\end{aligned}$$

Proof. Using (12) and unrolling the recursion in the same way as in the proof of Lemma 5,
$$U_{m+1}(x_m) = \max_{x_{m+1}\in\mathcal{X}_{m+1},\dots,x_n\in\mathcal{X}_n} \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-m:i}}\mid Z_{x_{i-2m:i-m-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-m:n}}\mid Z_{x_{n-2m:n-m-1}})\,. \qquad (43)$$
Given $x_m$, the vectors $x_{m+1},\dots,x_n$ that maximize the right-hand side of (43) can be obtained. Using (13), the observation paths $x_n$ that maximize $I(Z_{x_m}; Z_{u_m})$ plus the maximized sum in (43) can be obtained. Therefore, Lemma 10 holds.

Lemma 11. For $t = 1, \dots, i-m-1$,
$$H(Z_{x_t}\mid Z_{x_{t+1:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{x_t}\mid Z_{x_{t+1:i-m-1}}, Z_{u_{i-m:i}}, Z_{x_{i-m}}) \le \frac{k^2}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof. The proof is similar to that of Corollary 8.
Corollary 12. For $t = 1, \dots, i-m-1$,
$$H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{t+1:i}}) - H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{t+1:i}}, Z_{x_{i-m}}) \le \frac{k(r-k)}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof. Note that the size of vector $u_t$ is $r-k$. The proof is similar to that of Corollary 8.

Corollary 13. For $t = i+1, \dots, n$,
$$H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{1:t-1}}) - H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{1:t-1}}, Z_{x_{i-m}}) \le \frac{k(r-k)}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof. The proof is similar to that of Corollary 12.
Lemma 14.
$$H(Z_{x_{i-m}}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{x_{i-m}}\mid Z_{x_{1:i-m-1}}, Z_{u_n}) \le (n-m-1)\,\frac{rk}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,.$$

Proof. Let $x_\Delta$ ($u_\Delta$) denote the vector of all the locations of $x_{1:i-m-1}$ ($u_n$) excluding those of $x_{i-2m:i-m-1}$ ($u_{i-m:i}$); that is, $x_\Delta \triangleq x_{1:i-2m-1}$ and $u_\Delta \triangleq (u_{1:i-m-1}, u_{i+1:n})$. Then,
$$\begin{aligned}
&H(Z_{x_{i-m}}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{x_{i-m}}\mid Z_{x_{1:i-m-1}}, Z_{u_n})\\
&= H(Z_{x_{i-m}}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}) - \big[H(Z_{x_{i-m}}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}})\\
&\qquad + H(Z_{x_\Delta}, Z_{u_\Delta}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}, Z_{x_{i-m}}) - H(Z_{x_\Delta}, Z_{u_\Delta}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}})\big] \qquad (44)\\
&= H(Z_{x_\Delta}, Z_{u_\Delta}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{x_\Delta}, Z_{u_\Delta}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}, Z_{x_{i-m}}) \qquad (45)\\
&= \big[H(Z_{x_\Delta}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{x_\Delta}\mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-m:i}}, Z_{x_{i-m}})\big]\\
&\qquad + \big[H(Z_{u_\Delta}\mid Z_{x_{1:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{u_\Delta}\mid Z_{x_{1:i-m-1}}, Z_{u_{i-m:i}}, Z_{x_{i-m}})\big] \qquad (46)\\
&= \sum_{t=1}^{i-2m-1}\big[H(Z_{x_t}\mid Z_{x_{t+1:i-m-1}}, Z_{u_{i-m:i}}) - H(Z_{x_t}\mid Z_{x_{t+1:i-m-1}}, Z_{u_{i-m:i}}, Z_{x_{i-m}})\big]\\
&\qquad + \sum_{t=1}^{i-m-1}\big[H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{t+1:i}}) - H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{t+1:i}}, Z_{x_{i-m}})\big]\\
&\qquad + \sum_{t=i+1}^{n}\big[H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{1:t-1}}) - H(Z_{u_t}\mid Z_{x_{1:i-m-1}}, Z_{u_{1:t-1}}, Z_{x_{i-m}})\big] \qquad (47)\\
&\le \frac{1}{2}\big[(i-2m-1)k^2 + (n-m-1)(r-k)k\big]\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\} \qquad (48)\\
&\le \frac{1}{2}(n-m-1)\big[k^2 + (r-k)k\big]\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\} = (n-m-1)\,\frac{rk}{2}\log\Big\{1 + \frac{\xi^2}{\eta(1+\eta)}\Big\}\,. \qquad (49)
\end{aligned}$$
Using the chain rule for entropy [3], (44), (46), and (47) can be obtained. Using Lemma 11 and Corollaries 12 and 13, (48) can be obtained.
Lemma 15.
\[
I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) - I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) = A_{i-m} - B_{i-m}
\]
where
\begin{align*}
A_{i-m} &= H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}, Z_{u_{1:n}}) \,,\\
B_{i-m} &= H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}) \,,
\end{align*}
and
\[
A_{i-m} \leq (n-2m-1)\,rk \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,, \qquad B_{i-m} \leq (i-2m-1)\,k^2 \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,.
\]
Proof. By the definition of conditional mutual information,
\begin{align*}
I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) &= H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}, Z_{u_{1:n}}) \,, \tag{50} \\
I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) &= H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}) - H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) \,. \tag{51}
\end{align*}
Using (50) and (51),
\[
I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) - I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) = A_{i-m} - B_{i-m} \,.
\]
By Lemma 14,
\[
A_{i-m} \leq (n-2m-1)\,rk \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,.
\]
Using Lemma 9,
\[
B_{i-m} \leq (i-2m-1)\,k^2 \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,.
\]
Therefore, Lemma 15 holds.
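The telescoping step in the proof of Lemma 15 is pure cancellation among the four conditional entropies in (50) and (51), so it holds for arbitrary stand-in values. The following check is an addition (not part of the original proof), with `Ha`–`Hd` as hypothetical placeholders for those entropies:

```python
import random

# Ha = H(Z_{x_{i-m}} | Z_{x_{1:i-m-1}})
# Hb = H(Z_{x_{i-m}} | Z_{x_{1:i-m-1}}, Z_{u_{1:n}})
# Hc = H(Z_{x_{i-m}} | Z_{x_{i-2m:i-m-1}})
# Hd = H(Z_{x_{i-m}} | Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}})
random.seed(0)
for _ in range(100):
    Ha = random.uniform(-10, 10)
    Hb = random.uniform(-10, 10)
    Hc = random.uniform(-10, 10)
    Hd = random.uniform(-10, 10)

    I1 = Ha - Hb   # (50): I(Z_{x_{i-m}}; Z_{u_{1:n}} | Z_{x_{1:i-m-1}})
    I2 = Hc - Hd   # (51): I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} | Z_{x_{i-2m:i-m-1}})
    A = Hd - Hb    # A_{i-m} in Lemma 15
    B = Hc - Ha    # B_{i-m} in Lemma 15

    # The difference of the two mutual informations telescopes to A - B.
    assert abs((I1 - I2) - (A - B)) < 1e-12
```

Since the identity is algebraic, the bounds on $A_{i-m}$ and $B_{i-m}$ are the only places where Lemma 14 and Lemma 9 are actually needed.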
B.3 Proof of Theorem 4
Let
\[
\theta \triangleq I(Z_{x^M_{1:m}}; Z_{u^M_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^M_{i-m}}; Z_{u^M_{i-2m:i}} \mid Z_{x^M_{i-2m:i-m-1}}) + I(Z_{x^M_{n-m:n}}; Z_{u^M_{n-2m:n}} \mid Z_{x^M_{n-2m:n-m-1}}) - [I(Z_{x^\star_{1:m}}; Z_{u^\star_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^\star_{i-m}}; Z_{u^\star_{i-2m:i}} \mid Z_{x^\star_{i-2m:i-m-1}}) + I(Z_{x^\star_{n-m:n}}; Z_{u^\star_{n-2m:n}} \mid Z_{x^\star_{n-2m:n-m-1}})] \,. \tag{52}
\]
From Lemma 10, $\theta \geq 0$. By the chain rule for mutual information [3],
\begin{align*}
& I(Z_{x^\star_{1:n}}; Z_{u^\star_{1:n}}) - I(Z_{x^M_{1:n}}; Z_{u^M_{1:n}}) \\
&= I(Z_{x^\star_{1:m}}; Z_{u^\star_{1:n}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^\star_{i-m}}; Z_{u^\star_{1:n}} \mid Z_{x^\star_{1:i-m-1}}) + I(Z_{x^\star_{n-m:n}}; Z_{u^\star_{1:n}} \mid Z_{x^\star_{1:n-m-1}}) \\
&\qquad - [I(Z_{x^M_{1:m}}; Z_{u^M_{1:n}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^M_{i-m}}; Z_{u^M_{1:n}} \mid Z_{x^M_{1:i-m-1}}) + I(Z_{x^M_{n-m:n}}; Z_{u^M_{1:n}} \mid Z_{x^M_{1:n-m-1}})] \,. \tag{53}
\end{align*}
By the definition of mutual information,
\begin{align*}
I(Z_{x_{1:m}}; Z_{u_{1:n}}) &= H(Z_{x_{1:m}}) - H(Z_{x_{1:m}} \mid Z_{u_{1:n}}) \,, \tag{54} \\
I(Z_{x_{1:m}}; Z_{u_{1:2m}}) &= H(Z_{x_{1:m}}) - H(Z_{x_{1:m}} \mid Z_{u_{1:2m}}) \,. \tag{55}
\end{align*}
Using the chain rule for entropy [3],
\begin{align*}
& H(Z_{x_{1:m}} \mid Z_{u_{1:2m}}) - H(Z_{x_{1:m}} \mid Z_{u_{1:n}}) \\
&= H(Z_{x_1} \mid Z_{u_{1:2m}}) - H(Z_{x_1} \mid Z_{u_{1:n}}) + \ldots + H(Z_{x_m} \mid Z_{x_{1:m-1}}, Z_{u_{1:2m}}) - H(Z_{x_m} \mid Z_{x_{1:m-1}}, Z_{u_{1:n}}) \\
&= \sum_{t=1}^{m} H(Z_{x_t} \mid Z_{x_{1:t-1}}, Z_{u_{1:2m}}) - H(Z_{x_t} \mid Z_{x_{1:t-1}}, Z_{u_{1:n}}) \\
&\leq m(n-2m)(r-k)k \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,. \tag{56}
\end{align*}
Inequality (56) can be obtained using a proof similar to Lemma 14. Applying (56) to (54) and (55),
\[
I(Z_{x_{1:m}}; Z_{u_{1:n}}) - I(Z_{x_{1:m}}; Z_{u_{1:2m}}) = A_{1:m} \tag{57}
\]
where
\[
A_{1:m} \leq m(n-2m)(r-k)k \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,. \tag{58}
\]
By the definition of mutual information,
\begin{align*}
I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}}) &= H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}, Z_{u_{1:n}}) \,, \tag{59} \\
I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) &= H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}, Z_{u_{n-2m:n}}) \,. \tag{60}
\end{align*}
Using the chain rule for entropy,
\begin{align*}
& H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}) \\
&= \sum_{t=n-m}^{n} H(Z_{x_t} \mid Z_{x_{n-2m:t-1}}) - H(Z_{x_t} \mid Z_{x_{1:t-1}}) \tag{61} \\
&\leq (m+1)(n-2m-1)\,k^2 \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,. \tag{62}
\end{align*}
Using Lemma 9, inequality (62) can be obtained. By the chain rule for entropy,
\begin{align*}
& H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}, Z_{u_{n-2m:n}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}, Z_{u_{1:n}}) \\
&= \sum_{t=n-m}^{n} H(Z_{x_t} \mid Z_{x_{n-2m:t-1}}, Z_{u_{n-2m:n}}) - H(Z_{x_t} \mid Z_{x_{1:t-1}}, Z_{u_{1:n}}) \\
&\leq (m+1)(n-2m-1)\,rk \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,. \tag{63}
\end{align*}
Using a proof similar to Lemma 14, inequality (63) can be obtained. Applying the results (62) and (63) to (59) and (60),
\[
I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}}) - I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) = A_{n-m:n} - B_{n-m:n} \tag{64}
\]
where
\begin{align*}
A_{n-m:n} &\leq (m+1)(n-2m-1)\,rk \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,, \tag{65} \\
B_{n-m:n} &\leq (m+1)(n-2m-1)\,k^2 \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,. \tag{66}
\end{align*}
Using the above results, (53) can be rewritten as
\begin{align*}
& I(Z_{x^\star_{1:n}}; Z_{u^\star_{1:n}}) - I(Z_{x^M_{1:n}}; Z_{u^M_{1:n}}) \\
&= A^\star_{1:m} + I(Z_{x^\star_{1:m}}; Z_{u^\star_{1:2m}}) + \sum_{i=2m+1}^{n-1} [A^\star_{i-m} - B^\star_{i-m} + I(Z_{x^\star_{i-m}}; Z_{u^\star_{i-2m:i}} \mid Z_{x^\star_{i-2m:i-m-1}})] \\
&\qquad + A^\star_{n-m:n} - B^\star_{n-m:n} + I(Z_{x^\star_{n-m:n}}; Z_{u^\star_{n-2m:n}} \mid Z_{x^\star_{n-2m:n-m-1}}) \\
&\qquad - \{A^M_{1:m} + I(Z_{x^M_{1:m}}; Z_{u^M_{1:2m}}) + \sum_{i=2m+1}^{n-1} [A^M_{i-m} - B^M_{i-m} + I(Z_{x^M_{i-m}}; Z_{u^M_{i-2m:i}} \mid Z_{x^M_{i-2m:i-m-1}})] \\
&\qquad + A^M_{n-m:n} - B^M_{n-m:n} + I(Z_{x^M_{n-m:n}}; Z_{u^M_{n-2m:n}} \mid Z_{x^M_{n-2m:n-m-1}})\} \tag{67} \\
&= A^\star_{1:m} + \sum_{i=2m+1}^{n-1} (A^\star_{i-m} - B^\star_{i-m}) + (A^\star_{n-m:n} - B^\star_{n-m:n}) \\
&\qquad - [A^M_{1:m} + \sum_{i=2m+1}^{n-1} (A^M_{i-m} - B^M_{i-m}) + (A^M_{n-m:n} - B^M_{n-m:n})] - \theta \tag{68} \\
&\leq A^\star_{1:m} + \sum_{i=2m+1}^{n-1} A^\star_{i-m} + A^\star_{n-m:n} + \sum_{i=2m+1}^{n-1} B^M_{i-m} + B^M_{n-m:n} \\
&\qquad - [A^M_{1:m} + \sum_{i=2m+1}^{n-1} A^M_{i-m} + A^M_{n-m:n} + \sum_{i=2m+1}^{n-1} B^\star_{i-m} + B^\star_{n-m:n}] \tag{69} \\
&\leq A^\star_{1:m} + \sum_{i=2m+1}^{n-1} A^\star_{i-m} + A^\star_{n-m:n} + \sum_{i=2m+1}^{n-1} B^M_{i-m} + B^M_{n-m:n} \tag{70} \\
&\leq [m(n-2m)(r-k)k + ((n-2m-1) + (m+1))(n-2m-1)rk \\
&\qquad + \tfrac{1}{2}(n-2m-2)(n-2m-1)k^2 + (m+1)(n-2m-1)k^2] \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \tag{71} \\
&\leq [m(n-2m)(r-k)k + (n-m)(n-2m)rk + \tfrac{1}{2}(n-2m)(n-2m-1)k^2 + (m+1)(n-2m)k^2] \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \\
&= [m(n-2m)(r-k)k + (n-m)(n-2m)rk + \tfrac{1}{2}(n-2m)(n-2m+1)k^2 + m(n-2m)k^2] \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \\
&= [mr + (n-m)r + \tfrac{1}{2}(n-2m+1)k](n-2m)k \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \\
&= [nr + \tfrac{1}{2}(n-2m+1)k](n-2m)k \log\left\{\frac{\xi}{\eta(1+\eta)}\right\} \,.
\end{align*}
Using (57), (64), and Lemma 15, (67) can be obtained. Applying $\theta$ to (67), (68) can be obtained. Since $\theta \geq$