Adaptive Bayesian Radio Tomography
IIEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED, APRIL 4, 2018) 1
Adaptive Bayesian Radio Tomography
Donghoon Lee,
Student Member, IEEE,
Dimitris Berberidis,
Student Member, IEEE, and Georgios B. Giannakis,
Fellow, IEEE
Abstract —Radio tomographic imaging (RTI) is an emergingtechnology to locate physical objects in a geographical areacovered by wireless networks. From the attenuation measure-ments collected at spatially distributed sensors, radio tomog-raphy capitalizes on spatial loss fields (SLFs) measuring theabsorption of radio frequency waves at each location along thepropagation path. These SLFs can be utilized for interferencemanagement in wireless communication networks, environmentalmonitoring, and survivor localization after natural disaster suchas earthquakes. Key to success of RTI is to model accuratelythe shadowing effects as the bi-dimensional integral of the SLFscaled by a weight function, which is estimated using regularizedregression. However, the existing approaches are less effectivewhen the propagation environment is heterogeneous. To cope withthis, the present work introduces a piecewise homogeneous SLFgoverned by a hidden Markov random field (MRF) model. Effi-cient and tractable SLF estimators are developed by leveragingMarkov chain Monte Carlo (MCMC) techniques. Furthermore,an uncertainty sampling method is developed to adaptively collectinformative measurements in estimating the SLF. Numerical testsusing synthetic and real datasets demonstrate capabilities ofthe proposed algorithm for radio tomography and channel-gainestimation.
Index Terms —Radio tomography, channel-gain cartography,Markov chain Monte Carlo, active learning, Bayesian inference
I. I
NTRODUCTION
Tomographic imaging is a technique widely appreciatedby natural sciences, notably in medical imaging [29]. Theprinciples underpinning radio tomographic methods have beencarried over to construct underlying spatial loss fields (SLFs),which are maps quantifying the attenuation experienced byelectromagnetic waves in radio frequency (RF) bands at ev-ery spatial position [24]. To this end, pairs of collaboratingsensors are deployed over the area of interest to estimatethe attenuation introduced by the channel between thosepairs of sensors. Different from conventional methods, radiotomography relies on incoherent measurements containing nophase information; see also [17] for another application ofincoherent measurements to cognitive radio networks. Suchsimplification saves costs incurred for synchronization thatis necessary to calibrate phase differences among waveformsreceived at different sensors.
Parts of this work will be presented at the IEEE International Conference onAcoustics, Speech and Signal Processing, held in Calgary, Alberta, Canada,during Apr. 15-20, 2018.D. Lee, D. Berberidis, and G. B. Giannakis are with the Department ofElectrical and Computer Engineering and the Digital Technology Center,University of Minnesota, Minneapolis, MN 55455, USA. Emails: { leex6962,bermp001, georgios } @umn.edu.The work in this paper was supported by grants NSF grants 1442686,1508993, 1509040. SLFs are instrumental in various problems including radiotomography [32] and channel-gain cartography [16]. The ab-sorption captured by the SLF allows one to discern objectslocated in the area of interest, thus enabling radio tomographicimaging (RTI). Benefiting from the ability of RF waves topenetrate physical structures such as trees or buildings, RTIprovides a means of device-free passive localization [33],[34], and has found diverse applications in disaster responsesituations for e.g., detecting individuals trapped in buildings orsmoke [31]. SLFs are also useful in channel-gain cartographyto provide channel-state information (CSI) for links betweenarbitrary locations even where no sensors are present [16].Such maps can be employed in cognitive radio setups tocontrol the interference that a secondary network inflictsto primary users that do not transmit–a setup encounteredwith television broadcast systems [36], [5], [15]. The non-collaborative nature of these primary users precludes training-based channel estimation between secondary transmitters andprimary receivers. Other applications of channel-gain mapsinclude network planning, and interference management incellular networks.The key premise behind RTI is that spatially close radiolinks exhibit similar shadowing due to the presence of commonobstructions. This shadowing correlation is related to thegeometry of objects present in the area waves propagatethrough [24], [1]. As a result, shadowing is modeled as theweighted line integral of the underlying two-dimensional SLF.The weights in the integral are determined by a functiondepending on the transmitter-receiver locations [24], [9], [27],which models the SLF influence on the shadowing over a linkbetween those transceivers. Inspired by this SLF model, vari-ous tomographic imaging methods were proposed [32], [31],[30], [14]. To detect locations of changes in the propagationenvironment, the difference between the SLF at consecutivetime slots was employed [32], [30]. To cope with multipathfading in a cluttered environment, multiple channel measure-ments were utilized to enhance localization accuracy [14].Although these are calibration-free approaches, they cannotreveal static objects in the area of interest. It is also possibleto replace the SLF with a label field indicating presence (orabsence) of objects in motion on each voxel [31], and leveragethe influence moving objects on the propagation path have onvariance in RSS measurements. On the other hand, the SLFwas directly reconstructed in [9] to depict the static structurein the area of interest, but calibration was necessary by usingmeasurements collected without the structure.A different body of works inspired by the SLF model isavailable for channel-gain cartography [16], [27], [4], [18].Linear interpolation techniques such as kriging were employed a r X i v : . [ ee ss . SP ] A p r IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED, APRIL 4, 2018) to estimate shadowing effects based on spatially correlatedmeasurements [4], and the spatio-temporal dynamics weretracked by using Kalman filtering approaches [16]. SLFswith ‘regular patterns’ of objects have also been modeled asa superposition of a low-rank matrix plus a sparse matrixcapturing structure irregularities [18]. While the aforemen-tioned methods rely on heuristic criteria to choose the weightfunction, [27] provides a suite of blind algorithms to learnthe weight function using a non-parametric kernel regressionmethod.Conventionally, the SLF is learned via regularized least-squares (LS) methods tailored to the propagation environ-ment [9], [18], [30]. However, these approaches are less effec-tive when the propagation environment exhibits heterogeneouscharacteristics. To account for environmental heterogeneity,the novel method here leverages the Bayesian frameworkto learn the piecewise homogeneous SLF through a hiddenMarkov random field (MRF) model [12], which capturesspatial correlations of neighboring regions exhibiting relatedstatistical behavior. Efficient field estimators will be derivedby using Markov chain Monte Carlo (MCMC) sampling [8],which is a powerful tool for Bayesian inference when analyt-ical solutions of the minimum mean-square error (MMSE) orthe maximum a posteriori (MAP) estimators are not available.Furthermore, hyperparameters are estimated as well, insteadof being fixed a priori.Besides accounting for heterogeneous propagation, anothercontribution here is an adaptive data acquisition technique,with the goal of reducing SLF uncertainty, by cross-fertilizingideas from the fields of experimental design [6] and activelearning [20]. The conditional entropy of the SLF is consideredas an uncertainty measure in this work, giving rise to a noveldata acquisition criterion. Although such criterion is intractableespecially when the size of the SLF is large, its efficient proxycan be obtained thanks to the availability of posterior samplesfrom the proposed MCMC-based algorithm.The rest of the paper is organized as follows. Sec. II reviewsthe radio tomography model and states the problem. TheBayesian model and the resultant field reconstruction are thesubjects of Sec. III. Numerical tests with synthetic as well asreal measurements are provided in Sec. IV. Finally, Sec. Vsummarizes the main conclusions.
Notations:
Bold uppercase (lowercase) letters denote matri-ces (column vectors). Calligraphic letters are used for sets; I n is the n × n identity matrix; while n and n denote n × vectors of all zeros and ones, respectively. Operators ( · ) > and tr( · ) represent the transpose and trace of a matrix X ∈ R N x × N y , respectively; | · | is used for the cardinality ofa set, and the magnitude of a scalar; and vec( X ) produces acolumn vector x ∈ R N x N y by stacking the columns of a matrixone below the other ( unvec( x ) denotes the reverse process).For a vector y ∈ R n and an n × n weight matrix ∆ , theweighted norm of y is k y k ∆ := y > ∆y .II. B ACKGROUND AND P ROBLEM S TATEMENT
Consider a set of sensors deployed over a two-dimensionalgeographical area indexed by a set
A ⊂ R . After averaging out small-scale fading effects, the channel-gain measurementover a link between a transmitter located at x ∈ A and areceiver located at x ∈ A can be represented (in dB) as g ( x , x ) = g − γ
10 log d ( x , x ) − s ( x , x ) (1)where g is the path gain at unit distance; d ( x , x ) := k x − x k is the Euclidean distance between the transceivers at x and x ; γ is the pathloss exponent; and s ( x , x ) is the attenuation dueto shadow fading. For radio tomography, a tomographic modelfor the shadow fading is [24], [9], [18] s ( x , x ) = Z A w ( x , x , ˜ x ) f (˜ x ) d ˜ x . (2)where f : A → R denotes the spatial loss field (SLF)capturing the attenuation at location ˜ x , and w : A×A×A → R is the weight function modeling the influence of the SLFat ˜ x to the shadowing experienced by link x – x . Typically, w confers a greater weight w ( x , x , ˜ x ) to those locations ˜ x lying closer to the link x – x . Examples of the weight functioninclude the normalized ellipse model [30] w ( x , x , ˜ x ) := / p d ( x , x ) , if d ( x , ˜ x ) + d ( x , ˜ x ) < d ( x , x ) + λ/ , otherwise (3)where λ > is a tunable parameter. The value of λ iscommonly set to the wavelength to assign non-zero weightsonly within the first Fresnel zone. In radio tomography, theintegral in (2) is approximated as s ( x , x ) ’ c N g X i =1 w ( x , x , ˜ x i ) f (˜ x i ) (4)where { ˜ x i } N g i =1 is a grid of points over A and c is a constantthat can be set to unity without loss of generality by absorbingany scaling factor in f . Clearly, (4) shows that s ( x , x ) depends on f only through its values at the grid points.The model in (2) describes how the spatial distribution ofobstructions in the propagation path influences the attenuationbetween a pair of locations. The usefulness of this model istwofold: i) as f represents absorption across space, it can beused for imaging; and ii) once f and w are known, the gainbetween any two points x and x can be recovered through(1) and (2), which is precisely the objective of channel-gaincartography.The goal of radio tomography is to obtain a tomogram by es-timating f . To this end, N sensors located at { x , . . . , x N } ∈A collaboratively obtain channel-gain measurements. At timeslot τ , the radios indexed by n ( τ ) and n ( τ ) measure thechannel-gain ˇ g τ := g ( x n ( τ ) , x n ( τ ) ) + ν τ by exchangingtraining sequences known to both transmitting and receivingradios, where n ( τ ) , n ( τ ) ∈ { , . . . , N } and ν τ denotesmeasurement noise. It is supposed that g and γ have beenestimated during a calibration stage. After subtracting thesefrom ˇ g τ , the shadowing estimate is found as ˇ s τ := g − γ
10 log d ( x n ( τ ) , x n ( τ ) ) − ˇ g τ = s ( x n ( τ ) , x n ( τ ) ) − ν τ . (5) EE, BERBERIDIS, AND GIANNAKIS: ADAPTIVE BAYESIAN RADIO TOMOGRAPHY 3
Having available ˇ s t := [ˇ s , . . . , ˇ s t ] > ∈ R t along withthe known set of links { ( x n ( τ ) , x n ( τ ) ) } tτ =1 and the weightfunction w , the problem is to estimate f , or equivalently f := [ f (˜ x ) , . . . , f (˜ x N g )] > ∈ R N g using (4).Regularized least-squares (LS) estimators of f solve [9],[18], [30] min f t X τ =1 (cid:18) ˇ s τ − N g X i =1 w ( x n ( τ ) , x n ( τ ) , ˜ x i ) f (˜ x i ) (cid:19) + µ f R ( f ) (6)where R : R N g → R is a generic regularizer to promote aknown attribute of f , and µ f ≥ is a regularization weightto reflect compliance of f with this attribute. Although (6)has been successfully applied to radio tomographic imagingtasks after customizing the regularizer to the propagationenvironment, how accurate approximation is provided by aregularized solution of (6) is unclear when the propagationenvironment exhibits inhomogeneous characteristics.To overcome this and improve estimation performance ofthe SLF, a priori knowledge on the heterogeneous structure of f will be exploited next, under a Bayesian framework.III. A DAPTIVE B AYESIAN R ADIO T OMOGRAPHY
In this section, we view f as random, and forth propose atwo-layer Bayesian SLF model, along with an MCMC-basedapproach for inference. We further develop an adaptive dataacquisition strategy to select informative measurements. A. Bayesian model and problem formulation
Let A consist of two disjoint homogeneous regions A := { x | E [ f ( x )] = µ f , Var [ f ( x )] = σ f , x ∈ A} , and A := { x | E [ f ( x )] = µ f , Var [ f ( x )] = σ f , x ∈ A} , giving rise toa hidden label field z := [ z (˜ x ) , . . . , z (˜ x N g )] > ∈ { , } N g with binary entries z (˜ x i ) = k if ˜ x i ∈ A k ∀ i , and k = 0 , .The two separate regions can be used to model heterogeneousenvironments. For instance, if A corresponds to an urban area, A may include densely populated regions with buildings,while A with µ f < µ f may capture the less obstructiveopen spaces. In such a paradigm, we model the conditionaldistribution of f (˜ x i ) as f (˜ x i ) | z (˜ x i ) = k ∼ N ( µ f k , σ f k ) , (7)while the Ising prior [28], which is a binary version of thediscrete MRF Potts prior [12], is assigned to z in order tocapture the dependency among spatially correlated labels. Bythe Hammersley-Clifford theorem [10], the Ising prior of z follows a Gibbs distribution p ( z | β ) = 1 C ( β ) exp β N g X i =1 X j ∈N (˜ x i ) δ ( z (˜ x j ) = z (˜ x i )) (8)where N (˜ x i ) is a set of indices associated with 1-hop neigh-bors of ˜ x i on the rectangular grid in Fig. 1, β is a granularitycoefficient controlling the degree of homogeneity in z , δ ( · ) isKronecker’s delta, and C ( β ) := X z ∈Z exp β N g X i =1 X j ∈N (˜ x i ) δ ( z (˜ x j ) = z (˜ x i )) (9) Fig. 1: Four-connected MRF with z (˜ x i ) marked red and itsneighbors in N (˜ x i ) marked blue.Fig. 2: Gauss-Potts model for radio tomography, together withthe measurement model for sensors located at ( x n , x n ) .is the partition function with Z := { , } N g . By assumingconditional independence of { f (˜ x i ) } N g i =1 given z , the result-ing model is referred to as the Gauss-Potts model [2] withtwo labels. The Gauss-Potts model for radio tomography isdepicted in Fig. 2 with the measurement model in (2).To describe priors of other parameters, let ν t be inde-pendent and identically distributed (i.i.d) Gaussian with zeromean and variance σ ν , and θ denote a hyperparameter vec-tor comprising σ ν , β , and θ f := [ µ f , µ f , σ f , σ f ] > . Theweight matrix W ∈ R N g × t is formed with columns w τ :=[ w ( x n ( τ ) , x n ( τ ) , ˜ x ) , . . . , w ( x n ( τ ) , x n ( τ ) , ˜ x N g )] > ∈ R N g ofthe link x n ( τ ) – x n ( τ ) for τ = 1 , . . . , t . Assuming the indepen-dence among entries of θ , p ( θ ) can be expressed as p ( θ ) = p ( σ ν ) p ( β ) p ( µ f k ) p ( σ f k ) (10)with p ( µ f k ) = p ( µ f ) p ( µ f ) and p ( σ f k ) = p ( σ f ) p ( σ f ) ,where the individual priors p ( σ ν ) , p ( β ) , p ( µ f k ) , and p ( σ f k ) are specified next.
1) Granularity coefficient β : To cope with the variabilityof β in accordance with structural patterns of the propagationmedium, β is viewed as an unknown random variable that isto be estimated together with f and z under the Bayesianframework. Similar to e.g., [25], the uniform distribution isadopted for the prior of β as p ( β ) = U (0 ,β max ) ( β ) := ( /β max , if β ∈ [0 , β max ]0 , otherwise . (11)
2) Noise variance σ ν : In the presence of the additiveGaussian noise with fixed mean, it is common to assign a
IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED, APRIL 4, 2018)
Fig. 3: Graphical representation of the hierarchical Bayesianmodel for (hyper) parameters (those in boxes are fixed).conjugate prior to σ ν , which reproduces a posterior distributionin the same family of its prior. The inverse gamma (IG)distribution serves this purpose for σ ν ∈ R + as follows: p ( σ ν ) = IG ( a ν , b ν ) := b a ν ν Γ( a ν ) ( σ ν ) − a ν − exp (cid:18) − b ν σ ν (cid:19) (12)where a ν is referred to as the shape parameter, b ν as the scaleparameter, and Γ( · ) denotes the gamma function.
3) Hyperparameters of the SLF θ f : While the prior for µ f k is assumed to be Gaussian with mean m k and variance σ k (see also [2]), the inverse Gamma distribution parameterizedby { a k , b k } is considered for the prior of σ f k : p ( µ f k ) = N ( m k , σ k ) , k = 0 , , (13) p ( σ f k ) = IG ( a k , b k ) , k = 0 , . (14)Together with the priors for { f , z , θ } , our joint posterior is p ( f , z , θ | ˇ s t ) ∝ p (ˇ s t | f , σ ν ) p ( f | z , θ f ) p ( z | β ) p ( θ ) (15)where p (ˇ s t | f , σ ν ) is the data likelihood. Note that Fig. 3summarizes the proposed hierarchical Bayesian model for { ˇ s t , f , z , θ } as a directed acyclic graph, where the dependencybetween (hyper) parameters is indicated with an arrow.We will pursue the the conditional MMSE estimator ˆ f MMSE := E [ f | z = ˆ z MAP , ˇ s t ] (16)where the marginal MAP estimate is ˆ z MAP := arg max z p ( z | ˇ s t ) . (17)Furthermore, the marginal MMSE estimates of the θ entriesare found as c σ ν MMSE := E [ σ ν | ˇ s t ] (18) ˆ β MMSE := E [ β | ˇ s t ] (19) d µ f k MMSE := E [ m k | ˇ s t ] , k = 0 , (20) d σ f k MMSE := E [ σ k | ˇ s t ] , k = 0 , . (21) B. Sampling via Markov chain Monte Carlo
While approximate estimators have been proposed forBayesian inference (see e.g., [13], [35]), analytical solutionsto (16) − (21) are not tractable due to the complex form of Algorithm 1
Metropolis-within-Gibbs sampler for { f , z , θ } Input: z (0) , θ (0) , ˇ s t , N CL , N Burn-in , and N Iter for l = 1 to N Iter do Generate f ( l ) ∼ p ( f | ˇ s t , z ( l − , θ ( l − ) in (22) Generate z ( l ) ∼ p ( z | ˇ s t , f ( l ) , θ ( l − ) via Alg. 2 Generate β ( l ) ∼ p ( β | ˇ s t , f ( l ) , z ( l ) , σ ν ( l − , θ ( l − f ) viaAlg. 4 Generate σ ν ( l ) ∼ p ( σ ν | ˇ s t , f ( l ) , z ( l ) , β ( l ) , θ ( l − f ) in (32) Generate µ ( l ) f k ∼ p ( µ f k | ˇ s t , f ( l ) , z ( l ) , σ ν ( l ) , β ( l ) , σ l − f k ) in (34) for k = 0 , Generate σ l ) f k ∼ p ( σ f k | ˇ s t , f ( l ) , z ( l ) , σ ν ( l ) , β ( l ) , µ ( l ) f k ) in (38) for k = 0 , end for return S ( t ) := (cid:8) f ( l ) , z ( l ) , θ ( l ) (cid:9) N Iter l = N Burn-in +1 the posterior in (15) that does not permit marginalization ormaximization. To bypass this challenge, one can generate sam-ples from (15), and then numerically approximate the desiredestimators from those samples. MCMC is a class of methodsused to generate samples from a complex distribution [8].Among MCMC methods, Gibbs sampling [7] is particularlysuitable for this work. It draws samples following the targetdistribution (e.g., the posterior in (15)) by sweeping througheach variable to sample from its conditional distribution whilefixing the others to their up-to-date values. Although thesamples at early iterations of Gibbs sampling with randominitialization are not representative of the desired distribution(such duration is called the burn-in period N Burn-in ), the theoryof MCMC guarantees that the stationary distribution of thosesamples matches with the target distribution [8].Gibbs sampling requires only the conditional distributionwithin a proportionality scale. When a given conditional dis-tribution is not easy to simulate, one can resort to a Metropolis-Hastings (MH) sampler [11], which generates a candidate froma simple proposal distribution of such conditional distribution,and accepts (or rejects) the candidate as a sample of interestunder a certain acceptance ratio α . The substitution of MHsampling for some sampling steps inside the Gibbs samplerresults in a Metropolis-within-Gibbs (MwG) sampler, as listedin Alg. 1. Posterior conditionals considered in this work andassociated sampling methods will be described next.
1) Spatial loss field f : It is easy to show that p ( f | ˇ s t , z , θ ) ∝ p (ˇ s t | f , σ ν ) p ( f | z , θ f ) ∼ N ( ˇ µ f | z, θ , ˇ s t , Σ f | z, θ , ˇ s t ) (22)where Σ f | z, θ , ˇ s t := (cid:16) ( σ ν ) − WW > + ∆ − f | z (cid:17) − (23) ˇ µ f | z, θ , ˇ s t := Σ f | z, θ , ˇ s t (cid:16) ( σ ν ) − W ˇ s t + ∆ − f | z µ f | z (cid:17) (24)since p ( f | z , θ f ) follows N ( µ f | z , ∆ f | z ) by (7), with µ f | z := E [ f | z ] and ∆ f | z := diag ( { Var [ f i | z i ] } N g i =1 ) with f i := f (˜ x i ) and z i := z (˜ x i ) (see Appendix A for derivation). Hence, f can be easily simulated by a standard sampling method. EE, BERBERIDIS, AND GIANNAKIS: ADAPTIVE BAYESIAN RADIO TOMOGRAPHY 5
Algorithm 2
Single-site Gibbs sampler for z Input: f ( l ) and z ( l − Initialize ζ ( l ) := [ ζ ( l )1 , . . . , ζ ( l ) N g ] > = z ( l − for i = 1 to N g do Obtain h i in (26) with z = ζ ( l ) and f = f ( l ) Generate u ∼ U (0 , if u < (1 + h i ) − then Set ζ ( l ) i = 1 else Set ζ ( l ) i = 0 end if end for return z ( l ) = ζ ( l )
2) Hidden label field z : A Gibbs sampler is required tosimulate p ( z | ˇ s t , f , θ ) ∝ p ( f | z , θ f ) p ( z | β ) while avoiding theintractable computation of C ( β ) in (8). Let z − i and z N (˜ x i ) represent replicas of z without its i -th entry, and only withthe entries of N (˜ x i ) , respectively. By the Markovianity of z and conditional independence between f i and f j ∀ i = j given z , the conditional distribution of z i is p ( z i | z − i , ˇ s t , f , θ ) ∝ exp ‘ ( z i ) + β X j ∈N (˜ x i ) δ ( z j = z i ) (25)where ‘ ( z i ) := ln p ( f i | z i , θ f ) . After evaluating (25) for z i =0 , and normalizing, one can obtain p ( z i = 1 | z − i , ˇ s t , f , θ ) =(1 + h i ) − , where h i := exp (cid:20) ‘ ( z i = 0) − ‘ ( z i = 1) + X j ∈N (˜ x i ) β (1 − z j ) (cid:21) (26)with δ ( z j = 0) − δ ( z j = 1) = 1 − z j . Then, the sampleof z can be obtained via the single-site Gibbs sampler byusing (26), as summarized in Alg. 2. It is worth stressing thatthe sampling criterion with h i in (26) does not require theevaluation of C ( β ) .
3) Granularity coefficient β : The conditional distributionof β satisfies the following proportionality relation p ( β | ˇ s t , f , z , σ ν , θ f ) ∝ p ( z | β ) p ( β ) (27) ∝ β max C ( β ) exp β N g X i =1 X j ∈N (˜ x i ) δ ( z j = z i ) for β ∈ [0 , β max ] , simply by the Gibbs distribution in (8) andthe uniform prior of β in (11). Unfortunately, sampling of β isformidably challenging because evaluating the partition func-tion C ( β ) in p ( z | β ) , incurs exponential complexity. To addressthis, one may resort to auxiliary variable MCMC methods thatdo not require exact evaluation of p ( z | β ) , including the singleauxiliary variable method (SAVM) [22] and the exchangealgorithm [23]. Those methods replace C ( β ) with its single-point importance sampling estimate by using an auxiliaryvariable, which unfortunately must be generated via exactsampling that is generally expensive for statistical models withintractable partition functions. To bypass exact sampling for Algorithm 3
Single-site Gibbs sampler for z ∗ Input: z ( l ) , β ∗ , and N CL Initialize ζ ∗ := [ ζ ∗ , . . . , ζ ∗ N g ] > = z ( l ) for m = 1 to N CL do for i = 1 to N g do Obtain h ∗ i in (29) with z ∗ = ζ ∗ Generate u ∼ U (0 , if u < (1 + h ∗ i ) − then Set ζ ∗ i = 1 else Set ζ ∗ i = 0 end if end for end for return z ∗ = ζ ∗ generating this auxiliary variable, we will leverage a double -MH sampling method for β ; also [19].Let z ∗ and β ∗ denote the auxiliary variable of z and acandidate of β for MH sampling, respectively. The idea behindthe double-MH algorithm is to generate z ∗ through N CL cyclesof MH updates from the current sample z ( l ) , instead of usingexact sampling from p ( z ∗ | β ∗ ) . As the name suggests, thedouble-MH sampling includes two nested MCMC samplers:the inner one to generate a chain of the auxiliary variable ateach step of the outer sampler for β . It is instructive to mentionthat N CL is not necessarily large by initializing the chain with z ( l ) at the l -th iteration [25], [19], which means that additionalcomplexity to generate the auxiliary variable is not necessarilyhigh. In this work, z ∗ is obtained via another single-site Gibbssampler, as described in Alg. 3: p ( z ∗ i | z ∗− i , β ∗ ) ∝ exp β ∗ X j ∈N (˜ x i ) δ ( z ∗ j = z ∗ i ) ∀ i (28)and a sample of z ∗ i is generated by utilizing p ( z ∗ i =1 | z ∗− i , β ∗ ) = (1 + h ∗ i ) − with h ∗ i := exp (cid:20) X j ∈N (˜ x i ) β (1 − z ∗ j ) (cid:21) . (29)The overall double-MH sampler for β is summarized inAlg. 4. A proposal distribution of β ∗ is the truncated Gaussian q ( β ∗ | β ( l − ) = ( N ( β ( l − , σ q ) /c, if β ∗ ∈ [0 , β max ]0 , otherwise (30)with a tunable variable σ q and a normalizing constant c := Z β max q πσ q exp (cid:20) − σ q (cid:0) β ∗ − β ( l − (cid:1) (cid:21) dβ ∗ . (31)
4) Noise variance σ ν : With p ( σ ν ) in (12), we have theposterior conditional of σ ν satisfying p ( σ ν | ˇ s t , f , z , β, θ f ) ∝ p (ˇ s t | f , σ ν ) p ( σ ν ) ∝ IG ( a ν + t , b ν + 12 k ˇ s t − W > f k ) . (32) IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED, APRIL 4, 2018)
Algorithm 4
Double-MH sampler for β Input: β ( l − , z ( l ) , and N CL Generate β ∗ ∼ q ( β ∗ | β ( l − ) in (30) Generate z ∗ ∼ p ( z ∗ | β ∗ ) via Alg. 3 Set α := p ( β ∗ ) q ( β ( l − | β ∗ ) p ( z ∗ | β ( l − ) p ( z ( l ) | β ∗ ) p ( β ( l − ) q ( β ∗ | β ( l − ) p ( z ( l ) | β ( l − ) p ( z ∗ | β ∗ ) Obtain α = min { , α } Generate u ∼ U (0 , if u < α then β ( l ) = β ∗ else β ( l ) = β ( l − end if return β ( l ) Therefore, a sample of σ ν can be generated by a standardsampling method.
5) Means of the SLF µ f k : Let f k be the N k × vectorformed by concatenating f (˜ x i ) for ˜ x i ∈ A k , for k = 0 , .By recalling the priori independence between the parametersof disjoint homogeneous regions A and A , the posteriorconditional of µ f k := [ µ f , µ f ] > can be expressed as p ( µ f k | ˇ s t , f , z , σ ν , β, σ f , σ f ) (33) ∝ p ( f | z , θ f ) p ( µ f k ) ∝ p ( µ f | z , f , σ f ) p ( µ f | z , f , σ f ) with p ( µ f k | z , f k , σ f k ) ∝ p ( f k | z , µ f k , σ f k ) p ( µ f k ) , ∀ k. (34)Since a sample of each µ f k can be independently drawnaccording to p ( µ f k | z , f k , σ f k ) in (34), the sampling methodfor µ f k will be described.To efficiently simulate a sample of µ f k , the likelihood p ( f k | z , µ f k , σ f k ) is recast as an univariate distribution withrespect to the sample mean ¯ f k := ( P i f k,i ) /N k as p ( f k | z , µ f k , σ f k ) ∝ exp " − σ f k N k X i =1 ( f k,i − µ f k ) ∝ exp " − σ f k ( − µ f k N k X i =1 f k,i + N k µ f k ) ∝ exp " − N k σ f k ( ¯ f k − µ f k ) ∝ N ( µ f k , σ f k /N k ) . (35)Since p ( µ f k ) is the Gaussian conjugate prior, one can showthat p ( µ f k | z , f k , σ f k ) is Gaussian as well, parameterized by E (cid:2) µ f k | z , f k , σ f k (cid:3) = σ k ¯ f k σ k + ( σ f k /N k ) + σ f k /N k σ k + ( σ f k /N k ) m k Var (cid:2) µ f k | z , f k , σ f k (cid:3) = σ k + N k σ f k ! − . (36)Therefore, a sample of µ f k can be generated for k = 0 , byusing a standard sampling method.
6) Variances of the SLF σ f k : Similar to µ f k , the statisticalindependence between A and A leads to the following pro-portionality of the posterior conditional for σ f k := [ σ f , σ f ] > p ( σ f k | ˇ s t , f , z , σ ν , β, µ f k ) ∝ p ( f | z , θ f ) p ( σ f k ) ∝ p ( σ f | z , f , µ f ) p ( σ f | z , f , µ f ) (37)where p ( σ f k | z , f k , µ f k ) ∝ p ( f k | z , µ f k , σ f k ) p ( σ f k ) ∝ IG ( a k + N k , b k + 12 k f k − µ f k N k k ) , ∀ k. (38)Therefore, a sample of each σ k can be independently drawnaccording to p ( σ f k | z , f k , µ f k ) in (38). C. Efficient estimators for f , z , and θ . In this section, efficient sample-based estimators for f , z ,and θ are derived, by using a set of samples S ( t ) from Alg. 1.Building on [13], the elementwise marginal MAP estimator of z and its sample-based approximation are ˆ z i, MAP = arg max z i ∈{ , } p ( z i | ˇ s t ) ’ arg max z i ∈{ , } |S ( t ) | N Iter X l = N Burn-in +1 δ ( z ( l ) i = z i ) (39)for i = 1 , . . . , N g . After obtaining ˆ z MAP , the sample-basedelementwise conditional MMSE estimator of f follows as ˆ f i, MMSE ’ |S ( t ) i | N Iter X l = N Burn-in +1 f ( l ) i δ ( z ( l ) i = ˆ z i, MAP ) , ∀ i (40)where S ( t ) i ⊂ S ( t ) is a subset of samples such that z ( l ) i =ˆ z i, MAP for l = N Burn-in + 1 , . . . , N
Iter . To estimate θ , thefollowing marginal MMSE estimators are employed ˆ β MMSE ’ |S ( t ) | N Iter X l = N Burn-in +1 β ( l ) (41) c σ ν MMSE ’ |S ( t ) | N Iter X l = N Burn-in +1 σ ν ( l ) (42) d µ f k MMSE ’ |S ( t ) | N Iter X l = N Burn-in +1 µ ( l ) f k , k = 0 , (43) d σ f k MMSE ’ |S ( t ) | N Iter X l = N Burn-in +1 σ f k ( l ) , k = 0 , . (44) Remark 1 (Monitoring sampler-convergence).
The proposedsampler in Alg. 1 generates a sequence of samples from thedesired distribution in (15), after a burn-in period to diminishthe influence of initialization. By recalling that the stationarydistribution of those samples is matched with the desired dis-tribution, monitoring convergence of sample-sequences guidesthe choice of N Burn-in .Let ψ denote a generic scalar random variable of interest.Suppose that N Seq parallel sequences of length N Iter areavailable, and let ψ ( l,m ) denote the l -th sample of ψ in the m -th sequence for l = 1 , . . . , N Iter and m = 1 , . . . , N Seq . Then,
EE, BERBERIDIS, AND GIANNAKIS: ADAPTIVE BAYESIAN RADIO TOMOGRAPHY 7 the following potential scale reduction factor (PSRF) estimateis adopted for convergence diagnosis [8]PSRF ( ψ ) := N Iter − N Iter + σ Between σ Within (45)where N Iter := N Iter − N Burn-in , the within-sequence variance: σ Within := 1 N Seq N Seq X m =1 N Iter − N Iter X l = N Burn-in +1 (cid:0) ψ ( l,m ) − ¯ ψ ( m ) (cid:1) (46)with ¯ ψ ( m ) := P N Iter l = N Burn-in +1 ψ ( l,m ) / ( N Iter − ∀ m , and thebetween-sequence variance: σ Between := 1 N Seq N Seq X m =1 (cid:0) ¯ ψ ( m ) − ¯ ψ (cid:1) (47)with ¯ ψ ( m ) := P N Seq m =1 ¯ ψ ( m ) /N Seq . As those sequences convergewhile N Iter → ∞ , the PSRF declines to . In practice, eachsequence is supposed to follow the desired distribution whenPSRF ≤ . [8, p. 138]. For synthetic data tests, N Burn-in and N Iter were found to have PSRF ≤ . for f , z , and θ over N Seq = 20 independent sequences. On the otherhand, N Burn-in and N Iter for real data tests were found to havePSRF ≤ . for f and z , while the PSRF < . for θ ,over N Seq = 20 independent sequences. It allows to havemoderate-sized N Burn-in and N Iter for real data tests. Note thatelementwise { PSNR ( f i ) , PSNR ( z i ) } N g i =1 were monitored for f and z . D. Adaptive data acquisition via uncertainty sampling
The proposed Bayesian radio tomography accounts for theuncertainty of f , through the variance in (23). Using the latter,our idea is to adaptively collect a measurement (or a mini-batch of measurements) from the set of available sensing radiopairs, with the goal of reducing the uncertainty of f . To thisend, we will rely on the conditional entropy [3] that in ourcontext is given by H τ ( f | ˇ s τ , z , θ ) = X z ∈Z Z θ , ˇ s τ p (ˇ s τ , z , θ ) (48) × H τ ( f | ˇ s τ = ˇ s τ , z = z , θ = θ ) d θ d ˇ s τ where H τ ( f | ˇ s τ = ˇ s τ , z = z , θ = θ ):= − Z p ( f | ˇ s τ , z , θ ) ln p ( f | ˇ s τ , z , θ ) d f = 12 ln( (cid:12)(cid:12) Σ f | z (cid:12)(cid:12) ) + N g (cid:18) π ) (cid:19) (49)and | · | denotes matrix determinant. To obtain ˇ s τ +1 , onecan choose a pair of sensors, for which w τ +1 , minimizes H τ +1 ( f | ˇ s τ +1 , z , θ ) . Given ˇ s τ , we write H τ +1 ( f | ˇ s τ +1 , z , θ ) = H τ ( f | ˇ s τ , z , θ ) − X z ∈Z Z θ , ˇ s τ +1 p (ˇ s τ +1 , z , θ ) q ( z , θ , w τ +1 ) d θ d ˇ s τ +1 (50) Algorithm 5
Adaptive Bayesian radio tomography
Input: z (0) , θ (0) , ˇ s (0) , N CL , N Burn-in , and N Iter . Set ˇ s = ˇ s (0) for τ = 0 , , . . . do Obtain S ( τ ) via Alg. 1( z (0) , θ (0) , ˇ s τ , N CL , N Burn-in , N
Iter ) Obtain ˆ z ( τ ) MAP from (39) by using S ( τ ) Obtain ˆ f ( τ ) MMSE from (40) by using ˆ z ( τ ) MAP and S ( τ ) Obtain ˆ θ ( τ ) MMSE from (41)–(44) by using S ( τ ) Calculate ¯ u ( w ) in (52) for w ∈ W τ +1 by using S ( τ ) Collect ˇ s τ +1 from sensors associated with max ¯ u ( w ) Construct ˇ s τ +1 = [ˇ s > τ , ˇ s τ +1 ] > Set z (0) = ˆ z ( τ ) MAP and θ (0) = ˆ θ ( τ ) MMSE end for with q ( z , θ , w ) := ln (cid:0) σ ν ) − w > Σ f | z w (cid:1) / , and seek w τ +1 by solving(P1) max w ∈W τ +1 E z , θ | ˇ s τ [ q ( z , θ , w )]= X z ∈Z Z θ p ( z , θ | ˇ s τ ) q ( z , θ , w ) d θ (51)where W τ +1 is a set of weight vectors found from locationsof available sensing radio pairs at time slot τ + 1 (seeAppendix B for derivation of (P1)). Note that solving (P1)to find w τ +1 does not require p ( z , θ | ˇ s τ +1 ) , which meansthe joint posterior in (15) does not need to be retrained foradaptive data acquisition.Apparently, solving (P1) is not an easy task since evaluating E z , θ | ˇ s τ [ q ( z , θ , w )] is intractable especially for large N g since |Z| = 2 N g . Fortunately, the samples from Alg. 1 can be usedto approximate E z , θ | ˇ s τ [ q ( z , θ , w )] ’ |S ( τ ) | N Iter X l = N Burn-in +1 q ( z ( l ) , θ ( l ) , w ) =: ¯ u ( w ) . (52)Therefore, ˇ s τ +1 can be obtained from the pair of sensorscorresponding to w with the maximum value of ¯ u ( w ) in (52).The steps involved for adaptive Bayesian radio tomographyare listed in Alg. 5. Remark 2 (Mini-batch setup).
The proposed adaptive dataacquisition method can be easily extended to a mini-batchsetup of size N Batch per time slot τ as follows: i) findweight vectors { w ( i ) } N Batch i =1 ⊂ W τ +1 associated with N Batch largest values of ¯ u ( w ) in (52), and collect the correspondingmeasurements { ˇ s ( i ) τ +1 } N Batch i =1 (steps – in Alg. 5); and ii)aggregate those measurements below ˇ s τ to construct ˇ s τ +1 :=[ˇ s > τ , ˇ s (1) τ +1 , . . . , ˇ s ( N Batch ) τ +1 ] > (step in Alg. 5). Numerical testswill be performed to assess the mini-batch operation of Alg. 5.IV. N UMERICAL T ESTS
Performance of the proposed algorithms was assessedthrough numerical tests using both synthetic and real datasets.A few existing methods were also tested for comparison,including the ridge-regularized SLF estimate given by ˆ f LS =( WW > + µ f C − f ) − W ˇ s t [9], where C f is a spatial co-variance matrix modeling the similarity between points ˜ x i IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED, APRIL 4, 2018) and ˜ x j in area A . We further tested the total variation (TV)-regularized LS scheme in [26], which solves the regularizedproblem in (6) with R ( f ) = N x − X i =1 N y X j =1 | f i +1 ,j − f i,j | + N x X i =1 N y − X j =1 | f i,j +1 − f i,j | , (53)where F := unvec( f ) ∈ R N x × N y , and f i,j denotes the ( i, j ) -th element of F . As a competing alternative of theproposed adaptive sampling, simple random sampling wasconsidered for both regularized LS estimators, by selecting { ˇ s ( i ) τ +1 } N Batch i =1 ∀ τ uniformly at random. Particularly, Alg. 5 afterreplacing steps – with random sampling is named as the non-adaptive Bayesian algorithm , and will be compared withthe proposed method throughout synthetic and real data tests. A. Test with synthetic data
This section validates the proposed algorithm through syn-thetic tests. Random tomographic measurements were takenby N = 120 sensors uniformly deployed on boundaries of A := [0 . , . × [0 . , . , from which the SLF definedover a grid { ˜ x i } , i =1 := { , . . . , } was reconstructed. Togenerate the ground-truth SLF f , the hidden label field z was obtained first via the Metropolis algorithm [21] by usingthe prior of z in (8) with β = 1 . . Afterwards, f wasconstructed to have f (˜ x i ) ∼ N (0 . , ∀ ˜ x i ∈ A and f (˜ x j ) ∼N (5 , . ∀ ˜ x j ∈ A resulting in θ f = [0 . , , , . > , respec-tively, based on labels in z . True F := unvec( f ) ∈ R × and Z := unvec( z ) ∈ { , } × are depicted in Fig. 4with sensor locations marked with crosses. The effects ofcalibration are not accounted for this section, meaning that g and γ are assumed to be known, and the fusion centerdirectly uses shadowing measurements ˇ s τ . Under the mini-batch operation, each measurement ˇ s ( i ) τ ∀ τ, i was generatedaccording to (5), where s τ was obtained by (4) with w setto the normalized ellipse model in (3), while ν τ was setto follow zero-mean Gaussian with σ ν = 5 × − . Toconstruct W τ +1 per time slot τ , |W τ +1 | = 100 pairs ofsensors were uniformly selected at random with replacement.Then, N Batch = 40 shadowing measurements correspondingto { w ( i ) } N Batch i =1 ⊂ W τ +1 were collected to execute Alg. 5 for τ = 0 , . . . , .In all synthetic tests, the following simulation parameterswere used: N CL = 2 , N Burn-in = 200 , N Iter = 500 , and σ q = 0 . were used to run the proposed algorithm; andhyper-hyperparameters of θ were set as listed in Table I.For initialization, θ (0) was set to have β (0) = 0 . , µ (0) f k =[ m , m ] > , and randomly initialized σ ν and σ f k . Vector z (0) was obtained by drawing z (0) i ∼ Bern (0 . for i = 1 , . . . , N g ,where Bern (0 . denotes the Bernoulli distribution with meanequal to . . Furthermore, ˇ s (0) was collected from randomlyselected pairs of sensors. For both regularized LS esti-mators, µ f was found to minimize E (cid:2) k f − ˆ f k (cid:3) throughgrid-search.The first experiment was performed to validate the efficacyof Alg. 5. The estimates ˆ F = unvec( ˆ f ) and ˆ Z = unvec( ˆ z ) at τ = 15 are displayed in Figs. 5c and 5d, respectively, together
10 20 30 4010203040 01 (a)
10 20 30 4010203040 − (b) Fig. 4: True fields for synthetic tests: (a) hidden label field Z and (b) spatial loss field F with N = 120 sensor locationsmarked with crosses.TABLE I: Hyper-hyperparameters of θ for synthetic tests. β max m m σ k , ∀ k a ν b ν a k , ∀ k b k , ∀ k
10 0 . . . . . . . TABLE II: True θ and estimated ˆ θ via Alg. 5 (setting ofFigs. 5c and 5d); and non-adaptive Bayesian algorithm (settingof Figs. 5e and 5f ) averaged over 20 independent Monte Carloruns. θ True Est. (Alg. 5) Est. (non-adaptive) β . . ± × − . ± × − σ ν .
05 0 . ± − . ± . × − µ f . . ± × − . ± . × − µ f . ± × − . ± × − σ f . ± × − . ± . × − σ f . . ± × − . ± . × − with the estimated SLFs from the regularized-LS estimatorsin Figs. 5a and 5b. The most satisfactory result was obtainedby the proposed method since piecewise homogeneous regionsof the SLF were separately reconstructed by introducing thehidden label field.To test the proposed adaptive data acquisition method, ˆ F and ˆ Z reconstructed by the non-adaptive Bayesian algorithmare shown in Figs. 5e and 5f, respectively. Comparison be-tween Figs. 5c and 5e visually demonstrates that improvedSLF reconstruction performance could be achieved throughadaptive data acquisition with the same number of measure-ments. Accuracy of ˆ z was also quantitatively measured by thelabeling-error, defined as k z − ˆ z k /N g . Fig. 6 displays theprogression of the labeling-error averaged over independentMonte Carlo runs. It shows that the proposed adaptive methodconsistently outperforms the non-adaptive one, which impliesthat selection of informative measurements to decrease uncer-tainty of f given current estimates of z and θ could lead tomore accurate estimates of f and z in the next time slot.Meanwhile, average estimates of θ and associated standarddeviation denoted with ± are listed in Table II, where everyhyperparameter was accurately estimated. Together with theresult in Fig. 5, the accurate estimates of the hyperparame-ters confirm that the proposed method can faithfully capturepatterns of objects in area of interest, and also reveal theunderlying statistical properties.The rest of this section tests the performance of the proposed EE, BERBERIDIS, AND GIANNAKIS: ADAPTIVE BAYESIAN RADIO TOMOGRAPHY 9
10 20 30 4010203040 − (a)
10 20 30 4010203040 − (b)
10 20 30 4010203040 − (c)
10 20 30 4010203040 01 (d)
10 20 30 4010203040 − (e)
10 20 30 4010203040 01 (f)
Fig. 5: Estimated SLFs ˆ F at τ = 15 (with measurements)via (a) ridge-regularized LS ( µ f = 10 − and C f = I , );(b) TV-regularized LS ( µ f = 10 − ); (c) Alg. 5 through (d)estimated hidden label field ˆ Z ; and (e) non-adaptive Bayesianalgorithm, through (f) estimated ˆ Z . · − Time ( τ ) E [ k z − ˆ z k ] / N g Non-adaptiveAdaptive
Fig. 6: Progression of error in estimation of z . − − − Time ( τ ) N M S E Non-adaptiveAdaptiveRidge-reg.TV-reg.
Fig. 7: Progression of channel-gain estimation error.TABLE III: Hyper-hyperparameters of θ for real data tests. β max m m σ k , ∀ k a ν b ν a k , ∀ k b k , ∀ k .
01 1 0 .
01 0 .
01 0 . algorithm in channel-gain cartography. To this end, the samesetting used to produce Figs. 5c and 5d was adopted. From theestimate ˆ f MMSE obtained through Alg. 5, an estimate of theshadowing attenuation ˆ s ( x , x ) between two arbitrary points x and x in A is obtained through (4) by replacing f with ˆ f MMSE . Subsequently, an estimate of the channel-gain ˆ g ( x , x ) is obtained after substituting ˆ s ( x , x ) into (1).Since g and γ are known, obtaining s ( x , x ) amountsto finding g ( x , x ) ; cf. (1). This suggests adopting a perfor-mance metric quantifying the mismatch between s ( x , x ) and ˆ s ( x , x ) , using the normalized mean-square errorNMSE := E (cid:2) R A (cid:0) s ( x , x ) − ˆ s ( x , x ) (cid:1) d x d x (cid:3) E (cid:2) R A s ( x , x ) d x d x (cid:3) where the expectation is over the set { x n } Nn =1 of sensorlocations and realizations of { ν τ } τ . Simulations estimatedthe expectations by averaging over independent MonteCarlo runs. The integrals are approximated by averaging theintegrand over pairs of ( x , x ) chosen independently anduniformly at random over the boundary of A .Fig. 7 compares the NMSE of the proposed method withthose of the competing alternatives using the settings in Fig. 5.Evidently, the proposed method achieves the lowest NMSE forevery τ . Observe that both Bayesian approaches outperformthe regularized LS methods, which suggests the proposedmethod as a viable alternative of a conventional solutionadopted for radio tomography and channel-gain cartography. B. Test with real data
This section validates the proposed method using the realdata set in [9]. The test setup is depicted in Fig. 8, where A = [0 . , . × [0 . , . is a square with sides of 20feet (ft), over which a grid { ˜ x i } , i =1 := { , . . . , } of N g = 1 , points is defined. A collection of N = 80 sensors Fig. 8: Configuration of the testbed with N = 80 sensorlocations marked with crosses.measure the channel attenuation at . GHz between pairsof sensor positions, marked with the N = 80 crosses in Fig. 8.To estimate g and γ using the approach in [9], a first set of , measurements was obtained before placing the artificialstructure in Fig. 8. Estimates ˆ g = 54 . (dB) and ˆ γ = 0 . were obtained during the calibration step. Afterwards, thestructure comprising one pillar and six walls of differentmaterials was assembled, and T = 2 , measurements { ˇ g τ } Tτ =1 were acquired. Then, the calibrated measurements { ˇ s τ } Tτ =1 were obtained from { ˇ g τ } Tτ =1 by substituting ˆ g and ˆ γ into (5). In addition, { w τ } Tτ =1 were constructed with w in (3) by using known locations of sensor pairs. Note that τ is introduced to distinguish indices of the real data from τ used to index time slots in numerical tests.We randomly selected , measurements from { ˇ s τ } Tτ =1 to initialize ˇ s (0) , and used the remaining , measurementsto run the proposed algorithm under the mini-batch operationfor τ = 0 , . . . , , where every W τ +1 was formed by uniformlyselecting |W τ +1 | = 200 weight vectors at random from { w τ } τ associated with the remaining , measurementswithout replacement. Parameters of the proposed algorithmwere set to, N CL = 2 , N Burn-in = 300 , N Iter = 1 , , σ q = 10 − , and the hyper-hyperparameters of θ used arelisted in Table III. For initialization, z (0) was found by drawing z (0) i ∼ Bern (0 . ∀ i . Vector θ (0) was set to have β (0) = 0 . and µ (0) f k = [ m , m ] > , while σ ν and σ f k were initialized atrandom.Following [1], [9], a spatial covariance matrix was usedfor C f of the ridge-regularized LS estimator, which mod-els the similarity between points ˜ x i and ˜ x j as (cid:2) C f (cid:3) ij = σ s exp[ −k ˜ x i − ˜ x j k /κ ] [1], with σ s = κ = 1 , and µ f =6 × − ; see also [27]. On the other hand, the TV-regularizedLS estimator was tested with µ f = 4 . used in [27].Fig. 9 displays estimated SLFs ˆ F and associated hiddenfields ˆ Z at τ = 5 obtained by the proposed method and itscompeting alternatives. The pattern of the artificial structure isclearly delineated on ˆ F in Fig. 9c estimated by the proposedmethod, while the regularized LS estimators are not able to [ f t ] − . − . − . . . (a) [ f t ] − . − . − . − . − . . . . (b) [ f t ] − . − . − − . − . − . − . . . . (c) [ f t ] (d) [ f t ] − − . − − . . (e) [ f t ] (f) [ f t ] − − . − − . . (g) [ f t ] (h) Fig. 9: Estimated SLFs ˆ F at τ = 5 (with , measurements)via (a) ridge-regularized LS; (b) TV-regularized LS; (c) Alg. 5through (d) estimated hidden label field ˆ Z ; and (e) non-adaptive Bayesian algorithm, through (f) estimated ˆ Z , togetherwith one-shot estimates (g) ˆ F full and (h) ˆ Z full obtained by usingthe full dataset (with , measurements) via Alg. 5.capture such pattern without post-processing of the estimatedSLFs in Figs. 9a and 9b. Although the non-adaptive Bayesianalgorithm reconstructed the visually satisfying SLF for radiotomography as shown in Fig. 9e, ˆ F from the proposed methoddepicts the artificial structure more clearly; see e.g., objectpatterns in Figs. 9c and 9e corresponding to the dry wallin Fig. 8. As a benchmark, an one-shot estimate of theSLF, denoted as ˆ F full , is also displayed in Fig. 9g, which EE, BERBERIDIS, AND GIANNAKIS: ADAPTIVE BAYESIAN RADIO TOMOGRAPHY 11 . . . . . . . Time ( τ ) E [ k ˆ z f u ll − ˆ z k ] / N g Non-adaptiveAdaptive
Fig. 10: Progression of a mismatch between ˆ z and ˆ z full .TABLE IV: Estimated ˆ θ via benchmark algorithm (setting ofFigs. 9g and 9h); Alg. 5 (setting of Figs. 9c and 9d); andnon-adaptive Bayesian algorithm (setting of Figs. 9e and 9f),averaged over 20 independent Monte Carlo runs. θ Est. (benchmark) Est. (Alg. 5) Est. (non-adaptive) β . ± × − . ± × − . ± × − σ ν . ± .
05 10 . ± .
20 9 . ± . µ f − . ± . − . ± . − . ± . µ f . ± .
03 0 . ± .
03 0 . ± . σ f . ± .
12 0 . ± .
13 0 . ± . σ f . ± .
10 0 . ± .
10 0 . ± . was obtained via Alg. 5 by using the entire set of , measurements. Comparison of ˆ F in Fig. 9c with ˆ F full showsthat the proposed algorithm enables one to reconstruct the SLFclose to the benchmark by using fewer, but more informativemeasurements.The second experiment investigated the efficacy of theproposed adaptive data acquisition method in estimating z .By considering ˆ Z full = unvec( ˆ z full ) in Fig. 9h as a benchmark,the labeling error k ˆ z full − ˆ z k /N g was used as a performancemetric. Fig. 10 compares the labeling error of the proposedmethod with that of the non-adaptive algorithm, which areaveraged over independent Monte Carlo runs. The proposedmethod exhibits lower labeling errors than the non-adaptiveone except when τ = 2 . This illustrates that the proposed dataacquisition criterion delineates object patterns more accuratelywhile also reducing the measurement collection cost.To corroborate the hyperparameter estimation capability ofthe proposed algorithm, the estimates of θ averaged over independent Monte Carlo runs were listed in Table IV. Theestimate ˆ θ obtained by using the full dataset was consideredas a benchmark, to demonstrate that the proposed methodestimates θ closer to the benchmark.The last simulation assesses the performance of the pro-posed algorithm and competing alternatives for channel-gaincartography. The same set of shadowing measurements andsimulation setup as in first simulations of this section wereused. A channel-gain map is constructed to portray the gain [ f t ] − − . − − . . . [dB] (a) [ f t ] [dB] (b) [ f t ] − − . − − . . . [dB] (c) [ f t ] [dB] (d) [ f t ] − − . − − . . . [dB] (e) [ f t ] [dB] (f) [ f t ] − − . − − . . . [dB] (g) [ f t ] [dB] (h) Fig. 11: Estimated shadowing maps ˆ S and correspondingchannel-gain maps ˆ G at τ = 5 via (a)-(b) ridge-regularizedLS (setting of Fig. 9a); (c)-(d) TV-regularized LS (settingof Fig. 9b); (e)-(f) Alg. 5 (setting of Fig. 9c); and (g)-(h)non-adaptive Bayesian algorithm (setting of Fig. 9e), with thereceiver location at x rx = (10 . , . (ft) marked with theblue cross.between any point in the map, and a fixed receiver location x rx .Particularly, the proposed algorithm is executed and estimates { ˆ s (˜ x i , x rx ) } N g i =1 are obtained by substituting ˆ f and w into (4).Subsequently, { ˆ g (˜ x i , x rx ) } N g i =1 are obtained by substituting { ˆ s (˜ x i , x rx ) } N g i =1 into (1) with ˆ g and ˆ γ . After defining ˆ g :=[ˆ g (˜ x , x rx ) , . . . , ˆ g (˜ x N g , x rx )] > , one can construct the channel- gain map ˆ G := unvec(ˆ g ) with the receiver located at x rx .Let ˆ S := unvec(ˆ s ) denote the shadowing map with ˆ s :=[ˆ s (˜ x , x rx ) , . . . , ˆ s (˜ x N g , x rx )] > . Fig. 11 displays the estimatedshadowing maps ˆ S and corresponding channel-gain maps ˆ G ,obtained via various methods, when the receiver is located at x rx = (10 . , . (ft) marked by the cross. In all channel-gainmaps in Fig. 11, stronger attenuation is observed when a signalpasses through either more building materials (bottom-rightside of ˆ G ), or the concrete wall (left side of ˆ G ). In contrast,only the channel-gain maps in Figs. 11f and 11h reconstructedby the Bayesian methods exhibit less attenuation along theentrance of the artificial objects (top-right side of ˆ G ), whilechannel-gain tends to drop quickly within the vicinity of thereceiver in the channel-gain maps obtained by the regularizedLS estimators, as shown in Figs. 11b and 11d. This stemsfrom the fact that free space and objects are more distinctivelydelineated in ˆ F by the Bayesian approaches. Note that slightlydifferent observations were made in Figs. 11f and 11h sincethe shadowing map in Fig. 11g introduces stronger attenuationin free space below the receiver, which would disagree withintuition. All in all, the simulation results confirm that ourapproach could provide more specific CSI of the propagationmedium, and thus endow the operation of cognitive radionetworks with more accurate interference management.V. C ONCLUSIONS
This paper developed a novel adaptive Bayesian radiotomographic algorithm that estimates the spatial loss field ofthe radio tomographic model, which is of interest in imagingand channel-gain cartography applications, by using measure-ments adaptively collected based on the uncertainty samplingcriterion. Different from conventional approaches, leveraginga hidden label field contributed to effectively account forinhomogeneities of the spatial loss field. The effectivenessof the novel algorithm was corroborated through extensivesynthetic and real data experiments. Future research willinclude an online approach to Bayesian radio tomography tofurther reduce computational complexity.A
PPENDIX
A. Distribution of the proportionality of p ( f | ˇ s t , z , θ ) Recalling that p (ˇ s t | f , σ ν ) ∼ N ( W > f , σ ν I t ) and p ( f | z , θ f ) ∼ N ( µ f | z , ∆ f | z ) , one can expand p ( f | ˇ s t , z , θ ) in (22) to arrive at (cf. (23)) p ( f | ˇ s t , z , θ ) ∝ p (ˇ s t | f , σ ν ) p ( f | z , θ f ) ∝ exp (cid:20) − σ ν k ˇ s t − W > f k − k f − µ f | z k ∆ − f | z (cid:21) ∝ exp (cid:20) − f > Σ − f | z, θ , ˇ s t f + (cid:18) σ ν ˇ s > t W > + µ > f | z ∆ − f | z (cid:19) f (cid:21) = exp (cid:20) − f > Σ − f | z, θ , ˇ s t f + ˇ µ > f | z, θ , ˇ s t Σ − f | z, θ , ˇ s t f (cid:21) ∝ exp (cid:20) − k f − ˇ µ f | z, θ , ˇ s t k Σ − f | z, θ , ˇ s t (cid:21) , (54)which shows that the proportionality of p ( f | ˇ s t , z , θ ) follows N ( ˇ µ f | z, θ , ˇ s t , Σ f | z, θ , ˇ s t ) . (cid:4) B. Derivation of (P1)
At time slot τ , we seek w τ +1 minimizing H τ +1 ( f | ˇ s τ +1 , z , θ ) in (50), which amounts to solving max w ∈W τ +1 X z ∈Z Z θ , ˇ s τ +1 p (ˇ s τ +1 , z , θ ) q ( z , θ , w ) d θ d ˇ s τ +1 . (55)Then, one can show that Z p (ˇ s τ +1 , z , θ ) d ˇ s τ +1 = Z ˇ s τ +1 Z f p (ˇ s τ +1 , f , z , θ ) d f d ˇ s τ +1( e = Z Z p (ˇ s τ +1 | f , z , θ ) p (ˇ s τ | f , z , θ ) p ( f , z , θ ) d f d ˇ s τ +1 = Z Z p ( f , z , θ | ˇ s τ ) p (ˇ s τ ) d f d ˇ s τ = Z p ( z , θ | ˇ s τ ) p (ˇ s τ ) d ˇ s τ (56)where ( e holds due to independence between ˇ s τ +1 and ˇ s τ after conditioning on { f , z , θ } . By substituting (56) into (55)and recalling that ˇ s τ is given at time slot τ , finding w τ +1 boilsdown to solving max w ∈W τ +1 X z ∈Z Z θ p ( z , θ | ˇ s τ ) q ( z , θ , w ) d θ , (57)which is (P1). (cid:4) R EFERENCES[1] P. Agrawal and N. Patwari, “Correlated link shadow fading in multi-hopwireless networks,”
IEEE Trans. Wireless Commun. , vol. 8, no. 9, pp.4024–4036, Aug. 2009.[2] H. Ayasso and A. Mohammad-Djafari, “Joint NDT image restorationand segmentation using Gauss-Markov-Potts models and variationalBayesian computation,”
IEEE Trans. Image Process. , vol. 19, no. 9,pp. 2265 – 2277, Sep. 2010.[3] T. M. Cover and J. A. Thomas,
Elements of Information Theory . NewYork, NY: USA:Wiley, 1991.[4] E. Dall’Anese, S.-J. Kim, and G. B. Giannakis, “Channel gain maptracking via distributed kriging,”
IEEE Trans. Veh. Technol. , vol. 60,no. 3, pp. 1205–1211, 2011.[5] FederaluCommunicationsuCommission, “FCC 11-131,”
Unlicensed Op-eration in the TV Broadcast Bands , 2011.[6] V. V. Fedorov,
Theory of Optimal Experiments . New York, NY:Academic Press, 1972.[7] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, andthe Bayesian restoration of images,”
IEEE Trans. Pattern Anal. Mach.Intell. , vol. PAMI-6, no. 6, pp. 721–741, Nov. 1984.[8] W. R. Gilks, S. Richardson, and D. J. Spiegelhalter,
Markov Chain MonteCarlo in Practice . London: Chapman and Hall, 1996.[9] B. R. Hamilton, X. Ma, R. J. Baxley, and S. M. Matechik, “Propagationmodeling for radio frequency tomography in wireless networks,”
IEEEJ. Sel. Topics Sig. Proc.
Biometrika , vol. 57, no. 1, pp. 97–109, 1970.[12] D. Higdon, “Spatial applications of Markov chain Monte Carlo forBayesian inference,” Ph.D. dissertation, Dept. Stat., Univ. Washington,Seattle, WA, 1994.[13] G. Kail, J.-Y. Tourneret, F. Hlawatsch, and N. Dobigeon, “Blind decon-volution of sparse pulse sequences under a minimum distance constraint:a partially collapsed Gibbs sampler method,”
IEEE Trans. Sig. Proc. ,vol. 60, no. 6, pp. 2727 – 2743, 2012.
EE, BERBERIDIS, AND GIANNAKIS: ADAPTIVE BAYESIAN RADIO TOMOGRAPHY 13 [14] O. Kaltiokallio, M. Bocca, and N. Patwari, “Enhancing the accuracy ofradio tomographic imaging using channel diversity,” in
Proc. of the 9thIEEE Int. Conf. on Mobile Ad hoc and Sensor Syst. , Las Vegas, NV,Oct. 2012, pp. 254–262.[15] S.-J. Kim, E. Dall’Anese, J. A. Bazerque, K. Rajawat, and G. B.Giannakis, “Advances in spectrum sensing and cross-layer design forcognitive radio networks,” in
Academic Press Library in Signal Pro-cessing: Communications and Radar Signal Processing: 2 . AcademicPress, 2013, ch. 9, pp. 471–497.[16] S.-J. Kim, E. Dall’Anese, and G. B. Giannakis, “Cooperative spectrumsensing for cognitive radios using kriged Kalman filtering,”
IEEE J. Sel.Topics Sig. Process. , vol. 5, no. 1, pp. 24–36, Feb. 2011.[17] M. Laghate and D. Cabric, “Cooperatively learning footprints of multipleincumbent transmitters by using cognitive radio networks,”
IEEE Trans.Cogn. Commun. Netw. , vol. 3, no. 3, pp. 282 – 297, Sep. 2017.[18] D. Lee, S.-J. Kim, and G. B. Giannakis, “Channel gain cartography forcognitive radios leveraging low rank and sparsity,”
IEEE Trans. WirelessCommun. , vol. 16, no. 9, pp. 5953 – 5966, Nov. 2017.[19] F. Liang, “A double Metropolis-Hastings sampler for spatial models withintractable normalizing constants,”
J. Stat. Comp. Simul. , vol. 80, no. 9,pp. 1007–1022, Sep. 2010.[20] D. MacKay, “Information-based objective functions for active dataselection,”
Neural Comput. , vol. 4, no. 4, pp. 590 – 604, 1992.[21] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, andE. Teller, “Equation of state calculations by fast computing machines,”
J. Chem. Phys. , vol. 21, no. 6, pp. 1087–1092, Jun. 1953.[22] J. Møller, A. N. Pettitt, R. Reeves, and K. K. Berthelsen, “An efficientMarkov chain Monte Carlo method for distributions with intractablenormalising constants,”
Biometrika , vol. 93, no. 2, pp. 451–458, 2006.[23] I. Murray, Z. Ghahramani, and D. J. C. MacKay, “MCMC for doubly-intractable distributions,” in
Proc. of the 22nd Annual Conf. on Uncer-tainty in Artificial Intell. , Cambridge, MA, Jul. 2006.[24] N. Patwari and P. Agrawal, “Effects of correlated shadowing: Connectiv-ity, localization, and RF tomography,” in
Int. Conf. Info. Process. SensorNetworks , St. Louis, MO, Apr. 2008, pp. 82–93.[25] M. Pereyra, N. Dobigeon, H. Batatia, and J.-Y. Tourneret, “Estimatingthe granularity coefficient of a Potts-Markov random field within aMarkov chain Monte Carlo algorithm,”
IEEE Trans. Image Process. ,vol. 2, no. 6, pp. 2385 – 2397, Jun. 2013.[26] Z. Qin, D. Goldfarb, and S. Ma, “An alternating direction method fortotal variation denoising,”
Optim. Method Softw. , vol. 30, no. 3, pp.594–615, 2015.[27] D. Romero, D. Lee, and G. B. Giannakis, “Blind radio tomography,”
IEEE Trans. Sig. Proc. , vol. 66, no. 8, pp. 2055 – 2069, Apr. 2018.[28] D. Smith and M. Smith, “Estimation of binary Markov random fieldsusing Markov chain Monte Carlo,”
J. Comput. Graph. Stats. , vol. 15,no. 1, pp. 207 – 227, 2006.[29] N. B. Smith and A. Webb,
Introduction to Medical Imaging: Physics,Engineering and Clinical Applications . Cambridge University Press,2010.[30] J. Wilson and N. Patwari, “Radio tomographic imaging with wirelessnetworks,”
IEEE Trans. Mobile Comput. , vol. 9, no. 5, pp. 621–632,2010.[31] ——, “See-through walls: Motion tracking using variance-based radiotomography networks,”
IEEE Trans. Mobile Comput. , vol. 10, no. 5, pp.612–621, 2011.[32] J. Wilson, N. Patwari, and O. G. Vasquez, “Regularization methods forradio tomographic imaging,” in
Virginia Tech Symp. Wireless PersonalCommun. , Blacksburg, VA, Jun. 2009.[33] K. Woyach, D. Puccinelli, and M. Haenggi, “Sensorless sensing inwireless networks: Implementation and measurements,” in
Int. Symp.Modeling Optimization Mobile, Ad Hoc Wireless Netw. , Apr. 2006, pp.1–8.[34] M. Youssef, M. Mah, and A. Agrawala, “Challenges: Device-free passivelocalization for wireless environments,” in
Proc. ACM MobiCom , Sep.2007, pp. 222–229.[35] N. Zhao, A. Basarab, D. Kouam´e, and J.-Y. Tourneret, “Joint seg-mentation and deconvolution of ultrasound images using a hierarchicalBayesian model based on generalized Gaussian priors,”
IEEE Trans.Image Process. , vol. 25, no. 8, pp. 3736 – 3750, 2016.[36] Q. Zhao and B. M. Sadler, “A survey of dynamic spectrum access,”