Distributed Maximum Likelihood Sensor Network Localization
Andrea Simonetto* and Geert Leus
Abstract—We propose a class of convex relaxations to solve the sensor network localization problem, based on a maximum likelihood (ML) formulation. This class, as well as the tightness of the relaxations, depends on the noise probability density function (PDF) of the collected measurements. We derive a computationally efficient edge-based version of this ML convex relaxation class and we design a distributed algorithm that enables the sensor nodes to solve these edge-based convex programs locally by communicating only with their close neighbors. This algorithm relies on the alternating direction method of multipliers (ADMM), it converges to the centralized solution, it can run asynchronously, and it is computation error-resilient. Finally, we compare our proposed distributed scheme with other available methods, both analytically and numerically, and we argue the added value of ADMM, especially for large-scale networks.
Index Terms—Distributed optimization, convex relaxations, sensor network localization, distributed algorithms, ADMM, distributed localization, sensor networks, maximum likelihood.
EDICS Category: SEN-DIST, SEN-COLB
I. INTRODUCTION
Nowadays, wireless sensor networks are developed to provide fast, cheap, reliable, and scalable hardware solutions to a large number of industrial applications, ranging from surveillance [1], [2] and tracking [3], [4] to exploration [5], [6], monitoring [7], [8], robotics [9], and other sensing tasks [10]. From the software perspective, an increasing effort is spent on designing distributed algorithms that can be embedded in these sensor networks, providing high reliability with limited computation and communication requirements for the sensor nodes. Estimating the location of the nodes based on pair-wise distance measurements is regarded as a key enabling technology in many of the aforementioned scenarios, where GPS is often not employable.

From a strictly mathematical standpoint, this sensor network localization problem can be formulated as determining the node positions in $\mathbb{R}^2$ or $\mathbb{R}^3$ ensuring their consistency with the given inter-sensor distance measurements and (in some cases) with the location of known anchors. As is well known, such a fixed-dimensional problem (often phrased as a polynomial optimization) is NP-hard in general. Consequently, there have been significant research efforts in developing algorithms and heuristics that can accurately and efficiently localize the nodes in a given dimension [11]–[13]. Besides heuristic geometric schemes, such as multi-lateration, typical methods encompass multi-dimensional scaling [14], [15], belief propagation techniques [16], and standard non-linear filtering [17].

A very powerful approach to the sensor network localization problem is to use convex relaxation techniques to massage the non-convex problem into a more tractable yet approximate formulation. First adopted in [18], this modus operandi has since been extensively developed in the literature (see for example [19] for a comprehensive survey in the field of signal processing). Semidefinite programming (SDP) relaxations for the localization problem have been proposed in [20]–[27]. Theoretical properties of these methods have been discussed in [28]–[30], while their efficient implementation has been presented in [31]–[35]. Further convex relaxations, namely second-order cone programming (SOCP) relaxations, have been proposed in [36] to alleviate the computational load of standard SDP relaxations, at the price of some performance degradation. Highly accurate but highly computationally demanding sum-of-squares (SOS) convex relaxations have instead been employed in [37].

Despite the richness of the convex relaxation literature, two main aspects have been overlooked. First of all, a comprehensive characterization of these convex relaxations based on the maximum likelihood (ML) formulation is missing.

(The authors are with the Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD Delft, The Netherlands. E-mails: {a.simonetto, g.j.t.leus}@tudelft.nl. *Corresponding author: Andrea Simonetto, phone: (+31)152782845, fax: (+31)152786190, e-mail: [email protected]. This research was supported in part by STW under the D2S2 project from the ASSYS program, project 10561.)
In [21], [25], [38], [39] ML-based relaxations are explored, but only for specific noise models (mainly Gaussian noise), without a proper understanding of how different noise models would affect performance. The second overlooked aspect regards the lack of distributed optimization algorithms to solve convex relaxation problems with certificates of convergence to the centralized optimizer, convergence rate, and proven robustness when applied to real sensor networks bounded by asynchronous communication and limited computation capabilities.

Contributions.
First, we generalize the current state-of-the-art convex relaxations by formulating the sensor network localization problem in a maximum likelihood framework and then relaxing it. This class of relaxations (which depends on the choice of the probability density function (PDF) of the noise) is represented by the convex program (6). We show that this program is a rank relaxation of the original non-convex ML estimation problem and, at least for two widely used cases (Gaussian noise and Gaussian quantized measurements), it is a rank-$D$ relaxation ($D$ being the dimension of the space where the sensor nodes live, Proposition 1). The relaxed convex program is then further massaged into the edge-based ML relaxation (12) to lessen the computation requirements and to facilitate the distribution of the problem among the nodes. Furthermore, we show numerically that the tightness of the relaxation (in particular, the property of being derived from a rank-$D$ relaxation or not) can affect the performance of the convex program (12) more than the correctness of the noise model.

As a second contribution, we demonstrate how the edge-based ML convex relaxation can be handled via the alternating direction method of multipliers (ADMM), which gives us a powerful leverage for the analysis of the resulting algorithm. The proposed algorithm, Algorithm 1, is distributed in nature: the sensor nodes are able to locate themselves and the neighboring nodes without knowledge of the whole network. This algorithm converges with a rate of $O(1/t)$ ($t$ being the number of iterations) to the solution of (12) (Theorem 1). Using Algorithm 1, each sensor node has a total communication cost to reach a certain average local accuracy of the solution that is independent of the network size (Proposition 2 and Corollary 1). The proposed algorithm is then proven to converge even when run asynchronously (Theorem 2) and when the nodes are affected by computation errors (Theorem 3). These features, along with guaranteed convergence, are very important in real-life sensor network applications. Finally, we compare the usage of Algorithm 1 with some other available possibilities, in particular the methods suggested in [40] and [41], both in terms of theoretical performance and simulation results. These analyses support our proposed distributed algorithm, especially for large-scale settings.

Organization.
The remainder of the paper is organized as follows. Section II details the problem formulation. Section III presents the proposed maximum likelihood convex relaxation (6) along with some examples. Section IV introduces the edge-based relaxation (12), which is the building block for our distributed algorithm. Section V briefly surveys distributed techniques to solve the localization problem, while, in Section VI, we focus on the development of our distributed algorithm and its analysis. Numerical simulations and comparisons are displayed in Section VII, while our conclusions are drawn in Section VIII.

II. PRELIMINARIES AND PROBLEM STATEMENT
We consider a network of $n$ static wireless sensor nodes with computation and communication capabilities, living in a $D$-dimensional space (typically $D$ will be the standard 2-dimensional or 3-dimensional Euclidean space). We denote the set of all nodes by $\mathcal{V} = \{1, \dots, n\}$. Let $x_i \in \mathbb{R}^D$ be the position vector of the $i$-th sensor node, or equivalently, let $X = [x_1, \dots, x_n] \in \mathbb{R}^{D \times n}$ be the matrix collecting the position vectors. We consider an environment with line-of-sight conditions between the nodes and we assume that some pairs of sensor nodes $(i,j)$ have access to noisy range measurements as
$$ r_{i,j} = d_{i,j} + \nu_{i,j}, \qquad (1) $$
where $d_{i,j} = \|x_i - x_j\|$ is the noise-free Euclidean distance and $\nu_{i,j}$ is an additive noise term with known probability distribution. We call $p_{i,j}(d_{i,j}(x_i, x_j)\,|\,r_{i,j})$ the inter-sensor sensing PDF, where we have indicated explicitly the dependence of $d_{i,j}$ on the sensor node positions $(x_i, x_j)$.

In addition, we consider that some sensors also have access to noisy range measurements with some fixed anchor nodes (whose position $a_k$, for $k \in \{1, \dots, m\}$, is known by all the neighboring sensor nodes of each $a_k$) as
$$ v_{i,k} = e_{i,k} + \mu_{i,k}, \qquad (2) $$
where $e_{i,k} = \|x_i - a_k\|$ is the noise-free Euclidean distance and $\mu_{i,k}$ is an additive noise term with known probability distribution. We denote by $p_{i,k,\mathrm{a}}(e_{i,k}(x_i, a_k)\,|\,v_{i,k})$ the anchor-sensor sensing PDF.

We use graph theory terminology to characterize the set of sensor nodes $\mathcal{V}$ and the measurements $r_{i,j}$ and $v_{i,k}$. In particular, we say that the measurements $r_{i,j}$ induce a graph with $\mathcal{V}$ as vertex set, i.e., for each sensor node pair $(i,j)$ for which there exists a measurement $r_{i,j}$, there exists an edge connecting $i$ and $j$. The set of all edges is $\mathcal{E}$ and its cardinality is $E$. We denote this undirected graph by $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. The neighbors of sensor node $i$ are the sensor nodes that are connected to $i$ with an edge. The set of these neighboring nodes is indicated by $\mathcal{N}_i$, that is, $\mathcal{N}_i = \{j \,|\, (i,j) \in \mathcal{E}\}$. Since the sensor nodes are assumed to have communication capabilities, we implicitly assume that each sensor node $i$ can communicate with all the sensors in $\mathcal{N}_i$, and with these only. In a similar fashion, we collect the anchors in the vertex set $\mathcal{V}_\mathrm{a} = \{1, \dots, m\}$ and we say that the measurements $v_{i,k}$ induce an edge set $\mathcal{E}_\mathrm{a}$, composed of the pairs $(i,k)$ for which there exists a measurement $v_{i,k}$. Also, we denote by $\mathcal{N}_{i,\mathrm{a}}$ the neighboring anchors for sensor node $i$, i.e., $\mathcal{N}_{i,\mathrm{a}} = \{k \,|\, (i,k) \in \mathcal{E}_\mathrm{a}\}$.
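To make the measurement models (1) and (2) concrete, the following Python sketch generates a random network and draws noisy ranges accordingly; the connectivity radius, noise level, and all other numbers are our illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, D = 20, 5, 2          # sensors, anchors, dimension (our choice)
radius, sigma = 0.5, 0.05   # connectivity radius and noise std (assumed)

X = rng.uniform(-0.5, 0.5, size=(D, n))   # true sensor positions
A = rng.uniform(-0.5, 0.5, size=(D, m))   # anchor positions

# Inter-sensor edges E and measurements r_ij = d_ij + nu_ij, model (1)
edges, r = [], {}
for i in range(n):
    for j in range(i + 1, n):
        d = np.linalg.norm(X[:, i] - X[:, j])
        if d <= radius:
            edges.append((i, j))
            r[(i, j)] = d + sigma * rng.standard_normal()

# Anchor-sensor edges E_a and measurements v_ik = e_ik + mu_ik, model (2)
edges_a, v = [], {}
for i in range(n):
    for k in range(m):
        e = np.linalg.norm(X[:, i] - A[:, k])
        if e <= radius:
            edges_a.append((i, k))
            v[(i, k)] = e + sigma * rng.standard_normal()
```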
Problem Statement. The sensor network localization problem is formulated as estimating the position matrix $X$ (in some cases, up to an orthogonal transformation) given the measurements $r_{i,j}$ and $v_{i,k}$ for all $(i,j) \in \mathcal{E}$ and $(i,k) \in \mathcal{E}_\mathrm{a}$, and the anchor positions $a_k$, $k \in \mathcal{V}_\mathrm{a}$. When $\mathcal{V}_\mathrm{a} = \emptyset$ we call the problem anchor-free localization. The sensor network localization problem can be written in terms of maximizing the likelihood, leading to the following optimization problem:
$$ X^*_{\mathrm{ML}} = \operatorname*{argmax}_{X \in \mathbb{R}^{D \times n}} \; \sum_{(i,j) \in \mathcal{E}} \ln p_{i,j}(d_{i,j}(x_i,x_j)\,|\,r_{i,j}) + \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \ln p_{i,k,\mathrm{a}}(e_{i,k}(x_i,a_k)\,|\,v_{i,k}). \qquad (3) $$
This optimization problem is in general non-convex and it is also NP-hard to find any global solution. In this paper, under the sole assumptions that:
Assumption 1: The sensing PDFs $p_{i,j}(d_{i,j}(x_i,x_j)\,|\,r_{i,j})$ and $p_{i,k,\mathrm{a}}(e_{i,k}(x_i,a_k)\,|\,v_{i,k})$ are log-concave functions of the unknown distances $d_{i,j}$ and $e_{i,k}$;

Assumption 2: The graph $\mathcal{G}$ induced by the inter-sensor range measurements is connected;

we will propose a convex relaxation to transform the ML estimator (3) into a more tractable problem, which we will then solve using ADMM in a distributed setting, where each of the sensor nodes, by communicating only with the neighboring nodes, will determine its own position.

III. CONVEX RELAXATIONS

A. Maximum Likelihood Relaxation
To derive the mentioned convex relaxation of the ML estimator (3), several steps are needed. First of all, we introduce the new variables $Y = X^T X$, $\delta_{i,j} = d_{i,j}^2$, $\epsilon_{i,k} = e_{i,k}^2$, and we collect the $d_{i,j}$, $e_{i,k}$, $\delta_{i,j}$, $\epsilon_{i,k}$ scalar variables into the stacked vectors $d$, $e$, $\delta$, $\epsilon$. Second, we rewrite the cost function of the ML estimator as dependent only on the pair $(d, e)$ as
$$ f(d, e) := -\Big( \sum_{(i,j) \in \mathcal{E}} \ln p_{i,j}(d_{i,j}\,|\,r_{i,j}) + \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \ln p_{i,k,\mathrm{a}}(e_{i,k}\,|\,v_{i,k}) \Big). \qquad (4) $$
Third, we re-introduce the dependencies of $(d, e)$ on $X$ and on $(\delta, \epsilon)$ by considering the following constrained optimization:
$$ \begin{aligned} &\underset{X, Y, \delta, \epsilon, d, e}{\text{minimize}} \quad f(d, e) &\quad& (5\mathrm{a})\\ &\text{subject to} \quad Y_{ii} + Y_{jj} - 2Y_{ij} = \delta_{i,j},\ \ \delta_{i,j} = d_{i,j}^2,\ \ d_{i,j} \ge 0, \quad \text{for all } (i,j) \in \mathcal{E} && (5\mathrm{b})\\ &\phantom{\text{subject to}} \quad Y_{ii} - 2x_i^T a_k + \|a_k\|^2 = \epsilon_{i,k},\ \ \epsilon_{i,k} = e_{i,k}^2,\ \ e_{i,k} \ge 0, \quad \text{for all } (i,k) \in \mathcal{E}_\mathrm{a} && (5\mathrm{c})\\ &\phantom{\text{subject to}} \quad Y = X^T X. && (5\mathrm{d}) \end{aligned} $$
The problem (5) is equivalent to (3): the constraints in the problem (5) have both the scope of imposing the pair-wise distance relations and of enforcing the chosen change of variables (in fact, without the constraints, all the variables would be independent of each other). In the new variables and under Assumption 1, $f(d, e)$ is a convex function; however, the constraints of (5) still define a non-convex set. Nonetheless, we can massage the constraints by using Schur complements and propose the following convex relaxation:
$$ \begin{aligned} &\underset{X, Y, \delta, \epsilon, d, e}{\text{minimize}} \quad f(d, e) &\quad& (6\mathrm{a})\\ &\text{subject to} \quad Y_{ii} + Y_{jj} - 2Y_{ij} = \delta_{i,j},\ \ \delta_{i,j} \ge 0,\ \ \begin{pmatrix} 1 & d_{i,j} \\ d_{i,j} & \delta_{i,j} \end{pmatrix} \succeq 0,\ \ d_{i,j} \ge 0, \quad \text{for all } (i,j) \in \mathcal{E} && (6\mathrm{b})\\ &\phantom{\text{subject to}} \quad Y_{ii} - 2x_i^T a_k + \|a_k\|^2 = \epsilon_{i,k},\ \ \epsilon_{i,k} \ge 0,\ \ \begin{pmatrix} 1 & e_{i,k} \\ e_{i,k} & \epsilon_{i,k} \end{pmatrix} \succeq 0,\ \ e_{i,k} \ge 0, \quad \text{for all } (i,k) \in \mathcal{E}_\mathrm{a} && (6\mathrm{c})\\ &\phantom{\text{subject to}} \quad \begin{pmatrix} I_D & X \\ X^T & Y \end{pmatrix} \succeq 0,\ \ Y \succeq 0. && (6\mathrm{d}) \end{aligned} $$
The problem (6) is now convex (specifically, it is a convex optimization problem with generalized inequality constraints [42]) and its optimal value represents a lower bound for the original non-convex ML estimator (3).

In the problem (6), all three constraints (6b) to (6d) are rank-relaxed versions of (5b) to (5d), which makes problem (6) a rank relaxation. Usually, convex relaxations for sensor network localization are formulated directly on the squared distance variables $(\delta, \epsilon)$ using a cost function $f_{\mathrm{sq}}(\delta, \epsilon)$ (not ML) and eliminating the variables $(d, e)$. This way of formulating the problem does not capture the noise distribution, but renders the resulting relaxation a rank-$D$ relaxation, since (6d) is the only relaxed constraint [21]. Problem (6) both models the noise distribution correctly, being derived from an ML formulation, and for some commonly used noise PDFs can be transformed into a rank-$D$ relaxation, in which case it is equivalent in tightness to relaxations based on squared distances alone.

In the next subsections, we specify the convex relaxation (6) for different noise distributions (satisfying Assumption 1) and prove that (6) can be expressed as a rank-$D$ relaxation for two particular yet widely used cases. In Section VII, while presenting simulation results, we discuss how this aspect can affect the quality of the position estimation. In particular, it appears that tighter relaxations may have a lower estimation error, even when they employ less accurate noise models.
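For clarity, the relaxation step can be spelled out explicitly (our added remark, following the standard Schur complement argument):
$$ \begin{pmatrix} 1 & d_{i,j} \\ d_{i,j} & \delta_{i,j} \end{pmatrix} \succeq 0 \;\Longleftrightarrow\; \delta_{i,j} \ge d_{i,j}^2, \qquad \begin{pmatrix} I_D & X \\ X^T & Y \end{pmatrix} \succeq 0 \;\Longleftrightarrow\; Y \succeq X^T X, $$
so each non-convex equality of (5) ($\delta_{i,j} = d_{i,j}^2$, $\epsilon_{i,k} = e_{i,k}^2$, $Y = X^T X$) is replaced by its convex one-sided version, which is exactly what makes (6) a relaxation.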
B. Example 1 – Gaussian Noise Relaxation

In the case of Gaussian noise, we assume that the noises $\nu_{i,j}$ and $\mu_{i,k}$ in the sensing equations (1) and (2) are drawn from white zero-mean PDFs, i.e., $\nu_{i,j} \sim \mathcal{N}(0, \sigma_{i,j}^2)$ and $\mu_{i,k} \sim \mathcal{N}(0, \sigma_{i,k,\mathrm{a}}^2)$. The cost function $f(d, e)$ then is
$$ f_{\mathrm{GN},2}(d, e) := \sum_{(i,j) \in \mathcal{E}} \sigma_{i,j}^{-2} \big( d_{i,j}^2 - 2 d_{i,j} r_{i,j} + r_{i,j}^2 \big) + \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \sigma_{i,k,\mathrm{a}}^{-2} \big( e_{i,k}^2 - 2 e_{i,k} v_{i,k} + v_{i,k}^2 \big). $$
A natural way to rewrite this cost is to enforce the change of variables $\delta_{i,j} = d_{i,j}^2$ and $\epsilon_{i,k} = e_{i,k}^2$, yielding
$$ f_{\mathrm{GN}}(\delta, \epsilon, d, e) := \sum_{(i,j) \in \mathcal{E}} \sigma_{i,j}^{-2} \big( \delta_{i,j} - 2 d_{i,j} r_{i,j} + r_{i,j}^2 \big) + \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \sigma_{i,k,\mathrm{a}}^{-2} \big( \epsilon_{i,k} - 2 e_{i,k} v_{i,k} + v_{i,k}^2 \big). $$
With the cost $f_{\mathrm{GN}}(\delta, \epsilon, d, e)$, the optimization problem reads
$$ \begin{aligned} &\underset{X, Y, \delta, \epsilon, d, e}{\text{minimize}} \quad f_{\mathrm{GN}}(\delta, \epsilon, d, e) &\quad& (7\mathrm{a})\\ &\text{subject to} \quad (6\mathrm{b}),\ (6\mathrm{c}),\ (6\mathrm{d}). && (7\mathrm{b}) \end{aligned} $$
This relaxation is not only convex but also a semidefinite program (SDP), i.e., it has a linear cost function and generalized linear constraints [42]; some of its constraints are linear matrix inequalities (LMIs). For the semidefinite program (7), the following proposition holds true.
Proposition 1: Under the assumption of Gaussian noise, the semidefinite program (7) is a rank-$D$ relaxation of the original non-convex optimization problem (3).
Proof: We need to show that at optimality the relaxed constraints (6b) and (6c) are equivalent to the original constraints (5b) and (5c). In other words, we need to show that any optimal solution of the semidefinite program (7), say $\delta^*, \epsilon^*, d^*, e^*, Y^*, X^*$, satisfies
$$ \delta^*_{i,j} = (d^*_{i,j})^2, \quad \text{and} \quad \epsilon^*_{i,k} = (e^*_{i,k})^2 $$
for all $(i,j) \in \mathcal{E}$ and for all $(i,k) \in \mathcal{E}_\mathrm{a}$. To see this, note that the LMIs in the constraints (6b) and (6c) can be rewritten as
$$ d_{i,j}^2 \le \delta_{i,j}, \quad \text{and} \quad e_{i,k}^2 \le \epsilon_{i,k}. \qquad (8) $$
The cost function (7a) maximizes the scalar variables $d_{i,j}$ and $e_{i,k}$ (their coefficients $-2\sigma^{-2} r$ are negative), and these variables are constrained only by (8). Therefore, at optimality, we will always have $(d^*_{i,j})^2 = \delta^*_{i,j}$ and $(e^*_{i,k})^2 = \epsilon^*_{i,k}$, and thus the claim holds.

(A similar formulation for this relaxation can be found in [21]. We note that problem (7) is not equivalent to (6) with cost function $f_{\mathrm{GN},2}(d, e)$, since for (6), $\delta_{i,j} \ge d_{i,j}^2$ and $\epsilon_{i,k} \ge e_{i,k}^2$.)
C. Example 2 – Quantized Observation Relaxation

An interesting, and realistic, elaboration of the ML estimator arises when, due to limited sensing capabilities, the sensors produce a quantized version of $r_{i,j}$ and $v_{i,k}$ (see the discussion in [43], [44] for its relevance in sensor networks). Consider an $S$-element convex tessellation of $\mathbb{R}_+$, comprised of the convex sets $\{\mathcal{Q}_s\}_{s=1}^S$. A quantization of $r_{i,j}$ and $v_{i,k}$ produces the observations $q_{r,i,j,s}$ and $q_{v,i,k,s}$, which are unitary if $r_{i,j} \in \mathcal{Q}_s$ and $v_{i,k} \in \mathcal{Q}_s$, respectively; otherwise $q_{r,i,j,s}$ and $q_{v,i,k,s}$ are zero. The resulting cost function for the convex relaxation (6) is now
$$ f_{\mathrm{Q}}(d, e) := -\sum_{(i,j) \in \mathcal{E}} \ln\!\left( \sum_{s=1}^S q_{r,i,j,s} \int_{r'_{i,j} \in \mathcal{Q}_s} p_{i,j}(d_{i,j}\,|\,r'_{i,j})\, \mathrm{d}r'_{i,j} \right) - \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \ln\!\left( \sum_{s=1}^S q_{v,i,k,s} \int_{v'_{i,k} \in \mathcal{Q}_s} p_{i,k,\mathrm{a}}(e_{i,k}\,|\,v'_{i,k})\, \mathrm{d}v'_{i,k} \right), $$
which is convex, since the integral of a log-concave function over a convex set is also log-concave. The resulting convex relaxation reads
$$ \begin{aligned} &\underset{X, Y, \delta, \epsilon, d, e}{\text{minimize}} \quad f_{\mathrm{Q}}(d, e) &\quad& (9\mathrm{a})\\ &\text{subject to} \quad (6\mathrm{b}),\ (6\mathrm{c}),\ (6\mathrm{d}), && (9\mathrm{b}) \end{aligned} $$
which is a rank relaxation of (3), but in general not a rank-$D$ relaxation. We can specify (9a) for Gaussian noise (using the same variable enforcing as in $f_{\mathrm{GN}}$) as
$$ f_{\mathrm{Q,GN}}(\delta, \epsilon, d, e) := -\sum_{(i,j) \in \mathcal{E}} \ln\!\left( \sum_{s=1}^S q_{r,i,j,s} \int_{r'_{i,j} \in \mathcal{Q}_s} \exp\!\Big[ -\tfrac{\sigma_{i,j}^{-2}}{2} \big( \delta_{i,j} - 2 d_{i,j} r'_{i,j} + r'^2_{i,j} \big) \Big] \mathrm{d}r'_{i,j} \right) - \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \ln\!\left( \sum_{s=1}^S q_{v,i,k,s} \int_{v'_{i,k} \in \mathcal{Q}_s} \exp\!\Big[ -\tfrac{\sigma_{i,k,\mathrm{a}}^{-2}}{2} \big( \epsilon_{i,k} - 2 e_{i,k} v'_{i,k} + v'^2_{i,k} \big) \Big] \mathrm{d}v'_{i,k} \right). $$
It is not difficult to show that the convex relaxation (9) equipped with the cost $f_{\mathrm{Q,GN}}(\delta, \epsilon, d, e)$ is now a rank-$D$ relaxation, by using similar arguments as in Proposition 1.
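As an illustration of $f_{\mathrm{Q}}$ for Gaussian noise, the quantized likelihood of one edge is just the Gaussian mass of the observed cell, which the sketch below evaluates with scipy; it is entirely our construction, and the bin edges and numbers are made up.

```python
import numpy as np
from scipy.stats import norm

# Quantizer: S convex cells Q_s = [edges_q[s], edges_q[s+1]) tessellating R_+
edges_q = np.array([0.0, 0.2, 0.4, 0.6, np.inf])

def neg_log_lik_quantized(d, s, sigma):
    """-ln P(r in Q_s | d) for r = d + N(0, sigma^2); convex in d
    since the Gaussian PDF is log-concave and Q_s is an interval."""
    lo, hi = edges_q[s], edges_q[s + 1]
    mass = norm.cdf(hi, loc=d, scale=sigma) - norm.cdf(lo, loc=d, scale=sigma)
    return -np.log(mass)

# Example: candidate distance 0.35, observed cell s = 1 (r fell in [0.2, 0.4))
print(neg_log_lik_quantized(0.35, 1, sigma=0.05))
```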
D. Example 3 – Laplacian Noise Relaxation

Laplacian noise is used, for example, to model outliers in range measurements [45] and to model errors coming from signal interference, e.g., in UWB localization systems [46]. In the Laplacian noise case the cost function can be specified as
$$ f_{\mathrm{L}}(d, e) := \sum_{(i,j) \in \mathcal{E}} \frac{|d_{i,j} - r_{i,j}|}{\sigma_{i,j}} + \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \frac{|e_{i,k} - v_{i,k}|}{\sigma_{i,k,\mathrm{a}}}, $$
and the ML convex relaxation reads
$$ \begin{aligned} &\underset{X, Y, \delta, \epsilon, d, e}{\text{minimize}} \quad f_{\mathrm{L}}(d, e) &\quad& (10\mathrm{a})\\ &\text{subject to} \quad (6\mathrm{b}),\ (6\mathrm{c}),\ (6\mathrm{d}). && (10\mathrm{b}) \end{aligned} $$
This ML convex relaxation is neither a rank-$D$ relaxation, nor can it be transformed into one by some variable enforcing in the cost function, yet it correctly models Laplacian PDFs.
E. Example 4 – Uniform Noise Relaxation

Uniform noise distributions are used when the source of error is not known a priori and only a bound on the noise level is available. For example, this is the case when we are aware of a lower bound on the pair-wise distances and of an upper bound dictated by connectivity [47], [48]. Considering uniform noise PDFs in the ranges $d_{i,j} \pm \sigma_{i,j}$ and $e_{i,k} \pm \sigma_{i,k,\mathrm{a}}$, the convex relaxation (6) becomes the following feasibility problem:
$$ \begin{aligned} &\text{find} \quad X, Y, \delta, \epsilon, d, e &\quad& (11\mathrm{a})\\ &\text{such that} \quad (6\mathrm{b}),\ (6\mathrm{c}),\ (6\mathrm{d}) && (11\mathrm{b})\\ &\phantom{\text{such that}} \quad r_{i,j} - \sigma_{i,j} \le d_{i,j} \le r_{i,j} + \sigma_{i,j}, \quad \text{for all } (i,j) \in \mathcal{E} && (11\mathrm{c})\\ &\phantom{\text{such that}} \quad v_{i,k} - \sigma_{i,k,\mathrm{a}} \le e_{i,k} \le v_{i,k} + \sigma_{i,k,\mathrm{a}}, \quad \text{for all } (i,k) \in \mathcal{E}_\mathrm{a}. && (11\mathrm{d}) \end{aligned} $$
Also in this case, the ML convex relaxation is neither a rank-$D$ relaxation, nor can it be transformed into one by some variable enforcing in the cost function, yet it correctly models uniform noise distributions.

IV. EDGE-BASED CONVEX RELAXATIONS
The convex relaxations derived from (6) couple arbitrarily far away sensor nodes through the LMI constraint (6d). This complicates the design of a distributed optimization algorithm. In addition, due to (6d), the complexity of solving the semidefinite program (6) scales at least as $O(n^3)$, i.e., it is at least cubic in the number of sensor nodes [42], and it could become unfeasible for large-scale networks. In order to massage this coupling constraint, we introduce a further relaxation of (6), which will be called the edge-based ML (E-ML) relaxation. We consider the following relaxation of (6):
$$ \begin{aligned} &\underset{X, Y, \delta, \epsilon, d, e}{\text{minimize}} \quad f(d, e) &\quad& (12\mathrm{a})\\ &\text{subject to} \quad (6\mathrm{b}),\ (6\mathrm{c}) && (12\mathrm{b})\\ &\phantom{\text{subject to}} \quad \begin{pmatrix} I_D & x_i & x_j \\ x_i^T & Y_{ii} & Y_{ij} \\ x_j^T & Y_{ij} & Y_{jj} \end{pmatrix} \succeq 0, \quad \text{for all } (i,j) \in \mathcal{E}. && (12\mathrm{c}) \end{aligned} $$
This relaxation employs the same idea as the edge-based semidefinite program (ESDP) relaxation of [24], [25] of considering the coupling constraint (6d) to be valid on the edges only. Since the constraint (6d) implies (12c) but not the contrary, the relaxation (12) is not a rank-$D$ relaxation. However, it is straightforward to see that, if the original convex relaxation (6) was a rank-$D$ relaxation, then for the derived (12) it would still be true that $\delta^*_{i,j} = (d^*_{i,j})^2$, $\epsilon^*_{i,k} = (e^*_{i,k})^2$. For example, this is the case for Gaussian noise, and we show how this can play an important role for the accuracy in Section VII. The convex relaxation (12) is now ready to be distributed among the sensor nodes.
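For concreteness, here is a minimal CVXPY sketch of (12) with the Gaussian cost $f_{\mathrm{GN}}$ (inter-sensor terms only; anchor handling, and all names, are our illustrative assumptions, not a reference implementation by the authors):

```python
import cvxpy as cp
import numpy as np

def edge_ml_gaussian(n, D, edges, r, sigma):
    """Edge-based ML relaxation (12) with cost f_GN, anchor-free.
    edges: list of (i, j); r, sigma: dicts keyed by edge (i, j)."""
    X = cp.Variable((D, n))
    Y = cp.Variable((n, n), symmetric=True)
    cost, cons = 0, []
    for (i, j) in edges:
        delta = cp.Variable()          # relaxed squared distance delta_ij
        d = cp.Variable(nonneg=True)   # relaxed distance d_ij
        # f_GN term: sigma^{-2} (delta_ij - 2 d_ij r_ij + r_ij^2)
        cost += (delta - 2 * r[i, j] * d + r[i, j] ** 2) / sigma[i, j] ** 2
        # (6b): tie delta_ij to Y, plus the Schur relaxation delta >= d^2
        cons += [Y[i, i] + Y[j, j] - 2 * Y[i, j] == delta,
                 delta >= 0, cp.square(d) <= delta]
        # (12c): per-edge (D+2)x(D+2) LMI, built as a PSD-structured variable
        M = cp.Variable((D + 2, D + 2), PSD=True)
        cons += [M[:D, :D] == np.eye(D),
                 M[:D, D] == X[:, i], M[:D, D + 1] == X[:, j],
                 M[D, D] == Y[i, i], M[D, D + 1] == Y[i, j],
                 M[D + 1, D + 1] == Y[j, j]]
    prob = cp.Problem(cp.Minimize(cost), cons)
    prob.solve(solver=cp.SCS)
    return X.value
```

Being anchor-free, the recovered $X$ is defined only up to a rigid transformation; anchor terms would enter through (6c) analogously.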
V. DISTRIBUTED ALGORITHMS FOR SENSOR NETWORK LOCALIZATION

Different distributed methods for sensor network localization have been proposed in recent years. A first group consists of heuristic algorithms, which are typically based on the paradigm of dividing the nodes into arbitrarily selected clusters, solving the localization problem within every cluster, and then patching together the different solutions. Methods that belong to this group are [49]–[51], while heuristic approaches to SDP relaxations are discussed in [47]. Among the disadvantages of the heuristic approaches is that we introduce arbitrariness into the problem and we typically lose all the performance guarantees of the “father” centralized approach. Furthermore, very often these heuristic methods are ad hoc and problem-dependent, which makes their theoretical characterization difficult (in contrast with the usage of well-established decomposition methods [52]).

The second group of methods employs decomposition techniques to guarantee that the distributed scheme converges to the centralized formulation asymptotically. In this group, under the Gaussian noise assumption, we can find methods that tackle directly the non-convex optimization problem (3) with parallel gradient-descent iterative schemes [53], [54] or (very recently) a work that uses a majorization-minimization technique to massage (3) sequentially and then employs the alternating direction method of multipliers (ADMM) to distribute the computations among the sensor nodes [55]. These approaches have certificates of convergence to a local minimum of the original non-convex problem (which may not be sufficient for a reasonable localization; hence the need for a good starting condition, which can be provided by convex relaxations, see [33] for some interesting numerical examples). Other methods encompass algorithms that tackle multi-dimensional scaling with a communication-intensive distributed spectral decomposition [56], and algorithms that tackle instead the convex SOCP/SDP relaxations [40], [41], [57]. In particular, [57] proposes a parallel distributed version of an SOCP relaxation (similar to the ESDP in [24]), whose convergence properties are however not analyzed (as a matter of fact, the proposed Jacobi-like algorithm is very hard to prove convergent to the centralized solution, since the constraints are coupled and not Cartesian, see [52] for a detailed discussion). In [40], the authors propose a further improvement of [57] based on the Gauss-Seidel algorithm, which is sequential in nature (meaning that sensors have to wait for each other before running their own local algorithm) and offers convergence guarantees to the ESDP of [24]. However, due to the sequential nature, the convergence rate depends on the number of sensor nodes, which makes the approach impractical for large-scale networks. Finally, in [41] duality is exploited to design inexact primal-dual iterative algorithms based on the convex relaxation of [22], [23], [33]. This last approach has the advantage of being parallel and not sequential; nonetheless, it is based on consensus algorithms whose convergence rate is also dependent on the size of the network, and thus it is less practical for a large number of sensor nodes.

In the next section, we propose a distributed algorithm based on ADMM to solve the edge-based convex relaxation (12). The algorithm is proven to converge to the centralized optimizer as $O(1/t)$, where $t$ is the number of iterations.
Furthermore, the computation and communication per iteration and per node do not depend on the size of the network, but only on the size of each node's neighborhood. Finally, we prove that the algorithm converges also in the case of asynchronous communication protocols and computation errors, making it robust to these two common issues in sensor networks.

VI. PROPOSED DISTRIBUTED APPROACH
A. Preliminaries and Background on ADMM
In order to present our distributed algorithm, first of all, we rewrite the convex program (12) in a more compact way. Define the shared vector
$$ z_{i,j} := (Y_{ii}, Y_{jj}, Y_{ij}, \delta_{i,j}, d_{i,j}, x_i^T, x_j^T)^T \in \mathbb{R}^{2D+5}, $$
for each $(i,j) \in \mathcal{E}$, and call $z$ the stacked vector comprised of all the $z_{i,j}$'s. In a similar fashion, define the local vector
$$ p_i := (\epsilon_i^T, e_i^T, Y_{ii}, x_i^T)^T \in \mathbb{R}^{2|\mathcal{N}_{i,\mathrm{a}}| + D + 1}, $$
where $\epsilon_i$ and $e_i$ are the concatenated vectors of $\epsilon_{i,k}$ and $e_{i,k}$ for all $k \in \mathcal{N}_{i,\mathrm{a}}$, and call $p$ the stacked vector of all the $p_i$'s. We note that $p_i$ and $z_{i,j}$ are not independent, but this will not be an issue. Moreover, define the convex sets
$$ \mathcal{Z}_{i,j} := \{ z_{i,j} \,|\, z_{i,j} \text{ verifies (6b) and (12c)} \}, \qquad \mathcal{P}_i := \{ p_i \,|\, p_i \text{ verifies (6c)} \}. $$
Problem (12) is then equivalent to
$$ \begin{aligned} &\underset{z, p}{\text{minimize}} \quad f(z, p) &\quad& (13\mathrm{a})\\ &\text{subject to} \quad z_{i,j} \in \mathcal{Z}_{i,j} \quad \text{for all } (i,j) \in \mathcal{E} && (13\mathrm{b})\\ &\phantom{\text{subject to}} \quad p_i \in \mathcal{P}_i \quad \text{for all } i \in \mathcal{V}, && (13\mathrm{c}) \end{aligned} $$
where, for the general case,
$$ f(z, p) := -\Big( \sum_{(i,j) \in \mathcal{E}} \ln p_{i,j}(d_{i,j}\,|\,r_{i,j}) + \sum_{(i,k) \in \mathcal{E}_\mathrm{a}} \ln p_{i,k,\mathrm{a}}(e_{i,k}\,|\,v_{i,k}) \Big) =: \sum_{i \in \mathcal{V}} f_i(z_i, p_i). \qquad (14) $$
From the structure of the cost (14) and the problem (13) one can already see that the convex optimization (13) is separable and has the $z_{i,j}$ as complicating variables. One possible way to handle this type of optimization problem in a distributed way is to employ the alternating direction method of multipliers (ADMM). The reader is referred to [58] for a very recent survey of this rather old technique, to the papers [43], [59] which span possible applications of the method in signal processing, and to the mentioned recent work [55] that employs ADMM for a localization problem (albeit with a different flavor than the one presented here, and applied to a different Gaussian noise-based approximated version of the original non-convex problem). In a nutshell, the strategy of ADMM is to assign copies of the coupling variable $z_{i,j}$ to both node $i$ and node $j$ and then constrain these copies to be equal. The strength of ADMM, and the main reason for its employment in this paper, resides in its noise-resilience and computation error-resilience, as well as the very loose assumptions required to guarantee its convergence (in contrast with typical dual or primal-dual decomposition schemes).

In order to apply ADMM to the problem (13), we define the local versions of the vector $z_{i,j}$ as $y^i_{i,j}$ and $y^j_{i,j}$, meaning that $y^i_{i,j}$ represents the vector $z_{i,j}$ as seen by node $i$, while $y^j_{i,j}$ represents the vector $z_{i,j}$ as seen by node $j$. Call now the stacked vector $y^i_i$ the one comprised of the $y^i_{i,j}$ for all $j \in \mathcal{N}_i$. We can then rewrite (13) in yet another equivalent form as
$$ \begin{aligned} &\underset{y^1_1, \dots, y^n_n, p, z}{\text{minimize}} \quad \sum_{i \in \mathcal{V}} f_i(y^i_i, p_i) &\quad& (15\mathrm{a})\\ &\text{subject to} \quad y^i_{i,j} \in \mathcal{Z}_{i,j},\ \ y^j_{i,j} \in \mathcal{Z}_{i,j} \quad \text{for all } (i,j) \in \mathcal{E} && (15\mathrm{b})\\ &\phantom{\text{subject to}} \quad p_i \in \mathcal{P}_i \quad \text{for all } i \in \mathcal{V} && (15\mathrm{c})\\ &\phantom{\text{subject to}} \quad y^i_{i,j} - z_{i,j} = 0,\ \ y^j_{i,j} - z_{i,j} = 0 \quad \text{for all } (i,j) \in \mathcal{E}. && (15\mathrm{d}) \end{aligned} $$
Problems (12), (13), and (15) are all equivalent, but problem (15) is better suited for ADMM, as we are about to see.
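As a small illustration of the bookkeeping (our own sketch; the ordering inside $z_{i,j}$ follows its definition above):

```python
import numpy as np

def z_edge(Y, X, delta, d, i, j):
    """Shared vector z_ij = (Y_ii, Y_jj, Y_ij, delta_ij, d_ij, x_i, x_j),
    of length 2D + 5, for the edge (i, j)."""
    return np.concatenate(([Y[i, i], Y[j, j], Y[i, j], delta[i, j], d[i, j]],
                           X[:, i], X[:, j]))
```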
Remark 1: We remark that the sequential greedy optimization (SGO) method of [40] can also be applied to (13). However, its analytical properties, such as convergence rate, noise-resilience, and computation error-resilience, are still unknown at the moment; furthermore, its convergence has been proven only under the strong assumption of decoupled constraints (and argued for the real case of sparse coupling constraints, see [40], Remark 4). Nonetheless, we will implement a distributed algorithm using SGO applied to our E-ML formulation to compare its performance analytically and numerically to ADMM. We will argue that SGO, given its sequential nature, is less suitable for large-scale networks.
B. Proposed Algorithm
The first step to derive the ADMM algorithm is, given a scalar $\rho > 0$, defining the regularized Lagrangian of problem (15) as
$$ \mathcal{L}(y, p, z, \lambda) := \sum_{i \in \mathcal{V}} f_i(y^i_i, p_i) + \sum_{(i,j) \in \mathcal{E}} \Big[ \lambda^{iT}_{i,j} (y^i_{i,j} - z_{i,j}) + \lambda^{jT}_{i,j} (y^j_{i,j} - z_{i,j}) \Big] + \sum_{(i,j) \in \mathcal{E}} \frac{\rho}{2} \Big( \big\| y^i_{i,j} - z_{i,j} \big\|^2 + \big\| y^j_{i,j} - z_{i,j} \big\|^2 \Big), \qquad (16) $$
where $y$ is the shorthand notation for the vector $(y^{1T}_1, \dots, y^{nT}_n)^T$, while $\lambda$ is the shorthand notation for the vector of multipliers; to each couple of equality constraints (15d) we assign the multipliers $\lambda^i_{i,j}$ and $\lambda^j_{i,j}$. Solving (15) with ADMM means implementing the following recursion: initialize the variables $y^{(0)}, p^{(0)}, z^{(0)}, \lambda^{(0)}$, then
$$ \begin{aligned} (y^{(t+1)}, p^{(t+1)}) &= \operatorname*{argmin}_{y \in \mathcal{Z},\, p \in \mathcal{P}} \{ \mathcal{L}(y, p, z^{(t)}, \lambda^{(t)}) \}, &\quad& (17\mathrm{a})\\ z^{(t+1)} &= \operatorname*{argmin}_{z} \{ \mathcal{L}(y^{(t+1)}, p^{(t+1)}, z, \lambda^{(t)}) \}, && (17\mathrm{b})\\ \lambda^{(t+1)} &= \lambda^{(t)} + \rho \nabla_\lambda \big[ \mathcal{L}(y^{(t+1)}, p^{(t+1)}, z^{(t+1)}, \lambda) \big]_{\lambda = \lambda^{(t)}}, && (17\mathrm{c}) \end{aligned} $$
for all $t \ge 0$, and with the convex sets $\mathcal{Z}$ and $\mathcal{P}$ defined as the union of the sets $\mathcal{Z}_{i,j}$ for all $(i,j) \in \mathcal{E}$ and $\mathcal{P}_i$ for all $i \in \mathcal{V}$, respectively.

In our sensor network application, the recursion (17) is distributed in nature, since it can be carried out as follows.

1) Set $y^{i(0)}_{i,j}, p^{(0)}_i, z^{(0)}_{i,j}, \lambda^{i(0)}_{i,j}, \lambda^{j(0)}_{i,j}$ to zero, for all the nodes.
2) At each iteration $t$, each node owns the variables $y^{i(t)}_{i,j}, p^{(t)}_i, z^{(t)}_{i,j}, \lambda^{i(t)}_{i,j}, \lambda^{j(t)}_{i,j}$ for all $j \in \mathcal{N}_i$.
3) Each node updates its local variables $y^{i(t)}_{i,j}, p^{(t)}_i$ as
$$ (y^{i(t+1)}_i, p^{(t+1)}_i) = \operatorname*{argmin}_{y^i_{i,j} \in \mathcal{Z}_{i,j},\, p_i \in \mathcal{P}_i} \Big\{ f_i(y^i_i, p_i) + \sum_{j \in \mathcal{N}_i} \Big[ \lambda^{i(t)T}_{i,j} y^i_{i,j} + \frac{\rho}{2} \big\| y^i_{i,j} - z^{(t)}_{i,j} \big\|^2 \Big] \Big\}. \qquad (18\mathrm{a}) $$
4) Each node sends its local vector $y^{i(t+1)}_{i,j}$ to its neighbor $j$, for all $j \in \mathcal{N}_i$.
5) Each node computes, for all $j \in \mathcal{N}_i$,
$$ z^{(t+1)}_{i,j} = \operatorname*{argmin}_{z_{i,j}} \Big\{ -\big( \lambda^{i(t)T}_{i,j} + \lambda^{j(t)T}_{i,j} \big) z_{i,j} + \frac{\rho}{2} \Big( \big\| y^{i(t+1)}_{i,j} - z_{i,j} \big\|^2 + \big\| y^{j(t+1)}_{i,j} - z_{i,j} \big\|^2 \Big) \Big\}. \qquad (18\mathrm{b}) $$
We note here that, since all the vectors $y^{i(t+1)}_{i,j}$ are transmitted perfectly, the value of $z^{(t+1)}_{i,j}$ computed by node $i$ is the same as the one computed by node $j$.
6) Each node computes, for all $j \in \mathcal{N}_i$,
$$ \lambda^{i(t+1)}_{i,j} = \lambda^{i(t)}_{i,j} + \rho \big( y^{i(t+1)}_{i,j} - z^{(t+1)}_{i,j} \big), \qquad \lambda^{j(t+1)}_{i,j} = \lambda^{j(t)}_{i,j} + \rho \big( y^{j(t+1)}_{i,j} - z^{(t+1)}_{i,j} \big). \qquad (18\mathrm{c}) $$
We note that here also the values of $\lambda^{i(t+1)}_{i,j}$ and $\lambda^{j(t+1)}_{i,j}$ computed by node $i$ are the same as the ones computed by node $j$.
7) Set $t \leftarrow t + 1$ and go to 2).

We note that both the optimization problems (18a) and (18b) are convex programs. Problem (18a) is an SDP (which can be solved using standard convex optimization toolboxes, such as Yalmip or CVX). In order to see this more clearly, program (18a) needs to be written in the equivalent form
$$ \begin{aligned} &\underset{y^i_{i,j} \in \mathcal{Z}_{i,j},\, p_i \in \mathcal{P}_i,\, \gamma}{\text{minimize}} \quad f_i(y^i_i, p_i) + \sum_{j \in \mathcal{N}_i} \Big[ \lambda^{i(t)T}_{i,j} y^i_{i,j} + \frac{\rho}{2} \gamma_{i,j} \Big] &\quad& (19\mathrm{a})\\ &\text{subject to} \quad \begin{pmatrix} \gamma_{i,j} & (y^i_{i,j} - z^{(t)}_{i,j})^T \\ y^i_{i,j} - z^{(t)}_{i,j} & I \end{pmatrix} \succeq 0, && (19\mathrm{b})\\ &\phantom{\text{subject to}} \quad \gamma_{i,j} \ge 0 \quad \text{for all } j \in \mathcal{N}_i, && (19\mathrm{c}) \end{aligned} $$
where each $\gamma_{i,j}$, and the vector $\gamma$ containing them, are slack variables used to impose the quadratic penalty ($\gamma_{i,j} \ge \|y^i_{i,j} - z^{(t)}_{i,j}\|^2$ by the Schur complement).

Problem (18b) is an unconstrained quadratic program in $z_{i,j}$, whose solution is
$$ z^{(t+1)}_{i,j} = \frac{1}{2} \Big[ y^{i(t+1)}_{i,j} + y^{j(t+1)}_{i,j} + \frac{1}{\rho} \big( \lambda^{i(t)}_{i,j} + \lambda^{j(t)}_{i,j} \big) \Big]. \qquad (20) $$
We can now simplify the relations (20) and (18c). By using the relations (18c), we can write (20) as
$$ z^{(t+1)}_{i,j} = \frac{1}{2} \Big[ y^{i(t+1)}_{i,j} + y^{j(t+1)}_{i,j} \Big] + \frac{1}{2\rho} \Big( \lambda^{i(t-1)}_{i,j} + \lambda^{j(t-1)}_{i,j} + \rho \big( y^{i(t)}_{i,j} + y^{j(t)}_{i,j} - 2 z^{(t)}_{i,j} \big) \Big), $$
and, using again (20) for $z^{(t)}_{i,j}$, we obtain
$$ z^{(t+1)}_{i,j} = \frac{1}{2} \Big[ y^{i(t+1)}_{i,j} + y^{j(t+1)}_{i,j} \Big]. \qquad (21) $$
Furthermore, the following relations hold as by-products of (21) for all $t \ge 0$ (recall that we have set $\lambda^{i(0)}_{i,j} = 0$ and $\lambda^{j(0)}_{i,j} = 0$, and apply relations (18c) and (21) recursively):
$$ \lambda^{i(t+1)}_{i,j} + \lambda^{j(t+1)}_{i,j} = 0, \qquad \lambda^{i(t+1)}_{i,j} = \sum_{\kappa=1}^{t+1} \frac{\rho}{2} \Big[ y^{i(\kappa)}_{i,j} - y^{j(\kappa)}_{i,j} \Big]. $$
This simplifies the ADMM algorithm defined by the iterations (18) (as summarized in Algorithm 1); in particular, the computation of $\lambda^{j(t+1)}_{i,j}$ is no longer required (as it is not needed any more for the computation of $z_{i,j}$).
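To fix ideas, here is a self-contained toy version of steps 3)–6) with the simplification (21); it is entirely our illustrative construction, in which the per-node SDP (18a) is replaced by a scalar quadratic $f_i(y) = (y - b_i)^2/2$ so that every update has a closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3)]
b = rng.standard_normal(4)   # private data of each node (stand-in for f_i)
rho = 0.5

# Local copies y[i][(i,j)], shared z[(i,j)], multipliers lam[i][(i,j)]
y = {i: {e: 0.0 for e in edges if i in e} for i in range(4)}
lam = {i: {e: 0.0 for e in edges if i in e} for i in range(4)}
z = {e: 0.0 for e in edges}

for t in range(200):
    # Step 3): minimize (y - b_i)^2/2 + lam*y + (rho/2)(y - z)^2 in closed form
    for i in range(4):
        for e in y[i]:
            y[i][e] = (b[i] - lam[i][e] + rho * z[e]) / (1.0 + rho)
    # Steps 4)-5): exchange copies and average them, Equation (21)
    for (i, j) in edges:
        z[(i, j)] = 0.5 * (y[i][(i, j)] + y[j][(i, j)])
    # Step 6): multiplier update (18c)
    for i in range(4):
        for e in y[i]:
            lam[i][e] += rho * (y[i][e] - z[e])

# Each z_ij converges to the consensus value (b_i + b_j) / 2
print({e: round(z[e], 3) for e in edges})
print({e: round((b[e[0]] + b[e[1]]) / 2, 3) for e in edges})
```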
C. Properties of Algorithm 1 (Ideal)

We now analyze the analytical properties of Algorithm 1 in terms of convergence and convergence rate of its solution to the optimal solution of the centralized problem (12). As a by-product, we also characterize the number of iterations required to reach a given accuracy and the total communication cost. Let $q$ denote the stacked vector of the optimization variables, i.e., $q := (y^T, p^T, z^T)^T$, and let $\bar{q}_t$ represent the running average, i.e.,
$$ \bar{q}_t = \frac{1}{t+1} \sum_{\kappa=0}^{t} q^{(\kappa)}, \qquad (22) $$
with $q^{(\kappa)} = (y^{(\kappa)T}, p^{(\kappa)T}, z^{(\kappa)T})^T$. Assume that the initial convex problem (15) admits a solution and let $(q^*, \lambda^*)$ be this solution. Then the following convergence theorem holds.

Algorithm 1: Distributed ADMM Algorithm for Problem (15)
  Set $y^{i(0)}_{i,j}, p^{(0)}_i, z^{(0)}_{i,j}, \lambda^{i(0)}_{i,j}$ to zero, for all the nodes.
  Input: $y^{i(t)}_{i,j}, p^{(t)}_i, z^{(t)}_{i,j}, \lambda^{i(t)}_{i,j}$, for all $j \in \mathcal{N}_i$.
  Each node updates its local variables $y^{i(t)}_{i,j}, p^{(t)}_i$ to $y^{i(t+1)}_{i,j}, p^{(t+1)}_i$ by the convex program (19), up to a defined accuracy $\varepsilon$.
  Each node sends its local vector $y^{i(t+1)}_{i,j}$ to its neighbor $j$, for all $j \in \mathcal{N}_i$.
  Each node computes, for all $j \in \mathcal{N}_i$, its $z^{(t+1)}_{i,j}$ via Equation (21).
  Each node computes, for all $j \in \mathcal{N}_i$, $\lambda^{i(t+1)}_{i,j} = \lambda^{i(t)}_{i,j} + \rho ( y^{i(t+1)}_{i,j} - z^{(t+1)}_{i,j} )$.
  Output: $y^{i(t+1)}_{i,j}, p^{(t+1)}_i, z^{(t+1)}_{i,j}, \lambda^{i(t+1)}_{i,j}$, for all $j \in \mathcal{N}_i$.
Theorem 1: Let $q^{(t)}$ be the solution generated by Algorithm 1 applied to the convex problem (15), whose solution is denoted by $(q^*, \lambda^*)$. Let $\bar{q}_t$ be defined as in (22). Let the graph $\mathcal{G}$ be connected (Assumption 2). The following relations hold:
(a) $0 \le \mathcal{L}(\bar{q}_t, \lambda^*) - \mathcal{L}(q^*, \lambda^*) \le \dfrac{C}{t+1}$;
(b) $\lim_{t \to \infty} \| q^{(t)} - q^* \| = 0$;
where $C \ge 0$ is a constant that depends on the distance of the initial guess to the optimal solution, i.e., $\|q^{(0)} - q^*\|$ and $\|\lambda^{(0)} - \lambda^*\|$, and on the parameter $\rho$.
Proof: Since, in the problem (15), the sets $\mathcal{Z}_{i,j}$ and $\mathcal{P}_i$ are closed and convex, and the costs $f_i$ are proper and convex, part (a) of the claim follows from [60, Theorem 4.1]. Since the constraint (15d) defines a linear system with full-column rank, part (b) follows from [61, Theorem 1].

Let now the local Lagrangian functions be
$$ \mathcal{L}_i(y^i_i, p_i, z_i, \lambda^i_i) := f_i(y^i_i, p_i) + \sum_{j \in \mathcal{N}_i} \Big[ \lambda^{iT}_{i,j} y^i_{i,j} + \frac{\rho}{2} \big\| y^i_{i,j} - z_{i,j} \big\|^2 \Big], $$
where $z_i$ and $\lambda^i_i$ are the stacked vectors of the $z_{i,j}$'s and $\lambda^i_{i,j}$'s for $j \in \mathcal{N}_i$, respectively. (By these definitions, the total Lagrangian (16) can now be written as $\mathcal{L}(y, p, z, \lambda) = \sum_{i \in \mathcal{V}} \mathcal{L}_i(y^i_i, p_i, z_i, \lambda^i_i)$, and the update (18a) reads $(y^{i(t+1)}_i, p^{(t+1)}_i) = \operatorname{argmin}_{y^i_{i,j} \in \mathcal{Z}_{i,j},\, p_i \in \mathcal{P}_i} \{ \mathcal{L}_i(y^i_i, p_i, z^{(t)}_i, \lambda^{i(t)}_i) \}$.) Assume we are interested in determining how many iterations $t$ are needed to reach a given average local accuracy $\eta \ge 0$, meaning
$$ 0 \le \frac{1}{n} \sum_{i \in \mathcal{V}} \big( \mathcal{L}_i(\bar{q}_{i,t}, \lambda^{i*}_i) - \mathcal{L}_i(q^*_i, \lambda^{i*}_i) \big) \le \eta, \qquad (23) $$
where $\bar{q}_{i,t}$ is the running average (as in (22)) of the local vector $q^{(t)}_i := (y^{i(t)T}_i, p^{(t)T}_i, z^{(t)T}_i)^T$. The following proposition holds.
Proposition 2: Let $q^{(t)}_i$ be the local solution generated by Algorithm 1 applied to the convex problem (15), whose solution for sensor node $i$ is denoted by $(q^*_i, \lambda^{i*}_i)$. Let $\bar{q}_{i,t}$ be defined as in (22) for $q^{(t)}_i$. Let the graph $\mathcal{G}$ be connected (Assumption 2). Let Algorithm 1 be initialized with $z^{(0)} = 0$ and $\lambda^{(0)} = 0$. Let $\eta$ be a given average local accuracy level, as expressed in (23). If the number of iterations $t$ is chosen as
$$ t \ge t_\eta := \max_{i \in \mathcal{V}} \left\lceil \frac{\rho^2 \|z^*_i\|^2 + \|\lambda^{i*}_i\|^2}{2 \rho \eta} \right\rceil + 1, $$
where $\lceil \cdot \rceil$ represents the ceiling operator, then the accuracy $\eta$ is reached.
Proof: The proof follows from Theorem 1. From point (a) of Theorem 1,
$$ 0 \le \mathcal{L}(\bar{q}_t, \lambda^*) - \mathcal{L}(q^*, \lambda^*) = \sum_{i \in \mathcal{V}} \big( \mathcal{L}_i(\bar{q}_{i,t}, \lambda^{i*}_i) - \mathcal{L}_i(q^*_i, \lambda^{i*}_i) \big) \le \frac{C}{t+1}. \qquad (24) $$
From [60, Theorem 4.1], $C$ can be expressed as
$$ C = \big( 2\rho \|z^{(0)} - z^*\|^2 + \rho^{-1} \|\lambda^{(0)} - \lambda^*\|^2 \big) / 2 = \sum_{i \in \mathcal{V}} \big( \rho \|z^*_i\|^2 + \rho^{-1} \|\lambda^{i*}_i\|^2 \big) / 2, \qquad (25) $$
where the last simplification is due to the initialization of $z^{(0)}$ and $\lambda^{(0)}$ at zero, and to $z_{i,j} = z_{j,i}$. Combining (24), (25), and (23) we obtain
$$ \frac{1}{n} \sum_{i \in \mathcal{V}} \big( \mathcal{L}_i(\bar{q}_{i,t}, \lambda^{i*}_i) - \mathcal{L}_i(q^*_i, \lambda^{i*}_i) \big) \le \frac{\max_{i \in \mathcal{V}} \{ \rho^2 \|z^*_i\|^2 + \|\lambda^{i*}_i\|^2 \}}{2 \rho (t+1)} = \eta, $$
from which the claim follows.

Proposition 2 says that the number of iterations for a given average local accuracy does not depend on the network size, but only on the worst local initial error.
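As a purely hypothetical numerical illustration of Proposition 2 (all numbers are ours, not from the paper): taking $\rho = 0.1$, $\eta = 10^{-3}$, and assuming $\max_{i \in \mathcal{V}} \{\rho^2 \|z^*_i\|^2 + \|\lambda^{i*}_i\|^2\} = 1$, the bound gives
$$ t_\eta = \left\lceil \frac{1}{2 \cdot 0.1 \cdot 10^{-3}} \right\rceil + 1 = 5001 $$
iterations, independently of the network size $n$.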
We can also characterize the total communication cost for node $i$ to reach a given accuracy level (which also does not depend on the network size) as follows.

Corollary 1: Under the same premises as Proposition 2, the communication cost $c_i$ for sensor node $i$ (i.e., the number of scalar values to send) to reach a desired average local accuracy $\eta$ is lower bounded by $c_i \ge (2D+5)\,|\mathcal{N}_i|\, t_\eta$.
Proof: Straightforward given the communication cost counting of Section VI-F and Proposition 2.

Theorem 1 indicates an $O(1/t)$ rate of convergence of Algorithm 1 in the ergodic sense (i.e., in the sense of the running average vector). We note that this $O(1/t)$ convergence is fast if one looks at the very loose assumptions: as a matter of fact, $f$ could also have been non-differentiable, and we report that typically non-differentiable problems solved using subgradient algorithms converge as $O(1/\sqrt{t})$ [62].

The mentioned $O(1/t)$ convergence rate assumes perfect and synchronous communication at step 4) and that the optimizations at steps 3) and 5) are carried out exactly. In real situations, these are rather restrictive requirements. In practice, communication is affected by noise [43], packets can be dropped, and it is in general asynchronous among the sensor nodes. In addition, the often limited computational capabilities of the sensor nodes limit the possibility to obtain highly accurate solutions for the SDP in step 3). The strength of ADMM is however to be resilient to these issues, which in turn means that ADMM can be employed and convergence can be guaranteed also when these issues are present [58]. In this paper, we decided to focus on the loss of synchronicity and the limited computation capabilities, since we believe they are the most critical issues in our application.
D. Properties of Algorithm 1 (Asynchronous)

First of all, we consider asynchronous communication. Following the main bulk of research on ADMM, we take an edge-set perspective. Suppose that at iteration $t$ only a subset of all the existing links is activated, denoted by $\mathcal{E}(t)$, and suppose that Algorithm 1 is run in an asynchronous fashion, where at each iteration $t$ we only consider the variables associated with $\mathcal{E}(t)$ and we communicate only through $\mathcal{E}(t)$. At each iteration $t$, we let the symmetric adjacency matrix associated with $\mathcal{E}(t)$ be denoted by $A(t)$. We further assume the following.

Assumption 3: At each iteration $t$ the symmetric adjacency matrix $A(t)$ is generated by an i.i.d. Bernoulli process with $\Pr[[A(t)]_{ij} = 1] = s_{ij} > 0$ for all $(i,j) \in \mathcal{E}$, with a given probability $0 < s_{ij} \le 1$.

Assumption 4: Let $\mathcal{G}(t) := (\mathcal{V}, \mathcal{E}(t))$. For every $t' \ge 0$, there exists an integer $T > 0$ such that: (i) the union of the edge sets satisfies $\bigcup_{\ell = t'}^{t'+T} \mathcal{E}(\ell) = \mathcal{E}$; (ii) the union graph, i.e., $\bigcup_{\ell = t'}^{t'+T} \mathcal{G}(\ell)$, is connected.

These assumptions are rather standard in stochastic distributed optimization [62], [63]. The convergence of Algorithm 1 under asynchronous communication can now be formally stated as follows.
Theorem 2: Let $q^{(t),\mathrm{asy}} = (y^{(t)T}, p^{(t)T}, z^{(t)T})^T$ be the solution generated by Algorithm 1 run in an asynchronous fashion, where at each iteration only a subset of the edges is active. Let $(q^*, \lambda^*)$ be the solution of the convex problem (15). Under Assumptions 3 and 4,
$$ \lim_{t \to \infty} \big\| q^{(t),\mathrm{asy}} - q^* \big\| = 0, \quad \text{almost surely}. $$
Proof: The proof is an application of [63, Theorem 3 and Lemma 4]. Consider [63, Theorem 3]: its Assumption 1 is valid since, in the problem (15), the sets $\mathcal{Z}_{i,j}$ and $\mathcal{P}_i$ are closed and convex, and the costs $f_i$ are proper and convex; its Assumptions 2 and 3 are our Assumptions 3 and 4. Problem (15) can be put in the form of the non-smooth unconstrained problem (2) of [63], and its dual is the problem (12) of [63]. With this in place, by [63, Theorem 3] we have almost sure convergence in the dual domain for Algorithm 1. By [63, Lemma 4] primal convergence follows, after which the claim is proven.
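A small simulation of the random activation model of Assumptions 3 and 4 (our own sketch; the edge set and probabilities are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
s = {e: 0.7 for e in edges}  # activation probabilities s_ij (assumed)

def active_edges(t):
    """Sample E(t): the random subset of edges active at iteration t."""
    return [e for e in edges if rng.random() < s[e]]

# Over a window of T iterations the union of the E(t)'s covers E with
# high probability whenever all s_ij > 0 (cf. Assumptions 3-4).
T = 10
union = set().union(*(set(active_edges(t)) for t in range(T)))
print(union)
```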
E. Properties of Algorithm 1 (Computation Errors)

The second aspect that we consider is the limited computation capability of the sensor nodes. In particular, we assume that each of the subproblems (18a) is solved up to an accuracy $\varepsilon$, i.e., the returned solution of each subproblem satisfies
$$ 0 \le \mathcal{L}_i(y^{i(t+1)}_i, p^{(t+1)}_i, z^{(t)}_i, \lambda^{i(t)}_i) - \min_{y^i_{i,j} \in \mathcal{Z}_{i,j},\, p_i \in \mathcal{P}_i} \mathcal{L}_i(y^i_i, p_i, z^{(t)}_i, \lambda^{i(t)}_i) \le \varepsilon. \qquad (26) $$
The following theorem is now in place.
Theorem 3: Let $\bar{q}_{t,\varepsilon}$ be the running average solution generated by Algorithm 1 under the assumption that each of the subproblems (18a) is solved up to an accuracy $\varepsilon$, as specified by condition (26). Let $(q^*, \lambda^*)$ be the solution of the convex problem (15). Then the following holds:
$$ 0 \le \mathcal{L}(\bar{q}_{t,\varepsilon}, \lambda^*) - \mathcal{L}(q^*, \lambda^*) \le \frac{C}{t+1} + n\varepsilon, $$
where $C \ge 0$ is a constant that depends on the distance of the initial guess to the optimal solution, i.e., $\|q^{(0)} - q^*\|$ and $\|\lambda^{(0)} - \lambda^*\|$, and on the parameter $\rho$.

Proof: The proof follows directly from [60], substituting their (3.5) with our (26).

Theorem 3 implies that Algorithm 1 converges as $O(1/t)$ to an error floor with magnitude $n\varepsilon$.

TABLE I
ANALYTICAL COMPARISON OF THE AVAILABLE DISTRIBUTED ALGORITHMS. BOTH SGO AND ADMM CAN BE APPLIED TO THE E-ML FORMULATION.

                               | E-ML with ADMM            | SGO of [40]                            | MVU of [41]
Size of the convex problem     | 7|N_i| + 2|N_{i,a}| + 3   | 4|N_i| + 3                             | not applicable
Computational complexity       | O((|N_i| + |N_{i,a}|)^3)  | O(|N_i|^3)                             | O(|N_i|)
Communication cost             | 9|N_i|                    | 2|N_i|                                 | O(|N_i|)
Type of distributed algorithm  | Parallel, ADMM            | Sequential, Gauss-Seidel               | Parallel, primal-dual subgradient and consensus
Convergence rate               | O(1/t) (ergodic)          | O(r^{t/n}) (actual), O(n/t) (ergodic)  | O(τ_mix log(n)/t) (ergodic)
F. Comparison of Algorithm 1 with Alternatives

In this section, we analyze the computational complexity and the communication cost of Algorithm 1 and we compare them to those of some other distributed algorithms for convex relaxations, namely the sequential greedy optimization (SGO) algorithm of [40] and the distributed maximum variance unfolding (MVU) algorithm of [41] (we leave out the approach of [57] since its convergence has not been proven). The aim is to show the added value of using ADMM, especially for large-scale networks. For simplicity, the nodes are located in $\mathbb{R}^2$.

E-ML with ADMM (Algorithm 1).
At each iteration, for each sensor node, the most complex operation is to solve the convex program (19). This convex program optimizes over $y^i_i, p_i, \gamma$ and it comprises $7|\mathcal{N}_i| + 2|\mathcal{N}_{i,\mathrm{a}}| + 3$ scalar variables, $|\mathcal{N}_i| + 2|\mathcal{N}_{i,\mathrm{a}}|$ scalar equality/inequality constraints, and $|\mathcal{N}_i| + 2|\mathcal{N}_{i,\mathrm{a}}|$ LMI constraints of size at most $10 \times 10$ (represented by the LMI with $\gamma_{i,j}$). This yields a computational complexity of at least $O((|\mathcal{N}_i| + |\mathcal{N}_{i,\mathrm{a}}|)^3)$ (see [42] for details on operation counts). The communication cost per iteration per sensor is proportional to the number of scalar variables that sensors have to send: each sensor has to send the updated $y^i_{i,j} \in \mathbb{R}^9$ to its neighbor $j$, for each neighbor, i.e., a communication cost of $9|\mathcal{N}_i|$. (In fact, assuming $x_i \in \mathbb{R}^2$, then $y^i_i \in \mathbb{R}^{6|\mathcal{N}_i|+3}$ after eliminating the variables repeated across edges, $p_i \in \mathbb{R}^{2|\mathcal{N}_{i,\mathrm{a}}|+3}$, and $\gamma \in \mathbb{R}^{|\mathcal{N}_i|}$; eliminating the overlapping variables between $y^i_i$ and $p_i$, the count follows. Furthermore, the local optimization has a part of (12b) and (19c) as equality/inequality constraints, in total $|\mathcal{N}_i| + 2|\mathcal{N}_{i,\mathrm{a}}|$, and the other part of (12b) plus (12c) and (19b) as LMIs, in total $|\mathcal{N}_i| + 2|\mathcal{N}_{i,\mathrm{a}}|$ LMIs of dimension at most $10 \times 10$ in the case of (19b).)

SGO.
The SGO algorithm of [40] is sequential in nature, meaning that only one local optimization can be run at a time, and although its convergence has been argued, no formal proof has been given for its convergence rate. Furthermore, noise-resilience as well as computation error-resilience are unknown features of SGO. (Coloring procedures as in [52] could be employed to partially parallelize SGO. These coloring techniques depend on the availability of a coloring scheme before running the SGO. Coloring schemes are NP-hard problems, and albeit there are decentralized techniques to compute bounds, the number of iterations to achieve a given accuracy is between $O(\log(n))$ and $O(n \exp(n))$ [64], [65], which undermines their applicability in large-scale settings.) For ease of comparison, we consider the range-based localization SGO and ignore the anchors for simplicity. In this context, and in the case we apply SGO to our E-ML formulation, at each iteration, for the one active sensor node (given that we keep $x_j$ fixed and there is no $z_{i,j}$ or $\gamma$ variable), the most complex operation is to solve a convex program comprising $4|\mathcal{N}_i| + 3$ variables and $|\mathcal{N}_i|$ LMI constraints, which leads to a computational complexity of at least $O(|\mathcal{N}_i|^3)$ (see also [40]). The communication cost per iteration for the one active sensor is proportional to the number of scalar variables that have to be sent (the updated $x_i$) multiplied by the number of sensor nodes they have to be sent to (the neighbors), yielding a cost of $2|\mathcal{N}_i|$. For the convergence rate, the best convergence rate that we can expect from a Gauss-Seidel algorithm (with some strong assumptions on the constraints and cost function) is linear [52], i.e., the convergence rate is $O(r^t)$ for a certain (problem-dependent and a priori unknown) $0 < r < 1$. Given that one iteration of the Gauss-Seidel scheme comprises $n$ sub-iterations of the SGO, the convergence rate of SGO is at best $O(r^{t/n})$, or $O(n/t)$ in the ergodic sense.

MVU. The MVU algorithm of [41] is parallel in nature, employs a primal-dual scheme with a nested consensus step, and cannot handle anchors. It is based on the decentralized spectral decomposition algorithm of [66] and it requires each node to eventually locate all the others. At each iteration, for each sensor node, the computational complexity is at most $O(|\mathcal{N}_i|)$ and the communication cost at most $O(|\mathcal{N}_i|)$. The convergence rate is based on the convergence of the decentralized spectral decomposition algorithm, which requires $O(\tau_{\mathrm{mix}} \log(n))$ sub-iterations ($\tau_{\mathrm{mix}}$ is the mixing time of a random walk on the graph $\mathcal{G}$), and on the convergence rate of the primal-dual scheme, proven to be $O(1/t)$ in the ergodic sense.

Table I collects the performed analyses and indicates that ADMM may be the best choice to increase the convergence rate, especially in the case of large-scale networks. This comes with a limited increase in communication cost, which can always be tuned by choosing the neighborhood's size. In the next section, we display what this means in simulation results, along with other relevant comparisons.

VII. NUMERICAL SIMULATIONS
In this section, we report several numerical comparisons for both the centralized formulation (12) and the distributed Algorithm 1. The aim of this section is to show how the E-ML relaxation performs under various noise conditions, to support the idea that tighter relaxations perform better in terms of position error (even though they may model the noise PDF wrongly), and to display the numerical properties of the distributed Algorithm 1.
A. Centralized simulations
We consider 2-dimensional problems and we use the benchmark test10-500 (from ∼yyye/), where the sensor nodes are randomly distributed in the unit box $[-0.5, 0.5]^2$. We let $\xi_{i,\ell}$ be the position error of sensor node $i$ for a certain realization $\ell$ of the noise, i.e., $\xi_{i,\ell} := \|\hat{x}_{i,\ell} - x_i\|$, where $\hat{x}_{i,\ell}$ is the estimated position and $x_i$ is the true position. We consider the position root mean square error (PRMSE) as a metric of performance for the proposed convex relaxations, i.e.,
$$ \mathrm{PRMSE} = \sqrt{ \frac{ \sum_{\ell=1}^{L} \sum_{i \in \mathcal{V}} \xi_{i,\ell}^2 }{L} }, \qquad (27) $$
where $L$ is the total number of noise realizations. Along with this metric, we consider the worst-case maximum error, i.e.,
$$ \mathrm{ME} = \max_{i \in \mathcal{V},\, \ell \in [1, L]} \xi_{i,\ell}, \qquad (28) $$
and we compute the Cramér-Rao lower bound (CRLB) as a comparison benchmark, as in [12].
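A direct transcription of (27) and (28) (our helper; xi_err holds the $\xi_{i,\ell}$ values):

```python
import numpy as np

def prmse_and_me(xi_err):
    """xi_err: (L, n) array of position errors xi_{i,l}.
    Returns (PRMSE, ME) as defined in (27) and (28)."""
    L = xi_err.shape[0]
    prmse = np.sqrt(np.sum(xi_err ** 2) / L)
    me = np.max(xi_err)
    return prmse, me
```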
Gaussian noise setting (Figures 1 and 2). In the first example, we focus on a Gaussian noise setting. We fix the maximum number of neighbors for each sensor node, we set the number of anchors to $m = 5$, we consider additive white noise with the same standard deviation $\sigma_{i,j} = \sigma_{i,k,\mathrm{a}}$ for all the measurements, and we average over $L$ noise realizations. In Figure 1, we compare the E-ML approach, i.e., the problem (12) with cost function $f_{\mathrm{GN}}(\delta, \epsilon, d, e)$, with the ESDP relaxation of [24] (considered to be the state of the art in convex relaxations) by increasing the number of sensor nodes $n$ and keeping all the other parameters, including $\sigma_{i,j} = \sigma_{i,k,\mathrm{a}}$, the same. As we can see, the performance of E-ML is better than that of ESDP, albeit only slightly. Furthermore, as one could expect, by increasing $n$ we average out the noise, which in turn means a better average performance and less difference between the two schemes.

In Figure 2, we study the performance of the E-ML approach and of the ESDP relaxation by increasing the noise value, for $n = 8$. As we can see, the performance of E-ML is again slightly better than that of ESDP, and the difference increases with the noise value (notice that the graph is in logarithmic scale).

Fig. 1. Comparison between the E-ML relaxation (12) and the ESDP relaxation of [24] in the Gaussian noise setting for different values of the sensor node number $n$ and a fixed $\sigma_{i,j} = \sigma_{i,k,\mathrm{a}}$. (The plot reports ME, the normalized PRMSE, and the normalized CRLB; plot data omitted.)

Fig. 2. Comparison between the E-ML relaxation (12) and the ESDP relaxation of [24] in the Gaussian noise setting for different values of the measurement noise standard deviation $\sigma_{i,j} = \sigma_{i,k,\mathrm{a}}$ and fixed $n = 8$. (The plot reports ME, the normalized PRMSE, and the normalized CRLB; plot data omitted.)

Laplacian noise setting (Figure 3). In this second example, we focus on a Laplacian noise setting. Also in this example, we fix the maximum number of neighbors for each sensor node, and we set $m = 5$ and $L = 50$. In Figure 3, we compare the Laplacian E-ML relaxation, i.e., problem (12) with cost function $f_{\mathrm{L}}(d, e)$, the Gaussian E-ML relaxation, i.e., problem (12) with cost function $f_{\mathrm{GN}}(\delta, \epsilon, d, e)$, and the ESDP relaxation of [24]. We use the modified version of the CRLB of [67] as a benchmark, since Laplacian distributions are not differentiable. We vary both the number of sensor nodes $n$ and the noise value. As we can see, although the Laplacian E-ML relaxation correctly models the noise distribution, it performs worse than the other convex relaxations. The reason is that it is not derived from a rank-$D$ relaxation and therefore it is a “looser” relaxation with respect to the other ones considered in this example.

Fig. 3. Comparison between the Laplacian E-ML relaxation, i.e., (12) with $f_{\mathrm{L}}(d, e)$, the ESDP relaxation of [24], and the Gaussian E-ML relaxation, i.e., (12) with $f_{\mathrm{GN}}$, in the Laplacian noise setting for different values of the sensor node number $n$ (here $n = 8$ and $n = 64$) and different noise values $\sigma_{i,j} = \sigma_{i,k,\mathrm{a}}$. (Plot data omitted.)

B. Distributed simulations
We use the same setting as in the centralized simulations (i.e., $m = 5$ anchors and the same maximum number of neighbors per sensor node) and we consider Gaussian noise. We test Algorithm 1 based on ADMM for different values of $n$, computation accuracy $n\varepsilon$, and asynchronous communication. In order to generate the computation error, we set sedumi.eps to $\varepsilon$, which is an upper bound for our definition of $\varepsilon$ (SeDuMi considers this tolerance to be related also to feasibility and not only to optimality, as we do; see [68] for details). We fix the regularization parameter $\rho$. We first focus on synchronous communication and then on the asynchronous implementation.

Synchronous case (Figures 4, 5, and 6). Figures 4, 5, and 6 collect the synchronous communication results for Algorithm 1 and confirm the $O(1/t)$ convergence of ADMM. In Figures 4 and 5, we fix the centralized problem as the relaxation (12) with cost function $f_{\mathrm{GN}}(\delta, \epsilon, d, e)$ and we compare the convergence of Algorithm 1 to the centralized solution (Theorem 1) with that of SGO applied to the same centralized problem. We also show the effect of computation inaccuracies (Theorem 3), supporting our theoretical findings. As we can see, by comparing SGO with the ADMM approach, we notice the slower convergence of the former (in a large-scale setting) due to its sequential nature (in fact, in the case of SGO, at each iteration $t$ we update only one sensor node position). We also see that SGO is resilient to computation inaccuracies (at least in this simulation), it seems to have a linear type of convergence (as argued), and it may be a choice in the case of small-size networks. Further studies are however necessary to certify the reliability of SGO in a broader class of scenarios.

Figure 6 represents the sensor node locations computed as the solution of Algorithm 1 for different iterations $t$. The algorithm is initialized with $X^{(0)} = 0$ and then run till $t = 400$. The “trajectories” of the running averaged variables $\bar{X}_t$ (i.e., the position part of the $\bar{q}_t$ vector (22)) as a function of the iteration number $t$ are displayed. As we can see, for $t = 400$, Algorithm 1 has practically converged onto the real sensor node locations.

Fig. 4. Actual objective convergence of Algorithm 1 in different settings and comparison with SGO solving the same E-ML problem (12). (Plot data omitted.)

Fig. 5. Ergodic objective convergence of Algorithm 1 in different settings and comparison with SGO solving the same E-ML problem (12). (Plot data omitted.)

Fig. 6. Position solutions $\bar{X}_t$ as a function of $t$, plotted as trajectories (using Algorithm 1 in its exact form) from $t = 0$ till $t = 400$. The values of $\bar{X}_{400}$ (blue circles) are practically coincident with the real sensor node positions (red squares); the crosses represent the anchors. (Plot data omitted.)

Asynchronous case (Figure 7). For the asynchronous communication case, we use the same setting as in the synchronous scenario and we consider different values for the number of sensor nodes $n$ and the probability $s_{ij}$ (Assumption 3). In Figure 7, the results are displayed: we have depicted the distance between the primal solution from Algorithm 1, $q^{(t)}$, and the optimal solution found by solving the centralized problem (12), i.e., $q^*$. As we expect, Algorithm 1 converges to the optimal primal solution of the centralized problem (12) (Theorem 2).
Asynchronous case (Figure 7). For the asynchronous communication case, we use the same setting as in the synchronous scenario, and we consider different values of the number of sensor nodes n and of the probability s_ij (Assumption 3). The results are displayed in Figure 7, where we depict the distance between the primal solution of Algorithm 1, q(t), and the optimizer of the centralized problem (12), q*. As expected, Algorithm 1 converges to the optimal primal solution of the centralized problem (12) (Theorem 2).

Fig. 7. Actual primal convergence of Algorithm 1 to the optimizer of the centralized E-ML problem (12) under asynchronous communication, for different n and s_ij.
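For intuition on these experiments, the toy sketch below mimics the randomized activation of Assumption 3 as we read it: at every iteration each agent wakes up with some probability and only awake agents refresh their variables, while the others hold theirs. A single wake-up probability s stands in for s_ij here; this, and the toy problem itself, are assumptions for illustration, not the paper's protocol.

```python
# Minimal sketch (toy, not the paper's protocol): randomized activation in
# the spirit of Assumption 3; a single probability s replaces s_ij.
import numpy as np

rng = np.random.default_rng(1)
a = np.array([1.0, 3.0, 8.0])
n, rho, s, T = len(a), 0.5, 0.3, 2000
x, u, z = np.zeros(n), np.zeros(n), 0.0

for t in range(T):
    awake = rng.random(n) < s        # random subset of active agents
    if not awake.any():
        continue                     # nobody woke up this round
    # Only awake agents recompute their local variables; the others hold.
    x[awake] = (a[awake] + rho * (z - u[awake])) / (1.0 + rho)
    z = np.mean(x + u)               # consensus over the current values
    u[awake] += x[awake] - z

print(abs(z - a.mean()))             # gap to the optimum; shrinks over the run
```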
VIII. CONCLUSIONS

We have studied the sensor network localization problem. We have argued that employing convex relaxations based on a maximum likelihood formulation to massage the original non-convex formulation can offer a powerful handle on computing accurate solutions. To take full advantage of this aspect, we have shown that the relaxation has to be as tight as possible to the original non-convex problem, in some cases even disregarding the noise model. Furthermore, we have discussed a distributed implementation of the resulting convex relaxation via the ADMM. By exploiting the analytical properties of ADMM (convergence rate, asynchronism resilience, computation error resilience), we have studied the resulting distributed algorithm, showing its added value with respect to available techniques, especially in large networks.

Among future research plans, we are interested in studying mobile sensor network localization problems by using convex relaxations based on a maximum a posteriori formulation of the estimation problem.
REFERENCES

[1] P. Biswas and S. Phoha, "Self-Organizing Sensor Networks for Integrated Target Surveillance," IEEE Transactions on Computers, vol. 55, no. 8, pp. 1033–1047, 2006.
[2] T. Räty, "Survey on Contemporary Remote Surveillance Systems for Public Safety," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 40, no. 5, pp. 493–515, 2010.
[3] O. Songhwai, L. Schenato, P. Chen, and S. Sastry, "Tracking and Coordination of Multiple Agents Using Sensor Networks: System Design, Algorithms and Experiments," Proceedings of the IEEE, vol. 95, no. 1, pp. 234–254, 2007.
[4] J. Liu, M. Chu, and J. Reich, "Multitarget Tracking in Distributed Sensor Networks," IEEE Signal Processing Magazine, vol. 24, no. 3, pp. 36–46, 2007.
[5] T. Sun, L.-J. Chen, C.-C. Han, and M. Gerla, "Reliable Sensor Networks for Planet Exploration," in Proceedings of the IEEE Networking, Sensing and Control Conference, (Tucson, USA), pp. 816–821, March 2005.
[6] N. Leonard, D. A. Paley, F. Lekien, R. Sepulchre, D. Fratantoni, and R. Davis, "Collective Motion, Sensor Networks, and Ocean Sampling," Proceedings of the IEEE, vol. 95, no. 1, pp. 48–74, 2007.
[7] P. Corke, T. Wark, R. Jurdak, W. Hu, P. Valencia, and D. Moore, "Environmental Wireless Sensor Networks," Proceedings of the IEEE, vol. 98, no. 11, pp. 1903–1917, 2010.
[8] G. Sun, G. Qiao, and B. Xu, "Corrosion Monitoring Sensor Networks with Energy Harvesting," IEEE Sensors Journal, vol. 11, no. 6, pp. 1476–1477, 2011.
[9] K. Zhou and S. I. Roumeliotis, "Multirobot Active Target Tracking with Combinations of Relative Observations," IEEE Transactions on Robotics, vol. 27, no. 4, pp. 678–695, 2011.
[10] T. Arampatzis, J. Lygeros, and S. Manesis, "A Survey of Applications of Wireless Sensors and Wireless Sensor Networks," in Proceedings of the Mediterranean Conference on Control and Automation, (Limassol, Cyprus), pp. 719–724, June 2005.
[11] K. Langendoen and N. Reijers, "Distributed Localization in Wireless Sensor Networks: a Quantitative Comparison," Computer Networks, vol. 43, no. 6, pp. 499–518, 2003.
[12] N. Patwari, J. N. Ash, S. Kyperountas, A. O. Hero III, R. L. Moses, and N. S. Correal, "Locating the Nodes: Cooperative Localization in Wireless Sensor Networks," IEEE Signal Processing Magazine, vol. 22, no. 4, pp. 54–69, 2005.
[13] G. Mao, B. Fidan, and B. Anderson, "Wireless Sensor Network Localization Techniques," Computer Networks, vol. 51, no. 10, pp. 2529–2553, 2007.
[14] Y. Shang, W. Ruml, Y. Zhang, and M. P. J. Fromherz, "Localization from Mere Connectivity," in Proceedings of the 4th ACM International Symposium on Mobile Ad Hoc Networking and Computing, (Annapolis, USA), pp. 201–212, June 2003.
[15] K. W. Cheung and H. C. So, "A Multidimensional Scaling Framework for Mobile Location Using Time-of-Arrival Measurements," IEEE Transactions on Signal Processing, vol. 53, no. 2, pp. 460–470, 2005.
[16] H. Wymeersch, J. Lien, and M. Z. Win, "Cooperative Localization in Wireless Networks," Proceedings of the IEEE, vol. 97, no. 2, pp. 427–450, 2009.
[17] F. S. Cattivelli and A. H. Sayed, "Distributed Nonlinear Kalman Filtering with Applications to Wireless Localization," in Proceedings of the 35th IEEE International Conference on Acoustics, Speech, and Signal Processing, (Dallas, USA), pp. 3522–3525, March 2010.
[18] L. Doherty, K. S. J. Pister, and L. E. Ghaoui, "Convex Position Estimation in Wireless Sensor Networks," in Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies, (Anchorage, USA), pp. 1655–1663, April 2001.
[19] Z.-Q. Luo, W.-K. Ma, A. M.-C. So, Y. Ye, and S. Zhang, "Semidefinite Relaxation of Quadratic Optimization Problems," IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 20–34, 2010.
[20] P. Biswas and Y. Ye, "Semidefinite Programming for Ad Hoc Wireless Sensor Network Localization," in Proceedings of the 3rd International Conference on Information Processing in Sensor Networks, (Berkeley, USA), pp. 46–54, April 2004.
[21] P. Biswas, T.-C. Lian, T.-C. Wang, and Y. Ye, "Semidefinite Programming Based Algorithms for Sensor Network Localization," ACM Transactions on Sensor Networks, vol. 2, no. 2, pp. 188–220, 2006.
[22] K. Q. Weinberger and L. K. Saul, "Unsupervised Learning of Image Manifolds by Semidefinite Programming," International Journal of Computer Vision, vol. 70, no. 1, pp. 77–90, 2006.
[23] J. Sun, S. Boyd, L. Xiao, and P. Diaconis, "The Fastest Mixing Markov Process on a Graph and a Connection to a Maximum Variance Unfolding Problem," SIAM Review, vol. 48, no. 4, pp. 681–699, 2006.
[24] Z. Wang, S. Zheng, S. Boyd, and Y. Ye, "Further Relaxations of the SDP Approach to Sensor Network Localization," SIAM Journal on Optimization, vol. 19, no. 2, pp. 655–673, 2008.
[25] K. W. K. Lui, W.-K. Ma, H. C. So, and F. K. W. Chan, "Semi-Definite Programming Algorithms for Sensor Network Node Localization with Uncertainties in Anchor Positions and/or Propagation Speed," IEEE Transactions on Signal Processing, vol. 57, no. 2, pp. 752–763, 2009.
[26] T. K. Pong and P. Tseng, "(Robust) Edge-based Semidefinite Programming Relaxation of Sensor Network Localization," Mathematical Programming, vol. 130, no. 2, pp. 321–358, 2011.
[27] T. Pong, "Edge-based Semidefinite Programming Relaxation of Sensor Network Localization with Lower Bound Constraints," Computational Optimization and Applications, vol. 53, no. 1, pp. 23–44, 2012.
[28] A. M.-C. So and Y. Ye, "Theory of Semidefinite Programming for Sensor Network Localization," Mathematical Programming, vol. 109, no. 2, pp. 367–384, 2007.
[29] A. Javanmard and A. Montanari, "Localization from Incomplete Noisy Distance Measurements," in Proceedings of the IEEE International Symposium on Information Theory, (St. Petersburg, Russia), pp. 1584–1588, August 2011.
[30] D. Shamsi, N. Taheri, Z. Zhu, and Y. Ye, "Conditions for Correct Sensor Network Localization Using SDP Relaxation," tech. rep., 2012. arXiv:1010.2262v4 [math.MG].
[31] M. Fukuda, M. Kojima, K. Murota, and K. Nakata, "Exploiting Sparsity in Semidefinite Programming via Matrix Completion I: General Framework," SIAM Journal on Optimization, vol. 11, no. 3, pp. 647–674, 2001.
[32] K. Nakata, K. Fujisawa, M. Fukuda, M. Kojima, and K. Murota, "Exploiting Sparsity in Semidefinite Programming via Matrix Completion II: Implementation and Numerical Results," Mathematical Programming, vol. 95, no. 2, pp. 303–327, 2003.
[33] K. Q. Weinberger, F. Sha, Q. Zhu, and L. K. Saul, Advances in Neural Information Processing Systems 19, ch. Graph Laplacian Regularization for Large-Scale Semidefinite Programming, pp. 1489–1496. Cambridge, MA: MIT Press, 2007.
[34] N.-H. Z. Leung and K.-C. Toh, "An SDP-Based Divide-and-Conquer Algorithm for Large-Scale Noisy Anchor-Free Graph Realization," SIAM Journal on Scientific Computing, vol. 31, no. 6, pp. 4351–4372, 2009.
[35] S. Kim, M. Kojima, and H. Waki, "Exploiting Sparsity in SDP Relaxation for Sensor Network Localization," SIAM Journal on Optimization, vol. 20, no. 1, pp. 192–215, 2009.
[36] P. Tseng, "Second-Order Cone Programming Relaxation of Sensor Network Localization," SIAM Journal on Optimization, vol. 18, no. 1, pp. 156–185, 2007.
[37] J. Nie, "Sum of Squares Method for Sensor Network Localization," Computational Optimization and Applications, vol. 43, no. 2, pp. 151–179, 2009.
[38] K. Yang, G. Wang, and Z.-Q. Luo, "Efficient Convex Relaxation Methods for Robust Target Localization by a Sensor Network Using Time Differences of Arrival," IEEE Transactions on Signal Processing, vol. 57, no. 7, pp. 2775–2784, 2009.
[39] P. Oğuz-Ekim, J. Gomes, J. Xavier, and P. Oliveira, "A Convex Relaxation for Approximate Maximum-Likelihood 2D Source Localization from Range Measurements," in Proceedings of the 35th IEEE International Conference on Acoustics, Speech, and Signal Processing, (Dallas, USA), pp. 2698–2701, March 2010.
[40] Q. Shi, C. He, H. Chen, and L. Jiang, "Distributed Wireless Sensor Network Localization via Sequential Greedy Optimization Algorithm," IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3328–3340, 2010.
[41] A. Simonetto, T. Keviczky, and D. V. Dimarogonas, "Distributed Solution for a Maximum Variance Unfolding Problem with Sensor and Robotic Network Applications," in Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing, (Monticello, USA), pp. 63–70, October 2012.
[42] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[43] I. D. Schizas, A. Ribeiro, and G. B. Giannakis, "Consensus in Ad Hoc WSNs With Noisy Links – Part I: Distributed Estimation of Deterministic Signals," IEEE Transactions on Signal Processing, vol. 56, no. 1, pp. 350–364, 2008.
[44] F. Y. Jakubiec and A. Ribeiro, "D-MAP: Distributed Maximum a Posteriori Probability Estimation of Dynamic Systems," IEEE Transactions on Signal Processing, vol. 61, no. 2, pp. 450–466, 2013.
[45] P. Oğuz-Ekim, J. Gomes, J. Xavier, and P. Oliveira, "Robust Localization of Nodes and Time-Recursive Tracking in Sensor Networks Using Noisy Range Measurements," IEEE Transactions on Signal Processing, vol. 59, no. 8, pp. 3930–3942, 2011.
[46] H. Wymeersch, S. Maranò, W. M. Gifford, and M. Z. Win, "A Machine Learning Approach to Ranging Error Mitigation for UWB Localization," IEEE Transactions on Communications, vol. 60, no. 6, pp. 1719–1728, 2012.
[47] P. Biswas, K.-C. Toh, and Y. Ye, "A Distributed SDP Approach for Large-Scale Noisy Anchor-Free Graph Realization with Applications to Molecular Conformation," SIAM Journal on Scientific Computing, vol. 30, no. 3, pp. 1251–1277, 2008.
[48] J.-P. Sheu, W.-K. Hu, and J.-C. Lin, "Distributed Localization Scheme for Mobile Sensor Networks," IEEE Transactions on Mobile Computing, vol. 9, no. 4, pp. 516–526, 2010.
[49] F. Chan and H. So, "Accurate Distributed Range-Based Positioning Algorithm for Wireless Sensor Networks," IEEE Transactions on Signal Processing, vol. 57, no. 10, pp. 4100–4105, 2009.
[50] U. A. Khan, S. Kar, and J. M. F. Moura, "DILAND: An Algorithm for Distributed Sensor Localization with Noisy Distance Measurements," IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1940–1947, 2010.
[51] M. Cucuringu, Y. Lipman, and A. Singer, "Sensor Network Localization by Eigenvector Synchronization over the Euclidean Group," ACM Transactions on Sensor Networks, vol. 8, no. 3, pp. 19:1–19:42, 2012.
[52] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Athena Scientific, Belmont, Massachusetts, 1997.
[53] J. A. Costa, N. Patwari, and A. O. Hero III, "Distributed Weighted-Multidimensional Scaling for Node Localization in Sensor Networks," ACM Transactions on Sensor Networks, vol. 2, no. 6, pp. 39–64, 2005.
[54] G. C. Calafiore, L. Carlone, and M. Wei, "A Distributed Technique for Localization of Agent Formations From Relative Range Measurements," IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, vol. 42, no. 5, pp. 1065–1076, 2012.
[55] C. Soares, J. Xavier, and J. Gomes, "DCOOL-NET: Distributed Cooperative Localization for Sensor Networks." Available at arXiv:1211.7277, 2012.
[56] A. Montanari and S. Oh, "On Positioning via Distributed Matrix Completion," in Proceedings of the IEEE Sensor Array and Multichannel Signal Processing Workshop, (Jerusalem, Israel), pp. 197–200, October 2010.
[57] S. Srirangarajan, A. Tewfik, and Z.-Q. Luo, "Distributed Sensor Network Localization using SOCP Relaxation," IEEE Transactions on Wireless Communications, vol. 7, no. 12, pp. 4886–4895, 2008.
[58] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
[59] H. Zhu, G. B. Giannakis, and A. Cano, "Distributed In-Network Channel Decoding," IEEE Transactions on Signal Processing, vol. 57, no. 10, pp. 3970–3983, 2009.
[60] B. He and X. Yuan, "On the O(1/t) Convergence Rate of Alternating Direction Method," Optimization Online, 2011.
[61] J. Mota, J. Xavier, P. Aguiar, and M. Püschel, "D-ADMM: A Communication-Efficient Distributed Algorithm for Separable Optimization," IEEE Transactions on Signal Processing, 2013. In press.
[62] J. C. Duchi, A. Agarwal, and M. Wainwright, "Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling," IEEE Transactions on Automatic Control, vol. 57, no. 3, pp. 592–606, 2012.
[63] F. Iutzeler, P. Bianchi, P. Ciblat, and W. Hachem, "Asynchronous Distributed Optimization using a Randomized Alternating Direction Method of Multipliers," accepted for the 52nd IEEE Conference on Decision and Control, (Firenze, Italy), December 2013. Available at arXiv:1303.2837.
[64] F. Kuhn and R. Wattenhofer, "On the Complexity of Distributed Graph Coloring," in Proceedings of the 25th Annual ACM Symposium on Principles of Distributed Computing, (Denver, USA), pp. 7–15, July 2006.
[65] K. Duffy, N. O'Connell, and A. Sapozhnikov, "Complexity Analysis of a Decentralised Graph Colouring Algorithm," Information Processing Letters, vol. 107, no. 2, pp. 60–63, 2008.
[66] D. Kempe and F. McSherry, "A Decentralized Algorithm for Spectral Analysis," Journal of Computer and System Sciences, vol. 74, no. 1, pp. 70–83, 2008.
[67] M. Leng and Y.-C. Wu, "On Joint Synchronization of Clock Offset and Skew for Wireless Sensor Networks under Exponential Delay," in