On the Robustness of Multi-View Rotation Averaging

Abstract

Rotation averaging is a synchronization process on single or multiple rotation groups, and is a fundamental problem in many computer vision tasks such as multi-view structure from motion (SfM). Specifically, rotation averaging involves the recovery of an underlying pose-graph consistency from pairwise relative camera poses: given pairwise motion in rotation groups, especially 3-dimensional rotation groups (e.g., $\mathrm{SO}(3)$), one is interested in recovering the original signal of multiple rotations with respect to a fixed frame. In this paper, we propose a robust framework to solve the multiple rotation averaging problem, especially in cases where a significant amount of noisy measurements is present. By introducing the $\epsilon$-cycle consistency term into the solver, we enable a robust initialization scheme to be built into the IRLS solver. Instead of conducting costly edge removal, we implicitly constrain the negative effect of erroneous measurements by reducing their weights, such that IRLS failures caused by poor initialization can be effectively avoided. Experimental results demonstrate that our proposed approach outperforms the state of the art on various benchmarks.


1. Introduction

Robot navigation guided by visual information, namely simultaneous localization and mapping (SLAM), primarily involves estimating and updating the camera trajectory dynamically. Pose graph optimization, a fundamental element of SLAM, iteratively corrects erroneous camera pose estimates caused by noisy input and misplaced data association. Conventional pose graph optimization techniques are principally realized with bundle adjustment (BA), which refines camera poses by progressively minimizing the point-camera re-projection errors. Using state-of-the-art nonlinear programming algorithms, e.g., the Levenberg-Marquardt method [1] or the Gauss-Newton method, the camera poses and map points are successively optimized according to the sequential input images.

Motion averaging has recently attracted surging research interest in the 3D vision field, especially for structure-from-motion (SfM) related tasks. In contrast with BA-based approaches, which mainly leverage point-camera correspondences, global motion averaging aims to recover the camera poses by solving the synchronization problem, i.e., to obtain a set of camera orientations and locations that are consistent with the pairwise measurements between them. General motion averaging pipelines involve two steps: rotation averaging based on the epipolar geometric correlation, followed by translation averaging, where the camera orientations are the solution of the previous step and are considered fixed. Translation averaging is well known to be a convex problem, and all its stationary points are thus globally optimal solutions. However, analogous statements on the robustness of existing methods are still lacking in the rotation averaging study.

We aim to show that, given cycle consistency constraints, i.e., the relative rotations composed progressively along a cycle should end with the identity, the local iterative method for the multiple rotation averaging problem can be initialized in a more robust manner. It is well known that Lie groups are sensitive to perturbations: even small noise on a few elements of the rotation matrix can result in completely different rotations. Due to the nature of the multiple rotation averaging problem, however, the measurements are normally noisy. In the viewgraph of large-scale problems, the measurement noise on some edges can propagate progressively over the entire graph, resulting in unsatisfactory solutions. Moreover, for multiple rotation averaging there does not exist a canonical direct solver, and all iterative solvers rely heavily on a reasonably good initialization. With severely noisy measurements and a trivial initialization, convergence of the iterative solver tends to be excessively slow, or the solver eventually fails to converge.

In this work, we address all the issues mentioned above by introducing measurement de-noising (or measurement reweighting), which is conducted before initiating the IRLS solver. Imposing cycle consistency essentially amounts to conducting single rotation averaging over the cyclic subgraph. Within each cycle, the deviation of the erroneous edge is constrained by redundant measurements and thereafter diluted by allowing a set of weights on all the edges. These edge weights are then normalized within the scale of the whole graph and used as the initialization in the subsequent IRLS iterations. Furthermore, we exploit a novel cost function in the IRLS steps such that the penalty on the erroneous measurements changes accordingly.

To validate our approach, we conduct experiments on challenging collections of unordered internet photos of various sizes and demonstrate that our proposed scheme yields similar or higher accuracy than state-of-the-art results.

In summary, the key contributions of our paper are:

• We propose a robust framework to exactly recover a consistent set of camera poses in the presence of a significant amount of noisy measurements and/or outliers in SfM problems.
• We show that, with the desired connectivity on a graph, the noisy measurements can be guaranteed to yield an error upper bound.
• We address the issues arising for existing multiple rotation averaging approaches. In particular, we demonstrate the conditions under which multiple rotation averaging schemes may fail.
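The cycle consistency constraint mentioned above (relative rotations composed along a cycle should end with the identity) can be illustrated with a small numerical sketch. It assumes the convention $\sigma(i,j) = \lambda(i)^{\top}\lambda(j)$ for relative rotations and the geodesic angle as the distance; the three camera rotations and the 10-degree perturbation are hypothetical values chosen only for illustration.

```python
import numpy as np

def rot(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def geodesic(R1, R2):
    """Geodesic distance on SO(3): rotation angle of R1^T R2, in radians."""
    c = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def relative(Ri, Rj):
    """Relative rotation sigma(i, j) satisfying Ri @ sigma(i, j) = Rj (assumed convention)."""
    return Ri.T @ Rj

# Hypothetical absolute rotations of three cameras i, j, k.
Ri = rot([0, 0, 1], 0.3)
Rj = rot([0, 1, 0], -0.7)
Rk = rot([1, 1, 0], 1.1)

s_ij, s_jk, s_ki = relative(Ri, Rj), relative(Rj, Rk), relative(Rk, Ri)

# Noise-free cycle: composing the three relative rotations returns to the identity.
clean_error = geodesic(s_ij @ s_jk @ s_ki, np.eye(3))

# Corrupt one edge by a 10-degree rotation and re-check the cycle.
noise = rot([1, 0, 0], np.deg2rad(10.0))
noisy_error = geodesic((s_ij @ noise) @ s_jk @ s_ki, np.eye(3))

print(f"clean cycle error: {np.rad2deg(clean_error):.6f} deg")
print(f"noisy cycle error: {np.rad2deg(noisy_error):.6f} deg")
```

Here the noise-free cycle error vanishes up to floating-point precision, whereas corrupting a single edge by 10 degrees shifts the cycle error by the same angle.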

2. Related Work

Camera pose estimation lies at the heart of monocular SfM systems, where camera orientation and translation optimization constitute the camera pose refinement process. Compared with conventional BA-based approaches, where re-projection errors are iteratively minimized, approaches built on rotation averaging have recently proven more efficient while yielding comparable or higher accuracy, which greatly benefits real-time applications with limited computational power. Rotation averaging was first introduced into 3D vision by [2], where the authors exploit Lie-algebraic averaging and propose an efficient and robust solver for large-scale rotation averaging problems, and was later studied in [3]. The improved performance of the motion averaging backbone over canonical approaches has stimulated numerous SfM frameworks [4–10], in which global motion averaging is conducted to simultaneously solve all camera orientations from inter-camera relative motions. In [7], the authors develop a camera clustering algorithm and present a hybrid pipeline that applies parallel-processed local increments within a global motion averaging framework. Similarly, [5] addresses distributed large-scale motion averaging. In [8], a hybrid camera estimation pipeline is proposed in which dense data association introduces a single rotation averaging scheme into visual SfM.

Rotation averaging [11] has shown improved robustness compared with canonical BA-based approaches in numerous aspects. For instance, proper initialization plays a vital role in obtaining a sufficiently stable monocular system [12], while [13] addresses the initialization problem for 3D pose graph optimization and surveys 3D rotation estimation techniques, where the proposed initialization demonstrates superior noise resilience. In [14] and [15], the process is initialized by optimizing an $\ell_1$ loss function to guarantee a reasonable initial estimate. It has also been shown in 567 that estimating rotations separately and initializing the 2D pose graph with the measurements provides improved accuracy and higher robustness. In [16, 17], it is shown that camera rotation can be computed independently of translation given specific epipolar constraints. It is well known that monocular SfM is sensitive to outliers, and many robust approaches [18–21] have thus been designed to better handle noisy measurements. Moreover, Lagrangian duality has been employed in recent literature [22–24] to address solution optimality. A recent paper [25] shows that certifiable global optimality is obtainable by utilizing Lagrangian duality to handle the quadratic non-convex rotation constraints [26], and further derives an analytical error bound in the rotation averaging framework. Recent works [27] and [8] attempt to rely solely on rotation averaging without BA to handle SfM tasks. In [8], the authors partition the input sequence into blocks according to pairwise covisibility, and the optimization is processed hierarchically with local BA and global single rotation averaging. While [8] yields high accuracy, handling the latency between local and global optimization is demanding, and the system may progressively suffer time overhead.

Recent works [28–30] tackle the multiple rotation averaging problem by exploiting rank constraints on the global fundamental matrices. While these factorization-based methods show high accuracy on large-scale datasets, they are much slower and costlier than approaches based on local iterative solvers. Our proposed approach falls into the latter category. Inspired by recent work [31], which proposes an algorithm to solve group synchronization under significant amounts of corruption or noise, we observe that most iterative solvers for general group synchronization rely heavily on the initialization scheme and thus tend to fail in the presence of noisy measurements. Analogous to [31], our work also focuses on robustifying rotation averaging in noisy scenarios. However, [31] applies a message passing scheme to explicitly estimate the underlying noise levels, while we propose to implicitly decrease the weights on the noisy edges by enforcing cycle consistency. Other works utilizing cycle consistency include [32] and [33], where [32] proposes to detect corrupted measurements by maximizing a log-likelihood function and [33] classifies edges as uncorrupted as long as they belong to any cycle-consistent cycle. In this work, instead of detecting or removing erroneous edges explicitly, we propose to implicitly avoid the negative effects of noisy measurements by enforcing the reweighted graph to be cycle consistent.

3. Theory

Consider a simple directed graph $\mathcal{G} := (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = [n] = \{1, 2, \cdots, n\}$ denotes the set of vertices and $\mathcal{E} = \{(i,j) \mid i \neq j,\ i,j \in \mathcal{V}\}$ denotes the set of directed edges. We further associate a set of labels $\{\Lambda, \Sigma\}$ with $\mathcal{G}$ such that a new tuple $\mathcal{H} := (\mathcal{V}, \mathcal{E}, \lambda, \sigma)$ is constructed with $\lambda: \mathcal{V} \to \Lambda$ and $\sigma: \mathcal{E} \to \Sigma$. Since we primarily focus on synchronization over the 3-dimensional rotation group (i.e., $\mathrm{SO}(3)$), we assume $\Lambda \subseteq \mathrm{SO}(3)$ and $\Sigma \subseteq \mathrm{SO}(3)$ unless stated otherwise for the rest of the paper. Furthermore, we assume $(j,i) \in \mathcal{E}$ if and only if $(i,j) \in \mathcal{E}$ for all $i \neq j \in \mathcal{V}$ due to the nature of the problem. We thus have $\sigma(i,j) = \sigma(j,i)^{\top} = \sigma(j,i)^{-1}$. For ease of notation, we henceforth let $\mathcal{E}$ be the set of undirected edges associated with the ordered labeling $\sigma$.

With the notation above, we further clarify the problem as follows. Consider an $n$-view scene with $m$ measurements of relative rigid motions between the cameras. Specifically, node $i \in \mathcal{V}$ in $\mathcal{H}$ denotes the $i$-th view, $\lambda(i) \in \mathrm{SO}(3)$ denotes the absolute camera rotation at $i$, and $(i,j) \in \mathcal{E}$ if there exists a relative transformation measurement between the $i$-th view and the $j$-th view, where $|\mathcal{V}| = n$ and $|\mathcal{E}| = m$. Denote by $\tilde{\sigma}(i,j) \in \mathrm{SO}(3)$ the measurement of the relative rotation from $i$ to $j$ (and by $\tilde{\sigma}(j,i)$ that for the opposite direction), and by $\bar{\sigma}(i,j)$ the corresponding ground truth. The edge measurement error is denoted by $\epsilon(i,j) := d(\tilde{\sigma}(i,j), \bar{\sigma}(i,j))$, where $d: \mathrm{SO}(3) \times \mathrm{SO}(3) \to \mathbb{R}_0^{+}$ denotes some metric.

Definition 1 (graph consistency). Given a labeled graph $\mathcal{H} = (\mathcal{V}, \mathcal{E}, \lambda, \hat{\sigma})$ and some metric $d$ as defined above, for a given $\lambda$, $\mathcal{H}$ is $\lambda$-consistent if and only if

$$\lambda(i) \cdot \hat{\sigma}(i,j) = \lambda(j), \quad \forall (i,j) \in \mathcal{E}, \tag{1}$$

which is equivalent to

$$\sum_{(i,j) \in \mathcal{E}} d\big(\lambda(i)\hat{\sigma}(i,j), \lambda(j)\big) = 0. \tag{2}$$

The multiple rotation averaging problem studied in this paper can thus be considered as a group synchronization process, where we recover $\lambda$ by imposing graph consistency on $\mathcal{H}$. Since no direct solver for synchronization on $\mathrm{SO}(3)$ exists, it is conventionally formulated as an optimization problem, which can be solved iteratively.

Figure 1: Common cases of group synchronization in structure from motion tasks. The coordinate frames represent cameras with different motion; solid black lines indicate that measurable relative motion exists between the two frames, and red dashed lines indicate that the measurements of relative motion, though available, contain high noise. In (a), the pose-graph does not contain cycles, so error compensation based on cycle consistency cannot be conducted. In (b), the pose-graph contains cycles but the measurements are noise-free. In (c), the pose-graph contains cycles but some of the edges are noisy; in this paper, we focus on group synchronization in this case.

Specifically, Eq. 2 is equivalently posed as

$$\arg\min_{\lambda} \sum_{(i,j) \in \mathcal{E}} d\big(\lambda(i)\hat{\sigma}(i,j), \lambda(j)\big). \tag{3}$$

In practice, Eq. 3 is rarely solved in its original form. Instead, a cost function is often applied in the optimization, which amounts to solving

$$\arg\min_{\lambda} \sum_{(i,j) \in \mathcal{E}} \rho\Big(d\big(\hat{\sigma}(i,j), \lambda(i)^{-1}\lambda(j)\big)\Big), \tag{4}$$

where $\rho(\cdot)$ is some cost function. In the following analysis, we use the geodesic distance for $d(\cdot,\cdot)$ by convention and the $\ell_1$ cost function for measurement correction, as its low penalty makes the solver more robust than its counterparts; we use a convex cost function for IRLS as described in §4.2. Refer to §5.3.1 for the comparison of the performances with different cost functions.

To solve Eq. 4, we first want to make sure that $\hat{\sigma}$ is reliable to some extent, as Eq. 4 is well known to be sensitive to measurement outliers. Accordingly, we begin the analysis by introducing cycle consistency.

Definition 2 (cycle consistency).
Given an edge-labeled graph $\mathcal{H} = (\mathcal{V}, \mathcal{E}, \sigma)$ as defined above, for $i \in \mathcal{V}$, denote by $\mathcal{C}_i := \{\mathcal{C}_i^1, \mathcal{C}_i^2, \cdots\}$ the union of all the cycles containing $i$. Then $\mathcal{H}$ is cycle consistent if and only if

$$\sigma(i, i_1) \cdot \prod_{j=1}^{m_k - 1} \sigma(i_j, i_{j+1}) \cdot \sigma(i_{m_k}, i) = I_\sigma, \quad \forall i \in \mathcal{V},\ \forall \mathcal{C}_i^k \in \mathcal{C}_i, \tag{5}$$

where $m_k = |\mathcal{C}_i^k|$ and $I_\sigma$ denotes the identity mapping under $\sigma$.

Figure 2: $\epsilon$-cycle consistency of the cycle $(i,j,k)$ with respect to $i$ on the unit 3-sphere $S^3$ embedded in $\mathbb{R}^4$. $\lambda(i)$ is an arbitrary rotation, and $\hat{\pi}$ denotes the equivalent transformation of $\hat{\sigma}$ on the manifold. $D(\lambda(i), \epsilon_\pi)$ denotes the disk on $S^3$ centered at $\lambda(i)$ with geodesic radius $\epsilon_\pi$.

Generally, cycle consistency implicitly implies that all the edge measurements in the graph are precise, which rarely occurs in reality. Indeed, for most computer vision tasks, input de-noising is one of the most important steps. For example, relative camera motion measurements are essentially computed with pairwise geometric constraints in common SfM pipelines. Since these constraints are normally derived from photometric information, such as feature matching, matching outliers can bring significant noise into the measurements and thus result in corrupted, erroneous edge labelling. Most previous work uses RANSAC to detect and remove feature outliers and/or edge outliers; the process itself is, however, extremely costly and slow when the graph is sufficiently large. In this work, we propose to reduce the negative effects of noisy edges and thereby robustify the solver by implicitly enforcing cycle consistency through reweighting both the vertices and the edges.

Given that a non-trivial multiple rotation averaging problem involves at least one cycle in the pose-graph, we relax Definition 2 into the following.

Definition 3 ($\epsilon$-cycle consistency). Let $\mathcal{H}$ and $\mathcal{C}_i$ be defined as in Definition 2. We say $\mathcal{H}$ is $\epsilon$-cycle consistent if

$$\max_{i \in \mathcal{V},\ \mathcal{C}_i^k \in \mathcal{C}_i} d\Big(\sigma(i, i_1) \cdot \prod_{j=1}^{m_k - 1} \sigma(i_j, i_{j+1}) \cdot \sigma(i_{m_k}, i),\ I_\sigma\Big) \leq \epsilon. \tag{6}$$

To visualize the geometric meaning of Definition 3, let us consider a unit 3-sphere $S^3$ and a simple cyclic graph with vertices $(i,j,k)$. Given the mutual relative rotation measurements $\hat{\sigma}(i,j)$, $\hat{\sigma}(j,k)$, $\hat{\sigma}(k,i)$, there exists a set of 3-sphere rotations $\hat{\pi}(i,j)$, $\hat{\pi}(j,k)$, $\hat{\pi}(k,i)$, which are isomorphic to their $\mathrm{SO}(3)$ counterparts*. As we only consider the relative rotations here, without loss of generality, assume an arbitrary $\lambda(i) \in S^3$. In this case, $\hat{\lambda}(j) = \lambda(i)\hat{\pi}(i,j)$ is on the 3-sphere, and so are the subsequent rotations. After successive rotations, it is desired to have $\hat{\lambda}(k)\hat{\pi}(k,i) \in D(\lambda(i), \epsilon)$, where $D(p,r) := \{q \in S^3 \mid d(q,p) \leq r\}$. That is, the computed $\hat{\lambda}(i)$ should lie in the $\epsilon$-disk of $\lambda(i)$ on $S^3$. See Fig. 2 for the visualization. Moreover, we can now rewrite Eq. 6 for $(i,j,k)$ in its sphere form, that is,

$$d_\pi\big(\hat{\pi}(i,j) + \hat{\pi}(j,k) + \hat{\pi}(k,i),\ I_\pi\big) \leq \epsilon_\pi, \tag{7}$$

where $d_\pi$, $I_\pi$, $\epsilon_\pi$ are defined on $S^3$ in a manner analogous to those defined above on $\mathrm{SO}(3)$. More details on the isomorphism can be found in the supplementary materials.

The main reason for exploiting the $S^3$ isomorphism is that the proper vector field associated with the sphere enables the reweighting scheme, such that the redundant error distribution over all measurements in cycles can be efficiently propagated. The weighting can also be uniquely projected back to $\mathrm{SO}(3)$ with $(\alpha^{w}, \beta^{w}, \gamma^{w})$, where $(\alpha, \beta, \gamma)$ denotes the corresponding Euler angles.

Assumption 4 (reweighted $\epsilon$-cycle consistency condition).
Given $\mathcal{H}$, for all $(i,j)$ for which there exists at least one cycle $\mathcal{C} \subseteq \mathcal{H}$ such that $(i,j) \in \mathcal{E}_{\mathcal{C}}$, there exists a set of weights $w_{ij}$ associated with the measurements such that the $\epsilon$-cycle consistency is satisfied for $\mathcal{H}$, for an arbitrary $\epsilon > 0$. Formally,

$$d_\pi\Big(\sum_{(i,j) \in \mathcal{C}} w_{ij}\hat{\pi}(i,j),\ I_\pi\Big) \leq \epsilon_\pi, \tag{8}$$

or equivalently

$$d\Big(\prod_{(i,j) \in \mathcal{C} \setminus (k,i)} \hat{\sigma}^{w}(i,j) \cdot \hat{\sigma}^{w}(k,i),\ I_\sigma\Big) \leq \epsilon. \tag{9}$$

In the following analysis, we assume that $\mathcal{H}$ is connected, though this may not hold in generalized group synchronization problems. In SfM tasks, isolated vertices in the pose-graph rarely occur and are discarded when they do. We further assume that more than $m/2$ edges are contained in at least one cycle, where $m = |\mathcal{E}|$. We summarize the assumptions in the following proposition.

Proposition 5. Given a connected $\mathcal{H}$ with $|\mathcal{E}| = m$, assume that there are $p\ (> m/2)$ edges which are contained in at least one cycle, and denote the set of these edges by $\mathcal{E}_p$. Then, if Assumption 4 holds for a given $\epsilon$, there exists $\{w_{ij}\}$ such that

$$w_{ij}\, d\big(\hat{\sigma}(i,j), \lambda_i^{-1}\lambda_j\big) < \epsilon, \quad \forall (i,j) \in \mathcal{E}_p. \tag{10}$$

* We are using sloppy notation here; formally, there exists a mapping $\pi: S^3 \to S^3$ such that $\pi \cong \mathrm{SO}(3)$.

Proof. Assume that there does not exist a self-consistent corrupted cycle. Then, for a given edge $(i,j)$, within any cycle $\mathcal{C} \subseteq \mathcal{H}$ that contains $(i,j)$, Assumption 4 gives

$$d\Big(\prod_{(i,j) \in \mathcal{C} \setminus (k,i)} \hat{\sigma}^{w}(i,j) \cdot \hat{\sigma}^{w}(k,i),\ I_\sigma\Big) \leq \epsilon, \tag{11}$$

which is equivalent to

$$d\Big(\lambda(i) \prod_{(i,j) \in \mathcal{C} \setminus (k,i)} \hat{\sigma}^{w}(i,j) \cdot \hat{\sigma}^{w}(k,i),\ \lambda(i)\Big) \leq \epsilon \tag{12}$$

for any $\lambda(i)$ as defined. Denote by $\delta_{ij}$ the deviation error on edge $(i,j)$. Then, given the error upper bound, the worst-case scenario for $\delta_{ij}$ is that $\delta_{rs} = 0$ for any other $(r,s) \in \mathcal{C}$, i.e., $\hat{\sigma}^{w}(r,s) = \hat{\sigma}(r,s) = \sigma(r,s)$, where the weights on all other edges are trivial. The inequality above can thus be rewritten as

$$d\big(\lambda(j)\hat{\sigma}^{w}(j,i), \lambda(i)\big) = w_{ij}\, d\big(\hat{\sigma}(j,i), \lambda(j)^{-1}\lambda(i)\big) \leq \epsilon. \tag{13}$$

Lemma 6. Let $\mathcal{H}$ be as defined in Prop. 5. Then the $(m - p)$ 'acyclic' edges must belong to some tree with root contained in $\mathcal{V}_p$, where $\mathcal{V}_p := \{i \in \mathcal{H} \mid \exists j \in \mathcal{H}\ \text{s.t.}\ (i,j) \in \mathcal{E}_p\}$.

The lemma above immediately implies that only the optimization conducted on $\mathcal{H}_p$ is effective, due to the lack of measurement redundancy in $\mathcal{H}_{n \setminus p}$, i.e., the cost function on $\mathcal{H}_{n \setminus p}$ can always be made zero (i.e., trivial optimization). We summarize this observation in the following lemma.

Lemma 7. Let $\mathcal{H}$ and $\mathcal{E}_p$ be as defined in Prop. 5. Then solving Eq. 4 on $\mathcal{H}$ is equivalent to solving it on $\mathcal{H}_p := (\mathcal{V}_p, \mathcal{E}_p, \lambda, \sigma)$.

Combining Prop. 5 and the lemmas above, we make the following claim on the convexity of solving Eq. 4.

Theorem 8 (local convexity). The multiple rotation averaging problem defined on $\mathcal{H}^{w}$ is equivalent to that defined on $\mathcal{H}^{w}_{p}$, and is locally convex everywhere on the domain of some cost function $\rho$ if $p \geq c(1 - \epsilon)^{q} n$, where $c, q > 0$ depend on $\rho$, and $\mathcal{H}^{w}$ is the reweighted $\mathcal{H}$ according to Prop. 5, defined as $\mathcal{H}^{w} := (\mathcal{V}, \mathcal{E}^{w}, \hat{\sigma}, \lambda)$.

To avoid ambiguity in notation, note that $(\mathcal{V}, \mathcal{E}^{w}, \hat{\sigma})$ is equivalent to $(\mathcal{V}, \mathcal{E}, \hat{\sigma}^{w})$, with the weights defined in different subspaces. It has been well established that the (local) convexity of the rotation averaging problem depends on the graph connectedness and the noise level of the input measurements. Intuitively, it is more difficult to reach the solver optimum with a weakly-connected graph† and a large number of erroneous measurements. Moreover, consider an edge $(i,j)$ which is present in $k$ cycles in $\mathcal{H}$; then the weighted measurement $\hat{\sigma}^{w}(i,j)$ gets closer to the noise-free ground truth $\sigma^{*}(i,j)$ as $k$ increases. A detailed proof of Thm. 8 is provided in the supplementary materials.

† Note that weak connectedness here means that not sufficiently many vertices are connected, which is not the same as the usual graph-theoretic meaning.

Given the local convexity established by Thm. 8, it immediately follows that there exists a globally optimal solution to the objective function with some cost function $\rho$, which leads to the following statement.

Theorem 9 (cost function). Consider the optimization problem defined in Eq. 3, and denote by $\lambda^{*}$ the optimal solution set. Then there exists a convex, differentiable $\rho: \mathbb{R} \to \mathbb{R}$ such that Eq. 4 converges to $\lambda^{*}_{\rho} \sim \lambda^{*}$, with equality up to a global action.

In developing the iteration updates, $\ell_1$ and $\ell_2$ cost functions have been prevalent in previous approaches. Among these, $\ell_1$ shows stronger robustness in handling noisy or corrupted problems, but it tends to take significantly longer on large-scale problems. While the $\ell_2$ cost function shows superior convergence speed, $\ell_2$-type cost functions are more sensitive to outliers, as the gradient of the general $\ell_2$ cost is unbounded. In our approach, we use $\rho(x) = x \exp(\tau x)$ as the cost function, where $\tau$ represents the penalty parameter. The derivation of the update rules according to the conventional IRLS algorithm is given in §4.2.
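To make the role of the penalty parameter concrete, the following minimal Python sketch evaluates the cost exactly as written above, $\rho(x) = x \exp(\tau x)$, under the $\tau = 1/k$ schedule that Alg. 1 uses. The function names and the sample residual are illustrative assumptions and are not taken from the authors' implementation.

```python
import math

def rho(x, tau):
    """Cost rho(x) = x * exp(tau * x) for a non-negative residual angle x (radians)."""
    return x * math.exp(tau * x)

def tau_schedule(k):
    """Penalty parameter at iteration k >= 1, following the tau = 1/k rule."""
    return 1.0 / k

x = 0.5  # hypothetical residual angle in radians
iterations = (1, 2, 5, 50, 500)
costs = [rho(x, tau_schedule(k)) for k in iterations]

for k, c in zip(iterations, costs):
    print(f"k={k:4d}  tau={tau_schedule(k):.4f}  rho={c:.6f}")
```

For a fixed residual, the cost decreases monotonically with the iteration count and approaches the plain residual $x$ as $\tau \to 0$, so under this reading the penalty is strongest in the early iterations and nearly linear in the residual at the end.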

4. Method

In this section we introduce the optimization scheme we propose for solving the multiple rotation averaging problem, with an emphasis on measurement correction. Specifically, we first describe the measurement de-noising algorithm based on the cycle consistency constraints in §4.1, followed by the implementation of the IRLS solver in our proposed scheme in §4.2.

Recall Assumption 4, where we presume that $\mathcal{H}$ is equipped with $\epsilon$-cycle consistency in solving Eq. 4. Enforcing cycle consistency benefits the subsequent solver (e.g., IRLS, ADMM, etc.) significantly for the following reasons. First, Lie groups are well known to be sensitive to perturbations, i.e., even small noise on a few elements of the rotation matrix can result in completely different rotations. Due to the nature of the multiple rotation averaging problem, however, the measurements are normally noisy. In the viewgraph of large-scale problems, the measurement noise on some edges can propagate progressively over the whole graph, resulting in unsatisfactory solutions. Moreover, for multiple rotation averaging there does not exist a canonical direct solver, and all iterative solvers rely heavily on a reasonably good initialization. With severely noisy measurements and a trivial initialization, convergence of the iterative solver tends to be excessively slow, or the solver eventually fails to converge.

In this work, we address all the issues mentioned above by introducing measurement de-noising (or measurement reweighting), which is conducted before initiating the IRLS solver. Imposing cycle consistency essentially amounts to conducting single rotation averaging over the cyclic subgraph. Within each cycle, the deviation of the erroneous edge is constrained by redundant measurements and thereafter diluted by allowing a set of weights on all the edges. These edge weights are then normalized within the scale of the whole graph and used as the initialization in the subsequent IRLS iterations.

In detail, given a graph $\mathcal{H}$ with relative rotation measurements as group ratios, we first conduct cycle detection to find the set of cycles $\mathcal{C} := \{\mathcal{C}_k \mid \mathcal{C}_k \subseteq \mathcal{H}\}$. As a cycle might contain a large number of edges, which makes the computation excessively costly, in practice we instead randomly pick three vertices $\lfloor \sqrt{|\mathcal{V}_{\mathcal{C}_k}|} \rfloor$ times. Assume, for example, that $i, j, k \in \mathcal{V}_{\mathcal{C}_k}$ are selected in an iteration; then $\hat{\sigma}(i,j) = \prod_{(r,s) \in \mathrm{path}_i^j} \hat{\sigma}(r,s)$, where $\mathrm{path}_i^j$ denotes the connected path from $i$ to $j$, and we define $\hat{\sigma}(j,k)$ and $\hat{\sigma}(k,i)$ in the same manner. With the weights computed as depicted in §3.1, we end up with 'triangle consistency' on the current vertices. These weights are then propagated along the corresponding paths with scale normalization. In the experiments we simply average the path weight over the degree of the path and obtain a satisfactory weight initialization. In the weight solution, we use the $\ell_1$ norm for the residual computation to further increase the robustness of our proposed scheme. In contrast with edge removal schemes, we implicitly penalize the erroneous measurements by imposing smaller weights on them, so as to avoid the high computational cost of edge noise removal.

It is common for an edge to appear in multiple cycles. In this case, the final edge weight is calculated as the weighted mean with respect to the corresponding cycle sizes, i.e., consider an edge $(i,j)$ with a weight set $\{w_{ij}^{k}\}$ according to cycle consistency in different $\mathcal{C}_k$'s; then the final weight for $(i,j)$ is

$$w_{ij} = \sum_{k} |\mathcal{E}_{\mathcal{C}_k}|\, w_{ij}^{k} \Big/ \sum_{k} |\mathcal{E}_{\mathcal{C}_k}|. \tag{14}$$

To solve Eq. 4, we exploit the conventional IRLS solver with the cost function $\rho(x) = x \exp(\tau x)$. As mentioned before, it is well known that both the solution accuracy and the convergence speed rely heavily on the initialization. Since we already have a set of reasonable weights owing to the enforced cycle consistency, our solver is significantly more robust than previous work relying solely on RANSAC filtering. Comparisons with the RANSAC-based initialization schemes are provided in §5.3.2.

We now briefly describe the update rule we employ in constructing the IRLS solver; the full derivation is provided in the supplementary materials. The algorithm is provided in Alg. 1.

Algorithm 1: MRA-Robust IRLS

Input: Set of relative transformation measurements $\{\hat{\sigma}(i,j)\}$, threshold $\alpha$
Output: Set of absolute rotations $\{\lambda_i\}$
Initialization: Set residual $\mathrm{res} = 10$, iteration number $k = 1$, $\tau = 1$, $\{\lambda_i\}$ = identity matrix, $w_{ij}$ from the cycle consistency step
while $\mathrm{res} > \alpha$ do
  1. $k = k + 1$;
  2. $\delta_{ij} = \|\delta_j^{-1}\lambda_j^{-1}\sigma(i,j)\lambda_i\delta_i\|$;
  3. $r_{ij} \leftarrow \delta_{ij}\exp(\tau\delta_{ij})$;
  4. $\phi_{ij} \leftarrow (1 + r_{ij})\exp(\tau r_{ij})$;
  5. $h_{ij} \leftarrow (1 + \tau r_{ij} + \tau)\exp(\tau r_{ij})$;
  6. $s \leftarrow \phi^{\top}\phi / \|\phi^{\top} h \phi\|$;
  7. $w_{ij} \leftarrow s\phi_{ij}$;
  8. $\lambda_i \leftarrow \sum w_{ij} \dfrac{\log(\lambda_i^{-1}\lambda_j)}{\|\log(\lambda_i^{-1}\lambda_j)\|}$;
  9. $\mathrm{res} = \sum w_{ij}\delta_{ij}\exp(\tau w_{ij}\delta_{ij})$;
  10. $\tau = 1/k$;
end

Denote by $\|\cdot\|$ the equivalent angle for $d(\cdot,\cdot)$, and write $\lambda_i$ for $\lambda(i)$. Recall our objective function Eq. 4, and assume that we update $\lambda$ with $\delta\lambda$, i.e., for $\lambda_i$ the updated value is $\lambda_i\delta_i$, where $\delta_i$ denotes the update. Then, at one iteration, solving Eq. 4 is equivalent to minimizing

$$\sum_{(i,j) \in \mathcal{E}_p} \rho\big(\|\delta_j^{-1}\lambda_j^{-1}\sigma(i,j)\lambda_i\delta_i\|\big). \tag{15}$$

For an edge $(i,j)$, the residual $r_{ij}$ is thus

$$r_{ij} = \rho\big(\|\delta_j^{-1}\lambda_j^{-1}\sigma(i,j)\lambda_i\delta_i\|\big) = \rho(\delta_{ij}). \tag{16}$$

Denote by $\phi_{ij}$ the gradient and by $h_{ij}$ the Hessian of $\rho$. The step size $s$ is computed as

$$s = \|\phi\|^{2} / \|\phi^{\top} h \phi\|. \tag{17}$$

The updated weight $w_{ij}$ and $\lambda_i$ are then

$$w_{ij} = s\phi_{ij}, \tag{18}$$

$$\lambda_i = \sum_{(i,j) \in \mathcal{E}_p} w_{ij} \frac{\log(\lambda_i^{-1}\lambda_j)}{\|\log(\lambda_i^{-1}\lambda_j)\|}. \tag{19}$$

In practice, we observe that with $\tau_k = 1/k$, where $k$ denotes the iteration number, the convergence is quadratic at first and becomes linear at the end. This behavior is expected from the construction of $\rho(x)$, as we desire the penalty to be high at the beginning and subtle by the end of the iterations.
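As a small illustration of the cycle-size-weighted mean in Eq. 14, the sketch below aggregates the weights that one edge receives from several cycles. The function name and the sample numbers are hypothetical; only the averaging rule itself is taken from the text.

```python
def aggregate_edge_weight(cycle_weights):
    """Cycle-size-weighted mean of per-cycle weights for one edge (Eq. 14).

    cycle_weights: list of (num_edges_in_cycle, weight_from_that_cycle) pairs.
    """
    total_size = sum(size for size, _ in cycle_weights)
    if total_size == 0:
        raise ValueError("the edge does not belong to any cycle")
    return sum(size * w for size, w in cycle_weights) / total_size

# Hypothetical edge that appears in a triangle (3 edges) and in a 5-cycle.
w_ij = aggregate_edge_weight([(3, 0.9), (5, 0.5)])
print(w_ij)  # (3 * 0.9 + 5 * 0.5) / (3 + 5)
```

Under this rule, a weight obtained from a larger cycle contributes proportionally more to the final edge weight than one obtained from a small cycle.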

5. Experimental Results

System Conﬁguration

All of our experiments are conducted on a PC with Intel(R) i7-7700 3.6GHz processors, 8 threads and 64GB memory. The bundle adjustment is conducted with the Ceres library [34].

Methods and Datasets

We compare our proposed approach with recent state-of-the-art approaches including [14, 15, 31]. The approaches are tested on the Photo Tourism Dataset [35] and the KITTI Odometry [36]. In our experiments, all the optimization steps are modified from public libraries [34, 37, 38] in C++.

We provide the accuracy in mean and median degree error, the runtime, and the iteration numbers in Table 1, where Ours-$\ell_2$ denotes the corresponding result using the $\ell_2$ cost function instead of the original $\rho$ in our proposed algorithm. Our proposed algorithm achieves superior performance on almost all of the datasets.

Accuracy

We measure the accuracy of the methods using the mean and median degree error by convention. We notice that the result obtained with the $\ell_2$ cost function bears slightly greater error than that with the cost function $\rho$ of our original scheme. The reason may lie in the differences in both the weight computation and the influence function. As mentioned before, the influence function of the $\ell_2$ cost is unbounded, so that during initialization the penalty on the measurement deviation is too harsh. In that case, some edges might get over-penalized, in the sense that 'good' information gets underweighted in the optimization. Moreover, as $\rho$ in our scheme lowers the penalty term as the iterations go on, the optimization is better refined when the solver is sufficiently close to the solution set.

Speed

It can be seen that our proposed algorithm is slightly slower than the fastest approach, IRLS, on most of the datasets. As the plain IRLS iterates without a robustifier, it involves far fewer variables than the other approaches. It is also worth noting that our algorithm takes the fewest iterations on almost all of the datasets, which largely results from the weight initialization scheme: IRLS is well initialized at the beginning, which leads to faster convergence.

Table 1: Experiment results on the Photo Tourism Dataset [35]. In the table, $\Delta\overline{\mathrm{deg}}$ and $\Delta\widehat{\mathrm{deg}}$ denote the mean and median error in degrees, respectively; runtime is in seconds, and the number of iterations denotes the iterations to initialize + the iterations for the calculation. Full results are in the supplementary materials.

Columns, in order: IRLS [14] | Robust-IRLS [15] | MPLS [31] | Ours-l | Ours

Alamo: $\bar{\Delta}_{deg}$ 3.64 3.67 3.44 3.54; $\hat{\Delta}_{deg}$ 1.30 1.32; runtime
Ell. Is.: $\bar{\Delta}_{deg}$ 3.04 2.71 2.61 2.39; $\hat{\Delta}_{deg}$ 1.06 0.93 0.88 0.82; runtime 3.2 2.8 4.0 2.7; iteration 10+9 10+13 6+11 6+12
Mont. N.D: $\bar{\Delta}_{deg}$ 1.25 1.22; $\hat{\Delta}_{deg}$ 0.58 0.57 0.51 0.51; runtime
Not. Da.: $\bar{\Delta}_{deg}$ 2.63 2.26 2.06 2.19; $\hat{\Delta}_{deg}$ 0.78 0.71 0.67 0.71; runtime
Picca.: $\bar{\Delta}_{deg}$ 5.12 5.19 3.93 4.08; $\hat{\Delta}_{deg}$ 2.02 2.34 1.81 1.83; runtime
NYC Lib: $\bar{\Delta}_{deg}$ 2.71 2.66 2.63 2.63; $\hat{\Delta}_{deg}$ 1.37 1.30 1.24 1.20; runtime
P.D.P: $\bar{\Delta}_{deg}$ 4.1 3.99 3.73 3.77; $\hat{\Delta}_{deg}$ 2.07 2.09 1.93 1.85; runtime
Rom. For.: $\bar{\Delta}_{deg}$ 2.66 2.69 2.62 2.65; $\hat{\Delta}_{deg}$ 1.58 1.57
T.o.L: $\bar{\Delta}_{deg}$ 3.42 3.41 3.16 3.23; $\hat{\Delta}_{deg}$ 2.52 2.50 2.20 2.12; runtime 2.6; iteration 10+8 10+12 6+7 6+7
Uni. Sq.: $\bar{\Delta}_{deg}$ 6.77 6.77 6.54 6.57; $\hat{\Delta}_{deg}$ 3.66 3.85
Yorkm.: $\bar{\Delta}_{deg}$ 2.6 2.45 2.47 2.45; $\hat{\Delta}_{deg}$ 1.59 1.53 1.45 1.47; runtime
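The mean and median angular errors reported in Table 1 can be computed along the following lines. This is a minimal sketch, not the evaluation code used in the experiments: it assumes rotations are stored as unit quaternions `(w, x, y, z)` and that the estimates have already been aligned to the ground-truth frame. The helper names `angular_error_deg`, `error_summary` and `axis_angle_quat` are hypothetical.

```python
import math
from statistics import mean, median


def angular_error_deg(q_est, q_gt):
    """Geodesic angle in degrees between two unit quaternions (w, x, y, z)."""
    # The absolute value makes q and -q (the same rotation) compare as equal.
    dot = abs(sum(a * b for a, b in zip(q_est, q_gt)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


def error_summary(estimates, ground_truth):
    """Return (mean error, median error) in degrees over all cameras."""
    errors = [angular_error_deg(e, g) for e, g in zip(estimates, ground_truth)]
    return mean(errors), median(errors)


def axis_angle_quat(axis, angle_deg):
    """Unit quaternion for a rotation of angle_deg about the given axis."""
    half = math.radians(angle_deg) / 2.0
    norm = math.sqrt(sum(a * a for a in axis))
    s = math.sin(half) / norm
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


# Toy data (assumption): three cameras whose estimates are off by 1, 2 and 6 degrees.
gt = [axis_angle_quat((0, 0, 1), 30.0)] * 3
est = [axis_angle_quat((0, 0, 1), 30.0 + d) for d in (1.0, 2.0, 6.0)]
mean_err, median_err = error_summary(est, gt)
```

In this toy case a single large deviation raises the mean more than the median, which is one reason both statistics are reported.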

[Figure 3 panels: (a) Sequence 00, (b) Sequence 02, (c) Sequence 05, (d) Sequence 09. Axes: x-translation [m] vs. y-translation [m]. Legend: Ground Truth, Our method, Difference.]

Figure 3: Our proposed approach produces trajectory estimates of comparably high quality on the KITTI dataset, which contains large-scale outdoor scenes. Specifically, our approach displays superior loop-closing capability on Seq. 02, where many loops are present. The accumulated drift is negligible throughout the trajectory estimation. Results on the rest of the dataset are provided in the supplementary materials.

To test the qualitative performance of our algorithm, we run experiments on the VIO benchmark KITTI to further demonstrate the robustness of our proposed solver. Unlike internet photo datasets such as [35], the pose-graph for KITTI contains many more small cycles but few large-scale cycles: the local connectivity is higher while the global connectivity is very low. To estimate the camera trajectory, we use a standard SfM pipeline in which we fix the camera orientations to our outputs and estimate the translations.

Figures of the ablation comparisons are provided in the supplementary materials; we state the main observations here. In the comparisons we use IRLS [15] as the baseline. We produce the synthetic data by applying perturbations from ◦ to ◦ uniformly to the ground truth values of [35].

We further conduct an ablation study on the effect of the cost function choice, with Robust IRLS [15] as the baseline, on synthetic data with perturbed measurements. Our proposed algorithm with the l, l and Huber cost functions shows superior accuracy and faster convergence over [15] in every case.

We further test the effect of the cycle consistency constraints by running our solver with RANSAC, with RANSAC + cycle constraints, and with the cycle constraints alone, using [14] as the baseline. There is little difference between RANSAC + cycle constraints and the cycle constraints alone.

8. Conclusion

In this paper, we propose a robust framework to solve the multiple rotation averaging problem, especially in cases where a significant amount of noisy measurements is present. By introducing the $\epsilon$-cycle consistency term into the solver, we enable a robust initialization scheme to be implemented in the IRLS solver. Instead of conducting costly edge removal, we implicitly constrain the negative effect of erroneous measurements by reducing their weights, so that IRLS failures caused by poor initialization can be effectively avoided. Combined with our novel cost function, our proposed method significantly outperforms state-of-the-art approaches in both accuracy and efficiency.

Supplementary Materials

Xinyi Li, Department of Computer and Information Sciences, Temple University, Philadelphia, USA { [email protected] }
Haibin Ling, Department of Computer Science, Stony Brook University, Stony Brook, USA { [email protected] }

In this supplementary material, we expand on the discussions of the main paper in more detail. We first present properties and results on rotation groups and quaternions, along with their representations on the unit 3-sphere (Fig. 2 in the main paper), in §7. We conclude by providing additional experimental results: §8.1 and §8.2 give the quantitative and qualitative results on Photo Tourism [35] and KITTI [36], respectively, and a detailed analysis of the ablation study is provided in §8.3.

7. Rotations, Quaternions and 3-sphere

To generalize our analysis and keep the discussion concise, we adopt the notation $\{R_1, R_2, \cdots\}$, where $R_i \in SO(3)$, instead of $\{\lambda(1), \lambda(2), \cdots\}$ used in the main paper. Similarly, $\{R_{ij}\}$ denotes the same transformations as $\sigma(i, j)$. As all rotations can be uniquely (up to a scale) represented by unit quaternions, we further exploit the quaternion representation of $\{R_i\}$. Specifically, let a unit quaternion $q_i$ denote the same rotation as $R_i$. Recall that

$$q_i = s_i + \alpha_i \mathbf{i} + \beta_i \mathbf{j} + \gamma_i \mathbf{k}, \tag{20}$$

where

$$\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i}\mathbf{j}\mathbf{k} = -1, \tag{21}$$
$$\mathbf{i}\mathbf{j} = -\mathbf{j}\mathbf{i} = \mathbf{k}, \tag{22}$$
$$\mathbf{j}\mathbf{k} = -\mathbf{k}\mathbf{j} = \mathbf{i}, \tag{23}$$
$$\mathbf{k}\mathbf{i} = -\mathbf{i}\mathbf{k} = \mathbf{j}, \tag{24}$$
$$s_i^2 + \alpha_i^2 + \beta_i^2 + \gamma_i^2 = 1. \tag{25}$$

We will use the concrete notation $q = [s, \mathbf{v}]$ henceforth, where $s$ denotes the real part of $q$ and $\mathbf{v} = [\alpha, \beta, \gamma]$ denotes the imaginary part. Let $q_1$ denote an arbitrary rotation $R_1 \in SO(3)$; then there exists a set of basis elements $\{e_i^{(1)}\}$ such that $q_1 = \sum_i a_i e_i^{(1)}$ for $a_i \in \mathbb{R}, \forall i$. Then for $q_{12}$, which represents the relative rotation $R_{12}$, we have $R_2 = R_{12} R_1$, which in quaternionic form is

$$q_2 = q_{12}\, q_1\, q_{12}^{-1} = q_{12} \Big( \sum_i a_i e_i^{(1)} \Big) q_{12}^{-1}. \tag{26}$$

While Eq. 26 cannot be directly expanded, it suffices to show that $q_2$ can be spanned by the basis rotated by $q_{12}$. Let $\{e_i^{(12)}\}$ denote the basis of $q_{12}$; it immediately follows that the basis rotated from $\{e_i^{(1)}\}$ is $e_i^{(2)} = e_i^{(12)} e_i^{(1)} (e_i^{(12)})^{-1}$. Substituting into Eq. 26,

$$q_2 = \sum_i a_i\, e_i^{(12)} e_i^{(1)} (e_i^{(12)})^{-1} = \sum_i a_i e_i^{(2)}. \tag{27}$$

The generalized form of the equation above is thus

$$q_n = \sum_i a_i\, e_{n-1,n}\, e_{n-2,n-1} \cdots e_i^{(12)} e_i^{(1)} (e_i^{(12)})^{-1} \cdots e_{n-2,n-1}^{-1}\, e_{n-1,n}^{-1}. \tag{28}$$

Eq. 28 provides the equivalence between the progressive rotation multiplication and the summation with respect to the transformed quaternion basis. Now we can restate Def. 3 in quaternion form, which essentially denotes the geodesic curve length on the surface of the unit 3-sphere $S^3$.
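The quaternion identities in Eqs. 21 to 25 and the conjugation form used in Eq. 26 can be checked numerically. The sketch below is illustrative only: quaternions are stored as tuples `(s, alpha, beta, gamma)`, and the helper names `qmul`, `qconj`, `axis_angle_quat` and `conjugate_by` are hypothetical. The two example rotations (40 degrees about the x-axis, 90 degrees about the z-axis) are arbitrary choices.

```python
import math


def qmul(p, q):
    """Hamilton product of two quaternions stored as (s, alpha, beta, gamma)."""
    s1, a1, b1, c1 = p
    s2, a2, b2, c2 = q
    return (
        s1 * s2 - a1 * a2 - b1 * b2 - c1 * c2,
        s1 * a2 + a1 * s2 + b1 * c2 - c1 * b2,
        s1 * b2 - a1 * c2 + b1 * s2 + c1 * a2,
        s1 * c2 + a1 * b2 - b1 * a2 + c1 * s2,
    )


def qconj(q):
    """Conjugate; for a unit quaternion this equals the inverse."""
    s, a, b, c = q
    return (s, -a, -b, -c)


def axis_angle_quat(axis, angle_deg):
    """Unit quaternion for a rotation of angle_deg about the given axis."""
    half = math.radians(angle_deg) / 2.0
    norm = math.sqrt(sum(x * x for x in axis))
    k = math.sin(half) / norm
    return (math.cos(half), axis[0] * k, axis[1] * k, axis[2] * k)


def conjugate_by(q_rel, q):
    """Compute q_rel * q * q_rel^{-1} for a unit quaternion q_rel."""
    return qmul(qmul(q_rel, q), qconj(q_rel))


# Basis elements i, j, k and the scalar -1.
I = (0.0, 1.0, 0.0, 0.0)
J = (0.0, 0.0, 1.0, 0.0)
K = (0.0, 0.0, 0.0, 1.0)
NEG_ONE = (-1.0, 0.0, 0.0, 0.0)

# Conjugating a 40-degree rotation about x by a 90-degree rotation about z
# keeps the real part and the rotation angle, and moves the axis from x to y.
q1 = axis_angle_quat((1, 0, 0), 40.0)
q12 = axis_angle_quat((0, 0, 1), 90.0)
q2 = conjugate_by(q12, q1)
```

The result `q2` is again a unit quaternion with the same real part as `q1`; only the imaginary part (the rotation axis) changes.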
Recall that we define the $\epsilon$-cycle consistency as

$$d\Big( \sigma(i, i_1) \cdot \prod_{j=1}^{m_k - 1} \sigma(i_j, i_{j+1}) \cdot \sigma(i_{m_k}, i),\; I_\sigma \Big) \le \epsilon. \tag{29}$$

Lemma 10. Let $q_{ij}$ denote the relative rotation as defined in Eq. 29, i.e., $q_{ij} = \sigma(i, j), \forall i, j$. Then, given $q_1$, there exists an angle of rotation $\theta$ such that

$$d_\angle(q_n, q_1) \le \theta \tag{30}$$

is equivalent to Eq. 29, where $q_n$ is the quaternion after $n$ rotations from $q_1$.

The proof follows by realizing that any quaternion can be written in terms of a rotation axis and a rotation angle $\theta$. It immediately follows that the line determined by the rotation axis is invariant under the rotation. Assume without loss of generality that $q_n$ has the same rotation axis as $q_1$; it then suffices to show that the pure quaternion part of $q_1$, after transformation by the successive $q_{ij}$, is also a pure quaternion. Consider $q_j = q_{ij}\, q_i\, q_{ij}^{-1}$; we thus have

$$q_{ij}\, q_i\, q_{ij}^{-1} = q_{ij} (s_i + \mathbf{v}_i) q_{ij}^{-1} = q_{ij}\, s_i\, q_{ij}^{-1} + q_{ij}\, \mathbf{v}_i\, q_{ij}^{-1} = s_i + q_{ij}\, \mathbf{v}_i\, q_{ij}^{-1}, \tag{31}$$

and the lemma follows.
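The $\epsilon$-cycle consistency test of Eq. 29 can be illustrated with a small numerical sketch. It is not the implementation used in the paper: it assumes the relative rotation on edge $(i, j)$ is composed by plain quaternion multiplication, $q_{ij} = q_j q_i^{-1}$, that $d$ is the geodesic angle to the identity, and that $\epsilon$ is given in degrees. The names `cycle_error_deg` and `is_eps_consistent` and the threshold of 1 degree are hypothetical.

```python
import math


def qmul(p, q):
    """Hamilton product of two quaternions stored as (s, alpha, beta, gamma)."""
    s1, a1, b1, c1 = p
    s2, a2, b2, c2 = q
    return (
        s1 * s2 - a1 * a2 - b1 * b2 - c1 * c2,
        s1 * a2 + a1 * s2 + b1 * c2 - c1 * b2,
        s1 * b2 - a1 * c2 + b1 * s2 + c1 * a2,
        s1 * c2 + a1 * b2 - b1 * a2 + c1 * s2,
    )


def qconj(q):
    s, a, b, c = q
    return (s, -a, -b, -c)


def axis_angle_quat(axis, angle_deg):
    half = math.radians(angle_deg) / 2.0
    norm = math.sqrt(sum(x * x for x in axis))
    k = math.sin(half) / norm
    return (math.cos(half), axis[0] * k, axis[1] * k, axis[2] * k)


def cycle_error_deg(relative_rotations):
    """Compose the relative rotations along a closed cycle, in traversal
    order, and return the rotation angle of the product in degrees."""
    total = (1.0, 0.0, 0.0, 0.0)
    for q_rel in relative_rotations:
        total = qmul(q_rel, total)
    return math.degrees(2.0 * math.acos(min(1.0, abs(total[0]))))


def is_eps_consistent(relative_rotations, eps_deg):
    return cycle_error_deg(relative_rotations) <= eps_deg


# Toy 3-cycle 0 -> 1 -> 2 -> 0 built from known absolute rotations.
absolute = [
    axis_angle_quat((0, 0, 1), 10.0),
    axis_angle_quat((0, 1, 0), 50.0),
    axis_angle_quat((1, 1, 0), -35.0),
]


def relative(i, j):
    return qmul(absolute[j], qconj(absolute[i]))


clean_cycle = [relative(0, 1), relative(1, 2), relative(2, 0)]

# Corrupt one edge by an extra 10-degree rotation.
noise = axis_angle_quat((1, 0, 1), 10.0)
corrupted_cycle = [relative(0, 1), qmul(noise, relative(1, 2)), relative(2, 0)]
```

With a single corrupted edge, the cycle error equals the corruption angle, because the remaining edges only conjugate the noise rotation and conjugation preserves the rotation angle, as in Eq. 31.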

8. Additional Experimental Results

In this section we first provide additional quantitative results on the Photo Tourism Dataset [35] in §8.1. We have tested our proposed approach with variations, e.g., with conventional cost functions and RANSAC preprocessing. Our proposed approach outperforms the state-of-the-art approaches in both speed and accuracy. We then present the qualitative results on the KITTI dataset [36], where the accurate camera trajectories demonstrate the robustness of our proposed approach on consecutive image sequences, i.e., sequences in which smaller cycles are ubiquitous in the pose-graph while large cycles are normally absent. Finally, we conduct ablation studies on the effects of cycle constraints and cost function choices in §8.3 to conclude our discussion.

In Table 2, we report the performance on the Photo Tourism Dataset [35], compared with the original IRLS [14], Robust IRLS [15], and MPLS [31]. Our proposed approach outperforms the state-of-the-art methods in both speed and accuracy on most of the datasets.

We have conducted the experiments with different cost functions and with different initialization schemes. In Table 2, 'Ours-l', 'Ours-l' and 'Ours-l' denote our proposed approach with the l, l and l cost functions, respectively. Although the variant with the l cost function achieves better accuracy than the other two, it takes notably longer processing time and more iterations. Moreover, on large-scale datasets the tradeoff is quite inefficient. For example, on the Piccadilly dataset, which contains more than 2000 images, the l variant takes more than 15% additional runtime compared with the l optimization scheme while providing only a negligible accuracy advantage. Meanwhile, our proposed method with the exponential cost function runs slightly longer than [14] but improves the accuracy markedly.

In addition, we have tested our approach with 'RANSAC'-only preprocessing and with 'RANSAC+cycle constraints'; with our proposed cycle constraint on the erroneous relative rotations, the accuracy improves substantially. Note that all of the state-of-the-art approaches we compare against run RANSAC iterations before rotation averaging to filter the measurements. Although 'Ours-RANSAC' outperforms the other approaches by a small margin on several datasets, its accuracy is rather low compared with 'Ours-R+cycle' and 'Ours'. Note also that, without our proposed initialization schemes, the optimization requires more iterations to be successfully initialized. Furthermore, the comparison between 'Ours-R+cycle' and 'Ours' validates the effect of our proposed enforcement of the cycle constraints: the two perform similarly on most of the datasets. The fact that the additional RANSAC does not improve the accuracy demonstrates the robustness of our proposed approach against measurement outliers.

In Fig. 6, we provide the camera trajectories given by our proposed approach on the KITTI dataset [36]. In detail, we first solve for the camera rotations, then keep the rotations fixed and compute the camera translations with conventional bundle adjustment [34].

In the experiments, our proposed approach shows consistent robustness when dealing with consecutive image sequences. As translation estimation includes the approximation of an unknown scale parameter, deviations in rotation estimation can be amplified catastrophically. The accurate camera trajectories thus demonstrate the high accuracy delivered by our rotation averaging scheme. Specifically, most of the sequences do not contain large cycles in the viewgraph; in such scenarios the enforcement of cycle consistency tends to yield looser constraints. However, since there exist abundant smaller cycles in the viewgraph, the constraints are denser in the subspace, which demonstrates the practicality and universality of our proposed approach.

In the following experiments, we randomly sample the ground truth values of 100 camera rotations from the Piccadilly dataset of [35] and compute the relative rotations from them. We then randomly select 10%, 20%, 30% and 40% of the relative rotations and add noise to them uniformly. We first analyze the convergence performance with different cost functions, followed by the robustness analysis with different measurement filtration schemes. For each experiment, we repeat the run 10 times with the same settings and report the mean values.
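The synthetic setup described above can be sketched as follows. This is a hedged illustration rather than the script used in the experiments: random unit quaternions stand in for the ground truth rotations sampled from Piccadilly, every camera pair is treated as an edge, and the noise bound `max_noise_deg` is a placeholder value. The function name `make_synthetic_edges` and all parameter names are hypothetical.

```python
import math
import random


def qmul(p, q):
    """Hamilton product of two quaternions stored as (s, alpha, beta, gamma)."""
    s1, a1, b1, c1 = p
    s2, a2, b2, c2 = q
    return (
        s1 * s2 - a1 * a2 - b1 * b2 - c1 * c2,
        s1 * a2 + a1 * s2 + b1 * c2 - c1 * b2,
        s1 * b2 - a1 * c2 + b1 * s2 + c1 * a2,
        s1 * c2 + a1 * b2 - b1 * a2 + c1 * s2,
    )


def qconj(q):
    s, a, b, c = q
    return (s, -a, -b, -c)


def random_unit_quat(rng):
    """Random rotation: a normalized 4D Gaussian sample."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in q))
    return tuple(x / norm for x in q)


def random_noise_rotation(rng, max_angle_deg):
    """Rotation about a random axis with angle uniform in [0, max_angle_deg]."""
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in axis))
    half = math.radians(rng.uniform(0.0, max_angle_deg)) / 2.0
    k = math.sin(half) / norm
    return (math.cos(half), axis[0] * k, axis[1] * k, axis[2] * k)


def make_synthetic_edges(num_cameras, outlier_fraction, max_noise_deg, seed=0):
    """Return (absolute rotations, relative rotations per edge, perturbed edges)."""
    rng = random.Random(seed)
    absolute = [random_unit_quat(rng) for _ in range(num_cameras)]
    pairs = [(i, j) for i in range(num_cameras) for j in range(i + 1, num_cameras)]
    # Noise-free relative rotation on edge (i, j): q_ij = q_j * q_i^{-1}.
    edges = {(i, j): qmul(absolute[j], qconj(absolute[i])) for i, j in pairs}
    # Perturb the requested fraction of edges.
    noisy = set(rng.sample(pairs, round(outlier_fraction * len(pairs))))
    for pair in sorted(noisy):
        edges[pair] = qmul(random_noise_rotation(rng, max_noise_deg), edges[pair])
    return absolute, edges, noisy


# Example: 20 cameras, 30% of the edges perturbed by up to 30 degrees.
absolute, edges, noisy = make_synthetic_edges(20, 0.3, 30.0, seed=1)
```

Repeating the call with different `seed` values and averaging the resulting errors mirrors the protocol of reporting the mean over 10 repetitions.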

With the synthetic data we test the convergence performance of Robust IRLS [15], Ours-l2, Ours-Huber and Ours under different noise and outlier levels. In all the experiments, we perturb the corresponding proportion of the ground truth relative rotations by ◦.

[Figure 4 panels: (a) Noise = 10%, (b) Noise = 20%, (c) Noise = 30%, (d) Noise = 40%. Axes: Iteration vs. Median Error in Degrees. Legend: Robust IRLS, Ours-l2, Ours-Huber, Ours.]

Figure 4: Convergence performance with different cost functions under different noise levels.

The convergence performances are given in Fig. 4. As the noise level and the number of outliers increase, our proposed solver shows strong robustness and displays stable and fast convergence. While our approach with different cost functions shows similar convergence properties when the noise level is low, the Huber cost function shows the slowest convergence when 40% of the edges have added noise. In our experiments, the variant with the Huber cost always requires more than 15 iterations to converge when the outlier ratio is relatively high. It is also worth pointing out that when a high noise level is present, the properties of our cost function appear more clearly: the penalty is high at the beginning of the optimization and decreases near the optimal point.

We also conduct experiments to analyze the erroneous-measurement adjustment procedure. In these experiments, we test Robust IRLS [15] with RANSAC (original) and with our cycle consistency in place of RANSAC, Ours with RANSAC only, and Ours, against increasing noise. In Fig. 5, we report the optimization accuracy. For the same outlier percentage, approaches that implement the cycle consistency are more robust to higher levels of noise. As the outlier percentage increases, the effect of RANSAC continues to decrease, since there are not sufficiently many inliers for RANSAC to function as expected, while cycle consistency maintains its advantage.

[Figure 5 panels: (a) Noise = 10%, (b) Noise = 20%, (c) Noise = 30%, (d) Noise = 40%. Axes: Noise in Degrees vs. Median Error in Degrees. Legend: Robust IRLS, IRLS-cycle, Ours-RANSAC, Ours.]

Figure 5: Performance with different measurement filtrations under different noise levels.

Table 2: Experiment results on the Photo Tourism Dataset [35]. In the table, $\bar{\Delta}_{deg}$ and $\hat{\Delta}_{deg}$ denote the mean and median error in degrees, respectively; runtime is in seconds, and the number of iterations is given as iterations to initialize + iterations for the calculation.

Columns, in order: IRLS [14] | Robust-IRLS [15] | MPLS [31] | Ours-l | Ours-l | Ours-l | Ours-RANSAC | Ours-R+cycle | Ours

Alamo: $\bar{\Delta}_{deg}$ 3.64 3.67 3.44 3.54 3.52 3.44 3.59; $\hat{\Delta}_{deg}$ 1.30 1.32; runtime
Ell. Is.: $\bar{\Delta}_{deg}$ 3.04 2.71 2.61 2.39 2.28; $\hat{\Delta}_{deg}$ 1.06 0.93 0.88 0.82 0.82 0.76 0.90 0.67; runtime 3.2 2.8 4.0 2.7 2.8 3.5 4.1 2.5; iteration 10+9 10+13 6+11 6+12 6+12 6+12 6+12 5+10
Mont. N.D: $\bar{\Delta}_{deg}$ 1.25 1.22 1.04 1.12 1.12 1.07 1.18; $\hat{\Delta}_{deg}$ 0.58 0.57 0.51 0.51 0.50 0.50 0.56; runtime
Not. Da.: $\bar{\Delta}_{deg}$ 2.63 2.26 2.06 2.19 2.14 2.03 2.35; $\hat{\Delta}_{deg}$ 0.78 0.71 0.67 0.71 0.70 0.68 0.79 0.65; runtime
Picca.: $\bar{\Delta}_{deg}$ 5.12 5.19 3.93 4.08 4.00 3.92 4.87; $\hat{\Delta}_{deg}$ 2.02 2.34 1.81 1.83 1.83 1.79 2.07 1.79; runtime
NYC Lib: $\bar{\Delta}_{deg}$ 2.71 2.66 2.63 2.63 2.60 2.58 2.73; $\hat{\Delta}_{deg}$ 1.37 1.30 1.24 1.20 1.20 1.17 1.33 1.19; runtime
P.D.P: $\bar{\Delta}_{deg}$ 4.1 3.99 3.73 3.77 3.74 3.69 3.94; $\hat{\Delta}_{deg}$ 2.07 2.09 1.93 1.85 1.84 1.85 1.98 1.83; runtime
Rom. For.: $\bar{\Delta}_{deg}$ 2.66 2.69 2.62 2.65 2.64 2.64 2.71; $\hat{\Delta}_{deg}$ 1.58 1.57
T.o.L: $\bar{\Delta}_{deg}$ 3.42 3.41 3.16 3.23 3.20 3.14 3.32; $\hat{\Delta}_{deg}$ 2.52 2.50 2.20 2.12 2.08 2.04 2.20 2.00; runtime 2.6; iteration 10+8 10+12 6+7 6+7 6+10 6+12 6+12 5+8
Uni. Sq.: $\bar{\Delta}_{deg}$ 6.77 6.77 6.54 6.57 6.50 6.44 6.69; $\hat{\Delta}_{deg}$ 3.66 3.85
Yorkm.: $\bar{\Delta}_{deg}$ 2.6 2.45 2.47 2.45 2.44 2.44 2.50 2.40; $\hat{\Delta}_{deg}$ 1.59 1.53 1.45 1.47 1.45 1.45 1.52 1.36; runtime
Gend. Mar.: $\bar{\Delta}_{deg}$ 39.24 39.41 44.94 42.76 39.98 38.70 40.24; $\hat{\Delta}_{deg}$ 7.07 7.12 9.87 9.55 9.26 9.12 9.93
Mad. Met.: $\bar{\Delta}_{deg}$ 5.3 4.88 4.65 4.72 4.62 4.52 5.43 4.46; $\hat{\Delta}_{deg}$ 1.78 1.88 1.26 1.28 1.19 1.14 1.64
Vien. Cat.: $\bar{\Delta}_{deg}$ 8.13 8.07 7.21 7.13 7.02 6.97 7.89; $\hat{\Delta}_{deg}$ 1.92 1.76 2.83 2.32 2.26 2.29 2.42

[Figure 6 panels: (a) Sequence 00, (b) Sequence 01, (c) Sequence 02, (d) Sequence 03, (e) Sequence 04, (f) Sequence 05, (g) Sequence 06, (h) Sequence 07, (i) Sequence 08, (j) Sequence 09, (k) Sequence 10. Axes: x-translation [m] vs. y-translation [m]. Legend: Ground Truth, Our method, Difference.]

Figure 6: Our proposed approach produces trajectory estimates of comparably high quality on the KITTI dataset, which contains large-scale outdoor scenes.

References

[1] K. Levenberg, “A method for the solution of certain non-linear problems in least squares,” Quarterly of Applied Mathematics, 1944.
[2] V. M. Govindu, “Combining two-view constraints for motion estimation,” in CVPR, 2001.
[3] M. Moakher, “Means and averaging in the group of rotations,” SIAM Journal on Matrix Analysis and Applications, 2002.
[4] Z. Cui and P. Tan, “Global structure-from-motion by similarity averaging,” in ICCV, 2015.
[5] S. Zhu, R. Zhang, L. Zhou, T. Shen, T. Fang, P. Tan, and L. Quan, “Very large-scale global SfM by distributed motion averaging,” in CVPR, 2018.
[6] S. Zhu, T. Shen, L. Zhou, R. Zhang, J. Wang, T. Fang, and L. Quan, “Parallel structure from motion from local increment to global averaging,” arXiv preprint arXiv:1702.08601, 2017.
[7] H. Cui, X. Gao, S. Shen, and Z. Hu, “HSfM: Hybrid structure-from-motion,” in CVPR, 2017.
[8] X. Li and H. Ling, “Hybrid camera pose estimation with online partitioning for slam,” RAL, 2020.
[9] D. Martinec and T. Pajdla, “Robust rotation and translation estimation in multiview reconstruction,” in CVPR, 2007.
[10] A. Locher, M. Havlena, and L. V. Gool, “Progressive structure from motion,” in ECCV, 2018.
[11] R. Hartley, J. Trumpf, Y. Dai, and H. Li, “Rotation averaging,” IJCV, 2013.
[12] C. Tang, O. Wang, and P. Tan, “Gslam: initialization-robust monocular visual slam via global structure-from-motion,” in , 2017.
[13] L. Carlone, R. Tron, K. Daniilidis, and F. Dellaert, “Initialization techniques for 3d slam: a survey on rotation estimation and its use in pose graph optimization,” in ICRA, 2015.
[14] A. Chatterjee and V. Madhav Govindu, “Efficient and robust large-scale rotation averaging,” in ICCV, 2013.
[15] A. Chatterjee and V. M. Govindu, “Robust relative rotation averaging,” PAMI, 2018.
[16] L. Kneip, R. Siegwart, and M. Pollefeys, “Finding the exact rotation between two images independently of the translation,” in ECCV, 2012.
[17] L. Kneip and H. Li, “Efficient computation of relative pose for multi-camera systems,” in CVPR, 2014.
[18] V. M. Govindu, “Robustness in motion averaging,” in ACCV, 2006.
[19] H. Yang, P. Antonante, V. Tzoumas, and L. Carlone, “Graduated non-convexity for robust spatial perception: From non-minimal solvers to global outlier rejection,” RAL, 2020.
[20] O. Enqvist, F. Kahl, and C. Olsson, “Non-sequential structure from motion,” in ICCV Workshops, 2011.
[21] F. Kahl and R. Hartley, “Multiple-view geometry under the l∞-norm,” PAMI, 2008.
[22] J. Fredriksson and C. Olsson, “Simultaneous multiple rotation averaging using lagrangian duality,” in ACCV, 2012.
[23] J. Briales, L. Kneip, and J. Gonzalez-Jimenez, “A certifiably globally optimal solution to the non-minimal relative pose problem,” in CVPR, 2018.
[24] L. Carlone, D. M. Rosen, G. Calafiore, J. J. Leonard, and F. Dellaert, “Lagrangian duality in 3d slam: Verification techniques and optimal solutions,” in IROS, 2015.
[25] A. Eriksson, C. Olsson, F. Kahl, and T.-J. Chin, “Rotation averaging and strong duality,” in CVPR, 2018.
[26] K. Wilson, D. Bindel, and N. Snavely, “When is rotations averaging hard?” in European Conference on Computer Vision. Springer, 2016, pp. 255–270.
[27] A. P. Bustos, T.-J. Chin, A. Eriksson, and I. Reid, “Visual SLAM: Why bundle adjustment,” in ICRA, 2019.
[28] Y. Kasten, A. Geifman, M. Galun, and R. Basri, “Algebraic characterization of essential matrices and their averaging in multiview settings,” in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 5895–5903.
[29] Y. Kasten, A. Geifman, M. Galun, and R. Basri, “Gpsfm: Global projective sfm using algebraic constraints on multi-view fundamental matrices,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3264–3272.
[30] A. Geifman, Y. Kasten, M. Galun, and R. Basri, “Averaging essential and fundamental matrices in collinear camera settings,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 6021–6030.
[31] G. Lerman and Y. Shi, “Robust group synchronization via cycle-edge message passing,” arXiv preprint arXiv:1912.11347, 2019.
[32] C. Zach, M. Klopschitz, and M. Pollefeys, “Disambiguating visual relations using loop constraints,” in . IEEE, 2010, pp. 1426–1433.
[33] T. Shen, S. Zhu, T. Fang, R. Zhang, and L. Quan, “Graph-based consistent matching for structure-from-motion,” in European Conference on Computer Vision. Springer, 2016, pp. 139–155.
[34] S. Agarwal, K. Mierle, and Others, “Ceres solver,” http://ceres-solver.org.
[35] K. Wilson and N. Snavely, “Robust global translations with 1dsfm,” in Proceedings of the European Conference on Computer Vision (ECCV), 2014.
[36] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[37] R. Kümmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, “g2o: A general framework for graph optimization,” in ICRA, 2011.
[38] E. Candes and J. Romberg, “L1-magic: recovery of sparse signals via convex programming.” https://statweb.stanford.edu/ ∼∼