[PDF] MAGSAC: marginalizing sample consensus

Abstract

A method called, sigma-consensus, is proposed to eliminate the need for a user-defined inlier-outlier threshold in RANSAC. Instead of estimating the noise sigma, it is marginalized over a range of noise scales. The optimized model is obtained by weighted least-squares fitting where the weights come from the marginalization over sigma of the point likelihoods of being inliers. A new quality function is proposed not requiring sigma and, thus, a set of inliers to determine the model quality. Also, a new termination criterion for RANSAC is built on the proposed marginalization approach. Applying sigma-consensus, MAGSAC is proposed with no need for a user-defined sigma and improving the accuracy of robust estimation significantly. It is superior to the state-of-the-art in terms of geometric accuracy on publicly available real-world datasets for epipolar geometry (F and E) and homography estimation. In addition, applying sigma-consensus only once as a post-processing step to the RANSAC output always improved the model quality on a wide range of vision problems without noticeable deterioration in processing time, adding a few milliseconds. The source code is at this https URL.

Full PDF

MMAGSAC: Marginalizing Sample Consensus

Daniel Barath , Jiri Matas , and Jana Noskova Centre for Machine Perception, Department of CyberneticsCzech Technical University, Prague, Czech Republic Machine Perception Research Laboratory, MTA SZTAKI, Budapest, Hungary [email protected]

Abstract

A method called, σ -consensus, is proposed to elimi-nate the need for a user-deﬁned inlier-outlier threshold inRANSAC. Instead of estimating the noise σ , it is marginal-ized over a range of noise scales. The optimized model isobtained by weighted least-squares ﬁtting where the weightscome from the marginalization over σ of the point like-lihoods of being inliers. A new quality function is pro-posed not requiring σ and, thus, a set of inliers to deter-mine the model quality. Also, a new termination criterionfor RANSAC is built on the proposed marginalization ap-proach. Applying σ -consensus, MAGSAC is proposed withno need for a user-deﬁned σ and improving the accuracy ofrobust estimation signiﬁcantly. It is superior to the state-of-the-art in terms of geometric accuracy on publicly availablereal-world datasets for epipolar geometry ( F and E ) andhomography estimation. In addition, applying σ -consensusonly once as a post-processing step to the RANSAC outputalways improved the model quality on a wide range of vi-sion problems without noticeable deterioration in process-ing time, adding a few milliseconds.

1. Introduction

The RANSAC (RANdom SAmple Consensus) algo-rithm proposed by Fischler and Bolles [5] in 1981 has be-come the most widely used robust estimator in computervision. RANSAC and its variants have been successfullyapplied to a wide range of vision tasks, e.g. motion seg-mentation [25], short baseline stereo [25, 27], wide baselinestereo matching [18, 13, 14], detection of geometric primi-tives [21], image mosaicing [7], and to perform [28] or ini-tialize multi-model ﬁtting [10, 17]. In brief, the RANSACapproach repeatedly selects random subsets of the inputpoint set and ﬁts a model, e.g. a plane to three 3D pointsor a homography to four 2D point correspondences. Next, The source code is at https://github.com/danini/magsac the quality of the estimated model is measured, for instanceby the size of its support, i.e. the number of inliers. Finally,the model with the highest quality, polished e.g. by leastsquares ﬁting on its inliers, is returned.Since the publication of RANSAC, a number of modi-ﬁcations has been proposed. NAPSAC [16], PROSAC [1]and EVSAC [6] modify the sampling strategy to increasethe probability of selecting an all-inlier sample early.NAPSAC assumes that the inliers are spatially coherent,PROSAC exploits an a priori predicted inlier probabilityof the points and EVSAC estimates a conﬁdence in eachof them. MLESAC [26] estimates the model quality by amaximum likelihood process with all its beneﬁcial proper-ties, albeit under certain assumptions about inlier and out-lier distributions. In practice, MLESAC results are oftensuperior to the inlier counting of plain RANSAC and theyare less sensitive to the user-deﬁned inlier-outlier threshold.In MSAC [24], the robust estimation is formulated as a pro-cess that estimates both the parameters of the data distribu-tion and the quality of the model in terms of maximum aposteriori. timates the model quality by a maximum likeli-hood process with all its beneﬁcial properties, albeit undercertain assumptions about inlier and outlier distributions.One of the highly attractive properties of RANSAC isits small number of control parameters. The terminationis controlled by a manually set conﬁdence value η and thesampling stops as soon as the probability of ﬁnding a modelwith higher support falls below − η . The setting of η is notproblematic, the typical values are 0.95 or 0.99, dependingon the required conﬁdence in the solution.The second, and most critical, parameter is the inliernoise scale σ that determines the inlier-outlier threshold τ ( σ ) which strongly inﬂuences the outcome of the proce-dure. In standard RANSAC and its variants, σ must beprovided by the user which limits its fully automatic out-of-the-box use and requires the user to acquire knowledgeabout the problem at hand. In Fig. 1, the inlier residuals Note that the probabilistic interpretation of η holds only for the stan-dard { , } cost function. a r X i v : . [ c s . C V ] J un re shown for four real datasets demonstrating that σ variesscene-by-scene and, thus, there is no single setting whichcan be used for all cases.To reduce the dependency on this threshold, MIN-PRAN [22] assumes that the outliers are uniformly dis-tributed and ﬁnds the model where the inliers are least likelyto have occurred randomly. Moisan et al. [15] proposed acontrario RANSAC, to optimize each model by selectingthe most likely noise scale.As the major contribution of this paper, we propose anapproach, σ -consensus, that eliminates the need for σ , thenoise scale parameter. Instead of σ , only an upper limit isrequired. The ﬁnal outcome is obtained by weighted least-squares ﬁtting, where the weights are given for marginal-izing over σ , using likelihood of the model given data and σ . Besides ﬁnessing the need for a precise scale param-eter, the novel method, called MAGSAC, is more precisethan previously published RANSACs. Also, we proposea post-processing step applying σ -consensus to the so-far-the-best-model without noticeable deterioration in process-ing time, i.e. at most a few milliseconds. In our experi-ments, the method always improved the input model (com-ing from RANSAC, MSAC or LO-RANSAC) on a widerange of problems. Thus we see no reason for not applyingit after the robust estimation ﬁnished. As a second contribu-tion , we deﬁne a new quality function for RANSAC. It mea-sures the quality of a model without requiring σ and, there-fore, a set of inliers to measure the model quality. Moreover,as a third contribution , due to not having a single inlier setand, thus, an inlier ratio, the standard termination criterionof RANSAC is marginalized over σ to be applicable to theproposed method.

2. Notation

In this paper, the input points are denoted as P = { p | p ∈ R k , k ∈ N > } , where k is the dimension, e.g. k = 2 for 2D points and k = 4 for point correspondences.The inlier set is I ⊆ P . The model to ﬁt is representedby its parameter vector θ ∈ Θ , where Θ = { θ | θ ∈ R d , d ∈ N > } is the manifold, for instance, of all possi-ble 2D lines and d is dimension of the model, e.g. d = 2 for2D lines (angle and offset). Fitting function F : P ∗ → Θ calculates the model parameters from n ≥ m points, where P ∗ = exp P is the power set of P and m ∈ N > is the min-imum point number for ﬁtting a model, e.g. m = 2 for 2Dlines. Note that F is a combined function applying differentestimators on the basis of the input set, for example, a min-imal method if n = m and least-squares ﬁtting otherwise.Function D : Θ × P → R is the point-to-model residualfunction. Function I : P ∗ × Θ × R → P ∗ selects the inliersgiven model θ and threshold σ . For instance, if the origi-nal RANSAC approach is considered, I RANSAC ( θ, σ, P ) = { p ∈ P | D ( θ, p ) < σ } , for truncated quadratic dis- B o s t on B o s t onL i b B r ugge S qua r e B r ugge T o w e r B r u ss e l s C ap i t a l R eg i on E i ff e l Le P o i n t P o i n t P o i n t W h i t e B oa r dada m boa t c i t y g r a f R M SE o f i n li e r s ( i n p x ) (a) homogr dataset ada m c a f e c a t du m f a c e f o x g i r l g r a f g r and i nde x m ag p kks hop t he r e v i n R M SE o f i n li e r s ( i n p x ) (b) EVD dataset ba rr s m i t hbonha ll bon y t hone l de r ha ll ae l de r ha ll bha r t l e y j ohn ss ona j ohn ss onb l ad ysy m on li b r a r y nap i e r anap i e r bnee m ne s eo l d c l a ss i cs w i ngph ys i css eneun i hou s eun i onhou s e R M SE o f i n li e r s (c) AdelaideRMF dataset K y o t oboo ks hbo xc a s t l e c o rr g r a ff ha r t l e y head k a m pa l ea f s li b r a r y ph ys i cs p l an t r o t unda s hou t v a l bonne w a ll w a s h z oo m R M SE o f i n li e r s ( i n p x ) (d) kusvod2 dataset Figure 1: The average residuals (RMSE in pixels; verticalaxis) of manually annotated inliers given the ground truthmodel for each scene (horizontal) of four datasets.

Notation P - Set of data points σ - Noise standard deviation θ - Model parameters D - Residual function I - Inlier selector function Q - Model quality function F - Fitting function m - Minimal sample size τ ( σ ) - Inlier-outlier threshold σ max - Upper bound of σ tance of MSAC, I MSAC ( θ, σ, P ) = { p ∈ P | D ( θ, p ) < / σ } . The quality function is Q : P ∗ × Θ × R → R .Higher quality is interpreted as better model. For RANSAC, Q RANSAC ( θ, σ, P ) = | I ( θ, σ, P ) | and for MSAC, it is Q MSAC ( θ, σ, P ) = | I ( θ,σ, P ) | (cid:88) i =1 (cid:18) − D ( θ, I i ( θ, σ, P ))9 / σ (cid:19) , where I i ( θ, σ, P ) is the i th inlier.

3. Marginalizing sample consensus

A method called MAGSAC is proposed in this sectioneliminating the threshold parameter from RANSAC-like ro-bust model estimation. σ Let us assume the noise σ to be a random variable withdensity function f ( σ ) and let us deﬁne a new quality func-tion for model θ marginalizing over σ as follows: Q ∗ ( θ, P ) = (cid:90) Q ( θ, σ, P ) f ( σ )d σ. (1)2aving no prior information, we assume σ being uniformlydistributed, σ ∼ U (0 , σ max ) . Thus Q ∗ ( θ, P ) = 1 σ max (cid:90) σ max Q ( θ, σ, P )d σ. (2)For instance, using Q ( θ, σ, P ) of plain RANSAC, i.e. thenumber of inliers, where σ is the inlier-outlier thresholdand { D ( θ, p i ) } |P| i =1 are the distances to model θ such that ≤ D ( θ, p ) < D ( θ, p ) < .... < D ( θ, p K ) < σ max is the gamma function.In MAGSAC, the residuals of the inliers are describedby a distribution with density g ( r | σ ) , and the outliers bya uniform one on the interval [0 , l ] . Note that, for images, l can be set to the image diagonal. The inlier-outlier thresh-old τ ( σ ) is set to the 0.95 or 0.99 quantile of the distribu-tion with density g ( r | σ ) . Consequently, the likelihood ofmodel θ given σ is L ( θ, P | σ ) = 1 l |P|−|I ( σ ) | (cid:89) p ∈I ( σ ) (cid:20) C ( ρ ) σ − ρ D ρ − ( θ, p ) exp (cid:18) − D ( θ, p )2 σ (cid:19)(cid:21) . (4) MAGSAC, for a given σ , uses log-likelihood of model θ asits quality function as follows: Q ( θ, σ, P ) = ln L ( θ, P| σ ) .Thus, the quality marginalized over σ is the following. Q ∗ MAGSAC ( θ, P ) = 1 σ max (cid:90) σ max ln L ( θ, P| σ ) dσ ≈ −|P| ln l + 1 σ max K (cid:88) i =1 [ i (ln 2 C ( ρ ) l − ρ ln σ i ) − R i σ i + ( ρ − Lr i ]( σ i − σ i − ) , (5)where { D ( θ, p i ) } |P| i =1 are the distances to model θ , σ = 0 and ≤ D ( θ, p ) = τ ( σ ) < D ( θ, p ) = τ ( σ ) <... < D ( θ, p K ) = τ ( σ K ) < τ ( σ max ) < D ( θ, p K +1 ) <... < D ( θ, p |P| ) , R i = (cid:80) ij =1 D ( θ, p j ) and Lr i = (cid:80) ij =1 ln D ( θ, p j ) . As a consequence, the proposed newquality function Q ∗ MAGSAC does not depend on a manuallyset noise level σ . σ -consensus model ﬁtting Due to not having a set of inliers which could be used topolish the model obtained from a minimal sample, we pro-pose to use weighted least-squares ﬁtting where the weightsare the point probabilities of being inliers.Suppose that we are given model θ estimated from a min-imal sample. Let θ σ = F ( I ( θ, σ, P )) be the model impliedby the inlier set I ( θ, σ, P ) selected using τ ( σ ) around theinput model θ . It can be seen from Eq. 4 that the likelihoodof point p ∈ P being inlier given model θ σ isL ( p | θ σ , σ ) = 2 C ( ρ ) σ − ρ D ρ − ( θ σ , p ) exp (cid:18) − D ( θ σ , p )2 σ (cid:19) . For ﬁnding the likelihood of a point being an inliermarginalized over σ , the same approach is used as before:L ( p | θ ) ≈ C ( ρ ) σ max K (cid:88) i =1 ( σ i − σ i − ) σ − ρi D ρ − ( θ σ i , p ) exp (cid:18) − D ( θ σ i , p )2 σ i (cid:19) . (6)and the polished model θ ∗ MAGSAC is estimated usingweighted least-squares, where the weight of point p ∈ P is L ( p | θ ) . Not having an inlier set and, thus, at least a rough es-timate of the inlier ratio, makes the standard terminationcriterion of RANSAC [8] inapplicable, which is as follows: k ( θ, σ, P ) = ln(1 − η )ln (cid:16) − (cid:16) | I ( θ,σ, P ) ||P| (cid:17) m (cid:17) , (7)3here k is the iteration number, η a manually set conﬁdencein the results, m the size of the minimal sample needed forthe estimation, and | I ( θ, σ, P ) | is the inlier number of theso-far-the-best model.In order to determine k without using a particular σ , itis a straightforward choice to marginalize similarly to themodel quality. It is as follows: k ∗ ( P , θ ) = 1 σ max (cid:90) σ max k ( θ, σ, P ) dσ ≈ σ max K (cid:88) i =1 ( σ i − σ i − ) ln(1 − η )ln (cid:16) − (cid:16) | I ( θ,σ i , P ) ||P| (cid:17) m (cid:17) . (8)Thus the number of iterations required for MAGSAC is cal-culated during the process and updated whenever a new so-far-the-best model is found, similarly as in RANSAC.

4. Algorithms using σ -consensus In this section, we propose two algorithms applying σ -consensus. First, MAGSAC will be discussed incor-porating the proposed marginalizing approach, weightedleast-squares and termination criterion. Second, a post-processing step is proposed which is applicable to the outputof every robust estimator. In the experiments, it always im-proved the input model without noticeable deterioration inthe processing time, adding maximum a few milliseconds. Since plain MAGSAC would apply least-squares ﬁttinga number of times, the implied computational complexitywould be fairly high. Therefore, we propose techniques forspeeding up the procedure. In order to avoid unnecessaryoperations, we introduce a σ max value and use only the σ ssmaller than σ max in the optimization procedure. Thus, from σ < σ < ... < σ K < σ max < σ K +1 < ... < σ n only σ , σ , ... , and σ i are used. This σ max can be set to afairly big value, for example, 10 pixels. In the case whenthe results suggest that σ max is too low, e.g. if the densitymode of the residuals is close to σ max , the computation canbe repeated with a higher value.Instead of calculating θ σ i for every σ i , we divide therange of σ s uniformly into d partitions. Thus the pro-cessed set of σ s are the following: σ + ( σ max − σ ) /d , σ + 2( σ max − σ ) /d , ... , σ + ( d − σ max − σ ) /d , σ max .By this simpliﬁcation, the number of least-squares ﬁttingsdrops to d from K , where d (cid:28) K . In the experiments, d was set to .Also, as it was proposed for USAC [19], there are severalways of skipping early the evaluation of models which donot have the chance of being better than the previous so-far-the-best. For this purpose, we apply SPRT [2] with a τ ref threshold. Threshold τ ref is not used in the model evaluation (a) Homography; homogr dataset. Errors: (cid:15) LO-MSC = 4 . (2nd) and (cid:15) MAGSAC = 2 . pixels (1st).(b) Homography; EVD dataset. Errors: (cid:15)

LO-RSC = 9 . (2nd)and (cid:15) MAGSAC = 4 . pixels (1st).(c) Fundamental matrix; kusvod2 dataset. Errors: (cid:15) MSC =14 . (2nd) and (cid:15) MAGSAC = 0 . pixels (1st).(d) Essential matrix; Strecha dataset. Errors: (cid:15)

MSC = 4 . (2nd) and (cid:15) MAGSAC = 2 . pixels (1st).(e) Essential matrix; Strecha dataset. Errors: (cid:15)

MSC = 5 . (2nd) and (cid:15) MAGSAC = 3 . pixels (1st). Figure 2: Example results of MAGSAC where it was sig-niﬁcantly more accurate than the second most accuratemethod. Average errors (in pixels) are written in the cap-tions. Inlier correspondences are drawn by color and out-liers by black crosses.or inlier selection steps, but is used merely to skip applying σ -consensus when it is unnecessary. In the experiments, τ ref pixel.Finally, the parallel implementation of σ -consensus canbe straightforwardly done on GPU or multiple CPUs evalu-ating each σ on a different thread. In our C++ implementa-tion, it runs on multiple CPU cores. σ -consensus algorithm The proposed σ -consensus is described in Alg. 1. Theinput parameters are: the data points ( P ), initial model pa-rameters ( θ ), a user-deﬁned partition number ( d ), and a limitfor σ ( σ max ).As a ﬁrst step, the algorithm takes the points which arecloser to the initial model than τ ( σ max ) (line 1). Function τ returns the threshold implied by the input σ parameter.In case of χ (4) distribution, it is τ ( σ ) = 3 . σ . Thenthe residuals of the inliers are sorted, therefore, in { σ i } |I| i =1 , σ i < σ j ⇔ i < j . In I ord , the indices of the points areordered reﬂecting to { σ i } |I| i =1 , thus σ i = D ( θ, I ord ,i ) / . (line 2). In lines 3 and 4, the weights are initialized tozero, and σ max is set to max( { σ i } |I| i =1 ) . Then the current σ range is calculated. For instance, the ﬁrst range to pro-cess is [ σ , σ + δ σ ] . Note that σ = 0 due to having at least m points at zero distance from the model. The cycle runsfrom the ﬁrst to the last point and, since I ord is ordered, eachsubsequent point is farther from the model than the previ-ous ones. Until the end of the current range, i.e. partition,is not reached (line 7), it collects the points (line 8) one-by-one. After exceeding the boundary of the current range, θ σ is calculated using all the previously collected points (line10). Then, for each point, the weight is updated by the im-plied probability (line 12). Finally, the algorithm jumps tothe next range (line 13). After the weights have been calcu-lated for each point, weighted least-squares ﬁtting is appliedto obtain the marginalized model parameters (line 14). The MAGSAC procedure polishing every estimatedmodel by σ -consensus is shown in Alg. 2. First, it initializesthe model quality to zero and the required iteration numberto ∞ (line 1). In each iteration, it selects a minimal sample(line 3), ﬁts a model to the selected points (line 4) validatesit (line 5) and applies σ -consensus to obtain the parametersmarginalized over σ (line 6). The validation step includesdegeneracy testing and tests which stop the evaluation of themodel if there is no chance of being better than the previousso-far-the-best, e.g. by SPRT test [2]. Note that, for SPRT,the validation step is also included into σ -consensus whenthe distances from the current model are calculated (line 1in Alg. 1). Finally, the model quality is calculated (line 8),the so-far-the-best model and required iteration number areupdated (line 10) if required (line 9). As a post-processingstep in time sensitive applications, σ -consensus is a pos-sible option for polishing the RANSAC output instead of applying a least-squares ﬁtting to the inliers. In this case, σ -consensus is applied only once, thus improving the resultswithout noticeable deterioration in the processing time. Algorithm 1 σ -consensus.Input: P – points; θ – model parameters; d – partitionnumber; σ max – σ limit; η – conﬁdence Output: θ ∗ – optimal model parameters I ← I ( P , θ, τ ( σ max )) I ord , { σ i } |I| i =1 ← sort ( { D ( θ, p ) } p ∈I ) { w i } |I| i =1 ← { } |I| i =1 , σ max ← max ( { σ i } |I| i =1 ) δ σ ← σ max /d , σ next ← δ σ , I tmp ← ∅ for i = → |I ord | do p ← I ord ,i , d p ← D ( θ, p ) if d p ≤ τ ( σ next ) then I tmp ← I tmp ∪ { p } continue θ σ ← F ( I tmp ) for i = → |I| do w i ← w i + W ( θ σ , I i , δ σ ) /σ max (cid:46) Eq. 6 I tmp ← I tmp ∪ { p } , σ next ← σ next + δ σ θ ∗ ← F ( I , { w i } |I| i =1 ) (cid:46) Weighted LSQ

Algorithm 2 MAGSACInput: P – data points; σ max – σ limit; σ ref – reference σ ; m – sample size; d – partition number; η – conﬁdence Output: θ ∗ – optimal model; q ∗ – model quality q ∗ ← , k ← ∞ for i = → k do { p j } mj =1 ← Sample( P ) θ ← F ( { p j } mj =1 ) if ¬ Validate( θ , σ ref ) then continue θ (cid:48) ← σ -consensus ( P , θ, d, σ max ) (cid:46) Alg. 1 q (cid:48) ← Q ( θ (cid:48) , P ) if q > q ∗ then q ∗ , θ ∗ , k ← q (cid:48) , θ (cid:48) , Iters ( q (cid:48) , |P| , η ) (cid:46) Eq. 8

5. Experimental Results

To evaluate the proposed post-processing step, wetested several approaches with and without this step.The compared algorithms are: RANSAC, MSAC, LO-RANSAC, LO-MSAC, LO-RANSAAC [20], and a con-trario RANSAC [15] (AC-RANSAC). LO-RANSAAC is amethod including model averaging into robust estimation.AC-RANSAC estimates the noise σ . The same randomseed was used for all methods and they performed a ﬁnalleast-squares on the obtained inlier set. The difference be-tween RANSAC – MSAC and LO-RANSAC – LO-MSAC5s merely the quality function. Moreover, the methods withLO preﬁx run the local optimization step proposed by Chumet al. [3] with an inner RANSAC applied to the inliers. Theparameters used are as follows: σ = 0 . was the inlier-outlier threshold used for the RANSAC loop (this value wasproposed in [11] and also suited for us). The number of in-ner RANSAC iterations was r = 20 . The required conﬁ-dence η was . . There was a minimum number of iter-ations required (set to ) before the ﬁrst LO step appliedand also before termination. The reported error values arethe root mean square (RMS) errors. For σ -consensus, σ max was set to pixels for all problems. The partition of σ range was set to d = 10 . Therefore, the processed set of σ swere σ max /d , σ max /d , ... , ( d − σ max /d , and σ max . To test the proposed method in a fully controlled envi-ronment, two cameras were generated by their × projec-tion matrices P = K [ I × | and P = K [ R | − R t ] .Camera P was located in the origin and its image plane wasparallel to plane XY. The position of the second camera wasat a random point inside a unit-sphere around the ﬁrst one,thus | t | ≤ . Its orientation was determined by three ran-dom rotations affecting around the principal directions asfollows: R = R X ,α R Y ,β R Z ,γ , where R X ,α , R Y ,β and R Z ,γ are 3D rotation matrices rotating around axes X, Y and Z, by α , β and γ degrees, respectively ( α, β, γ ∈ [0 , π/ . Bothcameras had a common intrinsic camera matrix with focallength f x = f y = 600 and principal points [300 , T . A3D plane was generated with random tangent directions andorigin [0 , , T . It was sampled at n i locations, thus gener-ating n i

3D points at most one unit far from the plane origin.These points were projected into the cameras. All of therandom parameters were selected using uniform distribu-tion. Zero-mean Gaussian-noise with σ standard deviationwas added to the projected point coordinates. Finally, n o outliers, i.e. uniformly distributed random point correspon-dences, were added. In total, points were generated,therefore n i + n o = 200 .The mean results of runs are reported in Fig. 3.The competitor algorithms are: RANSAC (RSC), MSAC(MSC), LO-RANSAC (LO-RSC), LO-MSAC (LO-MSC)and MAGSAC. Sufﬁx ” + σ ” means that σ -consensus wasapplied as a post-processing step. Plots (a–c) reports thegeometric accuracy (in pixels) as a function of the noiselevel σ using different outlier ratios (a – . , b – . , c – . ). The RANSAC conﬁdence was set to . . For in-stance, outlier ratio . means that n o = 160 and n i = 40 .By looking at the differences between methods with andwithout the proposed post-processing step (” + σ ”), it can beseen that it almost always improved the results. E.g. the ge-ometric error of LO-MSC is higher than that of LO-MSC+ σ for every noise σ . MAGSAC results are superior to that of the competitor algorithms on every outlier ratio. Itcan be seen that it is less sensitive to noise and more ro-bust to outliers. In (d), the processing time (in seconds) isreported as the function of the noise σ . MAGSAC is theslowest on the easy scenes, i.e. when the noise σ < . pixels. Thereafter, it becomes the fastest method due to re-quiring signiﬁcantly fewer iterations than the others. Plots(e–f) of Fig. 3 demonstrate that the accuracy provided byMAGSAC cannot be achieved by simply letting RANSACrun longer. The charts report the results for a ﬁxed itera-tion number, i.e. calculated from the ground truth inlier ra-tio and conﬁdence set to 0.999. For outlier ratio 0.8, it was log(0 . / log(1 − . ) = 4 314 . For outlier ratio 0.9,it was log(0 . / log(1 − . ) = 69 074 . It can be seenthat MAGSAC obtains signiﬁcantly more accurate resultsthan the competitor algorithms. It ﬁnds the desired modelin most of the cases even when the outlier ratio is high. In this section, MAGSAC and the proposed post-processing step is compared with state-of-the-art robust es-timators on real-world data for fundamental matrix, homog-raphy and essential matrix ﬁtting. See Fig. 2 for exam-ple image pairs where the error ( (cid:15)

MAGSAC ; in pixels) of theMAGSAC estimate was signiﬁcantly lower than that of thesecond best method.

Fundamental Matrices.

To evaluate the performance onfundamental matrix estimation we downloaded kusvod2 (24 pairs), Multi-H (5 pairs), and AdelaideRMF (19pairs) datasets. Kusvod2 consists of 24 image pairsof different sizes with point correspondences and funda-mental matrices estimated from manually selected inliers.

AdelaideRMF and

Multi-H consist a total of 24 imagepairs with point correspondences, each assigned manuallyto a homography or the outlier class. All points which areassigned to a homography were considered as inliers and theothers as outliers. In total, image pairs were used fromthree publicly available datasets. All methods applied theseven-point method [8] as a minimal solver for estimating F . Thus they drew minimal sets of size seven in each itera-tion. For the ﬁnal least squares ﬁtting, the normalized eight-point algorithm [9] was ran on the obtained inlier set. Notethat all fundamental matrices were discarded for which theoriented epipolar constraint [4] did not hold.The ﬁrst three blocks of Table 1, each consisting of threerows, report the quality of the estimation on each datasetas the average of 100 runs on every image pair. The ﬁrsttwo columns show the name of the tests and the investi-gated properties: (1) e avg is the RMS geometric error in pix-els of the obtained model w.r.t. the manually annotated in- http://cmp.felk.cvut.cz/data/geometry2view/ http://web.eee.sztaki.hu/˜dbarath/ cs.adelaide.edu.au/˜hwong/doku.php?id=data Noise (px) E rr o r ( p x ) RSCRSC + MSCMSC + LO-RSCLO-RSC + LO-MSCLO-MSC + MAGSAC (a) outl., conf.

Noise (px) E rr o r ( p x ) RSCRSC + MSCMSC + LO-RSCLO-RSC + LO-MSCLO-MSC + MAGSAC (b) outl., conf.

Noise (px) E rr o r ( p x ) RSCRSC + MSCMSC + LO-RSCLO-RSC + LO-MSCLO-MSC + MAGSAC (c) outl., conf.

Noise (px) P r o cess i ng t i m e ( secs ) RSCRSC + MSCMSC + LO-RSCLO-RSC + LO-MSCLO-MSC + MAGSAC (d) conf.

Noise (px) E rr o r ( p x ) RSCRSC + MSCMSC + LO-RSCLO-RSC + LO-MSCLO-MSC + MAGSAC (e) outl., 4 314 iters.

Noise (px) E rr o r ( p x ) RSCRSC + MSCMSC + LO-RSCLO-RSC + LO-MSCLO-MSC + MAGSAC (f) outl., 69 074 iters.

Figure 3:

Synthetic homography ﬁtting.

The competitor methods are: RANSAC, MSAC, LO-RANSAC, LO-MSAC andMAGSAC. Sufﬁx ” + σ ” means that σ -consensus was applied to the output. Plots (a–c) report the errors (in pixels) as functionof the noise σ with conﬁdence set to . . Plot (d) shows the avg. processing time (in seconds). Plots (e–f) report the resultsmade by using a ﬁxed iteration number calculated from the ground truth inlier ratio and conﬁdence set to 0.999.liers. For fundamental matrices and homographies, it is theaverage Sampson distance and re-projection error, respec-tively. For essential matrices, it is the mean Sampson dis-tance of the implied F and the correspondences. (2) Value t is the mean processing time in milliseconds. (3) Value s isthe mean number of samples, i.e. RANSAC iterations, hadto be drawn till termination. Note that the iteration num-bers of methods applied with or without the proposed post-processing are equal.It can be seen that for F estimation the proposed post-processing step improved the results in nearly all of the testswith negligible deterioration in the processing time. The er-rors were reduced by approximately compared with themethods without σ -consensus. MAGSAC led to the mostaccurate results for kusvod2 and Multi-H datasets and itwas the third best for

AdelaideRMF dataset by a small mar-gin of . pixels. Homographies.

To test homography estimation we down-loaded homogr (16 pairs) and

EVD (15 pairs) datasets. Eachconsists of image pairs of different sizes from × up to × with point correspondences and inliersselected manually. The Homogr dataset consists of mostlyshort baseline stereo images, whilst the pairs of

EVD un-dergo an extreme view change, i.e. wide baseline or ex- http://cmp.felk.cvut.cz/wbs/ treme zoom. All algorithms applied the normalized four-point algorithm [8] for homography estimation both in themodel generation and local optimization steps. The th and th blocks of Fig. 1 show the mean results computed us-ing all the image pairs of each dataset. Similarly as for F estimation, the proposed post-processing step always im-proved (by . pixels on average). For both datasets, theresults obtained by MAGSAC were signiﬁcantly more ac-curate than what the competitor algorithms obtained. Essential Matrices.

To estimate essential matrices, weused the strecha dataset [23] consisting of image se-quences of buildings. All images are of size × .The ground truth projection matrices are provided. Themethods were applied to all possible image pairs in eachsequence. The SIFT detector [12] was used to obtain corre-spondences. For each image pair, a reference point set withground truth inliers was obtained by calculating F from theprojection matrices [8]. Correspondences were consideredas inliers if the symmetric epipolar distance was smallerthan . pixel. All image pairs with less than inliersfound were discarded. In total, image pairs were usedin the evaluation. The results are reported in the th block ofTable 1. The trend is similar to the previous cases. The mostaccurate essential matrices were obtained by MAGSAC.Also it was the fastest algorithm on average.7 SC + σ MSC + σ LO-RSC + σ LO-MSC + σ LO-RSAAC AC-RSC

MAGSAC kusvod2 F , e avg t

38 39 19 19 25 25

17 17 17

55 31 s

661 661 313 313 316 316 160 160 160 fails Adelaide F , e avg t

491 493 420 420 393 394

380 380 380

447 939 s fails Multi-H F , e avg t

321 329 149 149 132 140 119 128 126 s fails homogr H , e avg t

83 85

65 71 72 64 65 65 37 131 s fails EVD H , e avg t

381 383 379 380 367 369 353 356 355 291 s fails strecha E , e avg t s fails all e avg e med t

727 730 654 655 589 592

581 580 921 688 fails

Table 1:

Accuracy of robust estimators on two-view geometric estimation.

Fundamental matrix estimation ( F ) on kusvod2 (24 pairs), AdelaideRMF (19 pairs) and

Multi-H (4 pairs) datasets, homography estimation ( H ) on homogr (16 pairs) and EVD (15 pairs) datasets, and essential matrix estimation ( E ) on the strecha dataset (467 pairs). In total, the testing included image pairs. The datasets, the problem, the number of the image pairs ( ) and the reported properties are shown in theﬁrst three columns. The other columns show the average results ( runs on each image pair) of the competitor methodsat conﬁdence. Columns with ” + σ ” show the results when the proposed σ -consensus was applied to the output of themethod on its left. The mean geometric error ( e avg ; in pixels) of the estimated model w.r.t. the manually selected inliers arewritten in each 1st row; the mean processing time ( t , in milliseconds) and the required number of samples ( s ) are written inevery nd and rd rows. In the th one, the proportion of failures, i.e. when the sough model is not found, is shown. Thegeometric error is the RMS Sampson distance for F and E , and the RMS re-projection error for H using the ground truthinlier set. The thresholds proposed in [11] were used. For MAGSAC, σ max = 10 pixels.

6. Conclusion

A robust approach, called σ -consensus, was proposedfor eliminating the need of a user-deﬁned threshold bymarginalizing over a range of noise scales. Also, due to nothaving a set of inliers, a new model quality function and ter-mination criterion were proposed. Applying σ -consensus,we proposed two methods: ﬁrst, MAGSAC applying σ -consensus to each of the models estimated from a mini-mal sample. The method is superior to the state-of-the-artin terms of geometric accuracy on publicly available real-world datasets for epipolar geometry (both F and E ) and ho-mography estimation. The method is often faster than otherRANSAC variants in case of high outlier ratio. The pro- posed post-processing step applies σ -consensus only once:to polish the RANSAC output. The method nearly alwaysimproved the model quality on a wide range of vision prob-lems without noticeable deterioration in processing time,i.e. at most a few milliseconds. We see no reason for notapplying it after the robust estimation ﬁnished.

7. Acknowledgement

This work was supported by the OP VVV projectCZ.02.1.01/0.0/0.0/16019/000076 Research Center for In-formatics, by the Czech Science Foundation grant GA18-05360S, and by the Hungarian Scientic Research Fund (No.NKFIH OTKA KH-126513).8 eferences [1] O. Chum and J. Matas. Matching with PROSAC-progressivesample consensus. In

Computer Vision and Pattern Recogni-tion . IEEE, 2005. 1[2] Ondˇrej Chum and Jiˇr´ı Matas. Optimal randomizedRANSAC.

IEEE Transactions on Pattern Analysis and Ma-chine Intelligence , 30(8):1472–1482, 2008. 4, 5[3] O. Chum, J. Matas, and J. Kittler. Locally optimizedRANSAC. In

Joint Pattern Recognition Symposium .Springer, 2003. 6[4] O. Chum, T. Werner, and J. Matas. Epipolar geometry es-timation via RANSAC beneﬁts from the oriented epipolarconstraint. In

International Conference on Pattern Recogni-tion , 2004. 6[5] M. A. Fischler and R. C. Bolles. Random sample consen-sus: a paradigm for model ﬁtting with applications to imageanalysis and automated cartography.

Communications of theACM , 1981. 1[6] V. Fragoso, P. Sen, S. Rodriguez, and M. Turk. EVSAC:accelerating hypotheses generation by modeling matchingscores with extreme value theory. In

International Confer-ence on Computer Vision , 2013. 1[7] D. Ghosh and N. Kaabouch. A survey on image mosaick-ing techniques.

Journal of Visual Communication and ImageRepresentation , 2016. 1[8] R. Hartley and A. Zisserman.

Multiple view geometry incomputer vision . Cambridge university press, 2003. 3, 6,7[9] R. I. Hartley. In defense of the eight-point algorithm.

Trans-actions on Pattern Analysis and Machine Intelligence , 1997.6[10] H. Isack and Y. Boykov. Energy-based geometric multi-model ﬁtting.

International Journal of Computer Vision ,2012. 1[11] K. Lebeda, J. Matas, and O. Chum. Fixing the locally op-timized RANSAC. In

British Machine Vision Conference .Citeseer, 2012. 6, 8[12] D. G. Lowe. Object recognition from local scale-invariantfeatures. In

International Conference on Computer vision .IEEE, 1999. 7[13] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions.

Im-age and Vision Computing , 2004. 1[14] D. Mishkin, J. Matas, and M. Perdoch. MODS: Fast androbust method for two-view matching.

Computer Vision andImage Understanding , 2015. 1[15] Lionel Moisan, Pierre Moulon, and Pascal Monasse. Auto-matic homographic registration of a pair of images, with acontrario elimination of outliers.

Image Processing On Line ,2:56–73, 2012. 2, 5[16] D. Nasuto and J. M. B. R. Craddock. NAPSAC: High noise,high dimensional robust estimation - its in the bag. 2002. 1[17] T. T. Pham, T-J. Chin, K. Schindler, and D. Suter. Interactinggeometric priors for robust multimodel ﬁtting.

Transactionson Image Processing , 2014. 1 [18] P. Pritchett and A. Zisserman. Wide baseline stereo match-ing. In

International Conference on Computer Vision . IEEE,1998. 1[19] R. Raguram, O. Chum, M. Pollefeys, J. Matas, and J-M.Frahm. USAC: a universal framework for random sampleconsensus.

Transactions on Pattern Analysis and MachineIntelligence , 2013. 4[20] Martin Rais, Gabriele Facciolo, Enric Meinhardt-Llopis,Jean-Michel Morel, Antoni Buades, and Bartomeu Coll. Ac-curate motion estimation through random sample aggregatedconsensus.

CoRR , abs/1701.05268, 2017. 5[21] C. Sminchisescu, D. Metaxas, and S. Dickinson. Incrementalmodel-based estimation using geometric constraints.

PatternAnalysis and Machine Intelligence , 2005. 1[22] Charles V. Stewart. Minpran: A new robust estimator forcomputer vision.

IEEE Transactions on Pattern Analysis andMachine Intelligence , 17(10):925–938, 1995. 2[23] C. Strecha, R. Fransens, and L. Van Gool. Wide-baselinestereo from multiple views: a probabilistic account. In

Con-ference on Computer Vision and Pattern Recognition . IEEE,2004. 7[24] P. H. S. Torr. Bayesian model estimation and selection forepipolar geometry and generic manifold ﬁtting.

Interna-tional Journal of Computer Vision , 50(1):35–61, 2002. 1[25] P. H. S. Torr and D. W. Murray. Outlier detection and mo-tion segmentation. In

Optical Tools for Manufacturing andAdvanced Automation . International Society for Optics andPhotonics, 1993. 1[26] P. H. S. Torr and A. Zisserman. MLESAC: A new robust esti-mator with application to estimating image geometry.

Com-puter Vision and Image Understanding , 2000. 1[27] P. H. S. Torr, A. Zisserman, and S. J. Maybank. Robust detec-tion of degenerate conﬁgurations while estimating the funda-mental matrix.

Computer Vision and Image Understanding ,1998. 1[28] M. Zuliani, C. S. Kenney, and B. S. Manjunath. The multi-ransac algorithm and its application to detect planar homo-graphies. In

International Conference on Image Processing .IEEE, 2005. 1.IEEE, 2005. 1