[PDF] Smooth Deformation Field-based Mismatch Removal in Real-time

Abstract

This paper studies the mismatch removal problem, which may serve as the subsequent step of feature matching. Non-rigid deformation makes it difficult to remove mismatches because no parametric transformation can be found. To solve this problem, we first propose an algorithm based on the re-weighting and 1-point RANSAC strategy (R1P-RNSC), which is a parametric method under a reasonable assumption that the non-rigid deformation can be approximately represented by multiple locally rigid transformations. R1P-RNSC is fast but suffers from a drawback that the local smoothing information cannot be taken into account. Then, we propose a non-parametric algorithm based on the expectation maximization algorithm and dual quaternion (EMDQ) representation to generate the smooth deformation field. The two algorithms compensate for the drawbacks of each other. Specifically, EMDQ needs good initial values provided by R1P-RNSC, and R1P-RNSC needs EMDQ for refinement. Experimental results with real-world data demonstrate that the combination of the two algorithms has the best accuracy compared to other state-of-the-art methods, which can handle up to 85% of outliers in real-time. The ability to generate dense deformation field from sparse matches with outliers in real-time makes the proposed algorithms have many potential applications, such as non-rigid registration and SLAM.

Full PDF

JJOURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 1

Smooth Deformation Field-basedMismatch Removal in Real-time

Haoyin Zhou,

Member, IEEE and Jagadeesan Jayender,

Senior Member, IEEE

Abstract —This paper studies the mismatch removal problem, which may serve as the subsequent step of feature matching. Non-rigiddeformation makes it difﬁcult to remove mismatches because no parametric transformation can be found. To solve this problem, weﬁrst propose an algorithm based on the re-weighting and 1-point RANSAC strategy (R1P-RNSC), which is a parametric method undera reasonable assumption that the non-rigid deformation can be approximately represented by multiple locally rigid transformations.R1P-RNSC is fast but suffers from a drawback that the local smoothing information cannot be taken into account. Then, we propose anon-parametric algorithm based on the expectation maximization algorithm and dual quaternion (EMDQ) representation to generatethe smooth deformation ﬁeld. The two algorithms compensate for the drawbacks of each other. Speciﬁcally, EMDQ needs good initialvalues provided by R1P-RNSC, and R1P-RNSC needs EMDQ for reﬁnement. Experimental results with real-world data demonstratethat the combination of the two algorithms has the best accuracy compared to other state-of-the-art methods, which can handle up to85% of outliers in real-time. The ability to generate dense deformation ﬁeld from sparse matches with outliers in real-time makes theproposed algorithms have many potential applications, such as non-rigid registration and SLAM.

Index Terms —Feature matching; mismatch removal; smooth deformation ﬁeld; 1-point RANSAC; dual quaternion; expectationmaximization; image registration (cid:70)

NTRODUCTION F EATURE matching is the foundation of many computervision applications, such as simultaneously localizationand mapping (SLAM) [1], structure-from-motion (SfM) [2]and image registration [3]. The problem of detecting keypoints and extracting their descriptors from the input im-ages has been studied for decades and many effective meth-ods have been proposed, such as SURF [4] and ORB [5]. Inrecent years, deep learning-based methods have also beendeveloped to improve the performance of feature detectionand description [6]. Correspondences between these featurepoints can be built according to the computed descriptors[7]. However, due to noise, illumination change, cameramotion and object deformation, incorrect matches, or out-liers, are unavoidable. To remove outliers from the correctmatches is essential for the feature-based applications, andthe related methods can be roughly classiﬁed into paramet-ric and non-parametric methods.The parametric methods assume that there exists a sim-ple analytical transformation between two point clouds [8],which include afﬁne [9], homography [10], epipolar geom-etry [11] and so on. For example, Zheng et al proposed analgorithm to extract the planar homography matrix fromthe matches, which is fast and accurate but requires theobserved objects to be approximately planar or the camerarotates at a ﬁxed location [10]. To handle more generaltransformations, the most common method is random sam-ple consensus (RANSAC) [12], which has many modiﬁedversions [13] [14] [15]. The RANSAC methods estimate • Haoyin Zhou and Jagadeesan Jayender are with the Surgical PlanningLaboratory, Brigham and Women’s Hospital, Harvard Medical School,Boston, MA, 02115, USA.E-mail: [email protected]; [email protected].. the transformations from small sets of randomly selectedcontrol points, and select the transformation supported bythe most number of matches as the ﬁnal results. RANSACis efﬁcient in handling parametric transformations, but isdifﬁcult to handle non-parametric transformations, such asnon-rigid deformation. To solve this problem, we proposea reasonable assumption that the non-rigid deformationcan be approximately represented by multiple locally rigidtransformations. However, traditional RANSAC methodshave low efﬁciency under this assumption, because theselected sets of control points need to (1) be inliers and (2)can ﬁt into the same rigid transformation. In this paper, wepropose a novel re-weighting and 1-point RANSAC-basedmethod (R1P-RNSC), which naturally satisﬁes the secondcondition because only one control point is needed in eachtrial. Hence, R1P-RNSC has high efﬁciency in handling non-rigid deformation while removing outliers in the featurematches.Compared with parametric methods, non-parametricmethods are more efﬁcient in removing matching outlierswhen non-rigid deformation exists, which is an ill-posedproblem. Most non-parametric methods are explicitly orimplicitly based on the assumption that the deformation issmooth in local areas, and they recognize a match as anoutlier if it is not consistent with its neighbors [16] [17]. Forexample, Probst et al used the isometric deformation priorthat assumed the distances between matches are preservedunder deformations [18]. Ma et al proposed a method toestimate the vector ﬁeld from the matches, which achievedhigh performance in terms of speed and accuracy [19].They further proposed to use machine learning methodsto exploit the consensus of local neighborhood structures[20]. Hence, the representation of the local smoothing infor-mation is essential for the accuracy of the non-parametric a r X i v : . [ c s . C V ] J u l OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 2 methods. In this paper, we propose an algorithm called asEMDQ based on dual quaternion to generate the smoothdeformation ﬁeld. Dual quaternion [21] is widely used in thecomputer graphics ﬁeld to generate smooth interpolationsamong 6-DoF rigid transformations. To reduce the impactof outliers, we integrate the dual quaternion-based inter-polation process into the expectation maximization (EM)algorithm.As shown in Fig.1, the method proposed in this paperconsists of two steps, which are R1P-RNSC and EMDQ.The two algorithms compensate for the drawbacks of eachother. The accuracy of R1P-RNSC is limited due to thelack of ability to take into account smoothing information.However, R1P-RNSC is fast and able to extract the can-didate rigid transformations from the matches, which isessential for the initialization of the EMDQ algorithm. Theexperimental results show that the combination of the twoalgorithms has the highest accuracy compared with otherstate-of-the-art methods, and the speed is sufﬁcient for real-time applications.This paper is organized as follows. In Section 2, wegive the details of R1P-RNSC, including the re-weightingstrategy and the termination conditions of the RANSACframework. The details of the EMDQ algorithm are givenin Section 3, in which we also provide the strategy of initial-ization based on the results of R1P-RNSC. Evaluation resultsare presented in Section 4. A discussion and description ofplanned future work is described in Section 5.

Building correct correspondences between two point cloudsin the presence of noise, outliers and deformation is a funda-mental problem in the computer vision ﬁeld, and the relatedresearch directions mainly include point clouds registrationand feature matching [22] [23]. (1) Point clouds registration:

This paper is closely relatedto the registration problem between point clouds or meshmodels, which has been studied for decades with manyeffective solutions [24]. The goal of point clouds registrationis to assign correspondences between two sets of pointsand/or to recover the transformation that maps one pointset to the other. The most popular point clouds registrationmethod is iterative closet points (ICP), which has many vari-ants [25] [26] [27]. ICP iteratively assigns correspondencesbased on a closest distance criterion and ﬁnds the least-squares transformation between the two point clouds. Toeliminate the impact of outliers, robust kernel functions canbe applied [28]. The ICP methods are sensitive to the initialalignment. To overcome this drawback, some works proposeto use the soft-assignment strategy. For example, the robustpoint matching (RPM) algorithm [29] uses a soft assignmentmatrix to represent the corresponding relationships. An-other representative work is coherent point drift (CPD) [30],which is based on the expectation maximization (EM) algo-rithm to calculate the probabilities of the correspondencesbetween points. CFD considers one point set as the GaussianMixture Model (GMM) centroids, and the other one as thedata points. Then, CFD ﬁts the GMM centroids to the datato maximize the likelihood. Inspired by CPD, we have alsoemployed the EM algorithm to obtain the probabilities of the matches according to the error distances to reduce theimpacts of outliers.To provide initial values and improve the robustness ofpoint cloud registration, some methods proposed to extractdescriptors of key points from the point clouds and buildcorrespondences [31]. For example, Elbaz et al proposed adeep learning-based method called LORAX [32] to describethe geometric structure of each point cloud with a low-dimensional descriptor. The obtained correspondences mayinclude mismatches and an efﬁcient method to remove thesemismatches is needed. (2) Feature matching and outliers removal:

Feature match-ing is often a two-steps work. The ﬁrst step detects featurepoints and builds matches. For example, there are manyeffective methods to detect key points and extract the de-scriptors from images, such as SIFT [33], SURF [4], ORB [5],A-SIFT [34] and LIFT [6]. There are also many works thataim to develop more robust and efﬁcient matchers, such asFLANN [35] [36]. However, mismatches are often unavoid-able after the ﬁrst step, hence a following step to remove themismatches is necessary. The problems in the ﬁrst step havebeen studied for many years but little attention has beengiven to identifying outliers in the obtained matches.It is very common that the correct matches in the imageshave a non-rigid deformation, because even when observinga rigid three-dimensional (3D) object from different anglesand positions, the displacements of different parts of theobject on the two-dimensional (2D) images are often non-rigid. It is difﬁcult to pre-deﬁne a transform model to handlethe non-rigid deformation when the 3D structures of theobject are unknown. Hence, non-parametric methods aremore ﬂexible in handling the non-rigid deformation. Forexample, Li et al proposed the ICF algorithm [16] based ona diagnostic technique and SVM to learn correspondencefunctions that mutually map one point set to the other.Then mismatches are identiﬁed according to the consis-tence with the estimated correspondence functions. Lin etal. proposed the SIM algorithm [9] that achieves afﬁneinvariance by computing the shape interaction matrices oftwo corresponding point sets. SIM detects the mismatchesby removing the most different entries between the in-teraction matrices. Basically, these methods interpolate anon-parametric function by applying the prior condition,in which the motion ﬁeld associated with the feature cor-respondence is slow-and-smooth. Similarly, a recent workby Lin et al [37] proposed a regression algorithm basedon as-smooth-as-possible piece-wise constraints to discovercoherence from the input matches.Graph matching [38] [39] is another type of featurematching methods, which uses the nodes and edges torepresent features and connections respectively. For exam-ple, Leordeanu et al proposed a spectral approach thatuses weighted adjacency matrix to represent pairwise agree-ments between matches [40]. Torresani et al propose a graphmatching optimization technique to minimize the energyfunction deﬁned by the appearance and the spatial arrange-ment of the features [41]. Liu et al proposed a graph-basedmethod to handle multiple patterns [42]. Cho et al proposeda method based on hierarchical agglomerative clustering[43]. The graph matching methods are ﬂexible to multipletypes of transformations but suffer from the NP-hard nature.

OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 3 + …+=

Input:Step 1:

R1P-RNSC

Step 2:

EMDQ

Fig. 1. The process of the proposed method to remove outliers from feature matches. First row: The input images with matches built by traditionalfeature matching methods. Although the objects are rigid, the displacements of the correct matches in the images may be non-rigid. Second row:R1P-RNSC results, which assumes that the non-rigid deformation can be represented by multiple rigid transformations. Third row: EMDQ resultsand the generated smooth deformation ﬁeld. In this example, the two steps took ≈ ms to remove the outliers, and took additional ≈ ms togenerate the deformation ﬁeld on the grid points. E - WEIGHTING AND POINT

RANSAC (R1P-RNSC)

Under a reasonable assumption that a non-rigid deforma-tion can be approximatively represented by multiple locallyrigid transformations, we propose a method based on there-weighting and 1-point RANSAC (R1P-RNSC) strategyto handle matching outliers when non-rigid deformationexists. Compared with traditional RANSAC methods thatare based three or four control points, the ability to useonly one control point to estimate the rigid transformationscan naturally satisfy the requirement that all control pointsshould have similar rigid transformations. The R1P-RNSCmethod is fast and can provide good initial values for thesubsequent reﬁnement step.

Matches between two point clouds may include outliers.Denote the two coordinates of the i th match as x i ∈ R D and y i ∈ R D , i = 1 , , ...N , where D = x i and y i can be represented by y i = µ ( Rx i + t ) , (1)where R ∈ SO ( D ) is the rotation matrix, t ∈ R D is thetranslation vector and µ ∈ R is the scale factor. In the R1PRNSC algorithm, we randomly select a match o as the control point, then the coordinates of other matcheswith respect to match o are y i − y o = µ R ( x i − x o ) , i = 1 , , ..., N, (2)which suggests that a rigid transform can be represented by R , µ , x o and y o . Because x o and y o are constants, only R and µ are needed to be estimated. According to Ref. [44], R and µ can be obtained from x i and y i , i = 1 , , ...N , by [ U , Σ , V T ] = svd( YX T ) , R = UV T , (3) µ = (cid:107) vector( Y ) (cid:107) / (cid:107) vector( X ) (cid:107) , (4)where X = (cid:2) x − x o · · · x N − x o (cid:3) D × N , (5) Y = (cid:2) y − y o · · · y N − y o (cid:3) D × N . (6)Because match outliers exist and not all match inliers canﬁt into the local rigid transform of match o , the estimation ofthe rigid transformation represented by R and µ may be in-correct. We propose a re-weighting method to dynamicallyupdate the weights of matches i = 1 , , ...N by d i = (cid:107) y i − y o − µ R ( x i − x o ) (cid:107) , (7) w i = min ( H/d i , , (8) OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 4 where d i is the error distance of match i with the estimated R and µ , H is a pre-deﬁned threshold that when the errordistance d i < H , match i is recognized as an inliers.With the obtained weights of matches w i , i = 1 , , ...N ,the related items of matrices X and Y are adjusted ac-cording to the weights. Small weights w i will be assignedto the match inliers that cannot ﬁt into the same rigidtransformation of match o and the match outliers, hence theestimation of R and µ in (3) and (4) will not be interfered.We perform the re-weighting strategy within an iterativeprocess. With a randomly selected match o as the controlpoint, we set the initial values of weights as w i = 1 for allmatches. Then, we alternatively update R and µ accordingto (3) and (4), and update the weights of matches w i , i =1 , , ...N , according to (7) and (8). In practice we ﬁnd thisiteration process only need a small number of iterations toconverge, and we choose the number of iterations 3 in theexperiments.At the end of the iteration process, we recover thetranslation t by t = y o /µ − Rx o , (9)which is not needed by R1P-RNSC for removing outliers,but the subsequent EMDQ algorithm need the estimationresults of rigid transformations represented by R , t and µ to generate the initial dual quaternions. Traditional RANSAC methods aim to ﬁnd the consensusthat is supported by the most number of input data, andother consensuses are simply abandoned. It is difﬁcult touse a single parametric transformation to represent the non-rigid deformation. To solve this problem, we propose to usemultiple rigid transformations. In the R1P-RNSC algorithm,the rigid transformations are obtained in different RANSACtrials with different control points o . Hence, it is necessaryto modify the standard RANSAC methods to reserve the ob-tained rigid transformations and design a new terminationcondition. Within the RANSAC framework, we randomly try differentmatches as the control point o . With a control point o , somematches may have small error distances d i as in (7) and canbe considered as the candidate inliers. Denote the numberof candidate inliers detected with control point o as T o ,we omit the results if T o < T min , where T min ∈ R is athreshold. A small T o suggests that the selected match o maybe an outlier and the results obtained from it is incorrect. If T o ≥ T min , we add the candidate inliers to the ﬁnal inliersdetection results, and update the inliers ratio by γ = N inliers /N, (10)where N inliers is the total number of detected inliers.In our algorithm, the control point o is only selected fromthe matches that are not considered as inliers so far, whichaims to avoid repeatedly ﬁnd the same rigid transformation. The standard termination condition of RANSAC is deter-mined by the number of trials and the largest number ofdetected inliers in one trial, which needs to be modiﬁedfor the R1P-RNSC algorithm. The basic idea of the mod-iﬁed RANSAC termination condition is to terminate theRANSAC process when inliers are unlikely to exist in theremaining matches. T min is the minimal number of inliers detected by onetrial that can be reserved. Assuming that there exist a set of T min inliers in the remaining matches, then the possibilitythat a selected control point o belongs to this set of inliersis T min / ( N − γN ) . Then, the possibility that after k trialsof different control point o , the possibility that the a set of T min inliers is not found is (1 − T min / ( N − γN )) k , whichequals to − p . p is the widely used RANSAC parameter [45],which suggests the desired probability that the RANSACalgorithm provides a useful result after running and in ouralgorithm we set p = 0 . . Hence, the termination of our1-point RANSAC framework is k > log(1 − p )log(1 − T min / ( N − γN )) , (11)where k is the number of RANSAC trials. The R1P-RNSC algorithm computes the rigid transforma-tions by re-weighting all matches. A more efﬁcient wayis to use a small number of sample points N sparse < N to compute the rigid transformations, and then apply theobtained rigid transformations back to all matches to deter-mine inliers and outliers. This sparse implementation is lessaccurate because it may miss possible rigid transformationsto represent the non-rigid deformation, but is faster due thethe smaller number of matches involved in the re-weightingprocess. The proposed R1P-RNSC algorithm is a parametric methodthat is efﬁcient in estimating rigid transformations from thematches when outliers exist. However, R1P-RNSC suffersfrom a problem that it cannot take into account the smooth-ing information. For example, neighboring matches shouldhave similar rigid transformations, but the estimated rigidtransformations by R1P-RNSC are independent with differ-ent control points. Hence the accuracy is limited especiallywhen repeating pattern exists. A typical example is shownin Fig. 2, the repeating pattern may cause the feature pointsof a pattern be incorrectly matched to another pattern. Thesematching outliers are consistent with each other and arerecognized as inliers by R1P-RNSC, because the number ofoutliers that satisfy with the same rigid transformation maylarger than the threshold T min . To solve this problem, wepropose a non-parametric algorithm called as EMDQ as thereﬁnement step. OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 5 (a) (b) (c)

Fig. 2. The R1P-RNSC algorithm cannot take into account local smooth-ing information. A typical example is shown in this ﬁgure. (a) Therepeating pattern lead to a large number of matching outliers, and (b)the results of the R1P-RNSC algorithm misclassiﬁed these matchesas inliers and (c) The subsequent EMDQ step takes into account localsmoothing information and obtains more accurate results.

XPECTATION M AXIMIZATION AND D UAL Q UATERNION (EMDQ)

We propose a non-parametric algorithm called as EMDQ tohandle matching outliers by generating the smooth defor-mation ﬁeld. Denote a deformation ﬁeld as f , f ( x i ) is thedisplacement of x i caused by f . Matching inliers should beconsistent with the deformation ﬁeld f , which suggest y i ≈ f ( x i ) , (12)for inliers.According to (12), the proposed EMDQ algorithm distin-guishes inliers and outliers according to the error distancesbetween y i and f ( x i ) , i = 1 , , ...N . The ability to generatethe smooth deformation ﬁeld f from the matches withoutliers is essential, and we employ dual quaternion to solvethis problem. We assign discrete transforms g i , i = 1 , , ...N to allmatches, and the smooth deformation ﬁeld f is generatedby interpolating among g i . According to (1), a transform g i consists of the rotation R i , the translation t i and the scale µ i , which moves coordinate x i to y i by y i = g i ( x i ) = µ i ( R i x i + t i ) . (13)Then, the value of f at a location is equal to the weightedaverage value of neighboring g i . In this outliers removalproblem, we mainly focus on the values of f at the locationsof the matches x i , i = 1 , , ...N , that is f x i = N (cid:88) j =1 w ij g j , (14) where w ij is the weight that suggests the impact from match j to i . f x i is the value of f at the coordinate x i . Thisweighted average strategy (14) suggests that only when thetransformations of neighboring matches are consistent withthat of a match i , then match i can be recognized as an inlieraccording to (12).However, when performing interpolation according to(14), the use of the rotation matrices and translation vectorsas in (13) may lead to inaccurate results [46]. To overcomethis difﬁculty, we introduce a mathematics tool called asdual quaternion (DQ) [21].A DQ is a 8- dimension vector that can represent a 3Drotation and a 3D translation simultaneously. A 2D trans-form can be considered as a 3D transform that is restrictedat the z = 0 plane, and 4 out of the 8 components of a DQis always zero. To improve the efﬁciency, we only computethe 4 non-zero components in our EMDQ implementationwhen D = 2 . Hence, q i is a 4- or 8- dimension vector for D = i as q i , we have y i = g i ( x i ) = µ i q i ( x i ) , (15)where q i ( x i ) suggests to apply q i to the coordinate x i ∈ R D to obtain a transformed coordinate.According to (14) and (15), the deformation ﬁeld f at thelocation x i is determined by the weighted average value of µ j and q j , that is f x i = ¯ µ i ¯ q i (16)where ¯q i = N (cid:88) j =1 (cid:0) w ij q j (cid:1) / (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N (cid:88) j =1 (cid:0) w ij q j (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) dq , (17) ¯ µ i = N (cid:88) j =1 (cid:0) w ij µ j (cid:1) / N (cid:88) j =1 (cid:0) w ij (cid:1) , (18)where (cid:107)·(cid:107) dq is the norm of the dual quaternion.Many works in the computer graphics ﬁeld have proventhe feasibility and efﬁciency of using the linear combinationof DQ as in (17) to generate smooth interpolations. Forexample, by using DQ-based interpolation from skeletons,the motion of the skin surface of a mesh model can begenerated [21]. However, due to the existence of outliers,the interpolation process (17) and (18) may be incorrect. Oneway to reduce the impacts of outliers is to assign a smallweight w ij when match j is an outlier. Hence, we employ theexpectation maximization (EM) algorithm to dynamicallyupdate the weights w ij in the EM iteration process. The EM algorithm is an iterative process that alternativelyperforms two steps. In the M-step, the probabilities thatmatches are inliers are computed, and the weights w ij arefurther updated. In the E-step, the deformation ﬁeld f isupdated according to the weights w ij . The details of the EMalgorithm is as follows. (1) E-step: OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 6 (a)

10 20 30 40 50 60 70 80 90 100−30−20−100102030−20−10010203040506070 (b)

Fig. 3. Examples to show the uncertainty of smooth interpolation in (a) 2D and (b) 3D spaces respectively. The red and blue lines are the originaland deformed point clouds respectively. The two end points are assigned with multiple different sets of dual quaternions, and other points aremoved according to dual quaternion-based interpolation according to Eq. (17) (in this example we keep the scale µ = 1 ). Although the sets of dualquaternion assigned to the end points result in the same displacements of the end points, the generated deformation of the point clouds may bedifferent. All different deformations are smooth and may exist in reality, hence it is important to eliminate the uncertainty of interpolation and we usethe results of R1P-RNSC to initialize the dual quaternion assigned to the matches. Inspired by [30], we consider the outliers removal prob-lem as a classiﬁcation problem according to the error dis-tances between y i and f ( x i ) , i = 1 , , ...N . With a deforma-tion ﬁeld f , we have p ( i | inlier) = 12 πσ exp( (cid:107) y i − f ( x i ) (cid:107) σ ) (19)where σ is the standard variation.We consider the outliers follow an uniform distribution,that is p ( i | oulier) = a, (20)where a is a constant.Denote γ as the inliers ratio, we have p (inlier) = γ and p (outlier) = 1 − γ . According to Bayes’ theorem, we have p (inlier | i )= p ( i | inlier) p (inlier) p ( i | inlier) p (inlier) + p ( i | outlier) p (outlier)= exp( (cid:107) y i − f ( x i ) (cid:107) / σ )exp( (cid:107) y i − f ( x i ) (cid:107) / σ ) + 2 πσ − γ ) γ a (21) p (inlier | i ) suggests the probability that match i is aninliers. We adjust the weight w ij , which is the inﬂuence ofmatch j to match i , according to p (inlier | j ) . Taking intoaccount the distances between matches, w ij is determinedby two parts, that is w ij = w ij, distance p (inliers | j ) , (22)where w ij, distance is determined by the distance betweenmatch i and j , that is w ij, distance = max (cid:32) exp( (cid:107) y i − y j (cid:107) r ) , exp( (cid:107) x i − x j (cid:107) r ) (cid:33) (23)where r ∈ R is a pre-deﬁned radius. Because w ij, distance doesnot contain variables, we only compute it at the initial phase. (2) M-step: According to the updated weight w ij , we update ¯q i and ¯ µ i for each match i according to (17) and (18) respectively,which represent the value of the deformation ﬁeld f at x i ,or f x i .The standard variation σ is updated by σ = N (cid:88) i =1 (cid:16) p (inlier | i ) (cid:107) y i − f ( x i ) (cid:107) (cid:17) / N (cid:88) i =1 p (inlier | i ) . (24)Finally, we update q i and µ i to keep the transformation g i close to the deformation ﬁeld f x i by ∆ q i = trans2dq(( y i − f ( x i )) / ¯ µ i ) , (25) q i, new = q i, old ∆ q i , (26) µ i, new = ¯ µ i . (27)Equations (25)-(27) aim to keep (15) stands and makethe transforms g i more smooth, and trans2dq represents theconversion of the transform to the DQ representation.The termination condition of the EM algorithm deter-mined by the change p (inlier | i ) , that is N (cid:32) N (cid:88) i =1 (cid:12)(cid:12)(cid:12) p (inlier | i ) ( k ) − p (inlier | i ) ( k − (cid:12)(cid:12)(cid:12)(cid:33) < θ (28)where θ is a threshold. k suggests the value of the k thiteration in the EM algorithm. After termination, matcheswith p (inlier | i ) > p min and y i − f ( x i ) < H are consideredas inliers, where p min is a threshold and H is the samethreshold used in the R1P-RNSC algorithm. There are mainly three variables in the EMDQ algorithmthat need to be initialized, which include the DQ q i , thescale factor µ i and the weights between matches w ij . Accord-ing to (22), w ij is determined by w ij, distance and p (inliers | j ) , OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 7 and w ij, distance is constant in the EM iterations. The initialvalues of the three variables determine the initial deforma-tion ﬁeld f according to (14), and we use the results of R1P-RNSC for the initialization.The proposed R1P-RNSC algorithm detects multiplerigid transformations. If a match i is considered as an inlierwith a rigid transformation detected by R1P-RNSC, then therelated q i and µ i will be initialized according to the results.A match i may satisfy multiple rigid transformations, andwe use the one that has the most number of matches forEMDQ initialization. This initialization is important becauseas the examples shown in Fig. 3, there may exist multiplesmooth deformations between two matches. Hence, if the q i and µ i , i = 1 , , ...N are randomly initialized, thedeformation ﬁeld f may vary because it is generated byinterpolating the q i and µ i of neighboring matches. Hence,to remove this uncertainty, EMDQ relies on the results ofR1P-RNSC for the initialization of q i and µ i , i = 1 , , ...N .The initial values of p (inliers | i ) are important for re-ducing the impact of outliers when generating the ini-tial deformation ﬁeld f . The natural idea is to initialize p (inliers | i ) = 1 if match i is recognized as an inlier byR1P-RNSC, or p (inliers | i ) = 0 . However, we found thatthis initialization cannot handle the repeating pattern (seeFig. 2) very well, because at some local areas, there may bemore false positives than true positives in the R1P-RNSCresults, and the generated deformation ﬁeld f yields falsepositives. We propose a simple method to overcome thisproblem according to the uncertainty of the detected rigidtransformations provided by R1P-RNSC. With match o asthe control point, R1P-RNSC may ﬁnd different number ofinliers, which is denoted as T o . It is intuitive that when T o is large, the results are more reliable. Hence, we initialize p (inliers | i ) = T o if match i is recognized as an inlier withmatch o as the control point. This allows that f is initiallygenerated from the most reliable matches. XPERIMENTS

The performance of the proposed algorithms was evaluatedagainst state-of-the-art matching outliers removal methods.The source code was implemented in MATLAB and exe-cuted on a computer with an Intel Core i7 2.60 GHz CPU.In our MATLAB implementation, we used the vlfeat opensource library [47] for generating the kd-tree to obtain theneighboring matches.The EMDQ algorithm requires R1P-RNSC for initial-ization. Hence, in this section, EMDQ reports the ﬁnalaccuracy and runtime results of the R1P-RNSC and EMDQalgorithms combined, while the sEMDQ reports the resultswhen using the sparse implementation of R1P-RNSC.

For 2D cases, the parameters of the R1P-RNSC algorithmare as follows: H = 20 pixels is the inliers threshold; T min = 5 is the least number of inliers found by one R1P-RNSC trial that will be added to the ﬁnal R1P-RNSC results.We used a large H and a small T min , which aims to ﬁnd asmany rigid transformations as possible to represent the non-rigid deformation. Many outliers may be mis-recognized as inliers by R1P-RNSC, and we mainly rely on the subsequentEMDQ algorithm to remove these mis-recognized outliers.The parameters of the EMDQ algorithms are as follows: r = 50 pixels is the radius to compute w ij, distance betweenmatches; a = 1 e − is used for computing p ( i | outlier) ; p min = 0 . is the threshold for EMDQ to determine inliersand outliers; θ = 0 . is threshold used in the terminationcondition of EMDQ iterations; N neighbor = 16 is the numberof neighboring matches that are considered in the EMDQalgorithm, which are found by kd-tree.The 2D experiments were mainly conducted on the DTUrobot image data [48], which provides images and therelated 3D point cloud obtained by structured light scan.The camera was fully calibrated and the true values of thecamera positions and orientations are known. Images havea resolution of × . Datasets numbered 1 to 30 wereused. In each dataset, images were captured under 19 differ-ent illumination situations and from 119 camera positions.We randomly selected 10 out of 19 illumination situations.Hence, a total of × ×

119 = 35700 images wereincluded in this evaluation. Following the instruction, foreach dataset and illumination situation, we used the imagenumbered 25 as the reference image and performed SURFmatching between the reference image and other images. Byre-projecting the 3D point clouds to the images accordingto the true values of camera positions and orientations, weconsider matches that have a re-projection error smaller than10 pixels as inliers.The outliers handling methods for comparisons includeAHC [10], LPM [49], LMR [20], SIM [9] and VFC [19]. All theintroduced methods were proposed in recent years and canreﬂect the state-of-the-art performances. It is worth notingthat we did not introduce the traditional RANSAC + 3or 4 control points methods for comparison because theycannot handle the non-rigid cases, which is the main focusof this paper. In our experiments, we did not change theparameters of these methods. VFC has a fast version and asparse version, which are referred to as fVFC and sVFC inthis section. AHC is a very fast method for removing thematching outliers, which focuses on × projective trans-formations, namely planar homographies. However, for theDTU dataset, the images were captured with rotating 3Dobjects, hence the planar homographies assumption may notstand. LMR is based on machine learning to solving the two-class classiﬁcation problem, and we use the trained neuralnetwork model provided by the authors as the classiﬁer inthis experiments.As shown in Fig.4, the total number of matches and thepercentage of outliers varied with objects, illumination sit-uations and camera poses. Although clear comparisons re-quire that only one factor be different, this kind of variable-controlling is difﬁcult for the evaluation on real-world databecause SURF matching results are unpredictable. In exper-iments we found that the accuracy of outliers removal al-gorithms was mainly affected by the percentage of outliers,and the runtime was mainly affected by the total number ofmatches. Therefore, in this section, we report the accuracyand runtime evaluation results by comparing the statistical

1. http://roboimagedata.compute.dtu.dk/

OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 8

Fig. 4. Samples of the results of EMDQ on 2D DTU images. First and second columns: the two image pairs with SURF features, the yellow and blacklines are identiﬁed as inliers and outliers by EMDQ respectively. Third column: the generated deformation ﬁeld on grid sample points, which areobtained by taking the weighted average of neighboring q and µ of the match inliers. We only show the deformation ﬁeld at locations that are closeto match inliers, which suggests lower uncertainty. The grid points were sampled with a step of 50 pixels, and the generation of the deformationﬁeld on the grid points took less than 5 ms. results at each percentage of outliers and range of totalnumber of matches respectively.For accuracy evaluation, we report four metrics, whichinclude the number of errors, F - score , recall and precision .The number of errors suggests the total number of dif-ferences between the ground truth and the inliers-outliersrecognition results. F - score is widely used for statisticalanalysis of binary classiﬁcation, and F - score = 2 Recall · P recision/ ( Recall + P recision ) . As shown in Fig. 5, theproposed EMDQ method has the best accuracy in terms ofnumber of errors and the F - score . sEMDQ did not showan obvious decrease in accuracy compared with EMDQ.Because we set a non-restrict threshold for 1P-RNSC, the recall rate of 1P-RNSC was high but the precision rate waslow, which results in a comparable F-score with AHC, LPMand LMR. fVFC and sVFC had very similar results hence wecombined them in Fig. 5. When the inliers ratio was high, the recall rate of fVFC/sVFC was lower than EMDQ/sEMDQbut the precision rate was slightly higher. As the inliersratio decreased, the precision rate of fVFC/sVFC droppedsigniﬁcantly, which decreased their F - score signiﬁcantlyalthough the recall rate remained high.As shown in Fig. 6, sEMDQ had a comparable runtimewith sVFC. AHC was fastest because it is based on anassumption that the object is planar, namely the planarhomographies problem. This assumption makes AHC notsuitable for solving the general outliers removal problem.Hence as shown in Fig. 5, the accuracy of AHC was lower than EMDQ and VFC. R1P-RNSC was also very fast.The DTU dataset contains large data with different il-luminations and object motion, hence it is sufﬁcient forthe evaluation of the proposed algorithms. Although theobjects in the DTU dataset are rigid, the displacementsof corresponding feature points on the images have largenon-rigid deformation, as shown in Fig. 4. In addition tothe experiments on the DTU dataset, we also conductedqualitative experiments as shown in Fig. 7, which includelaparoscopic images of soft tissue undergoing non-rigiddeformation due to physiological motions such as breathing,and tissue manipulation. The generated deformation ﬁeldsby EMDQ are smooth and physically realistic. Most parameters for 3D cases are the same as that of 2Dcases. However, because the scale of the input 3D pointcloud may vary, we ﬁrst developed a method to obtain thescale s to improve the adaptivity of the proposed methods,that is s = (cid:118)(cid:117)(cid:117)(cid:116) N (cid:32) N (cid:88) i =1 ( x i − ¯x ) + N (cid:88) i =1 ( y i − ¯y ) (cid:33) , (29)where ¯ x and ¯ y are the mean coordinates of x i and y i , i = 1 , , ...N , respectively. Certain parameters are adjustedfor the 3D case as follows: H = 0 . s , r = 0 . s and a = 20 /s . OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 9 -

95 95 -

90 90 -

85 85 -

80 80 -

75 75 -

70 70 -

65 65 -

60 60 -

55 55 -

50 50 -

45 45 -

40 40 -

35 35 -

30 30 -

25 25 -

20 20 - inliers ratio (%) nu m be r o f e rr o r s AHCLPMLMRSIMfVFC/sVFCR1P-RNSCEMDQsEMDQ (a) -

95 95 -

90 90 -

85 85 -

80 80 -

75 75 -

70 70 -

65 65 -

60 60 -

55 55 -

50 50 -

45 45 -

40 40 -

35 35 -

30 30 -

25 25 -

20 20 - inliers ratio (%) F - sc o r e AHCLPMLMRSIMfVFC/sVFCR1P-RNSCEMDQsEMDQ (b) -

95 95 -

90 90 -

85 85 -

80 80 -

75 75 -

70 70 -

65 65 -

60 60 -

55 55 -

50 50 -

45 45 -

40 40 -

35 35 -

30 30 -

25 25 -

20 20 - inliers ratio (%) r e c a ll AHCLPMLMRSIMfVFC/sVFCR1P-RNSCEMDQsEMDQ (c) -

95 95 -

90 90 -

85 85 -

80 80 -

75 75 -

70 70 -

65 65 -

60 60 -

55 55 -

50 50 -

45 45 -

40 40 -

35 35 -

30 30 -

25 25 -

20 20 - inliers ratio (%) p r e c i s i on AHCLPMLMRSIMfVFC/sVFCR1P-RNSCEMDQsEMDQ (d)

Fig. 5. Accuracy results with DTU data (2D experiments). (a) Number of errors. (b) F - score . (c) Recall . (d)

P recision . Instead of normalizing the point cloud, we adjust the pa-rameters according to scale s to obtain the deformation ﬁeld f that can be directly used. In practice, we found that moreneighboring matches are needed for 3D cases, hence we set N neighbor = 50 .We conducted the 3D experiments on medical data. Asshown in Fig. 8, the 3D models were generated by a stereomatching method [50] from stereo laparoscopy videos. Weperformed ORB feature matching [5] on the related 2Dimages and re-projected the feature points to the 3D models.The accuracy comparison results are in Table. 1, whichdemonstrate that (s)EMDQ has the best accuracy. Comparedwith the results in Fig. 8, the F - score is signiﬁcantly higherin these experiments when the inliers ratio is low. This isbecause in these experiments, the modiﬁed ORB feature de-tection method can detect greater number of feature pointsthan SURF, hence it is easier to generate the deformation ﬁeld. The total number of matches are 629, 1784 and 693 forthe three cases respectively, and the runtime of EMDQ are53, 157 and 93 ms respectively. ONCLUSION

We proposed two novel algorithms for solving the matchingoutliers removal problem, which are called as R1P-RNSCand EMDQ. The two algorithms compensate for the draw-backs of each other. Speciﬁcally, R1P-RNSC is fast to extracthidden rigid transforms from matches with outliers, but itdoes not take into account smoothing information hence itsaccuracy is limited. EMDQ is based on dual quaternion-based interpolation to generate the smooth deformationﬁeld and have high accuracy, but it relies on the resultsof R1P-RNSC for initialization. The combination of R1P-RNSC and EMDQ achieves the best accuracy in terms of thetotal number of errors and F - score compared with other OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 10 case (a) (inliers ratio = 76%) case (b) (inliers ratio = 39%) case (c) (inliers ratio = 16%) recall precision F - score recall precision F - score recall precision F - score LPM 0.93 0.91 0.92 0.91 0.92 0.91 0.90 0.92 0.91LMR 0.93 0.97 0.95 0.90 0.95 0.92 0.82 0.92 0.87SIM 0.97 0.67 0.79 0.98 0.43 0.60 0.97 0.32 0.48VFC/sVFC 0.99 0.94 0.96 0.99 0.96 0.97 0.99 0.94 0.96R1P-RNSC 0.99 0.95 0.97 0.99 0.90 0.94 0.99 0.92 0.95EMDQ 0.96 0.98 0.97 1.00 0.96 sEMDQ 0.97 0.98

TABLE 1Accuracy results of the soft tissue experiments as shown Fig. 8.

200 400 600 800 1000 1200 1400 1600 number of points r un t i m e ( m s ) AHCLPMLMRSIMfVFCsVFCR1P-RNSCEMDQSEMDQ

Fig. 6. Runtime results with DTU data (2D experiments). state-of-the-art methods. The proposed method is amongthe fastest methods for 2D cases, and can achieve real-timeperformance for 3D cases.The proposed algorithms can be considered as a methodto generate dense deformation ﬁeld from sparse matcheswith outliers, which suggests that the displacements of allpixels can be obtained from the sparse matches (if un-certainty is not considered). Hence, the algorithms havepotential to be used in many applications, such as non-rigid image registration. However, because the deformationﬁeld is generated by interpolating among the sparse featurematches, the uncertainty problem needs to be further stud-ied. A CKNOWLEDGMENTS

This work was supported by the National Institute ofBiomedical Imaging and Bioengineering of the NationalInstitutes of Health through Grant Numbers K99EB027177,R01EB025964, R01DK119269, P41EB015898, and a ResearchGrant from Siemens-Healthineers USA. Unrelated to thispaper, Jayender Jagadeesan owns equity in Navigation Sci-ences, Inc. He is a co-inventor of a navigation device to assistsurgeons in tumor excision that is licensed to NavigationSciences. Dr. Jagadeesan interests were reviewed and aremanaged by BWH and Partners HealthCare in accordancewith their conﬂict of interest policies. R EFERENCES [1] R. Mur-Artal and J. D. Tard´os, “Orb-slam2: An open-source slamsystem for monocular, stereo, and rgb-d cameras,”

IEEE Transac-tions on Robotics , vol. 33, no. 5, pp. 1255–1262, 2017.[2] J. L. Schonberger and J.-M. Frahm, “Structure-from-motion revis-ited,” in

CVPR , 2016, pp. 4104–4113.[3] A. Sotiras, C. Davatzikos, and N. Paragios, “Deformable medicalimage registration: A survey,”

IEEE Transactions on Medical Imag-ing , vol. 32, no. 7, p. 1153, 2013.[4] H. Bay, T. Tuytelaars, and L. Van Gool, “Surf: Speeded up robustfeatures,” in

ECCV , 2006, pp. 404–417.[5] E. Rublee, V. Rabaud, K. Konolige, and G. R. Bradski, “Orb: Anefﬁcient alternative to sift or surf.” in

ICCV , vol. 11, 2011, p. 2.[6] K. M. Yi, E. Trulls, V. Lepetit, and P. Fua, “Lift: Learned invariantfeature transform,” in

ECCV , 2016, pp. 467–483.[7] M. Muja and D. G. Lowe, “Scalable nearest neighbor algorithmsfor high dimensional data,”

IEEE Transactions on Pattern Analysisand Machine Intelligence , vol. 36, no. 11, pp. 2227–2240, 2014.[8] T.-J. Chin, Y. Heng Kee, A. Eriksson, and F. Neumann, “Guar-anteed outlier removal with mixed integer linear programs,” in

CVPR , 2016, pp. 5858–5866.[9] Y. Lin, Z. Lin, and H. Zha, “The shape interaction matrix-basedafﬁne invariant mismatch removal for partial-duplicate imagesearch,”

IEEE Transactions on Image Processing , vol. 26, no. 2, pp.561–573, 2016.[10] Y. Zheng and Z. Lin, “The augmented homogeneous coordinatesmatrix-based projective mismatch removal for partial-duplicateimage search,”

IEEE Transactions on Image Processing , vol. 28, no. 1,pp. 181–193, 2018.[11] R. Deriche, Z. Zhang, Q.-T. Luong, and O. Faugeras, “Robustrecovery of the epipolar geometry for an uncalibrated stereo rig,”in

ECCV . Springer, 1994, pp. 567–576.[12] M. A. Fischler and R. C. Bolles, “Random sample consensus: aparadigm for model ﬁtting with applications to image analysisand automated cartography,”

Communications of the ACM , vol. 24,no. 6, pp. 381–395, 1981.[13] O. Chum and J. Matas, “Matching with prosac-progressive sampleconsensus,” in

CVPR , vol. 1, 2005, pp. 220–226.[14] R. Raguram, J.-M. Frahm, and M. Pollefeys, “A comparative anal-ysis of ransac techniques leading to adaptive real-time randomsample consensus,” in

ECCV , 2008, pp. 500–513.[15] O. Chum and J. Matas, “Optimal randomized ransac,”

IEEE Trans-actions on Pattern Analysis and Machine Intelligence , vol. 30, no. 8,pp. 1472–1482, 2008.[16] X. Li and Z. Hu, “Rejecting mismatches by correspondence func-tion,”

IJCV , vol. 89, no. 1, pp. 1–17, 2010.[17] X. Li, M. Larson, and A. Hanjalic, “Pairwise geometric matchingfor large-scale object retrieval,” in

CVPR , 2015, pp. 5153–5161.[18] T. Probst, A. Chhatkuli, D. Pani Paudel, and L. Van Gool, “Model-free consensus maximization for non-rigid shapes,” in

ECCV , 2018,pp. 117–133.[19] J. Ma, J. Zhao, J. Tian, A. L. Yuille, and Z. Tu, “Robust pointmatching via vector ﬁeld consensus,”

IEEE Transactions on ImageProcessing , vol. 23, no. 4, pp. 1706–1721, 2014.[20] J. Ma, X. Jiang, J. Jiang, J. Zhao, and X. Guo, “Lmr: Learning a two-class classiﬁer for mismatch removal,”

IEEE Transactions on ImageProcessing , 2019.[21] L. Kavan, S. Collins, J. ˇZ´ara, and C. O’Sullivan, “Skinning withdual quaternions,” in

Symposium on Interactive 3D graphics andgames . ACM, 2007, pp. 39–46.

OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 11

Fig. 7. Qualitative 2D experiments results for the EMDQ. First and second rows: the two image pairs with SURF features, the yellow and blacklines are identiﬁed as inliers and outliers by EMDQ respectively. Third row: the generated deformation ﬁeld. The last two data are soft tissues withnon-rigid deformation. (a) (b) (c)

Fig. 8. 3D experiments with soft tissue. The blue and black lines are identiﬁed as inliers and outliers by EMDQ respectively. (a) A silicon phantomthat simulated heartbeat. (b) In vivo porcine abdomen, where the deformation was caused by instrument interaction. (c) A phantom with the textureof the lung surface, which has large deformation. [22] J. Maciel and J. P. Costeira, “A global solution to sparse correspon-dence problems,”

IEEE Transactions on Pattern Analysis and MachineIntelligence , no. 2, pp. 187–199, 2003.[23] Y. HaCohen, E. Shechtman, D. B. Goldman, and D. Lischinski,“Non-rigid dense correspondence with applications for imageenhancement,”

ACM Transactions on Graphics , vol. 30, no. 4, p. 70,2011.[24] G. K. Tam, Z.-Q. Cheng, Y.-K. Lai, F. C. Langbein, Y. Liu, D. Mar-shall, R. R. Martin, X.-F. Sun, and P. L. Rosin, “Registration of 3dpoint clouds and meshes: A survey from rigid to nonrigid,”

IEEETransactions on Visualization and Computer Graphics , vol. 19, no. 7,pp. 1199–1217, 2012.[25] P. J. Besl and N. D. McKay, “Method for registration of 3-d shapes,”

IEEE Transactions on Pattern Analysis and Machine Intelligence ,vol. 14, pp. 239–256, 1992.[26] Z. Zhang, “Iterative point matching for registration of free-formcurves and surfaces,”

IJCV , vol. 13, no. 2, pp. 119–152, 1994.[27] A. Segal, D. Haehnel, and S. Thrun, “Generalized-icp.” in

Robotics:Science and Systems , vol. 2, no. 4. Seattle, WA, 2009, p. 435.[28] R. A. Newcombe, D. Fox, and S. M. Seitz, “Dynamicfusion: Recon-struction and tracking of non-rigid scenes in real-time,” in

CVPR , 2015, pp. 343–352.[29] H. Chui and A. Rangarajan, “A new point matching algorithm fornon-rigid registration,”

Computer Vision and Image Understanding ,vol. 89, no. 2-3, pp. 114–141, 2003.[30] A. Myronenko and X. Song, “Point set registration: Coherentpoint drift,”

IEEE Transactions on Pattern Analysis and MachineIntelligence , vol. 32, no. 12, pp. 2262–2275, 2010.[31] H. Deng, T. Birdal, and S. Ilic, “Ppfnet: Global context aware localfeatures for robust 3d point matching,” in

CVPR , 2018, pp. 195–205.[32] G. Elbaz, T. Avraham, and A. Fischer, “3d point cloud registrationfor localization using a deep neural network auto-encoder,” in

CVPR , 2017, pp. 4631–4640.[33] D. G. Lowe, “Distinctive image features from scale-invariant key-points,”

IJCV , vol. 60, no. 2, pp. 91–110, 2004.[34] J.-M. Morel and G. Yu, “Asift: A new framework for fully afﬁneinvariant image comparison,”

SIAM Journal on Imaging Sciences ,vol. 2, no. 2, pp. 438–469, 2009.[35] P. Indyk and R. Motwani, “Approximate nearest neighbors: To-wards removing the curse of dimensionality,” in

ACM symposiumon Theory of computing . ACM, 1998, pp. 604–613.

OURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 12 [36] J. Bian, W.-Y. Lin, Y. Matsushita, S.-K. Yeung, T.-D. Nguyen, andM.-M. Cheng, “Gms: Grid-based motion statistics for fast, ultra-robust feature correspondence,” in

CVPR , 2017, pp. 4181–4190.[37] W.-Y. Lin, F. Wang, M.-M. Cheng, S.-K. Yeung, P. H. Torr, M. N.Do, and J. Lu, “Code: Coherence based decision boundaries forfeature correspondence,”

IEEE Transactions on Pattern Analysis andMachine Intelligence , vol. 40, no. 1, pp. 34–47, 2017.[38] J. Yan, M. Cho, H. Zha, X. Yang, and S. M. Chu, “Multi-graphmatching via afﬁnity optimization with graduated consistencyregularization,”

IEEE Transactions on Pattern Analysis and MachineIntelligence , vol. 38, no. 6, pp. 1228–1242, 2015.[39] M. Cho and K. M. Lee, “Mode-seeking on graphs via randomwalks,” in

CVPR , 2012, pp. 606–613.[40] M. Leordeanu and M. Hebert, “A spectral technique for correspon-dence problems using pairwise constraints,” in

ICCV , vol. 2, 2005,pp. 1482–1489.[41] L. Torresani, V. Kolmogorov, and C. Rother, “Feature correspon-dence via graph matching: Models and global optimization,” in

ECCV , 2008, pp. 596–609.[42] H. Liu and S. Yan, “Common visual pattern discovery via spatiallycoherent correspondences,” in

CVPR , 2010, pp. 1609–1616.[43] M. Cho, J. Lee, and K. M. Lee, “Feature correspondence anddeformable object matching via agglomerative correspondenceclustering,” in

ICCV , 2009, pp. 1280–1287.[44] K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares ﬁttingof two 3-d point sets,”

IEEE Transactions on Pattern Analysis andMachine Intelligence , no. 5, pp. 698–700, 1987.[45] “Random sample consensus.” [Online]. Available: https://en.wikipedia.org/wiki/Random sample consensus[46] K. Shoemake, “Animating rotation with quaternion curves,” in

ACM SIGGRAPH , vol. 19, no. 3. ACM, 1985, pp. 245–254.[47] A. Vedaldi and B. Fulkerson, “Vlfeat: An open and portable libraryof computer vision algorithms,” in

ACM International Conference onMultimedia . ACM, 2010, pp. 1469–1472.[48] H. Aanæs, A. L. Dahl, and K. S. Pedersen, “Interesting interestpoints,”

IJCV , vol. 97, no. 1, pp. 18–35, 2012.[49] J. Ma, J. Zhao, J. Jiang, H. Zhou, and X. Guo, “Locality preservingmatching,”

IJCV , vol. 127, no. 5, pp. 512–531, 2019.[50] H. Zhou and J. Jagadeesan, “Real-time dense reconstruction oftissue surface from stereo optical video,”