Rigid Point Registration with Expectation Conditional Maximization
11 Abstract—
This paper addresses the issue of matching rigid 3D object points with 2D image points through point registration based on maximum likelihood principle in computer simulated images. Perspective projection is necessary when transforming 3D coordinate into 2D. The problem then recasts into a missing data framework where unknown correspondences are handled via mixture models. Adopting the Expectation Conditional Maximization for Point Registration (ECMPR), two different rotation and translation optimization algorithms are compared in this paper. We analyze in detail the associated consequences in terms of estimation of the registration parameters theoretically and experimentally.
Index Terms—
Point registration, expectation maximization, perspective projection I NTRODUCTION
EGISTRATION is the process of transforming different sets of data into the same coordinate system, with wide application in medical image analysis [1-3], human pose estimation [4], etc. Point registration (PR) is frequently met in image analysis and computer vision to find an optimal alignment between two sets of points. This problem can be separated by two processes, namely, 1) Find point-to-point correspondences and 2) estimate the transformation allowing the alignment of the two sets. Existing solvers to PR problem can be roughly divided into three categories: Iterative Closest Point (ICP) algorithm [5-6], soft assignment methods [7-8] and probabilistic methods [9-10]. ICP uses sampling processes, either deterministic or based on heuristics by outlier rejection. Although ICP is attractive for its efficiency, it can be easily trapped into local minima, which makes it sensitive to both initialization and the choice of a threshold for match acceptance. An alternative of ICP is soft assignments within a continuous optimization frame work. However, this algorithm needs optimal entries for assignment matrix, M, and satisfy the constraints on M, thus providing one-to-one assignments for inliers and many-to-one assignments for outliers. As a consequence, the convergence properties are not guaranteed in the presence of outliers. In this paper, a probabilistic point registration is used with Gaussian mixture models (GMMs). The point-to-point assignment is to find the missing data with maximum likelihood. The algorithm of choice is the expectation conditional maximization (ECMPR) algorithm [9] to solve the problem of registering 3D data points on object with 2D data points on image plane. The challenge of this problem is to transform the 3D data into 2D by perspective projection. When the human eye looks at a scene, objects in the distance appear smaller than objects close by - this is known as perspective. While orthographic projection ignores this effect to allow accurate measurements, perspective definition shows distant objects as smaller to provide additional realism [11]. The difference between this paper and [9] is that there is inadequate observation data rather than redundant ones, thus there are no outlier points in this paper. This paper has the following contributions: • Perspective projection is used to transform 3D data points into 2D image plane. The camera's position, orientation, and field of view control the behavior of the projection transformation. Thus, four transformation matrices, namely, scaling, perspective projection, rotation, and translation, should be taken into consideration. • Two different algorithms for rotation matrix and translation vector optimization are used in this paper. The first algorithm is traversal scan that searches the whole angle and distance space to get the optimal solution, which is accurate but time consuming. The second algorithm is based on least-squares estimation [10], which is efficient but is less accurate. The performance of these two algorithms is compared by experimental results in Section 3. • The impact of initial conditions, model point radius, added Gaussian noise and the use of anisotropic covariance and isotropic ones are discussed in detail in the results part of the paper. The remainder of this paper is organized as follows: In Section 2, the PR problem is formed and ECMPR algorithm is listed for rigid point sets registration. In Section 3 shows the experimental results with testing data.. And Section 4 is the conclusions. P ROBLEM F ORMULATION
Mathematical Notations
Two data sets, namely 3D brace points and 2D image points, will be considered in this paper. We denote Y j j m Y the 2D coordinates of a set of observed image data Jing Wu1
Rigid Point Registration with Expectation Conditional Maximization
R I points and X i i n X the 3D coordinates of a set of model points of brace bead. The object is to find correspondence of observed data Y to model data X . The correspondence is denoted by Z as missing data where Z j i means the j th point in Y is corresponding to i th point in X . The 3D model points lie in global coordinate as shown is Fig 1 ; the 2D image points on the image plane in image coordinate; while the camera has its camera coordinate, as shown in
Fig. 2 . Hence, a coordinate transformation is necessary to compare points in the same coordinate system. The image coordinate can be easily transformed to the camera coosdrdinate by moving the origin of image to that of the camera in xy plane while its z axis is assigned as focal distance in camera coordinate. The next task is to register global coordinate to the camera coordinate. In this paper, a perspective projection is considered [12]. Perspective projection is to show distant objects as smaller to provide additional realism. The perspective projection is a matrix built from four component homogenous matrices, PSRT where P is perspective matrix that is responsible for fore-shortening; S is scale matrix that adjusts for aspect ratio; R is rotation matrix that determines which direction to look at and T is translation matrix that determines where the camera is. The coordinate transformation is as follows
11 12 1221 22 2331 32 33 ' (1)0 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 00 0 1 1 0 0 0 1 0 0 0 1 x PSRTxf s r r rf s r r rf s r r r t xt xt x where x is the 3D points (x ,x ,x ) that is converted to 4x1 matrix by adding a one (homogeneous points) at the end of the vector. f is focal distance (f=40 inch), s is scale coefficient (s=1.017mm/pixel), r ij is elements of rotation matrix while t , t , t are translation in x, y and z axis respectively. Since f, s are constant, equation (1) can be converted to ( ; ) ( ; )'( ; ) (3,1) i i ii x Rx tfs xx (2) where : , R t , R is rotation matrix and t is translation vector. : is a mapping from 3D coordinate to 2D. We will refer to the parameter vector as the registration parameters and a superscripted * denotes the optimized value of that parameter. The distance used in this paper is Mahalanobis distance, denoted by ( ) ( ) T X Y X Y X Y where is a symmetric positive definite matrix. Fig. 2 Perspective projection
Fig.1 3D data of bead on the brace
Expectation conditional maximization for point Registration
In this paper, we will optimize the parameters by expectation conditional maximization such that the center of observed data is constrained to coincide with the transformed model points '( ; ) i x μ . Suppose X and Y follows Gaussian distribution, the expected observed data log likelihood is a function of both the registration parameters and the covariance matrices: E ( , ... | n Y , Z ) [log ( Z E P Y , Z ; , ,... ) | n Y ] (3) A powerful method for finding maximal likelihood solutions in the presence of hidden variables is to replace the observed-data log-likelihood with the complete-data log-likelihood and to maximize the expected complete-data likelihood conditioned by the observed data. The criterion to be maximized becomes E ( , ... | n Y , Z ) ( | , , ) Z P Z Y
21 1 qi m n qji j i ij i Y μ X Σ (4)
Where qji is posterior probability described by ii i j iji n i j i Dk Y μ XY μ X (5) where D corresponds to the outlier component in the case of 3D point registration: D r Thus the parameters to be optimized are
21 1 1 qi m nq qji j ij i Y μ X (6) And ( '( , ))( '( , )) m n q Tji j i j ij iq m n qjij i
Y μ X Y μ XΣ (7)
With the virtual observation W i and its weight i : mi ji jji W Y (8) mi jij (9) The ECMPR-rigid algorithm Initialization : Set , q q R I t 0 . Choose the initial covariance matrices , q Σ I Coordinate Transformation : project the X i into the camera coordinate by rotation, translation, perspective projection, and scaling by (2) E-Step : Evaluate the posteriors qji from (5), qi from (8), qi W from (7) using the current parameters , and q q qi R t Σ . When calculating qi W , an infinity check is necessary because the number of points in Y j is less than that in X i . This leads to infinity in the columns of qi W which dose not have a corresponding point in Y j and i =0 for the points that have no correspondence to the observed points. 4 CM -steps: (two different searching methods are used in this paper, thus they are listed separately in a and b) a Traversal: i. Set the object function as
21 1 qi m n qji j ij i J Y μ X
Search optimal R * to minimize J by using the current t q in the range of [0, 2 ] . ii. Search optimal t * to minimized J by using the updated R q+1 in the range of [-1000,1000]. iii. Estimate the new covariances from (7) with the current posteriors, the new rotation matrix and the new translation vector. b Least-Squares Estimation [10]: i. Calculate the correspond between X i and Y j using the current posterior and re-arrange Y j accordingly ii. Calculate the SVD of
Txy Σ UDV and if det( ) 0diag(1,1,...,1, 1) if det( )<0 xyxy
I ΣS Σ
Where
11 22 1 22 11 nx ii ny iinx i xi ny i yi n Txy i y i xi nnnnn μ xμ yσ x μσ y μΣ y μ x μ and n is the number of points iii.
When rank ( ) 1 xy m Σ (m is the dimension of the data, and here m=3), the optimal transformation parameters are determined as follows Ty x c R USVt μ Rμ (10) iv.
Estimate the new covariance matrix ( q ) from (7) with the current posteriors, the new rotation matrix and the new translation vector. 5 Convergence:
Compare the new and current rotations. If
21 6 q q R R , then go to the Classification step. Else, set the current parameter values to their new values and return to the Step 2. 6
Classification:
Assign each observation to a model point based on the maximum a posteriori (MAP) principle: arg max qj jii z E XPERIMENTAL R ESULTS
We carried out several experiments with ECMPR-rigid algorithm on the platform of Matlab 7.11.0 (R 2010b). First, two different R, t searching algorithm was applied to testing data to assess the performance of the method as summarized in
Table I and Fig.3 . Then, real data point from both anteroposterior and lateral X-ray images were calculated by the ECMPR-rigid algorithm. The results are shown in
Table II-IV and Fig.4-7 . For the testing data, we considered 14 model points, corresponding to the clusters’ centers in the mixture model, as well as 14 observations generated from the model points by, perspective projection, scaling, rotation and translation and corrupted by noise. The model points are rotated by -20, 20 10 degrees in x, y, z axis respectively and then translated by a vector [100, -400, 200]. The projection factor is f=1016 mm and scaling factor is s=0.175 mm/pixel. We simulated both noise-free cases in the second and forth row and Gaussian noise case in the first and third row. This noise is centered at each model point and is drawn from 1D Gaussian probability distributions with variance of 1 along each dimension. The initial parameters are rotation angle as identity matrix and translation vector as null vector. Two different optimization algorithm for R, t searching was studied, namely, traversal and least-squares estimation(LSE). The percentage of correct matches is defined by the number of observations that were correctly classified over the total number of observations. This classification is based on the MAP principle: each observation j is assigned to the cluster i such that arg max ( ) i ji j . It can be shown from Table 1 that LSE uses much less time than the traversal algorithm. Moreover, LSE needs less iterate to reach a much smaller convergence threshold than the traversal algorithm. The matching errors are the same for both algorithms. But the traversal algorithm uses big the searching step it will have larger matching error than the LSE. The shortcoming of LSE, however, is its accuracy. Additionally, we performed a large number of trials with ECMPR-rigid traversal algorithm in the anisotropic covariance case (first row in Table 1). The model points are rotated with an angle that varies between 0 and 180 degrees. Fig. 3 show the percentage of correct matches, the relative error in rotation, and the relative error in translation as a function of the ground-truth rotation angle between the sets of data and the model points. The plotted curves correspond to the mean values and the variances computed over each rotation. T ABLE I C OMPARISON BETWEEN T WO O PTIMIZATION A LGORITHM
Algorithm Simulated noise Covariance model Iteration number Traversal anisotropic anisotropic 10 Traversal - anisotropic 2 Least-Squares Estimation anisotropic anisotropic 2 Least-Squares Estimation - anisotropic 2 Algorithm Error in rotation Process time Correct match% Traversal 10 -6 -6 -10 -10 (a) (b) (c) Fig. 3 Statics obtained with ECMPR-rigid over four trials. The percentage of correct matches and relative errors in rotation and translation are shown as a function of the ground-truth rotation angle between the set of data points and the set of model points. The three curves correspond to the means (central curves) and to the means +/- the standard deviation (upper and lower cures) computed over 4 trials (a) Correct matches. (b) Error in rotation. (c) Error in translation. C ONCLUSIONS
In this paper, we address the problem of perspective projection from 3D coordinate to 2D coordinate and matching rigid shapes through robust point registration. The proposed approach uses maximum likelihood with hidden variables for model based clustering. We used variant EM algorithm- E-CM, to maximize the expected complete-data log-likelihood and preserving the convergence properties of EM. We compared two different algorithms namely, traversal and least-squares estimation, for rotation matrix and translation vector optimization. Experimental results show robust of the proposed algorithm b adding Gaussian noise in conjunction with point registration. By comparing the two R, t optimization algorithms, the traversal algorithm shows accuracy while lack of efficiency and least-squares estimation is vice-versa. In the future, application of ECMPR to medical image registration [2-3], HCI in mobile device [13-14], human pose estimation [4] will be explored. R EFERENCES [1] M D ING , S A NTANI , S J AEGER , Z X UE , S C ANDEMIR , M K OHLI , G T HOMA , "L OCAL - GLOBAL C LASSIFIER F USION FOR S CREENING C HEST R ADIOGRAPHS ", IN SPIE M EDICAL I MAGING , [2] J ING W U , E MAM E LHAK A BDEL -F ATAH , M OHAMED R. M AHFOUZ , "F ULLY AUTOMATIC INITIALIZATION OF TWO - DIMENSIONAL – THREE - DIMENSIONAL MEDICAL IMAGE REGISTRATION USING HYBRID CLASSIFIER ," J OURNAL OF M EDICAL I MAGING (2 J UNE W U , J ING , "2D-3D R EGISTRATION OF K NEE J OINT FROM S INGLE P LANE X- RAY F LUOROSCOPY U SING N ONLINEAR S HAPE P RIORS . " P H D DISS ., U NIVERSITY OF T ENNESSEE , M D ING , G F AN , "A RTICULATED AND G ENERALIZED G AUSSIAN K ERNEL C ORRELATION FOR H UMAN P OSE E STIMATION ", IN IEEE T RANSACTIONS ON I MAGE P ROCESSING , P. J. BESL
AND N. D. MCKAY, "A METHOD
FOR
REGISTRATION OF SHAPES,"
IEEE
TRANSACTIONS ON PATTERN
ANALYSIS
AND
MACHINE
INTELLIGENCE,
VOL.
PP.
FEB Y. ZHANG, "ITERATIVE
POINT
MATCHING
FOR
REGISTRATION OF FREE-FORM
CURVES
AND
SURFACES " INTERNATIONAL
JOURNAL OF COMPUTER
VISION,
VOL.
PP.
OCT
RANGARAJAN, ET AL., "THE
SOFTASSIGN
PROCRUSTES
MATCHING
ALGORITHM," IN INFORMATION
PROCESSING IN MEDICAL
IMAGING.
VOL. J. DUNCAN
AND G. GINDI,
EDS.,
ED,
PP. L. CHUI
AND A. RANGARAJAN, "A NEW
POINT
MATCHING
ALGORITHM
FOR
NON-RIGID
REGISTRATION,"
COMPUTER
VISION
AND
IMAGE
UNDERSTANDING,
VOL.
PP.
FEB-MAR
HORAUD, ET AL., "RIGID
AND
ARTICULATED
POINT
REGISTRATION
WITH
EXPECTATION
CONDITIONAL
MAXIMIZATION,"
IEEE
TRANSACTIONS ON PATTERN
ANALYSIS
AND
MACHINE
INTELLIGENCE,
VOL.
PP.
MAR [10] S G E , G F AN , M D ING , " N ON - RIGID P OINT S ET R EGISTRATION WITH G LOBAL -L OCAL T OPOLOGY P RESERVATION ", IN CVPR W ORSHOP , S. UMEYAMA, "LEAST-SQUARES
ESTIMATION OF TRANSFORMATION
PARAMETERS
BETWEEN POINT
PATTERNS,"
IEEE
TRANSACTIONS ON PATTERN
ANALYSIS
AND
MACHINE
INTELLIGENCE,
VOL.
PP.
APR P. INGRID
CARLBOM, "PLANAR
GEOMETRIC
PROJECTIONS
AND
VIEWING
TRANSFORMATIONS," [13] D AWEI L I , X IAOLONG W ANG , AND D EGUANG K ONG . D EEPREBIRTH : A CCELERATING DEEP NEURAL NETWORK EXECUTION ON MOBILE DEVICES . C O RR,
ABS /1708.04728, D AWEI L I , M OOI C HOO C HUAH , “EMOD: A N E FFICIENT O N -D EVICE M OBILE V ISUAL S EARCH S YSTEM ”, P ROCEEDING
MMS YS '15 P ROCEEDINGS OF THE TH ACM M ULTIMEDIA S YSTEMS C ONFERENCE P AGES P ORTLAND , O REGON — M ARCH -2015