Direct ellipsoidal fitting of discrete multi-dimensional data
11 Preprint SMU-HEP-19-01
Direct ellipsoidal fitting of discrete multi-dimensional data
Rafey Anwar, Madeline Hamilton, Pavel M. Nadolsky Department of Physics , Southern Methodist University, Dallas, TX 75275 -0175 May 15, 2019
ABSTRACT Multi-dimensional distributions of discrete data that resemble ellipsoids arise in numerous areas of science, statistics, and computational geometry. We describe a complete algebraic algorithm to determine the quadratic form specifying the equation of ellipsoid for the boundary of such multi-dimensional discrete distribution. In this approach, the equation of an ellipsoid is reconstructed using a set of matrix equations from low-dimensional projections of the input data. We provide a Mathematica program realizing the full implementation of the ellipsoid reconstruction algorithm in an arbitrary number of dimensions. To demonstrate its many potential uses, the direct reconstruction method is applied to quasi-Gaussian statistical distributions arising in elementary particle production at the Large Hadron Collider.
Contents
I. Introduction 2 II. Fitting two-dimensional ellipses 3 A. The Convex Hull Method 4 1. Finding the convex hull 4 2. Fitting an ellipse to the convex hull 6 B. The Covariance Matrix Method 9 III. Reconstructing the ellipsoid from its projections 10 IV. Applications 12 A. A solid 3-dimensional ellipsoid 12 B. Cross sections for electroweak boson production at the Large Hadron Collider 12 V. Conclusion 18 References 19 I. Introduction
In this article, we focus on a common problem, reconstruction of a d-dimensional ellipsoid from coordinates of a set of discrete data points populating the volume of the ellipsoid. Clusters of data points that are approximately ellipsoidal in shape are encountered in many applications ranging from multivariate statistical analysis and machine learning to cardiac strain imaging (1) and calibration of magnetic compasses (2). Given the images (projections) of the ellipsoid, the task is to find the equation of the ellipsoid’s surface in a suitable coordinate representation. Remarkably, the equation of such an ellipsoid can be found by analytically solving a system of matrix equations, as described below. For example, suppose N discrete predictions dependent on parameters {x , x , …, x d } are distributed in an approximately ellipsoidal region in the d-dimensional parameter space. In statistical analysis, these predictions can be generated by random sampling from a multi-dimensional probability distribution that is approximately Gaussian. If the equation specifying the underlying probability distribution is unknown, one might wish to reconstruct it from the discrete distribution of the data. One way of doing this is to select points on the boundary of the d-dimensional region satisfying a given probability level and fit an ellipsoid to this boundary. From the quadratic form describing the ellipsoid, the quasi-Gaussian probability distribution can be immediately determined. A practical algorithm for the reconstruction of a d-dimensional ellipsoid by fitting discrete points was developed by Bertoni (3). It is based on the combination of methods developed by Fitzgibbon, Pilu, Fisher (4) and Karl (5). Bertoni’s algorithm is general, allowing one to reconstruct an ellipsoid from a complete set of the low-dimensional (not necessarily independent) ellipsoid’s projections. However, Karl’s and Bertoni’s papers do not demonstrate existence of a unique solution. In fact, such solution exists only when the set of projections is sufficiently complete to determine all coefficients of the quadratic form. In this article, we focus on a special case, when the ellipsoid is reconstructed from its two-dimensional orthogonal projections. We show how to derive a closed solution for the ellipsoid’s quadratic form using a set of complete and mutually consistent two-dimensional projections. The existence of such a unique solution, and the algebraic formula to find its coefficients, is a new result presented below. A Mathematica program implementing the full reconstruction algorithm is available upon request. The reconstruction algorithm has important applications in the field of elementary particle physics. For example, the structure of protons and nuclei in high-energy collisions is parameterized by parton distribution functions (PDFs) that are determined from a large-scale multivariate analysis of experimental measurements (6). To determine theoretical uncertainties for the rates of elementary particle production at the Large Hadron Collider, one may need to reconstruct an underlying quasi-Gaussian probability distribution from the multidimensional distribution of values obtained by stochastic sampling. Traditionally, the Gaussian distribution can be estimated using the method of the covariance matrix (7) or related Hessian matrix (8). Our ellipsoid reconstruction algorithm can be employed as a part of an alternative estimation method that does not assume that the probability distribution is perfectly Gaussian, as we explain in Section 4. To demonstrate the usefulness of the developed reconstruction method and explore its differences against the covariance matrix method, we employ both methods to predict the uncertainty due to PDF parameterizations in production of 𝑊 ± , 𝑍 , and 𝐻 bosons at the LHC. Figure 1 illustrates the reconstruction of a 3-dimensional ellipsoid from its 2-dimensional elliptical projections. The input data consists of 1000 random three-dimensional vectors (blue points) that populate the ellipsoid’s volume. The output consists of the symmetric matrix 𝐴 specifying the equation of the ellipsoid boundary (shown by a green mesh), found from the discrete input data with the help of our method. The first step is to project the input vectors onto independent orthogonal planes, where the boundaries of the input clusters are fitted by ellipses, as described in Section II. Then, in Section III, we reconstruct the output matrix 𝐴 from the matrices 𝐴 ( 𝑖 = 1,2,3) for the equations of the projected ellipses. This Section presents a general formula for reconstructing the d-dimensional ellipsoid matrix 𝐴 𝑑 from the 2-dimensional projection matrices 𝐴 , where . It also provides a proof that such a matrix exists and a consistency check for the projection matrices. Section IV applies the Mathematica program to the analysis of production cross sections in elementary particle physics. Section V contains our conclusions.
II.
Fitting two-dimensional ellipses
As the first step in the reconstruction of the d-dimensional ellipsoid, we need to determine the matrices for the boundaries of two-dimensional ellipses that are the projections of the ellipsoid
Figure 1. A three-dimensional ellipsoid fitted to 1000 quasi-ellipsoidal points and its two-dimensional elliptical projections. onto the orthogonal two-dimensional planes. In the example of Figure 1, the projected input data vectors populate the inside of an ellipse in each projection plane. The Convex Hull (CH) Method described in Subsection A reconstructs the quadratic form for the convex boundary of this ellipse with the help of the least-squares elliptical fitting algorithm described in (4). In statistical applications, the cluster of data vectors sampled from a quasi-Gaussian distribution does not have a sharp boundary. Rather, the “ellipse” may correspond to the boundary of the probability- 𝛼 region determined from the covariance matrix (CM) according to the conventional method summarized in Subsection B. A. The Convex Hull Method
Since a 2-dimensional projection of a d-dimensional ellipsoid represents a filled ellipse and not its outline, one must first find the boundary, or convex hull, of the projection and then fit an ellipse to this boundary. The convex hull algorithm addresses the first necessity, while the least squares elliptical fitting algorithm addresses the second. Finding the convex hull
The convex hull algorithm determines which points of the data set would be most appropriate for use in elliptical fitting, so that the resulting ellipse describes the boundary of the data subset, not the data subset as a whole. We will describe a convex hull algorithm that operates with cross products, although other algorithms for convex hull reconstruction are also available, such as the one described in (9).
Figure 2: Illustration of the vectors arising in the determination of the convex hull
A convex hull of a set of points in an xy plane is the smallest convex polygon in the plane that contains every point in the set. For visualization purposes, it can be described as the shape that a rubber band would take if it were stretched out around a set of points. The first step in finding the convex hull of a set of points is finding the convex hull’s vertices. These vertices are points from the data set such that if they were connected by straight lines, the polygon formed would be the convex hull of the data set. To begin, a point V from the set known to be a vertex is needed. If such a point is not explicitly given, it can easily be found by taking the point with the lowest 𝑥 value, as this point will certainly be a vertex due to its extreme position. V is now the active vertex. To find the next vertex, the active vertex is used as a basis of comparison for every other point in the set. Whichever point in the set creates the greatest angle relative to V ’s horizontal axis over [0, π] will be the next vertex, V . V will now act as the active vertex to find V . This will continue until a point V n whose next vertex is V , the original active vertex, is found. Once this point has been reached, all the vertices { V , V , … , V n } of the convex hull have been found. Connecting these vertices with straight lines creates the convex hull. The algorithm can be explained in detail using
Figure 2 . In the figure, V1 is the active vertex, and P1 and P2 represent the two points currently being compared. θ ab and θ ac , the angles that are compared for each pair of points, can be calculated as follows: Θ𝑎𝑏 = tan −1 (𝑌𝑏 − 𝑌𝑎𝑋𝑏 − 𝑋𝑎), Θ𝑎𝑐 = tan −1 (𝑌𝑐 − 𝑌𝑎𝑋𝑐 − 𝑋𝑎). However, because trigonometric functions are computationally slow, simpler algebraic representations of the angles are used, and the following test is obtained: (𝑌𝑏 − 𝑌𝑎)(𝑋𝑐 − 𝑋𝑎) − (𝑌𝑐 − 𝑌𝑎)(𝑋𝑏 − 𝑋𝑎) = 𝛿.
This test returns a determinant δ. Xa and Ya represent the coordinates of the current active vertex, V A . Xb and Yb represent the coordinates of any point P B , and Xc and Yc represent the coordinates of any point P C . If δ > 0, then P B creates the larger angle with respect to V A . If δ < 0, then P C creates the larger angle with respect to V A . If the determinant is zero, then all three points are collinear and the point which is farther from V A should be selected. It is also easy to realize that represents the z component of the cross product 𝐴𝐵⃗⃗⃗⃗⃗ × 𝐴𝐶⃗⃗⃗⃗⃗ , so that δ > 0 (δ <0) represents the clockwise (counterclockwise) rotation of
𝐴𝐵⃗⃗⃗⃗⃗ toward
𝐴𝐶⃗⃗⃗⃗⃗ , which can also be used to determine the relative orientation of
𝐴𝐵⃗⃗⃗⃗⃗ and
𝐴𝐶⃗⃗⃗⃗⃗ . The program repeats this process as needed until all the convex hull values have been found. Fitting an ellipse to the convex hull
Next, we need to find an ellipse that would provide a reasonable fit to the points on the convex hull. If a point lies on an ellipse, the point’s coordinates satisfy
𝐹(𝑨, 𝑋) = 𝑎 𝑥 + 𝑎 𝑥𝑦 + 𝑎 𝑦 + 𝑎 𝑥 + 𝑎 𝑦 + 𝑎 = 0, where the coefficients are constrained by 𝑎 − 4𝑎 𝑎 < 0 . For n points {𝑥 , 𝑦 }, … , {𝑥 𝑛 , 𝑦 𝑛 } that are not exactly on the ellipse, the desired ellipse can be obtained through a least squares minimization of algebraic distances from the points to the ellipse. As explained in (4), the minimization problem for finding the ellipse can be expressed as a generalized eigenvalue problem based on a matrix equation 𝑺𝑨 = 𝜆𝑪𝑨 (Eq. 1) where 𝜆 is the eigenvalue, 𝑨 = (𝑎 𝑎 𝑎 𝑎 𝑎 𝑎 ) 𝑇 , and 𝑺 and 𝑪 are certain matrices constructed in Ref. (4). The generalized eigenvalue problem can be solved numerically using LAPACK (10), Mathematica, or another advanced linear algebra package. Alternatively, it is possible to reduce this equation to a standard eigenvalue problem using the method that will be now described. This method can be easily implemented with any linear algebra library. Toward this goal, we identify two 3-component vectors 𝑎 = (𝑎 𝑎 𝑎 ) and 𝑎 = (𝑎 𝑎 𝑎 ) containing the coefficients associated with rotations and translations inside the 6-component vector 𝑨 = [𝑎 𝑎 ] 𝑇 . Block matrices are indicated by bold letters and square brackets. Express matrices 𝑺 and C in terms of and 𝑘 × 3 blocks, where 𝑘 = 1 or 3 : 𝑺 = [𝑆 𝑆 𝑆 𝑆 ] , 𝑪 = [𝑀 0 ] , with 𝑀 = (0 0 20 −1 02 0 0), and a zero matrix . If 𝑫 = [𝑑 𝑑 ] with 𝑑 = (𝑥 𝑥 𝑦 𝑦 𝑥 𝑥 𝑦 𝑦 ⋯ ⋯ ⋯𝑥 𝑛2 𝑥 𝑛 𝑦 𝑛 𝑦 𝑛2 ) and 𝑑 = (𝑥 𝑦 𝑦
1⋯ ⋯ ⋯𝑥 𝑛 𝑦 𝑛 the blocks 𝑆 𝑖𝑗 of S are given according to Ref. (4) by 𝑆 𝑖𝑗 = 𝑑 𝑖𝑇 𝑑 𝑗 . Eq. 1 can then be written as [𝑆 𝑆 𝑆 𝑆 ] (𝑎 𝑎 ) = 𝜆 [ 𝑀 0 ] (𝑎 𝑎 ) . (Eq. 2) We apply singular value decomposition to C to find 𝑪 = 𝑼𝑳 𝑰 𝑳 𝑽 𝑻 , (Eq. 3) which depends on block matrices 𝑼 = [𝑢 𝐼 ] , 𝑽 = [𝑣 𝐼 ] , 𝑳 𝑰 = [ 𝑙 𝐼 ] , 𝑳 = [ 𝐼 ]. In this equation, 𝐼 is the identity matrix, 𝑢 = (1 0 00 0 −10 1 0 ) , 𝑣 = (0 1 00 0 11 0 0) , 𝑙 = (2 0 00 2 00 0 1). (Eq. 4) The only singular matrix in Eq. 3 is 𝑳 : det 𝑳 = 0. On the other hand, U and V are orthonormal, 𝑼𝑼 𝑻 = 𝑽𝑽 𝑻 = 𝐼 . The inverse of 𝑳 𝑰 also exists, 𝑳 𝑰−1 = [𝑙 𝐼 ] , where 𝑙 = (1/2 0 00 1/2 00 0 1). In this representation, the only complication is associated with the singular 𝑳 matrix inside the decomposition for C . We therefore multiply Eq. 1 by 𝑼 𝑇 𝑳 𝐼−1 from the left and identify 𝑺 𝑉 ≡𝑼 𝑇 𝑳 𝐼−1
𝑺 𝑽 , 𝑨⃗⃗ 𝑉 ≡ 𝑽 𝑻 𝑨⃗⃗ to obtain 𝑺 𝑉 𝑨⃗⃗ 𝑉 = 𝜆𝑳 𝑨⃗⃗ 𝑽 . In the block form, this equation is [𝑆 𝑉11 𝑆 𝑉12 𝑆 𝑉21 𝑆 𝑉22 ] (𝑎 𝑉1 𝑎 𝑉2 ) = [ 𝜆 0 ] (𝑎 𝑉1 𝑎 𝑉2 ), and [𝑆 𝑉11 𝑆 𝑉12 𝑆 𝑉21 𝑆 𝑉22 ] = [𝑙 𝑢 𝑆 𝑣 𝑙 𝑢 𝑆 𝑆 𝑣 𝑆 ]. To proceed, we need to single out a special case when all points lie on a single line, corresponding to a degenerate solution for the elliptical coefficients. It can be easily demonstrated that the points lie on a line if and only if det 𝑆 = 0 . Indeed, since 𝑆 = 𝑑 𝑑 , the condition det 𝑆 = 0 is equivalent to det 𝑑 = 0 . Then, there is a vector 𝑤 =(𝑤 𝑥 𝑤 𝑦 𝑇 such that 𝑑 𝑤 = (𝑥 𝑦 𝑦
1⋯ ⋯ ⋯𝑥 𝑛 𝑦 𝑛 𝑥 𝑤 𝑦 𝑥 𝑥 + 𝑤 𝑦 𝑦 + 1𝑤 𝑥 𝑥 + 𝑤 𝑦 𝑦 + 1 ⋯𝑤 𝑥 𝑥 𝑛 + 𝑤 𝑦 𝑦 𝑛 + 1 ) = ( 00⋯0 ) ; or, all points lie on the line 𝑤 𝑥 𝑥 + 𝑤 𝑦 𝑦 + 1 = 0. If the solution is not degenerate ( det 𝑆 ≠ 0) , the system of equations becomes (𝑆 𝑉11 − 𝑆
𝑉12 𝑆 𝑉22−1 𝑆 𝑉21 )𝑎 𝑉1 = 𝜆𝑎 𝑉1 , (Eq. 5) 𝑎 𝑉22 = −𝑆
𝑉22−1 𝑆 𝑎 𝑉1 . (Eq. 6) The first equation is a regular eigenvalue problem for 𝑎 𝑉1 , solved by standard methods. The second equation derives 𝑎 𝑉22 from 𝑎 𝑉1 . For a degenerate solution ( det 𝑆 = 0) , it suffices to fit all points using linear regression (assuming that 𝑎 𝑉1 is a null vector). Based on this exposition, the equation of the ellipse is found as follows. Given the coordinates {𝑥 𝑘 , 𝑦 𝑘 } of the fitted points, we compute the matrices 𝑑 𝑖 , 𝑆 𝑖𝑗 , and 𝑆 𝑉𝑖𝑗 for 𝑖, 𝑗 = 1 or 2. The determinant det 𝑆 is calculated to decide if the points lie on one line within the accuracy of the calculation. If the solution is not degenerate, we find 𝑎 𝑉1 and 𝑎 𝑉2 from Eqs. 5 and 6; otherwise, we set 𝑎 𝑉𝑖 = 0 for 𝑖 = 1,2,3 and find 𝑎 𝑉4 and 𝑎 𝑉5 by linear regression. Among three possible eigenvalues in Eq. 5, one eigenvalue is positive and two are negative (4). The eigenvector 𝑎 𝑉1 that solves our conic problem corresponds to the only positive eigenvalue (4). Finally, the coefficient vectors are determined as 𝑎 = 𝑁 𝑣 𝑎 𝑉1 , 𝑎 = 𝑁 𝑎 𝑉2 , where 𝑁 is an arbitrary normalization factor that can multiply all coefficients 𝑎 𝑖 without violating the original equation 𝐹(𝑨, 𝑋) = 0 . 𝑁 must be found from a supplementary condition. For example, it can be that the quadratic form for the ellipse has a standard normalization so that at the center of the ellipse, 𝑋 = {𝑥 , 𝑦 } , the quadratic form takes the value 𝐹 𝑠𝑡𝑎𝑛𝑑𝑎𝑟𝑑 (𝑨, 𝑋 ) =−1 . The coordinates of the center can be found from 𝑎 𝑖 as 𝑥 = − 𝑎 𝑎 − 𝑎 𝑎 𝑎 − 4𝑎 𝑎 , 𝑦 = − 𝑎 𝑎 − 𝑎 𝑎 𝑎 − 4𝑎 𝑎 independently of 𝑁. Then, once {𝑥 , 𝑦 } is determined using 𝑁 = 1 for 𝑎 𝑖 , the final normalization that satisfies 𝐹(𝑨, 𝑋 , 𝑁 𝑓𝑖𝑛𝑎𝑙 ) = −1 is obtained by 𝑁 𝑓𝑖𝑛𝑎𝑙 = − 1𝐹(𝑨, 𝑋 , 𝑁 = 1). If the ellipse is centered at the origin, 𝑥 = 𝑦 = 0, the final equation of the ellipse in the convex hull (CH) method is If there is not enough noise in the data (all points lie exactly on the ellipse), the positive eigenvalue may be indistinguishable from zero within accuracy. In this case, the solution corresponds to the largest eigenvalue. (𝑥 𝑦) ⋅ 𝐴 ⋅ (𝑥𝑦) = 1, with 𝐴 ≡ − (𝑎 𝑎 ) . Eq. (7) B. The Covariance Matrix Method
If a sufficiently large sample of the two-dimensional data {(𝑥 = 𝑥 𝑖 , 𝑥 = 𝑦 𝑖 )} is drawn from a Gaussian distribution, another method, which we will refer to as the “Covariance Matrix” (CM) method, can be used to determine the ellipse that delineates the boundary of the region containing the fraction 𝛼 of the data sample ( . In the absence of any correlation, and under a simplifying assumption that the data have zero mean values, 〈𝑥 〉 = 0 , the points on the axis-aligned boundary ellipse would adhere to 𝑥 σ + 𝑥 σ = 𝑠, where σ ≡ √〈𝑥 〉 are the standard deviations of the 𝑥 and 𝑥 data, respectively, and 𝑠 is the chi-squared critical value associated with the desired probability level α . On the other hand, if there is a correlation between 𝑥 and 𝑥 , the resulting ellipse will no longer be aligned with the 𝑥 and 𝑥 axes and will satisfy 𝑥 σ + 𝑥 σ − 2 𝑥 σ 𝑥 σ cos 𝜑 = 𝑠 sin 𝜑, with cos 𝜑 ≡ 〈𝑥 𝑥 〉〈𝑥 〉〈𝑥 〉. This equation can be re-written using the same sign convention as in the previous subsection as ∑ 𝑥 𝑖 𝐴 𝑥 𝑗 − 𝑠 sin 𝜑 = 0 𝑖,𝑗=1,2 in terms of the matrix 𝐴 ≡ ( − cos(𝜑)𝜎 𝜎 − cos(𝜑)𝜎 𝜎 ) . Eq. (8) In contrast to the convex hull method, the matrix 𝐴 in the CM method is shared by the entire input data, and the probability regions are distinguished only by the critical parameter 𝑠 . This reflects the assumption behind the CM method that the probability distribution is exactly Gaussian. We also notice that the CM method implies that the correlation ellipsoid can be found directly by diagonalizing the covariance matrix in d dimensions (without taking projections), i.e., by the 𝑑 − dimensional principal component analysis [PCA]. On the other hand, if we do not wish to use the CM method or suspect that the ellipsoid matrices may non-trivially depend on the probability level because of some deviations from the Gaussianity, the elliptical projection corresponding to the probability level 𝛼 can be determined using the convex hull (CH) method, by first identifying a two-dimensional region containing a fraction 𝛼 of the input data points, and then fitting an ellipse to the convex hull boundary of the enclosed data subset in this region. If the probability distribution deviates from the Gaussian one, the elliptical projections obtained with the CH method for different probability levels are not related by a simple rescaling of the parameter 𝑠 . The comparison of the ellipsoids determined with the CH and CM methods thus provides a normality test for the underlying probability distribution. III.
Reconstructing the ellipsoid from its projections
Next, we turn to the reconstruction of the ellipsoid from its two-dimensional projections. Notice that 𝑑(𝑑 − 1)/2 independent projections are necessary to find all coefficients of the ellipsoid’s quadratic form. The easiest way to proceed, then, is to determine, block-by-block, the inverse matrix of the quadratic form by repeatedly invoking Eq. 10 below for each projection. Here we lean on the crucial observation in Ref. (11) that it is the inverse matrices of the quadratic forms, rather than the quadratic forms themselves, that are straightforwardly related. Below we include a short proof of this important relation. [Ref. (8) presented a relation between the inverse matrices up to an overall normalization of their coefficients and without including a proof]. We bypass the difficulty of dealing with non-invertible operators that would affect, e.g., the direct implementation of the ellipsoid reconstruction method proposed by Karl (5). Karl’s proposal requires stacking multiple projection operators in a way as to allow reconstruction of all ellipsoid’s elements without omissions or double-counting. This is not necessary for the complete set of orthogonal projections, when the straightforward implementation using Eq. 10 is sufficient. Any 𝑑 -dimensional vector 𝑥⃑ = {𝑥 , 𝑥 , … , 𝑥 𝑑 } drawn from the center of the ellipsoid to its surface satisfies 𝑥 𝑇 ⋅ 𝐴 𝑑 ⋅ 𝑥⃑ = 1, where 𝐴 𝑑 is the matrix of the d -dimensional quadratic form whose elements we intend to find. A projection of 𝑥⃑ from 𝑑 to 2 dimensions, denoted by 𝑥⃑̅ ≡ 𝑃 ⋅ 𝑥⃑ , Eq. (9) obeys an analogous equation 𝑥 ̃ 𝑇 ⋅ 𝐴 ⋅ 𝑥⃑̃ = 1. 𝐴 is the matrix of the quadratic form for the projection found using the CH or CM method. 𝑃 is a projection matrix, such as 𝑃 = (𝕀 𝕆 ) for the projection on the 𝑥 𝑥 plane, with 𝕀 and 𝕆 being the identity matrix and zero matrix, respectively. To put together 𝐴 𝑑 , we notice that the inverse matrices are related by 𝑃 ⋅ 𝐴 𝑑−1 ⋅ 𝑃 = 𝐴 . Eq. (10) To prove it, recast the positive-definite symmetric matrix 𝐴 𝑑 in terms of its eigenvalues 𝜆 𝑖2 > 0 and the rotation matrix 𝑂, 𝐴 𝑑 = 𝑂 𝑇 Λ 𝑇 Λ𝑂 ≡ (𝐴 𝑑1/2 ) 𝑇 ⋅ 𝐴 𝑑1/2 , where 𝑂 𝑇 𝑂 = 𝕀 𝑑×𝑑 , Λ ≡ diag(λ , λ , … , λ 𝑑 ) , and 𝐴 𝑑1/2 ≡ Λ𝑂 . 𝐴 𝑑1/2 generates an isomorphism that maps 𝑥⃑ onto a unit vector 𝑛⃗⃑ = 𝐴 𝑑1/2 𝑥⃑ satisfying 𝑛⃗ 𝑇 ⋅ 𝑛⃗ = 1. In other words, the affine transformation specified by 𝐴 𝑑1/2 associates any 𝑥 ending on the ellipsoid’s surface to a vector 𝑛⃗ from the origin to a unit sphere. The inverse transformation also exists: 𝑥⃑ = 𝐴 𝑑−12 ⋅ 𝑛⃗⃑. Eq. (11) Similarly, the projections 𝑥 ̃ are related to the projections 𝑛⃗ ̃ ≡ 𝑃 ⋅ 𝑛⃗ by 𝑥⃑̃ = 𝐴 ⋅ 𝑛⃗⃑̃ . Eq. (12) From Eqs. 9, 11, and 12, we conclude that 𝑃 ⋅ 𝐴 𝑑−12 = 𝐴 ⋅ 𝑃 . Multiplying both sides by their transpose matrices on the right, and using 𝑃 ⋅ 𝑃 = 𝕀 , we arrive at the desired relation, 𝑃 ⋅ 𝐴 𝑑−1 ⋅ 𝑃 = 𝐴 . Q. E. D. In our practical algorithm, Eq. 10 is used to read off the coefficients of 𝐴 𝑑−1 directly from the coefficients of 𝐴 . If we generate 𝑑(𝑑 − 1)/2 projections on planes 𝑥 𝑖 𝑥 𝑗 with 𝑖 <𝑗 ≤ 𝑑 , the diagonal elements (𝐴 𝑑−1 ) 𝑖𝑖 will be equal to the diagonal elements in (𝑑 − 1) projections, and an off-diagonal element (𝐴 𝑑−1 ) 𝑖𝑗 will appear once in the projection 𝑥 𝑖 𝑥 𝑗 ( 𝑖 ≠ 𝑗 ). Due to noise, the (𝑑 − 1) computations of each diagonal element will not necessarily be exactly equivalent. The final estimate of a diagonal element is simply taken to be the mean value of the computations, and a comparison of the diagonal elements from the projections via their standard deviations and mean values provides a test of mutual consistency of the input projections. A straightforward generalization of Eq. 10 relates the 𝑑 -dimensional matrix 𝐴 𝑑 to the matrices 𝐴 𝑑′ of ellipsoids in lower-dimensional projections (𝑑 ′ < 𝑑) using 𝑑 ′ × 𝑑 projection operators 𝑃 𝑑′←𝑑 : 𝑃 𝑑′←𝑑 ⋅ 𝐴 𝑑−1 ⋅ 𝑃 𝑑 ′ ←𝑑 𝑇 = 𝐴 𝑑 ′ −1 . Eq. (13) IV.
Applications A. A solid 3-dimensional ellipsoid
Depending on the context, either the Convex Hull (CM) method or Covariance Matrix (CM) method may be preferable for the ellipsoid reconstruction. In the case of the solid 3-dimensional ellipsoid presented in
Figure 1 , the CM method under/overestimates the spread of the input data points. In the 𝑥 𝑥 projection of the ellipsoid in Figure 1 and its other projections, the boundary ellipse predicted based on the covariance matrix (red line) has lower eccentricity than the input data. The CH method (black dashed line), on the other hand, traces well the outer boundary of the ellipsoid. Furthermore, the ellipsoid matrix 𝐴 reconstructed using the CH method agrees well with the input ellipsoid matrix 𝐴 used to generate the data, with the relative differences not exceeding 1.5%: 𝐴 = ( 1.439 −1.607 0.626−1.607 2.685 −0.6310.626 −0.631 0.432 ) ; 𝐴 = ( 1.453 −1.621 0.634−1.621 2.704 −0.6390.634 −0.639 0.437 ). B. Cross sections for electroweak boson production at the Large Hadron Collider
Our second example establishes a connection to elementary particle physics, where the ellipsoid reconstruction may be employed in large-scale statistical analyses of experimental data from particle colliders. Parton distribution functions (PDFs) quantify the inner structure of the protons in many theoretical calculations in quantum chromodynamics (6). PDFs are published as effective functions dependent on tens to hundreds of free parameters determined from the global analysis of collider data. Knowledge of the statistical distributions of PDF parameters allowed by the experimental data is essential for quantifying the uncertainty on theoretical predictions. In the situations when the parameter distribution is established by stochastic sampling of the multi-dimensional (sometimes 100-dimensional) parameter space (12) (13), the information contained in the PDFs can be effectively compressed using the principal component analysis [PCA] (7) or an alternative compression method (14) (15). Compression of PDFs simplifies their use and combination (16). The Convex Hull ellipsoid reconstruction is similar in its spirit to the PDF
Figure 3. Reconstructed projections of the 3-dimensional ellipsoid shown in Fig.1. compression based on the PCA, while it also reflects deviations from the normality identified by the other compression methods . As an example of such an application, consider theoretical uncertainties in predicting probabilities (or cross sections) for production of elementary particles in high-energy physics experiments in proton-proton collisions at the Large Hadron Collider (LHC). Rates for production of electroweak bosons 𝑊 ± , 𝑍 , and 𝐻 or other heavy particles depend on distributions of partons (quarks and gluons) inside the proton, which are not fully known, but parameterized based on experimental measurements within some uncertainty. If the parton distributions are similar in production of particles A and B, the measurement of the cross section for production of A can constrain the parton distributions in production of B. We can estimate the probability that the measurement of A will constrain B by plotting pairs of cross sections for A and B for an ensemble of parton distributions. Such plots for production of electroweak bosons at the Large Hadron Collider at beam energy 8 TeV were obtained using Neural Network PDF (NNPDF2.1) parton distributions (17) in Figure 4 . The NNPDF2.1 set provides 1000 forms of PDFs whose parameters are distributed according to the probability prescribed by the pre-LHC data. For each NNPDF parameterization, we plot the total cross sections for two types of bosons ( 𝑍 vs. 𝑊 ± , 𝑊 + vs. 𝑊 − , 𝐻 vs. 𝑍 , and 𝐻 vs. 𝑊 ± ), and hence obtain a set of 1000 discrete points (indicated by black dots) in 2-dimensional planes of the respective cross sections. Next, we wish to ask if the predictions based on the NNPDF set follow the Gaussian distribution. If they do, the central regions will be elliptical and concentric for all cumulative probabilities, and thus our ellipsoid reconstruction method may accurately quantify the predictions. For each pair of cross sections, we fit the 68% (red) and 90% (green) ellipses using the Convex Hull Method. [The uncertainties of parton distributions are presented often at the 68% or 90% probability levels.] As we see, for all pairs of cross sections, the 68% and 90% intervals can be approximated by ellipses, but the ellipses are not always concentric. This indicates some deviations from the Gaussian approximation. The reason is that the 1000 NNPDF parton distributions are obtained using a Monte-Carlo statistical method that does not rely on the Gaussian approximation (12) (13). The CH method can be used to reveal deviations from the Gaussian statistics. Figure 4.
Next-to-leading order predictions for total cross sections of 𝑊 ± , 𝑍 , and 𝐻 boson production at the Large Hadron Collider obtained using NNPDF2.1 parameterizations of parton distributions and the Convex Hull Method. Figure 5. Representative 95% probability projections of ellipsoids formed by NNPDF2.1 predictions for W, Z, and H production cross sections. The solid red and dashed black lines indicate the projections of the CM and CH reconstructed ellipsoids, respectively.
The eccentricity of the ellipses quantifies the degree of correlation of the pairs of the cross sections through their PDF dependence (18).
Figure 5 shows the correlation ellipses for 𝑊 ± − 𝑍 , 𝑊 + − 𝑊 − , and 𝑊 + − 𝐻 cross sections at the 95% probability level. Here we normalize the cross sections of each type to their mean values over the sample of 1000 replicas, in order to eliminate the dependence on the average magnitude of the production cross sections, which varies depending on the type of the produced particle. We see from the figure that the relative variations due to the parton distributions are of the same order of magnitude for all particle types, not exceeding ±4% in the cross section magnitude at the 95% probability level. The solid red ellipses in
Figure 5 are obtained using the CM method, while the black dashed ellipses are found by fitting the convex hull of the data points enclosed in the overlap of ±2𝜎 intervals for each cross section of the pair (shown by blue short-dashed lines). Orange squares indicate the points fitted by the convex hull. The 95% CM ellipse automatically lies within the square corresponding to the overlap of the single-variable ±2𝜎 intervals. The CH ellipse, on the other hand, may go outside of the 95% square. The CH ellipse is more sensitive to outliers and more prone to random fluctuations, especially if it fits only a few points. Rather than fitting only the points exactly on the convex hull, we can fit instead the points within a narrow band around the convex hull in order to suppress the random fluctuations.
Figure 5 shows that the ellipses for 𝑍 vs. 𝑊 ± and 𝑊 + vs. 𝑊 − cross sections are very eccentric (highly correlated). A very high correlation normally indicates that the measurement of one cross section will impose tight constraints on the PDFs in the other cross section. The CM method indeed predicts such high correlation. However, we see that a few input data points for these cross sections lie far outside of the CM ellipse. Those on the convex hull are fitted by the CH ellipsoid, but have a small effect on the CM ellipsoid, as the latter is reconstructed from the totality of all points in the Gaussian approximation. Therefore, the deviations from the Gaussian behavior captured by the Convex Hull method result in a smaller absolute correlation than according to the Covariance Matrix method. On the other hand, the cross section for Higgs ( 𝐻 ) boson production is weakly correlated with the 𝑊 ± or 𝑍 cross sections: measuring 𝑊 ± and 𝑍 cross sections will not be very helpful for probing the parton distributions relevant for Higgs boson production. The CM and CH methods give comparable predictions for the correlations with the Higgs cross sections. From 10 independent projections like the ones in Figure 5 , we reconstruct the matrices for the 5-dimensional ellipsoid according to Eq. 10. The values of the matrices are 𝐴 = 10 × ( 61.5 42.3 -96.2 -8.16 0.17842.3 30.9 -66.5 -7.0 0.119-96.2 -66.5 170. -8.59 0.172-8.16 -7.0 -8.59 25.4 -0.4110.178 0.119 0.172 -0.411 0.371 ) , 𝐴 = 10 × ( 0.326 0.0509 −0.165 −0.136 0.01460.0509 0.206 −0.0758 −0.0639 0.00781−0.165 −0.0758 0.444 −0.124 −0.00442−0.136 −0.0639 −0.124 0.446 −0.004340.0146 0.00781 −0.00442 −0.00434 0.248 ) for the 68% probability level ellipsoids, and 𝐴 = 10 × ( 15.3 10.5 −24.0 −2.04 0.044610.5 7.73 −16.6 −1.75 0.0298−24.0 −16.6 42.7 −2.14 0.043−2.04 −1.75 −2.14 6.35 −0.1020.0446 0.0298 0.043 −0.102 0.0929 ) , 𝐴 = 10 × ( 0.228 0.0948 −0.133 −0.174 0.003310.0948 0.174 −0.0954 −0.146 0.00501−0.133 −0.0954 0.268 −0.0323 −0.00409−0.174 −0.146 −0.0323 0.386 −0.001510.00331 0.00501 −0.00409 −0.00151 0.0678 ) for the 95% probability level ellipsoids. The diagonal elements (𝐴 ) 𝑖𝑖 are taken to be the mean values of the (𝑑 − 1) = 4 estimates found from independent projections, according to the discussion in Section III. The standard deviations 𝛿(𝐴 ) 𝑖𝑖 of these constructed diagonal elements, divided by the mean values 〈(𝐴 ) 𝑖𝑖 〉 of the same elements, serve as the estimates of the consistency between the projections. For the matrices above, the ratios 𝛿(𝐴 ) 𝑖𝑖 /〈(𝐴 ) 𝑖𝑖 〉 are equal to zero for the CM ellipsoids and range between 0.03 and 0.2 for the CH ellipsoids. The geometric averages of 𝛿(𝐴 ) 𝑖𝑖 /〈(𝐴 ) 𝑖𝑖 〉 for the CH ellipsoids are 0.13 (0.1) at the 68% (95%) probability level. The magnitude of inconsistency of the CH projections may be explained by a small number of points lying on the convex hull. [The 3-dimensional ellipsoid in the previous example contained a large number of points, so its CH projections were practically consistent.] The CH Method selects points on the boundary of the desired two-dimensional probability region. In the projection 𝑥 𝑖 𝑥 𝑗 , the selection of 𝑥 𝑖 points depends on the other dimension 𝑥 𝑗 , as their coordinates must lie both within the probability intervals for 𝑥 𝑖 and for 𝑥 𝑗 . As the reconstruction algorithm cycles through different projections involving 𝑥 𝑖 , different 𝑥 𝑖 points will likely be selected, causing some inconsistencies in the coefficients 𝑥 𝑖2 . Meanwhile, the Covariance Matrix Method does not include a subset-selecting process: all data points are used regardless of the probability level. Thus, in the Covariance Matrix Method, the ellipses are guaranteed to be consistent. In the CH method, the consistency improves by including more points or by fitting the points lying within a band around the convex hull, rather than just on the convex hull itself. Table 1. Lengths of the principal semi-axes of the 5-domensional ellipsoids.
Probability Method Principal semi-axes
68% Covariance Matrix
Convex Hull
95% Covariance Matrix
Convex Hull
8 Table 1 lists the principal semi-axes of the four reconstructed ellipsoids. In the Covariance Matrix method, the semi-axes of the 95% (2-sigma) ellipsoid are twice as long as the ones for the 68% (1-sigma ellipsoid), as a consequence of the assumed normality of the probability distribution. The lengths span from 0.0013 to 0.077 for the 95% CM ellipsoid, reflecting high eccentricity of the CM ellipsoid in some directions. The Convex Hull method produces less eccentric ellipsoids because it accounts for the few outlying points that indicate some non-Gaussian behavior. The lengths for the 95% CH ellipsoid range from 0.013 to 0.07, i.e., they are more uniform than the respective lengths of the 95% CM ellipsoid. The ratios of the lengths of the 95% and 68% CH ellipsoids are and 2.16 – very different from 2 for the shortest principal axes. V. Conclusion
We presented an algebraic algorithm to obtain a unique, closed solution for the quadratic form of an ellipsoid reconstructed from d-dimensional discrete points using a complete and mutually consistent set of two-dimensional (or, generally, lower-dimensional) orthogonal projections. The reconstruction algorithm requires fitting several two-dimensional ellipses. We explored two approaches to achieving this task: the Convex Hull method, a purely algebraic process that uses cross products and least squares minimization using a generalized eigenvalue equation; and the Covariance Matrix method, which employs strong assumptions of normality to calculate a covariance matrix that determines the ellipse. We then explained how to exploit a simple relationship between their coefficients and those of the inverse of the desired ellipsoid’s quadratic form. In outlining this process, we proved that it is guaranteed to lead to a unique solution.
Finally, we realized the implementation of our algorithm in a Mathematica program and applied it to reconstruction of a three-dimensional solid ellipsoidal body as well as to a statistical distribution of cross sections for elementary particle production at the LHC. These applications illustrate when the Convex Hull and Covariance Matrix methods may produce different results. The suitability of each method depends on the context. The Convex Hull method is sensitive to outliers and deviations from the Gaussian behavior, though measures may be taken to suppress this sensitivity to a certain extent. In the non-Gaussian cases, it may give inconsistent coefficients for the ellipsoid’s quadratic form. In general, the Convex Hull method estimates correlations between the parameters more conservatively than the Covariance Matrix method, which is less sensitive to outliers, produces perfectly consistent closed forms of elliptical projections, and can provide very aggressive predictions for correlations between parameters. Each method performs well under a certain set of circumstances, and comparing the ellipsoids determined by both methods serves as a normality test of the underlying probability distribution. In the above example of the electroweak particle production at the LHC , the Convex Hull method indicates a weaker correlation between the production cross sections of 𝑊 ± and 𝑍 bosons than would be estimated by the commonly used Covariance Matrix formalism. The difference arises because of the non-Gaussian effects revealed by the NN parton distributions and may have practical implications for constraining precision measurements of 𝑊 ± bosons by the “benchmark” measurements of 𝑍 bosons. As the basis of this algorithm is purely mathematical, it can be applied in many fields of science. The development of a program that fits ellipsoids to sets of discrete multi-dimensional data has proved to be a useful way of determining correlations between parton distributions and particle production. This is just one application of the algorithm discussed; countless more exist. The research’s goal of producing a program that can efficiently fit ellipsoids to sets of discrete multi-dimensional data was accomplished, as the coded implementation of the algorithm has been tested and proven to be accurate. Acknowledgments
This work was supported in part by the SMU Senior Engaged Learning Fellowship, undergraduate research assistantships, and by the U.S. Department of Energy under Grant No. DE-SC0010129.
References Lopata, Richard G.P., et al.
Three-Dimensional Cardiac Strain Imaging in Healthy Children Using RF-Data. 2011, Vol. 37, 9. Fang, Jiancheng, et al.
A Novel Calibration Method of Magnetic Compass Based on Ellipsoid Fitting. 2011, Vol. 60, 6. 3.
Bertoni, Bridget.
Multi-dimensional ellipsoidal fitting.
Preprint SMU-HEP-10-14.
A. Fitzgibbon, M. Pilu, and R. Fisher.
Direct least square fitting of ellipses. 1999, Vol. 21, p. 5. 5.
Karl, W. C.
Reconstructing objects from projections.
Dept. of Electrical Engineering and Computer Science, MIT. 1991. Ph. D. Thesis. 6.
Gao, Jun, Harland-Lang, Lucian and Rojo, Juan.
The Structure of the Proton in the LHC Precision Era.
Phys. Rept.
Gao, Jun and Nadolsky, Pavel.
A meta-analysis of parton distribution functions.
JHEP.
J. Pumplin, D. Stump, R. Brock, D. Casey, J. Huston, J. Kalk, H.-L. Lai, W.-K. Tung.
Uncertainties of predictions from parton distribution functions. 2. The Hessian method.
Phys. Rev. D.
Liu, Runzong, et al.
A fast convex hull algorithm with maximum inscribed circle affine transformation. 2012, Vol. 77, 1. 0 10.
LAPACK — . 11.
Ellipsoid reconstruction from three perspective views.
Ma, S. D. and Li, L.
Vienna, Austria : IEEE, 1996. Proceedings of 13th International Conference on Pattern Recognition. p. 344.
12. Walter T. Giele, Stephane Keller.
Implications of hadron collider observables on parton distribution function uncertainties.
Phys. Rev.
13. NNPDF Collaboration, L. del Debbio et al.
Neural network determination of parton distributions: The Nonsinglet case.
JHEP .
14. Carrazza, Stefano, et al.
An Unbiased Hessian Representation for Monte Carlo PDFs.
Eur. Phys. J.
15. Carrazza, Stefano, et al.
A compression algorithm for the combination of PDF sets.
Eur.Phys.J.
16. Butterworth, Jon and others.
PDF recommendations for the LHC Run II.
J.Phys.
17. NNPDF Collaboration, R. Ball et al.
Impact of Heavy Quark Masses on Parton Distributions and LHC Phenomenology.
Nucl.Phys.
18. P. M. Nadolsky, H.-L. Lai, Q.-H. Cao, J. Huston, J. Pumplin, D. Stump, W.-K. Tung, C.-P. Yuan.
Implications of CTEQ global analysis for collider observables.
Phys.Rev.2008, Vol. D78, p. 013004.