Efficient Algorithms for Positive Semi-Definite Total Least Squares Problems, Minimum Rank Problem and Correlation Matrix Computation
NEGIN BAGHERPOUR∗ AND NEZAM MAHDAVI-AMIRI†

Abstract.
We have recently presented a method to solve an overdetermined linear system of equations with multiple right hand side vectors, where the unknown matrix is to be symmetric and positive definite. The coefficient and the right hand side matrices are respectively named data and target matrices. A more complicated problem is encountered when the unknown matrix is to be positive semi-definite. The problem arises in estimating the compliance matrix to model deformable structures and in approximating correlation and covariance matrices in financial modeling. Several methods have been proposed for solving such problems, assuming that the data matrix is unrealistically error free. Here, considering error in both the measured data and target matrices, we propose a new approach to solve a positive semi-definite constrained total least squares problem. We first consider solving the problem when the rank of the unknown matrix is known, by defining a new error formulation for the positive semi-definite total least squares problem. Minimization of our newly defined error consists of an optimization problem on the Stiefel manifold. To solve the optimization problem, in each iteration a linear operator subproblem arises, for which we propose three different iterative methods. We prove quadratic convergence of our proposed approach. We then describe how to generalize our proposed method to solve the general positive semi-definite total least squares problem. We further apply the proposed approach to solve the minimum rank problem and the problem of computing a correlation matrix. Comparative numerical results show the efficiency of our proposed algorithms. In solving positive semi-definite total least squares problems, we find that in most cases the linear operator equation is solved faster and turns out to be more accurate using the GMRES method.
Also, comparison of the results obtained by our algorithm with the ones due to two other methods, an interior point method and a MATLAB routine for solving a quadratic programming problem with a semi-definite constraint based on a path following algorithm, confirms the efficiency of our approach. Numerical test results also show that our approach for computing a correlation matrix leads to smaller standard deviations of error in the target matrix. Finally, Dolan–Moré performance profiles are shown to summarize our comparative study.
Key words.
Total least squares, positive semi-definite constraints, deformable structures, correlation matrix
AMS subject classifications.
1. Introduction.
In several physical problems, such as estimation of the mass inertia matrix in the design of controllers for solid structures and robots, an overdetermined linear system of equations with multiple right hand side vectors arises with the constraint that the unknown matrix be symmetric and positive definite; see, e.g., [4, 7, 23]. A method for solving such a problem has been proposed in [24]. There are also physical contexts, such as modeling a deformable structure [5, 18] and computing the correlation matrix in finance or insurance/reinsurance industries [9, 12, 22, 27], where a symmetric positive semi-definite solution of an overdetermined linear system of equations needs to be computed, or equivalently the problem

DX ≃ T   (1.1)

needs to be solved, where D, T ∈ R^{m×n}, with m ≥ n, are given and X ∈ R^{n×n}, a symmetric positive semi-definite matrix, is to be computed as a solution. In some special applications, the data matrix D has a simple structure, which may be taken into consideration for efficiently organized computations. Computing the correlation matrix in finance is such an example, where the data matrix is the identity matrix; see, e.g., [27].

Unlike the positive definite total least squares problem, here the unknown matrix is singular and thus our previously defined error formulation in [24] is no longer applicable. We need to formulate the error in the measured data and target matrices as a function of the unknown matrix but not its inverse.

A number of least squares formulations have been proposed for the physical problems, which may be classified as ordinary and total least squares problems. Unlike the ordinary formulation, in a total least squares formulation both data and target matrices are assumed to contain error.

∗Faculty of Mathematical Sciences, Sharif University of Technology, Tehran, Iran ([email protected]).
†Faculty of Mathematical Sciences, Sharif University of Technology, Tehran, Iran ([email protected]).
Also, single or multiple right hand sides may arise. In [24], ordinary and total least squares formulations with single or multiple right hand sides have been considered. For a detailed analysis of total least squares, see [2, 8, 17].

Here, we consider a specific case of the total least squares problem with multiple right hand side vectors. Our goal is to compute a symmetric positive semi-definite solution X ∈ R^{n×n} of the overdetermined system of equations DX ≃ T, where both matrices D and T may contain error. Several approaches have been proposed for this problem, commonly considering the ordinary least squares formulation and minimizing the error ∥∆T∥_F over all n × n symmetric positive semi-definite matrices, where ∥·∥_F is the Frobenius norm. Larson [6] discussed a method for computing a symmetric solution to an overdetermined linear system of equations based on solving the corresponding normal system of equations. Krislock [18] proposed an interior point method for solving a variety of least squares problems with a positive definiteness constraint. Woodgate [15] described a new algorithm for solving a similar problem, in which a symmetric positive semi-definite matrix P is computed to minimize ∥F − PG∥, with known F and G. In [14], Toh introduced a path following algorithm for solving a positive semi-definite quadratic optimization problem. Later, in 2009, he posted a MATLAB package for solving such a problem; see [30]. Hu [3] gave a quadratic programming approach to solve a least squares problem with a symmetric positive definite unknown matrix. In his method, upper and lower bounds for the entries of the target matrix can be given as extra constraints. In real measurements, however, both the data and target matrices may contain error. Thus, to be practical, a total least squares formulation seems to be appropriate.
Here, we define a new error function to account for error in both data and target matrices and propose an iterative algorithm to minimize the defined error.

If the goal is to compute the correlation matrix, the mathematical problem is a little different. Computing the correlation matrix is very important in financial modeling. It is applicable, for example, in obtaining a quadratic model for an economic system and even in reverse engineering for extreme scenario stress testing [25]. In this case, the data matrix is the identity and a large number of linear constraints are to be satisfied. Sun [12] presented an algorithm for computing the correlation matrix. Rebonato [9] and Werner [22] also discussed solving the same problem. We will see later that the minimum rank problem can also be solved by applying our proposed algorithm. This problem appears in the literature in diverse areas including system identification and control, Euclidean embedding, and collaborative filtering; see [10, 16, 31]. In a minimum rank problem, the goal is to find a positive semi-definite solution with the minimum possible rank to an overdetermined linear system of equations.

The remainder of our work is organized as follows. In Section 2, we define a new error function for solving a positive semi-definite total least squares problem with a fixed rank. A method for solving the resulting optimization problem is presented in Section 3. Also, a discussion on solving the positive semi-definite total least squares problem (with arbitrary rank) is given in Section 3. In Section 4, we introduce two slightly different problems and discuss how to solve them based on the method proposed in Section 3. These two problems are: the minimum rank problem and computing the correlation matrix. Comparative computational results are given in Section 5. Section 6 gives our conclusion.
2. Problem Formulation.
Available methods for solving a positive semi-definite least squares problem consider an ordinary least squares formulation; see, e.g., [14, 18]. A practically useful total error formulation was introduced in [24] for a positive definite total least squares problem. Based on this formulation, the solution of the optimization problem

min_{X ≻ 0} tr((DX − T)^T (D − T X^{-1}))   (2.1)

is a solution of a corresponding positive definite total least squares problem, where X is symmetric and by X ≻ 0 we mean X is positive definite. The error formulation in [24] not being suitable here, we first motivate and present a new error formulation for the positive semi-definite total least squares case.

In (2.1), the entries of D − T X^{-1} and DX − T represent the errors in D and T, respectively. Here, we need to represent the error in D independently of X^{-1}. Before discussing how to solve the positive semi-definite total least squares problem, we consider the newly noted problem, the positive semi-definite total least squares problem with a given rank r of the unknown matrix (R_r-PSDTLS). In Section 3, we outline an algorithm for solving R_r-PSDTLS and discuss how to solve the positive semi-definite total least squares problem applying the proposed algorithm.

The error in D is supposed to be the difference between the real value of D and the predicted value for D obtained by DX ≃ T. To compute the predicted value for D, we use the general least squares solution of the system X D^T ≃ T^T. Considering the block forms D^T = (d_1 ··· d_m) and T^T = (t_1 ··· t_m), where d_i, t_i ∈ R^n, for i = 1, ..., m, we have X d_i = t_i, for i = 1, ..., m. The general solution of such a linear system has the form

d_i = X^† t_i + n_i,   (2.2)

where X^† is the pseudo-inverse of X and n_i is an arbitrary vector in the null space of X [21]. A straightforward choice for n_i is n_i = 0, which results in d_i = X^† t_i and ∆D = D − T X^†. We later consider the suitable choices for n_i which minimize ∥∆D∥_F. To compute d_i from (2.2), the spectral decomposition can be applied.

Result. (Spectral decomposition) [21] All eigenvalues of a symmetric matrix A ∈ R^{n×n} are real, and there are n mutually orthonormal vectors representing the corresponding eigenvectors. Thus, there exist an orthonormal matrix Ū with columns being the eigenvectors of A and a diagonal matrix D̄ containing the eigenvalues such that A = Ū D̄ Ū^T.
Suppose that D and U are respectively formed by ordering the diagonal elements of D̄ non-increasingly and rearranging the corresponding columns of Ū. The ordered spectral decomposition of A has the form A = U D U^T. Also, if A is positive semi-definite with rank(A) = r, then r of its eigenvalues are positive and the rest are zero, and thus we can set D = [S 0; 0 0], where S ∈ R^{r×r} is diagonal and nonsingular. Hence, the ordered spectral decomposition of a symmetric positive semi-definite matrix A is A = U [S 0; 0 0] U^T, with U^T U = U U^T = I. Considering the block form U = (U_r U_{n−r}), we get A = U_r S U_r^T. Moreover, the columns of the matrices U_r and U_{n−r} respectively form a basis for the range and the null space of A.

Now, making use of the spectral decomposition of X in (2.2), we have d_i = X^† t_i + U_{n−r} z_i, where z_i ∈ R^{n−r}, for i = 1, ..., m, are arbitrary vectors, and D^T = X^† T^T + U_{n−r} (z_1 ··· z_m). Thus, the predicted value for D is

D_p = T X^† + Z U_{n−r}^T,   (2.3)

with Z ∈ R^{m×(n−r)} arbitrary, and the error in D is equal to ∆D = D − (T X^† + Z U_{n−r}^T). A reasonable choice for Z in this formulation would be the one minimizing the norm of ∆D, which is the solution of the optimization problem

min_Z ∥F − Z U_{n−r}^T∥_F,   (2.4)

where F = D − T X^† and ∥·∥_F is the Frobenius norm. Solving (2.4) results in [21]

Z^* = F U_{n−r} = (D − T X^†) U_{n−r}.   (2.5)

Substituting (2.5) in (2.3), we get

∆D = D − (T X^† + Z^* U_{n−r}^T) = (D − T X^†) − (D − T X^†) U_{n−r} U_{n−r}^T = (D − T X^†)(I − U_{n−r} U_{n−r}^T).   (2.6)

Using I − U_{n−r} U_{n−r}^T = U_r U_r^T along with (2.6), we get ∆D = (D − T X^†) U_r U_r^T.

Based on the above discussion, ∆D = (D − T X^†) U_r U_r^T and ∆T = DX − T represent the errors in D and T, respectively. Thus, to solve a rank r positive semi-definite total least squares problem, it is appropriate to minimize the error

E = Σ_{i=1}^m Σ_{j=1}^n ∆D_{ij} ∆T_{ij} = tr(∆T^T ∆D),

with tr(·) standing for the trace of a matrix. Consequently, the optimization problem

min tr(∆T^T ∆D)
s.t. X ⪰ 0,   (2.7)
     rank(X) = r

needs to be solved, where X is symmetric and by X ⪰ 0 we mean X is positive semi-definite. We can rewrite the optimization problem using the spectral decomposition of X, substituting X and X^† by U_r S U_r^T and U_r S^{-1} U_r^T, respectively. Considering well-known properties of the trace operator [21] and the above formulations for X and X^†, we get

E = tr(∆T^T ∆D) = tr((DX − T)^T (D − T X^†) U_r U_r^T)
  = tr((U_r S U_r^T D^T − T^T)(D − T U_r S^{-1} U_r^T) U_r U_r^T)
  = tr((S U_r^T D^T − U_r^T T^T)(D U_r − T U_r S^{-1}))
  = tr(U_r^T D^T D U_r S − U_r^T D^T T U_r − U_r^T T^T D U_r + U_r^T T^T T U_r S^{-1}).   (2.8)

Letting A = D^T D, C = D^T T + T^T D and B = T^T T, problem (2.7) is then equivalent to

min_{Y,S} tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1}),   (2.9)

where Y ∈ R^{n×r} satisfies Y^T Y = I and S ∈ R^{r×r} is a nonsingular diagonal matrix.

The Lagrangian function corresponding to the constrained optimization problem min f(X) s.t. g(X) = 0 is L(X, λ) = f(X) + λ^T g(X), where the Lagrange coefficient vector λ corresponds to the constraint vector g(X) = 0. The necessary conditions for a solution, known as the Karush–Kuhn–Tucker (KKT) conditions, are ∇_X L(X, λ) = 0 as well as ∇_λ L(X, λ) = 0, which gives g(X) = 0; for the KKT conditions, see [19]. Thus,

L(Y, S, Ω) = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1} + Ω(Y^T Y − I))

is the Lagrangian function for the optimization problem (2.9).

Lemma 2.1.
An appropriate characteristic of the error formulation given by

E = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1}),   (2.10)

is that its value is nonnegative, and it is equal to zero if and only if ∆D = 0.

Proof. It is clear that if ∆D = 0, then E = tr(∆T^T ∆D) = 0. Assuming E = 0, from (2.10) we have

E = tr((D Y S^{1/2} − T Y S^{-1/2})^T (D Y S^{1/2} − T Y S^{-1/2})) = 0,

which holds if and only if D Y S^{1/2} = T Y S^{-1/2}. Multiplying both sides from the right by S^{1/2} Y^T, we get D Y S Y^T = T Y Y^T, or equivalently

DX = T X^† X.   (2.11)

If (2.11) is satisfied, then we have (D − T X^†) X = 0; hence, ∆D = (D − T X^†) X X^† = 0.
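To make the error formulation concrete, the following minimal numerical sketch (our illustration, not code from the paper; matrix names follow the text and the data are random) computes ∆D = (D − T X^†) U_r U_r^T, ∆T = DX − T and E = tr(∆T^T ∆D) for a rank r positive semi-definite X, and checks that ∆D and E vanish when the data are consistent, i.e., D = T X^†:

```python
import numpy as np

def psd_tls_error(D, T, X, r):
    """Delta_D = (D - T X^+) U_r U_r^T, Delta_T = D X - T, and
    E = tr(Delta_T^T Delta_D), for a symmetric PSD X with rank(X) = r."""
    w, U = np.linalg.eigh(X)                  # eigenvalues in ascending order
    idx = np.argsort(w)[::-1]                 # reorder non-increasingly (ordered decomposition)
    w, U = w[idx], U[:, idx]
    Ur = U[:, :r]                             # columns: basis for range(X)
    X_pinv = Ur @ np.diag(1.0 / w[:r]) @ Ur.T # X^+ = U_r S^{-1} U_r^T
    Delta_T = D @ X - T
    Delta_D = (D - T @ X_pinv) @ (Ur @ Ur.T)  # error in D, as in (2.6)
    return Delta_D, Delta_T, np.trace(Delta_T.T @ Delta_D)

rng = np.random.default_rng(0)
m, n, r = 8, 5, 3
Y = np.linalg.qr(rng.standard_normal((n, r)))[0]  # Y^T Y = I
S = np.diag(rng.uniform(1.0, 2.0, r))
X = Y @ S @ Y.T                                   # rank-r PSD matrix
T = rng.standard_normal((m, n))
D = T @ np.linalg.pinv(X)                         # consistent data: Delta_D should vanish
Delta_D, Delta_T, E = psd_tls_error(D, T, X, r)
print(abs(E) < 1e-8, np.linalg.norm(Delta_D) < 1e-8)
```

For inconsistent (noisy) D, the same function returns the nonzero error E that the optimization problem (2.9) minimizes.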
3. Mathematical Solution.
Here, we discuss how to solve (2.9). We also study the computational complexity and the convergence properties of our proposed algorithm.

3.1. Solving the Rank r Positive Semi-definite Total Least Squares Problem.
We are to propose an algorithm for solving (2.9). More precisely, a nonsingular diagonal matrix S = diag(s_1, ..., s_r) and a matrix Y ∈ V_r(R^n) need to be computed to minimize

E = f(Y, S) = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1}),

where V_r(R^n) is the Stiefel manifold [1]:

V_r(R^n) = {A ∈ R^{n×r} ; A^T A = I}.

In the following lemma, we show that the optimization problem (2.9) is strictly convex under a weak assumption on the data and target matrices. We also make use of well-known properties of convexity to propose our algorithm.
Lemma 3.1.
The function f(Y, S) = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1}) is always convex, and it is strictly convex on the set {(Y, S) | Y ∈ R^{n×r}, S ∈ F}, where F = {S = diag(s_1, ..., s_r) | rank(s_i D − T) = n for i = 1, ..., r}.

Proof. The key point of the proof is to reformulate f(Y, S) as

f(Y, S) = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1})
        = Σ_{i=1}^r (s_i y_i^T A y_i − y_i^T C y_i + (1/s_i) y_i^T B y_i)
        = Σ_{i=1}^r y_i^T (s_i^{1/2} D − s_i^{-1/2} T)^T (s_i^{1/2} D − s_i^{-1/2} T) y_i.

Thus, f(Y, S) is always convex, and it is strictly convex if and only if H_i = (s_i^{1/2} D − s_i^{-1/2} T)^T (s_i^{1/2} D − s_i^{-1/2} T), for i = 1, ..., r, is positive definite, which holds if and only if s_i^{1/2} D − s_i^{-1/2} T (equivalently, s_i D − T), for i = 1, ..., r, has full column rank.

Note. Since the function f(Y, S) is strictly convex and the set {Y | Y^T Y = I} is convex, a point (Y^*, S^*) satisfying the Karush–Kuhn–Tucker (KKT) optimality conditions is the unique solution of problem (2.9); for the KKT conditions, see [19]. Next, in the following theorem we derive the KKT optimality conditions for problem (2.9).

Theorem 3.2. If (Y, S) is the solution of problem (2.9), then it satisfies

s_i = (y_i^T B y_i / y_i^T A y_i)^{1/2},   (3.1)

where y_i is the i-th column of Y.

Proof. The KKT necessary conditions for (2.9) are obtained by setting ∇L = 0. Thus, if (Y, S) forms a solution for (2.9) with S = diag(s_1, ..., s_r), then we must have ∂L/∂s_i = 0, for i = 1, ..., r. To simplify the computation of ∂L/∂s_i, we can reformulate L(Y, S, Ω) using the definition of the trace operator. Let G = Y^T A Y S and H = Y^T B Y S^{-1}. We have

tr(G) = Σ_{i=1}^r G_{ii} = Σ_{i=1}^r s_i y_i^T A y_i  and  tr(H) = Σ_{i=1}^r H_{ii} = Σ_{i=1}^r (1/s_i) y_i^T B y_i.

So, L(Y, S, Ω) is equal to

L(Y, S, Ω) = Σ_{i=1}^r s_i y_i^T A y_i + Σ_{i=1}^r (1/s_i) y_i^T B y_i − tr(Y^T C Y − Ω(Y^T Y − I)).   (3.2)

Now, from (3.2) we have

∂L/∂s_i = y_i^T A y_i − (1/s_i^2)(y_i^T B y_i) = 0,

and s_i can be computed by s_i = (y_i^T B y_i / y_i^T A y_i)^{1/2}.

Considering the above discussion, in each iteration of our proposed algorithm we need to
(1) compute one term of the sequence Y_i ∈ V_r(R^n) converging to the minimizer of the error E = f(Y, S), and
(2) compute the diagonal elements of S from (3.1).
Edelman [1] introduced two methods for solving optimization problems on Stiefel manifolds: the Newton method and the conjugate gradient method on the Stiefel manifold. Here, we adaptively use the Newton approach to develop an algorithm for solving (2.9). In each iteration of our proposed algorithm, a Newton step is computed from the Newton method on the Stiefel manifold, and then the diagonal elements s_i of S are updated by (3.1). We show in Section 3.2 that our proposed algorithm converges to the unique solution of (2.9) at least quadratically. Also, we discuss its computational complexity in Section 3.2. We are now ready to outline the steps of our proposed algorithm.

Algorithm 1.
Solving the rank r positive semi-definite total least squares problem using the Newton method on the Stiefel manifold (R_r-PSDTLS).

- ǫ and δ are upper bounds for the relative and absolute errors, respectively, taken to be close to the machine (or user's) unit roundoff error and machine (or user's) zero.

(1) Let A = D^T D, B = T^T T and C = D^T T + T^T D.
(2) Choose Y such that Y^T Y = I.
Repeat
  (3.1) Let s_i = (y_i^T B y_i / y_i^T A y_i)^{1/2}, for i = 1, ..., r.
  (3.2) Compute the n × r matrix F_Y such that (F_Y)_{ij} = ∂E/∂Y_{ij} and let G = F_Y − Y F_Y^T Y.
  (3.3) To compute ∆, solve the linear system of equations

        F_{YY}(∆) − Y skew(F_Y^T ∆) − skew(∆ F_Y^T) Y − (1/2)(I − Y Y^T) ∆ Y^T F_Y = −G,

        where F_{YY}(∆) = A ∆ S − Y S ∆^T A Y − C ∆ + Y ∆^T C Y + B ∆ S^{-1} − Y S^{-1} ∆^T B Y and skew(X) = (X − X^T)/2.
  (3.4) Move from Y in direction ∆ to Ȳ using Ȳ = Y M + Q N, where Q R = (I − Y Y^T)∆ is the compact QR factorization of (I − Y Y^T)∆, and M and N are given by

        [M; N] = exp([K, −R^T; R, 0]) [I; 0],

        with K = Y^T ∆.
  (3.5) Compute Error = ∥Ȳ − Y∥. Let Y = Ȳ.
until Error ≤ ǫ ∥Y∥ + δ.
(4) Let X = Y S Y^T and E = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1}).

Note.
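Step (3.4) can be sketched numerically as below; this is a minimal illustration of Edelman's geodesic update on the Stiefel manifold [1], Ȳ = Y M + Q N with [M; N] = exp([K, −R^T; R, 0])[I; 0], where a random tangent direction stands in for the Newton step ∆. The update keeps Y^T Y = I:

```python
import numpy as np
from scipy.linalg import expm

def stiefel_move(Y, Delta):
    """Move from Y along the tangent direction Delta on the Stiefel manifold,
    following step (3.4): Ybar = Y M + Q N."""
    n, r = Y.shape
    K = Y.T @ Delta                                     # K = Y^T Delta (skew for tangent Delta)
    Q, R = np.linalg.qr((np.eye(n) - Y @ Y.T) @ Delta)  # compact QR factorization
    block = np.block([[K, -R.T], [R, np.zeros((r, r))]])
    MN = expm(block) @ np.vstack([np.eye(r), np.zeros((r, r))])
    M, N = MN[:r], MN[r:]
    return Y @ M + Q @ N

rng = np.random.default_rng(1)
n, r = 6, 2
Y = np.linalg.qr(rng.standard_normal((n, r)))[0]
A = rng.standard_normal((r, r))
K = A - A.T                                             # skew-symmetric part
Z = rng.standard_normal((n, r))
Delta = Y @ K + (np.eye(n) - Y @ Y.T) @ Z               # tangent direction: Y^T Delta is skew
Ybar = stiefel_move(Y, Delta)
print(np.allclose(Ybar.T @ Ybar, np.eye(r)))            # stays on the manifold
```

Since the block matrix [K, −R^T; R, 0] is skew-symmetric when K is, its exponential is orthogonal, which is why the orthogonality constraint is preserved exactly (up to roundoff).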
The linear equation

F_{YY}(∆) − Y skew(F_Y^T ∆) − skew(∆ F_Y^T) Y − (1/2)(I − Y Y^T) ∆ Y^T F_Y = −G   (3.3)

may be solved by various methods, including the conjugate gradient method and GMRES [20]. Another possible method is to convert the linear operator appearing on the left side of (3.3) to an nr × nr linear system of equations. In Section 5, we present the numerical results obtained by using these three methods and compare the respective accuracies and computing times.

3.2. Solving the Semi-definite Total Least Squares Problem. In Section 3.1, we outlined Algorithm 1 to solve the rank r positive semi-definite total least squares problem. Here, we discuss how to solve the general positive semi-definite total least squares problem. A positive semi-definite solution for the overdetermined linear system of equations DX ≃ T, whose rank is not known, needs to be computed. This problem arises, for example, in the estimation of the compliance matrix of a deformable structure [18]. To solve this problem, we can apply R_r-PSDTLS for the possible values r = 1, ..., n, compute the corresponding solutions X_r = Y S Y^T, and identify the one minimizing E = tr(Y^T A Y S − Y^T C Y + Y^T B Y S^{-1}). We will refer to this approach as PSDTLS. In Section 5.1, we report some numerical results to compare PSDTLS with two existing methods. Although our proposed method (PSDTLS) computes the minimizer of E for each value of r = 1, ..., n and then finds the optimal solution among the X_r, for r = 1, ..., n, it takes less time to solve the problem than two other methods proposed in the literature.

Here, we discuss the convergence properties of R_r-PSDTLS. We cite a theorem to be used to establish the local quadratic convergence of R_r-PSDTLS to the unique solution of (2.9). We also show that the computational complexity of every iteration of our proposed algorithm is

N_r = 2mn^2 + n^2 r^2 + n^2(r^2 + r) + 2n^2 r.

Moreover, we provide an upper bound for the computational complexity of our proposed approach for solving the positive semi-definite total least squares problem, PSDTLS.
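The conversion mentioned in the Note, from the operator form of (3.3) to an nr × nr linear system, can be sketched as follows. The operator below is implemented from the formulas in step (3.3) of Algorithm 1, with the gradient taken as F_Y = 2(AYS − CY + BYS^{-1}) (our derivation from E); its matrix is assembled column by column, and a dense least squares solve stands in for the conjugate gradient iteration. This is an illustration, not the paper's code:

```python
import numpy as np

def skew(M):
    return (M - M.T) / 2.0

def newton_operator(Delta, Y, A, B, C, S):
    """Left side of the operator equation (3.3)."""
    S_inv = np.diag(1.0 / np.diag(S))
    F_Y = 2.0 * (A @ Y @ S - C @ Y + B @ Y @ S_inv)      # gradient of E with respect to Y
    F_YY = (A @ Delta @ S - Y @ S @ Delta.T @ A @ Y
            - C @ Delta + Y @ Delta.T @ C @ Y
            + B @ Delta @ S_inv - Y @ S_inv @ Delta.T @ B @ Y)
    n = Y.shape[0]
    Pi = np.eye(n) - Y @ Y.T                              # projector onto the complement of range(Y)
    return (F_YY - Y @ skew(F_Y.T @ Delta)
            - skew(Delta @ F_Y.T) @ Y - 0.5 * Pi @ Delta @ Y.T @ F_Y)

def to_matrix(op, n, r):
    """Assemble the nr x nr matrix of a linear operator on n x r matrices (the CG-L idea)."""
    M = np.zeros((n * r, n * r))
    for j in range(n * r):
        e = np.zeros(n * r); e[j] = 1.0
        M[:, j] = op(e.reshape(n, r)).ravel()
    return M

rng = np.random.default_rng(2)
m, n, r = 10, 5, 2
D, T = rng.standard_normal((m, n)), rng.standard_normal((m, n))
A, B, C = D.T @ D, T.T @ T, D.T @ T + T.T @ D
Y = np.linalg.qr(rng.standard_normal((n, r)))[0]
S = np.diag(rng.uniform(1.0, 2.0, r))
op = lambda Delta: newton_operator(Delta, Y, A, B, C, S)
M = to_matrix(op, n, r)
Delta0 = rng.standard_normal((n, r))
b = op(Delta0).ravel()                                    # consistent right-hand side
sol, *_ = np.linalg.lstsq(M, b, rcond=None)
print(np.allclose(op(sol.reshape(n, r)).ravel(), b))
```

In the operator-form variants (CG-O, GMRES-O), the same `newton_operator` function would be wrapped as a matrix-free linear operator instead of being assembled, avoiding the extra cost noted for CG-L.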
Theorem 3.3. ([11]) Newton's method [1] applied to the function f(Y) = tr(Y^T Q Y N) on the Stiefel manifold V_r(R^n) = {Y ∈ R^{n×r} ; Y^T Y = I} converges locally to the unique solution of

min F(Y)  s.t.  Y^T Y = I,   (3.4)

at least quadratically.

Proof. See [11].

Lemma 3.4.
Algorithm 1 converges locally to the unique solution of problem (2.9) at least quadratically.

Proof. In Algorithm 1, we have two main computations: applying Newton's approach on the Stiefel manifold to update Y, and updating the scalars s_i using (3.1). The rate of convergence is not affected by (3.1) and is governed by Newton's approach. Thus, considering Theorem 3.3, R_r-PSDTLS converges at least quadratically to the unique solution of (2.9).

Computing Cost: Rank r Positive Semi-definite Total Least Squares Problem.
The computational complexity of one iteration of R_r-PSDTLS is given in Table 3.1. The first, second and third columns respectively give the computational complexities of solving the linear problem (3.3) using the conjugate gradient method in the operator form (CG-O), GMRES in the operator form (GMRES-O), and the conjugate gradient method after converting (3.3) into a linear system of equations (CG-L); for details, see [20].

Table 3.1
Computational complexities for one iteration using different approaches.

Computation    | CG-O                              | GMRES-O                           | CG-L
s_i            | 2n^2 r                            | 2n^2 r                            | 2n^2 r
G              | n^2(r^2 + r)                      | n^2(r^2 + r)                      | n^2(r^2 + r)
Solving (3.3)  | n^2 r^2                           | n^2 r^2                           | n^2 r^3
Total          | n^2 r^2 + n^2(r^2 + r) + 2n^2 r   | n^2 r^2 + n^2(r^2 + r) + 2n^2 r   | n^2 r^3 + n^2(r^2 + r) + 2n^2 r

Also, the complexity of computing A, B and C in step (1) of R_r-PSDTLS is 2mn^2. Thus, the total computational complexity of performing one iteration of R_r-PSDTLS is

N_r = 2mn^2 + n^2 r^2 + n^2(r^2 + r) + 2n^2 r.

As shown in Table 3.1, the computational complexity of CG-L is approximately r times greater than the ones due to CG-O and GMRES-O.

Computing Cost: Positive Semi-definite Total Least Squares Problem.
The computational cost of solving the positive semi-definite total least squares problem is equal to the sum of the computational costs of solving all the rank r positive semi-definite total least squares problems. The GMRES algorithm for solving the nr-dimensional operator equation (3.3) is terminated after at most nr iterations [20]; hence, the maximum computational cost would be

N = 2mn^2 + Σ_{r=1}^n nr (n^2 r^2 + n^2(r^2 + r) + 2n^2 r).

In some applications, like rank one signal recovery, a positive semi-definite rank 1 least squares problem needs to be solved; see, e.g., [26]. The following lemma is concerned with the special case r = 1.

Lemma 3.5. If r = 1, then problem (2.9) can be converted to a quadratic eigenvalue problem.

Proof. Reformulating the Lagrangian function for the optimization problem (2.9) for the case r = 1, we have

L(y, s, Ω) = s y^T A y + (1/s) y^T B y − y^T C y − Ω(y^T y − 1).

Let u = s^{1/2} y and v = s^{-1/2} y. Thus,

L(u, v, Ω) = u^T A u + v^T B v − u^T C v − Ω(u^T v − 1).

The KKT necessary optimality conditions lead to

2Au − Cv − Ωv = 0,   (3.5)
2Bv − Cu − Ωu = 0,
u^T v = 1.

If D and T have full ranks, then it can be concluded from (3.5) that

(2B − (1/2)(C + ΩI) A^{-1} (C + ΩI)) v = 0,   (3.6)
u = (1/2) A^{-1} (C + ΩI) v.

Note that (3.6) is a quadratic eigenvalue problem, which may be solved by various methods [13, 28]. This approach will be referred to as R1-PSDTLS. Next, we point out two mathematical problems and describe how to solve them using R_r-PSDTLS.
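As an illustration of Lemma 3.5, the quadratic eigenvalue problem (3.6) can be handled by the standard companion linearization; the sketch below is a generic QEP solve on random full-rank data, not necessarily the method of [13, 28]. Writing (3.6) as (K_0 + Ω K_1 + Ω^2 K_2) v = 0 gives K_0 = 2B − (1/2) C A^{-1} C, K_1 = −(1/2)(C A^{-1} + A^{-1} C) and K_2 = −(1/2) A^{-1}:

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(3)
m, n = 8, 4
D, T = rng.standard_normal((m, n)), rng.standard_normal((m, n))
A, B, C = D.T @ D, T.T @ T, D.T @ T + T.T @ D
A_inv = np.linalg.inv(A)

# expand (2B - (1/2)(C + wI) A^{-1} (C + wI)) v = 0 in powers of w
K0 = 2.0 * B - 0.5 * C @ A_inv @ C
K1 = -0.5 * (C @ A_inv + A_inv @ C)
K2 = -0.5 * A_inv

# companion linearization: generalized eigenproblem P z = w Q z with z = [v; w v]
Z, I = np.zeros((n, n)), np.eye(n)
P = np.block([[Z, I], [-K0, -K1]])
Q = np.block([[I, Z], [Z, K2]])
w, V = eig(P, Q)

# check the quadratic residual for one eigenpair (smallest |w| for robustness)
k = np.argmin(np.abs(w))
v = V[:n, k]
res = (K0 + w[k] * K1 + w[k] ** 2 * K2) @ v
print(np.linalg.norm(res) < 1e-6 * np.linalg.norm(v))
```

Since K_2 is nonsingular (A is invertible for full-rank D), the pencil has no infinite eigenvalues; each eigenvalue is a candidate multiplier Ω, from which u follows via u = (1/2) A^{-1}(C + ΩI) v.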
4. Two Problems and Their Solutions.
Two slightly different problems also arise in some contexts. Here, we describe these problems and show how to solve them making use of Algorithm 1.

(i) Positive semi-definite total minimum rank problem:

min rank(X)
s.t. tr(∆D^T ∆T) < e,   (4.1)
     X ⪰ 0.

This problem arises in different contexts such as system identification and control, Euclidean embedding and collaborative filtering [10]. Both ordinary and total least squares formulations have been considered for solving the problem [10, 16]. In an ordinary formulation, the minimum possible rank of a positive semi-definite matrix X needs to be computed so that the ordinary least squares error, ∥DX − T∥, is less than an error bound e. In a total formulation, however, the goal is to minimize rank(X), where X is a positive semi-definite matrix satisfying ∥[∆D, ∆T]∥ < e. To solve this problem using R_r-PSDTLS, our proposed total error E = tr(∆D^T ∆T) needs to satisfy the constraint E < e. We start from the smallest possible rank, r = 1, and solve the corresponding positive semi-definite total least squares problem. If the inequality E < e holds for the computed X_r, then we stop; otherwise, we increase r by one and continue iteratively until the inequality E < e is satisfied. The iteration might be performed at most n times. If after n iterations none of the matrices X_r, r = 1, ..., n, satisfies the inequality E < e, then we consider the X_r with the smallest corresponding value of E as the solution of the minimum rank problem. We summarize the discussion above in the following algorithm.

Algorithm 2.
Solving the minimum rank positive semi-definite total least squares (MRPSDTLS) problem.

(1) Let A = D^T D, B = T^T T and C = D^T T + T^T D.
(2) Let r = 1.
(3) Apply R_r-PSDTLS with the input arguments D, T and r and compute X and E.
If E < e then
  let X^* = X and stop
Else
  let r = r + 1 and go to (3).
EndIf

A significant characteristic of our proposed approach for solving the minimum rank problem is that for each rank r the minimum value of E is determined by applying Algorithm 1. Hence, to solve the minimum rank problem, it is sufficient to find the minimum value of r satisfying the inequality constraint.

(ii) Computing the correlation matrix: This problem is a special case of the positive semi-definite total least squares problem. Computing a correlation matrix is equivalent to finding a positive semi-definite matrix X to satisfy

X ≃ C,  PX ≃ Q.

The linear constraints X ≃ C and PX ≃ Q can be replaced by the overdetermined system of equations

[I; P] X = [C; Q].   (4.2)

To solve the overdetermined system of equations (4.2), both ordinary and total formulations have been considered [9, 12, 22, 27]. We note that (4.2) is a special case of the positive semi-definite total least squares problem with data and target matrices D = [I; P] and T = [C; Q], where I is the n × n identity matrix, P, Q ∈ R^{m×n} and C ∈ R^{n×n} are arbitrary. Here, we make use of the PSDTLS approach described in Section 3.2 for solving (4.2).

Next, we report some numerical results. We provide a comparison of our proposed algorithms and some existing methods on randomly generated test problems.
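The stacked system (4.2) is straightforward to form. The sketch below builds D = [I; P] and T = [C; Q] from random data and, in place of PSDTLS, computes only a naive baseline: the ordinary least squares solution, symmetrized and projected onto the positive semi-definite cone by eigenvalue clipping. It illustrates the problem setup, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 5, 3
Craw = rng.standard_normal((n, n))
Cmat = (Craw + Craw.T) / 2          # symmetric target C (hypothetical data)
P = rng.standard_normal((m, n))
Qmat = rng.standard_normal((m, n))

# stacked data and target matrices of (4.2): D = [I; P], T = [C; Q]
D = np.vstack([np.eye(n), P])
T = np.vstack([Cmat, Qmat])

# naive baseline: ordinary least squares, then symmetrize and clip to PSD
X_ls, *_ = np.linalg.lstsq(D, T, rcond=None)
X_sym = (X_ls + X_ls.T) / 2
w, U = np.linalg.eigh(X_sym)
X_psd = U @ np.diag(np.clip(w, 0.0, None)) @ U.T

print(np.all(np.linalg.eigvalsh(X_psd) >= -1e-10))   # positive semi-definite
```

The point of the PSDTLS formulation is precisely to do better than this two-stage "solve then project" baseline, by accounting for error in both D and T within the fit itself.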
5. Numerical Results.
We made use of MATLAB 2012b on a Windows 7 machine with a 3.2 GHz CPU and 4 GB of RAM to implement our proposed algorithms and the other methods. We generated random test problems with random data and target matrices. These random matrices were produced using the rand command in MATLAB. The command R = rand(m, n) generates an m × n matrix R with uniformly distributed random entries in the interval [0, 1]. Using Rab = (b − a) ∗ R + a, the matrix Rab with random entries in the interval [a, b] is generated.

The numerical results are presented in two parts. Section 5.1 presents the numerical results corresponding to the rank r positive semi-definite least squares problem and the positive semi-definite total least squares problem. The numerical results for the two problems mentioned in Section 4 are reported in Section 5.2. The abbreviated names of the problems and the corresponding inputs and outputs are listed in tables 5.1 and 5.2.

Table 5.1
Abbreviated names
Problem name                                 | Abbreviation
Rank r positive semi-definite least squares  | R_r-PSDLS
Positive semi-definite least squares         | PSDLS
Minimum rank                                 | MR
Correlation matrix                           | CM

Table 5.2
Inputs and outputs

Problem name | Inputs                       | Outputs
R_r-PSDLS    | Data matrix (D),             | Average computing time in seconds (t),
             | Target matrix (T), r         | Average error value (E)
PSDLS        | D, T                         | t, E
MR           | D, T, Error bound (e)        | t, Average value of rank(X) (r)
CM           | C, P, Q                      | Correlation matrix X, Average standard deviation value of the error matrix (Std)

For a given input size, ten random inputs are generated and the average values of the outputs over the ten problems are reported. In Section 5.1, the numerical results corresponding to R_r-PSDTLS and those due to solving the linear problem (3.3) by each of the three possible methods are presented. We refer to R_r-PSDTLS using the conjugate gradient method in the operator form to solve the linear problem as R_r-PSDTLS-CG-O, R_r-PSDTLS using the conjugate gradient method after converting the linear problem to a linear system of equations as R_r-PSDTLS-CG-L, and R_r-PSDTLS using GMRES in the operator form to solve the linear problem as R_r-PSDTLS-GMRES-O. Considering these numerical results, we can make the following observations:
(1) R_r-PSDTLS-CG-L generates solutions for which the orthogonality constraint, Y^T Y = I, is exactly satisfied; however, due to the high computing cost it is not practical for large problems.
(2) R_r-PSDTLS-GMRES-O computes the solution in less computing time than R_r-PSDTLS-CG-O and R_r-PSDTLS-CG-L.
Some numerical results are also reported in Section 5.1 to compare our proposed approach for solving the positive semi-definite least squares (PSDLS) problem with two existing methods, the interior point method (PSDLS-IntP in [18]) and the path following method described by Toh (PSDLS-PFToh in [30]).

Numerical results corresponding to the special problems mentioned in Section 4 are reported in Section 5.2. There, numerical results obtained by our proposed algorithm for solving the minimum rank (MR) problem (MR-PSDTLS) and two other methods are reported. These two methods are MR-Toh [31] and MR-Recht [10, 16]. Finally, the numerical results corresponding to our proposed method (CM-PSDTLS) and two other methods, CM-IntP and CM-Sun [30], for solving the correlation matrix (CM) problem are also reported in Section 5.2.
In all the tables, the headings m and n correspond to the matrix size (the size of both data and target matrices), r gives the unknown matrix rank, and the columns with headings t and E respectively contain the average computing time and error value.

In summary, the numerical results confirm the effectiveness of PSDTLS in producing more accurate solutions with lower standard deviation values in less time for solving the rank r positive semi-definite total least squares problems. The reported results show that our proposed methods solve the positive semi-definite total least squares problem and the minimum rank problem with a smaller value of error, while being more efficient. Also, the presented results confirm the efficiency of our proposed method in solving the correlation matrix problem.

Here, we report the numerical results for solving rank r positive semi-definite total least squares problems and the general positive semi-definite total least squares problem, respectively.

Rank r Positive Semi-definite Total Least Squares Problem.
In Table 5.3, the average computing time, t, and the average error value,

E = tr((DX − T)^T (D − T X^†) U_r U_r^T),

are reported for PSDTLS-CG-O. The fourth column gives the norm of the error corresponding to the orthogonality constraint, δ = ‖Y^T Y − I‖.

Table 5.3
The average computing times and the average error values for PSDTLS-CG-O.

    m     n     r        δ             t             E
   20    10     5   1.3421E-007   9.9864E-004   1.9459E+001
  100    20    10   6.7435E-006   8.1695E-004   2.0093E+002
  100    50    50   5.3193E-004   3.3691E-003   8.2009E+002
  200   100    50        ∗             ∗             ∗
  400   300   200        ∗             ∗             ∗
 1000   500   200        ∗             ∗             ∗

∗: Out of memory.
In Tables 5.4 and 5.5, we report the results for PSDTLS-GMRES-O and PSDTLS-CG-L, respectively.

Table 5.4
The average computing times and the average error values for PSDTLS-GMRES-O.

    m     n     r        δ             t             E
   20    10     5   7.4924E-008   7.1055E-004   2.2493E+001
  100    20    10   1.7123E-006   1.1905E-003   2.0953E+002
  100    50    50   4.3528E-005   3.2913E-003   1.4490E+003
  200   100    50   2.5649E-004   1.3800E-002   2.6041E+005
  400   300   200   3.7246E-003   1.5974E-001   1.0264E+006
 1000   500   200   1.0021E-002   3.6251E-001   4.5691E+006
Table 5.5
The average computing times and the average error values for PSDTLS-CG-L.

    m     n     r        δ             t             E
   20    10     5        0      1.0904E-003   1.7025E+001
  100    20    10   1.0032E-009   1.1484E-003   1.8363E+002
  100    50    50        ∗             ∗             ∗
  200   100    50        ∗             ∗             ∗
  400   300   200        ∗             ∗             ∗
 1000   500   200        ∗             ∗             ∗

∗: Out of memory.
In Figure 5.1, the Dolan-Moré time profile is presented to compare the computing times of PSDTLS-CG-O, PSDTLS-GMRES-O and PSDTLS-CG-L for solving the PSDLS problems. Generating enough test problems is necessary for producing an illustrative performance profile; we generated 300 test problems (50 problems for each matrix size). The presented time profile shows that PSDTLS-GMRES-O needs much less computing time than the other methods.
Fig. 5.1. The Dolan-Moré performance profile (comparing the computing times by PSDTLS-CG-O, PSDTLS-GMRES-O and PSDTLS-CG-L).
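A Dolan-Moré time profile of the kind plotted above can be computed as in the following sketch. The function and the toy timing data are ours, not the paper's benchmark harness; failed runs (the "out of memory" entries) are encoded as infinite times so they never count as solved.

```python
import numpy as np

def dolan_more_profile(T, taus):
    """Dolan-More performance profile.  T[p, s] is the computing time of
    solver s on problem p (np.inf for a failed run).  Returns rho with
    rho[s, k] = fraction of problems that solver s solves within a factor
    taus[k] of the fastest solver on each problem."""
    best = T.min(axis=1, keepdims=True)      # fastest time per problem
    ratios = T / best                        # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                     for s in range(T.shape[1])])

# Toy example: three problems, two solvers; inf marks a failure.
T = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [1.0, np.inf]])
rho = dolan_more_profile(T, taus=[1.0, 2.0])
```

Plotting rho[s] against taus (one curve per solver) gives the profile; the curve that climbs fastest and highest corresponds to the most efficient and most robust solver.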
Positive Semi-definite Total Least Squares Problem.
Here, the numerical results for solving the general positive semi-definite total least squares problem are reported. In Tables 5.6, 5.7 and 5.8, the average computing times, t, and the average error values, E, for solving the PSDLS problem using PSDTLS-CG-O, PSDTLS-GMRES-O and PSDTLS-CG-L are respectively reported. The third column represents the average value of δ = ‖Y^T Y − I‖.

Table 5.6
The average computing times and the average error values for PSDTLS-CG-O.

    m     n        δ             t             E
   20    10   1.0347E-007   5.0208E-004   4.7606E+000
  100    20   6.4238E-006   9.8582E-004   2.1158E+001
  100    50   1.0046E-004   4.3211E-003   2.5376E+001
  200   100        ∗             ∗             ∗
  400   300        ∗             ∗             ∗

∗: Out of memory.
Table 5.7
The average computing times and the average error values for PSDTLS-GMRES-O.

    m     n        δ             t             E
   20    10   8.1282E-009   7.3702E-004   3.0053E+000
  100    20   5.7631E-008   1.1974E-003   1.9550E+001
  100    50   1.3214E-007   4.9063E-003   2.5928E+001
  200   100   6.8472E-006   1.4990E-002   5.7797E+001
  400   300   2.1064E-004   1.8509E-001   7.5146E+001
Table 5.8
The average computing times and the average error values for PSDTLS-CG-L.

    m     n        δ             t             E
   20    10        0      8.9295E-004   3.0689E+000
  100    20        0      1.6088E-003   1.6594E+001
  100    50        ∗             ∗             ∗
  200   100        ∗             ∗             ∗
  400   300        ∗             ∗             ∗

∗: Out of memory.
Considering the results reported in Tables 5.6, 5.7 and 5.8, PSDTLS-CG-O and PSDTLS-GMRES-O perform approximately the same on small problems; however, for large problems, PSDTLS-GMRES-O outperforms PSDTLS-CG-O on almost all the test problems. Thus, we report the results obtained by PSDTLS-GMRES-O in comparisons with other methods. The average computing times for solving the positive semi-definite total least squares problem using PSDTLS, PSDLS-IntP and PSDLS-PFToh are reported in Table 5.9. The third column presents the values of
TOL for PSDLS-IntP and PSDLS-PFToh. Similarly, the average error values, E, are given in Table 5.10 for PSDLS-IntP, PSDLS-PFToh and PSDTLS.

Table 5.9
The average computing times for solving PSDLS using PSDTLS, PSDLS-IntP and PSDLS-PFToh.

    m     n   TOL (IntP/PFToh)     PSDTLS          IntP          PFToh
   20    10     1.0000E-006     5.0208E-004   3.1785E-003   1.8652E-003
  100    20     1.0000E-006     9.8582E-004   6.1142E-002   7.6139E-003
  100    50     1.0000E-005     4.0462E-003   9.2783E-001   5.2841E-002
  200   100     1.0000E-005     1.3841E-002        ∗             ∗

∗: Out of memory.

Table 5.10
The average error values for solving PSDLS using PSDTLS, PSDLS-IntP and PSDLS-PFToh.

    m     n   TOL (IntP/PFToh)     PSDTLS          IntP          PFToh
   20    10     1.0000E-006     2.7168E+000   3.7791E+000   5.1176E+000
  100    20     1.0000E-006     9.5355E+000   1.2956E+001   9.6873E+000
  100    50     1.0000E-005     1.0158E+001   1.9162E+001   1.4855E+001
  200   100     1.0000E-005     2.1574E+001        ∗             ∗

∗: Out of memory.

The corresponding Dolan-Moré time profile is shown in Figure 5.2 to compare the computing times for solving the PSDLS problem. We generated 500 test problems (100 for each matrix size) to provide an illustrative time profile. The results confirm that our proposed algorithm for solving the positive semi-definite least squares problem performs much faster than the other methods.
Fig. 5.2. The Dolan-Moré performance profile (comparing the computing times by PSDTLS, PSDLS-IntP and PSDLS-PFToh).
Here, we report numerical results corresponding to the special problems mentioned in Section 4. In Table 5.11, the average computing times for solving the minimum rank (MR) problem are reported. The third column gives the value of the error bound, e, in the MR problem.

Table 5.11
The average computing times, t, and the resulting ranks, r, for solving the MR problem using the MR-PSDTLS-CG-O, MR-PSDTLS-GMRES-O and MR-PSDTLS-CG-L methods.

                               t                                     r
    m     n     e      CG-O       GMRES-O        CG-L       CG-O  GMRES-O  CG-L
   20    10    10   1.0000E-003  1.1000E-003  1.0000E-003     3      3       4
  100    20   300   9.5430E-004  1.8000E-003  1.9000E-003    17     14      18
  100    50   500   4.7000E-003  4.3000E-003  4.2000E-003    22     16      24
  200   100  1000   1.4000E-002  1.8400E-002       ∗         19     19       ∗
  400   300  3000   8.0024E-001  1.8700E-001       ∗         67     66       ∗

∗: Out of memory.

The average computing times needed by MR-PSDTLS, MR-Toh and MR-Recht and the average resulting rank, r, for solving the MR problem are reported in Table 5.12.

Table 5.12
The average computing times and the resulting ranks for solving the MR problem using the MR-PSDTLS, MR-Toh and MR-Recht methods.

                               t                                  Rank
    m     n     e    MR-PSDTLS     MR-Toh      MR-Recht   MR-PSDTLS  MR-Toh  MR-Recht
   20    10    10   1.0000E-003  9.0300E-003  7.8100E-003      3        3       4
  100    20   300   9.5430E-004  1.2300E-002  3.1100E-002     14       12      11
  100    50   500   4.2000E-003  2.1800E-002  9.8800E-003     16       13      13
  200   100  1000   1.4000E-002  8.9700E-002  9.1200E-002     19       18      19
  400   300  3000   1.8700E-001       ∗            ∗

∗: Out of memory.

An illustrative comparison is also provided in Figure 5.3 based on the corresponding Dolan-Moré time profile. We generated 500 test problems to provide the time profile. The presented Dolan-Moré time profile shows that our proposed algorithm for solving the minimum rank problem needs less computing time than the other two methods.
Fig. 5.3. The Dolan-Moré performance profile (comparing the computing times by MR-PSDTLS, MR-Toh and MR-Recht).
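The incremental-rank search behind the MR strategy of Section 4 can be illustrated with the following sketch: increase r until a rank-r solution meets the error bound e. A plain truncated-SVD least-squares fit stands in for the rank-r PSDTLS subproblem solver here; the function name and the surrogate fit are our assumptions, not the paper's algorithm.

```python
import numpy as np

def min_rank_within_tolerance(D, T, e):
    """Smallest r admitting a rank-r solution X with ||D X - T||_F <= e.
    The rank-r fit is a truncated-SVD least-squares surrogate; a rank-r
    PSDTLS solver would take its place in the paper's setting."""
    X_ls, *_ = np.linalg.lstsq(D, T, rcond=None)      # unconstrained fit
    U, s, Vt = np.linalg.svd(X_ls, full_matrices=False)
    for r in range(1, s.size + 1):
        X_r = (U[:, :r] * s[:r]) @ Vt[:r]             # best rank-r approximation
        if np.linalg.norm(D @ X_r - T) <= e:
            return r, X_r
    return s.size, X_ls

# Example: with D = I the minimum rank of T itself is recovered.
D = np.eye(6)
A = np.random.default_rng(1).standard_normal((6, 2))
T = A @ A.T                                           # rank-2 target
r, X = min_rank_within_tolerance(D, T, e=1e-8)
```

The loop mirrors the tables above, where the reported rank is the first r at which the residual drops below the bound e.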
In Table 5.13, the average computing times are reported for computing the correlation matrices by our proposed method (CM-PSDTLS).
Table 5.13
The average computing times for computing the correlation matrix using CM-PSDTLS-CG-O, CM-PSDTLS-GMRES-O and CM-PSDTLS-CG-L.

                               t
    m     n      CG-O       GMRES-O        CG-L
   20    10   1.1000E-003  1.3000E-003  1.3000E-003
  100    20   2.1000E-003  1.6000E-003  3.7000E-003
  100    50   3.5000E-003  6.7000E-003       ∗
  200   100        ∗            ∗            ∗
  400   300        ∗            ∗            ∗

∗: Out of memory.

Similarly, in Table 5.14 the average value of the standard deviation, Std, for computing the correlation matrix is reported.
Table 5.14
The average standard deviation values of the error matrix for computing the correlation matrices, using CM-PSDTLS-CG-O, CM-PSDTLS-GMRES-O and CM-PSDTLS-CG-L.

                              Std
    m     n      CG-O       GMRES-O        CG-L
   20    10   3.0410E-001  3.0280E-001  4.0030E-001
  100    20   2.9940E-001  2.9140E-001  3.3810E-001
  100    50   2.9310E-001  2.8870E-001       ∗
  200   100        ∗            ∗            ∗
  400   300        ∗            ∗            ∗

∗: Out of memory.

The average computing times for CM-PSDTLS, CM-IntP and CM-Sun are reported in Table 5.15.

Table 5.15
The average computing times for computing the correlation matrix using CM-PSDTLS, CM-IntP and CM-Sun.

                               t
    m     n   CM-PSDTLS      CM-IntP       CM-Sun
   20    10   1.1000E-003  2.4100E-002  9.6640E-003
  100    20   1.6000E-003  1.6140E-001  5.4100E-002
  100    50   3.5000E-003  7.4941E+000  4.7810E-001
  200   100   1.6400E-002       ∗            ∗

∗: Out of memory.

In Figure 5.4, the Dolan-Moré time profile is presented to compare the computing times needed by CM-PSDTLS, CM-IntP and CM-Sun to solve the correlation matrix problem. Here, we generated 250 test problems (50 for each matrix size). The presented time profile confirms that our proposed algorithm computes a correlation matrix much faster than the other methods.
Fig. 5.4. The Dolan-Moré performance profile (comparing the computing times by CM-PSDTLS, CM-IntP and CM-Sun).
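For reference, a standard baseline for the correlation matrix problem is Higham's alternating projections with Dykstra's correction. The sketch below implements that baseline, not the paper's CM-PSDTLS algorithm; it shows what the CM problem asks for: the nearest unit-diagonal positive semi-definite matrix to a given symmetric matrix.

```python
import numpy as np

def nearest_correlation(A, tol=1e-8, max_iter=500):
    """Nearest correlation matrix via alternating projections (Higham)
    with Dykstra's correction -- a standard baseline, not CM-PSDTLS."""
    Y = (A + A.T) / 2.0
    dS = np.zeros_like(Y)
    for _ in range(max_iter):
        R = Y - dS                                   # apply Dykstra correction
        w, V = np.linalg.eigh(R)
        X = (V * np.maximum(w, 0.0)) @ V.T           # project onto the PSD cone
        dS = X - R                                   # update the correction
        Y_new = X.copy()
        np.fill_diagonal(Y_new, 1.0)                 # project onto unit diagonal
        if np.linalg.norm(Y_new - Y) <= tol * max(1.0, np.linalg.norm(Y)):
            return Y_new
        Y = Y_new
    return Y

# An inconsistent "correlation" matrix (indefinite) and its repaired version.
A = np.array([[ 1.0,  0.9,  0.9],
              [ 0.9,  1.0, -0.9],
              [ 0.9, -0.9,  1.0]])
C = nearest_correlation(A)
```

The input A above is indefinite (its smallest eigenvalue is negative), so it is not a valid correlation matrix; the output C has unit diagonal and is positive semi-definite up to the convergence tolerance.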
In Table 5.16, the average values of the standard deviation, Std, for computing the correlation matrix are reported.
Table 5.16
The average standard deviation values of the error matrix for computing the correlation matrices, using CM-PSDTLS, CM-IntP and CM-Sun.

                              Std
    m     n   CM-PSDTLS      CM-IntP       CM-Sun
   20    10   3.0410E-001  4.0404E+003   1.2957E+002
  100    20   2.9940E-001  1.7266E+004   6.3595E+004
  100    50   2.9310E-001  8.2909E+005   3.8741E+006
  200   100        ∗            ∗             ∗

∗: Out of memory.

Considering the numerical results reported in this section, we summarize our observations:
(1) A newly defined problem, R_r-PSDTLS, was considered and an efficient algorithm was proposed for its solution.
(2) Although our proposed algorithm for solving the PSDLS problem, PSDTLS, applies R_r-PSDTLS, for r = 1, ..., n, in search for the solution, it appears to be more efficient than PSDLS-IntP and PSDLS-PFToh.
(3) In contrast with other available methods, our use of the total formulation for solving the PSDLS problem, which accounts for error in both data and target matrices, turns out to be practically effective in producing more meaningful results.
(4) The proposed method for solving the PSDLS problem, PSDTLS, is more efficient than the other methods.
(5) The proposed method for solving the minimum rank problem, MR-PSDTLS, is also more efficient than MR-Toh and MR-Recht.
(6) The proposed method for computing the correlation matrix, CM-PSDTLS, shows to be more efficient and robust, computing a correlation matrix with a lower value of the standard deviation of error in T as compared to CM-IntP and CM-Sun.
6. Concluding Remarks.
We proposed a new approach to solve positive semi-definite total least squares (PSDLS) problems. Consideration of our proposed error estimate for both data and target matrices admitted a more realistic problem formulation. We first considered a newly defined given rank positive semi-definite total least squares (R_r-PSDTLS) problem and presented an at least quadratically convergent algorithm for its solution. Numerical results confirmed the effectiveness of our approach in computing solutions of R_r-PSDTLS problems in less computing time than the interior point method and the path following algorithm. We then showed how to apply R_r-PSDTLS to solve the general PSDLS problem. Based on the reported numerical results, our method for solving the PSDLS problem also showed to be more efficient than the interior point method and the path following algorithm. A specifically effective approach was also described to solve the rank 1 positive semi-definite total least squares problem, R1-PSDTLS. In addition, we noted that R_r-PSDTLS can be applied to other problems arising in control and financial modeling: the minimum rank (MR) problem and correlation matrix computation. Using the Dolan-Moré performance profiles, we showed our proposed method for solving the MR problem to be more efficient than a path following algorithm and a semi-definite programming approach. Furthermore, in computing the correlation matrix, numerical results showed lower standard deviation of error as compared to the interior point method and the semi-definite programming approach.

ACKNOWLEDGEMENT.
The authors thank the Research Council of Sharif University of Technology for supporting this work.
REFERENCES

[1] Edelman A., Arias T. A., Smith S. T.: The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl., 20(2), 303-353 (1998)
[2] Golub G. H., Van Loan C. F.: An analysis of the total least squares problem, SIAM J. Numer. Anal., 17, 883-893 (1980)
[3] Hu H.: Positive definite constrained least-squares estimation of matrices, Linear Algebra Appl., 229, 167-174 (1995)
[4] Hu H., Olkin I.: A numerical procedure for finding the positive definite matrix closest to a patterned matrix, Statistics and Probability Letters, 12, 511-515 (1991)
[5] Krislock N. G., Lang J., Varah J., Pai D. K., Seidel H.: Local compliance estimation via positive semi-definite constrained least squares, IEEE Trans. Robotics and Automation, 20(6), 1007-1011 (2004)
[6] Larson H. J.: Least squares estimation of the components of a symmetric matrix, Technometrics, 8(2), 360-362 (1966)
[7] McInroy J., Hamann J. C.: Design and control of flexure jointed hexapods, IEEE Trans. Robotics and Automation, 16(4), 372-381 (2000)
[8] Paige C. C., Strakoš Z.: Scaled total least squares fundamentals, Numer. Math., 91, 117-146 (2000)
[9] Rebonato R., Jäckel P.: The most general methodology to create a valid correlation matrix for risk management and option pricing purposes, J. Risk, 2, 17-27 (1999)
[10] Recht B., Fazel M., Parrilo P. A.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Review, 52(3), 471-501 (2010)
[11] Smith S. T.: Optimization techniques on Riemannian manifolds, Fields Inst. Commun., 3, 113-146 (1994)
[12] Sun D., Gao Y.: Calibrating least squares covariance matrix problems with equality and inequality constraints, SIAM J. Matrix Anal. Appl., 31, 1432-1457 (2009)
[13] Tisseur F.: The quadratic eigenvalue problem, SIAM Review, 43(2), 235-286 (2001)
[14] Toh K. C.: An inexact primal-dual path-following algorithm for convex quadratic SDP, Mathematical Programming, 112, 221-254 (2007)
[15] Woodgate K. G.: Least-squares solution of F = PG over positive semidefinite symmetric P