A Sherman-Morrison-Woodbury Identity for Rank Augmenting Matrices with Application to Centering
KURT S. RIEDEL
Courant Institute of Mathematical Sciences, New York University, New York, New York 10012.

SIAM J. Mat. Anal. Received by the editors July 12, 1990; accepted for publication February 19, 1991. The work of this author was supported by the U.S. Department of Energy Grant No. DE-FG02-86ER53223.

Abstract.
Matrices of the form A + (V_1 + W_1) G (V_2 + W_2)^* are considered, where A is a singular ℓ × ℓ matrix and G is a nonsingular k × k matrix, k ≤ ℓ. Let the columns of V_1 be in the column space of A and the columns of W_1 be orthogonal to it. Similarly, let the columns of V_2 be in the column space of A^* and the columns of W_2 be orthogonal to it. An explicit expression for the Moore–Penrose generalized inverse is given, provided that W_i^* W_i has rank k. An application to centering covariance matrices about the mean is given.

Key words.
Linear Algebra, Schur Matrices, Generalized Inverses
AMS(MOS) subject classifications.
The well-known Sherman–Morrison–Woodbury matrix identity [1],

    (A + X G X^T)^{-1} = A^{-1} - A^{-1} X (G^{-1} + X^T A^{-1} X)^{-1} X^T A^{-1},    (1)

is widely used, and several excellent review articles have appeared recently [2]-[4]. However, (1) is valid only when A is nonsingular. In this article, we consider inverses of matrices of the form A + X_1 G X_2^*, where the rank of A + X_1 G X_2^* is larger than the rank of A.
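As a numerical sanity check (our illustration, not part of the original article), identity (1) can be verified on randomly generated matrices. The sketch below assumes NumPy; the diagonal shifts serve only to keep A and G safely nonsingular.

```python
# Check the Sherman-Morrison-Woodbury identity (1) on random real matrices.
import numpy as np

rng = np.random.default_rng(0)
ell, k = 6, 2

A = rng.standard_normal((ell, ell)) + ell * np.eye(ell)  # nonsingular ell x ell
X = rng.standard_normal((ell, k))
G = rng.standard_normal((k, k)) + k * np.eye(k)          # nonsingular k x k

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + X @ G @ X.T)
rhs = Ainv - Ainv @ X @ np.linalg.inv(np.linalg.inv(G) + X.T @ Ainv @ X) @ X.T @ Ainv
print(np.allclose(lhs, rhs))  # True
```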
We decompose the matrix X_1 into V_1 + W_1, where the columns of V_1 are contained in the column space of A and the columns of W_1 are orthogonal to it. Similarly, we decompose X_2 into V_2 + W_2, where the columns of V_2 are contained in the column space of A^* and the columns of W_2 are orthogonal to it. We denote the column space of A by M(A), the transpose of a matrix A by A^T, the Hermitian (conjugate) transpose by A^*, and the Moore–Penrose generalized inverse by the superscript +. We denote the k × k matrix W_i^* W_i by B_i and define C_i ≡ W_i (W_i^* W_i)^{-1}. We will require B_i to be nonsingular; however, the rank of the perturbation, k, can be significantly less than the size of the original matrix. We note that V_i^* W_i = 0 and W_i^* C_i = I_k. Finally, the projection operator onto the column space of W_i satisfies W_i B_i^{-1} W_i^* = W_i C_i^* = C_i W_i^*.

Theorem 1. Let A be an ℓ × ℓ matrix of rank ℓ_1, ℓ_1 < ℓ, let V_i and W_i, i = 1, 2, be ℓ × k matrices, and let G be a nonsingular k × k matrix. Let the columns of V_1 lie in M(A) and the columns of W_1 be orthogonal to M(A). Similarly, let the columns of V_2 lie in M(A^*) and the columns of W_2 be orthogonal to M(A^*). Let B_i ≡ W_i^* W_i have rank k. The matrix Ω ≡ A + (V_1 + W_1) G (V_2 + W_2)^* has the following Moore–Penrose generalized inverse:

    Ω^+ = A^+ - C_2 V_2^* A^+ - A^+ V_1 C_1^* + C_2 (G^+ + V_2^* A^+ V_1) C_1^*.    (2)

Proof: We recall that the Moore–Penrose inverse is the unique generalized inverse which satisfies the following four conditions (Ref. [5], p. 26):

    (a) Ω Ω^+ Ω = Ω,  (b) Ω^+ Ω Ω^+ = Ω^+,  (c) (Ω Ω^+)^* = Ω Ω^+,  (d) (Ω^+ Ω)^* = Ω^+ Ω.

The identity is verified by direct computation:

    Ω Ω^+ ≡ A A^+ - A C_2 V_2^* A^+ - A A^+ V_1 C_1^* + A C_2 (G^+ + V_2^* A^+ V_1) C_1^*
          + (V_1 + W_1) G (V_2 + W_2)^* A^+ - (V_1 + W_1) G (V_2 + W_2)^* C_2 V_2^* A^+
          - (V_1 + W_1) G (V_2 + W_2)^* A^+ V_1 C_1^* + (V_1 + W_1) G (V_2 + W_2)^* C_2 (V_2^* A^+ V_1) C_1^*
          + (V_1 + W_1) G (V_2 + W_2)^* C_2 G^+ C_1^*.

Since W_2 is orthogonal to M(A^*), we have A W_2 = 0 (hence A C_2 = 0), W_2^* A^+ = 0, and V_2^* W_2 = 0, which simplifies the previous expression to

    Ω Ω^+ ≡ A A^+ - A A^+ V_1 C_1^* + (V_1 + W_1) G V_2^* A^+ - (V_1 + W_1) G W_2^* C_2 V_2^* A^+
          - (V_1 + W_1) G V_2^* A^+ V_1 C_1^* + (V_1 + W_1) G W_2^* C_2 V_2^* A^+ V_1 C_1^*
          + (V_1 + W_1) G W_2^* C_2 G^+ C_1^*.

This expression may be simplified using G W_2^* C_2 G^+ C_1^* = C_1^*, G W_2^* C_2 V_2^* = G V_2^*, and A A^+ V_1 = V_1 to

    Ω Ω^+ ≡ A A^+ + W_1 C_1^*,

and clearly condition (c) is satisfied. The corresponding identity for Ω^+ Ω ≡ A^+ A + C_2 W_2^* requires the decomposition to satisfy A^+ W_1 = 0, W_1^* A = 0, V_1^* W_1 = 0, and V_2^* A^+ A = V_2^*. In addition, the matrix G must satisfy C_2 G^+ C_1^* W_1 G = C_2 and V_1 C_1^* W_1 G = V_1 G. These requirements guarantee that conditions (a), (b), and (d) are also satisfied. []

Remark: The conditions that G and W_i^* W_i have rank k may be replaced by the somewhat weaker but more complicated conditions that G W_2^* C_2 G^+ C_1^* = C_1^*, G W_2^* C_2 V_2^* = G V_2^*, C_2 G^+ C_1^* W_1 G = C_2, and V_1 C_1^* W_1 G = V_1 G.

Note that the generalized inverse in (2) is singular and tends to infinity as W_i approaches zero. Thus (2) does not reduce to (1) as the perturbation tends to zero. When the perturbation of the column space of A is zero, i.e., V_i ≡ 0, Theorem 1 simplifies to

    Ω^+ = A^+ + C_2 G^+ C_1^*.    (3)
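Formula (2) and the two projector identities from the proof can also be checked numerically. The following sketch is our illustration, not part of the paper; it assumes NumPy and real matrices (so that ^* is an ordinary transpose), and builds V_i and W_i from the projectors A A^+ and A^+ A.

```python
# Numerical check of Theorem 1: formula (2) and the projector identities.
import numpy as np

rng = np.random.default_rng(1)
ell, ell1, k = 8, 5, 2

# Singular A of rank ell1 < ell.
A = rng.standard_normal((ell, ell1)) @ rng.standard_normal((ell1, ell))
Ap = np.linalg.pinv(A)
P_col = A @ Ap   # projector onto M(A)
P_row = Ap @ A   # projector onto M(A^*)

# Columns of V1 in M(A), columns of W1 orthogonal to M(A); similarly for V2, W2.
V1 = P_col @ rng.standard_normal((ell, k))
W1 = (np.eye(ell) - P_col) @ rng.standard_normal((ell, k))
V2 = P_row @ rng.standard_normal((ell, k))
W2 = (np.eye(ell) - P_row) @ rng.standard_normal((ell, k))
G = rng.standard_normal((k, k)) + k * np.eye(k)   # nonsingular k x k

C1 = W1 @ np.linalg.inv(W1.T @ W1)   # C_i = W_i (W_i^* W_i)^{-1}
C2 = W2 @ np.linalg.inv(W2.T @ W2)

Omega = A + (V1 + W1) @ G @ (V2 + W2).T
Gp = np.linalg.inv(G)                # G^+ = G^{-1} since G is nonsingular
Omega_p = (Ap - C2 @ V2.T @ Ap - Ap @ V1 @ C1.T
           + C2 @ (Gp + V2.T @ Ap @ V1) @ C1.T)   # formula (2)

print(np.allclose(Omega_p, np.linalg.pinv(Omega)))        # True
print(np.allclose(Omega @ Omega_p, A @ Ap + W1 @ C1.T))   # True: Omega Omega^+
print(np.allclose(Omega_p @ Omega, Ap @ A + C2 @ W2.T))   # True: Omega^+ Omega
```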
When A is a symmetric matrix, the column spaces of A and A^* are identical. Thus, for the case of symmetric A and Ω, Theorem 1 reduces to:

Theorem 2. Let A be a symmetric ℓ × ℓ matrix of rank ℓ_1, ℓ_1 < ℓ, let V and W be ℓ × k matrices, and let G be a k × k nonsingular matrix. Let the columns of V lie in M(A) and the columns of W be orthogonal to M(A). Let B ≡ W^* W have rank k. The matrix Ω ≡ A + (V + W) G (V + W)^* has the following Moore–Penrose generalized inverse:

    Ω^+ = A^+ - C V^* A^+ - A^+ V C^* + C (G^+ + V^* A^+ V) C^*.    (4)

For concreteness, we specialize the preceding identities to the case of rank-one perturbations. In this special case, k ≡ 1, and V_i and W_i reduce to ℓ-vectors v_i and w_i. In the nonsingular case, (1) reduces to Bartlett's identity [6], which states that for an arbitrary nonsingular ℓ × ℓ matrix A and ℓ-vectors v_i,

    (A + v_1 v_2^*)^{-1} = A^{-1} - (A^{-1} v_1)(v_2^* A^{-1}) / (1 + v_2^* A^{-1} v_1).    (5)

In this case, Theorem 1 reduces to the analogous result for an arbitrary singular matrix A with a rank-one perturbation which contains a component perpendicular to the column space of A. Noting that G ≡ 1 and C_i ≡ w_i / |w_i|^2, Theorem 1 simplifies to the following result.

Theorem 3. Let A be an ℓ × ℓ matrix of rank ℓ_1, ℓ_1 < ℓ, and let v_i, w_i, i = 1, 2, be ℓ-vectors. Let v_1 ∈ M(A) and w_1 be orthogonal to M(A), and let v_2 ∈ M(A^*) and w_2 be orthogonal to M(A^*). Assume that w_1 is parallel to w_2 and that w_i ≠ 0. Let Ω ≡ A + (v_1 + w_1)(v_2 + w_2)^*. The Moore–Penrose generalized inverse is

    Ω^+ = A^+ - w_2 v_2^* A^+ / |w_2|^2 - A^+ v_1 w_1^* / |w_1|^2 + (1 + v_2^* A^+ v_1) w_2 w_1^* / (|w_2|^2 |w_1|^2).    (6)

This generalized inverse is singular and tends to infinity as 1/(|w_1| |w_2|) as w_i approaches zero. Thus (6) does not reduce to Bartlett's identity. The projection operator onto the row space of Ω is Ω^+ Ω = A^+ A + w_2 w_2^* / |w_2|^2.
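A numerical check of (6) (again our illustration, assuming NumPy and real matrices). To honor the hypothesis that w_1 is parallel to w_2 with a nonsymmetric A, the sketch forces a common direction u to be orthogonal to both M(A) and M(A^*).

```python
# Check the rank-one formula (6) on a singular, nonsymmetric A.
import numpy as np

rng = np.random.default_rng(2)
ell, ell1 = 7, 4

# Build rank-ell1 A whose column space and row space both avoid direction u.
u = rng.standard_normal(ell)
u /= np.linalg.norm(u)
P = np.eye(ell) - np.outer(u, u)
A = (P @ rng.standard_normal((ell, ell1))) @ (P @ rng.standard_normal((ell, ell1))).T
Ap = np.linalg.pinv(A)

v1 = A @ Ap @ rng.standard_normal(ell)   # v1 in M(A)
v2 = Ap @ A @ rng.standard_normal(ell)   # v2 in M(A^*)
w1, w2 = 0.5 * u, 2.0 * u                # parallel, orthogonal to M(A) and M(A^*)

Omega = A + np.outer(v1 + w1, v2 + w2)
Omega_p = (Ap
           - np.outer(w2, v2 @ Ap) / (w2 @ w2)
           - np.outer(Ap @ v1, w1) / (w1 @ w1)
           + (1 + v2 @ Ap @ v1) * np.outer(w2, w1) / ((w2 @ w2) * (w1 @ w1)))

print(np.allclose(Omega_p, np.linalg.pinv(Omega)))                          # True
print(np.allclose(Omega_p @ Omega, Ap @ A + np.outer(w2, w2) / (w2 @ w2)))  # True
```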
The symmetric version of Theorem 3 was originally developed and applied by the author in his statistical analysis of magnetic fusion data [7]. To estimate the regression parameters in ordinary least squares regression, the sum of squares and products (SSP) matrix needs to be inverted. We apply Theorem 3 to determine the inverse of the SSP matrix in terms of the inverse of the covariance matrix of the covariates.

We decompose the independent variable vector x into a mean value vector x̄ and a fluctuating part x̃. Thus the i-th individual observation has the form x_i = x̄ + x̃_i. Let X denote the n × ℓ data matrix whose rows consist of x_i^*, and let X̃ be the centered data matrix whose rows consist of x̃_i^*.

We assume that some of the independent variables, x_k, have not been varied. Thus X̃^* X̃ is singular. The inverse of the uncentered sum of squares and cross products matrix, X^* X, can now be expressed in terms of the Moore–Penrose generalized inverse of the centered covariance matrix X̃^* X̃. We decompose a multiple of the mean value vector, √n x̄, into v + w, where v ∈ M(X̃^* X̃) and w ⊥ M(X̃^* X̃). The data matrix has the form

    X^* X = X̃^* X̃ + n x̄ x̄^* = X̃^* X̃ + (v + w)(v + w)^*.

Thus we have rewritten X^* X in a form appropriate to the application of Theorem 3; the sketch below illustrates this.
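The following sketch (our illustration, not from the paper; assumes NumPy) carries out the centering application: one covariate is held fixed so that X̃^T X̃ is singular, and (X^* X)^+ is recovered from (X̃^* X̃)^+ via the symmetric rank-one case of formula (6).

```python
# Centering application: invert the uncentered SSP matrix X^T X
# using the pseudoinverse of the centered matrix Xt^T Xt.
import numpy as np

rng = np.random.default_rng(3)
n, ell = 50, 4

X = rng.standard_normal((n, ell))
X[:, -1] = 1.0                    # this covariate is never varied
xbar = X.mean(axis=0)
Xt = X - xbar                     # centered data matrix
S = Xt.T @ Xt                     # centered SSP matrix; last row/column vanish
Sp = np.linalg.pinv(S)

z = np.sqrt(n) * xbar             # sqrt(n) * mean vector, so X^T X = S + z z^T
v = S @ Sp @ z                    # component of z in M(S)
w = z - v                         # component of z orthogonal to M(S)

ssp_pinv = (Sp
            - np.outer(w, v @ Sp) / (w @ w)
            - np.outer(Sp @ v, w) / (w @ w)
            + (1 + v @ Sp @ v) * np.outer(w, w) / (w @ w) ** 2)  # (6), symmetric case
print(np.allclose(ssp_pinv, np.linalg.pinv(X.T @ X)))  # True
```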
In conclusion, the application of these matrix identities requires the decomposition of X_i into the orthogonal components V_i and W_i. Thus our theorems are most useful in situations where the decomposition is trivial.

Acknowledgments. The helpful comments of the referees are gratefully acknowledged.
REFERENCES
1. W.J. Duncan, "Some devices for the solution of large sets of simultaneous equations (with an appendix on the reciprocation of partitioned matrices)", The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, Seventh Series, 35, p. 660, (1944).
2. H.V. Henderson and S.R. Searle, "On deriving the inverse of a sum of matrices", SIAM Review, 23, p. 53, (1981).
3. D.V. Ouellette, "Schur complements and statistics", Linear Algebra and Its Applications, 36, p. 187, (1981).
4. W.W. Hager, "Updating the inverse of a matrix", SIAM Review, 31, p. 221, (1989).
5. C.R. Rao, Linear Statistical Inference and Its Applications, pp. 26, 33, J. Wiley and Sons, New York, 1973.
6. M.S. Bartlett, "An inverse matrix adjustment arising in discriminant analysis", Annals of Mathematical Statistics, 22, p. 107, (1951).