A symmetrization approach to hypermatrix SVD
Edinah K. Gnang∗, Fan Tian†
April 23, 2020
Abstract
We propose a new hypermatrix singular value decomposition based upon the spectral decomposition of the symmetric products of transposes.
One of the most fruitful ideas in matrix theory is that of matrix decomposition or canonical form. Of the many matrix canonical forms discussed in the literature, the Singular Value Decomposition (SVD for short) is by far the most widely used. Recall that for an arbitrary A ∈ ℂ^{n×n}, the SVD of A is expressed by

$$A = \left(U\sqrt{\operatorname{diag}\sigma(A)}\right)\left(\sqrt{\operatorname{diag}\sigma(A)}\,V\right) \quad\text{such that}\quad UU^{*} = I_{n} = V^{*}V. \qquad (1)$$

Calculating the SVD consists of finding the eigenvalues and eigenvectors of the Hermitian products of A and A^{*}. Important information about the matrix A is obtained through the decomposition, such as the matrix rank, the orthonormal basis vectors and the diagonal matrix of scaling values, all of which are useful to extend to higher dimensions. Over the past decades, considerable progress has been made in generalizing the matrix SVD to higher-order hypermatrices. Two predominant approaches to hypermatrix canonical forms are now well established: the CANDECOMP-PARAFAC (CP) model [CC70, Har70] and the Tucker model [Tuc66], where the former is a special case of the latter. Based on the Tucker model, De Lathauwer, De Moor, and Vandewalle pioneered a multilinear generalization of the matrix SVD to hypermatrices in [DLDMV00], namely the Higher-Order Singular Value Decomposition (HOSVD). The classical models of CP and Tucker or HOSVD generally express the decomposition of a hypermatrix as a sum of outer products of vectors, also referred to as the n-mode product in the form of "hypermatrix times matrices" [KB09]. In particular, the n-mode product enables the hypermatrix SVD by performing matrix SVDs following the mode-n flattening (unfolding) of the original hypermatrix into matrices, and then assembling the results into a hypermatrix of the same order.

One of the advantages of the classical models and the method of HOSVD is that the obtained results guarantee orthogonality to some extent: the singular vectors are entries of orthogonal matrices, and the core hypermatrix coordinating the singular values satisfies a property of all-orthogonality that is a relaxation of the diagonality property in the matrix SVD. Thorough discussions of the classical methods and applications are reviewed in [KB09]. Other more recent studies have also explored alternative representations of a hypermatrix SVD as a sum of outer products of matrices, a generalization based on a different hypermatrix multiplication scheme in the form of "hypermatrix times hypermatrix" [KMP08, KM11].

While the aforementioned generalizations of the higher-order SVD have been widely used in applications, they often reduce the problem to matrix SVDs through folding and unfolding schemes. In contrast to the matrix case, such higher-order SVD methods do not stem from a hypermatrix formulation of the spectral theorem. Recent works [GF17, GER11], motivated by the generalization of the spectral theorem to hypermatrices, suggest new ways to extend the matrix SVD to a hypermatrix SVD while retaining the link to the spectra. In the present note, we discuss, in analogy with the matrix SVD, a new approach to obtaining the orthogonal hypermatrices and the diagonal scaling hypermatrix via the spectral decomposition of symmetric products of transposes. Our work is based on the Bhattacharya-Mesner algebra (BM algebra) introduced in [MB94, GF17, GF20], which has enabled the generalization of many important matrix concepts, including rank, inverses and spectral decompositions, to hypermatrices. In addition to the hypermatrix SVD, we also expand the list of concepts in the BM algebra to include definitions of tensorial orbits and invariants of hypermatrices, and of hypermatrix orthogonality and unitarity.

∗ Department of Applied Mathematics and Statistics, Johns Hopkins University, email: [email protected]
† Department of Applied Mathematics and Statistics, Johns Hopkins University, email: [email protected]

Hypermatrices are multidimensional matrices.
More precisely, a hypermatrix is a finite multiset whose elements (called entries) are indexed by members of some fixed Cartesian product of the form {0, …, n₀ − 1} × {0, …, n₁ − 1} × ⋯ × {0, …, n_{m−1} − 1}. Such a hypermatrix is of order m and of size n₀ × n₁ × ⋯ × n_{m−1}. A hypermatrix is cubic of side length n if n₀ = n₁ = ⋯ = n_{m−1} = n. Hypermatrix algebras arise from natural generalizations of classical matrix notions and algorithms [MB94, GKZ94, Ker08, GER11, GF17, MB90]. The important distinction between hypermatrices and tensors closely mirrors the distinction between matrices and abstract linear transformations. Recall that an abstract linear transformation specified over finite-dimensional K-vector spaces is identified with a matrix orbit. For instance, let M ∈ K^{m×n} be associated with some abstract linear transformation specified relative to the standard bases for K^{n×1} and K^{1×m}. The tensorial orbit of the linear transformation (accounting for all possible coordinate changes) is the matrix set {A · M · B : A ∈ GL_m(K) and B ∈ GL_n(K)}. A matrix property common to every member of a tensorial orbit is a tensorial invariant. Classically, third-order hypermatrices in K^{m×n×p} arise from tensorial orbits induced by the action of various appropriate subgroups of the general linear group on canonical embeddings of K-vector spaces: K^{m×1×1}, K^{1×n×1} and K^{1×1×p} respectively. Incidentally, classical tensorial invariants such as the rank and singular values are defined by analogy with their matrix counterparts. Hypermatrix multiplication, named the Bhattacharya-Mesner product (BM-product), is a generalization of matrix multiplication [MB90, MB94].
Occasionally, for consistency, the product of a conformable matrix pair A ∈ K^{m×ℓ}, B ∈ K^{ℓ×n} is written using the BM-product notation Prod(A, B); such a product is specified entry-wise by

$$\operatorname{Prod}(A, B)[i, j] = \sum_{0 \le t < \ell} A[i, t]\, B[t, j], \qquad \forall\; 0 \le i < m,\; 0 \le j < n.$$

Similarly, the BM-product of a conformable triple of third-order hypermatrices A ∈ K^{m×ℓ×p}, B ∈ K^{m×n×ℓ} and C ∈ K^{ℓ×n×p} is denoted Prod(A, B, C) and specified entry-wise by

$$\operatorname{Prod}(A, B, C)[i, j, k] = \sum_{0 \le t < \ell} A[i, t, k]\, B[i, j, t]\, C[t, j, k], \qquad \forall\; 0 \le i < m,\; 0 \le j < n,\; 0 \le k < p.$$

Furthermore, we recall that the general Bhattacharya-Mesner product of a conformable triple A ∈ K^{m×ℓ×p}, B ∈ K^{m×n×ℓ} and C ∈ K^{ℓ×n×p}, taken with an additional cubic background hypermatrix M ∈ K^{ℓ×ℓ×ℓ} (similar to the metric tensors first introduced in differential geometry [RLC00, Gau28]), is denoted Prod_M(A, B, C) ∈ K^{m×n×p} and specified entry-wise by

$$\operatorname{Prod}_M(A, B, C)[i, j, k] = \sum_{0 \le t_0, t_1, t_2 < \ell} A[i, t_0, k]\, B[i, j, t_1]\, C[t_2, j, k]\, M[t_0, t_1, t_2]. \qquad (2)$$

The original BM-product is thus recovered from the general BM-product by setting the cubic background hypermatrix M equal to the Kronecker delta hypermatrix, denoted ∆, whose entries are specified by

$$\Delta[i_0, i_1, i_2] = \begin{cases} 1 & \text{if } 0 \le i_0 = i_1 = i_2 < n \\ 0 & \text{otherwise}. \end{cases}$$

The general Bhattacharya-Mesner product of conformable matrices A ∈ K^{m×ℓ}, B ∈ K^{ℓ×n}, M ∈ K^{ℓ×ℓ} is given by

$$\operatorname{Prod}_M(A, B)[i, j] = \sum_{0 \le t_0, t_1 < \ell} A[i, t_0]\, B[t_1, j]\, M[t_0, t_1], \qquad \forall\; 0 \le i < m,\; 0 \le j < n.$$

We further recall that the transpose of an arbitrary hypermatrix A ∈ K^{m×n×p}, denoted A^⊤ ∈ K^{n×p×m}, results from a cyclic permutation of the indices and is specified entry-wise as follows:

$$A^{\top}[i, j, k] = A[k, i, j].$$
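The entry-wise formulas above translate directly into code. The sketch below is our own illustration (the function names `bm_prod3`, `bm_prod3_bg`, `transpose` and `delta` are ours, not notation from the paper); it models third-order hypermatrices as nested Python lists and checks that the background hypermatrix ∆ recovers the original BM-product.

```python
# Minimal sketch (ours) of the Bhattacharya-Mesner product and the cyclic
# transpose for third-order hypermatrices stored as nested lists.

import random

def bm_prod3(A, B, C):
    """Prod(A, B, C)[i][j][k] = sum_t A[i][t][k] * B[i][j][t] * C[t][j][k]."""
    m, ell, p = len(A), len(A[0]), len(A[0][0])
    n = len(B[0])
    return [[[sum(A[i][t][k] * B[i][j][t] * C[t][j][k] for t in range(ell))
              for k in range(p)] for j in range(n)] for i in range(m)]

def bm_prod3_bg(A, B, C, M):
    """General BM-product with a cubic background hypermatrix M (Eq. (2))."""
    m, ell, p = len(A), len(A[0]), len(A[0][0])
    n = len(B[0])
    return [[[sum(A[i][t0][k] * B[i][j][t1] * C[t2][j][k] * M[t0][t1][t2]
                  for t0 in range(ell) for t1 in range(ell) for t2 in range(ell))
              for k in range(p)] for j in range(n)] for i in range(m)]

def transpose(A):
    """Cyclic transpose: result[i][j][k] = A[k][i][j], shape m*n*p -> n*p*m."""
    m, n, p = len(A), len(A[0]), len(A[0][0])
    return [[[A[k][i][j] for k in range(m)] for j in range(p)] for i in range(n)]

def delta(n):
    """Kronecker delta hypermatrix: ones on the superdiagonal."""
    return [[[int(i == j == k) for k in range(n)] for j in range(n)] for i in range(n)]

random.seed(1)
rand = lambda: [[[random.randint(-3, 3) for _ in range(2)]
                 for _ in range(2)] for _ in range(2)]
A, B, C = rand(), rand(), rand()

# Background Delta recovers the original BM-product.
print(bm_prod3_bg(A, B, C, delta(2)) == bm_prod3(A, B, C))  # True
# The cyclic transpose has order three.
print(transpose(transpose(transpose(A))) == A)              # True
# Over a commutative ring, Prod(A, B, C)^T = Prod(B^T, C^T, A^T).
print(transpose(bm_prod3(A, B, C)) ==
      bm_prod3(transpose(B), transpose(C), transpose(A)))   # True
```

The last check is the product-transpose identity recalled in the next paragraph; it holds entry-by-entry because the cyclic index shift permutes the three factor roles.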
We adopt the convention

$$A^{\top^{2}} := \left(A^{\top}\right)^{\top}, \qquad A^{\top^{3}} := \left(A^{\top^{2}}\right)^{\top} = A \;\Longrightarrow\; A^{\top^{u}} = A^{\top^{v}} \;\text{ if }\; u \equiv v \bmod 3.$$

Note that when K is commutative,

$$\operatorname{Prod}(A, B, C)^{\top} = \operatorname{Prod}\left(B^{\top}, C^{\top}, A^{\top}\right).$$

Let K denote an arbitrary field (not necessarily commutative) and let GL_n(K) denote the general linear group of invertible n × n matrices whose entries belong to K. When investigating matrices, it is of interest to determine matrix attributes which are independent of the chosen coordinate system. For this purpose we associate with an arbitrary matrix M ∈ K^{m×n} a tensorial orbit induced by the action on M of the group GL_m(K) × GL_n(K) as follows:

$$\mathcal{T}(M) := \left\{A \cdot M \cdot B : A \in GL_m(K) \text{ and } B \in GL_n(K)\right\}. \qquad (3)$$

For instance, the tensorial orbit of any invertible 2 × 2 matrix whose entries are taken from the finite field with two elements, denoted F₂, is the six-element set

$$\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \right\}.$$

In particular, the tensorial orbit of a zero matrix is a singleton: T(0_{m×n}) = {0_{m×n}}. Recall that ∀ M ∈ GL_n(K), T(M) = GL_n(K).
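The orbit facts above can be checked by brute force over F₂, where everything is finite. The sketch below is our own illustration: it enumerates GL₂(F₂), confirms it has (2² − 1)(2² − 2) = 6 elements, and confirms that the tensorial orbit of an invertible seed matrix is all of GL₂(F₂).

```python
# Enumerate GL_2(F_2) and the tensorial orbit {A . M . B} of an invertible M.

from itertools import product

def mul_f2(X, Y):
    """2 x 2 matrix product with entries reduced modulo 2."""
    return tuple(tuple(sum(X[i][t] * Y[t][j] for t in range(2)) % 2
                       for j in range(2)) for i in range(2))

def det_f2(X):
    """Determinant over F_2 (note -1 = 1 mod 2)."""
    return (X[0][0] * X[1][1] + X[0][1] * X[1][0]) % 2

# The invertible matrices among all 16 binary 2 x 2 matrices.
GL2 = [((a, b), (c, d)) for a, b, c, d in product((0, 1), repeat=4)
       if det_f2(((a, b), (c, d))) == 1]
print(len(GL2))  # 6

M = ((0, 1), (1, 0))  # an invertible seed matrix
orbit = {mul_f2(mul_f2(A, M), B) for A in GL2 for B in GL2}
print(orbit == set(GL2))  # True: the orbit of an invertible M is GL_2(F_2)
```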
For K = F_{p^k} with p prime, we have |T(M)| = ∏_{0≤i<n} (p^{kn} − p^{ki}) whenever M ∈ GL_n(K). Analogously, the tensorial orbit of a third-order hypermatrix H ∈ K^{m×n×p} accounts for the actions of conformable pairs of invertible matrices on each of the three sides of H. The three pair actions are X ↦ Prod(U, X, V), X ↦ Prod(X, E^⊤, F^⊤) and X ↦ Prod(P^⊤, Q^⊤, X); since they need not commute, the orbit splits into the six sets obtained from the six possible nestings:

{Prod(P^⊤, Q^⊤, Prod(Prod(U, H, V), E^⊤, F^⊤))},
{Prod(P^⊤, Q^⊤, Prod(U, Prod(H, E^⊤, F^⊤), V))},
{Prod(Prod(P^⊤, Q^⊤, Prod(U, H, V)), E^⊤, F^⊤)},
{Prod(Prod(U, Prod(P^⊤, Q^⊤, H), V), E^⊤, F^⊤)},
{Prod(U, Prod(Prod(P^⊤, Q^⊤, H), E^⊤, F^⊤), V)},
{Prod(U, Prod(P^⊤, Q^⊤, Prod(H, E^⊤, F^⊤)), V)},

where in each set the pairs (U, V), (E, F) and (P, Q) range over conformable pairs of invertible hypermatrices. For convenience we adopt the notational convention such that

Prod(C, Prod(A, X, B), D) = X, ∀ X ∈ K^{m×n×p} ⟺ C = A^{−1}, D = B^{−1},

Prod(Prod(X^⊤, B^⊤, A^⊤), D^⊤, C^⊤) = X, ∀ X ∈ K^{m×n×p} ⟺ C^⊤ = (A^⊤)^{−1}, D^⊤ = (B^⊤)^{−1},

and

Prod(D^{⊤²}, C^{⊤²}, Prod(B^{⊤²}, A^{⊤²}, X^{⊤²})) = X, ∀ X ∈ K^{m×n×p} ⟺ C^{⊤²} = (A^{⊤²})^{−1}, D^{⊤²} = (B^{⊤²})^{−1}.
Recall that the canonical ℝ^{2×2} representation of the field ℂ is prescribed by the correspondence

$$a + b\sqrt{-1} \;\leftrightarrow\; \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \qquad (6)$$

We may therefore express an arbitrary M ∈ ℂ^{n×n} as a new matrix M′ ∈ ℝ^{2n×2n} obtained by replacing each entry of M by the corresponding 2 × 2 real matrix representation. It follows that no loss of generality is incurred from restricting the discussion to real matrices.

It is well known that the Singular Value Decomposition (SVD for short) of A ∈ ℝ^{n×n} is obtained by solving for matrices U, V, diag(µ) and diag(ν) in the constraints

$$\left(AA^{\top}\right)^{k} = \left(U \operatorname{diag}(\mu)^{k}\right)\left(U \operatorname{diag}(\mu)^{k}\right)^{\top} \quad\text{and}\quad \left(A^{\top}A\right)^{k} = \left(\operatorname{diag}(\nu)^{k} V\right)^{\top}\left(\operatorname{diag}(\nu)^{k} V\right), \qquad \forall\, k \in \{1, 2\}.$$

A distinctive feature of the SVD constraints is that they can be equivalently formulated as a pair of fixed-point constraints of the form

$$\left(AA^{\top}\right)\left(\left(U\operatorname{diag}(\mu)\right)^{\top}\right)^{-1} = U\operatorname{diag}(\mu), \qquad \left(\left(\operatorname{diag}(\nu)V\right)^{\top}\right)^{-1}\left(A^{\top}A\right) = \operatorname{diag}(\nu)V. \qquad (7)$$

The fixed-point formulation in Eq. (7) lies at the heart of iterative procedures for SVD numerical approximation schemes, which fortunately extend to hypermatrices. Characteristic polynomials which eliminate the entries of U and V from the SVD constraints in Eq. (1) are obtained from the rank conditions

$$\operatorname{Rank}\left(AA^{\top} - \left(U \mu_{i} I_{n}\right)\left(U \mu_{i} I_{n}\right)^{\top}\right) < \operatorname{Rank}\left(AA^{\top}\right) \quad\text{and}\quad \operatorname{Rank}\left(A^{\top}A - \left(\nu_{i} I_{n} V\right)^{\top}\left(\nu_{i} I_{n} V\right)\right) < \operatorname{Rank}\left(A^{\top}A\right)$$

$$\implies \det\left(AA^{\top} - \mu_{i}^{2} I_{n}\right) = 0 \quad\text{and}\quad \det\left(A^{\top}A - \nu_{i}^{2} I_{n}\right) = 0, \qquad \forall\; 0 \le i < n. \qquad (8)$$

It is well known that {µ_i : 0 ≤ i < n} = {ν_i : 0 ≤ i < n}, and as a result we can take diag(µ) = diag(σ) = diag(ν).
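As a concrete sketch of the eliminant constraints in Eq. (8) (our own illustration, not code from the paper), for a 2 × 2 real matrix the singular values are the square roots of the eigenvalues of AA^⊤, which a symmetric 2 × 2 matrix yields through the quadratic formula:

```python
# Singular values of a 2 x 2 real matrix via the spectrum of A A^T.

from math import sqrt

def singular_values_2x2(A):
    """Return the two singular values of a 2 x 2 real matrix, largest first."""
    # Form the symmetric product S = A A^T.
    s00 = A[0][0] ** 2 + A[0][1] ** 2
    s01 = A[0][0] * A[1][0] + A[0][1] * A[1][1]
    s11 = A[1][0] ** 2 + A[1][1] ** 2
    # Eigenvalues of [[s00, s01], [s01, s11]] by the quadratic formula.
    mean = (s00 + s11) / 2
    gap = sqrt(((s00 - s11) / 2) ** 2 + s01 ** 2)
    return (sqrt(mean + gap), sqrt(mean - gap))

A = [[3, 0], [4, 5]]
# A A^T = [[9, 12], [12, 41]] has eigenvalues 45 and 5,
# so the singular values are sqrt(45) and sqrt(5).
print(singular_values_2x2(A))
```

The same two eigenvalues arise from A^⊤A, which is the matrix form of the identity {µ_i} = {ν_i} used just above.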
Once the singular values are known, we simultaneously solve for all entries of U via constraints given by

$$\left(I_{n} \otimes \operatorname{Vandermonde}\{\sigma \circ \sigma\}\right) \operatorname{vec}\left(U[i,k]\,U[j,k] :\; 0 \le i < j < n,\; 0 \le k < n\right) = \operatorname{vec}\left(\left(AA^{\top}\right)^{k}[i,j] :\; 0 \le i < j < n,\; 0 \le k < n\right),$$

and also simultaneously solve for all entries of V via constraints given by

$$\left(I_{n} \otimes \operatorname{Vandermonde}\{\sigma \circ \sigma\}\right) \operatorname{vec}\left(V[k,i]\,V[k,j] :\; 0 \le i < j < n,\; 0 \le k < n\right) = \operatorname{vec}\left(\left(A^{\top}A\right)^{k}[i,j] :\; 0 \le i < j < n,\; 0 \le k < n\right).$$

Note that the constraints above express a composition of constraints of type one and type two as described in [GG18]. We now extend to third-order hypermatrices the matrix symmetrization formulation of the SVD. For an arbitrary A ∈ ℂ^{n×n×n}, the three products of transposes which necessarily result in a symmetric hypermatrix are

$$\operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right)^{\top} = \operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right),$$
$$\operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right)^{\top} = \operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right),$$
$$\operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right)^{\top} = \operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right).$$

Just as was done for matrices, we devise the SVD from the spectral decomposition of these symmetric products of transposes. Recall that the scaling hypermatrices described in Section 5 are the hypermatrix analog of diagonal matrices and are characterized by the constraints

$$D^{\circ 3} \in \left\{\operatorname{Prod}\left(D^{\top^{2}}, D^{\top}, D\right),\; \operatorname{Prod}\left(D, D^{\top^{2}}, D^{\top}\right),\; \operatorname{Prod}\left(D^{\top}, D, D^{\top^{2}}\right)\right\},$$

where D^{∘3} denotes the Hadamard cube of the scaling hypermatrix D. Here we recall that the Hadamard exponent H^{∘z} is defined for an arbitrary H ∈ ℂ^{m×n×p} and z ∈ ℂ as follows:

$$H^{\circ z}[i,j,k] = \begin{cases} \left(H[i,j,k]\right)^{z} & \text{if } H[i,j,k] \ne 0 \\ 0 & \text{otherwise}.\end{cases}$$
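Both claims above, the symmetry of the three products of transposes and the scaling characterization, can be sanity-checked numerically. The sketch below is our own (function names ours); it checks symmetry for a random cubic hypermatrix and checks the Hadamard-cube constraint for one simple family of scaling hypermatrices, those supported on the superdiagonal.

```python
# Check symmetry of the products of transposes and a scaling-constraint example.

import random

def bm_prod3(A, B, C):
    n = len(A)
    return [[[sum(A[i][t][k] * B[i][j][t] * C[t][j][k] for t in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

def tr(A):
    """Cyclic transpose of a cubic hypermatrix."""
    n = len(A)
    return [[[A[k][i][j] for k in range(n)] for j in range(n)] for i in range(n)]

def is_symmetric(H):
    return H == tr(H)  # invariance under the cyclic index permutation

random.seed(2)
n = 3
A = [[[random.randint(-3, 3) for _ in range(n)] for _ in range(n)] for _ in range(n)]
A1, A2 = tr(A), tr(tr(A))
prods = [bm_prod3(A, A2, A1), bm_prod3(A1, A, A2), bm_prod3(A2, A1, A)]
print(all(is_symmetric(S) for S in prods))  # True

# A superdiagonal D (entries d_i at positions (i, i, i)) satisfies the
# Hadamard-cube scaling constraint D^{o3} = Prod(D, D^{T^2}, D^{T}).
d = [2, -1, 5]
D = [[[d[i] if i == j == k else 0 for k in range(n)]
      for j in range(n)] for i in range(n)]
cube = [[[D[i][j][k] ** 3 for k in range(n)] for j in range(n)] for i in range(n)]
print(bm_prod3(D, tr(tr(D)), tr(D)) == cube)  # True
```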
These scaling constraints are the hypermatrix diagonality constraints generalized from the following matrix constraints:

$$\operatorname{Prod}\left(D^{\top}, D\right) = D^{\circ 2} = \operatorname{Prod}\left(D, D^{\top}\right).$$

Note that, in contrast to the matrix case, scaling hypermatrices are not necessarily symmetric. For simplicity we describe the detailed derivation of the SVD for an arbitrary side length two cubic hypermatrix A ∈ ℂ^{2×2×2} whose entries are given by

$$A[:,:,0] = \begin{pmatrix} a_{000} & a_{010} \\ a_{100} & a_{110} \end{pmatrix}, \qquad A[:,:,1] = \begin{pmatrix} a_{001} & a_{011} \\ a_{101} & a_{111} \end{pmatrix}.$$

The first symmetric product admits the spectral form

$$\operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right) = \operatorname{Prod}\left(\operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right),\; \operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right)^{\top^{2}},\; \operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right)^{\top}\right), \qquad D_{\mu}^{\circ 3} = \operatorname{Prod}\left(D_{\mu}^{\top^{2}}, D_{\mu}^{\top}, D_{\mu}\right),$$

with scaling slices

$$D_{\mu}[:,:,0] = \begin{pmatrix} \mu_{00} & 0 \\ \mu_{01} & 0 \end{pmatrix}, \qquad D_{\mu}[:,:,1] = \begin{pmatrix} 0 & \mu_{01} \\ 0 & \mu_{11} \end{pmatrix}, \qquad \operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right)[i,j,k] = \mu_{\min\{i,j\}\max\{i,j\}}\,\mu_{\min\{j,k\}\max\{j,k\}}\,u_{ijk}. \qquad (9)$$

Similarly,

$$\operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right) = \operatorname{Prod}\left(\operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right)^{\top},\; \operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right),\; \operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right)^{\top^{2}}\right), \qquad D_{\nu}^{\circ 3} = \operatorname{Prod}\left(D_{\nu}, D_{\nu}^{\top^{2}}, D_{\nu}^{\top}\right),$$

$$D_{\nu}[:,:,0] = \begin{pmatrix} \nu_{00} & \nu_{01} \\ 0 & 0 \end{pmatrix}, \qquad D_{\nu}[:,:,1] = \begin{pmatrix} 0 & 0 \\ \nu_{01} & \nu_{11} \end{pmatrix}, \qquad \operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right)[i,j,k] = \nu_{\min\{i,k\}\max\{i,k\}}\,\nu_{\min\{j,k\}\max\{j,k\}}\,v_{ijk}, \qquad (10)$$

and

$$\operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right) = \operatorname{Prod}\left(\operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right)^{\top^{2}},\; \operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right)^{\top},\; \operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right)\right), \qquad D_{\omega}^{\circ 3} = \operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top^{2}}, D_{\omega}^{\top}\right),$$

$$D_{\omega}[:,:,0] = \begin{pmatrix} \omega_{00} & 0 \\ 0 & \omega_{01} \end{pmatrix}, \qquad D_{\omega}[:,:,1] = \begin{pmatrix} \omega_{01} & 0 \\ 0 & \omega_{11} \end{pmatrix}, \qquad \operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right)[i,j,k] = \omega_{\min\{i,k\}\max\{i,k\}}\,\omega_{\min\{i,j\}\max\{i,j\}}\,w_{ijk}. \qquad (11)$$

The hypermatrices U, V and W, whose individual slices correspond to eigenmatrices, are subject to the following third-order orthogonality constraints:

$$\operatorname{Prod}\left(U, U^{\top^{2}}, U^{\top}\right)[i,j,k] = \operatorname{Prod}\left(V^{\top}, V, V^{\top^{2}}\right)[i,j,k] = \operatorname{Prod}\left(W^{\top^{2}}, W^{\top}, W\right)[i,j,k] = \begin{cases} 1 & \text{if } i = j = k \\ 0 & \text{otherwise}.\end{cases} \qquad (12)$$

A distinctive feature of the SVD constraints, quite analogous to the matrix setting, is the equivalent formulation as fixed-point constraints of the form

$$\operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right) = \operatorname{Prod}\left(\operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right),\; \left(\operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right)^{\top^{2}}\right)^{-1},\; \left(\operatorname{Prod}\left(U, D_{\mu}, D_{\mu}^{\top}\right)^{\top}\right)^{-1}\right),$$

$$\operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right) = \operatorname{Prod}\left(\left(\operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right)^{\top}\right)^{-1},\; \operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right),\; \left(\operatorname{Prod}\left(D_{\nu}^{\top}, V, D_{\nu}\right)^{\top^{2}}\right)^{-1}\right),$$

$$\operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right) = \operatorname{Prod}\left(\left(\operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right)^{\top^{2}}\right)^{-1},\; \left(\operatorname{Prod}\left(D_{\omega}, D_{\omega}^{\top}, W\right)^{\top}\right)^{-1},\; \operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right)\right). \qquad (13)$$

Just as in the matrix case, the characteristic polynomials which determine the entries of the scaling hypermatrices (the hypermatrix analog of the singular values) are given by constraints of the form

$$\forall\, 0 \le i < 2, \quad \operatorname{Rank}\operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right) > \operatorname{Rank}\left\{\operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right) - \operatorname{Prod}\left(\widetilde{U}_{i}, \widetilde{U}_{i}^{\top^{2}}, \widetilde{U}_{i}^{\top}\right)\right\} \quad\text{where}\quad \widetilde{U}_{i} = \operatorname{Prod}\left(U, D_{\mu}^{[i]}, \left(D_{\mu}^{[i]}\right)^{\top}\right).$$
Similarly,

$$\forall\, 0 \le i < 2, \quad \operatorname{Rank}\operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right) > \operatorname{Rank}\left\{\operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right) - \operatorname{Prod}\left(\widetilde{V}_{i}^{\top}, \widetilde{V}_{i}, \widetilde{V}_{i}^{\top^{2}}\right)\right\} \quad\text{where}\quad \widetilde{V}_{i} = \operatorname{Prod}\left(\left(D_{\nu}^{[i]}\right)^{\top}, V, D_{\nu}^{[i]}\right),$$

and

$$\forall\, 0 \le i < 2, \quad \operatorname{Rank}\operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right) > \operatorname{Rank}\left\{\operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right) - \operatorname{Prod}\left(\widetilde{W}_{i}^{\top^{2}}, \widetilde{W}_{i}^{\top}, \widetilde{W}_{i}\right)\right\} \quad\text{where}\quad \widetilde{W}_{i} = \operatorname{Prod}\left(D_{\omega}^{[i]}, \left(D_{\omega}^{[i]}\right)^{\top}, W\right).$$

The scaling hypermatrices D_µ^{[i]}, D_ν^{[i]} and D_ω^{[i]} are obtained from D_µ, D_ν and D_ω by retaining in each slice only the entries associated with the i-th scaling parameters. Consequently, the characteristic polynomial constraints are expressed, for all 0 ≤ i < 2, by the vanishing of the hypermatrix determinants of

$$\left\{\operatorname{Prod}\left(A, A^{\top^{2}}, A^{\top}\right) - \operatorname{Prod}\left(\widetilde{U}_{i}, \widetilde{U}_{i}^{\top^{2}}, \widetilde{U}_{i}^{\top}\right)\right\}, \quad \left\{\operatorname{Prod}\left(A^{\top}, A, A^{\top^{2}}\right) - \operatorname{Prod}\left(\widetilde{V}_{i}^{\top}, \widetilde{V}_{i}, \widetilde{V}_{i}^{\top^{2}}\right)\right\}, \quad \left\{\operatorname{Prod}\left(A^{\top^{2}}, A^{\top}, A\right) - \operatorname{Prod}\left(\widetilde{W}_{i}^{\top^{2}}, \widetilde{W}_{i}^{\top}, \widetilde{W}_{i}\right)\right\}. \qquad (14)$$

Using the hypermatrix determinant formula introduced by Gnang and Filmus in [GF17], the corresponding constraints are expressed as polynomial equations in the cubed scaling parameters and the entries a_{ijk}: each constraint is a difference of two products, each product pairing a binomial factor in a cubed scaling parameter and two cubed entries with a sum of two trilinear monomials in the entries a_{ijk}; analogous systems of equations hold for the ν and ω parameters. Once we have determined the scaling values, we simultaneously solve for all entries of U via a linear system whose coefficient matrix collects monomials in the µ entries, whose unknowns are products of pairs of entries of U, and whose right-hand side collects quadratic expressions in the entries a_{ijk}; analogous linear systems determine all entries of V and of W in terms of the ν and ω entries respectively. The resulting data assemble into the expansion

$$A = \sum_{0 \le i,j,k < 2} \sigma_{i,j,k}\, \operatorname{Prod}\left(\widetilde{U}[:, i, :],\; \widetilde{V}[:, :, j],\; \widetilde{W}[k, :, :]\right) \qquad (15)$$

where Ũ = Prod(U, D_µ, D_µ^⊤), Ṽ = Prod(D_ν^⊤, V, D_ν), and W̃ = Prod(D_ω, D_ω^⊤, W). The coefficients {σ_{i,j,k} : 0 ≤ i, j, k < 2} ⊂ ℂ of the linear combination in Eq. (15) are obtained by solving a system of linear equations. The expansion in Eq. (15) is equivalently expressed as

$$A = \operatorname{Prod}\left(U', V', W'\right),$$

where U′ ∈ ℂ^{2 × ‖σ‖ × 2}, V′ ∈ ℂ^{2 × 2 × ‖σ‖}, and W′ ∈ ℂ^{‖σ‖ × 2 × 2}, and σ is the vector whose entries are made up of the coefficients {σ_{i,j,k} : 0 ≤ i, j, k < 2} of the linear combination. As an illustration, consider the task of expressing the SVD of hypermatrices of arbitrary side lengths generated from 2 × 2 × 2 hypermatrices by taking combinations of direct sums and Kronecker products. As shown in [GM18], and similarly to the matrix case, when given the SVDs of hypermatrices A₁ ∈ ℂ^{m×m×m} and A₂ ∈ ℂ^{n×n×n},

$$A_{1} = \operatorname{Prod}\left(U'_{1}, V'_{1}, W'_{1}\right), \qquad A_{2} = \operatorname{Prod}\left(U'_{2}, V'_{2}, W'_{2}\right),$$

the SVDs of A₁ ⊗ A₂ and A₁ ⊕ A₂ are expressed by

$$A_{1} \otimes A_{2} = \operatorname{Prod}\left(U'_{1} \otimes U'_{2},\; V'_{1} \otimes V'_{2},\; W'_{1} \otimes W'_{2}\right) \in \mathbb{C}^{mn \times mn \times mn},$$
$$A_{1} \oplus A_{2} = \operatorname{Prod}\left(U'_{1} \oplus U'_{2},\; V'_{1} \oplus V'_{2},\; W'_{1} \oplus W'_{2}\right) \in \mathbb{C}^{(m+n) \times (m+n) \times (m+n)}.$$

The action of a matrix in ℂ^{n×n} on the vector space ℂ^{n×1} can be seen as a special instance of a more general (not necessarily linear) map introduced in [GF17], specified in terms of a matrix pair (A, B) ∈ ℂ^{n×n} × ℂ^{n×n} as follows:

$$T_{A,B} : \mathbb{C}^{n \times 1} \to \mathbb{C}^{n \times 1}, \quad y = T_{A,B}(x), \quad\text{such that}\quad \forall\, 0 \le k < n, \quad y[k] = \sqrt{\operatorname{Prod}_{P_{k}}\left(x^{\top}, x\right)} \quad\text{where}\quad P_{k} = \operatorname{Prod}_{I_{n}[:,k]\, I_{n}[k,:]}(A, B). \qquad (16)$$

Note that the map T_{A,B} is determined only up to the signs of the entries of its output. Invertibility in this context means that none of the n univariate polynomials in

$$\operatorname{Resultant}_{x}\left\{\operatorname{Prod}_{P_{k}}\left(x^{\top}, x\right) : 0 \le k < n\right\}$$

is an identically non-zero constant.
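Unwinding the background product in Eq. (16): with background matrix I_n[:,k] I_n[k,:], the k-th output entry squares to the product of the k-th entries of A^⊤x and Bx. The sketch below (our own, with illustrative names; it assumes the inputs keep those products nonnegative so the square roots stay real) checks the resolution-of-identity property for a pair with B = A^{−1}.

```python
# Sketch of the map T_{A,B} from Eq. (16) for real 2 x 2 matrices.

from math import sqrt, isclose

def t_map(A, B, x):
    """Entry-wise, y[k]^2 = (A^T x)[k] * (B x)[k]; output signs are undetermined."""
    n = len(x)
    ATx = [sum(A[t][k] * x[t] for t in range(n)) for k in range(n)]
    Bx = [sum(B[k][t] * x[t] for t in range(n)) for k in range(n)]
    # Assumes each product is nonnegative for the chosen inputs.
    return [sqrt(ATx[k] * Bx[k]) for k in range(n)]

# With B = A^{-1}, the map obeys the resolution of identity:
# it preserves the sum of squares of the entries.
A = [[2.0, 1.0], [1.0, 1.0]]        # det A = 1
Ainv = [[1.0, -1.0], [-1.0, 2.0]]   # inverse of A
x = [0.6, -0.8]
y = t_map(A, Ainv, x)
print(isclose(sum(v * v for v in y), sum(v * v for v in x)))  # True
```

This is exactly the sum-of-squares preservation stated in the next paragraph; when instead B = A^⊤ the same formula reproduces, up to entry signs, the linear map x ↦ A^⊤x.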
For instance, when n = 2 and

$$A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}, \qquad B = \begin{pmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \end{pmatrix},$$

the map T_{A,B} is invertible if neither of the polynomials in

$$\{Q_{0}(x_{0}),\, Q_{1}(x_{1})\} = \operatorname{Resultant}_{x}\left\{\operatorname{Prod}_{P_{k}}\left(x^{\top}, x\right) : 0 \le k < 2\right\}$$

is an identically non-zero constant; expanded explicitly, Q₀ and Q₁ are signed sums of products of quadratic factors in the entries of A, B, x and y. Furthermore, when A = B^{−1} the map T_{A,B} is subject to the resolution of identity

$$y = T_{A,B}(x) \implies \operatorname{Prod}\left(y^{\top}, y\right) = \operatorname{Prod}\left(x^{\top}, x\right).$$

In other words, the map preserves the sum of the squares of the entries. Also note that when A = B^⊤, the map T_{A,B} expresses, up to the signs of the entries, a linear transformation. In particular, when AB = I_n and B = A^⊤ ∈ ℝ^{n×n}, the map T_{A,A^⊤} expresses, up to the entry signs, a linear isometry of ℝ^{n×1}, thereby emphasizing the importance of matrix orthogonality. Recall for illustration purposes that

$$X = \begin{pmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \end{pmatrix}$$

is orthogonal if X · X^⊤ = I₂. Hence

$$X \cdot X^{\top} = I_{2} \implies x_{00}^{2} + x_{01}^{2} = 1, \quad x_{00}x_{10} + x_{01}x_{11} = 0, \quad x_{10}^{2} + x_{11}^{2} = 1.$$

On the one hand,

$$\prod_{0 \le i, j < 2} x_{ij} = 0 \implies X \in \left\{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}\right\},$$

up to a permutation of the columns.
On the other hand, when 0 ≠ ∏_{0≤i,j<2} x_{ij}, the vanishing constraint implies

$$x_{00}x_{10} + x_{01}x_{11} = 0 \iff x_{00}x_{10}\left(x_{01}x_{11}\right)^{-1} = -1,$$

so that

$$\begin{pmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \end{pmatrix} = \begin{pmatrix} -st/r & s \\ r & t \end{pmatrix}. \qquad (17)$$

By normalizing the rows of X we obtain the following parametrization of the orthogonal matrices:

$$X = \begin{pmatrix} \dfrac{-st/r}{\sqrt{(st/r)^{2} + s^{2}}} & \dfrac{s}{\sqrt{(st/r)^{2} + s^{2}}} \\[2ex] \dfrac{r}{\sqrt{r^{2} + t^{2}}} & \dfrac{t}{\sqrt{r^{2} + t^{2}}} \end{pmatrix}, \qquad s \in \{-1, 1\}, \quad\text{and}\quad r \ne \pm\, t\sqrt{-1}. \qquad (18)$$

To express some important invariants of orthogonal matrices, consider the index rotation operation introduced in [GM18], denoted A^{R_θ} for θ ∈ {0, π/2, π, 3π/2}, which generalizes the matrix transpose operation and is defined for an arbitrary A ∈ ℂ^{n×n} by

$$A^{R_{0}} = A, \qquad A^{R_{\pi/2}} = A^{\top}Q, \qquad A^{R_{\pi}} = QAQ, \qquad A^{R_{3\pi/2}} = QA^{\top},$$

where Q = Σ_{0≤i<n} I_n[:, i] I_n[n − 1 − i, :] denotes the exchange permutation matrix. Similarly, recall for illustration purposes that a hypermatrix X ∈ ℂ^{2×2×2} with slices

$$X[:,:,0] = \begin{pmatrix} x_{000} & x_{010} \\ x_{100} & x_{110} \end{pmatrix}, \qquad X[:,:,1] = \begin{pmatrix} x_{001} & x_{011} \\ x_{101} & x_{111} \end{pmatrix},$$

is orthogonal if

$$\operatorname{Prod}\left(X, X^{\top^{2}}, X^{\top}\right)[:,:,0] = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \operatorname{Prod}\left(X, X^{\top^{2}}, X^{\top}\right)[:,:,1] = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

The corresponding constraints are therefore given by the polynomial constraints

$$x_{000}x_{001}x_{100} + x_{010}x_{011}x_{110} = 0, \quad x_{001}x_{100}x_{101} + x_{011}x_{110}x_{111} = 0, \quad x_{000}^{3} + x_{010}^{3} = 1, \quad x_{101}^{3} + x_{111}^{3} = 1.$$

When 0 ≠ ∏_{0≤i,j,k<2} x_{ijk}, the above system of equations can be solved rationally: the two trilinear constraints express two of the entries as monomials in the remaining ones, which yields slices X[:,:,0] and X[:,:,1] whose entries are monomials in a smaller set of free parameters v₀, v₁, …. (19)
We account for the sum-of-cubes constraints by normalizing the appropriate rows: each entry of the affected row of X[:,:,0] is divided by the cube root of the corresponding sum of cubes, and similarly for X[:,:,1], where the free parameter v ranges over the cube roots of unity

$$v \in \left\{\exp\left(\frac{2\pi k\sqrt{-1}}{3}\right) : 0 \le k < 3\right\}.$$

When ∏_{0≤i,j,k<2} x_{ijk} = 0, the orthogonality constraints instead force several of the entries to vanish simultaneously; the admissible vanishing patterns, each an assignment of the form x_{ijk} = 0 to between two and four of the eight entries, form a finite list of supports. To express some important invariants of orthogonal hypermatrices, we extend the index rotation operation to third-order hypermatrices; it is denoted A^{R[θ_x, θ_y, θ_z]} for θ_x, θ_y, θ_z ∈ {0, π/2, π, 3π/2}, such that A^{R[θ_x, 0, 0]} denotes the hypermatrix which results from performing the index rotation by angle θ_x on each row slice of A. Similarly, A^{R[0, θ_y, 0]} denotes the hypermatrix which results from performing the index rotation by angle θ_y on each column slice of A, and finally A^{R[0, 0, θ_z]} denotes the hypermatrix which results from performing the index rotation by angle θ_z on each depth slice of A. The index rotation A^{R[θ_x, θ_y, θ_z]} is performed relative to the axes x, y and z, in that order. For instance, we have
For instance wehave (cid:0) A R [0 , ,θz ] (cid:1) [ i, j, k ] = A (cid:20)(cid:18) i − n − (cid:19) cos θ + (cid:18) n − − j (cid:19) sin θ + n − , (cid:18) i − n − (cid:19) sin θ − (cid:18) n − − j (cid:19) cos θ + n − , k (cid:21) . Prod (cid:16) X , X (cid:62) , X (cid:62) (cid:17) = ∆ = ⇒ Prod X R (cid:20) πk , πk , πk (cid:21) , (cid:32) X R (cid:20) πk , πk , πk (cid:21) (cid:33) (cid:62) , (cid:32) X R (cid:20) πk , πk , πk (cid:21) (cid:33) (cid:62) = ∆ (20)where (cid:2) k π , k π , k π (cid:3) belong to values indicated in the table below [0 , , (cid:2) , , π (cid:3) (cid:2) , π, (cid:3) (cid:2) , π, π (cid:3)(cid:2) , π, π (cid:3) [0 , π, π ] (cid:2) , π, π (cid:3) (cid:2) , π, π (cid:3)(cid:2) π, , π (cid:3) (cid:2) π, , π (cid:3) (cid:2) π, π, π (cid:3) (cid:2) π, π, π (cid:3)(cid:2) π, π, (cid:3) (cid:2) π, π, π (cid:3) (cid:2) π, π, (cid:3) (cid:2) π, π, π (cid:3) [ π, , (cid:2) π, , π (cid:3) (cid:2) π, π, (cid:3) (cid:2) π, π, π (cid:3)(cid:2) π, π, π (cid:3) [ π, π, π ] (cid:2) π, π, π (cid:3) (cid:2) π, π, π (cid:3)(cid:2) π, , π (cid:3) (cid:2) π, , π (cid:3) (cid:2) π, π, π (cid:3) (cid:2) π, π, π (cid:3)(cid:2) π, π, (cid:3) (cid:2) π, π, π (cid:3) (cid:2) π, π, (cid:3) (cid:2) π, π, π (cid:3) As shown in [GF17] if Prod (cid:16) X , X (cid:62) , X (cid:62) (cid:17) = ∆ = Prod (cid:16) Y , Y (cid:62) , Y (cid:62) (cid:17) then we haveProd (cid:16) ( X ⊕ Y ) , ( X ⊕ Y ) (cid:62) , ( X ⊕ Y ) (cid:62) (cid:17) = ∆ ⊕ ∆ andProd (cid:16) ( X ⊗ Y ) , ( X ⊗ Y ) (cid:62) , ( X ⊗ Y ) (cid:62) (cid:17) = ∆ ⊗ ∆ Consider block operation of hymatrices A [: , : , 0] = A A A A A [: , : , 1] = A A A A (21) A (cid:62) tb = (cid:88) ≤ i,j,k< Prod ( K [: , i, :] , K [: , : , j ] , K [ k, : , :]) (cid:62) t ⊗ A ijk (22)s.t. 
the slices of the 0/1 selector hypermatrices K are chosen so that Prod(K[:, i, :], K[:, :, j], K[k, :, :]) is the 2 × 2 × 2 hypermatrix supported on the single block position (i, j, k). Analogously, the entry-wise transpose and the block and entry-wise index rotations are given by

$$A^{\top t}_{e} = \sum_{0 \le i,j,k < 2} \operatorname{Prod}\left(K[:, i, :],\; K[:, :, j],\; K[k, :, :]\right) \otimes \left(A_{ijk}\right)^{\top t}, \qquad (23)$$

$$A^{R_{b, \frac{\pi k}{2}}} = \sum_{0 \le i,j,k < 2} \operatorname{Prod}\left(K[:, i, :],\; K[:, :, j],\; K[k, :, :]\right)^{R_{\frac{\pi k}{2}}} \otimes A_{ijk}, \qquad (24)$$

$$A^{R_{e, \frac{\pi k}{2}}} = \sum_{0 \le i,j,k < 2} \operatorname{Prod}\left(K[:, i, :],\; K[:, :, j],\; K[k, :, :]\right) \otimes \left(A_{ijk}\right)^{R_{\frac{\pi k}{2}}}. \qquad (25)$$

Similarly to the matrix case, if A is a block hypermatrix whose individual blocks are orthogonal hypermatrices, all of the same size and all subject to Prod(X, X^{⊤²}, X^{⊤}) = Δ, then it follows that

$$\forall\, 0 \le i < n, \quad \operatorname{Prod}\left(\frac{A}{\sqrt[3]{n}},\; \left(\left(\frac{A}{\sqrt[3]{n}}\right)^{\top^{2}}_{e}\right)^{\top^{2}}_{b},\; \left(\left(\frac{A}{\sqrt[3]{n}}\right)^{\top}_{e}\right)^{\top}_{b}\right)[i, i, i] = \Delta,$$

in which case the vanishing of the remaining off-superdiagonal block entries of this product is precisely the condition for the normalized block hypermatrix to be orthogonal.
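The direct-sum and Kronecker-product orthogonality claims above can be verified computationally on small examples. The sketch below is our own illustration; it uses two simple orthogonal hypermatrices, the Kronecker delta and the "slab identity" X[i][j][k] = 1 iff i = j, and checks that ⊕ and ⊗ preserve the third-order orthogonality constraint.

```python
# Check that the direct sum and Kronecker product of orthogonal third-order
# hypermatrices remain orthogonal, i.e. Prod(Z, Z^{T^2}, Z^{T}) = Delta.

def bm_prod3(A, B, C):
    n = len(A)
    return [[[sum(A[i][t][k] * B[i][j][t] * C[t][j][k] for t in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

def tr(A):
    n = len(A)
    return [[[A[k][i][j] for k in range(n)] for j in range(n)] for i in range(n)]

def delta(n):
    return [[[int(i == j == k) for k in range(n)] for j in range(n)] for i in range(n)]

def is_orthogonal(Z):
    return bm_prod3(Z, tr(tr(Z)), tr(Z)) == delta(len(Z))

def kron(A, B):
    """Kronecker product of cubic hypermatrices: block-of-scaled-copies layout."""
    a, b = len(A), len(B)
    return [[[A[i // b][j // b][k // b] * B[i % b][j % b][k % b]
              for k in range(a * b)] for j in range(a * b)] for i in range(a * b)]

def dsum(A, B):
    """Direct sum: A and B on diagonal blocks, zeros elsewhere."""
    a, b = len(A), len(B)
    def entry(i, j, k):
        if i < a and j < a and k < a:
            return A[i][j][k]
        if i >= a and j >= a and k >= a:
            return B[i - a][j - a][k - a]
        return 0
    return [[[entry(i, j, k) for k in range(a + b)]
             for j in range(a + b)] for i in range(a + b)]

X = [[[int(i == j) for k in range(2)] for j in range(2)] for i in range(2)]
Y = delta(3)
print(is_orthogonal(X), is_orthogonal(Y))                    # True True
print(is_orthogonal(kron(X, Y)), is_orthogonal(dsum(X, Y)))  # True True
```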
In the case of a $2 \times 2 \times 2$ block hypermatrix
$$\mathbf{A}[:,:,0] = \frac{1}{\sqrt[3]{2}} \begin{pmatrix} \mathbf{A}_{000} & \mathbf{A}_{010} \\ \mathbf{A}_{100} & \mathbf{A}_{110} \end{pmatrix}, \qquad \mathbf{A}[:,:,1] = \frac{1}{\sqrt[3]{2}} \begin{pmatrix} \mathbf{A}_{001} & \mathbf{A}_{011} \\ \mathbf{A}_{101} & \mathbf{A}_{111} \end{pmatrix}$$
where $\forall\, \mathbf{X} \in \left\{\mathbf{A}_{000}, \mathbf{A}_{001}, \mathbf{A}_{010}, \mathbf{A}_{011}, \mathbf{A}_{100}, \mathbf{A}_{101}, \mathbf{A}_{110}, \mathbf{A}_{111}\right\}$ we have $\mathrm{Prod}\left(\mathbf{X}, \mathbf{X}^{\top^2}, \mathbf{X}^{\top}\right) = \mathbf{\Delta}$, the symmetric product of transposes is expressed slice-wise by
$$\mathrm{Prod}\left(\mathbf{A}, \left(\mathbf{A}^{\top^2_e}\right)^{\top^2_b}, \left(\mathbf{A}^{\top_e}\right)^{\top_b}\right)[:,:,0] = \frac{1}{2}\begin{pmatrix} 2\mathbf{\Delta} & \mathrm{Prod}\left(\mathbf{A}_{000}, \mathbf{A}_{100}^{\top^2}, \mathbf{A}_{001}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{010}, \mathbf{A}_{110}^{\top^2}, \mathbf{A}_{011}^{\top}\right) \\ \mathrm{Prod}\left(\mathbf{A}_{100}, \mathbf{A}_{001}^{\top^2}, \mathbf{A}_{000}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{110}, \mathbf{A}_{011}^{\top^2}, \mathbf{A}_{010}^{\top}\right) & \mathrm{Prod}\left(\mathbf{A}_{100}, \mathbf{A}_{101}^{\top^2}, \mathbf{A}_{001}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{110}, \mathbf{A}_{111}^{\top^2}, \mathbf{A}_{011}^{\top}\right) \end{pmatrix},$$
$$\mathrm{Prod}\left(\mathbf{A}, \left(\mathbf{A}^{\top^2_e}\right)^{\top^2_b}, \left(\mathbf{A}^{\top_e}\right)^{\top_b}\right)[:,:,1] = \frac{1}{2}\begin{pmatrix} \mathrm{Prod}\left(\mathbf{A}_{001}, \mathbf{A}_{000}^{\top^2}, \mathbf{A}_{100}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{011}, \mathbf{A}_{010}^{\top^2}, \mathbf{A}_{110}^{\top}\right) & \mathrm{Prod}\left(\mathbf{A}_{001}, \mathbf{A}_{100}^{\top^2}, \mathbf{A}_{101}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{011}, \mathbf{A}_{110}^{\top^2}, \mathbf{A}_{111}^{\top}\right) \\ \mathrm{Prod}\left(\mathbf{A}_{101}, \mathbf{A}_{001}^{\top^2}, \mathbf{A}_{100}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{111}, \mathbf{A}_{011}^{\top^2}, \mathbf{A}_{110}^{\top}\right) & 2\mathbf{\Delta} \end{pmatrix}.$$
A necessary condition for the resulting block hypermatrix to be orthogonal is thus that the six off-diagonal entries vanish, specified by the constraints
$$\mathrm{Prod}\left(\mathbf{A}_{000}, \mathbf{A}_{100}^{\top^2}, \mathbf{A}_{001}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{010}, \mathbf{A}_{110}^{\top^2}, \mathbf{A}_{011}^{\top}\right) = \mathbf{0}$$
$$\mathrm{Prod}\left(\mathbf{A}_{100}, \mathbf{A}_{001}^{\top^2}, \mathbf{A}_{000}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{110}, \mathbf{A}_{011}^{\top^2}, \mathbf{A}_{010}^{\top}\right) = \mathbf{0}$$
$$\mathrm{Prod}\left(\mathbf{A}_{100}, \mathbf{A}_{101}^{\top^2}, \mathbf{A}_{001}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{110}, \mathbf{A}_{111}^{\top^2}, \mathbf{A}_{011}^{\top}\right) = \mathbf{0}$$
$$\mathrm{Prod}\left(\mathbf{A}_{001}, \mathbf{A}_{000}^{\top^2}, \mathbf{A}_{100}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{011}, \mathbf{A}_{010}^{\top^2}, \mathbf{A}_{110}^{\top}\right) = \mathbf{0}$$
$$\mathrm{Prod}\left(\mathbf{A}_{001}, \mathbf{A}_{100}^{\top^2}, \mathbf{A}_{101}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{011}, \mathbf{A}_{110}^{\top^2}, \mathbf{A}_{111}^{\top}\right) = \mathbf{0}$$
$$\mathrm{Prod}\left(\mathbf{A}_{101}, \mathbf{A}_{001}^{\top^2}, \mathbf{A}_{100}^{\top}\right) + \mathrm{Prod}\left(\mathbf{A}_{111}, \mathbf{A}_{011}^{\top^2}, \mathbf{A}_{110}^{\top}\right) = \mathbf{0}$$

References

[CC70] J.
Douglas Carroll and Jih-Jie Chang, Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart–Young" decomposition, Psychometrika (1970), 283–319.

[DLDMV00] L. De Lathauwer, B. De Moor, and J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications (2000), no. 4, 1253–1278.

[dSL08] Vin de Silva and Lek-Heng Lim, Tensor rank and the ill-posedness of the best low-rank approximation problem, SIAM Journal on Matrix Analysis and Applications (2008), no. 3, 1084–1127.

[Gau28] C. F. Gauß, Disquisitiones generales circa superficies curvas, Typis Ditericianis, 1828.

[GER11] E. K. Gnang, A. Elgammal, and V. Retakh, A spectral theory for tensors, Annales de la faculté des sciences de Toulouse Mathématiques (2011), no. 4, 801–841.

[GF17] Edinah K. Gnang and Yuval Filmus, On the spectra of hypermatrix direct sum and Kronecker products constructions, Linear Algebra and its Applications (2017), 238–277.

[GF20] Edinah K. Gnang and Yuval Filmus, On the Bhattacharya–Mesner rank of third order hypermatrices, Linear Algebra and its Applications (2020), 391–418.

[GG18] Edinah K. Gnang and Jeanine S. Gnang, Sketch for a theory of constructs, arXiv e-prints (2018), arXiv:1808.03743.

[GKZ94] I. Gelfand, M. Kapranov, and A. Zelevinsky, Discriminants, resultants and multidimensional determinants, Birkhäuser, Boston, 1994.

[GM18] Edinah K. Gnang and James M. Murphy, Spectral analysis for non-hermitian matrices and directed graphs, 2018.

[Gor69] Paul Gordan, Ueber ternäre Formen dritten Grades, Mathematische Annalen (1869), 90–128.

[Har70] Richard A. Harshman, Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-mode factor analysis, 1970.

[Hil90] David Hilbert, Ueber die Theorie der algebraischen Formen, Mathematische Annalen (1890), no. 4, 473–534.

[KB09] Tamara G. Kolda and Brett W. Bader, Tensor decompositions and applications, SIAM Review (2009), no.
3, 455–500.

[Ker08] R. Kerner, Ternary and non-associative structures, International Journal of Geometric Methods in Modern Physics (2008), no. 8, 1265–1294.

[KM11] Misha E. Kilmer and Carla D. Martin, Factorization strategies for third-order tensors, Linear Algebra and its Applications (2011), no. 3, 641–658, Special Issue: Dedication to Pete Stewart on the occasion of his 70th birthday.

[KMP08] Misha Elena Kilmer, Carla D. Moravitz Martin, and Lisa Perrone, A third-order generalization of the matrix SVD as a product of third-order tensors, 2008.

[Lim13] Lek-Heng Lim, Tensors and hypermatrices, pp. 231–260, 2013.

[MB90] D. M. Mesner and P. Bhattacharya, Association schemes on triples and a ternary algebra, Journal of Combinatorial Theory A 55 (1990), 204–234.

[MB94] D. M. Mesner and P. Bhattacharya, A ternary algebra arising from association schemes on triples, Journal of Algebra (1994), 595–613.

[RLC00] M. M. G. Ricci and T. Levi-Civita, Méthodes de calcul différentiel absolu et leurs applications, Mathematische Annalen (1900), no. 1, 125–201.

[Tuc66] Ledyard R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika 31 (1966), 279–311.