A Tensor Rank Theory and Maximum Full Rank Subtensors
Liqun Qi∗, Xinzhen Zhang†, and Yannan Chen‡

May 6, 2020
Abstract
A matrix always has a full rank submatrix such that the rank of this matrix is equal to the rank of that submatrix. This property is one of the cornerstones of the matrix rank theory. We call this property the max-full-rank-submatrix property. Tensor ranks play a crucial role in low rank tensor approximation, tensor completion and tensor recovery. However, their theory is still not matured yet. Can we set an axiom system for tensor ranks? Can we extend the max-full-rank-submatrix property to tensors? We explore these in this paper. We first propose some axioms for tensor rank functions. Then we introduce proper tensor rank functions. The CP rank is a tensor rank function, but is not proper. There are two proper tensor rank functions, the max-Tucker rank and the submax-Tucker rank, which are associated with the Tucker decomposition. We define a partial order among tensor rank functions and show that there exists a unique smallest tensor rank function. We introduce the full rank tensor concept, and define the max-full-rank-subtensor property. We show that the max-Tucker tensor rank function and the smallest tensor rank function have this property. We define the closure for an arbitrary proper tensor rank function, and show that it is still a proper tensor rank function and has the max-full-rank-subtensor property. An application of the submax-Tucker rank is also presented.
Key words. tensor rank axioms, full rank tensors, the max-full-rank-subtensor property, the max-Tucker rank, the submax-Tucker rank.

AMS subject classifications.

∗ Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China; ([email protected]).
† School of Mathematics, Tianjin University, Tianjin 300354, China; ([email protected]). This author's work was supported by NSFC (Grant No. 11871369).
‡ School of Mathematical Sciences, South China Normal University, Guangzhou, China; ([email protected]). This author was supported by the National Natural Science Foundation of China (11771405).
1 Introduction

A matrix always has a full rank submatrix such that the rank of this matrix is equal to the rank of that submatrix. We call this property the max-full-rank-submatrix property. This property is one of the cornerstones of the matrix rank theory.

We have now arrived at the era of big data and tensors. Tensor ranks play a crucial role in low rank tensor approximation, tensor completion and tensor recovery [1, 4, 5, 6, 8, 10, 12, 13, 14, 15, 16, 17]. However, their theory is still not matured yet. Can we set an axiom system for tensor ranks? Can we extend the full rank concept and the max-full-rank-submatrix property to tensors? We explore these in this paper.

We first propose some axioms for tensor rank functions. Then we introduce proper tensor rank functions. The CP rank is a tensor rank function, but is not proper. There are two proper tensor rank functions, the max-Tucker rank and the submax-Tucker rank, which are associated with the Tucker decomposition. We define a partial order among tensor rank functions and show that there exists a unique smallest tensor rank function. We introduce the full rank tensor concept, and define the max-full-rank-subtensor property. We show that the max-Tucker tensor rank function and the smallest tensor rank function have this property. We define the closure for an arbitrary proper tensor rank function, and show that it is still a proper tensor rank function and has the max-full-rank-subtensor property. An application of the submax-Tucker rank is also presented.

The set of all nonnegative integers is denoted by Z_+. The set of all positive integers is denoted by N. Let m, n_1, ..., n_m ∈ N. Denote the set of all real mth order tensors of dimension n_1 × n_2 × ··· × n_m by T(n_1, n_2, ..., n_m). If n_1 = ··· = n_m = n, then we denote it by CT(m, n). Here "CT" means cubic tensors. Denote the set of all real tensors by T. Thus, scalars, vectors and matrices are a part of T.
Let X ∈ T(n_1, n_2, ..., n_m). We call X a rank-one tensor if and only if there are nonzero vectors x^(i) ∈ ℝ^{n_i} for i = 1, ..., m, such that

X = x^(1) ∘ ··· ∘ x^(m).

Here, ∘ is the tensor outer product. Then, nonzero vectors and scalars are all rank-one tensors in this sense.

Suppose that m, n_1, ..., n_m ∈ N and X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m). Let T_l ⊂ {1, ..., n_l} and |T_l| = k_l ≥ 1 for l = 1, ..., m. Suppose that Y = (y_{j_1 ··· j_m}) ∈ T(k_1, ..., k_m) for j_l ∈ T_l, with y_{j_1 ··· j_m} = x_{i_1 ··· i_m} if j_l = i_l for l = 1, ..., m. Then we say that Y is a subtensor of X. If Y ≠ X, then we say that Y is a proper subtensor of X. For p = 1, ..., m, the subtensor Y described above is called a p-row of X if |T_p| = 1 and T_l = {1, ..., n_l} for l ≠ p. If T_p = {q}, then the corresponding p-row is called the qth p-row of X. The p-row concept extends the concepts of rows and columns from matrices to tensors. For a matrix, a 1-row is called a row, and a 2-row is called a column.

In the next section, we present a set of axioms for tensor rank functions. We list six properties which are essential for tensor ranks. In particular, we define a partial order "≤" among tensor rank functions, and show that there exists a unique smallest tensor rank function r_*. We also introduce proper and strongly proper tensor rank functions in that section.

We study the CP rank and the Tucker rank in Section 3. The Tucker rank is a vector rank. We derive two scalar ranks from it, and call them the max-Tucker rank and the submax-Tucker rank, respectively. We show that the CP rank, the max-Tucker rank and the submax-Tucker rank are all tensor rank functions. The CP rank is subadditive but not proper.
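The rank-one tensor and p-row definitions above can be sketched numerically. This is a minimal illustration with NumPy; the helper names `outer` and `p_row` and the concrete vectors are our own, not from the paper.

```python
import numpy as np

def outer(*vecs):
    """Tensor outer product x^(1) ∘ ··· ∘ x^(m)."""
    t = vecs[0]
    for v in vecs[1:]:
        t = np.tensordot(t, v, axes=0)
    return t

def p_row(X, p, q):
    """The q-th p-row of X: fix index q in mode p, keep every other mode whole."""
    return np.take(X, q, axis=p)

# A rank-one tensor in T(2, 3, 4)
X = outer(np.array([1., 2.]), np.array([1., 0., -1.]), np.arange(1., 5.))
assert X.shape == (2, 3, 4)

# For a matrix, a 1-row is a row and a 2-row is a column:
M = np.array([[1., 2.], [3., 4.]])
assert np.allclose(p_row(M, 0, 1), [3., 4.])   # 1-row (row), 0-indexed here
assert np.allclose(p_row(M, 1, 0), [1., 3.])   # 2-row (column)
```

Note that the code uses 0-based indexing for modes and rows, while the text uses 1-based indexing.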
The max-Tucker rank is proper and subadditive, but not strongly proper. The submax-Tucker rank is strongly proper but not subadditive.

We introduce the concept of maximum full rank subtensors, and define the max-full-rank-subtensor property, in Section 4. We show that the max-Tucker rank function has the max-full-rank-subtensor property. Suppose that m, n_1, ..., n_m ∈ N and X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m), and Y is a maximum full rank subtensor of X under the max-Tucker rank. Then we show that there is an index p, 1 ≤ p ≤ m, such that all the qth p-rows of X, with q in the mode p index set T_p of Y, are linearly independent, and any p-row of X is a linear combination of the qth p-rows of X with q ∈ T_p.

In Section 5, we define the closure of an arbitrary proper tensor rank function, and show that it is still a proper tensor rank function and has the max-full-rank-subtensor property. We show that r_* is strongly proper and has the max-full-rank-subtensor property.

We present an application of the submax-Tucker rank to internet traffic data approximation in Section 6. Some final remarks are made in Section 7.

We use small letters to denote scalars, small bold letters to denote vectors, capital letters to denote matrices, and calligraphic letters to denote tensors. We denote the matrix rank of a matrix A as r(A).

Let m, n ∈ N. Consider CT(m, n). Suppose X = (x_{i_1 ··· i_m}) ∈ CT(m, n). An entry x_{i_1 ··· i_m} is called a diagonal entry of X if i_1 = i_2 = ··· = i_m. Otherwise, x_{i_1 ··· i_m} is called an off-diagonal entry of X. If all the off-diagonal entries of X are zero, then X is called a diagonal tensor. If X ∈ CT(m, n) is diagonal, and all the diagonal entries of X are 1, then X is called the identity tensor of CT(m, n), and denoted as I_{m,n}. Clearly, the identity tensor I_{m,n} is unique in CT(m, n). The identity tensor plays an important role in the spectral theory of tensors [7].

Definition 2.1
Suppose that r : T → Z_+. If r satisfies the following six properties, then r is called a tensor rank function.

Property 1
Suppose that X ∈ T. Then r(X) = 0 if and only if X is a zero tensor, and r(X) = 1 if and only if X is a rank-one tensor.

Property 2
For m, n ∈ N with m ≥ 2, r(I_{m,n}) = n.

Property 3
Let n_1, n_2 ∈ N and X ∈ T(n_1, n_2, 1, 1, ..., 1). Then r(X) is equal to the matrix rank of the n_1 × n_2 matrix corresponding to X.

Property 4
Let m, n_1, ..., n_m ∈ N, X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m), and let α be a nonzero real number. Then r(X) = r(αX).

Property 5
Let m, n_1, ..., n_m ∈ N, X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m), and let σ be a permutation on {1, ..., m}. Then r(X) = r(Y), where Y = (x_{j_1 ··· j_m}) ∈ T(σ(n_1, ..., n_m)) and (j_1, ..., j_m) = σ(i_1, ..., i_m).

Property 6
Let m, n_1, ..., n_m ∈ N. Suppose that X ∈ T(n_1, ..., n_m) and Y is a subtensor of X. Then r(Y) ≤ r(X).

These six properties are essential for tensor ranks. Property 1 specifies rank zero tensors and rank-one tensors. Though the tensor rank theory is not matured, there are no arguments about rank zero and rank-one tensors in the literature. Property 2 fixes the value of the tensor rank for identity tensors. This is necessary, as identity tensors are good references for the magnitude of tensor ranks. Property 3 justifies that the tensor rank is an extension of the matrix rank. Property 4 claims that the tensor rank is not changed when a tensor is multiplied by a nonzero real number. Property 5 says that the roles of the modes are balanced. Property 6 justifies the subtensor rank relation.

Suppose that r_1, r_2 : T → Z_+ are two tensor rank functions. If for any X ∈ T we always have r_1(X) ≤ r_2(X), then we say that the tensor rank function r_1 is not greater than the tensor rank function r_2, and denote this relation as r_1 ≤ r_2.

Theorem 2.2
Suppose that r_1, r_2 : T → Z_+ are two tensor rank functions. Define r : T → Z_+ by r(X) = min{r_1(X), r_2(X)} for any X ∈ T. Then r is a tensor rank function, r ≤ r_1 and r ≤ r_2.

Proof
For any X ∈ T, let r(X) = min{r_1(X), r_2(X)}. Then Properties 1, 2, 3 and 4 hold clearly from the definition of tensor rank functions.

To show Property 5, we assume that Y is a permuted tensor of X. Then r_1(X) = r_1(Y) and r_2(X) = r_2(Y). Hence, r(X) = r(Y) and Property 5 is obtained.

Now we assume that Z is a subtensor of X ∈ T. Then r_1(Z) ≤ r_1(X) and r_2(Z) ≤ r_2(X). Hence r(Z) = min{r_1(Z), r_2(Z)} ≤ min{r_1(X), r_2(X)} = r(X). Thus, Property 6 holds.

Thus, we conclude that r = min{r_1, r_2} is a tensor rank function. Clearly, r ≤ r_1 and r ≤ r_2. □

Theorem 2.3
There exists a unique tensor rank function r_* such that for any tensor rank function r, we have r_* ≤ r.

Proof
For any X ∈ T, define

r_*(X) := min{r(X) : r is a tensor rank function}.

This is well-defined, as tensor rank functions take values in Z_+. Now we show that r_* is a tensor rank function.

1) Suppose X is a zero tensor in T. Then for any tensor rank function r, r(X) = 0. This implies that r_*(X) = 0 by the definition of r_*. On the other hand, suppose that r_*(X) = 0 for some X ∈ T. Then for some tensor rank function r, r_*(X) = r(X) = 0. Hence, X is a zero tensor by Property 1 of the tensor rank function r. Similarly, we may show that r_*(X) = 1 if and only if X is a rank-one tensor.

2) For any m, n ∈ N with m ≥ 2, r(I_{m,n}) = n for all tensor rank functions r. Thus r_*(I_{m,n}) = n.

3) Let X ∈ T(n_1, n_2, 1, ..., 1), and let M be the corresponding n_1 × n_2 matrix of X. Then for any tensor rank function r, r(X) = r(M). Hence the values r(X) are the same for all tensor rank functions r. Hence, r_*(X) = r(M) and Property 3 holds.

4) For any X ∈ T and any tensor rank function r, r(X) = r(αX) for any α ≠ 0. Thus, r_*(X) = r_*(αX).

5) We obtain Properties 5 and 6 in a similar way as in the proof of Theorem 2.2, and omit the details here.

By the definition, r_* ≤ r for any tensor rank function r.

Suppose that r_* and r_** are two tensor rank functions such that r_* ≤ r and r_** ≤ r for any tensor rank function r. Then r_* ≤ r_** ≤ r_*. We see that r_* = r_**. Thus, such a tensor rank function r_* is unique. □

We call r_* the smallest tensor rank function. In Section 5, we will show that r_* has the max-full-rank-subtensor property.

The six properties in Definition 2.1 are essential to tensor rank functions. There are some other properties which are satisfied by some tensor rank functions.

Definition 2.4 Suppose that r is a tensor rank function. We say that r is a proper tensor rank function if for any m, n ∈ N and X ∈ CT(m, n), we have r(X) ≤ n.
For an n × n square matrix, its matrix rank is never greater than its dimension n. Thus, proper tensor rank functions are reasonable in a certain sense.

Definition 2.5
Suppose that r is a tensor rank function. We say that r is a subadditive tensor rank function if for any m, n_1, ..., n_m ∈ N and X, Y ∈ T(n_1, ..., n_m), we have r(X + Y) ≤ r(X) + r(Y).

The subadditivity property is somewhat restrictive. The minimum of two subadditive tensor rank functions may not be subadditive.
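At the matrix level, where Property 3 pins every tensor rank function down to the matrix rank, subadditivity is a familiar fact: r(X + Y) ≤ r(X) + r(Y). A minimal numerical sketch (the random rank-one matrices are our own example, not from the paper):

```python
import numpy as np

# Subadditivity of the matrix rank: the sum of two rank-one matrices
# has rank at most 2.
rng = np.random.default_rng(0)
X = np.outer(rng.standard_normal(4), rng.standard_normal(5))   # rank 1
Y = np.outer(rng.standard_normal(4), rng.standard_normal(5))   # rank 1

r = np.linalg.matrix_rank
assert r(X) == 1 and r(Y) == 1
assert r(X + Y) <= r(X) + r(Y)   # subadditivity: here at most 2
```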
Proposition 2.6
Suppose that r is a proper tensor rank function. Let m, n_1, ..., n_m ∈ N with m ≥ 2 and X ∈ T(n_1, ..., n_m). Then we have

r(X) ≤ max{n_1, ..., n_m}.   (2.1)

Proof
Let n = max{n_1, n_2, ..., n_m} and let A ∈ CT(m, n) be a tensor having X as a subtensor. Then r(X) ≤ r(A) by Property 6. Together with r(A) ≤ n, which holds since r is proper, the result follows. □

The matrix rank of an n_1 × n_2 rectangular matrix is never greater than min{n_1, n_2}. From this proposition, we may think further about how to restrict the magnitude of the tensor rank. For m, n_1, ..., n_m ∈ N with m ≥ 2, we define submax{n_1, ..., n_m} as the second largest value among n_1, ..., n_m.

Definition 2.7
Suppose that r is a tensor rank function. We say that r is a strongly proper tensor rank function if for any m, n_1, ..., n_m ∈ N with m ≥ 2, and X ∈ T(n_1, ..., n_m), we have

r(X) ≤ submax{n_1, ..., n_m}.   (2.2)

We cannot change submax{n_1, ..., n_m} in (2.2) to the third largest value of n_1, ..., n_m, as this would violate Properties 1 and 3 of Definition 2.1.

We will show that r_* is strongly proper in the next section.

3 CP Rank, Max-Tucker Rank and Submax-Tucker Rank
As we stated in the introduction, our motivation for introducing the axiom system for tensor ranks is to find some tensor ranks which have the max-full-rank-subtensor property. The six properties of Definition 2.1 are not satisfied by some tensor ranks in the literature. For example, the tubal rank r of third order tensors was introduced in [5]. For X ∈ T(n_1, n_2, n_3), r(X) ≤ min{n_1, n_2}. Thus, it is not a tensor rank function even for third order tensors. It is still very useful in applications [12, 13, 15, 14, 17].

However, the six properties of Definition 2.1 are satisfied by tensor ranks arising from the two most important tensor decompositions: the CP decomposition and the Tucker decomposition.

We now study the CP rank [6].
Suppose that m, n_1, ..., n_m ∈ N and X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m). Suppose that there are a^(i,p) ∈ ℝ^{n_i} for i = 1, ..., m and p = 1, ..., r such that

X = Σ_{p=1}^{r} a^(1,p) ∘ ··· ∘ a^(m,p).   (3.3)

Then we say that X has a CP decomposition (3.3). The smallest integer r such that (3.3) holds is called the CP rank of X, and denoted as r_CP(X).
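Definition 3.1 can be sketched computationally: a CP decomposition assembles a tensor from r rank-one terms. Computing the exact CP rank is NP-hard in general, so the sketch below only builds a tensor from given factor matrices; the helper name `cp_tensor` is our own.

```python
import numpy as np

def cp_tensor(factors):
    """Sum of r rank-one terms: factors is a list of (n_i, r) matrices,
    whose p-th columns give the p-th rank-one term a^(1,p) ∘ ··· ∘ a^(m,p)."""
    r = factors[0].shape[1]
    X = np.zeros(tuple(A.shape[0] for A in factors))
    for p in range(r):
        term = factors[0][:, p]
        for A in factors[1:]:
            term = np.tensordot(term, A[:, p], axes=0)  # outer product
        X += term
    return X

# The identity tensor I_{3,n} = sum_i e_i ∘ e_i ∘ e_i, so r_CP(I_{3,n}) <= n
n = 4
E = np.eye(n)                      # columns are the unit vectors e_i
I3 = cp_tensor([E, E, E])
assert I3[1, 1, 1] == 1.0          # diagonal entries are 1
assert I3[0, 1, 2] == 0.0          # off-diagonal entries are 0
```

The identity-tensor construction in the example is exactly the decomposition used below in the proof of Theorem 3.2 to show r_CP(I_{m,n}) ≤ n.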
The CP rank is a subadditive tensor rank function. It is not a proper tensor rank function.
Proof
We first show that the CP rank is a tensor rank function. Properties 1, 3 and 4 hold clearly from the definition of the CP rank. Before showing Property 2, we note that r_CP(I_{m,n}) ≤ n for all m, n ∈ N with m ≥ 2, since I_{m,n} = Σ_{i=1}^{n} e_i ∘ ··· ∘ e_i, where e_i ∈ ℝ^n is the ith unit coordinate vector. In the following, we show Property 2 by induction on m. We fix n here.

For m = 2, I_{2,n} reduces to the n × n identity matrix, and hence Property 2 is true for such a case. Now we assume that r_CP(I_{m,n}) = n, and show r_CP(I_{m+1,n}) = n. Assume that I_{m+1,n} = Σ_{p=1}^{r} a^(1,p) ∘ ··· ∘ a^(m+1,p) with r < n. Then

I_{m,n} = I_{m+1,n} · e ≡ Σ_{p=1}^{r} (e^T a^(m+1,p)) a^(1,p) ∘ ··· ∘ a^(m,p),

where e is the all one vector in ℝ^n, contracted with the last mode. This indicates that r_CP(I_{m,n}) ≤ r < n, which contradicts the assumption that r_CP(I_{m,n}) = n. Hence r_CP(I_{m+1,n}) = n, and Property 2 holds.

For Property 5, we have that Y = Σ_{p=1}^{r} a^(j_1,p) ∘ ··· ∘ a^(j_m,p) if X = Σ_{p=1}^{r} a^(1,p) ∘ ··· ∘ a^(m,p), when Y is a permuted tensor of X with (j_1, ..., j_m) = σ(1, 2, ..., m). Hence we have Property 5.

For Property 6, assume that Y is a subtensor of X. For p = 1, ..., r, let X_p = a^(1,p) ∘ ··· ∘ a^(m,p), and let Y_p be the subtensor of X_p obtained in the same way as Y from X. Then we have that Y = Y_1 + ··· + Y_r and r_CP(Y) ≤ r, since the Y_p are rank-one tensors or zero tensors for p = 1, ..., r. This means that r_CP(Y) ≤ r_CP(X). Hence Property 6 is satisfied.

Therefore, the CP rank is a tensor rank function.

Suppose that X, Y ∈ T(n_1, ..., n_m) with r_CP(X) = r_1 and r_CP(Y) = r_2. Let

X = Σ_{p=1}^{r_1} a^(1,p) ∘ ··· ∘ a^(m,p),  Y = Σ_{q=1}^{r_2} b^(1,q) ∘ ··· ∘ b^(m,q).

It holds that

X + Y = Σ_{p=1}^{r_1} a^(1,p) ∘ ··· ∘ a^(m,p) + Σ_{q=1}^{r_2} b^(1,q) ∘ ··· ∘ b^(m,q).

Hence, r_CP(X + Y) ≤ r_1 + r_2 ≡ r_CP(X) + r_CP(Y).
This shows that the CP rank is subadditive.

By [6], the CP rank of a 9 × 9 × 9 tensor can be greater than 9. Hence, the CP rank is not a proper tensor rank function. □

We now study the Tucker rank. In some papers, such as [4], the n-rank is called the Tucker rank.

Suppose that m, n_1, ..., n_m ∈ N and X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m). We may unfold X to a matrix X_(j) = (x_{i_j, i_1 ··· i_{j−1} i_{j+1} ··· i_m}) ∈ ℝ^{n_j × n_1 ··· n_{j−1} n_{j+1} ··· n_m} for j = 1, ..., m. Denote r(X_(j)) as r_j for j = 1, ..., m. Then the vector (r_1, ..., r_m) is called the n-rank of X [6].

The n-rank is a vector rank. Hence it does not satisfy Definition 2.1. However, if we define

r(X) = max{r_1, ..., r_m},   (3.4)

then we have the following theorem.
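Before the theorem, here is a small computational sketch of the unfoldings, of the scalar rank (3.4), and of the second-largest variant (3.5) introduced below. The `unfold` helper uses the usual moveaxis-then-reshape convention, which may order columns differently from the text but yields the same ranks; the concrete tensor is our own example, chosen so that the three unfolding ranks all differ.

```python
import numpy as np

def unfold(X, j):
    """Mode-j unfolding X_(j): an n_j x (product of the other n_l) matrix."""
    return np.moveaxis(X, j, 0).reshape(X.shape[j], -1)

def n_rank(X):
    """The n-rank (r_1, ..., r_m): matrix ranks of all unfoldings."""
    return tuple(int(np.linalg.matrix_rank(unfold(X, j)))
                 for j in range(X.ndim))

# A tensor in T(2, 3, 4) with four unit entries placed so that the
# unfolding ranks are pairwise distinct:
X = np.zeros((2, 3, 4))
X[0, 0, 0] = X[0, 1, 1] = X[0, 2, 2] = X[1, 0, 3] = 1.0

r = n_rank(X)
assert r == (2, 3, 4)
r_from_34 = max(r)            # (3.4), later called the max-Tucker rank
r_from_35 = sorted(r)[-2]     # (3.5), later called the submax-Tucker rank
assert r_from_34 == 4 and r_from_35 == 3
```

This tensor also illustrates why the two scalar ranks genuinely differ, which is the point of Proposition 3.5 below.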
The function r defined by (3.4) is a proper, subadditive tensor rank function. But it is not strongly proper.
We first show that the rank function r defined by (3.4) is a tensor rank function. To see this, it suffices to show that Properties 1–6 are all satisfied.

1) Suppose that X ∈ T(n_1, ..., n_m) for m, n_1, ..., n_m ∈ N is a zero tensor. Then the X_(j) are zero matrices for j = 1, ..., m. This implies that r_j = r(X_(j)) = 0 for j = 1, ..., m. By (3.4), we have r(X) = 0. On the other hand, assume that r(X) = 0 for some X ∈ T(n_1, ..., n_m) with m, n_1, ..., n_m ∈ N. This means that r_i = 0 for i = 1, ..., m, which means that X_(i) = 0, and hence X is a zero tensor.

Suppose that r(X) = 1. Then r(X_(i)) = 1 for all i = 1, ..., m. This can be seen as follows. Assume that there exists i_0 such that r(X_(i_0)) = 0. Then X is a zero tensor since X_(i_0) = 0. From the above analysis, r(X) = 0 if and only if X is a zero tensor. This contradicts r(X) = 1.

Let X = Σ_{p=1}^{r̄} a^(1,p) ∘ a^(2,p) ∘ ··· ∘ a^(m,p). Then X_(1) = Σ_{p=1}^{r̄} a^(1,p) ∘ (a^(2,p) ⊗ ··· ⊗ a^(m,p)). From r(X_(1)) = 1, we have that the vectors a^(1,p), p = 1, ..., r̄, span a one-dimensional space. From X_(2) = Σ_{p=1}^{r̄} a^(2,p) ∘ (a^(1,p) ⊗ a^(3,p) ⊗ ··· ⊗ a^(m,p)) and r(X_(2)) = 1, we have that the vectors a^(2,p), p = 1, ..., r̄, also span a one-dimensional space. Similarly, for any i = 1, ..., m, the vectors a^(i,p), p = 1, ..., r̄, span a one-dimensional space. Thus, X = λ a^(1,1) ∘ ··· ∘ a^(m,1) for some λ ≠ 0, and hence X is a rank-one tensor.

Conversely, if X is a rank-one tensor, then X = x^(1) ∘ ··· ∘ x^(m) for some nonzero vectors x^(i) ∈ ℝ^{n_i}. Then X_(i) = x^(i) ∘ (x^(1) ⊗ ··· ⊗ x^(i−1) ⊗ x^(i+1) ⊗ ··· ⊗ x^(m)) and r(X_(i)) = 1 for all i = 1, ..., m. Thus r(X) = 1.

Based on the above analysis, Property 1 is satisfied.

2) Denote I ≡ I_{m,n}. Then I_(i) is a rectangular matrix which, after a column permutation, can be partitioned into an n-dimensional identity matrix and an n × (n^{m−1} − n) zero matrix, for i = 1, ..., m, and hence r(I_(i)) = n.
Thus, r(I) = n.

3) When X ∈ T(n_1, n_2, 1, ..., 1), X_(1) ∈ ℝ^{n_1 × n_2}, X_(2) = X_(1)^T ∈ ℝ^{n_2 × n_1} and X_(i) ∈ ℝ^{1 × n_1 n_2} for any i ≥ 3. Clearly, r_1 = r_2 ≥ r_i for i ≥ 3, and hence r(X) = r(X_(1)) = r(X_(2)), the matrix rank of the corresponding matrix.

4) Suppose that X ∈ T(n_1, n_2, ..., n_m). For any α ≠ 0 and any i ∈ {1, 2, ..., m}, (αX)_(i) = αX_(i), and hence r(X_(i)) = r((αX)_(i)). Hence r(X) = r(αX).

5) Suppose that X ∈ T(n_1, ..., n_m) and Y is any permuted tensor of X. Then Y_(i) is X_(j) for some j ∈ {1, 2, ..., m}. So r(Y_(i)) = r(X_(j)). Hence r(Y) = max{r(Y_(i)) : i = 1, ..., m} = max{r(X_(j)) : j = 1, ..., m} = r(X), and the result holds.

6) Suppose that Z is a subtensor of X. Then for all i = 1, 2, ..., m, Z_(i) is a submatrix of X_(i), and r(Z_(i)) ≤ r(X_(i)) since r is the matrix rank here. So r(Z) ≤ r(X).

Now we conclude that r defined by (3.4) is a tensor rank function.

It is clear that such a tensor rank function r is proper from its definition. Furthermore, this rank r is also subadditive, since the matrix rank is subadditive.

In addition, we consider X ∈ T(3, 2, 2) with X_(1) = [I, e], where I is the identity matrix of dimension three and e is a nonzero vector in ℝ^3. Then r(X) = 3 > submax{3, 2, 2} = 2. Hence we conclude that such a tensor rank function is not strongly proper. □

Thus, we call the tensor rank function r defined by (3.4) the max-Tucker rank, and denote it as r_max(X) for any X ∈ T.

Note that the max-Tucker rank naturally arises from applications of the Tucker decomposition when people assume that r_i ≤ r for i = 1, ..., m and fix the value of r [2, 11]. This means that tensors of max-Tucker rank not greater than r are used. In the following, we introduce a new tensor rank function, which is also associated with the Tucker decomposition, but is different from the max-Tucker rank. We may replace (3.4) by

r(X) = submax{r_1, ..., r_m}.   (3.5)

Then we have the following theorem.
The function r defined by (3.5) is a strongly proper tensor rank function. But it is not subadditive.
We first show that the function r defined by (3.5) is a tensor rank function. It suffices to show that Properties 1–6 are all satisfied.

1) Suppose that X ∈ T(n_1, ..., n_m) for m, n_1, ..., n_m ∈ N is a zero tensor. Then the X_(j) are zero matrices for all j = 1, ..., m. This implies that r(X_(j)) = 0 for all j. By (3.5), we have r(X) = 0. On the other hand, assume that r(X) = 0 for some X ∈ T(n_1, ..., n_m) with m, n_1, ..., n_m ∈ N. This means that for some i ∈ {1, ..., m}, r(X_(i)) = 0, and hence X_(i) = 0 and X is a zero tensor. Therefore, X is a zero tensor if and only if r(X) = 0.

Suppose that r(X) = 1. Then X is not a zero tensor, and hence r(X_(i)) ≥ 1 for i = 1, ..., m. Since r is defined by (3.5), there exist indices i_1, i_2, ..., i_{m−1} such that r(X_(i_j)) = 1 for j = 1, ..., m − 1. Without loss of generality, we assume that i_j = j for j = 1, 2, ..., m − 1. Let X = Σ_{p=1}^{r̄} a^(1,p) ∘ ··· ∘ a^(m,p). Similar to the discussion in the proof of Theorem 3.3, the vectors a^(j,p), p = 1, ..., r̄, span a one-dimensional space for each j = 1, ..., m − 1. Thus

X = a^(1,1) ∘ ··· ∘ a^(m−1,1) ∘ (a^(m,1) + λ_2 a^(m,2) + ··· + λ_{r̄} a^(m,r̄)),

for some λ_2, ..., λ_{r̄}. Clearly, such an X is a rank-one tensor.

Conversely, if X is a rank-one tensor, then X = x^(1) ∘ ··· ∘ x^(m) for some nonzero vectors x^(i) ∈ ℝ^{n_i}. Then r(X_(i)) = 1 for all i = 1, ..., m. Thus r(X) = 1.

Based on the above analysis, Property 1 is satisfied.

2) Denote I ≡ I_{m,n}. Then I_(i) is a rectangular matrix which, after a column permutation, can be partitioned into an n-dimensional identity matrix and an n × (n^{m−1} − n) zero matrix, for i = 1, ..., m, and hence r(I_(i)) = n. Thus, r(I) = n.

3) When X ∈ T(n_1, n_2, 1, ..., 1), X_(1) ∈ ℝ^{n_1 × n_2}, X_(2) = X_(1)^T ∈ ℝ^{n_2 × n_1} and X_(i) ∈ ℝ^{1 × n_1 n_2} for any i ≥ 3. Clearly, r(X_(1)) = r(X_(2)) ≥ r(X_(i)) and r(X_(i)) ≤ 1 for i ≥ 3. Hence r(X) = r(X_(1)) = r(X_(2)) is the same as the matrix rank of the corresponding matrix.

4) Suppose that X ∈ T(n_1, n_2, ..., n_m). For any α ≠ 0 and any i ∈ {1, 2, ..., m}, (αX)_(i) = αX_(i), and hence r(X_(i)) = r((αX)_(i)). Hence r(X) = r(αX).

5) Suppose that X ∈ T(n_1, ..., n_m) and Y is any permuted tensor of X. Then Y_(i) is X_(j) for some j ∈ {1, 2, ..., m}. So r(Y_(i)) = r(X_(j)). Hence r(Y) = submax{r(Y_(i)) : i = 1, ..., m} = submax{r(X_(j)) : j = 1, ..., m} = r(X), and the result holds.

6) Suppose that Z is a subtensor of X. Then for all i = 1, 2, ..., m, Z_(i) is a submatrix of X_(i), and r(Z_(i)) ≤ r(X_(i)) since each r_i is a matrix rank. So r(Z) ≤ r(X).

Now we conclude that r defined by (3.5) is a tensor rank function.

The strongly proper property of this tensor rank function is clear, and hence it suffices to show that it is not subadditive.

Let X = (x_{ijk}), Y = (y_{ijk}), Z = (z_{ijk}) ∈ T(2n_1, n_2, n_3) with X = Y + Z, where y_{ijk} = 0 if i > n_1 and z_{ijk} = 0 if i ≤ n_1, so that Y and Z have disjoint supports in the first mode. Assume that n-rank(Y) = (r_1, r_2, r_3) and n-rank(Z) = (R_1, R_2, R_3), with Y and Z chosen so that r(X_(i)) = r(Y_(i)) + r(Z_(i)) for i = 1, 2, 3, submax{r_1, r_2, r_3} = r_2, submax{R_1, R_2, R_3} = R_2, and both r_1 + R_1 > r_2 + R_2 and r_3 + R_3 > r_2 + R_2. Then

r(X) = submax{r_1 + R_1, r_2 + R_2, r_3 + R_3} > r_2 + R_2 = r(Y) + r(Z).

Therefore, we conclude that this tensor rank function is not subadditive. □

Thus, we call this tensor rank function the submax-Tucker rank in this paper, and denote it as r_sub(X) for any X ∈ T.

Proposition 3.5
We have r_sub(Y) ≤ r_max(Y) for any Y ∈ T, and r_sub(X) < r_max(X) for some X ∈ T. Thus, r_max ≠ r_*. Furthermore, r_* is strongly proper.

Proof
Clearly, r_sub(Y) ≤ r_max(Y) for any Y ∈ T. To see r_sub(X) < r_max(X) for some X ∈ T, we consider the following counterexample.

Consider the tensor X ∈ T(2, 3, 4) with nonzero entries x_{111} = x_{122} = x_{133} = x_{214} = 1. By observation, we have that r(X_(1)) = 2, r(X_(2)) = 3 and r(X_(3)) = 4, which implies that r_sub(X) = 3 < 4 = r_max(X), and the result follows.

As r_* ≤ r_sub and r_sub is strongly proper, r_* is also strongly proper. □

We cannot replace submax{r_1, ..., r_m} in (3.5) by the third largest value of r_1, ..., r_m, as this would violate Properties 1 and 3 of Definition 2.1.

4 Maximum Full Rank Subtensors
In this section, we introduce the concept of maximum full rank subtensors, and define the max-full-rank-subtensor property.

We first define the full rank concept for a tensor rank function. Recall that in matrix theory, there is the concept of full row (column) rank matrices.
Definition 4.1
Suppose that r is a tensor rank function. Let m, n_1, ..., n_m ∈ N with m ≥ 2, and X ∈ T(n_1, ..., n_m). If we have

r(X) = n_p   (4.6)

for some index p satisfying 1 ≤ p ≤ m, then we say that X is of full p-row r rank, or simply say that X is of full r rank. In particular, zero tensors are regarded as of full r rank.

We then define the max-full-rank-subtensor property for a tensor rank function.
Definition 4.2
Suppose that r is a tensor rank function. Let X ∈ T. We call a subtensor Y of X a maximum full rank subtensor of X under r if Y is of full r rank, and r(Y) is the maximum among all such full rank subtensors of X. We say that r has the max-full-rank-subtensor property if for any X ∈ T,

r(X) = r(Y),

where Y is a maximum full rank subtensor of X under r.

Now, the question is whether there is a tensor rank function with the max-full-rank-subtensor property. We have the following theorem.
Theorem 4.3
The max-Tucker rank function r_max has the max-full-rank-subtensor property.

Furthermore, suppose that m, n_1, ..., n_m ∈ N and X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m), and Y is a maximum full rank subtensor of X under r_max. Then there is an index p, 1 ≤ p ≤ m, such that all the qth p-rows of X, with q in the mode p index set T_p of Y, are linearly independent, |T_p| = r_max(Y) = r_max(X), and any p-row of X is a linear combination of the qth p-rows of X with q ∈ T_p.
Let m, n_1, ..., n_m ∈ N with m ≥ 2, and X ∈ T(n_1, ..., n_m). Assume that r(X_(i)) = r_i for i = 1, ..., m. Without loss of generality, assume that r_1 = max{r_1, ..., r_m}. By the properties of the matrix rank, we know that there is a set T_1 = {k_1, ..., k_{r_1}} ⊂ {1, ..., n_1} such that Y = (y_{j_1 ··· j_m}) ∈ T(r_1, n_2, ..., n_m) is a subtensor of X, where y_{j_1 ··· j_m} = x_{j_1 ··· j_m} for j_1 ∈ T_1 and j_l = 1, ..., n_l, l = 2, ..., m, and r(Y_(1)) = r_1. Then r(Y_(l)) ≤ r(X_(l)) ≡ r_l for l = 2, ..., m. This shows that Y is of full r_max rank, and r_max(Y) = r_max(X) = r_1.

Hence, the max-Tucker rank function r_max has the max-full-rank-subtensor property.

On the other hand, suppose that m, n_1, ..., n_m ∈ N and X = (x_{i_1 ··· i_m}) ∈ T(n_1, ..., n_m), and Y is a maximum full rank subtensor of X under r_max. By Definition 4.1, there is an index p, 1 ≤ p ≤ m, such that r_max(Y) = |T_p|, where T_p is the mode p index set of Y. Denote r(X_(l)) and r(Y_(l)) as r_l(X) and r_l(Y), respectively, for l = 1, ..., m. Then r_l(Y) ≤ r_l(X) for l = 1, ..., m. We have

r_max(Y) = |T_p| ≤ r_p(Y) ≤ r_p(X) ≤ r_max(X) = r_max(Y).

Hence,

r_max(Y) = |T_p| = r_p(Y) = r_p(X) = r_max(X) = r_max(Y).

This shows that all the qth p-rows of X with q ∈ T_p are linearly independent, by the definition of r_p(Y), and any p-row of X is a linear combination of the qth p-rows of X with q ∈ T_p, by the definition of r_p(X). □

The property of the maximum full rank subtensor Y of X under r_max, stated in Theorem 4.3, extends the corresponding property of matrices to tensors.

Theorem 4.3 says that the max-Tucker rank function r_max has the max-full-rank-subtensor property. Is there any other tensor rank function which also has this property? The full rank concept is only suitable for proper tensor rank functions. Thus, the CP rank r_CP is out of the question.
Does the submax-Tucker rank function r_sub have the max-full-rank-subtensor property? At this moment, we do not know the answer to this question. However, we show that from any proper tensor rank function we may derive another proper tensor rank function which has the max-full-rank-subtensor property. To do this, we introduce the concept of the closure of a proper tensor rank function.

Definition 5.1 Suppose that r : T → Z_+ is a proper tensor rank function. We define its closure r̄ : T → Z_+ by

r̄(X) = max{r(Y) : Y is a subtensor of X, and of full r rank},

for any X ∈ T.

We have the following theorems.
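On a small tensor, the closure in Definition 5.1 can be evaluated by brute force over all subtensors. This is a sketch only, instantiating r with the max-Tucker rank r_max; since Theorem 4.3 shows r_max has the max-full-rank-subtensor property, Theorem 5.2 below predicts that its closure coincides with it. All helper names and the random test tensor are our own.

```python
import itertools
import numpy as np

def unfold(X, j):
    return np.moveaxis(X, j, 0).reshape(X.shape[j], -1)

def r_max(X):
    """Max-Tucker rank: the largest unfolding rank."""
    return max(int(np.linalg.matrix_rank(unfold(X, j))) for j in range(X.ndim))

def is_full_rank(X, rank):
    """Definition 4.1: rank(X) = n_p for some mode p (zero tensors count)."""
    return rank(X) in X.shape or not X.any()

def closure(X, rank):
    """Definition 5.1: max of rank(Y) over all full-rank subtensors Y of X."""
    index_sets = [[s for k in range(1, n + 1)
                     for s in itertools.combinations(range(n), k)]
                  for n in X.shape]
    best = 0
    for sets in itertools.product(*index_sets):
        Y = X[np.ix_(*sets)]          # the subtensor indexed by (T_1, ..., T_m)
        if is_full_rank(Y, rank):
            best = max(best, rank(Y))
    return best

X = np.random.default_rng(1).standard_normal((2, 3, 3))
assert closure(X, r_max) == r_max(X)   # closure of r_max agrees with r_max
```

The enumeration is exponential in the dimensions, so this is only feasible for tiny tensors; it is meant to make the definition concrete, not to be an algorithm.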
Theorem 5.2
The closure r̄ of a proper tensor rank function r is also a proper tensor rank function. We have r̄ ≤ r. A proper tensor rank function r has the max-full-rank-subtensor property if and only if r̄ = r.
Let X be a zero tensor. By Definition 4.1, X is of full r rank. By Definition 5.1, r̄(X) = 0. Let X be a nonzero rank-one tensor. Then X has a one-entry nonzero subtensor Y with r(Y) = 1. By Definition 4.1, Y is of full r rank. Thus, r̄(X) ≥ 1. By Property 6 of Definition 2.1, for any subtensor Z of X, r(Z) ≤ r(X) ≤ 1. This shows that r̄(X) ≤ 1. Hence, r̄(X) = 1. This shows that r̄ satisfies Property 1 of Definition 2.1.

By Property 2 of Definition 2.1, for m, n ∈ N with m ≥ 2, r(I_{m,n}) = n. By Definition 4.1, I_{m,n} is of full r rank. By Definition 5.1, r̄(I_{m,n}) = n. Thus, r̄ satisfies Property 2 of Definition 2.1.

Assume that n_1, n_2 ∈ N, X ∈ T(n_1, n_2, 1, 1, ..., 1), and let M be the corresponding n_1 × n_2 matrix. Denote r(M) = r_M. Then there is an r_M × r_M submatrix M̄ of M such that the matrix rank of M̄ is r_M = r(X). Furthermore, there is a subtensor Y of X such that Y ∈ T(r_M, r_M, 1, 1, ..., 1) and the corresponding r_M × r_M matrix is M̄. We now see that Y is of full r rank. This shows that r̄(X) ≥ r(Y) = r_M = r(X). Thus, r̄(X) = r_M. This shows that r̄ satisfies Property 3 of Definition 2.1.

For any α ≠ 0, r(αX) = r(X), and hence

r̄(αX) = max{r(αY) : Y is a subtensor of X, and of full r rank}
       = max{r(Y) : Y is a subtensor of X, and of full r rank} = r̄(X).

This means that Property 4 is satisfied. Similarly, we have Property 5 for r̄.

For any subtensor Z of X, we have

r̄(Z) = max{r(Y) : Y is a subtensor of Z, and of full r rank}
     ≤ max{r(Y) : Y is a subtensor of X, and of full r rank} = r̄(X).

Hence, Property 6 of Definition 2.1 is satisfied by r̄.

Clearly, r̄ is proper since r is proper. So we can assert that r̄ is a proper tensor rank function.

By Property 6 of r and the definition of r̄, we have r̄ ≤ r.

Now we show the last assertion, that r has the max-full-rank-subtensor property if and only if r̄ = r.

"⇒" It suffices to show that r̄ ≥ r. If X is of full r rank, then r̄(X) = r(X). Otherwise, there exists a full r rank subtensor Y of X such that r(Y) = r(X). Since r̄(X) ≥ r(Y) by definition, r̄(X) ≥ r(X). Together with r̄(X) ≤ r(X), we have r̄(X) = r(X).
“⇐” Suppose ¯r = r. For any X, by the definition of ¯r there exists a full r rank subtensor Y of X such that ¯r(X) = r(Y). So r(Y) = r(X), and since X is arbitrary, r has the max-full-rank-subtensor property.

The conclusion holds. (cid:3)

Theorem 5.3
Suppose that r is a proper tensor rank function and ¯r is its closure. Then ¯r has the max-full-rank-subtensor property.

Proof Let X ∈ T and let Y be a maximum full rank subtensor of X under r. By Definition 5.1, ¯r(X) = r(Y) and ¯r(Y) = r(Y). Then Y is also a maximum full rank subtensor of X under ¯r, and we have ¯r(Y) = ¯r(X). This shows that ¯r has the max-full-rank-subtensor property. (cid:3)

We are now able to show that the smallest tensor rank function r∗ is such a strongly proper tensor rank function.

Corollary 5.4
The smallest tensor rank function r∗ has the max-full-rank-subtensor property.

Proof Let r∗∗ be the closure of r∗. By the preceding theorem, r∗∗ is a proper tensor rank function with r∗∗ ≤ r∗; since r∗ is the smallest proper tensor rank function, we also have r∗ ≤ r∗∗. Hence r∗ = r∗∗, and by Theorem 5.3, r∗ has the max-full-rank-subtensor property. (cid:3)

Is r∗ equal to the submax-Tucker rank function r_sub or its closure ¯r_sub? This is left as a question for further research.

In Section 3, we introduced a new tensor rank function, the submax-Tucker rank function, which is associated with the Tucker decomposition but is different from the max-Tucker rank. According to our theoretical analysis, the submax-Tucker rank is strongly proper. Compared with the CP rank and the max-Tucker rank, it is smaller in general. Thus, the submax-Tucker rank may be a good choice for low rank tensor approximation and tensor completion. We now present an application of the submax-Tucker rank.

Suppose that we have a data tensor
M ∈ T(n_1, n_2, · · · , n_m). Assume that n_1 >> n_i for i = 2, · · · , m. Then we may approximate M by X ∈ T(¯n, r, · · · , r), where n_1 ≥ ¯n ≥ n_i for i = 2, · · · , m, and r ≤ max{n_2, · · · , n_m}. For example, in [16], for the internet traffic data tensor Abilene M [9], we have n_1 = 1008, which is the number of time intervals, and n_2 = n_3 = 11, which is the number of origin-destination nodes of the internet traffic dataset. We may use the Tucker decomposition [6]

X = D ×_1 A ×_2 B ×_3 C

to approximate M. Here, D is the Tucker core tensor of dimension r_1 × r_2 × r_3. The factor matrices A, B and C are of dimensions n_1 × r_1, n_2 × r_2 and n_3 × r_3, respectively. The operations ×_i are mode-i products [6]. A usual practice is to fix r and assume that r_i ≤ r for i = 1, 2, 3, i.e., to approximate M by a tensor X of max-Tucker rank not greater than r. Then r_2 ≤ 11 and r_3 ≤ 11, while r_1 may be as large as n_1 = 1008, so the range of r is quite large. If we use a tensor X of submax-Tucker rank not greater than r to approximate M, then the range of r is 1 ≤ r ≤
11; we may let, say, r_1 ≤ ¯n = 30, r_2 ≤ r and r_3 ≤ r, by fixing r. This provides a good choice of the range of X to approximate M.

For example, we consider the internet traffic tensor X ∈ T(1008, 11, 11) [9]. We use several decompositions to approximate X: (I) Tucker decomposition with the max-Tucker rank r, i.e., the core tensor D ∈ T(r, r, r); (II–IV) Tucker decompositions with the submax-Tucker rank r and ¯n = 30, …, i.e., the core tensors D ∈ T(¯n, r, r), respectively. For each decomposition X̃, we calculate the relative error

relative error := ||X̃ − X||_F / ||X||_F.

Using the Tensor Toolbox, we illustrate the results in Figure 1 for r ranging from 1 to 11. We see that the relative errors corresponding to the submax-Tucker rank are smaller than the relative error of the max-Tucker rank case.

Figure 1: Comparison between the max-Tucker rank and the submax-Tucker ranks.

In this paper, we extended the maximum full rank subtensor concept and the max-full-rank-submatrix property to tensors. We proved that the max-Tucker rank function, the smallest tensor rank function, and the closure of any proper tensor rank function have the max-full-rank-subtensor property. This shows that the maximum full rank subtensor concept and the max-full-rank-subtensor property should be an important part of the tensor rank theory. Some questions remain for further research.

The axiom system for tensor ranks is also an exploration. The six properties of Definition 2.1 may be further modified, and it may be a worthwhile research direction to study tensor ranks with some appropriate axiom systems.

For low rank tensor approximation, the concept of border rank [6] is useful. Can the concept of border rank also be accommodated by the tensor rank axiom system? This may also be an interesting further research topic.
References

[1] E. Acar, D.M. Dunlavy, T.G. Kolda and M. Mørup, “Scalable tensor factorizations for incomplete data”, Chemometrics and Intelligent Laboratory Systems (2011) 41-56.

[2] B. Chen, T. Sun, Z. Zhou, Y. Zeng and L. Cao, “Nonnegative tensor completion via low-rank Tucker decomposition: model and algorithm”, IEEE Access (2019) 95903-95914.

[3] L. De Lathauwer, B. De Moor and J. Vandewalle, “A multilinear singular value decomposition”, SIAM Journal on Matrix Analysis and Applications (2000) 1253-1278.

[4] B. Jiang, F. Yang and S. Zhang, “Tensor and its Tucker core: The invariance relationships”, Numerical Linear Algebra with Applications (2017) e2086.

[5] M. Kilmer, K. Braman, N. Hao and R. Hoover, “Third-order tensors as operators on matrices: A theoretical and computational framework with applications in imaging”, SIAM Journal on Matrix Analysis and Applications (2013) 148-172.

[6] T.G. Kolda and B. Bader, “Tensor decompositions and applications”, SIAM Review (2009) 455-500.

[7] L. Qi and Z. Luo, Tensor Analysis: Spectral Theory and Special Tensors, SIAM, Philadelphia, 2017.

[8] H. Tan, Z. Yang, G. Feng, W. Wang and B. Ran, “Correlation analysis for tensor-based traffic data imputation method”, Procedia - Social and Behavioral Sciences (2013) 2611-2620.

[9] The Abilene Observatory Data Collections. Accessed: May 2004. [Online]. Available: http://abilene.internet2.edu/observatory/datacollections.html

[10] K. Xie, L. Wang, X. Wang, G. Xie, J. Wen and G. Zhang, “Accurate recovery of internet traffic data: A tensor completion approach”, IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications (2016).

[11] Y. Xu and W. Yin, “A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion”, SIAM Journal on Imaging Sciences (2013) 1758-1789.

[12] L. Yang, Z.H. Huang, S. Hu and J. Han, “An iterative algorithm for third-order tensor multi-rank minimization”, Computational Optimization and Applications (2016) 169-202.

[13] J. Zhang, A.K. Saibaba, M.E. Kilmer and S. Aeron, “A randomized tensor singular value decomposition based on the t-product”, Numerical Linear Algebra with Applications (2018) e2179.

[14] Z. Zhang and S. Aeron, “Exact tensor completion using t-SVD”, IEEE Transactions on Signal Processing (2017) 1511-1526.

[15] Z. Zhang, G. Ely, S. Aeron, N. Hao and M. Kilmer, “Novel methods for multilinear data completion and de-noising based on tensor-SVD”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR ’14 (2014) 3842-3849.

[16] H. Zhou, D. Zhang, K. Xie and Y. Chen, “Spatio-temporal tensor completion for imputing missing internet traffic data”, (2015).

[17] P. Zhou, C. Lu, Z. Lin and C. Zhang, “Tensor factorization for low-rank tensor completion”, IEEE Transactions on Image Processing 27 (2018).