A New Fast Computation of a Permanent
Xuewei Niu, Shenghui Su (1, 4), Jianghua Zheng, and Shuwang Lü
1 College of Computers, Nanjing Univ. of Aeronautics & Astronautics, Nanjing 211106, PRC
2 School of Network Securities, Information Engineering University, Zhengzhou 450001, PRC
3 Laboratory of Information Security, Univ. of Chinese Academy of Sciences, Beijing 100039, PRC
4 Public Security Innovation Center, Nanjing Univ. of Science and Technology, Nanjing 210094, PRC
Abstract:
This paper proposes a general algorithm called Store-zechin for quickly computing the permanent of an arbitrary square matrix. Its key ideas are storage, reuse, and recursion: during the recursion, sub-terms that have already been calculated are not calculated again but are replaced directly with the stored results. The new algorithm thus spends computer memory on stored data in order to speed up the computation of a permanent. The analysis shows that computing the permanent of an $n \times n$ matrix requires $(2^{n-1} - 1)n$ multiplications and $2^{n-1}(n - 2) + 1$ additions with Store-zechin, compared with $(2^n - 1)n + 1$ multiplications and $(2^n - n)(n + 1) - 2$ additions with the Ryser algorithm, and $2^{n-1}n + n + 2$ multiplications and $2^{n-1}(n + 1) + n^2 - n - 1$ additions with the R-N-W algorithm. Store-zechin therefore needs fewer operations than the latter two algorithms and has a better application prospect.
Keywords:
Matrix, Permanent, Recursive algorithm, Linked list, Time complexity
Introduction
In 1812, Cauchy treated the determinant as a special type of alternating symmetric function. To distinguish such functions from ordinary symmetric functions, the latter were called “fonctions symétriques permanentes” [1]. At the same time, Cauchy introduced a subclass of the symmetric functions that T. Muir later named permanents [2]. Computing the permanent of a matrix is known to be more difficult than computing its determinant, and the hardness of the boson sampling problem is closely tied to the hardness of computing permanents. In recent years, with the advance of quantum computing technologies, the permanent has often been used as a yardstick for quantum supremacy, that is, for judging whether quantum computers are worth researching and developing. It has therefore received more and more attention.
Definition and Computation of Permanent of a Square Matrix
Basic Definition and Properties
The permanent of a square matrix is a number that is defined in a way similar to the determinant. Let $A$ be an $n \times n$ matrix. The permanent of $A$ is defined as
$$\mathrm{Per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}, \qquad (1)$$
where $S_n$ is the symmetric group over the set $\{1, 2, \ldots, n\}$ and $\sigma$ is an element of $S_n$, namely a permutation of the numbers $1, 2, \ldots, n$ [3], while the determinant is defined as
$$\mathrm{Det}(A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{i=1}^{n} a_{i,\sigma(i)}, \qquad (2)$$
where $\mathrm{sgn}(\sigma)$ represents the parity sign of a group element [4]. The only difference between the determinant and the permanent is the parity sign of a group element, so the two share some properties [5][6], such as
1) $\mathrm{Per}(I) = 1$, where $I$ is the identity matrix of order $n$ (normalization);
2) $\mathrm{Per}(A^T) = \mathrm{Per}(A)$, where $A^T$ is the transpose of $A$ (transpose invariance);
3) $\mathrm{Per}(A)$ becomes $k\,\mathrm{Per}(A)$ when any single row or column of $A$ is multiplied by a scalar $k$.
Computation Methods
At present, the well-known methods for calculating a permanent are the Naive algorithm, the Ryser algorithm, and the R-N-W algorithm. The Naive algorithm evaluates formula (1) directly, and its complexity is $O(n \cdot n!)$. The Ryser algorithm is a more efficient method [7]. It was proposed by H. Ryser in 1963 and uses the inclusion-exclusion principle to calculate the permanent. It is defined as
$$\mathrm{Per}(A) = \sum_{k=0}^{n-1} (-1)^k T_k, \qquad (3)$$
where $A_k$ is a matrix obtained from $A$ by removing $k$ columns, $P(A_k)$ is the product of the row sums of $A_k$, and $T_k$ is the sum of $P(A_k)$ over all possible $A_k$. According to formula (3), the complexity of the Ryser algorithm is $O(n^2 2^{n-1})$. The R-N-W algorithm was developed shortly after the Ryser algorithm [8]. Nijenhuis and Wilf used several techniques to improve the Ryser algorithm and reduced the complexity to $O(n 2^{n-1})$.
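To make formula (3) concrete before the R-N-W refinement is spelled out, the following is a minimal C sketch of a direct evaluation. It is an illustration under stated assumptions rather than the implementation measured in this paper: the function name `permanent_ryser`, the row-major flat array, and the use of a bitmask to select the columns kept in $A_k$ are all choices made here, and no Gray code ordering is applied.

```c
#include <assert.h>

/* Permanent of the n x n matrix `a` (row-major, entry (i, j) at a[i*n + j])
   by direct evaluation of formula (3). Each nonzero bitmask `keep` selects
   the columns that remain in A_k; k is the number of removed columns.
   The name and the bitmask encoding are illustrative assumptions. */
long long permanent_ryser(const long long *a, int n)
{
    assert(n >= 1 && n <= 20);            /* 2^n subsets: small n only */
    long long total = 0;
    for (unsigned keep = 1; keep < (1u << n); ++keep) {
        int removed = n;                  /* k = n - (number of kept columns) */
        for (int j = 0; j < n; ++j)
            if (keep & (1u << j))
                --removed;

        long long prod = 1;               /* P(A_k): product of the row sums */
        for (int i = 0; i < n; ++i) {
            long long row_sum = 0;
            for (int j = 0; j < n; ++j)
                if (keep & (1u << j))
                    row_sum += a[i * n + j];
            prod *= row_sum;
        }
        total += (removed % 2 == 0) ? prod : -prod;   /* (-1)^k P(A_k) */
    }
    return total;
}
```

For the 2 × 2 matrix with rows (1, 2) and (3, 4), the three kept-column sets contribute -3, -8 and +21, giving 10 = 1·4 + 2·3, which agrees with definition (1).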
The R-N-W algorithm can be described as
$$\mathrm{Per}(A) = (-1)^{n-1} \cdot 2 \sum_{S} (-1)^{|S|} \prod_{i=1}^{n} \Big\{ x_i + \sum_{j \in S} a_{i,j} \Big\}, \qquad (4a)$$
$$x_i = a_{i,n} - \frac{1}{2} \sum_{j=1}^{n} a_{i,j} \quad (i = 1, \ldots, n), \qquad (4b)$$
where $S$ runs over the subsets of $\{1, 2, \ldots, n-1\}$. For each subset $S \subseteq \{1, 2, \ldots, n-1\}$, we have to calculate
$$f(S) = \prod_{i=1}^{n} \lambda_i(S), \qquad (5)$$
where
$$\lambda_i(S) = x_i + \sum_{j \in S} a_{i,j} \quad (i = 1, \ldots, n). \qquad (6)$$
Suppose that the current subset $S$ differs from its predecessor $S'$ by a single element $j$. Then
$$\lambda_i(S) = \lambda_i(S') \pm a_{i,j} \quad (i = 1, \ldots, n). \qquad (7)$$
Thus, instead of requiring $n(|S| + 1)$ operations to compute $\lambda_1, \ldots, \lambda_n$ in (6), we can get them in just $n$ operations by (7). The key to moving from (6) to (7) is to enumerate the subsets in Gray code order, so that consecutive subsets differ by exactly one element. In addition, with respect to the permanents of some special square matrices ...
Design of the General Store-zechin Algorithm
Thought of the Algorithm
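Store-zechin is an algorithm designed by us; the approach seems to have been overlooked by some pure mathematicians. Computer memory and stored data are reused repeatedly so as to speed up the computation of a permanent. The key idea of the Store-zechin algorithm is to calculate the permanent recursively and to replace every item that has been calculated before with its stored result. For example, if $n = 4$ and
$$A = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \\ a_{4,1} & a_{4,2} & a_{4,3} & a_{4,4} \end{pmatrix},$$
then according to the Store-zechin algorithm,
$$\begin{aligned}
\mathrm{Per}(A) ={}& a_{4,1}\mathrm{Per}(A_{4;1}) + a_{4,2}\mathrm{Per}(A_{4;2}) + a_{4,3}\mathrm{Per}(A_{4;3}) + a_{4,4}\mathrm{Per}(A_{4;4}) \\
={}& a_{4,1}\big(a_{3,2}\mathrm{Per}(A_{4,3;1,2}) + a_{3,3}\mathrm{Per}(A_{4,3;1,3}) + a_{3,4}\mathrm{Per}(A_{4,3;1,4})\big) \\
&+ a_{4,2}\big(a_{3,1}\mathrm{Per}(A_{4,3;1,2}) + a_{3,3}\mathrm{Per}(A_{4,3;2,3}) + a_{3,4}\mathrm{Per}(A_{4,3;2,4})\big) \\
&+ a_{4,3}\big(a_{3,1}\mathrm{Per}(A_{4,3;1,3}) + a_{3,2}\mathrm{Per}(A_{4,3;2,3}) + a_{3,4}\mathrm{Per}(A_{4,3;3,4})\big) \\
&+ a_{4,4}\big(a_{3,1}\mathrm{Per}(A_{4,3;1,4}) + a_{3,2}\mathrm{Per}(A_{4,3;2,4}) + a_{3,3}\mathrm{Per}(A_{4,3;3,4})\big),
\end{aligned} \qquad (8)$$
where $A_{i;j}$ denotes the matrix obtained from $A$ by removing the $i$-th row and the $j$-th column; $A_{4,3;1,2}$, for instance, removes rows 4 and 3 and columns 1 and 2. According to (8), $\mathrm{Per}(A_{4,3;1,2})$, $\mathrm{Per}(A_{4,3;1,3})$, $\mathrm{Per}(A_{4,3;1,4})$, $\mathrm{Per}(A_{4,3;2,3})$, $\mathrm{Per}(A_{4,3;2,4})$ and $\mathrm{Per}(A_{4,3;3,4})$ each appear twice. So the second calculation of each of these items is replaced by the result of the first.
Data Structure of the Algorithm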
In order to store the calculation results during the recursion, we build a global linked list. Before calculating each recursive item, we check whether the item has already been calculated. If it has, the stored result is returned. Otherwise, the permanent of this item is calculated and stored in the linked list.
We first need to create two structures, HeadNode and BodyNode. BodyNode contains three variables, Array, value and pbNext. Array is a one-dimensional integer array that stores the columns that have been removed. value is an integer that holds the permanent of the square matrix left after those columns and the corresponding rows are removed. Only the removed columns need to be recorded: if Array lists $m$ removed columns, the removed rows are always the last $m$ rows of the original matrix. pbNext is a pointer to the next BodyNode. The structure of BodyNode is shown in Figure 1.
Fig. 1. The structure of BodyNode (fields: int *Array, int value, pbNext)
The definition of BodyNode in C is

    typedef struct bodynode {
        int *Array;               /* indices of the removed columns */
        int value;                /* permanent of the remaining submatrix */
        struct bodynode *pbNext;  /* next BodyNode in the same chain */
    } BodyNode, *pBodyNode;

HeadNode also contains three variables, size, phNext and pbody. size is an integer that records how many BodyNode nodes are linked after this node. phNext is a pointer to the next HeadNode. pbody is a pointer to the chain of BodyNode nodes. The structure of HeadNode is shown in Figure 2.
Fig. 2. The structure of HeadNode (fields: int size, phNext, pbody)
The definition of HeadNode in C is

    typedef struct headnode {
        int size;                 /* number of BodyNode nodes in this chain */
        pBodyNode pbody;          /* first BodyNode of the chain */
        struct headnode *phNext;  /* next HeadNode */
    } HeadNode, *pHeadNode;

The whole linked list is constructed from these two structures as shown in Figure 3. For the sake of convenience, we specify that only a BodyNode that removes one column can be linked to the first HeadNode, only a BodyNode that removes two columns can be linked to the second HeadNode, and so on.
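Fig. 3. The structure of the linked list (a chain of HeadNode nodes, each holding size and pointing to a chain of BodyNode nodes with int *Array and value)
In general, when $A$ is a square matrix of order $n$, we get the following formula.
$$\begin{aligned}
\mathrm{Per}(A) ={}& a_{n,1}\mathrm{Per}(A_{n;1}) + a_{n,2}\mathrm{Per}(A_{n;2}) + \cdots + a_{n,n}\mathrm{Per}(A_{n;n}) \\
={}& a_{n,1}\big(a_{n-1,2}\mathrm{Per}(A_{n,n-1;1,2}) + a_{n-1,3}\mathrm{Per}(A_{n,n-1;1,3}) + \cdots + a_{n-1,n}\mathrm{Per}(A_{n,n-1;1,n})\big) \\
&+ a_{n,2}\big(a_{n-1,1}\mathrm{Per}(A_{n,n-1;1,2}) + a_{n-1,3}\mathrm{Per}(A_{n,n-1;2,3}) + \cdots + a_{n-1,n}\mathrm{Per}(A_{n,n-1;2,n})\big) \\
&+ \cdots \\
&+ a_{n,n}\big(a_{n-1,1}\mathrm{Per}(A_{n,n-1;1,n}) + a_{n-1,2}\mathrm{Per}(A_{n,n-1;2,n}) + \cdots + a_{n-1,n-1}\mathrm{Per}(A_{n,n-1;n-1,n})\big).
\end{aligned} \qquad (9)$$
The termination condition of the recursion is
$$\mathrm{Per}(A) = a_{1,1}a_{2,2} + a_{1,2}a_{2,1}, \quad n = 2. \qquad (10)$$
Formulas (9) and (10), together with the rule that only sub-items that have not yet been calculated are computed, constitute the Store-zechin algorithm for calculating a permanent.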
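The recursion (9) with reuse of stored sub-results can be sketched compactly in C. This is a minimal sketch under assumptions, not the linked-list implementation of Figures 1 to 3: the stored results are kept in a flat table indexed by a bitmask of the remaining columns, the names `sub_permanent` and `permanent_store` are hypothetical, and a 1 × 1 base case is used so that the 2 × 2 step reproduces (10).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: permanent of the submatrix formed by the first `rows`
   rows of `a` (row-major, n columns) and the columns whose bits are set in
   `cols`. `rows` always equals the number of set bits in `cols`, so `cols`
   alone identifies the sub-item, like the removed-column array in the text. */
static long long sub_permanent(const long long *a, int n, int rows,
                               unsigned cols, long long *memo, char *known)
{
    if (rows == 1) {                      /* one row, one remaining column */
        int j = 0;
        while (!(cols & (1u << j)))
            ++j;
        return a[j];                      /* entry in row 0, column j */
    }
    if (known[cols])
        return memo[cols];                /* reuse the stored result */

    long long sum = 0;
    for (int j = 0; j < n; ++j) {
        if (cols & (1u << j)) {
            /* expand along the last remaining row, as in formula (9) */
            sum += a[(rows - 1) * n + j] *
                   sub_permanent(a, n, rows - 1, cols & ~(1u << j),
                                 memo, known);
        }
    }
    known[cols] = 1;                      /* store the newly computed item */
    memo[cols] = sum;
    return sum;
}

/* Permanent of an n x n matrix; the table has 2^n slots (assumed n <= 20). */
long long permanent_store(const long long *a, int n)
{
    assert(n >= 1 && n <= 20);
    size_t size = (size_t)1 << n;
    long long *memo = calloc(size, sizeof *memo);
    char *known = calloc(size, sizeof *known);
    assert(memo != NULL && known != NULL);
    long long result = sub_permanent(a, n, n, (unsigned)(size - 1),
                                     memo, known);
    free(memo);
    free(known);
    return result;
}
```

The bitmask table trades the per-order chains of the linked list for constant-time lookup at the cost of always reserving 2^n slots; either structure serves the same purpose of evaluating each sub-item once.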
Description of the Algorithm
Based on the key idea and the data structure, we can describe the general Store-zechin algorithm in detail.
Calling statement: Store-zechin(pHead, A, n, del_index, exist_index, del_order);
pHead: the pointer to the linked list;
A: the matrix whose permanent is to be calculated;
n: the order of A;
del_index: the array of the columns that need to be removed;
exist_index: the array of the columns that still exist after the removal operation;
del_order: the number of columns that need to be removed.
Algorithm steps:
S1: Search the linked list pointed to by pHead for a BodyNode whose Array is the same as del_index.
S1.1: If it exists, return the value of the node.
S1.2: If it does not exist, go to S2.
S2: Let sum ← 0.
S2.1: If n = 2, let sum ← $a_{1,1}a_{2,2} + a_{1,2}a_{2,1}$ ($a_{i,j}$ is the entry in the i-th row and j-th column of A). Create a new BodyNode, assign del_index and sum to its Array and value respectively, link the BodyNode to the linked list, and go to S9.
S2.2: If n > 2, let i ← 1 and go to S3.
S3: Let exist_i ← exist_index(i), append exist_i to the end of del_index, and let del_order ← del_order + 1.
S4: Let temp_exist_index ← exist_index, and delete the i-th entry of temp_exist_index.
S5: Let coe ← $a_{n,i}$, let temp_A be the matrix obtained by removing the last row and the i-th column, and let sum ← sum + coe * Store-zechin(pHead, temp_A, n - 1, del_index, temp_exist_index, del_order).
S6: Delete the last entry of del_index, and let del_order ← del_order - 1.
S7: Let i ← i + 1.
S7.1: If i > n, go to S8.
S7.2: If i <= n, go to S3.
S8: If del_order > 0, create a new BodyNode, assign del_index and sum to its Array and value respectively, and link it to the global linked list.
S9: Return sum.
Some global variables must be initialized before the algorithm starts. The initialization steps are as follows.
S1: Create an empty linked list, and let pHead point to it.
S2: Let del_index ← array1, where array1 is an empty array; let exist_index ← array2, where array2 is the array 1, 2, 3, …, n; let del_order ← 0.
Analysis of Time Complexity of the New Algorithm
Since the Store-zechin algorithm is recursive, the numbers of multiplication and addition operations needed for each sub-item can be derived from those needed for the lower-order sub-items.
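The counts derived in this section can be cross-checked independently with a short C sketch that tallies operations without touching any matrix entries. It rests on one assumption that is stated here rather than taken from the derivation below: every sub-item of order k ≥ 2 (one per set of k remaining columns) is evaluated exactly once, using k multiplications and k - 1 additions, which for k = 2 is formula (10). The name `store_zechin_counts` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: operation tally for the memoized recursion.
   Assumption: each sub-item of order k >= 2 is evaluated once and costs
   k multiplications and k - 1 additions; order-1 items cost nothing. */
void store_zechin_counts(int n, unsigned long *mults, unsigned long *adds)
{
    assert(n >= 1 && n <= 20);
    *mults = 0;
    *adds = 0;
    for (unsigned cols = 1; cols < (1u << n); ++cols) {
        unsigned long k = 0;              /* order of this sub-item */
        for (int j = 0; j < n; ++j)
            if (cols & (1u << j))
                ++k;
        if (k >= 2) {
            *mults += k;
            *adds += k - 1;
        }
    }
}
```

Under this assumption the tallies equal the sums of k·C(n, k) and (k - 1)·C(n, k) over k = 2, …, n, that is, $(2^{n-1} - 1)n$ multiplications and $2^{n-1}(n - 2) + 1$ additions, the closed forms stated in the abstract; for n = 3, 4 and 10 the sketch gives 9, 28, 5110 multiplications and 5, 17, 4097 additions.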
Multiplication Operations
According to the derivation process of the Store-zechin algorithm, the number of multiplication operations required by each sub-item follows the pattern below, with one row per order $n$.
$$\begin{array}{l|rrrrrr}
n = 1 & 0 \\
n = 2 & 0 & 0 \\
n = 3 & 2 & 2 & 2 \\
n = 4 & 9 & 7 & 5 & 3 \\
n = 5 & 28 & 19 & 12 & 7 & 4 \\
n = 6 & 75 & 47 & 28 & 16 & 9 & 5
\end{array} \qquad (11)$$
That is, when $n = i$, the first sub-item from right to left requires $i - 1$ multiplications (0 for $i = 1, 2$), and for $j > 1$ the $j$-th sub-item from right to left satisfies the following relationship: (the number of multiplications for the $(j-1)$-th sub-item from right to left when $n = i$) + (the number of multiplications for the $(j-1)$-th sub-item from right to left when $n = i - 1$) = (the number of multiplications for the $j$-th sub-item from right to left when $n = i$).
In fact, all of these counts can be derived from the sequence $0, 0, 2, 3, 4, 5, \ldots, n-1$. Collect them in an array $A_1$ whose $i$-th row is the row $n = i$ of (11), and let $a_{i,j}$ denote the $j$-th entry of row $i$ counted from the right. In $A_1$, the counts of all sub-items are obtained by starting from the rightmost column and working to the left with the rule $a_{i,j} = a_{i,j-1} + a_{i-1,j-1}$. Because the second term of the sequence $0, 0, 2, 3, 4, 5, \ldots, n-1$ is 0, the sequence has no uniform general term, which is inconvenient. So we consider the sequence $0, 1, 2, 3, 4, 5, \ldots, n-1$ instead and follow the same process as for $A_1$ to get $A_2$:
$$\begin{array}{l|rrrrrr}
i = 1 & 0 \\
i = 2 & 1 & 1 \\
i = 3 & 4 & 3 & 2 \\
i = 4 & 12 & 8 & 5 & 3 \\
i = 5 & 32 & 20 & 12 & 7 & 4 \\
i = 6 & 80 & 48 & 28 & 16 & 9 & 5
\end{array}$$
By comparing $A_1$ and $A_2$, we find that when $i > 1$, $a_{i,i-1}$ in $A_2$ is 1 larger than $a_{i,i-1}$ in $A_1$, $a_{i,i}$ in $A_2$ is $i - 1$ larger than $a_{i,i}$ in $A_1$, and the other values in the two arrays are equal. We can therefore obtain the sum of the $n$-th row of $A_1$, written $\mathrm{sumn}(A_1)$, by first calculating the sum of the $n$-th row of $A_2$, written $\mathrm{sumn}(A_2)$. The two sums satisfy
$$\mathrm{sumn}(A_2) = \mathrm{sumn}(A_1) + (n - 1 + 1). \qquad (12)$$
The entries of $A_2$ can be rewritten in terms of the row index; from right to left, the $n$-th row reads $n - 1,\ 2n - 3,\ 4n - 8,\ 8n - 20,\ \ldots$. The $i$-th item of the $n$-th row of $A_2$ can then be expressed as $2^{i-1} n - (2^{i-1} + (i-1) 2^{i-2})$, and $\mathrm{sumn}(A_2)$ can be expressed as
$$\mathrm{sumn}(A_2) = \sum_{i=1}^{n} \big( 2^{i-1} n - (2^{i-1} + (i-1) 2^{i-2}) \big). \qquad (13)$$
Now according to relation (12), we can derive $\mathrm{sumn}(A_1)$ as
$$\mathrm{sumn}(A_1) = \sum_{i=1}^{n} \big( 2^{i-1} n - (2^{i-1} + (i-1) 2^{i-2}) \big) - ((n - 1) + 1), \qquad (14)$$
namely
$$\mathrm{sumn}(A_1) = \sum_{i=1}^{n} \big( 2^{i-1} n - (2^{i-1} + (i-1) 2^{i-2}) \big) - n. \qquad (15)$$
Formula (15) gives the number of multiplication operations required inside the recursive items, but it is not yet the total for the Store-zechin algorithm. Looking back at formula (9), we see that in the top-level expansion each of the leading coefficients is also multiplied by its sub-item, which adds $n$ multiplications. In summary, the number of multiplication operations needed to calculate the permanent of a square matrix of order $n$ by Store-zechin is
$$\sum_{i=1}^{n} \big( 2^{i-1} n - (2^{i-1} + (i-1) 2^{i-2}) \big). \qquad (16)$$
Evaluating the sum in formula (16) gives formula (17).
$$n (2^{n-1} - 1). \qquad (17)$$
Addition Operations
As with the multiplication operations, the number of addition operations of each sub-item in the Store-zechin algorithm also follows a regular pattern.
$$\begin{array}{l|rrrrrr}
n = 1 & 0 \\
n = 2 & 0 & 0 \\
n = 3 & 1 & 1 & 1 \\
n = 4 & 5 & 4 & 3 & 2 \\
n = 5 & 17 & 12 & 8 & 5 & 3 \\
n = 6 & 49 & 32 & 20 & 12 & 7 & 4
\end{array} \qquad (18)$$
The number of addition operations required for all sub-items can likewise be listed from the sequence $0, 0, 1, 2, 3, 4, \ldots, n-2$, which gives an array $A_3$ whose $i$-th row is the row $n = i$ of (18). The process of getting $A_3$ is the same as for $A_1$. The first term of the sequence $0, 0, 1, 2, 3, 4, \ldots, n-2$ does not satisfy the general term $n - 2$, which hinders generalizing the derivation. So we consider the sequence $-1, 0, 1, 2, 3, 4, \ldots, n-2$ and obtain $A_4$ by the same calculation:
$$\begin{array}{l|rrrrrr}
i = 1 & -1 \\
i = 2 & -1 & 0 \\
i = 3 & 0 & 1 & 1 \\
i = 4 & 4 & 4 & 3 & 2 \\
i = 5 & 16 & 12 & 8 & 5 & 3 \\
i = 6 & 48 & 32 & 20 & 12 & 7 & 4
\end{array}$$
By comparing $A_3$ and $A_4$, we conclude that $a_{i,i}$ in $A_3$ is 1 larger than $a_{i,i}$ in $A_4$ ($i = 1, 2, \ldots, n$), and the other values in the two arrays are equal. We can therefore obtain the sum of the $n$-th row of $A_3$, written $\mathrm{sumn}(A_3)$, by first calculating the sum of the $n$-th row of $A_4$, written $\mathrm{sumn}(A_4)$. The two sums satisfy
$$\mathrm{sumn}(A_3) = \mathrm{sumn}(A_4) + 1. \qquad (19)$$
The entries of $A_4$ can be rewritten in terms of the row index; from right to left, the $n$-th row reads $n - 2,\ 2n - 5,\ 4n - 12,\ 8n - 28,\ \ldots$. The $i$-th item of the $n$-th row of $A_4$ can then be expressed as $2^{i-1} n - (2^{i} + (i-1) 2^{i-2})$, and $\mathrm{sumn}(A_4)$ can be expressed as
$$\mathrm{sumn}(A_4) = \sum_{i=1}^{n} \big( 2^{i-1} n - (2^{i} + (i-1) 2^{i-2}) \big). \qquad (20)$$
Now according to relation (19), we can derive $\mathrm{sumn}(A_3)$ as
$$\mathrm{sumn}(A_3) = \sum_{i=1}^{n} \big( 2^{i-1} n - ((i-1) 2^{i-2} + 2^{i}) \big) + 1. \qquad (21)$$
However, formula (21) only represents the sum of the addition operations inside the sub-items. The total must also include the additions that combine the sub-items at the top level of the recursion; see formula (9). There are $n$ sub-items, so $n - 1$ further addition operations are needed. The number of addition operations needed to calculate the permanent of a square matrix of order $n$ by Store-zechin is therefore
$$\sum_{i=1}^{n} \big( 2^{i-1} n - ((i-1) 2^{i-2} + 2^{i}) \big) + 1 + (n - 1). \qquad (22)$$
Evaluating the sum in formula (22) gives
$$2^{n-1} (n - 2) + 1. \qquad (23)$$
Comparison of Complexities between New Algorithm and Existing Algorithms
As mentioned above, the current well-known algorithms for calculating the permanent are the Naive algorithm, the Ryser algorithm and the R-N-W algorithm. Here, the numbers of addition operations, multiplication operations and total bit operations (assuming the maximum integer allowed is $2^{64}$, so that an addition costs 64 bit operations and a multiplication costs 4096) are used as the standard for comparing the Store-zechin algorithm with these algorithms. Firstly, we count the relevant data of each algorithm for n = 3, 4, …, 10; the results are shown in Tables 1-3.

Table 1: Comparison of the Addition Operations of Four Algorithms
| Algorithm | n=3 | n=4 | n=5 | n=6 | n=7 | n=8 | n=9 | n=10 |
|---|---|---|---|---|---|---|---|---|
| Naive | 5 | 23 | 119 | 719 | 5039 | 40319 | 362879 | 3628799 |
| Ryser | 18 | 58 | 160 | 404 | 966 | 2230 | 5028 | 11152 |
| R-N-W | 21 | 51 | 115 | 253 | 553 | 1207 | 2631 | 5721 |
| Store-zechin | 5 | 17 | 49 | 129 | 321 | 769 | 1793 | 4097 |

Table 2: Comparison of the Multiplication Operations of Four Algorithms
| Algorithm | n=3 | n=4 | n=5 | n=6 | n=7 | n=8 | n=9 | n=10 |
|---|---|---|---|---|---|---|---|---|
| Naive | 18 | 96 | 600 | 4320 | 35280 | 322560 | 3265920 | 36288000 |
| Ryser | 22 | 61 | 156 | 379 | 890 | 2041 | 4600 | 10231 |
| R-N-W | 17 | 38 | 87 | 200 | 457 | 1034 | 2315 | 5132 |
| Store-zechin | 9 | 28 | 75 | 186 | 441 | 1016 | 2295 | 5110 |

Table 3: Comparison of the Total Bit Operations of Four Algorithms
| Algorithm | n=3 | n=4 | n=5 | n=6 | n=7 | n=8 | n=9 | n=10 |
|---|---|---|---|---|---|---|---|---|
| Naive | 74048 | 394688 | 2465216 | 17740736 | 144829376 | 1.3238e+09 | 1.3400e+10 | 1.4887e+11 |
| Ryser | 91264 | 253568 | 649216 | 1578240 | 3707264 | 8502656 | 19163392 | 42619904 |
| R-N-W | 70976 | 158912 | 363712 | 835392 | 1907264 | 4312512 | 9650624 | 21386816 |
| Store-zechin | 37184 | 115776 | 310336 | 770112 | 1826880 | 4210752 | 9515072 | 21192768 |

From the comparison in Tables 1-3, we can see that, when n > 5, the addition operations, the multiplication operations and the total bit operations all satisfy Naive > Ryser > R-N-W > Store-zechin. Besides, the difference between them increases as n increases. This shows that the Store-zechin algorithm can calculate a permanent of order five or more with fewer operations. To support this statement for general n, the closed-form expressions for the addition operations, multiplication operations, and total bit operations of the four algorithms are compared next. The results are shown in Table 4.

Table 4: Comparison of Computational Complexity of Four Algorithms
| Algorithm | Addition | Multiplication | Total Bit |
|---|---|---|---|
| Naive | $n! - 1$ | $n \cdot n!$ | $(4096 n + 64) \cdot n! - 64$ |
| Ryser | $(n+1)(2^n - n) - 2$ | $n(2^n - 1) + 1$ | $4160 \cdot n 2^n + 64 \cdot 2^n - 64 n^2 - 4160 n + 3968$ |
| R-N-W | $(n+1) 2^{n-1} + n^2 - n - 1$ | $n 2^{n-1} + n + 2$ | $4160 \cdot n 2^{n-1} + 64 \cdot 2^{n-1} + 64 n^2 + 4032 n + 8128$ |
| Store-zechin | $n 2^{n-1} - 2^n + 1$ | $n 2^{n-1} - n$ | $4160 \cdot n 2^{n-1} - 128 \cdot 2^{n-1} - 4096 n + 64$ |

As can be seen from the comparison in the table, all three indicators show that the Naive algorithm has the largest expression, so its computational complexity is the highest, and the Ryser algorithm ranks second. Although the R-N-W algorithm has the same highest-order term as Store-zechin, its lower-order terms are larger, so Store-zechin has the lower computational complexity.
Conclusion
Although the Store-zechin algorithm has been neglected by mathematicians, it makes full use of the storage characteristics of the computer, and as the order of the matrix grows it calculates the permanent more efficiently. The theoretical analysis confirms that Store-zechin has lower computational complexity than the Naive algorithm and the Ryser algorithm. The R-N-W algorithm has the same highest-order term as Store-zechin but larger lower-order terms, so the gap in operation counts widens as the order of the matrix increases. Moreover, the Store-zechin algorithm is designed around the storage characteristics of computers, so it is well suited to computer implementation. Therefore, in some performance tests, the Store-zechin algorithm can more fully reflect some of the features of the device, and it has a good application prospect.
Acknowledgment
This work is supported by MOST with Project 2007CB311100 and 2009AA01Z441. …………