On the Weight Spectrum of Pre-Transformed Polar Codes
Yuan Li, Huazi Zhang, Rong Li, Jun Wang, Guiying Yan, Zhiming Ma
Yuan Li∗†‡, Huazi Zhang∗, Rong Li∗, Jun Wang∗, Guiying Yan†‡, and Zhiming Ma†‡
∗Huawei Technologies Co. Ltd.  †University of Chinese Academy of Sciences  ‡Academy of Mathematics and Systems Science, CAS
Email: [email protected], {zhanghuazi, lirongone.li, justin.wangjun}@huawei.com, [email protected], [email protected]
Abstract—Polar codes are the first class of channel codes achieving the symmetric capacity of binary-input discrete memoryless channels with efficient encoding and decoding algorithms. But the weight spectrum of polar codes is relatively poor compared to RM codes, which degrades their ML performance. Pre-transformation with an upper-triangular matrix (including cyclic redundancy check (CRC), parity-check (PC) and polarization-adjusted convolutional (PAC) codes) improves the weight spectrum while retaining polarization. In this paper, the weight spectrum of upper-triangular pre-transformed polar codes is mathematically analyzed. In particular, we focus on calculating the number of low-weight codewords due to their impact on error-correction performance. Simulation results verify the accuracy of the analysis.
I. INTRODUCTION
Polar codes [1], invented by Arıkan, are a great breakthrough in coding theory. As the code length $N = 2^n$ approaches infinity, the synthesized channels become either noiseless or pure-noise, and the fraction of the noiseless channels approaches the channel capacity. Thanks to channel polarization, an efficient successive cancellation (SC) decoding algorithm can be implemented with a complexity of $O(N \log N)$. However, the performance of polar codes under SC decoding is poor at short to moderate block lengths.

In [2], a successive cancellation list (SCL) decoding algorithm was proposed. As the list size $L$ increases, the performance of SCL decoding approaches that of maximum-likelihood (ML) decoding. But the ML performance of polar codes is still inferior due to their low minimum distance. Consequently, concatenations of polar codes with CRC [3] and PC [4] were proposed to improve the weight spectrum. Recently, Arıkan proposed polarization-adjusted convolutional (PAC) codes [5], which were shown to approach the BIAWGN dispersion bound [6] under large-list decoding [7].

CRC-aided (CA) polar, PC-polar, and PAC codes can be viewed as pre-transformed polar codes with upper-triangular transformation matrices [7]. In [8], it is proved that any pre-transformation with an upper-triangular matrix does not reduce the minimum Hamming weight, and that a properly designed pre-transformation can reduce the number of minimum-weight codewords. In this paper, we propose an efficient method to calculate the average weight spectrum of pre-transformed polar codes. Moreover, the method holds for arbitrary information sub-channel selection criteria; it thus covers polar codes and RM codes, and is not constrained by the "partial order" [9]. Our results confirm that pre-transformation with an upper-triangular matrix can reduce the number of minimum-weight codewords significantly. In the meantime, it enhances the error-correcting performance of SCL decoding.

In Section II, we review polar codes and pre-transformed polar codes.
In Section III, we propose a formula to calculate the average weight spectrum of pre-transformed polar codes. In Section IV, simulation results are presented to verify the accuracy of the formula. Finally, we draw some conclusions in Section V.

II. BACKGROUND
A. Polar Codes
Given a B-DMC $W: \{0,1\} \to \mathcal{Y}$, the channel transition probabilities are defined as $W(y|x)$, where $y \in \mathcal{Y}$, $x \in \{0,1\}$. $W$ is said to be symmetric if there is a permutation $\pi$ such that $\forall y \in \mathcal{Y}$, $W(y|1) = W(\pi(y)|0)$ and $\pi = \pi^{-1}$. Then the symmetric capacity and the Bhattacharyya parameter of $W$ are defined as
$$I(W) \triangleq \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} \frac{1}{2} W(y|x) \log \frac{W(y|x)}{\frac{1}{2}W(y|0) + \frac{1}{2}W(y|1)} \quad (1)$$
and
$$Z(W) \triangleq \sum_{y \in \mathcal{Y}} \sqrt{W(y|0)\,W(y|1)}. \quad (2)$$
Let $F = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, $N = 2^m$, and $H_N = F^{\otimes m}$. Starting from $N = 2^m$ independent copies of $W$, we obtain $N$ polarized channels $W_N^{(i)}$ after the channel combining and splitting operations [1], where
$$W_N\big(y_1^N \,\big|\, u_1^N\big) \triangleq W^N\big(y_1^N \,\big|\, u_1^N H_N\big) \quad (3)$$
$$W_N^{(i)}\big(y_1^N, u_1^{i-1} \,\big|\, u_i\big) \triangleq \sum_{u_{i+1}^N \in \mathcal{X}^{N-i}} \frac{1}{2^{N-1}}\, W_N\big(y_1^N \,\big|\, u_1^N\big). \quad (4)$$
Polar codes can be constructed by selecting the indices of $K$ information sub-channels, denoted by the information set $\mathcal{A} = \{I_1, I_2, \ldots, I_K\}$. The optimal sub-channel selection criterion for SC decoding is reliability, i.e., selecting the $K$ most reliable sub-channels as the information set. The density evolution (DE) algorithm [10], the Gaussian approximation (GA) algorithm [11] and the channel-independent PW construction method [12] are efficient methods to find reliable sub-channels. The optimal sub-channel selection criterion for SCL decoding is still an open problem. Some heuristic approaches consider both reliability and row weight, such as RM-polar codes [13] and PC-polar codes [4], to improve the minimum code distance. Others employ artificial intelligence techniques to find good information sets [14] [15].

After determining the information set $\mathcal{A}$, the complement set $\mathcal{A}^c$ is called the frozen set. Let $u_1^N = (u_1, u_2, \ldots, u_N)$ be the bit sequence to be encoded. The information bits are inserted into $u_{\mathcal{A}}$, and all zeros are filled into $u_{\mathcal{A}^c}$. Then the codeword $x_1^N$ is obtained by $x_1^N = u_1^N H_N$.

B. Weight Spectrum of Polar Codes
There are many prior works analyzing the weight spectrum of polar codes. In [16], the authors use SCL decoding with a large list size to decode an all-zeros codeword; the codewords within the list are enumerated to estimate the number of low-weight codewords. In [17], the authors improve this approach in terms of memory usage. The above methods only obtain a partial weight spectrum. In [18] [19], probabilistic computation methods are proposed to estimate the weight spectrum of polar codes.
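At short block lengths, the exact weight spectrum can also be obtained by direct enumeration of all messages, which serves as a ground truth when checking any of the estimation methods above. The following sketch is our illustration, not from the paper; the $(8, 4)$ toy code and the function names are assumptions.

```python
import numpy as np
from collections import Counter
from itertools import product

def polar_matrix(m):
    """H_N = F^{(x)m} with F = [[1,0],[1,1]] and N = 2^m."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    H = np.array([[1]], dtype=np.uint8)
    for _ in range(m):
        H = np.kron(H, F)
    return H

def weight_spectrum(m, info_rows):
    """Exact weight spectrum {d: N_d} of the polar code whose generator
    consists of the given (0-indexed) rows of H_N, by enumerating all
    2^K messages. Feasible only for small K."""
    rows = polar_matrix(m)[sorted(info_rows)]
    spec = Counter()
    for msg in product((0, 1), repeat=len(rows)):
        cw = (np.array(msg, dtype=np.uint8) @ rows) % 2
        spec[int(cw.sum())] += 1
    return dict(spec)

# The (8, 4) code with rows {3, 5, 6, 7} of H_8 (0-indexed) is the
# extended Hamming code, whose spectrum is {0: 1, 4: 14, 8: 1}.
print(weight_spectrum(3, {3, 5, 6, 7}))
```

Enumeration costs $O(2^K N)$ and is hopeless for codes such as RM(128, 64), which is exactly why the partial or probabilistic estimators above, and the average analysis in this paper, are needed.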
C. Weight Spectrum of Polar Cosets
As in [20], for $u_1^{i-1} \in \{0,1\}^{i-1}$ and $u_i \in \{0,1\}$, define the polar coset $C_N^{(i)}\big(u_1^{i-1}, u_i\big)$ as
$$C_N^{(i)}\big(u_1^{i-1}, u_i\big) = \big\{ (u_1^i, u')\, H_N \,\big|\, u' \in \{0,1\}^{N-i} \big\}.$$
In [21] [22], recursive formulas are proposed to efficiently compute the weight spectrum of $C_N^{(i)}\big(0_1^{i-1}, 1\big)$. The weight spectrum of $C_N^{(i)}\big(0_1^{i-1}, 1\big)$ is tightly associated with the performance of SC decoding, and our analysis of the average weight spectrum of pre-transformed polar codes is based on the polar coset spectrum as well.

D. Pre-Transformed Polar Codes

$$T = \begin{pmatrix} 1 & T_{12} & \cdots & T_{1N} \\ 0 & 1 & \cdots & T_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$
The above non-degenerate upper-triangular pre-transformation matrix $T$ has all ones on the main diagonal. Let $G_N = T H_N$ and $u_{\mathcal{A}^c} = 0$; the codeword of the pre-transformed polar code is given by $x_1^N = u_1^N G_N = u_1^N T H_N$. Let $z_1^N = u_1^N T$; the $i$-th pre-transformed bit is given by $z_i = u_i \oplus \bigoplus_{j=1}^{i-1} u_j T_{ji}$. As seen, $z_i$ is a linear combination of the $i$-th and previous bits, much like a parity-check bit or a dynamic frozen bit [23].

III. AVERAGE CODE SPECTRUM ANALYSIS
In this section, we propose a formula to compute the average weight spectrum of pre-transformed polar codes, with a focus on the number of low-weight codewords. The average assumes that the entries $T_{ij}$, $1 \le i < j \le N$, are i.i.d. Bernoulli$(\frac{1}{2})$ random variables.

A. Notations and Definitions

$h_N^{(i)}$ is the $i$-th row vector of $H_N$, and $g_N^{(i)}$ is the $i$-th row vector of $G_N$. The number of codewords with Hamming weight $d$ of the pre-transformed polar code is denoted by $N_d(U \times T \times H_N)$. The minimum distances of polar/RM codes and of the pre-transformed codes are denoted by $d_{\min}(U \times H_N)$ and $d_{\min}(U \times T \times H_N)$, respectively. The numbers of minimum-weight codewords of polar/RM codes and of the pre-transformed codes are denoted by $N_{\min}(U \times H_N)$ and $N_{\min}(U \times T \times H_N)$, respectively.

B. Code Spectrum Analysis
The expected number of codewords with Hamming weight $d$ is
$$\mathbb{E}\left[N_d(U \times T \times H_N)\right] = \sum_{(u_{I_1}, u_{I_2}, \ldots, u_{I_K}) \in \{0,1\}^K} P\!\left( w\!\left( \bigoplus_{i=1}^{K} u_{I_i}\, g_N^{(I_i)} \right) = d \right)$$
$$= \sum_{j=1}^{K} \; \sum_{\substack{u_{I_1} = \cdots = u_{I_{j-1}} = 0,\; u_{I_j} = 1 \\ (u_{I_{j+1}}, \ldots, u_{I_K}) \in \{0,1\}^{K-j}}} P\!\left( w\!\left( g_N^{(I_j)} \oplus \bigoplus_{i=j+1}^{K} u_{I_i}\, g_N^{(I_i)} \right) = d \right) \quad (5)$$

Lemma 1. $\forall (u_{I_{j+1}}, \ldots, u_{I_K}) \in \{0,1\}^{K-j}$,
$$P\!\left( w\!\left( g_N^{(I_j)} \oplus \bigoplus_{i=j+1}^{K} u_{I_i}\, g_N^{(I_i)} \right) = d \right) = P\Big( w\big( g_N^{(I_j)} \big) = d \Big).$$

Proof.
According to the pre-transformation matrix,
$$g_N^{(I_j)} = h_N^{(I_j)} \oplus \bigoplus_{i=I_j+1}^{N} T_{I_j i}\, h_N^{(i)}$$
$$g_N^{(I_j)} \oplus \bigoplus_{i=j+1}^{K} u_{I_i}\, g_N^{(I_i)} = h_N^{(I_j)} \oplus \bigoplus_{i=I_j+1}^{N} T'_{I_j i}\, h_N^{(i)}$$
where $T'_{I_j i}$ collects, for each $i > I_j$, the entry $T_{I_j i}$ XORed with terms determined only by $u_{I_{j+1}}, \ldots, u_{I_K}$ and the rows $T_{I_k \cdot}$, $k > j$. Since the $T_{I_j i}$ are i.i.d. Bernoulli$(\frac{1}{2})$ and independent of those terms, the $T'_{I_j i}$, $I_j < i \le N$, are again i.i.d. Bernoulli$(\frac{1}{2})$. Hence the two vectors above have the same distribution, which proves the lemma. $\blacksquare$
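Lemma 1 can be sanity-checked exactly at toy scale, because only a handful of entries of $T$ influence the vectors involved. The sketch below is our illustration, not part of the paper: the information set $\mathcal{A} = \{2, 6, 7, 8\}$ and the choice $(u_{I_2}, u_{I_3}, u_{I_4}) = (1, 0, 1)$ are assumptions. It enumerates the exact distribution of $w\big(g^{(2)} \oplus g^{(6)} \oplus g^{(8)}\big)$ and of $w\big(g^{(2)}\big)$ for $N = 8$; Lemma 1 predicts they coincide.

```python
import numpy as np
from collections import Counter
from itertools import product

def polar_matrix(m):
    """H_N = F^{(x)m}, F = [[1,0],[1,1]]."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    H = np.array([[1]], dtype=np.uint8)
    for _ in range(m):
        H = np.kron(H, F)
    return H

H = polar_matrix(3)   # N = 8; H[0..7] store h^(1)..h^(8)

def weight_dist(base_rows, free_rows):
    """Exact distribution of w(xor of H[base_rows] + random combo), where
    each row index in free_rows carries its own i.i.d. Bernoulli(1/2)
    coefficient (repeated indices model independent T entries that land
    on the same row)."""
    base = np.bitwise_xor.reduce(H[base_rows], axis=0)
    c = Counter()
    for t in product((0, 1), repeat=len(free_rows)):
        v = base.copy()
        for b, r in zip(t, free_rows):
            if b:
                v ^= H[r]
        c[int(v.sum())] += 1
    n = 2 ** len(free_rows)
    return {w: k / n for w, k in c.items()}

# Left side: g^(2) + g^(6) + g^(8) = h^(2) + h^(6) + h^(8)
#   + sum_{i=3..8} T_{2,i} h^(i) + T_{6,7} h^(7) + T_{6,8} h^(8)
lhs = weight_dist([1, 5, 7], [2, 3, 4, 5, 6, 7, 6, 7])
# Right side: g^(2) = h^(2) + sum_{i=3..8} T_{2,i} h^(i)
rhs = weight_dist([1], [2, 3, 4, 5, 6, 7])
print(lhs == rhs)   # Lemma 1 predicts True
```

The comparison holds because the coefficient of each $h^{(i)}$, $i > I_j$, is an XOR involving a fresh $T_{I_j i} \sim$ Bernoulli$(\frac{1}{2})$, so both coefficient vectors are uniform.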
Lemma 2. If $w\big(h_N^{(I_j)}\big) > d$, then $P\big( w\big(g_N^{(I_j)}\big) = d \big) = 0$.

Proof. Recall that
$$g_N^{(I_j)} = h_N^{(I_j)} \oplus \bigoplus_{i=I_j+1}^{N} T_{I_j i}\, h_N^{(i)}.$$
According to [9, Corollary 1],
$$w\big(g_N^{(I_j)}\big) \ge w\big(h_N^{(I_j)}\big) > d,$$
therefore $P\big( w\big(g_N^{(I_j)}\big) = d \big) = 0$. $\blacksquare$

According to Lemma 1 and Lemma 2, (5) can be further simplified to
$$\mathbb{E}[N_d(U \times T \times H_N)] = \sum_{\substack{1 \le j \le K \\ w(h_N^{(I_j)}) \le d}} 2^{K-j}\, P\big( w\big(g_N^{(I_j)}\big) = d \big). \quad (6)$$
Let $P(m, i, d) \triangleq P\big( w\big(g_{2^m}^{(i)}\big) = d \big)$; then (6) can be rewritten as
$$\mathbb{E}[N_d(U \times T \times H_N)] = \sum_{\substack{1 \le j \le K \\ w(h_N^{(I_j)}) \le d}} 2^{K-j}\, P(m, I_j, d). \quad (7)$$
In particular, let $P(m, i) \triangleq P\big( w\big(g_{2^m}^{(i)}\big) = w\big(h_{2^m}^{(i)}\big) \big)$. So if $d = d_{\min}$, (6) can be rewritten as
$$\mathbb{E}[N_{\min}(U \times T \times H_N)] = \sum_{\substack{1 \le j \le K \\ w(h_N^{(I_j)}) = d_{\min}(U \times H_N)}} 2^{K-j}\, P(m, I_j). \quad (8)$$
Let $A_d$ denote the number of codewords in $C_N^{(i)}\big(0_1^{i-1}, 1\big)$ with Hamming weight $d$. Clearly, $2^{N-i} P(m, i) = A_{w(h_N^{(i)})}$ and $2^{N-i} P(m, i, d) = A_d$. In [21] [22], the authors propose recursive formulas to calculate the weight spectrum of polar cosets.

In Theorem 1 and Theorem 2, we investigate recursive formulas for $P(m, i)$ and $P(m, i, d)$, which are similar to the formulas in [22]. But instead of polar cosets, we are interested in pre-transformed polar codes. For the completeness of the paper, the proofs are given in the appendix.

Theorem 1.
$$P(m, i) = \begin{cases} \dfrac{2^{w(h_{2^m}^{(i)})}}{2^{2^{m-1}}}\, P(m-1, i) & 1 \le i \le 2^{m-1} \\[2mm] P(m-1,\, i - 2^{m-1}) & 2^{m-1} < i \le 2^m \end{cases} \quad (9)$$
with the boundary conditions $P(1,1) = P(1,2) = 1$.

With (8) and (9), we can recursively calculate the average number of minimum-weight codewords. We are also interested in the other low-weight codewords of the weight spectrum, since together they determine the ML performance at high SNR. The problem boils down to evaluating the more general formula for $P(m, i, d)$. As we will see in Theorem 2, the average weight spectrum can be calculated efficiently in the same recursive manner, especially for codewords with small Hamming weight.
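As a concrete sketch of how (8) and (9) combine, the following Python fragment (our illustration; the 1-indexed convention and the toy $(8, 4)$ information set $\{4, 6, 7, 8\}$ are assumptions, not from the paper) implements the recursion for $P(m, i)$ and evaluates $\mathbb{E}[N_{\min}]$.

```python
from functools import lru_cache

def row_weight(m, i):
    """w(h^{(i)}_{2^m}) = 2^{popcount(i-1)} for the i-th (1-indexed) row of F^{(x)m}."""
    return 1 << bin(i - 1).count("1")

@lru_cache(maxsize=None)
def P1(m, i):
    """Theorem 1: P(m, i) = Pr[w(g^{(i)}_{2^m}) = w(h^{(i)}_{2^m})]."""
    if m == 1:
        return 1.0                           # P(1,1) = P(1,2) = 1
    half = 1 << (m - 1)
    if i <= half:
        return P1(m - 1, i) * 2 ** row_weight(m, i) / 2 ** half
    return P1(m - 1, i - half)

def avg_n_min(m, info_set):
    """(8): E[N_min] = sum of 2^{K-j} P(m, I_j) over the info rows whose
    row weight equals the minimum distance."""
    A = sorted(info_set)                     # I_1 < ... < I_K, 1-indexed
    K = len(A)
    d_min = min(row_weight(m, i) for i in A)
    e_n_min = sum(2 ** (K - j) * P1(m, A[j - 1])
                  for j in range(1, K + 1)
                  if row_weight(m, A[j - 1]) == d_min)
    return d_min, e_n_min

# (8, 4) code with information set {4, 6, 7, 8}: d_min = 4 and
# E[N_min] = 8 + 4 + 2 = 14, matching the exact count A_4 = 14.
print(avg_n_min(3, {4, 6, 7, 8}))
```

For this toy code $P(m, I_j) = 1$ for all three weight-4 rows, so every realization of $T$ keeps exactly 14 minimum-weight codewords; at larger lengths the $P(m, I_j)$ drop below 1 and the average falls well below the original count.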
Theorem 2. If $1 \le i \le 2^{m-1}$,
$$P(m, i, d) = \sum_{\substack{d' = w(h_{2^m}^{(i)}) \\ d - d' \text{ even}}}^{d} P(m-1, i, d')\; \frac{2^{d'} \binom{2^{m-1} - d'}{(d-d')/2}}{2^{2^{m-1}}}.$$
If $2^{m-1} < i \le 2^m$,
$$P(m, i, d) = \begin{cases} P\big(m-1,\, i - 2^{m-1},\, d/2\big) & d \text{ is even} \\ 0 & d \text{ is odd} \end{cases}$$
with the boundary conditions $P(1,1,1) = P(1,2,2) = 1$. And
$$\begin{cases} P(m, 1, d) = 0, & \text{if } d \text{ is even} \\ P(m, i, d) = 0, & \text{if } i > 1 \text{ and } d \text{ is odd} \end{cases} \quad (10)$$

IV. SIMULATION
In this section, we verify the correctness of the recursive formula through simulations. In particular, we employ the "large list decoding" method described in [16] to collect low-weight codewords. First, we randomly generate one thousand pre-transform matrices for RM(128, 64), set $L = 5\times$ to count the number of minimum-weight codewords for each matrix, and obtain their average $N_{\min}$. The result is shown in Fig. 1: $d_{\min} = 16$, $N_{\min}^{\text{simulation}} \approx 2768$, $N_{\min}^{\text{recursion}} \approx 2766$. To show that our recursive formula is applicable to any sub-channel selection criterion, we also construct a (128, 64) polar code by the PW method [12]. The simulation result is shown in Fig. 2: $d_{\min} = 8$, $N_{\min}^{\text{simulation}} \approx 272$, $N_{\min}^{\text{recursion}} = 272$. As seen, the recursively calculated minimum-weight codeword numbers are very close to the ones obtained through simulation.

In Table I, we display the number of minimum-weight codewords of the original RM/polar codes, together with the recursively calculated average. It is shown that pre-transformation significantly reduces the number of minimum-weight codewords, especially for RM(128, 64). The significant improvement of the weight spectrum after pre-transformation explains why CA-polar, PC-polar, and PAC codes outperform the original polar codes under list decoding with a large list size. The improvement can be observed under different code lengths and rates, as we can see from Fig. 3. In all cases, pre-transformation reduces the number of minimum-weight codewords significantly.
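At toy scale the ensemble average can also be verified without any list decoder: for $N = 8$ only the rows of $T$ indexed by the information set affect the code, so the exact average spectrum can be enumerated and compared against (7). The sketch below is our illustration (1-indexed as in the paper; the $(8, 4)$ information set $\{4, 6, 7, 8\}$ is a toy assumption).

```python
import numpy as np
from math import comb
from functools import lru_cache
from itertools import product
from collections import Counter

def row_weight(m, i):
    return 1 << bin(i - 1).count("1")       # w(h^{(i)}_{2^m}), i 1-indexed

@lru_cache(maxsize=None)
def P2(m, i, d):
    """Theorem 2: P(m, i, d) = Pr[w(g^{(i)}_{2^m}) = d]."""
    if m == 1:
        return 1.0 if (i, d) in ((1, 1), (2, 2)) else 0.0
    half = 1 << (m - 1)
    if i > half:                            # weights double via (11)
        return P2(m - 1, i - half, d // 2) if d % 2 == 0 else 0.0
    w = row_weight(m, i)
    return sum(P2(m - 1, i, dp) * 2 ** dp * comb(half - dp, (d - dp) // 2)
               for dp in range(w, min(d, half) + 1)
               if (d - dp) % 2 == 0) / 2 ** half

def avg_spectrum_formula(m, info_set, weights):
    """(7): E[N_d] = sum of 2^{K-j} P(m, I_j, d) over info rows."""
    A = sorted(info_set)
    K = len(A)
    return {d: sum(2 ** (K - j) * P2(m, A[j - 1], d)
                   for j in range(1, K + 1)
                   if row_weight(m, A[j - 1]) <= d)
            for d in weights}

def avg_spectrum_exact(m, info_set):
    """Brute force: average N_d over all T, enumerating only the free
    entries of T in the information rows (the only rows that matter)."""
    N = 1 << m
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    H = np.array([[1]], dtype=np.uint8)
    for _ in range(m):
        H = np.kron(H, F)
    A0 = sorted(i - 1 for i in info_set)
    free = [(r, c) for r in A0 for c in range(r + 1, N)]
    acc = Counter()
    for bits in product((0, 1), repeat=len(free)):
        T = np.eye(N, dtype=np.uint8)
        for (r, c), b in zip(free, bits):
            T[r, c] = b
        G = (T @ H) % 2
        for msg in product((0, 1), repeat=len(A0)):
            if any(msg):
                cw = (np.array(msg, dtype=np.uint8) @ G[A0]) % 2
                acc[int(cw.sum())] += 1
    n_T = 2 ** len(free)
    return {d: n / n_T for d, n in sorted(acc.items())}

exact = avg_spectrum_exact(3, {4, 6, 7, 8})
formula = avg_spectrum_formula(3, {4, 6, 7, 8}, exact.keys())
print(exact)       # {4: 14.0, 8: 1.0}
print(formula)
```

With only 7 relevant entries of $T$ there are 128 realizations, so the brute-force average is exact, and it agrees with the recursion of Theorem 2 via (7).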
Fig. 1. RM(128, 64): $N_{16}$ versus the index of the tested pre-transform matrix; the black solid line is calculated, and the blue solid line is obtained from simulation (with its running average).

Fig. 2. Polar PW(128, 64): $N_8$; the black solid line is calculated, and the blue solid line is obtained from simulation (with its running average).
In addition to minimum-weight codewords, we also run simulations to verify the accuracy of the formula for other low-weight codewords. The simulation results are shown in Table II for RM(128, 64) and PW(128, 64), respectively, where $N^{\text{sim}}$ denotes the simulation results and $N^{\text{recur}}$ the calculated results.

In parity-check (PC) codes [4], both reliability and code distance are taken into consideration when selecting the information set. A coefficient $\alpha$ is used to control the trade-off between reliability and code distance: the larger $\alpha$ is, the greater the code distance. A parity-check pattern can be considered as a realization of the pre-transformation matrix. Taking PC-polar (128, 64) ($\alpha = 1.5$) as an example, we

TABLE I
COMPARISON BETWEEN ORIGINAL POLAR CODES AND PRE-TRANSFORMED POLAR CODES

Number of minimum-weight codewords:

              d_min   Original   Pre-transformed
RM(128,64)     16      94488          2767
PW(128,64)      8        304           272
Fig. 3. Comparison between original polar codes and pre-transformed polar codes under different code lengths and rates: RM(128,29), RM(64,42), RM(64,22), RM(32,16).

TABLE II
COMPARISON BETWEEN SIMULATION RESULTS OF RM(128, 64) AND PW(128, 64) WITH REALIZATIONS AND THE CALCULATION RESULTS BY THE PROPOSED RECURSIVE FORMULA

        RM(128,64)                    PW(128,64)
 d    N_d^sim    N_d^recur      d     N_d^sim    N_d^recur
16    2764.5      2766.9        8      272.2        272
18     397.1       393.5       12      896.6        896
20    80251       80182        16    76812.2      77111

Note that N = N = 0 for PW(128, 64).

calculate the average number of low-weight codewords. The result implies that pre-transformation can increase the minimum code distance when the information set is properly chosen, that is, it can reduce the number of original minimum-weight codewords to zero. The spectra of the code ensemble average, of the original code and of a realization of the pre-transformed code are shown in Table III. In this case, although some rows of $H_N$ with Hamming weight 8 are selected into the information set, PC-polar codes can increase the minimum distance from 8 to 12.

TABLE III
COMPARISON BETWEEN THE SPECTRUM OF THE ENSEMBLE AVERAGE, THE ORIGINAL CODE AND A REALIZATION OF THE PRE-TRANSFORMED CODE

 d    Average   Original   Pre-transformed
 8      0.5        32             0
10      0.0547      0             0
12     39.5         0            48
14     27         128            28
16   5250       57048          5228
Fig. 4 and Fig. 5 provide the BLER performance of various constructions under different list sizes, with reference to finite-length performance bounds such as the normal approximation (NA), random-coding union (RCU) and meta-converse (MC) bounds [6] [24] [25]. It is observed that reliability is the only contributing factor to decoding performance under SC decoding. Under SCL decoding with list size $L = 8$, the PC-polar code ($\alpha = 1.5$) strikes a good balance between reliability and distance, and shows the best decoding performance. When the list size is large enough, both PAC and PC-polar codes can approach the NA bound with their ML performance.

Fig. 4. BLER versus SNR (dB), $N = 128$, $K = 64$, under SC decoding and SCL decoding with $L = 8$: PW, pre-transformed PW, RM, PAC, PC ($\alpha = 1.5$) and PC ($\alpha = 3.5$) codes.

Fig. 5. BLER versus SNR (dB), $N = 128$, $K = 64$, under SCL decoding with $L = 256$: PW, pre-transformed PW, RM, PAC, PC ($\alpha = 1.5$) and PC ($\alpha = 3.5$) codes, together with the NA, RCU and MC bounds.
V. CONCLUSION
In this paper, we propose recursive formulas to efficiently calculate the average weight spectrum of pre-transformed polar codes, which include CA-polar, PC-polar and PAC codes as special cases. It is worth mentioning that our formulas work for any sub-channel selection criterion. We find that, with pre-transformation, the average number of minimum-weight codewords decreases significantly, so pre-transformed codes outperform the original RM/polar codes under ML decoding and under SCL decoding with large list sizes. Furthermore, as in the instance of PC-polar codes ($\alpha = 1.5$), the combination of a proper sub-channel selection and pre-transformation has the potential to increase the minimum code distance by eliminating minimum-weight codewords.

REFERENCES

[1] E. Arıkan, "Channel Polarization: A Method for Constructing Capacity-Achieving Codes for Symmetric Binary-Input Memoryless Channels," IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051-3073, July 2009.
[2] I. Tal and A. Vardy, "List Decoding of Polar Codes," IEEE Transactions on Information Theory, vol. 61, no. 5, pp. 2213-2226, May 2015.
[3] K. Niu and K. Chen, "CRC-Aided Decoding of Polar Codes," IEEE Communications Letters, vol. 16, no. 10, pp. 1668-1671, October 2012.
[4] H. Zhang et al., "Parity-Check Polar Coding for 5G and Beyond," 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, 2018, pp. 1-7.
[5] E. Arıkan, "From sequential decoding to channel polarization and back again," arXiv:1908.09594,
September 2019.
[6] Y. Polyanskiy, H. V. Poor and S. Verdú, "Channel Coding Rate in the Finite Blocklength Regime," IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2307-2359, May 2010.
[7] H. Yao, A. Fazeli and A. Vardy, "List Decoding of Arıkan's PAC Codes," 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 2020, pp. 443-448.
[8] B. Li, H. Zhang and J. Gu, "On Pre-transformed Polar Codes," arXiv:1912.06359, December 2019.
[9] C. Schürch, "A partial order for the synthesized channels of a polar code," 2016 IEEE International Symposium on Information Theory (ISIT), Barcelona, 2016, pp. 220-224.
[10] R. Mori and T. Tanaka, "Performance of Polar Codes with the Construction using Density Evolution," IEEE Communications Letters, vol. 13, no. 7, pp. 519-521, July 2009.
[11] P. Trifonov, "Efficient Design and Decoding of Polar Codes," IEEE Transactions on Communications, vol. 60, no. 11, pp. 3221-3227, November 2012.
[12] G. He et al., "β-Expansion: A Theoretical Framework for Fast and Recursive Construction of Polar Codes," GLOBECOM 2017 - 2017 IEEE Global Communications Conference, Singapore, 2017, pp. 1-6.
[13] B. Li, H. Shen and D. Tse, "RM-Polar Codes," arXiv:1407.5483, July 2014.
[14] L. Huang, H. Zhang, R. Li, Y. Ge and J. Wang, "Reinforcement Learning for Nested Polar Code Construction," arXiv:1904.07511, November 2019.
[15] L. Huang, H. Zhang, R. Li, Y. Ge and J. Wang, "AI Coding: Learning to Construct Error Correction Codes," IEEE Transactions on Communications, vol. 68, no. 1, pp. 26-39, January 2020.
[16] B. Li, H. Shen and D. Tse, "An Adaptive Successive Cancellation List Decoder for Polar Codes with Cyclic Redundancy Check," IEEE Communications Letters, vol. 16, no. 12, pp. 2044-2047, December 2012.
[17] Z. Liu, K. Chen, K. Niu and Z. He, "Distance spectrum analysis of polar codes," 2014 IEEE Wireless Communications and Networking Conference (WCNC), Istanbul, Turkey, 2014, pp. 490-495.
[18] M. Valipour and S. Yousefi, "On Probabilistic Weight Distribution of Polar Codes," IEEE Communications Letters, vol. 17, no. 11, pp. 2120-2123, November 2013.
[19] Q. Zhang, A. Liu and X. Pan, "An Enhanced Probabilistic Computation Method for the Weight Distribution of Polar Codes," IEEE Communications Letters, vol. 21, no. 12, pp. 2562-2565, December 2017.
[20] H. Yao, A. Fazeli and A. Vardy, "A Deterministic Algorithm for Computing the Weight Distribution of Polar Codes," arXiv:2102.07362, February 2021.
[21] K. Niu, Y. Li and W. Wu, "Polar Codes: Analysis and Construction Based on Polar Spectrum," arXiv:1908.05889, November 2019.
[22] R. Polyanskaya, M. Davletshin and N. Polyanskii, "Weight Distributions for Successive Cancellation Decoding of Polar Codes," IEEE Transactions on Communications, vol. 68, no. 12, pp. 7328-7336, December 2020.
[23] P. Trifonov and V. Miloslavskaya, "Polar codes with dynamic frozen symbols and their decoding by directed search," Proc. IEEE Information Theory Workshop, Sevilla, Spain, September 2013, pp. 1-5.
[24] J. Font-Segura, G. Vazquez-Vilar, A. Martinez, A. Guillén i Fàbregas and A. Lancho, "Saddlepoint approximations of lower and upper bounds to the error probability in channel coding," 2018 52nd Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, 2018, pp. 1-6.
[25] G. Vazquez-Vilar, A. Guillén i Fàbregas, T. Koch and A. Lancho, "Saddlepoint Approximation of the Error Probability of Binary Hypothesis Testing," 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, 2018, pp. 2306-2310.

APPENDIX
A. Proof of Theorem 1

Proof.
A trivial examination proves the correctness of the boundary conditions. Let us focus on deriving the recursive formula.
Case 1: $1 \le i \le 2^{m-1}$.
$$g_{2^m}^{(i)} = h_{2^m}^{(i)} \oplus \bigoplus_{j=i+1}^{2^{m-1}} T_{ij}\, h_{2^m}^{(j)} \oplus \bigoplus_{j=2^{m-1}+1}^{2^m} T_{ij}\, h_{2^m}^{(j)}$$
Let
$$h_{2^m}^{(i)} \oplus \bigoplus_{j=i+1}^{2^{m-1}} T_{ij}\, h_{2^m}^{(j)} \triangleq [X, 0], \qquad \bigoplus_{j=2^{m-1}+1}^{2^m} T_{ij}\, h_{2^m}^{(j)} \triangleq [Y, Y],$$
where $0$ is an all-zero row vector of length $2^{m-1}$, $X = (x_1, \ldots, x_{2^{m-1}})$ and $Y = (y_1, \ldots, y_{2^{m-1}})$. Apparently, $X$ and $Y$ are independent, and $\forall a = (a_1, \ldots, a_{2^{m-1}}) \in \{0,1\}^{2^{m-1}}$, $P(Y = a) = 2^{-2^{m-1}}$. Let $w(X) = d_1$, $w(Y) = d_2$, and let $c$ be the number of positions where $X$ and $Y$ are both 1. We have
$$w\big(g_{2^m}^{(i)}\big) = w([X \oplus Y, Y]) = w(X \oplus Y) + w(Y) = d_1 + 2 d_2 - 2c.$$
Because $d_2 \ge c$ and $d_1 \ge w\big(h_{2^m}^{(i)}\big)$ [9, Corollary 1], the equation $w\big(g_{2^m}^{(i)}\big) = d_1 + 2 d_2 - 2c = w\big(h_{2^m}^{(i)}\big)$ holds if and only if $d_1 = w\big(h_{2^m}^{(i)}\big)$ and $d_2 = c$. In fact, $d_2 = c$ implies that if $x_i = 0$ then $y_i = 0$, $1 \le i \le 2^{m-1}$. Let $\{i_1, \ldots, i_{d_1}\}$ denote the $d_1$ locations where $x_{i_1} = \cdots = x_{i_{d_1}} = 1$; hence the recursive formula is
$$\begin{aligned}
P(m, i) &= P\big(d_1 = w(h_{2^m}^{(i)})\big) \cdot P\big(d_2 = c \,\big|\, d_1 = w(h_{2^m}^{(i)})\big) \\
&= P(m-1, i) \cdot P\big( (y_{i_1}, \ldots, y_{i_{d_1}}) \in \{0,1\}^{d_1},\; y_i = 0 \text{ otherwise} \big) \\
&= P(m-1, i) \cdot \frac{2^{d_1}}{2^{2^{m-1}}} = P(m-1, i) \cdot \frac{2^{w(h_{2^m}^{(i)})}}{2^{2^{m-1}}}.
\end{aligned}$$

Case 2: $2^{m-1} < i \le 2^m$.
$$\begin{aligned}
g_{2^m}^{(i)} &= h_{2^m}^{(i)} \oplus \bigoplus_{j=i+1}^{2^m} T_{ij}\, h_{2^m}^{(j)} \\
&= \Big[ h_{2^{m-1}}^{(i-2^{m-1})} \oplus \bigoplus_{j=i+1}^{2^m} T_{ij}\, h_{2^{m-1}}^{(j-2^{m-1})},\;\; h_{2^{m-1}}^{(i-2^{m-1})} \oplus \bigoplus_{j=i+1}^{2^m} T_{ij}\, h_{2^{m-1}}^{(j-2^{m-1})} \Big] \\
&\backsim \big[\, g_{2^{m-1}}^{(i-2^{m-1})},\, g_{2^{m-1}}^{(i-2^{m-1})} \,\big] \quad (11)
\end{aligned}$$
where $X_1 \backsim X_2$ means that $X_1$ and $X_2$ have the same distribution. $\blacksquare$

B. Proof of Theorem 2

Proof. (10) is obtained from the observation that $w\big(h_N^{(1)}\big)$ is odd and, $\forall i > 1$, $w\big(h_N^{(i)}\big)$ is even.

Case 1: $1 \le i \le 2^{m-1}$. Similar to the proof of
Theorem 1, let $w(X) = d_1$, $w(Y) = d_2$, and let $c$ be the number of positions where $X$ and $Y$ are both 1. Denote by $V = \{v_1, \ldots, v_c\}$ the set of positions where $X$ and $Y$ are both 1, and by $V^c$ its complement. Let $Y_{V^c}$ denote the corresponding subvector of $Y$; we have $w(Y_{V^c}) = d_2 - c$. Because
$$w\big(g_{2^m}^{(i)}\big) = w([X \oplus Y, Y]) = w(X \oplus Y) + w(Y) = d_1 + 2 d_2 - 2c = d,$$
we get $d_2 - c = (d - d_1)/2$, so $d - d_1$ must be even. No matter what $c$ is, the equation is satisfied if and only if $w(Y_{V^c}) = (d - d_1)/2$. Based on the above observations, $P(m, i, d)$ can be formulated as
$$\begin{aligned}
P(m, i, d) &= \sum_{\substack{d' = w(h_{2^m}^{(i)}) \\ d - d' \text{ even}}}^{d} P\big(m, i, d \,\big|\, w(X) = d'\big) \cdot P\big(w(X) = d'\big) \\
&= \sum_{\substack{d' = w(h_{2^m}^{(i)}) \\ d - d' \text{ even}}}^{d} P\big(m, i, d \,\big|\, w(X) = d'\big) \cdot P(m-1, i, d').
\end{aligned}$$
The last equality holds because $X \backsim g_{2^{m-1}}^{(i)}$. In particular,
$$P\big(m, i, d \,\big|\, w(X) = d'\big) = P\Big( w(Y_{V^c}) = \frac{d - d'}{2} \Big) = \frac{2^{d'} \binom{2^{m-1} - d'}{(d-d')/2}}{2^{2^{m-1}}}.$$
Consequently, the recursive formula is
$$P(m, i, d) = \sum_{\substack{d' = w(h_{2^m}^{(i)}) \\ d - d' \text{ even}}}^{d} P(m-1, i, d')\; \frac{2^{d'} \binom{2^{m-1} - d'}{(d-d')/2}}{2^{2^{m-1}}}.$$

Case 2: $2^{m-1} < i \le 2^m$. According to (11),
$$g_{2^m}^{(i)} \backsim \big[\, g_{2^{m-1}}^{(i-2^{m-1})},\, g_{2^{m-1}}^{(i-2^{m-1})} \,\big],$$
so it is straightforward to obtain the recursive formula
$$P(m, i, d) = P\big(m-1,\, i - 2^{m-1},\, d/2\big), \quad d \text{ even}. \qquad \blacksquare$$