Impact of redundant checks on the LP decoding thresholds of LDPC codes
arXiv [cs.IT]
Louay Bazzi †, Hani Audah ‡
August 13, 2018
Abstract
Feldman et al. (2005) asked whether the performance of the Linear Programming (LP) decoder can be improved by adding redundant parity checks to tighten the LP relaxation. We prove in this paper that for LDPC codes, even if we include all redundant parity checks, asymptotically there is no gain in the LP decoder threshold on the Binary Symmetric Channel (BSC) under certain conditions on the base Tanner graph. First, we show that if the Tanner graph has bounded check-degree and satisfies a condition which we call asymptotic strength, then including high degree redundant parity checks in the LP does not significantly improve the threshold of the LP decoder in the following sense: for each constant $\delta > 0$, there is a constant $k > 0$ such that the threshold of the LP decoder containing all redundant checks of degree at most $k$ improves by at most $\delta$ upon adding to the LP all redundant checks of degree larger than $k$. We conclude that if the graph satisfies an additional condition which we call rigidity, then including all redundant checks does not improve the threshold of the base LP. We call the graph asymptotically strong if the LP decoder corrects a constant fraction of errors even if the log-likelihood-ratios of the correct variables are arbitrarily small. By building on a construction due to Feldman et al. (2007) and its recent improvement by Viderman (2013), we show that asymptotic strength follows from sufficiently large variable-to-check expansion. We also give a geometric interpretation of asymptotic strength in terms of pseudocodewords. We call the graph rigid if the minimum weight of a sum of check nodes involving a cycle tends to infinity as the block length tends to infinity. Under the assumptions that the graph girth is logarithmic and the minimum check degree is at least 3, rigidity is equivalent to the nondegeneracy property that adding at least logarithmically many checks does not give a constant weight check. We argue that nondegeneracy is a typical property of random check-regular Tanner graphs.

∗ Research supported by FEA URB grant, American University of Beirut.
† Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon. E-mail: [email protected].
‡ Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon. E-mail: [email protected]

1 Introduction
A Low Density Parity Check (LDPC) code is a linear code whose parity check matrix is sparse. LDPC codes were discovered by Gallager [Gal62] in 1962, who used the sparsity of the parity check matrix to design various iterative decoding algorithms with good performance. The parity check matrix of an LDPC code is represented by a bipartite graph, called a Tanner graph [Tan81], between a set of variable nodes and a set of check nodes. The past two decades saw a growing number of research results related to LDPC codes and their iterative decoding algorithms (see [RU08] for a comprehensive account). Graph properties such as good girth [Gal62, Tan81] and expansion [SS96] play a central role in designing good LDPC codes with efficient iterative decoding algorithms.

Linear Programming (LP) decoding of linear codes was introduced by Feldman et al. [Fel03, FWK05] as a good-performance low-complexity relaxation of Maximum Likelihood (ML) decoding. In the past decade, the good performance of LP decoding of LDPC codes was established in a sequence of papers which led again to good girth and expansion as desirable properties of the underlying Tanner graph. The LP decoder corrects a constant fraction of errors if the graph has sufficiently large expansion [FMS+07, DDKW08, Vid13]. Moreover, the LP decoder of certain expander codes achieves the capacity of a wide class of binary-input memoryless symmetric channels [FS05]. Lower bounds on the LP decoding thresholds of LDPC codes were obtained in [KV06, ADS12] under the assumption that the graph has logarithmic girth, and upper bounds were obtained in [VK06]. The LP decoding polytope was independently discovered by Koetter and Vontobel [KV03] in the context of graph covers of Tanner graphs and iterative decoding algorithms. The link between LP decoding and iterative decoding algorithms, in particular the min-sum algorithm, was further investigated in [VK04, ADS12].

Feldman et al. [Fel03, FWK05] asked whether the performance of the LP decoder can be improved by tightening the LP relaxation. Namely, they proposed two natural approaches to tighten the LP: (1) adding redundant parity checks and (2) lifting techniques. Another tightening technique based on merging nodes was explored by Burshtein and Goldenberg [BG11].

This paper is about the first approach. Including redundant parity checks does not affect the code but adds new constraints to the LP. The problem of appropriately selecting redundant checks to be added to the LP without sacrificing its efficiency was investigated in [TS08, MWT09]. Even though simulation results suggest that redundant checks improve the LP decoder performance [FWK05, TS08, MWT09], we argue in this paper that asymptotically there is no gain in terms of the LP decoder threshold on the BSC even if we add all redundant checks, assuming that the base Tanner graph has bounded check-degree and satisfies two natural conditions which we call asymptotic strength and rigidity. The required conditions are satisfied if, in addition to sufficiently good expansion and girth, the graph has a nondegeneracy property, which holds with high probability for random check-regular graphs.

As for the lifting techniques, a recent result of Ghazi and Lee [GL14] shows that extensions of the LP decoder based on Sherali-Adams and Lasserre hierarchies do not significantly improve the error correction capabilities of the LP decoder if the graph is a good expander.

The common theme between our result and the result of [GL14] is that if the base LP has "certain desirable or typical properties" then it is "hard to make it asymptotically better". Related to this theme is the other extreme of geometrically perfect codes, which are by definition codes for which the LP resulting from adding all redundant checks is equivalent to ML decoding (see Section 1.2); such codes are asymptotically bad by a recent result due to Kashyap [Kas08].

On the positive side, our negative results suggest studying the LP decoding limits in the framework of the dual code containing all redundant check nodes.
This framework is appealing since it is independent of a particular Tanner graph representation of the code.

The proof of our main result is based on a careful analysis of the dual LP. We use the dual witness and hyperflow structures [FMS+07, DDKW08] and the fact that the existence of such structures is necessary for LP decoding success [BGU14]. We also use the notion of acyclic hyperflows and the LP excess technique [BGU14]. To establish the relation between asymptotic strength and expansion, we build on the dual witness construction in [FMS+07, Vid13]. Our probabilistic analysis of nondegeneracy is based on the work of Calkin [Cal97].

In the remainder of this introductory section, we give background material on Tanner graphs, redundant checks and LP decoding. Then, we formally state our results in Section 1.3 and we give a detailed outline of the rest of the paper in Section 1.4.

A Tanner graph $G = (V, C, E)$ is an undirected bipartite graph between a set $V$ of variable nodes and a set $C$ of check nodes, where $E$ is the set of edges. If $i \in V$ is a variable node, we denote by $N(i)$ the check neighborhood of $i$, i.e., the set of check nodes adjacent to $i$. Similarly, if $j \in C$ is a check node, $N(j)$ is the set of variable nodes adjacent to $j$. Unless otherwise specified, we assume throughout the paper that $V = \{1, \ldots, n\}$, where $n \ge 1$ is the block length. We assume also that the degree of each check node is at least one. The linear code $Q = Q_G$ associated with $G$ is the $\mathbb{F}_2$-linear code $Q \subset \mathbb{F}_2^n$ whose parity check matrix is the biadjacency matrix of $G$. That is, $Q$ is the set of all binary strings $x \in \mathbb{F}_2^n$ such that $\sum_{i \in N(j)} x_i = 0$ for each $j \in C$.

Given a Tanner graph $G = (V, C, E)$, the Tanner graph of all redundant checks $\overline{G}$ associated with $G$ is defined as follows. A redundant check of $G$ is a nonzero $\mathbb{F}_2$-linear combination of checks of $G$; thus the redundant checks are in one-to-one correspondence with the nonzero elements of the dual code $Q^\perp$.
The graph $\overline{G}$ is obtained from $G$ by adding all redundant checks to $G$. That is, $\overline{G} = (V, \overline{C}, \overline{E})$, where $\overline{C} = Q^\perp - \{0\}$ and $i \in V$ is connected to $c \in Q^\perp$ iff $c_i = 1$.

We are also interested in the following graded subgraphs of $\overline{G}$. Given $G = (V, C, E)$ and an integer $k$, let $\overline{G}^k$ be the Tanner graph of redundant checks of degree at most $k$. That is, $\overline{G}^k = (V, \overline{C}^k, \overline{E}^k)$ is the subgraph of $\overline{G}$ induced on $V$ and the set $\overline{C}^k$ of nonzero checks of degree at most $k$, i.e., $\overline{C}^k = \{c \neq 0 \in Q^\perp : \mathrm{weight}(c) \le k\}$. Thus, if $d$ is the maximum degree of a check node in $G$, we have the nested sequence of Tanner graphs $G \subset \overline{G}^d \subset \overline{G}^{d+1} \subset \ldots \subset \overline{G}^n = \overline{G}$, all defining the same code $Q$. Throughout this paper, we are interested in base Tanner graphs whose maximum check degree $d$ is bounded.

Let $G = (V, C, E)$ be a Tanner graph and $Q \subset \mathbb{F}_2^n$ the associated code. Consider transmitting a codeword of $Q$ over the $\epsilon$-BSC (Binary Symmetric Channel), which on input $x \in \mathbb{F}_2^n$ outputs $y \in \mathbb{F}_2^n$ by flipping each bit of $x$ independently with probability $\epsilon$. The ML (Maximum Likelihood) decoder is given by $\hat{x}_{ML} = \arg\max_{x \in Q} p_{Y|X}(y|x)$. Let $\gamma \in \mathbb{R}^n$ be the LLR (Log-Likelihood-Ratio) vector of $y$:

$$\gamma_i = \log \frac{p_{Y_i|X_i}(y_i|0)}{p_{Y_i|X_i}(y_i|1)} = (-1)^{y_i} \log \frac{1-\epsilon}{\epsilon}, \quad \text{for } i = 1, \ldots, n.$$

In terms of $\gamma$, the ML decoder is given by

$$\hat{x}_{ML} = \arg\min_{x \in Q} \langle x, \gamma \rangle, \qquad (1)$$

where $\langle x, \gamma \rangle := \sum_i x_i \gamma_i$. For general linear codes, the ML decoding problem is NP-hard [BMVT78]. Feldman et al. [Fel03, FWK05] introduced the approach of LP (Linear Programming) decoding, which is based on relaxing the optimization problem on $Q$ into an LP. Due to the linearity of the objective function $\langle x, \gamma \rangle$, optimizing over $Q$ is equivalent to optimizing over the convex polytope $\mathrm{conv}(Q) \subset \mathbb{R}^n$ spanned by the convex combinations of the codewords in $Q$:

$$\hat{x}_{ML} = \arg\min_{x \in \mathrm{conv}(Q)} \langle x, \gamma \rangle. \qquad (2)$$

The idea of Feldman is to relax $\mathrm{conv}(Q)$ into a larger lower-complexity polytope.
For each check node $j \in C$, define the local code $Q_j$ consisting of all vectors $x \in \{0,1\}^n$ satisfying check $j$; thus $Q = \bigcap_{j \in C} Q_j$. Let

$$P(G) := \bigcap_{j \in C} \mathrm{conv}(Q_j) \supset \mathrm{conv}\Big(\bigcap_{j \in C} Q_j\Big) = \mathrm{conv}(Q). \qquad (3)$$

The polytope $P(G)$ depends on the Tanner graph representation of the code and it is called the fundamental polytope of $G$. The LP decoder is the relaxation of the ML decoder given by

$$\hat{x}_{LP} = \arg\min_{x \in P(G)} \langle x, \gamma \rangle. \qquad (4)$$

The relaxed LP can be efficiently solved due to the low complexity of $P(G)$. More generally, (1) and (4) define the ML and LP decoders for an arbitrary LLR vector $\gamma \in \mathbb{R}^n$. If $\gamma$ is, as above, associated with a binary vector $y$, we ignore without loss of generality the constant $\log \frac{1-\epsilon}{\epsilon}$ and we normalize $\gamma$ so that $\gamma = (-1)^y$, i.e., $\gamma_i = (-1)^{y_i}$.

It is appropriate to mention at this stage geometrically perfect codes. A linear code $Q \subset \mathbb{F}_2^n$ is called geometrically perfect [BG86, Kas08] if the LP relaxation corresponding to the full dual code is exact, i.e., $P(\overline{G}) = \mathrm{conv}(Q)$, where $G$ is any Tanner graph of $Q$. Examples of such codes are tree codes and cycle codes. Geometrically perfect codes are classified in [BG86] based on Seymour's matroid decomposition theory [Sey80], but they are unfortunately asymptotically bad in the sense that their minimum distance does not grow linearly with the block length [Kas08].

We are interested in LP thresholds over the BSC as the block length $n$ tends to infinity. That is, we have an infinite family of Tanner graphs $\mathcal{G} = \{G_n\}_n$, where $G_n = (V_n, C_n, E_n)$ is a Tanner graph on $n$ variable nodes, i.e., $V_n = \{1, \ldots, n\}$. Define the LP-threshold $\xi_{LP}(\mathcal{G})$ of $\mathcal{G}$ to be the supremum of $\epsilon \ge 0$ such that the error probability of the LP decoder of $G_n$ over the $\epsilon$-BSC goes to zero as $n$ tends to infinity, i.e.,

$$\xi_{LP}(\mathcal{G}) = \sup\{\epsilon \ge 0 : \Pr_{\epsilon\text{-BSC}}[\text{LP decoder of } G_n \text{ fails}] = o(1)\}.$$
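As a small illustration of definition (3), the sketch below tests membership in the fundamental polytope by brute force. It relies on a standard fact about parity polytopes that is not stated in the text above: $\mathrm{conv}(Q_j)$ is cut out by the box constraints $0 \le x_i \le 1$ together with the inequalities $\sum_{i \in S} x_i - \sum_{i \in N(j) \setminus S} x_i \le |S| - 1$ for every odd-size subset $S \subseteq N(j)$. The function name and the list-of-neighborhoods representation (with 0-based variable indices) are assumptions made for this example, and the enumeration is exponential in the check degree, so it is only meant for tiny graphs.

```python
from itertools import combinations


def in_fundamental_polytope(checks, x, tol=1e-9):
    """Return True if x lies in P(G), the intersection of the conv(Q_j).

    checks: list of lists; checks[j] holds the 0-based variable indices in N(j).
    x:      list of floats, one coordinate per variable node.
    """
    # Box constraints 0 <= x_i <= 1.
    if any(xi < -tol or xi > 1 + tol for xi in x):
        return False
    for nbrs in checks:
        total = sum(x[i] for i in nbrs)
        # Odd-subset inequalities for check j:
        #   sum_{i in S} x_i - sum_{i in N(j) \ S} x_i <= |S| - 1.
        for size in range(1, len(nbrs) + 1, 2):
            for subset in combinations(nbrs, size):
                inside = sum(x[i] for i in subset)
                if inside - (total - inside) > size - 1 + tol:
                    return False
    return True


# Hypothetical toy graph: two checks on four variables.
toy_checks = [[0, 1, 2], [1, 2, 3]]
```

On this toy graph, the codewords $(0,0,0,0)$ and $(1,1,0,1)$ and their midpoint pass the test, while the non-codeword $(1,0,0,0)$ violates the inequality of the first check with $S = \{0\}$.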
As in previous work [FWK05], we assume without loss of generality that the all-zeros codeword was transmitted and that the LP decoder fails if zero is not the unique optimal solution of the LP.

Finally, given an infinite family of Tanner graphs $\mathcal{G} = \{G_n\}_n$, we are interested in the resulting family $\overline{\mathcal{G}} := \{\overline{G}_n\}_n$ of Tanner graphs obtained by adding all redundant checks. Moreover, if $k : \mathbb{N}^+ \to \mathbb{R}$, we are also interested in the family $\overline{\mathcal{G}}^k := \{\overline{G}_n^{k(n)}\}_n$ of Tanner graphs obtained by adding all redundant checks of degree at most $k$.

1.3 Summary of results

Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs of bounded check degree. We show that if $\mathcal{G}$ satisfies a condition which we call asymptotic strength, then including high degree redundant checks in the LP does not improve the threshold in the sense that for each constant $\delta > 0$, there is a constant $k > 0$ such that $\xi_{LP}(\overline{\mathcal{G}}^k) \ge \xi_{LP}(\overline{\mathcal{G}}) - \delta$. We conclude that if $\mathcal{G}$ satisfies an additional condition which we call rigidity, then including all redundant checks does not improve the threshold of the base LP in the sense that $\xi_{LP}(\overline{\mathcal{G}}) = \xi_{LP}(\mathcal{G})$. We call the graph asymptotically strong if the LP decoder corrects a constant fraction of errors even if the LLR values of the correct variables are arbitrarily small. We show that the asymptotic strength condition follows from expansion. We call the graph rigid if the minimum weight of a sum of check nodes involving a cycle tends to infinity as the block length tends to infinity. We note that under the assumptions that the girth of $G_n$ is $\Theta(\log n)$ and the minimum check degree is at least 3, rigidity is equivalent to the property that adding $\Omega(\log n)$ checks does not give $O(1)$ weight checks, which we argue is a typical property of random check-regular Tanner graphs.

Definition 1.1 (Asymptotically strong Tanner graphs)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs. We call $\mathcal{G}$ asymptotically strong if for each (small) constant $\beta > 0$, there exists a constant $\alpha > 0$ such that for each $n$ and each error vector $y \in \{0,1\}^n$ of weight at most $\alpha n$, the LP decoder of $G_n$ succeeds on the asymmetric LLR vector $\gamma(y, \beta) \in \mathbb{R}^n$ given by

$$\gamma_i(y, \beta) = \begin{cases} -1 & \text{if } y_i = 1 \\ \beta & \text{if } y_i = 0 \end{cases}, \quad \text{for } i = 1, \ldots, n.$$

Although asymmetry in the LLR vector might seem unnatural at this point, we start with this definition of asymptotic strength because it gives flexibility in the analysis. We give later an equivalent definition in terms of pseudocodewords (Theorem 8.1).
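The asymmetric LLR vector of Definition 1.1 is easy to write down explicitly. The following minimal sketch (the function names are hypothetical, chosen for this example) builds $\gamma(y, \beta)$ next to the normalized BSC vector $\gamma = (-1)^y$, which makes it visible that the two coincide at $\beta = 1$ and that shrinking $\beta$ only lowers the entries of the correct variables.

```python
def asymmetric_llr(y, beta):
    """Asymmetric LLR vector gamma(y, beta) of Definition 1.1.

    y:    list of 0/1 values (the error pattern, assuming the all-zeros
          codeword was transmitted).
    beta: positive constant assigned to the correct variables.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    # Variables in error get -1, correct variables get beta.
    return [-1.0 if yi == 1 else float(beta) for yi in y]


def normalized_bsc_llr(y):
    """Normalized BSC LLR vector gamma = (-1)^y."""
    return [1.0 - 2.0 * yi for yi in y]
```

For instance, `asymmetric_llr([0, 1, 0], 0.1)` gives `[0.1, -1.0, 0.1]`, so the single variable in error keeps its full weight while the correct variables contribute only $0.1$ each to the LP objective.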
Theorem 1.2 (High degree redundant checks do not improve LP threshold)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs such that each check node has degree at most $d$, where $d$ is a constant. Assume that $\mathcal{G}$ is asymptotically strong. Then:
a) For any small constant $\delta > 0$, there exists a sufficiently large constant $k \ge d$ (dependent on $\delta$ and independent of $n$) such that $\xi_{LP}(\overline{\mathcal{G}}^k) \ge \xi_{LP}(\overline{\mathcal{G}}) - \delta$.
b) If $k(n)$ is a real valued function of $n$ such that $k(n) = \omega(1)$ (i.e., $k(n)$ tends to infinity as $n$ tends to infinity), then $\xi_{LP}(\overline{\mathcal{G}}^k) = \xi_{LP}(\overline{\mathcal{G}})$.

The proof of Theorem 1.2 uses the LP excess lemma [BGU14] and the notion of primitive hyperflows which we define at the end of this section.

Feldman et al. [FMS+07] argued that expansion implies that the LP decoder corrects a positive fraction of errors. The link between the expansion of a Tanner graph and the error correction capabilities of the underlying code was discovered by Sipser and Spielman [SS96] in the context of iterative decoding algorithms. Recently, Viderman [Vid13] simplified the argument of [FMS+07] and improved its dependency on the expansion parameter. By building on the construction in [FMS+07, Vid13], we show that graphs with good expansion are asymptotically strong.

A Tanner graph $G = (V, C, E)$ is called an $(\varepsilon n, \kappa)$-expander if for each subset $S \subset V$ of variable nodes of size at most $\varepsilon n$, we have $|N(S)| \ge \kappa |S|$, where $N(S)$ is the set of (check) nodes adjacent to $S$.

Theorem 1.3 (Expansion implies asymptotic strength)
Let d_v > , ε > 0 and δ > be constants such that $d_v$ is an integer and $\delta d_v$ is an integer. Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs with regular variable degree $d_v$ and bounded check degree. If $G_n$ is an $(\varepsilon n, \delta d_v)$-expander for each $n$, then $\mathcal{G}$ is asymptotically strong.

It is known that redundant check nodes obtained by acyclic sums of check nodes do not tighten the polytope [Fel03, VK05, BG11], which motivates the following definitions.
Definition 1.4 (Cyclic sums of checks)
Let $G = (V, C, E)$ be a Tanner graph. We call a subset of check nodes $S \subset C$ cyclic if the graph induced by $G$ on $S$ contains a cycle. Define $\Delta(G)$ to be the minimum weight of the sum of a cyclic subset of check nodes of $G$. More formally, let $Q \subset \mathbb{F}_2^n$ be the code associated with $G$. For each check $j \in C$, let $z_j \in Q^\perp$ be the vector in the dual code associated with $j$. Then

$$\Delta(G) := \min\Big\{\mathrm{weight}\Big(\sum_{j \in S} z_j\Big) : S \subset C \text{ cyclic}\Big\}.$$

Definition 1.5 (Rigid Tanner graphs)
We call an infinite family $\mathcal{G} = \{G_n\}_n$ of Tanner graphs rigid if the minimum weight of a sum of check nodes involving a cycle tends to infinity as the block length tends to infinity. More formally, $\mathcal{G}$ is rigid if $\Delta(G_n) = \omega(1)$.

Remark 1.6 If $\mathcal{G}$ is rigid, then the check nodes of $G_n$ are linearly independent for sufficiently large $n$ (since any subset of check nodes whose sum is zero must be cyclic).

Accordingly, we obtain the following corollary to Theorem 1.2.

Corollary 1.7 (Redundant checks do not improve LP threshold)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs of bounded check degree. If $\mathcal{G}$ is asymptotically strong and rigid, then $\xi_{LP}(\overline{\mathcal{G}}) = \xi_{LP}(\mathcal{G})$.

It is not hard to see that $\omega(1)$-girth is a necessary condition for rigidity. Unfortunately, random graphs have $O(1)$-girth, thus they are not necessarily rigid. In general, $\Theta(\log n)$-girth is a desirable property of a Tanner graph in the context of LP decoding [FWK05] and iterative decoding [Gal62, Tan81]. Random graphs with good girth are typically constructed by breaking the cycles of a random graph. We note that for graphs with $\Theta(\log n)$-girth and minimum check degree at least 3, rigidity is equivalent to a simpler nondegeneracy condition which we define below.

Definition 1.8 (Nondegeneracy)
Call an $m \times n$ matrix $M \in \mathbb{F}_2^{m \times n}$ $(s, k)$-nondegenerate if the sum of any subset of at least $s$ rows of $M$ has weight larger than $k$. We call a Tanner graph $G$ $(s, k)$-nondegenerate if its $m \times n$ biadjacency matrix is $(s, k)$-nondegenerate, where $m$ is the number of check nodes and $n$ is the number of variable nodes. For instance, full row rank corresponds to $(1, 0)$-nondegeneracy.

Lemma 1.9 (Rigidity versus girth and nondegeneracy)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs of bounded check degree. If $\mathcal{G}$ is rigid, then $\mathrm{girth}(G_n) = \omega(1)$. On the other hand, if $\mathrm{girth}(G_n) = \Theta(\log n)$ and the minimum check degree of $\mathcal{G}$ is at least 3 (i.e., for all $n$, each check node of $G_n$ has degree at least 3), then the following are equivalent:
i) (Rigidity) $\mathcal{G}$ is rigid.
ii) (Nondegeneracy) For each constant $c > 0$, $G_n$ is $(c \log n, \omega(1))$-nondegenerate. That is, for each constant $c > 0$, the minimum weight of a sum of at least $c \log n$ check nodes tends to infinity as $n$ increases.

We argue that nondegeneracy is a typical property of random check-regular Tanner graphs. Namely, we show that random check-regular graphs are $(c \log n, \omega(1))$-nondegenerate with high probability if $m \le \beta_d n$, where $d$ is the check degree and $\beta_d$ is Calkin's threshold as given in Definition 7.1 ($\beta_d$ is a threshold close to 1, e.g., $\beta_4 \sim 0.967$).

Lemma 1.10 (Random check-regular graphs are nondegenerate)
Let $d$, $m$ and $n$ be integers such that $d \ge 3$ and $1 \le m < \beta_d n$. Consider a random $m \times n$ matrix $M \in \mathbb{F}_2^{m \times n}$ constructed by independently choosing each of the $m$ rows of $M$ uniformly from the set of vectors in $\mathbb{F}_2^n$ of weight $d$. Then for any constant $c > 0$ and any function $k(n)$ of $n$ such that $k(n) = o(\log \log n)$, $M$ is $(c \log n, k(n))$-nondegenerate with high probability.

We establish the claim by adapting an argument used by Calkin [Cal97] to show that if $m < \beta_d n$, then $M$ has full row rank with high probability. The ensemble of random check-regular graphs is attractive from a probabilistic analysis standpoint, but it typically gives irregular graphs with constant girth. We believe that good girth and variable-regularity do not increase the odds of degeneracy; we conjecture that the statement of Lemma 1.10 extends to the ensemble of regular $\Theta(\log n)$-girth Tanner graphs (see Section 10).

We also prove the following general results about LP decoding which might be of independent interest:
• (Primitive hyperflows) We give a simple necessary and sufficient condition for the success of LP decoding when all redundant checks are included in the LP. The condition is in terms of the existence of a hyperflow (see Definition 2.1) which is primitive in the sense that all the variables in error have zero outflow and all the correct variables have zero inflow (Theorem 4.2). This characterization is essential to the proof of Theorem 1.2.
• (Pseudocodewords interpretation of asymptotic strength) We note that the notion of asymptotic strength has the following geometric interpretation in terms of pseudocodewords: $\mathcal{G} = \{G_n\}_n$ is asymptotically strong iff for each nonzero pseudocodeword $x \in P(G_n)$, to attain a positive fraction of $\sum_i x_i$, we need at least a linear number of coordinates of $x$. That is, for each $\theta > 0$, there exists $\alpha > 0$ such that for each $n$ and each nonzero pseudocodeword $x \in P(G_n)$, the sum of the largest $\lfloor \alpha n \rfloor$ coordinates of $x$ is less than $\theta \sum_i x_i$ (Theorem 8.1).
• (Asymptotic strength and LP decoding with help) Assume that we are allowed to flip at most a certain number of bits of the corrupted codeword to help the LP decoder on the BSC. We argue that if the Tanner graph is asymptotically strong, allowing a sublinear number of help bits does not improve the LP threshold (Theorem 9.2). This result, although a negative statement, has potential constructive applications as it weakens the dual witness requirement for LP decoding success.
• (LP deficiency lemma) We give a converse of the LP excess lemma [BGU14]. Namely, we show how to trade LP-deficiency for crossover probability (Lemma 9.3) and we use the LP deficiency lemma to establish the above result on LP decoding with help.
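The nondegeneracy condition of Definition 1.8 can be checked directly on small matrices. The sketch below is a brute-force test, exponential in the number of rows and so only suitable for toy examples; the function name and the representation of each row by its support set are assumptions made for this illustration, not part of the paper.

```python
from itertools import combinations


def is_nondegenerate(rows, s, k):
    """Brute-force test of (s, k)-nondegeneracy (Definition 1.8).

    rows: list of sets; rows[j] is the support of row j of M over F_2.
    Returns True if every sum of at least s rows has weight larger than k.
    """
    m = len(rows)
    for size in range(max(s, 1), m + 1):
        for subset in combinations(rows, size):
            support = set()
            for row in subset:
                # Adding rows over F_2 is the symmetric difference of supports.
                support ^= row
            if len(support) <= k:
                return False
    return True


# Hypothetical toy matrices given by their row supports.
independent_rows = [{0, 1, 2}, {1, 2, 3}]
dependent_rows = [{0, 1, 2}, {1, 2, 3}, {0, 3}]
```

In `dependent_rows` the three rows sum to zero, so the matrix fails $(1, 0)$-nondegeneracy, in line with the remark that full row rank corresponds to $(1, 0)$-nondegeneracy. In `independent_rows` the two rows sum to a vector of weight 2, so the matrix is $(2, 1)$-nondegenerate but not $(2, 2)$-nondegenerate.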
In Section 2, we give background material on graph structures whose existence is necessary and sufficient for LP decoding success: dual witness, hyperflows and acyclic hyperflows. To warm up, we highlight in Section 3 a simple classical argument, which shows that high density codes have zero thresholds on the BSC. The key starting point of our proof is the above-mentioned special type of hyperflows called primitive hyperflows. We define primitive hyperflows in Section 4 and we argue that their existence is sufficient for LP decoding success when all redundant checks are included in the LP. In Section 5, we show that for asymptotically strong codes with bounded check degree, high degree checks do not improve the threshold (Theorem 1.2). Then we conclude that adding all redundant checks does not improve the threshold if the graph is additionally rigid (Corollary 1.7). In Section 6, we study the relation between expansion and asymptotic strength (Theorem 1.3). In Section 7, we study the rigidity and the related nondegeneracy properties (Lemmas 1.9 and 1.10). In Section 8, we give the above-mentioned pseudocodewords interpretation of asymptotic strength. In Section 9, we give an application of asymptotically strong codes in the context of the above-mentioned problem of LP decoding with help bits. Finally, we conclude in Section 10 with a discussion of the asymptotic strength condition, the rigidity condition and the limits of LP decoding on the BSC.

In this section we summarize various dual characterizations of LP decoding success that will be used in this paper. The notion of dual witness was introduced in [FMS+07] as a sufficient condition for LP decoding success. The necessity of the existence of a dual witness for LP decoding success was established in [BGU14]. A special type of dual witnesses called hyperflows was introduced in [FMS+07, DDKW08]. The equivalence between the existence of a hyperflow and the existence of a dual witness was established in [DDKW08]. The notion of a hyperflow was further simplified in [BGU14], who argued that the existence of an acyclic hyperflow is equivalent to the existence of a hyperflow.
Definition 2.1 ([FMS+07, DDKW08]) (Dual witness, Hyperflow, and WDG)
Consider a Tanner graph $G = (V, C, E)$ and an LLR vector $\gamma \in \mathbb{R}^V$. A dual witness for $\gamma$ in $G$ is a function $w : E \to \mathbb{R}$ satisfying the inequalities in (a) and (b) below.
a) Variable nodes inequalities: $F_i(w) < \gamma_i$, for each variable $i \in V$, where $F(w) \in \mathbb{R}^V$ is given by $F_i(w) := \sum_{j \in N(i)} w(i, j)$. We call $F_i(w)$ the flow at variable node $i$ associated with $w$.
b) Check nodes inequalities: for each check $j \in C$ and all distinct variables $i \neq i' \in N(j)$, $w(i, j) + w(i', j) \ge 0$.

A dual witness $w : E \to \mathbb{R}$ is called a hyperflow if, instead of (b), it satisfies the following stronger check nodes inequalities.
c) Hyperflow check nodes inequalities: for each check $j \in C$, there exists $P_j \ge 0$ and a variable $i \in N(j)$ such that $w(i, j) = -P_j$ and $w(i', j) = P_j$, for all $i' \neq i \in N(j)$.

A dual witness or a hyperflow $w$ can be viewed as a weighted directed graph (WDG) $D$ on the vertices $V \cup C$, where an arrow is directed from $i$ to $j$ if $w(i, j) > 0$, an arrow is directed from $j$ to $i$ if $w(i, j) < 0$, and $i$ and $j$ are not connected by an arrow if $w(i, j) = 0$. The weight of each directed edge connecting $i \in V$ and $j \in C$ is $|w(i, j)|$. Thus, in terms of $D$, the variable nodes inequalities in (a) can be rephrased as follows.
d) WDG variable nodes inequalities: $F^{out}_i(w) < F^{in}_i(w) + \gamma_i$, for each variable $i \in V$, where $F^{out}(w), F^{in}(w) \in \mathbb{R}^V$ are defined as follows.
• $F^{out}_i(w) := \sum_{j \in Out_D(i)} |w(i, j)|$, where $Out_D(i)$ is the set of check nodes incident to edges outgoing from $i$.
• $F^{in}_i(w) := \sum_{j \in In_D(i)} |w(j, i)|$, where $In_D(i)$ is the set of check nodes incident to edges ingoing to $i$.
We call $F^{out}_i(w)$ the outflow from variable node $i$ associated with $w$ and $F^{in}_i(w)$ the inflow to variable node $i$ associated with $w$.

We summarize in the following theorem various equivalent characterizations of LP decoding success.
Theorem 2.2 ([FMS+07, DDKW08, BGU14]) (Equivalent characterizations of LP decoding success)
Let $G = (V, C, E)$ be a Tanner graph and $\gamma \in \mathbb{R}^V$ an LLR vector. Then the following are equivalent:
i) The LP decoder of $G$ succeeds on $\gamma$ (i.e., it returns zero as the unique solution under the assumption that the all-zeros codeword was transmitted).
ii) There is a dual witness for $\gamma$ in $G$.
iii) There is a hyperflow for $\gamma$ in $G$.
iv) There is a hyperflow for $\gamma$ in $G$ whose WDG is acyclic.

Remark 2.3
The fact that (ii) implies (i) follows from [FMS+07], the fact that (i) implies (ii) follows from Theorem 3.2 and Remark 3.3 in [BGU14], the equivalence between (ii) and (iii) follows from Proposition 1 in [DDKW08] and the equivalence between (ii) and (iv) follows from Theorem 3.7 in [BGU14]. Note that the statement of Theorem 3.7 in [BGU14] assumes that $\gamma$ is an LLR vector of a binary error pattern (i.e., $\gamma \in \{-1, 1\}^V$), but its proof holds for an arbitrary LLR vector $\gamma \in \mathbb{R}^V$.

3 High density codes
In this section, we highlight a simple classical argument which shows that high density codes have zero thresholds on the BSC. A statement similar to Lemma 3.1 below appears in Corollary 7 of [VK06] in the context of regular Tanner graphs (with a different but also simple proof). Although not used in the proofs of the results in this paper, we include this lemma since from a broad perspective it is related to the statement of Theorem 1.2, which says that high degree redundant checks are not helpful if the code is asymptotically strong. Unfortunately, the simple proof of Lemma 3.1 does not extend to the setup of high degree redundant checks.
Lemma 3.1 (High density codes)
Let $G = (V, C, E)$ be a Tanner graph such that the minimum degree of a check node is $d_{min}$. Then the LP decoder of $G$ fails if the number of errors introduced by the BSC is at least $n/d_{min}$. Thus, if $\mathcal{G}$ is an infinite family of Tanner graphs such that the minimum degree of a check node in $G_n$ is $\omega(1)$, then the threshold $\xi_{LP}(\mathcal{G}) = 0$.

Proof: Assume that the all-zeros codeword was transmitted and let $y \in \{0,1\}^n$ be the received vector. If the LP decoder of $G$ correctly decodes $y$, then by Theorem 2.2, $\gamma = (-1)^y$ has a hyperflow $w : E \to \mathbb{R}$. Consider the WDG $D$ corresponding to $w$ and let $U = \{i : y_i = 1\}$ be the set of variables in error. If $S \subset V$, let $F^{in}(w; S) := \sum_{i \in S} F^{in}_i(w)$ be the total inflow to $S$ and $F^{out}(w; S) := \sum_{i \in S} F^{out}_i(w)$ be the total outflow from $S$. Summing the variable nodes inequalities $F^{out}_i(w) < F^{in}_i(w) + \gamma_i$ over all $i \in V$, we get $F^{out}(w; V) < F^{in}(w; V) + |U^c| - |U|$, i.e.,

$$F^{out}(w; V) < F^{in}(w; V) + n - 2|U|. \qquad (5)$$

Summing the variable nodes inequalities over all $i \in U$, we get $F^{out}(w; U) < F^{in}(w; U) - |U|$. Since $F^{out}(w; U) \ge 0$ and $F^{in}(w; U) \le F^{in}(w; V)$, we obtain

$$|U| < F^{in}(w; V). \qquad (6)$$

Finally, the hyperflow check nodes inequalities ((c) in Definition 2.1) imply that

$$(d_{min} - 1) F^{in}(w; V) \le F^{out}(w; V). \qquad (7)$$

Solving for $|U|$ in (5), (6) and (7), we obtain $|U| < n/d_{min}$. Indeed, combining (5) and (7) gives $(d_{min} - 2) F^{in}(w; V) < n - 2|U|$, and for $d_{min} \ge 2$ the bound (6) then yields $(d_{min} - 2)|U| \le (d_{min} - 2) F^{in}(w; V) < n - 2|U|$, i.e., $d_{min} |U| < n$. ∎

We give in this section a simple necessary and sufficient condition for the success of LP decoding when all redundant checks are included in the LP. The condition is in terms of the existence of a primitive hyperflow, which we define as a hyperflow such that all the variables in error have zero outflow and all the correct variables have zero inflow. Primitive hyperflows are central to the proof of Theorem 1.2.
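The hyperflow conditions of Definition 2.1 and the primitivity condition just described can be checked mechanically on small examples. The sketch below is an illustration only: it assumes a hyperflow is stored as one pair $(P_j, \text{sink}_j)$ per check, where $\text{sink}_j \in N(j)$ is the variable receiving flow $P_j$ from check $j$ (so $w(\text{sink}_j, j) = -P_j$) and every other neighbor sends $P_j$ to $j$. The function names and this representation are hypothetical, and variable indices are 0-based.

```python
def variable_flows(n, checks, hyperflow):
    """Return (f_in, f_out), the inflow and outflow of each variable node.

    checks:    list of lists; checks[j] holds the variable indices in N(j).
    hyperflow: list of (P_j, sink_j) pairs, one per check, with P_j >= 0.
    """
    f_in, f_out = [0.0] * n, [0.0] * n
    for nbrs, (p, sink) in zip(checks, hyperflow):
        if p < 0 or sink not in nbrs:
            raise ValueError("need P_j >= 0 and sink_j in N(j)")
        for i in nbrs:
            if i == sink:
                f_in[i] += p   # edge j -> i of weight P_j
            else:
                f_out[i] += p  # edge i -> j of weight P_j
    return f_in, f_out


def is_hyperflow(gamma, checks, hyperflow):
    """Check the WDG variable nodes inequalities F_out < F_in + gamma."""
    f_in, f_out = variable_flows(len(gamma), checks, hyperflow)
    return all(f_out[i] < f_in[i] + gamma[i] for i in range(len(gamma)))


def is_primitive(gamma, checks, hyperflow):
    """Check primitivity: no outflow if gamma_i <= 0, no inflow if gamma_i > 0."""
    if not is_hyperflow(gamma, checks, hyperflow):
        return False
    f_in, f_out = variable_flows(len(gamma), checks, hyperflow)
    return all(
        f_out[i] == 0 if gamma[i] <= 0 else f_in[i] == 0
        for i in range(len(gamma))
    )
```

As a usage example under these assumptions, take the checks `[[0, 1, 2], [2, 3, 4]]` with a single error at variable 2, so `gamma = [1, 1, -1, 1, 1]`. The pairs `[(0.5, 2), (0.75, 2)]` form a primitive hyperflow: the variable in error receives total inflow $1.25 > 1$ and has no outflow, while each correct variable sends out less than 1 and receives nothing.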
Definition 4.1 (Primitive hyperflow)
Let $H = (V, C, E)$ be a Tanner graph, $\gamma \in \mathbb{R}^V$ an LLR vector and $w : E \to \mathbb{R}$ a hyperflow for $\gamma$ in $H$. Consider the WDG $D$ of $w$. We call $w$ a primitive hyperflow if for each variable node $i \in V$, we have:
a) If $\gamma_i \le 0$, then $i$ has no outgoing edges in $D$, i.e., $F^{out}_i(w) = 0$.
b) If $\gamma_i > 0$, then $i$ has no ingoing edges in $D$, i.e., $F^{in}_i(w) = 0$.
Note that the WDG of a primitive hyperflow is necessarily acyclic.

Lemma 4.2 (Redundant checks and primitive hyperflows)
Let $G = (V, C, E)$ be a Tanner graph and consider the associated Tanner graph $\overline{G} = (V, \overline{C}, \overline{E})$ of all redundant check nodes. Let $\gamma \in \mathbb{R}^V$ be an LLR vector. If the LP decoder of $\overline{G}$ succeeds on $\gamma$, then there is a primitive hyperflow for $\gamma$ in $\overline{G}$.

Proof: Assume that the LP decoder of $\overline{G}$ succeeds on $\gamma$. By Theorem 2.2, there exists a hyperflow $w : \overline{E} \to \mathbb{R}$ for $\gamma$ in $\overline{G}$ whose WDG $D$ is acyclic. We will make $D$ primitive by exploiting the key property of $\overline{G}$ that its check nodes are in one-to-one correspondence with the nonzero vectors in the dual $Q^\perp$ of the code $Q$ of $G$. Hence, the $\mathbb{F}_2$-sum of any two distinct check nodes in $\overline{G}$ is again a check node in $\overline{G}$. We will iteratively modify $D$ until it becomes primitive by repeated XORing of check nodes. The basic operation is the Switch operation in Algorithm 1, which, given a variable node $i \in V$ and distinct check nodes $j, j' \in \overline{C}$ such that $(j, i)$ and $(i, j')$ are edges in $D$, modifies $D$ by replacing either $j$ or $j'$ with the XOR $j''$ of $j$ and $j'$. A key property of the Switch operation is that it does not increase the indegree or the outdegree of $i$ and it decreases at least one of them. The Switch operation uses the fact that $D$ is acyclic.

Algorithm 1
Basic Switch operation
Switch $D$ along path $j \to i \to j'$
Input: variable node $i \in V$ and check nodes $j, j' \in \overline{C}$ such that $(j, i)$ and $(i, j')$ are edges in $D$
1: Let $P = \min\{|w(j, i)|, |w(i, j')|\}$
2: Decrease by $P$ the absolute weights of all the directed edges connected to $j$ or $j'$
3: Let $i'$ be the (unique) variable node such that $(j', i')$ is an edge in $D$
4: Let $j''$ be the XOR of $j$ and $j'$
5: Increase by $P$ the absolute weights of the edges $(j'', i')$ and $(i'', j'')$, $\forall\, i'' \neq i' \in N(j'')$
6: Remove all zero weight edges.

Figures 1 and 2 illustrate the Switch operation.
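To make the steps of Algorithm 1 concrete, here is a minimal sketch under a simplifying assumption that is not the paper's own data structure: the WDG is stored as a list of flow terms `(N(j), sink_j, P_j)`, where the check with neighborhood `N(j)` sends weight `P_j` to its sink variable and receives `P_j` from each other neighbor. If the XOR check already carries flow with a different sink, this sketch simply keeps both terms side by side and does not merge them. The names `switch` and `net_flow` are hypothetical.

```python
def net_flow(terms, v):
    """Flow F_v(w) at variable v: outgoing weights minus incoming weights."""
    total = 0.0
    for nbrs, sink, p in terms:
        if v == sink:
            total -= p
        elif v in nbrs:
            total += p
    return total


def switch(terms, a, b):
    """Switch along j -> i -> j', where j = terms[a] and j' = terms[b].

    Each term is (frozenset of variables, sink variable, weight P > 0).
    Requires that i, the sink of j, is a non-sink neighbor of j'.
    """
    (nj, i, pj), (njp, ip, pjp) = terms[a], terms[b]
    if i not in njp or i == ip:
        raise ValueError("need edges (j, i) and (i, j') in D")
    p = min(pj, pjp)                      # line 1
    njpp = nj ^ njp                       # line 4: XOR of the two checks
    if ip not in njpp:
        raise ValueError("i' cancels out, so D is not acyclic")
    rest = [t for idx, t in enumerate(terms) if idx not in (a, b)]
    updated = rest + [
        (nj, i, pj - p),                  # line 2: decrease edges at j
        (njp, ip, pjp - p),               # line 2: decrease edges at j'
        (njpp, ip, p),                    # lines 3 and 5: route P through j''
    ]
    return [t for t in updated if t[2] > 0]  # line 6: drop zero weights
```

For example, with `j = ({0, 1, 2}, sink 2, P = 0.5)` and `j' = ({2, 3, 4}, sink 3, P = 0.25)`, switching along $j \to 2 \to j'$ removes $j'$, leaves $j$ with weight $0.25$, and adds the XOR check on $\{0, 1, 3, 4\}$ with sink 3 and weight $0.25$. Variable 2 loses its outgoing edge and no variable's net flow increases.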
Claim 4.3 (Switch operation properties)
Let $i \in V$ and $j, j' \in \overline{C}$ such that $(j, i)$ and $(i, j')$ are edges in $D$. After switching $D$ along $j \to i \to j'$, the following hold:
a) $D$ is still an acyclic WDG of a hyperflow for $\gamma$ in $\overline{G}$.
b) For each variable node $v \in V$, the total inflow $F^{in}_v(w)$ to $v$ and the total outflow $F^{out}_v(w)$ from $v$ do not increase.
c) The indegree of $i$ and the outdegree of $i$ do not increase and at least one of them decreases by at least one.

Proof. First we note that due to the acyclicity of $D$, variable node $i'$ will not cancel out after XORing $j$ and $j'$ in Line 4. Indeed, assume that $i'$ cancels out. Then $i'$ must be connected to $j$ (by an edge incoming from $i'$ since $j$ already has an edge outgoing to $i$), hence we get the cycle $j \to i \to j' \to i' \to j$.

Figure 1: An example of a portion of the WDG $D$ (a) before switching and (b) after switching along path $j \to i \to j'$. This figure illustrates the case when $|w(i, j')| < |w(j, i)|$, hence $P = |w(i, j')|$.

Figure 2: An example of a portion of the WDG $D$ (a) before switching and (b) after switching along path $j \to i \to j'$. This figure illustrates the case when $|w(i, j')| > |w(j, i)|$, hence $P = |w(j, i)|$.

It is straightforward to verify (b) and (c). Note that the only variable nodes in $D$ whose inflow or outflow change are those shared by $j$ and $j'$, namely $i$ and possibly other nodes $k$ (see Figures 1 and 2). Both the inflow to $i$ and the outflow from $i$ decrease by $P$, the outflow from $k$ decreases by $P$ and the inflow to $k$ remains unchanged.

It is also straightforward to verify that the acyclicity of $D$ and the WDG variable nodes inequalities ((d) in Definition 2.1) are maintained. To complete the proof of (a), we need to show that the hyperflow check nodes inequalities ((c) in Definition 2.1) are maintained. In particular, we have to argue that in Line 5 it is not possible that check node $j''$ is already present with a different edge orientation, i.e., with an edge outgoing from $j''$ to a variable node $i'' \ne i'$. Again, this follows from the acyclicity of $D$. Assume that right before executing Line 5, there is an edge outgoing from $j''$ to a variable node $i'' \ne i'$. Since variable $i''$ appears in check $j''$, it appears in either $j$ or $j'$, hence either $(i'', j)$ or $(i'', j')$ is an edge in $D$. If $(i'', j)$ is an edge, we get the cycle $i'' \to j \to i \to j' \to i' \to j'' \to i''$. If $(i'', j')$ is an edge, we get the cycle $i'' \to j' \to i' \to j'' \to i''$. $\square$

Algorithm 2 given below iteratively modifies $D$ until it becomes primitive by repeated application of the Switch operation. Recall that $\mathrm{In}_D(i)$ is the set of check nodes incident to edges ingoing to $i$ and $\mathrm{Out}_D(i)$ is the set of check nodes incident to edges outgoing from $i$.

Algorithm 2
Making the WDG $D$ primitive
for each variable node $i \in V$ do
    while $\mathrm{InDegree}_D(i) \ne 0$ and $\mathrm{OutDegree}_D(i) \ne 0$ (i.e., $\mathrm{In}_D(i) \ne \emptyset$ and $\mathrm{Out}_D(i) \ne \emptyset$) do
        Pick any $j \in \mathrm{In}_D(i)$ and any $j' \in \mathrm{Out}_D(i)$
        Switch $D$ along $j \to i \to j'$
    end while
end for
for each variable node $i \in V$ such that $\gamma_i > 0$ and $\mathrm{InDegree}_D(i) \ne 0$ do
    Remove all the edges in $D$ connected to check nodes in $\mathrm{In}_D(i)$
end for

For each $i \in V$, Part (c) of Claim 4.3 asserts that the indegree and the outdegree of $i$ do not increase and at least one of them decreases by at least one, hence the inner while-loop halts in a finite number of steps. Thus at the end of each iteration of the first outer for-loop, variable node $i$ has either zero indegree or zero outdegree. Part (b) of Claim 4.3 guarantees that once the indegree or the outdegree of a node $i$ is zero, it remains zero in future iterations of the algorithm.

Consider $D$ after the end of the first outer for-loop and consider any variable node $i \in V$.
If $\gamma_i \le 0$, the indegree of $i$ must be nonzero due to the WDG variable nodes inequalities. Thus the outdegree of $i$ must be zero.
If $\gamma_i > 0$ and the indegree of $i$ is nonzero, then the outdegree of $i$ must be zero, hence the outflow from $i$ is zero. Since $\gamma_i > 0$ and the outflow from $i$ is zero, the inflow to $i$ is unnecessary. The second for-loop performs a final pass to remove this unnecessary inflow by disconnecting the edges of the check nodes in $\mathrm{In}_D(i)$ from $D$ (thus now both the indegree and the outdegree of $i$ are zero). $\blacksquare$

In this section we establish Theorem 1.2 and Corollary 1.7 restated below for convenience. The proof of Theorem 1.2 uses the LP excess lemma [BGU14].
Lemma 5.1 ([BGU14]) (LP Excess Lemma: trading crossover probability with LP excess)
Let $H = (V, C, E)$ be a Tanner graph. Let $0 < \epsilon < \epsilon' < 1$ and $0 < \delta < 1$ such that $\epsilon' = \epsilon + (1 - \epsilon)\delta$. Let $q_{\epsilon'}$ be the probability that the LP decoder of $H$ fails on the $\epsilon'$-BSC. Consider operating on the $\epsilon$-BSC, i.e., choose the error pattern $x \sim \mathrm{Ber}(\epsilon, n)$. Then the probability that there exists a dual witness in $H$ for $(-1)^x - \delta$ is at least $1 - \frac{q_{\epsilon'}}{\delta}$.

In other words, if we let $(-1)^{x_i} - \sum_{j \in N(i)} w(i, j)$ be the "LP excess" of $w$ on variable node $i$, then the probability over the $\epsilon$-BSC that there exists a dual witness with LP excess greater than $\delta$ on all the variable nodes is at least $1 - \frac{q_{\epsilon'}}{\delta}$.

Theorem 1.2 (High degree redundant checks do not improve LP threshold)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs such that each check node has degree at most $d$, where $d$ is a constant. Assume that $\mathcal{G}$ is asymptotically strong. Then:
a) For any small constant $\delta > 0$, there exists a sufficiently large constant $k \ge d$ (dependent on $\delta$ and independent of $n$) such that $\xi_{LP}(\overline{\mathcal{G}}_k) \ge \xi_{LP}(\overline{\mathcal{G}}) - \delta$.
b) If $k(n)$ is a real valued function of $n$ such that $k(n) = \omega(1)$ (i.e., $k(n)$ tends to infinity as $n$ tends to infinity), then $\xi_{LP}(\overline{\mathcal{G}}_k) = \xi_{LP}(\overline{\mathcal{G}})$.

Proof:
Part (b) is an immediate consequence of (a). At a high level, the argument behind (a) is as follows. We will operate $\overline{\mathcal{G}}$ on the BSC slightly below its LP threshold to guarantee the existence of a dual witness $w$ with some small but constant LP excess over all variable nodes. Namely, we set the LP excess to $\delta$. Since $\overline{\mathcal{G}}$ contains all redundant check nodes, we can assume that $w$ is primitive. We will trim $w$ by removing all check nodes of degree larger than $k$. The trimming process leads to a distorted dual witness $w_k$, where the variable nodes inequalities are violated for $w_k$ over some set of variables which we call problematic. Call a variable risky if it receives at least $\delta/2$ flow from the removed check nodes and let $U$ be the set of risky variables. Thus the risky variables include all the problematic variables. Moreover, all the risky variables are received in error since $w$ is primitive. Due to the high degree of the removed check nodes and due to the primitivity of $w$, the removed checks give the variables in error little flow, namely at most $\frac{n}{k-1}$ in total. It follows that the set $U$ of risky variables is small, namely $|U| \le \frac{2n}{\delta(k-1)}$. Due to the primitivity of $w$, the variables in error, and in particular the problematic variables, have no outgoing edges. That is, the outflow from each problematic variable node is zero, hence fixing each problematic variable requires adding a unit flow in the worst case (this conclusion critically depends on the primitivity of $w$). By construction, the nonrisky variables still have $\delta - \frac{\delta}{2} = \frac{\delta}{2}$ LP excess after the trimming process. We will use this remaining excess to fix $w_k$ by patching a dual witness which turns the remaining small LP excess on the nonrisky variables into a unit flow on each risky variable. The existence of the patch follows from the asymptotic strength of $\mathcal{G}$.

More formally, let $\delta > 0$ be a constant. Assume that $\xi_{LP}(\overline{\mathcal{G}}) > 0$ and $\delta < \xi_{LP}(\overline{\mathcal{G}})$ (otherwise, the claim of the theorem is trivial).
We will show that there is a sufficiently large constant $k$ such that $\xi_{LP}(\overline{\mathcal{G}}_k) \ge \xi_{LP}(\overline{\mathcal{G}}) - \delta$. Let $\epsilon = \xi_{LP}(\overline{\mathcal{G}}) - \delta$ and $\epsilon' = \epsilon + (1 - \epsilon)\delta$, thus $0 < \epsilon < \epsilon' < \xi_{LP}(\overline{\mathcal{G}})$. Let $q_{\epsilon'}(n)$ be the probability of error of the LP decoder of $\overline{G}_n$ over the $\epsilon'$-BSC. Note that $q_{\epsilon'}(n)$ tends to zero as $n$ tends to infinity since $\epsilon' < \xi_{LP}(\overline{\mathcal{G}})$. By the LP excess lemma (Lemma 5.1), with probability at least $1 - \frac{q_{\epsilon'}(n)}{\delta}$, there exists a dual witness in $\overline{G}_n$ for $(-1)^x - \delta$, where $x \sim \mathrm{Ber}(\epsilon, n)$. In what follows, consider any $k$ and $n$ such that $d \le k \le n$, consider any $x \in \{0,1\}^n$ such that $(-1)^x - \delta$ has a dual witness $w$ in $\overline{G}_n$, say that $\overline{G}_n = (V, \overline{C}, \overline{E})$ and consider the Tanner graph $\overline{G}^n_k = (V, \overline{C}_k, \overline{E}_k)$. We will construct from $w$ a dual witness for $(-1)^x$ in $\overline{G}^n_k$ for sufficiently large $k$.

Let $V^+_x = \{i \in V : (-1)^{x_i} - \delta \ge 0\}$ and $V^-_x = \{i \in V : (-1)^{x_i} - \delta < 0\}$. Note that since $0 < \delta < 1$, $V^+_x = \{i \in V : (-1)^{x_i} = 1\}$ and $V^-_x = \{i \in V : (-1)^{x_i} = -1\}$, i.e., $V^+_x$ is the set of variable nodes received correctly and $V^-_x$ consists of those received in error. Since $\overline{G}_n$ contains all redundant check nodes, we can assume by Lemma 4.2 that $w$ is a primitive hyperflow; let $D$ be its WDG. Since $w$ is primitive, for each check node $j$ in $D$, all the ingoing edges to $j$ are from variables in $V^+_x$ and the only outgoing edge from $j$ is to some variable in $V^-_x$. Let $L_k$ be the set of check nodes in $\overline{G}_n$ of degree larger than $k$, i.e., $L_k = \overline{C} - \overline{C}_k$. The check nodes in $L_k$ give the variable nodes in $V^-_x$ a total flow which is at most $\frac{|V^+_x|}{k-1} \le \frac{n}{k-1}$. Call a variable node in $V^-_x$ risky if it receives at least $\frac{\delta}{2}$ flow in total from the checks in $L_k$. Let $U$ be the set of risky variable nodes, thus $|U| \le \frac{2n}{\delta(k-1)}$.

Remove from $D$ all the check nodes in $L_k$ and all the associated edges and let $w_k$ be the resulting weight map $w_k : \overline{E}_k \to \mathbb{R}$.
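The bound on $|U|$ can be spelled out as a short counting argument. The following is a sketch under two assumptions that are not stated in this passage: a hyperflow puts the same absolute weight $P_j \ge 0$ on every edge of a check $j$ (the symbol $P_j$ is introduced here only for illustration), and a variable counts as risky when it receives at least $\delta/2$ flow from $L_k$.

```latex
% Each check j in L_k has degree larger than k, hence at least k - 1 ingoing
% edges, all coming from V^+_x because w is primitive. Each variable in V^+_x
% has no inflow and satisfies F_i(w) < 1 - \delta, so its outflow is below 1.
\begin{align*}
(k-1) \sum_{j \in L_k} P_j
  &\le \sum_{i \in V^+_x} F^{out}_i(w) \le |V^+_x| \le n, \\
% Every risky variable absorbs at least \delta/2 of the flow delivered by L_k.
\frac{\delta}{2}\,|U|
  &\le \sum_{j \in L_k} P_j \le \frac{n}{k-1}
  \quad\Longrightarrow\quad
  |U| \le \frac{2n}{\delta (k-1)} .
\end{align*}
```

Read this way, the bound is a Markov-type count: the removed checks can deliver at most $n/(k-1)$ flow in total, so only few variables can each claim a $\delta/2$ share of it.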
The map $w_k$ possibly violates the variable nodes inequalities over some variables in $U$, but it satisfies the hyperflow check nodes inequalities and hence the dual witness check nodes inequalities over all checks. For each $i \in V$, consider the flows at $i$ associated with $w$ and $w_k$: $F_i(w) = \sum_j w(i, j)$, $F^{out}_i(w) = \sum_{j \leftarrow i} |w(i, j)|$, $F_i(w_k) = \sum_j w_k(i, j)$, $F^{in}_i(w_k) = \sum_{j \to i} |w_k(i, j)|$ and $F^{out}_i(w_k) = \sum_{j \leftarrow i} |w_k(i, j)|$. Since $w$ is primitive, none of the variables $i \in V^-_x$ have outgoing edges in $D$, thus $F^{out}_i(w) = 0$ and hence $F^{out}_i(w_k) = 0$. Thus for each $i \in U$, $F_i(w_k) = -F^{in}_i(w_k) \le 0$. If $i \in V^-_x - U$, we have $F_i(w_k) < F_i(w) + \frac{\delta}{2} < -1 - \delta + \frac{\delta}{2} = -1 - \frac{\delta}{2}$. If $i \in V^+_x$, we have $F_i(w_k) \le F_i(w) < 1 - \delta < 1 - \frac{\delta}{2}$. Therefore, for each variable $i \in V$,
\[
\begin{cases}
F_i(w_k) \le 0 & \text{if } i \in U \\
F_i(w_k) < (-1)^{x_i} - \frac{\delta}{2} & \text{otherwise.}
\end{cases}
\qquad (8)
\]
To turn $w_k$ into a dual witness for $(-1)^x$, we have to fix the possible violations of variable nodes inequalities over $U$. Over $V - U$, the variable nodes inequalities are satisfied with $\frac{\delta}{2}$ excess. We will use this excess to fix the problematic variables in $U$ by patching to $w_k$ a dual witness for the asymmetric LLR vector $\gamma \in \mathbb{R}^V$ given by
\[
\gamma_i =
\begin{cases}
-1 & \text{if } i \in U \\
\frac{\delta}{2} & \text{otherwise,}
\end{cases}
\]
for all $i \in V$.

Since $\mathcal{G}$ is asymptotically strong, there exists a constant $\alpha_\delta > 0$ (independent of $n$) such that if $|U| \le \alpha_\delta n$, the LP decoder of $G_n = (V, C, E)$ succeeds on the asymmetric LLR vector $\gamma$. Hence, if $\frac{2}{\delta(k-1)} \le \alpha_\delta$, then $\gamma$ has a dual witness $v : E \to \mathbb{R}$ in $G_n$. Since $k \ge d$ (recall that $d$ is the maximum degree of a check node in $\mathcal{G}$), we can extend $v$ from $E$ to $\overline{E}_k$ by zeros. Let $v_k : \overline{E}_k \to \mathbb{R}$ be the resulting weight map, thus
\[
\begin{cases}
F_i(v_k) < -1 & \text{if } i \in U \\
F_i(v_k) < \frac{\delta}{2} & \text{otherwise,}
\end{cases}
\qquad (9)
\]
where $F_i(v_k) = \sum_j v_k(i, j)$. Since $U \subset V^-_x$, it follows from (8) and (9) that $F_i(w_k) + F_i(v_k) < (-1)^{x_i}$, for all $i \in V$. Noting that the dual witness check nodes inequalities are preserved by superposition, we conclude that $w_k + v_k$ is the desired dual witness of $(-1)^x$.

In summary, for all $\delta > 0$ such that $\delta < \xi_{LP}(\overline{\mathcal{G}})$, there exists a constant $\alpha_\delta > 0$ such that for $\epsilon = \xi_{LP}(\overline{\mathcal{G}}) - \delta$, $\epsilon' = \epsilon + (1 - \epsilon)\delta$ and $k = \lceil \frac{2}{\delta \alpha_\delta} \rceil + 1$, the following holds for all values of $n$. Let $q_{\epsilon'}(n)$ be the probability of error of the LP decoder of $\overline{G}_n$ over the $\epsilon'$-BSC. Then there exists a dual witness in $\overline{G}^n_k$ for $(-1)^x$ with probability at least $1 - \frac{q_{\epsilon'}(n)}{\delta}$ over the choice of $x \sim \mathrm{Ber}(\epsilon, n)$. Since $\epsilon' < \xi_{LP}(\overline{\mathcal{G}})$, $q_{\epsilon'}(n)$ tends to zero as $n$ increases. It follows that, for all $\delta > 0$, there exists a sufficiently large constant $k > 0$ depending on $\delta$ such that $\xi_{LP}(\overline{\mathcal{G}}_k) \ge \xi_{LP}(\overline{\mathcal{G}}) - \delta$. $\blacksquare$

To derive Corollary 1.7 from Theorem 1.2, we need the following classical result.
Theorem 5.2 ([Fel03]) (Optimality of LP decoding on acyclic graphs)
Let $H = (V, C, E)$ be a Tanner graph and $Q_H$ the associated code. If $H$ is acyclic, then the fundamental polytope $P(H)$ of $H$ is the convex hull of the code $Q_H$, i.e., $\mathrm{conv}(Q_H) = P(H)$.

See [VK05] or [BG11] for a proof. It follows from Theorem 5.2 that redundant checks obtained by acyclic sums do not tighten the polytope. A statement similar to Corollary 5.3 appears in [BG11]. We include a short derivation of Corollary 5.3 from Theorem 5.2 for completeness.
Corollary 5.3 (Acyclic redundant checks do not tighten the polytope)
Let $G = (V, C, E)$ be a Tanner graph and $Q \subset \mathbb{F}_2^n$ the associated code. For each check $j \in C$, let $z_j \in Q^\perp$ be the vector in the dual code associated with $j$. Let $D \subset Q^\perp$ such that each check $z \in D$ is obtained by an acyclic sum of checks of $G$. That is, each $z \in D$ is of the form $z = \sum_{j \in S} z_j$, for some $S \subset C$ such that the graph induced by $G$ on $S$ is acyclic. Consider the Tanner graph $G' = (V, C \cup D, E')$ resulting from $G$ by adding all the checks in $D$. Then $P(G) = P(G')$.

Proof: By definition, $P(G) = \bigcap_{j \in C} \mathrm{conv}(Q_j)$ and $P(G') = \bigcap_{z \in C \cup D} \mathrm{conv}(Q_z)$. Consider any check $z \in D$. It is enough to argue that $P(G) \subset \mathrm{conv}(Q_z)$. Let $S \subset C$ such that $z = \sum_{j \in S} z_j$ and the graph $G_S = (V_S, S, E_S)$ induced by $G$ on $S$ is acyclic. By Theorem 5.2, $P(G_S) = \mathrm{conv}(Q_{G_S})$. Extending the polytopes from $\mathbb{R}^{V_S}$ to $\mathbb{R}^V$, we get $\bigcap_{j \in S} \mathrm{conv}(Q_j) = \mathrm{conv}(Q_S)$, where $Q_S$ is the supercode of $Q$ consisting of all the vectors in $\mathbb{F}_2^n$ satisfying all the checks in $S$. Since $z$ is a linear combination of checks in $S$, we have $Q_S \subset Q_z$, hence $\mathrm{conv}(Q_S) \subset \mathrm{conv}(Q_z)$. Therefore
\[
P(G) = \bigcap_{j \in C} \mathrm{conv}(Q_j) \subset \bigcap_{j \in S} \mathrm{conv}(Q_j) = \mathrm{conv}(Q_S) \subset \mathrm{conv}(Q_z). \qquad \blacksquare
\]

Finally, we conclude Corollary 1.7 from Theorem 1.2 and Corollary 5.3.
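As a side illustration of the hypothesis of Corollary 5.3, the Python sketch below forms the $\mathbb{F}_2$-sum of a set of checks and tests whether the Tanner graph they induce is acyclic. It is a minimal sketch under assumptions: checks are encoded as sets of variable indices, and the helper names `f2_sum` and `induced_graph_is_acyclic` as well as the two toy examples are hypothetical.

```python
def f2_sum(checks):
    """F_2-sum of checks, each given as a set of variable indices."""
    total = set()
    for support in checks:
        total ^= set(support)  # symmetric difference = addition over F_2
    return total


def induced_graph_is_acyclic(checks):
    """True iff the bipartite graph on these checks and their variables is a forest."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for idx, support in enumerate(checks):
        for var in support:
            a, b = find(("check", idx)), find(("var", var))
            if a == b:
                return False  # this edge closes a cycle
            parent[a] = b
    return True


# Two checks sharing one variable: acyclic sum, no extra cancellation.
tree_checks = [{0, 1, 2}, {2, 3, 4}]
# Two checks sharing two variables: they lie on a cycle of length 4.
cyclic_checks = [{0, 1, 2}, {1, 2, 3}]
```

In the first example the sum has support {0, 1, 3, 4} and is an acyclic sum in the sense of Corollary 5.3. In the second, two variables cancel and the sum has weight 2, which is the kind of low weight redundant check that only a sum involving a cycle can produce here.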
Corollary 1.7 (Redundant checks do not improve LP threshold)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs of bounded check degree. If $\mathcal{G}$ is asymptotically strong and rigid, then $\xi_{LP}(\overline{\mathcal{G}}) = \xi_{LP}(\mathcal{G})$.

Proof: Say that $G_n = (V_n, C_n, E_n)$ and $\overline{G}_n = (V_n, \overline{C}_n, \overline{E}_n)$. Since $\mathcal{G}$ is rigid, $\Delta(G_n) = \omega(1)$. Let $k(n) := \Delta(G_n) - 1$ and assume that $n$ is large enough so that $k(n)$ is at least the maximum degree of a check node of $\mathcal{G}$. By the definition of $k(n)$, all redundant checks in $\overline{C}_n$ of degree at most $k(n)$ are obtained by acyclic sums of checks in $C_n$. By Corollary 5.3, $P(G_n) = P(\overline{G}^n_{k(n)})$, hence $\xi_{LP}(\mathcal{G}) = \xi_{LP}(\overline{\mathcal{G}}_k)$. On the other hand, by Theorem 1.2, $\xi_{LP}(\overline{\mathcal{G}}_k) = \xi_{LP}(\overline{\mathcal{G}})$ since $k(n) = \omega(1)$ and $\mathcal{G}$ is asymptotically strong. It follows that $\xi_{LP}(\mathcal{G}) = \xi_{LP}(\overline{\mathcal{G}})$. $\blacksquare$

Expansion and asymptotic strength
In this section we prove Theorem 1.3 restated below for convenience. The proof uses the notion of a narrow dual witness defined below.

Definition 6.1 (Narrow dual witness)
Let $G = (V, C, E)$ be a Tanner graph, $y \in \{0,1\}^n$ an error vector and $w : E \to \mathbb{R}$ a dual witness for $(-1)^y$ in $G$. We call $w$ a narrow dual witness for $(-1)^y$ if all the edges not incident to $N(U)$ have zero weights, where $U = \{i \in V : y_i = 1\}$ is the set of variables in error (i.e., if an edge is not incident to a check node incident to a variable in error, then it has zero weight).

A key property of a narrow dual witness is that the flow is zero at every correct variable node whose distance from $U$ is more than 2 edges.

Recall that a Tanner graph $G = (V, C, E)$ is called an $(\varepsilon n, \kappa)$-expander if for each subset $S \subset V$ of variable nodes of size at most $\varepsilon n$, we have $|N(S)| \ge \kappa |S|$.

Feldman et al. [FMS+07] argued that the LP decoder of graphs with good expansion corrects a positive fraction of errors. Although not explicitly stated, the dual witness constructed in their proof is actually narrow. Their argument was later simplified by Viderman [Vid13] who also improved the expansion requirement.

Lemma 6.2 (Implicit in [Vid13]) (Expansion implies the existence of a narrow dual witness)
Let d v > , ε > and δ > be constants such that d v is an integer and δd v is an integer.Let G = ( V, C, E ) be a Tanner graph with regular variable degree d v and assume that G is an ( εn, δd v ) -expander. Then ( − y has a narrow dual witness in G , for each error vector y ∈ { , } n of weight at most δ − δ − ( εn − . Theorem 1.3 (Expansion implies asymptotic strength)
Let d v > , ε > and δ > beconstants such that d v is an integer and δd v is an integer. Let G = { G n } n be an infinite familyof Tanner graphs with regular variable degree d v and bounded check degree. If G n is an ( εn, δd v ) -expander for each n , then G is asymptotically strong. Proof:
The proof is based on successive superpositions of narrow dual witnesses obtained from Lemma 6.2 to amplify the flow at the variable nodes in error. The fact that they are narrow is essential for superposing them without violating the variable nodes constraints at the correct variables.

Consider any constant $\beta > 0$ and let $B = \lceil 1/\beta \rceil$. It is enough to find a constant $\alpha > 0$ and, for each $n$ and each $U \subset V = \{1, \ldots, n\}$ of size at most $\alpha n$, a dual witness $w$ in $G_n = (V, C, E)$ for the asymmetric LLR vector $\gamma \in \mathbb{R}^V$ given by
\[
\gamma_i =
\begin{cases}
-B & \text{if } i \in U \\
1 & \text{otherwise,}
\end{cases}
\]
for all $i \in V$. Since $B \ge 1/\beta$, the scaled version $\frac{1}{B} w$ of $w$ is the desired dual witness for $\gamma(y, \beta)$ (as given in Definition 1.1), where $y \in \{0,1\}^n$ is the indicator vector of $U$.

If $S \subset V$ is a set of variable nodes and $t \ge 0$ is an integer, let $N_{var}(S; t)$ be the set of variable nodes at distance at most $2t$ from $S$. Thus $N_{var}(S; 0) = S$ and $N_{var}(S; 1)$ is the set of variables connected to check nodes connected to $S$.

Let $\alpha > 0$ be a constant such that for each $U \subset V$ of size at most $\alpha n$, we have
\[
|N_{var}(U; B-1)| \le \frac{3\delta - 2}{2\delta - 1}(\varepsilon n - 1), \qquad (10)
\]
for sufficiently large $n$ (the explicit value of $\alpha$ is at the end of the proof). Assume that $|U| \le \alpha n$ and let $U_t = N_{var}(U; t)$, for $t = 0, \ldots, B$. In what follows, consider any $t \in \{0, \ldots, B-1\}$. Since $|U_t| \le \frac{3\delta - 2}{2\delta - 1}(\varepsilon n - 1)$, by Lemma 6.2, $(-1)^{y^t}$ has a narrow dual witness $w_t : E \to \mathbb{R}$ in $G_n$, where $y^t \in \{0,1\}^n$ is the indicator vector of $U_t$, i.e., $y^t_i = 1$ iff $i \in U_t$. The fact that $w_t$ is narrow means all the edges not incident to $N(U_t)$ have zero weights, thus the flow at the variable nodes outside $U_{t+1}$ is zero. That is, $F_i(w_t) = 0$ for each $i \in V - U_{t+1}$, where $F_i(w_t) = \sum_j w_t(i, j)$ is the flow with respect to $w_t$ at variable node $i$. Let $w = \sum_{t=0}^{B-1} w_t$. We will argue that $w$ is the desired dual witness for $\gamma$.

First, note that superposing dual witnesses does not violate the dual witness check nodes inequalities ((b) in Definition 2.1). Thus, we only have to worry about the variable nodes inequalities ((a) in Definition 2.1). Consider the flow at the variable nodes with respect to $w$: $F_i(w) = \sum_{t=0}^{B-1} F_i(w_t)$, for all $i \in V$. We have to show that
\[
\begin{cases}
F_i(w) < -B & \text{if } i \in U_0 = U \\
F_i(w) < 1 & \text{otherwise.}
\end{cases}
\qquad (11)
\]
Since each $w_t$ is a narrow dual witness for $(-1)^{y^t}$, we have
\[
\begin{cases}
F_i(w_t) < -1 & \text{if } i \in U_t \\
F_i(w_t) < 1 & \text{if } i \in U_{t+1} - U_t \\
F_i(w_t) = 0 & \text{if } i \in V - U_{t+1}.
\end{cases}
\]
Summing over $t = 0, \ldots, B-1$ and using the fact that $U_0 \subset U_1 \subset U_2 \subset \ldots \subset U_B$, we obtain
\[
\begin{cases}
F_i(w) < -B & \text{if } i \in U_0 \\
F_i(w) < -(B-2) & \text{if } i \in U_1 - U_0 \\
F_i(w) < -(B-3) & \text{if } i \in U_2 - U_1 \\
F_i(w) < -(B-4) & \text{if } i \in U_3 - U_2 \\
\quad \vdots \\
F_i(w) < -2 & \text{if } i \in U_{B-3} - U_{B-4} \\
F_i(w) < -1 & \text{if } i \in U_{B-2} - U_{B-3} \\
F_i(w) < 0 & \text{if } i \in U_{B-1} - U_{B-2} \\
F_i(w) < 1 & \text{if } i \in U_B - U_{B-1} \\
F_i(w) = 0 & \text{if } i \in V - U_B,
\end{cases}
\]
and hence (11) follows.

Finally, note that if $d_c$ is the maximum degree of a check node in $G_n$ over all $n$, then for all $t \ge 0$,
\[
|N_{var}(U; t)| \le \sum_{i=0}^{t} (d_v(d_c - 1))^i |U| = \frac{(d_v(d_c - 1))^{t+1} - 1}{d_v(d_c - 1) - 1} |U|.
\]
Thus condition (10) is satisfied if
\[
\frac{(d_v(d_c - 1))^{B} - 1}{d_v(d_c - 1) - 1}\, \alpha n \le \frac{3\delta - 2}{2\delta - 1}(\varepsilon n - 1),
\]
which is the case for $n$ sufficiently large with
\[
\alpha = \frac{(3\delta - 2)(d_v(d_c - 1) - 1)}{2(2\delta - 1)\left((d_v(d_c - 1))^{\lceil 1/\beta \rceil} - 1\right)}\, \varepsilon. \qquad \blacksquare
\]

In this section we prove Lemmas 1.9 and 1.10 restated below for convenience.
Lemma 1.9 (Rigidity versus girth and nondegeneracy)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs of bounded check degree. If $\mathcal{G}$ is rigid, then $\mathrm{girth}(G_n) = \omega(1)$. On the other hand, if $\mathrm{girth}(G_n) = \Theta(\log n)$ and the minimum check degree of $\mathcal{G}$ is at least 3 (i.e., for all $n$, each check node of $G_n$ has degree at least 3), then the following are equivalent:
i) (Rigidity) $\mathcal{G}$ is rigid.
ii) (Nondegeneracy) For each constant $c > 0$, $G_n$ is $(c \log n, \omega(1))$-nondegenerate. That is, for each constant $c > 0$, the minimum weight of a sum of at least $c \log n$ check nodes tends to infinity as $n$ increases.

Proof:
First we show that if $\mathcal{G}$ is rigid, then $\mathrm{girth}(G_n) = \omega(1)$. If $G_n$ has a cycle of $O(1)$ length, then the weight of the sum of the check nodes on this cycle is $O(1)$ since $\mathcal{G}$ has bounded check degree, which contradicts the rigidity of $\mathcal{G}$.

Assume in what follows that:
a) $\mathrm{girth}(G_n) = \Theta(\log n)$ and let $\alpha > 0$ be a constant such that $\mathrm{girth}(G_n) \ge \alpha \log n$ for sufficiently large $n$.
b) Each check node of $G_n$ has degree at least 3.
Say that $G_n = (V_n, C_n, E_n)$, let $Q_n$ be the code associated with $G_n$ and let $z_j \in Q_n^\perp$ be the vector in the dual code associated with check $j \in C_n$. We will use (a) to show that (ii) implies (i) and (b) to show that (i) implies (ii).

Assume that (ii) holds. To verify (i), let $z = \sum_{j \in S} z_j$ for some subset $S \subset C_n$ such that the graph induced by $G_n$ on $S$ contains a cycle. Thus $|S| \ge \mathrm{girth}(G_n) \ge \alpha \log n$. Since $G_n$ is $(\alpha \log n, \omega(1))$-nondegenerate, we get $\mathrm{weight}(z) = \omega(1)$, hence $\mathcal{G}$ is rigid.

Finally, assume that $\mathcal{G}$ is rigid and let $c > 0$. To verify that (ii) holds, we use (b). Let $z = \sum_{j \in S} z_j$ for some subset $S \subset C_n$ of size at least $c \log n$. We will argue that $\mathrm{weight}(z) = \omega(1)$ by considering two cases depending on whether or not the graph $G_S$ induced by $G_n$ on $S$ is acyclic.
Case 1: Assume that $G_S$ is acyclic. Since each check node in $S$ has degree at least 3, the number of leaves in the forest $G_S$ is at least $|S| + 2$ (in general, if $F$ is a forest and $s$ is the number of internal nodes of $F$ of degree at least 3, then the number of leaves of $F$ is at least $s + 2$ assuming that $s \ge 1$). The leaves are variable nodes appearing in exactly one check of $S$, so they do not cancel in the sum. Thus $\mathrm{weight}(z) \ge |S| + 2 = \Omega(\log n)$.
Case 2: If $G_S$ contains a cycle, then $\mathrm{weight}(z) = \omega(1)$ since $\mathcal{G}$ is rigid. $\blacksquare$

Definition 7.1 ([Cal97]) (Calkin's threshold)
If $d \ge 3$ is an integer, define the threshold $0 < \beta_d < 1$ as follows. Consider the function
\[
f_d(\alpha, \beta) = -1 + H(\alpha) + \beta \log_2\left(1 + (1 - 2\alpha)^d\right),
\]
where $H(\alpha) = -\alpha \log_2 \alpha - (1 - \alpha) \log_2 (1 - \alpha)$ is the binary entropy function. Let
\[
\beta_d := \sup\{\beta^* : f_d(\alpha, \beta) < 0 \text{ for all } 0 < \alpha < 1/2 \text{ and all } 0 < \beta < \beta^*\}.
\]
Equivalently, $\beta_d$ is the unique $0 < \beta_d < 1$ such that there exists $0 < \alpha_d < 1/2$ such that $(\alpha_d, \beta_d)$ is a root of the system of equations
\[
\begin{cases}
f_d(\alpha, \beta) = 0 \\
\frac{\partial}{\partial \alpha} f_d(\alpha, \beta) = 0.
\end{cases}
\]
For instance, $\beta_3 \sim 0.889$, $\beta_4 \sim 0.967$ and $\beta_5 \sim 0.989$. As $d$ increases, $\beta_d$ approaches 1. In general, Calkin shows that $\beta_d = \left(1 - \frac{e^{-d}}{\ln 2}\right)(1 \pm o(1))$. Calkin established the following.

Lemma 7.2 ([Cal97]) (Random row-regular matrices have full row rank)
Let $d \ge 3$ be an integer. Consider a random $m \times n$ matrix $M \in \mathbb{F}_2^{m \times n}$ constructed by independently choosing each of the $m$ rows of $M$ uniformly from the set of vectors in $\mathbb{F}_2^n$ of weight $d$. If $m < \beta_d n$, then the probability that the rows of $M$ are linearly dependent goes to zero as $n$ tends to infinity.

Note that full row rank corresponds to $(1, 0)$-nondegeneracy.

Lemma 1.10 (Random check-regular graphs are nondegenerate)
Let d, m and n be integerssuch that d ≥ and ≤ m < β d n . Consider a random m × n matrix M ∈ F m × n constructed byindependently choosing each of the m rows of M uniformly from the set of vectors in F n of weight d . Then for any constant c > and any function k ( n ) of n such that k ( n ) = o (log log n ) , M is ( c log n, k ( n )) -nondegenerate with high probability. That is, the probability that there are at least c log n rows of M whose F -sum has weight less than or equal to k ( n ) goes to zero as n tends toinfinity. The proof follows the argument Calkin [Cal97] used to establish Lemma 7.2. Let B d be the set ofvectors in F n of weight d . Let g = ⌈ c log n ⌉ , k = k ( n ), and P be the probability that there are atleast g rows of M whose F -sum has weight less than or equal to k . Thus P ≤ m X t = g (cid:18) mt (cid:19) k X p =0 a ( t ) p , (12)where a ( t ) p is the probability that the weight of the sum of t random vectors chosen uniformly andindependently from B d is p .Consider the random walk on F n which starts from 0 and moves by adding random elementsfrom B d . The transition probability matrix of the underlying Markov chain is the ( n + 1) × ( n + 1)matrix A = ( a pq ) p,q ∈{ ,...,n } , where a pq is defined as follows. Fix any vector y q ∈ F n of weight q .20hen a pq is the probability that the weight of x + y q is p over the uniformly random choice of x from B d . The entries of A are given by a pq = (cid:0) q q + d − p (cid:1)(cid:0) n − q d − q + p (cid:1)(cid:0) nd (cid:1) if q + d − p is even. Otherwise, a pq = 0. In terms of A , a ( t ) p = a ( t ) p , where a ( t ) p is the ( p, A t .The following lemma due to Calkin gives the eigenvalues and the eigenvectors of A in terms ofKrawtchook Polynomials. Lemma 7.3 ([Cal97]) a) The eigenvalues of A are λ i = 1 (cid:0) nd (cid:1) X s ( − s (cid:18) is (cid:19)(cid:18) n − id − s (cid:19) for i = 0 , . . . 
, n .The eigenvector corresponding to λ i is the n × vector e i whose entries are given by e ij = X s ( − s (cid:18) is (cid:19)(cid:18) n − ij − s (cid:19) for j = 0 , . . . , n .Moreover, A is decomposable as A = U Λ U − , where Λ = diag ( λ i ) ni =1 , U is the matrix whosecolumns are e , . . . , e n and U − = 2 − n U .b) If i > n , then λ i = ( − d λ n − i . We have a ( t ) p = a ( t ) p = 2 − n X i e ip λ ti e i ≤ − n (cid:18) np (cid:19) X i (cid:18) ni (cid:19) | λ i | t , since e i = (cid:0) ni (cid:1) and | e ip | ≤ (cid:0) np (cid:1) . It follows from (12), that P ≤ − n k X p =0 (cid:18) np (cid:19) n X i =0 (cid:18) ni (cid:19) m X t = g (cid:18) mt (cid:19) | λ i | t ≤ n + 1) k ⌊ n/ ⌋ X i =0 − n (cid:18) ni (cid:19) m X t = g (cid:18) mt (cid:19) | λ i | t , (13)where the second inequality follows from Part (b) of Lemma 7.3 and the bound P kp =0 (cid:0) np (cid:1) ≤ ( n +1) k .Instead of (13), Calkin obtains the bound:2 ⌊ n/ ⌋ X i =0 − n (cid:18) ni (cid:19) m X t =1 (cid:18) mt (cid:19) λ ti . (14)The key differences between (13) and (14) are that (14) starts from t = 1 instead of t = g and (13)has the extra ( n + 1) k term (the fact that the absolute values of the eigenvalues appear in (13)instead of their actual values is of minor significance). We will show that P ≤ − Θ( n / ) + n +1) k mg g ,hence P = o (1) for g = Θ(log n ) and k = o (log log n ).To estimate P , we will use the following bounds on the eigenvalues established by Calkin.21 emma 7.4 ([Cal97]) a) | λ i | ≤ for all ≤ i ≤ n b) If cn ≤ i ≤ n for some constant c > , then λ i = (cid:18) − in (cid:19) d − (cid:0) d (cid:1) n (cid:18) − in (cid:19) d − in (cid:18) − in (cid:19) + O (cid:18) d n (cid:19) . c) If n − n / ≤ i ≤ n , then | λ i | = o (cid:0) n (cid:1) . Let P i = 2( n + 1) k − n (cid:18) ni (cid:19) m X t = g (cid:18) mt (cid:19) | λ i | t . Thus P ≤ P ⌊ n/ ⌋ i =0 P i . 
We divide the summation on i as in the argument of Calkin into three regions:0 ≤ i ≤ ǫn , ǫn < i ≤ n − n / and n − n / < i ≤ n , where ǫ > m < β d n in second region and fact that t starts from g in the third region. Region 1: ≤ i ≤ ǫn . Using the bound | λ i | ≤ g on t , we get P i ≤ n + 1) k − n (cid:18) ni (cid:19) m X t =0 (cid:18) mt (cid:19) | λ i | t = 2( n + 1) k − n (cid:18) ni (cid:19) (1 + | λ i | ) m ≤ n + 1) k (cid:18) ni (cid:19) − ( n − m ) ≤ n + 1) k − n (1 − m/n − H (( ǫ ))+ O (log n ) = 2 − Θ( n ) , for sufficiently small ǫ >
0, since m < β d n , β d < k = o ( n log n ). Hence P (1) := X ≤ i ≤ ǫn P i ≤ − Θ( n ) . Region 3: n − n / < i ≤ n . Here we use the bound λ i = o (cid:0) n (cid:1) in Part (c) of Lemma 7.4 andthe bound (cid:0) mt (cid:1) ≤ (cid:0) emt (cid:1) t : P i ≤ n + 1) k − n (cid:18) ni (cid:19) m X t = g (cid:16) emt | λ i | (cid:17) t ≤ n + 1) k − n (cid:18) ni (cid:19) m X t = g t t ≤ n + 1) k − n (cid:18) ni (cid:19) mg g , where the second equality holds for sufficiently large n . Hence P (3) := X n − n / n . Therefore P (2) := X ǫn f d ( α, β ) = − H ( α ) + β log (1 + (1 − α ) d ) = − Θ(( α − ) )since H ( α ) = 1 − Θ(( α − ) ) and d ≥
3. It follows that P (2) ≤ δn k + − Θ( n / ) = 2 − Θ( n / ) for k = o (cid:16) n / log n (cid:17) .Combining the above three cases, we get P ≤ P (1) + P (2) + P (3) ≤ − Θ( n / ) + 2( n + 1) k mg g if k = o (cid:16) n / log n (cid:17) . Recall that g = Θ(log n ). It follows that P = o (1) if k = o (log log n ). In the section we give an interpretation of the notion of asymptotic strength in terms of thefractional spectrum of pseudocodewords. Then we compare with the related notions of minimumBSC-pseudoweight [VK05], fractional distance and maximum-fractional distance [Fel03, FWK05].23f G = ( V, C, E ) is a Tanner graph, let
$\mathrm{Ext}(G)$ be the set of extreme points of $P(G)$. The codewords of $Q$ are the integral vertices of $P(G)$, i.e., $\mathrm{Ext}(G) \cap \{0,1\}^n = Q$. The elements of $\mathrm{Ext}(G)$ are called pseudocodewords (see [Fel03, KV03, FWK05, VK05]).

In terms of pseudocodewords, the notion of asymptotic strength translates as follows.

Theorem 8.1 (Pseudocodewords and asymptotic strength)
Let $\mathcal{G} = \{G_n\}_n$ be an infinite family of Tanner graphs. Then $\mathcal{G}$ is asymptotically strong iff for each (small) constant $\theta > 0$, there exists a constant $\alpha > 0$ such that for each $n$ and each nonzero pseudocodeword $x \in \mathrm{Ext}(G_n)$, the sum of the largest $\lfloor \alpha n \rfloor$ coordinates of $x$ is less than $\theta \sum_i x_i$. That is, to attain a positive fraction of $\sum_i x_i$, we need at least a linear number of coordinates of $x$.

Proof:
By the definition of the LP decoder, the following are equivalent for any LLR vector $\gamma \in \mathbb{R}^n$:
i) The LP decoder of $G_n = (V_n, C_n, E_n)$ succeeds on $\gamma$ under the all-zeros codeword assumption.
ii) $\langle x, \gamma \rangle > 0$ for each nonzero $x \in \mathrm{Ext}(G_n)$.
By the equivalence between (i) and (ii), $\mathcal{G}$ is asymptotically strong iff for each constant $\beta > 0$, there exists a constant $\alpha > 0$ such that for each $n$ and each error vector $y \in \{0,1\}^n$ of weight at most $\alpha n$, we have $\langle x, \gamma(y, \beta) \rangle > 0$ for each nonzero $x \in \mathrm{Ext}(G_n)$, where $\gamma(y, \beta) : V_n \to \mathbb{R}$ is the asymmetric LLR vector given by
\[
\gamma_i(y, \beta) =
\begin{cases}
-1 & \text{if } y_i = 1 \\
\beta & \text{if } y_i = 0.
\end{cases}
\]
Let $U = \{i : y_i = 1\}$, thus
\[
\langle x, \gamma(y, \beta) \rangle = -\sum_{i \in U} x_i + \beta \sum_{i \in U^c} x_i = -(1 + \beta) \sum_{i \in U} x_i + \beta \sum_i x_i.
\]
Hence $\langle x, \gamma(y, \beta) \rangle > 0$ iff $\sum_{i \in U} x_i < \frac{\beta}{1 + \beta} \sum_i x_i$. Over all sets $U$ of size at most $\alpha n$, the left-hand side is largest when $U$ consists of the $\lfloor \alpha n \rfloor$ largest coordinates of $x$. The theorem then follows by setting $\theta = \frac{\beta}{1 + \beta}$. $\blacksquare$

Note that if $x$ is integral, i.e., $x \in \{0,1\}^n$ is a codeword, then the above condition is equivalent to $\mathrm{weight}(x) = \Theta(n)$. If $x$ is not integral, the above condition says that the fractional weights spectrum is not "too unbalanced" in the sense that we need at least a linear number of coordinates of $x$ to attain a positive fraction of $\sum_i x_i$.

In the setup of Theorem 8.1, the minimum BSC-pseudoweight [VK05] $w^{BSC}_p(G_n)$ corresponds to $\theta = \frac{1}{2}$. Namely, $w^{BSC}_p(G_n) = 2a^*$, where $a^*$ is the maximum value of $a$ such that the sum of the largest $a$ coordinates of $x$ is less than $\frac{1}{2} \sum_i x_i$ for all nonzero $x \in \mathrm{Ext}(G_n)$. The multiplicative factor of 2 ensures that the largest number of errors the LP decoder can handle over the BSC is $w^{BSC}_p(G_n)/2$. Thus, for integral codewords, the BSC-pseudoweight coincides with the Hamming weight. The asymptotic strength property implies that $w^{BSC}_p(G_n) = \Theta(n)$. It is not clear if the converse holds; the asymptotic strength requirement seems stronger since it is in terms of all $\theta > 0$ rather than only $\theta = \frac{1}{2}$. We leave the problem of whether or not it is strictly stronger open.

The fractional distance of $G$ is the minimum $L_1$-norm of a nonzero pseudocodeword [Fel03, FWK05]. Unlike the minimum BSC-pseudoweight, the fractional distance is always sublinear for regular bounded-degree Tanner graphs [KV03, VK05]. The same holds for the max-fractional distance, which is defined as the minimum ratio of the $L_1$-norm to the $L_\infty$-norm of a nonzero pseudocodeword [Fel03, FWK05].

Decoding with help bits
In this section we highlight a general property of asymptotically strong Tanner graphs. We argue that for such graphs, allowing a sublinear number of "help bits" does not improve the LP threshold. This result, although a negative statement, has potential constructive applications as it weakens the dual witness requirement for LP decoding success. We also derive a converse of the LP excess lemma.
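Since the results of this section rest on asymptotic strength, it may help to recall the condition of Theorem 8.1 concretely. The following is a minimal sketch, not part of the paper's argument: it checks numerically that, for the asymmetric LLR vector γ(y, β) (value −1 on error positions, β elsewhere), the sign test ⟨x, γ(y, β)⟩ > 0 agrees with the weight-balance test Σ_{i∈U} x_i < θ Σ_i x_i with θ = β/(1+β). The vector `x` below is an arbitrary nonnegative vector chosen for illustration, not an actual vertex of a fundamental polytope, and all function names are hypothetical.

```python
from fractions import Fraction
import itertools


def asymmetric_llr(y, beta):
    """gamma_i(y, beta): -1 on error positions (y_i = 1), beta elsewhere."""
    return [Fraction(-1) if yi == 1 else Fraction(beta) for yi in y]


def inner_product_positive(x, y, beta):
    """Test <x, gamma(y, beta)> > 0 using exact rational arithmetic."""
    gamma = asymmetric_llr(y, beta)
    return sum(xi * gi for xi, gi in zip(x, gamma)) > 0


def weight_balance_holds(x, y, beta):
    """Test sum_{i in U} x_i < theta * sum_i x_i with theta = beta / (1 + beta)."""
    theta = Fraction(beta) / (1 + Fraction(beta))
    mass_on_errors = sum(xi for xi, yi in zip(x, y) if yi == 1)
    return mass_on_errors < theta * sum(x)


def equivalence_holds_for_all_patterns(x, betas):
    """Compare both tests over every error pattern y in {0,1}^n (small n only)."""
    n = len(x)
    return all(
        inner_product_positive(x, y, beta) == weight_balance_holds(x, y, beta)
        for y in itertools.product((0, 1), repeat=n)
        for beta in betas
    )


# Illustrative fractional vector (an assumption, not a real pseudocodeword).
x = [Fraction(1, 2), Fraction(1, 2), Fraction(1), Fraction(0),
     Fraction(1, 3), Fraction(1, 3)]
betas = [Fraction(1, 10), Fraction(1, 4), Fraction(1), Fraction(3)]
all_agree = equivalence_holds_for_all_patterns(x, betas)
```

Exact fractions are used so that the comparison of the two strict inequalities is not blurred by floating-point rounding when the two sides are nearly equal.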
Definition 9.1 (LP decoder with help)
Let $H = \{H_n\}_n$ be an infinite family of Tanner graphs and $b : \mathbb{N} \to \mathbb{R}_{\ge 0}$. Consider transmitting $x \in \mathbb{F}_2^n$ and receiving the corrupted version $y \in \mathbb{F}_2^n$ of $x$. We say that the LP decoder of $H_n$ corrects $y$ with $b(n)$ help bits if there exists $z \in \mathbb{F}_2^n$ of weight at most $b(n)$ such that the LP decoder of $H_n$ succeeds in recovering $x$ from $y + z$. That is, we are allowed to flip at most $b(n)$ bits of $y$ to help the LP decoder. Define the LP-threshold $\xi_{LP}(H, b)$ to be the supremum of $\epsilon \ge 0$ such that the probability that the LP decoder of $H_n$ fails with $b(n)$ help bits over the $\epsilon$-BSC tends to zero as $n$ tends to infinity, i.e.,

$$\xi_{LP}(H, b) = \sup \{ \epsilon \ge 0 : \Pr_{\epsilon\text{-BSC}} [ \text{LP decoder of } H_n \text{ fails with } b(n) \text{ help bits} ] = o(1) \}.$$

Theorem 9.2 (Sublinear help does not improve LP threshold)
Let $H = \{H_n\}_n$ be an infinite family of Tanner graphs. If $H$ is asymptotically strong and $b(n) = o(n)$, then $\xi_{LP}(H, b) = \xi_{LP}(H)$.

A potential constructive application of Theorem 9.2 is the following. In dual terms (by Theorem 1.2), the LP decoder of $H_n = (V_n, C_n, E_n)$ corrects $y$ with $b(n)$ help bits iff there is a $b(n)$-weak dual witness for $(-1)^y$, where $w : E_n \to \mathbb{R}$ is called a $b(n)$-weak dual witness if instead of the variable nodes inequalities $F_i(w) < (-1)^{y_i}$, for $i \in V_n$, it satisfies the following weaker version:

$$\begin{cases} F_i(w) < 1 & \text{for all } i \in V_n \\ F_i(w) < -1 & \text{for all but at most } b(n) \text{ variables } i \in V_n \text{ such that } y_i = 1. \end{cases}$$

Thus Theorem 9.2 implies that to estimate the LP threshold of an asymptotically strong Tanner graph, it is enough to find a weak dual witness instead of a dual witness, which is in principle an easier task.

The proof of Theorem 9.2 is below and it uses the following converse of the LP excess lemma (Lemma 5.1).

Lemma 9.3 (LP deficiency lemma: trading LP deficiency with crossover probability)
Let $H = (V, C, E)$ be a Tanner graph. Let $0 < \epsilon < \epsilon' < 1$ and $0 < \delta < 1$ such that $\epsilon' = \epsilon + (1 - \epsilon)\delta$. Let $q_{\epsilon',\delta}$ be the probability that there is no dual witness in $H$ for $(-1)^y + \frac{\delta}{2}$, where $y \sim \mathrm{Ber}(\epsilon', n)$ is an error pattern generated by the $\epsilon'$-BSC. Then the probability that the LP decoder of $H$ fails on the $\epsilon$-BSC is at most $\frac{2 q_{\epsilon',\delta}}{\delta}$.

The proof of the LP deficiency lemma is in Section 9.1. Note that the $\frac{\delta}{2}$ term in $(-1)^y + \frac{\delta}{2}$ represents the "LP deficiency" of the dual witness with respect to $(-1)^y$, i.e., how far it is from being a dual witness for $(-1)^y$.

Proof of Theorem 9.2:
The proof uses a part of the argument in Theorem 1.2 and applies the LP deficiency lemma instead of the LP excess lemma. Using the asymptotic strength of $H$, we will trade the help bits with LP deficiency, which in turn we will trade with crossover probability using the LP deficiency lemma. At a high level, the argument is as follows. For any $\delta > 0$, we will operate the LP decoder of $H$ with $b(n)$ help bits on the BSC below its threshold $\xi_{LP}(H, b)$ by around $\delta$. With high probability, we have a dual witness $w$ for $(-1)^{y+z}$ for some help vector $z \in \{0,1\}^n$ of sublinear weight. We will turn $w$ into a dual witness for $(-1)^y + \frac{\delta}{2}$ by patching to $w$ a dual witness $v$ for the asymmetric LLR vector $\mu(z, \delta)$ given by

$$\mu_i(z, \delta) = \begin{cases} -2 & \text{if } z_i = 1 \\ \frac{\delta}{2} & \text{if } z_i = 0, \end{cases} \tag{15}$$

for all $i \in V$. The existence of $v$ follows from the asymptotic strength of $H$. Using the LP deficiency lemma, we get rid of the deficiency $\frac{\delta}{2}$ by decreasing the crossover probability to $\xi_{LP}(H, b) - \delta$.

More precisely, assume without loss of generality that $\xi_{LP}(H, b) > 0$ and let $0 < \delta < \xi_{LP}(H, b)$. We will show that $\xi_{LP}(H) \ge \xi_{LP}(H, b) - \delta$. Let $\epsilon = \xi_{LP}(H, b) - \delta$ and $\epsilon' = \epsilon + (1 - \epsilon)\delta$, thus $0 < \epsilon < \epsilon' < \xi_{LP}(H, b)$. Let $q_{\epsilon'}(n)$ be the probability that the LP decoder of $H_n$ with $b(n)$ help bits fails on $(-1)^y$, where $y \sim \mathrm{Ber}(n, \epsilon')$. Since $\epsilon' < \xi_{LP}(H, b)$, we have $q_{\epsilon'}(n) = o(1)$. By Theorem 2.2, with probability $1 - q_{\epsilon'}(n)$ over the choice of $y \sim \mathrm{Ber}(n, \epsilon')$, there is a dual witness $w$ in $H_n$ for $(-1)^{y+z}$ for some $z \in \{0,1\}^n$ of weight at most $b(n)$. Consider any $n$ and any $y \in \{0,1\}^n$ such that $w$ and $z$ exist. Since $H$ is asymptotically strong, there exists a constant $\alpha_\delta > 0$ (independent of $n$) such that if $\mathrm{weight}(z) \le \alpha_\delta n$, the LP decoder of $H_n = (V, C, E)$ succeeds on the asymmetric LLR vector $\mu(z, \delta)$ defined in (15). Accordingly, by Theorem 2.2, let $v : E \to \mathbb{R}$ be a dual witness for $\mu(z, \delta)$. Since $b(n) = o(n)$, assume that $n$ is large enough so that $b(n) \le \alpha_\delta n$. It follows that $w + v$ is a dual witness for $(-1)^{y+z} + \mu(z, \delta)$. Since $(-1)^{y+z} + \mu(z, \delta) \le (-1)^y + \frac{\delta}{2}$, we get that $w + v$ is a dual witness for $(-1)^y + \frac{\delta}{2}$. Therefore, the probability that there is no dual witness in $H_n$ for $(-1)^y + \frac{\delta}{2}$ over the choice of $y \sim \mathrm{Ber}(n, \epsilon')$ is at most $q_{\epsilon'}(n)$.
It follows from the LP deficiency lemma that the probability that the LP decoder of $H_n$ fails on the $\epsilon$-BSC is at most $\frac{2 q_{\epsilon'}(n)}{\delta}$. Since $q_{\epsilon'}(n) = o(1)$, we get that $\xi_{LP}(H) \ge \epsilon = \xi_{LP}(H, b) - \delta$. ∎

9.1 Proof of the LP deficiency lemma (Lemma 9.3)

The proof is a variation of the argument in the proof of Theorem 8.1 in [BGU14]. Decompose the $\epsilon'$-BSC into the bitwise OR of the $\epsilon$-BSC and the $\delta$-BSC. Choose $x \sim \mathrm{Ber}(\epsilon, n)$ and $e'' \sim \mathrm{Ber}(\delta, n)$ and consider $e = x \vee e''$, thus $e \sim \mathrm{Ber}(\epsilon', n)$. At a high level, we will construct a dual witness on the $\epsilon$-BSC by appropriately averaging dual witnesses on the $\epsilon'$-BSC over the choice of $e'' \sim \mathrm{Ber}(\delta, n)$.

For every $y \in \{0,1\}^n$, let

$$L(y) = \begin{cases} 1 & \text{if } (-1)^y + \frac{\delta}{2} \text{ has a dual witness} \\ 0 & \text{otherwise.} \end{cases}$$

Thus, in terms of $L$,

$$q_{\epsilon',\delta} = \Pr_{y \sim \mathrm{Ber}(\epsilon', n)} [ L(y) = 0 ]. \tag{16}$$

If $y \in \{0,1\}^n$, let $v_y : E \to \mathbb{R}$ be an arbitrary dual witness for $(-1)^y + \frac{\delta}{2}$ if $L(y) = 1$. Otherwise, let $v_y : E \to \mathbb{R}$ be the identically zero function. If $x \in \{0,1\}^n$, define $w_x : E \to \mathbb{R}$ by averaging $v_{x \vee e''}$ over the choice of $e'' \sim \mathrm{Ber}(\delta, n)$ and scaling:

$$w_x = \alpha \, \mathbb{E}_{e'' \sim \mathrm{Ber}(\delta, n)} v_{x \vee e''}, \quad \text{where } \alpha = \frac{1}{1 - \delta} > 0.$$

We will show that $w_x$ is a dual witness for $(-1)^x$ with probability at least $1 - \frac{2 q_{\epsilon',\delta}}{\delta}$ over the choice of $x \sim \mathrm{Ber}(\epsilon, n)$.

If $L(y) = 1$, then by definition, $v_y$ satisfies the dual witness check nodes inequalities: $v_y(i, j) + v_y(i', j) \ge 0$, for each check $j \in C$ and all distinct variables $i \ne i' \in N(j)$. The identically zero function $E \to \mathbb{R}$ also satisfies those inequalities, hence they are satisfied by $v_y$ for all $y \in \{0,1\}^n$. Since $w_x$ is an average over $v_{x \vee e''}$ scaled by a positive constant, we get that the dual witness check nodes inequalities are satisfied by $w_x$ for all $x \in \{0,1\}^n$.

In what follows, we take care of the variable nodes inequalities ((a) in Definition 2.1). If $w : E \to \mathbb{R}$, consider the flow vector $F(w) \in \mathbb{R}^V$ associated with $w$:

$$F_i(w) = \sum_{j \in N(i)} w(i, j), \quad \text{for all } i \in V.$$

In terms of $F$, we have

$$F(v_y) < (-1)^y + \frac{\delta}{2} \quad \text{for each } y \in \{0,1\}^n \text{ such that } L(y) = 1. \tag{17}$$

We have to show that $F(w_x) < (-1)^x$ with probability at least $1 - \frac{2 q_{\epsilon',\delta}}{\delta}$ over the choice of $x \sim \mathrm{Ber}(\epsilon, n)$. For any $x \in \{0,1\}^n$,

$$\begin{aligned}
F(w_x) &= \alpha \, \mathbb{E}_{e'' \sim \mathrm{Ber}(\delta, n)} F(v_{x \vee e''}) \\
&= \alpha \, \mathbb{E}_{e''} [ F(v_{x \vee e''}) \mid L(x \vee e'') = 1 ] \times \Pr_{e''} [ L(x \vee e'') = 1 ] && (18) \\
&< \alpha \, \mathbb{E}_{e''} [ (-1)^{x \vee e''} + \tfrac{\delta}{2} \mid L(x \vee e'') = 1 ] \times \Pr_{e''} [ L(x \vee e'') = 1 ] && \text{(using (17))} \\
&= \alpha \left( \mathbb{E}_{e''} (-1)^{x \vee e''} + \tfrac{\delta}{2} - \mathbb{E}_{e''} [ (-1)^{x \vee e''} + \tfrac{\delta}{2} \mid L(x \vee e'') = 0 ] \times \phi_x \right) \\
&\le \alpha \left( \mathbb{E}_{e''} (-1)^{x \vee e''} + \tfrac{\delta}{2} + \phi_x \right),
\end{aligned}$$

where $\phi_x := \Pr_{e'' \sim \mathrm{Ber}(\delta, n)} [ L(x \vee e'') = 0 ]$. Note that (18) follows from the fact that $L(y) = 0$ implies $v_y = 0$ and hence $F(v_y) = 0$. The last inequality holds since each coordinate of $(-1)^{x \vee e''} + \frac{\delta}{2}$ is at least $-1$. Fix any $i \in V$. If $x_i = 1$, then $\mathbb{E}_{e''} (-1)^{x_i \vee e''_i} = -1$. If $x_i = 0$, then $\mathbb{E}_{e''} (-1)^{x_i \vee e''_i} = \delta(-1) + (1 - \delta)(1) = 1 - 2\delta$. Hence

$$F_i(w_x) < \begin{cases} \alpha ( -1 + \frac{\delta}{2} + \phi_x ) & \text{if } x_i = 1 \\ \alpha ( 1 - \frac{3\delta}{2} + \phi_x ) & \text{if } x_i = 0. \end{cases}$$

By (16),

$$\mathbb{E}_{x \sim \mathrm{Ber}(\epsilon, n)} \phi_x = \Pr_{e'' \sim \mathrm{Ber}(\delta, n),\, x \sim \mathrm{Ber}(\epsilon, n)} [ L(x \vee e'') = 0 ] = \Pr_{y \sim \mathrm{Ber}(\epsilon', n)} [ L(y) = 0 ] = q_{\epsilon',\delta}.$$

Thus, by Markov's inequality, $\phi_x \ge \frac{\delta}{2}$ with probability at most $\frac{2 q_{\epsilon',\delta}}{\delta}$ over the choice of $x \sim \mathrm{Ber}(\epsilon, n)$. Hence, with probability at least $1 - \frac{2 q_{\epsilon',\delta}}{\delta}$ over $x \sim \mathrm{Ber}(\epsilon, n)$, we have for all $i \in V$,

$$F_i(w_x) < \begin{cases} \alpha ( -1 + \frac{\delta}{2} + \frac{\delta}{2} ) & \text{if } x_i = 1 \\ \alpha ( 1 - \frac{3\delta}{2} + \frac{\delta}{2} ) & \text{if } x_i = 0 \end{cases} \;=\; (-1)^{x_i},$$

since $\alpha = \frac{1}{1 - \delta}$. ∎

We conclude with some remarks and open questions mainly related to the asymptotic strength condition, the rigidity condition and the LP decoding threshold on the BSC.
Asymptotic strength condition.
Theorem 1.3 shows that expansion implies asymptotic strength. We know that random low density Tanner graphs are good expanders with high probability [SS96, FMS+07]. Combining Theorem 1.3 and the probabilistic analysis in Appendix B of [FMS+07] implies the following.
Theorem 10.1
Let < r < be a constant. Let d v be a positive integer constant such that thereexists a constant < δ < for which δd v and (1 − δ ) d v are integers and (1 − δ ) d v ≥ . Then,for any positive integers n and m such that r = 1 − mn , a random variable-regular Tanner graph G with variable degree d v , n variable nodes and m check nodes is asymptotically strong with highprobability . The integrality constraint on δd v and (1 − δ ) d v can require large values of d v (see [FMS+07]). Weconjecture that the following holds. Conjecture 10.2
For all d c > d v ≥ , a random ( d v , d c ) -regular Tanner graph is asymptoticallystrong with high probability. Rigidity condition.
If the graph has $\Theta(\log n)$ girth and minimum check degree at least 3, the rigidity condition is equivalent to the simpler $(c \log n, \omega(1))$-nondegeneracy condition. We argued in Lemma 1.10 that the latter condition holds with high probability for random check-regular graphs assuming that $m < \beta_d n$, where $m$ is the number of check nodes, $d$ is the check degree and $\beta_d$ is Calkin's threshold. The statistical independence of the check nodes in the ensemble of random check-regular graphs makes the ensemble attractive from a probabilistic analysis perspective, but it typically gives irregular graphs with constant girth. We believe that good girth and variable-regularity do not increase the odds of degeneracy; we conjecture that nondegeneracy is also a typical property of the ensemble of regular $\Theta(\log n)$-girth Tanner graphs.

Conjecture 10.3
Let d c > d v ≥ be integers such that d v < β d c d c . If λ > be a constant, let Γ λ be ensemble of ( d v , d c ) -regular Tanner graphs on n variable nodes of girth at least λ log n . Thenthere is a constant λ > small enough such that for each constant c > , a random graph G fromthe ensemble Γ λ is ( c log n, ω (1)) -nondegenerate with high probability. Establishing this conjecture requires working in a more complex probabilistic framework. We leavethe question open for further investigation. Note that since d c m = d v n , the condition d v < β d c d c is equivalent to m < β d c n . A natural but probably more difficult problem is to study also theasymptotic strength of the ensemble Γ λ . Limits of LP decoding on the BSC.
On the positive side, our negative results suggest studying By a random family G = { G n } n of Tanner graphs being asymptotically strong with high probability, we meanthe following. For each constant β >
0, there exists a constant α > n , with probability at least1 − o (1) over the random choice of G n , the LP decoder of G n succeeds on the asymmetric LLR vector γ ( y, β ) for all y ∈ { , } n of weight at most αn . r ′ is the rate of the dual code, Shannon’s limit says that we can transmit reliably over the ǫ -BSC if ǫ < H − ( r ′ ), where H is the binary entropy function. For LP decoding with all redundantchecks included, it is natural to study the following LP capacity function. Definition 10.4 (LP capacity over the BSC)
Given a dual rate $0 < r' < 1$, define the LP capacity function

$$\xi_{LP}(r') := \sup_{\{D_n\}_n} \xi_{LP}(\{G_{D_n}\}_n),$$

where the supremum is over all $\mathbb{F}_2$-linear codes $D_n \subset \mathbb{F}_2^n$ such that $\lim_{n \to \infty} \mathrm{rate}(D_n) = r'$ and $G_{D_n}$ is the Tanner graph on $n$ variables whose checks are the nonzero elements of $D_n$.

Note that primitive hyperflows (Theorem 4.2) may be useful in studying the LP capacity function.
Question 10.5 i) (Relation to Shannon’s capacity) How far is ξ LP ( r ′ ) from the Shannon’scapacity H − ( r ′ ) ? Is ξ LP ( r ′ ) = H − ( r ′ ) ?ii) (Achievability with bounded check-degree) Is any ǫ < ξ LP ( r ′ ) achievable by a familyof codes { D n } n with a bounded-weight basis? That is, is it true that for each ǫ < ξ LP ( r ′ ) ,there exist a constant d and a family of Tanner graphs G = { G n } n such that ξ LP ( G ) ≥ ǫ and G n has at most r ′ n check nodes each of degree most d ?iii) (Achievability with asymptotic strength) If the answer of (ii) is affirmative, is G asymp-totically strong?iv) (Achievability with rigidity) If the answer of (iii) is affirmative, is G rigid? The answer to first question is not clear.The answer to (ii) is probably affirmative since we already know from [FMS+07] that a positivevalue of ǫ is achievable with bounded check degree. The answer to (iii) seems also affirmative.In general, asymptotic strength makes the LP stronger as it guarantees that the fractional weightspectrum of the pseudocodewords is not “too unbalanced” (Theorem 8.1). Inspired by [LMSS01],if G is not asymptotically strong, we can actually make it asymptotically strong without noticeablyaffecting its rate by adding to the code a small number of parity checks which form a sufficientlygood expander . The added checks do not decrease the LP threshold of the code.If both (ii) and (iii) have affirmative answers, we obtain from Theorem 1.2 that for any ǫ <ξ LP ( r ′ ), there exists a sufficiently large constant k ≥ d such that ξ LP ( G k ) ≥ ǫ . Thus, by runningthe LP decoder of G k , dual rate r ′ is achievable on the ǫ -BSC in time polynomial in the blocklength n . More specifically, the time is polynomial in n k , where the constant k increases as the gap δ = ξ LP ( r ′ ) − ǫ gets small.The last question is more intriguing. If the answer to (iv) is also affirmative, then ξ LP ( G ) = ξ LP ( G ) by Corollary 1.7. 
Thus, by running the LP decoder of G , we conclude that for any ǫ <ξ LP ( r ′ ), dual rate r ′ is achievable on the ǫ -BSC in time polynomial in the block length n andindependent of the gap δ , which is counter intuitive if ξ LP ( r ′ ) = H − ( r ′ ). We need a ( εn, δd ′ )-expander between n variable nodes and αn check nodes of regular variable degree d ′ andbounded check degree, where α > ε, δ > < δ < δd ′ is aninteger. WGN.
On a final note, a natural question is to explore the potential extendability of the resultsin this paper to other channels such as the AWGN.
Acknowledgments
The authors would like to thank the anonymous reviewers for their helpful and constructive comments.
References

[ADS12] S. Arora, C. Daskalakis, and D. Steurer. Message-passing algorithms and improved LP decoding. Information Theory, IEEE Transactions on, 58(12):7260-7271, 2012.
[BG86] F. Barahona and M. Grötschel. On the cycle polytope of a binary matroid. J. Comb. Theory Ser. B, 40:40-62, 1986.
[BGU14] L. Bazzi, B. Ghazi, and R.L. Urbanke. Linear Programming Decoding of Spatially Coupled Codes. Information Theory, IEEE Transactions on, 60(8):4677-4698, 2014.
[BMVT78] E. Berlekamp, R. McEliece, and H. Van Tilborg. On the inherent intractability of certain coding problems (corresp.). Information Theory, IEEE Transactions on, 24(3):384-386, 1978.
[BG11] D. Burshtein and I. Goldenberg. Improved linear programming decoding of LDPC codes and bounds on the minimum and fractional distance. Information Theory, IEEE Transactions on, 57(11):7386-7402, 2011.
[Cal97] N. J. Calkin. Dependent sets of constant weight binary vectors. Combinatorics, Probability, and Computing, […]
[…] Information Theory, IEEE Transactions on, 54(8):3565-3578, 2008.
[Fel03] J. Feldman. Decoding error-correcting codes via linear programming. PhD thesis, Massachusetts Institute of Technology, 2003.
[FMS+07] J. Feldman, T. Malkin, R.A. Servedio, C. Stein, and M.J. Wainwright. LP decoding corrects a constant fraction of errors. Information Theory, IEEE Transactions on, 53(1):82-89, 2007.
[FS05] J. Feldman and C. Stein. LP decoding achieves capacity. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 05, pages 460-469, Philadelphia, PA, USA, 2005. Society for Industrial and Applied Mathematics.
[FWK05] J. Feldman, M.J. Wainwright, and D.R. Karger. Using linear programming to decode binary linear codes. Information Theory, IEEE Transactions on, 51(3):954-972, 2005.
[Gal62] R. Gallager. Low-density parity-check codes. Information Theory, IRE Transactions on, 8(1):21-28, 1962.
[GL14] B. Ghazi and E. Lee. LP/SDP Hierarchy Lower Bounds for Decoding Random LDPC Codes. Oct 2014, arXiv:1410.4241 [cs.CC]. To appear in ACM-SIAM Symposium on Discrete Algorithms (SODA), 2015.
[Kas08] N. Kashyap. A decomposition theory for binary linear codes. Information Theory, IEEE Transactions on, 54(7):3035-3058, 2008.
[KV03] R. Koetter and P.O. Vontobel. Graph-covers and iterative decoding of finite length codes. In Proceedings of the IEEE International Symposium on Turbo Codes and Applications, pages 75-82, 2003.
[KV06] R. Koetter and P.O. Vontobel. On the block error probability of LP decoding of LDPC codes. In Inaugural Workshop of the Center for Information Theory and its Applications, 2006. Available as eprint arXiv:cs/0602086v1.
[LMSS01] M. Luby, M. Mitzenmacher, M.A. Shokrollahi, and D. Spielman. Improved Low-Density Parity-Check Codes Using Irregular Graphs. Information Theory, IEEE Transactions on, 47(2):585-598, 2001.
[MWT09] M. Miwa, T. Wadayama, and I. Takumi. A cutting-plane method based on redundant rows for improving fractional distance. IEEE Journal on Selected Areas in Communications, 27(6):1603-1613, 2009.
[RU08] T. Richardson and R. Urbanke. Modern coding theory. Cambridge University Press, 2008.
[Sey80] P.D. Seymour. Decomposition of regular matroids. J. Comb. Theory Ser. B, 28:305-359, 1980.
[SS96] M. Sipser and D. Spielman. Expander codes. Information Theory, IEEE Transactions on, 42(6):1710-1722, 1996.
[TS08] M.H. Taghavi and P.H. Siegel. Adaptive methods for linear programming decoding. Information Theory, IEEE Transactions on, 54(12):5396-5410, 2008.
[Tan81] R.M. Tanner. A recursive approach to low complexity codes. Information Theory, IEEE Transactions on, 27(5):533-547, 1981.
[Vid13] M. Viderman. LP decoding of codes with expansion parameter above 2/3. Information Processing Letters, 113(7):225-228, 2013.
[VK05] P.O. Vontobel and R. Koetter. Graph-Cover Decoding and Finite-Length Analysis of Message-Passing Iterative Decoding of LDPC Codes. Dec. 2005, arXiv:cs/0512078v1, submitted for publication.
[VK04] P.O. Vontobel and R. Koetter. On the Relationship between Linear Programming Decoding and Min-Sum Algorithm Decoding. Proc. IEEE International Symposium on Information Theory (ISITA), October 2004, pp. 991-996.
[VK06] P.O. Vontobel and R. Koetter. Bounds on the threshold of linear programming decoding.