Linear-Programming Decoding of Nonbinary Linear Codes
Mark F. Flanagan, Vitaly Skachek, Eimear Byrne, Marcus Greferath
Abstract
A framework for linear-programming (LP) decoding of nonbinary linear codes over rings is developed. This framework facilitates linear-programming based reception for coded modulation systems which use direct modulation mapping of coded symbols. It is proved that the resulting LP decoder has the 'maximum-likelihood certificate' property. It is also shown that the decoder output is the lowest cost pseudocodeword. Equivalence between pseudocodewords of the linear program and pseudocodewords of graph covers is proved. It is also proved that if the modulator-channel combination satisfies a particular symmetry condition, the codeword error rate performance is independent of the transmitted codeword. Two alternative polytopes for use with linear-programming decoding are studied, and it is shown that for many classes of codes these polytopes yield a complexity advantage for decoding. These polytope representations lead to polynomial-time decoders for a wide variety of classical nonbinary linear codes. LP decoding performance is illustrated for the [11, 6, 5] ternary Golay code with ternary PSK modulation over AWGN, and in this case it is shown that the performance of the LP decoder is comparable to codeword-error-rate-optimum hard-decision based decoding. LP decoding is also simulated for medium-length ternary and quaternary LDPC codes with corresponding PSK modulations over AWGN.

Keywords:
Linear-programming decoding, LDPC codes, pseudocodewords, coded modulation.
This work was supported in part by the Claude Shannon Institute for Discrete Mathematics, Coding and Cryptography (Science Foundation Ireland Grant 06/MI/006). The material in this paper was presented in part at the 7th International ITG Conference on Source and Channel Coding (SCC), Ulm, Germany, January 2008, and in part at the IEEE International Symposium on Information Theory (ISIT), Toronto, Canada, July 2008. M. F. Flanagan is with the School of Electrical, Electronic and Mechanical Engineering, University College Dublin, Belfield, Dublin 4, Ireland (e-mail: mark.fl[email protected]). V. Skachek, E. Byrne and M. Greferath are with the Claude Shannon Institute and the School of Mathematical Sciences, University College Dublin, Belfield, Dublin 4, Ireland (e-mail: {vitaly.skachek, ebyrne, marcus.greferath}@ucd.ie). June 4, 2009. DRAFT.

I. INTRODUCTION
Low-density parity-check (LDPC) codes [1] have become very popular in recent years due to their excellent performance under sum-product (SP) decoding (or message-passing decoding). The primary research focus in this area to date has been on binary LDPC codes. Finite-length analysis of such LDPC codes under SP decoding is a difficult task. An approach to such an analysis was proposed in [2] based on the consideration of so-called pseudocodewords and their pseudoweights, defined with respect to a structure called the computation tree. By replacing this set of pseudocodewords with another set defined with respect to cover graphs of the Tanner graph (here called graph-cover pseudocodewords), the analysis was found to be significantly more tractable while still yielding accurate experimental results [3], [4], [5].

In [6] and [7], the decoding of binary LDPC codes using linear-programming (LP) decoding was proposed, and many important connections between linear-programming decoding and classical message-passing decoding were established. In particular, it was shown that the LP decoder is inhibited by a set of pseudocodewords corresponding to points in the LP relaxation polytope with rational coordinates (here called linear-programming pseudocodewords), and that the set of these pseudocodewords is equivalent to the set of graph-cover pseudocodewords. This represents a major result, as it indicates that essentially the same phenomenon determines the performance of LDPC codes under both LP and SP decoding.

For high-data-rate communication systems, bandwidth-efficient signalling schemes are required, which necessitates the use of higher-order (nonbinary) modulation. Of course, within such a framework it is desirable to use state-of-the-art error-correcting codes. Regarding the combination of LDPC coding and higher-order modulation, bit-interleaved coded modulation (BICM) [8] is a high-performance method which cascades the operations of binary coding, interleaving and higher-order constellation mapping. Here, however, the problem of system analysis is exacerbated by the complication of joint design of binary code, interleaver and constellation mapping; this becomes even more difficult when feedback is included from the decoder to the demodulator [9].

Alternatively, higher-order modulation may be achieved in conjunction with coding by the use of nonbinary codes whose symbols map directly to modulation signals. A study of such codes over rings, for use with PSK modulation, was performed in [10], with particular focus on rings of integers modulo a given integer. Nonbinary LDPC codes over fields have been investigated with direct mapping to binary [11] and nonbinary [12], [13], [14], [15] modulation signals; in all of this work, SP decoding (with respect to the nonbinary alphabet) was assumed.
Recently, some progress has been made on the analysis of such codes; in particular, pseudocodewords of nonbinary codes were defined and some bounds on the pseudoweights were derived [16].

In this work, we extend the approach of [7] towards coded modulation, in particular to codes over rings mapped to nonbinary modulation signals. As was done in [7], we show that the decoding problem may be formulated as an LP problem in the nonbinary case. We also show that an appropriate relaxation of the LP leads to a solution which has the 'maximum-likelihood (ML) certificate' property, i.e. if the LP outputs a codeword, then it must be the ML codeword. Moreover, we show that if the LP output is integral, then it must correspond to the ML codeword. We define the graph-cover pseudocodewords and the linear-programming pseudocodewords of the code, and prove the equivalence of these two concepts. This shows that the links between LP decoding on the relaxation polytope and message-passing decoding on the Tanner graph generalize to the nonbinary case. Of course, while we use the term 'nonbinary' throughout this paper, our framework includes the binary framework as a special case.

For coded modulation systems using maximum-likelihood (ML) decoding, the concept of geometric uniformity [17] was introduced as a condition which, if satisfied, guarantees codeword error rate (WER) performance independent of the transmitted codeword (this condition was used for the design of the coded modulation systems in [10]). An analogous symmetry condition was defined in [18] for binary codes over GF(2) with SP decoding; this was later extended to nonbinary codes over GF(q) by invoking the concept of coset LDPC codes [13], [14]. We show that for the present framework, there exists a symmetry condition under which the codeword error rate performance is independent of the transmitted codeword. This provides a condition somewhat akin to geometric uniformity for the present framework.
It is noteworthy that the same symmetry condition has recently been shown to yield codeword-independent decoder performance in the context of SP decoding [19] and also in the context of ML decoding [20]. In particular, this identifies a 'natural' mapping for nonbinary codes mapped to PSK modulation, where LP, SP or ML decoding is used with direct modulation mapping of coded symbols.

For the binary framework, alternative polytope representations were studied which gave a complexity advantage in certain scenarios [6], [7], [21], [22], [23]. Analogous to these works, we define two alternative polytope representations which offer a smaller number of variables and constraints for many classes of nonbinary codes. We compare these representations with the original polytope, and show that both of them have error-correcting performance equal to that of the original LP relaxation. Both of these representations lead to polynomial-time decoders for a wide variety of classical nonbinary linear codes.

To demonstrate performance, LP decoding is simulated for the ternary Golay code mapped to ternary PSK over AWGN; the LP decoder is seen to perform approximately as well as codeword-error-rate optimum hard-decision decoding, and close to the union bound for codeword-error-rate optimum soft-decision decoding.

The paper is organized as follows. Section II introduces the general settings and notation. The nonbinary decoding problem is formulated as a linear-programming problem in Section III, and basic properties of the decoding polytope are studied in Section IV. A sufficient condition for codeword-independent performance of the decoder is presented in Section V. Linear-programming pseudocodewords are defined in Section VI, and their properties are discussed. Their equivalence to the graph-cover pseudocodewords is shown in Section VII.
Two alternative polytope representations are presented in Sections VIII and IX, both of which have performance equivalent to the original but may provide lower-complexity decoding. Simulation results are presented in Section X for some example coded modulation systems. Finally, some directions for future research are proposed in Section XI.

II. GENERAL SETTINGS
We consider codes over finite rings (this includes codes over finite fields, but may be more general). Denote by $R$ a ring with $q$ elements, by $0$ its additive identity, and let $R^- = R \setminus \{0\}$. Let $\mathcal{C}$ be a code of length $n$ over the ring $R$, defined by
$$\mathcal{C} = \{ \mathbf{c} \in R^n \; : \; \mathbf{c} H^T = \mathbf{0} \} \quad (1)$$
where $H$ is an $m \times n$ matrix (with entries from $R$) called the parity-check matrix of the code $\mathcal{C}$. Obviously, the code $\mathcal{C}$ may admit more than one parity-check matrix; however, we consider the parity-check matrix $H$ to be fixed throughout this paper.

Linearity of the code $\mathcal{C}$ follows directly from (1). Also, the rate of the code $\mathcal{C}$ is defined as $R(\mathcal{C}) = \log_q(|\mathcal{C}|)/n$ and is equal to the number of information symbols per coded symbol. The code $\mathcal{C}$ may then be referred to as an $[n, \log_q(|\mathcal{C}|)]$ linear code over $R$.

Denote the set of column indices and the set of row indices of $H$ by $\mathcal{I} = \{1, 2, \ldots, n\}$ and $\mathcal{J} = \{1, 2, \ldots, m\}$, respectively. We use the notation $H_j$ for the $j$-th row of $H$, where $j \in \mathcal{J}$. Denote by $\mathrm{supp}(\mathbf{c})$ the support of a vector $\mathbf{c}$. For each $j \in \mathcal{J}$, let $\mathcal{I}_j = \mathrm{supp}(H_j)$ and $d_j = |\mathcal{I}_j|$, and let $d = \max_{j \in \mathcal{J}} \{ d_j \}$.

Given any $\mathbf{c} \in R^n$, we say that parity-check $j \in \mathcal{J}$ is satisfied by $\mathbf{c}$ if and only if
$$\mathbf{c} H_j^T = \sum_{i \in \mathcal{I}_j} c_i \cdot H_{j,i} = 0 \; . \quad (2)$$
For $j \in \mathcal{J}$, define the single parity-check code $\mathcal{C}_j$ over $R$ by
$$\mathcal{C}_j = \Big\{ (b_i)_{i \in \mathcal{I}_j} \; : \; \sum_{i \in \mathcal{I}_j} b_i \cdot H_{j,i} = 0 \Big\} \; .$$
Note that while the symbols of the codewords in $\mathcal{C}$ are indexed by $\mathcal{I}$, the symbols of the codewords in $\mathcal{C}_j$ are indexed by $\mathcal{I}_j$. We define the projection mapping for parity-check $j \in \mathcal{J}$ by
$$\mathbf{x}_j(\mathbf{c}) = (c_i)_{i \in \mathcal{I}_j} \; .$$
Then, given any $\mathbf{c} \in R^n$, we may say that parity-check $j \in \mathcal{J}$ is satisfied by $\mathbf{c}$ if and only if
$$\mathbf{x}_j(\mathbf{c}) \in \mathcal{C}_j \; , \quad (3)$$
since (2) and (3) are equivalent. Also, it is easily seen that $\mathbf{c} \in \mathcal{C}$ if and only if all parity-checks $j \in \mathcal{J}$ are satisfied by $\mathbf{c}$.
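The characterization of membership in $\mathcal{C}$ via the parity-checks can be verified by brute force for small codes. As an illustrative sketch (ours, not part of the paper), the following enumerates the codewords of the $[4, 2]$ code over $\mathbb{Z}_3$ which serves as the running example in this paper:

```python
import itertools

q = 3        # ring R = Z_3
n, m = 4, 2  # code length and number of parity checks

# Parity-check matrix of the running [4, 2] example code over Z_3.
H = [[1, 2, 2, 1],
     [0, 2, 1, 2]]

def check_satisfied(c, j):
    """Parity-check j is satisfied iff sum_i c_i * H[j][i] = 0 in Z_3."""
    return sum(ci * hji for ci, hji in zip(c, H[j])) % q == 0

# c is a codeword iff every parity-check is satisfied.
code = [c for c in itertools.product(range(q), repeat=n)
        if all(check_satisfied(c, j) for j in range(m))]

print(len(code))  # |C| = q^(n - rank(H)) = 3^2 = 9
```

Since the two rows of $H$ are linearly independent over $\mathbb{Z}_3$, the enumeration finds $3^2 = 9$ codewords, consistent with a rate of $2/4$ information symbols per coded symbol.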
In this case we say that $\mathbf{c}$ is a codeword of $\mathcal{C}$.

We shall now introduce an example which will be used to illustrate concepts throughout this paper. Consider the $[4, 2]$ linear code over $R = \mathbb{Z}_3$ with parity-check matrix
$$H = \begin{pmatrix} 1 & 2 & 2 & 1 \\ 0 & 2 & 1 & 2 \end{pmatrix} \; . \quad (4)$$
Here $\mathcal{I}_1 = \{1, 2, 3, 4\}$, $\mathcal{I}_2 = \{2, 3, 4\}$, and the two single parity-check codes $\mathcal{C}_1$ and $\mathcal{C}_2$, of length $d_1 = 4$ and $d_2 = 3$ respectively, are given by
$$\mathcal{C}_1 = \{ (b_1 \, b_2 \, b_3 \, b_4) \; : \; b_1 + 2 b_2 + 2 b_3 + b_4 = 0 \}$$
and
$$\mathcal{C}_2 = \{ (b_2 \, b_3 \, b_4) \; : \; 2 b_2 + b_3 + 2 b_4 = 0 \} \; .$$

III. DECODING AS A LINEAR-PROGRAMMING PROBLEM
Assume that the codeword $\bar{\mathbf{c}} = (\bar{c}_1, \bar{c}_2, \ldots, \bar{c}_n) \in \mathcal{C}$ has been transmitted over a $q$-ary input memoryless channel, and a corrupted word $\mathbf{y} = (y_1, y_2, \ldots, y_n) \in \Sigma^n$ has been received. Here $\Sigma$ denotes the set of channel output symbols; we assume that this set either has finite cardinality, or is equal to $\mathbb{R}^l$ or $\mathbb{C}^l$ for some integer $l \geq 1$. In practice, this channel may represent the combination of modulator and physical channel. We assume hereafter that all information words are equally probable, and so all codewords are transmitted with equal probability.

It was suggested in [6] to represent each symbol as a binary vector of length $|R^-|$, where the entries in the vector are indicators of the symbol taking on a particular value. Below, we show how this representation may lead to a generalization of the framework of [7] to the case of nonbinary coding. This generalization is nontrivial since, while such a representation converts the nonbinary code into a binary code, this binary code is not linear and therefore the analysis in [6], [7] is not directly applicable.

For use in the following derivation, we define the mapping $\xi : R \longrightarrow \{0,1\}^{q-1} \subset \mathbb{R}^{q-1}$ by $\xi(\alpha) = \mathbf{x} = (x^{(\gamma)})_{\gamma \in R^-}$, such that, for each $\gamma \in R^-$,
$$x^{(\gamma)} = \begin{cases} 1 & \text{if } \gamma = \alpha \\ 0 & \text{otherwise.} \end{cases}$$
We note that the mapping $\xi$ is one-to-one, and its image is the set of binary vectors of length $q-1$ with Hamming weight 0 or 1. Building on this, we also define $\Xi : R^n \longrightarrow \{0,1\}^{(q-1)n} \subset \mathbb{R}^{(q-1)n}$ according to
$$\Xi(\mathbf{c}) = (\xi(c_1) \,|\, \xi(c_2) \,|\, \cdots \,|\, \xi(c_n)) \; .$$
We note that $\Xi$ is also one-to-one. Now, for vectors $\mathbf{f} \in \mathbb{R}^{(q-1)n}$, we adopt the notation
$$\mathbf{f} = (\mathbf{f}_1 \,|\, \mathbf{f}_2 \,|\, \cdots \,|\, \mathbf{f}_n) \; , \qquad \forall i \in \mathcal{I}, \; \mathbf{f}_i = (f_i^{(\alpha)})_{\alpha \in R^-} \; .$$
Also, we may use this notation to write the inverse of $\Xi$ as
$$\Xi^{-1}(\mathbf{f}) = (\xi^{-1}(\mathbf{f}_1), \xi^{-1}(\mathbf{f}_2), \cdots, \xi^{-1}(\mathbf{f}_n)) \; .$$
We also define a function $\boldsymbol{\lambda} : \Sigma \longrightarrow (\mathbb{R} \cup \{\pm\infty\})^{q-1}$ by $\boldsymbol{\lambda} = (\lambda^{(\alpha)})_{\alpha \in R^-}$, where, for each $y \in \Sigma$, $\alpha \in R^-$,
$$\lambda^{(\alpha)}(y) = \log\left(\frac{p(y|0)}{p(y|\alpha)}\right) \; ,$$
and $p(y|c)$ denotes the channel output probability (density) conditioned on the channel input. We extend this to a map on $\Sigma^n$ by defining $\boldsymbol{\Lambda}(\mathbf{y}) = (\boldsymbol{\lambda}(y_1) \,|\, \boldsymbol{\lambda}(y_2) \,|\, \ldots \,|\, \boldsymbol{\lambda}(y_n))$.

The codeword-error-rate-optimum receiver operates according to the maximum a posteriori (MAP) decision rule:
$$\hat{\mathbf{c}} = \arg\max_{\mathbf{c} \in \mathcal{C}} p(\mathbf{c}|\mathbf{y}) = \arg\max_{\mathbf{c} \in \mathcal{C}} \frac{p(\mathbf{y}|\mathbf{c}) \, p(\mathbf{c})}{p(\mathbf{y})} \; .$$
Here $p(\cdot)$ denotes probability if $\Sigma$ has finite cardinality, and probability density if $\Sigma$ has infinite cardinality. By assumption, the a priori probability $p(\mathbf{c})$ is uniform over codewords, and $p(\mathbf{y})$ is independent of $\mathbf{c}$. Therefore, the decision rule reduces to maximum-likelihood (ML) decoding:
$$\hat{\mathbf{c}} = \arg\max_{\mathbf{c} \in \mathcal{C}} p(\mathbf{y}|\mathbf{c}) = \arg\max_{\mathbf{c} \in \mathcal{C}} \prod_{i=1}^n p(y_i|c_i) = \arg\max_{\mathbf{c} \in \mathcal{C}} \sum_{i=1}^n \log(p(y_i|c_i))$$
$$= \arg\min_{\mathbf{c} \in \mathcal{C}} \sum_{i=1}^n \log\left(\frac{p(y_i|0)}{p(y_i|c_i)}\right) = \arg\min_{\mathbf{c} \in \mathcal{C}} \sum_{i=1}^n \boldsymbol{\lambda}(y_i) \, \xi(c_i)^T = \arg\min_{\mathbf{c} \in \mathcal{C}} \boldsymbol{\Lambda}(\mathbf{y}) \, \Xi(\mathbf{c})^T \; ,$$
where we have used the fact that if $c_i = \alpha \in R^-$ then $\boldsymbol{\lambda}(y_i)\xi(c_i)^T = \lambda^{(\alpha)}(y_i)$, while if $c_i = 0$ then $\boldsymbol{\lambda}(y_i)\xi(c_i)^T = 0$. This is then equivalent to $\hat{\mathbf{c}} = \Xi^{-1}(\hat{\mathbf{f}})$ where
$$\hat{\mathbf{f}} = \arg\min_{\mathbf{f} \in \mathcal{K}(\mathcal{C})} \boldsymbol{\Lambda}(\mathbf{y}) \, \mathbf{f}^T \; , \quad (5)$$
and $\mathcal{K}(\mathcal{C})$ represents the convex hull of all points $\mathbf{f} \in \mathbb{R}^{(q-1)n}$ which correspond to codewords, i.e.
$$\mathcal{K}(\mathcal{C}) = \mathrm{conv}\{ \Xi(\mathbf{c}) \; : \; \mathbf{c} \in \mathcal{C} \} \; .$$
Therefore it is seen that the ML decoding problem reduces to the minimization of a linear objective function (or cost function) over a polytope in $\mathbb{R}^{(q-1)n}$. The number of variables and constraints for this linear program is exponential in $n$, and it is therefore too complex for practical implementation. To circumvent this problem, we formulate a relaxed LP problem, as shown next.

The solution we seek for $\mathbf{f}$ (i.e. the desired LP output) is
$$\mathbf{f} = \Xi(\bar{\mathbf{c}}) = (\xi(\bar{c}_1) \,|\, \xi(\bar{c}_2) \,|\, \ldots \,|\, \xi(\bar{c}_n)) \; . \quad (6)$$
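Since the vertices of $\mathcal{K}(\mathcal{C})$ are exactly the points $\Xi(\mathbf{c})$, minimizing the linear cost over $\mathcal{K}(\mathcal{C})$ agrees with brute-force ML decoding over the codeword list. The sketch below (ours, not from the paper) illustrates this for the running $[4,2]$ code over $\mathbb{Z}_3$; the cost vector $\boldsymbol{\Lambda}(\mathbf{y})$ is fabricated to favor a known codeword rather than derived from an actual channel:

```python
import itertools

q, n = 3, 4
H = [[1, 2, 2, 1], [0, 2, 1, 2]]  # running example over Z_3
code = [c for c in itertools.product(range(q), repeat=n)
        if all(sum(ci * hji for ci, hji in zip(c, row)) % q == 0 for row in H)]

def xi(alpha):
    """Indicator map xi: R -> {0,1}^(q-1); entry gamma is 1 iff gamma == alpha."""
    return [1 if gamma == alpha else 0 for gamma in range(1, q)]

def Xi(c):
    """Concatenated indicator representation of a length-n word."""
    return [x for ci in c for x in xi(ci)]

# Fabricated cost vector Lambda(y): entry lambda_i^(alpha) plays the role of
# log(p(y_i|0)/p(y_i|alpha)).  The values below favor the codeword (1,2,0,1).
target = (1, 2, 0, 1)
Lam = [x for ci in target for x in [-1 if a == ci else 1 for a in range(1, q)]]

def cost(c):
    """Linear objective Lambda(y) * Xi(c)^T."""
    return sum(l * f for l, f in zip(Lam, Xi(c)))

ml = min(code, key=cost)  # brute-force ML = LP over K(C)
print(ml)                 # (1, 2, 0, 1)
```

Because the objective is linear, its minimum over the polytope $\mathcal{K}(\mathcal{C})$ is attained at a vertex, which is why the brute-force search over the nine codewords reproduces (5).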
Note that (6) implies that the solution we seek for each $\mathbf{f}_i$ ($i \in \mathcal{I}$) is an indicator function which "points" to the $i$-th transmitted symbol $\bar{c}_i$, i.e.
$$\forall i \in \mathcal{I} : \; f_i^{(\alpha)} = \begin{cases} 1 & \text{if } \alpha = \bar{c}_i \\ 0 & \text{otherwise.} \end{cases}$$
We now introduce auxiliary variables whose constraints, along with those of the elements of $\mathbf{f}$, will form the relaxed LP problem. We denote these auxiliary variables by $w_{j,\mathbf{b}}$ for $j \in \mathcal{J}$, $\mathbf{b} \in \mathcal{C}_j$, and we denote the vector containing these variables as $\mathbf{w} = (\mathbf{w}_j)_{j \in \mathcal{J}}$ where $\mathbf{w}_j = (w_{j,\mathbf{b}})_{\mathbf{b} \in \mathcal{C}_j}$ for all $j \in \mathcal{J}$. The solution we seek for these variables is
$$\forall j \in \mathcal{J} : \; w_{j,\mathbf{b}} = \begin{cases} 1 & \text{if } \mathbf{b} = \mathbf{x}_j(\bar{\mathbf{c}}) \\ 0 & \text{otherwise.} \end{cases} \quad (7)$$
Note that the solution we seek for each $\mathbf{w}_j$ ($j \in \mathcal{J}$) is an indicator function which "points" to the $j$-th transmitted local codeword $\mathbf{x}_j(\bar{\mathbf{c}})$. Based on (7), we impose the constraints
$$\forall j \in \mathcal{J}, \; \forall \mathbf{b} \in \mathcal{C}_j, \quad w_{j,\mathbf{b}} \geq 0 \; , \quad (8)$$
and
$$\forall j \in \mathcal{J}, \quad \sum_{\mathbf{b} \in \mathcal{C}_j} w_{j,\mathbf{b}} = 1 \; . \quad (9)$$
Finally, we note that the solution we seek (given by the combination of (6) and (7)) satisfies the further constraints
$$\forall j \in \mathcal{J}, \; \forall i \in \mathcal{I}_j, \; \forall \alpha \in R^-, \quad f_i^{(\alpha)} = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = \alpha} w_{j,\mathbf{b}} \; . \quad (10)$$
It is interesting to note that from (8) and (9), each vector $\mathbf{w}_j$ (for $j \in \mathcal{J}$) may be interpreted as a probability distribution for the local codeword $\mathbf{b} \in \mathcal{C}_j$, in which case each $\mathbf{f}_i$ (for $i \in \mathcal{I}$) has a natural interpretation (via (10)) as the corresponding probability distribution for the $i$-th coded symbol $c_i \in R$. The following example illustrates the connection (10) between $\mathbf{f}$ and $\mathbf{w}$.

Example 3.1:
Consider the example $[4, 2]$ code over $\mathbb{Z}_3$ defined by the parity-check matrix (4). The second row $H_2$ of the parity-check matrix corresponds to the parity-check equation $2 b_2 + b_3 + 2 b_4 = 0$ over $\mathbb{Z}_3$, where $\mathbf{b} = (b_2 \, b_3 \, b_4) \in \mathcal{C}_2$. Assume that the values of $w_{2,\mathbf{b}}$ for $\mathbf{b} \in \mathcal{C}_2$ are as given in the following table (each entry is of the form $b_2 b_3 b_4 : w_{2,\mathbf{b}}$):

000 : 0.01    102 : 0.05    201 : 0.15
011 : 0.04    110 : 0.07    212 : 0.32
022 : 0.05    121 : 0.08    220 : 0.23

Then, some of the values of $f_i^{(\alpha)}$ are as follows:
$$f_2^{(2)} = 0.15 + 0.32 + 0.23 = 0.70 \; ; \qquad f_3^{(1)} = 0.04 + 0.07 + 0.32 = 0.43 \; ; \qquad f_4^{(2)} = 0.05 + 0.05 + 0.32 = 0.42 \; .$$

Note that, by (8)–(10), for each $j \in \mathcal{J}$ the vector $\hat{\mathbf{f}}_j = (\mathbf{f}_i)_{i \in \mathcal{I}_j}$ lies in the convex hull $\mathcal{K}(\mathcal{C}_j)$. Constraints (8)–(10) form a polytope which we denote by $\mathcal{Q}$. The minimization of the objective function (5) over $\mathcal{Q}$ forms the relaxed LP decoding problem. This LP is defined by $O(qn + q^d m)$ variables and $O(qn + q^d m)$ constraints, and therefore the number of variables and of constraints scales as approximately $q^d$.

We note that the further constraints
$$\forall j \in \mathcal{J}, \; \forall \mathbf{b} \in \mathcal{C}_j, \quad w_{j,\mathbf{b}} \leq 1 \; , \quad (11)$$
$$\forall i \in \mathcal{I}, \; \forall \alpha \in R^-, \quad 0 \leq f_i^{(\alpha)} \leq 1 \; , \quad (12)$$
and
$$\forall i \in \mathcal{I}, \quad \sum_{\alpha \in R^-} f_i^{(\alpha)} \leq 1 \; , \quad (13)$$
follow from the constraints (8)–(10), for any $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$.

Now we may define the decoding algorithm, which works as follows. The decoder solves the LP problem of minimizing the objective function (5) subject to the constraints (8)–(10). If $\mathbf{f} \in \{0,1\}^{(q-1)n}$, the output is the codeword $\Xi^{-1}(\mathbf{f})$ (we shall prove in the next section that this output is indeed a codeword). This codeword may then be the correct one (we call this 'correct decoding') or an incorrect one (we call this 'incorrect decoding'). If $\mathbf{f} \notin \{0,1\}^{(q-1)n}$, the decoder reports a 'decoding failure'. Note that in this paper, we say that the decoder makes a codeword error when the decoder output is not equal to the transmitted codeword (this could correspond to a 'decoding failure' or to an 'incorrect decoding').

The time complexity of an LP solver depends on the number of variables and constraints in the LP problem. The simplex method is a popular and practically efficient algorithm for solving LP problems; however, its worst-case time complexity has been shown to be exponential in the number of variables. There are other known LP solvers, such as those based on interior-point methods [24, Chapter 11], which have time complexity polynomial in the number of variables and constraints. For more detail the reader may also refer to [25].
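Returning to Example 3.1, the marginalization constraint (10) can be verified numerically. The sketch below (ours) hard-codes the $w_{2,\mathbf{b}}$ values of the example and recomputes the quoted marginals $f_2^{(2)}$, $f_3^{(1)}$ and $f_4^{(2)}$:

```python
import itertools
import math

q = 3
# Single parity-check code C_2 = {(b2, b3, b4) in Z_3^3 : 2*b2 + b3 + 2*b4 = 0}.
C2 = [b for b in itertools.product(range(q), repeat=3)
      if (2 * b[0] + b[1] + 2 * b[2]) % q == 0]

# Auxiliary-variable values w_{2,b} from Example 3.1: a probability
# distribution over the nine local codewords of C_2.
w = {(0, 0, 0): 0.01, (0, 1, 1): 0.04, (0, 2, 2): 0.05,
     (1, 1, 0): 0.07, (1, 2, 1): 0.08, (1, 0, 2): 0.05,
     (2, 0, 1): 0.15, (2, 1, 2): 0.32, (2, 2, 0): 0.23}

# Sanity checks: w is supported exactly on C_2 and satisfies (9).
assert set(w) == set(C2) and math.isclose(sum(w.values()), 1.0)

def f(pos, alpha):
    """Constraint (10): f_i^(alpha) = sum of w_{2,b} over b with b_i = alpha.
    Here pos = 0, 1, 2 corresponds to symbol indices i = 2, 3, 4."""
    return sum(wb for b, wb in w.items() if b[pos] == alpha)

print(round(f(0, 2), 2), round(f(1, 1), 2), round(f(2, 2), 2))  # 0.7 0.43 0.42
```

This makes the probabilistic interpretation concrete: $\mathbf{w}_2$ is a distribution over local codewords, and each $\mathbf{f}_i$ is the induced marginal distribution of the $i$-th coded symbol.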
We note, however, that the standard iterative decoding algorithms (such as the min-sum or sum-product algorithms) have time complexity which is linear in the block length of the code, and therefore significantly outperform the LP decoder in terms of efficiency.

IV. POLYTOPE PROPERTIES
The analysis in this section is a direct generalization of the results in [7].
Definition 4.1: An integral point in a polytope is a point with all integer coordinates. Proposition 4.1:
1) Let $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$, and suppose $f_i^{(\alpha)} \in \{0, 1\}$ for every $i \in \mathcal{I}$, $\alpha \in R^-$. Then $\Xi^{-1}(\mathbf{f}) \in \mathcal{C}$.
2) Conversely, for every codeword $\mathbf{c} \in \mathcal{C}$, there exists $\mathbf{w}$ such that $(\mathbf{f}, \mathbf{w})$ is an integral point in $\mathcal{Q}$ with $\mathbf{f} = \Xi(\mathbf{c})$.
1) Suppose $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$, and $f_i^{(\alpha)} \in \{0,1\}$ for every $i \in \mathcal{I}$, $\alpha \in R^-$. Let $\mathbf{c} = \Xi^{-1}(\mathbf{f})$; by (13), this is well defined. Now, fix some $j \in \mathcal{J}$ and define $\mathbf{t} = \mathbf{x}_j(\mathbf{c})$. Note that from these definitions it follows that for any $i \in \mathcal{I}_j$, $\alpha \in R^-$, $f_i^{(\alpha)} = 1$ if and only if $t_i = \alpha$. Now let $\mathbf{r} \in \mathcal{C}_j$, $\mathbf{r} \neq \mathbf{t}$. Since $\mathbf{r}$ and $\mathbf{t}$ are distinct, there must exist $\alpha \in R^-$ and $l \in \mathcal{I}_j$ such that either $r_l = \alpha$ and $t_l \neq \alpha$, or $t_l = \alpha$ and $r_l \neq \alpha$. We examine these two cases separately.
- If $r_l = \alpha$ and $t_l \neq \alpha$, then by (10)
$$f_l^{(\alpha)} = 0 = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_l = \alpha} w_{j,\mathbf{b}} \; .$$
Therefore $w_{j,\mathbf{b}} = 0$ for all $\mathbf{b} \in \mathcal{C}_j$ with $b_l = \alpha$, and in particular $w_{j,\mathbf{r}} = 0$.
- If $t_l = \alpha$ and $r_l \neq \alpha$, then by (9) and (10)
$$1 - f_l^{(\alpha)} = \sum_{\mathbf{b} \in \mathcal{C}_j} w_{j,\mathbf{b}} - \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_l = \alpha} w_{j,\mathbf{b}} = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_l \neq \alpha} w_{j,\mathbf{b}} \; .$$
Since $t_l = \alpha$ we have $f_l^{(\alpha)} = 1$, so the right-hand side is zero. Therefore $w_{j,\mathbf{b}} = 0$ for all $\mathbf{b} \in \mathcal{C}_j$ with $b_l \neq \alpha$, and in particular $w_{j,\mathbf{r}} = 0$.
It follows that $w_{j,\mathbf{r}} = 0$ for all $\mathbf{r} \in \mathcal{C}_j$, $\mathbf{r} \neq \mathbf{t}$. But by (9) this implies that $\mathbf{t} \in \mathcal{C}_j$ (and that $w_{j,\mathbf{t}} = 1$). Applying this argument for every $j \in \mathcal{J}$ implies $\mathbf{c} \in \mathcal{C}$.
2) For $\mathbf{c} \in \mathcal{C}$, we let $\mathbf{f} = \Xi(\mathbf{c})$. For each parity-check $j \in \mathcal{J}$, we let $\mathbf{t} = \mathbf{x}_j(\mathbf{c}) \in \mathcal{C}_j$ and then set
$$\forall j \in \mathcal{J} : \; w_{j,\mathbf{b}} = \begin{cases} 1 & \text{if } \mathbf{b} = \mathbf{t} \\ 0 & \text{otherwise.} \end{cases}$$
It is easily checked that the resulting point $(\mathbf{f}, \mathbf{w})$ is integral and satisfies constraints (8)–(10).

The following proposition assures the so-called ML certificate property.
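The construction in part 2 of the proof can be checked exhaustively for the running example: every codeword yields an integral point satisfying (8)–(10). A sketch (ours, not part of the paper):

```python
import itertools

q, n = 3, 4
H = [[1, 2, 2, 1], [0, 2, 1, 2]]  # running example over Z_3
supp = [[i for i, h in enumerate(row) if h] for row in H]  # index sets I_j

# Local single parity-check codes C_j, indexed by the positions in supp[j].
Cj = [[b for b in itertools.product(range(q), repeat=len(s))
       if sum(bi * H[j][i] for bi, i in zip(b, s)) % q == 0]
      for j, s in enumerate(supp)]

code = [c for c in itertools.product(range(q), repeat=n)
        if all(sum(ci * h for ci, h in zip(c, row)) % q == 0 for row in H)]

for c in code:
    # f = Xi(c): indicator variables; w_j "points" to the local codeword x_j(c).
    f = [[1 if a == c[i] else 0 for a in range(1, q)] for i in range(n)]
    w = [{b: (1 if b == tuple(c[i] for i in s) else 0) for b in Cj[j]}
         for j, s in enumerate(supp)]
    for j, s in enumerate(supp):
        assert sum(w[j].values()) == 1                     # constraint (9)
        for pos, i in enumerate(s):
            for a in range(1, q):
                # constraint (10): marginal of w_j matches f_i^(a)
                assert f[i][a - 1] == sum(w[j][b] for b in Cj[j] if b[pos] == a)

print("all", len(code), "codewords give integral points in Q")
```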
Proposition 4.2:
Suppose that the decoder outputs a codeword $\mathbf{c} \in \mathcal{C}$. Then $\mathbf{c}$ is the maximum-likelihood codeword.

The proof of this proposition is straightforward; the reader may refer to a similar proof for the binary case in [7].

V. CODEWORD-INDEPENDENT DECODER PERFORMANCE
In this section, we state and prove a theorem on decoder performance, namely, that under a certain symmetry condition, the probability of codeword error is independent of the transmitted codeword. The proof generalizes the corresponding proof for the binary case, which may be found in [6], [7].
Symmetry Condition.
For each $\beta \in R$, there exists a bijection $\tau_\beta : \Sigma \longrightarrow \Sigma$, such that the channel output probability (density) conditioned on the channel input satisfies
$$p(y|\alpha) = p(\tau_\beta(y) \,|\, \alpha - \beta) \quad (14)$$
for all $y \in \Sigma$, $\alpha \in R$. When $\Sigma$ is equal to $\mathbb{R}^l$ or $\mathbb{C}^l$ for $l \geq 1$, the mapping $\tau_\beta$ is assumed to be isometric with respect to Euclidean distance in $\Sigma$, for every $\beta \in R$.

Note that the symmetry condition above is very similar to that introduced in [20], which guarantees codeword-independent performance under ML decoding.

Theorem 5.1:
Under the stated symmetry condition, the probability of codeword error is independent of the transmitted codeword.
Proof:
We shall prove the theorem for the case where $\Sigma$ has infinite cardinality; the case of discrete $\Sigma$ may be handled similarly. Fix some codeword $\mathbf{c} \in \mathcal{C}$, $\mathbf{c} \neq \mathbf{0}$. We wish to prove that $\Pr(\mathrm{Err}|\mathbf{c}) = \Pr(\mathrm{Err}|\mathbf{0})$, where $\Pr(\mathrm{Err}|\mathbf{c})$ denotes the probability of codeword error given that the codeword $\mathbf{c}$ was transmitted.

Now $\Pr(\mathrm{Err}|\mathbf{c}) = \Pr(\mathbf{y} \in B(\mathbf{c}) \,|\, \mathbf{c})$, where
$$B(\mathbf{c}) = \{ \mathbf{y} \in \Sigma^n \; : \; \exists (\mathbf{f}, \mathbf{w}) \in \mathcal{Q}, \; \mathbf{f} \neq \Xi(\mathbf{c}) \text{ with } \boldsymbol{\Lambda}(\mathbf{y}) \mathbf{f}^T \leq \boldsymbol{\Lambda}(\mathbf{y}) \Xi(\mathbf{c})^T \} \; .$$
Here $B(\mathbf{c})$ is the set of all received words which may cause a codeword error, given that $\mathbf{c}$ was transmitted. Recall that the elements of $\boldsymbol{\Lambda}(\mathbf{y})$ are given by
$$\lambda^{(\alpha)}(y_i) = \log\left(\frac{p(y_i|0)}{p(y_i|\alpha)}\right) \; , \quad (15)$$
for $i \in \mathcal{I}$, $\alpha \in R^-$. Also $\Pr(\mathrm{Err}|\mathbf{0}) = \Pr(\mathbf{y} \in B(\mathbf{0}) \,|\, \mathbf{0})$ where
$$B(\mathbf{0}) = \{ \tilde{\mathbf{y}} \in \Sigma^n \; : \; \exists (\tilde{\mathbf{f}}, \tilde{\mathbf{w}}) \in \mathcal{Q}, \; \tilde{\mathbf{f}} \neq \Xi(\mathbf{0}) \text{ with } \boldsymbol{\Lambda}(\tilde{\mathbf{y}}) \tilde{\mathbf{f}}^T \leq \boldsymbol{\Lambda}(\tilde{\mathbf{y}}) \Xi(\mathbf{0})^T \} \; .$$
So we write
$$\Pr(\mathrm{Err}|\mathbf{c}) = \int_{\mathbf{y} \in B(\mathbf{c})} p(\mathbf{y}|\mathbf{c}) \, d\mathbf{y} \quad (16)$$
and
$$\Pr(\mathrm{Err}|\mathbf{0}) = \int_{\tilde{\mathbf{y}} \in B(\mathbf{0})} p(\tilde{\mathbf{y}}|\mathbf{0}) \, d\tilde{\mathbf{y}} \; . \quad (17)$$
Now, setting $\alpha = \beta$ in the symmetry condition (14) yields
$$p(y|\beta) = p(\tau_\beta(y)|0) \quad (18)$$
for any $y \in \Sigma$, $\beta \in R$.

We now define $G : \Sigma^n \longrightarrow \Sigma^n$ and $\tilde{\mathbf{y}}$ as follows: $\tilde{\mathbf{y}} = G(\mathbf{y})$ such that for all $i \in \mathcal{I}$, $\tilde{y}_i = \tau_\beta(y_i)$ where $\beta = c_i$. We note that $G$ is a bijection from the set $\Sigma^n$ to itself, and that if $\mathbf{y}, \mathbf{z} \in \Sigma^n$ and $\beta = c_i$ then $\|y_i - z_i\| = \|\tau_\beta(y_i) - \tau_\beta(z_i)\|$, so that $\|G(\mathbf{y}) - G(\mathbf{z})\| = \|\mathbf{y} - \mathbf{z}\|$, i.e. $G$ is isometric with respect to Euclidean distance in $\Sigma^n$.

We prove that the integral (16) may be transformed to (17) via the substitution $\tilde{\mathbf{y}} = G(\mathbf{y})$. First, we have
$$p(\mathbf{y}|\mathbf{c}) = \prod_{i \in \mathcal{I}} p(y_i|c_i) = \prod_{\beta \in R} \prod_{i \in \mathcal{I}, \, c_i = \beta} p(y_i|\beta) = \prod_{\beta \in R} \prod_{i \in \mathcal{I}, \, c_i = \beta} p(\tau_\beta(y_i)|0) = \prod_{\beta \in R} \prod_{i \in \mathcal{I}, \, c_i = \beta} p(\tilde{y}_i|0) = \prod_{i \in \mathcal{I}} p(\tilde{y}_i|0) = p(\tilde{\mathbf{y}}|\mathbf{0}) \; .$$
Since $G$ is isometric with respect to Euclidean distance in $\Sigma^n$, it follows that the Jacobian determinant of the transformation is equal to unity. Therefore, to complete the proof, we need only show that $\mathbf{y} \in B(\mathbf{c})$ if and only if $\tilde{\mathbf{y}} \in B(\mathbf{0})$.
We begin by relating the elements of $\boldsymbol{\Lambda}(\mathbf{y})$ to the elements of $\boldsymbol{\Lambda}(\tilde{\mathbf{y}})$. Let $i \in \mathcal{I}$, $\alpha \in R^-$, and suppose $c_i = \beta \in R$. We then have
$$\lambda^{(\alpha)}(y_i) = \log\left(\frac{p(y_i|0)}{p(y_i|\alpha)}\right) = \log\left(\frac{p(\tau_\beta(y_i)|-\beta)}{p(\tau_\beta(y_i)|\alpha-\beta)}\right) = \log\left(\frac{p(\tilde{y}_i|-\beta)}{p(\tilde{y}_i|\alpha-\beta)}\right) \; .$$
This yields
$$\lambda^{(\alpha)}(y_i) = \begin{cases} \lambda^{(\alpha)}(\tilde{y}_i) & \text{if } \beta = 0 \\ -\lambda^{(-\alpha)}(\tilde{y}_i) & \text{if } \alpha = \beta \\ \lambda^{(\alpha-\beta)}(\tilde{y}_i) - \lambda^{(-\beta)}(\tilde{y}_i) & \text{otherwise.} \end{cases}$$
Next, for any point $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$ we define a new point $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}})$ as follows. For $\beta = c_i$ and all $i \in \mathcal{I}$, $\alpha \in R^-$,
$$\tilde{f}_i^{(\alpha)} = \begin{cases} 1 - \sum_{\gamma \in R^-} f_i^{(\gamma)} & \text{if } \alpha = -\beta \\ f_i^{(\alpha+\beta)} & \text{otherwise.} \end{cases} \quad (19)$$
For all $j \in \mathcal{J}$, $\mathbf{r} \in \mathcal{C}_j$ we define $\tilde{w}_{j,\mathbf{r}} = w_{j,\mathbf{b}}$ where $\mathbf{b} = \mathbf{r} + \mathbf{x}_j(\mathbf{c})$.

Next we prove that for every $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$, the new point $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}})$ lies in $\mathcal{Q}$ and thus is a feasible solution for the LP. Constraints (8) and (9) obviously hold from the definition of $\tilde{\mathbf{w}}$. To verify (10), we let $j \in \mathcal{J}$, $i \in \mathcal{I}_j$ and $\alpha \in R^-$. We also let $\beta = c_i$. We now check two cases:
- If $\alpha = -\beta$,
$$\tilde{f}_i^{(\alpha)} = 1 - \sum_{\gamma \in R^-} f_i^{(\gamma)} = \sum_{\mathbf{b} \in \mathcal{C}_j} w_{j,\mathbf{b}} - \sum_{\gamma \in R^-} \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = \gamma} w_{j,\mathbf{b}} = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = 0} w_{j,\mathbf{b}} = \sum_{\mathbf{r} \in \mathcal{C}_j, \, r_i = \alpha} \tilde{w}_{j,\mathbf{r}} \; .$$
- If $\alpha \neq -\beta$,
$$\tilde{f}_i^{(\alpha)} = f_i^{(\alpha+\beta)} = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = \alpha+\beta} w_{j,\mathbf{b}} = \sum_{\mathbf{r} \in \mathcal{C}_j, \, r_i = \alpha} \tilde{w}_{j,\mathbf{r}} \; .$$
Therefore $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}}) \in \mathcal{Q}$, i.e. $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}})$ is a feasible solution for the LP. We write $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}}) = L(\mathbf{f}, \mathbf{w})$. We also note that the mapping $L$ is a bijection from $\mathcal{Q}$ to itself; this is easily shown by verifying the inverse
$$f_i^{(\alpha)} = \begin{cases} 1 - \sum_{\gamma \in R^-} \tilde{f}_i^{(\gamma)} & \text{if } \alpha = \beta \\ \tilde{f}_i^{(\alpha-\beta)} & \text{otherwise} \end{cases} \quad (20)$$
for all $i \in \mathcal{I}$, $\alpha \in R^-$, and $w_{j,\mathbf{b}} = \tilde{w}_{j,\mathbf{r}}$ where $\mathbf{r} = \mathbf{b} - \mathbf{x}_j(\mathbf{c})$ for all $j \in \mathcal{J}$, $\mathbf{b} \in \mathcal{C}_j$.

We now prove that for every $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$, $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}}) = L(\mathbf{f}, \mathbf{w})$ satisfies
$$\boldsymbol{\Lambda}(\mathbf{y}) \mathbf{f}^T - \boldsymbol{\Lambda}(\mathbf{y}) \Xi(\mathbf{c})^T = \boldsymbol{\Lambda}(\tilde{\mathbf{y}}) \tilde{\mathbf{f}}^T - \boldsymbol{\Lambda}(\tilde{\mathbf{y}}) \Xi(\mathbf{0})^T \; . \quad (21)$$
We achieve this by proving
$$\boldsymbol{\lambda}(y_i) \mathbf{f}_i^T - \boldsymbol{\lambda}(y_i) \xi(c_i)^T = \boldsymbol{\lambda}(\tilde{y}_i) \tilde{\mathbf{f}}_i^T - \boldsymbol{\lambda}(\tilde{y}_i) \xi(0)^T \quad (22)$$
for every $i \in \mathcal{I}$; we may then obtain (21) by summing (22) over $i \in \mathcal{I}$. Let $\beta = c_i$. We consider two cases:
- If $\beta = 0$, (22) becomes $\boldsymbol{\lambda}(y_i) \mathbf{f}_i^T = \boldsymbol{\lambda}(\tilde{y}_i) \tilde{\mathbf{f}}_i^T$, which holds since $\lambda^{(\alpha)}(\tilde{y}_i) = \lambda^{(\alpha)}(y_i)$ and $\tilde{f}_i^{(\alpha)} = f_i^{(\alpha)}$ for all $\alpha \in R^-$ in this case.
- If $\beta \neq 0$,
$$\boldsymbol{\lambda}(y_i) \mathbf{f}_i^T - \boldsymbol{\lambda}(y_i) \xi(c_i)^T = \sum_{\gamma \in R^-} \lambda^{(\gamma)}(y_i) f_i^{(\gamma)} - \lambda^{(\beta)}(y_i)$$
$$= \sum_{\gamma \in R^-, \, \gamma \neq \beta} \left( \lambda^{(\gamma-\beta)}(\tilde{y}_i) - \lambda^{(-\beta)}(\tilde{y}_i) \right) f_i^{(\gamma)} - \lambda^{(-\beta)}(\tilde{y}_i) f_i^{(\beta)} + \lambda^{(-\beta)}(\tilde{y}_i)$$
$$= \sum_{\alpha \in R^-, \, \alpha \neq -\beta} \lambda^{(\alpha)}(\tilde{y}_i) f_i^{(\alpha+\beta)} + \lambda^{(-\beta)}(\tilde{y}_i) \left( 1 - \sum_{\gamma \in R^-} f_i^{(\gamma)} \right)$$
$$= \sum_{\alpha \in R^-} \lambda^{(\alpha)}(\tilde{y}_i) \tilde{f}_i^{(\alpha)} = \boldsymbol{\lambda}(\tilde{y}_i) \tilde{\mathbf{f}}_i^T - \boldsymbol{\lambda}(\tilde{y}_i) \xi(0)^T \; ,$$
where we have made use of the substitution $\alpha = \gamma - \beta$ in the third line (and the fact that $\xi(0) = \mathbf{0}$). Therefore (22) holds, proving (21).

Finally, we note that it is easy to show, using (19) and (20), that $\mathbf{f} = \Xi(\mathbf{c})$ if and only if $\tilde{\mathbf{f}} = \Xi(\mathbf{0})$. Putting together these results, we may make the following statement. Suppose we are given $\mathbf{y}, \tilde{\mathbf{y}} \in \Sigma^n$ with $\tilde{\mathbf{y}} = G(\mathbf{y})$. Then the point $(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$ satisfies $\mathbf{f} \neq \Xi(\mathbf{c})$ and $\boldsymbol{\Lambda}(\mathbf{y})\mathbf{f}^T \leq \boldsymbol{\Lambda}(\mathbf{y})\Xi(\mathbf{c})^T$ if and only if the point $(\tilde{\mathbf{f}}, \tilde{\mathbf{w}}) = L(\mathbf{f}, \mathbf{w}) \in \mathcal{Q}$ satisfies $\tilde{\mathbf{f}} \neq \Xi(\mathbf{0})$ and $\boldsymbol{\Lambda}(\tilde{\mathbf{y}})\tilde{\mathbf{f}}^T \leq \boldsymbol{\Lambda}(\tilde{\mathbf{y}})\Xi(\mathbf{0})^T$. This statement, along with the fact that both $G$ and $L$ are bijective, proves that $\mathbf{y} \in B(\mathbf{c})$ if and only if $\tilde{\mathbf{y}} \in B(\mathbf{0})$.

We next provide, with details, some examples of modulator-channel combinations for which the symmetry condition holds.
Example 5.1: Discrete memoryless $q$-ary symmetric channel. Here we denote the ring elements by $R = \{a_0, a_1, \ldots, a_{q-1}\}$. Also $\Sigma = \{s_0, s_1, \ldots, s_{q-1}\}$, where the channel output probability conditioned on the channel input satisfies, for each $t, k \in \{0, 1, \ldots, q-1\}$,
$$p(s_t | a_k) = \begin{cases} 1 - p & \text{if } t = k \\ p/(q-1) & \text{otherwise,} \end{cases}$$
where $p$ represents the probability of transmission error. Here we may define the mapping $\tau_\beta$ for each $\beta \in R$ according to $\tau_\beta(s_t) = s_\ell$ where $a_\ell = a_t - \beta$, for all $t \in \{0, 1, \ldots, q-1\}$. It is easy to check that these mappings are bijective and satisfy the symmetry condition.

Example 5.2: Orthogonal modulation over AWGN.
Here $\Sigma = \mathbb{R}^q$ and, denoting the ring elements by $R = \{a_0, a_1, \ldots, a_{q-1}\}$, the modulation mapping may be written without loss of generality as $M : R \longrightarrow \mathbb{R}^q$, such that, for each $k = 0, 1, \ldots, q-1$, $M(a_k) = \mathbf{x} = (x^{(0)}, x^{(1)}, \ldots, x^{(q-1)})$, where
$$x^{(t)} = \begin{cases} 1 & \text{if } t = k \\ 0 & \text{otherwise.} \end{cases}$$
Here we may define the mapping $\tau_\beta$ for each $\beta \in R$ according to (where $\mathbf{y} = (y^{(0)}, y^{(1)}, \ldots, y^{(q-1)}) \in \mathbb{R}^q$ and $\mathbf{z} = (z^{(0)}, z^{(1)}, \ldots, z^{(q-1)}) \in \mathbb{R}^q$) $\tau_\beta(\mathbf{y}) = \mathbf{z}$, such that for each $\ell \in \{0, 1, \ldots, q-1\}$, $z^{(\ell)} = y^{(k)}$ where $a_k = a_\ell + \beta$. It is easily checked that these mappings are bijective and isometric, and satisfy the symmetry condition.
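The defining identity (14) for Example 5.1 is easy to verify numerically. The sketch below (ours) specializes the generic labels $a_k$, $s_t$ to indices in $\mathbb{Z}_q$, with hypothetical values $q = 5$ and $p = 0.1$:

```python
q, p = 5, 0.1  # hypothetical q-ary symmetric channel with error probability p

def prob(t, k):
    """Channel law p(s_t | a_k) for the q-ary symmetric channel."""
    return 1 - p if t == k else p / (q - 1)

def tau(beta, t):
    """tau_beta maps output s_t to s_{t - beta}; here indices live in Z_q."""
    return (t - beta) % q

# Symmetry condition (14): p(s_t | a_k) = p(tau_beta(s_t) | a_k - beta).
ok = all(prob(t, k) == prob(tau(beta, t), (k - beta) % q)
         for t in range(q) for k in range(q) for beta in range(q))
print(ok)  # True
```

The check succeeds because $\tau_\beta$ shifts the output index by exactly the same amount as the input index, so the event $t = k$ is preserved.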
Example 5.3: $q$-ary PSK modulation over AWGN. Here $\Sigma = \mathbb{C}$ and, again denoting the ring elements by $R = \{a_0, a_1, \ldots, a_{q-1}\}$, the modulation mapping may be written without loss of generality as $M : R \longrightarrow \mathbb{C}$ such that
$$M(a_k) = \exp\left(\frac{\imath 2 \pi k}{q}\right) \quad (23)$$
for $k = 0, 1, \ldots, q-1$ (here $\imath = \sqrt{-1}$). Here (18), together with the rotational symmetry of the $q$-ary PSK constellation, motivates us to define, for every $\beta = a_k \in R$,
$$\tau_\beta(x) = \exp\left(\frac{-\imath 2 \pi k}{q}\right) \cdot x \quad \forall x \in \mathbb{C} \; . \quad (24)$$
Next, we also impose the condition that $R$ under addition is a cyclic group. To see why we impose this condition, let $\alpha = a_k \in R$ and $\beta = a_l \in R$. By the symmetry condition we must have $p(y_i | \alpha + \beta) = p(\tau_{\alpha+\beta}(y_i) | 0)$ and also
$$p(y_i | \alpha + \beta) = p(\tau_\beta(y_i) | \alpha) = p(\tau_\alpha(\tau_\beta(y_i)) | 0) \; .$$
In order to equate these two expressions, we impose the condition $\tau_{\alpha+\beta}(x) = \tau_\alpha(\tau_\beta(x))$ for all $x \in \mathbb{C}$, $\alpha, \beta \in R$. Letting $\alpha + \beta = a_p \in R$ and using (24) yields
$$\exp\left(\frac{-\imath 2 \pi k}{q}\right) \cdot \exp\left(\frac{-\imath 2 \pi l}{q}\right) = \exp\left(\frac{-\imath 2 \pi p}{q}\right) \; ,$$
i.e. $p \equiv (k + l) \bmod q$. Therefore, we must have
$$a_k + a_l = a_{(k+l) \bmod q} \quad (25)$$
for all $a_k, a_l \in R$. This implies that $R$, under addition, is a cyclic group.

It is easy to check that the condition that $R$ under addition is cyclic, encapsulated by (25), along with the modulation mapping (23), satisfies the symmetry condition, where the appropriate mappings $\tau_\beta$ are given by (24). This means that codeword-independent performance is guaranteed for such systems using nonbinary codes with PSK modulation. This applies to AWGN, flat-fading wireless channels, and OFDM systems transmitting over frequency-selective channels with sufficiently long cyclic prefix.

VI. LINEAR-PROGRAMMING PSEUDOCODEWORDS
Definition 6.1: A linear-programming pseudocodeword (LP pseudocodeword) of the code $\mathcal{C}$, with parity-check matrix $H$, is a pair $(\mathbf{h}, \mathbf{z})$ where $\mathbf{h} \in \mathbb{R}^{(q-1)n}$ and $\mathbf{z} = (z_{j,\mathbf{b}})_{j \in \mathcal{J}, \, \mathbf{b} \in \mathcal{C}_j}$, where $z_{j,\mathbf{b}}$ is a nonnegative integer for all $j \in \mathcal{J}$, $\mathbf{b} \in \mathcal{C}_j$, such that the following constraints are satisfied:
$$\forall j \in \mathcal{J}, \; \forall i \in \mathcal{I}_j, \; \forall \alpha \in R^-, \quad h_i^{(\alpha)} = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = \alpha} z_{j,\mathbf{b}} \; , \quad (26)$$
and
$$\forall j \in \mathcal{J}, \quad \sum_{\mathbf{b} \in \mathcal{C}_j} z_{j,\mathbf{b}} = M \; , \quad (27)$$
where $M$ is a nonnegative integer independent of $j$.

It follows from (26) that $h_i^{(\alpha)}$ is a nonnegative integer for all $i \in \mathcal{I}$, $\alpha \in R^-$. We note that the further constraints
$$\forall j \in \mathcal{J}, \; \forall \mathbf{b} \in \mathcal{C}_j, \quad z_{j,\mathbf{b}} \leq M \; , \quad (28)$$
$$\forall i \in \mathcal{I}, \; \forall \alpha \in R^-, \quad 0 \leq h_i^{(\alpha)} \leq M \; , \quad (29)$$
and
$$\forall i \in \mathcal{I}, \quad \sum_{\alpha \in R^-} h_i^{(\alpha)} \leq M \; , \quad (30)$$
follow from the constraints (26) and (27). For each $i \in \mathcal{I}$, we also define
$$h_i^{(0)} = M - \sum_{\alpha \in R^-} h_i^{(\alpha)} \; . \quad (31)$$
By (30), $h_i^{(0)}$ is a nonnegative integer for all $i \in \mathcal{I}$. Now, for any $j \in \mathcal{J}$, $i \in \mathcal{I}_j$ we have
$$h_i^{(0)} = M - \sum_{\alpha \in R^-} h_i^{(\alpha)} = \sum_{\mathbf{b} \in \mathcal{C}_j} z_{j,\mathbf{b}} - \sum_{\alpha \in R^-} \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = \alpha} z_{j,\mathbf{b}} = \sum_{\mathbf{b} \in \mathcal{C}_j, \, b_i = 0} z_{j,\mathbf{b}} \; ,$$
where we have used (26) and (27).

Corresponding to the LP pseudocodeword $(\mathbf{h}, \mathbf{z})$ defined above, we define the normalized LP pseudocodeword as the vector obtained by scaling $(\mathbf{h}, \mathbf{z})$ by a factor $1/M$. We also define the $n \times q$ LP pseudocodeword matrix
$$\mathbf{H} = \left( h_i^{(\alpha)} \right)_{i \in \mathcal{I}; \, \alpha \in R} \; .$$
The normalized LP pseudocodeword matrix is defined as $(1/M) \cdot \mathbf{H}$.

Note that if we interpret $\{ z_{j,\mathbf{b}}/M \}$ (for each $j \in \mathcal{J}$) as a probability distribution for the local codeword $\mathbf{b} \in \mathcal{C}_j$, then the $i$-th row of the normalized LP pseudocodeword matrix (for $i \in \mathcal{I}$) can be interpreted as the corresponding probability distribution for the $i$-th coded symbol $c_i \in R$. This idea of interpreting pseudocodewords as probability distributions was used in [3] for the binary case.

Example 6.1:
As an illustration, we provide an LP pseudocodeword for the example $[4, 2]$ code over $\mathbb{Z}_3$ defined by the parity-check matrix (4). The reader may check that
$$(h_1^{(1)}, h_2^{(1)}, h_3^{(1)}, h_4^{(1)}) = (2\ 2\ 2\ 2) \quad (32)$$
and
$$(h_1^{(2)}, h_2^{(2)}, h_3^{(2)}, h_4^{(2)}) = (2\ 2\ 0\ 0), \quad (33)$$
together with
$$z_{1,b} = \begin{cases} 2 & \text{if } b = (2\ 1\ 1\ 0) \\ 2 & \text{if } b = (1\ 2\ 0\ 1) \\ 0 & \text{otherwise,} \end{cases} \quad (34)$$
and
$$z_{2,b} = \begin{cases} 2 & \text{if } b = (2\ 0\ 1) \\ 2 & \text{if } b = (1\ 1\ 0) \\ 0 & \text{otherwise,} \end{cases} \quad (35)$$
satisfy (26) and (27), where $M = 4$ in (27). We also obtain from (31)
$$(h_1^{(0)}, h_2^{(0)}, h_3^{(0)}, h_4^{(0)}) = (0\ 0\ 2\ 2).$$
Therefore (32)-(35) define an LP pseudocodeword, with pseudocodeword matrix
$$\mathbf{H} = \begin{pmatrix} 0 & 2 & 2 \\ 0 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 2 & 0 \end{pmatrix}. \quad (36)$$
The corresponding normalized LP pseudocodeword matrix is then given by
$$\frac{1}{4} \cdot \mathbf{H} = \begin{pmatrix} 0 & 1/2 & 1/2 \\ 0 & 1/2 & 1/2 \\ 1/2 & 1/2 & 0 \\ 1/2 & 1/2 & 0 \end{pmatrix}. \quad (37)$$
Here the probabilistic interpretation of this normalized LP pseudocodeword matrix corresponds to an equiprobable distribution of symbols from $\{1, 2\}$ for the first two symbols in the codeword, and an equiprobable distribution of symbols from $\{0, 1\}$ for the last two symbols in the codeword.

Theorem 6.1:
Assume that the all-zero codeword was transmitted.
1) If the LP decoder makes a codeword error, then there exists some LP pseudocodeword $(h, z)$, $h \ne 0$, such that $\Lambda(y)\, h^T \le 0$.
2) If there exists some LP pseudocodeword $(h, z)$, $h \ne 0$, such that $\Lambda(y)\, h^T < 0$, then the LP decoder makes a codeword error.

Proof:
The proof follows the lines of its counterpart in [7].
1) Let $(f, w)$ be the point in $Q$ which minimizes $\Lambda(y)\, f^T$. Suppose there is a codeword error; then $f \ne 0$, and we must have $\Lambda(y)\, f^T \le 0$.
Next, we construct the LP pseudocodeword $(h, z)$ as follows. Since the LP has rational coefficients, all elements of the vectors $f$ and $w$ must be rational. Let $M$ denote their lowest common denominator; since $f \ne 0$ we may have $M > 1$. Now set $h_i^{(\alpha)} = M \cdot f_i^{(\alpha)}$ for all $i \in I$, $\alpha \in R \setminus \{0\}$, and set $z_{j,b} = M \cdot w_{j,b}$ for all $j \in J$ and $b \in C_j$. By (8)-(10), $(h, z)$ is an LP pseudocodeword, and $h \ne 0$ since $f \ne 0$. Also, $\Lambda(y)\, f^T \le 0$ implies $\Lambda(y)\, h^T \le 0$.
2) Now, suppose that an LP pseudocodeword $(h, z)$ with $h \ne 0$ satisfies $\Lambda(y)\, h^T < 0$. Since $h \ne 0$ we have $M > 0$ in (27). Now, set $f_i^{(\alpha)} = h_i^{(\alpha)}/M$ for all $i \in I$, $\alpha \in R \setminus \{0\}$, and set $w_{j,b} = z_{j,b}/M$ for all $j \in J$ and $b \in C_j$. It is straightforward to check that $(f, w)$ satisfies all the constraints of the polytope $Q$. Also, $h \ne 0$ implies $f \ne 0$. Finally, $\Lambda(y)\, h^T < 0$ implies $\Lambda(y)\, f^T < 0$. Therefore, the LP decoder will make a codeword error.

VII. EQUIVALENCE BETWEEN PSEUDOCODEWORD CONCEPTS
A. Tanner Graphs and Graph-Cover Pseudocodewords
The Tanner graph of a linear code $C$ over $R$ is an equivalent characterization of the code's parity-check matrix $H$. The Tanner graph $G = (V, E)$ has vertex set $V = \{u_1, u_2, \ldots, u_n\} \cup \{v_1, v_2, \ldots, v_m\}$, and there is an edge between $u_i$ and $v_j$ if and only if $H_{j,i} \ne 0$. This edge is labelled with the value $H_{j,i}$. We denote by $N(v)$ the set of neighbors of a vertex $v \in V$.
For any word $c = (c_1, c_2, \ldots, c_n) \in R^n$, the Tanner graph allows an equivalent graphical statement of the condition $c \in C_j$ for each $j \in J$, as follows. The variable vertex $u_i$ is labelled with the value $c_i$ for each $i \in I$. Equation (2) (or (3)) is then equivalent to the condition that for vertex $v_j$, the sum, over all vertices in $N(v_j)$, of the vertex labels multiplied by the corresponding edge labels is zero. This graphical means of checking whether a parity-check is satisfied by $c \in R^n$ will be useful when defining graph-cover pseudocodewords later in this section.
To illustrate this concept, Figure 1 shows the Tanner graph for the codeword $c = (1\ 0\ 2\ 1)$ of the example $[4, 2]$ code over $\mathbb{Z}_3$ defined by the parity-check matrix (4). In Figure 1, edge labels are shown in square brackets, and vertex labels in round brackets. The reader may check that for each parity-check $j = 1, 2$, the sum, over all vertices in $N(v_j)$, of the vertex labels multiplied by the corresponding edge labels is zero.

Fig. 1. Tanner graph for the example $[4, 2]$ code over $\mathbb{Z}_3$. Edge labels are shown in square brackets, and vertex labels in round brackets. For each parity-check $j$, the sum, over all vertices in $N(v_j)$, of the vertex labels multiplied by the corresponding edge labels is zero; therefore all parity-checks are satisfied.

We next define what is meant by a finite cover of a Tanner graph.
Definition 7.1: ([4]) A graph $\tilde{G} = (\tilde{V}, \tilde{E})$ is a finite cover of the Tanner graph $G = (V, E)$ if there exists a mapping $\Pi :$
$\tilde{V} \longrightarrow V$ which is a graph homomorphism ($\Pi$ takes adjacent vertices of $\tilde{G}$ to adjacent vertices of $G$), such that for every vertex $v \in G$ and every $\tilde{v} \in \Pi^{-1}(v)$, the neighborhood $N(\tilde{v})$ of $\tilde{v}$ (including edge labels) is mapped bijectively to $N(v)$.
Definition 7.2: ([4]) A cover of the graph $G$ is said to have degree $M$, where $M$ is a positive integer, if $|\Pi^{-1}(v)| = M$ for every vertex $v \in V$. We refer to such a cover graph as an $M$-cover of $G$.
Fix some positive integer $M$. Let $\tilde{G} = (\tilde{V}, \tilde{E})$ be an $M$-cover of the Tanner graph $G = (V, E)$ representing the code $C$ with parity-check matrix $H$. The vertices in the set $\Pi^{-1}(u_i)$ are called copies of $u_i$ and are denoted $\{u_{i,1}, u_{i,2}, \ldots, u_{i,M}\}$, where $i \in I$. Similarly, the vertices in the set $\Pi^{-1}(v_j)$ are called copies of $v_j$ and are denoted $\{v_{j,1}, v_{j,2}, \ldots, v_{j,M}\}$, where $j \in J$.
Less formally, given a code $C$ with parity-check matrix $H$ and corresponding Tanner graph $G$, an $M$-cover of $G$ is a graph whose vertex set consists of $M$ copies of $u_i$ and $M$ copies of $v_j$, such that for each $j \in J$, $i \in I_j$, the $M$ copies of $u_i$ and the $M$ copies of $v_j$ are connected in an arbitrary one-to-one fashion, with edges labelled by the value $H_{j,i}$.
For any $M \ge 1$, a graph-cover pseudocodeword is a labelling of vertices of the $M$-cover graph with values from $R$ such that all parity-checks are satisfied. We denote the label of $u_{i,\ell}$ by $p_{i,\ell}$ for each $i \in I$, $\ell = 1, 2, \ldots, M$, and we may then write the graph-cover pseudocodeword in vector form as
$$p = (p_{1,1}, p_{1,2}, \ldots, p_{1,M},\ p_{2,1}, p_{2,2}, \ldots, p_{2,M},\ \ldots,\ p_{n,1}, p_{n,2}, \ldots, p_{n,M}).$$
It is easily seen that $p$ belongs to a linear code $\tilde{C}$ of length $Mn$ over $R$, defined by an $Mm \times Mn$ parity-check matrix $\tilde{H}$. To construct $\tilde{H}$, for $1 \le i^*, j^* \le M$ and $i \in I$, $j \in J$, we let $i' = (i-1)M + i^*$, $j' = (j-1)M + j^*$, and so
$$\tilde{H}_{j',i'} = \begin{cases} H_{j,i} & \text{if } u_{i,i^*} \in N(v_{j,j^*}) \\ 0 & \text{otherwise.} \end{cases}$$
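The construction of $\tilde{H}$ just described lends itself to a direct computational check. The sketch below builds the lifted matrix from $H$, a cover degree $M$, and one permutation per Tanner-graph edge (the choice of permutations is the "arbitrary one-to-one fashion" of the text). The base matrix over $\mathbb{Z}_3$ is a hypothetical stand-in consistent with this section's running example, since the paper's matrix (4) lies outside this excerpt.

```python
import numpy as np

def cover_pcm(H, M, perms):
    """Parity-check matrix of an M-cover of the Tanner graph of H.

    H     : m x n integer matrix over Z_q.
    perms : dict mapping each edge (j, i) with H[j, i] != 0 to a
            permutation of range(M); copy u_{i,t} is joined to
            check copy v_{j, perms[(j, i)][t]}.
    Row (j-1)M + j* and column (i-1)M + i* follow the indexing
    i' = (i-1)M + i*, j' = (j-1)M + j* used in the text (0-based here).
    """
    m, n = H.shape
    Ht = np.zeros((m * M, n * M), dtype=H.dtype)
    for j in range(m):
        for i in range(n):
            if H[j, i] == 0:
                continue
            for t in range(M):  # t-th copy of variable vertex u_i
                Ht[j * M + perms[(j, i)][t], i * M + t] = H[j, i]
    return Ht

# Hypothetical base matrix over Z_3 (stand-in for the paper's matrix (4)).
H = np.array([[1, 2, 2, 1], [2, 0, 1, 2]])
M = 2
# Identity permutations give M disjoint copies of the base graph, so any
# base codeword, repeated once per copy, satisfies all lifted checks.
perms = {(j, i): list(range(M)) for j in range(2) for i in range(4) if H[j, i]}
Ht = cover_pcm(H, M, perms)
c = np.array([1, 0, 2, 1])   # a codeword of the base code: H c = 0 mod 3
p = np.repeat(c, M)          # label every copy of u_i with c_i
assert np.all(Ht.dot(p) % 3 == 0)
```

Choosing non-identity permutations connects the copies across layers; exactly such choices produce the non-trivial pseudocodewords discussed below.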
It may be seen that $\tilde{G}$ is the Tanner graph of the code $\tilde{C}$ corresponding to the parity-check matrix $\tilde{H}$.
We also define the $n \times q$ graph-cover pseudocodeword matrix $\mathbf{P} = (m_i^{(\alpha)})_{i \in I;\, \alpha \in R}$, where
$$m_i^{(\alpha)} = |\{\ell \in \{1, 2, \ldots, M\} : p_{i,\ell} = \alpha\}| \ge 0,$$
for $i \in I$, $\alpha \in R$, i.e. $m_i^{(\alpha)}$ is equal to the number of copies of $u_i$ which are labelled with $\alpha$, for each $i \in I$, $\alpha \in R$. The normalized graph-cover pseudocodeword matrix is defined as $(1/M) \cdot \mathbf{P}$. This matrix representation is similar to that defined in [16]. Note that the $i$-th row of the normalized graph-cover pseudocodeword matrix (for $i \in I$) can be viewed as a probability distribution for the $i$-th coded symbol $c_i \in R$, in a similar manner to the case of the normalized LP pseudocodeword matrix.
Another representation, which we shall use in Section IX, is the graph-cover pseudocodeword vector $m = (m_i)_{i \in I}$ where $m_i = (m_i^{(\alpha)})_{\alpha \in R \setminus \{0\}}$ for each $i \in I$. Correspondingly, the normalized graph-cover pseudocodeword vector is given by $(1/M) \cdot m \in \mathbb{R}^{(q-1)n}$.
It is easily seen that for any $c \in C$, the labelling of $u_{i,\ell}$ by the value $c_i$ for all $i \in I$, $\ell = 1, 2, \ldots, M$, trivially yields a pseudocodeword for all $M$-covers of $G$, $M \ge 1$. However, non-trivial pseudocodewords exist in general.

Example 7.1: To illustrate these concepts, a graph-cover pseudocodeword is shown in Figure 2 for the example $[4, 2]$ code over $\mathbb{Z}_3$ defined by the parity-check matrix (4). Here the degree of the cover graph is $M = 4$, the labelling is $p = (1\ 1\ 2\ 2 \mid \cdots)$, with the remaining entries given by the vertex labels in Figure 2, and the parity-check matrix of the code $\tilde{C}$ is the $8 \times 16$ matrix $\tilde{H}$ determined by the cover graph of Figure 2 via the construction above. Also, the graph-cover pseudocodeword matrix corresponding to $p$ is
$$\mathbf{P} = \begin{pmatrix} 0 & 2 & 2 \\ 0 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 2 & 0 \end{pmatrix}, \quad (38)$$
and the normalized graph-cover pseudocodeword matrix is $(1/4) \cdot \mathbf{P}$. The graph-cover pseudocodeword vector corresponding to $p$ is
$$m = (2\ 2 \mid 2\ 2 \mid 2\ 0 \mid 2\ 0),$$
and the normalized graph-cover pseudocodeword vector is $(1/4) \cdot m$.
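The counting that produces $\mathbf{P}$ from a labelling $p$ can be sketched in a few lines. The labelling used below is hypothetical (the exact copy ordering in Figure 2 is not reproduced in this excerpt), but it is chosen so that the per-symbol label counts reproduce the matrix (38); any copy ordering with the same counts yields the same $\mathbf{P}$.

```python
from collections import Counter

def pseudocodeword_matrix(p, n, M, q):
    """n x q matrix whose (i, alpha) entry counts copies of u_i labelled alpha.

    p is the labelling (p_{1,1}, ..., p_{1,M}, ..., p_{n,1}, ..., p_{n,M}),
    flattened exactly as in the vector form defined in the text.
    """
    P = [[0] * q for _ in range(n)]
    for i in range(n):
        for alpha, count in Counter(p[i * M:(i + 1) * M]).items():
            P[i][alpha] = count
    return P

# Hypothetical labelling consistent with the matrices (36)/(38): symbols 1
# and 2 split evenly between values 1 and 2, symbols 3 and 4 between 0 and 1.
p = [1, 1, 2, 2,  1, 1, 2, 2,  0, 0, 1, 1,  0, 0, 1, 1]
P = pseudocodeword_matrix(p, n=4, M=4, q=3)
assert P == [[0, 2, 2], [0, 2, 2], [2, 2, 0], [2, 2, 0]]
# Every row sums to M, so each row of (1/M)*P is a probability distribution.
assert all(sum(row) == 4 for row in P)
```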
Fig. 2. Cover graph of degree 4 and corresponding graph-cover pseudocodeword for the example $[4, 2]$ code over $\mathbb{Z}_3$ with parity-check matrix given by (4). Edge labels are shown in square brackets, and vertex labels in round brackets. This graph-cover pseudocodeword corresponds to the LP pseudocodeword described by (32)-(35) via the correspondence described in the proof of Theorem 7.1.

B. Equivalence between LP Pseudocodewords and Graph-Cover Pseudocodewords
In this section, we show the equivalence between the set of LP pseudocodewords and the setof graph-cover pseudocodewords. The result is summarized in the following theorem.
Theorem 7.1:
Let $C$ be a linear code over the ring $R$ with parity-check matrix $H$ and corresponding Tanner graph $G$. Then, there exists an LP pseudocodeword $(h, z)$ with pseudocodeword matrix $\mathbf{H}$ if and only if there exists a graph-cover pseudocodeword for some $M$-cover of $G$ with the same pseudocodeword matrix.

Proof:
1) Let $(h, z)$ be an LP pseudocodeword of $C$, and let $G = (V, E)$ be the Tanner graph associated with the parity-check matrix $H$. We construct an $M$-cover $\tilde{G} = (\tilde{V}, \tilde{E})$, where $M = \sum_{b \in C_j} z_{j,b}$, and a corresponding graph-cover pseudocodeword, as follows. We begin with the vertex set, which consists of $M$ copies of $u_i$, $i \in I$, and $M$ copies of $v_j$, $j \in J$. Then we proceed as follows:
• Label $h_i^{(\alpha)}$ copies of $u_i$ with the value $\alpha$, for each $i \in I$, $\alpha \in R$. By (31), all copies of $u_i$ are labelled.
• Label $z_{j,b}$ copies of $v_j$ with the value $b$, for every $j \in J$, $b \in C_j$. By (27), all copies of $v_j$ are labelled.
• Next, let $T_i^{(\alpha)}$ denote the set of copies of $u_i$ labelled with the value $\alpha$, for $i \in I$, $\alpha \in R$. Also, for all $i \in I$, $j \in J$, $\alpha \in R$, let $R_{i,j}^{(\alpha)}$ denote the set of copies of $v_j$ whose label $b$ satisfies $b_i = \alpha$. The vertices in $T_i^{(\alpha)}$ and the vertices in $R_{i,j}^{(\alpha)}$ are then connected by edges in an arbitrary one-to-one fashion, for every $j \in J$, $i \in I_j$, $\alpha \in R$. All of these edges are labelled with the value $H_{j,i}$.
First, we note that this is possible because
$$|T_i^{(\alpha)}| = h_i^{(\alpha)} = \sum_{b \in C_j,\, b_i = \alpha} z_{j,b} = |R_{i,j}^{(\alpha)}|$$
for every $j \in J$, $i \in I_j$, $\alpha \in R$. Here we have used (26).
Second, we note that all checks are satisfied by this labelling. For $j \in J$, consider any copy of $v_j$ with label $b$. By construction of the graph, the sum, over all vertices in $N(v_j)$, of the vertex labels multiplied by the corresponding edge labels is $\sum_{i \in I_j} b_i \cdot H_{j,i}$, which is zero because $b \in C_j$. Therefore, this vertex labelling yields a graph-cover pseudocodeword of the code $C$ with parity-check matrix $H$.
2) Now suppose that there exists a graph-cover pseudocodeword corresponding to some $M$-cover of the Tanner graph $G$ of $C$. Then,
• Step 1: for every $i \in I$, and for every $\alpha \in R \setminus \{0\}$, we define $h_i^{(\alpha)}$ to be the number of copies of $u_i$ labelled with the value $\alpha$.
• Step 2: for every copy of $v_j$, $j \in J$, label the copy with the word $b$, where $b_i$ is equal to the label on the neighbouring copy of $u_i$, $i \in I_j$. Then, for every $j \in J$, $b \in C_j$, we define $z_{j,b}$ to be the number of copies of $v_j$ labelled with the word $b$.
Step 2 ensures that the $z_{j,b}$ are nonnegative integers for all $j \in J$ and $b \in C_j$, and that (27) holds. Also, to show that (26) holds, we reason as follows. The right-hand side of (26) counts the number of copies of $v_j$ whose labels $b$ satisfy $b_i = \alpha$. By step 2, this is equal to the number of copies of $u_i$ labelled with $\alpha$, which by step 1 is equal to the left-hand side of (26). Therefore, $(h, z)$ is an LP pseudocodeword of the code $C$ with parity-check matrix $H$.
As an illustration of the correspondences described in this proof, consider the example $[4, 2]$ code over $\mathbb{Z}_3$ defined by the parity-check matrix (4). First, note that the LP pseudocodeword of (32)-(35) and the graph-cover pseudocodeword of Figure 2 have the same pseudocodeword matrix, via (36) and (38). Indeed, the reader may check that each pseudocodeword may be derived from the other using the correspondences described in the proof of Theorem 7.1.
The next corollary follows immediately from Theorem 7.1.

Corollary 7.2:
Let $C$ be a linear code over the ring $R$ with parity-check matrix $H$ and corresponding Tanner graph $G$. Then, there exists a (normalized) LP pseudocodeword $(h, z)$ if and only if there exists a graph-cover pseudocodeword for some $M$-cover of $G$ with (normalized) graph-cover pseudocodeword vector $h$.
Note that this corollary contains two different equivalences, one for normalized objects and the other for non-normalized ones.

VIII. ALTERNATIVE POLYTOPE REPRESENTATION
In this section, we present an alternative polytope for use with linear-programming decoding. This polytope may be regarded as a generalization of the "high-density polytope" defined in [7]. As we show in this section, the new polytope may under some circumstances yield a complexity advantage over the polytope of Section III. In the sequel, we will analyze the properties of this polytope.
First, we introduce some convenient notation and definitions. Recall that the ring $R$ contains $q - 1$ non-zero elements; correspondingly, for vectors $k \in \mathbb{N}^{q-1}$, we adopt the notation $k = (k_\alpha)_{\alpha \in R \setminus \{0\}}$. For each $j \in J$, we define the mapping $\kappa_j : C_j \longrightarrow \mathbb{N}^{q-1}$, $b \mapsto \kappa_j(b)$, defined by
$$(\kappa_j(b))_\alpha = |\{i \in I_j : b_i \cdot H_{j,i} = \alpha\}|$$
for all $\alpha \in R \setminus \{0\}$. We may then characterize the image of $\kappa_j$, which we denote by $T_j$, as
$$T_j = \left\{ k \in \mathbb{N}^{q-1} : \sum_{\alpha \in R \setminus \{0\}} \alpha \cdot k_\alpha = 0 \text{ and } \sum_{\alpha \in R \setminus \{0\}} k_\alpha \le d_j \right\},$$
for each $j \in J$, where, for any $k \in \mathbb{N}$, $\alpha \in R$,
$$\alpha \cdot k = \begin{cases} 0 & \text{if } k = 0 \\ \alpha + \cdots + \alpha \ (k \text{ terms in sum}) & \text{if } k > 0. \end{cases}$$
Note that $\kappa_j$ is not a bijection, in general. We say that a local codeword $b \in C_j$ is $k$-constrained over $C_j$ if $\kappa_j(b) = k$.
Next, for any index set $\Gamma \subseteq I$, we introduce the following definitions. Let $N = |\Gamma|$. We define the single-parity-check code, over vectors indexed by $\Gamma$, by
$$C_\Gamma = \left\{ a = (a_i)_{i \in \Gamma} \in R^N : \sum_{i \in \Gamma} a_i = 0 \right\}. \quad (39)$$
Also define a mapping $\kappa_\Gamma : C_\Gamma \longrightarrow \mathbb{N}^{q-1}$ by $(\kappa_\Gamma(a))_\alpha = |\{i \in \Gamma : a_i = \alpha\}|$, and define, for $k \in T_j$,
$$C_\Gamma^{(k)} = \{a \in C_\Gamma : \kappa_\Gamma(a) = k\}.$$
Below, we define a new polytope for decoding. Recall that $y = (y_1, y_2, \ldots, y_n) \in \Sigma^n$ stands for the received (corrupted) word. In the sequel, we make use of the following variables:
• For all $i \in I$ and all $\alpha \in R \setminus \{0\}$, we have a variable $f_i^{(\alpha)}$. This variable is an indicator of the event $c_i = \alpha$.
• For all $j \in J$ and $k \in T_j$, we have a variable $\sigma_{j,k}$. Similarly to its counterpart in [7], this variable indicates the contribution to parity-check $j$ of $k$-constrained local codewords over $C_j$.
• For all $j \in J$, $i \in I_j$, $k \in T_j$, $\alpha \in R \setminus \{0\}$, we have a variable $z_{i,j,k}^{(\alpha)}$. This variable indicates the portion of $f_i^{(\alpha)}$ assigned to $k$-constrained local codewords over $C_j$.
Motivated by these variable definitions, for all $j \in J$ we impose the following set of constraints:
$$\forall i \in I_j,\ \forall \alpha \in R \setminus \{0\},\quad f_i^{(\alpha)} = \sum_{k \in T_j} z_{i,j,k}^{(\alpha)}. \quad (40)$$
$$\sum_{k \in T_j} \sigma_{j,k} = 1. \quad (41)$$
$$\forall k \in T_j,\ \forall \alpha \in R \setminus \{0\},\quad \sum_{i \in I_j,\ \beta \in R \setminus \{0\},\ \beta H_{j,i} = \alpha} z_{i,j,k}^{(\beta)} = k_\alpha \cdot \sigma_{j,k}. \quad (42)$$
$$\forall i \in I_j,\ \forall k \in T_j,\ \forall \alpha \in R \setminus \{0\},\quad z_{i,j,k}^{(\alpha)} \ge 0. \quad (43)$$
$$\forall i \in I_j,\ \forall k \in T_j,\quad \sum_{\alpha \in R \setminus \{0\}}\ \sum_{\beta \in R \setminus \{0\},\ \beta H_{j,i} = \alpha} z_{i,j,k}^{(\beta)} \le \sigma_{j,k}. \quad (44)$$
We note that the further constraints
$$\forall i \in I,\ \forall \alpha \in R \setminus \{0\},\quad 0 \le f_i^{(\alpha)} \le 1, \quad (45)$$
$$\forall j \in J,\ \forall k \in T_j,\quad 0 \le \sigma_{j,k} \le 1, \quad (46)$$
and
$$\forall j \in J,\ \forall i \in I_j,\ \forall k \in T_j,\ \forall \alpha \in R \setminus \{0\},\quad z_{i,j,k}^{(\alpha)} \le \sigma_{j,k}, \quad (47)$$
follow from constraints (40)-(44). We denote by $U$ the polytope formed by constraints (40)-(44).
Let $T = \max_{j \in J} |T_j|$. Then, upper bounds on the number of variables and constraints in this LP are given by $n(q-1) + m(d(q-1)+1)T$ and $m(d(q-1)+1) + m((d+1)(q-1)+d)T$, respectively. Since $T \le \binom{d+q-1}{d}$, the numbers of variables and constraints are $O(mq \cdot d^q)$, which, for many families of codes, is significantly lower than the corresponding complexity for the polytope $Q$.
For notational simplicity in proofs in this section, it is convenient to define a new set of variables as follows:
$$\forall j \in J,\ \forall i \in I_j,\ \forall k \in T_j,\ \forall \alpha \in R \setminus \{0\},\quad \tau_{i,j,k}^{(\alpha)} = \sum_{\beta \in R \setminus \{0\},\ \beta H_{j,i} = \alpha} z_{i,j,k}^{(\beta)}. \quad (48)$$
Then constraints (42) and (44) may be rewritten as
$$\forall j \in J,\ \forall k \in T_j,\ \forall \alpha \in R \setminus \{0\},\quad \sum_{i \in I_j} \tau_{i,j,k}^{(\alpha)} = k_\alpha \cdot \sigma_{j,k}, \quad (49)$$
and
$$\forall j \in J,\ \forall i \in I_j,\ \forall k \in T_j,\quad 0 \le \sum_{\alpha \in R \setminus \{0\}} \tau_{i,j,k}^{(\alpha)} \le \sigma_{j,k}. \quad (50)$$
Note that the variables $\tau$ do not form part of the LP description, and therefore do not contribute to its complexity. However, these variables will provide a convenient notational shorthand for proving results in this section.
We will prove that optimizing the cost function (5) over this new polytope is equivalent to optimizing over $Q$. First, we state the following proposition, which will be necessary to prove this result.

Proposition 8.1:
Let $M \in \mathbb{N}$ and $k \in \mathbb{N}^{q-1}$. Also let $\Gamma \subseteq I$. Assume that for each $\alpha \in R \setminus \{0\}$ we have a set of nonnegative integers $X^{(\alpha)} = \{x_i^{(\alpha)} : i \in \Gamma\}$, and that together these satisfy the constraints
$$\sum_{i \in \Gamma} x_i^{(\alpha)} = k_\alpha M \quad (51)$$
for all $\alpha \in R \setminus \{0\}$, and
$$\sum_{\alpha \in R \setminus \{0\}} x_i^{(\alpha)} \le M \quad (52)$$
for all $i \in \Gamma$.
Then, there exist nonnegative integers $\{w_a : a \in C_\Gamma^{(k)}\}$ such that
1) $$\sum_{a \in C_\Gamma^{(k)}} w_a = M. \quad (53)$$
2) For all $\alpha \in R \setminus \{0\}$, $i \in \Gamma$,
$$x_i^{(\alpha)} = \sum_{a \in C_\Gamma^{(k)},\ a_i = \alpha} w_a. \quad (54)$$
The proof of this proposition appears in the Appendix. We now prove the main result.

Theorem 8.2:
The set $\bar{U} = \{f : \exists\, \sigma, z \text{ s.t. } (f, \sigma, z) \in U\}$ is equal to the set $\bar{Q} = \{f : \exists\, w \text{ s.t. } (f, w) \in Q\}$. Therefore, optimizing the linear cost function (5) over $U$ is equivalent to optimizing over $Q$.

Proof:
1) Suppose $(f, w) \in Q$. For all $j \in J$, $k \in T_j$, we define
$$\sigma_{j,k} = \sum_{b \in C_j,\ \kappa_j(b) = k} w_{j,b},$$
and for all $j \in J$, $i \in I_j$, $k \in T_j$, $\alpha \in R \setminus \{0\}$, we define
$$z_{i,j,k}^{(\alpha)} = \sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i = \alpha} w_{j,b}.$$
It is straightforward to check that constraints (43) and (44) are satisfied by these definitions. For every $j \in J$, $i \in I_j$, $\alpha \in R \setminus \{0\}$, we have by (10)
$$f_i^{(\alpha)} = \sum_{b \in C_j,\ b_i = \alpha} w_{j,b} = \sum_{k \in T_j}\ \sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i = \alpha} w_{j,b} = \sum_{k \in T_j} z_{i,j,k}^{(\alpha)},$$
and thus constraint (40) is satisfied.
Next, for every $j \in J$, we have by (9)
$$1 = \sum_{b \in C_j} w_{j,b} = \sum_{k \in T_j}\ \sum_{b \in C_j,\ \kappa_j(b) = k} w_{j,b} = \sum_{k \in T_j} \sigma_{j,k},$$
and thus constraint (41) is satisfied.
Finally, for every $j \in J$, $k \in T_j$, $\alpha \in R \setminus \{0\}$,
$$\sum_{i \in I_j,\ \beta \in R \setminus \{0\},\ \beta H_{j,i} = \alpha} z_{i,j,k}^{(\beta)} = \sum_{i \in I_j,\ \beta \in R \setminus \{0\},\ \beta H_{j,i} = \alpha}\ \sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i = \beta} w_{j,b} = \sum_{b \in C_j,\ \kappa_j(b) = k}\ \sum_{i \in I_j,\ b_i H_{j,i} = \alpha} w_{j,b} = \sum_{b \in C_j,\ \kappa_j(b) = k} k_\alpha \cdot w_{j,b} = k_\alpha \cdot \sigma_{j,k},$$
and thus constraint (42) is satisfied.
2) Now suppose that $(f, \sigma, z)$ is a vertex of the polytope $U$; then all variables are rational, as are the variables $\tau$. Next, fix some $j \in J$, $k \in T_j$, and consider the sets
$$X_0^{(\alpha)} = \left\{ \frac{\tau_{i,j,k}^{(\alpha)}}{\sigma_{j,k}} : i \in I_j \right\}$$
for $\alpha \in R \setminus \{0\}$. By constraint (50), for each $\alpha \in R \setminus \{0\}$, all the values in the set $X_0^{(\alpha)}$ are rational numbers between 0 and 1. Let $\mu$ be the lowest common denominator of all the numbers in all the sets $X_0^{(\alpha)}$, $\alpha \in R \setminus \{0\}$. Let
$$X^{(\alpha)} = \left\{ \frac{\mu \cdot \tau_{i,j,k}^{(\alpha)}}{\sigma_{j,k}} : i \in I_j \right\},$$
for each $\alpha \in R \setminus \{0\}$. The sets $X^{(\alpha)}$ consist of integers between 0 and $\mu$. By constraint (49), we must have that for every $\alpha \in R \setminus \{0\}$, the sum of the elements in $X^{(\alpha)}$ is equal to $k_\alpha \mu$. By constraint (50), we have
$$\sum_{\alpha \in R \setminus \{0\}} \frac{\mu \cdot \tau_{i,j,k}^{(\alpha)}}{\sigma_{j,k}} \le \mu$$
for all $i \in I_j$.
We now apply the result of Proposition 8.1 with $\Gamma = I_j$, $M = \mu$ and with the sets $X^{(\alpha)}$ defined as above (here $N = d_j$).
Set the variables $\{w_a : a \in C_\Gamma^{(k)}\}$ according to Proposition 8.1.
Next, for $k \in T_j$, we show how to define the variables $\{w'_b : b \in C_j,\ \kappa_j(b) = k\}$. Initially, we set $w'_b = 0$ for all $b \in C_j$ with $\kappa_j(b) = k$. Observe that the values $\mu \cdot z_{i,j,k}^{(\beta)}/\sigma_{j,k}$ are nonnegative integers for every $i \in I$, $j \in J$, $k \in T_j$, $\beta \in R \setminus \{0\}$.
For every $a \in C_\Gamma^{(k)}$, we define $w_a$ words $b^{(1)}, b^{(2)}, \ldots, b^{(w_a)} \in C_j$. Assume some ordering on the elements $\beta \in R \setminus \{0\}$ satisfying $\beta H_{j,i} = a_i$, namely $\beta_1, \beta_2, \ldots, \beta_\ell$ for some positive integer $\ell$. For $i \in I_j$, the entry $b_i^{(\ell)}$ ($\ell = 1, 2, \ldots, w_a$) is defined as follows: $b_i^{(\ell)}$ is equal to $\beta_1$ for the first $\mu \cdot z_{i,j,k}^{(\beta_1)}/\sigma_{j,k}$ words $b^{(1)}, b^{(2)}, \ldots$; $b_i^{(\ell)}$ is equal to $\beta_2$ for the next $\mu \cdot z_{i,j,k}^{(\beta_2)}/\sigma_{j,k}$ words, and so on. For every $b \in C_j$ we define
$$w'_b = \left| \left\{ i \in \{1, 2, \ldots, w_a\} : b^{(i)} = b \right\} \right|.$$
Finally, for every $b \in C_j$ with $\kappa_j(b) = k$, we define
$$w_{j,b} = \frac{\sigma_{j,k}}{\mu} \cdot w'_b.$$
Using Proposition 8.1,
$$\sum_{a \in C_\Gamma^{(k)},\ a_i = \alpha} w_a = \frac{\mu \cdot \tau_{i,j,k}^{(\alpha)}}{\sigma_{j,k}} = \sum_{\beta :\ \beta H_{j,i} = \alpha} \frac{\mu \cdot z_{i,j,k}^{(\beta)}}{\sigma_{j,k}},$$
and so all $b^{(1)}, b^{(2)}, \ldots, b^{(w_a)}$ (for all $a \in C_\Gamma^{(k)}$) are well-defined. It is also straightforward to see that $b^{(\ell)} \in C_j$ for $\ell = 1, 2, \ldots, w_a$. Next, we check that the newly-defined $w_{j,b}$ satisfy (8)-(10) for every $j \in J$, $b \in C_j$.
It is easy to see that $w_{j,b} \ge 0$; therefore (8) holds. By Proposition 8.1 we obtain
$$\sigma_{j,k} = \sum_{b \in C_j,\ \kappa_j(b) = k} w_{j,b},$$
for all $j \in J$, $k \in T_j$, and
$$\tau_{i,j,k}^{(\alpha)} = \sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i H_{j,i} = \alpha} w_{j,b},$$
for all $j \in J$, $i \in I_j$, $k \in T_j$, $\alpha \in R \setminus \{0\}$. Let $\beta H_{j,i} = \alpha$.
Since
$$\tau_{i,j,k}^{(\alpha)} = \sum_{\beta :\ \beta H_{j,i} = \alpha} z_{i,j,k}^{(\beta)},$$
by the definition of $w_{j,b}$ it follows that
$$\sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i = \beta} w_{j,b} = \frac{z_{i,j,k}^{(\beta)}}{\tau_{i,j,k}^{(\alpha)}} \cdot \sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i H_{j,i} = \alpha} w_{j,b} = z_{i,j,k}^{(\beta)},$$
where the first equality is due to the definition of the words $b^{(\ell)}$, $\ell = 1, 2, \ldots, w_a$.
By constraint (41) we have, for all $j \in J$,
$$1 = \sum_{k \in T_j} \sigma_{j,k} = \sum_{k \in T_j}\ \sum_{b \in C_j,\ \kappa_j(b) = k} w_{j,b} = \sum_{b \in C_j} w_{j,b},$$
thus satisfying (9). Finally, for all $j \in J$, $i \in I_j$, $\beta \in R \setminus \{0\}$,
$$f_i^{(\beta)} = \sum_{k \in T_j} z_{i,j,k}^{(\beta)} = \sum_{k \in T_j}\ \sum_{b \in C_j,\ \kappa_j(b) = k,\ b_i = \beta} w_{j,b} = \sum_{b \in C_j,\ b_i = \beta} w_{j,b},$$
thus satisfying (10).

IX. CASCADED POLYTOPE REPRESENTATION
In this section we show that the "cascaded polytope" representation described in [21], [22] and [23] can be extended to nonbinary codes in a straightforward manner. Below, we elaborate on the details.
For $j \in J$, consider the $j$-th row $H_j$ of the parity-check matrix $H$ over $R$, and recall that
$$C_j = \left\{ (b_i)_{i \in I_j} : \sum_{i \in I_j} b_i \cdot H_{j,i} = 0 \right\}.$$
Assume that $I_j = \{i_1, i_2, \ldots, i_{d_j}\}$ and denote $L_j = \{1, 2, \ldots, d_j - 3\}$. We introduce new variables $\chi_j = (\chi_{j,i})_{i \in L_j}$, and denote $\chi = (\chi_j)_{j \in J}$. We define a new linear code $C_j^{(\chi)}$ of length $2d_j - 3$ by the $(d_j - 2) \times (2d_j - 3)$ parity-check matrix $F_j$ associated with the following set of parity-check equations over $R$:
1) $$b_{i_1} H_{j,i_1} + b_{i_2} H_{j,i_2} + \chi_{j,1} = 0. \quad (55)$$
2) For every $\ell = 1, 2, \ldots, d_j - 4$,
$$-\chi_{j,\ell} + b_{i_{\ell+2}} H_{j,i_{\ell+2}} + \chi_{j,\ell+1} = 0. \quad (56)$$
3) $$-\chi_{j,d_j-3} + b_{i_{d_j-1}} H_{j,i_{d_j-1}} + b_{i_{d_j}} H_{j,i_{d_j}} = 0. \quad (57)$$

Fig. 3. Example of the Tanner graph of a local code $C_j = \{(b_{i_1}\, b_{i_2}\, b_{i_3}\, b_{i_4}\, b_{i_5}\, b_{i_6}) : b_{i_1} + 2b_{i_2} + 2b_{i_3} + b_{i_4} + b_{i_5} + 2b_{i_6} = 0\}$ of length $d_j = 6$ over $R = \mathbb{Z}_3$, and its transformation into the Tanner graph of the corresponding code $C_j^{(\chi)}$. Note that the degree of each parity-check vertex in the transformed graph is equal to 3.

We also define a linear code $C^{(\chi)}$ of length $n + \sum_{j \in J}(d_j - 3)$, defined by the $\left( \sum_{j \in J}(d_j - 2) \right) \times \left( n + \sum_{j \in J}(d_j - 3) \right)$ parity-check matrix $F$ associated with all the sets of parity-check equations (55)-(57) (for all $j \in J$). We adopt the notation $\tilde{b} = (b \mid \chi_j)$ for codewords of $C_j^{(\chi)}$, and $\tilde{c} = (c \mid \chi)$ for codewords of $C^{(\chi)}$.

Example 9.1:
Figure 3 presents an example of the Tanner graph of a local code
$$C_j = \{(b_{i_1}\, b_{i_2}\, b_{i_3}\, b_{i_4}\, b_{i_5}\, b_{i_6}) : b_{i_1} + 2b_{i_2} + 2b_{i_3} + b_{i_4} + b_{i_5} + 2b_{i_6} = 0\}$$
of length $d_j = 6$ over $R = \mathbb{Z}_3$, and the Tanner graph of the corresponding code $C_j^{(\chi)}$ of length 9 (three extra variables were added). The degree of every parity-check vertex in the Tanner graph of $C_j^{(\chi)}$ is at most 3.
The following theorem relates the codes $C_j$ and $C_j^{(\chi)}$.

Theorem 9.1:
The vector $b = (b_i)_{i \in I_j} \in R^{d_j}$ is a codeword of $C_j$ if and only if there exists a vector $\chi_j \in R^{d_j - 3}$ such that $(b \mid \chi_j) \in C_j^{(\chi)}$.

Proof:
1) Assume $b = (b_i)_{i \in I_j} \in C_j$. Define
$$\chi_{j,\ell} = \begin{cases} -b_{i_1} H_{j,i_1} - b_{i_2} H_{j,i_2} & \text{if } \ell = 1 \\ \chi_{j,\ell-1} - b_{i_{\ell+1}} H_{j,i_{\ell+1}} & \text{if } 2 \le \ell \le d_j - 3. \end{cases} \quad (58)$$
Then, obviously, (55) holds, and (56) holds for all $1 \le \ell \le d_j - 4$. Finally, (57) follows from subtraction of (55) and (56) (for each $1 \le \ell \le d_j - 4$) from the equation $\sum_{i \in I_j} b_i \cdot H_{j,i} = 0$. Therefore, $(b \mid \chi_j) \in C_j^{(\chi)}$, as required.
2) Now, assume that $b = (b_i)_{i \in I_j}$ is such that $(b \mid \chi_j) \in C_j^{(\chi)}$ for some $\chi_j \in R^{d_j - 3}$, and thus (55)-(57) hold (in particular, (56) holds for all $1 \le \ell \le d_j - 4$). We sum all the equalities in (55)-(57) and obtain that $\sum_{i \in I_j} b_i \cdot H_{j,i} = 0$. Therefore, $b \in C_j$.
Note that from this theorem we may see that for every $b \in C_j$, there exists a unique $\chi_j = \chi_j(b)$ such that $\tilde{b} = (b \mid \chi_j) \in C_j^{(\chi)}$, via (58); we may therefore use the notation $\tilde{b}(b) = (b \mid \chi_j(b))$ to denote this unique completion, where $\chi_j(b) = (\chi_{j,i}(b))_{i \in L_j}$.
It follows from Theorem 9.1 that the set of parity-check equations (55)-(57) for all $j \in J$ equivalently describes the code $C$. This description has at most $n + m \cdot (d - 3)$ variables and $m \cdot (d - 2)$ parity-check equations. However, the number of variables participating in every parity-check equation is at most 3. Therefore, the total numbers of variables and of constraints in the corresponding LP problem (defined by constraints (8)-(10) applied to the parity-check matrix $F$) are bounded from above by $(n + m(d-3))(q-1) + m(d-2) \cdot q^2$ and $m(d-2)(q^2 + 3q - 2)$, respectively.
In the sequel, we make use of some new notations, which we define next. First of all, with each parity-check equation prescribed by the matrix $F$, we associate a pair of indices $(j, \ell)$, $j \in J$, $\ell = 1, 2, \ldots, d_j - 2$, where $j$ indicates the corresponding parity-check equation in $H$, and $\ell$ indicates the serial number of the parity-check equation in the set of equations (55)-(57) corresponding to the $j$-th row of $H$. Denote by $I_{j,\ell} \subseteq I_j$ and $L_{j,\ell} \subseteq L_j$ the sets of indices $i$ of variables $b_i$ and $\chi_{j,i}$, respectively, corresponding to the non-zero entries in row $(j, \ell)$ of $F$. Then, each row of $F$ defines a single-parity-check code $C_{j,\ell}^{(\chi)}$. For any $g \in C_{j,\ell}^{(\chi)}$, we adopt the notation $g = (g^b \mid g^\chi)$ where $g^b = (g_i^b)_{i \in I_{j,\ell}}$ and $g^\chi = (g_i^\chi)_{i \in L_{j,\ell}}$.
We denote by $S$ the polytope corresponding to the LP relaxation (8)-(10) for the code $C^{(\chi)}$ with the parity-check matrix $F$. Recall that codewords of $C^{(\chi)}$ are denoted $\tilde{c} = (c \mid \chi)$. It is natural to represent points in $S$ as $((f, h), z)$, where $f = (f_i^{(\alpha)})_{i \in I,\ \alpha \in R \setminus \{0\}}$ and $h = (h_{j,i}^{(\alpha)})_{j \in J,\ i \in L_j,\ \alpha \in R \setminus \{0\}}$ are vectors of indicators corresponding to the entries $c_i$ ($i \in I$) in $c$ and $\chi_{j,i}$ ($j \in J$, $i \in L_j$) in $\chi$, respectively. Here $z = (z_{j,\ell,g})_{j \in J,\ \ell = 1, 2, \ldots, d_j - 2,\ g \in C_{j,\ell}^{(\chi)}}$ is a vector of weights associated with each parity-check equation $(j, \ell)$ and each codeword $g \in C_{j,\ell}^{(\chi)}$.
Similarly, for each $j \in J$ we denote by $S_j$ the polytope corresponding to the LP relaxation (8)-(10) for the code $C_j^{(\chi)}$, defined by the parity-check matrix $F_j$. Recall that codewords of $C_j^{(\chi)}$ are denoted $\tilde{b} = (b \mid \chi_j)$. Then, it is also natural to represent points in $S_j$ as $((\hat{f}_j, \hat{h}_j), \hat{z}_j)$, where $\hat{f}_j = (f_i^{(\alpha)})_{i \in I_j,\ \alpha \in R \setminus \{0\}}$ and $\hat{h}_j = (h_{j,i}^{(\alpha)})_{i \in L_j,\ \alpha \in R \setminus \{0\}}$ are vectors of indicators corresponding to the entries $b_i$ ($i \in I_j$) in $b$ and $\chi_{j,i}$ ($i \in L_j$) in $\chi_j$, respectively.
Moreover, $\hat{z}_j = (z_{j,\ell,g})_{\ell = 1, 2, \ldots, d_j - 2,\ g \in C_{j,\ell}^{(\chi)}}$ is a vector of weights associated with each parity-check equation $(j, \ell)$ and each codeword $g \in C_{j,\ell}^{(\chi)}$.
For each $j \in J$, define the mapping $\Xi_j$ analogously to the mapping $\Xi$ with respect to the dimensionality of the code $C_j^{(\chi)}$, namely
$$\Xi_j : R^{2d_j - 3} \longrightarrow \{0, 1\}^{(q-1)(2d_j - 3)} \subset \mathbb{R}^{(q-1)(2d_j - 3)},$$
such that for $\tilde{b} = (b \mid \chi_j) \in C_j^{(\chi)}$,
$$\Xi_j(\tilde{b}) = (\xi(b_{i_1}) \mid \xi(b_{i_2}) \mid \cdots \mid \xi(b_{i_{d_j}}) \mid \xi(\chi_{j,1}) \mid \xi(\chi_{j,2}) \mid \cdots \mid \xi(\chi_{j,d_j - 3})).$$
The next lemma is similar to one of the claims of Proposition 10 in [5].

Lemma 9.2:
Let $C$ be a code of length $n$ over $R$ with parity-check matrix $H$, and let $Q(H)$ be the corresponding polytope of the LP relaxation, i.e. the set of points $(f, w)$ satisfying (8)-(10). Let $\bar{Q}(H)$ denote the projection of $Q$ onto the $f$ variables, i.e.
$$\bar{Q}(H) = \{f : \exists\, w \text{ s.t. } (f, w) \in Q\}.$$
Denote by $P$ the set of normalized graph-cover pseudocodeword vectors associated with $H$. Then, $\bar{Q}(H) = \bar{P}$, where $\bar{P}$ is the closure of $P$ under the usual (Euclidean) metric in $\mathbb{R}^{(q-1)n}$.

Proof:
Generally, the proof is similar to the proof of the relevant parts of Proposition 10 in [5]. It is largely based on the equivalence between the set of graph-cover pseudocodewords and the set of LP pseudocodewords (Theorem 7.1 and Corollary 7.2). We avoid many technical details, and mention only the main ideas. The proof consists of proving two main claims.
1) $P \subseteq \bar{Q}(H)$.
Given any normalized graph-cover pseudocodeword vector $f \in P$, by Corollary 7.2 there must exist $w$ with $(f, w) \in Q(H)$. Therefore $f \in \bar{Q}(H)$.
2) If a point in $\bar{Q}(H)$ has all rational entries, then it must also be in $P$.
The proof follows the lines of the proof of Lemma 56 in [5]. Let $(f, w) \in Q(H)$ be a point such that all entries in $f$ are rational. Then for all $j \in J$, the vector $\hat{f}_j = (f_i)_{i \in I_j}$ lies in the convex hull $K(C_j)$. For convenience in what follows, denote the index set $\Psi = \{1, 2, \ldots, (q-1)n + 1\}$. Using Carathéodory's Theorem [26, p. 10], for all $j \in J$ we may write $f = \mu^{(j)} P^{(j)}$, where $\mu^{(j)} = (\mu_i^{(j)})_{i \in \Psi}$ is a row vector of length $|\Psi|$ whose elements sum to unity, and $P^{(j)}$ is a $|\Psi| \times (|\Psi| - 1)$ matrix such that for each $i \in \Psi$, the $i$-th row of $P^{(j)}$, denoted $p_i^{(j)}$, satisfies $p_i^{(j)} = \Xi(c)$ for some $c \in R^n$ with $x_j(c) \in C_j$. Therefore,
$$(f \mid 1) = \mu^{(j)}\, (P^{(j)} \mid \mathbf{1}),$$
where $\mathbf{1}$ denotes a column vector of length $|\Psi|$ all of whose entries are equal to 1, is a $|\Psi| \times |\Psi|$ system; therefore by Cramer's rule the solution for $\mu^{(j)}$ has all rational entries (this argument applies for every $j \in J$). Let $M$ denote a common denominator of all variables in the vectors $\mu^{(j)}$, for $j \in J$. Define $h_i^{(\alpha)} = M f_i^{(\alpha)}$ for each $i \in I$, $\alpha \in R \setminus \{0\}$ (it is easy to see that these variables must be nonnegative integers). Also define $\delta_i^{(j)} = M \mu_i^{(j)}$ for each $i \in \Psi$, $j \in J$, and $\delta^{(j)} = (\delta_i^{(j)})_{i \in \Psi}$. We then have
$$h = \delta^{(j)} P^{(j)}. \quad (59)$$
Next define, for all $j \in J$, $b \in C_j$,
$$z_{j,b} = \sum_{i \in \Psi :\ p_i^{(j)} = \Xi(c),\ x_j(c) = b} \delta_i^{(j)}.$$
By comparing appropriate entries in the vector equation (59), we obtain
$$\forall j \in J,\ \forall i \in I_j,\ \forall \alpha \in R \setminus \{0\},\quad h_i^{(\alpha)} = \sum_{b \in C_j,\ b_i = \alpha} z_{j,b},$$
and so $(h, z)$ is an LP pseudocodeword (the preceding equation yields (26), and (27) follows from the fact that the sum of the entries in $\delta^{(j)}$ is equal to $M$ for all $j \in J$, these entries being nonnegative integers). So the construction of Theorem 7.1, part (1), yields a corresponding graph-cover pseudocodeword with graph-cover pseudocodeword vector $h$. Therefore the corresponding normalized graph-cover pseudocodeword vector is $f$, and so we must have $f \in P$.
The claim of the lemma follows.
The following proposition is a counterpart of Lemma 28 in [5].

Proposition 9.3:
Let $C$ be a code of length $n$ over $R$ with parity-check matrix $H$. Assume that the Tanner graph represented by $H$ is a tree. Then, the projected polytope $\bar{Q}(H)$ of the corresponding LP relaxation problem is equal to $K(C)$.

Proof:
The proof follows the lines of the proof of Lemma 28 in [5]. Let $G$ be the labeled Tanner graph of the code $C$ corresponding to $H$. Let $\tilde{G}$ be an $M$-cover of $G$ for some positive integer $M$. Since $G$ is a tree, $\tilde{G}$ is a collection of $M$ labeled trees which are copies of $G$. Let $\tilde{C}$ be the code defined by the parity-check matrix corresponding to this $\tilde{G}$. We obtain that
$$\tilde{C} = \left\{ x \in R^{Mn} : (x_{1,m}, x_{2,m}, \ldots, x_{n,m}) \in C \text{ for all } m = 1, 2, \ldots, M \right\}.$$
Then, it is easy to see that the set of normalized graph-cover pseudocodeword vectors of $H$, denoted $P$, is equal to $K(C) \cap \mathbb{Q}^{(q-1)n}$.
To this end, we apply Lemma 9.2 to see that
$$\bar{Q}(H) = \bar{P} = \overline{K(C) \cap \mathbb{Q}^{(q-1)n}} = K(C),$$
as required.
By taking $C = C_j^{(\chi)}$ and $H = F_j$, so that $Q(H) = S_j$ (for $j \in J$), we immediately obtain the following corollary:

Corollary 9.4:
For j ∈ J, let

  S̄_j = { (f̂_j, ĥ_j) : ∃ ẑ_j s.t. ((f̂_j, ĥ_j), ẑ_j) ∈ S_j } .

Then S̄_j = K(C_j^{(χ)}).

The proof of the next theorem requires the following definition. Let b ∈ C_j, and let g ∈ C_{j,ℓ}^{(χ)}. We say that g coincides with b, written g ⊲⊳ b, if and only if g_i^b = b_i for all i ∈ I_{j,ℓ} and g_i^χ = χ_{j,i}(b) for all i ∈ L_{j,ℓ}.

Theorem 9.5:
The set S̄ = { f : ∃ h, z s.t. ((f, h), z) ∈ S } is equal to the set Q̄ = { f : ∃ w s.t. (f, w) ∈ Q }, and therefore optimizing the linear cost function (5) over S is equivalent to optimizing it over Q.

Proof:
1) Let f ∈ Q̄. Then there exists w such that (f, w) ∈ Q. Therefore, for all j ∈ J, i ∈ I_j, α ∈ R−,

  f_i^{(α)} = Σ_{b ∈ C_j : b_i = α} w_{j,b} .   (60)

In addition, the entries of w satisfy (8) and (9).

We set the values of the variables z_{j,ℓ,g} as follows: for all j ∈ J, ℓ = 1, 2, ..., d_j − 1, g ∈ C_{j,ℓ}^{(χ)},

  z_{j,ℓ,g} = Σ_{b ∈ C_j : g ⊲⊳ b} w_{j,b} .

So we have, for all j ∈ J, ℓ = 1, 2, ..., d_j − 1, i ∈ I_{j,ℓ}, α ∈ R−,

  Σ_{g ∈ C_{j,ℓ}^{(χ)} : g_i^b = α} z_{j,ℓ,g} = Σ_{b ∈ C_j : b_i = α} w_{j,b} = f_i^{(α)} ,   (61)

using (60), since I_{j,ℓ} ⊆ I_j for all ℓ = 1, 2, ..., d_j − 1. In addition, we define the variables h_{j,i}^{(α)} as follows: for all j ∈ J, i ∈ L_j, α ∈ R−,

  h_{j,i}^{(α)} = Σ_{b ∈ C_j : χ_{j,i}(b) = α} w_{j,b} .   (62)

Note that all variables h_{j,i}^{(α)} are well defined. It then follows that, for all j ∈ J, ℓ = 1, 2, ..., d_j − 1, i ∈ L_{j,ℓ}, α ∈ R−,

  Σ_{g ∈ C_{j,ℓ}^{(χ)} : g_i^χ = α} z_{j,ℓ,g} = Σ_{b ∈ C_j : χ_{j,i}(b) = α} w_{j,b} = h_{j,i}^{(α)} ,   (63)

using (62), since L_{j,ℓ} ⊆ L_j for all ℓ = 1, 2, ..., d_j − 1.

Next, we claim that

  ((f, h), z) ∈ S .   (64)

To show this, it is necessary to verify (8)–(10) with respect to ((f, h), z) and the code C^{(χ)}. Conditions (8) and (9) follow easily from the definition of the variables z_{j,ℓ,g} and the properties of the variables w_{j,b}. Condition (10) follows from the combination of (61) and (63).

Finally, (64) yields that f ∈ S̄, as required.

2) Now assume that f ∈ S̄. This means that there exist h, z such that ((f, h), z) ∈ S. Then, for all j ∈ J, ((f̂_j, ĥ_j), ẑ_j) ∈ S_j. By Corollary 9.4, (f̂_j, ĥ_j) lies in K(C_j^{(χ)}). Therefore,

  (f̂_j, ĥ_j) = Σ_{b̃ ∈ C_j^{(χ)}} β_{j,b̃} · Ξ_j(b̃) ,   (65)

where Σ_{b̃ ∈ C_j^{(χ)}} β_{j,b̃} = 1 and β_{j,b̃} ≥ 0 for all b̃ ∈ C_j^{(χ)}.

For all j ∈ J, b ∈ C_j, set the value of w_{j,b} as w_{j,b} = β_{j,b̃(b)}; thus

  Σ_{b ∈ C_j} w_{j,b} = 1 ,   (66)

and

  w_{j,b} ≥ 0 for all b ∈ C_j .   (67)

Then (65) becomes

  (f̂_j, ĥ_j) = Σ_{b ∈ C_j} w_{j,b} · Ξ_j(b̃(b)) .

Comparing the first set of coordinates, we obtain that, for all i ∈ I_j and α ∈ R−,

  f_i^{(α)} = Σ_{b ∈ C_j : b_i = α} w_{j,b} ,

for all j ∈ J. Together with (66) and (67), this means that (f, w) ∈ Q. Therefore f ∈ Q̄, as required.

The polytope representation described in this section leads to a polynomial-time decoder for a wide variety of classical nonbinary codes (for example, generalized Reed–Solomon codes).

X. SIMULATION STUDY
A. Comparison with ML Decoding
In this section we compare the performance of the linear-programming decoder with hard-decision and soft-decision based ML decoding. For such a comparison, a code and modulation scheme are needed which possess sufficient symmetry properties to enable derivation of analytical ML performance results. We consider encoding of 6-symbol blocks according to the [11, 6] ternary Golay code, and modulation of the resulting ternary symbols with 3-PSK modulation prior to transmission over the AWGN channel. Figure 4 shows the symbol error rate (SER) and codeword error rate (WER) performance of this code under LP decoding using the polytope Q of Section III. Note that this is the same as its performance using the polytope U of Section VIII, and its performance using the polytope S of Section IX. When the decoder reports a decoding failure, the SER and WER are both taken to be 1. To quantify performance, we define the signal-to-noise ratio (SNR) per information symbol γ_s = E_s/N_0 as the ratio of the received signal energy per information symbol to the noise power spectral density. Also shown in the figure are two other performance curves for WER. The first is the exact result for ML hard-decision decoding of the ternary Golay code; since the Golay code is perfect (it corrects up to two symbol errors), this is obtained from

  WER(γ_s) = Σ_{ℓ=3}^{11} (11 choose ℓ) (p(γ_s))^ℓ (1 − p(γ_s))^{11−ℓ} ,

where p(γ_s) denotes the probability of an incorrect hard decision at the demodulator, evaluated for each value of γ_s using numerical integration. The second WER curve represents the union bound for ML soft-decision decoding. Using the symmetry of the 3-PSK constellation, this may be obtained from

  WER(γ_s) < (1/2) Σ_{c ∈ C, c ≠ 0} erfc( √( (3/4) w_H(c) R(C) γ_s ) ) ,
Fig. 4. Codeword error rate (WER) and symbol error rate (SER) for the [11, 6] ternary Golay code with 3-PSK modulation over the AWGN channel. The figure shows performance under LP decoding, as well as the exact result for hard-decision decoding and the union bound for soft-decision decoding.

Here R(C) = 6/11 denotes the code rate, and the Hamming weight w_H(c) of a codeword c ∈ C is distributed according to the weight enumerating polynomial [27]

  W(x) = 1 + 132 x^5 + 132 x^6 + 330 x^8 + 110 x^9 + 24 x^{11} .

The performance of LP decoding is approximately the same as that of codeword-error-rate optimum hard-decision decoding: at low WER, the LP decoding curve lies within a fraction of a dB of the exact result for ML hard-decision decoding, and somewhat further from the union bound for codeword-error-rate optimum soft-decision decoding. These results are comparable to those of a similar study conducted for the binary case in [7].

B. Low-Density Code Performance
Figure 5 shows SER and WER simulation performance results for two low-density parity-check (LDPC) codes. The first code, C^{(1)}, of length n = 150, is over the ring R = Z_3; nonbinary coded symbols are mapped directly to ternary PSK signals and transmitted over an AWGN channel, the mapping described in Example 5.3 being used for modulation. The parity-check matrix H^{(1)} consists of m = 60 rows and is equal to the right-circulant matrix given by

  H^{(1)}_{j,i} =  if i − j ∈ { , , } ,  if i − j ∈ { , , } ,  0 otherwise.

The code rate is R(C^{(1)}) = 0.6. As expected, the performance of the low-density code C^{(1)} is significantly better than that of the ternary Golay code of Figure 4. The second code, C^{(2)}, of length n = 80, is over the ring R = Z_4; nonbinary coded symbols are mapped directly to quaternary phase shift keying (QPSK) signals and transmitted over an AWGN channel, the mapping described in Example 5.3 again being used for modulation. The parity-check matrix H^{(2)} consists of m = 32 rows and is equal to the right-circulant matrix given by

  H^{(2)}_{j,i} =  if i − j ∈ { , , } ,  if i − j ∈ { , } ,  0 otherwise.

This code also has rate R(C^{(2)}) = 0.6. The quaternary code has a higher SER and WER than the ternary code for the same E_s/N_0; however, it has a smaller block length and a higher spectral efficiency. In both systems, when the decoder reports a decoding failure, the SER and WER are both taken to be 1.

Fig. 5. Codeword error rate (WER) and symbol error rate (SER) for the [150, 90] ternary LDPC code C^{(1)} under ternary PSK modulation, and for the [80, 48] quaternary LDPC code C^{(2)} under QPSK modulation.

XI. FUTURE RESEARCH
Sections VIII and IX presented two alternative polytope representations which, in certain contexts, have a smaller number of variables and constraints than the corresponding standard LP representation. It would be interesting to further reduce the complexity of the polytope representation in order to yield more efficient decoding algorithms. Alternatively, one could try to reduce the complexity of the LP solver for the nonbinary decoding problem by exploiting knowledge of the polytope structure.

The notion of pseudodistance for nonbinary codes was recently defined in [28], where lower bounds on the pseudodistance of nonbinary codes under q-ary PSK modulation over the AWGN channel were presented. It would be interesting to obtain lower bounds on the pseudodistance for other families of nonbinary linear codes and for other modulation schemes.

APPENDIX
Proof of Proposition 8.1
Before proving this proposition, we give some background material on flow networks.
Flow Networks:
Let G = (V, E) be a directed graph, and let {s, t} ⊆ V with s ≠ t. A flow network (G(V, E), c) is a graph G = (V, E) together with a nonnegative capacity function c : E → ℝ_{≥0} ∪ {+∞} defined on every edge.

For a subset V′ ⊆ V, let V′′ = V \ V′. We define the cut (V′ : V′′) induced by V′ as the set of edges { (u, v) ∈ E : u ∈ V′, v ∈ V′′ }. The capacity of this cut, c(V′ : V′′), is defined as

  c(V′ : V′′) = Σ_{u ∈ V′, v ∈ V′′} c((u, v)) .

For an edge e = (u, v) we use the notation e ∈ in(v) and e ∈ out(u). We also use the notation N(v) for the set of neighbors of v, namely N(v) = { u : (u, v) ∈ E } ∪ { u′ : (v, u′) ∈ E }. For a set of vertices V_0 ⊆ V, we denote N(V_0) = ∪_{v ∈ V_0} N(v) \ V_0.

A flow in the graph (network) G with source s and sink t is a function f : E → ℝ_{≥0} that satisfies 0 ≤ f(e) ≤ c(e) for all e ∈ E, and, for all v ∈ V \ {s, t},

  Σ_{e ∈ E, e ∈ in(v)} f(e) = Σ_{e ∈ E, e ∈ out(v)} f(e) .

The value of the flow f is defined as

  Σ_{e ∈ E, e ∈ in(t)} f(e) = Σ_{e ∈ E, e ∈ out(s)} f(e) .

A maximum flow is a flow f that attains the maximum possible value. There are several known algorithms, for instance the Ford–Fulkerson algorithm, for finding the maximum flow in a network; the reader may refer to [29, Section 26.2]. It is well known that the value of the maximum flow in the network is equal to the capacity of the minimum cut induced by a vertex set V′ such that s ∈ V′ and t ∉ V′ (see [29]).

Finally, we prove the proposition.

Proof:
The proof is by induction on M. We initialize w_a = 0 for all a ∈ C_Γ^{(k)}. We show that there exists a vector a = (a_i)_{i ∈ Γ} ∈ C_Γ^{(k)} such that

(i) for every i ∈ Γ and α ∈ R−, a_i = α ⟹ x_i^{(α)} > 0;
(ii) if Σ_{α ∈ R−} x_i^{(α)} = M for some i ∈ Γ, then a_i = α for some α ∈ R−.

Then we 'update' the values of the x_i^{(α)} and M as follows. For every i ∈ Γ and α ∈ R− with a_i = α, we set x_i^{(α)} ← x_i^{(α)} − 1. In addition, we set M ← M − 1, and we set w_a ← w_a + 1. It is easy to see that the 'updated' values of the x_i^{(α)} and M satisfy

  Σ_{i ∈ Γ} x_i^{(α)} = k_α M for all α ∈ R− ,

and Σ_{α ∈ R−} x_i^{(α)} ≤ M for all i ∈ Γ. Therefore the inductive step can be applied with respect to these new values. The induction ends when the value of M reaches zero. It is straightforward to see that when the induction terminates, (53) and (54) hold with respect to the original values of the x_i^{(α)} and M.

Proof of existence of a vector a that satisfies (i): We construct a flow network G = (V, E) as follows:

  V = {s, t} ∪ U_1 ∪ U_2 , where U_1 = R− and U_2 = Γ ,

and

  E = { (s, α) }_{α ∈ R−} ∪ { (i, t) }_{i ∈ Γ} ∪ { (α, i) : x_i^{(α)} > 0 } .
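This existence argument is constructive: one can build the network above and read the assignment a directly off an integral maximum flow. The sketch below uses Edmonds–Karp (shortest augmenting paths); the capacities (k_α on the (s, α) edges, a unit capacity on each (i, t) edge, a large constant standing in for +∞ on the (α, i) edges) and the example data are illustrative assumptions, not taken from the paper.

```python
from collections import deque

BIG = 10**9  # stands in for the +infinity capacities on the (alpha, i) edges

def edmonds_karp(cap, s, t):
    """Maximum flow via shortest augmenting paths.
    cap: dict u -> {v: capacity}. Returns (flow value, residual capacities)."""
    res = {}
    for u in cap:
        for v, c in cap[u].items():
            res.setdefault(u, {})
            res[u][v] = res[u].get(v, 0) + c
            res.setdefault(v, {}).setdefault(u, 0)  # reverse residual edge
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:          # BFS in the residual graph
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total, res
        aug, v = BIG, t                           # bottleneck along the path
        while parent[v] is not None:
            u = parent[v]
            aug = min(aug, res[u][v])
            v = u
        v = t
        while parent[v] is not None:              # push the augmenting flow
            u = parent[v]
            res[u][v] -= aug
            res[v][u] += aug
            v = u
        total += aug

def assign_symbols(x, k):
    """x: {position i: {symbol alpha: count x_i^(alpha)}}, k: {alpha: k_alpha}.
    Builds the network of the proof and extracts a_i from a maximum flow."""
    cap = {'s': {('a', al): ka for al, ka in k.items()}}
    for i, xi in x.items():
        cap[('v', i)] = {'t': 1}                  # unit capacity into the sink
        for al, cnt in xi.items():
            if cnt > 0:
                cap.setdefault(('a', al), {})[('v', i)] = BIG
    value, res = edmonds_karp(cap, 's', 't')
    a = {i: 0 for i in x}
    for i in x:
        for al in k:
            # flow on (alpha, i) equals the reverse residual capacity
            if res.get(('v', i), {}).get(('a', al), 0) > 0:
                a[i] = al
    return value, a

# Illustrative instance: M = 2, k_1 = k_2 = 1, three positions.
x_example = {0: {1: 1, 2: 1}, 1: {1: 1}, 2: {2: 1}}
k_example = {1: 1, 2: 1}
value, a = assign_symbols(x_example, k_example)
```

With integer capacities the maximum flow is integral, so each position receives at most one symbol, every symbol α is assigned to exactly k_α positions, and a_i = α only where x_i^{(α)} > 0, which is exactly property (i).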
We define an integer capacity function c : E → ℕ ∪ {+∞} as follows:

  c(e) = k_α  if e = (s, α), α ∈ R− ;
  c(e) = 1   if e = (i, t), i ∈ Γ ;
  c(e) = +∞  if e = (α, i), α ∈ R−, i ∈ Γ .   (68)

Next, apply the Ford–Fulkerson algorithm to the network (G(V, E), c) to produce a maximum flow f_max. Since the values c(e) are integers (or +∞) for all e ∈ E, the values f_max(e) must all be integers (see [29]).

We will show that the minimum cut in this graph has capacity c_min = Σ_{α ∈ R−} k_α. First, consider the cut induced by the set V′ = {s}. This cut has capacity Σ_{α ∈ R−} k_α, and therefore c_min ≤ Σ_{α ∈ R−} k_α.

Assume there is another cut of smaller capacity. If this cut is induced by the set V′ = V \ {t}, its capacity is N ≥ Σ_{α ∈ R−} k_α (since Σ_{α ∈ R−} k_α M = Σ_{i ∈ Γ} Σ_{α ∈ R−} x_i^{(α)} ≤ NM), so it is not smaller. Therefore, without loss of generality, assume that the minimum cut is induced by a set V′ of the form V′ = {s} ∪ X′ ∪ Y′, with X′ ⊆ U_1 and Y′ ⊆ U_2. Let X′′ = U_1 \ X′ and Y′′ = U_2 \ Y′ (and so V′′ = {t} ∪ X′′ ∪ Y′′). Observe that there are no edges (α, i) ∈ E with α ∈ X′ and i ∈ Y′′, since otherwise the capacity of this cut would be infinite (so it could not be a minimum cut). Thus

  |Y′| ≥ |U_2 ∩ N(X′)| .   (69)

Observe also that

  Σ_{i ∈ Γ} Σ_{α ∈ X′} x_i^{(α)} = Σ_{α ∈ X′} k_α M  and  Σ_{α ∈ X′} x_i^{(α)} ≤ Σ_{α ∈ R−} x_i^{(α)} ≤ M .
Therefore,

  |U_2 ∩ N(X′)| ≥ Σ_{α ∈ X′} k_α .   (70)

We obtain

  c(V′ : V′′) = Σ_{α ∈ X′′} k_α + |Y′| ≥ Σ_{α ∈ X′′} k_α + Σ_{α ∈ X′} k_α = Σ_{α ∈ R−} k_α ,   (71)

where the inequality is due to (69) and (70). This contradicts the assumption that the capacity of this cut is smaller than that of the cut induced by V′ = {s}.

If we apply the Ford–Fulkerson algorithm (or a similar algorithm) to the network (G(V, E), c), we therefore obtain an integer flow f_max in G of value Σ_{α ∈ R−} k_α. Observe that f_max((α, i)) ∈ {0, 1} for all α ∈ R− and i ∈ Γ. Then, for all i ∈ Γ, we define

  a_i = α if f_max((α, i)) = 1 for some α ∈ U_1 , and a_i = 0 otherwise.

For this selection of a = (a_1, a_2, ..., a_N), we have a ∈ C_Γ^{(k)} and a_i = α only if x_i^{(α)} > 0.

Proof of existence of a vector a that satisfies (i) and (ii) simultaneously: We start with the following definition.
Definition A.1:
The vertex i ∈ U_2 is called a critical vertex if

  Σ_{α ∈ R−} x_i^{(α)} = M .
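In this notation, a critical vertex is simply a position whose symbol counts already sum to M, so it must be assigned a symbol at every remaining induction step. A minimal sketch of this check and of the update step from the induction (the data and the particular assignment are illustrative, not from the paper):

```python
def critical_vertices(x, M):
    """Definition A.1: i is critical when its symbol counts sum to M."""
    return {i for i, xi in x.items() if sum(xi.values()) == M}

def apply_update(x, M, a):
    """One induction step: decrement x_i^(a_i) for each assigned position i,
    and decrement M. Invariants (51) and (52) are preserved whenever the
    assignment a satisfies properties (i) and (ii)."""
    for i, alpha in a.items():
        if alpha != 0:
            x[i][alpha] -= 1
    return M - 1

# Illustrative data: R \ {0} = {1, 2}, Gamma = {0, 1, 2}, M = 2, k_1 = k_2 = 1.
x = {0: {1: 1, 2: 1}, 1: {1: 1, 2: 0}, 2: {1: 0, 2: 1}}
M = 2
crit = critical_vertices(x, M)   # position 0 is critical: 1 + 1 = M
a = {0: 1, 1: 0, 2: 2}           # an assignment satisfying (i) and (ii)
M = apply_update(x, M, a)
```

After the step, each column sum Σ_i x_i^{(α)} again equals k_α M and each row sum stays at most M, so the induction can continue with the new values.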
In order to have (52) satisfied after the next inductive step, we have to decrease the value of Σ_{α ∈ R−} x_i^{(α)} by exactly 1 for every critical vertex i. This is equivalent to having f_max((i, t)) = 1 for every such i. We have just shown that the maximum (integer) flow in G has value Σ_{α ∈ R−} k_α. Now we aim to show that there exists a flow f* of the same value which has f*((i, t)) = 1 for every critical vertex i.

Suppose there is no such flow. Then consider a maximum flow f′ which has f′((i, t)) = 1 for the maximum possible number of critical vertices i ∈ U_2. By assumption, there is a critical vertex i_0 ∈ U_2 with f′((i_0, t)) = 0. We will show that the flow f′ can be modified into a flow f′′ of the same value such that the number of critical vertices i ∈ U_2 with f′′((i, t)) = 1 is strictly larger than for f′, a contradiction.

Indeed, if there exists a vertex α_0 ∈ N(i_0) such that (α_0, i_0) ∈ E and f′((α_0, i_1)) = 1 for some non-critical vertex i_1, then we define the flow f′′ as

  f′′(e) = 1 if e ∈ { (α_0, i_0), (i_0, t) } ;
  f′′(e) = 0 if e ∈ { (α_0, i_1), (i_1, t) } ;
  f′′(e) = f′(e) for all other edges e ∈ E .

It is easy to see that f′′ is a legal flow in (G(V, E), c). Moreover, it has the same value as f′, and the number of critical vertices i ∈ U_2 satisfying f′′((i, t)) = 1 is strictly larger than for f′.

In the general case (when there is no vertex α_0 as above), we iteratively define a maximal set Z of vertices α ∈ U_1 satisfying the following two rules:

1) For any α ∈ U_1: if (α, i_0) ∈ E, then α ∈ Z.
2) For any α ∈ U_1 and i ∈ U_2: if (α, i) ∈ E, f′((α, i)) = 0, and there exists β ∈ Z such that (β, i) ∈ E, f′((β, i)) = 1 and all i′ ∈ U_2 with f′((β, i′)) = 1 are critical, then α ∈ Z.

Consider the set Z. There are two cases.

Case 1: Every vertex α ∈ Z satisfies

  ∀ i ∈ U_2 : (α, i) ∈ E and i is not critical ⟹ f′((α, i)) = 0 .
Then for every α ∈ Z there are exactly k_α vertices i such that (α, i) ∈ E and f′((α, i)) = 1. Define

  T = { i ∈ U_2 : i is critical and ∃ α ∈ Z s.t. (α, i) ∈ E and f′((α, i)) = 1 } .

We have

  |T| = Σ_{α ∈ Z} k_α .   (72)

Note that i_0 ∉ T, and recall that

  (β, i_0) ∈ E for some β ∈ Z .   (73)

Note also that if γ ∉ Z and i ∈ T, then there is no edge between γ and i (otherwise we would have f′((γ, i)) = 0, and so γ should be in Z). Therefore x_i^{(γ)} = 0, and so

  Σ_{α ∈ Z} Σ_{i ∈ T} x_i^{(α)} = Σ_{α ∈ R−} Σ_{i ∈ T} x_i^{(α)} .   (74)

We obtain

  Σ_{α ∈ Z} k_α M = Σ_{α ∈ Z} Σ_{i ∈ Γ} x_i^{(α)} > Σ_{α ∈ Z} Σ_{i ∈ T} x_i^{(α)} = Σ_{α ∈ R−} Σ_{i ∈ T} x_i^{(α)} = Σ_{i ∈ T} Σ_{α ∈ R−} x_i^{(α)} = Σ_{i ∈ T} M = Σ_{α ∈ Z} k_α M .
Here the first equality is due to (51); the strict inequality is due to (73), since i_0 ∉ T yet x_{i_0}^{(β)} > 0 for some β ∈ Z; the second equality is due to (74); the third equality is obtained by changing the order of summation; the fourth equality holds because all vertices in T are critical; and the fifth equality is due to (72). Therefore this case yields a contradiction.

Case 2: There is a vertex α_0 ∈ Z which satisfies

  ∃ j_0 ∈ U_2 : (α_0, j_0) ∈ E, j_0 is not critical, and f′((α_0, j_0)) = 1 .

By the definition of Z, there is an integer ℓ and sets of edges { (α_h, j_{h+1}) }_{h=0,1,...,ℓ} ⊆ E and { (α_h, j_h) }_{h=1,2,...,ℓ} ⊆ E such that j_{ℓ+1} = i_0, α_h ∈ Z for h = 0, 1, ..., ℓ, j_h ∈ U_2 for h = 1, 2, ..., ℓ + 1, and

  f′((α_h, j_{h+1})) = 0 for h = 0, 1, ..., ℓ ; f′((α_h, j_h)) = 1 for h = 1, 2, ..., ℓ .

We define the flow f′′ as

  f′′(e) = 1 if e ∈ { (α_h, j_{h+1}) }_{h=0,1,...,ℓ} ∪ { (j_{ℓ+1}, t) } ;
  f′′(e) = 0 if e ∈ { (α_h, j_h) }_{h=0,1,...,ℓ} ∪ { (j_0, t) } ;
  f′′(e) = f′(e) for all other edges e .

This f′′ is a legal flow in (G(V, E), c). Moreover, it has the same value as f′, and the number of critical vertices i ∈ U_2 with f′′((i, t)) = 1 is strictly larger than for f′.

We conclude that there exists an integer flow f* in (G(V, E), c) of value Σ_{α ∈ R−} k_α such that f*((i, t)) = 1 for every critical vertex i ∈ U_2. We define

  a_i = α if f*((α, i)) = 1 for some α ∈ U_1 , and a_i = 0 otherwise,

and a = (a_i)_{i ∈ Γ}. For this selection of a, we have a ∈ C_Γ^{(k)}, and properties (i) and (ii) are satisfied.

ACKNOWLEDGEMENTS
The authors would like to thank the anonymous reviewers, as well as the associate editor I. Sason, for their comments, which improved the presentation of the paper. They would also like to thank I. Duursma, J. Feldman, R. Koetter and O. Milenkovic for helpful discussions.

REFERENCES

[1] R. G. Gallager, "Low-density parity-check codes," IRE Transactions on Information Theory, vol. IT-8, pp. 21–28, Jan. 1962.
[2] N. Wiberg, Codes and Decoding on General Graphs. Ph.D. Thesis, Linköping University, Sweden, 1996.
[3] G. D. Forney, R. Koetter, F. R. Kschischang, and A. Reznik, "On the effective weights of pseudocodewords for codes defined on graphs with cycles," in Codes, Systems, and Graphical Models, vol. 123 of IMA Vol. Math. Appl., ch. 5, pp. 101–112, Springer, 2001.
[4] R. Koetter, W.-C. W. Li, P. O. Vontobel, and J. L. Walker, "Characterizations of pseudo-codewords of LDPC codes," Arxiv report arXiv:cs.IT/0508049, Aug. 2005.
[5] P. Vontobel and R. Koetter, "Graph-cover decoding and finite-length analysis of message-passing iterative decoding of LDPC codes," to appear in IEEE Transactions on Information Theory; Arxiv report arXiv:cs.IT/0512078, Dec. 2005.
[6] J. Feldman, Decoding Error-Correcting Codes via Linear Programming. Ph.D. Thesis, Massachusetts Institute of Technology, Sep. 2003.
[7] J. Feldman, M. J. Wainwright, and D. R. Karger, "Using linear programming to decode binary linear codes," IEEE Transactions on Information Theory, vol. 51, no. 3, pp. 954–972, March 2005.
[8] G. Caire, G. Taricco, and E. Biglieri, "Bit-interleaved coded modulation," IEEE Transactions on Information Theory, vol. 44, no. 3, pp. 927–946, May 1998.
[9] X. Li and J. A. Ritcey, "Bit-interleaved coded modulation with iterative decoding," Proc. IEEE International Conference on Communications (ICC), vol. 2, pp. 858–863, Sep. 1999.
[10] D. Sridhara and T. E. Fuja, "LDPC codes over rings for PSK modulation," IEEE Transactions on Information Theory, vol. 51, no. 9, pp. 3209–3220, Sep. 2005.
[11] M. C. Davey and D. J. C. MacKay, "Low density parity check codes over GF(q)," IEEE Communications Letters, vol. 2, no. 6, pp. 165–167, June 1998.
[12] X. Li, M. R. Soleymani, J. Lodge, and P. S. Guinand, "Good LDPC codes over GF(q) for bandwidth efficient transmission," Proc. 4th IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 95–99, June 2003.
[13] A. Bennatan and D. Burshtein, "On the application of LDPC codes to arbitrary discrete-memoryless channels," IEEE Transactions on Information Theory, vol. 50, no. 3, pp. 417–438, March 2004.
[14] A. Bennatan and D. Burshtein, "Design and analysis of nonbinary LDPC codes for arbitrary discrete-memoryless channels," IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 549–583, Feb. 2006.
[15] A. Bennatan, The Application of LDPC Codes to New Problems in Communications. Ph.D. Thesis, Tel Aviv University, Jan. 2007.
[16] C. A. Kelley, D. Sridhara, and J. Rosenthal, "Pseudocodeword weights for non-binary LDPC codes," Proc. IEEE International Symposium on Information Theory (ISIT), Seattle, USA, pp. 1379–1383, July 2006.
[17] G. D. Forney, Jr., "Geometrically uniform codes," IEEE Transactions on Information Theory, vol. 37, no. 5, pp. 1241–1260, Sep. 1991.
[18] T. Richardson and R. E. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.
[19] M. F. Flanagan, "Codeword-independent performance of nonbinary linear codes under linear-programming and sum-product decoding," Proc. IEEE International Symposium on Information Theory (ISIT), Toronto, Canada, pp. 1503–1507, July 2008.
[20] E. Hof, I. Sason, and S. Shamai (Shitz), "Performance bounds for nonbinary linear block codes over memoryless symmetric channels," IEEE Transactions on Information Theory, vol. 55, no. 3, pp. 977–996, March 2009.
[21] M. Chertkov and M. Stepanov, "Pseudo-codeword landscape," Proc. IEEE International Symposium on Information Theory (ISIT), Nice, France, pp. 1546–1550, June 2007.
[22] K. Yang, X. Wang, and J. Feldman, "Cascaded formulation of the fundamental polytope of general linear block codes," Proc. IEEE International Symposium on Information Theory (ISIT), Nice, France, pp. 1361–1365, June 2007.
[23] K. Yang, X. Wang, and J. Feldman, "A new linear programming approach to decoding linear block codes," IEEE Transactions on Information Theory, vol. 54, no. 3, pp. 1061–1072, March 2008.
[24] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge: Cambridge University Press, 2004.
[25] A. Schrijver, Theory of Linear and Integer Programming. New York: John Wiley & Sons, 1998.
[26] A. Barvinok, A Course in Convexity, vol. 54 of Graduate Studies in Mathematics. Providence, RI: American Mathematical Society, 2002.
[27] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam: North-Holland, 1977.
[28] V. Skachek and M. F. Flanagan, "Lower bounds on the minimum pseudodistance for linear codes with q-ary PSK modulation over AWGN," Proc. 5th International Symposium on Turbo Codes and Related Topics, Lausanne, Switzerland, September 2008.
[29] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. Cambridge, MA: MIT Press.