Channel Decoding with a Bayesian Equalizer
Luis Salamanca, Juan José Murillo-Fuentes
DTSC, University of Seville, Camino de los Descubrimientos s/n, 41092 Seville, Spain. {salamanca,murillo}@us.es

Fernando Pérez-Cruz
DTSC, University Carlos III in Madrid, Avda. de la Universidad 30, 28911 Leganés (Madrid), Spain. [email protected]
Abstract—Low-density parity-check (LDPC) decoders assume that the channel state information (CSI) is known and that they have the true a posteriori probability (APP) for each transmitted bit. But in most cases of interest, the CSI needs to be estimated with the help of a short training sequence and the LDPC decoder has to decode the received word using faulty APP estimates. In this paper, we study the uncertainty in the CSI estimate and how it affects the bit error rate (BER) output by the LDPC decoder. To improve these APP estimates, we propose a Bayesian equalizer that takes into consideration not only the uncertainty due to the noise in the channel, but also the uncertainty in the CSI estimate, reducing the BER after the LDPC decoder.
I. INTRODUCTION
Single input single output (SISO) communication channels can be characterized by a linear finite impulse response that either represents the dispersive nature of a physical medium or the multiple paths of wireless communications [1]. This representation causes inter-symbol interference (ISI) at the receiver end that can impair the digital communication. Given this channel state information (CSI), the maximum likelihood sequence detector (MLSD) [2], the Viterbi algorithm, provides the optimal transmitted sequence at the receiver end. And, if we are interested in a posteriori probabilities (APP) for the transmitted symbols, we can use the BCJR algorithm [3], which provides bitwise optimal decisions.

Channel encoders introduce controlled redundancy in the transmitted sequence to correct errors caused by the channel. Modern channel decoders, such as turbo or low-density parity-check (LDPC) codes [4], need accurate APP estimates to be able to achieve channel capacity [5].

In a previous work, we showed that accurate APP estimates increase the performance of LDPC decoders [6], although that work focused on nonlinear channel estimation. In this paper, we study how the uncertainty in the estimation of the CSI affects the optimal performance of modern channel decoders. The CSI is acquired using a training sequence [1] and it is typically estimated by maximum likelihood (ML). These training sequences are necessarily short, to reduce the transmission of non-informative symbols, yielding inaccurate CSI estimates. Thus, the BCJR assuming an ML estimate (hereafter referred to as ML-BCJR) only delivers an approximation to the APP for each symbol, because it does not include the uncertainty in the estimate. Inaccuracies in the APP estimates degrade the performance of channel decoders for turbo or LDPC codes [7], [8].
We propose and analyze a Bayesian equalizer, which takes into account the uncertainty in the CSI estimate to produce more accurate APP estimates. The difference between the bit error rate (BER) of the ML-BCJR and the proposed Bayesian equalizer is not significant, although it slightly favors the Bayesian equalizer. However, this is not an accurate measure of the quality of the APP estimates for each equalizer, since it only considers hard decisions, in contrast to the soft inputs needed by modern channel decoders. Thus, assuming LDPC coding in our communication system, we can compare the quality of the APP estimates for each equalizer. We experimentally show that, when we measure the probability of error at the output of the LDPC decoder, the Bayesian equalizer considerably improves on the performance of the ML-BCJR equalizer. These gains are more significant for high signal-to-noise ratios, channels with long impulse responses and/or short training sequences.

There are some related works where uncertainties are exploited in the estimation of the transmitted symbols. In the framework of turbo-receivers [9], some approaches can be found in the literature that incorporate these uncertainties in the iterative process of equalization and decoding. This has been exploited in [10], where the authors use an MMSE estimator for the channel, and in [11], [12], where they do not focus on the optimal estimation of the APP. In [13] we find a proposal to estimate some parameters in an OFDM system to later include them in the decoding. In [14], the authors consider the channel estimation inaccuracies during the decoding process by means of a practical decoding metric.

The paper is organized as follows. In Section II we describe the structure of a SISO communication system and the ML-BCJR solution. The proposed Bayesian equalization technique is presented in Section III. Experimental results in Section IV help to illustrate the performance of our method.
Finally, Section V ends with conclusions and some proposals for future work.

II. ML-BCJR EQUALIZATION
We consider the discrete-time dispersive communication system depicted in Fig. 1. The channel H(z) is completely specified by the CSI, i.e., h = [h_1, h_2, …, h_L]^T, where L is the length of the channel. We model the values of the channel h as independent Gaussians with zero mean and variance equal to 1/L (Rayleigh fading).
Fig. 1. System model.
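The transmission chain of Fig. 1 and the ML channel estimation introduced below can be sketched numerically. This is an illustrative toy, not the paper's experimental code: the parameter values (L, n, the noise level) and the zero-padding of the bits preceding the preamble are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical parameters (not the paper's experimental values) ---
L = 3          # number of channel taps
n = 200        # training-sequence length (long, so the ML estimate is accurate)
sigma_w = 0.3  # noise standard deviation

# Rayleigh-fading channel: taps drawn i.i.d. N(0, 1/L)
h = rng.normal(0.0, np.sqrt(1.0 / L), size=L)

# Known BPSK training bits b° ∈ {-1, +1}
b_train = rng.choice([-1.0, 1.0], size=n)

# L x n matrix B° whose i-th column is [b_i, b_{i-1}, ..., b_{i-L+1}]^T
# (bits before the preamble are taken as zero, an assumption of this sketch)
b_padded = np.concatenate([np.zeros(L - 1), b_train])
B = np.stack([b_padded[i:i + L][::-1] for i in range(n)], axis=1)  # shape (L, n)

# Received training samples: x°_i = b_i^T h + w_i  (AWGN)
x_train = B.T @ h + rng.normal(0.0, sigma_w, size=n)

# ML estimate of the CSI: for Gaussian noise it is the least-squares solution
h_ml, *_ = np.linalg.lstsq(B.T, x_train, rcond=None)

print("true h:", h)
print("ML  h :", h_ml)
```

With a training sequence this long the least-squares solution is close to the true taps; the paper's point is precisely that for realistic, much shorter preambles it is not.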
A block of K message bits, m = [m_1, m_2, …, m_K]^T, is encoded with rate R = K/N to obtain the codeword b = [b_1, b_2, …, b_N]^T, which is transmitted over the channel using a BPSK modulation:

x_i = b_i^T h + w_i,   (1)

where b_i = [b_i, b_{i-1}, …, b_{i-L+1}]^T and w_i is additive white Gaussian noise (AWGN) with variance σ_w². Thus, the received sequence is x = [x_1, x_2, …, x_N]^T.

At the beginning of every block we transmit a preamble with n known bits (b°_1, …, b°_n) and the receiver uses D = {x°_i, b°_i}_{i=1}^n, the training sequence, to estimate the channel. The ML criterion is widely considered for this estimation task:

ĥ_ML = arg max_h p(x° | b°, h).   (2)

Once we have estimated the channel coefficients with the preamble, we apply the BCJR algorithm to obtain the posterior probability estimates for each transmitted bit:

p(b_i = b | x, ĥ_ML),   i = 1, …, N.   (3)

Finally, we decode the received word using the LDPC decoder to obtain a maximum a posteriori estimate for m_i. The LDPC decoder relies on the estimates in (3) being accurate and, if they are not, the decoding might not finish or might return an incorrect codeword. In Fig. 2 we show p(b_i = 1 | x, ĥ_ML) versus the probability p(b_i = 1 | x, h) for one estimation of the fading channel h = [1, …] obtained assuming n = 6 and a signal-to-noise ratio (SNR) equal to 0 dB. We can see in Fig. 2 that the predictions for each bit are quite accurate. But these posterior probability estimates are overconfident roughly half of the time, because ML is an unbiased estimator of the CSI. This could derail the LDPC decoder, because a bit with high confidence for a zero or a one is hard to overrule if it is incorrect.

III. BAYESIAN EQUALIZATION
If we provide the BCJR with the true CSI, we have the exact APP [3]. However, in practice we usually use the ML criterion in (2) to estimate the CSI from a training sequence. This estimated CSI, provided as ground truth to the BCJR algorithm, yields inaccurate estimates of the APP and might mislead the channel decoder into delivering an incorrect codeword (or not converging at all). We propose a Bayesian equalizer that takes into account both the uncertainty about the noise and the uncertainty about the CSI estimate.

Fig. 2. Calibration curve for the ML-BCJR assuming SNR = 0 dB and h = [1, …].

We compute the posterior probability as:

p(b_i = b | x, D) = ∫ p(b_i = b | x, h) p(h | D) dh,   (4)

where b = ±1 and

p(h | D) = p(h) p(x° | b°, h) / p(x° | b°) = p(h) ∏_{i=1}^n p(x°_i | b°_i, h) / p(x°_1, …, x°_n | b°_1, …, b°_n),   (5)

is the posterior probability for the CSI, given the likelihood (Gaussian noise) and the prior (e.g., Rayleigh fading).

The marginalization over h in (4) provides equal or better APP estimates than the ML-BCJR, because it includes all the information in p(h | D). In fact, if the training sequence is large enough, this Gaussian posterior tends to a multidimensional delta centered at its mean, whose value tends to the real value of h, as ĥ_ML does. However, the performance of the ML-BCJR is misled in case of uncertainty in the CSI. Conversely, the Bayesian equalizer in (4) considers the uncertainty in the estimation, providing more accurate APP estimates.

This computation of the APP does not yield a significant improvement in the BER after the equalizer, because for detection we only consider whether the probability is lower or higher than 0.5.
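A minimal numerical sketch of the marginalization in (4): since likelihood and prior are Gaussian, p(h | D) is Gaussian with closed-form mean and covariance (given explicitly in Section III-A), so we can draw M samples and average the per-bit APPs. To keep the sketch self-contained we assume a hypothetical single-tap channel (L = 1, no ISI), for which the per-bit APP reduces to a sigmoid, p(b_i = 1 | x_i, h) = 1/(1 + exp(−2 h x_i / σ_w²)); for L > 1 this step would be the BCJR recursion. All parameter values are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Hypothetical toy setting: L = 1 (no ISI), so the per-bit APP is a sigmoid.
#     For L > 1 the per-bit APP would come from the BCJR algorithm instead. ---
sigma_w = 1.0           # noise std (SNR = 0 dB for unit-power BPSK)
n, N, M = 6, 20, 1000   # training bits, payload bits, Monte Carlo samples

h_true = rng.normal(0.0, 1.0)                    # channel tap, prior N(0, 1)
b_train = rng.choice([-1.0, 1.0], size=n)        # known preamble b°
x_train = b_train * h_true + rng.normal(0.0, sigma_w, size=n)
b = rng.choice([-1.0, 1.0], size=N)              # transmitted payload bits
x = b * h_true + rng.normal(0.0, sigma_w, size=N)

def app(x, h, s2):
    """Exact APP p(b_i = 1 | x_i, h) for BPSK over a single-tap channel."""
    return 1.0 / (1.0 + np.exp(-2.0 * h * x / s2))

# ML estimate of the tap (least squares on the training pairs)
h_ml = b_train @ x_train / n
app_ml = app(x, h_ml, sigma_w**2)                # ML-BCJR analogue

# Gaussian posterior p(h | D): (8)-(9) particularized to L = 1
prec = 1.0 + n / sigma_w**2
mean = (b_train @ x_train / sigma_w**2) / prec
var = 1.0 / prec

# Bayesian equalizer: average the APP over M samples of p(h | D), as in (4)
h_samples = rng.normal(mean, np.sqrt(var), size=M)
app_bayes = app(x[:, None], h_samples[None, :], sigma_w**2).mean(axis=1)

# Averaging over the posterior pulls the probabilities toward 0.5
# (less overconfident), which is what helps the soft decoder.
print(np.abs(app_ml - 0.5).mean(), np.abs(app_bayes - 0.5).mean())
```

Hard decisions (thresholding at 0.5) barely change, but the Bayesian probabilities are less extreme, which is the behavior Fig. 3 illustrates.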
This negligible difference is illustrated in Section IV. However, the main advantage of the Bayesian equalizer is that the performance of the soft decoder improves with this new estimation of the probabilities, which translates into better results in terms of BER at the output of the decoder. In Fig. 3 we show p(b_i = 1 | x, D) versus p(b_i = 1 | x, h) for a fading channel h = [1, …] estimated with the same training sequences assumed in Fig. 2. Compared to the ML-BCJR in Fig. 2, in the Bayesian equalizer the zero-mean Gaussian prior tends to underestimate the CSI. Then, the averaging over all possible values of p(h | D) gives mostly underconfident posterior probability estimates, as we can observe in Fig. 3. Therefore, the LDPC decoder does not have such a strong preference for the received bits and they can be easily flipped, if necessary.

Fig. 3. Calibration curve for the Bayesian equalizer assuming SNR = 0 dB and h = [1, …].

A. Computation of the solution
We propose to use Monte Carlo sampling to obtain an approximation to (4), considering the following steps:

1) Calculate the posterior of the channel: in (5) the numerator is the product of the likelihood p(x° | b°, h) and the prior of h. Considering a system with BPSK modulation in a Rayleigh channel, both terms are real-valued Gaussians distributed as:

p(x° | b°, h) ~ N((B°)^T h, σ_w² I),   (6)
p(h) ~ N(0, I),   (7)

where B° = [b°_1, b°_2, …, b°_n] is an L × n matrix, and without loss of generality we assume that the variance of the CSI is equal to one. The expressions for the mean and the covariance matrix of the posterior when both terms are Gaussian are straightforward [15]. These expressions can be particularized for our system as:

h_{h|D} = (I + B°(B°)^T σ_w^{-2})^{-1} B° σ_w^{-2} x°,   (8)
C_{h|D} = (I + B°(B°)^T σ_w^{-2})^{-1},   (9)

where x° = [x°_1, x°_2, …, x°_n]^T.

2) Produce random samples from the posterior: with the vector of means and the covariance matrix, we can sample to obtain M random samples.

3) Solve the BCJR algorithm: the APP estimate of each transmitted bit is computed for the M different samples of p(h | D).

4) Computation of (4): the M different values of p(b_i = b | x_1, …, x_N, h) are averaged to approximate the APP of each transmitted bit over all possible values of h:

p(b_i = b | x_1, …, x_N, D) ≈ (1/M) Σ_{j=1}^M p(b_i = b | x_1, …, x_N, h_j),   (10)

which yields an approximation of (4).

As already discussed, if we use these optimal APP estimates as inputs to a soft decoder, the system achieves better performance, especially in case of inaccurate estimation of the CSI, i.e., when the length of the training sequence is short compared to the number of channel taps. However, this solution is computationally demanding, because we have to run the BCJR algorithm M times, and its complexity increases exponentially with L.

IV. SIMULATION RESULTS
To illustrate the performance of the proposed receiver, we compare its bit error rate curves to those of the ML-BCJR, before and after the decoder. In all experiments we consider the following scenario:

• Block frames of 500 random bits encoded with a regular (3,6) LDPC code of rate 1/2.
• Up to 10000 frames of 1000 bits are transmitted over the channel.
• Between frames, a training sequence of n random uncoded bits is transmitted to estimate the channel.
• Every frame, and its associated training sequence, is sent over the same Rayleigh fading channel. We consider that the channel coherence time is greater than the duration of the frame, i.e., the channel does not change during this time. Furthermore, in our experiments we take the same values for the taps of the channel during all transmitted frames. For the channel with three taps (L = 3) these values are H(z) = 0.… + 0.… z^{-1} + 0.… z^{-2}, and for L = 6: H(z) = 0.… + 0.… z^{-1} − 0.… z^{-2} + 0.… z^{-3} + 0.… z^{-4} − 0.… z^{-5}.

We consider for the Bayesian estimation a prior with zero mean and variance equal to 1 for all the taps.

We depict in Fig. 4 the BER measured after the ML-BCJR (dashed lines) and Bayesian (solid lines) equalizers for the channel with L = 3 and different lengths of the training sequence. The differences between both equalizers are negligible, although the BER of the Bayesian equalizer is always lower.

Fig. 4. BER performance for the Bayesian equalizer (solid lines) and ML-BCJR (dashed lines) before the decoder, for a channel with 3 taps and different lengths of the training sequence, n = 10 (◦), n = 15 (□) and n = 20 (♦). Dash-dotted line illustrates performance assuming perfect CSI.

In Fig. 5 and 6 we show the BER after the LDPC decoder has corrected the errors introduced by the channel. In these plots, we can observe that for short training sequences the gains of using the Bayesian equalizer over the ML-BCJR are significant (over … dB).
The difference between the results in Fig. 4, and Fig. 5 and 6, can be explained by the APP estimates given by each procedure. When we measure the BER after the equalizer (Fig. 4), we only care whether each bit has been correctly decoded and, consequently, we are only measuring how good the APP estimate is around the 50% threshold. When we measure the BER after the LDPC decoder (Fig. 5 and 6), we need the APP estimates to be accurate everywhere, because the LDPC decoder needs these estimates to correctly decode the transmitted codeword, i.e., the belief propagation algorithm uses the APP for each individual bit. These results sustain our claim that the Bayesian equalizer provides more accurate predictions of the APP than the ML-BCJR equalizer, as the LDPC decoder is able to decode better with them.

Fig. 5. BER performance for the Bayesian equalizer (solid lines) and ML-BCJR (dashed lines) after the LDPC decoder, for a channel with 3 taps and different lengths of the training sequence, n = 10 (◦), n = 15 (□), n = 20 (♦), n = 35 (▽) and n = 60 (△). Dash-dotted line illustrates performance assuming perfect CSI.

In particular, in Fig. 5 we can observe that the Bayesian equalizer improves on the ML-BCJR equalizer as we increase the SNR for a fixed length of the training sequence, and that both equalizers tend to coincide as we increase the length of the training sequence. This last result is expected, because as we increase the length of the training sequence the posterior for the CSI tends to a delta function and the ML-BCJR and Bayesian equalizers coincide. But for these equalizers to coincide, we need over 60 training samples (more than 20 per coefficient), and for shorter training sequences the Bayesian equalizer is vastly superior.

Finally, in Fig. 6, we plot the curves for the channel with 6 taps. In this experiment, we can see the previous result magnified, because the longer the impulse response of the channel is, the more uncertain the channel estimate is and the larger the room for improvement for the Bayesian equalizer. In this experiment, we can observe gains of over … dB for a training sequence with 25 symbols and SNR = 7 dB.

Fig. 6. BER performance for the Bayesian equalizer (solid lines) and ML-BCJR (dashed lines) after the decoder, for a channel with 6 taps and different lengths of the training sequence, n = 15 (◦), n = 25 (□), n = 40 (♦), n = 50 (▽) and n = 90 (△). Dash-dotted line illustrates performance assuming perfect CSI.

V. CONCLUSIONS AND FUTURE WORK
Channel equalization has traditionally been solved using a discriminative model where the only variable modeled as random is the noise. The generative model introduced in this paper, where the posterior probability density function of the estimated CSI is included, is a more efficient solution. If we just want to estimate the encoded transmitted symbols, the discriminative model is a good choice. However, if the estimation of the APP is needed, i.e., when the decoder very much benefits from this information, the discriminative solution exhibits poor results whenever the CSI is badly estimated. On the contrary, the Bayesian approach exploits the full statistical model to provide optimal APP estimates. We show that these estimates are useful if LDPC encoding is used. Other decoders may take advantage of this solution as well.

Although both the BCJR algorithm and LDPC decoding can be computed efficiently using machine learning algorithms applied on graphs [16], the drawback of this proposal is its computational complexity, because we have to compute the BCJR algorithm several times to average the integral in (4). Alternative graphical representations or inference algorithms that capture the essence of this approach may yield sub-optimal but less demanding solutions for the Bayesian equalization proposed. Furthermore, other systems and channels can be considered.
ACKNOWLEDGMENT

This work was partially funded by the Spanish government (Ministerio de Educación y Ciencia TEC2006-13514-C02-{ }/TCM and TEC2009-14504-C02-{ }, Consolider-Ingenio 2010 CSD2008-00010), and the European Union (FEDER).

REFERENCES

[1] J. G. Proakis, Digital Communications, 5th ed. New York, NY: McGraw-Hill, 2008.
[2] D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, no. 2, pp. 268–278, Mar. 1973.
[3] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate," IEEE Trans. Inf. Theory, vol. 20, no. 2, pp. 284–287, Mar. 1974.
[4] T. Richardson and R. Urbanke, Modern Coding Theory. Cambridge University Press, March 2008.
[5] T. M. Cover and J. A. Thomas, Elements of Information Theory. New Jersey, USA: John Wiley & Sons, 2006.
[6] P. M. Olmos, J. J. Murillo-Fuentes, and F. Pérez-Cruz, "Joint nonlinear channel equalization and soft LDPC decoding with Gaussian processes," IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1183–1192, Mar. 2010.
[7] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, and V. Stemann, "Efficient erasure correcting codes," IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 569–584, Feb. 2001.
[8] S. Chung, D. Forney, T. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Commun. Lett., vol. 5, no. 2, pp. 58–60, Feb. 2001.
[9] C. Douillard, M. Jezequel, C. Berrou, A. Picart, P. Didier, and A. Glavieux, "Iterative correction of intersymbol interference: Turbo-equalization," European Trans. Telecomm., vol. 6, no. 5, pp. 507–511, Sep.–Oct. 1995.
[10] L. M. Davis, I. B. Collings, and P. Hoeher, "Joint MAP equalization and channel estimation for frequency-selective and frequency-flat fast-fading channels," IEEE Trans. Commun., vol. 49, no. 12, pp. 2106–2114, Dec. 2001.
[11] R. Otnes and M. Tüchler, "On iterative equalization, estimation, and decoding," in Proc. IEEE Int. Conf. on Commun., vol. 4, May 2003, pp. 2958–2962.
[12] X. Wang and R. Chen, "Blind turbo equalization in Gaussian and impulsive noise," IEEE Trans. Veh. Technol., vol. 50, no. 4, pp. 1092–1105, Jul. 2001.
[13] B. Lu and X. Wang, "Bayesian blind turbo receiver for coded OFDM systems with frequency offset and frequency-selective fading," in Proc. IEEE Int. Conf. on Commun., vol. 1, Apr.–May 2002, pp. 44–48.
[14] P. Piantanida, S. Sadough, and P. Duhamel, "On the outage capacity of a practical decoder accounting for channel estimation inaccuracies," IEEE Trans. Commun., vol. 57, no. 5, pp. 1341–1350, May 2009.
[15] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. New York, NY: Prentice-Hall, 1993, vol. 1.
[16] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.