Distributed Arithmetic Coding for Sources with Hidden Markov Correlation
Yong Fang and Jechang Jeong, Member, IEEE
Abstract—Distributed arithmetic coding (DAC) has been shown to be effective for Slepian-Wolf coding, especially for short data blocks. In this letter, we propose to use the DAC to compress memory-correlated sources. More specifically, the correlation between sources is modeled as a hidden Markov process. Experimental results show that the performance is close to the theoretical Slepian-Wolf limit.
Index Terms—Distributed Source Coding (DSC), Slepian-Wolf Coding (SWC), Distributed Arithmetic Coding (DAC), Hidden Markov Correlation, Forward Algorithm.
I. INTRODUCTION

We consider the problem of Slepian-Wolf Coding (SWC) with decoder Side Information (SI). The encoder compresses a discrete source X = {x_t}_{t=1}^N in the absence of the discretely correlated SI Y = {y_t}_{t=1}^N. The Slepian-Wolf theorem states that lossless compression is achievable at rates R ≥ H(X|Y), the conditional entropy of X given Y, where both X and Y are discrete random processes [1]. Conventionally, channel codes, such as turbo codes [2] or Low-Density Parity-Check (LDPC) codes [3], are used to deal with the SWC problem.

Recently, some SWC techniques based on entropy coding have been proposed, such as Distributed Arithmetic Coding (DAC) [4], [5] and Overlapped Quasi-Arithmetic Coding (OQAC) [6]. These schemes can be seen as extensions of classic Arithmetic Coding (AC), whose principle is to encode source X at rates H(X|Y) ≤ R < H(X) by allowing overlapped intervals. The overlapping leads to a larger final interval and hence a shorter codeword; however, ambiguous codewords are unavoidable at the same time. A soft joint decoder exploits SI Y to decode X. Afterwards, the time-shared DAC (TS-DAC) [7] was proposed to deal with the symmetric SWC problem. To realize rate-incremental SWC, the rate-compatible DAC was proposed in [8].

In this letter, we study how to use the DAC to compress sources with hidden Markov correlation.

II. BINARY DISTRIBUTED ARITHMETIC CODING
Let p be the bias probability of binary source X, i.e., p = P(x_t = 1). In the classic AC, source symbol x_t is iteratively mapped onto sub-intervals of [0, 1), whose lengths are proportional to (1 − p) and p. The resulting rate is R ≥ H(X).

(This research was supported by the Seoul Future Contents Convergence (SFCC) Cluster established by the Seoul R&BD Program. The authors are with the Lab. of ICSP, Dept. of Electronics and Communications Engineering, Hanyang University, Haengdang-dong, Seongdong-gu, Seoul 133-791, Korea; e-mail: [email protected].)

In the DAC [4], interval lengths are proportional to the modified probabilities (1 − p)^γ and p^γ, where

H(X|Y)/H(X) ≤ γ ≤ 1.   (1)

The resulting rate is R ≥ γH(X) ≥ H(X|Y). To fit the [0, 1) interval, the sub-intervals have to be partially overlapped. More specifically, symbols x_t = 0 and x_t = 1 correspond to intervals [0, (1 − p)^γ) and [1 − p^γ, 1), respectively. It is precisely this overlapping that leads to a larger final interval, and hence a shorter codeword. However, as a cost, the decoder cannot decode X unambiguously without Y.

To describe the decoding process, we define a ternary symbol set {0, χ, 1}, where χ represents a decoding ambiguity. Let C_X be the codeword and x̃_t be the t-th decoded symbol; then

x̃_t =
  0,  if 0 ≤ C_X < 1 − p^γ;
  χ,  if 1 − p^γ ≤ C_X < (1 − p)^γ;
  1,  if (1 − p)^γ ≤ C_X < 1.   (2)

When the t-th symbol is decoded, if x̃_t = χ, the decoder performs a branching: two candidate branches are generated, corresponding to the two alternative symbols x_t = 0 and x_t = 1. For each new branch, its metric is updated and the corresponding interval is selected for the next iteration. To reduce complexity, every time after decoding a symbol, the decoder uses the M-algorithm to keep at most M branches with the best partial metrics, and prunes the others [4]. Note that the metric is not reliable for the very last symbols of a finite-length sequence X [5].
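As an illustration, the overlapped interval mapping and the ternary decoding rule (2) can be sketched in Python (a minimal sketch of our own; the function names and the example parameter values are assumptions, not part of the letter):

```python
def subintervals(p, gamma):
    """Overlapped DAC sub-intervals of [0, 1).

    Symbol 0 maps to [0, (1 - p)**gamma); symbol 1 maps to
    [1 - p**gamma, 1).  For gamma < 1 the two intervals overlap.
    """
    return (0.0, (1.0 - p) ** gamma), (1.0 - p ** gamma, 1.0)


def ternary_decode(c, p, gamma):
    """Ternary decoding rule (2): classify codeword value c in [0, 1)
    as 0, 1, or the ambiguity symbol chi (the overlap region)."""
    if c < 1.0 - p ** gamma:
        return 0        # only the symbol-0 interval contains c
    elif c < (1.0 - p) ** gamma:
        return 'chi'    # overlap region: the decoder must branch
    else:
        return 1        # only the symbol-1 interval contains c
```

For example, with p = 0.5 and γ = 0.8, the overlap region is [1 − 0.5^0.8, 0.5^0.8) ≈ [0.426, 0.574), and any codeword value falling inside it forces a branching.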
This problem is solved by encoding the last T symbols without interval overlapping [5]. It means that for 1 ≤ t ≤ (N − T), x_t is mapped onto [0, (1 − p)^γ) or [1 − p^γ, 1), while for (N − T + 1) ≤ t ≤ N, x_t is mapped onto [0, 1 − p) or [1 − p, 1). Therefore, a binary DAC system can be described by four parameters: {p, γ, M, T}.

III. HIDDEN MARKOV MODEL AND FORWARD ALGORITHM
Let S = {s_t}_{t=1}^N be a sequence of states and Z = {z_t}_{t=1}^N be a sequence of observations. A hidden Markov process is defined by λ = (A, B, π): A = {a_ji} is the state transition probability matrix, where a_ji = P(s_t = i | s_{t−1} = j); B = {b_i(k)} gives the observation probability distributions, where b_i(k) = P(z_t = k | s_t = i); and π = {π_i} is the initial state distribution, where π_i = P(s_1 = i).

The aim of the forward algorithm is to compute P(z_1, ..., z_t | λ), given observations {z_1, ..., z_t} and model λ. Let α_t(i) be the probability of observing the partial sequence {z_1, ..., z_t} such that state s_t is i, i.e.,

α_t(i) = P(z_1, ..., z_t, s_t = i | λ).   (3)

Initially, we have

α_1(i) = π_i b_i(z_1).   (4)

For t > 1, α_t(i) can be computed through the iteration

α_t(i) = { Σ_j [α_{t−1}(j) a_ji] } b_i(z_t).   (5)

Therefore,

P(z_1, ..., z_t | λ) = Σ_i α_t(i).   (6)

In practice, α_t(i) is usually normalized in place as

α_t(i) ← α_t(i) / δ_t,   (7)

where

δ_t = Σ_i α_t(i).   (8)

In this case, we have

P(z_1, ..., z_t | λ) = Π_{t′=1}^{t} δ_{t′}.   (9)
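As a concrete illustration, the normalized forward recursion (3)-(9) can be sketched in plain Python (our own sketch; the nested-list model layout is an assumption, not notation from the letter):

```python
import math

def forward_log_prob(z, A, B, pi):
    """Return log P(z_1, ..., z_T | lambda) via the scaled forward algorithm.

    A[j][i] = P(s_t = i | s_{t-1} = j), B[i][k] = P(z_t = k | s_t = i),
    pi[i] = P(s_1 = i).  Normalizing by delta_t (eqs. (7)-(8)) avoids
    numerical underflow; by eq. (9), log P is the sum of log delta_t.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][z[0]] for i in range(n)]           # eq. (4)
    log_p = 0.0
    for t in range(len(z)):
        if t > 0:                                            # eq. (5)
            alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][z[t]]
                     for i in range(n)]
        delta = sum(alpha)                                   # eq. (8)
        alpha = [a / delta for a in alpha]                   # eq. (7)
        log_p += math.log(delta)                             # eq. (9)
    return log_p
```

For a single observation the result reduces to log Σ_i π_i b_i(z_1), which serves as a convenient sanity check.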
IV. DAC FOR HIDDEN MARKOV CORRELATION

Assume that binary source X and SI Y are correlated by Y = X ⊕ Z, where Z is generated by a hidden Markov model with parameter λ. X is encoded using a {p, γ, M, T} DAC encoder. The decoding process is very similar to that described in [4]. The only difference is that the forward algorithm is embedded into the DAC decoder and the metric of each branch is replaced by P(z_1, ..., z_t | λ), where z_t = x_t ⊕ y_t.
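A minimal sketch of this branch-metric update follows (our own illustration; the Branch class and its fields are hypothetical, and the model layout matches the forward-algorithm conventions of Section III):

```python
import math

class Branch:
    """One DAC decoder branch carrying the normalized forward vector and
    the accumulated log-metric log P(z_1, ..., z_t | lambda)."""

    def __init__(self):
        self.alpha = None       # normalized forward vector, eq. (7)
        self.log_metric = 0.0   # running sum of log delta_t, eq. (9)

    def extend(self, x_t, y_t, A, B, pi):
        """Update the metric after hypothesizing decoded symbol x_t,
        with z_t = x_t XOR y_t as in Section IV."""
        z_t = x_t ^ y_t
        n = len(pi)
        if self.alpha is None:
            alpha = [pi[i] * B[i][z_t] for i in range(n)]     # eq. (4)
        else:
            alpha = [sum(self.alpha[j] * A[j][i] for j in range(n)) * B[i][z_t]
                     for i in range(n)]                        # eq. (5)
        delta = sum(alpha)
        self.alpha = [a / delta for a in alpha]
        self.log_metric += math.log(delta)
        return self.log_metric
```

At each ambiguity χ the decoder would clone a branch, call extend with x_t = 0 on one copy and x_t = 1 on the other, and keep the M branches with the largest log_metric (the M-algorithm).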
V. EXPERIMENTAL RESULTS

We have implemented a 16-bit DAC codec system. The bias probability of X is p = 0 . . According to the recommendations of [5], we set M = 2048 and T = 15. The same 2-state (0 and 1) and 2-output (0 and 1) sources as in [9] are used in the simulations (see Table I). The length of the data block used in each test is N = 1024.

[TABLE I: Models for simulation.]

To achieve lossless compression, each test starts from γ = H(X|Y) (see Table II). If the decoding fails, we increase γ by 0.01. This process is iterated until the decoding succeeds. For each model, results are averaged over 100 trials. Experimental results are listed in Table II.

For comparison, Table II also includes experimental results for the same settings from [9]. In each test of [9], N = 16384 source symbols are encoded using an LDPC code. In addition, to synchronize the hidden Markov model, Nα original source symbols are sent to the decoder directly without compression.

[TABLE II: Experimental results.]

The results show that the DAC performs similarly to, or slightly better than (for models 1 and 2), the LDPC-based approach [9]. Moreover, for hidden Markov correlation, the DAC outperforms the LDPC-based approach in two aspects:

1) The LDPC-based approach requires longer codes to achieve better performance, while the DAC is insensitive to code length [4].

2) For the LDPC-based approach, to synchronize the hidden Markov model, a certain proportion of original source symbols must be sent to the decoder as "seeds". However, it is hard to determine α, the optimal proportion of "seeds". The results reported in [9] were obtained through an exhaustive search, which limits the approach's application in practice.
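The rate search described above can be sketched as follows (a hypothetical sketch: decode_ok stands in for one full encode/decode trial of a {p, γ, M, T} DAC codec and is not part of the letter):

```python
def smallest_working_gamma(decode_ok, gamma_start, step=0.01, gamma_max=1.0):
    """Mimic the experimental procedure: start from gamma_start and
    raise gamma in `step` increments until decoding succeeds.

    decode_ok(gamma) abstracts one encode/decode trial; gamma_max = 1.0
    corresponds to classic AC with no interval overlap.
    """
    gamma = gamma_start
    while gamma <= gamma_max:
        if decode_ok(gamma):
            return gamma    # smallest tested gamma that decodes losslessly
        gamma += step       # decoding failed: spend more rate and retry
    return None
```

By Section II, the rate actually spent in a successful trial is then R ≥ γH(X) bits per symbol.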
VI. CONCLUSION

This letter studies the compression of sources with hidden Markov correlation using the DAC. The forward algorithm is incorporated into the DAC decoder. The results are similar to those of the LDPC-based approach. Compared to the LDPC-based approach, the DAC is more suitable for practical applications.

REFERENCES

[1] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, no. 4, pp. 471-480, Jul. 1973.
[2] J. Garcia-Frias, "Compression of correlated binary sources using turbo codes," IEEE Commun. Lett., vol. 5, no. 10, pp. 417-419, Oct. 2001.
[3] A. Liveris, Z. Xiong, and C. Georghiades, "Compression of binary sources with side information at the decoder using LDPC codes," IEEE Commun. Lett., vol. 6, no. 10, pp. 440-442, Oct. 2002.
[4] M. Grangetto, E. Magli, and G. Olmo, "Distributed arithmetic coding," IEEE Commun. Lett., vol. 11, no. 11, pp. 883-885, Nov. 2007.
[5] M. Grangetto, E. Magli, and G. Olmo, "Distributed arithmetic coding for the asymmetric Slepian-Wolf problem," IEEE Trans. Signal Process.
[6] X. Artigas, S. Malinowski, C. Guillemot, and L. Torres, "Overlapped quasi-arithmetic codes for distributed video coding," IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, USA, Sep. 2007.
[7] M. Grangetto, E. Magli, and G. Olmo, "Symmetric distributed arithmetic coding of correlated sources," IEEE International Workshop on Multimedia Signal Processing (MMSP), Crete, Greece, Oct. 2007.
[8] M. Grangetto, E. Magli, R. Tron, and G. Olmo, "Rate-compatible distributed arithmetic coding," IEEE Commun. Lett., vol. 12, no. 8, pp. 575-577, Aug. 2008.
[9] J. Garcia-Frias and W. Zhong, "LDPC codes for compression of multi-terminal sources with hidden Markov correlation."