Asynchronous Physical-Layer Network Coding with Quasi-Cyclic Codes

Abstract

Communication in the presence of bounded timing asynchronism which is known to the receiver but cannot be easily compensated is studied. Examples of such situations include point-to-point communication over inter-symbol interference (ISI) channels and asynchronous wireless networks. In these scenarios, although the receiver may know all the delays, it is often not an easy task for the receiver to compensate for the delays, as the signals are mixed together. A novel framework called interleave/deinterleave transform (IDT) is proposed to deal with this problem. It is shown that the IDT allows one to design the delays so that quasi-cyclic (QC) codes with a proper shifting constraint can be used accordingly. When used in conjunction with QC codes, IDT provides significantly better performance than existing schemes relying solely on cyclic codes. Two instances of asynchronous physical-layer network coding, namely integer-forcing equalization for ISI channels and asynchronous compute-and-forward, are then studied. For integer-forcing equalization, the proposed scheme provides improved performance over using cyclic codes. For asynchronous compute-and-forward, the proposed scheme shows that there is no loss in the achievable information rates due to delays which are integer multiples of the symbol duration. Further, the proposed approach shows that delays introduced by the channel can sometimes be exploited to obtain higher information rates than those obtainable in the synchronous case. The proposed IDT can be thought of as a generalization of the interleaving/deinterleaving idea proposed by Wang et al., which allows the use of QC codes, thereby substantially increasing the design space.



Ping-Chung Wang, Yu-Chih Huang, and Krishna R. Narayanan
Department of Electrical and Computer Engineering, Texas A&M University

Index Terms

Compute-and-forward, physical-layer network coding, and integer-forcing receiver.

I. INTRODUCTION

Physical-layer network coding (or compute-and-forward) [2] [3] [4] has been shown to be a way to effectively harness interference in wireless networks and to provide significantly higher throughput than conventional strategies for many wireless networking problems. However, most of the results in the literature consider the case when the time delays from the multiple transmitters to a receiver are all identical (we refer to this as the synchronous case). One of the important open problems in this area is to determine whether the information rates achieved with compute-and-forward in the synchronous case can also be obtained when the time delays from the multiple transmitters are different (we refer to this as the asynchronous case). So far, this question has not been conclusively answered, and our understanding of asynchronous physical-layer network coding is not as thorough as that of the synchronous one. Recently, there have been some efforts in the literature to address such problems for some specific models such as ISI [5] and asynchronous physical-layer network coding (and compute-and-forward as well) [6] [7] [8] [9]. In both cases, cyclic codes have been suggested for combating the time delays for these two seemingly different problems [5] [8]. While cyclic codes are quite useful for these problems, there has been no proof that rates achievable in the synchronous case are also achievable in the asynchronous case. One important reason for this is that there is no proof showing the existence of ensembles of cyclic codes that can achieve capacity.

Part of the results in this paper has been submitted to the 2014 IEEE Global Communications Conference [1].

In this paper, we show that there is no fundamental loss in the achievable information rates due to asynchronism when the time delays introduced by the channel are integer multiples of a symbol duration. Interestingly, we also show that in some scenarios, time delays introduced by the channel can be exploited to achieve higher information rates than those achievable in the synchronous case. These results are based on two insights and novel ideas proposed in this paper. The first is the insight that cyclic codes are not necessary to deal with asynchronism and that quasi-cyclic (QC) codes suffice. The second main idea is to use an interleave/deinterleave transform (IDT) which equips QC codes with the capability to combat time delays. With a slight rate reduction, this transform converts any linear shift by an integer multiple of the symbol duration introduced by the channel into a circular shift by another integer value that depends on the parameter we choose. Therefore, one can utilize the IDT to design the equivalent time delays seen at the transform output and implement a QC code accordingly. We then show the existence of an ensemble of QC codes that can achieve capacity for channels whose capacity can be achieved by linear codes, and leverage this result to prove the aforementioned information-theoretic result for asynchronous physical-layer network coding.

To give concrete examples, we implement the proposed IDT together with QC codes for two instances of asynchronous physical-layer network coding, namely integer-forcing equalization for ISI channels and asynchronous compute-and-forward. For the integer-forcing equalization proposed in [5], we show that our IDT-QC codes achieve the upper bound on information rates presented in [5], which may not be achievable by the cyclic coding scheme proposed therein. For asynchronous compute-and-forward, when the delays are integer multiples of the symbol duration, we first show that the rates achievable in the synchronous case can also be achieved in the asynchronous case. In addition to this, we also show that the proposed IDT-QC codes are capable of exploiting another dimension, namely the delay dimension, which leads to rates exceeding those achieved in synchronous compute-and-forward [4]. Finally, we consider the case of non-integer valued delays and, when rectangular pulses are used, we show that the proposed scheme achieves higher rates than the scheme in [9]. It is worth noting that the proposed IDT-QC codes are not limited to these two specific examples and can potentially be implemented for many networks with delays which cannot be easily compensated.

In addition to being of theoretical importance, the use of quasi-cyclic (QC) codes is of substantial practical importance as well. QC codes, QC low-density parity-check (LDPC) codes in particular, are quite popular in modern coding theory due to the following desirable properties. They can be encoded using linear feedback shift registers [10], and a message passing decoder can be implemented efficiently in hardware in a partially parallel architecture [11]. Further, the QC property makes it efficient to route wires when implementing the message passing decoder [12]. Moreover, the family of QC codes is much larger than the family of cyclic codes and subsumes it as a special case. Due to these properties, QC LDPC codes have been included in many real-world applications such as IEEE 802.11n [13], IEEE 802.16e [14], DVB-S2 [15], etc. In this paper, we show that in addition to these desirable properties, when used with the IDT, QC codes can be a perfect candidate for combating time delays.

The proposed IDT framework can be regarded as a generalization of the scheme in [6], where an interleaver/deinterleaver pair has been implemented together with convolutional codes. In the very last stage of the preparation of this paper, we became aware of a very recently posted independent work [16] where an idea similar to [6] has been used together with tail-biting convolutional codes for asynchronous physical-layer network coding for the two-way relay channel. Our paper differs from [6] and [16] in the following two important ways. Firstly, in contrast to [6] and [16], which consider only convolutional codes and cyclic codes, our generalization permits the use of any QC linear/lattice code, thereby expanding the design space for the codes that can be used with asynchronism. Secondly, the use of QC codes allows us to derive capacity results for channels with asynchronism.

A. Organization

The paper is organized as follows. In Section II, we provide definitions of cyclic codes and QC codes and also review a well-known construction of QC codes based on protographs. We also review the modulation scheme typically used in the compute-and-forward literature. This modulation scheme has been shown to preserve the structures induced by the channel and hence is crucial for compute-and-forward. This review is of practical importance, as our proposed scheme relies heavily on QC codes and the modulation scheme, and the family of QC LDPC codes is one of the most popular classes of QC codes in practice. In Section III, we elucidate the proposed IDT-QC codes and show some properties of the proposed codes, including the capacity-achieving property. Section IV and Section V provide two interesting applications of the proposed IDT-QC codes, namely point-to-point communication over ISI channels and asynchronous compute-and-forward. In Section VI, we introduce a new joint detection and decoding scheme for our proposed IDT-QC codes. This section lifts the information-theoretic framework proposed in Sections III-V towards practical implementation by explicitly introducing a practically implementable decoding scheme. Finally, Section VII concludes the paper.

B. Notation

Throughout the paper, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}$ represent the set of real numbers, complex numbers, and integers, respectively. $P(E)$ denotes the probability of the event $E$. Vectors and matrices are written in lowercase boldface and uppercase boldface, respectively. We use $*$ to denote linear convolution. For a vector $\mathbf{x}$, we use $\mathbf{x}^{(t)}$ to denote the right circularly shifted version of $\mathbf{x}$ by $t$, e.g., for $\mathbf{x} = [1, 2, 3, 4]$, $\mathbf{x}^{(1)} = [4, 1, 2, 3]$. Moreover, $\oplus$ and $\odot$ are addition and multiplication, respectively, over a finite field whose size is understood from the context.

II. PRELIMINARIES

We first give definitions of cyclic codes and QC codes and then discuss a well-known construction of QC LDPC codes.

Definition 1 (Cyclic codes). A linear code $\mathcal{C}$ is a cyclic code of length $N$ if any circular shift of a codeword is a codeword in $\mathcal{C}$, i.e., for every $\mathbf{c} \in \mathcal{C}$, $\mathbf{c}^{(i)} \in \mathcal{C}$ for all $i = 0, \ldots, N-1$.

Definition 2 (Quasi-cyclic codes - Representation I). A linear code $\mathcal{C}$ is a QC code with shifting constraint $b$ if any circular shift of a codeword by a multiple of $b$ is a codeword in $\mathcal{C}$, i.e., for every $\mathbf{c} \in \mathcal{C}$, $\mathbf{c}^{(bi)} \in \mathcal{C}$ for all $i = 0, \ldots, \lfloor N/b \rfloor - 1$.

Definition 3 (Quasi-cyclic codes - Representation II). A linear code $\mathcal{C}$ is a QC code with shifting constraint $b$ if every codeword $\mathbf{c} \in \mathcal{C}$ consists of $b$ sub-blocks and, for each codeword, circularly shifting every sub-block by the same amount results in a codeword.

Note that the above two representations of QC codes are equivalent, and such codes are referred to as $b$-QC codes. One can be converted to the other via an interleaver. Throughout the paper, unless mentioned otherwise, the first representation of QC codes is adopted (Definition 2). On the other hand, many constructions in the literature (e.g. [17], [18]) adopt the second representation.

LDPC codes have been very popular in modern coding theory and in practice due to their ability to achieve near-capacity performance with low decoding complexity and their outstanding performance in the finite-length regime. The family of QC LDPC codes is a special class of LDPC codes possessing the QC property, which have efficient encoding and decoding algorithms. In what follows, we briefly review the construction of QC LDPC codes. Most of the works in the literature use the protograph-based construction of [19] to generate QC LDPC codes; see for example [17], [18] and the references therein. To construct a length-$N$ $b$-QC LDPC code from a protograph, one begins with a $c \times b$ protomatrix and then replaces each entry in the protomatrix by an $L \times L$ matrix, where $L \triangleq N/b$. The replacement follows the rule that if the entry is 1, it is replaced by a random $L \times L$ permutation matrix, while if the entry is 0, it is replaced by an all-zero matrix. This results in a $cL \times N$ parity-check matrix that can be used to generate an LDPC code. Now, if we further restrict those permutation matrices to be circulant matrices, then the output is the parity-check matrix of a $b$-QC LDPC code. Unlike for standard linear codes, the generator matrix of a QC code cannot always be written in a systematic form without violating the QC property. However, many existing QC LDPC ensembles, including the ones used for simulation and for the proofs, admit such a systematic form.

In order to use the above QC codes for transmission, we modulate the codeword symbols onto elements in a constellation $\mathcal{A}$ to form the transmitted signals. Throughout the paper, we consider using QC codes over a prime field $\mathbb{F}_p$ and restrict the constellation $\mathcal{A}$ to be pulse amplitude modulation (PAM) with $p$ elements, i.e., $\mathcal{A} = \{-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}\}$. The mapping $\mathcal{M} : \mathbb{F}_p \to \mathcal{A}$ is given by

$$\mathcal{M}(u) \triangleq \begin{cases} u, & 0 \le u \le \frac{p-1}{2}, \\ u - p, & \frac{p-1}{2} < u < p, \end{cases} \tag{1}$$

for $p \ge 3$, and $\mathcal{M}(u) \triangleq u - \frac{1}{2}$ for $p = 2$ (i.e., BPSK). This mapping has the important property that $\mathcal{M}(u \oplus v) = \mathcal{M}(u) + \mathcal{M}(v) \bmod p$ and $\mathcal{M}(u \odot v) = \mathcal{M}(u) \cdot \mathcal{M}(v) \bmod p$ for $p \ge 3$, and $\mathcal{M}(u \oplus v) + \frac{1}{2} = \mathcal{M}(u) + \frac{1}{2} + \mathcal{M}(v) + \frac{1}{2} \bmod 2$ and $\mathcal{M}(u \odot v) + \frac{1}{2} = (\mathcal{M}(u) + \frac{1}{2}) \cdot (\mathcal{M}(v) + \frac{1}{2}) \bmod 2$ for $p = 2$. Note that the above operation is precisely the standard procedure for constructing a lattice from linear codes via Construction A [23] [24]. In fact, one can easily show that the above construction results in lattices having the QC property, i.e., QC lattices. Moreover, from a result by Forney et al. [25], we know that applying a capacity-achieving linear code to Construction A results in a sphere-bound-achieving (or Poltyrev-good) lattice. Therefore, existing good QC codes such as AR4JA codes [18] can be adopted to generate good QC lattice codes via Construction A.

III. PROPOSED INTERLEAVE/DEINTERLEAVE TRANSFORMED QUASI-CYCLIC CODE

Even though our ultimate application is in networks, in this section, we start with point-to-point communication to facilitate the illustration of the proposed IDT. Consider point-to-point communication with additive white Gaussian noise (AWGN) and with a time delay $\tau$ that is upper bounded by the maximal possible delay $D_{\max}$. We assume that the transmitter only has access to $D_{\max}$ but the receiver knows both $\tau$ and $D_{\max}$. For the point-to-point case, one can easily achieve the capacity by using a capacity-achieving code, since $\tau$ is known and can be easily compensated. However, in a network where there are multiple source nodes, the signals may arrive at different times and are all mixed together, so this simple approach may no longer work. In order to obtain insight into this problem, we begin with the point-to-point case. Motivated by this issue, we propose a general framework called IDT which utilizes an interleaver/deinterleaver pair to transform the received signal into the desired form. When combined with QC codes, the IDT allows us to combat time delays that may be introduced in many practical scenarios such as asynchronous channels and/or ISI channels. We call this combination interleave/deinterleave transformed quasi-cyclic (IDT-QC) codes.

A. System Model

Consider point-to-point communication with AWGN and delay $\tau \in \{0, \ldots, D_{\max}\}$. The transmitter wishes to send a message $\mathbf{w} \in \mathbb{F}_p^K$ to the receiver. It first feeds the message into an encoder $\mathcal{E}_N : \mathbb{F}_p^K \to \mathbb{F}_p^N$ to form the codeword $\mathbf{c} = \mathcal{E}_N(\mathbf{w}) \in \mathbb{F}_p^N$. The transmitter adopts the modulation scheme $\mathcal{M} : \mathbb{F}_p^N \to \mathcal{A}^N$ to form the transmitted signal $\mathbf{x} = \mathcal{M}(\mathbf{c}) \in \mathcal{A}^N$, where $\mathcal{A}$ is the signal constellation. The transmitted signal $\mathbf{x}$ is subject to an input power constraint $P$:

$$\frac{1}{N}\|\mathbf{x}\|^2 = \frac{1}{N}\sum_{n=1}^{N} |x[n]|^2 \le P. \tag{2}$$

Note that by systematic form, we specifically mean encoders whose generator matrices can be written as $[\mathbf{P} \mid \mathbf{I}]$, where $\mathbf{I}$ is the identity matrix. While it is true that for any linear code one can always find a set of columns that can be used as systematic bits, these positions might not be consecutive. Since reordering the bits destroys the QC property, it is not possible to put the generator matrix of every QC code into the systematic form mentioned above without violating the QC property.

Mappings between two rings that preserve addition and multiplication in this way are said to be ring homomorphisms. A ring homomorphism is a ring isomorphism if it is bijective. One can see that the mapping in (1) is in fact a natural mapping from $\mathbb{F}_p$ to $\mathbb{Z}/p\mathbb{Z}$, which is known to be a ring isomorphism. There are other constellations possessing such properties (e.g. [20] [21] [22]), but we restrict ourselves to PAM for brevity.

Fig. 1. Block diagram of the proposed IDT.
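As a small numerical check of the ring-homomorphism property of the PAM mapping in (1), the following Python sketch enumerates all pairs in $\mathbb{F}_p$ and tests the additive and multiplicative identities stated in Section II. The function names `pam_map` and `is_homomorphism` are illustrative and do not come from the paper; the check is by brute force and is only meant for small primes.

```python
def pam_map(u, p):
    """Mapping M of (1): F_p -> p-PAM, with u given as an integer in {0, ..., p-1}."""
    if p == 2:
        return u - 0.5                      # BPSK case: {0, 1} -> {-1/2, +1/2}
    return u if u <= (p - 1) // 2 else u - p


def is_homomorphism(p):
    """Brute-force check of the addition/multiplication properties stated after (1)."""
    offset = 0.5 if p == 2 else 0           # for p = 2 the property holds for M(u) + 1/2
    lift = lambda u: pam_map(u, p) + offset
    for u in range(p):
        for v in range(p):
            if (lift((u + v) % p) - lift(u) - lift(v)) % p != 0:
                return False
            if (lift((u * v) % p) - lift(u) * lift(v)) % p != 0:
                return False
    return True


print([pam_map(u, 5) for u in range(5)])    # constellation points for p = 5
print(all(is_homomorphism(p) for p in (2, 3, 5, 7)))
```

The check works with integers modulo $p$, so it relies on nothing beyond the definition in (1).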

The received signal is then given by

$$y[n] = x[n - \tau] + z[n], \quad n \in \{1, 2, \ldots, N + \tau\}, \tag{3}$$

where $\tau \in \{0, \ldots, D_{\max}\}$ represents an integer delay and $z[n] \sim \mathcal{N}(0, 1)$. Upon receiving $\mathbf{y}$, the receiver forms an estimate of the message $\hat{\mathbf{w}} \in \mathbb{F}_p^K$ via $\mathcal{G}_N : \mathbb{R}^N \to \mathbb{F}_p^K$. Throughout the paper, we assume that $\tau$ is unknown to the transmitter and known to the receiver, and that $D_{\max}$ is known to both ends. For the ISI channel, this assumption implies that although the transmitter may not know how many taps the channel has, it knows the maximal delay spread $D_{\max}$. For asynchronous communication, this assumption models the scenario where there is only a very loose synchronization mechanism that keeps the time delays within some bound $D_{\max}$. In what follows, we give the definitions of codes, achievable rates, and capacity.

Definition 4 (Codes). An $(N, K)$ code consists of a pair of encoding/decoding functions $(\mathcal{E}_N, \mathcal{G}_N)$ described above and an error probability given by

$$P_e^{(N)} \triangleq P(\hat{\mathbf{w}} \ne \mathbf{w}). \tag{4}$$

Definition 5 (Achievable rate and capacity). For a given set of parameters $P$ and $D_{\max}$, a rate $R(P, D_{\max})$ is achievable if for any $\varepsilon > 0$ there is an $(N, K)$ code over $\mathbb{F}_p$ such that

$$K \ge N R(P, D_{\max}) / \log(p) \quad \text{and} \quad P_e^{(N)} \le \varepsilon. \tag{5}$$

The capacity is defined as the supremum of all achievable rates, given by

$$C(P, D_{\max}) \triangleq \sup R(P, D_{\max}). \tag{6}$$

B. IDT-QC Codes

Let $\mathcal{C}$ be an $(N', K)$ $b$-QC linear/lattice code with the design rate $R_d = r_d \cdot \log(p)$, where $r_d \triangleq K/N'$ and $r_d b \in \mathbb{Z}$. Also, we require the generator matrix of this code to be systematic. In the proposed IDT-QC codes shown in Fig. 1, the transmitter maps the message to a codeword $\mathbf{c} \in \mathcal{C}$ via the encoder $\mathcal{E}$. This codeword is modulated by $\mathcal{M}$ to form the signal $\tilde{\mathbf{x}}$. The signal is then fed into a $(b, N'/b)$ write column-wise transmit row-wise interleaver [26] to get an interleaved signal $\bar{\mathbf{x}}$, where the input-output relationship for $n \in \{1, \ldots, N'\}$ is given by

$$\bar{x}[n] = \tilde{x}\big[1 + \lfloor (n-1)/L \rfloor + b \cdot ((n-1) \bmod L)\big], \tag{7}$$

where $L \triangleq N'/b$ is guaranteed to be an integer by the QC constraint. An illustration of interleaving can be found in Fig. 2, and one example with $b = 4$ is given in Fig. 3.

Fig. 2. The write column-wise transmit row-wise interleaver.

Fig. 3. An example of the IDT-QC codes with $b = 4$.

Note that one can write the interleaved codeword as the collection of $b$ sub-blocks as

$$\bar{\mathbf{x}} = [\bar{\mathbf{x}}[1]\ \bar{\mathbf{x}}[2]\ \ldots\ \bar{\mathbf{x}}[b]], \tag{8}$$

where each sub-block $\bar{\mathbf{x}}[s]$, for $s \in \{1, \ldots, b\}$, is of length $L$. For each of the first $r_d b$ sub-blocks, we freeze the last $D_{\max}$ positions to be zero. This is possible since the encoder is systematic and the first $r_d b$ sub-blocks correspond to the message part. We then insert a cyclic prefix (CP) of length $D_{\max}$ for each of the last $(1 - r_d) b$ sub-blocks by appending the last $D_{\max}$ symbols to the front. The overall transmitted signal is given by

$$\mathbf{x} = [\mathbf{x}[1], \mathbf{x}[2], \ldots, \mathbf{x}[b]], \tag{9}$$

where for $s \in \{1, \ldots, r_d b\}$, $\mathbf{x}[s] = \bar{\mathbf{x}}[s]$, whose last $D_{\max}$ symbols are 0, and for $s \in \{r_d b + 1, \ldots, b\}$,

$$\mathbf{x}[s] \triangleq [\underbrace{\bar{x}_{L - D_{\max} + 1}[s], \ldots, \bar{x}_L[s]}_{\text{CP with length } D_{\max}}, \underbrace{\bar{x}_1[s], \ldots, \bar{x}_L[s]}_{= \bar{\mathbf{x}}[s]}]. \tag{10}$$

The total length of this signal is $N = N' + (1 - r_d) b D_{\max}$. An illustration of the overall signal structure is given in Fig. 4(a). In fact, for the purpose of the IDT, one does not have to distinguish between the parts using CPs and the parts using frozen symbols; it suffices to append CPs to all the sub-blocks. The reason that we choose to freeze symbols instead of inserting CPs for the first $r_d b$ sub-blocks will become apparent in Sections IV and V.

Fig. 4. (a) Overall signal structure (synchronous case). (b) Asynchronous case with delay $\tau$.

At the receiver end, since the receiver knows the time delay $\tau$ and $\tau \le D_{\max}$, it first discards the CP of each sub-block to form $\bar{\mathbf{y}}$. As shown in Fig. 5, this signal $\bar{\mathbf{y}}$ is then fed to a $(b, N'/b)$ read row-wise output column-wise deinterleaver to get the output $\tilde{\mathbf{y}}$, where the input-output relationship for $n \in \{1, \ldots, N'\}$ is given by

$$\tilde{y}[n] = \bar{y}\big[1 + \lfloor (n-1)/b \rfloor + L \cdot ((n-1) \bmod b)\big], \tag{11}$$

which is then fed into the decoder of the QC code to form an estimate of the message.

Fig. 5. The read row-wise output column-wise deinterleaver.

The actual rate of this IDT-QC code is given by

$$R_a = \frac{K - r_d b D_{\max}}{N' + (1 - r_d) b D_{\max}} \log(p) = \left(1 - \frac{(2 - r_d) b D_{\max}}{N' + (1 - r_d) b D_{\max}}\right) R_d, \tag{12}$$

which tends to $R_d$ asymptotically but introduces a rate loss for any finite $N'$.

In the following, we provide some properties of the IDT-QC codes.

Lemma 6.

If the receiver opts not to compensate the delay $\tau$, i.e., the receiver observes a noisy version of the transmitted signal delayed by $\tau$, $[\underbrace{0, \ldots, 0}_{\tau}, \mathbf{x}[1], \mathbf{x}[2], \ldots, \mathbf{x}[b]]$, then the proposed IDT transforms the received signal into a noisy version of the codeword circularly shifted by $b \cdot \tau$. Moreover, if a $b$-QC code is employed in conjunction with the IDT, one can directly decode $\mathbf{c}^{(b\tau)}$.

Proof: Let us first assume that there is no channel noise. For $\tau \le D_{\max}$, due to the frozen bits in the first $r_d b$ sub-blocks and the insertion/removal of the CPs for the last $(1 - r_d) b$ sub-blocks, the linear shift by $\tau$ introduced by the channel is transformed into a circular shift of each sub-block by $\tau$. This is written, with a slight abuse of notation, as

$$\bar{\mathbf{x}}^{(\tau)} \triangleq [\bar{\mathbf{x}}^{(\tau)}[1], \bar{\mathbf{x}}^{(\tau)}[2], \ldots, \bar{\mathbf{x}}^{(\tau)}[b]], \tag{13}$$

where for $s \in \{1, \ldots, b\}$,

$$\bar{\mathbf{x}}^{(\tau)}[s] = [\bar{x}_{L - \tau + 1}[s], \ldots, \bar{x}_L[s], \bar{x}_1[s], \ldots, \bar{x}_{L - \tau}[s]] \tag{14}$$

is the circularly shifted version of $\bar{\mathbf{x}}[s]$ by $\tau$ positions. One can then verify that the output of the deinterleaver with this input is $\tilde{\mathbf{x}}^{(b\tau)}$, which corresponds to the codeword $\mathbf{c}^{(b\tau)}$ if a $b$-QC code is employed. Therefore, in the presence of channel noise, the received signal is a noisy version of the signal corresponding to $\mathbf{c}^{(b\tau)}$, and hence we can directly decode $\mathbf{c}^{(b\tau)}$ instead of $\mathbf{c}$.

Theorem 7. There exists a sequence of IDT-QC linear/lattice codes that achieve the capacity of the asynchronous point-to-point AWGN channel.

Proof: Let $L > D_{\max}$ and let $\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_L$ be identical $(b, k)$ linear/lattice codes that can approach the capacity [27] [28] when $b \to \infty$ for a fixed $k/b$, i.e., for any $\varepsilon > 0$, there is a large enough $b$ such that $\frac{k}{b} \log(p) > C(P, 0) - \varepsilon$ and $P_e^{(b)} < \varepsilon / L$. Moreover, in order to make these codes fit into the aforementioned form, the generator matrices of these codes should be systematic. We would like to construct a capacity-achieving IDT-QC code $\mathcal{C}$ for the asynchronous AWGN channel from $\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_L$. For every $\mathbf{c}_l \in \mathcal{C}_l$, $l \in \{1, 2, \ldots, L\}$, we construct the codeword

$$\mathbf{c} = [\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_L]. \tag{15}$$

Using the fact that $\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_L$ are identical linear/lattice codes, one can see that the collection of such codewords forms a $(bL, kL)$ $b$-QC code with design rate $R_d = r_d \log(p)$, where $r_d = K/N' = k/b$. The codeword is then modulated to $\tilde{\mathbf{x}}$, fed into the interleaver to form $\bar{\mathbf{x}}$, and bits-frozen and CP-appended to get $\mathbf{x}$. The receiver observes a noisy version of the transmitted codeword delayed by $\tau$, which can be easily compensated as $\tau$ is known by the receiver. It then removes the CP and feeds the signal to the deinterleaver to get a noisy version of the original signal $\tilde{\mathbf{x}}$. The error probability of this IDT-QC code can be bounded using the union bound as

$$P_e^{(Lb)} < L \cdot \frac{\varepsilon}{L} = \varepsilon, \tag{16}$$

and from (12), one has the actual rate given by

$$R_a(b, L) = \left(1 - \frac{(2b - k) D_{\max}}{bL + (b - k) D_{\max}}\right) R_d. \tag{17}$$

Now, letting $b$ go to infinity results in $R_d \to C$ and

$$R_a(L) = \lim_{b \to \infty} R_a(b, L) = \left(1 - \frac{(2 - r_d) D_{\max}}{L + (1 - r_d) D_{\max}}\right) C, \tag{18}$$

which in turn results in $\lim_{L \to \infty} R_a(L) = C$, which completes the proof.

Remark 8. Note that the ensemble of codes that we construct in Theorem 7 is not necessarily a good ensemble, in the sense that for a particular target error rate, it requires a very long code length to achieve that error rate. In practice, QC codes are usually not constructed this way, and QC codes with block lengths much shorter than those constructed here can perform very well. For this reason, we only use this ensemble for the proof and construct QC codes for simulation by existing constructions such as AR4JA.
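The statement of Lemma 6 can be checked numerically in the noiseless case. The Python sketch below is a simplified illustration rather than the full scheme: it appends a CP to every sub-block (which the text notes is sufficient for the purpose of the IDT) instead of freezing symbols in the message sub-blocks, and it uses 0-based indexing. The function names and the toy parameters $N' = 12$, $b = 4$, $D_{\max} = 2$ are assumptions made for the example.

```python
import numpy as np


def interleave(x_tilde, b):
    """Write column-wise into a b x L array, transmit row-wise, as in (7)."""
    L = len(x_tilde) // b
    return x_tilde.reshape(L, b).T.reshape(-1)


def deinterleave(y_bar, b):
    """Read the b sub-blocks row-wise, output column-wise, as in (11)."""
    L = len(y_bar) // b
    return y_bar.reshape(b, L).T.reshape(-1)


def idt_noiseless(x_tilde, b, d_max, tau):
    """Interleave, add a CP to every sub-block, delay by tau, strip CPs, deinterleave."""
    L = len(x_tilde) // b
    blocks = interleave(x_tilde, b).reshape(b, L)
    # Simplification: CP on all sub-blocks instead of frozen symbols in the message part.
    tx = np.concatenate([blocks[:, L - d_max:], blocks], axis=1).reshape(-1)
    # Channel: linear shift by tau symbols, no noise.
    rx = np.concatenate([np.zeros(tau, dtype=tx.dtype), tx])
    # Receiver does not compensate tau: it drops the nominal CP positions of each sub-block.
    y_bar = rx[: b * (L + d_max)].reshape(b, L + d_max)[:, d_max:].reshape(-1)
    return deinterleave(y_bar, b)


x_tilde = np.arange(1, 13)   # placeholder symbols standing in for a modulated codeword
b, d_max = 4, 2
for tau in range(d_max + 1):
    y_tilde = idt_noiseless(x_tilde, b, d_max, tau)
    # Lemma 6: the output is the input circularly shifted to the right by b * tau.
    print(tau, np.array_equal(y_tilde, np.roll(x_tilde, b * tau)))
```

With a $b$-QC code, every such shift by $b\tau$ is again a codeword, which is what allows decoding without compensating the delay.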

C. Advantages and Disadvantages

Here, we provide a discussion of some advantages and disadvantages of the proposed IDT-QC codes.

Advantages:

• The proposed IDT-QC codes substantially generalize the idea of [6]. Unlike [6], which only considers convolutional codes and only works for the two-way relay channel, the proposed framework is more general in that it can take any QC code and works for a larger class of networks, as will be shown later on. The use of QC codes provides significant improvement in practice, as there are families of QC codes (e.g. AR4JA codes [29]) that can work very close to the Shannon limit with reasonable decoding complexity (iterative decoding). On the other hand, for convolutional codes, one has to use a very large constraint length (usually with formidably high decoding complexity) in order to approach the Shannon limit.

• Compared to the use of cyclic codes as in [5] [8], our approach enjoys a better error-correcting capability provided by QC codes. This can be easily seen from the fact that QC codes contain the family of cyclic codes as a special case. Our simulations in the following sections show that even when we consider the rate loss introduced by the IDT, IDT-QC codes significantly outperform cyclic codes. Moreover, unlike cyclic codes, the proposed IDT-QC codes can be easily shown to achieve the Shannon limit.

• In practice, QC codes (QC LDPC codes in particular) have been very popular and have been adopted in many communication standards due to the existence of efficient encoding and decoding algorithms. The proposed IDT-QC codes naturally inherit those practical benefits from QC codes and hence are practically attractive.

• As will be unveiled in the following sections, the proposed IDT-QC codes can not only harness interference in the presence of asynchronism, but also exploit asynchronism in some cases.

Disadvantages:

• Due to the use of the interleaver and deinterleaver in our proposed IDT-QC codes, the transmitter has to wait until the entire codeword is generated before transmission. This results in an increased encoding latency.

• While the rate loss is negligible as $L \to \infty$ when we prove the capacity result in Theorem 7, it must be taken into account in the finite-length regime. However, in the following sections, we will show that the proposed IDT-QC codes outperform cyclic codes even when the rate loss is included.

Fig. 6. The integer-forcing equalization system.
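To get a feel for the finite-length rate loss discussed in Section III, the ratio $R_a / R_d$ from (12) can be evaluated exactly. The following Python sketch does so with rational arithmetic; the function name `rate_ratio` and the parameter values $N' = 2048$, $K = 1024$, $b = 8$, $D_{\max} = 2$ are hypothetical and chosen only for illustration, not taken from the paper's simulations.

```python
from fractions import Fraction


def rate_ratio(n_prime, k, b, d_max):
    """R_a / R_d from (12) for an (N', K) b-QC code with maximal delay D_max."""
    r_d = Fraction(k, n_prime)
    assert (r_d * b).denominator == 1, "the scheme requires r_d * b to be an integer"
    return 1 - (2 - r_d) * b * d_max / (n_prime + (1 - r_d) * b * d_max)


# Hypothetical parameters: rate-1/2 code of length 2048 with b = 8 sub-blocks.
ratio = rate_ratio(2048, 1024, 8, 2)
print(ratio, float(ratio))

# The loss shrinks as the code length grows for fixed b and D_max.
print([float(rate_ratio(n, n // 2, 8, 2)) for n in (256, 1024, 4096)])
```

For these values the loss is about one percent, and it shrinks as $N'$ grows with $b$ and $D_{\max}$ fixed, in line with the limit used in the proof of Theorem 7.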

IV. APPLICATION 1: INTEGER-FORCING EQUALIZATION FOR ISI CHANNELS

In this section, we consider point-to-point communication with ISI. It is known that the capacity of the ISI channel can be achieved by multi-carrier systems. However, the high peak-to-average power ratio makes this approach less attractive for applications requiring extremely low complexity, such as wireless sensor networks. Another way to approach capacity is to code over the time domain and use a decision feedback equalizer (DFE) at the receiver. For this to work, a very long interleaver/deinterleaver (between multiple codewords) is required to avoid error propagation. Recently, Ordentlich and Erez [5] proposed a new linear equalization technique called integer-forcing equalization. This new equalization technique does not require an interleaver/deinterleaver between codewords and avoids error propagation, as no DFE is implemented. However, one of the drawbacks pointed out by the authors themselves is that the channel code adopted is required to be cyclic and hence is not guaranteed to achieve capacity. In what follows, we replace the cyclic codes by the proposed IDT-QC codes, which are capacity-achieving, to achieve the upper bound on information rates presented in [5]. It should be noted that, unlike in the DFE-based scheme, the interleaver/deinterleaver for the proposed framework is within a QC codeword and hence is much shorter than that in DFE-based schemes.

A. Problem Statement

The transmitter encodes its message $\mathbf{w} \in \mathbb{F}_p^K$ to a codeword $\mathbf{c} \in \mathbb{F}_p^N$, which is then mapped to a signal $\mathbf{x} \in \mathcal{A}^N$ via $\mathcal{M}$, where $\mathcal{A}$ is the signal constellation (e.g., M-PAM) and $\mathcal{M}$ is the natural mapping. This signal is subject to a power constraint $P$ and is sent over an AWGN channel with ISI $\mathbf{h} = [h_0, \ldots, h_{d_M}]$, where $d_M$ depends on the maximal delay spread and the sampling frequency. The received signal is given by

$$\mathbf{y} = \mathbf{h} * \mathbf{x} + \mathbf{z}, \tag{19}$$

i.e., the received signal is a noisy version of a linear combination of the codeword linearly shifted by integers. We consider the recently proposed linear equalizer called the integer-forcing equalizer [5]. This technique first passes $\mathbf{y}$ to a linear equalizer chosen in such a way that the equalized channel impulse response is forced to be an integer vector $\mathbf{i} \triangleq [i_0, i_1, \ldots, i_{D_{\max}}]$. Also, one can easily transform linear convolution into circular convolution. The equalized signal is then given by

$$\mathbf{y} = \sum_{d=0}^{D_{\max}} i_d \mathbf{x}^{(d)} + \mathbf{z}', \tag{20}$$

where $\mathbf{z}'$ is the filtered noise.

The authors in [5] then proposed using cyclic codes over $\mathbb{F}_p$ at the transmitter so that $\varphi\left(\sum_{d=0}^{D_{\max}} i_d \mathbf{x}^{(d)}\right)$ with $\varphi \triangleq \mathcal{M}^{-1} \circ \bmod\ p$ is a codeword of the same cyclic code. Therefore, one can directly decode $\varphi\left(\sum_{d=0}^{D_{\max}} i_d \mathbf{x}^{(d)}\right)$ from $\mathbf{y} \bmod p$. This decoded signal is then used to recover $\mathbf{x}$ and hence $\mathbf{w}$. In what follows, we propose using the IDT-QC codes to replace the cyclic codes. Since the problem of designing and analyzing integer-forcing equalizers has been well addressed in [5], we assume that the ISI channel is already integral, i.e., $\mathbf{h} = \mathbf{i}$.

Fig. 7. The idea of using IDT-QC for point-to-point communication with ISI.
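The passage from the linear convolution in (19) to the sum of circular shifts in (20) can be illustrated with a cyclic prefix, which is one standard way to make that conversion; the text above does not say which method is used in [5], so the CP is an assumption of this sketch. The Python code below is noiseless, uses hypothetical integer taps, and checks that the received samples after CP removal equal $\sum_d i_d \mathbf{x}^{(d)}$.

```python
import numpy as np


def circular_via_cp(x, taps):
    """Prepend a CP of length D = len(taps) - 1, convolve linearly, keep the N body samples."""
    D, N = len(taps) - 1, len(x)
    x_cp = np.concatenate([x[N - D:], x])        # cyclic prefix followed by the block itself
    y = np.convolve(taps, x_cp)                  # linear convolution, as in (19) without noise
    return y[D:D + N]                            # discard the CP positions and the tail


def sum_of_circular_shifts(x, taps):
    """Right-hand side of (20) without noise: sum over d of i_d times x right-shifted by d."""
    return sum(i_d * np.roll(x, d) for d, i_d in enumerate(taps))


x = np.array([2, -1, 0, 1, -2, 2, 1, 0])         # toy 5-PAM symbols
taps = np.array([1, -2, 3])                      # hypothetical integer taps i_0, i_1, i_2
print(circular_via_cp(x, taps))
print(np.array_equal(circular_via_cp(x, taps), sum_of_circular_shifts(x, taps)))
```

`np.roll(x, d)` implements the right circular shift $\mathbf{x}^{(d)}$ of the Notation subsection.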

B. Using IDT-QC for Point-to-Point Communication with ISI

After the integer-forcing equalizer, the signal becomes a linear combination of linearly shifted versions of the transmitted signal with integer coefficients $i_d \in \mathbb{Z}$ for $d \in \{0, \ldots, D_{\max}\}$. As shown in Fig. 7, where an example with $D_{\max} = 2$ is given, the receiver then removes the received signal at the positions where the CP of the first tap's signal should be. The signal can then be expressed as a noisy version of a linear combination of the codeword circularly shifted by integers, given by

$$ \bar{y} = \sum_{d=0}^{D_{\max}} i_d \bar{x}^{(d)} + \bar{z}, \tag{21} $$

where $\bar{x}^{(d)}$ is as in (13) and the elements of $\bar{z}$ and those of $z$ have the same distribution. As shown in Lemma 6, any linear shift by an integer $d \le D_{\max}$ introduced by the channel is transformed by the IDT into a circular shift of the codeword by $b \cdot d$, i.e., the channel is transformed into

$$ \tilde{y} = \sum_{d=0}^{D_{\max}} i_d \tilde{x}^{(bd)} + \tilde{z}, \tag{22} $$

where the elements of $\tilde{z}$ and those of $z$ have the same distribution. Moreover, since IDT-QC codes adopt $b$-QC codes for channel coding, every $\tilde{x}^{(bd)}$ in (22) corresponds to a valid codeword in the underlying QC code. This in turn allows us to directly decode $\tilde{y} \bmod p$ to a valid codeword $\varphi\left(\sum_{d=0}^{D_{\max}} i_d \tilde{x}^{(bd)}\right)$ (or $\varphi(I(D^b) X(D))$ in the $D$-domain) in the same QC code. After this codeword is decoded, one can use the knowledge of the frozen bits to strip out all the message bits. This is perhaps more easily seen from the second representation and is shown in Fig. 7. It can be seen that within each sub-block corresponding to the message part, one can initiate the deconvolution since the last $D_{\max}$ bits are frozen.

It has been shown in Section III that there exists a sequence of the proposed IDT-QC codes that can achieve capacity. Thus, theoretically, using the proposed framework allows one to achieve the upper bound on information rates presented in [5], which may not be achievable for the cyclic coding scheme proposed therein.
Therefore, the proposed framework bridges the gap-to-capacity for such integer-forcing equalization schemes. In what follows, we provide some simulation results to demonstrate that the proposed IDT-QC codes outperform cyclic codes even though, in the finite-length regime, the proposed IDT-QC codes suffer from a rate loss. It is worth mentioning that, in addition to being of independent interest, integer-forcing equalization for the ISI channel will play an important role in using IDT-QC for asynchronous compute-and-forward.

C. Simulation Results

Fig. 8. BER versus SNR per information bit, $E_b/N_0$ (dB), for the IDT-QC LDPC code with $R_a = 0.742$ and the cyclic LDPC code with $R_a = 0.66$ over the dicode channel $(1 + D)$. The dashed and solid vertical lines mark the required SNRs at the information rates of the two codes ($R = 0.66$ and $R = 0.742$).

We now provide some simulation results to compare the proposed IDT-QC framework and the cyclic coded scheme proposed in [5]. We consider the dicode channel whose impulse response is $I(D) = 1 + D$ (i.e., $i_0 = i_1 = 1$); therefore, $D_{\max} = 1$. We construct a binary IDT-QC LDPC code from the AR4JA ensemble [18] with $N' = 4096$, $b = 32$, $K = 3072$, and design rate $R_d = 0.75$. The actual rate of this code is $R_a = 0.742$. For comparison with the scheme in [5], we also construct a cyclic LDPC code from the ensemble proposed in [30] with $N' = N = 4095$, $K = 2703$, and design rate $R_d = 0.66$. Note that for the cyclic coded scheme in [5], one has to freeze $D_{\max}$ bits for initializing the deconvolution. This results in the actual rate $R_a \approx R_d = 0.66$. For both codes, the decoding algorithm is a message-passing algorithm [31] with a capped number of iterations. Since $I(D) = 1 + D$, the receiver attempts to decode $c \oplus c^{(b)}$. Simulation results presented in Fig. 8 show that, in spite of having a higher rate, the proposed IDT-QC LDPC code provides a gain of roughly 1.1 dB at low BER. This is mainly because the proposed IDT enables the use of QC codes, for which very powerful ensembles such as AR4JA can be easily constructed.
Moreover, the conventional scheme in [5] relies solely on the family of cyclic codes, which is much smaller than that of QC codes.

We also provide the information rate corresponding to the independent uniformly distributed input distribution for the dicode channel, estimated by the method in [32] (which is equivalent to the forward recursion of the BCJR algorithm). One observes a gap of roughly 4.4 dB between the proposed scheme and the corresponding information rate. This gap comes from the following sources. First and foremost, a receive filter has not been used to harness all the energy in all the taps of the ISI channel. In this example, this contributes a 3 dB loss. A second source is the power loss inherited from the integer-forcing equalization approach, which transforms the channel into a mod $p$ channel (here mod 2). The final source of this gap is simply that the block length considered here (4096) is rather small. A similar but larger gap can be observed for the cyclic coded integer-forcing equalization scheme.

It should be noted that for a single-user ISI channel, conventional equalization techniques such as a decision feedback equalizer (DFE) will provide better results than integer-forcing equalization. However, when a compute-and-forward problem is considered with multiple users and ISI, conventional equalization techniques are not sufficient to efficiently compute functions of the transmitted signals, since the interference from multiple users and the ISI cannot easily be removed simultaneously. In these cases, integer-forcing equalization can substantially outperform conventional equalization techniques.

V. APPLICATION 2: ASYNCHRONOUS COMPUTE-AND-FORWARD

In this section, we study the compute-and-forward relay network introduced by Nazer and Gastpar [4]. In particular, we consider the asynchronous version of this network, where signals sent from different source nodes may arrive at a destination node at different times. In the synchronous case, the compute-and-forward strategy suggested in [4] implements an identical nested lattice code [28] at each user and directly decodes the received signal to a modulo version of a linear combination of the codewords with integer coefficients at the destination. This scheme is shown to provide a substantially higher computation rate in the medium signal-to-noise ratio (SNR) regime than existing schemes. For practical purposes, in [21], this nested lattice code is replaced by a linear code over $\mathbb{F}_p$ together with a signal mapping possessing the property described in Section II in order to exploit the structural gain. The destination then decodes the received signal to a linear combination of the codewords over $\mathbb{F}_p$.

Fig. 9. The compute-and-forward relay network.

There have been a few attempts at applying physical-layer network coding to this asynchronous setting. A convolutional coded scheme has been proposed in [6] to deal with synchronization errors; however, only integer-valued delays (i.e., frame-level asynchronism) are allowed. In [7], an over-sampling method was proposed together with a graph-based decoding algorithm designed specifically for this over-sampling model. This over-sampling method can handle real-valued delays (i.e., symbol-level asynchronism), but only within one symbol time, and thus imposes a stringent timing synchronization requirement. In [9], frame-level and symbol-level asynchronous compute-and-forward are considered, where the destinations are only able to compute synchronous functions.
A very recent work in [8] has successfully applied cyclic codes to this problem so that asynchronous functions are computable, and showed through simulation that cyclic codes are able to cope with real-valued delays within one packet time.

In this section, we replace the cyclic codes by the proposed IDT-QC codes and show that this replacement allows us to prove capacity results for both the frame-level and symbol-level cases. Moreover, the simulation results given in Section VI show that this replacement substantially improves the performance. It should be noted that since the proposed scheme relies on the quasi-cyclic property instead of the cyclic property to deal with synchronization errors, the delay constraint is more stringent than that of the scheme in [8]. Nonetheless, it only requires the delays to be controlled within a certain range $D_{\max}$ (say, a few symbol durations), which is practically reasonable.

A. Problem Statement

As shown in Fig. 9, a compute-and-forward network has a total of $S$ source nodes and $M \ge S$ destination nodes. Each source node $s \in \{1, \ldots, S\}$ encodes its message $w_s \in \mathbb{F}_p^K$ to a codeword $c_s \in \mathbb{F}_p^N$. This codeword is then modulated to the transmitted signal $x_s \in \mathcal{A}^N$ via a mapping $\mathcal{M}$ as described in Section II. The transmitted signal is subject to a power constraint given by

$$ \frac{1}{N} \|x_s\|^2 = \frac{1}{N} \sum_{n=1}^{N} |x_s[n]|^2 \le P. \tag{23} $$

Let $\tau_{ms}$ be the delay experienced by the signal from source $s$ to destination $m$. We separately consider two cases, namely frame-level asynchronous compute-and-forward, where $\tau_{ms} \in \{0, \ldots, D_{\max}\}$, and symbol-level asynchronous compute-and-forward, where $\tau_{ms} \in [0, T)$ with $T$ being the symbol duration. For the frame-level asynchronous case, the received signal is given by

$$ y_m[n] = \sum_{s=1}^{S} h_{ms} x_s[n - \tau_{ms}] + z_m[n], \tag{24} $$

where $h_{ms} \in \mathbb{R}$ (or $\mathbb{C}$, depending on whether the signal constellation is real or complex) is the channel coefficient between source node $s$ and destination $m$, and $z_m[n] \sim \mathcal{CN}(0, 1)$. For symbol-level asynchronous compute-and-forward, one has to work with the continuous-time model given by

$$ y_m(t) = \sum_{s=1}^{S} \sum_{n=1}^{N} h_{ms} x_s[n]\, p(t - nT - \tau_{ms}) + z(t), \tag{25} $$

where $p(t)$ is the pulse shaping function and $z(t)$ is a Gaussian process with zero mean and unit variance.

The destination node $m$ is only interested in computing and forwarding a function of the messages. In particular, the compute-and-forward scheme in [4] confines itself to synchronous functions of the messages, which mimics the behavior of linear network coding, i.e., the destination node $m$ chooses $\{b_{ms}\}$ and computes $u_m = \bigoplus_{s=1}^{S} b_{ms} w_s$ such that the computation rate at node $m$ is maximized. The computed functions together with $\{b_{ms}\}$ are then forwarded to a central destination which desires all the messages.
It is clear that as long as the coefficients $\{b_{ms}\}$ form a full-rank matrix, the central destination is able to invert the matrix and obtain all the messages. In the sequel, we show that the use of IDT-QC codes allows one to compute asynchronous functions, which may lead to an increased computation rate. For ease of exposition, we discuss the frame-level and symbol-level models separately and restrict ourselves to $S = M = 2$, but the proposed scheme works for general scenarios.

B. Using IDT-QC for Frame-Level Asynchronous Compute-and-Forward

We illustrate the idea of using the proposed IDT-QC codes for frame-level asynchronous compute-and-forward. Each source node adopts the same $b$-QC code over $\mathbb{F}_p$ to encode its message to the codeword $c_s$, which is then modulated to the signal $\tilde{x}_s = \mathcal{M}(c_s)$. This signal is then interleaved to form $\bar{x}_s$, after which frozen bits are added and CPs are appended to form the transmitted signal $x_s$. The length of the CPs is again set to $D_{\max}$. One difference here is that for a compute-and-forward network with $S$ source nodes, one has to freeze $S D_{\max}$ positions instead of $D_{\max}$ positions in each sub-block corresponding to the message part. As shown in Fig. 10, the receiver removes the signal at the positions where the first source node's CP should be, and the received signal becomes

$$ \bar{y}_m = a_{m1} \bar{x}_1^{(\tau_{m1})} + a_{m2} \bar{x}_2^{(\tau_{m2})} + \bar{z}_{eq,m}, \tag{26} $$

where $\bar{x}_s^{(\tau_{ms})}$ is as in (13) and $\bar{z}_{eq,m}$ is the effective noise, which consists of the noise and the self-interference [4]. The destination node then feeds this signal into the deinterleaver. As discussed in Lemma 6, the proposed IDT-QC codes transform any integer delay $\tau_{ms}$ introduced by the channel into $b \cdot \tau_{ms}$ circular shifts of the codeword. Thus, the deinterleaver output is given by

$$ \tilde{y}_m = a_{m1} \tilde{x}_1^{(b\tau_{m1})} + a_{m2} \tilde{x}_2^{(b\tau_{m2})} + \tilde{z}_{eq,m}, \tag{27} $$

where the elements of $\tilde{z}_{eq,m}$ and those of $\bar{z}_{eq,m}$ have the same distribution. The receiver $m$ then attempts to compute the lattice point $a_{m1} \tilde{x}_1^{(b\tau_{m1})} + a_{m2} \tilde{x}_2^{(b\tau_{m2})}$ and uses the ring homomorphism $\varphi \triangleq \mathcal{M}^{-1} \circ \bmod\, p$ to map this lattice point to

$$ \varphi\left(a_{m1} \tilde{x}_1^{(b\tau_{m1})} + a_{m2} \tilde{x}_2^{(b\tau_{m2})}\right) = b_{m1} \odot c_1^{(b\tau_{m1})} \oplus b_{m2} \odot c_2^{(b\tau_{m2})}, \tag{28} $$

where $b_{ms} \triangleq \varphi(a_{ms})$. It should be noted that since the underlying code is a $b$-QC code, $c_s^{(b\tau_{ms})}$ is a codeword and so is $f_m \triangleq b_{m1} \odot c_1^{(b\tau_{m1})} \oplus b_{m2} \odot c_2^{(b\tau_{m2})}$.

Fig. 10. An example of using IDT-QC for asynchronous compute-and-forward where $\tau_{m1} = 1$, $\tau_{m2} = 2$, and $D_{\max} = 2$.

The computed functions and the corresponding coefficients are then forwarded to the central destination, where they are further processed to recover all the messages. We now show that the compute-and-forward problem with full-rank coefficients can be equivalently represented as the ISI channel problems of Section IV and can be solved by deconvolution if sufficient initial conditions are provided. For the sake of simplicity, we look at the interleaved version of $f_m$ given by

$$ \bar{f}_m \triangleq b_{m1} \odot \bar{c}_1^{(\tau_{m1})} \oplus b_{m2} \odot \bar{c}_2^{(\tau_{m2})}. \tag{29} $$

In the $D$-domain, one has

$$ \bar{F} = \begin{pmatrix} \bar{F}_1(D) \\ \bar{F}_2(D) \end{pmatrix} = \begin{pmatrix} b_{11} D^{\tau_{11}} & b_{12} D^{\tau_{12}} \\ b_{21} D^{\tau_{21}} & b_{22} D^{\tau_{22}} \end{pmatrix} \odot \begin{pmatrix} \bar{C}_1(D) \\ \bar{C}_2(D) \end{pmatrix} \triangleq \bar{B} \odot \bar{C}. \tag{30} $$

Note that, mathematically, one can now left-multiply by the inverse of the matrix $\bar{B}$ to get $\bar{C}$. In order to endow this inverse with an operational meaning, we note that for every full-rank matrix $\bar{B}$, one has

$$ \bar{B}^{-1} = \frac{\mathrm{adj}(\bar{B})}{\det(\bar{B})}, \tag{31} $$

where $\det(\cdot)$ is the determinant and $\mathrm{adj}(\cdot)$ is the adjugate. One can then left-multiply $\bar{F}$ by $\mathrm{adj}(\bar{B})$ to form

$$ \mathrm{adj}(\bar{B}) \odot \bar{F} = \det(\bar{B}) \odot \bar{C}. \tag{32} $$

One observes that the problem has been converted into two separate ISI channel problems whose impulse responses are integer vectors. Moreover, since the power of $D$ in each element of $\bar{B}$ lies in $\{0, \ldots, D_{\max}\}$, the powers of $D$ in $\det(\bar{B})$ lie in $\{0, \ldots, 2D_{\max}\}$, or in general $\{0, \ldots, S D_{\max}\}$. Therefore, this problem can be solved by deconvolution provided that the transmitter freezes $S D_{\max}$ positions in each sub-block belonging to the message part. The actual rate then becomes

$$ R_a = \frac{K - r_d\, b\, S D_{\max}}{N' + (1 - r_d)\, b\, D_{\max}} \log(p) = \left(1 - \frac{(S + 1 - r_d) D_{\max}}{L + (1 - r_d) D_{\max}}\right) R_d, \tag{33} $$

which does not affect the asymptotic results. One example is given in the following.

Example 9.

Consider a 2-by-2 example over $\mathbb{F}_2$. Suppose that relay 1 receives

$$ y_1[n] = x_1[n-1] + x_2[n] + z_1[n], \tag{34} $$

and relay 2 receives

$$ y_2[n] = x_1[n] + x_2[n-1] + z_2[n]. \tag{35} $$

Then

$$ \bar{B} = \begin{pmatrix} D & 1 \\ 1 & D \end{pmatrix}. \tag{36} $$

We have $\det(\bar{B}) = 1 + D^2$ and $\mathrm{adj}(\bar{B}) = \bar{B}$. Thus, by left-multiplying by $\mathrm{adj}(\bar{B})$, one has

$$ \begin{pmatrix} D \bar{F}_1(D) + \bar{F}_2(D) \\ \bar{F}_1(D) + D \bar{F}_2(D) \end{pmatrix} = \begin{pmatrix} (1 + D^2) \bar{C}_1(D) \\ (1 + D^2) \bar{C}_2(D) \end{pmatrix}, \tag{37} $$

which are two separate ISI channel problems. What (37) implies is that the receiver separately decodes $\bar{c}_1 \oplus \bar{c}_1^{(2b)}$ and $\bar{c}_2 \oplus \bar{c}_2^{(2b)}$, where $b$ is the shifting constraint. Then, using the frozen bits, deconvolutions are performed to obtain $\bar{c}_1$ and $\bar{c}_2$.

We now present the main theorem of this section.

Theorem 10.

Consider the frame-asynchronous case where $\tau_{ms} \in \{0, \ldots, D_{\max}\}$. At relay $m$, given $h_m$ and $a_m$, a computation rate of

$$ R(h_m, a_m) = \frac{1}{2} \log^{+}\left( \left( \|a_m\|^2 - \frac{P\, |h_m^H a_m|^2}{1 + P \|h_m\|^2} \right)^{-1} \right), \tag{38} $$

where $a_{ms} \in \mathbb{Z}$, is achievable per real dimension.

Proof:

See Appendix A.
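To make (38) concrete, the following sketch evaluates the computation rate for a real-valued channel vector and searches a small box of integer coefficient vectors for the best one. The base-2 logarithm, the restriction to real channels, the search radius `a_max`, and the function names are assumptions made for illustration; the exhaustive search is a stand-in and not the coefficient-selection method of [4].

```python
import math
from itertools import product

def computation_rate(h, a, P):
    """Evaluate (38) for real h and integer a, in bits per real dimension."""
    if not any(a):
        raise ValueError("the coefficient vector a must be nonzero")
    norm_a2 = sum(x * x for x in a)
    norm_h2 = sum(x * x for x in h)
    inner2 = sum(x * y for x, y in zip(h, a)) ** 2
    # Positive for every nonzero a by the Cauchy-Schwarz inequality.
    effective_noise = norm_a2 - P * inner2 / (1.0 + P * norm_h2)
    return max(0.0, 0.5 * math.log2(1.0 / effective_noise))

def best_coefficients(h, P, a_max=2):
    """Exhaustively search integer vectors with entries in [-a_max, a_max]."""
    best_a, best_rate = None, -1.0
    for a in product(range(-a_max, a_max + 1), repeat=len(h)):
        if any(a):
            rate = computation_rate(h, a, P)
            if rate > best_rate:
                best_a, best_rate = a, rate
    return best_a, best_rate

if __name__ == "__main__":
    h, P = [1.0, 1.0], 7.5
    print(computation_rate(h, [1, 1], P))   # coefficients aligned with the channel
    print(computation_rate(h, [1, -1], P))  # coefficients orthogonal to the channel
    print(best_coefficients(h, P))
```

For $h = [1, 1]$ and $P = 7.5$, the term inside the inverse is $2 - 30/16 = 1/8$ for $a = [1, 1]$, so the rate is $\frac{1}{2}\log_2 8 = 1.5$ bits, whereas $a = [1, -1]$ gives a rate of zero.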

C. Achieving higher rates than in the synchronous case

One important observation here is that the proposed IDT-QC scheme allows one to exploit another dimension, namely the delay dimension. This is due to the fact that the QC nature of the proposed scheme enables the computation of asynchronous functions in addition to synchronous ones. Sometimes, this allows one to achieve rates surpassing those achieved by tightly synchronous compute-and-forward with the same channel coefficients. For example, the matrix $\bar{B}_{\mathrm{sync}} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ is not invertible; however, the matrix in (36) is invertible. It must be noted that the delays are completely determined by the channel, so that one does not have control over those parameters. But instead of being limited by those delays, the proposed scheme is capable of exploiting them. One example of how the delay may improve the system performance is given in the following.

Example 11.

In Fig. 11, we plot the achievable computation rates of asynchronous compute-and-forward for the cases $S = M = 2$ and $S = M = 3$, respectively. The channel coefficients $h_{ms}$ are drawn from an i.i.d. Rayleigh distribution. One can see from this figure that, for both cases, increasing $D_{\max}$ substantially increases the achievable computation rates. This effect is most pronounced when $D_{\max}$ is increased from 0 to 1. This example demonstrates that when the channel introduces delays, using the proposed scheme, which allows the decoding of asynchronous functions, results in higher achievable computation rates.

Remark 12.

Frame-asynchronous compute-and-forward has been considered in [9]. The scheme therein does not possess the QC or cyclic properties, so the relays are forced to compute synchronous functions only. As a consequence, they have to use multiple antennas at the relays to rotate the received signal in order to recover synchronous functions in the presence of frame-synchronization errors. This introduces a huge loss not just in rates but also in degrees of freedom, because multiple antennas are used just for computing one function at each relay. On the other hand, thanks to the QC nature of the proposed scheme, the computation rates given in Theorem 10 have exactly the same form as those in [4], i.e., as if there were no frame asynchronism at all. This is a direct consequence of enabling the computation of asynchronous functions, which are undecodable in [9]. However, this gain does not come for free; it comes with an increased burden in the next phase since, in addition to $a_{ms}$ (or equivalently $b_{ms}$), the relays also have to forward the delay profile to the central destination. But this is usually not an issue, as the bottleneck is usually the first phase.

Fig. 11. Achievable rates (rate per source) of asynchronous compute-and-forward for the 2-by-2 and 3-by-3 cases with $D_{\max} = 0, 1, 2$ (average over 10000 realizations).

Fig. 12. An example of SNR loss resulting from the symbol-level asynchronous model.

D. Using IDT-QC for Symbol-Level Asynchronous Compute-and-Forward

We now focus on symbol-level asynchronous compute-and-forward, i.e., $\tau_{ms} \in [0, T)$ and the continuous-time model in (25) is considered. Similar to [7], [16], [33], we further assume that the pulse shaping function adopted is the ideal (rectangular) pulse. Let $\pi_m$ be the permutation operation at relay $m$ defined by

$$ \pi_m(1, \ldots, S) = (j_1, \ldots, j_S), \tag{39} $$

such that $\tau_{mj_1} \le \ldots \le \tau_{mj_S}$. In order to extract all the energy, in general, one can apply $S$ different matched filters $p_{mi}(t)$ for $i \in \{1, \ldots, S\}$ given by

$$ p_{mi}(t) = \begin{cases} 0, & \tau_{mj_{i-1}} + (n-1)T \le t < \tau_{mj_i} + (n-1)T \\ \sqrt{P}, & \tau_{mj_i} + (n-1)T \le t < \tau_{mj_{i+1}} + (n-1)T \\ 0, & \tau_{mj_{i+1}} + (n-1)T \le t < nT, \end{cases} \tag{40} $$

where $\tau_{mj_0} = 0$ and $\tau_{mj_{S+1}} = T$ for each $m$. Note that the sampled outputs of different matched filters correspond to different functions. For example, as shown in Fig. 12, for the $S = 3$ case, three different matched filters correspond to three different functions, namely $c_{j_1} \oplus c_{j_2}^{(b)} \oplus c_{j_3}^{(b)}$, $c_{j_1} \oplus c_{j_2} \oplus c_{j_3}^{(b)}$, and $c_{j_1} \oplus c_{j_2} \oplus c_{j_3}$. The corresponding SNRs are then given by

$$ P_{mi} = \frac{P}{T} \left( \tau_{mj_i} - \tau_{mj_{i-1}} \right). \tag{41} $$

We then pick the one with the highest SNR for compute-and-forward and discard the others. This results in the following achievable computation rates.

Theorem 13.

Consider the symbol-asynchronous case where $\tau_{ms} \in [0, T)$. At relay $m$, given $h_m$ and $a_m$, a computation rate of

$$ R(h_m, a_m) = \frac{1}{2} \log^{+}\left( \left( \|a_m\|^2 - \frac{P_m\, |h_m^H a_m|^2}{1 + P_m \|h_m\|^2} \right)^{-1} \right), \tag{42} $$

where $a_{ms} \in \mathbb{Z}$ and $P_m = \max_{i \in \{1, \ldots, S\}} P_{mi}$, is achievable per real dimension.

Proof:

Similar to the proof of Theorem 10, but with $P$ replaced by $P_m$.

Remark 14.

In [9], the same problem has been studied, and achievable computation rates similar to (42) but with $P_m$ replaced by $P_{mj_S}$ have been obtained, as only synchronous functions are computable. Our proposed scheme provides increased computation rates by allowing the computation of asynchronous functions.

Remark 15.

The above scheme only works with the signals corresponding to the function with the highest SNR and completely ignores those corresponding to other functions. However, as will be discussed in the next section, in terms of error probability, one can take advantage of this information by jointly considering detection and decoding.

VI. PRACTICAL DETECTION AND DECODING FOR ASYNCHRONOUS COMPUTE-AND-FORWARD

In this section, we introduce a joint detection and decoding scheme for the proposed IDT-QC codes to alleviate the SNR loss in the presence of symbol-level asynchronism. This decoder is then used to generate simulation results, which demonstrate that the proposed framework substantially outperforms the cyclic coding scheme of [8]. We again begin with the continuous-time model in (25). We further restrict our attention to a specific relay and drop the subscript $m$ for the sake of simplicity. This allows us to assume $\tau_1 = 0$ and $\tau_2 = \tau$ without loss of generality. Let $\tau \in [0, D_{\max}]$, where $\tau = \tau_f + \tau_s$ with $\tau_f \in \{0, 1, \ldots, D_{\max} - 1\}$ being the frame-level asynchronism and $\tau_s \in [0, 1)$ being the symbol-level asynchronism.

We use a set of matched filters similar to (40) to over-sample the received signal [7]. This results in the following sampled outputs:

$$ r[2n-1] = h_1 x_1[n] + h_2 x_2[n-1] + z[2n-1], \tag{43} $$

with $x_2[0] = 0$, and

$$ r[2n] = h_1 x_1[n] + h_2 x_2[n] + z[2n], \tag{44} $$

and $r[2(N - \tau_f) + 1] = h_2 x_2[N - \tau_f] + z[2(N - \tau_f) + 1]$, where $z[2n-1] \sim \mathcal{N}(0, 1/\tau_s)$, $z[2(N - \tau_f) + 1] \sim \mathcal{N}(0, 1/\tau_s)$, and $z[2n] \sim \mathcal{N}(0, 1/(1 - \tau_s))$. In what follows, we describe how to perform detection and decoding based on this over-sampling model.

A. Joint MAP detection and JCF decoding

We now propose a joint detection and decoding scheme, which can be regarded as the decoding scheme in [7] tailored specifically to the IDT-QC codes. The decoding algorithm is based on the Tanner graph given in Fig. 13. The top part of the Tanner graph, with its zigzag structure, is associated with the MAP detection, which accommodates the correlation between two consecutive over-sampled symbols. The bottom part of the Tanner graph is precisely that of the underlying QC code but over $\mathbb{F}_p$, i.e., the ACNC decoder in [33], which we refer to as the joint compute-and-forward (JCF) decoder. Unlike [7], there is an interleaver/deinterleaver pair between the MAP detection and the JCF decoding parts. Moreover, thanks to the ability described in Lemma 6, depending on the corresponding SNRs, the receiver can opt to decode either $c_1 \oplus c_2^{(b\tau_f)}$ or $c_1 \oplus c_2^{(b(\tau_f + 1))}$. These two options are represented by solid and dashed edges, respectively.

Fig. 13. Graph representation of the iterative receiver (MAP detector, interleaver/deinterleaver, and JCF decoder, with code check nodes, delay-induced check nodes, and variable nodes).

B. Simulation Results

We now provide some simulation results. For the sake of simplicity, we only consider coding over $\mathbb{F}_2$ with BPSK. The channel parameters are set to $h_1 = h_2 = 1$ and $D_{\max} = 5$. We again construct an IDT-QC LDPC code from the AR4JA ensemble with $N = 4096$, $b = 32$, $K = 3072$, and design rate $R_d = 0.75$. The actual rate $R_a$ then follows from (33). We recall that for the same $D_{\max}$ and $b$, the longer the code, the smaller the rate loss. Also, we construct the same cyclic LDPC code as in Section IV with design rate $R_d = 0.66$. Note that when using cyclic codes for asynchronous compute-and-forward, one has to freeze $S D_{\max}$ bits. Thus, the actual rate of the cyclic LDPC code is lower than its design rate. The decoding algorithm for both codes is the joint MAP detection and JCF decoding described above, with a fixed number of outer and inner iterations.

In Fig. 14, the BER versus SNR curve is plotted. One can observe that, despite having a higher rate, the proposed IDT-QC LDPC code outperforms the cyclic LDPC code by roughly 1.1 dB when $\tau = 0$, i.e., under perfect synchronization. This is the coding gain offered by the AR4JA code over the cyclic code adopted. When $\tau = 0.5$, the proposed IDT-QC LDPC code provides roughly 1.5 dB gain over the cyclic LDPC code considered. This enlarged gap may be explained by the observation that the joint graph of the zigzag detection and the parity checks of a cyclic code is more likely to create short cycles than that of QC LDPC codes.

In Fig. 15, the BER versus delay curve is plotted for SNR = 3.5 dB. One observes that in the region $\tau \in [0, T]$, the proposed IDT-QC LDPC code always performs better than the cyclic LDPC code in terms of BER. Moreover, we observe a symmetric behavior of the BER about $\tau = 0.5$. This is a consequence of allowing the decoding of asynchronous functions, so that the performance depends only on how close the delay $\tau$ is to an integer. According to this observation, one expects a periodic behavior for other intervals $[kT, (k+1)T]$ within $[0, D_{\max}]$.
Another interesting observation is that there is a local minimum at around $\tau = 0.5$. This may be explained by the observation that at around $\tau = 0.5$, the two codewords are well separated and hence the zigzag detection and JCF decoding perform like a decode-and-forward decoder.

VII. CONCLUDING REMARKS AND FUTURE WORK

The problem of communication in the presence of time delays that cannot be easily compensated, such as those encountered in asynchronous physical-layer network coding, has been studied. There are three main results in the paper. Theorem 10 establishes that, fundamentally, there is no loss in the achievable information rates in the presence of timing delays that are integer multiples of the symbol duration in comparison to the synchronous case. This is the first result to show that integer-valued asynchronism does not cause any reduction in the achievable rates. This result is obtained through the use of a novel framework called the interleave/deinterleave transform (IDT) in conjunction with quasi-cyclic codes. Secondly, in Section V-C, we have shown that delays from the channel can be exploited to decode an increased set of functions at the relays, thereby obtaining higher rates than in the synchronous case in some scenarios. Finally, an achievable rate in the presence of non-integer-valued delays is given in Theorem 13. The achievable rates are higher than those reported earlier in the literature in [9].

Fig. 14. BER versus SNR per information bit, $E_b/N_0$ (dB), for the proposed IDT-QC LDPC code and the cyclic LDPC code for compute-and-forward, with $\tau = 0$ and $\tau = 0.5$.

Fig. 15. BER versus $\tau$ (delay) at 3.5 dB for the proposed IDT-QC LDPC code and the cyclic LDPC code for compute-and-forward.

As applications, the IDT-QC codes have been implemented as coding schemes for two channel models, namely the integer-forcing equalized ISI channel and the asynchronous compute-and-forward relay network. For the former, the proposed IDT-QC framework was able to bridge the gap-to-capacity suffered by the recently proposed cyclic coding scheme. For the latter, it has been shown that the proposed IDT-QC scheme not only provides significantly higher rates than the state of the art but also allows the exploitation of the delay dimension, which may lead to rates beyond those achieved by tightly synchronous compute-and-forward. Moreover, practical implementation of the proposed scheme has also been considered in the form of a joint detection and decoding scheme. Simulation results for the case of two transmitters and one receiver have further confirmed the theoretical analysis and observations.

Interesting future work includes the following. Theoretically, it is of interest to see how one can further benefit from the delay dimension introduced by the channel. For example, for symbol-level asynchronous compute-and-forward, it could be the case that some relays are unable to compute any function with a rate above the threshold while other relays are able to compute more than one function (possibly with the same coefficients but different delays) with rates above the threshold. If the second-phase bandwidth is not an issue, the central destination can pick the $S$ functions with the highest computation rates, regardless of where the functions are computed. This may lead to higher computation rates than those provided by tightly synchronous compute-and-forward. Practically, it is interesting to design spatially-coupled QC LDPC codes which can further bridge the gap to those theoretical results.
Moreover, it is of interest to investigate the impact of using a practical pulse shaping function.

APPENDIX A
PROOF OF THEOREM 10

Let $\Lambda_f$ and $\Lambda_c$ be two $b$-dimensional lattices with the relationship $\Lambda_c \subseteq \Lambda_f$. A nested lattice code is a code using the minimum-energy coset representatives of $\Lambda_f / \Lambda_c$ as codewords, i.e., $\mathcal{C} \triangleq \Lambda_f \cap \mathcal{V}_{\Lambda_c}$, where $\mathcal{V}_{\Lambda}$ is the fundamental Voronoi region of a lattice $\Lambda$. The rate of a nested lattice code is given by

$$ R = \frac{1}{b} \log\left( \frac{\mathrm{Vol}(\mathcal{V}_{\Lambda_c})}{\mathrm{Vol}(\mathcal{V}_{\Lambda_f})} \right). \tag{45} $$

Moreover, the lattices in [28] are constructed by Construction A with $(b, k)$ linear codes over $\mathbb{F}_p$; hence, one can further rewrite the rate as $R = (k/b) \log(p)$. We denote such nested lattice codes as $(b, k)$ nested lattice codes.

Now, let $\mathcal{C}^{(1)}, \mathcal{C}^{(2)}, \ldots, \mathcal{C}^{(L)}$ be identical $(b, k)$ nested lattice codes $\Lambda_f \cap \mathcal{V}_{\Lambda_c}$ that can achieve $R(h_m, a_m)$ given in (38) in the absence of synchronization errors, i.e., for any $\varepsilon > 0$, there exist sufficiently large $b$ and $p$ such that $(k/b) \log(p) > R(h_m, a_m) - \varepsilon$ and $P_e^{(b)} < \varepsilon / L$. Let us again concatenate $L$ codes to form a super-code consisting of all $c = [c_1, c_2, \ldots, c_L]$. We then feed the codewords of the super-code into the IDT to freeze bits and add CPs. Note that every Construction A lattice can easily be put into a systematic form [34], and hence freezing bits is feasible.

From (27), one can see that the proposed IDT makes the received signal a noisy version of

$$ [a_{m1} x_{1, L - \tau_{m1} + 1}, \ldots, a_{m1} x_{1, L - \tau_{m1}}] + [a_{m2} x_{2, L - \tau_{m2} + 1}, \ldots, a_{m2} x_{2, L - \tau_{m2}}]. \tag{46} $$

Each length-$b$ sub-block is now a perfectly synchronized compute-and-forward problem. Therefore, we can compute $a_{m1} x_{1, l} + a_{m2} x_{2, (l - \tau_{m2} + \tau_{m1}) \bmod L}$ for each $l \in \{1, \ldots, L\}$ separately and use the mapping $\varphi$ to obtain linear combinations in $\mathbb{F}_p$. The error probability can be union bounded by $P_e^{(Lb)} < \varepsilon$, and the actual rate of this strategy is given by (33).
Now, letting $b, p \to \infty$ results in vanishing $\varepsilon$, and the rate of each nested lattice sub-code approaches $R(h_m, a_m)$. Moreover, letting $L \to \infty$ makes the actual rate converge to the design rate $R(h_m, a_m)$. This completes the proof.

REFERENCES

[1] P.-C. Wang, Y.-C. Huang, and K. R. Narayanan, "Asynchronous compute-and-forward/integer-forcing with quasi-cyclic codes," in Proc. IEEE Globecom, Dec. 2014.
[2] S. Zhang, S. C. Liew, and P. P. Lam, "Hot topic: Physical-layer network coding," in Proc. ACM MobiCom, pp. 358–365, Sept. 2006.
[3] B. Nazer and M. Gastpar, "Reliable physical-layer network coding," Proceedings of the IEEE, vol. 99, pp. 438–460, Mar. 2011.
[4] B. Nazer and M. Gastpar, "Compute-and-forward: Harnessing interference through structured codes," IEEE Trans. Inf. Theory, vol. 57, pp. 6463–6486, Oct. 2011.
[5] O. Ordentlich and U. Erez, "Cyclic-coded integer-forcing equalization," IEEE Trans. Inf. Theory, vol. 58, pp. 5804–5815, Sept. 2012.
[6] D. Wang, S. Fu, and K. Lu, "Channel coding design to support asynchronous physical layer network coding," in Proc. IEEE Globecom, pp. 1–6, Nov. 2009.
[7] L. Lu and S.-C. Liew, "Asynchronous physical-layer network coding," IEEE Trans. Wireless Commun., vol. 11, pp. 819–831, Feb. 2012.
[8] X. Wu, C. Zhao, and X. You, "Joint LDPC and physical-layer network coding for asynchronous bi-directional relaying," IEEE J. Sel. Areas Commun., vol. 31, pp. 1446–1454, Aug. 2013.
[9] H. Najafi, M. O. Damen, and A. Hjørungnes, "Asynchronous compute-and-forward," IEEE Trans. Commun., vol. 61, pp. 2704–2712, July 2013.
[10] Z. Li, L. Chen, L. Zeng, S. Lin, and W. Fong, "Efficient encoding of quasi-cyclic low-density parity-check codes," IEEE Trans. Commun., vol. 54, no. 1, pp. 71–81, 2006.
[11] Z. Wang and Z. Cui, "Low-complexity high-speed decoder design for quasi-cyclic LDPC codes," IEEE Trans. on Very Large Scale Integration Systems, vol. 15, no. 1, pp. 104–114, 2007.
[12] Y. Chen and K. Parhi, "Overlapped message passing for quasi-cyclic low-density parity check codes," IEEE Trans. on Circuits and Systems I, vol. 51, no. 6, pp. 1106–1113, 2004.
[13] "IEEE standard for information technology, local and metropolitan area networks, specific requirements, part 11: Wireless LAN medium access control (MAC) and physical layer (PHY) specifications, amendment 5: Enhancements for higher throughput," IEEE Std 802.11n-2009, pp. 1–565, 2009.
[14] "IEEE standard for local and metropolitan area networks, part 16: Air interface for fixed broadband wireless access systems,"

IEEE802.16-2004 , pp. 1–857, 2004.[15] “Digital video broadcasting (DVB) user guidelines for the second generation system for broadcasting, interactive services, news gatheringand other broadband satellite applications (DVB-S2),”

ETSI TR 102 376 , 2005.[16] Q. Yang and S. C. Liew, “Asynchronous convolutional-coded physical-layer network coding,” arXiv:1312.1447v1 [cs.IT] , Dec. 2013.[17] L. Lan, L. Zeng, Y. Tai, L. Chen, S. Lin, and K. Abdel-Ghaffar, “Construction of quasi-cyclic ldpc codes for AWGN and binary erasurechannels: A ﬁnite ﬁeld approach,”

IEEE Trans. Inf. Theory , vol. 53, no. 7, pp. 2429–2458, 2007.[18] Consultative Committee for Space Data Systems, “Low density parity check codes for use in near-earth and deep space applications(131.1-o-2 orange book),” Sept. 2007.[19] J. Thorpe, “Low-density parity-check (LDPC) codes constructed from protographs,” in

JPL, IPN Progress Report , pp. 42–154, Aug.2003.[20] C. Feng, D. Silva, and F. R. Kschischang, “An algebraic approach to physical-layer network coding,”

IEEE Trans. Inf. Theory , vol. 59,pp. 7576–7596, Nov. 2013.[21] N. E. Tunali, K. Narayanan, J. Boutros, and Y.-C. Huang, “Lattices over eisenstein integers for compute-and-forward,” in

Proc. AllertonConf. , Oct. 2012.[22] Y.-C. Huang, K. Narayanan, and N. E. Tunali, “Multistage compute-and-forward with multilevel lattice codes based on productconstructions,” arXiv:1401.2228 [cs.IT] , Jan. 2014.[23] J. Leech and N. J. A. Sloane, “Sphere packing and error-correcting codes,”

Canad. J. Math. , vol. 23, no. 4, pp. 718–745, 1971.[24] J. Conway and N. Sloane,

Sphere Packings, Lattices, and Groups . Springer Verlag, 1999.[25] G. D. Forney, M. D. Trott, and S.-Y. Chung, “Sphere-bound-achieving coset codes and multilevel coset codes,”

IEEE Trans. Inf. Theory ,vol. 46, pp. 820–850, May 2000.[26] T. Guess and M. Varanasi, “A new successively decodable coding technique for intersymbol-interference channels,” in

Proc. IEEE ISIT ,p. 102, June 2000.[27] R. G. Gallager,

Information Theory and Reliable Communication . John Wiley and Sons Inc. New York, NY, USA, 1968.[28] U. Erez and R. Zamir, “Achieving log(1 + SNR ) on the AWGN channel with lattice encoding and decoding,” IEEE Trans. Inf.Theory , vol. 50, pp. 2293–2314, Oct. 2004.[29] A. Abbasfar, D. Divsalar, and K. Yao, “Accumulate-repeat-accumulate codes,”

IEEE Trans. Commun. , vol. 55, pp. 692–702, Apr. 2007.[30] Q. Huang, Q. Diao, S. Lin, and K. Abdel-Ghaffar, “Cyclic and quasi-cyclic LDPC codes on constrained parity-check matrices andtheir trapping sets,”

IEEE Trans. Inf. Theory , vol. 58, pp. 2648–2671, May 2012.[31] T. Richardson and R. Urbanke,

Modern Coding Theory . Cambridge University Press, 2008.[32] H. Pﬁster, J. Soriaga, and P. Siegel, “On the achievable information rates of ﬁnite state ISI channels,” in

Proc. IEEE Globecom , Nov.2001.[33] S. Zhang and S.-C. Liew, “Channel coding and decoding in a relay system operated with physical-layer network coding,”

IEEE J. Sel.Areas Commun. , vol. 27, pp. 788–796, Feb. 2009.[34] K. Murugan, A; Azarian and H. El-Gamal, “Cooperative lattice coding and decoding in half-duplex channels,”