Erasure decoding of convolutional codes using first order representations
Julia Lieb∗
Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
Joachim Rosenthal
Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
Abstract
In this paper, we employ the linear systems representation of a convolutional code to develop a decoding algorithm for convolutional codes over the erasure channel. We study the decoding problem using the state space description and this provides in a natural way additional information. With respect to previously known decoding algorithms, our new algorithm has the advantage that it is able to reduce the decoding delay as well as the computational effort in the erasure recovery process. We describe which properties a convolutional code should have in order to obtain a good decoding performance and illustrate it with an example.
Keywords: convolutional codes, linear systems, decoding, erasure channel
1. Introduction
In modern communication, especially over the Internet, the erasure channel is widely used for data transmission. In this type of channel the receiver knows if an arrived symbol is correct, as each symbol either arrives correctly or is erased.

∗ Corresponding author
Email addresses: [email protected] (Julia Lieb), [email protected] (Joachim Rosenthal)
For example, over the Internet messages are transmitted using packets and each packet comes with a check sum. The receiver knows that a packet is correct when the check sum is correct. Otherwise, a packet is corrupted or simply lost during transmission. An especially suitable class of codes for transmission over an erasure channel is the class of convolutional codes [12]. It is known that convolutional codes are closely related to discrete-time linear systems over finite fields; in fact, each convolutional code has a so-called input-state-output (ISO) representation via such a linear system [13, 14]. This correspondence was also used in [4, 5, 6] to study concatenated convolutional codes. Moreover, the connection between linear systems and convolutional codes was investigated in a more general setup in [17], where multidimensional codes and systems over finite rings were considered.

Hence, decoding of a convolutional code can be viewed as finding the trajectory (consisting of input and output) of the corresponding linear system that is in some sense closest to the received data. The underlying distance measure one uses to identify the closest trajectory (i.e. the closest codeword) depends on the kind of channel that is used for data transmission. This decoding process can also be interpreted as minimizing a cost function attached to the corresponding linear system, which measures the distance of a received word to a codeword or the distance of a measured trajectory to a possible trajectory, respectively. For the Euclidean metric over the field of real numbers $\mathbb{R}$, this is nothing else than solving the classical LQ problem, i.e. minimizing the cost function $\sum_{i=0}^{N-1} \|u_i - \hat{u}_i\|^2 + \|y_i - \hat{y}_i\|^2$, where $\hat{u} \in (\mathbb{R}^m)^N$ and $\hat{y} \in (\mathbb{R}^p)^N$ are received and one wants to find an input $u \in (\mathbb{R}^m)^N$ and corresponding output $y \in (\mathbb{R}^p)^N$ of the linear system such that this cost function is minimized. This problem is relatively easy to solve and it has been known for quite some time how to approach it; see e.g. [10]. However, for the setting of classical coding theory, where usually the Hamming metric over finite fields is used, it turns out to be in general a hard problem to minimize the corresponding cost function $\sum_{i=0}^{N-1} \mathrm{wt}(u_i - \hat{u}_i) + \mathrm{wt}(y_i - \hat{y}_i)$ with $\hat{u}, u \in (\mathbb{F}^m)^N$ and $\hat{y}, y \in (\mathbb{F}^p)^N$ for some finite field $\mathbb{F}$. The methods used to solve the LQ problem cannot be applied since the Hamming metric is not induced by a positive definite scalar product. However, the problem becomes much easier for transmission over an erasure channel, as done with convolutional codes in this paper. In this setting, one introduces an additional symbol $*$ that stands for an erasure and considers $\mathbb{F} \cup \{*\}$ as the set of symbols for the decoding. The Hamming metric can easily be extended to this new symbol space and we are going to minimize the same cost function. The big advantage when decoding over an erasure channel is that we know that all received symbols, i.e. all symbols except $*$ in $\hat{u}$ and $\hat{y}$, are correct, and we only have to find a way to replace the unknowns $*$ by the original values to bring the cost function to its minimal value, which equals the number of erasures. It depends on the number of erasures whether unique decoding is possible or whether one gets a list of possible codewords. In this paper, we focus on unique decoding, i.e. we present an erasure decoding algorithm that skips part of the sequence if there are too many erasures for unique decoding to be possible.
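As a concrete illustration of this extended symbol space (a toy sketch for this introduction, not the algorithm developed below), the following Python fragment computes the extended Hamming cost over $\mathbb{F} \cup \{*\}$ and minimizes it over a small codebook; the minimum is attained by any codeword compatible with the received symbols and equals the number of erasures.

# A minimal sketch (not from the paper) of the extended Hamming
# distance on F ∪ {*}: an erasure '*' is at distance 1 from every
# field element, so the cost of the transmitted codeword equals
# the number of erasures in the received word.

ERASURE = "*"

def ext_hamming_cost(received, candidate):
    """Cost of explaining `received` by the codeword `candidate`."""
    assert len(received) == len(candidate)
    return sum(0 if r == c else 1 for r, c in zip(received, candidate))

def best_codeword(received, codebook):
    """Brute-force minimization of the cost over a (small) codebook."""
    return min(codebook, key=lambda c: ext_hamming_cost(received, c))

# toy example over F_2: received word with two erasures
codebook = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 1, 0), (1, 1, 0, 1)]
received = (1, ERASURE, 1, ERASURE)
print(best_codeword(received, codebook))         # -> (1, 0, 1, 1)
print(ext_hamming_cost(received, (1, 0, 1, 1)))  # -> 2 = number of erasures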
Our algorithm exploits the ISO representation of a convolutional code via linear systems to recover the erasures in the received sequence. With respect to other erasure decoding algorithms for convolutional codes that can be found in the literature [15, 1], our systems theoretic approach has the advantage that the computational effort as well as the decoding delay can be reduced.

The paper is structured as follows. In Section 2, we give the necessary background on convolutional codes. In Section 3, we explain the correspondence of time-discrete linear systems and convolutional codes. In Section 4, we present our decoding algorithm, describe which properties a convolutional code should have to perform well with our algorithm and illustrate it with an example. In Section 5, we describe the advantages of our algorithm and in Section 6, we conclude with some remarks.
2. Convolutional codes
In this section, we start with some basics on convolutional codes.
Definition 2.1. An $(n, k)$ convolutional code $\mathcal{C}$ is defined as an $\mathbb{F}[z]$-submodule of $\mathbb{F}[z]^n$ of rank $k$. As $\mathbb{F}[z]$ is a principal ideal domain, every submodule is free and hence, there exists a full column rank polynomial matrix $G(z) \in \mathbb{F}[z]^{n \times k}$ whose columns constitute a basis of $\mathcal{C}$, i.e.

$\mathcal{C} = \mathrm{Im}_{\mathbb{F}[z]} G(z) = \{ G(z)u(z) \mid u(z) \in \mathbb{F}[z]^k \}.$

Such a polynomial matrix $G$ is called a generator matrix of $\mathcal{C}$.

A basis of an $\mathbb{F}[z]$-submodule of $\mathbb{F}[z]^n$, and therefore also a generator matrix of a convolutional code, is not unique. If $G(z)$ and $\tilde{G}(z)$ in $\mathbb{F}[z]^{n \times k}$ are two generator matrices of $\mathcal{C}$, then one has $G(z) = \tilde{G}(z)U(z)$ for some unimodular matrix $U(z) \in \mathbb{F}[z]^{k \times k}$ (a unimodular matrix is a polynomial matrix with a polynomial inverse).

Another important parameter of a convolutional code is its degree $\delta$, which is defined as the highest (polynomial) degree of the $k \times k$ minors of any generator matrix $G(z)$ of the code. An $(n, k)$ convolutional code with degree $\delta$ is denoted as $(n, k, \delta)$ convolutional code. If $\delta_1, \ldots, \delta_k$ are the column degrees (i.e. the largest degrees of any entry of a fixed column) of $G(z)$, then one has that $\delta \leq \delta_1 + \cdots + \delta_k$. Moreover, there always exists a generator matrix of $\mathcal{C}$ such that $\delta = \delta_1 + \cdots + \delta_k$ and we call such a generator matrix column reduced.

Furthermore, for the use over an erasure channel, it is a crucial property of a convolutional code to be non-catastrophic. A convolutional code is said to be non-catastrophic if one (and therefore each) of its generator matrices is right prime, i.e. if it admits a polynomial left inverse. The following theorem shows why this property is so important.
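To make the notions of column degrees and degree concrete, here is a small illustrative sketch (ours, not from the paper): it computes the column degrees of a generator matrix over $\mathbb{F}_2$ and the degree $\delta$ as the maximal degree of the $k \times k$ minors, with polynomials stored as coefficient lists.

# Column degrees of a generator matrix over F_2 and the code degree
# delta, computed as the maximal degree of the k x k minors
# (illustrative sketch).  Polynomials are lists of F_2 coefficients,
# lowest degree first, e.g. [1, 0, 1] = 1 + z^2.
from itertools import combinations, permutations

def p_add(f, g):
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) ^ (g[i] if i < len(g) else 0)
            for i in range(n)]

def p_mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        if a:
            for j, b in enumerate(g):
                h[i + j] ^= a & b
    return h

def p_deg(f):
    d = -1
    for i, c in enumerate(f):
        if c:
            d = i
    return d  # the zero polynomial gets degree -1 here

def det(M):  # Leibniz formula; over F_2 the signs do not matter
    k = len(M)
    d = [0]
    for perm in permutations(range(k)):
        t = [1]
        for i in range(k):
            t = p_mul(t, M[i][perm[i]])
        d = p_add(d, t)
    return d

def code_degree(G, n, k):
    col_degs = [max(p_deg(G[i][j]) for i in range(n)) for j in range(k)]
    delta = max(p_deg(det([[G[i][j] for j in range(k)] for i in rows]))
                for rows in combinations(range(n), k))
    return col_degs, delta

# G(z) = [[1, 0], [z, 1], [0, z]]: a (3, 2) generator matrix
G = [[[1],    [0]],
     [[0, 1], [1]],
     [[0],    [0, 1]]]
print(code_degree(G, 3, 2))  # -> ([1, 1], 2): column reduced, delta = 2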
Theorem 2.2. Let $\mathcal{C}$ be an $(n, k)$ convolutional code. Then $\mathcal{C}$ is non-catastrophic if and only if there exists a so-called parity-check matrix for $\mathcal{C}$, i.e. a full row rank polynomial matrix $H(z) \in \mathbb{F}[z]^{(n-k) \times n}$ such that

$\mathcal{C} = \mathrm{Ker}_{\mathbb{F}[z]} H(z) = \{ v(z) \in \mathbb{F}[z]^n \mid H(z)v(z) = 0 \}.$

Parity-check matrices are commonly used for decoding of convolutional codes over the erasure channel. Recall that, when transmitting over this kind of channel, each symbol is either received correctly or is not received at all. The first decoding algorithm for convolutional codes over the erasure channel using parity-check matrices can be found in [15], variations of it in [1] or [11].

To investigate the error correcting capability of convolutional codes, it is necessary to define distance measures for these codes.
We denote by the Hamming weight $\mathrm{wt}(v)$ of $v \in \mathbb{F}^n$ the number of its nonzero components. For $v(z) \in \mathbb{F}[z]^n$ with $\deg(v(z)) = r$, we write $v(z) = v_r + \cdots + v_0 z^r$ with $v_t \in \mathbb{F}^n$ for $t = 0, \ldots, r$ and set $v_t = 0 \in \mathbb{F}^n$ for $t \notin \{0, \ldots, r\}$. For $j \in \mathbb{N}_0$, we define the j-th column distance of a convolutional code $\mathcal{C}$ as

$d_j^c(\mathcal{C}) := \min_{v(z) \in \mathcal{C}} \left\{ \sum_{t=0}^{j} \mathrm{wt}(v_{r-t}) \;\middle|\; v_r \neq 0 \right\}.$
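For small parameters the column distances can be computed by brute force directly from the definition; the sketch below is illustrative only and uses the standard ordering $v(z) = v_0 + v_1 z + \cdots$ rather than the reversed indexing above, which yields the same values.

# Illustrative brute-force computation of column distances (not from
# the paper; standard indexing v(z) = v_0 + v_1 z + ... is used, which
# gives the same values as the reversed convention above).
from itertools import product

import numpy as np

# (2,1,1) code over F_2 with G(z) = G0 + G1 z
G = [np.array([[1], [1]]), np.array([[0], [1]])]  # G_0, G_1
n, k, q = 2, 1, 2

def column_distance(j):
    best = None
    # enumerate message blocks m_0, ..., m_j with m_0 != 0
    for msg in product(range(q), repeat=k * (j + 1)):
        m = np.array(msg).reshape(j + 1, k)
        if not m[0].any():
            continue
        w = 0
        for t in range(j + 1):
            v_t = sum(G[i] @ m[t - i] for i in range(len(G)) if 0 <= t - i) % q
            w += np.count_nonzero(v_t)
        best = w if best is None else min(best, w)
    return best

print([column_distance(j) for j in range(3)])  # -> [2, 3, 3]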
The erasure correcting capability of a convolutional code increases with its column distances, which are upper bounded as the following theorem shows.

Theorem 2.3. [8] Let $\mathcal{C}$ be an $(n, k, \delta)$ convolutional code. Then, it holds:

$d_j^c(\mathcal{C}) \leq (n-k)(j+1) + 1 \quad \text{for } j \in \mathbb{N}_0.$

It is well-known that the column distances of a convolutional code can reach this upper bound only up to $j = L := \lfloor \frac{\delta}{k} \rfloor + \lfloor \frac{\delta}{n-k} \rfloor$.

Definition 2.4. [9] An $(n, k, \delta)$ convolutional code $\mathcal{C}$ is said to be maximum distance profile (MDP) if

$d_j^c(\mathcal{C}) = (n-k)(j+1) + 1 \quad \text{for } j = 0, \ldots, L := \lfloor \frac{\delta}{k} \rfloor + \lfloor \frac{\delta}{n-k} \rfloor.$

If one has equality for some $j_0 \in \mathbb{N}_0$ in Theorem 2.3, then one also has equality for all $j \leq j_0$, see [8]. Hence, it is sufficient to have equality for $j = L$ to obtain an MDP convolutional code. The following theorem presents criteria to check whether a convolutional code is MDP.

Theorem 2.5. [8] Let $\mathcal{C}$ have a column reduced generator matrix $G(z) = \sum_{i=0}^{\mu} G_i z^i \in \mathbb{F}[z]^{n \times k}$ and parity-check matrix $H(z) = \sum_{i=0}^{\nu} H_i z^i \in \mathbb{F}[z]^{(n-k) \times n}$. The following statements are equivalent:

(i) $d_j^c(\mathcal{C}) = (n-k)(j+1) + 1$;

(ii) $G_j^c := \begin{bmatrix} G_0 & & \\ \vdots & \ddots & \\ G_j & \cdots & G_0 \end{bmatrix}$, where $G_i \equiv 0$ for $i > \mu$, has the property that every full size minor that is not trivially zero, i.e. zero for all choices of $G_0, \ldots, G_j$, is nonzero;

(iii) $H_j^c := \begin{bmatrix} H_0 & & \\ \vdots & \ddots & \\ H_j & \cdots & H_0 \end{bmatrix}$, where $H_i \equiv 0$ for $i > \nu$, has the property that every full size minor that is not trivially zero is nonzero.
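The sliding matrices in Theorem 2.5 are block Toeplitz; a short sketch (illustrative, with entries kept as plain integers for readability) that assembles $G_j^c$ from the coefficient matrices:

# Assemble the sliding block-Toeplitz matrix G_j^c of Theorem 2.5
# from the coefficients G_0, ..., G_mu (illustrative sketch).
import numpy as np

def sliding_matrix(coeffs, j, n, k):
    """coeffs = [G_0, ..., G_mu], each of shape (n, k); G_i = 0 for i > mu."""
    Gc = np.zeros(((j + 1) * n, (j + 1) * k), dtype=int)
    for r in range(j + 1):          # block row
        for c in range(r + 1):      # block column (lower triangular)
            i = r - c
            if i < len(coeffs):
                Gc[r * n:(r + 1) * n, c * k:(c + 1) * k] = coeffs[i]
    return Gc

G0 = np.array([[1], [1]])
G1 = np.array([[0], [1]])
print(sliding_matrix([G0, G1], 2, 2, 1))
# block rows: [G0 0 0; G1 G0 0; 0 G1 G0]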
The erasure decoding capability of an MDP convolutional code is stated in the following theorem.

Theorem 2.6. [15] If for an $(n, k, \delta)$ MDP convolutional code $\mathcal{C}$, in any sliding window of length at most $(L+1)n$ at most $(L+1)(n-k)$ erasures occur, then full error correction from left to right is possible.
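Whether a given erasure pattern satisfies the window condition of Theorem 2.6 is easy to test; a minimal sketch (illustrative, with the pattern given as a 0/1 list marking erased symbols):

# Check the sliding-window condition of Theorem 2.6 for an erasure
# pattern (1 = erased symbol, 0 = received symbol); minimal sketch.
def mdp_window_condition(erasures, n, k, L):
    w = (L + 1) * n                  # window length
    cap = (L + 1) * (n - k)          # maximal number of erasures allowed
    return all(sum(erasures[s:s + w]) <= cap
               for s in range(len(erasures) - w + 1))

# (2,1) code with L = 2: windows of 6 symbols may contain at most 3 erasures
pattern = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0]
print(mdp_window_condition(pattern, n=2, k=1, L=2))  # True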
3. The linear systems representation of a convolutional code
In this section, we consider discrete-time linear systems of the form

$x(\tau + 1) = Ax(\tau) + Bu(\tau)$
$y(\tau) = Cx(\tau) + Du(\tau) \qquad (1)$

with $A \in \mathbb{F}^{s \times s}$, $B \in \mathbb{F}^{s \times k}$, $C \in \mathbb{F}^{(n-k) \times s}$, $D \in \mathbb{F}^{(n-k) \times k}$, input $u \in \mathbb{F}^k$, state vector $x \in \mathbb{F}^s$, output $y \in \mathbb{F}^{n-k}$ and $s, \tau \in \mathbb{N}$. We identify this system with the matrix quadruple $(A, B, C, D)$. The function $T(z) = C(zI - A)^{-1}B + D$ is called the transfer function of the linear system.
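A minimal simulation sketch of (1) over a prime field (illustrative; the system is started in $x(0) = 0$, as in the encoding setting used later):

# Minimal simulation of the system (1) over a prime field F_p,
# starting from x(0) = 0 (illustrative sketch).
import numpy as np

def run_system(A, B, C, D, inputs, p):
    s = A.shape[0]
    x = np.zeros(s, dtype=int)
    outputs = []
    for u in inputs:
        y = (C @ x + D @ u) % p          # y(tau) = C x(tau) + D u(tau)
        x = (A @ x + B @ u) % p          # x(tau+1) = A x(tau) + B u(tau)
        outputs.append(y)
    return outputs

p = 5
A = np.array([[1, 2], [0, 1]]); B = np.array([[1], [1]])
C = np.array([[1, 0], [0, 1], [1, 1]]); D = np.array([[2], [0], [1]])
u_seq = [np.array([1]), np.array([3]), np.array([0])]
print(run_system(A, B, C, D, u_seq, p))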
Definition 3.1. A linear system (1) is called

(a) reachable if for each $\xi \in \mathbb{F}^s$ there exist $\tau^* \in \mathbb{N}$ and a sequence of inputs $u(0), \ldots, u(\tau^*) \in \mathbb{F}^k$ such that the sequence of states $x(0), x(1), \ldots, x(\tau^* + 1)$ generated by (1) satisfies $x(\tau^* + 1) = \xi$;

(b) observable if $Cx(\tau) + Du(\tau) = C\tilde{x}(\tau) + Du(\tau)$ for all $\tau \in \mathbb{N}$ implies $x(\tau) = \tilde{x}(\tau)$ for all $\tau \in \mathbb{N}$. This means that the knowledge of the input and output sequences is sufficient to determine the sequence of states;

(c) minimal if it is reachable and observable.

Recall the following well-known characterization of reachability and observability.
Theorem 3.2. (Kalman test) A linear system (1) is reachable if and only if the reachability matrix $R(A, B) := (B, AB, \ldots, A^{s-1}B) \in \mathbb{F}^{s \times sk}$ satisfies $\mathrm{rk}(R(A, B)) = s$, and observable if and only if the observability matrix

$O(A, C) = \begin{bmatrix} C \\ \vdots \\ CA^{s-1} \end{bmatrix} \in \mathbb{F}^{(n-k)s \times s}$

satisfies $\mathrm{rk}(O(A, C)) = s$.

Next, we will explain how one can obtain a convolutional code from a linear system; see [14]. First, for $(A, B, C, D) \in \mathbb{F}^{s \times s} \times \mathbb{F}^{s \times k} \times \mathbb{F}^{(n-k) \times s} \times \mathbb{F}^{(n-k) \times k}$, we set

$H(z) := \begin{bmatrix} zI - A & 0_{s \times (n-k)} & -B \\ -C & I_{n-k} & -D \end{bmatrix}.$

The set of $v(z) = \begin{bmatrix} y(z) \\ u(z) \end{bmatrix} \in \mathbb{F}[z]^n$ with $y(z) \in \mathbb{F}[z]^{n-k}$ and $u(z) \in \mathbb{F}[z]^k$ for which there exists $x(z) \in \mathbb{F}[z]^s$ with $H(z) \cdot [x(z)^\top\ y(z)^\top\ u(z)^\top]^\top = 0$ forms a submodule of $\mathbb{F}[z]^n$ of rank $k$ and thus, an $(n, k)$ convolutional code, denoted by $\mathcal{C}(A, B, C, D)$.

Moreover, if one writes $x(z) = x_0 z^\gamma + \cdots + x_\gamma$, $y(z) = y_0 z^\gamma + \cdots + y_\gamma$ and $u(z) = u_0 z^\gamma + \cdots + u_\gamma$ with $\gamma = \max(\deg(x), \deg(y), \deg(u))$, it holds (with $x_0 = 0$)

$x_{\tau+1} = Ax_\tau + Bu_\tau$
$y_\tau = Cx_\tau + Du_\tau$
$(x_\tau, y_\tau, u_\tau) = 0 \quad \text{for } \tau > \gamma.$

Furthermore, there exist $X(z) \in \mathbb{F}[z]^{s \times k}$, $Y(z) \in \mathbb{F}[z]^{(n-k) \times k}$, $U(z) \in \mathbb{F}[z]^{k \times k}$ such that $\ker(H(z)) = \mathrm{im}[X(z)^\top\ Y(z)^\top\ U(z)^\top]^\top$ and $G(z) = \begin{bmatrix} Y(z) \\ U(z) \end{bmatrix}$ is a generator matrix for $\mathcal{C}$ with $C(zI - A)^{-1}B + D = Y(z)U(z)^{-1}$, i.e. one is able to obtain a factorization of the transfer function of the linear system via the generator matrix of the corresponding convolutional code, and in the case that this convolutional code is non-catastrophic, one even obtains a coprime factorization of the transfer function.

On the other hand, for each $(n, k, \delta)$ convolutional code $\mathcal{C}$, there exists $(A, B, C, D) \in \mathbb{F}^{s \times s} \times \mathbb{F}^{s \times k} \times \mathbb{F}^{(n-k) \times s} \times \mathbb{F}^{(n-k) \times k}$ with $s \geq \delta$ such that $\mathcal{C} = \mathcal{C}(A, B, C, D)$. In this case, $(A, B, C, D)$ is called a linear systems representation or input-state-output (ISO) representation of $\mathcal{C}$. Besides, one can always choose $s = \delta$; in this case, $(A, B, C, D)$ is called a minimal representation of $\mathcal{C}$.
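The Kalman rank conditions of Theorem 3.2, and hence, by the theorems below, minimality and non-catastrophicity of $\mathcal{C}(A, B, C, D)$, are easy to verify computationally; a sketch over a prime field (illustrative, with rank computed by Gaussian elimination mod $p$):

# Kalman test over F_p: build R(A,B) and O(A,C) and check that both
# have rank s (illustrative sketch).
import numpy as np

def rank_mod_p(M, p):
    M = M.copy() % p
    r = 0
    for c in range(M.shape[1]):
        piv = next((i for i in range(r, M.shape[0]) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        M[r] = (M[r] * pow(int(M[r, c]), -1, p)) % p
        for i in range(M.shape[0]):
            if i != r and M[i, c]:
                M[i] = (M[i] - M[i, c] * M[r]) % p
        r += 1
    return r

def kalman_test(A, B, C, p):
    s = A.shape[0]
    R = np.hstack([np.linalg.matrix_power(A, i) @ B % p for i in range(s)])
    O = np.vstack([C @ np.linalg.matrix_power(A, i) % p for i in range(s)])
    reachable = rank_mod_p(R, p) == s
    observable = rank_mod_p(O, p) == s
    return reachable, observable

p = 5
A = np.array([[1, 2], [0, 1]]); B = np.array([[1], [1]])
C = np.array([[1, 0], [0, 1], [1, 1]])
print(kalman_test(A, B, C, p))  # (True, True) for this choice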
Remark 3.3. In the coding literature, state space descriptions were often given in a graph theoretic manner using so-called trellis representations; see e.g. [7]. However, especially over large finite fields it is hard to algebraically describe a decoding algorithm in this way and hence, a state space description as above is preferred.

The following theorems show how properties of a linear system are related to properties of the corresponding convolutional code.
Theorem 3.4. [14] $(A, B, C, D)$ is a minimal representation of $\mathcal{C}(A, B, C, D)$ if and only if it is reachable.

Theorem 3.5. [14] Assume that $(A, B, C, D)$ is reachable. Then $\mathcal{C}(A, B, C, D)$ is non-catastrophic if and only if $(A, B, C, D)$ is observable.
4. Low-delay erasure decoding algorithm using the linear systems representation
In this chapter, we develop our erasure decoding algorithm based on the ISO representation of the convolutional code. Some first ideas on decoding via this representation can already be found in [16]. We adopt some of the ideas presented there and combine them with new ideas to obtain a complete decoding algorithm.

Assume that we have a message $M = [m_0^\top \cdots m_\gamma^\top]^\top \in \mathbb{F}^{k(\gamma+1)}$ with $m_i \in \mathbb{F}^k$, which is sent at time step $i$. We write this message as $m(z) = \sum_{i=0}^{\gamma} m_{\gamma-i} z^i$ and encode it via a full rank, right prime, column reduced polynomial generator matrix $G(z) = \sum_{i=0}^{\mu} G_{\mu-i} z^i \in \mathbb{F}[z]^{n \times k}$ to obtain $v(z) = G(z)m(z) \in \mathbb{F}[z]^n$. We write $v(z) = \begin{bmatrix} y(z) \\ u(z) \end{bmatrix}$ with $y(z) = \sum_{i=0}^{\mu+\gamma} y_{\mu+\gamma-i} z^i \in \mathbb{F}[z]^{n-k}$ and $u(z) = \sum_{i=0}^{\mu+\gamma} u_{\mu+\gamma-i} z^i \in \mathbb{F}[z]^k$. As $m_0$ is sent first, we first receive $\begin{bmatrix} y_0 \\ u_0 \end{bmatrix} = G_0 m_0$, in the next time step $\begin{bmatrix} y_1 \\ u_1 \end{bmatrix} = G_0 m_1 + G_1 m_0$, and so on.
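In time-domain terms, the t-th transmitted block is $v_t = \sum_i G_i m_{t-i}$, a convolution of the message blocks with the generator coefficients; a minimal sketch over $\mathbb{F}_p$ (illustrative):

# Time-domain encoding: v_t = sum_i G_i m_{t-i} over F_p, so the
# blocks v_0, ..., v_{mu+gamma} are sent in this order (sketch).
import numpy as np

def encode(G_coeffs, msg_blocks, p):
    """G_coeffs = [G_0, ..., G_mu] (n x k), msg_blocks = [m_0, ..., m_gamma]."""
    mu, gamma = len(G_coeffs) - 1, len(msg_blocks) - 1
    v = []
    for t in range(mu + gamma + 1):
        v_t = sum(G_coeffs[i] @ msg_blocks[t - i]
                  for i in range(mu + 1) if 0 <= t - i <= gamma)
        v.append(np.asarray(v_t) % p)
    return v

p = 2
G0 = np.array([[1, 0], [0, 1], [1, 1], [1, 0], [0, 1]])  # n = 5, k = 2
G1 = np.array([[0, 1], [1, 0], [1, 1], [0, 0], [1, 1]])
msg = [np.array([1, 0]), np.array([0, 1]), np.array([1, 1])]
print(encode([G0, G1], msg, p))  # v_0, ..., v_3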
Remark 4.1. In principle, it would also be possible to encode the message via the linear system, i.e. to set $u(z) = m(z)$. In this case, one gets a rational generator matrix, which equals the transfer function of the linear system. But to make sure that the state and the output of the linear system have finite support, we would have to impose restrictions on the input, i.e. on the message. This is why we consider this option as not suitable.

Let $(A, B, C, D)$ be the linear systems representation of the convolutional code generated by $G(z)$. Define $F_0 := D$ and

$F_j := \begin{bmatrix} D & & & \\ CB & D & & \\ \vdots & \ddots & \ddots & \\ CA^{j-1}B & \cdots & CB & D \end{bmatrix} \quad \text{for } j \geq 1.$

Then, $(y_0, u_0, \ldots, y_j, u_j)$ represents the beginning of a codeword if and only if

$\begin{bmatrix} -I & F_j \end{bmatrix} \begin{bmatrix} y_0 \\ \vdots \\ y_j \\ u_0 \\ \vdots \\ u_j \end{bmatrix} = 0. \qquad (2)$

Moreover, one has for $i, j, l \in \mathbb{N}$:

$\begin{bmatrix} C \\ \vdots \\ CA^j \end{bmatrix} x_{i+l} + \begin{bmatrix} -I & F_j \end{bmatrix} \begin{bmatrix} y_{i+l} \\ \vdots \\ y_{i+l+j} \\ u_{i+l} \\ \vdots \\ u_{i+l+j} \end{bmatrix} = 0, \qquad (3)$

where

$x_{i+l} = A^{i+l-1}Bu_0 + \cdots + Bu_{i+l-1}. \qquad (4)$

Furthermore, define $R_l := [A^{l-1}B \ \cdots \ B]$ and $\ell := \max\{l \mid R_l \text{ has full column rank}\}$ if $B$ has full column rank, and $\ell := -\infty$ otherwise.

Theorem 4.2. [9] The quadruple $(A, B, C, D)$ is the linear systems representation of an MDP convolutional code if and only if each minor of $F_L$ which is not trivially zero is nonzero.

Furthermore, $u_i = y_i = 0$ for $i > \gamma + \mu$ implies $CA^{\gamma+\mu+w}Bu_0 + \cdots + CA^wBu_{\gamma+\mu} = 0$ for $w \in \mathbb{N}_0$. Define

$E_w := \begin{bmatrix} CA^{\gamma+\mu}B & \cdots & CB \\ \vdots & & \vdots \\ CA^{\gamma+\mu+w}B & \cdots & CA^wB \end{bmatrix}$

and $\tilde{E}_w$ as the submatrix of $E_w$ consisting only of the columns corresponding to components of $(u_0^\top, \ldots, u_{\gamma+\mu}^\top)$ that are not known yet.

We assume that the erasure recovering process has to be done within time delay $T$, i.e. it is necessary that $m_i$ can be recovered after one has received (with possible erasures) $v_0, \ldots, v_i, \ldots, v_{i+T}$.

Assume that $v_0, \ldots, v_{i-1}$ are known and $v_i$ contains erasures. Then, one obtains

$\begin{bmatrix} -I & F_j \end{bmatrix} \begin{bmatrix} y_i \\ \vdots \\ y_{i+j} \\ u_i \\ \vdots \\ u_{i+j} \end{bmatrix} = \beta, \qquad (5)$

where $\beta$ is a known vector depending on $v_0, \ldots, v_{i-1}$.

Decoding Algorithm

1: Set $i = -1$.
2: If there exists $w \in \mathbb{N}_0$ such that $\tilde{E}_w$ has full column rank, go to 12; otherwise, if $v_i$ contains erasures, go to 3, and if $v_i$ contains no erasures, set $i = i + 1$ and repeat step 2.
3: Set $j = 0$.
4: If $v_i$ can be recovered solving the linear system of equations induced by $[-I \mid F_j]$ and $v_i, \ldots, v_{i+j}$ (see (5)), go to 5; otherwise go to 6.
5: Recover the erasures in $v_i$ (and if possible also erasures in $v_{i+1}, \ldots, v_{i+j}$), solving the system of linear equations (5). Replace the erased symbols with the correct symbols and go back to 2.
6: If $j = T$, go to 7. Otherwise, set $j = j + 1$ and go back to 4.
7: Set $l = 1$.
8: Set $j = 0$.
9: If $x_{i+l}$ can be recovered solving the linear system of equations induced by (3), with $x_{i+l}$ and the erased components of $v_{i+l}, \ldots, v_{i+l+j}$ as unknowns, go to 10. Otherwise, go to 11.
10: Recover $x_{i+l}$ and as much as possible of $v_{i+l}, \ldots, v_{i+l+j}$ with the help of (3). With the knowledge of $x_{i+l}$ and $u_0, \ldots, u_{i-1}$ and with equation (4), obtain $A^{l-1}Bu_i + \cdots + Bu_{i+l-1}$. If $l \leq \ell$, this equation allows us to recover $u_i, \ldots, u_{i+l-1}$ and to compute $y_i, \ldots, y_{i+l-1}$ as well. If $l > \ell$, some values of $v_i, \ldots, v_{i+l-1}$ are lost, but we can still restart the recovering process after these lost symbols. In either case, set $i = i + l - 1$ and go back to 2.
11: If $j = T - l$, set $l = l + 1$ and go back to 8. Otherwise, set $j = j + 1$ and go back to 9.
12: Use the system of linear equations $E_w \cdot [u_0^\top, \ldots, u_{\gamma+\mu}^\top]^\top = 0$ to recover all erased components of $[u_0^\top, \ldots, u_{\gamma+\mu}^\top]^\top$. Afterwards, use (2) to obtain $[y_0^\top, \ldots, y_{\gamma+\mu}^\top]^\top$.

In steps 4 to 6 the algorithm recovers erasures forward within time delay $T$ as long as this is possible. If it reaches a point where this is not possible, it tries to recover the state of the corresponding linear system (steps 9 to 11) to be able to restart the decoding process (and it also recovers symbols that had been lost in between, in case this is possible, even if these symbols are then recovered with a delay that is larger than $T$). After every successful recovery, in step 2, it is checked whether there are already enough symbols known to recover the whole message with step 12. Note that due to the theorem of Cayley-Hamilton one only has to check $\tilde{E}_w$ up to $w = \delta - 1$.

In order to have a good performance with our algorithm, a convolutional code should fulfill the following properties as well as possible:

1. The nontrivial minors of $F_j$ are nonzero for $j = 1, \ldots, T$.
2. The nontrivial minors of $\left[ \begin{smallmatrix} C \\ \vdots \\ CA^j \end{smallmatrix} \;\middle|\; F_j \right]$ are nonzero for $j = 1, \ldots, T$.
3. For as many sets of columns of $E_w$ as possible, there exists $w \in \{1, \ldots, \delta - 1\}$ such that these columns are linearly independent.
4. $\ell$ is as large as possible.

It is difficult to ensure that all these four properties are perfectly fulfilled. However, since these properties involve similar matrices, it seems to be a good attempt to construct a convolutional code in such a way that some of the properties are fulfilled, and then check how well the other properties are fulfilled. Clearly, if 2. is perfectly fulfilled, then so is 1. Furthermore, there already exist constructions for matrices having all nontrivial minors nonzero (in the literature also referred to as superregular matrices); see e.g. [2], [16], [8]. Hence, to illustrate the performance of our algorithm with an example, we will construct a convolutional code such that 2. is perfectly fulfilled and then investigate how well 3. and 4. are fulfilled. Note that 4. is not so important for our algorithm, as it only helps to recover symbols that had to be declared as lost with a larger delay than allowed by the delay constraint.
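To make steps 3 to 5 concrete, here is a sketch of one forward recovery attempt (an illustration under our own simplifying assumptions, not the authors' reference implementation): it builds the system (5) for a window $v_i, \ldots, v_{i+j}$, moves the known symbols to the right-hand side and solves for the erased ones over $\mathbb{F}_p$.

# Sketch of one forward recovery attempt (steps 3-5), under our own
# simplifying assumptions: erased symbols are None, x_i is the known
# state at the start of the window, arithmetic is over F_p.
import numpy as np

def solve_unique(M, b, p):
    """Return the unique solution of M z = b over F_p, or None."""
    A = np.concatenate([M % p, (b % p).reshape(-1, 1)], axis=1)
    rows, cols = M.shape
    pivots = []
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if A[i, c]), None)
        if piv is None:
            continue
        A[[r, piv]] = A[[piv, r]]
        A[r] = (A[r] * pow(int(A[r, c]), -1, p)) % p
        for i in range(rows):
            if i != r and A[i, c]:
                A[i] = (A[i] - A[i, c] * A[r]) % p
        pivots.append(c)
        r += 1
    if len(pivots) < cols or any(A[i, -1] for i in range(r, rows)):
        return None                     # no unique solution / inconsistent
    z = np.zeros(cols, dtype=int)
    z[pivots] = A[:r, -1]
    return z

def forward_recover(Asys, B, C, D, x_i, y_win, u_win, p):
    """y_win/u_win: lists of j+1 blocks; erased entries are None."""
    j = len(y_win) - 1
    m, k = D.shape                      # m = n - k
    # assemble [-I | F_j] and the right-hand side -[C; CA; ...; CA^j] x_i
    M = np.zeros(((j + 1) * m, (j + 1) * (m + k)), dtype=int)
    b = np.zeros((j + 1) * m, dtype=int)
    for r in range(j + 1):
        M[r * m:(r + 1) * m, r * m:(r + 1) * m] = -np.eye(m, dtype=int)
        for c in range(r + 1):
            blk = D if c == r else C @ np.linalg.matrix_power(Asys, r - c - 1) @ B
            M[r * m:(r + 1) * m,
              (j + 1) * m + c * k:(j + 1) * m + (c + 1) * k] = blk % p
        b[r * m:(r + 1) * m] = (-C @ np.linalg.matrix_power(Asys, r) @ x_i) % p
    sym = [s for blk in y_win for s in blk] + [s for blk in u_win for s in blk]
    unk = [t for t, s in enumerate(sym) if s is None]
    kno = [t for t in range(len(sym)) if t not in unk]
    rhs = (b - M[:, kno] @ np.array([sym[t] for t in kno], dtype=int)) % p
    return solve_unique(M[:, unk], rhs, p)  # values of the erased symbols

If solve_unique returns None, the algorithm increases j (step 6) or falls back to the state recovery via (3) (steps 7 to 11).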
Example 4.3. We will construct a $(5, 2, 2)$ convolutional code for decoding with maximum delay $T = L = 1$. First note that property 4 can hardly be fulfilled for these parameters because $R_l$ has more columns than rows for all $l > 1$, so that $\ell \leq 1$. But as mentioned before, this property is only useful for the recovery of lost symbols with larger delay than originally prescribed and thus, it is no problem to neglect it. Hence, we want to construct $A \in \mathbb{F}^{2 \times 2}$, $B \in \mathbb{F}^{2 \times 2}$, $C \in \mathbb{F}^{3 \times 2}$, $D \in \mathbb{F}^{3 \times 2}$ such that

$\begin{bmatrix} C & D & 0 \\ CA & CB & D \end{bmatrix}$

has all nontrivial minors nonzero for a suitable finite field $\mathbb{F}$. We use the construction for superregular matrices from [3], as well as the fact that column permutation preserves superregularity, to obtain such a matrix over $\mathbb{F} = \mathbb{F}_{p^N}$ with $N$ sufficiently large, whose entries are suitable powers of a primitive element $a$ of $\mathbb{F}$. From this matrix we immediately obtain $D$ and $C$ and can compute $B$ and $A$ from the blocks $CB$ and $CA$, using that $C$ has a left inverse.

As $B$ is full rank, $(A, B, C, D)$ is a minimal ISO representation of a $(5, 2, 2)$ convolutional code $\mathcal{C}$, and since the matrix above is superregular, $\mathcal{C}$ is an MDP convolutional code. Hence, in particular, it has to fulfill Theorem 2.5 (ii), which is not possible if $G_1$ has two columns that are identically zero. Hence, a generator matrix $G$ of $\mathcal{C}$ has at most one column degree that is equal to zero. Consequently, $G$ has column degrees $1, 1$ since we assumed it to be a column reduced generator matrix and thus, the column degrees of $G$ have to sum up to $\delta = 2$. Therefore, we obtain $\mu = 1$.

Assume $\gamma = 3$ and that we receive the following, where $*$ symbolizes an erasure and √ a received symbol:

          $v_0$      $v_1$      $v_2$      $v_3$      $v_4$
$y_i$    * * √     * * *     * * *     √ √ √     * * *
$u_i$    √ √       * √       √ √       √ √       * *

Since $\mathcal{C}$ is MDP, it can recover $n-k$ erasures out of $n$ symbols or $2(n-k)$ erasures out of $2n$ symbols (assuming that there are no erasures in front of this window of size $n$ or $2n$, respectively). The steps of our algorithm with $\mathcal{C}$ and the above erasure pattern would be the following.

First, the algorithm uses (5) with $j = 0$ to recover $y_0$. Afterwards, one realizes that it is neither possible to recover $y_1$ and $u_1$ with (5) for $j = 0$ nor $y_1, u_1, y_2, u_2$ with (5) for $j = 1$. The algorithm applies (3) with $i = l = 1$ to recover $x_2$ and $y_2$, but the erased components of $y_1$ and $u_1$ have to be declared as lost. Finally, as the matrix consisting of the first column of $\left[\begin{smallmatrix} CA^3B \\ CA^4B \end{smallmatrix}\right]$ and all columns of $\left[\begin{smallmatrix} CB \\ CAB \end{smallmatrix}\right]$ has full column rank, one can use step 12 of the algorithm to recover the lost component of $u_1$ as well as $u_4$ before $u_4$ and $y_4$ were even sent, just with the knowledge of the already known symbols of $u_0, u_1, u_2, u_3$ and with the information that $\gamma = 3$, i.e. $u_i = y_i = 0$ for $i > \gamma + \mu = 4$. Then, with the knowledge of $u_0, \ldots, u_4$, it is also possible to compute the erased components of $y_1$ and $y_4$.

In summary, we are able to recover the whole sequence, but part of it only with a larger delay than actually allowed. However, we were able to obtain $u_4, y_4$ already one time interval before these vectors were sent, i.e. in some sense with delay $-1$.
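The termination condition used in step 12 above can be checked mechanically; the following sketch is illustrative only (the matrices below are hypothetical stand-ins, not the ones of Example 4.3): it builds $E_w$ for $\gamma + \mu = 4$ and tests whether the columns corresponding to still-unknown components of $(u_0, \ldots, u_4)$ are linearly independent over $\mathbb{F}_p$.

# Illustrative check of the termination condition of step 12 with
# hypothetical matrices A, B, C (not those of Example 4.3).
import numpy as np

def rank_mod_p(M, p):
    M = M.copy() % p
    r = 0
    for c in range(M.shape[1]):
        piv = next((i for i in range(r, M.shape[0]) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        M[r] = (M[r] * pow(int(M[r, c]), -1, p)) % p
        for i in range(M.shape[0]):
            if i != r and M[i, c]:
                M[i] = (M[i] - M[i, c] * M[r]) % p
        r += 1
    return r

def E_tilde_full_rank(A, B, C, unknown_cols, gm, w, p):
    """gm = gamma + mu; unknown_cols indexes components of (u_0,...,u_gm)."""
    rows = []
    for v in range(w + 1):
        rows.append(np.hstack(
            [C @ np.linalg.matrix_power(A, gm - t + v) @ B % p
             for t in range(gm + 1)]))
    E_w = np.vstack(rows)
    E_tilde = E_w[:, unknown_cols]
    return rank_mod_p(E_tilde, p) == len(unknown_cols)

p = 7  # hypothetical small field, for illustration only
A = np.array([[1, 1], [2, 3]]); B = np.array([[1, 2], [3, 1]])
C = np.array([[1, 0], [0, 1], [1, 2]])
# unknown: first component of u_1 (index 2) and both of u_4 (indices 8, 9)
print(E_tilde_full_rank(A, B, C, [2, 8, 9], gm=4, w=1, p=7))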
5. Performance Analysis
In this section, we will explain the two main advantages of our systems theoretic decoding algorithm with respect to the (first) erasure decoding algorithm for convolutional codes that can be found in [15], namely the reduced decoding delay and the reduced computational effort.

Our algorithm tries to recover the occurring erasures with the smallest possible delay by first trying to do the recovery in a window of size $n$, afterwards in a window of size $2n$, and so on. In contrast to this approach, the decoding algorithm in [15] first tries to decode in the largest possible window of size $(L+1)n$ and only decreases this window if it fails to recover all the erasures in the big window. This implies that the decoding delay is always at least $L$. Moreover, it is computationally less complex and less costly to do several decoding steps in small windows than one decoding step in a larger window whose size is the sum of the sizes of the smaller windows, since it is easier to solve several small linear systems of equations than one large one. In addition, by using the linear systems approach, the systems of equations we have to solve for erasure recovery are parts of linear systems that are already in echelon form; see (5). Especially when we transmit over a channel with a statistic that implies that it is more likely to get erasures in the $y_i$ than in the $u_i$, this is of very big advantage, as one can obtain any erased component of any $y_i$ (that has the possibility to be recovered) directly from (5) with very small computational effort.

Finally, as we already observed in our example, the use of the terminating equations in step 12 of the algorithm can make it possible to obtain symbols that were not even sent yet, i.e. in some sense we are able to "look into the future" and terminate the decoding before the end of the transmission. This is of course an additional considerable reduction of the decoding delay.
6. Conclusion
In this paper, we presented an erasure decoding algorithm for convolutional codes employing their linear systems representation. We observed that this algorithm is able to reduce the decoding delay and the computational effort in comparison with previous algorithms.
Acknowledgments
The authors acknowledge the support of Swiss National Science Foundation grant n. 188430. Julia Lieb also acknowledges the support of the German Research Foundation grant LI 3101/1-1.