GRADE-AO: Towards Near-Optimal Spatially-Coupled Codes With High Memories
Siyi Yang, Ahmed Hareedy, Shyam Venkatasubramanian, Robert Calderbank, Lara Dolecek
Siyi Yang∗, Ahmed Hareedy†, Shyam Venkatasubramanian∗, Robert Calderbank†, and Lara Dolecek∗
∗Electrical and Computer Engineering Department, University of California, Los Angeles, Los Angeles, CA 90095 USA
†Electrical and Computer Engineering Department, Duke University, Durham, NC 27708 USA
[email protected], [email protected], [email protected], [email protected], and [email protected]
Abstract—Spatially-coupled (SC) codes, known for their threshold saturation phenomenon and low-latency windowed decoding algorithms, are ideal for streaming applications. They also find application in various data storage systems because of their excellent performance. SC codes are constructed by partitioning an underlying block code, followed by rearranging and concatenating the partitioned components in a "convolutional" manner. The number of partitioned components determines the "memory" of SC codes. While adopting higher memories results in improved SC code performance, obtaining optimal SC codes with high memory is known to be hard. In this paper, we investigate the relation between the performance of SC codes and the density distribution of partitioning matrices. We propose a probabilistic framework that obtains (locally) optimal density distributions via gradient descent. Starting from random partitioning matrices abiding by the obtained distribution, we perform low-complexity optimization algorithms over the cycle properties to construct high-memory, high-performance quasi-cyclic SC codes. Simulation results show that codes obtained through our proposed method notably outperform state-of-the-art SC codes with the same constraint length and codes with uniform partitioning.
I. INTRODUCTION
Spatially-coupled (SC) codes, also known as low-density parity-check (LDPC) codes with convolutional structures, are an ideal choice for streaming applications and storage devices thanks to their threshold saturation phenomenon [1]–[5] and amenability to low-latency windowed decoding [6]. SC codes are constructed by partitioning the parity-check matrix of an underlying block code, followed by rearranging the component matrices in a "convolutional" manner. In particular, component matrices are concatenated into a "replica", and then multiple replicas are placed together, resulting in a "coupled" code. The number of component matrices minus one is referred to as the "memory" of the SC code [7]–[10].
It is known that the performance of an SC code improves as its memory increases. This is a byproduct of improved node expansion and additional degrees of freedom that can be utilized to decrease the number of small cycles and detrimental objects [8], [9], [11], [12]. Although the optimization problem of designing SC codes with low memories has been efficiently solved [8], [9], there remains a vacuum in efficient algorithms that construct good enough SC codes with high memories. Esfahanizadeh et al. [8] proposed a combinatorial framework to develop optimal quasi-cyclic (QC) SC codes, comprising the so-called optimal overlap (OO) step to search for the optimal partitioning matrices, and circulant power optimization (CPO) to optimize the lifting parameters; this framework was extended by Hareedy et al. [9]. However, this method is hard to execute in practice for high-memory codes due to the increasing computational complexity. Battaglioni et al. developed an algorithmic method that searches for good SC codes with high memories [13]. However, high-memory codes designed by purely algorithmic methods are unable to offer strict guarantees on performance superiority; several of these codes can even be beaten by optimally designed QC-SC codes with lower memories under the same constraint length.
Therefore, a method that theoretically identifies an avenue to a near-optimal construction of SC codes with high memories is of significant interest.
In a way similar in spirit to random coding, our objective is to obtain near-optimal solutions starting from a random partitioning matrix, where the density distribution of component matrices (i.e., the edge distribution) is analogous to the degree distribution in random coding. While discrete optimization methods [8], [9] have been shown to suffer from exponential growth in complexity, fueled by the increase in degrees of freedom, we adopt a more efficient, probabilistic framework that searches for the optimal edge distribution via gradient descent, referred to as the gradient-descent distributor (GRADE), followed by an algorithmic optimizer (AO) that obtains a locally optimal partition near a random partition with the edge distribution obtained from GRADE. The current goal is still to minimize the number of small cycles, which reduces undesirable dependencies, and thus improves the code performance. The impact of this probabilistic method extends beyond its performance gains and low complexity. Particularly in the error floor region, a more advanced set of detrimental objects (absorbing sets [11]) governs the LDPC code performance. Our probabilistic method also has high potential to be extended to handle detrimental objects specified by the channel.
In this paper, we propose a probabilistic framework that efficiently searches for near-optimal SC codes with high memories. In Section II, we introduce preliminaries of SC codes and the performance-related metrics. In Section III, we develop the theoretical basis of GRADE, which derives a locally optimal edge distribution from an arbitrarily provided initial distribution and conditions. In Section IV, we introduce two examples of GRADE-AO that result in near-optimal SC codes: the so-called gradient descent (GD) codes and topologically-coupled (TC) codes.
Our proposed framework is validated in Section V by simulation results of four groups of codes, with the best code in each group obtained from GRADE-AO. Finally, we make concluding remarks and introduce possible future work in Section VI.

II. PRELIMINARIES
In this section, we recall the typical construction of SC codes with QC structure. Any QC code with a parity-check matrix H is obtained by replacing each nonzero (zero) entry of some binary matrix H^P with a circulant (zero) matrix of size z, z ∈ N. The matrix H^P and z are referred to as the protograph and the circulant size of the code, respectively. In particular, the protograph H^P_SC of an SC code has a convolutional structure composed of L replicas, as presented in Fig. 1. Each replica is obtained by stacking the disjoint component matrices {H_i^P}_{i=0}^{m}, where m is the memory and Π = H_0^P + H_1^P + · · · + H_m^P is the protograph of the underlying block code.

Fig. 1. Cycles in the protograph (right panel) and their corresponding structures in the partitioning matrices (left panel).

In this paper, we constrain Π to be an all-one matrix of size γ × κ, γ, κ ∈ N. An SC code is then uniquely represented by its partitioning matrix P and lifting matrix L, where P and L are both γ × κ matrices. The matrix P has (P)_{i,j} = a if (H_a^P)_{i,j} = 1. The matrix L is determined by replacing each circulant matrix by its associated exponent. Here, this exponent represents the power to which the matrix σ defined by (σ)_{i,i+1} = 1 is raised, where (σ)_{z,z+1} = (σ)_{z,1}, i.e., column indices are taken modulo z.
The performance of finite-length LDPC codes is strongly affected by the number of detrimental objects, which are subgraphs with certain structures in the Tanner graphs of those codes. Two major classes of detrimental objects are trapping sets and absorbing sets. Since enumerating and minimizing the number of detrimental objects is complicated, existing work typically focuses on common substructures of these objects: the small cycles [8], [9], [13]. A cycle-g candidate in H^P_SC (Π) is a path traversing a structure that generates cycles of length g after lifting (partitioning) [9].
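To make the construction above concrete, here is a minimal sketch (ours, not the authors' code; the function names are hypothetical) that builds the SC protograph H^P_SC from a partitioning matrix P for a given number of replicas, and forms the circulant σ^power used in lifting:

```python
def sc_protograph(P, m, num_replicas):
    # P: gamma x kappa partitioning matrix; entry a means the edge of the
    # all-one base matrix at (i, j) is placed in component matrix H_a^P.
    gamma, kappa = len(P), len(P[0])
    rows, cols = (num_replicas + m) * gamma, num_replicas * kappa
    H = [[0] * cols for _ in range(rows)]
    for r in range(num_replicas):            # replica r occupies one column block
        for i in range(gamma):
            for j in range(kappa):
                a = P[i][j]                  # component index of this edge
                H[(r + a) * gamma + i][r * kappa + j] = 1
    return H

def lift_entry(power, z):
    # z x z circulant sigma^power, where (sigma)_{i, i+1 mod z} = 1,
    # used to replace each nonzero protograph entry during lifting.
    return [[1 if (i + power) % z == jj else 0 for jj in range(z)]
            for i in range(z)]
```

Each column of a replica receives exactly γ ones (one per row of the base matrix), while the component index shifts the row block downward, producing the "convolutional" staircase.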
In an SC code, each cycle in the Tanner graph corresponds to a cycle candidate in the protograph H^P_SC, and each cycle candidate in H^P_SC corresponds to a cycle candidate C in the base matrix Π. Lemma 1 specifies a necessary and sufficient condition for a cycle candidate in Π to become a cycle candidate in the protograph and then a cycle in the final Tanner graph.

Lemma 1.
Let C be a cycle-2g candidate in the base matrix, where g ∈ N, g ≥ 2. Denote C by (j_1, i_1, j_2, i_2, ..., j_g, i_g), where (i_k, j_k), (i_k, j_{k+1}), 1 ≤ k ≤ g, j_{g+1} = j_1, are nodes of C in Π, P, and L. Then, C becomes a cycle candidate in the protograph if and only if the following condition holds [14]:

Σ_{k=1}^{g} P(i_k, j_k) = Σ_{k=1}^{g} P(i_k, j_{k+1}).  (1)

This cycle candidate becomes a cycle in the Tanner graph if and only if:

Σ_{k=1}^{g} L(i_k, j_k) ≡ Σ_{k=1}^{g} L(i_k, j_{k+1}) mod z.  (2)

As shown in Fig. 1, a cycle-4 candidate and a cycle-6 candidate in the partitioning matrix with assignments satisfying condition (1), and their corresponding cycle candidates in the protograph, are marked by red and blue, respectively. An optimization of a QC-SC code is typically divided into two major steps: optimizing P to minimize the number of cycle candidates in the protograph, and optimizing L to further reduce that number in the Tanner graph given the optimized P [8], [9]. The latter goal has been achieved in [8] and [9], using an algorithmic method called circulant power optimization (CPO), while the former goal is yet to be achieved for large m. We note that the step separation highlighted above notably reduces the overall optimization complexity.
In the remainder of this paper, we focus on SC codes for the additive white Gaussian noise (AWGN) channel, where the most detrimental objects are the low-weight absorbing sets [8]. Consequently, a simplified optimization focuses on cycle candidates of lengths 4, 6, and 8 [8], [9]. Existing literature shows that the optimal P for an SC code with low memory typically has a balanced (uniform) edge distribution among component matrices [8]. However, in the remaining sections, we show that the edge distribution for optimal SC codes with large m is not uniform, and we propose the GRADE-AO framework that explores a locally optimal solution.

III. A PROBABILISTIC OPTIMIZATION FRAMEWORK
In this section, we present a probabilistic framework that searches for a locally optimal edge distribution for the partitioning matrices of SC codes with given memories through the gradient-descent algorithm.
Definition 1.
Let γ, κ, m, m_t ∈ N and a = (a_0, a_1, ..., a_{m_t}), where a_0 < a_1 < · · · < a_{m_t} = m. A (γ, κ) SC code with memory m is said to have coupling pattern a if and only if H_i^P ≠ 0_{γ×κ} for all i ∈ {a_0, a_1, ..., a_{m_t}}, and H_i^P = 0_{γ×κ} otherwise. The value m_t is called the pseudo-memory of the SC code.

A. Probabilistic Metric

In this subsection, we define metrics linking the edge distribution with the expected number of cycle candidates in the protograph in Theorem 1 and Theorem 2. While Schmalen et al. have shown in [15] that nonuniform coupling (nonuniform edge distribution in our paper) yields an improved threshold, our work differs in two areas: 1) explicit optimal coupling graphs were exhaustively searched and were restricted to small memories in [15], whereas our method produces near-optimal SC protographs for arbitrary memories; 2) the work in [15] focused on the asymptotic analysis for the threshold region, while our framework is dedicated to the finite-length construction and has additional demonstrable gains in the error floor region.
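As a small illustration of Definition 1 (a hypothetical helper of our own, not from the paper), the coupling pattern, pseudo-memory, and memory can be read off the multiset of entries of a partitioning matrix:

```python
def coupling_pattern(P_entries):
    # P_entries: iterable of all entries of the partitioning matrix P.
    # The coupling pattern a lists, in increasing order, the indices of
    # the nonzero component matrices; the pseudo-memory is m_t = |a| - 1.
    a = sorted(set(P_entries))
    m_t = len(a) - 1
    m = a[-1]          # memory: largest component index used, a_{m_t} = m
    return a, m_t, m
```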
Definition 2.
Let m, m_t ∈ N and a = (a_0, a_1, ..., a_{m_t}), where a_0 < a_1 < · · · < a_{m_t} = m. Let p = (p_0, p_1, ..., p_{m_t}), where 0 < p_i ≤ 1 and p_0 + p_1 + · · · + p_{m_t} = 1. Then, the following f(X; a, p), which is abbreviated to f(X) when the context is clear, is called the coupling polynomial of an SC code with coupling pattern a, associated with probability distribution p:

f(X; a, p) ≜ Σ_{0 ≤ i ≤ m_t} p_i X^{a_i}.  (3)

Fig. 2. Structures and cycle candidates for cycle-8.

Theorem 1.
Let [·]_i denote the coefficient of X^i of a polynomial. Denote by P(a, p) the probability of a cycle-6 candidate in the base matrix becoming a cycle-6 candidate in the protograph under random partitioning with edge distribution p. Then,

P(a, p) = [f(X)^3 f(X^{-1})^3]_0.  (4)

Proof.
According to Lemma 1, suppose the cycle-6 candidate in the base matrix is represented by C(j_1, i_1, j_2, i_2, j_3, i_3). Then,

P(a, p) = P[Σ_{k=1}^{3} P(i_k, j_k) = Σ_{k=1}^{3} P(i_k, j_{k+1})]
= Σ_{x_1+x_2+x_3 = y_1+y_2+y_3} P[P(i_k, j_k) = x_k, P(i_k, j_{k+1}) = y_k, 1 ≤ k ≤ 3]
= Σ_{x_1+x_2+x_3 = y_1+y_2+y_3} p_{x_1} p_{x_2} p_{x_3} p_{y_1} p_{y_2} p_{y_3}
= [Σ_{x_k, y_k ∈ supp(a)} p_{x_1} p_{x_2} p_{x_3} p_{y_1} p_{y_2} p_{y_3} X^{x_1+x_2+x_3−y_1−y_2−y_3}]_0
= [f(X)^3 f(X^{-1})^3]_0.

The theorem is proved. ∎
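Theorem 1 can be checked numerically by extracting the constant term of f(X)^3 f(X^{-1})^3 with plain polynomial convolutions; the sketch below is ours (hypothetical function names), with f(X^{-1}) represented by the reversed coefficient vector:

```python
def poly_mul(a, b):
    # multiply two polynomials given as coefficient lists (lowest degree first)
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def cycle6_prob(a, p):
    # a: coupling pattern, p: edge distribution; returns [f(X)^3 f(1/X)^3]_0.
    m = a[-1]
    f = [0.0] * (m + 1)
    for ai, pi in zip(a, p):
        f[ai] = pi                  # coefficients of f(X; a, p)
    r = f[::-1]                     # r(X) = X^m f(1/X)
    prod = [1.0]
    for _ in range(3):
        prod = poly_mul(prod, f)
    for _ in range(3):
        prod = poly_mul(prod, r)
    return prod[3 * m]              # constant term of f(X)^3 f(1/X)^3
```

For instance, the uniform distribution with m = 2 gives 141/729 ≈ 0.1934, while p = (2/5, 1/5, 2/5) gives 0.1818; the constant term is also invariant under stretching the pattern, e.g., a = (0, 2, 4) versus a = (0, 1, 2) with the same p.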
Example 1.
Consider SC codes with full memories and uniform partition, i.e., a = (0, 1, ..., m) and p = (1/(m+1)) 1_{m+1}. When m = 2, P(a, p) ≈ 0.1934; when m = 4, P(a, p) ≈ 0.1121.

Example 2.
First, consider SC codes with m = m_t = 2. Let a = (0, 1, 2) and p = (2/5, 1/5, 2/5). According to Theorem 1, f(X) = (2 + X + 2X^2)/5, and f(X)^3 f(X^{-1})^3 = 0.0041(X^6 + X^{-6}) + 0.0123(X^5 + X^{-5}) + 0.0399(X^4 + X^{-4}) + 0.0717(X^3 + X^{-3}) + 0.1267(X^2 + X^{-2}) + 0.1544(X + X^{-1}) + 0.1818. Therefore, P(a, p) = 0.1818, which is below the uniform value in Example 1. Second, consider SC codes with m = m_t = 4. Let a = (0, 1, 2, 3, 4) and, for instance, p = (0.3, 0.1, 0.2, 0.1, 0.3). According to Theorem 1, P(a, p) = 0.1022, again below the uniform value in Example 1.

After we have derived the metric for cycle-6 candidates in the protograph, we now turn to the case of cycle-8 candidates. As shown in Fig. 2, cycle candidates in the base matrix that result in cycle-8 candidates in the protograph can be categorized into 6 different structures, labeled S_1, ..., S_6. Different cases are differentiated by the number of rows and columns (without order) the structures span in the partitioning matrix [9]. Specifically, S_1, ..., S_6 denote the structures that span submatrices of size 4×4, 4×3 or 3×4, 3×3, 4×2 or 2×4, 3×2 or 2×3, and 2×2, respectively. Any structure other than S_6 can contain multiple distinct cycle-8 candidates, and these distinct candidates are marked by blue in Fig. 2.

Lemma 2.
Denote by P_i(a, p), 1 ≤ i ≤ 6, the average probability of a cycle-8 candidate of structure S_i in the base matrix becoming a cycle-8 candidate in the protograph, under random partition with edge distribution p. Then,

P_6(a, p) = [f(X^2)^2 f(X^{-2})^2]_0,
P_5(a, p) = [f(X^2) f(X^{-2}) f(X)^2 f(X^{-1})^2]_0,
P_3(a, p) = [f(X^2) f(X)^2 f(X^{-1})^4]_0, and
P_1(a, p) = P_2(a, p) = P_4(a, p) = [f(X)^4 f(X^{-1})^4]_0.

Proof.
For patterns where the nodes of the cycle-8 candidates are pairwise different, namely, S_1, S_2, and S_4, the result can be derived by following the logic in the proof of Theorem 1.
For S_6, suppose the indices of the rows and columns are i_1, i_2, and j_1, j_2, respectively. Then, the cycle condition in Lemma 1 is 2P(i_1, j_1) + 2P(i_2, j_2) = 2P(i_1, j_2) + 2P(i_2, j_1).
For S_5, suppose the indices of the rows and columns are i_1, i_2, and j_1, j_2, j_3, respectively. Then, the cycle condition in Lemma 1 is 2P(i_1, j_1) − 2P(i_2, j_1) + P(i_2, j_2) + P(i_2, j_3) − P(i_1, j_2) − P(i_1, j_3) = 0.
For S_3, suppose the indices of the rows and columns are i_1, i_2, i_3, and j_1, j_2, j_3, respectively. Then, the cycle condition in Lemma 1 is 2P(i_1, j_2) + P(i_2, j_3) + P(i_3, j_1) − P(i_1, j_1) − P(i_2, j_2) − P(i_1, j_3) − P(i_3, j_2) = 0.
Following the logic in the proof of Theorem 1, the cases for S_3, S_5, and S_6 can be proved. ∎

Theorem 2.
Denote by N(a, p) the expectation of the number of cycle-8 candidates in the protograph. Then,

N(a, p) = w_1 [f(X^2)^2 f(X^{-2})^2]_0 + w_2 [f(X^2) f(X^{-2}) f(X)^2 f(X^{-1})^2]_0 + w_3 [f(X^2) f(X)^2 f(X^{-1})^4]_0 + w_4 [f(X)^4 f(X^{-1})^4]_0,  (5)

where, with C(n, k) denoting the binomial coefficient, w_1 = C(γ,2)C(κ,2), w_2 = 3C(γ,2)C(κ,3) + 3C(γ,3)C(κ,2), w_3 = 18C(γ,3)C(κ,3), and w_4 = 6C(γ,2)C(κ,4) + 6C(γ,4)C(κ,2) + 36C(γ,3)C(κ,4) + 36C(γ,4)C(κ,3) + 24C(γ,4)C(κ,4). Here, w_1, w_2, and w_3 count the candidates of structures S_6, S_5, and S_3, respectively, and w_4 aggregates S_1, S_2, and S_4.

Proof. Provided the results in Lemma 2, we just need to prove that the numbers of cycle-8 candidates of structures S_1, S_2, ..., S_6 in a γ × κ base matrix are 24C(γ,4)C(κ,4), 36C(γ,4)C(κ,3) + 36C(γ,3)C(κ,4), 18C(γ,3)C(κ,3), 6C(γ,4)C(κ,2) + 6C(γ,2)C(κ,4), 3C(γ,3)C(κ,2) + 3C(γ,2)C(κ,3), and C(γ,2)C(κ,2), respectively.
Take i = 5 as an example. The number of cycle-8 candidates of structure S_5 in any 3 × 2 or 2 × 3 submatrix is 3, one for each choice of the line (row or column) that is traversed twice. The total number of 3 × 2 or 2 × 3 submatrices in a γ × κ base matrix is C(γ,3)C(κ,2) + C(γ,2)C(κ,3). Therefore, the total number of candidates of pattern S_5 is 3C(γ,3)C(κ,2) + 3C(γ,2)C(κ,3). By a similar logic, we can prove the result for the remaining patterns. ∎
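Under our reconstruction of the weights in Theorem 2, the expected cycle-8 count can be evaluated with Laurent polynomials stored as exponent-to-coefficient dictionaries (a sketch with hypothetical names, not the authors' code):

```python
from collections import defaultdict
from math import comb

def lmul(a, b):
    # product of two Laurent polynomials {exponent: coefficient}
    out = defaultdict(float)
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] += x * y
    return dict(out)

def coef0(factors):
    # constant term of a product of Laurent polynomials
    prod = {0: 1.0}
    for fac in factors:
        prod = lmul(prod, fac)
    return prod.get(0, 0.0)

def expected_cycle8(a, p, gamma, kappa):
    # f(X^s) as a dict; negative s gives f(X^{-s})
    f = lambda s: {s * ai: pi for ai, pi in zip(a, p)}
    P6 = coef0([f(2), f(2), f(-2), f(-2)])
    P5 = coef0([f(2), f(-2), f(1), f(1), f(-1), f(-1)])
    P3 = coef0([f(2), f(1), f(1), f(-1), f(-1), f(-1), f(-1)])
    P124 = coef0([f(1)] * 4 + [f(-1)] * 4)
    w1 = comb(gamma, 2) * comb(kappa, 2)
    w2 = 3 * comb(gamma, 2) * comb(kappa, 3) + 3 * comb(gamma, 3) * comb(kappa, 2)
    w3 = 18 * comb(gamma, 3) * comb(kappa, 3)
    w4 = (6 * comb(gamma, 2) * comb(kappa, 4) + 6 * comb(gamma, 4) * comb(kappa, 2)
          + 36 * comb(gamma, 3) * comb(kappa, 4) + 36 * comb(gamma, 4) * comb(kappa, 3)
          + 24 * comb(gamma, 4) * comb(kappa, 4))
    return w1 * P6 + w2 * P5 + w3 * P3 + w4 * P124
```

With a single component matrix (all mass on one index), every candidate survives, so N reduces to the total number of cycle-8 candidates.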
B. Gradient Descent Distributor

By contrasting Example 1 with Example 2, it is clear that for a given coupling pattern, an optimal edge distribution is not necessarily reached by a uniform partition. In this subsection, we develop an algorithm that obtains a locally optimal distribution by gradient descent.

Lemma 3.
Given m_t ∈ N and a = (a_0, a_1, ..., a_{m_t}), a necessary condition for P(a, p) to reach its minimum value is that the following equation holds for some c ∈ R:

[f(X)^3 f(X^{-1})^2]_{a_i} = c, ∀i, 0 ≤ i ≤ m_t.  (6)

Proof.
Consider the gradient of L(a, p) = P(a, p) + c_0(1 − p_0 − p_1 − · · · − p_{m_t}):

∇_p L(a, p) = ∇_p [f(X)^3 f(X^{-1})^3]_0 − c_0 1_{m_t+1}
= [∇_p (f(X)^3 f(X^{-1})^3)]_0 − c_0 1_{m_t+1}
= 3[f(X)^3 f(X^{-1})^2 ∇_p f(X^{-1})]_0 + 3[f(X)^2 f(X^{-1})^3 ∇_p f(X)]_0 − c_0 1_{m_t+1}
= 6[f(X)^3 f(X^{-1})^2 (X^{-a_0}, X^{-a_1}, ..., X^{-a_{m_t}})]_0 − c_0 1_{m_t+1}.  (7)

When P(a, p) reaches its minimum, ∇_p[L(a, p)] = 0_{m_t+1}, which is equivalent to (6) by defining c = c_0/6. ∎

Lemma 4.
Given γ, κ, m_t ∈ N and a = (a_0, a_1, ..., a_{m_t}), a necessary condition for N(a, p) to reach its minimum value is that the following equation holds for some c ∈ R: for all i, 0 ≤ i ≤ m_t,

[f(X^2)^2 f(X^{-2})]_{2a_i} + (¯w_1/2) [f(X^2) f(X)^2 f(X^{-1})^2]_{2a_i} + ¯w_1 [f(X^2) f(X^{-2}) f(X)^2 f(X^{-1})]_{a_i} + (¯w_2/4) [f(X)^2 f(X^{-1})^4]_{−2a_i} + (¯w_2/2) [f(X^2) f(X) f(X^{-1})^4]_{−a_i} + ¯w_2 [f(X^2) f(X)^2 f(X^{-1})^3]_{a_i} + 2¯w_3 [f(X)^4 f(X^{-1})^3]_{a_i} = c,

where ¯w_1 = w_2/w_1 = γ + κ − 4, ¯w_2 = w_3/w_1 = 2(γ − 2)(κ − 2), and ¯w_3 = w_4/w_1 = [(γ − 2)(γ − 3) + (κ − 2)(κ − 3)]/2 + (γ − 2)(κ − 2)(γ + κ − 6) + (γ − 2)(γ − 3)(κ − 2)(κ − 3)/6.

Proof. Consider the gradient of L(a, p) = N(a, p) + c_0(1 − p_0 − p_1 − · · · − p_{m_t}). Differentiating each term of (5) as in the proof of Lemma 3, and using [h(X)]_{−s} = [h(X^{−1})]_s, the i-th component of ∇_p N(a, p) is

4w_1 [f(X^2)^2 f(X^{-2})]_{2a_i} + 2w_2 [f(X^2) f(X)^2 f(X^{-1})^2]_{2a_i} + 4w_2 [f(X^2) f(X^{-2}) f(X)^2 f(X^{-1})]_{a_i} + w_3 [f(X)^2 f(X^{-1})^4]_{−2a_i} + 2w_3 [f(X^2) f(X) f(X^{-1})^4]_{−a_i} + 4w_3 [f(X^2) f(X)^2 f(X^{-1})^3]_{a_i} + 8w_4 [f(X)^4 f(X^{-1})^3]_{a_i}.  (8)

When N(a, p) reaches its minimum, ∇_p[L(a, p)] = 0_{m_t+1}; dividing (8) by 4w_1 yields the stated condition with c = c_0/(4w_1). ∎

Based on Lemma 3 and Lemma 4, we adopt the gradient descent algorithm to obtain a locally optimal edge distribution for SC codes with coupling pattern a, starting from the uniform distribution, as presented in Algorithm 1. Note that conv(·) and inverse(·) refer to the convolution and the reversal of vectors, respectively.
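The descent step described above can be sketched as follows; this is our simplification (hypothetical names) that minimizes only the cycle-6 objective of Theorem 1, using the gradient of Lemma 3 and the mean-centering projection that keeps Σ p_i = 1:

```python
def poly_mul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def grade_cycle6(a, alpha=0.005, iters=400, floor=1e-6):
    m, mt = a[-1], len(a) - 1
    p = [1.0 / (mt + 1)] * (mt + 1)          # uniform initialization
    for _ in range(iters):
        f = [0.0] * (m + 1)
        for ai, pi in zip(a, p):
            f[ai] = pi
        r = f[::-1]                           # X^m f(1/X)
        f3 = poly_mul(poly_mul(f, f), f)      # f(X)^3
        f3r2 = poly_mul(f3, poly_mul(r, r))   # f(X)^3 * X^{2m} f(1/X)^2
        # Lemma 3: gradient component i is 6 [f(X)^3 f(1/X)^2]_{a_i}
        g = [6.0 * f3r2[2 * m + ai] for ai in a]
        mean = sum(g) / len(g)
        g = [gi - mean for gi in g]           # centered: step keeps sum(p) = 1
        norm = sum(gi * gi for gi in g) ** 0.5 or 1.0
        p = [max(pi - alpha * gi / norm, floor) for pi, gi in zip(p, g)]
        s = sum(p)
        p = [pi / s for pi in p]              # renormalize after clipping
    return p
```

For a = (0, 1, 2), the returned distribution is skewed toward the endpoints and achieves a cycle-6 probability below the uniform value 141/729, consistent with Example 2.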
Algorithm 1 Gradient-Descent Distributor (GRADE)
Inputs and Parameters:
γ, κ, m_t, m, a: parameters of the SC code;
w: weight of each cycle-6 candidate;
ε, α: accuracy and step size of gradient descent;
Outputs and Intermediate Variables:
p: a locally optimal edge distribution over supp(a);
1: obtain ¯w_1, ¯w_2, ¯w_3 as in Lemma 4; v_prev ← ∞;
2: p ← (1/(m_t + 1)) 1_{m_t+1};
3: f ← coefficient vector of f(X; a, p); ¯f ← inverse(f);
4: using conv(·) over f and ¯f, compute the objective v_cur ← w·P(a, p) + N(a, p) via Theorem 1 and Theorem 2 (the constant number of cycle-6 candidates in the base matrix is absorbed into w);
5: using conv(·) over f and ¯f, compute the gradient g of w·P(a, p) + N(a, p) with respect to p via Lemma 3 and Lemma 4;
6: g ← g − mean(g);
7: p ← p − α g/||g||_2;
8: if |v_prev − v_cur| > ε then v_prev ← v_cur; goto step 3;
9: return p;

IV. CONSTRUCTION
In this section, we present two algorithmic optimization methods based on GRADE to obtain locally optimal SC codes with a fixed coupling pattern.
A. Gradient Descent Codes
In this subsection, we consider SC codes with full memories, i.e., m = m_t and a = (0, 1, ..., m). In this case, our proposed GRADE algorithm obtains a highly skewed edge distribution. Starting from a random partitioning matrix P with the derived distribution, one can perform a semi-greedy algorithm that searches for the locally optimal partitioning matrix near the initial P. Constraining the search space to contain P's that have distributions within small L_1 and L_∞ distances from that of the original P, and adopting the CPO next, significantly reduces the computational complexity of finding a strong high-memory code. This procedure is given in Algorithm 2.
We refer to codes obtained from Algorithm 2 as gradient descent (GD) codes. By replacing the input distribution p in Algorithm 2 with the uniform distribution, we obtain the so-called uniform (UNF) codes. We show in Section V by simulation that the distribution obtained by GRADE results in constructions that are better than those adopting the uniform distribution and those in the existing literature.
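The semi-greedy search can be sketched as follows. This is our much-simplified illustration (hypothetical names, not the authors' Algorithm 2): it drops the d_1, d_2 distance constraints and the CPO, and greedily re-assigns single entries of P to reduce the number of cycle-6 candidates, counted by brute force via condition (1):

```python
import random
from itertools import combinations, permutations

def count_cycle6(P, gamma, kappa):
    # count cycle-6 candidates of the base matrix satisfying condition (1)
    count = 0
    for rows in combinations(range(gamma), 3):
        for cols in combinations(range(kappa), 3):
            for cperm in permutations(cols):
                for rperm in (rows, (rows[0], rows[2], rows[1])):
                    left = sum(P[rperm[k]][cperm[k]] for k in range(3))
                    right = sum(P[rperm[k]][cperm[(k + 1) % 3]] for k in range(3))
                    if left == right:
                        count += 1
    return count // 2        # each candidate is checked twice (reflection)

def greedy_ao(a, p, gamma, kappa, seed=0):
    rng = random.Random(seed)
    # random initial P with edge distribution p over the coupling pattern a
    P = [[rng.choices(a, weights=p)[0] for _ in range(kappa)]
         for _ in range(gamma)]
    best = count_cycle6(P, gamma, kappa)
    improved = True
    while improved:
        improved = False
        for i in range(gamma):
            for j in range(kappa):
                old = P[i][j]
                for v in a:
                    if v == old:
                        continue
                    P[i][j] = v              # try re-assigning this entry
                    t = count_cycle6(P, gamma, kappa)
                    if t < best:
                        best, old, improved = t, v, True
                P[i][j] = old                # keep the best value found
    return P, best
```

The sweep repeats until no single-entry change improves the count, i.e., a locally optimal P is reached.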
Algorithm 2 GRADE-Aided Optimizer (AO)
Inputs and Parameters:
γ, κ, m: parameters of an SC code with full memory;
p: edge distribution obtained from Algorithm 1;
d_1, d_2: parameters indicating the size of the searching space;
Outputs and Intermediate Variables:
P: a locally optimal partitioning matrix;
1: obtain the lists L_6(i, j), L_8(i, j) of cycle-6 candidates and cycle-8 candidates in the base matrix that contain node (i, j), 1 ≤ i ≤ γ, 1 ≤ j ≤ κ;
2: obtain u = argmin_{x ∈ N^{m+1}, ||x||_1 = γκ} ||x/(γκ) − p||_2;
3: for i ∈ {0, 1, ..., m} do place u[i + 1] copies of i into P randomly;
4: d ← 0_{m+1};
5: for i ∈ {1, ..., γ}, j ∈ {1, ..., κ} do
6:   noptimal ← False; n_6, n_8 ← numbers of candidates in L_6(i, j), L_8(i, j) satisfying (1); n ← w·n_6 + n_8;
7:   for v ∈ {0, 1, ..., m} do
8:     d′ ← d; d′[v + 1] ← d′[v + 1] + 1; p_0 ← P(i, j);
9:     if ||d′||_1 ≤ d_1 and ||d′||_∞ ≤ d_2 then
10:      P(i, j) ← v; t_6, t_8 ← numbers of candidates in L_6(i, j), L_8(i, j) satisfying (1); t ← w·t_6 + t_8;
11:      if t < n then noptimal ← True; n ← t; d ← d′; else P(i, j) ← p_0;
12: if noptimal then goto step 5;
13: return P;

B. Topologically-Coupled Codes
In this subsection, we explore SC codes with pseudo-memory m_t such that m_t < m and a ≠ (0, 1, ..., m). The motivation behind this task is to construct an SC code with memory m with the same computational complexity needed to construct a memory-m_t code, where m_t < m. Given m_t and m, we first find the optimal a with length m_t + 1 in a brute-force manner. Taking m = 4 and m_t = 2 as an example, the optimal coupling pattern is reached by a = (0, 1, 4), and the corresponding optimal distribution is almost uniform.
Given the optimal coupling pattern a, we then obtain an optimal partitioning matrix by the OO method proposed in [8] and [9]. We extend the OO method for memory-m_t SC codes to any SC code with pseudo-memory m_t and memory m, which does not increase the complexity of the approach. We refer to the codes obtained from the extended OO method followed by the CPO as topologically-coupled (TC) codes.
Optimal TC codes with pseudo-memory m_t have strictly fewer cycle candidates in their protographs than optimal SC codes with full memory m = m_t. Take m = 4 and m_t = 2 as an example. Suppose the optimal SC code has the partition Π = H_0^P + H_1^P + H_2^P. Consider the TC code with partition Π = H_0^P + H_1^P + H_4^P such that H_4^P = H_2^P. Then, any cycle-6 candidate resulting from a cycle candidate in the base matrix whose assignment satisfies (1) only through the cancellation 1 + 1 = 2 + 0, i.e., with the two sides of (1) assigned {1, 1, x} and {2, 0, x}, x ∈ {0, 1, 2}, or {2, 1, 1} and {2, 2, 0}, no longer has a counterpart in the TC code, since after replacing 2's with 4's these assignments no longer satisfy the cycle condition in Lemma 1. Moreover, there exists a bijection between the remaining candidates in the SC code and all candidates in the TC code through the replacement of 2's with 4's. Therefore, TC codes are better (have fewer cycles) than SC codes with the same circulant size. In Section V, we present simulation results of such codes and show that they can also outperform SC codes with the same constraint length (larger circulant size).

TABLE I
STATISTICS OF THE NUMBER OF CYCLES

(γ, κ)   | Code                           | Cycles-6 | Cycles-8
(3, 7)   | GD                             | 0        | 0
(3, 7)   | UNF                            | 0        | 6292
(3, 17)  | GD                             | 0        | 397880
(3, 17)  | UNF                            | 0        | 559902
(3, 17)  | Battaglioni et al. [13]        | 0        | 451337
(4, 29)  | GD                             | 0        | 528090
(4, 29)  | UNF                            | 0        | 1087268
(4, 17)  | TC                             | 15436    | -
(4, 17)  | SC (matched constraint length) | 19180    | -
(4, 17)  | SC (matched circulant size)    | 74579    | -

V. SIMULATION RESULTS
In this section, we obtain the frame error rate (FER) curves of four groups of SC codes designed by the GRADE-AO methods presented in Section IV. We show that codes constructed by the GRADE-AO methods offer significant performance gains compared with codes with uniform edge distributions and codes constructed through purely algorithmic methods.
Out of these three plots, the left and center ones compare GD codes with UNF codes designed as in Section IV-A. The right plot compares a TC code designed as in Section IV-B with optimal SC codes constructed through the OO-CPO method proposed in [8]. The GD/UNF codes have parameters (γ, κ) = (3, 7), (3, 17), and (4, 29), respectively. The TC code has parameters (γ, κ, m_t, z) = (4, 17, 2, 17) with coupling pattern a = (0, 1, 4). For a fair comparison, we have selected two SC codes: one with a similar constraint length (m + 1)z and the other with an identical circulant size z. To ensure that the SC codes and the TC code have close rates and code lengths, the two SC codes have memory m = 2 with circulant sizes z = 28 and z = 17, respectively. The statistics regarding the number of cycles of each code are presented in Table I.
Fig. 5 shows FER curves of our GD/UNF comparisons with (γ, κ) = (3, 7) and (3, 17). The partitioning matrices and the lifting matrices of the codes are specified in Figs. 3-4. When γ = 3, cycles-6 are easily removed by the CPO. Therefore, we perform joint optimization on the number of cycle-6 and cycle-8 candidates by assigning different weights to cycle candidates in Algorithm 2. We observe a performance gain for the GD code with respect to the UNF code in both the waterfall region and the error floor region. Moreover, the number of cycles-8 in the (3, 17) GD code is reduced by 28.9% and 11.8% compared with the UNF code and the code constructed by Battaglioni et al. in [13], respectively. In addition, the (3, 17) GD code has none of the low-weight absorbing sets (ASs) of the two smallest weights observed in the UNF code. As for the (3, 7) GD code, all cycles-6 and cycles-8 are removed. Thus, the gain of the GD code compared with the UNF code exceeds the gain observed in the (3, 17) codes.

Fig. 3. Partitioning matrices (left) and circulant power matrices (right) of GD/UNF codes with (γ, κ) = (3, 7): (a) GD code; (b) UNF code.
Fig. 4. Partitioning matrices (top) and circulant power matrices (bottom) of GD/UNF codes with (γ, κ) = (3, 17): (a) GD code; (b) UNF code.
Fig. 5. FER curves of GD/UNF codes with γ = 3.

Fig. 6 shows FER curves of the GD/UNF comparison with (γ, κ) = (4, 29). The partitioning matrices and the lifting matrices of the codes are specified in Figs. 8-9. Cycles-6 in the GD code and the UNF code are both removed, and the number of cycles-8 in the GD code demonstrates a 51.4% reduction from the count observed in the UNF code. It is worth mentioning that both codes have no low-weight ASs, which is reflected in their FER curves via the sharp waterfall regions and the non-existing error floor regions. The FER of the GD/UNF codes decreases steeply with SNR. Moreover, the GD code has a significant gain over the UNF code. These results substantiate the significant potential of the GRADE-AO method in constructing SC codes with superior performance for storage devices, with further applications including wireless communication systems.

Fig. 6. FER curves of GD/UNF codes with (γ, κ) = (4, 29).

Fig. 7 shows the FER curves of the TC/SC codes with (γ, κ) = (4, 17). The partitioning matrices and the lifting matrices of the codes are specified in Fig. 10. The number of cycles-6 in the (4, 17) TC code demonstrates a 19.5% and a 79.3% reduction from the counts observed in the SC codes with a matched constraint length and a matched circulant size, respectively. Moreover, the TC code has no low-weight ASs of the smallest weights. It is shown that the TC code outperforms the optimal SC code with a matched constraint length, and that the gain is of greater magnitude when compared with the SC code of identical circulant size. Note that although TC codes have higher memories and thus larger constraint lengths than SC codes of matched circulant sizes, they possess the same number of nonzero component matrices, and thus the same degrees of freedom in construction. This fact makes TC codes even more promising if we can devise for them windowed decoding algorithms with window sizes that are comparable to those of the corresponding SC codes of matched circulant sizes.

Fig. 7. FER curves of TC/SC codes with (γ, κ) = (4, 17).
Fig. 8. Partitioning matrix (top) and circulant power matrix (bottom) for the GD code with (γ, κ) = (4, 29).
Fig. 9. Partitioning matrix (top) and circulant power matrix (bottom) for the UNF code with (γ, κ) = (4, 29).
Fig. 10. Partitioning matrices (top) and circulant power matrices (bottom) for the TC/SC codes with (γ, κ, m_t) = (4, 17, 2): (a) TC code with z = 17 and a = (0, 1, 4); (b) SC codes with z = 17 (middle) and z = 28 (bottom).

VI. CONCLUSION
Discrete optimization of the constructions of spatially-coupled (SC) codes with high memories is known to be computationally expensive. Algorithmic optimization is efficient, but it can hardly guarantee performance because of the lack of theoretical guidance. In this paper, we proposed the so-called GRADE-AO method, a probabilistic framework that efficiently searches for locally optimal QC-SC codes with arbitrary memories. We obtain a locally optimal edge distribution that minimizes the expected number of cycle candidates via gradient descent. Starting from a random partitioning matrix that follows the derived edge distribution, we use algorithmic optimization to find a locally optimal partitioning matrix near it. Simulation results show that our proposed constructions achieve a significant performance gain over state-of-the-art codes. Future work includes extending the framework from cycle optimization to one that focuses on detrimental objects.

ACKNOWLEDGMENT
This work was supported in part by a UCLA Dissertation Year Fellowship, in part by NSF under Grants CCF-BSF 1718389, CCF 1717602, CCF 2008728, and CCF 1908730, and in part by AFOSR under Grant 8750-20-2-0504.

REFERENCES

[1] S. Kudekar, T. J. Richardson, and R. L. Urbanke, “Threshold saturation via spatial coupling: Why convolutional LDPC ensembles perform so well over the BEC,” IEEE Trans. Information Theory, vol. 57, no. 2, pp. 803–834, Feb. 2011.
[2] S. Kumar, A. J. Young, N. Macris, and H. D. Pfister, “Threshold saturation for spatially coupled LDPC and LDGM codes on BMS channels,” IEEE Trans. Information Theory, vol. 60, no. 12, pp. 7389–7415, Dec. 2014.
[3] P. M. Olmos and R. L. Urbanke, “A scaling law to predict the finite-length performance of spatially-coupled LDPC codes,” IEEE Trans. Information Theory, vol. 61, no. 6, pp. 3164–3184, Jun. 2015.
[4] A. Hareedy, H. Esfahanizadeh, and L. Dolecek, “High performance non-binary spatially-coupled codes for flash memories,” in Proc. IEEE Information Theory Workshop (ITW), Nov. 2017, pp. 229–233.
[5] M. Lentmaier, A. Sridharan, D. J. Costello, and K. S. Zigangirov, “Iterative decoding threshold analysis for LDPC convolutional codes,” IEEE Trans. Information Theory, vol. 56, no. 10, pp. 5274–5289, Oct. 2010.
[6] A. R. Iyengar, P. H. Siegel, R. L. Urbanke, and J. K. Wolf, “Windowed decoding of spatially coupled codes,” IEEE Trans. Information Theory, vol. 59, no. 4, pp. 2277–2292, Apr. 2013.
[7] D. G. M. Mitchell, M. Lentmaier, and D. J. Costello, “Spatially coupled LDPC codes constructed from protographs,” IEEE Trans. Information Theory, vol. 61, no. 9, pp. 4866–4889, Sep. 2015.
[8] H. Esfahanizadeh, A. Hareedy, and L. Dolecek, “Finite-length construction of high performance spatially-coupled codes via optimized partitioning and lifting,” IEEE Trans. Communications, vol. 67, no. 1, pp. 3–16, Jan. 2019.
[9] A. Hareedy, R. Wu, and L. Dolecek, “A channel-aware combinatorial approach to design high performance spatially-coupled codes,” IEEE Trans. Information Theory, vol. 66, no. 8, pp. 4834–4852, Aug. 2020.
[10] A. E. Pusane, R. Smarandache, P. O. Vontobel, and D. J. Costello, “Deriving good LDPC convolutional codes from LDPC block codes,” IEEE Trans. Information Theory, vol. 57, no. 2, pp. 835–857, Feb. 2011.
[11] L. Dolecek, Z. Zhang, V. Anantharam, M. J. Wainwright, and B. Nikolic, “Analysis of absorbing sets and fully absorbing sets of array-based LDPC codes,” IEEE Trans. Information Theory, vol. 56, no. 1, pp. 181–201, Jan. 2010.
[12] S. Naseri and A. H. Banihashemi, “Spatially coupled LDPC codes with small constraint length and low error floor,” IEEE Communications Letters, vol. 24, no. 2, pp. 254–258, Feb. 2020.
[13] M. Battaglioni, A. Tasdighi, G. Cancellieri, F. Chiaraluce, and M. Baldi, “Design and analysis of time-invariant SC-LDPC convolutional codes with small constraint length,” IEEE Trans. Communications, vol. 66, no. 3, pp. 918–931, Mar. 2018.
[14] M. P. C. Fossorier, “Quasi-cyclic low-density parity-check codes from circulant permutation matrices,” IEEE Trans. Information Theory, vol. 50, no. 8, pp. 1788–1793, Aug. 2004.
[15] L. Schmalen, V. Aref, and F. Jardel, “Non-uniformly coupled LDPC codes: Better thresholds, smaller rate-loss, and less complexity,” in 2017 IEEE International Symposium on Information Theory (ISIT)