Temporal Energy Analysis of Symbol Sequences for Fiber Nonlinear Interference Modelling via Energy Dispersion Index
Kaiquan Wu, Gabriele Liga, Alireza Sheikh, Frans M. J. Willems, Alex Alvarado
PREPRINT, FEBRUARY 25, 2021
Kaiquan Wu, Student Member, IEEE, Gabriele Liga, Member, IEEE, Alireza Sheikh, Member, IEEE, Frans M. J. Willems, Life Fellow, IEEE, and Alex Alvarado, Senior Member, IEEE
Abstract: The stationary statistical properties of independent, identically distributed (i.i.d.) input symbols provide insights on the induced nonlinear interference (NLI) during fiber transmission. For example, kurtosis is known to predict the modulation format-dependent NLI. These statistical properties can be used in the design of probabilistic amplitude shaping (PAS), which is a popular scheme that relies on an amplitude shaper for increasing spectral efficiencies of fiber-optic systems. One property of certain shapers used in PAS, including constant-composition distribution matchers, that is often overlooked is that a time-dependency between amplitudes is introduced. This dependency results in symbols that are non-i.i.d., which have time-varying statistical properties. Somewhat surprisingly, the effective signal-to-noise ratio (SNR) in PAS has been shown to increase when the shaping blocklength decreases. This blocklength dependency of SNR has been attributed to time-varying statistical properties of the symbol sequences, in particular, to variation of the symbol energies. In this paper, we investigate the temporal energy behavior of symbol sequences, and introduce a new metric called energy dispersion index (EDI). EDI captures the time-varying statistical properties of symbol energies. Numerical results show strong correlations between EDI and effective SNR, with absolute correlation coefficients above for different transmission distances.
Index Terms: Constant-Composition Distribution Matching, Fiber Channel Models, Fiber Nonlinearities, Probabilistic Amplitude Shaping.
I. INTRODUCTION

Constellation shaping (CS) and forward error correction (FEC) are two crucial elements to realize near capacity-achieving transmission for the additive white Gaussian noise (AWGN) channel. In recent years, a novel coded modulation framework called probabilistic amplitude shaping (PAS) [1] has attracted wide attention for its capacity-achieving performance. PAS elegantly integrates FEC and probabilistic shaping (PS) [2], where the PS functionality is enabled by an amplitude shaper. One popular amplitude shaper is constant-composition distribution matcher (CCDM) [3]. Other well-known shapers include multiset-partition distribution matcher (MPDM) [4], product distribution matcher (PDM) [5], enumerative sphere shaping (ESS) [6], [7], and shell mapping [8].

This work is supported by the Netherlands Organisation for Scientific Research via the VIDI Grant ICONIC (project number 15685). The work of Alex Alvarado is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 757791). The work of G. Liga is supported by the EuroTechPostdoc programme under the European Union’s Horizon 2020 research and innovation programme (Marie Skłodowska-Curie grant agreement No 754462). The authors are with the Information and Communication Theory Lab, Signal Processing Systems Group, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven 5600 MB, The Netherlands (e-mails: {k.wu, g.liga, a.sheikh, f.m.j.willems, a.alvarado}@tue.nl). A. Sheikh is with imec, Holst Centre, High Tech Campus 31, 5656 AE Eindhoven, The Netherlands (email: [email protected]).

In the context of fiber optical communications, numerous studies about PAS have been carried out in simulation [9]–[11] and experiments [12], [13]. Record spectral efficiencies (SEs) have been achieved in field trials [14], [15]. However, unlike the AWGN channel, SEs in the nonlinear fiber channel are limited by the Kerr effect. This effect causes nonlinear interference (NLI), which becomes a substantial part of the total noise experienced by the transmitted signals [16].

Mitigation of NLI can be achieved by optimizing statistical properties of the transmitted symbol sequences.
Given a number of constellation points, it is possible to change the constellation geometry [17], [18] or the probability mass function (PMF) of the constellation symbols [9], [19], [20]. These techniques are often referred to as geometric and probabilistic shaping, respectively. In both cases, by assuming symbols to be independent identically distributed (i.i.d.), the stationary statistical properties of symbols are optimized. In the former, the support of the PMF is optimized, while in the latter, the probabilities are optimized.

An alternative approach is to manipulate the temporal structure of the symbol sequence, which is believed to exert great influence on the NLI [21], [22]. The NLI can be modeled as inter-symbol interference (ISI) [23], i.e., the NLI experienced by a transmitted symbol only depends on adjacent symbols. The number of interfering symbols is usually referred to as channel memory. Therefore, transmitted symbols that comply with certain temporal structures could be used to suppress NLI. One example of this approach is the so-called temporal shaping introduced in [22], which generates correlated symbols from a ball-shaped constellation. By taking the correlation between symbol energies into account, [22] showed that transmitting correlated symbols leads to a new correlated term compared to i.i.d. symbol sequences (see [22, Eqs. (5) and (9)]). Temporal shaping was later realized using a finite state machine [24]. Although improved tolerance to nonlinearities can be achieved by controlling the temporal structure of the symbol sequences, a guiding principle for the optimization of the temporal structure is still unknown. This could be due to the fact that the statistical analysis of non-i.i.d. symbol sequences is in general more difficult than its i.i.d. counterpart.

PAS generates dependent symbols because the employed shaper imposes hidden temporal structure on the amplitude blocks.
Recently, several studies based on the PAS architecture have observed the impact of the resulting temporal structure on the effective SNR. The effective SNR has been shown to depend on the shaping blocklength: using short shaping blocklengths has been shown to offer significant effective SNR gains due to a weaker presence of nonlinearities [11], [25]–[27]. In [28], the joint effect of carrier phase recovery and shaping blocklength on the nonlinear shaping gain was studied. In addition, the symbol mapping strategy, i.e., the way the shaped amplitudes are mapped to multi-dimensional symbols, has also been shown to be a crucial factor in the effective SNR performance of the system. It was observed in [25], [29] that instead of using four amplitude shapers independently for I/Q and X/Y dimensions, using one amplitude shaper across four dimensions improves the effective SNR.

The assumption of i.i.d. input symbols can be justified in scenarios such as (i) uncoded transmission, and (ii) systems employing a random interleaver that breaks dependencies between symbols. Based on the i.i.d. assumption, state-of-the-art Gaussian noise (GN) and enhanced Gaussian noise (EGN) models provide insights about the connection between the stationary statistical properties of the transmitted signal and the NLI. The GN model concludes that the NLI power scales as the cube of the average symbol energy [30]–[32]. By relaxing the Gaussianity assumption of the GN model [33], the EGN model shows that the standardized fourth moment (also known as kurtosis) of the transmitted symbols is an important metric in predicting the modulation-dependent NLI [34]–[36]. To mitigate the NLI, an optimized PMF was designed in [20] with the help of kurtosis. In [37], the blockwise composition of the symbol sequences having low kurtosis was selected for transmission.

The sixth-order moment reflects self-channel interference, which is marginal in long distance transmission using wavelength-division multiplexing (WDM).

To address the scenario of non-i.i.d. input symbols, the heuristic finite-memory GN model was proposed in [21], which considers a time-windowed symbol energy within the channel memory for the computation of the NLI. Recently, to analyze the temporal structure of symbol sequences, it was shown in [25], [38] that symbol sequences with fewer clusters of identical symbols offer reduced NLI. A heuristic metric called run ratio was also proposed in [25]. The perturbative time-domain model of [39] can in principle also be used to analyze non-i.i.d. input symbols.

The discussion above leaves multiple open questions about the effects of the temporal structure of the symbol sequences on the NLI. Perhaps the most important open problem is that a metric that accounts for time-varying statistical properties of symbol sequences and enables a precise NLI prediction is still unknown. This paper is devoted to presenting a detailed explanation for the blocklength dependency of the effective SNR, and introducing a metric that is able to capture the effect of time-varying statistics on the NLI.

The contributions of this paper are threefold. First, we show that the temporal structure of the symbol sequence shaped by CCDM yields correlated symbols. The second contribution is to introduce a novel metric called energy dispersion index (EDI) to evaluate the variance of the symbol energies within a window.
This metric is inspired by the channel models in [21], [22] and is a function of the autocorrelation function of the symbol energy’s variance within a window. An almost perfect correlation between the EDI and the effective SNR for different transmission distances is observed in numerical analysis using CCDM. Lastly, we also give analytical expressions for the autocorrelation and the EDI of the QAM symbol sequences generated by CCDM.

The rest of the paper is organized as follows. A review of the channel memory estimation and relevant channel models for the optical fiber channel is presented in Sec. II. In Sec. III, the statistical properties of symbol sequences shaped by CCDM are studied. The EDI, which is the main contribution of the paper, is presented along with the simulation results in Sec. IV and Sec. V. Conclusions are drawn in Sec. VI.

II. MODELING NLI WITH WINDOWED ENERGY
In this section, the estimation of memory in the fiber-optic channel and the effective SNR considering the NLI are reviewed. We then discuss the effects of temporal energy behavior of symbol sequences on the NLI, from the perspectives of channel models with memory [21], [22], [39]. We end this section with a review of metrics available in the literature that predict the NLI induced by input symbols.
A. Channel Memory and Effective SNR
In a nonlinear fiber channel, amplifiers along with conventional linear digital signal processing (DSP) effectively compensate chromatic dispersion (CD) and attenuation. In this paper, we consider the channel output after DSP steps excluding nonlinearity compensation. Thus, the residual noise comes mainly from amplified spontaneous emission (ASE) and NLI. The ASE noise is a random process independent of the signal, whereas the NLI noise is dependent on the signal. The interplay between CD and fiber nonlinearity induces nonlinear interactions among a number of past and future symbols. The number of interfering symbols is often referred to as the channel memory.

For a fiber-optic system with group velocity dispersion $\beta_2$ and optical bandwidth $\Delta\omega$, the dispersive length $L_D$ is defined as [21, Sec. II-B]

$$L_D = \frac{1}{\Delta\omega^2 |\beta_2|}, \tag{1}$$

where $L_D$ indicates the distance scale over which pulse broadening effects become significant. Given the propagation distance $L$, the one-sided channel memory $M$ is roughly approximated as [21, Sec. II-B]

$$M \approx L / L_D. \tag{2}$$
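To give a feeling for the orders of magnitude in (1) and (2), the following minimal Python sketch evaluates the dispersive length and the one-sided memory. The fiber parameters, the 50 GHz bandwidth, the 80 km span length, the conversion $\Delta\omega = 2\pi \times$ bandwidth, and the rounding up to an integer number of symbols are illustrative assumptions and are not taken from the paper.

```python
import math

def dispersive_length(beta2, bandwidth_hz):
    """L_D = 1 / (delta_omega^2 * |beta2|), Eq. (1); beta2 in s^2/m, result in m."""
    delta_omega = 2 * math.pi * bandwidth_hz  # assumed conversion to rad/s
    return 1.0 / (delta_omega ** 2 * abs(beta2))

def channel_memory(length_m, beta2, bandwidth_hz):
    """One-sided memory M ~ L / L_D, Eq. (2), rounded up (rounding is an assumption)."""
    return math.ceil(length_m / dispersive_length(beta2, bandwidth_hz))

# Hypothetical example values, not from the paper.
beta2 = -21.7e-27   # s^2/m, roughly -21.7 ps^2/km
bandwidth = 50e9    # Hz
span = 80e3         # m

for spans in (1, 10, 20):
    print(spans, "span(s): M ~", channel_memory(spans * span, beta2, bandwidth))
```

Since $M$ grows linearly with $L$, the number of symbols whose energies influence a given symbol increases with transmission distance.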
At time instant $i$, given the transmitted symbol $X_i$, a vector consisting of $X_i$ along with its $2M$ neighbouring symbols is denoted as $X_{i-M}^{i+M} = [X_{i-M}, \ldots, X_i, X_{i+1}, \ldots, X_{i+M}]$. By considering the NLI and the ASE as AWGNs, the received symbol $Y_i$ can be expressed as

$$Y_i = X_i + Z_{\mathrm{ASE},i} + Z_{\mathrm{NLI},i}(X_{i-M}^{i+M}). \tag{3}$$

Due to the nonlinear interference, the NLI term $Z_{\mathrm{NLI},i}$ is expressed as a function of $X_{i-M}^{i+M}$.

The SNR is usually evaluated under the implicit assumption of ergodic signal and noise processes. Therefore, the empirical estimations of the signal power and noise variance converge to the corresponding statistical values. Hence, the effective SNR including the NLI is defined as

$$\mathrm{SNR}_{\mathrm{eff}} \triangleq \frac{\mathbb{E}[|X|^2]}{\mathrm{Var}[Z_{\mathrm{ASE}}] + \mathrm{Var}[Z_{\mathrm{NLI}}]} \tag{4}$$

$$= \frac{\mathbb{E}[|X|^2]}{\mathbb{E}[|Y - X|^2]} \approx \frac{\mu_{|X|^2}}{\mu_{|Y-X|^2}}. \tag{5}$$

Often the r.h.s. of (5) is used to approximate the effective SNR in simulations and experiments.

B. Fiber Channel Models with Finite Memory
We review related channel models that consider the temporal effects of the transmitted symbols and also fit the general model in (3). A first-order perturbation model can be derived from the nonlinear Schrödinger equation in discrete-time domain [39]. Assuming a single channel at time instant $i = 0$, $Z_{\mathrm{NLI},0}$ is modeled as [39, Eq. (57)]

$$Z_{\mathrm{NLI},0}(X_{-\infty}^{+\infty}) = \gamma \sum_{h=-\infty}^{+\infty} \sum_{j=-\infty}^{+\infty} \sum_{l=-\infty}^{+\infty} S_{h,j,l} X_h X_j X_l^{*}, \tag{6}$$

where $\gamma$ is the nonlinear coefficient. The products of the symbol triplets are weighted by complex perturbation terms $S_{h,j,l}$ that quantify self channel interference. In (6) we use the notation $X_{-\infty}^{+\infty}$ to emphasize that (6) is in theory an infinite-memory channel.

Notation: Throughout this paper, random variables are denoted by uppercase letters $X$ and their (deterministic) outcomes by the same letter in lowercase $x$. Sequences are denoted by boldface letters $\boldsymbol{X}$ (random) or $\boldsymbol{x}$ (deterministic). We use subscript and superscript to denote the boundaries of a random sequence, i.e., $X_i^j = [X_i, X_{i+1}, \ldots, X_j]$. Expectations, variances and autocovariances are denoted by $\mathbb{E}[\cdot]$, $\mathrm{Var}[\cdot]$, and $\mathrm{Cov}[\cdot]$, respectively. The probability of $X_i = x$ is denoted by $P_{X_i}(x)$. Conditional probability is denoted as $P_{Y_i|X_i}(y|x)$. The imaginary unit is $\jmath \triangleq \sqrt{-1}$.

The cross-channel interference can be expressed in the same form as the symbol triplet product times the complex perturbation terms (see [39, Eq. (60)]).

The perturbation terms $S_{h,j,l}$ exhibit very different magnitudes depending on their indices (see for example [40, Fig. 5]). To capture the dominant terms contributing to the NLI, we consider the terms with $j = l$ as previously done in [22, Sec. II-B]. Furthermore, we then truncate the indices within the channel memory $M$. Hence, (6) is approximated as

$$Z_{\mathrm{NLI},0}(X_{-M}^{+M}) \approx \gamma \sum_{h=-M}^{M} X_h \sum_{l=-M}^{M} S_{h,l,l} |X_l|^2, \tag{7}$$

which shows that the NLI noise $Z_{\mathrm{NLI},0}$ depends on the weighted symbol energies within the vector $X_{-M}^{+M}$.

For the transmission of non-i.i.d. symbol sequences, the sum of the ASE and the NLI in (3) is approximated in [21, Eq. (8)] as

$$Z_{\mathrm{ASE},0} + Z_{\mathrm{NLI},0}(X_{-M}^{+M}) \approx \tilde{Z} \sqrt{P_{\mathrm{ASE}} + \eta \left( \frac{\sum_{k=-M}^{M} |X_k|^2}{2M+1} \right)^{3}}, \tag{8}$$

where $\tilde{Z}$ is a zero-mean unit-variance circularly symmetric complex Gaussian random variable, $P_{\mathrm{ASE}}$ is the ASE noise variance, and $\eta$ is a real, non-negative constant quantifying the NLI. When compared to (7), (8) is a more radical approximation in the sense that the energies of all interfering symbols within a window of length $2M + 1$ are assumed to make equal contribution to NLI.

In general, the expressions (7) and (8) show that the sum of (weighted) symbol energies within a window plays an important role in determining the NLI. Transmitting correlated symbols can therefore suppress the NLI. These observations motivate us to focus on the symbol energies and their temporal correlations.

C. Related Metrics for NLI
One well-known metric for showing variations of symbol energies is peak-to-average power ratio (PAPR). PAPR has been widely used for analysis of orthogonal frequency-division multiplexing (OFDM) fiber transmission [41]. PAPR partially reflects energy variation of symbols and thus can be regarded as a rough indication of nonlinearity tolerance of the transmitted signal [42, Sec. II-D], [43, Sec. II]. PAPR is defined as

$$\Theta \triangleq \frac{|X|^2_{\max}}{\mathbb{E}[|X|^2]}, \tag{9}$$

where $|X|^2_{\max}$ is the maximum symbol energy in the symbol sequence.

Another important metric (originated from the EGN model) is kurtosis. The EGN model predicts strong NLI if the transmitted symbol sequence has high kurtosis. For zero-mean constellations, kurtosis is defined as

$$\Phi \triangleq \frac{\mathbb{E}[|X|^4]}{\left(\mathbb{E}[|X|^2]\right)^2}. \tag{10}$$

To explain the blocklength dependent effective SNR, run ratio was proposed in [25, Sec. III-B]. Symbol sequences with high run ratio are considered to be more prone to the NLI. Run ratio is defined as

$$R_r \triangleq \frac{1}{T} \left( \sum_{i=1}^{T-1} \bar{\delta}(x_{i-1}, x_i) \right), \tag{11}$$

where $T$ is the number of transmitted symbols, and $\bar{\delta}(x_{i-1}, x_i)$ is a decision function which is equal to when $x_{i-1} \neq x_i$ or otherwise.
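As a numerical illustration of (9) and (10), the sketch below computes PAPR and kurtosis with expectations replaced by sample averages. The helper names `papr` and `kurtosis` and the choice of a uniform 16-QAM constellation as input are assumptions made for this example only.

```python
import numpy as np

def papr(symbols):
    """Peak-to-average power ratio, Eq. (9), using the empirical mean energy."""
    energy = np.abs(symbols) ** 2
    return energy.max() / energy.mean()

def kurtosis(symbols):
    """Kurtosis of a zero-mean constellation, Eq. (10): E[|X|^4] / (E[|X|^2])^2."""
    energy = np.abs(symbols) ** 2
    return np.mean(energy ** 2) / np.mean(energy) ** 2

# Uniform 16-QAM: every constellation point appears exactly once,
# so the sample averages equal the expectations under a uniform PMF.
levels = np.array([-3, -1, 1, 3])
qam16 = np.array([i + 1j * q for i in levels for q in levels])

print("PAPR    :", papr(qam16))
print("kurtosis:", kurtosis(qam16))
```

Both quantities depend only on the PMF of the symbols, not on their order in time, which is why they cannot explain a blocklength dependency on their own.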
Fig. 1. Transmitter schematic diagram for a PAS architecture using 1D symbol mapping. [Block diagram: after serial-to-parallel conversion of the information bits, each of the in-phase and quadrature branches contains an amplitude shaper $\{0,1\}^k \to \mathcal{A}^n$, an amplitude-to-bits mapper and a systematic FEC encoder; the amplitudes and sign bits of each branch form PAM symbols, which are combined into QAM symbols.]
III. QAM SYMBOLS SHAPED BY CCDM

This section first reviews the generation of QAM symbol sequences in the PAS architecture. We then show that the symbols generated by CCDM are statistically dependent on each other. Motivated by results in Sec. II-B, which showed that the correlation among symbols could lead to a reduced NLI, we propose to analyze the autocorrelation of the symbol energies. This analysis will be later used in Sec. IV for the derivation of the EDI.

A block diagram of a PAS transmitter is illustrated in Fig. 1. To obtain a nonuniformly distributed QAM symbol sequence $\boldsymbol{X}$, two sequences of shaped pulse-amplitude modulation (PAM) symbols $\boldsymbol{X}_I$ and $\boldsymbol{X}_Q$ are independently generated in the in-phase and quadrature dimensions, respectively. The resulting QAM symbols are then $\boldsymbol{X} = \boldsymbol{X}_I + \jmath \boldsymbol{X}_Q$. In this paper, similar to [29, Sec. II-C], we refer to this mapping strategy as 1D symbol mapping. When obtained from independent information bit sequences, $\boldsymbol{X}_I$ and $\boldsymbol{X}_Q$ are also independent of each other.

A. System Model
As shown in Fig. 1, the generation of PAM symbols relies on a systematic FEC engine and an amplitude shaper. A fixed-length amplitude shaper encodes $k$ information bits to an amplitude codeword of blocklength $n$, and thus the shaping rate is $k/n$. The amplitude set is denoted as $\mathcal{A} = \{\Delta, 3\Delta, 5\Delta, \ldots\}$ where $\Delta$ is a scaling factor.

For simplicity of exposition, let us consider the in-phase dimension and one FEC codeword. As shown in Fig. 1, a FEC codeword consists of $N$ amplitude codewords. After a serial to parallel conversion, $Nk$ information bits go into the amplitude shaper, and $\zeta N n$ bits go into FEC, where $\zeta$ denotes the fraction of information in sign bits (see [1, Eq. (29)]). The amplitude shaper converts $Nk$ bits into $N$ amplitude codewords, which we call an amplitude sequence $\boldsymbol{A}_I = [A_0, A_1, \ldots, A_{Nn-1}]$. The amplitudes in $\boldsymbol{A}_I$ are then labeled into bits (see Fig. 1) and fed into the FEC along with $\zeta N n$ bits. The labeling used is the binary-reflected Gray code [44]. The FEC parity bits together with $\zeta N n$ bits serve as sign bits $\boldsymbol{S}_I$ to yield the PAM symbol sequence $\boldsymbol{X}_I = [X_{I,0}, X_{I,1}, \ldots, X_{I,Nn-1}]$, where $X_{I,i} = (-1)^{S_{I,i}} A_{I,i}$ ($i = 0, 1, \ldots, Nn-1$). The same procedure is conducted in the quadrature branch to generate $\boldsymbol{X}_Q$.

The 1D symbol mapping of Fig. 1 is also called inter-pairing in [25, Sec. II-B].

The amplitude shaper imposes constraints on the amplitudes of each amplitude codeword. In this paper, the amplitude shaper we consider is CCDM. Due to the constant-composition (CC) constraint, given blocklength $n$, the number of occurrences of each amplitude $a \in \mathcal{A}$ in every codeword is a constant denoted by $n_a$, and thus $n = \sum_{a \in \mathcal{A}} n_a$. For any realization of the amplitude sequence, the amplitude frequency distribution of $a$ is $n_a/n$. Therefore, the amplitude codewords generated by CCDM are simply permuted versions of each other.

In this paper, we make two assumptions about the CCDM codebook.
In the following section, these assumptions will allow us to consider the CCDM codewords as a genuine process of drawing amplitudes without replacement [3, Sec. IV]. The first assumption is that the codewords are equally likely, which is justified in the scenario where the information bits are independent and uniformly distributed. The second assumption is that the CCDM codebook contains all possible amplitude permutations. The total number of permutations is given by the multinomial coefficient of the composition (see [45, Eq. (5)]), which is denoted as $N_C$. For $k$ input bits, CCDM selects $2^k$ permutations as codewords. In general, $2^k < N_C$ because (i) $k$ is chosen to be variable, for example to achieve rate adaptivity, or (ii) $N_C$ is usually not exactly a power of two. In this paper, we always use the largest possible value of $k$, and thus, (i) above can be ignored. On the other hand, due to (ii) above, CCDM does not always generate all $N_C$ permutations. In this paper, we use emulated CCDM codewords that include all amplitude permutations. We will show later in the paper that this approximation is very good when compared to “exact” CCDM, where not all permutations are used.
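The counting argument above can be made concrete with a short Python sketch. It computes the multinomial coefficient $N_C$, the largest $k$ with $2^k \leq N_C$, and draws one emulated codeword as a uniformly random permutation of the composition. The function names and the example composition are hypothetical; the paper does not specify how its emulated codewords were generated beyond including all permutations.

```python
import math
import random

def num_permutations(composition):
    """Multinomial coefficient N_C = n! / prod(n_a!) for {amplitude: count}."""
    n = sum(composition.values())
    count = math.factorial(n)
    for n_a in composition.values():
        count //= math.factorial(n_a)  # each partial quotient is an integer
    return count

def max_input_bits(composition):
    """Largest k such that 2**k <= N_C."""
    return num_permutations(composition).bit_length() - 1

def emulated_ccdm_codeword(composition, rng):
    """Uniformly random permutation, so all N_C permutations are equally likely."""
    word = [a for a, n_a in composition.items() for _ in range(n_a)]
    rng.shuffle(word)
    return word

# Hypothetical composition with blocklength n = 10 (not from the paper).
composition = {1: 4, 3: 3, 5: 2, 7: 1}
n = sum(composition.values())
k = max_input_bits(composition)
print("N_C =", num_permutations(composition), " k =", k, " rate k/n =", k / n)
print(emulated_ccdm_codeword(composition, random.Random(0)))
```

For this composition only $2^{13}$ of the 12600 permutations would be used by an exact CCDM, which is the gap that the emulated codebook ignores.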
B. Statistical Dependency Among CC Amplitudes
The dependency of amplitudes in a FEC frame is illustrated in Fig. 2. For notation simplicity, in what follows we drop the subscripts of $\boldsymbol{A}$ and $\boldsymbol{S}$ that indicate in-phase or quadrature. Furthermore, we pay little attention to sign bits since they only determine the polarity of PAM symbols. In terms of the amplitude sequence $\boldsymbol{A}$, Fig. 2 schematically shows that the amplitudes from different amplitude codewords of length $n$ are independent of each other. Fig. 2 also shows that amplitudes within an amplitude codeword are mutually dependent.

The amplitude dependency within amplitude codewords comes from the CC constraint. The arithmetic encoding of a CCDM can be modeled as a process of drawing without replacement, in which the probability of drawing an amplitude is updated after every draw by subtracting the previously drawn amplitudes from the composition. The CC constraint further implicitly ensures that the sum of amplitudes in a CCDM codeword is a constant, i.e.,

$$\sum_{j=0}^{n-1} A_j = \sum_{a \in \mathcal{A}} a\, n_a. \tag{12}$$
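The drawing-without-replacement view and the constant sum in (12) can be checked with the following sketch. It models the process described above only; it is not an implementation of the arithmetic-coding CCDM encoder, and the function name and seed are arbitrary choices for this illustration.

```python
import random

def draw_cc_codeword(composition, rng):
    """Draw one codeword amplitude by amplitude without replacement."""
    remaining = dict(composition)  # amplitudes still available in the composition
    codeword = []
    for _ in range(sum(composition.values())):
        amplitudes = [a for a, count in remaining.items() if count > 0]
        # The probability of the next amplitude is proportional to its remaining count.
        weights = [remaining[a] for a in amplitudes]
        a = rng.choices(amplitudes, weights=weights, k=1)[0]
        remaining[a] -= 1  # update the composition after the draw
        codeword.append(a)
    return codeword

composition = {1: 3, 3: 1}  # three amplitudes 1 and one amplitude 3
rng = random.Random(2021)
words = [draw_cc_codeword(composition, rng) for _ in range(1000)]

target = sum(a * n_a for a, n_a in composition.items())  # right-hand side of (12)
all_constant = all(sum(w) == target for w in words)
print("sum of every codeword equals", target, ":", all_constant)
```

Because the counts are decremented after each draw, every generated codeword is a permutation of the same composition, and its amplitude sum is fixed regardless of the order of the draws.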
Fig. 2. Illustration of FEC frame and amplitude dependency. One FEC frame is formed by $nN$ amplitudes and $nN$ sign bits. Amplitudes within the same codeword are dependent, while amplitudes from different codewords are independent.

For any two different time instants $i$ and $i + \tau$, we then have

$$A_i = -A_{i+\tau} - \sum_{\substack{j=0 \\ j \neq i,\, i+\tau}}^{n-1} A_j + \sum_{a \in \mathcal{A}} a\, n_a. \tag{13}$$

Expression (13) shows that the amplitudes $A_i$ and $A_{i+\tau}$ within CCDM codewords exhibit a linear relationship. This observation shows that amplitudes within an amplitude codeword are indeed linearly dependent.

In the next example, we use a CCDM shaping trellis to explain the statistical dependency between amplitudes. The trellis shows all possible amplitude sequences and their accumulated energies. We construct the trellis based on the following rules:
• Each path in the trellis represents a unique amplitude sequence.
• At time instant $i$, the vertical position of the nodes represents the accumulated energy $E_i^{\mathrm{acc}} \in \mathcal{E}$, where
$$E_i^{\mathrm{acc}} \triangleq \sum_{t=0}^{i-1} |A_t|^2, \tag{14}$$
$E_0^{\mathrm{acc}} = 0$, and $\mathcal{E}$ is the set of possible accumulated energy levels.
• The numbers labeling the trellis states represent the value of $P_{A_i | E_i^{\mathrm{acc}}}(a|e)$, i.e., the conditional probability of amplitude $A_i = a$ given the accumulated energy $E_i^{\mathrm{acc}} = e$ for $i = 0, 1, \ldots$, where $a \in \mathcal{A}$ and $e \in \mathcal{E}$.
• At time instant $i$, the edges indicate the amplitude $A_i$. The steeper the slope of the edge, the larger the amplitude. The number next to the edges is $P_{A_i, E_i^{\mathrm{acc}}}(a, e)$, i.e., the probability of the paths starting from accumulated energy $e$ using an amplitude $a$.

Example 1 (CCDM Trellis):
Assume the use of CCDM with blocklength $n = 4$ and a composition of three amplitudes 1 and one amplitude 3 (i.e., $n_1 = 3$ and $n_3 = 1$), and thus, the amplitude PMF is $P_A = [3/4, 1/4]$. Fig. 3 shows the trellis of the first amplitude codeword, whose behavior will repeat for the subsequent codewords. The set of accumulated energy states is $\mathcal{E} = \{0, 1, 2, 3, 9, 10, 11, 12\}$. To explain the trellis, consider state $E_2^{\mathrm{acc}} = 10$ and the three connected edges (shown with boldface text in Fig. 3). Since two paths reach this state, $P_{E_2^{\mathrm{acc}}}(10) = P_{A_1, E_1^{\mathrm{acc}}}(3, 1) + P_{A_1, E_1^{\mathrm{acc}}}(1, 9) = 1/4 + 1/4 = 1/2$. At this node, we can only choose amplitude 1 for $A_2$, and thus $P_{A_2 | E_2^{\mathrm{acc}}}(3 | 10) = 0$ and $P_{A_2 | E_2^{\mathrm{acc}}}(1 | 10) = 1$. The joint probability of amplitude $a = 1$ starting from $e = 10$ is thus $P_{A_2, E_2^{\mathrm{acc}}}(1, 10) = P_{A_2 | E_2^{\mathrm{acc}}}(1 | 10)\, P_{E_2^{\mathrm{acc}}}(10) = 1 \times 1/2 = 1/2$.
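The probabilities in Example 1 can be reproduced by brute-force enumeration of the four equally likely codewords. The sketch below is only a check of the example under the equal-likelihood assumption stated in Sec. III-A; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# All distinct permutations of the composition {1: 3, 3: 1}, assumed equally likely.
codewords = sorted(set(permutations([1, 1, 1, 3])))
p_word = Fraction(1, len(codewords))

def acc_energy(word, i):
    """Accumulated energy E_i^acc = sum_{t=0}^{i-1} |A_t|^2, Eq. (14)."""
    return sum(a * a for a in word[:i])

def prob(event):
    """Probability of an event defined on codewords."""
    return sum(p_word for w in codewords if event(w))

# Set of accumulated energy levels over one codeword (i = 0, ..., 4).
levels = sorted({acc_energy(w, i) for w in codewords for i in range(5)})

p_e2 = prob(lambda w: acc_energy(w, 2) == 10)                  # P(E_2^acc = 10)
p_joint = prob(lambda w: acc_energy(w, 2) == 10 and w[2] == 1)  # P(A_2 = 1, E_2^acc = 10)
p_cond = p_joint / p_e2                                         # P(A_2 = 1 | E_2^acc = 10)

print(levels)
print(p_e2, p_joint, p_cond)
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the edge and node labels of the trellis.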
Fig. 3. CCDM shaping trellis with blocklength $n = 4$ for amplitude set $\mathcal{A} = \{1, 3\}$ with $P_A = [3/4, 1/4]$. The amplitude edges are labeled with $P_{A_i, E_i^{\mathrm{acc}}}(a, e)$. Each node contains $P_{A_i | E_i^{\mathrm{acc}}}(3|e)$ on the left (in red) and $P_{A_i | E_i^{\mathrm{acc}}}(1|e)$ on the right (in blue).

In Example 1, the amplitudes within a CCDM codeword are statistically dependent on each other. This can be seen from Fig. 3, where $P_{A_i | E_i^{\mathrm{acc}}}(a|e) = P_{A_i}(a)$ is not always satisfied. The reason for this dependency is that $A_0 + A_1 + A_2 + A_3 = 6$ (see (12)), and any pair of amplitudes exhibits a linear relationship. On the other hand, for i.i.d. amplitude sequences, $P_{A_i | E_i^{\mathrm{acc}}}(a|e) = P_{A_i}(a)$ is satisfied for all $a \in \mathcal{A}$.

Another property of i.i.d. amplitude sequences is the time-independent marginal probability of an amplitude. As can be seen in Example 1, the marginal probability of an amplitude within a CCDM codeword is also constant, i.e., for all $i \in \mathbb{Z}$,

$$\sum_{e \in \mathcal{E}} P_{A_i, E_i^{\mathrm{acc}}}(a, e) = P_{A_i}(a) = P_A(a) = \frac{n_a}{n}. \tag{15}$$

The property in (15) is due to the set of codewords being permutation invariant, as shown in [46, Lemma 1]. For example, $P_{A_2}(1) = P_{A_2, E_2^{\mathrm{acc}}}(1, 10) + P_{A_2, E_2^{\mathrm{acc}}}(1, 2) = 1/2 + 1/4 = 3/4$. The property in (15) holds for all CC amplitude sequences and will be used in the statistical analysis later in the paper.

A closer look at Example 1 also reveals that when multiple CCDM codewords are transmitted, the trellis extends in the time domain, and the probabilistic model of the first amplitude codeword repeats with period $n$. This repetition is studied in the next example.

Example 2 (Trellis Repetition and Blocklength):
Fig. 4 shows a comparison of the CCDM shaping trellises for blocklengths $n = 4$ and $n = 8$. Since both cases have the same $P_A$, it can be seen that $E_4^{\mathrm{acc}}/4$ for $n = 4$ is equal to $E_8^{\mathrm{acc}}/8$ for $n = 8$ (see Fig. 4). As a result, the green dotted lines in the middle of the trellises show that their accumulated energies grow at the same average rate. We also note that for blocklength $n = 8$, the trellis broadens (in the energy direction) and includes the trellis for $n = 4$. The vertical arrows between the black dashed lines show that the variation of the accumulated energy growth for $n = 8$ is larger when compared to $n = 4$.

Fig. 4. CCDM shaping trellis comparison for blocklengths $n = 4$ and $n = 8$, with P_A = [ , ]. (a) Trellis for $n = 4$. (b) Trellis for $n = 8$. The trellis paths marked by gray dotted lines show amplitude sequences never generated by CCDM. [Axes: time index $i$ (horizontal), accumulated energy $E_i^{\mathrm{acc}}$ (vertical); the vertical arrows are labeled "Energy Variation".]

An important implication from the two examples above is that by introducing dependencies between amplitudes, CCDM artificially constrains the dispersion of the accumulated energy of the amplitude sequences (see black arrows in Fig. 4). For a fixed PMF $P_A$, the accumulated energy dispersion is more constrained with shorter blocklength.

C. Statistics of Symbol Energies with CC Amplitudes
In this section, we will study the statistical properties of the QAM symbol sequences defined by CC amplitudes from the perspective of stochastic processes. In particular, we will show that the process of symbol energies is wide-sense cyclostationary (WSCS). The analysis in this section will lay the ground for the definition and formulation of the EDI metric in Sec. IV. We start by briefly reviewing some of the key properties of stochastic processes.
Definition 1 (First-order stationarity [47, Ch. 9-1]):
A stochastic process $\boldsymbol{X}$ is called first-order stationary if $P_{X_i} = P_{X_{i+\tau}}$ holds for any delay $\tau$.

Definition 2 (Autocorrelation [48, Def. 13.13]):
The autocorrelation of a stochastic process $\boldsymbol{X}$ as a function of time slot $i$ and delay $\tau$ is defined as

$$R_{\boldsymbol{X}}(i, \tau) \triangleq \mathbb{E}[X_i X_{i+\tau}]. \tag{16}$$

Definition 3 (Autocovariance [48, Def. 13.12]):
The autocovariance of a stochastic process $\boldsymbol{X}$ is defined as

$$\mathrm{Cov}[X_i, X_{i+\tau}] \triangleq \mathbb{E}\left[(X_i - \mathbb{E}[X_i])(X_{i+\tau} - \mathbb{E}[X_{i+\tau}])\right]. \tag{17}$$

Moreover, for first-order stationary processes,

$$\mathrm{Cov}[X_i, X_{i+\tau}] = R_{\boldsymbol{X}}(i, \tau) - \left(\mathbb{E}[X]\right)^2. \tag{18}$$

Definition 4 (Wide-sense Cyclostationarity [47, Ch. 10-4]):
A stochastic process $\boldsymbol{X}$ is called wide-sense cyclostationary with period $n$ if $\mathbb{E}[X_i] = \mathbb{E}[X_{i+mn}]$ and $R_{\boldsymbol{X}}(i, \tau) = R_{\boldsymbol{X}}(i + mn, \tau)$ hold for every $m \in \mathbb{Z}$.

In the definitions above, we used $\boldsymbol{X}$ to denote a generic stochastic process. In what follows, we focus on the process of symbol energies defined as

$$\boldsymbol{E} \triangleq [\ldots, E_{i-1}, E_i, E_{i+1}, \ldots], \tag{19}$$

where

$$E_i \triangleq |X_i|^2 = A_{I,i}^2 + A_{Q,i}^2. \tag{20}$$

For a symbol energy process $\boldsymbol{E}$, we will refer to every $n$ symbol energies as a block, each of which is constructed from two amplitude codewords (see Fig. 1). In addition, the autocorrelation of $\boldsymbol{E}$ is

$$R_{\boldsymbol{E}}(i, \tau) \triangleq \mathbb{E}[E_i E_{i+\tau}]. \tag{21}$$

Lemma 1:
The sequence of QAM symbol energies $\boldsymbol{E}$ defined by CC amplitudes is a first-order stationary process.

Proof: As shown in (15), the CC amplitude sequences $\boldsymbol{A}$ are first-order stationary processes. Since a QAM symbol can be decomposed into two independent dimensions, the process of the energies $\boldsymbol{E}$ in (19) is also first-order stationary.

Lemma 1 shows that the probability distribution of symbol energy $P_{E_i}$ is constant for any $i \in \mathbb{Z}$. From this first-order stationarity property in Lemma 1, it follows that

$$\mathbb{E}[E_i] = \mathbb{E}[|X_i|^2] = \mathbb{E}[|X|^2], \tag{22}$$

$$\mathrm{Var}[E_i] = \mathrm{Var}[|X_i|^2] = \mathrm{Var}[|X|^2], \tag{23}$$

where

$$\mathbb{E}[|X|^2] = 2\,\mathbb{E}[A^2], \tag{24}$$

$$\mathrm{Var}[|X|^2] = 2\,\mathbb{E}[A^4] - 2\left(\mathbb{E}[A^2]\right)^2, \tag{25}$$

and (25) was obtained using

$$\mathbb{E}[|X|^4] = 2\,\mathbb{E}[A^4] + 2\left(\mathbb{E}[A^2]\right)^2. \tag{26}$$

The second and fourth order moments of $A$ in (24)–(26) can be computed by using the amplitude PMF $P_A(a) = n_a/n$, i.e.,

$$\mathbb{E}[A^2] = \sum_{a \in \mathcal{A}} a^2 \frac{n_a}{n}, \tag{27}$$

$$\mathbb{E}[A^4] = \sum_{a \in \mathcal{A}} a^4 \frac{n_a}{n}. \tag{28}$$

Theorem 2:
For the sequence of QAM symbol energies E defined by CC amplitudes, if E i and E i + τ ( τ (cid:54) = 0 ) belong tothe same block, we have R E ( i, τ ) = ρ < E (cid:2) | X | (cid:3) , (29)where ρ (cid:44) n E (cid:2) | X | (cid:3) − E (cid:2) | X | (cid:3) n − , (30) We avoid using “symbol energy codeword”, since the mapping between k input bits and n symbol energies is not one-to-one. REPRINT, FEBRUARY 25, 2021 7 and
Cov [ E i , E i + τ ] = − Var (cid:2) | X | (cid:3) n − < . (31) Proof:
See Appendix A.

In Theorem 2, the negative autocovariance in (31) shows that the dependency between amplitudes arises because these amplitudes are inversely correlated. This negativity will also be used to prove Corollary 7 in Sec. IV. In addition, from (31) we obtain $\lim_{n \to \infty} \mathrm{Cov}[E_i, E_{i+\tau}] = 0$, indicating that using long blocklengths weakens the linear dependency between symbol energies.

Lemma 3:
The QAM symbol energies $\boldsymbol{E}$ defined by CC amplitudes form a WSCS process.

Proof: The first condition of WSCS processes, $\mathbb{E}[E_i] = \mathbb{E}[E_{i+mn}]$, is satisfied because of (22). The second condition of WSCS processes, $R_E(i,\tau) = R_E(i+mn,\tau)$, follows from the fact that the same probabilistic model repeats for every block of $n$ symbols (see Fig. 2).

In practice, we are interested in the average autocorrelation of WSCS processes over the cyclostationarity period. For the process under investigation, such an average autocorrelation $\overline{R}_E(\tau)$ is defined as
$$\overline{R}_E(\tau) \triangleq \frac{1}{n} \sum_{i=0}^{n-1} R_E(i,\tau) \tag{32}$$
$$= \frac{1}{n} \sum_{i=0}^{n-1} \mathbb{E}[E_i E_{i+\tau}], \tag{33}$$
where (33) follows from (21). The average autocorrelation $\overline{R}_E(\tau)$ in (32) can be interpreted as a quantity that indicates the average linear dependency of all possible pairs of symbol energies separated by $\tau$ symbols.

The following theorem gives an analytical expression for the average autocorrelation in (33) for the considered QAM symbol sequences.

Theorem 4:
The average autocorrelation $\overline{R}_E(\tau)$, $\tau \in \mathbb{Z}$, for CCDM QAM symbol sequences with blocklength $n$ generated using a composition such that $P_A(a) = n_a/n$, $a \in \mathcal{A}$, can be expressed as
$$\overline{R}_E(\tau) = \begin{cases} \mathbb{E}\left[|X|^4\right], & \text{if } \tau = 0 \\ \dfrac{|\tau|\,\mathbb{E}^2\left[|X|^2\right] + (n - |\tau|)\,\rho}{n}, & \text{if } 1 \leq |\tau| < n \\ \mathbb{E}^2\left[|X|^2\right], & \text{if } |\tau| \geq n \end{cases} \tag{34}$$
where $\rho$ is given in (30).

Proof:
The function $R_E(i,\tau)$ for different values of $\tau$ and $i$ is illustrated in Table I, where for simplicity negative delays are omitted. When $\tau = 0$, $R_E(i,0) = \mathbb{E}\left[|X|^4\right]$, as shown in the first column of Table I. For $\tau \neq 0$, $E_{i+\tau}$ and $E_i$ are either in the same block and thus correlated, or in two different blocks and thus independent of each other. For example, when $i = 0$, $E_0$ is only correlated with the following $n-1$ symbol energies, yielding $R_E(0,\tau) = \rho$ for $\tau = 1, 2, \ldots, n-1$ (see Fig. 2 and the blue cells in the $i = 0$ row of Table I). When $\tau = n$, $E_n$ is the energy of the first symbol in the second block, and thus $E_0$ and $E_n$ are independent, which gives $R_E(0,n) = \mathbb{E}^2\left[|X|^2\right]$. This is shown by the red cells in the $i = 0$ row of Table I. The other rows in Table I can be calculated in a similar way. $\overline{R}_E(\tau)$ is the column-wise average of $R_E(i,\tau)$.

TABLE I
The autocorrelation $R_E(i,\tau)$ and its average $\overline{R}_E(\tau)$ for nonnegative $\tau$. The three cases of $R_E(i,\tau)$ in (55) correspond to the first column (in gray), the upper left part (in blue) and the bottom right (in red), respectively.
$$\begin{array}{c|ccccccccc}
i \backslash \tau & 0 & 1 & 2 & \cdots & n-2 & n-1 & n & n+1 & \cdots \\ \hline
0 & \mathbb{E}[|X|^4] & \rho & \rho & \cdots & \rho & \rho & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \cdots \\
1 & \mathbb{E}[|X|^4] & \rho & \rho & \cdots & \rho & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \cdots \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & \\
n-2 & \mathbb{E}[|X|^4] & \rho & \mathbb{E}^2[|X|^2] & \cdots & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \cdots \\
n-1 & \mathbb{E}[|X|^4] & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \cdots & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \cdots \\ \hline
\overline{R}_E(\tau) & \mathbb{E}[|X|^4] & \frac{(n-1)\rho + \mathbb{E}^2[|X|^2]}{n} & \frac{(n-2)\rho + 2\,\mathbb{E}^2[|X|^2]}{n} & \cdots & \frac{2\rho + (n-2)\,\mathbb{E}^2[|X|^2]}{n} & \frac{\rho + (n-1)\,\mathbb{E}^2[|X|^2]}{n} & \mathbb{E}^2[|X|^2] & \mathbb{E}^2[|X|^2] & \cdots
\end{array}$$

Note that $\overline{R}_E(\tau)$ is an even function and that $\overline{R}_E(\tau)$ in (34) is completely determined by the blocklength $n$ and the second and fourth order moments of $|X|$. The latter can be calculated via (24)–(25) and (27)–(28).

Furthermore, $\overline{R}_E(\tau)$ can be approximated by the sample autocorrelation function $\hat{R}_E(\tau)$ in a Monte Carlo simulation. With $T$ samples, $\hat{R}_E(\tau)$ is defined as
$$\hat{R}_E(\tau) \triangleq \frac{1}{T} \sum_{t=0}^{T-1} |x_t|^2 |x_{t+\tau}|^2. \tag{35}$$
When $T \to \infty$, $\hat{R}_E(\tau) \to \overline{R}_E(\tau)$. This ergodicity follows from Slutsky’s theorem [47, Thm. 12-2], due to the fact that as $\tau' \to \infty$, the correlation between symbol energies vanishes and thus $\mathrm{Cov}\left[E_i E_{i+\tau}, E_{i+\tau'} E_{i+\tau+\tau'}\right] \to 0$.

Example 3 (Average Autocorrelation): Fig. 5 shows the average autocorrelation $\overline{R}_E(\tau)$ in Theorem 4 and its estimate $\hat{R}_E(\tau)$ for 64QAM symbol sequences (normalized to $\mathbb{E}\left[|X|^2\right] = 1$). Four different shaping blocklengths are compared, which use the same distribution $P_A$ = [0. , . , . , . ] and have the same $\overline{R}_E(0) = \mathbb{E}\left[|X|^4\right]$ (not shown in Fig. 5). I.i.d. sequences with the same distribution are also considered. Fig. 5 shows $\hat{R}_E(\tau)$ in (35) for sequences generated using exact CCDM, emulated CCDM, and for a system with a symbol-level interleaver to emulate the i.i.d. property. Fig. 5 shows that the emulated results differ from exact CCDM only for $n = 10$, where only a fraction of the total number of permutations $N_C$ is used as codewords. Except for this minor mismatch at $n = 10$, $\overline{R}_E(\tau)$ in Theorem 4 approximates the true autocorrelation function well in all other cases. It can also be seen in Fig. 5 that for i.i.d. symbol sequences or $\tau \geq n$, $\overline{R}_E(\tau) = \mathbb{E}^2\left[|X|^2\right] = 1$. In the cases of CC symbol sequences with $0 < \tau < n$, due to $R_E(i,\tau)$ being smaller than $\mathbb{E}^2\left[|X|^2\right]$, the average symbol energy dependency manifests itself as a deviation between $\overline{R}_E(\tau)$ and $\mathbb{E}^2\left[|X|^2\right]$ (see (29) and the second case in (34)). For each blocklength, as $\tau$ increases, the symbol energy dependency gradually decreases and vanishes when $\tau = n$, such that $E_i$ and $E_{i+\tau}$ always belong to two independent blocks. Moreover, as $n$ increases, the curves approach $\mathbb{E}^2\left[|X|^2\right]$ (i.e., the deviation decreases), which implies a weaker dependency.

[Fig. 5. Simulation results of $\hat{R}_E(\tau)$ in (35) (markers) by using exact or emulated CCDM codewords, and analytical results of $\overline{R}_E(\tau)$ given by Theorem 4 (solid lines) for normalized ($\mathbb{E}\left[|X|^2\right] = 1$) $P_A$ = [0. , . , . , . ]. Horizontal axis: $\tau$; vertical axis: $\overline{R}_E(\tau)$; curves for $n = 10$, $n = 20$, $n = 50$, $n = 100$ and i.i.d. symbols.]

[Fig. 6. Window energies $G_i^W$ and $G_{i+1}^W$ given symbol energy sequence $\boldsymbol{E}$, with $G_i^W = E_{i-W/2} + E_{i-W/2+1} + \ldots + E_{i+W/2}$ and $G_{i+1}^W = E_{i-W/2+1} + E_{i-W/2+2} + \ldots + E_{i+W/2+1}$.]

IV. ENERGY DISPERSION INDEX
In this section, EDI is introduced as a figure of merit to qualitatively predict the NLI power. In Sec. II-A, we showed that memory is one of the main phenomena affecting the NLI. In previous NLI metrics (see Sec. II-C), the effect of non-i.i.d. input sequences and their interaction with the channel memory was not considered. EDI brings together these two elements by capturing the statistical properties of the input sequence over a time window which is comparable to the channel memory.
A. Definition
EDI is designed to be a sliding window statistic with window length $W$. The windowed energy at time instant $i$, $G_i^W$, is defined as the total energy of $X_{i-W/2}^{i+W/2}$, i.e.,
$$G_i^W \triangleq \sum_{j=i-W/2}^{i+W/2} E_j = \sum_{j=i-W/2}^{i+W/2} |X_j|^2. \tag{36}$$
Fig. 6 illustrates the windowed energies $G_i^W$ and $G_{i+1}^W$ and shows the sliding window effect.

The windowed energy is a random variable, and thus, we define the windowed energy process as
$$[\ldots, G_{i-1}^W, G_i^W, G_{i+1}^W, \ldots]. \tag{37}$$
For CCDM QAM symbol sequences, the windowed energy process is the sum of $W+1$ time-shifted WSCS processes of symbol energies of period $n$, and thus the windowed energy process is also WSCS with the same period $n$ [49, Ch. 17.2-Prop. 1]. Therefore, $R_{G^W}(i,\tau)$ and $\mathbb{E}\left[G_i^W\right]$ are periodic with period $n$. Furthermore, the variance of the windowed energy $G_i^W$ (see (18)) is given by
$$\mathrm{Var}\left[G_i^W\right] = \mathrm{Cov}\left[G_i^W, G_i^W\right] = R_{G^W}(i,0) - \mathbb{E}^2\left[G_i^W\right]. \tag{38}$$
With periodic $R_{G^W}(i,0)$ and $\mathbb{E}\left[G_i^W\right]$, $\mathrm{Var}\left[G_i^W\right]$ in (38) also varies cyclically with period $n$.

Definition 5 (Energy Dispersion Index): EDI is defined as
$$\Psi \triangleq \frac{\overline{\mathrm{Var}}\left[G^W\right]}{\overline{\mathbb{E}}\left[G^W\right]}, \tag{39}$$
where
$$\overline{\mathbb{E}}\left[G^W\right] \triangleq \frac{1}{n} \sum_{i=0}^{n-1} \mathbb{E}\left[G_i^W\right], \tag{40}$$
$$\overline{\mathrm{Var}}\left[G^W\right] \triangleq \frac{1}{n} \sum_{i=0}^{n-1} \mathrm{Var}\left[G_i^W\right]. \tag{41}$$
EDI in Definition 5 measures the windowed energy variations. EDI in (39) is defined as the ratio of the average windowed energy variance ($\overline{\mathrm{Var}}\left[G^W\right]$) to the average windowed energy mean ($\overline{\mathbb{E}}\left[G^W\right]$). As shown in (40) and (41), these two averages have the same form as the average autocorrelation $\overline{R}_E(\tau)$ in (32).

B. Alternative Formulations
EDI in Definition 5 can also be expressed in terms of the second-order statistics of the input symbols. In this section, we introduce such a formulation of the EDI for both CCDM QAM symbol sequences and i.i.d. symbol sequences. Note that we view the process of i.i.d. symbol sequences as a special case of a WSCS process with period $n = 1$.

Theorem 5: The average windowed energy mean and average windowed energy variance for CCDM QAM symbol sequences can be expressed as
$$\overline{\mathbb{E}}\left[G^W\right] = (W+1)\,\mathbb{E}\left[|X|^2\right] \tag{42}$$
and
$$\overline{\mathrm{Var}}\left[G^W\right] = (W+1)\,\mathrm{Var}\left[|X|^2\right] - W(W+1)\,\mathbb{E}^2\left[|X|^2\right] + 2 \sum_{\tau=1}^{W} (W - \tau + 1)\,\overline{R}_E(\tau), \tag{43}$$
respectively, where $\overline{R}_E(\tau)$ is given by (34).

[Fig. 7. Histogram of windowed energy $G^W$ for $W = 1000$. Normalized ($\mathbb{E}\left[|X|^2\right] = 1$) $P_A$ = [0. , . , . , . ]. The average windowed energy mean is $\overline{\mathbb{E}}\left[G^W\right] = 1000$. Horizontal axis: $G^W$; vertical axis: probability. The legend compares $\overline{\mathrm{Var}}\left[G^W\right]$ from (43) or (46) with the estimate $\sigma^2_{G^W}$ (first legend column: 110.27 for $n = 500$, 383.96 for $n = 2000$).]

Proof:
See Appendix B.

For i.i.d. symbol sequences, (42) also holds. On the other hand, $\mathrm{Var}\left[G_i^W\right]$ for i.i.d. symbol sequences is the sum of $W+1$ energy variances; therefore,
$$\overline{\mathrm{Var}}\left[G^W\right] = \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=i-W/2}^{i+W/2} \mathrm{Var}[E_j] \tag{44}$$
$$= \frac{1}{n} \sum_{i=0}^{n-1} (W+1)\,\mathrm{Var}\left[|X|^2\right] \tag{45}$$
$$= (W+1)\,\mathrm{Var}\left[|X|^2\right], \tag{46}$$
where (45) follows from (23).

Example 4 (Windowed Energy Histograms):
Fig. 7 illustrates how the probabilities of windowed energies depend on the blocklength $n$, based on a number of windowed energy samples. For simplicity, we only show the results of emulated CCDM amplitude codewords, since almost the same result is obtained when using exact CCDM. Fig. 7 shows that the mean value of $G_i^W$ is independent of $n$, as shown in (42). Fig. 7 also shows that an increase of $n$ results in a heavier tail, i.e., a larger probability of observing a large windowed energy. The windowed energy variance increases as the blocklength $n$ increases, which is in good agreement with our observations in Example 2. Fig. 7 also shows that the estimated windowed energy variance $\sigma^2_{G^W}$ is well-approximated by $\overline{\mathrm{Var}}\left[G^W\right]$ given in (43) and (46), where the discrepancies are caused by the limited number of samples.

Based on Theorem 5, the EDI of CCDM QAM symbol sequences can be obtained directly by substituting (43) and (42) in (39), i.e.,
$$\Psi(n,W) = \mathbb{E}\left[|X|^2\right] \left[\Phi - (W+1)\right] + \frac{2 \sum_{\tau=1}^{W} (W - \tau + 1)\,\overline{R}_E(\tau)}{(W+1)\,\mathbb{E}\left[|X|^2\right]}, \tag{47}$$
where $\overline{R}_E(\tau)$ and $\Phi$ are given by (34) and (10), respectively. The notation $\Psi(n,W)$ is used to emphasize the dependency of the EDI on the blocklength $n$ and window length $W$.

Apart from $n$ and $W$, in view of (10), (30) and (34), EDI in (47) is also determined by the kurtosis $\Phi$ as well as the second and fourth order moments of $|X|$. Furthermore, after dividing (46) by (42) and using (10), the EDI of i.i.d. symbol sequences is obtained, which is given by
$$\Psi = \mathbb{E}\left[|X|^2\right] (\Phi - 1). \tag{48}$$
It can be seen that (48) is independent of $n$ and $W$, but is still a function of the kurtosis $\Phi$. Recall that the kurtosis in the EGN model is derived under the assumption of i.i.d. symbols; hence (48) means that EDI gives the same indication of the NLI as kurtosis does in the case of i.i.d. symbols.

We have derived a closed-form expression for the EDI in (47) (and for i.i.d. symbols in (48)).
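As a rough numerical companion to (34), (47) and (48), the following sketch evaluates the closed-form EDI with exact rational arithmetic. The function names and the example composition (amplitudes 1, 3, 5, 7 with counts 4, 3, 2, 1) are illustrative assumptions and are not taken from the text; the symbol energy is left unnormalized, and the same PMF is simply reused for every blocklength.

```python
from fractions import Fraction


def energy_moments(composition):
    """Return (E[|X|^2], E[|X|^4]) for an amplitude composition {a: n_a}.

    Uses P_A(a) = n_a / n together with (24), (26), (27) and (28).
    """
    n = sum(composition.values())
    m2 = sum(Fraction(cnt, n) * a**2 for a, cnt in composition.items())
    m4 = sum(Fraction(cnt, n) * a**4 for a, cnt in composition.items())
    return 2 * m2, 2 * m4 + 2 * m2**2


def avg_autocorr(tau, n, e2, e4):
    """Average autocorrelation of the symbol energies, following (34)."""
    tau = abs(tau)
    if tau == 0:
        return e4
    if tau >= n:
        return e2**2
    rho = (n * e2**2 - e4) / (n - 1)  # (30)
    return (tau * e2**2 + (n - tau) * rho) / n


def edi_ccdm(n, window, e2, e4):
    """EDI of CC sequences, following (47); `window` is W (W + 1 symbols)."""
    kurtosis = e4 / e2**2
    weighted = sum((window - tau + 1) * avg_autocorr(tau, n, e2, e4)
                   for tau in range(1, window + 1))
    return e2 * (kurtosis - (window + 1)) + 2 * weighted / ((window + 1) * e2)


def edi_iid(e2, e4):
    """EDI of i.i.d. sequences, following (48)."""
    return e2 * (e4 / e2**2 - 1)


# Hypothetical PMF, chosen only for illustration.
composition = {1: 4, 3: 3, 5: 2, 7: 1}
e2, e4 = energy_moments(composition)
for n in (10, 100, 1000):
    print(n, float(edi_ccdm(n, 100, e2, e4)), float(edi_iid(e2, e4)))
```

Exact fractions are used here so that identities such as the $W = 0$ case reducing to (48) can be checked without floating-point tolerance; for large $W$ a floating-point version would be faster.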
In the next section, we will investigate properties of the EDI, and compare them against the estimated EDI from Monte Carlo simulations.

C. Properties
In what follows, we first show that for certain values of blocklength $n$ and window length $W$, EDI depends linearly on $n$. This corresponds to a regime where $W$ (and thus, implicitly the channel memory) is larger than $n$. We then give bounds on EDI for arbitrary values of $n$ and $W$.

Theorem 6: When $n \leq W + 2$, the EDI $\Psi(n,W)$ in (47) depends linearly on $n$ via
$$\Psi_{\mathrm{lin}}(n,W) = \frac{n+1}{3(W+1)}\,\mathbb{E}\left[|X|^2\right] (\Phi - 1). \tag{49}$$

Proof:
See Appendix C.
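The linear regime of Theorem 6 can be checked numerically without relying on (47). The sketch below sums the covariances of Theorem 2 directly over a sliding window and compares the resulting EDI with (49). It is only an illustration under stated assumptions: the function names are hypothetical, the window is indexed as $i, \ldots, i+W$ (a shift of (36) that does not change the average over one period), and the moment values are made up for the example.

```python
from fractions import Fraction


def avg_window_variance(n, window, var_e):
    """Average of Var[G_i^W] over one period, by direct summation.

    Covariances follow Theorem 2: var_e for a symbol with itself,
    -var_e / (n - 1) between different symbols of the same block,
    and 0 across blocks. Requires n >= 2.
    """
    cov_block = -var_e / (n - 1)
    total = Fraction(0)
    for i in range(n):
        idx = range(i, i + window + 1)
        for j in idx:
            for k in idx:
                if j == k:
                    total += var_e
                elif j // n == k // n:  # j and k fall in the same block
                    total += cov_block
    return total / n


def edi_direct(n, window, e2, var_e):
    """EDI as average window variance over average window mean, cf. (39), (42)."""
    return avg_window_variance(n, window, var_e) / ((window + 1) * e2)


def edi_linear(n, window, e2, var_e):
    """Linear EDI of (49), using E[|X|^2](Phi - 1) = Var[|X|^2] / E[|X|^2]."""
    return Fraction(n + 1, 3 * (window + 1)) * var_e / e2


# Hypothetical moments: unit energy and an arbitrary energy variance.
e2, var_e = Fraction(1), Fraction(13, 20)
W = 6
for n in range(2, 11):
    print(n, edi_direct(n, W, e2, var_e) == edi_linear(n, W, e2, var_e))
```

With these inputs the two expressions should coincide for $n \leq W + 2$ and separate once $n$ exceeds $W + 2$, which is the segmentation described by Theorem 6.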
Corollary 7: For any finite blocklength $n$, the EDI in (47) is upper- and lower-bounded as
$$0 \leq \Psi(n,W) \leq \mathbb{E}\left[|X|^2\right] (\Phi - 1). \tag{50}$$
The bounds are achieved for asymptotic values of $W$, i.e.,
$$\lim_{W \to 0} \Psi(n,W) = \mathbb{E}\left[|X|^2\right] (\Phi - 1), \tag{51}$$
$$\lim_{W \to \infty} \Psi(n,W) = 0. \tag{52}$$

Proof:
See Appendix D.

The lower bound in (52) can be intuitively understood as follows. When $W$ is much larger than $n$, the windows always include multiple complete blocks of QAM symbols, and the composition of amplitudes within a large window “hardens”, yielding a reduced fluctuation of the window energies. The upper bound in (51) is identical to the EDI of i.i.d. symbol sequences in (48). This upper bound indicates that as the window length decreases (less memory in the metric), the impact of symbol energy correlations on EDI decreases.

Note that the EDI for constant-modulus constellations (e.g., phase-shift keying) is identically 0, because the symbols have constant energy and the sliding window energy is thus constant. This reflects the fact that EDI is a metric specifically designed to capture NLI fluctuations in PAS systems with different shaping blocklengths, and such systems cannot be designed using constant-modulus constellations.

[Fig. 8. Simulation results (markers) using (53), analytical results (solid lines) in (47) and (48), and linear EDI in (49) (in dB) vs. blocklength for normalized ($\mathbb{E}\left[|X|^2\right] = 1$) $P_A$ = [0. , . , . , . ]. Horizontal axis: blocklength $n$; vertical axis: EDI [dB]. Legend: $\hat{\Psi}$, Eq. (53) (exact and emulated); $\Psi_{\mathrm{lin}}$, Eq. (49); $\Psi$, Eq. (47); $W = 10$, $W = 30$, $W = 100$, $W = 1000$; upper bound, Eq. (51); $n = W + 2$; uniform, Eq. (48).]

Example 5 (EDI and Blocklength):
Fig. 8 shows the EDI (in dB) of symbol sequences for different values of $n$ and $W$, as well as the EDI for i.i.d. uniform 64QAM sequences. The analytical results of EDI are computed by using (47), (48) and (49), and the EDI is also estimated as
$$\hat{\Psi} \triangleq \frac{\sigma^2_{G^W}}{\mu_{G^W}}, \tag{53}$$
where $\sigma^2_{G^W}$ and $\mu_{G^W}$ are the estimated windowed energy variance and the estimated windowed energy mean, respectively. Fig. 8 shows that all the simulation results $\hat{\Psi}$ match the analytical results $\Psi$ in (47) very well. This is the case even for a short blocklength ($n = 10$), indicating that the CCDM emulation approach we took in this paper has little impact on the EDI in (47). Fig. 8 also shows the linear EDI expression $\Psi_{\mathrm{lin}}$ in (49). The EDI curves are segmented by $n = W + 2$, to the left of which EDI depends linearly on the blocklength. As $n$ increases above $W + 2$, $\Psi$ begins to diverge from $\Psi_{\mathrm{lin}}$ in (49). For a fixed $n$, as $W$ increases, EDI approaches the lower bound in (52) (i.e., $-\infty$ in dB). On the other hand, EDI gradually reaches the upper bound in (51) as $W \to 0$.

It can be concluded from Example 5 that for CCDM QAM symbol sequences, if a symbol-level interleaver or a very long blocklength is used, the EDI will approach the upper bound in (51), which is determined by the kurtosis. Hence, EDI can be viewed as a windowed version of kurtosis. For the purpose of NLI prediction, the window length $W$ should be carefully chosen such that the channel memory effect is properly reflected, as will be shown in the following section.

We conclude this section by emphasizing that Example 5 (and also Example 3) showed that the CCDM emulation approach of considering all $N_C$ permutations as codewords gives very precise results. This assumption allowed us to find closed-form expressions for the average autocorrelation (see (34)) and the EDI (see (47)). In the next section, we therefore show results using the analytical expressions we have developed above, and thus, only consider emulated CCDM.

TABLE II
Simulation Parameters.
Parameter | Value
Modulation | QAM
Pol.-mux. | Single
Wavelength ($\lambda$) | 1550 nm
Symbol rate | GBd
WDM spacing ($\Delta f$) | 50 GHz
WDM channels ($N_{\mathrm{ch}}$) | 5
Pulse shape | root-raised cosine
Roll-off |
Fiber length | km
Fiber loss | . dB/km
Dispersion parameter ($D$) | ps/nm/km
Nonlinear parameter ($\gamma$) | . dB
Oversampling factor | ×
QAM symbols per run |
No. of simulation runs | 10
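The estimator $\hat{\Psi}$ in (53) used in Example 5 can be sketched as a short Monte Carlo routine. This is a minimal illustration, not the simulation code behind Fig. 8: the function names, the seed and the example composition are assumptions, and the emulated CCDM is modelled by drawing independent uniformly random permutations of the amplitude multiset for the I and Q branches.

```python
import random


def emulated_cc_energies(composition, num_blocks, rng):
    """Symbol energies of emulated CCDM QAM symbols.

    Each block draws two independent random permutations of the same
    amplitude multiset (one for I, one for Q), so every permutation is
    treated as a codeword.
    """
    base = [a for a, cnt in composition.items() for _ in range(cnt)]
    energies = []
    for _ in range(num_blocks):
        in_phase = rng.sample(base, len(base))
        quadrature = rng.sample(base, len(base))
        energies.extend(i * i + q * q for i, q in zip(in_phase, quadrature))
    return energies


def estimate_edi(energies, window):
    """Sample EDI as in (53): variance over mean of sliding window energies."""
    width = window + 1  # a window of length W covers W + 1 symbols
    current = sum(energies[:width])
    sums = [current]
    for t in range(width, len(energies)):
        current += energies[t] - energies[t - width]  # slide by one symbol
        sums.append(current)
    mean = sum(sums) / len(sums)
    var = sum((g - mean) ** 2 for g in sums) / len(sums)
    return var / mean


rng = random.Random(1)  # fixed seed, chosen arbitrarily
composition = {1: 4, 3: 3, 5: 2, 7: 1}  # hypothetical composition, n = 10
energies = emulated_cc_energies(composition, num_blocks=20000, rng=rng)
print(estimate_edi(energies, window=100))
```

Because the energies are not normalized here, the estimate is to be compared with (47) or (49) evaluated for the same unnormalized moments; a constant-modulus sequence gives an estimate of exactly zero.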
V. NUMERICAL RESULTS
In this section, we show that EDI can qualitatively predict the NLI magnitude when different blocklengths are used. To this end, we study the effective SNR in (4), where NLI is a substantial part of the total noise at relatively high power, and where NLI changes produce a change in effective SNR.
A. Simulation Setup
We consider an ideal single-polarization multi-span WDM fiber system with $N_{\mathrm{ch}} = 5$ channels. Nonlinear noise caused by the Kerr effect, as well as ASE and CD, are taken into consideration. The fiber propagation is simulated using the split-step Fourier method with a step size of m. Other key simulation parameters are displayed in Table II. The channel of interest is located at the center of the WDM spectrum, where the channel spacing is $\Delta f = 50$ GHz. The signal is generated with root-raised cosine pulse shaping. After propagation over each span of standard single-mode fiber with span length 80 km, the attenuation is ideally compensated by an Erbium-doped fiber amplifier (EDFA). At the receiver, the channel of interest is filtered with a matched filter, followed by CD compensation and sampling.

The one-sided channel memory $M$ is an important reference for the window length $W$ in the EDI. $M$ can be estimated using (1) and (2). In (2), the optical bandwidth is $\Delta\omega = N_{\mathrm{ch}} \Delta f$, while the group velocity dispersion $\beta_2$ is related to the dispersion parameter $D$ as shown in [16, Eq. (1.2.11)].

For shaped 64QAM transmission, the amplitude PMF $P_A$ = [0. , . , . , . ] is used. Note that EDI can in principle be used for an arbitrary PMF. Quadrature phase-shift keying (QPSK) and uniform 64QAM are presented as baselines. QPSK is anticipated to provide optimal effective SNR performance, since constant-modulus constellations completely remove the modulation-dependent NLI [22]. We also consider shaped 64QAM symbol sequences using a randomly generated symbol-level interleaver to emulate i.i.d. symbol sequences. The symbol sequences are always normalized to $\mathbb{E}\left[|X|^2\right] = 1$.

An overview of the NLI-related metrics is given in Table III. PAPR and kurtosis do not depend on the blocklength, and thus, these two metrics cannot predict the NLI differences caused by blocklength differences. Although run ratio can, to some extent, capture the blocklength dependency of the NLI, it does not take channel memory into account, and thus, cannot adapt to different distances. By contrast, EDI uses a window length depending on the distance.

TABLE III
NLI Metrics of the Evaluated Symbol Sequences. The results for $R_r$ are taken from [25, Fig. 8].
Metrics | CC PS 64QAM | i.i.d. PS 64QAM | Uniform 64QAM | QPSK
$\Theta$, PAPR | 3.769 | 3.769 | 2.336 | 1
$\Phi$, Kurtosis | 1.653 | 1.653 | 1.381 | 1
$R_r$, Run Ratio | ≈ .98 | 0.978 | 0.984 | 0.

With various blocklengths $n$, many pairs of effective SNR and EDI can be obtained. To quantify the correlation between effective SNR and EDI, Pearson’s correlation coefficient [50, Ch. 11.1] is used, i.e.,
$$r_p \triangleq \frac{\mathrm{Cov}\left[\mathrm{SNR}_{\mathrm{eff}}, \hat{\Psi}\right]}{\sqrt{\mathrm{Var}\left[\mathrm{SNR}_{\mathrm{eff}}\right] \mathrm{Var}\left[\hat{\Psi}\right]}}. \tag{54}$$
Coefficient values of $+1$ or $-1$ indicate perfect correlation, while $0$ indicates no correlation.

B. EDI and Effective SNR
We first investigate how EDI is correlated to effective SNR. Fig. 9 displays effective SNR and EDI vs. blocklength. At each transmission distance, the optimal launch power found for $n = 10$ is used (see Fig. 12 ahead). The effective SNRs achieved for $n = 10$ and $n = 10000$ at these launch powers are shown with filled triangles in Fig. 9. EDI is calculated using the optimal window lengths $W^*$, which will be discussed in Fig. 10. Fig. 9 shows that the estimated EDIs (circles) are in good agreement with the analytical EDI in (47) (solid lines), despite slight fluctuations due to the limited number of transmitted symbols. Effective SNR almost follows the same trend as that of EDI, and the absolute values of their Pearson correlation coefficients at the three distances are close to 1. Based on Theorem 6, the x-axes in Fig. 9 can be divided into two regions: $n < W^* + 2$ and $n > W^* + 2$. The blocklength-dependent region on the left shows that the effective SNR varies linearly with blocklength $n$. As $n$ increases and enters the region on the right, the SNR begins to decrease slowly until exhibiting marginal differences, since for long blocklengths the NLI reduction brought by weakly-correlated symbol energies becomes insignificant. In this region, EDI is determined by kurtosis, and the EGN model is able to give accurate predictions of effective SNR, as demonstrated in [25, Fig. 7].

[Fig. 9. Effective SNR (left axis) and EDI (right axis) vs. blocklength after transmission distances of (a) 80 km, (b) 320 km and (c) 1600 km. The launch powers are (a) − . dBm, (b) − . dBm and (c) − . dBm. Error bars for effective SNRs represent the 95% confidence interval. The EDI is shown in dB and inverted for convenience of comparison. The optimal window lengths $W^*$ shown in the figure are used for the EDI calculation: (a) $W^* + 2 = 32$, (b) $W^* + 2 = 152$, (c) $W^* + 2 = 1002$. The correlation coefficients $r_p$ in (54) are (a) − . , (b) − . and (c) − . . Horizontal axis: blocklength $n$; vertical axes: effective SNR [dB] and EDI [dB]. Legend: $\Psi_{\mathrm{lin}}$, Eq. (49); $\Psi$, Eq. (47); $\hat{\Psi}$, Eq. (53); $\mathrm{SNR}_{\mathrm{eff}}$, Eq. (5).]

The optimal window length $W^*$ used in Fig. 9 was chosen such that EDI yields the highest correlation with effective SNR. To this end, $W^*$ is obtained by analyzing various window lengths $W$ at each transmission distance, for which the absolute value of the correlation coefficient $|r_p|$ is shown in Fig. 10. It can be seen that $|r_p|$ reaches its peak at $W^*$, which is much smaller than the estimated channel memory $2M$ given by (1)–(2). Incorrectly choosing $W$ can lead to smaller values of $|r_p|$, and thus the SNR prediction by EDI is less accurate. However, Fig. 10 also shows that even if $2M$ is used as window length, EDI can still reflect the SNR variations with high correlation coefficients. Thus, in practice one can directly use the estimated channel memory $2M$ as window length rather than finding the optimal one $W^*$. In general, Fig. 9 and Fig. 10 show that EDI and effective SNR are highly correlated with each other, indicated by a nearly perfect negative correlation. It can be concluded that EDI evaluated with the optimal window length is capable of reflecting the impact of blocklength and distance on the NLI.

[Fig. 10. Absolute value of Pearson’s linear correlation coefficient $|r_p|$ in (54) between EDI and effective SNR vs. window length $W$ for 80 km, 320 km and 1600 km, whose launch powers are − . dBm, − . dBm and − . dBm, respectively. The optimal value of $W$ is denoted by $W^*$. The channel memories $2M$ calculated using (1)–(2) are also indicated: $W^* = 30$, $2M = 216$ (80 km); $W^* = 150$, $2M = 866$ (320 km); $W^* = 1000$, $2M = 4332$ (1600 km).]

Fig. 11 shows the relationship between $W^*$ and the estimated channel memory $2M$ at various transmission distances. All $W^*$ are obtained with high absolute correlation coefficients $|r_p|$. It can be seen in Fig. 11 that $2M$ scales linearly with distance, as $M$ is computed by using (2). Likewise, $W^*$ also increases approximately linearly but at a slower rate. The optimal window length $W^*$ is smaller than the estimated number of interfering symbols $2M$, which can be explained by two facts. First, (2) is a rough estimation of the channel memory. Second, the symbol energies within the EDI window are assumed to have equal contributions to the NLI (as shown in (36), where all symbol energies are weighted equally). Therefore, $W^*$ generally represents an effective number of dominant interfering symbols that are involved in the NLI generation.

[Fig. 11. The estimated channel memory $2M$ calculated using (1)–(2) and the optimal window length $W^*$ at various transmission distances. Horizontal axis: transmission distance [km]; vertical axis: number of symbols.]

To conclude, we show effective SNR vs. launch power in Fig. 12, which shows that the impact of the shaping blocklength on effective SNR can be as large as that of the modulation format. For simplicity, CCDM 64QAM symbol sequences are only shown with the ultra short blocklength $n = 10$ and the long blocklength $n = 10000$, in general representing “good” and “bad” NLI-tolerant blocklengths. The EDI of different symbol sequences are given in Table IV. The filled triangles in Fig. 12 for $n = 10$ and $n = 10000$ correspond to the same effective SNR markers as shown in Fig. 9. The first observation from Fig. 12 (a)–(c) is that QPSK exhibits the best effective SNR for the distances under consideration, as it has the smallest EDI. Secondly, the effective SNRs of i.i.d. 64QAM symbol sequences and the shaped symbol sequence with $n = 10000$ almost coincide for all distances. Meanwhile, their EDIs have marginal differences. Thirdly, the effective SNR of uniform 64QAM falls between the effective SNRs of shaped 64QAM using $n = 10$ and $n = 10000$, and so does its EDI. Lastly, Fig. 12 shows that the SNR gains offered by CCDM symbol sequences using $n = 10$ instead of $n = 10000$ decrease as the transmission distance increases.

[Fig. 12. Effective SNR vs. launch power after transmission distances of (a) 80 km, (b) 320 km and (c) 1600 km. PAS 64QAM with blocklength $n = 10$ and $n = 10000$ are displayed. Results of uniform 64QAM, i.i.d. PS 64QAM symbols, and QPSK are also included as references. The circled SNR performance corresponds to the launch powers used in Fig. 9 (a)–(c), respectively. Horizontal axis: launch power [dBm]; vertical axis: effective SNR [dB].]

TABLE IV
EDI $\Psi$ (in dB) of the Evaluated Symbol Sequences
Distance | CC PS 64QAM, n = 10 | CC PS 64QAM, n = 10000 | i.i.d. PS | Uniform | QPSK
80 km | − . | − . | − . | − . | −∞
320 km | − . | − . | − . | − . | −∞
1600 km | − . | − . | − . | − . | −∞

VI. CONCLUSIONS
This paper proposed a new heuristic metric called energy dispersion index (EDI) to predict the impact of blocklength on the effective SNR for CCDM-coded systems. EDI is a measure of the windowed energy dispersion which captures the interaction between the statistical properties of a CCDM input sequence and the channel memory with respect to the received NLI magnitude. Numerical results show that the effective SNR is highly correlated to the EDI of the transmitted symbol sequence.

Being a heuristic metric, the EDI requires future analytical substantiation, possibly leading to a refined version thereof. One possible improvement of the EDI accuracy could be appropriately weighting the symbol energies within the EDI window to reflect their uneven contributions to the NLI. In this paper, we only studied EDI for CCDM, and thus, a performance analysis of EDI for other shaping algorithms or other constellations is still pending. All these open problems are left for future work. Nevertheless, we believe that EDI can facilitate the development of NLI-tolerant signaling schemes that aim to optimize time-varying statistical properties of the input symbol sequence.

ACKNOWLEDGMENTS
The authors would like to thank Dr. Yunus Can Gültekin and Sebastiaan Goossens (Eindhoven University of Technology) for fruitful discussions on shaping techniques.

APPENDIX A
PROOF OF THEOREM 2
We start from the autocorrelation $R_E(i,\tau)$ in (21). For $i = 0, 1, \ldots, n-1$ (one period), $R_E(i,\tau)$ is given by
$$R_E(i,\tau) = \begin{cases} \mathbb{E}\left[|X|^4\right], & \text{if } \tau = 0 \\ \rho, & \text{if } \tau \neq 0,\ 0 \leq i + \tau < n \\ \mathbb{E}^2\left[|X|^2\right], & \text{if } \tau \neq 0,\ i + \tau \geq n \text{ or } i + \tau < 0 \end{cases} \tag{55}$$
The second case in (55) is when two different symbol energies belong to the same block. In what follows we will prove that in this case symbol energies are equally correlated and yield the same autocorrelation $\rho$ in (30), and also the same autocovariance. By using Lemma 1, (16) and (18), for all $i, j \in \{0, 1, \ldots, n-1\}$ with $i \neq j$ and $\tau = j - i$, we can write
$$\mathrm{Cov}[E_i, E_j] = R_E(i, j-i) - \mathbb{E}^2[E_i] \tag{56}$$
$$= \mathbb{E}\left[(A_{I,i}^2 + A_{Q,i}^2)(A_{I,j}^2 + A_{Q,j}^2)\right] - 4\,\mathbb{E}^2\left[A^2\right] \tag{57}$$
$$= 2\,\mathbb{E}\left[A_i^2 A_j^2\right] - 2\,\mathbb{E}^2\left[A^2\right] \tag{58}$$
$$= 2 \sum_{a,b \in \mathcal{A}} P_{A_i,A_j}(a,b)\, a^2 b^2 - 2\,\mathbb{E}^2\left[A^2\right], \tag{59}$$
where (57) follows from (20), (22) and (24), and (58) from the fact that the amplitudes in the I/Q branches in Fig. 1 are independent of each other. It can be seen in (59) that $R_E(i, j-i)$ and $\mathrm{Cov}[E_i, E_j]$ are determined by the joint probability $P_{A_i,A_j}$, i.e.,
$$P_{A_i,A_j}(a,b) = P_{A_i}(a) \cdot P_{A_j|A_i}(b|a), \tag{60}$$
where $P_{A_i}(a) = n_a/n$ (see (15)). For a given $A_i = a$, the amplitude composition at time $j$ is updated, and thus,
$$P_{A_j|A_i}(b|a) = \begin{cases} \dfrac{n_a - 1}{n - 1}, & \text{if } b = a \\ \dfrac{n_b}{n - 1}, & \text{if } b \neq a. \end{cases} \tag{61}$$
The joint probability $P_{A_i,A_j}$ in (60) is thus independent of $i$ and $j$, and so are $R_E(i, j-i) = \rho$ and $\mathrm{Cov}[E_i, E_j]$.

Finally, based on the fact that $\rho$ is a constant, we compute $\rho$ and the autocovariance. Similar to (12), the total energy of a symbol energy block is a constant, i.e.,
$$\sum_{i=0}^{n-1} E_i = 2 \sum_{i=0}^{n-1} A_i^2 = 2 \sum_{a \in \mathcal{A}} a^2 n_a. \tag{62}$$
In analogy to (13), (62) shows that these $n$ symbol energies also satisfy a linear relationship. By taking the expectation on both sides of (62), we have
\[
\sum_{i=0}^{n-1} \mathbb{E}[E_i] = 2 \sum_{a \in \mathcal{A}} a^2 n_a. \tag{63}
\]
Subtracting (63) from (62) yields
\[
\sum_{i=0}^{n-1} \left(E_i - \mathbb{E}[E_i]\right) = 0, \tag{64}
\]
which can be rewritten as
\[
E_j - \mathbb{E}[E_j] = - \sum_{\substack{j'=0 \\ j' \neq j}}^{n-1} \left(E_{j'} - \mathbb{E}[E_{j'}]\right). \tag{65}
\]
By using (17) and (65), $\mathrm{Cov}[E_i, E_j]$ is expanded, i.e.,
\begin{align}
\mathrm{Cov}[E_i, E_j] &= \mathbb{E}\Bigg[\left(E_i - \mathbb{E}[E_i]\right)\Bigg(-\sum_{\substack{j'=0 \\ j' \neq j}}^{n-1} \left(E_{j'} - \mathbb{E}[E_{j'}]\right)\Bigg)\Bigg] \tag{66} \\
&= -\mathbb{E}\left[\left(E_i - \mathbb{E}[E_i]\right)^2\right] - \sum_{\substack{j'=0 \\ j' \neq j, i}}^{n-1} \mathbb{E}\left[\left(E_i - \mathbb{E}[E_i]\right)\left(E_{j'} - \mathbb{E}[E_{j'}]\right)\right] \tag{67} \\
&= -\mathrm{Var}[E_i] - \sum_{\substack{j'=0 \\ j' \neq j, i}}^{n-1} \mathrm{Cov}[E_i, E_{j'}]. \tag{68}
\end{align}
Since the symbol energies within a block are equally correlated, each of the $n-2$ terms in the sum in (68) equals $\mathrm{Cov}[E_i, E_j]$. Therefore, (68) can be rewritten as
\[
\mathrm{Cov}[E_i, E_j] + (n-2)\,\mathrm{Cov}[E_i, E_j] = -\mathrm{Var}[E_i]. \tag{69}
\]
By using (23) in (69), the equality in (31) is obtained. The inequality in (31) follows from the fact that the variance $\mathrm{Var}\left[|X|^2\right]$ is positive. The last step in the proof is to show that the autocorrelation $R_E(i, \tau) = \rho$ is given by (30). This follows from substituting (31) into (18). Because of the negative autocovariance, it can be observed that $\rho$ is smaller than $\mathbb{E}^2\left[|X|^2\right]$. This completes the proof.

APPENDIX B
PROOF OF THEOREM

\begin{align}
\mathbb{E}\left[G^W\right] &= \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=i-W/2}^{i+W/2} \mathbb{E}\left[|X_j|^2\right] \tag{70} \\
&= \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=i-W/2}^{i+W/2} \mathbb{E}\left[|X|^2\right] \tag{71} \\
&= (W+1)\,\mathbb{E}\left[|X|^2\right], \tag{72}
\end{align}
where (70) uses the linearity of expectation, and (71) follows from (22).

Fig. 13. An illustration of possible combinations of $j$ and $j'$ for the double summations of the last term in (76). The dots at each diagonal direction represent $W - \tau + 1$ possible pairs of $j$ and $j'$ when $j - j' > 0$.

To prove (43), we begin with $\mathrm{Var}\left[G_i^W\right]$. Using (36), Lemma 1, and [48, Thm. 9.2], we have
\begin{align}
\mathrm{Var}\left[G_i^W\right] &= \sum_{j=i-W/2}^{i+W/2} \mathrm{Var}\left[|X_j|^2\right] + 2 \sum_{j=i-W/2}^{i+W/2-1} \sum_{j'=j+1}^{i+W/2} \mathrm{Cov}[E_j, E_{j'}] \tag{73} \\
&= (W+1)\,\mathrm{Var}\left[|X|^2\right] + 2 \sum_{j=i-W/2}^{i+W/2-1} \sum_{j'=j+1}^{i+W/2} \mathrm{Cov}[E_j, E_{j'}] \tag{74} \\
&= (W+1)\,\mathrm{Var}\left[|X|^2\right] + 2 \sum_{j=i-W/2}^{i+W/2-1} \sum_{j'=j+1}^{i+W/2} \left[R_E(j, j'-j) - \mathbb{E}^2\left[|X|^2\right]\right] \tag{75} \\
&= (W+1)\,\mathrm{Var}\left[|X|^2\right] - W(W+1)\,\mathbb{E}^2\left[|X|^2\right] + 2 \sum_{j=i-W/2}^{i+W/2-1} \sum_{j'=j+1}^{i+W/2} R_E(j, j'-j), \tag{76}
\end{align}
where (74) follows from (23), and (75) from (18).

The first two terms in (76) are known, since $\mathrm{Var}\left[|X|^2\right]$ and $\mathbb{E}\left[|X|^2\right]$ are given in (25) and (24). Meanwhile, $R_E(j, j'-j)$ depends on the time instant $j$ and the delay $\tau = j' - j > 0$. Therefore, based on (41), only the last term in (76) needs to be considered for averaging over $n$ time instants, i.e.,
\begin{align}
& \frac{2}{n} \sum_{i=0}^{n-1} \sum_{j=i-W/2}^{i+W/2-1} \sum_{j'=j+1}^{i+W/2} R_E(j, j'-j) \tag{77} \\
&= 2 \sum_{\tau=1}^{W} \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=i-W/2}^{i+W/2-\tau} R_E(j, \tau) \tag{78} \\
&= 2 \sum_{\tau=1}^{W} (W - \tau + 1)\, R_E(\tau), \tag{79}
\end{align}
where (78) can be explained by observing Fig. 13, which shows the possible pairs of $j$ and $j'$ and the corresponding values of $\tau$. To obtain (79), we observe in Fig. 13 that for each value of $\tau$, there are $W - \tau + 1$ pairs in the corresponding diagonal, whose $R_E(j, \tau)$ are averaged. Hence, (79) is obtained.
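The counting argument behind (77)-(79) can be checked numerically. The sketch below is an illustration only. The function names and the toy autocorrelation `toy_R` are hypothetical and not taken from the paper; `toy_R` is made periodic in the time index by evaluating it at `j % n`. The window is indexed from `i` to `i + W` instead of `i - W/2` to `i + W/2`, which does not change an average over one full period. The common factor 2 in (77)-(79) is omitted on both sides.

```python
from fractions import Fraction


def lag_counts(W):
    """Count pairs (j, j') with j < j' inside a window of W + 1 positions, keyed by lag."""
    counts = {}
    for j in range(W + 1):
        for jp in range(j + 1, W + 1):
            counts[jp - j] = counts.get(jp - j, 0) + 1
    return counts


def averaged_pair_sum(R, n, W):
    """Left-hand side: pair sum of R over the window, averaged over n window positions."""
    total = Fraction(0)
    for i in range(n):
        for j in range(i, i + W + 1):
            for jp in range(j + 1, i + W + 1):
                total += R(j % n, jp - j)   # R is periodic in its first argument
    return total / n


def lag_weighted_sum(R, n, W):
    """Right-hand side: sum over lags of (W - tau + 1) times the period-averaged R."""
    total = Fraction(0)
    for tau in range(1, W + 1):
        r_bar = Fraction(sum(R(j, tau) for j in range(n)), n)
        total += (W - tau + 1) * r_bar
    return total


def toy_R(j, tau):
    """Arbitrary integer-valued stand-in for R_E(j, tau); purely illustrative."""
    return 3 * j * j + 7 * tau + j * tau


print(lag_counts(6))
print(averaged_pair_sum(toy_R, 5, 6) == lag_weighted_sum(toy_R, 5, 6))
```

The first function reproduces the "$W - \tau + 1$ pairs per diagonal" observation, and the comparison confirms that averaging over one period turns the double sum into a single sum over lags.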
Finally, $\mathrm{Var}\left[G^W\right]$ in (43) is simply the sum of the first two terms in (76) and (79), which completes the proof.

APPENDIX C
PROOF OF THEOREM

When $n \leq W + 2$, the computation of $\mathrm{Var}\left[G_i^W\right]$ can be simplified by considering the symbol energy "pattern" inside the window. This leads to a simpler expression for the EDI in (49).

Given a window length $W$, the window has $W + 1$ symbol energies. This window in general covers multiple symbol energy blocks, and thus we can write
\[
W + 1 = un + r, \tag{80}
\]
where $u \in \mathbb{N}$ is the maximum number of complete symbol energy blocks covered by the window, and $r$ is the remainder when $W + 1$ is divided by $n$ ($0 \leq r \leq n - 1$).

As the window slides, the symbol energy "pattern" covered by the window varies cyclically with a period $n$, which is illustrated in Fig. 14. The pattern shows that the sum of the symbol energies with a shaded background is constant, whereas the energy of the incomplete blocks located at the two edges of the window is random. Therefore, the constant part of the energy does not contribute to the window energy variance, and can thus be removed in the computation of the variance. For example, the pattern at the top of Fig. 14 consists of $u$ complete blocks at the right side, while at its left edge there are $r$ symbols from the adjacent block. Hence, the variance of the top pattern is simplified by only considering these $r$ symbols. Such a simplification can be done by using (76). After substituting $W + 1$ with $r$, and $R_E(j, j'-j)$ with $\rho$, we have
\[
\mathrm{Var}\left[G_i^W\right] = r\,\mathrm{Var}\left[|X|^2\right] - r(r-1)\,\mathbb{E}^2\left[|X|^2\right] + r(r-1)\,\rho. \tag{81}
\]
As long as $W + 1 \geq n - 1$, it can be concluded from Fig. 14 that: (i) two edges with random symbol energies contribute to the variance, and (ii) these $n$ patterns keep $0, 1, \ldots, n-1$ mutually correlated symbol energies at both edges, respectively.
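The single-edge variance in (81) can be sanity-checked by brute force on a very small block. The sketch below is only an illustration under assumptions that are not part of the paper: the amplitude multiset `amps` is hypothetical, all distinct arrangements of the block are taken as equiprobable, the I and Q branches are independent with the same composition, and $\rho$ is computed from the in-block autocovariance $-\mathrm{Var}[|X|^2]/(n-1)$ implied by (69).

```python
from fractions import Fraction
from itertools import permutations


def block_stats(amps):
    """Exact mean and variance of a symbol energy, and the in-block correlation rho."""
    n = len(amps)
    m2 = Fraction(sum(a**2 for a in amps), n)
    m4 = Fraction(sum(a**4 for a in amps), n)
    mean_e = 2 * m2                       # E[|X|^2] = 2 E[A^2]
    var_e = 2 * (m4 - m2**2)              # Var[|X|^2] = 2 Var[A^2]
    rho = mean_e**2 - var_e / (n - 1)     # assumption: Cov = -Var / (n - 1), see (69)
    return mean_e, var_e, rho


def partial_sum_variance(amps, r):
    """Brute-force variance of the sum of the first r symbol energies in a block."""
    arrangements = list(set(permutations(amps)))   # distinct, equiprobable sequences
    sums = [
        sum(seq_i[k]**2 + seq_q[k]**2 for k in range(r))
        for seq_i in arrangements
        for seq_q in arrangements
    ]
    mean = Fraction(sum(sums), len(sums))
    return Fraction(sum(s * s for s in sums), len(sums)) - mean**2


def formula_81(amps, r):
    """Right-hand side of (81) for r symbol energies taken from a single block."""
    mean_e, var_e, rho = block_stats(amps)
    return r * var_e - r * (r - 1) * mean_e**2 + r * (r - 1) * rho


amps = (1, 1, 3, 5)   # hypothetical block with n = 4 amplitudes
for r in range(len(amps) + 1):
    print(r, partial_sum_variance(amps, r) == formula_81(amps, r))
```

The enumeration grows factorially with the blocklength, so this check is only practical for toy blocks. Note that the variance vanishes for $r = 0$ and $r = n$, in line with the observation that complete blocks carry constant energy.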
Under these circumstances, $\mathrm{Var}\left[G^W\right]$ is obtained by averaging the variance over $n$ patterns, i.e.,
\begin{align}
\mathrm{Var}\left[G^W\right] &= \frac{2}{n} \sum_{i=0}^{n-1} \left[ i\,\mathrm{Var}\left[|X|^2\right] - i(i-1)\,\mathbb{E}^2\left[|X|^2\right] + i(i-1)\,\rho \right] \tag{82} \\
&= \frac{2\,\mathrm{Var}\left[|X|^2\right]}{n(n-1)} \sum_{i=0}^{n-1} \left(ni - i^2\right) \tag{83} \\
&= \frac{(n+1)\,\mathrm{Var}\left[|X|^2\right]}{3}, \tag{84}
\end{align}
where (83) follows from (18) and (31). By substituting (84) and (42) into (39), the EDI in (49) is obtained.

Fig. 14. An illustration of four symbol energy patterns as the window slides to the right through the symbol energy sequence. The shaded area represents $u$ complete blocks.

APPENDIX D
PROOF OF COROLLARY
$\mathrm{Var}\left[G^W\right]$, which directly determine the bounds on EDI. For any finite blocklength $n$, the $\mathrm{Var}\left[G^W\right]$ of the CCDM QAM symbol sequence satisfies
\[
0 \leq \mathrm{Var}\left[G^W\right] \leq (W+1)\,\mathrm{Var}\left[|X|^2\right]. \tag{85}
\]
The variance is always nonnegative, hence the left inequality in (85) is true. Based on (74), due to the negative autocovariance from (31), we have
\begin{align}
\mathrm{Var}\left[G_i^W\right] &= (W+1)\,\mathrm{Var}\left[|X|^2\right] + 2 \sum_{j=i-W/2}^{i+W/2-1} \sum_{j'=j+1}^{i+W/2} \mathrm{Cov}[E_j, E_{j'}] \tag{86} \\
&\leq (W+1)\,\mathrm{Var}\left[|X|^2\right], \tag{87}
\end{align}
where (87) holds with equality when $W = 0$. After using (41), we have the right inequality in (85). By using (85) and (42) in (39), the inequalities for the EDI in (50) are obtained.

We now prove how the bounds in (50) are achieved asymptotically by $W$. In terms of the upper bound in (51), by setting $W = 0$, the window only encompasses one symbol energy, and thereby $\mathrm{Var}\left[G^W\right] = \mathrm{Var}\left[|X|^2\right]$ and $\mathbb{E}\left[G^W\right] = \mathbb{E}\left[|X|^2\right]$. With (39) and using (10), (51) is obtained. From (49), it can be seen that for $n < W + 2$, the EDI tends to $0$ for $W \to \infty$. This completes the proof.

REFERENCES
[1] G. Böcherer, F. Steiner, and P. Schulte, “Bandwidth efficient and rate-matched low-density parity-check coded modulation,”
IEEE Trans. Commun., vol. 63, no. 12, pp. 4651–4665, Dec. 2015.
[2] J. Cho and P. J. Winzer, “Probabilistic constellation shaping for optical fiber communications,” J. Lightw. Technol., vol. 37, no. 6, pp. 1590–1607, Mar. 2019.
[3] P. Schulte and G. Böcherer, “Constant composition distribution matching,” IEEE Trans. Inf. Theory, vol. 62, no. 1, pp. 430–434, Jan. 2016.
[4] T. Fehenberger, D. S. Millar, T. Koike-Akino, K. Kojima, and K. Parsons, “Multiset-partition distribution matching,” IEEE Trans. Commun., vol. 67, no. 3, pp. 1885–1893, Mar. 2019.
[5] F. Steiner, P. Schulte, and G. Böcherer, “Approaching waterfilling capacity of parallel channels by higher order modulation and probabilistic amplitude shaping,” in . Princeton, NJ, USA, 21-23 Mar. 2018.
[6] F. M. J. Willems and J. Wuijts, “A pragmatic approach to shaped coded modulation,” in Proc. Symp. Commun. Veh. Technol. Benelux. Delft, The Netherlands, Oct. 1993.
[7] Y. C. Gültekin, W. J. van Houtum, S. Şerbetli, and F. M. J. Willems, “Constellation shaping for IEEE 802.11,” in Proc. IEEE Int. Symp. Pers., Indoor Mobile Commun. Montreal, QC, Canada, Oct. 2017.
[8] A. K. Khandani and P. Kabal, “Shaping multidimensional signal spaces—Part I: Optimum shaping, shell mapping,” IEEE Trans. Inf. Theory, vol. 39, no. 6, pp. 1799–1808, Nov. 1993.
[9] J. Cho, S. Chandrasekhar, R. Dar, and P. J. Winzer, “Low-complexity shaping for enhanced nonlinearity tolerance,” in Proc. Eur. Conf. Opt. Commun. Düsseldorf, Germany, Sep. 2016, Paper W1C.2.
[10] T. Fehenberger, A. Alvarado, G. Böcherer, and N. Hanik, “On probabilistic shaping of quadrature amplitude modulation for the nonlinear fiber channel,” J. Lightw. Technol., vol. 34, no. 21, pp. 5063–5073, Nov. 2016.
[11] A. Amari, S. Goossens, Y. C. Gültekin, O. Vassilieva, I. Kim, T. Ikeuchi, C. M. Okonkwo, F. M. J. Willems, and A. Alvarado, “Introducing enumerative sphere shaping for optical communication systems with short blocklengths,” J. Lightw. Technol., vol. 37, no. 23, pp. 5926–5936, Dec. 2019.
[12] F. Buchali, F. Steiner, G. Böcherer, L. Schmalen, P. Schulte, and W. Idler, “Rate adaptation and reach increase by probabilistically shaped 64-QAM: An experimental demonstration,” J. Lightw. Technol., vol. 34, no. 7, pp. 1599–1609, Apr. 2016.
[13] A. Ghazisaeidi, I. F. de Jauregui Ruiz, R. Rios-Müller, L. Schmalen, P. Tran, P. Brindel, A. C. Meseguer, Q. Hu, F. Buchali, G. Charlet et al., “Advanced C+L-band transoceanic transmission systems based on probabilistically shaped PDM-64QAM,” J. Lightw. Technol., vol. 35, no. 7, pp. 1291–1299, Apr. 2017.
[14] J. Cho, X. Chen, S. Chandrasekhar, G. Raybon, R. Dar, L. Schmalen, E. Burrows, A. Adamiecki, S. Corteselli, Y. Pan et al., “Trans-atlantic field trial using high spectral efficiency probabilistically shaped 64-QAM and single-carrier real-time 250-Gb/s 16-QAM,” J. Lightw. Technol., vol. 36, no. 1, pp. 103–113, Jan. 2018.
[15] S. L. Olsson, J. Cho, S. Chandrasekhar, X. Chen, E. C. Burrows, and P. J. Winzer, “Record-high 17.3-bit/s/Hz spectral efficiency transmission over 50 km using probabilistically shaped PDM 4096-QAM,” in Proc. Opt. Fiber Commun. Conf. San Diego, CA, USA, Mar. 2018, Paper Th4C.5.
[16] G. Agrawal, Nonlinear Fiber Optics, Third Edition. Academic Press, Jan. 2001.
[17] Z. Qu and I. B. Djordjevic, “Geometrically shaped 16QAM outperforming probabilistically shaped 16QAM,” in Proc. Eur. Conf. Opt. Commun. IEEE, Sep. 2017, Paper Th.2.F.4.
[18] K. Gümüş, A. Alvarado, B. Chen, C. Häger, and E. Agrell, “End-to-end learning of geometrical shaping maximizing generalized mutual information,” in Proc. Opt. Fiber Commun. Conf. San Diego, CA, USA, Mar. 2020, Paper W3D.4.
[19] J. Renner, T. Fehenberger, M. P. Yankov, F. Da Ros, S. Forchhammer, G. Böcherer, and N. Hanik, “Experimental comparison of probabilistic shaping methods for unrepeated fiber transmission,” J. Lightw. Technol., vol. 35, no. 22, pp. 4871–4879, Nov. 2017.
[20] E. Sillekens, D. Semrau, G. Liga, N. A. Shevchenko, Z. Li, A. Alvarado, P. Bayvel, R. I. Killey, and D. Lavery, “A simple nonlinearity-tailored probabilistic shaping distribution for square QAM,” in Proc. Opt. Fiber Commun. Conf. San Diego, CA, USA, Mar. 2018, Paper M3C.4.
[21] E. Agrell, A. Alvarado, G. Durisi, and M. Karlsson, “Capacity of a nonlinear optical channel with finite memory,” J. Lightw. Technol., vol. 32, no. 16, pp. 2862–2876, Aug. 2014.
[22] R. Dar, M. Feder, A. Mecozzi, and M. Shtaif, “On shaping gain in the nonlinear fiber-optic channel,” in IEEE Int. Symp. Inform. Theory. Honolulu, HI, USA, June 2014, pp. 2794–2798.
[23] ——, “Time varying ISI model for nonlinear interference noise,” in Optical Fiber Communication Conference. San Diego, CA, USA, Mar. 2014, Paper W2A.62.
[24] M. P. Yankov, K. J. Larsen, and S. Forchhammer, “Temporal probabilistic shaping for mitigation of nonlinearities in optical fiber systems,” J. Lightw. Technol., vol. 35, no. 10, pp. 1803–1810, May 2017.
[25] T. Fehenberger, D. S. Millar, T. Koike-Akino, K. Kojima, K. Parsons, and H. Griesser, “Analysis of nonlinear fiber interactions for finite-length constant-composition sequences,” J. Lightw. Technol., vol. 38, no. 2, pp. 457–465, Jan. 2020.
[26] S. Goossens, S. Van der Heide, M. van den Hout, A. Amari, Y. C. Gültekin, O. Vassilieva, I. Kim, T. Ikeuchi, F. M. J. Willems, A. Alvarado et al., “First experimental demonstration of probabilistic enumerative sphere shaping in optical fiber communications,” in . Fukuoka, Japan, July 2019.
[27] T. Fehenberger, H. Griesser, and J.-P. Elbers, “Mitigating fiber nonlinearities by short-length probabilistic shaping,” in Proc. Opt. Fiber Commun. Conf. San Diego, CA, USA, Mar. 2020, Paper Th1I.2.
[28] S. Civelli, E. Forestieri, and M. Secondini, “Interplay of probabilistic shaping and carrier phase recovery for nonlinearity mitigation,” Sep. 2020. [Online] Available: https://arxiv.org/abs/2009.01135.
[29] P. Skvortcov, I. D. Phillips, W. Forysiak, T. Koike-Akino, K. Kojima, K. Parsons, and D. Millar, “Huffman-coded sphere shaping for extended-reach single-span links,” IEEE J. Sel. Top. Quantum Electron., Jan. 2021, (early access).
[30] P. Poggiolini, A. Carena, V. Curri, G. Bosco, and F. Forghieri, “Analytical modeling of nonlinear propagation in uncompensated optical transmission links,” IEEE Photon. Technol. Lett., vol. 23, no. 11, pp. 742–744, June 2011.
[31] P. Poggiolini, G. Bosco, A. Carena, V. Curri, Y. Jiang, and F. Forghieri, “The GN-model of fiber non-linear propagation and its applications,” J. Lightw. Technol., vol. 32, no. 4, pp. 694–721, Feb. 2014.
[32] A. Carena, V. Curri, G. Bosco, P. Poggiolini, and F. Forghieri, “Modeling of the impact of nonlinear propagation effects in uncompensated optical coherent transmission links,” J. Lightw. Technol., vol. 30, no. 10, pp. 1524–1539, May 2012.
[33] R. Dar, M. Feder, A. Mecozzi, and M. Shtaif, “Properties of nonlinear noise in long, dispersion-uncompensated fiber links,” Opt. Express, vol. 21, no. 22, pp. 25 685–25 699, Nov. 2013.
[34] ——, “Inter-channel nonlinear interference noise in WDM systems: modeling and mitigation,” J. Lightw. Technol., vol. 33, no. 5, pp. 1044–1053, Mar. 2015.
[35] A. Carena, G. Bosco, V. Curri, Y. Jiang, P. Poggiolini, and F. Forghieri, “On the accuracy of the GN-model and on analytical correction terms to improve it,” Jan. 2014. [Online] Available: https://arxiv.org/abs/1401.6946.
[36] P. Poggiolini, G. Bosco, A. Carena, V. Curri, Y. Jiang, and F. Forghieri, “A simple and effective closed-form GN model correction formula accounting for signal non-Gaussian distribution,” J. Lightw. Technol., vol. 33, no. 2, pp. 459–473, Jan. 2015.
[37] T. Fehenberger and A. Alvarado, “Analysis and optimisation of distribution matching for the nonlinear fibre channel,” in Proc. Eur. Conf. Opt. Commun. Dublin, Ireland, Sep. 2019. Poster Session 1.
[38] T. Fehenberger, “On the impact of finite-length probabilistic shaping on fiber nonlinear interference,” June 2020. [Online] Available: https://arxiv.org/abs/2006.07004.
[39] A. Mecozzi and R. Essiambre, “Nonlinear Shannon limit in pseudolinear coherent systems,” J. Lightw. Technol., vol. 30, no. 12, pp. 2011–2024, June 2012.
[40] Z. Tao, L. Dou, W. Yan, L. Li, T. Hoshida, and J. C. Rasmussen, “Multiplier-free intrachannel nonlinearity compensating algorithm operating at symbol rate,” J. Lightw. Technol., vol. 29, no. 17, pp. 2570–2576, Sep. 2011.
[41] J. Armstrong, “OFDM for optical communications,” J. Lightw. Technol., vol. 27, no. 3, pp. 189–204, Feb. 2009.
[42] B. Chen, A. Alvarado, S. van der Heide, M. van den Hout, H. Hafermann, and C. Okonkwo, “Analysis and experimental demonstration of orthant-symmetric four-dimensional 7 bit/4D-sym modulation for optical fiber communication,” Mar. 2020. [Online] Available: https://arxiv.org/abs/2003.12712.
[43] O. Geller, R. Dar, M. Feder, and M. Shtaif, “A shaping algorithm for mitigating inter-channel nonlinear phase-noise in nonlinear fiber systems,” J. Lightw. Technol., vol. 34, no. 16, pp. 3884–3889, May 2016.
[44] F. Gray, “Pulse code communication,” US Patent 2 632 058, Mar. 1953.
[45] Y. C. Gültekin, T. Fehenberger, A. Alvarado, and F. M. J. Willems, “Probabilistic shaping for finite blocklengths: Distribution matching and sphere shaping,” Entropy, vol. 22, no. 5, p. 581, May 2020.
[46] Y. C. Gültekin, W. J. Van Houtum, A. Koppelaar, and F. M. J. Willems, “Comparison and optimization of enumerative coding techniques for amplitude shaping,” IEEE Commun. Lett., Dec. 2020, (early access).
[47] A. Papoulis and S. U. Pillai, Probability, random variables, and stochastic processes, Fourth Edition. Tata McGraw-Hill Education, 2002.
[48] R. D. Yates and D. J. Goodman, Probability and stochastic processes: a friendly introduction for electrical and computer engineers, Third Edition. John Wiley & Sons, 2014.
[49] G. B. Giannakis, “Cyclostationary signal analysis,” in Digital Signal Processing Handbook, First Edition. V. K. Madisetti and D. Williams, Eds, Boca Raton, FL: CRC, 1998.
[50] J. D. Gibbons and S. Chakraborti,