Channel-noise tracking for sub-shot-noise-limited receivers with neural networks

Abstract

Non-Gaussian receivers for optical communication with coherent states can achieve measurement sensitivities beyond the limits of conventional detection, given by the quantum-noise limit (QNL). However, the amount of information that can be reliably transmitted substantially degrades if there is noise in the communication channel, unless the receiver is able to efficiently compensate for such noise. Here, we investigate the use of a deep neural network as a computationally efficient estimator of phase and amplitude channel noise to enable a reliable method for noise tracking for non-Gaussian receivers. The neural network uses the data collected by the non-Gaussian receiver to estimate and correct for dynamic channel noise in real-time. Using numerical simulations, we find that this noise tracking method allows the non-Gaussian receiver to maintain its benefit over the QNL across a broad range of strengths and bandwidths of phase and intensity noise. The noise tracking method based on neural networks can further include other types of noise to ensure sub-QNL performance in channels with many sources of noise.

M. T. DiMario and F. E. Becerra
Center for Quantum Information and Control, Department of Physics and Astronomy, University of New Mexico, Albuquerque, New Mexico 87131, USA


I. INTRODUCTION

The intrinsic properties of coherent states can enable efficient and practical classical [1–4] and quantum [5–9] communications. When utilizing the phase of coherent states combined with their intensity to encode and transmit information, higher rates of information transfer may be achieved compared to communication schemes using intensity-only encodings [4, 10]. However, channel noise can severely limit the advantage of communications with coherent encodings. In conventional coherent communications, the optical receiver performs a heterodyne measurement with shot-noise-limited sensitivity, corresponding to the quantum-noise limit (QNL). This measurement allows for the use of post-processing methods on the collected data to estimate channel noise and correct the data to recover the transmitted information [10–19]. While current coherent optical communications rely on these conventional approaches, a heterodyne measurement cannot reach the ultimate limit of sensitivity [20] and information transfer [1, 21, 22].

In contrast to conventional strategies, non-Gaussian receivers can surpass the QNL, providing higher measurement sensitivities for decoding information [23–30]. However, in the presence of channel noise, the benefit of non-Gaussian receivers over conventional strategies critically depends on the ability to perform efficient channel-noise tracking. Recent work demonstrated an efficient method for phase tracking for non-Gaussian receivers [31]. This phase tracking method estimates and corrects for the phase noise in real time, which is required by the strategies used in non-Gaussian receivers, as opposed to post-processing of the collected data with heterodyne detection.
This method enabled sub-QNL sensitivity in the presence of phase noise, which is particularly damaging for coherent encodings [31, 32].

In more realistic situations there may be multiple sources of noise present in the communication channel, such as thermal noise [33, 34], phase diffusion [35–37], phase noise, and amplitude noise. In such situations the non-Gaussian receiver must perform efficient high-dimensional parameter estimation and tracking in order to maintain the expected sub-QNL performance. However, current methods for single-parameter noise tracking cannot be efficiently scaled to higher dimensions for tracking and correction of multiple sources of noise. Thus, enabling noise tracking for non-Gaussian receivers in channels with complex and dynamic noise requires novel and efficient methods for multi-parameter estimation that scale favorably to higher dimensions. Practical parameter tracking also requires estimation on a time scale which is very small ($\ll$ …

[Figure 1 image. (a) Block diagram: Alice (sender), a channel with amplitude noise and phase noise, and Bob (receiver) with laser, LO feedback, SPD, and NN phase and amplitude estimation. (b) Error probability versus time (ms).]

FIG. 1. Channel noise tracking method. (a) Schematic of two parties, Alice and Bob, communicating over a noisy channel with Bob at the receiver performing phase and amplitude noise tracking. The receiver uses an adaptive non-Gaussian measurement strategy to realize state discrimination below the QNL using interference between the input state and a local oscillator (LO) followed by photon counting with a single photon detector (SPD). A neural network (NN) takes the data collected from the non-Gaussian receiver and outputs estimates for the current phase offset and input intensity. The estimates are fed forward to the LO to match the input intensity and counteract phase noise in real time. (b) Effect of channel noise with (blue) and without (orange) perfect parameter tracking (see main text) for the non-Gaussian receiver. Black points show a perfectly corrected heterodyne receiver, and black and gray dashed lines show the error for a non-Gaussian and a heterodyne receiver in the absence of noise, respectively.

… similar performance to a Bayesian-based noise tracking approach, and allows the non-Gaussian receiver to maintain sub-QNL sensitivity. This shows that a NN estimator is a viable method for real-time, multi-parameter channel noise tracking in non-Gaussian receivers due to its efficiency and potential scalability to higher dimensions. In Sec. II we describe the non-Gaussian receiver strategy and the NN estimator used for the noise tracking method. In Sec. III we investigate the performance of the channel noise tracking. We discuss the results of the work in Sec. IV.

II. RECEIVER AND ESTIMATION STRATEGY

We numerically study the use of a NN-based method for noise parameter tracking for non-Gaussian receivers based on adaptive measurements and photon counting. As a proof of concept, we investigate a NN-based method for tracking phase and amplitude channel noise that uses only the data collected during the state discrimination measurement. This NN-based method can be easily extended to perform higher-dimensional parameter estimation for tracking additional sources of noise in the channel such as thermal noise [33, 34] or phase diffusion [35–37]. In this section, we describe (A) the measurement strategy of the non-Gaussian receiver for coherent state discrimination; (B) how the data from the measurement is used by the NN; and (C) the NN estimator, which can be used for estimation of channel noise from multiple sources.

A. State Discrimination Measurement

Figure 1(a) shows a scenario where a receiver attempts to perform coherent state discrimination with an adaptive photon counting measurement with sensitivity below the QNL [25, 26]. Dynamic phase and amplitude noise induced by the communication channel degrades the attainable sensitivity of the receiver. Tracking the phase and amplitude noise of the input states induced in the channel using the data collected during the discrimination measurement can in principle allow the receiver to correct its strategy and maintain sub-QNL sensitivity.

Here, we study a method for channel noise tracking for a receiver based on an adaptive non-Gaussian strategy [26] for phase coherent states $|\alpha_k\rangle \in \{|\alpha e^{i 2\pi k/M}\rangle\}$, where $k = 0, 1, ..., M-1$. For $M = 4$, this corresponds to quaternary phase-shift-keyed (QPSK) coherent states. The state discrimination strategy consists of $L$ adaptive measurement steps. Each step performs a hypothesis test of the input state using a local oscillator (LO) to implement a displacement operation $\hat{D}(\beta)$ through interference and single photon counting. In each adaptive step $j = 1, 2, ..., L$, the receiver attempts to displace the most likely state to the vacuum state by adjusting the LO phase $\arg(\beta) = \theta_j \in \{0, \pi/2, \pi, 3\pi/2\}$ with $|\beta| = |\alpha_k|$, followed by single photon detection. The detector has a finite photon number resolution (PNR) where up to $m$ photons can be resolved, denoted as PNR($m$), before becoming a threshold detector [26]. At the end of the $L$ adaptive steps, the best guess of the receiver $\theta_{disc}$ for the true input phase is the state with maximum a posteriori probability given the entire detection history. As described in Section II(B) and Section II(C), the photon counting data from the adaptive measurement steps together with $\theta_{disc}$ allows the receiver to perform phase and amplitude tracking, where estimates of the channel noise are fed forward to the LO in order to maintain the sub-QNL performance of the receiver.

Figure 1(b) shows an example of the error probability for the adaptive non-Gaussian receiver for QPSK states for an average input mean photon number of $\langle\hat{n}\rangle = |\alpha|^2 = 5.0$, which is proportional to the intensity, averaged over 5000 noise realizations and obtained through Monte-Carlo simulations. For all Monte-Carlo simulations in this study, we assume ideal detection efficiency, zero detector dark counts, a photon number resolution of PNR(10), and $L = 10$ adaptive steps. To represent a realistic experiment, we use an interference visibility of the displacement operation of 99.7% [26]. The blue (orange) points show the error probability for the non-Gaussian receiver with (without) perfect noise tracking. Perfect tracking refers to a situation where the receiver has complete knowledge of the time-dependent input intensity and phase noise induced by the channel. The black points show the error of an ideal heterodyne measurement, performing at the QNL, with perfect tracking [62]. The dashed lines show the expected error in the absence of noise for a heterodyne (gray) and non-Gaussian (black) receiver. The error for the non-Gaussian measurement remaining below the heterodyne limit (QNL) shows that if the receiver can implement accurate parameter tracking, then its benefit over the QNL can be maintained. Furthermore, any tracking method for the non-Gaussian receiver requires correcting for dynamical noise in real time to ensure sub-QNL performance [31], in contrast to methods for heterodyne receivers, where estimation and correction can be done in post-processing of the data.

B. Detection Matrix

The measurement data collected by the non-Gaussian receiver from the discrimination of $N$ input states is used for parameter estimation. For the discrimination of one input state, this data consists of the $L$ photon detections $\{d_j\}_L$ and relative phases $\{\Delta_j\}_L$ between the LO and input state for each adaptive step $j$. Due to the low error rate achieved by the non-Gaussian measurement, the guess $\theta_{disc}$ of the phase of the input state corresponds to the true input phase with high probability. Thus, $\theta_{disc}$ can be used to infer the relative phase $\Delta_j$ between the LO and actual input state at every adaptive measurement step $j$ such that $\Delta_j = \theta_j - \theta_{disc}$, as in [31]. This state discrimination data $\{\Delta_j, d_j\}$ is binned into what we refer to as the detection matrix $D$, which is an $M \times (m+1)$ matrix, where $m$ is the PNR of the receiver. After each measurement, the matrix elements $D_{l,k}$ are incremented by the total number of times that the number of detected photons in an adaptive step $j$ was $d_j = l$ and the relative phase was $\Delta_j = 2\pi k/M$ for $k \in \{0, 1, ..., M-1\}$. Thus, the rows of the matrix $D$ represent the photon number distributions for different relative phases $k\pi/2$ between the LO ($\theta_j$) and final hypothesis ($\theta_{disc}$) for QPSK states [31]. After completing $N$ experiments, the matrix $D$ contains $N \times L$ pairs $\{d_j, \Delta_j\}$ and it is used for parameter estimation. Once estimation has been performed, the matrix is reset such that $D_{l,k} = 0$ for all $l$ and $k$.

[Figure 2 image. The detection matrix $D$ ($M$ rows, $m+1$ columns) and the LO intensity feed the NN input layer, followed by 8 hidden layers and an output layer giving $\hat{\phi}_{NN}$ and $\hat{\mathcal{A}}_{NN}$.]

FIG. 2. Neural network. The neural network (NN) for noise estimation has 10 layers (8 hidden) with sizes described in Table I in Appendix A. The NN inputs are a flattened version of the detection matrix $D$ normalized across each row, and the LO intensity $\mathcal{B}$ for the measurements whose data is contained in the detection matrix. The outputs of the NN are estimates for the phase offset $\hat{\phi}_{NN}$ and input intensity $\hat{\mathcal{A}}_{NN}$.

In order to extract information from $D$ to correct for channel noise affecting the measurement, the receiver must utilize a particular estimator. A Bayesian estimator, which uses the full likelihood functions, will yield estimates for the channel noise with small uncertainty [63]. However, this estimator is computationally demanding to calculate. Since the estimation and correction of the channel noise for non-Gaussian receivers must be performed in real time, a Bayesian method may be incompatible with applications requiring high-bandwidth sub-QNL receivers. Therefore, practical implementations of non-Gaussian receivers require an estimator that is both precise and computationally efficient while being easily scalable to higher dimensions to track multiple sources of channel noise. For example, the simple case of single-parameter estimation for phase tracking for non-Gaussian measurements has been experimentally demonstrated [31] using a simple estimator, which is calculated in real time with minimal computational resources.

C. Neural Network Estimator

We construct a NN as a multi-parameter estimator which maps the data collected from the state discrimination measurement to estimates for the input intensity and phase offset. We compare the performance of the NN estimator to a Bayesian estimator. The Bayesian-based method for noise tracking serves as a benchmark and is calculated from the same state discrimination measurement data, i.e., the detection matrix $D$. Although we study phase and amplitude tracking, a properly trained NN can in principle be used as an efficient high-dimensional estimator for tracking many sources of communication channel noise.

Figure 2 shows a diagram of the NN architecture for the proposed noise tracking method, which has 10 layers (8 hidden), each with a Leaky ReLU activation function [64]. To obtain the input for the NN, the detection matrix $D$ is first normalized across each row, and then arranged into a one-dimensional vector ($D_{l,k} \to D_{l(m+1)+k}$). This vector, along with the LO intensity for the previous $N$ measurements, forms the input to the NN. For ease of notation, we denote the time-dependent input intensity of the QPSK coherent states as $\mathcal{A}(\tau) = |\alpha|^2(\tau)$, where $\tau$ represents time discretized into steps of $\Delta T$ and $1/\Delta T$ is the experimental repetition rate. For a single state discrimination measurement at time $\tau$, the intensity of the LO is denoted as $\mathcal{B}(\tau) = |\beta|^2(\tau)$. The NN outputs, denoted as $\hat{\mathcal{A}}_{NN}$ and $\hat{\phi}_{NN}$, are raw estimates of the input intensity $\mathcal{A}(\tau)$ and relative phase offset $\phi(\tau)$ during the previous $N$ state discrimination measurements. The NN is trained on 5 × … samples of the state discrimination measurement generated from Monte-Carlo simulations of the experiment in Python.
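As an illustration of the data handling described in Secs. II B and II C, the following Python sketch bins the outcomes of one state discrimination measurement into the detection matrix, normalizes each row, and flattens the result into the NN input vector. It is a minimal sketch under stated assumptions rather than the simulation code used in this work: the function names, the use of integer phase indices in units of $2\pi/M$, and the convention that rows index the relative phase and columns the photon number are all choices made for the example.

```python
def new_detection_matrix(M=4, m=10):
    """Return an empty M x (m+1) detection matrix (rows: relative phase)."""
    return [[0] * (m + 1) for _ in range(M)]


def update_detection_matrix(D, lo_phase_idx, disc_idx, counts):
    """Bin the L adaptive steps of one measurement into D (in place).

    lo_phase_idx: LO phase index for each step (theta_j in units of 2*pi/M)
    disc_idx:     index of the final hypothesis theta_disc (same units)
    counts:       detected photon number d_j for each step
    """
    M = len(D)
    m = len(D[0]) - 1
    for theta_j, d_j in zip(lo_phase_idx, counts):
        k = (theta_j - disc_idx) % M   # relative phase Delta_j = 2*pi*k/M
        col = min(d_j, m)              # counts above m saturate (threshold)
        D[k][col] += 1
    return D


def nn_input(D, lo_intensity):
    """Normalize each row of D, flatten row by row, append the LO intensity."""
    features = []
    for row in D:
        total = sum(row)
        # Rows with no recorded events are left as zeros (an assumption).
        features.extend((x / total if total else 0.0) for x in row)
    features.append(float(lo_intensity))
    return features
```

Under these assumptions, QPSK states ($M = 4$) with PNR(10) give an input vector of $4 \times 11 + 1 = 45$ entries, independent of $N$ and $L$.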
For training the NN, we use the TensorFlow library [65] with a weighted mean squared error cost function (see Appendix A for details) [66–69]. The trained NN is then included in the Monte-Carlo simulations to perform parameter tracking on the state discrimination data such that the estimates from the NN are fed forward to the LO to correct the measurement.

III. RESULTS

We simulate the performance of the noise tracking method based on the NN estimator for a variety of scenarios with amplitude and phase noise for average input intensities $\langle\hat{n}\rangle = \mathcal{A}(0) = \langle\mathcal{A}(\tau)\rangle$ equal to 2, 5, and 10. Here $\langle\cdot\rangle$ denotes the average across all noise realizations at the time step $\tau$. We benchmark the NN against a Bayesian estimator where the prior probability distribution for both parameters is uniform [31]. For all simulations, we use a single NN to perform multi-parameter estimation and noise tracking across a range of input powers and noise parameter regimes.

As a model for phase noise $\phi(\tau)$, we simulate a discrete Gaussian random walk in phase [31]. A single step of this walk has a variance of $\sigma^2 = 2\pi\Delta\nu\Delta T$, where $\Delta\nu$ is the phase noise bandwidth due to finite laser linewidth [4, 17, 18] or other phase noise sources [4, 70]. The experimental repetition rate is set to $1/\Delta T = 100$ MHz such that $\Delta T = 10$ ns to represent a feasible, near-term communication bandwidth for non-Gaussian receivers [71]. To model amplitude noise of the input states, we simulate noise in the input intensity $\mathcal{A}(\tau)$. As a noise model, we use an Ornstein–Uhlenbeck (OU) process [72, 73] whose stochastic differential equation is given by:

$$\Delta\mathcal{A}(\tau) = \gamma\left[\langle\hat{n}\rangle - \mathcal{A}(\tau)\right]\Delta T + \Sigma\sqrt{\Delta T}\,dW \qquad (1)$$

where $\gamma$ is the amplitude noise bandwidth, $\Sigma$ controls the deviation of the walks, and $dW$ denotes a Wiener process. The long-time variance of $\mathcal{A}(\tau)$ is given by $\Sigma_\infty = \Sigma^2/2\gamma$, and the maximum long-time variance we implement is $\Sigma_\infty = \{0.25, 1.5, 6.0\}$ for $\langle\hat{n}\rangle = \{2, 5, 10\}$, respectively, corresponding to a relative noise level of $\sqrt{\Sigma_\infty}/\langle\hat{n}\rangle \approx 0.25$.

After $N$ state discrimination measurements, estimates are calculated from the detection matrix $D$. To implement correction of the receiver, we set the LO intensity $\mathcal{B}(\tau)$ to the current estimated value $\hat{\mathcal{A}}(\tau)$ of the input intensity $\mathcal{A}(\tau)$. For phase tracking we add a correction $\delta(\tau)$ to the LO phase such that $\arg\{\beta\} = \theta_j + \delta(\tau)$. The correction $\delta(\tau)$ is equal to the cumulative sum of individual estimates $\hat{\phi}$ up to the current time step $\tau$. This is because the receiver is always estimating only the phase shift accumulated in the previous $N$ experiments. The phase and intensity corrections remain fixed at these values for $N$ experiments until new estimates are made and applied to the LO.

To reduce uncertainty in the phase and intensity estimates for noise tracking, we implement a Kalman filter [74] for both estimates (see Appendix B for details). The inputs for the filter are the current raw estimates for the input intensity and phase offset ($\hat{\mathcal{A}}_{NN}$, $\hat{\phi}_{NN}$), and the filter outputs are updated, filtered estimates for the intensity $\hat{\mathcal{A}}(\tau)$ and phase $\hat{\phi}$. The same procedure is done to obtain filtered Bayesian estimates from the raw estimates ($\hat{\mathcal{A}}_{B}$, $\hat{\phi}_{B}$). To implement the Kalman filter, we assume that the raw NN estimates are Gaussian distributed, and use Monte-Carlo simulations with fixed phase offset and input intensity to empirically obtain the variance of the NN estimator. We note that although we study two particular models for phase and amplitude noise, we believe this NN-based tracking method can be applied to a variety of noise forms such as power-law amplitude noise or damping noise. To study different noise models, the NN would need to be retrained using data generated from the new model, and the noise dynamics would need to be incorporated into the Kalman filter accordingly.

[Figure 3 image. (a) Error probability versus time (ms); (b), (c) input intensity and input phase versus time (ms).]

FIG. 3. Probability of error as a function of time. (a) Error probability as a function of time for $\langle\hat{n}\rangle = 5.0$, $\gamma = 25$ kHz, $\Sigma_\infty = 1.5$, and $\Delta\nu = 2$ kHz. Blue (orange) points show the error for the NN (Bayes) based estimator. Green and black points show the error with no correction and perfect correction, respectively. Gray points show the effective QNL of a perfectly corrected heterodyne measurement. Black and gray dashed lines show the error for a non-Gaussian and heterodyne receiver, respectively, with no noise.

Figure 3 shows (a) the error probability of the non-Gaussian receiver with noise tracking for 1000 different realizations with both phase ($\Delta\nu = 2$ kHz) and intensity ($\gamma = 25$ kHz, $\Sigma_\infty = 1.5$) noise shown in (b) and (c), respectively, for an input intensity $\langle\hat{n}\rangle = 5.0$ and $N = 10$ experiments per estimation period. The blue (orange) points show the results of the noise tracking method based on NN (Bayesian) estimators. The black points show the error probability with perfect correction, which corresponds to the case where the receiver has complete knowledge of the phase and intensity noise, so that $\mathcal{B}(\tau) = \mathcal{A}(\tau)$ and $\delta(\tau) = \phi(\tau)$. The green points show the error probability of an uncorrected non-Gaussian measurement, and the gray points show that of an ideal heterodyne measurement with perfect phase tracking (equivalent to $\phi(\tau) = 0$). We note that even though the receiver may have perfect knowledge of the noise, the overall effect of the amplitude noise increases the error probability. This is because input powers smaller than the average power ($\mathcal{A}(\tau) < \langle\hat{n}\rangle$) increase the errors more than larger powers ($\mathcal{A}(\tau) > \langle\hat{n}\rangle$) reduce them. The dashed black and gray lines show the error for an adaptive non-Gaussian measurement and an ideal heterodyne measurement with no noise, respectively. By comparing the error of the non-Gaussian measurement with perfect correction (black points) to the black dashed reference line, we observe that non-Gaussian measurements are more sensitive to amplitude noise than a heterodyne measurement (gray points vs gray dashed line), even when they are perfectly corrected. We observe that the NN-based tracking method performs equivalently to the Bayesian method, and both can allow the receiver to maintain an error probability significantly below the QNL. This result demonstrates the capabilities of a NN for efficient and reliable noise tracking for non-Gaussian receivers for state discrimination.

[Figure 4 image. Panels (a)–(f): error probability versus phase noise bandwidth $\Delta\nu$ (Hz) for $\langle n\rangle$ = 2.0, 5.0, and 10.0.]

FIG. 4. Phase noise tracking. Error probability as a function of phase noise bandwidth (BW) $\Delta\nu$ without (a)–(c) and with (d)–(f) amplitude noise of $\gamma = 25$ kHz and $\Sigma_\infty$ = 0.25, 1.5, and 6.0 for average intensities $\langle\hat{n}\rangle$ = 2, 5, and 10, respectively. Blue and orange lines show the performance of the noise tracking methods based on NN and Bayesian estimators, respectively. The orange dashed line shows the error probability for a non-Gaussian receiver with no correction. The purple and gray dashed lines show the error probability for a non-Gaussian and heterodyne measurement with perfect correction, respectively.

We study the robustness of the NN-based method in scenarios with different noise strengths and bandwidths in the phase and amplitude.
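The two noise models used throughout these studies can be sketched in a few lines of Python. The example below is an illustrative sketch, not the simulation code of this work: it draws a Gaussian random walk in phase with step variance $2\pi\Delta\nu\Delta T$ and integrates Eq. (1) with a simple Euler step, taking $dW$ as a standard normal variate and setting $\Sigma = \sqrt{2\gamma\Sigma_\infty}$ from the long-time variance. The function names and the use of Python's `random` module are assumptions made for the example.

```python
import math
import random


def phase_random_walk(n_steps, dnu, dt, rng):
    """Discrete Gaussian random walk in phase (radians).

    Each step has variance 2*pi*dnu*dt, where dnu is the phase noise
    bandwidth (Hz) and dt is the time step (s).
    """
    step_std = math.sqrt(2.0 * math.pi * dnu * dt)
    phi = 0.0
    path = []
    for _ in range(n_steps):
        phi += rng.gauss(0.0, step_std)
        path.append(phi)
    return path


def ou_intensity(n_steps, n_mean, gamma, sigma_inf, dt, rng):
    """Euler integration of the OU process of Eq. (1) for the input intensity.

    sigma_inf is the long-time variance, so Sigma = sqrt(2*gamma*sigma_inf).
    The walk starts at the mean intensity n_mean. Values are not clipped,
    so an unphysical negative intensity is possible for very strong noise.
    """
    big_sigma = math.sqrt(2.0 * gamma * sigma_inf)
    a = n_mean
    path = []
    for _ in range(n_steps):
        dw = rng.gauss(0.0, 1.0)  # assumption: dW taken as a unit normal variate
        a += gamma * (n_mean - a) * dt + big_sigma * math.sqrt(dt) * dw
        path.append(a)
    return path


# Example with the parameters quoted for Fig. 3: 1/dt = 100 MHz, dnu = 2 kHz,
# gamma = 25 kHz, long-time variance 1.5, mean photon number 5.
rng = random.Random(1)
dt = 10e-9
phase = phase_random_walk(10_000, 2e3, dt, rng)
intensity = ou_intensity(10_000, 5.0, 25e3, 1.5, dt, rng)
```

For these parameters $\gamma\Delta T = 2.5 \times 10^{-4}$, so the intensity decorrelates over roughly $1/\gamma\Delta T = 4000$ time steps, and a trace must span many thousands of steps before its sample variance approaches $\Sigma_\infty$.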
For these studies, we use the heterodyne measurement with perfect phase tracking as the limit for conventional strategies, serving as the effective QNL when the same noise is applied to both receivers. In this section, we study (A) the error probability as a function of phase noise with fixed amplitude noise, and (B) the error probability when the amplitude noise levels are varied with phase noise with a fixed bandwidth.

[Figure 5 image. Panels (a)–(c): error probability versus amplitude noise bandwidth $\gamma$ for $\langle n\rangle$ = 2.0, 5.0, and 10.0, each with phase noise bandwidths of 0 kHz and 5 kHz.]

FIG. 5. Amplitude noise tracking. Error probability as a function of amplitude noise bandwidth (BW) $\gamma$ for $\langle\hat{n}\rangle = \{2, 5, 10\}$ without and with phase noise with bandwidth $\Delta\nu = 5$ kHz. Purple and gray dashed lines show the error for a non-Gaussian and heterodyne measurement with perfect correction, respectively. Beyond $\gamma \approx$ … Hz, the amplitude noise is effectively random across the $N$ experiments.

A. Phase noise with different bandwidths

We study the performance of the noise tracking method based on the NN estimator as a function of the phase noise bandwidth $\Delta\nu$ for fixed values of amplitude noise $\Sigma_\infty$ and $\gamma$. We compare these results to the tracking method based on a Bayesian estimator, as well as a perfectly corrected non-Gaussian measurement. We use different amplitude noise parameters for different values of $\langle\hat{n}\rangle$ such that the relative amplitude noise strength $\sqrt{\Sigma_\infty}/\langle\hat{n}\rangle$ is constant. For an average intensity of $\langle\hat{n}\rangle$ = 2, 5, and 10 we simulate 250, 250, and 500 different realizations of the noise, respectively. The simulations are run for 2 × … time bins of $N = 10$ experiments each, giving a total of 2 × … individual experiments per noise realization. We calculate the average error across all realizations for all time bins.

Figure 4 shows the average error probability as a function of the bandwidth $\Delta\nu$ for intensities $\langle\hat{n}\rangle$ = 2, 5, and 10. Figures 4(a)–4(c) have no amplitude noise ($\gamma = 0$, $\Sigma_\infty = 0$), and 4(d)–4(f) have $\gamma = 2$ kHz and $\Sigma_\infty = \{0.25, 1.5, 6.0\}$, respectively, corresponding to a relative strength of $\sqrt{\Sigma_\infty}/\langle\hat{n}\rangle \approx 0.25$. The performance of the noise tracking method based on the NN estimator (blue) is equivalent to the Bayesian-based method (orange) while being computationally efficient to implement. The purple and gray dashed lines show the average error of the non-Gaussian and ideal heterodyne receivers with perfect parameter tracking, respectively.

We observe that for all the investigated average input intensities $\langle\hat{n}\rangle$, the NN-based method performs as well as the Bayesian method both with and without amplitude noise. The NN-based method can enable the non-Gaussian receiver to surpass the QNL up to a phase noise bandwidth of $\Delta\nu \approx$ … For $\langle\hat{n}\rangle$ = 5 and 10, we observe that a NN estimator can perform slightly better than a Bayesian estimator. We believe this is due to the relatively small number of samples ($N = 10$) from which estimates are made. In this regime with few samples for estimation, there may be estimators which perform better than the Bayesian estimator, which is asymptotically optimal in the limit of many samples. Another potential cause of this effect is that in the training process of the NN, the relative weight between error in phase estimates and error in mean photon number estimates can be adjusted. This freedom may allow for fine tuning of the overall training error to allow for a slightly better overall performance, in terms of error probability, for specific channel models.

B. Amplitude noise with different bandwidths

To investigate the effect of the amplitude noise bandwidth $\gamma$, we fix the long-time variance $\Sigma_\infty$ and the phase noise bandwidth $\Delta\nu$. This allows for studying the performance of the NN-based method when the amplitude noise bandwidth $\gamma$ ranges from much smaller to much larger than the bandwidth for parameter estimation $1/N\Delta T$. Figure 5 shows the average probability of error for different amplitude noise bandwidths $\gamma$ without and with phase noise of bandwidth $\Delta\nu = 5$ kHz, for $\langle\hat{n}\rangle = \{2, 5, 10\}$ with $\Sigma_\infty = \{0.25, 1.5, 6.0\}$, respectively. Blue (orange) lines show the error rates for the NN (Bayesian) based tracking method. Purple and gray dashed lines show the error probability for a non-Gaussian and heterodyne measurement with perfect correction, respectively. We find that the NN-based method performs closely to the Bayesian-based method, and enables the receiver to achieve sub-QNL error rates across a broad range of amplitude noise bandwidths even in the presence of phase noise.

We note that for intensity $\langle\hat{n}\rangle = 2$ in Fig. 5(a), the errors for both noise tracking methods are below that of the perfectly corrected non-Gaussian measurement when $\Delta\nu = 0$. At low input powers, strategies that optimize the LO intensity ($|\beta| > |\alpha|$) yield lower error probabilities than when $|\beta| = |\alpha|$ [28]. Due to the small number of samples ($N \times L$) used for estimation, the NN and Bayesian estimators have a bias in $\hat{\mathcal{A}}_{B,NN}$, such that $\mathcal{B}(\tau) > \mathcal{A}(\tau)$. The effect of this bias in the intensity estimates $\hat{\mathcal{A}}_{B,NN}$ is that the corrected measurement unintentionally approximates an optimized strategy [28]. As a result of this finite-sampling bias, the error probabilities of the corrected receiver with both NN and Bayesian based methods fall below the error of a perfectly corrected nulling receiver where $\mathcal{B}(\tau) = \mathcal{A}(\tau)$.
Further investigation is needed to determine the capabilities of NN-based noise tracking for optimized non-Gaussian receivers [28].

The performance of the non-Gaussian receiver also depends on the long-time variance $\Sigma_\infty$ of the amplitude noise. In our main results, $\Sigma_\infty$ was set to represent a "worst-case" scenario of ≈ 25% relative amplitude noise (see Fig. 3). Appendix C describes our study of noise tracking of amplitude noise with different long-time variance $\Sigma_\infty$. In our findings, we observe that in the absence of phase noise, both the NN and Bayesian-based tracking methods enable the receiver to perform below the QNL, and close to the performance of perfect noise correction. In the presence of phase noise with bandwidth $\Delta\nu = 5$ kHz, the sub-QNL performance of the receiver is maintained, and the effect of increasing $\Sigma_\infty$ is small compared to the effects of increasing phase or amplitude bandwidths.

IV. DISCUSSION

The numerical studies in this work show that methods for channel noise tracking based on NN estimators are able to accurately track dynamic phase and amplitude noise to allow an adaptive non-Gaussian measurement to maintain performance below the QNL. We note that in the asymptotic limit of many samples available for parameter estimation for noise tracking, a Bayesian estimator can achieve minimal mean-square error [63]. However, when noise tracking and correction need to be realized in real time to reduce errors in the state discrimination measurement and generate reliable data for parameter estimation, there is always a limited number of samples from which estimates are made. In these situations, there is a trade-off between estimation precision and noise tracking bandwidth. Other estimators, such as a NN, may balance these two parameters better than a Bayesian estimator for increased precision with finite samples. This property can enable efficient methods for high-dimensional parameter tracking of complex dynamic channel noise.

The computational efficiency of the NN estimator is rooted in the small number of multiplications required to calculate a single estimate, the limited memory requirements, and in the fact that the NN method does not explicitly depend on the value of $N$ or the number of adaptive steps $L$, as opposed to the Bayesian approach. For example, the NN-based estimator in this work requires ≈ … multiplications, while a Bayesian estimator on a 100 × 100 grid would require $100^2 \times N \times L = 10^6$ multiplications, which may not be compatible with devices such as FPGAs [26, 31]. While there may be methods to reduce this computational cost, the Bayesian estimator also would require storage in memory of the full photon counting likelihood functions, putting stringent requirements on the device memory. For example, the 100 × 100 grid for the Bayesian estimator with 16-bit precision would require 800 kB of memory simply to store the likelihood functions. Moreover, to extend the noise tracking method for estimation of three noise parameters, a NN would simply require a single added output and proper retraining, while a Bayesian estimator would require possibly $10^8$ multiplications and 80 MB of memory.

The robustness and versatility of the NN-based noise tracking method described here show that NN-based methods can be practical and very useful tools for non-Gaussian receivers. In addition, other machine learning techniques, such as reinforcement learning [75, 76], could provide further benefits to these non-conventional measurements when the best detection strategy may be unknown or infeasible to calculate. We anticipate that neural networks and machine learning will have a great benefit for non-Gaussian measurements, just as these techniques have proven worthwhile for conventional measurement strategies [54–61].

V. CONCLUSION

We investigate the use of a neural network (NN) as a computationally efficient multi-parameter estimator of dynamic channel noise, enabling robust noise tracking for adaptive non-Gaussian measurements for coherent state discrimination. We study the NN-based tracking method for simultaneous amplitude and phase noise and find that the NN estimator can perform as well as a more complex Bayesian estimator. This performance is observed across a broad range of noise strengths and bandwidths for different average powers of the input coherent states. The non-Gaussian receiver used in this study can have broad applications in classical [26, 27] and quantum communication [77, 78] due to its ability to attain sensitivities beyond the quantum noise limit (QNL). Moreover, the proposed method for noise tracking uses only the data collected during the state discrimination measurement without requiring extra resources such as strong reference pulses. This makes the receiver and the proposed method for noise tracking well suited for energy-efficient low-power communications. Thus, NN-based methods are ideal candidates for real-time tracking of multiple sources of channel noise for non-Gaussian receivers, allowing them to maintain their sub-QNL sensitivity in the presence of complex dynamic channel noise.

ACKNOWLEDGMENTS

This work was supported by the National Science Foundation (NSF) (PHY-1653670, PHY-1521016, PHY-1630114). The source code is available at: github.com/UNM-QOlab/phase amp tracking nn

Appendix A: Neural network estimator training
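A weighted mean-squared-error cost of the kind used for training in this appendix can be sketched as follows. This is a minimal NumPy illustration, not the training code: the constants `width` and `floor` are hypothetical placeholders, squaring the intensity difference in the exponent is an assumption of the sketch, and normalizing by the sum of the weights is one possible convention.

```python
import numpy as np


def sample_weights(a_in, b_lo, width=1.0, floor=0.1):
    """Weight that peaks when input and LO intensities match.

    width and floor are hypothetical placeholders, and squaring the
    difference is an assumption of this sketch.
    """
    a_in = np.asarray(a_in, dtype=float)
    b_lo = np.asarray(b_lo, dtype=float)
    return np.exp(-((a_in - b_lo) ** 2) / width) + floor


def weighted_mse(targets, predictions, weights):
    """Weighted mean squared error over samples; columns are (phase, intensity)."""
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    per_sample = np.mean((targets - predictions) ** 2, axis=1)
    return float(np.sum(weights * per_sample) / np.sum(weights))


# Two hypothetical samples: matched intensities, then a large mismatch.
w = sample_weights([5.0, 5.0], [5.0, 8.0])
loss = weighted_mse([[0.1, 5.0], [0.2, 5.0]], [[0.0, 5.0], [0.2, 6.0]], w)
```

The additive floor keeps strongly mismatched samples from being ignored entirely, which is the mechanism that lets a network trained this way favor the matched regime while retaining some robustness to large amplitude fluctuations.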

To generate training data, we use Monte-Carlo simulations of the adaptive non-Gaussian measurement described in Section II(A) [26]. For a single training data element, the strategy is simulated using a randomly chosen intensity for the input state A and LO intensity B, both sampled from a uniform distribution U (0 . , . φ is then sampled from a Gaussian distribution N (0 , σ = 0 . N that comprise each sample from a uniform distribution U (2 , N depending on the noise characteristics. During the training of the NN, we use true values of the input parameters (φ, A) as the target values. This procedure using random sampling of multiple input noise parameters ensures sufficient sampling of the input parameter space, enabling training of a robust NN estimator.

To train the NN we use the TensorFlow framework [65] in Python. Table I summarizes the relevant NN and training parameters. We use the RMSprop optimizer [79] with a weighted mean-squared-error cost function. The weight w i of each training sample is given by:

w i = e − ( A i −B i ) / + 0 .

where A i and B i are the intensity of the input states and LO of the i th sample, respectively. This allows the NN to accurately estimate the parameters when the LO and input intensities are close to each other, as one would expect in practice, while also being somewhat robust to large amplitude fluctuations.

TABLE I. Neural network and training parameters.

Network parameters
  Number of layers: 10 (8 hidden)
  Hidden layer size ( l i ): { , , , , , , , }
  Activation function: Leaky ReLU ( a = 0 . σ = 1 /l i ) (Xavier)
Training parameters
  Cost function: Weighted mean squared error
  Optimizer: RMSprop (momentum=0.8)
  Learning rate: 50 × −
  Epochs: 2000

Appendix B: Kalman filter
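As a minimal sketch of the scalar phase filter specified by Eqs. (B1) and (B2) in this appendix, the function below performs one predict/update cycle. It treats every σ quantity as a variance, as the text does. The function name, argument names, and all numerical values are hypothetical and chosen only for illustration; the intensity filter of Eqs. (B3) and (B4) has the same update structure with a different prediction step and is not shown.

```python
import math


def phase_kalman_step(phi_nn, var_prior, var_nn, linewidth_hz, dt, n_meas):
    """One predict/update cycle of the scalar phase filter, Eqs. (B1)-(B2).

    All var_* arguments are variances.
    Returns (filtered phase, updated variance, Kalman gain).
    """
    # Predict, Eq. (B1): zero mean offset, variance grows over n_meas diffusion steps.
    y_pred = 0.0
    var_step = 2.0 * math.pi * linewidth_hz * dt
    var_pred = var_prior + n_meas * var_step

    # Update, Eq. (B2): blend the raw NN estimate with the prediction.
    gain = var_pred / (var_pred + var_nn)  # undefined if both variances are zero
    phi_filtered = gain * phi_nn + (1.0 - gain) * y_pred
    var_post = (1.0 - gain) * var_pred
    return phi_filtered, var_post, gain


# Hypothetical numbers, chosen only for illustration.
phi_hat, var_hat, k_phi = phase_kalman_step(
    phi_nn=0.05, var_prior=0.0, var_nn=1e-3, linewidth_hz=5e3, dt=1e-6, n_meas=10
)
```

When the accumulated channel variance dominates the estimator variance, the gain approaches one and the filter trusts the raw NN estimate; when the NN estimate is noisy relative to the drift, the gain shrinks and the correction is pulled toward the predicted zero offset.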

We implement Kalman filtering [74, 80] of both the phase estimate and intensity estimate in order to reduce the uncertainty in the applied corrections to the adaptive non-Gaussian measurement. For the phase estimates, the predicted mean value $\hat{y}_\phi$ and variance $\hat{\sigma}_\phi$ are:

$$\hat{y}_\phi = 0, \qquad \hat{\sigma}_\phi = \sigma_\phi + N\sigma, \tag{B1}$$

where $\hat{y}_\phi$ represents the predicted average value for the phase, $\sigma_\phi$ the variance of the current prior probability distribution for the phase, $\sigma = 2\pi\Delta\nu\Delta T$, and $\hat{\sigma}_\phi$ the predicted phase variance. The filtered estimate $\hat{\phi}$ is then obtained from the raw estimate $\hat{\phi}_{NN}$ and (B1) by:

$$\hat{\phi} = K_\phi \hat{\phi}_{NN} + (1 - K_\phi)\hat{y}_\phi, \qquad \sigma_\phi = (1 - K_\phi)\hat{\sigma}_\phi, \qquad K_\phi = \frac{\hat{\sigma}_\phi}{\hat{\sigma}_\phi + \sigma_{\phi,NN}}, \tag{B2}$$

where $K_\phi$ is the Kalman gain for the phase estimate, $\sigma_{\phi,NN}$ is the variance of the NN phase estimate, and $\sigma_\phi$ is the updated variance of the filtered phase estimate.

Similarly, the equations for the predicted input intensity mean value $\hat{y}_A$ and variance $\hat{\sigma}_A$ are given by:

$$\hat{y}_A = (1 - \gamma\Delta T)^N B_N + \gamma \langle\hat{n}\rangle \Delta T \sum_{k=0}^{N-1} (1 - \gamma\Delta T)^k, \qquad \hat{\sigma}_A = (1 - \gamma\Delta T)^N \sigma_A + \Sigma\,\Delta T \sum_{k=0}^{N-1} (1 - \gamma\Delta T)^k, \tag{B3}$$

where $\gamma$ is the amplitude noise bandwidth, $\langle\hat{n}\rangle$ is the average input intensity equal to $A(0)$, and $B_N$ is the LO intensity for the previous $N$ experiments. These equations result from propagating the mean and variance of Eq. (1) for $N$ time steps of duration $\Delta T$.

The filtered intensity estimate $\hat{A}$ and variance $\sigma_A$ are then obtained from the raw estimate $\hat{A}_{NN}$ and (B3) by:

$$\hat{A} = K_A \hat{A}_{NN} + (1 - K_A)\hat{y}_A, \qquad \sigma_A = (1 - K_A)\hat{\sigma}_A, \qquad K_A = \frac{\hat{\sigma}_A}{\hat{\sigma}_A + \sigma_{A,NN}}, \tag{B4}$$

where $K_A$ is the Kalman gain for the intensity estimate and $\sigma_{A,NN}$ is the variance of the NN intensity estimate. The initial ($\tau = 0$) variances for both phase and intensity are set to zero.

In order to empirically obtain the variances $\sigma_{\phi,NN}$ and $\sigma_{A,NN}$ we use Monte Carlo simulations of the experiment and the NN estimator without the Kalman filter. For all average intensities, we fix the input intensity to A ( t ) = { , , } and phase noise to zero, and calculate the variance of 10 estimates. The variance is calculated for a range of the number of experiments per estimation $N$. We fit the variance as a function of $N$ to a power law function so that the filter may be used for different values of $N$. For $N = 10$, as used in the simulations, σ A ,NN = { . , . , . } × − and σ φ,NN = { . , . , . } × − for $\langle\hat{n}\rangle$ = { , , }, respectively. As discussed in Section IV B, the NN estimator is not necessarily unbiased across the range of parameters it is trained on due to the small sample size of only $N \times L$ as well as imperfections in the NN training process.

Employing a Kalman filter allows for construction of the full NN-based parameter tracking method. Algorithm 1 shows the pseudo-code for running the NN-based method including the filtering steps. For every time step $\tau$, a single state discrimination measurement is completed, which yields detections $\{d_j\}_L$ and relative phases $\{\Delta_j\}_L$ which populate the detection matrix $D$, as described in Section IIB. Every $N$ time steps, i.e. every $N$ measurements, the NN is evaluated to provide raw estimates $\hat{A}_{NN}$, $\hat{\phi}_{NN}$ of the intensity and phase offset within the previous $N$ measurements. These raw estimates are then passed through the Kalman filter, which returns the current, filtered estimates for the intensity $\hat{A}(\tau)$ and phase offset $\hat{\phi}$. These current estimates are then used to update the LO parameters $B(\tau)$, $\delta(\tau)$ for the next $N$ state discrimination measurements.

Algorithm 1 NN Parameter Tracking Algorithm

  function KalmanFilter($\hat{A}_{NN}$, $\hat{\phi}_{NN}$)
      Predict: $\hat{y}_\phi$, $\hat{\sigma}_\phi$, $\hat{y}_A$, $\hat{\sigma}_A$    ▷ Eqs. (B1), (B3)
      $\hat{\phi} \leftarrow K_\phi \hat{\phi}_{NN} + (1 - K_\phi)\hat{y}_\phi$    ▷ Eq. (B2)
      $\hat{A}(\tau) \leftarrow K_A \hat{A}_{NN} + (1 - K_A)\hat{y}_A$    ▷ Eq. (B4)
      Update: $K_\phi$, $\sigma_\phi$, $K_A$, $\sigma_A$    ▷ Eqs. (B2), (B4)
      return $\hat{\phi}$, $\hat{A}(\tau)$
  end function

  Initial $B(0) \leftarrow \hat{A}(0)$
  $\delta(0) \leftarrow 0$
  $\tau \leftarrow 0$    ▷ Time in increments of symbol time
  $n \leftarrow 0$    ▷ Number of measurements performed
  loop
      $\{d_j\}_L$, $\{\Delta_j\}_L$ ← Single discrimination measurement
      Add $\{d_j\}_L$, $\{\Delta_j\}_L$ to detection matrix $D$
      $\tau \leftarrow \tau + \Delta T$
      $n \leftarrow n + 1$
      if $n = N$ then    ▷ Update LO every N measurements
          $\hat{A}_{NN}$, $\hat{\phi}_{NN}$ ← Evaluate NN with inputs $D$, $B(\tau)$
          Reset $D$ to zeros
          $\hat{\phi}$, $\hat{A}(\tau)$ ← KalmanFilter($\hat{A}_{NN}$, $\hat{\phi}_{NN}$)
          $B(\tau) \leftarrow \hat{A}(\tau)$    ▷ Correct LO intensity
          $\delta(\tau) \leftarrow \delta(\tau) + \hat{\phi}$    ▷ Add estimate to phase correction
          $n \leftarrow 0$    ▷ Reset measurement counter
      else    ▷ Don't update LO
          $B(\tau) \leftarrow B(\tau - \Delta T)$
          $\delta(\tau) \leftarrow \delta(\tau - \Delta T)$
      end if
  end loop

Appendix C: Different magnitudes of amplitude noise

[Figure: error probability versus long-time amplitude noise strength; curves for NN, Bayes, heterodyne, and perfect correction, at $\Delta\nu = 0$ kHz and $\Delta\nu = 5$ kHz.]

FIG. 6. Error probability as a function of the long-time variance $\Sigma_\infty$ of the amplitude noise for $\langle\hat{n}\rangle = 5$, $\gamma = 25$ kHz without and with phase noise of bandwidth $\Delta\nu = 5$ kHz.

We further investigate the performance of the NN estimator when varying the long-time strength of the amplitude noise $\Sigma_\infty$. Figure 6 shows the error probability of the receiver for $\langle\hat{n}\rangle = 5$ across a range of $\Sigma_\infty$ for fixed $\gamma = 25$ kHz without and with phase noise of bandwidth $\Delta\nu = 5$ kHz. We find that the NN-based tracking method performs similarly to the one based on the Bayesian estimator, and enables an error probability below that of the ideal heterodyne measurement. We have observed in other studies that the behavior for average input intensities $\langle\hat{n}\rangle = 2$ and 10 is similar to the one for $\langle\hat{n}\rangle = 5$ in Fig. 6.

Appendix D: Estimation Time-Bandwidth Trade-off
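The trade-off examined in this appendix can be illustrated with a toy model. The sketch below is not the simulation used for Fig. 7: it assumes that the estimator variance follows a power law in $N$ (in the spirit of the power-law fit described in Appendix B), that the channel noise accumulates a fixed variance per measurement, and that the sum of the two is a reasonable proxy for the error probability. The constants `c_est`, `power`, and `var_step` are hypothetical.

```python
def total_variance(n, c_est, power, var_step):
    """Toy cost: estimator variance (power law in n) plus noise accumulated over n steps."""
    return c_est * n ** (-power) + n * var_step


def best_n(c_est, power, var_step, n_max=100):
    """Return the n in [2, n_max] that minimizes the toy cost."""
    return min(
        range(2, n_max + 1),
        key=lambda n: total_variance(n, c_est, power, var_step),
    )


# Hypothetical low, moderate, and high per-step noise variances.
for var_step in (1e-3, 1e-2, 1e-1):
    print(var_step, best_n(c_est=1.0, power=1.0, var_step=var_step))
```

In this toy model the minimizing value of $N$ shrinks as the per-step channel noise grows, which is the qualitative behavior discussed below: faster or stronger noise favors more frequent, less precise estimates.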

The overall performance of the phase tracking also depends on the number of experiments $N$ used to obtain a single estimate. For this study, we fixed $N = 10$ for all simulations to demonstrate the versatility of the NN estimator. However, realistic implementations may use a particular channel with specific noise characteristics. In this scenario, the value of $N$ can be fine tuned to optimize the performance of the phase tracking method to balance the estimation bandwidth (smaller $N$) and estimation accuracy (larger $N$). The optimization aims to find a value of $N$ such that the overall error probability is minimized when implementing the noise tracking method. The Kalman filtering attempts to balance the effects of the estimation variance and the noise variance over $N$ measurements in an optimal way through the Kalman gain $K$. However, there is still a trade-off between these two variances and a value of $N$ which achieves minimal $P_E$ for specific channel noise conditions.

Figure 7 shows the overall error probability as a function of $N$ when implementing the NN estimator with filtering for three different sets of channel noise parameters for $\langle\hat{n}\rangle = 5.0$. The blue line corresponds to $\Delta\nu = 0.5$ kHz, $\gamma = 2.5$ kHz, and $\Sigma_\infty = 0.1$. The orange line corresponds to $\Delta\nu = 5$ kHz, $\gamma = 25$ kHz, and $\Sigma_\infty = 0.5$. The yellow line corresponds to $\Delta\nu = 50$ kHz, $\gamma = 250$ kHz, and $\Sigma_\infty = 1.0$. For small noise bandwidth and strength (blue), the optimal value of $N$ (black circles) is approximately $N = 40$, but decreases to $N = 10$ and $N = 3$ as the noise bandwidth and strength increase (orange and yellow). The inset shows the error probability normalized by the minimum for each noise strength for clarity. These optimal values of $N$ for specific channel noise parameters represent the optimal balance between estimation uncertainty and accumulated uncertainty from the channel noise. Thus, for a channel with known noise characteristics, an optimal value of $N$ can be found. In the studies presented in the main manuscript, we fixed $N = 10$, since we found that this value allows the receiver to be versatile and operate well across a wide range of noise bandwidths. In the inset, $N = 10$ is optimal for moderate noise levels. For small and large noise levels, the error at $N = 10$ is only slightly higher compared to their respective minimums. Thus, for a specific well-known channel an optimal $N$ can be implemented, but for a robust and versatile implementation a different value of $N$ may be beneficial.

[Figure: error probability $P_E$ versus experiments per estimation ($N$); inset: $P_E / P_{E,MIN}$.]

FIG. 7. Error probability as a function of $N$ for three different noise strengths: low noise (blue), moderate noise (orange), and high noise (yellow). The black circles show the minimum error probability $P_{E,MIN}$ and corresponding optimal value for $N$ for each noise regime. The inset shows the error probability normalized by the minimum error given by the black circles in the main figure.

[1] V. Giovannetti, S. Guha, S. Lloyd, L. Maccone, J. H. Shapiro, and H. P. Yuen, Classical capacity of the lossy bosonic channel: The exact solution, Phys. Rev. Lett. , 027902 (2004).
[2] G. Li, Recent advances in coherent optical communication, Advances in Optics and Photonics , 279 (2009).
[3] E. Ip, A. P. T. Lau, D. J. F. Barros, and J. M. Kahn, Coherent detection in optical fiber systems, Opt. Express , 753 (2008).
[4] K. Kikuchi and S. Tsukamoto, Evaluation of sensitivity of the digital coherent receiver, Lightwave Technology, Journal of , 1817 (2008).
[5] J. M. Arrazola and N. Lütkenhaus, Quantum communication with coherent states and linear optics, Phys. Rev. A , 042335 (2014).
[6] S. Ghorai, P. Grangier, E. Diamanti, and A. Leverrier, Asymptotic security of continuous-variable quantum key distribution with a discrete modulation, Phys. Rev. X , 021059 (2019).
[7] S. Pirandola, R. Laurenza, C.
Ottaviani, and L. Banchi, Fundamental limits of repeaterless quantum communications, Nat. Commun. , 15043 (2017).
[8] N. Gisin, G. Ribordy, W. Tittel, and H. Zbinden, Quantum cryptography, Rev. Mod. Phys. , 145 (2002).
[9] N. Gisin and R. Thew, Quantum communication, Nature Photon. , 165 (2007).
[10] K. Kikuchi, Fundamentals of coherent optical fiber communications, Lightwave Technology, Journal of , 157 (2016).
[11] P. Jouguet, S. Kunz-Jacques, A. Leverrier, P. Grangier, and E. Diamanti, Experimental demonstration of long-distance continuous-variable quantum key distribution, Nature Photonics , 378 (2013).
[12] A. G. Armada and M. Calvo, Phase noise and sub-carrier spacing effects on the performance of an OFDM communication system, IEEE Communications Letters (1998).
[13] A. Marie and R. Alléaume, Self-coherent phase reference sharing for continuous-variable quantum key distribution, Phys. Rev. A , 012316 (2017).
[14] B. Qi, P. Lougovski, R. Pooser, W. Grice, and M. Bobrek, Generating the local oscillator "locally" in continuous-variable quantum key distribution based on coherent detection, Phys. Rev. X , 041009 (2015).
[15] D. B. S. Soh, C. Brif, P. J. Coles, N. Lütkenhaus, R. M. Camacho, J. Urayama, and M. Sarovar, Self-referenced continuous-variable quantum key distribution protocol, Phys. Rev. X , 041010 (2015).
[16] T. Wang, P. Huang, S. Wang, and G. Zeng, Carrier-phase estimation for simultaneous quantum key distribution and classical communication using a real local oscillator, Phys. Rev. A , 022318 (2019).
[17] E. Ip and J. M. Kahn, Feedforward carrier recovery for coherent optical communications, Journal of Lightwave Technology (2007).
[18] D. S. Ly-Gagnon, S. Tsukamoto, K. Katoh, and K. Kikuchi, Coherent detection of optical quadrature phase-shift keying signals with carrier phase estimation, Journal of Lightwave Technology (2006).
[19] M. Bina, A. Allevi, M. Bondani, and S. Olivares, Phase-reference monitoring in coherent-state discrimination assisted by a photon-number resolving detector, Scientific Reports , 26025 (2016).
[20] C. W. Helstrom, Quantum detection and estimation theory, Mathematics in Science and Engineering Vol. 123 (Academic Press, New York, 1976).
[21] V. Giovannetti, R. García-Patrón, N. J. Cerf, and A. S. Holevo, Ultimate classical communication rates of quantum optical channels, Nature Photonics , 796 (2014).
[22] K. Banaszek, L. Kunz, M. Jachura, and M. Jarzyna, Quantum Limits in Optical Communications, Journal of Lightwave Technology , 2741 (2020).
[23] C. R. Müller, M. A. Usuga, C. Wittmann, M. Takeoka, C. Marquardt, U. L. Andersen, and G. Leuchs, Quadrature phase shift keying coherent state discrimination via a hybrid receiver, New J. of Phys. , 083009 (2012).
[24] C. R. Müller and C. Marquardt, A robust quantum receiver for phase shift keyed signals, New Journal of Physics , 032003 (2015).
[25] F. E. Becerra, J. Fan, G. Baumgartner, J. Goldhar, J. T. Kosloski, and A. Migdall, Experimental demonstration of a receiver beating the standard quantum limit for multiple nonorthogonal state discrimination, Nature Photonics , 147 (2013).
[26] F. E. Becerra, J. Fan, and A. Migdall, Photon number resolution enables quantum receiver for realistic coherent optical communications, Nat. Photonics (2015).
[27] J. Lee, S.-W. Ji, J. Park, and H. Nha, Gaussian benchmark for optical communication aiming towards ultimate capacity, Phys. Rev. A , 050302 (2016).
[28] A. R. Ferdinand, M. T. DiMario, and F. E. Becerra, Multi-state discrimination below the quantum noise limit at the single-photon level, npj Quantum Information , 43 (2017).
[29] S. Izumi, M. Takeoka, M. Fujiwara, N. D. Pozza, A. Assalini, K. Ema, and M. Sasaki, Displacement receiver for phase-shift-keyed coherent states, Phys. Rev. A , 042328 (2012).
[30] S. Izumi, J. S. Neergaard-Nielsen, S. Miki, H. Terai, and U. L. Andersen, Experimental Demonstration of a Quantum Receiver Beating the Standard Quantum Limit at Telecom Wavelength, Physical Review Applied , 054015 (2020).
[31] M. T. DiMario and F. E. Becerra, Phase tracking for sub-shot-noise-limited receivers, Phys. Rev. Research , 023384 (2020).
[32] M. Bina, A. Allevi, M. Bondani, and S. Olivares, Homodyne-like detection for coherent state-discrimination in the presence of phase noise, Opt. Express , 10685 (2017).
[33] J. L. Habif, A. Jagannathan, S. Gartenstein, P. Amory, and S. Guha, Quantum-limited discrimination of laser light and thermal light, quant-ph/arXiv:1912.06718v1 (2019).
[34] R. Yuan, M. Zhao, S. Han, and J. Cheng, Kennedy receiver using threshold detection and optimized displacement under thermal noise, IEEE Communications Letters , 1 (2020).
[35] M. T. DiMario, L. Kunz, K. Banaszek, and F. E. Becerra, Optimized communication strategies with binary coherent states over phase noise channels, npj Quantum Information , 65 (2019).
[36] M. G. Genoni, S. Olivares, and M. G. A. Paris, Optical phase estimation in the presence of phase diffusion, Phys. Rev. Lett. , 153603 (2011).
[37] M. G. Genoni, S. Olivares, D. Brivio, S. Cialdi, D. Cipriani, A. Santamato, S. Vezzoli, and M. G. A. Paris, Optical interferometry in the presence of large phase diffusion, Phys. Rev. A , 043817 (2012).
[38] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine learning and the physical sciences, Rev. Mod. Phys. , 045002 (2019).
[39] J. Wallnöfer, A. A. Melnikov, W. Dür, and H. J. Briegel, Machine learning for long-distance quantum communication, arXiv.org quant-ph/arXiv:1904.10797 (2019).
[40] F. N. Khan, Q. Fan, C. Lu, and A. P. T. Lau, An optical communication's perspective on machine learning and its applications, Journal of Lightwave Technology , 493 (2019).
[41] J. Mata, I. de Miguel, R. J. Durán, N. Merayo, S. K. Singh, A. Jukan, and M. Chamania, Artificial intelligence (AI) methods in optical networks: A comprehensive survey, Optical Switching and Networking , 43 (2018).
[42] M. Chen, U. Challita, W. Saad, C. Yin, and M. Debbah, Artificial Neural Networks-Based Machine Learning for Wireless Networks: A Tutorial, IEEE Communications Surveys Tutorials , 3039 (2019).
[43] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks , 85 (2015).
[44] V. Dunjko and H. J. Briegel, Machine learning & artificial intelligence in the quantum domain: a review of recent progress, Reports on Progress in Physics , 074001 (2018).
[45] A. A. Melnikov, H. Poulsen Nautrup, M. Krenn, V. Dunjko, M. Tiersch, A. Zeilinger, and H. J. Briegel, Active learning machine learns to create new quantum experiments, Proceedings of the National Academy of Sciences , 1221 (2018).
[46] A. Hentschel and B. C. Sanders, Machine learning for precise quantum measurement, Phys. Rev. Lett. , 063603 (2010).
[47] A. Lumino, E. Polino, A. S. Rab, G. Milani, N. Spagnolo, N. Wiebe, and F. Sciarrino, Experimental phase estimation enhanced by machine learning, Phys. Rev. Applied , 044033 (2018).
[48] L. J. Fiderer, J. Schuff, and D. Braun, Neural-network heuristics for adaptive bayesian quantum estimation, arXiv.org quant-ph/arXiv:2003.021836 (2020).
[49] V. Cimini, I. Gianani, N. Spagnolo, F. Leccese, F. Sciarrino, and M. Barbieri, Calibration of quantum sensors by neural networks, Phys. Rev. Lett. , 230502 (2019).
[50] T. Giordani, A. Suprano, E. Polino, F. Acanfora, L. Innocenti, A. Ferraro, M. Paternostro, N. Spagnolo, and F. Sciarrino, Machine learning-based classification of vector vortex beams, Phys. Rev. Lett. , 160401 (2020).
[51] S. Lohani, E. M. Knutson, M. O'Donnell, S. D. Huver, and R. T. Glasser, On the use of deep neural networks in optical communications, Appl. Opt. , 4180 (2018).
[52] G. R. Steinbrecher, J. P. Olson, D. Englund, and J. Carolan, Quantum optical neural networks, npj Quantum Information , 60 (2019).
[53] K. Beer, D. Bondarenko, T. Farrelly, T. J. Osborne, R. Salzmann, D. Scheiermann, and R. Wolf, Training deep quantum neural networks, Nat. Commun. , 808 (2020).
[54] J. Thrane, J. Wass, M. Piels, J. C. M. Diniz, R. Jones, and D. Zibar, Machine learning techniques for optical performance monitoring from directly detected PDM-QAM signals, Journal of Lightwave Technology , 868 (2017).
[55] X. Wu, J. A. Jargon, R. A. Skoog, L. Paraschis, and A. E. Willner, Applications of artificial neural networks in optical performance monitoring, J. Lightwave Technol. , 3580 (2009).
[56] D. Zibar, M. Piels, R. Jones, and C. G. Schäeffer, Machine learning techniques in optical communication, Journal of Lightwave Technology , 1442 (2016).
[57] S. Lohani and R. T. Glasser, Generative machine learning for robust free-space communication, arXiv , 190902249 (2019).
[58] B. Karanov, M. Chagnon, F. Thouin, T. A. Eriksson, H. Bülow, D. Lavery, P. Bayvel, and L. Schmalen, End-to-end deep learning of optical fiber communications, Journal of Lightwave Technology , 4843 (2018).
[59] D. Zibar, L. H. H. de Carvalho, M. Piels, A. Doberstein, J. Diniz, B. Nebendahl, C. Franciscangelis, J. Estaran, H. Haisch, N. G. Gonzalez, J. C. R. F. de Oliveira, and I. T. Monroy, Application of machine learning techniques for amplitude and phase noise characterization, Journal of Lightwave Technology , 1333 (2015).
[60] F. N. Khan, K. Zhong, X. Zhou, W. H. Al-Arashi, C. Yu, C. Lu, and A. P. T. Lau, Joint OSNR monitoring and modulation format identification in digital coherent receivers using deep neural networks, Optics Express , 17767 (2017).
[61] D. Wang, M. Zhang, Z. Li, J. Li, M. Fu, Y. Cui, and X. Chen, Modulation format recognition and OSNR estimation using CNN-based deep learning, IEEE Photonics Technology Letters , 1667 (2017).
[62] We note that in conventional optical communications, the use of amplification prior to detection can be used to reduce the probability of error. In those situations, the use of non-Gaussian receivers can further reduce the error rates far beyond what could be achieved by heterodyne measurements. For example, given an input pulse power corresponding to $\langle\hat{n}\rangle = 5.0$, the error reduction for a non-Gaussian receiver ($P_{E,NG}$) compared to the heterodyne receiver at the QNL equals a factor of QNL$/P_{E,NG} = 17$. At this initial power, the heterodyne receiver would require 3 dB of noiseless gain to reach the non-amplified non-Gaussian receiver. On the other hand, if the same 3 dB amplifier is used with the non-Gaussian receiver, and compared to the amplified heterodyne receiver, then this ratio grows to QNL$/P_{E,NG} = 210$.
[63] E. L. Lehmann and G. Casella, Theory of Point Estimation, 2nd ed. (Springer-Verlag, New York, 1998).
[64] A. L. Maas, A. Y. Hannun, and A. Y. Ng, Rectifier nonlinearities improve neural network acoustic models, in ICML, Vol. 30 (2013).
[65] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., Tensorflow: A system for large-scale machine learning, in (2016) pp. 265–283, software available from tensorflow.org.
[66] V. Sze, Y.-H. Chen, T.-J. Yang, and J. S. Emer, Efficient Processing of Deep Neural Networks: A Tutorial and Survey, Proceedings of the IEEE , 2295 (2017).
[67] A. Jain, J. Mao, and K. Mohiuddin, Artificial neural networks: a tutorial, Computer , 31 (1996).
[68] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016).
[69] A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow (O'Reilly Media, Inc., 2017).
[70] M. R. Khanzadi, Phase Noise in Communication Systems: Modeling, Compensation, and Performance Analysis, Ph.D. thesis, Chalmers University of Technology (2015).
[71] I. Holzman and Y. Ivry, Superconducting nanowires for single-photon detection: Progress, challenges, and opportunities, Advanced Quantum Technologies , 1800058 (2019).
[72] G. E. Uhlenbeck and L. S. Ornstein, On the theory of the Brownian motion, Phys. Rev. , 823 (1930).
[73] D. T. Gillespie, The mathematics of Brownian motion and Johnson noise, American Journal of Physics , 225 (1996).
[74] R. E. Kalman, A New Approach to Linear Filtering and Prediction Problems, Journal of Basic Engineering , 35 (1960).
[75] M. Bilkis, M. Rosati, R. M. Yepes, and J. Calsamiglia, Real-time calibration of coherent-state receivers: learning by trial and error, arXiv.org quant-ph/arXiv:2001.10283 (2020).
[76] T. Fösel, P. Tighineanu, T. Weiss, and F. Marquardt, Reinforcement learning with neural networks for quantum feedback, Phys. Rev. X , 031084 (2018).
[77] Q. Liao, Y. Guo, D. Huang, P. Huang, and G. Zeng, Long-distance continuous-variable quantum key distribution using non-Gaussian state-discrimination detection, New Journal of Physics , 023015 (2018).
[78] J.-Y. Guan, F. Xu, H.-L. Yin, Y. Li, W.-J. Zhang, S.-J. Chen, X.-Y. Yang, L. Li, L.-X. You, T.-Y. Chen, Z. Wang, Q. Zhang, and J.-W. Pan, Observation of quantum fingerprinting beating the classical limit, Phys. Rev. Lett. 116