Optimal waveform estimation for classical and quantum systems via time-symmetric smoothing

Abstract

Classical and quantum theories of time-symmetric smoothing, which can be used to optimally estimate waveforms in classical and quantum systems, are derived using a discrete-time approach, and the similarities between the two theories are emphasized. Application of the quantum theory to homodyne phase-locked loop design for phase estimation with narrowband squeezed optical beams is studied. The relation between the proposed theory and Aharonov et al.'s weak value theory is also explored.


arXiv [quant-ph], Aug.

Mankei Tsang∗
Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
(Dated: January 1, 2018)

PACS numbers: 03.65.Ta, 03.65.Yz, 42.50.Dv

I. INTRODUCTION

FIG. 1: (Color online). Four classes of estimation problems, depending on the observation time interval relative to $\tau$, the time at which the signal is to be estimated.

Estimation theory is the science of determining the state of a system, such as a die, an aircraft, or the weather in Boston, from noisy observations [1, 2, 3, 4]. As shown in Fig. 1, estimation problems can be classified into four classes, namely, prediction, filtering, retrodiction, and smoothing. For applications that do not require real-time data, such as sensing and communication, smoothing is the most accurate estimation technique.

I have recently proposed a time-symmetric quantum theory of smoothing, which allows one to optimally estimate classical diffusive Markov random processes, such as gravitational waves or magnetic fields, coupled to a quantum system, such as a quantum mechanical oscillator or an atomic spin ensemble, under continuous measurements [5]. In this paper, I shall demonstrate in more detail the derivation of this theory using a discrete-time approach, and how it closely parallels the classical time-symmetric smoothing theory proposed by Pardoux [6]. I shall apply the theory to the design of homodyne phase-locked loops (PLL) for narrowband squeezed optical beams, as previously considered by Berry and Wiseman [7]. I shall show that their approach can be regarded as a special case of my theory, and discuss how their results can be generalized and improved. I shall also discuss the weak value theory proposed by Aharonov et al. [8] in relation to the smoothing theory, and how their theory may be regarded as a smoothing theory for quantum degrees of freedom. In particular, the smoothing quasiprobability distribution proposed in Ref. [5] is shown to arise naturally from the statistics of weak position and momentum measurements.

This paper is organized as follows: In Sec. II, Pardoux's classical time-symmetric smoothing theory is derived using a discrete-time approach, which is then generalized to the quantum regime for hybrid classical-quantum smoothing in Sec. III. Application of the hybrid classical-quantum smoothing theory to PLL design is studied in Sec. IV. The relation between the smoothing theory and Aharonov et al.'s weak value theory is then discussed in Sec. V. Sec. VI concludes the paper and points out some possible extensions of the proposed theory.

∗ Electronic address: [email protected]

II. CLASSICAL SMOOTHING

A. Problem statement

Consider the classical smoothing problem depicted in Fig. 2. Let

$$x_t \equiv \begin{pmatrix} x_{1t} \\ x_{2t} \\ \vdots \\ x_{nt} \end{pmatrix} \tag{2.1}$$

be a vectoral diffusive Markov random process that satisfies the system Itô differential equation [1]

$$dx_t = A(x_t,t)\,dt + B(x_t,t)\,dW_t, \tag{2.2}$$

where $dW_t$ is a vectoral Wiener increment with mean and covariance matrix given by

$$\langle dW_t\rangle = 0, \tag{2.3}$$

$$\langle dW_t\, dW_t^T\rangle = Q(t)\,dt. \tag{2.4}$$

The superscript $T$ denotes the transpose.

FIG. 2: (Color online). The classical smoothing problem.

The vectoral observation process $dy_t$ satisfies the observation Itô equation

$$dy_t = C(x_t,t)\,dt + dV_t, \tag{2.5}$$

where $dV_t$ is another vectoral Wiener increment with mean and covariance matrix given by

$$\langle dV_t\rangle = 0, \tag{2.6}$$

$$\langle dV_t\, dV_t^T\rangle = R(t)\,dt. \tag{2.7}$$

For generality and later purposes, $dW_t$ and $dV_t$ are assumed to be correlated, with covariance

$$\langle dW_t\, dV_t^T\rangle = S(t)\,dt. \tag{2.8}$$

Define the observation record in the time interval $[t_1, t_2)$ as

$$dy_{[t_1,t_2)} \equiv \{dy_t,\ t_1 \le t < t_2\}. \tag{2.9}$$

The goal of smoothing is to calculate the conditional probability density of $x_\tau$, given the observation record $dy_{[t_0,T)}$ in the time interval $t_0 \le \tau \le T$.

It is more intuitive to consider the problem in discrete time first. The discrete-time versions of the system and observation equations (2.2) and (2.5) are

$$\delta x_t = A(x_t,t)\,\delta t + B(x_t,t)\,\delta W_t, \tag{2.10}$$

$$\delta y_t = C(x_t,t)\,\delta t + \delta V_t. \tag{2.11}$$

The observation record

$$\delta y_{[t_0,T-\delta t]} \equiv \{\delta y_{t_0}, \delta y_{t_0+\delta t}, \ldots, \delta y_{T-\delta t}\} \tag{2.12}$$

also becomes discrete. The covariance matrices for the increments are

$$\langle \delta W_t\, \delta W_t^T\rangle = Q(t)\,\delta t, \tag{2.13}$$

$$\langle \delta V_t\, \delta V_t^T\rangle = R(t)\,\delta t, \tag{2.14}$$

$$\langle \delta W_t\, \delta V_t^T\rangle = S(t)\,\delta t, \tag{2.15}$$

and the increments at different times are independent of one another.
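As a concrete illustration of the discrete-time model in Eqs. (2.10)-(2.15), the following minimal sketch simulates a scalar instance with correlated system and observation increments. The linear choices $A = a x$, $B = b$, $C = c x$, the constant scalars `q`, `r`, `s`, and all function names are assumptions made for this example only; they are not part of the original text.

```python
import math
import random


def correlated_increments(q, r, s, dt, rng):
    """Draw one pair (dW, dV) of zero-mean Gaussian increments with
    <dW^2> = q*dt, <dV^2> = r*dt and <dW dV> = s*dt (scalar case)."""
    if r <= 0 or s * s > q * r:
        raise ValueError("need r > 0 and s^2 <= q*r for a valid covariance")
    dv = math.sqrt(r * dt) * rng.gauss(0.0, 1.0)
    # Independent part of dW, chosen so that the total variance is q*dt.
    du = math.sqrt((q - s * s / r) * dt) * rng.gauss(0.0, 1.0)
    dw = (s / r) * dv + du
    return dw, dv


def simulate(a, b, c, q, r, s, x0, dt, n_steps, seed=0):
    """Euler-type iteration of Eqs. (2.10) and (2.11) for a scalar linear
    model (assumed here): A = a*x, B = b, C = c*x."""
    rng = random.Random(seed)
    x = x0
    xs, dys = [x0], []
    for _ in range(n_steps):
        dw, dv = correlated_increments(q, r, s, dt, rng)
        dys.append(c * x * dt + dv)      # observation increment, Eq. (2.11)
        x = x + a * x * dt + b * dw      # system increment, Eq. (2.10)
        xs.append(x)
    return xs, dys
```

Calling `simulate(-1.0, 1.0, 1.0, 2.0, 1.0, 0.5, 0.0, 0.01, 1000)` returns a sample path of the signal together with the matching list of observation increments.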
Because $\delta W_t$ and $\delta V_t$ are proportional to $\sqrt{\delta t}$, one should keep all linear and quadratic terms of the Wiener increments in an equation according to Itô calculus when taking the continuous time limit.

With correlated $\delta W_t$ and $\delta V_t$, it is preferable, for technical reasons, to rewrite the system equation (2.10) as [2]

$$\delta x_t = A(x_t,t)\,\delta t + B(x_t,t)\,\delta W_t + D(x_t,t)\left[\delta y_t - C(x_t,t)\,\delta t - \delta V_t\right], \tag{2.16}$$

where $D(x_t,t)$ can be arbitrarily set because the expression in square brackets is zero. The system equation becomes

$$\delta x_t = \left[A(x_t,t) - D(x_t,t)C(x_t,t)\right]\delta t + D(x_t,t)\,\delta y_t + B(x_t,t)\,\delta W_t - D(x_t,t)\,\delta V_t. \tag{2.17}$$

The new system noise is

$$\delta Z_t \equiv B(x_t,t)\,\delta W_t - D(x_t,t)\,\delta V_t, \tag{2.18}$$

$$\langle \delta Z_t\, \delta Z_t^T\rangle = \left[B(x_t,t)Q(t)B^T(x_t,t) + D(x_t,t)R(t)D^T(x_t,t) - B(x_t,t)S(t)D^T(x_t,t) - D(x_t,t)S^T(t)B^T(x_t,t)\right]\delta t. \tag{2.19}$$

The covariance between the new system noise $\delta Z_t$ and the observation noise $\delta V_t$ is

$$\langle \delta Z_t\, \delta V_t^T\rangle = \left[B(x_t,t)S(t) - D(x_t,t)R(t)\right]\delta t, \tag{2.20}$$

and can be made to vanish if one lets

$$D(x_t,t) = B(x_t,t)S(t)R^{-1}(t). \tag{2.21}$$

The new equivalent system and observation model is then

$$\delta x_t = A(x_t,t)\,\delta t + B(x_t,t)S(t)R^{-1}(t)\left[\delta y_t - C(x_t,t)\,\delta t\right] + B(x_t,t)\,\delta U_t, \tag{2.22}$$

$$\delta y_t = C(x_t,t)\,\delta t + \delta V_t, \tag{2.23}$$

with covariances

$$\langle \delta U_t\, \delta U_t^T\rangle = \left[Q(t) - S(t)R^{-1}(t)S^T(t)\right]\delta t, \tag{2.24}$$

$$\langle \delta V_t\, \delta V_t^T\rangle = R(t)\,\delta t, \tag{2.25}$$

$$\langle \delta U_t\, \delta V_t^T\rangle = 0. \tag{2.26}$$

The new system and observation noises are now independent, but note that $\delta x_t$ becomes dependent on $\delta y_t$.

B. Time-symmetric approach

According to the Bayes theorem, the smoothing probability density for $x_\tau$ can be expressed as

$$P(x_\tau|\delta y_{[t_0,T-\delta t]}) = \frac{P(\delta y_{[t_0,T-\delta t]}|x_\tau)\,P(x_\tau)}{P(\delta y_{[t_0,T-\delta t]})}, \tag{2.27}$$

$$P(\delta y_{[t_0,T-\delta t]}) = \int dx_\tau\, P(\delta y_{[t_0,T-\delta t]}|x_\tau)\,P(x_\tau), \tag{2.28}$$

where

$$\int dx_\tau \equiv \int dx_{1\tau} \cdots \int dx_{n\tau} \tag{2.29}$$

and $P(x_\tau)$ is the a priori probability density, which represents one's knowledge of $x_\tau$ absent any observation. Functions of $x_\tau$ are assumed to also depend implicitly on $\tau$. Splitting $\delta y_{[t_0,T-\delta t]}$ into the past record

$$\delta y_{\rm past} \equiv \delta y_{[t_0,\tau-\delta t]} \tag{2.30}$$

and the future record

$$\delta y_{\rm future} \equiv \delta y_{[\tau,T-\delta t]} \tag{2.31}$$

relative to time $\tau$, $P(\delta y_{[t_0,T-\delta t]}|x_\tau)$ in Eq. (2.27) can be rewritten as

$$P(\delta y_{[t_0,T-\delta t]}|x_\tau) = P(\delta y_{\rm past}, \delta y_{\rm future}|x_\tau) = P(\delta y_{\rm future}|\delta y_{\rm past}, x_\tau)\,P(\delta y_{\rm past}|x_\tau). \tag{2.32}$$

Because $\delta V_t$ are independent increments, the future record is independent of the past record given $x_\tau$, and

$$P(\delta y_{[t_0,T-\delta t]}|x_\tau) = P(\delta y_{\rm future}|x_\tau)\,P(\delta y_{\rm past}|x_\tau). \tag{2.33}$$

Equation (2.27) becomes

$$P(x_\tau|\delta y_{[t_0,T-\delta t]}) = \frac{P(\delta y_{\rm future}|x_\tau)\,P(\delta y_{\rm past}|x_\tau)\,P(x_\tau)}{\int dx_\tau\,(\text{numerator})} = \frac{P(\delta y_{\rm future}|x_\tau)\,P(x_\tau|\delta y_{\rm past})}{\int dx_\tau\,(\text{numerator})}. \tag{2.34}$$

Thus, the smoothing density can be obtained by combining the filtering probability density $P(x_\tau|\delta y_{\rm past})$ and a retrodictive likelihood function $P(\delta y_{\rm future}|x_\tau)$.

C. Filtering

To derive an equation for the filtering probability density $P(x_\tau|\delta y_{\rm past})$, first express $P(x_{t+\delta t}|\delta y_{[t_0,t]})$ in terms of $P(x_t|\delta y_{[t_0,t]})$ as

$$P(x_{t+\delta t}|\delta y_{[t_0,t]}) = \int dx_t\, P(x_{t+\delta t}, x_t|\delta y_{[t_0,t]}) = \int dx_t\, P(x_{t+\delta t}|x_t, \delta y_{[t_0,t]})\,P(x_t|\delta y_{[t_0,t]}). \tag{2.35}$$

$P(x_{t+\delta t}|x_t, \delta y_{[t_0,t]}) = P(x_{t+\delta t}|x_t, \delta y_t, \delta y_{[t_0,t-\delta t]})$ can be determined from the system equation (2.22) and is equal to $P(x_{t+\delta t}|x_t, \delta y_t)$, due to the Markovian nature of the system process. So

$$P(x_{t+\delta t}|\delta y_{[t_0,t]}) = \int dx_t\, P(x_{t+\delta t}|x_t, \delta y_t)\,P(x_t|\delta y_{[t_0,t]}), \tag{2.36}$$

which is a generalized Chapman-Kolmogorov equation [9]. From Eqs. (2.22) and (2.24), $P(x_{t+\delta t}|x_t, \delta y_t)$ is

$$P(x_{t+\delta t}|x_t, \delta y_t) \propto \exp\left\{-\frac{1}{2}\,\delta Z_t^T\left[B(x_t,t)\left(Q(t) - S(t)R^{-1}(t)S^T(t)\right)B^T(x_t,t)\,\delta t\right]^{-1}\delta Z_t\right\}, \tag{2.37}$$

where

$$\delta Z_t \equiv x_{t+\delta t} - x_t - A(x_t,t)\,\delta t - B(x_t,t)S(t)R^{-1}(t)\left[\delta y_t - C(x_t,t)\,\delta t\right]. \tag{2.38}$$

Next, write $P(x_t|\delta y_{[t_0,t]})$ in terms of $P(x_t|\delta y_{[t_0,t-\delta t]})$ using the Bayes theorem as

$$P(x_t|\delta y_{[t_0,t]}) = P(x_t|\delta y_{[t_0,t-\delta t]}, \delta y_t) = \frac{P(\delta y_t|x_t, \delta y_{[t_0,t-\delta t]})\,P(x_t|\delta y_{[t_0,t-\delta t]})}{\int dx_t\,(\text{numerator})} = \frac{P(\delta y_t|x_t)\,P(x_t|\delta y_{[t_0,t-\delta t]})}{\int dx_t\,(\text{numerator})}, \tag{2.39}$$

where $P(\delta y_t|x_t, \delta y_{[t_0,t-\delta t]}) = P(\delta y_t|x_t)$ due to the Markovian property of the observation process. $P(\delta y_t|x_t)$ is determined by the observation equation (2.23) and given by

$$P(\delta y_t|x_t) \propto \exp\left\{-\frac{1}{2}\left[\delta y_t - C(x_t,t)\,\delta t\right]^T\left[R(t)\,\delta t\right]^{-1}\left[\delta y_t - C(x_t,t)\,\delta t\right]\right\}. \tag{2.40}$$

Hence, starting with the a priori probability density $P(x_{t_0})$, one can solve for $P(x_\tau|\delta y_{\rm past})$ by iterating the formula

$$P(x_{t+\delta t}|\delta y_{[t_0,t]}) = \int dx_t\, P(x_{t+\delta t}|x_t, \delta y_t)\,\frac{P(\delta y_t|x_t)\,P(x_t|\delta y_{[t_0,t-\delta t]})}{\int dx_t\,(\text{numerator})}. \tag{2.41}$$

To obtain a stochastic differential equation for the filtering probability density, defined as

$$F(x,t) \equiv P(x_t = x|dy_{[t_0,t)}) \tag{2.42}$$

in the continuous time limit, one should expand Eq. (2.41) to first order with respect to $\delta t$ and second order with respect to $\delta y_t$ in a Taylor series, then apply the rules of Itô calculus. The result is the Kushner-Stratonovich (KS) equation [1, 10], generalized for correlated system and observation noises by Fujisaki et al. [11], given by

$$dF = -dt\sum_\mu \frac{\partial}{\partial x_\mu}(A_\mu F) + \frac{dt}{2}\sum_{\mu,\nu}\frac{\partial^2}{\partial x_\mu \partial x_\nu}\left[\left(BQB^T\right)_{\mu\nu}F\right] + \left(C - \langle C\rangle_F\right)^T R^{-1}\,d\eta_t\,F - \sum_\mu \frac{\partial}{\partial x_\mu}\left[\left(BSR^{-1}d\eta_t\right)_\mu F\right], \tag{2.43}$$

where

$$dF \equiv F(x,t+dt) - F(x,t), \tag{2.44}$$

$$\langle C\rangle_F \equiv \int dx\, C(x,t)F(x,t), \tag{2.45}$$

$$d\eta_t \equiv dy_t - dt\,\langle C\rangle_F. \tag{2.46}$$

The initial condition is

$$F(x,t_0) = P(x_{t_0}). \tag{2.47}$$

$d\eta_t$ is called the innovation process and is also a Wiener increment with covariance matrix $R(t)\,dt$ [11, 12].

A linear stochastic equation for an unnormalized $F$ is called the Duncan-Mortensen-Zakai (DMZ) equation [6, 13], given by

$$df = -dt\sum_\mu \frac{\partial}{\partial x_\mu}(A_\mu f) + \frac{dt}{2}\sum_{\mu,\nu}\frac{\partial^2}{\partial x_\mu \partial x_\nu}\left[\left(BQB^T\right)_{\mu\nu}f\right] + C^T R^{-1}\,dy_t\,f - \sum_\mu \frac{\partial}{\partial x_\mu}\left[\left(BSR^{-1}dy_t\right)_\mu f\right], \tag{2.48}$$

where the normalization is

$$F(x,t) = \frac{f(x,t)}{\int dx\, f(x,t)}. \tag{2.49}$$

D. Retrodiction and smoothing

To solve for the retrodictive likelihood function $P(\delta y_{\rm future}|x_\tau)$, note that

$$P(\delta y_{\rm future}) = \int dx_\tau\, P(\delta y_{\rm future}|x_\tau)\,P(x_\tau), \tag{2.50}$$

but $P(\delta y_{\rm future})$ can also be expressed in terms of the multitime probability density as

$$P(\delta y_{[\tau,T-\delta t]}) = \int Dx_{[\tau,T]}\, P(x_{[\tau,T]}, \delta y_{[\tau,T-\delta t]}), \tag{2.51}$$

where

$$x_{[\tau,T]} \equiv \{x_\tau, x_{\tau+\delta t}, \ldots, x_T\}, \tag{2.52}$$

$$\int Dx_{[\tau,T]} \equiv \int dx_\tau \int dx_{\tau+\delta t} \cdots \int dx_T. \tag{2.53}$$

The multitime density can be rewritten as

$$P(x_{[\tau,T]}, \delta y_{[\tau,T-\delta t]}) = P(x_T|x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-\delta t]})\,P(x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-\delta t]}). \tag{2.54}$$

Again using the Markovian property of the system process,

$$P(x_T|x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-\delta t]}) = P(x_T|x_{T-\delta t}, \delta y_{T-\delta t}), \tag{2.55}$$

which can be determined from the system equation (2.22) and is given by Eq. (2.37). Furthermore, $P(x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-\delta t]})$ in Eq. (2.54) can be expressed as

$$P(x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-\delta t]}) = P(\delta y_{T-\delta t}|x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-2\delta t]})\,P(x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-2\delta t]}). \tag{2.56}$$

Using the Markovian property of the observation process,

$$P(\delta y_{T-\delta t}|x_{[\tau,T-\delta t]}, \delta y_{[\tau,T-2\delta t]}) = P(\delta y_{T-\delta t}|x_{T-\delta t}), \tag{2.57}$$

which can be determined from the observation equation (2.23) and is given by Eq. (2.40). Applying Eqs. (2.54), (2.55), (2.56), and (2.57) repeatedly, one obtains

$$\begin{aligned} P(\delta y_{[\tau,T-\delta t]}) = {} & \int dx_T \int dx_{T-\delta t}\, P(x_T|x_{T-\delta t}, \delta y_{T-\delta t})\,P(\delta y_{T-\delta t}|x_{T-\delta t}) \\ & \times \int dx_{T-2\delta t}\, P(x_{T-\delta t}|x_{T-2\delta t}, \delta y_{T-2\delta t})\,P(\delta y_{T-2\delta t}|x_{T-2\delta t}) \cdots \\ & \times \int dx_\tau\, P(x_{\tau+\delta t}|x_\tau, \delta y_\tau)\,P(\delta y_\tau|x_\tau)\,P(x_\tau). \end{aligned} \tag{2.58}$$

Comparing this equation with Eq. (2.50), $P(\delta y_{\rm future}|x_\tau)$ can be expressed as

$$\begin{aligned} P(\delta y_{\rm future}|x_\tau) = {} & P(\delta y_\tau|x_\tau) \int dx_{\tau+\delta t}\, P(x_{\tau+\delta t}|x_\tau, \delta y_\tau) \cdots \\ & \times P(\delta y_{T-2\delta t}|x_{T-2\delta t}) \int dx_{T-\delta t}\, P(x_{T-\delta t}|x_{T-2\delta t}, \delta y_{T-2\delta t}) \\ & \times P(\delta y_{T-\delta t}|x_{T-\delta t}) \int dx_T\, P(x_T|x_{T-\delta t}, \delta y_{T-\delta t}). \end{aligned} \tag{2.59}$$

Defining the unnormalized retrodictive likelihood function at time $t$ as

$$g(x,t) \propto P(dy_{[t,T)}|x_t = x), \tag{2.60}$$

one can derive a linear backward stochastic differential equation for $g(x,t)$ by applying Itô calculus backward in time to Eq. (2.59). The result is [6]

$$-dg = dt\sum_\mu A_\mu \frac{\partial g}{\partial x_\mu} + \frac{dt}{2}\sum_{\mu,\nu}\left(BQB^T\right)_{\mu\nu}\frac{\partial^2 g}{\partial x_\mu \partial x_\nu} + C^T R^{-1}\,dy_t\,g + \sum_\mu \left(BSR^{-1}dy_t\right)_\mu \frac{\partial g}{\partial x_\mu}, \tag{2.61}$$

which is the adjoint equation of the forward DMZ equation (2.48), to be solved backward in time in the backward Itô sense, defined by

$$-dg \equiv g(x,t-dt) - g(x,t), \tag{2.62}$$

with the final condition

$$g(x,T) \propto 1. \tag{2.63}$$

The adjoint equation with respect to a linear differential equation

$$df(x,t) = \hat{L}f(x,t) \tag{2.64}$$

is defined as

$$-dg(x,t) = \hat{L}^\dagger g(x,t), \tag{2.65}$$

where $\hat{L}$ is a linear operator and $\hat{L}^\dagger$ is the adjoint of $\hat{L}$, defined by

$$\left\langle g(x), \hat{L}f(x)\right\rangle = \left\langle \hat{L}^\dagger g(x), f(x)\right\rangle \tag{2.66}$$

with respect to the inner product

$$\langle g(x), f(x)\rangle \equiv \int dx\, g(x)f(x). \tag{2.67}$$

After solving Eq. (2.48) for $f(x,\tau)$ and Eq. (2.61) for $g(x,\tau)$, the smoothing probability density is

$$h(x,\tau) \equiv P(x_\tau = x|dy_{[t_0,T)}) = \frac{g(x,\tau)f(x,\tau)}{\int dx\, g(x,\tau)f(x,\tau)}. \tag{2.68}$$

Since $f(x,\tau)$ and $g(x,\tau)$ are solutions of adjoint equations, their inner product, which appears as the denominator of Eq. (2.68), is constant in time [6]. The denominator also ensures that $h(x,\tau)$ is normalized, and $f(x,\tau)$ and $g(x,\tau)$ need not be normalized separately.

The estimation errors depend crucially on the statistics of $x_t$. If any component of $x_t$, say $x_{\mu t}$, is constant in time, then filtering of that particular component is as accurate as smoothing, for the simple reason that $P(x_{\mu\tau}|dy_{[t_0,T)})$ must be the same for any $\tau$, and one can simply estimate $x_{\mu\tau}$ at the end of the observation interval ($\tau = T$) using filtering alone.
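The forward iteration (2.41), the backward product (2.59), and the combination (2.68) can be illustrated on a finite state space, where the integrals become sums. The sketch below is a hedged example, not the construction used in the paper: it assumes uncorrelated noises ($S = 0$), a hypothetical discrete Markov chain with transition matrix `trans`, and Gaussian observation samples of variance `r` centered on a state-dependent mean. All names are illustrative.

```python
import math


def forward_backward(trans, means, r, ys, prior):
    """Discrete-state time-symmetric smoother.

    trans[i][j] = P(x_{t+1} = j | x_t = i); ys[t] is Gaussian with mean
    means[x_t] and variance r. Returns unnormalized f and g, the smoothing
    density h, and the inner products sum_i f[t][i] * g[t][i].
    """
    n, steps = len(prior), len(ys)
    # Observation likelihoods P(y_t | x_t = i), up to a constant factor.
    lik = [[math.exp(-0.5 * (y - m) ** 2 / r) for m in means] for y in ys]
    # Forward pass: f[t] is conditioned on observations before time t.
    f = [list(prior)]
    for t in range(steps):
        f.append([sum(f[t][i] * lik[t][i] * trans[i][j] for i in range(n))
                  for j in range(n)])
    # Backward pass: g[t] is the likelihood of observations from time t on,
    # with final condition g = 1.
    g = [[1.0] * n for _ in range(steps + 1)]
    for t in range(steps - 1, -1, -1):
        g[t] = [lik[t][i] * sum(trans[i][j] * g[t + 1][j] for j in range(n))
                for i in range(n)]
    norms = [sum(a * b for a, b in zip(f[t], g[t])) for t in range(steps + 1)]
    h = [[a * b / norms[t] for a, b in zip(f[t], g[t])]
         for t in range(steps + 1)]
    return f, g, h, norms
```

In this discrete setting the entries of `norms` agree at every time step up to rounding, mirroring the statement that the inner product of $f$ and $g$ is constant in time, and the smoothing density at the final time coincides with the normalized filtering density.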
The same argument also means that smoothing is not needed when one only needs to detect the presence of a signal in detection problems [3], since the presence can be regarded as a constant binary parameter within a certain time interval. In general, however, smoothing can be significantly more accurate than filtering for the estimation of a fluctuating random process in the middle of the observation interval. Another reason for modeling unknown signals as random processes is robustness, as introducing fictitious system noise can improve the estimation accuracy when there are modeling errors [1, 4].

E. Linear time-symmetric smoothing

If $f$, $g$, and $h$ are Gaussian, one can just solve for their means and covariance matrices, which completely determine the probability densities. This is the case when the a priori probability density $P(x_{t_0})$ is Gaussian, and

$$A(x_t,t) = J(t)\,x_t, \tag{2.69}$$

$$B(x_t,t) = B(t), \tag{2.70}$$

$$C(x_t,t) = K(t)\,x_t. \tag{2.71}$$

The means and covariance matrices of $f$, $g$, and $h$ can then be solved using the linear Mayne-Fraser-Potter (MFP) smoother [14]. The smoother first solves for the mean $x'$ and covariance matrix $\Sigma$ of $f$ using the Kalman filter [1], given by

$$dx' = Jx'\,dt + \Gamma\left(dy - Kx'\,dt\right), \tag{2.72}$$

$$\Gamma \equiv \left(\Sigma K^T + BS\right)R^{-1}, \tag{2.73}$$

$$d\Sigma = \left(J\Sigma + \Sigma J^T - \Gamma R\,\Gamma^T + BQB^T\right)dt, \tag{2.74}$$

with the initial conditions at $t_0$ determined from $P(x_{t_0})$. The mean $x''$ and covariance matrix $\Xi$ of $g$ are then solved using a backward Kalman filter,

$$-dx'' = -Jx''\,dt + \Upsilon\left(dy - Kx''\,dt\right), \tag{2.75}$$

$$\Upsilon \equiv \left(\Xi K^T + BS\right)R^{-1}, \tag{2.76}$$

$$-d\Xi = \left(-J\Xi - \Xi J^T - \Upsilon R\,\Upsilon^T + BQB^T\right)dt, \tag{2.77}$$

with the final conditions $\Xi_T^{-1}x''_T = 0$ and $\Xi_T^{-1} = 0$. In practice, the information filter formalism should be used to solve the backward filter, in order to avoid dealing with the infinite covariance matrix at $T$ [2, 14].
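The two Riccati equations above, together with the information-filter form recommended for the backward pass, can be sketched for a scalar model. This is a minimal illustration under stated assumptions: constant scalar coefficients, uncorrelated noises ($S = 0$), simple Euler steps, and the standard combination of the two estimates by adding inverse variances. The function name and parameters are hypothetical.

```python
import math


def mfp_variances(j, k, b, q, r, sigma0, t_final, dt):
    """Scalar forward and backward variances of the linear smoother.

    Forward:  dSigma = (2 j Sigma - Sigma^2 k^2 / r + b^2 q) dt, Eq. (2.74).
    Backward: Eq. (2.77) rewritten for lam = 1 / Xi, so that the final
    condition is simply lam(T) = 0.
    """
    n = int(round(t_final / dt))
    sigma = [sigma0]
    for _ in range(n):
        s = sigma[-1]
        sigma.append(s + (2.0 * j * s - (s * k) ** 2 / r + b * b * q) * dt)
    lam = [0.0] * (n + 1)
    for i in range(n, 0, -1):
        v = lam[i]
        # Step from time index i to i - 1 (backward in time).
        lam[i - 1] = v + (2.0 * j * v + k * k / r - b * b * q * v * v) * dt
    # Smoothing variance: inverse of the summed inverse variances.
    pi = [1.0 / (1.0 / s + v) for s, v in zip(sigma, lam)]
    return sigma, lam, pi
```

For `j = -1` and `k = b = q = r = 1`, both the filtering variance and the backward information settle at $\sqrt{2} - 1$, so the smoothing variance in the middle of a long interval approaches $1/(2\sqrt{2}) \approx 0.354$, below the filtering value of about $0.414$. At the final time the smoothing and filtering variances coincide.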
Finally, the smoothing mean $\tilde{x}_\tau$ and covariance matrix $\Pi_\tau$ are

$$\tilde{x}_\tau = \Pi_\tau\left(\Sigma_\tau^{-1}x'_\tau + \Xi_\tau^{-1}x''_\tau\right), \tag{2.78}$$

$$\Pi_\tau = \left(\Sigma_\tau^{-1} + \Xi_\tau^{-1}\right)^{-1}. \tag{2.79}$$

Note that $x''$ and $\Xi$ are the mean and covariance matrix of a likelihood function $P(dy_{[t,T)}|x_t)$ and not those of a conditional probability density $P(x_t|dy_{[t,T)})$, so to perform optimal retrodiction ($\tau = t_0$) one should still combine $x''$ and $\Xi$ with the a priori values [15].

III. HYBRID CLASSICAL-QUANTUM SMOOTHING

A. Problem statement

FIG. 3: (Color online). Schematic of the hybrid classical-quantum smoothing problem.

Consider the problem of waveform estimation in a hybrid classical-quantum system depicted in Fig. 3. The classical system produces a vectoral classical diffusive Markov random process $x_t$, which obeys Eq. (2.2) and is coupled to the quantum system. The goal is to estimate $x_\tau$ via continuous measurements of both systems. This setup is slightly more general than that considered in [5]; here the observations can also depend on $x_t$. This allows one to apply the theory to PLL design for squeezed beams, as considered by Berry and Wiseman [7], and potentially to other quantum estimation problems as well [16]. The statistics of $x_t$ are assumed to be unperturbed by the coupling to the quantum system, in order to avoid the nontrivial issue of quantum backaction on classical systems [17]. For simplicity, in this section we neglect the possibility that the system noise driving the classical system is correlated with the observation noise, although the noise driving the quantum system can still be correlated with the observation noise due to quantum measurement backaction. Just as in the classical smoothing problem, the hybrid smoothing problem is solved by calculating the smoothing probability density $P(x_\tau|dy_{[t_0,T)})$.

B. Time-symmetric approach

Because a quantum system is involved, one may be tempted to use a hybrid density operator [5, 7, 16, 17] to represent one's knowledge about the hybrid classical-quantum system. The hybrid density operator $\hat\rho(x_\tau)$ describes the joint classical and quantum statistics of a hybrid system, with the marginal classical probability density for $x_\tau$ and the marginal density operator for the quantum system given by

$$P(x_\tau) = \mathrm{tr}\left[\hat\rho(x_\tau)\right], \tag{3.1}$$

$$\hat\rho(\tau) = \int dx_\tau\, \hat\rho(x_\tau), \tag{3.2}$$

respectively. The hybrid operator can also be regarded as a special case of the quantum density operator, when certain degrees of freedom are approximated as classical. Unfortunately, the density operator in conventional predictive quantum theory can only be conditioned upon past observations and not future ones, so it cannot be used as a quantum version of the smoothing probability density.

The classical time-symmetric smoothing theory, as a combination of prediction and retrodiction, offers an important clue to how one can circumvent the difficulty of defining the smoothing quantum state. Again casting the problem in discrete time, and defining a hybrid effect operator as $\hat{E}(\delta y_{\rm future}|x_\tau)$, which can be used to determine the statistics of future observations given a density operator at $\tau$,

$$P(\delta y_{\rm future}) = \int dx_\tau\, \mathrm{tr}\left[\hat{E}(\delta y_{\rm future}|x_\tau)\,\hat\rho(x_\tau)\right], \tag{3.3}$$

one may write, in analogy with Eq. (2.34) [5],

$$P(x_\tau|\delta y_{[t_0,T-\delta t]}) = \frac{P(x_\tau, \delta y_{\rm future}|\delta y_{\rm past})}{P(\delta y_{\rm future}|\delta y_{\rm past})} = \frac{\mathrm{tr}\left[\hat{E}(\delta y_{\rm future}|x_\tau)\,\hat\rho(x_\tau|\delta y_{\rm past})\right]}{\int dx_\tau\, \mathrm{tr}\left[\hat{E}(\delta y_{\rm future}|x_\tau)\,\hat\rho(x_\tau|\delta y_{\rm past})\right]}, \tag{3.4}$$

where $\hat\rho(x_\tau|\delta y_{\rm past})$ is the analog of the filtering probability density $P(x_\tau|\delta y_{\rm past})$ and $\hat{E}(\delta y_{\rm future}|x_\tau)$ is the analog of the retrodictive likelihood function $P(\delta y_{\rm future}|x_\tau)$. One can then solve for the density and effect operators separately, before combining them to form the classical smoothing probability density.

C. Filtering

Since the hybrid density operator can be regarded as a special case of the density operator, the same tools in quantum measurement theory can be used to derive a filtering equation for the hybrid operator. First, write $\hat\rho(x_{t+\delta t}|\delta y_{[t_0,t]})$ in terms of $\hat\rho(x_t|\delta y_{[t_0,t]})$ as

$$\hat\rho(x_{t+\delta t}|\delta y_{[t_0,t]}) = \int dx_t\, \mathcal{K}(x_{t+\delta t}|x_t)\,\hat\rho(x_t|\delta y_{[t_0,t]}), \tag{3.5}$$

where $\mathcal{K}$ is a completely positive map that governs the Markovian evolution of the hybrid state independent of the measurement process. Equation (3.5) may be regarded as a quantum version of the classical Chapman-Kolmogorov equation. For infinitesimal $\delta t$,

$$\int dx_t\, \mathcal{K}(x_{t+\delta t}|x_t)\,\hat\rho(x_t) \approx \left[\left(\hat{1} + \delta t\,\mathcal{L}\right)\hat\rho(x_t = x)\right]_{x = x_{t+\delta t}}. \tag{3.6}$$

The hybrid superoperator $\mathcal{L}$ can be expressed as

$$\mathcal{L}\hat\rho(x) = \mathcal{L}_0\hat\rho(x) + \mathcal{L}_I(x)\hat\rho(x) - \sum_\mu \frac{\partial}{\partial x_\mu}\left[A_\mu\hat\rho(x)\right] + \frac{1}{2}\sum_{\mu,\nu}\frac{\partial^2}{\partial x_\mu \partial x_\nu}\left[\left(BQB^T\right)_{\mu\nu}\hat\rho(x)\right], \tag{3.7}$$

where $\mathcal{L}_0$ governs the evolution of the quantum system, $\mathcal{L}_I$ governs the coupling of $x_t$ to the quantum system, via an interaction Hamiltonian for example, and the last two terms govern the classical evolution of $x_t$.

Next, write $\hat\rho(x_t|\delta y_{[t_0,t]})$ in terms of $\hat\rho(x_t|\delta y_{[t_0,t-\delta t]})$ using the quantum Bayes theorem [18] as

$$\hat\rho(x_t|\delta y_{[t_0,t]}) = \hat\rho(x_t|\delta y_{[t_0,t-\delta t]}, \delta y_t) = \frac{\mathcal{J}(\delta y_t|x_t)\,\hat\rho(x_t|\delta y_{[t_0,t-\delta t]})}{\int dx_t\, \mathrm{tr}(\text{numerator})}. \tag{3.8}$$

The measurement superoperator $\mathcal{J}(\delta y_t|x_t)$, a quantum version of $P(\delta y_t|x_t)$, is defined as

$$\mathcal{J}(\delta y_t|x_t)\,\hat\rho(x_t) \equiv \hat{M}(\delta y_t|x_t)\,\hat\rho(x_t)\,\hat{M}^\dagger(\delta y_t|x_t). \tag{3.9}$$

For infinitesimal $\delta t$ and measurements with Gaussian noise, the measurement operator $\hat{M}$ can be approximated as [19]

$$\hat{M}(\delta z_t|x_t) \propto \hat{1} + \sum_\mu \gamma_\mu(t)\left[\frac{1}{2}\hat{c}_\mu(x_t,t)\,\delta z_{\mu t} - \frac{\delta t}{8}\hat{c}_\mu^\dagger(x_t,t)\,\hat{c}_\mu(x_t,t)\right], \tag{3.10}$$

where $\delta z_t$ is a vectoral observation process, $\hat{c}(x_t,t)$ is a vector of hybrid operators, generalized from the purely quantum $\hat{c}$ operators in Ref. [5] so that the observations may also depend directly on the classical degrees of freedom, and $\gamma_\mu(t)$ is assumed to be positive. To cast the theory in a form similar to the classical one, perform unitary transformations on $\delta z_t$ and $\hat{c}$,

$$\delta y_t = U\,\delta z_t, \tag{3.11}$$

$$\hat{C}(x_t,t) = U\,\hat{c}(x_t,t), \tag{3.12}$$

where $U$ is a unitary matrix, and rewrite the measurement operator as

$$\hat{M}(\delta y_t|x_t) \propto \hat{1} + \frac{1}{2}\hat{C}^T(x_t,t)R^{-1}(t)\,\delta y_t - \frac{\delta t}{8}\hat{C}^{\dagger T}(x_t,t)R^{-1}(t)\,\hat{C}(x_t,t). \tag{3.13}$$

$\hat{C}(x_t,t)$ is a generalization of $C(x_t,t)$ in the classical case, and $R(t)$ is again a positive-definite matrix that characterizes the observation uncertainties and is real and symmetric with eigenvalues $1/\gamma_\mu$. Note that $\dagger$ is defined as the adjoint of each vector element, and $T$ is defined as the matrix transpose of the vector. For example,

$$\hat{C}^{\dagger T}R^{-1}\hat{C} \equiv \sum_{\mu,\nu}\hat{C}_\mu^\dagger\left(R^{-1}\right)_{\mu\nu}\hat{C}_\nu. \tag{3.14}$$

The evolution of $\hat\rho(x_t|\delta y_{[t_0,t-\delta t]})$ can thus be calculated by iterating the formula

$$\hat\rho(x_{t+\delta t}|\delta y_{[t_0,t]}) = \int dx_t\, \mathcal{K}(x_{t+\delta t}|x_t)\,\frac{\mathcal{J}(\delta y_t|x_t)\,\hat\rho(x_t|\delta y_{[t_0,t-\delta t]})}{\int dx_t\, \mathrm{tr}(\text{numerator})}. \tag{3.15}$$

Taking the continuous time limit via Itô calculus and defining the conditional hybrid density operator at time $t$ as

$$\hat{F}(x,t) \equiv \hat\rho(x_t = x|dy_{[t_0,t)}), \tag{3.16}$$

one obtains [5]

$$d\hat{F} = dt\,\mathcal{L}\hat{F} + \frac{dt}{8}\left(2\hat{C}^T R^{-1}\hat{F}\hat{C}^\dagger - \hat{C}^{\dagger T}R^{-1}\hat{C}\hat{F} - \hat{F}\hat{C}^{\dagger T}R^{-1}\hat{C}\right) + \frac{1}{2}\left[\left(\hat{C} - \langle\hat{C}\rangle_{\hat{F}}\right)^T R^{-1}\,d\eta_t\,\hat{F} + \mathrm{H.c.}\right], \tag{3.17}$$

where

$$\langle\hat{C}\rangle_{\hat{F}} \equiv \int dx\, \mathrm{tr}\left[\hat{C}(x,t)\hat{F}(x,t)\right], \tag{3.18}$$

$$d\eta_t \equiv dy_t - \frac{dt}{2}\left\langle\hat{C} + \hat{C}^\dagger\right\rangle_{\hat{F}} \tag{3.19}$$

is a Wiener increment with covariance matrix $R(t)\,dt$ [19], H.c. denotes the Hermitian conjugate, and the initial condition is the a priori hybrid density operator $\hat\rho(x_{t_0})$. Equation (3.17) is a quantum version of the KS equation (2.43) and can be regarded as a special case of the Belavkin quantum filtering equation [20].

A linear version of the KS equation for an unnormalized $\hat{F}(x,t)$ is

$$d\hat{f} = dt\,\mathcal{L}\hat{f} + \frac{dt}{8}\left(2\hat{C}^T R^{-1}\hat{f}\hat{C}^\dagger - \hat{C}^{\dagger T}R^{-1}\hat{C}\hat{f} - \hat{f}\hat{C}^{\dagger T}R^{-1}\hat{C}\right) + \frac{1}{2}\left(\hat{C}^T R^{-1}\,dy_t\,\hat{f} + \mathrm{H.c.}\right), \tag{3.20}$$

and the normalization is

$$\hat{F}(x,t) = \frac{\hat{f}(x,t)}{\int dx\, \mathrm{tr}\left[\hat{f}(x,t)\right]}. \tag{3.21}$$

Equation (3.20) is a quantum generalization of the DMZ equation (2.48).

D. Retrodiction and smoothing

Taking a similar approach to the one in Sec. II D and using the quantum regression theorem, one can express the future observation statistics as [21]

$$P(\delta y_{\rm future}) = \int dx_\tau\, \mathrm{tr}\left[\hat{E}(\delta y_{\rm future}|x_\tau)\,\hat\rho(x_\tau)\right] \tag{3.22}$$

$$\begin{aligned} = \int dx_T\, \mathrm{tr}\Big[ & \int dx_{T-\delta t}\, \mathcal{K}(x_T|x_{T-\delta t})\,\mathcal{J}(\delta y_{T-\delta t}|x_{T-\delta t}) \\ & \cdot \int dx_{T-2\delta t}\, \mathcal{K}(x_{T-\delta t}|x_{T-2\delta t})\,\mathcal{J}(\delta y_{T-2\delta t}|x_{T-2\delta t}) \cdots \\ & \cdot \int dx_\tau\, \mathcal{K}(x_{\tau+\delta t}|x_\tau)\,\mathcal{J}(\delta y_\tau|x_\tau)\,\hat\rho(x_\tau)\Big], \end{aligned} \tag{3.23}$$

which are analogous to Eq. (2.50) and Eq. (2.58), respectively. Comparing Eq. (3.22) with Eq. (3.23), and defining the adjoint of a superoperator $\mathcal{O}$ as $\mathcal{O}^*$, such that

$$\mathrm{tr}\left[\hat{E}(x)\,\mathcal{O}\hat\rho(x)\right] = \mathrm{tr}\left\{\left[\mathcal{O}^*\hat{E}(x)\right]\hat\rho(x)\right\}, \tag{3.24}$$

the hybrid effect operator can be written as

$$\begin{aligned} \hat{E}(\delta y_{\rm future}|x_\tau) = {} & \mathcal{J}^*(\delta y_\tau|x_\tau)\int dx_{\tau+\delta t}\, \mathcal{K}^*(x_{\tau+\delta t}|x_\tau) \cdots \\ & \cdot \mathcal{J}^*(\delta y_{T-2\delta t}|x_{T-2\delta t})\int dx_{T-\delta t}\, \mathcal{K}^*(x_{T-\delta t}|x_{T-2\delta t}) \\ & \cdot \mathcal{J}^*(\delta y_{T-\delta t}|x_{T-\delta t})\int dx_T\, \mathcal{K}^*(x_T|x_{T-\delta t})\,\hat{1}. \end{aligned} \tag{3.25}$$

The operation $\mathcal{K}^* \equiv \int dx'\, \mathcal{K}^*(x'|x)\,\cdot$ may also be regarded as a hybrid superoperator on a hybrid operator, and is the adjoint of $\mathcal{K} \equiv \int dx'\, \mathcal{K}(x|x')\,\cdot$, defined by

$$\left\langle\hat{E}(x), \mathcal{K}\hat\rho(x)\right\rangle = \left\langle\mathcal{K}^*\hat{E}(x), \hat\rho(x)\right\rangle, \tag{3.26}$$

with respect to the Hilbert-Schmidt inner product

$$\left\langle\hat{E}(x), \hat\rho(x)\right\rangle \equiv \int dx\, \mathrm{tr}\left[\hat{E}(x)\,\hat\rho(x)\right]. \tag{3.27}$$

One can then rewrite Eqs. (3.22), (3.23), and (3.25) more elegantly as

$$\left\langle\hat{E}(x), \hat\rho(x)\right\rangle = \left\langle\hat{1}, \mathcal{K}\mathcal{J}\cdots\mathcal{K}\mathcal{J}\hat\rho(x)\right\rangle, \tag{3.28}$$

$$\hat{E}(x) = \mathcal{J}^*\mathcal{K}^*\cdots\mathcal{J}^*\mathcal{K}^*\,\hat{1}. \tag{3.29}$$

In the continuous time limit, a linear stochastic differential equation for the unnormalized effect operator $\hat{g}(x,t) \propto \hat{E}(dy_{[t,T)}|x_t = x)$ can be derived. The result is [5]

$$-d\hat{g} = dt\,\mathcal{L}^*\hat{g} + \frac{dt}{8}\left(2\hat{C}^{\dagger T}R^{-1}\hat{g}\hat{C} - \hat{g}\hat{C}^{\dagger T}R^{-1}\hat{C} - \hat{C}^{\dagger T}R^{-1}\hat{C}\hat{g}\right) + \frac{1}{2}\left(\hat{g}\hat{C}^T R^{-1}\,dy_t + \mathrm{H.c.}\right), \tag{3.30}$$

to be solved backward in time in the backward Itô sense, with the final condition

$$\hat{g}(x,T) \propto \hat{1}. \tag{3.31}$$

Equation (3.30) is the adjoint equation of the forward quantum DMZ equation (3.20) with respect to the inner product defined by Eq. (3.27). It is a generalization of the classical backward DMZ equation (2.61).

Finally, after solving Eq. (3.20) for $\hat{f}(x,\tau)$ and Eq. (3.30) for $\hat{g}(x,\tau)$, the smoothing probability density is

$$h(x,\tau) \equiv P(x_\tau = x|dy_{[t_0,T)}) = \frac{\mathrm{tr}\left[\hat{g}(x,\tau)\hat{f}(x,\tau)\right]}{\int dx\, \mathrm{tr}\left[\hat{g}(x,\tau)\hat{f}(x,\tau)\right]}. \tag{3.32}$$

The denominator of Eq. (3.32) ensures that $h(x,\tau)$ is normalized, so $\hat{f}(x,\tau)$ and $\hat{g}(x,\tau)$ need not be normalized separately. Table I lists some important quantities in classical smoothing with their generalizations in hybrid smoothing for comparison.

E. Smoothing in terms of Wigner distributions

To solve Eqs. (3.20), (3.30), and (3.32), one way is to convert them to equations for quasiprobability distributions [22]. The Wigner distribution is especially useful for quantum systems with continuous degrees of freedom. It is defined as [22, 23]

$$f(q,p) \equiv \frac{1}{2\pi}\int du \left\langle q - \frac{u}{2}\right|\hat{f}\left|q + \frac{u}{2}\right\rangle \exp\left(ip^T u\right), \tag{3.33}$$

where $q$ and $p$ are normalized position and momentum vectors. It has the desirable property

$$\int dq\,dp\, g(q,p)f(q,p) = \frac{1}{2\pi}\mathrm{tr}\left(\hat{g}\hat{f}\right), \tag{3.34}$$

which is unique among generalized quasiprobability distributions [23]. The smoothing probability density given by Eq. (3.32) can then be rewritten as

$$h(x,\tau) = \frac{\int dq\,dp\, g(q,p,x,\tau)f(q,p,x,\tau)}{\int dq\,dp\,dx\, g(q,p,x,\tau)f(q,p,x,\tau)}, \tag{3.35}$$

where $f(q,p,x,\tau)$ and $g(q,p,x,\tau)$ are the Wigner distributions of $\hat{f}$ and $\hat{g}$, respectively. Equation (3.35) resembles the classical expression (2.68) with the quantum degrees of freedom $q$ and $p$ marginalized. If $f(q,p,x,t)$ is nonnegative and the stochastic equations for $f(q,p,x,t)$ and $g(q,p,x,t)$ converted from Eqs. (3.20) and (3.30) have the same form as the classical DMZ equations given by Eqs. (2.48) and (2.61), the hybrid smoothing problem becomes equivalent to a classical one and can be solved using well known classical smoothers. For example, if $f(q,p,x,t)$ and $g(q,p,x,t)$ are Gaussian, $h(x,\tau)$ is also Gaussian, and their means and covariances can be solved using the linear MFP smoother described in Sec. II E.
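The overlap property of the Wigner distribution can be checked numerically for a single mode. The sketch below is an illustration under stated assumptions: it takes the $1/(2\pi)$ single-mode normalization with $[\hat q, \hat p] = i$, uses the standard Gaussian Wigner function of a thermal state with mean photon number `nbar` (the vacuum for `nbar = 0`), and integrates on a plain rectangular grid. The function names and grid parameters are hypothetical.

```python
import math


def thermal_wigner(q, p, nbar=0.0):
    """Wigner function of a single-mode thermal state with mean photon
    number nbar, normalized so that its phase-space integral is one."""
    w = 2.0 * nbar + 1.0
    return math.exp(-(q * q + p * p) / w) / (math.pi * w)


def phase_space_overlap(f, g, half_width=8.0, step=0.1):
    """Riemann-sum approximation of the integral of f(q, p) * g(q, p)
    over the square [-half_width, half_width]^2."""
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for a in range(n + 1):
        q = -half_width + a * step
        for c in range(n + 1):
            p = -half_width + c * step
            total += f(q, p) * g(q, p)
    return total * step * step


def vacuum(q, p):
    return thermal_wigner(q, p, 0.0)


def thermal_one(q, p):
    return thermal_wigner(q, p, 1.0)
```

For the vacuum, which is pure, the overlap of the Wigner function with itself should equal $1/(2\pi)$. For the vacuum and a thermal state with `nbar = 1`, the trace of the product of the two density operators is the vacuum population $1/2$, so the overlap should equal $1/(4\pi)$.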
| Classical | Description | Hybrid | Description |
|---|---|---|---|
| $P(x_{t+\delta t} \mid x_t, \delta y_t)$ | transition probability density, appears in Chapman-Kolmogorov equation (2.36) | $\mathcal{K}(x_{t+\delta t} \mid x_t)$ | transition superoperator, appears in quantum Chapman-Kolmogorov equation (3.5) |
| $P(\delta y_t \mid x_t)$ | observation probability density, appears in Bayes theorem (2.39) | $\mathcal{J}(\delta y_t \mid x_t)$ | measurement superoperator, appears in quantum Bayes theorem (3.8) |
| $P(x_t \mid dy_{[t_0,t)})$, $F(x,t)$ | filtering probability density, obeys Kushner-Stratonovich equation (2.43) | $\hat\rho(x_t \mid dy_{[t_0,t)})$, $\hat{F}(x,t)$ | filtering hybrid density operator, obeys Belavkin equation (3.17) |
| $f(x,t)$ | unnormalized $F(x,t)$, obeys Duncan-Mortensen-Zakai (DMZ) equation (2.48) | $\hat{f}(x,t)$ | unnormalized $\hat{F}(x,t)$, obeys quantum DMZ equation (3.20) |
| $P(dy_{[t,T)} \mid x_t)$ | retrodictive likelihood function | $\hat{E}(dy_{[t,T)} \mid x_t)$ | hybrid effect operator |
| $g(x,t)$ | unnormalized $P(dy_{[t,T)} \mid x_t)$, obeys backward DMZ equation (2.61) | $\hat{g}(x,t)$ | unnormalized $\hat{E}(dy_{[t,T)} \mid x_t)$, obeys backward quantum DMZ equation (3.30) |
| $P(x_\tau \mid dy_{[t_0,T)})$, $h(x,\tau)$ | smoothing probability density, obeys Eq. (2.68) | $P(x_\tau \mid dy_{[t_0,T)})$, $h(x,\tau)$ | smoothing probability density, obeys Eq. (3.32) |

TABLE I: Important quantities in classical smoothing and their generalizations in hybrid smoothing.

IV. PHASE-LOCKED LOOP DESIGN FOR NARROWBAND SQUEEZED BEAMS

Consider the PLL setup depicted in Fig. 4. The optical parametric oscillator (OPO) produces a squeezed vacuum with a squeezed $p$ quadrature and an antisqueezed $q$ quadrature. The squeezed vacuum is then displaced by a real constant $b$ to produce a phase-squeezed beam, the phase of which is modulated by $\phi_t = x_t$, an element of the vectoral random process $x_t$ described by the system Itō equation (2.2). The output beam is measured continuously by a homodyne PLL, and the local-oscillator phase $\phi'_t$ is continuously updated according to the real-time measurement record.

FIG. 4: (Color online). Homodyne phase-locked loop (PLL) for phase estimation with a narrowband squeezed optical beam produced from an optical parametric oscillator (OPO).

The use of PLL for phase estimation in the presence of quantum noise has been mentioned as far back as 1971 by Personick [24]. Wiseman suggested an adaptive homodyne scheme to measure a constant phase [25], which was then experimentally demonstrated by Armen et al. for the optical coherent state [26]. Berry and Wiseman [27] and Pope et al. [28] studied the problem with $\phi_t$ being a Wiener process. Berry and Wiseman later generalized the theory to account for narrowband squeezed beams [7]. Tsang et al. also studied the problem for the case of $x_t$ being a Gaussian process [29, 30], but the squeezing model considered in Refs. [29, 30] is not realistic. Using the hybrid smoothing theory developed in Sec. III, one can now generalize these earlier results to the case of an arbitrary diffusive Markov process and a realistic squeezing model.

Let $\hat\rho(x_t)$ be the hybrid density operator for the combined quantum-OPO-classical-modulator system. The evolution of the OPO below threshold in the interaction picture is governed by

$$\mathcal{L}\hat\rho(x) = -\frac{i}{\hbar}\left[\hat H, \hat\rho(x)\right], \tag{4.1}$$

$$\hat H = -\frac{i\hbar\chi}{2}\left(\hat a\hat a - \hat a^\dagger\hat a^\dagger\right) \tag{4.2}$$

$$= \frac{\hbar\chi}{2}\left(\hat q\hat p + \hat p\hat q\right), \tag{4.3}$$

where $\hat a$ is the annihilation operator for the cavity optical mode, and $\hat q$ and $\hat p$ are the antisqueezed and squeezed quadrature operators, respectively, defined as

$$\hat q \equiv \frac{\hat a + \hat a^\dagger}{\sqrt{2}}, \tag{4.4}$$

$$\hat p \equiv \frac{\hat a - \hat a^\dagger}{\sqrt{2}\,i}, \tag{4.5}$$

with the commutation relation

$$[\hat q, \hat p] = i. \tag{4.6}$$

The classical phase modulator does not influence the evolution of the OPO, so

$$\mathcal{L}_I = 0, \tag{4.7}$$

but it modulates the OPO output. $\hat C(x_t, t)$ in this case is

$$\hat C(x_t, t) = -i\left(b + \sqrt{\gamma}\,\hat a\right)\exp\left(i\phi_t - i\phi'_t\right), \tag{4.8}$$

where $\gamma$ is the transmission coefficient of the partially reflecting OPO output mirror, $R = 1$, and the symbol and sign conventions here roughly follow those of Refs. [29, 30]. To ensure the correct unconditional quantum dynamics, the Hamiltonian should be changed to (Ref. [18], Sec. 11.4.3)

$$\hat H' = \hat H - \frac{i\hbar}{2}\, b\sqrt{\gamma}\left(\hat a - \hat a^\dagger\right), \qquad \mathcal{L}\hat\rho(x) = -\frac{i}{\hbar}\left[\hat H', \hat\rho(x)\right], \tag{4.9}$$

in order to eliminate the spurious effect of the displacement term in $\hat C$ on the OPO. After some algebra, the forward stochastic equation for the Wigner distribution $f(q, p, x, t)$ becomes

$$df = dt\left\{ -\sum_\mu \frac{\partial}{\partial x_\mu}\left(A_\mu f\right) + \frac{1}{2}\sum_{\mu,\nu}\frac{\partial^2}{\partial x_\mu \partial x_\nu}\left[\left(BQB^T\right)_{\mu\nu} f\right] - \left[\left(\chi - \frac{\gamma}{2}\right)\frac{\partial}{\partial q}(qf) + \left(-\chi - \frac{\gamma}{2}\right)\frac{\partial}{\partial p}(pf)\right] + \frac{\gamma}{4}\left(\frac{\partial^2 f}{\partial q^2} + \frac{\partial^2 f}{\partial p^2}\right)\right\} + dy_t\left[\sin(\phi - \phi'_t)\left(2b + \sqrt{2\gamma}\,q + \sqrt{\frac{\gamma}{2}}\frac{\partial}{\partial q}\right) + \cos(\phi - \phi'_t)\left(\sqrt{2\gamma}\,p + \sqrt{\frac{\gamma}{2}}\frac{\partial}{\partial p}\right)\right] f. \tag{4.10}$$

This is precisely the classical DMZ equation (2.48) with correlated system and observation noises. The equivalent classical system equations are then

$$dq_t = \left(\chi - \frac{\gamma}{2}\right) q_t\,dt + \sqrt{\frac{\gamma}{2}}\,d\alpha_t, \qquad dp_t = \left(-\chi - \frac{\gamma}{2}\right) p_t\,dt + \sqrt{\frac{\gamma}{2}}\,d\beta_t, \qquad dx_t = A(x_t, t)\,dt + B(x_t, t)\,dW_t, \tag{4.11}$$

and the equivalent observation equation is

$$dy_t = 2b\sin(\phi_t - \phi'_t)\,dt + d\zeta_t, \qquad d\zeta_t \equiv \sin(\phi_t - \phi'_t)\left(\sqrt{2\gamma}\,q_t\,dt - d\alpha_t\right) + \cos(\phi_t - \phi'_t)\left(\sqrt{2\gamma}\,p_t\,dt - d\beta_t\right), \tag{4.12}$$

where $d\alpha_t$ and $d\beta_t$ are independent Wiener increments with covariance $dt$.
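To make the equivalent classical model concrete, the following is a minimal simulation sketch of Eqs. (4.11) and (4.12) using the Euler-Maruyama method. It is an illustration under stated assumptions rather than part of the original analysis. The numerical prefactors $\gamma/2$, $\sqrt{\gamma/2}$, and $\sqrt{2\gamma}$ are assumed values, chosen to be consistent with the quadrature definitions (4.4)-(4.6). The phase $\phi_t$ is taken to be a scalar Wiener process with a hypothetical diffusion constant `kappa`. The local-oscillator phase is driven by a hypothetical first-order loop with gain `loop_gain`, not by the conditional expectation obtained from filtering. All function and parameter names are illustrative.

```python
import math
import random


def pll_step(q, p, phi, phi_lo, dt, chi, gamma, b, kappa,
             d_alpha, d_beta, d_w):
    """Advance the model of Eqs. (4.11)-(4.12) by one Euler-Maruyama step.

    Returns the updated (q, p, phi) and the observation increment dy.
    """
    theta = phi - phi_lo
    coupling = math.sqrt(2.0 * gamma)  # assumed prefactor of q dt, p dt in (4.12)
    # Observation noise shares the vacuum increments that drive the cavity,
    # which is what correlates the system and observation noises.
    d_zeta = (math.sin(theta) * (coupling * q * dt - d_alpha)
              + math.cos(theta) * (coupling * p * dt - d_beta))
    dy = 2.0 * b * math.sin(theta) * dt + d_zeta
    # System update: q is antisqueezed (rate +chi), p is squeezed (rate -chi).
    noise_gain = math.sqrt(gamma / 2.0)  # assumed prefactor in (4.11)
    q_new = q + (chi - gamma / 2.0) * q * dt + noise_gain * d_alpha
    p_new = p + (-chi - gamma / 2.0) * p * dt + noise_gain * d_beta
    # Assumption: the signal phase is a Wiener process with diffusion kappa.
    phi_new = phi + math.sqrt(kappa) * d_w
    return q_new, p_new, phi_new, dy


def simulate(n_steps, dt, chi, gamma, b, kappa, loop_gain, seed=0):
    """Run the loop and return the phase errors phi_t - phi'_t at each step."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)  # standard deviation of a Wiener increment
    q = p = phi = phi_lo = 0.0
    errors = []
    for _ in range(n_steps):
        d_alpha = rng.gauss(0.0, sd)
        d_beta = rng.gauss(0.0, sd)
        d_w = rng.gauss(0.0, sd)
        q, p, phi, dy = pll_step(q, p, phi, phi_lo, dt, chi, gamma, b,
                                 kappa, d_alpha, d_beta, d_w)
        # Hypothetical first-order loop filter (not the optimal control law).
        phi_lo += loop_gain * dy / (2.0 * b)
        errors.append(phi - phi_lo)
    return errors
```

For example, `simulate(1000, 1e-3, chi=0.2, gamma=1.0, b=10.0, kappa=1.0, loop_gain=20.0)` returns the sequence of phase errors. Choosing $\chi < \gamma/2$ keeps the drift of $q_t$ negative, in line with operation below threshold.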
The increments $d\alpha_t$ and $d\beta_t$, which appear in both the system equation and the observation equation, are simply quadratures of the vacuum field, coupled to both the cavity mode and the output field via the OPO output mirror. Equations (4.11) and (4.12) coincide with the model of Berry and Wiseman in Ref. [7] when $x_t$ is a Wiener process, and Eq. (4.10) is the continuous limit of their approach to phase estimation. This approach can also be regarded as an example of the general method of accounting for colored observation noise by modeling the noise as part of the system [2, 3, 4].

If $\chi = 0$, $d\zeta_t/dt$ is an additive white Gaussian noise, and the model is reduced to that studied in Refs. [27, 28, 29, 30]. In that case, it is desirable to make $\phi'_t$ follow $\phi_t$ as closely as possible, so that $dy_t$ can be approximated as

$$dy_t \approx 2b\left(\phi_t - \phi'_t\right)dt + d\zeta_t, \tag{4.13}$$

and the Kalman filter can be used if $x_t$ is Gaussian [30]. Provided that Eq. (4.13) is valid, one should make $\phi'_t$ the conditional expectation of $\phi_t = x_t$, given by

$$\phi'_t = \langle\phi_t\rangle_{\hat f} = \int dq\,dp\,dx\; x\, f(q, p, x, t). \tag{4.14}$$

For phase-squeezed beams, it also seems desirable to make $\phi'_t$ close to $\phi_t$ in order to minimize the magnitude of $d\zeta_t$. Equation (4.14) may not provide the optimal $\phi'_t$ in general, however, as it does not necessarily minimize the magnitude of $d\zeta_t$ or the estimation errors. The optimal control law for $\phi'_t$ should be studied in the context of control theory.

While $\phi'_t$ needs to be updated in real time and must be calculated via filtering, the estimation accuracy can be improved by smoothing. The backward DMZ equation for $g(q, p, x, t)$ is the adjoint equation with respect to Eq. (4.10), given by

$$-dg = dt\left\{ \sum_\mu A_\mu \frac{\partial g}{\partial x_\mu} + \frac{1}{2}\sum_{\mu,\nu}\left(BQB^T\right)_{\mu\nu}\frac{\partial^2 g}{\partial x_\mu \partial x_\nu} + \left[\left(\chi - \frac{\gamma}{2}\right) q\frac{\partial g}{\partial q} + \left(-\chi - \frac{\gamma}{2}\right) p\frac{\partial g}{\partial p}\right] + \frac{\gamma}{4}\left(\frac{\partial^2 g}{\partial q^2} + \frac{\partial^2 g}{\partial p^2}\right)\right\} + dy_t\left[\sin(\phi - \phi'_t)\left(2b + \sqrt{2\gamma}\,q - \sqrt{\frac{\gamma}{2}}\frac{\partial}{\partial q}\right) + \cos(\phi - \phi'_t)\left(\sqrt{2\gamma}\,p - \sqrt{\frac{\gamma}{2}}\frac{\partial}{\partial p}\right)\right] g, \tag{4.15}$$

and the smoothing probability density $h(x, \tau)$ is given by Eq. (3.35). The use of linear smoothing for the case of $x_t$ being a Gaussian process and $d\zeta_t/dt$ being a white Gaussian noise has been studied in Refs. [29, 30]. Practical strategies of solving Eqs. (4.10) and (4.15) in general are beyond the scope of this paper, but classical nonlinear filtering and smoothing techniques should help [1, 2, 3, 4].

One can also use the hybrid smoothing theory to study the general problem of force estimation via a squeezed probe beam and a homodyne PLL, by modeling the phase modulator as a quantum mechanical oscillator instead and combining the problem studied in this section with the force estimation problem studied in Ref. [5].

V. WEAK VALUES AS QUANTUM SMOOTHING ESTIMATES

FIG. 5: (Color online). The quantum smoothing problem.

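Before turning to quantum degrees of freedom, it may help to see numerically how the estimates compared in this section relate to one another. The sketch below evaluates the trace formula that defines the weak value in Eq. (5.5), together with the classical smoothing average of Eq. (5.8), for a two-level system. The observable, the matrices standing in for $\hat f(\tau)$ and $\hat g(\tau)$, and all function names are hypothetical choices made only for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def weak_value(g, obs, f):
    """Eq. (5.5): tr[g O f] / tr[g f], for unnormalized f and g."""
    return trace(matmul(matmul(g, obs), f)) / trace(matmul(g, f))


def classical_smoothing_mean(values, g_diag, f_diag):
    """Eq. (5.8): sum_O O g(O) f(O) / sum_O g(O) f(O)."""
    num = sum(o * g * f for o, g, f in zip(values, g_diag, f_diag))
    den = sum(g * f for g, f in zip(g_diag, f_diag))
    return num / den


# Hypothetical observable with eigenvalues +1 and -1.
O = [[1, 0], [0, -1]]
identity = [[1, 0], [0, 1]]

# f and g diagonal in the eigenbasis of O: the weak value reduces to Eq. (5.8).
f = [[0.7, 0], [0, 0.3]]
g = [[0.2, 0], [0, 0.8]]
smoothed = weak_value(g, O, f)
classical = classical_smoothing_mean([1, -1], [0.2, 0.8], [0.7, 0.3])

# Neglecting future observations (g = identity) gives the prediction, Eq. (5.2);
# neglecting past observations (f = identity) gives the retrodiction, Eq. (5.4).
predicted = weak_value(identity, O, f)      # approximately 0.4
retrodicted = weak_value(g, O, identity)    # approximately -0.6

# Non-diagonal (pure-state) f and g, written as unnormalized projectors.
f_pure = [[1, 1], [1, 1]]
g_pure = [[1, -0.5], [-0.5, 0.25]]
anomalous = weak_value(g_pure, O, f_pure)   # lies outside the range [-1, 1]
```

In the diagonal case `smoothed` and `classical` agree, which is the consistency with classical smoothing noted in the text. In the non-diagonal case the same formula returns a value outside the eigenvalue range of the observable, a known feature of weak values [8].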
Previous sections focus on the estimation of classical signals, but there is no reason why one cannot apply smoothing to quantum degrees of freedom as well, as shown in Fig. 5. First consider the predicted density operator at time $\tau$ conditioned upon past observations, given by

$$\hat\rho(\tau) \equiv \frac{\hat f(\tau)}{\operatorname{tr}[\hat f(\tau)]}, \tag{5.1}$$

where the classical degrees of freedom are neglected for simplicity. The predicted expectation of an observable, such as the position of a quantum mechanical oscillator, is

$$\langle\hat O\rangle_{\hat f} \equiv \operatorname{tr}\left[\hat O\hat\rho(\tau)\right] = \frac{\operatorname{tr}[\hat O\hat f(\tau)]}{\operatorname{tr}[\hat f(\tau)]}. \tag{5.2}$$

One may also use retrodiction, after some measurements of a quantum system have been made, to estimate its initial quantum state before the measurements [31, 32], using the retrodictive density operator defined as

$$\hat\rho_{\rm ret}(\tau) \equiv \frac{\hat g(\tau)}{\operatorname{tr}[\hat g(\tau)]}. \tag{5.3}$$

The retrodicted expectation of an observable is

$${}_{\hat g}\langle\hat O\rangle \equiv \operatorname{tr}\left[\hat\rho_{\rm ret}(\tau)\hat O\right] = \frac{\operatorname{tr}[\hat g(\tau)\hat O]}{\operatorname{tr}[\hat g(\tau)]}. \tag{5.4}$$

Causality prevents one from going back in time to verify the retrodicted expectation, but if the degree of freedom with respect to $\hat O$ at time $\tau$ is entangled with another "probe" system, then one can verify the retrodicted expectation by measuring the probe and inferring $\hat O$ [32].

The idea of verifying retrodiction by entangling the system at time $\tau$ with a probe can also be extended to the case of smoothing, as proposed by Aharonov et al. [8]. In the middle of a sequence of measurements, if one weakly couples the system to a probe for a short time, so that the system is weakly entangled with the probe, and the probe is subsequently measured, the measurement outcome on average can be characterized by the so-called weak value of an observable, defined as [8, 33]

$${}_{\hat g}\langle\hat O\rangle_{\hat f} \equiv \frac{\operatorname{tr}[\hat g(\tau)\hat O\hat f(\tau)]}{\operatorname{tr}[\hat g(\tau)\hat f(\tau)]}. \tag{5.5}$$

The weak value becomes a prediction given by Eq. (5.2) when future observations are neglected, such that $\hat g(\tau) = \hat 1$, and becomes a retrodiction given by Eq. (5.4) when past observations are neglected and there is no a priori information about the quantum system at time $\tau$, such that $\hat f(\tau) = \hat 1$. When $\hat f(\tau)$ and $\hat g(\tau)$ are incoherent mixtures of $\hat O$ eigenstates,

$$\hat f(\tau) = \sum_O f(O, \tau)\,|O\rangle\langle O|, \tag{5.6}$$

$$\hat g(\tau) = \sum_O g(O, \tau)\,|O\rangle\langle O|, \tag{5.7}$$

the weak value becomes

$${}_{\hat g}\langle\hat O\rangle_{\hat f} = \frac{\sum_O O\, g(O, \tau) f(O, \tau)}{\sum_O g(O, \tau) f(O, \tau)}, \tag{5.8}$$

and is consistent with the classical time-symmetric smoothing theory described in Sec. II. Hence, the weak value can be regarded as a quantum generalization of the smoothing estimate, conditioned upon past and future observations.

One can also establish a correspondence between a classical theory and a quantum theory via quasiprobability distributions. Given the smoothing probability density in terms of the Wigner distributions in Eq. (3.35), one may be tempted to undo the marginalizations over the quantum degrees of freedom and define a smoothing quasiprobability distribution as

$$h(q, p, \tau) = \frac{g(q, p, \tau) f(q, p, \tau)}{\int dq\,dp\; g(q, p, \tau) f(q, p, \tau)}, \tag{5.9}$$

where $f(q, p, \tau)$ and $g(q, p, \tau)$ are the Wigner distributions of $\hat f(\tau)$ and $\hat g(\tau)$, respectively. Intriguingly, $h(q, p, \tau)$, being the product of two Wigner distributions, can exhibit quantum position and momentum uncertainties that violate the Heisenberg uncertainty principle. This has been shown in Ref. [30], when the position of a quantum mechanical oscillator is monitored via continuous measurements and smoothing is applied to the observations. From the perspective of classical estimation theory, it is perhaps not surprising that smoothing can improve upon an uncertainty relation based on a predictive theory. The important question is whether the sub-Heisenberg uncertainties can be verified experimentally. Ref. [30] argues that it can be done only by Bayesian estimation, but in the following I shall propose another method based on weak measurements.

It can be shown that the expectation of $q$ using $h(q, p, \tau)$ is

$$\langle q\rangle_h \equiv \int dq\,dp\; q\, h(q, p, \tau) \tag{5.10}$$

$$= \operatorname{Re}\frac{\operatorname{tr}[\hat g(\tau)\hat q\hat f(\tau)]}{\operatorname{tr}[\hat g(\tau)\hat f(\tau)]} = \operatorname{Re}\,{}_{\hat g}\langle\hat q\rangle_{\hat f}, \tag{5.11}$$

which is the real part of the weak value, and likewise for $\langle p\rangle_h$, so the smoothing position and momentum estimates are closely related to their weak values. More generally, consider the joint probability density for a quantum position measurement followed by a quantum momentum measurement, conditioned upon past and future observations:

$$P(y_q, y_p) = \frac{1}{C}\operatorname{tr}\left[\hat g(\tau)\hat M_p(y_p)\hat M_q(y_q)\hat f(\tau)\hat M_q^\dagger(y_q)\hat M_p^\dagger(y_p)\right], \tag{5.12}$$

$$C \equiv \int dy_q\,dy_p\,\operatorname{tr}\left[\hat g(\tau)\hat M_p(y_p)\hat M_q(y_q)\hat f(\tau)\hat M_q^\dagger(y_q)\hat M_p^\dagger(y_p)\right], \tag{5.13}$$

where the measurement operators

$$\hat M_q(y_q) = \int dq\left(\frac{\epsilon_q}{2\pi}\right)^{1/4}\exp\left[-\frac{\epsilon_q}{4}(y_q - q)^2\right]|q\rangle\langle q|, \tag{5.14}$$

$$\hat M_p(y_p) = \int dp\left(\frac{\epsilon_p}{2\pi}\right)^{1/4}\exp\left[-\frac{\epsilon_p}{4}(y_p - p)^2\right]|p\rangle\langle p| \tag{5.15}$$

are assumed to be Gaussian and backaction evading. After some algebra,

$$P(y_q, y_p) = \int dq\,dp\left(\frac{\epsilon_q}{2\pi}\right)^{1/2}\left(\frac{\epsilon_p}{2\pi}\right)^{1/2}\exp\left[-\frac{\epsilon_q}{2}(y_q - q)^2 - \frac{\epsilon_p}{2}(y_p - p)^2\right]\tilde P(q, p), \tag{5.16}$$

$$\tilde P(q, p) \equiv \frac{1}{2\pi C}\int du\,dv\,\exp\left(-\frac{\epsilon_q u^2 + \epsilon_p v^2}{8}\right)\left\langle p + \frac{v}{2}\right|\hat g(\tau)\left|p - \frac{v}{2}\right\rangle\exp(ivq)\left\langle q - \frac{u}{2}\right|\hat f(\tau)\left|q + \frac{u}{2}\right\rangle\exp(ipu). \tag{5.17}$$

From the perspective of classical probability theory, Eq. (5.16) can be interpreted as the probability density of noisy position and momentum measurements with noise variances $1/\epsilon_q$ and $1/\epsilon_p$, when the measured object has a classical phase-space density given by $\tilde P(q, p)$. In the limit of infinitesimally weak measurements, $\epsilon_q, \epsilon_p \to 0$,

$$\lim_{\epsilon_q, \epsilon_p \to 0}\tilde P(q, p) = h(q, p, \tau). \tag{5.18}$$

Thus, $h(q, p, \tau)$ can be obtained approximately from an experiment with small $\epsilon_q$ and $\epsilon_p$ by measuring $P(y_q, y_p)$ for the same $\hat g$ and $\hat f$ and deconvolving Eq. (5.16). In practice, $\epsilon_q$ and $\epsilon_p$ only need to be small enough such that $\tilde P(q, p) \approx h(q, p, \tau)$. This allows one, at least in principle, to experimentally demonstrate the sub-Heisenberg uncertainties predicted in Ref. [30] in a frequentist way, not just by Bayesian estimation as described in Ref. [30]. Note, however, that $h(q, p, \tau)$ can still go negative, so it cannot always be regarded as a classical probability density. This underlines the wave nature of a quantum object and may be related to the negative probabilities encountered in the use of weak values to explain Hardy's paradox [34].

VI. CONCLUSION

In conclusion, I have used a discrete-time approach to derive the classical and quantum theories of time-symmetric smoothing. The hybrid smoothing theory is applied to the design of PLL, and the relation between the proposed theory and Aharonov et al.'s weak value theory is discussed. Possible generalizations of the theory include taking jumps into account for the classical random process [9] and adding quantum measurements with Poisson statistics, such as photon counting [18, 21, 22, 23]. Potential applications not discussed in this paper include cavity quantum electrodynamics [18, 21, 22, 23], photodetection theory [16, 18, 23], atomic magnetometry [35], and quantum information processing in general. On a more fundamental level, it might also be interesting to generalize the weak value theory and the smoothing quasiprobability distribution to other kinds of quantum degrees of freedom in addition to position and momentum, such as spin, photon number, and phase. A general quantum smoothing theory would complete the correspondence between classical and quantum estimation theories.

Acknowledgments

Discussions with Seth Lloyd and Jeffrey Shapiro are gratefully acknowledged. This work is financially supported by the Keck Foundation Center for Extreme Quantum Information Theory.

[1] A. H. Jazwinski, Stochastic Processes and Filtering Theory (Academic Press, New York, 1970).
[2] J. C. Crassidis and J. L. Junkins, Optimal Estimation of Dynamic Systems (Chapman & Hall CRC, Boca Raton, 2004).
[3] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I (Wiley, New York, 2001); Detection, Estimation, and Modulation Theory, Part II: Nonlinear Modulation Theory (Wiley, New York, 2002); Detection, Estimation, and Modulation Theory, Part III: Radar-Sonar Processing and Gaussian Signals in Noise (Wiley, New York, 2001).
[4] D. Simon, Optimal State Estimation (Wiley, Hoboken, 2006).
[5] M. Tsang, Phys. Rev. Lett. , 250403 (2009).
[6] E. Pardoux, Stochastics , 193 (1982). See also B. D. O. Anderson and I. B. Rhodes, Stochastics , 139 (1983).
[7] D. W. Berry and H. M. Wiseman, Phys. Rev. A , 063824 (2006).
[8] Y. Aharonov, D. Z. Albert, and L. Vaidman, Phys. Rev. Lett. , 1351 (1988); Y. Aharonov and L. Vaidman, J. Phys. A , 2315 (1991).
[9] C. W. Gardiner, Handbook of Stochastic Methods (Springer-Verlag, Berlin, 1985).
[10] R. L. Stratonovich, Theor. Probability Appl. , 156 (1960); H. J. Kushner, J. Math. Anal. Appl. , 332 (1964); SIAM J. Control , 106 (1964).
[11] M. Fujisaki, G. Kallianpur, and H. Kunita, Osaka J. Math. , 19 (1972).
[12] P. A. Frost and T. IEEE Trans. Auto. Control AC-16 , 217 (1971).
[13] R. E. Mortensen, Ph.D. dissertation, Univ. of California, Berkeley (1966); T. E. Duncan, Ph.D. dissertation, Stanford University (1967); M. Zakai, Z. Wahr. verw. Geb. , 230 (1969).
[14] D. Q. Mayne, Automatica , 73 (1966); D. C. Fraser and J. E. Potter, IEEE Trans. Automatic Control , 387 (1969).
[15] J. E. Wall, Jr., A. S. Willsky, and N. R. Sandell Jr., Stochastics , 1 (1981).
[16] P. Warszawski, H. M. Wiseman, and H. Mabuchi, Phys. Rev. A , 023802 (2002); P. Warszawski and H. M. Wiseman, J. Opt. B: Quant. Semiclass. Opt. , 1 (2003); , 15 (2003); N. P. Oxtoby, P. Warszawski, H. M. Wiseman, He-Bi Sun and R. E. S. Polkinghorne, Phys. Rev. B , 165317 (2005).
[17] I. V. Aleksandrov, Z. Naturforsch. , 902 (1981); W. Boucher and J. Traschen, Phys. Rev. D , 3522 (1988); L. Diósi, N. Gisin, and W. T. Strunz, Phys. Rev. A , 022108 (2000).
[18] C. W. Gardiner and P. Zoller, Quantum Noise (Springer-Verlag, Berlin, 2000).
[19] H. M. Wiseman and L. Diósi, Chem. Phys. , 91 (2001).
[20] V. P. Belavkin, Radiotech. Elektron. , 1445 (1980); V. P. Belavkin, in Information Complexity and Control in Quantum Physics, edited by A. Blaquière, S. Diner, and G. Lochak (Springer, Vienna, 1987), p. 311; V. P. Belavkin, in Stochastic Methods in Mathematics and Physics, edited by R. Gielerak and W. Karwowski (World Scientific, Singapore, 1989), p. 310; V. P. Belavkin, in Modeling and Control of Systems in Engineering, Quantum Mechanics, Economics, and Biosciences, edited by A. Blaquière (Springer, Berlin, 1989), p. 245.
[21] A. Barchielli, L. Lanz, and G. M. Prosperi, Nuovo Cimento, , 79 (1982); Found. Phys. , 779 (1983); H. Carmichael, An Open Systems Approach to Quantum Optics (Springer-Verlag, Berlin, 1993).
[22] D. F. Walls and G. J. Milburn, Quantum Optics (Springer-Verlag, Berlin, 2008).
[23] L. Mandel and E. Wolf, Optical Coherence and Quantum Optics (Cambridge University Press, Cambridge, 1995).
[24] S. D. Personick, IEEE Trans. Inform. Theor. IT-17 , 240 (1971).
[25] H. M. Wiseman, Phys. Rev. Lett. , 4587 (1995).
[26] M. A. Armen, J. K. Au, J. K. Stockton, A. C. Doherty, and H. Mabuchi, Phys. Rev. Lett. , 133602 (2002).
[27] D. W. Berry and H. M. Wiseman, Phys. Rev. A , 043803 (2002).
[28] D. T. Pope, H. M. Wiseman, and N. K. Langford, Phys. Rev. A , 043812 (2004).
[29] M. Tsang, J. H. Shapiro, and S. Lloyd, Phys. Rev. A , 053820 (2008).
[30] M. Tsang, J. H. Shapiro, and S. Lloyd, Phys. Rev. A , 053843 (2009).
[31] S. M. Barnett, D. T. Pegg, J. Jeffers, O. Jedrkiewicz, and R. Loudon, Phys. Rev. A , 022313 (2000); S. M. Barnett, D. T. Pegg, J. Jeffers, and O. Jedrkiewicz, Phys. Rev. Lett. , 2455 (2001); D. T. Pegg, S. M. Barnett, and J. Jeffers, Phys. Rev. A , 022106 (2002).
[32] M. Yanagisawa, e-print arXiv:0711.3885.
[33] H. M. Wiseman, Phys. Rev. A , 032111 (2002).
[34] Y. Aharonov, A. Botero, S. Popescu, B. Reznik, and J. Tollaksen, Phys. Lett. A , 130 (2002).
[35] D. Budker, W. Gawlik, D. F. Kimball, S. M. Rochester, V. V. Yashchuk, and A. Weis, Rev. Mod. Phys. , 1153 (2002); J. M. Geremia, J. K. Stockton, A. C. Doherty, and H. Mabuchi, Phys. Rev. Lett. 91