Certifying Non-Classical Behavior for Negative Keldysh Quasi-Probabilities

Abstract

We introduce an experimental test for ruling out classical explanations for the statistics obtained when measuring arbitrary observables at arbitrary times using individual detectors. This test requires some trust in the measurements, represented by a few natural assumptions on the detectors. In quantum theory, the considered scenarios are well captured by von Neumann measurements. These can be described naturally in terms of the Keldysh quasi-probability distribution (KQPD), and the imprecision and backaction exerted by the measurement apparatus. We find that classical descriptions can be ruled out from measured data if and only if the KQPD exhibits negative values. We provide examples based on simulated data, considering the influence of a finite amount of statistics. In addition to providing an experimental tool for certifying non-classicality, our results bestow an operational meaning upon the non-classical nature of negative quasi-probability distributions such as the Wigner function and the full counting statistics.



Patrick P. Potts ∗
Physics Department and NanoLund, Lund University, Box 118, 22100 Lund, Sweden.
(Dated: March 27, 2019)

Introduction.—

The theory of quantum mechanics contains ingredients that are absent in classical theories, such as entanglement, wave-function collapse, and superposition of arbitrary states [1–3]. In some scenarios, these ingredients are beneficial (e.g., quantum information [4]), while in other scenarios, they provide limitations (e.g., quantum noise in measurement and amplification [5]). The realm of possibilities that are enabled or prohibited by quantum mechanics is a highly non-trivial subject of current research.

At the heart of this problem lies the question: “which observations cannot be explained by classical theories?”

A strong result in this direction is provided by Bell inequalities [6]. With the help of such inequalities, observed data alone can rule out any theory that fulfills a natural definition of locality [7]. While this is an extremely powerful result, locality is a rather specific requirement and does not encompass all classical theories [8].

Another well established approach for testing for non-classicality is given by the Glauber-Sudarshan $P$-function in quantum optics [9, 10]. If a state is described by a $P$-function that cannot be interpreted as a probability distribution, then some measurable intensity correlators resulting from this state cannot be described by classical electrodynamics [11, 12]. In contrast to Bell inequalities, the measurement device thus has to be trusted to produce intensity correlators of light.

Arguably the most striking difference to classical theories is the fact that observables cannot be described using positive probability distributions in quantum mechanics. Leggett-Garg inequalities [13] provide a test for non-classicality based on this criterion. However, an additional assumption of non-invasive measurability, which is not generally justified, complicates the conclusions [14].

In this letter, we provide a test for non-classicality which rules out any description based on positive probabilities under a few realistic assumptions on the measurement apparatus. To this end, we consider scenarios where observables are measured using individual detectors, see Fig. 1. In quantum theory, such scenarios are well described by von Neumann type measurements [15–18], where observables of interest are coupled to detectors which are subsequently measured projectively. The probability distribution describing the measurement outcomes has a natural description in terms of a quasi-probability distribution that we abbreviate as KQPD because it is reminiscent of the Keldysh path-integral formulation [19, 20]. The KQPD depends on the observables of interest and can reduce to the Wigner function [21] or the full counting statistics [22]. Other applications include quantum thermodynamics [23–28], quantum optics [29], generalized Wigner functions [30], weak values [19] (see also [31–33]), and non-equilibrium phenomena in quantum systems [20]. Importantly, the KQPD can become negative, indicating non-classical behavior [18, 29, 34–39]. Here we put this non-classical feature on a firmer footing by taking an operational approach. To this end, we put forward a classical model for measurements based on individual detectors. This model is based on a few natural assumptions on the detectors and results in an experimentally accessible inequality. We show that within quantum theory, negativity in the KQPD is a necessary and sufficient condition to violate the inequality, ruling out a classical description. Just like negativity in the $P$-function rules out an explanation by classical electrodynamics (as long as the detectors can be trusted to produce intensity correlators), negativity in the KQPD rules out an explanation based on positive probabilities, as long as the measurement apparatus can be trusted to fulfill the assumptions specified below.

FIG. 1. (a) Sketch of the setup. Two observables are measured by detectors ($D_j$) coupled to the system ($S$). The detectors come with a knob ($\chi_j$) and disturb the system ($\gamma_j$). (b) Illustration of von Neumann measurements. Detectors are quantum mechanical systems that couple to the system of interest at times $t_j$ via the Hamiltonian $\hat{H}_j$. The interaction shifts the probability distribution $\rho_j$ of the detectors by an amount depending on the system state $\hat{\rho}$. After the interaction, a projective position measurement is performed on the detectors resulting in the outcome $\bar{A}_j$.

In contrast to Leggett-Garg inequalities, non-invasiveness of the measurement is not required.
The proposed experimental test of non-classical behavior is therefore not subject to a clumsiness [14] or a finite precision loophole [40, 41]. The model is not necessarily local or non-contextual [42–44].

Before we introduce the classical model, we provide the quantum mechanical (QM) description of the scenario under investigation, sketched in Fig. 1. While we assume this to be the correct description, we stress that our test for non-classicality does not rely on the QM-model.

The KQPD.—

The QM-model relies on the KQPD, which is discussed in detail in Ref. [19]. It encodes the joint fluctuations of multiple observables of interest. For simplicity, we consider the situation where we are interested in two observables $\hat{A}_1$ and $\hat{A}_2$ at times $t_1$ and $t_2$, respectively. The generalization to more observables is straightforward. Let us further consider the situation where $t_2$ either comes immediately after $t_1$ (subsequent measurements) or where $t_1 = t_2$ (simultaneous measurements). The KQPD is then defined as ($\hbar = 1$)

$$\mathcal{P}(\mathbf{A}|\boldsymbol{\gamma}) = \int \frac{d\boldsymbol{\lambda}}{(2\pi)^2}\, e^{i\boldsymbol{\lambda}\cdot\mathbf{A}}\, \mathrm{Tr}\left\{\hat{Q}(\boldsymbol{\lambda},\boldsymbol{\gamma})\,\hat{\rho}\,\hat{Q}^{\dagger}(-\boldsymbol{\lambda},\boldsymbol{\gamma})\right\}, \tag{1}$$

where $\hat{Q} = \exp[-i(\lambda_2/2+\gamma_2)\hat{A}_2]\,\exp[-i(\lambda_1/2+\gamma_1)\hat{A}_1]$ for subsequent and $\hat{Q} = \exp[-i\sum_{j=1,2}(\lambda_j/2+\gamma_j)\hat{A}_j]$ for simultaneous measurements. The state before the measurement is denoted by $\hat{\rho}$. We grouped the observables into a vector $\mathbf{A} = (A_1, A_2)$ and similarly for $\boldsymbol{\lambda}$ and $\boldsymbol{\gamma}$. As shown below, the variables $\gamma_j$ are necessary to take into account the backaction exerted by the measurement and can be seen as random variables determined by the detectors. A physical motivation for the definition in Eq. (1) is provided below, by Eq. (3).

If $[\hat{A}_1, \hat{A}_2] \neq 0$, the measurement of $\hat{A}_1$ may influence the measurement of $\hat{A}_2$ and a description of the system in terms of pre-determined values of $A_1$ and $A_2$ is not generally possible. In this case, the KQPD may become negative. It has been shown that such negativity requires the system to be in a superposition of states that correspond to different values for the observable $A_1$ [39]. Negativity in the KQPD can thus be seen as an indicator for non-classical behavior. However, in an experiment, the negativity of the KQPD is masked by measurement imprecision and backaction, rendering the measured probability distribution strictly non-negative. The inequality that we introduce below relies on a way to unmask the KQPD experimentally.

The QM-model.—

We consider two detectors, one for each observable to be measured. The detectors can be described by canonically conjugate observables $\hat{r}_j$ and $\hat{\pi}_j$, and they are coupled to the system through the Hamiltonian [15]

$$\hat{H}_j = \delta(t-t_j)\,\chi_j\,\hat{A}_j\,\hat{\pi}_j, \tag{2}$$

where $j = 1, 2$, and $\chi_j$ denotes the measurement strength. We assume that the time-evolution induced by any Hamiltonian other than Eq. (2) can be neglected during (and between) the measurements, noting that it is straightforward to include time-evolution between the measurements (for an investigation on detector memory effects, see Ref. [45]). Equation (2) induces a displacement in the detector coordinates $\hat{r}_j$ which depends on the state of the system. After the interaction, a projective measurement of the detectors is performed to complete the measurement of the system observables $\hat{A}_j$. The measured distribution reads [19] (see also [46, 47])

$$P(\mathbf{A}|\boldsymbol{\chi}) = \int d\mathbf{A}'\, d\boldsymbol{\gamma}\; \mathcal{P}(\mathbf{A}'|\boldsymbol{\gamma}) \prod_{j=1,2} W_j(\bar{A}_j - \bar{A}'_j, \bar{\gamma}_j), \tag{3}$$

where $W_j(r, \pi)$ denotes the Wigner function of detector $j$ and we introduced $\bar{A}_j = \chi_j A_j$ and $\bar{\gamma}_j = \gamma_j/\chi_j$. This equation has a simple interpretation, motivating the definition in Eq. (1). The KQPD describes the intrinsic fluctuations of the observables, containing all the information of the system. These fluctuations are distorted by the measurement process, giving rise to the convolution with the Wigner functions of the detectors. The uncertainty in the position coordinates induces a fuzziness in the measurement (measurement imprecision) and the uncertainty in the momentum coordinates introduces a random kick in the measured observable through Eq. (2) (measurement backaction). Due to the Heisenberg uncertainty relation, there exists a trade-off between imprecision and backaction [5] which ensures that the measured distribution is always positive, even when the KQPD exhibits negativity. For an investigation of the classical limit of von Neumann type measurements, see Ref. [48].

The classical model.—

We now introduce a classical hidden-variable model that describes the situation sketched in Fig. 1 (a). To this end, we assume that the system is described by a probability distribution $S(\mathbf{A}|\boldsymbol{\gamma})$. This distribution encodes the (hidden) values of the observables ($\mathbf{A}$) and takes into account that the presence of the detectors may modify the system behavior ($\boldsymbol{\gamma}$). The measured distribution can then be written in the completely general form

$$P_{\rm cl}(\mathbf{A}|\boldsymbol{\chi}) = \int d\mathbf{A}'\, d\boldsymbol{\gamma}\; M(\mathbf{A}, \mathbf{A}', \boldsymbol{\gamma}|\boldsymbol{\chi})\, S(\mathbf{A}'|\boldsymbol{\gamma}), \tag{4}$$

where $\boldsymbol{\chi}$ describes the (changeable) detector settings. The function $M$ describes the effect of the detectors. We say that an observed probability distribution has a classical explanation if it can be described by the right-hand side of Eq. (4) with positive $S$ and $M$.

Equation (4) is sufficiently general that it can essentially describe any observations. To rule out a classical explanation, we place some trust in the detectors and make the following assumptions:

1. Uncorrelated detectors:

$$M(\mathbf{A}, \mathbf{A}', \boldsymbol{\gamma}|\boldsymbol{\chi}) = \prod_j M_j(A_j, A'_j, \gamma_j|\chi_j). \tag{5}$$

2. Uncorrelated imprecision and backaction:

$$M_j(A_j, A'_j, \gamma_j|\chi_j) = p_j(\gamma_j|\chi_j)\, D_j(A_j, A'_j|\chi_j). \tag{6}$$

3. Backaction only affects the other observable:

$$\int dA_k\, S(\mathbf{A}|\gamma_j, \gamma_k = 0) \equiv S(A_j|\gamma_j) = S(A_j). \tag{7}$$

4. Translational invariance:

$$D_j(A_j, A'_j|\chi_j) = D_j(A_j - A'_j|\chi_j). \tag{8}$$

5. Detectors can be detached:

$$\lim_{\chi_j \to 0} p_j(\gamma_j|\chi_j)\, D_j(A_j - A'_j|\chi_j) = \delta(\gamma_j)\, U(A_j). \tag{9}$$

In the spirit of the considered scenario, the first assumption allows us to treat the detectors as individual objects (note that this assumption is also present in the Bell scenario). Assumptions 2 and 3 ensure that the backaction of a detector does not interfere with its own measurement, i.e., a detector’s output is independent of its backaction on the system. In Eq. (7), we introduced the distribution relevant for measuring a single variable, $S(A_j)$, which is assumed to be independent of the backaction of its own detector. In assumption 5, $U$ denotes the uniform distribution and we defined $\gamma_j = 0$ to denote the absence of any backaction of detector $j$. We note that our assumptions only include the effect of the detectors. On a qualitative level, one can thus replace our assumptions with the notion of having control over measurements of single observables and preventing any cross-talk between the detectors.

Certifying non-classicality.—

We denote by $P(A_j|\chi_j)$ the distribution that describes a measurement of a single observable. We further denote the Fourier transform of any distribution with a tilde, $\tilde{P}(\lambda) = \int dA\, \exp(-i\lambda A)\, P(A)$. We then consider the quantity

$$K = \frac{1}{(2\pi)^2} \int d\boldsymbol{\lambda}\, e^{i\boldsymbol{\lambda}\cdot\mathbf{A}}\, \tilde{P}(\boldsymbol{\lambda}|\boldsymbol{\chi}) \prod_{j=1,2} \frac{\tilde{P}(\lambda_j|\chi'_j)}{\tilde{P}(\lambda_j|\chi_j)}, \tag{10}$$

where we note that the right-hand side only contains Fourier transforms of measurable probability distributions. If the measurement is described by our classical model, we can write this quantity as [49]

$$K_{\rm cl} = \int d\mathbf{A}'\, d\boldsymbol{\gamma}\; S(\mathbf{A}'|\boldsymbol{\gamma}) \prod_{j=1,2} p_j(\gamma_j|\chi_j)\, D_j(A_j - A'_j|\chi'_j). \tag{11}$$

This equation is very similar to Eq. (4) (under our assumptions) with the only difference that $\chi_j$ is replaced by $\chi'_j$ in the measurement imprecision term $D_j$. Within our assumptions, the measurement imprecision of the detectors can be corrected for. We end up with a distribution where the backaction is determined by $\chi_j$ and the imprecision by $\chi'_j$. In our classical model, this still results in a positive distribution

$$K_{\rm cl} \geq 0. \tag{12}$$

Any violation of this inequality implies that the observed data cannot be explained by Eq. (4) with positive $S$ and $M$ that satisfy the five assumptions. Trusting the detectors (i.e., the assumptions) then allows us to conclude that no explanation in terms of positive probabilities is possible. The assumptions thus introduce loopholes since a violation of Eq. (12) could in principle result from their breakdown.

In quantum mechanics, the delicate interplay between backaction and imprecision is what masks the negativity of the KQPD. This may result in a violation of the inequality. Using detectors with positive Wigner functions that factorize in a position and a momentum part ensures that our assumptions on the detectors are satisfied. The quantity $K$ is then given by an expression analogous to Eq. (11), with $S$ replaced by the KQPD $\mathcal{P}$. This can be seen by plugging Eq. (3), and a similar expression for single observables, into Eq. (10). A positive KQPD then immediately ensures $K \geq 0$. In the limit where $\chi_j \to 0$ and $\chi'_j \to \infty$, we find $K \to \mathcal{P}$. Whenever the KQPD exhibits negativity, we can thus find $K < 0$, violating the inequality in Eq. (12). Since the assumptions on the detectors are met, this implies that the measured data cannot be explained by positive probability distributions. Negativity in the KQPD is therefore a necessary and sufficient condition for certifying non-classicality.
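To make the statement about negativity concrete, the following sketch evaluates the KQPD of Eq. (1) at $\boldsymbol{\gamma} = 0$ for two subsequent measurements on a finite-dimensional system. It assumes that the trace in Eq. (1) is expanded in the eigenbases of the two observables, which turns the distribution into a set of weighted delta peaks. The function name `kqpd_weights` and the test case (a qubit prepared in $|+\rangle$ with $\hat{A}_1 = \hat{\sigma}_z$ and $\hat{A}_2 = \hat{\sigma}_x$) are illustrative choices and not code from this work.

```python
import numpy as np


def kqpd_weights(rho, A1, A2):
    """Weights of the delta peaks of the KQPD at gamma = 0 (A1 first, then A2).

    Expanding Eq. (1) in the eigenbases of A1 and A2 gives peaks at
    A_1 = (a + a') / 2 and A_2 = b with weight Tr{Pi_b Pi_a rho Pi_a'},
    where a, a' are eigenvalues of A1 and b is an eigenvalue of A2.
    """
    a_vals, a_vecs = np.linalg.eigh(A1)
    b_vals, b_vecs = np.linalg.eigh(A2)
    weights = {}
    for b, vb in zip(b_vals, b_vecs.T):
        for a, va in zip(a_vals, a_vecs.T):
            for ap, vap in zip(a_vals, a_vecs.T):
                # <a|rho|a'> <a'|b> <b|a> = Tr{Pi_b Pi_a rho Pi_a'}
                term = (va.conj() @ rho @ vap) * (vap.conj() @ vb) * (vb.conj() @ va)
                # Round so that numerically equal peak positions share one key.
                key = (float(round((a + ap) / 2, 9)) + 0.0, float(round(b, 9)) + 0.0)
                weights[key] = weights.get(key, 0.0) + term
    # Terms with a != a' come in complex-conjugate pairs, so each sum is real.
    return {key: float(np.real(w)) for key, w in weights.items()}


sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
plus = np.array([1.0, 1.0]) / np.sqrt(2.0)   # eigenstate |+> of sigma_x
rho_plus = np.outer(plus, plus.conj())

weights = kqpd_weights(rho_plus, sigma_z, sigma_x)
for (a1, a2), w in sorted(weights.items()):
    print(f"A1 = {a1:+.1f}, A2 = {a2:+.1f}: weight {w:+.3f}")
```

For this state the sketch assigns the weight $-1/2$ to the peak at $A_1 = 0$, $A_2 = -1$, while all weights sum to one. The negative peak originates from the interference terms with $a \neq a'$, which vanish if the state is an eigenstate of $\hat{A}_1$.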

Examples.—

We now illustrate how our classical model can be ruled out from experimental (in our case, simulated) data by violating the inequality in Eq. (12). We consider two examples: the simultaneous measurement of position and momentum, and two subsequent, non-commuting Stern-Gerlach type spin measurements. For both examples, we consider identical detectors that are described by the Wigner function (throughout, we consider dimensionless units for position and momentum)

$$W_j(r_j, p_j) = \frac{1}{\pi}\, e^{-(r_j^2 + p_j^2)}, \tag{13}$$

corresponding to unsqueezed Gaussian states of minimal uncertainty. As demanded by assumption 2, they factorize into distributions for position (imprecision) and momentum (backaction).

FIG. 2. Certifying non-classicality. (a) Simultaneous measurement of both quadratures in a single-mode Fock state containing one photon. (b) Two subsequent spin measurements in different directions on a spin one-half particle. The large panels show $K$ for values $\mathbf{A}$ that maximize the negativity [$x = 0$, $p = 0$ for (a); $\sigma_1 = 0$, $\sigma_2 = -1$ for (b)], comparing the exact $K$ [Eq. (10)] with the estimate $K_{\rm est}$ (triangles) based on numerical simulations [Eq. (17)]. The side panels show the full estimate of $K$ for a single data point. In (a), the small side-panel shows the exact distribution $K$. In (b), the dashed lines correspond to the exact $K$. As the measurement strength $\chi$ increases, the estimate becomes more reliable but the backaction decreases the negativity in $K$. The simulations are based on 15 000 individual measurements of the observables and 30 000 joint measurements. Other parameters: (a) $\chi' = 5$, $c_o = 0.\ldots$, $\lambda_c = 10$. (b) $\chi' = 3$, $c_o = 0.\ldots$, $\lambda_c = 12$.

We first consider a simultaneous measurement of position and momentum on a single-photon Fock state described by the Wigner function

$$W(x, p) = \frac{1}{\pi}\left[2(x^2 + p^2) - 1\right] e^{-(x^2 + p^2)}. \tag{14}$$

In this case, our quantum mechanical model reduces to the Arthurs-Kelly model [50]. We note that such a measurement can be implemented by heterodyne detection [51], see Ref. [52] for an experimental realization. As discussed in detail in Ref. [19], the KQPD for the simultaneous position and momentum measurement is given by $W(x - \gamma_p/2,\, p + \gamma_x/2)$. For $\chi_x = \chi_p = \chi$ and $\chi'_x = \chi'_p = \chi'$ we then find (see supplemental information for details [49])

$$K = \frac{1}{\pi (1+g)^3}\, e^{-\frac{x^2 + p^2}{1+g}} \left[2(x^2 + p^2) - 1 + g^2\right], \tag{15}$$

where $g = (\chi/2)^2 + 1/(\chi')^2$. We note that in the limit $\chi \to 0$, $\chi' \to \infty$, we have $g \to 0$. For $g < 1$, we find $K < 0$ at the origin $x = p = 0$. The smaller $\chi$, the stronger the negativity in the measurable quantity $K$. Weaker measurements thus always seem to be preferable. This is only true under the assumption that $K$ can be estimated precisely. Strictly speaking, this requires an infinite amount of data. For a finite and fixed number of measurements, we will find a trade-off between having large negative values in $K$ (requiring small $\chi$) and being able to reliably estimate $K$ (requiring large $\chi$). To estimate $K$, we consider an experiment with $N$ measurements resulting in outcomes $x_j$. We define the empirical characteristic function [53]

$$Y_\lambda = \frac{1}{N} \sum_{j=1}^{N} e^{-i\lambda x_j}, \tag{16}$$

which provides an unbiased estimator of the characteristic function (i.e., the Fourier transform of the probability distribution). We note that it is imprecise for large values of $\lambda$, where the characteristic function is a small number. For $K$, we introduce the estimator

$$K_{\rm est} = \int_{-\lambda_c}^{\lambda_c} \frac{d\boldsymbol{\lambda}}{(2\pi)^2}\, e^{i\boldsymbol{\lambda}\cdot\mathbf{A}} \begin{cases} Y_{\boldsymbol{\lambda}}\, \dfrac{Y'_{\lambda_x}}{Y_{\lambda_x}}\, \dfrac{Y'_{\lambda_p}}{Y_{\lambda_p}} & \text{for } |Y_{\lambda_{x/p}}| > c_o, \\ 0 & \text{otherwise}, \end{cases} \tag{17}$$

where $\boldsymbol{\lambda}\cdot\mathbf{A} = \lambda_x x + \lambda_p p$. Here the different empirical characteristic functions are labeled by $\boldsymbol{\lambda}$ for the joint measurement and by a prime for the measurements with strength $\chi'$. Two empirical cut-offs increase the stability of the estimator. The first, $c_o$, ensures that values of $\lambda$ where we divide by a very small number are not taken into account. The second, $\lambda_c$, allows for integrating over a finite domain. The estimator in Eq. (17) is illustrated in Fig. 2 (a) for simulated data. For large values of $\chi$, it is both accurate and precise. As $\chi$ becomes smaller, the spread of the estimates increases (the precision is reduced).
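The trade-off described above can be sketched numerically. The snippet below is a minimal illustration and not the simulation behind Fig. 2: it evaluates the closed form of Eq. (15) at the origin for a few measurement strengths and computes the empirical characteristic function of Eq. (16) for Gaussian test data. The function names, the Gaussian samples, the random seed, and the cut-off value `c_o = 0.05` are assumptions made for this example only.

```python
import numpy as np


def g_factor(chi, chi_prime):
    """g = (chi / 2)^2 + 1 / chi'^2, as defined below Eq. (15)."""
    return (chi / 2.0) ** 2 + 1.0 / chi_prime ** 2


def K_exact(x, p, g):
    """Closed form of Eq. (15) for the single-photon Fock state."""
    r2 = x ** 2 + p ** 2
    return np.exp(-r2 / (1.0 + g)) * (2.0 * r2 - 1.0 + g ** 2) / (np.pi * (1.0 + g) ** 3)


def empirical_cf(samples, lam):
    """Empirical characteristic function Y_lambda of Eq. (16)."""
    samples = np.asarray(samples, dtype=float)
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    return np.exp(-1j * np.outer(lam, samples)).mean(axis=1)


# Negativity at the origin for chi' = 5 (the value quoted for Fig. 2 (a)):
# the smaller chi, the more negative K(0, 0).
chis = np.array([0.5, 1.0, 1.5])
K_origin = K_exact(0.0, 0.0, g_factor(chis, 5.0))

# Noise floor of the estimator, using illustrative Gaussian data.
rng = np.random.default_rng(seed=0)
N = 15_000
samples = rng.normal(loc=0.0, scale=1.0, size=N)
lam = np.linspace(-6.0, 6.0, 121)
Y = empirical_cf(samples, lam)
exact_cf = np.exp(-lam ** 2 / 2.0)  # characteristic function of a unit Gaussian

c_o = 0.05                 # hypothetical cut-off on |Y_lambda|
usable = np.abs(Y) > c_o   # values of lambda that would enter Eq. (17)
print("K(0, 0):", K_origin)
print("max |Y - exact|:", np.abs(Y - exact_cf).max(), " 1/sqrt(N):", N ** -0.5)
print("usable fraction of the lambda grid:", usable.mean())
```

With these settings the characteristic function only exceeds the cut-off in a window around $\lambda = 0$. Outside this window, the empirical values are dominated by statistical noise of order $1/\sqrt{N}$, which is why the ratios in Eq. (17) are not evaluated there.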
For even smaller $\chi$, the cut-off $c_o$ eventually prevents an accurate estimation because the true characteristic function becomes very small for almost all values of $\lambda_{x/p}$. As expected, we find a trade-off between large $\chi$, where the negativity in $K$ is not very pronounced, and small $\chi$, where it is hard to estimate $K$.

Our second example is provided by subsequent, non-commuting measurements on a two-level system (for a recent experimental implementation of non-commuting spin measurements, see Ref. [54]; for a detailed discussion on simultaneous spin measurements, see Ref. [55]). We consider the system to be in a pure state $|+\rangle$, which is an eigenstate of the Pauli matrix $\hat{\sigma}_x$. We then make a measurement of $\hat{\sigma}_z$ with strength $\chi_1 = \chi$, followed by a projective measurement of $\hat{\sigma}_x$. The KQPD for this system is discussed in Ref. [19] and given in the supplemental information [49]. Because it is unavoidable that the first measurement influences the second one, the KQPD exhibits negativity. Since the second measurement is projective, we only correct for the measurement imprecision of the first measurement, choosing $\chi_2 = \chi'_2 \to \infty$ in Eq. (10). All distributions can then be given as densities in the continuous variable $\sigma_1$ and probabilities in the discrete variable $\sigma_2 = \pm 1$. Certifying non-classicality of this system is illustrated in Fig. 2 (b), where we show both $K$ as well as $K_{\rm est}$. We find the same qualitative results as for the simultaneous position and momentum measurement. The weaker the first measurement, the more pronounced the negativity but the less reliable is the estimate $K_{\rm est}$. Detailed calculations can be found in the supplemental information [49].

Conclusions.—

We introduced a classical model for measurements that use individual detectors for different observables. Under five natural assumptions, we find the inequality $K \geq 0$. Any violation of this inequality implies that either no description in terms of positive probabilities is possible, or one of the assumptions on the detectors is not met. In scenarios which are well described by quantum mechanical von Neumann measurements, we find that $K$ can become negative if and only if the KQPD exhibits negative values. In this case, $K$ provides a way of approximating the KQPD from measurable probability distributions. This is possible because measurement imprecision is a property of the detector alone and can thus be inferred and corrected for. In weak measurements, where backaction becomes small, correcting for the measurement imprecision “unmasks” the KQPD, exposing its negativity.

Our classical model is appropriate whenever individual detectors are used to measure different observables. The introduced operational procedure for certifying non-classicality is thus of broad experimental relevance and it puts the non-classical nature of the negative values in the KQPD on a firmer footing.

Acknowledgements.—

I acknowledge fruitful discussions with M.-O. Renou, N. Brunner, and P. Samuelsson. I acknowledge funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 796700. This work was supported by the Swedish Research Council as well as the Swiss National Science Foundation.

∗ [email protected]; The author was previously known as Patrick P. Hofer.

[1] S. Haroche, “Entanglement, decoherence and the quantum/classical boundary,” Phys. Today , 36 (1998).
[2] G. Parisi and G. Auletta, Foundations and Interpretation of Quantum Mechanics (World Scientific, 2000).
[3] J. P. Dowling and G. J. Milburn, “Quantum technology: the second quantum revolution,” Philos. Trans. Royal Soc. A , 1655 (2003).
[4] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information: 10th Anniversary Edition (Cambridge University Press, 2010).
[5] A. A. Clerk, M. H. Devoret, S. M. Girvin, F. Marquardt, and R. J. Schoelkopf, “Introduction to quantum noise, measurement, and amplification,” Rev. Mod. Phys. , 1155 (2010).
[6] J. S. Bell, “On the Einstein Podolsky Rosen paradox,” Physics , 195 (1964).
[7] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani, and S. Wehner, “Bell nonlocality,” Rev. Mod. Phys. , 419 (2014).
[8] Newton’s theory of gravity is, for instance, non-local and can thus strictly speaking not be ruled out by a violation of a Bell inequality.
[9] Roy J. Glauber, “Coherent and incoherent states of the radiation field,” Phys. Rev. , 2766 (1963).
[10] E. C. G. Sudarshan, “Equivalence of semiclassical and quantum mechanical descriptions of statistical light beams,” Phys. Rev. Lett. , 277 (1963).
[11] L. Mandel, “Non-classical states of the electromagnetic field,” Phys. Scr. , 34 (1986).
[12] W. Vogel, “Nonclassical correlation properties of radiation fields,” Phys. Rev. Lett. , 013605 (2008).
[13] A. J. Leggett and A. Garg, “Quantum mechanics versus macroscopic realism: Is the flux there when nobody looks?” Phys. Rev. Lett. , 857 (1985).
[14] C. Emary, N. Lambert, and F. Nori, “Leggett-Garg inequalities,” Rep. Prog. Phys. , 016001 (2014).
[15] J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, 1932).
[16] S. Stenholm, “Simultaneous measurement of conjugate variables,” Ann. Phys. , 233 (1992).
[17] P. Busch, “‘No information without disturbance’: Quantum limitations of measurement,” in Quantum Reality, Relativistic Causality, and Closing the Epistemic Circle: Essays in Honour of Abner Shimony (Springer Netherlands, Dordrecht, 2009) p. 229.
[18] A. A. Clerk, F. Marquardt, and J. G. E. Harris, “Quantum measurement of phonon shot noise,” Phys. Rev. Lett. , 213603 (2010).
[19] P. P. Hofer, “Quasi-probability distributions for observables in dynamic systems,” Quantum , 32 (2017).
[20] C. Aron, G. Biroli, and L. F. Cugliandolo, “(Non) equilibrium dynamics: a (broken) symmetry of the Keldysh generating functional,” SciPost Phys. , 8 (2018).
[21] C. K. Zachos, D. B. Fairlie, and T. L. Curtright, eds., Quantum Mechanics in Phase Space: An Overview with Selected Papers (World Scientific, 2005).
[22] Yu. V. Nazarov, ed., Quantum Noise in Mesoscopic Physics (Springer, 2003).
[23] M. Esposito, U. Harbola, and S. Mukamel, “Nonequilibrium fluctuations, fluctuation theorems, and counting statistics in quantum systems,” Rev. Mod. Phys. , 1665 (2009).
[24] P. Solinas and S. Gasparinetti, “Full distribution of work done on a quantum system for arbitrary initial states,” Phys. Rev. E , 042150 (2015).
[25] P. Solinas and S. Gasparinetti, “Probing quantum interference effects in the work distribution,” Phys. Rev. A , 052103 (2016).
[26] E. Bäumer, M. Lostaglio, M. Perarnau-Llobet, and R. Sampaio, “Fluctuating work in coherent quantum systems: proposals and limitations,” arXiv:1805.10096 [quant-ph].
[27] G. De Chiara, P. Solinas, F. Cerisola, and A. J. Roncaglia, “Ancilla-assisted measurement of quantum work,” arXiv:1805.06047 [quant-ph].
[28] M. Lostaglio, “Quantum fluctuation theorems, contextuality, and work quasiprobabilities,” Phys. Rev. Lett. , 040602 (2018).
[29] A. A. Clerk, “Full counting statistics of energy fluctuations in a driven quantum resonator,” Phys. Rev. A , 043824 (2011).
[30] R. Schwonnek and R. F. Werner, “Wigner distributions for n arbitrary operators,” arXiv:1802.08342 [quant-ph].
[31] M. Hallaji, A. Feizpour, G. Dmochowski, J. Sinclair, and A. M. Steinberg, “Weak-value amplification of the nonlinear effect of a single photon,” Nat. Phys. , 540 (2017).
[32] J. Sinclair, D. Spierings, A. Brodutch, and A. M. Steinberg, “Weak values and neoclassical realism,” arXiv:1808.09951 [quant-ph].
[33] N. Yunger Halpern, B. Swingle, and J. Dressel, “Quasiprobability behind the out-of-time-ordered correlator,” Phys. Rev. A , 042105 (2018).
[34] W. Belzig and Yu. V. Nazarov, “Full counting statistics of electron transfer between superconductors,” Phys. Rev. Lett. , 197006 (2001).
[35] A. Kenfack and K. Życzkowski, “Negativity of the Wigner function as an indicator of non-classicality,” J. Opt. B , 396 (2004).
[36] S. Deleglise, I. Dotsenko, C. Sayrin, J. Bernu, M. Brune, J.-M. Raimond, and S. Haroche, “Reconstruction of non-classical cavity field states with snapshots of their decoherence,” Nature , 510 (2008).
[37] A. Bednorz and W. Belzig, “Quasiprobabilistic interpretation of weak measurements in mesoscopic junctions,” Phys. Rev. Lett. , 106803 (2010).
[38] A. Bednorz, W. Belzig, and A. Nitzan, “Nonclassical time correlation functions in continuous quantum measurement,” New J. Phys. , 013009 (2012).
[39] P. P. Hofer and A. A. Clerk, “Negative full counting statistics arise from interference effects,” Phys. Rev. Lett. , 013603 (2016).
[40] A. Kent, “Noncontextual hidden variables and physical measurements,” Phys. Rev. Lett. , 3755 (1999).
[41] David A. Meyer, “Finite precision measurement nullifies the Kochen-Specker theorem,” Phys. Rev. Lett. , 3751 (1999).
[42] Robert W. Spekkens, “Negativity and contextuality are equivalent notions of nonclassicality,” Phys. Rev. Lett. , 020401 (2008).
[43] A. Cabello, “Experimentally testable state-independent quantum contextuality,” Phys. Rev. Lett. , 210401 (2008).
[44] G. Kirchmair, F. Zähringer, R. Gerritsma, M. Kleinmann, O. Gühne, A. Cabello, R. Blatt, and C. F. Roos, “State-independent experimental test of quantum contextuality,” Nature , 494 (2009).
[45] J. Bülte, A. Bednorz, C. Bruder, and W. Belzig, “Non-invasive quantum measurement of arbitrary operator order by engineered non-Markovian detectors,” Phys. Rev. Lett. , 140407 (2018).
[46] Yu. V. Nazarov and M. Kindermann, “Full counting statistics of a general quantum mechanical variable,” Eur. Phys. J. B , 413 (2003).
[47] A. Di Lorenzo, “Strong correspondence principle for joint measurement of conjugate observables,” Phys. Rev. A , 042104 (2011).
[48] T. J. Barnea, M.-O. Renou, F. Fröwis, and N. Gisin, “Macroscopic quantum measurements of noncommuting observables,” Phys. Rev. A , 012111 (2017).
[49] See supplemental information for a detailed derivation of the proposed inequality and detailed calculations for the examples in the main text.
[50] E. Arthurs and J. L. Kelly, “B.S.T.J. briefs: On the simultaneous measurement of a pair of conjugate observables,” Bell Syst. Tech. J. , 725 (1965).
[51] U. Leonhardt and H. Paul, “Measuring the quantum state of light,” Prog. Quant. Electron. , 89 (1995).
[52] C. Eichler, D. Bozyigit, C. Lang, L. Steffen, J. Fink, and A. Wallraff, “Experimental state tomography of itinerant single microwave photons,” Phys. Rev. Lett. , 220503 (2011).
[53] A. Feuerverger and R. A. Mureika, “The empirical characteristic function and its applications,” Ann. Statist. , 88 (1977).
[54] S. Hacohen-Gourgy, L. S. Martin, E. Flurin, V. V. Ramasesh, K. B. Whaley, and I. Siddiqi, “Quantum dynamics of simultaneously measured non-commuting observables,” Nature , 491 (2016).
[55] M. Perarnau-Llobet and T. M. Nieuwenhuizen, “Simultaneous measurement of two noncommuting quantum variables: Solution of a dynamical model,” Phys. Rev. A , 052129 (2017).

Supplemental information: Certifying Non-Classical Behavior for Negative Keldysh Quasi-Probabilities

Here we provide supplementary calculations and expressions for the examples discussed in the main text. Equation and Figure numbers not preceded by an ‘S’ refer to the main text.

A. DETAILED DERIVATION OF THE INEQUALITY

Here we give a detailed derivation of the inequality $K_{\rm cl} \geq 0$ for the quantity

$$K_{\rm cl} = \frac{1}{(2\pi)^2} \int d\boldsymbol{\lambda}\, e^{i\boldsymbol{\lambda}\cdot\mathbf{A}}\, \tilde{P}_{\rm cl}(\boldsymbol{\lambda}|\boldsymbol{\chi}) \prod_{j=1,2} \frac{\tilde{P}_{\rm cl}(\lambda_j|\chi'_j)}{\tilde{P}_{\rm cl}(\lambda_j|\chi_j)}, \tag{S1}$$

where

$$\tilde{P}_{\rm cl}(\boldsymbol{\lambda}|\boldsymbol{\chi}) = \int d\mathbf{A}\, e^{-i\boldsymbol{\lambda}\cdot\mathbf{A}}\, P_{\rm cl}(\mathbf{A}|\boldsymbol{\chi}), \qquad \tilde{P}_{\rm cl}(\lambda_j|\chi_j) = \int dA_j\, e^{-i\lambda_j A_j}\, P_{\rm cl}(A_j|\chi_j). \tag{S2}$$

The distribution describing the joint measurement of $A_1$ and $A_2$ is given by

$$P_{\rm cl}(\mathbf{A}|\boldsymbol{\chi}) = \int d\mathbf{A}'\, d\boldsymbol{\gamma} \left[\prod_j M_j(A_j, A'_j, \gamma_j|\chi_j)\right] S(\mathbf{A}'|\boldsymbol{\gamma}). \tag{S3}$$

Here we have already made the assumption of uncorrelated detectors. While this assumption is necessary only later in the derivation, the whole derivation becomes considerably more transparent by making the assumption at this early stage. The quantity $P_{\rm cl}(A_j|\chi_j)$ describes the measurement of a single observable and is not in general given by the marginal of Eq. (S3). To recover $P_{\rm cl}(A_j|\chi_j)$ from $P_{\rm cl}(\mathbf{A}|\boldsymbol{\chi})$, we require assumption five (detectors can be detached)

$$\lim_{\chi_j \to 0} M_j(A_j, A'_j, \gamma_j|\chi_j) = \delta(\gamma_j)\, U(A_j). \tag{S4}$$

With this equation, we find

$$P_{\rm cl}(A_j|\chi_j) = \lim_{\chi_k \to 0} \int dA_k\, P_{\rm cl}(\mathbf{A}|\boldsymbol{\chi}) = \int dA'_j\, d\gamma_j\, M_j(A_j, A'_j, \gamma_j|\chi_j)\, S(A'_j|\gamma_j), \tag{S5}$$

where $k \neq j$ and, as in the main text, we defined $\int dA_k\, S(\mathbf{A}|\gamma_j, \gamma_k = 0) \equiv S(A_j|\gamma_j)$. Using the translational invariance of the detectors, $M_j(A_j, A'_j, \gamma_j|\chi_j) = M_j(A_j - A'_j, \gamma_j|\chi_j)$, i.e., assumption 4, the Fourier transforms of the probability distributions reduce to products due to the convolution theorem

$$\tilde{P}_{\rm cl}(\boldsymbol{\lambda}|\boldsymbol{\chi}) = \int d\boldsymbol{\gamma} \left[\prod_j \tilde{M}_j(\lambda_j, \gamma_j|\chi_j)\right] \tilde{S}(\boldsymbol{\lambda}|\boldsymbol{\gamma}), \qquad \tilde{P}_{\rm cl}(\lambda_j|\chi_j) = \int d\gamma_j\, \tilde{M}_j(\lambda_j, \gamma_j|\chi_j)\, \tilde{S}(\lambda_j|\gamma_j), \tag{S6}$$

where the Fourier transforms of $M$ and $S$ are given analogously to Eq. (S2).
Using Assumption 3 [a measurement of a single observable is not affected by backaction, i.e., $\tilde{S}(\lambda_j|\gamma_j)$ is independent of $\gamma_j$] we can write
$$\tilde{P}_{\mathrm{cl}}(\lambda_j|\chi_j) = \int d\gamma_j\, \tilde{M}_j(\lambda_j, \gamma_j|\chi_j)\, \tilde{S}(\lambda_j|\gamma_j) = \tilde{D}_j(\lambda_j|\chi_j)\, \tilde{S}(\lambda_j), \tag{S7}$$
where, at this point, $D_j$ is obtained by integrating $M_j$ over $\gamma_j$. We can then write
$$K_{\mathrm{cl}} = \frac{1}{(2\pi)^2}\int d\boldsymbol{\lambda}\, e^{i\boldsymbol{\lambda}\cdot\boldsymbol{A}} \int d\boldsymbol{\gamma} \left[\prod_j \tilde{M}_j(\lambda_j, \gamma_j|\chi_j)\, \frac{\tilde{D}_j(\lambda_j|\chi_j')}{\tilde{D}_j(\lambda_j|\chi_j)}\right] \tilde{S}(\boldsymbol{\lambda}|\boldsymbol{\gamma}). \tag{S8}$$
Using Assumption 2 (uncorrelated imprecision and backaction)
$$\tilde{M}_j(\lambda_j, \gamma_j|\chi_j) = p_j(\gamma_j|\chi_j)\, \tilde{D}_j(\lambda_j|\chi_j), \tag{S9}$$
Eq. (S8) reduces to (this is where we also need Assumption 1, uncorrelated detectors)
$$\begin{aligned} K_{\mathrm{cl}} &= \frac{1}{(2\pi)^2}\int d\boldsymbol{\lambda}\, e^{i\boldsymbol{\lambda}\cdot\boldsymbol{A}} \int d\boldsymbol{\gamma} \left[\prod_j p_j(\gamma_j|\chi_j)\, \tilde{D}_j(\lambda_j|\chi_j')\right] \tilde{S}(\boldsymbol{\lambda}|\boldsymbol{\gamma}) \\ &= \int d\boldsymbol{A}'\, d\boldsymbol{\gamma}\, S(\boldsymbol{A}'|\boldsymbol{\gamma}) \prod_{j=1,2} p_j(\gamma_j|\chi_j)\, D_j(A_j - A_j'|\chi_j') \geq 0. \end{aligned} \tag{S10}$$
The second line corresponds to Eq. (11) in the main text, and we used the convolution theorem once more to arrive there. Since all involved distributions are assumed to be positive, the inequality follows directly.

Note that within our assumptions, we can write
$$P_{\mathrm{cl}}(\boldsymbol{A}|\boldsymbol{\chi}) = \int d\boldsymbol{A}'\, d\boldsymbol{\gamma}\, S(\boldsymbol{A}'|\boldsymbol{\gamma}) \prod_{j=1,2} p_j(\gamma_j|\chi_j)\, D_j(A_j - A_j'|\chi_j), \tag{S11}$$
which looks very similar to Eq. (S10), with the sole exception that both the backaction ($p_j$) and the measurement imprecision ($D_j$) are determined by the same measurement strength ($\chi_j$). Intuitively, our assumptions allow for determining the measurement imprecision of a detector by measuring a single observable. In particular, the ratio
$$\frac{\tilde{P}_{\mathrm{cl}}(\lambda_j|\chi_j')}{\tilde{P}_{\mathrm{cl}}(\lambda_j|\chi_j)} = \frac{\tilde{D}_j(\lambda_j|\chi_j')}{\tilde{D}_j(\lambda_j|\chi_j)} \tag{S12}$$
is then just the ratio of the measurement imprecision terms.
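The role of the ratio in Eq. (S12) can be illustrated with a minimal single-observable sketch: multiplying the characteristic function measured at strength `chi` by the ratio of imprecision terms and transforming back yields the distribution that a detector of strength `chi_prime` would have produced, which is non-negative for a classical system. The two-valued system distribution (weights `q` and `1 - q` at $A' = \pm 1$), the Gaussian imprecision, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def d_tilde(lam, chi):
    """Characteristic function of the assumed Gaussian imprecision D(A|chi)."""
    return np.exp(-lam**2 / (4.0 * chi**2))

def blurred(A, chi, q):
    """q*D(A-1|chi) + (1-q)*D(A+1|chi) for the two-valued system distribution."""
    def gauss(a):
        return chi / np.sqrt(np.pi) * np.exp(-(chi * a) ** 2)
    return q * gauss(A - 1.0) + (1.0 - q) * gauss(A + 1.0)

q, chi, chi_prime = 0.3, 0.5, 2.0          # illustrative values only
lam = np.linspace(-20.0, 20.0, 4001)
dlam = lam[1] - lam[0]
A = np.linspace(-4.0, 4.0, 161)

# Characteristic function of a single-observable measurement at strength chi.
p_tilde = d_tilde(lam, chi) * (q * np.exp(-1j * lam) + (1.0 - q) * np.exp(1j * lam))

# Ratio of imprecision terms: swaps the imprecision at chi for the one at chi_prime.
ratio = d_tilde(lam, chi_prime) / d_tilde(lam, chi)

# Inverse Fourier transform, i.e. a single-observable analogue of K_cl.
K = (np.exp(1j * np.outer(A, lam)) @ (p_tilde * ratio)) * dlam / (2.0 * np.pi)

max_dev = np.max(np.abs(K.real - blurred(A, chi_prime, q)))
max_imag = np.max(np.abs(K.imag))
print(max_dev, max_imag, K.real.min())
```

Note that for `chi_prime > chi` the ratio grows with `lam`, so the integration range is kept finite here; with noisy data this growth is what makes a cutoff necessary.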
The ratio in Eq. (S12) allows for exchanging the measurement imprecision at one value of $\chi$ with the measurement imprecision at another value $\chi'$, which is how $K_{\mathrm{cl}}$ is related to $P_{\mathrm{cl}}$.

In the quantum case, $K$ can be obtained analogously but starting with Eq. (3) in the main text instead of Eq. (S3). One then proceeds by making the same assumptions on the detectors, which corresponds to using Wigner functions that factor into a position and a momentum part.

B. SIMULTANEOUS POSITION AND MOMENTUM MEASUREMENTS

From Eqs. (3), (13), and (14), we find the probability distribution describing the outcomes of a joint position and momentum measurement
$$P(x, p|\chi) = \frac{4\chi^2}{\pi(2+\chi^2)^6}\, e^{-\frac{4\chi^2(x^2+p^2)}{(2+\chi^2)^2}} \left[32\chi^4(x^2+p^2) + (4-\chi^4)^2\right], \tag{S13}$$
where we assumed the measurement strengths to be equal, i.e., $\chi_x = \chi_p = \chi$. The corresponding characteristic function reads
$$\tilde{P}(\lambda_x, \lambda_p|\chi) = \int dx\, dp\, e^{-i\lambda_x x - i\lambda_p p} P(x, p|\chi) = \frac{1}{2}\, e^{-\frac{(2+\chi^2)^2}{16\chi^2}(\lambda_x^2+\lambda_p^2)} \left(2 - \lambda_x^2 - \lambda_p^2\right). \tag{S14}$$
Analogously, we find the probability distribution that describes a measurement of $\hat{x}$ alone
$$P(x|\chi) = \frac{\chi}{\sqrt{\pi(1+\chi^2)^5}} \left(1 + \chi^2 + 2x^2\chi^4\right) e^{-\frac{\chi^2 x^2}{1+\chi^2}}, \tag{S15}$$
with the characteristic function
$$\tilde{P}(\lambda_x|\chi) = \frac{1}{2}\, e^{-\frac{1+\chi^2}{4\chi^2}\lambda_x^2} \left(2 - \lambda_x^2\right). \tag{S16}$$
Due to the rotational invariance of the Wigner function of a single-photon Fock state [cf. Eq. (14)], the distributions for a measurement of momentum alone are equivalent. From the last equation, we find
$$\frac{\tilde{P}(\lambda_x|\chi)}{\tilde{P}(\lambda_x|\chi')} = e^{-\frac{\lambda_x^2}{4\chi^2}\left[1 - (\chi/\chi')^2\right]} = \frac{\tilde{D}(\lambda_x|\chi)}{\tilde{D}(\lambda_x|\chi')}, \tag{S17}$$
where for the last equality, we used the Wigner function of the detectors [cf. Eq. (13)], and the fact that they can be written as
$$W_j(\chi_j A_j, \gamma_j/\chi_j) = \frac{1}{\sqrt{\pi}\chi_j}\, e^{-\gamma_j^2/\chi_j^2}\, \frac{\chi_j}{\sqrt{\pi}}\, e^{-\chi_j^2 A_j^2} = p(\gamma_j|\chi_j)\, D(A_j|\chi_j), \tag{S18}$$
where, as discussed in the main text, $p(\gamma_j|\chi_j)$ encodes the backaction (arising from the momentum distribution of the detector) and $D(A_j|\chi_j)$ encodes the imprecision (arising from the position distribution). Equation (S17) thus shows that the measurement imprecision of a detector can be isolated by measuring a single observable with different measurement strengths. Importantly, this works because backaction is irrelevant when measuring a single observable (we are not interested in the post-measurement state).
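The single-quadrature expressions above can be cross-checked numerically. The sketch below convolves the position marginal of a single-photon Wigner function with the Gaussian imprecision of Eq. (S18) and compares the result with the closed form of Eq. (S15); it then compares the ratio of characteristic functions from Eq. (S16) with the Gaussian ratio in Eq. (S17). It assumes the convention $W(x,p) = \pi^{-1}(2x^2 + 2p^2 - 1)e^{-x^2-p^2}$ for the Fock state, which is not restated in this supplement, and the strengths `chi` and `chi_prime` are arbitrary illustrative values.

```python
import numpy as np

def fock1_marginal(x):
    """Position marginal of the assumed single-photon Wigner function."""
    return 2.0 * x**2 * np.exp(-x**2) / np.sqrt(np.pi)

def imprecision(x, chi):
    """Gaussian imprecision D(x|chi) as in Eq. (S18)."""
    return chi / np.sqrt(np.pi) * np.exp(-(chi * x) ** 2)

def p_closed(x, chi):
    """Closed form of Eq. (S15)."""
    c = 1.0 + chi**2
    return chi / np.sqrt(np.pi * c**5) * (c + 2.0 * x**2 * chi**4) * np.exp(-(chi * x) ** 2 / c)

def p_tilde(lam, chi):
    """Characteristic function of Eq. (S16)."""
    return 0.5 * (2.0 - lam**2) * np.exp(-(1.0 + chi**2) * lam**2 / (4.0 * chi**2))

chi, chi_prime = 0.7, 1.5                  # illustrative strengths
xp = np.linspace(-10.0, 10.0, 2001)        # integration grid for x'
dx = xp[1] - xp[0]
x = np.linspace(-3.0, 3.0, 61)

# P(x|chi) as the convolution of the marginal with the imprecision.
p_numeric = imprecision(x[:, None] - xp[None, :], chi) @ fock1_marginal(xp) * dx
conv_err = np.max(np.abs(p_numeric - p_closed(x, chi)))

# Ratio of characteristic functions versus the Gaussian ratio of Eq. (S17).
lam = np.array([0.3, 1.0, 2.5])            # avoid lam**2 == 2, where both vanish
lhs = p_tilde(lam, chi) / p_tilde(lam, chi_prime)
rhs = np.exp(-lam**2 / (4.0 * chi**2) * (1.0 - (chi / chi_prime) ** 2))
ratio_err = np.max(np.abs(lhs - rhs))
print(conv_err, ratio_err)
```

Both `conv_err` and `ratio_err` should be at the level of numerical round-off if the expressions are mutually consistent.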
We can now write
$$K(x, p) = \int dx'\, dp'\, d\gamma_x\, d\gamma_p\, \mathcal{P}(x', p'|\gamma_x, \gamma_p) \left[p(\gamma_x|\chi)\, D(x - x'|\chi')\right] \left[p(\gamma_p|\chi)\, D(p - p'|\chi')\right], \tag{S19}$$
where $\mathcal{P}(x, p|\gamma_x, \gamma_p) = W(x - \gamma_p/2,\, p + \gamma_x/2)$. Evaluating the integral results in the $K$ given in Eq. (15) in the main text.

C. SUBSEQUENT MEASUREMENTS ON A TWO-LEVEL SYSTEM

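In the second example, we consider a two-level system in a pure state
$$\hat{\rho} = |+\rangle\langle +|, \qquad \hat{\sigma}_x|+\rangle = |+\rangle, \tag{S20}$$
where $\hat{\sigma}_x$ denotes a Pauli matrix. We are then interested in a weak measurement of $\hat{\sigma}_1$, followed by a projective measurement of $\hat{\sigma}_2$. We denote the eigenvectors of those Pauli matrices by $\hat{\sigma}_j|\pm_j\rangle = \pm|\pm_j\rangle$. It is convenient to express all states in the basis that diagonalizes $\hat{\sigma}_1$,
$$|+\rangle = \alpha|+_1\rangle + \beta|-_1\rangle, \qquad |+_2\rangle = \gamma|+_1\rangle + \delta|-_1\rangle, \tag{S21}$$
where we consider $\alpha = \beta = \gamma = \delta = 1/\sqrt{2}$. The corresponding KQPD reads
$$\begin{aligned} \mathcal{P}_c(\Sigma_1, \Sigma_2) &= \frac{1}{(2\pi)^2}\int d\lambda_1\, d\lambda_2\, e^{i\lambda_1\Sigma_1 + i\lambda_2\Sigma_2}\, \mathrm{Tr}\left\{ e^{-i\frac{\lambda_2}{2}\hat{\sigma}_2}\, e^{-i\left(\frac{\lambda_1}{2}+\gamma_1\right)\hat{\sigma}_1} |+\rangle\langle +|\, e^{-i\left(\frac{\lambda_1}{2}-\gamma_1\right)\hat{\sigma}_1}\, e^{-i\frac{\lambda_2}{2}\hat{\sigma}_2} \right\} \\ &= \sum_{\sigma_1 = 0, \pm 1}\ \sum_{\sigma_2 = \pm 1} \mathcal{P}(\sigma_1, \sigma_2)\, \delta(\Sigma_1 - \sigma_1)\, \delta(\Sigma_2 - \sigma_2), \end{aligned} \tag{S22}$$
with the discrete distribution
$$\begin{aligned} \mathcal{P}(+1, +1) &= |\alpha|^2|\gamma|^2, & \mathcal{P}(-1, +1) &= |\beta|^2|\delta|^2, & \mathcal{P}(0, +1) &= 2\,\mathrm{Re}\left\{e^{-2i\gamma_1}\alpha\beta^*\gamma^*\delta\right\}, \\ \mathcal{P}(+1, -1) &= |\alpha|^2|\delta|^2, & \mathcal{P}(-1, -1) &= |\beta|^2|\gamma|^2, & \mathcal{P}(0, -1) &= -2\,\mathrm{Re}\left\{e^{-2i\gamma_1}\alpha\beta^*\gamma^*\delta\right\}. \end{aligned} \tag{S23}$$
Here $\gamma_1$ denotes the backaction variable, not to be confused with the coefficient $\gamma$ in Eq. (S21). Measuring $\hat{\sigma}_1$ with strength $\chi$, followed by measuring $\hat{\sigma}_2$ projectively (i.e., $\chi_2 \to \infty$), results in the distribution
$$P(\sigma_1, \sigma_2|\chi) = \frac{\chi}{\sqrt{\pi}} \sum_{\sigma_1' = 0, \pm 1} e^{-\chi^2(\sigma_1 - \sigma_1')^2}\, e^{-\chi^2\delta_{\sigma_1', 0}}\, \mathcal{P}(\sigma_1', \sigma_2)\big|_{\gamma_1 = 0}, \tag{S24}$$
where $\sigma_1$ is a continuous variable while $\sigma_2 = \pm 1$ is discrete, so that the Fourier transform is taken with respect to $\sigma_1$ only. We thus introduce
$$\tilde{P}(\lambda_1, \sigma_2|\chi) = \int d\sigma_1\, e^{-i\lambda_1\sigma_1} P(\sigma_1, \sigma_2|\chi) = e^{-\frac{\lambda_1^2}{4\chi^2}} \sum_{\sigma_1' = 0, \pm 1} e^{-i\lambda_1\sigma_1'}\, e^{-\chi^2\delta_{\sigma_1', 0}}\, \mathcal{P}(\sigma_1', \sigma_2)\big|_{\gamma_1 = 0}. \tag{S25}$$
A measurement of $\hat{\sigma}_1$ alone is described by
$$P(\sigma_1|\chi) = \frac{\chi}{\sqrt{\pi}} \left[ |\alpha|^2 e^{-\chi^2(\sigma_1 - 1)^2} + |\beta|^2 e^{-\chi^2(\sigma_1 + 1)^2} \right], \tag{S26}$$
with the characteristic function
$$\tilde{P}(\lambda_1|\chi) = e^{-\frac{\lambda_1^2}{4\chi^2}} \left[ \cos(\lambda_1) + \left(|\beta|^2 - |\alpha|^2\right) i\sin(\lambda_1) \right]. \tag{S27}$$
This equation again fulfills Eq. (S17), ensuring that measurement imprecision can be isolated. We can then write
$$K(\sigma_1, \sigma_2|\chi) = \frac{1}{2\pi}\int d\lambda_1\, e^{i\lambda_1\sigma_1}\, \tilde{P}(\lambda_1, \sigma_2|\chi)\, \frac{\tilde{P}(\lambda_1|\chi')}{\tilde{P}(\lambda_1|\chi)} = \frac{\chi'}{\sqrt{\pi}} \sum_{\sigma_1' = 0, \pm 1} e^{-(\chi')^2(\sigma_1' - \sigma_1)^2}\, e^{-\chi^2\delta_{\sigma_1', 0}}\, \mathcal{P}(\sigma_1', \sigma_2)\big|_{\gamma_1 = 0}. \tag{S28}$$
We note that in the limit $\chi' \to \infty$, this distribution contains well-separated peaks with weights that are given by $e^{-\chi^2\delta_{\sigma_1', 0}}\, \mathcal{P}(\sigma_1', \sigma_2)|_{\gamma_1 = 0}$.

To estimate $K$ from experimental data, we consider $N$ joint measurements, which result in outcomes $\sigma_1^j$ and $\sigma_2^j = \pm 1$. We then introduce the estimate of Eq. (S25) as
$$Y_{\lambda_1, \sigma_2} = \frac{1}{N}\sum_{j=1}^{N} \delta_{\sigma_2^j, \sigma_2}\, e^{-i\lambda_1\sigma_1^j}. \tag{S29}$$
We can then write
$$K_{\mathrm{est}} = \int_{-\lambda_c}^{\lambda_c} \frac{d\lambda_1}{2\pi}\, e^{i\lambda_1\sigma_1} \times \begin{cases} Y_{\lambda_1, \sigma_2}\, \dfrac{Y'_{\lambda_1}}{Y_{\lambda_1}} & \text{for } |Y_{\lambda_1}| > c_o, \\ 0 & \text{otherwise}, \end{cases} \tag{S30}$$
where $Y_{\lambda_1}$ and $Y'_{\lambda_1}$ denote the estimates of $\tilde{P}(\lambda_1|\chi)$ and $\tilde{P}(\lambda_1|\chi')$, respectively, following Eq. (16). We note that it is beneficial to use a $\chi''$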