Uncertainty relations: An operational approach to the error-disturbance tradeoff

Abstract

The notions of error and disturbance appearing in quantum uncertainty relations are often quantified by the discrepancy of a physical quantity from its ideal value. However, these real and ideal values are not the outcomes of simultaneous measurements, and comparing the values of unmeasured observables is not necessarily meaningful according to quantum theory. To overcome these conceptual difficulties, we take a different approach and define error and disturbance in an operational manner. In particular, we formulate both in terms of the probability that one can successfully distinguish the actual measurement device from the relevant hypothetical ideal by any experimental test whatsoever. This definition itself does not rely on the formalism of quantum theory, avoiding many of the conceptual difficulties of usual definitions. We then derive new Heisenberg-type uncertainty relations for both joint measurability and the error-disturbance tradeoff for arbitrary observables of finite-dimensional systems, as well as for the case of position and momentum. Our relations may be directly applied in information processing settings, for example to infer that devices which can faithfully transmit information regarding one observable do not leak any information about conjugate observables to the environment. We also show that Englert's wave-particle duality relation [PRL 77, 2154 (1996)] can be viewed as an error-disturbance uncertainty relation.


Joseph M. Renes, Volkher B. Scholz, and Stefan Huber

Institute for Theoretical Physics, ETH Zürich, Switzerland; Department of Physics, Ghent University, Belgium; Department of Mathematics, Technische Universität München, Germany

Accepted in Quantum, July 10, 2017

It is no overstatement to say that the uncertainty principle is a cornerstone of our understanding of quantum mechanics, clearly marking the departure of quantum physics from the world of classical physics. Heisenberg's original formulation in 1927 mentions two facets to the principle. The first restricts the joint measurability of observables, stating that noncommuting observables such as position and momentum can only be simultaneously determined with a characteristic amount of indeterminacy [1, p. 172] (see [2, p. 62] for an English translation). The second describes an error-disturbance tradeoff, noting that the more precise a measurement of one observable is made, the greater the disturbance to noncommuting observables [1, p. 175] ([2, p. 64]). The two are of course closely related, and Heisenberg argues for the former on the basis of the latter. Neither version can be taken merely as a limitation on measurement of otherwise well-defined values of position and momentum, but rather as questioning the sense in which values of two noncommuting observables can even be said to simultaneously exist. Unlike classical mechanics, in the framework of quantum mechanics we cannot necessarily regard unmeasured quantities as physically meaningful.

More formal statements were constructed only much later, due to the lack of a precise mathematical description of the measurement process in quantum mechanics. Here we must be careful to distinguish statements addressing Heisenberg's original notions of uncertainty from those, like the standard Kennard-Robertson uncertainty relation [3, 4], which address the impossibility of finding a quantum state with well-defined values for noncommuting observables. Entropic uncertainty relations [5, 6] are also an example of this class; see [ ] for a review. Joint measurability has a longer history, going back at least to the seminal work of Arthurs and Kelly [ ] and continuing in [ ]. Quantitative error-disturbance relations have only been formulated relatively recently, going back at least to Braginsky and Khalili [28, Chap. 5] and continuing in [20, 29–35].

Beyond technical difficulties in formulating uncertainty relations, there is a perhaps more difficult conceptual hurdle in that the intended consequences of the uncertainty principle seem to preclude their own straightforward formalization. To find a relation between, say, the error of a position measurement and its disturbance to momentum in a given experimental setup like the gamma ray microscope would seem to require comparing the actual values of position and momentum with their supposed ideal values. However, according to the uncertainty principle itself, we should be wary of simultaneously ascribing well-defined values to the actual and ideal position and momentum since they do not correspond to commuting observables. Thus, it is not immediately clear how to formulate either meaningful measures of error and disturbance, for instance as mean-square deviations between real and ideal values, or a meaningful relation between them. This question is the subject of much ongoing debate [25, 30, 36–39]. Uncertainty relations like the Kennard-Robertson bound or entropic relations do not face this issue as they do not attempt to compare actual and ideal values of the observables.

Without drawing any conclusions as to the ultimate success or failure of this program, in this paper we propose a completely different approach which we hope sheds new light on these conceptual difficulties. Here, we define error and disturbance in an operational manner and ask for uncertainty relations that are statements about the properties of measurement devices, not of fixed experimental setups or of physical quantities themselves. More specifically, we define error and disturbance in terms of the distinguishing probability, the probability that the actual behavior of the measurement apparatus can be distinguished from the relevant ideal behavior in any single experiment whatsoever. To characterize measurement error, for example, we imagine a black box containing either the actual device or the ideal device. By controlling the input and observing the output we can make an informed guess as to which is the case. We then attribute a large measurement error to the measurement apparatus if it is easy to tell the difference, so that there is a high probability of correctly guessing, and a low error if not; of course we pick the optimal input states and output measurements for this purpose. In this way we do not need to attribute a particular ideal value of the observable to be measured, we do not need to compare actual and ideal values themselves (nor do we necessarily even care what the possible values are), and instead we focus squarely on the properties of the device itself. Intuitively, we might expect that calibration provides the strictest test, i.e. inputting states with a known value of the observable in question. But in fact this is not the case, as entanglement at the input can increase the distinguishability of two measurements.
The merit of this approach is that the notion of distinguishability itself does not rely on any concepts or formalism of quantum theory, which helps avoid conceptual difficulties in formalizing the uncertainty principle.

Defining the disturbance an apparatus causes to an observable is more delicate, as an observable itself does not have a directly operational meaning (as opposed to the measurement of an observable). But we can consider the disturbance made either to an ideal measurement of the observable or to ideal preparation of states with well-defined values of the observable. In all cases, the error and disturbance measures we consider are directly linked to a well-studied norm on quantum channels known as the completely bounded norm or diamond norm. We can then ask for bounds on the error and disturbance quantities for two given observables that every measurement apparatus must satisfy. In particular, we are interested in bounds depending only on the chosen observables and not the particular device. Any such relation is a statement about measurement devices themselves and is not specific to the particular experimental setup in which they are used. Nor are such relations statements about the values or behavior of physical quantities themselves. In this sense, we seek statements of the uncertainty principle akin to Kelvin's form of the second law of thermodynamics as a constraint on thermal machines, and not like Clausius's or Planck's form involving the behavior of physical quantities (heat and entropy, respectively). By appealing to a fundamental constraint on quantum dynamics, the continuity (in the completely bounded norm) of the Stinespring dilation [40, 41], we find error-disturbance uncertainty relations for arbitrary observables in finite dimensions, as well as for position and momentum. Furthermore, we show how the relation for measurement error and measurement disturbance can be transformed into a joint-measurability uncertainty relation. Interestingly, we also find that Englert's wave-particle duality relation [ ] can be viewed as an error-disturbance relation.

The case of position and momentum illustrates the stark difference between the kind of uncertainty statements we can make in our approach and in one based on the notion of comparing real and ideal values. Take the notion of joint measurability, where we would like to formalize the notion that no device can accurately measure both position and momentum. In the latter approach one would first try to quantify the amount of position or momentum error made by a device as the discrepancy to the true value, and then show that they cannot both be small. The errors would be in units of position or momentum, respectively, and the hoped-for uncertainty relation would pertain to these values. Here, in contrast, we focus on the performance of the actual device relative to fixed ideal devices, in this case idealized separate measurements of position or momentum. Importantly, we need not think of the ideal measurement as having infinite precision. Instead, we can pick any desired precision and ask if the behavior of the actual device is essentially the same as this precision-limited ideal. Now the position and momentum errors do not have units of these quantities (they are unitless and always lie between zero and one), but instead depend on the desired precision. Our uncertainty relation then implies that both errors cannot be small if we demand high precision in both position and momentum.
In particular, when the product of the scales of the two precisions is small compared to Planck's constant, then the errors will be bounded away from zero (see Theorem 3 for a precise statement). It is certainly easier to have a small error in this sense when the demanded precision is low, and this accords nicely with the fact that sufficiently-inaccurate joint measurement is possible. Indeed, we find no bound on the errors for low precision.

An advantage and indeed a separate motivation of an operational approach is that bounds involving operational quantities are often useful in analyzing information processing protocols. For example, entropic uncertainty relations, which like the Robertson relation characterize quantum states, have proven very useful in establishing simple proofs of the security of quantum key distribution [6, 7, 43–45]. Here we show that the error-disturbance relation implies that quantum channels which can faithfully transmit information regarding one observable do not leak any information whatsoever about conjugate observables to the environment. This statement cannot be derived from entropic relations, as it holds for all channel inputs. It can be used to construct leakage-resilient classical computers from fault-tolerant quantum computers [ ], for instance.

The remainder of the paper is structured as follows. In the next section we give the mathematical background necessary to state our results, and describe how the general notion of distinguishability is related to the completely bounded norm (cb norm) in this setting. In Section 3 we define our error and disturbance measures precisely. Section 4 presents the error-disturbance tradeoff relations for finite dimensions, and details how joint measurability relations can be obtained from them. Section 5 considers the error-disturbance tradeoff relations for position and momentum. Two applications of the tradeoffs are given in Section 6: a formal statement of the information disturbance tradeoff for information about noncommuting observables and the connection between error-disturbance tradeoffs and Englert's wave-particle duality relations. In Section 7 we compare our results to previous approaches in more detail, and finally we finish with open questions in Section 8.

The notion of the distinguishing probability is independent of the mathematical framework needed to describe quantum systems, so we give it first. Consider an apparatus $\mathcal{E}$ which in some way transforms an input $A$ into an output $B$. To describe how different $\mathcal{E}$ is from another such apparatus $\mathcal{E}'$, we can imagine the following scenario. Suppose that we randomly place either $\mathcal{E}$ or $\mathcal{E}'$ into a black box such that we no longer have any access to the inner workings of the device, only its inputs and outputs.
Now our task is to guess which device is actually in the box by performing a single experiment, feeding in any desired input and observing the output in any manner of our choosing. In particular, the inputs and measurements can and should depend on $\mathcal{E}$ and $\mathcal{E}'$. The probability of making a correct guess, call it $p_{\rm dist}(\mathcal{E},\mathcal{E}')$, ranges from $\tfrac{1}{2}$ to 1, since we can always just make a random guess without doing any experiment on the box at all. Therefore it is more convenient to work with the distinguishability measure

$$\delta(\mathcal{E},\mathcal{E}') := 2\,p_{\rm dist}(\mathcal{E},\mathcal{E}') - 1, \tag{1}$$

which ranges from zero to one. The distinguishability cannot increase when the same channel $\mathcal{F}$ is applied to both $\mathcal{E}$ and $\mathcal{E}'$, since this just restricts the possible tests. That is, both $\delta(\mathcal{E}\mathcal{F},\mathcal{E}'\mathcal{F}) \leq \delta(\mathcal{E},\mathcal{E}')$ and $\delta(\mathcal{F}\mathcal{E},\mathcal{F}\mathcal{E}') \leq \delta(\mathcal{E},\mathcal{E}')$ hold for all channels $\mathcal{F}$ whose inputs and outputs are such that the channel concatenation is sensible. Here and in the remainder of the paper, we denote concatenation of channels by juxtaposition, while juxtaposition of operators denotes multiplication as usual.

In the finite-dimensional case we will be interested in two arbitrary nondegenerate observables denoted $X$ and $Z$. Only the eigenvectors of the observables will be relevant, call them $|\varphi_x\rangle$ and $|\theta_z\rangle$, respectively. In infinite dimensions we will confine our analysis to position $Q$ and momentum $P$, taking $\hbar = 1$. The analog of $Q$ and $P$ in finite dimensions are canonically conjugate observables $X$ and $Z$ for which $|\varphi_x\rangle = \frac{1}{\sqrt{d}} \sum_z \omega^{xz} |\theta_z\rangle$, where $d$ is the dimension and $\omega$ is a primitive $d$th root of unity.

It will be more convenient for our purposes to adopt the algebraic framework and use the Heisenberg picture, though we shall occasionally employ the Schrödinger picture. In the Heisenberg picture we describe systems chiefly by the algebra of observables on them and describe transformations of systems by quantum channels, completely positive and unital maps from the algebra of observables of the output to the observables of the input [10, 47–50]. This allows us to treat classical and quantum systems on an equal footing within the same framework. When the input or output system is quantum mechanical, the observables are the bounded operators $\mathcal{B}(\mathcal{H})$ from the Hilbert space $\mathcal{H}$ associated with the system to itself. Classical systems, such as the results of measurement or inputs to a state preparation device, take values in a set, call it $Y$. The relevant algebra of observables here is $L^\infty(Y)$, the (bounded, measurable) functions on $Y$. Hybrid systems are described by tensor products, so an apparatus $\mathcal{E}$ which measures a quantum system has an output algebra described by $L^\infty(Y) \otimes \mathcal{B}(\mathcal{H})$. To describe just the measurement result, we keep only $L^\infty(Y)$. We shall occasionally denote the input and output spaces explicitly as $\mathcal{E}_{A \to YB}$ when useful.

For arbitrary input and output algebras $\mathcal{A}_A$ and $\mathcal{A}_B$, quantum channels are precisely those maps $\mathcal{E}$ which are unital, $\mathcal{E}(\mathbb{1}_B) = \mathbb{1}_A$, and completely positive, meaning that not only does $\mathcal{E}$ map positive elements of $\mathcal{A}_B$ to positive elements of $\mathcal{A}_A$, it also maps positive elements of $\mathcal{A}_B \otimes \mathcal{B}(\mathbb{C}^n)$ to positive elements of $\mathcal{A}_A \otimes \mathcal{B}(\mathbb{C}^n)$ for all integer $n$. This requirement is necessary to ensure that channels act properly on entangled systems.

Figure 1: A general quantum apparatus $\mathcal{E}$. The apparatus measures a quantum system $A$ giving the output $Y$. In so doing, $\mathcal{E}$ also transforms the input $A$ into the output system $B$. Here the wavy lines denote quantum systems, the dashed lines classical systems. Formally, the apparatus is described by a quantum instrument.

A general measurement apparatus has both classical and quantum outputs, corresponding to the measurement result and the post-measurement quantum system. Channels describing such devices are called quantum instruments; we will call the channel describing just the measurement outcome a measurement.
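As a concrete finite-dimensional instance of the objects introduced so far, the following minimal sketch builds a pair of canonically conjugate bases and a Heisenberg-picture measurement channel, and checks that the channel is unital. It assumes that $|\theta_z\rangle$ is the computational basis and $\omega = e^{2\pi i/d}$; the function names `conjugate_bases`, `inner`, and `measurement_channel` are illustrative and do not come from the paper.

```python
import cmath
import math


def conjugate_bases(d):
    """Return (phi, theta) with |phi_x> = d^(-1/2) sum_z omega^(xz) |theta_z>."""
    omega = cmath.exp(2j * math.pi / d)  # assumed choice of primitive d-th root of unity
    theta = [[1.0 if i == z else 0.0 for i in range(d)] for z in range(d)]
    phi = [[omega ** (x * z) / math.sqrt(d) for z in range(d)] for x in range(d)]
    return phi, theta


def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def measurement_channel(basis, f):
    """Heisenberg-picture measurement: f -> sum_x f(x) |phi_x><phi_x|."""
    d = len(basis)
    return [
        [sum(f[x] * basis[x][i] * basis[x][j].conjugate() for x in range(d))
         for j in range(d)]
        for i in range(d)
    ]


d = 3
phi, theta = conjugate_bases(d)
# For conjugate bases every overlap |<phi_x|theta_z>|^2 should equal 1/d.
overlaps = [[abs(inner(phi[x], theta[z])) ** 2 for z in range(d)] for x in range(d)]
# Unitality: the constant function 1 should be mapped to the identity matrix.
unit_image = measurement_channel(phi, [1.0] * d)
```

Here `overlaps` should be flat at $1/d$ and `unit_image` should be the identity up to rounding, which is the unitality condition for this particular measurement.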
In finite dimensions any measurement can be seen as part of a quantum instrument, but not so for idealized position or momentum measurements, as shown in Theorem 3.3 of [ ] (see page 57). Technically, we may anticipate the result since the post-measurement state of such a device would presumably be a delta function located at the value of the measurement, which is not an element of $L^2(Q)$. This need not bother us, though, since it is not operationally meaningful to consider a position measurement instrument of infinite precision. And indeed there is no mathematical obstacle to describing finite-precision position measurement by quantum instruments, as shown in Theorem 6.1 (page 67 of [ ]). For any bounded function $\alpha \in L^2(Q)$ we can define the instrument $\mathcal{E}_\alpha : L^\infty(Q) \otimes \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ by

$$\mathcal{E}_\alpha(f \otimes a) = \int dq\, f(q)\, A^*_{q;\alpha}\, a\, A_{q;\alpha}, \tag{2}$$

where $A_{q;\alpha}\psi(q') = \alpha(q - q')\psi(q')$ for all $\psi \in L^2(Q)$. The classical output of the instrument is essentially the ideal value convolved with the function $\alpha$. Thus, setting the width of $\alpha$ sets the precision limit of the instrument.

The distinguishability measure is actually a norm on quantum channels, equal (apart from a factor of one half) to the so-called norm of complete boundedness, the cb norm [ ]. The cb norm is defined as an extension of the operator norm, similar to the extension of positivity above, as

$$\|T\|_{\rm cb} := \sup_{n \in \mathbb{N}} \|\mathbb{1}_n \otimes T\|_\infty, \tag{3}$$

where $\|T\|_\infty$ is the operator norm. Then

$$\delta(\mathcal{E}_1, \mathcal{E}_2) = \tfrac{1}{2}\|\mathcal{E}_1 - \mathcal{E}_2\|_{\rm cb}. \tag{4}$$

In the Schrödinger picture we instead extend the trace norm $\|\cdot\|_1$, and the result is usually called the diamond norm [51, 53]. In either case, the extension serves to account for entangled inputs in the experiment to test whether $\mathcal{E}_1$ or $\mathcal{E}_2$ is the actual channel. In fact, entanglement is helpful even when the channels describe projective measurements, as shown by an example given in Appendix A. This expression for the cb or diamond norm is not closed-form, as it requires an optimization. However, in finite dimensions the cb norm can be cast as a convex optimization, specifically as a semidefinite program [54, 55], which makes numerical computation tractable. Further details are given in Appendix B.

According to the Stinespring representation theorem [52, 56], any channel $\mathcal{E}$ mapping an algebra $\mathcal{A}$ to $\mathcal{B}(\mathcal{H})$ can be expressed in terms of an isometry $V : \mathcal{H} \to \mathcal{K}$ to some Hilbert space $\mathcal{K}$ and a representation $\pi$ of $\mathcal{A}$ in $\mathcal{B}(\mathcal{K})$ such that, for all $a \in \mathcal{A}$,

$$\mathcal{E}(a) = V^* \pi(a) V. \tag{5}$$

The isometry in the Stinespring representation is usually called the dilation of the channel, and $\mathcal{K}$ the dilation space. In finite-dimensional settings, calling the input $A$ and the output $B$, one usually considers maps taking $\mathcal{A} = \mathcal{B}(\mathcal{H}_B)$ to $\mathcal{B}(\mathcal{H}_A)$. Then one can choose $\mathcal{K} = \mathcal{H}_B \otimes \mathcal{H}_E$, where $\mathcal{H}_E$ is a suitably large Hilbert space associated to the "environment" of the transformation ($\mathcal{H}_E$ can always be chosen to have dimension $\dim(\mathcal{H}_A)\dim(\mathcal{H}_B)$). The representation $\pi$ is just $\pi(a) = a \otimes \mathbb{1}_E$. Using the isometry $V$, we can also construct a channel from $\mathcal{B}(\mathcal{H}_E)$ to $\mathcal{B}(\mathcal{H}_A)$ in the same manner; this is known as the complement $\mathcal{E}^\sharp$ of $\mathcal{E}$.

The advantage of the general form of the Stinespring representation is that we can easily describe measurements, possibly continuous-valued, as well. For the case of finite outcomes, consider the ideal projective measurement $\mathcal{Q}_X$ of the observable $X$. Choosing a basis $\{|b_x\rangle\}$ of $L^2(X)$ and defining $\pi(\delta_x) = |b_x\rangle\langle b_x|$ for $\delta_x$ the function taking the value 1 at $x$ and zero elsewhere, the canonical dilation isometry $W_X : \mathcal{H} \to L^2(X) \otimes \mathcal{H}$ is given by

$$W_X = \sum_x |b_x\rangle \otimes |\varphi_x\rangle\langle\varphi_x|. \tag{6}$$

Note that this isometry defines a quantum instrument, since it can describe both the measurement outcome and the post-measurement quantum system. If we want to describe just the measurement result, we could simply use $W_X = \sum_x |b_x\rangle\langle\varphi_x|$ with the same $\pi$.
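The dilation in (6) can be checked numerically in a small case. The sketch below is only an illustration under stated assumptions: it takes $d = 2$, uses the Fourier basis as a stand-in for the eigenbasis $|\varphi_x\rangle$, and stores $W_X$ as a $d^2 \times d$ matrix whose block $x$ is the projector $|\varphi_x\rangle\langle\varphi_x|$. All function names are hypothetical helpers, not notation from the paper.

```python
import cmath
import math


def fourier_basis(d):
    """Assumed example eigenbasis: phi[x][z] = omega^(xz) / sqrt(d)."""
    omega = cmath.exp(2j * math.pi / d)
    return [[omega ** (x * z) / math.sqrt(d) for z in range(d)] for x in range(d)]


def projector(v):
    """Matrix of |v><v|."""
    return [[a * b.conjugate() for b in v] for a in v]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def dilation(basis):
    """W_X = sum_x |b_x> (x) |phi_x><phi_x| as a (d*d) x d matrix, block x first."""
    rows = []
    for v in basis:
        rows.extend(projector(v))
    return rows


def heisenberg_image(W, d, x, a):
    """W^* (|b_x><b_x| (x) a) W; only block x of W survives the projector on |b_x>."""
    block = W[x * d:(x + 1) * d]
    return matmul(dagger(block), matmul(a, block))


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))


d = 2
phi = fourier_basis(d)
W = dilation(phi)
identity = [[1.0 + 0j if i == j else 0j for j in range(d)] for i in range(d)]
gram = matmul(dagger(W), W)                   # isometry check: W^* W
image0 = heisenberg_image(W, d, 0, identity)  # image of delta_0 (x) 1 under (5)
```

With these definitions `gram` should equal the identity, confirming that $W_X$ is an isometry, and `image0` should equal $|\varphi_0\rangle\langle\varphi_0|$, which is the Heisenberg-picture image of $\delta_0$ under the ideal measurement.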
More generally, a POVM with elements $\Lambda_x$ has the isometry $W_X = \sum_x |b_x\rangle \otimes \sqrt{\Lambda_x}$.

For finite-precision measurements of position or momentum, the form of the quantum instrument in (2) immediately gives a Stinespring dilation $W_Q : \mathcal{H} \to \mathcal{K}$ with $\mathcal{K} = L^2(Q) \otimes \mathcal{H}$ whose action is defined by

$$(W_Q \psi)(q, q') = \alpha(q - q')\psi(q'), \tag{7}$$

and where $\pi$ is just pointwise multiplication on the $L^\infty(Q)$ factor, i.e. for $f \in L^\infty(Q)$ and $a \in \mathcal{B}(\mathcal{H})$, $[\pi(f \otimes a)(\xi \otimes \psi)](q, q') = f(q)\xi(q) \cdot (a\psi)(q')$ for all $\xi \in L^2(Q)$ and $\psi \in \mathcal{H}$.

A slight change to the isometry in (6) gives the dilation of the device which prepares the state $|\varphi_x\rangle$ for classical input $x$. Formally the device is described by the map $\mathcal{P} : \mathcal{B}(\mathcal{H}) \to L^\infty(X)$ for which $\mathcal{P}(\Lambda) = \sum_x |b_x\rangle\langle b_x|\, \langle\varphi_x|\Lambda|\varphi_x\rangle$. Now consider $W'_X : L^2(X) \to \mathcal{H} \otimes L^2(X)$ given by

$$W'_X = \sum_x |\varphi_x\rangle \otimes |b_x\rangle\langle b_x|. \tag{8}$$

Choosing $\pi(\Lambda) = \Lambda \otimes \mathbb{1}_X$, we have $\mathcal{P}(\Lambda) = W'^{*}_X \pi(\Lambda) W'_X$.

The Stinespring representation is not unique [ ]. Given two representations $(\pi_1, V_1, \mathcal{K}_1)$ and $(\pi_2, V_2, \mathcal{K}_2)$ of the same channel $\mathcal{E}$, there exists a partial isometry $U : \mathcal{K}_1 \to \mathcal{K}_2$ such that $U V_1 = V_2$, $U^* V_2 = V_1$, and $U \pi_1(a) = \pi_2(a) U$ for all $a \in \mathcal{A}$. For the representations $\pi$ as usually employed for the finite-dimensional case, this last condition implies that $U$ is a partial isometry from one environment to the other, for $U(a \otimes \mathbb{1}_E) = (a \otimes \mathbb{1}_{E'})U$ can only hold for all $a$ if $U$ acts trivially on $B$. For channels describing measurements, finite or continuous, the last condition implies that any such $U$ is a conditional partial isometry, dependent on the outcome of the measurement result. Thus, for any set of isometries $U_x : \mathcal{H}_S \to \mathcal{H}_R$, $\sum_x |b_x\rangle \otimes U_x |\varphi_x\rangle\langle\varphi_x| U_x^*$ is a valid dilation of $\mathcal{Q}_X$, just as is $W_X$ in (6).
Similarly, $(W'_Q \psi)(q, q') = \alpha(q - q')[U_q \psi](q')$ is a valid dilation of $\mathcal{E}_\alpha$ in (2).

The main technical ingredient required for our results is the continuity of the Stinespring representation in the cb norm [40, 41]. That is, channels which are nearly indistinguishable have Stinespring dilations which are close and vice versa. For completely positive and unital maps $\mathcal{E}_1$ and $\mathcal{E}_2$, [40, 41] show that

$$\tfrac{1}{2}\|\mathcal{E}_1 - \mathcal{E}_2\|_{\rm cb} \leq \inf_{\pi_i, V_i} \|V_1 - V_2\|_\infty \leq \sqrt{\|\mathcal{E}_1 - \mathcal{E}_2\|_{\rm cb}}, \tag{9}$$

where the infimum is taken over all Stinespring representations $(\pi_i, V_i, \mathcal{K}_i)$ of $\mathcal{E}_i$.

Using the Stinespring representation we can easily show that, in principle, any joint measurement can always be decomposed into sequential measurement.

Lemma 1. Suppose that $\mathcal{E} : L^\infty(X) \otimes L^\infty(Z) \to \mathcal{B}(\mathcal{H})$ is a channel describing a joint measurement. Then there exists an apparatus $\mathcal{A} : L^\infty(X) \otimes \mathcal{B}(\mathcal{H}') \to \mathcal{B}(\mathcal{H})$ and a conditional measurement $\mathcal{M} : L^\infty(X) \otimes L^\infty(Z) \to L^\infty(X) \otimes \mathcal{B}(\mathcal{H}')$ such that $\mathcal{E} = \mathcal{A}\mathcal{M}$.

Proof. Define $\mathcal{M}' : L^\infty(X) \to \mathcal{B}(\mathcal{H})$ to be just the $X$ output of $\mathcal{E}$, i.e. $\mathcal{M}'(f) = \mathcal{E}(f \otimes \mathbb{1})$. Now suppose that $V : \mathcal{H} \to L^2(X) \otimes L^2(Z) \otimes \mathcal{H}''$ is a Stinespring representation of $\mathcal{E}$ and $V_X : \mathcal{H} \to L^2(X) \otimes \mathcal{H}'$ is a representation of $\mathcal{M}'$, both with the standard representation $\pi$ of $L^\infty$ into $L^2$. By construction, $V$ is also a dilation of $\mathcal{M}'$, and therefore there exists a partial isometry $U_X$ such that $V = U_X V_X$. More specifically, conditional on the value $X = x$, each $U_x$ sends $\mathcal{H}'$ to $L^2(Z) \otimes \mathcal{H}''$. Thus, setting $\mathcal{A}(f \otimes a) = V_X^* (\pi(f) \otimes a) V_X$ and $\mathcal{M}_x(f) = U_x^* (\pi(f) \otimes \mathbb{1}) U_x$, we have $\mathcal{E} = \mathcal{A}\mathcal{M}$.

To characterize the error $\varepsilon_X$ an apparatus $\mathcal{E}$ makes relative to an ideal measurement $\mathcal{Q}_X$ of an observable $X$, we can simply use the distinguishability of the two channels, taking only the classical output of $\mathcal{E}$. Suppose that the apparatus is described by the channel $\mathcal{E} : \mathcal{B}(\mathcal{H}_B) \otimes L^\infty(X) \to \mathcal{B}(\mathcal{H}_A)$ and the ideal measurement by the channel $\mathcal{Q}_X : L^\infty(X) \to \mathcal{B}(\mathcal{H}_A)$. To ignore the output system $B$, we make use of the partial trace map $\mathcal{T}_B : L^\infty(X) \to \mathcal{B}(\mathcal{H}_B) \otimes L^\infty(X)$ given by $\mathcal{T}_B(f) = \mathbb{1}_B \otimes f$. Then a sensible notion of error is given by $\varepsilon_X(\mathcal{E}) = \delta(\mathcal{Q}_X, \mathcal{E}\mathcal{T}_B)$. If it is easy to tell the ideal measurement apart from the actual device, then the error is large; if it is difficult, then the error is small.

As a general definition, though, this quantity is deficient in two respects. First, we could imagine an apparatus which performs an ideal $\mathcal{Q}_X$ measurement, but simply mislabels the outputs. This leads to $\varepsilon_X(\mathcal{E}) = 1$, even though the ideal measurement is actually performed. Second, we might wish to consider the case that the classical output set of the apparatus is not equal to $X$ itself. For instance, perhaps $\mathcal{E}$ delivers much more output than is expected from $\mathcal{Q}_X$. In this case we also formally have $\varepsilon_X(\mathcal{E}) = 1$, since we can just examine the output to distinguish the two devices.

We can remedy both of these issues by describing the apparatus by the channel $\mathcal{E} : \mathcal{B}(\mathcal{H}_B) \otimes L^\infty(Y) \to \mathcal{B}(\mathcal{H}_A)$ and just including a further classical postprocessing operation $\mathcal{R} : L^\infty(X) \to L^\infty(Y)$ in the distinguishability step. Since we are free to choose the best such map, we define

$$\varepsilon_X(\mathcal{E}) := \inf_{\mathcal{R}} \delta(\mathcal{Q}_X, \mathcal{E}\mathcal{R}\mathcal{T}_B). \tag{10}$$

The setup of the definition is depicted in Figure 2.

Figure 2: Measurement error. The error made by the apparatus $\mathcal{E}$ in measuring $X$ is defined by how distinguishable the actual device is from the ideal measurement $\mathcal{Q}_X$ in any experiment whatsoever, after suitably processing the classical output $Y$ of $\mathcal{E}$ with the map $\mathcal{R}$. To enable a fair comparison, we ignore the quantum output of the apparatus, indicated in the diagram by graying out $B$. If the actual and ideal devices are difficult to tell apart, the error is small.

Defining the disturbance an apparatus $\mathcal{E}$ causes to an observable, say $Z$, is more delicate, as an observable itself does not have a directly operational meaning. But there are two straightforward ways to proceed: we can either associate the observable with measurement or with state preparation. In the former, we compare how well we can mimic the ideal measurement $\mathcal{Q}_Z$ of the observable after employing the apparatus $\mathcal{E}$, quantifying this using measurement error as before. Additionally, we should allow the use of recovery operations in which we attempt to "restore" the input state as well as possible, possibly conditional on the output of the measurement. Formally, let $\mathcal{Q}_Z : L^\infty(Z) \to \mathcal{B}(\mathcal{H}_A)$ be the ideal $Z$ measurement and $\mathcal{R}$ be a recovery map $\mathcal{R} : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B) \otimes L^\infty(X)$ which acts on the output of $\mathcal{E}$ conditional on the value of the classical output $X$ (which it then promptly forgets).
As depicted in Figure 3, the measurement disturbance is then the measurement error after using the best recovery map:

$$\nu_Z(\mathcal{E}) := \inf_{\mathcal{R}} \delta(\mathcal{Q}_Z, \mathcal{E}\mathcal{R}\mathcal{T}_Y\mathcal{Q}_Z). \tag{11}$$

Figure 3: Measurement disturbance. To define the disturbance imparted by an apparatus $\mathcal{E}$ to the measurement of an observable $Z$, consider performing the ideal $\mathcal{Q}_Z$ measurement on the output $B$ of $\mathcal{E}$. First, however, it may be advantageous to "correct" or "recover" the original input $A$ by some operation $\mathcal{R}$. In general, $\mathcal{R}$ may depend on the output $X$ of $\mathcal{E}$. The distinguishability between the resulting combined operation and just performing $\mathcal{Q}_Z$ on the original input defines the measurement disturbance.

For state preparation, consider a device with classical input and quantum output that prepares the eigenstates of $Z$. We can model this by a channel $\mathcal{P}_Z$, which in the Schrödinger picture produces $|\theta_z\rangle$ upon receiving the input $z$. Now we compare the action of $\mathcal{P}_Z$ to the action of $\mathcal{P}_Z$ followed by $\mathcal{E}$, again employing a recovery operation. Formally, let $\mathcal{P}_Z : \mathcal{B}(\mathcal{H}_A) \to L^\infty(Z)$ be the ideal $Z$ preparation device and consider recovery operations $\mathcal{R}$ of the form $\mathcal{R} : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B) \otimes L^\infty(X)$. Then the preparation disturbance is defined as

$$\eta_Z(\mathcal{E}) := \inf_{\mathcal{R}} \delta(\mathcal{P}_Z, \mathcal{P}_Z\mathcal{E}\mathcal{R}\mathcal{T}_Y). \tag{12}$$

Figure 4: Preparation disturbance. The ideal preparation device $\mathcal{P}_Z$ takes a classical input $Z$ and creates the corresponding $Z$ eigenstate. As with measurement disturbance, the preparation disturbance is related to the distinguishability of the ideal preparation device $\mathcal{P}_Z$ and $\mathcal{P}_Z$ followed by the apparatus $\mathcal{E}$ in question and the best possible recovery operation $\mathcal{R}$.

All of the measures defined so far are "figures of merit", in the sense that we compare the actual device to the ideal, perfect functionality. In the case of state preparation we can also define a disturbance measure as a "figure of demerit", by comparing the actual functionality not to the best-case behavior but to the worst.
Tothis end, consider a state preparation device C which just ignores the classical input and always prepares thesame ﬁxed output state. These are constant (output) channels, and clearly E disturbs the state preparation P Z considerably if P Z E has effectively a constant output. Based on this intuition, we can then make the followingformal deﬁnition: (cid:98) η Z ( E ) : = d − d − inf C :const. δ ( C , P Z E ) . (13)The disturbance is small according to this measure if it is easy to distinguish the action of P Z E from havinga constant output, and large otherwise. To see that (cid:98) η Z is positive, use the Schrödinger picture and let theoutput of C ∗ be the state σ for all inputs. Then note that inf C δ ( C , P Z E ) = min C max z δ ( σ , E ∗ ( θ z )) , where thelatter δ is the trace distance. Choosing σ = d (cid:80) z E ∗ ( θ z ) and using joint convexity of the trace distance, wehave inf C δ ( C , P Z E ) ≤ d − d .We remark that while this disturbance measure leads to ﬁnite bounds in the case of ﬁnite dimensions, itis less well behaved in the case of position and momentum measurements: Without any bound on the energyof the test states, two channels tend to be as distinguishable as possible, unless they are already constantchannels. To be more precise, any non-constant channel which only changes the energy by a ﬁxed amountcan be differentiated from a constant channel by inputing states of very high energy. Roughly speaking, evenan arbitrarily strongly disturbing operation can be used to gain some information about the input and hence aconstant channel is not a good “worst case” scenario. This is in sharp contrast to the ﬁnite-dimensional case,and supports the view that the disturbance measures ν Z ( E ) and η Z ( E ) are physically more sensible.7 P Z E B ≈ d − d − b η Z Z C B Y Y

Figure 5: Figure of “demerit” version of preparation disturbance. Another approach to defining preparation disturbance is to consider distinguishability to a non-ideal device instead of an ideal device. The apparatus $\mathcal{E}$ imparts a large disturbance to the preparation $\mathcal{P}_Z$ if the output of the combination $\mathcal{P}_Z\mathcal{E}$ is essentially independent of the input. Thus we consider the distinguishability of $\mathcal{P}_Z\mathcal{E}$ and a constant preparation $\mathcal{C}$ which outputs a fixed state regardless of the input $Z$.

For finite-dimensional systems, all the measures of error and disturbance can be expressed as semidefinite programs, as detailed in Appendix B. As an example, we compute these measures for the simple case of a non-ideal $X$ measurement on a qubit; we will meet this example later in assessing the tightness of the uncertainty relations and their connection to wave-particle duality relations in the Mach-Zehnder interferometer. Consider the ideal measurement isometry (6), and suppose that the basis states $|b_x\rangle$ are replaced by two pure states $|\gamma_x\rangle$ which have an overlap $\langle\gamma_0|\gamma_1\rangle = \sin\theta$. Without loss of generality, we can take $|\gamma_x\rangle = \cos\tfrac{\theta}{2}\,|b_x\rangle + \sin\tfrac{\theta}{2}\,|b_{x+1}\rangle$. The optimal measurement $\mathcal{Q}$ for distinguishing these two states is just projective measurement in the $|b_x\rangle$ basis, so let us consider the channel $\mathcal{E}_{\rm MZ} = \mathcal{W}\mathcal{Q}$. Then, as detailed in Appendix B, for $Z$ canonically conjugate to $X$ we find

\begin{align}
\varepsilon_X(\mathcal{E}_{\rm MZ}) &= \tfrac{1}{2}(1 - \cos\theta) \quad\text{and} \tag{14}\\
\nu_Z(\mathcal{E}_{\rm MZ}) = \eta_Z(\mathcal{E}_{\rm MZ}) = \widehat{\eta}_Z(\mathcal{E}_{\rm MZ}) &= \tfrac{1}{2}(1 - \sin\theta). \tag{15}
\end{align}

In all of the figures of merit, the optimal recovery map $\mathcal{R}$ is to do nothing, while in $\widehat{\eta}_Z$ the optimal channel $\mathcal{C}$ outputs the average of the two outputs of $\mathcal{P}_Z\mathcal{E}$.

Before turning to the uncertainty relations, we first present several measures of complementarity that will appear therein. Indeed, we can use the above notions of disturbance to define several measures of complementarity that will later appear in our uncertainty relations.
For instance, we can measure the complementarity of two observables just by using the measurement disturbance $\nu$. Specifically, treating $\mathcal{Q}_X$ as the actual measurement and $\mathcal{Q}_Z$ as the ideal measurement, we define $c_M(X,Z) := \nu_Z(\mathcal{Q}_X)$. This quantity is equivalent to $\varepsilon_Z(\mathcal{Q}_X)$ since any recovery map $\mathcal{R}_{X\to Z}$ in $\varepsilon_Z$ can be used to define $\mathcal{R}'_{X\to A}$ in $\nu_Z$ by $\mathcal{R}' = \mathcal{R}\mathcal{P}_Z$. Similarly, we could treat one observable as defining the ideal state preparation device and the other as the measurement apparatus, which leads to $c_P(X,Z) := \eta_Z(\mathcal{Q}_X)$. Here we could also use the “figure of demerit” and define $\widehat{c}_P(X,Z) := \widehat{\eta}_Z(\mathcal{Q}_X)$.

Though the three complementarity measures are conceptually straightforward, it is also desirable to have closed-form expressions, particularly for the bounds in the uncertainty relations. To this end, we derive lower bounds as follows. First, consider $c_M$ and choose as inputs $Z$ basis states. This gives, for random choice of input,

\begin{align}
c_M(X,Z) &\ge \inf_{\mathcal{R}} \delta(\mathcal{P}_Z\mathcal{Q}_Z, \mathcal{P}_Z\mathcal{Q}_X\mathcal{R}) \tag{16a}\\
&\ge 1 - \max_R \frac{1}{d}\sum_{xz} |\langle\varphi_x|\theta_z\rangle|^2 R_{zx} \tag{16b}\\
&\ge 1 - \max_R \frac{1}{d}\sum_x \max_z |\langle\varphi_x|\theta_z\rangle|^2 \sum_{z'} R_{z'x} \tag{16c}\\
&= 1 - \frac{1}{d}\sum_x \max_z |\langle\varphi_x|\theta_z\rangle|^2, \tag{16d}
\end{align}

where the maximization is over stochastic matrices $R$, and we use the fact that $\sum_z R_{zx} = 1$ for all $x$. For $c_P$ we can proceed similarly. Again replacing the recovery map $\mathcal{R}_{X\to A}$ followed by $\mathcal{Q}_Z$ with a classical postprocessing map $\mathcal{R}_{X\to Z}$, we have

\begin{align}
c_P(X,Z) &\ge \inf_{\mathcal{R}_{X\to A}} \delta(\mathcal{P}_Z\mathcal{Q}_Z, \mathcal{P}_Z\mathcal{Q}_X\mathcal{R}\mathcal{Q}_Z) \tag{17a}\\
&= \inf_{\mathcal{R}_{X\to Z}} \delta(\mathcal{P}_Z\mathcal{Q}_Z, \mathcal{P}_Z\mathcal{Q}_X\mathcal{R}) \tag{17b}\\
&\ge 1 - \frac{1}{d}\sum_x \max_z |\langle\varphi_x|\theta_z\rangle|^2. \tag{17c}
\end{align}

For $\widehat{c}_P(X,Z)$ we have

\begin{align}
\widehat{c}_P(X,Z) &= \frac{d-1}{d} - \inf_{\mathcal{C}:\,\text{const.}} \delta(\mathcal{C}, \mathcal{P}_Z\mathcal{Q}_X) \tag{18a}\\
&= \frac{d-1}{d} - \min_P \max_z \delta(P, \mathcal{Q}_X^*(\theta_z)) \tag{18b}\\
&\ge \frac{d-1}{d} - \frac{1}{2}\max_z \sum_x \left| \frac{1}{d} - |\langle\varphi_x|\theta_z\rangle|^2 \right|, \tag{18c}
\end{align}

where the bound comes from choosing $P$ to be the uniform distribution.
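The closed-form bounds (16d), (17c) and (18c) depend only on the matrix of squared overlaps, so they are easy to evaluate. The following is a minimal sketch under stated assumptions: bases are passed as unitary matrices whose columns are the eigenvectors $|\varphi_x\rangle$ and $|\theta_z\rangle$, and the function names (`overlap_matrix`, `bound_cm`, `bound_chat`, `fourier_basis`) are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def overlap_matrix(phi, theta):
    """p[x, z] = |<phi_x|theta_z>|^2, with basis vectors stored as columns."""
    return np.abs(phi.conj().T @ theta) ** 2

def bound_cm(phi, theta):
    """Right-hand side of (16d) and (17c): 1 - (1/d) sum_x max_z p[x, z]."""
    p = overlap_matrix(phi, theta)
    d = p.shape[0]
    return 1.0 - p.max(axis=1).sum() / d

def bound_chat(phi, theta):
    """Right-hand side of (18c): (d-1)/d - (1/2) max_z sum_x |1/d - p[x, z]|."""
    p = overlap_matrix(phi, theta)
    d = p.shape[0]
    return (d - 1) / d - 0.5 * np.abs(1.0 / d - p).sum(axis=0).max()

def fourier_basis(d):
    """Discrete Fourier basis, conjugate to the standard basis."""
    k = np.arange(d)
    return np.exp(2j * np.pi * np.outer(k, k) / d) / np.sqrt(d)

if __name__ == "__main__":
    d = 5
    standard = np.eye(d)
    print(bound_cm(standard, fourier_basis(d)))    # conjugate bases
    print(bound_chat(standard, fourier_basis(d)))
    print(bound_cm(standard, standard))            # identical bases
```

For conjugate bases all squared overlaps equal $1/d$ and both functions return $(d-1)/d$; for identical bases both return zero.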
We could also choose $P(x) = |\langle\varphi_x|\theta_{z'}\rangle|^2$ for some $z'$ to obtain the bound $\widehat{c}_P(X,Z) \ge \frac{d-1}{d} - \frac{1}{2}\min_{z'}\max_z \sum_x \left|\mathrm{Tr}[\varphi_x(\theta_z - \theta_{z'})]\right|$. However, from numerical investigation of random bases, it appears that this bound is rarely better than the previous one.

Let us comment on the properties of the complementarity measures and their bounds in (16d), (17c), and (18c). Both expressions in the bounds are, properly, functions only of the two orthonormal bases involved, depending only on the set of overlaps. In particular, both are invariant under relabelling the bases. Uncertainty relations formulated in terms of conditional entropy typically only involve the largest overlap or largest two overlaps [7, 57], but the bounds derived here are yet more sensitive to the structure of the overlaps. Interestingly, the quantity in (16d) appears in the information exclusion relation of [ ], where the sum of mutual informations different systems can have about the observables $X$ and $Z$ is bounded by $\log\big(d \sum_x \max_z |\langle\varphi_x|\theta_z\rangle|^2\big)$.

The complementarity measures themselves all take the same value in two extreme cases: zero in the trivial case of identical bases, $(d-1)/d$ in the case that the two bases are conjugate, meaning $|\langle\varphi_x|\theta_z\rangle|^2 = 1/d$ for all $x, z$. In between, however, the separation between the two can be quite large. Consider two observables that share two eigenvectors while the remainder are conjugate. The bounds (16d) and (17c) imply that $c_M$ and $c_P$ are both greater than $(d-3)/d$. The bound on $\widehat{c}_P$ from (18c) is zero, though a better choice of constant channel can easily be found in this case. In dimensions $d = 3k + 2$, fix the constant channel to output the distribution $P$ with probability $1/3$ for each of the two shared outcomes, $1/3k$ for any $k$ of the remainder, and zero otherwise. Then we have $\widehat{c}_P \ge \frac{d-1}{d} - \max_z \delta(P, \mathcal{Q}_X^*\mathcal{P}_Z^*(z))$. It is easy to show the optimal value is $2/3$, so that $\widehat{c}_P \ge (d-3)/3d$. Hence, in the limit of large $d$, the gap between the two measures can be at least $2/3$.

We finally have all the pieces necessary to formally state our uncertainty relations. The first relates measurement error and measurement disturbance, where we have

Theorem 1.

For any two observables $X$ and $Z$ and any quantum instrument $\mathcal{E}$,

\begin{align}
\sqrt{2\varepsilon_X(\mathcal{E})} + \nu_Z(\mathcal{E}) &\ge c_M(X,Z) \quad\text{and} \tag{19}\\
\varepsilon_X(\mathcal{E}) + \sqrt{2\nu_Z(\mathcal{E})} &\ge c_M(Z,X). \tag{20}
\end{align}

Due to Lemma 1, any joint measurement of two observables can be decomposed into a sequential measurement, which implies that these bounds hold for joint measurement devices as well. Indeed, we will make use of that lemma to derive (20) from (19) in the proof below. Of course we can replace the $c_M$ quantities with closed-form expressions using the bound in (16d). Figure 6 shows the bound for the case of conjugate observables of a qubit, for which $c_M(X,Z) = c_M(Z,X) = \tfrac{1}{2}$. It also shows the particular relation between error and measurement disturbance achieved by the apparatus $\mathcal{E}_{\rm MZ}$ mentioned at the end of §3, from which we can conclude that the bound is tight in the region of vanishing error or vanishing disturbance.

For measurement error and preparation disturbance we find the following relations.

Figure 6: Error versus disturbance bounds for conjugate qubit observables (horizontal axis: error; vertical axis: disturbance). Theorem 1 restricts the possible combinations of measurement error $\varepsilon_X$ and measurement disturbance $\nu_Z$ to the dark gray region bounded by the solid line. Theorem 2 additionally includes the light gray region. Also shown are the error and disturbance values achieved by $\mathcal{E}_{\rm MZ}$ from §3.

Theorem 2.

For any two observables $X$ and $Z$ and any quantum instrument $\mathcal{E}$,

\begin{align}
\sqrt{2\varepsilon_X(\mathcal{E})} + \eta_Z(\mathcal{E}) &\ge c_P(X,Z) \quad\text{and} \tag{21}\\
\sqrt{2\varepsilon_X(\mathcal{E})} + \widehat{\eta}_Z(\mathcal{E}) &\ge \widehat{c}_P(X,Z). \tag{22}
\end{align}

Returning to Figure 6 but replacing the vertical axis with $\eta_Z$ or $\widehat{\eta}_Z$, we now have only the upper branch of the bound, which continues to the horizontal axis as the dotted line. Here we can only conclude that the bounds are tight in the region of vanishing error.

The proofs of all three uncertainty relations are just judicious applications of the triangle inequality, and the particular bound comes from the setting in which $\mathcal{P}_Z$ meets $\mathcal{Q}_X$. We shall make use of the fact that an instrument which has a small error in measuring $\mathcal{Q}_X$ is close to one which actually employs the instrument associated with $\mathcal{Q}_X$. This is encapsulated in the following

Lemma 2.

For any apparatus $\mathcal{E}_{A\to YB}$ there exists a channel $\mathcal{F}_{XA\to YB}$ such that $\delta(\mathcal{E}, \mathcal{Q}'_X\mathcal{F}) \le \sqrt{2\varepsilon_X(\mathcal{E})}$, where $\mathcal{Q}'_X$ is a quantum instrument associated with the measurement $\mathcal{Q}_X$. Furthermore, if $\mathcal{Q}_X$ is a projective measurement, then there exists a state preparation $\mathcal{P}_{X\to YB}$ such that $\delta(\mathcal{E}, \mathcal{Q}_X\mathcal{P}) \le \sqrt{2\varepsilon_X(\mathcal{E})}$.

Proof. Let $V : \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_E \otimes L^2(X)$ and $W_X : \mathcal{H}_A \to L^2(X) \otimes \mathcal{H}_A$ be respective dilations of $\mathcal{E}$ and $\mathcal{Q}_X$. Using the dilation $W_X$ we can define the instrument $\mathcal{Q}'_X$ as

$$\mathcal{Q}'_X : L^\infty(X) \otimes \mathcal{B}(\mathcal{H}_B) \to \mathcal{B}(\mathcal{H}_A), \qquad g \otimes A \mapsto W_X^*(\pi(g) \otimes A)W_X. \tag{23}$$

Suppose $\mathcal{R}_{Y\to X}$ is the optimal map in the definition of $\varepsilon_X(\mathcal{E})$, and let $\mathcal{R}'_{Y\to XY}$ be the extension of $\mathcal{R}$ which keeps the input $Y$; it has a dilation $V' : L^2(Y) \to L^2(Y) \otimes L^2(X)$. By Stinespring continuity, in finite dimensions there exists a conditional isometry $U_X : L^2(X) \otimes \mathcal{H}_A \to L^2(X) \otimes L^2(Y) \otimes \mathcal{H}_B \otimes \mathcal{H}_E$ such that

$$\left\| V'V - U_X W_X \right\|_\infty \le \sqrt{2\varepsilon_X(\mathcal{E})}. \tag{24}$$

Now consider the map

$$\mathcal{E}' : L^\infty(Y) \otimes \mathcal{B}(\mathcal{H}_B) \to \mathcal{B}(\mathcal{H}_A), \qquad f \otimes A \mapsto W_X^* U_X^* (\mathbb{1}_X \otimes \pi(f) \otimes A \otimes \mathbb{1}_E) U_X W_X. \tag{25}$$

By the other bound in Stinespring continuity we thus have $\delta(\mathcal{E}, \mathcal{E}') \le \sqrt{2\varepsilon_X(\mathcal{E})}$. Furthermore, as described in §2.4, $U_X$ is a conditional isometry, i.e. a collection of isometries $U_x : \mathcal{H}_A \to L^2(Y) \otimes \mathcal{H}_B \otimes \mathcal{H}_E$ for each measurement outcome $x$. Note that we may regard elements of $L^\infty(X) \otimes \mathcal{B}(\mathcal{H})$ as sequences $(A_x)_{x\in X}$ with $A_x \in \mathcal{B}(\mathcal{H})$ for all $x \in X$ such that $\operatorname{ess\,sup}_x \|A_x\|_\infty < \infty$. Therefore we may define

$$\mathcal{F} : L^\infty(Y) \otimes \mathcal{B}(\mathcal{H}_B) \to L^\infty(X) \otimes \mathcal{B}(\mathcal{H}_A), \qquad f \otimes A \mapsto \left(U_x^*(\pi(f) \otimes A \otimes \mathbb{1}_E)U_x\right)_{x\in X}, \tag{26}$$

so that $\mathcal{E}' = \mathcal{Q}'_X\mathcal{F}$. This completes the proof of the first statement.

If $\mathcal{Q}_X$ is a projective measurement, then the output $B$ of $\mathcal{Q}'_X$ can just as well be prepared from the $X$ output.
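Describing this with the map $\mathcal{P}'_{X\to XA}$ which prepares states in $A$ given the value of $X$ and retains $X$ at the output, we have $\mathcal{Q}'_X = \mathcal{Q}_X\mathcal{P}'$. Setting $\mathcal{P} = \mathcal{P}'\mathcal{F}$ completes the proof of the second statement.

Now, to prove (19), start with the triangle inequality and monotonicity. Suppose $\mathcal{P}_{X\to YB}$ is the state preparation map from Lemma 2. Then, for any $\mathcal{R}_{YB\to A}$,

\begin{align}
\delta(\mathcal{Q}_Z, \mathcal{Q}_X\mathcal{P}\mathcal{R}\mathcal{Q}_Z) &\le \delta(\mathcal{Q}_Z, \mathcal{E}\mathcal{R}\mathcal{Q}_Z) + \delta(\mathcal{E}\mathcal{R}\mathcal{Q}_Z, \mathcal{Q}_X\mathcal{P}\mathcal{R}\mathcal{Q}_Z) \tag{27a}\\
&\le \delta(\mathcal{Q}_Z, \mathcal{E}\mathcal{R}\mathcal{Q}_Z) + \delta(\mathcal{E}, \mathcal{Q}_X\mathcal{P}) \tag{27b}\\
&\le \delta(\mathcal{Q}_Z, \mathcal{E}\mathcal{R}\mathcal{Q}_Z) + \sqrt{2\varepsilon_X(\mathcal{E})}. \tag{27c}
\end{align}

Observe that $\mathcal{P}\mathcal{R}\mathcal{Q}_Z$ is just a map $\mathcal{R}'_{X\to Z}$. Taking the infimum over $\mathcal{R}$ we then have

\begin{align}
\sqrt{2\varepsilon_X(\mathcal{E})} + \nu_Z(\mathcal{E}) &\ge \inf_{\mathcal{R}} \delta(\mathcal{Q}_Z, \mathcal{Q}_X\mathcal{P}\mathcal{R}\mathcal{Q}_Z) \tag{28a}\\
&\ge \inf_{\mathcal{R}} \delta(\mathcal{Q}_Z, \mathcal{Q}_X\mathcal{R}). \tag{28b}
\end{align}

To show (20), let $\mathcal{R}_{YB\to A}$ and $\mathcal{R}'_{Y\to X}$ be the optimal maps in $\nu_Z(\mathcal{E})$ and $\varepsilon_X(\mathcal{E})$, respectively. Now apply Lemma 1 to $\mathcal{M} = \mathcal{E}\mathcal{R}'\mathcal{R}\mathcal{Q}_Z$ and suppose that $\mathcal{E}'_{A\to ZB}$ is the resulting instrument and $\mathcal{M}_{ZB\to X}$ is the conditional measurement. By the above argument, $\sqrt{2\varepsilon_Z(\mathcal{E}')} + \nu_X(\mathcal{E}') \ge \inf_{\mathcal{R}} \delta(\mathcal{Q}_X, \mathcal{Q}_Z\mathcal{R})$. But $\varepsilon_Z(\mathcal{E}') \le \delta(\mathcal{Q}_Z, \mathcal{E}'\mathcal{T}_B) = \nu_Z(\mathcal{E})$ and $\nu_X(\mathcal{E}') \le \delta(\mathcal{Q}_X, \mathcal{E}'\mathcal{M}) = \varepsilon_X(\mathcal{E})$, where in the latter we use the fact that we could always reprepare an $X$ eigenstate and then let $\mathcal{Q}_X$ measure it. Therefore the desired bound holds.

To establish (21), we proceed just as above to obtain

$$\delta(\mathcal{P}_Z, \mathcal{P}_Z\mathcal{Q}_X\mathcal{P}\mathcal{R}) \le \delta(\mathcal{P}_Z, \mathcal{P}_Z\mathcal{E}\mathcal{R}) + \sqrt{2\varepsilon_X(\mathcal{E})}. \tag{29}$$

Now $\mathcal{P}_{X\to YB}\mathcal{R}_{YB\to A}$ is a preparation map $\mathcal{P}_{X\to A}$, and taking the infimum over $\mathcal{R}$ gives

\begin{align}
\sqrt{2\varepsilon_X(\mathcal{E})} + \eta_Z(\mathcal{E}) &\ge \inf_{\mathcal{R}} \delta(\mathcal{P}_Z, \mathcal{P}_Z\mathcal{Q}_X\mathcal{P}\mathcal{R}) \tag{30a}\\
&\ge \inf_{\mathcal{P}} \delta(\mathcal{P}_Z, \mathcal{P}_Z\mathcal{Q}_X\mathcal{P}). \tag{30b}
\end{align}

Finally, (22).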
Since the $\widehat{\eta}_Z$ disturbance measure is defined “backwards”, we start the triangle inequality with the distinguishability quantity related to disturbance, rather than the eventual constant of the bound. For any channel $\mathcal{C}_{Z\to X}$ and $\mathcal{P}_{X\to YB}$ from Lemma 2, just as before we have

\begin{align}
\delta(\mathcal{C}\mathcal{P}, \mathcal{P}_Z\mathcal{E}) &\le \delta(\mathcal{C}\mathcal{P}, \mathcal{P}_Z\mathcal{Q}_X\mathcal{P}) + \delta(\mathcal{P}_Z\mathcal{Q}_X\mathcal{P}, \mathcal{P}_Z\mathcal{E}) \tag{31a}\\
&\le \delta(\mathcal{C}, \mathcal{P}_Z\mathcal{Q}_X) + \sqrt{2\varepsilon_X(\mathcal{E})}. \tag{31b}
\end{align}

Now we take the infimum over constant channels $\mathcal{C}_{Z\to X}$. Note that

$$\inf_{\mathcal{C}_{Z\to YB}} \delta(\mathcal{C}, \mathcal{P}_Z\mathcal{E}) \le \inf_{\mathcal{C}_{Z\to X}} \delta(\mathcal{C}\mathcal{P}, \mathcal{P}_Z\mathcal{E}). \tag{32}$$

Therefore, we have

$$\sqrt{2\varepsilon_X(\mathcal{E})} + \widehat{\eta}_Z(\mathcal{E}) \ge \frac{d-1}{d} - \inf_{\mathcal{C}} \delta(\mathcal{C}, \mathcal{P}_Z\mathcal{Q}_X). \tag{33}$$

This last proof also applies to a more general definition of disturbance which does not use $\mathcal{P}_Z$ at the input, but rather diagonalizes or “pinches” any input quantum system in the $Z$ basis. Such a transformation can be thought of as the result of performing an ideal $Z$ measurement, but forgetting the result. More formally, letting $\mathcal{Q}^\natural_Z = \mathcal{W}_Z\mathcal{T}_Z$ with $\mathcal{W}_Z : a \mapsto W_Z^* a W_Z$, we can define

$$\widetilde{\eta}_Z(\mathcal{E}) = \frac{d-1}{d} - \inf_{\mathcal{C}} \delta(\mathcal{C}, \mathcal{Q}^\natural_Z\mathcal{E}). \tag{34}$$

Though perhaps less conceptually appealing, this is a more general notion of disturbance, since now we can potentially use entanglement at the input to increase distinguishability of $\mathcal{Q}^\natural_Z\mathcal{E}$ from any constant channel. However, due to the form of $\mathcal{Q}^\natural_Z$, entanglement will not help. Applied to any bipartite state, the map $\mathcal{Q}^\natural_Z$ produces a state of the form $\sum_z p_z |\theta_z\rangle\langle\theta_z| \otimes \sigma_z$ for some probability distribution $p_z$ and set of normalized states $\sigma_z$, and therefore the input to $\mathcal{E}$ itself is again an output of $\mathcal{P}_Z$. Since classical correlation with ancillary systems is already covered in $\widehat{\eta}_Z(\mathcal{E})$, it follows that $\widetilde{\eta}_Z(\mathcal{E}) = \widehat{\eta}_Z(\mathcal{E})$.

Now we turn to the infinite-dimensional case of position and momentum measurements.
Let us focus on Gaussian limits on precision, where the convolution function $\alpha$ described in §2.2 is the square root of a normalized Gaussian of width $\sigma$, and for convenience define

$$g_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2}. \tag{35}$$

One advantage of the Gaussian choice is that the Stinespring dilation of the ideal $\sigma$-limited measurement device is just a canonical transformation. Thus, measurement of position $Q$ just amounts to adding this value to an ancillary system which is prepared in a zero-mean Gaussian state with position standard deviation $\sigma_Q$, and similarly for momentum. The same interpretation is available for precision-limited state preparation. To prepare a momentum state of width $\sigma_P$, we begin with a system in a zero-mean Gaussian state with momentum standard deviation $\sigma_P$ and simply shift the momentum by the desired amount.

Given the ideal devices, the definitions of error and disturbance are those of §3, as in the finite-dimensional case, with the slight change that the first term of $\widehat{\eta}$ is now 1. To reduce clutter, we do not indicate $\sigma_Q$ and $\sigma_P$ specifically in the error and disturbance functions themselves.

Since our error and disturbance measures are based on possible state preparations and measurements in order to best distinguish the two devices, in principle one ought to consider precision limits in the distinguishability quantity $\delta$. However, we will not follow this approach here, and instead we allow tests of arbitrary precision in order to preserve the link between distinguishability and the cb norm. This leads to bounds that are perhaps overly pessimistic, but nevertheless limit the possible performance of any device.

As discussed previously, the disturbance measure of demerit $\widehat{\eta}$ cannot be expected to lead to uncertainty relations for position and momentum observables, as any non-constant channel can be perfectly differentiated from a constant one by inputting states of arbitrarily high momentum. We thus focus on the disturbance measures of merit.

Theorem 3.

Set $c = 2\sigma_Q\sigma_P$ for any precision values $\sigma_Q, \sigma_P > 0$. Then for any quantum instrument $\mathcal{E}$,

$$\left.\begin{array}{r} \sqrt{2\varepsilon_Q(\mathcal{E})} + \nu_P(\mathcal{E}) \\ \varepsilon_Q(\mathcal{E}) + \sqrt{2\nu_Q(\mathcal{E})} \end{array}\right\} \ge \frac{1 - c^2}{\left(1 + c^{2/3} + c^{4/3}\right)^{3/2}} \quad\text{and} \tag{36}$$

$$\sqrt{2\varepsilon_Q(\mathcal{E})} + \eta_P(\mathcal{E}) \ge \frac{(1 + c^2)^{1/2}}{\left((1 + c^2) + c^{2/3}(1 + c^2)^{2/3} + c^{4/3}(1 + c^2)^{1/3}\right)^{3/2}}. \tag{37}$$

Before proceeding to the proofs, let us comment on the properties of the two bounds. As can be seen in Figure 7, the bounds take essentially the same values for $\sigma_Q\sigma_P \ll 1$, and indeed both evaluate to unity at $\sigma_Q\sigma_P = 0$. This is the region of combined position and momentum precision far smaller than the natural scale set by $\hbar$, and the limit of infinite precision accords with the finite-dimensional bounds for conjugate observables in the limit $d \to \infty$. Otherwise, though, the bounds differ remarkably. The measurement disturbance bound in (36) is positive only when $\sigma_Q\sigma_P \le \tfrac{1}{2}$, which is the Heisenberg precision limit. In contrast, the preparation disturbance bound in (37) is always positive, though it decays roughly as $(\sigma_Q\sigma_P)^{-2}$.

The distinction between these two cases is a result of allowing arbitrarily precise measurements in the distinguishability measure. It can be understood by the following heuristic argument. Consider an experiment in which a momentum state of width $\sigma^{\rm in}_P$ is subjected to a position measurement of resolution $\sigma_Q$ and then a momentum measurement of resolution $\sigma^{\rm out}_P$. From the uncertainty principle, we expect the position measurement to change the momentum by an amount $\sim 1/\sigma_Q$. Thus, to reliably detect the change in momentum, $\sigma^{\rm out}_P$ must fulfill the condition $\sigma^{\rm out}_P \ll \sigma^{\rm in}_P + 1/\sigma_Q$. The Heisenberg limit in the measurement disturbance scenario is $\sigma^{\rm out}_P = 1/\sigma_Q$, meaning this condition cannot be met no matter how small we choose $\sigma^{\rm in}_P$. This is consistent with no nontrivial bound in (36) in this region. On the other hand, for preparation disturbance the Heisenberg limit is $\sigma^{\rm in}_P = 1/\sigma_Q$, so detecting the change in momentum simply requires $\sigma^{\rm out}_P \ll 1/\sigma_Q$. A more satisfying approach would be to include the precision limitation in the distinguishability measure to restore the symmetry of the two scenarios, but this requires significant changes to the proof and is left for future work.
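The two bounds of Theorem 3 are simple functions of $c = 2\sigma_Q\sigma_P$ and can be tabulated directly. The sketch below is illustrative only; the function names are hypothetical. `gap_at_optimizer` evaluates the two-term Gaussian expression of the proof (equation (42) with $\sigma_\psi \to 0$ and $\widehat{\sigma}_P = 1/2\sigma_Q$) at the test-function width of (43), which should reproduce the right-hand side of (36).

```python
import math

def measurement_bound(sigma_q, sigma_p):
    """Right-hand side of (36); nonpositive (trivial) once sigma_q*sigma_p >= 1/2."""
    c = 2.0 * sigma_q * sigma_p
    return (1.0 - c**2) / (1.0 + c**(2 / 3) + c**(4 / 3)) ** 1.5

def preparation_bound(sigma_q, sigma_p):
    """Right-hand side of (37); positive for all precisions."""
    c = 2.0 * sigma_q * sigma_p
    s = 1.0 + c**2
    denom = (s + c**(2 / 3) * s**(2 / 3) + c**(4 / 3) * s**(1 / 3)) ** 1.5
    return math.sqrt(s) / denom

def gap_at_optimizer(sigma_q, sigma_p):
    """Difference of the two Gaussian terms at the optimal sigma_f (needs c > 0)."""
    c = 2.0 * sigma_q * sigma_p
    sigma_hat = 1.0 / (2.0 * sigma_q)          # smallest allowed momentum width
    sf2 = sigma_p**2 / (c**(2 / 3) * (1.0 + c**(2 / 3)))   # optimal sigma_f^2
    first = math.sqrt(sf2 / (sf2 + sigma_p**2))
    second = math.sqrt(sf2 / (sf2 + sigma_hat**2))
    return first - second

if __name__ == "__main__":
    for prod in (0.01, 0.1, 0.25, 0.5, 1.0, 5.0):
        print(prod, measurement_bound(prod, 1.0), preparation_bound(prod, 1.0))
```

Substituting $c \to c/\sqrt{1+c^2}$ in `measurement_bound` gives `preparation_bound`, mirroring the replacement used at the end of the proof.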
Figure 7: Uncertainty bounds appearing in Theorem 3 in terms of the combined precision $\sigma_Q\sigma_P$. The solid line corresponds to the bound involving measurement disturbance, (36), the dashed line to the bound involving preparation disturbance, (37).

The proof of Theorem 3 is broadly similar to the finite-dimensional case. We would again like to begin with $\mathcal{F}_{QA\to YB}$ from Lemma 2 such that $\delta(\mathcal{E}, \mathcal{Q}'_Q\mathcal{F}) \le \sqrt{2\varepsilon_Q(\mathcal{E})}$. However, the argument does not quite go through, as in infinite dimensions we cannot immediately ensure that the infimum in Stinespring continuity is attained. Nonetheless, we can consider a sequence of maps $(\mathcal{F}_n)_{n\in\mathbb{N}}$ such that the desired distinguishability bound holds in the limit $n \to \infty$.

To show (36), we follow the steps in (27). Now, though, consider the map $\mathcal{F}'_n$ which just appends $Q$ to the output of $\mathcal{F}_n$, and define $\mathcal{N} = \mathcal{Q}'_Q\mathcal{F}_n\mathcal{R}\mathcal{Q}_P$, where $\mathcal{Q}'_Q$ is the instrument associated with position measurement $\mathcal{Q}_Q$. Then we have

\begin{align}
\delta(\mathcal{Q}_P, \mathcal{N}\mathcal{T}_Q) &\le \delta(\mathcal{Q}_P, \mathcal{E}\mathcal{R}\mathcal{Q}_P) + \delta(\mathcal{E}\mathcal{R}\mathcal{Q}_P, \mathcal{N}\mathcal{T}_Q) \tag{38a}\\
&\le \delta(\mathcal{Q}_P, \mathcal{E}\mathcal{R}\mathcal{Q}_P) + \delta(\mathcal{E}, \mathcal{Q}'_Q\mathcal{F}_n). \tag{38b}
\end{align}

Taking the limit $n \to \infty$ and the infimum over recovery maps $\mathcal{R}$ produces $\sqrt{2\varepsilon_Q(\mathcal{E})} + \nu_P(\mathcal{E})$ on the righthand side. We can bound the lefthand side by testing with pure unentangled inputs:

$$\delta(\mathcal{Q}_P, \mathcal{N}\mathcal{T}_Q) \ge \sup_{\psi, f} \left\langle \psi, \left(\mathcal{Q}_P(f) - [\mathcal{N}\mathcal{T}_Q](f)\right)\psi \right\rangle. \tag{39}$$

Now we want to show that, since $\mathcal{Q}_P$ is covariant with respect to phase space translations, without loss of generality we can take $\mathcal{N}$ to be covariant as well. Consider the translated version of both $\mathcal{Q}_P$ and $\mathcal{N}\mathcal{T}_Q$, obtained by shifting their inputs and outputs correspondingly by some amount $z = (q, p)$. For the states $\psi$ this shift is implemented by the Weyl-Heisenberg operators $V_z$, while for tests $f$ only the value of $p$ is relevant.
Any such shift does not change the distinguishability, because we can always shift $\psi$ and $f$ as well to recover the original quantity. Averaging over the translated versions therefore also leads to the same distinguishability, and since $\mathcal{Q}_P$ is itself covariant, the averaging results in a covariant $\mathcal{N}\mathcal{T}_Q$. The details of the averaging require some care in this noncompact setting, but are standard by now, and we refer the reader to the work of Werner [ ] for further details. Since $\mathcal{T}_Q$ just ignores the $Q$ output of the measurement $\mathcal{N}$, we may thus proceed by assuming that $\mathcal{N}$ is a covariant measurement.

Any covariant $\mathcal{N}$ has the form

$$\mathcal{N}(f) = \int_{\mathbb{R}^2} \frac{d^2 z}{2\pi}\, f(z)\, V_z\, m\, V_z^*, \tag{40}$$

for some positive operator $m$ such that $\mathrm{Tr}[m] = 1$. Due to the definition of $\mathcal{N}$, the position measurement result is precisely that obtained from $\mathcal{Q}_Q$. By the covariant form of $\mathcal{N}$, this implies that the position width of $m$ is just $\sigma_Q$ (or rather that of the parity version of $m$, see [ ]). Suppose the momentum distribution has standard deviation $\widehat{\sigma}_P$; then $\sigma_Q\widehat{\sigma}_P \ge 1/2$ [ ].

Now we can evaluate the lower bound term by term. Let us choose a Gaussian state in the momentum representation and test function: $\psi = g_{\sigma_\psi}$ and $f = \sqrt{2\pi}\,\sigma_f\, g_{\sigma_f}$. Then the first term is a straightforward Gaussian integral, since the precision-limited measurement just amounts to the ideal measurement convolved with $g_{\sigma_P}$:

\begin{align}
\langle \psi, \mathcal{Q}_P(f)\psi \rangle &= \int_{\mathbb{R}^2} dp'\, dp\; g_{\sigma_\psi}(p')\, g_{\sigma_P}(p' - p)\, f(p) \tag{41a}\\
&= \frac{\sigma_f}{\sqrt{\sigma_f^2 + \sigma_P^2 + \sigma_\psi^2}}. \tag{41b}
\end{align}

The second term is the same, just with $\widehat{\sigma}_P$ instead of $\sigma_P$, so we have

$$\delta(\mathcal{Q}_P, \mathcal{N}\mathcal{T}_Q) \ge \frac{\sigma_f}{\sqrt{\sigma_f^2 + \sigma_P^2 + \sigma_\psi^2}} - \frac{\sigma_f}{\sqrt{\sigma_f^2 + \widehat{\sigma}_P^2 + \sigma_\psi^2}}. \tag{42}$$

The tightest possible bound comes from the smallest $\widehat{\sigma}_P$, which is $1/2\sigma_Q$, and the bound is clearly trivial if $\sigma_Q\sigma_P \ge 1/2$. If this is not the case, we can optimize our choice of $\sigma_f$. To simplify the calculation, assume that $\sigma_\psi$ is small compared to $\sigma_f$ (so that we are testing with a very narrow momentum state). Then, with $c = 2\sigma_Q\sigma_P$, the optimal $\sigma_f$ is given by

$$\sigma_f^2 = \frac{\sigma_P^2}{c^{2/3}\left(1 + c^{2/3}\right)}. \tag{43}$$

Using this in (42) gives (36).

For preparation disturbance, proceed as before to obtain

\begin{align}
\delta(\mathcal{P}_P, \mathcal{P}_P\mathcal{Q}'_Q\mathcal{F}'_n\mathcal{R}\mathcal{T}_Q) &\le \delta(\mathcal{P}_P, \mathcal{P}_P\mathcal{E}\mathcal{R}) + \delta(\mathcal{P}_P\mathcal{E}\mathcal{R}, \mathcal{P}_P\mathcal{Q}'_Q\mathcal{F}'_n\mathcal{R}\mathcal{T}_Q) \tag{44a}\\
&\le \delta(\mathcal{P}_P, \mathcal{P}_P\mathcal{E}\mathcal{R}) + \delta(\mathcal{E}, \mathcal{Q}'_Q\mathcal{F}_n). \tag{44b}
\end{align}

Now the limit $n \to \infty$ and the infimum over recovery maps $\mathcal{R}$ produces $\sqrt{2\varepsilon_Q(\mathcal{E})} + \eta_P(\mathcal{E})$ on the righthand side. A lower bound on the quantity on the lefthand side can be obtained by using $\mathcal{P}_P$ to prepare a $\sigma_P$-limited input state and making a $\sigma_m$-limited momentum measurement $\bar{\mathcal{Q}}_P$ on the output, so that, for $\mathcal{N}$ as before,

$$\delta(\mathcal{P}_P, \mathcal{P}_P\mathcal{Q}'_Q\mathcal{F}'_n\mathcal{R}\mathcal{T}_Q) \ge \sup_{\psi:\,\text{Gaussian};\, f} \left\langle \psi, \left(\bar{\mathcal{Q}}_P(f) - [\mathcal{N}\mathcal{T}_Q](f)\right)\psi \right\rangle. \tag{45}$$

The only difference to (39) is that the supremum is restricted to Gaussian states of width $\sigma_P$. The covariance argument nonetheless goes through as before, and we can proceed to evaluate the lower bound as above. This yields

$$\delta(\mathcal{P}_P, \mathcal{P}_P\mathcal{Q}'_Q\mathcal{F}'_n\mathcal{R}\mathcal{T}_Q) \ge \frac{\sigma_f}{\sqrt{\sigma_f^2 + \sigma_m^2 + \sigma_P^2}} - \frac{\sigma_f}{\sqrt{\sigma_f^2 + \frac{1}{4\sigma_Q^2} + \sigma_P^2}}. \tag{46}$$

We may as well consider $\sigma_m \to 0$; the optimal $\sigma_f$ is then given by the optimizer above, replacing $c$ with $c/\sqrt{1 + c^2}$. Making the same replacement in (36) yields (37).

6 Applications

6.1 No information about Z without disturbance to X

A useful tool in the construction of quantum information processing protocols is the link between reliable transmission of $X$ eigenstates through a channel $\mathcal{N}$ and $Z$ eigenstates through its complement $\mathcal{N}^\sharp$, particularly when the observables $X$ and $Z$ are maximally complementary, i.e. $|\langle\varphi_x|\vartheta_z\rangle|^2 = \frac{1}{d}$ for all $x, z$.
Due to the uncertainty principle, we expect that a channel cannot reliably transmit the bases to different outputs, since this would provide a means to simultaneously measure $X$ and $Z$. This link has been used by Shor and Preskill to prove the security of quantum key distribution [ ] and by Devetak to determine the quantum channel capacity [ ]. Entropic state-preparation uncertainty relations from [6, 44] can be used to understand both results, as shown in [60, 61].

However, the above approach has the serious drawback that it can only be used in cases where the specific $X$-basis transmission over $\mathcal{N}$ and $Z$-basis transmission over $\mathcal{N}^\sharp$ are in some sense compatible and not counterfactual; because the argument relies on a state-dependent uncertainty principle, both scenarios must be compatible with the same quantum state. Fortunately, this can be done for both QKD security and quantum capacity, because at issue is whether $X$-basis ($Z$-basis) transmission is reliable (unreliable) on average when the states are selected uniformly at random. Choosing among either basis states at random is compatible with a random measurement in either basis of half of a maximally entangled state, and so both $X$ and $Z$ basis scenarios are indeed compatible. The same restriction to choosing input states uniformly appears in the recent result of [ ], as it also ultimately relies on a state-preparation uncertainty relation.

Using Theorem 2 we can extend the method above to counterfactual uses of arbitrary channels $\mathcal{N}$, in the following sense: If acting with the channel $\mathcal{N}$ does not substantially affect the possibility of performing an $X$ measurement, then $Z$-basis inputs to $\mathcal{N}^\sharp$ result in an essentially constant output. More concretely, we have

Corollary 1.

Given a channel $\mathcal{N}$ and complementary channel $\mathcal{N}^\sharp$, suppose that there exists a measurement $\Lambda_X$ such that $\delta(\mathcal{Q}_X, \mathcal{N}\Lambda_X) \le \varepsilon$. Then there exists a constant channel $\mathcal{C}$ such that

$$\delta(\mathcal{Q}^\natural_Z\mathcal{N}^\sharp, \mathcal{C}) \le \sqrt{2\varepsilon} + \frac{d-1}{d} - \widehat{c}_P(X,Z). \tag{47}$$

For maximally complementary $X$ and $Z$, $\delta(\mathcal{Q}^\natural_Z\mathcal{N}^\sharp, \mathcal{C}) \le \sqrt{2\varepsilon}$.

Proof. Let $\mathcal{V}$ be the Stinespring dilation of $\mathcal{N}$ such that $\mathcal{N}^\sharp$ is the complementary channel and define $\mathcal{E} = \mathcal{V}_{\mathcal{N}}\Lambda_X$. For $\mathcal{C}$ the optimal choice in the definition of $\widehat{\eta}_Z(\mathcal{E})$, (22), (34), and $\widetilde{\eta}_Z = \widehat{\eta}_Z$ imply $\delta(\mathcal{Q}^\natural_Z\mathcal{E}, \mathcal{C}) \le \sqrt{2\varepsilon} + \frac{d-1}{d} - \widehat{c}_P(X,Z)$. Since $\mathcal{N}^\sharp$ is obtained from $\mathcal{E}$ by ignoring the $\Lambda_X$ measurement result, $\delta(\mathcal{Q}^\natural_Z\mathcal{N}^\sharp, \mathcal{C}) \le \delta(\mathcal{Q}^\natural_Z\mathcal{E}, \mathcal{C})$.

This formulation is important because in more general cryptographic and communication scenarios we are interested in the worst-case behavior of the protocol, not the average case under some particular probability distribution. For instance, in [ ] the goal is to construct a classical computer resilient to leakage of $Z$-basis information by establishing that reliable $X$ basis measurement is possible despite the interference of the eavesdropper. However, such an $X$ measurement is entirely counterfactual and cannot be reconciled with the actual $Z$-basis usage, as the $Z$-basis states will be chosen deterministically in the classical computer.

It is important to point out that, unfortunately, calibration testing is in general completely insufficient to establish a small value of $\delta(\mathcal{Q}_X, \mathcal{N}\Lambda_X)$. More specifically, the following example shows that there is no dimension-independent bound connecting $\inf_{\Lambda_X} \delta(\mathcal{Q}_X, \mathcal{N}\Lambda_X)$ to the worst case probability of incorrectly identifying an $X$ eigenstate input to $\mathcal{N}$, for arbitrary $\mathcal{N}$. Let the quantities $p_{yz}$ be given by $p_{y,0} = 2/d$ for $y = 0, \dots, d/2 - 1$, $p_{y,1} = 2/d$ for $y = d/2, \dots, d - 1$, and $p_{y,z} = 1/d$ otherwise, where we assume $d$ is even, and then define the isometry $V : \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_C \otimes \mathcal{H}_D$ as the map taking $|z\rangle_A$ to $\sum_y \sqrt{p_{yz}}\, |y\rangle_B |z\rangle_C |y\rangle_D$. Finally, let $\mathcal{N} : \mathcal{B}(\mathcal{H}_B) \otimes \mathcal{B}(\mathcal{H}_C) \to \mathcal{B}(\mathcal{H}_A)$ be the channel obtained by ignoring $D$, i.e. in the Schrödinger picture $\mathcal{N}^*(\varrho) = \mathrm{Tr}_D[V\varrho V^*]$. Now consider inputs in the $X$ basis, with $X$ canonically conjugate to $Z$. As shown in Appendix C, the probability of correctly determining any particular $X$ input is the same for all values, and is equal to $\frac{1}{d^2}\sum_y \left(\sum_z \sqrt{p_{y,z}}\right)^2 = (d + \sqrt{2} - 2)^2/d^2$. The worst case $X$ error probability therefore tends to zero like $1/d$ as $d \to \infty$. On the other hand, $Z$-basis inputs 0 and 1 to the complementary channel $\mathcal{N}^\sharp$ result in completely disjoint output states due to the form of $p_{yz}$. Thus, if we consider a test which inputs one of these randomly and checks for agreement at the output, we find $\inf_{\mathcal{C}} \delta(\mathcal{Q}^\natural_Z\mathcal{N}^\sharp, \mathcal{C}) \ge \tfrac{1}{2}$. Using the bound above, this implies $\inf_{\Lambda_X} \delta(\mathcal{Q}_X, \mathcal{N}\Lambda_X) \ge \tfrac{1}{8}$. This is not 1, but the point is it is bounded away from zero and independent of $d$: There must be a factor of $d$ when converting between the worst case error probability and the distinguishability.

We can appreciate the failure of calibration in this example from a different point of view, by appealing to the information-disturbance tradeoff of [ ]. Since $\mathcal{N}$ transmits $Z$ eigenstates perfectly to $BC$ and $X$ eigenstates almost perfectly, we might be tempted to conclude that the channel is close to the identity channel. However, the information-disturbance tradeoff implies that complements of channels close to the identity are close to constant channels. Clearly this is not the case here, since $\mathcal{N}^{\sharp *}(|0\rangle\langle 0|)$ is distinguishable from $\mathcal{N}^{\sharp *}(|1\rangle\langle 1|)$. This point is discussed further by one of us in [ ].
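The numbers in this counterexample can be reproduced in a few lines. This is a sketch under one explicit assumption: for $z \in \{0, 1\}$ the entries $p_{y,z}$ are taken to vanish outside the stated ranges of $y$, which is what normalization of each column and the disjointness claim require, and “otherwise” is read as applying to $z \ge 2$. The function names are hypothetical.

```python
import numpy as np

def p_matrix(d):
    """p[y, z] for the counterexample; d must be even.

    Assumption: columns z = 0 and z = 1 are zero outside their stated
    supports, so that every column is a normalized distribution over y.
    """
    if d % 2:
        raise ValueError("d must be even")
    p = np.full((d, d), 1.0 / d)    # p[y, z] = 1/d for z >= 2
    p[:, 0] = 0.0
    p[:, 1] = 0.0
    p[: d // 2, 0] = 2.0 / d        # z = 0 supported on y = 0, ..., d/2 - 1
    p[d // 2 :, 1] = 2.0 / d        # z = 1 supported on y = d/2, ..., d - 1
    return p

def x_success_probability(p):
    """(1/d^2) * sum_y (sum_z sqrt(p[y, z]))^2, the X-basis success probability."""
    d = p.shape[0]
    return (np.sqrt(p).sum(axis=1) ** 2).sum() / d**2

if __name__ == "__main__":
    for d in (4, 16, 64, 256):
        p = p_matrix(d)
        closed_form = (d + np.sqrt(2) - 2) ** 2 / d**2
        print(d, 1 - x_success_probability(p), 1 - closed_form)
```

The printed error probabilities shrink roughly like $1/d$, while columns 0 and 1 of `p` keep disjoint supports for every $d$, which is the source of the dimension-independent distinguishability of the complementary channel's outputs.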
The counterexample constructed above is not symmetric for $Z$ inputs, and it is an open question if calibration is sufficient in the symmetric case. For channels that are covariant with respect to the Weyl-Heisenberg group (also known as the generalized Pauli group), it is not hard to show that calibration is in fact sufficient.

In [ ] Englert presents a wave-particle complementarity relation in a Mach-Zehnder interferometer, quantifying the extent to which “the observations of an interference pattern and the acquisition of which-way information are mutually exclusive”. The particle-like “which-way” information is obtained by additional detectors in the arms of the interferometer, while fringe visibility is measured by the population difference between the two output ports of the interferometer. The detectors can be thought of as producing different states in an ancilla system, depending on the path taken by the light. Englert shows the following tradeoff between the visibility $V$ and distinguishability $D$ of the which-way detector states:

$$V^2 + D^2 \le 1. \tag{48}$$

We may regard the entire interferometer plus which-way detector as an apparatus $\mathcal{E}_{\rm MZ}$ with quantum and classical output. It turns out that $\mathcal{E}_{\rm MZ}$ is precisely the nonideal qubit $X$ measurement considered in §3 and that path distinguishability is related to error of $X$ and visibility to disturbance (all of which are equal in this case by (15)) of a conjugate observable $Z$. More specifically, as shown in Appendix D,

$$\varepsilon_X(\mathcal{E}_{\rm MZ}) = \tfrac{1}{2}(1 - D) \quad\text{and}\quad \nu_Z(\mathcal{E}_{\rm MZ}) = \eta_Z(\mathcal{E}_{\rm MZ}) = \widehat{\eta}_Z(\mathcal{E}_{\rm MZ}) = \tfrac{1}{2}(1 - V). \tag{49}$$

Therefore, (48) is also an error-preparation disturbance relation. By the same token, the uncertainty relations in Theorems 1 and 2 imply wave-particle duality relations.

Let us comment on other connections between uncertainty and duality relations. Recently, [ ] showed a relation between wave-particle duality relations and entropic uncertainty relations. As discussed above, the latter are state-dependent state-preparation relations, and so the interpretation of the wave-particle duality relation is somewhat different. Here we have shown that Englert’s relation can actually be understood as a state-independent relation.

Each of the disturbance measures is related to visibility in Englert’s setup. It is an interesting question to consider a multipath interferometer to settle the question of which disturbance measure should be associated to visibility in general. From the discussion of [ ], it would appear that visibility ought to be related to measurement disturbance $\nu_Z$, but we leave a complete analysis to future work.

Broadly speaking, there are two main kinds of uncertainty relations: those which are constraints on fixed experiments, including the details of the input quantum state, and those that are constraints on quantum devices themselves, independent of the particular input. All of our relations are of the latter type, in contrast to entropic relations, which are typically of the former type.
At a formal level, this distinction appears in whether or not the quantities involved in the precise relation depend on the input state. Each type of relation certainly has its use, though when considering error-disturbance uncertainty relations, we argued in the introduction that the conceptual underpinnings of state-dependent relations describing fixed experiments are unclear. Indeed, it is precisely because of the uncertainty principle that trouble arises in defining error and disturbance in this case. Worse still, there can be no nontrivial bound relating error and disturbance which applies universally, i.e. to all states [ ].

Independent of the previous question, another major contrast between different kinds of uncertainty relations is whether they depend on the values taken by the observables, or only the configuration of their eigenstates. Again, our relations are all of the latter type, but now we share this property with entropic relations. That is not to say that the observable values are completely irrelevant in our setting, merely that they […]

Footnote: This is separate from the issue of whether the bound depends on the state, as for instance in the Robertson relation [ ].

[ ], the authors used the Wasserstein metric of order two, corresponding to the mean squared error, as the underlying distance $D(\cdot,\cdot)$ to measure the closeness of probability distributions. If $M^Q$, $M^P$ are the marginals of some joint measurement of position $Q$ and momentum $P$, and $X_\varrho$ denotes the distribution coming from applying the measurement $X$ to the state $\varrho$, their relation reads
\[ \sup_\varrho D(M^Q_\varrho, Q_\varrho) \cdot \sup_\varrho D(M^P_\varrho, P_\varrho) \ge c, \tag{50} \]
for some universal constant $c$. In [ ], the authors generalize their results to arbitrary Wasserstein metrics.
As in our case, the two distinguishability quantities in (50) are separately maximized over all states, and hence the resulting expression characterizes the goodness of the approximate measurement.

One could instead ask for a “coupled optimization”, a relation of the form
\[ \sup_\varrho \big[ D(M^Q_\varrho, Q_\varrho)\, D(M^P_\varrho, P_\varrho) \big] \ge c', \tag{51} \]
for some other constant $c'$. This approach is taken in [ ] for the question of joint measurability. While this statement certainly tells us that no device can accurately measure both position and momentum for all input states, the bound $c'$ only holds (and can only hold) for the worst possible input state. In contrast, our bounds, as well as those in (50), are state-independent in the sense that the bound holds for all states. Indeed, the two approaches are more distinct than the similarities between (50) and (51) would suggest. By optimizing over input states separately, our results and those of [22, 25, 27] are statements about the properties of measurement devices themselves, independent of any particular experimental setup. State-dependent settings capture the behavior of measurement devices in specific experimental setups and must therefore account for the details of the input state.

The same set of authors also studied the case of finite-dimensional systems, in particular qubit systems, again using the Wasserstein metric of order two [ ]. Their results for this case are similar, with the product in (50) replaced by a sum. Perhaps most closely related to our results is the recent work by Ipsen [ ], who uses the variational distance as the underlying distinguishability measure to derive similar additive uncertainty relations. We note, however, that both [ ] and [ ] only consider joint measurability and do not consider the change to the state after the approximate measurement is performed, as is done in our error-disturbance relation. Furthermore, both base their distinguishability measures on the measurement statistics of the devices alone. But this does not necessarily tell us how distinguishable two devices ultimately are, as we could employ input states entangled with ancilla systems to test them. These two measures can be different [ ], even for entanglement-breaking channels [ ]. In Appendix A we give an example which shows that this is also true of quantum measurements, a specific kind of entanglement-breaking channel.

Entropic quantities are another means of comparing two probability distributions, an approach taken recently by Buscemi et al. [ ] and Coles and Furrer [ ] (see also Martens and de Muynck [ ]). Both contributions formalize error and disturbance in terms of relative or conditional entropies, and derive their results from entropic uncertainty relations for state preparation which incorporate the effects of quantum entanglement [6, 44]. They differ in the choice of the entropic measure and the choice of the state on which the entropic terms are evaluated. Buscemi et al. find state-independent error-disturbance relations involving the von Neumann entropy, evaluated for input states which describe observable eigenstates chosen uniformly at random. As described in Sec. 6, the restriction to uniformly random inputs is significant, and leads to a characterization of the average-case behavior of the device (averaged over the choice of input state), not the worst-case behavior as presented here. Meanwhile, Coles and Furrer make use of general Rényi-type entropies, hence also capturing the worst-case behavior. However, they are after a state-dependent error-disturbance relation which relates the amount of information a measurement device can extract from a state about the results of a future measurement of one observable to the amount of disturbance caused to another observable.

An important distinction between both these results and those presented here is the quantity appearing in the uncertainty bound, i.e. the quantification of complementarity of two observables. As both the aforementioned results are based on entropic state-preparation uncertainty relations, they each quantify complementarity by the largest overlap of the eigenstates of the two observables. This bound is trivial should the two […]

Footnote: Such an approach has been advocated by David Reeb (private communication).

We have formulated simple, operational definitions of error and disturbance based on the probability of distinguishing the actual measurement apparatus from the relevant ideal apparatus by any testing procedure whatsoever. The resulting quantities are conceptually straightforward properties of the measurement apparatus, not of any particular fixed experimental setup. We presented uncertainty relations for both joint measurability and the error-disturbance tradeoff, for both arbitrary finite-dimensional systems and for position and momentum. In the former case the bounds involve simple measures of the complementarity of two observables, while the latter involve the ratio of the desired position and momentum precisions $\sigma_Q$ and $\sigma_P$ to Planck’s constant $\hbar$. We further showed that this operational approach has applications to quantum information processing and to wave-particle duality relations. Finally, we presented a detailed comparison of the relation of our results to previous work on uncertainty relations.

Several interesting questions remain open. One may inquire about the tightness of the bounds. The qubit example for conjugate observables discussed at the end of §3 shows that the finite-dimensional bounds of Theorem 2 are tight for small error $\varepsilon_X$, though no conclusion can be drawn from this example for small preparation disturbance. It would be interesting to check the tightness of the position and momentum bounds by computing the error and disturbance measures for a device described by a covariant measurement. For reasons of simplicity, we have not attempted to incorporate precision limits into the definitions of error and distinguishability of position and momentum. Doing so would lead to more conceptually satisfying bounds and perhaps remedy the fact that the measurement error-preparation disturbance bound is nontrivial even outside the Heisenberg limit. Bounds for other observables in infinite dimensions would also be quite interesting, for instance the mixed discrete/continuous case of energy and position of a harmonic oscillator. Restricting to covariant measurements, in finite or infinite dimensions, it would also be interesting to determine if entangled inputs improve the distinguishability measures, or whether calibration testing is sufficient. From the application in Corollary 1, it would appear that calibration is sufficient, but we have not settled the matter conclusively.

Acknowledgements: The authors are grateful to David Sutter, Paul Busch, Omar Fawzi, Fabian Furrer, Michael Walter and especially David Reeb and Reinhard Werner for helpful discussions. This work was supported by the Swiss National Science Foundation (through the NCCR ‘Quantum Science and Technology’ and grant no. 200020-135048) and the European Research Council (grant no. 258932). SH is funded by the German Excellence Initiative and the European Union Seventh Framework Programme under grant agreement no. 291763. He acknowledges additional support by DFG project no. K05430 /

References

[ ] W. Heisenberg, “Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik”, Zeitschrift für Physik , 172–198 (1927) (page 1). [ ] J. A. Wheeler and W. H. Zurek,

Quantum Theory and Measurement (Princeton University Press, 1984) (page 1). [ ] E. H. Kennard, “Zur Quantenmechanik einfacher Bewegungstypen”, Zeitschrift für Physik , 326–352 (1927)(pages 1, 14). [ ] H. P. Robertson, “The Uncertainty Principle”, Physical Review , 163 (1929) (pages 1, 16). [ ] H. Maassen and J. B. M. Ufﬁnk, “Generalized entropic uncertainty relations”, Physical Review Letters , 1103(1988) (page 1). [ ] M. Berta, M. Christandl, R. Colbeck, J. M. Renes, and R. Renner, “The uncertainty principle in the presence ofquantum memory”, Nature Physics , 659–662 (2010), arXiv:0909.0950 [ quant-ph ] (pages 1, 3, 15, 17). [ ] P. J. Coles, M. Berta, M. Tomamichel, and S. Wehner, “Entropic uncertainty relations and their applications”, Reviewsof Modern Physics , 015002 (2017), arXiv:1511.04857 [ quant-ph ] (pages 1, 3, 9). [ ] E. Arthurs and J. L. Kelly, “On the Simultaneous Measurement of a Pair of Conjugate Observables”, Bell SystemTechnical Journal , 725–729 (1965) (page 1). [ ] C. Y. She and H. Heffner, “Simultaneous Measurement of Noncommuting Observables”, Physical Review , 1103–1110 (1966) (page 1). [ ] E. B. Davies,

Quantum theory of open systems (Academic Press, London, 1976) (pages 1, 3, 4). [ ] S. T. Ali and E. Prugoveˇcki, “Systems of imprimitivity and representations of quantum mechanics on fuzzy phasespaces”, Journal of Mathematical Physics , 219–228 (1977) (page 1). ] E. Prugoveˇcki, “On fuzzy spin spaces”, Journal of Physics A: Mathematical and General , 543 (1977) (page 1). [ ] P. Busch, “Indeterminacy relations and simultaneous measurements in quantum theory”, International Journal ofTheoretical Physics , 63–92 (1985) (page 1). [ ] P. Busch, “Unsharp reality and joint measurements for spin observables”, Physical Review D , 2253–2261 (1986)(page 1). [ ] E. Arthurs and M. S. Goodman, “Quantum Correlations: A Generalized Heisenberg Uncertainty Relation”, PhysicalReview Letters , 2447–2449 (1988) (page 1). [ ] H. Martens and W. M. de Muynck, “Towards a new uncertainty principle: quantum measurement noise”, PhysicsLetters A , 441–448 (1991) (page 1). [ ] S. Ishikawa, “Uncertainty relations in simultaneous measurements for arbitrary observables”, Reports on Mathemat-ical Physics , 257–273 (1991) (page 1). [ ] M. G. Raymer, “Uncertainty principle for joint measurement of noncommuting variables”, American Journal ofPhysics , 986–993 (1994) (page 1). [ ] U. Leonhardt, B. Böhmer, and H. Paul, “Uncertainty relations for realistic joint measurements of position and mo-mentum in quantum optics”, Optics Communications , 296–300 (1995) (page 1). [ ] D. M. Appleby, “Concept of Experimental Accuracy and Simultaneous Measurements of Position and Momentum”,International Journal of Theoretical Physics , 1491–1509 (1998), arXiv:quant-ph / [ ] M. J. W. Hall, “Prior information: How to circumvent the standard joint-measurement uncertainty relation”, PhysicalReview A , 052113 (2004), arXiv:quant-ph / [ ] R. F. Werner, “The uncertainty relation for joint measurement of position and momentum”, Quantum Informationand Computation , 546–562 (2004), arXiv:quant-ph / [ ] M. 
Ozawa, “Uncertainty relations for joint measurements of noncommuting observables”, Physics Letters A ,367–374 (2004) (page 1). [ ] Y. Watanabe, T. Sagawa, and M. Ueda, “Uncertainty relation revisited from quantum estimation theory”, PhysicalReview A , 042121 (2011), arXiv:1010.3571 [ quant-ph ] (page 1). [ ] P. Busch, P. Lahti, and R. F. Werner, “Proof of Heisenberg’s Error-Disturbance Relation”, Physical Review Letters ,160405 (2013), arXiv:1306.1565 [ quant-ph ] (pages 1, 17). [ ] P. Busch, P. Lahti, and R. F. Werner, “Heisenberg uncertainty for qubit measurements”, Physical Review A , 012129(2014), arXiv:1311.0837 [ quant-ph ] (pages 1, 17). [ ] P. Busch, P. Lahti, and R. F. Werner, “Measurement uncertainty relations”, Journal of Mathematical Physics , 042111(2014), arXiv:1312.4392 [ quant-ph ] (pages 1, 17). [ ] V. B. Braginsky and F. Y. Khalili,

Quantum Measurement (Cambridge University Press, 1992) (page 1). [ ] H. Martens and W. M. de Muynck, “Disturbance, conservation laws and the uncertainty principle”, Journal of PhysicsA: Mathematical and General , 4887 (1992) (pages 1, 17). [ ] M. Ozawa, “Universally valid reformulation of the Heisenberg uncertainty principle on noise and disturbance inmeasurement”, Physical Review A , 042105 (2003), arXiv:quant-ph / [ ] Y. Watanabe and M. Ueda, “Quantum Estimation Theory of Error and Disturbance in Quantum Measurement”,(2011), arXiv:1106.2526 [ quant-ph ] (page 1). [ ] C. Branciard, “Error-tradeoff and error-disturbance relations for incompatible quantum measurements”, Proceedingsof the National Academy of Sciences , 6742–6747 (2013), arXiv:1304.2071 [ quant-ph ] (page 1). [ ] F. Buscemi, M. J. W. Hall, M. Ozawa, and M. M. Wilde, “Noise and Disturbance in Quantum Measurements: AnInformation-Theoretic Approach”, Physical Review Letters , 050401 (2014), arXiv:1310.6603 [ quant-ph ] (pages 1,15, 17). [ ] A. C. Ipsen, “Error-disturbance relations for ﬁnite dimensional systems”, (2013), arXiv:1311.0259 [ math-ph ] (pages 1,17). [ ] P. J. Coles and F. Furrer, “State-dependent approach to entropic measurement–disturbance relations”, Physics LettersA , 105–112 (2015), arXiv:1311.7637 [ quant-ph ] (pages 1, 17). [ ] M. Ozawa, “Uncertainty relations for noise and disturbance in generalized quantum measurements”, Annals ofPhysics , 350–416 (2004) (page 1). [ ] P. Busch, P. Lahti, and R. F. Werner, “Quantum root-mean-square error and measurement uncertainty relations”,Reviews of Modern Physics , 1261–1281 (2014), arXiv:1312.4393 [ quant-ph ] (page 1). [ ] D. M. Appleby, “Quantum Errors and Disturbances: Response to Busch, Lahti and Werner”, Entropy , 174 (2016),arXiv:1602.09002 [ quant-ph ] (page 1). [ ] M. Ozawa, “Disproving Heisenberg’s error-disturbance relation”, (2013), arXiv:1308.3540 [ quant-ph ] (page 1). [ ] D. Kretschmann, D. Schlingemann, and R. 
Werner, “The Information-Disturbance Tradeoff and the Continuity ofStinespring’s Representation”, IEEE Transactions on Information Theory , 1708–1717 (2008), arXiv:quant-ph / ] D. Kretschmann, D. Schlingemann, and R. F. Werner, “A continuity theorem for Stinespring’s dilation”, Journal ofFunctional Analysis , 1889–1904 (2008), arXiv:0710.2495 [ quant-ph ] (pages 2, 5). [ ] B.-G. Englert, “Fringe Visibility and Which-Way Information: An Inequality”, Physical Review Letters , 2154 (1996)(pages 2, 16, 25). [ ] J. M. Renes and J.-C. Boileau, “Conjectured Strong Complementary Information Tradeoff”, Physical Review Letters , 020402 (2009), arXiv:0806.3984 [ quant-ph ] (page 3). [ ] M. Tomamichel and R. Renner, “Uncertainty Relation for Smooth Entropies”, Physical Review Letters , 110506(2011), arXiv:1009.2015 [ quant-ph ] (pages 3, 15, 17). [ ] M. Tomamichel, C. C. W. Lim, N. Gisin, and R. Renner, “Tight ﬁnite-key analysis for quantum cryptography”, NatureCommunications , 634 (2012), arXiv:1103.4130 [ quant-ph ] (page 3). [ ] F. G. Lacerda, J. M. Renes, and R. Renner, “Classical leakage resilience from fault-tolerant quantum computation”,(2014), arXiv:1404.7516 [ quant-ph ] (pages 3, 15). [ ] K. Kraus,

States, Effects, and Operations: Fundamental Notions of Quantum Theory , Lecture notes in physics 190(Springer-Verlag, Berlin, 1983) (page 3). [ ] R. F. Werner, “Quantum Information Theory — an Invitation”, in

Quantum Information , Springer Tracts in ModernPhysics 173 (Springer Berlin Heidelberg, 2001), pp. 14–57, arXiv:quant-ph / [ ] M. M. Wolf, “Quantum Channels and Operations: A Guided Tour”, (2012) (pages 3, 21). [ ] C. Bény and F. Richter, “Algebraic approach to quantum theory: a ﬁnite-dimensional guide”, (2015), arXiv:1505.03106 [ quant-ph ] (page 3). [ ] A. Y. Kitaev, “Quantum computations: Algorithms and error correction”, Russian Mathematical Surveys , 1191–1249 (1997) (pages 4, 17). [ ] V. Paulsen,

Completely Bounded Maps and Operator Algebras , Vol. 78, Cambridge Studies in Advanced Mathematics(Cambridge University Press, Jan. 15, 2003), 320 pp. (page 4). [ ] A. Gilchrist, N. K. Langford, and M. A. Nielsen, “Distance measures to compare real and ideal quantum processes”,Physical Review A , 062310 (2005), arXiv:quant-ph / [ ] J. Watrous, “Semideﬁnite Programs for Completely Bounded Norms”, Theory of Computing , 217–238 (2009),arXiv:0901.4709 [ quant-ph ] (pages 4, 21). [ ] J. Watrous, “Simpler semideﬁnite programs for completely bounded norms”, Chicago Journal of Theoretical Com-puter Science , 8 (2013), arXiv:1207.5726 [ quant-ph ] (page 4). [ ] W. F. Stinespring, “Positive functions on C*-algebras”, Proceedings of the American Mathematical Society , 211–216(1955) (page 4). [ ] P. J. Coles and M. Piani, “Improved entropic uncertainty relations and information exclusion relations”, PhysicalReview A , 022112 (2014), arXiv:1307.4265 [ quant-ph ] (page 9). [ ] P. W. Shor and J. Preskill, “Simple Proof of Security of the BB84 Quantum Key Distribution Protocol”, Physical ReviewLetters , 441 (2000), arXiv:quant-ph / [ ] I. Devetak, “The private classical capacity and quantum capacity of a quantum channel”, IEEE Transactions on In-formation Theory , 44–55 (2005), arXiv:quant-ph / [ ] J. M. Renes, “Duality of privacy ampliﬁcation against quantum adversaries and data compression with quantum sideinformation”, Proceedings of the Royal Society A , 1604–1623 (2011), arXiv:1003.0703 [ quant-ph ] (page 15). [ ] J. M. Renes, “The Physics of Quantum Information: Complementarity, Uncertainty, and Entanglement”, Habilitation(TU Darmstadt, 2012), arXiv:1212.2379 [ quant-ph ] (page 15). [ ] J. M. Renes, “Uncertainty relations and approximate quantum error correction”, Physical Review A , 032314(2016), arXiv:1605.01420 [ quant-ph ] (page 16). [ ] P. J. Coles, J. Kaniewski, and S. 
Wehner, “Equivalence of wave–particle duality to entropic uncertainty”, Nature Communications , 5814 (2014), arXiv:1403.4687 [ quant-ph ] (page 16). [ ] P. J. Coles, “Entropic framework for wave-particle duality in multipath interferometers”, Physical Review A , 062111 (2016), arXiv:1512.09081 [ quant-ph ] (page 16). [ ] K. Korzekwa, D. Jennings, and T. Rudolph, “Operational constraints on state-dependent formulations of quantum error-disturbance trade-off relations”, Physical Review A , 052108 (2014), arXiv:1311.5506 [ quant-ph ] (page 16). [ ] A. Barchielli, M. Gregoratti, and A. Toigo, “Measurement uncertainty relations for discrete observables: Relative entropy formulation”, (2016), arXiv:1608.01986 [ math-ph ] (page 17). [ ] M. F. Sacchi, “Entanglement can enhance the distinguishability of entanglement-breaking channels”, Physical Review A , 014305 (2005), arXiv:quant-ph / [ ] V. P. Belavkin, “Optimal multiple quantum statistical hypothesis testing”, Stochastics , 315 (1975) (page 24). [ ] P. Hausladen and W. K. Wootters, “A ‘Pretty Good’ Measurement for Distinguishing Quantum States”, Journal of Modern Optics , 2385 (1994) (page 24).

A Entanglement improves the distinguishability of measurements

Here we give an example of two measurements whose distinguishability is improved by entanglement. Let $\mathcal{E}_1$ be a measurement in an arbitrarily chosen basis $|b_0\rangle$, $|b_1\rangle$, $|b_2\rangle$, and let $\mathcal{E}_2$ be the measurement in the basis given by $|\theta_0\rangle = \tfrac13(2|b_0\rangle + 2|b_1\rangle - |b_2\rangle)$, $|\theta_1\rangle = \tfrac13(-|b_0\rangle + 2|b_1\rangle + 2|b_2\rangle)$ and $|\theta_2\rangle = \tfrac13(2|b_0\rangle - |b_1\rangle + 2|b_2\rangle)$. Using $T_k = |b_k\rangle\langle b_k| - |\theta_k\rangle\langle\theta_k|$, the largest distinguishability to be had without entanglement is given by
\begin{align}
\delta'(\mathcal{E}_1, \mathcal{E}_2) &= \max_\varrho \frac12 \sum_{k=0}^{2} \big| \operatorname{Tr}[\varrho\, T_k] \big| \tag{52a}\\
&= \max_\varrho \max_{\{s_k = \pm 1\}} \frac12 \sum_{k=0}^{2} \operatorname{Tr}[s_k T_k \varrho] \tag{52b}\\
&= \max_{\{s_k = \pm 1\}} \frac12 \Big\| \sum_{k=0}^{2} s_k T_k \Big\|_\infty . \tag{52c}
\end{align}
Checking the eight combinations of $s_k$, one easily finds that the maximum value is $\sqrt{5}/3$. If we instead use the state
\[ \varrho = \frac16 \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix} \tag{53} \]
to define $\Psi = (\mathbb{1} \otimes \sqrt{\varrho})\, \Omega\, (\mathbb{1} \otimes \sqrt{\varrho})$ for $\Omega$ the projector onto $|\Omega\rangle = \sum_k |b_k\rangle \otimes |b_k\rangle$, then
\[ \delta(\mathcal{E}_1, \mathcal{E}_2) \ge \frac12 \sum_{k=0}^{2} \big\| \operatorname{Tr}_1[(T_k \otimes \mathbb{1}) \Psi] \big\|_1 . \tag{54} \]
Direct calculation shows that $\delta(\mathcal{E}_1, \mathcal{E}_2) \ge \sqrt{3}/2$. Thus, there exist projective measurements for which $\delta(\mathcal{E}_1, \mathcal{E}_2) > \delta'(\mathcal{E}_1, \mathcal{E}_2)$.

B Computing error and disturbance by convex optimization
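Before turning to the semidefinite programs, the two values quoted in Appendix A can be checked numerically. The following is a minimal sketch, not code from the paper. It assumes NumPy, identifies $|b_k\rangle$ with the standard basis of $\mathbb{C}^3$, takes the rotated basis to have coefficients $(2, 2, -1)/3$ and its cyclic shifts, and takes the test state to have $1/3$ on the diagonal and $-1/6$ off the diagonal. The helper names are hypothetical. It evaluates the sign-pattern maximization (52c) and the entangled lower bound (54).

```python
import itertools
import numpy as np

# Assumption: |b_k> is identified with the standard basis of C^3.
b = np.eye(3)
# Assumption: rows are the rotated basis vectors |theta_k>.
theta = np.array([[2, 2, -1], [-1, 2, 2], [2, -1, 2]]) / 3.0
assert np.allclose(theta @ theta.T, np.eye(3))  # the rows are orthonormal

# T_k = |b_k><b_k| - |theta_k><theta_k|
T = [np.outer(b[k], b[k]) - np.outer(theta[k], theta[k]) for k in range(3)]


def trace_norm(h):
    """Trace norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return float(np.sum(np.abs(np.linalg.eigvalsh(h))))


def operator_norm(h):
    """Operator norm of a Hermitian matrix: largest absolute eigenvalue."""
    return float(np.max(np.abs(np.linalg.eigvalsh(h))))


def delta_unentangled():
    """Eq. (52c): maximize over the eight sign patterns s_k = +-1."""
    best = 0.0
    for signs in itertools.product([1, -1], repeat=3):
        total = sum(s * t for s, t in zip(signs, T))
        best = max(best, 0.5 * operator_norm(total))
    return best


def delta_entangled_lower_bound(rho):
    """Eq. (54): Tr_1[(T_k x 1) Psi] equals sqrt(rho) T_k^T sqrt(rho)."""
    w, v = np.linalg.eigh(rho)
    sqrt_rho = (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T
    return 0.5 * sum(trace_norm(sqrt_rho @ t.T @ sqrt_rho) for t in T)


# Assumed test state of Eq. (53).
rho_53 = np.array([[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]) / 6.0

d_unent = delta_unentangled()                 # sqrt(5)/3, about 0.7454
d_ent = delta_entangled_lower_bound(rho_53)   # sqrt(3)/2, about 0.8660
print(d_unent, d_ent)
```

The state in (53) is proportional to the projector onto the subspace orthogonal to $(1,1,1)$, the one direction that has the same overlap with every vector of both bases and therefore carries no information for telling them apart.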

Here we detail how to compute the error and disturbance quantities via semidefinite programming and calculate these for the nonideal qubit X measurement example. Given a Hilbert space $\mathcal{H}$ with basis $\{|k\rangle\}_{k=1}^{d}$, define, just as above, $|\Omega\rangle = \sum_{k=1}^{d} |k\rangle \otimes |k\rangle \in \mathcal{H} \otimes \mathcal{H}$. Then, for any channel $\mathcal{E}$, let $C$ denote the Choi mapping of $\mathcal{E}^*$ to an unnormalized bipartite state,
\[ C(\mathcal{E}) := \mathcal{E}^* \otimes \mathcal{I}\,(|\Omega\rangle\langle\Omega|) \in \mathcal{B}(\mathcal{H}_B \otimes \mathcal{H}_A). \tag{55} \]
The action of the channel can be compactly expressed in terms of the Choi operator as $\mathcal{E}_{A\to B}(\Lambda_B) = \operatorname{Tr}_B[\Lambda_B\, C(\mathcal{E})_{BA}]$ or in the Schrödinger picture as $\mathcal{E}^*_{A\to B}(\varrho_A) = \operatorname{Tr}_A[C(\mathcal{E})_{BA}\, \varrho_A^{T}]$, where the transpose is taken in the basis defining $C$ (see, e.g. [ ]). The cb norm can then be expressed in primal and dual form as [ ]
\[ \begin{aligned} \|\mathcal{E}_{A\to B}\|_{\rm cb} = \underset{K,\,\varrho}{\text{maximum}}\quad & \operatorname{Tr}[C(\mathcal{E})_{BA} K_{BA}] \\ \text{subject to}\quad & K_{BA} - \mathbb{1}_B \otimes \varrho_A \le 0,\quad \operatorname{Tr}[\varrho_A] \le 1,\quad \varrho_A, K_{BA} \ge 0, \end{aligned} \tag{56} \]
\[ \begin{aligned} \phantom{\|\mathcal{E}_{A\to B}\|_{\rm cb}} = \underset{T,\,\lambda}{\text{minimum}}\quad & \lambda \\ \text{subject to}\quad & T_{BA} \ge C(\mathcal{E})_{BA},\quad \lambda \mathbb{1}_A - T_A \ge 0,\quad T_{BA}, \lambda \ge 0. \end{aligned} \tag{57} \]
At the optimum of (57), $\lambda = \|T_A\|_\infty$.

Footnote: For infinite-dimensional systems the Choi operator does not have such a nice form, though it might be possible to formulate the cb norm of Gaussian channels as a tractable optimization.

The additional optimizations involving $\mathcal{R}$ in the measures of error and disturbance are immediately compatible with the dual formulation in (57), and so these quantities can be cast as semidefinite programs. To start, consider the error in measuring X. With $Q_{XA} = C(\mathcal{Q}_X)$ and $E_{YBA} = C(\mathcal{E}_{A\to YB})$, we have
\[ \begin{aligned} \varepsilon_X(\mathcal{E}_{A\to YB}) = \underset{T,\,\lambda,\,R}{\text{minimum}}\quad & \lambda \\ \text{subject to}\quad & T_{XA} + \operatorname{Tr}_Y[R_{XY} E_{YA}] \ge Q_{XA},\quad \lambda \mathbb{1}_A - T_A \ge 0,\\ & R_Y = \mathbb{1}_Y,\quad \lambda, T_{XA}, R_{XY} \ge 0. \end{aligned} \tag{58} \]
Here we may take $T_{XA}$ to be a hybrid classical-quantum operator, classical on X, and of course $R_{XY}$ is classical on both systems. This is also the reason it is unnecessary to transpose Y in $\operatorname{Tr}_Y[R_{XY} E_{YA}]$. Further symmetries of $Q_{XA}$ and $E_{XA}$ can be quite helpful in simplifying the program, but we will not pursue this further here. The associated primal form is as follows.
\[ \begin{aligned} \varepsilon_X(\mathcal{E}_{A\to YB}) = \underset{K,\,\varrho,\,L}{\text{maximum}}\quad & \operatorname{Tr}[Q_{XA} K_{XA}] - \operatorname{Tr}[L_Y] \\ \text{subject to}\quad & K_{XA} - \mathbb{1}_X \otimes \varrho_A \le 0,\quad \operatorname{Tr}[\varrho_A] \le 1,\\ & \operatorname{Tr}_A[E_{YA} K_{XA}] - L_Y \otimes \mathbb{1}_X \le 0,\\ & \varrho_A, K_{XA} \ge 0,\quad L_Y = L_Y^*. \end{aligned} \tag{59} \]
In writing an equality we have assumed that the duality gap is zero. But this is easy enough to show using the Slater condition, namely by ensuring that the value of the minimization is finite and that there exists a strictly feasible set of maximization variables. The former holds because $\varepsilon_X$ is the infimum of the distinguishability, and hence $\varepsilon_X(\mathcal{E}) \ge 0$. Meanwhile, a strictly feasible set of variables in (59) is given by $K = k\mathbb{1}$, $\varrho = k\mathbb{1}$, and $L = kE_Y$ for $k < 1/\dim(A)$.

To formulate the measurement disturbance $\nu_Z(\mathcal{E}_{A\to YB})$ we are interested in $C(\mathcal{E}\mathcal{R}\mathcal{T}_Y\mathcal{Q}_Z)$, which can be expressed as a linear map on $R_{A'YB}$:
\begin{align}
C(\mathcal{E}\mathcal{R}\mathcal{T}_Y\mathcal{Q}_Z) &= \operatorname{Tr}_{A'YB}\big[Q_{ZA'}\, R^{T_{A'}}_{A'YB}\, E^{T_B}_{YBA}\big] \tag{60a}\\
&= \operatorname{Tr}_{A'YB}\big[R_{A'YB}\, Q^{T_{A'}}_{ZA'}\, E^{T_B}_{YBA}\big]. \tag{60b}
\end{align}
In the second step we have transposed the $A'$ system in the first. Then we have
\[ \begin{aligned} \nu_Z(\mathcal{E}_{A\to YB}) = \underset{T,\,\lambda,\,R}{\text{minimum}}\quad & \lambda \\ \text{subject to}\quad & T_{ZA} + \operatorname{Tr}_{A'YB}\big[R_{A'YB}\, Q^{T_{A'}}_{ZA'}\, E^{T_B}_{YBA}\big] \ge Q_{ZA},\quad \lambda \mathbb{1}_A - T_A \ge 0,\\ & R_{YB} = \mathbb{1}_{YB},\quad \lambda, T_{ZA}, R_{A'YB} \ge 0 \end{aligned} \tag{61} \]
\[ \begin{aligned} \phantom{\nu_Z(\mathcal{E}_{A\to YB})} = \underset{K,\,\varrho,\,L}{\text{maximum}}\quad & \operatorname{Tr}[Q_{ZA} K_{ZA}] - \operatorname{Tr}[L_{YB}] \\ \text{subject to}\quad & K_{ZA} - \mathbb{1}_Z \otimes \varrho_A \le 0,\quad \operatorname{Tr}[\varrho_A] \le 1,\\ & \operatorname{Tr}_{ZA}[Q_{ZA'} E_{YBA} K_{ZA}] - \mathbb{1}_{A'} \otimes L_{YB} \le 0,\\ & \varrho_A, K_{ZA} \ge 0,\quad L_{YB} = L_{YB}^*. \end{aligned} \tag{62} \]
Here we have absorbed the transposes over $A'$ and $B$ into $\mathbb{1}_{A'}$ and the definition of $L_{YB}$, since this does not affect Hermiticity or the value of the objective function. Strong duality is essentially the same as before: the minimization is finite and we can choose $K = k\mathbb{1}$ and $\varrho = k\mathbb{1}$. Then in the third constraint we have $\operatorname{Tr}_{ZA}[Q_{ZA'} E_{YBA} K_{ZA}] = k\,\mathbb{1}_{A'} \otimes E_{YB}$ since $\mathcal{Q}_Z$ is unital. Setting $L = kE_{YB}$ gives a strictly feasible set.

Finally, we come to the two preparation disturbance measures. The first is simply
\[ \begin{aligned} \eta_Z(\mathcal{E}_{A\to YB}) = \underset{T,\,\lambda,\,R}{\text{minimum}}\quad & \lambda \\ \text{subject to}\quad & T_{AZ} + \operatorname{Tr}_{YBA'}\big[R_{AYB}\, E^{T_B}_{YBA'}\, P^{T_{A'}}_{A'Z}\big] \ge P_{AZ},\quad \lambda \mathbb{1}_Z - T_Z \ge 0,\\ & R_{YB} = \mathbb{1}_{YB},\quad \lambda, T_{AZ}, R_{AYB} \ge 0 \end{aligned} \tag{63} \]
\[ \begin{aligned} \phantom{\eta_Z(\mathcal{E}_{A\to YB})} = \underset{K,\,\varrho,\,L}{\text{maximum}}\quad & \operatorname{Tr}[P_{AZ} K_{AZ}] - \operatorname{Tr}[L_{YB}] \\ \text{subject to}\quad & K_{AZ} - \mathbb{1}_A \otimes \varrho_Z \le 0,\quad \operatorname{Tr}[\varrho_Z] \le 1,\\ & \operatorname{Tr}_{A'Z}\big[E_{YBA'}\, P^{T_{A'}}_{A'Z}\, K_{AZ}\big] - \mathbb{1}_A \otimes L_{YB} \le 0,\\ & \varrho_Z, K_{AZ} \ge 0,\quad L_{YB} = L_{YB}^*. \end{aligned} \tag{64} \]
Here we have absorbed the transpose on $B$ into the definition of $L_{YB}$ since this does not affect Hermiticity or the value of the objective function. Strong duality holds as before, and also for the demerit measure, which reads
\[ \begin{aligned} \tfrac{d-1}{d} - \widehat{\eta}_Z(\mathcal{E}_{A\to YB}) = \underset{T,\,\lambda,\,\sigma}{\text{minimum}}\quad & \lambda \\ \text{subject to}\quad & T_{YBZ} + \sigma_{YB} \otimes \mathbb{1}_Z \ge \operatorname{Tr}_A\big[E_{YBA}\, P^{T_A}_{AZ}\big],\quad \lambda \mathbb{1}_Z - T_Z \ge 0,\\ & \operatorname{Tr}[\sigma_{YB}] = 1,\quad \lambda, T_{YBZ}, \sigma_{YB} \ge 0 \end{aligned} \tag{65} \]
\[ \begin{aligned} \phantom{\tfrac{d-1}{d} - \widehat{\eta}_Z(\mathcal{E}_{A\to YB})} = \underset{K,\,\varrho,\,\mu}{\text{maximum}}\quad & \operatorname{Tr}\big[E_{YBA}\, P^{T_A}_{AZ}\, K_{YBZ}\big] - \mu \\ \text{subject to}\quad & K_{YBZ} - \mathbb{1}_{YB} \otimes \varrho_Z \le 0,\quad \operatorname{Tr}[\varrho_Z] \le 1,\\ & K_{YB} - \mu \mathbb{1}_{YB} \le 0,\quad \varrho_Z, K_{YBZ} \ge 0,\quad \mu \in \mathbb{R}. \end{aligned} \tag{66} \]

Now let us consider the particular example described in the main text, a suboptimal X measurement. Suppose we use $|\varphi_x\rangle$ from the ideal X measurement to define the Choi operator. After a bit of calculation, one finds that the Choi operator $E_{YBA}$ of $\mathcal{E}_{YB|A}$ is given by
\[ E_{YBA} = |b_0\rangle\langle b_0|_Y \otimes |\Psi\rangle\langle\Psi|_{BA} + |b_1\rangle\langle b_1|_Y \otimes (\sigma_z \otimes \sigma_z) |\Psi\rangle\langle\Psi|_{BA} (\sigma_z \otimes \sigma_z), \tag{67} \]
where $|\Psi\rangle = \cos\tfrac{\theta}{2}\, |\varphi_0\rangle \otimes |\varphi_0\rangle + \sin\tfrac{\theta}{2}\, |\varphi_1\rangle \otimes |\varphi_1\rangle$. Tracing out $B$ gives the Choi operator of just the measurement result Y, $E_{YA} = \sum_x |b_x\rangle\langle b_x|_Y \otimes \Lambda_x$, with $\Lambda_x = \tfrac12(\mathbb{1} + (-1)^x \cos\theta\, \sigma_x)$.

To compute the measurement error $\varepsilon_X(\mathcal{E})$, suppose that no recovery operation is applied, i.e. the outcome Y is treated as X. Then we can work with $E_{XA}$ and dispense with $R$ so that the third constraint in (58) is satisfied. To satisfy the first constraint, choose $T_{XA}$ to be the positive part of $Q_{XA} - E_{XA}$. This gives $T_{XA} = \tfrac12(1 - \cos\theta) \sum_x |b_x\rangle\langle b_x| \otimes |\varphi_x\rangle\langle\varphi_x|$; consequently, $T_A = \tfrac12(1 - \cos\theta)\,\mathbb{1}_A$ and therefore $\varepsilon_X(\mathcal{E}) \le \tfrac12(1 - \cos\theta)$. On the other hand, $K_{XA} = \tfrac12 Q_{XA}$ and $\varrho_A = \tfrac12 \mathbb{1}_A$ satisfy the first two constraints in (59). The last constraint involves the quantity $\operatorname{Tr}_A[E_{YA} K_{XA}] = \sum_{x,y} |b_y\rangle\langle b_y|_Y \otimes |b_x\rangle\langle b_x|_X\, \tfrac14\big(1 + (-1)^{x+y} \cos\theta\big)$ and can therefore be satisfied by choosing $L_Y = \tfrac14(1 + \cos\theta)\,\mathbb{1}_Y$. Evaluating the objective function gives $\varepsilon_X(\mathcal{E}) \ge \tfrac12(1 - \cos\theta)$. Note that the choice of $K_{XA}$ corresponds to the unentangled test of randomly inputting $|\varphi_x\rangle$ and checking that the result is $x$. We could have anticipated that unentangled tests would be sufficient in this case, since the optimal and actual measurements are both diagonal in the $\sigma_x$ basis: any input state can be freely dephased in this basis, thus removing any entanglement.

Next, consider the measurement disturbance $\nu_Z(\mathcal{E})$.
Proceeding as with measurement error, suppose that no recovery operation is applied, so that the output $B$ is just regarded as $A' \simeq A$ and the third constraint in (61) is trivially satisfied. For the first constraint we need only the operator $\operatorname{Tr}_{YA'}\big[Q_{ZA'}\, E^{T_{A'}}_{YA'A}\big]$, and after some calculation we find that it equals $\sum_z |b_z\rangle\langle b_z| \otimes \Gamma_z$ with $\Gamma_z = \tfrac12(\mathbb{1} + (-1)^z \sin\theta\, \sigma_z)$. Thus, the optimization is just like that of $\varepsilon_X(\mathcal{E})$, but with $\cos\theta$ replaced by $\sin\theta$. Hence $\nu_Z(\mathcal{E}) \le \tfrac12(1 - \sin\theta)$. Showing the other inequality from the maximization form (62) also proceeds as before, starting with $K_{ZA} = \tfrac12 Q_{ZA}$ and $\varrho_A = \tfrac12 \mathbb{1}_A$. For the third constraint a bit of calculation shows
\[ \operatorname{Tr}_{ZA}[Q_{ZA'} E_{YBA} K_{ZA}] = \frac14 \sum_{x,z} |b_z\rangle\langle b_z|_{A'} \otimes |b_x\rangle\langle b_x|_Y \otimes \big(\sigma_z^x \sigma_x^z |\psi\rangle\langle\psi| \sigma_x^z \sigma_z^x\big)_B, \tag{68} \]
with $|\psi\rangle = \tfrac{1}{\sqrt2}\big(\sqrt{1 + \sin\theta}\, |\theta_0\rangle + \sqrt{1 - \sin\theta}\, |\theta_1\rangle\big)$. Choosing
\[ L_{YB} = \frac18 \sum_x |b_x\rangle\langle b_x|_Y \otimes \big((1 + \sin\theta)\,\mathbb{1} + (-1)^x \cos\theta\, \sigma_x\big)_B \tag{69} \]
satisfies the constraints, and the objective function becomes $\tfrac12(1 - \sin\theta)$. As with $\varepsilon_X(\mathcal{E})$, entangled inputs do not increase the distinguishability in this particular case.

A trivial recovery map also optimizes $\eta_Z(\mathcal{E})$. To see this, set $K_{AZ} = \tfrac12 P_{AZ}$ and $\varrho_Z = \tfrac12 \mathbb{1}_Z$. Then in the third constraint of (64) we have $\operatorname{Tr}_{A'Z}\big[E_{YBA'}\, P^{T_{A'}}_{A'Z}\, K_{AZ}\big]$, which is precisely the same as (68) with $A'$ replaced by $A$. Hence, if we choose $L_{YB}$ as in (69), we obtain the lower bound $\eta_Z(\mathcal{E}) \ge \tfrac12(1 - \sin\theta)$. To establish optimality, suppose $\mathcal{R}$ does nothing but discard the Y system. In the minimization (63) we then have $\operatorname{Tr}_{YA'}\big[E_{YAA'}\, P^{T_{A'}}_{A'Z}\big]$, which is the same as $\operatorname{Tr}_{YA'}\big[Q_{ZA'}\, E^{T_{A'}}_{YA'A}\big]$ from $\nu_Z(\mathcal{E})$. Proceeding as there, we find the matching upper bound.

Finally, consider $\widehat{\eta}_Z(\mathcal{E})$.
Here there are two possible outputs of $\mathcal{P}_Z \mathcal{E}$, call them $\xi_0$ and $\xi_1$. It is not difficult to show that for arbitrary $\xi_z$ the distinguishability is precisely $\widehat{\eta}_Z(\mathcal{E}) = \tfrac12\big(1 - \delta(\xi_0, \xi_1)\big)$. On the one hand, we can simply pick the output of $\mathcal{C}$ to be $\xi = \tfrac12(\xi_0 + \xi_1)$. Then, with $T$ in (65) the positive part of $\sum_z |z\rangle\langle z|_Z \otimes (\xi_z - \xi)$, the objective function becomes $\tfrac12\delta(\xi_0, \xi_1)$. On the other hand, in (66) we can choose $K_{YBZ} = \tfrac12 |0\rangle\langle 0|_Z \otimes \Lambda_{YB} + \tfrac12 |1\rangle\langle 1|_Z \otimes (\mathbb{1} - \Lambda)_{YB}$, for $\Lambda$ the projector onto the nonnegative part of $\xi_0 - \xi_1$. Then $\mu = \tfrac12$ and $\varrho = \tfrac12\mathbb{1}$ are feasible and lead again to the same objective function. In this particular case the two states are $\xi_1 = \sigma_x \xi_0 \sigma_x$ and $\xi_0 = \tfrac12 \sum_x |b_x\rangle\langle b_x|_Y \otimes \sigma_z^x |\psi\rangle\langle\psi| \sigma_z^x$, consistent with (68), which yields $\delta(\xi_0, \xi_1) = \sin\theta$ and hence $\widehat{\eta}_Z(\mathcal{E}) = \tfrac12(1 - \sin\theta)$.

C Counterexample channel
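As a numerical cross-check of the qubit example that closes Appendix B, the sketch below rebuilds the nonideal X measurement from two Kraus operators and evaluates the two quantities the text relies on: the failure probability of the unentangled calibration test, and the trace distance $\delta(\xi_0, \xi_1)$ between the outputs for the two Z eigenstates. This is an illustration under stated assumptions, not the authors' code: it assumes NumPy, works in the X eigenbasis $\{|\varphi_0\rangle, |\varphi_1\rangle\}$, and takes the Kraus operators to be $\mathrm{diag}(\cos\tfrac{\theta}{2}, \sin\tfrac{\theta}{2})$ and $\mathrm{diag}(\sin\tfrac{\theta}{2}, \cos\tfrac{\theta}{2})$, which is the form implied by the Choi operator (67). All function names are hypothetical.

```python
import numpy as np


def kraus_operators(theta):
    """Assumed Kraus operators of the nonideal X measurement, written in
    the X eigenbasis; outcome y corresponds to kraus_operators(theta)[y]."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return [np.diag([c, s]), np.diag([s, c])]


def calibration_error(theta):
    """Probability that a uniformly random X eigenstate |phi_x> is
    reported as y != x (the unentangled test described in the text)."""
    povm = [k.conj().T @ k for k in kraus_operators(theta)]
    success = 0.5 * sum(povm[x][x, x] for x in range(2))
    return float(1.0 - success)


def output_state(theta, z):
    """xi_z: joint state of the classical outcome Y and the quantum output B
    for the Z eigenstate z, stored as a block-diagonal 4x4 matrix."""
    psi = np.array([1.0, (-1.0) ** z]) / np.sqrt(2)  # Z eigenstate in the X basis
    xi = np.zeros((4, 4))
    for y, k in enumerate(kraus_operators(theta)):
        out = k @ psi  # unnormalized conditional state for outcome y
        xi[2 * y:2 * y + 2, 2 * y:2 * y + 2] = np.outer(out, out.conj())
    return xi


def trace_distance(a, b):
    """delta(a, b) = (1/2) * trace norm of the Hermitian difference a - b."""
    return float(0.5 * np.sum(np.abs(np.linalg.eigvalsh(a - b))))


theta = 0.7
err_x = calibration_error(theta)
dist = trace_distance(output_state(theta, 0), output_state(theta, 1))
print(err_x, 0.5 * (1 - np.cos(theta)))  # the two numbers agree
print(dist, np.sin(theta))               # the two numbers agree
```

For $0 \le \theta \le \pi/2$ the first pair reproduces $\tfrac12(1 - \cos\theta)$ and the second reproduces $\delta(\xi_0, \xi_1) = \sin\theta$, from which $\widehat{\eta}_Z = \tfrac12(1 - \sin\theta)$ follows.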

Here we present the calculations involved in §6.1. Let $|\xi_z\rangle_{BD} = \sum_y \sqrt{p_{yz}}\,|y\rangle_B |y\rangle_D$. Then the isometry is just
\begin{equation}
V = \sum_z |\xi_z\rangle\, |z\rangle_C \langle z|_A. \tag{70}
\end{equation}
Observe that the action on $|\tilde{x}\rangle$ states leads to symmetric output in $BC$:
\begin{align}
V|\tilde{x}\rangle &= \tfrac{1}{\sqrt{d}} \sum_z \omega^{xz}\, V|z\rangle \tag{71a} \\
&= \tfrac{1}{\sqrt{d}} \sum_z \omega^{xz}\, |\xi_z\rangle_{BD} |z\rangle_C \tag{71b} \\
&= Z_C^x\, \tfrac{1}{\sqrt{d}} \sum_z |\xi_z\rangle_{BD} |z\rangle_C. \tag{71c}
\end{align}
Therefore, the probability of incorrectly identifying any particular input state is the same as any other, and we can consider the case that the input $x$ value is chosen uniformly at random. We can further simplify the $BC$ output by defining $p_y = \tfrac{1}{d} \sum_z p_{yz}$ and
\begin{equation}
|\eta_y\rangle = \tfrac{1}{\sqrt{d}} \sum_z \sqrt{p_{yz}/p_y}\; |z\rangle, \tag{72}
\end{equation}
which is a normalized state on $\mathcal{H}_C$ for each $y$. Then we have
\begin{equation}
V|\tilde{x}\rangle = Z_C^x \sum_y \sqrt{p_y}\; |\eta_y\rangle_C |y\rangle_B |y\rangle_D. \tag{73}
\end{equation}
Ignoring the $D$ system will produce a classical-quantum state, with system $B$ recording the classical value $y$, which occurs with probability $p_y$, and $C$ the quantum state $Z^x|\eta_y\rangle$. The optimal measurement therefore has elements $\Lambda_x$ of the form $\Lambda_x = \sum_y |y\rangle\langle y|_B \otimes (\Gamma_{x,y})_C$ for some set of POVMs $\{\Gamma_{x,y}\}_y$. In every sector of fixed $y$ value, the measurement has to distinguish between a set of pure states occurring with equal probabilities. Therefore, by a result going back to Belavkin, the optimal measurement is the so-called “pretty good measurement” [68, 69]. This has measurement elements $\Gamma_{x,y}$ which project onto the orthonormal states $|\mu_{x,y}\rangle = S^{-1/2} Z^x |\eta_y\rangle$, where $S = \sum_x Z^x |\eta_y\rangle\langle\eta_y| Z^{-x}$. It is easy to work out that $S = \sum_z (p_{yz}/p_y)\, |z\rangle\langle z|$, and thus $|\mu_{x,y}\rangle = |\tilde{x}\rangle$ for all $y$. Hence, we can in fact dispense with the $B$ system altogether, since the particular value of $y$ does not alter the optimal measurement. The average guessing probability is thus
\begin{align}
p_{\mathrm{guess}} &= \tfrac{1}{d} \sum_{x,y} p_y \left| \langle\tilde{x}| Z^x |\eta_y\rangle \right|^2 \tag{74a} \\
&= \sum_y p_y \left| \langle\tilde{0}|\eta_y\rangle \right|^2 \tag{74b} \\
&= \tfrac{1}{d^2} \sum_y \Big( \sum_z \sqrt{p_{yz}} \Big)^2, \tag{74c}
\end{align}
as intended.

D Englert's complementarity relation
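As a numerical sanity check of the Appendix C result (74c), the following minimal sketch builds the pretty good measurement for the states $Z^x|\eta_y\rangle$ and compares its success probability with the closed form. It assumes NumPy; the function name `pgm_vs_closed_form`, the array layout `p[y, z]` for $p_{yz}$, and the random test distribution are illustrative choices, not part of the original text.

```python
import numpy as np

def pgm_vs_closed_form(p):
    """p[y, z] stands for p_{yz}; each column (fixed z) is assumed to sum to 1."""
    n_y, d = p.shape
    omega = np.exp(2j * np.pi / d)
    phases = omega ** np.outer(np.arange(d), np.arange(d))  # phases[x, z] = omega^{xz}
    fourier = phases / np.sqrt(d)                            # row x is |x~>
    p_y = p.sum(axis=1) / d                                  # p_y = (1/d) sum_z p_{yz}
    eta = np.sqrt(p / p_y[:, None]) / np.sqrt(d)             # row y is |eta_y>, eq. (72)

    p_guess, basis_matches = 0.0, True
    for y in range(n_y):
        states = phases * eta[y]                 # row x is Z^x |eta_y>
        S = states.T @ states.conj()             # S = sum_x Z^x |eta_y><eta_y| Z^{-x}
        w, U = np.linalg.eigh(S)                 # S is Hermitian and positive definite
        S_inv_sqrt = (U / np.sqrt(w)) @ U.conj().T
        mu = states @ S_inv_sqrt.T               # row x is S^{-1/2} Z^x |eta_y>
        basis_matches &= bool(np.allclose(mu, fourier))
        overlaps = np.sum(mu.conj() * states, axis=1)        # <mu_x| Z^x |eta_y>
        p_guess += p_y[y] * np.mean(np.abs(overlaps) ** 2)   # uniform prior on x

    closed_form = np.sum(np.sqrt(p).sum(axis=1) ** 2) / d**2  # eq. (74c)
    return p_guess, closed_form, basis_matches

# Hypothetical example: d = 3 inputs z, four values of y, strictly positive p_{yz}.
rng = np.random.default_rng(0)
p_example = rng.random((4, 3)) + 0.1
p_example /= p_example.sum(axis=0, keepdims=True)
pgm_value, closed_form, basis_matches = pgm_vs_closed_form(p_example)
```

Under these assumptions `pgm_value` and `closed_form` should agree to numerical precision, and `basis_matches` reports whether the measurement vectors coincide with the conjugate basis $|\tilde{x}\rangle$ in every $y$ sector. The sketch only evaluates the pretty good measurement; its optimality is the cited result and is not checked here.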

Here we describe Englert's setup in our formalism and establish (49). He considers a Mach-Zehnder interferometer with a relative phase shift between the two arms and additional which-way detectors in each arm. To the two possible paths inside the interferometer we may associate the (orthogonal) eigenstates $|\vartheta_z\rangle$ of an observable $Z$, with $z \in \{0, 1\}$. For simplicity, we assume $Z$ has eigenvalues $(-1)^z$. The action of a relative $\varphi$ phase shift is described by the unitary $U_{\mathrm{PS}} = \sum_{z=0}^{1} e^{iz\varphi} |\vartheta_z\rangle\langle\vartheta_z|$. It will prove convenient to choose $\varphi = 0$. For detector states $|\gamma_z\rangle$, the detector corresponds to the isometry $U_{\mathrm{WW}} = \sum_{z=0}^{1} |\vartheta_z\rangle\langle\vartheta_z|_Q \otimes |\gamma_z\rangle_A$, where $A$ denotes the ancilla and $Q$ the system itself, which Englert terms a “quanton”.

Ignoring the phase shifts associated with reflection, the output modes of a symmetric (50/50) beamsplitter are related to the input modes by the unitary $U_{\mathrm{BS}} = \sum_{z=0}^{1} |\vartheta_z\rangle\langle\varphi_z|$, with $|\varphi_x\rangle = \tfrac{1}{\sqrt{2}} \sum_z (-1)^{xz} |\vartheta_z\rangle$ for $x \in \{0, 1\}$. We may associate these states with the observable $X$, also taking eigenvalues $(-1)^x$. Observe that all three complementarity measures are . The entire Mach-Zehnder device can be described by the isometry
\begin{align}
U_{\mathrm{MZ}} &= U_{\mathrm{BS}}\, U_{\mathrm{PS}}\, U_{\mathrm{WW}}\, U_{\mathrm{BS}} \tag{75a} \\
&= \sum_{x,z=0}^{1} e^{ix\varphi}\, |\vartheta_z\rangle\langle\varphi_z|\vartheta_x\rangle\langle\varphi_x|_Q \otimes |\gamma_x\rangle_A \tag{75b} \\
&= \sum_{x=0}^{1} e^{ix\varphi}\, |\varphi_x\rangle\langle\varphi_x|_Q \otimes |\gamma_x\rangle_A. \tag{75c}
\end{align}
When the ancilla is subsequently measured so as to extract information about the path, we may regard the whole operation as an apparatus $\mathcal{E}_{\mathrm{MZ}}$ with one quantum and one classical output.

The available “which-way” information, associated with particle-like behavior of $Q$, is characterized by the distinguishability $D := \delta(\gamma_0, \gamma_1)$. Given the particular form of $U$ in (75), we may set $\sin 2\theta = \langle\gamma_0|\gamma_1\rangle$ for $\theta \in \mathbb{R}$ without loss of generality; $D$ is then $\cos 2\theta$. This amounts to defining $|\gamma_k\rangle = \cos\theta\, |k\rangle + \sin\theta\, |k+1\rangle$, where the states $\{|k\rangle\}_{k=0}^{1}$ form an orthonormal basis and arithmetic inside the ket is modulo two. Thus, $\mathcal{E}_{\mathrm{MZ}}$ with $\varphi = 0$ is the $X$ measurement $\mathcal{E}$ considered in §3. We shall see momentarily that the choice $\varphi = 0$ is justified, so that $\varepsilon_X(\mathcal{E}_{\mathrm{MZ}}) = \tfrac{1}{2}(1 - D)$ as claimed.

Meanwhile, the fringe visibility $V$ is defined as the difference in probability (or population) in the two output modes of the interferometer, maximized over the choice of input state. Since $Z = |\vartheta_0\rangle\langle\vartheta_0| - |\vartheta_1\rangle\langle\vartheta_1|$, this is just
\begin{equation}
V = \max_{\varrho} \left| \mathrm{Tr}\left[ (Z_Q \otimes \mathbb{1}_A)\, U_{\mathrm{MZ}}\, \varrho\, U_{\mathrm{MZ}}^* \right] \right|. \tag{76}
\end{equation}
A straightforward calculation yields $U_{\mathrm{MZ}}^* (Z_Q \otimes \mathbb{1}_A) U_{\mathrm{MZ}} = \sin 2\theta\, (\cos\varphi\, Z + i \sin\varphi\, XZ)$. It can be verified that $(\cos\varphi\, Z + i \sin\varphi\, XZ)$ has eigenvalues $\pm 1$, and therefore $V = \sin 2\theta$. Thus, $V^2 + D^2 = 1$ ([ ]). Note that $\varphi$ does not appear in the visibility itself, justifying our choice of $\varphi = 0$ ($\nu_Z(\mathcal{E}_{\mathrm{MZ}}) = \eta_Z(\mathcal{E}_{\mathrm{MZ}}) = \widehat{\eta}_Z(\mathcal{E}_{\mathrm{MZ}}) = \tfrac{1}{2}(1 - V)$).
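The chain (75a)–(75c) and the visibility calculation can be cross-checked numerically. The sketch below is only an illustration under stated assumptions: it uses NumPy, represents $|\vartheta_z\rangle$ as the standard basis of $\mathbb{C}^2$, orders tensor factors as $Q \otimes A$, and picks arbitrary test values for $\theta$ and the phase; the helper name `mach_zehnder` is hypothetical.

```python
import numpy as np

def mach_zehnder(theta, phase):
    """Build U_MZ = U_BS U_PS U_WW U_BS as a 4x2 isometry Q -> Q (x) A."""
    vartheta = np.eye(2)                                  # row z is |vartheta_z>
    phi = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # row x is |phi_x>
    ket = np.eye(2)
    # |gamma_k> = cos(theta)|k> + sin(theta)|k+1>, arithmetic modulo two
    gamma = [np.cos(theta) * ket[k] + np.sin(theta) * ket[(k + 1) % 2]
             for k in range(2)]
    U_BS = sum(np.outer(vartheta[z], phi[z]) for z in range(2))
    U_PS = np.diag([1.0, np.exp(1j * phase)])
    U_WW = sum(np.kron(np.outer(vartheta[z], vartheta[z]), gamma[z].reshape(2, 1))
               for z in range(2))
    I_A = np.eye(2)
    U_MZ = np.kron(U_BS, I_A) @ np.kron(U_PS, I_A) @ U_WW @ U_BS
    return U_MZ, gamma, phi

theta, phase = 0.3, 0.7                                   # arbitrary test values
U_MZ, gamma, phi = mach_zehnder(theta, phase)

# Right-hand side of (75c)
U_75c = sum(np.exp(1j * x * phase)
            * np.kron(np.outer(phi[x], phi[x]), gamma[x].reshape(2, 1))
            for x in range(2))

Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
heisenberg = U_MZ.conj().T @ np.kron(Z, np.eye(2)) @ U_MZ
expected = np.sin(2 * theta) * (np.cos(phase) * Z + 1j * np.sin(phase) * X @ Z)

# Visibility (76): largest absolute eigenvalue of the Heisenberg-picture observable
V = np.max(np.abs(np.linalg.eigvalsh(heisenberg)))
# Distinguishability: trace distance between the two detector states
diff = np.outer(gamma[0], gamma[0]) - np.outer(gamma[1], gamma[1])
D = 0.5 * np.sum(np.abs(np.linalg.eigvalsh(diff)))
```

For $\theta \in (0, \pi/4)$, as chosen here, `V` and `D` should reproduce $\sin 2\theta$ and $\cos 2\theta$, so that `V**2 + D**2` equals one up to rounding, independently of the phase.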