Operational applications of the diamond norm and related measures in quantifying the non-physicality of quantum maps

Abstract

Although quantum channels underlie the dynamics of quantum states, maps which are not physical channels -- that is, not completely positive -- can often be encountered in settings such as entanglement detection, non-Markovian quantum dynamics, or error mitigation. We introduce an operational approach to the quantitative study of the non-physicality of linear maps based on different ways to approximate a given linear map with quantum channels. Our first measure directly quantifies the cost of simulating a given map using physically implementable quantum channels, shifting the difficulty in simulating unphysical dynamics onto the task of simulating linear combinations of quantum states. Our second measure benchmarks the quantitative advantages that a non-completely-positive map can provide in discrimination-based quantum games. Notably, we show that for any trace-preserving map, the quantities both reduce to a fundamental distance measure: the diamond norm, thus endowing this norm with new operational meanings in the characterisation of linear maps. We discuss applications of our results to structural physical approximations of positive maps, quantification of non-Markovianity, and bounding the cost of error mitigation.


Bartosz Regula, Ryuji Takagi, and Mile Gu

Nanyang Quantum Hub, School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore
Complexity Institute, Nanyang Technological University, 637371, Singapore
Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, 117543, Singapore

Although quantum channels underlie the dynamics of quantum states, maps which are not physical channels (that is, not completely positive) can often be encountered in settings such as entanglement detection, non-Markovian quantum dynamics, or error mitigation. We introduce an operational approach to the quantitative study of the non-physicality of linear maps based on different ways to approximate a given linear map with quantum channels. Our first measure directly quantifies the cost of simulating a given map using physically implementable quantum channels, shifting the difficulty in simulating unphysical dynamics onto the task of simulating linear combinations of quantum states. Our second measure benchmarks the quantitative advantages that a non-completely-positive map can provide in discrimination-based quantum games. Notably, we show that for any trace-preserving map, the quantities both reduce to a fundamental distance measure: the diamond norm, thus endowing this norm with new operational meanings in the characterisation of linear maps. We discuss applications of our results to structural physical approximations of positive maps, quantification of non-Markovianity, and bounding the cost of error mitigation. As a consequence of our findings, we also show that a measure of physical implementability of maps recently considered in [Jiang et al., arXiv:2012.10959] actually equals the diamond norm.

1. Introduction

It is one of the fundamental properties of quantum mechanics that the evolution of quantum states is described by linear maps which are completely positive and trace preserving (CPTP), stemming from the unitary dynamics enforced on a larger Hilbert space [1]. However, in several different settings of practical importance, various applications of quantum dynamics which are not CPTP can be encountered. This motivates the study of such transformations, and in particular a precise understanding of how they can be compared with and approximated by physical quantum channels.

One important application of non-CPTP maps is in entanglement detection, where positive but not completely positive maps can serve as entanglement witnesses [2]. A bipartite state $\rho$ is entangled if and only if there exists a positive map $\Phi$ such that $\mathrm{id} \otimes \Phi(\rho)$ is no longer a positive operator, and therefore such a map can reveal the correlations of $\rho$. This approach has constituted one of the most important ways of detecting entanglement [3, 4], but its experimental implementation encounters an obstacle: how to realise the action of an unphysical linear map in practice? This question prompted the introduction of structural physical approximations (SPA) of non-CPTP maps [5], which aim to enable the physical evaluation of general maps by designing suitable approximations in terms of quantum channels and using them to infer properties of the original map [6–8].

Another setting in which non-CPTP maps are encountered is that of non-Markovian quantum dynamics or, generally, in the reduced dynamics of correlated systems. Specifically, when an open quantum system shares some initial correlations with its environment, the evolution of the composite system-environment state can correspond to a non-CPTP map when looking only at the dynamics of the reduced state of the system [9–12].
Although the physical interpretation of this is a matter of debate and alternative ways to understand such dynamics have been proposed [13–15], it can nevertheless be useful to study such non-CPTP evolutions directly to gain an understanding of reduced dynamics of open quantum systems.

Even broader types of unphysical quantum dynamics can be found in the areas of quantum error correction and error mitigation [16–18]. This is because, in a broad sense, both of these settings are concerned with the following problem: if an unknown system has undergone a noisy evolution as $\rho \mapsto \Theta(\rho)$, how can we reconstruct the original state as closely as possible, that is, how to implement a map $\Phi$ such that $\Phi \circ \Theta(\rho) \approx \rho$? Such inverse operations typically cease to be valid quantum channels, and so it becomes necessary to devise approaches to implement them in practice with the use of physical operations.

In this work, we introduce a general quantitative framework for the characterisation of such unphysical maps by approximating them with quantum channels. We then explicitly give the considered measures operational meaning by connecting them with the performance of practical tasks, including the cost of simulating a given map with quantum channels. Notably, we show that all of the considered measures reduce to the same quantity when the given linear map is trace preserving: they all equal the diamond norm [19, 20], a fundamental computational tool that serves as a measure of quantum channel distance and finds many uses in the practical characterisation of quantum processes [21]. This endows the diamond norm with new meanings in the operational tasks that we consider, and furthermore allows a number of new connections to be established.
On the one hand, many known results in the quantification of the diamond norm can be carried over to the setting of our work, and on the other hand, we can use our characterisation to provide new insight into the computation and applications of the diamond norm.

Our approach is based on the notion of robustness measures [22]: inspired by recent applications of such quantities in the study of general resource theories of channels [23–33], we use them to quantify the amount of noise needed to turn a given map into a quantum channel. Such measures allow for several different generalisations to the setting of linear maps, motivating us to study and compare these definitions. The robustness-based approaches can be understood as different ways of designing optimal decompositions of linear maps in terms of quantum channels, and so they generalise the standard structural physical approximations [5]. We express the measures as semidefinite programs and establish various relations and bounds between them.

We apply our first measure in the task of simulating the action of an unphysical map with valid channels, accomplished by allowing the use of ancillary systems which can consist of linear combinations of quantum states. We show that the cost of such simulation is given exactly by the robustness measure. Furthermore, answering the question of whether any unphysical map can provide measurable operational advantages over quantum channels, we show this to be the case in the setting of discrimination-based quantum games, establishing our second robustness measure as the exact quantifier of this advantage.

Our results also generalise and shed light on the very recent findings of Ref. [33], which considered a similar framework for approximating trace-preserving maps using a robustness- and quasiprobability-based approach. In particular, we show that the measure considered in [33] is actually an alternative expression for the diamond norm of a map, rather than a new quantity.

The paper is structured as follows.
In Sec. 2, we introduce the notions of robustness measures and show how they can be applied to non-CPTP linear maps. We establish precise connections with the diamond norm in Sec. 3. We then proceed to show that the robustness measures (and hence the diamond norm) play a crucial role in quantifying the cost of simulating linear maps (Sec. 4) as well as in understanding the advantages a non-CPTP map could provide in input-output quantum games (Sec. 5). We proceed to establish a number of bounds for the measures in Sec. 6. Finally, we discuss the applications of our approach, comparisons with other methods, and explicitly show how the measures can be evaluated for some representative examples in Sec. 7.

2. Robustness of non-CP maps

Let $A$ and $B$ denote two finite-dimensional quantum systems of dimension $d_A$ and $d_B$, respectively. We will use $\mathbb{L}(A)$ to denote the set of all linear operators, $\mathbb{H}(A)$ to denote the set of all Hermitian operators, and $\mathbb{D}(A)$ to denote all density operators acting on the Hilbert space of system $A$. We use $\langle X, Y \rangle = \mathrm{Tr}(X^\dagger Y)$ for the Hilbert-Schmidt inner product.

Among all linear maps from $\mathbb{L}(A)$ to $\mathbb{L}(B)$, we will be primarily concerned with Hermiticity-preserving maps $\mathbb{H}(A,B)$, which are defined as maps such that $\Phi(X) \in \mathbb{H}(B) \;\forall X \in \mathbb{H}(A)$. A map is called positive if $\Phi(X) \geq 0 \;\forall X \geq 0$, completely positive (CP) if $\mathrm{id}_A \otimes \Phi$ is positive, trace preserving if $\mathrm{Tr}\,\Phi(X) = \mathrm{Tr}\,X \;\forall X$, and trace non-increasing if $\mathrm{Tr}\,\Phi(X) \leq \mathrm{Tr}\,X \;\forall X \geq 0$. To each map $\Phi \in \mathbb{H}(A,B)$ we will associate the Choi operator $J_\Phi = (\mathrm{id}_A \otimes \Phi)[|\Omega\rangle\langle\Omega|] \in \mathbb{H}(A \otimes B)$, where $|\Omega\rangle = \sum_i |ii\rangle$. Importantly, a map is Hermiticity-preserving iff $J_\Phi = J_\Phi^\dagger$, CP iff $J_\Phi \geq 0$, and trace preserving iff $\mathrm{Tr}_B J_\Phi = \mathbb{1}_A$ (see e.g. [34]). Let $\mathrm{CPTNI}(A,B)$ denote the set of completely positive and trace–non-increasing maps in $\mathbb{H}(A,B)$, and analogously $\mathrm{CPTP}(A,B)$ the set of completely positive and trace-preserving maps. For simplicity of notation, we will often simply write CPTP for $\mathrm{CPTP}(A,B)$ (and analogously for other sets) when the spaces in consideration are not relevant.

In order to quantify how much a given map deviates from the set of CPTP maps, we will employ the concept of robustness measures [22]. It will be insightful to first review how such measures are defined for quantum states. Given a set of interest $\mathcal{F} \subseteq \mathbb{D}$, commonly chosen to be the set of free states in a given resource theory, one asks: how much noise from a set $\mathcal{N} \subseteq \mathbb{D}$ has to be added to a state $\rho$ in order to make it a free state? This has the intuitive interpretation of measuring how robust the resources contained in the state $\rho$ are with respect to noise from the set $\mathcal{N}$. Specifically, we write

$$
r_{\mathcal{N}}(\rho) := \min\left\{ \lambda \;\middle|\; \frac{\rho + \lambda\omega}{1+\lambda} \in \mathcal{F},\; \omega \in \mathcal{N} \right\}. \tag{1}
$$

The most common choices of the noise set $\mathcal{N}$ are: $\mathcal{N} = \mathbb{D}$, in which case we obtain the so-called generalised robustness, equivalently given by

$$
r_{\mathbb{D}}(\rho) = \min\left\{ \lambda \;\middle|\; \rho \leq (1+\lambda)\sigma,\; \sigma \in \mathcal{F} \right\}, \tag{2}
$$

and the choice $\mathcal{N} = \mathcal{F}$, which corresponds to the standard robustness $r_{\mathcal{F}}$. The latter quantity is directly related to the so-called base norm $\|\rho\|_{\mathcal{F}}$ of the set $\mathcal{F}$, which can be alternatively understood as an optimisation of quasiprobability distributions over the set $\mathcal{F}$:

$$
\begin{aligned}
2 r_{\mathcal{F}}(\rho) + 1 = \|\rho\|_{\mathcal{F}} &:= \min\left\{ \lambda_+ + \lambda_- \;\middle|\; \rho + \lambda_- \sigma_- = \lambda_+ \sigma_+,\; \sigma_\pm \in \mathcal{F} \right\} \\
&= \min\left\{ \sum_i |\lambda_i| \;\middle|\; \rho = \sum_i \lambda_i \sigma_i,\; \sigma_i \in \mathcal{F} \right\}
\end{aligned}
\tag{3}
$$

where the last equality is a simple consequence of the convexity of $\mathcal{F}$.
The definitions straightforwardly extend to unnormalised operators $X$; in such cases, it is important to notice that the trace of $X$ will come into play, and the base norm will equal $\|X\|_{\mathcal{F}} = 2 r_{\mathcal{F}}(X) + \mathrm{Tr}\,X$.

The case of interest to us will be where the set of free states $\mathcal{F}$ contains all physical quantum states, $\mathcal{F} = \mathbb{D}$, in which case the different notions of the robustness are equal and one has

$$
2 r_{\mathbb{D}}(X) + \mathrm{Tr}\,X = \|X\|_1, \tag{4}
$$

that is, the base norm is precisely the trace norm (Schatten 1-norm) $\|\cdot\|_1$.

Robustness of linear maps. A generalisation of these concepts to the case of linear maps can be done in several different ways. Firstly, one has to note that it does not suffice to consider trace-preserving maps, since not every Hermiticity-preserving map $\Phi$ can be written as $\lambda_+ \Lambda_+ - \lambda_- \Lambda_-$ for CPTP $\Lambda_\pm$. To circumvent this, we will employ the set of completely positive and trace–non-increasing maps, which can be understood as probabilistic implementations of quantum channels. Importantly, robustness-based definitions which were all equal in the case of states might not be equal any more. We therefore need to explicitly consider three different types of the robustness with respect to the sets CPTP or CPTNI:

$$
\begin{aligned}
R(\Phi) &:= \min\left\{ \lambda \;\middle|\; \frac{\Phi + \lambda\Lambda}{1+\lambda} \in \mathrm{CPTNI},\; \Lambda \in \mathrm{CPTNI} \right\}, \\
R'(\Phi) &:= \min\left\{ \lambda \;\middle|\; J_\Phi \leq (1+\lambda) J_\Lambda,\; \Lambda \in \mathrm{CPTNI} \right\} = \min\left\{ \lambda \;\middle|\; J_\Phi \leq (1+\lambda) J_\Lambda,\; \Lambda \in \mathrm{CPTP} \right\}, \\
R''(\Phi) &:= \min\left\{ \lambda \;\middle|\; \Phi + \lambda\Lambda \in \mathrm{CP},\; \Lambda \in \mathrm{CPTNI} \right\} = \min\left\{ \lambda \;\middle|\; \Phi + \lambda\Lambda \in \mathrm{CP},\; \Lambda \in \mathrm{CPTP} \right\},
\end{aligned}
\tag{5}
$$

as well as a generalised notion of a base norm with respect to the set of completely positive and trace–non-increasing maps:

$$
\|\Phi\|_{\blacklozenge} := \min\left\{ \lambda_+ + \lambda_- \;\middle|\; \Phi = \lambda_+ \Lambda_+ - \lambda_- \Lambda_-,\; \Lambda_\pm \in \mathrm{CPTNI} \right\}. \tag{6}
$$

In the expressions for $R'$ and $R''$, we made use of the fact that one can, without loss of generality, restrict the optimisation to CPTP maps; this follows since for any $\Lambda \in \mathrm{CPTNI}$ such that $\mathrm{Tr}_B J_\Lambda \leq \mathbb{1}_A$ we can define the map $\Lambda'$ by $J_{\Lambda'} = J_\Lambda + \frac{C}{d_B} \otimes \mathbb{1}_B$ where $C = \mathbb{1}_A - \mathrm{Tr}_B J_\Lambda \geq 0$, so that $\Lambda' \in \mathrm{CPTP}$ and achieves the same value of the objective function. We note that closely related definitions were recently also considered in Ref. [33] for the case of trace-preserving maps.

All of the quantities above are well-defined and take a finite value for any map $\Phi \in \mathbb{H}(A,B)$, as we shall see explicitly by establishing general upper bounds in Sec. 6. The robustness $R(\Phi)$ can be seen to be an upper bound for all other quantities: the decomposition of $\Phi$ in the definition of $R(\Phi)$ can be used to construct feasible solutions for the other robustness measures, and analogously the robustness measures give feasible decompositions for the base norm. It is a priori unclear whether one can find general conditions under which the inequalities between the different measures are tight. We shall shortly see that equality indeed holds for all trace-preserving linear maps.

All of the introduced quantities can be computed as semidefinite programs, which follows since the constraints for a map to be CPTNI (or CPTP) are linear matrix inequalities. This means that the measures can be evaluated efficiently (in the dimensions of the map) using numerical software. The equivalent dual forms of the problems, which can also provide some insight into the differences between the different definitions of the robustness measures, will be reported shortly in Sec. 6.
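The Choi-operator criteria of this section can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `choi` and `classify`, and the two example maps (the qubit transpose and a depolarising channel), are illustrative choices rather than anything taken from the text.

```python
import numpy as np

def choi(phi, d_in):
    """Choi operator J = sum_{ij} |i><j| (x) phi(|i><j|), using the
    unnormalised vector |Omega> = sum_i |ii>."""
    blocks = []
    for i in range(d_in):
        row = []
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1.0
            row.append(phi(e_ij))
        blocks.append(row)
    return np.block(blocks)  # block (i, j) of J is phi(|i><j|)

def classify(J, d_in, d_out, tol=1e-9):
    """Test the three Choi-operator criteria quoted in Sec. 2."""
    herm = bool(np.allclose(J, J.conj().T, atol=tol))
    cp = bool(herm and np.linalg.eigvalsh(J).min() >= -tol)
    # Partial trace over the output system B.
    tr_B = np.einsum('ibjb->ij', J.reshape(d_in, d_out, d_in, d_out))
    tp = bool(np.allclose(tr_B, np.eye(d_in), atol=tol))
    return {"hermiticity_preserving": herm,
            "completely_positive": cp,
            "trace_preserving": tp}

def transpose(X):
    # Positive but not completely positive.
    return X.T

def depolarise(X):
    # A valid quantum channel: mixes X with the maximally mixed state.
    return 0.5 * X + 0.5 * np.trace(X) * np.eye(2) / 2

J_T = choi(transpose, 2)
J_D = choi(depolarise, 2)
print(classify(J_T, 2, 2))
print(classify(J_D, 2, 2))
```

The transpose map passes the Hermiticity and trace-preservation tests but fails the positivity test on $J_\Phi$, which is the kind of map the robustness measures are designed to quantify.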

3. Relation with the diamond norm

For any Hermiticity-preserving map $\Phi$, the diamond norm (completely bounded trace norm) is defined as [19, 34]

$$
\|\Phi\|_\diamond = \max_{\rho \in \mathbb{D}(A \otimes A)} \left\| \mathrm{id}_A \otimes \Phi(\rho) \right\|_1, \tag{7}
$$

where, in a slight abuse of notation, we use $\mathbb{D}(A \otimes A)$ to denote the states acting on a bipartite Hilbert space composed of the space $A$ and another space isomorphic thereto.

The diamond norm finds use as a fundamental measure of distance between quantum channels, mirroring the operational role of the trace distance in measuring distances between quantum states [19, 34, 35]. It is one of the most widely employed figures of merit in comparing quantum channels and benchmarking channel manipulation protocols. Its quantification and characterisation is therefore crucial to an effective understanding of the properties of quantum processes.

We will first introduce the following lemma, which establishes a useful formulation of the diamond norm for Hermiticity-preserving maps, closely related to a more general approach for generalised quantum channels considered previously by Jenčová [36].

Lemma 1.

For any Hermiticity-preserving map $\Phi$, it holds that

\begin{align}
\|\Phi\|_\diamond &= \min\left\{ \mu \;\middle|\; J_\Phi = M_+ - M_-,\; M_\pm \geq 0,\; \mathrm{Tr}_B(M_+ + M_-) \leq \mu\,\mathbb{1}_A \right\} \tag{8} \\
&= \min\left\{ \mu \;\middle|\; J_\Phi = M_+ - M_-,\; M_\pm \geq 0,\; \mathrm{Tr}_B(M_+ + M_-) = \mu\,\mathbb{1}_A \right\}. \tag{9}
\end{align}

Proof.

Let $\|\Phi\|'_\diamond$ denote the quantity in (9). We first notice that the constraint $\mathrm{Tr}_B(M_+ + M_-) = \mu\,\mathbb{1}_A$ can be relaxed to $\mathrm{Tr}_B(M_+ + M_-) \leq \mu\,\mathbb{1}_A$ without loss of generality. This follows since for any feasible $M_\pm$ such that $\mathrm{Tr}_B(M_+ + M_-) + C = \mu\,\mathbb{1}_A$ with $C \geq 0$, one can define feasible solutions $M'_\pm = M_\pm + \frac{C}{2 d_B} \otimes \mathbb{1}_B$ which satisfy $\mathrm{Tr}_B(M'_+ + M'_-) = \mu\,\mathbb{1}_A$ and thus achieve the same optimal value. We thus have

$$
\begin{aligned}
\|\Phi\|'_\diamond &= \min\left\{ \mu \;\middle|\; J_\Phi = M_+ - M_-,\; M_\pm \geq 0,\; \mathrm{Tr}_B(M_+ + M_-) \leq \mu\,\mathbb{1}_A \right\} \\
&= \min\left\{ \left\| \mathrm{Tr}_B(M_+ + M_-) \right\|_\infty \;\middle|\; J_\Phi = M_+ - M_-,\; M_\pm \geq 0 \right\}.
\end{aligned}
\tag{10}
$$

Taking the Lagrange dual of the above (see Appendix A) gives

$$
\begin{aligned}
\|\Phi\|'_\diamond &= \max\left\{ \langle J_\Phi, W \rangle \;\middle|\; -\rho \otimes \mathbb{1}_B \leq W \leq \rho \otimes \mathbb{1}_B,\; \rho \in \mathbb{D}(A) \right\} \\
&= \max\left\{ \left\langle W, (\sqrt{\rho} \otimes \mathbb{1}_B)\, J_\Phi\, (\sqrt{\rho} \otimes \mathbb{1}_B) \right\rangle \;\middle|\; \rho \in \mathbb{D}(A),\; -\mathbb{1}_A \otimes \mathbb{1}_B \leq W \leq \mathbb{1}_A \otimes \mathbb{1}_B \right\} \\
&= \max_{\rho \in \mathbb{D}(A)} \left\| (\sqrt{\rho} \otimes \mathbb{1}_B)\, J_\Phi\, (\sqrt{\rho} \otimes \mathbb{1}_B) \right\|_1,
\end{aligned}
\tag{11}
$$

where in the second line we made the change of variables $W \mapsto (\sqrt{\rho^{-1}} \otimes \mathbb{1}_B)\, W\, (\sqrt{\rho^{-1}} \otimes \mathbb{1}_B)$, assuming without loss of generality that $\rho$ is full rank. The fact that this equals the diamond norm of $\Phi$ can be deduced from the results of Ref. [37] already; for completeness, we will show this explicitly. Recalling that $J_\Phi = (\mathrm{id}_A \otimes \Phi)[|\Omega\rangle\langle\Omega|]$ with $|\Omega\rangle$ being the unnormalised maximally entangled state, and using the fact that $\Phi$ is only acting on one of the subsystems, we can write

$$
\begin{aligned}
\|\Phi\|'_\diamond &= \max_{\rho \in \mathbb{D}(A)} \left\| (\sqrt{\rho} \otimes \mathbb{1}_B)\, (\mathrm{id}_A \otimes \Phi)[|\Omega\rangle\langle\Omega|]\, (\sqrt{\rho} \otimes \mathbb{1}_B) \right\|_1 \\
&= \max_{\rho \in \mathbb{D}(A)} \left\| (\mathrm{id}_A \otimes \Phi)\left[ (\sqrt{\rho} \otimes \mathbb{1}_A)\, |\Omega\rangle\langle\Omega|\, (\sqrt{\rho} \otimes \mathbb{1}_A) \right] \right\|_1 \\
&= \max_{\psi \in \mathbb{D}(A \otimes A)} \left\| \mathrm{id}_A \otimes \Phi(\psi) \right\|_1 \\
&= \max_{\rho \in \mathbb{D}(A \otimes A)} \left\| \mathrm{id}_A \otimes \Phi(\rho) \right\|_1 \\
&= \|\Phi\|_\diamond
\end{aligned}
\tag{12}
$$

where we used that any pure state $\psi \in \mathbb{D}(A \otimes A)$ can be written as $(\sqrt{\rho} \otimes \mathbb{1}_A)\, |\Omega\rangle\langle\Omega|\, (\sqrt{\rho} \otimes \mathbb{1}_A)$ for a suitable choice of $\rho \in \mathbb{D}(A)$, with $|\psi\rangle$ constituting the canonical purification of $\rho$.

We note that the form of the diamond norm presented in Lemma 1 constitutes a major simplification of the semidefinite programs for the diamond norm of general linear maps originally derived in Refs. [37, 38].

As an immediate consequence of the above result, we can use the characterisation of the diamond norm in Eq. (8) to construct valid feasible solutions for the base norm and robustness measures in Eqs. (5)–(6), and vice versa.

Corollary 2.

For any Hermiticity-preserving map $\Phi$, it holds that

$$
\begin{aligned}
2\|\Phi\|_\diamond &\geq \|\Phi\|_{\blacklozenge} \geq \|\Phi\|_\diamond, \\
\|\Phi\|_\diamond + 1 &\geq R'(\Phi) \geq \frac{1}{2}\left( \|\Phi\|_\diamond - 2 + \lambda_{\min}(\mathrm{Tr}_B J_\Phi) \right), \\
\|\Phi\|_\diamond &\geq R''(\Phi) \geq \frac{1}{2}\left( \|\Phi\|_\diamond - \lambda_{\max}(\mathrm{Tr}_B J_\Phi) \right),
\end{aligned}
\tag{13}
$$

where $\lambda_{\min}$ and $\lambda_{\max}$ denote, respectively, the smallest and the largest eigenvalues.

Equality between the different quantities can be shown for all trace-preserving maps, directly relating the diamond norm with our considered measures.

Theorem 3.

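For any map $\Phi \in \mathbb{H}(A,B)$ which is trace preserving or, more generally, proportional to a trace-preserving map in the sense that $\mathrm{Tr}_B J_\Phi \propto \mathbb{1}_A$, it holds that

$$
\|\Phi\|_\diamond = \|\Phi\|_{\blacklozenge} = \min\left\{ \mu_+ + \mu_- \;\middle|\; J_\Phi = \mu_+ J_{\Lambda_+} - \mu_- J_{\Lambda_-},\; \Lambda_\pm \in \mathrm{CPTP},\; \mu_\pm \in \mathbb{R}_+ \right\}. \tag{14}
$$

As a consequence, for trace-preserving maps $\Phi$ we have

$$
R(\Phi) = R'(\Phi) = R''(\Phi) = \frac{\|\Phi\|_{\blacklozenge} - 1}{2} = \frac{\|\Phi\|_\diamond - 1}{2}. \tag{15}
$$

Proof.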

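From the fact that $\mathrm{Tr}_B J_\Phi = t\,\mathbb{1}_A$ for some $t \in \mathbb{R}$, it is easy to see that every decomposition of the form $J_\Phi = M_+ - M_-$, $M_\pm \geq 0$, $\mathrm{Tr}_B(M_+ + M_-) = \mu\,\mathbb{1}_A$ as in Lemma 1 has to satisfy

$$
\mathrm{Tr}_B M_+ = \frac{\mu + t}{2}\,\mathbb{1}_A, \qquad \mathrm{Tr}_B M_- = \frac{\mu - t}{2}\,\mathbb{1}_A. \tag{16}
$$

This implies that we can equivalently write

$$
\|\Phi\|_\diamond = \min\left\{ \mu_+ + \mu_- \;\middle|\; J_\Phi = \mu_+ M_+ - \mu_- M_-,\; M_\pm \geq 0,\; \mathrm{Tr}_B M_+ = \mathrm{Tr}_B M_- = \mathbb{1}_A \right\} \tag{17}
$$

which is precisely Eq. (14). Notice then that any such decomposition gives a valid feasible solution for $\|\Phi\|_{\blacklozenge}$, together with Cor. 2 yielding equality between the two norms.

When $\Phi$ is trace preserving ($t = 1$), taking the partial trace over $B$ of any such decomposition forces $\mu_+ - \mu_- = 1$, and so

$$
\begin{aligned}
\|\Phi\|_\diamond &= \min\left\{ \mu_+ + \mu_- \;\middle|\; J_\Phi = \mu_+ J_{\Lambda_+} - \mu_- J_{\Lambda_-},\; \Lambda_\pm \in \mathrm{CPTP} \right\} \\
&= \min\left\{ 2\mu_+ - 1 \;\middle|\; J_\Phi = \mu_+ J_{\Lambda_+} - \mu_- J_{\Lambda_-},\; \Lambda_\pm \in \mathrm{CPTP} \right\} \\
&= \min\left\{ 2\mu_- + 1 \;\middle|\; J_\Phi = \mu_+ J_{\Lambda_+} - \mu_- J_{\Lambda_-},\; \Lambda_\pm \in \mathrm{CPTP} \right\}.
\end{aligned}
\tag{18}
$$

The equality $\|\Phi\|_\diamond = 2R'(\Phi) + 1 = 2R''(\Phi) + 1$ then follows since, on the one hand, any such decomposition yields feasible solutions for $R'$ and $R''$ in Eq. (5), and on the other hand, any decomposition for the robustness measures is necessarily of the form in Eq. (18). Equality with the robustness $R(\Phi)$ follows by noting again that any feasible decomposition in Eq. (18) gives a feasible decomposition for $R(\Phi)$, and on the other hand using the relation $R(\Phi) \geq R'(\Phi)$ which holds by definition.

Remark.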

The expression in Eq. (14) is valid also in the case of trace-annihilating maps ($\mathrm{Tr}_B J_\Phi = 0$), which covers in particular the diamond norm distance $\|\Lambda - \Lambda'\|_\diamond$ between two quantum channels. A simplified expression for this problem appeared previously in [37] and was explicitly expressed as a robustness-type measure in [26].

We note that the quantity $\|\cdot\|_{\blacklozenge}$, applied to trace-preserving maps, was recently considered in Refs. [33] and [39]. It was not noticed in these works that this is simply the diamond norm, and hence many results shown in [33] (e.g. the multiplicativity with respect to tensor product, unitary invariance, bounds with trace norm $\|J_\Phi\|_1$, monotonicity under the action of superchannels, and some explicit expressions) follow directly from known properties of the diamond norm [20, 37, 40, 41].

We will later see that this equivalence does not extend to maps which are not trace preserving (or proportional thereto), and indeed we can have $\|\Phi\|_{\blacklozenge} = 2\|\Phi\|_\diamond$ in the extreme case.
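As an illustration of how Lemma 1 and Eq. (7) bracket the diamond norm, the sketch below treats the transpose map on a $d$-dimensional system, whose Choi operator is the swap operator. It assumes NumPy, and all function names are hypothetical. The upper bound uses the spectral positive and negative parts of $J_\Phi$ as one feasible point of Eq. (8); this choice is not optimal in general, it merely happens to be tight here. The lower bound evaluates Eq. (7) at the maximally entangled input.

```python
import numpy as np

def swap_operator(d):
    """Choi operator of the transpose map on C^d:
    J_T = sum_{ij} |i><j| (x) |j><i|, i.e. the swap operator."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[i * d + j, j * d + i] = 1.0
    return S

def partial_trace_B(M, d_A, d_B):
    """Trace out the second (output) system."""
    return np.einsum('ibjb->ij', M.reshape(d_A, d_B, d_A, d_B))

def trace_norm(M):
    """Trace norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return np.abs(np.linalg.eigvalsh(M)).sum()

def lemma1_upper_bound(J, d_A, d_B):
    """Feasible value of Eq. (8), using M_+ and M_- equal to the positive
    and negative spectral parts of J (one feasible choice among many)."""
    vals, vecs = np.linalg.eigh(J)
    M_plus = (vecs * np.clip(vals, 0, None)) @ vecs.conj().T
    M_minus = (vecs * np.clip(-vals, 0, None)) @ vecs.conj().T
    marginal = partial_trace_B(M_plus + M_minus, d_A, d_B)
    # Smallest mu such that Tr_B(M_+ + M_-) <= mu * identity.
    return np.linalg.eigvalsh(marginal).max()

def max_entangled_lower_bound(J, d_A):
    """Value of Eq. (7) at the normalised maximally entangled input,
    for which (id (x) Phi)(input) = J / d_A."""
    return trace_norm(J) / d_A

for d in (2, 3, 4):
    J = swap_operator(d)
    upper = lemma1_upper_bound(J, d, d)
    lower = max_entangled_lower_bound(J, d)
    print(d, lower, upper)
```

Since the two bounds coincide, the diamond norm of the transpose map equals $d$, and Theorem 3 then gives the robustness $(d-1)/2$ for this map.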

4. Quantifying simulation cost

Since the quantum dynamics which can be realised in practice are restricted to completely positive maps, a relevant question then becomes: how can one simulate the action of a non-CPTP map on a quantum state when only CPTP maps are available to us?

A similar question was recently asked in Ref. [33], where the authors applied quasiprobability sampling methods [17, 30, 42] to the desired operation $\Phi$. We take a different approach here and instead allow for the use of an ancillary system $X$, which can be an affine combination of quantum states, in order to simulate the action of the map $\Phi$ as a CPTP map $\Lambda$ acting jointly on the input quantum state and the ancilla $X$. The "non-physicality" of the given map $\Phi$ is then pushed into the system $X$, allowing for the overall transformation $\Lambda$ to be a valid quantum channel.

The motivation for this approach is that the task of simulating the action of the non-CPTP map $\Phi$ is effectively replaced with the simulation of a unit-trace Hermitian operator $X$, which could be significantly easier to realise in practice, especially since we will see that the dimension of the ancilla can be taken to be arbitrarily small. Standard quasiprobability-based approaches such as the ones employed in [17, 30, 33] aim to estimate the expectation value $\mathrm{Tr}[\Phi(\rho) A]$, where $\Phi$ is a non-CPTP map and $A$ an observable, by decomposing the given map as $\Phi = \sum_i \lambda_i \Lambda_i$ with $\lambda_i \in \mathbb{R}$ and $\Lambda_i \in \mathrm{CPTP}$ (or CPTNI). The expectation value $\mathrm{Tr}[\Phi(\rho) A]$ is then estimated by evaluating $\mathrm{Tr}[\Lambda_i(\rho) A]$ and appropriately sampling from the output distributions with probabilities determined by the coefficients $\lambda_i$ [17, 42]. In practice, this means that we have to repeatedly realise each operation $\Lambda_i$, which requires the implementation of a different quantum circuit for each operation. Consider, on the other hand, a situation in which the dynamics is fixed as some map $\Lambda \in \mathrm{CPTNI}$, and we only need to vary the input states.
This can be achieved by writing $\Phi(\cdot) = \Lambda(\cdot \otimes X)$, where we can write any Hermitian operator in a quasiprobability representation as $X = \sum_i \mu_i \rho_i$. The task of sampling from the output distribution is then reduced to feeding in the different states $\rho_i$ into the circuit which realises the fixed operation $\Lambda$, thus greatly simplifying the implementation.

As mentioned in Sec. 2, a natural quantifier of how much a given operator $X \in \mathbb{H}(A)$ with $\mathrm{Tr}(X) = 1$ deviates from a valid quantum state is its trace norm $\|X\|_1$. Indeed, this quantity can be given an explicit interpretation in terms of the optimal cost of a quasiprobability-based simulation of the measurement statistics of $X$ [42]. We then define the simulation cost of a map as the minimal amount of such "non-physicality" of $X$ needed to simulate the action of the map:

$$
S(\Phi) := \min\left\{ \|X\|_1 \;\middle|\; \Lambda(\cdot \otimes X) = \Phi(\cdot),\; \mathrm{Tr}(X) = 1,\; \Lambda \in \mathrm{CPTNI} \right\}. \tag{19}
$$

We then have the following.

Theorem 4.

For any map $\Phi \in \mathbb{H}(A,B)$, it holds that

$$
S(\Phi) = 2R(\Phi) + 1. \tag{20}
$$

In the case of a trace-preserving $\Phi$, we have in particular that

$$
S(\Phi) = \|\Phi\|_\diamond \tag{21}
$$

and an optimal $\Lambda$ for the simulation can be chosen to satisfy $\Lambda \in \mathrm{CPTP}$.

Proof.

Let $\Lambda_\pm \in \mathrm{CPTNI}(A,B)$ be maps that achieve an optimal decomposition for $\Phi$ such that $\Phi = (1 + R(\Phi))\Lambda_+ - R(\Phi)\Lambda_-$. Now, consider a non-positive Hermitian operator $X = \mu_+ \omega_+ - \mu_- \omega_- \in \mathbb{H}(A')$ where $\omega_\pm$ are orthogonal quantum states and $\mathrm{Tr}[X] = \mu_+ - \mu_- = 1$. We do not impose any additional conditions on the size of the ancillary system $A'$, meaning that its Hilbert space can be chosen to be an arbitrary space of dimension at least 2. Defining the projector onto the positive part of $X$ as $P_+$, we then consider the map defined by the action on a basis $|i\rangle\langle j| \otimes |k\rangle\langle l| \in \mathbb{L}(A \otimes A')$ as follows:

$$
\begin{aligned}
\Lambda(|i\rangle\langle j| \otimes |k\rangle\langle l|) :=&\; \mathrm{Tr}\left[ \frac{P_+}{\mu_+} |k\rangle\langle l| \right] \Phi(|i\rangle\langle j|) + \mathrm{Tr}\left[ \left( \mathbb{1} - \frac{P_+}{\mu_+} \right) |k\rangle\langle l| \right] \Lambda_-(|i\rangle\langle j|) \\
=&\; (1 + R(\Phi))\, \mathrm{Tr}\left[ \frac{P_+}{\mu_+} |k\rangle\langle l| \right] \Lambda_+(|i\rangle\langle j|) \\
&+ \mathrm{Tr}\left[ \left( \mathbb{1} - (1 + R(\Phi)) \frac{P_+}{\mu_+} \right) |k\rangle\langle l| \right] \Lambda_-(|i\rangle\langle j|).
\end{aligned}
\tag{22}
$$

It is easy to check that $\Lambda(\rho \otimes X) = \Phi(\rho)$. Now, we will show that as long as the condition

$$
(1 + R(\Phi))\, \mathrm{Tr}\left[ \frac{P_+}{\mu_+} \rho \right] \leq 1 \quad \forall \rho \tag{23}
$$

is satisfied, then $\Lambda$ is also CPTNI. This can be seen by observing first that (23) gives

$$
0 \leq \tilde{P} := (1 + R(\Phi)) \frac{P_+}{\mu_+} \leq \mathbb{1}, \tag{24}
$$

which implies that $\tilde{P}$ is a valid POVM element. Note that we can rewrite (22) as

\begin{align}
\Lambda &= \Lambda_+ \otimes T_{\tilde{P}} + \Lambda_- \otimes T_{\mathbb{1} - \tilde{P}} \tag{25} \\
&= (\mathrm{id} \otimes T_{\tilde{P}})(\Lambda_+ \otimes \mathrm{id}) + (\mathrm{id} \otimes T_{\mathbb{1} - \tilde{P}})(\Lambda_- \otimes \mathrm{id}) \tag{26}
\end{align}

where $T_P(\cdot) := \mathrm{Tr}[P\,\cdot\,]$. Since $\Lambda_+$, $\Lambda_-$, and $T_{\tilde{P}}$, $T_{\mathbb{1} - \tilde{P}}$ are all completely positive, $\Lambda$ is also completely positive. Since (23) is always satisfied when

$$
\mu_+ \geq 1 + R(\Phi), \tag{27}
$$

an operator $X$ with $\|X\|_1 = \mu_+ + \mu_- = 1 + 2R(\Phi)$ achieves the desired implementation.

The converse part can be proven by extending an argument in Ref. [23] to our setting. Suppose a non-quantum resource $X = \mu_+ \omega_+ - \mu_- \omega_-$ and CPTNI map $\Lambda$ realise the simulation of $\Phi$, i.e. $\Phi(\cdot) = \Lambda(\cdot \otimes X)$. Also, define $\Lambda_+(\cdot) := \Lambda(\cdot \otimes \omega_+)$. Then, by linearity of $\Lambda$, we get

\begin{align}
\Lambda_+(\cdot) &= \frac{1}{\mu_+} \Lambda(\cdot \otimes X) + \frac{1}{\mu_+} \Lambda(\cdot \otimes \mu_- \omega_-) \tag{28} \\
&= \frac{1}{\mu_+} \Phi(\cdot) + \frac{\mu_-}{\mu_+} \Lambda(\cdot \otimes \omega_-). \tag{29}
\end{align}

Since $\Lambda_+$ and $\Lambda_- := \Lambda(\cdot \otimes \omega_-)$ are CPTNI maps, this is a valid linear decomposition of $\Phi$ into two CPTNI maps, providing an upper bound for its robustness as $R(\Phi) \leq \mu_- = \mu_+ - 1$. This gives the desired lower bound for the simulation cost as $\mu_+ + \mu_- \geq 1 + 2R(\Phi)$.

An interesting quantitative equivalence emerges between our approach and the method of Ref. [33]. In that work, the authors showed that the minimal overhead required to employ quasiprobability-based simulation techniques [17, 42] to estimate $\mathrm{Tr}[\Phi(\rho) A]$ for a trace-preserving map $\Phi$ scales with the norm $\|\Phi\|_{\blacklozenge}$ (see also the discussion in Sec. 7.2). Since we know from Thm. 3 that

$$
2R(\Phi) + 1 = \|\Phi\|_{\blacklozenge} = \|\Phi\|_\diamond \tag{30}
$$

holds for any trace-preserving map, the quantitative cost of the simulation scheme is actually the same as our method, despite the seemingly different approaches employed. In fact, our Thm. 4 shows that it is sufficient to consider decompositions of $\Phi$ as

$$
\Phi(\cdot) = \mu_+ \Lambda(\cdot \otimes \omega_+) - \mu_- \Lambda(\cdot \otimes \omega_-) \tag{31}
$$

where $\Lambda$ and $X = \mu_+ \omega_+ - \mu_- \omega_-$ are as constructed in our protocol. This means that, despite the significant practical simplification obtained by fixing the dynamics of the simulator as $\Lambda$ and optimising over the quasiprobability representations of $X$ instead, our simulation method does not sacrifice any performance, and the optimal sampling overhead cost of the more direct approach of [33] cannot be any better.

We note that Theorem 4 gives a general way of reducing the task of simulating the action of a linear map $\Phi$ to simulating an affine combination of states in the form of the operator $X$. This could provide methods for the simulation of dynamics even beyond quasiprobability-based approaches like the one discussed above, although the specifics of this will depend on the given simulation method.

State injection and resource simulation. The setting considered here is closely related to state injection methods which generalise quantum teleportation [43] and find use e.g. in the resource theories of entanglement [44–48], stabiliser-state quantum computation [49, 50], and coherence [23, 51].
In such tasks, a resourceful state $\phi$ (such as a maximally entangled singlet) is used to simulate the action of an arbitrary quantum channel $\Theta$ as $\Theta(\cdot) = \Gamma(\cdot \otimes \phi)$, where now $\Gamma$ is a free operation (such as a protocol consisting of local operations and classical communication only). In this sense, our result can be thought of as the cost of channel simulation in the resource theory of "non-physicality" beyond quantum mechanics, with the operator $X$ acting as a resource. There are many potential ways to interpret such a result: for instance, unit-trace Hermitian operators which are not necessarily positive semidefinite have found use as so-called pseudo-states in [52], where they were used to study correlations beyond quantum mechanics, and as so-called pseudo-density matrices in [53], where they were used to put spatial and temporal correlations on equal footing. Being able to use a Hermitian system $X$ could then be interpreted as having access to such extended sets of correlations. We leave a precise investigation of the connections between the operational setting employed here and resource theories of correlations for future work.

Amortised simulation. A related setting that we can consider is that of amortised simulation [23, 54], in which the non-quantum resource $X$ is not consumed completely, but instead we can recover some of it in the form of another resource $Y$ which can be reused. Precisely, we define

$$
S_A(\Phi) = \min\left\{ \frac{\|X\|_1}{\|Y\|_1} \;\middle|\; \Lambda(\cdot \otimes X) = \Phi(\cdot) \otimes Y,\; \mathrm{Tr}(X) = \mathrm{Tr}(Y) = 1,\; \Lambda \in \mathrm{CPTNI} \right\}. \tag{32}
$$

We then have the following.

Corollary 5.

For any trace-preserving map $\Phi \in \mathbb{H}(A,B)$, it holds that

$$
S_A(\Phi) = S(\Phi) = \|\Phi\|_\diamond. \tag{33}
$$

Proof.

Clearly, $S_A(\Phi) \leq S(\Phi)$ as we can just take $X$ to be optimal for $S$ and $Y$ to be the trivial system 1. On the other hand, let $\Lambda$ be the optimal map such that $\Lambda(\cdot \otimes X) = \Phi(\cdot) \otimes Y$ and so $S_A(\Phi) = \|X\|_1 / \|Y\|_1$. By Thm. 4, such a $\Lambda$ can exist only if

$$
\|X\|_1 \geq S(\Phi(\cdot) \otimes Y) = \|\Phi(\cdot) \otimes Y\|_\diamond = \|\Phi\|_\diamond \|Y\|_1 = S(\Phi)\, \|Y\|_1 \tag{34}
$$

where we used the multiplicativity of the diamond norm and the fact that $\|Y\|_\diamond = \|Y\|_1$ where we treat $Y$ as a preparation channel with a trivial input space. From this we have that $S_A(\Phi) \geq S(\Phi)$, which concludes the proof.

Therefore, amortisation cannot improve the simulation cost of a given map.
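The simulation protocol of Theorem 4 can be traced through for a concrete map. The sketch below, assuming NumPy, does so for the qubit transpose map. The decomposition of the swap operator into symmetric and antisymmetric projectors, the value $R = 1/2$ (which follows from Theorem 3 once the diamond norm of the qubit transpose is taken to be 2), and all variable and function names are illustrative assumptions rather than constructions from the text.

```python
import numpy as np

# Choi operator of the qubit transpose map (the swap operator), split into
# projectors onto the symmetric and antisymmetric subspaces.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=float)
P_sym = (np.eye(4) + SWAP) / 2
P_asym = (np.eye(4) - SWAP) / 2

R = 0.5                      # assumed robustness of the qubit transpose
J_plus = P_sym / (1 + R)     # Choi operator of Lambda_+ (Tr_B gives identity)
J_minus = P_asym / R         # Choi operator of Lambda_- (Tr_B gives identity)
assert np.allclose((1 + R) * J_plus - R * J_minus, SWAP)

def apply_choi(J, rho):
    """Apply the map with Choi operator J to rho: sum_ij rho_ij Phi(|i><j|)."""
    d_A = rho.shape[0]
    d_B = J.shape[0] // d_A
    return np.einsum('ij,ibjc->bc', rho, J.reshape(d_A, d_B, d_A, d_B))

# Ancilla X = mu_+ omega_+ - mu_- omega_- with orthogonal states omega_+-.
mu_plus, mu_minus = 1 + R, R
omega_plus = np.diag([1.0, 0.0])
omega_minus = np.diag([0.0, 1.0])
X = mu_plus * omega_plus - mu_minus * omega_minus
P_tilde = (1 + R) * omega_plus / mu_plus   # POVM element of Eq. (24)

def simulator(rho, sigma):
    """Fixed map of Eq. (25) on a product input rho (x) sigma;
    it is linear in sigma, so sigma may be the non-positive operator X."""
    p = np.trace(P_tilde @ sigma)
    q = np.trace((np.eye(2) - P_tilde) @ sigma)
    return p * apply_choi(J_plus, rho) + q * apply_choi(J_minus, rho)

# A random test state.
rng = np.random.default_rng(0)
G = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = G @ G.conj().T
rho = rho / np.trace(rho).real

simulated = simulator(rho, X)                        # Lambda(rho (x) X)
sampled = (mu_plus * simulator(rho, omega_plus)
           - mu_minus * simulator(rho, omega_minus)) # Eq. (31)
cost = np.abs(np.linalg.eigvalsh(X)).sum()           # trace norm of X

print(np.allclose(simulated, rho.T), np.allclose(sampled, rho.T), cost)
```

In this example the circuit for the simulator is fixed, and only the two physical ancilla states need to be fed in with signed weights, which is the practical simplification discussed after Theorem 4.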

5. Quantifying advantages in quantum games

The study of general linear maps in a resource-theoretic setting motivates the question: is there a well-defined operational task in which having access to any non-CPTP map could provide practical advantages over all quantum channels?

In order to give an instance of such a task, we consider the setting of input-output games, inspired by the work of Ref. [55] and studied in the context of dynamical quantum resources in [27, 28]. The setting is as follows: Alice prepares a state chosen randomly from the ensemble $\{p_i, \sigma_i\}_i$ and sends the state through the map $\Phi \in \mathbb{H}(A,B)$ to Bob, who then measures with a POVM $\{M_j\}_j$. The players are then awarded a score based on a reward function characterised by the coefficients $\{w_{ij}\}_{i,j} \in \mathbb{R}$, and their goal is to maximise the average payoff given by
$$P(\Phi, \{p_i, \sigma_i\}, \{M_j\}, \{w_{ij}\}) = \sum_{i,j} w_{ij}\, p_i \left\langle M_j, \Phi(\sigma_i) \right\rangle \tag{35}$$
by a suitable choice of the states and measurements. The tuple $\mathcal{G} = (\{p_i, \sigma_i\}, \{M_j\}, \{w_{ij}\})$ then defines the input-output game $\mathcal{G}$.

We stress that, although the payoff $P(\Phi, \mathcal{G})$ might lose its physical meaning as a discrimination task when $\Phi$ is an arbitrary linear map, already for a positive trace-preserving map $\Phi$ every output $\Phi(\sigma_i)$ is indeed a valid density matrix, and thus the measurement at the output constitutes a well-defined state discrimination task.

We are then interested in quantifying the best possible advantage that a given map $\Phi$ could provide over CPTP maps. Such an optimisation is unbounded without any further constraints, so we will consider games for which any completely positive map $\Gamma$ achieves a non-negative payoff value; this can always be ensured by suitably shifting the payoff function for a given game. We then have the following.

Theorem 6. For any map $\Phi \in \mathbb{H}(A,B)$, it holds that
$$\sup_{\mathcal{G}} \frac{P(\Phi, \mathcal{G})}{\max_{\Lambda \in \mathrm{CPTP}} P(\Lambda, \mathcal{G})} = R'(\Phi) + 1, \tag{36}$$
where the supremum is over all games $\mathcal{G}$ such that $P(\Gamma, \mathcal{G}) \geq 0 \;\forall\, \Gamma \in \mathrm{CP}$. In the case of a trace-preserving $\Phi$, we have in particular that
$$\sup_{\mathcal{G}} \frac{P(\Phi, \mathcal{G})}{\max_{\Lambda \in \mathrm{CPTP}} P(\Lambda, \mathcal{G})} = \frac{\|\Phi\|_\diamond + 1}{2}, \tag{37}$$
and it suffices to optimise over games such that $P(\Lambda, \mathcal{G}) \geq 0 \;\forall\, \Lambda \in \mathrm{CPTP}$.

Proof. Any $\Phi$ can be written as $\Phi = (1 + R'(\Phi))\Lambda - \Gamma$ where $\Lambda \in \mathrm{CPTP}$, $\Gamma \in \mathrm{CP}$. On the one hand, we then have for any $\mathcal{G}$ that
$$\begin{aligned} P(\Phi, \mathcal{G}) &= (1 + R'(\Phi))\, P(\Lambda, \mathcal{G}) - P(\Gamma, \mathcal{G}) \\ &\leq (1 + R'(\Phi))\, P(\Lambda, \mathcal{G}) \\ &\leq (1 + R'(\Phi)) \max_{\Lambda \in \mathrm{CPTP}} P(\Lambda, \mathcal{G}), \end{aligned} \tag{38}$$
where the first inequality follows since $P(\Gamma, \mathcal{G}) \geq 0 \;\forall\, \Gamma \in \mathrm{CP}$, which shows that the left-hand side of Eq. (36) is upper-bounded by the right-hand side. By Thm. 3, in the case of a trace-preserving map $\Phi$ we can equivalently write $\Phi = (1 + R'(\Phi))\Lambda_+ - R'(\Phi)\Lambda_-$ where $\Lambda_\pm \in \mathrm{CPTP}$, so one only needs to consider games such that $P(\Lambda, \mathcal{G}) \geq 0 \;\forall\, \Lambda \in \mathrm{CPTP}$.

On the other hand, by strong Lagrange duality (see App. A) we can write
$$R'(\Phi) + 1 = \max\left\{ \langle W, J_\Phi \rangle \;\middle|\; 0 \leq W \leq \rho \otimes \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\}. \tag{39}$$
We can then make the following observations. Firstly, since the set of separable states in $\mathbb{D}(A \otimes B)$ has a non-empty interior [56], any Hermitian operator $X$ can be written as $X = \sum_{i=1}^{n} x_i\, \sigma_i \otimes \eta_i$ for some $\sigma_i \in \mathbb{D}(A)$, $\eta_i \in \mathbb{D}(B)$, $x_i \in \mathbb{R}$, and $n \in \mathbb{N}$. Then, choose the optimal $W$ in Eq. (39) and write $W^{T_A} = \sum_{i=1}^{n} x_i\, \sigma_i \otimes \eta_i$, where $T_A$ denotes the partial transpose. Defining the set $\{M_i\}_{i=1}^{n+1}$ by $M_i := \eta_i / \big\| \sum_j \eta_j \big\|_\infty$ for $i \leq n$ and $M_{n+1} = \mathbb{1} - \sum_{i=1}^{n} M_i$, we have that
$$W = \sum_i p_i\, w_i\, \sigma_i^{T} \otimes M_i, \tag{40}$$
where $p_i = 1/n$ for $i \leq n$ and $p_{n+1} = 0$, and the coefficients $w_i$ are defined by $w_i = x_i\, n\, \big\| \sum_j \eta_j \big\|_\infty$.

By the Choi-Jamiołkowski isomorphism and the linearity of $\Phi$, we then have that
$$\langle W, J_\Phi \rangle = \sum_i p_i\, w_i \langle M_i, \Phi(\sigma_i) \rangle = P(\Phi, \mathcal{G}') \tag{41}$$
with $\mathcal{G}'$ defined by the above choices of $\{p_i, \sigma_i\}$, $\{M_i\}$, and $\{w_i\}$. Noticing that $W \geq 0 \Rightarrow P(\Gamma, \mathcal{G}') \geq 0 \;\forall\, \Gamma \in \mathrm{CP}$, this finally gives
$$\sup_{\mathcal{G}} \frac{P(\Phi, \mathcal{G})}{\max_{\Lambda \in \mathrm{CPTP}} P(\Lambda, \mathcal{G})} \geq \frac{P(\Phi, \mathcal{G}')}{\max_{\Lambda \in \mathrm{CPTP}} P(\Lambda, \mathcal{G}')} \geq \frac{R'(\Phi) + 1}{\max_{\Lambda \in \mathrm{CPTP}} R'(\Lambda) + 1} = R'(\Phi) + 1, \tag{42}$$
where the second inequality follows since $P(\Phi, \mathcal{G}') = \langle W, J_\Phi \rangle = R'(\Phi) + 1$ and $P(\Lambda, \mathcal{G}') = \langle W, J_\Lambda \rangle \leq R'(\Lambda) + 1$ for any $\Lambda$ by definition, and the last equality follows since $R'(\Lambda) = 0$ for any $\Lambda \in \mathrm{CPTP}$.
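To make the payoff of Eq. (35) concrete, the following is a minimal sketch that evaluates it for a qubit discrimination game. The helper names (`hs_inner`, `payoff`, `identity_channel`, `depolarising_channel`) and the particular game are illustrative choices, not taken from the text; matrices are plain nested lists so that only the Python standard library is needed.

```python
from typing import Callable, List, Sequence

Matrix = List[List[complex]]


def hs_inner(m: Matrix, x: Matrix) -> complex:
    """Hilbert-Schmidt inner product <M, X> = Tr(M^dagger X)."""
    return sum(
        complex(a).conjugate() * b
        for row_m, row_x in zip(m, x)
        for a, b in zip(row_m, row_x)
    )


def payoff(
    phi: Callable[[Matrix], Matrix],
    probs: Sequence[float],
    states: Sequence[Matrix],
    povm: Sequence[Matrix],
    weights: Sequence[Sequence[float]],
) -> float:
    """Average payoff sum_{i,j} w_ij p_i <M_j, Phi(sigma_i)> of Eq. (35)."""
    total = 0.0
    for p_i, sigma_i, w_row in zip(probs, states, weights):
        out = phi(sigma_i)  # output operator Phi(sigma_i)
        for m_j, w_ij in zip(povm, w_row):
            total += w_ij * p_i * hs_inner(m_j, out).real
    return total


def identity_channel(x: Matrix) -> Matrix:
    """Noiseless qubit channel."""
    return [row[:] for row in x]


def depolarising_channel(x: Matrix) -> Matrix:
    """Completely depolarising qubit channel X -> Tr(X) * 1/2."""
    tr = x[0][0] + x[1][1]
    return [[tr / 2, 0], [0, tr / 2]]


# Illustrative game: discriminate |0> from |1>, reward 1 for a correct guess.
ket0 = [[1, 0], [0, 0]]
ket1 = [[0, 0], [0, 1]]
probs = [0.5, 0.5]
states = [ket0, ket1]
povm = [ket0, ket1]
weights = [[1, 0], [0, 1]]

payoff_id = payoff(identity_channel, probs, states, povm, weights)
payoff_dep = payoff(depolarising_channel, probs, states, povm, weights)
print(payoff_id, payoff_dep)  # 1.0 0.5
```

Because `phi` is an arbitrary Python callable, the same function can be applied to maps that are not completely positive, which is the situation Theorem 6 addresses; the sketch only evaluates the payoff and does not check the admissibility condition $P(\Gamma, \mathcal{G}) \geq 0$ for all completely positive $\Gamma$.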

6. General bounds

Useful bounds for the measures can be obtained by relating them with norms or quantities computed at the level of the Choi operator $J_\Phi$, avoiding an optimisation over all CPTNI or CPTP maps. For instance, the following relation with the trace norm generalises known bounds for the diamond norm [41, 57] (see also [33]).

Proposition 7. For any Hermiticity-preserving map $\Phi \in \mathbb{H}(A,B)$, decompose $J_\Phi$ into its positive and negative parts as $J_\Phi = J_\Phi^+ - J_\Phi^-$ with $J_\Phi^\pm \geq 0$. Then
$$\begin{aligned} \|J_\Phi\|_1 &\geq \|\Phi\|_{\blacklozenge} \geq \frac{1}{d_A} \|J_\Phi\|_1, \\ \max\left\{ \operatorname{Tr} J_\Phi^+ - 1,\ \operatorname{Tr} J_\Phi^- \right\} &\geq R(\Phi) \geq \max\left\{ \frac{1}{d_A} \operatorname{Tr} J_\Phi^+ - 1,\ \frac{1}{d_A} \operatorname{Tr} J_\Phi^- \right\}, \\ \operatorname{Tr} J_\Phi^+ - 1 &\geq R'(\Phi) \geq \frac{1}{d_A} \operatorname{Tr} J_\Phi^+ - 1, \\ \operatorname{Tr} J_\Phi^- &\geq R''(\Phi) \geq \frac{1}{d_A} \operatorname{Tr} J_\Phi^-. \end{aligned} \tag{43}$$

Proof. Consider $\|\cdot\|_{\blacklozenge}$ first. Using the expression
$$\|J_\Phi\|_1 = \min\left\{ \mu_+ + \mu_- \;\middle|\; J_\Phi = \mu_+ \omega_+ - \mu_- \omega_-,\ \omega_\pm \in \mathbb{D}(A \otimes B) \right\} \tag{44}$$
we see that any such decomposition provides a feasible solution for $\|\Phi\|_{\blacklozenge}$, since $\omega_\pm$ constitute valid Choi operators of maps $\Omega_\pm \in \mathrm{CPTNI}(A,B)$. The first inequality thus follows. The second inequality is a consequence of the bound $\|\Phi\|_{\blacklozenge} \geq \|\Phi\|_\diamond$ from Cor. 2 and the fact that $\frac{1}{d_A}\|J_\Phi\|_1$ is known to lower bound the diamond norm (see e.g. [41, 57]). It can also be seen explicitly by noting that any decomposition of the form $J_\Phi = \lambda_+ J_{\Lambda_+} - \lambda_- J_{\Lambda_-}$ with $\Lambda_\pm \in \mathrm{CPTNI}$ provides a decomposition for the trace norm by rescaling each $J_{\Lambda_\pm}$ by its trace; specifically,
$$\|J_\Phi\|_1 \leq \lambda_+ \operatorname{Tr} J_{\Lambda_+} + \lambda_- \operatorname{Tr} J_{\Lambda_-}, \tag{45}$$
and using the fact that $\operatorname{Tr} J_{\Lambda_\pm} \leq d_A \|\operatorname{Tr}_B J_{\Lambda_\pm}\|_\infty \leq d_A$ gives the desired bound.

The case of the robustness measures $R'$, $R''$ follows analogously, where we now use the fact that $\operatorname{Tr} X^+ = \min\{ \mu \mid X \leq \mu\rho,\ \rho \in \mathbb{D} \}$ and $\operatorname{Tr} X^- = \min\{ \mu \mid X + \mu\rho \geq 0,\ \rho \in \mathbb{D} \}$ for any Hermitian $X$. For the robustness $R$, take $\lambda$ to be the greater of $\operatorname{Tr} J_\Phi^+ - 1$ and $\operatorname{Tr} J_\Phi^-$, and write
$$J_\Phi + \lambda\, \frac{J_\Phi^-}{\lambda} = (1 + \lambda)\, \frac{J_\Phi^+}{1 + \lambda}. \tag{46}$$
Since $J_\Phi^-/\lambda$ and $J_\Phi^+/(1+\lambda)$ are both Choi operators of CPTNI maps, this provides a valid feasible solution for $R$. On the other hand, $R(\Phi) \geq \max\{ R'(\Phi), R''(\Phi) \}$ by definition, from which the lower bound follows.

Both the upper and the lower bounds in Prop. 7 can be tight, as was shown already for the diamond norm [40]. However, better upper bounds can be obtained as follows.

Proposition 8.

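For any Hermiticity-preserving map $\Phi \in \mathbb{H}(A,B)$, it holds that
$$\begin{aligned} \|\Phi\|_\diamond &\leq \lambda_{\max}\!\left( \operatorname{Tr}_B \left[ J_\Phi^+ + J_\Phi^- \right] \right), \\ \|\Phi\|_{\blacklozenge} &\leq \lambda_{\max}\!\left( \operatorname{Tr}_B J_\Phi^+ \right) + \lambda_{\max}\!\left( \operatorname{Tr}_B J_\Phi^- \right), \\ R(\Phi) &\leq \max\left\{ \lambda_{\max}\!\left( \operatorname{Tr}_B J_\Phi^+ \right) - 1,\ \lambda_{\max}\!\left( \operatorname{Tr}_B J_\Phi^- \right) \right\}, \\ R'(\Phi) &\leq \lambda_{\max}\!\left( \operatorname{Tr}_B J_\Phi^+ \right) - 1, \\ R''(\Phi) &\leq \lambda_{\max}\!\left( \operatorname{Tr}_B J_\Phi^- \right). \end{aligned} \tag{47}$$
We note that the bound for the diamond norm, which we stated above for completeness, appeared previously in [41].

Proof. All of the bounds follow simply by using $J_\Phi = J_\Phi^+ - J_\Phi^-$ as feasible solutions. For the robustness $R$, take $\lambda$ to be the greater of $\lambda_{\max}(\operatorname{Tr}_B J_\Phi^+) - 1$ and $\lambda_{\max}(\operatorname{Tr}_B J_\Phi^-)$, and write
$$J_\Phi + \lambda\, \frac{J_\Phi^-}{\lambda} = (1 + \lambda)\, \frac{J_\Phi^+}{1 + \lambda}. \tag{48}$$

As for lower bounds, we will first need to establish dual expressions for the considered measures. The following Proposition is an application of standard convex duality arguments, and we include details in Appendix A for completeness.

Proposition 9.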

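For any $\Phi \in \mathbb{H}(A,B)$, the following dual expressions hold.
$$\begin{aligned} \|\Phi\|_\diamond &= \max\left\{ \langle J_\Phi, W \rangle \;\middle|\; -\rho \otimes \mathbb{1}_B \leq W \leq \rho \otimes \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\} \\ \|\Phi\|_{\blacklozenge} &= \max\left\{ \langle J_\Phi, W \rangle \;\middle|\; -\rho \otimes \mathbb{1}_B \leq W \leq \sigma \otimes \mathbb{1}_B,\ \rho, \sigma \in \mathbb{D}(A) \right\} \\ R(\Phi) &= \max\left\{ \langle J_\Phi, W \rangle - \operatorname{Tr} Y \;\middle|\; -X \otimes \mathbb{1}_B \leq W \leq Y \otimes \mathbb{1}_B,\ X, Y \geq 0,\ \operatorname{Tr}(X + Y) = 1 \right\} \\ R'(\Phi) &= \max\left\{ \langle J_\Phi, W \rangle - 1 \;\middle|\; 0 \leq W \leq \rho \otimes \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\} \\ R''(\Phi) &= \max\left\{ \left\langle J_\Phi, W - \rho \otimes \mathbb{1}_B \right\rangle \;\middle|\; 0 \leq W \leq \rho \otimes \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\} \\ &= \max\left\{ \langle J_\Phi, W \rangle - \operatorname{Tr} \Phi(\rho) \;\middle|\; 0 \leq W \leq \rho \otimes \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\}. \end{aligned} \tag{49}$$
We can then obtain lower bounds by employing the dual optimisation problems. The bound for the diamond norm is well known [20], but we find it insightful to rederive it using this approach.

Proposition 10.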

For any $\Phi \in \mathbb{H}(A,B)$ and any input state $\rho \in \mathbb{D}(A)$, let $\Phi(\rho)^\pm$ denote the positive/negative part of the output operator $\Phi(\rho)$. Then
$$\begin{aligned} \|\Phi\|_{\blacklozenge} \geq \|\Phi\|_\diamond &\geq \max\left\{ \|\Phi(\rho)\|_1 \;\middle|\; \rho \in \mathbb{D}(A) \right\} \geq \|\operatorname{Tr}_B J_\Phi\|_\infty \\ \|\Phi\|_{\blacklozenge} &\geq \max\left\{ \operatorname{Tr} \Phi(\rho)^- + \operatorname{Tr} \Phi(\sigma)^+ \;\middle|\; \rho, \sigma \in \mathbb{D}(A) \right\} \geq \lambda_{\max}\!\left[ (\operatorname{Tr}_B J_\Phi)^+ \right] + \lambda_{\max}\!\left[ (\operatorname{Tr}_B J_\Phi)^- \right] \\ R(\Phi) &\geq \max\left\{ \operatorname{Tr} \Phi(X)^- + \operatorname{Tr} \Phi(Y)^+ - \operatorname{Tr} Y \;\middle|\; X, Y \geq 0,\ \operatorname{Tr}(X + Y) = 1 \right\} \\ &\geq \max\left\{ \lambda_{\max}\!\left[ (\operatorname{Tr}_B J_\Phi)^+ \right] - 1,\ \lambda_{\max}\!\left[ (\operatorname{Tr}_B J_\Phi)^- \right] \right\} \\ R'(\Phi) &\geq \max\left\{ \operatorname{Tr} \Phi(\rho)^+ - 1 \;\middle|\; \rho \in \mathbb{D}(A) \right\} \geq \lambda_{\max}\!\left[ (\operatorname{Tr}_B J_\Phi)^+ \right] - 1 \\ R''(\Phi) &\geq \max\left\{ \operatorname{Tr} \Phi(\rho)^- \;\middle|\; \rho \in \mathbb{D}(A) \right\} \geq \lambda_{\max}\!\left[ (\operatorname{Tr}_B J_\Phi)^- \right]. \end{aligned} \tag{50}$$

We remark the curious fact that, despite the apparent similarity, the bound for the diamond norm in Prop. 10 is not the induced Schatten norm $\|\cdot\|_{1 \to 1}$, as the latter requires an optimisation over non-Hermitian input operators even when the map $\Phi$ is Hermiticity-preserving [20].

Proof. Consider the diamond norm first. The main idea is to restrict the optimisation in the dual expression of $\|\cdot\|_\diamond$ in (49) to operators of the form $W = \rho \otimes Z$ for some operator $Z$. Then we have
$$\begin{aligned} \|\Phi\|_\diamond &\geq \max\left\{ \left\langle J_\Phi, \rho \otimes Z \right\rangle \;\middle|\; -\mathbb{1}_B \leq Z \leq \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\} \\ &= \max\left\{ \left\langle \Phi(\rho^T), Z \right\rangle \;\middle|\; -\mathbb{1}_B \leq Z \leq \mathbb{1}_B,\ \rho \in \mathbb{D}(A) \right\} \\ &= \max\left\{ \left\| \Phi(\rho^T) \right\|_1 \;\middle|\; \rho \in \mathbb{D}(A) \right\}, \end{aligned} \tag{51}$$
where the second line follows by the Choi-Jamiołkowski isomorphism. Taking $Z \in \{\mathbb{1}, -\mathbb{1}\}$, we get the lower bound
$$\|\Phi\|_\diamond \geq \max\left\{ \pm \left\langle \operatorname{Tr}_B J_\Phi, \rho \right\rangle \;\middle|\; \rho \in \mathbb{D}(A) \right\} = \|\operatorname{Tr}_B J_\Phi\|_\infty. \tag{52}$$
In the case of $\|\cdot\|_{\blacklozenge}$, we use feasible solutions of the form $W = \rho \otimes Z + \sigma \otimes V$ with $-\mathbb{1} \leq Z \leq 0 \leq V \leq \mathbb{1}$ to obtain the stated bound analogously, the crucial observation being that $\max\{ \langle A, B \rangle \mid 0 \leq B \leq \mathbb{1} \} = \operatorname{Tr} A^+$ for any Hermitian $A$. The other measures follow in the same way.

Note the similarity between the eigenvalue-based lower bounds of Prop. 10 and the upper bounds of Prop. 8: the upper bounds consider the eigenvalues after decomposing $J_\Phi$ as $J_\Phi^+ - J_\Phi^-$, while the lower bounds use the positive and negative parts of $\operatorname{Tr}_B J_\Phi$.

An immediate consequence is that for any completely positive map $\Phi$, it holds that
$$\|\Phi\|_{\blacklozenge} = \|\Phi\|_\diamond = \|\operatorname{Tr}_B J_\Phi\|_\infty \tag{53}$$
since the operators $J_\Phi$ and $\operatorname{Tr}_B J_\Phi$ are both positive semidefinite. However, the lower bounds allow us to show explicitly that the equality $\|\Phi\|_{\blacklozenge} = \|\Phi\|_\diamond$ is no longer true for maps which are neither CP nor trace preserving, and in fact the extreme disparity of $\|\Phi\|_{\blacklozenge} = 2\|\Phi\|_\diamond$ (cf. Cor. 2) can be achieved. Consider for instance the case when
$$\Phi(\cdot) = \langle 0 | \cdot | 0 \rangle\, |0\rangle\langle 0| - \langle 1 | \cdot | 1 \rangle\, |1\rangle\langle 1|. \tag{54}$$
Decomposing $J_\Phi = |0\rangle\langle 0| \otimes |0\rangle\langle 0| - |1\rangle\langle 1| \otimes |1\rangle\langle 1|$ into its positive and negative parts, the bound of Prop. 8 gives $\|\Phi\|_\diamond \leq 1$. However, the best upper bound we get for $\|\Phi\|_{\blacklozenge}$ is 2, and it is indeed tight: we have $\Phi(|0\rangle\langle 0|) = |0\rangle\langle 0|$ and $\Phi(|1\rangle\langle 1|) = -|1\rangle\langle 1|$, and so Prop. 10 gives $\|\Phi\|_{\blacklozenge} \geq 2$. A similar argument can be used to show that $R(\Phi) = 1$, which in particular implies that $2R(\Phi) + 1 > \|\Phi\|_{\blacklozenge} > \|\Phi\|_\diamond$.

All of the bounds that we established in this section can be tight, as we shall demonstrate in what follows.
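The eigenvalue-based bounds of Props. 8 and 10 are simple enough to check by hand for the example of Eq. (54), and the sketch below does so numerically. It is deliberately restricted to maps whose Choi operator is diagonal in the computational basis (an assumption made here so that no eigensolver is needed); the dictionary representation and the function names are illustrative, not part of the text.

```python
from typing import Dict, List, Tuple

# Diagonal Choi operator: J = sum_{a,b} J[(a, b)] |a><a| (x) |b><b|.
DiagChoi = Dict[Tuple[int, int], float]


def partial_trace_b(j: DiagChoi, d_a: int) -> List[float]:
    """Diagonal of Tr_B J, which is again diagonal on system A."""
    marginal = [0.0] * d_a
    for (a, _b), value in j.items():
        marginal[a] += value
    return marginal


def upper_bounds(j: DiagChoi, d_a: int) -> Dict[str, float]:
    """Upper bounds of Prop. 8, using J = J^+ - J^- as the feasible point."""
    pos = {k: max(v, 0.0) for k, v in j.items()}    # positive part J^+
    neg = {k: max(-v, 0.0) for k, v in j.items()}   # negative part J^-
    both = {k: pos[k] + neg[k] for k in j}          # J^+ + J^-
    lam_pos = max(partial_trace_b(pos, d_a))
    lam_neg = max(partial_trace_b(neg, d_a))
    return {
        "diamond": max(partial_trace_b(both, d_a)),
        "filled_diamond": lam_pos + lam_neg,
        "robustness": max(lam_pos - 1.0, lam_neg),
    }


def lower_bounds(j: DiagChoi, d_a: int) -> Dict[str, float]:
    """Eigenvalue-based lower bounds of Prop. 10, built from Tr_B J."""
    marginal = partial_trace_b(j, d_a)
    lam_pos = max(max(v, 0.0) for v in marginal)    # lambda_max[(Tr_B J)^+]
    lam_neg = max(max(-v, 0.0) for v in marginal)   # lambda_max[(Tr_B J)^-]
    return {
        "diamond": max(abs(v) for v in marginal),   # ||Tr_B J||_inf
        "filled_diamond": lam_pos + lam_neg,
        "robustness": max(lam_pos - 1.0, lam_neg),
    }


# Choi operator of the map in Eq. (54): |00><00| - |11><11|.
j_example: DiagChoi = {(0, 0): 1.0, (1, 1): -1.0}
upper = upper_bounds(j_example, d_a=2)
lower = lower_bounds(j_example, d_a=2)
print(upper)  # {'diamond': 1.0, 'filled_diamond': 2.0, 'robustness': 1.0}
print(lower)  # {'diamond': 1.0, 'filled_diamond': 2.0, 'robustness': 1.0}
```

For this map the upper and lower bounds coincide, reproducing the values $\|\Phi\|_\diamond = 1$, $\|\Phi\|_{\blacklozenge} = 2$ and $R(\Phi) = 1$ discussed above. For a general Choi operator the positive and negative parts would have to be obtained from a full eigendecomposition instead.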

7. Applications and examples

Positive maps constitute a fundamental way to detect and characterise quantum entanglement [2–4]. One of the most studied approaches to implementing such maps in practice is the structural physical approximation (SPA) [5, 6], which aims to approximate a given positive map $\Phi$ with a physical quantum channel by considering decompositions of the form $\Phi + \varsigma \mathcal{D}$, where $\mathcal{D}$ is the completely depolarising channel, $J_{\mathcal{D}} = \mathbb{1}/d_B$. Such approximations have found use both in understanding the properties of positive maps [7, 58] and in realising them in experiments [6, 8, 59].

Intuitively, the robustness measures can then be understood as different approaches to defining an optimised SPA to the map $\Phi$, by allowing channels other than the depolarising map to be used in the decomposition (cf. [33]). We will now discuss the similarities and differences between the approaches by studying two representative examples of positive maps.

Transposition map. Consider first the transposition map $T \in \mathbb{H}(A,A)$. Letting $\mathrm{SPA}(T)$ denote the minimal amount $\varsigma$ needed for $(T + \varsigma \mathcal{D})/(1 + \varsigma)$ to be a quantum channel, it can be easily verified that $\mathrm{SPA}(T) = d_A$. However, by making a more suitable choice of a channel in the optimisation, our robustness measures construct an approximation as $(T + \lambda \Lambda)/(1 + \lambda)$ where $\lambda = \frac{1}{2}(d_A - 1)$ already suffices to ensure that this is a valid physical channel. From this we see that $R(T) = \frac{1}{2}(d_A - 1)$ and hence $\|T\|_{\blacklozenge} = d_A$. Quantitatively, the advantage gained by allowing arbitrary channels in such decompositions can therefore be significant.

To understand why a better approximation can be obtained, let us take a closer look at the optimal decomposition for this map. Our generalised approach can take into consideration the fact that the Choi operator of the transposition map, $J_T$ (the swap operator), already has a non-trivial positive part, which means that there is no need to act on that part of the space. More specifically, a better approximation is obtained simply by defining the channel $\Lambda \in \mathrm{CPTP}$ via $J_\Lambda = \frac{\mathbb{1} - J_T}{d_A - 1}$ and mixing as
$$J_T + \frac{d_A - 1}{2}\, J_\Lambda \geq 0 \;\Rightarrow\; \frac{2}{d_A + 1}\, T + \frac{d_A - 1}{d_A + 1}\, \Lambda \in \mathrm{CPTP}. \tag{55}$$
Structurally, this is not too different from the SPA: the only maps involved in the combination are the depolarising channel and the transposition map itself, even if the optimal approximation is not simply a convex mixture of the two. Indeed, we could define an optimised structural physical approximation which allows for such decompositions to be used:
$$\mathrm{SPA}'(\Phi) := \min\left\{ \varsigma' \;\middle|\; J_\Phi + \varsigma' \left[ \frac{\lambda_{\max}(J_\Phi)\, \mathbb{1} - J_\Phi}{\lambda_{\max}(J_\Phi)\, d_B - 1} \right] \geq 0 \right\} = -\lambda_{\min}(J_\Phi)\, \frac{d_B\, \lambda_{\max}(J_\Phi) - 1}{\lambda_{\max}(J_\Phi) - \lambda_{\min}(J_\Phi)}, \tag{56}$$
with the expression valid for any map such that $\lambda_{\min}(J_\Phi) < 0$, $\lambda_{\max}(J_\Phi) \neq d_B^{-1}$. This can be used to give a general bound to the robustness measures.

Proposition 11. For any trace-preserving map $\Phi \in \mathbb{H}(A,B)$ such that $\Phi \neq \mathcal{D}$, it holds that
$$R(\Phi) \leq \mathrm{SPA}'(\Phi) \leq \mathrm{SPA}(\Phi). \tag{57}$$

In the case of the transpose, it holds that $\mathrm{SPA}'(T) = R(T) = \frac{1}{2}(d_A - 1)$, so we know that an optimal approximation of the transposition map can be realised with only the depolarising channel, as long as one considers the optimised approach of Eq. (56). However, this is not the case for general maps, and the advantages offered by the generalised robustness approach can provide new insight into optimal approximations of maps, as we shall see in the following.

Choi map. The Choi map $\mathcal{C} \in \mathbb{H}(A,A)$ with $d_A = 3$ is defined as
$$\mathcal{C}(X) := \frac{1}{2} \begin{pmatrix} X_{11} + X_{22} & -X_{12} & -X_{13} \\ -X_{21} & X_{22} + X_{33} & -X_{23} \\ -X_{31} & -X_{32} & X_{33} + X_{11} \end{pmatrix} \tag{58}$$
where $X_{ij}$ denote the matrix elements of $X$ in a chosen basis. It can then be shown that the optimal SPA of $\mathcal{C}$ are given by $\mathrm{SPA}(\mathcal{C}) = \frac{3}{2}$ and $\mathrm{SPA}'(\mathcal{C}) = \frac{2}{3}$. With the robustness, an improved choice can be obtained by choosing $\Lambda = \mathrm{id}$ and mixing as $J_{\mathcal{C}} + \frac{1}{6} J_{\mathrm{id}} \geq 0$, yielding $R(\mathcal{C}) = \frac{1}{6}$. Consequently, mixing with more general maps can not only provide quantitative improvements, but also identify ways of implementing non-CPTP maps which are impossible to find with the standard structural physical approximations.

An interesting difference between the SPA- and robustness-based approaches is that the optimal SPA of the Choi map is a measure-and-prepare (entanglement-breaking) channel [7], while the map obtained in the robustness-based approach is not (as can be verified with the PPT criterion). Since measure-and-prepare channels enjoy an easy implementation in practical settings, it would be an interesting extension of our approach to consider the extent of a quantitative advantage that can be maintained while requiring that the optimal CPTP approximation be entanglement breaking.

We also note that another approach to realising positive maps was studied in Ref. [61] by using multiple copies of the input state, where a related SPA-based approximation was also considered. An extension of the methods of our work to this framework could provide additional insight into the implementability of positive maps.

A fundamentally important case of a non-CPTP map encountered in many settings is the inverse linear map of a bijective quantum channel, that is, a map such that $\Lambda^{-1} \circ \Lambda = \Lambda \circ \Lambda^{-1} = \mathrm{id}$. (We note that in many cases it suffices to consider only left or right inverses, but we assume two-sided invertibility for simplicity.) The inverse $\Lambda^{-1}$ is itself a quantum channel only if $\Lambda$ is a unitary map. However, many important cases of quantum dynamics are indeed invertible, allowing us to study their inverses in the formalism of our work.

Non-Markovianity. One setting in which channel inverses play a role is the study of non-Markovianity. Among the different ways to define Markovian evolution, a common way is to say that a time-dependent evolution governed by the channel $\Lambda_{t,0}$ is Markovian if it behaves as a physical map over any time interval $[t, t + \delta t]$.
Mathematically, any $\Lambda_{t,0}$ satisfying this condition is said to be CP-divisible [62–64], which can be formalised by the statement that for all times $t$ and $s \leq t$ we can write $\Lambda_{t,0} = \Xi_{t,s} \circ \Lambda_{s,0}$ where the propagator $\Xi_{t,s}$ is a CPTP map. For more general channels, the decomposition $\Lambda_{t,0} = \Xi_{t,s} \circ \Lambda_{s,0}$ results in some $\Xi_{t,s}$ that is non-CPTP, indicating that Markovian dynamics break down after some time point $s$.

Observe that, provided $\Lambda_{t,0}$ is invertible for all $t$, we can take $\Xi_{t,s} = \Lambda_{t,0} \circ \Lambda_{s,0}^{-1}$. Therefore, the non-physicality of either $\Xi_{t,s}$ or $\Lambda_{s,0}^{-1}$ is an indicator of non-Markovianity. Noting that both of these maps are trace preserving, we can therefore consider their respective diamond norms, $\|\Lambda_{s,0}^{-1}\|_\diamond$ or $\|\Xi_{t,s}\|_\diamond$, as indicators of non-Markovianity over the time interval $[s, t]$. This is similar to the original approach of Ref. [62], where a quantifier based on the trace norm of the Choi operator was employed; the advantage of our definition is the ability to interpret this quantity operationally.

Specifically, we observe that quantum mechanics is ultimately a Markovian theory: if we had knowledge of all relevant objects, then all quantum dynamics could be described by Markovian unitary dynamics. That is, any information from the past that is relevant to the future must pass through the present, and hence the optimal prediction of future observational statistics ultimately depends only on the present state of reality. Non-Markovianity is an artefact of not tracking all relevant information in the present. In our context, this arises as our mathematical characterisation of the candidate channel, $\Lambda_{s,0}$, does not track the state of the environment. The operational relevance of $\|\Xi_{t,s}\|_\diamond$ then becomes more evident. Notably, in Sec. 4 we presented a systematic means of simulating any unphysical map $\Xi_{t,s}$ by introducing an ancillary system $X$. Here, we may think of this as building a Markovian model for $\Xi_{t,s}$ by introducing $X = \sum_i \mu_i \rho_i$ as an "artificial environment". The feeding in of different states $\rho_i$ depending on $X$ then represents a means by which non-Markovian behaviour on the system is realised. While this construction does not immediately look physical (as it allows affine mixtures of quantum states), it can be simulated by a classical computer with sufficient resource overhead. The resource cost of doing so, $\|\Xi_{t,s}\|_\diamond$, thus represents a bound on the information processing capabilities of the environment that enable said non-Markovian behaviour to emerge. $\|\Lambda_{s,0}^{-1}\|_\diamond$ offers a very similar interpretation, simply in time-reverse. That is, consider viewing the channel in reverse from time $t$ to time $0$: in this scenario, the operation $\Lambda_{s,0}^{-1}$ would correspond to time evolution from $s$ to $0$. Its non-physicality would then mean that this reverse-time evolution no longer depends only on the present state of the system at time $s$, but also on times after $s$.

There are multiple approaches for extending this to a time-independent measure of non-Markovianity of $\Lambda$. One could, for example, take the supremum of the measure $\|\Xi_{t,s}\|_\diamond$ over all $t$ and $s$. This would then characterise how much extra information processing we need beyond tracking the state of the system at time $s$ to simulate dynamics over the time interval $[s, t]$. Alternatively, one can look at the process in time-reverse and consider $\sup_t \|\Lambda_{t,0}^{-1}\|_\diamond$. We may also follow an approach based on Ref. [62] and define $\mathcal{I}_\diamond(\Lambda) := \int_0^\infty g_{\diamond,t}(\Lambda)\, \mathrm{d}t$, where $g_{\diamond,t}$ can be understood as the right-hand derivative of the diamond norm of the dynamics at time $t$:
$$g_{\diamond,t}(\Lambda) := \lim_{\varepsilon \to 0^+} \frac{\left\| \Lambda_{t+\varepsilon,0} \circ \Lambda_{t,0}^{-1} \right\|_\diamond - \left\| \Lambda_{t,0} \circ \Lambda_{t,0}^{-1} \right\|_\diamond}{\varepsilon} = \lim_{\varepsilon \to 0^+} \frac{\left\| \Lambda_{t+\varepsilon,0} \circ \Lambda_{t,0}^{-1} \right\|_\diamond - 1}{\varepsilon}. \tag{59}$$
$\mathcal{I}_\diamond(\Lambda)$ therefore represents the total amount of non-Markovianity in this evolution. A suitable normalisation of this quantity can allow for the comparison of the strength of non-Markovianity in different settings [62, 64]. We leave a careful consideration of these possibilities to future work.

Error mitigation. Another application for the study of channel inverses is error mitigation. This setting considers the scenario where one is tasked with computing expectation values of the type $\operatorname{Tr}[\mathcal{U}(\rho) A]$ for an input state $\rho$, ideal gate $\mathcal{U}$, and observable $A$, while operations are followed by a noise channel $\Theta$. A leading approach to this problem, called probabilistic error cancellation [17, 65], is to counteract the noise with the inverse map $\Theta^{-1}$, so that $\operatorname{Tr}[\mathcal{U}(\rho) A] = \operatorname{Tr}[\Theta \circ \Theta^{-1} \circ \mathcal{U}(\rho) A]$. By decomposing $\Theta^{-1}$ into a quasiprobability distribution over a convex subset of channels $\mathcal{P} = \{\Lambda_i\}$ such that $\Lambda_i \circ \mathcal{U}$ would be implementable on a (fictitious) noiseless device, standard quasiprobability sampling arguments allow us to construct an unbiased estimator for $\operatorname{Tr}[\mathcal{U}(\rho) A]$ using only operations implementable on a noisy device. The optimal overhead cost of such a procedure scales as $\gamma_{\mathcal{P}}(\Theta)$, where [17, 30]
$$\gamma_{\mathcal{P}}(\Theta) = \min\left\{ \sum_i |\lambda_i| \;\middle|\; \Theta^{-1} = \sum_i \lambda_i \Lambda_i,\ \Lambda_i \in \mathcal{P} \right\} = \min\left\{ \lambda_+ + \lambda_- \;\middle|\; \Theta^{-1} = \lambda_+ \Lambda_+ - \lambda_- \Lambda_-,\ \Lambda_\pm \in \mathcal{P} \right\}. \tag{60}$$
The specific choice of $\mathcal{P}$ can be made depending not only on the physical setting in consideration, but also on one's precise motivations.
On the one hand, a set with a finite number of operations (e.g., Clifford gates) turns Eq. (60) into a linear program [17, 65], making the overhead cost easily computable while sacrificing the expressibility of devices. On the other hand, choosing a larger set with an infinite number of implementable operations takes into account a larger expressibility [30], but makes the computation of Eq. (60) hard in general. Here, to accommodate computability and expressibility at the same time, we take another approach considered in Refs. [33, 66]: we choose $\mathcal{P}$ to be all physical quantum channels. We notice that the norm $\|\cdot\|_{\blacklozenge}$ provides the cost of error mitigation in this setting as $\gamma_{\mathrm{CPTP}}(\Theta) = \|\Theta^{-1}\|_{\blacklozenge}$, which can be efficiently computed by semidefinite programming. Although this choice of $\mathcal{P}$ might seem too permissive, the lower bound obtained through this approach can actually match known achievability results (upper bounds) [33], showing new optimality results and even improving on the specialised characterisation of Ref. [30] in some cases. Of note is the fact that, since any inverse map $\Theta^{-1}$ of a quantum channel $\Theta$ is trace preserving, our Thm. 3 shows a new application of the diamond norm in bounding the cost of error mitigation: it always holds that $\gamma_{\mathcal{P}}(\Theta) \geq \|\Theta^{-1}\|_\diamond$, regardless of the choice of $\mathcal{P}$.

In some cases, such as when experiencing the leakage or loss of some qubits during computation, the noisy evolution can actually correspond to a map which is not trace preserving. Although many previous approaches did not take this into consideration, our methods explicitly extend to such maps, allowing one to understand the simulation of non-trace-preserving linear maps through Thm. 4.
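As a small illustration of a decomposition of the form in Eq. (60), the sketch below treats qubit dephasing noise $\Delta_p(X) = (1-p)X + pZXZ^\dagger$ with $p < 1/2$ in the Bloch-vector picture. The decomposition $\Delta_p^{-1} = \frac{1-p}{1-2p}\,\mathrm{id} - \frac{p}{1-2p}\, Z \cdot Z^\dagger$ used here is a standard textbook choice assumed for this example rather than one quoted from the text, and all function names are hypothetical.

```python
from typing import Tuple

Bloch = Tuple[float, float, float]


def dephase(r: Bloch, p: float) -> Bloch:
    """Qubit dephasing (1-p) X + p Z X Z^dagger acting on a Bloch vector."""
    x, y, z = r
    return ((1 - 2 * p) * x, (1 - 2 * p) * y, z)


def z_conjugate(r: Bloch) -> Bloch:
    """Unitary channel X -> Z X Z^dagger acting on a Bloch vector."""
    x, y, z = r
    return (-x, -y, z)


def inverse_decomposition(p: float) -> Tuple[float, float]:
    """Coefficients (lambda_plus, lambda_minus) with
    inverse = lambda_plus * id - lambda_minus * (Z . Z^dagger)."""
    if not 0 <= p < 0.5:
        raise ValueError("the decomposition assumes 0 <= p < 1/2")
    return (1 - p) / (1 - 2 * p), p / (1 - 2 * p)


def apply_inverse(r: Bloch, p: float) -> Bloch:
    """Apply the (unphysical) inverse map as a signed combination of channels."""
    lam_plus, lam_minus = inverse_decomposition(p)
    flipped = z_conjugate(r)
    return tuple(lam_plus * a - lam_minus * b for a, b in zip(r, flipped))


p = 0.1
state: Bloch = (0.6, 0.0, 0.8)
noisy = dephase(state, p)
recovered = apply_inverse(noisy, p)

lam_plus, lam_minus = inverse_decomposition(p)
gamma = lam_plus + lam_minus  # sampling overhead of this decomposition
print(recovered, gamma)
```

In a sampling-based protocol one would apply the identity with probability $\lambda_+/\gamma$ and the $Z$ unitary with probability $\lambda_-/\gamma$, weighting the measured outcomes by $\pm\gamma$. The overhead $\gamma = 1/(1-2p)$ of this particular decomposition coincides with the value of $\|\Delta_p^{-1}\|_\diamond$ reported in Eq. (71), so no decomposition into channels can do better.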
Related settings which our methods can characterise include the so-called linear quantum error correction [67], which aims to correct errors of systems undergoing general, non-CPTP dynamics $\Theta$, as well as error mitigation for non-Markovian noise [68], where the mitigation cost can be related to a measure of non-Markovianity. In such cases, our approach can thus help understand the implementation of not only the inverse maps, but also the dynamics themselves.

To showcase the application of our methods and evaluate the measures for some representative examples, we will consider the inverse maps of several fundamental types of noisy quantum evolutions: depolarising, amplitude damping, dephasing, and qubit leakage channels. The expressions for the first two appeared in Ref. [33], which we rederive using the methods and results of this work. We also find for the first three that the optimal decomposition into $\Lambda_\pm$ for the norm $\|\Theta^{-1}\|_{\blacklozenge}$ (Eq. (6)) can be taken as convex mixtures of unitaries and state preparations. Thus, $\|\Theta^{-1}\|_{\blacklozenge}$ also serves as the optimal cost $\gamma_{\mathcal{P}}(\Theta)$ with a smaller set $\mathcal{P}$ as considered in Ref. [30], indicating that the capability to implement all CPTNI maps does not provide any advantage over that of implementing unitaries and state preparations only. Note that the inverses of trace-preserving maps are trace preserving, and so in such cases the equality $\|\Phi\|_{\blacklozenge} = \|\Phi\|_\diamond = 2R(\Phi) + 1$ holds.

Depolarising noise. The depolarising channel, given by $\mathcal{D}_p(X) := (1 - p) X + p \operatorname{Tr} X\, \frac{\mathbb{1}}{d_A}$ for some noise parameter $p \in [0, 1)$, has the inverse $\mathcal{D}_p^{-1}(X) = \frac{1}{1 - p} X - \frac{p}{1 - p} \operatorname{Tr} X\, \frac{\mathbb{1}}{d_A}$. This gives
$$J_{\mathcal{D}_p^{-1}} = \frac{1}{1 - p}\, |\Omega\rangle\langle\Omega| - \frac{p}{(1 - p)\, d_A}\, \mathbb{1}_A \otimes \mathbb{1}_A. \tag{61}$$
Importantly, one can notice that $\operatorname{Tr}_B J^{+}_{\mathcal{D}_p^{-1}}$ and $\operatorname{Tr}_B J^{-}_{\mathcal{D}_p^{-1}}$ are proportional to identity. As first noticed in [40, 41], this means that the lower bound $\frac{1}{d_A} \big\| J_{\mathcal{D}_p^{-1}} \big\|_1$ of Prop. 7 matches the upper bound $\lambda_{\max}\big( \operatorname{Tr}_B \big[ J^{+}_{\mathcal{D}_p^{-1}} + J^{-}_{\mathcal{D}_p^{-1}} \big] \big)$ of Prop. 8.
We thus get
$$\left\| \mathcal{D}_p^{-1} \right\|_{\blacklozenge} = \left\| \mathcal{D}_p^{-1} \right\|_\diamond = \frac{1}{d_A} \left\| J_{\mathcal{D}_p^{-1}} \right\|_1 = \frac{1 + \left( 1 - 2 d_A^{-2} \right) p}{1 - p}. \tag{62}$$
(In fact, $\|\Phi\|_\diamond = \frac{1}{d_A} \|J_\Phi\|_1$ if and only if $\operatorname{Tr}_B (J_\Phi^+ + J_\Phi^-) \propto \mathbb{1}$ [40, 41].)

Dephasing noise. The generalised dephasing channel [69] is defined by $\Delta_{\mathbf{p}}(X) := \sum_{i=0}^{d_A - 1} p_i\, Z_i X Z_i^\dagger$, where $\mathbf{p} = (p_0, \ldots, p_{d_A - 1})$ is a chosen set of noise parameters $p_i \geq 0$, and $Z_i$ refers to the qudit clock operators
$$Z_i = \sum_{j=0}^{d_A - 1} \omega^{ij}\, |j\rangle\langle j| \tag{63}$$
in some basis $\{|i\rangle\}$, with $\omega$ being a primitive $d_A$th root of unity. In the case of $d_A = 2$, this recovers the usual qubit dephasing channel $\Delta_p(X) = (1 - p) X + p Z X Z^\dagger$. One can notice that the action of this channel can be represented by $\Delta_{\mathbf{p}}(X) = X \odot S$, where $\odot$ denotes the element-wise matrix product (Schur/Hadamard product), and
$$(S)_{jk} = \sum_{i=0}^{d_A - 1} p_i\, \omega^{ij} (\omega^{ik})^* = \sum_{i=0}^{d_A - 1} p_i\, \omega^{i(j - k)}, \qquad j, k = 0, \ldots, d_A - 1 \tag{64}$$
in the basis $\{|i\rangle\}$. Provided that the coefficients of $S$ are non-zero (that is, $\Delta_{\mathbf{p}}$ does not act as a completely dephasing channel on any subspace), the map is invertible as $\Delta_{\mathbf{p}}^{-1}(X) = X \odot \widetilde{S}$ with $\widetilde{S}$ defined by
$$(\widetilde{S})_{jk} = \frac{1}{(S)_{jk}}, \qquad j, k = 0, \ldots, d_A - 1. \tag{65}$$
We will now show that $\big\| \Delta_{\mathbf{p}}^{-1} \big\|_{\blacklozenge} = \big\| \Delta_{\mathbf{p}}^{-1} \big\|_\diamond = \frac{1}{d_A} \big\| J_{\Delta_{\mathbf{p}}^{-1}} \big\|_1 = \frac{1}{d_A} \big\| \widetilde{S} \big\|_1$.

The equality $\big\| \Delta_{\mathbf{p}}^{-1} \big\|_{\blacklozenge} = \big\| \Delta_{\mathbf{p}}^{-1} \big\|_\diamond$ is a consequence of Thm. 3; note here that we do not actually need to impose that $\Delta_{\mathbf{p}}$ be trace preserving (i.e., that $\sum_i p_i = 1$), since $\Delta_{\mathbf{p}}$ and $\Delta_{\mathbf{p}}^{-1}$ are always proportional to a trace-preserving map by construction.

To show the equality $\big\| \Delta_{\mathbf{p}}^{-1} \big\|_{\blacklozenge} = \frac{1}{d_A} \big\| \widetilde{S} \big\|_1$, consider the decomposition of $\widetilde{S}$ as $\widetilde{S} = \widetilde{S}_+ - \widetilde{S}_-$. Crucially, since $S$ is a circulant matrix, so is $\widetilde{S}$, and hence it can be diagonalised by the Fourier transform matrix $(F)_{jk} = \frac{1}{\sqrt{d_A}}\, \omega^{jk}$ [70, 2.2.P10]. Each eigenvector of $\widetilde{S}$ is therefore of the form
$$|s_m\rangle = \frac{1}{\sqrt{d_A}} \sum_{i=0}^{d_A - 1} \omega^{im}\, |i\rangle, \tag{66}$$
ensuring in particular that all diagonal elements of each density matrix $|s_m\rangle\langle s_m|$ are equal. This entails that $\widetilde{S}_+$ and $\widetilde{S}_-$ both have constant diagonals. Define now the maps
$$\Lambda_\pm(X) := X \odot \widetilde{S}_\pm. \tag{67}$$
Since $\widetilde{S}_\pm \geq 0$, each such map is completely positive [71, Thm. 3.7], and clearly $\Lambda'_\pm := \Lambda_\pm\, d_A / \operatorname{Tr}(\widetilde{S}_\pm)$ is trace preserving, as we have just seen that $(\widetilde{S}_\pm)_{ii} = (\widetilde{S}_\pm)_{jj} \;\forall\, i, j$. Thus we have a decomposition as
$$\Delta_{\mathbf{p}}^{-1} = \frac{\operatorname{Tr} \widetilde{S}_+}{d_A}\, \Lambda'_+ - \frac{\operatorname{Tr} \widetilde{S}_-}{d_A}\, \Lambda'_-, \qquad \Lambda'_\pm \in \mathrm{CPTP}, \tag{68}$$
from which we get the bound $\big\| \Delta_{\mathbf{p}}^{-1} \big\|_{\blacklozenge} \leq \frac{1}{d_A} \big\| \widetilde{S} \big\|_1$. On the other hand, let $|\psi\rangle = \frac{1}{\sqrt{d_A}} \sum_{i=0}^{d_A - 1} |i\rangle$ and use Prop. 10 to get
$$\left\| \Delta_{\mathbf{p}}^{-1} \right\|_{\blacklozenge} \geq \left\| \Delta_{\mathbf{p}}^{-1}(|\psi\rangle\langle\psi|) \right\|_1 = \frac{1}{d_A} \big\| \widetilde{S} \big\|_1. \tag{69}$$
The remaining equality $\big\| J_{\Delta_{\mathbf{p}}^{-1}} \big\|_1 = \big\| \widetilde{S} \big\|_1$ is obtained by noticing that $J_{\Delta_{\mathbf{p}}^{-1}} = \sum_{i,j} (\widetilde{S})_{ij}\, |ii\rangle\langle jj|$, which has the same eigenvalues as $\widetilde{S}$.

The eigenvalues of $\widetilde{S}$ can be readily obtained due to the fact that it is a circulant matrix [70, 2.2.P10], allowing for a straightforward computation of the trace norm $\big\| \widetilde{S} \big\|_1$ and altogether giving
$$\left\| \Delta_{\mathbf{p}}^{-1} \right\|_\diamond = \frac{1}{d_A} \sum_{m=0}^{d_A - 1} \left| \sum_{j=0}^{d_A - 1} \left( \sum_{i=0}^{d_A - 1} p_i\, \omega^{j(i - m)} \right)^{-1} \right|. \tag{70}$$
For the qubit dephasing channel with $p \in [0, \frac{1}{2})$, we recover
$$\left\| \Delta_p^{-1} \right\|_\diamond = \frac{1}{2} \left\| \begin{pmatrix} 1 & \frac{1}{1 - 2p} \\ \frac{1}{1 - 2p} & 1 \end{pmatrix} \right\|_1 = \frac{1}{1 - 2p}. \tag{71}$$
Since each eigenvector $|s_m\rangle$ of $\widetilde{S}$ in (66) corresponds to the application of $Z_m$, the maps $\Lambda'_\pm$ in (68) are realised as probabilistic applications of the generalised phase unitaries.

Amplitude damping noise.
The qubit amplitude damping channel $\mathcal{A}_\gamma(\cdot) = A_0 \cdot A_0^\dagger + A_1 \cdot A_1^\dagger$ is defined by the Kraus operators $A_0 := |0\rangle\langle 0| + \sqrt{1 - \gamma}\, |1\rangle\langle 1|$ and $A_1 := \sqrt{\gamma}\, |0\rangle\langle 1|$. Using the fact that
$$|1\rangle\langle 1| = \frac{1}{1 - \gamma}\, \mathcal{A}_\gamma(|1\rangle\langle 1|) - \frac{\gamma}{1 - \gamma}\, |0\rangle\langle 0| = \frac{1}{1 - \gamma}\, \mathcal{A}_\gamma(|1\rangle\langle 1|) - \frac{\gamma}{1 - \gamma}\, \mathcal{A}_\gamma(|0\rangle\langle 0|), \tag{72}$$
we have
$$\mathcal{A}_\gamma^{-1}(|1\rangle\langle 1|) = \frac{1}{1 - \gamma}\, |1\rangle\langle 1| - \frac{\gamma}{1 - \gamma}\, |0\rangle\langle 0|. \tag{73}$$
Proposition 10 thus gives
$$\left\| \mathcal{A}_\gamma^{-1} \right\|_{\blacklozenge} = \left\| \mathcal{A}_\gamma^{-1} \right\|_\diamond \geq \left\| \mathcal{A}_\gamma^{-1}(|1\rangle\langle 1|) \right\|_1 = \frac{1 + \gamma}{1 - \gamma}. \tag{74}$$
A matching upper bound can be obtained by explicitly computing $J_{\mathcal{A}_\gamma^{-1}}$ (see e.g. [17, 30]) and using the upper bound in Prop. 8.

The above shows a rather general method of obtaining lower bounds for linear maps which are inverses of other linear maps, without having to explicitly compute the full inverse map. Indeed, this can be extended to maps which only approximately invert a given channel. This is useful, for instance, when dealing with non-invertible maps, or when aiming to reduce the cost of implementing a given map by only requiring that it approximately mitigates the error.

Proposition 12.

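Let $\Phi \in \mathbb{H}(A, B)$ and $\widetilde{\Phi} \in \mathbb{H}(B, A)$ be such that $\left\| \widetilde{\Phi} \circ \Phi(\rho) - \rho \right\|_1 \leq \varepsilon$ for all $\rho \in \mathbb{D}(A)$. Then

$$\begin{aligned}
\left\| \widetilde{\Phi} \right\|_{\diamond} &\geq \max \left\{ \|Z\|_1 (1 - \varepsilon) \;\middle|\; \Phi(Z) \in \mathbb{D}(B) \right\} \\
\left\| \widetilde{\Phi} \right\|_{\blacklozenge} &\geq \max \left\{ \operatorname{Tr} Z_- + \operatorname{Tr} Q_+ - \varepsilon \left( \|Z\|_1 + \|Q\|_1 \right) \;\middle|\; \Phi(Z), \Phi(Q) \in \mathbb{D}(B) \right\} \\
R(\widetilde{\Phi}) &\geq \max \left\{ \operatorname{Tr} Z_- + \operatorname{Tr} Q_+ - \operatorname{Tr} \Phi(Q) - \varepsilon \left( \|Z\|_1 + \|Q\|_1 \right) \;\middle|\; \Phi(Z), \Phi(Q) \geq 0,\; \operatorname{Tr} \Phi(Z + Q) = 1 \right\} \\
R'(\widetilde{\Phi}) &\geq \max \left\{ \operatorname{Tr} Z_+ - 1 - \varepsilon \|Z\|_1 \;\middle|\; \Phi(Z) \in \mathbb{D}(B) \right\} \\
R''(\widetilde{\Phi}) &\geq \max \left\{ \operatorname{Tr} Z_- - \varepsilon \|Z\|_1 \;\middle|\; \Phi(Z) \in \mathbb{D}(B) \right\}.
\end{aligned} \tag{75}$$

Proof.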

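We use Prop. 10 to get that

$$\begin{aligned}
\left\| \widetilde{\Phi} \right\|_{\diamond} &\geq \max \left\{ \left\| \widetilde{\Phi}(\sigma) \right\|_1 \;\middle|\; \sigma \in \mathbb{D}(B) \cap \operatorname{ran}(\Phi) \right\} \\
&= \max \left\{ \left\| \widetilde{\Phi} \circ \Phi(Z) \right\|_1 \;\middle|\; \Phi(Z) \in \mathbb{D}(B) \right\} \\
&\geq \max \left\{ \|Z\|_1 - \left\| Z - \widetilde{\Phi} \circ \Phi(Z) \right\|_1 \;\middle|\; \Phi(Z) \in \mathbb{D}(B) \right\} \\
&\geq \max \left\{ \|Z\|_1 (1 - \varepsilon) \;\middle|\; \Phi(Z) \in \mathbb{D}(B) \right\}.
\end{aligned} \tag{76}$$

The third line follows by the triangle inequality, and the last line is a consequence of the assumption that $\left\| \widetilde{\Phi} \circ \Phi(\rho) - \rho \right\|_1 \leq \varepsilon$ for all $\rho \in \mathbb{D}(A)$, since we can write any $Z = \mu_+ \rho_+ - \mu_- \rho_-$ for some $\rho_\pm \in \mathbb{D}(A)$ (taking the Jordan decomposition, so that $\mu_+ + \mu_- = \|Z\|_1$) to get $\left\| \widetilde{\Phi} \circ \Phi(Z) - Z \right\|_1 \leq \varepsilon (\mu_+ + \mu_-) \leq \varepsilon \|Z\|_1$. The case of the other measures is analogous: using the variational form of the function $\operatorname{Tr} Z_+$ (and similarly $\operatorname{Tr} Z_-$) we can obtain, for $\sigma = \Phi(Z)$,

$$\begin{aligned}
\operatorname{Tr} \widetilde{\Phi}(\sigma)_+ &= \max \left\{ \left\langle \widetilde{\Phi} \circ \Phi(Z), W \right\rangle \;\middle|\; 0 \leq W \leq \mathbb{1} \right\} \\
&= \max \left\{ \langle Z, W \rangle - \left\langle Z - \widetilde{\Phi} \circ \Phi(Z), W \right\rangle \;\middle|\; 0 \leq W \leq \mathbb{1} \right\} \\
&\geq \max \left\{ \langle Z, W \rangle - \left\| Z - \widetilde{\Phi} \circ \Phi(Z) \right\|_1 \|W\|_\infty \;\middle|\; 0 \leq W \leq \mathbb{1} \right\} \\
&\geq \operatorname{Tr} Z_+ - \varepsilon \|Z\|_1
\end{aligned} \tag{77}$$

where we used the Cauchy-Schwarz (Hölder) inequality $|\langle X, W \rangle| \leq \|X\|_1 \|W\|_\infty$. Using these bounds in Prop. 10 yields the stated result.

Leakage error. Consider the qubit leakage error $\mathcal{L}_p(\cdot) = L_p \cdot L_p^\dagger$ where $L_p := |0\rangle\langle 0| + \sqrt{1-p}\, |1\rangle\langle 1|$. This represents a situation where the excited state is lost with probability $p$ (equivalently, retained with probability $1 - p$), and this stochastic nature is reflected in the fact that $\mathcal{L}_p$ is not trace preserving. The inverse of the leakage error is given by $\mathcal{L}_p^{-1}(\cdot) = L_p^{-1} \cdot L_p^{-1}$. Since this is a completely positive map, Eq. (53) gives

$$\left\| \mathcal{L}_p^{-1} \right\|_{\blacklozenge} = \left\| \mathcal{L}_p^{-1} \right\|_{\diamond} = \left\| \operatorname{Tr}_B J_{\mathcal{L}_p^{-1}} \right\|_\infty = \frac{1}{1-p}. \tag{78}$$

Note that the inverse can be realised as

$$\mathcal{L}_p^{-1} = \frac{1}{2} \left( 1 + \frac{1}{\sqrt{1-p}} \right) \mathrm{id} - \frac{1}{2} \left( \frac{1}{\sqrt{1-p}} - 1 \right) \mathcal{Z} + \frac{p}{1-p}\, \Pi_{|1\rangle\langle 1|} \tag{79}$$

where $\mathcal{Z}(\cdot) := Z \cdot Z$ with $Z = |0\rangle\langle 0| - |1\rangle\langle 1|$ being the Pauli $Z$ matrix, and $\Pi_{|1\rangle\langle 1|}(\cdot) := |1\rangle\langle 1| \cdot |1\rangle\langle 1|$ being the projection onto the state $|1\rangle$.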
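The closed-form values quoted in the three noise examples above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original derivation: the function names are ours, the root of unity is assumed to be $\omega = e^{2\pi i / d_A}$, the dephasing channel is assumed invertible (all inner sums nonzero), and the coefficients used for the leakage inverse are our reading of Eq. (79), namely $\frac{1}{2}(1 + a)$, $\frac{1}{2}(a - 1)$ and $\frac{p}{1-p}$ with $a = 1/\sqrt{1-p}$.

```python
import cmath


def dephasing_inverse_cost(p):
    """Evaluate the right-hand side of Eq. (70) for a probability vector p.

    Assumes omega = exp(2*pi*i/d) and that every inner sum is nonzero,
    i.e. that the generalised dephasing channel is invertible.
    """
    d = len(p)
    omega = cmath.exp(2j * cmath.pi / d)
    total = 0.0
    for m in range(d):
        # m-th eigenvalue of the circulant matrix S
        eigenvalue = sum(
            1 / sum(p[i] * omega ** (j * (i - m)) for i in range(d))
            for j in range(d)
        )
        total += abs(eigenvalue)
    return total / d


def amplitude_damping(rho, gamma):
    """Apply the qubit amplitude damping channel to a 2x2 nested list."""
    (r00, r01), (r10, r11) = rho
    s = (1 - gamma) ** 0.5
    return [[r00 + gamma * r11, s * r01], [s * r10, (1 - gamma) * r11]]


def leakage_inverse(rho, p):
    """Apply L^{-1} rho L^{-1} with L^{-1} = diag(1, 1/sqrt(1-p))."""
    a = 1 / (1 - p) ** 0.5
    (r00, r01), (r10, r11) = rho
    return [[r00, a * r01], [a * r10, a * a * r11]]


def leakage_inverse_decomposed(rho, p):
    """Combination of id, Z.Z and the |1><1| projection, as we read Eq. (79)."""
    a = 1 / (1 - p) ** 0.5
    c_id, c_z, c_proj = (1 + a) / 2, (a - 1) / 2, p / (1 - p)
    (r00, r01), (r10, r11) = rho
    z_term = [[r00, -r01], [-r10, r11]]  # Z rho Z
    proj = [[0.0, 0.0], [0.0, r11]]      # |1><1| rho |1><1|
    return [
        [c_id * rho[i][k] - c_z * z_term[i][k] + c_proj * proj[i][k]
         for k in range(2)]
        for i in range(2)
    ]


if __name__ == "__main__":
    p, gamma = 0.1, 0.2
    # Qubit dephasing: compare Eq. (70) with the closed form 1/(1-2p).
    print(dephasing_inverse_cost([1 - p, p]), 1 / (1 - 2 * p))
    # Amplitude damping: the operator in Eq. (73) is mapped back to |1><1|,
    # and its trace norm (sum of absolute diagonal entries) is (1+g)/(1-g).
    x = [[-gamma / (1 - gamma), 0.0], [0.0, 1 / (1 - gamma)]]
    print(amplitude_damping(x, gamma))
    print(abs(x[0][0]) + abs(x[1][1]), (1 + gamma) / (1 - gamma))
    # Leakage: both expressions for the inverse agree on a test input.
    rho = [[0.3, 0.2], [0.2, 0.7]]
    print(leakage_inverse(rho, p), leakage_inverse_decomposed(rho, p))
```

The sketch only verifies the specific identities used in the examples; it does not compute the diamond norm of a general map, which would require a semidefinite programme.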

8. Discussion

We introduced a comprehensive quantitative approach to the study of non-completely-positive linear maps, focusing in particular on the task of approximating and simulating them with valid quantum channels. To this end, we considered several quantifiers which generalise measures employed in the study of quantum resources, namely, variants of the robustness and base norm measures. We showed that they satisfy very close relations with the diamond norm, and in particular are exactly equal to it for any trace-preserving linear map. Since such trace-preserving maps are the most commonly encountered examples of dynamics beyond physical quantum channels, this allowed us to establish fruitful interrelations between the quantities, and discover new applications of the fundamentally important quantity that is the diamond norm. We developed in particular two operational connections. Firstly, we introduced a method of simulating general linear maps with quantum channels, shifting the difficulty of realising non-quantum dynamics onto the structurally simpler task of implementing linear combinations of quantum states. We showed that our robustness measure exactly quantifies the cost of realising such schemes in terms of the required state-based resources. Secondly, we showed that another variant of the robustness finds use as an exact quantifier of the performance advantage that a general linear map can enable over quantum channels in a class of state discrimination games. We introduced a number of useful bounds and explicitly employed them to demonstrate the computability of the measures for some representative examples.
Finally, we showed how our measures can find use in the quantitative characterisation of several practically relevant settings, namely, structural approximations of positive maps, non-Markovianity quantification, and tightly bounding the cost of probabilistic error mitigation.

Although we focused on the application of our framework to Hermiticity-preserving maps, we note that more general linear maps can be treated in a similar way. The simplest way to approach this is to decompose any linear map $\Phi$ into its Hermiticity-preserving and skew-Hermiticity-preserving parts, that is, write $\Phi = \Phi_{\mathrm{H}} + i\, \Phi_{\mathrm{SH}}$ where the constituent maps are defined through $J_{\Phi_{\mathrm{H}}} := \frac{1}{2} (J_\Phi + J_\Phi^\dagger)$ and $J_{\Phi_{\mathrm{SH}}} := \frac{1}{2i} (J_\Phi - J_\Phi^\dagger)$. The maps $\Phi_{\mathrm{H}}$ and $\Phi_{\mathrm{SH}}$ are then explicitly Hermiticity-preserving, and our arguments can be applied to them directly. A similar approach was employed in [72] to decompose the two-point quantum correlator $\mathcal{T} : \mathbb{L}(A) \to \mathbb{L}(A \otimes A)$, defined as the map satisfying $\operatorname{Tr}[\mathcal{T}(\rho)(A \otimes B)] = \operatorname{Tr}[A \rho B]$ for all $A, B$. Indeed, one can show that the decompositions constructed in [72] are also optimal for the robustness-based quantities.

A major outstanding issue is to understand how the framework can be extended to non-linear maps, which could allow for the characterisation and more efficient approximation of important unphysical dynamics such as quantum cloners. This question was already asked in the earliest works concerned with approximating non-CPTP maps with quantum channels [5], but it still remains a considerable challenge to devise approaches which could apply to general non-linear transformations.
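The decomposition $\Phi = \Phi_{\mathrm{H}} + i\, \Phi_{\mathrm{SH}}$ mentioned in the Discussion acts directly on the Choi matrix, so it can be illustrated with a few lines of code. The sketch below is ours and uses only plain Python; the helper names `dagger`, `hermitian_parts` and `is_hermitian` are hypothetical, and the input is assumed to be a square matrix stored as nested lists of numbers.

```python
def dagger(m):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(m)
    return [[complex(m[k][i]).conjugate() for k in range(n)] for i in range(n)]


def hermitian_parts(j_phi):
    """Split J into (J_H, J_SH) such that J = J_H + 1j * J_SH.

    J_H  = (J + J^dagger) / 2
    J_SH = (J - J^dagger) / (2i)
    Both returned matrices are Hermitian.
    """
    n = len(j_phi)
    j_dag = dagger(j_phi)
    j_h = [[(j_phi[i][k] + j_dag[i][k]) / 2 for k in range(n)]
           for i in range(n)]
    j_sh = [[(j_phi[i][k] - j_dag[i][k]) / 2j for k in range(n)]
            for i in range(n)]
    return j_h, j_sh


def is_hermitian(m, tol=1e-12):
    """Check M == M^dagger entrywise up to a tolerance."""
    d = dagger(m)
    n = len(m)
    return all(abs(m[i][k] - d[i][k]) <= tol
               for i in range(n) for k in range(n))


if __name__ == "__main__":
    # An arbitrary non-Hermitian matrix standing in for a Choi matrix.
    j_phi = [[1 + 2j, 3], [1j, 4 - 1j]]
    j_h, j_sh = hermitian_parts(j_phi)
    print(is_hermitian(j_h), is_hermitian(j_sh))
    recombined = [[j_h[i][k] + 1j * j_sh[i][k] for k in range(2)]
                  for i in range(2)]
    print(recombined)
```

Once the two Hermitian parts are available, each can be treated separately with the measures for Hermiticity-preserving maps.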

Acknowledgments

We acknowledge fruitful discussions with Joonwoo Bae, Francesco Buscemi, Ludovico Lami, Varun Narasimhachar, Jayne Thompson, and Xiao Yuan. This research is supported by the National Research Foundation (NRF), Singapore, under its NRFF Fellow program (Award No. NRF-NRFF2016-02), the National Research Foundation and Agence Nationale de la Recherche joint Project No. NRF2017-NRFANR004 VanQuTe, the Singapore Ministry of Education Tier 1 Grant RG162/19 (S) and grant No. FQXi-RFP-IPW-1903 from the Foundational Questions Institute and Fetzer Franklin Fund (a donor advised fund of Silicon Valley Community Foundation). B.R. is supported by the Presidential Postdoctoral Fellowship from Nanyang Technological University, Singapore. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore.

[1] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, tenth ed. (Cambridge University Press, New York, NY, USA, 2011).
[2] M. Horodecki, P. Horodecki, and R. Horodecki, Separability of mixed states: Necessary and sufficient conditions, Physics Letters A, 1 (1996).
[3] O. Gühne and G. Tóth, Entanglement detection, Phys. Rep., 1 (2009).
[4] R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki, Quantum entanglement, Rev. Mod. Phys., 865 (2009).
[5] P. Horodecki, From limits of quantum operations to multicopy entanglement witnesses and state-spectrum estimation, Phys. Rev. A, 052101 (2003).
[6] P. Horodecki and A. Ekert, Method for Direct Detection of Quantum Entanglement, Phys. Rev. Lett., 127902 (2002).
[7] J. K. Korbicz, M. L. Almeida, J. Bae, M. Lewenstein, and A. Acín, Structural approximations to positive maps and entanglement-breaking channels, Phys. Rev. A, 062105 (2008).
[8] J. Bae, Designing quantum information processing via structural physical approximation, Rep. Prog. Phys., 104001 (2017).
[9] P. Pechukas, Reduced Dynamics Need Not Be Completely Positive, Phys. Rev. Lett., 1060 (1994).
[10] A. Shaji and E. C. G. Sudarshan, Who’s afraid of not completely positive maps? Physics Letters A, 48 (2005).
[11] C. A. Rodríguez-Rosario, K. Modi, A.-m. Kuah, A. Shaji, and E. C. G. Sudarshan, Completely positive maps and classical correlations, J. Phys. A: Math. Theor., 205301 (2008).
[12] H. A. Carteret, D. R. Terno, and K. Życzkowski, Dynamics beyond completely positive maps: Some properties and applications, Phys. Rev. A, 042113 (2008).
[13] R. Alicki, Comment on “Reduced Dynamics Need Not Be Completely Positive”, Phys. Rev. Lett., 3020 (1995).
[14] K. Modi, Operational approach to open dynamics and quantifying initial correlations, Sci. Rep., 581 (2012).
[15] D. Schmid, K. Ried, and R. W. Spekkens, Why initial system-environment correlations do not imply the failure of complete positivity: A causal perspective, Phys. Rev. A, 022112 (2019).
[16] P. Shor, Fault-tolerant quantum computation, in Proceedings of 37th Conference on Foundations of Computer Science (1996) pp. 56–65.
[17] K. Temme, S. Bravyi, and J. M. Gambetta, Error Mitigation for Short-Depth Quantum Circuits, Phys. Rev. Lett., 180509 (2017).
[18] Y. Li and S. C. Benjamin, Efficient Variational Quantum Simulator Incorporating Active Error Minimization, Phys. Rev. X, 021050 (2017).
[19] A. Y. Kitaev, Quantum computations: Algorithms and error correction, Russ. Math. Surv., 1191 (1997).
[20] J. Watrous, Notes on super-operator norms induced by Schatten norms, arXiv:quant-ph/0411077 (2004).
[21] J. Watrous, Theory of Quantum Information (University of Waterloo, 2011).
[22] G. Vidal and R. Tarrach, Robustness of entanglement, Phys. Rev. A, 141 (1999).
[23] M. G. Díaz, K. Fang, X. Wang, M. Rosati, M. Skotiniotis, J. Calsamiglia, and A. Winter, Using and reusing coherence to realize quantum processes, Quantum, 100 (2018).
[24] R. Takagi and B. Regula, General Resource Theories in Quantum Mechanics and Beyond: Operational Characterization via Discrimination Tasks, Phys. Rev. X, 031053 (2019).
[25] Z.-W. Liu and A. Winter, Resource theories of quantum channels and the universal role of resource erasure, arXiv:1904.04201 (2019).
[26] G. Gour and A. Winter, How to Quantify a Dynamical Quantum Resource, Phys. Rev. Lett., 150401 (2019).
[27] R. Uola, T. Kraft, and A. A. Abbott, Quantification of quantum dynamics with input-output games, Phys. Rev. A, 052306 (2020).
[28] X. Yuan, Y. Liu, Q. Zhao, B. Regula, J. Thompson, and M. Gu, Universal and Operational Benchmarking of Quantum Memories, arXiv:1907.02521 (2020).
[29] R. Takagi, K. Wang, and M. Hayashi, Application of the Resource Theory of Channels to Communication Scenarios, Phys. Rev. Lett., 120502 (2020).
[30] R. Takagi, Optimal resource cost for error mitigation, arXiv:2006.12509 (2020).
[31] B. Regula and R. Takagi, Fundamental limitations on quantum channel manipulation, arXiv:2010.11942 (2020).
[32] B. Regula and R. Takagi, One-shot manipulation of dynamical quantum resources, arXiv:2012.02215 (2020).
[33] J. Jiang, K. Wang, and X. Wang, Physical Implementability of Quantum Maps and Its Application in Error Mitigation, arXiv:2012.10959 (2020).
[34] J. Watrous, The Theory of Quantum Information (Cambridge University Press, Cambridge, 2018).
[35] A. Gilchrist, N. K. Langford, and M. A. Nielsen, Distance measures to compare real and ideal quantum processes, Phys. Rev. A, 062310 (2005).
[36] A. Jenčová, Base norms and discrimination of generalized quantum channels, Journal of Mathematical Physics, 022201 (2014).
[37] J. Watrous, Semidefinite programs for completely bounded norms, Theory Comput., 217 (2009).
[38] J. Watrous, Simpler semidefinite programs for completely bounded norms, Chic. J. Th. Comp. Sci., 1 (2013).
[39] C. Piveteau, D. Sutter, and S. Woerner, Quasiprobability decompositions with reduced sampling overhead, arXiv:2101.09290 (2021).
[40] U. Michel, M. Kliesch, R. Kueng, and D. Gross, Comments on “Improving Compressed Sensing With the Diamond Norm”–Saturation of the Norm Inequalities Between Diamond and Nuclear Norm, IEEE Trans. Inf. Theory, 7443 (2018).
[41] I. Nechita, Z. Puchała, Ł. Pawela, and K. Życzkowski, Almost all quantum channels are equidistant, Journal of Mathematical Physics, 052201 (2018).
[42] H. Pashayan, J. J. Wallman, and S. D. Bartlett, Estimating Outcome Probabilities of Quantum Circuits Using Quasiprobabilities, Phys. Rev. Lett., 070501 (2015).
[43] C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels, Phys. Rev. Lett., 1895 (1993).
[44] M. Berta, F. G. S. L. Brandão, M. Christandl, and S. Wehner, Entanglement Cost of Quantum Channels, IEEE Trans. Inf. Theory, 6779 (2013).
[45] S. Pirandola, R. Laurenza, C. Ottaviani, and L. Banchi, Fundamental limits of repeaterless quantum communications, Nat. Commun., 15043 (2017).
[46] M. M. Wilde, Entanglement cost and quantum channel simulation, Phys. Rev. A (2018), 10.1103/PhysRevA.98.042338.
[47] G. Gour and C. M. Scandolo, The Entanglement of a Bipartite Channel, arXiv:1907.02552 (2019).
[48] S. Bäuml, S. Das, X. Wang, and M. M. Wilde, Resource theory of entanglement for bipartite quantum channels, arXiv:1907.04181 (2019).
[49] D. Gottesman and I. L. Chuang, Demonstrating the viability of universal quantum computation using teleportation and single-qubit operations, Nature, 390 (1999).
[50] J. R. Seddon and E. T. Campbell, Quantifying magic for multi-qubit operations, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 20190251 (2019).
[51] K. Ben Dana, M. García Díaz, M. Mejatty, and A. Winter, Resource theory of coherence: Beyond states, Phys. Rev. A, 062327 (2017).
[52] J. Geller and M. Piani, Quantifying non-classical and beyond-quantum correlations in the unified operator formalism, J. Phys. A: Math. Theor., 424030 (2014).
[53] J. F. Fitzsimons, J. A. Jones, and V. Vedral, Quantum correlations which imply causation, Sci. Rep., 18281 (2015).
[54] E. Kaur and M. M. Wilde, Amortized entanglement of a quantum channel and approximately teleportation-simulable channels, J. Phys. A: Math. Theor., 035303 (2017).
[55] D. Rosset, F. Buscemi, and Y.-C. Liang, Resource Theory of Quantum Memories and Their Faithful Verification with Minimal Assumptions, Phys. Rev. X, 021033 (2018).
[56] K. Życzkowski, P. Horodecki, A. Sanpera, and M. Lewenstein, Volume of the set of separable states, Phys. Rev. A, 883 (1998).
[57] M. Kliesch, R. Kueng, J. Eisert, and D. Gross, Improving Compressed Sensing With the Diamond Norm, IEEE Trans. Inf. Theory, 7445 (2016).
[58] F. Shultz, The structural physical approximation conjecture, Journal of Mathematical Physics, 015218 (2015).
[59] H.-T. Lim, Y.-S. Kim, Y.-S. Ra, J. Bae, and Y.-H. Kim, Experimental Realization of an Approximate Partial Transpose for Photonic Two-Qubit Systems, Phys. Rev. Lett., 160401 (2011).
[60] M.-D. Choi, Some assorted inequalities for positive linear maps on C*-algebras, J. Oper. Theory, 271 (1980).
[61] Q. Dong, M. T. Quintino, A. Soeda, and M. Murao, Implementing positive maps with multiple copies of an input state, Phys. Rev. A, 052352 (2019).
[62] Á. Rivas, S. F. Huelga, and M. B. Plenio, Entanglement and Non-Markovianity of Quantum Evolutions, Phys. Rev. Lett., 050403 (2010).
[63] D. Chruściński and S. Maniscalco, Degree of Non-Markovianity of Quantum Evolution, Phys. Rev. Lett., 120404 (2014).
[64] Á. Rivas, S. F. Huelga, and M. B. Plenio, Quantum non-Markovianity: Characterization, quantification and detection, Rep. Prog. Phys., 094001 (2014).
[65] S. Endo, S. C. Benjamin, and Y. Li, Practical Quantum Error Mitigation for Near-Future Applications, Phys. Rev. X, 031027 (2018).
[66] Y. Xiong, D. Chandra, S. X. Ng, and L. Hanzo, Sampling overhead analysis of quantum error mitigation: Uncoded vs. coded systems, IEEE Access, 228967 (2020).
[67] A. Shabani and D. A. Lidar, Maps for general open quantum systems and a theory of linear quantum error correction, Phys. Rev. A, 012309 (2009).
[68] H. Hakoshima, Y. Matsuzaki, and S. Endo, Relationship between costs for quantum error mitigation and non-markovian measures, Phys. Rev. A, 012611 (2021).
[69] I. Devetak and P. W. Shor, The Capacity of a Quantum Channel for Simultaneous Transmission of Classical and Quantum Information, Commun. Math. Phys., 287 (2005).
[70] R. A. Horn and C. R. Johnson, Matrix Analysis (Cambridge University Press, 2012).
[71] V. Paulsen, Completely Bounded Maps and Operator Algebras (Cambridge University Press, 2002).
[72] F. Buscemi, M. Dall’Arno, M. Ozawa, and V. Vedral, Direct observation of any two-point quantum correlation function, arXiv:1312.4240 (2013).
[73] S. Boyd and L. Vandenberghe, Convex Optimization (Cambridge University Press, New York, 2004).
[74] J. P. Ponstein, Approaches to the Theory of Optimization (Cambridge University Press, 2004).

Appendix A: Dual forms

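Here we derive the dual expressions of the measures, as stated in Prop. 9. The derivation follows standard arguments in convex optimisation [73, 74] (see also [24, App. B]). Let us explicitly consider the case of the diamond norm. As our starting point, we will take the primal optimisation problem as in Lem. 1:

$$\|\Phi\|_{\diamond, p} = \min \left\{ \mu \;\middle|\; J_\Phi = M_+ - M_-,\; M_\pm \geq 0,\; \operatorname{Tr}_B (M_+ + M_-) \leq \mu\, \mathbb{1}_A \right\}. \tag{A1}$$

The Lagrangian of this problem is given by

$$\begin{aligned}
L(\mu, M_+, M_-; W, P, Q, R) &= \mu - \langle M_+ - M_- - J_\Phi, W \rangle - \langle M_+, P \rangle - \langle M_-, Q \rangle - \left\langle \mu\, \mathbb{1}_A - \operatorname{Tr}_B (M_+ + M_-), R \right\rangle \\
&= \mu (1 - \operatorname{Tr} R) + \langle M_+, -W - P + R \otimes \mathbb{1}_B \rangle + \langle M_-, W - Q + R \otimes \mathbb{1}_B \rangle + \langle J_\Phi, W \rangle
\end{aligned} \tag{A2}$$

where $W, P, Q \in \mathbb{H}(A \otimes B)$, $R \in \mathbb{H}(A)$ are Lagrange multipliers, and we used that $\langle S, \operatorname{Tr}_B T \rangle = \langle S \otimes \mathbb{1}_B, T \rangle$ holds for any $S \in \mathbb{H}(A)$, $T \in \mathbb{H}(A \otimes B)$. The dual problem is then defined as

$$\begin{aligned}
\|\Phi\|_{\diamond, d} &:= \sup_{\substack{W \in \mathbb{H} \\ P, Q, R \geq 0}} \; \inf_{\substack{\mu \in \mathbb{R} \\ M_+, M_- \in \mathbb{H}}} L(\mu, M_+, M_-; W, P, Q, R) \\
&= \sup_{\substack{W \in \mathbb{H} \\ P, Q, R \geq 0}} \begin{cases} \langle J_\Phi, W \rangle & \text{if } \operatorname{Tr} R = 1,\; W + P = R \otimes \mathbb{1}_B \text{ and } W - Q = -R \otimes \mathbb{1}_B \\ -\infty & \text{otherwise} \end{cases} \\
&= \sup \left\{ \langle J_\Phi, W \rangle \;\middle|\; W \geq -R \otimes \mathbb{1}_B,\; W \leq R \otimes \mathbb{1}_B,\; R \geq 0,\; \operatorname{Tr} R = 1 \right\},
\end{aligned} \tag{A3}$$

with the supremum achieved since the feasible set is compact. A strictly feasible solution, that is, a feasible solution for which the inequality constraints are strict, can be constructed by decomposing $J_\Phi = J_{\Phi+} - J_{\Phi-}$ and defining $M_\pm := J_{\Phi\pm} + \varepsilon\, \mathbb{1}_{A \otimes B}$ with $\mu$ suitably large. By Slater’s theorem (see e.g. [74]), the existence of a strictly feasible solution ensures that $\|\Phi\|_{\diamond, p} = \|\Phi\|_{\diamond, d}$.

The dual forms of the other measures are obtained in full analogy with the derivation above. The crucial observation is that an optimisation of the form

$$\|\Phi\|_{\blacklozenge} = \min \left\{ \lambda_+ + \lambda_- \;\middle|\; \Phi = \lambda_+ \Lambda_+ - \lambda_- \Lambda_-,\; \Lambda_\pm \in \mathrm{CPTNI} \right\} \tag{A4}$$

can be rewritten as

$$\|\Phi\|_{\blacklozenge} = \min \left\{ \lambda_+ + \lambda_- \;\middle|\; J_\Phi = M_+ - M_-,\; M_\pm \geq 0,\; \operatorname{Tr}_B M_\pm \leq \lambda_\pm \mathbb{1}_A \right\}$$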