Markovianizing Cost of Tripartite Quantum States

Abstract

We introduce and analyze a task that we call Markovianization, in which a tripartite quantum state is transformed to a quantum Markov chain by a randomizing operation on one of the three subsystems. We consider cases where the initial state is the tensor product of $n$ copies of a tripartite state $\rho^{ABC}$, and is transformed to a quantum Markov chain conditioned by $B^n$ with a small error, using a random unitary operation on $A^n$. In an asymptotic limit of infinite copies and vanishingly small error, we analyze the Markovianizing cost, that is, the minimum cost of randomness per copy required for Markovianization. For tripartite pure states, we derive a single-letter formula for the Markovianizing costs. Counterintuitively, the Markovianizing cost is not a continuous function of states, and can be arbitrarily large even if the state is close to a quantum Markov chain. Our results have an application in analyzing the cost of resources for simulating a bipartite unitary gate by local operations and classical communication.


Eyuri Wakakuwa, Akihito Soeda and Mio Murao


I. INTRODUCTION

Tripartite quantum states for which the quantum conditional mutual information (QCMI) is zero are called short quantum Markov chains, or Markov states for short [1]. They play important roles, e.g., in analyzing the cost of quantum state redistribution [2], [3], investigating effects of the initial system-environment correlation on the dynamics of quantum states [4], and computing the free energy of quantum many-body systems [5].

In analogy to the quantum mutual information (QMI) of a bipartite state quantifying a distance to the closest product states, it would be natural to expect a similar relation between QCMI of a tripartite state and Markov states. However, this conjecture has been falsified [6] (see also [7], [8]). The recent results show that the relation between QCMI and Markov states is not so straightforward [6]–[10], particularly when compared to the relation between QMI and product states.

From an operational point of view, QMI quantifies the minimum cost of randomness required for destroying the correlation between two quantum systems in an asymptotic limit of infinite copies [11]. This fact and its variants including single-shot cases are called decoupling theorems, and have played a significant role in the development of quantum information theory for a decade [12]–[17]. In a simple analogy, one may ask the following question: Is QCMI equal to the minimum cost of randomness required for transforming a tripartite state to a Markov state?

This work is supported by the Project for Developing Innovation Systems of MEXT, Japan and JSPS KAKENHI (Grant No. 23540463, No. 23240001, No. 26330006, and No. 15H01677). We also gratefully acknowledge the ELC project (Grant-in-Aid for Scientific Research on Innovative Areas MEXT KAKENHI (Grant No. 24106009)) for encouraging the research presented in this paper. This work was presented in part at ISIT 2015. E. Wakakuwa is with the Department of Communication Engineering and Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, Japan (email: [email protected]). A. Soeda is with the Department of Physics, Graduate School of Science, The University of Tokyo, Japan. M. Murao is with the Department of Physics, Graduate School of Science, The University of Tokyo, and with the Institute for Nano Quantum Information Electronics, The University of Tokyo.

In this paper, we address this question, and answer in the negative. We derive a single-letter formula for the "Markovianizing cost" of pure states, that is, the minimum cost of randomness per copy required for Markovianizing tripartite pure states in the asymptotic limit of infinite copies. The obtained formula is not equal to QCMI, nor is it even a continuous function of states. Moreover, the Markovianizing cost of a state can be arbitrarily large, regardless of how close the state is to a Markov state. In the proof, we improve a random coding method using the Haar distributed random unitary ensemble, which is widely used in the proof of the decoupling theorems, by incorporating the mathematical structure of Markov states.

There are two ways of defining the property of tripartite quantum states being "approximately Markov": one by the condition that the state is close to a Markov state, on which our definition of Markovianization in this paper is based; and the other by the condition that the state is approximately recoverable [10], i.e., there exists a quantum operation $\mathcal{E} : B \rightarrow BC$ such that $\rho^{ABC} \approx \mathcal{E}(\rho^{AB})$. Ref. [10] proved that the latter condition has a direct connection with QCMI, namely, small QCMI implies recoverability with a small error.

In [18], we introduce another formulation of the Markovianizing cost by employing the concept of recoverability, and prove that the cost function is equal to the one obtained in this paper for pure states. We then apply the results in analyzing the cost of entanglement and classical communication for simulating a bipartite unitary gate by local operations and classical communication [19]. As a consequence, we prove in [20] that there is a trade-off relation between the entanglement cost and the number of rounds of communication for a two-party distributed quantum information processing.

The structure of this paper is as follows. In Section II, we review mathematical theorems regarding the structure of quantum Markov chains, which are extensively used in this paper. In Section III, we introduce the formal definition of Markovianization, and describe the main results. Outlines of proofs of the main results are presented in Section IV. In Section V, we describe properties of the Markovianizing cost. In Section VI, we calculate the Markovianizing cost of particular classes of tripartite pure states to illustrate its properties. Conclusions are given in Section VII. See Appendices for detailed proofs.

Notations.

A Hilbert space associated with a quantum system $A$ is denoted by $\mathcal{H}^A$, and its dimension is denoted by $d_A$. For $\rho \in \mathcal{S}(\mathcal{H}^A)$, we denote $\mathrm{supp}[\rho] \subseteq \mathcal{H}^A$ by $\mathcal{H}^A_\rho$. A system composed of two subsystems $A$ and $B$ is denoted by $AB$. When $M$ and $N$ are linear operators on $\mathcal{H}^A$ and $\mathcal{H}^B$, respectively, we denote $M \otimes N$ as $M^A \otimes N^B$ for clarity. We abbreviate $|\psi\rangle^A \otimes |\phi\rangle^B$ as $|\psi\rangle^A |\phi\rangle^B$. The identity operator on a Hilbert space is denoted by $I$. We denote $(M^A \otimes I^B)|\psi\rangle^{AB}$ as $M^A |\psi\rangle^{AB}$, and $(M^A \otimes I^B)\rho^{AB}(M^A \otimes I^B)^\dagger$ as $M^A \rho^{AB} M^{A\dagger}$. We abbreviate $M^{AB}(\rho^A \otimes I^B)M^{\dagger AB}$ as $M^{AB} \rho^A M^{\dagger AB}$. When $\mathcal{E}$ is a quantum operation on $A$, we denote $(\mathcal{E} \otimes \mathrm{id}^B)(\rho^{AB})$ as $(\mathcal{E}^A \otimes \mathrm{id}^B)(\rho^{AB})$ or $\mathcal{E}^A(\rho^{AB})$. For $\rho^{AB}$, $\rho^A$ represents $\mathrm{Tr}_B[\rho^{AB}]$. We denote $|\psi\rangle\langle\psi|$ simply as $\psi$. A system composed of $n$ identical systems of $A$ is denoted by $A^n$ or $\bar{A}$, and the corresponding Hilbert space is denoted by $(\mathcal{H}^A)^{\otimes n}$ or $\mathcal{H}^{\bar{A}}$. The Shannon entropy of a probability distribution is denoted as $H(\{p_j\}_j)$, and the von Neumann entropy of a state $\rho^A$ is interchangeably denoted by $S(\rho^A)$ and $S(A)_\rho$. $\log x$ represents the base 2 logarithm of $x$.

II. PRELIMINARIES

In this section, we present a decomposition of a Hilbert space called the Koashi-Imoto (KI) decomposition, which is introduced in [21] and is extensively used in the following part of this paper. We then summarize a result in [1], which states that the structure of Markov states is characterized by the KI decomposition.

A. Koashi-Imoto Decomposition

For any set of states on a quantum system, operations on that system are classified into two categories: those that do not change any state in the set, and those that change at least one state in the set. It is proved in [21] that there exists an effectively unique way of decomposing a Hilbert space into a direct-sum form, in such a way that all quantum operations that do not change a given set of states have a simple form with respect to the decomposition. We call this decomposition of the Hilbert space the Koashi-Imoto decomposition, or the KI decomposition for short. As we verify in the Remark in this section, the KI decomposition can equivalently be represented in the form of a tensor product of three Hilbert spaces. Theorem 3 in [21], which proves the existence of the KI decomposition, is described in this tensor-product form as follows.

Theorem 1 ([21], see also Theorem 9 in [1])

Consider a quantum system $A$ described by a finite dimensional Hilbert space $\mathcal{H}^A$. Associated to any set of states $\mathfrak{S} := \{\rho_k\}_k$ on $A$, there exist three Hilbert spaces $\mathcal{H}^{a}$, $\mathcal{H}^{a_L}$, $\mathcal{H}^{a_R}$, an orthonormal basis $\{|j\rangle\}_{j \in J}$ of $\mathcal{H}^{a}$ and a linear isometry $\Gamma$ from $\mathcal{H}^A_{\mathfrak{S}} := \mathrm{supp}(\sum_k \rho_k) \subseteq \mathcal{H}^A$ to $\mathcal{H}^{a} \otimes \mathcal{H}^{a_L} \otimes \mathcal{H}^{a_R}$, such that the following properties hold. (For later convenience, we exchange labels $L$ and $R$ in the original formulation.)

1) The states in $\mathfrak{S}$ are decomposed by $\Gamma$ as

$$\Gamma \rho_k \Gamma^\dagger = \sum_{j \in J} p_{j|k}\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \rho_{j|k}^{a_R} \tag{1}$$

with some probability distribution $\{p_{j|k}\}_{j \in J}$ on $J := \{1, \cdots, \dim \mathcal{H}^{a}\}$, states $\omega_j \in \mathcal{S}(\mathcal{H}^{a_L})$ and $\rho_{j|k} \in \mathcal{S}(\mathcal{H}^{a_R})$.

2) A quantum operation $\mathcal{E}$ on $\mathcal{S}(\mathcal{H}^A_{\mathfrak{S}})$ leaves all $\rho_k$ invariant if and only if there exists an isometry $U : \mathcal{H}^A_{\mathfrak{S}} \rightarrow \mathcal{H}^A_{\mathfrak{S}} \otimes \mathcal{H}^E$ such that a Stinespring dilation of $\mathcal{E}$ is given by $\mathcal{E}(\tau) = \mathrm{Tr}_E[U \tau U^\dagger]$, and that $U$ is decomposed by $\Gamma$ as

$$(\Gamma \otimes I^E)\, U\, \Gamma^\dagger = \sum_{j \in J} |j\rangle\langle j|^{a} \otimes U_j^{a_L} \otimes I_j^{a_R}. \tag{2}$$

Here, $I_j$ are the identity operators on $\mathcal{H}_j^{a_R} := \mathrm{supp} \sum_k \rho_{j|k}$, and $U_j : \mathcal{H}_j^{a_L} \rightarrow \mathcal{H}_j^{a_L} \otimes \mathcal{H}^E$ are isometries that satisfy $\mathrm{Tr}_E[U_j \omega_j U_j^\dagger] = \omega_j$ for all $j$, where $\mathcal{H}_j^{a_L} := \mathrm{supp}\, \omega_j$.

3) $\Gamma$ satisfies

$$\mathrm{img}\, \Gamma = \bigoplus_{j \in J} \mathcal{H}_j^{a} \otimes \mathcal{H}_j^{a_L} \otimes \mathcal{H}_j^{a_R}, \tag{3}$$

where $\mathcal{H}_j^{a}$ are one-dimensional subspaces of $\mathcal{H}^{a}$ spanned by $|j\rangle$.

4) $\mathcal{H}^{a}$, $\mathcal{H}^{a_L}$ and $\mathcal{H}^{a_R}$ are minimal in the sense that $\dim \mathcal{H}^{a_L} = \max_{j \in J} \dim \mathcal{H}_j^{a_L}$, $\dim \mathcal{H}^{a_R} = \max_{j \in J} \dim \mathcal{H}_j^{a_R}$ and $\forall j \in J, \exists k$ s.t. $p_{j|k} > 0$.

We call $\Gamma$ the KI isometry on system $A$ with respect to $\mathfrak{S}$. The KI decomposition and the corresponding KI isometry are uniquely determined from $\mathfrak{S}$, up to trivial changes of the basis (Lemma 7 in [21]). The dimensions of $\mathcal{H}^{a}$, $\mathcal{H}^{a_L}$ and $\mathcal{H}^{a_R}$ are at most $d_A$. An algorithm for obtaining the KI decomposition is proposed in [21].

It is also proved in [21] that the sets of states $\{\rho_{j|k}\}_k$ in (1) are irreducible in the following sense.

Lemma 2 (Corollary of Lemma 6 in [21])

The set of states $\{\rho_{j|k}\}_k$ in (1) satisfies the following properties.

1) If a linear operator $N$ on $\mathcal{H}_j^{a_R} := \mathrm{supp} \sum_k \rho_{j|k}$ satisfies $p_{j|k} N \rho_{j|k} = p_{j|k} \rho_{j|k} N$ for all $k$, then $N = c I_j^{a_R}$ for a complex number $c$, where $I_j^{a_R}$ is the identity operator on $\mathcal{H}_j^{a_R}$.

2) If a linear operator $N : \mathcal{H}_j^{a_R} \rightarrow \mathcal{H}_{j'}^{a_R}$ $(j \neq j')$ satisfies $p_{j|k} N \rho_{j|k} = p_{j'|k} \rho_{j'|k} N$ for all $k$, then $N = 0$.

Let us now describe an extension of the KI decomposition to bipartite quantum states, which is introduced in [1]. Associated to any bipartite state $\Psi^{AA'} \in \mathcal{S}(\mathcal{H}^A \otimes \mathcal{H}^{A'})$, there exists a set of states on $A$ to which system $A$ can be steered through $\Psi^{AA'}$, i.e., the set of states that can be prepared by performing a measurement on $A'$ on the state $\Psi^{AA'}$ and post-selecting one outcome. The KI decomposition of $A$ with respect to the set is then associated to $\Psi^{AA'}$. It happens that any quantum operation on $A$ which leaves all states in the set invariant also leaves $\Psi^{AA'}$ invariant, and vice versa. Hence the set of operations preserving $\Psi^{AA'}$ is completely characterized by the corresponding KI decomposition. More precisely, we have the following statements.

Definition 3

Consider quantum systems $A$ and $A'$ described by finite dimensional Hilbert spaces $\mathcal{H}^A$ and $\mathcal{H}^{A'}$, respectively. The KI decomposition of system $A$ with respect to a bipartite state $\Psi^{AA'} \in \mathcal{S}(\mathcal{H}^A \otimes \mathcal{H}^{A'})$ is defined as the KI decomposition of $A$ with respect to the following set $\mathfrak{S}_{\Psi}^{A' \rightarrow A}$ of states, i.e., using $\mathcal{L}(\mathcal{H}^{A'})$ to denote the set of linear operators on $\mathcal{H}^{A'}$,

$$\mathfrak{S}_{\Psi}^{A' \rightarrow A} := \{\rho \in \mathcal{S}(\mathcal{H}^A) \mid \exists M \in \mathcal{L}(\mathcal{H}^{A'}) \text{ s.t. } \rho = \mathrm{Tr}_{A'}[M^{A'} \Psi^{AA'} M^{\dagger A'}]\}. \tag{4}$$

The KI isometry on system $A$ with respect to $\Psi^{AA'}$ is defined as that with respect to $\mathfrak{S}_{\Psi}^{A' \rightarrow A}$.

Lemma 4 (See the proof of Theorem 6 in [1] and Equality (14) therein.)

Let $\Gamma$ be the KI isometry on $A$ with respect to $\Psi^{AA'}$, and define $\mathcal{H}^A_\Psi := \mathrm{supp}[\Psi^A] \subseteq \mathcal{H}^A$. $\Gamma$ satisfies the following properties.

1) $\Gamma$ gives

$$\Psi_{KI}^{AA'} := \Gamma^A \Psi^{AA'} \Gamma^{\dagger A} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \varphi_j^{a_R A'} \tag{5}$$

with some probability distribution $\{p_j\}_{j \in J}$, orthonormal basis $\{|j\rangle\}_{j \in J}$ of $\mathcal{H}^{a}$, states $\omega_j \in \mathcal{S}(\mathcal{H}^{a_L})$ and $\varphi_j \in \mathcal{S}(\mathcal{H}^{a_R} \otimes \mathcal{H}^{A'})$.

2) A quantum operation $\mathcal{E}$ on $\mathcal{S}(\mathcal{H}^A_\Psi)$ leaves $\Psi^{AA'}$ invariant only if there exists an isometry $U : \mathcal{H}^A_\Psi \rightarrow \mathcal{H}^A_\Psi \otimes \mathcal{H}^E$ such that a Stinespring dilation of $\mathcal{E}$ is given by $\mathcal{E}(\tau) = \mathrm{Tr}_E[U \tau U^\dagger]$, and that $U$ is decomposed by $\Gamma$ as in (2), in which case we define $\mathcal{H}_j^{a_R} := \mathrm{supp}\, \varphi_j^{a_R}$.

We call (5) the KI decomposition of $\Psi^{AA'}$ on $A$. The following lemma, regarding the equivalence between the KI isometries of two bipartite states, immediately follows.

Lemma 5

The following conditions are equivalent when $\mathcal{H}^A_{\Psi_1} = \mathcal{H}^A_{\Psi_2} = \mathcal{H}^A$:

1) A quantum operation on $A$ leaves a state $\Psi_1^{AA'}$ invariant if and only if it leaves a state $\Psi_2^{AA''}$ invariant.

2) The KI isometries on $A$ with respect to $\Psi_1^{AA'}$ and $\Psi_2^{AA''}$ are the same.

We define the sub-KI isometries as follows.

Definition 6

Consider a bipartite state $\Psi^{AA'}$, three Hilbert spaces $\mathcal{H}^{a}$, $\mathcal{H}^{a_L}$, $\mathcal{H}^{a_R}$ and let $\Gamma$ be a linear isometry from $\mathcal{H}^A_\Psi$ to $\mathcal{H}^{a} \otimes \mathcal{H}^{a_L} \otimes \mathcal{H}^{a_R}$. We call $\Gamma$ a sub-KI isometry on system $A$ with respect to $\Psi^{AA'}$ if it satisfies Condition 1) in Lemma 4.

B. Markov States

A tripartite quantum state $\Upsilon^{ABC}$ is called a Markov state conditioned by $B$ if it satisfies $I(A:C|B)_\Upsilon = 0$. It is proved in [1] that the structure of Markov states is characterized by the KI decomposition as follows.

Theorem 7 (See Theorem 6 in [1] and the proof thereof.)

The following conditions are equivalent:

1) $\Upsilon^{ABC}$ is a Markov state conditioned by $B$.

2) There exist three Hilbert spaces $\mathcal{H}^{b}$, $\mathcal{H}^{b_L}$, $\mathcal{H}^{b_R}$ and a linear isometry $\Gamma$ from $\mathcal{H}^B_\Upsilon := \mathrm{supp}[\Upsilon^B]$ to $\mathcal{H}^{b} \otimes \mathcal{H}^{b_L} \otimes \mathcal{H}^{b_R}$ such that $\Upsilon^{ABC}$ is decomposed by $\Gamma$ as

$$\Gamma^B \Upsilon^{ABC} \Gamma^{\dagger B} = \sum_i q_i\, |i\rangle\langle i|^{b} \otimes \sigma_i^{A b_L} \otimes \phi_i^{b_R C} \tag{6}$$

with some probability distribution $\{q_i\}_i$, orthonormal basis $\{|i\rangle\}_i$ of $\mathcal{H}^{b}$, states $\sigma_i \in \mathcal{S}(\mathcal{H}^A \otimes \mathcal{H}^{b_L})$ and $\phi_i \in \mathcal{S}(\mathcal{H}^{b_R} \otimes \mathcal{H}^C)$.

3) $\Upsilon^{ABC}$ is decomposed in the form of (6) with $\Gamma$ being the KI isometry on $B$ with respect to $\Upsilon^{BC}$.

4) There exist quantum operations $\mathcal{R}$ from $B$ to $BC$ and $\mathcal{R}'$ from $B$ to $AB$ such that

$$\Upsilon^{ABC} = \mathcal{R}(\Upsilon^{AB}) = \mathcal{R}'(\Upsilon^{BC}). \tag{7}$$

We call (6) a Markov decomposition of a Markov state $\Upsilon^{ABC}$ (Figure 1).

Fig. 1. A graphical representation of the Markov decomposition of the Markov state (6). Each vertex corresponds to a quantum system, and the white circle represents a 'classical' system, where the state of the whole system is diagonal with respect to $|i\rangle^{b}$. The dotted lines represent mixed states. The whole state is the probabilistic mixture of the above state with probability $q_i$, namely, $\sum_i q_i |i\rangle\langle i|^{b} \otimes \sigma_i^{A b_L} \otimes \phi_i^{b_R C}$.
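The defining condition $I(A:C|B)_\Upsilon = 0$ can be checked numerically for small systems. The following is a minimal sketch, assuming NumPy is available; the helper names `entropy`, `reduced` and `qcmi` are hypothetical and are not taken from the paper. It evaluates $I(A:C|B) = S(AB) + S(BC) - S(ABC) - S(B)$ for a density matrix on $A \otimes B \otimes C$.

```python
import numpy as np

def entropy(rho, tol=1e-12):
    """Von Neumann entropy in bits of a density matrix."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > tol]
    return float(-np.sum(ev * np.log2(ev)))

def reduced(rho, dims, keep):
    """Partial trace over every subsystem whose index is not in `keep`."""
    dims = tuple(dims)
    t = rho.reshape(dims + dims)
    # Trace out from the last subsystem to the first so that the axis
    # positions of the remaining subsystems stay valid.
    for i in reversed(range(len(dims))):
        if i not in keep:
            t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def qcmi(rho, dims):
    """I(A:C|B) = S(AB) + S(BC) - S(ABC) - S(B), subsystem order (A, B, C)."""
    return (entropy(reduced(rho, dims, (0, 1)))
            + entropy(reduced(rho, dims, (1, 2)))
            - entropy(rho)
            - entropy(reduced(rho, dims, (1,))))

# Three-qubit GHZ state, chosen here only as an illustration.
ghz = np.zeros(8)
ghz[0] = ghz[7] = 2 ** -0.5
rho_ghz = np.outer(ghz, ghz)

# Random unitary operation on A that applies I or Z with probability 1/2 each.
Z_A = np.kron(np.diag([1.0, -1.0]), np.eye(4))
rho_mixed = 0.5 * (rho_ghz + Z_A @ rho_ghz @ Z_A)
```

In this illustration, `qcmi(rho_ghz, (2, 2, 2))` evaluates to 1 bit, so the GHZ state is not a Markov state, while `qcmi(rho_mixed, (2, 2, 2))` vanishes: mixing over two unitaries on $A$ already yields a state of the form (6) with trivial $b_L$ and $b_R$.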

Remark:

The KI decomposition is first proved in [21] by an algorithmic construction, and by an algebraic proof in [1] afterward. A similar decomposition is derived in [22] and [23] in the context of "information preserving structure". In this literature, the decomposition is given in the form of the direct sum of Hilbert spaces as $\bigoplus_j \mathcal{H}_j^{L} \otimes \mathcal{H}_j^{R}$. This is equivalent to the decomposition in the form of a tensor product of three Hilbert spaces described in this section, as verified by choosing $\mathcal{H}^{a}$, $\mathcal{H}^{a_L}$ and $\mathcal{H}^{a_R}$ such that $\dim \mathcal{H}^{a} = |J|$, $\dim \mathcal{H}^{a_L} = \max_{j \in J} \dim \mathcal{H}_j^{L}$ and $\dim \mathcal{H}^{a_R} = \max_{j \in J} \dim \mathcal{H}_j^{R}$. The corresponding KI isometry is defined as

$$\Gamma := \sum_{j \in J} |j\rangle^{a} \otimes (\Gamma_{L,j} \otimes \Gamma_{R,j}) P_j, \tag{8}$$

where $\Gamma_{L,j} : \mathcal{H}_j^{L} \rightarrow \mathcal{H}^{a_L}$ and $\Gamma_{R,j} : \mathcal{H}_j^{R} \rightarrow \mathcal{H}^{a_R}$ are linear isometries and $P_j$ is the projection onto $\mathcal{H}_j^{L} \otimes \mathcal{H}_j^{R} \subseteq \mathcal{H}$. As stressed in [21], $\mathcal{H}^{a}$ in (1) holds the "classical" part of information possessed by $\rho_k$, $\mathcal{H}^{a_R}$ the "quantum" part, and $\mathcal{H}^{a_L}$ the redundant part.

III. DEFINITIONS AND MAIN RESULTS

In this section, we introduce the formal definition of Markovianization, and state the main results on the Markovianizing cost of tripartite pure states. The outlines of proofs are given in Section IV. Rigorous proofs will be given in Appendices B and C.

Definition 8

A tripartite state $\rho^{ABC}$ is Markovianized with the randomness cost $R$ on $A$, conditioned by $B$, if the following statement holds. That is, for any $\epsilon > 0$, there exists $n_\epsilon$ such that for any $n \geq n_\epsilon$, we find a random unitary operation $\mathcal{V}_n : \tau \mapsto 2^{-nR} \sum_{k=1}^{2^{nR}} V_k \tau V_k^\dagger$ on $A^n$ and a Markov state $\Upsilon^{A^n B^n C^n}$ conditioned by $B^n$ that satisfy

$$\left\| \mathcal{V}_n^{A^n}(\rho^{\otimes n}) - \Upsilon^{A^n B^n C^n} \right\|_1 \leq \epsilon. \tag{9}$$

The Markovianizing cost of $\rho^{ABC}$ is defined as $M_{A|B}(\rho^{ABC}) := \inf\{R \mid \rho^{ABC}$ is Markovianized with the randomness cost $R$ on $A$, conditioned by $B\}$.

The following theorem is the main contribution of this work. The outline of the proof is given in the next section.

Theorem 9

Let $|\Psi\rangle^{ABC}$ be a pure state, and let

$$\Psi_{KI}^{AC} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \varphi_j^{a_R C} \tag{10}$$

be the KI decomposition of $\Psi^{AC}$ on $A$. Then we have

$$M_{A|B}(\Psi^{ABC}) = H(\{p_j\}_{j \in J}) + 2 \sum_{j \in J} p_j S(\varphi_j^{a_R}).$$

Based on this theorem, it is possible to compute the Markovianizing cost of a pure state once we obtain the KI decomposition of its bipartite reduced density matrix. However, the algorithm for obtaining the KI decomposition, which is proposed in [21], involves repeated application of decompositions of the Hilbert space into subspaces, and is difficult to execute in general.

Below we propose an algorithm by which we can compute the Markovianizing cost for a particular class of pure states, without obtaining an explicit form of the KI decomposition. The algorithm is based on the following theorem, which connects the Markovianizing cost of a pure state and the Petz recovery map corresponding to the state. Here, the Petz recovery map of a tripartite state $\Psi^{ABC}$ from $A$ to $AC$, an idea first introduced in [1], is defined by

$$\mathcal{R}_\Psi^{A \rightarrow AC}(\tau) = (\Psi^{AC})^{\frac{1}{2}} (\Psi^A)^{-\frac{1}{2}}\, \tau\, (\Psi^A)^{-\frac{1}{2}} (\Psi^{AC})^{\frac{1}{2}} \quad (\forall \tau \in \mathcal{S}(\mathcal{H}^A)).$$

A proof of the theorem will be given in Appendix C.
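Once the KI decomposition (10) is known, evaluating the formula of Theorem 9 is elementary. The sketch below is only an illustration under stated assumptions: the function names `shannon` and `cost_from_ki` are hypothetical, and the inputs (the probabilities $p_j$ and the eigenvalues of each $\varphi_j^{a_R}$) are assumed to have been obtained beforehand by some other means.

```python
from math import log2

def shannon(probs):
    """Shannon entropy in bits; zero entries are skipped."""
    return -sum(p * log2(p) for p in probs if p > 0)

def cost_from_ki(p, spectra):
    """Evaluate H({p_j}) + 2 * sum_j p_j * S(phi_j^{a_R}).

    p       -- probabilities p_j of the classical part a
    spectra -- spectra[j] lists the eigenvalues of phi_j^{a_R}
    """
    return shannon(p) + 2 * sum(pj * shannon(s) for pj, s in zip(p, spectra))
```

For example, a decomposition with two equiprobable classical branches and trivial $a_R$ gives `cost_from_ki([0.5, 0.5], [[1.0], [1.0]])`, which equals 1 bit, while a single branch whose $\varphi^{a_R}$ is a maximally mixed qubit gives `cost_from_ki([1.0], [[0.5, 0.5]])`, which equals 2 bits.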

Theorem 10

Let $|\Psi\rangle^{ABC}$ be a pure state, such that a CPTP map $\mathcal{E}$ on $\mathcal{S}(\mathcal{H}^A_\Psi)$ defined by

$$\mathcal{E} := \mathrm{Tr}_C \circ \mathcal{R}_\Psi^{A \rightarrow AC} \tag{11}$$

is self-adjoint. Define another CPTP map $\mathcal{E}_\infty$ by

$$\mathcal{E}_\infty := \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^{N} \mathcal{E}^n, \tag{12}$$

and consider the state

$$\Psi_\infty^{ABC} := \mathcal{E}_\infty^A(|\Psi\rangle\langle\Psi|^{ABC}). \tag{13}$$

Then we have

$$M_{A|B}(\Psi^{ABC}) = S(\Psi_\infty^{ABC}). \tag{14}$$

Fig. 2. A graphical representation of the KI decomposition of tripartite pure states (15). Each vertex corresponds to a quantum system. The solid lines express pure states. The whole state is the superposition of the above states with the probability amplitude $\sqrt{p_j}$, namely, $\sum_{j \in J} \sqrt{p_j}\, |j\rangle^{a} |j\rangle^{b} |\omega_j\rangle^{a_L b_L} |\varphi_j\rangle^{a_R b_R C}$.

Due to this theorem, the Markovianizing cost of pure states can be computed by the following algorithm, based on a matrix representation of CPTP maps. Here, $\{|k\rangle\}_{k=1}^{d_A}$ is an orthonormal basis of $\mathcal{H}^A$, and $[\cdot]_{kl,mn}$ denotes a matrix element in the $kl$-th row and the $mn$-th column. (See also the Remark in Appendix C-B.)

1) Compute $d_A^2$-dimensional square matrices $\Lambda_1$, $\Lambda_2$ and $\Lambda$ given by

$$[\Lambda_1]_{kl,mn} = \langle k|(\Psi^A)^{\frac{1}{2}}|m\rangle \langle n|(\Psi^A)^{\frac{1}{2}}|l\rangle,$$

$$[\Lambda_2]_{kl,mn} = \mathrm{Tr}\left[ \langle k|^A (\Psi^{AC})^{\frac{1}{2}} |m\rangle^A \langle n|^A (\Psi^{AC})^{\frac{1}{2}} |l\rangle^A \right]$$

and $\Lambda = \Lambda_2 \Lambda_1^{-1}$, where the superscript $-1$ denotes the generalized inverse.

2) Check the hermiticity of $\Lambda$, which is equivalent to the self-adjointness of $\mathcal{E}$. If it is Hermitian, continue to Step 3. If not, this algorithm is not applicable.

3) Compute a matrix $\Lambda_\infty$ corresponding to $\mathcal{E}_\infty$, which is given by the projection onto the eigensubspace of $\Lambda$ corresponding to the eigenvalue 1. Then compute $\tilde{\Lambda}_\infty = \Lambda_\infty \Lambda_1$.

4) Compute $\Omega_\infty^{AA'}$ given by

$$\Omega_\infty^{AA'} = \sum_{klmn} [\tilde{\Lambda}_\infty]_{kl,mn}\, |k\rangle\langle l|^A \otimes |m\rangle\langle n|^{A'}.$$

5) Compute the Shannon entropy of the eigenvalues of $\Omega_\infty^{AA'}$, which is equal to $M_{A|B}(\Psi^{ABC})$.

IV. OUTLINE OF PROOFS OF THE MAIN THEOREMS
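Before turning to the proofs, the five-step algorithm stated at the end of Section III can be sketched numerically. The following is an illustrative sketch rather than the authors' implementation: it assumes NumPy, the names `psd_power` and `markovianizing_cost` are hypothetical, and instead of building $\Lambda_1$, $\Lambda_2$ and $\Omega_\infty^{AA'}$ index by index it represents $\mathcal{E}$ through the Kraus operators $\langle k|^C (\Psi^{AC})^{1/2} |l\rangle^C (\Psi^A)^{-1/2}$ and applies $\mathcal{E}_\infty$ directly to $|\Psi\rangle\langle\Psi|$, with $BC$ playing the role of $A'$. The numerical tolerances are arbitrary choices.

```python
import numpy as np

def psd_power(m, p, tol=1e-12):
    """m**p for a positive semidefinite matrix, restricted to its support
    (this is the generalized inverse when p < 0)."""
    w, v = np.linalg.eigh(m)
    keep = w > tol
    wp = np.zeros_like(w)
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def markovianizing_cost(psi, dA, dB, dC, tol=1e-8):
    """Return S(Psi_infinity) in bits, or None if E is not self-adjoint."""
    t = np.asarray(psi, dtype=complex).reshape(dA, dB, dC)
    rho_AC = np.einsum('abc,xby->acxy', t, t.conj()).reshape(dA * dC, dA * dC)
    rho_A = np.einsum('abc,xbc->ax', t, t.conj())
    sqrt_AC = psd_power(rho_AC, 0.5).reshape(dA, dC, dA, dC)
    inv_sqrt_A = psd_power(rho_A, -0.5)
    # Kraus operators of E = Tr_C o (Petz recovery map).
    kraus = [sqrt_AC[:, k, :, l] @ inv_sqrt_A
             for k in range(dC) for l in range(dC)]
    # Matrix of E on row-major vectorized operators: vec(K X K^dag) = (K (x) K*) vec(X).
    E = sum(np.kron(K, K.conj()) for K in kraus)
    if not np.allclose(E, E.conj().T, atol=tol):
        return None  # Step 2 fails: the algorithm is not applicable.
    # Step 3: projection onto the eigenvalue-1 eigenspace of E.
    w, v = np.linalg.eigh(E)
    fixed = v[:, np.abs(w - 1.0) < tol]
    E_inf = (fixed @ fixed.conj().T).reshape(dA, dA, dA, dA)
    # Steps 4 and 5: apply E_inf to the A part of |Psi><Psi| and take the entropy.
    M = t.reshape(dA, dB * dC)
    rho = np.einsum('br,cs->bcrs', M, M.conj())
    out = np.einsum('axbc,bcrs->arxs', E_inf, rho)
    out = out.reshape(dA * dB * dC, dA * dB * dC)
    ev = np.linalg.eigvalsh(out)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))
```

As a sanity check on simple inputs, a three-qubit GHZ state gives approximately 1 bit, a Bell pair shared between $A$ and $B$ with a one-dimensional $C$ gives approximately 0, and a Bell pair shared between $A$ and $C$ with a one-dimensional $B$ gives approximately 2 bits. In these three cases $\mathcal{E}$ is the completely dephasing map, the identity map, and the completely depolarizing map, respectively, all of which are self-adjoint.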

In this section, we describe the outline, main concepts and technical ingredients for the proofs of Theorems 9 and 10. Detailed proofs are given in Appendices B and C.

We first introduce an adaptation of the KI decomposition to tripartite pure states as follows.

Fig. 3. A graphical representation of the state transformation from $\Psi'_{n,\delta}$ in (17) to $\bar{\Psi}_{n,\delta}$ in (19) by a random unitary operation given by (18).

Lemma 11

Let $|\Psi\rangle^{ABC}$ be a tripartite pure state and suppose that the KI decomposition of $\Psi^{AC}$ on $A$ is given by

$$\Gamma^A \Psi^{AC} \Gamma^{\dagger A} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \varphi_j^{a_R C}.$$

There exists a linear isometry $\Gamma' : \mathcal{H}^B_\Psi \rightarrow \mathcal{H}^{b} \otimes \mathcal{H}^{b_L} \otimes \mathcal{H}^{b_R}$ that decomposes $|\Psi\rangle^{ABC}$ together with $\Gamma$ as

$$(\Gamma^A \otimes \Gamma'^B) |\Psi\rangle^{ABC} = \sum_{j \in J} \sqrt{p_j}\, |j\rangle^{a} |j\rangle^{b} |\omega_j\rangle^{a_L b_L} |\varphi_j\rangle^{a_R b_R C}, \tag{15}$$

where $|\omega_j\rangle^{a_L b_L}$ and $|\varphi_j\rangle^{a_R b_R C}$ are purifications of $\omega_j^{a_L}$ and $\varphi_j^{a_R C}$, respectively, and $\langle j|j'\rangle^{b} = \delta_{jj'}$. Moreover, $\Gamma'$ is the sub-KI isometry on $B$ with respect to $\Psi^{BC}$.

Proof:

The existence of $\Gamma'$ follows from Uhlmann's theorem ([24], see Appendix A-A). It is straightforward to verify that $\Gamma'$ is the sub-KI isometry, since we have

$$\Gamma'^B \Psi^{BC} \Gamma'^{\dagger B} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{b} \otimes \omega_j^{b_L} \otimes \varphi_j^{b_R C}$$

from (15). $\blacksquare$

We call (15) the KI decomposition of $|\Psi\rangle^{ABC}$ on $A$ and $B$ (Figure 2), and denote it by $|\Psi_{KI}\rangle$.

A. For Theorem 9: Achievability

The direct part of Theorem 9 is formulated by the following inequality:

$$M_{A|B}(\Psi^{ABC}) \leq H(\{p_j\}_{j \in J}) + 2 \sum_{j \in J} p_j S(\varphi_j^{a_R}). \tag{16}$$

The outline of the proof is as follows. The state $|\Psi^{ABC}\rangle^{\otimes n}$ is local unitarily equivalent to $|\Psi_{KI}\rangle^{\otimes n}$, which is almost equal to the state defined by

$$|\Psi'_{n,\delta}\rangle := \sum_{\mathbf{j} \in J_{n,\delta}} \sqrt{p_{\mathbf{j}}}\, |\mathbf{j}\rangle^{\bar{a}} |\mathbf{j}\rangle^{\bar{b}} |\omega_{\mathbf{j}}\rangle^{\bar{a}_L \bar{b}_L}\, \Pi_{\mathbf{j},\delta}^{\bar{a}_R} |\varphi_{\mathbf{j}}\rangle^{\bar{a}_R \bar{b}_R \bar{C}} \tag{17}$$

for sufficiently large $n$. Here, we have introduced notations $\mathbf{j} = j_1 \cdots j_n$, $\varphi_{\mathbf{j}} = \varphi_{j_1} \otimes \cdots \otimes \varphi_{j_n}$ and $\omega_{\mathbf{j}} = \omega_{j_1} \otimes \cdots \otimes \omega_{j_n}$. $J_{n,\delta}$ is the $\delta$-strongly typical set with respect to the probability distribution $\{p_j\}_j$, and $\Pi_{\mathbf{j},\delta}^{\bar{a}_R}$ is the projection onto the conditionally typical subspace of $\varphi_{\mathbf{j}}^{\bar{a}_R}$ conditioned by $\mathbf{j}$. Consider a unitary operation on $\mathrm{supp}[\Psi'^{\,\bar{a} \bar{a}_L \bar{a}_R}_{n,\delta}]$ of the form

$$V := \sum_{\mathbf{j} \in J_{n,\delta}} |\mathbf{j}\rangle\langle\mathbf{j}|^{\bar{a}} \otimes I_{\mathbf{j}}^{\bar{a}_L} \otimes v_{\mathbf{j}}^{\bar{a}_R}, \tag{18}$$

where $I_{\mathbf{j}}^{\bar{a}_L}$ is the identity operator on $\mathrm{supp}\, \omega_{\mathbf{j}}^{\bar{a}_L}$ and $v_{\mathbf{j}}^{\bar{a}_R}$ is a unitary on the support of $\Pi_{\mathbf{j},\delta}^{\bar{a}_R}$. We apply $V^{\bar{a} \bar{a}_L \bar{a}_R}$ on $\Psi'_{n,\delta}$ by independently choosing $v_{\mathbf{j}}^{\bar{a}_R}$ from the Haar distributed random unitary ensemble for each $\mathbf{j}$. By this random unitary operation, the state (17) is transformed to the following state

$$\bar{\Psi}_{n,\delta} := \sum_{\mathbf{j} \in J_{n,\delta}} p_{\mathbf{j}}\, |\mathbf{j}\mathbf{j}\rangle\langle\mathbf{j}\mathbf{j}|^{\bar{a} \bar{b}} \otimes |\omega_{\mathbf{j}}\rangle\langle\omega_{\mathbf{j}}|^{\bar{a}_L \bar{b}_L} \otimes \pi_{\mathbf{j}}^{\bar{a}_R} \otimes \varphi'^{\,\bar{b}_R \bar{C}}_{\mathbf{j}}, \tag{19}$$

where $\varphi'_{\mathbf{j}} := \mathrm{Tr}_{\bar{a}_R}[\Pi_{\mathbf{j},\delta}^{\bar{a}_R} |\varphi_{\mathbf{j}}\rangle\langle\varphi_{\mathbf{j}}|]$ and $\pi_{\mathbf{j}}^{\bar{a}_R} = \Pi_{\mathbf{j},\delta}^{\bar{a}_R} / \mathrm{Tr}[\Pi_{\mathbf{j},\delta}^{\bar{a}_R}]$. $\bar{\Psi}_{n,\delta}$ is a Markov state conditioned by $\bar{B}$ (Figure 3).

To Markovianize $|\Psi^{ABC}\rangle^{\otimes n}$, it is sufficient that we approximate the transformation from (17) to (19) by $\mathcal{V}_n$ with a vanishingly small error, where $V_k$ in $\mathcal{V}_n$ are unitaries which are decomposed by $\Gamma^{\otimes n}$ as in (18). By a random coding method and the operator Chernoff bound [11], it is shown that a sufficient number of unitaries in $\mathcal{V}_n$ for this approximation is almost equal to the inverse of the minimum nonzero eigenvalue of (19), which corresponds to a randomness cost of $H(\{p_j\}_{j \in J}) + 2 \sum_{j \in J} p_j S(\varphi_j^{a_R})$ bits per copy. We note that the error $\epsilon$ converges to zero exponentially in $n$.

B. For Theorem 9: Optimality

The converse part of Theorem 9 is formulated by the following inequality:

$$M_{A|B}(\Psi^{ABC}) \geq H(\{p_j\}_{j \in J}) + 2 \sum_{j \in J} p_j S(\varphi_j^{a_R}). \tag{20}$$

Let us first assume tentatively that a Markov decomposition of $\Upsilon^{\bar{A}\bar{B}\bar{C}}$ in (9) is given by

$$(\Gamma'^B)^{\otimes n}\, \Upsilon^{\bar{A}\bar{B}\bar{C}}\, (\Gamma'^{\dagger B})^{\otimes n} = \sum_{\mathbf{j} \in J^n} p'_{\mathbf{j}}\, |\mathbf{j}\rangle\langle\mathbf{j}|^{b} \otimes \sigma_{\mathbf{j}}^{\bar{A} b_L} \otimes \phi_{\mathbf{j}}^{b_R \bar{C}}, \tag{21}$$

with $\Gamma'$ being the KI isometry on $B$ with respect to $\Psi^{BC}$. In this case, it is not difficult to show that the amount of randomness per copy required for transforming $|\Psi^{\otimes n}\rangle^{\bar{A}\bar{B}\bar{C}}$ to $\Upsilon^{\bar{A}\bar{B}\bar{C}}$ is bounded below by the R.H.S. of (20). Indeed, in order to transform $|\Psi_{KI}\rangle^{\otimes n}$ to a Markov state in the form of (21), it is necessary that (i) the off-diagonal terms with respect to $|\mathbf{j}\rangle$ vanish, and (ii) the correlation between $\bar{a}_R$ and $\bar{b}_R \bar{C}$ in the state $|\varphi_{\mathbf{j}}\rangle^{\bar{a}_R \bar{b}_R \bar{C}}$ is destroyed for each $\mathbf{j}$. An optimal way of satisfying these two conditions is transforming the state (17) close to a state of the form (19). Since the entropy of the state (19) is approximately equal to $n(H(\{p_j\}_{j \in J}) + 2 \sum_{j \in J} p_j S(\varphi_j^{a_R}))$, the cost of randomness required for this transformation is at least about $H(\{p_j\}_{j \in J}) + 2 \sum_{j \in J} p_j S(\varphi_j^{a_R})$ bits per copy.

However, it might be possible in general that the amount of randomness can be further reduced by appropriately choosing $\Upsilon^{\bar{A}\bar{B}\bar{C}}$ and the corresponding KI decomposition of $\bar{B}$. We shall see that our choice presented above is indeed optimal. At the core of the proof lies the following lemma.

Fig. 4. A graphical representation of the channel $\mathcal{E}_\chi$. Due to the completely dephasing channel denoted as DP, the system $\hat{a}$ has some capacity to transmit classical information, but has no capacity to transfer entanglement. The system $\hat{a}_R$ has some capacity to transfer entanglement.

Lemma 12

Let $\Psi^{AC}$ be a bipartite quantum state, and let $\Gamma_\Psi : \mathcal{H}^A_\Psi \rightarrow \mathcal{H}^{a} \otimes \mathcal{H}^{a_L} \otimes \mathcal{H}^{a_R}$ be the KI isometry on $A$ with respect to $\Psi^{AC}$. For any $n$ and $\epsilon > 0$, let $\chi^{\bar{A}\bar{C}}$ be a state that satisfies

$$\left\| (\Psi^{\otimes n})^{\bar{A}\bar{C}} - \chi^{\bar{A}\bar{C}} \right\|_1 \leq \epsilon, \tag{22}$$

and let $\Gamma_\chi : \mathcal{H}^{\bar{A}}_\chi \rightarrow \mathcal{H}^{\hat{a}} \otimes \mathcal{H}^{\hat{a}_L} \otimes \mathcal{H}^{\hat{a}_R}$ be a sub-KI isometry on $\bar{A}$ with respect to $\chi^{\bar{A}\bar{C}}$. Denoting the decompositions $\Gamma_\Psi^A \Psi^{AC} \Gamma_\Psi^{\dagger A}$ and $\Gamma_\chi^{\bar{A}} \chi^{\bar{A}\bar{C}} \Gamma_\chi^{\dagger \bar{A}}$ by $\Psi_{KI}^{AC}$ and $\chi_{sKI}^{\bar{A}\bar{C}}$, respectively, we have

$$\frac{1}{n} \left( S(\hat{a})_{\chi_{sKI}} + 2 S(\hat{a}_R|\hat{a})_{\chi_{sKI}} \right) \geq S(a)_{\Psi_{KI}} + 2 S(a_R|a)_{\Psi_{KI}} - \zeta'_\Psi(\epsilon) \log d_A. \tag{23}$$

Here, $\zeta'_\Psi(\epsilon)$ is a function of $\epsilon > 0$ and $\Psi$, which does not depend on $n$, and satisfies $\lim_{\epsilon \rightarrow 0} \zeta'_\Psi(\epsilon) = 0$. See Equality (90) in Appendix B-D for a rigorous definition.

Proof Outline for Lemma 12:

Assume here for simplicity that

$$(\mathcal{H}^A_\Psi)^{\otimes n} = \mathcal{H}^{\bar{A}}_\chi = \mathcal{H}^{\bar{A}}, \tag{24}$$

and suppose the decompositions of $\Psi^{AC}$ and that of $\chi^{\bar{A}\bar{C}}$ are given by

$$\Psi_{KI}^{AC} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \varphi_j^{a_R C}$$

and

$$\chi_{sKI}^{\bar{A}\bar{C}} = \sum_i q_i\, |i\rangle\langle i|^{\hat{a}} \otimes \xi_i^{\hat{a}_L} \otimes \phi_i^{\hat{a}_R \bar{C}}, \tag{25}$$

respectively. Consider a quantum channel $\mathcal{E}_\chi$ on $\bar{A}$ defined by

$$\mathcal{E}_\chi(\tau) = \Gamma_\chi^\dagger \left( \sum_i |i\rangle\langle i|^{\hat{a}}\, \mathrm{Tr}_{\hat{a}_L}[\Gamma_\chi \tau \Gamma_\chi^\dagger]\, |i\rangle\langle i|^{\hat{a}} \otimes \xi_i^{\hat{a}_L} \right) \Gamma_\chi, \tag{26}$$

which is decomposed as $\mathcal{E}_{\Gamma_\chi^\dagger} \circ \mathcal{E}_3 \circ \mathcal{E}_2 \circ \mathcal{E}_1 \circ \mathcal{E}_{\Gamma_\chi}$. The maps $\mathcal{E}_{\Gamma_\chi}$ and $\mathcal{E}_{\Gamma_\chi^\dagger}$ are isometry channels corresponding to $\Gamma_\chi$ and $\Gamma_\chi^\dagger$, respectively; $\mathcal{E}_1$ is discarding of system $\hat{a}_L$; $\mathcal{E}_2$ is the completely dephasing channel on $\hat{a}$ with respect to the basis $|i\rangle$; $\mathcal{E}_3$ is appending of the state $\xi_i^{\hat{a}_L}$, conditioned by $\hat{a}$ (Figure 4). The complete positivity of $\mathcal{E}_\chi$ immediately follows from (26), and the trace-preserving property results from the fact that $\Gamma_\chi$ satisfies Condition (3).

Fig. 5. A graphical representation of the state transformation of $\psi^{\otimes n}$ under $\mathcal{E}_\chi$. The channel $\mathcal{E}_\chi$ has little effect on the state $\psi^{\otimes n}$ on average. In particular, it almost conserves the correlation that $\psi^{\otimes n}$ initially has. Thus the intermediate state $\psi_n$ has the same amount of correlation due to the monotonicity.

The state $\chi^{\bar{A}\bar{C}}$ is invariant under the action of $\mathcal{E}_\chi$, and thus $(\Psi^{\otimes n})^{\bar{A}\bar{C}}$ is almost unchanged due to (22). By extending the data compression theorem for quantum mixed-state ensembles [25], it follows that any state of the form $(\psi^{\otimes n})^{\bar{A}\bar{C}'}$ is almost unchanged by $\mathcal{E}_\chi$ on average, as long as $\psi^A = \Psi^A$ holds and the KI isometry on $A$ with respect to $\psi^{AC'}$ is equal to $\Gamma_\Psi$. We consider $\psi^{AC'}$ such that its KI decomposition on $A$ is, up to an additional decomposition on $C'$, given by

$$\psi_{KI}^{AC'} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes |\tilde{\varphi}_j\rangle\langle\tilde{\varphi}_j|^{a_R c'_R} \otimes |j\rangle\langle j|^{c'},$$

where $|\tilde{\varphi}_j\rangle^{a_R c'_R}$ is a purification of $\varphi_j^{a_R}$. The correlation between $\bar{A}$ and $\bar{C}'$ in the state $\psi^{\otimes n}$, measured by QMI, is equal to

$$n I(A:C')_\psi = n I(a a_L a_R : c' c'_R)_{\psi_{KI}} = n \left( S(a)_{\psi_{KI}} + 2 S(a_R|a)_{\psi_{KI}} \right). \tag{27}$$

It can be shown that this amount of correlation is almost conserved under $\mathcal{E}_\chi$.

Due to the monotonicity of QMI, it follows that the correlation between $\hat{a} \hat{a}_L \hat{a}_R$ and $\bar{C}'$ is approximately equal to (27) at any intermediate step of $\mathcal{E}_\chi$ (see Figure 4). After the action of the completely dephasing channel $\mathcal{E}_2$, the system $\hat{a}$ holds no quantum correlation with other systems, and thus the correlation between $\hat{a} \hat{a}_R$ and $\bar{C}'$ is bound to be at most $S(\hat{a}) + 2 S(\hat{a}_R|\hat{a})$ (Figure 5). Moreover, the state on $\hat{a} \hat{a}_R$ after $\mathcal{E}_2$ is almost equal to $\chi_{sKI}^{\hat{a} \hat{a}_R}$ due to (22). A more detailed argument reveals that

$$S(\hat{a})_{\chi_{sKI}} + 2 S(\hat{a}_R|\hat{a})_{\chi_{sKI}} \gtrsim n \left( S(a)_{\psi_{KI}} + 2 S(a_R|a)_{\psi_{KI}} \right),$$

which consequently proves (23).

C. For Theorem 10

Let us first express (11), (12) and (13) in terms of the "decomposed" Hilbert space $\mathcal{H}^{a_0} \otimes \mathcal{H}^{a_L} \otimes \mathcal{H}^{a_R}$. A Kraus representation of the map $\mathcal{E}$ defined by (11) is given by $\mathcal{E}(\cdot) = \sum_{kl} E_{kl}(\cdot)E_{kl}^\dagger$, with the Kraus operators
$$E_{kl} := \langle k|^C (\Psi^{AC})^{1/2} |l\rangle^C (\Psi^A)^{-1/2}. \tag{28}$$
Let $\Gamma$ be the KI isometry on $A$ with respect to $\Psi^{AC}$, and suppose that the KI decomposition of $\Psi^{AC}$ is given by (10). For each $E_{kl}$, we have
$$\hat{E}_{kl} := \Gamma E_{kl} \Gamma^\dagger = \sum_{j\in J} |j\rangle\langle j|^{a_0} \otimes I_j^{a_L} \otimes e_{j,kl}^{a_R}, \tag{29}$$
where
$$e_{j,kl} := \langle k|^C (\varphi_j^{a_R C})^{1/2} |l\rangle^C (\varphi_j^{a_R})^{-1/2}. \tag{30}$$
By an extension of Lemma 2, it follows that $\{e_{j,kl}\}_{kl}$ is irreducible in the sense that it satisfies Properties 1) and 2). It is straightforward to verify from (30) that the maps $\mathcal{E}_j^{a_R}$ on $\mathcal{S}(\mathcal{H}_j^{a_R})$, defined by $\mathcal{E}_j^{a_R}(\cdot) := \sum_{k,l} e_{j,kl}(\cdot)e_{j,kl}^\dagger$ $(j = 1, \cdots, |J|)$, are trace-preserving. Representations of $\mathcal{E}$ and $\mathcal{E}_\infty$ in the decomposed Hilbert space are given by
$$\hat{\mathcal{E}}(\cdot) = \sum_{kl} \hat{E}_{kl}(\cdot)\hat{E}_{kl}^\dagger \tag{31}$$
and $\hat{\mathcal{E}}_\infty := \lim_{N\to\infty} (1/N)\sum_{n=1}^N \hat{\mathcal{E}}^n$, respectively, and that of $\Psi_\infty^{ABC}$ is given by $\hat{\mathcal{E}}_\infty(|\Psi_{KI}\rangle\langle\Psi_{KI}|)$.

Due to $\mathcal{E}\circ\mathcal{E}_\infty = \mathcal{E}_\infty$ and the irreducibility of $\{e_{j,kl}\}_{kl}$, we have
$$\hat{\mathcal{E}}_\infty(|\Psi_{KI}\rangle\langle\Psi_{KI}|) = \sum_{j\in J} p'_j\,|j\rangle\langle j|^{a_0} \otimes \tilde{\omega}_j \otimes \pi_j^{a_R}, \tag{32}$$
where $\{p'_j\}_{j\in J}$ is a probability distribution and $\tilde{\omega}_j$ $(j\in J)$ are states on $a_L b_0 b_L b_R C$. Explicit forms of $\tilde{\omega}_j$ and $p'_j$ are obtained as follows. First, from (29), (31) and the trace-preserving property of $\mathcal{E}_j$, we have
$$\mathrm{Tr}[\langle j|^{a_0}\hat{\mathcal{E}}(\hat{\tau})|j\rangle^{a_0}] = \mathrm{Tr}[(\mathrm{id}^{a_L}\otimes\mathcal{E}_j^{a_R})(\langle j|^{a_0}\hat{\tau}|j\rangle^{a_0})] = \mathrm{Tr}[\langle j|^{a_0}\hat{\tau}|j\rangle^{a_0}]$$
for any $j$ and $\hat{\tau} = \Gamma\tau\Gamma^\dagger$ $(\tau\in\mathcal{S}(\mathcal{H}^{A_\Psi}))$. This implies that the probability weight with respect to the basis $\{|j\rangle\}_j$ is conserved by $\hat{\mathcal{E}}$, as well as by $\hat{\mathcal{E}}_\infty$.
Thus we have $p'_j = p_j$. Observe from (29) that $\hat{\mathcal{E}}$ and $\hat{\mathcal{E}}_\infty$ do not affect the system $a_L$, which implies
$$\tilde{\omega}_j = |\omega_j\rangle\langle\omega_j|^{a_L b_L} \otimes |j\rangle\langle j|^{b_0} \otimes \varphi_j^{b_R C}.$$
Hence the von Neumann entropy of $\Psi_\infty^{ABC}$, which is equal to that of (32), is given by
$$S(\Psi_\infty^{ABC}) = H(\{p_j\}_j) + \sum_j p_j\left( S(\varphi_j^{b_R C}) + S(\pi_j^{a_R}) \right).$$
In addition, the self-adjointness of $\mathcal{E}$ implies $\varphi_j^{a_R} = \pi_j^{a_R}$. Since $|\varphi_j\rangle^{a_R b_R C}$ is a purification of $\varphi_j^{b_R C}$, we finally obtain
$$S(\Psi_\infty^{ABC}) = H(\{p_j\}_j) + 2\sum_j p_j S(\varphi_j^{a_R}) = M_{A|B}(\Psi^{ABC}).$$

V. PROPERTIES

In this section, we describe properties of the Markovianizing cost of tripartite quantum states. We first consider arbitrary (possibly mixed) states, and then focus on the case of pure states.

Fig. 6. A graphical representation of a decomposition of tripartite pure states for which the Markovianizing cost is equal to QCMI. The whole state is the superposition of the above states with the probability amplitude $\sqrt{p_j}$, namely, $\sum_{j\in J}\sqrt{p_j}\,|j\rangle^{a_0}|j\rangle^{b_0}|j\rangle^{c_0}|\omega_j\rangle^{a_L b_L}|\varphi_{j,1}\rangle^{a_R c_1}|\varphi_{j,2}\rangle^{b_R c_2}$.

A. General Properties

Let $\rho^{ABC}$ be an arbitrary tripartite state on finite dimensional quantum systems $A$, $B$ and $C$. The Markovianizing cost of $\rho^{ABC}$ satisfies
$$I(A:C|B)_\rho \le M_{A|B}(\rho^{ABC}) \le I(A:BC)_\rho. \tag{33}$$
The second inequality directly follows from the fact that decoupling $A$ from $BC$ is sufficient for converting the state to a Markov state, and that the cost of randomness for decoupling bipartite states is asymptotically given by QMI [11]. The first inequality is proved in Appendix D. Consequently, the Markovianizing cost is equal to zero only for Markov states.

The Markovianizing cost satisfies a kind of data processing inequality, namely,
$$M_{A|B}(\mathcal{E}^C(\rho^{ABC})) \le M_{A|B}(\rho^{ABC})$$
under any quantum operation $\mathcal{E}$ on $C$. This is because any random unitary operation $\mathcal{V}_n$ on $A^n$ satisfying (9) also satisfies
$$\left\| \mathcal{V}_n^{A^n}\big([\mathcal{E}^C(\rho)]^{\otimes n}\big) - (\mathcal{E}^C)^{\otimes n}(\Upsilon^{A^nB^nC^n}) \right\|_1 \le \epsilon,$$
and $(\mathcal{E}^C)^{\otimes n}(\Upsilon^{A^nB^nC^n})$ is a Markov state conditioned by $B^n$. As a consequence, an upper bound on the Markovianizing cost of a mixed state is obtained as
$$M_{A|B}(\rho^{ABC}) \le M_{A|B}(\psi_\rho^{ABC'}),$$
where $|\psi_\rho\rangle^{ABC'}$ is a purification of $\rho^{AB}$. This is because there always exists a quantum operation $\mathcal{F}_\rho : C' \to C$ such that $\mathcal{F}_\rho(\psi_\rho^{ABC'}) = \rho^{ABC}$.

B. Pure States

Let us now consider pure states, based on the result presented in Section III. First, we see that the Markovianizing costs $M_{A|B}$ of two pure states $\Psi_1$ and $\Psi_2$ are equal if there exist $\lambda$ $(0 < \lambda \le 1)$ and $\sigma^C \in \mathcal{S}(\mathcal{H}^C)$ such that
$$\Psi_2^{AC} = \lambda\Psi_1^{AC} + (1-\lambda)\Psi_1^A \otimes \sigma^C. \tag{34}$$
This is because $\Psi_1^A = \Psi_2^A$ and the KI decompositions of $A$ with respect to $\Psi_1^{AC}$ and $\Psi_2^{AC}$ are equal, the latter of which follows from Lemma 5. Indeed, any quantum operation on $A$ which keeps $\Psi_1^{AC}$ invariant also keeps $\Psi_2^{AC}$ invariant and vice versa, as can be seen by observing from (34) that we have
$$\Psi_1^{AC} = \frac{1}{\lambda}\Psi_2^{AC} - \frac{1-\lambda}{\lambda}\Psi_1^A \otimes \sigma^C.$$
Second, we obtain a necessary and sufficient condition for the Markovianizing cost of pure states to be equal to QCMI as follows (Figure 6).

Theorem 13

We have
$$I(A:C|B)_\Psi = I(A:C)_\Psi = I(a_0a_La_R : C)_{\Psi_{KI}} = I(a_0:C)_{\Psi_{KI}} + I(a_La_R : C|a_0)_{\Psi_{KI}} = I(a_0:C)_{\Psi_{KI}} + \sum_j p_j I(a_R:C)_{\varphi_j},$$
as well as
$$M_{A|B}(\Psi^{ABC}) = H(a_0) + 2\sum_j p_j S(a_R)_{\varphi_j}.$$
Since $\Psi_{KI}^{a_0C}$ is a classical-quantum state, we have $I(a_0:C)_{\Psi_{KI}} \le H(a_0)$ with equality if and only if $\{\mathrm{supp}[\varphi_j^C]\}_j$ is mutually orthogonal. We also have $2S(a_R)_{\varphi_j} \ge I(a_R:C)_{\varphi_j}$, which is saturated if and only if $2S(a_R)_{\varphi_j} - I(a_R:C)_{\varphi_j} = I(a_R:b_R)_{\varphi_j} = 0$ (see Inequality (45) in Appendix A-B). Hence we have $M_{A|B}(\Psi^{ABC}) = I(A:C|B)_\Psi$ if and only if
$$\Psi^{AB}_{KI} = \sum_{j\in J} p_j\,|j,j\rangle\langle j,j|^{a_0b_0} \otimes |\omega_j\rangle\langle\omega_j|^{a_Lb_L} \otimes \varphi_j^{a_R} \otimes \varphi_j^{b_R},$$
which concludes the proof due to Uhlmann's theorem ([24], see Appendix A-A). $\blacksquare$

An example of states that satisfy the above conditions is given in Section VI-C.

VI. EXAMPLES

In this section, we consider examples of pure states to illustrate discontinuity and asymmetry of the Markovianizing cost. We also give an example of states for which the Markovianizing cost is equal to QCMI.

A. Discontinuity

We consider tripartite pure states that are expressed as
$$|\Psi_\lambda\rangle = \sqrt{\frac{d^2\lambda-1}{d^2-1}}\,|0\rangle^B|\Phi_d\rangle^{AC} + \sqrt{\frac{1-\lambda}{d^2-1}}\sum_{k,l=1}^{d}|kl\rangle^B|k\rangle^A|l\rangle^C,$$
where $d = \dim\mathcal{H}^A = \dim\mathcal{H}^C$, $\dim\mathcal{H}^B = d^2+1$, $1/d^2 \le \lambda \le 1$, and $\Phi_d$ is a maximally entangled state of Schmidt rank $d$, defined as
$$|\Phi_d\rangle = \frac{1}{\sqrt{d}}\sum_{k=1}^{d}|k\rangle|k\rangle. \tag{35}$$
For this state, we have
$$I(A:C|B)_{\Psi_\lambda} = 2\log d - h(\lambda) - (1-\lambda)\log(d^2-1), \tag{36}$$
where $h$ denotes the binary entropy defined by $h(\lambda) = -\lambda\log\lambda - (1-\lambda)\log(1-\lambda)$. The state is a Markov state if and only if $\lambda = 1/d^2$. The distance to the closest Markov state is bounded from above by
$$\left\|\Psi_\lambda^{ABC} - \Psi_{1/d^2}^{ABC}\right\|_1 = 2\sqrt{1 - |\langle\Psi_\lambda|\Psi_{1/d^2}\rangle|^2} = 2\sqrt{\frac{d^2\lambda-1}{d^2-1}}. \tag{37}$$
The reduced state on $AC$ is given by
$$\Psi_\lambda^{AC} = \frac{d^2\lambda-1}{d^2-1}|\Phi_d\rangle\langle\Phi_d|^{AC} + \frac{1-\lambda}{d^2-1}I^A\otimes I^C = \lambda'|\Phi_d\rangle\langle\Phi_d|^{AC} + (1-\lambda')\pi^A\otimes\pi^C,$$
where $\pi$ is the $d$-dimensional maximally mixed state and $\lambda' = (d^2\lambda-1)/(d^2-1)$. Hence the Markovianizing cost does not depend on $\lambda'$ when $\lambda' > 0$, as we proved in Section V-B. As directly verified by considering the case of $\lambda' = 1$, the Markovianizing cost is equal to $2\log d$ for $\lambda' > 0$. Taking the symmetry of $\Psi_\lambda$ between $A$ and $C$ into account, we obtain
$$M_{A|B}(\Psi_\lambda) = M_{C|B}(\Psi_\lambda) = \begin{cases} 2\log d & (\lambda > 1/d^2) \\ 0 & (\lambda = 1/d^2). \end{cases} \tag{38}$$
Hence the Markovianizing cost is not a continuous function of states. In the particular case where $\lambda = 2/d^2$, the Markovianizing cost grows logarithmically with respect to the dimension of the system, whereas QCMI, as well as the distance to the closest Markov state, approaches zero as indicated by (36) and (37).

We note that $\Psi_\lambda^{ABC}$ is "approximately recoverable" if $\lambda$ is close to $1/d^2$, i.e., it satisfies Equalities (7) approximately if $\lambda \approx 1/d^2$.
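The closed-form expressions (36), (37) and (38) can be evaluated numerically to see the discontinuity. The following is a minimal sketch, not part of the original analysis. It assumes that logarithms are taken to base 2 (the text does not fix the base), and the function names `qcmi`, `distance_to_markov` and `markovianizing_cost` are illustrative choices.

```python
import math


def binary_entropy(x):
    """h(x) = -x log2 x - (1 - x) log2 (1 - x), with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0.0)


def qcmi(d, lam):
    """Eq. (36): I(A:C|B) = 2 log d - h(lam) - (1 - lam) log(d^2 - 1)."""
    return 2 * math.log2(d) - binary_entropy(lam) - (1 - lam) * math.log2(d * d - 1)


def distance_to_markov(d, lam):
    """Eq. (37): trace distance between Psi_lam and the Markov state Psi_{1/d^2}."""
    return 2 * math.sqrt(max(0.0, (d * d * lam - 1) / (d * d - 1)))


def markovianizing_cost(d, lam, tol=1e-12):
    """Eq. (38): 2 log d for lam > 1/d^2, and 0 at lam = 1/d^2 (up to tol)."""
    return 0.0 if abs(lam - 1 / d**2) <= tol else 2 * math.log2(d)


# Along lam = 2/d^2 the QCMI and the distance shrink while the cost keeps growing.
for d in (2, 8, 64, 1024):
    lam = 2 / d**2
    print(d, qcmi(d, lam), distance_to_markov(d, lam), markovianizing_cost(d, lam))
```

The tolerance `tol` is only a floating-point convenience for testing $\lambda = 1/d^2$; the cost itself jumps between exactly two values.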
Indeed, since $\Psi_{1/d^2}$ is a Markov state, there exist quantum operations $\mathcal{R} : B \to BC$ and $\mathcal{R}' : B \to AB$ such that
$$\Psi_{1/d^2}^{ABC} = \mathcal{R}(\Psi_{1/d^2}^{AB}) = \mathcal{R}'(\Psi_{1/d^2}^{BC}).$$
Due to the triangle inequality and the monotonicity of the trace distance (see Appendix A-A), we have
$$\left\|\Psi_\lambda^{ABC} - \mathcal{R}(\Psi_\lambda^{AB})\right\|_1 \le \left\|\Psi_\lambda^{ABC} - \Psi_{1/d^2}^{ABC}\right\|_1 + \left\|\Psi_{1/d^2}^{ABC} - \mathcal{R}(\Psi_{1/d^2}^{AB})\right\|_1 + \left\|\mathcal{R}(\Psi_{1/d^2}^{AB}) - \mathcal{R}(\Psi_\lambda^{AB})\right\|_1 \le 2\left\|\Psi_\lambda^{ABC} - \Psi_{1/d^2}^{ABC}\right\|_1,$$
where the second term on the right-hand side vanishes and the third is bounded by the first due to the monotonicity. In the same way, we have
$$\left\|\Psi_\lambda^{ABC} - \mathcal{R}'(\Psi_\lambda^{BC})\right\|_1 \le 2\left\|\Psi_\lambda^{ABC} - \Psi_{1/d^2}^{ABC}\right\|_1.$$
Thus Equality (37) implies
$$\left\|\Psi_\lambda^{ABC} - \mathcal{R}(\Psi_\lambda^{AB})\right\|_1,\ \left\|\Psi_\lambda^{ABC} - \mathcal{R}'(\Psi_\lambda^{BC})\right\|_1 \le 4\sqrt{\frac{d^2\lambda-1}{d^2-1}}.$$

B. Asymmetry

We consider tripartite pure states that are expressed as
$$|\Psi_\lambda\rangle = \sqrt{\lambda}\,|0\rangle^B|\Phi_d\rangle^{AC} + \sqrt{1-\lambda}\,|0\rangle^A|\Phi_d\rangle^{BC},$$
where $d = \dim\mathcal{H}^C$, $\dim\mathcal{H}^A = \dim\mathcal{H}^B = d+1$, $0 \le \lambda \le 1$ and $\Phi_d$ is the maximally entangled state defined by (35). The reduced state on $AC$ is given by
$$\Psi_\lambda^{AC} = \lambda|\Phi_d\rangle\langle\Phi_d|^{AC} + (1-\lambda)|0\rangle\langle 0|^A\otimes\pi^C. \tag{39}$$
Note that $|\Phi_d\rangle^{AC}$ has no component on $|0\rangle^A$. Hence the CPTP maps on $A$ defined as (11) and (12) are given by
$$\mathcal{E}(\tau) = \mathrm{Tr}[P_1\tau]\cdot\pi_1 + \mathrm{Tr}[P_0\tau]\cdot|0\rangle\langle 0| + \frac{1}{d}P_0\tau P_1 + \frac{1}{d}P_1\tau P_0$$
and
$$\mathcal{E}_\infty(\tau) = \mathrm{Tr}[P_1\tau]\cdot\pi_1 + \mathrm{Tr}[P_0\tau]\cdot|0\rangle\langle 0|,$$
respectively, where $P_0 = |0\rangle\langle 0|$, $P_1 = I - P_0$ and $\pi_1 = P_1/d$. It is straightforward to verify that $\mathcal{E}$ is self-adjoint. By applying $\mathcal{E}_\infty$ to $|\Psi_\lambda\rangle$ on $A$, we obtain
$$\Psi_\infty^{ABC} = \lambda\,\pi_1^A\otimes|0\rangle\langle 0|^B\otimes\pi^C + (1-\lambda)\,|0\rangle\langle 0|^A\otimes|\Phi_d\rangle\langle\Phi_d|^{BC}.$$
Therefore, due to Theorem 10, the Markovianizing cost is given by
$$M_{A|B}(\Psi_\lambda) = h(\lambda) + 2\lambda\log d.$$
On the other hand, from (39), the Markovianizing cost $M_{C|B}$ of $\Psi_\lambda$ does not depend on $\lambda$ when $\lambda > 0$, as proved in Section V-B. Thus we have
$$M_{C|B}(\Psi_\lambda) = \begin{cases} 2\log d & (\lambda > 0) \\ 0 & (\lambda = 0) \end{cases}$$
in the same way as (38). Hence the Markovianizing cost is not symmetric in $A$ and $C$, as opposed to QCMI, which satisfies $I(A:C|B) = I(C:A|B)$.

C. States for which the Markovianizing cost coincides with QCMI

We consider states that are expressed as
$$|\Psi_{\{\lambda_k\}}\rangle := \sum_{k=1}^{d}\sqrt{\lambda_k}\,|k\rangle^A|k\rangle^B|k\rangle^C,$$
where $\lambda_k \ge 0$ and $\sum_{k=1}^{d}\lambda_k = 1$. These states satisfy the conditions in Theorem 13, and thus the Markovianizing cost is given by
$$M_{A|B}(\Psi_{\{\lambda_k\}}) = I(A:C|B)_{\Psi_{\{\lambda_k\}}} = H(\{\lambda_k\}_k).$$

VII. CONCLUSIONS AND DISCUSSIONS

We have introduced the task of Markovianization, and derived a single-letter formula for the minimum cost of randomness required for Markovianizing tripartite pure states. We have also proposed an algorithm to compute the Markovianizing cost of a class of pure states without obtaining an explicit form of the Koashi-Imoto decomposition. We have then computed the Markovianizing cost for certain pure states, and revealed its discontinuity and asymmetry. Our results have an application in analyzing optimal costs of resources for simulating a bipartite unitary gate by local operations and classical communication [19]. Open questions include the generalization to mixed states, the formulation of a classical analog of Markovianization, and an alternative formulation of Markovianization for which QCMI is obtained as the cost function.

In [18], we have introduced and analyzed an alternative formulation of Markovianization and the Markovianizing cost. Instead of requiring Condition (9), we require that the state after a random unitary operation is "approximately recoverable", i.e., it satisfies Equalities (7) approximately. For pure states, we have proved that the Markovianizing cost in that case is equal to the one obtained in this paper.

ACKNOWLEDGMENT

The authors thank Tomohiro Ogawa and Masato Koashi for useful discussions.

REFERENCES

[1] P. Hayden, R. Jozsa, D. Petz, and A. Winter, "Structure of states which satisfy strong subadditivity of quantum entropy with equality," Comm. Math. Phys., vol. 246, pp. 359–374, 2004.
[2] I. Devetak and J. Yard, "Exact cost of redistributing multipartite quantum states," Phys. Rev. Lett., vol. 100, p. 230501, 2008.
[3] J. T. Yard and I. Devetak, "Optimal quantum source coding with quantum side information at the encoder and decoder," IEEE Trans. Inf. Theory, vol. 55, pp. 5339–5351, 2009.
[4] F. Buscemi, "Complete positivity, Markovianity, and the quantum data-processing inequality, in the presence of initial system-environment correlations," Phys. Rev. Lett., vol. 113, p. 140502, 2014.
[5] D. Poulin and M. B. Hastings, "Markov entropy decomposition: A variational dual for quantum belief propagation," Phys. Rev. Lett., vol. 106, p. 080403, 2011.
[6] B. Ibinson, N. Linden, and A. Winter, "Robustness of quantum Markov chains," Comm. Math. Phys., vol. 277, pp. 289–304, 2008.
[7] F. G. S. L. Brandão, M. Christandl, and J. Yard, "Faithful squashed entanglement," Comm. Math. Phys., vol. 306, pp. 805–830, 2011.
[8] K. Li and A. Winter, "Relative entropy and squashed entanglement," Comm. Math. Phys., vol. 326, pp. 63–80, 2014.
[9] K. Li and A. Winter, "Squashed entanglement, k-extendibility, quantum Markov chains, and recovery maps," e-print arXiv:1410.4184, 2014.
[10] O. Fawzi and R. Renner, "Quantum conditional mutual information and approximate Markov chains," Comm. Math. Phys., vol. 340, pp. 575–611, 2015.
[11] B. Groisman, S. Popescu, and A. Winter, "Quantum, classical, and total amount of correlations in a quantum state," Phys. Rev. A, vol. 72, p. 032317, 2005.
[12] M. Horodecki, J. Oppenheim, and A. Winter, "Partial quantum information," Nature, vol. 436, pp. 673–676, 2005.
[13] M. Horodecki, J. Oppenheim, and A. Winter, "Quantum state merging and negative information," Comm. Math. Phys., vol. 269, pp. 107–136, 2007.
[14] A. Abeyesinghe, I. Devetak, P. Hayden, and A. Winter, "The mother of all protocols: Restructuring quantum information's family tree," Proc. R. Soc. A, vol. 465, p. 2537, 2009.
[15] F. Dupuis, M. Berta, J. Wullschleger, and R. Renner, "One-shot decoupling," Comm. Math. Phys., vol. 328, pp. 251–284, 2014.
[16] M. Berta, M. Christandl, and R. Renner, "The quantum reverse Shannon theorem based on one-shot information theory," Comm. Math. Phys., vol. 306, pp. 579–615, 2011.
[17] P. Hayden, M. Horodecki, A. Winter, and J. Yard, "A decoupling approach to the quantum capacity," Open Sys. Inf. Dyn., vol. 15, pp. 7–19, 2008.
[18] E. Wakakuwa, A. Soeda, and M. Murao, "The cost of randomness for converting a tripartite quantum state to be approximately recoverable," e-print arXiv:1512.06920v2, 2015.
[19] E. Wakakuwa, A. Soeda, and M. Murao, "A coding theorem for distributed quantum computation," e-print arXiv:1505.04352v3, 2015.
[20] E. Wakakuwa, A. Soeda, and M. Murao, "A four-round LOCC protocol outperforms all two-round protocols in reducing the entanglement cost for a distributed quantum information processing," in preparation.
[21] M. Koashi and N. Imoto, "Operations that do not disturb partially known quantum states," Phys. Rev. A, vol. 66, p. 022318, 2002.
[22] R. Blume-Kohout, H. K. Ng, D. Poulin, and L. Viola, "Characterizing the structure of preserved information in quantum processes," Phys. Rev. Lett., vol. 100, p. 030501, 2008.
[23] R. Blume-Kohout, H. K. Ng, D. Poulin, and L. Viola, "Information-preserving structures: A general framework for quantum zero-error information," Phys. Rev. A, vol. 82, p. 062306, 2010.
[24] A. Uhlmann, "The 'transition probability' in the state space of a *-algebra," Rep. Math. Phys., vol. 9, p. 273, 1976.
[25] M. Koashi and N. Imoto, "Compressibility of quantum mixed-state signals," Phys. Rev. Lett., vol. 87, p. 017902, 2001.
[26] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 2000.
[27] M. Hayashi, Quantum Information: An Introduction. Springer, 2006.
[28] M. Wilde, Quantum Information Theory. Cambridge University Press, 2013.
[29] I. Devetak, A. W. Harrow, and A. J. Winter, "A resource framework for quantum Shannon theory," IEEE Trans. Inf. Theory, vol. 54, pp. 4587–4618, 2008.
[30] E. H. Lieb and M. B. Ruskai, "Proof of the strong subadditivity of quantum-mechanical entropy," J. Math. Phys., vol. 14, pp. 1938–1941, 1973.
[31] A. S. Holevo, "Bounds for the quantity of information transmitted by a quantum channel," Prob. Inf. Trans., vol. 9, pp. 177–183, 1973.
[32] M. Fannes, "A continuity property of the relative entropy density for spin lattice systems," Comm. Math. Phys., vol. 31, pp. 291–294, 1973.
[33] T. M. Cover and J. A. Thomas, Elements of Information Theory (2nd ed.). Wiley-Interscience, 2005.
[34] B. Schumacher, "Quantum coding," Phys. Rev. A, vol. 51, pp. 2738–2747, 1995.
[35] R. Ahlswede, "A method of coding and an application to arbitrarily varying channels," J. Comb., Info. and Syst. Sciences, vol. 5, pp. 10–35, 1980.

APPENDIX A
MATHEMATICAL PRELIMINARIES

In this appendix, we summarize facts and technical tools that are frequently used in quantum Shannon theory and in the following appendices. Readers who are familiar with the material may skip this section. For references, see e.g. [26]–[28].

A. Trace Distance and Uhlmann’s Theorem

The trace distance between two quantum states $\rho, \sigma \in \mathcal{S}(\mathcal{H})$ is defined by
$$d(\rho,\sigma) := \frac{1}{2}\|\rho-\sigma\|_1 = \frac{1}{2}\mathrm{Tr}\left[\sqrt{(\rho-\sigma)^2}\right].$$
In the following, we omit the coefficient $1/2$ for simplicity. For pure states $|\psi\rangle, |\phi\rangle \in \mathcal{H}$, the trace distance takes the simple form
$$\big\||\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|\big\|_1 = 2\sqrt{1-|\langle\psi|\phi\rangle|^2}.$$
For $\rho, \sigma, \tau \in \mathcal{S}(\mathcal{H})$, we have
$$\|\rho-\tau\|_1 \le \|\rho-\sigma\|_1 + \|\sigma-\tau\|_1,$$
which is called the triangle inequality. The trace distance is monotonically nonincreasing under quantum operations, i.e., it satisfies
$$\|\rho-\sigma\|_1 \ge \|\mathcal{E}(\rho)-\mathcal{E}(\sigma)\|_1$$
for any linear CPTP map $\mathcal{E} : \mathcal{S}(\mathcal{H}) \to \mathcal{S}(\mathcal{H}')$. As a particular case, the trace distance between two states on a composite system is nonincreasing under taking the partial trace, that is, for $\rho, \sigma \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^B)$ we have
$$\|\rho^{AB}-\sigma^{AB}\|_1 \ge \|\rho^A-\sigma^A\|_1.$$

Consider two states $\rho, \sigma \in \mathcal{S}(\mathcal{H}^A)$ satisfying $\|\rho-\sigma\|_1 \le \epsilon$, and let $|\psi_\rho\rangle^{AB}$ and $|\phi_\sigma\rangle^{AB'}$ be purifications of the two states, respectively. If $d_B \le d_{B'}$, there exists an embedding of $\mathcal{H}^B$ into $\mathcal{H}^{B'}$, represented by an isometry from $\mathcal{H}^B$ to $\mathcal{H}^{B'}$, such that
$$\left\| |\psi_\rho\rangle\langle\psi_\rho|^{AB'} - |\phi_\sigma\rangle\langle\phi_\sigma|^{AB'} \right\|_1 \le 2\sqrt{\epsilon}.$$
This relation is referred to as Uhlmann's theorem ([24], see also Lemma 2.2 in [29]). In the case of $\epsilon = 0$, the above statement implies that all purifications are equivalent up to a local isometry.

The gentle measurement lemma (Lemma 9.4.1 in [28]) states that for any $\rho \in \mathcal{S}(\mathcal{H})$, $X \in \mathcal{L}(\mathcal{H})$ and $\epsilon \ge 0$ such that $0 \le X \le I$ and $\mathrm{Tr}[\rho X] \ge 1-\epsilon$, we have
$$\left\| \rho - \frac{\sqrt{X}\rho\sqrt{X}}{\mathrm{Tr}[\rho X]} \right\|_1 \le 2\sqrt{\epsilon}. \tag{40}$$
As a corollary, when two bipartite states $\rho \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^B)$ and $\sigma \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^{B'})$ satisfy $\|\rho^A-\sigma^A\|_1 \le \epsilon$, and $\Pi_\sigma$ is the projection onto $\mathrm{supp}[\sigma^A] \subseteq \mathcal{H}^A$, we have
$$\left\| \rho^{AB} - \frac{\Pi_\sigma^A\rho^{AB}\Pi_\sigma^A}{\mathrm{Tr}[\rho^A\Pi_\sigma^A]} \right\|_1 \le 2\sqrt{\epsilon}. \tag{41}$$
This is because we have
$$\epsilon \ge \left\|\Pi_\sigma\rho^A\Pi_\sigma + \Pi_\sigma^\perp\rho^A\Pi_\sigma^\perp - \sigma^A\right\|_1 = \left\|\sigma^A - \Pi_\sigma\rho^A\Pi_\sigma\right\|_1 + \mathrm{Tr}[\rho^A\Pi_\sigma^\perp],$$
where $\Pi_\sigma^\perp$ denotes the projection onto the orthogonal complement of $\mathrm{supp}[\sigma^A] \subseteq \mathcal{H}^A$, and thus have
$$\mathrm{Tr}[\rho^{AB}\Pi_\sigma^A] = \mathrm{Tr}[\rho^A\Pi_\sigma^A] = 1 - \mathrm{Tr}[\rho^A\Pi_\sigma^\perp] \ge 1-\epsilon.$$

B. Quantum Entropies and Mutual Informations

The Shannon entropy of a probability distribution $\{p_x\}_{x\in\mathcal{X}}$ is defined as
$$H(\{p_x\}_{x\in\mathcal{X}}) := -\sum_{x\in\mathcal{X}} p_x\log p_x.$$
The von Neumann entropy of a quantum state $\rho^A \in \mathcal{S}(\mathcal{H}^A)$ is defined as
$$S(\rho^A) = S(A)_\rho := -\mathrm{Tr}[\rho^A\log\rho^A].$$
If $\rho^A$ is a probabilistic mixture of pure states as $\rho^A = \sum_j p_j|\phi_j\rangle\langle\phi_j|$, we have $S(\rho^A) \le H(\{p_j\}_j)$ with equality if and only if $\{|\phi_j\rangle\}_j$ is mutually orthogonal. The von Neumann entropy is monotonically nondecreasing under random unitary operations, that is, we have $S(A)_\rho \le S(A)_{\mathcal{V}(\rho)}$ for any random unitary operation $\mathcal{V}$ on $A$. For a bipartite pure state $|\psi\rangle^{AA'}$, we have
$$S(\psi^A) = S(\psi^{A'}). \tag{42}$$

For a bipartite state $\rho \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^B)$, the quantum conditional entropy and the quantum mutual information (QMI) are defined as
$$S(A|B)_\rho = S(AB)_\rho - S(B)_\rho, \qquad I(A:B)_\rho = S(A)_\rho + S(B)_\rho - S(AB)_\rho,$$
respectively. The von Neumann entropy satisfies the subadditivity, expressed as
$$S(A)_\rho + S(B)_\rho \ge S(AB)_\rho, \tag{43}$$
which guarantees the nonnegativity of QMI. The equality holds if and only if $\rho^{AB} = \rho^A\otimes\rho^B$. Applying (43) to $|\psi_\sigma\rangle^{ABC}$, which is a purification of $\sigma^{AC}$, and using (42), we obtain
$$S(C)_\sigma - S(A)_\sigma \le S(AC)_\sigma. \tag{44}$$
Hence QMI is bounded above as
$$I(A:C)_\rho \le 2\min\{S(A)_\rho, S(C)_\rho\}. \tag{45}$$
For any $\rho \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^B)$ and quantum operation $\mathcal{E}$ on $B$, we have
$$S(A|B)_\rho \le S(A|B)_{\mathcal{E}(\rho)}, \qquad I(A:B)_\rho \ge I(A:B)_{\mathcal{E}(\rho)}. \tag{46}$$
Inequalities (46) are called the data processing inequality.

For a tripartite state $\rho \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^B\otimes\mathcal{H}^C)$, the quantum conditional mutual information (QCMI) is defined as
$$I(A:C|B)_\rho = S(AB)_\rho + S(BC)_\rho - S(B)_\rho - S(ABC)_\rho.$$
QCMI is nonnegative because of the strong subadditivity of the von Neumann entropy [30], which is also equivalent to the data processing inequality.
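For states that are diagonal in a product basis, the von Neumann entropies above reduce to Shannon entropies of the joint distribution, so the definitions of QMI and QCMI can be checked with plain probability tables. The following is a minimal sketch restricted to this classical special case, assuming base-2 logarithms; the helper names (`shannon`, `marginal`, `cond_mutual_information`) and the two example distributions are illustrative and do not appear in the paper.

```python
import math
from collections import defaultdict


def shannon(dist):
    """Shannon entropy (base 2) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, keep):
    """Marginal of a joint distribution over tuples, keeping the given positions."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def cond_mutual_information(joint):
    """I(A:C|B) = S(AB) + S(BC) - S(B) - S(ABC) for outcomes ordered as (a, b, c)."""
    return (shannon(marginal(joint, (0, 1))) + shannon(marginal(joint, (1, 2)))
            - shannon(marginal(joint, (1,))) - shannon(joint))


# A Markov chain A - B - C: B is a uniform bit, A copies B, C is B flipped w.p. 1/4.
markov = {(b, b, c): 0.5 * (0.75 if c == b else 0.25)
          for b in (0, 1) for c in (0, 1)}

# A non-Markov example: A and C are independent uniform bits and B = A xor C.
xor = {(a, a ^ c, c): 0.25 for a in (0, 1) for c in (0, 1)}

print(cond_mutual_information(markov))  # vanishes up to rounding
print(cond_mutual_information(xor))     # one bit
```

In the second example, $A$ and $C$ are uncorrelated on their own but become perfectly correlated once $B$ is known, which is why the conditional quantity is one bit.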
QMI and QCMI are related by the simple relation
$$I(A:BC)_\rho = I(A:B)_\rho + I(A:C|B)_\rho,$$
which is called the chain rule.

For a class of states called the classical-quantum states, the quantum conditional entropy and QCMI take simple forms. That is, for states $\rho \in \mathcal{S}(\mathcal{H}^X\otimes\mathcal{H}^A)$ and $\sigma \in \mathcal{S}(\mathcal{H}^X\otimes\mathcal{H}^A\otimes\mathcal{H}^B)$, given as
$$\rho^{XA} = \sum_i p_i|i\rangle\langle i|^X\otimes\rho_i^A, \qquad \sigma^{XAB} = \sum_i q_i|i\rangle\langle i|^X\otimes\sigma_i^{AB},$$
where $\{|i\rangle\}_i$ is an orthonormal basis of $\mathcal{H}^X$, we have
$$S(A|X)_\rho = \sum_i p_i S(\rho_i^A), \qquad I(A:B|X)_\sigma = \sum_i q_i I(A:B)_{\sigma_i}.$$
QMI of a classical-quantum state takes the form
$$I(X:A)_\rho = S(\bar{\rho}^A) - \sum_i p_i S(\rho_i^A),$$
where $\bar{\rho}^A = \sum_i p_i\rho_i^A$. This quantity is equal to the Holevo information [31], and satisfies
$$I(X:A)_\rho \le S(X)_\rho = H(\{p_i\}_i)$$
with equality if and only if $\{\mathrm{supp}[\rho_i^A]\}_i$ is mutually orthogonal.

C. Continuity of Quantum Entropies

Define
$$\eta_0(x) := \begin{cases} -x\log x & (x \le 1/e) \\ 1/e & (x \ge 1/e), \end{cases}$$
$\eta(x) = x + \eta_0(x)$ and $h(x) := \eta_0(x) + \eta_0(1-x)$, where $e$ is the base of the natural logarithm. For two states $\rho$ and $\sigma$ in a $d$-dimensional quantum system $(d < \infty)$ such that $\|\rho-\sigma\|_1 \le \epsilon$, we have
$$|S(\rho)-S(\sigma)| \le \epsilon\log d + \eta_0(\epsilon) \le \eta(\epsilon)\log d, \tag{47}$$
which is called the Fannes inequality [32]. By applying (47) to each entropy term, it follows that for two bipartite states $\rho, \sigma \in \mathcal{S}(\mathcal{H}^A\otimes\mathcal{H}^B)$ such that $\|\rho-\sigma\|_1 \le \epsilon$, we have
$$|S(A|B)_\rho - S(A|B)_\sigma| \le 2\eta(\epsilon)\log(d_Ad_B) \tag{48}$$
and
$$|I(A:B)_\rho - I(A:B)_\sigma| \le 2\eta(\epsilon)\log(d_Ad_B). \tag{49}$$

D. Typical Sequences and Subspaces ([33], [34]; see also the Appendices in [14] for further details)

Let $X$ be a discrete random variable with finite alphabet $\mathcal{X}$ and probability distribution $p_x = \Pr\{X = x\}$, where $x \in \mathcal{X}$. A sequence $\mathbf{x} = (x_1, \cdots, x_n) \in \mathcal{X}^n$ is said to be $\delta$-weakly typical with respect to $\{p_x\}_{x\in\mathcal{X}}$ if it satisfies
$$2^{-n(H(X)+\delta)} \le \prod_{i=1}^{n} p_{x_i} \le 2^{-n(H(X)-\delta)}.$$
The set of all $\delta$-weakly typical sequences is called the $\delta$-weakly typical set, and is denoted by $\mathcal{T}_{n,\delta}$ in the following. Denoting $\prod_{i=1}^{n} p_{x_i}$ by $p_{\mathbf{x}}$, we have
$$1 = \sum_{\mathbf{x}\in\mathcal{X}^n} p_{\mathbf{x}} \ge \sum_{\mathbf{x}\in\mathcal{T}_{n,\delta}} p_{\mathbf{x}} \ge |\mathcal{T}_{n,\delta}|\cdot 2^{-n(H(X)+\delta)},$$
which implies that
$$|\mathcal{T}_{n,\delta}| \le 2^{n(H(X)+\delta)}. \tag{50}$$
A sequence $\mathbf{x} = (x_1, \cdots, x_n) \in \mathcal{X}^n$ is called $\delta$-strongly typical with respect to $\{p_x\}_{x\in\mathcal{X}}$ if it satisfies
$$\left|\frac{1}{n}N_{x|\mathbf{x}} - p_x\right| < \frac{\delta}{|\mathcal{X}|}$$
for all $x \in \mathcal{X}$, and $N_{x|\mathbf{x}} = 0$ if $p_x = 0$. Here, $N_{x|\mathbf{x}}$ is the number of occurrences of the symbol $x$ in the sequence $\mathbf{x}$. The set of all $\delta$-strongly typical sequences is called the $\delta$-strongly typical set, and is denoted by $\mathcal{T}^*_{n,\delta}$ in the following. From the weak law of large numbers, we have that for any $\epsilon, \delta > 0$ and sufficiently large $n$,
$$\Pr\{(X_1, \cdots, X_n) \in \mathcal{T}_{n,\delta}\} \ge 1-\epsilon, \tag{51}$$
$$\Pr\{(X_1, \cdots, X_n) \in \mathcal{T}^*_{n,\delta}\} \ge 1-\epsilon. \tag{52}$$

Suppose the spectral decomposition of $\rho \in \mathcal{S}(\mathcal{H})$ is given by $\rho = \sum_x p_x|x\rangle\langle x|$. The $\delta$-weakly typical subspace $\mathcal{H}_{n,\delta} \subset \mathcal{H}^{\otimes n}$ with respect to $\rho$ is defined as
$$\mathcal{H}_{n,\delta} := \mathrm{span}\{|x_1\rangle\cdots|x_n\rangle \in \mathcal{H}^{\otimes n} \mid (x_1, \cdots, x_n) \in \mathcal{T}_{n,\delta}\},$$
where $\mathcal{T}_{n,\delta}$ is the $\delta$-weakly typical set with respect to $p_x$. Similarly, the $\delta$-strongly typical subspace $\mathcal{H}^*_{n,\delta} \subset \mathcal{H}^{\otimes n}$ with respect to $\rho$ is defined as
$$\mathcal{H}^*_{n,\delta} := \mathrm{span}\{|x_1\rangle\cdots|x_n\rangle \in \mathcal{H}^{\otimes n} \mid (x_1, \cdots, x_n) \in \mathcal{T}^*_{n,\delta}\}.$$

Suppose the Schmidt decomposition of $|\psi\rangle^{AB} \in \mathcal{H}^A\otimes\mathcal{H}^B$ is given by $|\psi\rangle^{AB} = \sum_x\sqrt{p_x}\,|x\rangle^A|x\rangle^B$. For any $\delta > 0$ and $n$, let $\mathcal{H}_{n,\delta}$ and $\mathcal{H}^*_{n,\delta}$ be the $\delta$-weakly and $\delta$-strongly typical subspaces of $(\mathcal{H}^A)^{\otimes n}$ with respect to $\psi^A = \mathrm{Tr}_B[|\psi\rangle\langle\psi|^{AB}]$, and let $\Pi_{n,\delta}$ and $\Pi^*_{n,\delta}$ be the projections onto those subspaces, respectively. From (50), we have
$$\mathrm{rank}\,\Pi_{n,\delta} = \dim\mathcal{H}_{n,\delta} \le 2^{n(H(A)_\psi+\delta)}.$$
From (51) and (52), we have
$$\mathrm{Tr}[\Pi^{\bar{A}}_{n,\delta}(|\psi\rangle\langle\psi|^{AB})^{\otimes n}] = \sum_{\mathbf{x}\in\mathcal{T}_{n,\delta}} p_{\mathbf{x}} \ge 1-\epsilon, \tag{53}$$
$$\mathrm{Tr}[\Pi^{*\bar{A}}_{n,\delta}(|\psi\rangle\langle\psi|^{AB})^{\otimes n}] = \sum_{\mathbf{x}\in\mathcal{T}^*_{n,\delta}} p_{\mathbf{x}} \ge 1-\epsilon \tag{54}$$
for any $\epsilon, \delta > 0$ and sufficiently large $n$.

APPENDIX B
PROOF OF THEOREM 9

We denote $a_0a_La_R$ by $A$ and $b_0b_Lb_R$ by $B$, when there is no fear of confusion.

A. Proof of Achievability (Inequality (16))
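As a numerical aside, the cardinality bound (50) and the probability mass appearing in (51) and (53), on which the rank estimates in this proof rely, can be checked by brute force for a small alphabet. The following is a minimal sketch, assuming base-2 logarithms; the distribution `p`, the block length `n` and the tolerance `delta` are arbitrary illustrative choices, and the typicality test is carried out in the log domain to avoid underflow.

```python
import itertools
import math


def entropy(p):
    """Shannon entropy (base 2) of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def weakly_typical_set(p, n, delta):
    """All length-n sequences x with |-(1/n) log2 p_x - H| <= delta."""
    h = entropy(p)
    typical = []
    for seq in itertools.product(range(len(p)), repeat=n):
        surprise = -sum(math.log2(p[x]) for x in seq) / n
        if h - delta <= surprise <= h + delta:
            typical.append(seq)
    return typical


p, n, delta = (0.7, 0.3), 12, 0.2
typical = weakly_typical_set(p, n, delta)
mass = sum(math.prod(p[x] for x in seq) for seq in typical)

print(len(typical), 2 ** (n * (entropy(p) + delta)))  # cardinality vs. bound (50)
print(mass)  # probability of the typical set; tends to 1 as n grows
```

For such a short block length the typical set carries only part of the probability mass; (51) guarantees that the mass exceeds $1-\epsilon$ only for sufficiently large $n$.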

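Fix arbitrary $n$ and $\delta \in (0,1)$. Let $J_{n,\delta} \subset J^n$ be the $\delta$-strongly typical set with respect to $\{p_j\}_{j\in J}$. For each $j \in J$ and $\mathbf{j} = j_1\cdots j_n \in J_{n,\delta}$, define $L_{j,\mathbf{j}} := \{l \mid j_l = j,\ 1 \le l \le n\}$. The number of elements in the set is bounded as
$$n\left(p_j - \frac{\delta}{|J|}\right) \le |L_{j,\mathbf{j}}| \le n\left(p_j + \frac{\delta}{|J|}\right).$$
For each $\mathbf{j} \in J_{n,\delta}$, we sort $(\mathcal{H}^{a_R})^{\otimes n} = \mathcal{H}^{a_R}_1\otimes\cdots\otimes\mathcal{H}^{a_R}_n$ as
$$(\mathcal{H}^{a_R})^{\otimes n} = \bigotimes_{j\in J}\left(\bigotimes_{l\in L_{j,\mathbf{j}}}\mathcal{H}^{a_R}_l\right).$$
For each $j$ and $\mathbf{j}$, let $\mathcal{H}_{j,\mathbf{j},\delta}$ be the $\delta$-weakly typical subspace of $\bigotimes_{l\in L_{j,\mathbf{j}}}\mathcal{H}^{a_R}_l$ with respect to $\varphi_j^{a_R}$, let $\Pi_{j,\mathbf{j},\delta}$ be the projection onto $\mathcal{H}_{j,\mathbf{j},\delta}$, and let $\Pi^{\bar{a}_R}_{\mathbf{j},\delta} := \bigotimes_{j\in J}\Pi_{j,\mathbf{j},\delta}$. Define
$$\Pi^{\bar{A}}_\delta := \sum_{\mathbf{j}\in J_{n,\delta}}|\mathbf{j}\rangle\langle\mathbf{j}|^{\bar{a}_0}\otimes I^{\bar{a}_L}_{\mathbf{j}}\otimes\Pi^{\bar{a}_R}_{\mathbf{j},\delta} \tag{55}$$
and
$$|\Psi'_{n,\delta}\rangle^{\bar{A}\bar{B}\bar{C}} := \Pi^{\bar{A}}_\delta|\Psi^{\otimes n}_{KI}\rangle^{\bar{A}\bar{B}\bar{C}} = \sum_{\mathbf{j}\in J_{n,\delta}}\sqrt{p_{\mathbf{j}}}\,|\mathbf{j}\rangle^{\bar{a}_0}|\mathbf{j}\rangle^{\bar{b}_0}|\omega_{\mathbf{j}}\rangle^{\bar{a}_L\bar{b}_L}\,\Pi^{\bar{a}_R}_{\mathbf{j},\delta}|\varphi_{\mathbf{j}}\rangle^{\bar{a}_R\bar{b}_R\bar{C}}, \tag{56}$$
where we introduced the notations $p_{\mathbf{j}} = p_{j_1}\times\cdots\times p_{j_n}$, $\varphi_{\mathbf{j}} = \varphi_{j_1}\otimes\cdots\otimes\varphi_{j_n}$ and $\omega_{\mathbf{j}} = \omega_{j_1}\otimes\cdots\otimes\omega_{j_n}$.

Let $v_{\mathbf{j}}$ be any unitary acting on $\bigotimes_{j\in J}\mathcal{H}_{j,\mathbf{j},\delta}$, and define a unitary on $\bigotimes_{\mathbf{j}\in J_{n,\delta}}\big(\bigotimes_{j\in J}\mathcal{H}_{j,\mathbf{j},\delta}\big)$ by
$$V^{\bar{A}} := \sum_{\mathbf{j}\in J_{n,\delta}}|\mathbf{j}\rangle\langle\mathbf{j}|^{\bar{a}_0}\otimes I^{\bar{a}_L}_{\mathbf{j}}\otimes v^{\bar{a}_R}_{\mathbf{j}}, \tag{57}$$
as in (18). We have
$$|\Psi'_{n,\delta}(V)\rangle^{\bar{A}\bar{B}\bar{C}} := V^{\bar{A}}|\Psi'_{n,\delta}\rangle^{\bar{A}\bar{B}\bar{C}} = \sum_{\mathbf{j}\in J_{n,\delta}}\sqrt{p_{\mathbf{j}}}\,|\mathbf{j}\rangle^{\bar{a}_0}|\mathbf{j}\rangle^{\bar{b}_0}|\omega_{\mathbf{j}}\rangle^{\bar{a}_L\bar{b}_L}\,v^{\bar{a}_R}_{\mathbf{j}}|\varphi'_{\mathbf{j}}\rangle^{\bar{a}_R\bar{b}_R\bar{C}}, \tag{58}$$
where $|\varphi'_{\mathbf{j}}\rangle := \Pi^{\bar{a}_R}_{\mathbf{j},\delta}|\varphi_{\mathbf{j}}\rangle$.

Let $\{p(dV), V\}$ be the ensemble of unitaries generated by choosing $v_{\mathbf{j}}$ randomly and independently according to the Haar measure for each $\mathbf{j}$ in (57).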
Due to Schur's lemma, as an ensemble average we have
$$\mathbb{E}\left[v^{\bar{a}_R}_{\mathbf{j}}|\varphi'_{\mathbf{j}}\rangle\langle\varphi'_{\mathbf{j}}|v^{\dagger\bar{a}_R}_{\mathbf{j}}\right] = \pi^{\bar{a}_R}_{\mathbf{j}}\otimes\varphi'^{\,\bar{b}_R\bar{C}}_{\mathbf{j}},$$
where $\pi^{\bar{a}_R}_{\mathbf{j}} = \Pi^{\bar{a}_R}_{\mathbf{j},\delta}/\mathrm{Tr}\,\Pi^{\bar{a}_R}_{\mathbf{j},\delta}$, and
$$\mathbb{E}\left[v^{\bar{a}_R}_{\mathbf{j}}|\varphi'_{\mathbf{j}}\rangle\langle\varphi'_{\mathbf{j}'}|v^{\dagger\bar{a}_R}_{\mathbf{j}'}\right] = 0$$
for $\mathbf{j} \ne \mathbf{j}'$. Thus the average state of (58) is given by
$$\bar{\Psi}_{n,\delta} := \mathbb{E}\left[|\Psi'_{n,\delta}(V)\rangle\langle\Psi'_{n,\delta}(V)|^{\bar{A}\bar{B}\bar{C}}\right] = \sum_{\mathbf{j}\in J_{n,\delta}} p_{\mathbf{j}}\,|\mathbf{j}\mathbf{j}\rangle\langle\mathbf{j}\mathbf{j}|^{\bar{a}_0\bar{b}_0}\otimes|\omega_{\mathbf{j}}\rangle\langle\omega_{\mathbf{j}}|^{\bar{a}_L\bar{b}_L}\otimes\pi^{\bar{a}_R}_{\mathbf{j}}\otimes\varphi'^{\,\bar{b}_R\bar{C}}_{\mathbf{j}}$$
$$= \sum_{\mathbf{j}\in J_{n,\delta}} p_{\mathbf{j}}\,|\mathbf{j}\rangle\langle\mathbf{j}|^{\bar{b}_0}\otimes\left(\pi^{\bar{a}_R}_{\mathbf{j}}\otimes|\mathbf{j},\omega_{\mathbf{j}}\rangle\langle\mathbf{j},\omega_{\mathbf{j}}|^{\bar{a}_0\bar{a}_L\bar{b}_L}\right)\otimes\varphi'^{\,\bar{b}_R\bar{C}}_{\mathbf{j}}, \tag{59}$$
which is a subnormalized Markov state conditioned by $\bar{B}$ corresponding to (19) (see Figure 3).

The minimum nonzero eigenvalue of $\bar{\Psi}_{n,\delta}$ is calculated as follows. First, due to the definition of $J_{n,\delta}$, we have
$$p_{\mathbf{j}} \ge \prod_{j\in J} p_j^{\,n(p_j+\delta/|J|)} = 2^{-n\left(H(\{p_j\}_j)+\delta H'(\{p_j\}_j)\right)},$$
where
$$H'(\{p_j\}_j) := -\frac{1}{|J|}\sum_j\log p_j < \infty.$$
Second, since the spectra of $\varphi'^{\,\bar{a}_R}_{\mathbf{j}}$ and $\varphi'^{\,\bar{b}_R\bar{C}}_{\mathbf{j}}$ are the same, the minimum nonzero eigenvalue $\mu_{\mathbf{j}}$ of $\varphi'^{\,\bar{b}_R\bar{C}}_{\mathbf{j}}$ is bounded from below as
$$\mu_{\mathbf{j}} \ge \prod_{j\in J} 2^{-|L_{j,\mathbf{j}}|\left(S(\varphi_j^{a_R})+\delta\right)} \ge \prod_{j\in J} 2^{-n(p_j+\delta/|J|)\left(S(\varphi_j^{a_R})+\delta\right)} \ge 2^{-n\left(\sum_j p_jS(\varphi_j^{a_R})+\delta\log(4d_A)\right)},$$
where the last inequality follows from
$$\sum_j\left(p_j+\frac{\delta}{|J|}\right)\left(S(\varphi_j^{a_R})+\delta\right) = \sum_j p_jS(\varphi_j^{a_R}) + \delta\left(1+\frac{1}{|J|}\sum_j S(\varphi_j^{a_R})+\delta\right) \le \sum_j p_jS(\varphi_j^{a_R}) + \delta(2+\log d_A) = \sum_j p_jS(\varphi_j^{a_R}) + \delta\log(4d_A).$$
Third, we have rank Π j, j ,δ ≤ | L j, j | ( S ( ϕ a R j )+ δ ) ≤ n ( p j + δ/ | J | )( S ( ϕ a R j )+ δ ) and rank Π ¯ a R j ,δ = (cid:89) j ∈ J rank Π j, j ,δ ≤ (cid:89) j ∈ J n ( p j + δ/ | J | )( S ( ϕ a R j )+ δ ) . Thus the nonzero eigenvalue ν j of π ¯ a R j is, in the same way as µ j , bounded from below as ν j ≥ (cid:89) j ∈ J − n ( p j + δ/ | J | )( S ( ϕ a R j )+ δ ) . All in all, the minimum nonzero eigenvalue λ of ¯Ψ n,δ isbounded as λ = p j µ j ν j ≥ − n [ H ( { p j } j )+2 (cid:80) j p j S ( ϕ a R j )+ δ ( H (cid:48) ( { p j } j )+2 log (4 d A ) )] . We also have rank ¯Ψ n,δ ≤ | J n,δ | × rank π ¯ a R j × rank ϕ (cid:48) ¯ a R j ≤ d nA . Suppose V , · · · , V N are unitaries that are randomly andindependently chosen from the ensemble { p ( dV ) , V } . Due tothe operator Chernoff bound (Lemma 3 in [11]), we have Pr (cid:40) N N (cid:88) i =1 Ψ (cid:48) n,δ ( V i ) / ∈ [(1 − (cid:15) ) ¯Ψ n,δ , (1 + (cid:15) ) ¯Ψ n,δ ] (cid:41) ≤ d nA exp (cid:18) − N λ(cid:15) (cid:19) for any (cid:15) ∈ (0 , , which implies that Pr (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) nR nR (cid:88) i =1 Ψ (cid:48) n,δ ( V i ) − ¯Ψ n,δ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ≤ (cid:15)  ≥ − d nA exp (cid:18) − nR λ(cid:15) (cid:19) (60)for an arbitrary R > . Therefore, if R satisﬁes R > H ( { p j } j ) + 2 (cid:88) j p j S ( ϕ a R j )+ δ ( H (cid:48) ( { p j } j ) + 2 log (4 d A )) , (61)and if n is sufﬁciently large so that the R.H.S. in (60) is greaterthan , there exists a set of unitaries { V i } nR i =1 such that (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) nR nR (cid:88) i =1 Ψ (cid:48) n,δ ( V i ) − ¯Ψ n,δ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ≤ (cid:15) . (62)Using unitaries in the set, construct a random unitary operation V n on ˆ a ˆ a L ˆ a R as V n ( · ) = 2 − nR (cid:80) nR k =1 V k ( · ) V † k .Let us evaluate the total error. 
First, from (53), (54), (55)and (56), we have D n,δ := Tr[Π ¯ Aδ | Ψ ⊗ nKI (cid:105)(cid:104) Ψ ⊗ nKI | ] = (cid:104) Ψ (cid:48) n,δ | Ψ (cid:48) n,δ (cid:105) ≥ − (cid:15) (63)for any (cid:15) > and sufﬁciently large n . Thus, by the gentlemeasurement lemma (40), we have (cid:13)(cid:13)(cid:13)(cid:13) | Ψ (cid:48) n,δ (cid:105)(cid:104) Ψ (cid:48) n,δ | D n,δ − | Ψ ⊗ nKI (cid:105)(cid:104) Ψ ⊗ nKI | (cid:13)(cid:13)(cid:13)(cid:13) ≤ √ (cid:15) , which leads to (cid:13)(cid:13)(cid:13)(cid:13) V n (cid:18) | Ψ (cid:48) n,δ (cid:105)(cid:104) Ψ (cid:48) n,δ | D n,δ (cid:19) − V n (cid:0) | Ψ ⊗ nKI (cid:105)(cid:104) Ψ ⊗ nKI | (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) ≤ √ (cid:15) . Second, from (62) and (63), we have (cid:13)(cid:13)(cid:13)(cid:13) V n (cid:18) | Ψ (cid:48) n,δ (cid:105)(cid:104) Ψ (cid:48) n,δ | D n,δ (cid:19) − ¯Ψ n,δ D n,δ (cid:13)(cid:13)(cid:13)(cid:13) ≤ (cid:15) D n,δ ≤ (cid:15) − (cid:15) . Therefore, by the triangle inequality, we obtain (cid:13)(cid:13)(cid:13)(cid:13) V n (cid:0) | Ψ ⊗ nKI (cid:105)(cid:104) Ψ ⊗ nKI | (cid:1) − ¯Ψ n,δ D n,δ (cid:13)(cid:13)(cid:13)(cid:13) ≤ √ (cid:15) + 2 (cid:15) − (cid:15) . (64)From (58) and (59), we have Tr[ ¯Ψ n,δ ] = D n,δ , which impliesthat ¯Ψ n,δ /D n,δ is a normalized Markov state conditioned by B n . Since the relation (64) holds for any (cid:15) , (cid:15) > , R >H ( { p j } j ) + 2 (cid:80) j p j S ( ϕ a R j ) , any δ ∈ (0 , that satisﬁes (61)and sufﬁciently large n , we obtain (16). (cid:4) B. Convergence Speed of the Error

We prove that, in the direct part of Theorem 9, the error $\epsilon$ vanishes exponentially in the asymptotic limit of $n \to \infty$. More precisely, we prove the following theorem.

Theorem 14

There exists a constant c Ψ > such that forany R > M A | B (Ψ ABC ) , sufﬁciently small δ > and anysufﬁciently large n , we ﬁnd a random unitary operation V n : τ (cid:55)→ − nR (cid:80) nR k =1 V k τ V † k on A n and a Markov state Υ A n B n C n conditioned by B n that satisfy (cid:13)(cid:13)(cid:13) V n ( ρ ⊗ n ) − Υ A n B n C n (cid:13)(cid:13)(cid:13) ≤ (cid:18) − c Ψ δ n (cid:19) . Proof:

Let X , · · · , X n be a sequence of i.i.d. randomvariables obeying a probability distribution { p x } x . It is provedin [35] that there exists a constant c > , which depends on { p x } x , such that for any δ > and n , we have Pr { ( X , · · · , X n ) ∈ T n,δ } ≥ − exp ( − cδ n ) , Pr { ( X , · · · , X n ) ∈ T ∗ n,δ } ≥ − exp ( − cδ n ) . As a consequence, there exists a constant c ψ > such thatwe have Tr[Π n,δ ( | ψ (cid:105)(cid:104) ψ | AB ) ⊗ n ] = (cid:88) x ∈T n,δ p x ≥ − exp ( − c ψ δ n )Tr[Π ∗ n,δ ( | ψ (cid:105)(cid:104) ψ | AB ) ⊗ n ] = (cid:88) x ∈T ∗ n,δ p x ≥ − exp ( − c ψ δ n ) for any δ > and n , corresponding to (53) and (54). Thus,for any δ > , n and D n,δ deﬁned by (63), we obtain D n,δ ≥ − exp ( − c Ψ δ n ) , where c Ψ > is a constant. Hence we have (cid:13)(cid:13)(cid:13)(cid:13) V n (cid:0) | Ψ ⊗ nKI (cid:105)(cid:104) Ψ ⊗ nKI | (cid:1) − ¯Ψ n,δ D n,δ (cid:13)(cid:13)(cid:13)(cid:13) ≤ (cid:18) − c Ψ δ n (cid:19) + 2 (cid:15) − exp ( − c Ψ δ n ) for any δ, (cid:15) > and n , corresponding to (64). Substituting exp ( − c Ψ δ n/ into (cid:15) , we obtain (cid:13)(cid:13)(cid:13)(cid:13) V n (cid:0) | Ψ ⊗ nKI (cid:105)(cid:104) Ψ ⊗ nKI | (cid:1) − ¯Ψ n,δ D n,δ (cid:13)(cid:13)(cid:13)(cid:13) ≤ (cid:18) − c Ψ δ n (cid:19) (65)for any δ > and n ≥ (ln 2) /c Ψ δ .For an arbitrary R , choose sufﬁciently small δ > suchthat R > H ( { p j } j ) + 2 (cid:88) j p j S ( ϕ a R j )+ δ ( H (cid:48) ( { p j } j ) + 2 log (4 d A ) + c Ψ δ ) . Inequality (65) then holds for sufﬁciently large n , while keep-ing the R.H.S. in (60) strictly greater than . This completesthe proof. (cid:4) C. Proof of Optimality (Inequality (20))

We assume, without loss of generality, that $d_A \ge d_B d_C$. This condition is always satisfied by associating a sufficiently large Hilbert space $\mathcal{H}^A$ to system $A$.

Take an arbitrary $R > M_{A|B}(\Psi^{ABC})$. By definition, for any $\epsilon \in (0,1)$ and sufficiently large $n$, there exist a random unitary operation $\mathcal{V}_n : \tau \mapsto 2^{-nR} \sum_{k=1}^{2^{nR}} V_k \tau V_k^\dagger$ on $\bar{A}$ and a Markov state $\Upsilon^{\bar{A}\bar{B}\bar{C}}$ conditioned by $\bar{B}$ such that
\[ \left\| \mathcal{V}_n(\Psi^{\otimes n}) - \Upsilon^{\bar{A}\bar{B}\bar{C}} \right\|_1 \le \epsilon. \tag{66} \]
By tracing out $A^n$, we have
\[ \left\| (\Psi^{\otimes n})^{\bar{B}\bar{C}} - \Upsilon^{\bar{B}\bar{C}} \right\|_1 \le \epsilon. \tag{67} \]
Due to Uhlmann's theorem ([24], see Appendix A-A), there exists a purification $|\chi\rangle^{\bar{A}\bar{B}\bar{C}}$ of $\Upsilon^{\bar{B}\bar{C}}$ such that we have
\[ \left\| (\Psi^{\otimes n})^{\bar{A}\bar{B}\bar{C}} - \chi^{\bar{A}\bar{B}\bar{C}} \right\|_1 \le 2\sqrt{\epsilon}. \tag{68} \]
Let $\tilde{\Gamma}' : \mathcal{H}^{\bar{B}}_{\chi} \to \mathcal{H}^{\hat{b}} \otimes \mathcal{H}^{\hat{b}_L} \otimes \mathcal{H}^{\hat{b}_R}$ be the KI isometry on $\bar{B}$ with respect to $\chi^{\bar{B}\bar{C}}$. From Lemma 11, there exists a sub-KI isometry $\tilde{\Gamma} : \mathcal{H}^{\bar{A}}_{\chi} \to \mathcal{H}^{\hat{a}} \otimes \mathcal{H}^{\hat{a}_L} \otimes \mathcal{H}^{\hat{a}_R}$ such that the KI decomposition of $|\chi\rangle$ on $\bar{B}$ and $\bar{A}$ is given by
\[ |\chi_{KI}\rangle := (\tilde{\Gamma}^{\bar{A}} \otimes \tilde{\Gamma}'^{\bar{B}}) |\chi\rangle = \sum_i \sqrt{q_i}\, |i\rangle^{\hat{a}} |i\rangle^{\hat{b}} |\xi_i\rangle^{\hat{a}_L \hat{b}_L} |\phi_i\rangle^{\hat{a}_R \hat{b}_R \bar{C}}. \tag{69} \]
From Theorem 7 and $\chi^{\bar{B}\bar{C}} = \Upsilon^{\bar{B}\bar{C}}$, a Markov decomposition of $\Upsilon^{\bar{A}\bar{B}\bar{C}}$ is obtained by $\tilde{\Gamma}'$ as
\[ \Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}} := \tilde{\Gamma}'^{\bar{B}}\, \Upsilon^{\bar{A}\bar{B}\bar{C}}\, \tilde{\Gamma}'^{\dagger \bar{B}} = \sum_i q_i\, |i\rangle\langle i|^{\hat{b}} \otimes \sigma_i^{\bar{A}\hat{b}_L} \otimes \phi_i^{\hat{b}_R \bar{C}}. \tag{70} \]
Due to (68) and the monotonicity of the trace distance, we have
\[ \left\| \mathcal{V}_n(\Psi^{\otimes n}) - \mathcal{V}_n(\chi^{\bar{A}\bar{B}\bar{C}}) \right\|_1 \le 2\sqrt{\epsilon}. \]
Thus from (66) and the triangle inequality, we obtain
\[ \left\| \mathcal{V}_n(\chi^{\bar{A}\bar{B}\bar{C}}) - \Upsilon^{\bar{A}\bar{B}\bar{C}} \right\|_1 \le 2\sqrt{\epsilon} + \epsilon < 3\sqrt{\epsilon}. \]
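The step from (67) to (68) combines Uhlmann's theorem with the standard relation between fidelity and trace distance. The following is a minimal numerical sketch of that relation, not part of the original proof. It assumes that $\|\cdot\|_1$ denotes the (unhalved) trace norm, and the helper names `psd_sqrt`, `fidelity`, `random_density_matrix` and `purification_distance_bound` are illustrative. For two mixed states at trace-norm distance $\epsilon$, optimal purifications have overlap equal to the fidelity $F$, so their trace-norm distance is $2\sqrt{1-F^2}$, which the sketch compares against $2\sqrt{\epsilon}$.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    w = np.clip(w, 0.0, None)          # remove tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.conj().T

def trace_norm(m):
    """Trace norm ||m||_1, i.e. the sum of singular values."""
    return float(np.linalg.svd(m, compute_uv=False).sum())

def fidelity(rho, sigma):
    """Uhlmann fidelity F = ||sqrt(rho) sqrt(sigma)||_1 (not squared)."""
    return trace_norm(psd_sqrt(rho) @ psd_sqrt(sigma))

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative test input)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def purification_distance_bound(rho, sigma):
    """Return (eps, distance of optimal purifications, 2*sqrt(eps))."""
    eps = trace_norm(rho - sigma)
    f = fidelity(rho, sigma)
    # Pure states with overlap f are at trace-norm distance 2*sqrt(1 - f^2).
    best = 2.0 * np.sqrt(max(0.0, 1.0 - f ** 2))
    return eps, best, 2.0 * np.sqrt(eps)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    rho = random_density_matrix(4, rng)
    sigma = random_density_matrix(4, rng)
    print(purification_distance_bound(rho, sigma))
```

The second returned value should never exceed the third, which is the content of the bound $2\sqrt{\epsilon}$ in (68).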
Applying $\tilde{\Gamma}'^{\bar{B}}$ yields
\[ \left\| \mathcal{V}_n\!\left( \tilde{\Gamma}'^{\bar{B}} \chi^{\bar{A}\bar{B}\bar{C}} \tilde{\Gamma}'^{\dagger \bar{B}} \right) - \Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}} \right\|_1 \le 3\sqrt{\epsilon}, \]
due to (70). Hence we obtain from (69) that
\[ \left\| \mathcal{V}_n\!\left( \tilde{\Gamma}^{\dagger \bar{A}} \chi_{KI}^{\bar{A}\bar{B}\bar{C}} \tilde{\Gamma}^{\bar{A}} \right) - \Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}} \right\|_1 \le 3\sqrt{\epsilon}. \tag{71} \]
Let $\mathcal{D}^{\hat{b}}$ be the completely dephasing operation on $\hat{b}$ with respect to the basis $\{|i\rangle^{\hat{b}}\}_i$. From (70), we have $\mathcal{D}^{\hat{b}}(\Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}}) = \Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}}$. Thus we obtain from (71) that
\[ \left\| (\mathcal{T}'_n \otimes \mathcal{D}^{\hat{b}})(|\chi_{KI}\rangle\langle\chi_{KI}|) - \Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}} \right\|_1 \le 3\sqrt{\epsilon}. \tag{72} \]
Here, we defined a random isometry operation $\mathcal{T}'_n := \mathcal{V}_n \circ \mathcal{E}_{\tilde{\Gamma}^\dagger}$, where $\mathcal{E}_{\tilde{\Gamma}^\dagger}$ is an isometry operation corresponding to $\tilde{\Gamma}^\dagger$. Due to (69), we have
\[ \mathcal{D}^{\hat{b}}(|\chi_{KI}\rangle\langle\chi_{KI}|) = \sum_i q_i\, |i\rangle\langle i|^{\hat{a}} \otimes |i\rangle\langle i|^{\hat{b}} \otimes |\xi_i\rangle\langle\xi_i|^{\hat{a}_L \hat{b}_L} \otimes |\phi_i\rangle\langle\phi_i|^{\hat{a}_R \hat{b}_R \bar{C}}, \]
which leads to
\[ \mathrm{Tr}_{\hat{b}_R \bar{C}}\!\left[ \mathcal{D}^{\hat{b}}(|\chi_{KI}\rangle\langle\chi_{KI}|) \right] = \sum_i q_i\, |i\rangle\langle i|^{\hat{b}} \otimes |i, \xi_i\rangle\langle i, \xi_i|^{\hat{a}\hat{a}_L \hat{b}_L} \otimes \phi_i^{\hat{a}_R}. \]
Hence we have
\[ \mathrm{Tr}_{\hat{b}_R \bar{C}}\!\left[ (\mathcal{T}'_n \otimes \mathcal{D}^{\hat{b}})(|\chi_{KI}\rangle\langle\chi_{KI}|) \right] = \sum_i q_i\, |i\rangle\langle i|^{\hat{b}} \otimes \phi_{i,\mathcal{T}'_n}^{\bar{A}\hat{b}_L}, \]
where we define
\[ \phi_{i,\mathcal{T}'_n}^{\bar{A}\hat{b}_L} := \mathcal{T}'_n\!\left( |i, \xi_i\rangle\langle i, \xi_i|^{\hat{a}\hat{a}_L \hat{b}_L} \otimes \phi_i^{\hat{a}_R} \right). \tag{73} \]
From (70) we have
\[ \mathrm{Tr}_{\hat{b}_R \bar{C}}\!\left[ \Upsilon_{Mk}^{\bar{A}\bar{B}\bar{C}} \right] = \sum_i q_i\, |i\rangle\langle i|^{\hat{b}} \otimes \sigma_i^{\bar{A}\hat{b}_L}. \]
Therefore, by tracing out $\hat{b}_R \bar{C}$ in (72), we obtain
\[ \left\| \sum_i q_i\, |i\rangle\langle i|^{\hat{b}} \otimes \phi_{i,\mathcal{T}'_n}^{\bar{A}\hat{b}_L} - \sum_i q_i\, |i\rangle\langle i|^{\hat{b}} \otimes \sigma_i^{\bar{A}\hat{b}_L} \right\|_1 \le 3\sqrt{\epsilon}. \]
Thus, by Inequality (47), we have
\[ H(\{q_i\}_i) + \sum_i q_i S(\sigma_i^{\bar{A}\hat{b}_L}) \ge H(\{q_i\}_i) + \sum_i q_i S(\phi_{i,\mathcal{T}'_n}^{\bar{A}\hat{b}_L}) - \eta(3\sqrt{\epsilon}) \log(d_{\bar{A}} d_{\bar{B}}). \]
Since the von Neumann entropy is nondecreasing under random unitary operations, we have $S(\phi_{i,\mathcal{T}'_n}^{\bar{A}\hat{b}_L}) \ge S(\phi_i^{\hat{a}_R})$ for each $i$ from (73). Hence we obtain
\[ \sum_i q_i S(\sigma_i^{\bar{A}\hat{b}_L}) \ge \sum_i q_i S(\phi_i^{\hat{a}_R}) - \eta(3\sqrt{\epsilon}) \log(d_{\bar{A}} d_{\bar{B}}). \tag{74} \]
The von Neumann entropy of the state $\Upsilon^{\bar{A}\bar{B}\bar{C}}$ is then bounded below as
\begin{align}
S(\bar{A}\bar{B}\bar{C})_{\Upsilon} &= S(\bar{A}\hat{b}\hat{b}_L\hat{b}_R\bar{C})_{\Upsilon_{Mk}} \nonumber\\
&= S(\hat{b})_{\Upsilon_{Mk}} + S(\bar{A}\hat{b}_L\hat{b}_R\bar{C} \,|\, \hat{b})_{\Upsilon_{Mk}} \nonumber\\
&= H(\{q_i\}_i) + \sum_i q_i \left( S(\sigma_i^{\bar{A}\hat{b}_L}) + S(\phi_i^{\hat{b}_R\bar{C}}) \right) \nonumber\\
&= H(\{q_i\}_i) + \sum_i q_i \left( S(\sigma_i^{\bar{A}\hat{b}_L}) + S(\phi_i^{\hat{a}_R}) \right) \nonumber\\
&\ge H(\{q_i\}_i) + 2\sum_i q_i S(\phi_i^{\hat{a}_R}) - n\,\eta(3\sqrt{\epsilon}) \log(d_A d_B d_C) \nonumber\\
&= S(\hat{a})_{\chi_{KI}} + 2 S(\hat{a}_R | \hat{a})_{\chi_{KI}} - n\,\eta(3\sqrt{\epsilon}) \log(d_A d_B d_C). \tag{75}
\end{align}
Here, the third line follows from (70); the fourth line because of $\phi_i^{\hat{a}_R\hat{b}_R\bar{C}}$ being a pure state; the fifth line by Inequality (74); and the sixth line from (69). From Lemma 12, (68) implies
\[ S(\hat{a})_{\chi_{KI}} + 2 S(\hat{a}_R | \hat{a})_{\chi_{KI}} \ge n \left( S(a)_{\Psi_{KI}} + 2 S(a_R | a)_{\Psi_{KI}} - \zeta'_{\Psi}(2\sqrt{\epsilon}) \log d_A \right), \tag{76} \]
where $\zeta'_{\Psi}(\epsilon)$ is a function defined by (90) in Appendix B-D. Putting together (75) and (76), we obtain
\[ \frac{1}{n} S(\bar{A}\bar{B}\bar{C})_{\Upsilon} \ge S(a)_{\Psi_{KI}} + 2 S(a_R | a)_{\Psi_{KI}} - \left( \eta(3\sqrt{\epsilon}) + \zeta'_{\Psi}(2\sqrt{\epsilon}) \right) \log(d_A d_B d_C). \]
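The entropy argument above relies on the fact that a random unitary operation (a uniform mixture of unitary conjugations) cannot decrease the von Neumann entropy. A minimal numerical sketch of this fact follows; it is an illustration under stated assumptions, not the construction used in the proof. The function names and the choice of Haar-random unitaries are hypothetical. The sketch also checks the standard bound that a mixture of $N$ pure states has entropy at most $\log_2 N$, which is the fact used in the next step of the proof.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Entropy in bits; eigenvalues below 1e-12 are treated as zero."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def random_unitary(dim, rng):
    """Haar-distributed unitary from the QR decomposition of a Gaussian matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(g)
    d = np.diagonal(r)
    return q * (d / np.abs(d))         # fix column phases

def random_unitary_channel(rho, unitaries):
    """Uniform mixture tau -> (1/N) sum_k V_k tau V_k^dagger."""
    return sum(u @ rho @ u.conj().T for u in unitaries) / len(unitaries)

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    dim, n_unitaries = 4, 3
    unitaries = [random_unitary(dim, rng) for _ in range(n_unitaries)]

    psi = np.zeros(dim, dtype=complex)
    psi[0] = 1.0
    rho = np.outer(psi, psi.conj())    # pure input state, entropy 0

    out = random_unitary_channel(rho, unitaries)
    print(von_neumann_entropy(rho), von_neumann_entropy(out), np.log2(n_unitaries))
```

For a pure input the output entropy lies between $0$ and $\log_2 N$, where $N$ is the number of unitaries; for a mixed input it is at least the input entropy.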
Noting that V n (Ψ ⊗ n ) is a mixture of nR (not necessarilyorthogonal) pure states, from (66), we ﬁnally obtain R ≥ n S ( ¯ A ¯ B ¯ C ) V n (Ψ ⊗ n ) ≥ n S ( ¯ A ¯ B ¯ C ) Υ − η ( (cid:15) ) log( d A d B d C ) ≥ S ( a ) Ψ KI + 2 S ( a R | a ) Ψ KI − (cid:0) η (3 √ (cid:15) ) + ζ (cid:48) Ψ (2 √ (cid:15) ) (cid:1) log( d A d B d C )= H ( { p j } j ∈ J ) + 2 (cid:88) j ∈ J p j S ( ϕ a R j ) − (cid:0) η (3 √ (cid:15) ) + ζ (cid:48) Ψ (2 √ (cid:15) ) (cid:1) log( d A d B d C ) , which implies (20) by taking the limit of (cid:15) → . (cid:4) D. Proof of Lemma 12

The key idea for the proof of Lemma 12 is similar to the one used in [25]. Let $\psi^{AC'}$ be a state such that the KI isometry on $A$ with respect to $\psi^{AC'}$ is the same as that with respect to $\Psi^{AC}$, and that it is decomposed as
\[ \psi_{KI}^{AC'} := (\Gamma_{\Psi}^{A} \otimes \Gamma_{\psi}^{C'})\, \psi^{AC'}\, (\Gamma_{\Psi}^{A} \otimes \Gamma_{\psi}^{C'})^\dagger = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes |\tilde{\varphi}_j\rangle\langle\tilde{\varphi}_j|^{a_R c'_R} \otimes |j\rangle\langle j|^{c'}, \tag{77} \]
where $\Gamma_{\psi} : \mathcal{H}^{C'} \to \mathcal{H}^{c'} \otimes \mathcal{H}^{c'_R}$ is an isometry and $|\tilde{\varphi}_j\rangle^{a_R c'_R}$ is a purification of $\varphi_j^{a_R}$. The state satisfies $\psi^{A} = \Psi^{A}$. Note that we have
\[ d_{C'} = \sum_j \mathrm{rank}\, \tilde{\varphi}_j^{c'_R} = \sum_j \mathrm{rank}\, \tilde{\varphi}_j^{a_R} \le d_A. \]
Let $\mathbb{E}$ be the set of all linear CPTP maps on $\mathcal{S}(\mathcal{H}^A)$, and define two functions $f, g : \mathbb{E} \to \mathbb{R}$ by
\[ f(\mathcal{E}) = \left\| \mathcal{E}(\Psi^{AC}) - \Psi^{AC} \right\|_1, \qquad g(\mathcal{E}) = \left\| \mathcal{E}(\psi^{AC'}) - \psi^{AC'} \right\|_1. \]
Since the KI decomposition of $A$ with respect to $\Psi^{AC}$ and that with respect to $\psi^{AC'}$ are the same, $f(\mathcal{E}) = 0$ if and only if $g(\mathcal{E}) = 0$ (see Lemma 5). Define
\[ \zeta_{\Psi}(\epsilon) := \sup_{\mathcal{E} \in \mathbb{E}} \{ g(\mathcal{E}) \,|\, f(\mathcal{E}) \le \epsilon \}. \tag{78} \]
This is a monotonically nondecreasing function of $\epsilon$ by definition, and satisfies $\lim_{\epsilon \to 0} \zeta_{\Psi}(\epsilon) = 0$ as we prove in Appendix B-E.

We consider a general situation in which the relation (24) does not necessarily hold. Let $\Pi_{\chi}$ be the projection onto $\mathcal{H}^{\bar{A}}_{\chi} \subseteq \mathcal{H}^{\bar{A}}$, and $\Pi_{\chi}^{\perp}$ be that onto its orthogonal complement. Using a quantum channel $\mathcal{E}_{\chi}$ on $\mathcal{S}(\mathcal{H}^{\bar{A}}_{\chi})$ defined by (26), construct another quantum channel $\mathcal{E}^{*}_{\chi}$ on $\mathcal{S}(\mathcal{H}^{\bar{A}})$ by
\[ \mathcal{E}^{*}_{\chi}(\tau) = \mathcal{E}_{\chi}(\Pi_{\chi} \tau \Pi_{\chi}) + \Pi_{\chi}^{\perp} \tau \Pi_{\chi}^{\perp} \quad (\forall \tau \in \mathcal{S}(\mathcal{H}^{\bar{A}})). \tag{79} \]
Define quantum channels $\mathcal{E}_l$ on $A_l$ ($1 \le l \le n$) by
\[ \mathcal{E}_l(\tau^{A_l}) = \mathrm{Tr}_{\bar{A} \setminus A_l}\!\left[ \mathcal{E}^{*}_{\chi}\!\left( \Psi^{A_1} \otimes \cdots \otimes \Psi^{A_{l-1}} \otimes \tau^{A_l} \otimes \Psi^{A_{l+1}} \otimes \cdots \otimes \Psi^{A_n} \right) \right], \]
where $\mathrm{Tr}_{\bar{A} \setminus A_l}$ denotes the partial trace over $A_1 \cdots A_{l-1} A_{l+1} \cdots A_n$. From (25), we have $\mathcal{E}^{*}_{\chi}(\chi^{\bar{A}\bar{C}}) = \mathcal{E}_{\chi}(\chi^{\bar{A}\bar{C}}) = \chi^{\bar{A}\bar{C}}$, and thus from (22) and the triangle inequality, we have
\[ \left\| \mathcal{E}^{*}_{\chi}(\Psi^{\otimes n})^{\bar{A}\bar{C}} - (\Psi^{\otimes n})^{\bar{A}\bar{C}} \right\|_1 \le 2\epsilon, \]
which implies
\[ \left\| \mathcal{E}_l(\Psi^{A_l C_l}) - \Psi^{A_l C_l} \right\|_1 \le 2\epsilon \]
by taking the partial trace. Thus we have
\[ \left\| \mathcal{E}_l(\psi^{A_l C'_l}) - \psi^{A_l C'_l} \right\|_1 \le \zeta_{\Psi}(2\epsilon) \]
for any $1 \le l \le n$. By Inequality (49) and $d_{C'} \le d_A$, it follows that
\[ I(A : C')_{\psi} - I(A_l : C'_l)_{\mathcal{E}_l(\psi)} \le 4\eta(\zeta_{\Psi}(2\epsilon)) \log d_A, \]
and consequently, that
\[ n I(A : C')_{\psi} - \sum_{l=1}^{n} I(A_l : C'_l)_{\mathcal{E}_l(\psi)} \le 4n\,\eta(\zeta_{\Psi}(2\epsilon)) \log d_A. \tag{80} \]
We also have
\begin{align}
I(\bar{A} : \bar{C}')_{\mathcal{E}^{*}_{\chi}(\psi^{\otimes n})} &= S(\bar{C}')_{\mathcal{E}^{*}_{\chi}(\psi^{\otimes n})} - S(\bar{C}' | \bar{A})_{\mathcal{E}^{*}_{\chi}(\psi^{\otimes n})} \nonumber\\
&= S(\bar{C}')_{\psi^{\otimes n}} - \sum_{l=1}^{n} S(C'_l \,|\, A_1 \cdots A_n C'_1 \cdots C'_{l-1})_{\mathcal{E}^{*}_{\chi}(\psi^{\otimes n})} \nonumber\\
&\ge \sum_{l=1}^{n} S(C'_l)_{\psi} - \sum_{l=1}^{n} S(C'_l | A_l)_{\mathcal{E}^{*}_{\chi}(\psi^{\otimes n})} \nonumber\\
&= \sum_{l=1}^{n} S(C'_l)_{\mathcal{E}_l(\psi)} - \sum_{l=1}^{n} S(C'_l | A_l)_{\mathcal{E}_l(\psi)} \nonumber\\
&= \sum_{l=1}^{n} I(A_l : C'_l)_{\mathcal{E}_l(\psi)}. \tag{81}
\end{align}
Here, we used the fact that $\mathcal{E}^{*}_{\chi}$ on $\bar{A}$ does not change the reduced state on $\bar{C}'$, and that
\[ \mathrm{Tr}_{\bar{A} \setminus A_l,\, \bar{C}' \setminus C'_l}\!\left[ \mathcal{E}^{*}_{\chi}(\psi^{\otimes n}) \right] = \mathcal{E}_l(\psi^{A_l C'_l}), \]
because of $\Psi^{A_{l'}} = \psi^{A_{l'}}$. Combining (80) and (81), we obtain
\[ n I(A : C')_{\psi} \le I(\bar{A} : \bar{C}')_{\mathcal{E}^{*}_{\chi}(\psi^{\otimes n})} + 4n\,\eta(\zeta_{\Psi}(2\epsilon)) \log d_A. \]
(82)Deﬁne ψ ¯ A ¯ C (cid:48) n,χ := Π Aχ ( ψ AC (cid:48) ) ⊗ n Π Aχ Tr[Π χ ( ψ A ) ⊗ n ] and ψ ˆ a ˆ a R ¯ C (cid:48) n := ( E ◦ E ◦ E Γ χ )( ψ ¯ A ¯ C (cid:48) n,χ ) as depicted in Figure 5. From Condition (22) and ψ A = Ψ A ,we have (cid:107) ( ψ ⊗ n ) ¯ A − χ ¯ A (cid:107) ≤ (cid:15) . Thus, due to (79) andInequality (41), we have (cid:13)(cid:13)(cid:13) E χ ( ψ ¯ A ¯ C (cid:48) n,χ ) − E ∗ χ ( ψ ⊗ n ) ¯ A ¯ C (cid:48) (cid:13)(cid:13)(cid:13) ≤ (cid:13)(cid:13)(cid:13) ψ ¯ A ¯ C (cid:48) n,χ − ( ψ AC (cid:48) ) ⊗ n (cid:13)(cid:13)(cid:13) ≤ √ (cid:15), which leads to I ( ¯ A : ¯ C (cid:48) ) E ∗ χ ( ψ ⊗ n ) ≤ I ( ¯ A : ¯ C (cid:48) ) E χ ( ψ n,χ ) + 4 nη (2 √ (cid:15) ) log d A (83)by Inequality (49) and d C (cid:48) ≤ d A . By the data processinginequality, we also have I ( ¯ A : ¯ C (cid:48) ) E χ ( ψ n,χ ) ≤ I (ˆ a ˆ a R : ¯ C (cid:48) ) ψ n . (84)Consequently, we obtain from (82), (83) and (84) that nI ( A : C (cid:48) ) ψ ≤ I (ˆ a ˆ a R : ¯ C (cid:48) ) ψ n +4 n (cid:0) η (2 √ (cid:15) ) + η ( ζ Ψ (2 (cid:15) )) (cid:1) log d A . (85)The QMIs in (85) are calculated as follows. First, from (25),we have ( E ◦ E ◦ E Γ χ )( χ ¯ A ) = χ ˆ a ˆ a R sKI = (cid:88) i q i | i (cid:105)(cid:104) i | ˆ a ⊗ φ ˆ a R i . (86)Therefore, from the monotonicity of the trace distance under E ◦ E ◦ E Γ χ , Equality (22) and ψ A = Ψ A , we obtain (cid:13)(cid:13)(cid:13) ψ ˆ a ˆ a R n − χ ˆ a ˆ a R sKI (cid:13)(cid:13)(cid:13) ≤ (cid:13)(cid:13)(cid:13) ( ψ ⊗ n ) ¯ A − χ ¯ A (cid:13)(cid:13)(cid:13) ≤ (cid:15). (87)Due to (47), (48) and dim H ˆ a , dim H ˆ a R ≤ dim H ¯ A = d nA , (87) implies that | S (ˆ a ) χ sKI − S (ˆ a ) ψ n | ≤ nη ( (cid:15) ) log d A , | S (ˆ a R | ˆ a ) χ sKI − S (ˆ a R | ˆ a ) ψ n | ≤ nη ( (cid:15) ) log d A , and consequently, that S (ˆ a ) ψ n + 2 S (ˆ a R | ˆ a ) ψ n ≤ S (ˆ a ) χ sKI + 2 S (ˆ a R | ˆ a ) χ sKI + 7 nη ( (cid:15) ) log d A . 
Since $\psi_n^{\hat{a}\hat{a}_R\bar{C}'}$ is a classical-quantum state between $\hat{a}$ and $\hat{a}_R \bar{C}'$, we obtain
\begin{align}
I(\hat{a}\hat{a}_R : \bar{C}')_{\psi_n} &= I(\hat{a} : \bar{C}')_{\psi_n} + I(\hat{a}_R : \bar{C}' | \hat{a})_{\psi_n} \nonumber\\
&\le S(\hat{a})_{\psi_n} + 2 S(\hat{a}_R | \hat{a})_{\psi_n} \nonumber\\
&\le S(\hat{a})_{\chi_{sKI}} + 2 S(\hat{a}_R | \hat{a})_{\chi_{sKI}} + 7n\,\eta(\epsilon) \log d_A \nonumber\\
&= H(\{q_i\}_i) + 2\sum_i q_i S(\phi_i^{\hat{a}_R}) + 7n\,\eta(\epsilon) \log d_A, \tag{88}
\end{align}
where the last equality follows from (86).

It is straightforward to obtain from (77) that
\[ I(A : C')_{\psi} = I(a\, a_L a_R : c' c'_R)_{\psi_{KI}} = H(\{p_j\}_j) + 2\sum_j p_j S(\varphi_j^{a_R}). \tag{89} \]
Combining (85), (88) and (89), we obtain
\[ n \left( H(\{p_j\}_j) + 2\sum_j p_j S(\varphi_j^{a_R}) \right) \le H(\{q_i\}_i) + 2\sum_i q_i S(\phi_i^{\hat{a}_R}) + n\,\zeta'_{\Psi}(\epsilon) \log d_A, \]
where
\[ \zeta'_{\Psi}(\epsilon) = 11\eta(2\sqrt{\epsilon}) + 4\eta(\zeta_{\Psi}(2\epsilon)) \tag{90} \]
and $\zeta_{\Psi}$ is a function defined by (78). Thus we finally arrive at (23). $\blacksquare$

E. Convergence of $\zeta_{\Psi}$

We prove that $\zeta_{\Psi}(\epsilon)$ defined by (78) satisfies $\lim_{\epsilon \to 0} \zeta_{\Psi}(\epsilon) = 0$, based on an idea used in [25]. Due to the Choi-Jamiolkowski isomorphism, $\mathbb{E}$ can be identified with $\mathcal{S}(\mathcal{H}^A \otimes \mathcal{H}^A)$. Hence $\mathbb{E}$ is compact, which implies that the supremum in (78) is actually attained as a maximum:
\[ \zeta_{\Psi}(\epsilon) = \max_{\mathcal{E} \in \mathbb{E}} \{ g(\mathcal{E}) \,|\, f(\mathcal{E}) \le \epsilon \}. \]
Hence we have that
\[ \forall \epsilon > 0,\ \exists \mathcal{E} \in \mathbb{E} :\ g(\mathcal{E}) = \zeta_{\Psi}(\epsilon),\ f(\mathcal{E}) \le \epsilon. \]
Define $\alpha := \lim_{\epsilon \to 0} \zeta_{\Psi}(\epsilon)$. Due to the monotonicity, we have $\zeta_{\Psi}(\epsilon) \ge \alpha$ for all $\epsilon > 0$. Consequently, we have that
\[ \forall \epsilon > 0,\ \exists \mathcal{E} \in \mathbb{E} :\ g(\mathcal{E}) \ge \alpha,\ f(\mathcal{E}) \le \epsilon. \tag{91} \]
Define $\mathbb{E}_{\alpha} := \{ \mathcal{E} \in \mathbb{E} \,|\, g(\mathcal{E}) \ge \alpha \}$. Due to the continuity of $g$, $\mathbb{E}_{\alpha}$ is a closed subset of $\mathbb{E}$. Hence $\beta := \min_{\mathcal{E} \in \mathbb{E}_{\alpha}} f(\mathcal{E})$ exists due to the continuity of $f$. By definition, we have that
\[ \forall \mathcal{E} \in \mathbb{E} :\ g(\mathcal{E}) \ge \alpha \Rightarrow f(\mathcal{E}) \ge \beta. \tag{92} \]
Suppose now that $\alpha > 0$. We have $f(\mathcal{E}) > 0$ for all $\mathcal{E} \in \mathbb{E}_{\alpha}$ due to Lemma 5. Thus we have $\beta > 0$, in which case (92) contradicts (91) because $\epsilon$ can be arbitrarily small. $\blacksquare$

APPENDIX C
PROOF OF THEOREM 10

A. Irreducibility of the KI decomposition

Similarly to the irreducibility of the KI decomposition of a set of states presented in Lemma 2, the KI decomposition of a bipartite state defined by Definition 3 also has a property of irreducibility as follows.

Lemma 15

Suppose the KI decomposition of $\Psi^{AC}$ on $A$ is given by
\[ \Psi_{KI}^{AC} = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \varphi_j^{a_R C}, \]
and define $\varphi_{j,kl} := \langle k|^{C} \varphi_j^{a_R C} |l\rangle^{C}$, where $\{|k\rangle\}_k$ is an orthonormal basis of $\mathcal{H}^{C}$. Then the following two properties hold.

1) If a linear operator $N$ on $\mathcal{H}_j^{a_R} := \mathrm{supp}[\varphi_j^{a_R}]$ satisfies $p_j N \varphi_{j,kl} = p_j \varphi_{j,kl} N$ for all $k$ and $l$, then $N = c I_j^{a_R}$ for a complex number $c$, where $I_j^{a_R}$ is the identity operator on $\mathcal{H}_j^{a_R}$.

2) If a linear operator $N : \mathcal{H}_j^{a_R} \to \mathcal{H}_{j'}^{a_R}$ ($j \ne j'$) satisfies $p_j N \varphi_{j,kl} = p_{j'} \varphi_{j',kl} N$ for all $k$ and $l$, then $N = 0$.

Proof:

Define
\[ p_M := \mathrm{Tr}[M^{C} \Psi^{AC} M^{\dagger C}], \qquad \Psi_M := p_M^{-1}\, \mathrm{Tr}_C[M^{C} \Psi^{AC} M^{\dagger C}], \qquad \varphi_{j,M}^{a_R} := p_M^{-1}\, \mathrm{Tr}_C[M^{C} \varphi_j^{a_R C} M^{\dagger C}] \]
for $M \in \mathcal{L}(\mathcal{H}^{C})$. The set of steerable states corresponding to (4) is given by $\mathcal{S}^{\varphi}_{C \to A} := \{\Psi_M\}_{M \in \mathcal{L}(\mathcal{H}^{C})}$. Hence the KI isometry $\Gamma$ on $A$ with respect to $\Psi^{AC}$ is equal to that with respect to $\mathcal{S}^{\varphi}_{C \to A}$ by Definition 3. Thus $\Psi_M$ is decomposed as
\[ \Gamma \Psi_M \Gamma^\dagger = \sum_{j \in J} p_j\, |j\rangle\langle j|^{a} \otimes \omega_j^{a_L} \otimes \varphi_{j,M}^{a_R}, \]
where $\{\varphi_{j,M}^{a_R}\}_{M \in \mathcal{L}(\mathcal{H}^{C})}$ is a set of states which is irreducible in the sense of Lemma 2.

To prove Property 1), suppose that $N \in \mathcal{L}(\mathcal{H}_j^{a_R})$ satisfies $p_j N \varphi_{j,kl} = p_j \varphi_{j,kl} N$ for all $k$ and $l$. Since $\varphi_{j,M}$ is decomposed as $\varphi_{j,M} = \sum_{k,l} \langle l| M^\dagger M |k\rangle\, \varphi_{j,kl}$, it follows that $p_j N \varphi_{j,M} = p_j \varphi_{j,M} N$ for all $M \in \mathcal{L}(\mathcal{H}^{C})$. Hence we obtain Property 1) due to the irreducibility of $\{\varphi_{j,M}^{a_R}\}_{M \in \mathcal{L}(\mathcal{H}^{C})}$. Property 2) is proved in a similar vein. $\blacksquare$

B. Proof of Theorem 10

Let us first adduce a useful lemma regarding fixed points of the adjoint map of a linear CPTP map.

Lemma 16 (See Lemma 11 in [1].) Let E be a linear CPTPmap on S ( H ) , the Kraus representation of which is given by E ( · ) = (cid:80) k E k ( · ) E † k . Let E ∗ be the adjoint map of E deﬁnedby E ∗ ( · ) = (cid:80) k E † k ( · ) E k . Then X ∈ L ( H ) satisﬁes E ∗ ( X ) = X if and only if [ E k , X ] = [ E † k , X ] = 0 for all k .The proof of Theorem 10 proceeds as follows. Let Γ be theKI isometry on A with respect to Ψ AC , and let ˆΨ AC := Γ A Ψ AC Γ † A = (cid:88) j ∈ J p j | j (cid:105)(cid:104) j | a ⊗ ω a L j ⊗ ϕ a R Cj be the KI decomposition of Ψ AC on A . We have ( ˆΨ AC ) = (cid:88) j ∈ J √ p j | j (cid:105)(cid:104) j | a ⊗ ( ω a L j ) ⊗ ( ϕ a R Cj ) , ( ˆΨ A ) − = (cid:88) j ∈ J √ p j | j (cid:105)(cid:104) j | a ⊗ ( ω a L j ) − ⊗ ( ϕ a R j ) − and ( ˆΨ AC ) ( ˆΨ A ) − = (cid:88) j ∈ J | j (cid:105)(cid:104) j | a ⊗ I a L j ⊗ ( ϕ a R Cj ) ( ϕ a R j ) − . Hence the Kraus operators of E deﬁned by (28) is decomposedas (29).It follows from (11) that E (Ψ A ) = Ψ A . Due to Lemma 16,we have ∀ k, l ; [ E kl , Ψ A ] = 0 , or equivalently, have ∀ k, l ; [ ˆ E kl , ˆΨ A ] = 0 , from which it follows that ∀ j, k, l ; [ e j,kl , ϕ a R j ] = 0 . Using (30), we obtain ∀ j, k, l ; [ ϕ j,kl , ϕ a R j ] = 0 , where ϕ j,kl := (cid:104) k | C ϕ a R Cj | l (cid:105) C . Therefore, due to the irre-ducibility of the KI decomposition, we have that ϕ a R j = π a R j := I a R j / (dim H a R j ) , and consequently, that e j,kl = (dim H a R j ) ϕ j,kl , which implies the irreducibility of { e j,kl } k,l .From (28), for any ˆ τ = Γ τ Γ † ( τ ∈ S ( H A )) , we have (cid:104) j | a ˆ E (ˆ τ ) | j (cid:48) (cid:105) a = (id a L ⊗ E a R jj (cid:48) )( (cid:104) j | a ˆ τ | j (cid:48) (cid:105) a ) , where E jj (cid:48) is a linear map on L ( H a R ) deﬁned by E jj (cid:48) ( · ) = (cid:88) kl e j,kl ( · ) e † j (cid:48) ,kl . 
Thus we have
\[ \langle j|^{a}\, \hat{\mathcal{E}}_{\infty}(\hat{\tau})\, |j'\rangle^{a} = (\mathrm{id}^{a_L} \otimes \mathcal{E}^{a_R}_{\infty,jj'})(\langle j|^{a} \hat{\tau} |j'\rangle^{a}), \]
where $\mathcal{E}_{\infty,jj'}$ is a linear map on $\mathcal{L}(\mathcal{H}^{a_R})$ defined by
\[ \mathcal{E}_{\infty,jj'} := \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mathcal{E}^{n}_{jj'}. \]
Therefore, from
\[ |\Psi_{KI}\rangle = \sum_{j \in J} \sqrt{p_j}\, |j\rangle^{a} |j\rangle^{b} |\omega_j\rangle^{a_L b_L} |\varphi_j\rangle^{a_R b_R C}, \]
we obtain
\begin{align}
\hat{\mathcal{E}}_{\infty}(|\Psi_{KI}\rangle\langle\Psi_{KI}|) &= \sum_{jj'} |j\rangle\langle j|^{a}\, \hat{\mathcal{E}}_{\infty}(|\Psi_{KI}\rangle\langle\Psi_{KI}|)\, |j'\rangle\langle j'|^{a} \nonumber\\
&= \sum_{jj'} \sqrt{p_j p_{j'}}\, |j\rangle\langle j'|^{a} \otimes |j, \omega_j\rangle\langle j', \omega_{j'}|^{b\, a_L b_L} \otimes \mathcal{E}^{a_R}_{\infty,jj'}(|\varphi_j\rangle\langle\varphi_{j'}|^{a_R b_R C}). \tag{93}
\end{align}
Consider that we have $\mathcal{E} \circ \mathcal{E}_{\infty} = \mathcal{E}_{\infty}$, and thus have $\mathcal{E}^{A}_{\infty}(\Psi^{ABC}_{\infty}) = \Psi^{ABC}_{\infty}$. It follows that
\[ \forall k, l;\ [E^{A}_{kl} \otimes I^{BC}, \Psi^{ABC}_{\infty}] = 0, \]
and consequently, that
\[ \forall k, l;\ [\hat{E}^{A}_{kl} \otimes I^{BC}, \hat{\mathcal{E}}_{\infty}(|\Psi_{KI}\rangle\langle\Psi_{KI}|)] = 0. \]
Due to (29) and (93), this is equivalent to
\[ e_{j,kl}\, \mathcal{E}^{a_R}_{\infty,jj'}(|\varphi_j\rangle\langle\varphi_{j'}|^{a_R b_R C}) = \mathcal{E}^{a_R}_{\infty,jj'}(|\varphi_j\rangle\langle\varphi_{j'}|^{a_R b_R C})\, e^{\dagger}_{j',kl} \quad (\forall j, j', k, l). \]
Hence we have
\[ \mathcal{E}^{a_R}_{\infty,jj'}(|\varphi_j\rangle\langle\varphi_{j'}|^{a_R b_R C}) = \begin{cases} \pi_j^{a_R} \otimes \varphi_j^{b_R C} & (j = j') \\ 0 & (j \ne j') \end{cases} \]
due to the irreducibility of $\{e_{j,kl}\}_{k,l}$. From (93), we obtain
\[ \hat{\mathcal{E}}_{\infty}(|\Psi_{KI}\rangle\langle\Psi_{KI}|) = \sum_{j \in J} p_j\, |j, j, \omega_j\rangle\langle j, j, \omega_j|^{a\, b\, a_L b_L} \otimes \pi_j^{a_R} \otimes \varphi_j^{b_R C}. \]
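The map $\mathcal{E}_{\infty}$ above is a Cesàro mean of powers of a channel. As a rough illustration only, the following sketch approximates such a limit by a finite average for a channel given by Kraus operators. The qubit dephasing channel used as input is a hypothetical toy example and is not the channel $\mathcal{E}$ defined by (28); the function names and the truncation length `n_terms` are likewise assumptions.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Apply rho -> sum_k E_k rho E_k^dagger."""
    return sum(k @ rho @ k.conj().T for k in kraus)

def cesaro_mean(kraus, rho, n_terms=2000):
    """Finite approximation of lim_{N->inf} (1/N) sum_{n=1}^{N} E^n(rho)."""
    acc = np.zeros_like(rho, dtype=complex)
    cur = rho.astype(complex)
    for _ in range(n_terms):
        cur = apply_channel(kraus, cur)   # cur = E^n(rho) at step n
        acc += cur
    return acc / n_terms

def dephasing_kraus(p):
    """Toy example: rho -> (1-p) rho + p Z rho Z on one qubit."""
    z = np.diag([1.0, -1.0])
    return [np.sqrt(1.0 - p) * np.eye(2), np.sqrt(p) * z]

if __name__ == "__main__":
    plus = 0.5 * np.ones((2, 2))          # |+><+|
    limit = cesaro_mean(dephasing_kraus(0.3), plus)
    print(np.round(limit.real, 3))        # close to the fully dephased state
```

For this toy channel the off-diagonal elements decay geometrically under iteration, so the averaged state approaches the fully dephased state, mirroring how $\mathcal{E}_{\infty}$ removes the terms with $j \ne j'$ in (93).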
Since $\hat{\mathcal{E}}_{\infty}(|\Psi_{KI}\rangle\langle\Psi_{KI}|)$ and $\Psi^{ABC}_{\infty}$ are equivalent up to local isometries on $A$ and $B$, we finally obtain
\begin{align*}
S(\Psi^{ABC}_{\infty}) &= H(\{p_j\}) + \sum_j p_j \left( S(\pi_j^{a_R}) + S(\varphi_j^{b_R C}) \right) \\
&= H(\{p_j\}) + 2\sum_j p_j S(\varphi_j^{a_R}) \\
&= M_{A|B}(\Psi^{ABC}),
\end{align*}
where we used the fact that $\varphi_j^{a_R} = \pi_j^{a_R}$ and that $\varphi_j^{a_R b_R C}$ is a pure state. $\blacksquare$

Remark:

From (13) and (14), it is straightforward to verifythat the statement of Theorem 10 does not depend on a partic-ular choice of a puriﬁcation of Ψ A . That is, for any puriﬁcation | Ω (cid:105) AA (cid:48) of Ψ A , we have M A | B (Ψ ABC ) = S (Ω AA (cid:48) ∞ ) , where Ω AA (cid:48) ∞ := E A ∞ ( | Ω (cid:105)(cid:104) Ω | AA (cid:48) ) . A puriﬁcation of Ψ A is simplyobtained by | Ω (cid:105) AA (cid:48) = (Ψ A ) d A (cid:88) k =1 | k (cid:105) A | k (cid:105) A (cid:48) , and its matrix representation is given by | Ω (cid:105)(cid:104) Ω | AA (cid:48) = (cid:88) klmn [Λ ] kl,mn | k (cid:105)(cid:104) l | A ⊗ | m (cid:105)(cid:104) n | A (cid:48) . The matrix elements of the generalized inverse matrix Λ − are given by [Λ − ] kl,mn = (cid:104) k | (Ψ A ) − | m (cid:105)(cid:104) n | (Ψ A ) − | l (cid:105) . In addition, the dimension of the eigensubspace of Λ ∞ cor-responding to the eigenvalue 1 is at least 1, since we have E ∞ ( I ) = I due to the self-adjointness of E . These facts justifythe algorithm described in Section III.A PPENDIX DP ROOF OF I NEQUALITY (33)The ﬁrst inequality in (33) is proved as follows. For anarbitrary n and (cid:15) > , let V n : τ (cid:55)→ − nR (cid:80) nR k =1 V k τ V † k bea random unitary operation on A n , and let Υ A n B n C n be aMarkov state conditioned by B n such that (cid:13)(cid:13)(cid:13) V n ( ρ ⊗ n ) − Υ A n B n C n (cid:13)(cid:13)(cid:13) ≤ (cid:15). (94) Let | ψ (cid:105) ABCD be a puriﬁcation of ρ ABC , and E be a quantumsystem with dimension nR . Deﬁning an isometry W : A n → EA n by W = (cid:80) nR k =1 | k (cid:105) E ⊗ V A n k , a Stinespring dilation of V n is given by V n ( τ ) = Tr E [ W τ W † ] . Then a puriﬁcationof ρ (cid:48) ABCn := V n ( ρ ⊗ n ) is given by | ψ (cid:48) n (cid:105) EA n B n C n R n := W ( | ψ (cid:105) ABCR ) ⊗ n . 
For this state, we have
\begin{align}
nR \ge S(E)_{\psi'_n} &= S(A^n B^n C^n R^n)_{\psi'_n} \nonumber\\
&\ge S(A^n B^n C^n)_{\psi'_n} - S(R^n)_{\psi'_n} \nonumber\\
&= S(A^n B^n C^n)_{\rho'_n} - S(R^n)_{\psi^{\otimes n}} \nonumber\\
&= S(A^n B^n C^n)_{\rho'_n} - n S(ABC)_{\rho}, \tag{95}
\end{align}
where the second line follows from (44). From (94), we also have
\begin{align}
S(A^n B^n C^n)_{\rho'_n} &\ge S(A^n B^n C^n)_{\Upsilon} - n\,\eta(\epsilon) \log(d_A d_B d_C) \nonumber\\
&= S(A^n B^n)_{\Upsilon} + S(B^n C^n)_{\Upsilon} - S(B^n)_{\Upsilon} - n\,\eta(\epsilon) \log(d_A d_B d_C) \nonumber\\
&\ge S(A^n B^n)_{\rho'_n} + S(B^n C^n)_{\rho'_n} - S(B^n)_{\rho'_n} - 4n\,\eta(\epsilon) \log(d_A d_B d_C) \nonumber\\
&\ge S(A^n B^n)_{\rho^{\otimes n}} + S(B^n C^n)_{\rho^{\otimes n}} - S(B^n)_{\rho^{\otimes n}} - 4n\,\eta(\epsilon) \log(d_A d_B d_C) \nonumber\\
&= n \left( S(AB)_{\rho} + S(BC)_{\rho} - S(B)_{\rho} \right) - 4n\,\eta(\epsilon) \log(d_A d_B d_C). \tag{96}
\end{align}
Here, the first line follows by Inequality (47); the second line because of $\Upsilon$ being a Markov state conditioned by $B^n$; the third line by Inequality (47), applied to each of the three reduced states; and the fourth line by the von Neumann entropy being nondecreasing under random unitary operations, in addition to $\rho'^{B^n C^n}_n = (\rho^{BC})^{\otimes n}$. From (95) and (96), we obtain
\[ R \ge I(A : C | B)_{\rho} - 4\eta(\epsilon) \log(d_A d_B d_C), \]
which concludes the proof.
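The lower bound above is expressed through the quantum conditional mutual information $I(A:C|B)_{\rho} = S(AB)_{\rho} + S(BC)_{\rho} - S(B)_{\rho} - S(ABC)_{\rho}$. The following sketch evaluates this quantity numerically for small tripartite states. It is an illustrative helper under the assumption that the density matrix is ordered as $A \otimes B \otimes C$; the function names and the two example states are not taken from the paper.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def partial_trace(rho, dims, keep):
    """Reduced state on the subsystems listed in `keep` (ascending indices)."""
    n = len(dims)
    t = rho.reshape(list(dims) + list(dims))
    # Trace out from the highest index down so lower axes keep their positions.
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def conditional_mutual_information(rho, dims):
    """I(A:C|B) = S(AB) + S(BC) - S(B) - S(ABC) for dims = (dA, dB, dC)."""
    s_ab = entropy(partial_trace(rho, dims, [0, 1]))
    s_bc = entropy(partial_trace(rho, dims, [1, 2]))
    s_b = entropy(partial_trace(rho, dims, [1]))
    return s_ab + s_bc - s_b - entropy(rho)

if __name__ == "__main__":
    dims = (2, 2, 2)
    ghz = np.zeros(8)
    ghz[0] = ghz[7] = 1.0 / np.sqrt(2.0)
    print(conditional_mutual_information(np.outer(ghz, ghz), dims))

    bell = np.zeros(4)
    bell[0] = bell[3] = 1.0 / np.sqrt(2.0)
    product = np.kron(np.outer(bell, bell), np.diag([1.0, 0.0]))
    print(conditional_mutual_information(product, dims))
```

The three-qubit GHZ state gives one bit of conditional mutual information, whereas a state of the form $\rho^{AB} \otimes \rho^{C}$ is a Markov state conditioned by $B$ and gives zero.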