Quantum Darwinism and the spreading of classical information in non-classical theories

Abstract

Quantum Darwinism posits that the emergence of a classical reality relies on the spreading of classical information from a quantum system to many parts of its environment. But what are the essential physical principles of quantum theory that make this mechanism possible? We address this question by formulating the simplest instance of Darwinism -- CNOT-like fan-out interactions -- in a class of probabilistic theories that contain classical and quantum theory as special cases. We determine necessary and sufficient conditions for any theory to admit such interactions. We find that every non-classical theory that admits this spreading of classical information must have both entangled states and entangled measurements. Furthermore, we show that Spekkens' toy theory admits this form of Darwinism, and so do all probabilistic theories that satisfy principles like strong symmetry, or contain a certain type of decoherence processes. Our result suggests the counterintuitive general principle that in the presence of local non-classicality, a classical world can only emerge if this non-classicality can be "amplified" to a form of entanglement.

Roberto D. Baldijão (1, 2, ∗), Marius Krumm (2, 3, ∗), Andrew J. P. Garner, and Markus P. Müller (2, 4, 5)

1 Instituto de Física Gleb Wataghin, Universidade Estadual de Campinas, Campinas, SP 13083-859, Brazil
2 Institute for Quantum Optics and Quantum Information, Austrian Academy of Sciences, Boltzmanngasse 3, A-1090 Vienna, Austria
3 Faculty of Physics, University of Vienna, Boltzmanngasse 5, A-1090 Vienna, Austria
4 Vienna Center for Quantum Science and Technology (VCQ), Faculty of Physics, University of Vienna, Vienna, Austria
5 Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, ON N2L 2Y5, Canada

(Dated: December 11, 2020)

I. INTRODUCTION

Quantum Darwinism [1–12] addresses one of the toughest questions raised by quantum theory: If the universe is fundamentally described by quantum mechanics, how does an objective classical world arise? At the heart of this question is a tension between the microscopic quantum realm, in which systems happily exist in states of super-imposed possibility, and the macroscopic world of “classical” systems (such as the pointer needle of a read-out gauge), which are only ever observed in definite objective states. Several mechanisms and formalisms have been proposed which intend to provide a bridge between the quantum and classical realms, including the formal limit of $\hbar \to 0$ [13], saddle point approximations to the path integral [14], and the process of environment-induced decoherence [1, 15].

Quantum Darwinism identifies a key prerequisite for such a bridge to arise: there must be a mechanism by which some aspect of a quantum system can be spread out to many parts of its environment. Particularly, since the no-cloning theorem [16] forbids the copying of quantum information, this means some classical information from the system must be copied into its environment in such a way that, given long enough (and enough of the environment), this information can be learned through enough measurements on the environment.

Here we ask: What are the essential features of quantum theory that enable this spreading of classical information in the first place? Certainly, this is possible in Quantum Theory’s rich mathematical structure of complex Hilbert spaces, but can we identify a selective subset of more physically-motivated principles that similarly enable this Darwinistic emergence of classical reality? To approach this, we adopt the minimal-assumptions framework of generalized probabilistic theories (GPTs) [17, 18]. These encompass a wide class of operational scenarios, in which a physical system is entirely characterized by its experimental statistics resulting from preparation and subsequent measurement procedures. The GPT approach has thus far enjoyed particular success in identifying which operational features are necessary or sufficient for quantum phenomena like teleportation [19], no-cloning [20], entanglement [18], phase and interference [21, 22], or decoherence [23]. With this article, we aim to extend this canon to include Quantum Darwinism.

We begin by recalling the essential features of Quantum Darwinism (section II A), and providing a brief overview of the GPT framework (section II B). We then proceed to the results of the article: an operational formulation of Quantum Darwinism (section III A), followed by necessary (section III B) and sufficient (section III C) conditions for such a process to exist. Particularly, we show that both entangled states and entangled measurements are necessary features in any non-classical theory that exhibits Darwinism, suggesting the counterintuitive general principle that in the presence of local non-classicality, a classical world can only emerge if this non-classicality can be “amplified” to a form of entanglement. We then identify how other physically-motivated features, such as the no-restriction hypothesis [24, 25] and strong symmetry [26], or the existence of decoherence [23], are sufficient to imply the presence of Darwinism. Finally (section III D), we give a concrete example of a non-classical theory other than quantum theory that admits Darwinism: we show its existence in Spekkens’ Toy Model [27] and its convex extensions [21, 25].

∗ These authors contributed equally to this work.

II. BACKGROUND

A. Quantum Darwinism

The typical setting of Quantum Darwinism (QD) [1–3] consists of a central system $S$ interacting with a multipartite environment $E_1, \ldots, E_N$. This is similar to the setting in which decoherence is studied (e.g. [15]), but rather than focusing on the change in $S$’s state, QD is concerned with the information that fragments of the environment can learn about $S$.

Not everything about $S$ can be spread to the environment: for instance, sharing arbitrary quantum information would violate the no-cloning principle [16]. Nonetheless, something can still be learned about $S$, perhaps because the interaction induces certain quantum states on system and environment such that measurements made on $S$ and $\{E_i\}$ in the right choice of basis yield correlated outcomes. This interaction must also preserve some aspect of the initial state of $S$, so that what the environment learns can be considered as being about $S$.

In the ideal scenario, we would like to extract as much classical information from any $E_i$ about $S$ as we could from $S$ directly. Holevo’s theorem [28] tells us that the most information that can possibly be shared with each environmental system is upper-bounded by that directly obtainable from a single measurement on $S$. This can be realized as follows, when $S$ and all of the $E_i$ ($i = 1, \ldots, N$) are $d$-dimensional quantum systems. Let $M := \{|0\rangle, \ldots, |d-1\rangle\}$ be some orthonormal basis. Suppose $S$ is initially in a pure state $|\psi\rangle_S = \sum_k \alpha_k |k\rangle$, and each environmental system starts in a pure basis state $|j_i\rangle_i \in M$. Consider the following fan-out gate (a generalization of control-NOT / control-shift gates, see fig. 1):

$$\mathrm{FAN}\left( |k\rangle_S \otimes \bigotimes_{i=1}^{N} |j_i\rangle \right) := |k\rangle_S \otimes \bigotimes_{i=1}^{N} |j_i \oplus k\rangle, \tag{1}$$

such that

$$\mathrm{FAN}\left( |\psi\rangle_S \otimes \bigotimes_{i=1}^{N} |j_i\rangle \right) = \sum_k \alpha_k \, \mathrm{FAN}\left( |k\rangle_S \otimes \bigotimes_{i=1}^{N} |j_i\rangle \right) = \sum_k \alpha_k |k\rangle_S \otimes \bigotimes_{i=1}^{N} |j_i \oplus k\rangle, \tag{2}$$

where $j_i \oplus k$ indicates addition modulo $d$.

Figure 1. Ideal Quantum Darwinism: fan-out gate. The fan-out gate (eq. (2)) is realized for the case $N = 3$, $d = 2$ by three consecutive CNOT gates. After this process, the statistics of the computational-basis measurement $Z$ on all of the environmental subsystems ($E_1$, $E_2$ and $E_3$) agree with those of the main system ($S$). Meanwhile, the statistics of this measurement on the system are the same as if it had been directly made on $|\psi\rangle$. As such, the classical information about $Z$ in $S$ has been spread to its environment.

It is clear that fan-out realizes the above ideals, perfectly broadcasting classical information about $M$ on $S$ to every environment system, while preserving the outcome probabilities of $M$ on $S$. First, if $|\psi\rangle_S \in M$ (as in eq. (1)), then it remains unchanged after the interaction; that is, $M$ is the pointer basis selected by FAN. Moreover, if $|\psi\rangle_S \in M$, this element can be perfectly identified by simply measuring any of the $E_i$ with $M$ and applying the appropriate relabelling (subtraction of $j_i$ modulo $d$), as a consequence of the so-called einselection process [1]. Furthermore, when $|\psi\rangle_S$ is a superposition of multiple states in $M$, the resulting entangled state now has the property that whatever the outcome of $M$ on $S$, the same outcome will be obtained by measuring $M$ on $E_i$ (again, via subtraction of $j_i$ modulo $d$). Finally, the statistics of measuring $M$ on $S$ before and after the fan-out are identical. Thus, such a fan-out implements an ideal Darwinism process.

In this idealized setting, any state in $M$ represents a valid initial state of an environment $E_i$ for which FAN can register information about $S$’s pointer basis. This multiplicity of “good registers” makes the process more robust to modifications in the initial state of the environment subsystems, generically reducing ‘misalignment’, in the language of Zwolak et al. [11]. In addition, this type of interaction aligns with physically-motivated models of Darwinism [1, 8, 10, 12, 29–31].

QD can also encompass more complicated scenarios [3, 8–12], where only partial information is spread (typically quantified through mutual-information quantities, though the efficacy of this is a matter of debate [6, 7]). For instance, pointer states may not be perfectly robust to interaction, the information may not be perfectly registered in the environment [8, 10, 12], or a more general class of measurement than projection onto the pointer basis may be used [4, 5]. However, for the purpose of this article (and preempting the need to cast the scenario in the operational language of GPTs), we restrict our discussion here to the idealized case described above.

B. The GPT framework

Generalized probabilistic theories (GPTs) are a minimal-assumptions framework in which a physical theory is specified by the statistics of every experiment that could be conducted within it. The fundamental elements of a GPT correspond to laboratory operations, such as state preparations, and measurement outcomes. In addition to the aforementioned isolation of quantum features [18–23], this broad operational approach makes the GPT framework well-suited for attempts to reconstruct quantum theory either from experimental data [32] or from sets of reasonable physically-motivated axioms [17, 26, 33, 34]. Theories such as quantum theory (QT) and classical probability theory (CPT) are GPTs, but the framework also admits more exotic theories such as “boxworld” [18] or higher-dimensional Bloch ball state spaces [35].

In this section, we briefly review the aspects of the GPT framework that are relevant for our discussion. Readers who are familiar with the GPT framework may wish to skip to the summary of assumptions at the end of the section. For more detailed and pedagogical introductions to the GPT framework, see e.g. [17, 18, 36].

1. Single Systems

The primitive elements of a GPT are the states that one can prepare, and the outcomes of measurements (known as effects) that one can make on a given physical system. Mathematically, states (not necessarily normalized) are given by the elements of a closed subset $A_+$ of some finite-dimensional real vector space $A$. With a slight abuse of notation, the physical system will also be denoted $A$. This subset $A_+$ is assumed to be a cone, meaning that $\varphi, \omega \in A_+$ and $\lambda \geq 0$ imply that $\lambda\varphi \in A_+$ and $\varphi + \omega \in A_+$. Furthermore, $A_+$ is assumed to be generating, i.e. $\mathrm{span}(A_+) = A$, and pointed, i.e. $A_+ \cap (-A_+) = \{0\}$. (For the example of QT, this is the cone of positive semidefinite matrices, see example 1 below.)

Effects correspond to elements in a generating cone $E_A \subseteq A^*$, where $A^*$ is $A$’s dual space of linear functionals. The probability of observing effect $e \in E_A$ given a preparation $\omega \in A_+$ is given by $e(\omega)$. Since this must be non-negative, we must have $E_A \subseteq A^*_+$, where $A^*_+ := \{ e \in A^* \mid e(\omega) \geq 0 \text{ for all } \omega \in A_+ \}$ is the dual cone of $A_+$ [37]. We assume the existence of a distinguished unit effect $u_A \in E_A$ such that for all $a \in E_A$ there is some $\lambda > 0$ with $a \leq \lambda u_A$ (where $a \leq b$ if and only if there exists some $c \in E_A$ such that $a + c = b$). The measurements of a theory correspond to collections of effects $\{e_i\}_{i=1,\ldots,N}$ that sum to $u_A$; each constituent effect corresponds to one mutually exclusive outcome. Since $\sum_i e_i(\omega) = u_A(\omega)$, we can interpret $u_A(\omega)$ as the normalization of the state $\omega$, that is, the total probability to obtain any outcome if the measurement is performed on the corresponding physical system. We say an effect is valid if it can be part of a measurement (i.e. $e \in E_A$ and $e \leq u_A$).

If $E_A = A^*_+$, we say that the system is unrestricted, or that it satisfies the no-restriction hypothesis [24, 25]. From $u_A$ and $A_+$, one can infer its compact convex set of normalized states $\Omega_A := \{ \omega \in A_+ \mid u_A(\omega) = 1 \} \subset A_+$. An example is sketched in fig. 2.

Figure 2. Geometric picture of a GPT. An example state space $A_+$ (LHS) and effect space $E_A$ (RHS) of a GPT with $A = \mathbb{R}^3$ is drawn. On the RHS, the unit effect $u_A$ is labeled, and all effects on or within the shaded octahedron are valid in that $e \leq u_A$. Two pure effects $\{e_1, e_2\}$ that satisfy $e_1 + e_2 = u_A$ and hence form a refined measurement are labelled. On the LHS, the convex set of normalized states $u_A(\omega) = 1$ is shaded as $\Omega_A$. Within it, a maximal frame of two pure states $\{\omega_1, \omega_2\}$ is labelled.

The convexity of $A_+$ and $E_A$ amounts to the assumption that statistical fluctuations can always be introduced into an experiment. Consider measurement outcome $e \in E_A$ on one of two preparations $\omega_1$ or $\omega_2$, with respective statistics $e(\omega_1)$ and $e(\omega_2)$. If $\omega_1$ is prepared with probability $p$ and $\omega_2$ otherwise, then this preparation procedure should be representable by the single state $\omega$ whose statistics satisfy: $e(\omega) = p\,e(\omega_1) + (1-p)\,e(\omega_2) = e(p\,\omega_1 + (1-p)\,\omega_2)$. It then follows that $\omega = p\,\omega_1 + (1-p)\,\omega_2$. A similar interpretation of convex combinations applies to the effects.

An effect $e \in E_A$ is said to be pure [33, 38] if $e = \sum_i f_i$, with $f_i \in E_A$, implies $f_i \propto e$ for all $i$ (see also fig. 2). Pure effects cannot be obtained from (non-trivial) coarse-graining of other effects. A collection of pure effects that sum to $u_A$ with no effects proportional to any other in the set is known as a refined measurement. A pure state is defined to be a normalized state that is extremal in $\Omega_A$, i.e. that cannot be written as a non-trivial convex combination of other normalized states. A frame is a collection of pure states $\{\omega_j\}$ that can be perfectly distinguished in a single measurement: i.e. there is at least one measurement $\{e_i\}$ such that $e_i(\omega_j) = \delta_{ij}$. A maximal frame is a frame with the largest number of distinguishable states for that system.

Dynamics in GPTs are described by linear maps $T : A \to A$ known as transformations. Transformations $T$ must map states to states, i.e. $T(A_+) \subseteq A_+$, and effects to effects, in the sense that if $e \in E_A$ is a valid effect, then $e \circ T$ must also be a valid effect. (The latter corresponds to an outcome where transformation $T$ has been applied before the measurement.) Motivated by the intuition to consider only closed-system dynamics in which all environments are explicitly modelled, we will in the following restrict our attention to reversible transformations. These are transformations $T$ that are invertible as a linear map and whose inverse $T^{-1}$ is also a transformation. Since transformations can be composed, it follows that the reversible transformations of any GPT system $A$ form a group $\mathcal{T}_A$. Furthermore, they map the set $\Omega_A$ of normalized states onto itself.

In summary, a GPT system $A$ is defined by a tuple $(A, A_+, E_A, u_A, \mathcal{T}_A)$ of a real vector space, the state and effect cones, the unit effect, and the group of reversible transformations. Let us illustrate this framework with two familiar examples:

Example 1 (Quantum theory (QT)). An $n$-level quantum system corresponds to the GPT system $(A^{(n)}, A^{(n)}_+, E^{(n)}_A, u^{(n)}_A, \mathcal{T}^{(n)}_A)$ with

$$A^{(n)} = H_n(\mathbb{C}), \quad A^{(n)}_+ = H^+_n(\mathbb{C}) \simeq E^{(n)}_A, \quad u^{(n)}_A = \mathbb{1}_n, \quad \mathcal{T}^{(n)}_A = \{ \rho \mapsto U \rho U^\dagger \mid U^\dagger U = \mathbb{1}_n \},$$

where $H_n(\mathbb{C})$ is the real vector space of $n \times n$ complex Hermitian matrices, and $H^+_n(\mathbb{C})$ the subset of positive semidefinite matrices. Via the Hilbert-Schmidt inner product, $\langle X, Y \rangle := \mathrm{tr}(XY)$, we can identify $A^{(n)}$ with its dual space such that the effects are also Hermitian matrices. For example, $u^{(n)}_A(\rho) = \mathrm{tr}(\rho)$ can be written $\langle \mathbb{1}_n, \rho \rangle$, hence we can identify $u^{(n)}_A = \mathbb{1}_n$.

The measurements $\{E_i\}_{i=1,\ldots,N}$ thus correspond to POVMs (positive operator-valued measures), i.e. $E_i \geq 0$ and $\sum_i E_i = \mathbb{1}_n$. The normalized states $\Omega^{(n)}_A$ are the (unit-trace) density matrices, and the maximal frames correspond to the various $n$-element orthonormal bases of the Hilbert space $\mathbb{C}^n$.
The reversible transformations are the unitary conjugations. Pure effects correspond to rank-1 POVM elements.
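As a small numerical illustration of example 1, the following sketch treats a qubit ($n = 2$) as a GPT system: effects act on states through the Hilbert-Schmidt pairing $\mathrm{tr}(E\rho)$, and a reversible transformation is a unitary conjugation. The helper names (`matmul`, `dagger`, `prob`, `conjugate_by`) and the choice of the state $|+\rangle\langle +|$ and the Hadamard unitary are assumptions made here for illustration; they do not appear in the text above.

```python
# Sketch: a qubit (n = 2) viewed as a GPT system, using plain nested lists.
# Helper names and the example state/unitary are illustrative assumptions.

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(U):
    """Conjugate transpose of a square matrix."""
    n = len(U)
    return [[complex(U[j][i]).conjugate() for j in range(n)] for i in range(n)]

def prob(effect, state):
    """Hilbert-Schmidt pairing <E, rho> = tr(E rho); real for Hermitian inputs."""
    product = matmul(effect, state)
    return complex(sum(product[i][i] for i in range(len(product)))).real

def conjugate_by(U, rho):
    """Reversible transformation rho -> U rho U^dagger."""
    return matmul(matmul(U, rho), dagger(U))

UNIT = [[1, 0], [0, 1]]            # unit effect, identified with the identity
E0 = [[1, 0], [0, 0]]              # rank-1 effect |0><0|
E1 = [[0, 0], [0, 1]]              # rank-1 effect |1><1|
PLUS = [[0.5, 0.5], [0.5, 0.5]]    # pure state |+><+|
s = 2 ** -0.5
HADAMARD = [[s, s], [s, -s]]       # a unitary matrix

p0, p1 = prob(E0, PLUS), prob(E1, PLUS)   # statistics of the measurement {E0, E1}
norm = prob(UNIT, PLUS)                   # normalization u(rho) = tr(rho)
rotated = conjugate_by(HADAMARD, PLUS)    # Hadamard maps |+><+| to |0><0|
p0_rotated = prob(E0, rotated)
```

The two effects `E0` and `E1` sum to the unit effect, so they form a measurement in the sense of section II B 1, and the unitary conjugation leaves the normalization unchanged.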

Example 2 (Classical probability theory (CPT)). A classical random variable that can take $n$ different values corresponds to the GPT system $(B^{(n)}, B^{(n)}_+, E^{(n)}_B, u^{(n)}_B, \mathcal{T}^{(n)}_B)$ with

$$B^{(n)} = \mathbb{R}^n, \quad B^{(n)}_+ = \{ x \in \mathbb{R}^n \mid \text{all } x_i \geq 0 \} \simeq E^{(n)}_B, \quad u^{(n)}_B = (1, 1, \ldots, 1)^T, \quad \mathcal{T}^{(n)}_B \simeq S_n.$$

In this notation, we have identified $\mathbb{R}^n$ with its dual space via the usual dot product $x \cdot y = \sum_i x_i y_i$. The unit effect is thus $u^{(n)}_B \cdot p = \sum_{i=1}^{n} p_i$, and so $\Omega^{(n)}_B$ is the simplex of $n$-dimensional probability vectors, i.e. $\Omega^{(n)}_B = \{ p \in \mathbb{R}^n \mid p_i \geq 0, \sum_i p_i = 1 \}$. The reversible transformations are the permutations of the entries: $p_i \mapsto p_{\pi(i)}$, with $\pi$ some permutation of $\{1, 2, \ldots, n\}$. Thus, the group of reversible transformations is a representation of the permutation group $S_n$.

A crucial signature of classicality is that CPT has (up to relabelling) only a single refined measurement $\{e_i\}$. Its effects are $e_i(p) = p_i$, and it can be interpreted as asking which of the $n$ possible configurations is actually the case. It distinguishes the (up to relabelling) unique maximal frame $\{\omega_j\}$, where $\omega_j := (0, \ldots, 0, \underbrace{1}_{j}, 0, \ldots, 0)^T$.

Both QT and CPT are unrestricted and self-dual [39], i.e. there is some inner product according to which $A_+ = E_A$. Note that GPTs will in general satisfy neither of these two properties.
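The classical case of example 2 is simple enough to check directly. The sketch below, for $n = 3$, builds the maximal frame and the refined measurement as the same set of unit vectors (using self-duality), verifies $e_i(\omega_j) = \delta_{ij}$, and confirms that every permutation preserves normalization. The variable names and the sample distribution `p` are assumptions chosen here for illustration, and indices run from 0 rather than 1.

```python
from fractions import Fraction
from itertools import permutations

n = 3
unit = [1] * n                                   # unit effect (1, ..., 1)

# Maximal frame: the n deterministic distributions (indices start at 0 here).
frame = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
effects = frame                                  # self-duality: e_i(p) = p_i

def pair(e, p):
    """Dot product e . p, i.e. the probability of effect e on state p."""
    return sum(a * b for a, b in zip(e, p))

def permute(p, pi):
    """Reversible transformation p_i -> p_{pi(i)}."""
    return [p[pi[i]] for i in range(len(p))]

# An arbitrary normalized state, chosen only for this example.
p = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]

distinguish = [[pair(e, w) for w in frame] for e in effects]   # e_i(omega_j)
outcome_probs = [pair(e, p) for e in effects]                  # recovers p itself
group = list(permutations(range(n)))                           # all of S_n
all_normalized = all(pair(unit, permute(p, pi)) == 1 for pi in group)
```

Exact rational arithmetic (`Fraction`) is used so that the normalization checks are not affected by floating-point rounding.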

2. Maximal classical information (MCI) frames

Our goal is to generalize ideal Quantum Darwinism (in particular, the mechanism for perfect spreading of classical information via fan-out gates) to GPTs. As a first step, we have to identify the analogue of the pointer states and the measurements that read out their encoded classical information. We will focus on Darwinism generalizations that allow one to extract the maximal amount of classical information. In the quantum case, such classical information is encoded onto an orthonormal basis $\{|j\rangle\}$. The natural analogue of this in a GPT is a maximal frame $\{\omega_j\}$.

Let us consider the measurements that could extract this classical information. As seen in example 1, QT enjoys a strong form of duality that allows one to treat the pure states $\omega_j = |j\rangle\langle j|$ and the corresponding rank-1 projective measurements $e_j(\bullet) = \mathrm{Tr}[\,|j\rangle\langle j|\, \bullet\,]$ as the “same” objects, and it is exactly this dual set of rank-1 projectors that form the measurement that extracts the maximal amount of information out of the system.

In general, GPTs do not have such an automatic duality between states and effects. Moreover, measurements that distinguish the elements of a maximal frame do not even need to be refined. However, since we are interested in the idealized case, where one spreads the maximal classical information contained in some system, we will here focus on maximal frames that can be distinguished by a refined measurement:

Definition 1 (Maximal classical information in GPTs). A maximal frame $\omega_1, \ldots, \omega_n$ is called a maximal classical information frame (MCI-frame) if there is a refined measurement $\{e_i\} \subset E_A$ which discriminates the states $\omega_j$, i.e. $e_j(\omega_k) = \delta_{jk}$.

Many GPTs contain MCI-frames: quantum theory certainly does (in the form of orthonormal bases), and so do quantum theory over the real numbers and over the quaternions, and $d$-ball state spaces.
As expected, classical theories in all dimensions also have MCI-frames. Furthermore, so-called “dichotomic” systems as defined in Ref. [40] contain MCI-frames, which includes unrestricted systems whose sets of normalized states are regular polygons with an even number of vertices, or a $d$-cube or $d$-octoplex for $d \geq 2$.

In appendix A, we give an example of a state space that does not have an MCI-frame: the pentagon. This example illustrates the counterintuitive properties of such systems: the pentagon has at most two perfectly distinguishable states, but one can in some sense encode more than one bit of information into such a system [41]. That is, any classical bit that sits inside this state space does not represent the maximal amount of information that can be encoded into the system. For the remainder of this work, we will thus exclude such systems and focus on state spaces that contain MCI-frames.
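To make definition 1 concrete, the following sketch checks the distinguishing condition $e_j(\omega_k) = \delta_{jk}$ for the simplest even polygon, the unrestricted square state space. The coordinates are an assumption of this sketch: normalized states are taken as points $(x, y, 1)$ with $|x|, |y| \leq 1$, and the four extremal effects as $(\pm\tfrac{1}{2}, 0, \tfrac{1}{2})$ and $(0, \pm\tfrac{1}{2}, \tfrac{1}{2})$. That these four effects are pure (they generate the extremal rays of the dual cone) is taken as given here rather than verified by the code.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Assumed coordinates: normalized states are (x, y, 1) with |x|, |y| <= 1.
unit = (0, 0, 1)
vertices = [(1, 1, 1), (1, -1, 1), (-1, -1, 1), (-1, 1, 1)]   # pure states
effects = {
    "x+": (HALF, 0, HALF), "x-": (-HALF, 0, HALF),
    "y+": (0, HALF, HALF), "y-": (0, -HALF, HALF),
}

def pair(e, w):
    """Probability e(w) as a dot product."""
    return sum(a * b for a, b in zip(e, w))

# Every listed effect gives a probability in [0, 1] on every pure state.
valid = all(0 <= pair(e, v) <= 1 for e in effects.values() for v in vertices)

# Candidate refined measurement and candidate frame.
measurement = [effects["x+"], effects["x-"]]
sums_to_unit = tuple(sum(c) for c in zip(*measurement)) == unit
frame = [(1, 1, 1), (-1, -1, 1)]

# Distinguishing table e_j(omega_k); the identity matrix means delta_jk.
table = [[pair(e, w) for w in frame] for e in measurement]
is_mci_candidate = valid and sums_to_unit and table == [[1, 0], [0, 1]]
```

The code only confirms the distinguishing condition and that the two effects sum to the unit effect; maximality of the two-state frame for the square follows from the statement above about even polygons.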

3. Composite systems

Darwinism is inherently linked to composition of subsystems; therefore, we need to understand how to treat composition in GPTs. There are several approaches to this [24, 42], including category-theoretic formulations [43]. Here, we will motivate and state a list of minimal assumptions on a state space $AB$, composed of two state spaces $A$ and $B$, that allows us to formulate a generalization of Darwinism. For the case of more than two subsystems, we assume that the joint state satisfies all desiderata on all pairs of subsystems.

First, we demand that the combined state space $AB$ has a notion of independent parallel preparation. This means that given some state $\varphi^A$ on $A$ and some state $\omega^B$ on $B$, there should be a state of $AB$ (denoted $\varphi^A \odot \omega^B$) that represents the state obtained by the independent local preparation of the two states on $A$ and on $B$. Since statistical mixtures of local preparations must lead to statistical mixtures of the corresponding global state, the map $\odot$ must be bilinear.

As pure states can be interpreted as states of maximal knowledge, we assume that independent parallel preparations of pure states lead to global pure states, i.e. we demand that if $\varphi^A$ and $\omega^B$ are pure then so is $\varphi^A \odot \omega^B$ [38]. Likewise, there should exist a notion of parallel implementation of measurements on the systems. For this we require another bilinear function (also denoted by $\odot$) that maps effects $e^A \in E_A$, $f^B \in E_B$ to effects $e^A \odot f^B \in E_{AB}$. Furthermore, if one performs a parallel implementation of two local measurements on a composite state whose parts were prepared independently in parallel, then the probabilities should factorize in the sense that $(e^A_j \odot f^B_k)(\varphi^A \odot \omega^B) = e^A_j(\varphi^A)\, f^B_k(\omega^B)$. In other words, independent local procedures lead to statistical independence.
The bilinearity of $\odot$ ensures the validity of the no-signalling principle: the choice of local measurement on $B$ does not affect the outcome probabilities of local measurements in $A$ (and vice-versa). Indeed, $\sum_j e^A_i \odot e^B_j = e^A_i \odot u_B$, for all effects $e^A_i$ and any measurement $\{e^B_j\}$.

Similarly as for states, we assume that the composition of pure effects results in a pure effect.

Finally, we must ensure that the global structure is consistent with the local structure. Consider a valid composite effect $e^{AB} \in E_{AB}$ and a normalized state $\omega^B \in \Omega_B$. Then the effect $\tilde{e}^A$ defined by $\tilde{e}^A(\varphi^A) := e^{AB}(\varphi^A \odot \omega^B)$ should be a valid effect on $A$, i.e. $\tilde{e}^A \in E_A$ and $\tilde{e}^A \leq u_A$: it can be implemented by preparing $\omega^B$ on $B$ and then measuring $e^{AB}$ on $AB$.

Similarly, consider a global state $\omega^{AB} \in \Omega_{AB}$ shared between two parties $A$ and $B$. Imagine that one of the parties, say $B$, implements a local measurement $\{f^B_k\}_k \subset E_B$ and tells the other party the outcome $k$. Then agent $A$ holds a conditional state, which should be a (subnormalized) element of the state space of $A$. More specifically, for an effect $e^A \in E_A$, the probability for both $f^B_k$ and $e^A$ to be obtained is given by $(e^A \odot f^B_k)(\omega^{AB})$. This implicitly defines a subnormalized state $\tilde{\omega}^A$ on $A$ via $e^A(\tilde{\omega}^A) = (e^A \odot f^B_k)(\omega^{AB})$, which must thus be an element of $A_+$. In the special case of the trivial measurement $f^B_k = u_B$, the state $\tilde{\omega}^A$ becomes the reduced state on $A$. A similar condition should hold if the roles of $A$ and $B$ are interchanged.

Together, we will call these assumptions the minimal assumptions on composition.

Definition 2 (Minimal assumptions on composition). A composition of GPT systems $A$ and $B$ is a GPT system $AB$ together with two bilinear maps $A \times B \to AB$ and $A^* \times B^* \to (AB)^*$, both denoted by $\odot$, satisfying the following:

i. All product states are allowed and normalized: if $\omega^A \in \Omega_A$ and $\omega^B \in \Omega_B$ then $\omega^A \odot \omega^B \in \Omega_{AB}$.
ii. All products of valid effects are valid effects: if $e^A \in E_A$ and $e^B \in E_B$ then $e^A \odot e^B \in E_{AB}$. In particular, local measurements cannot lead to probabilities larger than 1: $u_A \odot u_B \leq u_{AB}$.
iii. Local measurements on product states yield statistically independent outcomes: $(e^A \odot f^B)(\omega^A \odot \omega^B) = e^A(\omega^A)\, f^B(\omega^B)$.
iv. Products of pure states (effects) are pure states (effects).
v. Conditional effects: for all effects $e^{AB} \in E_{AB}$ and all normalized states $\varphi^A \in \Omega_A$ and $\omega^B \in \Omega_B$, also $e^{AB}(\varphi^A \odot \bullet) \in E_B$ and $e^{AB}(\bullet \odot \omega^B) \in E_A$ are effects.
vi. Conditional states: for all states $\omega^{AB} \in \Omega_{AB}$ and all effects $e^A \in E_A$, $f^B \in E_B$, the vectors $\tilde{\omega}^A$, $\tilde{\omega}^B$ which are implicitly defined via
$$\tilde{e}^A(\tilde{\omega}^A) = (\tilde{e}^A \odot f^B)(\omega^{AB}), \qquad \tilde{f}^B(\tilde{\omega}^B) = (e^A \odot \tilde{f}^B)(\omega^{AB})$$
must be states, i.e. $\tilde{\omega}^A \in A_+$, $\tilde{\omega}^B \in B_+$.

While these assumptions imply the no-signalling principle, we do not demand the popular principle of “tomographic locality” [17], i.e. that the $\omega^A \odot \omega^B$ span all of $AB$. Thus, the $\odot$ operation cannot in general be identified mathematically with the tensor product operation. The above minimal assumptions are also compatible, for example, with QT over the real numbers [17].

As we know from QT, a striking feature of composite systems in non-classical theories is entanglement. Having a definition of composite systems at hand, we are in a position to define entangled states and effects in GPTs [19]:

Definition 3 (Entangled states). Consider a composite system $A = A_1 A_2 \ldots A_N$. States $\omega^A \in \Omega_A$ which can be written as

$$\omega^A = \sum_i p_i \, \omega^{A_1}_i \odot \omega^{A_2}_i \odot \ldots \odot \omega^{A_N}_i \tag{3}$$

with $\omega^{A_j}_i \in \Omega_{A_j}$ and $\{p_i\}$ a probability distribution, are called separable. States which cannot be written in this form are called entangled.

Definition 4 (Entangled effects). Effects $e^A \in E_A$ which can be written as

$$e^A = \sum_i e^{A_1}_i \odot e^{A_2}_i \odot \ldots \odot e^{A_N}_i \tag{4}$$

with $e^{A_j}_i \in E_{A_j}$ are called separable. Effects which cannot be written in this form are called entangled.

A pure effect is separable if and only if it is a product of pure effects (see, e.g. appendix B).
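Several items of definition 2 can be checked by hand in the classical case, where it is safe to identify $\odot$ with the ordinary tensor (Kronecker) product; as noted above, this identification is not available in general, so the sketch below is an illustration under that assumption only. It verifies factorization on product states (item iii), the no-signalling identity $\sum_j e^A_i \odot e^B_j = e^A_i \odot u_B$, and computes a conditional and a reduced state (item vi) for a separable, correlated state. All names and the sample states are hypothetical choices for this example.

```python
from fractions import Fraction as F

def kron(x, y):
    """Kronecker product of two vectors: the classical instance of the product map."""
    return [a * b for a in x for b in y]

def pair(e, w):
    """Probability e(w) as a dot product."""
    return sum(a * b for a, b in zip(e, w))

# Two classical bits with arbitrary local states (chosen for this example).
omega_A = [F(1, 4), F(3, 4)]
omega_B = [F(2, 3), F(1, 3)]
effects_A = [[1, 0], [0, 1]]      # refined measurement on A
effects_B = [[1, 0], [0, 1]]      # refined measurement on B
u_B = [1, 1]                      # unit effect on B

# Item iii: local measurements on product states factorize.
product_state = kron(omega_A, omega_B)
factorizes = all(
    pair(kron(e, f), product_state) == pair(e, omega_A) * pair(f, omega_B)
    for e in effects_A for f in effects_B
)

# No-signalling: summing over B's outcomes gives e_A (x) u_B.
summed = [sum(col) for col in zip(*(kron(effects_A[0], f) for f in effects_B))]
no_signalling = summed == kron(effects_A[0], u_B)

# A separable but correlated state: equal mixture of (0,0) and (1,1).
correlated = [F(1, 2), 0, 0, F(1, 2)]

def conditional_on_A(joint, f, dim_A=2, dim_B=2):
    """Subnormalized state on A after effect f is observed on B."""
    return [sum(joint[a * dim_B + b] * f[b] for b in range(dim_B))
            for a in range(dim_A)]

conditional = conditional_on_A(correlated, effects_B[0])   # outcome 0 seen on B
reduced = conditional_on_A(correlated, u_B)                # trivial measurement on B
```

The conditional state has normalization $\tfrac{1}{2}$, the probability of the observed outcome on $B$, while the trivial measurement $u_B$ returns the normalized reduced state.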

Summary of assumptions.

We consider theories that satisfy:

• For a single system $(A, A_+, E_A, u_A, \mathcal{T}_A)$: $A$ is finite-dimensional. We do not assume the no-restriction hypothesis.
• For pairs of systems: composition satisfies all conditions of definition 2. In particular, products of pure states (or effects) are pure, but we do not assume tomographic locality.
• For three or more systems: composition satisfies all conditions of definition 2 on all subsystem pairs. For example, a quadripartite system $ABCD$ is a valid composition of $AB$ and of $CD$, with subsystem $ABC$ being a valid composition of $B$ and $AC$, and so forth.

Unless otherwise stated, all introduced states are normalized, and all introduced effects are valid ($e \leq u_A$).

III. RESULTS

A. A definition of Darwinism in GPTs

With all these ingredients we can now ask: is the ideal mechanism for Darwinism present in GPTs other than quantum theory? To answer this, we must first formulate the features of the ideal Darwinism process in an operational way, that is, in terms of experimental statistics.

To this end, recall the scenario of ideal Quantum Darwinism (section II A). The desire is to broadcast some classical information encoded within $S$ to the environment, say, relating to pointer measurement $M := \{|k\rangle\langle k|\}_{k=0,\ldots,d-1}$. Let each environment system begin in an eigenstate of $M$ (for system $i$, labeled $|j_i\rangle$), then after the fan-out operation $T$ (eq. (2)), the outcome probabilities when measuring $M$ on any environment $E_i$ will satisfy

$$P_S(M = k) = |\alpha_k|^2 = P_{E_i}(M = j_i + k) \quad \text{for all } k. \tag{5}$$

Moreover, if one makes the joint measurement $M^{\otimes (N+1)} = \{ |k_0\rangle\langle k_0| \otimes \cdots \otimes |k_N\rangle\langle k_N| \}$ on the entire composite system, the probability of outcome $(k_0, \ldots, k_N)$ is

$$p(k_0, \ldots, k_N) = |\alpha_{k_0}|^2 \, \delta_{k_0, k_1 - j_1} \cdots \delta_{k_0, k_N - j_N}. \tag{6}$$

This is the sense in which objectivity can emerge under Quantum Darwinism: when this mechanism succeeds, all independent observers can learn about the same (maximal) classical information and agree about their findings. Moreover, $P_S(M = k)$ is the same before and after the fan-out is performed.

To generalize Darwinism to the GPT framework, we must capture the same operational behaviour on the level of probabilities. First, we need an analogue of pointer states: a set of distinguishable states corresponding to the classical information to be broadcast. As mentioned in section II B 2, this role is played by an MCI-frame $\{\omega^{(0)}_j\}_{j=0,\ldots,d-1}$ of $S$ and its corresponding refined measurement $\{e^{(0)}_k\}$ with the distinguishing property $e^{(0)}_k(\omega^{(0)}_j) = \delta_{jk}$. Again, we assume the main system is in some pure state $\nu$, that may not be an element of $\{\omega^{(0)}_j\}_{j=0,\ldots,d-1}$. Lacking the mathematical structure of a Hilbert space, we cannot so easily express $\nu$ as a superposition of frame elements. Nonetheless, we may readily recover the outcome probabilities when $\nu$ is measured by $M := \{e^{(0)}_k\}$:

$$P_S(M = k) = e^{(0)}_k(\nu). \tag{7}$$

In the special case when $\nu$ is a member of the MCI-frame, $\nu = \omega^{(0)}_j$, we have $P_S(M = k) = \delta_{jk}$.

To carry the $d$ outcomes of the MCI-frame measurement spread from system $S$, we assume that each environment system (labeled by $i \in \{1, \ldots, N\}$) contains an MCI-frame $\{\omega^{(i)}_j\}_{j=0,\ldots,d-1}$, distinguished by some refined measurement $\{e^{(i)}_k\}$. Like qubits in quantum theory, the $E_i$ are not necessarily standalone systems like single particles, but they can correspond to effective subsystems of larger environmental systems, picked out by the specific form of the interaction with $S$. Let us briefly consider the simplest case with just a single environment, initially in the first state $\omega^{(1)}_0$ of the frame $\{\omega^{(1)}_j\}$. Then, to exhibit the same operational behaviour as eq. (6) (via eq. (7)), the joint probability of any pair of outcomes $j_0$ and $j_1$ on $S$ and $E_1$ should satisfy

$$\left( e^{(0)}_{j_0} \odot e^{(1)}_{j_1} \right) \left[ T\left( \nu \odot \omega^{(1)}_0 \right) \right] = e^{(0)}_{j_0}(\nu) \, \delta_{j_0, j_1}. \tag{8}$$

In this way, the distribution $\{e^{(0)}_j(\nu)\}_j$ is broadcast to the environment, as in eq. (5). Crucially, eq. (8) implies that the system and environment will agree on the outcome of $M$ on $S$. Moreover, the probabilities of such an outcome when directly measuring $S$ are not affected by the transformation $T$, making $T$ a member of the phase group [21] of this pointer measurement. This can be seen by summing eq. (8) over $j_1$.

The same operational desiderata easily extend to the more general case of $N$ environmental systems, each now starting in an arbitrary frame state $\omega^{(i)}_{k_i}$. We summarize this with the following definitions:

Definition 5.

A composition of GPT system $S$ and environments $E_1, \ldots, E_N$ is said to admit an ideal Darwinism process if

(a) $S$ has a $d$-state MCI-frame $\{\omega^{(0)}_k\}$, discriminated by a refined measurement $\{e^{(0)}_j\}$, and

(b) each $E_i$ has a $d$-state MCI-frame $\{\omega^{(i)}_j\}$, discriminated by a refined measurement $\{e^{(i)}_j\}$, such that

(c) there exists a reversible ("fan-out") transformation $T \in \mathcal{T}_{SE_1 \ldots E_N}$ that satisfies

$$\left(e^{(0)}_{j_0} \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right)\left[T\left(\nu \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}\right)\right] = \delta_{j_1, j_0 + k_1} \cdots \delta_{j_N, j_0 + k_N}\, e^{(0)}_{j_0}(\nu) \tag{9}$$

for all $k_1, \ldots, k_N$, $j_0, j_1, \ldots, j_N$ and all $\nu \in \Omega_S$, where addition is modulo $d$.

Definition 6.

If, for a collection of MCI-frames $\{\omega^{(i)}_j\}$ that satisfy items (a) and (b) of definition 5, a reversible transformation $T \in \mathcal{T}_{SE_1 \ldots E_N}$ satisfies

$$T\left(\omega^{(0)}_{j_0} \odot \omega^{(1)}_{j_1} \odot \cdots \odot \omega^{(N)}_{j_N}\right) = \omega^{(0)}_{j_0} \odot \omega^{(1)}_{j_0 + j_1} \odot \cdots \odot \omega^{(N)}_{j_0 + j_N}, \tag{10}$$

then we say that $T$ robustly spreads classical information.

Definition 6 demands that the system and environment behave in some sense like classical information registers: if, for example, $j_1 = \ldots = j_N = 0$, the transformation $T$ copies the classical information in $S$ to the environments, directly on the level of states. In quantum theory, such robust spreading of classical information is sufficient for Darwinism: the pointer basis of $S$ spans the system's Hilbert space, and so eq. (10) implies eq. (9) due to the state vector linearity of unitary maps. More generally, definitions 5 and 6 are equivalent in quantum theory, in the sense that ideal Quantum Darwinism processes are exactly those that robustly spread classical information.

However, this equivalence does not hold for arbitrary GPTs, since eq. (10) will not in general imply eq. (9). Even if definition 5 holds, definition 6 can put additional constraints on both the system and the environment. With respect to the system, one needs to consider the possibility of a $T$ that preserves the statistics of $\{e^{(0)}_j\}$ on $S$, but still changes the state of $S$, even if $S$ is prepared in one of the frame states $\omega^{(0)}_j$. This is impossible in quantum theory, since every rank-1 quantum projector $E^{(0)}_j$ has a unique normalized and pure state $\omega^{(0)}_j$ that satisfies $\mathrm{tr}\left(E^{(0)}_j \omega^{(0)}_j\right) = 1$. However, many GPT systems (such as gbits [18]) violate the analogous operational condition on MCI-frames, which can in some cases be traced back to the fact that GPTs need not obey the usual quantum uncertainty principles [22]. With respect to the environment, definition 6 precludes the possibility that $T$ creates exotic correlations between the $E_i$ while preserving the statistics of the product measurements $e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}$.

Thus, definition 5 captures the essential features for ideal Darwinism on the operational level, while definition 6 further requests classical features from the frame states themselves.

B. Necessary features for Darwinism in GPTs

In QT, the fan-out gate (eq. (2)) can create entanglement whenever the system is not initialized to a pointer state. The first main results of this paper show that entanglement creation is a necessary property of any generalized ideal Darwinism process. We begin by showing that preventing a Darwinism process from creating entangled states puts a very strong constraint on the theory.
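The quantum statement above can be checked numerically in the smallest case. The following is a minimal sketch, assuming a single qubit environment initialized in $|0\rangle$ and the standard two-qubit CNOT as the fan-out; the function name `system_purity_after_fan_out` is introduced here for illustration only. For a pure global state, a reduced purity below 1 signals entanglement between system and environment.

```python
import numpy as np

# Two-qubit CNOT with control = system S and target = environment E,
# written in the computational basis |s e>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def system_purity_after_fan_out(alpha):
    """Return tr(rho_S^2) after CNOT acts on (alpha_0|0> + alpha_1|1>) (x) |0>."""
    alpha = np.asarray(alpha, dtype=complex)
    alpha = alpha / np.linalg.norm(alpha)       # normalize the system state
    env = np.array([1, 0], dtype=complex)       # environment starts in |0>
    psi_out = CNOT @ np.kron(alpha, env)        # global pure state after fan-out
    amplitudes = psi_out.reshape(2, 2)          # rows: S index, columns: E index
    rho_s = amplitudes @ amplitudes.conj().T    # reduced state of S
    return float(np.real(np.trace(rho_s @ rho_s)))

print(system_purity_after_fan_out([1, 0]))  # pointer state: stays a product state
print(system_purity_after_fan_out([1, 1]))  # superposition: S becomes mixed
```

In this sketch the purity equals $|\alpha_0|^4 + |\alpha_1|^4$, so it is 1 exactly when the system starts in a pointer state, in line with the claim that the gate entangles every other pure input.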

Theorem 1.

Suppose that we have an ideal Darwinism process for which the fan-out transformation $T$ maps separable states to separable states. Then, for every pure state $\nu \in \Omega_S$, we have $e^{(0)}_i(\nu) = 0$ or $e^{(0)}_i(\nu) = 1$ for all $i$. That is, every pure state of the system $S$ leads to deterministic outcomes of the measurement $\{e^{(0)}_i\}_i$.

Remark. This conclusion is valid also for non-ideal Darwinism processes that, instead of definition 5, satisfy the weaker condition

$$\left(e_i \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right) T\left(\nu \odot \omega^{(1)} \odot \cdots \odot \omega^{(N)}\right) = e_i(\nu)\, \delta_{i, j_1} \delta_{i, j_2} \cdots \delta_{i, j_N} \tag{11}$$

for all states $\nu \in \Omega_S$, where $\omega^{(1)}, \ldots, \omega^{(N)}$ is an arbitrary fixed set of pure states, and $\{e_i\}_i$ and the environment measurements $\{e^{(k)}_j\}_j$ ($k = 1, \ldots, N$) are arbitrary fixed measurements (as opposed to MCI-frames and refined measurements).

Proof.

Since $T$ is a reversible transformation, it maps pure states to pure states. Hence, if it also preserves separability, then there are pure states $\varphi^{(0)}, \ldots, \varphi^{(N)}$ (which may all depend on $\nu$) such that

$$T\left(\nu \odot \omega^{(1)} \odot \cdots \odot \omega^{(N)}\right) = \varphi^{(0)} \odot \varphi^{(1)} \odot \cdots \odot \varphi^{(N)}. \tag{12}$$

Since $T$ satisfies eq. (11), we obtain

$$e_i(\nu)\, \delta_{i, j_1} \cdots \delta_{i, j_N} = e_i(\varphi^{(0)})\, e^{(1)}_{j_1}(\varphi^{(1)}) \cdots e^{(N)}_{j_N}(\varphi^{(N)}). \tag{13}$$

Summing over all $j_1, \ldots, j_N$ yields $e_i(\nu) = e_i(\varphi^{(0)})$ for all $i$.

Now suppose that $i^*$ is an outcome label such that $e_{i^*}(\varphi^{(0)}) = 0$; then $e_{i^*}(\nu) = 0$. On the other hand, consider the case that $e_{i^*}(\varphi^{(0)}) \neq 0$. If at least one of the $j_k$ is different from $i^*$, then setting $i = i^*$ in eq. (13) yields

$$0 = \underbrace{e_{i^*}(\varphi^{(0)})}_{\neq 0}\, e^{(1)}_{j_1}(\varphi^{(1)}) \cdots e^{(N)}_{j_N}(\varphi^{(N)}),$$

hence $e^{(1)}_{j_1}(\varphi^{(1)}) \cdots e^{(N)}_{j_N}(\varphi^{(N)}) = 0$. But since $\sum_{j_1, \ldots, j_N} e^{(1)}_{j_1}(\varphi^{(1)}) \cdots e^{(N)}_{j_N}(\varphi^{(N)}) = 1$, we must have $e^{(1)}_{i^*}(\varphi^{(1)}) \cdots e^{(N)}_{i^*}(\varphi^{(N)}) = 1$, and so $e^{(j)}_{i^*}(\varphi^{(j)}) = 1$ for all $j$. Recalling eq. (13), we therefore see that $e_i(\nu) = 0$ for all $i \neq i^*$, and so $e_{i^*}(\nu) = 1$.

In summary, we obtain $e_{i^*}(\nu) \in \{0, 1\}$ for all $i^*$.

Thus, for all GPT systems $S$ that contain pure states on which the MCI-frame measurement gives non-deterministic outcomes, the corresponding ideal Darwinism processes (if they exist) must create entangled states. While this property will be satisfied for typical GPT systems, we cannot immediately conclude that a system satisfying $e(\nu) = 0$ or $1$ must be classical. For instance, a GPT system with a cubic state space (i.e. gbits in a theory called "boxworld" [18]) and the full dual octahedral effect space will satisfy $e(\nu) = 0$ or $1$ for every pair of pure state $\nu$ and pure effect $e$, but is evidently nonclassical. However, as we shall see in the following theorem, Darwinism in boxworld (among a wider class of theories) can be ruled out by another necessary condition: this time, on the measurements.

In particular, let us focus on GPT systems $S$ that are non-classical in the following sense: in addition to the refined measurement $e^{(0)}_0, \ldots, e^{(0)}_{d-1}$ that reads out the MCI-frame $\{\omega^{(0)}_k\}$, there is at least one other refined measurement $\tilde{e}^{(0)}_0, \ldots, \tilde{e}^{(0)}_{d-1}$ that is not just a relabelling of the measurement $\{e^{(0)}_j\}$, i.e. at least one of the $\tilde{e}^{(0)}_j$ is not equal to any of the $e^{(0)}_k$. (In quantum theory, this would correspond to projective measurements in different bases, with all projectors rank-one.)

Theorem 2.

Suppose that we have an ideal Darwinism process such that the system $S$ is non-classical in the sense described above. Then the fan-out transformation $T$ must map some pure product effects to entangled effects.

Proof. It will be useful to use the notation $\|e\| := \max_{\omega \in \Omega} e(\omega)$ for effects $e$. Suppose that $T$ maps all pure product effects to separable effects. Then, since $T$ is reversible and preserves purity, Lemma 9 (appendix B) implies that $T$ maps pure product effects to pure product effects. Hence, due to eq. (9), for every $j_0$ and for every $\mathbf{j} = (j_1, \ldots, j_N)$ there are effects $h^{(0)}_{j_0, \mathbf{j}}, \ldots, h^{(N)}_{j_0, \mathbf{j}}$ such that

$$\left(e^{(0)}_{j_0} \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right) T = h^{(0)}_{j_0, \mathbf{j}} \odot h^{(1)}_{j_0, \mathbf{j}} \odot \cdots \odot h^{(N)}_{j_0, \mathbf{j}}. \tag{14}$$

Due to multilinearity, we can move any multiplicative constant into the zeroth factor, and in this way choose the effects such that $\|h^{(i)}_{j_0, \mathbf{j}}\| = 1$ for all $i \in \{1, \ldots, N\}$. If we had $\|h^{(0)}_{j_0, \mathbf{j}}\| < 1$, then the right-hand side could never attain the value $1$ on product states, but we know that it does due to definition 5. Thus, $\|h^{(0)}_{j_0, \mathbf{j}}\| = 1$. Substituting eq. (14) into eq. (9) and noting that the result is valid for every state $\nu \in \Omega_S$, we obtain

$$h^{(0)}_{j_0, \mathbf{j}}\; p_{j_0, \mathbf{j}, \mathbf{k}} = \delta_{j_1, j_0 + k_1} \cdots \delta_{j_N, j_0 + k_N}\, e^{(0)}_{j_0},$$

where $p_{j_0, \mathbf{j}, \mathbf{k}} := h^{(1)}_{j_0, \mathbf{j}}(\omega^{(1)}_{k_1}) \cdot \ldots \cdot h^{(N)}_{j_0, \mathbf{j}}(\omega^{(N)}_{k_N}) \geq 0$. The special case of $\mathbf{k} = \mathbf{j} - j_0 := (j_1 - j_0, \ldots, j_N - j_0)$ yields $e^{(0)}_{j_0} = p_{j_0, \mathbf{j}, \mathbf{j} - j_0}\, h^{(0)}_{j_0, \mathbf{j}}$. But since $\|e^{(0)}_{j_0}\| = 1 = \|h^{(0)}_{j_0, \mathbf{j}}\|$, this implies that $h^{(0)}_{j_0, \mathbf{j}} = e^{(0)}_{j_0}$ for all $\mathbf{j}$.

Since $S$ is non-classical, there is another refined measurement $\{\tilde{e}^{(0)}_j\}_j$ which is not just a relabelling (i.e. permutation) of $\{e^{(0)}_i\}_i$. Using again our assumption that $T$ maps products of pure effects to product effects, we obtain

$$\left(\tilde{e}^{(0)}_{j_0} \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right) T = \tilde{h}^{(0)}_{j_0, \mathbf{j}} \odot \tilde{h}^{(1)}_{j_0, \mathbf{j}} \odot \cdots \odot \tilde{h}^{(N)}_{j_0, \mathbf{j}} \tag{15}$$

for some suitable effects $\tilde{h}^{(0)}_{j_0, \mathbf{j}}, \ldots, \tilde{h}^{(N)}_{j_0, \mathbf{j}}$. Again, we define the effects such that $\|\tilde{h}^{(i)}_{j_0, \mathbf{j}}\| = 1$ for all $i \in \{1, \ldots, N\}$ (the case $i = 0$ will be discussed later). Summing over $j_0$, using that $\sum_{j_0} \tilde{e}^{(0)}_{j_0} = u_S = \sum_{j_0} e^{(0)}_{j_0}$, yields

$$\sum_{j_0} e^{(0)}_{j_0} \odot h^{(1)}_{j_0, \mathbf{j}} \odot \cdots \odot h^{(N)}_{j_0, \mathbf{j}} = \sum_{j_0} \tilde{h}^{(0)}_{j_0, \mathbf{j}} \odot \tilde{h}^{(1)}_{j_0, \mathbf{j}} \odot \cdots \odot \tilde{h}^{(N)}_{j_0, \mathbf{j}}. \tag{16}$$

Applying both sides to the product state $\nu \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}$ and recalling eq. (9), we obtain

$$\sum_{j_0} \tilde{h}^{(0)}_{j_0, \mathbf{j}}(\nu)\, \tilde{h}^{(1)}_{j_0, \mathbf{j}}(\omega^{(1)}_{k_1}) \cdots \tilde{h}^{(N)}_{j_0, \mathbf{j}}(\omega^{(N)}_{k_N}) = \sum_{j_0} e^{(0)}_{j_0}(\nu)\, \delta_{j_1, j_0 + k_1} \cdots \delta_{j_N, j_0 + k_N}. \tag{17}$$

So far, $\mathbf{j}$ and $\mathbf{k}$ are arbitrary, but now set $j_i := k_i + l$ for all $i$, where $l$ is fixed (we abbreviate this by $\mathbf{j} = \mathbf{k} + l$). We obtain

$$e^{(0)}_l(\nu) = \sum_{j_0} q_{j_0, \mathbf{k}, l}\, \tilde{h}^{(0)}_{j_0, \mathbf{k} + l}(\nu), \tag{18}$$

where $q_{j_0, \mathbf{k}, l} := \tilde{h}^{(1)}_{j_0, \mathbf{k} + l}(\omega^{(1)}_{k_1}) \cdots \tilde{h}^{(N)}_{j_0, \mathbf{k} + l}(\omega^{(N)}_{k_N}) \in [0, 1]$. Since this is true for all states $\nu \in \Omega_S$, we may again drop the $\nu$ and read it as an equality between effects. Since $e^{(0)}_l \neq 0$, for every $l$ and for every $\mathbf{k}$ there must be some $j_0$ such that $q_{j_0, \mathbf{k}, l} \neq 0$. Since $e^{(0)}_l$ is pure, this implies that $e^{(0)}_l \propto \tilde{h}^{(0)}_{j_0, \mathbf{k} + l}$. Now fix an arbitrary $\mathbf{j}$, and consider the special case $\mathbf{k} := \mathbf{j} - l$. It follows that for all $l$, there exists at least one $j_0$ such that $e^{(0)}_l$ is a scalar multiple of $\tilde{h}^{(0)}_{j_0, \mathbf{j}}$. There are $d$ different linearly independent $e^{(0)}_l$ (labelled by $l$), and there are $d$ different $\tilde{h}^{(0)}_{j_0, \mathbf{j}}$, labelled by $j_0$. Thus, to every $l$ there is a unique $j_0$ such that $e^{(0)}_l = q_{j_0, \mathbf{j} - l, l}\, \tilde{h}^{(0)}_{j_0, \mathbf{j}}$.
We have

$$\|e^{(0)}_l\| = \underbrace{q_{j_0, \mathbf{j} - l, l}}_{\leq 1}\; \underbrace{\|\tilde{h}^{(0)}_{j_0, \mathbf{j}}\|}_{\leq 1}, \tag{19}$$

hence (since $\|e^{(0)}_l\| = 1$) $\|\tilde{h}^{(0)}_{j_0, \mathbf{j}}\| = 1$, and so $e^{(0)}_l = \tilde{h}^{(0)}_{j_0, \mathbf{j}}$. We can rephrase this as follows. For every $\mathbf{j}$ there is a permutation $\pi$ of the indices such that $\tilde{h}^{(0)}_{j_0, \mathbf{j}} = e^{(0)}_{\pi(j_0)}$ for all $j_0$.

Now fix some $\mathbf{j}$. Let us return to eq. (16) and apply it to $\omega^{(0)}_{\pi(j_0)} \odot \omega$, where $\pi$ is the permutation corresponding to $\mathbf{j}$, and $\omega$ is an arbitrary global state of the $N$ environments. Using the identities that we have just derived and $e^{(0)}_{j_0}(\omega^{(0)}_i) = \delta_{j_0, i}$, we obtain

$$\tilde{h}^{(1)}_{j_0, \mathbf{j}} \odot \cdots \odot \tilde{h}^{(N)}_{j_0, \mathbf{j}} = h^{(1)}_{\pi(j_0), \mathbf{j}} \odot \cdots \odot h^{(N)}_{\pi(j_0), \mathbf{j}}. \tag{20}$$

Recalling eqs. (14) and (15), it follows that

$$\begin{aligned}
\left(\tilde{e}^{(0)}_{j_0} \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right) T
&= \tilde{h}^{(0)}_{j_0, \mathbf{j}} \odot \tilde{h}^{(1)}_{j_0, \mathbf{j}} \odot \cdots \odot \tilde{h}^{(N)}_{j_0, \mathbf{j}} \\
&= e^{(0)}_{\pi(j_0)} \odot h^{(1)}_{\pi(j_0), \mathbf{j}} \odot \cdots \odot h^{(N)}_{\pi(j_0), \mathbf{j}} \\
&= \left(e^{(0)}_{\pi(j_0)} \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right) T.
\end{aligned} \tag{21}$$

Since $T$ is reversible, the terms in the brackets must be identical. Consider the case $j_1 = j_2 = \ldots = j_N = 0$ and environment states $\varphi^{(1)}, \ldots, \varphi^{(N)}$ with $e^{(k)}_0(\varphi^{(k)}) = 1$ for all $k = 1, \ldots, N$. Then applying the above brackets to the product state $\nu \odot \varphi^{(1)} \odot \cdots \odot \varphi^{(N)}$ yields $\tilde{e}^{(0)}_{j_0}(\nu) = e^{(0)}_{\pi(j_0)}(\nu)$. Since this is true for all $\nu \in \Omega_S$, we obtain $\tilde{e}^{(0)}_{j_0} = e^{(0)}_{\pi(j_0)}$.
This contradicts our assumption that $\{\tilde{e}^{(0)}_j\}_j$ is not just a permutation of the $\{e^{(0)}_i\}_i$.

Thus, a reversible transformation $T$ that implements an ideal Darwinism process will create entangled effects. An important consequence is that GPTs without entangled effects, such as those constructed by taking the maximal tensor product in the context of tomographic locality, cannot admit such a process. In particular, this rules out Darwinism in boxworld [18] (a theory containing the aforementioned gbits) or any dichotomic maximally nonlocal theory. For these specific examples, one could also infer this from Refs. [40, 44], but here we have shown it without having to determine the complete structure of the reversible transformations.

Interestingly, entanglement for states is also needed in general physical theories if one imposes another condition of relevance for the classical limit: the existence of a decoherence map [23]. However, theories that have an ideal Darwinism process (and by our results need entangled states and measurements) may not contain such a decoherence map, as we shall show in section III D. Therefore, our results not only provide alternative proofs but are also complementary to those of Richens et al. [23]: together they support the idea that this non-classical feature must be present for a locally non-classical theory to admit a meaningful classical limit.

C. Sufficient features for Darwinism in GPTs

Let us now determine sufficient conditions that guarantee that Quantum Darwinism can be generalized to a theory. In particular, we are interested in which operationally well-motivated postulates that have already appeared in the GPT literature can lead to such Darwinism. In this spirit, we will see how a framework that admits decoherence also admits Darwinism.

We will first determine sufficient structure in GPTs to allow for the robust spreading of classical information (in the manner of definition 6), before determining which additional postulates can be added to guarantee the existence of an ideal Darwinism process (definition 5) that additionally broadcasts classical information to the environment even when the system is not in an MCI-frame state.

Recall that both the spreading of classical information and the ideal Darwinism processes require the system to have an MCI-frame (playing the role of pointer states) that defines the classical information to be spread to the environment (definition 5(a)). Likewise, the environments must admit MCI-frames on which to receive this classical information (definition 5(b)). Even though a theory admitting such frames may arguably be said to contain classical information (i.e. admitting "registers" that can encode the appropriate values), it may not generally admit all (or even any!) classical information processing. That is, there is no guarantee that the theory admits sufficient dynamics to satisfy definition 6. In the following, we will consider what physical characteristics do ensure that the theory has enough classical information processing power to implement a fan-out gate in the manner of eq. (10).

The first possible characteristic is to demand that composite systems satisfy strong symmetry [45]:

Definition 7.

A GPT system with group of reversible transformations $\mathcal{T}$ satisfies strong symmetry (on states) if for all $n \in \mathbb{N}$ and for all pairs of frames $\omega_1, \ldots, \omega_n$ and $\nu_1, \ldots, \nu_n$, there exists some $T \in \mathcal{T}$ with $T\omega_j = \nu_j$ for all $j$.

Strong symmetry says that all ways of encoding classical information are computationally equivalent. In particular, it implies that classical reversible computation can be performed on the MCI-frames of system and environment: since the set of states $\omega_{j_0, \ldots, j_N} := \omega^{(0)}_{j_0} \odot \cdots \odot \omega^{(N)}_{j_N}$ constitutes a frame of the composite system, strong symmetry implies that we can perform arbitrary classical reversible gates (and thus arbitrary permutations) of those frame elements. This immediately gives us the following result:

Lemma 3.

Consider GPT systems $S, E_1, \ldots, E_N$ that carry $d$-outcome MCI-frames. Every composition $SE_1 \ldots E_N$ that satisfies strong symmetry (on states) admits the robust spreading of classical information.

While strong symmetry on states implies the robust spreading of classical information in the sense of definition 6, we do not know whether this property implies the existence of an ideal Darwinism process in the sense of definition 5. Interestingly, the existence of such a process follows if we consider a dual notion of strong symmetry on the measurements:

Definition 8.

A GPT system with group of reversible transformations $\mathcal{T}$ satisfies strong symmetry (on effects) if the following holds for all $n \in \mathbb{N}$: if $(e_1, \ldots, e_n)$ is a collection of pure effects that perfectly distinguishes some frame, and so is $(f_1, \ldots, f_n)$, then there exists a $T \in \mathcal{T}$ with $e_j = f_j \circ T$ for all $j$.

If this property holds, we can show the following:

Lemma 4.

Consider again GPT systems $S, E_1, \ldots, E_N$ that carry $d$-outcome MCI-frames. Every composition $SE_1 \ldots E_N$ that satisfies strong symmetry (on effects) admits an ideal Darwinism process.

Proof. The product effects $e_{j_0, \ldots, j_N} := e^{(0)}_{j_0} \odot \cdots \odot e^{(N)}_{j_N}$ are pure effects which perfectly distinguish the frame $\omega_{j_0, \ldots, j_N}$. Thus, strong symmetry on effects implies that there is some $T \in \mathcal{T}_{SE_1 \ldots E_N}$ with

$$e_{j_0, j_1, \ldots, j_N} \circ T = e_{j_0, j_1 - j_0, \ldots, j_N - j_0}, \tag{22}$$

where subtraction is modulo $d$. One can check directly that this map $T$ satisfies eq. (9).

Thus, the version of Darwinism that is guaranteed to hold (according to definition 5 or 6) depends on whether we demand strong symmetry on the states or on the effects. Is there a way to guarantee it on both? Indeed, it turns out that the no-restriction hypothesis is sufficient for this:

Theorem 5.

Consider GPT systems $S, E_1, \ldots, E_N$ that carry $d$-outcome MCI-frames. Every unrestricted composition $SE_1 \ldots E_N$ that satisfies strong symmetry (on states) has a transformation $T \in \mathcal{T}_{SE_1 \ldots E_N}$ that robustly spreads classical information and that generates an ideal Darwinism process.

Proof. For unrestricted systems $A$ with strong symmetry on states, it was shown in Ref. [39] that there is a particularly strong duality between states and effects: there is an inner product $\langle \cdot, \cdot \rangle$ on $A$ such that frames $\omega_1, \ldots, \omega_n$ correspond to orthonormal systems, and the corresponding pure effects with $e_i(\omega_j) = \delta_{ij}$ must be given by $e_i(\omega) = \langle \omega_i, \omega \rangle$. Moreover, all $T \in \mathcal{T}_A$ are orthogonal with respect to this inner product. If $f_1, \ldots, f_n$ is any other collection of pure effects that distinguish a frame (say, $\nu_1, \ldots, \nu_n$), then strong symmetry on states says that there is some $T \in \mathcal{T}_A$ with $T\omega_j = \nu_j$, and so

$$e_j \circ T^{-1}(\omega) = \langle \omega_j, T^{-1}\omega \rangle = \langle T\omega_j, \omega \rangle = \langle \nu_j, \omega \rangle = f_j(\omega).$$

Consequently, $A$ also satisfies strong symmetry on effects. Now, choose $T$ as in eq. (22); then we already know that it generates an ideal Darwinism process. Moreover, we have just seen that $T^{-1}$ maps the corresponding frame elements onto each other, i.e.

$$T^{-1}\omega_{j_0, j_1, \ldots, j_N} = \omega_{j_0, j_1 - j_0, \ldots, j_N - j_0}.$$

Applying $T$ to both sides shows that $T$ robustly spreads classical information in the sense of definition 6.

A second path to this spreading of classical information arises from decoherence theory. In quantum theory, decoherence plays an important role in Quantum Darwinism by explaining in some sense why we see classical probabilities instead of superposition states. Recently, a decoherence formalism for GPTs was developed [23], and we shall here see that it enables Darwinism in GPTs as well. We adapt the decoherence formalism of Richens et al. [23] to our setting:

Definition 9 (Decoherence maps).
Consider any GPT system $A$. A linear map $D : A \to A$ is called a decoherence map if the following properties hold:

1. The image of $A_+$ under $D$ is isomorphic to a classical state space, i.e. there exists a frame $\omega_0, \ldots, \omega_{d-1} \in \Omega_A$ such that $D(\Omega_A) = \mathrm{conv}\{\omega_0, \ldots, \omega_{d-1}\}$ (i.e. the convex hull of the $\{\omega_i\}$). Consequently, $D$ is normalization-preserving, i.e. $u_A \circ D = u_A$.

2. $D$ is idempotent, i.e. $D \circ D = D$.

3. For every classical reversible transformation $T_C : D(A) \to D(A)$ there is a reversible transformation $T \in \mathcal{T}_A$ that implements $T_C$, i.e. $T(\omega) = T_C(\omega)$ for all $\omega \in D(A_+)$. Not only does this map $T$ preserve the classical state space $D(A_+)$, but it also preserves the corresponding classical effect space $E_A \circ D$.

4. Furthermore, if we have a composite GPT system $A = A_1 A_2 \ldots A_N$ with decoherence maps $D_1, \ldots, D_N$, then $A$ has a decoherence map $D_{1 \ldots N}$ that acts as $D_{1 \ldots N}(\nu_1 \odot \cdots \odot \nu_N) = D_1(\nu_1) \odot \cdots \odot D_N(\nu_N)$.

Richens et al. [23] additionally assume that $D$ is physically implementable, but we do not assume this here.

In the following, we will need a simple property of decoherence maps:

Lemma 6.

Consider a GPT system $A$ with decoherence map $D$, and $T$ any reversible transformation that implements some classical transformation in the sense of definition 9 item 3. Then $DT = TD$.

Proof. Let $e \in E_A$ and $\varphi \in A_+$. Then $f := e \circ D$ is an element of the classical effect space $E_A \circ D$, and so is $f' := f \circ T$, hence $f' = f' \circ D$. Thus, we have

$$e \circ DT\varphi = f \circ T\varphi = f'(\varphi) = f' \circ D\varphi = e \circ DTD\varphi.$$

Since $A_+$ and $E_A$ span $A$ and $A^*$, respectively, it follows that $DT = DTD$. But $T$ preserves $D(A) = \mathrm{span}(D(A_+))$, hence $DTD = TD$.

In analogy with how quantum systems decohere to mixtures of pointer states, it is natural to consider Darwinism for frames that can result from decoherence processes.
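In the quantum special case, the commutation property of Lemma 6 can be checked directly. The following is a minimal sketch under stated assumptions: the decoherence map is taken to be full dephasing in a fixed basis, and the reversible transformations that implement classical permutations are taken to be permutation matrices dressed with arbitrary phases. The helper names (`dephase`, `commutes_with_dephasing`) and the specific test state are illustrative choices, not part of the formalism above.

```python
import numpy as np

def dephase(rho):
    """Full dephasing in the computational basis: keep only the diagonal."""
    return np.diag(np.diag(rho))

def commutes_with_dephasing(U, rho):
    """Check D(U rho U^dagger) == U D(rho) U^dagger for one input state."""
    lhs = dephase(U @ rho @ U.conj().T)
    rhs = U @ dephase(rho) @ U.conj().T
    return bool(np.allclose(lhs, rhs))

d = 3
psi = np.array([1, 2, 2j], dtype=complex) / 3.0     # a fixed pure qutrit state
rho = np.outer(psi, psi.conj())

# A unitary implementing a classical (cyclic) permutation, with extra phases.
perm = np.eye(d)[[1, 2, 0]]
phases = np.diag([1, 1j, -1])
U_classical = perm @ phases

# A unitary that does NOT map pointer states to pointer states (discrete Fourier).
fourier = np.array([[np.exp(2j * np.pi * j * k / d) for k in range(d)]
                    for j in range(d)]) / np.sqrt(d)

print(commutes_with_dephasing(U_classical, rho))
print(commutes_with_dephasing(fourier, rho))
```

The Fourier unitary is included only as a contrast: it does not implement a classical transformation, so the lemma makes no claim about it, and in this example it fails to commute with the dephasing map.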

Definition 10.

Consider any GPT system $A$. We say that an MCI-frame $\{\omega_i\} \subset \Omega_A$ together with a corresponding refined measurement $\{e_i\} \subset E_A$ arises from decoherence if there is a decoherence map $D : A \to A$ such that $D(A_+) = \mathrm{cone}\{\omega_i\}$ and $E_A \circ D = \mathrm{cone}\{e_i\}$.

In this definition, $\mathrm{cone}\{\omega_i\}$ denotes the set of non-negative linear combinations of the $\omega_i$, i.e. the convex cone of unnormalized states generated by the MCI-frame (similarly for the $\{e_i\}$).

Let $\{\omega^{(0)}_j\}_{j=0}^{d-1}$ be an MCI-frame of the main system $S$ that arises from decoherence map $D_0$, and similarly let $\{\omega^{(i)}_j\}_{j=0}^{d-1}$, $i = 1, \ldots, N$, be MCI-frames of the environmental systems $E_1, \ldots, E_N$ that arise from decoherence maps $D_1, \ldots, D_N$. Then requirement 4 of definition 9 implies that there is a decoherence map $D_{0 \ldots N}$ with

$$D_{0 \ldots N}\left(\omega^{(0)}_{j_0} \odot \cdots \odot \omega^{(N)}_{j_N}\right) = D_0(\omega^{(0)}_{j_0}) \odot \cdots \odot D_N(\omega^{(N)}_{j_N}).$$

Since each $D_i$ is a projection map and since every $\omega^{(i)}_j$ is in its image, we have $D_i(\omega^{(i)}_j) = \omega^{(i)}_j$, and hence

$$D_{0 \ldots N}\left(\omega^{(0)}_{j_0} \odot \cdots \odot \omega^{(N)}_{j_N}\right) = \omega^{(0)}_{j_0} \odot \cdots \odot \omega^{(N)}_{j_N}.$$

Requirement 3 for decoherence maps implies that the classical transformation defined by eq. (10) (a particular permutation of the classical pure states) is implemented as a reversible transformation $T \in \mathcal{T}_A$ on the composite GPT system $A := SE_1 \ldots E_N$. This transformation hence robustly spreads classical information in the sense of definition 6.

Furthermore, consider any state $\nu \in \Omega_S$, and let $\bar{\nu} := D_0 \nu$. Since the MCI-frame of $S$ arises from $D_0$, there is a convex decomposition $\bar{\nu} = \sum_{i=0}^{d-1} \lambda_i \omega^{(0)}_i$ with $\lambda_i \geq 0$, $\sum_{i=0}^{d-1} \lambda_i = 1$. Using lemma 6, we thus obtain

$$\begin{aligned}
\left(e^{(0)}_{j_0} \odot e^{(1)}_{j_1} \odot \cdots \odot e^{(N)}_{j_N}\right) T\left(\nu \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}\right)
&= e_{j_0, \ldots, j_N} \circ D_{0 \ldots N}\, T\left(\nu \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}\right) \\
&= e_{j_0, \ldots, j_N} \circ T\, D_{0 \ldots N}\left(\nu \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}\right) \\
&= e_{j_0, \ldots, j_N} \circ T\left(\bar{\nu} \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}\right) \\
&= \sum_{i=0}^{d-1} \lambda_i\, e_{j_0, \ldots, j_N} \circ T\left(\omega^{(0)}_i \odot \omega^{(1)}_{k_1} \odot \cdots \odot \omega^{(N)}_{k_N}\right) \\
&= \sum_{i=0}^{d-1} \lambda_i\, e_{j_0, \ldots, j_N}\left(\omega_{i, i + k_1, \ldots, i + k_N}\right) \\
&= \lambda_{j_0}\, \delta_{j_1, j_0 + k_1} \cdots \delta_{j_N, j_0 + k_N}.
\end{aligned}$$

Furthermore, $e^{(0)}_{j_0}(\nu) = e^{(0)}_{j_0} \circ D_0(\nu) = \lambda_{j_0}$. This proves that $T$ generates an ideal Darwinism process.

We summarize our findings in the following theorem:

Theorem 7.

Consider a composition $SE_1 \ldots E_N$ of GPT systems $S, E_1, \ldots, E_N$ that carry $d$-outcome MCI-frames arising from decoherence. This composite system admits a transformation $T \in \mathcal{T}_{SE_1 \ldots E_N}$ that robustly spreads classical information and that generates an ideal Darwinism process.

Composite systems in quantum theory are unrestricted and satisfy strong symmetry (on states and effects). Furthermore, they admit MCI-frames arising from decoherence in the way specified above. Thus, the existence of an ideal Darwinism process and the robust spreading of classical information in quantum theory follow both as special cases of theorem 5 and theorem 7.
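For the quantum case, condition (9) of definition 5 can be verified by brute force for small dimensions. The sketch below assumes qudit systems with the computational basis as pointer basis, and builds the fan-out as the permutation of basis states given in eq. (10); all function names are illustrative and not taken from the text.

```python
import itertools
import numpy as np

def fan_out_unitary(d, n_env):
    """Permutation matrix |j0, j1, ..., jN> -> |j0, j1+j0, ..., jN+j0> (mod d)."""
    shape = (d,) * (n_env + 1)
    dim = d ** (n_env + 1)
    U = np.zeros((dim, dim))
    for digits in itertools.product(range(d), repeat=n_env + 1):
        j0 = digits[0]
        target = (j0,) + tuple((j + j0) % d for j in digits[1:])
        U[np.ravel_multi_index(target, shape),
          np.ravel_multi_index(digits, shape)] = 1.0
    return U

def basis_vector(d, k):
    v = np.zeros(d, dtype=complex)
    v[k] = 1.0
    return v

def product_state(vectors):
    out = np.array([1.0 + 0j])
    for v in vectors:
        out = np.kron(out, v)       # first factor is the most significant index
    return out

def satisfies_eq9(d, n_env, nu, atol=1e-12):
    """Check eq. (9) for every environment input k and every outcome tuple j."""
    U = fan_out_unitary(d, n_env)
    shape = (d,) * (n_env + 1)
    for ks in itertools.product(range(d), repeat=n_env):
        psi_out = U @ product_state([nu] + [basis_vector(d, k) for k in ks])
        probs = np.abs(psi_out) ** 2
        for js in itertools.product(range(d), repeat=n_env + 1):
            j0 = js[0]
            agree = all(js[i] == (j0 + ks[i - 1]) % d
                        for i in range(1, n_env + 1))
            expected = abs(nu[j0]) ** 2 if agree else 0.0
            if abs(probs[np.ravel_multi_index(js, shape)] - expected) > atol:
                return False
    return True

rng = np.random.default_rng(1)
nu = rng.normal(size=3) + 1j * rng.normal(size=3)
nu = nu / np.linalg.norm(nu)        # random pure qutrit state of S
print(satisfies_eq9(3, 2, nu))
```

The check loops over all $d^N$ environment inputs and all $d^{N+1}$ outcome tuples, so it is only meant for small $d$ and $N$. It confirms the operational statement, not the abstract GPT conditions of theorems 5 and 7.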

D. Darwinism in Spekkens’ Toy Model

If one identifies too many specific restrictions on a GPT, it raises the natural question: "is quantum theory the only physical theory that allows for Darwinism?" We answer this in the negative by providing an example that admits Darwinism, but is not quantum theory: Spekkens' Toy Model (STM) [27].

STM satisfies many of the same restrictions as quantum theory, such as no-signalling and no-cloning, and emulates many quantum behaviours such as complementary measurements, interference, entanglement (and monogamy thereof), and teleportation [27]. Despite this, it is very different from quantum theory, both mathematically and conceptually, since at its core it is a classical hidden-variable model. What enables this quantum-like behaviour is that the states of maximum knowledge of the system are subject to the epistemic restriction that one knows only half of the possible information about the hidden ontic variable, along with a measurement-update rule that ensures that this restriction is maintained even when one makes sequential measurements on the system.

A more detailed description of STM and its extension into the GPT framework is given in appendix C. For now, it suffices to remark that the composition of such systems is achieved by composing the underlying hidden classical variable (i.e. by Cartesian product) and applying the epistemic restriction to both the composite system and every subsystem thereof.

As observed by Pusey [46] (and recounted in appendix C 3), the states within STM may be treated very similarly to the stabilizer subset of quantum theory (for a single system, the state spaces are isomorphic). In particular, a single elementary STM system admits three "toy observables" $X$, $Y$ and $Z$ which act on the state to produce outputs $+1$ or $-1$, and there is one pure state for each of these six possibilities ($|x\pm\rangle$, $|y\pm\rangle$, $|z\pm\rangle$) and no other pure states. When the "wrong" observable acts on a pure state (e.g. acting on $|z+\rangle$ with $X$), outcomes $+1$ and $-1$ occur with equal probability. In this language, one can define the CNOT analogue for two STM bits "control" $C$ and "target" $T$:

$$\mathrm{CNOT}: \quad X_C \mapsto X_C X_T, \quad X_T \mapsto X_T, \quad Z_C \mapsto Z_C, \quad Z_T \mapsto Z_C Z_T.$$

This can be read as, e.g. for $X_C \mapsto X_C X_T$, "The product of the observation of $X$ on $C$ and $X$ on $T$ after the transformation CNOT yields the same outcome statistics as the observation $X$ on $C$ before the transformation."

With this shorthand, we hence specify our candidate for an ideal Darwinism process from main system $S$ onto multiple environments $E_1, \ldots, E_N$:

$$\mathrm{FAN}: \quad X_S \mapsto X_S X_{E_1} \cdots X_{E_N}, \quad \forall k: X_{E_k} \mapsto X_{E_k}, \quad Z_S \mapsto Z_S, \quad \forall k: Z_{E_k} \mapsto Z_S Z_{E_k}. \tag{23}$$

The validity of this, as a transformation in STM, can be verified in one of two ways. The first is to consider a direct implementation of this as a series of pairwise CNOT gates (in the manner of fig. 1), reasoning (e.g. via category theory [47]) that such composition is permissible. The second way is to note that this map is admissible as a transformation on an analogously defined $N$-bit quantum stabilizer system, and then use the result of Pusey [46] to infer that this makes FAN a valid STM transformation.

Thus, it remains to verify that such a transformation indeed achieves the desired ideal Darwinistic behaviour. Suppose we have an initial state of the form $|\psi\rangle_S \otimes |{+z}\rangle^{\otimes N}_{E_1 \ldots E_N}$, where $|\psi\rangle_S$ is some arbitrary pure STM bit state of the main system, and $|{+z}\rangle$ corresponds to the state that always gives output $+1$ when measured by toy observable $Z$. As for each $k$, FAN maps $Z_{E_k}$ to $Z_S Z_{E_k}$, the final state will always have result $+1$ for joint measurements of $Z_S Z_{E_k}$, mandating that the results of $Z_S$ and $Z_{E_k}$ are perfectly correlated.
(In the case $E_j$ starts at $|{-z}\rangle$, anti-correlation is established.) Therefore the fan-out results in all observers seeing the same outcome as made on the original system.

Our other requirement for Darwinism is that the outcome probability of $Z_S$ is not changed, and this is also explicitly given by the rule $Z_S \mapsto Z_S$ in the map. In particular, $|z\pm\rangle$ are the only pure states that have non-zero expectation value for the observable $Z$, and the map does not take any state of the main system stabilized by another observable (i.e. $X$ or $Y$) to any state stabilized by an expression containing $Z$. As such, since $S$ can only be in one of these possibilities (or a convex combination thereof in the GPT extension), this implies that the statistics of $Z_S$ remain unchanged. We summarize this with our final theorem of the article:

Theorem 8 (STM admits an ideal Darwinism process). The FAN operation specified in eq. (23) implements an ideal Darwinism process, as per definition 5.
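The bookkeeping behind rule (23) can be sketched in the stabilizer-like language used above. The following is a minimal illustration, assuming observables are tracked as binary (x-bits, z-bits) strings with index 0 for $S$ and signs ignored; it is not an implementation of STM itself, and all function names are hypothetical. It checks that every $Z_S Z_{E_k}$ has a definite value after FAN, and that $Z_S$ alone has a definite value only if the system started in a $Z$ state.

```python
from itertools import combinations

def single(site, kind, n_sites):
    """Observable X, Y or Z on one site, as (x_bits, z_bits); signs are ignored."""
    x = [0] * n_sites
    z = [0] * n_sites
    if kind in ("X", "Y"):
        x[site] = 1
    if kind in ("Z", "Y"):
        z[site] = 1
    return (tuple(x), tuple(z))

def multiply(a, b):
    """Product of two observables up to sign: bitwise XOR of both bit strings."""
    return (tuple(p ^ q for p, q in zip(a[0], b[0])),
            tuple(p ^ q for p, q in zip(a[1], b[1])))

def fan(obs):
    """Rule (23): X_S -> X_S X_E1...X_EN, Z_Ek -> Z_S Z_Ek, X_Ek and Z_S fixed."""
    x, z = obs
    new_x = (x[0],) + tuple(xk ^ x[0] for xk in x[1:])
    z_parity = 0
    for zk in z[1:]:
        z_parity ^= zk
    new_z = (z[0] ^ z_parity,) + tuple(z[1:])
    return (new_x, new_z)

def in_group(target, generators):
    """Is target a product of some subset of the generators (up to sign)?"""
    n = len(target[0])
    for r in range(len(generators) + 1):
        for subset in combinations(generators, r):
            acc = ((0,) * n, (0,) * n)
            for g in subset:
                acc = multiply(acc, g)
            if acc == target:
                return True
    return False

def final_generators(system_kind, n_env):
    """Definite observables after FAN for S in a `system_kind` state, all E_k in +z."""
    n_sites = n_env + 1
    initial = [single(0, system_kind, n_sites)]
    initial += [single(k, "Z", n_sites) for k in range(1, n_sites)]
    return [fan(g) for g in initial]

n_env = 3
for kind in ("X", "Y", "Z"):
    gens = final_generators(kind, n_env)
    z_s = single(0, "Z", n_env + 1)
    correlated = all(in_group(multiply(z_s, single(k, "Z", n_env + 1)), gens)
                     for k in range(1, n_env + 1))
    print(kind, correlated, in_group(z_s, gens))
```

The subset search in `in_group` is exponential in the number of generators, which is acceptable for a handful of environments. Tracking signs, needed to distinguish correlation from anti-correlation, would require extending this representation.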

We conclude this section with some remarks on the implications of this example for the theorems of this paper. First, in terms of necessary conditions: STM is nonclassical in the sense that there is more than one set of sufficiently different refined measurements (recall section III B), and indeed STM also has entangled effects, as mandated by Theorem 2. Although entangled effects are required in a non-classical setting, we can further conclude (by counterexample) that the stronger condition of violating Bell inequalities (see e.g. [48]) is not necessary, since STM does not violate these. A similar conclusion follows for contextuality, which is not present in STM [27] and is thus shown to be unnecessary for Darwinism.

Secondly, in terms of the sufficient conditions, STM neither admits a decoherence map, nor is it strongly symmetric (as we show in appendix C 4). This illustrates that the sufficient conditions are not tight: they enable the fan-out dynamic by mandating the existence of all classical dynamics within the theory. However, the fan-out operation can be admitted without requiring universal classical computation, as shown above for STM, or as a member of the (non-universal [49]) Clifford group in the case of quantum stabilizers.

IV. CONCLUSIONS

Quantum Darwinism provides a mechanism through which crucial aspects of classicality can be understood to emerge in the quantum domain [1–5]. In this article, we generalized an ideal notion of Darwinism, where maximal classical information is perfectly broadcast to an environment split into fractions, to the framework of GPTs. We showed that entanglement, in both states and measurements, is a necessary feature for such a process to be present in generalized theories, and demonstrated that some important physical principles, like strong symmetry and decoherence, provide sufficient structure to admit Darwinism. Finally, we described a mechanism for Darwinism in Spekkens' Toy Model, showing that such broadcasting of classical information is not unique to quantum theory.

Our results show that objectivity may arise through a Darwinism process in non-classical theories other than quantum theory, adding to the results of Scandolo et al. [50], which analyzed objectivity through State Spectrum Broadcast in GPTs. Complementing a previous result on decoherence [23], our work also shows the important role of entanglement in allowing for the emergence of classicality, suggesting the counterintuitive principle that locally non-classical theories must also allow for shared non-classicality to allow for the emergence of classical objectivity. In addition, our results show that strongly symmetric and unrestricted GPTs (that is, those endowed with sufficient structure to allow for reversible classical computation and the encoding and decoding of classical information) have sufficient structure for Darwinism to be present.

Finally, although this work has been presented with a focus on the origins of classical limits, our results also have a bearing on the general foundations of computation [51, 52]. The Darwinism-enabling fan-out transformation (eq. (1)) has its origins in classical logic circuits, connecting the output of one logic gate to the input of many others, and its quantum analogue plays a role in the design of quantum neural networks [53]. The conclusions of this article therefore imply that such computation also necessitates the existence of entanglement if the theory is not strictly classical, while identifying potential sufficient structure (e.g. no-restriction and strong symmetry) to guarantee that such computation can be performed.

ACKNOWLEDGMENTS

RDB acknowledges funding by São Paulo Research Foundation – FAPESP, through scholarships no. 2016/24162-8 and no. 2019/02221-0. RDB is also thankful to Marcelo Terra Cunha for insightful ideas and discussions, and to IQOQI for the hospitality during the time as guest researcher. MK acknowledges the support of the Vienna Doctoral School in Physics (VDSP) and the support of the Austrian Science Fund (FWF) through the Doctoral Programme CoQuS. MK, AJPG and MPM thank the Foundational Questions Institute and Fetzer Franklin Fund, a donor advised fund of Silicon Valley Community Foundation, for support via grant number FQXi-RFP-1815. This research was supported in part by Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science.

[1] W. H. Zurek. Decoherence, einselection, and the quantum origins of the classical. Rev. Mod. Phys., 75(3):715–775, 2003. doi:10.1103/revmodphys.75.715.
[2] W. H. Zurek. Relative States and the Environment: Einselection, Envariance, Quantum Darwinism, and the Existential Interpretation. Pre-print, arXiv:0707.2832, 2007. URL https://arxiv.org/abs/0707.2832.
[3] W. H. Zurek. Quantum Darwinism. Nature Physics, 5(3):181–188, 2009. doi:10.1038/nphys1202.
[4] F. G. S. L. Brandão, M. Piani, and P. Horodecki. Generic emergence of classical features in quantum darwinism. Nature Communications, 6(1), 2015. doi:10.1038/ncomms8908.
[5] P. A. Knott, T. Tufarelli, M. Piani, and G. Adesso. Generic emergence of objectivity of observables in infinite dimensions. Phys. Rev. Lett., 121:160401, 2018. doi:10.1103/PhysRevLett.121.160401.
[6] R. Horodecki, J. K. Korbicz, and P. Horodecki. Quantum origins of objectivity. Phys. Rev. A, 91(3):032122, 2015. doi:10.1103/PhysRevA.91.032122.
[7] T. P. Le and A. Olaya-Castro. Objectivity (or lack thereof): Comparison between predictions of quantum darwinism and spectrum broadcast structure. Phys. Rev. A, 98(3), 2018. doi:10.1103/physreva.98.032103.
[8] R. Blume-Kohout and W. H. Zurek. Quantum darwinism in quantum brownian motion. Phys. Rev. Lett., 101:240405, 2008. doi:10.1103/PhysRevLett.101.240405.
[9] R. Blume-Kohout and W. H. Zurek. A simple example of “quantum darwinism”: Redundant information storage in many-spin environments. Foundations of Physics, 35(11):1857–1876, 2005. doi:10.1007/s10701-005-7352-5.
[10] C. J. Riedel and W. H. Zurek. Quantum darwinism in an everyday environment: Huge redundancy in scattered photons. Phys. Rev. Lett., 105:020404, 2010. doi:10.1103/PhysRevLett.105.020404.
[11] M. Zwolak, H. T. Quan, and W. H. Zurek. Redundant imprinting of information in nonideal environments: Objective reality via a noisy channel. Phys. Rev. A, 81:062110, 2010. doi:10.1103/PhysRevA.81.062110.
[12] M. Zwolak, H. T. Quan, and W. H. Zurek. Quantum darwinism in a mixed environment. Phys. Rev. Lett., 103:110402, 2009. doi:10.1103/PhysRevLett.103.110402.
[13] P. A. M. Dirac. The Principles of Quantum Mechanics. Oxford University Press, Oxford, 4th edition, 1958.
[14] J. J. Sakurai and J. Napolitano. Modern Quantum Mechanics. Addison-Wesley, 2nd edition, 2011.
[15] M. Schlosshauer. Decoherence, the measurement problem, and interpretations of quantum mechanics. Rev. Mod. Phys., 76(4):1267–1305, 2004. doi:10.1103/RevModPhys.76.1267.
[16] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned. Nature, 299(5886):802–803, 1982. doi:10.1038/299802a0.
[17] L. Hardy. Quantum Theory From Five Reasonable Axioms. Pre-print, arXiv:quant-ph/0101012, 2001. URL https://arxiv.org/abs/quant-ph/0101012.
[18] J. Barrett. Information processing in generalized probabilistic theories. Phys. Rev. A, 75:032304, 2007.
[19] Pre-print, arXiv:0805.3553, 2008. URL https://arxiv.org/abs/0805.3553.
[20] H. Barnum, J. Barrett, M. Leifer, and A. Wilce. Generalized no-broadcasting theorem. Phys. Rev. Lett., 99(24), 2007. doi:10.1103/physrevlett.99.240501.
[21] A. J. P. Garner, O. C. O. Dahlsten, Y. Nakata, M. Murao, and V. Vedral. A framework for phase and interference in generalized probabilistic theories. New J. Phys., 15(9):093044, 2013. doi:10.1088/1367-2630/15/9/093044.
[22] O. C. O. Dahlsten, A. J. P. Garner, and V. Vedral. The uncertainty principle enables non-classical dynamics in an interferometer. Nature Communications, 5(4592), 2014. doi:10.1038/ncomms5592.
[23] J. G. Richens, J. H. Selby, and S. W. Al-Safi. Entanglement is necessary for emergent classicality in all physical theories. Phys. Rev. Lett., 119:080503, 2017. doi:10.1103/PhysRevLett.119.080503.
[24] G. Chiribella, G. M. D’Ariano, and P. Perinotti. Probabilistic theories with purification. Phys. Rev. A, 81(6), 2010. doi:10.1103/physreva.81.062348.
[25] P. Janotta and R. Lal. Generalized probabilistic theories without the no-restriction hypothesis. Phys. Rev. A, 87(5), 2013. doi:10.1103/physreva.87.052131.
[26] H. Barnum, M. P. Müller, and C. Ududec. Higher-order interference and single-system postulates characterizing quantum theory. New J. Phys., 16(12):123029, 2014. doi:10.1088/1367-2630/16/12/123029.
[27] R. W. Spekkens. Evidence for the epistemic view of quantum states: A toy theory. Phys. Rev. A, 75(3):32110, 2007. doi:10.1103/PhysRevA.75.032110.
[28] A. S. Holevo. Bounds for the quantity of information transmitted by a quantum communication channel. Probl. Peredachi Inf., 9(3):177–183, 1973.
[29] M. Zwolak, C. J. Riedel, and W. H. Zurek. Amplification, decoherence and the acquisition of information by spin environments. Scientific Reports, 6(1):25277, 2016. doi:10.1038/srep25277.
[30] T. K. Unden, D. Louzon, M. Zwolak, W. H. Zurek, and F. Jelezko. Revealing the emergence of classicality using nitrogen-vacancy centers. Phys. Rev. Lett., 123:140402, 2019. doi:10.1103/PhysRevLett.123.140402.
[31] M. A. Ciampini, G. Pinna, P. Mataloni, and M. Paternostro. Experimental signature of quantum darwinism in photonic cluster states. Phys. Rev. A, 98:020101, 2018. doi:10.1103/PhysRevA.98.020101.
[32] M. D. Mazurek, M. F. Pusey, K. J. Resch, and R. W. Spekkens. Experimentally bounding deviations from quantum theory in the landscape of generalized probabilistic theories. Pre-print, arXiv:1710.05948, 2017. URL https://arxiv.org/abs/1710.05948.
[33] G. Chiribella, G. M. D’Ariano, and P. Perinotti. Informational derivation of quantum theory. Phys. Rev. A, 84(1):012311, 2011. doi:10.1103/PhysRevA.84.012311.
[34] Ll. Masanes and M. P. Müller. A derivation of quantum theory from physical requirements. New J. Phys., 13(6):063001, 2011. doi:10.1088/1367-2630/13/6/063001.
[35] M. P. Müller and Ll. Masanes. Three-dimensionality of space and the quantum bit: an information-theoretic approach. New J. Phys., 15(5):053040, 2013. doi:10.1088/1367-2630/15/5/053040.
[36] M. P. Müller. Probabilistic theories and reconstructions of quantum theory. Les Houches 2019 lecture notes, arXiv:2011.01286, 2020. URL https://arxiv.org/abs/2011.01286.
[37] R. Webster. Convexity. Oxford University Press, Oxford, 1994. ISBN 0-19-853147-8.
[38] G. Chiribella and C. M. Scandolo. Operational axioms for diagonalizing states. Electronic Proceedings in Theoretical Computer Science, 195:96–115, 2015. doi:10.4204/EPTCS.195.8.
[39] M. P. Müller and C. Ududec. Structure of reversible computation determines the self-duality of quantum theory. Phys. Rev. Lett., 108:130401, 2012. doi:10.1103/PhysRevLett.108.130401.
[40] S. W. Al-Safi and J. Richens. Reversibility and the structure of the local state space. New J. Phys., 17(12):123001, 2015. doi:10.1088/1367-2630/17/12/123001.
[41] S. Massar and M. K. Patra. Information and communication in polygon theories. Phys. Rev. A, 89:052124, 2014. doi:10.1103/PhysRevA.89.052124.
[42] P. Janotta and H. Hinrichsen. Generalized probability theories: what determines the structure of quantum theory? Journal of Physics A: Mathematical and Theoretical, 47(32):323001, 2014. doi:10.1088/1751-8113/47/32/323001.
[43] B. Coecke and C. Heunen. Pictures of complete positivity in arbitrary dimension. Information and Computation, 250:50–58, 2016. doi:10.1016/j.ic.2016.02.007.
[44] D. Gross, M. P. Müller, R. Colbeck, and O. C. O. Dahlsten. All reversible dynamics in maximally non-local theories are trivial. Phys. Rev. Lett., 104:080402, 2010. doi:10.1103/PhysRevLett.104.080402.
[45] H. Barnum, M. P. Müller, and C. Ududec. Higher-order interference and single-system postulates characterizing quantum theory. New J. Phys., 16(12):123029, 2014. doi:10.1088/1367-2630/16/12/123029.
[46] M. F. Pusey. Stabilizer Notation for Spekkens’ Toy Theory. Foundations of Physics, 42(5):688–708, 2012. doi:10.1007/s10701-012-9639-7.
[47] B. Coecke, B. Edwards, and R. W. Spekkens. Phase Groups and the Origin of Non-locality for Qubits. Electronic Notes in Theoretical Computer Science, 270(2):15–36, 2011. doi:10.1016/j.entcs.2011.01.021.
[48] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani, and S. Wehner. Bell nonlocality. Reviews of Modern Physics, 86(2):419–478, 2014. doi:10.1103/RevModPhys.86.419.
[49] D. Gottesman. The Heisenberg Representation of Quantum Computers. In Proceedings of the XXII International Colloquium on Group Theoretical Methods in Physics, pages 32–43, 1999. URL https://arxiv.org/abs/quant-ph/9807006.
[50] C. M. Scandolo, R. Salazar, J. K. Korbicz, and P. Horodecki. The origin of objectivity in all fundamental causal theories. Pre-print, arXiv:1805.12126, 2018. URL https://arxiv.org/abs/1805.12126.
[51] C. M. Lee and J. Barrett. Computation in generalised probabilisitic theories. New J. Phys., 17(17), 2015. doi:10.1088/1367-2630/17/8/083001.
[52] A. J. P. Garner. Interferometric Computation Beyond Quantum Theory. Foundations of Physics, 2018. doi:10.1007/s10701-018-0142-7.
[53] K. H. Wan, O. C. O. Dahlsten, H. Kristjánsson, R. Gardner, and M. S. Kim. Quantum generalisation of feedforward neural networks. npj Quantum Information, 3(1), 2017. doi:10.1038/s41534-017-0032-4.
[54] P. Janotta, C. Gogolin, J. Barrett, and N. Brunner. Limits on nonlocal correlations from the structure of the local state space. New J. Phys., 13(6):063024, 2011. doi:10.1088/1367-2630/13/6/063024.
[55] Ll. Masanes, M. P. Müller, R. Augusiak, and D. Pérez-García. Existence of an information unit as a postulate of quantum theory. Proceedings of the National Academy of Sciences of the United States of America, 110(41):16373–16377, 2013. doi:10.1073/pnas.1304884110.
[56] L. Hardy. Disentangling nonlocality and teleportation. Pre-print, arXiv:quant-ph/9906123, 1999. URL https://arxiv.org/abs/quant-ph/9906123.
[57] A. J. P. Garner. Phase and interference phenomena in generalised probabilistic theories. PhD thesis, University of Oxford, 2015. URL https://ora.ox.ac.uk/objects/uuid:c0017faf-cbe0-4365-a1ff-080fa031d006.
[58] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, Cambridge, 2000. ISBN 0521635039.

APPENDIX

Appendix A: The pentagon state space

We present an example of a state space [41] (brought to our attention in Janotta et al. [54]) without an MCI-frame (definition 1), and illustrate its counterintuitive properties.

Example 3 (Pentagon state space). Consider a GPT system with states in $A = \mathbb{R}^3$ such that $\Omega_A$ is a regular pentagon (with pure states being the vertices), and with a dual space of effects $E_A$ subject to the no-restriction hypothesis. Such a system admits a self-dual identification between $A_+$ and $E_A$ in the following sense: for each vertex $\omega_j$, there is a unique related effect $e_j \in E_A$ with $e_j \le u_A$ such that $e_j(\nu) = 1 \Rightarrow \nu = \omega_j$; that is, these effects are in one-to-one correspondence with the vertices, and those are exactly the refined effects.

Let us label the vertices clockwise. The maximal frame is of size two: any pair of vertices whose indices differ by 2 (modulo 5) lie on “opposite” sides of the pentagon, and form such a frame. Then, both $\{\omega_0, \omega_2\}$ and $\{\omega_0, \omega_3\}$ are maximal frames; their states are distinguished, for example, by $M = \{e_0, u_A - e_0\}$. However, $u_A - e_0$ is not refined: we have that $u_A - e_0 = \alpha(e_2 + e_3)$ with $\alpha > 1/2$, so one could also perform the refined measurement $M' = \{e_0, \alpha e_2, \alpha e_3\}$ to distinguish the states in each frame. (In this case, the equation $e_i(\omega_k) = \delta_{ik}$ is not valid, but the idea of single shot measurements allowing for the detection of the states still holds: if outcome $\alpha e_2$ or $\alpha e_3$ is obtained, one knows that $\omega_0$ was not prepared.) A similar conclusion applies to every maximal frame in this theory: they all fail to be MCI-frames. All unrestricted GPT systems built from regular polygon state spaces with an odd number of vertices also lack MCI-frames.

Suppose someone is promised to receive, with probability $P(i)$, the state $\omega_i$ from one of the frames $\{\omega_0, \omega_2\}$ or $\{\omega_0, \omega_3\}$ and should guess the value of $i$. Then, the probability of success when using measurement $M = \{e_0, u_A - e_0\}$ is given by

$p^{M}_{\rm success} = P(0) + (1/2)\,[P(2) + P(3)]$,

since one can always guess correctly if the outcome related to $e_0$ clicks but must make a random guess for $i = 2$ or $i = 3$ if the other outcome clicks.
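The quantities in this example can be checked numerically. The sketch below assumes a common parametrization of polygon models (vertices on a circle of squared radius $\sec(\pi/5)$ at height 1 in $\mathbb{R}^3$, effects proportional to the same vectors); these coordinates, the helper names, and the prior $P(0) = 1/2$, $P(2) = P(3) = 1/4$ are illustrative assumptions, not taken from the text. It confirms $e_0(\omega_2) = e_0(\omega_3) = 0$, the decomposition $u_A - e_0 = \alpha(e_2 + e_3)$ with $\alpha = 1/(2\cos(\pi/5)) \approx 0.618 > 1/2$, and evaluates the guessing probability obtained with the coarse measurement and with the refined one.

```python
import math

N = 5                                   # number of pentagon vertices
R2 = 1.0 / math.cos(math.pi / N)        # assumed squared radius, r^2 = sec(pi/5)
R = math.sqrt(R2)
UNIT = (0.0, 0.0, 1.0)                  # unit effect u_A

def state(j):
    """Vertex omega_j of the pentagon, embedded at height 1 in R^3."""
    angle = 2 * math.pi * j / N
    return (R * math.cos(angle), R * math.sin(angle), 1.0)

def effect(j):
    """Refined effect e_j, normalized so that e_j(omega_j) = 1."""
    angle = 2 * math.pi * j / N
    s = 1.0 / (1.0 + R2)
    return (s * R * math.cos(angle), s * R * math.sin(angle), s)

def prob(e, w):
    """Outcome probability e(omega), computed as a Euclidean dot product."""
    return sum(a * b for a, b in zip(e, w))

def scale(c, e):
    return tuple(c * a for a in e)

def sub(e, f):
    return tuple(a - b for a, b in zip(e, f))

ALPHA = 1.0 / (2.0 * math.cos(math.pi / N))   # about 0.618, larger than 1/2

def success(measurement, guesses, prior):
    """Guessing probability; guesses[k] maps a label to the probability of
    announcing that label when outcome k of the measurement occurs."""
    total = 0.0
    for i, p_i in prior.items():
        for e, guess in zip(measurement, guesses):
            total += p_i * prob(e, state(i)) * guess.get(i, 0.0)
    return total

e0, e2, e3 = effect(0), effect(2), effect(3)
coarse = [e0, sub(UNIT, e0)]                          # M  = {e_0, u_A - e_0}
refined = [e0, scale(ALPHA, e2), scale(ALPHA, e3)]    # M' = {e_0, a e_2, a e_3}
prior = {0: 0.5, 2: 0.25, 3: 0.25}                    # illustrative prior
p_coarse = success(coarse, [{0: 1.0}, {2: 0.5, 3: 0.5}], prior)
p_refined = success(refined, [{0: 1.0}, {2: 1.0}, {3: 1.0}], prior)
```

With this prior, `p_coarse` evaluates to $P(0) + \tfrac12[P(2) + P(3)] = 0.75$, while `p_refined` is strictly larger because each of the two non-$e_0$ outcomes is guessed correctly with weight $\alpha$ rather than $1/2$.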
However, by using $M' = \{e_0, \alpha e_2, \alpha e_3\}$ one has

$p^{M'}_{\rm success} = P(0) + \alpha\,[P(2) + P(3)] > p^{M}_{\rm success}$,

since $\alpha > 1/2$. We see that the refined measurement $M'$ allows for a higher probability of distinguishing between a set of states which is larger than the maximal frames. In other words, the refined measurement $M'$ can distinguish slightly more than 1 bit, even though the maximal frame has size 2 and this measurement $M'$ coarse-grains to the distinguishing measurement $M$. If one understands coarse-graining as erasing of classical information, $p^{M'}_{\rm success} > p^{M}_{\rm success}$ suggests that there was more classical information available than can be encoded onto a maximal frame. Such a phenomenon occurs for every unrestricted GPT built from a polygon state space with an odd number of vertices (see also Massar and Patra [41]).

This difference between the amount of classical information that can be encoded into a GPT system and the size of a maximal frame is a violation of a principle that has been called “No Simultaneous Encoding” [55]. By explicitly only allowing MCI-frames (definition 1) to characterize the classical information to be spread by an ideal Darwinism process (definition 5), we ensure that no such over-encoding occurs in the systems considered in this article.

Appendix B: Pure separable effects

Lemma 9. A pure effect is separable if and only if it is a product of pure effects.

Proof. Only one direction is non-trivial: suppose that the effect $e_{1,2,\ldots,N}$ is separable; then it can be written

$e_{1,2,\ldots,N} = \sum_i e^{(1)}_i \odot \ldots \odot e^{(N)}_i$   (B1)

where the $e^{(j)}_i$ are suitable local effects. Since $e_{1,2,\ldots,N}$ is pure, we must have $e^{(1)}_i \odot \ldots \odot e^{(N)}_i \propto e_{1,2,\ldots,N}$ for all $i$. Hence these product effects are all multiples of each other, and $e_{1,2,\ldots,N} = e^{(1)} \odot \ldots \odot e^{(N)}$ for suitable local effects $e^{(j)}$. If we could non-trivially decompose any of the $e^{(j)}$, then we could decompose $e_{1,2,\ldots,N}$, which would contradict its purity.

Appendix C: Spekkens’ Toy Model

In this appendix, we briefly review some details of Spekkens’ Toy Model [27] (STM) and its GPT extensions [21, 25, 56].

1. Overview

STM is essentially a classical hidden-variable model on which an epistemic restriction is imposed: no more than half the information (as measured in bits) can be known. The simplest (and for our purposes, only) single system in this framework then consists of a so-called ontic hidden variable with four possibilities $\{1, 2, 3, 4\}$. Valid questions about such a system can only narrow down the state to at best two possibilities (e.g. “is the system in $1 \vee 2$ (read ‘1 or 2’)?”) for both affirmative and negative answers to the question. This yields three sets of mutually exclusive questions of the form “is the system in [X]” which we label as follows:

$\langle x+| := 1 \vee 3$, $\langle x-| := 2 \vee 4$,
$\langle y+| := 1 \vee 4$, $\langle y-| := 2 \vee 3$,
$\langle z+| := 1 \vee 2$, $\langle z-| := 3 \vee 4$.   (C1)

By the rules of STM, whenever such a question is asked, the ontic state must be randomized within the supporting set of states consistent with the answer to the question. For example, an affirmative answer to question $\langle z+|$ will randomize the ontic state of the system to 1 or 2. This randomization ensures we cannot find the exact ontic state, say, by asking two different questions in a row, while maintaining the property that if we ask the same question twice in a row, we will get the same answer. Thus, one may define a set of maximum-knowledge epistemic states in one-to-one correspondence with the affirmative answer to these questions, labeled, e.g., as $|x+\rangle = 1 \vee 3$. (STM also admits a “unit” question $u$ := “is the system in $1 \vee 2 \vee 3 \vee 4$?” to which the answer is always affirmative; similarly, there is also a maximally mixed state, in which the ontic state can take any value with the same probability.)

The ontic state of a composite system is formed by taking the Cartesian product of each constituent system’s ontic state (written for $a$ and $b$ as $ab$). The allowed epistemic states in this context then are those that satisfy the epistemic restriction both on the entire system, and also on any subsystem thereof. Thus, a two-system epistemic state must admit at least four ontic possibilities. In addition to the Cartesian product of single system states, this also allows for “entangled” states, such as $11 \vee 22 \vee 33 \vee 44$, where even though the local marginal states are maximally mixed, perfect correlation is guaranteed if the same measurement is made on both systems. On the other hand, not every set of four ontic possibilities is allowed: a state for which an affirmative answer to the $\langle z+|$ measurement on the second system would leave the first system definitely in a single ontic state is forbidden, since this violates the epistemic restriction. It can thus be seen that STM is self-dual by construction: every maximum-knowledge measurement outcome can be uniquely identified with a maximum-knowledge epistemic state [46].

Transformations in the theory are performed by permuting the underlying hidden variable, in such a way that no valid epistemic state is taken to an invalid state. For single systems, every permutation is valid, but this is not the case for multipartite systems. Since these permutations form a finite group, when searching for a transformation that achieves a desired outcome (e.g. exhibits Darwinism), one can (with computer assistance) exhaustively search through possible transformations to find one that achieves the desired aims, or otherwise rule out its existence entirely [57]. However, by formalizing the similarity between STM and the stabilizer subset of quantum mechanics, Pusey [46] enables an elegant sufficient condition for the existence of a transformation, which we will subsequently describe.

2. GPT Extension

First, however, let us remark on the extension of STM into the GPT framework. In particular, STM defines a discrete state space with a finite number of states, so in order to treat it as a GPT, we must make it continuous. This is done in the obvious way: we treat questions such as “is the system in $1 \vee 2$?” as effects, and then admit all convex combinations of such effects. A complete (i.e. at least one question answers in the affirmative for any state) and mutually exclusive (i.e. no more than one question answers in the affirmative) set of questions maps to a set of effects that form a normalized measurement (i.e. will sum to the unit effect). Meanwhile, each set of epistemic states of maximum knowledge with no overlap in their ontic variable support (e.g. $\{1 \vee 2, 3 \vee 4\}$) forms a maximal frame, in which the maximum-knowledge epistemic states are extremal. We then allow convex combinations of such states as “mixed” states, yielding a theory dubbed STM–GPT. The set of allowed transformations on the theory is then defined as exactly those allowed on the (non-GPT) STM, and due to linearity, each of these uniquely extends into a transformation on the STM–GPT state space. (This implies that not all symmetries of the state space of STM–GPT belong to the group of allowed transformations, $G$. For instance, the rotation about the $z$-axis which permutes $|y+\rangle \mapsto |x+\rangle \mapsto |y-\rangle \mapsto |x-\rangle \mapsto |y+\rangle$ is a symmetry of the octahedron but is not an allowed transformation in the ontic state space; see figure 3.)

One representation of a single system in STM–GPT in $\mathbb{R}^4$ is to identify each ontic state with a Cartesian vector,

$\vec{o}_1 := (1, 0, 0, 0)^T$, $\vec{o}_2 := (0, 1, 0, 0)^T$, $\vec{o}_3 := (0, 0, 1, 0)^T$, $\vec{o}_4 := (0, 0, 0, 1)^T$,

and then write each epistemic state $x \vee y$ as the vector $\frac{1}{2}(\vec{o}_x + \vec{o}_y)$ (see fig. 3). Here $A = \mathbb{R}^4$ and $\Omega_A$ is the set of convex combinations of such vectors (geometrically: this is the octahedron formed by connecting the midpoint of every line in a tetrahedron [21]). As observed in Janotta and Lal [25], the unrestricted dual of this space is cubic (i.e. a gbit), but STM does not follow the no-restriction hypothesis. Instead, the space of effects can be represented by exactly the same vector space (carrying forward the self-duality-by-construction of STM), where the self-dualizing inner product $\langle e, \rho \rangle := 2\, e \cdot \rho$ is directly proportional to the Euclidean inner product on the real vector spaces.

Figure 3. Normalized states of a Spekkens’ bit. The tetrahedron is the normalized slice of $\mathbb{R}^4$ corresponding to the underlying classical ontic variable, with basis states $\{\vec{o}_1, \vec{o}_2, \vec{o}_3, \vec{o}_4\}$. The pure epistemic states correspond to the half-way points between these ontic states. The valid epistemic states of the theory are these states’ octahedral convex hull.

An analogous representation can also be formed for $n$ STM–GPT systems in $\mathbb{R}^{4^n}$. Take the Cartesian product $\{\vec{o}_1, \vec{o}_2, \vec{o}_3, \vec{o}_4\}^{\otimes n}$ to find the set of ontic states, and likewise define the epistemic set as valid (as per above) mixtures thereof. For example, a two-system epistemic state with four ontic possibilities $a_1 b_1 \vee a_2 b_2 \vee a_3 b_3 \vee a_4 b_4$ is represented here as $\frac{1}{4} \sum_{k=1}^{4} \vec{o}_{a_k} \otimes \vec{o}_{b_k}$. Meanwhile, product states of lower-dimensional STM–GPT systems are simply found by the tensor product. For example, $(a \vee b) \otimes (c \vee d) \equiv ac \vee ad \vee bc \vee bd$ satisfies $\frac{1}{2}(\vec{o}_a + \vec{o}_b) \otimes \frac{1}{2}(\vec{o}_c + \vec{o}_d) = \frac{1}{4}(\vec{o}_a \otimes \vec{o}_c + \vec{o}_a \otimes \vec{o}_d + \vec{o}_b \otimes \vec{o}_c + \vec{o}_b \otimes \vec{o}_d)$. This also allows for a self-dualizing inner product: $\langle \vec{e}, \vec{\rho} \rangle := 2^n\, \vec{e} \cdot \vec{\rho}$.
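As a sanity check on this representation, the following sketch builds the ontic basis vectors, the pure epistemic vectors (midpoints of two ontic vectors), and the self-dualizing inner product, using exact fractions. The function names are hypothetical, and the normalization factors ($\frac{1}{2}$ for epistemic vectors, $2^n$ for the inner product on $n$ systems) are the ones assumed in this sketch.

```python
from fractions import Fraction

def ontic(k):
    """Cartesian basis vector for ontic state k in {1, 2, 3, 4}."""
    return [Fraction(int(i == k)) for i in range(1, 5)]

def epistemic(a, b):
    """Pure epistemic state 'a or b': the midpoint of two ontic vectors."""
    return [(x + y) / 2 for x, y in zip(ontic(a), ontic(b))]

def kron(u, v):
    """Tensor product of two vectors, flattened in row-major order."""
    return [x * y for x in u for y in v]

def inner(e, rho, n=1):
    """Self-dualizing inner product on n systems: 2^n times the dot product."""
    return 2 ** n * sum(x * y for x, y in zip(e, rho))

z_plus, z_minus = epistemic(1, 2), epistemic(3, 4)
x_plus = epistemic(1, 3)

# Single system: an effect returns 1 on its own state, 0 on the opposite
# state of the same frame, and 1/2 on a state from a different frame.
p_same = inner(z_plus, z_plus)
p_opposite = inner(z_plus, z_minus)
p_other_axis = inner(z_plus, x_plus)

# Two systems: product states are tensor products, and the inner product
# factorizes into the single-system probabilities.
product_state = kron(z_plus, x_plus)
p_product = inner(kron(z_plus, z_plus), product_state, n=2)
```

The value of `p_product` equals `p_same * p_other_axis`, as expected for a product effect evaluated on a product state.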

3. Stabilizer Formalism

Stabilizer groups originate in group theory, but have been adapted for use in quantum theory in the context of error-correcting codes and measurement-based quantum computation, as they provide concise ways to describe certain high-dimensional quantum states. Essentially, a transformation $T$ is said to stabilize a state $|\psi\rangle$ if $T|\psi\rangle = |\psi\rangle$ [58]. Listing enough simultaneous stabilizing transformations may be enough to uniquely define a state (up to global phase): for example, the only two-qubit state stabilized by both $\sigma_x \otimes \sigma_x$ and $\sigma_z \otimes \sigma_z$ is the Bell state $|\Psi\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. The stabilizer subset of quantum theory consists of exactly the $n$-qubit states that can be so described, when the stabilizers are taken from the Pauli group $P_n := \{\pm\mathbb{1}, \pm\sigma_x, \pm i\sigma_x, \pm\sigma_y, \pm i\sigma_y, \pm\sigma_z, \pm i\sigma_z\}^{\otimes n}$.

STM(–GPT) shares many similarities with (the convex hull of) quantum stabilizer states [46]. For instance, a qubit has six distinct pure stabilizer states (stabilized by the Hermitian elements $\pm\sigma_x$, $\pm\sigma_y$, and $\pm\sigma_z$). Meanwhile, for an STM bit (using the GPT representation above), we can similarly define three “observable” matrices:

$X := \mathrm{diag}(1, -1, 1, -1)$,
$Y := \mathrm{diag}(1, -1, -1, 1)$,
$Z := \mathrm{diag}(1, 1, -1, -1)$,

such that for each measurement, there is a unique (pure) epistemic state corresponding to the $+1$ and $-1$ eigenvector from each (e.g. $X|x+\rangle = |x+\rangle$), and this covers all pure epistemic states. We can identify each of $X$, $Y$ and $Z$ with an ontic state permutation, along with an identity element $I := \mathrm{diag}(1, 1, 1, 1)$ identified with the trivial permutation. Then $\{I, X, Y, Z\}$ together with matrix multiplication is the Klein four-group $V_4$ and is isomorphic to the corresponding permutation subgroup.

The Cartesian product of these matrices with $\mathbb{Z}_2 = \{+1, -1\}$ forms the toy stabilizer group $G := \mathbb{Z}_2 \otimes V_4 = \{\pm I, \pm X, \pm Y, \pm Z\}$. Unlike the Pauli group, this group is Abelian with $XZ = ZX = Y$ (cf. $\sigma_x \sigma_z = -\sigma_z \sigma_x = -i\sigma_y$). For $n$-bit systems, we denote the application of $T \in V_4$ to the $k$th system as $T_k := I^{\otimes (k-1)} \otimes T \otimes I^{\otimes (n-k)}$. Finally, let us define the map $m : V_4^n \to P_n$ that makes an obvious identification between STM stabilizers and quantum stabilizers (e.g. $m : I_1 X_2 \mapsto \mathbb{1} \otimes \sigma_x$).

Now we may use the result of Pusey [46]: if a set of independent quantum stabilizers $m(R_1), m(R_2), \ldots, m(R_k)$ describes a unique quantum state, then $R_1, R_2, \ldots, R_k$ describes a unique epistemic state in STM. Moreover, if a map on a set of quantum stabilizers $T : m(A_1) \mapsto m(B_1), \ldots, m(A_k) \mapsto m(B_k)$ defines a unitary quantum transformation and $m(A_1), \ldots, m(A_k)$ are a canonical generating set, then $A_1 \mapsto B_1, \ldots, A_k \mapsto B_k$ defines a valid STM transformation. The full definition of canonical generating set is complicated, but for our purposes, it suffices to note that $\{X_1, \ldots, X_k, Z_1, \ldots, Z_k\}$ is one such set. With the aid of these sets, we can construct the FAN transformation (defining how it acts on each $X_k$ / $Z_k$) that broadcasts information about the measurement $\{\langle z+|, \langle z-|\}$ to the environment (see equation (23)).
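The group-theoretic statements about the toy observables are easy to verify directly. The sketch below stores each diagonal matrix as the tuple of its diagonal entries (an implementation choice of this example), checks closure, commutativity, $XZ = ZX = Y$, and the eigenvector relation for $|x\pm\rangle$ written as the midpoints $\frac{1}{2}(\vec{o}_1 + \vec{o}_3)$ and $\frac{1}{2}(\vec{o}_2 + \vec{o}_4)$, and contrasts this with the anticommutation of the Pauli matrices $\sigma_x$ and $\sigma_z$.

```python
from itertools import product

# Diagonal "observable" matrices, stored as tuples of their diagonal entries.
V4 = {
    "I": (1, 1, 1, 1),
    "X": (1, -1, 1, -1),
    "Y": (1, -1, -1, 1),
    "Z": (1, 1, -1, -1),
}

def mul(a, b):
    """Product of two diagonal matrices (entrywise product of diagonals)."""
    return tuple(p * q for p, q in zip(a, b))

def act(d, v):
    """Apply a diagonal matrix to a vector."""
    return tuple(p * q for p, q in zip(d, v))

elements = set(V4.values())

# Closure and commutativity of {I, X, Y, Z} under matrix multiplication.
closed = all(mul(a, b) in elements for a, b in product(elements, repeat=2))
abelian = all(mul(a, b) == mul(b, a) for a, b in product(elements, repeat=2))
# Every element is its own inverse, as in the Klein four-group.
involutions = all(mul(a, a) == V4["I"] for a in elements)

# Epistemic states |x+> = (o_1 + o_3)/2 and |x-> = (o_2 + o_4)/2.
x_plus = (0.5, 0.0, 0.5, 0.0)
x_minus = (0.0, 0.5, 0.0, 0.5)
stabilized = act(V4["X"], x_plus) == x_plus
anti_stabilized = act(V4["X"], x_minus) == tuple(-c for c in x_minus)

# Contrast: the qubit Pauli matrices sigma_x and sigma_z anticommute.
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

xz = matmul2(SX, SZ)
zx = matmul2(SZ, SX)
paulis_anticommute = xz == [[-c for c in row] for row in zx]
```

All the boolean checks evaluate to `True`, and `mul(V4["X"], V4["Z"])` returns the diagonal of $Y$.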

4. STM is not strongly symmetric, nor does it have a decoherence map

In this section, we show that stabilizer quantum theory and (GPT-)STM fail to admit a decoherence map (in the sense of Richens et al. [23], as adapted in definition 9), and similarly neither theory obeys strong symmetry.

Lemma 10. Stabilizer quantum states do not admit a decoherence map.

Proof. By counterexample. Consider the classical 3-bit control-control-NOT gate that flips the third bit only if the first two bits are in state 1, and otherwise does nothing. This corresponds to a Toffoli gate in the quantum circuit, which is not a member of the Clifford group [49], and hence not a valid quantum stabilizer transformation. This violates a condition of definition 9: there is a classical reversible transformation that cannot be induced by a transformation in the theory.

Analogously, there is a classical transformation that cannot be implemented in STM as well:

Lemma 11. Spekkens’ Toy Model does not admit a decoherence map.

Proof.