On the Origin of Quantum Uncertainty
Christoph Adami
Department of Physics and Astronomy, Michigan State University
Department of Microbiology and Molecular Genetics, Michigan State University
* [email protected]
Abstract
I propose that quantum uncertainty is a manifestation of the indeterminism inherent in mathematical logic.
Introduction
Quantum physics is strange. When we measure a quantum state, the result is uncertain unless we previously prepared the state exactly the way we were going to measure it. Now, when we measure the state of a classical system without prior knowledge of its state the result is also uncertain, but in a different way. If many systems $S_i$ are prepared in the same way and we measure the state of each one, all of our measurement devices $M_i$ will show the same result. Indeed, if the measurement devices are perfect, the only variation across devices stems from inaccuracies in the preparation of the systems $S_i$.

Quantum uncertainty is different: even if Alice, say, diligently prepared all systems $S_i$ in the same exact way, Bob’s measurement results could be highly uncertain, in the sense that he will observe a probabilistic distribution of results when the preparation had no such uncertainty. Where does this quantum uncertainty come from? What process could possibly create randomness from certainty?

Let’s first take a look at how a classical measurement is supposed to work. You have a system $S$ that can take on different states $s$ (for ease of discussion we’ll assume that the states are discrete), and a device $M$ that can take on states $m$. In order for the measurement device’s state to reflect the system’s state, the two need to become correlated. That is exactly what a good measurement operation must do, and that operation is essentially a “copy” operation: if the measurement device is prepared in the state $m_0$ (some known calibrated state), then the measurement operation $O$ should bring

$$s \times m_0 \xrightarrow{\;O\;} s \times m_s . \qquad (1)$$

Here, $m_s$ is the “copy” of $s$. It doesn’t have to “look” like $s$, but ideally there exists a one-to-one relationship between them (see Fig. 1).

Quantum Measurement
The quantum measurement process is modeled after the classical idea, of course, but there is a fundamental difference: quantum states cannot be copied perfectly [1, 2]. Because of the linearity of quantum mechanics, the best we can do is to construct “optimal cloning machines” [3] that copy quantum states as accurately as is allowed by the laws of physics, but when measuring arbitrary states the measurement device will always be uncertain.

Figure 1: In a classical measurement, the state of the system $S$ is “copied” onto the measurement device so that the latter reflects the former.

Quantum mechanics provides us with a set of rules that allows us to predict this quantum uncertainty very accurately, and I will briefly outline those rules here because it is essential that we understand the respective roles of the system that is being measured, $S$, and the measurement device $M$. In most accounts of measurement theory there is an asymmetry between $S$ and $M$ that will turn out to be artificial, but let’s first make sure we distinguish the two. If we prepare a quantum system $S$ (I will focus on qubits in this essay for simplicity, but all arguments can be extended to arbitrary dimension) at an angle $\theta$ with respect to the measurement device $M$,

$$|\Psi\rangle_S = \cos(\theta)\,|0\rangle_S + \sin(\theta)\,|1\rangle_S \qquad (2)$$

and let it interact with a measurement ancilla prepared in the state $|0\rangle_M$ using the unitary measurement operator $U$, the joint system becomes

$$U|\Psi\rangle_S|0\rangle_M = \cos(\theta)\,|0\rangle_S|0\rangle_M + \sin(\theta)\,|1\rangle_S|1\rangle_M . \qquad (3)$$

To quantify the statistics of measurement, all we need to do is study the density matrix of the measurement device. In the present discussion we attempt to follow the system’s wave function because the time evolution of a quantum system is deterministic: it evolves in a unitary manner so that the joint probability distribution of a closed system is conserved.
Probabilities emerge when we focus on subsystems of an entangled whole. For example, the density matrix of the measurement device $\rho_M$ is a mixed state even though the joint system of quantum state (including its preparation) and measurement device is pure. It is obtained by tracing the joint density matrix over the quantum system’s Hilbert space:

$$\rho_M = \cos^2(\theta)\,|0\rangle_M\langle 0| + \sin^2(\theta)\,|1\rangle_M\langle 1| = \begin{pmatrix} \cos^2(\theta) & 0 \\ 0 & \sin^2(\theta) \end{pmatrix} . \qquad (4)$$

Equation (4) tells us that Bob will observe state 0 with probability $\cos^2(\theta)$ while he will record 1 with probability $\sin^2(\theta)$. The states 0 and 1 are (it goes without saying) placeholders for physical qubits, which could be spin-1/2 particles or the polarization states of a photon. For example, if the angle $\theta$ represents the rotation that a photon’s polarization undergoes before it is detected by Bob, then the correlation between Alice’s state preparation and Bob’s detector is given by $\langle M_A M_B \rangle = \cos(\theta)$: if $\theta = 90^\circ$, for example, Alice’s measurement results (we can safely assume that Alice prepared her quantum states using measurements) and Bob’s will be completely uncorrelated.

Footnote: This minimum uncertainty, required so as not to violate the no-cloning theorem, is, incidentally, the reason for the existence of Hawking radiation in black holes [4].
Footnote: To keep things simple, we will ignore complex phases here.
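The partial trace that leads to Eq. (4) can be checked with a few lines of plain Python. This is a minimal sketch, assuming real amplitudes (no complex phases) and the entangled state of Eq. (3); the names `joint_state` and `reduced_density_matrix_M` are illustrative and do not appear in the essay.

```python
import math

def joint_state(theta):
    """Amplitudes of the state in Eq. (3), keyed by (system bit, device bit)."""
    return {(0, 0): math.cos(theta), (1, 1): math.sin(theta)}

def reduced_density_matrix_M(state):
    """Trace out the system S to get the 2x2 matrix rho_M (real amplitudes assumed)."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for (s1, m1), a1 in state.items():
        for (s2, m2), a2 in state.items():
            if s1 == s2:  # partial trace: keep only terms diagonal in the system index
                rho[m1][m2] += a1 * a2
    return rho

theta = math.pi / 6
rho_M = reduced_density_matrix_M(joint_state(theta))
print(rho_M)  # diagonal entries are cos^2(theta) and sin^2(theta)
```

The off-diagonal entries vanish because the two terms of Eq. (3) carry different system labels, which is why the device alone looks like a classical mixture.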
Figure 2: In quantum measurement, system and device are entangled and measuring each other.

[...] an entirely different Hilbert space, one that is twice as large as either the state’s or the device’s system. After the interaction, it takes twice as many bits to specify the final state, and there is just not enough “room” in the measurement device to describe that joint space: the device simply doesn’t have enough bits. I will argue here that any system (classical or quantum) that is forced to respond to a question without having enough bits to encode the answer will respond with some bits of randomness that are a consequence of undecidability. To make this connection, let me introduce undecidability more formally.
Undecidability
In 1931, the Austrian mathematician Kurt Gödel destroyed Hilbert’s dream of a “mechanical theorem solver”, the idea that it would one day be possible to construct an algorithm that could automatically prove all correct theorems. By formulating his eponymous “Incompleteness Theorem”, Gödel showed that there cannot be a consistent set of axioms that can be used to prove all correct statements about natural numbers, implying that there are true statements about these numbers that will remain unprovable within the axiomatic system. Shortly after, Turing applied this sort of thinking to computer science [6] by showing that a computer program that can determine whether any particular computer program will “halt” (meaning, terminate and issue a result) cannot exist: the halting problem is undecidable.

One way to prove this result is to assume that such a program exists, and follow this assumption to a logical contradiction. It is important that we understand this contradiction for the arguments that follow. Suppose there is a function $F(s, m)$ that has two arguments: an input sequence $s$ and a program $m$. The function $F$ is supposed to return 1 if $m$ halts on $s$, and zero otherwise: it is a halting algorithm. Now imagine a program $M$ that takes a program $m$ and uses that program as its “data”: $M(m, m)$. Let’s also say that $M(m, m)$ does the opposite of $F(m, m)$, that is, it returns 0 if $m$ halts on its own code $m$ and 1 otherwise. Then by using a version of Cantor’s diagonal argument it is now clear that the program $M$ cannot be among the set of programs $m$ (even though they are all enumerated), therefore the function $F(s, m)$ cannot exist.

I’m sure the reader has noted my surreptitious use of $s$ and $m$ in these definitions, mirroring the notations for “system” and “measurement device” used earlier. Essentially, it appears as if the impossibility of a halting algorithm is due to the fact that a program that evaluates itself could be made to be inconsistent with the state of the system as a whole.
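The diagonal step of the argument above can be illustrated on a finite table. This is only a sketch under a strong simplifying assumption: the hypothetical `table[i][j]` stands in for the answer a halting function $F$ would give for program $i$ run on input $j$, which cannot be computed in general. The name `anti_diagonal` is illustrative.

```python
def anti_diagonal(table):
    """Return the row that differs from row i of `table` in position i."""
    return [1 - table[i][i] for i in range(len(table))]

# Hypothetical 4x4 table of halting answers (1 = halts, 0 = does not halt).
table = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]

flipped = anti_diagonal(table)
# The flipped row disagrees with every enumerated row on the diagonal,
# so it cannot appear anywhere in the enumeration.
print(flipped, all(flipped != row for row in table))
```

The program $M$ in the text plays the role of `flipped`: it is built to disagree with each enumerated program on that program's own code.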
This is reminiscent of our earlier observation that a measurement device cannot encode the joint state created after entanglement because the result of the measurement must be consistent with the system state $s$ which, after all, is also encoded in the joint state. In a way, we can say that a quantum system is asked to (in part) measure itself, because after measurement the only state that exists is the joint state of system and measurement device.

Is this correspondence just an accident? It certainly could be, as the halting problem appears to pertain to Turing machines with infinite tapes, while quantum systems can be as small as single qubits. To study this question, we must first move away from “the set of all programs of arbitrary length” that we used for Cantor’s argument, and look at finite programs of length $n$. To do this, let’s first review Chaitin’s function $\Omega$. Say $S$ is the set of all programs and $H$ is the set of all programs that halt. Then Chaitin defines

$$\Omega = \sum_{s \in H} 2^{-|s|} \qquad (5)$$

where $|s|$ denotes the size of $s$ (in bits). By Kraft’s inequality $\Omega$ is a probability, but because $F(s, m)$ does not exist (as we proved earlier), $\Omega$ is uncomputable. In fact, it is possible to prove that every bit of the binary representation of $\Omega$ is random, that is, $\Omega$ is not compressible [7, 8]. The randomness of $\Omega$ is a consequence of mathematical logic only.

Let’s now move to finite programs of length $n$. For those, $F(s, m)$ definitely exists (you cannot use Cantor’s argument on finite sequences), and therefore we can calculate the halting probability $\Omega_n$ of all programs up to length $n$ (but not longer). Let $S_n$ be the set of all programs of length up to $n$ and let $H_n$ be the set of all programs up to size $n$ that halt after running them for some time $T$ (the argument does not have to specify exactly how large $T$ is, but imagine that $T$ should scale with $n$). We can then define

$$\Omega_n = \sum_{m \in H_n} 2^{-|m|} . \qquad (6)$$

If we wait long enough, all $n$ bits of $\Omega_n$ will be correct, meaning that they are the same bits as the first $n$ bits of $\Omega$. In that case, the program that tests programs must have found all programs up to length $n$ that halt. This, in turn, implies that the program that encodes the algorithm $F(s, m)$ that achieves this feat must be longer than $n$, since otherwise a universal halting program would exist. This is significant because we have now shown that a program that attempts to determine if a copy of itself halts must be longer than the program itself: precisely the problem that we encountered in quantum measurement!

What if we force a program of length $n$ to determine its own halting probability (as we do in quantum mechanics, as a matter of fact)? How will it respond? To answer this question, I will first construct simple machines that perform computations on binary sequences using binary programs, and then explore what happens if these machines are forced to operate on each other.

Footnote: We must specify here that the programs are self-delimiting, which means that if a short program halts, we do not include in $\Omega$ all the programs that have that program as the beginning (so-called “prefix-free” codes).
Footnote: Note that $\Omega_n < \Omega < \Omega_n + 2^{-n}$ because (due to the prefix-free nature of the programs) all the halting programs longer than $n$ in the world can never add more than $2^{-n}$ to $\Omega_n$.

Entangled Classical Turing Machines

Let us construct machines that act on sequences of length $n$ and return a single bit as output (the construction can readily be generalized to write any number of output bits). This single output bit could be the $i$th bit of the halting probability $\Omega_n$, but for our purposes it could be any calculation returning a single bit. Such a computation is determined by one of $2^n$ logic tables (the computation is effectively an $n \to 1$ [...] $\vec{m}$, living in a space $M_n$.
This computation can be written formally using the linear operator $C$

$$C = \sum_{m_1 m_2 \cdots m_n} P_{m_1 m_2 \cdots m_n} \otimes \sum_{i=1}^{n} P_i \otimes \sigma_x^{m_i} . \qquad (7)$$

Here, the operators $P_{m_1 m_2 \cdots m_n}$ are projectors acting on the space of programs (the vectors $|m_1 \cdots m_n\rangle$), while the $P_i$ are projectors on the input states $|s_1 \cdots s_n\rangle$. The operator $\sigma_x^{m_i}$ acts on the output bit, conditionally flipping that bit (via the Pauli matrix $\sigma_x$) depending on whether $m_i = 1$ (flip) or $m_i = 0$ (don’t flip). An explicit example of a pair of Turing machines implementing arbitrary $1 \to 1$ [...] $\vec{s}$ and $\vec{m}$ using the Dirac bra-ket notation, but because these represent classical states, they can never occur in superpositions in this context.

A simple example computation is the CNOT (controlled NOT) operator $C_{\rm CNOT}$ given by program $|m\rangle = |0110\rangle$ acting on the set spanned by the vectors $|00\rangle$, $|01\rangle$, $|10\rangle$ and $|11\rangle$. So, for example, to calculate CNOT(1,0), we apply $C_{\rm CNOT}$ on the input state $|10\rangle_s$, and pick the $m_i$ from the CNOT program $|0110\rangle$. The term with projector $P_3$ is the only non-vanishing contribution, and looking up the third bit in the program $|0110\rangle$ tells us that the ancilla needs to be flipped for this particular input (we suppress the program vector here for simplicity):

$$C_{\rm CNOT}|10\rangle_s|0\rangle = |10\rangle_s\,(\sigma_x)|0\rangle = |10\rangle_s|1\rangle . \qquad (8)$$

A particularly important operation (program) is the simplest logic operation acting on a single bit: the operation COPY. We can write this as

$$O = P_0 \otimes \mathbb{1} + P_1 \otimes \sigma_x , \qquad (9)$$

with $P_0 = |0\rangle\langle 0|$ and $P_1 = |1\rangle\langle 1|$. When writing (9), I have suppressed the “copy” program that $O$ is conditional on, namely the program $|01\rangle$. This operation is precisely that which implements the measurement process (1) on classical bits, so that

$$O|0\rangle|0\rangle = |0\rangle|0\rangle \qquad (10)$$
$$O|1\rangle|0\rangle = |1\rangle|1\rangle . \qquad (11)$$

In fact, this turns out to be the classical version of the quantum operator $U$ in Eq. (3): they are given by precisely the same equation, and only differ in the states that the operation is applied to (the classical operator is never applied to superpositions).

We now start considering the possibility that the bits that the Turing machine conditions on (input bits and program bits) might be modified by another Turing machine. In this manner, both Turing machines are entangled with each other, but classically of course. For this to be possible we must account for the possibility that the measurement device is not in the prepared state $|0\rangle$. This is not a problem in principle, because we can simply extend the range of the operator $O$ so that

$$O|0\rangle|1\rangle = |0\rangle|1\rangle \qquad (12)$$
$$O|1\rangle|1\rangle = |1\rangle|0\rangle . \qquad (13)$$

Measurements with an uncertain initial state are not uncommon in quantum mechanics, where the degree of uncertainty of the ancilla simply implements different strengths of measurements [9, 10]. Allowing for unprepared ancilla states begins to blur the distinction between system and measurement device.

The simplest general scenario of two Turing machines $T_1$ and $T_2$ implementing $1 \to 1$ [...] $(s, m)$ is not well-defined, because which operation $T_1$ will perform on its I/O space will depend on what operation $T_2$ is performing, since $T_1$’s program is determined by the operation of $T_2$. What that operation is, however, depends on $T_1$ (see Fig. 3a). Representing such a pair of transformations mathematically using the formalism from Eq. (7) is possible but cumbersome (this is carried out in the Supplementary Information). To obtain some insights on the consequences of potentially incompatible operations on classical bits, I will simplify the situation even further: let’s keep the program of the two Turing machines unchanged (the simple COPY operation (9)), but the bit that $T_1$ writes to is the bit that simultaneously $T_2$ reads from, and vice versa (Fig. 3b).

Figure 3: (a) Two machines $T_1$ and $T_2$ that write on each other’s program spaces. $T_1$ uses bits $i$ and $j$ as programs, to operate on bits $k$ and $\ell$, while $T_2$ uses the program in bits $k$ and $\ell$ with I/O bits $j$ and $i$ (this machine is constructed in detail in the Supplementary Material). Other permutations of the bits $i, j, k, \ell$ are equally possible. (b) Two machines with fixed COPY programs, writing on each other’s I/O bits. (c) Two non-universal entangled machines (with permission from the M.C. Escher Company).

The copy operators $U$ and $V$ copying the “left” bit to the “right” bit (and vice versa) can be written as

$$U = P_0 \otimes \mathbb{1} + P_1 \otimes \sigma_x \qquad (14)$$
$$V = \mathbb{1} \otimes P_0 + \sigma_x \otimes P_1 . \qquad (15)$$

Using those, we find that

$$UV|00\rangle = |00\rangle, \qquad VU|00\rangle = |00\rangle \qquad (16)$$
$$UV|01\rangle = |10\rangle, \qquad VU|01\rangle = |11\rangle \qquad (17)$$
$$UV|10\rangle = |11\rangle, \qquad VU|10\rangle = |01\rangle \qquad (18)$$
$$UV|11\rangle = |01\rangle, \qquad VU|11\rangle = |10\rangle . \qquad (19)$$

It is clear that the order of operations matters, except for the input state $|00\rangle$. For inputs $|01\rangle$ and $|10\rangle$ one of the bits is inconsistent (between the two orders), while for the input state $|11\rangle$ both bits are inconsistent. But it is also clear that when acting on states we can really only implement one or the other operation at a time, and performing operations sequentially does not create ambiguities. In fact, even simulating simultaneous operations on a computer is impossible because the computer must also operate sequentially. One way to move away from a deterministic sequential picture is to write the transformations statistically, in terms of their effect on a (classical) density matrix instead.
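The order dependence in Eqs. (16)–(19) can be reproduced by applying the two COPY operations of Eqs. (14) and (15) sequentially to classical bit pairs. The following is a minimal sketch, assuming the usual convention that in the product $UV$ the operator $V$ acts first; the function names `U` and `V` simply mirror the operators and the tuple layout `(left, right)` is a choice made for this example.

```python
def U(bits):
    """COPY left -> right: flip the right bit if the left bit is 1 (Eq. 14)."""
    left, right = bits
    return (left, right ^ left)

def V(bits):
    """COPY right -> left: flip the left bit if the right bit is 1 (Eq. 15)."""
    left, right = bits
    return (left ^ right, right)

for state in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    uv = U(V(state))  # operator product UV: V acts first, then U
    vu = V(U(state))  # operator product VU: U acts first, then V
    differing = sum(a != b for a, b in zip(uv, vu))
    print(state, "UV ->", uv, "VU ->", vu, "bits that differ:", differing)
```

Each ordering taken alone is a perfectly deterministic map; the inconsistency only shows up when the two orderings are compared.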
If we choose the representation $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, we can write a probabilistic uniformly mixed state as the matrix

$$\rho_{\rm in} = |{\rm in}\rangle\langle{\rm in}| = \frac{1}{4} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \qquad (20)$$

The transformations $U$ and $V$ in this representation read

$$U = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad V = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} . \qquad (21)$$

The matrices $U$ and $V$ are symmetric orthogonal matrices with negative determinant, so that $UU^T = VV^T = \mathbb{1}$. They are in fact reflections, so we also have $U^2 = V^2 = \mathbb{1}$.

Operating with $U$ or $V$ on this matrix does not change the input state because $U \rho_{\rm in} U^T = V \rho_{\rm in} V^T = \rho_{\rm in}$. What if we operate with $U$ from the left, and with $V^T$ from the right?

$$\rho_{\rm in} \xrightarrow{\;?\;} U \rho_{\rm in} V^T \qquad (22)$$

This is, to put it mildly, an illegal operation (it is not trace-preserving, for example), but it does appear to create both transformations at the same time. Let us investigate the ramifications by considering the two matrices

$$\rho_{UV} = U \rho_{\rm in} V^T , \qquad \rho_{VU} = V \rho_{\rm in} U^T . \qquad (23)$$

Because $\rho_{\rm in}$ is diagonal, the matrices $U$ and $V$ are really just performing a singular value decomposition (SVD) of the transformed state, so

$$\rho_{UV} = \frac{1}{4}\, UV ; \qquad \rho_{VU} = \frac{1}{4}\, VU . \qquad (24)$$

Both of these matrices are non-diagonal: in fact they are elements of the special orthogonal group in four dimensions, SO(4). We also note that $\rho_{UV}^T = \rho_{VU}$, and that as a consequence both $\rho_{UV}$ and $\rho_{VU}$ are projectors ($\rho^2 = \rho$). However, while ${\rm Tr}\,\rho_{\rm in} = 1$, ${\rm Tr}\,\rho_{UV} = {\rm Tr}\,\rho_{VU} = \frac{1}{4}$, that is, we have “lost” probabilities. The loss of probability is not a problem in principle as it is a sign of dissipative dynamics; but of course, in a closed system this is inadmissible in principle. But let us cast such concerns aside for [...]

Footnote: This is not particularly surprising, as the quantum counterpart of these operations (the measurement/entanglement operator $U$ for qubits shown in (3)) belongs to the group $SU(2) \times SU(2)$, which is a double cover of SO(4).

$$\rho_L = {\rm Tr}_R(\rho_{UV}) = {\rm Tr}_R(\rho_{VU}) = \frac{1}{4}\, [\ldots] , \qquad (25)$$
$$\rho_R = {\rm Tr}_L(\rho_{UV}) = {\rm Tr}_L(\rho_{VU}) = \frac{1}{4}\, [\ldots] . \qquad (26)$$

Obviously both matrices are non-diagonal, as we have already expected. They each have two eigenvalues 0 and 1, but their eigenvectors are different. The eigenvector of $\rho_L$ to the eigenvalue 1 is the state [...], while the corresponding eigenvector for $\rho_R$ is [...]. It is the eigenvectors to the eigenvalues zero that are the main surprise: for $\rho_L$ (the “left bit”) the eigenvector is $|0\rangle - |1\rangle$, while for $\rho_R$ it is $|0\rangle + |1\rangle$.

Perhaps the reader is at this point shaking their head vigorously, as I promised that superpositions would not appear in the classical description. And to some extent they do not: these eigenvectors are mere ghosts as they appear with vanishing strength (zero amplitude). They are, if you will, palimpsests of the quantum case, announcing their possible presence only by insinuation. Most importantly, we need to keep in mind that it is really the classicality of these bits that is an abomination.
After all, classical physics does not exist: we just find it easier to conceptualize.

The fact that the superposition eigenvectors occur with zero probability essentially tells us that the computations that we asked these Turing machines to perform will never halt, but loop endlessly over the possibilities (indeed, two of the eigenvalues of $UV$ and of $VU$ form a complex pair on the unit circle). In real physics (quantum physics, that is) these computations do halt, but the value that these bits take on, like the bits of the halting probability, must be random because if they were not then halting algorithms shorter than program size could be written. In hindsight, these random bits that are forced upon us by the “strange loop” of quantum measurement (measurement devices measuring each other, where the distinction between system and device has fully lost its meaning) represent the only reality. What we perceive to be real are just conditional probabilities relating random variables to each other; the only reality that exists is the relative state of measurement devices [11]. So, while to Einstein it appeared that quantum mechanics must be incomplete, it now seems as if he glimpsed the incompleteness of mathematics instead.
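Several of the matrix statements made around Eqs. (20)–(24) can be checked with exact arithmetic. The sketch below assumes the basis order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ and builds $U$ and $V$ from Eqs. (14) and (15); the helper names `matmul`, `transpose` and `trace` are illustrative. It only tests the reflection property, the invariance of $\rho_{\rm in}$, the traces, the transpose relation, and the fact that $(UV)^3 = \mathbb{1}$, which is consistent with a complex pair of eigenvalues on the unit circle.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Basis order |00>, |01>, |10>, |11> (an assumption of this sketch).
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
U = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # copies left -> right
V = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]  # copies right -> left
rho_in = [[Fraction(x, 4) for x in row] for row in I4]          # uniform mixture

rho_UV = matmul(matmul(U, rho_in), transpose(V))
rho_VU = matmul(matmul(V, rho_in), transpose(U))
UV = matmul(U, V)

print("U^2 = V^2 = 1:", matmul(U, U) == I4 and matmul(V, V) == I4)
print("U rho_in U^T = rho_in:", matmul(matmul(U, rho_in), transpose(U)) == rho_in)
print("traces:", trace(rho_in), trace(rho_UV), trace(rho_VU))
print("rho_UV^T = rho_VU:", transpose(rho_UV) == rho_VU)
print("(UV)^3 = 1:", matmul(UV, matmul(UV, UV)) == I4)
```

Using `Fraction` keeps the factor 1/4 exact, so the equality checks are not affected by floating-point rounding.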
Epilogue
In 1998 and 1999 I received two books [7, 8] in the mail, sent to me unsolicited by Greg Chaitin. At the time I was working on quantum computation and quantum information theory, and the quantum information-theoretic description of quantum measurement with Nicolas Cerf had just appeared [12]. Shortly thereafter, Greg (then at the IBM Watson Research Laboratory) called me up on the phone in my office at the California Institute of Technology, and after introducing himself, asked me a very direct question: “Chris”, he asked, “do you believe that the randomness in Mathematics is a consequence of the randomness inherent in quantum mechanics?” Without a moment’s hesitation, I blurted out my answer: “This is preposterous Greg, it is just the other way around!” After a moment of stunned silence, he asked the logical follow-up question: “What makes you say that?” It was now my turn to be silent, because I didn’t really know why I had said that. I knew exactly what I meant, but I did not know how to explain it. I tried to explain it, but the more I tried, the more I realized that I did not understand it myself. For over twenty years now I have (on and off) tried to connect mathematical and quantum uncertainty. It seemed that there were two ways to go about it: show that a quantum measurement really is attempting to solve the halting problem, or else show that a classically entangled set of Turing machines looks in important ways just like a quantum system. In this essay I decided to try the latter route, but perhaps the former has a better chance at succeeding, as at least that can be couched in real physics.
Acknowledgements
I am grateful to numerous people who have patiently listened to my ruminations about the link between quantum and classical entanglement, their relation to the halting problem, and the origin of quantum indeterminacy. First and foremost is Arend Hintze, who hammered out the Turing machine representation of logic operations with me, many years ago, and attempted to simulate entangled classical Turing machines on a (classical) computer (to no avail). I’ve also discussed these concepts at length with Richard Lenski. Thanks are also due to Thomas Sgouros, whose comments on the manuscript helped improve it. Finally, I am grateful to Nicolas Cerf for a decade of collaboration on quantum information theory, which laid the foundations for these ideas, and in particular to Greg Chaitin, without whose phone call none of this work would exist. This work was supported in part by a grant from the Templeton Foundation.
References
[1] D. Dieks. Communication by EPR devices. Phys. Lett. A, 92:271, 1982.
[2] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned. Nature, 299:802, 1982.
[3] N. Gisin and S. Massar. Optimal quantum cloning machines. Phys. Rev. Lett., 79:2153–2156, 1997.
[4] C. Adami and G. Ver Steeg. Black holes are almost optimal quantum cloners. J. Phys. A, 48:23FT01, 2015.
[5] J. R. Glick and C. Adami. Markovian and non-Markovian quantum measurements. arXiv:1701.05636, 2017.
[6] A. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42:230–65, 1936.
[7] Gregory J. Chaitin. The Limits of Mathematics. Springer Verlag, Singapore, 1998.
[8] Gregory J. Chaitin. The Unknowable. Springer Verlag, Singapore, 1999.
[9] Y. Aharonov, D. Z. Albert, and L. Vaidman. How the result of a measurement of a component of the spin of a spin-1/2 particle can turn out to be 100. Phys. Rev. Lett., 60:1351–1354, 1988.
[10] T. A. Brun. A simple model of quantum trajectories. Am. J. Phys., 70:719–737, 2002.
[11] C. Rovelli. Relative information at the foundation of physics. In A. Aguirre, B. Foster, and Z. Merali, editors, It from Bit or Bit from It? On Physics and Information, pages 79–86. Springer Verlag, 2015.
[12] N. J. Cerf and C. Adami. Information theory of quantum entanglement and measurement. Physica D, 120:62–81, 1998.

Supplementary Material
In the main text I discussed a situation where two Turing machines with fixed programs write on each other’s I/O space, leading to inconsistent bits without a determined state. But because the programs were fixed for simplicity, this leaves the possibility that the programs’ indeterminacy could compensate for the I/O indeterminacy. Here I show that in the minimum system of programs of size 2 and an I/O space of size two (see Fig. 4), classical indeterminacy still occurs.

There are four possible programs for machine $T_1$ acting on the 2-bit I/O space, given by

$$U_{00} = P_0 \otimes \mathbb{1} + P_1 \otimes \mathbb{1} \qquad (27)$$
$$U_{01} = P_0 \otimes \mathbb{1} + P_1 \otimes \sigma_x \qquad (28)$$
$$U_{10} = P_0 \otimes \sigma_x + P_1 \otimes \mathbb{1} \qquad (29)$$
$$U_{11} = P_0 \otimes \sigma_x + P_1 \otimes \sigma_x , \qquad (30)$$

specified by the four projectors $P_{00}$, $P_{01}$, $P_{10}$, $P_{11}$ acting on the 2-bit program space ($P_{ij} = |ij\rangle\langle ij|$), that is,

$$T_1 = \sum_{m_1 m_2} P_{m_1 m_2} \otimes U_{m_1 m_2} . \qquad (31)$$

Figure 4 shows a machine implementing one of the four operations in Eqs. (27-30) (depending on [...]

Figure 4: A machine $T_1$ reading a two-bit program $U$ (from the two left-most bits) that controls a $1 \to 1$ [...]

[...] $T_2$ that reads its program from the two right-most bits instead, controlling the $1 \to 1$ [...] $U_{ij}$ (but the operator acts on bits 3 and 4 instead) and implement the operation (see Fig. 5)

$$T_2 = \sum_{m_1 m_2} V_{m_1 m_2} \otimes P_{m_1 m_2} . \qquad (32)$$

Choosing the representation $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ as in the main text, the operators $T_1$ and $T_2$ are 16 ×