Quantum Stopwatch: How To Store Time in a Quantum Memory
Yuxiang Yang,$^{1,2}$ Giulio Chiribella,$^{3,1,4}$ and Masahito Hayashi$^{5,6}$
$^1$ Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong
$^2$ HKU Shenzhen Institute of Research and Innovation, Yuexing 2nd Rd Nanshan, Shenzhen 518057, China
$^3$ Department of Computer Science, The University of Oxford, Parks Road, Oxford, UK
$^4$ Canadian Institute for Advanced Research, CIFAR Program in Quantum Information Science, Toronto, ON M5G 1Z8
$^5$ Graduate School of Mathematics, Nagoya University, Nagoya, Japan
$^6$ Centre for Quantum Technologies, National University of Singapore, Singapore
Quantum mechanics imposes a fundamental tradeoff between the accuracy of time measurements and the size of the systems used as clocks. When the measurements of different time intervals are combined, the errors due to the finite clock size accumulate, resulting in an overall inaccuracy that grows with the complexity of the setup. Here we introduce a method that in principle eludes the accumulation of errors by coherently transferring information from a quantum clock to a quantum memory of the smallest possible size. Our method could be used to measure the total duration of a sequence of events with enhanced accuracy, and to reduce the amount of quantum communication needed to stabilize clocks in a quantum network.
INTRODUCTION
Accurate time measurements are important in a variety of applications, including GPS systems [1], frequency standards [2], and astronomical observations [3, 4]. But the accuracy of time measurements is not just a technological issue. At the most fundamental level, every clock is subject to an unavoidable quantum limit, which cannot be overcome even with the most advanced technology. The limit has its roots in Heisenberg's uncertainty principle, which implies fundamental bounds on the accuracy of time measurements [5–7]. For an individual time measurement, the ultimate quantum limit can be attained by initializing the clock in a suitably engineered superposition of energy levels [8–10]. However, the situation is different when multiple time measurements are performed on the same clock (e.g. in order to measure the total duration of a sequence of events) or on different clocks (e.g. in GPS technology). In these scenarios the errors accumulate, resulting in an inaccuracy that grows linearly with the number of measurements. To address this problem, one may try to minimise the number of measurements: instead of measuring individual clocks, one could store their state into the memory of a quantum computer, process all the data coherently, and finally read out the desired information with a single measurement. However, quantum memories are notoriously expensive and hard to scale up [11]. This leads to the fundamental question: how much memory is required to record time at the quantum level?

Here we derive the ultimate quantum limit on the amount of memory needed to record time with a prescribed accuracy. The limit is based on a Heisenberg-type bound, expressing the tradeoff between the accuracy in the read-out of a given parameter and the size of the system in which the parameter is encoded. We show that the bound is tight, by constructing a protocol that faithfully transfers information from the system to a quantum memory of minimal size.
The protocol, which we call the quantum stopwatch, freezes the time evolution of a clock by storing its state into the state of the memory, as in Figure 1. The quantum stopwatch protocol works with clocks made of many identical and independently prepared particles, a common setting when the clocks are identical atoms or ions [12]. The use of identical particles can also be thought of as a simple repetition code for transmitting time information. Since our protocol uses identically prepared particles, the optimal scaling with the memory size is robust to depolarization of the clocks and to particle loss.

Storing time coherently into a quantum memory is a useful primitive for many applications. As an illustration, we construct a quantum-enhanced protocol to measure the total duration of a sequence of events. The same protocol can be used to establish a shared frequency standard among the nodes of a network, and to generate quantum states with Heisenberg-limited sensitivity to time evolution and to phase shifts.
THE SIZE-ACCURACY TRADEOFF
Suppose that a parameter $T$ is encoded in the state of a quantum system, say $\rho_T$. The system can be either a quantum clock, where $T$ is the time elapsed since the beginning of the evolution, or a quantum memory, where the dependence of $\rho_T$ on $T$ can be completely arbitrary. In general, the parameter $T$ does not have to be time: it can be phase, frequency, or any other real parameter.

FIG. 1. Working principle of the quantum stopwatch. The quantum stopwatch coherently transfers time information from a quantum clock, consisting of many identical particles, to a quantum memory of minimum size.
When needed, one can extract information about the parameter $T$ by measuring the system. The question is how accurate the measurement can be. The inaccuracy of a given measurement can be quantified by the size of the smallest interval, centred around the true value, in which the measurement outcome falls with a prescribed probability, for example $P = 99\%$. Explicitly, the inaccuracy has the expression
$$\delta(P,T) := \inf\left\{\delta \;\middle|\; P(\delta,T) \ge P\right\}, \tag{1}$$
where $P(\delta,T)$ is the probability that the measurement outcome $\hat{T}$ belongs to an interval of size $\delta$ centred around the true value $T$.

Note that the inaccuracy can generally depend on the true value $T$, which is unknown to the experimenter. The dependence can be removed by fixing a fiducial interval $[T_{\min}, T_{\max}]$. For example, the fiducial interval could be the inversion region where the parameter $T$ is in one-to-one correspondence with the state of the system [13]. We denote by $\delta(P)$ the worst-case value of the inaccuracy within the fiducial interval. In the Bayesian approach, $\delta(P)$ provides a lower bound on the probability that the true value falls within an interval of size $\delta(P)$ around the measured value $\hat{T}$: this probability is guaranteed to be at least $P$ for every prior distribution on $T$ and for all values of $\hat{T}$ except at most a zero-probability set. Other properties of the inaccuracy, used later in the paper, are presented in Methods.

We now derive a fundamental lower bound on the inaccuracy, expressed in terms of the size of the quantum system used to encode the parameter $T$. Let us denote by $D$ the dimension of the smallest subspace containing the eigenvectors of the states $\{\rho_T \mid T_{\min} \le T \le T_{\max}\}$. Physically, $D$ can be regarded as the effective dimension of the system used to encode the parameter $T$.
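The definition of the inaccuracy in Eq. (1) can be made concrete with a small numerical sketch. The code below is an illustration only: it replaces the exact outcome distribution with a finite list of simulated estimates, and the function names `inaccuracy` and `worst_case_inaccuracy` are hypothetical, not taken from the paper.

```python
import math

def inaccuracy(estimates, true_value, P):
    """Empirical version of Eq. (1): size of the smallest interval, centred on
    true_value, that contains at least a fraction P of the estimates."""
    if not 0 < P <= 1:
        raise ValueError("P must lie in (0, 1]")
    # Absolute deviations from the true value, in increasing order.
    deviations = sorted(abs(x - true_value) for x in estimates)
    # Number of estimates that must fall inside the interval
    # (the small offset guards against floating-point round-up).
    needed = max(1, math.ceil(P * len(deviations) - 1e-12))
    # The interval half-width is the needed-th smallest deviation.
    return 2 * deviations[needed - 1]

def worst_case_inaccuracy(estimates_by_T, P):
    """Worst case over a finite grid of true values T, standing in for the
    fiducial interval [T_min, T_max] (a discretisation assumed here)."""
    return max(inaccuracy(est, T, P) for T, est in estimates_by_T.items())
```

For instance, `inaccuracy([0.9, 1.0, 1.1, 1.3, 0.5], 1.0, 0.6)` returns an interval size of about 0.2, because three of the five estimates lie within 0.1 of the true value.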
In terms of the effective dimension, the inaccuracy satisfies the bound
$$\delta(P) \ge \frac{P\,\Delta T}{D+1}, \qquad \Delta T := T_{\max} - T_{\min}, \tag{2}$$
valid for arbitrary encodings of the parameter $T$ and for arbitrary quantum measurements. We call Eq. (2) the size-accuracy bound.

The size-accuracy bound follows from dividing the fiducial interval $\Delta T$ into $N = \lfloor \Delta T/\delta(P) \rfloor$ disjoint intervals of size $\delta(P)$. One can then encode the midpoint value of the $i$-th interval into the state $\rho_{T_i}$. In this way, one obtains $N$ quantum states, which can be distinguished with probability of success at least $P$ (one has just to estimate $T$ and to declare the state $\rho_{T_i}$ if the estimate of $T$ falls in the $i$-th interval). On the other hand, the $N$ states are contained in a $D$-dimensional subspace, and therefore the probability of success is upper bounded by $D/N$ [14], leading to the bound $P \le D/N$ and, in turn, to Eq. (2).

The size-accuracy bound (2) captures in a unified way the Heisenberg scaling of quantum clocks and the ultimate limits on the memory needed to store the parameter $T$. Let us first see how it implies the Heisenberg scaling of quantum clocks. Consider a clock made of $n$ identical non-interacting particles, each evolving with the same periodic evolution $U_T = e^{-iTH/\hbar}$, where $H$ is the single-particle Hamiltonian. If the particles are initialized in the state $|\Psi\rangle$, then the quantum state at time $T$ is $|\Psi_T\rangle = U_T^{\otimes n}|\Psi\rangle$. Now, in order for the time evolution to be periodic, the eigenvalues of $H$ must be integer multiples of a given energy. This implies that the number of distinct eigenvalues of the $n$-particle Hamiltonian grows linearly with $n$, and, therefore, all the states $|\Psi_T\rangle$ are contained in a subspace of dimension proportional to $n$. Hence, one obtains the bound $\delta(P) \ge c/n$ for some suitable constant $c > 0$. Note that this relation also holds for mixed states, because mixing can only increase the inaccuracy (see Methods). Moreover, the relation $\delta(P) \ge c/n$ holds for arbitrary measurements.

The bound $\delta(P) \ge c/n$ implies the familiar Heisenberg bound on the standard deviation of the best unbiased measurement. The argument is simple: by Chebyshev's inequality, the standard deviation $\sigma$ satisfies the bound $\sigma \ge \tfrac{1}{2}\sqrt{1-P}\;\delta(P)$, which combined with the bound $\delta(P) \ge c/n$ implies the Heisenberg scaling of the standard deviation. It is important to stress that our "Heisenberg-like" bound $\delta(P) \ge c/n$ holds even when the measurement in question is not unbiased.

The size-accuracy bound (2) can also be applied to memories. Suppose that one wants to write down the parameter $T$ with accuracy $\delta(P)$ into a quantum memory of $q$ qubits. Then, Eq. (2) implies that, no matter what encoding is used, the number of memory qubits must be at least
$$q \ge \log\frac{1}{\delta(P)} + O(1). \tag{3}$$
We call Eq. (3) the quantum memory bound. In the following, we show that the bound is tight, meaning that there exist quantum states and quantum measurements for which Eq. (3) holds with the equality sign. Moreover, we show that these states can be generated from an ensemble of identically prepared quantum particles by applying a compression protocol that minimises the memory size while preserving the accuracy.

COMPRESSING TIME INFORMATION: THE NOISELESS SCENARIO
Consider a quantum clock made of $n$ identical particles oscillating between two energy levels. Restricting attention to these levels, each particle can be modelled as a qubit. In the absence of noise, the evolution of each qubit is governed by the Hamiltonian $H = E_0|0\rangle\langle 0| + E_1|1\rangle\langle 1|$, where $E_0$ and $E_1$ are the energy levels and $|0\rangle$ and $|1\rangle$ are the corresponding eigenstates. For each individual qubit, the best clock state is the uniform superposition $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$. Choosing units such that $(E_1 - E_0)/\hbar = 1$, the state at time $T$ is $|\psi_T\rangle = (|0\rangle + e^{-iT}|1\rangle)/\sqrt{2}$.

A clock of $n$ qubits in an entangled state can achieve the Heisenberg scaling $1/n$ in terms of standard deviation, from which it is immediate that the same scaling can be achieved in terms of inaccuracy [15]. However, here we consider $n$ qubits in a product state, specifically the product state $|\psi_T\rangle^{\otimes n}$ obtained by preparing each qubit in the optimal single-copy state. The state $|\psi_T\rangle^{\otimes n}$ has the standard scaling $\delta(P) \approx 1/\sqrt{n}$ with the clock size (see Methods). And yet, it can be compressed to a state that has the optimal scaling with the memory size.

To understand how the compression works, it is useful to expand the clock state as
$$|\psi_T\rangle^{\otimes n} = \sum_{k=0}^{n} e^{-ikT}\sqrt{B_{k,n,1/2}}\;|n,k\rangle, \tag{4}$$
where $B_{k,n,1/2}$ is the binomial distribution with probability $1/2$, and $|n,k\rangle$ is the state obtained by symmetrizing the state $|1\rangle^{\otimes k} \otimes |0\rangle^{\otimes (n-k)}$ over the $n$ qubits. The key observation is that, for large $n$, the binomial distribution is concentrated in an interval of size $O(\sqrt{n})$ around the average value $\langle k\rangle = \lfloor n/2 \rfloor$. This means that the state $|\psi_T\rangle^{\otimes n}$ can be compressed into a typical subspace of dimension $O(\sqrt{n})$ without introducing significant errors. More precisely, the errors are determined by the tails of the binomial distribution, which fall off exponentially fast as $n$ increases.

After the clock state has been projected into the typical subspace, it can be encoded into $\frac{1}{2}\log n$ memory qubits at the leading order. This encoding attains the bound (3): indeed, the inaccuracy scales as $\delta(P) \approx 1/\sqrt{n}$ for every fixed $P$, and the number of memory qubits, equal to $\frac{1}{2}\log n$, grows exactly as $\log[1/\delta(P)]$.

The original state $|\psi_T\rangle^{\otimes n}$ can be retrieved from the compressed state, up to an error that vanishes exponentially fast with $n$. Thanks to the exponential decay of the error, a good compression performance can be obtained already for small clocks: for example, $n = 16$ is already in the asymptotic regime for all practical purposes. A compression from 16 clock qubits to 4 memory qubits can be done with a compression error of 5.… × 10^(−…), in terms of the trace distance, or 3.… × 10^(−…), in terms of the infidelity. Relatively high quality compression can be obtained also for smaller numbers of qubits: for example, four clock qubits can be encoded into two memory qubits with fidelity 87.…

In summary, $n$ identically prepared clock qubits can be compressed into $\frac{1}{2}\log n$ memory qubits without compromising the accuracy. The key ingredient of the compression protocol is the projection of the state (4) into the typical subspace spanned by energy eigenstates with oscillation frequencies in an interval of size $\sqrt{n}$ around the mean value. We call this technique frequency projection.
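The effect of truncating the binomial tails in Eq. (4) can be checked numerically. The sketch below is a minimal illustration under a stated assumption: the kept window of $k$ values is taken to be symmetric around $n/2$ and as large as the memory allows, and the projected state is renormalised. The paper's exact encoder may differ, and the function name `frequency_projection_error` is hypothetical.

```python
import math

def frequency_projection_error(n, memory_qubits):
    """Error of projecting |psi_T>^{(x)n} onto a symmetric window of levels |n,k>.

    Assumption (not from the paper): the window is centred on n/2 and has the
    largest size that fits in 2**memory_qubits dimensions.
    """
    dim = 2 ** memory_qubits
    size = min(dim, n + 1)
    # A window symmetric around n/2 must drop the same number of levels on each side.
    if (n + 1 - size) % 2 == 1:
        size -= 1
    if size < 1:
        raise ValueError("memory too small to hold any level")
    dropped = (n + 1 - size) // 2
    # Binomial weights B_{k,n,1/2}; the phases e^{-ikT} do not affect them.
    weights = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    # Weight in the two discarded tails = 1 - fidelity with the projected state.
    infidelity = sum(weights[:dropped]) + sum(weights[n + 1 - dropped:])
    return {
        "kept_levels": size,
        "infidelity": infidelity,
        # For two pure states the trace distance equals sqrt(1 - fidelity).
        "trace_distance": math.sqrt(infidelity),
    }
```

Under this assumption, `frequency_projection_error(16, 4)` gives an infidelity of $2^{-15} \approx 3.05 \times 10^{-5}$ and a trace distance of about $5.5 \times 10^{-3}$, and `frequency_projection_error(4, 2)` gives a fidelity of $1 - 0.125 = 0.875$. These values agree with the leading digits of the figures quoted in the text, although that agreement does not confirm that the paper uses this exact truncation.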
EXTENSION TO MIXED STATES AND NOISY EVOLUTION
We have seen that the optimal tradeoff between inaccuracy and memory size can be achieved for pure states with unitary time evolution. A similar result can be obtained also in the noisy case. Let us consider first the case where noise affects the state preparation, while the evolution itself is still unitary. In this model, each clock qubit starts off in the mixed state $\rho_{0,p} = p\,|\psi_0\rangle\langle\psi_0| + (1-p)\,I/2$ and, at time $T$, is in the state $\rho_{T,p} = p\,|\psi_T\rangle\langle\psi_T| + (1-p)\,I/2$. Physically, we can think of these mixed states as the result of dephasing noise on the pure states $|\psi_T\rangle$.

In general, the amount of dephasing may vary from one qubit to another. However, as long as the variations are random and affect all qubits equally and independently, the state of the clock can be described as $\rho_{T,p}^{\otimes n}$, for some effective $p$. Even more generally, one could consider some types of correlated noise, where the errors acting on different qubits are part of an (ideally infinite) exchangeable sequence [16]. Physically, this means that each qubit undergoes a random phase kick, possibly correlated with the phase kicks received by the others, but without any systematic bias that makes one qubit more prone to noise than the others. The model of exchangeable dephasing noise includes the correlated errors due to an overall uncertainty on the initial time of the evolution. In general, de Finetti's theorem [16] implies that exchangeable dephasing errors lead to a mixture of states of the form $\rho_{T+T_0,p}^{\otimes n}$, where $T_0$ is a random shift of the time origin and $p$ is a random single-qubit dephasing parameter. Thanks to this fact, we can focus first on the compression of the clock states $\rho_{T,p}^{\otimes n}$, and then include the case of correlated noise by allowing $p$ to vary.

The clock state $\rho_{T,p}^{\otimes n}$ can be decomposed as a mixture of states with different values of the total spin.
The decomposition is implemented by the Schur transform [17], which transforms the original $n$-qubit system into a tripartite system, consisting of a spin register, a rotation register, and a permutation register, as in the following equation
$$\mathrm{Schur}\left(\rho_{T,p}^{\otimes n}\right) = \sum_{J=0}^{n/2} q_{J,p}\left(|J\rangle\langle J| \otimes \rho_{T,p,J} \otimes \omega_J\right), \tag{5}$$
where $J$ is the quantum number of the total spin, $q_{J,p}$ is a probability distribution, $\{|J\rangle\}_{J=0}^{n/2}$ is an orthonormal basis for the spin register, $\rho_{T,p,J}$ is a state of the rotation register, and $\omega_J$ is a fixed state of the permutation register, independent of $T$ and $p$.

Since the states of the spin register are orthogonal, the value of $J$ can be read out without disturbing the state. The problem is then to store the state $\rho_{T,p,J} \otimes \omega_J$ in the minimum amount of memory. Note also that the permutation register can be discarded, for it contains no information about $T$. Hence, the problem is actually to store the state $\rho_{T,p,J}$. This can be done through the technique of frequency projection, which is realised here by projecting the state into the subspace spanned by eigenstates of the total Hamiltonian in an interval of size $\sqrt{J}$ around the mean.

It turns out that the error introduced by frequency projection is negligible for large $J$. Specifically, we showed that the trace distance between the original state and the frequency-projected state is upper bounded as
ε_proj,J ≤ (3 / J − ln ( p − p ) + O ( J − ln J )   (6)
(see Supplementary Note 1 for the details). The error of frequency projection becomes significant when $J$ is small, but fortunately the probability that $J$ is small tends to zero exponentially fast as $n$ grows: indeed, the probability distribution $q_{J,p}$ is the product of a polynomial in $J$ times a Gaussian with variance of order $\sqrt{n}$ centred around the value $J = p(n+1)/2$. After frequency projection, the state is contained in a subspace of dimension of order $\sqrt{J}$, which can then be encoded into a quantum memory of approximately $\frac{1}{2}\log J$ qubits.
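The register structure in Eq. (5) can be sanity-checked by counting dimensions. The sketch below relies on a standard fact from the representation theory of $n$ qubits that is not stated in the text: the spin-$J$ sector has a rotation register of dimension $2J+1$ and a permutation register of dimension $\binom{n}{n/2-J} - \binom{n}{n/2-J-1}$. The function name `schur_sectors` is hypothetical.

```python
import math

def schur_sectors(n):
    """List (J, rotation_dim, permutation_dim) for each total-spin sector of n qubits.

    Uses the standard multiplicity formula for two-row Young diagrams; this is
    textbook material assumed here, not a formula taken from the paper.
    """
    sectors = []
    # Work with 2J to avoid half-integers; 2J has the same parity as n.
    for two_j in range(n % 2, n + 1, 2):
        down = (n - two_j) // 2              # equals n/2 - J
        rotation_dim = two_j + 1             # 2J + 1 values of m
        permutation_dim = math.comb(n, down)
        if down > 0:
            permutation_dim -= math.comb(n, down - 1)
        sectors.append((two_j / 2, rotation_dim, permutation_dim))
    return sectors
```

Summing `rotation_dim * permutation_dim` over all sectors recovers the full dimension $2^n$. For $n = 4$ the sectors are $(0, 1, 2)$, $(1, 3, 3)$ and $(2, 5, 1)$, which shows how small the rotation register is compared with the full space once the permutation register is discarded.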
Now, the question is whether $\frac{1}{2}\log J$ is the minimum number of qubits compatible with the quantum memory bound (3). Here we show that the answer is affirmative for all values of $J$ in the high probability region. The argument is based on two observations. First, the inaccuracy for the state $\rho_{T,p}^{\otimes n}$, minimised over all possible measurements, has the scaling $\delta_{\min}(P) = f(P)/\sqrt{n}$, where $f(P)$ is a suitable function (see Methods). Second, the state $\rho_{T,p}^{\otimes n}$ can be converted into any of the typical states $\rho_{T,p,J}$ in an approximately reversible fashion, with an error that vanishes in the large $n$ limit [19]. Since the inaccuracy is a continuous function of the state (see Methods), we obtain that the minimum inaccuracy for the typical state $\rho_{T,p,J}$ is $\delta^{(J)}_{\min}(P) = f(P)/\sqrt{n}$ at leading order in $n$. Now, recall that the typical values of $J$ are equal to $J = p(n+1)/2$, up to a correction of size at most $\sqrt{n}$. Hence, one has
$$\delta^{(J)}_{\min}(P) = \frac{\sqrt{p}\,f(P)}{\sqrt{2J}} \tag{7}$$
at leading order. Hence, the quantum memory bound (3) implies that at least $\frac{1}{2}\log J$ memory qubits are necessary at leading order. But this is exactly the number of memory qubits used by our compression protocol. This concludes the proof that the protocol is optimal in terms of memory size for every typical value of $J$.

Note that the compression protocol does not require any knowledge of the time parameter $T$, nor does it require knowledge of the dephasing parameter $p$. Thanks to this feature, the protocol applies even in the presence of randomly fluctuating and/or correlated dephasing, as long as dephasing errors on different qubits arise from an exchangeable sequence of random variables.

The protocol can also be applied when the evolution is noisy. Dephasing during the time evolution is described by the master equation [9],
dρ_t / dt = i ħ [ σ_z , ρ_t ] + γ σ_z ρ_t σ_z − ρ_t ) ,   (8)
where $\sigma_z$ is the Pauli matrix $\sigma_z = |0\rangle\langle 0| - |1\rangle\langle 1|$ and $\gamma \ge 0$ is the decay rate (the inverse of the dephasing time $T_2$ in NMR). The state at time $T$ is
$$\rho_{T,\gamma} = p_{T,\gamma}\,|\psi_T\rangle\langle\psi_T| + (1 - p_{T,\gamma})\,\frac{I}{2}, \tag{9}$$
where $p_{T,\gamma} = (1 - e^{-\gamma(T+\tau)})/2$, and $\tau$ accounts for dephasing noise in the state preparation. The only difference with the case of unitary evolution is that now the amount of dephasing depends on $T$. However, since our compression protocol does not require knowledge of the dephasing parameter $p$, all the results shown before are still valid.

APPLICATIONS

Measuring the duration of a sequence of events.
An important feature of the compression protocol is that it is approximately reversible, meaning that the original $n$-qubit state can be retrieved from the memory, up to a small error that vanishes in the large $n$ limit (see Supplementary Note 2). Thanks to this feature, one can engineer a setup that pauses the time evolution and resumes it on demand.

The setup, illustrated in Figure 2, uses a quantum clock made of $n$ identical qubits. At time $t_0$, each qubit is initialized in the state $\rho_{t_0,\gamma}$. The qubit evolves until time $t'_0 = t_0 + T_0$ under the noisy dynamics (8). The state of the $n$ clock qubits is then stored in the quantum memory, where it remains until time $t_1$. For simplicity, we assume that the memory is ideal, meaning that the state of the memory qubits does not change during the lag time between $t'_0$ and $t_1$. Physically, this means that the decoherence time of the memory is long compared to the lag time between one event and the next. At time $t_1$, the state of the memory is transferred back to the clock, which resumes its evolution until $t'_1 = t_1 + T_1$. The procedure is iterated $k$ times, so that in the end the state of the clock records the total duration $T = T_0 + T_1 + \cdots + T_{k-1}$.

Our coherent setup offers an advantage over incoherent protocols where the duration of each time interval $T_j$ is measured individually. In the noiseless scenario, the comparison is straightforward. The probability distribution for the optimal time measurement on the state $|\psi_{T_j}\rangle^{\otimes n}$ is approximately Gaussian, and the inaccuracy for the sum of $k$ Gaussian variables grows as $\sqrt{k}$. Instead, the inaccuracy of the coherent protocol is approximately constant in $k$, up to higher order terms arising from the compression error (see Methods). Hence, performing a single measurement reduces the inaccuracy by a factor $\sqrt{k}$.

The advantage of the coherent protocols persists even after taking into account the error of frequency projection, and even for relatively small $n$.
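The $\sqrt{k}$ gap in the noiseless comparison can be reproduced with a short calculation. The sketch below is an idealised illustration, assuming exactly Gaussian measurement errors and neglecting the compression error, so it describes only the large-$n$ limit. The function names and the choice of standard deviation $1/\sqrt{n}$ are assumptions made for the example.

```python
from statistics import NormalDist

def gaussian_inaccuracy(sigma, P):
    """Size of the interval, centred on the true value, that contains a zero-mean
    Gaussian error of standard deviation sigma with probability P (0 < P < 1)."""
    return 2 * sigma * NormalDist().inv_cdf((1 + P) / 2)

def incoherent_inaccuracy(sigma, P, k):
    """k separate measurements: independent Gaussian errors add in quadrature."""
    return gaussian_inaccuracy(sigma * k ** 0.5, P)

def coherent_inaccuracy(sigma, P, k):
    """Single final measurement, idealised: independent of k because the
    compression error is neglected in this sketch."""
    return gaussian_inaccuracy(sigma, P)

if __name__ == "__main__":
    n, P = 100, 0.9            # hypothetical clock size and confidence level
    sigma = 1 / n ** 0.5       # assumed single-measurement standard deviation
    for k in (1, 4, 16):
        ratio = incoherent_inaccuracy(sigma, P, k) / coherent_inaccuracy(sigma, P, k)
        print(k, ratio)        # ratio equals sqrt(k) in this idealised model
```

For finite $n$ the compression error and the non-Gaussian shape of the distribution reduce the gain, so the ratio is less favourable than $\sqrt{k}$.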
As a simple example let us consider the case of $n = 8$ and $P = 0.9$. For a sequence of $k = 4$ events, a coherent protocol using three qubits of memory has an inaccuracy 0.787 times that of the incoherent protocol.

The benefits of the coherent protocol are not limited to the noiseless scenario. A performance comparison between coherent and incoherent protocols is presented in Figure 3, for the task of measuring the duration of a time interval $T$, divided into $k$ subintervals of equal length. The figure shows the advantage of the coherent approach for $\gamma = 0.2$, for every $k$ larger than 2. In Supplementary Note 3 we provide a necessary and sufficient condition for the coherent protocol to have better performance than the incoherent protocol. The condition shows that the total duration is better computed coherently whenever the length of each subinterval $T/k$ is small compared to the decoherence time $1/\gamma$. Note that the advantage persists even when the total time $T$ is large, although the performance of both the coherent and the incoherent protocol worsens as $T$ grows.

FIG. 2. Coherent protocol measuring the total duration of $k$ events. The clock starts its time evolution at time $t_0$ and continues evolving until time $t'_0$, when the first event is concluded. At this point, the time information is transferred to the quantum memory, where it remains until time $t_1$, when the information is transferred back to the clock. The procedure is repeated $k$ times, so that the total duration of the $k$ events is coherently recorded in the state of the memory. Finally, the state of the memory is measured, yielding an estimate of the total duration.

FIG. 3. Comparison between coherent and incoherent protocols. On the left: ratio between the inaccuracy of the incoherent protocol and the inaccuracy of the coherent protocol, as a function of $k$ and $T$. The figure shows the ratio for clock qubits with decay rate $\gamma = 0.2$. The quantum advantage grows with the number of time intervals $k$, with an inaccuracy reduction of about 5 times for $k = 50$. On the right: dependence of the rescaled inaccuracy $\delta^* = \sqrt{n}\,\delta$ on the total time $T$ for $\gamma = 0.2$. For the coherent protocol (red line) the inaccuracy is independent of $k$, while incoherent protocols have inaccuracies increasing with $k$, illustrated by the blue lines for three values of $k$ ranging from 10 to 50.

In the above discussion we assumed that the memory is free from noise and that the compression protocol is implemented without errors. Of course, realistic implementations will also involve errors. One way to take into account the noise in the memory is to introduce an effective dephasing rate $\gamma_j$, which models independent and symmetric errors occurring in the lag time between $t'_j$ and $t_{j+1}$. Overall, the result of this extra dephasing is to reduce the size of the parameter region where the coherent storage of time information offers an advantage over the incoherent strategy. Now, let us consider the errors in the implementation of the compression protocol. Thanks to the continuity of the inaccuracy (see Methods), the error of the circuit implementation can be analysed independently of the estimation inaccuracy. Assuming an independent error model, the errors in the implementation of the encoding and decoding operations will introduce an error $\epsilon_{\mathrm{circuit}}$, which is bounded by the error probability of each elementary gate, denoted by $\epsilon$, times the gate complexity of the whole circuit. The overhead of the gate complexity is the complexity of the Schur transform [17], which was recently reduced to $n \log n$ [20].
For $k$ iterations, the overall error $\epsilon_{\mathrm{circuit}}$ scales as $k\,\epsilon\,n\log n$, resulting in an additional term $\epsilon_{\mathrm{circuit}}/\sqrt{n}$ in the inaccuracy. Therefore, one can see that the inaccuracy will remain almost unaffected as long as the gate error $\epsilon$ of the compression circuit is small compared to $(kn\log n)^{-1}$. This is, of course, a challenging requirement, but it is important to note that the required gate error can be achieved using fault tolerance, without the need to implement physical gates with an error vanishing with $n$. In fact, the desired value of $\epsilon$ can be achieved by using physical gates with error below a constant threshold value, by recursively increasing the number of layers of error correction [21–23].

Stabilizing quantum clocks in a network.
Networks of clocks are important in many areas, such as GPS technology and distributed computing. Recently, Kómár et al. proposed a quantum protocol allowing multiple nodes in a network to jointly stabilize their clocks with higher accuracy [24, 25]. The protocol involves a network of $k$ nodes, each with a local oscillator used as a time-keeping device. The goal is to guarantee that all local oscillators have approximately the same frequency. To this purpose, a central station distributes a GHZ state $|\mathrm{GHZ}\rangle = \left(|0\rangle^{\otimes k} + |1\rangle^{\otimes k}\right)/\sqrt{2}$ to the $k$ nodes. The entanglement is then transferred to $k$ atomic clocks. By interacting with the clock qubits, the $k$ nodes adjust the frequencies of their local oscillators, obtaining a shared time standard with accuracy $1/(\sqrt{n}\,k)$, where $n$ is the number of repetitions of the whole procedure. The key ingredient in this last step is a protocol for estimating the sum of the frequencies of the local oscillators.

Overall, the above protocol requires the communication of $kn$ qubits. Using our stopwatch protocol, we can reduce the amount of quantum communication to $(k\log n)/2$ qubits. In a sequential version of the protocol, the first node lets its clock evolve for a time $T$, and then encodes the state of the clock into a memory, which is sent to the second node. The second node performs the same operations, and passes the memory to the third node, and so on until the $k$-th node. At the end of the protocol, the memory will contain information about the total phase $\varphi_{\mathrm{tot}} = (\omega_1 + \omega_2 + \cdots + \omega_k)\,T$, where $\omega_j$ is the frequency of the $j$-th local oscillator. In this way, the sum of the frequencies can be read out with an error $1/\sqrt{n}$ independently of $k$, meaning that the average frequency has a Heisenberg-limited error of size $1/(\sqrt{n}\,k)$.

An alternative to the sequential protocol is to use a parallel protocol, where the central station distributes entangled states to the $k$ nodes, as in Refs. [24, 25].
Using our frequency projection technique, it is possible to reduce the amount of quantum communication also in this case, from $kn$ qubits to $(k\log n)/2$ qubits. The idea is to convert $n$ copies of the GHZ state into a multipartite entangled state where each node has an exponentially smaller clock of $\frac{1}{2}\log n$ qubits. Exploiting this fact, one can obtain the same precision as in Refs. [24, 25] using an exponentially smaller amount of quantum communication between the nodes and the central station, as illustrated in Figure 4.

CONCLUSION
The compression of clock states is a versatile technique. In addition to advantages in measuring the total duration and in stabilizing quantum clocks in a network, it offers the opportunity to transform product states of $n$ qubits into entangled states of $\sqrt{n}$ qubits, allowing one to reversibly switch from an encoding where the information can be accessed locally to a more compact encoding where the information is available globally. Quite interestingly, this approach works also for mixed states and in the presence of noise, thus defining a new set of mixed states achieving the Heisenberg limit.

In view of the applications, it is natural to ask what ingredients would be needed to implement our compression protocol experimentally. The protocol requires a quantum computer capable of implementing the encoding and decoding operations. The question is how large the computer should be and how many elementary operations it should perform. In terms of size, we have seen that the large $n$ regime can already be probed for values around $n = 16$, a number that is likely to be within reach in the near future. In terms of complexity, one can break down the compression protocol into two parts: the Schur transform and the frequency projection. The Schur transform can be efficiently realised by a quantum circuit of at most polynomially many gates [17], scaling as $n \log n$ according to the most recent proposal [20]. The circuit is simpler in the noiseless case [26] and has been recently implemented in a prototype photonic setup [27], which however is hard to scale to larger numbers of qubits. NMR and ion trap systems are other good candidates for prototype demonstrations of the Schur transform with small numbers of qubits, such as $n = 10$. The frequency projection can be efficiently implemented with a technique introduced in Ref. [26], whereby the spin eigenstate $|J, m\rangle$ is encoded into a $(2J+1)$-qubit state, with the $m$-th qubit in the state $|1\rangle$ and all the other qubits in the state $|0\rangle$. In this encoding, projecting on a restricted range of values of $m$ is the same as throwing away some of the qubits. The encoding operations and their inverses can be implemented using $O(J)$ elementary gates [26]. In summary, all the components of the quantum stopwatch can be implemented with a moderate number of elementary gates. The main challenge for the experimental realisation of our protocol is the required accuracy in the encoding and decoding operations, whose fault-tolerant realisation requires additional layers of error correction. Our protocol provides an additional motivation for the realisation of fault-tolerant quantum computers, showing an example of an application where the aid of a quantum computer could significantly enhance the precision of time measurements.

FIG. 4. Boosting the performance of quantum sensor networks. A central station C distributes quantum information to $k$ nodes, generating entanglement among $k$ local clocks. Using frequency projection, the central station distributes a quantum state that guarantees the same precision as $n$ GHZ states, while requiring an exponentially smaller amount of quantum communication of $\frac{1}{2}\log n$ qubits per node.

METHODS

Properties of the inaccuracy
The results of this paper take advantage of three basic properties of the inaccuracy, presented in the following:

(i) Continuity. Suppose that the states $\rho_T$ and $\rho'_T$ are close in trace distance for every value of the parameter $T$. Operationally, this means that the outcome probabilities for every measurement performed on $\rho_T$ are close to the outcome probabilities for the same measurement on $\rho'_T$. If the trace distance is smaller than $\epsilon$, then one has
$$P(\delta, T) - \epsilon \le P'(\delta, T) \le P(\delta, T) + \epsilon, \tag{10}$$
where $P(\delta, T)$ [respectively, $P'(\delta, T)$] is the probability that the estimate falls within an interval of size $\delta$ around the true value $T$. In turn, Eq. (10) implies the following bound on the inaccuracy:
$$\delta(P - \epsilon, T) \le \delta'(P, T) \le \delta(P + \epsilon, T), \tag{11}$$
where $\delta(P, T)$ and $\delta'(P, T)$ are the inaccuracies for the states $\rho_T$ and $\rho'_T$, respectively. When the probability distribution is sufficiently regular, these bounds guarantee that the inaccuracy is a continuous function of the state. For example, we will see that the inaccuracy for an $n$-qubit quantum clock in the state $\rho_{T,\gamma}^{\otimes n}$ is equal to $\delta(P, T) = f(P)/\sqrt{n}$ at the leading order, where $f(P)$ is an analytic function of $P$. A state that is $\epsilon$-close to $\rho_{T,\gamma}^{\otimes n}$ will also have inaccuracy $\delta'(P) = f(P)/\sqrt{n}$, up to a correction of size $\epsilon/\sqrt{n}$. In the case of our compression protocol, the compression error vanishes with $n$, meaning that the correction does not affect the leading order.

(ii) Data-processing inequality. Suppose that the system $S$, used to encode the parameter $T$, is transformed into another system $S'$ by some physical process. Let $\rho'_T$ be the state generated by the process acting on the state $\rho_T$. Then, every measurement $M'$ on the output system $S'$ defines a measurement $M$ on the input system $S$, obtained by first transforming $S$ into $S'$ and then performing the measurement $M'$.
By construction, the two measurements have the same statistics and, in particular, the same inaccuracy. By minimising the inaccuracy over all measurements, one obtains the inequality
$$\delta'_{\min}(P) \ge \delta_{\min}(P), \tag{12}$$
expressing the fact that physical processes cannot reduce the minimum inaccuracy.

(iii) Symmetry. For quantum clocks, the accuracy is maximised by covariant measurements [7], that is, measurements described by positive operator valued measures of the form $M_{\hat T} = e^{-i \hat T H} M_0\, e^{i \hat T H}$. For covariant measurements, one can show that mixing increases the inaccuracy:
Proposition 1. For a convex mixture $\rho = \sum_i p_i \rho_i$, one has the inequality $\delta_{\min}(P) \ge \min_i \delta^{(i)}_{\min}(P)$, where $\delta_{\min}(P)$ [respectively, $\delta^{(i)}_{\min}(P)$] is the minimum inaccuracy for the state $\rho$ [respectively, $\rho_i$].

The argument is simple. For every covariant measurement, the probability to find the estimate in an interval of size $\delta$ around the true value $T$ is independent of $T$ and will be denoted simply as $P(\delta)$. For measurements on the state $\rho = \sum_i p_i \rho_i$, the probability is the convex combination $P(\delta) = \sum_i p_i P_i(\delta)$, where $P_i(\delta)$ is the corresponding probability for the state $\rho_i$. Setting $\delta = \min_i \delta^{(i)}_{\min}(P)$, no component can exceed the probability $P$ within an interval of size $\delta$, so that $P_i(\delta) \le P$ for every $i$. Hence one has the inequality $P(\delta) \le \sum_i p_i P = P$, with equality if and only if all the inaccuracies $\delta^{(i)}_{\min}(P)$ are equal. Therefore, the inaccuracy for the mixture $\rho$ cannot be smaller than $\delta$.

Accuracy of time measurements on identically prepared clock states
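As a numerical companion to the local strategy analysed in this subsection, the following sketch compares the closed-form Fisher information with a direct quadrature and then evaluates the resulting inaccuracy. It is only an illustration under stated assumptions: the outcome density is taken to be $P(\tau|T,p) = [1 + (2p-1)\cos(\tau - T)]/2\pi$ as in Eq. (41), the estimate is assumed Gaussian with standard deviation $1/\sqrt{nF_{\rm loc}}$, and $\delta$ is read as the full width of the interval centred on the true value. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def outcome_density(tau, T, p):
    """Assumed outcome density [1 + (2p - 1) cos(tau - T)] / (2 pi)."""
    return (1.0 + (2.0 * p - 1.0) * math.cos(tau - T)) / (2.0 * math.pi)


def fisher_numeric(T, p, steps=2000):
    """Midpoint-rule estimate of the Fisher information about T (requires p < 1)."""
    c = 2.0 * p - 1.0
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        # Derivative of the density with respect to T.
        dP_dT = c * math.sin(tau - T) / (2.0 * math.pi)
        total += dP_dT ** 2 / outcome_density(tau, T, p) * h
    return total


def fisher_closed_form(p):
    """Closed form 1 - 2 sqrt(p (1 - p)), equal to 1 - sqrt(1 - (2p - 1)^2)."""
    return 1.0 - 2.0 * math.sqrt(p * (1.0 - p))


def local_inaccuracy(P, n, F):
    """Width delta such that a Gaussian estimate with standard deviation
    1 / sqrt(n F) lies within delta / 2 of the true value with probability P."""
    # erf^{-1}(P) expressed through the standard normal quantile function.
    erfinv_P = NormalDist().inv_cdf((1.0 + P) / 2.0) / math.sqrt(2.0)
    return math.sqrt(8.0 / (n * F)) * erfinv_P


if __name__ == "__main__":
    p, T, n, P = 0.9, 0.7, 100, 0.9
    F = fisher_closed_form(p)
    print("numeric Fisher information:", fisher_numeric(T, p))
    print("closed-form Fisher information:", F)
    print("inaccuracy for n =", n, ":", local_inaccuracy(P, n, F))
```

The quadrature uses the midpoint rule because the integrand is smooth and periodic, so a modest number of steps already suffices. The inaccuracy shrinks as $1/\sqrt{n}$, as expected for independent measurements.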
We now construct a measurement strategy that estimates the parameter $T$ from the state $\rho_{T,p}^{\otimes n}$ with inaccuracy $\delta(P) = O(1/\sqrt{n})$. In this strategy, each clock qubit is measured independently, projecting the $j$-th qubit on the eigenstates of the observable $O_j = \cos\tau_j\, \sigma_x + \sin\tau_j\, \sigma_y$, where $\tau_j$ is an angle chosen uniformly at random between $0$ and $2\pi$. The measurement has two possible outcomes, $+1$ and $-1$. If the outcome of the measurement is $+1$, one records the value $\tau_j$; if the outcome is $-1$, one records the value $\tau_j + \pi$. Mathematically, the measurement strategy is described by the positive operator valued measure $\{M_\tau\}_{\tau \in [0, 2\pi)}$ with
$$M_\tau = \frac{1}{2\pi} \begin{pmatrix} 1 & e^{-i\tau} \\ e^{i\tau} & 1 \end{pmatrix}. \tag{13}$$
The probability density that the measurement yields the outcome $\tau$ when the input state is $\rho_{T,p}$ is $P(\tau | T, p) = \mathrm{Tr}[M_\tau \rho_{T,p}]$. Explicit calculation shows that the classical Fisher information of this probability distribution is
$$F_{\rm loc} = 1 - 2\sqrt{p(1-p)}. \tag{14}$$
Now, since the $n$ qubits are measured independently, one can collect the results of the measurements and use classical statistics to generate an estimate of $T$. Using the maximum likelihood estimator, one obtains an estimate that is approximately Gaussian-distributed with average $T$ and standard deviation $\sigma = 1/\sqrt{nF_{\rm loc}}$. The probability that the estimate $\hat T$ deviates from the true value by less than $\delta/2$ is
$$\mathrm{Prob}\left( |\hat T - T| \le \frac{\delta}{2} \right) = \mathrm{erf}\left( \delta \sqrt{\frac{nF_{\rm loc}}{8}} \right) + O\left( \frac{1}{\sqrt{n}} \right), \tag{15}$$
where $\mathrm{erf}(x)$ is the error function. Hence, the inaccuracy can be expressed as
$$\delta_{\rm loc}(P) = \sqrt{\frac{8}{nF_{\rm loc}}}\, \mathrm{erf}^{-1}(P) + O\left( \frac{1}{n} \right). \tag{16}$$
The same argument applies to the states $\rho_{T,\gamma}$, generated by the noisy time evolution. The only difference is that, in this case, the classical Fisher information is
$$F_{\rm loc} = 1 - \gamma^2 - \sqrt{1 - e^{-2\gamma T}} + \frac{\gamma^2}{\sqrt{1 - e^{-2\gamma T}}}, \tag{17}$$
when the decay rate $\gamma$ is known, and
$$F_{\rm loc} = 1 - \sqrt{1 - e^{-2\gamma T}} \tag{18}$$
when $\gamma$ is unknown (see Supplementary Note 4 for the derivation).

Acknowledgement.
We thank Lorenzo Maccone for useful comments and Xinhui Yang for drawing the figures. This work is supported by the Hong Kong Research Grant Council through Grant No. 17326616 and 17300317, by National Science Foundation of China through Grant No. 11675136, by the HKU Seed Funding for Basic Research, the John Templeton Foundation, and by the Canadian Institute for Advanced Research (CIFAR). YY is supported by a Microsoft Research Asia Fellowship and a Hong Kong and China Gas Scholarship. MH was supported in part by a MEXT Grant-in-Aid for Scientific Research (B) No. 16KT0017, a MEXT Grant-in-Aid for Scientific Research (A) No. 23246071, the Okawa Research Grant, and Kayamori Foundation of Informational Science Advancement. Centre for Quantum Technologies is a Research Centre of Excellence funded by the Ministry of Education and the National Research Foundation of Singapore.

[1] Klepczynski WJ. 1996 GPS for Precise Time and Time Interval Measurement. Global Positioning Systems: Theory and Applications.
[2] Santarelli G, Laurent P, Lemonde P, Clairon A, Mann AG, Chang S, Luiten AN, Salomon C. 1999 Quantum projection noise in an atomic fountain: A high stability cesium frequency standard. Physical Review Letters, 4619.
[3] Steinmetz T, Wilken T, Araujo-Hauck C, Holzwarth R, Hänsch TW, Pasquini L, Manescau A, D'Odorico S, Murphy MT, Kentischer T et al. 2008 Laser frequency combs for astronomical observations. Science, 1335–1337.
[4] Li CH, Benedick AJ, Fendel P, Glenday AG, Kärtner FX, Phillips DF, Sasselov D, Szentgyorgyi A, Walsworth RL. 2008 A laser frequency comb that enables radial velocity measurements with a precision of 1 cm s$^{-1}$. Nature, 610–612.
[5] Mandelstam L, Tamm I. 1945 The uncertainty relation between energy and time in nonrelativistic quantum mechanics. Journal of Physics (USSR), 1.
[6] Helstrom CW. 1976 Quantum detection and estimation theory. Academic Press.
[7] Holevo A. 1982 Probabilistic and statistical aspects of quantum theory. North-Holland.
[8] Bužek V, Derka R, Massar S. 1999 Optimal quantum clocks. Physical Review Letters, 2207.
[9] Huelga SF, Macchiavello C, Pellizzari T, Ekert AK, Plenio M, Cirac J. 1997 Improvement of frequency standards with quantum entanglement. Physical Review Letters, 3865.
[10] Smirne A, Kołodyński J, Huelga SF, Demkowicz-Dobrzański R. 2016 Ultimate Precision Limits for Noisy Frequency Estimation. Physical Review Letters, 120801.
[11] Simon C, Afzelius M, Appel J, de La Giroday AB, Dewhurst S, Gisin N, Hu C, Jelezko F, Kröll S, Müller J et al. 2010 Quantum memories. The European Physical Journal D, 1–22.
[12] Nicholson T, Campbell S, Hutson R, Marti G, Bloom B, McNally R, Zhang W, Barrett M, Safronova M, Strouse G et al. 2015 Systematic evaluation of an atomic clock at $2 \times 10^{-18}$ total uncertainty. Nature Communications.
[13] Kohlhaas R, Bertoldi A, Cantin E, Aspect A, Landragin A, Bouyer P. 2015 Phase locking a clock oscillator to a coherent atomic ensemble. Physical Review X, 021011.
[14] Yuen H, Kennedy R, Lax M. 1975 Optimum testing of multiple hypotheses in quantum detection theory. IEEE Transactions on Information Theory, 125–134.
[15] Note 1. This can be shown by applying Markov's inequality to the squared deviation from the mean.
[16] Kallenberg O. 2006 Probabilistic symmetries and invariance principles. Springer Science & Business Media.
[17] Bacon D, Chuang IL, Harrow AW. 2006 Efficient quantum circuits for Schur and Clebsch-Gordan transforms. Physical Review Letters, 170502.
[18] Yang Y, Chiribella G, Ebler D. 2016a Efficient Quantum Compression for Ensembles of Identically Prepared Mixed States. Physical Review Letters, 080501.
[19] Yang Y, Chiribella G, Hayashi M. 2016b Optimal Compression for Identically Prepared Qubit States. Physical Review Letters, 090502.
[20] Kirby WM, Strauch FW. 2017 A Practical Quantum Algorithm for the Schur Transform. arXiv preprint arXiv:1709.07119.
[21] Aharonov D, Ben-Or M. 1997 Fault-tolerant quantum computation with constant error. In Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, pp. 176–188. ACM.
[22] Kitaev AY. 1997 Quantum computations: algorithms and error correction. Russian Mathematical Surveys, 1191–1249.
[23] Knill E, Laflamme R, Zurek WH. 1998 Resilient quantum computation. Science, 342–345.
[24] Kómár P, Kessler EM, Bishof M, Jiang L, Sørensen AS, Ye J, Lukin MD. 2014 A quantum network of clocks. Nature Physics, 582–587.
[25] Kómár P, Topcu T, Kessler EM, Derevianko A, Vuletić V, Ye J, Lukin MD. 2016 Quantum Network of Atom Clocks: A Possible Implementation with Neutral Atoms. Physical Review Letters, 060506.
[26] Plesch M, Bužek V. 2010 Efficient compression of quantum information. Physical Review A, 032317.
[27] Rozema LA, Mahler DH, Hayat A, Turner PS, Steinberg AM. 2014 Quantum Data Compression of a Qubit Ensemble. Physical Review Letters, 160504.
[28] Van der Vaart AW. 2000 Asymptotic statistics vol. 3. Cambridge University Press.
[29] Fulton W, Harris J. 1991 Representation theory vol. 129. Springer Science & Business Media.
[30] Hayashi M. 2017 Group Representation for Quantum Theory. Springer.
[31] Winter A. 1999 Coding Theorem and Strong Converse for Quantum Channels. IEEE Transactions on Information Theory, 2481.
[32] Wilde M. 2013 Quantum information theory. Cambridge University Press.
Supplementary Note 1: error bound for frequency projection [Eq. (6) of the main text].
Here we prove Eq. (6) of the main text, regarding the error of the frequency projection of the compressor. We prove the result in a general setting, where the states to be compressed are of the form
$$\rho_{\phi,p} := p\, |\phi\rangle\langle\phi| + (1-p)\, |\phi_\perp\rangle\langle\phi_\perp|, \tag{19}$$
with $p \in (1/2, 1)$, $|\phi\rangle = \sqrt{s}\, |0\rangle + \sqrt{1-s}\, e^{-i\phi} |1\rangle$, and $|\phi_\perp\rangle := \sqrt{1-s}\, |0\rangle - \sqrt{s}\, e^{-i\phi} |1\rangle$ for some fixed $s \in (0, 1)$. The states $\rho_{t,\gamma}$ considered in the main text are a special case of states of the form (19), with $p = (1 + e^{-\gamma t})/2$, $s = 1/2$, and $\phi = t$.

To begin with, we recall a few basic facts from the main text. First, the $n$-fold clock state $\rho_{\phi,p}^{\otimes n}$ can be decomposed as
$$\rho_{\phi,p}^{\otimes n} \simeq \sum_{J=0}^{n/2} q_J \left( |J\rangle\langle J| \otimes \rho_{\phi,p,J} \otimes \frac{I_{m_J}}{m_J} \right), \tag{20}$$
where $\simeq$ denotes the unitary equivalence implemented by the Schur transform, $J$ is the quantum number of the total spin, $q_J$ is a probability distribution, $|J\rangle$ is the state of the index register, $\rho_{\phi,p,J}$ is the state of the representation register, and $I_{m_J}/m_J$ is the maximally mixed state in a suitable subspace of the multiplicity register [29, 30]. The state $\rho_{\phi,p,J}$ can be expressed in the form
$$\rho_{\phi,p,J} = U_\phi^{\otimes n} \left( \rho_{p,J} \right) U_\phi^{\dagger \otimes n}, \qquad U_\phi = |0\rangle\langle 0| + e^{i\phi} |1\rangle\langle 1|, \tag{21}$$
where the fixed state $\rho_{p,J}$ has the form
$$\rho_{p,J} := (N_J)^{-1} \sum_{m=-J}^{J} p^{J+m} (1-p)^{J-m}\, |J, m\rangle_s \langle J, m|_s, \tag{22}$$
$$N_J := \sum_{m=-J}^{J} p^{J+m} (1-p)^{J-m}, \tag{23}$$
and $|J, m\rangle_s$ is the orthonormal basis defined as
$$|J, m\rangle_s := \frac{\sum_{\pi \in S_{2J}} V_\pi\, |\phi_0\rangle^{\otimes (J+m)} |\phi_{0,\perp}\rangle^{\otimes (J-m)}}{\sqrt{(2J)!\, (J+m)!\, (J-m)!}}, \tag{24}$$
with $|\phi_0\rangle = \sqrt{s}\, |0\rangle + \sqrt{1-s}\, |1\rangle$, $|\phi_{0,\perp}\rangle = \sqrt{1-s}\, |0\rangle - \sqrt{s}\, |1\rangle$, $S_{2J}$ being the symmetric group on $2J$ elements, and $V_\pi$ being the unitary implementing the permutation $\pi$.

The frequency projection channel $\mathcal{P}_{{\rm proj},J}$ is defined as
$$\mathcal{P}_{{\rm proj},J}(\rho) := P_{{\rm proj},J}\, \rho\, P_{{\rm proj},J} + \left( 1 - \mathrm{Tr}[\rho\, P_{{\rm proj},J}] \right) \rho_0, \qquad P_{{\rm proj},J} := \sum_{|m - (2s-1)J| \le \sqrt{J} \log J} |J, m\rangle\langle J, m|, \tag{25}$$
where $\rho_0$ is a fixed state of the representation register. What we need to prove is exactly the following theorem.

Theorem 1.
For large J , the frequency projection error (cid:15) proj ,J := (cid:107)P proj ,J ( ρ φ,p,J ) − ρ φ,p,J (cid:107) is upper bounded as (cid:15) proj ,J ≤ (3 / J − ln ( p − p ) + O (cid:16) J − ln J (cid:17) (26) for every t . The property below is useful in our proof of the error bound.
Lemma 1.
The basis $\{|J, m\rangle_s\}$ defined by Eq. (24) satisfies the property
$$|\langle J, m|_s\, |J, k\rangle| \le \begin{cases} \sqrt{\binom{2J}{J+k} \binom{2J}{J-m}}\; s^{J + \frac{k-m}{2}} (1-s)^{\frac{m-k}{2}}, & s \ge 1/2, \\[4pt] \sqrt{\binom{2J}{J+k} \binom{2J}{J-m}}\; s^{\frac{m+k}{2}} (1-s)^{J - \frac{m+k}{2}}, & s < 1/2, \end{cases} \tag{27}$$
where
$$|J, k\rangle := \frac{\sum_{\pi \in S_{2J}} V_\pi\, |0\rangle^{\otimes (J+k)} |1\rangle^{\otimes (J-k)}}{\sqrt{(2J)!\, (J+k)!\, (J-k)!}}$$
is the symmetric basis.

Proof. Exploiting the symmetry of both bases, we have the following chain of inequalities:
$$|\langle J, m|_s\, |J, k\rangle| = \frac{1}{(2J)!} \left| \sum_{\pi, \pi' \in S_{2J}} \frac{\langle\phi_0|^{\otimes (J+m)} \langle\phi_{0,\perp}|^{\otimes (J-m)}\, V_\pi V_{\pi'}\, |0\rangle^{\otimes (J+k)} |1\rangle^{\otimes (J-k)}}{\sqrt{(J+m)!\, (J-m)!\, (J+k)!\, (J-k)!}} \right| \tag{28}$$
$$= \left| \sum_{\pi \in S_{2J}} \frac{\langle\phi_0|^{\otimes (J+m)} \langle\phi_{0,\perp}|^{\otimes (J-m)}\, V_\pi\, |0\rangle^{\otimes (J+k)} |1\rangle^{\otimes (J-k)}}{\sqrt{(J+m)!\, (J-m)!\, (J+k)!\, (J-k)!}} \right| \tag{29}$$
$$= \sqrt{\frac{(J+m)!\, (J-m)!}{(J+k)!\, (J-k)!}}\; \left| \sum_{l = \max\{0, -k-m\}}^{\min\{J-m, J-k\}} (-1)^l \binom{J-k}{l} \binom{J+k}{J-m-l}\, s^{l + \frac{k+m}{2}} (1-s)^{J - \frac{m+k}{2} - l} \right| \tag{30}$$
$$\le \sqrt{\frac{(J+m)!\, (J-m)!}{(J+k)!\, (J-k)!}}\; \sum_{l} \binom{J-k}{l} \binom{J+k}{J-m-l}\, s^{l + \frac{k+m}{2}} (1-s)^{J - \frac{m+k}{2} - l} \tag{31}$$
$$= \sqrt{\frac{(J+m)!\, (J-m)!\; s^{J+k} (1-s)^{J-k}}{(J+k)!\, (J-k)!}}\; \sum_{l} \binom{J-k}{l} \binom{J+k}{J-m-l} \left( \frac{s}{1-s} \right)^{l - \frac{J-m}{2}} \tag{32}$$
$$\le \sqrt{\binom{2J}{J+k}\, s^{J+k} (1-s)^{J-k}} \cdot \sqrt{\binom{2J}{J-m} \left( \frac{s}{1-s} \right)^{J-m}}, \tag{33}$$
where the last step uses $l \le J - m$ together with the Vandermonde identity $\sum_l \binom{J-k}{l} \binom{J+k}{J-m-l} = \binom{2J}{J-m}$. Thus we have reached Eq. (27). Notice that the last inequality is for the case $s \ge 1/2$. For $s < 1/2$, the last term in the second square root should be replaced by $[(1-s)/s]^{J-m}$.

Proof of Theorem 1.
First observe that since (cid:104) U ⊗ nφ , P proj ,J (cid:105) = 0, the error is independent of φ . Then we can rewritethe error as (cid:15) proj ,J ≤ (cid:8) (cid:107) P proj ,J ρ p,J P proj ,J − ρ p,J (cid:107) + (1 − Tr [ P proj ,J ρ p,J ]) (cid:9) ≤ (cid:26) (cid:113) (1 − Tr [ P proj ,J ρ p,J ]) + (1 − Tr [ P proj ,J ρ p,J ]) (cid:27) ≤ (cid:113) − Tr [ P proj ,J ρ p,J ] , − Tr [ P proj ,J ρ p,J ]. First, we expend the quantity as1 − Tr [ P proj ,J ρ p,J ] = J (cid:88) m = − J p J + m (1 − p ) J − m N J (cid:104) J, m | s ( I − P proj ,J ) | J, m (cid:105) s = J (cid:88) m = − J p J + m (1 − p ) J − m N J (cid:88) | k − (2 s − J | > √ J log J |(cid:104) J, m | s | J, k (cid:105)| . Applying Lemma 1 (for s ≥ /
2, and the case s < / − Tr [ P proj ,J ρ p,J ] ≤ J (cid:88) m = J − a p J + m (1 − p ) J − m N J (cid:18) s − s (cid:19) J − m (cid:18) JJ − m (cid:19) (cid:88) | k − (2 s − J | > √ J log J s k (1 − s ) J − k (cid:18) Jk (cid:19) + J − a − (cid:88) m = − J p J + m (1 − p ) J − m N J where a ≥ − Tr [ P proj ,J ρ p,J ] ≤ (cid:18) s − s (cid:19) a (cid:18) Ja (cid:19) (cid:88) | k − (2 s − J | > √ J log J s k (1 − s ) J − k (cid:18) Jk (cid:19) + (cid:18) − pp (cid:19) a +1 (34) ≤ J ) a a ! exp (cid:20) − log J a ln (cid:18) s − s (cid:19)(cid:21) + (cid:18) − pp (cid:19) a +1 (35) ≤ exp (cid:20) − ln J a ln(2 J ) + a ln (cid:18) s − s (cid:19)(cid:21) + (cid:18) − pp (cid:19) a +1 , (36)having used Hoeffding’s bound and a ≥
2. Choosing, for instance, a = (cid:98) (ln J ) / (cid:99) guarantees that (cid:15) proj ,J ≤ (cid:115) exp (cid:20) − ln J J J (cid:18) s − s (cid:19)(cid:21) + (cid:18) − pp (cid:19) (ln J ) / ≤ (cid:18) − pp (cid:19) (ln J ) / + O (cid:16) J − ln J (cid:17) = (3 / J − ln ( p − p ) + O (cid:16) J − ln J (cid:17) . Supplementary Note 2: bound on the storage error.
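The concentration of the distribution $q_J$, on which this note relies, can be illustrated numerically. The sketch below is not taken from the paper. It assumes an even number $n$ of qubits and computes $q_J$ directly from the standard Schur-Weyl multiplicities, $q_J = m_J \sum_{m=-J}^{J} p^{n/2+m} (1-p)^{n/2-m}$ with $m_J = \binom{n}{n/2-J} \frac{2J+1}{n/2+J+1}$, and then measures how much probability lies near $J_0 = (p - 1/2)(n+1)$. Function names are hypothetical.

```python
import math


def spin_distribution(n, p):
    """Return the list [q_0, ..., q_{n/2}] for n qubits (n even), each in a
    state with eigenvalues p and 1 - p."""
    if n % 2:
        raise ValueError("this sketch assumes an even number of qubits")
    half = n // 2
    q = []
    for J in range(half + 1):
        # Multiplicity of the spin-J irreducible representation in n qubits
        # (the division is exact, so integer division is safe).
        multiplicity = math.comb(n, half - J) * (2 * J + 1) // (half + J + 1)
        # Trace of the n-copy state on a single copy of the spin-J representation.
        weight = sum(p ** (half + m) * (1.0 - p) ** (half - m)
                     for m in range(-J, J + 1))
        q.append(multiplicity * weight)
    return q


def mass_near(q, centre, width):
    """Total probability of the values J with |J - centre| <= width."""
    return sum(qJ for J, qJ in enumerate(q) if abs(J - centre) <= width)


if __name__ == "__main__":
    n, p = 100, 0.9
    q = spin_distribution(n, p)
    J0 = (p - 0.5) * (n + 1)
    print("normalisation:", sum(q))
    print("most likely J:", max(range(len(q)), key=lambda J: q[J]), "J0 =", J0)
    print("mass within 15 of J0:", mass_near(q, J0, 15))
```

For moderate $n$ the distribution is already sharply peaked around $J_0$, which is why the overall error is controlled by the projection error at values of $J$ close to $J_0$.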
Here we prove the compressor mentioned in the main text has a vanishing error. First, notice that the trace distancebetween the original state ρ ⊗ nφ,p and the output of the compression protocol, denoted as ρ (cid:48) φ,p,n , is (cid:15) φ,p = 12 (cid:13)(cid:13)(cid:13) ρ (cid:48) φ,p,n − ρ ⊗ nφ,p (cid:13)(cid:13)(cid:13) = 12 (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:88) J q J | J (cid:105)(cid:104) J | ⊗ [ P proj ,J ( ρ φ,p,J ) − ρ φ,p,J ] ⊗ I m J m J (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) = (cid:88) J q J · (cid:15) proj ,J , (cid:15) proj ,J = 12 (cid:107)P proj ,J ( ρ φ,p,J ) − ρ φ,p,J (cid:107) . (37)4Since we already have a bound for (cid:15) proj ,J , the only ingredient needed for the error bound is the concentration propertyof q J . Notice that the probability distribution q J in Eq. (20) has the explicit form [19] q J = 2 J + 12 J (cid:104) B (cid:16) n J + 1 (cid:17) − B (cid:16) n − J (cid:17)(cid:105) (38)where B ( k ) = p k (1 − p ) n − k (cid:0) nk (cid:1) and J = ( p − / n + 1), which is a Gaussian distribution concentrated in aninterval of width O ( √ n ) around J when n is large. Picking an interval much larger than √ n , we get from Hoeffding’sinequality that (cid:88) | J − J |≤ n / q J ≥ − (cid:20) n / p (cid:21) . (39)Substituting the above inequality into Eq. (37), we have (cid:15) φ,p ≤ max | J − J |≤ n / (cid:15) proj ,J + (cid:88) | J − J | >n / q J ≤ (cid:18) p − n (cid:19) ln ( p − p ) + O (cid:16) n − ln n (cid:17) . Finally, substituting p = (1 + e − γt ) / (cid:15) γ,t ≤ (cid:18) e γt n (cid:19) ln cosh γt + O (cid:16) n − ln n (cid:17) . From the above bound, we can directly get the bound for the overall error of compression in a quantum stopwatchas (cid:15) ( k ) T,γ ≤ k (cid:18) e γT n (cid:19) ln coth γT + O (cid:16) kn − ln n (cid:17) , (40)where k is the number of events and T is the total duration. Supplementary Note 3: coherent versus incoherent protocols.
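The comparison carried out in this note can also be explored numerically. The sketch below is an illustration under an explicit assumption: it reads the Fisher information for known decay rate as $F_{\rm loc}(\gamma, T) = 1 - \gamma^2 - \sqrt{1 - e^{-2\gamma T}} + \gamma^2/\sqrt{1 - e^{-2\gamma T}}$, and it declares the coherent protocol the winner when $k\, F_{\rm loc}(\gamma, T) \ge F_{\rm loc}(\gamma, T/k)$. Function names are hypothetical, and both $\gamma$ and $T$ are assumed strictly positive.

```python
import math


def fisher_known_gamma(gamma, T):
    """Assumed single-qubit Fisher information about T for a known decay rate
    gamma > 0 and elapsed time T > 0."""
    # sqrt(1 - exp(-2 gamma T)), computed with expm1 for accuracy at small T.
    s = math.sqrt(-math.expm1(-2.0 * gamma * T))
    return 1.0 - gamma ** 2 - s + gamma ** 2 / s


def coherent_wins(gamma, T, k):
    """True when one coherent measurement of the total time T is at least as
    accurate as k separate measurements of intervals of length T / k."""
    return k * fisher_known_gamma(gamma, T) >= fisher_known_gamma(gamma, T / k)


if __name__ == "__main__":
    for gamma, T in [(0.1, 1.0), (0.5, 10.0)]:
        for k in (2, 10, 10**4, 10**8):
            print(gamma, T, k, coherent_wins(gamma, T, k))
```

Under this reading, weak noise favours the coherent protocol already for a handful of intervals, while for strong accumulated noise the crossover only occurs at very large $k$, in line with the scaling argument of this note.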
Here we compare the accuracy of the coherent protocols with the accuracy of incoherent protocols using repeated measurements.

Suppose that we want to measure the total duration of $k$ intervals, each of which has length $T/k$. In the coherent protocol, each clock qubit starts in a pure state and ends up in a mixed state with maximum eigenvalue depending on the total duration $T$ as $p(T, \gamma) = (1 + e^{-\gamma T})/2$. The inaccuracy for the quantum stopwatch thus has leading order $1/\sqrt{nF_{\rm loc}}$, where
$$F_{\rm loc} = 1 - \gamma^2 - \sqrt{1 - e^{-2\gamma T}} + \frac{\gamma^2}{\sqrt{1 - e^{-2\gamma T}}}$$
is the classical Fisher information. On the other hand, an incoherent protocol involves $k$ measurements of time and $k$ initializations of the clock qubits, resulting in an inaccuracy of leading order $\sqrt{k/(nF'_{\rm loc})}$, where $F'_{\rm loc}$ is the classical Fisher information of the individual step, given by
$$F'_{\rm loc} = 1 - \gamma^2 - \sqrt{1 - e^{-2\gamma T/k}} + \frac{\gamma^2}{\sqrt{1 - e^{-2\gamma T/k}}}.$$
Comparing the two error terms, we conclude that the coherent protocol outperforms the incoherent one when $F_{\rm loc} \ge F'_{\rm loc}/k$, namely that
$$k \left[ 1 - \gamma^2 - \sqrt{1 - e^{-2\gamma T}} + \frac{\gamma^2}{\sqrt{1 - e^{-2\gamma T}}} \right] \ge 1 - \gamma^2 - \sqrt{1 - e^{-2\gamma T/k}} + \frac{\gamma^2}{\sqrt{1 - e^{-2\gamma T/k}}}.$$
Consider the regime of large $k$, fixing $\gamma$ and $T$. In this case, the left hand side of the above inequality scales linearly in $k$, while the right hand side scales as $\sqrt{k}$ (which can be seen by taking the Taylor expansion of the last term). Therefore, the left hand side dominates the right hand side in the large $k$ limit, and thus the coherent protocol is always better for large $k$.

Supplementary Note 4: inaccuracy of the local strategy for unknown γ.

Here we analyze the precision of the local time measurement strategy when $\gamma$ is unknown and is treated as a nuisance parameter, which is not directly of interest but affects the analysis of our estimation. For this analysis, we write the distribution of interest as $P_{T,\gamma}(\tau)$, because the dependence on $\gamma$ is crucial in the following discussion. Then, we introduce the Fisher information matrix $F_{\rm loc}$ defined by
$$(F_{\rm loc})_{x,y} := \int_0^{2\pi} d\tau\, \frac{\partial \log P_{T,\gamma}(\tau)}{\partial x} \cdot \frac{\partial \log P_{T,\gamma}(\tau)}{\partial y}\, P_{T,\gamma}(\tau),$$
where $x, y \in \{T, \gamma\}$. We need to evaluate $\left( F_{\rm loc}^{-1} \right)_{TT}$, namely the $(T, T)$-entry of the inverse Fisher information matrix $F_{\rm loc}^{-1}$.

To simplify the calculation, we first adopt a re-parameterization of the nuisance parameter from $\gamma$ to $p = (e^{-\gamma T} + 1)/2$. The $T$ component $\left( F_{\rm loc}^{-1} \right)_{TT}$ of the inverse Fisher information is invariant under such a re-parameterization of the nuisance parameter, which can be proved as follows.

The inverse Fisher information $F_{\rm loc}^{-1}$ with respect to the parametrization $(T, \gamma)$ is given by
$$F_{\rm loc}^{-1} = \begin{bmatrix} (F_{\rm loc})_{TT} & (F_{\rm loc})_{T,\gamma} \\ (F_{\rm loc})_{\gamma,T} & (F_{\rm loc})_{\gamma\gamma} \end{bmatrix}^{-1}.$$
Now, consider a re-parameterization of the nuisance parameter $\gamma \to p$. The new Fisher information matrix is $F_{{\rm loc},p} = A^T F_{\rm loc} A$, where
$$A = \begin{pmatrix} 1 & 0 \\ \frac{\partial \gamma}{\partial T} & \frac{\partial \gamma}{\partial p} \end{pmatrix}$$
is the Jacobi matrix. Therefore, we have $F_{\rm loc}^{-1} = A \left( F_{{\rm loc},p}^{-1} \right) A^T$ and, since the first row of $A$ is $(1, 0)$, it can be checked by straightforward calculation that $\left( F_{\rm loc}^{-1} \right)_{TT} = \left( F_{{\rm loc},p}^{-1} \right)_{TT}$.

It is therefore enough to calculate the Fisher information matrix for the parameterization $(T, p)$, when the input state is $\rho_{T,p} = p\, |\varphi_T\rangle\langle\varphi_T| + (1-p)\, |\varphi_T^\perp\rangle\langle\varphi_T^\perp|$ with $|\varphi_T\rangle = (|0\rangle + e^{-iT} |1\rangle)/\sqrt{2}$ and $|\varphi_T^\perp\rangle = (|0\rangle - e^{-iT} |1\rangle)/\sqrt{2}$. That is, in the following, we employ the parametrization $P_{T,p}(\tau)$ by the pair $(T, p)$ for the probability of the measurement to yield outcome $\tau$. When the input state is $\rho_{T,p}$, it can be expressed as
$$P_{T,p}(\tau) := \mathrm{Tr}[\rho_{T,p} M_\tau] = \frac{1}{2\pi} \left( 1 + (2p-1) \cos(\tau - T) \right). \tag{41}$$
The Fisher information of $P_{T,p}(\tau)$ can be derived from a lengthy but straightforward calculation, and we have
$$(F_{{\rm loc},p})_{TT} = \int_0^{2\pi} \left( \frac{\partial \log P_{T,p}(\tau)}{\partial T} \right)^2 P_{T,p}(\tau)\, d\tau = 1 - \sqrt{1 - (2p-1)^2} = 1 - \sqrt{1 - e^{-2\gamma T}}. \tag{42}$$
To evaluate the inverse of the Fisher information matrix, we also need to calculate the off-diagonal elements. We define the logarithmic likelihood derivative $l_p(\tau)$ with respect to $p$ as l_p(τ) := ∂ log P_{T,p}(τ)/∂p = 2 cos(τ − T) / [1 + (2p −