One from many: Estimating a function of many parameters
Jonathan A. Gross 1, 2, 3, ∗ and Carlton M. Caves †
1 Center for Quantum Information and Control, University of New Mexico, Albuquerque NM 87131-0001, USA
2 Centre for Engineered Quantum Systems, School of Mathematics and Physics, The University of Queensland, St. Lucia QLD 4072, Australia
3 Institut quantique and Département de Physique, Université de Sherbrooke, Québec J1K 2R1, Canada
(Dated: September 18, 2020)
Abstract
Difficult it is to formulate achievable sensitivity bounds for quantum multiparameter estimation. Consider a special case, one parameter from many: many parameters of a process are unknown; estimate a specific linear combination of these parameters without having the ability to control any of the parameters. Superficially similar to single-parameter estimation, the problem retains genuinely multiparameter aspects. Geometric reasoning demonstrates the conditions, necessary and sufficient, for saturating the fundamental and attainable quantum-process bound in this context.

I. INTRODUCTION

Well-traveled is the path of deriving quantum bounds on the mean-square error of estimating a single parameter. Fisher information provides the necessary concept. Marrying Fisher information to quantum measurement theory (a marriage made in heaven!) yields the quantum Cramér-Rao bound (QCRB) on estimating a single parameter.
Less traveled is the deceptively similar trail of estimating a function of several parameters. Similar, yes, yet not merely a recasting of the single-parameter problem, this is a different problem with genuinely multiparameter connotations.

For those venturing onto this path, this paper formulates a roadmap for navigating the tricky terrain. Our work, challenged into existence by Eldredge et al., explores the bound, presented there, on estimating a function of the parameters. Our goals: examine and interpret this bound, relating it to the standard bound on estimating a single parameter; formulate the quantum version in terms of a QCRB; find the necessary and sufficient conditions for saturation of the quantum bound; finally, optimize over quantum measurements and states to forge a new bound that depends only on the quantum process that imprints the information one wants to determine. The key to achieving these goals comes, surprisingly, from differential geometry: respect the distinction between tangent vectors, associated with single-parameter estimation, and differential forms, associated with estimation of a function, a distinction obscured and suppressed by a parochial preoccupation with single-parameter estimation.

Work within the physics community has considered the task we set for ourselves here. Distributed (or networked) quantum sensing is how physicists describe this task, thinking that the function to be estimated is constructed from parameters on distributed sensing devices. Sidhu and Kok provide an overview of distributed sensing in Sec. VIII of an excellent review of quantum parameter estimation. To avoid the pitfalls of single-parameter thinking, a typical approach is to calculate genuine multiparameter-estimation bounds and from these to extract a function-estimation bound. Successful though this approach is, it obscures the geometry of the problem through the introduction of extraneous ingredients and suffers from uncertainty in the saturability of some bounds.
Both issues we address, by identifying the relevant geometric objects.

An extensive statistics literature has considered Fisher-information bounds on estimating one or more relevant parameters in the presence of a set of irrelevant parameters called nuisance parameters. Developed for classical estimation in the 1970s through 1990s, this nuisance-parameter approach has been extended recently to quantum estimation. Equivalent though the nuisance-parameter language is to what we do here, we generally avoid it, because it encourages inattention to the distinction between parameter estimation and function estimation. Knowing a function to be estimated does not specify a set of irrelevant nuisance parameters; indeed, such specification defines a single-parameter estimation problem, not a function estimation. Geometrically invariant language being always our preference, we say, instead of referring to nuisance parameters, that a subspace of constant function value is not under the control of the experimenter. Notable also in this statistics literature, despite formulation as "information geometry," is an absence of geometric intuition and visualization, as evidenced by the near absence of figures. That deficiency we remedy.

Noteworthy is recent work by Tsang et al., which unites the physics and statistics strands and considers bounds on function estimation from a geometric perspective. Similar to, yet different from our analysis, the work of Tsang et al. is in some ways more general in that the analysis applies to bounds on quantities in addition to the mean-square error that goes with Fisher information and Cramér-Rao bounds.

Guided by a single star, this paper rows gently, but steadily and relentlessly in one direction: for an arbitrary, unitary or nonunitary quantum process, which imprints the information to be estimated on a quantum system, optimize over quantum measurements and initial state to obtain an achievable QCRB for function estimation that depends only on the quantum process. The key geometric object that emerges from this journey we call the process norm. The process norm and associated quantum-process bounds on function estimation, along with persistent attention to geometric thinking and visualization, are the chief contributions of this paper.

The general absence of geometric visualization and intuition in the parameter-estimation literature, despite invocations of information geometry, is an example of what colleague Christopher Jackson calls "algebra fever," the mania in modern mathematics to eschew geometric thinking in favor of translating geometric concepts into algebra, at which point geometric intuition is forgotten.

Inexplicable, it might be thought, to proceed from Jackson's observation to an explanation of this paper's distinctive style. Yet explaining the inexplicable is our job, so listen up. Kip Thorne's recent biographical memoir of John A. Wheeler reminded us of Wheeler's passion to geometrize Einstein's general relativity and of the idiosyncratic, yet compelling writing style Wheeler employed to promote that passion. Possessing a similar passion to geometrize metrology (less grand, to be sure, than Wheeler's goals, but passionate nonetheless), we adopt here Wheeler's style. As compelling we hope to be, but failing that, as idiosyncratic. The style is a reminder, on every page of the paper, that we aim to put geometric thinking at the heart of quantum metrology.
II. SETTING UP THE PROBLEM
Specify the problem of interest: estimate a property of a physical process through repeated interactions. Assume the physical process belongs to a family of quantum channels $\mathcal{E}_{\tilde\theta}$ parametrized by $\tilde\theta = (\theta^1, \ldots, \theta^N)$ and the property is a function $q(\tilde\theta)$ of these parameters. Consider interacting with the process by preparing a quantum system in a chosen state, subjecting the system to the evolution the process dictates, and finally measuring the evolved system. Perform many such interactions, and estimate the property of interest based on the data obtained. Pose now the natural question: what is the best precision with which the property can be estimated?

A luxury that guarantees proximity to the truth is the availability of many interactions with the process. Indulge therefore in an initial estimate for the process encoded in a parameter point $\tilde\theta_0$ near to the true parameter point. Lock the interactions to this fiducial operating point. Precision is then described by the extent to which small deviations of the truth from $\tilde\theta_0$ can be detected.

Avoid complications arising from singularities and degeneracies by taking the parameters to be independent and physically meaningful in the neighborhood of $\tilde\theta_0$. Such parametrizations are realized by local coördinate charts of the process manifold, as distinct points in this manifold refer to independent, physically distinguishable channels. More pathological scenarios, induced perhaps by infinite dimensions or additional constraints on the experimenter, might be addressed using techniques employed by Tsang et al. for the state-estimation problem; these techniques circumvent direct inversion of potentially singular objects.

Reify the setup through two examples. To estimate a phase shift $\varphi$ in the presence of an unknown loss rate $\gamma$, coördinates $\tilde\theta = (\varphi, \gamma)$ parametrize the process, and the function is simply $q(\varphi, \gamma) = \varphi$.
To estimate the average fidelity of a process with respect to a target unitary $\mathcal{U}$, coördinates $\tilde\theta$ parametrize the family of all quantum channels, and $q(\tilde\theta) = \int d\psi\, \big\langle \psi \big| \big(\mathcal{U}^{-1} \circ \mathcal{E}_{\tilde\theta}\big)\big(|\psi\rangle\langle\psi|\big) \big| \psi \big\rangle$.

Unitary processes occupy much of our attention in this exposition, so additional comments peculiar to this case are in order. Transform the family of interest $\mathcal{V}_{\tilde\theta}$ to the equivalent family $\mathcal{U}_{\tilde\theta} = \mathcal{V}_{\tilde\theta} \circ \mathcal{V}_{\tilde\theta_0}^{-1}$; the fiducial parameter point $\tilde\theta_0$ then corresponds to the identity process, $\mathcal{U}_{\tilde\theta_0} = \mathcal{I}$. Convenient it is to parametrize this unitary by its Hamiltonian $H(\tilde\theta)$,

$$\mathcal{U}_{\tilde\theta}(\rho) = e^{-iH(\tilde\theta)}\, \rho\, e^{iH(\tilde\theta)} . \tag{2.1}$$

The motivating work by Eldredge et al. considered a Hamiltonian for a set of spins and a property $q$, both assumed to be linear in the chosen parametrization $\tilde\theta$,

$$H(\tilde\theta) = \frac{1}{2} \sum_j \theta^j \sigma^z_j = \frac{1}{2}\, \theta^j \sigma^z_j , \tag{2.2}$$

$$q(\tilde\theta) = \sum_j \alpha_j \theta^j = \alpha_j \theta^j . \tag{2.3}$$

The last forms introduce the Einstein summation convention: sum over index labels that occur simultaneously in a lower and an upper position within an expression. Though arbitrary Hamiltonians and properties are not linear functions of a parametrization, write linear approximations to them in the neighborhood of the fiducial point $\tilde\theta_0$:

$$H(\tilde\theta_0 + d\tilde\theta) = H(\tilde\theta_0) + d\theta^j\, \partial_j H \big|_{\tilde\theta_0} = d\theta^j X_j , \tag{2.4}$$

$$q(\tilde\theta_0 + d\tilde\theta) = q(\tilde\theta_0) + \partial_j q \big|_{\tilde\theta_0}\, d\theta^j = q(\tilde\theta_0) + q_j\, d\theta^j . \tag{2.5}$$

Here and throughout, employ the shorthand $\partial_j = \partial/\partial\theta^j$ to harmonize with the summation convention. Also eliminated is $H(\tilde\theta_0)$, set to zero since $\mathcal{U}_{\tilde\theta_0} = \mathcal{I}$. Further, reparametrize to choose $\tilde\theta_0 = 0$ and to set $q(\tilde\theta_0) = 0$.

Justified indeed are these linear approximations when bounding optimal estimation, as the limit of many interactions is our concern and the uncertainty in $\tilde\theta$ in this limit is correspondingly small.
Press this point home: estimation in the limit of many experiments is properly studied in the tangent space to the parameter manifold at a fiducial point. This perspective we develop in greater detail in the following section.

Much attention has been devoted to a problem similar to ours: that of estimating a property of a quantum state with many unknown parameters. Only sporadically has the corresponding problem for quantum channels been addressed, a notable example being the estimation of Pauli-channel asymmetry in Gazit et al.

III. CLASSICAL ESTIMATION: EXERCISING YOUR DIFFERENTIAL GEOMETRY
Tangent vectors, differential forms, and metrics: these basic elements from differential geometry provide the mathematical language for the estimation problem. Generally forgotten in the parameter-estimation literature are differential forms, mainly due to a focus on single-parameter problems. Worthy of our meditation and attention day and night are these geometric objects; renew now acquaintance with them.
A. Classical Fisher information
Understand first the classical problem of estimating the parameters specifying a given probability distribution within a parametrized family of distributions $p(x|\tilde\theta)$, reserving for subsequent sections the issue of choosing initial system state and final system measurement that transform a parametrized family of quantum channels into such a family of distributions.

The classical procedure is straightforward: sample data $x$ from the conditional probability $p(x|\tilde\theta)$ and use hatted function $\hat\theta^j(x)$ to estimate the parameter $\theta^j$ from the data. The covariance matrix of the estimators,

$$C^{jk} = \mathrm{Cov}(\hat\theta^j, \hat\theta^k) = \int dx\, p(x|\tilde\theta)\, \big[\hat\theta^j(x) - \langle\hat\theta^j\rangle_{\tilde\theta}\big] \big[\hat\theta^k(x) - \langle\hat\theta^k\rangle_{\tilde\theta}\big] , \tag{3.1}$$

captures a mean-square notion of the accuracy of the estimates. In this definition,

$$\langle\hat\theta^j\rangle_{\tilde\theta} = \int dx\, p(x|\tilde\theta)\, \hat\theta^j(x) \tag{3.2}$$

is the mean value of the estimator $\hat\theta^j(x)$, and the parameters $\tilde\theta = (\theta^1, \ldots, \theta^N)$ should be regarded as true values.

The deviations $\hat\theta^j(x) - \langle\hat\theta^j\rangle_{\tilde\theta}$ express how far the estimates depart from the mean value. Better it might be thought to use as deviations the difference between the estimate and the true value, $\hat\theta^j(x) - \theta^j$; this usage replaces the covariance matrix with the error-correlation matrix. An unbiased estimator has mean values equal to true values, i.e., $\langle\hat\theta^j\rangle_{\tilde\theta} = \theta^j$. Appendix A demonstrates how to extract from a biased estimator an estimator unbiased in a neighborhood of the fiducial operating point, referred to in the literature as a locally unbiased estimator.
Specialize now and henceforth to unbiased estimators, thus making the error-correlation matrix identical to the covariance matrix.

The Fisher-information matrix,

$$F_{jk} = \int dx\, p(x|\tilde\theta)\, \partial_j \ln p(x|\tilde\theta)\, \partial_k \ln p(x|\tilde\theta) = \int dx\, \frac{1}{p(x|\tilde\theta)}\, \partial_j p(x|\tilde\theta)\, \partial_k p(x|\tilde\theta) , \tag{3.3}$$

is the foundation on which rests classical multiparameter-estimation theory. For the small deviations from the fiducial operating point contemplated in this paper, the integrands in the expressions for the covariance matrix and the Fisher-information matrix should be evaluated at the fiducial point, i.e., $\tilde\theta = \tilde\theta_0$.

Foundation because the covariance matrix satisfies the matrix inequality

$$C \ge F^{-1} , \tag{3.4}$$

called the multiparameter (classical) Cramér-Rao bound (CCRB). Achieving the CCRB generally requires working in the asymptotic limit of many trials and requires using the right estimators; maximum-likelihood estimation works. Assume here and hereafter an appropriate estimator and sufficient trials to achieve the CCRB.
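To make Eqs. (3.3) and (3.4) concrete, the following is a minimal sketch that evaluates the Fisher-information matrix and its inverse at a fiducial point. The three-outcome family `probabilities`, the finite-difference step, and all function names are assumptions chosen for illustration; none of them comes from the analysis above.

```python
# Sketch: Fisher-information matrix (3.3) and its inverse, the CCRB (3.4).
# The three-outcome family below is a hypothetical toy model.

def probabilities(theta):
    """Toy family p(x|theta) for x in {0, 1, 2}; fiducial point is theta = (0, 0)."""
    t1, t2 = theta
    return [1 / 3 + t1, 1 / 3 + t2, 1 / 3 - t1 - t2]


def fisher_matrix(prob_fn, theta, eps=1e-6):
    """F_jk = sum_x (1/p) d_j p d_k p, with derivatives from central differences."""
    p = prob_fn(theta)
    grads = []
    for j in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[j] += eps
        dn[j] -= eps
        p_up, p_dn = prob_fn(up), prob_fn(dn)
        grads.append([(a - b) / (2 * eps) for a, b in zip(p_up, p_dn)])
    return [
        [sum(gj[x] * gk[x] / p[x] for x in range(len(p))) for gk in grads]
        for gj in grads
    ]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


F = fisher_matrix(probabilities, [0.0, 0.0])
F_inv = inverse_2x2(F)  # optimal single-trial covariance matrix per (3.4)
print("F     =", F)
print("F^-1  =", F_inv)
```

For this toy family the derivatives are $\partial_1 p = (1, 0, -1)$ and $\partial_2 p = (0, 1, -1)$, so the matrix works out analytically to $F_{11} = F_{22} = 6$ and $F_{12} = 3$, with inverse entries $2/9$ on the diagonal and $-1/9$ off it. These match the single-trial covariances of the unbiased estimators built from outcome indicators shifted by $1/3$.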
FIG. 1. Level surfaces of linear combination $q = q_j\theta^j$, here taken to be $q = \theta^1 + \theta^2$. This example for parameter $q$ is used throughout in figures for two coördinate dimensions (three-dimensional Fig. 12 uses $q = \theta^1 + \theta^2 + \theta^3$). The vector $\mathbf{v}$, here taken to be a combination of $\partial_1$ and $\partial_2$ with $v^1 + v^2 = 2$, extends through two units of $q$, this expressed by $dq(\mathbf{v}) = \mathbf{v}(q) = 2$. The level surfaces of the parameters $\theta^1$ and $\theta^2$ define a square grid. The directional-derivative basis vector $\partial_{1,2} = \partial/\partial\theta^{1,2}$ lies in a level surface of $\theta^{2,1}$ and extends one unit in $\theta^{1,2}$. Neatly summarizing this description is the grid equation $d\theta^j(\partial_k) = \partial_k\theta^j = \delta^j_k$. No notion of length and orthogonality yet; the square grid for $\theta^1$ and $\theta^2$ is used only for convenience.

To understand the message of the CCRB, learn now to inhabit the linearized neighborhood of the fiducial point $\tilde\theta_0$. Of primary importance is appreciating an important distinction: measuring changes in the property $q$ along a particular path corresponding to varying a linear combination of the parameters $\theta^j$ requires bringing together two distinct geometric objects, one that characterizes how $q$ changes as the parameters $\theta^j$ wander around the neighborhood and another that identifies the particular path.

Call the linearized neighborhood of the fiducial point $\tilde\theta_0$ by its formal name, the tangent space. Represent a small displacement $\mathbf{v}$ on the tangent space graphically by an arrow, as done in Fig. 1, and algebraically by a directional (partial) derivative,

$$\mathbf{v} = v^j \partial_j . \tag{3.5}$$

The vector $\mathbf{v}$ is a linear combination of the directional derivatives $\partial_j$ associated with the coördinates $\theta^j$.

Represent the property $q$ graphically on the tangent space by its level surfaces, as is done in Fig. 1, and algebraically by a differential form,

$$dq = q_j\, d\theta^j . \tag{3.6}$$

Notice that $dq$ is a linear combination of the differential forms $d\theta^j$ associated with the level surfaces of the coördinates $\theta^j$; together, the level surfaces of all the coördinates define the familiar coördinate grid (see Fig. 1).
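The distinction between vectors, Eq. (3.5), and forms, Eq. (3.6), can be sketched numerically: under a linear change of coördinates the two sets of components transform differently, while the number obtained by pairing a form with a vector does not change. The vector components, the matrix `A`, and the helper names below are hypothetical values chosen only for illustration.

```python
# Sketch: pairing a differential form with a tangent vector, and the
# invariance of that pairing under a linear reparametrization.
# All numerical values here are hypothetical illustration choices.

def pair(form, vector):
    """Contract lower-index form components q_j with upper-index components v^j."""
    return sum(qj * vj for qj, vj in zip(form, vector))


def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def vec_mat(f, m):
    """Row vector times matrix: (f m)_k = f_i m[i][k]."""
    return [sum(f[i] * m[i][k] for i in range(len(f))) for k in range(len(m[0]))]


dq = [1.0, 1.0]   # components q_j for q = theta^1 + theta^2
v = [3.0, -1.0]   # hypothetical vector components v^j

# Dual bases: the form d(theta^j) applied to the basis vector partial_k
# gives the Kronecker delta.
basis_vectors = [[1.0, 0.0], [0.0, 1.0]]
basis_forms = [[1.0, 0.0], [0.0, 1.0]]
grid = [[pair(f, e) for e in basis_vectors] for f in basis_forms]

# New coordinates theta' = A theta: vector components transform with A,
# form components with the inverse of A acting from the right.
A = [[2.0, 1.0], [0.0, 1.0]]
A_inv = [[0.5, -0.5], [0.0, 1.0]]
v_new = mat_vec(A, v)
dq_new = vec_mat(dq, A_inv)

print("dq(v) in old coordinates:", pair(dq, v))
print("dq(v) in new coordinates:", pair(dq_new, v_new))
```

Both pairings give the same number because the factors of `A` and `A_inv` cancel in the contraction; this is the sense in which the value of $dq$ on a vector is a geometric statement rather than a statement about components.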
The differential form $dq$ characterizes how $q$ changes in the linear neighborhood of the fiducial point and is poised to measure the change in the value of $q$ effected by a vector $\mathbf{v}$ in the tangent space:

$$dq(\mathbf{v}) = \mathbf{v}(q) = v^j \partial_j q = q_j v^j \tag{3.7}$$

is the difference between the value of $q$ at the tip of $\mathbf{v}$ and the value of $q$ at the tail of $\mathbf{v}$ (located at the origin).

The parametrization $\tilde\theta$ defines a basis of forms and a basis of vectors dual to one another in the sense that $\partial_j$ lies within the zero surface of all $d\theta^k$ with $k \ne j$ and extends to the unit surface of $d\theta^j$. Summarizing the pictorial properties of the coördinate grid is a compact set of equations:

$$d\theta^j(\partial_k) = \frac{\partial\theta^j}{\partial\theta^k} = \delta^j_k . \tag{3.8}$$

This formalizes the important distinction: differential forms characterize how a quantity like $q$ varies in the linear neighborhood of the fiducial point; vectors specify movement in the tangent space.

Constructed by taking directional derivatives, the Fisher-information matrix (3.3) has a natural expression as a (covariant) 2-tensor,

$$F_\downarrow = F_{jk}\, d\theta^j \otimes d\theta^k , \tag{3.9}$$

on the tangent space. Indeed, manifestly symmetric and positive is the Fisher-information matrix, so it is a Riemannian metric on the tangent space, providing a prescription for taking inner products between vectors,

$$\langle \mathbf{u}, \mathbf{v} \rangle_F = F_{jk} u^j v^k = F_\downarrow(\mathbf{u}, \mathbf{v}) . \tag{3.10}$$

The matrix elements $F_{jk}$ are the inner products $F_\downarrow(\partial_j, \partial_k)$. Positive the Fisher-information matrix is, but it can have zero eigenvalues. Care is required in dealing with degenerate Fisher-information matrices, as is evident from the CCRB (3.4). Proceed now with caution, assuming the Fisher-information matrix is strictly positive; return to the question of degenerate Fisher-information matrices at the end of Sec. III D.

Each vector $\mathbf{v}$ defines a single-parameter estimation problem by locally restricting the family of distributions to parameter variations that give displacements along $\mathbf{v}$.
The Fisher information for this single-parameter problem is the scalar

$$F_{\mathbf{v}\mathbf{v}} = \int dx\, \frac{1}{p(x|\tilde\theta)}\, \mathbf{v}\big(p(x|\tilde\theta)\big)\, \mathbf{v}\big(p(x|\tilde\theta)\big) = \int dx\, \frac{1}{p(x|\tilde\theta)}\, v^j \partial_j p(x|\tilde\theta)\, v^k \partial_k p(x|\tilde\theta) = F_\downarrow(\mathbf{v}, \mathbf{v}) . \tag{3.11}$$

Pause to savor that the Fisher-information tensor holds within itself the CCRB for all single-parameter problems.

More explicit we can be about the single-parameter estimation problem specified by $\mathbf{v}$: vary and estimate a parameter $\phi = \phi^1$ satisfying $d\phi^1(\mathbf{v}) = 1$, while holding fixed the other $N - 1$ parameters $\phi^j$, $j = 2, \ldots, N$, satisfying $d\phi^j(\mathbf{v}) = 0$. In words, considering $(\phi^1, \ldots, \phi^N)$ as a local coördinate system, $\mathbf{v}$ extends one unit in $\phi^1$ and points in the direction obtained by varying $\phi^1$ while holding the other coördinates fixed (see Fig. 2); implied is that $\mathbf{v} = \partial/\partial\phi^1$. The scalar Fisher information (3.11) bounds the single-parameter estimator variance (keep in mind the assumption of unbiased estimators, $\langle\hat\phi\rangle_{\tilde\theta} = \phi$),

$$(\Delta\hat\phi)^2 = \int dx\, p(x|\tilde\theta)\, \big[\hat\phi(x) - \phi\big]^2 \ge \frac{1}{F_{\mathbf{v}\mathbf{v}}} = \frac{1}{F_\downarrow(\mathbf{v}, \mathbf{v})} . \tag{3.12}$$

Find a fuller understanding of coördinate systems matched to single-parameter estimation in Sec. III C.

FIG. 2. Natural coördinate grid relative to the level surfaces of $q$, natural because one of the parameters, call it $\phi^1$, is equal to $q$. Read the $(\phi^1, \phi^2)$ coördinates of a parameter point by identifying the level surfaces of $\phi^1$ and $\phi^2$ in which the point lies. The basis vectors (directional derivatives) are $\mathbf{b}_j = \partial/\partial\phi^j = b^k_j \partial_k$. That $\phi^1 = q$ means that $\mathbf{b}_1$ advances one unit in $q$ [$dq(\mathbf{b}_1) = 1$] and that $\mathbf{b}_2$ lies in the $q = 0$ level surface means that $dq(\mathbf{b}_2) = 0$. Enough to define a single-parameter estimation problem these conditions are not, because $\mathbf{b}_1$, along which $\phi^1$ advances, can point to any location on the plane $q = 1$, its direction determined by the other parameters; specifically, the direction of $\mathbf{b}_1$ is determined by varying $\phi^1$ while holding $\phi^2$ constant [$d\phi^2(\mathbf{b}_1) = 0$].

The inverse of the Fisher-information matrix is the optimal covariance matrix, which measures deviations of parameter estimates from the true parameter value. Since measuring deviations is the job of differential forms, learn with satisfaction that the natural formulation of the inverse Fisher-information matrix is as a (contravariant) 2-tensor,

$$F^\uparrow = (F^{-1})^{jk}\, \partial_j \otimes \partial_k = F^{jk}\, \partial_j \otimes \partial_k , \tag{3.13}$$

which provides a prescription for calculating (optimal) covariances of parameters specified by forms,

$$\mathrm{Cov}(\hat\theta^j, \hat\theta^k) = F^{jk} = F^\uparrow(d\theta^j, d\theta^k) . \tag{3.14}$$

B. Scalar estimation

No control of any of the parameters, no prior constraints on how any parameter varies, no ability to hold any combination of the parameters fixed: these mean that an estimate of $q$ must be extracted from estimating all the parameters. Uncertainties in the estimates of all the parameters feed into the uncertainty in the estimate of $q$.

From the parameter estimators $\hat\theta^j(x)$ comes an estimator $\hat q(x)$, the same linear combination as $q$ is a linear combination of the parameters $\theta^j$:

$$\hat q(x) = q_j\, \hat\theta^j(x) . \tag{3.15}$$

The estimator variance

$$(\Delta\hat q)^2 = \int dx\, p(x|\tilde\theta)\, \big[\hat q(x) - q\big]^2 = q_j q_k \int dx\, p(x|\tilde\theta)\, \big[\hat\theta^j(x) - \theta^j\big] \big[\hat\theta^k(x) - \theta^k\big] \tag{3.16}$$

(recall the assumption of unbiased estimators, for which $\langle\hat q\rangle_{\tilde\theta} = q$) is the action of the covariance matrix (3.1), written as a contravariant 2-tensor $C^\uparrow$, on the form $dq$:

$$(\Delta\hat q)^2 = C^{jk} q_j q_k = C^\uparrow(dq, dq) . \tag{3.17}$$

The matrix CCRB (3.4) provides the one-from-many, no-control CCRB for the function $q$,

$$(\Delta\hat q)^2 = C^\uparrow(dq, dq) \ge F^\uparrow(dq, dq) . \tag{3.18}$$

Implicated here is the invariant constructed from the contravariant form of the Fisher metric, $F^\uparrow$, and the 1-form $dq$:

$$F^\uparrow(dq, dq) = F^{jk} q_j q_k = q^j q_j . \tag{3.19}$$

Raise the index on $dq$ using $F^\uparrow$, and find the vector $\mathbf{q}_F = q^j \partial_j$ introduced in the last form,

$$q^j = F^{jk} q_k . \tag{3.20}$$

Orthogonal to the level surfaces of $q$, according to the Fisher metric, is $\mathbf{q}_F$:

$$\langle \mathbf{q}_F, \mathbf{v} \rangle_F = F_\downarrow(\mathbf{q}_F, \mathbf{v}) = F_{jk} q^j v^k = q_k v^k = dq(\mathbf{v}) = 0 , \tag{3.21}$$

for any $\mathbf{v}$ that lies in the level surfaces of $q$. Express the invariant (3.19) in all its forms,

$$F^{jk} q_j q_k = F^\uparrow(dq, dq) = F_{jk} q^j q^k = F_\downarrow(\mathbf{q}_F, \mathbf{q}_F) = \langle \mathbf{q}_F, \mathbf{q}_F \rangle_F = q^j q_j = dq(\mathbf{q}_F) . \tag{3.22}$$

Pause to appreciate that no-control estimation is controlled by this invariant.

FIG. 3. The Fisher information for a multiparameter family of probability distributions gives the smallest covariance of estimates about the true parameter values. The Fisher information for a single parameter in this setting gives the smallest variance of estimates of that parameter about its true value, conditioned on knowledge of the true values of the other parameters (here illustrated by the parameter $\mathbf{b}$). The smallest variance of estimates of the value of a function of the parameters about its valuation at the true parameter values is given by the variance of the multiparameter-estimate distribution marginalized over parameters that don't change the function value (here illustrated by the level surfaces of the function $q$). The probability densities at right illustrate how the conditional distribution of a parameter associated with the function values is generally narrower than the limit given by the marginal of the Fisher information.

C. Scalar estimation is not single-parameter estimation
Address now the pitfalls in neglecting the distinction between forms and vectors. Reparametrize the tangent space with new coördinates $\tilde\phi = (\phi^1, \ldots, \phi^N)$. Match these coördinates to the job of estimating $q$ by calling out one of the new coördinates, make it the first, to be $q$ itself, i.e., $\phi^1 = q$, with associated differential form

$$d\phi^1 = dq = q_j\, d\theta^j . \tag{3.23}$$

Emerging from these new coördinates are new directional derivatives,

$$\frac{\partial}{\partial\phi^j} = \mathbf{b}_j = b^k_j \partial_k , \tag{3.24}$$

their vectorial character highlighted by the special designation $\mathbf{b}_j$. Choose often in the following to omit the index on the special coördinate $\phi^1$ and its associated directional derivative $\mathbf{b}_1$, writing $\phi = \phi^1$ and $\mathbf{b} = \mathbf{b}_1$. These new coördinates and their basis vectors define a new coördinate grid, characterized by the equations

$$d\phi^j(\mathbf{b}_k) = \mathbf{b}_k(\phi^j) = \delta^j_k . \tag{3.25}$$

One such new coördinate grid is illustrated in Fig. 2.

Suggested by this parametrization is a single-parameter estimation problem closely tied to the problem of estimating $q$: $\mathbf{b}$ specifies a line through the fiducial origin in the tangent space that specifies a single-parameter manifold of distributions, and $dq(\mathbf{b}) = 1$ means that the parameter $\phi$ changes by one unit from tail to tip of $\mathbf{b}$. Alluring though this identification is, at our disposal are the tools to silence the siren's call.

Observe the difference between optimal variances of single-parameter estimation of $\phi$ and estimation of $q$ within a multiparameter manifold:

$$(\Delta\hat\phi)^2 \ge \frac{1}{F_\downarrow(\mathbf{b}, \mathbf{b})} = (F_{jk} b^j b^k)^{-1} , \tag{3.26}$$

$$(\Delta\hat q)^2 \ge F^\uparrow(dq, dq) = (F^{-1})^{jk} q_j q_k . \tag{3.27}$$

Figure 3 illustrates the distinction between these two quantities, depicting the covariance of the full estimator as a shaded ellipse containing the tips of all vectors $\mathbf{v}$ that represent parameter changes within a standard deviation of the origin, i.e., $F_\downarrow(\mathbf{v}, \mathbf{v}) \le 1$. Variation in $\hat q$ is clearly variation in the full estimator distribution marginalized over deviations that leave $q$ unchanged, while variation in $\hat\phi$ is variation in the full estimator conditioned on the other parameters being held fixed to their fiducial values.

Most importantly, variation in $\hat\phi$ depends on an arbitrary choice of parametrization. Given only the choice $\phi^1 = q$, $\mathbf{b}$ can place its tip at any point on the plane $q = 1$; its direction, required to specify a single-parameter problem, is determined by the coördinates that accompany $\phi^1$. Specifically, $\mathbf{b}$ points in the direction determined by holding the other coördinates fixed:

$$d\phi^j(\mathbf{b}) = 0 , \quad j = 2, \ldots, N . \tag{3.28}$$

Free we are to modify $\mathbf{b}$ by adding to it any vector lying in the null surface of $q$, that is, any linear combination of $\mathbf{b}_2, \ldots, \mathbf{b}_N$. Such a modification of $\mathbf{b}$ drags along the coördinates $\phi^2, \ldots, \phi^N$, ensuring they still satisfy Eq. (3.28). Different choices for $\mathbf{b}$ pick out different single-parameter submanifolds. The variance of $\hat\phi$ measures estimator precision for these irrelevant single-parameter problems. Variation in $\hat q$ rises above petty differences in parametrizations and measures estimator precision for the no-control problem at hand.

An alternative perspective is that $dq$ and $F^\uparrow$ together privilege a particular single-parameter problem whose sensitivity bound coincides with the bound for the scalar estimation problem. The vector $\mathbf{q}_F$ defined in Eq. (3.20), orthogonal to surfaces of constant $q$ according to the Fisher metric, is not suitably normalized to define a single-parameter estimation problem, because $dq(\mathbf{q}_F) = q_j q^j = \langle \mathbf{q}_F, \mathbf{q}_F \rangle_F$, which need not equal 1. Suitable it becomes by scaling it to place the tip on the unit surface of $q$:

$$\mathbf{b}_F = \frac{\mathbf{q}_F}{\langle \mathbf{q}_F, \mathbf{q}_F \rangle_F} . \tag{3.29}$$

The vector $\mathbf{b}_F$ has squared Fisher length

$$F_\downarrow(\mathbf{b}_F, \mathbf{b}_F) = \langle \mathbf{b}_F, \mathbf{b}_F \rangle_F = \frac{1}{\langle \mathbf{q}_F, \mathbf{q}_F \rangle_F} = \frac{1}{F^\uparrow(dq, dq)} , \tag{3.30}$$

leading to a no-control CCRB,

$$(\Delta\hat q)^2 \ge F^\uparrow(dq, dq) = \frac{1}{F_\downarrow(\mathbf{b}_F, \mathbf{b}_F)} , \tag{3.31}$$

which coincides with the CCRB for single-parameter estimation defined by $\mathbf{b}_F$.

FIG. 4. Vectors of the same length according to the Fisher metric $F_\downarrow$ have their tips on a covariance ellipse centered at the origin. A single-parameter problem is specified by a vector $\mathbf{b}$ that extends one unit in $q$; the Fisher information $F_{\mathbf{b}\mathbf{b}}$ for this problem is the squared length of $\mathbf{b}$ as measured by the Fisher metric. The shortest vector, $\mathbf{b}_F$, thus having the least Fisher information, is orthogonal to the level surfaces of $q$ according to the Fisher metric; this smallest Fisher information governs estimation of $q$ when one has no control over any of the parameters $\theta^j$. Other vectors that extend one unit in $q$, exemplified by $\mathbf{b}$, have more Fisher information, as they can be made shorter by sliding the tip along the unit surface of $q$ toward $\mathbf{b}_F$. Indeed, a way of characterizing orthogonality to the level surfaces of $q$ is that the tip of $\mathbf{b}_F$ is at the point where the Fisher ellipse is tangent to the surface $q = 1$, so that any sliding of the tip of $\mathbf{b}_F$ increases the length.

Figure 4 depicts the geometry: $\mathbf{b}_F$, as the vector orthogonal to level surfaces of $q$ according to the classical Fisher metric, is the shortest vector that extends one unit in $q$ and so has the least Fisher information of all such vectors. Consider any vector $\mathbf{b}$ satisfying

$$1 = dq(\mathbf{b}) = q_j b^j = F_{jk} q^j b^k = \langle \mathbf{q}_F, \mathbf{b} \rangle_F . \tag{3.32}$$

Cauchy-Schwarz commands,

$$\langle \mathbf{q}_F, \mathbf{b} \rangle_F^2 \le \langle \mathbf{q}_F, \mathbf{q}_F \rangle_F\, \langle \mathbf{b}, \mathbf{b} \rangle_F = \frac{\langle \mathbf{b}, \mathbf{b} \rangle_F}{F_\downarrow(\mathbf{b}_F, \mathbf{b}_F)} , \tag{3.33}$$

so the Fisher information for $\mathbf{b} \ne \mathbf{b}_F$ exceeds that for $\mathbf{b}_F$,

$$F_{\mathbf{b}\mathbf{b}} = \langle \mathbf{b}, \mathbf{b} \rangle_F \ge F_\downarrow(\mathbf{b}_F, \mathbf{b}_F) . \tag{3.34}$$

Revealed is that the no-control bound is the most pessimistic single-parameter bound:

$$(\Delta\hat q)^2 \ge F^\uparrow(dq, dq) = \frac{1}{F_\downarrow(\mathbf{b}_F, \mathbf{b}_F)} \ge \frac{1}{F_{\mathbf{b}\mathbf{b}}} . \tag{3.35}$$

Selection of $\mathbf{b}_F$ according to Eq. (3.29) is known as "parameter orthogonalization" in the statistics literature. A coördinate change, as in the discussion surrounding Eq. (3.28), makes $\partial/\partial\phi = \mathbf{b} = \mathbf{b}_F$ orthogonal, relative to the Fisher metric, to the surfaces of constant $q$. The coördinate transformation changes only the coördinates other than $q$, so this can be regarded as identifying the right nuisance parameters relative to the Fisher metric, or it can be regarded as finding the single-parameter estimation problem that coincides with no-control estimation of the function $q$. Either way, the Cauchy-Schwarz inequality (3.33) embodies Fisher orthogonality and thus is the key to selecting $\mathbf{b}_F$ as the vector that goes with no-control (function) estimation.

D. Interpretation
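Before interpreting the two variances, it may help to see the chain of bounds (3.35) in numbers. The sketch below builds $\mathbf{q}_F$ and $\mathbf{b}_F$ from Eqs. (3.20) and (3.29) and compares the no-control bound with the bound for another vector that extends one unit in $q$. The Fisher matrix `F`, the comparison vector `b_other`, and the helper names are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
# Sketch: no-control bound versus a single-parameter bound, Eqs. (3.19)-(3.35).
# F and b_other are hypothetical illustration values.

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def bilinear(m, u, w):
    """Evaluate the bilinear form sum_jk m[j][k] u[j] w[k]."""
    return sum(m[j][k] * u[j] * w[k] for j in range(len(u)) for k in range(len(w)))


def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


F = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical Fisher matrix F_jk (strictly positive)
q = [1.0, 1.0]                 # components q_j of dq for q = theta^1 + theta^2

F_inv = inverse_2x2(F)
no_control_bound = bilinear(F_inv, q, q)   # invariant (3.19)

q_F = mat_vec(F_inv, q)                    # raised index, Eq. (3.20)
norm = bilinear(F, q_F, q_F)               # squared Fisher length of q_F
b_F = [c / norm for c in q_F]              # Eq. (3.29): tip on the surface q = 1
fisher_b_F = bilinear(F, b_F, b_F)         # Eq. (3.30)

b_other = [1.0, 0.0]                       # another vector with dq(b) = 1
fisher_b_other = bilinear(F, b_other, b_other)

level_vector = [1.0, -1.0]                 # lies in the level surface q = 0
orthogonality = bilinear(F, q_F, level_vector)   # Eq. (3.21)

print("no-control bound      :", no_control_bound)
print("1 / F(b_F, b_F)       :", 1 / fisher_b_F)
print("1 / F(b_other, b_other):", 1 / fisher_b_other)
print("<q_F, level vector>_F :", orthogonality)
```

With these values the no-control bound is $0.6$, equal to $1/F_\downarrow(\mathbf{b}_F, \mathbf{b}_F)$ with $\mathbf{b}_F = (2/3, 1/3)$, while the comparison vector gives the smaller, more optimistic value $0.5$. The last printed number vanishes up to rounding, confirming Fisher orthogonality of $\mathbf{q}_F$ to the level surfaces of $q$.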
Apparently identical, yet subtly different, the variances $(\Delta\hat\phi)^2$ of Eq. (3.26) and $(\Delta\hat q)^2$ of Eq. (3.27) teach a lesson: in the integrals (3.12) and (3.16) for the variances, the parameters are evaluated at the fiducial point, taken here to be zero parameter values; the difference lies in that $(\Delta\hat q)^2$ is honest about its uncertainty in all parameters, whereas $(\Delta\hat\phi)^2$ presumes to know the true values of $\phi^2, \ldots, \phi^N$. Assuming $\hat\phi$ has the blind luck to correctly guess $\phi^2, \ldots, \phi^N$, it will outperform $\hat q$. In the presence of real uncertainty, though, $\hat\phi$ trips on the tangled web it wove and underperforms $\hat q$.

More enlightening still is it to understand, as is depicted in Fig. 4, that the Fisher ellipse $F_\downarrow(\mathbf{v}, \mathbf{v}) = F_\downarrow(\mathbf{b}_F, \mathbf{b}_F)$ is tangent to the unit level surface of $q$. Implied is that errors in estimates of parameters that don't change $q$ are uncorrelated with errors in $\mathbf{b}_F$; there is no danger in using an estimator that assumes incorrect values for such parameters. This insensitivity to errors in the other parameters is the reason the single-parameter problem specified by $\mathbf{b}_F$ is the same as the no-control estimation problem for $q$: a single-parameter problem assumes the other parameters are fixed at their fiducial values, but for the special single-parameter problem specified by $\mathbf{b}_F$, this assumption is unnecessary, and the other parameters can be left uncontrolled.

Insensitivity to errors in these other parameters suggests considering Fisher-information matrices that are degenerate and thus not metrics at all. Of particular interest is a rank-one Fisher-information matrix,

$$F_\downarrow = A\, dq \otimes dq , \tag{3.36}$$

where $A$ is a constant. The components of the Fisher-information matrix are

$$F_{jk} = F_\downarrow(\partial_j, \partial_k) = A\, dq(\partial_j)\, dq(\partial_k) = A\, q_j q_k . \tag{3.37}$$

Constructed from $dq$ alone, the Fisher-information matrix (3.36) enjoys the exalted status of the invariant $F^\uparrow(dq, dq) = 1/F_\downarrow(\mathbf{b}_F, \mathbf{b}_F)$.
Any vector $\mathbf{v}$ has Fisher information
\[ F_{\downarrow}(\mathbf{v},\mathbf{v}) = F_{vv} = A\,(q_j v^j)^2 = A\,[dq(\mathbf{v})]^2 , \tag{3.38} \]
meaning that any sampling procedure giving rise to such a Fisher-information matrix is sensitive only to the parameter $q$ and not to any of the other coördinates. Indeed, any vector $\mathbf{b}$ satisfying $dq(\mathbf{b}) = 1$ has Fisher information
\[ F_{\downarrow}(\mathbf{b},\mathbf{b}) = F_{bb} = A , \tag{3.39} \]
making this Fisher-information matrix the embodiment of one-from-many estimation: no matter what are the coördinates other than $\phi^1 = q$, the Fisher information is the same (all $\mathbf{b}$ have the same Fisher length).

The Fisher ellipsoid degenerates to a pair of level surfaces of $q$ having opposite values of $q$. Equivalent to Eq. (3.39) is that for any vector $\mathbf{v}$ that lies in the level surface $q = 0$, i.e., $dq(\mathbf{v}) = 0$,
\[ F_{\downarrow}(\mathbf{v},\mathbf{v}) = 0 = F_{\downarrow}(\mathbf{v},\mathbf{b}) . \tag{3.40} \]
Quantum procedures that yield these Fisher-information matrices come up in Sec. V A.

IV. QUANTUM ESTIMATION

A. Quantum Cramér-Rao bound
Return now to the quantum setting, abandoned at the end of Sec. II. Quantum mechanics generates the classical conditional probability $p(x|\tilde\theta)$ from an initial state $\rho_0$, which is processed through a quantum process $\mathcal{E}_{\tilde\theta}$ to give a state,
\[ \rho_{\tilde\theta} = \mathcal{E}_{\tilde\theta}(\rho_0) , \tag{4.1} \]
and a measurement described by a POVM $\{E_x\}$, whose outcome $x$ is the data collected by the measurement:
\[ p(x|\tilde\theta) = \mathrm{tr}(E_x \rho_{\tilde\theta}) . \tag{4.2} \]
Appreciate that in the quantum setting, the Fisher-information matrix and its Fisher ellipsoid are functions of the initial state $\rho_0$ and the quantum measurement used to extract data from the system.

The foundation of quantum estimation of a single parameter is the quantum Fisher information, defined at the fiducial state $\rho_{\tilde\theta=0} = \rho_0$ by
\[ Q_{bb} = \max_{\text{measurements}\,|\,\rho_0} F_{bb} = \mathrm{tr}(\rho_0 L_b^2) , \tag{4.3} \]
\[ \mathbf{b}\rho_{\tilde\theta}\big|_{\tilde\theta=0} = \left.\frac{\partial\rho_{\tilde\theta}}{\partial\phi^1}\right|_{\tilde\theta=0} = \frac{1}{2}\big(\rho_0 L_b + L_b \rho_0\big) . \tag{4.4} \]
The Hermitian operator $L_b$ sports the title of symmetric logarithmic derivative (SLD); notice that $\mathrm{tr}(\rho_0 L_b) = 0$. Foundation the quantum Fisher information is because, according to Eq. (4.3), it is the same as the classical Fisher information for the best quantum measurement; hence, find the bound
\[ F_{bb} \le Q_{bb} . \tag{4.5} \]
The result is a chain of bounds on estimator variance:
\[ (\Delta\hat\phi^1)^2 \ge \frac{1}{F_{bb}} \ge \frac{1}{Q_{bb}} . \tag{4.6} \]
The chain can be saturated: the first inequality, asymptotically in many trials, by using, for example, maximum-likelihood estimation; the second by choice of optimal quantum measurement.

Consider now unitary operations, as in Eq. (2.1), where $\rho_{\tilde\theta} = \mathcal{U}_{\tilde\theta}(\rho_0)$. The Hamiltonian (2.4), written in terms of the new parameters $\tilde\phi$ and associated generators, becomes
\[ H(\tilde\theta) = \phi^j Y_j = \phi^1 Y_1 + \sum_{j=2}^{N} \phi^j Y_j = H(\tilde\phi) . \tag{4.7} \]
Here $Y_j = \partial H/\partial\phi^j = (\partial\theta^k/\partial\phi^j) X_k = b^k_{\ j} X_k$. Generating changes in $\phi^1 = q$ is the operator
\[ Y = Y_1 = b^k X_k = \mathbf{b}H(\tilde\theta) = \frac{\partial H(\tilde\theta)}{\partial\phi^1} . \]
(4.8)
Cumbersome indeed is the implicit expression (4.4) for determining the SLD $L_b$, but an appealingly simple, explicit form is available for a unitary process, $\rho_{\tilde\theta} = \mathcal{U}_{\tilde\theta}(\rho_0) = e^{-iH(\tilde\theta)} \rho_0\, e^{iH(\tilde\theta)}$, applied to a pure fiducial state $\rho_0 = |\psi\rangle\langle\psi|$. For a unitary process, it is always true that
\[ \left.\frac{\partial\rho_{\tilde\theta}}{\partial\phi^1}\right|_{\tilde\theta=0} = -i[Y,\rho_0] = -i[\Delta Y,\rho_0] ; \tag{4.9} \]
introduction of the operator deviation $\Delta Y = Y - \langle Y\rangle = Y - \mathrm{tr}(\rho_0 Y)$ makes life easier shortly. Realize now that for a pure fiducial state, since $\rho_{\tilde\theta} = \rho_{\tilde\theta}^2$, have we
\[ \frac{\partial\rho_{\tilde\theta}}{\partial\phi^1} = \rho_{\tilde\theta}\frac{\partial\rho_{\tilde\theta}}{\partial\phi^1} + \frac{\partial\rho_{\tilde\theta}}{\partial\phi^1}\rho_{\tilde\theta} \tag{4.10} \]
and the conclusion,
\[ L_b = 2\left.\frac{\partial\rho_{\tilde\theta}}{\partial\phi^1}\right|_{\tilde\theta=0} = -2i[\Delta Y,\rho_0] . \tag{4.11} \]
(Note that the SLDs are not completely determined for rank-deficient states, since their projection onto the null space of $\rho_0$ is irrelevant.) Simple now is the quantum Fisher information (4.3):
\[ Q_{bb} = -4\,\mathrm{tr}\big((\Delta Y\rho_0 - \rho_0\Delta Y)^2\rho_0\big) = 4\langle(\Delta Y)^2\rangle . \tag{4.12} \]
The variance of the generator $Y$ is calculated in the fiducial (pure) system state $\rho_0$.

Confronting us again, now in the quantum setting, are the requirements for defining a single-parameter estimation problem. The generator $Y = \mathbf{b}H(\tilde\theta)$, whose variance is the quantum Fisher information, is determined by the vector $\mathbf{b}$ that defines the single-parameter problem. The Hamiltonian (4.7) emphasizes that the parameters that accompany $\phi^1$ must be held fixed to get a clean estimate of $\phi^1 = q$.

One more inequality,
\[ Q_{bb} = 4\langle(\Delta Y)^2\rangle \le \|Y\|_s^2 = \|\mathbf{b}H(\tilde\theta)\|_s^2 , \tag{4.13} \]
completes the quantum discussion, by introducing the operator seminorm $\|Y\|_s$, the difference between the largest and smallest eigenvalues of $Y$. Add yet one more bound to the chain of single-parameter estimator bounds (4.5),
\[ (\Delta\hat\phi^1)^2 \ge \frac{1}{F_{bb}} \ge \frac{1}{Q_{bb}} \ge \frac{1}{\|\mathbf{b}H(\tilde\theta)\|_s^2} . \]
(4.14)
Saturated is the last inequality by choosing an optimal fiducial state, an equal superposition of the eigenstates of $Y$ with largest and smallest eigenvalues. Equally deserving the appellation of quantum Cramér-Rao bound (QCRB) are the last two inequalities; distinguish them by letting the first be the QCRB and the second, the focus of our attention because of its optimal fiducial state, the QCRB-O.

Quantum Fisher information also comes in a multiparameter version, in which it is a positive matrix that defines a quadratic form on the space of parameters. With no need for this quantum Fisher-information matrix, tarry not to introduce it. The quantum Fisher-information matrix enjoys only a vestigial presence in our treatment: labeling the single-parameter quantum Fisher information as the $bb$ component of a quantum Fisher-information matrix.

Focused though we are on unitary processes, realize that arbitrary processes can be included by employing the same reasoning to develop a process-dependent norm optimized over general (including mixed) initial states $\rho_0$. Not required for unitary processes, yet natural in developing the process norm is to generalize the optimization of states and measurements to be over an extended Hilbert space that includes ancillas in addition to the original system, even though the process itself acts only on the original system. Sufficient it is to consider ancillary Hilbert spaces with dimension equal to that of the original system. The parametrized family of final states becomes $\rho_{\tilde\theta} = (\mathcal{I}\otimes\mathcal{E}_{\tilde\theta})(\rho_0)$. Thus define the process norm,
\[ \|\mathbf{b}\|_{\mathcal{E}_{\tilde\theta}}^2 = \max_{\rho_0} Q_{bb} = \max_{\rho_0} \mathrm{tr}\big(\rho_0 L_b^2\big) = \max_{\rho_0}\,\max_{\text{measurements}\,|\,\rho_0} F_{bb} , \tag{4.15} \]
and generalize the chain (4.14) of inequalities to a quantum-process bound,
\[ (\Delta\hat\phi^1)^2 \ge \frac{1}{F_{bb}} \ge \frac{1}{Q_{bb}} \ge \frac{1}{\|\mathbf{b}\|_{\mathcal{E}_{\tilde\theta}}^2} . \tag{4.16} \]
More detail for this norm (indeed, that it is a norm) comes in App.
B.

Easy it is to imagine that ancillas permit joint measurements that can extract more information about the parameters, thus increasing the quantum Fisher information $Q_{bb}$. Indeed, the implicit definition (4.4) of the SLD indicates that $L_b$ generally changes when one allows joint system-ancilla states $\rho_0$. Nonetheless, simple it is to argue that for a unitary process $\mathcal{U}_{\tilde\theta}$, as in Eq. (2.1), the process norm is
\[ \|\mathbf{b}\|_{\mathcal{E}_{\tilde\theta}} = \|\mathbf{b}H(\tilde\theta)\|_s , \tag{4.17} \]
even after including ancillas. Suppose the maximum (4.15) for a unitary process occurs on a mixed state $\rho_0$. Purify $\rho_0$ into further ancillas, and find that the maximum occurs on a pure state. Given that, run through the argument leading from Eq. (4.7) to Eq. (4.13), and conclude with the result (4.17) for the unitary process norm. Appreciate also that an argument from the convexity of the Fisher information demonstrates the optimality of pure states.

FIG. 5. For typical choices of $\mathbf{b}$, saturating the QCRB-O leaves the one-from-many (Cauchy-Schwarz) inequality unsaturated: $F_{\downarrow}(\mathbf{b}_F,\mathbf{b}_F) < F_{bb} = \|\mathbf{b}\|_{\mathcal{E}_{\tilde\theta}}^2$. The black circle represents the QCRB-O, demarking the minimum width of a Fisher ellipsoid in all directions. The light-gray ellipse represents the CCRB on the multiparameter estimator covariance given by a particular preparation/measurement protocol. Maximized by this protocol is the single-parameter Fisher information $F_{bb}$, thus making $F_{bb} = Q_{bb} = \|\mathbf{b}\|_{\mathcal{E}_{\tilde\theta}}^2$, since the covariance ellipse touches the quantum Cramér-Rao circle along the direction $\mathbf{b}$. But the shortest vector, according to $F_{\downarrow}$, that extends one unit in $q$, is $\mathbf{b}_F$, not $\mathbf{b}$.
Failure to saturate the one-from-many inequality is the result: $F_{\downarrow}(\mathbf{b}_F,\mathbf{b}_F) < F_{bb}$.

Easy it is to imagine protocols that rescale the parameters in the unitary operator $e^{-iH(\tilde\theta)} = e^{-i\theta^j X_j}$ by changing the constants that couple a generator to the system or, equivalently, by adjusting separately the evolution times for those generators. Spin echo can accomplish this effect without directly adjusting coupling constants or evolution times. Such rescaling effectively changes the Hamiltonian, yet might be regarded as a no-control protocol, since rather than directly controlling an underlying parameter in the Hamiltonian, the protocol controls quantities, associated with a generator, that are generally available to an agent in charge of a metrological experiment. Appreciating this argument, nonetheless we stick with the approach outlined up till now: the family of processes $\mathcal{E}_{\tilde\theta}$ is part of the statement of the problem, completely specified by the Hamiltonian $H(\tilde\theta) = \theta^j X_j$ for unitary processes; separate scaling of the parameters via intervention techniques leads to a problem that, though readily analyzed by the techniques developed in this paper, is nonetheless a different problem. Tying the notion of a parameter to the process family and sticking with that notion fixes the method by which the parameters are impressed on the system, enabling us to extract a magic number from the process family, the ultimate quantum limit given by the square of the process norm $\|\mathbf{b}_{\min}\|_{\mathcal{E}_{\tilde\theta}}^2$, which for unitary processes becomes the squared seminorm of the generator, $\|\mathbf{b}_{\min}H(\tilde\theta)\|_s^2$.

V. EXAMPLES: PUTTING THE FORMALISM TO WORK

A. Commuting generators

1. Setup

Consider now the scenario introduced by Eldredge et al.: the parameters $\theta^j$ are rotation angles about the Bloch $z$ axis for different qubits; hence, the generators are Pauli $z$ operators $\sigma_{zj}$ for the various qubits, giving Hamiltonian
\[ H(\tilde\theta) = \frac{1}{2}\theta^j\sigma_{zj} . \]
(5.1)
For convenience and without any loss of generality, discard qubits that do not contribute to $q$, i.e., for which $q_j = 0$; order the remaining qubits so that the absolute value of $q_j$ descends through the list of qubits; and scale $q$ such that $q_1 = 1$, thus giving $1 = q_1 \ge |q_2| \ge \cdots \ge |q_N| > 0$. For a vector $\mathbf{b} = b^j\partial_j$, the single-parameter generator and QCRB-O norm are
\[ Y = \mathbf{b}H(\tilde\theta) = \frac{1}{2} b^j\sigma_{zj} , \tag{5.2} \]
\[ \|\mathbf{b}H(\tilde\theta)\|_s = \frac{1}{2}\|b^j\sigma_{zj}\|_s = \sum_{j=1}^{N} |b^j| = \|\mathbf{b}\|_1 . \tag{5.3} \]
Here $\|\mathbf{b}\|_1$ is the 1-norm of the vector $\mathbf{b}$.

The geometric object of interest is the QCRB-O unit surface, $\|\mathbf{b}\|_1 = 1$. This, the unit cross-polytope in $N$ dimensions, is the dual of the unit hypercube. In three dimensions, the cross-polytope is the octahedron. A hyperface of the cross-polytope lies in the unit plane defined by a linear function $z = z_j\theta^j$, with $z_j = \pm 1$. Indeed, a hyperface is the intersection of the cross-polytope with the unit plane of $z$,
\[ dz(\mathbf{b}) = \mathbf{b}z = z_j b^j = 1 . \tag{5.4} \]
Here $dz = z_j\,d\theta^j$ is the 1-form corresponding to $z$. Thus define a hyperface of the cross-polytope by
\[ \Big\{ \mathbf{b} \;\Big|\; z_j b^j = 1 \ \text{and}\ \sum_{j=1}^{N} |b^j| = 1 \Big\} . \tag{5.5} \]
Stress that the sign of $b^j$ is $z_j$, implying that $b^j = z_j|b^j|$ (no sum). Convenient and productive it is to let a string $z = z_1\ldots z_N$ list the coefficients $z_j$ and so specify a hyperface.

Construct first in Sec. V A 2 measurements optimal for a $q$ that coincides with a hyperface $z$, i.e., $dq = dz$; then use these measurements in Sec. V A 3 as ingredients in the recipe for measurements optimal at the QCRB-O corners that arise for general $q$.

2. Hyperface measurements

When the unit surface of $q$ coincides with a hyperface $z$ there are many choices for $\mathbf{b}_{\min}$ (any vector in the hyperface $z$ will do). Aesthetic sensibilities direct us to $\mathbf{b}_{\min}$ with components $b^j = 1/(z_j N)$.
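The identification of the QCRB-O seminorm with the 1-norm in Eq. (5.3) can be checked by brute force. The following is a minimal sketch, assuming the generator $Y = \frac{1}{2} b^j\sigma_{zj}$ of Eq. (5.2), which is diagonal in the computational basis; the function names and the sample vectors are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as Fr
from itertools import product

def seminorm_commuting(b):
    """Largest minus smallest eigenvalue of Y = (1/2) sum_j b^j sigma_zj.

    Y is diagonal, so its eigenvalues are (1/2) sum_j b^j s_j over all
    sign patterns s in {+1, -1}^N.
    """
    eigenvalues = [sum(Fr(1, 2) * bj * sj for bj, sj in zip(b, signs))
                   for signs in product((+1, -1), repeat=len(b))]
    return max(eigenvalues) - min(eigenvalues)

def one_norm(b):
    """The 1-norm sum_j |b^j|."""
    return sum(abs(bj) for bj in b)

# A sample vector on the hyperface z = (+, -, +) of the N = 3 cross-polytope.
z = (+1, -1, +1)
b = [Fr(1, 2), Fr(-1, 3), Fr(1, 6)]

# The symmetric choice b^j = 1/(z_j N) = z_j / N on the same hyperface.
b_min = [Fr(zj, len(z)) for zj in z]

print(seminorm_commuting(b), one_norm(b))
print(seminorm_commuting(b_min), sum(zj * bj for zj, bj in zip(z, b_min)))
```

Both sample vectors satisfy $z_j b^j = 1$ and have unit seminorm, so they lie on the hyperface described by Eq. (5.5).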
The generator associated with this choice,
\[ Y = \mathbf{b}_{\min}H(\tilde\theta) = \frac{1}{2}\,\frac{1}{z_j N}\,\sigma_{zj} , \tag{5.6} \]
has extremal eigenvalues $\pm\frac{1}{2}$ associated with the eigenvectors
\[ |\pm z\rangle = \bigotimes_{j=1}^{N} |\pm z_j\rangle . \tag{5.7} \]
Here $|z_j\rangle$ is the eigenstate of $\sigma_{zj}$ with eigenvalue $z_j$, i.e., $\sigma_{zj}|z_j\rangle = z_j|z_j\rangle$, and $-z$ is the string with the sign of all the entries reversed, i.e., $-z = -z_1\ldots -z_N$; $z$ and $-z$ specify opposite faces of the cross-polytope. The normalized states $|z\rangle$ are orthogonal:
\[ \langle z|z'\rangle = \prod_{j=1}^{N} \langle z_j|z'_j\rangle = \delta_{zz'} . \tag{5.8} \]
Define cat-superposition states,
\[ |\psi^{(\pm)}_z\rangle = \frac{1}{\sqrt2}\big(|z\rangle \pm |{-z}\rangle\big) \quad\text{(cat state)}, \tag{5.9} \]
\[ |\psi^{(\pm i)}_z\rangle = \frac{1}{\sqrt2}\big(|z\rangle \pm i|{-z}\rangle\big) \quad\text{($i$ cat state)}. \tag{5.10} \]
Any of these choices work as the initial state for an optimal estimation strategy. Choosing $|\psi^{(+)}_z\rangle$ and imposing the parameters via $H(\tilde\theta)$ yields the final state
\[ e^{-iH(\tilde\theta)}|\psi^{(+)}_z\rangle = \frac{1}{\sqrt2}\big(e^{-iz_j\theta^j/2}|z\rangle + e^{iz_j\theta^j/2}|{-z}\rangle\big) . \tag{5.11} \]
Measure now in an orthonormal basis containing $|\psi^{(\pm i)}_z\rangle$ (the basis elements in the subspace orthogonal to the span of $|z\rangle$ and $|{-z}\rangle$ are superfluous, since the final state has no support on that subspace).
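The protocol just described (prepare the cat state, evolve, measure on the $i$ cat states) is small enough to simulate directly. The sketch below assumes the Hamiltonian $H(\tilde\theta) = \frac{1}{2}\theta^j\sigma_{zj}$, stores states as dictionaries over sign strings, and estimates the Fisher-information matrix at the fiducial point by central finite differences; the function names, the step size `h`, and the sample string and angles are illustrative choices, not taken from the paper. The closed forms it is compared against are those quoted in Eqs. (5.12) and (5.13).

```python
import cmath
import math
from itertools import product

def cat_protocol_probs(z, theta):
    """Probabilities of the two i-cat outcomes, keyed by +1 and -1."""
    n = len(z)
    z = tuple(z)
    minus_z = tuple(-s for s in z)
    # Initial state (|z> + |-z>)/sqrt(2) on the full 2^N sign-string basis.
    amp = {s: 0j for s in product((+1, -1), repeat=n)}
    amp[z] = amp[minus_z] = 1 / math.sqrt(2)
    # H is diagonal with eigenvalue (1/2) sum_j theta^j s_j on |s>.
    final = {s: a * cmath.exp(-0.5j * sum(t * sj for t, sj in zip(theta, s)))
             for s, a in amp.items()}
    probs = {}
    for sign in (+1, -1):
        # Bra of the i-cat state: (<z| -/+ i <-z|)/sqrt(2).
        overlap = (final[z] - sign * 1j * final[minus_z]) / math.sqrt(2)
        probs[sign] = abs(overlap) ** 2
    return probs

def fisher_matrix(z, h=1e-4):
    """Finite-difference Fisher matrix at theta = 0 for the protocol above."""
    n = len(z)
    p0 = cat_protocol_probs(z, [0.0] * n)
    grads = []
    for j in range(n):
        up = [h if i == j else 0.0 for i in range(n)]
        down = [-h if i == j else 0.0 for i in range(n)]
        pu, pd = cat_protocol_probs(z, up), cat_protocol_probs(z, down)
        grads.append({s: (pu[s] - pd[s]) / (2 * h) for s in (+1, -1)})
    return [[sum(grads[j][s] * grads[k][s] / p0[s] for s in (+1, -1))
             for k in range(n)] for j in range(n)]

z = (+1, -1, +1)
theta = (0.3, -0.2, 0.5)          # gives z_j theta^j = 1.0
probs = cat_protocol_probs(z, theta)
F = fisher_matrix(z)

print(probs[+1], 0.5 * (1 + math.sin(1.0)))
print([[round(x, 6) for x in row] for row in F])
```

Within finite-difference error the computed matrix has entries $z_j z_k$, the rank-one form $dz\otimes dz$.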
Appendix C shows how to think of the needed measurement as a parity measurement and thus how to implement it locally.

The probabilities for the results corresponding to states $|\psi^{(\pm i)}_z\rangle$ are
\[ p(\pm|\tilde\theta) = \big|\langle\psi^{(\pm i)}_z|e^{-iH(\tilde\theta)}|\psi^{(+)}_z\rangle\big|^2 = \frac{1}{4}\big|e^{-iz_j\theta^j/2} \mp i e^{iz_j\theta^j/2}\big|^2 = \frac{1}{2}\big(1 \pm \sin z_j\theta^j\big) , \tag{5.12} \]
leading to Fisher-information matrix
\[ F_{jk} = \sum_{\pm} \frac{1}{p(\pm|\tilde\theta=0)} \left.\frac{\partial p(\pm|\tilde\theta)}{\partial\theta^j}\right|_{\tilde\theta=0} \left.\frac{\partial p(\pm|\tilde\theta)}{\partial\theta^k}\right|_{\tilde\theta=0} = z_j z_k ; \tag{5.13} \]
i.e., $F_{\downarrow} = dz\otimes dz$ is the degenerate Fisher-information matrix for no-control estimation of $q = z$ discussed at the end of Sec. III D.

FIG. 10. A sampling of Fisher “ellipsoids” for hyperface measurements in the case $N = 3$, where the QCRB-O unit surface, the $N = 3$ cross-polytope, is the (blue) octahedron. The hyperfaces are depicted by pairs of shaded triangles bounding the covariance regions. The chosen samples correspond to the Fisher informations $F^{(k)}_{\downarrow}$ used to saturate QCRB-O when all $q_j > 0$. The corresponding $z$ strings are, from left to right, $+++$, $+--$, and $++-$.

Verify that the tangency condition (4.24) is met for $q_j = z_j$, noting that $\|\mathbf{b}_{\min}H(\tilde\theta)\|_s = \|\mathbf{b}_{\min}\|_1 = 1$:
\[ F_{jk} b^j_{\min} = \sum_{j=1}^{N} z_j z_k \frac{1}{z_j N} = z_k = \|\mathbf{b}_{\min}H(\tilde\theta)\|_s^2\, q_k . \tag{5.14} \]
Figure 10 illustrates the Fisher “ellipsoids” for several of these optimal measurements in the case $N = 3$.

3. Getting away with kissing at corners

Arbitrary $q$, put in the canonical form described in Sec. V A 1, now comes to the fore. Broken is the symmetry of $q = z$; the unit cross-polytope is only guaranteed to touch the unit surface of $q$ at one point,
\[ \mathbf{b}_{\min} = \partial_1 . \tag{5.15} \]
This vector lives at the corner of the unit circle just like the vector in Fig. 9. Implement the probabilistic corner strategy discussed at the end of Sec.
IV B: construct a Fisher ellipsoid whose tangent surface matches the level surfaces of $q$ at $\mathbf{b}_{\min}$ (and hence saturates the one-from-many inequality) by using a convex combination of Fisher informations saturating the QCRB-O on the hyperfaces adjacent to that corner.

FIG. 11. Illustration of the special strings $z^{(k)}$, $k = 1,\ldots,5$, for a linear combination $q$ of $\theta^1,\ldots,\theta^5$ whose coefficients carry the signs $+,+,+,-,+$. The parameters $\theta^j$ have been ordered such that $|q_j| \ge |q_k|$ for $k \ge j$.

Define the strings $z^{(k)}$ corresponding to the adjacent hyperfaces,
\[ z^{(1)}_j = \operatorname{sgn} q_j = \begin{cases} +1 & \text{if } q_j > 0 , \\ -1 & \text{if } q_j < 0 , \end{cases} \tag{5.16} \]
\[ z^{(k>1)}_j = \begin{cases} z^{(1)}_j & \text{if } j < k , \\ -z^{(1)}_j & \text{if } j \ge k , \end{cases} \tag{5.17} \]
\[ dz^{(k)} = z^{(1)}_1 d\theta^1 + \cdots + z^{(1)}_{k-1} d\theta^{k-1} - z^{(1)}_k d\theta^k - \cdots - z^{(1)}_N d\theta^N . \tag{5.18} \]
Figure 11 depicts these strings for a particular $q$ when $N = 5$. Appreciate now two important properties of these strings: first, $\{dz^{(k)}\}$ is a basis of forms; second, the coefficients of $dq$ in this basis are positive and normalized to unity, so they make up a probability distribution. Specifically,
\[ dq = \sum_{k=1}^{N} p_k\, dz^{(k)} , \tag{5.19} \]
with
\[ p_1 = \frac{1}{2}\big(1 + |q_N|\big) , \tag{5.20} \]
\[ p_{k>1} = \frac{1}{2}\big(|q_{k-1}| - |q_k|\big) . \tag{5.21} \]
The Fisher informations for the hyperface measurements are $F^{(k)}_{\downarrow} = dz^{(k)}\otimes dz^{(k)}$. Performing the $F^{(k)}$ measurement with probability $p_k$ yields the new Fisher information, $F_{\downarrow} = \sum_k p_k F^{(k)}_{\downarrow}$. Appendix D explains why such a protocol is allowed and why the Fisher information takes this form. Verify now that the kissing condition (4.23) is satisfied:
\[ F_{\downarrow}(\mathbf{b}_{\min},\mathbf{v}) = \sum_{k=1}^{N} p_k\, dz^{(k)}(\mathbf{v}) = \|\mathbf{b}H(\tilde\theta)\|_s^2\, dq(\mathbf{v}) . \tag{5.22} \]

FIG. 12. The Fisher information (illustrated as a mesh ellipsoid) whose tangent plane at $\mathbf{b}_{\min} = \partial_1$ (illustrated as the shaded plane) is a level surface of $q$, a linear combination of $\theta^1$, $\theta^2$, and $\theta^3$ with positive coefficients.
This ellipsoid contacts the QCRB-O unit circle, $\|\mathbf{b}\|_{\mathcal{U}_{\tilde\theta}} = \|\mathbf{b}H(\tilde\theta)\|_s = 1$, illustrated as the blue octahedron, at all vertices.

Fig. 12 illustrates a Fisher information constructed according to this recipe.

Similar in spirit is this construction to that in Sec. IV B 1 of Eldredge et al. and to the problem addressed by Sekatski et al., since one can recover the qubit nature of this example by restricting oneself to the span of the extremal eigenstates of the generators for each probe. Appendix E explores a zoo of variations on these sorts of measurements.

B. Noncommuting generators

Turn now to noncommuting generators. As a simple example, consider the Hamiltonian for a single qubit:
\[ H(\theta^1,\theta^2,\theta^3) = \frac{1}{2}\big(\theta^1\sigma_x + \theta^2\sigma_y + \theta^3\sigma_z\big) = \frac{1}{2}\boldsymbol{\theta}\cdot\boldsymbol{\sigma} . \tag{5.23} \]
Here introduce, by necessity, a bastard inner product that recognizes the natural Euclidean geometry of the Bloch sphere. The Euclidean geometry runs rough-shod over the distinction between upper and lower indices; using dot notation for this inner product sidesteps ugly sums over indices that are both upper or both lower.

The generator for a vector $\mathbf{b} = b^j\partial_j$,
\[ Y = \mathbf{b}H(\theta^1,\theta^2,\theta^3) = \frac{1}{2}\big(b^1\sigma_x + b^2\sigma_y + b^3\sigma_z\big) = \frac{1}{2}\mathbf{b}\cdot\boldsymbol{\sigma} , \tag{5.24} \]
gives QCRB-O seminorm
\[ \|\mathbf{b}H(\theta^1,\theta^2,\theta^3)\|_s = \sqrt{(b^1)^2 + (b^2)^2 + (b^3)^2} = \sqrt{\mathbf{b}\cdot\mathbf{b}} = \|\mathbf{b}\|_2 , \tag{5.25} \]
with $\|\mathbf{b}\|_2$ being the Euclidean length of $\mathbf{b}$. The QCRB-O unit circle is the Euclidean unit sphere.

Now estimate linear combination $q = q_j\theta^j$. Tangent to the QCRB-O sphere the unit plane of $q$ must be; scaling $q$ appropriately, this means that $q_j = b^j$, which also yields the desired $dq(\mathbf{b}) = q_j b^j = \mathbf{b}\cdot\mathbf{b} = 1$. The rest is standard qubitology. Use as fiducial state an optimal state for generator (5.24), say, $|\psi\rangle = \big(|\mathbf{b}\rangle + |{-\mathbf{b}}\rangle\big)/\sqrt2$.
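The seminorm in Eq. (5.25) and the tangency condition $q_j = b^j$ can be illustrated with a few lines of arithmetic. This is a minimal sketch, assuming the generator $Y = \frac{1}{2}\mathbf{b}\cdot\boldsymbol{\sigma}$ of Eq. (5.24) written out as an explicit 2×2 matrix; the function name and the sample vector are hypothetical.

```python
import math

def seminorm_single_qubit(b):
    """Seminorm of Y = (1/2)(b1 sigma_x + b2 sigma_y + b3 sigma_z).

    In the sigma_z basis Y = [[a, off], [conj(off), -a]] with a = b3/2 and
    off = (b1 - i b2)/2. A traceless Hermitian 2x2 matrix has eigenvalues
    +/- sqrt(a^2 + |off|^2), so the seminorm is twice that radius.
    """
    b1, b2, b3 = b
    a = b3 / 2
    off = complex(b1, -b2) / 2
    radius = math.sqrt(a * a + abs(off) ** 2)
    return 2 * radius

# A sample vector on the Euclidean unit sphere.
b = (0.6, 0.0, 0.8)
q = b  # tangency of the unit plane of q at b requires q_j = b^j

print(seminorm_single_qubit(b))
print(sum(qj * bj for qj, bj in zip(q, b)))  # dq(b)
```

Rescaling the vector rescales the seminorm by the same factor, as expected of a norm on this three-dimensional parameter space.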
After imposition of the parameters by Hamiltonian (5.23), measure in the basis $|\psi^{(\pm i)}_{\mathbf b}\rangle = \big(|\mathbf{b}\rangle \pm i|{-\mathbf{b}}\rangle\big)/\sqrt2$. The outcome probabilities,
\[ \big|\langle\psi^{(\pm i)}_{\mathbf b}|e^{-iH(\boldsymbol\theta)}|\psi\rangle\big|^2 = \frac{1}{2}\big[1 \pm \sin(\mathbf{b}\cdot\boldsymbol{\theta})\big] = \frac{1}{2}\big[1 \pm \sin(q_j\theta^j)\big] , \tag{5.26} \]
depend only on the component of $\boldsymbol\theta$ along $\mathbf{b}$. Realize with satisfaction that this component (the summation convention rightly restored!) is the property $q$ itself, which gives the rotation angle about $\mathbf{b}$ that is being measured. The result? A no-control Fisher-information matrix $F_{jk} = q_j q_k$, whose Fisher ellipsoid consists of the two planes tangent to the unit sphere at the tips of $\mathbf{b}$ and $-\mathbf{b}$.

Observe more interesting behavior by varying the degree to which the generators fail to commute, as in the two-qubit Hamiltonian
\[ H(\theta^1,\theta^2) = \frac{1}{2}\Big[\theta^1\big(\sigma_{z1} + \sqrt{2\epsilon}\,\sigma_{x2}\big) + \theta^2\sigma_{z2}\Big] . \tag{5.27} \]
The generator for vector $\mathbf{b} = b^j\partial_j$, $Y = \mathbf{b}H(\tilde\theta) = \frac{1}{2}\big[b^1\big(\sigma_{z1} + \sqrt{2\epsilon}\,\sigma_{x2}\big) + b^2\sigma_{z2}\big]$, has seminorm
\[ \|\mathbf{b}H(\theta^1,\theta^2)\|_s = |b^1| + \sqrt{(b^2)^2 + 2\epsilon\,(b^1)^2} . \tag{5.28} \]
Figure 13 illustrates how the QCRB-O unit circle changes as the process generators become increasingly noncommuting. The smooth curves of the unit circle are serviced by optimal measurements like those just encountered for the three Pauli operators; the corners present opportunities for measurements like those encountered for commuting generators in Sec. V A.

To assess those opportunities, notice that vectors on the upper ($b^2 \ge 0$) part of the QCRB-O unit circle take the form
\[ \mathbf{b} = b^1\partial_1 + \sqrt{1 - 2|b^1| + (1 - 2\epsilon)(b^1)^2}\;\partial_2 . \tag{5.29} \]
The generators associated with the vectors are
\[ Y = \frac{1}{2}\Big[b^1\sigma_{z1} + \big(1 - |b^1|\big)\,\hat{\mathbf n}\cdot\boldsymbol{\sigma}_2\Big] , \qquad \hat{\mathbf n} = \frac{\sqrt{1 - 2|b^1| + (1 - 2\epsilon)(b^1)^2}\;\hat{\mathbf z} + b^1\sqrt{2\epsilon}\;\hat{\mathbf x}}{1 - |b^1|} . \tag{5.30} \]
The extremal eigenvalues of $Y$, $\pm\frac{1}{2}$, correspond to eigenvectors $|{\pm\operatorname{sgn}(b^1)\,\hat{\mathbf z}}\rangle\otimes|{\pm\hat{\mathbf n}}\rangle$.
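The seminorm (5.28) and the parametrization (5.29) of the upper part of the unit circle can be cross-checked numerically. The sketch below rests on an assumption about the garbled Hamiltonian (5.27): that $\sigma_z$ with coefficient $\theta^1$ acts on the first qubit, while $\sqrt{2\epsilon}\,\sigma_x$ and the second $\sigma_z$ act on the second qubit, so that the generator is a sum of two single-qubit operators whose extremal eigenvalues add. All function names and sample values are hypothetical.

```python
import math

def hermitian_2x2_eigs(a, d, off):
    """Eigenvalues (low, high) of [[a, off], [conj(off), d]] for real a, d."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(off) ** 2)
    return mean - radius, mean + radius

def seminorm_two_qubit(b1, b2, eps):
    """Seminorm of Y = (1/2)[b1 (sz_1 + sqrt(2 eps) sx_2) + b2 sz_2]."""
    # Qubit 1 carries (b1/2) sigma_z.
    lo1, hi1 = hermitian_2x2_eigs(b1 / 2, -b1 / 2, 0.0)
    # Qubit 2 carries (1/2)(b1 sqrt(2 eps) sigma_x + b2 sigma_z).
    lo2, hi2 = hermitian_2x2_eigs(b2 / 2, -b2 / 2, b1 * math.sqrt(2 * eps) / 2)
    # Operators on different qubits commute, so extremal eigenvalues add.
    return (hi1 + hi2) - (lo1 + lo2)

def closed_form(b1, b2, eps):
    """Right-hand side of Eq. (5.28)."""
    return abs(b1) + math.sqrt(b2 ** 2 + 2 * eps * b1 ** 2)

def upper_branch(b1, eps):
    """Component b^2 >= 0 on the unit circle, following Eq. (5.29)."""
    return math.sqrt(1 - 2 * abs(b1) + (1 - 2 * eps) * b1 ** 2)

eps = 0.3
for b1 in (-0.4, -0.1, 0.0, 0.2, 0.5):
    b2 = upper_branch(b1, eps)
    print(b1, round(b2, 6), round(seminorm_two_qubit(b1, b2, eps), 12))
```

Every sampled point of the upper branch has unit seminorm, and setting `eps = 0` recovers the 1-norm of the commuting case.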
The corresponding optimal measurement is a hyperface measurement, like those in Sec. V A 2, except that on the second qubit the $z$ direction is replaced by $\hat{\mathbf n}$.

Focus now on the upper cusp of the unit circle of $\|\mathbf{b}\|_{\mathcal{U}_{\tilde\theta}}$. Consider a scalar $q = q_1\theta^1 + q_2\theta^2$, where $q_1 < q_2 = 1$. The unit surface of $q$ touches the QCRB-O unit circle at the upper cusp, $\mathbf{b}_{\min} = \partial_2$. Near the cusp, regardless of the value of $\epsilon$, the QCRB-O unit circle looks like the square that applies for $\epsilon = 0$; for the vectors of Eq. (5.29), as $|b^1| \to 0$, $\hat{\mathbf n} \to \hat{\mathbf z}$, and the measurements are the two hyperface measurements, for $b^1 > 0$ and $b^1 < 0$, considered in Sec. V A 2. Matching the tangent made by $q$ is then carried out just as it was in Sec. V A 3.

FIG. 13. Process norms for several generator pairs ranging from commuting ($\epsilon = 0$) to increasing degrees of noncommutativity. The cross-polytope indicative of commutativity becomes rounded at two of its corners as $\epsilon$ grows.

VI. CONCLUSION

Laid to rest is the question of ultimate, achievable precision in the estimation of scalar properties of arbitrary quantum channels. Tempted to stray from the straight, but narrow path by superficial similarities to single-parameter estimation, we stayed the course by keeping eyes fixed on the distinction between the differential forms defining our problem and the tangent vectors defining single-parameter problems. Yet unwise it would have been to disregard completely the voice of those who have trod the single-parameter road, for from their stores of knowledge came forth the process norm on the tangent space.
By examining the relation between this process norm and the differential form of the scalar property of interest, all becomes clear, and maximally precise scalar estimation strategies emerge, beautiful to behold, constructed from the optimal single-parameter strategies known from old.

In light of these investigations of parameter estimation, as was said over two thousand years ago, so still it must be said, “Let no one ignorant of geometry enter here.”

ACKNOWLEDGMENTS

Both authors thank the University of New Mexico’s Center for Quantum Information and Control for providing a stimulating intellectual environment. JAG was supported in part by funding from the Canada First Research Excellence Fund and from NSERC.

Appendix A: Estimator bias

Worthwhile it is to consider sensing small deviations away from a true value that is itself close to the fiducial operating point ($\tilde\theta = 0$). Typical this situation is, and it is the situation considered in this paper.

In this situation, calculate the Fisher information at the fiducial point, instead of at the (unknown) true point, i.e.,
\[ F_{jk} = \int dx\; p(x|\tilde\theta=0) \left.\frac{\partial\ln p(x|\tilde\theta)}{\partial\theta^j}\right|_{\tilde\theta=0} \left.\frac{\partial\ln p(x|\tilde\theta)}{\partial\theta^k}\right|_{\tilde\theta=0} ; \tag{A1} \]
likewise, the Jacobian of the mean estimates should be calculated at the fiducial point,
\[ J^j_{\ k} = \left.\frac{\partial\langle\hat\theta^j\rangle_{\tilde\theta}}{\partial\theta^k}\right|_{\tilde\theta=0} . \tag{A2} \]
Not knowing the true parameter values, we are commanded to do things this way for the tensor formalism to make sense.

Approximate can we also the means of the estimators by expanding about the fiducial point:
\[ \langle\hat\theta^j\rangle_{\tilde\theta} = \langle\hat\theta^j\rangle_{\tilde\theta=0} + \left.\frac{\partial\langle\hat\theta^j\rangle_{\tilde\theta}}{\partial\theta^k}\right|_{\tilde\theta=0}\theta^k = \langle\hat\theta^j\rangle_{\tilde\theta=0} + J^j_{\ k}\theta^k . \]
(A3)
If $\hat\theta^j(x)$ is a biased estimator, an associated estimator $\hat{\bar\theta}^j(x)$ can be defined by
\[ \hat{\bar\theta}^j(x) = (J^{-1})^j_{\ k}\big[\hat\theta^k(x) - \langle\hat\theta^k\rangle_{\tilde\theta=0}\big] , \tag{A4} \]
and the new estimator is unbiased,
\[ \langle\hat{\bar\theta}^j\rangle_{\tilde\theta} = \theta^j . \tag{A5} \]
The offset removes bias at the fiducial point; the inverse of the Jacobian removes scaling and mixing that introduce bias away from the fiducial point. An estimator whose bias has been removed in this way in the neighborhood of the fiducial point has been called a locally unbiased estimator. Able to remove bias locally, we can and should always do it and thus use the multiparameter CCRB for unbiased estimators; it then is a matter of indifference whether we use the error-correlation matrix or the covariance matrix to state the CCRB.

Appendix B: Process norm

Defined in Eq. (4.15) is the process norm. Appreciate first that this is a norm. From Eq. (4.15), discern that the associated unit ball is the intersection of the unit covariance ellipsoids of all possible Fisher informations. The length that our potential norm assigns to any vector is the smallest positive scaling of the unit ball that contains the vector. A norm must assign a finite, nonnegative value to every vector; satisfy the triangle inequality ($\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$); be absolutely scalable ($\|\lambda\mathbf{v}\| = |\lambda|\,\|\mathbf{v}\|$); and be nondegenerate ($\|\mathbf{v}\| = 0 \Rightarrow \mathbf{v} = \mathbf{0}$). The first three properties correspond to the unit ball being an absolutely convex absorbing set, and the nondegeneracy corresponds to the unit ball being bounded.

A norm we have because the unit ball is absolutely convex, being an intersection of ellipsoids, which are absolutely convex; absorbing, not assigning infinite length to any vector $\mathbf{b}$, since that would correspond to infinite estimation precision; and bounded, not assigning zero length to any vector, since we assume that deviations in all parameters are detectable (i.e.
, there are no physically meaningless parameters).

Appendix C: Parity measurements and $i$ cat states

The $i$ cat states $|\psi^{(\pm i)}_z\rangle$ of Eq. (5.10) are eigenstates of $(\sigma_y)^{\otimes N}$ when $N$ is odd and $(\sigma_y)^{\otimes(N-1)}\otimes\sigma_x$ when $N$ is even. Understand why by recalling $\sigma_y|z_j\rangle = iz_j|{-z_j}\rangle$ and $\sigma_x|z_j\rangle = |{-z_j}\rangle$. For $N$ odd,
\[ (\sigma_y)^{\otimes N}|\psi^{(\pm i)}_z\rangle = \frac{i^N}{\sqrt2}(z_1\cdots z_N)\big(|{-z}\rangle \pm (-1)^N i|z\rangle\big) = \mp(-1)^{(N+1)/2}(z_1\cdots z_N)\,|\psi^{(\pm i)}_z\rangle , \tag{C1} \]
and for $N$ even,
\[ (\sigma_y)^{\otimes(N-1)}\otimes\sigma_x|\psi^{(\pm i)}_z\rangle = \frac{i^{N-1}}{\sqrt2}(z_1\cdots z_{N-1})\big(|{-z}\rangle \pm (-1)^{N-1} i|z\rangle\big) = \mp(-1)^{N/2}(z_1\cdots z_{N-1})\,|\psi^{(\pm i)}_z\rangle . \tag{C2} \]
Measuring these generalized parity operators realizes an optimal measurement for the protocols in Sec. V A 2.

Appendix D: Probabilistic protocols

Probabilistic protocols we invoke in Sec. IV B to argue that measurements saturating the one-from-many inequality always exist for $\mathbf{b}_{\min}$. Understand now the precise nature of these probabilistic protocols and the means by which they achieve our aim.

Given two different measurement protocols, each dictating the preparation of a particular initial state and the measurement of a particular POVM, one can combine the two by deciding to choose randomly which protocol to follow before making use of the channel of interest. The bounds in this paper are derived allowing for the possibility of entangled ancillas. Since random choice between different state preparation and measurement can be effected by a deterministic protocol using entangled ancillas, a probabilistic protocol along these lines is allowed within the quantum framework of Sec. IV A. Entangled protocols find a place in the examples of App.
E 2.

The $n$th deterministic protocol has fiducial state $\rho_n$ and measures POVM $\{E_{x_n}\}$, labeled by outcomes $x_n$; the outcomes have probability $p(x_n|n,\tilde\theta) = \mathrm{tr}(\rho_{n,\tilde\theta}E_{x_n})$, where $\rho_{n,\tilde\theta} = \mathcal{E}_{\tilde\theta}(\rho_n)$ is the output of the quantum process. The probabilistic protocol has all the outcomes of all the deterministic protocols; the probability of outcome $x_n$ is
\[ p(x_n|\tilde\theta) = p(x_n|n,\tilde\theta)\,p(n) , \tag{D1} \]
where $p(n)$ is the probability to choose the $n$th deterministic protocol. Now easy it is to see that the Fisher information for the probabilistic protocol is the convex combination of the Fisher informations for the deterministic protocols:
\[ F_{jk} = \sum_n \int dx_n\, \frac{1}{p(x_n|\tilde\theta)}\,\partial_j p(x_n|\tilde\theta)\,\partial_k p(x_n|\tilde\theta) = \sum_n p(n) \int dx_n\, \frac{1}{p(x_n|n,\tilde\theta)}\,\partial_j p(x_n|n,\tilde\theta)\,\partial_k p(x_n|n,\tilde\theta) = \sum_n p(n)\,F_{jk}(n) . \tag{D2} \]

Return now to the problem of constructing an optimal probabilistic protocol at a corner $\mathbf{b}_{\min}$ of the QCRB-O surface. Consider all the $\mathbf{b}$ near to $\mathbf{b}_{\min}$ that have the same QCRB-O norm. In a small enough neighborhood, this looks like the boundary of a convex cone, called the tangent cone. Identifying tangent planes to this cone with forms results in the construction of the dual cone. We show that all the tangent planes to the tip of the tangent cone (that is, all forms in the dual cone) can be expressed as convex combinations of tangent planes to smooth points on the tangent cone. Since there always exists a quantum protocol realizing at least one tangent plane to a point on the cone, and since smooth points only have one tangent plane, this implies that arbitrary tangent planes to the tip of the cone can be realized through probabilistic combinations of quantum protocols that are known to exist.

We first eliminate irrelevant parameters so the base of the restricted tangent cone is bounded.
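As a concrete instance of the convex-combination rule (D2) at a corner, the commuting-generator protocol of Sec. V A 3 can be assembled and checked exactly. The sketch below assumes the strings and probabilities of Eqs. (5.16)-(5.21) and the hyperface Fisher matrices $F^{(k)}_{jk} = z^{(k)}_j z^{(k)}_k$; the function name and the sample coefficients `q` (already in canonical form, $1 = q_1 \ge |q_2| \ge \cdots > 0$) are illustrative.

```python
from fractions import Fraction as Fr

def corner_protocol(q):
    """Strings z^(k) and probabilities p_k for canonical q (0-indexed lists)."""
    n = len(q)
    z1 = [1 if qj > 0 else -1 for qj in q]
    # z^(k) for k = 2..N: keep the first k-1 signs of z^(1), flip the rest.
    strings = [z1] + [[z1[j] if j < k else -z1[j] for j in range(n)]
                      for k in range(1, n)]
    # p_1 = (1 + |q_N|)/2 and p_k = (|q_{k-1}| - |q_k|)/2 for k = 2..N.
    probs = [Fr(1, 2) * (1 + abs(q[-1]))]
    probs += [Fr(1, 2) * (abs(q[k - 1]) - abs(q[k])) for k in range(1, n)]
    return strings, probs

q = [Fr(1), Fr(-3, 4), Fr(1, 2), Fr(1, 4)]
strings, probs = corner_protocol(q)
n = len(q)

# Coefficients of sum_k p_k dz^(k), to be compared with those of dq.
mixture = [sum(p * z[j] for p, z in zip(probs, strings)) for j in range(n)]

# Fisher matrix of the probabilistic protocol: sum_k p_k z^(k)_j z^(k)_k.
F = [[sum(p * z[j] * z[k] for p, z in zip(probs, strings)) for k in range(n)]
     for j in range(n)]

# Kissing condition at b_min = d_1: the first row of F should reproduce q.
F_bmin = F[0]

print(probs)
print(mixture)
print(F_bmin)
```

For this `q` the probabilities are 5/8, 1/8, 1/8, 1/8, the mixture of forms reproduces $dq$, and the first row of the mixed Fisher matrix equals the coefficients of $dq$, as Eq. (5.22) requires when the seminorm of the generator is one.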
If the surface of constant QCRB-O norm is flat in certain directions at $\mathbf{b}_{\min}$ (for example, if it is a sphere) the tangent cone extends infinitely in that direction. The level surfaces of $dq$ coincide exactly with the tangent cone in those directions, as does the covariance of any QCRB-O-saturating measurement protocol, so we can safely ignore those directions and restrict to the remaining cone, whose base is a bounded convex set just like the unit ball of our norm. A smooth point on the boundary of this set corresponds to a ray of smooth points on the boundary of the tangent cone.

We now argue that the set of extremal tangent planes to this restricted tangent cone is equivalent to the set of tangent planes to its base. Extremal tangent planes are rotated out as far away as possible from being flat at the tip of the cone, so they are entirely determined by the lower-dimensional tangent plane they make with the base of the cone. Combine this with the observation that a lower-dimensional tangent plane to a smooth point on the base corresponds to a tangent plane to a smooth point on the cone, since the additional degree of freedom in the cone is a ray emanating from the tip, and therefore smooth.

We use this trick of reducing the dimension to bootstrap a higher-dimensional protocol from lower-dimensional protocols. Start by assuming we can make arbitrary tangent planes to any point on the boundary of this lower-dimensional convex set using a convex combination of tangent planes to smooth points in the neighborhood of that point. From this it would follow that we can make arbitrary extremal tangent planes to the point of interest in our higher-dimensional convex set using convex combinations of tangent planes to smooth points.
Since the dual cone of tangent planes is convex, probabilistic protocols for making extremal tangent planes yield probabilistic protocols for making all tangent planes.

For a two-dimensional cone it is easy to see how to make arbitrary tangent planes to its base using convex combinations of tangent planes to smooth points, since the base is a one-dimensional object and both points on the boundary are smooth. Inductively, one can then build up convex combinations of smooth tangent planes to construct arbitrary tangent planes of higher-and-higher-dimensional convex sets, ultimately arriving at a probabilistic protocol that matches the level surface of $dq$ at the point of interest.

Summarize: make arbitrary tangent planes to a point of interest on the unit ball by utilizing lower-dimensional protocols for making arbitrary tangent planes to points on the boundary of the base of the tangent cone of the point of interest.

Appendix E: A zoo of measurements in the commuting case

Hyperface measurements are the focus of Sec. V A 2, because they are sufficient for constructing the optimal protocols needed in Sec. V A. Yet these are far from the only deterministic measurement protocols that saturate the QCRB-O. As additional examples, consider hyperedges of the cross-polytope, specified by a string $w = w_1 \ldots w_N$, much like the string for a hyperface, except that the characters can be $0$ in addition to $+1$ and $-1$. The hyperedges so signified are

$$\Bigl\{\, \mathbf{b} \;\Bigm|\; w_j b^j = 1 \ \text{and}\ \sum_{j=1}^N |b^j| = 1 \,\Bigr\}. \tag{E1}$$

In three dimensions—the cross-polytope is an octahedron—the six vertices correspond to the six strings with two zeroes, the twelve edges to the twelve strings with one zero, and the eight faces to the eight strings with no zeroes.
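The octahedron counts quoted above can be checked by brute force. The following is a minimal sketch, not part of the original derivation: the function name `count_hyperedges` is hypothetical, and the code simply enumerates all strings with characters in $\{-1, 0, +1\}$, groups them by the number $K$ of nonzero characters, and compares the tallies with the closed-form count $2^K N!/K!(N-K)!$.

```python
from itertools import product
from math import comb


def count_hyperedges(n_params):
    """Tally strings w in {-1, 0, +1}^N by K, the number of nonzero characters."""
    counts = {}
    for w in product((-1, 0, 1), repeat=n_params):
        k = sum(1 for character in w if character != 0)
        if k == 0:
            continue  # the all-zero string specifies no part of the polytope
        counts[k] = counts.get(k, 0) + 1
    return dict(sorted(counts.items()))


# Octahedron (N = 3): K = 1 vertices, K = 2 edges, K = 3 faces.
print(count_hyperedges(3))  # {1: 6, 2: 12, 3: 8}

# Compare with the closed form 2^K * N! / (K! (N - K)!) for N = 4.
n = 4
closed_form = {k: 2**k * comb(n, k) for k in range(1, n + 1)}
print(count_hyperedges(n) == closed_form)  # True
```

The enumeration grows as $3^N$, so it is only meant as a sanity check for small $N$.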
For example, $w = (0,-1,0)$ is the vertex on the negative $y$ axis, $w = (-1,0,+1)$ is the edge that connects the $-x$ axis with the positive $z$ axis, and $w = (+1,-1,+1)$ is the face in the octant defined by the $+x$, $-y$, and $+z$ axes. Generally, there are $2N$ vertices corresponding to strings with $N-1$ zeroes; $2N(N-1)$ edges corresponding to strings with $N-2$ zeroes; $2^N$ faces corresponding to strings with no zeroes; and $2^K N!/K!(N-K)!$ hyperedges of dimension $K$—these we call $K$-hyperedges—corresponding to strings with $N-K$ zeroes.

Consider now achieving the QCRB-O in a no-control estimation of $q = q_j\theta^j$ (recall that we assume that $|q_j| \le 1$, $j = 1,\ldots,N$) for a vector $\mathbf{b}$ that lies on the unit surface of $q$ and also lies in the interior of a $K$-hyperedge of the cross-polytope specified by string $w$. The discussion at Eq. (E1) leads to

$$\mathbf{b} = \sum_{j=1}^N w_j |b^j|\, \partial_j, \qquad \|\mathbf{b}\| = \sum_{j=1}^N |b^j| = 1. \tag{E2}$$

According to the discussion in Sec. IV B, the cross-polytope must kiss the unit surface of $q$ at $\mathbf{b}$. Hence, coincide with the $K$-hyperedge the unit surface of $q$ must, meaning that $q_j = w_j$ for $w_j = \pm 1$, with the other $q_j$s left arbitrary. Summarize: the linear combinations for which one-from-many and QCRB-O can be simultaneously saturated at $\mathbf{b}$ of Eq. (E2)—notice that $dq(\mathbf{b}) = 1$—are

$$q = \sum_{\{j \mid w_j = \pm 1\}} w_j\theta^j + \sum_{\{j \mid w_j = 0\}} q_j\theta^j = w + \sum_{\{j \mid w_j = 0\}} q_j\theta^j, \qquad |q_j| \le 1. \tag{E3}$$

1. Measurements sensitive only to the parameters on a hyperedge

Specialize now to no-control measurements that are sensitive only to the parameters on a hyperedge, i.e., $q = w$. Construct the states necessary for hyperedge measurements by considering the zero-including strings $w$. Let $\bar{w}$ be the string in which all the zeroes in $w$ are replaced by $+1$:

$$|\bar{w}\rangle = \bigotimes_{\{j \mid w_j = \pm 1\}} |w_j\rangle \bigotimes_{\{j \mid w_j = 0\}} |{+1}\rangle.$$
(E4)

Appreciate that in $-w$, all the zero entries remain zero, so those entries become $+1$ in $\overline{(-w)}$, giving

$$|\overline{(-w)}\rangle = \bigotimes_{\{j \mid w_j = \pm 1\}} |{-w_j}\rangle \bigotimes_{\{j \mid w_j = 0\}} |{+1}\rangle. \tag{E5}$$

Note carefully that the strings $w$ and $-w$ specify opposite $K$-hyperedges of the cross-polytope, whereas $\bar{w}$ and $\overline{(-w)}$ specify hyperfaces that contain these opposite $K$-hyperedges, but also share hyperedges that are specified by the $+1$s held in common by $\bar{w}$ and $\overline{(-w)}$.

Introduce the analog of the cat and $i$cat states of Eqs. (5.9) and (5.10):

$$|\psi^{(\pm)}_w\rangle = \frac{1}{\sqrt{2}}\bigl(|\bar{w}\rangle \pm |\overline{(-w)}\rangle\bigr), \tag{E6}$$

$$|\psi^{(\pm i)}_w\rangle = \frac{1}{\sqrt{2}}\bigl(|\bar{w}\rangle \pm i\,|\overline{(-w)}\rangle\bigr). \tag{E7}$$

Understand that in these states, unlike the cat and $i$cat states, the (irrelevant) qubits that have $w_j = 0$ are in a product of $+1$ eigenstates of $\sigma_z$. Any of these states is an optimal state for $\mathbf{b}$ of Eq. (E2); other optimal states can be constructed using any state for the irrelevant qubits, but the product of $+1$ eigenstates is convenient.

Use these new states as ingredients in the standard recipe. Let the qubits begin in the state $|\psi^{(+)}_w\rangle$. Imposition of the parameters leads to the state

$$e^{-iH(\tilde\theta)}|\psi^{(+)}_w\rangle = \frac{1}{\sqrt{2}}\bigl(e^{-iw_j\theta^j/2}|\bar{w}\rangle + e^{iw_j\theta^j/2}|\overline{(-w)}\rangle\bigr) \exp\Bigl(-\frac{i}{2}\sum_{\{j \mid w_j = 0\}} \theta^j\Bigr). \tag{E8}$$

The irrelevant qubits, in state $|{+1}\rangle$ in both parts of the superposition, contribute the final phase factor, which has no effect on measurement probabilities.
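Equation (E8) and the remark about the final phase factor can be checked numerically. The sketch below is illustrative only and rests on an assumption not restated in this appendix: that the parameters are imprinted by the commuting Hamiltonian $H(\tilde\theta) = \sum_j \theta^j \sigma_z^{(j)}/2$, which is what the phases in Eq. (E8) suggest. All function names are hypothetical, and states are stored sparsely as dictionaries from $\sigma_z$ strings to amplitudes.

```python
import cmath
import math


def fill_zeroes(w):
    """Replace every 0 in the string w by +1 (the zero-filled string of Eq. (E4))."""
    return tuple(wj if wj != 0 else 1 for wj in w)


def cat_state(w, rel_phase=1.0):
    """Superposition of the two zero-filled strings, as {basis string: amplitude}.

    Assumes w has at least one nonzero character, so the two strings differ.
    """
    minus_w = tuple(-wj for wj in w)
    return {
        fill_zeroes(w): 1 / math.sqrt(2),
        fill_zeroes(minus_w): rel_phase / math.sqrt(2),
    }


def evolve(state, theta):
    """Apply exp(-iH), ASSUMING H = sum_j theta_j sigma_z^(j) / 2 (diagonal)."""
    return {
        z: amp * cmath.exp(-0.5j * sum(zj * tj for zj, tj in zip(z, theta)))
        for z, amp in state.items()
    }


def overlap(bra, ket):
    """Inner product <bra|ket> of two sparse states."""
    return sum(complex(amp).conjugate() * ket.get(z, 0) for z, amp in bra.items())


def closed_form_e8(w, theta):
    """Right-hand side of Eq. (E8), built term by term."""
    phi = sum(wj * tj for wj, tj in zip(w, theta))  # w_j theta^j
    spectators = sum(tj for wj, tj in zip(w, theta) if wj == 0)
    phase = cmath.exp(-0.5j * spectators)  # from the irrelevant qubits
    minus_w = tuple(-wj for wj in w)
    return {
        fill_zeroes(w): phase * cmath.exp(-0.5j * phi) / math.sqrt(2),
        fill_zeroes(minus_w): phase * cmath.exp(0.5j * phi) / math.sqrt(2),
    }


w = (1, 0, -1)  # a string with one zero, N = 3
theta = (0.3, -0.7, 1.1)
evolved = evolve(cat_state(w), theta)
target = closed_form_e8(w, theta)
matches_e8 = all(cmath.isclose(evolved[z], target[z]) for z in target)

# Changing only the irrelevant parameter (second entry) leaves |<probe|psi>|^2 fixed.
probe = cat_state(w, rel_phase=1j)  # the state with the upper sign in Eq. (E7)
p_a = abs(overlap(probe, evolved)) ** 2
p_b = abs(overlap(probe, evolve(cat_state(w), (0.3, 2.5, 1.1)))) ** 2
print(matches_e8, math.isclose(p_a, p_b))
```

Both checks should print `True`: the evolved state agrees with the closed form, and the overlap probability with a fixed probe state is insensitive to the parameter carried by the irrelevant qubit.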
Make a measurement in the orthonormal basis consisting of $|\psi^{(\pm i)}_w\rangle$ and the product states $|z\rangle$, with $z \neq \bar{w}, \overline{(-w)}$. Results $z$ have zero probability, and the probabilities for the results corresponding to $|\psi^{(\pm i)}_w\rangle$ are

$$p(\pm|\tilde\theta) = \bigl|\langle\psi^{(\pm i)}_w| e^{-iH(\tilde\theta)} |\psi^{(+)}_w\rangle\bigr|^2 = \frac{1}{4}\bigl|e^{-iw_j\theta^j/2} \mp i\,e^{iw_j\theta^j/2}\bigr|^2 = \frac{1}{2}\bigl(1 \pm \sin w_j\theta^j\bigr), \tag{E9}$$

leading to Fisher-information matrix

$$F_{jk} = \sum_{\pm} \frac{1}{p(\pm|\tilde\theta = 0)} \left.\frac{\partial p(\pm|\tilde\theta)}{\partial\theta^j}\right|_{\tilde\theta = 0} \left.\frac{\partial p(\pm|\tilde\theta)}{\partial\theta^k}\right|_{\tilde\theta = 0} = w_j w_k \tag{E10}$$

or, equivalently,

$$F_{\downarrow} = dw \otimes dw = dq \otimes dq. \tag{E11}$$

This estimation scenario gathers information only about the property $q = w = w_j\theta^j$. For any vector $\mathbf{v}$, we have

$$F_{\mathbf{v}\mathbf{v}} = F_{jk}v^jv^k = (w_jv^j)^2 = [dw(\mathbf{v})]^2, \tag{E12}$$

which has the value 1 for any vector on the unit surface of $w$ (the vector need not be confined to the portion of that surface that is the hyperedge $w$ of the polytope). For a vector $\mathbf{b}$ on the hyperedge specified by $w$, as in Eq. (E2),

$$1 = dw(\mathbf{b}) = w_jb^j = \sum_j |b^j| = \|\mathbf{b}\| = \|\mathbf{b}H(\tilde\theta)\|_s, \tag{E13}$$

so the measurement satisfies the unified kissing condition (4.23),

$$F_{\downarrow}(\mathbf{b},\mathbf{v}) = \|\mathbf{b}H(\tilde\theta)\|_s\, dq(\mathbf{v}) \quad \text{for all } \mathbf{v}, \tag{E14}$$

and is an optimal no-control measurement of the parameter $q = w$, achieving both the one-from-many bound and the QCRB-O.

2. A zoo of measurements

Return now to a property $q$ of the general form (E3), and visit a zoo of varied optimal measurements that can be used for estimating $q$.

Specifying the fiducial state requires an ancillary qubit, which can be thought of as the zeroth qubit—let it appear on the far left of tensor products—and which does not participate in the parameter-dependent interaction.
Necessary will it be to make one of the primary qubits special, as the primary qubit that is entangled with the ancillary qubit, and that special qubit might as well be the first.

Choose as fiducial state

$$|\psi\rangle = \sum_z |z_1\rangle \otimes c_z|\psi^{(+)}_z\rangle = \sum_z |z_1\rangle \otimes \frac{c_z}{\sqrt{2}}\bigl(|z\rangle + |{-z}\rangle\bigr). \tag{E15}$$

Assume the amplitudes factor as

$$c_z = c_{1,z_1}\, c_{z_2 \ldots z_N}; \tag{E16}$$

squared, they are a probability distribution $p_z = |c_z|^2 = p_{1,z_1}\, p_{z_2 \ldots z_N}$. Unpack the notation to reveal what $|\psi\rangle$ is:

$$|\psi\rangle = \sum_{z_1} c_{1,z_1}|z_1\rangle \otimes \sum_{z_2,\ldots,z_N} \frac{c_{z_2 \ldots z_N}}{\sqrt{2}}\bigl(|z_1, z_2, \ldots, z_N\rangle + |{-z_1}, {-z_2}, \ldots, {-z_N}\rangle\bigr). \tag{E17}$$

This fiducial state could be created in the following way: start the primary qubits in the $z_1 = +1$ state on the right of Eq. (E17), start the ancilla in the state $\sum_{z_1} c_{1,z_1}|z_1\rangle$, and run a controlled-NOT from the ancilla to the first primary qubit.

If only one $c_{1,z_1} = 1$ is nonzero, the ancillary qubit is not entangled with the primary qubits, and the state of the primary qubits is a superposition of cat states, each corresponding to opposite faces of the cross-polytope. If all the $c_z = 1/\sqrt{2^N}$ are equal, the ancillary qubit is not entangled with the primary qubits, and $|\psi\rangle$ reduces to

$$|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|{+1}\rangle + |{-1}\rangle\bigr) \otimes \frac{1}{\sqrt{2^N}}\sum_z |z\rangle.$$
(E18)

The equal linear combination of all the basis states of the primary qubits is a product of $+1$ $\sigma_x$ eigenstates, so the entire state is a product of $+1$ $\sigma_x$ eigenstates for the ancillary qubit and the primary qubits.

Imposition of the parameters leads to

$$\begin{aligned}
|\psi_{\tilde\theta}\rangle = e^{-iH(\tilde\theta)}|\psi\rangle &= \sum_z |z_1\rangle \otimes \frac{1}{\sqrt{2}}\, c_z \bigl(e^{-iz_j\theta^j/2}|z\rangle + e^{iz_j\theta^j/2}|{-z}\rangle\bigr) \\
&= \frac{1}{2}\sum_z |z_1\rangle \otimes c_z \Bigl(\bigl(e^{-iz_j\theta^j/2} - i\,e^{iz_j\theta^j/2}\bigr)|\psi^{(+i)}_z\rangle + \bigl(e^{-iz_j\theta^j/2} + i\,e^{iz_j\theta^j/2}\bigr)|\psi^{(-i)}_z\rangle\Bigr).
\end{aligned} \tag{E19}$$

Measure in the orthonormal basis consisting of states $|z_1\rangle \otimes |\psi^{(\pm i)}_z\rangle$. The outcome probabilities,

$$p(\pm z|\tilde\theta) = \Bigl|\bigl(\langle z_1| \otimes \langle\psi^{(\pm i)}_z|\bigr)|\psi_{\tilde\theta}\rangle\Bigr|^2 = \frac{1}{4}\, p_z \bigl|e^{-iz_j\theta^j/2} \mp i\,e^{iz_j\theta^j/2}\bigr|^2 = \frac{1}{2}\, p_z \bigl(1 \pm \sin z_j\theta^j\bigr), \tag{E20}$$

give rise to Fisher-information matrix

$$F_{jk} = \sum_{\pm, z} \frac{1}{p(\pm z|\tilde\theta = 0)} \left.\frac{\partial p(\pm z|\tilde\theta)}{\partial\theta^j}\right|_{\tilde\theta = 0} \left.\frac{\partial p(\pm z|\tilde\theta)}{\partial\theta^k}\right|_{\tilde\theta = 0} = \sum_z p_z z_j z_k = \overline{z_j z_k}, \tag{E21}$$

where a bar denotes an average over $p_z$. This Fisher information is a convex combination of the Fisher informations of the form (E10) for the case where the two hyperedges are opposite faces of the cross-polytope. Because $z_j^2 = 1$, the Fisher-information matrix (E21) has 1s on the diagonal.

The same result emerges if the pure fiducial state (E15) is replaced by the mixed state

$$\rho = \sum_z p_z\, |z_1\rangle\langle z_1| \otimes |\psi^{(+)}_z\rangle\langle\psi^{(+)}_z|.$$
(E22)

This works because the measurement can be regarded as first determining whether $z_1$ is $\pm 1$, together with the subspace spanned by $|z\rangle$ and $|{-z}\rangle$, and then doing a measurement in the $i$cat basis $|\psi^{(\pm i)}_z\rangle$ within that subspace; coherence between these possibilities matters not.

Any vector $\mathbf{v}$ has Fisher information

$$F_{\mathbf{v}\mathbf{v}} = F_{jk}v^jv^k = \sum_z p_z (z_jv^j)^2 = \sum_z p_z [dz(\mathbf{v})]^2. \tag{E23}$$

If $\mathbf{b}$ points to a vertex on the $j$ axis of the cross-polytope, i.e., $\mathbf{b} = \pm\partial_j$, then $dz(\mathbf{b}) = \pm z_j$ and $F_{\mathbf{b}\mathbf{b}} = F_{jj} = 1$. The Fisher ellipsoid circumscribes the vertices of the cross-polytope.

Specialize now to the case where the amplitudes and probabilities factor completely,

$$c_z = \prod_{j=1}^N c_{j,z_j} = c_{1,z_1} \cdots c_{N,z_N}, \qquad p_z = \prod_{j=1}^N p_{j,z_j} = p_{1,z_1} \cdots p_{N,z_N}. \tag{E24}$$

The fiducial state (E17) becomes

$$|\psi\rangle = \frac{1}{\sqrt{2}}\Biggl(\sum_{z_1} c_{1,z_1}|z_1\rangle \otimes |z_1\rangle \bigotimes_{j=2}^N \sum_{z_j} c_{j,z_j}|z_j\rangle + \sum_{z_1} c_{1,z_1}|z_1\rangle \otimes |{-z_1}\rangle \bigotimes_{j=2}^N \sum_{z_j} c_{j,z_j}|{-z_j}\rangle\Biggr). \tag{E25}$$

Only the marginals

$$\overline{z_j} = p_{j,+1} - p_{j,-1} = a_j, \tag{E26}$$

which can take on values $|a_j| \le 1$, matter now, with the Fisher-information matrix becoming

$$F_{jk} = \delta_{jk}(1 - a_j^2) + a_j a_k. \tag{E27}$$

Worthwhile as an example is the case $a_j = a$, $j = 1, \ldots, N$. The Fisher ellipsoid has one minor axis,

$$\mathbf{v} = \frac{1}{\sqrt{N[1 + a^2(N-1)]}} \sum_j \partial_j, \tag{E28}$$

which points directly into the all-positive $2^N$-ant; any vector $\mathbf{u}$ that lies in the plane $0 = \sum_j u^j$ and has $\sum_j (u^j)^2 = (1 - a^2)^{-1}$ is a major axis. For $0 < a < 1$, the Fisher ellipsoid is prolate and circumscribes the cross-polytope. When $a = 0$, the Fisher ellipsoid becomes a sphere; when $a = +1$, it degenerates to the pair of planes $\sum_j v^j = \pm 1$.

Consider now the vertex specified by the string $w = (+1, 0, \ldots, 0)$ (the same construction works at any vertex). Vector $\mathbf{b} = \partial_1$ points to this vertex. As promised by Eq.
(E3), there should be an optimal no-control measurement of the parameter ($w = \theta^1$),

$$q = \theta^1 + \sum_{j=2}^N q_j\theta^j, \qquad |q_j| \le 1. \tag{E29}$$

Required is that the Fisher ellipsoid be tangent to the level surface of $q$; thus demand that the gradient of the Fisher quadratic form, $F_{jk}\theta^j\theta^k$, be proportional to the gradient of $q$ at $\theta^j = \delta^j_1$:

$$d\theta^1 + \sum_{k=2}^N q_k\, d\theta^k = dq \propto d(F_{jk}\theta^j\theta^k) = 2F_{jk}\theta^j\, d\theta^k = 2F_{1k}\, d\theta^k = 2\Bigl(d\theta^1 + a_1\sum_{k=2}^N a_k\, d\theta^k\Bigr). \tag{E30}$$

Choose

$$a_j = \begin{cases} 1, & j = 1, \\ q_j, & j = 2, \ldots, N, \end{cases} \tag{E31}$$

to make the proportionality, and—voilà!—find a no-control procedure for estimating $q$, achieving both the one-from-many and QCRB-O bounds.

Generalize this no-control measurement to a $K$-hyperedge. Let $w = (+1, \ldots, +1, 0, \ldots, 0)$, where there are $+1$s in the first $K$ positions and $0$s in the remaining $N-K$ slots (the same construction works for any $K$-hyperedge). A vector $\mathbf{b}$ on the hyperedge has the form (E2):

$$\mathbf{b} = \sum_{k=1}^K b^k\partial_k, \qquad \sum_{k=1}^K b^k = 1, \qquad b^k \ge 0. \tag{E32}$$

To be estimated is a linear combination of the form (E3):

$$q = w + \sum_{k=K+1}^N q_k\theta^k = \sum_{k=1}^K \theta^k + \sum_{k=K+1}^N q_k\theta^k, \qquad |q_k| \le 1. \tag{E33}$$

Any point on the $K$-hyperedge satisfies $\sum_{k=1}^K \theta^k = 1$, with $\theta^k \ge 0$, for $k = 1, \ldots, K$, and $\theta^k = 0$, for $k = K+1, \ldots, N$. The requirement that the Fisher ellipsoid be tangent to the level surface of $q$ is again that at any point on the $K$-hyperedge, the gradient of the Fisher quadratic form $F_{jk}\theta^j\theta^k$ be proportional to the gradient of $q$:

$$\sum_{k=1}^K d\theta^k + \sum_{k=K+1}^N q_k\, d\theta^k = dq \propto 2F_{jk}\theta^j\, d\theta^k = 2\sum_{k=1}^K d\theta^k\Bigl(\theta^k(1 - a_k^2) + a_k\sum_{j=1}^K a_j\theta^j\Bigr) + 2\sum_{k=K+1}^N d\theta^k\, a_k\sum_{j=1}^K a_j\theta^j. \tag{E34}$$

Make the proportionality true by choosing

$$a_j = \begin{cases} 1, & j = 1, \ldots, K, \\ q_j, & j = K+1, \ldots, N. \end{cases} \tag{E35}$$

Thus generalized is Eq. (E31) to a no-control procedure for estimating property $q$ of Eq.
(E33), achieving both the one-from-many and QCRB-O bounds. Cylindrical is the Fisher ellipsoid for the measurement given by Eq. (E35): it contains the $K$-hyperedges $w$ and $-w$ and runs off to infinity along the planes defined by those hyperedges; the cross-section of the cylinder is an ellipsoid.

The choice (E35) is similar, yet different from the measurement formulated in App. E 1. The difference? The measurement in App. E 1 uses a fiducial state that makes the measurement insensitive to parameters $\theta^k$ for $k = K+1, \ldots, N$; the measurement here adjusts the fiducial state of the previously superfluous qubits to give just the right sensitivity to those same parameters, thus delivering a procedure for no-control estimation of $q$ in Eq. (E33), instead of estimation of $w$.