Typicality in the foundations of statistical physics and Born's rule
DETLEF DÜRR AND WARD STRUYVE
Abstract.
Typicality has always been in the minds of the founding fathers of probability theory when probabilistic reasoning is applied to the real world. However, the role of typicality is not always appreciated. An example is the paper "Foundations of statistical mechanics and the status of Born's rule in de Broglie-Bohm pilot-wave theory" by Antony Valentini [1], where he presents typicality and relaxation to equilibrium as distinct approaches to the proof of Born's rule, while typicality is in fact an overriding necessity. Moreover, the "typicality approach" to Born's rule of "the Bohmian mechanics school" is claimed to be inherently circular. We wish to explain once more in very simple terms why the accusation is off target and why "relaxation to equilibrium" is neither necessary nor sufficient to justify Born's rule.
Nino Zanghì and D.D. remember vividly the discussions with GianCarlo Ghirardi on Boltzmann's insights into statistical physics and their relation to the random theory he himself had proposed (with his coworkers) and had worked on for many decades until his untimely death. Not only was GianCarlo an admirer of Boltzmann, he also had a full grasp of Boltzmann's ideas and of the role of typicality. The GRW theory is intrinsically random, and the |ψ|²-distribution arises from the collapse mechanism built into the theory; he understood that the appeal to typicality, for empirical assertions, cannot be avoided. We miss GianCarlo Ghirardi, our invaluable friend, coworker and colleague, and we dedicate this work in memoriam to him.

1. Why "most" cannot be avoided
Typicality has always been in the minds of the founding fathers of probability theory when probabilistic reasoning is applied to the real world. Nevertheless, its role is still often not understood. An example is [1], where Valentini presents typicality and relaxation to equilibrium as distinct approaches to the proof of Born's rule, while typicality is in fact an overriding necessity. Valentini writes in the abstract of his article:
We compare and contrast two distinct approaches to understanding the Born rule in de Broglie-Bohm pilot-wave theory, one based on dynamical relaxation over time (advocated by this author and collaborators) and the other based on typicality of initial conditions (advocated by the 'Bohmian mechanics' school). It is argued that the latter approach is inherently circular and physically misguided.
The accusation of circularity concerns the proof of Born's rule in de Broglie-Bohm pilot-wave theory, or "Bohmian mechanics" for short, given in [2]. It is an important proof, as it explains the observed regularity concerning the outcomes of measurements on ensembles of identically prepared systems. As such, Valentini's accusation is at the same time an onslaught on the ideas underlying statistical physics. We wish to explain once more in very simple terms why the accusation is off target and why "relaxation to equilibrium" is neither necessary nor sufficient to justify Born's rule.

In the history of mathematics, pointing out circularities in important proofs has sometimes been pathbreaking. An example is provided by what we would now call the "PhD thesis" of Georg Simon Klügel (1739–1812), who showed that all existing proofs (about 27 of them) of the 11th Postulate of Euclid on the uniqueness of parallels were circular, in that they used equivalents of the postulate as (hidden) assumptions. That thesis led to the discovery of non-Euclidean geometry!

The accusation of circularity in the derivation of Born's rule is however less breathtaking; it is simply off target. The criticism misses the point of statistical physics entirely. That may be partly due to the loose manner of speaking about probability and distributions which is common in statistical physics and which clouds the meaning of these objects. Instead, the notion of typicality is necessary to understand what the statistical predictions of a physical theory really mean. In fact, typicality (though the word may not have been directly used) has always been in the minds of probabilists and physicists, from Jacob Bernoulli onwards.

Consider the example of tossing a (fair) coin 1000 times. The typical outcome is a sequence with roughly 500 heads, i.e., a relative frequency of heads of roughly 1/2, the probability which we naturally assign to the sides of a (fair) coin. What matters here is that there is obviously some relation between the factual occurrence of the relative frequency of heads and the number 1/2. That needs to be explained. Why? Because other sequences are possible as well, for example sequences which show less than 300 heads. The question which needs to be answered is: why don't they show up in practice? The (mathematical) way that the regularity of roughly 500 heads is explained is by the law of large numbers (LLN), which establishes the closeness of the empirical distribution of heads, i.e., the distribution which counts the relative frequency of heads in the sequence of length 1000 (which is the large number in the LLN), and the number 1/2. (Luckily, for what we have to say here, we can ignore the issues related to the question what the notion of "probability 1/2" really means. It does not matter.)
How does the LLN explain that? By counting sequences! Here are some telling numbers: there are about 10^299 sequences with about 500 heads, and about 10^264 sequences with about 300 heads. So the proportion of sequences with about 500 heads versus about 300 heads is greater than 10^35. For sharpening our intuition about the power of such numbers, note that the age of the universe in seconds is about 10^17. Thus most sequences show a law-like regularity, namely that the relative number of heads is roughly 1/2. Wouldn't that suggest that it is most likely that the observed sequence has roughly equal numbers of heads and tails? Well, "most likely" is just another way of saying "with high probability". But then, what does probability mean here? It is better and simpler to say that the typical sequences will have roughly 500 heads. The LLN says nothing more than that. It is a typicality statement. We remark for later that in introductory courses on probability theory this counting is introduced as Laplace probability, a normalized quantity obtained by dividing the number of sequences of interest by the total number of sequences; given the role it plays in our example, we better refer to it as Laplace-typicality. The point of this example is that by mere counting of head-tail sequences (or 0–1 sequences) one sees that a typical sequence looks random, unpredictable, while randomness never entered. That is just the way typical sequences look.

Agnostics may still complain that this explains nothing: what needs to be explained is why we only see typical sequences! That is actually the deep question underlying the meaning of probability theory from its very beginning, and Antoine-Augustin Cournot (1801–1877) coined what became known as Cournot's principle, which in our own rough words just says that we should only be concerned with typical events. The point we wish to make with the simple example is that the appeal to typicality cannot be avoided. Sequences with drastically unequal numbers of heads and tails are physically possible. The reason they do not appear in practice is that there are much too few of them; they are atypical. There is no way around that. That is what the founders of probability theory understood and had to swallow. Note that typicality, through Cournot's principle, only tells us what to expect or not. It does not allow one to associate a probability to, say, the sequence with 300 heads: in terms of Laplace-typicality only values near zero or one matter – atypical or typical.
The notion of typicality is distinct from the notion of probability. Let's go a step further and consider the coin tossing as a physical process, because that is what it is after all: there is a hand which tosses the coin, thereby providing the coin with an initial momentum and position which determine its flight through the air. The trajectory of the coin is determined, given the initial conditions, by the laws of physics (here Newtonian) and hence it defines a function which maps initial conditions to head or tail (0 or 1). But the hand is just a physical system itself – a coin tossing machine, so to say. ("Most" should be understood as "overwhelmingly many".) The machine picks up the coin, throws it, and after the landing the machine notes down head or tail, picks the coin up again, throws it, and so on and so forth. Thus the resulting sequence of heads and tails depends only on the initial conditions, i.e., on the phase space point which determines the whole process of the coin tossing machine. The physical description and analysis may not be that easy, but at least the principle is clear: it shows that the head-tail sequences are the images of a function F = (F_1, ..., F_N) of the high dimensional phase space variables q – the initial conditions. Here the component F_k maps to the outcome δ ∈ {0, 1} of the k-th tossing of the machine. Such a function F (a coarse-graining function, by the way) is usually called a random variable. The point is that in this description, where phase space variables play the decisive role, counting is no longer possible, as classical phase space is a continuum. What then replaces the counting? That is a measure – a typicality measure. In classical physics, which would be appropriate for studying coin tossing as a physical process, the measure commonly used is the "Liouville-measure" – the volume measure in phase space.
It recommends itself by the property of being stationary, an observation which was promoted in the works of Ludwig Boltzmann. (In classical mechanics, there are many more measures which share this property, but that does not matter for our concerns here.) It is a measure which is suggested by the physical law itself and not by an arbitrary human choice. The fact that, with this measure, typicality is a timeless notion is of great help for proving the LLN. The role of the Liouville-measure is to define the notion of "most" for the phase space of classical mechanics, a space whose dimension is perhaps of the size of Avogadro's number. In mathematical terms, the above mentioned Laplace-typicality then emerges, ideally, as the image measure of the more fundamental Liouville-measure under the function F. To express the LLN in this more fundamental setting, it is useful to introduce the empirical distribution ρ_emp^N(q, δ), the function which counts the relative number of heads and tails and which is a function of the phase space variables q and the image variables δ ∈ {0, 1}:

ρ_emp^N(q, δ) = (1/N) Σ_{k=1}^N 1_{δ}(F_k(q)).

Here 1_{δ}(F_k(q)) = 1 if F_k(q) = δ and 0 otherwise. The LLN (if it were proven for a physically realistic coin tossing machine) would then say something like: for most phase points q (w.r.t. the Liouville-measure) and for large enough N, the empirical distribution ρ_emp^N ≈ 1/2, or, Liouville-measure typically, ρ_emp^N ≈ 1/2. For more on this, see the chapter "Chance in Physics" in [3]. (A technical remark on the side: to model the coin tossing experiment, in which the coin is thrown a great number of times, in a physically realistic way is not so easy, and to prove the LLN may turn out hard. The stochastic independence of the different tosses of the coin is easily said, but to prove it in a physically realistic model is far from easy; see [3], chapter "Chance in Physics", for an elaboration on that.)
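A toy stand-in for such a coin tossing machine (our illustration, not the realistic model discussed in [3]) is the doubling map on [0, 1): the k-th "toss" F_k(q) is the k-th binary digit of the initial condition q, and the uniform (Lebesgue) measure plays the role of the typicality measure. For typical q the empirical distribution of heads is close to 1/2, while atypical initial conditions exist, e.g. q = 0 yields all tails:

```python
import random

def F(q, N):
    """Toy deterministic 'coin tossing machine': the k-th toss is the k-th
    binary digit of the initial condition q in [0, 1) (doubling map)."""
    outcomes = []
    for _ in range(N):
        q = 2.0 * q
        digit = int(q)          # 1 if the doubled value reached 1, else 0
        outcomes.append(digit)
        q -= digit
    return outcomes

random.seed(0)
N = 50                          # float precision limits us to ~50 reliable digits
q0 = random.random()            # 'typical' w.r.t. the uniform measure on [0, 1)
rho_emp = sum(F(q0, N)) / N
print(rho_emp)                  # close to 1/2 for typical q0

print(sum(F(0.0, N)) / N)       # an atypical initial condition: all tails, 0.0
```

The deterministic map never involves randomness; the apparent randomness of the outcome sequence is just how typical initial conditions look.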
The reference to typicality cannot be avoided, as there are phase points which are mapped to sequences with, say, 300 heads, i.e., "most" cannot be replaced by "all". One further point should be noted, which is often used to actually justify the use of statistical methods in physics: it is almost impossible to know, in a realistic physical system, exactly which initial conditions lead to which outcomes (as for example in the case of the coin tossing machine). The power of typicality is that exact details are not needed. It suffices that for most initial conditions the observed statistical regularities obtain. Coin tossing is not a process which happens only here and now, but one which happens at arbitrary locations and times. To explain the statistical regularities in such generality, we still need to lift the whole discussion to a universal level. The universally relevant LLN would then have to say (very) roughly something like:
For most universes in which coin tossing experiments are done, i.e., for Liouville-measure-typical such universes, it is the case that the empirical distribution of heads in long enough sequences in coin tossing experiments is approximately 1/2.

The typicality assertion concerning Born's law is very analogous to this and has been proven in [2]. Before we turn to that, we shortly look at another rather simple classical system. Everything that will be said for this example can be carried over to the case of Bohmian mechanics. Consider an ideal gas of point particles in a rectangular box, let's say with elastic collisions of the particles at the walls. The gas is in equilibrium when the gas molecules fill the box approximately homogeneously. Most configurations (with respect to the Liouville-typicality measure) are like that, just as most 0–1 sequences have roughly as many 1's as 0's. Moreover, most non-equilibrium configurations will evolve to configurations which macroscopically look like equilibrium ones. ("Most" is again with respect to the Liouville-typicality measure, concentrated initially on the very small subset of configurations which are such that the box is only half filled.) Why? Because the equilibrium set in phase space is so overwhelmingly larger than the tiny non-equilibrium set that typically trajectories will wander around, end up in the overwhelmingly large set, and stay there for a very long time. And, as we said, there also exist atypical configurations which will not at all behave like that. That is, without typicality, we have no explanation why to expect equilibration. Having said this, we should warn the reader that this is just the physical idea behind the equilibration.
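The physical idea can be illustrated with a crude simulation (ours; non-interacting particles rather than a real gas, so this is only a cartoon of the mechanism): particles start crowded in the left half of a unit box, move freely with elastic reflection at the walls, and the fraction occupying the left half relaxes to about 1/2 for a typical choice of initial conditions:

```python
import random

random.seed(1)
N = 2000                         # toy number of gas particles
# Non-equilibrium initial condition: all particles in the left half of the box.
x = [random.uniform(0.0, 0.5) for _ in range(N)]
v = [random.uniform(-1.0, 1.0) for _ in range(N)]

def step(dt=0.05):
    """Free motion with elastic reflection at the walls x = 0 and x = 1."""
    for i in range(N):
        x[i] += v[i] * dt
        if x[i] < 0.0:
            x[i], v[i] = -x[i], -v[i]
        elif x[i] > 1.0:
            x[i], v[i] = 2.0 - x[i], -v[i]

def left_fraction():
    return sum(1 for xi in x if xi < 0.5) / N

print(left_fraction())           # 1.0: macroscopic non-equilibrium
for _ in range(2000):
    step()
print(left_fraction())           # close to 1/2: macroscopic equilibrium
```

Atypical initial conditions that never equilibrate exist here too, e.g. all velocities zero; the simulation only shows what typical initial data do.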
To turn this into a rigorous argument is famously hard – as hard as justifying the Boltzmann equation from first principles. With this warning in mind, we can think of describing the transition from non-equilibrium to equilibrium also in terms of coarse-graining densities ρ(x, t), which are more or less smooth functions (macro-variables) on three-dimensional physical space with variables x, and which should be pictured as approximations of empirical densities. The uniform density, i.e., ρ_eq(x) = const., would then be the equilibrium density. Hence, starting with a non-equilibrium density ρ_neq, it is perhaps reasonable to assume that ρ_neq(t) → ρ_eq as t gets large. This convergence of densities is sometimes referred to as the "mixing property", and we shall use this notion to mean just that: convergence of densities without reference to typicality. There have been attempts to show this. The mixing idea is presumably due to Willard Gibbs, who introduced the so-called ensemble view into statistical physics. An idea for a strategy for a "convergence to equilibrium proof" was suggested by Paul Ehrenfest, as recalled on page 85 by Marc Kac in [4], where Kac refers to Ehrenfest's attempt as an "amusing" theorem, since convergence to equilibrium in time does not follow at all from what Ehrenfest had shown. But even if the mixing property, i.e., convergence of densities, were shown to be of physical relevance, the connection with the actual configuration (i.e., the empirical distribution) would still have to be established. After all, Newtonian physics is about configurations and not densities. In addition, by our arguments above, some non-equilibrium densities will never show the mixing property, for example densities which are concentrated on "bad" configurations, i.e., atypical ones.

2. Born's Rule
What we have said about the statistical analysis in classical physics carries over to the statistical analysis of Bohmian mechanics, where the phase space is now replaced by configuration space. Born's rule ρ = |ψ|² is a shorthand for the universal LLN for the empirical distribution ρ_emp^N of the coordinates of the particles of subsystems in an ensemble. Roughly speaking, the universal LLN in the context of Born's rule says the following (for the precise formulation, see e.g. [2]):

For typical Bohmian universes the following holds: in an ensemble of (identical) subsystems of a universe, where each subsystem has effective wave function ψ, the empirical distribution ρ_emp^N of the particle coordinates of the subsystems is |ψ|²-distributed.

For this to hold sufficiently well, the number N of subsystems in the ensemble should be large. (The empirical density is in this case given by ρ_emp^N(q, x) = (1/N) Σ_{i=1}^N δ(x − x_i), where x_i is the position of the i-th particle; note the analogy with the definition in the case of the coin tossing. The effective wave function can be thought of as the usual wave function of a system as it is used in physics courses.) Note that, in analogy with the coin tossing, the role of the number 1/2 is here played by |ψ|², and the role of the sequence of length 1000 by the number of subsystems in the ensemble. But instead of the Liouville-measure, the typicality measure used in [2] is the measure P^Ψ(A) = ∫_A |Ψ|²(q) dq (q is a generic configuration space variable), where A is a subset of the configuration space of the Bohmian universe and Ψ is the universal wave function on that space. What is special about the typicality measure P^Ψ? It is a measure which is transported equivariantly by the Bohmian flow. This means that it is a typicality measure which, like the stationary Liouville measure, is independent of time. The very nice property of the universal quantum equilibrium LLN is that it is empirically adequate: to date, all tests affirm the empirical validity of Born's law.
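What the quantum equilibrium LLN asserts at the level of empirical frequencies can be made concrete in a toy numerical experiment (our sketch, assuming the textbook ground state of a particle in a unit box as the effective wave function ψ): for an ensemble of N subsystems whose positions are |ψ|²-distributed, the empirical frequency of any region matches the Born-rule integral up to fluctuations of order 1/√N:

```python
import math
import random

random.seed(2)

def psi_sq(x):
    """|psi(x)|^2 for the ground state psi(x) = sqrt(2) sin(pi x) on [0, 1]."""
    return 2.0 * math.sin(math.pi * x) ** 2

def sample(n):
    """Draw n positions from |psi|^2 by rejection sampling (|psi|^2 <= 2)."""
    out = []
    while len(out) < n:
        x = random.random()
        if 2.0 * random.random() <= psi_sq(x):
            out.append(x)
    return out

N = 100_000                      # number of subsystems in the ensemble
xs = sample(N)

# Empirical frequency of finding the particle in the middle half of the box,
# versus the Born-rule prediction: the integral of |psi|^2 over [1/4, 3/4].
emp = sum(1 for x in xs if 0.25 <= x <= 0.75) / N
born = 0.5 + 1.0 / math.pi       # exact value of that integral
print(emp, born)                 # agree up to fluctuations ~ 1/sqrt(N)
```

Of course, in [2] the |ψ|²-distribution of the ensemble is not put in by hand as it is here, but derived for typical initial configurations of the universe; the sketch only illustrates what the LLN statement asserts.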
3. Dynamical Relaxation?
Valentini dislikes the use of typicality. Instead, he proposes "dynamical relaxation" to equilibrium to explain Born's rule in the realm of Bohmian mechanics. It is however not at all clear what is meant by "dynamical relaxation" and in which way reference to typicality can be overcome. On the configurational level, i.e., on the level of empirical densities, starting in non-equilibrium, our discussion of the gas in the box applies verbatim. There will always exist initial configurations of particles for which the empirical distribution will never become close to |ψ|² – the equilibrium distribution. So why should we expect equilibrium then? Appealing to Boltzmann's idea, one could invoke typicality as in the case of the gas example. But as soon as one invokes typicality, there is no longer any need to invoke relaxation to begin with to explain equilibrium! Namely, most configurations will be in equilibrium most of the time, and hence non-equilibrium just doesn't occur – for all practical purposes – as established in [2]. Valentini also follows the Gibbs-Ehrenfest idea of mixing and provides an analytic argument for the convergence of densities. But the argument is the direct analogue of the "amusing" theorem proven by Ehrenfest, which "tells us nothing about the behavior of the non-equilibrium density ρ in time" [4]. Not to mention that the connection to empirical densities needs to be established on top of that. Hence the "dynamical relaxation" approach turns out to be neither necessary nor sufficient.

4. Physically misguided?
All of the quantum formalism follows from Born's rule [5]. There is no dispute about that. (This involves in some way or another an analysis of measurement situations, which in Bohmian mechanics is straightforward, and which in, let's say, orthodox quantum theory needs the problematical collapse of the wave function.) Heisenberg's uncertainty follows from Born's rule. No dispute about that either. There is actually no dispute about any of the consequences which arise from or in quantum equilibrium. So what is the dispute about then? If it is about the needed reference to typicality, then that can't be, because both "approaches" need reference to typicality anyhow for physically meaningful assertions. (As an aside, note that the typicality measure which is used in the LLN is really a member of an equivalence class of measures: all measures which are absolutely continuous with respect to P^Ψ yield a LLN for Born's law. It has been proven under very reasonable conditions in [6] that this measure is unique.)
What then makes the use of typicality physically misguided? Because the physical law allows for atypical universes? Because a coin tossing sequence of only heads is possible by the physical law? No argument, other than denying the physical law, can make those possibilities impossible. Why, then, don't we deny the law to make them go away? Because by humbly looking at the facts in our world we understand that the law-like regularities in apparently random events are in surprising harmony with the physical law: they are typical.

5. Acknowledgments
W.S. is supported by the Research Foundation Flanders (Fonds Wetenschappelijk Onderzoek, FWO), Grant No. G066918N.
References

[1] Valentini, A., "Foundations of statistical mechanics and the status of the Born rule in de Broglie-Bohm pilot-wave theory", in: Statistical Mechanics and Scientific Explanation, ed. Allori, V., World Scientific, forthcoming (arXiv:1906.10761).
[2] Dürr, D., Goldstein, S. & Zanghì, N., "Quantum equilibrium and the origin of absolute uncertainty", J. Stat. Phys., 843–907 (1992), DOI: 10.1007/BF01049004.
[3] Dürr, D. & Teufel, S., Bohmian Mechanics: The Physics and Mathematics of Quantum Theory, Springer (2009).
[4] Kac, M., Probability and Related Topics in Physical Sciences, Interscience Publishers Inc. (1959).
[5] Dürr, D., Goldstein, S. & Zanghì, N., "Quantum equilibrium and the role of operators as observables in quantum theory", J. Stat. Phys., 959–1055 (2004), DOI: 10.1023/B:JOSS.0000.
[6] Goldstein, S. & Struyve, W., "On the Uniqueness of Quantum Equilibrium in Bohmian Mechanics", J. Stat. Phys. (2007), DOI: 10.1007/s10955-007-9354-5.