Shakin' All Over: Proving Landauer's Principle without neglect of fluctuations
Wayne C. Myrvold
Department of Philosophy
The University of Western Ontario
[email protected] 24, 2020
Abstract
Landauer's principle is, roughly, the principle that there is an entropic cost associated with implementation of logically irreversible operations. Though widely accepted in the literature on the thermodynamics of computation, it has been the subject of considerable dispute in the philosophical literature. Both the cogency of proofs of the principle and its relevance, should it be true, have been questioned. In particular, it has been argued that microscale fluctuations entail dissipation that always greatly exceeds the Landauer bound. In this article Landauer's principle is treated within statistical mechanics, and a proof is given that neither relies on neglect of fluctuations nor assumes the availability of thermodynamically reversible processes. In addition, it is argued that microscale fluctuations are no obstacle to approximating thermodynamic reversibility as closely as one would like.
The statement that has come to be known as Landauer's Principle is, roughly, that there is an entropic cost associated with implementation of logically irreversible operations, that is, operations whose input states cannot be recovered from their output states. It is widely accepted in the literature on the thermodynamics of computation; see Leff and Rex (2003) for a sampling of the relevant literature and an extensive bibliography. Nonetheless, it has been the subject of considerable controversy in the philosophical literature (Earman and Norton, 1999; Norton, 2005; Ladyman et al., 2007, 2008; Norton, 2011; Ladyman and Robertson, 2013; Norton, 2013a,b,c; Ladyman and Robertson, 2014; Ladyman, 2018; Norton, 2018).

Ladyman, Presnell, Short, and Groisman (2007), hereinafter referred to as LPSG, presented a proof of Landauer's principle. The proof, like any proof, rests on assumptions. The operative assumptions of the proof are that a probabilistic version of the second law of thermodynamics holds, and that certain processes can be performed reversibly. These processes include, crucially, expansion of a single-molecule gas. Norton (2011, 2013b,c) has argued that inevitable fluctuations at the molecular level invalidate the assumption of even approximate thermodynamic reversibility of processes at the microscale, and that any process involves dissipation in excess of the bounds required by Landauer's principle, rendering the principle moot. This is regarded by Norton as a 'no-go' result, invalidating the basic framework within which most of the work on thermodynamics of computation has been carried out.

Ladyman and Robertson (2014) addressed the purported no-go result, arguing that the conclusion has not been established. They acknowledged, however, a concern about the assumption, ubiquitous in the literature on thermodynamics of computation, of molecular-scale processes carried out with negligible dissipation.

In this article, the subject of Landauer's principle is addressed from the point of view of statistical mechanics. It is shown that the relevant version of the second law of thermodynamics is provable within statistical mechanics, in two versions, classical and quantum. It is therefore not required as an independent assumption. A derivation within statistical mechanics of the Landauer principle is given that neither relies on neglect of fluctuations nor assumes the availability of thermodynamically reversible processes.

As Norton has rightly emphasized, a theorem of this sort is moot if the processes involved depart sufficiently far from thermodynamic reversibility. This is explicit in the theorem we prove. Unless there are processes available that approximate reversibility sufficiently closely, the theorem places no bounds on extra dissipation associated with logical irreversibility. For that reason, I will argue that, given the notion of thermodynamic reversibility relevant to the context at hand, fluctuations, even ones that are large on the scale at which the processes are taking place, pose no threat to the assumption that processes can take place that approximate thermodynamic reversibility as closely as we would like.

As is usual in thermodynamics, the thermodynamic state of a system $A$ is defined with respect to some set of manipulable variables $\lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_n\}$, which may represent, for example, the positions of the walls of a container the system is constrained to be in, or the value of applied fields. We thus consider a family of Hamiltonians $\{H_\lambda\}$.
The variables $\lambda$ are treated as exogenous, meaning that we do not include in our physical description the systems that are the sources of these applied fields, and we do not consider the influence of the system $A$ on those systems. They may also be freely specified, independently of the state of the system. We consider some set $\mathcal{M}$ of manipulations of the system, where each manipulation consists of some specification of $\lambda(t)$ through some interval $t_0 \le t \le t_1$. In addition, we may assume that there are available one or more heat reservoirs $\{B_i\}$ at temperatures $T_i$, with which the system can exchange heat. The system $A$ may be coupled and decoupled from these heat baths during the course of its evolution. That is, the interaction terms in the total Hamiltonian of the composite consisting of the system $A$ and the reservoirs $\{B_i\}$ are also treated as manipulable variables.

Since the time of Maxwell (1871, 1878) it has been recognized that the kinetic theory of heat entails that the second law of thermodynamics, as originally formulated, cannot hold strictly. When compressing a gas with a piston, we might find on some occasion that, due to a fluctuation in the force exerted by the gas on the piston, less work is needed to compress the gas than one would expect on average, and so in a given cycle of a heat engine we might obtain more work than is allowed by the second law from a given quantity of heat extracted. By the same token we might obtain less work than expected. We do not, however, expect that we will be able to consistently and reliably violate the Carnot limit on efficiency of a heat engine. The original version of the second law should be replaced by a probabilistic one. The second law will then be, to employ Szilard's vivid analogy, like a theorem about the impossibility of a gambling system intended to beat the odds set by a casino.

"Consider somebody playing a thermodynamical gamble with the help of cyclic processes and with the intention of decreasing the entropy of the heat reservoirs. Nature will deal with him like a well established casino, in which it is possible to make an occasional win but for which no system exists ensuring the gambler a profit" (Szilard 1972, p. 73, from Szilard 1925, p. 757).

On a macroscopic scale, we expect fluctuations to be negligible, but, as Norton has emphasized, at the microscale on which in-principle limitations on the thermal cost of computation are investigated, they are far from negligible. Accordingly, we will invoke probabilistic considerations, and treat of the evolution of probability distributions over the state of a system subjected to various manipulations. When considering the amount of work needed to perform an operation, or the amount of heat exchanged in the course of the evolution of the system, we will consider expectation values of work and heat exchanges, calculated with respect to those probability distributions.

We assume it makes sense to associate a probability distribution with a preparation procedure, and to compute on its basis probabilities for outcomes of subsequent manipulations. We need not enquire into the status of these probabilities, so long as they serve this purpose. Since the late nineteenth century it has been common to think of probability statements as involving veiled reference to relative frequencies in an actual or hypothetical sequence of events, or in an ensemble of similarly prepared systems. There is no commitment here to such frequentism about probabilities; probability considerations may be applied to single events. There is, however, a link between probabilities and mean outcomes in a long sequence of trials, afforded by the weak law of large numbers. Suppose that we are able to conduct multiple runs of a procedure, in such a way that the probabilities of the outcomes are the same on each trial, and the outcomes of any trial are probabilistically independent of the outcomes of all the others.
Then, if we take any outcome variable, and compute its mean value across the results of a long sequence of trials, with high probability this mean value will be close to its expectation value on a single trial. We can make the probability of any degree of approximation as high as we like by increasing the number of trials. Though the expectation values to be invoked are not defined in terms of mean values in a long sequence of trials, they have implications for such mean values. If we could construct a heat engine such that the expectation value of work extracted on each run exceeded the Carnot bound, we could, by running sufficiently many cycles, make the probability of a net violation of the Carnot bound as close to unity as we like.

We will treat of "states" $a = (\rho_a, H_a)$, consisting of a probability distribution over the phase space of the system $A$ (or, in the quantum context, a density operator on the system's Hilbert space), represented by a density $\rho_a$, and a Hamiltonian, which, as already noted, may depend on exogenous, manipulable variables. We consider the effects on those states of manipulations in some class $\mathcal{M}$.

As is usual in statistical mechanics, the distributions associated with the heat reservoirs $B_i$ will be canonical distributions, uncorrelated with the system $A$ (see Maroney 2007 for discussion of the justification for this use of canonical distributions). In the classical context, a canonical distribution is a distribution that has density, with respect to Liouville measure,
\[ \rho_\beta = Z^{-1} e^{-\beta H}, \tag{1} \]
where $\beta$ is the inverse temperature $1/kT$, and $Z$ is the normalization constant required to make the integral of this density over all phase space unity. This depends both on the Hamiltonian $H$ and on $\beta$, and is called the partition function.
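As a minimal numerical sketch of definition (1), one can replace the phase-space integral by a sum over a finite list of energies and check that the resulting weights are normalized and fall off as $e^{-\beta E}$. The finite list of energies, the function name `canonical_weights`, and the choice of units are illustrative assumptions, not part of the text.

```python
import math

def canonical_weights(energies, beta):
    """Normalized weights exp(-beta*E)/Z for a finite list of energies.

    Assumption: a finite set of energies stands in for the phase-space
    integral in (1); beta and the energies are in matching units.
    """
    # Shift by the minimum energy before exponentiating, for numerical
    # stability; the shift cancels in the normalized weights.
    e_min = min(energies)
    boltzmann = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(boltzmann)  # partition function of the shifted energies
    return [w / z for w in boltzmann]
```

Calling `canonical_weights([0.0, 1.0, 2.0], beta=1.0)` gives weights that sum to one, with each successive weight smaller by a factor $e^{-1}$.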
In the quantum context, a canonical state is represented by density operator
\[ \hat{\rho}_\beta = Z^{-1} e^{-\beta \hat{H}}, \tag{2} \]
where, again, $Z$ is the constant required to normalize the state.

As the reservoirs interact with $A$, correlations will be built up, but we will assume that the reservoirs are big enough and noisy enough that these are, as far as subsequent interactions with $A$ are concerned, effectively effaced, meaning that the effect of the reservoirs on $A$ is as if they are uncorrelated. This means, not that the probability distribution over the full state of $A$ and $B_i$ is a product distribution, but that the dynamical variables of $B_i$ and $A$ relevant to their interactions are effectively independent.

The manipulations of a system $A$ we will be considering will be ones of the following form.
• At time $t_0$, the system has some probability distribution $\rho_a$, and the Hamiltonian of the system $A$ is $H_a$.
• At time $t_0$, the heat reservoirs $B_i$ have canonical distributions at temperatures $T_i$, uncorrelated with $A$, and are not interacting with $A$.
• During the time interval $[t_0, t_1]$, the composite system consisting of $A$ and the reservoirs $\{B_i\}$ undergoes Hamiltonian evolution, governed by a time-dependent Hamiltonian $H(t)$, which may include successive couplings between $A$ and the heat reservoirs $\{B_i\}$.
• The internal Hamiltonians of the reservoirs $\{B_i\}$ do not change.
• At time $t_1$, the Hamiltonian of the system $A$ is $H_b$, and, as a result of Hamiltonian evolution of the composite system, the marginal probability distribution of $A$ is $\rho_b$.

This is a manipulation that takes a state $a = (\rho_a, H_a)$ to state $b = (\rho_b, H_b)$.

It should be noted that we are not considering manipulations that consist of a measurement performed on the system $A$ followed by a manipulation of the exogenous variables whose choice depends on the outcome of the measurement. Controlled operations are allowed, but the control mechanism must be internalized, that is, included in the system under study.
The system $A$ could consist of two parts $A_1$ and $A_2$, which interact in such a way that the state of $A_1$ affects what happens to $A_2$, which subsequently affects what happens to $A_1$. But all of this must be encoded in the Hamiltonian $H(t)$, which may be time-varying but which undergoes a preprogrammed evolution that is not dependent on the state of the system $A$. Otherwise, there may be dissipation associated with the operation of the control mechanism that gets left out of the analysis.

We will count energy exchanges with the reservoirs $B_i$ as heat (to be counted as positive if $A$ gains energy from $B_i$, negative if $A$ loses energy), and energy changes to $A$ due to changes in the external potentials as work (again, counted as positive if $A$ gains energy, negative if it loses energy).

Dropping the assumption of the availability of reversible processes requires revision of the familiar framework of thermodynamics, as it means dropping the assumption of the availability of an entropy function. In its place we will define quantities $S_{\mathcal{M}}(a \to b)$, defined relative to a class of available manipulations $\mathcal{M}$, to be thought of as analogs, in the current context, of entropy differences between states $a$ and $b$. These will be representable as differences in the values of some state function only in the limiting case in which all states can be connected reversibly.

For any manipulation $M$ that takes a state $a$ to a state $b$, we can define $\langle Q_i(a \to b) \rangle_M$ as the expectation value of the heat obtained by $A$ from reservoir $B_i$. We can use these to define
\[ \sigma_M(a \to b) = \sum_i \frac{\langle Q_i(a \to b) \rangle_M}{T_i}. \tag{3} \]
Let $\mathcal{M}(a \to b)$ be the set of manipulations in $\mathcal{M}$ that take $a$ to $b$, and define, as analogs of entropies (which we will henceforth just call "entropies"),
\[ S_{\mathcal{M}}(a \to b) = \mathrm{l.u.b.}\,\{ \sigma_M(a \to b) \mid M \in \mathcal{M}(a \to b) \}. \tag{4} \]
Via the obvious extension of this definition we also define quantities such as $S_{\mathcal{M}}(a \to b \to c)$ for processes with any number of intermediate steps.
It is assumed that manipulations can be composed, that is, that any manipulation that takes $a$ to $b$ can be followed by one that takes $b$ to $c$ to form a manipulation that takes $a$ to $b$ and then to $c$. It follows from this composition assumption and the definition of the entropies that
\[ S_{\mathcal{M}}(a \to b \to c) = S_{\mathcal{M}}(a \to b) + S_{\mathcal{M}}(b \to c), \tag{5} \]
and similarly for processes consisting of longer chains of intermediate states.

One version of the second law of thermodynamics says that, for any cyclic process, the sum of $Q_i/T_i$ over all heat exchanges cannot be positive. Since we're working in the context of statistical mechanics, and we do not want to ignore fluctuations, the appropriate revision of the second law involves expectation values of heat exchanges. A cyclic process will be one that restores the marginal probability distribution of the system $A$ to the one it started out with. The revised second law that we will prove in the next section states that, for any cyclic process, the sum of $\langle Q_i \rangle / T_i$ over all heat exchanges cannot be positive. In the notation we have introduced, this is:

The Statistical Second Law. For any state $a$, $S_{\mathcal{M}}(a \to a) \le 0$.

It follows from this that
\[ S_{\mathcal{M}}(a \to b \to a) = S_{\mathcal{M}}(a \to b) + S_{\mathcal{M}}(b \to a) \le 0, \tag{6} \]
and similarly for processes involving longer chains of intermediate states.

In any process $M$ that takes a state $a$ to a state $b$, some of the work done, or heat discarded into a reservoir, may be recovered by some process that takes $b$ back to $a$. If the process can be reversed with the signs of all $\langle Q_i \rangle$ reversed, then full recovery is possible. If full recovery is not possible, and cannot even be approached arbitrarily closely, we will say that the process is dissipatory. A manipulation $M'$ that takes $b$ to $a$ and recovers work done and heat discarded would be one such that
\[ \sigma_M(a \to b) + \sigma_{M'}(b \to a) = 0. \tag{7} \]
There might be a limit to how closely this can be approached.
Define the dissipation associated with the process of $M$ taking $a$ to $b$ as the distance between this limit and perfect recovery.
\[ \delta_M(a \to b) = \mathrm{g.l.b.}\,\{ -(\sigma_M(a \to b) + \sigma_{M'}(b \to a)) \mid M' \in \mathcal{M}(b \to a) \} = -S_{\mathcal{M}}(b \to a) - \sigma_M(a \to b). \tag{8} \]
It follows from the statistical second law that this is non-negative.

If there is no limit to how much the dissipation associated with processes that connect $a$ to $b$ can be diminished,
\[ S_{\mathcal{M}}(a \to b \to a) = 0. \tag{9} \]
When this holds, it is traditional to say that $a$ and $b$ can be connected reversibly, and to imagine a fictitious process that can proceed in either direction, reversing the signs of all heat exchanges. There is no harm in doing so, as long as this is not taken too literally. [Footnote: As Norton (2016) has argued, taking talk of irreversible processes too literally can lead to contradictions.] Following convention, we will say, for any $a$, $b$ for which (9) is satisfied, that $a$ and $b$ can be connected reversibly. When this locution is used, bear in mind that it is shorthand for (9), and does not presume the existence of an actual reversible process.

From the statistical second law it follows that, if all states can be connected reversibly (that is, if, for all $a$, $b$, $S_{\mathcal{M}}(a \to b \to a) = 0$), then there exists a state function $S_{\mathcal{M}}$, defined up to an additive constant, such that
\[ S_{\mathcal{M}}(a \to b) = S_{\mathcal{M}}(b) - S_{\mathcal{M}}(a). \tag{10} \]
This is the familiar entropy function. The reason we have been expressing things in an unfamiliar way is that we don't want to assume reversibility as a general rule.

Any manipulation that takes $a$ to $b$ must have dissipation of at least $-S_{\mathcal{M}}(a \to b \to a)$. Define the inefficiency associated with a manipulation that takes $a$ to $b$ as the amount by which its dissipation exceeds this minimal value.
\[ \eta_M(a \to b) = \delta_M(a \to b) - (-S_{\mathcal{M}}(a \to b \to a)) = S_{\mathcal{M}}(a \to b) - \sigma_M(a \to b). \tag{11} \]
If $a$ and $b$ can be connected reversibly, the distinction between dissipation and inefficiency vanishes, and the inefficiency is equal to the dissipation.

We are now in a position to state the version of Landauer's principle that we will be proving. Consider a logical operation $L$ that is not logically reversible, meaning that the input is not recoverable from the output. This means that there are two or more inputs $\{\alpha_i\}$ that are mapped by $L$ to the same output $\beta$. In a device that implements the logical operation $L$, the inputs will be represented by statistical mechanical states $\{a_i\}$, and the output by a statistical mechanical state $b$. Distinct inputs are to be represented by distinguishable states, which, in the classical context, means probability distributions with non-overlapping support, and in the quantum-mechanical context, by orthogonal density operators. An implementation of $L$ is a manipulation $M_L$ that maps each member of the set $\{a_i\}$ into $b$.

The question to be asked is: can the manipulation $M_L$ do this without incurring any inefficiency? That is, can we have $\eta_{M_L}(a_i \to b)$ equal to zero, for every $a_i$? Failing that, can we, by appropriate choice of manipulation, make every element of the set $\{\eta_{M_L}(a_i \to b)\}$ arbitrarily small?

In the next section we will prove the following.

Landauer bound on dissipations. If manipulation $M$ takes each of a distinguishable set $\{a_i,\ i = 1, \ldots, n\}$ of states to the same state $b$, then
\[ \sum_{i=1}^{n} e^{-\delta_M(a_i \to b)/k} \le 1. \]

This entails that every member of the set $\{\delta_M(a_i \to b)\}$ is greater than zero. It also entails a formulation that is often presented as a gloss of Landauer's principle, that the mean of the set is not smaller than $k \log n$. [Footnote: In this article, all logarithms are natural logarithms, that is, logarithms to the base $e$.]
\[ \frac{1}{n} \sum_{i=1}^{n} \delta_M(a_i \to b) \ge k \log n. \tag{12} \]
[Footnote: See Appendix for proof that this is entailed by our formulation of the Landauer principle.] That is, there is an average dissipation, taken over members of the set $\{a_i\}$, of at least $k \log n$. We might be able to reduce the dissipation associated with any particular member of the set as much as we like, but we cannot simultaneously make all of them arbitrarily small. For the case of $n = 2$, the most commonly discussed case, the constraint is graphed in Figure 1. The shaded region is the set of permitted pairs $(\delta_1, \delta_2) = (\delta_M(a_1 \to b)/k,\ \delta_M(a_2 \to b)/k)$.

Figure 1: Values of $(\delta_1, \delta_2)$ permitted by Landauer's principle.

If, as is usually assumed in these discussions, the states $\{a_i\}$ can be connected reversibly to $b$, then any dissipation is inefficiency, and bounds on dissipations are bounds on inefficiencies. If reversibility is not assumed, there may be unavoidable levels of dissipation associated with some state transitions; if this is the case, not every dissipation represents an inefficiency. We can re-state the Landauer principle in terms of inefficiencies.

Landauer bound on inefficiencies. If manipulation $M$ takes each of a distinguishable set $\{a_i,\ i = 1, \ldots, n\}$ of states to the same state $b$, then
\[ \sum_{i=1}^{n} e^{-(\eta_i - S_{\mathcal{M}}(a_i \to b \to a_i))/k} \le 1, \]
where $\eta_i$ is the inefficiency $\eta_M(a_i \to b)$.

If we have reversibility, then this entails that all of the inefficiencies $\eta_M(a_i \to b)$ must be positive, and that they cannot all be made arbitrarily small in the same process. Far enough from reversibility, it places no constraint on inefficiencies at all. The condition for the Landauer bound to place a constraint on inefficiencies is
\[ \sum_{i=1}^{n} e^{S_{\mathcal{M}}(a_i \to b \to a_i)/k} > 1. \tag{13} \]
A necessary condition for (13) to be satisfied, and thus for the Landauer principle to have teeth, is the condition that, for some $a_i$,
\[ S_{\mathcal{M}}(a_i \to b \to a_i) > -k \log n. \tag{14} \]
If Norton (2011, 2013b,c) is right about the minimum dissipation required for carrying out processes at the molecular level, then (14) is not satisfiable; because of fluctuations at the molecular level, any process departs from reversibility by an amount that far exceeds the Landauer bound. In section 4 it will be argued that this is not correct, and the Landauer principle does have teeth.

The Landauer bound we have stated involves a distinguishable set of states. Distinguishability, like reversibility, is something that we should not expect to hold perfectly; in actual implementations we will at best approximate perfect distinguishability. For this reason, the theorem that we will prove in the next section will not require perfect distinguishability, and will entail the version of the Landauer bound we have stated in this section as a special case.

The theorems we will be concerned with come in two versions, classical and quantum, each proven in pretty much the same way.
To avoid saying everything twice, we adopt a systematically ambiguous notation, and state each theorem in such a way that it can be read either as a theorem of classical statistical mechanics, or as a theorem of quantum statistical mechanics.

In what follows, $\rho$ will be used either for a density function, with respect to Liouville measure, on a classical phase space, or, in the quantum context, a density operator on a Hilbert space. $S[\rho]$ is the Gibbs entropy (classical), or the von Neumann entropy (quantum).
\[ S[\rho] = -k \langle \log \rho \rangle_\rho. \tag{15} \]
We also define the relative entropy of two distributions.
\[ S[\rho \,\|\, \sigma] = -k \left( \langle \log \sigma \rangle_\rho - \langle \log \rho \rangle_\rho \right). \tag{16} \]
$S[\rho \,\|\, \sigma]$ is one way to measure how much the distribution represented by $\sigma$ departs from that represented by $\rho$. It is equal to zero for $\sigma = \rho$, and is positive for any other $\sigma$.

Suppose $\bar{a}$ is a probabilistic mixture of states $\{a_i\}$.
\[ \rho_{\bar{a}} = \sum_{i=1}^{n} p_i \rho_{a_i}, \tag{17} \]
where $\{p_i\}$ are positive numbers that add up to one. Then the Gibbs/von Neumann entropy of $\bar{a}$ is related to that of the $a_i$'s via
\[ S[\rho_{\bar{a}}] = \sum_{i=1}^{n} p_i S[\rho_{a_i}] + \sum_{i=1}^{n} p_i S[\rho_{a_i} \,\|\, \rho_{\bar{a}}]. \tag{18} \]
If the states $\{a_i\}$ are distinguishable, then $S[\rho_{a_i} \,\|\, \rho_{\bar{a}}] = -k \log p_i$, and so
\[ S[\rho_{\bar{a}}] = \sum_{i=1}^{n} p_i S[\rho_{a_i}] - k \sum_{i=1}^{n} p_i \log p_i. \tag{19} \]

As outlined in the previous section, we are concerned with a system $A$ evolving between times $t_0$ and $t_1$ according to a time-varying Hamiltonian, and interacting successively with one or more heat baths $\{B_i\}$, which initially have canonical distributions at temperatures $T_i$. The Hamiltonians of the heat baths remain fixed throughout the evolution. We define
\[ \langle Q_i \rangle = -\Delta \langle H_{B_i} \rangle = -\left( \langle H_{B_i} \rangle_{\rho_{B_i}(t_1)} - \langle H_{B_i} \rangle_{\rho_{B_i}(t_0)} \right). \tag{20} \]
This is the expectation value of the heat energy obtained by $A$ from $B_i$.

Our first theorem relates the entropies as defined in the previous section to the Gibbs/von Neumann entropies.
Though a simple one, it is of fundamental importance in the foundations of statistical mechanics, and deserves to be called the Fundamental Theorem of Statistical Mechanics. [Footnote: This is not a new theorem. The classical version of it is found in Gibbs (1902, pp. 160–164), and the quantum version, in Tolman (1938, § … generalized Landauer principle.]

Proposition 1. If $\mathcal{M}$ is a class of manipulations of the sort outlined in Section 2, then, for any states $a$, $b$,
\[ S_{\mathcal{M}}(a \to b) \le S[\rho_b] - S[\rho_a]. \]

The following are immediate corollaries of this.

Corollary 1.1. The second law of statistical thermodynamics. For any state $a$, $S_{\mathcal{M}}(a \to a) \le 0$.

Corollary 1.2. If $a$ and $b$ can be connected reversibly (that is, if $S_{\mathcal{M}}(a \to b \to a) = 0$), then
\[ S_{\mathcal{M}}(a \to b) = S[\rho_b] - S[\rho_a]. \]

Thus, the Gibbs/von Neumann entropy is the state function whose existence is guaranteed by the second law plus reversibility.

Now, to the Landauer principle. If a manipulation $M$ takes each of the states $\{a_i\}$ to the same state $b$, then it must also take any probabilistic mixture $\bar{a}$ of these states to $b$. Let $\bar{a}$ be a mixture of the states $\{a_i\}$ with weights $\{p_i\}$. The expectation value of heat exchanged when $M$ is applied to this mixture is a weighted average of exchanges that would occur in the states $\{a_i\}$, and so
\[ \sigma_M(\bar{a} \to b) = \sum_{i=1}^{n} p_i\, \sigma_M(a_i \to b). \tag{21} \]
We must have, of course,
\[ \sigma_M(\bar{a} \to b) \le S_{\mathcal{M}}(\bar{a} \to b). \tag{22} \]
By the Fundamental Theorem,
\[ S_{\mathcal{M}}(\bar{a} \to b) \le S[\rho_b] - S[\rho_{\bar{a}}]. \tag{23} \]
From (18), the right-hand side of this is
\[ S[\rho_b] - S[\rho_{\bar{a}}] = \sum_{i=1}^{n} p_i \left( S[\rho_b] - S[\rho_{a_i}] \right) - \sum_{i=1}^{n} p_i S[\rho_{a_i} \,\|\, \rho_{\bar{a}}]. \tag{24} \]
Employing the Fundamental Theorem again,
\[ S[\rho_b] - S[\rho_{a_i}] \le -S_{\mathcal{M}}(b \to a_i). \tag{25} \]
Combining (21), (22), (23), (24), and (25) gives us
\[ \sum_{i=1}^{n} p_i\, \sigma_M(a_i \to b) \le -\sum_{i=1}^{n} p_i S_{\mathcal{M}}(b \to a_i) - \sum_{i=1}^{n} p_i S[\rho_{a_i} \,\|\, \rho_{\bar{a}}]. \tag{26} \]
Rearranging, and recalling the definition (8) of the dissipations, we get
\[ \sum_{i=1}^{n} p_i\, \delta_M(a_i \to b) \ge \sum_{i=1}^{n} p_i S[\rho_{a_i} \,\|\, \rho_{\bar{a}}]. \tag{27} \]
Thus, we have the result,

Proposition 2.
For any manipulation $M$ that takes each of $\{a_i\}$ to $b$, and any positive numbers $\{p_i\}$ such that $\sum_{i=1}^{n} p_i = 1$, we have
\[ \sum_{i=1}^{n} p_i\, \delta_M(a_i \to b) \ge \sum_{i=1}^{n} p_i S[\rho_{a_i} \,\|\, \rho_{\bar{a}}], \]
where $\bar{a}$ is a mixture of states $\{a_i\}$ with weights $\{p_i\}$.

This is our general version of Landauer's principle. If we apply this to the case in which the states $\{a_i\}$ are distinguishable, we get the following corollary.

Corollary 2.1. For any manipulation $M$ that takes each of a distinguishable set $\{a_i\}$ to $b$, and any positive numbers $\{p_i\}$ such that $\sum_{i=1}^{n} p_i = 1$, we have
\[ \sum_{i=1}^{n} p_i\, \delta_M(a_i \to b) \ge -k \sum_{i=1}^{n} p_i \log p_i. \]

As shown in the Appendix, this is equivalent to the following,
Corollary 2.2.
For any manipulation $M$ that takes each of a distinguishable set $\{a_i\}$ to $b$,
\[ \sum_{i=1}^{n} e^{-\delta_M(a_i \to b)/k} \le 1. \]

This is the version stated in the previous section.
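The content of the two corollaries can be checked numerically for small cases. The following is a minimal sketch, assuming units in which $k = 1$ by default; the function names are hypothetical and introduced only for illustration. It evaluates the left-hand side of the exponential bound, computes the least dissipation a second input state must carry when $n = 2$ and the first is given, and evaluates both sides of the weighted bound of Corollary 2.1.

```python
import math

def landauer_sum(deltas, k=1.0):
    """Left-hand side of the exponential bound: sum_i exp(-delta_i / k)."""
    return sum(math.exp(-d / k) for d in deltas)

def satisfies_bound(deltas, k=1.0, tol=1e-12):
    """True if the dissipations are permitted by the exponential bound."""
    return landauer_sum(deltas, k) <= 1.0 + tol

def min_partner_dissipation(delta1, k=1.0):
    """For n = 2 and delta1 > 0: least delta2 with exp(-d1/k) + exp(-d2/k) <= 1."""
    # 1 - exp(-delta1/k) is computed as -expm1(-delta1/k) for accuracy.
    return -k * math.log(-math.expm1(-delta1 / k))

def weighted_dissipation(deltas, probs):
    """Left-hand side of Corollary 2.1: sum_i p_i * delta_i."""
    return sum(p * d for p, d in zip(probs, deltas))

def shannon_term(probs, k=1.0):
    """Right-hand side of Corollary 2.1: -k * sum_i p_i log p_i."""
    return -k * sum(p * math.log(p) for p in probs if p > 0)
```

For instance, with $k = 1$ the symmetric pair $\delta_1 = \delta_2 = \log 2$ saturates the bound, while making $\delta_1 = 0.1$ forces $\delta_2$ above 2, illustrating that the dissipations cannot all be made small at once.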
The second law of statistical thermodynamics entails that, for any $a$, $b$,
\[ S_{\mathcal{M}}(a \to b \to a) \le 0. \tag{28} \]
We do not expect there to be any process that takes $a$ to $b$ and then back to $a$ without any dissipation. However, if the array of permitted manipulations is sufficiently rich, there may be no bound on dissipation short of zero, and we may have $S_{\mathcal{M}}(a \to b \to a) = 0$.

One way to have a process that proceeds with negligibly small dissipation is to keep the system $A$ in contact with a heat reservoir large and noisy enough that the reservoir may be regarded as canonically distributed throughout the process, and to vary the parameters $\lambda$ slowly enough that the time it takes for any appreciable change in these parameters is long compared to the equilibration time-scale of the system $A$. Then the system $A$ may be treated as if it is in equilibrium with the reservoir at each stage of the process. [Footnote: This does not, of course, mean that it is in equilibrium, only that, for the purposes at hand, differences between quantities calculated on the basis of the equilibrium distribution and quantities calculated on the basis of the actual distribution are small enough that they may be neglected.] We can also consider slowly varying the temperature of the reservoir. For a process like that, at any time $t$ during the process $A$ may be treated as having a canonical distribution for the instantaneous parameter values $(\lambda(t), \beta(t))$.

If $\rho$ is a canonical distribution for parameters $(\lambda, \beta)$, and $\rho'$ a canonical distribution for slightly differing parameters $(\lambda + d\lambda, \beta + d\beta)$, then, to first order in the parameter differences,
\[ d\langle H \rangle = \langle H \rangle_{\rho'} - \langle H \rangle_\rho = \sum_i \left\langle \frac{\partial H}{\partial \lambda_i} \right\rangle_\rho d\lambda_i - \beta^{-1}\, d\langle \log \rho \rangle. \tag{29} \]
[Footnote: The classical version of this is eq. (112) on p. 44 of Gibbs (1902), and the quantum, eq. (121.8) on p. 534 of Tolman (1938).] The first term on the right-hand side of this equation is the expectation value of the work done in changing the external parameters; the remainder is the expectation value of the heat obtained from the reservoir.
\[ \langle \bar{d}Q \rangle = -kT\, d\langle \log \rho \rangle, \tag{30} \]
where $kT = \beta^{-1}$. This means that, for a process in the course of which the system $A$ is in continual contact with a heat reservoir at temperature $T$ and the parameters $\lambda$ are varied slowly from values $\lambda_a$ to $\lambda_b$, the expectation value of total heat absorbed will have the approximate value
\[ \langle Q(a \to b) \rangle \approx -kT \left( \langle \log \rho_b \rangle_{\rho_b} - \langle \log \rho_a \rangle_{\rho_a} \right) = T \left( S[\rho_b] - S[\rho_a] \right). \tag{31} \]
As long as there is no in-principle limit to how much time a state-transformation may take, there is no in-principle limit to how closely this approximation can hold, and equality will be approached as the time-scale of the changes in the parameters $\lambda$ is increased, relative to the time-scale of equilibration of the system $A$.

The result (31) is a result about expectation values. It is not assumed that the actual value of heat exchanged will be close to its expectation value, or even that it will probably be close to its expectation value. The probability distribution for the heat exchange may have a large variance, and probabilities of large deviations from the expectation value may be far from negligible. That is, the result does not depend on disregard of fluctuations. When we say that the system has time to equilibrate, this does not mean that it is ever in a quiescent state, only that its distribution may be treated as canonical at each stage of the process.

Let $a$, $b$ be canonical states with parameters $(\lambda_a, \beta_a)$, $(\lambda_b, \beta_b)$. We will say that a class of manipulations $\mathcal{M}$ connects $a$ and $b$ quasi-statically if
1. $\mathcal{M}$ contains manipulations of the following form:
(a) During time interval $[t_0, t_0 + T]$, the parameters undergo smooth evolution $\lambda(t)$, with $\lambda(t_0) = \lambda_a$ and $\lambda(t_0 + T) = \lambda_b$.
(b) At time $t$ the system $A$ is in thermal contact with a heat reservoir at inverse temperature $\beta(t)$, where $\beta(t)$ is a smooth function with $\beta(t_0) = \beta_a$ and $\beta(t_0 + T) = \beta_b$.
2. For any such manipulation, there is one that proceeds twice as slowly. That is, there is a manipulation that takes place in time interval $[t_0, t_0 + 2T]$, with parameter values $\lambda'$, $\beta'$, where
\[ \lambda'(t_0 + t) = \lambda(t_0 + t/2), \qquad \beta'(t_0 + t) = \beta(t_0 + t/2), \]
for $t \in [0, 2T]$.

Then we have the following result.

Proposition 3. If $a$, $b$ are canonical states, and $\mathcal{M}$ is a class of manipulations that connects $a$ to $b$ quasi-statically, then
\[ S_{\mathcal{M}}(a \to b) = S[\rho_b] - S[\rho_a]. \]

We have, as a trivial corollary,
Corollary 3.1. If $a$, $b$ are canonical states, and $\mathcal{M}$ is a class of manipulations that connects $a$ to $b$ quasi-statically, and also connects $b$ to $a$ quasi-statically, then
$$S_{\mathcal{M}}(a \to b \to a) = 0.$$

Suppose that we have a system to which can be applied a manipulable external potential $V_\lambda$, and which can also be confined, by suitable barriers, to various regions $\{\Gamma_i\}$ of its state space. Let $\{a_i\}$ be a finite set of canonical states, confined to the regions $\{\Gamma_i\}$, with values $\lambda_a$ of the manipulable parameters $\lambda$ on which the external potential depends, and let $\{b_i\}$ be a set of canonical distributions confined to the same regions, with parameter values $\lambda_b$. Then, for any desired degree of approximation to the quasistatic limit, we can find a sufficiently slow variation of the parameters $\lambda$ that yields the desired degree of approximation for all of the transitions $a_i \to b_i$. We will say, of such a situation, that $\mathcal{M}$ uniformly quasi-statically connects $\{a_i\}$ to $\{b_i\}$. We have, as another corollary to Proposition 3,

Corollary 3.2. Let $\{a_i\}$, $\{b_i\}$ be sets of canonical states, such that $\mathcal{M}$ uniformly quasi-statically connects $\{a_i\}$ to $\{b_i\}$ and $\{b_i\}$ to $\{a_i\}$. Let $\{p_i\}$ be a set of non-negative numbers that sum to 1, and let $\bar{a}$ and $\bar{b}$ be probabilistic mixtures of $\{a_i\}$ and $\{b_i\}$ with weights $\{p_i\}$. Then
$$S_{\mathcal{M}}(\bar{a} \to \bar{b} \to \bar{a}) = 0.$$

The simplest example I can think of for illustrating erasure is a single particle in a box, with a partition that can be inserted and removed. If this is the only available manipulation, $S_{\mathcal{M}}(a \to b)$ will be zero for all states $a$, $b$ of the same temperature. To get nontrivial entropies, we need to introduce the possibility of doing work on and obtaining work from the system.

Suppose that the particle can be subjected to an external potential $V_\lambda$ that varies in the $x$-direction only. We take the system to be in thermal equilibrium with a heat bath at temperature $T$.
On a canonical distribution, the distributions of the momentum $p$ and the coordinates other than $x$ are unchanged when the potential $V_\lambda$ is varied. We therefore integrate these out, and consider the marginal distribution of the coordinate $x$,

$$\rho_{\lambda,\beta}(x) = \begin{cases} Z_{\lambda,\beta}^{-1}\, e^{-\beta V_\lambda(x)}, & \text{inside the container;} \\ 0, & \text{outside.} \end{cases} \tag{32}$$

Take the $x$-coordinate within the container to range from $-l$ to $l$. The partition function is

$$Z_{\lambda,\beta} = \int_{-l}^{l} e^{-\beta V_\lambda(x)}\, dx. \tag{33}$$

We need not assume that the potential $V_\lambda$ is under perfect control. It, too, may fluctuate, with its own probability distribution. Evolution of a probability distribution, via the Liouville equation, of a system subject to a potential $V$ that fluctuates with a probability distribution of its own, independent of the state of the system, is the same as evolution under a steady potential equal to the expectation value $\langle V \rangle$ of the potential. Thus, if the external force fluctuates, the stable distribution is the same as (32), with $V_\lambda(x)$ replaced by its expectation value at the point $x$. Fluctuations of the external potential, even large ones, do not invalidate our analysis.

Suppose the force on the particle is constant within the box, and may be varied in both strength and direction. The particle could, for example, be a charged particle, and the applied field an electric field. Then the external potential varies linearly with $x$. Take it to be

$$V_\lambda(x) = \lambda\, kT\, x / l, \tag{34}$$

where $\lambda$ is a dimensionless parameter.

The analog of compressing or expanding the one-particle gas is varying the external potential. As $\lambda$ is increased from zero, the distribution of the particle becomes more and more concentrated towards the left end of the container. We can make the probability that it is to the left of any chosen location as high as we want by taking $\lambda$ sufficiently large.
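The concentration claim can be checked numerically. The following is a minimal sketch, assuming only the marginal density (32) with the linear potential (34) on $[-l, l]$, so that $\beta V_\lambda(x) = \lambda x / l$. The function name `prob_left_of`, the scaled coordinate `u = x/l`, and the sample location are illustrative choices, not notation from the text.

```python
import math


def prob_left_of(u: float, lam: float) -> float:
    """Probability that x < u*l under the marginal density (32) with the
    linear potential (34), where u = x/l lies in [-1, 1].

    The density is proportional to exp(-lam*s) on s in [-1, 1]; the ratio
    of integrals simplifies to (1 - exp(-lam*(1+u))) / (1 - exp(-2*lam)).
    """
    if not -1.0 <= u <= 1.0:
        raise ValueError("u must lie in [-1, 1]")
    if lam == 0.0:
        return (1.0 + u) / 2.0  # zero potential: uniform distribution
    if lam < 0.0:
        # Reflecting x -> -x maps lam -> -lam; this avoids overflow in exp.
        return 1.0 - prob_left_of(-u, -lam)
    return math.expm1(-lam * (1.0 + u)) / math.expm1(-2.0 * lam)


if __name__ == "__main__":
    u = -0.5  # hypothetical location, a quarter of the way along the box
    for lam in (0.0, 1.0, 5.0, 20.0):
        print(f"lambda = {lam:5.1f}   P(x < {u} l) = {prob_left_of(u, lam):.8f}")
```

For any fixed location strictly inside the box, the printed probability rises monotonically towards 1 as `lam` grows but never reaches it, which is the feature the later construction of approximately distinguishable states relies on.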
Similarly, for negative values of $\lambda$, the distribution is concentrated towards the right end of the container.

Relative to a canonical distribution with $\lambda = 0$, a distribution for a large value of $\lambda$ has a large value of free energy, and so we have to do work on the gas while increasing the potential. The work done may be recovered by decreasing the potential back to zero. If the process is done slowly enough that the particle can be treated as canonically distributed at each stage of the process, the expectation value of the work recovered while decreasing the potential is equal to the expectation value of the work done in increasing it.

Let $b$ be a state in which no partition is present and the applied potential is zero. The probability distribution of the particle is uniform throughout the container. Now insert a partition that divides the container into subvolumes with ratio $p : (1 - p)$. Let $a_1(p)$ be a state in which the particle is to the left of the partition, and let $a_2(p)$ be a state in which the particle is to the right of the partition.

The states $a_1(p)$ and $a_2(p)$ are perfectly distinguishable states. There’s a complication, however: given our class of manipulations, we have no way to prepare them, starting from state $b$. If we start from $b$ and increase the potential, we can make the probability that the particle is to the left of where we intend to drop the partition as high as we like, but it can never be equal to 1.

In place of these states $a_1(p)$ and $a_2(p)$, which are perfectly distinguishable but not preparable using the manipulations considered, we consider a pair of states that are almost distinguishable, and are preparable. Let $\epsilon$ be a small positive number, and let $a_1^\epsilon(p)$ be a state in which $V_\lambda$ is zero, and a partition is present, dividing the container into subvolumes with ratio $p : (1 - p)$, and in which there is a probability of $1 - \epsilon$ that the particle is to the left of the partition, and probability $\epsilon$ that it is to the right.
Define $a_2^\epsilon(p)$ similarly, with the probabilities reversed.

One manipulation that takes $a_1^\epsilon(p)$ to $b$ is removal of the partition, after which the particle equilibrates. This is inefficient, as we could have performed an expansion of the gas, in the course of which work is obtained and heat enters the gas from the reservoir.

To see how much inefficiency, we consider the following process, which is analogous to a controlled expansion of a gas. We start in state $a_1^\epsilon(p)$.

1. We first slowly increase $\lambda$ to the point at which, on the canonical distribution for $V_\lambda$, the particle has probability $1 - \epsilon$ of being to the left of the partition, and probability $\epsilon$ of being on the right.
2. We remove the partition, allowing the particle to move freely throughout the container. The probability distribution does not change, as the probability, on the equilibrium distribution, of the particle being on the left of the former location of the partition is the same as it was before the partition was removed.
3. The potential is slowly decreased to zero.

Footnote: General rule: if we take state space $\Gamma$ and partition the space into disjoint regions $\Gamma_i$, a canonical distribution $\rho$ defined on $\Gamma$ is a mixture of canonical distributions $\rho_i$ confined to the regions $\Gamma_i$, with weights being the probabilities, on $\rho$, that the system is in $\Gamma_i$.

The process can be performed in reverse order to create $a_1^\epsilon(p)$ from $b$. If we have available to us arbitrarily slow processes,

$$S_{\mathcal{M}}(a_1^\epsilon(p) \to b \to a_1^\epsilon(p)) = S_{\mathcal{M}}(a_2^\epsilon(p) \to b \to a_2^\epsilon(p)) = 0. \tag{35}$$

The expectation value of heat gained in the process of expansion is, in the quasistatic approximation,

$$\langle Q(a_1^\epsilon(p) \to b) \rangle = T \left( S[b] - S[a_1^\epsilon(p)] \right) = -kT \left[ (1 - \epsilon) \log p + \epsilon \log(1 - p) - v(\epsilon) \right], \tag{36}$$

where

$$v(\epsilon) = \epsilon \log \epsilon + (1 - \epsilon) \log(1 - \epsilon). \tag{37}$$

We can make $\langle Q(a_1^\epsilon(p) \to b) \rangle$ as close to $-kT \log p$ as we like by taking $\epsilon$ sufficiently small.

Therefore, erasure by removing the partitions has associated with it inefficiencies,

$$\begin{aligned} \eta_1 &= -k \left[ (1 - \epsilon) \log p + \epsilon \log(1 - p) - v(\epsilon) \right] \approx -k \log p, \\ \eta_2 &= -k \left[ \epsilon \log p + (1 - \epsilon) \log(1 - p) - v(\epsilon) \right] \approx -k \log(1 - p). \end{aligned} \tag{38}$$

Suppose that we want an erasure process that takes both $a_1^\epsilon(p)$ and $a_2^\epsilon(p)$ to the state $b$. One such process goes by removal of the partition. This has the inefficiencies exhibited in (38). But we have only availed ourselves of a fairly limited set of operations. Would it be possible to concoct a different set of operations, which might include the employment of auxiliary systems subject to any sort of Hamiltonian we might dream up, whether or not realization of such Hamiltonians is remotely feasible, and thereby construct an operation that takes both $a_1^\epsilon(p)$ and $a_2^\epsilon(p)$ to $b$, with lower inefficiency for both input states than the lossy removal-of-partition operation? Alas, the answer is negative. As the reader can verify, as long as $\epsilon < p < 1 - \epsilon$, the pair of inefficiencies (38) saturate the Landauer bound exhibited in Proposition 2.
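As a numerical illustration of the approximations in (38), the sketch below evaluates both inefficiencies for shrinking $\epsilon$ and compares them with their stated limits $-k \log p$ and $-k \log(1 - p)$. It assumes only the expressions (37) and (38) as written; the function names, the choice $k = 1$, and the sample value of $p$ are illustrative. It does not reproduce Proposition 2, which is stated outside this section.

```python
import math


def v(eps: float) -> float:
    """v(eps) from eq. (37)."""
    return eps * math.log(eps) + (1.0 - eps) * math.log(1.0 - eps)


def inefficiencies(p: float, eps: float, k: float = 1.0) -> tuple:
    """Return (eta_1, eta_2) from eq. (38); k sets the units (k = 1 here)."""
    if not (0.0 < p < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("p and eps must lie strictly between 0 and 1")
    eta1 = -k * ((1.0 - eps) * math.log(p) + eps * math.log(1.0 - p) - v(eps))
    eta2 = -k * (eps * math.log(p) + (1.0 - eps) * math.log(1.0 - p) - v(eps))
    return eta1, eta2


if __name__ == "__main__":
    p = 0.3  # hypothetical volume fraction to the left of the partition
    lim1, lim2 = -math.log(p), -math.log(1.0 - p)
    for eps in (1e-2, 1e-4, 1e-8):
        eta1, eta2 = inefficiencies(p, eps)
        print(f"eps = {eps:.0e}   eta_1 - lim = {eta1 - lim1:+.3e}   "
              f"eta_2 - lim = {eta2 - lim2:+.3e}")
```

In the limit $\epsilon \to 0$ the two values become $-k \log p$ and $-k \log(1 - p)$, for which $e^{-\eta_1/k} + e^{-\eta_2/k} = p + (1 - p) = 1$; this is the boundary case of condition (B) of Lemma 4 in the Appendix.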
This means that no process, no matter how elaborate, will achieve a lower inefficiency for both input states, so long as all exchanges of heat are with canonically distributed reservoirs, there are at the beginning of the process no dynamically relevant correlations between the state of $A$ and either the auxiliary systems or the reservoirs, the evolution of the total system is Hamiltonian, and at the end of the evolution the auxiliary systems are restored to their initial states.

The LPSG proof proceeds as follows. Suppose we have a manipulation $M_L$ that takes each of a distinguishable set of states $\{a_i, i = 1, \ldots, n\}$ of a device $D$ to a common destination state $b$. The proof employs as an auxiliary system a one-molecule gas in a box into which partitions may be inserted and removed, and which can be expanded reversibly. LPSG reason that, on pain of violating the statistical second law of thermodynamics, the manipulation $M_L$ must satisfy the Landauer principle. This involves considering the following cycle of operations (performed with both the device $D$ and the gas $G$ in contact with a heat reservoir at all times). The starting state is one in which device $D$ is in state $b$, and there are no partitions in the box.

1. $n - 1$ partitions are inserted, dividing the box into $n$ subvolumes, with volumes that are fractions $p_i$ of the total volume. With probability $p_i$, the gas molecule is in the $i$th subvolume.
2. A controlled operation is performed on $D$, using the state of the gas $G$ as control. If the gas molecule is in the $i$th subvolume, $b$ is taken to $a_i$.
3. A controlled operation is performed on the gas $G$, using the state of $D$ as control. The $i$th subvolume is expanded reversibly, obtaining heat $-kT \log p_i$ from the reservoir. The gas has now been restored to its initial state.
4. The operation $M_L$ is performed, restoring the device $D$ to the state $b$.

If one works through the expectation values of heat exchanges in the course of this cycle, assuming the statistical second law but not assuming reversibility of the processes $b \to a_i$, then what is obtained is precisely our Corollary 2.1 of section 3. Obviously, if one replaces the assumption that heat $-kT \log p_i$ can be obtained in step 3 with the assumption that there are operations such that the expectation value of heat obtained can come arbitrarily close to $-kT \log p_i$, the result still obtains.

The point of contention is whether expansion of a one-molecule gas can be performed in such a way that the expectation value of heat obtained is arbitrarily close to $-kT \log p_i$. Norton, in the works cited, contends that this is false. In my opinion Ladyman and Robertson (2014) are right when they say that he has not established this. However, if one has doubts about this being true for a one-molecule gas expanded by a piston, because of lack of control over a sufficiently sensitive piston, our example from the previous section of a one-molecule gas subjected to an external potential may be substituted.

We replace step 3 with the following process. For simplicity we illustrate it for the case of a single partition; extension to multiple partitions is straightforward. Suppose the particle is found to be to the left of the partition. The initial state is $a_1(p)$.

1. Slowly increase $\lambda$ to a high positive value $\lambda^*$.
2. Remove the partition, and allow the system to equilibrate. Some heat is absorbed from the reservoir, but, for large $\lambda^*$, this is small.
3. Slowly decrease $\lambda$ to zero.

If the particle is found to the right of the partition, one takes $\lambda$ to a large negative value instead. It is not difficult to calculate the expectation value of heat obtained in such a process in the adiabatic limit.
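For readers who want numbers, the sketch below evaluates the closed-form expression for $\langle Q \rangle$ that the author reports in a footnote to this passage, taking that formula as given rather than re-deriving it, and shows how fast it approaches $-kT \log p$. The function names, the choice $kT = 1$, and the sample values of $p$ and $\lambda^*$ are illustrative assumptions.

```python
import math


def shortfall(p: float, lam_star: float, kT: float = 1.0) -> float:
    """Amount by which <Q> falls short of -kT*log(p), using the footnoted
    closed form: kT * log((1 - exp(-lam*)) / (1 - exp(-p*lam*)))."""
    if not (0.0 < p < 1.0) or lam_star <= 0.0:
        raise ValueError("need 0 < p < 1 and lam_star > 0")
    # log1p keeps precision when the exponentials are tiny.
    return kT * (math.log1p(-math.exp(-lam_star))
                 - math.log1p(-math.exp(-p * lam_star)))


def expected_heat(p: float, lam_star: float, kT: float = 1.0) -> float:
    """<Q> as given by the footnoted formula."""
    return -kT * math.log(p) - shortfall(p, lam_star, kT)


if __name__ == "__main__":
    p = 0.5  # hypothetical volume fraction
    for lam_star in (5.0, 10.0, 20.0, 40.0):
        gap = shortfall(p, lam_star)
        print(f"lam* = {lam_star:5.1f}   <Q> = {expected_heat(p, lam_star):.12f}"
              f"   shortfall = {gap:.3e}   exp(-p lam*) = {math.exp(-p * lam_star):.3e}")
```

The shortfall column tracks $e^{-p\lambda^*}$ closely once $\lambda^*$ is large, which is the exponential approach asserted in the footnote.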
The details of this calculation need not concern us; what matters is that it can be made arbitrarily close to $-kT \log p$ by taking $\lambda^*$ sufficiently large.

Footnote: For those who are interested, the result is
$$\langle Q \rangle = -kT \log p - kT \log \left( \frac{1 - e^{-\lambda^*}}{1 - e^{-p\lambda^*}} \right).$$
For any $p$, $0 < p < 1$, for large $\lambda^*$ we have
$$\langle Q \rangle \approx -kT \log p - kT\, e^{-p\lambda^*}.$$
Therefore, $\langle Q \rangle$ approaches $-kT \log p$ exponentially with increase of $\lambda^*$.

Landauer’s principle is a theorem of statistical mechanics. The worries raised by Norton about assuming reversibility can be addressed; fluctuations pose no threat to the extent to which we can approximate reversibility, in the relevant sense. If the system being manipulated is in contact with a heat reservoir at temperature $T$ throughout a cycle of operations, the expectation value of heat exchanged over the course of the cycle can be made as small as one likes if one is patient enough. On any given run of the cycle, the actual heat exchanged may differ wildly from this expectation value, but it is the expectation value that is relevant to the statistical version of Landauer’s principle.

I am grateful to a number of people with whom I have discussed these matters over the years. In particular, I thank Owen Maroney for drawing my attention to what I have called the Fundamental Theorem, John Norton for discussions of reversible processes, and Katie Robertson for comments on an earlier draft of this article.

Appendix
To be proven: If $\mathcal{M}$ is a class of manipulations of the sort outlined in Section 2, then, for any states $a$, $b$,
$$S_{\mathcal{M}}(a \to b) \leq S[\rho_b] - S[\rho_a].$$
We use the following lemmas.

Lemma 1. For any Hamiltonian $H$, and any $T > 0$, the canonical distribution at temperature $T$ minimizes $\langle H \rangle_\rho - T S[\rho]$.

Lemma 2. Subadditivity. For a composite system $AB$,
$$S[\rho_{AB}] \leq S[\rho_A] + S[\rho_B],$$
with equality if and only if the subsystems are probabilistically independent.

Lemma 3. $S[\rho]$ is conserved under Hamiltonian evolution.

We consider some manipulation $M \in \mathcal{M}$ that takes a state $a$ of $A$ at $t_0$ to a state $b$ at $t_1$. At time $t_0$ the composite system consisting of $A$ and $\{B_i\}$ has distribution represented by density $\rho_{tot}(t_0)$. At time $t_1$ the density is $\rho_{tot}(t_1)$. We will write $S_{tot}(t)$ as an abbreviation for $S[\rho_{tot}(t)]$, and similarly for $S_A(t)$ and $S_{B_i}(t)$.

By Lemma 1 we have, for each reservoir $B_i$,
$$\langle H_{B_i}(t_0) \rangle - T_i S_{B_i}(t_0) \leq \langle H_{B_i}(t_1) \rangle - T_i S_{B_i}(t_1), \tag{39}$$
or,
$$\Delta \langle H_{B_i} \rangle - T_i\, \Delta S_{B_i} \geq 0. \tag{40}$$
Since $\langle Q_i \rangle = -\Delta \langle H_{B_i} \rangle$, this gives
$$\frac{\langle Q_i \rangle}{T_i} \leq -\Delta S_{B_i}. \tag{41}$$
Because $A$ is uncorrelated with each $B_i$ at $t_0$,
$$S_{tot}(t_0) = S_A(t_0) + \sum_{i=1}^{n} S_{B_i}(t_0). \tag{42}$$
Because of subadditivity,
$$S_{tot}(t_1) \leq S_A(t_1) + \sum_{i=1}^{n} S_{B_i}(t_1). \tag{43}$$
Because Hamiltonian evolution conserves $S$,
$$S_{tot}(t_0) = S_{tot}(t_1). \tag{44}$$
Taken together, (42), (43), and (44) yield
$$\Delta S_A + \sum_{i=1}^{n} \Delta S_{B_i} \geq 0. \tag{45}$$
This, together with (41), gives us the result,
$$\sigma_M(a \to b) = \sum_{i=1}^{n} \frac{\langle Q_i \rangle}{T_i} \leq \Delta S_A. \tag{46}$$
Since this must hold for every manipulation in the set $\mathcal{M}$, it must hold also for $S_{\mathcal{M}}(a \to b)$, which we defined as the least upper bound of the set of all $\sigma_M(a \to b)$ for $M \in \mathcal{M}$. This gives us the desired result,
$$S_{\mathcal{M}}(a \to b) \leq \Delta S_A. \tag{47}$$

Lemma 4.
Let $\{x_i, i = 1, \ldots, n\}$ be any sequence of $n$ real numbers. The following are equivalent.

(A) For all non-negative $\{p_i, i = 1, \ldots, n\}$ such that $\sum_i p_i = 1$,
$$\sum_{i=1}^{n} p_i x_i \geq -\sum_{i=1}^{n} p_i \log p_i.$$

(B)
$$\sum_{i=1}^{n} e^{-x_i} \leq 1.$$

To prove this, we use the following.

Lemma 5. For any positive numbers $\{p_i\}$, $\{q_i\}$,
$$\sum_{i=1}^{n} p_i \left( \log q_i - \log \sum_j q_j \right) \leq \sum_{i=1}^{n} p_i \left( \log p_i - \log \sum_j p_j \right).$$

To prove this: given $\{p_i\}$, find $\{q_i\}$ that maximizes the LHS; this maximum value is the RHS. Details omitted. We now proceed to the proof of Lemma 4.

Proof that (A) ⇒ (B). Suppose that $\{x_i\}$ are such that (A) holds. Take
$$p_i = \frac{e^{-x_i}}{\sum_{j=1}^{n} e^{-x_j}}. \tag{48}$$
Then $\sum_i p_i = 1$, and
$$\sum_{i=1}^{n} p_i x_i = -\sum_{i=1}^{n} p_i \log p_i - \log \sum_{j=1}^{n} e^{-x_j}. \tag{49}$$
In order for (A) to be satisfied, we must have
$$\log \sum_{j=1}^{n} e^{-x_j} \leq 0, \tag{50}$$
which is equivalent to
$$\sum_{j=1}^{n} e^{-x_j} \leq 1. \tag{51}$$

Proof that (B) ⇒ (A). Suppose that $\{x_i\}$ are such that (B) holds. Let $q_i = e^{-x_i}$. Then
$$\sum_{i=1}^{n} p_i x_i = -\sum_{i=1}^{n} p_i \log q_i. \tag{52}$$
By Lemma 5, for any $\{p_i\}$ such that $\sum_i p_i = 1$,
$$-\sum_{i=1}^{n} p_i \log q_i \geq -\sum_{i=1}^{n} p_i \log p_i - \log \left( \sum_{i=1}^{n} q_i \right), \tag{53}$$
and so
$$\sum_{i=1}^{n} p_i x_i \geq -\sum_{i=1}^{n} p_i \log p_i - \log \left( \sum_{i=1}^{n} q_i \right). \tag{54}$$
Because of (B),
$$\log \left( \sum_{i=1}^{n} q_i \right) = \log \left( \sum_{i=1}^{n} e^{-x_i} \right) \leq 0, \tag{55}$$
and so,
$$\sum_{i=1}^{n} p_i x_i \geq -\sum_{i=1}^{n} p_i \log p_i. \tag{56}$$

References

Earman, J. and J. D. Norton (1999). Exorcist XIV: The wrath of Maxwell’s Demon. Part II. From Szilard to Landauer and beyond.
Studies in History and Philosophy of Modern Physics 30, 1–40.

Gibbs, J. W. (1902). Elementary Principles in Statistical Mechanics: Developed with Especial Reference to the Rational Foundation of Thermodynamics. New York: Charles Scribner’s Sons.

Ladyman, J. (2018). Intension in the physics of computation: Lessons from the debate about Landauer’s principle. In M. E. Cuffaro and S. C. Fletcher (Eds.), Physical Perspectives on Computation, Computational Perspectives in Physics, pp. 219–239. Cambridge: Cambridge University Press.

Ladyman, J., S. Presnell, and A. J. Short (2008). The use of the information-theoretic entropy in thermodynamics. Studies in History and Philosophy of Modern Physics 39, 315–324.

Ladyman, J., S. Presnell, A. J. Short, and B. Groisman (2007). The connection between logical and thermodynamic irreversibility. Studies in History and Philosophy of Modern Physics 38, 58–79.

Ladyman, J. and K. Robertson (2013). Landauer defended: Reply to Norton. Studies in History and Philosophy of Modern Physics 44, 263–271.

Ladyman, J. and K. Robertson (2014). Going round in circles: Landauer vs. Norton on the thermodynamics of computation. Entropy 16, 2278–2290.

Leff, H. S. and A. F. Rex (Eds.) (2003). Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing. Bristol and Philadelphia: Institute of Physics Publishing.

Maroney, O. (2007). The physical basis of the Gibbs-von Neumann entropy. arXiv:quant-ph/0701127v2.

Maroney, O. J. E. (2009). Generalizing Landauer’s principle. Physical Review E 79, 031105.

Maxwell, J. C. (1871). Theory of Heat. London: Longmans, Green, and Co.

Maxwell, J. C. (1878). Tait’s “Thermodynamics”, II. Nature 17, 278–280.

Norton, J. D. (2005). Eaters of the lotus: Landauer’s principle and the return of Maxwell’s demon. Studies in History and Philosophy of Modern Physics 36, 375–411.

Norton, J. D. (2011). Waiting for Landauer. Studies in History and Philosophy of Modern Physics 42, 184–198.

Norton, J. D. (2013a). Author’s reply to Landauer defended. Studies in History and Philosophy of Modern Physics 44, 272.

Norton, J. D. (2013b). The end of the thermodynamics of computation: A no-go result. Philosophy of Science 80, 1182–1192.

Norton, J. D. (2013c). All shook up: Fluctuations, Maxwell’s demon and the thermodynamics of computation. Entropy 15, 4432–4483.

Norton, J. D. (2016). The impossible process: Thermodynamic reversibility. Studies in History and Philosophy of Modern Physics 55, 43–61.

Norton, J. D. (2018). Maxwell’s demon does not compute. In M. E. Cuffaro and S. C. Fletcher (Eds.), Physical Perspectives on Computation, Computational Perspectives in Physics, pp. 240–256. Cambridge: Cambridge University Press.

Szilard, L. (1925). Über die Ausdehnung der phänomenologischen Thermodynamik auf die Schwankungserscheinungen. Zeitschrift für Physik 32, 753–788. English translation in Szilard (1972).

Szilard, L. (1972). On the extension of phenomenological thermodynamics to fluctuation phenomena. In B. T. Feld, G. W. Szilard, and K. R. Winsor (Eds.), The Collected Works of Leo Szilard: Scientific Papers, pp. 70–102. Cambridge, MA: The MIT Press.

Tolman, R. C. (1938). The Principles of Statistical Mechanics. Oxford: Clarendon Press.