Optimized finite-time information machine
Michael Bauer, Andre C. Barato and Udo Seifert
II. Institut für Theoretische Physik, Universität Stuttgart, 70550 Stuttgart, Germany

PACS numbers: 05.70.Ln, 05.10.Gg
Abstract.
We analyze a periodic optimal finite-time two-state information-driven machine that extracts work from a single heat bath by exploiting imperfect measurements. Two models are considered: a memory-less one that ignores past measurements, and an optimized model for which the feedback scheme consists of a protocol depending on the whole history of measurements. Depending on the precision of the measurement and on the period length, the optimized model displays a phase transition to a phase where measurements are judged as non-reliable. We obtain the critical line exactly and show that the optimized model leads to more work extraction in comparison to the memory-less model, with the gain parameter being larger in the region where the frequency of non-reliable measurements is higher. We also demonstrate that the model has two second law inequalities, with the extracted work being bounded by the change of the entropy of the system and by the mutual information.
1. Introduction
The thermodynamics of information processing is a very active area of research. Whereas central concepts in this field were developed a while ago [1–3], more recently the fluctuation relation obtained by Sagawa and Ueda [4] has shown that stochastic thermodynamics [5] provides a convenient framework to study the relation between information and thermodynamics. Moreover, ingenious experiments with small systems [6,7] verifying second law inequalities that involve information have played an important role in triggering the recent avalanche of papers. These works deal with the derivation of fluctuation relations and second law inequalities [4, 8–19] and the study of specific models [20–29].

In finite-time thermodynamics the issue of optimal protocols is of central importance. A recent result within stochastic thermodynamics has been the observation that the optimal protocol has discontinuities at the beginning and end of the finite-time process [30–36]. In information processing, optimal protocols have so far been analyzed for the maximal work extraction in a feedback-driven system described by a one-dimensional over-damped Langevin equation [37] and for the minimum dissipated heat in an erasure process [38, 39].

In this paper we study a paradigmatic discrete two-state model [28, 34, 40, 41], where the work extraction, performed by lifting and lowering one energy level, is driven by feedback.

Figure 1. Representation of the finite-time process. Initially, at τ = 0−, the entropy of the system is H(p₀) = −p₀ ln p₀ − (1 − p₀) ln(1 − p₀). At τ = 0+ the level with lower probability p₀ ≤ 1/2 is raised to energy E₀. For 0 ≤ τ ≤ t this energy level is lowered with protocol E_τ. At time τ = t this energy level is set from E_t back to 0, extracting work E_t if this level had been occupied at τ = t−.

Besides applying the optimal protocol leading to the maximal work during one period, this information machine will also be optimized in the sense that the protocol takes the whole history of measurements into account. We show that this optimized feedback strategy leads to more work extraction in comparison to a simple memory-less machine. Moreover, we observe a phase transition, where in one phase the machine always lifts the state indicated by the last measurement as empty, while in the other phase the state measured as occupied is lifted with a certain frequency. The extracted work is observed to be bounded by two quantities: the familiar mutual information between system and controller and the change of the entropy of the system. While the second bound is valid for every measurement trajectory, the first becomes valid only after an average over measurement trajectories is taken. Finally, we show that the memory-less model allows for a different physical interpretation of the system interacting with a tape, i.e., a sequence of bits. This memory-less model then corresponds to a generalization of the model recently introduced in [42] (see also [41, 43–45]).

The paper is organized as follows. In Sec. 2 we obtain the optimal protocol for a single period. The full feedback-driven models are defined in Sec. 3. The phase transition and gain parameter for the optimized model are studied in Sec. 4. In Sec. 5 the different second law inequalities valid for the model are analyzed. We conclude in Sec. 6.
2. Two-state finite-time process
The model analyzed in this paper is a two-state system where the time-dependent energy of the upper level, $E_\tau \ge 0$, is externally controlled during the time interval $0 \le \tau \le t$. The lower level always has energy zero. This system is connected to a heat bath at temperature $T$ and a work reservoir. We consider a finite-time process with duration $t$, where both energy levels are zero immediately before starting ($\tau = 0^-$) and immediately after finishing ($\tau = t^+$), see Fig. 1. These initial and final jumps of $E_\tau$ are a generic feature of optimal protocols [30].

Denoting the occupation probability of the upper level by $p_\tau$, the time derivative of the average internal energy reads

    \frac{d}{d\tau}(E_\tau p_\tau) = \dot{E}_\tau p_\tau + E_\tau \dot{p}_\tau,    (1)

where the dot represents a time derivative throughout the paper. This is the first law of thermodynamics, where $\dot{w}_\tau = -\dot{E}_\tau p_\tau$ is identified as the rate of extracted work and $\dot{q}_\tau = E_\tau \dot{p}_\tau$ as the rate of absorbed heat. This identification means that if a jump occurs heat is exchanged with the heat bath, and if the energy level changes work is exchanged with the work reservoir. The extracted work in the time interval $t$ then becomes

    W_t = -\int_0^t \dot{E}_\tau p_\tau\, d\tau + E_t p_t - E_0 p_0,    (2)

where the boundary terms come from the discontinuities in $E_\tau$ represented in Fig. 1. Since the variation of the internal energy is zero, the extracted work equals the heat absorbed from the heat bath, i.e.,

    W_t = Q_t = \int_0^t E_\tau \dot{p}_\tau\, d\tau.    (3)

Even though the system is connected to a single heat bath and the variation of the internal energy is zero, it is still possible to extract work due to the increase in the entropy of the system. More precisely, the second law for such an isothermal process establishes that the extracted work is bounded by the change of the entropy of the system, i.e.,

    W_t \le k_B T\,[H(p_t) - H(p_0)],    (4)

where $H(p) \equiv -p \ln p - (1-p)\ln(1-p)$.
In this paper we set $k_B T = 1$ and, in order to have work extraction, we restrict to the case $p_0 \le p_t \le 1/2$.

The optimal protocol $E_\tau$ that leads to the maximal work extraction for a given time interval $t$ and initial occupation probability $p_0$ is calculated in the remainder of this section. The master equation reads

    \dot{p}_\tau = -\omega_- p_\tau + \omega_+ [1 - p_\tau],    (5)

where $\omega_+$ ($\omega_-$) is the time-dependent transition rate to (from) the upper level. These rates must fulfill the detailed balance relation $\omega_-/\omega_+ = e^{E_\tau}$. For convenience, we choose

    \omega_+ = e^{-E_\tau} \quad\text{and}\quad \omega_- = 1.    (6)

Following the analysis for a symmetric choice of rates [34], the optimal protocol and the corresponding maximal extracted work are found by considering the Lagrangian

    L(p, \dot{p}) \equiv \dot{p}\,\ln\!\left(\frac{1-p}{p+\dot{p}}\right),    (7)

such that the extracted work (3) is written as

    W_t = \int_0^t L(p, \dot{p})\, d\tau.    (8)

Since $L(p, \dot{p})$ does not explicitly depend on $\tau$, we have the following constant of motion,

    K \equiv L - \dot{p}\,\frac{\partial L}{\partial \dot{p}} = \frac{\dot{p}^2}{p + \dot{p}} \ge 0.    (9)

Introducing the variable $r_\tau \equiv p_\tau/K \ge 0$, equation (9) becomes

    \dot{r}_\tau = \frac{1}{2} + \frac{1}{2}\sqrt{1 + 4 r_\tau} \ge 1.    (10)

The solution of this equation is

    \dot{r}_\tau = -\frac{1}{2}\,\mathrm{plog}_-\!\left(-2\dot{r}_0\, e^{-2\dot{r}_0}\, e^{-\tau}\right) \equiv f_\tau(\dot{r}_0),    (11)

where $\mathrm{plog}_-(x)$ is the lower branch of the product logarithm [46]. Using the relations $r_\tau = \dot{r}_\tau^2 - \dot{r}_\tau$ and $K = p_0/r_0 = p_0/[\dot{r}_0^2 - \dot{r}_0]$, together with (11), the extracted work (8) becomes a function of the single variable $\dot{r}_0$. The maximal work is then obtained for $\dot{r}_0 = a$, where $\frac{d}{d\dot{r}_0} W \big|_{\dot{r}_0 = a} = 0$. In this way, $a(p_0)$ is given by the solution of the transcendental equation

    f_t(a)^2\left(\exp[1/f_t(a)] + 1\right) - f_t(a) = \frac{a(a-1)}{p_0}.    (12)

For convenience the optimal protocol and corresponding maximal work are simply denoted by $E_\tau(p_0)$ and $W_t(p_0)$, respectively. From (8) the maximal work that can be extracted for fixed $t$ and $p_0$ is

    W_t(p_0) = -\ln\!\left(\frac{1 - \frac{p_0}{a^2-a}\left[f_t(a)^2 - f_t(a)\right]}{1-p_0}\right) + p_0\left[\ln\!\left(\frac{p_0\, a}{(1-p_0)(a-1)}\right) + a^{-1}\right],    (13)

and the corresponding optimal protocol reads

    E_\tau(p_0) = \ln\!\left(\frac{p_0 f_\tau(a) + a^2 - a - p_0 f_\tau(a)^2}{p_0 f_\tau(a)^2}\right),    (14)

where $a(p_0)$ is given by the solution of equation (12). In Fig. 2, we plot the maximal work, the power, and the discontinuities of the optimal protocol as functions of $p_0$ for given $t$. The optimal work is a decreasing function of $p_0$, with full knowledge of the initial state ($p_0 = 0$) leading to the maximal work extraction for fixed $t$. By increasing $t$, the work increases whereas the power $W_t(p_0)/t$ decreases, going to zero in the limit $t \to \infty$. The initial and final energy jumps decrease with $p_0$, being maximal for $p_0 = 0$. The initial jump $E_0(p_0)$ increases with $t$, while the final jump $E_t(p_0)$ decreases with $t$. More precisely, for $t \to 0$ we have $E_0(p_0) = E_t(p_0)$, and the difference between the jumps grows with $t$, with $E_0(p_0)$ reaching its maximal value and $E_t(p_0) \to 0$ for $t \to \infty$. Finally, it is useful for the following discussion to give $p_t$ for the optimal protocol explicitly as

    p_t = \frac{p_0}{a^2 - a}\left[f_t(a)^2 - f_t(a)\right].    (15)
Figure 2. Left: maximal work $W_t(p_0)$ for different values of $t$, with the power $W_t(p_0)/t$ plotted in the inset. Right: the initial energy jump $E_0(p_0)$, with the final energy jump $E_t(p_0)$ plotted in the inset.
3. Feedback driven machine
An information-driven machine periodically repeats the process explained in the previous section using feedback control. Measurements and feedback drive the work extraction by resetting the entropy of the system at the end of the time interval. We denote the state of the system just before starting period $i$ by $x_i$ and the measurement by $m_i$, where $x_i = -1$ ($x_i = +1$) means that the left (right) state is occupied, while $m_i = -1$ ($m_i = +1$) corresponds to measuring the left (right) state as being occupied. The conditional probability of the measurement is defined as

    P(m_i|x_i) \equiv \begin{cases} 1-\epsilon & \text{if } m_i = x_i, \\ \epsilon & \text{if } m_i = -x_i, \end{cases}    (16)

where $\epsilon \le 1/2$ is the measurement error. The machine never knows the real state of the system $x_i$ and has access only to the history of measurements $m^i = \{m_1, m_2, \ldots, m_i\}$. Hence, in all calculations that follow the state of the system is always averaged out.

First we consider a feedback procedure with a protocol taking the whole measurement trajectory $m^i$ into account. We are interested in the probability of being at state $x_i$ given the history of measurements $m^i$, which is denoted $P(x_i|m^i)$. For this feedback scheme the initial occupation probability of the level that will be raised at the beginning of period $i$ is

    p_0^{(i)} = \min\{P(x_i = m_i|m^i),\, P(x_i = -m_i|m^i)\} \le 1/2.    (17)

Hence, besides the usual choice of lifting the level $-m_i$ independent of the measurement history, it is also possible to make the unusual choice of lifting the level $m_i$. In this second case, the level indicated by the last measurement as occupied is lifted: the measurement is judged to be not reliable. Moreover, the machine applies the protocol $E_\tau(p_0^{(i)})$, which takes into account the whole history of measurements by using the history-dependent initial probability $p_0^{(i)}$.

In Appendix A we show that the initial probability $p_0^{(i)}$ fulfills a nonlinear recursion relation. Denoting by $p_t^{(i-1)}$ the probability at the end of the period $i-1$ that started with $p_0^{(i-1)}$, we define the functions

    F_+(p_0^{(i-1)}) \equiv \frac{\epsilon\, p_t^{(i-1)}}{1 - q_t^{(i-1)}}    (18)

and

    F_-(p_0^{(i-1)}) \equiv \frac{\epsilon\,(1 - p_t^{(i-1)})}{q_t^{(i-1)}},    (19)

where

    q_t^{(i-1)} \equiv (1-\epsilon)\, p_t^{(i-1)} + \epsilon\, (1 - p_t^{(i-1)}).    (20)

The recursion relation for $p_0^{(i)}$ then reads

    p_0^{(i)} = \begin{cases} F_+(p_0^{(i-1)}) & \text{if } \tilde{z}_i = 1, \\ \min\{F_-(p_0^{(i-1)}),\, 1 - F_-(p_0^{(i-1)})\} & \text{if } \tilde{z}_i = -1. \end{cases}    (21)

As explained in Appendix A, the variable $\tilde{z}_i$ has the purpose of identifying whether the measurement outcome $m_i$ corresponds to the upper or the lower level of the interval $i-1$, with $\tilde{z}_i = -1$ if the upper level is $m_i$ and $\tilde{z}_i = 1$ if the lower level is $m_i$. We call this machine, which takes the history of measurements $m^i$ into account, the optimized machine because, as we will see in Sec. 4, it leads to more work extraction than a simple memory-less machine, which we define next.

A memory-less feedback scheme that only takes the last measurement into account is to simply apply a protocol for which the level raised for the next period is just the state measured as empty. Hence, for a measurement outcome $m_i$, the level $-m_i$ is lifted at the beginning of period $i$. As we show in Appendix B, where the memory-less machine is more explicitly defined, the average initial occupation probability of the upper level is $\epsilon$, independent of the protocol. Therefore, the appropriate choice for a protocol that must be independent of the whole measurement history and corresponds to the memory-less version of the optimized machine is $E_\tau(\epsilon)$, which is obtained from (14) with $p_0 = \epsilon$.
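The update (18)–(21) is Bayes' rule applied to the binary measurement channel (16). A minimal sketch in Python (the map from p₀ to p_t is left abstract here, so p_t enters as a plain number; the function names are ours, not from the paper) that also verifies the conditional average (A.11), ⟨p₀⁽ⁱ⁾⟩ᵢ = min{p_t, ǫ}:

```python
def F_plus(pt, eps):
    """Posterior occupation of the lifted level when the measurement
    points at the previously lower level (tilde z_i = 1), Eq. (18)."""
    q = (1 - eps)*pt + eps*(1 - pt)          # Eq. (20)
    return eps*pt/(1 - q)

def F_minus(pt, eps):
    """Posterior when the measurement points at the previously upper
    level (tilde z_i = -1), Eq. (19)."""
    q = (1 - eps)*pt + eps*(1 - pt)
    return eps*(1 - pt)/q

def next_p0(pt, eps, z_tilde):
    """One step of the recursion (21); returns (p0, sigma), where
    sigma = -1 means the state measured as occupied is lifted."""
    if z_tilde == 1:
        return F_plus(pt, eps), 1
    Fm = F_minus(pt, eps)
    return (1 - Fm, -1) if Fm > 0.5 else (Fm, 1)

def avg_p0(pt, eps):
    """Average of p0 over the two measurement outcomes, Eq. (A.11);
    P(tilde z_i = -1) = q_t and P(tilde z_i = 1) = 1 - q_t."""
    q = (1 - eps)*pt + eps*(1 - pt)
    return (1 - q)*next_p0(pt, eps, 1)[0] + q*next_p0(pt, eps, -1)[0]

# <p0>_i = min{p_t, eps}: reliable regime (eps < p_t) ...
assert abs(avg_p0(0.4, 0.2) - 0.2) < 1e-12
# ... and non-reliable regime (eps > p_t), where flips occur
assert abs(avg_p0(0.15, 0.3) - 0.15) < 1e-12
```

The second assertion is the regime where min{F₋, 1 − F₋} selects 1 − F₋, i.e., where the state measured as occupied is lifted.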
4. Gain and phase transition
The work extracted during period $i$ with the optimized machine is denoted by $W_t^{(i)} = W_t(p_0^{(i)})$. For a given measurement realization $m^N$ we define

    W_t \equiv \frac{1}{N}\sum_{i=1}^{N} W_t^{(i)}.    (22)

The average work $\langle W_t\rangle$ is obtained by considering the limit $N \to \infty$ and averaging over all measurement trajectories, where the brackets denote this average over measurement trajectories. Numerical simulations for large enough $N$ indicate that $W_t$ (and other observables we calculate below) is independent of the numerically generated measurement history, i.e., self-averaging. Therefore, we calculate the average work by generating a single long measurement history.

For the memory-less machine the average work is just $W_t(\epsilon)$, as demonstrated in Appendix B. The improvement of the optimized in relation to the memory-less machine is quantified by the gain parameter

    \alpha \equiv 1 - \frac{W_t(\epsilon)}{\langle W_t\rangle}.    (23)

Naively one expects the optimized machine that takes the history of measurements into account to extract more work than the simple machine. This expectation is confirmed by numerical simulations, from which we observe that $\alpha \ge 0$. For $\alpha = 0$ the work extraction in the memory-less model would be the same as in the optimized model, while $\alpha \to 1$ corresponds to the optimized machine extracting much more work. In Fig. 3 we show the numerically calculated gain $\alpha$ in the $(t, \epsilon)$-plane. The gain approaches 1 for small $t$ and $\epsilon$ close to $1/2$, where non-reliable measurements are more likely to occur.

It turns out that the optimized model displays a phase transition. The order parameter for this transition, $\phi$, is the frequency at which the state $m_i$ is lifted, i.e.,

    \phi \equiv \left\langle \frac{1}{N}\sum_{i=1}^{N} \frac{1 - \sigma_i}{2} \right\rangle,    (24)

where $\sigma_i = -1$ if the level $m_i$ is lifted and $\sigma_i = 1$ if the measurement is judged reliable (see Appendix A for a precise definition). The numerical calculation of this order parameter is also shown in Fig. 3. We can clearly see a phase transition with $\phi = 0$ below a critical threshold $\epsilon_c(t)$. Numerics indicates a second-order phase transition.

The optimized machine has two advantages in relation to the memory-less machine: it lifts the level $m_i$ if the last measurement is not reliable, and it uses a history-dependent protocol $E_\tau(p_0^{(i)})$. By comparing $\alpha$ with $\phi$ in Fig. 3, we see that the gain is larger in the phase $\phi > 0$. In the phase $\phi = 0$ the average initial occupation probability is $\epsilon$. Hence, $\alpha > 0$ there follows from the fact that the function $W_t(p_0)$ plotted in Fig. 2 is convex, implying $\langle W_t^{(i)}\rangle_i \ge W_t(\epsilon)$, where the average $\langle\,\cdot\,\rangle_i$ is defined in (A.11).
Figure 3. The gain parameter $\alpha$ (left) and the order parameter $\phi$ (right) as functions of the time interval $t$ and the measurement error $\epsilon$. The results are obtained by numerically generating a single long measurement trajectory. The full black critical line is obtained analytically from (C.1) and the dotted line in the right panel from (C.2).

As we show in Appendix C, the critical line $\epsilon_c(t)$ can be obtained analytically from the transcendental equation (C.1). It is in perfect agreement with the numerical results, as shown in Fig. 3.
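The transition can be illustrated by simulating the measurement trajectory with the three-step algorithm of Appendix A. Since the optimal-protocol map (15) requires the full solver of Sec. 2, the sketch below substitutes a free-relaxation stand-in p_t = 1/2 + (p₀ − 1/2)e^{−2t} (the rates (6) at E = 0); the resulting numbers, and the location of any transition, are therefore illustrative only.

```python
import math
import random

def phi_estimate(t, eps, n=20000, seed=1):
    """Empirical frequency of non-reliable steps (sigma_i = -1),
    following the three-step algorithm of Appendix A.  Stand-in map:
    free relaxation p_t = 1/2 + (p_0 - 1/2) exp(-2 t), i.e. the rates
    (6) at E = 0, instead of the optimal-protocol relation (15)."""
    rng = random.Random(seed)
    p0, flips = eps, 0                       # p_0^(1) = eps, sigma_1 = 1
    for _ in range(n):
        pt = 0.5 + (p0 - 0.5)*math.exp(-2*t)
        q = (1 - eps)*pt + eps*(1 - pt)      # P(measurement hits upper level)
        if rng.random() < q:                 # tilde z_i = -1
            Fm = eps*(1 - pt)/q              # Eq. (19)
            if Fm > 0.5:                     # measurement judged non-reliable
                flips += 1
                p0 = 1 - Fm
            else:
                p0 = Fm
        else:                                # tilde z_i = 1
            p0 = eps*pt/(1 - q)              # Eq. (18)
    return flips/n

# long periods: p_t stays near 1/2 > eps, so flips never occur (phi = 0);
# short periods with eps near 1/2: non-reliable measurements appear
assert phi_estimate(t=2.0, eps=0.1) == 0.0
assert phi_estimate(t=0.05, eps=0.45) > 0.0
```

The first assertion is deterministic: a flip requires ǫ > p_t, and for t = 2 the stand-in map keeps p_t near 1/2 for every reachable p₀.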
5. Second law inequalities
The second law for feedback-driven systems [10] states that the average extracted work is bounded by the average mutual information between system and controller due to measurements. The mutual information between the system and the controller due to the measurement $m_i$ is defined as

    I_t^{(i)} \equiv \sum_{m_i, x_i} P(m_i, x_i|m^{i-1}) \ln\frac{P(m_i, x_i|m^{i-1})}{P(m_i|m^{i-1})\, P(x_i|m^{i-1})} = H(q_t^{(i-1)}) - H(\epsilon).    (25)

We denote the average mutual information by $\langle I_t\rangle$, so that the efficiency of the optimized machine reads

    \eta \equiv \frac{\langle W_t\rangle}{\langle I_t\rangle}.    (26)

In Fig. 4 we show the numerically calculated efficiency $\eta$ and power $\langle W_t\rangle/t$ for the optimized model in the $(t, \epsilon)$-plane. Increasing the time period $t$ increases the efficiency but decreases the power of the machine. For fixed $t$, the efficiency increases with increasing measurement error $\epsilon$. Hence, maximum power is obtained for small $\epsilon$ and small $t$, which is, however, a rather inefficient regime with small $\eta$.

Figure 4. The efficiency $\eta$ and the average power $\langle W_t\rangle/t$ as functions of the time interval $t$ and the measurement error $\epsilon$. The results are obtained by numerically generating a single long measurement trajectory.

Another bound on the extracted work is provided by the Shannon entropy change, as expressed in (4), which for the optimized model can be written as

    W_t^{(i)} \le \Delta H_t^{(i)},    (27)

where the Shannon entropy change in interval $i$ is

    \Delta H_t^{(i)} \equiv H(p_t^{(i)}) - H(p_0^{(i)}).    (28)

The inequality (27) is then valid for a fixed measurement trajectory, whereas the standard second law for feedback-driven systems, $\langle W_t\rangle \le \langle I_t\rangle$, is valid only after an average over measurement trajectories is taken. By numerical inspection we observe that $I_t^{(i)}$ (or $I_t^{(i+1)}$) can be smaller than $W_t^{(i)}$. Furthermore, by taking the average for large $N$ we find $\langle I_t\rangle = \langle\Delta H_t\rangle$ within numerical errors. This equality can be demonstrated with the following heuristic argument. Rearranging the terms in the mutual information we get

    I_t^{(i+1)} = H(q_t^{(i)}) - H(\epsilon) = H(p_t^{(i)}) - q_t^{(i)}\, H\!\left(\frac{\epsilon\,(1 - p_t^{(i)})}{q_t^{(i)}}\right) - (1 - q_t^{(i)})\, H\!\left(\frac{\epsilon\, p_t^{(i)}}{1 - q_t^{(i)}}\right) = H(p_t^{(i)}) - \langle H(p_0^{(i+1)})\rangle_{i+1},    (29)

where the average $\langle\,\cdot\,\rangle_{i+1}$ is defined in (A.11). From this equation it is clear that the average mutual information and the average Shannon entropy change differ only by boundary terms, which for large $N$ should be irrelevant, implying $\langle I_t\rangle = \langle\Delta H_t\rangle$.
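The rearrangement (29) is the standard decomposition I = H(X) − H(X|M) of the mutual information, with the posteriors given exactly by F₋ and F₊ of (18) and (19). The identity can be checked numerically; a small sketch (function names are ours):

```python
import math

def H(p):
    """Binary Shannon entropy (natural log)."""
    return -p*math.log(p) - (1 - p)*math.log(1 - p)

def I_forward(pt, eps):
    """Mutual information as in (25): H(q_t) - H(eps)."""
    q = (1 - eps)*pt + eps*(1 - pt)
    return H(q) - H(eps)

def I_posterior(pt, eps):
    """Right-hand side of (29): H(p_t) minus the average posterior
    entropy <H(p_0^{(i+1)})>_{i+1}, with posteriors F_- and F_+."""
    q = (1 - eps)*pt + eps*(1 - pt)
    return H(pt) - q*H(eps*(1 - pt)/q) - (1 - q)*H(eps*pt/(1 - q))

for pt in (0.05, 0.2, 0.4):
    for eps in (0.1, 0.3, 0.45):
        assert abs(I_forward(pt, eps) - I_posterior(pt, eps)) < 1e-12
```

Note that H(F₋) = H(1 − F₋), so judging a measurement non-reliable (taking 1 − F₋ instead of F₋) leaves the posterior entropy, and hence (29), unchanged.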
As in Appendix B we now consider a memory-less machine using an arbitrary protocol $\tilde{E}_\tau$. Equation (29) is also valid for the memory-less case and, therefore, the average Shannon entropy change should be equal to the average mutual information for large $N$. This result was confirmed numerically for the protocol $\tilde{E}_\tau = E_\tau(\epsilon)$ and for $\tilde{E}_\tau = E_0$, which corresponds to the energy level held fixed during the whole time interval. Similar to (25), the mutual information depending on $m^{i-1}$ reads

    \tilde{I}_t^{(i)} = H(\tilde{q}_t^{(i-1)}) - H(\epsilon),    (30)

where $\tilde{q}_t^{(i-1)}$ is defined in (B.5). Denoting by $\tilde{p}_t$ the solution of the master equation (5) with protocol $\tilde{E}_\tau$ and initial probability $\epsilon$, we define

    \tilde{I}_t \equiv H(\tilde{q}_t) - H(\epsilon),    (31)

where $\tilde{q}_t = (1-\epsilon)\tilde{p}_t + \epsilon(1 - \tilde{p}_t)$. Since $\tilde{q}_t^{(i)}$ (and $\tilde{p}_t^{(i)}$) is linear in $\tilde{p}_0^{(i)}$, it follows that $\langle\tilde{q}_t^{(i)}\rangle_i = \tilde{q}_t$, where the average $\langle\,\cdot\,\rangle_i$ is defined in (B.6). From the fact that the Shannon entropy is concave we obtain that $\tilde{I}_t$ provides an upper bound on the average mutual information

    \langle\tilde{I}_t\rangle \equiv \left\langle \frac{1}{N}\sum_{i=1}^{N} \tilde{I}_t^{(i)} \right\rangle.    (32)

As the average extracted work is equal to the work extracted in the first period (see Appendix B), if before the first measurement both states are equally probable, the average extracted work is also bounded by the Shannon entropy change in the first period,

    \Delta\tilde{H}_t \equiv H(\tilde{p}_t) - H(\epsilon).    (33)

Comparing with the other bounds we have $\Delta\tilde{H}_t \le \tilde{I}_t$ and, numerically, for the protocols $\tilde{E}_\tau = E_\tau(\epsilon)$ and $\tilde{E}_\tau = E_0$, we observe $\Delta\tilde{H}_t \le \langle\tilde{I}_t\rangle$. We conjecture that this entropy change provides the strongest bound on the extracted work.

The inequality

    \tilde{W}_t(\epsilon) \le \Delta\tilde{H}_t    (34)

for the constant protocol $\tilde{E}_\tau = E_0$ has recently been studied in [41]. In this reference it has been shown that the two-state model can also be interpreted as a tape, i.e., a sequence of bits, interacting with a thermodynamic system. In this interpretation the entropy change is dumped to a tape or information reservoir. The inequality (34) means that the extracted work is bounded by the Shannon entropy change of the tape, which is initially $H(\epsilon)$ and becomes $H(\tilde{p}_t)$ after the system has written information to it. This model for a tape interacting with a thermodynamic system was introduced by Mandal and Jarzynski [42], for a model with six instead of two states and a protocol that is also held fixed during the whole time interval. By showing that inequality (34) is valid also for arbitrary protocols $\tilde{E}_\tau$, we thus obtain that their model can be generalized to arbitrary time-dependent protocols.
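For the constant protocol Ẽ_τ = E₀ the master equation (5) with rates (6) has a closed-form relaxation solution, so the tape bound (34) can be checked directly. A sketch in k_B T = 1 units (the parameter grid is an arbitrary choice of ours):

```python
import math

def H(p):
    """Binary Shannon entropy (natural log)."""
    return -p*math.log(p) - (1 - p)*math.log(1 - p)

def work_and_bound(E, t, eps):
    """Constant energy level E for 0 <= tau <= t, initial occupation eps.
    With rates (6), pdot = -p + exp(-E)(1 - p), which relaxes toward
    p_inf = exp(-E)/(1 + exp(-E)) at rate 1 + exp(-E)."""
    p_inf = math.exp(-E)/(1 + math.exp(-E))
    pt = p_inf + (eps - p_inf)*math.exp(-(1 + math.exp(-E))*t)
    W = E*(pt - eps)          # W = int E pdot dtau, Eq. (3)
    dH = H(pt) - H(eps)       # entropy change of the 'tape', Eq. (33)
    return W, dH

for E in (0.5, 1.0, 2.0):
    for t in (0.1, 1.0, 10.0):
        W, dH = work_and_bound(E, t, eps=0.1)
        assert W <= dH        # inequality (34)
```

Equality in (34) is approached only in the quasi-static limit; for finite t the check above is strict.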
6. Conclusion
We have studied a two-state finite-time optimized information-driven machine. Besides utilizing the optimal protocol, this machine is also optimized in the sense that the feedback scheme takes into account the whole history of measurements. We have shown that the optimized machine leads to more work extraction in comparison to a simple memory-less machine that does not take the full measurement trajectory into account.

This optimized model displays a phase transition, with the frequency at which non-reliable measurements occur being the order parameter. In the region of the phase diagram where non-reliable measurements occur with a higher frequency, the gain parameter, characterizing the improvement of the optimized in relation to the memory-less machine, was found to be high. Hence the possibility of lifting the state last measured as occupied, if the measurement is non-reliable, is the key feature that makes the optimized model perform better. Moreover, by analyzing the recursion relation for the initial occupation probability of the upper level we have obtained the critical line exactly.

We have shown that the work extraction is bounded both by the Shannon entropy change and by the mutual information. While the first bound is valid for every measurement trajectory, the second is valid only after averaging over the measurements; in this case, both bounds become the same. Moreover, for the memory-less model we have demonstrated that the average extracted work is bounded by the Shannon entropy change of the first period. This inequality allows for an interpretation of the model as a thermodynamic system interacting with a tape, thus generalizing the model introduced in [42].
Appendix A. Iterative relation for the optimized model
In this appendix, we obtain the nonlinear recursion relation (21) for the initial occupation probability of the upper level of the optimized machine. From the relations

    P(m_i|m^{i-1}) = \sum_{x_i} P(x_i, m_i|m^{i-1})    (A.1)

and

    P(x_i, m_i|m^{i-1}) = P(m_i|x_i, m^{i-1})\, P(x_i|m^{i-1}) = P(m_i|x_i)\, P(x_i|m^{i-1}),    (A.2)

we obtain

    P(m_i|m^{i-1}) = (1-\epsilon)\, P(x_i = m_i|m^{i-1}) + \epsilon\, P(x_i = -m_i|m^{i-1}) = (1-\epsilon)\, P(x_i = m_i|m^{i-1}) + \epsilon\,[1 - P(x_i = m_i|m^{i-1})],    (A.3)

where we used the definition of the measurement error (16). Using the relation

    P(x_i|m^i) = P(x_i, m_i|m^{i-1})/P(m_i|m^{i-1}),    (A.4)

$P(x_i|m^i)$ can then be written as

    P(x_i|m^i) = \begin{cases} \dfrac{(1-\epsilon)\, P(x_i = m_i|m^{i-1})}{(1-\epsilon)\, P(x_i = m_i|m^{i-1}) + \epsilon\,[1 - P(x_i = m_i|m^{i-1})]} & \text{if } x_i = m_i, \\[3mm] \dfrac{\epsilon\,[1 - P(x_i = m_i|m^{i-1})]}{(1-\epsilon)\, P(x_i = m_i|m^{i-1}) + \epsilon\,[1 - P(x_i = m_i|m^{i-1})]} & \text{if } x_i = -m_i. \end{cases}    (A.5)

Depending on the past measurements, the probability $P(x_i = m_i|m^{i-1})$ on the right side of this equation is either $p_t^{(i-1)}$ or $1 - p_t^{(i-1)}$, where $p_t^{(i-1)}$ is obtained from $p_0^{(i-1)}$ and equation (15). From equation (17), the state indicated by the measurement as occupied, $m_i$, is lifted provided $P(x_i = m_i|m^i) < P(x_i = -m_i|m^i)$, which from (A.5) is equivalent to $P(x_i = m_i|m^{i-1}) < \epsilon$. Since $1 - p_t^{(i-1)} \ge 1/2 \ge \epsilon$, a necessary condition for lifting the state $m_i$ at the beginning of period $i$ is that $p_t^{(i-1)} < \epsilon$.

It is convenient to define the variables $z_i \equiv m_i m_{i-1}$, for $i > 1$, and $\sigma_i$, which takes the value $1$ ($-1$) if the level $-m_i$ ($m_i$) is lifted at the beginning of period $i$, i.e.,

    \sigma_i \equiv \begin{cases} 1 & \text{if } P(x_i = m_i|m^i) > P(x_i = -m_i|m^i), \\ -1 & \text{if } P(x_i = -m_i|m^i) > P(x_i = m_i|m^i). \end{cases}    (A.6)

Furthermore, we define $\tilde{z}_i \equiv z_i \sigma_{i-1}$. This last variable identifies whether for given $m_i$ the probability $P(x_i = m_i|m^{i-1})$ is $p_t^{(i-1)}$ or $1 - p_t^{(i-1)}$:

    P(x_i = m_i|m^{i-1}) = \begin{cases} 1 - p_t^{(i-1)} & \text{if } \tilde{z}_i = 1, \\ p_t^{(i-1)} & \text{if } \tilde{z}_i = -1. \end{cases}    (A.7)

In words, this equation means that if $\tilde{z}_i = 1$ ($\tilde{z}_i = -1$) then $m_i$ corresponds to the lower (upper) level of period $i-1$. The recursion starts with $p_0^{(1)} = \epsilon$ and $\sigma_1 = 1$. Numerical simulations of the measurement trajectory are then performed with the following algorithm:

1) for period $i$, randomly choose a measurement according to the probability $P(m_i|m^{i-1})$ given by equations (A.3) and (A.7);
2) with $p_t^{(i-1)}$, the variable $\tilde{z}_i$, and equations (17), (A.5) and (A.7), calculate $\sigma_i$ and $p_0^{(i)}$;
3) from relation (15) and $p_0^{(i)}$ calculate $p_t^{(i)}$. Go back to the first step with the substitution $i \to i+1$.

This algorithm can be translated into a recursion relation for the initial probability. Using (A.5) and (A.7), relation (17) becomes

    p_0^{(i)} = \begin{cases} F_+(p_0^{(i-1)}) & \text{if } \tilde{z}_i = 1, \\ \min\{F_-(p_0^{(i-1)}),\, 1 - F_-(p_0^{(i-1)})\} & \text{if } \tilde{z}_i = -1, \end{cases}    (A.8)

where

    F_+(p_0^{(i-1)}) \equiv \frac{\epsilon\, p_t^{(i-1)}}{1 - q_t^{(i-1)}}    (A.9)

and

    F_-(p_0^{(i-1)}) \equiv \frac{\epsilon\,(1 - p_t^{(i-1)})}{q_t^{(i-1)}},    (A.10)

with $q_t^{(i-1)} \equiv (1-\epsilon)\, p_t^{(i-1)} + \epsilon\,(1 - p_t^{(i-1)})$. Note that $1 - F_-(p_0^{(i-1)})$ is the minimum in (A.8) when $F_-(p_0^{(i-1)}) > 1/2$, which is equivalent to $\epsilon > p_t^{(i-1)}$. Only in this case can the state measured as occupied, $m_i$, be lifted at the beginning of period $i$.

Moreover, from (A.3), (A.7), and (A.8), the average initial occupation probability conditioned on $m^{i-1}$ is

    \langle p_0^{(i)}\rangle_i \equiv \sum_{m_i} p_0^{(i)}\, P(m_i|m^{i-1}) = \min\{p_t^{(i-1)}, \epsilon\}.    (A.11)

If the machine never lifts the level $m_i$, i.e., if $F_-(p_0^{(i-1)}) < 1/2$ for all $i$, then $\langle p_0^{(i)}\rangle_i = \epsilon$.

Appendix B. Extracted work for the memory-less machine
For the memory-less machine we denote the initial occupation probability at period $i$ by $\tilde{p}_0^{(i)}$. The final occupation probability at period $i$ is $\tilde{p}_t^{(i)}$: as the memory-less machine does not use the optimal protocol, $\tilde{p}_t^{(i)}$ is not obtained from (15) but rather is the solution of the master equation (5) for a given protocol $\tilde{E}_\tau$ and initial condition $\tilde{p}_0^{(i)}$.

Another difference in relation to the optimized model considered in Appendix A is that the variable $\sigma_i$ is not necessary for the memory-less machine, since here $\sigma_i = 1$ for all $i$. Hence, for the memory-less machine, equation (A.7) becomes

    P(x_i = m_i|m^{i-1}) = \begin{cases} 1 - \tilde{p}_t^{(i-1)} & \text{if } z_i = 1, \\ \tilde{p}_t^{(i-1)} & \text{if } z_i = -1. \end{cases}    (B.1)

The iterative relation (A.8) then simplifies to

    \tilde{p}_0^{(i)} = \begin{cases} \tilde{F}_+(\tilde{p}_0^{(i-1)}) & \text{if } z_i = 1, \\ \tilde{F}_-(\tilde{p}_0^{(i-1)}) & \text{if } z_i = -1, \end{cases}    (B.2)

where

    \tilde{F}_+(\tilde{p}_0^{(i-1)}) \equiv \frac{\epsilon\, \tilde{p}_t^{(i-1)}}{1 - \tilde{q}_t^{(i-1)}}    (B.3)

and

    \tilde{F}_-(\tilde{p}_0^{(i-1)}) \equiv \frac{\epsilon\,(1 - \tilde{p}_t^{(i-1)})}{\tilde{q}_t^{(i-1)}},    (B.4)

with

    \tilde{q}_t^{(i-1)} \equiv (1-\epsilon)\, \tilde{p}_t^{(i-1)} + \epsilon\,(1 - \tilde{p}_t^{(i-1)}).    (B.5)

Similar to (A.11), the average initial probability for fixed measurement history is

    \langle \tilde{p}_0^{(i)}\rangle_i \equiv \sum_{m_i} \tilde{p}_0^{(i)}\, P(m_i|m^{i-1}) = \epsilon.    (B.6)
Figure C1. Cobweb diagram of all possible first five iterations of the relation (21), starting with $p_0^{(1)} = \epsilon$. In the left panel $t = 2$ and $\epsilon = 0.2$, in the central panel $t = 1$ and $\epsilon = 0.4$, and in the right panel a smaller $t$ with $\epsilon = 0.4$. For the first two cases $\phi = 0$, whereas in the third case $\phi > 0$.

We denote by $\tilde{W}_t(\tilde{p}_0^{(i)})$ the work that is obtained from (3) with protocol $\tilde{E}_\tau$ and initial probability $\tilde{p}_0^{(i)}$. The average work is then given by

    \langle \tilde{W}_t(\tilde{p}_0^{(i)})\rangle_i = \tilde{W}_t(\langle \tilde{p}_0^{(i)}\rangle_i) = \tilde{W}_t(\epsilon),    (B.7)

where in the first equality we used the fact that $\tilde{W}_t(\tilde{p}_0^{(i)})$ is linear in $\tilde{p}_0^{(i)}$. Since $\tilde{W}_t(\epsilon)$ is independent of the history $m^{i-1}$, it follows that the average work is simply $\tilde{W}_t(\epsilon)$. For the memory-less machine we compare with the optimized one using $\tilde{E}_\tau = E_\tau(\epsilon)$, which is given by (14); the average work is then $W_t(\epsilon)$, which is given by (13). Moreover, assuming that before the first measurement both states are equally probable, which leads to $\tilde{p}_0^{(1)} = \epsilon$, the average work equals the work extracted in the first period.

Appendix C. Critical line
We now obtain the critical line exactly by analyzing the iterative relation (21) for $p_0^{(i)}$. As discussed in Appendix A, the level $m_i$ will be lifted at the beginning of interval $i$ only if $F_-(p_0^{(i-1)}) > 1/2$. Hence, in the phase $\phi = 0$ the condition $F_-(p_0^{(i-1)}) < 1/2$ must hold for all $i$. The relevant fixed points of the recursion are $p^\dagger$ and $p^*$, defined by $F_+(p^\dagger) = p^\dagger$ and $F_-(p^*) = p^*$. The possible trajectories in the cobweb diagram for the first five iterations of relation (21) are shown in Fig. C1. It is clear that $p_0^{(i)}$ does not go below the fixed point $p^\dagger$. Therefore, the critical line $\epsilon_c(t)$ can be obtained analytically by setting

    F_-(p^\dagger) = 1/2,    (C.1)

which leads to a cumbersome transcendental equation. Solving this equation we obtain the full black line in Fig. 3, in perfect agreement with the numerical simulations. Moreover, the phase $\epsilon < \epsilon_c$ can be further separated into two distinct regions, where the line separating them (dotted line in Fig. 3) is obtained from

    F_-(0) = 1/2.    (C.2)

In the region closer to the critical line, where $F_-(0) > 1/2$, the level $m_i$ could be lifted for sufficiently small $p_0^{(i-1)}$; however, the fixed point $p^\dagger$ is not small enough, i.e., $F_-(p^\dagger) < 1/2$, and therefore $\phi = 0$ also holds in this region.

References

[1] Szilard L, 1929
Z. Phys. 53 840–856
[2] Landauer R, 1961 IBM J. Res. Dev. 5 183
[18] J. Stat. Mech. P09011
[19] Hartich D, Barato A C and Seifert U, 2014 J. Stat. Mech. P02016
[20] Cao F J, Dinis L and Parrondo J M R, 2004 Phys. Rev. Lett.
[42] Mandal D and Jarzynski C, 2012 Proc. Natl. Acad. Sci. U.S.A. 109 11641
[46] Corless R, Gonnet G, Hare D, Jeffrey D and Knuth D, 1996 Adv. Comput. Math. 5 329–359