Principles of Low Dissipation Computing from a Stochastic Circuit Model
Chloe Ya Gao and David T. Limmer
Department of Chemistry, University of California, Berkeley, California
Kavli Energy NanoScience Institute, Berkeley, California
Materials Science Division, Lawrence Berkeley National Laboratory, Berkeley, California
Chemical Science Division, Lawrence Berkeley National Laboratory, Berkeley, California
a) Electronic mail: [email protected]
(Dated: 26 February 2021)
We introduce a thermodynamically consistent, minimal stochastic model for complementary logic gates built with field-effect transistors. We characterize the performance of such gates with tools from information theory and study the interplay between accuracy, speed, and dissipation of computations. With a few universal building blocks, such as the NOT and NAND gates, we are able to model arbitrary combinational and sequential logic circuits, which are modularized to implement computing tasks. We find generically that high accuracy can be achieved provided sufficient energy consumption and time to perform the computation. However, for low energy computing, accuracy and speed are coupled in a way that depends on the device architecture and task. Our work bridges the gap between the engineering of low-dissipation digital devices and theoretical developments in stochastic thermodynamics, and provides a platform to study design principles for low dissipation digital devices.
I. INTRODUCTION
The last decade has seen an exponential growth in energy consumption associated with information, communications, and computing technologies. Such resource demands are not sustainable, and thus there is a need to design devices with reduced energetic costs. While the problem of computing efficiency dates back to Landauer, with modern developments in stochastic thermodynamics this problem is actively being revisited.
The main goal of this paper is to bridge the gap between developments in nonequilibrium statistical physics and circuit engineering by proposing a model for stochastic logic circuits that is thermodynamically consistent, and thus amenable to physical analysis and constraints, but simple enough to be extendable to complex computing tasks. By treating thermal fluctuations in electron transport explicitly at a mesoscopic scale, our model reproduces the behavior of a robust circuit in the low noise limit, but describes errors accurately away from this limit. With this model we explore the consequences of carrying out computations at low thermodynamic cost and in finite time, and provide design principles for low dissipation computing devices.

State-of-the-art semiconductor devices are typically built from metal-oxide-semiconductor field-effect transistors on the scale of a few nanometers, enabling billions of transistors to be packed on a single chip. In order to mitigate heating and large energy consumption burdens, it would be advantageous to operate such small devices with small bias voltages. However, as biases approach thermal scales, fluctuations increase, which necessitates a careful treatment of thermal noise.
The conventional treatment of thermal noise is largely phenomenological and involves either a correction to the power spectral density, or transformation of the internal noise into external independent sources. Such models are typically valid only near equilibrium, where the fluctuation-dissipation theorem can be invoked to constrain their functional form, whereas higher order correlations are needed in general to determine the full response. While these models can provide insight into how thermal noise may put a physical limit on the density of transistors, their validity in non-linear electrical networks operating far from equilibrium is uncertain.

Stochastic thermodynamics provides a theoretical way to move beyond an equilibrium description of thermal noise and its impact on information processing. While information theory provides limits on the accuracy of typical communication, stochastic thermodynamics provides generalized fluctuation-dissipation relationships, and places limits on the work required to implement a physical process in finite time and the spectrum of its fluctuations.
The link between information theory and stochastic thermodynamics has generated a wealth of expressions relating precision, speed, and dissipation, including the thermodynamic uncertainty relationships, speed limits, and fluctuation theorems. For example, dissipation bounds the rate at which a system transforms between different states.
Dissipation also provides an upper bound for the precision of a current.
A universal tradeoff between power, precision, and speed has been proposed for communication systems as well. These theoretical results have found application in many biological processes that natively operate near thermal energy scales.
Placed in the context of artificial computing, these relationships have shed light on fundamental constraints on the design of computing devices to minimize thermodynamic costs.
While such theoretical results are general, applying them to the problem of computing design requires a realistic physical representation of information processing, such as bit storage, measurement, and erasure. Some success has been made with non-linear single electron devices and Coulomb blockade systems, where the logical states are represented by the presence of a few electrons. More recently, thermodynamically consistent stochastic models have been proposed for transistors and non-linear electronic circuits. Even though such models are capable of describing a broad range of nonlinear devices at a macroscopic level, they are based on continuum descriptions and therefore model parameters are chosen phenomenologically. In this paper, we adopt a different approach where single logic gates are described by a tunnel junction model on the mesoscopic scale, combined with a capacitive circuit model for the charging and manipulation of the device. Such an approach is able to describe electron transport processes consistent with the fluctuation theorems, but also consistent with the complementary metal-oxide-semiconductor (CMOS) circuit platform used widely in modern computing devices. Therefore, it provides an ideal platform to study circuit behaviors with the tools of stochastic thermodynamics.

In what follows, we demonstrate principles for low dissipation computing by constructing a stochastic model for logic circuits from a bottom-up approach. By working with elementary linear components, we can build nonlinear circuits that are thermodynamically consistent. We first introduce a model for single gates, including the NOT gate and the NAND gate, and discuss their physical properties. We then study the collective behaviors of these basic components, including spatial correlations within combinational circuits, and temporal correlations within sequential circuits, where the emphasis will be on circuit design principles.
The logic circuits are finally modularized and scaled up to a computing device to illustrate how multiple components are synchronized to complete a computing task. Throughout, the thermodynamically consistent model enables a description of errors and dissipation.
II. MODEL FOR SINGLE GATES
Modern CMOS circuits implement logic functions by integrating two different types of transistors differentiated by their majority charge carriers, so-called N-type and P-type transistors. Here we choose a mesoscopic tunnel junction model to describe electron transport in a single gate. The transistors are modeled by two single electron levels of energy $\epsilon_i$ with $i = N, P$ for the N-type and P-type transistors. The electrodes are modeled by electron reservoirs with chemical potential $\mu_j$ with $j = s, d, g$ denoting the source, drain, and gate respectively. Electron transfer among them is described by a Markovian master equation, parametrized by transition rates $k_{ji}$ that describe the exchange rate of an electron from site $i$ to $j$. The transition rates are chosen to satisfy a local detailed balance condition, thus guaranteeing thermodynamic consistency,

$$\frac{k_{ji}}{k_{ij}} = e^{-\beta (E_j - E_i)} \quad (1)$$

where $\beta = 1/k_B T$, $k_B$ is the Boltzmann constant, and $T$ the temperature of the device. The energy is described by either the band energy for an electron in the transistor, $\epsilon_i$, or a chemical potential, $\mu_j$, for an electron in an electrode. The condition of local detailed balance is a prerequisite for the application of stochastic thermodynamics, as it ensures a correct description of dissipation away from equilibrium, and relaxation to a Boltzmann distribution at equilibrium. While local detailed balance models each microscopic transition as being thermally mediated, emergent nonlinear behaviors resulting from collections of transitions can take the system arbitrarily far from equilibrium.

The energy levels of the transistors are controlled by an input voltage denoted $V_{in}$. In the case of a field-effect transistor, $V_{in}$ refers to the gate voltage that switches the transistor on and off.
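The local detailed balance condition of Eq. 1 can be checked numerically for rates of the Fermi-function form introduced below (Eq. 4). The sketch uses illustrative units ($k_B T = 1$) and arbitrary example energies; it is a consistency check, not the paper's parameterization.

```python
import math

BETA = 1.0  # inverse temperature, with k_B T = 1 (illustrative units)

def fermi(x, mu):
    """Fermi occupation of an electrode with chemical potential mu at energy x."""
    return 1.0 / (math.exp(BETA * (x - mu)) + 1.0)

def rates(eps_i, mu_j, gamma=0.1):
    """Rate pair of Eq. 4 between transistor level eps_i and electrode j:
    k_ij (electrode -> transistor) and k_ji (transistor -> electrode)."""
    f = fermi(eps_i, mu_j)
    return gamma * f, gamma * (1.0 - f)

# Local detailed balance, Eq. 1: k_ji / k_ij = exp(-beta (E_j - E_i)),
# with E_i the transistor level eps_i and E_j the chemical potential mu_j.
eps_i, mu_j = 0.7, -0.3   # arbitrary illustrative energies
k_ij, k_ji = rates(eps_i, mu_j)
ratio = k_ji / k_ij
expected = math.exp(-BETA * (mu_j - eps_i))
```

Because $f/(1-f) = e^{-\beta(\epsilon_i - \mu_j)}$, the Fermi-form rates satisfy Eq. 1 identically for any choice of energies.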
In the limit of high gate capacitance, $V_{in}$ changes the energy levels of the transistors approximately linearly,

$$\epsilon_P = \epsilon_P^0 + qV_{in}, \qquad \epsilon_N = \epsilon_N^0 - qV_{in}, \quad (2)$$

where $\epsilon_i^0$ with $i = N, P$ are reference energies and $q$ is the unit of electric charge. The sign of the slope differentiates the N and P-type transistors with different charge carriers. In our model, a voltage also uniquely determines the energetics of the electrodes by modulating their chemical potentials, $\mu_j = -qV_j$. Throughout, we will differentiate between two different types of electrodes. The first type, including the source and drain electrodes, are kept at fixed potentials, $V_s$ and $V_d$, respectively. The second type, the gate electrode, satisfies a capacitive charging model with a fluctuating voltage $V_g$ for reading out a gate. This is justified by the fact that in CMOS circuits, the output of a single gate is usually used as the input of another gate, in which case the two are connected through a capacitor. The dynamics of $V_g$ is described by the equation of motion

$$C_g \frac{dV_g}{dt} = -J_g(t) \quad (3)$$

where $C_g$ is the capacitance and $J_g$ is the electron current flowing into the electrode from the transistors. The constant capacitance implies a quadratic energy for charging the electrode, $E = C_g V_g^2/2$. Such a description is valid in the weak coupling limit between a transistor and an electrode relative to the thermal energy, and for transistors that are small in scale relative to the mean free path of the electron. We restrict our analysis to single energy level transistors, for which the corresponding transition rates between transistor $i$ and electrode $j$ are

$$k_{ij} = \Gamma f_j(\epsilon_i), \qquad k_{ji} = \Gamma[1 - f_j(\epsilon_i)] \quad (4)$$

where $f_j(x) = [e^{\beta(x - \mu_j)} + 1]^{-1}$ is the Fermi distribution.
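A quick consistency check on the charging energetics of Eq. 3: if $V_g$ relaxes from 0 to $V_d$ through an electrode held at fixed $V_d$, the energy transferred against the voltage difference integrates to $C_g V_d^2/2$, independent of how fast the relaxation occurs. The sketch below assumes an exponential relaxation profile purely for illustration; the model's actual relaxation follows from the master equation.

```python
import math

def charging_dissipation(C_g, V_d, tau=1.0, n=200000):
    """Integrate current times voltage drop, J (V_d - V_g), for an assumed
    exponential relaxation V_g(t) = V_d (1 - exp(-t/tau)) toward an electrode
    held at fixed V_d.  J = C_g dV_g/dt is the charging current of Eq. 3."""
    T, total = 20.0 * tau, 0.0
    dt = T / n
    for i in range(n):
        t = i * dt
        v_g = V_d * (1.0 - math.exp(-t / tau))
        J = C_g * V_d / tau * math.exp(-t / tau)   # charging current
        total += J * (V_d - v_g) * dt              # current times voltage drop
    return total

# The integral equals C_g V_d^2 / 2 regardless of the relaxation rate tau.
sigma_fast = charging_dissipation(200.0, 5.0, tau=0.5)
sigma_slow = charging_dissipation(200.0, 5.0, tau=5.0)
```

This $C_g V_d^2/2$ scale is what sets the energetic cost of charging a gate in the analysis that follows.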
The prefactor $\Gamma$ is related to contact resistances and is chosen so that the timescale of electron transitions is longer than the timescale of thermal fluctuations, and thus the broadening of energy levels due to the coupling is smaller than thermal fluctuations. In making these assumptions to simplify our model, we have neglected effects such as scattering within the transistor, delocalization between the electrode and the transistor, and electron correlations, each of which can be incorporated into our model as long as thermodynamic consistency is retained.

Since we will be considering energy scales on the order of thermal fluctuations at room temperature, we use $V_T = k_B T/q \approx 0.026$ V, and measure time in units of $\beta\hbar$, where $\hbar$ is the reduced Planck constant. The voltage signal to noise ratio $V_d/V_T$ in our model will be on the order of 10, which is the prerequisite of low dissipation in the computing process since the two are closely related. While this ratio is two orders of magnitude lower than the current industry level, and requires delicate operation of the device, it can be experimentally achieved by designs such as the single-electron box. We reference potentials relative to the source voltage so that $V_s = 0$, and take one reference energy to be zero and the other proportional to $qV_d$, so that there exists only one independent energy parameter, $V_d$. The transition rate constant $\Gamma$ is fixed at a small value in units of $(\beta\hbar)^{-1}$. To study the dynamics of the gates, we use both the exact steady state solution of the master equation when possible, and Gillespie simulations to sample individual trajectories. We set $C_g = 200\, q/V_T$ in order to separate the timescales of capacitor charging from individual electron transfer events, simplifying the Gillespie simulations. Details of the numerical methods and the justification of the parameters can be found in Section I of the Supplementary Information (SI).

A. NOT gate
The NOT gate, also known as the inverter, takes a single binary input $X$, and generates its complement as the output $Y$. The circuit diagram of the NOT gate, composed of two transistors, is shown in Fig. 1A. The N-type transistor is connected to a lower source voltage $V_s = 0$ on its left, while the P-type transistor is connected to a higher drain voltage $V_d$ to its right. Both transistors are controlled by an input voltage $V_{in}$ as in Eq. 2, which is treated as fixed in a single gate, while the output voltage $V_{out}$ is measured between the two transistors from the capacitor voltage $V_g$, which evolves according to Eq. 3. The kinetic diagram for our Markovian model is also shown in Fig. 1A. Electrons can move ballistically between adjacent sites in the kinetic diagram according to a master equation, the details of which can be found in the SI (Eq. S4).

FIG. 1: Performance of a single NOT gate. (A) Circuit diagram (above) and kinetic diagram (below) of a NOT gate, which is composed of an N-type (left) and a P-type transistor (right). (B) Voltage transfer curve of a NOT gate. (C) Channel capacity improves with increasing drain voltage $V_d$.

A NOT gate is typically characterized by its voltage transfer curve (VTC), shown in Fig. 1B. The VTC reports on the average $V_{out}$ in response to $V_{in}$ in the long time limit. Generically, we find increasing $V_{in}$ results in a decrease in $V_{out}$, in agreement with the expected response of an inverter. However, its behavior is dependent on the scale of the thermal noise relative to $V_d$. The limiting values of $V_{out}$ approach 0 and $V_d$ for $V_{in} = V_d$ and 0, respectively, and sharpen between these limits with increasing $V_d$.
Both features result from tuning the band energies of the two transistors in or out of resonance with their respective electrodes, as the transistor band energies depend on $V_d$ through Eq. 2. Increasing $V_d$ with $V_{in} = 0$ or $V_d$ increasingly suppresses current into the gate capacitor from $V_d$ or $V_s$. In the limit that current flows from only one electrode with fixed voltage, the gate electrode would reach an equilibrium state with that same voltage. The approach to this limiting behavior is exponential; for example, for increasing $V_d \gg V_T$ and $V_{in} = 0$, $|V_{out} - V_d| \sim \exp[-V_d/V_T]$. The VTC is also symmetric around $V_{in} = V_{out} = V_d/2$, under which condition the difference between the energy level of the transistors and their connecting reservoirs is roughly the same for the N-type and P-type transistors.
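The approach of $V_{out}$ toward these limiting values can be reproduced with a minimal Gillespie simulation of the kinetic diagram, using the rates of Eq. 4 and updating $V_g$ by $\pm q/C_g$ for each electron exchanged with the gate electrode (Eq. 3). The reference energies, $\Gamma$, and $C_g$ below are illustrative stand-ins chosen for clarity, not the paper's values (units $q = k_B T = 1$, so $V_T = 1$).

```python
import math, random

GAMMA, BETA = 0.1, 1.0
V_d, V_in, C_g = 5.0, 0.0, 50.0
eps_N = V_d - V_in    # assumed N-level: far above mu_s when V_in = 0 (N blocked)
eps_P = -V_d + V_in   # assumed P-level: near mu_d when V_in = 0 (P conducting)

def fermi(x, mu):
    return 1.0 / (math.exp(BETA * (x - mu)) + 1.0)

def simulate(n_events=5000, seed=1):
    rng = random.Random(seed)
    occ = {"N": 0, "P": 0}       # occupations of the two single-electron levels
    Vg, t = 0.0, 0.0             # gate (output) voltage and elapsed time
    levels = {"N": (eps_N, ("s", "g")), "P": (eps_P, ("d", "g"))}
    for _ in range(n_events):
        mu = {"s": 0.0, "d": -V_d, "g": -Vg}
        moves = []               # (rate, level name, occupation change, electrode)
        for name, (eps, leads) in levels.items():
            for j in leads:
                if occ[name] == 0:   # electrode j -> transistor (Eq. 4, k_ij)
                    moves.append((GAMMA * fermi(eps, mu[j]), name, +1, j))
                else:                # transistor -> electrode j (Eq. 4, k_ji)
                    moves.append((GAMMA * (1.0 - fermi(eps, mu[j])), name, -1, j))
        total = sum(m[0] for m in moves)
        t += rng.expovariate(total)          # Gillespie waiting time
        x = rng.random() * total             # pick a move proportional to its rate
        for rate, name, dn, j in moves:
            x -= rate
            if x < 0.0:
                occ[name] += dn
                if j == "g":                 # electron exchanged with the gate:
                    Vg += dn / C_g           # one leaving the gate raises Vg
                break
    return Vg, t

Vg, t = simulate()
```

With $V_{in} = 0$ the P-type pathway is near resonance with the drain, so electrons drain off the gate capacitor and $V_g$ drifts from 0 toward roughly $V_d$, fluctuating about the steady state thereafter.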
1. Performance as a computing unit
When used as a computing unit, our first concern is whether our model generates the correct output with high probability. We define a perfect gate or device as one that generates a deterministic output according to the truth table, e.g. $Y$ should be the complement of $X$ for a perfect NOT gate. However, in the presence of noise, the deterministic output becomes stochastic and subject to finite error rates. As can be anticipated from the behavior of the VTC, in the limit of high $V_d/V_T$, or the low noise limit, the performance of our model approaches that of a perfect NOT gate, whereas the behavior is nontrivial at smaller $V_d$.

The input and output signals are given as voltages in this model, so we map them to binaries by

$$X = \begin{cases} 0, & V_{in} = 0 \\ 1, & V_{in} = V_d \end{cases} \qquad Y = \begin{cases} 0, & V_{out} \le \alpha V_d \\ 1, & V_{out} \ge (1-\alpha) V_d \\ \varnothing, & \text{otherwise} \end{cases} \quad (5)$$

where $\varnothing$ represents an invalid result that cannot be designated and $\alpha$ represents an error tolerance with $0 < \alpha \ll 1$. We choose $\alpha = 0.02$ so that the resultant error at $V_d = 40\, V_T$ is comparable to current technologies, but our qualitative results are insensitive to this choice.

To characterize the accuracy of the gate, we define the error rate $\xi$ as the probability of observing an output different from the perfect gate in a single shot. In the case of $X = 0$, the error rate can be calculated from the empirical distribution of $V_{out}$ in steady state, as $\xi(X = 0) = p[V_{out} < (1-\alpha)V_d \,|\, V_{in} = 0] = 0.36$. A comprehensive characterization of the accuracy that takes into account the error rate for both cases of $X = 0$ and 1 is the channel capacity,

$$C = \max_{p(X)} I(X;Y), \quad (6)$$

which is the highest information rate that can be achieved with arbitrarily small error. We compute $C$ numerically from the mutual information $I(X;Y)$ between the input and output at steady state as a function of $V_d$ (see details in SI Section II), as shown in Fig. 1C. For a binary channel, the capacity is between 0 and 1, with 1 corresponding to a perfect gate. Here the capacity is computed to be $C = 0.60$ for a channel operated at $V_d = 5\, V_T$, given the slight difference between the error rate for $X = 0$ or 1. While from the VTC the mean $V_{out}$ is influenced by both the source and drain electrode for finite $V_d$, we find its distribution to be Gaussian with variance $1/(\beta C_g)$ within the steady state (Fig. S1). This is expected from a Boltzmann distribution, reflecting a proximity to equilibrium despite the presence of persistent currents. To reach a higher capacity, we need the average output $V_{out}$ to approach the limits 0 or $V_d$. This can be achieved by operating at a higher $V_d$ so that the leakage current flowing through the higher energy level transistor is even smaller. Given the Gaussian statistics, asymptotically for large $V_d$ the error scales as $\xi \sim \exp[-\beta C_g \alpha^2 V_d^2/2]\,\sqrt{\pi/\beta C_g}/(\alpha V_d)$, and the channel capacity scales as $C \sim 1 - \xi(1 - \log \xi)$, consistent with Fig. 1C.
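The maximization in Eq. 6 can be carried out directly for a two-input channel by scanning the input distribution. The sketch below ignores the invalid output $\varnothing$ and parameterizes the channel by two conditional error probabilities, an assumption relative to the full treatment in the SI.

```python
import math

def mutual_information(p1, eps0, eps1):
    """I(X;Y) in bits for a binary channel: p1 = P(X=1),
    eps0 = P(Y=1 | X=0), eps1 = P(Y=0 | X=1)."""
    joint = {(0, 0): (1 - p1) * (1 - eps0), (0, 1): (1 - p1) * eps0,
             (1, 0): p1 * eps1,             (1, 1): p1 * (1 - eps1)}
    px = {0: 1 - p1, 1: p1}
    py = {0: joint[(0, 0)] + joint[(1, 0)], 1: joint[(0, 1)] + joint[(1, 1)]}
    I = 0.0
    for (x, y), pxy in joint.items():
        if pxy > 0.0:
            I += pxy * math.log2(pxy / (px[x] * py[y]))
    return I

def capacity(eps0, eps1, n=2001):
    """Channel capacity, Eq. 6: maximize I(X;Y) over the input distribution."""
    return max(mutual_information(i / (n - 1), eps0, eps1) for i in range(n))

C_perfect = capacity(0.0, 0.0)   # error-free gate: 1 bit
C_noisy = capacity(0.1, 0.1)     # symmetric channel: 1 - H(0.1) bits
```

For a symmetric channel the maximum sits at $p(X=1) = 1/2$ and reduces to the familiar $1 - H(\xi)$; the asymmetric error rates of the model shift the optimal input distribution slightly away from uniform.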
2. Trade-off among accuracy, speed and dissipation
While the accuracy of the gate improves dramatically for $V_d \gg V_T$, its performance is compromised by significantly increasing costs in computation time and energy consumption. Upon receiving a distinct input signal, the gate requires time to charge or discharge the capacitor to reach a steady state output signal. The average relaxation to steady state is shown in Fig. 2A for an initially discharged capacitor with input $X = 0$. The relaxation is monotonic and nearly exponential, but with a characteristic decay time that depends on $V_d$. Under this initial condition and input voltage, $\epsilon_N \gg \mu_s$, so that few electrons can flow between the source and the capacitor. The lower energy level $\epsilon_P$ facilitates electrons to transfer from the capacitor to the drain following the concentration gradient, gradually building up a higher voltage.

We define the time it takes for $V_{out}$ to reach $(1-\alpha)V_d$, the threshold voltage for $Y = 1$, as the propagation delay time $\tau_p$. While the threshold voltage increases linearly with $V_d$, the average propagation delay $\tau_p$ grows exponentially. The propagation delay time $\tau_p$ follows an inverse Gaussian distribution with a long exponential tail (Fig. S1). Note that $\tau_p$ coincides with the time required for the error rate to decay below 0.5. Figure 2B shows the decay of the error rate with time for $V_d = 5, 8, 10\, V_T$, scaled by the propagation delay $\tau_p$ for each $V_d$. As the distribution of $V_{out}$ remains Gaussian, the time dependence of the error reflects the charging of the gate capacitor, and specifically follows the evolution of the mean $V_{out}$. While we consider the single shot error, the exponential scaling of $\tau_p$ with $V_d$ implies that associating an error rate with a time averaged measurement of $V_{out}$ would yield a non-monotonic relationship between the waiting time to reach a set error threshold and $V_d$.
For intermediate $V_d$, the slower decorrelation time will cause waiting times to increase with $V_d$, while for large $V_d$ the suppressed fluctuations will dominate and decrease waiting times.

When the gate is used repeatedly to process a sequence of inputs $X = \{X_1, X_2, \cdots, X_N\}$, there is no need to re-initialize the gate after each computation, and the residual charge on the capacitor may help reduce the computational cost. We quantify this effect with a metric for the robustness of transmission, the details of which can be found in the SI (Section III). Using this metric, we find the memory effect plays a significant role at intermediate $\tau_{obs}$, enhancing the robustness of transmission by up to 30 percent (Fig. S2). For times much longer than $\tau_p$, the memory effect wears off and the information rate is set by the channel capacity.

FIG. 2: Trade-off among accuracy, speed and dissipation for a single NOT gate. (A) Relaxation towards the steady state for a NOT gate initialized with $V_g = 0\, V_T$, and $X = 0$. (B) The decay of the error rate with time, scaled by propagation delay $\tau_p$. (Inset) Propagation delay as a function of $V_d$. (C) The total entropy production is a non-monotonic function of $V_d$ for finite observation time $\tau_{obs}$. The grey dashed line is the reversible limit $C_g V_d^2/2$.

The energy consumption for a gate can be quantified with the entropy production or the heat dissipated to the environment. From stochastic thermodynamics, the entropy production of the NOT gate during a long observation time $\tau_{obs}$ can be computed from the product of electron current and its conjugate affinity for the two separate pathways,

$$\Sigma(\tau_{obs}) = \int_0^{\tau_{obs}} dt \left[ J_{s \to N} (\mu_s - \mu_g) + J_{d \to P} (\mu_d - \mu_g) \right], \quad (7)$$

where $J_{s \to N}$ is the electron current flowing from the source to the N-type transistor, and $J_{d \to P}$ is the current from the drain to the P-type transistor (Eq. S5). In the process described in Fig. 2A, the pathway through the N-type transistor is essentially blocked due to the high energy level of $\epsilon_N$, so the main contribution in Eq. 7 is the second term in the sum. This second term has a similar form as the work required to quasi-statically charge the capacitor from $V_g = 0$ to $V_g \approx V_d$, and thus is close to $C_g V_d^2/2$. This initial charging process is the dominant contribution to the total entropy production over short times, and represents the reversible limit of the NOT gate (Fig. S1). Once the system reaches the steady state, there is still a steady entropy production coming from the leakage currents through both pathways, but the entropy production rate within the steady state is much smaller and decreases exponentially with $V_d$ (Fig. S3). This is because the output voltage $V_{out}$ is very close to $V_d$, leaving the affinity across the drain and the output nearly zero. Further, the corresponding leakage current from the source to the output is small due to the high energy level $\epsilon_N$. The opposing dependences of these contributions to $\Sigma(\tau_{obs})$ on $V_d$ imply that for each observation time $\tau_{obs}$, there exists an optimal $V_d$ that minimizes $\Sigma(\tau_{obs})$, as confirmed in Fig. 2C. The minimum $V_d$ shifts to the right with increasing time, as at higher $V_d$ a larger contribution from the steady state flux counterbalances the higher entropy production during charging.

B. NAND gate
We have presented a Markovian model for the NOT gate, which reproduces the performance of a perfect gate in the limit of high $V_d$, and for which there is a complex interplay between energy consumption and time. Within the framework presented, it is straightforward to construct an analogous model of a NAND gate. A NAND gate takes in two binary inputs $X_A$, $X_B$, and outputs $Y = 0$ only when both inputs are 1. As shown in Fig. 3A, the kinetic diagram, similar to the circuit diagram, is composed of two P-type transistors $P_A$, $P_B$, and two N-type transistors $N_A$ and $N_B$. The energy levels of $P_A$ and $N_A$ depend on the first input voltage $V_{in,A}$, while the energy levels of $P_B$ and $N_B$ are controlled by the second input $V_{in,B}$ (Eq. S14). More details on the model, including the definition of the entropy production, can be found in the SI (Section IV).

FIG. 3: Dynamics of a NAND gate. (A) Circuit diagram (left) and kinetic diagram (right) of a NAND gate. (B) Two-dimensional voltage transfer curve at $V_d = 5\, V_T$. (C-D) Decay of the error rate with time under three cases, $(X_A, X_B) = (0,0)$, $(1,0)$ and $(0,1)$, for $V_d = 5\, V_T$ (C) and $V_d = 8\, V_T$ (D), for a NAND gate initialized with $V_g = 0\, V_T$.

The two dimensional VTC for $V_d = 5\, V_T$ is shown in Fig. 3B, which agrees with the truth table for a perfect NAND gate. While the dynamical properties of the NAND gate are very similar to the NOT gate, an asymmetry arises in the NAND gate due to the different pathways in the kinetic diagram, which is a feature absent in the NOT gate. Consider the three different inputs $(X_A, X_B) = (0,0)$, $(1,0)$, and $(0,1)$ shown in Fig. 3C for $V_d = 5\, V_T$. While for a perfect NAND gate these three inputs should all correspond to the output $Y = 1$, the evolution of the error rate $\xi$ and its converged values in the steady state are not exactly the same for finite $V_d$. In the case of $(X_A, X_B) = (0,0)$, where both P-type transistors have relatively low energy levels, there are two pathways to charge the capacitor, resulting in a faster error decay rate. For the cases $(X_A, X_B) = (0,1)$ and $(1,0)$, only a single P-type pathway is available. Such differences shrink drastically when we increase $V_d$ to $8\, V_T$ in Fig. 3D; the three cases now converge to similar error rates in the steady state. In fact, as we further increase $V_d$, all such asymmetries vanish, another example of which can be found in Fig. S4, where we plot the one dimensional cut of the VTC along the line $V_{in,A} = V_{in,B}$ for different $V_d$. As in the case of the NOT gate, our model behaves as a perfect NAND gate as $V_d$ approaches 1 eV. For clarification, we define the propagation delay $\tau_p$ of a NAND gate as the time required to reach the threshold $(1-\alpha)V_d$ for the input $(X_A, X_B) = (0,0)$, which is comparable to $\tau_p$ for the NOT gate of the same $V_d$.

III. LOGIC CIRCUITS
Equipped with a model for the NOT and NAND gates, we now in principle have the tools to implement arbitrary logic functions. While any logic function can be represented in multiple ways, the topology of the circuit has an influence on its accuracy and thermodynamic costs. In the following section, we first explore spatial propagation effects arising from assembling multiple gates in a combinational circuit, and then demonstrate memory effects arising from the feedback loop in a sequential circuit. Understanding the behavior of these basic computing circuits will be crucial to building up a computing device.

For each logic circuit, which is itself a computing module made up of multiple logic gates, while each gate has an intermediate output, we reserve the symbol $V_{out}$ for the specific $V_g$ that corresponds to the overall output $Y$ of the module. Intermediate input and output voltages are not converted to binaries, except for the final output $V_{out}$. While the output of each gate is used as the input of the ensuing gate, we neglect the back reaction on $V_{out}$, so that the occupation of the ensuing transistors does not affect $V_{out}$, which is consistent with the high capacitance assumption made in Eq. 2. Unless specified otherwise, all gates are initialized at $V_g = 0\, V_T$ at the start of the computation, but no re-initialization is done afterwards. While the channel capacity is a more comprehensive characterization of the accuracy and provides the best case scenario, the much larger input space and complicated memory effects make it cumbersome to calculate in the case of logic circuits. We thus use the error rate in the final output instead, and consider the worst case scenario in choosing the inputs to provide an upper bound for the error rate whenever possible.

A. Combinational Circuit
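The universality invoked above can be made concrete in the ideal-logic limit, where each gate deterministically follows its truth table (the $V_d \to \infty$ behavior of the model). Standard combinational functions compose from NAND alone:

```python
def nand(a, b):
    """Ideal truth-table behavior of the NAND gate (the high-V_d limit)."""
    return 0 if (a == 1 and b == 1) else 1

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor(a, b):
    # parity from four NAND gates
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
nand_table = [nand(a, b) for a, b in pairs]
xor_table = [xor(a, b) for a, b in pairs]
```

At finite $V_d$ each of these compositions inherits the stochastic errors and delays analyzed below, which is precisely why the circuit topology matters.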
A combinational circuit maps a given set of inputs to a single output using a number of gates, such as an adder that computes the sum of inputs and a XOR gate that computes their parity. As the simplest example, we study the behavior of an array of $L$ NOT gates indexed by $i = 1, 2, \cdots, L$, connected in the way that $V^{(i)}_{in} = V^{(i-1)}_g$ for $i > 1$. A schematic of the system can be found in Fig. 4A. The input of the circuit $X$ determines $V^{(1)}_{in}$, and the output is measured from the last gate, $V_{out} = V^{(L)}_g$. The spatial dimension adds complexity to the evolution of $V_g$, as illustrated in Fig. 4B for $V_d = 5\, V_T$, $X = 0$. In the steady state, we expect the output voltage of the odd gates close to $V_d$, and the even gates close to zero. For a gate to reach its steady state, its input, which depends on the dynamics of the previous gate, must first reach its expected value; thus the propagation delay should increase with the gate index $i$. As the odd gates are initialized far from their steady state, it will take a significant amount of time to reach their expected output. For the odd gates which have not yet reached the steady state, the ensuing even gate will have a lower input voltage, resulting in an overshoot of voltage before eventually decaying to its expected lower output. The turn over in voltage of the even gates corresponds to the inflection point on the VTC.

A consequence of the connectivity between gates is the corruption of the initial input. While the input voltage of the first gate is always 0, for finite $V_d$, the maximum input voltage of the second gate will be slightly lower than $V_d$, and thus corrupted. As the VTC of the NOT gate is a non-increasing function, a corrupted input will inevitably cause a higher error rate in the output, which will propagate along the array. This is shown in Fig. 4C, where the error rate for individual gates in the steady state rises initially with gate index, before converging to a constant value after a few gates, and is always higher than that of the single gate. A similar behavior can be found in the propagation delay time, which increases sharply for the first few gates and converges to a slower linear increase afterwards (Fig. S5).
This implies that circuit designs with a deeper layered structure are unfavorable in terms of both accuracy and propagation delay.

The convergence behavior is intriguing, as it implies the existence of a pair of fixed points $(V^*_{odd}, V^*_{even})$ for the intermediate outputs in the steady state. Indeed, the fixed point solution corresponds to the point on the VTC $(V_{in} = V^*_{odd}, V_{out} = V^*_{even})$ satisfying the condition that its reflection $(V_{in} = V^*_{even}, V_{out} = V^*_{odd})$ is also on the VTC. As the fixed point is a dynamically stable solution, it does not depend on the initial input $V^{(1)}_{in}$ (Fig. S5), whereas the speed of approaching the fixed point characterizes the spatial correlation in the system. We fit the decay in $|V^{(i)}_g - V^*|/V_T$ with an exponential function $\exp[-\kappa i]$, and report the rate $\kappa$ for different $V_d$ in Fig. 4D. For $V_d = 5\, V_T$, the spatial correlation length $1/\kappa$ is on the order of 1, which means spatial correlation exists between neighboring gates. As a consequence, it is more probable to observe consecutive errors along the array, which is shown by an error analysis of simulated trajectories in the SI (Section V). As the VTC becomes sharper with increasing $V_d$, the correlation length between gates decreases. In the limit of high $V_d$, the fixed point solution can be found exactly at $(V_{in} = 0, V_{out} = V_d)$, which means that the input becomes uncorrupted. To summarize, the combination of gates introduces longer propagation delay and input corruption, and thus deeper layered circuit design is advised against.

FIG. 4: Performance of an array of NOT gates with $X = 0$, and all $V_g$ initialized to $0\, V_T$. (A) Schematic of an array of NOT gates with a single input $X$ and output $Y$. (B-C) Evolution of $V_g$ (B), and the steady state error rate (C) for individual gates with $V_d = 5\, V_T$. (D) Spatial propagation rate $\kappa$ as a function of $V_d$.
By operating at a higher V_d to reduce spatial correlation, the latter problem can be mitigated, but of course this comes at the cost of an even longer propagation delay.

B. Sequential Circuit: RS latch
While combinational circuits are typically used to carry out arithmetic computations, modern computing devices often include another type of logic circuit to handle memory: the sequential circuit. Fig. 5A shows an example of such a circuit, known as the RS latch. The RS latch consists of two NAND gates where the output of gate 1, V_g^(1), is sent as an input of gate 2, V_in,A^(2), and similarly, the output of gate 2, V_g^(2), is fed back as V_in,B^(1).

FIG. 5: (A) Circuit diagram of the RS latch. (B-C) The evolution of the outputs V_g^(1) and V_g^(2) for 100 trajectories with the initialization V_g^(1) = V_g^(2) = V_d = 5 V_T, where time is scaled by the propagation delay τ_p of the NAND gate. The dark curve represents the average relaxation behavior. (D-F) The location of the stable informational states determined by overlapping the VTCs for the two NAND gates at V_d = 3 V_T (D), V_d = 5 V_T (E), and V_d = 40 V_T (F).

The remaining two inputs, V_in,A^(1) and V_in,B^(2), correspond to the two external binary inputs X_S and X_R, respectively. The output of the circuit, V_out, which coincides with V_g^(1), depends not only on the external inputs X_S and X_R, but also on the stored information of V_g^(1) and V_g^(2). This is the defining characteristic of a sequential circuit, which makes it useful as memory storage. More specifically, for a perfect RS latch, in the "set" stage, where the external inputs are set as X_S = 0, X_R = 1 or X_S = 1, X_R = 0, there exists only one dynamically stable state for the system, so that we can unambiguously designate the memory at V_out as 1 or 0. In the "hold" stage, where X_S = X_R = 1, the system is instead bistable and its state depends on the initialized values of V_g^(1) and V_g^(2). In the vicinity of the fixed points, an effective Hamiltonian description of the RS latch is quartic in V_out, with two minima and a maximum between them.
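The feedback described above can be caricatured with two cross-coupled soft gates. The smooth NAND transfer function below is hypothetical, a stand-in for the device VTC rather than the paper's model, but it reproduces the qualitative bistability of the hold stage.

```python
import math

def nand_vtc(a, b, vd=5.0, gain=2.0):
    """Hypothetical smooth NAND transfer curve (a stand-in for the model's VTC):
    the output is high unless both inputs are high."""
    x = min(a, b)  # a NAND output is pulled low only when the lower input is high
    return vd / (1.0 + math.exp(gain * (x - 0.5 * vd)))

def relax_latch(v1, v2, xs, xr, vd=5.0, steps=400, eta=0.1):
    """Damped iteration of the cross-coupled pair toward a fixed point."""
    for _ in range(steps):
        v1 += eta * (nand_vtc(xs, v2, vd) - v1)  # gate 1 inputs: X_S and V_g^(2)
        v2 += eta * (nand_vtc(xr, v1, vd) - v2)  # gate 2 inputs: X_R and V_g^(1)
    return v1, v2

# Hold stage (X_S = X_R = high): two stable states, selected by the initialization.
hi = relax_latch(5.0, 0.0, 5.0, 5.0)
lo = relax_latch(0.0, 5.0, 5.0, 5.0)
print(hi, lo)
```

With both external inputs held high, the iteration converges to one of two fixed points depending on the initial condition, the deterministic analogue of the two informational states discussed next.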
This emergent bistability resulting from the feedback loop allows the RS latch to function as a memory storage device. To function as such, a circuit must have at least two distinguishable states in which information can be stored. For our stochastic model in Fig. 5A, these states correspond to the steady state solutions that satisfy the feedback condition V_in,B^(1) = V_g^(2), V_in,A^(2) = V_g^(1) under the input V_in,A^(1) = V_in,B^(2) = V_d. An intuitive way to find their location is to overlap the VTCs of the two NAND gates along the cuts V_in,A^(1) = V_d and V_in,B^(2) = V_d, which are not exactly the same due to the asymmetry of the non-perfect NAND gates. We show a few scenarios at different V_d in Fig. 5D-F. At V_d = 3 V_T, the highly asymmetric VTCs cross at only a single point, indicating that the system has a single stable state and does not qualify as a memory storage device. As V_d increases to 5 V_T, two dynamically stable informational states start to emerge, though the slight asymmetry between them suggests different dynamics around the two states. A third intersection point is also found between them, but it corresponds to an unstable saddle point. At an even higher V_d = 40 V_T, the two states converge to (V_in,B^(1), V_in,A^(2)) = (0 V_T, 40 V_T) and (40 V_T, 0 V_T), and symmetry is restored.

While the existence of two distinguishable informational states is guaranteed at sufficiently high V_d, there remains the question of whether these informational states are robust against noise. While in both the set and hold stages, V_g^(1) and V_g^(2) are usually sufficiently far from each other that it is possible to distinguish them definitively, there do exist occasions where the noise can mediate a transition.
One such example is shown in Fig. 5B and C for the initialization V_g^(1) = V_g^(2) = V_d = 5 V_T. As the outputs of the gates evolve from this initialization towards the steady state solution, there is a significant overlap between the two outputs shortly after initialization, which leads to about 13% of the trajectories failing to retain the information and evolving to the wrong fixed point. This kind of perturbation happens when the overlap region includes the unstable intersection point on the VTC, and the change of convexity of the effective Hamiltonian brings the trajectory towards a different stable state. Such an initialization error is rarely observed in either the set or the hold stage, and we show additional evidence for the robustness of the circuit for initializations V_g^(1) = V_g^(2) ≈ 2 V_T and 0 V_T in Fig. S7. In addition, at a higher V_d, as the VTC becomes sharper, not only do the two minima in the Hamiltonian become more separated, but their vicinities also become steeper, both of which facilitate the differentiation between the two states and thus drastically improve the robustness of the device.

C. Sequential Circuit: D flip-flop
With the RS latch as a basic computing unit, we can model a memory storage module that synchronizes with the clock generator, called the D flip-flop. Modern computing devices typically include a pulse generator that oscillates between 0 and 1, with a clock cycle τ_c. To see how the clock is incorporated into the D flip-flop, we show the circuit diagram of a D flip-flop in Fig. 6A, built up from 4 NAND gates and 1 NOT gate. The circuit can be readily modularized as a memory storage unit, denoted with the symbol D, that takes in an input X representing the data, another input X_WE synchronized with the clock, and generates an output Y. The two NAND gates with the feedback loop on the right-hand side constitute an RS latch, which is responsible for the memory storage. When the write-enable input X_WE = 1, the D flip-flop sets its output V_out in agreement with the data X, whereas when X_WE = 0, the D flip-flop holds its stored value as its output, which can be further processed for computing purposes.

The clock cycle τ_c, or the clock frequency 1/τ_c, is an important parameter, as it determines how fast data can be read and stored. In Fig. 6B and C, we illustrate how the clock cycle influences the accuracy and dissipation of the data transmission process for a D flip-flop with V_d = 8 V_T. We start with X_WE = 1 and send in a stream of alternating data X = {1, 0, 1, 0, · · ·}. While X_WE alternates between 1 and 0 every τ_c/
2, the data input only changes every τ_c. This input data sequence is chosen to maximize the alternation in the output, and thus minimize the memory effect discussed earlier for the NOT gate. Therefore, the error rate and dissipation in this case are expected to be the highest among all possible input sequences. The error rate ξ is measured according to the output V_out at the end of each cycle, and is reported separately for the cycles with X = 1 and 0. The evolution of V_out as a function of the cycle number can be found in Fig. S8.

Similar to the behavior for the single NAND gate in Fig. 3D, the error rate for X = 0 starts to decrease monotonically when τ_c is longer than the single gate propagation delay τ_p. The error rate for X = 1, however, first increases with τ_c before eventually decreasing. This counter-intuitive behavior comes from the memory retention behavior in the RS latch.

FIG. 6: (A) The symbol (left) and circuit diagram (right) of a D flip-flop. (B-C) Error rate (blue, with axis label on the left) and average dissipation per gate per cycle (red, with axis label on the right) as a function of the clock cycle τ_c, scaled by the propagation delay of the NAND gate, for cycles with input X = 1 (B) and 0 (C). All NAND gates are operated at V_d = 8 V_T.

Once data is stored in the RS latch, it tends to stay in the memory by influencing the transmission of the following data, and thus introduces temporal correlation between the outputs. The influence of the data can only be erased given sufficient time to transmit the following data. This temporal correlation time, or memory retention time, again coincides with the propagation delay τ_p.
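The competition between the clock cycle and the retention time can be illustrated with a deliberately crude caricature, in which the stored voltage simply relaxes exponentially toward the current data value with time constant τ_p and is digitized once per cycle; none of the device physics is retained, only the competition of timescales.

```python
import math

def run_cycles(tau_c, tau_p=1.0, data=(1, 0, 1, 0, 1, 0), vd=1.0):
    """Toy D flip-flop: the stored level relaxes toward the current datum
    for one clock cycle, then is digitized at vd/2. The memory is
    initialized high, mirroring a first datum of 1."""
    v = vd
    wrong = {0: 0, 1: 0}
    count = {0: 0, 1: 0}
    for x in data:
        target = vd * x
        v = target + (v - target) * math.exp(-tau_c / tau_p)  # incomplete relaxation
        count[x] += 1
        wrong[x] += (1 if v > 0.5 * vd else 0) != x
    return wrong[1] / count[1], wrong[0] / count[0]

# Fast clock: X = 1 cycles look deceptively accurate while X = 0 cycles fail;
# a clock slower than tau_p erases the memory effect for both.
print(run_cycles(0.2), run_cycles(5.0))
```

Even this stripped-down sketch reproduces the asymmetry between the X = 1 and X = 0 error rates at short τ_c and their joint decay once τ_c exceeds the retention time.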
In this example, as the first input is X = 1, the output retains the memory of a higher output at short τ_c, so the error rate for X = 1 is deceptively low, and the error rate for X = 0 is high. At τ_c ≈ τ_p, the output is stuck between the high and low outputs before reaching either steady state, so that the error rate for either cycle is high. In this regime, the average dissipation accumulated within each cycle rises quickly with τ_c, as charging processes contribute heavily to energy costs. When τ_c > τ_p, the memory effect is eventually overcome and the error rates for both cycles start to decay. The average dissipation rate also converges to a smaller constant value, as within each cycle the system is able to reach the steady state, in which much less dissipation is generated. The exponential scaling of τ_p with V_d implies that while the asymptotic error is expected to decrease when operating far from thermal energies, V_d ≫ V_T, the speed with which the D flip-flop can function with that lower error is significantly slower. Due to this lag, comparing between a lower and a higher V_d, the error is expected to be much lower in the former case for a fixed computing time on the order of the τ_p of the lower V_d.

IV. PARITY COMPUTING DEVICE
With the combinational circuit modularized as the arithmetic logic unit (ALU), and the sequential circuit as the memory storage device, we can combine the two components to model a computing device. We choose the task of computing the parity of a sequence of inputs X = {X_1, X_2, · · ·, X_N} of length N, which has wide applications in error detection. Such a task can be easily implemented by combining (N −
1) XOR gates in a sequential manner. However, when N is relatively large, due to the limitation in resources, it is beneficial to break up the task into several steps, and store intermediate results in memory. The clock generator synchronizes the operation of different components to ensure correct sequencing.

As an example, we consider 2 XOR gates as an ALU, and 4 D flip-flops, D1 to D4, as a memory device to check the parity of N = 12 inputs. Fig. 7A shows the schematic of our design, while the complete circuit diagram can be found in Fig. S9. Each XOR gate takes in 2 binary inputs at a time, the source of which is controlled by 2 input two-way switches, shown in red in Fig. 7A. When the switch is connected to terminal 1, the input comes from the data sequence X; whereas when terminal 2 is connected, the input comes from the data stored in a D flip-flop. At the end of each XOR gate is an output two-way switch, shown in green in Fig. 7A, which controls where to store the output. We store new data only on free D flip-flops, whose data stored at an earlier time has already been read out for post-processing and does not need to be held any more. The total system requires modeling over 100 transistors.

We start the computation by sending in pairs of input data from the data sequence, and computing their parities with the XOR gates. The D flip-flops are set by outputs from the ALU (first D1, D2 and then D3, D4), and once all D flip-flops have been set, we free them by sending the stored information back to the ALU for further processing. The computation is terminated when all inputs are taken into account in the final output, and the entire task can be completed in 6 clock cycles. A more detailed description of the protocol, and a computational tree graph that illustrates how intermediate outputs are related to the final output, can be found in the SI (Section VI) and Fig. S10.

FIG. 7: (A) Schematic of the parity computing device with 2 XOR gates and 4 D flip-flops. The input of each XOR gate is controlled by 2 input two-way switches, shown in red. The output two-way switch, shown in green, determines which D flip-flop is used to store the output of the XOR gate. (B) The average error rate (blue, with axis label on the left) and dissipation per gate per cycle (red, with axis label on the right) as a function of the clock cycle τ_c, scaled by the propagation delay of a single NAND gate, averaged over different input sequences. All gates are operated at V_d = 8 V_T.

As before, we are interested in the time and dissipation required to achieve a certain accuracy. In Fig. 7B, we show the error rate for the final output at t = 6τ_c, and the average dissipation per gate (averaged over the 28 gates in this device) per clock cycle, Σ̄, as a function of τ_c with V_d = 8 V_T. Both results are averaged over more than 10 inputs, which are sequences of independent and identically distributed Bernoulli random variables with equal probability of being 0 or 1. As expected, the average error rate decays with the clock cycle until τ_c ≈ τ_p, as the extended spatial dimension of the circuit increases the propagation delay in the final output. At such a high V_d, spatial correlations do not extend beyond neighboring gates, and are even weaker between different modules, especially for clock cycles longer than τ_p. We further analyze how the error in the final output is correlated along its computational path in the SI. The average dissipation first increases sharply and then converges to a linear growth in the limit of large τ_c, similar to the D flip-flop, but slightly lower than that in Fig. 6C for the same τ_c. This is because the input sequences are randomly chosen instead of alternating between 0 and 1, and the memory effect can help shorten the charging process, which contributes most to the entropy production.
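Abstracting away the gate-level details and energetics, the six-cycle schedule can be sketched as follows; the particular pairing order is a plausible reading of the protocol, with the exact sequencing given in the SI.

```python
from functools import reduce
from operator import xor

def parity_device(bits):
    """Sketch of the 2-XOR / 4-register parity schedule for N = 12 inputs.
    Cycle annotations are illustrative: the ALU computes two pairwise XORs
    per cycle, latching results in the D flip-flops for later reuse."""
    assert len(bits) == 12
    # Cycles 1-3: fold the input stream into six pairwise parities
    partial = [bits[i] ^ bits[i + 1] for i in range(0, 12, 2)]
    # Cycles 4-5: recombine latched partial parities through the ALU
    d = [partial[0] ^ partial[1], partial[2] ^ partial[3], partial[4] ^ partial[5]]
    # Cycle 6: final combination, read out at t = 6 tau_c
    return d[0] ^ d[1] ^ d[2]

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
assert parity_device(bits) == reduce(xor, bits)  # matches the full parity
```

Whatever the particular pairing, any such tree of XORs returns the parity of all twelve inputs; what the schedule controls is which intermediate results must be held in the flip-flops, and for how long.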
Additionally, because of the synchronization, the D flip-flops may remain at a steady state for a few cycles before they are freed to store new data. During such periods, the dissipation is especially low, as the entropy production in the steady state is minimal due to relatively small leakage currents. Therefore, computational protocols that minimize changes to the memory storage device are desirable for low-dissipation computing. Taking into consideration both the accuracy and dissipation, the optimal clock cycle to operate with is τ_c ≈ τ_p, as lowering the speed further will only result in higher dissipation from the steady state.

V. DISCUSSION AND CONCLUSION
We have illustrated a realistic model for stochastic logic gates, and demonstrated its utility in building arbitrary logical circuits. Information manipulations, such as bit storage and erasure, are represented by the charging and discharging of the capacitors, which is consistent with current data storage technology. While our model performs as a perfect logic circuit when operated in the limit of low noise, its thermodynamic consistency allows us to study the rich interplay between speed, accuracy and dissipation in the intermediate regimes, from which we can derive some useful design principles for low dissipation computing devices. For instance, we have provided a physical origin for input corruption in combinational circuits, as well as for feedback robustness in sequential circuits, and illustrated how each can be improved drastically by operating at a slightly higher voltage. In addition, memory effects should be exploited as much as possible to minimize dissipation. With modularization, it is straightforward to scale up our model to even larger and more complex systems, making it a useful model to study collective behaviors of circuits.

One of the major motivations of this work is to enable the design of low dissipation computing devices with maximal accuracy and speed. While there exist several theoretical results that propose bounds on the thermodynamic costs of computing, understanding under what circumstances they are saturated requires a realistic model for the thermal noise. As each dynamical process in our model obeys a local detailed balance, we are able to harness the lessons of stochastic thermodynamics to define and analyze the time dependence and fluctuations of the entropy production. Note that the entropy production Σ we have referred to throughout the paper differs from the total dissipation, which is the heat released by the system, by a term ΔS, the change in the Shannon entropy of the system transistors.
Nevertheless, we have used the terms entropy production and dissipation interchangeably, since for the timescales studied, the boundary term ΔS is orders of magnitude smaller than the cumulative term Σ, which is very large due to the large gate capacitance. This then raises the question of how to further decrease the irreversible dissipation and that associated with charging the gates. This problem is the crux of optimal control theory and adiabatic circuit design, from which some design principles can be borrowed. For example, while we have kept the input voltage of the transistors V_in fixed within each cycle, one can design optimal feedback protocols that control it according to the state of the capacitor, in order to minimize the irreversible dissipation throughout the process. Such optimal feedback protocols already exist for simple thermodynamic engines, and we believe our model provides an ideal testing ground for applying more advanced stochastic control algorithms. Marrying our model with a framework that integrates information with thermodynamics, we hope to get a step closer to achieving a computing design that minimizes dissipation while maximizing accuracy and speed.
VI. MATERIALS AND METHODS
Simulations were done with both an iterative, numerically exact diagonalization of the master equation and Gillespie simulations. In both, we employ a separation of timescales between electron transfer to or from a transistor and gate charging, afforded by the large gate capacitance. Specifically, the large capacitance means we can update V_g with a discrete timestep, chosen to be 10 βℏ, and compute rates at fixed V_g in between these dynamical updates. More details on the models and calculations can be found in the SI. All our codes and data can be accessed on GitHub: https://github.com/chloegao12.

VII. ACKNOWLEDGEMENT
The authors thank Gavin E. Crooks for suggesting this topic for study and for invaluable discussions and comments. C.Y.G. and D.T.L. are supported by the US Department of Energy, Office of Basic Energy Sciences, through the CPIMS Program Early Career Research Program under Award DE-FOA0002019.
Supplementary Information for
Principles of Low Dissipation Computing from a Stochastic Circuit Model

Chloe Ya Gao and David T. Limmer
a) Electronic mail: [email protected]

I. NOT GATE

The Markovian system is described by the occupation number of the two single-electron levels, n_N, n_P = 0/
1, and the gate voltage V_g. Electrons can jump between the transistors and the reservoirs only if the target site is empty. Denoting the transition rate from state i to j as k_ji, the rates describing the exchange of electrons between the transistors and the reservoirs are given by

k_Ns = Γ f_s(ε_N),  k_sN = Γ[1 − f_s(ε_N)],
k_Pd = Γ f_d(ε_P),  k_dP = Γ[1 − f_d(ε_P)],
k_Ng = Γ f_g(ε_N),  k_gN = Γ[1 − f_g(ε_N)],
k_Pg = Γ f_g(ε_P),  k_gP = Γ[1 − f_g(ε_P)],   (1)

where f_i(x) = [e^{β(x−µ_i)} + 1]^{−1} is the Fermi distribution. The transition rate between the two transistors depends on their relative energy levels; for example, in the case of ε_P > ε_N,

k_PN = Γ n(ε_P − ε_N),  k_NP = Γ[1 + n(ε_P − ε_N)],   (2)

where n(x) = [e^{βx} − 1]^{−1} is the Bose-Einstein distribution. The rate constant Γ, a fraction of 1/βℏ on the order of ps^{−1}, is chosen so that electron transitions happen on a longer timescale than quantum tunneling, and the broadening of energy levels due to the coupling to electrodes is smaller than thermal fluctuations.

The dynamics of the capacitor is solved by the equation of motion

ΔV_g = −(1/C_g) ∫_0^{t_int} J_g(t) dt,   (3)

where C_g is the capacitance and J_g is the electron current flowing into the electrode from the transistors. While the transfer of electrons changes V_g, the capacitor is treated as an electron reservoir at constant chemical potential µ_g = −qV_g within each time interval t_int. Thus the assumptions made here are that the electron transfer within each t_int is small compared to C_g V_T, and that the electron relaxation within the capacitor is fast compared to t_int.
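These rates define a four-state Markov process for the occupations (n_N, n_P). A minimal numerical sketch, with illustrative parameter values rather than those of the paper, constructs the corresponding rate matrix and solves for its stationary distribution:

```python
import numpy as np

# Illustrative dimensionless parameters (not the paper's values): energies and
# chemical potentials in units of k_B T, with Gamma = 1 setting the timescale.
GAMMA, BETA = 1.0, 1.0
eps_N, eps_P = 1.0, 2.0                  # transistor levels, eps_P > eps_N
mu = {"s": 0.5, "d": -0.5, "g": 0.0}     # source, drain, gate chemical potentials

fermi = lambda x, m: 1.0 / (np.exp(BETA * (x - m)) + 1.0)
bose = lambda x: 1.0 / (np.exp(BETA * x) - 1.0)

# Filling/emptying rates of each level, summed over its attached reservoirs
fill_N = GAMMA * (fermi(eps_N, mu["s"]) + fermi(eps_N, mu["g"]))
empty_N = GAMMA * ((1 - fermi(eps_N, mu["s"])) + (1 - fermi(eps_N, mu["g"])))
fill_P = GAMMA * (fermi(eps_P, mu["d"]) + fermi(eps_P, mu["g"]))
empty_P = GAMMA * ((1 - fermi(eps_P, mu["d"])) + (1 - fermi(eps_P, mu["g"])))
k_PN = GAMMA * bose(eps_P - eps_N)        # N -> P transfer (eps_P > eps_N)
k_NP = GAMMA * (1 + bose(eps_P - eps_N))  # P -> N transfer

# Rate matrix over states (n_N, n_P) ordered (0,0), (0,1), (1,0), (1,1);
# W[j, i] is the rate from state i to state j, as in the k_ji convention.
states = [(0, 0), (0, 1), (1, 0), (1, 1)]
W = np.zeros((4, 4))
for i, (nN, nP) in enumerate(states):
    for j, (mN, mP) in enumerate(states):
        if (mN, mP) == (1 - nN, nP):
            W[j, i] = fill_N if mN == 1 else empty_N
        elif (mN, mP) == (nN, 1 - nP):
            W[j, i] = fill_P if mP == 1 else empty_P
        elif (nN, nP) == (1, 0) and (mN, mP) == (0, 1):
            W[j, i] = k_PN
        elif (nN, nP) == (0, 1) and (mN, mP) == (1, 0):
            W[j, i] = k_NP
W -= np.diag(W.sum(axis=0))  # diagonal -S_j: columns sum to zero

# Stationary distribution: normalized null vector of W
w, v = np.linalg.eig(W)
p = np.real(v[:, np.argmin(np.abs(w))])
p /= p.sum()
n_N = p[2] + p[3]  # <n_N>: states with n_N = 1
n_P = p[1] + p[3]  # <n_P>: states with n_P = 1
print(p, n_N, n_P)
```

The stationary occupations ⟨n_N⟩ and ⟨n_P⟩ obtained this way are what enter the capacitor current below; only the parameter values here are invented.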
We have chosen C_g = 200 q/V_T and t_int = 10 βℏ in order to justify these assumptions.

To obtain a numerically exact solution to the Markovian dynamics, for each interval t_int we solve for the average occupation numbers ⟨n_N⟩, ⟨n_P⟩ from the stationary solution of the master equation, which describes how the probability p_{n_N,n_P} of the configuration (n_N, n_P) evolves with time,

ṗ_{n_N,n_P} = Σ_{n'_N,n'_P} W_{(n_N,n_P),(n'_N,n'_P)} p_{n'_N,n'_P},   (4)

where the off-diagonal elements of the 4 × 4 rate matrix W are the transition rates of Eqs. (1)-(2) between the configurations (0,0), (0,1), (1,0) and (1,1), and each diagonal element is −S_j, with S_j = Σ_{i≠j} W_{ij}, so that probability is conserved. The current J_g flowing into the capacitor is then computed as the sum of two terms,

J_{N→g}/q = k_gN ⟨n_N⟩ − k_Ng (1 − ⟨n_N⟩),
J_{P→g}/q = k_gP ⟨n_P⟩ − k_Pg (1 − ⟨n_P⟩).   (5)

In the Gillespie simulation, the electron jumping processes are modeled explicitly as chemical reactions, with M = 10 reaction rates

w_1 = k_Ns (1 − n_N),  w_2 = k_sN n_N,
w_3 = k_Pd (1 − n_P),  w_4 = k_dP n_P,
w_5 = k_Ng (1 − n_N),  w_6 = k_gN n_N,
w_7 = k_Pg (1 − n_P),  w_8 = k_gP n_P,
w_9 = k_PN n_N (1 − n_P),  w_10 = k_NP n_P (1 − n_N).   (6)

We use a Monte Carlo method to sample the probability that reaction i happens after time t,

P(t, i) = w_i exp[−Σ_{j=1}^M w_j t],   (7)

and the currents between two sites are calculated as the discrete number of jumps between the two sites. The discretization error in voltages between the average protocol and the Gillespie simulation is on the order of q/C_g = 0.005 V_T.

II. COMPUTATION OF CHANNEL CAPACITY
For the NOT gate, we observe Gaussian distributions in $V_{\rm out}$ in the steady state regardless of $V_d$, with the same variance $1/(\beta C_g)$. Under the Gaussian assumption, the error rate is uniquely determined by the average output voltage $\overline{V}_{\rm out}$. For example, the accuracy for $X = 0$ is determined by the conditional probability
$$
p(Y=1\,|\,X=0) = \int_{V_d/2}^{\infty} dV\, \sqrt{\frac{\beta C_g}{2\pi}}\, \exp\left[-\frac{\beta C_g (V - \overline{V}_{\rm out})^2}{2}\right]. \qquad (8)
$$
Marginalizing the conditional probabilities, the mutual information can be computed as
$$
I(X;Y) = \sum_{x=0,1}\; \sum_{y=0,1,\varnothing} p(x,y) \log \frac{p(x,y)}{p(x)p(y)}. \qquad (9)
$$
To compute the channel capacity, we numerically maximize $I(X;Y)$ over the input probability distribution $p(X)$, where the conditional probabilities are computed using $\overline{V}_{\rm out}$ in the steady state.

III. COMPUTATION OF AVERAGE INFORMATION RATE
For an information channel with memory, the average information rate per data is defined as
$$
\bar I(X;Y) = \frac{1}{N_{\rm data}}\, I(X_1,\cdots,X_{N_{\rm data}};\, Y_1,\cdots,Y_{N_{\rm data}}), \qquad (10)
$$
which, in the limit of $N_{\rm data}\to\infty$ and upon maximizing over the input probability distribution $p(X)$, is the generalization of the channel capacity. As an example, for $N_{\rm data} = 2$ the mutual information is computed as
$$
I(X_1,X_2;Y_1,Y_2) = \sum_{x_1=0,1}\,\sum_{x_2=0,1}\,\sum_{y_1=0,1,\varnothing}\,\sum_{y_2=0,1,\varnothing} p(x_1,x_2,y_1,y_2)\log\frac{p(x_1,x_2,y_1,y_2)}{p(x_1,x_2)\,p(y_1,y_2)}. \qquad (11)
$$
To incorporate the memory effect, note that the probability distribution of the $i$th output $Y_i$ is a function not only of $X_i$ but also of the output voltage of the previous data, $V^{i-1}_{\rm out}$. This dependence can be expressed with the conditional probability $p(V^i_{\rm out}\,|\,x_i, V^{i-1}_{\rm out})$, which we sample with Gillespie simulations of more than 10 trajectories. The dependence of $\bar I$ on the observation time $\tau_{\rm obs}$ thus comes from this conditional probability. Letting $V^0_{\rm out} = 0\,V_T$, we can write down the joint probability
$$
p(x_1,x_2,V^1,V^2) = p(V^2\,|\,x_2,V^1)\; p(V^1\,|\,x_1, V^0 = 0\,V_T)\; p(x_1)\, p(x_2), \qquad (12)
$$
which follows from the Markovian nature of the memory effect and the fact that the input data $x_i$ are independent from each other. The joint distributions can then be computed by marginalization; e.g., defining the mapping between $Y$ and $V_{\rm out}$ in Eq. 5 as $Y = {\rm dig}(V_{\rm out})$,
$$
p(x_1,x_2,y_1,y_2) = \sum_{y_1={\rm dig}(V^1)}\;\sum_{y_2={\rm dig}(V^2)} p(x_1,x_2,V^1,V^2), \qquad (13)
$$
where the sum is over all $V^i_{\rm out}$, discrete in our simulation, that correspond to $y_i$. As the numerical maximization is difficult for large $N_{\rm data}$, without loss of generality we choose as our input a sequence of independent and identically distributed Bernoulli random inputs with equal probability of being 0 or 1. We show in Fig. S2 the average information rate at $V_d = 5\,V_T$ for different $N_{\rm data}$ as a function of $\tau_{\rm obs}$, the processing time for each individual data from input to output.
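The chaining and marginalization of Eqs. 9-13 can be sketched as follows. The toy conditional distributions used in the usage example are hypothetical stand-ins for the Gillespie-sampled $p(V^i_{\rm out}\,|\,x_i, V^{i-1}_{\rm out})$; only the bookkeeping is meant to be faithful:

```python
import math

def mutual_information(p_xy):
    """I(X;Y) = sum_{x,y} p(x,y) log2[ p(x,y) / (p(x) p(y)) ], as in Eqs. 9 and 11,
    with p_xy a dict mapping (x, y) -> probability."""
    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(p * math.log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in p_xy.items() if p > 0)

def joint_with_memory(p_x, p_v, dig, v0=0.0):
    """Eqs. 12-13: build p((x1,x2), (y1,y2)) from the chained conditionals
    p(V^2|x2,V^1) p(V^1|x1,V^0=v0) p(x1) p(x2), then marginalize the voltages
    onto Y = dig(V).  p_v maps (x, V_prev) -> {V_out: probability}."""
    joint = {}
    for x1, px1 in enumerate(p_x):
        for v1, pv1 in p_v[(x1, v0)].items():
            for x2, px2 in enumerate(p_x):
                for v2, pv2 in p_v[(x2, v1)].items():
                    key = ((x1, x2), (dig(v1), dig(v2)))
                    joint[key] = joint.get(key, 0.0) + px1 * px2 * pv1 * pv2
    return joint
```

For a noiseless inverting channel with equiprobable inputs, `mutual_information(joint_with_memory(...))` returns 2 bits for $N_{\rm data} = 2$, i.e. an average information rate of 1 bit per data.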
For $N_{\rm data} = 1$, the information rate first decreases at small $\tau_{\rm obs}$, as $\xi(X=1)$ inevitably increases at short times due to the initialization $V_g = 0\,V_T$, and then rises sharply around the propagation delay $\tau_p$, which is the time required for $\xi(X=0)$ to decay. As we increase $N_{\rm data}$, the memory effect is expected to be especially helpful when consecutive inputs share the same value, and thus should on average improve the information rate. This effect is not evident for extremely small $\tau_{\rm obs}$, where the error rate for $X = 0$ is too high to be corrected by the memory effect. At intermediate $\tau_{\rm obs}$, however, the memory effect plays a significant role, bringing up to a 30 percent increase in the average information rate. All curves eventually converge in the long time limit, as the memory effect wears off for times much longer than the propagation delay.

IV. NAND GATE
The NAND gate is described by four occupation numbers $(n_{P_A}, n_{P_B}, n_{N_A}, n_{N_B})$ and the gate voltage $V_g$. The energy levels of the four transistors are determined in the same manner as in Eq. 2 of the main text, with $\epsilon_{N_A}$, $\epsilon_{P_A}$ corresponding to $V_{{\rm in},A}$, and $\epsilon_{N_B}$, $\epsilon_{P_B}$ corresponding to $V_{{\rm in},B}$:
$$
\begin{aligned}
\epsilon_{P_A} &= \epsilon + qV_{{\rm in},A}, & \epsilon_{N_A} &= \epsilon - qV_{{\rm in},A},\\
\epsilon_{P_B} &= \epsilon + qV_{{\rm in},B}, & \epsilon_{N_B} &= \epsilon - qV_{{\rm in},B}.
\end{aligned} \qquad (14)
$$
The transition rates between the transistors and reservoirs are described analogously to Eqs. S1 and S2. To avoid numerical issues in Eq. S2 when adjacent transistors have the same energy levels, we add a small regularizer in the denominator of the Bose-Einstein distribution $n(x)$. The output $V_g$ is again treated as a capacitor described by Eq. S3, where the current $J_g$ is the sum of $J_{P_A\to g}$, $J_{P_B\to g}$, and $J_{N_A\to g}$, each defined as in Eq. S5. As for the NOT gate, both an average protocol and a Gillespie simulation, now consisting of 16 chemical reactions, are used to study the dynamics. The entropy production during an observation time $\tau_{\rm obs}$ is
$$
\Sigma(\tau_{\rm obs}) = \int_0^{\tau_{\rm obs}} dt\, \Big[ J_{s\to N_B}(\mu_s - \mu_g) + (J_{d\to P_A} + J_{d\to P_B})(\mu_d - \mu_g) \Big]. \qquad (15)
$$

V. ERROR ANALYSIS IN AN ARRAY OF NOT GATES
We simulate an array of NOT gates of length $L$ with $V^{(1)}_{\rm in} = 0\,V_T$ and $V_d = 5\,V_T$, and generate more than 10 snapshots of the system. For each snapshot, we first search for regions with $d_e = 1, 2, \cdots, 16$ consecutive errors, and then count the total number of such error domains, denoted $N(d_e)$. While counting, we do not include the first 10 gates in each array, as they have not yet reached the fixed point solution. To characterize spatial correlations in the system, we compare the value of $N(d_e+1)/N(d_e)$ computed in our model with the case where all gates are independent from each other. We denote the single-gate error rates of the odd and even gates as $\xi_1$ and $\xi_2$. Note that to make a fair comparison, these error rates correspond to the fixed point solution of the array, rather than the error rate of a single NOT gate with $X = 0/1$. Assuming odd and even gates are observed with equal probability, it is easy to derive that $N(d_e=1) \propto (\xi_1+\xi_2)/2$ and $N(d_e=2) \propto \xi_1\xi_2$. One can infer from this simple calculation that for independent gates, $N(d_e+1)/N(d_e) = 2\xi_1\xi_2/(\xi_1+\xi_2)$ if $d_e$ is odd, and $(\xi_1+\xi_2)/2$ if $d_e$ is even. This result is plotted in Fig. S6 as the reference, where the zigzag behavior comes from the difference between $\xi_1$ and $\xi_2$. In addition, we plot in the same figure the value of $N(d_e+1)/N(d_e)$ for our model with $L = 60, 110, 160$, and $210$. For smaller $L$, finite size effects prevent larger error domains from emerging, so the value of $N(d_e+1)/N(d_e)$ is lower and decays with $d_e$. This effect is mitigated with increasing $L$; in the $L\to\infty$ limit the value of $N(d_e+1)/N(d_e)$ should not depend on the exact value of $d_e$ other than its parity. The converged values of $N(d_e+1)/N(d_e)$, as shown in Fig. S6, are clearly higher than the reference values, indicating that there exists a positive correlation in errors between adjacent gates. In other words, given that an error occurs at gate $i$, the probability of observing another error at a neighboring gate is enhanced by this spatial correlation.

VI. PARITY COMPUTING DEVICE

A. Computation protocol
For concreteness, we consider a 12-bit input sequence $X$ and plot in Fig. S10 B-C the $V_{\rm out}$ of the XOR gates, D1 and D2, for an ensemble of trajectories. Here the clock cycle is chosen as $\tau_c = 10\,\beta\hbar$. All gates are operated at $V_d = 8\,V_T$, while all capacitors are initialized with zero charge. We highlight with red crosses the time points where outputs are read out from the D flip-flops. At $t = 0$, we send in 4 input data, $X_1$ to $X_4$, by connecting all the input two-way switches to terminal 1. Since all the D flip-flops are slack at the moment, we can store the computing results of the XOR gates into D1 and D2 by switching both output two-way switches to terminal 1 as well. The clock stays at 0 within the first half cycle, while computations are being done at the ALU, until $t = \tau_c/2$. At $t = \tau_c$, as the clock returns to 0, another 4 input data are sent in, while the output two-way switches are connected to terminal 2 so that outputs can be sent to and stored at D3 and D4. At the end of the second cycle, when the memory devices are full and cannot take in new inputs, we read out the outputs at D1 to D4 and send them back to the ALU as inputs by connecting all input two-way switches to terminal 2. We continue the computation in this manner until all input data are taken into account and the final output is read from D1 at $t = 6\tau_c$.
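Stripped of clocking and noise, the schedule above amounts to folding the 12 inputs through successive XOR layers. A minimal error-free sketch, assuming one plausible pairing of nodes (the actual pairing is fixed by the circuit of Fig. S10 A, with any unpaired value carried to the next layer):

```python
def parity_layers(bits):
    """Idealized (error-free) evaluation of the parity tree: each layer XORs
    adjacent pairs of the previous layer; an unpaired value is carried down.
    Returns all layers, from the raw inputs to the single final output."""
    layers = [list(bits)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        nxt = [prev[i] ^ prev[i + 1] for i in range(0, len(prev) - 1, 2)]
        if len(prev) % 2:
            nxt.append(prev[-1])  # odd element carried to the next layer
        layers.append(nxt)
    return layers
```

For 12 inputs the layer sizes are 12, 6, 3, 2, 1, so the final output sits 4 layers deep, consistent with the maximum error-path length $l_e = 4$ discussed below.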
The computational tree graph of the computing device is shown in Fig. S10 A, with 12 input nodes in the 0th layer representing the input data, and a single final output $l^{(1)}_4$ that computes the parity of the inputs. The nodes in layers 1 to 3 represent intermediate computation results. If an error is observed in any node whose layer is deeper than 0, we look at its parent nodes to trace where the error originates, and at its child node (if it exists) to see how far the error propagates. We call such a record of error vertically along the computational tree graph an error path, and denote its length by $l_e$. In this computational tree graph, the maximum value for $l_e$ is 4, which means the error propagates from layer 1 all the way to the final output, while the minimum value for $l_e$ is 1. Among the simulations we have done with different input sequences, we make a histogram of the error paths with length $l_e$ for different clock cycles $\tau_c$, which is shown in Fig. S10 D. For the shortest clock cycle plotted, which is too short for the gates to reach their steady states, we observe an overwhelmingly high number of error paths of length $l_e = 4$. However, when $\tau_c$ is longer, we see an exponential decay in the number of error paths with increasing $l_e$. This exponential decay rate characterizes the temporal correlation between intermediate computation results. The rate increases with longer $\tau_c$, indicating diminishing correlations, or the weakening of the memory effect at longer clock cycles. For the longest clock cycles, it is almost impossible to find an error path with $l_e = 4$ that propagates through the entire computational tree graph.
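The error-path bookkeeping can be sketched generically: given which nodes are in error, a path starts at any erroneous node with no erroneous parent and follows erroneous children downward. The node names below are hypothetical; only the parent/child structure of the tree matters:

```python
def error_path_lengths(errors, child):
    """Trace vertical error paths through a computational tree graph.
    `errors` maps node -> bool (True if the node's value is wrong);
    `child` maps node -> its child node, or None for the final output.
    Returns the lengths l_e of all maximal vertical error paths."""
    parents = {}
    for node, c in child.items():
        if c is not None:
            parents.setdefault(c, []).append(node)
    lengths = []
    for node, err in errors.items():
        if not err:
            continue
        if any(errors.get(p, False) for p in parents.get(node, [])):
            continue  # this error is a continuation of a path started upstream
        length, cur = 1, child.get(node)
        while cur is not None and errors.get(cur, False):
            length += 1
            cur = child.get(cur)
        lengths.append(length)
    return lengths
```

A histogram of the returned lengths, accumulated over many simulated trajectories, is what Fig. S10 D reports.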
FIG. 1: Dynamics of a single NOT gate with $X = 0$, $V_d = 5\,V_T$, and $V_g$ initialized from zero. (A) Distribution of the propagation delay $\tau_p$. (B) Distribution of $V_g$ at $t = 5\times\beta\hbar$, where the red dashed line labels the threshold voltage for $Y = 1$. (C) Evolution of the total entropy production of 100 individual simulated trajectories (light blue) and their ensemble average (dark blue). (D) Distribution of the entropy production rate in the steady state, measured in the long time limit where $\tau_{\rm obs} = 10\,\beta\hbar$.

FIG. 2: Information rate for a repeatedly used NOT gate with $V_d = 5\,V_T$ as a function of the data processing time $\tau_{\rm obs}$.
FIG. 3: Entropy production as a function of $V_d$ for a NOT gate with $V_g$ initialized to $0\,V_T$. (A) The entropy production during the propagation delay $\tau_p$ (dots) for $X = 0$, which converges to the quadratic function $C_g V_d^2/2$ at large $V_d$. (B) The entropy production rate in the steady state, $\sigma_{\rm ss}$, decreases exponentially with $V_d$ for both $X = 0/1$.

FIG. 4: Steady state output of the NAND gate as a function of $V_{\rm in} = V_{{\rm in},A} = V_{{\rm in},B}$, which becomes more symmetric as $V_d$ increases.
FIG. 5: Spatial propagation in an array of NOT gates with $V_d = 5\,V_T$ and all $V_g$ initialized to $0\,V_T$. (A) Propagation delay for the odd gates with $X = 0$. (B) Exponential decay of the distance between $V_g$ and the fixed point. The spatial propagation rate $\kappa$ is deduced from the rate of the exponential decay, which is independent of the input $V^{(1)}_{\rm in}$.

FIG. 6: The ratio $N(d_e+1)/N(d_e)$ as a function of $d_e$, calculated for an array of NOT gates of length $L = 60, 110, 160$, and $210$ with $V_d = 5\,V_T$. The orange dots represent the reference where all gates are independent from each other, and alternate between $2\xi_1\xi_2/(\xi_1+\xi_2)$ and $(\xi_1+\xi_2)/2$.

FIG. 7: The evolution of the outputs $V^{(1)}_g$ and $V^{(2)}_g$ with the initialization $V^{(1)}_g = V^{(2)}_g = 0\,V_T$ (A), and with $V^{(1)}_g = V^{(2)}_g$ initialized to an intermediate voltage (B), where $V_d = 5\,V_T$. The dark curves represent the steady state solution from the average protocol, while the lighter curves represent 100 trajectories. In both cases, only a tiny fraction of the simulated trajectories end up in the other informational state.

FIG. 8: Output voltage of a D flip-flop with clock cycle $\tau_c = 2\times\beta\hbar$ (A), $\tau_c = 10\,\beta\hbar$ (B), and $\tau_c = 3\times\beta\hbar$ (C). The input data sequence starts from $X_{\rm data} = 1$ and alternates between 1 and 0. All gates are operated at $V_d = 8\,V_T$, and are initialized with $V_g = 0\,V_T$. We discard the first few cycles and average over more than 50 cycles when computing the average error rate and dissipation in Fig. 6, so that their values have no dependence on the initialization.

FIG. 9: (A) Symbol (left) and circuit diagram (right) of a XOR gate, which takes in two inputs $X_1$, $X_2$ and computes their parity as output $Y$. (B) Circuit diagram of the parity computing device with 2 XOR gates and 4 D flip-flops.

FIG. 10: (A) Computational tree graph of the device that computes the parity of 12 input data. The node $l^{(j)}_i$ denotes the $j$th output of the $i$th layer. The 0th layer has 12 nodes, which represent the 12 data in the input sequence. Each child node calculates the parity of its two parent nodes. The final output $l^{(1)}_4$ computes the parity of all the input data. (B) For a 12-bit input sequence $X$, the output of XOR 1 and its corresponding memory storage D1 for 64 individual trajectories at $V_d = 8\,V_T$, $\tau_c = 10\,\beta\hbar$. The red crosses label points where outputs on D1 are read out for further processing. (C) The output of XOR 2 and its corresponding memory storage D2, with the same parameters as in B. The lower plot shows the histogram of outputs at D2 at $t = 4\tau_c$. The red dotted line labels the threshold under which the output corresponds to an error. (D) Histogram of error paths of length $l_e$ for different clock cycles $\tau_c$.