Global minimization via classical tunneling assisted by collective force field formation
F. Caravelli† and F. C. Sheldon†,∗
†Theoretical Division (T4) and ∗Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
F. L. Traversa
MemComputing, Inc. San Diego, California, 92121
Simple dynamical models can produce intricate behaviors in large networks. These behaviors can often be observed in a wide variety of physical systems captured by the network of interactions. Here we describe a phenomenon where the increase of dimensions self-consistently generates a force field due to dynamical instabilities. This can be understood as an unstable ("rumbling") tunneling mechanism between minima in an effective potential. We dub this collective and nonperturbative effect a "Lyapunov force", which steers the system towards the global minimum of the potential function, even if the full system has a constellation of equilibrium points growing exponentially with the system size. The system we study has a simple mapping to a flow network, equivalent to current-driven memristors. The mechanism is appealing for its physical relevance in nanoscale physics, and for possible applications in optimization, novel Monte Carlo schemes and machine learning.
Introduction.
The dynamics of simple elements interacting in networks can give rise to emergent behaviors which are not exhibited in the dynamics of a single element. These behaviors are inherently multi-element in nature and yet may admit low-dimensional analytical descriptions in terms of coarse-grained variables. Examples such as synchronization and phase transitions are often widely applicable, as many different systems may reduce to the same effective model. In this work we demonstrate such a behavior in a model of input-output networks inspired by memristors. When weakly driven, the system is described by movement in an effective potential, but at higher driving, instabilities will cause escapes from local minima of the potential which can be captured as a tunneling phenomenon.

Tunneling and escape phenomena are a remarkable feature of quantum mechanical and thermal systems [1, 2]. Thermal and quantum tunneling can occur when a barrier separates two possible minima, and a particle can escape (tunnel) from one minimum to the other aided by either thermal fluctuations or spreading of the wavefunction. If the barrier is not infinitely high, both thermal and quantum systems have a non-zero probability to tunnel between minima, with a probability exponentially dependent on the height or width of the barrier. The discovery of quantum tunneling was a radical paradigmatic shift from classical mechanics and underlies many important physical phenomena and technologies (e.g. alpha decay, scanning-tunneling microscopy). A similar behavior occurs in non-equilibrium stochastic dynamical systems. Particles at a certain temperature can in fact escape from a metastable state after a typical time proportional to the inverse of the Kramers rate [2]. Intuitively, thermal fluctuations provide sufficient energy for the particle to escape the local minimum (e.g. a "kick").
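The thermal escape just described can be sketched with a minimal overdamped Langevin simulation. The quartic double well below is purely illustrative (it is not the memristor potential introduced later in this paper), and the temperatures, time step, and sample counts are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def escape_time(temperature, dt=2e-3, max_steps=2_000_000):
    """First-passage time out of the left well of V(x) = (x^2 - 1)^2 / 4
    (minima at x = -1 and x = +1, barrier height dV = 1/4 at x = 0),
    via overdamped Euler-Maruyama: dx = -V'(x) dt + sqrt(2 T dt) * xi."""
    x = -1.0
    noise_amp = np.sqrt(2.0 * temperature * dt)
    for step in range(max_steps):
        x += -x * (x * x - 1.0) * dt + noise_amp * rng.standard_normal()
        if x > 0.0:  # reached the barrier top: the particle has escaped
            return (step + 1) * dt
    return float("inf")

# Kramers: the mean escape time grows roughly as exp(dV / T),
# so the colder ensemble should take markedly longer to cross
t_hot = np.mean([escape_time(0.25) for _ in range(24)])
t_cold = np.mean([escape_time(0.10) for _ in range(24)])
```

Averaged over many runs, the cold ensemble should take much longer to escape, in line with the exponential dependence on the barrier-to-temperature ratio.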
An analogy between these behaviors is not surprising if one considers that the Schrödinger equation can be mapped to a non-Markovian stochastic system [3], or to a classical particle in a non-local force field [4]. Tunneling phenomena are of current interest for their application to computation, as in quantum annealing [5]. Metastability and tunneling are a leitmotif for optimization schemes
both in a classical and a quantum framework [6–8]. In addition to quantum annealing, interest in alternative approaches to computation and optimization is mounting [9–16]. For instance, there are proposals to use oscillators or more general encodings to process information in the frequency domain [10, 15–17]. Others leverage memory, ranging from near-memory [18] to in-memory computing [19–21], and finishing in memcomputing [11, 22]. These alternatives are trying to more efficiently solve hard problems across optimization [23–26] and machine learning [27]. Among these alternatives, interest in active or chaotic systems has experienced a revival, also because of their applicability to optimization problems [28].

FIG. 1. Representation of the tunneling phenomenon. On the left (a,c), the memristor potential V(x, s) for a high (a) and a low (c) value of the voltage s is sketched, and two attractors are highlighted through the directions that the internal memristor state takes depending on the initial conditions. On the right (b,d), the memristor potential V(x, s) now represents either each one of the memristors in the network in the absence of the coupling, or equivalently the mean field approximation of the system. In this case, the emergent Lyapunov force (blue arrows) pushes the system towards the global minimum, creating an effective tunneling through the barrier. In the middle inset (e) the equivalent schematic of the memristive system is reported, highlighting the role of the potential V(x, s).

FIG. 2. Upper panel: local (x_lm, dashed-dotted red line) and global (x_gm, solid green line) minimum location as a function of the voltage s. Lower panel: energy step (blue solid curve) and barrier ΔE (dashed yellow curve) as a function of the voltage s.
Chaotic dynamics have been posed as an obstacle to reaching asymptotic fixed points in dynamical system-based computing [29], but there has also been some indirect evidence that unstable dynamics can improve the efficiency of some optimization schemes. In particular, it has been noted that in crossbar arrays made of memristive elements, unstable dynamics can lead to escaping local minima [30] in memristor-implemented simulated annealing.

While escape phenomena are familiar in quantum and thermal settings, there are no examples, of which we are aware, of classical dynamical systems which exhibit barrier escapes when the system is a-thermal and passive. In this article we present an example of barrier tunneling as an emergent, multi-particle effect in an a-thermal, passive [31] system moving in an effective potential. Both thermal and quantum tunneling involve a distribution which is either intrinsic (quantum) or induced by uncertainty (thermal). Here this distribution is replaced by a coarse-grained representation of the system, which transitions from a laminar phase (always negative Lyapunov exponents) to an unstable or transiently unstable regime (local Lyapunov exponents positive for short transients).

The example considered is inspired by the dynamics of memristor networks [32–37], in which a low-dimensional effective potential representation shows barrier escapes. This property is represented in Fig. 1 and is a novel finding of this paper, which we characterize both analytically and numerically. This effect comes from a non-perturbative intrinsic coupling among elements, which in the memristor network interpretation is induced by Kirchhoff's laws. Due to the non-perturbative nature of the coupling, we use a recently obtained exact large-N formula for the matrix inverse, and show analytically that the effective coarse-grained dynamics can be approximated by that of a single memristor (i.e. a mean field approximation). When the effective potential lacks convexity, the emergent Lyapunov force pushes the system into the absolute minimum of the potential, thus effectively tunneling through the barriers. Interestingly, this Lyapunov force is directly related to instabilities in the dynamics.

The model.
The dynamical system we consider is derived from the dynamics of memristor networks [38], but also applies to physarum molds [39] or as a dynamic generalization of supply-chain or input-output models. In particular, it models a flow network which obeys current and energy conservation and whose edge dynamics are linear in the currents. A resistor with memory can be described by an effective dynamical resistance depending on an internal parameter x. For instance, TiO2 memristors are approximately described by the functional form R(x) = R_off (1 − x) + x R_on, where R_on < R_off are the limiting resistances, and x ∈ [0, 1]. The internal memory parameter evolves according to a simple equation of the form

d/dt x = (R_on/β) I − α x = (R_on/β) V/R(x) − α x.

The parameters α and β are the decay constant and the effective activation voltage per unit of time, respectively, and determine the timescales of the system. For a single memristor under an applied voltage S we use Ohm's law S = RI to obtain an equation for x(t) in adimensional units (τ = α t) given by

d/dτ x = s/(1 − χ x) − x = −∂_x V(x, s).   (1)

Here we have defined χ = (R_off − R_on)/R_off, with 0 ≤ χ ≤ 1, and V(x, s) as an effective potential. The dynamics of the one-dimensional system of Eqn. (1) are fully characterized by gradient descent in the potential

V(x, s) = x²/2 + (s/χ) log(1 − χ x),   (2)

with s = S/(αβ). The potential can have two local minima, and the energy step between the two is shown in Fig. 2 for various values of s and for χ = 0.9. We consider s on the restricted interval for which a barrier exists, as shown in Fig. 2. In this range, and with χ near 1, the local minimum can move inside the domain [0, 1], and an unstable fixed point (i.e. the location of the peak of the barrier) emerges, leading to two basins of attraction (Fig. 2). For α = β = 1 and χ = 0.9 this interval is 1/10 < s < 5/18. The requirement that χ, which characterizes the nonlinearity of the system, be near 1 implies that the phenomenon is nonperturbative. The asymptotic behavior of this single-element dynamical system is fully characterized by the simple basins of attraction of the potential, and presents no exotic features [40]. An analytical solution of this differential equation has recently been obtained in implicit form [41].

When, instead of a single memory device, we have a circuit composed of many memristors, each with resistance R(x_i) and with voltage generators S_i in series (taken to be constant, S_i = S̃), the differential equation (1) generalizes to a system of coupled and nonlinear ordinary differential equations. The network dynamics equation for the memory elements x_i(t) is [42, 43]:

d/dt x⃗ = (1/β) (I − χ Ω X)⁻¹ Ω S⃗ − α x⃗,   (3)

with χ = (R_off − R_on)/R_off < 1, and X_ij(t) = x_i(t) δ_ij. The matrix Ω is the projection operator (Ω² = Ω) on the vector space of the cycles of G, the graph representing the circuit [42]. In practice, Ω can be determined via the directed incidence matrix B of G as Ω = I − B(BᵗB)⁻¹Bᵗ [38, 44–47]. That Ω is a projection operator is a mathematical representation of Kirchhoff's circuit laws. In this paper we choose B to be a random matrix in order to abstract the dynamical system from a particular circuit topology.

FIG. 3. Dynamics of the system for s = 0.15 and s = 0.25; the mean field basins of attraction of the effective potential are represented by the background blue and red coloring. The initial condition for the elements of x⃗ is a uniform distribution in [0, 1]. Left panel: trajectories of the dynamical system in the case with s = 0.15 and χ = 0.9. Right panel: rumbling trajectories of the dynamical system for the case of the non-convex potential, for s = 0.25, in which the mean field equations fail to capture the dynamics of the network.

The system has a Lyapunov function when S⃗ is constant (see Supp. Mat. A), and memristors are passive elements and thus cannot possess positive Lyapunov exponents (i.e. the system cannot be unstable for long times [31]). Passive components subject to DC voltage generators will approach an equilibrium, and thus only transient forms of instability are possible. Exact information about the behavior of Eqn. (3) has been difficult to obtain, as the matrix inverse (I − χ Ω X)⁻¹ contains the variables x_i(t) and is therefore difficult to treat analytically. Additionally, Bézout's theorem for quadrics of order 2 with N variables suggests the number of equilibrium points (stable, unstable or saddle) can be exponential in the number of memristors [41] (this can be justified intuitively, as each memristor has 2 fixed points). Despite this complexity, the higher-dimensional dynamics of networks of memristors can be well approximated as an effective one-dimensional system in a coarse-grained representation, as we show in the following section.

Interestingly, in the regime where the potential V(x, s) has multiple minima and the initial condition of the system lies outside of the attraction basin of the global minimum, the system exhibits instability. This can be seen in Fig. 3 (red curves), where χ = 0.9. Here we have assigned the initial conditions of the elements of x⃗ according to a uniform distribution on [0, 1] and considered two cases for s: s = 0.15, for which the global minimum is located at x_i ≈ 0.2, and s = 0.25, for which the global minimum is at x_i = 1. For s = 0.15, most initial conditions lie within the basin of attraction of the global minimum at x_i ≈ 0.2. On the other hand, for s = 0.25, the initial conditions for the elements of x⃗ are almost all within the "local" minimum's basin of attraction, since now the global minimum is at x_i = 1; the system then shows unstable dynamics, pushing trajectories to the other side of the barrier and converging to the global minimum x_i = 1.

To quantify this instability, we track two copies of the system, x⃗^(1)(τ) and x⃗^(2)(τ). As a distance, we use the absolute value, which is essentially an L¹ measure of the Lyapunov exponents for the dynamics. For these experiments, we initialize the two systems in the same state x⃗₀, but with a small deviation ε⃗, with ||ε⃗|| = 0.01, where ||·|| represents the 1-norm. It is natural to define the quantity M_τ = (1/N) Σᵢ₌₁ᴺ |x_i^(1)(τ) − x_i^(2)(τ)|, i.e. a normalized 1-norm which quantifies the deviation of the trajectories element by element, and which is convenient because it is naturally normalized between 0 and 1. The results are shown in Fig. 4, where we see that for the case in which s = 0.15 (laminar), M_τ decays to zero rapidly. In the case s = 0.25 (unstable), M_τ grows for a transient of time approximately 7-fold longer before asymptotically reaching zero. For this reason, we dub this phenomenon a rumbling transition, due to the chaoticity of the individual trajectories resulting from the nonlinearity of the interaction.

Coarse-grained dynamics.
FIG. 4. Growth and decay of M_τ = (1/N) Σᵢ₌₁ᴺ |x_i(t) − x_i^ε(t)| as a function of time for χ = 0.9, s = 0.15 and s = 0.25. The realization of the noise ε on the initial condition is such that initially M₀ = 0.01. In the laminar regime (s = 0.15), M_τ quickly decays to zero, while it grows for a transient of time in the unstable regime. In order to measure the growth and decay of M_τ, we look at λ = (1/τ) log(M_τ/M₀). For s = 0.15 we obtain a negative λ measured on the whole interval, while for s = 0.25 we obtain a positive λ. Inset: evolution of the Lyapunov force as a function of time for a range of values of s. Every curve is averaged over 100 initial conditions and the shaded area represents the standard deviation of the mean curves.

We now make the intuitive picture above more precise by introducing an effective mean field potential derived from an approximation for the matrix inverse. It has recently been proven that the resolvent of large matrices exhibits statistical regularities, for which it can be approximated by an effectively one-dimensional matrix [48] which is universal at the zeroth order in the limit of weak correlations. The result is rather general and has applications to various domains of complexity science [49]. If we define f⃗ = Ω x⃗ and the matrix

Ã = ( f_1(x⃗) ⋯ f_1(x⃗) ; f_2(x⃗) ⋯ f_2(x⃗) ; ⋮ ; f_N(x⃗) ⋯ f_N(x⃗) ) = f⃗ ⊗ 1⃗ᵗ,   (4)

where 1⃗ is the vector of ones, we then have the approximate relation (details in Supp. Mat. B)

(I − χ Ω X)⁻¹ = I + [ (χ/N) / (1 − (χ/N) Σᵢ₌₁ᴺ f_i(x⃗)) ] Ã + O(1/N).   (5)

Thus, if we define a coarse-grained variable x_cg = (1/N) Σᵢ f_i, and define ⟨Ω S⃗⟩ = (1/N) Σᵢ₌₁ᴺ (Ω S⃗)ᵢ, we can write an effective one-dimensional dynamics:

d/dτ x_cg = (1/(αβ)) ( ⟨Ω S⃗⟩ + χ ⟨Ω S⃗⟩ x_cg/(1 − χ x_cg) ) − x_cg + L(x⃗) = −∂_{x_cg} V(x_cg, s̄) + L(x⃗),   (6)

where L(x⃗) is an effective force due to the fact that the coarse-graining is not exact. We thus see that at the zeroth order the mean field variable obeys dynamics similar to those of a single memristor, where now the parameter s = S/(αβ) is replaced by the mean field value s̄ = ⟨Ω S⃗⟩/(αβ).
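At this zeroth order, the coarse-grained variable relaxes like a single memristor. A minimal numerical sketch (assuming α = β = 1 and a constant mean-field drive, written s below; the specific values are illustrative) integrates the one-dimensional flow dx/dτ = s/(1 − χx) − x and checks the stable fixed point against the closed-form root of χx² − x + s = 0:

```python
import numpy as np

def V(x, s, chi=0.9):
    """Effective potential V(x, s) = x^2/2 + (s/chi) log(1 - chi x) of Eq. (2)."""
    return 0.5 * x**2 + (s / chi) * np.log(1.0 - chi * x)

def x_star(s, chi=0.9, alpha=1.0, beta=1.0):
    """Stable fixed point: the '-' root of chi x^2 - x + s/(alpha beta) = 0."""
    ab = alpha * beta
    return (ab - np.sqrt(ab**2 - 4.0 * ab * chi * s)) / (2.0 * ab * chi)

def relax(x0, s, chi=0.9, dt=1e-3, T=40.0):
    """Forward-Euler integration of dx/dtau = s/(1 - chi x) - x."""
    x = x0
    for _ in range(int(T / dt)):
        x += dt * (s / (1.0 - chi * x) - x)
    return x

# starting inside the interior basin, the flow reaches the closed-form fixed point
print(relax(0.5, s=0.15))  # ≈ 0.1788
print(x_star(0.15))        # ≈ 0.1788
# for s = 0.15 (below s_crit) the interior minimum is also the global one
print(V(x_star(0.15), 0.15) < V(0.999, 0.15))  # True
```

Repeating the last comparison for s = 0.25 reverses the inequality, consistent with the exchange of local and global minima around s_crit shown in Fig. 2.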
Thus, Eq. (6) again represents gradient descent dynamics for a one-dimensional system, but where an effective external force L(x⃗) emerges from the interaction between the large set of (hidden) variables x_i, and which we can evaluate numerically along the dynamics. In a sense, while the N particles each feel a similar potential, they can also interact, leading to non-trivial collective dynamics. As a result, we find that the variable x_cg is a natural dynamical order parameter for our system.

Laminar versus transiently unstable dynamics.
We numerically examined the validity of the coarse-grained dynamics for various ranges of the parameters s and χ [50]. The picture which emerges is represented in Fig. 3. As discussed previously, we consider a uniform distribution in [0, 1] for the initial condition of the elements of x⃗, and the laminar case s = 0.15 for which the global minimum is located at around x_i ≈ 0.2, as well as s = 0.25 with the global minimum at x_i = 1. For s = 0.15, most initial conditions x_i(0) lie in the basin of the global minimum. On the other hand, for s = 0.25, almost all initial conditions for the elements of x⃗ are in the basin of the "local" minimum, as now the global minimum is at x_i = 1. In the laminar regime the mean-field dynamics accurately fit the dynamics of the system. This is always the case for χ ≪ 1, in which very few trajectories cross the barrier and the dynamics are therefore smooth. In this regime, the effective force L(x⃗) (which we call the Lyapunov force for reasons which will become clear soon) is typically small after a short transient, as we can see in the inset of Fig. 4.

This implies that the equilibrium points of the dynamics, which are those satisfying d/dτ x_cg = 0, are well approximated by the mean field dynamics (i.e. the particles follow the same basin of attraction). This gives the equilibrium value

x*_cg = (1/N) Σᵢⱼ Ω_ij x*_j = ( αβ − √(α²β² − 4αβχ⟨Ω S⃗⟩) ) / (2αβχ),   (7)

which matches simulations. The equation above is equivalent to finding values of x_i ∈ [0, 1] such that x⃗ · c⃗ = a, with x⃗ such that x⃗ = Ω x⃗ (see Supp. Mat. C). This equation has many possible solutions in high dimensions, explaining the large number of asymptotic states of the memristive dynamics commonly seen in numerical experiments [51]. Yet, this also implies that the dynamics can be succinctly described by a scalar order parameter for the system, and that the mean field and the effective potential do play a role.

The ability of the coarse-grained dynamics to capture the memristor network evolution changes abruptly when most x_i have initial conditions outside of the attraction basin of the global minimum, as can be seen in the right panel of Fig. 3. Fixing χ = 0.9 and s = 0.25, such that the global minimum is now at x_i = 1, the mean field dynamics obtained at the leading order fail to capture the dynamics of the system. Individual trajectories x_i(t) now deviate substantially from the coarse-grained value x_cg(t), as we can see in Fig. 3 (right), with clear instabilities which lead them to the absorbing boundary x_i(t) = 1. While this might seem irksome at first, we argue that this phenomenon is worthy of attention. In fact, the effective Lyapunov force L(x⃗) is now consistently large along the dynamics, pushing the system towards the absolute minimum. Since the resolvent expansion of Eqn. (5) is valid in the limit of weak correlations of the matrix elements, the fact that the Lyapunov force is non-zero is naturally associated to strong correlations between the dynamical variables.

FIG. 5. Average asymptotic state E[⟨x⃗(t = ∞)⟩] (red curve) of the memristor coarse-grained variable obtained from 400 Monte Carlo instances for each point (s, x⃗), on a grid of 50 values of s and 30 points x_i ∈ [0, 1], for χ = 0.9, obtained with an Euler integration method with dt = 0.1 and tested against a Runge-Kutta 4 integration method for stability. The transparent plane is a visual aid to show that the average climbs the barrier.

As described, the system displays this behaviour even when randomly initialized, and this was investigated systematically. To highlight the role of the basin of attraction of the global minimum, we employed Monte Carlo simulations, randomizing over the initial conditions on [0, 1]. The results are shown in Fig. 5. The surface shows the potential V(x, s) as a function of x_cg and s. The superimposed red curve represents the asymptotic average position E[⟨x⃗(t = ∞)⟩], where E[·] is the average over the Monte Carlo samplings and ⟨x⃗⟩ = (1/N) Σᵢ x_i is the average over the components. As can be seen, before the barrier disappears the dynamics reach the right minimum of the potential V(x, s) with probability one. A glimpse of the structure of the basins of attraction can be obtained by fixing the initial conditions homogeneously for all but two variables x_i, and looking at the resulting asymptotic state in these two variables (see Supp. Mat. E for details) near the transition point.

Tunneling.
To capture this as a tunneling effect, we initialized the system around the local minimum using the initial condition x⃗(0) = x_lm(s) + σ⃗, where x_lm(s) is the location of the coarse-grained local minimum as depicted in Fig. 2, and σ⃗ is a random vector drawn from the Gaussian distribution N(0, σ). Given this, we performed Monte Carlo simulations and obtained the probability P_t that the system hops from the local minimum to the global minimum, or, in other words, the probability P_t that the initial condition x⃗(0) = x_lm(s) + σ⃗ leads to x⃗(t = ∞) = x_gm(s). We also note that x_lm(s) shows an abrupt change at s = s_crit ≈ 0.206, as depicted in Fig. 2, as at that point the local and global minima switch locations. This leads to two different effective tunneling directions for s > s_crit and s < s_crit (see the sketch in Fig. 1).

We first investigated how the tunneling emerges as a collective behaviour of the system. To this end, we evaluated the dynamics of systems as the number of elements increased. Fig. 6-b reports the tunneling probability for σ = 0.5 as a function of the energy step and of the system size. We can observe that at small system size the tunneling probability is proportional to the number of initial conditions that fall into the global minimum attraction basin, ending with an almost perfect sigmoid (tanh) as a function of the energy barrier ΔE (see also Fig. 6-b). This transition is smooth and qualitatively equivalent for both tunneling directions. Therefore, the system shows a size-dependent transition from gradient dynamics to a collective tunneling towards the global optimum, led by the emergent Lyapunov force.

We also investigated how the emergent force depends on the spread of the initial conditions around x_lm, i.e., how it depends on σ. Figure 6-d reports the tunneling probability P_t as a function of σ for a system of 50 components. An interesting feature manifests itself: even in the limit of small σ there is a threshold barrier ΔE for each tunneling direction below which the system is still able to tunnel towards the global minimum with probability 1 (this is highlighted in Fig. 6-c). This shows that the Lyapunov force is an intrinsic collective large-system-size feature, present even in the absence of any randomness in the initial conditions.

Based on the latter finding, we investigated whether escape from the minimum might be due to a transient instability of non-normal dynamics [52]. One possible mechanism would in fact be that, via this (local) transient instability, perturbations are amplified for a short time and the system reaches the basin of attraction of another minimum. To examine this, we studied the local stability of the system near x_lm via the Jacobian matrix. While the details are provided in App. D, we analyzed a large number of local minima via Monte Carlo, finding that the non-normality of the Jacobian does not lead to any amplification phenomenon. This leaves as the only option that the instability of Fig. 4 is in fact due to a non-perturbative and cooperative phenomenon among the dynamical variables. From the point of view of the effective potential V(x, s), this analysis implies that the global minimum lying behind a barrier is reached (via the instabilities mentioned above) through dynamics assisted by the effective Lyapunov force which pushes the system towards it.

Conclusions.
Traditional escape phenomena in thermal and quantum systems are a consequence of either thermal fluctuations or the spreading of a wavefunction. We have presented an additional setting in which barrier escapes emerge in the effective description of a multiparticle system.

FIG. 6. Tunneling probability P_t from the local to the global minimum of V(x, s). Insets (a-b) are evaluated for a standard deviation of the initial condition distribution around the local minimum of σ = 0.5. (b) Probability P_t as a function of the energy step between the two minima and of the size of the system (or equivalently the number of memristors). (a) Section of the plot in (b) for system size = 100. Blue dots are the numerical evaluation of P_t vs the energy step, while the solid red curve is the fit using tanh functions. The dashed yellow curve is the barrier ΔE as a function of the energy step. Insets (c-d) are evaluated for system size = 100. (c) Probability P_t as a function of the energy step between the two minima and of the standard deviation σ of the initial condition distribution around the local minimum. (d) Section of the plot in (c) at fixed σ: blue dots are the numerical evaluation of P_t vs the energy step, while the solid red curves are the fits using tanh functions. The dashed yellow curve is the barrier ΔE as a function of the energy step.

We presented a class of dynamical systems, derived from networks of memristors, that can be mapped to an effective one-dimensional system. The resulting dynamics are characterized by a potential which depends on the (mean-field) external applied voltage, and an effective force which occurs when the system becomes Lyapunov unstable. The result of this instability is that the effective representation may jump between basins of attraction, even through a barrier. The explanation we provide is simple but difficult to capture analytically, and is due to the large number of directions in which the system can travel, via a sequence of saddle points in the dynamics, from which the local instability emerges. This is compatible with previous investigations of memristive circuit dynamics applied to logic gates [53, 54], and with the observation of instanton-like behavior in the dynamics [55, 56].

Our analysis of the effective "Lyapunov" force provides an explanation of the observed tunneling as an epiphenomenon of the interaction between the memory elements. In principle this might fit into a coarse-graining argument for the hidden variables, via a Zwanzig expansion [57], which will be considered in the future. However, we wish to stress that the system we studied is a-thermal (i.e. no stochastic forces were introduced in our analysis) and that the effective Lyapunov force is far from random.

The main message of this paper is that the introduction of hidden variables in a dynamical system can lead to transitions between local and global minima of the effective description via instabilities in the full system. In this sense, this type of behavior is novel and worth further exploration. In quantum and statistical field theory, tunneling phenomena are described by instantons, i.e. solutions in which the field tunnels in a finite time between local minima. Instantons, originally introduced in cosmology [58], are nonlinear solutions to the field equation which are non-local in time, and represent tunneling between two local minima. Typically, instantonic behavior is associated with some form of stochasticity, whether thermal or quantum in nature. In this paper we have shown that instantonic behavior can also be an emergent phenomenon [59], thanks to the nonlinear interaction between many components. We hope that our paper can spark further interest in the study of tunneling in dynamical systems via "hidden variables", with their interesting properties, and with important applications to computer science and optimization in particular [30, 36, 55, 56].

Acknowledgements.
We acknowledge the support of NNSA for the U.S. DoE at LANL under Contract No. DE-AC52-06NA25396. FC was also financed via DOE-LDRD grants PRD20170660 and PRD20190195, and would like to thank S. Bartolucci, F. Caccioli and P. Vivo for collaborations on the resolvent formula which enabled us to write this paper, and C. Coffrin and Y.-T. Lin for comments on the draft. FCS was also supported by a Center for Nonlinear Studies fellowship.

[1] J. J. Sakurai and J. Napolitano, Modern Quantum Mechanics. Cambridge University Press, Sep. 2017.
[2] P. Hänggi, P. Talkner, and M. Borkovec, "Reaction-rate theory: fifty years after Kramers," Reviews of Modern Physics, vol. 62, pp. 251–341, Apr. 1990.
[3] E. Nelson, Quantum Fluctuations. Princeton University Press, Sep. 2020.
[4] D. Bohm, "A suggested interpretation of the quantum theory in terms of "hidden" variables. II," Physical Review, vol. 85, pp. 180–193, Jan. 1952.
[5] W. D. Oliver, "Quantum computing takes flight," Nature, vol. 574, pp. 487–488, Oct. 2019.
[6] G. E. Santoro, "Theory of quantum annealing of an Ising spin glass," Science, vol. 295, pp. 2427–2430, Mar. 2002.
[7] R. Hamerly et al., "Experimental investigation of performance differences between coherent Ising machines and a quantum annealer," Science Advances, vol. 5, p. eaau0823, May 2019.
[8] C. Baldassi and R. Zecchina, "Efficiency of quantum vs. classical annealing in nonconvex learning problems," Proceedings of the National Academy of Sciences, vol. 115, pp. 1457–1462, Jan. 2018.
[9] J. L. Hennessy and D. A. Patterson, "A new golden age for computer architecture," Communications of the ACM, vol. 62, pp. 48–60, Jan. 2019.
[10] S. K. Vadlamani, T. P. Xiao, and E. Yablonovitch, "Physics successfully implements Lagrange multiplier optimization," Proceedings of the National Academy of Sciences, vol. 117, pp. 26639–26650, Oct. 2020.
[11] F. L. Traversa and M. Di Ventra, "Universal memcomputing machines," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, pp. 2702–2715, Nov. 2015.
[12] B. Sutton, K. Y. Camsari, B. Behin-Aein, and S. Datta, "Intrinsic optimization using stochastic nanomagnets," Scientific Reports, vol. 7, Mar. 2017.
[13] J. Torrejon, M. Riou, F. A. Araujo, S. Tsunegi, G. Khalsa, D. Querlioz, P. Bortolotti, V. Cros, K. Yakushiji, A. Fukushima, H. Kubota, S. Yuasa, M. D. Stiles, and J. Grollier, "Neuromorphic computing with nanoscale spintronic oscillators," Nature, vol. 547, pp. 428–431, Jul. 2017.
[14] B. Molnár, F. Molnár, M. Varga, Z. Toroczkai, and M. Ercsey-Ravasz, "A continuous-time MaxSAT solver with high analog performance," Nature Communications, vol. 9, Nov. 2018.
[15] F. Böhm, G. Verschaffelt, and G. V. der Sande, "A poor man's coherent Ising machine based on opto-electronic feedback systems for solving optimization problems," Nature Communications, vol. 10, Aug. 2019.
[16] D. Pierangeli, G. Marcucci, and C. Conti, "Large-scale photonic Ising machine by spatial light modulation," Physical Review Letters, vol. 122, p. 213902, May 2019.
[17] G. Csaba and W. Porod, "Coupled oscillators for computing: A review and perspective," Applied Physics Reviews, vol. 7, p. 011302, Mar. 2020.
[18] G. Singh, L. Chelini, S. Corda, A. J. Awan, S. Stuijk, R. Jordans, H. Corporaal, and A.-J. Boonstra, "Near-memory computing: Past, present, and future," Microprocessors and Microsystems, vol. 71, p. 102868, Nov. 2019.
[19] D. Ielmini and H.-S. P. Wong, "In-memory computing with resistive switching devices," Nature Electronics, vol. 1, pp. 333–343, Jun. 2018.
[20] F. L. Traversa, F. Bonani, Y. V. Pershin, and M. Di Ventra, "Dynamic computing random access memory," Nanotechnology, vol. 25, p. 285201, 2014.
[21] A. Sebastian, M. L. Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, "Memory devices and applications for in-memory computing," Nature Nanotechnology, vol. 15, pp. 529–544, Mar. 2020.
[22] M. Di Ventra and F. L. Traversa, "Perspective: Memcomputing: Leveraging memory and physics to compute efficiently," Journal of Applied Physics, vol. 123, p. 180901, May 2018.
[23] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671–680, May 1983.
[24] F. Glover and G. A. Kochenberger, eds., Handbook of Metaheuristics. Springer US, 2003.
[25] M. Dorigo and T. Stützle, Ant Colony Optimization. The MIT Press, 2004.
[26] X.-S. Yang, Engineering Optimization. John Wiley & Sons, Inc., Jun. 2010.
[27] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436–444, May 2015.
[28] M. Ercsey-Ravasz and Z. Toroczkai, "The chaos within sudoku," Scientific Reports, vol. 2, Oct. 2012.
[29] T. Tél, "The joy of transient chaos," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 25, p. 097619, Sep. 2015.
[30] J. Yang, Q. Duan, Y. Wang, T. Zhang, T. Yang, and R. Huang, "Transiently chaotic simulated annealing based on intrinsic nonlinearity of memristors for efficient solution of optimization problems,"
Science Advances , vol. 6, p. eaba9901, aug 2020.[31] S. Grivet-Talocia and B. Gustavsen,
Passive macromodeling:Theory and applications . John Wiley & Sons, 2015.[32] L. Chua, “Memristor-the missing circuit element,”
IEEE Trans-actions on Circuit Theory , vol. 18, no. 5, pp. 507–519, 1971.[33] L. Chua and S. M. Kang, “Memristive devices and systems,”
Proceedings of the IEEE , vol. 64, no. 2, pp. 209–223, 1976.[34] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams,“The missing memristor found,”
Nature , vol. 453, pp. 80–83,may 2008.[35] F. Caravelli and J. P. Carbajal, “Memristors for the curious out-siders,”
Technologies , vol. 6, p. 118, dec 2018.[36] M. Di Ventra and Y. V. Pershin, “The parallel approach,”
NaturePhysics , vol. 9, pp. 200–202, apr 2013.[37] C. Du, F. Cai, M. A. Zidan, W. Ma, S. H. Lee, and W. D. Lu,“Reservoir computing using dynamic memristors for temporalinformation processing,”
Nature Communications , vol. 8, dec2017.[38] F. Caravelli, “Locality of interactions for planar memristive cir-cuits,”
Physical Review E , vol. 96, p. 052206, nov 2017.[39] A. Johannson and J. Zou, “A slime mold solver for linear pro-gramming problems,” in
Conference on Computability in Eu-rope , pp. 344–354, Springer, 2012.[40] Typically, since one must have 0 ≤ x ≤
1, the equations of mo-tion of the single variables are supplied with (non-absorbing) cutoff functions, e.g. dd τ x = − W ( x ) f ( x ) , with W ( x ) = ≤ x ≤ Physical Review E , vol. 95, p. 022140, feb2017.[43] F. Caravelli, “Asymptotic behavior of memristive circuits,”
En-tropy , vol. 21, p. 789, aug 2019.[44] F. Caravelli, “The mise en scéne of memristive networks: effec-tive memory, dynamics and learning,”
International Journal ofParallel, Emergent and Distributed Systems , vol. 33, pp. 350–366, may 2017.[45] A. Zegarac and F. Caravelli, “Memristive networks: Fromgraph theory to statistical physics,”
EPL Perspectives , vol. 125,p. 10001, jan 2019.[46] J. W. Nilsson and S. Riedel,
Electric Circuits (9th ed), . PearsonEducation, Saddle River NJ, 2011.[47] B. Bollobás,
Modern Graph Theory . Springer New York, 1998.[48] S. Bartolucci, F. Caccioli, F. Caravelli, and P. Vivo, “Inversion-free leontief inverse: statistical regularities in input-output anal-ysis from partial information,” Sept. 2020.[49] S. Bartolucci, F. Caccioli, F. Caravelli, and P. Vivo, “Universalrankings in complex input-output organizations,” Sept. 2020.[50] The numerical results are obtained using a simple Euler inte-gration with a step dt = .
1, but the results are robust to smallervalues, and agree with independent simulations using Runge-Kutta 4 method and similar step sizes.[51] F. Caravelli and F. C. Sheldon, “Phases of memristive circuitsvia an interacting disorder approach,” Aug. 2020.[52] L. N. Trefethen and M. Embree,
Spectra and pseudospectra:the behavior of nonnormal matrices and operators . PrincetonUniversity Press, 2005.[53] F. L. Traversa and M. Di Ventra, “Polynomial-time solutionof prime factorization and NP-complete problems with digitalmemcomputing machines,”
Chaos: An Interdisciplinary Jour-nal of Nonlinear Science , vol. 27, p. 023107, 2017.[54] M. Di Ventra and F. L. Traversa, “Absence of periodic or-bits in digital memcomputing machines with solutions,”
Chaos:An Interdisciplinary Journal of Nonlinear Science , vol. 27,p. 101101, oct 2017.[55] M. Di Ventra, F. L. Traversa, and I. V. Ovchinnikov, “Topolog-ical field theory and computing with instantons,”
Annalen derPhysik , vol. 529, p. 1700123, aug 2017.[56] S. R. Bearden, H. Manukian, F. L. Traversa, and M. Di Ven-tra, “Instantons in self-organizing logic gates,”
Physical ReviewApplied , vol. 9, p. 034029, mar 2018.[57] R. Zwanzig,
Nonequilibrium statistical mechanics . OxfordNew York: Oxford University Press, 2001.[58] S. Coleman, “Fate of the false vacuum: Semiclassical theory,”
Physical Review D , vol. 15, pp. 2929–2936, may 1977.[59] P. W. Anderson, “More is different,”
Science , vol. 177, pp. 393–396, aug 1972.[60] E. Ott,
Chaos in Dynamical Systems . Cambridge UniversityPress, aug 2002.
Supplementary Material
A: PROOF OF LYAPUNOV FUNCTION FOR SHARP BOUNDARIES
In this section we provide a proof of the fact that the system is dissipative when each state variable is far from the boundaries x_i = 0, 1, i.e. in the sharp-wall limit lim_{p→∞} W_p(x) = θ(x) θ(1 − x). To prove it, we construct a Lyapunov function for the differential equation (when boundaries are not considered)

  dx/dt = β (I − χ Ω X)⁻¹ Ω S − α x,   (8)

which we can rewrite as

  (1/α) (I − χ Ω X) dx/dt = (β/α) s − (I − χ Ω X) x,   (9)

where we wrote Ω S = s and X = diag(x₁, …, x_N). Consider the candidate Lyapunov function

  L = (1/3) xᵀ X x − (χ/4) xᵀ X Ω X x − (β/2α) xᵀ X s.   (10)

After taking a time derivative of the Lyapunov function above, we get

  dL/dt = ẋᵀ ( X x − χ X Ω X x − (β/α) X s )
       = ẋᵀ X ( x − χ Ω X x − (β/α) s )
       = − ẋᵀ X ( (β/α) s − (I − χ Ω X) x )   [the bracket is the right-hand side of Eqn. (9)]
       = − (1/α) ẋᵀ X (I − χ Ω X) ẋ
       = − (1/α) ẋᵀ ( X − χ X Ω X ) ẋ
       = − (1/α) ẋᵀ √X ( I − χ √X Ω √X ) √X ẋ
       = − (1/α) ‖√X ẋ‖²_{I − χ √X Ω √X}.   (11)

We obtain that dL/dt ≤ 0 if I − χ √X Ω √X ≻ 0. Since Ω is a projector and x_i ∈ [0, 1], which is a key factor in the dynamics, it is not hard to see that √X Ω √X ≼ I, and since χ < 1, the Lyapunov property applies also in this case. The existence of a Lyapunov function also implies that there is a global notion of "energy" beyond the mean-field potential provided in the main text.
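As a sanity check of the dissipativity argument, the sign of dL/dt along the flow of Eqn. (8) can be verified numerically. The sketch below evaluates grad L · ẋ at random interior points; the system size, the construction of the random projector Ω, and the parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 40, 8
# Random projector Omega = A^T (A A^T)^{-1} A (so Omega^2 = Omega), as in the text
A = rng.uniform(size=(M, N))
Omega = A.T @ np.linalg.solve(A @ A.T, A)

alpha, beta, chi = 0.9, 0.9, 0.9
S = rng.uniform(-1.0, 1.0, size=N)
s = Omega @ S  # projected source, s = Omega S

def xdot(x):
    # Eqn. (8): dx/dt = beta (I - chi Omega X)^{-1} Omega S - alpha x
    return beta * np.linalg.solve(np.eye(N) - chi * Omega @ np.diag(x), s) - alpha * x

def grad_L(x):
    # gradient of L = 1/3 x^T X x - chi/4 x^T X Omega X x - beta/(2 alpha) x^T X s
    X = np.diag(x)
    return X @ ((np.eye(N) - chi * Omega @ X) @ x - (beta / alpha) * s)

# dL/dt = grad L . xdot must be non-positive away from the boundaries of [0,1]^N
ok = True
for _ in range(200):
    x = rng.uniform(0.01, 0.99, size=N)
    ok &= grad_L(x) @ xdot(x) <= 1e-12
print(ok)  # → True
```

The check only probes the interior of [0, 1]^N, consistent with the sharp-boundary limit assumed in the proof.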
B: LARGE-N RESOLVENT AND MEAN FIELD EQUATIONS
In this section we discuss the theorem used in the main text to derive the mean-field potential. This theorem was derived in order to study resolvents of generic random matrices whose entries are weakly correlated (in a sense defined below).
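Before stating the theorem formally, its resolvent expansion can be illustrated numerically: the rank-one formula is exact when the matrix has constant rows (a Sherman-Morrison identity), and incurs only a small error when weak entrywise fluctuations are added. The sizes, scalings and noise model below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_one_resolvent(A):
    # (I - A)^{-1} ~ I + z (x) 1^T / (N (1 - zbar)), with z_i the row sums of A
    N = A.shape[0]
    z = A.sum(axis=1)
    return np.eye(N) + np.outer(z, np.ones(N)) / (N * (1.0 - z.mean()))

def max_err(N, noise):
    # mean part b_i / N (constant rows) plus weak, nearly uncorrelated fluctuations
    b = rng.uniform(0.1, 0.9, size=N)
    A = np.outer(b, np.ones(N)) / N + noise * rng.standard_normal((N, N)) / N**1.5
    exact = np.linalg.inv(np.eye(N) - A)
    return np.abs(exact - rank_one_resolvent(A)).max()

# Exact (to machine precision) for constant rows; small error with fluctuations
print(max_err(200, 0.0) < 1e-12, max_err(200, 1.0) < 2e-2)  # → True True
```

The N^{-3/2} scaling of the fluctuations is one choice satisfying the weak-correlation condition of the theorem; other admissible scalings behave similarly.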
Theorem.
Let A = (a_ij) be a random N × N matrix, characterized by the joint probability density function of the entries P_A(a_11, …, a_NN). Assume that a_ij ≥ 0 and ρ(A) < 1, that

  ⟨a_ij⟩_A = z_i / N,   (12)

and that, for some β > 1,

  ⟨a_ij a_kℓ⟩_A − ⟨a_ij⟩_A ⟨a_kℓ⟩_A ∼ (C_ij / N^β) δ_ik δ_jℓ  as N → ∞   (13)

for all i, j, with ⟨(·)⟩_A defined as

  ⟨(·)⟩_A = ∫ da_11 ⋯ da_NN P_A(a_11, …, a_NN)(·),   (14)

and with the C_ij constants. Then we have:

  (I − A)⁻¹ = I + [1 / (N (1 − z̄))] z ⊗ 1ᵀ + O(N^{−γ}),   (15)

where z̄ = (1/N) ∑_{i=1}^N z_i, z_i = ∑_{j=1}^N A_ij, and γ > 0. The results above hold irrespective of the precise form of P_A. For a proof, see [48], and for an application of the formula above to complex datasets see [49]. While the theorem is stated for a particular form of the correlations between the elements, it is as a matter of fact more general, and it applies to correlation functions which are sufficiently sparse. Nonetheless, for finite N the expansion has some corrections which are important, and which we estimate numerically in the main text.

In our case, the ensemble A is the one of random projector operators. We now make the connection between the theorem and our setup. The formula for the resolvent is exact when the span of the operator Ω is one dimensional, as can be promptly seen using the Sherman-Morrison formula.

In order to derive a mean-field (or coarse-grained) dynamics for the network of memristors, we now use the formula for the resolvent above. Consider the equation:

  dx/dt = β (I − χ Ω X)⁻¹ Ω S − α x.   (16)

The technical condition lim_{N→∞} N ⟨a_ij⟩_A = c_ij applies in the case of memristive dynamics if we consider the fact that Ω is a projector operator. As shown in [43], typically Ω_ij ≈ c/N because of the condition Ω² = Ω. As such, if we identify A = χ Ω X, since we impose 0 < x_i < 1 and χ < 1, we have |a_ij| < χ |c| / N, which satisfies the technical condition. The numerical results are obtained for a number of memristors equal to N′ = |Span(Ω)| = N − M with M = 50. For random matrices, we generate Ω = Aᵗ (A Aᵗ)⁻¹ A, with A a random matrix of size M × N and entries a_ij uniformly distributed in [0, 1].

We now discuss how to apply the result above to the dynamics of memristors. Let us define f_i(x) = ∑_j Ω_ij x_j and f = {f_i(x)}. It is not hard to see from the definitions above that z_i = χ f_i(x). We then have

  Ã = ( f_1(x) ⋯ f_1(x) ; f_2(x) ⋯ f_2(x) ; ⋮ ⋱ ⋮ ; f_N(x) ⋯ f_N(x) ) = f ⊗ 1ᵀ.   (17)

The expansion above can be rewritten as

  (I − χ Ω X)⁻¹ = I + [ (χ/N) / (1 − (χ/N) ∑_{i=1}^N f_i(x)) ] f ⊗ 1ᵀ + O(1/N),   (18)

from which we obtain

  dx/dt = β ( Ω S + [ (χ/N)(Ω S · 1) / (1 − (χ/N) f · 1) ] f ) − α x + O(1/N).   (19)

We now use the identities Ω f = f and Ω (Ω S) = Ω S. Thus, if we project the equation using Ω, we have

  df/dt = β ( Ω S + [ (χ/N)(Ω S · 1) / (1 − (χ/N) f · 1) ] f ) − α f + O(1/N).   (20)

The asymptotic states are thus determined by the equation

  β ( Ω S + [ χ ⟨Ω S⟩ / (1 − χ x_cg) ] f ) − α f = 0,   (21)

whose solution is

  f = Ω x = (β/α) [ 1 − (χβ/α) ⟨Ω S⟩ / (1 − χ x_cg) ]⁻¹ Ω S,   (22)

where we defined ⟨G⟩ = (1/N) G · 1 and x_cg = ⟨f⟩. The asymptotic states can be obtained up to a gauge transformation x* → x* + (I − Ω) r for an arbitrary r. The equation above, however, cannot be solved without first finding a value for x_cg. An equation for x_cg = (1/N) f · 1 is obtained by projecting Eqn. (20) onto 1/N, obtaining

  ∂_t x_cg = β ( ⟨Ω S⟩ + χ ⟨Ω S⟩ x_cg / (1 − χ x_cg) ) − α x_cg + L(x),   (23)

where L(x) is an effective force we will discuss in a moment. In order for the equilibrium points to be compatible, we thus also have to solve the equation

  β ( ⟨Ω S⟩ + χ ⟨Ω S⟩ x_cg / (1 − χ x_cg) ) − α x_cg = 0,   (24)

i.e.

  x_cg = (β/α) ⟨Ω S⟩ ( 1 + χ x_cg / (1 − χ x_cg) ),   (25)

whose solution is given by

  x*_cg = (1/N) ∑_{ij} Ω_ij x*_j = [ α/β − √( (α/β)² − 4 (α/β) χ ⟨Ω S⟩ ) ] / ( 2 (α/β) χ ).   (26)

By setting s = (β/α) ⟨Ω S⟩ we obtain a mapping to the one-dimensional case. Let us now discuss the emergence of the lack of convexity of the potential V(x, s) as a function of s; specifically, let us analyze the situation in which the potential has a minimum or a maximum. The coarse-grained variable is given, as mentioned earlier, by x_cg = (1/N) ∑_{ij} Ω_ij x_j. First, we focus on the potential on a compact support D = [x_min, x_max], determined via the conditions x_min = min_{x ∈ [0,1]^N} (x_cg) and x_max = max_{x ∈ [0,1]^N} (x_cg). For s < 0, the minimum sits at x_cg = max(0, x_min). For s > 0, the potential of Eqn. (2) can have at most two points at which its derivative in x is zero, given by the values x_± = (1 ± √(1 − 4sχ)) / (2χ). While x_− is always a local minimum, x_+ is always a local maximum of the potential. However, there is an intermediate set of values of s for which the potential has only a single minimum in the domain [x_min, x_max], as x_+ lies outside the domain. In this case, the local minimum x_− is also an absolute minimum in the domain [0, 1]. However, if χ is close enough to 1 and s is large enough, x_+ moves inside the domain, and the potential then has two local minima in [0, 1], one at x_− and one at x_max. If s increases further, we can have V(x_max) < V(x_−), implying that the potential is nonconvex, and a barrier of height ΔE(s) = V_s(x_+, χ) − V_s(x_−, χ) emerges between the local minimum x_− and the absolute minimum at x_max.

C: PROJECTED DYNAMICS AND EQUILIBRIUM POINTS
One of the advantages of using an analytical framework to study memristor dynamics is that many subtleties of nonlinear circuits become explicit in this setup. In this regard, it is interesting to recall the fact that Ω² = Ω. As mentioned earlier, the fact that this matrix is a projector operator is essentially a consequence of Kirchhoff's laws. Such freedom implies a redundancy in the way in which we can effectively represent the dynamics, as investigated in earlier papers [38, 42–44]. In fact, in order to obtain the matrix Ω from the circuit, one has the liberty of choosing a particular reference spanning tree, whose details are clarified in [45]. Such freedom is not hidden, but is evident from the fact that if we perform the transformation S′ = S + (I − Ω) k for an arbitrary vector k, the dynamics is unchanged (this is reminiscent of the "toy" gauge freedom which is typically present in circuits). We now see that such freedom is also present in the representation of the fixed points. We can in fact write x′ = x + (I − Ω) k without effectively changing the coarse-grained fixed-point state.

Let us consider the two formulations of the dynamics, based on the decomposition x = Ω x + (I − Ω) x = x_c + x_e, in which x_c · x_e = 0, since Ω is an orthogonal projector. Given the dynamics introduced in the bulk of the paper,

  dx/dt = β (I − χ Ω X)⁻¹ Ω S − α x,   (27)

projecting via Ω and Ω_B = I − Ω, and using the series expansion of the matrix inverse, we have

  dx_c/dt = β ( I − χ Ω (X_e + X_c) )⁻¹ Ω S − α x_c,   dx_e/dt = − α x_e.   (28)

It follows that, since x_e decays exponentially to zero, the fixed points are those with x_e = 0. Given x_c = f, the basin of attraction of a fixed point f* is defined up to an arbitrary vector x_e = (I − Ω) k added to the initial conditions. For the asymptotic fixed points emerging from the mean-field dynamics, this implies that, since x*_cg must satisfy (1/N) ∑_{ij} Ω_ij x*_j = a, then necessarily ∑_j Ω_ij x*_j = x*_i. Thus, if Ω has a span of N − M vectors, x* = ∑_{k=1}^{N−M} a_k n_k, where the n_k are the basis vectors of Span(Ω). It follows that the solutions of the problem are such that, if we call ñ_k = ∑_i (n_k)_i, the solutions are of the form ∑_k a_k ñ_k = a.

D: LOCAL STABILITY ANALYSIS AND PSEUDOSPECTRAL TRIVIALITY
For the single-memristor dynamical system, the equilibria satisfy ∂_x V(x*) = 0 and are stable when ∂²_x V(x*) > 0. Writing the one-dimensional mean-field flow as ẋ = −∂_x V = α ( s/(1 − χx) − x ), the equilibria satisfy χ x*² − x* + s = 0, and stability is equivalent to the condition

  α (2χ x* − 1) / (1 − χ x*) < 0.   (29)

This is satisfied by x_− = (1 − √(1 − 4χs))/(2χ) and violated by x_+ = (1 + √(1 − 4χs))/(2χ); the two equilibria merge, and the dynamical instability occurs, at s = 1/(4χ). For reference, in the main text we consider the values α = β = χ = 0.9, for which the instability occurs around s ≈ 0.278.

The dynamics of our system can be described by the differential equations

  dx/dt = f(x),   (30)

where f = [f_1, f_2, ⋯, f_N]ᵀ is a set of non-linear functions [60]. Since we are interested in the system escaping from an equilibrium point, we study numerically the equilibria x*, which must satisfy

  f(x*) = 0.   (31)

Since the system can have an exponential number of fixed points, such a search cannot be exhaustive. However, finding these points is relatively easy and can be done numerically by initializing the dynamics randomly and letting the system evolve into one of these minima.

The local dynamics near each equilibrium point is described by the Jacobian matrix evaluated at that point. An equilibrium point is stable if any infinitesimally small perturbation Δx(0) eventually decays to zero, i.e. lim_{t→∞} Δx(t) = 0, with

  Δx(t) = e^{DJ t} Δx(0).   (32)

Therefore, the spectrum of DJ is relevant for local stability analysis. If Λ(M) is the set of eigenvalues of M, then the equilibrium point is stable if all eigenvalues have negative real part. For stability near the boundaries, the dynamical system is given by

  dx_j/dt = W_j(x_j) f_j(x),   (33)

where W_j is a window function enforcing 0 ≤ x_j ≤ 1 and f_j(x) = ( β (I − χ Ω X)⁻¹ Ω S − α x )_j. In the following we consider the non-absorbing window function W(x_i) = θ(−f_i) θ(x_i) + θ(f_i) θ(1 − x_i), where we relax θ(a) to θ_p(a) = (1 + tanh(pa))/2 for numerical stability [11]; the numerical results shown in the paper are obtained for a fixed value of p. The full Jacobian of the dynamics is given by

  DJ^f_ij = ∂_{x_i} ( W_j(x_j) f_j(x) ) = W_j(x_j) ∂_{x_i} f_j(x) + f_j(x) ∂_{x_i} W_j(x_j).   (34)

The Jacobian for the dynamical system without window functions can be evaluated exactly; using ∂_α A⁻¹ = −A⁻¹ (∂_α A) A⁻¹ (δ_ij is the Kronecker delta),

  DJ_ji = ∂_{x_i} f_j(x) = χβ ∑_{krm} (I − χ Ω X)⁻¹_jk Ω_kr δ_ri (I − χ Ω X)⁻¹_rm (Ω S)_m − α δ_ji = χβ P_ji V_i − α δ_ji,   (35)

where P = (I − χ Ω X)⁻¹ Ω and V = P S, which we could use to evaluate the local Lyapunov exponents or to study local stability. The local flow divergence is obtained via the relation

  δx(t_{n+1}) = ( I + DJ^f(t_n) dt ) δx(t_n).   (36)

The spectrum of the Jacobian at the fixed point was first studied in [44], showing that the spectrum is real even if the matrix is not symmetric. Despite this, non-normal Jacobian matrices can in principle exhibit amplification of perturbations around a stable equilibrium, and this deserves careful scrutiny here. Small perturbations of a stable equilibrium typically decay exponentially in time. In the case of non-normal matrices, however, even if all eigenvalues have negative real part, the system can exhibit a transient instability. How non-normal a matrix M is can be quantified by (any) norm of the commutator N = (Mᵗ M − M Mᵗ)/2. In our case the nontrivial part of the Jacobian is DJ(X) = P S̄, where P = (I − χ Ω X)⁻¹ Ω and S̄_ij = S_i δ_ij, for which N is generically nonzero. This is not inconsistent with the notion of stability. As perturbations increase in magnitude during a transient phase, the effect on the dynamics becomes more pronounced; for long times, if the system is linear, the long-term dynamics is still dominated by the negativity of the real part of the spectrum. However, in the case of non-normal matrices, the transient dynamics can no longer be consistent with the traditional picture of asymptotic stability. For this reason we go beyond a simple spectral analysis. Generalizations of the notion of eigenvalues have been studied in the literature.
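The transient amplification that a stable but non-normal matrix can produce is easy to demonstrate on a toy two-by-two example; the matrix below is illustrative and unrelated to the memristor Jacobian:

```python
import numpy as np

# A stable but strongly non-normal matrix: both eigenvalues are negative,
# yet the large off-diagonal coupling transiently amplifies perturbations.
M = np.array([[-1.0, 50.0],
              [0.0, -2.0]])

w, V = np.linalg.eig(M)   # eigenvalues -1, -2: spectral abscissa < 0
Vinv = np.linalg.inv(V)

def expMt(t):
    # matrix exponential via the eigendecomposition (M is diagonalizable)
    return (V * np.exp(w * t)) @ Vinv

ts = np.linspace(0.0, 12.0, 600)
norms = np.array([np.linalg.norm(expMt(t), 2) for t in ts])

# For a normal matrix, ||exp(Mt)|| = exp(alpha(M) t) <= 1 for all t >= 0;
# here the norm overshoots by an order of magnitude before decaying.
print(w.real.max() < 0, norms.max() > 10, norms[-1] < 0.01)  # → True True True
```

The peak of the norm curve is exactly the kind of behavior the pseudospectral quantities introduced next are designed to bound.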
In general, the eigenvalues of DJ are defined by Λ(DJ) = { z ∈ ℂ : det(zI − DJ) = 0 }, meaning that if z is an eigenvalue of DJ then by convention the norm of (zI − DJ)⁻¹ is defined to be infinity [52]. But if ‖(zI − DJ)⁻¹‖ is finite yet very large, as is often the case with perturbed non-normal matrices, then the pseudospectrum of DJ must be considered. The 'ε-pseudospectrum' is a generalization of the notion of eigenvalues which depends on a real parameter ε, and can be defined in various equivalent ways [52]. We use the following definition:

  Λ_ε(DJ) = { z ∈ ℂ : ‖(zI − DJ)⁻¹‖ ≥ ε⁻¹ }.   (37)

If a matrix is normal then its ε-pseudospectrum (from now on we simply call it 'pseudospectrum') is somewhat trivial: it consists of closed balls of radius ε surrounding the eigenvalues of DJ. In the case of non-normal matrices, however, pseudospectra can be much larger and much more intricate, which is the case we are interested in here.

Local asymptotic stability is determined in the same way for normal and non-normal matrices. The 'spectral abscissa' of DJ is defined as α(DJ) = sup_{z ∈ Λ(DJ)} Re(z), i.e. the largest real part in Λ(DJ); clearly, stability is guaranteed for α(DJ) < 0. If DJ is normal, then ‖e^{DJ t}‖ = e^{α(DJ) t} and the dynamics is determined by α(DJ). The relationship between the spectral abscissa and the system dynamics is evident from the bounds [52]:

  e^{α(DJ) t} ≤ ‖e^{DJ t}‖ ≤ κ(V) e^{α(DJ) t},

where κ(V) = ‖V‖ · ‖V⁻¹‖ is the conditioning of the matrix V, which is built from the eigenvectors of DJ, and is a measure of the invertibility of V. We see that, for a normal dynamics, the conditioning provides an upper bound on the maximum amplification of perturbations. Clearly, if the spectral abscissa is positive, perturbations are unstable, but within the bounds defined above.

These bounds can however be generalized by introducing the notion of 'ε-pseudospectral abscissa' of DJ. This is defined as α_ε(DJ) = sup_{z ∈ Λ_ε(DJ)} Re(z), i.e. the largest real part of the ε-pseudospectrum of DJ (for a given ε). The relevance of the ε-pseudospectral abscissa lies in the fact that it provides a lower bound on the maximum amplification of perturbations [52]:

  sup_{ε ≥ 0} α_ε(DJ)/ε ≤ sup_{t ≥ 0} ‖e^{DJ t}‖.   (38)

The quantity

  f_DJ(ε) = α_ε(DJ)/ε,   (39)

maximized over ε, is called the Kreiss constant of the matrix DJ, and is thus relevant in order to understand the transient dynamics at short times. Given the bound of Eqn. (38), it is important to look for values ε* at which f_DJ(ε*) = 1. In a visual analysis of the contours of the pseudospectrum, a necessary condition for such a critical value ε* is that the ε*-contour crosses the imaginary axis. At this point, perturbations from the equilibrium can in principle be amplified, depending on the value of the Kreiss constant.

In order to study the non-normality, we evaluated the Jacobian at the equilibrium points of the dynamics obtained numerically via Monte Carlo. In Fig. 7 (left) we show the bulk of the ε-pseudospectrum of the Jacobian matrix DJ, for fixed Ω as in the main text and α = β = χ = 0.9. However, such non-normality is not enough to explain the transient instability we observe. A careful numerical analysis of the Kreiss constant for varying s gave a maximum value of K = sup_ε f_DJ(ε) = 0.7, thus less than 1, the critical value for amplification. We therefore directly studied the norm of e^{DJ t} (Fig. 7, right), observing no amplification phenomenon up to the instability threshold for the matrix. This implies that sup_ε α_ε(DJ)/ε ≤ 1 for our system.

E: BASINS OF ATTRACTION
First of all, let us mention, as shown in the main text, that the tunneling probability depends on the number of dynamical variables. In Fig. 6 we see that the larger the height of the barrier (ΔE decreases with s), the larger the number of variables one needs to tunnel through. In order to obtain a glimpse of the structure of the basins of attraction, and of the role of the dimensionality in reaching the two asymptotic states, we perform the following numerical experiment. Given a random instance of Ω obtained as Ω = Aᵗ (A Aᵗ)⁻¹ A, for A a random matrix (with flat probability on [0, 1]) of size N_c × N, we choose the source s = Ω S in a region in which we know there is a mixing between the local minimum and the global minimum of the mean-field potential V(x, χ) in x ∈ [0, 1]. Specifically, we study the basins of attraction of the system near the mean-field situation (e.g. a single memristor), i.e. when many of the variables x_i are initialized at the same value. We consider this analysis in the regime reported in the main text, for χ = 0.9 and s ≈ 0.22, which is close to the boundary between the two minima.

FIG. 7. Left: ε-pseudospectrum of the Jacobian matrix for a range of values of ε. Right: behavior of the norm ‖e^{DJ t}‖_∞ as a function of time for a range of values of s.

FIG. 8. Distributions P(x_cg) for the two asymptotic states as a function of ΔE of the potential. Inset: fit of the Fermi distribution with a chemical potential.

We then perform the following two experiments. In the first, we choose x_i(0) randomly, then follow the trajectory and mark it according to whether the local-minimum or the global-minimum asymptotic state is reached. Since Ω_ij is random and there is no preference among the variables, we choose to plot the initial conditions of the variables x₁ and x₂, marked depending on which asymptotic state they reached. In the second experiment we instead choose x₃ = ⋯ = x_N = x̃, with x₁ and x₂ still at random, following the procedure of the first experiment. The value of x̃ is an extra parameter, and has to be chosen such that the system ends in both minima; a suitable value for random Ω can be found numerically. For larger values of x̃ the system tends to move towards the global minimum, and for smaller values towards the local minimum. Intuitively, this implies that for random initial conditions, for every possible asymptotic state there is an initial condition nearby leading to the global minimum.

FIG. 9. Projected basins of attraction for the coarse-grained asymptotic states according to the two controlled numerical experiments described in the text. Left: completely randomized initial conditions, showing complete mixing of the initial conditions x₁ and x₂ as a function of the asymptotic state. Right: a neat separation between the two asymptotic states via a separatrix bordering the two basins of attraction.

A more careful analysis has been performed for random initial conditions. We studied via Monte Carlo sampling the probability, given a random initial condition, that the system ends in one or the other minimum, showing that it can be fit by a Fermi-like distribution depending on the energy barrier ΔE. Let P(x; s, χ) be this probability as a function of the height of the barrier ΔE(s) = E_max − E_local min measured from the local minimum. We obtained this function numerically, averaging over 100 Monte Carlo simulations for each value of s, as shown in Fig. 8. We fit P(ΔE) with a function of the form P(ΔE; s, χ) = 1 / (1 + e^{(ΔE(s,χ) − µ)/T_eff}), where the effective temperature T_eff and the chemical potential µ were obtained numerically.
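The fitting step just described can be sketched as follows; the data here are synthetic and the parameter values are illustrative stand-ins, not the fitted values of the paper:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

def fermi(dE, T_eff, mu):
    # Fermi-like escape probability as a function of the barrier height dE
    return 1.0 / (1.0 + np.exp((dE - mu) / T_eff))

# Synthetic stand-in for the Monte Carlo estimates of P(Delta E): sample
# binomial outcomes around a "true" Fermi curve with illustrative parameters.
T_true, mu_true = 0.05, 0.15
dE = np.linspace(0.0, 0.4, 25)
n_runs = 100  # trajectories per barrier height, as in the supplementary text
P_hat = rng.binomial(n_runs, fermi(dE, T_true, mu_true)) / n_runs

# Least-squares fit of the effective temperature and chemical potential
(T_fit, mu_fit), _ = curve_fit(fermi, dE, P_hat, p0=(0.1, 0.2))
print(T_fit, mu_fit)
```

With 100 trajectories per barrier height, the binomial sampling noise is small enough that the fitted (T_eff, µ) land close to the generating values.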