[PDF] Generalized Play Hysteresis Operators in Limits of Fast-Slow Systems

Abstract

Hysteresis operators appear in many applications such as elasto-plasticity and micromagnetics, and can be used for a wider class of systems, where rate-independent memory plays a role. A natural approximation for systems of evolution equations with hysteresis operators are fast-slow dynamical systems, which - in their used approximation form - do not involve any memory effects. Hence, viewing differential equations with hysteresis operators in the non-linearity as a limit of approximating fast-slow dynamics involves subtle limit procedures. In this paper, we give a proof of Netushil's "observation" that broad classes of planar fast-slow systems with a two-dimensional critical manifold are expected to yield generalized play operators in the singular limit. We provide two proofs of this "observation" based upon the fast-slow systems paradigm of decomposition into subsystems. One proof strategy employs suitable convergence in function spaces, while the second approach considers a geometric strategy via local linearization and patching adapted originally from problems in stochastic analysis. We also provide an illustration of our results in the context of oscillations in forced planar non-autonomous fast-slow systems. The study of this example also strongly suggests that new canard-type mechanisms can occur for two-dimensional critical manifolds in planar systems.

Full PDF

aa r X i v : . [ m a t h . D S ] M a y Generalized Play Hysteresis Operatorsin Limits of Fast-Slow Systems

Christian Kuehn ∗ and Christian M¨unch † May 24, 2017

Abstract

Hysteresis operators appear in many applications such as elasto-plasticity and micro-magnetics, and can be used for a wider class of systems, where rate-independent memoryplays a role. A natural approximation for systems of evolution equations with hysteresisoperators are fast-slow dynamical systems, which - in their used approximation form -do not involve any memory eﬀects. Hence, viewing diﬀerential equations with hystere-sis operators in the non-linearity as a limit of approximating fast-slow dynamics involvessubtle limit procedures. In this paper, we give a proof of Netushil’s “observation” thatbroad classes of planar fast-slow systems with a two-dimensional critical manifold are ex-pected to yield generalized play operators in the singular limit. We provide two proofsof this “observation” based upon the fast-slow systems paradigm of decomposition intosubsystems. One proof strategy employs suitable convergence in function spaces, whilethe second approach considers a geometric strategy via local linearization and patchingadapted originally from problems in stochastic analysis. We also provide an illustrationof our results in the context of oscillations in forced planar non-autonomous fast-slow sys-tems. The study of this example also strongly suggests that new canard-type mechanismscan occur for two-dimensional critical manifolds in planar systems.

Keywords:

Fast-slow system, multiple time scale dynamics, hysteresis operator, general-ized play, canard, Netushil’s observation.

In this (non-technical) introduction, we are going to outline the main topic. We also presentthe main result and proof strategy used in this paper.Depending upon the context, the term ’hysteresis’ is used in technically diﬀerent, yetstrongly related, forms. Classical results in magnetic materials [SW48, Tor00] and mechan-ical systems [Lov13, MX91] refer to hysteresis as the description of memory eﬀects, particularly ∗ Technical University of Munich, Faculty of Mathematics, Research Unit “Multiscale and Stochastic Dynam-ics”, 85748 Garching b. M¨unchen, Germany † Technical University of Munich, Faculty of Mathematics, Research Unit “Mathematical Modelling”, 85748Garching b. M¨unchen, Germany hysteresis oper-ators is that trajectories often form ’loops’ in phase space as shown in Figure 1(a). A secondcommon use of hysteresis is to describe a system exhibiting switching between two locally stablestates upon parameter variation. A typical situation occurs when two fold bifurcations [GH83]are connected in an S-shaped curve as shown in Figure 1(b). This hysteresis eﬀect has beenfound and described in essentially all disciplines involving mathematical models ranging fromneuroscience [Fit55], geoscience [GR01], engineering [vdP26], solid-state physics [Str00], ecol-ogy [BHC03] to economics [Cro93], just to name a few; see also the references in [Kue15].A suitable mathematical formulation to approximate systems, in which this hysteresis eﬀectappears, are fast-slow ordinary diﬀerential equations (ODEs), where parameters are viewedas slowly-driven variables. Also in this case, limit cycles are frequently observed, e.g., slowlymoving through an S-shaped bifurcation diagram can lead to relaxation oscillations; see alsoSection 2.1 for more precise deﬁnitions and background on geometric singular perturbationtheory (GSPT) for fast-slow systems [Kue15].(a) (b)Figure 1: Sketch of two hysteresis models. (a) Trajectory for a hysteresis operator with memory;note that if we start at the lower left point, then the trajectory does not make sense as atrajectory of a planar ODE as it would violate local uniqueness. (b)

Hysteresis behaviour givenby a relaxation oscillation (solid curve) in a fast-slow system. The S-shaped critical manifold(dashed curve) can also be interpreted as one limiting part of the dynamics in the inﬁnite-timescale separation limit, i.e., there are three branches of steady states, which meet at two foldbifurcations.Although these two uses of hysteresis diﬀer in their mathematical setup, it is evident fromFigure 1 that one might expect at least some relation between the model classes for certain cases.The “observation” that certain hysteresis operators can be obtained as a singular limit fromfast-slow dynamics is often attributed [PS05, MOPS05] to Netushil [Net68, Net70]. However,taking limits directly to relate fast-slow dynamics to systems with hysteresis operators is diﬃcultas the limits are singular, in the sense that we have to bridge via the limiting process twodiﬀerent classes of equations. This diﬃculty is one possible motivation to look at the singularlimit by exploiting gradient-type structures [MR15], if they are available, or going deeper intothe theory of diﬀerential inclusions. Albeit yielding limits for certain subclasses of fast-slowsystems, these approaches tend to make very strong assumptions on the structure of the systemand, more importantly, the interpretation and proof strategies provided by GSPT, i.e., guidedby the individual decomposition of trajectories, are lost. Hence, we are going to refer to thissituation more precisely as Netushil’s conjecture, i.e., how one can rigorously unify GSPT and he abstract hysteresis operator viewpoint , without making too stringent assumptions on thestructure of the fast-slow system; see Sections 3 & 4 for the precise statements.In this work, we give two rigorous proofs of Netushil’s conjecture for a very general classof planar fast-slow systems, which have two-dimensional critical manifolds [PS05]; see alsoSection 3 for the technical statement of Netushil’s conjecture. We highlight that in our setting,we allow for a full coupling of fast and slow variables. We establish our results via two diﬀerenttechniques. The ﬁrst approach is more functional-analytic, yet carefully exploits the fast-slow decomposition. The second approach is very geometric and transfers a proof-strategyinitially developed for stochastic fast-slow systems to the deterministic hysteresis situation.More precisely, we show how to obtain the generalized play operator, which can be viewed asa model for perfect plasticity. In addition, to the overall GSPT approach, each of the twoproofs also contains further technical advances, e.g., the use of suitable projectors to deal withsmall-scale oscillations or a careful matching argument near two-dimensional critical manifoldsin planar systems. In summary, our results aim to provide a better bridge between two adjacentareas of mathematical modelling, and to provide a step in establishing Netushil’s conjecture, i.e.,to link GSPT and hysteresis operators in even more generality; see Section 2.2 for backgroundand comparison of our results to other approaches.The paper is structured as follows. In Section 2, we explain the necessary background fromfast-slow systems and hysteresis operators. We also provide a brief introduction to motivateour approach and compare it to other possible approaches not using GSPT and fast-slow de-composition. In Section 3, we state the ﬁrst version of our main result and provide a morefunctional-analytic decomposition proof. In Section 4, we state a variant of the convergence re-sult, which provides stronger convergence under stronger assumptions. The proof of this resultfollows a diﬀerent strategy patching fast dynamics geometrically. In Section 5, we consider anapplication of our results to a prototypical periodically forced planar fast-slow system, whichdisplays interesting oscillatory patterns. This example also illustrates the possible occurrenceand the role of canard trajectories. Remark on notation:

If not further speciﬁed, we always write k · k = k · k for theEuclidean norm. The same remark applies to matrix norms and metrics on ﬁnite-dimensionalspaces, they will all be understood in the Euclidean sense unless speciﬁed otherwise. In this section we provide some necessary background for the two main areas in this paper.We hope that this is going to make the results more accessible for experts in diﬀerent ﬁelds.We restrict ourselves here to the case of planar systems to simplify the notation and presentthe main ideas. The techniques we present are expected to generalize to several classes ofhigher-dimensional systems.

Let ( x, y ) ∈ R and consider the planar fast-slow system ε d x d t = ε ˙ x = f ( x, y ; ε ) , d y d t = ˙ y = g ( x, y ; ε ) , (1)3here ε > x and the slow variable y . For now we shall assume that f : R → R and g : R → R aresuﬃciently smooth but for our main setting we shall considerably weaken this assumption andalso allow the slow vector ﬁeld to be non-autonomous. A classical goal in fast-slow systems isto understand the dynamics of (1) for suﬃciently small ε by showing persistence results fromthe singular limit ε →

0. Setting ε = 0 in (1) yields the slow subsystem0 = f ( x, y ; 0) , ˙ y = g ( x, y ; 0) , (2)which is a diﬀerential-algebraic equation on the critical set C := { ( x, y ) ∈ R : f ( x, y ; 0) = 0 } . (3)Therefore, the slow subsystem (2) only covers the dynamics in a part of phase space. Anotherpossibility is to consider the fast-slow system (1) on the fast time scale s := εt instead of theslow time scale t . Taking the singular limit on the fast time scale leads to the fast subsystem d x d s = x ′ = f ( x, y ; 0) , d y d s = y ′ = 0 , (4)which is a parametrized diﬀerential equation, where the slow variable y acts as a parameter.The fast and slow subsystems are diﬀerent types of diﬀerential equations, which illustrates thesingular perturbation character of fast-slow systems. For (4) the critical set C consists of theequilibrium points (or steady states), while it is a constraint for (2).Several approaches to analyze fast-slow systems exist [Kue15]. Probably the most classicaltechnique is to use asymptotic methods, such as matched asymptotic expansions [MR80, KC96,BO99]. In this approach, we aim to write the solution as( x ( s ) , y ( s )) ∼ K X k =1 x k ( s ) a k ( ε ) , K X k =1 y k ( s ) b k ( ε ) ! + O ( b k +1 ( ε ) , a k +1 ( ε )) , as ε → K ∈ N , where { a k ( ε ) } ∞ k =1 , { b k ( ε ) } ∞ k =1 are asymptotic sequences; the simplest exampleare polynomials a k ( ε ) = ε k = b k ( ε ). In asymptotic analysis, one usually has to match diﬀerentseries by combining solutions obtained of a slow time scale from (2) and via a fast time scalefrom (4). In particular, the crucial point is to exploit the decomposition algebraically.More recently, the development of geometric singular perturbation theory (GSPT) has pro-vided very signiﬁcant additional geometric insight , how we may combine (2)-(4). In this work,we restrict to the case when the critical set C is a manifold but for other cases see [KS01,Sch85, KS15]. Consider a compact submanifold S ⊂ C . Then S is called normally hyper-bolic [Fen79, Jon95, Kap99] if ∂ x f ( p ; 0) = 0 for all p ∈ S ; for a higher-dimensional fast variable,normal hyperbolicity is deﬁned by requiring that the matrix D x f ( p ; 0) is a hyperbolic matrix.Fenichel’s Theorem [Fen79, Jon95, Wig94] guarantees that S perturbs to a locally invariantslow manifold S ε , which is O ( ε )-close to S . The dynamics on S ε is conjugate to the dynamicson S , which converts the singular perturbation problem to a regular perturbation problem neara normally hyperbolic critical manifold. In this view, asymptotic matching becomes geometric4atching of singular limit trajectories from the fast and slow subsystems; e.g., in Figure 1(b),we have to match two fast segments with two slow segments near the two fold points, wherethe critical manifold loses normal hyperbolicity [Kue15].In summary, the crucial point of this discussion for the current work is, that modern GSPT,as well as more algebraic asymptotic matching, crucially exploit the views of decomposition , scaled subsystems , and matching regions/points . There is a very detailed literature available onmany further important topics in fast-slow systems utilizing this strategy and we refer to theliterature review in [Kue15] for a broader view of the area.(a) (b) yx yx C C Figure 2: Sketch of two possible conﬁgurations of the critical manifold C for planar fast-slowsystems. (a) Classical situation with a dimension one (and codimension one) critical manifold.(b) Situation discussed in this paper, when C has the same dimension as the ambient phasespace, i.e., dimension two and codimension zero; see also equation (5).One crucial aspect of the critical manifold in planar systems is that it is generically ofdimension one (or even the empty set) if it is deﬁned via the zeros of a generic smooth mapping f . However, the disadvantage is that we cannot model two-dimensional constrained dynamicsin this context in planar fast-slow systems. For example, consider the fast variable vector ﬁeld f ( x, y ; ε ) :=  − x + y if y > x ,0 if x − < y < x , − x + y + 1 if y < x −

1. (5)The critical manifold C = { ( x, y ) ∈ R : x − < y < x } is a two-dimensional strip. Theclassical normal hyperbolicity assumption does not hold for C since the fast vector ﬁeld isidentically zero inside the strip. In particular, one may ask, what happens in the singular limit ε → ε = 0 is unknown, whereas it is usually assumed in GSPT that it is easier to analyze the fast and slow subsystems, i.e., usually the diﬃcultpart is to control the case 0 < ε ≪

1. In the context of Netushil’s observation, the geometryis signiﬁcantly worse as we only have ∂ C as the main geometric object available with justone-sided estimates near ∂ C for the fast dynamics. Additionally, observe that we work in thecoupled setting of a fast-slow system rather than only with one equation with an input functionas in the classical statement for Netushil’s observation. Here we overcome these diﬃculties using5 suitable local decomposition of trajectories as well as projections to preserve the geometricviewpoint.Other approaches to the problem use very diﬀerent technical approaches, and we brieﬂyreview these approaches in the next section, which also contains some basic background onhysteresis operators. Amongst others, hysteresis eﬀects appear in physics in ﬁelds like ferromagnetism, ferroelectricityor plasticity [May03, BS96, KP89, Vis94, MR15]. They can also be observed in shape memoryeﬀects of certain materials, they are relevant for thermostats in engineering [Vis94], and areused for modelling of certain systems in mathematical biology [GST13, HJ80, Kop06, PKK + , T ], scalarhysteresis operators take an admissible time-dependent input function y : [0 , T ] → R togetherwith an initial value x and return a time-dependent output function x = x ( y, x ), where wecan also view x as a map x : [0 , T ] → R if x is ﬁxed. All scalar rate-independent hysteresisoperators have two properties in common [Vis94, BS96]: Deﬁnition 2.1. (Vol) The output function x ( t ) at time t ∈ [0 , T ] may depend not only on the value of the inputfunction y ( t ) at time t , but on the whole history of y in the interval [0 , t ] . This non-localityin time is often referred to as memory eﬀect, causality or Volterra property: for all y , y in the domain of the operator, for all initial values x , and any t ∈ [0 , T ] it follows thatif y = y in [0 , t ] , then [ x ( y , x )]( t ) = [ x ( y , x )]( t ) ; cf. [Vis94, Chapter III].(RI) The output function x is invariant under time transformations. This means that forany monotone increasing and continuous function φ : [0 , T ] → [0 , T ] with φ (0) = 0 and φ ( T ) = T and for all admissible input functions y it holds [ x ( y ◦ φ, x )]( t ) = x ( y, x )( φ ( t )) , ∀ t ∈ [0 , T ] . In [Vis94, Chapter III], the function φ is also assumed to be bijective, i.e., the deﬁ-nitions diﬀer in the literature. For our purpose one may consider either deﬁnition ofadmissible time transformations. Invariance under time transformations is also calledrate-independence in the literature [MR15, Deﬁnition 1.2.1]. Furthermore, we recall that hysteresis operators may not be described by planar diﬀerentialequations. To see this, consider an ODE of the formdd t x i ( t ) = f ( x i ( t ) , y i ( t )) , x i (0) = x ,i , i ∈ { , } for two diﬀerent input functions y , y and initial values x , , x , . If for a given time t ∈ [0 , T ]it holds x ( t ) = x ( t ) and if y | ( t,T ] = y | ( t,T ] , then clearly x | [ t,T ] = x | [ t,T ] no matter if y | [0 ,t ] =6 x yF + F − Figure 3: Typical ﬁnite-time trajectory of y and x ( y, x ) in the ( y, x )-plane. The initial andﬁnal points are marked with black dots. Note that we allow quite general curves F + , F − as longas the graphs are increasing and Lipschitz. y | [0 ,t ] or even x , = x , . Therefore, the Volterra property cannot be captured in general. Thesolutions x , x of the ODEs on the remaining time interval ( t, T ] depend only on the currentvalues x ( t ) and x ( t ) at time t and on the behaviour of the input functions y and y in theinterval ( t, T ]; see also Figure 1(a). Diﬀerential equations likedd t x ( t ) = f ( x ( t ) , y ( t )) , x (0) = x , are also not rate-independent. To see this, consider a time transformation φ . Then for thesolution operator x = x ( y, x ) of the diﬀerential equation it holds x ( y ◦ φ, x )( t ) = x + Z t f ( x ( s ) , y ( φ ( s ))) d s, whereas we ﬁnd that [ x ( y, x ) ◦ φ ]( t ) = x + Z φ ( t )0 f ( x ( s ) , y ( s )) d s for t ∈ [0 , T ]. In general those two functions do not coincide. In summary, the appearance ofdiﬀerential equations with hysteresis as the singular limit of systems of diﬀerential equationsbrings along completely new features in the evolution dynamics.The class of hysteresis operators represents merely a part of the more general class of rate-independent processes. There are various (often equivalent) ways to represent such processes.In this paper, we will be concerned with so-called scalar generalized play operators which weintroduce in the following [Vis94, Chapter III.2]. Let F − , F + : R → R be two increasingfunctions with F − ≤ F + . If F − and F + are Lipschitz continuous, then one way to representthe solution x = x ( y, x ) ∈ W ,p (0 , T ) of the generalized play operator, which corresponds tothe functions F − , F + with input y ∈ W ,p (0 , T ), 1 ≤ p ≤ ∞ , and x ∈ R , is the solution of thefollowing variational inequality:˙ x ( t )( x ( t ) − ξ ) ≤ ∀ ξ ∈ [ F − ( y ( t )) , F + ( y ( t ))] , a.e. in (0 , T ) ,x ( t ) ∈ [ F − ( y ( t )) , F + ( y ( t ))] in [0 , T ] ,x (0) = min { max { F − ( y (0)) , x } , F + ( y (0)) } . (6)7e do not list all the properties of a generalized play operator here, but refer to the literature forcertain features once we need them. The typical behaviour of the solution x ( y, x ) is depictedin Figure 3. The appearance of rate-independent systems as the (singular) limit of regularizedproblems has been analyzed via several approaches. We only refer to several results, which arerelated to our work, but emphasize that many other research directions in the area of hysteresisoperators are currently being pursued.The ﬁrst equation in (1) given by ε ˙ x = f ( x, y ; ε ) (7)can be viewed as one possible regularization of (6). Equivalent to (6), x ( y, x ) can be repre-sented by a diﬀerential inclusion, see e.g. [Vis94] or [BS96]. Many rate-independent processes areequally described by an energetic formulation via a (local/global) stability condition and an en-ergy balance condition [MR15, Chapter 2 or Chapter 3]. This leads to the notion of local/globalenergetic solutions. There are several other concepts used to describe rate-independent pro-cesses. We refer to [Mie11] or [MRS12] for overviews.In the context of energetic solutions, and the representation of the corresponding solutionsby rate-independent diﬀerential inclusions, one often considers the so-called vanishing-viscositylimit, see [MRS12]. The latter is achieved by regularizing the rate-independent diﬀerentialinclusion, frequently by a term modelling viscosity. The solutions of ε -dependent regularizedproblems are then proved to converge to an energetic solution of the initial problem. Onegoal of the approach is to reveal properties of the energetic solution such as the behaviour atdiscontinuity points, see [MRS12].Consider (7) with a given function y . Special choices of f , F + and F − lead to a convexand/or coercive energy functional for the regularized, as well as for the limit problem. In thiscase, uniform-in- ε -estimates of the norm of x ε by the norm of y or ˙ y in the appropriate spacescan be derived [MR15, Chapter 1.7 or Chapter 3.8] or [MRS12]. Moreover, either (I) an equi-continuity estimate for { x ε } ε , (II) a uniform-in- ε bound of (some) norm of ˙ x ε , or (III) a boundof the total variation of { x ε } ε by y and ˙ y can be derived. For (I), a suitable application of theArzel`a-Ascoli Theorem yields the desired convergence of x ε to a singular limit x ( y, x ). For (II)or (III), a weak compactness argument together with Helly’s selection theorem can be used toprove convergence. For special choices of f , F + and F − , it should be possible to use them alsofor the coupled system (1). Our situation may include non-convex and non-coercive energyfunctionals as well as systems without any energy structure .Furthermore, in our setting, in the formulation as variational inequalities, the generalityof f , F + and F − together with the coupling in the fast-slow system (1) complicates the limitprocedure, since uniform-in- ε -estimates for ˙ x ε , even in L (0 , T ), can no longer be derived. Webypass this problem by projecting x ε in y - x -phase space vertically onto the set C of roots of f , i.e., we employ the geometric one-sided limit available for the critical manifold as discussedabove. For the projection p ε we can even show W , ∞ (0 , T )-bounds, which are uniform in ε .That is, the question of convergence of x ε and y ε to x ( y, x ) and y is shifted to the problem ofshowing that x ε − p ε converges to zero in the limit ε → x ( y, x ) = lim ε → p ε follows the hysteresis law (6), where y = lim ε → y ε solves the fastequation in (1) with x = x ( y, x ). The latter can be proved in our setting.8nother result about hysteresis as the singular limit in ODEs, which can be compared toour problem is derived in [Kre05] assuming the Li´enard case f ( x, y ; ε ) = − F ( x ) + y. The source function y is a priori ﬁxed , i.e., independent of ε and independent of fast-slowcoupling. Furthermore, the set C of this problem has co-dimension

1, diﬀerent from oursetting, where major diﬃculties appear as C has co-dimension 0. Another main diﬀerencein [Kre05] is that the singular limit x of x ε with y independent of ε is the output of a hysteresisoperator of switch type with input y . Since the solutions x ε are continuous, while the limit x isin general discontinuous, uniform convergence in classical spaces cannot be expected in [Kre05].However, uniform convergence results have been obtained for the concept of r -convergence inthe space of regulated functions involving uniform bounds. We completely bypass the problemof showing uniform-in- ε bounds for the oscillations or for the derivative of the solutions x ε in(1) by introducing suitable projection functions p ε below.In fact, our approach substantially diﬀers from previous approaches already by its funda-mental principles and the direct relation to the fast-slow GSPT viewpoint. It also providesseveral new technical tools, and it does not require specialized spaces, ODE structure assump-tions, or the existence of an energy. In particular, this setting is designed to link GSPT andfast-slow systems a lot more directly to hysteresis limits than has been possible so far. The main system we are going to study is a planar fast-slow system with the added generaliza-tion that the slow vector ﬁeld may also depend upon time ε ˙ x = f ( x, y ) , (8)˙ y = g ( x, y, t ) , (9)augmented with the initial condition ( x (0) , y (0)) = ( x , y ). Our argument also allows for thegeneralization g ( x, y, t ) = g ( x, y, t ; ε ) and f ( x, y ) = f ( x, y ; ε ) as long as f, g are bounded for all ε ∈ [0 , ε ] for some suﬃciently small ε >

0. We shall not make this additional generalizationexplicit and just omit the ε -dependence in f, g . We denote the solutions of (8)-(9) by ( x ε , y ε ) =( x ε ( t ) , y ε ( t )) to emphasize the dependence upon ε . We are interested in the limit as ε → J T := (0 , T ). The situation outlined already in Sections 2.1-2.2 is madeprecise by the following main assumptions on f, g :(A1) f, g are Lipschitz continuous with Lipschitz constants L f and L g respectively; we can alsoallow only local Lipschitz continuity in our results but refrain from doing so as it justcomplicates the notation.(A2) The function f satisﬁes f ( x, y ) < x > F + ( y ) ,f ( x, y ) = 0 if x ∈ [ F − ( y ) , F + ( y )] ,f ( x, y ) > . (10)for two functions F + > F − . 9 yF + F − f = 0 f < f > ( y ε ( t ) , x ε ( t )) ( y ε ( t ) , p ε ( t )) ( y ε ( t ) , p ε ( t ))( y ε ( t ) , x ε ( t ))( y ε ( t ) , x ε ( t ))= ( y ε ( t ) , p ε ( t )) Figure 4: Sign behaviour of f and projection to p ε in the ( y, x )-plane. Note that the criticalmanifold C = { f = 0 } is two-dimensional and bounded by the two curves F ± ; see also assump-tion (A2). The projection along the vertical fast coordinate to C is denoted by p ε and onlyacts on the fast variable.(A3) F + , F − are Lipschitz continuous with Lipschitz constants L + and L − ; we deﬁne L ± :=max { L + , L − } . F + , F − are monotone increasing functions.(A4) g satisﬁes the growth assumption | g ( x, y, t ) | ≤ M ( t )(1 + G ( x ) + | y | ) (11)with M > G bounded.Assumption (A1) essentially assures the existence of unique (local) solutions of the system(8)-(9). (A2) forces the solutions ( x ε , y ε ) to converge to the critical manifold C as ε →

0, inparticular, the signs of f are given so that C can be viewed as attracting with respect to anyfast trajectory movement. Assumption (A3) is going to yield the required Lipschitz continuityof the corresponding limiting generalized play operator and enables us to carry over Lipschitzbounds of y ε , which are independent of ε , to F + ( y ε ) and F − ( y ε ). Amongst others, (A2) and(A3) together are eventually going to allow us to show bounds of x ε in C( J T ) = C ( J T ), whichare independent of ε . Assumption (A4) yields that y ε is bounded in W , ∞ ( J T ) independentlyof ε . In particular, it provides a growth bound of the slow variable, which could just becomeunbounded, i.e., other growth bounds are expected to work as well.Since the derivative of x ε cannot be bounded independently of ε , we introduce a projectionfunction for which such a bound can be derived. As shown in Figure 4, let p ε = p ( x ε , y ε ) bedeﬁned by p ( x ε , y ε ) := min { max { x ε , F − ( y ε ) } , F + ( y ε ) } . The family of projections p ε is continuous because F − , F + , x ε , y ε , max and min are all continuous.The main theorem we are going to prove in several steps below is the following: Theorem 3.1.

Suppose the assumptions (A1)-(A4) hold and let q ∈ (1 , + ∞ ) . Denote by ( x, y )10 he unique solution of ˙ y ( t ) = g ( x ( t ) , y ( t ) , t ) a.e. in J T , (12) y (0) = y , (13)˙ x ( t )( x ( t ) − ξ ) ≤ ∀ ξ ∈ [ F − ( y ( t )) , F + ( y ( t ))] , a.e. in J T , (14) x (0) = min { max { F − ( y ) , x } , F + ( y ) } , (15) x ( t ) ∈ [ F − ( y ( t )) , F + ( y ( t ))] in J T . (16) Then we have x ∈ W , ∞ ( J T ) and y ∈ C ( J T ) ∩ C( J T ) . Moreover, the main convergence resultis x ε → x in L q ( J T ) and y ε → y in W ,q ( J T ) as ε → , i.e., the singular limit x is the solution of a generalized play operator for the curves F + and F − with input y . Theorem 3.1 shows that the singular limit of the non-autonomous planar fast-slow systemis (8) with (9) replaced by a hysteresis operator. An example of a trajectory of a generalizedscalar play operator is shown in Figure 3. Note that the time-dependence of the slow vector ﬁeldcan indeed generate highly nontrivial dynamics inside C . Of course, Theorem 3.1 only providesa partial solution of Netushil’s conjecture as we have not characterized all classes of hysteresisoperators, which arise as singular fast-slow limits. We need to derive several important auxiliaryresults before we can prove Theorem 3.1. Lemma 3.2.

Fix ε > and assume (A1)-(A4) hold. Then we can bound y ε in C( J T ) ∩ C ( J T ) and p ε in W , ∞ ( J T ) independent of ε . More precisely, there exists a constant K > such that k y ε k C( J T ) + k y ε k C ( J T ) + k p ε k W , ∞ ( J T ) ≤ K . (17) More precisely, there exists a constant K > such that k x ε − p ε k C( J T ) ≤ K . (18) The constants K , K > are independent of ε .Proof. For this proof, let c > ε . First,we are going to show that y ε is bounded in C( J T ) ∩ C ( J T ). In a second step, we prove that p ε is bounded in W , ∞ ( J T ). By assumption (A4) we obtain an estimate of the form | y ε ( t ) | ≤ | y | + Z t M ( s )(1 + G ( x ε ( s )) + | y ε ( s ) | ) d s ≤ c + c Z t | y ε ( s ) | d s (19)for some constants c , c >

0; note that the constants in (19) may grow in time as M is time-dependent but the constants remain ﬁnite for every ﬁxed ﬁnal time T >

0. Applying Gronwall’sLemma to (19) implies k y ε k C( J T ) ≤ c . Equation (9) and assumption (A4) yield k y ε k C ( J T ) ≤ c .For further use we note that this implies that y ε is Lipschitz continuous on J T with a Lipschitzconstant independent of ε . Regarding p ε , it follows from the deﬁnition of the projection that p ε ( t ) ∈ [ F − ( y ε ( t )) , F + ( y ε ( t ))] . (20)11ince F − and F + are Lipschitz continuous by (A3), we obtain from (20) that k p ε k C( J T ) ≤ c . Itremains to to show that the norm of p ε in W , ∞ ( J T ) can be bounded independently of ε . Againfrom the deﬁnition of p ε it follows for a.e. t ∈ J T that p ε ( t ) ∈ ( F − ( y ε ( t )) , F + ( y ε ( t ))) if and only if ˙ p ε ( t ) = 0 = ˙ x ε ( t ) , and otherwise, p ε ( t ) ∈ { F − ( y ε ( t )) , F + ( y ε ( t )) } . Assumption (A3) together with the bounds for y ε , which we have shown already, yield that F − ( y ε ( · )) and F + ( y ε ( · )) are Lipschitz continuouswith Lipschitz constant independent of ε . Let t < t be given and suppose p ε ( t ) = F + ( y ε ( t )).If p ε ( t ) = F + ( y ε ( t )) in the time interval [ t , t ] then | p ε ( t ) − p ε ( t ) | = | F + ( y ε ( t )) − F + ( y ε ( t )) | ≤ c | t − t | by Lipschitz continuity of F + ( y ε ( · )). Otherwise, there exist times t (1) ∈ [ t , t ) and t (2) ∈ ( t (1) , t ]such that x ε ( t ) ≥ F + ( y ε ( t )) ∀ t ∈ [ t , t (1) ] and x ε ( t ) = F + ( y ε ( t (1) )) ∀ t ∈ [ t (1) , t (2) ] . Note that in the case when t = t (1) we imply the empty set if we write [ t , t (1) ]. In this settingwe ﬁnd | p ε ( t ) − p ε ( t (2) ) | = | F + ( y ε ( t )) − F + ( y ε ( t (1) )) | ≤ c | t − t (1) | ≤ c | t − t (2) | . We ﬁrst choose t (1) ∈ [ t , t ) and then t (2) ∈ ( t (1) , t ] both maximal. If t (2) = t and F + ( y ε ( t (1) )) = F − ( y ε ( t (2) )) then we apply the same reasoning with F − on the interval [ t (2) , t ]. We obtain apartition t (0) = t ≤ t (1) ≤ · · · ≤ t ( k ) = t of [ t , t ] such that | p ε ( t ) − p ε ( t ) | ≤ k X i =1 | p ε ( t ( i − ) − p ε ( t ( i ) ) | ≤ c k X i =1 | t ( i − − t ( i ) | = c | t − t | . This implies that p ε is Lipschitz continuous independent of ε . By a standard embedding theo-rem [Eva02, Chapter 5.8.2.b, Theorem 4] it follows that k p ε k W , ∞ ( J T ) ≤ c. This concludes the proof of the ﬁrst bound (17) in the result. It remains to show that k x ε − p ε k C( J T ) is bounded. We calculate | x ε ( t ) − p ε ( t ) | = | x ε (0) − p ε (0) | + Z t ( ˙ x ε ( s ) − ˙ p ε ( s )) x ε ( s ) − p ε ( s ) | x ε ( s ) − p ε ( s ) | d s = | x (0) − p (0) | + 1 ε Z t f ( x ε ( s ) , y ε ( s )) x ε ( s ) − p ε ( s ) | x ε ( s ) − p ε ( s ) | d s − Z t ˙ p ε ( s ) x ε ( s ) − p ε ( s ) | x ε ( s ) − p ε ( s ) | d s. By the stability assumption (A2) and the deﬁnition of p ε it holds f ( x ε ( s ) , y ε ( s )) x ε ( s ) − p ε ( s ) | x ε ( s ) − p ε ( s ) | ≤ s ∈ [0 , t ], cf. Figure 4. Hence, it follows that | x ε ( t ) − p ε ( t ) | − ε Z t f ( x ε ( s ) , y ε ( s )) x ε ( s ) − p ε ( s ) | x ε ( s ) − p ε ( s ) | d s ≤ | x (0) − p (0) | + Z t | ˙ p ε ( s ) | d s (22)and the left side is greater or equal than zero. We have already shown that the right side in (22)is bounded independently of ε . Consequently, the estimate (22) implies k x ε − p ε k C( J T ) ≤ c, which ﬁnishes the proof.Since the bounds in Lemma 3.2 are independent of ε , we can proceed by a compactnessargument and prove convergence results for appropriate subsequences. Lemma 3.3.

Fix any q ∈ (1 , + ∞ ) and assume (A1)-(A4), then the following results hold:(C1) lim ε → k x ε − p ε k L q ( J T ) = 0 .(C2) Given any sequence { ε j } ∞ j =1 such that ε j → as j → + ∞ , there exists a subsequence { ε j k =: ε k } ∞ k =1 and functions x, y ∈ W ,q ( J T ) such that p ε k → x and y ε k → y (23) as ε k → weakly in W ,q ( J T ) and strongly in C( J T ) .(C3) In L q ( J T ) as ε k → , we have x ε k → x. (24) Proof.

In order to prove (C1), we compute for arbitrary ε > t ∈ J T ( x ε ( t ) − p ε ( t )) = ( x (0) − p (0)) + 2 Z t ( ˙ x ε ( s ) − ˙ p ε ( s ))( x ε ( s ) − p ε ( s )) d s = ( x (0) − p (0)) + 2 ε Z t f ( x ε ( s ) , y ε ( s ))( x ε ( s ) − p ε ( s )) d s − Z t ˙ p ε ( s ) ( x ε ( s ) − p ε ( s )) d s. This calculation together with Lemma 3.2 and (21) yields0 ≤ ( x ε ( t ) − p ε ( t )) − ε Z t f ( x ε ( s ) , y ε ( s ))( x ε ( s ) − p ε ( s )) d s ≤ c, (25)where c > ε . The bounds in (25) and the signcondition (21) imply f ( x ε ( s ) , y ε ( s ))( x ε ( s ) − p ε ( s )) → s ∈ J T as ε →

0. By deﬁnition of p ε and assumption (A2) we conclude that x ε ( s ) − p ε ( s )tends to zero for a.e. s ∈ J T as ε →

0. This result, together with (18) and the Lebesguedominated convergence theorem, implieslim ε → k x ε − p ε k L q ( J T ) = 0 , q ∈ (1 , + ∞ ), which concludes the proof of (C1). It remains to show (C2)-(C3) con-cerning subsequences for a given sequence { ε j } ∞ j =1 with lim j →∞ ε j = 0. By Lemma 3.2 thefunctions y ε and p ε are bounded in W , ∞ ( J T ) independently of ε and hence are in W ,q ( J T )for any q ∈ (1 , + ∞ ]. Using the Banach-Alaoglu Theorem [Rud91], it follows that there is asubsequence { ε j k =: ε k } ∞ k =1 of { ε j } ∞ j =1 and that there exist functions x, y ∈ W ,q ( J T ) such that p ε k ⇀ x and y ε k ⇀ y (26)as ε k → ,q ( J T ). Because W ,q ( J T ) is compactly embedded in C( J T ) for 1 < q ≤ ∞ [AF03, Theorem 6.3], this convergence is strong in C( J T ). Since x ε − p ε → q ( J T ) we ﬁnallyconclude x ε k → x in L q ( J T ) as ε k → x ε , y ε and p ε actually converge in a certain sense, wewould like to understand the behaviour of the limit functions. We can also improve the typeof convergence of the subsequence { y ε k } ∞ k =1 . Lemma 3.4.

Consider the same assumptions and the notation of Lemma 3.3. The functions x and y solve the system ˙ y ( t ) = g ( x ( t ) , y ( t ) , t ) a.e. in J T , (27) y (0) = y , (28)˙ x ( t )( x ( t ) − ξ ) ≤ ∀ ξ ∈ [ F − ( y ( t )) , F + ( y ( t ))] , a.e. in J T , (29) x (0) = min { max { F − ( y ) , x } , F + ( y ) } , (30) x ( t ) ∈ [ F − ( y ( t )) , F + ( y ( t ))] in J T . (31) Furthermore, y ε k → y in W ,q (0 , T ) as k → ∞ , i.e., we have strong convergence in W ,q (0 , T ) of the slow dynamics for a subsequence.Proof. We ﬁrst show that y solves (27)-(28) with x = x and improve the convergence of { y ε k } ∞ k =1 .Lemma 3.3, assumptions (A1) and (A4), and the Lebesgue dominated convergence theoremyield that g ( x ε k , y ε k , t ) → g ( x, y, t ) (32)in L q ( J T ) as ε k →

0. By Lemma 3.3, y ε k converges to y uniformly in J T . For t ∈ J T we obtainby using y ε k = y + R t g ( x ε k , y ε k , s ) d s that (cid:12)(cid:12)(cid:12)(cid:12) y ( t ) − y − Z t g ( x, y, s ) d s (cid:12)(cid:12)(cid:12)(cid:12) ≤ | y ( t ) − y ε k ( t ) | + Z t | g ( x, y, s ) − g ( x ε k , y ε k , s ) | d s → ε k → t < + ∞ for the last term to get convergence in L .This shows that y solves (27)-(28) with x = x . Together with (32) we may conclude that y ε k → y in W ,q (0 , T ).Next, we are going to show that x solves (29)-(31) with y = y . First, we will deal with (30)-(31). By deﬁnition of p ε and with Lemma 3.3 we have x (0) = lim k →∞ p ε k (0) = min { max { x , F − ( y ) } , F + ( y ) } (34)as well as x ( t ) ∈ [ F − ( y ( t )) , F + ( y ( t ))] ∀ t ∈ J T , (35)14hich proves (30)-(31). Hence, it remains to show the variational inequality (29), which weaccomplish in two steps. First, we deal with initial data in the interior of the critical manifold C and in a second step we are going to consider dynamics on the boundary. Fix t ∈ J T andsuppose we start in the interior x ( t ) ∈ ( F − ( y ( t )) , F + ( y ( t ))) . Continuity of x , F − ( y ( · )) and F + ( y ( · )) implies that there is some interval J ⊂ J T with t ∈ J such that x ( t ) ∈ ( F − ( y ( t )) , F + ( y ( t ))) for all t ∈ J . We deﬁne the distance to the boundary as δ J := min t ∈ J { x ( t ) − F − ( y ( t )) , F + ( y ( t )) − x ( t ) } . By Lemma 3.3 and assumption (A3), we can ﬁnd some ε (0) > ε k < ε (0) andall t ∈ J T the following estimate holds | F − ( y ε k ( t )) − F − ( y ( t )) | + | F + ( y ε k ( t )) − F + ( y ( t )) | + | p ε k ( t ) − x ( t ) | < δ J . (36)This implies p ε k ( t ) ∈ ( F − ( y ε k ( t )) , F + ( y ε k ( t ))) as well asmin t ∈ J { p ε k ( t ) − F − ( y ε k ( t )) , F + ( y ε k ( t )) − p ε k ( t ) } ≥ δ J . for all ε k < ε (0) and t ∈ J . By deﬁnition of the projection p ε k we immediately ﬁnd p ε k ( t ) = x ε k ( t )and ˙ p ε k ( t ) = 0 for a.e. t ∈ J so that p ε k ( t ) = p ε k ( t ) for t ∈ J and the variational inequality isjust satisﬁed with zero almost everywhere in J . As the second step, suppose we start on theboundary of the critical manifold which occurs e.g. if x ( t ) = F + ( y ( t )). Then there is someinterval J with t ∈ J such that x ( t ) > F − ( y ( t )) for all t ∈ J ; note that we slightly overloadthe notation here and again use J . A similar argument leads to (36) and the conclusion thatfor all for all ε k < ε (0) and t ∈ J we now have p ε k ( t ) > F − ( y ε k ( t )) + δ J . By deﬁnition of p ε k it follows x ε k ( t ) ≥ F − ( y ε k ( t )) and ˙ x ε k ( t ) ≤ t ∈ J . In particular,we have a negative sign for ˙ x ε k ( t ), which we want to transfer to the limit and show that˙ x ≤ , a.e. in J . (37)To prove (37), assume in contradiction that x ( t ) > x ( t ) for some t , t ∈ J , t < t . (38)Then there exists ε (1) > p ε k ( t ) > p ε k ( t ) if ε k < ε (1) . Using the Lipschitz continuityof the p ε k from Lemma 3.2, we can ﬁnd t (1) ∈ ( t , t ) and δ > p ε k ( t ) < p ε k ( t ) − δ for all t ∈ [ t , t (1) ] and all ε k < ε (1) . By Lemma 3.2, x ε k − p ε k converges to zero a.e. in J T .Hence there must be some t (2) ∈ [ t , t (1) ] and some ε (2) < ε (1) such that x ε k ( t (2) ) < p ε k ( t ) − δ for some δ < δ and all ε k < ε (2) . But then because ˙ x ε k ≤ J it follows x ε k ( t ) < p ε k ( t ) − δ ε k < ε (2) and all t ∈ [ t (2) , t ]. Again because x ε k − p ε k converges to zero a.e. in J T thisyields x ( t ) = lim k →∞ ( p ε k ( t ) − x ε k ( t ) + x ε k ( t )) < lim k →∞ ( p ε k ( t ) − x ε k ( t ) + p ε k ( t ) − δ ) = x ( t ) − δ for all ε k < ε (2) and a.e. t ∈ [ t (2) , t ]. By continuity of x this estimate holds for all t ∈ [ t (2) , t ]which gives the contradiction x ( t ) < x ( t ) − δ . Hence, (37) indeed holds also in the limit. From (35) and the previous results we may nowconclude that ˙ x < J , which has positive measure, is only possible if x = F + ( y )a.e. in this set. Note that this precisely gives one case of the variational inequality.Similarly, we can deal with the case x ( t ) = F − ( y ( t )) to show that ˙ x ≥ x > x = F − ( y ) a.e. in this set. Hence, it follows, wehave for a.e. t ∈ J T ˙ x ( t )( x ( t ) − ξ ) ≤ ∀ ξ ∈ [ F − ( y ( t )) , F + ( y ( t ))] . This proves that x solves (29)-(31) with y = y .We observe that the proof of Lemma 3.4 essentially relied on the convergence of the fastprojection as ε →

0. Furthermore, the argument treats the critical manifold C in three diﬀer-ent phases according whether we are on the boundaries deﬁned by F + , F − or in the interior.Since many other hysteresis operators can also be deﬁned by variational inequalities, we ac-tually may hope that our strategy can be carried over to other singular limits not covered bystandard normally hyperbolic Fenichel theory or other fast-slow systems methods. Finally, wecan summarize the previous results and prove the main result. Proof. (of Theorem 3.1) Most of the statements already follow from combining Lemma 3.2,Lemma 3.3 and Lemma 3.4. However, we still have to prove that ( x, y ) are uniquely determinedby (12)-(16).First, we make a few observations. The limit x is a generalized play operator for the Lipschitzcontinuous curves F + and F − with input y = y . By [Vis94, III.2.,Theorem 2.2], this generalizedplay is a Lipschitz continuous hysteresis operator from C( J T ) × R to C( J T ), where the secondinput variable is the initial condition x . The map g is Lipschitz continuous by (A1).To prove uniqueness, we argue by contradiction. Suppose that there are two pairs of solu-tions ( x , y ) and ( x , y ) of (12)-(16). Since y and y are continuous, for t close enough to 0,we have | y ( t ) − y ( t ) | ≤ Z t | g ( x ( s ) , y ( s ) , s ) − g ( x ( s ) , y ( s ) , s ) | d s ≤ L g Z t | x ( s ) − x ( s ) | + | y ( s ) − y ( s ) | d s ≤ c Z t sup ≤ τ ≤ s | y ( τ ) − y ( τ ) | + | y ( s ) − y ( s ) | d s. We used Lipschitz continuity of the hysteresis operator for the last estimate. Therefore,sup ≤ τ ≤ t | y ( τ ) − y ( τ ) | ≤ c Z t sup ≤ τ ≤ s | y ( τ ) − y ( τ ) | d s, ≤ τ ≤ t | y ( τ ) − y ( τ ) | = 0. This argument can be repeatedfor another small time interval so that ﬁnally y = y . Consequently, x = x as well. Thisshows uniqueness of x and y .Concerning the regularity of x and y note that y ∈ C ( J T ) because g ( x ( · ) , y ( · ) , · ) is contin-uous. Since y is also continuous in J T it follows y ∈ W , ∞ ( J T ). We then obtain x ∈ W , ∞ ( J T )by [Vis94, III.2.,Theorem 2.3].The last step is the convergence result, which is relatively simple using the intermediateresults. Indeed, by Lemma 3.3 and Lemma 3.4, every sequence { ε j } ∞ j =1 with ε j → { ε j k =: ε k } ∞ k =1 such that y ε k → y in W ,q ( J T ) and x ε k → x in L q ( J T ) . Because x and y are the unique solution of (12)-(16) we conclude that this convergence holdsfor the whole sequence { ( x ε j , y ε j ) } ∞ j =1 .Note that the strategy of our proof presented in this section depends crucially on the fastvariable convergence, which is dealt with via weak convergence ﬁrst. Our second approachreplaces this step of the argument using a more geometric strategy based upon local lineariza-tion, which actually relies on additional assumptions on the diﬀerentiability of the vector ﬁeld;hence, it complements the approach presented in this section. As before, we consider the fast-slow non-autonomous planar system (8)-(9) on a ﬁnite timeinterval J T = (0 , T ). However, we strengthen the assumption (A1) to the following:(A1’) g ∈ C ( R ) and f ∈ C ( R ).Essentially we are going to get a stronger convergence result if we assume more diﬀerentia-bility. The result we are going to prove in this section is a variant/extension of Theorem 3.1. Theorem 4.1.

Suppose the assumptions (A1’), (A2)-(A4) hold. Let ( x, y ) be the unique solu-tion of ˙ y ( t ) = g ( x ( t ) , y ( t ) , t ) a.e. in J T , (39) y (0) = y , (40)˙ x ( t )( x ( t ) − ξ ) ≤ ∀ ξ ∈ [ F − ( y ( t )) , F + ( y ( t ))] , a.e. in J T , (41) x (0) = min { max { F − ( y ) , x } , F + ( y ) } , (42) x ( t ) ∈ [ F − ( y ( t )) , F + ( y ( t ))] in J T . (43)(44) Then y ∈ C( J T ) ∩ C ( J T ) and y ∈ W , ∞ ( J T ) . For arbitrary η > , there exists an ε η > suchthat for all ε ∈ (0 , ε η ) there exists a time t ( ε ) ∈ J T with k y ε − y k C( J T ) + k x ε − x k C[ t ( ε ) ,T ] < η . (45) If x ∈ [ F − ( y ) , F + ( y )] , then t ( ε ) = 0 . Otherwise, t ( ε ) ≤ εC ( η ) for some C ( η ) > . Thisimplies the following: N1) If x ∈ [ F − ( y ) , F + ( y )] , then y ε → y in C ( J T ) ∩ C ( J T ) and x ε → x in C( J T ) .(N2) Otherwise, for arbitrary η > , y ε → y in C ( J T ) ∩ C ( η , T ) and x ε → x in C([ η , T ]) .(N3) For any q ∈ (1 , + ∞ ) , y ε → y in W ,q ( J T ) and x ε → x in L q ( J T ) . In particular, note that the conclusions of convergence to a hysteresis operator in the singularlimit are now obtained in stronger norms in (N2) but the convergence result of Theorem 3.1stated in (N3) obviously still holds. We do not expect any stronger convergence in (N2) for thefast variable, even if the diﬀerentiability of f, g is increased. This is reminiscent of the classicalresults from Fenichel Theory [Fen79, Jon95, Kue15] as fast subsystem trajectories genericallydevelop non-diﬀerentiable points when connecting to a critical manifold. To prove Theorem 4.1,we need some additional notation. First, note that by Lemma 3.2, trajectories remain bounded.

Deﬁnition 4.2.

Let M be a compact rectangle such that ( x ε ( t ) , y ε ( t )) is contained in M for all ε > and all t ∈ J T . • We introduce M := M ∩ { ( x, y ) : f ( x, y ) = 0 } , M + := M ∩ { ( x, y ) : f ( x, y ) < } and M − := M ∩ { ( x, y ) : f ( x, y ) > } . • We deﬁne the constants C f , C g , C Dg , C Df , C D f , C D g > by the upper bounds of thenorms of f, g, Dg, Df, D f and D g on M . Moreover, we set C M := max {k w k : w ∈ M } , C D := max { C D f , C D g } and L := max { L f , L g } ; cf. assumptions (A1) and (A1’). We remark that the notation of M + and M − corresponds to F + and F − above and the signsubscripts are chosen so that [ F − , F + ] is an interval. Deﬁnition 4.3.

For w = ( x w , y w ) ∈ M and ε > we write x ε,w and y ε,w for the solutionof (8) - (9) with initial value ( x (0) , y (0)) = w and on the time interval J T . For ( τ , τ ) ⊂ J T wedenote by x ε,w , ( τ ,τ ) and y ε,w , ( τ ,τ ) the solution of (8) - (9) with initial value ( x ( τ ) , y ( τ )) = w and on the time interval ( τ , τ ) . The additional subscript w will be necessary in the proof as we piece together severallocal results comparing linear and nonlinear terms. This approach is similar to a strategy ofpatching sample paths used in the context of stochastic fast-slow systems [BG06, BGK15]; seealso Figure 5. With w ∈ M , t ∈ J T and ε > ε ˙ x ( t ) = f ( w ) + [D f ( w )] (cid:18)(cid:18) x ( t ) y ( t ) (cid:19) − w (cid:19) for t > t , (46) x ( t ) = x w , (47)˙ y ( t ) = g ( w, t ) + [ ∂ ( x,y ) g ( w, t )] (cid:18)(cid:18) x ( t ) y ( t ) (cid:19) − w (cid:19) + [ ∂ t g ( w, t )]( t − t ) for t > t , (48) y ( t ) = y w . (49)18 x C C + Figure 5: Sketch of the approximation argument. The actual fast-slow system trajectory fromDeﬁnition 4.3 is shown as a solid curve (black, with arrows). On the fast scale we use localapproximation in balls (grey disks) to obtain the curve in Deﬁnition 4.4 (dashed black). Thecritical manifold boundary C + is shown as well (thick grey curve). Furthermore, we always workwith the approximation up to a given neighbourhood, which scales with δ as in Deﬁnition 4.6;the neighbourhood is indicated as well (dotted grey curve).We group the terms containing ( x ( t ) , y ( t )) and the remaining terms together, write F ( w ) + [ F ( w )] (cid:18) x ( t ) y ( t ) (cid:19) := f ( w ) + [D f ( w )] (cid:18)(cid:18) x ( t ) y ( t ) (cid:19) − w (cid:19) ,G ( w, t , t ) + [ G ( w, t )] (cid:18) x ( t ) y ( t ) (cid:19) := g ( w, t ) + [ ∂ ( x,y ) g ( w, t )] (cid:18)(cid:18) x ( t ) y ( t ) (cid:19) − w (cid:19) + [ ∂ t g ( w, t )]( t − t ) , and denote by C F , C F , C G , C G > F j , G j with j ∈ { , } in M and J T . We also set C F,G := max { C F , C G , C F , C G } . The next step is to deﬁne the local approximations of the solution and patch them together ona given ﬁnite-time interval.

Deﬁnition 4.4.

For v = ( x v , y v ) ∈ M , for given ε, θ > and ( τ , τ ) ⊂ J T , we deﬁne x lin ε,θ,v, ( τ ,τ ) and y lin ε,θ,v, ( τ ,τ ) in the following way:(S1) Let v (0) := v , τ (0) := τ and deﬁne x lin ε,θ,v, ( τ ,τ ) and y lin ε,θ,v, ( τ ,τ ) ﬁrst on the interval [ τ (0) , τ (1) ) as the solution of (46) - (49) with initial value w = v (0) and initial time τ (0) . Let τ (1) bethe ﬁrst time in ( τ , τ ] with | v (0) x − x lin ε,θ,v, ( τ ,τ ) ( τ ) | = θ or τ (1) = τ . (50) We set v (1) := ( x lin ε,θ,v, ( τ ,τ ) ( τ ) , y lin ε,θ,v, ( τ ,τ ) ( τ )) . S2) For given v ( i ) ∈ M and τ ( i ) ∈ ( τ , τ ) , i ≥ , we deﬁne x lin ε,θ,v, ( τ ,τ ) and y lin ε,θ,v, ( τ ,τ ) induc-tively on the interval [ τ ( i ) , τ ( i +1) ) by the solution of (46) - (49) with initial value w = v ( i ) and initial time τ ( i ) . τ ( i +1) is the ﬁrst time in ( τ ( i ) , τ ] with | v ( i ) x − x lin ε,θ,v, ( τ ,τ ) ( τ i +1 ) | = θ or τ ( i +1) = τ . (51) We set v ( i +1) := ( x lin ε,θ,v, ( τ ,τ ) ( τ i +1 ) , y lin ε,θ,v, ( τ ,τ ) ( τ i +1 )) . If τ ( i +1) < τ we repeat Step (S2).(S3) After ﬁnitely many steps this deﬁnes x lin ε,θ,v, ( τ ,τ ) and y lin ε,θ,v, ( τ ,τ ) on the interval [ τ , τ ] . Although Deﬁnition 4.4 may look complicated, it is actually just a piecewise deﬁnition ofthe linearized solution using hitting times in (50)-(51).

Deﬁnition 4.5.

Let δ > be given. Then by assumption (A2) we can deﬁne f + ( δ ) < by f + ( δ ) := 12 max { f ( x, y ) : ( x, y ) ∈ M + , dist(( x, y ) , M ) ≥ δ } . Similarly we deﬁne f − ( δ ) > by f − ( δ ) := 12 min { f ( x, y ) : ( x, y ) ∈ M − , dist(( x, y ) , M ) ≥ δ } . Furthermore, it will be helpful to introduce the following notation min {| f + ( δ ) | , f − ( δ ) } =: f m ( δ ) . (52)The next lemma is the main step to estimate the deviation of the solution obtained fromthe linearized process by patching to the true solution. To simplify the statement and theproof, we use for the next result the notation x ε,w := x w , y ε,w := y w , x lin := x lin ε,θ,v, ( τ ,τ ) and y lin := y lin ε,θ,v, ( τ ,τ ) . Lemma 4.6.

Let δ > be given such that M \ ( M + B (0 , δ )) = ∅ and let v = ( x v , y v ) ∈ M \ ( M + B (0 , δ )) . Furthermore, consider any τ ∈ [0 , T ) and deﬁne ε δ := min ( f m ( δ )2(1 + L ± ) C F,G (cid:20)(cid:18) √ (cid:18) C M + 2 C F,G f m ( δ ) (cid:19) e C F,G /f m ( δ ) + 1 (cid:19)(cid:21) − , ) . Let ε ∈ (0 , ε δ ) and θ ∈ (cid:16) , min { f m ( δ ) C Df , } (cid:17) be arbitrary but ﬁxed. Denote by τ > τ the ﬁrsthitting time such that ( x lin , y lin ) ∈ M + B (0 , δ ) or τ = T . We have have the local time estimate | τ ( j +1) − τ ( j ) | ≤ εθf m ( δ ) , for all j ∈ { , · · · , K − } , (53) and the global number K of time intervals ( τ ( j ) , τ ( j +1) ) in Deﬁnition 4.4 satisﬁes K ≤  L ± ) dist( v, M + B (0 , δ )) θ (cid:16) − εε δ (cid:17)  ≤ L ± ) dist( v, M + B (0 , δ )) θ (cid:16) − εε δ (cid:17) + 1 . (54)20 n particular, this implies the global time estimate | τ − τ | ≤ ε  L ± ) dist( v, M + B (0 , δ )) f m ( δ ) (cid:16) − εε δ (cid:17) + 1 C Df  . (55) Moreover, for w ∈ M and arbitrary t ∈ [ τ , τ ] there holds | x w ( t ) − x lin ( t ) | + | y w ( t ) − y lin ( t ) | ≤ (cid:0) | x w ( τ ) − x v | + | y w ( τ ) − y v | + θ K (cid:1) K , (56) where the constants K , K depend upon the given data as follows K =  L ± ) dist( v, M + B (0 , δ )) f m ( δ ) (cid:16) − εε δ (cid:17) + 1 C Df  (cid:18) C D + ε f m ( δ ) (cid:19) , K = exp  L  L ± ) dist( v, M + B (0 , δ )) f m ( δ ) (cid:16) − εε δ (cid:17) + 1 C Df  Proof.

For j ∈ { , · · · , K − } let ( τ ( j ) , τ ( j +1) ) be an interval as in Deﬁnition 4.4. On thisinterval, we have (cid:18) x lin ( t ) y lin ( t ) (cid:19) = exp (cid:20) ( t − τ ( j ) ) (cid:18) ε F ( v ( j ) ) G ( v ( j ) , τ ( j ) ) (cid:19)(cid:21) v ( j ) + Z tτ ( j ) exp (cid:20) ( t − s ) (cid:18) ε F ( v ( j ) ) G ( v ( j ) , τ ( j ) ) (cid:19)(cid:21) (cid:18) ε F ( v ( j ) , s ) G ( v ( j ) ) (cid:19) d s. (57)Consider the matrix (cid:18) [ F ( v ( j ) )] (1) [ F ( v ( j ) )] (2) [ G ( v ( j ) , τ ( j ) )] (1) [ G ( v ( j ) , τ ( j ) )] (2) (cid:19) := (cid:18) F ( v ( j ) ) G ( v ( j ) , τ ( j ) ) (cid:19) . Note that the entries of the matrix are bounded by C F,G . By deﬁnition of the matrix exponentialit follows that for all s, t ∈ [ τ ( j ) , τ ( j +1) ](0 , (cid:20) exp (cid:20) ( t − s ) (cid:18) ε F ( v ( j ) ) G ( v ( j ) , τ ( j ) ) (cid:19)(cid:21)(cid:21) = ( m ( t, s ) , m ( t, s )) , where for l ∈ { , } a direct calculation yields | m l ( t, s ) | ≤ ∞ X k =1 (( t − s ) C F,G ) k k ! (cid:18) ε (cid:19) k − = C F,G ( t − s ) ∞ X k =0 (cid:18) ( t − s ) C F,G (1 + ε ) ε (cid:19) k k + 1)! ≤ C F,G ( t − s ) exp (cid:18) C F,G ( t − s ) ε (cid:19) ≤ C F,G ( τ ( j +1) − τ ( j ) ) exp (cid:18) C F,G ( τ ( j +1) − τ ( j ) ) ε (cid:19) . Consequently, we also deduce that( | m ( t, s ) | + | m ( t, s ) | ) / ≤ √ C F,G ( τ ( j +1) − τ ( j ) ) exp (cid:18) C F,G ( τ ( j +1) − τ ( j ) ) ε (cid:19) .

21e multiply with (0 ,

1) from the left in (57) for j ∈ { , · · · , K − } and t ∈ [ τ ( j ) , τ ( j +1) ] andtake the absolute value to obtain | y lin ( t ) − y v ( j ) | = (cid:12)(cid:12)(cid:12)(cid:12) ( m ( t, τ ( j ) ) , m ( t, τ ( i ) )) v ( j ) + Z tτ ( j ) ( m ( t, s ) , m ( t, s )) (cid:18) ε F ( v ( j ) ) G ( v ( j ) , s ) (cid:19) d s (cid:12)(cid:12)(cid:12)(cid:12) ≤ √ C F,G ( τ ( j +1) − τ ( j ) ) e CF,G ( τ ( j +1) − τ ( j )) ε (cid:18) C M + ( τ ( j +1) − τ ( j ) ) C F,G (cid:18) ε (cid:19)(cid:19) +( τ ( j +1) − τ ( j ) ) C F,G . Since θ ∈ (0 , ε δ , using ε ∈ (0 , ε δ ) and if (53) would hold, i.e., | τ ( j +1) − τ ( j ) | ≤ εθf m ( δ ) , then it follows that | y lin ( t ) − y v ( j ) | ≤ θεC F,G f m ( δ ) (cid:18) √ C F,G /f m ( δ ) (cid:18) C M + 2 C F,G f m ( δ ) (cid:19) + 1 (cid:19) ≤ θε L ± ) ε δ ≤ θ . Next, we are going to prove (53). We already know that if | τ ( j +1) − τ ( j ) | ≤ εθf m ( δ ) then( x lin ( t ) , y lin ( t )) ∈ B ( v ( j ) , θ ) for all t ∈ [ τ ( j ) , τ ( j +1) ]. Recall that τ was deﬁned such that( x lin ( t ) , y lin ( t )) lies outside of a δ -neighbourhood of the critical manifold, i.e., in M \ ( M + B (0 , δ )). Assume that v ∈ M + . Then we can obtain a local upper bound on the fast vectorﬁeld of the following form f ( v ( j ) ) + [D f ( v ( j ) )] (cid:18)(cid:18) x lin ( t ) y lin ( t ) (cid:19) − v ( j ) (cid:19) < f + ( δ ) + C Df θ < f + ( δ ) (58)for all t ∈ [ τ ( j ) , τ ( j +1) ]. This actually implies a helpful bound at the end of the small timeinterval, namely x lin ( τ ( j +1) ) = v ( j ) x + Z τ ( j +1) τ ( j ) ε (cid:18) F ( v ( j ) ) + F ( v ( j ) ) (cid:20)(cid:18) ( x lin )( s )( y lin )( s ) (cid:19) − v ( j ) (cid:21)(cid:19) d s ≤ v ( j ) x + ( τ ( j +1) − τ ( j ) ) ε f + ( δ ) ≤ v ( j ) x − ( τ ( j +1) − τ ( j ) ) ε f m ( δ ) . (59)The corresponding result for v ∈ M − follows analogously. By deﬁnition of the patched linearsolution, the result (53) follows. We have collected enough estimates to derive the upperbound (54) for K . Suppose ﬁrst that v ∈ M + . Then for all j ∈ { , · · · , K − } there holds x v ( j +1) ≤ x v ( j ) and | y v ( j ) − y v ( i +1) | ≤ θε L ± ) ε δ . By assumption (A3), we have that F + is monotone increasing with Lipschitz constant L + .Therefore for all j ∈ { , · · · , K − } :dist( v ( j ) , M + B (0 , δ )) ≤ (1 + L ± )dist( v, M + B (0 , δ )) + j − X k =0 (cid:18) L + | y v ( k +1) − y v ( k ) | − θ (cid:19) ≤ (1 + L ± )dist( v, M + B (0 , δ )) + j θ (cid:18) εε δ − (cid:19) . v ( K ) ∈ M + B (0 , δ ) if K is large enough, so that(1 + L ± )dist( v, M + B (0 , δ )) + K θ (cid:18) εε δ − (cid:19) ≤ , which can be attained for some K ≤  L ± )dist( v, M + B (0 , δ )) θ (cid:16) − εε δ (cid:17)  ≤ L ± )dist( v, M + B (0 , δ )) θ (cid:16) − εε δ (cid:17) + 1and (54) follows as analogous estimates for v ∈ M − lead to the same bounds for K ; note thatin this case we have x v ( j +1) ≥ x v ( j ) for j ∈ { , · · · , K − } . The upper bound for K impliesalmost immediately the total time estimate (55). For example, consider the direct calculationin the case of v ∈ M + | τ − τ | ≤ K εθf m ( δ ) ≤  L ± )dist( v, M + B (0 , δ )) θ (cid:16) − εε δ (cid:17) + 1  εθf m ( δ ) ≤ ε  L ± )dist( v, M + B (0 , δ )) f m ( δ ) (cid:16) − εε δ (cid:17) + 1 C Df  . With the bounds on time intervals, one may now inductively show the the worst-case upperbound (56) by a second-order approximation of f and g in combination with Gronwall’s Lemmaand ε < ε δ ≤ Lemma 4.7.

Let δ, ε > and v = ( x v , y v ) ∈ M + B (0 , δ ) be given. Consider any τ ⊂ [0 , T ) and denote by τ the ﬁrst time after τ such that ( x ε,v, ( τ ,τ ) , y ε,v, ( τ ,τ ) ) =: ( x, y ) ∈ ∂ ( M + B (0 , δ )) or τ = T . Then either τ = T or | τ − τ | > δC g . (60) Furthermore, for arbitrary w ∈ M , ( x w , y w ) := ( x ε,w , y ε,w ) and t ∈ ( τ , τ ) there holds | x w ( t ) − x ( t ) | + | y w ( t ) − y ( t ) | ≤ ( | x w ( τ ) − x v | + | y w ( τ ) − y v | ) e ( τ − τ ) L (1+1 /ε ) . Proof.

We consider the case v ∈ M + . Then because f ( x, y ) < x, y ) ∈ M + and because F + is monotone increasing, the closest point in ∂ ( M + B (0 , δ )), which can be reached from v is given by ( x v , y α ) := { ( x v , y v − α ) : α > } ∩ ∂ ( M + B (0 , δ )) , and y v − y > y v − y α for all points ( x, y ) ∈ ( M + B (0 , δ )) with x ≤ x v . Since v ∈ ( M + B (0 , δ ))we have y v − y α > δ . Hence, either τ = T or | y ( τ ) − y v | > δ . This yields δ < | y ( τ ) − y v | = (cid:12)(cid:12)(cid:12)(cid:12)Z τ τ ˙ y ( s ) d s (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z τ τ g ( x ( s ) , y ( s ) , s ) d s (cid:12)(cid:12)(cid:12)(cid:12) ≤ C g ( τ − τ ) . Therefore, we ﬁnd | τ − τ | > δC g . Analogous estimates for v ∈ M − prove (60). The laststatement in the result follows by a direct Gronwall Lemma argument.23t is helpful to describe the solution near the critical manifold by the full nonlinear dynamicsas it reduces to the slow dynamics in the singular limit, while still using the patched linearizedsolution for the fast dynamics. This motivates the following deﬁnition: Deﬁnition 4.8.

Let w = ( x , y ) ∈ M and δ > be given. Adopt the assumptions and the no-tation from Lemma 4.6. Let ε ∈ (cid:0) , ε δ (cid:1) be arbitrary. Furthermore, choose θ ε ∈ (cid:16) , min { f m ( δ ) C Df , } (cid:17) such that θ ε is of order o (cid:16) e − TL ε (cid:17) . We deﬁne ˜ x ε and ˜ y ε inductively as follows:(T1) Let t := 0 be the initial time. If dist( w , M ) ≤ δ , deﬁne t > t to be the ﬁrst time suchthat ( x ε,w , ( t ,t ) , y ε,w , ( t ,t ) ) ∈ ∂ ( M + B (0 , δ )) or t = T . In this case we set (˜ x ε , ˜ y ε ) := ( x ε,w , ( t ,t ) , y ε,w , ( t ,t ) ) on [ t , t ] . If dist( w , M ) > δ we deﬁne t > t by the ﬁrst time such that the linearized solutionsatisﬁes ( x lin ε,θ,w , ( t ,t ) , y lin ε,θ,w , ( t ,t ) ) ∈ ( M + B (0 , δ )) or t = T . We then set (˜ x ε , ˜ y ε ) := ( x lin ε,θ,w , ( t ,t ) , y lin ε,θ,w , ( t ,t ) ) on [ t , t ] . In both cases we deﬁne w := (˜ x ε ( t ) , ˜ y ε ( t )) .(T2) Let (˜ x ε , ˜ y ε ) be deﬁned on the interval J t i and let w , . . . , w i be chosen. If t i = T we aredone. Otherwise, t i < T . If w i ∈ ∂ ( M + B (0 , δ )) , deﬁne t i +1 > t i by the ﬁrst time suchthat ( x ε,w i , ( t i ,t i +1 ) , y ε,w i , ( t i ,t i +1 ) ) ∈ ∂ ( M + B (0 , δ )) or t i +1 = T . In this case we set (˜ x ε , ˜ y ε ) := ( x ε,w i , ( t i ,t i +1 ) , y ε,w i , ( t i ,t i +1 ) ) on [ t i , t i +1 ] . If w i ∈ ∂ ( M + B (0 , δ )) , deﬁne t i +1 > t i by the ﬁrst time such that the linearized solutionsatisﬁes ( x lin ε,θ,w i , ( t i ,t i +1 ) , y lin ε,θ,w i , ( t i ,t i +1 ) ) ∈ ∂ ( M + B (0 , δ )) or t i +1 = T . In this case we set (˜ x ε , ˜ y ε ) := ( x lin ε,θ,w i , ( t i ,t i +1 ) , y lin ε,θ,w i , ( t i ,t i +1 ) ) on [ t i , t i +1 ] . In both cases we deﬁne w i +1 := (˜ x ε ( t i +1 ) , ˜ y ε ( t i +1 )) . As before, we need a projection operator for the solution. Deﬁne ˜ p ε = p (˜ x ε , ˜ y ε ) by˜ p ε := p (˜ x ε , ˜ y ε ) = min { max { ˜ x ε , F − (˜ y ε ) } , F + (˜ y ε ) } . (61) Lemma 4.9.

Consider the same assumptions and the notation from Deﬁnition 4.8. Since w ∈ M is ﬁxed we denote x ε,w by x ε and y ε,w by y ε . Then there exists a constant C ( δ ) > such that | x ε ( t ) − ˜ x ε ( t ) | + | y ε ( t ) − ˜ y ε ( t ) | ≤ θ ε e TLε C ( δ ) , ∀ t ∈ J T . (62) Since θ ε is of order o (cid:16) e − TL ε (cid:17) , this implies that k ( x ε , y ε ) − (˜ x ε , ˜ y ε ) k C ( J T ; R ) ≤ θ ε e TLε C ( δ ) → as ε → . Furthermore, if w = ( x w , y w ) ∈ M + B (0 , δ ) then k ˜ p ε − ˜ x ε k C ( J T ) ≤ δ (1 + L ± ) . (63) If w = ( x w , y w ) ∈ M \ ( M + B (0 , δ )) then there exists a constant C ( δ ) > and a time t ∈ J T with t ≤ εC ( δ ) such that k ˜ p ε − ˜ x ε k C [ t ,T ] ≤ L ± ) δ. (64)24 roof. During the proof we denote ˜ x := ˜ x ε , ˜ y := ˜ y ε , x := x ε , and y := y ε . By Lemma 4.7, eachsubinterval of J T in which (˜ x, ˜ y ) behaves according to (8)-(9) can be estimated from below by δ/C g . Hence, the total number K of subintervals in Deﬁnition 4.8 is bounded from above by( T C g ) /δ . Let K = K l + K e , where K l is the number of time intervals in which (˜ x, ˜ y ) is deﬁnedvia (46)-(49), and where K e denotes the number of time intervals in which (˜ x, ˜ y ) is given by (8)-(9). We ﬁrst assume w = ( x w , y w ) ∈ M \ ( M + B (0 , δ )). In this case, K l ∈ { K e , K e + 1 } .Without loss of generality, we assume K l = K e + 1 = K +12 . By Lemma 4.6 and because ε ≤ ε δ ,the ﬁrst time interval ( t , t ) in Deﬁnition 4.8 is bounded from above by | t − t | ≤ ε (cid:18) L ± )dist( w , M + B (0 , δ )) f m ( δ ) + 1 C Df (cid:19) =: εC ( δ ) . We introduce the notation C ( δ ) := max (cid:26) C ( δ ) , (cid:18) L ± ) δf m ( δ ) + 1 C Df (cid:19)(cid:27) , C ( δ ) := C ( δ ) (cid:18) C D + ε δ f m ( δ ) (cid:19) ,C ( δ ) := 2 LC ( δ ) . Lemma 4.6 implies that for i ∈ (cid:8) , · · · , K − (cid:9) and t ∈ [ t i , t i +1 ] we may estimate the diﬀerencebetween the full and approximate solutions by | x ( t ) − ˜ x ( t ) | + | y ( t ) − ˜ y ( t ) | ≤ (cid:2) | x ( t i ) − x w i | + | y ( t i ) − y w i | + θ C ( δ ) (cid:3) e C ( δ ) . Lemma 4.7 proves that for i ∈ (cid:8) , · · · K − (cid:9) and t ∈ [ t i − , t i ] we obtain | x ( t ) − ˜ x ( t ) | + | y ( t ) − ˜ y ( t ) | ≤ (cid:2) | x ( t i − ) − x w i − | + | y ( t i − ) − y w i − | (cid:3) e ( t i − t i − ) L ( ε ) . This together with P K − i =1 ( t i − t i − ) ≤ T implies that we can estimate for any t ∈ J T : | x ( t ) − ˜ x ( t ) | + | y ( t ) − ˜ y ( t ) |≤ (cid:20) | x ( t ) − x w | + | y ( t ) − y w | + K − θ C ( δ ) (cid:21) exp (cid:18) K − C ( δ ) + T L (cid:18) ε (cid:19)(cid:19) = K − θ C ( δ ) exp (cid:18) K − C ( δ ) + T L (cid:18) ε (cid:19)(cid:19) =: θ e TLε C ( δ ) . Analogous estimates apply for w = ( x w , y w ) ∈ ( M + B (0 , δ )). This proves (62). Now theresults (63)-(64) follow directly from the deﬁnition of the mapping ˜ p ε in (61).Finally we can prove the main result. Some elements of the proof of Theorem 3.1 will bekept. However, we can improve the convergence norm and also simplify the argument that inthe singular limit, solutions satisfy the variational inequality for the generalized play operator. Proof. (of Theorem 4.1) As in the proof of Theorem 3.1, we shall argue with sequences andconverging subsequences { ε k } . There exist functions x, y ∈ W ,q ( J T ) such that p ε k → x and y ε k → y (65)25s ε k → ,q ( J T ) and strongly in C( J T ) respectively. Furthermore, one shows aspreviously that x ε k → x in L q ( J T ), that the convergence of y ε k is strong and that y solves (39)-(40) with x = x . The strategy is now to improve the convergence norm and to simplify theargument that x solves (41)-(43).Let η > F − < F + and bothfunctions are monotone increasing. Using these facts and the deﬁnition (61) of ˜ p ε yields | ˜ p ε ( t ) − p ε ( t ) | ≤ max {| F − ( y ε ( t )) − F − (˜ y ε ( t )) | , | F + ( y ε ( t )) − F + (˜ y ε ( t )) | , | x ε ( t ) − ˜ x ε ( t ) |} for all t ∈ J T . Consider the assumptions and the notation from Deﬁnition 4.8 as well as thenotation from Lemma 4.9. If w = ( x w , y w ) = ( x , y ) ∈ M \ M then we set δ η := min (cid:26) dist( w , M ) , η L ± ) (cid:27) (66)in Deﬁnition 4.8, so that w ∈ M \ ( M + B (0 , δ η )), and deﬁne ε δ η as in Deﬁnition 4.8. In thiscase, we further let t ε := t ≤ εC ( δ η )for ε ∈ (cid:0) , ε δη (cid:1) with t and C ( δ η ) from Lemma 4.9. If w = ( x w , y w ) ∈ M then let δ η := η L ± ) (67)in Deﬁnition 4.8 and again consider ε δ η from Deﬁnition 4.8. Then we can deﬁne t ε = 0 for ε ∈ (cid:0) , ε δη (cid:1) . Now we can ﬁnd ε η ∈ (cid:0) , ε δη (cid:1) such that for all ε ∈ (0 , ε η ) the following key boundis satisﬁed max { , L − , L + } θ ε e TLε C ( δ η ) < η . (68)Lemma 4.9 can then be used to estimate for all ε ∈ (0 , ε η ) the fast variablemax t ∈ J T : t ≥ t ε | x ε ( t ) − p ε ( t ) | ≤ max t ∈ J T : t ≥ t ε | x ε ( t ) − ˜ x ε ( t ) | + | ˜ x ε ( t ) − ˜ p ε ( t ) | + | ˜ p ε ( t ) − p ε ( t ) |≤ δ η (1 + L ± ) + max { , L ± } θ ε e TLε C ( δ η ) < η. (69)Given a subsequence ε k →

0, we choose k η > ε k ∈ (0 , ε η ) andmax t ∈ J T | y ε k ( t ) − y ( t ) | + | p ε k ( t ) − x ( t ) | < η k ≥ k η . From the last two maximum norm bounds (69)-(70) it then follows that we have k x ε k − x k C[ t εk ,T ] + k y ε k − y k C( J T ) ≤ k y ε k − y k C( J T ) + max t ∈ [ t εk ,T ] | x ε k ( t ) − p ε k ( t ) | + | p ε k ( t ) − x ( t ) | < η. (71)The bound (71) is the crucial step. It is going to provide that the variational inequality issolved and it is going to show the convergence in the C -norm. We start by proving the former,i.e., by showing that x solves (41)-(43) with y = y . The proof of the properties x (0) = min { max { x , F − ( y ) } , F + ( y ) } x ( t ) ∈ [ F − ( y ( t )) , F + ( y ( t ))] ∀ t ∈ J T remain the same as in the proof of Theorem 3.1. Let t ∈ J T be given. Suppose that x ( t ) = F + ( y ( t )). Assume ﬁrst that g ( x, y, t ) < U × I ⊂ M × J T of ( x ( t ) , y ( t ) , t )and ( x ( t ) , y ( t )) ∈ U for all t ∈ I . Using [Rud91, Theorem 4.15 and Theorem 4.16] and continuityof g , after eventually making U smaller, we may assume that − C g < g ( x, y, t ) < − c < c and all ( x, y, t ) ∈ U × I . (72)Now we are going to apply the crucial bound (71). Deﬁne a parameter η > η < min t ∈ I dist(( x ( t ) , y ( t )) , ∂U ) . In dependence of this parameter, let δ η , ε η , k η be deﬁned as in (66)-(67), (68) and (70). Moreover,we choose ε I ∈ (0 , ε η ) such that t ε ≤ εC ( δ η ) < min { t : t ∈ I } for all ε ∈ (0 , ε I ) . Finally we deﬁne k I ≥ k η such that ε k ∈ (0 , ε I ) for all k ≥ k I . These choices then lead us tothe estimate k y ε k − y k C( J T ) + k x ε k − x k C[ t εk ,T ] < η (73)for all k ≥ k I . Therefore, we obtain ( x ε k ( t ) , y ε k ( t ) , t ) ∈ U × I for all k ≥ k I and t ∈ I . Because( x ( t ) , y ( t )) ∈ ∂M by assumption, (73) yields dist(( x ε k ( t ) , y ε k ( t )) , ∂M ) < η for all k ≥ k I .Recall that − C g < g ( x, y, t ) < − c < x, y, t ) ∈ U × I by (72). This fact applied inthe fast-slow system (8)-(9) implies that for all k ≥ k I , y ε k is monotone decreasing in I with − C g < ˙ y ε k < − c . We also already know from (69) thatmax t ∈ J T : t ≥ t ε | x ε ( t ) − p ε ( t ) | < η ε ∈ (0 , ε η )and since ( p ε , y ε ) ∈ M for all ε > t ε < max { t ∈ I } , this yieldsmax t ∈ I dist(( x ε k ( t ) , y ε k ( t )) , M ) < η k ≥ k I . This fact can be combined with the observation that there is no fast ﬂow inside the critical man-ifold, i.e., ˙ x ε k ( t ) = 0 if ( x ε k ( t ) , y ε k ( t )) ∈ M and with the fact that dist(( x ε k ( t ) , y ε k ( t )) , ∂M ) <η and − C f < ˙ y ε k < − c in I for all k ≥ k I . In particular, we may now conclude thatmax t ∈ I ∩ [ t ,T ] dist(( x ε k ( t ) , y ε k ( t )) , ∂M ) < η for all k ≥ k I . Since we are still free in our choice of η (which then determines k I ), the last estimate, togetherwith the crucial bound (73), proves that ( x ( t ) , y ( t )) ∈ ∂M and therefore x ( t ) = F + ( y ( t )) for all t ∈ I ∩ [ t , T ]. Moreover, it follows that y is monotone decreasing in I . Since F + is monotoneincreasing, this implies that x is monotone increasing in I ∩ [ t , T ], so that ˙ x ( t ) < t ∈ I ∩ [ t , T ].The case when g ( x, y, t ) > x ( t ) , y ( t ) , t ) leads to a contradictionas we would have moved already inside M earlier in this case. If t ∈ J T is a time such that27 ( x, y, t ) ≥ U × I ) ∩ ( M × [ t , T ]) for some neighbourhood U × I of ( x ( t ) , y ( t ) , t ) then y is monotone increasing and x is constant in I ∩ [ t , T ] with x ( t ) = F + ( y ( t )). The cases when x ( t ) = F − ( y ( t )) or ( x ( t ) , y ( t )) ∈ int( M ) are treated in a similar manner. Therefore, wehave shown that x solves (41)-(43) with y = y .Uniqueness of x and y and convergence of the whole sequence follow just as in the proofof Theorem 3.1. More precisely, we may repeat the steps with any subsequence converging tozero and (71) then proves (45). From (45) we deduce the convergence result of x ε . This yieldsthe convergence result also for y ε . So far, we have studied the singular limit convergence guided by Netushil’s conjecture of fast-slow systems coupled with hysteresis operators. Our results in Sections 3-4 were fully rigorous .However, it will be of interest to see, how we can practically analyze fast-slow systems (8)-(9).In this section we provide a numerical and formal analysis of an important subclass of (8)-(9).First, note that (8)-(9) can be re-written in non-autonomous form as ε d x d τ = f ( x, y ) , d y d τ = g ( x, y, t ) , d t d τ = ω, (74)where we introduce an additional parameter ω . From a dynamical systems point of view,fast-slow problems of the form (74) provide many highly investigated classical examples. Forexample, if we consider f as the classical cubic non-linearity and assume that g is periodic in t , say for concreteness, f vdP ( x, y ) = y − x x, g ( x, y, t ) = g ( x, y, t + 2 π ) , (75)then (74) has a very prominent representative given by the forced van der Pol oscillator [vdP34,Kue15, Guc03] usually written as ε d x d τ = f vdP ( x, y ) , d y d τ = a sin(2 πθ ) − x, d θ d τ = ω. (76)where ( x, y, θ ) ∈ R × S with S = [0 , / (0 ∼

1) is the circle, ( y, θ ) are the slow variablesand a, ω are the main amplitude and phase parameters used in bifurcation studies of (76);see [GHW03, BEG +

03, GNV84, SW04]. The forced van der Pol equation is also one of thevery few ODE models, where it has been rigorously proven that chaotic oscillations may oc-cur [Hai09]. The model has also a strong link to one-dimensional or almost one-dimensionalreturn maps and chaotic dynamics [GWY06]. The forced van der Pol equation is still beingstudied very actively [BDG + f ( x, y ) =  y − x − x < y − y − x + 1 if x > y + 1,0 else, (77)for the fast variable. Piecewise linear approximations are very classical in fast-slow systems,e.g., they have been studied in the van der Pol context many times [Lev49] but are still of highcurrent interest [DFH +

13, DGP + g ( x, y, t ) = a sin(2 πt ) + bx + cy (78)for parameters a, b, c ∈ R . The strategy to start with the lowest order Taylor expansion iswell-known in fast-slow systems for the slow variables [Guc08, SW01] and so is starting withthe lowest harmonic in many other contexts [Kur12]. One checks that (A1)-(A4) hold with F + ( y ) = y + 1 and F − ( y ) = y − G ( x ); however, we shall observethat x is going to stay bounded for certain parameter choices to be investigated below so wecan just cut oﬀ G ( x ) = bx smoothly outside a compact set. −1.5−1−0.500.511.5−1.5 −1 −0.5 0 0.5 1 1.5−0.500.5 (a) (b) x y C F + F − tyx Figure 6: Direct numerical integration of the fast-slow ODEs (74) with nonlinearities givenby (77)-(78); the parameters are chosen as a = 1, b = − c = , ω = 4, and ε = 0 .

01. Theinitial condition was chosen outside and O (1)-separated from C . (a) Phase portrait showinga typical trajectory (black curve) projected into the ( y, x )-plane, i.e., not explicitly showingthe non-autonomous periodic τ -direction. C lies between the two curves (gray) deﬁned by F ± and C has the same dimension as the ambient phase. (b) Time series of the trajectory withthe y -coordinate (solid curve) and the x -coordinate (dashed curve). One clearly observes smallscale behaviour in the region, where the fast and slow variables interact.As a ﬁrst step, we would like to check, whether we can ﬁnd any interesting dynamics byselecting the basic nonlinearities (77)-(78). Figure 6 shows the results of numerical integration.We have selected an initial condition far separated of the critical manifold C = { ( x, y, τ ) ∈ R × S : y − ≤ x ≤ y + 1 } . (79)29he initial condition gets attracted very fast towards C as shown in Figure 3(a) as expectedalready from the theoretical results based upon assumption (A2). The dynamics near theboundary ∂ C = { x = F − ( y ) } ∪ { x = F + ( y ) } =: C − ∪ C + (80)is a lot more delicate. We observe in Figure 6 the case of many small amplitude oscillations(SAOs) near both parts of the boundary. Furthermore, there are relatively slow jumps between C − and C + in comparison to the long very slow drift time near each boundary piece. Essentially,we observe oscillations, which look similar to classical relaxation oscillations [MR80, Gra87,Kue15], just with high-frequency fast SAOs overlayed near the slowest scale pieces and thejumps in the relaxation cycle still occur on the slow time scale. Figure 6 does not provide anindication, whether the oscillations are actually periodic or potentially even chaotic.The SAOs near F ± are easy to explain formally. Suppose we use the standard slow subsystemreduction just along the lines C ± , then we obtaind y d t = a sin(2 πωt ) + bx + cy = a sin(2 πωt ) + ( b + c ) y ± b, y (0) = y . (81) -1.5 -1 -0.5 0 0.5 1 1.5-0.500.5 (a) (b1)(b2) x y C F + F − Figure 7: Direct numerical integration of the fast-slow ODEs (74) with nonlinearities givenby (77)-(78); the parameters are chosen as a = 1, b = − c = , ω = 4, ε = 0 .

01 and ﬁnal time T = 5 · . The initial condition was chosen outside and O (1)-separated from C . (a) Phaseportrait showing a typical trajectory (black curve) projected into the ( y, x )-plane. We have nowalso marked the two “average” equilibrium points ( Y ± , F ± ( Y ± )) as dots (black). (b1) Zoom near( Y − , F − ( Y − )) = (1 . , . C − .The ODE (81) can actually be solved explicitly. We denote the solutions corresponding tothe respective signs in front of the constant term ± b by y ± = y ± ( t ). We have y ± ( t ) = Y ± + Y e e ( b + c ) t + Y h ( t ) (82)30here the individual (“constant, exponential prefactor, and harmonic”) terms are given by Y ± = ∓ bb + c , (83) Y e = ± bb + c + 2 πaω ( b + c ) + 4 π ω + y , (84) Y h ( t ) = − a ( b + c ) sin(2 πtω ) + 2 πaω cos(2 πtω )( b + c ) + 4 π ω . (85) -0.5 0 0.5-2-1012 -2-1012 0 1 2 310 -101 (a) (b1)(b2) A c txyxy max( y ( t ))min( y ( t ))Figure 8: Bifurcation diagram of the fast-slow ODEs (74) with nonlinearities given by (77)-(78);the parameters are chosen as a = 1, b = − ω = 4, and ε = 0 .

01. (a) Main bifurcation diagramvarying the parameter c and showing the maximum and minimum amplitudes A of the variable y for the global attractor. (b1) Time series for c = 0 .

1. (b2) Time series for c = − .

1. Thedashed curves are x ( t ) and the solid curve with SAOs are y ( t ).The formal slow subsystem (81) remains bounded for all y ∈ R if and only if b + c ≤

0. Weshall not investigate the borderline case b = − c here and just assume b + c < Y e exp[( b + c ) t ] → t → + ∞ so the dynamics of y ± ( t ) is a harmonic oscillation aroundthe points Y ± , i.e., we have lim t → + ∞ t Z t y ± ( s ) d s = Y ± so we may view Y ± as averaged equilibrium points. We now have to re-visit the numericalresults from Figure 6, which are presented in phase space in a diﬀerent way in Figure 7, wherewe clearly see that the slow subsystem approximation correctly describes the behaviour nearthe branches C ± , i.e., we move upwards via the terms Y − + e ( b + c ) t Y e on the right branch C − towards ( y, x ) = ( Y − , F − ( Y − )) and there are oscillations induced by the term Y h ( t ). Similarly wemove downwards on the left branch C + with several oscillations induced by the time-dependentterms.The next natural question is, how the global periodic large oscillations are generated underparameter variation. Figure 8 shows a basic bifurcation diagram ﬁxing all parameters except c .31 x y Figure 9: Numerical simulation of the fast-slow ODEs (74) with nonlinearities given by (77)-(78); the parameters are chosen as a = 1, b = − c = , ω = 4. The parameter ε is varied.The top row shows ε = 0 . , . , . ε = 0 . , . , . c passes through c = 0.In particular, the transition could be viewed as being similar to a canard-type explosion [DD95,Eck83, DR96, Kue15] as the growth of the amplitude occurs near a part of C , which is notnormally hyperbolic and not attracting, i.e., inside int( C ). Note carefully that if x ≈

0, thenthe slow equation for y has only a small x -dependence, so c can actually control growth ordecay in this region. Indeed, in this case the singular limit generalized play operator fromTheorem 3.1 precisely shows an equilibrium point at x = 0 for the y -dynamics if c < ε . Figure 9shows, how we converge from an oscillation with quite large excursions outside of C to thesingular limit generalized play operator, which is entirely constrained to C after the projectionof the initial condition. We observe that on the initial transient approach towards the oscillatorysolution, there are signiﬁcant diﬀerences in the patterns of the SAOs for diﬀerent values of ε .Furthermore, the patterns seem to stabilize a bit more as ε → Acknowledgments:

CK has been supported by a Lichtenberg Professorship of the Volk-swagenStiftung. CM has been supported by the DFG through the International ResearchTraining Group IGDK 1754 “Optimization and Numerical Analysis for Partial DiﬀerentialEquations with Nonsmooth Structures”. We also would like to thank two anonymous referees32or suggestions regarding the presentation of our results.

References [AF03] R.A. Adams and J.J.F. Fournier.

Sobolev Spaces . Elsevier, 2003.[BDG +

16] J. Burke, M. Desroches, A. Granados, T.J. Kaper, M. Krupa, and T. Vo. Fromcanards of folded singularities to torus canards in a forced van der Pol equation.

J.Nonlinear Sci. , 26(2):405–451, 2016.[BEG +

03] K. Bold, C. Edwards, J. Guckenheimer, S. Guharay, K. Hoﬀman, J. Hubbard,R. Oliva, and W. Weckesser. The forced van der Pol equation II: canards in thereduced system.

SIAM Journal of Applied Dynamical Systems , 2(4):570–608, 2003.[BG06] N. Berglund and B. Gentz.

Noise-Induced Phenomena in Slow-Fast DynamicalSystems . Springer, 2006.[BGK15] N. Berglund, B. Gentz, and C. Kuehn. From random Poincar´e maps to stochasticmixed-mode-oscillation patterns.

J. Dyn. Diﬀ. Equat. , 27(1):83–136, 2015.[BHC03] B.E. Beisner, D.T. Haydon, and K. Cuddington. Alternative stable states in ecology.

Front. Ecol. Environ. , 1(7):376–382, 2003.[BK15] M. Brokate and P. Krejˇc´ı. Weak diﬀerentiability of scalar hysteresis operators.

Discrete Contin. Dyn. Syst. , 35(6):2405–2421, 2015.[BO99] C.M. Bender and S.A. Orszag.

Asymptotic Methods and Perturbation Theory .Springer, 1999.[BR05] M. Brokate and D. Rachinskii. On global stability of the scalar Chaboche models.

Nonlin. Anal.: Real World Appl. , 6(1):67–82, 2005.[BS96] M. Brokate and J. Sprekels.

Hysteresis and Phase Transitions . Springer, 1996.[CGT16] M. Curran, P. Gurevich, and S. Tikhomirov. Recent advances in reaction-diﬀusionequations with non-ideal relays. In

Control of Self-Organizing Nonlinear Systems ,pages 211–234. Springer, 2016.[Cro93] R. Cross. On the foundations of hysteresis in economic systems.

Econ. Phil. , 9(1):53–74, 1993.[DD95] F. Diener and M. Diener.

Nonstandard Analysis in Practice . Springer, 1995.[DFH +

13] M. Desroches, E. Freire, S.J. Hogan, E. Ponce, and P. Thota. Canards in piecewise-linear systems: explosions and super-explosions.

Proc. R. Soc. A , 469:20120603,2013.[DGP +

16] M. Desroches, A. Guillamon, E. Ponce, R. Prohens, S. Rodrigues, and A. E. Teruel.Canards, folded nodes, and mixed-mode oscillations in piecewise-linear slow-fastsystems.

SIAM Rev. , 58(4):653–691, 2016.[DR96] F. Dumortier and R. Roussarie.

Canard Cycles and Center Manifolds , volume 121of

Memoirs Amer. Math. Soc.

AMS, 1996.33Eck83] W. Eckhaus. Relaxation oscillations including a standard chase on French ducks.

Lecture Notes in Mathematics , 985:449–494, 1983.[Eva02] L.C. Evans.

Partial Diﬀerential Equations . AMS, 2002.[Fen79] N. Fenichel. Geometric singular perturbation theory for ordinary diﬀerential equa-tions.

J. Diﬀerential Equat. , 31:53–98, 1979.[Fit55] R. FitzHugh. Mathematical models of threshold phenomena in the nerve membrane.

Bull. Math. Biophysics , 17:257–269, 1955.[GH83] J. Guckenheimer and P. Holmes.

Nonlinear Oscillations, Dynamical Systems, andBifurcations of Vector Fields . Springer, New York, NY, 1983.[GHW03] J. Guckenheimer, K. Hoﬀman, and W. Weckesser. The forced van der Pol equationI: the slow ﬂow and its bifurcations.

SIAM Journal of Applied Dynamical Systems ,2(1):1–35, 2003.[GNV84] J. Grasman, H. Nijmeijer, and E.J.M. Veling. Singular perturbations and a mappingon an interval for the forced van der Pol relaxation oscillator.

Physica D , 13(1):195–210, 1984.[GR01] A. Ganopolski and S. Rahmstorf. Rapid changes of glacial climate simulated in acoupled climate model.

Nature , 409(6817):153–158, 2001.[Gra87] J. Grasman.

Asymptotic Methods for Relaxation Oscillations and Applications .Springer, 1987.[GST13] P. Gurevich, R. Shamin, and S. Tikhomirov. Reaction-diﬀusion equations withspatially distributed hysteresis.

SIAM J. Math. Anal. , 45(3):1328–1355, 2013.[Guc03] J. Guckenheimer. Global bifurcations of periodic orbits in the forced van der Polequation. In H.W. Broer, B. Krauskopf, and G. Vegter, editors,

Global Analysisof Dynamical Systems - Festschrift dedicated to Floris Takens , pages 1–16. Inst. ofPhysics Pub., 2003.[Guc08] J. Guckenheimer. Singular Hopf bifurcation in systems with two slow variables.

SIAM J. Appl. Dyn. Syst. , 7(4):1355–1377, 2008.[GWY06] J. Guckenheimer, M. Wechselberger, and L.-S. Young. Chaotic attractors of relax-ation oscillations.

Nonlinearity , 19:701–720, 2006.[Hai09] R. Haiduc. Horseshoes in the forced van der Pol system.

Nonlinearity , 22:213–237,2009.[HJ80] F.C. Hoppenstaedt and W. J¨ager. Pattern formation by bacteria. In

BiologicalGrowth and Spread , volume 17, pages 68–81. Springer, 1980.[Jon95] C.K.R.T. Jones. Geometric singular perturbation theory. In

Dynamical Sys-tems (Montecatini Terme, 1994) , volume 1609 of

Lect. Notes Math. , pages 44–118.Springer, 1995.[Kap99] T.J. Kaper. An introduction to geometric methods and dynamical systems theoryfor singular perturbation problems. analyzing multiscale phenomena using singular34erturbation methods. In J. Cronin and R.E. O’Malley, editors,

Analyzing MultiscalePhenomena Using Singular Perturbation Methods , pages 85–131. Springer, 1999.[KC96] J. Kevorkian and J.D. Cole.

Multiple Scale and Singular Perturbation Methods .Springer, 1996.[Kop06] J. Kopfov´a. Hysteresis in biological models.

J. Phys.: Conf. Ser. , 55(1):130–134,2006.[KP89] M.A. Krasnosel’skii and A.V. Pokrovskii.

Systems with Hysteresis . Springer, 1989.[Kre05] P. Krejˇc´ı. Hysteresis in singularly perturbed problems. In M.P. Mortell, R.E.O’Malley, A. Pokrovskii, and V. Sobolev, editors,

Singular Perturbations and Hys-teresis , pages 73–100. SIAM, 2005.[KS01] M. Krupa and P. Szmolyan. Extending slow manifolds near transcritical and pitch-fork singularities.

Nonlinearity , 14:1473–1491, 2001.[KS15] C. Kuehn and P. Szmolyan. Multiscale geometry of the Olsen model and non-classical relaxation oscillations.

J. Nonlinear Sci. , 25(3):583–629, 2015.[Kue15] C. Kuehn.

Multiple Time Scale Dynamics . Springer, 2015. 814 pp.[Kuh03] K. Kuhnen. Modeling, identiﬁcation and compensation of complex hysteretic nonlin-earities: A modiﬁed Prandtl-Ishlinskii approach.

European J. Control , 9(4):407–418,2003.[Kur12] Y. Kuramoto.

Chemical Oscillations, Waves, and Turbulence . Springer, 2012.[Lev49] N. Levinson. A second order diﬀerential equation with singular solutions.

Ann. ofMath. , 50:127–153, 1949.[Lov13] A.E.H. Love.

A Treatise on the Mathematical Theory of Elasticity . CUP, 2013.[May03] I.D. Mayergoyz.

Mathematical Models of Hysteresis and their Applications . Aca-demic Press, 2003.[Mie11] A. Mielke. Diﬀerential, energetic, and metric formulations for rate-independentprocesses. In

Nonlinear PDE’s and Applications , pages 87–170. Springer, 2011.[MOPS05] M.P. Mortell, R.E. O’Malley, A. Pokrovskii, and V. Sobolev, editors.

SingularPerturbations and Hysteresis . SIAM, 2005.[MR80] E.F. Mishchenko and N.Kh. Rozov.

Diﬀerential Equations with Small Parametersand Relaxation Oscillations (translated from Russian) . Plenum Press, 1980.[MR15] A. Mielke and T. Roub´ıˇcek.

Rate-independent Systems . Springer, 2015.[MRS12] A. Mielke, R. Rossi, and G. Savar´e. BV solutions and viscosity approximations ofrate-independent systems.

ESAIM: Control, Optimisation and Calculus of Varia-tions , 18(1):36–80, 2012.[MT04] A. Mielke and F. Theil. On rate-independent hysteresis models.

NoDEA , 11(2):151–189, 2004.[MX91] I. M¨uller and H. Xu. On the pseudo-elastic hysteresis.

Acta Mater. , 39(3):263–271,1991. 35Net68] A.V. Netushil. Nonlinear element of stop type.

Avtomat. Telemech. , 7:175–179,1968. (in Russian).[Net70] A.V. Netushil. Self-oscillations in systems with negative hysteresis. In

Proc. 5th In-ternational Conference on Nonlinear Oscillations , volume 4, pages 393–396. IzdanieInst. Mat. Akad. Nauk Ukrain., 1970. (in Russian).[PKK +

12] A. Pimenov, T.C. Kelly, A. Korobeinikov, M. O’Callaghan, A.V. Pokrovskii, andD. Rachinskii. Memory eﬀects in population dynamics: spread of infectious diseaseas a case study.

Math. Model. Nat. Phenom. , 7(3):204–226, 2012.[PS05] A. Pokrovskii and V. Sobolev. A naive view of time relaxation and hysteresis.In M.P. Mortell, R.E. O’Malley, A. Pokrovskii, and V. Sobolev, editors,

SingularPerturbations and Hysteresis , pages 1–59. SIAM, 2005.[Rud91] W. Rudin.

Functional Analysis . McGraw-Hill, 1991.[Sch85] S. Schecter. Persistent unstable equilibria and closed orbits of a singularly perturbedequation.

J. Diﬀerential Equat. , 60:131–141, 1985.[Str00] S.H. Strogatz.

Nonlinear Dynamics and Chaos . Westview Press, 2000.[SW48] E.C. Stoner and E.P. Wohlfarth. A mechanism of magnetic hysteresis in heteroge-neous alloys.

Phil. Trans. R. Soc. A , 240(826):599–642, 1948.[SW01] P. Szmolyan and M. Wechselberger. Canards in R . J. Diﬀerential Equat. , 177:419–453, 2001.[SW04] P. Szmolyan and M. Wechselberger. Relaxation oscillations in R . J. DiﬀerentialEquat. , 200:69–104, 2004.[Tor00] E. Della Torre.

Magnetic Hysteresis . Wiley, 2000.[vdP26] B. van der Pol. On relaxation oscillations.

Philosophical Magazine , 7:978–992, 1926.[vdP34] B. van der Pol. The nonlinear theory of electric oscillations.

Proc. IRE , 22:1051–1086, 1934.[Vis94] A. Visintin.

Diﬀerential Models of Hysteresis . Springer, 1994.[Wig94] S. Wiggins.