[PDF] Computational assessment of smooth and rough parameter dependence of statistics in chaotic dynamical systems

Abstract

An assumption of smooth response to small parameter changes, of statistics or long-time averages of a chaotic system, is generally made in the field of sensitivity analysis, and the parametric derivatives of statistical quantities are critically used in science and engineering. In this paper, we propose a numerical procedure to assess the differentiability of statistics with respect to parameters in chaotic systems. We numerically show that the existence of the derivative depends on the Lebesgue-integrability of a certain density gradient function, which we define as the derivative of logarithmic SRB density along the unstable manifold. We develop a recursive formula for the density gradient that can be efficiently computed along trajectories, and demonstrate its use in determining the differentiability of statistics. Our numerical procedure is illustrated on low-dimensional chaotic systems whose statistics exhibit both smooth and rough regions in parameter space.

Full PDF

CComputational assessment of smooth and roughparameter dependence of statistics in chaotic dynamicalsystems

Adam A. ´Sliwiak a,b, ∗ , Nisha Chandramoorthy a,c , Qiqi Wang a,b a Center for Computational Science and Engineering, Massachusetts Institute of Technology(MIT), Cambridge, MA, 02139, USA b MIT Department of Aeronautics and Astronautics c MIT Department of Mechanical Engineering

Abstract

An assumption of smooth response to small parameter changes, of statisticsor long-time averages of a chaotic system, is generally made in the ﬁeld ofsensitivity analysis, and the parametric derivatives of statistical quantities arecritically used in science and engineering. In this paper, we propose a numericalprocedure to assess the diﬀerentiability of statistics with respect to parametersin chaotic systems. We numerically show that the existence of the derivativedepends on the Lebesgue-integrability of a certain density gradient function,which we deﬁne as the derivative of logarithmic SRB density along the unstablemanifold. We develop a recursive formula for the density gradient that can beeﬃciently computed along trajectories, and demonstrate its use in determiningthe diﬀerentiability of statistics. Our numerical procedure is illustrated on low-dimensional chaotic systems whose statistics exhibit both smooth and roughregions in parameter space.

Keywords:

Chaotic dynamical systems, Sensitivity analysis, Linear responsetheory, Roughness, SRB density, Density gradient function

1. Introduction

Sensitivity analysis is the study of the response of a dynamical system tosmall perturbations. In this paper, we are interested in the response of long-term or statistical behavior in a chaotic dynamical system, for instance, time-averaged dynamic force coeﬃcients in a turbulent ﬂow [1], to small changesin system parameters. The sensitivities we are interested in are derivatives of ∗ Corresponding author.

Email addresses: [email protected] (Adam A. ´Sliwiak), [email protected] (NishaChandramoorthy), [email protected] (Qiqi Wang)

Preprint submitted to Elsevier January 26, 2021 a r X i v : . [ n li n . C D ] J a n ong-time averages of observables with respect to parameters. These deriva-tives are used in gradient-based approaches for design, optimization and control[2, 1, 3, 4, 5, 6], variational data assimilation [7, 8] and uncertainty quantiﬁca-tion applications [9, 10] in various ﬁelds of science and engineering.In chaotic dynamical systems describing complex phenomena, such as the Earth’sclimate [11, 12], molecular transition [13], and turbulent ﬂow [14], the high sen-sitivity to perturbations poses a challenge to computing the parametric deriva-tives of statistics. As a result of this so-called butterﬂy eﬀect , the sensitivitiesof time-averaged quantities with respect to parameters, which can be obtainedusing traditional methods such as tangent/adjoint equations [15], automatic dif-ferentiation [16] and ﬁnite-diﬀerencing [17], grow exponentially with averagingtime window. However, the sensitivities of the statistical or long-time averagedquantities that we are interested in are bounded quantities. In fact, the assump-tion of linear response is generally made, by which the statistics or long-term av-erages of a chaotic system vary diﬀerentiably with respect to parameters. Thatis, for small parameter perturbations, we assume that the statistics or long-termaverages of observables are linear in the parameter perturbation. Recently, moresophisticated methods have been proposed to compute the parametric deriva-tive of statistics or linear response. They include shadowing trajectory-basedapproaches [18, 19], ensemble averaging [20], transfer operator-based [21] meth-ods, and variational techniques [22]. Other modern approaches are derivedbased on the Fluctuation-Dissipation Theorem [23], Ruelle’s linear responsetheory [24, 25]. While these methods are recently being used in some prac-tical chaotic systems, including low-Reynolds number turbulent ﬂows, climatemodels of intermediate complexity, and so forth [14, 26, 27, 28], the underlyingassumption of linear response may itself be violated in other systems. In fact,beyond the classic example of the logistic map [29, 21] non-smooth response, asdemonstrated in [30], one-dimensional chaotic systems can be constructed thatexhibit arbitrarily large linear responses: an imperceptible parameter perturba-tion causes a drastic change in statistics.As shown by Ruelle [31], linear response is rigorously true in uniformly hy-perbolic systems, which represent the simplest setting in which chaotic attrac-tors occur. Based on Gallavotti and Cohen [32], it is popularly believed thatmany highly dissipative dynamical systems found in nature behave as if theywere uniformly hyperbolic. This conjecture, known as the chaotic hypothesis, issupported by numerical computations of sensitivities of statistical quantities inmany popular physical models, including PDE models for turbulence governedby Kuramoto-Sivashinsky equation [27] and the 3D Navier-Stokes equation [14].However, various models used in climate modeling and geophysical ﬂuid dynam-ics indicate the chaotic hypothesis cannot always be applied. For example, arough statistics-parameter relationship has been observed in an El-Ni˜no South-ern Oscillation climate model in [33]; other work describes violations of linearresponse on climate models around atmospheric blocking events [34]. Wormelland Gottwald approach the question of existence of linear response from a sta-tistical mechanical perspective [35]. They construct a worst-case prototype ofa macroscopic system where linear response is upheld despite its failure in the2onstituent microscopic subsystems. Besides this theoretical insight, the au-thors caution [36] that detecting the failure of linear response na¨ıvely usingthe statistics-vs-parameter curve is too data-intensive, and detailed informa-tion about the invariant probability distribution is required. Chekroun et al. ’swork [33] provides a potential mathematical as well as a numerically veriﬁableprocedure for the detection of smoothness of parameter dependences. In par-ticular, a relation between the spectral gap of the transfer operator [37], andthe smoothness of statistical quantities in parameter space is established. In[38], a generalization of the Fluctuation-Dissipation Theorem has been used tostudy the response attributes of an atmospheric general circulation model, andto verify the applicability of the linear response theory for that particular model.The main contribution of this work is an alternative numerical procedurefor the detection of bounded linear response. Our approach is based on usingdetailed statistical information about the underlying probability distribution,which is represented in the density gradient function (see Section 3 and 5 for thedeﬁnition; see also [39] for an intuitive description of the density gradient func-tion). However, we are able to compute this fundamental function simply fromtime series information by developing an eﬃcient ergodic-averaging method.We empirically ﬁnd that when the computed density gradient is Lebesgue-integrable, linear response holds. This observation can also be mathematicallycorroborated by integration-by-parts on Ruelle’s linear response formula. Inparticular, we build on our previous work in which we regularize Ruelle’s for-mula and describe an algorithm to compute the regularized formula, known asspace-split sensitivity or S3. In this work, we utilize the S3 formula to proposea computable criterion for diﬀerentiability of statistics. Thus, the numericalassessment of the validity of linear response presented in this paper has two at-tractive features: i) it is eﬃciently computable along trajectories, ii) it producesingredients that lead to the value of the derivative, when linear response holds.The main body of this paper is divided into six sections. In Section 2,we analyze a representative one-dimensional chaotic map, which we call theonion map, whose statistical quantities exhibit both smooth and non-smoothbehavior. A mathematical argument showing the relation between the derivativeof statistics and the density gradient function is presented in Section 3. In thesame section, we derive a recursive formula for the density gradient functionusing the measure preservation property and provide implementation details. InSection 4, we numerically analyze the distribution of g and estimate the H¨olderexponent of the statistics-parameter relation to validate our results. Basedon the numerical results, we propose a computable mathematical criterion forassessing the smoothness of statistics. Section 5 generalizes our conclusions fromSections 3-4 to higher-dimensional systems and demonstrates numerical resultsusing the Lorenz ’63 system as an example. Finally, Section 6 concludes thispaper. 3 igure 1: Illustration of the onion map at h = 0 .

97 and its dependence on γ . The right-handside plot zooms in the region in the vicinity of the tip.

2. Onion map as an example of one-dimensional chaos

At the beginning of our discussion, we present a simple chaotic map thatfeatures both smooth and non-smooth (rough) behavior. In that map, thedegree of smoothness of statistical quantities strictly depends on the value of itsparameters. Let us consider a one-dimensional map, ϕ : [0 , → [0 , h ], deﬁnedas follows, x k +1 = ϕ ( x k ; γ, h ) = h (cid:112) − | − x k | γ . (1)where γ >

0, 1 ≥ h > ϕ ( x k ) = ϕ ( x k ; γ, h ). Figure 1depicts Eq. 1 at some selected parameter values. Due to its characteristic shape,this map will be further referred to as the onion map. While the proportionalityparameter h only aﬀects the range of ϕ , the exponent γ has a signiﬁcant impacton the function shape in the vicinity of the tip located at (0 . , h ). If γ <

1, thetip is sharp, i.e., the derivative ϕ (cid:48) (0 .

5) does not exist, and its shape resemblesthe cusp map, as deﬁned and illustrated in [21]. The cusp map has been used inmodeling as a one-dimensional simpliﬁcation of the Lorenz ’63 system [40]. Weobserve the tip blunts when γ gets larger than 1, and the shape of ϕ convergesto the well-known logistic map [29] as γ approaches the value of 2. Given thecusp map and logistic map feature smooth and non-smooth statistical behaviors[21], the onion map, which combines both of them, is a perfect example of a mapwith varying regularity of statistical quantities. In our numerical examples, weﬁx the value of h to 0 .

97 and consider diﬀerent values of γ .One can easily verify that the only Lyapunov exponent (LE) λ , deﬁned as λ = lim N →∞ N N − (cid:88) k =0 log (cid:12)(cid:12)(cid:12)(cid:12) ∂ϕ∂x ( x k ) (cid:12)(cid:12)(cid:12)(cid:12) , (2)is always positive for the onion map if h = 0 .

97 and γ ∈ [0 . , . h , then the range of γ for which λ > chaotic behavior of the map4 igure 2: Density distribution ρ ( x ) generated for the onion map (Eq. 1) at h = 0 .

97. Togenerate ρ ( x ), we divided the domain x ∈ [0 ,

1] into K = 2048 bins of equal width, countedthe number of times the trajectory passes through each bin. We used N = 41 , , , K/N , to satisfy the axiom of unit area. reﬂected by the butterﬂy eﬀect , i.e. strong sensitivity to the initial conditions.LE measures rate of separation of two trajectories of a chaotic map and its valuedepends only on the parameter.A function critical in the analysis of chaotic systems is the Sinai-Ruelle-Bowen(SRB) density ρ , which contains statistical information of dynamics describedby ϕ [41, 42, 43, 39]. Intuitively, ρ can be viewed as the likelihood of thetrajectory passing through a non-zero-volume region of the manifold and, ifnormalized, ρ can be viewed as a probability density function. In case of one-dimensional maps deﬁned on [0 , ρ is a function that maps[0 ,

1] to the set of non-negative real numbers, which satisﬁes the unity axiom,i.e., (cid:82) ρ ( x ) dx = 1. Figure 2 shows ρ generated for the onion map at diﬀerentvalues of the exponent γ . We observe the density distribution is smooth if γ ≤ .

4. When the exponent γ becomes higher, but is still no larger than 1,the density function is clearly bounded, but have some non-smooth regions. If γ ≥

1, the ρ distribution features discontinuous regions.Due to the ergodicity of the onion map, its statistics does not depend oninitial conditions. Moreover, this property implies that the Ergodic Theoremholds, and thus we can directly use the stationary density to compute the long-time averages of the onion map. In particular, the theorem ensures that aninﬁnite time average of some quantity of interest J is equal to the expectedvalue of the same quantity computed with respect to the density distribution.Mathematically, it means that (cid:104) J (cid:105) = lim N →∞ N N − (cid:88) i =0 J ( x i ) = (cid:90) J ( x ) ρ ( x ) dx (3)always holds. It is assumed that J is an integrable bounded function and it doesnot depend on the map parameter. The origins of the ﬁrst assumption will be5 igure 3: Relationship between the long-term average and the exponent γ for the onionmap (Eq. 1) at h = 0 .

97 with J ( x ) = δ (cid:15)c ( x ). To generate this plot, we computed densitydistributions at a uniform grid of 16,001 diﬀerent values of γ between 0.2 and 1.8. Foreach value of γ , we run 10 indepentend simulations using N = 41 , , ,

000 samples persimulation. In the calculation of the density, we divided the domain x ∈ [0 ,

1] into K = 4 binsof equal width (see Figure 2 for reference). explained in Section 3. Furthermore, the dependence of the objective functionon the parameter is not considered in this paper, as it does not impose extramathematical complexity in the computation of sensitivities and is irrelevantin the context of our analysis. Therefore, two critical properties can be furtherinferred from Eq. 3. First, the map statistics (cid:104) J (cid:105) solely depends on its parameter γ and, second, the smoothness of statistics strictly depends on the smoothnessof the density distribution. In our numerical test, we set J ( x ) = δ (cid:15)c ( x ), c ∈ [0 , h ],where δ (cid:15)c ( x ) is an indicator function, i.e. J ( x ) = 1 for all x ∈ x (cid:15)c := [ c − (cid:15)/ , c + (cid:15)/ J ( x ) = 0 otherwise. Note with this particular choice of the quantityof interest, the long-time average equals the density distribution itself evaluatedat c in the limit (cid:15) →

0, i.e. (cid:104) J (cid:105) = ρ ( c ) if (cid:15) is inﬁnitesimally small. For any (cid:15) such that x (cid:15)c ⊂ [0 , (cid:104) J (cid:105) equals the integral of ρ over x (cid:15)c . Note also thatfor any Riemann-integrable J ( x ), the statistics can be easily computed usinga numerical integration scheme by virtue of Eq. 3 if ρ is available. Figure 3illustrates the relationship between (cid:104) J (cid:105) and γ , for two diﬀerent values of c .In terms of the function smoothness, we observe a similar trend in bothFigure 2 and Figure 3. In particular, if γ increases, both the density function andlong-time average become more oscillatory and even discontinuous. This result isconsistent with the study in [21], where the relationship between the smoothnessof the statistics and smoothness of the density distribution has been justiﬁedanalytically using the Frobenius-Perron operator, which belongs to the class ofMarkov operators and describes the evolution of the SRB density [44]. Theauthors of [21] notice that if the Frobenius-Perron operator is well-conditionedand the density function is diﬀerentiable in phase space (with respect to x in6D), then ∂ρ/∂γ must be bounded. This observation is critical for the existenceof d (cid:104) J (cid:105) /dγ = (cid:82) J ( x ) ∂ρ∂γ ( x ) dx , which is the sought-after quantity in sensitivityanalysis.Motivated by the above discussion, we strive to ﬁnd a computable and gen-eralizable mathematical criterion for the existence of d (cid:104) J (cid:105) /dγ . In particular, ourpurpose is to identify a condition that can be translated to an eﬃcient numericalmethod, and is applicable to higher-dimensional chaotic systems.

3. Density gradient function in one-dimensional chaos

The main focus of this section is to highlight the signiﬁcance of the densitygradient function g in the context of the diﬀerentiability of statistics in one-dimensional chaotic maps. This function is a fundamental ingredient of thederivative of statistics. Here we consider 1D chaotic maps in which case g isdeﬁned as follows, g ( x ) = d log ρdx ( x ) = ρ (cid:48) ( x ) ρ ( x ) . (4)That is, the density gradient g is the relative rate of change of the SRB densityat each point on the 1D manifold [39]. Throughout this section, we use the primesymbol ( (cid:48) ) to indicate diﬀerentiation with respect to phase space. We assume ϕ : [0 , → [0 ,

1] is a 1D, invertible, ergodic, C map with a positive LE. Let J be a smooth observable whose expectation with respect to the SRB densityor equivalently, the inﬁnite-time average starting from almost everywhere, isdenoted (cid:104) J (cid:105) . In this case, Ruelle’s linear response formula [31, 45], which is aclosed-form expression for the parametric derivative of (cid:104) J (cid:105) , is given by d (cid:104) J (cid:105) dγ = ddγ (cid:90) J ( x ) ρ ( x ) dx = ∞ (cid:88) k =0 (cid:90) f ( x ) (cid:0) J ◦ ϕ k (cid:1) (cid:48) ( x ) ρ ( x ) dx, (5)where f := ∂ϕ/∂γ ◦ ϕ − is the parameter perturbation. The subscript notationis used to denote the number of times a map ϕ is applied i.e., ϕ ( x ) = x and ϕ k ( x ) = ϕ ( ϕ k − ( x )) for any state vector x , while the inverse of the map isindicated using the conventional notation, i.e., ϕ − . Integrating the RHS of Eq.5 by parts leads to an alternative expression for the sensitivity, ddγ (cid:90) J ( x ) ρ ( x ) dx = − ∞ (cid:88) k =0 (cid:90) (cid:32) g ( x ) f ( x ) + f (cid:48) ( x ) (cid:33) ( J ◦ ϕ k )( x ) ρ ( x ) dx, (6)which provides a direct relation between the derivative of the long-time averageand the density gradient function g (see [39] for the derivation of Eq. 6). Eq. 6is in fact a one-dimensional version of the space-split sensitivity (S3) formula,originally derived and computed in [24]. The general form of S3, which shows the7elation d (cid:104) J (cid:105) /dγ vs. g for higher-dimensional systems, is discussed in Section5.2.The above formula is a sum of time-correlations between the function J and gf + f (cid:48) . The k -time-correlation between two observables φ and ψ is given by C φ,ψ ( k ) = (cid:90) φ ◦ ϕ k ( x ) ψ ( x ) ρ ( x ) dx − (cid:16) (cid:90) φ ( x ) ρ ( x ) dx (cid:17)(cid:16) (cid:90) ψ ( x ) ρ ( x ) dx (cid:17) . (7)A classical result in uniform hyperbolicity theory is that C φ,ψ ( k ) decays ex-ponentially with k > C φ,ψ ( k ) ∼ O ( e − ck ) for some constant c > k -th term of regularized Ruelle’s formula (Eq. 6) is in fact a k -timecorrelation between J and h := gf + f (cid:48) because (cid:90) (cid:16) g ( x ) f ( x ) + f (cid:48) ( x ) (cid:17) ρ ( x ) dx = (cid:90) (cid:16) ρ ( x ) f ( x ) (cid:17) (cid:48) dx = 0 . (8)The above integral vanishes since we have a periodic boundary. We remarkthat an analogous boundary term, which appears when we perform integrationby parts on a higher-dimensional unstable manifold, also vanishes (see Section5.2). More generally, Ruelle’s formula converges whenever C J,h ( k ) is summable.Our goal in this work is to investigate this condition so that we can assess theexistence of linear response in systems that may not be uniformly hyperbolic.A numerical assessment is not only useful for practical purposes but necessarybecause mathematical analysis of such systems is diﬃcult. We now isolate theterm that determines the convergence. Considering again the k -th term ofRuelle’s response, C J,h ( k ) = (cid:90) J ◦ ϕ k (cid:16) g ( x ) f ( x ) + f (cid:48) ( x ) (cid:17) ρ ( x ) dx = (cid:90) J ◦ ϕ k ( x ) g ( x ) f ( x ) ρ ( x ) dx + (cid:90) J ◦ ϕ k ( x ) f (cid:48) ( x ) ρ ( x ) dx. (9)In uniformly hyperbolic systems, the function class of observables exhibiting ex-ponential decay of correlations contains C functions. Hence, the time-correlation C J,f (cid:48) ( k ), corresponding to the second integral in Eq. 9, is an exponentially de-caying sequence in uniformly hyperbolic systems, since, by assumption f ∈ C .From here on, we discuss the choice of parameter perturbation, f , and the ob-jective function, J , such that C J,f (cid:48) ( k ) is summable. Typically, in the context ofengineering simulations, J represents a physical quantity, such as force or tem-perature, which are smooth functions. The function f and its derivative f (cid:48) arealso smooth in case of many popular physical models, including the standardparameter perturbations in the Lorenz ’63 system (see Section 5) [11, 47] orKuramoto-Sivashinsky equation [27]. 8e focus now on the ﬁrst integral on the RHS of Eq. 9, isolate the compo-nents that are intrinsic to the dynamics, and whose convergence aﬀects the exis-tence of linear response. Since both f g and J are suﬃciently regular, we expectthat C J,fg ( k ) decays exponentially in uniformly hyperbolic systems. However,Ruelle’s series may converge under weaker conditions on g . For instance, as-suming that C J,f (cid:48) ( k ) is absolutely summable, the boundedness of Ruelle’s seriesdepends only on the absolute summability of the correlation C J,gf ( k ). Now, toisolate conditions on g , we assume that (cid:104) J (cid:105) = 0 . This assumption is withoutloss of generality because for any

J, d γ (cid:104) J (cid:105) = d γ (cid:104) J − (cid:104) J (cid:105)(cid:105) , and hence analyzingthe existence of the derivative of (cid:104) J − (cid:104) J (cid:105)(cid:105) is suﬃcient to determine the validityof linear response. Then, note that there exists some constant c > , such that,for all K ∈ Z + , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:88) k ≤ K (cid:32) (cid:90) J ◦ ϕ k ( x ) g ( x ) f ( x ) ρ ( x ) dx + (cid:90) J ◦ ϕ k ( x ) f (cid:48) ( x ) ρ ( x ) dx (cid:33)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:88) k ≤ K (cid:0) | C J,gf ( k ) | + | C J,f (cid:48) ( k ) | (cid:1) ≤ (cid:88) k ≤ K | C J,gf ( k ) | + c. (10)From the above expression, we can see that when C J,gf ( k ) is absolutely summable,Ruelle’s series is ﬁnite. Then, linear response holds. While this series convergesexponentially in uniformly hyperbolic systems, and hence linear response holds, | C J,gf ( k ) | may be summable beyond uniformly hyperbolic systems. Our goal isto ﬁnd conditions on g that can be veriﬁed numerically, and under which thetime-correlation C J,gf ( k ) is summable, for any smooth perturbation f . Since f can be considered to be arbitrarily smooth, a suﬃcient condition for summabilityis that g , and hence f g , must belong to a function class of observables for whichexponential decay of correlations holds for any pair of observables belonging tothis class. Exponential decay also indicates that a central limit theorem (CLT)of the following form also holds for the observables h, h ◦ ϕ, h ◦ ϕ , ... , when h isin the same function class [41, 48, 46]. This CLT says that the random variable √ N (cid:80) N − n =0 (cid:0) h ◦ ϕ n ( x ) − (cid:104) h (cid:105) (cid:1) is, for large N , and almost every x , distributedaccording to a normal distribution with mean 0 and a ﬁnite variance.The crux of our assessment lies in our eﬃcient numerical method to compute g along a ρ -typical trajectory. With the values of g available along a suﬃcientlylong trajectory, we numerically assess whether the CLT applies by checking ifthe ﬁrst and second moments of | g | exist. We expect that when the CLT doesapply to | g | , linear response holds. Accordingly, in the numerical examplesin Sections 4-5, we ﬁnd that the validity of CLT for | g | is indeed a suﬃcientcondition. Moreover, we observe that a necessary and suﬃcient condition, inthe examples considered, is that the ﬁrst moment of | g | is ﬁnite, i.e., g ∈ L ( ρ ) . Since an analytical assessment of the decay of correlations and CLT, beyondsome hyperbolic systems [41], is absent, if this relationship between the existenceof moments of | g | and that of linear response is generalizable, we may be ableto numerically determine the validity of linear response. Next we describe ournumerical method to compute g along trajectories.9 .2. Computing the derivative of density ρ (cid:48) and density gradient g In this section, we focus on computational aspects of the density gradientfunction. Due to the fact that the SRB density, ρ , is stationary in time, itsatisﬁes ρ ( ϕ ( x )) = ρ ( x ) | ϕ (cid:48) ( x ) | . (11)This statement is an alternative expression of measure preservation, which im-plies that for any J ∈ L ( ρ ) , (cid:90) J ( x ) ρ ( x ) dx = (cid:90) J ◦ ϕ ( x ) ρ ( x ) dx. (12)Applying a change of variables x → ϕ ( x ) to the integral on the right hand side,and recognizing that Eq. 12 holds for all J ∈ L ( ρ ), leads to Eq. 11. Takingthe logarithm of Eq. 11 and then diﬀerentiating with respect to x, we obtain, g ( ϕ ( x )) = g ( x ) ϕ (cid:48) ( x ) − ϕ (cid:48)(cid:48) ( x ) ϕ (cid:48) ( x ) . (13)The above equation converges to the true value ρ (cid:48) ( x ) /ρ ( x ) upon iterating withan initial guess g ◦ ϕ − N ( x ) = 0 , as N → ∞ . A convenient way to numericallyverify Eq. 13 is to approximate ρ (cid:48) = ρg using the above formula for g . We vali-date the results against the ﬁnite diﬀerence approximation of ρ (cid:48) . Let x , x , ..., be a long trajectory, and let the interval [0 ,

1] be divided into K subintervals(bins), { ∆ k } Kk =1 , of equal length 1 /K . We compute a piecewise-constant ap-proximation of ρ ( x ) g ( x ) as follows: ρ ( x ) g ( x ) ≈ KN N − (cid:88) n =0 g ( x n ) I ∆ k ( x n ) , ∀ x ∈ ∆ k (14)where I A is the indicator function over a subset A ⊂ [0 , I A ( x ) = 1 , when x ∈ A, and I A ( x ) = 0 , otherwise. The pointwise error associated with thisapproximation is proportional to (cid:112) K/N , i.e., | ρ ( x ) g ( x ) − ( K/N ) (cid:80) N − n =0 g ( x n ) I ∆ k ( x n ) | decays as O ( (cid:112) K/N ) for all x ∈ ∆ k if g obeys the CLT [46]. Note that, for aﬁxed N , the error increases proportionally to √ K for all x ∈ [0 , /K . Note ifwe replace g ( x n ) with 1 in the RHS of Eq. 14, we eﬀectively obtain a formulafor the density function itself. Thus, from the algorithmic point of view, theprocess of generating ρ (cid:48) requires similar steps as the process of generating ρ ,while g emerges as a byproduct. Analogously, this process can be generalized tohigher-dimensional systems with a 1D unstable manifold. In such systems, g isa scalar function, and thus Eq. 14 still applies assuming an analogous partitionof the higher-dimensional attractor is created. Computational aspects involv-ing the density gradient function deﬁned for higher-dimensional maps with onepositive LE are described in Section 5.2 and Appendix A.Figure 4 illustrates the derivative of density generated for the onion map for10 igure 4: Derivative of density of the onion map (Eq. 1) at h = 0 .

97. We used N = 41 , , ,

000 samples and K = 2048 bins to generate all curves. The solid linesrepresent the derivative of density computed using Eq. 13-14, while the dots represent centralﬁnite diﬀerence approximation of the same function using the corresponding density functionhistograms illustrated in Figure 2. the same set of parameter values as the densities in Figure 2. We observe asatisfactory match between the results generated using the above algorithm for ρ (cid:48) ( x ) and the corresponding ﬁnite diﬀerence approximations as long as γ < . γ , there is a visible discrepancy between the two approxi-mations in the proximity of discontinuities, which is consistent with the density ρ exhibiting discontinuities for γ > .

4. Probing the diﬀerentiability of statistics in one-dimensional chaos | g | The central part of this work is to identify a computable mathematical crite-rion for the diﬀerentiability of statistics. As argued in Section 3.1, we anticipatethere is a relation between the properties of the distribution of | g | and the11 igure 5: Distribution of the absolute value of the density gradient function generated forthe onion map (Eq. 1) at h = 0 .

97. To generate these histograms, we divided the x-axisfrom 10 − to 10 into K = 2048 bins with equal with in the logarithmic scale. For eachhistogram, a trajectory of the length of approximately N = 1 . · has been computed. validity of linear response. We intend to further investigate this observationusing numerical simulation. We compute the distribution of the absolute valueof the density gradient (using Eq. 13) over a range of parameter values. Wealso compute the statistics-vs-parameter curve directly, from which we estimateits H¨older exponent. Very long trajectories are required to make an accurateestimation. However, it is still computationally feasible given the low dimen-sionality of our examples. Using this direct check of diﬀerentiability, we ﬁndany potential correlation between the probability distribution of | g | and theexistence of the parametric derivative.As the ﬁrst step, we study the distribution of | g | , considering it to be a ran-dom variable, for the onion map introduced in Section 2. We obtain empiricallythe distributions of | g | at diﬀerent γ values, from both the smooth and non-smooth regions. Based on the procedure introduced in Section 3.2, we computea suﬃciently long trajectory, and count the number of occurrences of | g | in allbins, each corresponding to a subset of the range of | g | . Figure 5 shows thedistribution of | g | on a logarithmic scale at ﬁxed h = 0 .

97 for diﬀerent values ofthe parameter γ .The vertical axis of Figure 5 represents the number of appearances of agiven value of | g | in each bin. Based on these histograms, we conclude that theprobability density function (PDF) of | g | , denoted as PDF( | g | ), has power-lawbehavior, i.e. PDF( | g | ) ∼ | g | − t , for some exponent t . Since Figure 5 presents12ata on a log-log scale, the bin size increases proportionally to the value of | g | . Therefore, in Figure 5, we observe a distribution that is proportional to | g | · PDF( | g | ) ∼ | g | − t +1 . That is, the slope of the histograms gives us the valueof − t + 1.For power-law probability distributions, the existence of the expected value,variance, and higher-order moments, is solely determined by the value of theexponent t . If t ≤

2, then the mean (expected value), E [ | g | ] = (cid:90) ∞ | g | PDF( | g | ) d | g | = (cid:90) | g ( x ) | ρ ( x ) dx, and all higher moments are inﬁnite. If t >

2, then the mean is ﬁnite. In otherwords, the density gradient belongs to L ( ρ ), when the slope of the histogramin Figure 5 is less than -1. In addition, if the exponent t is larger than 3, thenthe variance var( | g | ) = E [ g ] − ( E [ | g | ]) of the probability distribution functionis ﬁnite. In this case, the density gradient is square-integrable with respect to ρ , i.e., g ∈ L ( ρ ).From Figure 5, the slope at γ ≤ . g is thereforeLebesgue-integrable as long as γ ≤ .

9. Thus, the Lebesgue-integrability thresh-old must be in (0 . , . γ ≥ . t larger than −

1. Furthermore, square-integrabilty threshold can be esti-mated to be around γ = 0 .

5, since the slope at γ = 0 . γ , for example, γ = 0 .

3, we see that both expectationand variance are ﬁnite.

In order to draw a correlation, if any, between the existence of moments of | g | and the validity of linear response, we must identify the regions of parameterspace where linear response exists. For this, we directly assess the smoothness ofthe statistics illustrated in Figure 3 for the onion map. That is, we numericallyestimate the H¨older exponent µ ∈ (0 ,

1] of the long-time average function, whichchanges with γ . A function h ( γ ) deﬁned on a domain D in parameter space issaid to be H¨older continuous with exponent γ , if there exists a C > | h ( γ ) − h ( γ ) | ≤ C | γ − γ | µ (15)for all γ and γ in D If µ = 1, then h ( γ ) is Lipschitz-continuous, and in thiscase, also diﬀerentiable at almost every parameter value in D. Thus, to probethe smoothness of the long-time average, we numerically estimate the H¨olderexponent of the statistics-parameter relation, (cid:104) J (cid:105) vs. γ , from the plot in Figure3. This can be achieved by generating a suﬃcient number of data points andproducing a scatter plot with |(cid:104) J (cid:105) ( γ ) − (cid:104) J (cid:105) ( γ ) | on the y -axis and | γ − γ | on the x -axis, where γ and γ indicate points of evaluation of (cid:104) J (cid:105) ( γ ). If thelogarithmic scaling is used, the H¨older exponent µ can be approximated byestimating the slope (steepness) of the maximum values of |(cid:104) J (cid:105) ( γ ) − (cid:104) J (cid:105) ( γ ) | as | γ − γ | changes. 13ssuming the function (cid:104) J (cid:105) is sampled every δγ along the x-axis, it is clearthat δγ = min γ ,γ ∈ D | γ − γ | . We set δγ = 0 . δγ , however, cannot be toosmall, as the growing statistical noise may signiﬁcantly impact the value of |(cid:104) J (cid:105) ( γ ) − (cid:104) J (cid:105) ( γ ) | . To further reduce the eﬀect of statistical noise, we run 10independent simulations per one parameter value and compute the 3-sigma con-ﬁdence interval of the data coming from these independent simulations, wherethe standard deviation is averaged over a chosen interval of γ .The left-hand side column of Figures 6 and 7 illustrate the statistics versusparameter dependence at four diﬀerent intervals of γ . The second column ofthese two ﬁgures shows |(cid:104) ¯ J (cid:105) ( γ ) − (cid:104) ¯ J (cid:105) ( γ ) | versus | γ − γ | computed from thedata presented in the left-hand side column, where (cid:104) ¯ J (cid:105) represents the long-timeaverage of a modiﬁed objective function ¯ J . The new quantity of interest isobtained by subtracting a linear function from (cid:104) J (cid:105) , illustrated in Figure 3, suchthat the resulting long-time average vanishes at the end points of each intervalof γ . This modiﬁcation is made to visually amplify the roughness of the curve,which is done for demonstration purposes only. See the caption of Figure 6 formore details.The top row of Figure 6 corresponds to the range γ ∈ [0 . , . µ ≈

1. According to Figure 5, the tail of the distribution of | g | hasa slope smaller than -2 in that interval, which implies that | g | has both ﬁnitemean and variance. The second row of Figure 6 corresponds to the interval γ ∈ [0 . , . µ is still close to 1. Figure 5 indicates that g is in L ( ρ ) but not in L ( ρ ) in thatrange (it has ﬁnite mean, but inﬁnite variance), as the slope of the distributiontail is between -2 and -1. Figure 7 includes two sets of plots showing clearlynon-smooth, even discontinuous responses. Even for the interval γ ∈ [1 . , . γ ∈ [1 . , . γ . Again correlating with ournumerical results on the distribution of | g | , for γ >

1, we found that | g | doesnot have a ﬁnite mean.The plots in the right column of Figures 6 and 7 clearly indicate that thestatistics of the onion map is diﬀerentiable as long as γ is smaller than 1. Accord-ing to our analysis of the distribution of | g | , γ < . | g | . It means that, in case of the onion map, linear response holds when g ∈ L ( ρ ). This result also conﬁrms our analysis from Section 3.1. From ournumerical results, in this case, we ﬁnd that the converse is also true: when g / ∈ L ( ρ ), linear response fails. To check whether the equivalence g ∈ L ( ρ ) ⇐⇒ (cid:12)(cid:12)(cid:12)(cid:12) d (cid:104) J (cid:105) dγ (cid:12)(cid:12)(cid:12)(cid:12) < ∞ is generalizable, we will apply the above two-step procedure to a higher-dimensionalsystem with one positive LE. 14 igure 6: Left column: relation of the long-time average and the exponent γ for the onionmap at h = 0 .

97. The simulation data is the same as the data presented in Figure 3 for c = 0 . (cid:104) ¯ J (cid:105) by subtracting a linearfunction describing a straight line crossing the endpoints of the curve in Figure 3 in each γ interval from the original objective function. Each plot corresponds to a diﬀerent γ intervalbetween γ min and γ max , which has been discretized uniformly with step size δγ = 0 . γ , we run 10 independent simulations. Right column: H¨older exponent testresults. First, for each pair of data points from the left-hand side plot, excluding the pairswith the same value of γ (i.e. when γ = γ ), we compute the diﬀerence of the correspondinglong-time average values versus the diﬀerence of their parameter values. Second, we computethe lower-bound of the 3-sigma conﬁdence interval by subtracting 6 averaged sigmas, wheresigma represents standard deviation of results obtained in 10 simulations averaged over theinterval [ γ min , γ max ], from the computed diﬀerences of modiﬁed statistics. This means eachplot has approximately (0 . · ( γ max − γ min ) /δγ ) ≈ . · data points. Skew solid linesrepresent reference lines with the slope of 1 in the logarithmic scale. igure 7: This ﬁgure is an extension of Figure 6. It includes γ intervals corresponding tonon-smooth statistics. All plots have been generated in the same manner as those in Figure6 – see caption therein for more details. . Generalization to a higher-dimensional system In this section, we generalize our conclusions from Section 4 to higher-dimensional systems with a one-dimensional unstable manifold. This meanswe consider n -dimensional systems that have exactly one positive Lyapunov ex-ponent out of n Lyapunov exponents. As a test case, we consider the Lorenz’63 system [11, 47], which consists of three coupled nonlinear ODEs, dx (1) dt = σ ( x (2) − x (1) ) , dx (2) dt = x (1) ( γ − x (3) ) − x (2) , dx (3) dt = x (1) x (2) − βx (3) , (16)where σ ≥ β ≥

0, and γ ≥ x ( t ) = [ x (1) ( t ) , x (2) ( t ) , x (3) ( t )] T . Inour analysis, we set σ and β to their canonical values of 10 and 8/3, respectively,and keep them ﬁxed, while we allow γ to vary. Given the Lorenz ’63 systemis a three-dimensional system, it has three distinct Lyapunov exponents λ i , i = 1 , ,

3, indexed in decreasing order. They satisfy the following constraints[47], λ + λ + λ = − (1 + σ + β ) , λ = 0 . (17)Since both parameters are assumed to be positive, it is evident that Eq. 17admits at most one positive solution. According to [47], for the canonical valuesof σ and β , the Lorenz system is: • non-chaotic (has no positive LEs) if 0 ≤ γ < . γ > . • chaotic (has one positive LE) if 24 . ≤ γ ≤ . γ ∈ [24 . , . ∆ t , starting from a random initialvector x init . In our discussion, we no longer consider the original, that is, con-tinuous version of Lorenz ’63, but rather we focus on the discrete form usinga map ϕ , which is deﬁned by the numerical time integration of the Lorenz ’63system for a time of ∆ t ; that is, ϕ ( x ( t )) = x ( t + ∆ t ) , for all t ∈ R + . In otherwords, an orbit of ϕ , denoted x k , k ∈ Z + , is a numerical solution of the Lorenz’63 system with x k being the 3-element state vector at time step k .As a quantity of interest, we consider the long-time average of the third variable, (cid:104) x (3) (cid:105) = lim N →∞ N N − (cid:88) k =0 x (3) k , (18) Speciﬁc values of the time step size ∆ t are indicated in the captions of correspondingﬁgures. igure 8: Left: projection on the x (1) - x (3) plane of the Lorenz attractor at γ = 25 (red), 40(green), 55 (orange) and 70 (blue). Each projection has been shifted downwards proportionallyto γ for demonstration purposes. Right: relation of the long-time average (cid:104) x (3) (cid:105) (deﬁned byEq. 18) and the system parameter γ . We generate 420,000 data points in total: for each valueof γ on a uniform grid with size δγ = 0 . . · time steps with ∆ t = 0 . which we approximate as (cid:104) x (3) (cid:105) by generating suﬃciently long trajectories. Fig-ure 8 shows a 2D projection of the attractor at diﬀerent values of γ , as well asthe dependence of (cid:104) x (3) (cid:105) on γ . We observe that the attractor expands outwardon the x (1) - x (3) plane, as γ increases. This observation is also reﬂected in thelinear relation between (cid:104) x (3) (cid:105) and γ , which is shown on the right-hand side ofFigure 8. To show the statistical quantities of the Lorenz ’63 system are in factnon-smooth at some values of γ , we subtract a smooth function s ( γ ), obtainedby ﬁtting x (3) vs. γ with a quadratic polynomial, from the original data shownin the right plot of Figure 8. Figure 9 illustrates the behavior of the modiﬁedquantity of interest, i.e., x (3) − s ( γ ), as γ changes. This computational treat-ment clearly reveals the actual regularity of the system’s statistics. Analogouslyto the one-dimensional onion map, here as well we observe a transition from asmooth response at lower values of γ to a non-smooth, and even discontinuous,behavior at larger values of γ . Note the modiﬁed statistics becomes sharp forvalues of γ slightly above 30. This is in fact the region where the Lorenz systemloses its quasi-hyperbolic properties [47].For completeness, we illustrate the x (1) - x (3) projection of the density func-tion of the Lorenz system at three diﬀerent values of γ in Figure 10. We notice aclearly smooth distribution for γ = 28. For γ = 38, however, subtle wrinkles arevisible around the “eyes” of the attractor. In case of γ = 70, regions with largedensity gradients, which clearly indicate non-smoothness of the distribution,appear around the “eyes” and close to the boundary of the attractor.Based on these observations, we anticipate the density gradient function tobe smooth for values of γ close to 28, and non-smooth if γ is higher. We alsoacknowledge a consistency between Figures 9-10 and Figures 2-3, correspondingto the Lorenz ’63 system and onion map, respectively. Both pairs of ﬁgures in-dicate a strong correlation between the smoothness of statistics and smoothnessof the density function in phase space. For a thorough investigation of this con-18 igure 9: The modiﬁed quantity of interest, (cid:104) x (3) (cid:105) − s ( γ ), as a function of γ , where s ( γ ) =1 . γ − . γ . The quadratic function s ( γ ) is a result of least squares polynomial ﬁttingof the original long-time average shown in Figure 8. nection, we must numerically compute g , and also the H¨older exponents of thestatistics-parameter response curve. The deﬁnition of g in higher-dimensionalsystems and its impact on the sensitivity is discussed in the following section. In this section, we consider Ruelle’s formula [31, 45] as applied to multi-dimensional systems with one-dimensional unstable manifolds or equivalently,chaotic systems with one positive LE. We split Ruelle’s formula into stable andunstable contributions out of which the unstable contribution is analogous toEq. 6 for one-dimensional chaotic maps. Due to this similarity, we expect therelationship between the boundedness of linear response and the integrability of g to also hold in multi-dimensional systems with a single direction of instability.As before, we consider an invertible, ergodic, discrete map of a manifold M ,parameterized by γ , and given by x k +1 = ϕ ( x k ; γ ) , k ∈ Z . (19)Here, x k is an n -dimensional state vector. Equation 19 may also arise fromtime-discretizations of continuous ODEs, such as the Lorenz system in Eq. 16.Let D denote the derivative operator with respect to phase space, and Dϕ the n × n Jacobian matrix of the system. For the system deﬁned by Eq. 19, Ruelle’sformula [31, 45] for the parametric derivative of the long-time average can beexpressed as d (cid:104) J (cid:105) dγ = ∞ (cid:88) k =0 (cid:104) D ( J ◦ ϕ k ) · χ, ρ (cid:105) , (20)where χ := ∂ϕ/∂γ ◦ ϕ − is the parameter perturbation vector. This formulais proven [31] rigorously under the assumption of uniform hyperbolicity which19 igure 10: x (1) − x (3) projection of the (unnormalized) density function of the Lorenz ’63 sys-tem at γ = 28 (top), 38 (middle) and 70 (bottom). To generate each plot, a 2D box in phasespace has been divided into n x · n x = 3840 · − , / [20 , − , / [54 ,

70] (middle), [ − , / [40 ,

80] (bottom). We computed a trajectory oflength N = 0 . · with ∆ t = 0 .

002 for each plot. The color bars indicate the number oftimes the trajectory crosses a bin. x ∈ M, there exists a decomposition of the tangent space, T x M = E u ( x ) ⊕ E s ( x ) , whichsatisﬁes the following properties: • covariance property: Dϕ ( E u ( x )) = E u ( ϕ ( x )) , Dϕ ( E s ( x )) = E s ( ϕ ( x )) , • uniform expansion/contraction:for some ﬁxed constants C > λ ∈ (0 , x ∈ M , every vector v ∈ E u ( x ) satisﬁes (cid:107) Dϕ − k ( x ) v ( x ) (cid:107) ≤ Cλ k (cid:107) v ( x ) (cid:107) for all positive integers k . And, (cid:107) Dϕ k ( x ) v ( x ) (cid:107) ≤ Cλ k (cid:107) v ( x ) (cid:107) for all v ∈ E s ( x ) . The norm, (cid:107) · (cid:107) denotes the standard Euclidean norm in R n .Although the series in Eq. 20 has been proven to converge, it is practicallyinfeasible to compute it directly using tangent/adjoint methods [12] in high di-mensional systems. This computational infeasibility [20, 49] stems from the factthat the integrand D ( J ◦ ϕ k ) · χ increases exponentially with k for almost everyperturbation χ . In uniformly hyperbolic systems, it is possible to decomposethe vector χ as χ + χ such that • χ ( x ) ∈ E u ( x ) and both components, χ and χ , are diﬀerentiable on theunstable manifold; • there exists a bounded vector ﬁeld v : M → R d that is orthogonal to E u and satisﬁes v − Dϕ v = χ . Given such a decomposition of χ , we refer to the part of the sensitivity due to χ as the stable contribution, which is given by (cid:80) ∞ n =0 (cid:104) DJ · ( Dϕ n χ ) ◦ ϕ − n , ρ (cid:105) . Itcan be shown that the stable contribution is equivalently expressed as (cid:104) DJ · v, ρ (cid:105) , including that v := (cid:80) ∞ n =0 (cid:0) Dϕ n χ (cid:1) ◦ ϕ − n is a bounded vector ﬁeld [24]. Since,by assumption, (cid:107) DJ (cid:107) ∞ is bounded, the stable contribution is always bounded.We now turn our attention to the sensitivity due to χ , which we shallrefer to as the unstable contribution: (cid:80) ∞ n =0 (cid:104) D ( J ◦ ϕ n ) · χ , ρ (cid:105) . We shall restrictourselves to one-dimensional unstable manifolds. Let q ( x ) be the unit vectoralong the one-dimensional vector space, E u ( x ). Let ξ be a coordinate alongthe unstable manifold, and let χ = a q, where a is the scalar ﬁeld representingthe component of χ along q. We shall regularize the unstable contribution byapplying integration by parts on the unstable manifold [45, 50] to yield, ∞ (cid:88) n =0 (cid:104) D ( J ◦ ϕ n ) · χ , ρ (cid:105) = − ∞ (cid:88) n =0 (cid:104) ( J ◦ ϕ n ) ( ag + b ) , ρ (cid:105) , (21)21here • b = ∂a∂ξ , is the derivative of the vector ﬁeld χ on the unstable manifold, • ρ u is the density of the conditional measures of the SRB measure, ρ, onunstable manifolds, • g := 1 ρ u ∂ρ u ∂ξ is the density gradient function, equal to the derivative of thelogarithm of ρ u on the unstable manifold.The sum of the stable and unstable contributions, as deﬁned above, specializesthe S3 formula to systems with one positive LE. In the case M itself is a one-dimensional manifold, as in the onion map, the unstable contribution, given byEq. 21 is the entire sensitivity, since there is no stable contribution. In Section4, we observed that the integrability of the density gradient g determined theexistence of linear response. In Section 3, the expression we derived for theoverall sensitivity (Eq. 6) is identical to that obtained from integration byparts of Ruelle’s formula that is in Eq. 21. Since the unstable contributionis identical to the one-dimensional case, and the stable contribution is alwaysbounded, this same connection between the regularity of g and the existence oflinear response can potentially be extended to multi-dimensional systems witha one-dimensional unstable manifold (e.g. the Lorenz ’63 system). For thecomputation of both terms of the regularized Ruelle’s formula (Eq. 21) as wellas the details of the derivation above, the reader is referred to [24].In practice, g is computed with the following recursive formula along thetrajectory (see Section 4 of [24]; [39] provides an intuitive explanation for theformula) g k +1 = g k α k − ( ∂ ξ α ) k α k , g = 0 , (22)where α := (cid:107) Dϕ q (cid:107) . The subscript notation applied to a function h is usedto denote the composition h k = h ◦ ϕ k . Note that the iterative formula inEq. 22 reduces to Eq. 13 if ϕ is one-dimensional. That is because, in caseof 1D manifolds, the trajectory can be deformed in only one direction, whichmeans that q = 1, and α = | ϕ (cid:48) | . To compute g , we execute the recursion inEq. 22, analogously to the 1D case described in Section 3.2. In case of higher-dimensional systems, however, the vector q belongs to E u (cid:54) = T M and representsthe direction of trajectory deformation as we travel along it. It is indeed asolution to the homogeneous tangent equation q ( x k +1 ) = α ( x k ) Dϕ ( x k ) q ( x k )with random initial conditions, and is also the only unstable Covariant LyapunovVector (CLV) [51].The extra mathematical diﬃculty in this higher-dimensional case is the eval-uation of ∂ ξ α , which is now an unknown, unlike in 1D systems. This termrequires computing the derivative of q with respect to its own direction (self-derivative), ∂ ξ q . An iterative formula, derived in [52], can be used to approx-imate ∂ ξ q along the trajectory. More details about the computation of ∂ ξ α | g | in a way analogous to the procedure described inSection 4. Using the recursive relation described by Eq. 22, we generate histograms ofthe absolute value of the density gradient function | g | = | ∂ ξ log ρ | for the Lorenz’63 system. Figure 11 illustrates the distributions of | g | at three diﬀerent valuesof γ . We observe power-law behavior of the generated histograms similar tothose of the onion map in Figure 5. Clearly, the exponent t , which is introducedin Section 4, is much higher than 3 if γ = 28. This implies that both theexpected value and variance of | g | are ﬁnite, which means g is square-integrable(with respect to ρ ). The other two distributions (at γ = 40 and γ = 68) featuretails with exponents t slightly smaller than 2, which means that g may not beLebesgue-integrable, as discussed in Section 4.Figure 11 clearly indicates that the Lebesgue-integrability threshold can beestimated to be at some γ between 28 and 40. This result can be correlatedwith the regularity of the density function ρ (see Figure 10), which apparentlyloses its global smoothness for γ ≤

70. In an extensive study of the Lorenzattractor at canonical values of β and σ presented in [47], it was shown that thesystem is quasi-hyperbolic if γ ∈ [24 . ,

31] and non-hyperbolic if γ ∈ [31 , . µ , as deﬁned in Eq. 15, for thestatistics-vs-parameter relation presented in Figure 9, using the procedure de-scribed in Section 4.2. Figures 12-13 illustrate the results of the H¨older exponentnumerical test generated for three diﬀerent intervals of γ . These results clearlyindicate the exponent µ is approximately 1 if γ ∈ [28 , µ is signiﬁcantly smaller than 1, which implies the long-time average cannot be diﬀerentiable at γ >

36. Therefore, one can observe aclear correlation between the Lebesgue-integrability (with respect to ρ ) of g andsmoothness of statistics, which is consistent with our numerical results of theonion map from Section 4.

6. Conclusions and future work

Statistical quantities are critical both in understanding and in applicationsof chaotic phenomena, such as turbulent ﬂows. In many chaotic dynamical sys-tems, the relation between statistical quantities and system parameters is notsmooth. In this paper, we show that the existence of the parametric derivativeof a statistics or long-time average (sensitivity) depends on whether the densitygradient function g , which we deﬁne, is integrable with respect to the SRB mea-sure. That function represents the relative rate of change of the SRB densitywith respect to the coordinates of the unstable manifold. The relationship be-tween the sensitivity and g is clearly reﬂected by the S3 formula, a closed-form23 igure 11: Distribution of the absolute value of the density gradient function generated forthe Lorenz attractor at three diﬀerent values of γ using Eq. 22. To generate these histograms,we divided the x-axis from 10 − to 10 into K = 2048 bins of equal width in the logarithmicscale. For each histogram, a trajectory of the length of approximately 2 . · , computed bysolving Eq. 16 with ∆ t = 0 .

01, is used. expression for the sensitivity, which stems from Ruelle’s linear response theory.This observation can be utilized to construct an eﬃcient numerical procedureto assess the diﬀerentiability of statistics. The computation of the probabilitydistribution of | g | is the central part of the procedure. The probability densityfunction of | g | features a power-law behavior in case of both the systems con-sidered in this paper: the onion map and Lorenz system. In this special case,a numerical estimate of the power law exponent is suﬃcient to determine thediﬀerentiability of statistics. We validate this test by numerically computingthe H¨older exponent of statistics over parameter space.The density gradient function and hence its probability distribution can benumerically generated through recursive equations along suﬃciently long trajec-tories, solving which is the most expensive part of our procedure. These formu-las require integrating the primal system, as well as ﬁrst-order and second-ordertangent systems in time, which in turn require the ﬁrst and second derivativesof the map at each time step.A possible subject of future work is to generalize our algorithm to systemswith higher-dimensional unstable spaces, i.e., to systems with more than onepositive Lyapunov exponent. Based on the general form of the S3 formula,we believe that the major conclusion of this paper would remain the same.That is, the indicator of the smoothness of statistical quantities is still theLebesgue-integrability of g . However, in systems with higher-dimensional un-24 igure 12: Left column: analogously to Figures 6 - 7, the simulation data is the same asthe data presented in Figure 9, and the quantity of interest has been modiﬁed such thatthe values of the long-time average at the endpoints of each interval is zero. It has beenachieved by subtracting a linear function describing a straight line crossing the endpoints ofthe original curve, (cid:104) x (cid:105) − s ( γ ). The modiﬁed objective function has been denoted by ¯ J . Eachplot corresponds to a diﬀerent γ interval, which has been discretized uniformly between γ min and γ max with step size δγ = 0 . γ , we run 10 simulations. Right column:H¨older exponent test of the statistical quantity (cid:104) ¯ J (cid:105) = (cid:104) z (cid:105) − s ( γ ) versus parameter γ relationplotted in Figure 9. These plots have been generated in the same fashion as those for the onionmap in Figures 6 - 7 (see the caption of Figure 6 for a detailed description), i.e., by taking thelower bound of the 3-sigma conﬁdence interval of the data set corresponding to [ γ min , γ max ],obtained in 10 independent simulations. We sample the statistics every δγ = 0 . . · (( γ max − γ min ) /δγ ) = 8 · data points. Skewsolid lines represent reference lines with the slope of 1 in the logarithmic scale. igure 13: This ﬁgure is an extension of Figure 12. All plots have been generated in the samemanner as their counterparts in Figure 12 – see caption therein for more details. stable manifolds, g is a vector quantity that involves derivatives with respect toall coordinates of the unstable manifold. This implies computation of deriva-tives of all basis vectors of the unstable space will be required. The iterativeprocedure for g , therefore, is expected to be computationally more costly. Acknowledgments

This work was supported by Air Force Oﬃce of Scientiﬁc Research GrantNo. FA8650-19-C-2207.

Conﬂict of interests

The authors declare that they have no conﬂict of interests.

Supplementary materials

In [53], we provide our code, including post-processing routines, that wewrote to produce Figures 2-13. In the same repository, we include most of theraw data used in the preparation of this manuscript. Some raw data ﬁles oflarge size were not added to the repository, however, all of them are availableupon request.

References [1] F. Geng, I. Kalkman, A. S. J. Suiker, B. Blocken, Sensitivity analysisof airfoil aerodynamics during pitching motion at a Reynolds number of1 . · , Journal of Wind Engineering and Industrial Aerodynamics 183(2018) 315–332. doi: .262] U. Kirsch, Eﬃcient sensitivity analysis for structural optimization, Com-puter Methods in Applied Mechanics and Engineering 117 (1994) 143–156.doi: .[3] H. A. Dwyer, T. Peterson, Study of turbulent ﬂow with sensitivity analysis,AIAA Journal 19 (2020) 1309–1314. doi: .[4] C. Hu, X. Yang, X. Zhu, Z. Du, Stability and structural sensitivity analysisof the turbulent ﬂow in the narrow vaneless diﬀuser with mean ﬂow method,Computers & Fluids 177 (2018) 46–57. doi: .[5] G. A. Chua, Y. Liu, Sensitivity analysis on responsive pricing and produc-tion under imperfect demand updating, Naval Research Logistics 66 (2019)529–546. doi: .[6] Z. Weimin, H. Jun, S. Jingsong, S. Jun, S. Desheng, Improving productdevelop process time based on process sensitivity analysis, in: 2012 IEEEInternational Conference on Computer Science and Automation Engineer-ing (CSAE), volume 3, 2012, pp. 76–79. doi: .[7] L. Ren, M. Hartnett, Sensitivity analysis of a data assimilation techniquefor hindcasting and forecasting hydrodynamics of a complex coastal waterbody, Computers & Geosciences 99 (2017) 81–90. doi: .[8] S. A. Margulis, Variational sensitivity analysis and data assimilation stud-ies of the coupled land surface-atmospheric boundary layer system, Ph.D.thesis, Massachusetts Institute of Technology, 2002.[9] L. Arriola, J. M. Hyman, Sensitivity analysis for uncertainty quantiﬁca-tion in mathematical models, in: G. Chowell, J. M. Hyman, L. M. A.Bettencourt, C. Castillo-Chavez (Eds.), Mathematical and Statistical Esti-mation Approaches in Epidemiology, Springer Netherlands, 2009, pp. 195–247. doi: .[10] A. Caicedo-Casso, H.-W. Kang, S. Lim, C. I. Hong, Robustness and periodsensitivity analysis of minimal models for biochemical oscillators, ScienceReports 5 (2015). doi: .[11] E. Lorenz, Deterministic nonperiodic ﬂow, Journal of Atmospheric Sci-ences 32 (1963) 2022–2026. doi: .[12] D. J. Lea, M. R. Allen, T. W. Haine, Sensitivity analysis of the climateof a chaotic system, Tellus 52 (2000) 523–532. doi: . 2713] J.-T. Hwang, H. Rabitz, The Green’s function method of sensitivityanalysis in quantum dynamics, Journal of Chemical Physics 79 (1979).doi: .[14] A. Ni, Hyperbolicity, shadowing directions and sensitivity analysis of aturbulent three-dimensional ﬂow, Journal of Fluid Mechanics 863 (2019)644–669. doi: .[15] A. Jameson, Aerodynamic design via control theory, Journal of ScientiﬁcComputing 3 (1988) 233–260. doi: .[16] C. Rackauckas, Y. Ma, V. Dixit, X. Guo, M. Innes, J. Revels, J. Nyberg,V. Ivaturi, A comparison of automatic diﬀerentiation and continuous sensi-tivity analysis for derivatives of diﬀerential equation solution, arXiv e-printsarXiv:1812.01892 (2018).[17] J. E. V. Peter, R. P. Dwight, Numerical sensitivity analysis for aerodynamicoptimization: A survey of approaches, Computers & Fluids 39 (2010) 373–391. doi: .[18] Q. Wang, R. Hu, P. Blonigan, Least Squares Shadowing sensitivity analysisof chaotic limit cycle oscillations, Journal of Computational Physics 267(2014) 210–224. doi: .[19] A. Ni, Q. Wang, Sensitivity analysis on chaotic dynamical systems bynon-intrusive least squares shadowing (NILSS), Journal of ComputationalPhysics 347 (2017) 56–77. doi: .[20] G. Eyink, T. Haine, D. Lea, Ruelle’s linear response formula, ensembleadjoint schemes and l´evy ﬂights, Nonlinearity 17 (2004) 1867. doi: .[21] P. Blonigan, Q. Wang, Probability density adjoint for sensitivity analysis ofthe Mean of Chaos, Journal of Computational Physics 270 (2014) 660–686.doi: .[22] D. Lasagna, Sensitivity Analysis of Chaotic Systems Using Unstable Pe-riodic Orbits, SIAM Journal on Applied Dynamical Systems 17 (2018)547–580. doi: .[23] R. V. Abramov, A. J. Majda, Blended response algorithms for linearﬂuctuation-dissipation for complex nonlinear dynamical systems, Nonlin-earity 20 (2007). doi: .[24] N. Chandramoorthy, Q. Wang, A computable realization of Ruelle’s for-mula for linear response of statistics in chaotic systems, arXiv e-printsarXiv:2002.04117 (2020).[25] A. Ni, Linear response algorithm for diﬀerentiating stationary measures ofchaos, arXiv e-prints arXiv:2009.00595 (2020).2826] N. Chandramoorthy, L. Magri, Q. Wang, Variational optimization and dataassimilation in chaotic time-delayed systems with automatic-diﬀerentiatedshadowing sensitivity, arXiv e-prints arXiv:2011.08794 (2020).[27] P. J. Blonigan, Q. Wang, Least squares shadowing sensitivity analysis ofa modiﬁed Kuramoto–Sivashinsky equation, Chaos, Solitons & Fractals 64(2014) 16–25. doi: .[28] T. Bodai, V. Lucarini, F. Lunkeit, Can we use linear response theory toassess geoengineering strategies?, arXiv e-prints arXiv:1803.09606 (2020).[29] R. May, Simple mathematical models with very complicated dynamics,Nature 261 (1976) 459–467. doi: .[30] N. Chandramoorthy, Q. Wang, On the probability of ﬁnding a nonphysicalsolution through shadowing, arXiv e-prints arXiv:2010.13768 (2020).[31] D. Ruelle, Diﬀerentiation of SRB states, Communications in MathematicalPhysics 187 (1997) 227–241. doi: .[32] G. Galavotti, E. G. D. Cohen, Dynamical ensembles in stationary states,Journal of Statistical Physics 80 (1995) 931–970. doi: .[33] J. D. Chekroun, M. C. Neelin, D. Kondrashov, J. C. McWilliams, M. Ghil,Rough parameter dependence in climate models and the role of Ruelle-Pollicott resonances, Proceedings of the National Academy of Sciences 111(2014) 1684–1690. doi: .[34] A. Gritsun, V. Lucarini, Fluctuations, response, and resonances in a simpleatmospheric model, Physica D: Nonlinear Phenomena 349 (2017) 62–76.doi: .[35] C. L. Wormell, G. A. Gottwald, Linear response for macroscopic ob-servables in high-dimensional systems, Chaos 29 (2019). doi: .[36] C. L. Wormell, G. A. Gottwald, On the validity of linear response theory inhigh-dimensional deterministic dynamical systems, Journal of StatisticalPhysics 172 (2018) 1479–1498. doi: .[37] V. Baladi, Positive Transfer Operators and Decay of Correlations, WorldScientiﬁc, 2000. doi: .[38] A. Gritsun, G. Branstator, A. Majda, Climate response of linear andquadratic functionals using the Fluctuation–Dissipation Theorem, Journalof Atmospheric Sciences 65 (2008) 2824––2841. doi: .[39] A. A. ´Sliwiak, N. Chandramoorthy, Q. Wang, Ergodic Sensitivity Analysisof One-Dimensional Chaotic Maps, Theoretical and Applied MechanicsLetters 10 (2020) 438–447. doi: .2940] M. Mehta, A. K. Mittal, S. Dwivedi, The double-cusp map for the forcedLorenz system, International Journal of Bifurcation and Chaos 13 (2003)3029–3035. doi: .[41] L.-S. Young, Statistical properties of dynamical systems with some hyper-bolicity, Annals of Mathematics 147 (1998) 585–650. doi: .[42] L.-S. Young, What Are SRB Measures, and Which Dynamical SystemsHave Them?, Journal of Statistical Physics 108 (2002) 733–754. doi: .[43] H. Crimmins, G. Froyland, Fourier approximation of the statistical prop-erties of Anosov maps on tori, Nonlinearity 33 (2020). doi: .[44] J. Ding, T. Y. Li, Markov ﬁnite approximation of Frobenius-Perron op-erator, Nonlinear Analysis: Theory, Methods & Applications 17 (1991)759–772. doi: .[45] D. Ruelle, Diﬀerentiation of SRB states: correction and complements,Communications in Mathematical Physics 234 (2003) 185–190. doi: .[46] N. I. Chernov, Limit theorems and Markov approximations for chaoticdynamical systems, Probability Theory and Related Fields 101 (1995)321–362. doi: .[47] C. Sparrow, The Lorenz Equations, Springer-Verlag New York, 1982.doi: .[48] C. Liverani, Decay of correlations for piecewise expanding maps, Journalof Statistical Physics 78 (1995) 1111–1129. doi: .[49] N. Chandramoorthy, P. Fernandez, C. Talnikar, Q. Wang, Feasibility anal-ysis of ensemble sensitivity computation in turbulent ﬂows, AIAA Journal57 (2019). doi: .[50] M. Jiang, Diﬀerentiating potential functions of SRB measures on hy-perbolic attractors, Ergodic Theory and Dynamical Systems 32 (2012).doi: .[51] F. Ginelli, H. Chat´e, R. Livi, A. Politi, Covariant Lyapunov vectors, Jour-nal of Physics A: Mathematical and Theoretical 46 (2013). doi: .[52] N. Chandramoorthy, Q. Wang, An ergodic averaging method to diﬀerenti-ate covariant Lyapunov vectors, arXiv e-prints arXiv:2007.08297 (2020).[53] A. A. ´Sliwiak, Q. Wang, Supplementary ﬁles (code and data) for themanuscrupt titled “Computational assessment of smooth and rough pa-rameter dependence of statistics in chaotic dynamical systems”, https://github.com/asliwiak/differentiabilityStatistics , 2021.30 ppendix A. Computation of density gradient in higher-dimensionalsystems with one positive LE To compute the density gradient function along a trajectory, we need todiﬀerentiate α = (cid:107) Dϕ q (cid:107) with respect to ξ every time step. By diﬀerentiatingthe Euclidean norm and using the chain rule, we can expand the derivative inthe second term of the RHS of Eq. 22 to the following expression, ∂ ξ α = 1 α ( Dϕ q ) T ∂ ξ ( Dϕ q ) = 1 α ( Dϕ q ) T (cid:32) ( D ϕ · q ) q + Dϕ ∂ ξ q (cid:33) := α ( Dϕ q ) T u, (A.1)in which D ϕ is a third-order tensor with second derivatives of each componentof ϕ , while the dot symbol ( · ) represents the contracted tensor-vector product.In the diﬀerentiation of α , a special type of parameterization of the unstablemanifold curve x ( ξ ) ∈ M is assumed, namely that ∂ ξ x ( ξ ) = q ( x ( ξ )). If adiﬀerent parameterization was used, then the second term on the RHS of Eq.22 would have to be adjusted accordingly. Regardless of the parameterization,however, direct computation of (cid:107) ∂ ξ x ( ξ ) (cid:107) is never required in the iterative processfor g . The directional derivative ∂ ξ measures the rate of change along the curveof the local unstable manifold, i.e., in the direction of q , and thus ∂ ξ q is calledthe self-derivative of q [52]. Note in case of a 1D map, q = 1, ∂ ξ q = 0, andthus by combining Eq. 22 and Eq. A.1, we obtain the iterative formula for g we derived in Section 3 using the Frobenius-Perron operator (compare withEq. 13). The RHS of Eq. A.1 requires computing all possible ﬁrst derivativesof the map in phase space, i.e., the n -element Jacobian Dϕ , and all possiblesecond derivatives, i.e., the n -element tensor D ϕ every time step. In case ofthe Lorenz ’63 system, however, the tensor D ϕ is constant, and only 4 out ofits 27 elements are non-zero.The ﬁnal ingredient needed to evaluate ∂ ξ α is the self-derivative of q . Based onthe covariance property of CLVs, which ensures that ( Dϕ ) k q k = α k q k +1 , andthe chain rule in smooth manifolds, a recursive formula for ∂ ξ q ,( ∂ ξ q ) k +1 = ( I − q k +1 q Tk +1 ) (( D ϕ ) k · q k ) q k + ( Dϕ ) k ( ∂ ξ q ) k α k =( I − q k +1 q Tk +1 ) u k , (A.2)has been derived in [52]. That study also shows Eq. A.2 converges asymptot-ically at an exponential rate, regardless of the choice of the initial condition( ∂ ξ q )( x init ) = ( ∂ ξ q ) . Note Eq. A.2 resembles the formula for ∂ ξ α , Eq. A.1. Infact, by applying the covariance property of CLVs, Eq. A.2 can be rewritten asfollows, ( ∂ ξ q ) k +1 = u k − ( ∂ ξ α ) k α k q k +1 . (A.3)Therefore, the quantity ∂ ξ α is a byproduct of the recursive procedure for ∂ ξ q .We observe that the vector u k − ( ∂ ξ q ) k +1 is parallel to q k +1 , and the scalar( ∂ ξ α ) k /α k is its magnitude. 31o conclude, the computation of the density gradient function g requires theiterative formula in Eq. 22. However, in case of higher-dimensional problemswith one positive LE, the recursion described by Eq. A.2 for ∂ ξ q (cid:54) = 0 must berun simultaneously, from which the scalar ∂ ξ αα