Generated using the official AMS LaTeX template v5.0
Using local dynamics to explain analog forecasting of chaotic systems
P. Platzer ∗ Laboratoire des Sciences du Climat et de l’Environnement, Saclay, France & IMT Atlantique, Lab-STICC, UMR CNRS 6285, F-29238, Plouzané, France & France Énergies Marines, Plouzané, France
P. Yiou and P. Naveau
Laboratoire des Sciences du Climat et de l’Environnement, Saclay, France
P. Tandeo and Y. Zhen
IMT Atlantique, Lab-STICC, UMR CNRS 6285, F-29238, Plouzané, France
P. Ailliot
Laboratoire de Mathématiques de Bretagne Atlantique, Brest, France
J-F. Filipot
France Énergies Marines, Plouzané, France

∗ Corresponding author: Paul Platzer, [email protected]

ABSTRACT

Analogs are nearest neighbors of the state of a system. By using analogs and their successors in time, one is able to produce empirical forecasts. Several analog forecasting methods have been used in atmospheric applications and tested on well-known dynamical systems. Although efficient in practice, theoretical connections between analog methods and dynamical systems have been overlooked. Analog forecasting can be related to the real dynamical equations of the system of interest. This study investigates the properties of different analog forecasting strategies by taking local approximations of the system's dynamics. We find that analog forecasting performances are highly linked to the local Jacobian matrix of the flow map, and that analog forecasting combined with linear regression allows one to capture projections of this Jacobian matrix. The proposed methodology allows one to estimate analog forecasting errors and to compare different analog methods. These results are derived analytically and tested numerically on two simple chaotic dynamical systems.

Introduction
To evaluate the future state of a physical system, one strategy is to use physical knowledge to build differential equations that emulate the dynamics of this system. Then, measurements provide information on the initial state from which these equations must be integrated. Data assimilation gives a framework to account for two main types of error in this forecasting process. First, the aforementioned equations do not perfectly describe the real dynamics of the system, and solving these equations often requires additional approximations, such as numerical discretization. These first error sources combine into what is called model error. Second, observations are usually partial and noisy, such that the initial state from which the differential equations must be integrated is uncertain. Observation error is especially important for chaotic dynamical systems, as the latter are highly sensitive to initial conditions.

For complex, highly nonlinear systems such as the atmosphere, forecasts based on physical equations are challenging. Therefore, many empirical methods have been used in atmospheric sciences (see Van den Dool et al. 2007, and references therein). The last decades have seen a proliferation of data from numerical model outputs, observations or the combination of both (see for instance Saha et al. 2010; Hersbach et al. 2020), strengthening scientific interest in empirical methods. One such method is called analog forecasting and is based on a notion originally introduced by Lorenz (1969) to estimate atmospheric predictability. Analog forecasting has been used in meteorological applications and on famous low-dimensional dynamical systems. Yiou (2014) uses analogs in the context of stochastic weather generators. Tandeo et al. (2015), Hamilton et al. (2016) and Lguensat et al. (2017) combine analog forecasting and data assimilation.
More generally, analog forecasting procedures are used in a large range of environmental applications, from tropical intraseasonal oscillations (Alexander et al. 2017) to solar irradiance (Ayet and Tandeo 2018).

Analog forecasting proposes to bypass physical equations and to use existing trajectories of the system instead, drawing either from numerical model output, observation data or reanalysis. Analog methods are based on the hypothesis that one is provided with many (or one long) trajectories of the system of interest, which makes it possible to find analog states close to any initial state, and to use the time-successors of these analogs to evaluate the future state of the system. The fluctuating quality and density of available trajectories adds variability to this process. This leads to analog forecasting errors, which play the same role as the previously described model errors.

Preliminary results suggest that analog forecasting errors can be estimated empirically using local approximations of the true dynamics (Platzer et al. 2019). The current paper gives a more in-depth description of the theory that supports different analog forecasting procedures, and allows one to evaluate the evolution of analog covariance matrices. The methodology is applied to two famous chaotic Lorenz systems.

The theoretical framework for analog forecasting is outlined in Sec. 1, and three analog forecasting operators are recalled. The point of view of dynamical systems is then detailed in Sec. 2. Finally, Sec. 3 examines analog forecasting mean and covariance, and investigates the link between linear regression in analog forecasting and the Jacobian matrix of the real system's flow map. The discussion section takes a broader view and outlines limitations which provide opportunities for new research. The conclusion emphasizes the major results of the paper.

1. Analog forecasting

a. Mathematical framework
Let a dynamical system be defined by the following time-differential equation:

  dx/dt = f(x),   (1)

where x is a vector that fully characterizes the state of the system, and f is a deterministic, vector-valued map. The space P in which x lives is called phase-space. In most applications and throughout this study, P is a vector space of finite dimension n. The system is supposed to be autonomous, such that f : P → P does not depend on time.

Given an initial state x, a forecast gives an estimation of the state of the system x_t at a later time t. The true future state x_t is given by the flow map Φ : P × R → P such that:

  Φ^t : x → Φ^t(x) = x_t.   (2)

For the dynamical system defined through Eq. (1), Φ represents the time-integration of this equation. For ergodic systems, trajectories come back infinitely close to their initial condition after a sufficiently long time (Poincaré 1890). Furthermore, if the dynamical system has an attractor set A ⊂ P, then all trajectories converge to this subset of the phase-space (Milnor 1985). Analog methods are based on the idea that if one is provided with a long enough trajectory of the system of interest, one will find analog states close to any initial point x in the attractor A. The trajectory from which the analogs are taken is called the "catalog" C, and can either come from numerical model output or reprocessed observational data.

Analog forecasting thus supposes that we know a finite number of initial states that are close enough to x to be called "analogs", and that the flow map of the analogs resembles Φ. Therefore, the time-successors of the analogs should make it possible to estimate the real future state x_t. In the following, the k-th analog and its successor are noted a^k and a^k_t. Note that analog forecasting is intrinsically random as it depends on the catalog, which is one out of many possible trajectories. The variability in the catalog influences the ability of the analogs and successors to estimate the future state.
This motivates the use of probabilistic analog forecasting operators Θ such that:

  Θ^t : x → Θ^t(x),   (3)

where Θ^t(x) is a distribution that gives information both about the estimation of the future state x_t and about the variability of this estimation process.

Note that for chaotic dynamical systems, analog forecasting can only work if t is smaller than what is called the "Lyapunov time". This is the characteristic timescale after which trajectories of chaotic systems diverge, such that even if the analog a^k is infinitesimally close to x and if it follows exactly the same dynamics as the real state, the successor a^k_t will still be far away from x_t. This study is devoted to the properties of analog forecasting below the Lyapunov timescale.

b. Analog forecasting operators

Here are recalled three analog forecasting operators originally introduced in Lguensat et al. (2017). A finite number K of analogs (a^k)_{k∈[1,K]} and successors (a^k_t)_{k∈[1,K]} are used, and are assigned weights (ω_k)_{k∈[1,K]}. This allows one to give more weight to the pairs of analogs and successors that are best suited for the estimation of x_t. The present article studies the properties of analog forecasting without restriction to any particular choice of weights and distance.

The distribution of each analog forecast Θ^t(x) is multinomial, with each pair of analog/successor defining an element of the empirical distribution.

The locally-constant operator (LC) uses only the successors to estimate x_t: Θ^t_LC(x) ∼ Σ_k ω_k δ_{a^k_t}(·). The mean forecast is thus μ_LC = Σ_k ω_k a^k_t. The covariance of the forecast is cov_{ω_k}(a^k_t), the ω-weighted empirical covariance of the successors.

The locally-incremental operator (LI) uses x, the analogs and the successors to estimate x_t: Θ^t_LI(x) ∼ Σ_k ω_k δ_{x + a^k_t − a^k}(·). The mean forecast is μ_LI = x + Σ_k ω_k (a^k_t − a^k).
The covariance of the forecast is cov_{ω_k}(a^k_t − a^k), the ω-weighted empirical covariance of the increments.

The locally-linear operator (LL) performs a weighted linear regression between the analogs and the successors. The regression is applied between a^k − μ and the successors a^k_t, where μ = Σ_k ω_k a^k. This gives slope S, intercept c, and residuals ξ^k = a^k_t − S(a^k − μ) − c: Θ^t_LL(x) ∼ Σ_k ω_k δ_{μ_LL + ξ^k}(·). The mean forecast is μ_LL = S(x − μ) + c. The covariance of the forecast is cov_{ω_k}(ξ^k), the ω-weighted empirical covariance of the residuals.

The locally-constant (LC), locally-incremental (LI) and locally-linear (LL) analog forecasting operators are illustrated in Fig. 1. The variance of the LC is of similar magnitude at small and large lead times. On the other hand, the variance of the LI goes to 0 as t →
0, but for large times the LI estimator has a larger variance compared to the LC. The next sections provide information that helps in interpreting this phenomenon. The LL is able to catch the dynamics, and therefore shows a small variance and a good precision at all times. This is due to the fact that, in this example, non-linear terms are small and the flow map of the analogs matches exactly the real system's flow map.

It is worth mentioning another kind of analog forecasting operator called "constructed analogs" (CA). It is a particular case of the locally-constant operator where the weights ω^CA_k can have negative values and are such that the mean of the analogs μ is as close as possible to the initial state: {ω^CA_k}_k = argmin_{ω_k} |Σ_k ω_k a^k − x|. It was used by Van Den Dool (1994) to create better analogs in the case of small catalogs. Later, Tippett and DelSole (2013) showed that CA are equivalent to the locally-linear operator with constant weights. In the following and unless otherwise specified, it is assumed that the weights ω_k are positive and decreasing functions of the distance between a^k and x.
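The three mean forecasts above can be sketched in a few lines of code. The following is a minimal illustration, not the implementation of Lguensat et al. (2017); the function name `analog_means` and the array layout are our own assumptions, and the LL slope is obtained by weighted least squares as in the definition of the operator.

```python
import numpy as np

def analog_means(x, analogs, successors, weights):
    """Mean forecasts of the LC, LI and LL operators of Sec. 1b.

    analogs, successors: (K, n) arrays of the a^k and their successors a^k_t;
    weights: (K,) array of the weights omega_k (normalized below)."""
    w = weights / weights.sum()
    mu = w @ analogs                         # weighted mean of the analogs
    mu_LC = w @ successors                   # locally constant
    mu_LI = x + w @ (successors - analogs)   # locally incremental
    # Locally linear: weighted least-squares regression of a^k_t on (a^k - mu).
    X = analogs - mu
    W = np.diag(w)
    S = np.linalg.solve(X.T @ W @ X, X.T @ W @ successors).T   # slope
    c = mu_LC                                # intercept = weighted mean of successors
    mu_LL = S @ (x - mu) + c
    return mu_LC, mu_LI, mu_LL
```

For purely linear dynamics a^k_t = A·a^k, the LL mean recovers A·x exactly, since the regression then reproduces the true linear map.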
2. Successor-to-future state distance

a. Notations and hypotheses
This work assumes that the evolution dynamics of the analogs are similar to the evolution dynamics of the system of interest, and that the system is deterministic. This can be stated in a differential-equation form:

  dx/dt = f(x),  x_{t=0} = x,   (4a)

  ∀k, da^k/dt = f_a(a^k),  a^k_{t=0} = a^k,  with f_a = f + δf̃,   (4b)

or in an integrated form using flow maps:

  x_t = Φ^t(x),   (5a)

  ∀k, a^k_t = Φ^t_a(a^k),  with Φ^t_a = Φ^t + δΦ̃^t,   (5b)

where Φ_a is the flow map of the analogs, and Φ̃ is the difference between the analog and real flow maps, normalized through the scalar value δ such that Φ, Φ_a and Φ̃ are of the same order of magnitude. The maps f, f_a and f̃ are defined accordingly.

In these formulations, the fundamental hypotheses of analog forecasting are the continuity of Φ^t (or f) with respect to the phase-space variable x, the density of the catalog C (for all k, a^k is close to x for a given metric) and the adequacy of the analogs' dynamics (δ is small, Φ_a ≈ Φ).

The next section investigates the ability of analogs and successors to approximate the real system state, provided that t is below the Lyapunov time and that the aforementioned hypotheses are verified.

b. When analogs work: Taylor expansions of the dynamics
1) Distance between successor and real state

Assuming different levels of smoothness of the flow maps and using Taylor expansions, one can estimate the difference between the real future state x_t and any given successor a^k_t at leading order:

  ∀k, a^k_t − x_t = δΦ̃^t(x) + [∇Φ^t|_x]·(a^k − x) + O(|a^k − x|², δ|a^k − x|),   (6a)

where ∇Φ^t|_x is the Jacobian matrix (the matrix of partial derivatives in phase-space) of Φ^t at x, '·' is the matrix multiplication, and O(|a^k − x|², δ|a^k − x|) represents higher-order terms. Neglecting these higher-order terms and lightening notations, this equation can be rewritten:

  a^k_t − x_t ≈ δΦ̃^t + ∇Φ^t·(a^k − x),

where the evaluation of δΦ̃^t and ∇Φ^t at x is implicit.

The leading-order difference terms explicitly described in the right-hand side of Eq. (6a) come from two sources. The first source is the difference between the analog and true flow maps at point x, which is independent of a^k. The second source of difference is the mismatch in the initial condition, left-multiplied by the Jacobian matrix of the true flow map at point x.

Eq. (6a) states that at first order, these two error terms are additive. This is not true at higher orders. Higher-order terms include the bilinear product of a^k − x with a matrix of second derivatives of Φ^t called the Hessian, and the product of the Jacobian of Φ̃^t at x with a^k − x.

Fig. 2 shows applications of Eq. (6a) to the three-variable system of Lorenz (1963), hereafter noted L63. A real trajectory is compared with two analog trajectories. The L63 system is solved numerically using a fourth-order Runge-Kutta finite-difference scheme, with numerical integration time step Δt = 0.
01 non-dimensional time. For notation details, see Eq. (A1) in the appendix. The real trajectory has parameters σ = 10, ρ = 28 and β = 8/
3, while the σ parameter for the analog dynamics, σ_a, is slightly perturbed with respect to σ. The matrices δΦ̃^t and ∇Φ^t are estimated numerically using the formulae given below and time step Δt = 0.
01. The 10-th analog stays close enough to the real trajectory at all times (upper-left panel of Fig. 2); therefore Eq. (6a) gives a satisfactory approximation of |a_t − x_t| (upper-right panel). The 100-th analog starts to be too far from the real trajectory after some time, beyond which Eq. (6a) no longer gives a satisfactory approximation of |a_t − x_t| (upper-right panel).

The different right-hand-side terms of Eq. (6a) are projected on the first axis of phase-space and displayed in the lower-left panel of Fig. 2. The "flow map" term δΦ̃^t is the same for both analogs, but the "initial condition" term ∇Φ^t·(a^k − x) is much larger for the 100-th analog, and one can see that those terms are here negatively correlated.

Further assuming that t is small, one can express Eq. (6a) in the alternative formulation:

  ∀k, a^k_t − x_t = tδf̃(x) + [I + t∇f|_x]·(a^k − x) + O(t², |a^k − x|², δ|a^k − x|),   (6b)

where I is the identity matrix. Using lighter notations, this becomes:

  a^k_t − x_t ≈ tδf̃ + [I + t∇f]·(a^k − x),

where the evaluation of δf̃ and ∇f at x is implicit. This last formulation is analogous to an Euler scheme used in finite-difference numerical methods for solving differential equations; it is therefore valid only for small times. In the lower-right panel of Fig. 2, one can see that the right-hand-side terms of Eq. (6b) only approximate the terms of Eq. (6a) for small t.

2) Relating f and Φ^t

Eq. (6b) is a first-order expansion in time of Eq. (6a). The fundamental resolvent matrix M(t, t′) gives a more complete relationship between the two representations. M(t, t′) is the solution to the time-varying linear system dM(t, t′)/dt = ∇f|_{x_t}·M(t, t′) with M(t′, t′) = I. The fundamental resolvent matrix can be approximated numerically as M(t, t′) ≈ exp(Δt∇f_t)·exp(Δt∇f_{t−Δt})⋯
exp(Δt∇f_{t′}), with numerical time step Δt and using the short notation ∇f_t := ∇f|_{x_t}. We have:

  δΦ̃^t(x) ≈ δ ∫₀ᵗ M(t, t′)·f̃(x_{t′}) dt′,   (7a)

  ∇Φ^t|_x = M(t, 0),   (7b)

where the "≈" sign indicates that Eq. (7a) is valid only at first order in δ. This first order is enough to compute the right-hand-side terms of Eq. (6a), which is also valid at first order in δ.

From Eq. (7b) one can use Taylor expansions relating ∇f and ∇Φ^t, such as:

  ∇Φ^t = I + t∇f + (t²/2)[(∇f)² + (d/dt)∇f] + O(t³),   (8)

where ∇Φ^t is implicitly evaluated at x. The short notation ∇f is used for ∇f|_{x_0}, and (d/dt)∇f is the time derivative, along the trajectory x_t, of the Jacobian of f at t = 0: (d/dt)∇f|₀ := lim_{t→0} (∇f_t − ∇f_0)/t. At first order in t, one recovers the result expressed in Eq. (6b).
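Eq. (7b) can be checked numerically. The sketch below is our own illustration (the function names are assumptions, not code from the paper): it integrates the L63 system, with its standard parameters, jointly with the variational equation dM/dt = ∇f|_{x_t}·M, using the same fourth-order Runge-Kutta scheme as the trajectory; this is equivalent, at first order in Δt, to the product-of-exponentials formula above, and M(t, 0) then approximates ∇Φ^t|_x.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # standard L63 parameters

def f(x):
    # L63 vector field.
    return np.array([SIGMA * (x[1] - x[0]),
                     x[0] * (RHO - x[2]) - x[1],
                     x[0] * x[1] - BETA * x[2]])

def jac_f(x):
    # Jacobian of the L63 vector field.
    return np.array([[-SIGMA, SIGMA, 0.0],
                     [RHO - x[2], -1.0, -x[0]],
                     [x[1], x[0], -BETA]])

def flow_and_resolvent(x0, t, dt=0.01):
    """Integrate x_t and M(t, 0) jointly: dM/dt = grad(f)|_{x_t} M, M(0, 0) = I."""
    n = x0.size
    y = np.concatenate([x0, np.eye(n).ravel()])

    def g(y):
        x, M = y[:n], y[n:].reshape(n, n)
        return np.concatenate([f(x), (jac_f(x) @ M).ravel()])

    for _ in range(int(round(t / dt))):  # fourth-order Runge-Kutta steps
        k1 = g(y); k2 = g(y + dt / 2 * k1)
        k3 = g(y + dt / 2 * k2); k4 = g(y + dt * k3)
        y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y[:n], y[n:].reshape(n, n)    # x_t and M(t, 0) ~ Jacobian of the flow
```

Because the variational stages use the Jacobian evaluated at the Runge-Kutta stages of x, the returned M(t, 0) coincides with the Jacobian of the numerical flow map, and can be checked against finite differences of the flow.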
3. Consequences for analog forecasting operators

a. Mean error of analog forecasting operators
By multiplying Eqs. (6a,b) by ω_k and summing over k, one can compare the distances from x_t to the means μ_LC, μ_LI and μ_LL of the different analog forecasting operators of Sec. 1b. Those means depend on t, although only implicitly in the notation. Letting μ = Σ_k ω_k a^k be the weighted mean of the analogs, we have the following expressions.

Locally-constant mean error:

  μ_LC − x_t = δΦ̃^t(x) + [∇Φ^t|_x]·(μ − x) + O(Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|),   (9a)

  μ_LC − x_t = tδf̃(x) + [I + t∇f|_x]·(μ − x) + O(t², Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|).   (9b)

Locally-incremental mean error:

  μ_LI − x_t = δΦ̃^t(x) + [∇Φ^t|_x − I]·(μ − x) + O(Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|),   (10a)

  μ_LI − x_t = tδf̃(x) + [t∇f|_x]·(μ − x) + O(t², Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|).   (10b)

Using lighter notations with implicit evaluation at x, this gives:

  Locally-constant: μ_LC − x_t ≈ δΦ̃^t + ∇Φ^t·(μ − x) ≈ tδf̃ + [I + t∇f]·(μ − x),

  Locally-incremental: μ_LI − x_t ≈ δΦ̃^t + [∇Φ^t − I]·(μ − x) ≈ tδf̃ + t∇f·(μ − x).

The errors of the locally-constant and locally-incremental operators are both affected by the difference between the analog and real flow maps. This source of error cannot be circumvented unless one is provided with some information about δΦ̃. The other first-order error term is linear in (μ − x), but when t →
0, this term is of order t in the locally-incremental case. Thus, for small lead times, as both t → 0 and μ → x (dense catalog), the mean of the locally-incremental operator provides a better estimate of x_t. This is why this operator is qualified by Lguensat et al. (2017) as more "physically sound" than the locally-constant: the locally-incremental takes advantage of the fact that lim_{t→0} Φ^t = I, just as any finite-difference numerical scheme does. Formulas similar to Eqs. (9)-(10) were used by Platzer et al. (2019) to predict analog forecasting errors with LC and LI operators on the famous three-variable L63 system, with δ = 0.

Note that the locally-incremental operator can produce forecasts x_t out of the convex hull of the catalog. This is related to what is called "novelty creation" in the machine-learning terminology. Such a property is interesting, but it also enables some inconsistent forecasts. Indeed, if t is not small enough, the locally-incremental operator can produce forecasts that have a large error due to the −I term in Eq. (10a). In Fig. 1, one can see that the LI has a larger variance than the LC for large times.

Eq. (9) is also valid for constructed analogs (CA) introduced in Sec. 1b, where the weights {ω^CA_k} are chosen so that |Σ_k ω^CA_k a^k − x| is as small as possible. This means that the (μ − x)-linear term of Eq. (9) is also small. As mentioned earlier, Tippett and DelSole (2013) showed that this strategy is equivalent to making a linear regression. This explains why the (μ − x)-linear term is absent from Eqs. (11a,b).

Locally-linear mean error:

  μ_LL − x_t = δΦ̃^t(x) + O(Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|),   (11a)

  μ_LL − x_t = tδf̃(x) + O(t², Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|).   (11b)

Another way to understand why the (μ − x)-linear term should disappear when using the LL is to see that the LL is estimating the local Jacobian of the flow map.
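This property can be checked numerically: regressing successors on analogs sampled in a shrinking neighborhood of x recovers the Jacobian of the map, with an error linear in the neighborhood size. The following sketch uses a toy quadratic map as a stand-in for Φ^t; the map `phi` and all its parameters are our own assumptions, not taken from the paper.

```python
import numpy as np

def ll_slope(analogs, successors, weights):
    """Weighted least-squares slope S of the locally-linear operator."""
    w = weights / weights.sum()
    X = analogs - w @ analogs          # analogs centered on their weighted mean
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ successors).T

# A toy smooth map with a known Jacobian, standing in for the flow map.
def phi(x):
    return np.array([x[0] + 0.1 * x[0] * x[1], x[1] + 0.1 * x[0] ** 2])

def jac_phi(x):
    return np.array([[1 + 0.1 * x[1], 0.1 * x[0]],
                     [0.2 * x[0], 1.0]])

rng = np.random.default_rng(1)
x0 = np.array([1.0, 2.0])
errs = []
for r in (1e-1, 1e-3):                 # shrinking analog neighborhoods
    analogs = x0 + r * rng.normal(size=(50, 2))
    successors = np.array([phi(a) for a in analogs])
    S = ll_slope(analogs, successors, np.ones(50))
    errs.append(np.linalg.norm(S - jac_phi(x0)))
# The error shrinks roughly linearly with the neighborhood size r.
```

The shrinking error with r is the behavior predicted by the O(|μ − x|, δ) estimate derived below (here with δ = 0, since the same map generates analogs and successors).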
Indeed, the linear regression between the analogs and the successors gives an estimation of ∇Φ^t|_x, with an estimation error that is at least of order O(|μ − x|, δ). Sec. 3b gives a detailed argument to support this claim and investigates limitations. The estimation error between the linear regression matrix and the Jacobian thus adds higher-order error terms to the right-hand side of Eqs. (11a,b), but these are already included in the O(Σ_k ω_k |a^k − x|², δ Σ_k ω_k |a^k − x|).

We now make the explicit link between the three operators. Recall the notations of Sec. 1b: the locally-linear operator finds slope S and intercept c such that for all k, a^k_t = S(a^k − μ) + c + ξ^k, using weighted least-squares estimates. This gives c = Σ_k ω_k a^k_t = μ_LC, thus we have μ_LL = μ_LC + S(x − μ) and the following relations hold:

  μ_LC = μ_LL|_{S=0},   (12a)

  μ_LI = μ_LL|_{S=I},   (12b)

such that the locally-constant and locally-incremental operators are particular cases of the locally-linear operator. We also have lim_{t→0} S = I, because for all k, lim_{t→0} a^k_t = a^k. Thus, mean forecasts of the locally-linear and locally-incremental operators are equivalent as t approaches 0: μ_LL ∼_{t→0} μ_LI.

This analysis shows that, in terms of mean forecast error, the locally-linear operator is more precise than the locally-incremental, and the latter is more precise than the locally-constant. These findings are in agreement with the numerical experiments of Lguensat et al. (2017).

We now investigate the link between the local Jacobian of the flow ∇Φ^t|_x and the linear regression matrix S from the locally-linear operator.
b. Ability of analogs to estimate local Jacobians

If analogs can estimate the Jacobian of the real system, it means that analog forecasting provides a local approximation of the real dynamics, proving the relevance of analogs for short-range forecasts. Furthermore, having an estimation of the local Jacobian can be useful in some applications such as the Extended Kalman Filter, where the Jacobian allows one to estimate the evolution of the forecast covariance.

1) Derivation of the first-order error in Jacobian estimation

It is possible to find an exact expression of the first-order error term in the estimation of the local Jacobian. Let us start with the case of perfect agreement between the real and analog flow maps: Φ_a = Φ, or δ =
0. Then, assume that in the neighborhood of x where the analogs lie, the flow Φ^t(·) can be approximated by a quadratic function in phase-space. We then have:

  ∀k, a^k_t = ∇Φ^t·(a^k − μ) + (a^k − μ)·∇²Φ^t·(a^k − μ)^T + Cst,   (13)

where "Cst" is a constant (independent of k), and the Jacobian ∇Φ^t and Hessian ∇²Φ^t are implicitly evaluated at x (see the appendix for the notation of products of vectors and the Hessian). In the next equations, the t-superscript is dropped to lighten notations. Let X be the matrix of the analogs minus their mean, so that the k-th row of X is a^k − μ. Similarly, let Y be the matrix of the successors, with the k-th row of Y being a^k_t. Eq. (13) thus translates into Y = X∇Φ^T + X∇²Φ X^T, omitting the constant.

Now let Ω = diag(ω_1, …, ω_K) be the (K × K) diagonal matrix of the weights given to each analog in the regression. Then S is the weighted least-squares solution of the linear regression, S = (X^T Ω X)^{−1} X^T Ω Y. With a bit of rewriting, this finally gives:

  S − ∇Φ = (X^T Ω X)^{−1} X^T Ω [X ∇²Φ (X + (μ − x)^T ⊗ J_{K,1})^T],   (14)

where ⊗ is the Kronecker matrix product and J_{K,1} is the column vector with K elements all equal to 1.

Eq. (14) tells us that S is close to the Jacobian at x up to a factor that is linear in the distance between the mean of the analogs μ and the analogs a^k, and another factor linear in the distance between μ and x. These linear error terms depend on the second-order phase-space derivatives of Φ at the point x (the Hessian of Φ).

Conducting the same derivation but relaxing the hypothesis of δ =
0, one would find the same result with an added linear error term involving the Jacobian of Φ̃. This analysis allows us to say that S = ∇Φ^t + O(|μ − x|, δ), if the distance between the analogs and their mean is of the same order as the distance between their mean and x.

However, the claim that the linear regression matrix S is able to approximate the Jacobian ∇Φ^t must be tempered by several facts. To illustrate these, the regular locally-linear analog forecasting operator will now be compared with two other strategies aimed at solving dimensionality issues.

2) Strategies for linear regression in high dimension

Dimensionality can make analog forecasting difficult, especially when using the locally-linear analog forecasting operator. Here are recalled two strategies that can be used to circumvent this issue.

The first approach uses empirical orthogonal functions (EOFs, also called principal component analysis) at every forecast step. Dimension is reduced by keeping only the first n_eof EOFs of the set of analogs (a^k)_{k∈[1,K]}, or keeping only the n_eof first principal components of the matrix X^T Ω X.

Reducing dimension using EOFs
• Find analogs (a^k)_{k∈[1,K]} of the initial state x
• Compute the EOFs of the weighted set of analogs (a^k)_{k∈[1,K]}
• Keep the n_eof first EOFs, up to 95% of the total variance
• Project x, (a^k)_{k∈[1,K]} and (a^k_t)_{k∈[1,K]} on the n_eof first EOFs
• Perform LL analog forecasting in this projected space

The second strategy is to perform n analog forecasts, one for each coordinate of the phase-space P, and to assume that the future of a given coordinate only depends on the initial values of the neighboring coordinates and not on the whole initial vector x. In the model of Lorenz (1996) (hereafter noted L96), Eq. (A2) in the appendix motivates the choice of keeping only the initial coordinates {i−2, i−1, i, i+1, i+2} to estimate the i-th future coordinate.
Thus we keep only n_trunc = 5 coordinates, performing n linear regressions with 5 coefficients at each forecast. By combining the results of those linear regressions, one finds an n × n matrix that is sparse by construction: all elements more than two cells away from the diagonal are equal to zero. This was introduced in Lguensat et al. (2017) as "local analogs". In the present paper this strategy will rather be termed "coordinate-by-coordinate" analog forecasting.

Coordinate-by-coordinate forecast
• for i from 1 to n, forecast the i-th future coordinate x_{t,i}:
  – Condition the forecast Θ^t_{LL,i} on a few initial coordinates around x_{0,i}:
    Θ^t_{LL,i}(x) = Θ^t_{LL,i}(x_{0,i−2}, x_{0,i−1}, x_{0,i}, x_{0,i+1}, x_{0,i+2})
  – Find analogs of the truncated initial vector (x_{0,i−2}, x_{0,i−1}, x_{0,i}, x_{0,i+1}, x_{0,i+2})
  – Perform LL analog forecasting Θ^t_{LL,i}
  – Store the coefficients of the linear regression (S_{i,i−2}, S_{i,i−1}, S_{i,i}, S_{i,i+1}, S_{i,i+2})
• Aggregate the coefficients into the n × n matrix S

The next section investigates limitations to the claim that the matrix S from the LL operator is able to approximate the Jacobian ∇Φ^t, and studies the impact of dimension-reduction techniques on this Jacobian estimation.

3) Effect of the number of analogs and the phase-space dimension

First, to be able to compute S, one must have enough analogs to perform the inversion of the matrix X^T Ω X, where X is the matrix of the analogs and Ω the diagonal matrix of the weights. This cannot be done unless K, the number of analogs used for the forecast, is greater than or equal to n, the phase-space dimension. Using the EOF or coordinate-by-coordinate strategies from the previous section, one can reduce the dimension to n_eof or n_trunc, needing only to satisfy K ≥ n_eof or K ≥ n_trunc.

To illustrate the practical consequences of these issues, numerical simulations of the L96 system were performed with n =
8. The L96 is a famous chaotic dynamical system with a flexible dimension, well suited to the purpose of this study. The governing equations were solved using a fourth-order Runge-Kutta numerical scheme with an integration time step Δt = 0.
05. A catalog was built from one long trajectory using the real equations (δ = 0), and forecasts were evaluated at test points taken from another trajectory on the attractor (independent from the catalog). Setting the number of analogs to the limiting case K = n, the inversion of X^T Ω X can become hazardous for the regular LL operator. The EOF strategy projects the phase-space P onto the EOFs that maximize the variance in the set of analogs; thus the rank of the set of analogs is likely to be equal to n_eof in this reduced space. However, the EOF strategy necessarily misses some of the components of the full (n × n) Jacobian matrix ∇Φ^t, as it gives only the estimation of an (n_eof × n_eof) matrix. The coordinate-by-coordinate method also ensures that the linear regression can be performed as long as n_trunc is low enough, but it also misses some of the elements of the Jacobian matrix of the flow map. Indeed, even though the coefficients of ∇f are zero more than two cells away from the diagonal, this is not the case for ∇Φ^t. Recall that, at second order in time, ∇Φ^t = I + t∇f + (t²/2)[(∇f)² + (d/dt)∇f]. Thus, some coefficients of order t² will not be captured by the linear regression matrix S using coordinate-by-coordinate analog forecasting with n_trunc = 5.

The regression matrix S is then compared with ∇Φ^t for the three methods. The real value of ∇Φ^t is estimated with the second-order time-expansion of Eq. (8), which can be computed directly from the model equations (A2). An example is shown in Fig. 3. In this case, the regular analog forecasting misses the Jacobian, with an RMSE of 2.659, because the rank of the set of analogs is too low and X^T Ω X is thus not invertible. Analog forecasting combined with EOFs gives a better result as it circumvents this inversion problem, with a total RMSE between S and ∇Φ^t of 0.193. The coordinate-by-coordinate analog forecasting gives the best solution in this case, with an RMSE of 0.095. Note that many coefficients of the matrix S are set to zero by construction when using the coordinate-by-coordinate method.

Then, Fig.
4 shows empirical probability density functions of the RMSE of S − ∇Φ^t for each of the three methods. The low number of analogs implies large fluctuations of the regular LL analog forecasts, as the rank of the set of analogs used can be below or close to the phase-space dimension, making the inversion of X^T Ω X hazardous. This variability is noticeably reduced when the inversion is performed in the n_eof-dimensional reduced space. The EOF strategy has the advantage of preventing large errors and the drawback of hindering very precise estimations of the Jacobian. Indeed, when using EOFs the linear regression matrix has a rank necessarily lower than n, and some information is missed. Finally, coordinate-by-coordinate analog forecasting is able to perform better estimations of the Jacobian on average, with a variability between that of the regular analogs and that of the analogs combined with EOFs. However, the probability of very precise estimations of the Jacobian (log(RMSE) < −
3) is lower with coordinate-by-coordinateanalog forecasting than with regular analog forecasting. This can be witnessed as the area underthe graph for log ( RMSE ) < − . t ) non-zero coefficients two cells away fromthe diagonal that the coordinate-by-coordinate analog forecasting cannot estimate.In some situations however, the number of analogs K is much larger than the phase-spacedimension n , and the linear regression matrix S is still unable to approximate the Jacobian ∇ Φ t .20) Effect of the analogs rank and the attractor’s dimensionAs we have seen, to calculate S and perform locally-linear analog forecasting, one must invertthe matrix X T Ω X . This means that the set of analogs must be of rank n . Yet, in some situations,the dimension of the attractor is lower than the full phase-space dimension n . Thus if the catalogis made of one trajectory inside the attractor, the set of analogs might not be of rank n , howeverlarge K might be. In some cases, the dimension of the attractor is between n − n , such thatthe matrix X T Ω X is still invertible but very sensitive to fluctuations in the rank of the set analogs.Similar remarks can be made for the successors. If Y (the set of successors) is not of rank n , thenthe matrix S , if it can be computed, is still not of rank n . Thus S will not be able to estimate theJacobian ∇ Φ t if the latter is of rank n . Note that the rank of the successors (the rank of the matrix Y ) is highly dependent on the rank of the analogs and the Jacobian matrix as we have Y ≈ X ∇ Φ T at first order in X , such that if the analogs are not of rank n the successors are likely not to be ofrank n either.Thus, depending on the dimension of the attractor, the locally-linear analog forecasting operatormight not be able to estimate the local Jacobian of the real flow map, but only a projection ofthis Jacobian matrix onto the local sets of analogs and successors. 
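To make the role of the Jacobian concrete, here is a minimal Python sketch of the weighted locally-linear regression in the well-conditioned regime (K > n, analogs spread in all directions), in which the regression matrix S recovers a finite-difference estimate of ∇Φ^t. All sizes, the lead time, the noise level, and the kernel are illustrative choices, not the paper's experimental values; `l96` and `flow` are our own helper names.

```python
import numpy as np

def l96(x, theta=8.0):
    # Lorenz-96 tendencies: dx_i/dt = -(x_{i-2} - x_{i+1}) x_{i-1} - x_i + theta
    return -(np.roll(x, 2) - np.roll(x, -1)) * np.roll(x, 1) - x + theta

def flow(x, t, dt=0.01):
    # Integrate the flow map Phi^t with a fourth-order Runge-Kutta scheme
    for _ in range(int(round(t / dt))):
        k1 = l96(x); k2 = l96(x + dt / 2 * k1)
        k3 = l96(x + dt / 2 * k2); k4 = l96(x + dt * k3)
        x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

rng = np.random.default_rng(0)
n, K, t = 8, 60, 0.05                            # illustrative sizes, not the paper's
x = flow(3.0 + rng.standard_normal(n), 20.0)     # spin-up onto the attractor

A = x + 1e-3 * rng.standard_normal((K, n))       # synthetic analogs close to x
Y = np.array([flow(a, t) for a in A])            # their successors at lead time t

d = np.linalg.norm(A - x, axis=1)
w = np.exp(-(d / np.median(d)) ** 2)             # Gaussian kernel weights

# Weighted linear regression from centered analogs to centered successors
Xc = A - np.average(A, axis=0, weights=w)
Yc = Y - np.average(Y, axis=0, weights=w)
W = np.diag(w)
S = np.linalg.solve(Xc.T @ W @ Xc, Xc.T @ W @ Yc).T

# Reference Jacobian of the flow map by central finite differences
eps, J = 1e-5, np.zeros((n, n))
for j in range(n):
    e = np.zeros(n); e[j] = eps
    J[:, j] = (flow(x + e, t) - flow(x - e, t)) / (2 * eps)

print(np.abs(S - J).max())   # S captures the local Jacobian of Phi^t
```

In the rank-deficient regime discussed above, X^TΩX becomes ill-conditioned and this recovery fails, which is what the L63 example of the next paragraphs illustrates.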
This is a typical case where data-driven methods are not able to reveal the full physics of an observed system unless provided with other sources of information or hypotheses, such as a parametric law.

The three-variable L63 system is used to illustrate this fact. This system is known to have a dimension of ≈ 2.06, with local variations around this value (Caby et al. 2019). It is therefore a perfect case study in which the rank of the set of analogs is close to n − 1. Thus, the linear regression matrix S between the analogs and the successors is not able to approximate the full (3 × 3) Jacobian matrix ∇Φ^t. Using the restriction to the vector subspace V_a spanned by the two first EOFs of the analogs (e_1^a, e_2^a), one can better understand the connection between the two matrices ∇Φ^t and S. In the following, the subscript "r" indicates restriction to (e_1^a, e_2^a). The choice of using only the two first EOFs is motivated by the quasi-planar nature of the Lorenz attractor. In the next formulas, the t-superscript is dropped for the sake of readability:

∇Φ_r = ∇Φ [ e_1^a  e_2^a ] , (15a)

S_r = S [ e_1^a  e_2^a ] . (15b)

The condition number of the set of analogs gives a direct way to measure whether the matrix X^TΩX can be inverted, and whether S can approximate a full-rank Jacobian matrix. This number is the ratio of the highest to the lowest singular value. It has the advantage of being a continuous function of the set of analogs, while the rank is a discontinuous function that takes only integer values. If the condition number is large, the set of analogs is almost contained in a plane, and the analogs might not be able to approximate the full Jacobian ∇Φ. Conversely, if the condition number is close to 1, then the rank of the set of analogs is clearly 3, and the analogs will be able to approximate the full Jacobian matrix. Note that the condition number of the set of analogs is not directly linked to the dimension of the attractor.
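The restriction of Eq. (15) and the condition-number diagnostic can be mimicked on a toy linear problem with a quasi-planar cloud of analogs; the matrix `M` stands in for the true Jacobian, and all dimensions and noise amplitudes below are arbitrary illustrative choices, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))            # stand-in for the true (3 x 3) Jacobian

# Quasi-planar analogs, mimicking the near-two-dimensional L63 attractor,
# with successors given by the linear map plus a small noise
X = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 3))
X = X + 1e-8 * rng.standard_normal(X.shape)        # tiny off-plane spread
Y = X @ M.T + 1e-6 * rng.standard_normal(X.shape)

# Condition number of the (centered) set of analogs: ratio of the largest
# to the smallest singular value; very large here, since the cloud is flat
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
print("condition number:", s[0] / s[-1])

S = np.linalg.lstsq(X, Y, rcond=None)[0].T         # full-space regression

# EOFs of the analogs are the right singular vectors of Xc; restriction to
# the two leading EOFs as in Eq. (15): S_r = S [e_1^a e_2^a]
E = np.linalg.svd(Xc, full_matrices=False)[2][:2].T
M_r, S_r = M @ E, S @ E
print(np.abs(S - M).max(), np.abs(S_r - M_r).max())
```

Even though S itself is a poor estimate of M (its off-plane action is unconstrained by the data), the restricted matrices S_r and M_r agree closely, in the spirit of Eq. (15).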
One simply uses the fact that the attractor is locally close to a plane, without referring further to the complex notion of attractor dimension.

This can be investigated through numerical simulations of the L63 system, using a fourth-order Runge-Kutta numerical scheme and a time step of ∆t = 0.01 to solve the governing equations. A catalog was generated from a trajectory of 10 non-dimensional times, with the original equations (δ = 0). Locally-linear analog forecasting was then performed at lead time t = 0.01 with K = 40 analogs, on 10 points randomly selected on the attractor. The linear regression matrix S was then compared with ∇Φ^t, with and without restriction to (e_1^a, e_2^a). To estimate numerically the real value of ∇Φ^t, a third-order time-expansion similar to Eq. (8) was computed directly from the model equations.

Fig. 5 shows that the estimation of the Jacobian by the analogs improves as the catalog size (and therefore the catalog density) grows. This validates that the analogs are able to approximate precisely the Jacobian matrix of the flow map. The figure also shows that, once restricted to the two-dimensional subspace spanned by the analogs, this estimation is much more precise and less fluctuating.

Fig. 6 displays the RMSE of the full (3 ×
3) matrix S − ∇Φ^t as a function of the condition number of the set of analogs. Large RMSE values are highly correlated with high condition numbers, while low RMSE values can only be achieved when the condition number of the analogs is close to 1.

All these elements show that the estimation of the Jacobian matrix from analogs is highly dependent on the number of analogs K, the condition number of the set of analogs, the attractor's dimension, and the phase-space dimension n. However, the fact that the matrix from the LL operator does not approximate the full Jacobian ∇Φ^t does not mean that the analog forecast will poorly approximate the future state x_t. For the LL forecast to be efficient, one only needs a good approximation of the restricted Jacobian, and that the inversion associated with the linear regression is not ill-conditioned.

Evolution of mean and covariance under Gaussian assumption

In this section, it is assumed that the weighted multinomial distribution of the analogs Σ_k ω_k δ_{a^k} and of their successors Σ_k ω_k δ_{a^k_t} can be approximated by Gaussian distributions:

Σ_k ω_k δ_{a^k} ≈ N(µ, P), (16a)

Σ_k ω_k δ_{a^k_t} ≈ N(µ_t, P_t), (16b)

where µ_t is the mean of the locally-constant analog forecast. Combining this hypothesis with Eq. (5b) and approximating Φ^t_a(·) by its tangent around µ, we have the classic relationships:

µ_t = Φ^t_a(µ) + O(Tr P), (17a)

P_t = ∇Φ^t_a|_µ P ∇Φ^t_a|_µ^T + O(Tr P), (17b)

where Tr is the trace operator. Similar relations can be found using the differential representation of Eq. (5b):

dµ_t/dt = f_a(µ_t) + O(Tr P_t), µ_{t=0} = µ, (18a)

dP_t/dt = ∇f_a|_{µ_t} P_t + P_t ∇f_a|_{µ_t}^T + O(Tr P_t), P_{t=0} = P. (18b)

Now, let us make the simplifying hypothesis that |x − µ| ≲ √(Tr P), which means that the state x is not farther from the analogs' mean µ than the standard deviation of the analogs.
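Relations (17)–(18) can be checked numerically on the L63 system by propagating the mean with the nonlinear tendencies and the covariance with the tangent-linear equation (18b), then comparing with a Monte-Carlo ensemble. The explicit Euler scheme, the initial state, the spread, and the lead time below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def l63(x):
    # Lorenz-63 tendencies f(x)
    return np.array([SIGMA * (x[1] - x[0]),
                     x[0] * (RHO - x[2]) - x[1],
                     x[0] * x[1] - BETA * x[2]])

def jac(x):
    # Jacobian of the tendencies, nabla f at x
    return np.array([[-SIGMA, SIGMA, 0.0],
                     [RHO - x[2], -1.0, -x[0]],
                     [x[1], x[0], -BETA]])

dt, n_steps = 1e-4, 500                  # lead time t = 0.05 (illustrative)
mu = np.array([1.0, 2.0, 20.0])          # arbitrary initial mean
P = 1e-6 * np.eye(3)                     # small initial spread

rng = np.random.default_rng(4)
ens = mu + 1e-3 * rng.standard_normal((5000, 3))   # matching Gaussian cloud

for _ in range(n_steps):
    J = jac(mu)
    P = P + dt * (J @ P + P @ J.T)       # Eq. (18b), first order in Tr P
    mu = mu + dt * l63(mu)               # Eq. (18a)
    ens = ens + dt * np.column_stack([
        SIGMA * (ens[:, 1] - ens[:, 0]),
        ens[:, 0] * (RHO - ens[:, 2]) - ens[:, 1],
        ens[:, 0] * ens[:, 1] - BETA * ens[:, 2]])

P_ens = np.cov(ens.T)
print(np.abs(P - P_ens).max() / np.abs(P).max())   # relative discrepancy
```

With a small initial spread, the tangent-linear covariance tracks the ensemble covariance to within sampling error, illustrating that the local Jacobian drives the growth of P_t.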
Then, one evaluates Φ^t_a, f_a and their derivatives at x and x_t instead of µ and µ_t, giving additional terms:

µ_t = Φ^t(x) + δΦ̃^t(x) + ∇Φ^t|_x (µ − x) + O(Tr P, δ|µ − x|), (19a)

P_t = ∇Φ^t|_x P ∇Φ^t|_x^T + δ ( ∇Φ̃^t|_x P ∇Φ^t|_x^T + ∇Φ^t|_x P ∇Φ̃^t|_x^T ) + ( [(µ − x)∇²Φ^t|_x] P ∇Φ^t|_x^T + ∇Φ^t|_x P [(µ − x)∇²Φ^t|_x]^T ) + O(Tr P, δ|µ − x|), (19b)

where terms of order |µ − x|² are included in O(Tr P) and ∇²Φ^t|_x is the Hessian of Φ^t at x. In the time-differential representation we have:

d(µ_t − x_t)/dt = ∇f|_{x_t} (µ_t − x_t) + δf̃(x_t) + O(Tr P_t, δ|µ_t − x_t|), (20a)

dP_t/dt = ∇f|_{x_t} P_t + P_t ∇f|_{x_t}^T + δ ( ∇f̃|_{x_t} P_t + P_t ∇f̃|_{x_t}^T ) + ( [(µ_t − x_t)∇²f|_{x_t}] P_t + P_t [(µ_t − x_t)∇²f|_{x_t}]^T ) + O(Tr P_t, δ|µ_t − x_t|). (20b)

Eq. (20a) is equivalent to Eq. (19a), which is also equivalent to Eq. (6a). Eq. (20b) can be Taylor-expanded around t = 0 to recover Eq. (19b), using the Hessian of Φ^t at x. The covariance of the analog forecast thus depends on the covariance of the analogs at t = 0, P, and on the system's local Jacobian ∇Φ^t|_x. This is another way to see that the analogs are highly linked to the local dynamics of the system. If the local dynamics induce a large spread in the future possible trajectories, this is captured in the successors' covariance P_t. On the contrary, if the local dynamics are flat (∇Φ^t|_x ≃ I or ∇f|_x ≃ 0), the successors' covariance is equal to the analogs' covariance.

Discussion

This paper contributes to the interpretation of analog forecasting methods.
Following a similar objective but using a different methodology, Zhao and Giannakis (2016) set a mathematical framework for the convergence of analog forecasting operators to the flow map of the real system, with particular emphasis on the kernels used for the weights ω_k.

There are many natural extensions to the work presented here. The first one is non-deterministic dynamics, which can arise, for instance, when the forecast is not performed in phase-space but in a lower-dimensional space. One might be provided only with observations of a few variables of the whole system, and try to forecast those same variables. The use of time-embeddings from Takens (1981) combined with analog forecasting is promising (Alexander et al. 2017). Also, Chau et al. (2020) build a catalog of state-space trajectories from a catalog of partial and noisy observations, using analog forecasting and data assimilation.

The second natural extension is to account for observation error in the catalog of analogs. As the flow map is assumed to be quasi-linear in phase-space in the neighborhood of the analogs, one could conduct the same analysis including centered additive noise for each analog and successor of the catalog, and find results similar to the ones outlined here.

One must bear in mind that the use of analog forecasting in applications raises issues such as the choice of the space in which forecasting is performed, the choice of the right metric to compare analogs with the initial state, and the combination of analogs with other techniques. In data assimilation, one might want to convert the multinomial distributions of Sec. b to Gaussian distributions to use Kalman filtering. Ridge and Lasso regularizations could be used to ease the linear regression instead of the techniques mentioned in Sec. b. These operational choices must be made accounting for memory use and computational time (see Lguensat et al. (2017) for differences between regular and coordinate-by-coordinate analog forecasting).

Conclusion
Analog forecasting makes it possible to avoid solving complex nonlinear equations, by using existing solutions starting from similar initial conditions. The accuracy of analog forecasting depends on local dynamical properties of the system of interest. In particular, the quality of analog forecasts is related to the Jacobian matrix of the real system's flow map, and the linear regression from analogs to successors is shown to provide an approximation of this matrix. This makes it possible to examine the mean accuracy of known analog forecasting operators, and to compare different methods that evaluate this Jacobian matrix, using numerical experiments on well-known dynamical systems. The locally-linear operator is found to give the best approximation of the future state, provided that the linear regression is not ill-posed. The locally-incremental operator is shown to give more precise forecasts at small lead times. The Jacobian matrix of the flow map is found to drive the growth of the successors' covariance matrix. Altogether, this brings theoretical evidence that analogs can be used to emulate a real system, and gives quantitative expressions for the precision of analog forecasting techniques.
Acknowledgments.
The work was financially supported by ERC grant No. 338965-A2C2 and ANR No. 10-IEED-0006-26 (CARAVELE project).

APPENDIX A
Lorenz systems
The three-variable "L63" Lorenz (1963) system of equations is:

dx_1/dt = σ(x_2 − x_1), dx_2/dt = x_1(ρ − x_3) − x_2, dx_3/dt = x_1 x_2 − β x_3, (A1)

with usual parameters σ = 10, β = 8/3, ρ = 28. The n-variable "L96" Lorenz (1996) system of equations is:

∀i ∈ [1, n], dx_i/dt = −(x_{i−2} − x_{i+1}) x_{i−1} − x_i + θ, (A2)

where θ is the forcing parameter. We set θ = 8, and use periodic boundary conditions x_{i+n} = x_i.

APPENDIX B

Product of Hessian with vectors
Let g be a vector-valued, phase-space-dependent function g: R^n → R^n, such as Φ^t or f. The Hessian of g at x is noted ∇²g|_x. It is of dimension n × n × n and its (i, j, k)-th coefficient [∇²g|_x]_{i,j,k} equals ∂²g_k/∂x_i∂x_j (x). The product of a Hessian ∇²g|_x with an n-dimensional vector y is a matrix, and its (i, k)-th coefficient [y∇²g|_x]_{i,k} equals Σ_j y_j ∂²g_k/∂x_i∂x_j. The double product of a Hessian with two n-dimensional vectors y and z is a vector, and its k-th coefficient [y(∇²g|_x)z^T]_k equals Σ_{i,j} y_j z_i ∂²g_k/∂x_i∂x_j. The double product of a Hessian ∇²g|_x with two matrices X and Y of same shape K × n is a matrix of shape K × n, and its (k, j)-th coefficient is Σ_{l,m} X_{k,l} Y_{k,m} ∂²g_j/∂x_l∂x_m.

References

Alexander, R., Z. Zhao, E. Székely, and D. Giannakis, 2017: Kernel analog forecasting of tropical intraseasonal oscillations. Journal of the Atmospheric Sciences, 74 (4), 1321–1342.

Ayet, A., and P. Tandeo, 2018: Nowcasting solar irradiance using an analog method and geostationary satellite images. Solar Energy, 301–315, doi:10.1016/j.solener.2018.02.068.

Caby, T., D. Faranda, G. Mantica, S. Vaienti, and P. Yiou, 2019: Generalized dimensions, large deviations and the distribution of rare events. Physica D: Nonlinear Phenomena, 132143.

Chau, T. T. T., P. Ailliot, and V. Monbet, 2020: An algorithm for non-parametric estimation in state-space models. arXiv preprint 2006.09525.

Hamilton, F., T. Berry, and T. Sauer, 2016: Ensemble Kalman filtering without a model. Physical Review X, 6, 011021, doi:10.1103/PhysRevX.6.011021.

Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society.

Lguensat, R., P. Tandeo, P. Ailliot, M. Pulido, and R. Fablet, 2017: The Analog Data Assimilation. Monthly Weather Review, 145 (10), 4093–4107, doi:10.1175/MWR-D-16-0441.1.

Lorenz, E. N., 1963: Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20 (2), 130–141.

Lorenz, E. N., 1969: Atmospheric predictability as revealed by naturally occurring analogues. Journal of the Atmospheric Sciences, 26 (4), 636–646.

Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Vol. 1.

Milnor, J., 1985: On the concept of attractor. The Theory of Chaotic Attractors, Springer, 243–264.

Platzer, P., P. Yiou, P. Tandeo, P. Naveau, and J.-F. Filipot, 2019: Predicting analog forecasting errors using dynamical systems. CI 2019: 9th International Workshop on Climate Informatics.

Poincaré, H., 1890: Sur le problème des trois corps et les équations de la dynamique. Acta Mathematica, 13 (1), A3–A270.

Saha, S., and Coauthors, 2010: The NCEP Climate Forecast System Reanalysis. Bulletin of the American Meteorological Society, 91 (8), 1015–1058.

Takens, F., 1981: Detecting strange attractors in turbulence. Dynamical Systems and Turbulence, Warwick 1980, Springer, 366–381, doi:10.1007/BFb0091924.

Tandeo, P., and Coauthors, 2015: Combining analog method and ensemble data assimilation: application to the Lorenz-63 chaotic system. Machine Learning and Data Mining Approaches to Climate Science, Springer, 3–12.

Tippett, M. K., and T. DelSole, 2013: Constructed Analogs and Linear Regression. Monthly Weather Review, 2519–2525, doi:10.1175/MWR-D-12-00223.1.

Van den Dool, H., 2007: Empirical Methods in Short-Term Climate Prediction. Oxford University Press.

Van den Dool, H. M., 1994: Searching for analogues, how long must we wait? Tellus A: Dynamic Meteorology and Oceanography, 46 (3), 314–324, doi:10.3402/tellusa.v46i3.15481.

Yiou, P., 2014: AnaWEGE: a weather generator based on analogues of atmospheric circulation. Geoscientific Model Development, 531–543, doi:10.5194/gmd-7-531-2014.

Zhao, Z., and D. Giannakis, 2016: Analog forecasting with dynamics-adapted kernels. Nonlinearity, 29 (9), 2888–2939, doi:10.1088/0951-7715/29/9/2888.
LIST OF FIGURES
Fig. 1.
Analog forecasting operators presented in Sec. b. The flow map Φ^t(x) has a simple polynomial form. Analogs are drawn from a normal distribution centered on x and follow the same model as the real state x. The same analogs and flow maps are used for the three operators and are represented on each panel. Weights ω_k are computed using Gaussian kernels. The real initial and future states x and x_t are displayed as full circles. In the left panel, analogs are colored, right-pointing triangles, and successors left-pointing triangles with the same colors. The size of the k-th triangle is proportional to the weight ω_k. In the middle and right panels, the elements of the forecast distribution at time t are also colored, left-pointing triangles.

Fig. 2.
Illustrating Eq. (6a,b) on the three-variable L63 system. Upper-left: a real trajectory from x to x_t and two analog trajectories, namely the 10-th best analog a to a_t and the 100-th best analog a to a_t. The catalog is shown in white. Upper-right: comparison of the exact value of the norm of a_t − x_t (full lines) and the sum of the two terms on the right-hand side of Eq. (6a) (dashed lines). Lower-left: contributions of the first term (black squares) and the second term (brown circles and blue triangles) of the right-hand side of Eq. (6a), projected on the first coordinate of the L63 system. Lower-right: contributions of the first term (black squares) and the second term (brown circles and blue triangles) of the right-hand side of Eq. (6b), projected on the first coordinate of the L63 system.

Fig. 3.
Flow map Jacobian matrix estimation with the model of Lorenz (1996). Forecast lead time is t = 0.05 Lorenz time, catalog length is 10 Lorenz times, phase-space dimension is n, and K analogs are used with Gaussian kernel weights ω_k whose shape parameter λ is set to the median of analog-to-state distances |a^k − x|. Upper-left: Jacobian matrix ∇Φ^t|_x. Upper-middle: linear regression matrix S using regular analogs. Upper-right: difference S − ∇Φ^t|_x with regular analogs, with the RMSE value given below the plot. Middle panels: same, but the linear regression is performed in the lower-dimensional subspace spanned by the first EOFs of the set of analogs. Lower panels: same, but with coordinate-by-coordinate analog forecasting.

Fig. 4.
Empirical probability density function of RMSE in flow map Jacobian matrix estimation, depending on the method used. We use the system of Lorenz (1996), with phase-space dimension n and K analogs.

Fig. 5.
RMSE in estimating with analogs the L63 Jacobian matrix, as a function of catalog size. In brown circles, the median RMSE (with 10% and 90% quantiles) of the total (3 × 3) Jacobian matrix. In violet squares, the median RMSE (with 10% and 90% quantiles) of the (2 × 2) Jacobian matrix after projection on the two first EOFs of the successors and restriction to the two first EOFs of the analogs. The projection-restriction implies a much lower RMSE and a much lower variability. Both estimation errors are decreasing functions of the catalog size. The number of test points decreases with catalog size, as more test points are needed for small catalogs.

Fig. 6.
RMSE in analogs' estimation of the full (3 × 3) Jacobian matrix ∇Φ^t as a function of analog rank and median analog distance, with the L63 system. The rank of each set of analogs is measured by the ratio between the lowest and the highest singular value of the set of analogs. Most of the variability of the RMSE is explained by the rank of the analogs. Some of the remaining variability can be explained by the median distance from the analogs a^k to x, which gives a measure of the local catalog density. The catalog size is 10 non-dimensional times, δ = 0, and we use K = 40 analogs. Tests are done at 10 points randomly selected on the attractor.