[PDF] Diffusion maps embedding and transition matrix analysis of the large-scale flow structure in turbulent Rayleigh--Bénard convection

Abstract

By utilizing diffusion maps embedding and transition matrix analysis we investigate sparse temperature measurement time-series data from Rayleigh--Bénard convection experiments in a cylindrical container of aspect ratio Γ=D/L=0.5 between its diameter ( D ) and height ( L ). We consider the two cases of a cylinder at rest and rotating around its cylinder axis. We find that the relative amplitude of the large-scale circulation (LSC) and its orientation inside the container at different points in time are associated to prominent geometric features in the embedding space spanned by the two dominant diffusion-maps eigenvectors. From this two-dimensional embedding we can measure azimuthal drift and diffusion rates, as well as coherence times of the LSC. In addition, we can distinguish from the data clearly the single roll state (SRS), when a single roll extends through the whole cell, from the double roll state (DRS), when two counter-rotating rolls are on top of each other. Based on this embedding we also build a transition matrix (a discrete transfer operator), whose eigenvectors and eigenvalues reveal typical time scales for the stability of the SRS and DRS as well as for the azimuthal drift velocity of the flow structures inside the cylinder. Thus, the combination of nonlinear dimension reduction and dynamical systems tools enables to gain insight into turbulent flows without relying on model assumptions.

Full PDF

DDiﬀusion maps embedding and transitionmatrix analysis of the large-scale ﬂow structure in turbulent Rayleigh–B´enard convection

P´eter Koltai and Stephan Weiss Laboratory for Fluids, Pattern Formation and Biocomplexity (LFPB), Max Planck Institute forDynamics and Self-Organization, Am Fassberg 17, 37077 Goettingen, Germany

Abstract

By utilizing diﬀusion maps embedding and transition matrix analysis we investigate sparsetemperature measurement time-series data from Rayleigh–B´enard convection experimentsin a cylindrical container of aspect ratio Γ =

D/L = 0 . D ) and height( L ). We consider the two cases of a cylinder at rest and rotating around its cylinder axis. Weﬁnd that the relative amplitude of the large-scale circulation (LSC) and its orientation insidethe container at diﬀerent points in time are associated to prominent geometric features inthe embedding space spanned by the two dominant diﬀusion-maps eigenvectors. From thistwo-dimensional embedding we can measure azimuthal drift and diﬀusion rates, as well ascoherence times of the LSC. In addition, we can distinguish from the data clearly the singleroll state (SRS), when a single roll extends through the whole cell, from the double roll state(DRS), when two counter-rotating rolls are on top of each other. Based on this embeddingwe also build a transition matrix (a discrete transfer operator), whose eigenvectors andeigenvalues reveal typical time scales for the stability of the SRS and DRS as well as for theazimuthal drift velocity of the ﬂow structures inside the cylinder. Thus, the combinationof nonlinear dimension reduction and dynamical systems tools enables to gain insight intoturbulent ﬂows without relying on model assumptions.

1. Introduction

Due to technological advancements in past years the rate at which data can be acquired, storedand processed has increased tremendously. In addition, powerful computers allow to simulatenatural processes in greater detail, with larger temporal and spatial resolution than ever before.While acquiring data from measurements and simulation is essential for scientiﬁc progress, thegoal is always to develop simple eﬀective models that map high-dimensional natural processesonto lower dimensional models that, optimally, allow for predicting the future with accuracy.1 a r X i v : . [ phy s i c s . f l u - dyn ] D ec deally, such a reduction results in functional relationships between just a few control andresponse parameters. Deriving equations that describe the system, however, can only be doneif one has a clear understanding of the underlying fundamental mechanisms. While such anunderstanding often does not exist, even if the fundamental mechanisms are known, they areoften not suﬃcient to make accurate predictions in complex nonlinear systems. For example,while it is well known that the ﬂow of simple ﬂuids is governed by the Navier–Stokes equations[Dav09], their possible solution is computationally demanding and very sensitive to boundaryand initial conditions, such that simple accurate predictions are impossible.Over the years many tools have been developed for the reduction of data without takingadditional knowledge about the source and type of these data into account. Early pioneeringapproaches are Principal Component Analysis (PCA) [Pea01, Jol86, Shl14], where multidi-mensional data are mapped on a lower dimensional space via a linear mapping by keepingas much variation as possible, or Multidimensional Scaling (MDS) [Kru64, TdSL00], where alinear embedding is used such that pair-wise distances between points are maintained as wellas possible. A more novel method, the diﬀusion maps approach , ﬁrst suggested by Lafon andCoifman [Laf04, CL06a], uses local information from the data [RS00, BN03, DG03, ZZ04], incontrast with the above methods that use global distances. It is a non-linear technique thatparametrises data based on their underlying connectivity, i.e., their proximity in the spacespanned by the observable variables. This approach has been successfully used for instance forimage analysis [BWWA13] and image processing [FFL10].The combination of dimension reduction tools with dynamical time-series analysis can lead toa better understanding of complex and high-dimensional systems: (i) PCA is applied in ProperOrthogonal Decomposition (POD) [SMH05, Row05]; (ii) time-lagged correlation analysis resultsin Dynamic Mode Decomposition [SS08, KNK + +

09, WKR15, KKS16, AM17], (iii) cluster analysis [KNC +

14] aggregatessystem states with respect to similarity, and (iv) optimal transport is used to ﬁnd a non-lineartransformation that decouples the dynamics of coordinates [AS19]. Data-based approximationof the Koopman operator by diﬀusion maps is obtained in [BGH15, GKKS18], where the latterreference analyses a convection problem similar to ours below, in a diﬀerent container geometry.Classical embedding results from diﬀerential geometry and dynamical systems with subdivisiontechniques are used in [DvMZ16, ZDG18, GKD19] to approximate ﬁnite-dimensional attractorsand invariant manifolds of inﬁnite-dimensional systems.In this work we propose to use the diﬀusion maps embedding and transition matrix anal-ysis [Hsu87, DJ99] to analyse the structure and dynamics of a turbulent system. The mainadvantage of our approach is that it represents the geometry of data in a low-dimensional spacewhere analysis techniques such as the transition matrix method can be applied. Thus, it simul-taneously delivers a geometric-topological and a dynamical understanding of the system. Theparticular system considered here is turbulent Rayleigh–B´enard convection (RBC). In RBC ahorizontal ﬂuid layer is conﬁned by a warm plate from below and a cold plate from the top. Itis an archetypical system to study pattern formation and turbulence and has been investigatedfor more than a century [B´en00, Ray16].Under the Oberbeck–Boussinesq approximation [Obe79, Bou03, SV60] the system is fullydetermined by only two dimensionless parameters. These are the Rayleigh and Prandtl numbers, Ra = gα ∆ T L κν and Pr = ν/κ, respectively. Here, L , ∆ T , and g denote the height of the ﬂuid layer, the temperature diﬀerencebetween its bottom and its top, and the gravitational acceleration. Furthermore, α , ν , and κ are2he isobaric thermal expansion coeﬃcient, the kinematic viscosity and the thermal diﬀusivity.The Rayleigh number measures the thermal driving, while the Prandtl number is the ratio of thetwo damping mechanisms, i.e., viscous and thermal diﬀusion. For small Ra the ﬂow is laminar,and forms steady spatially periodic convection rolls with a wavelength of roughly twice theheight of the ﬂuid layer. The convection ﬂow becomes unsteady, chaotic and ﬁnally turbulentas Ra increases. In suﬃciently extended systems, the signature of the initial periodic rolls canstill be observed in the averaged ﬂow and temperature ﬁeld as turbulent superstructures [PSS18].For technical reasons, most experiments and numerical simulations have been conducted insmaller cylindrical containers, with an aspect ratio Γ = D/L between its diameter D and itsheight L close to unity. In this case, the superstructure consists of a single large-scale circulation(LSC) roll, that extends over the entire cell, so that warm ﬂuid rises close to the sidewall onone side, while cold ﬂuid sinks at the opposite side. This single roll structure is rather stablecompared to the eddy turnover time, but exhibits interesting dynamics on larger time scales,such as diﬀusive azimuthal drift, cessations, or torsional oscillations [BA07, FBA08, BA09]. Theshape and dynamics of the LSC heavily depends on Γ. With increasing Γ the single roll state(SRS) is replaced by counter-rotating rolls arranged side-by-side [PWB + /

2. Here the system is predominantly in the SRS, butthe single roll is signiﬁcantly less stable than for Γ = 1 and often undergoes a transitionto a state in which two counter-rotating rolls are on top of each other (double roll state -DRS) [XX08, WA11c]. In the DRS, the vertical heat transport is reduced by about 1 -2 %compared to the SRS. Note that in the following, we refer to the large-scale ﬂow as LSCregardless of whether the system is in SRS or DRS. Studying these large-scale coherent structuresis interesting as they pose forms of self-organisation in highly nonlinear systems very far fromequilibrium that are currently not understood at all.A particularly interesting variation of the classical RBC system is rotating RBC where theconvection cylinder rotates around its vertical axis with a constant angular speed. Studyingrotating convection is crucial for a better understanding of the large convection system in geo-and astrophysics that occur in rotating frames and are thus strongly inﬂuenced by Coriolisforces (see e.g., [LE09, ZA10, KSN +

09, WWA16]).In this paper, we investigate the dynamics of the temperature ﬁeld in turbulent RBC for therotating and non-rotating case using diﬀusion maps embedding. In particular, we re-analysetemperature measurements at various locations in the sidewall of a convection cylinder of aspectratio Γ = 0 .

5, ﬁlled with water at an average temperature of 40 ◦ C. The analysis here wasdone with measurements at Rayleigh numbers of Ra = 7 × and 9 × . Temperaturemeasurements in the sidewall help to reveal large-scale convection structures, as in general, hotﬂuid rises from the warm bottom plate along one side, while cold ﬂuid sinks down from the topplate along the opposite side (see ﬁg. 1a). Despite this large-scale circulation, the turbulentﬂuid motion results in vigorous ﬂuctuations of the temperature ﬁeld in space and time thatare detected by the sidewall thermometers. The relevant data have been published before in[WA11b] for the non-rotating case and in [WA11c] for the rotating case.In the next section, we will brieﬂy describe the experimental setup used to acquire the data aswell as the structure of the data. In section 3 we explain in detail the diﬀusion map embeddingand we will show the resulting embedding for a standard (horizontal and non-rotating) RBCcase. In section 4 we will analyse the embedded data regarding dynamical features of thesystems, and ﬁnd that the long-term dynamical behavior can be connected to the evolution ofthe LSC. We close the paper with a discussion section.3 . Experimental setup and data collection A sketch of the experimental setup is shown in ﬁg. 1a. The main part of the experiment is thecylindrical convection cell of aspect ratio Γ = 0 . ± .

02 K of the desired temperature during an experiment.To characterise the temperature ﬁeld, 24 thermistors were embedded in blind holes in thesidewall, roughly a millimeter away from the ﬂuid. Eight thermistors were equally distributedalong the azimuthal direction at each of the heights L/ L/

2, and 3 L/

4. The working ﬂuidwas water at a temperature of 40 o C, resulting in Pr =4.38 for all analysed measurements inthis paper. Under the applied conditions the ﬂow was highly turbulent with vigorous smallscale ﬂuctuations that self-organised into a large-scale circulation that was predominantly in theSRS. The SRS can be detected in sidewall temperature measurements as sinusoidal temperaturevariation along the azimuthal direction at a speciﬁc height (ﬁg. 1b).During an experiment, the top and bottom plate temperature T t and T b were held constantand the temperature of 24 thermistors embedded in the sidewall were recorded with a rate ofroughly one measurement every 3.4 s. The experiment was conducted for several hours. Datafrom the ﬁrst hour were discarded as the system has not yet reached statistical equilibrium.Please see [WA11c] for further experimental details. Top view 1234 5 6 7 0

D3L/4L/4L/2 θ / π (a) (b) θ T ( o C ) Figure 1: (a) Sketch of the experimental setup with the location of the thermistors marked withblack dots (left) and cross-section of the cylinder with the azimuthal location of thethermistors (right). Arrows mark the large-scale motion of the hot (red) and cold(blue) plumes that rise and sink close to the sidewall. (b) Thermistor measurementsat one time instance at the heights L/4 (red), L/2 (green) and 3L/4 (blue) showing thetemperature signature of the large-scale circulation in the single roll state. The solidlines are sinusoidal ﬁts to the temperature as a function of the azimuthal position.In the next sections we will reanalyse time series of the sidewall temperature measurementsusing diﬀusion maps embedding. We will compare the results with results published in [WA11c].The analysed datasets together with the relevant experimental parameters are listed in table 1.4ata Set: No. of Data points Ra Ω [rad/sec]D1 302360 9.0 × ×

3. Data embedding

In the following we consider the measurements of the 24 side wall thermistors as observations ofthe full system’s state (the full velocity and temperature ﬁelds) by a 24-dimensional observableat a given time. Before we start analyzing these data, we ﬁrst describe the diﬀusion mapsapproach for an arbitrary data set (a cloud of data points).

Let the set of measured data points Z := { z , . . . , z m } ⊂ R r be given. In our case, the stateof the system at a given time t j is represented by a 24-dimensional vector ( z j ∈ R ), spannedby the r = 24 thermistor readings. Note that the components of z j are dimensionless, as eachcomponent was calculated by subtracting the mean temperature ( T t + T b ) / T b − T t . The entire set Z thus represents a cloud of all points taken during an experimental run, i.e., m = 302360 points in the 24-dimensional phase space for data set D1 (see table 1). With theembedding algorithm explained below, we represent each state z j in a new coordinate system,spanned by convenient and representative embedding coordinates ξ • ,j := ξ • ( z j ) ∈ R n , n < r .Here n is determined from the output of the diﬀusion maps algorithm and should be such thatthe representation of the data set with these n coordinates delivers insight into its geometry; ingeneral we use n = 2 , reduced embedding coordinates ξ • will bea selected subset of the full embedding coordinates ξ , as described below. We denote by ξ i ( z j )the i -th (full) embedding coordinate of the j -th data point, and ξ i , the i -th coordinate function,can thus be represented by a vector in R m through setting ξ i,j = ( ξ i ) j = ξ i ( z j ), where ( · ) j denotes the j -th entry of a vector.Note that the method will use only the vectors of measurements, and in particular it willuse no information whatsoever on the thermistor positions at which these measurements weretaken. The diﬀusion maps approach is designed to reveal the geometric features with the largest(nonlinear) variation in the data set it is applied to.The method we choose to work with, diﬀusion maps [CL06b], assumes that the given datapoints are sampled from a low-dimensional smooth manifold that is embedded in the high-dimensional ambient space R r . It will try to construct (non-linear) coordinate functions fromthe data set into a low-dimensional space that is one-to-one; thus giving a low-dimensionalparametrization (embedding) of the data set. To ﬁnd this embedding map ξ • the methodconstructs a virtual diﬀusion process on the data points, with the jump probabilities of thisdiﬀusion being based on proximity between data points—and thereby it neglects any naturalor artiﬁcial ordering of the data, in particular any temporal order. Such information, however,can later be re-included as we will do below in section 4.2. The normalised temperatures z j relate to the measured temperatures T j as z j = T j − ( T b + T t ) / T b − T t . Here T t and T b are the top and bottom plate temperatures and T j is a 24-dimensional vector containing all temperaturemeasurements at time t j . small local Euclidean distances in theambient space R r are good approximations of local geodesic distances on the unknown datamanifold. This is not true for large distances, as the manifold might be curved. A diﬀusion-likeprocess that is running locally “along the data manifold” thus “feels” the intrinsic geometry ofthis manifold, and its properties thus reﬂect this geometry.The construction is as follows. First we choose a proximity or scale parameter ε >

0, anddeﬁne the data similarity matrix K ∈ R m × m by K ij = exp (cid:18) − (cid:107) z i − z j (cid:107) ε (cid:19) , k ε ( z i ) = m (cid:88) j =1 K ij , (1)where algorithmically often a cutoﬀ radius is used to set K ij (cid:28) K ij is a measure for the closenessbetween the two data points z i and z j in their r -dimensional state space (i.e., the 24-dimensionalspace in our case). The row sums k ε ( · ) are now used to pre-normalize the similarity matrix:ˆ K ij = K ij k ε ( z i ) k ε ( z j ) , d ( z i ) = m (cid:88) j =1 ˆ K ij . (2)By row-normalizing ˆ K we ﬁnally obtain the diﬀusion map matrix P ij = ˆ K ij d ( z i ) . (3)With this normalization, the matrix elements P ij can be interpreted as a probability that arandom walker moves from z i to z j . P is a stochastic matrix giving rise to a—by constructionreversible—Markov chain; a virtual diﬀusion on the data points. The central observation in[CL06b] (as pioneered in related works [RS00, BN03, DG03, ZZ04]) is that the right eigenvectorsof P , denoted by Ξ i ∈ R m , at dominant eigenvalues Λ i ∈ R , can be used as intrinsic coordinateson the manifold. More precisely, if (Λ i , Ξ i ) denote eigenpairs of P , we embed the data point z j into R m by ξ ( z j ) = (cid:16) Λ (Ξ ) j , . . . , Λ m (Ξ m ) j (cid:17) T ∈ R m . (4)Thus, the entry of the i -th eigenvector (scaled by the associated eigenvalue) at a data pointis used as i -th coordinate value for the embedded data point. As m is usually large (e.g., m = 302360 for dataset D1), this is not yet a simpliﬁcation. However, if the data manifoldis low-dimensional, then only the ﬁrst few eigenvectors carry geometric information, and thesubsequent ones are redundant. Thus, we choose n of the dominant eigenmodes (which wedenote, for notational simplicity, by the ﬁrst n subdominant ones, keeping in mind that wecould skip some, e.g., taking the 2-nd, 4-th, and 7-th mode) to deﬁne the diﬀusion map ξ • ( z j ) = (cid:16) Λ (Ξ ) j , . . . , Λ n +1 (Ξ n +1 ) j (cid:17) T ∈ R n , (5)where the main advantage is in n (cid:28) m and n (cid:28) r . Note that we discarded the ﬁrst eigenvector,since by the row-stochasticity of P we have Ξ = (1 , . . . , T because P Ξ = Ξ , and thus it isan entirely non-informative coordinate. We summarize the notation used in this constructionin Table 2. The pre-normalization is necessary for canceling a bias of the data distribution in the results, if this distributionis non-uniform. For details, we kindly refer the reader to [CL06b], where this is done by introducing a tuningparameter α . Our construction is obtained with α = 1. Why this is the case, is discussed in Appendix A.1. z j ∈ Z ⊂ R r physical phase space i -th embedding coordinate ξ i : Z → R ⇔ ξ i ∈ R m ξ i = Λ i Ξ i , where P Ξ i = Λ i Ξ i Full embedding ξ : Z → R m ( ξ ( z j )) i = ξ i ( z j ) = ξ i,j = ( ξ i ) j Reduced embedding ξ • : Z → R n ξ • ,j = ξ • ( z j )Table 2: Summary of the notation used in the construction of the embedding by diﬀusion maps.The computation of pairwise distances (cid:107) z i − z j (cid:107) as necessary in here, can be done eﬃcientlyup to hundreds of dimensions utilizing k-d tree data structure [Fri18], and subsequently solvinga m × m sparse eigenvalue problem through vector iteration [Ste02], as only the dominant modesare required. This latter step is usually the computational bottleneck, and makes the method for m (cid:29) infeasible, depending on the sparsity of P . Fortunately, one can subsample the data,compute (Λ i , Ξ i ) pairs on this data set of tractable size, and “interpolate” the embedding of theremaining points without the need of any further eigenvalue computations. More informationon how to do this, on how to choose the proximity parameter ε , and on why the diﬀusion mapgives good intrinsic coordinates on the manifold is deferred to Appendix A. The data set.

The measured data points are local observations z j = h ( x ( t j )) at speciﬁc times t j by some observation function h : X → R r of the original dynamics x ( t j +1 ) = F τ ( x ( t j )) , (6)where x ( t ) ∈ X is the full state of the system (the velocity and temperature ﬁeld), and F τ denotes the time dynamics of the system, i.e., governed by the momentum and energy equationsin our case. Although strictly not necessary, here we assume that the observation times t j areequispaced, i.e., t j +1 − t j = τ for all j .In the experimental setup of section 2 the state z j ∈ R r is given by the r = 24 sidewalltemperature measurements and thus marks a single point in a 24-dimensional parameter space.The number of data points acquired during a single experimental run is usually at least m ≈ .We note that to uncover the attractor of a partially observed system, often delay-embeddingin the sense of Takens [Tak81, Rob05] is used. For a noisy system, however, these approacheshave their limitations, as one would necessarily reconstruct the state space of the noise as well.Our analysis did not show a signiﬁcant diﬀerence in the geometry of the data set if analysed indelay-coordinates, thus in the following we will work in the original 24-dimensional data space. Geometric features of the data: small scales.

We now calculate diﬀusion maps from sidewalldata of dataset D1 that was acquired without rotation of the convection cylinder. For thispreliminary analysis we subsample the original data set and take 3800 equally sampled pointsbetween times 3400 s and 68930 s, as measured from the beginning of the experiment. We applythe automated procedure to ﬁnd an appropriate scale parameter ε , as described in Appendix A.2.This automated procedure yields ε = 1 . · − , and estimates a dimension of the data manifoldto be approximately 5. As we will see, the data does not have a simple manifold structure of7ocally homogeneous dimension (at least not on this scale), and noise plays a role as well, thusthis dimension estimate should not be taken as a precise numerical value. Nevertheless, asthe estimate is substantially smaller than the ambient dimension 24, we can claim with highcertainty that the data yields a low-dimensional geometric structure. (a) -0.2-0.150.15-0.1 0.1 -0.05 (b) -0.1 0-0.2 -0.05-0.2-0.10 (c) (cid:1) S R S o r i e n t a (cid:1) o n π not SRS SRSinde ﬁ niteDRS Figure 2: Embeddings of the partial data set D1 with diﬀusion maps and proximity parameter ε = 1 . · − . Color code represents time (a), roll state of the LSC (b) and azimuthalorientation of the LSC (c); please refer to main text for more details. Core regions ineach embedding are magniﬁed and shown on top of each plot.In ﬁg. 2 we show three-dimensional embeddings ξ • of the data by ( ξ , ξ , ξ ) (ﬁg. 2a), ( ξ , ξ , ξ )(ﬁg. 2b), and ( ξ , ξ , ξ ) (ﬁg. 2c). The coloring in this ﬁgure represents (a) time (time increasesfrom blue to yellow), (b) the roll state (dark blue: SRS, turquoise: DRS, yellow indeﬁnite),and (c) the orientation of the single roll (color hue: orientation, black: DRS and indeﬁnite), asobtained in [WA11c]. We see that all eigenvectors pick up a core disc-like structure and longexcursions that depart from the core and return to it shortly after. We will see below thatthe disc-like structure is a representation of the large-scale ﬂow structure in the system. Theexcursions on the other hand might be interpreted as signatures of the intermittent nature of theturbulent ﬂow. We already see here that the disc-like core correlates with the amplitude (ﬁg. 2b) and the orientation (ﬁg. 2 c) of the LSC. This indicates that the low-dimensional embeddingreadily represents important physical structures related the to large-scale circulation of theconvective ﬂow—the turbulent superstructures in convection—while the distinct eigenvectorsencode transient excursive behavior of diﬀerent kinds. We note that this distinguishes diﬀusionmaps from linear methods, such as PCA, which are not able to eﬀectively separate the bulk ofthe data from the excursions (computations not shown here).In this study we shall focus on the dynamically most prevalent structure, namely the coreregion where most data points reside. Noting that the transients are rare events consisting ofjust a few data points—which nevertheless have a large geometric deviation from the “meanpattern”—, we will consider an embedding that focuses on the large-scales and suppresses thetransients. This is achieved by increasing the proximity scale to ε = 5 · − . An explanationof the phenomenon is given via the diﬀusion distances in Appendix A.1. Further increasing of ε would make the similarity of all point pairs asymptotically equal, thus we would lose all theinformation on the geometric structure in the data. Taking ε (cid:28) . · − would, contrariwise,disconnect all data points, resulting in a similar loss of information. We refer to Appendix A.28or details. We also note that there is a reasonable robustness of the method with respect tolocal variation of the proximity parameter. Geometric features of the data: large scale.

The results of the diﬀusion map embedding with ε = 5 · − , are shown in ﬁg. 3, where we plot three-dimensional embeddings ( ξ k , ξ k +1 , ξ k +2 ) for k = 2 , . . . , ε Λ Λ Λ Λ Λ Λ Λ Λ · − relevant geometric information to be hidden in the lower spectrum. A comparison of the mutual ratiosof the log-eigenvalues log(Λ i ) with one another show similar ratios than the eigenvalues of theLaplacian on a disc, indicating that the large-scale geometric structure of the data set is adisc (see Appendix B). Another evidence supporting this claim is that the embedded points inﬁg. 3(a-d) gather along a two-dimensional sub-manifold in the 3d-space, meaning that two of theembedding coordinates parametrize the third one, hence this does not yield additional geometricinformation. Moreover, these three-dimensional embeddings show very strong similarities toanalogous embeddings by the eigenfunctions of the Laplacian on a disc, cf. ﬁg. 13 in Appendix B,indicating a disc-like data manifold.Figure 3: Diﬀerent diﬀusion map embeddings for the dataset D1. Color code marks the valueof the vertical coordinate for better visualisation. The kernel scale was ε = 5 · − .We will thus focus in the following mainly on the embedding of the data in the ( ξ , ξ )-space.First, we shall compare the information obtained from this embedding with a previous analysis9f the data—which relies on the knowledge of the physical setting of the underlying experiment,such as the shape of the container and the existence of convection roll states. Comparison with previous results.

Figure 4 shows the ( ξ , ξ )-embedding for ε = 5 · − .Additionally, some further properties obtained in [WA11c] of the physical system are representedby the color of the data points. In ﬁg. 4a, colors represent time at which the data were taken,ranging from blue at early times to yellow at later times. In this representation no clearcorrelation between time and the location of the data in the eigenvector space can be seen.That means, on suﬃciently large time scales the system is in a statistically steady state. Later,in section 4 we will further analyse the dynamical behaviour of the system. −0.01 0 0.01 (a) t i m e [ s ] x (b) L S C a m p li t ude [ K ] −0.0100.01 −0.01 0 0.01 (c) L S C o r i en t a t i on x x -pp −0.01 0 0.01 (d) L S C o r i en t a t i on x Figure 4: Relation between physically determined properties of the large-scale circulation andlocation of the states in the ( ξ , ξ ) eigenvector space. Color code marks (a) time, (b)amplitude of the LSC, and (c) orientation of the LSC as deduced from the azimuthaltemperature proﬁle at midheight (color hue shows orientation). Data points withvery small amplitude have been omitted in (c), since calculating the orientation forthe LSC with very small amplitude is not only ambiguous but also meaningless, asthere might not be a well deﬁned LSC. Subﬁgure (d) shows data points of the systemwhen the large-scale circulation is in a single roll state (blue) or a double roll state(red).The data points in ﬁg. 4b are color-coded with the amplitude of the LSC. Here, we see a clearcorrelation. Points with large radii r ξ,j := (cid:113) ξ ,j + ξ ,j also show large LSC amplitudes, while10oints inside the circle, with small radii also show small amplitudes. Furthermore, as shown inﬁg. 4c, the angle θ := arctan ( ξ /ξ ) also correlates with the orientation of the LSC very well.We thus see that the two dominant eigenvectors describe the most prominent variation of theLSC, the large-scale motion of the system.As we have pointed out already in the introduction, the LSC in cylinders of aspect ratioΓ = 1 / . It can be seen thatpoints corresponding to SRS and DRS are clearly distinguishable. While SRS points are locatedon the outer areas of the disc, points corresponding to the DRS are located closer to the centerof the disc. In this representation points belonging to SRS and DRS are not well separated, andthere is a large number of points that belong to a transition state. This suggests that the DRShere is not a stable state but rather an intermediated state after the SRS has been renderedunstable, e.g., due to the growth of a corner roll (see [SNS + what these structures are in a physical sense , unlesswe provide additional knowledge, as for example about the positions of the thermistors. As anadvantage, we can detect dominant dynamical behavior of the system without any additionalphysical knowledge, as long as these are present in the identiﬁed coordinates ξ . This willbe discussed in the next section. In our case, a suﬃciently spread-out arrangement of thethermistors is vital for “seeing” the LSC in the data. If the thermistors would be locatedrather close to each other, for example only at one half of the sidewall, the LSC could not havebeen detected — in this case the embedding would show other features related to the localﬂuctuations of the temperature ﬁeld at that particular location.

4. Analysing dynamical features

It is in the nature of turbulent ﬂows to show chaotic dynamics on diﬀerent time and length scales.We therefore propose in the following new methods of statistically analysing the dynamicalfeatures of the convective ﬂow utilising the diﬀusion maps approach. The main focus will be onanalysing our data in their ( ξ , ξ )-embedding. Recall that we denote this reduced embeddingof the j -th data point by ξ • ,j := ξ • ( z j ). As we observe in ﬁg. 4, the large-scale geometry of the dataset resembles a disc with sparseoccupation towards the center. In particular, data points that correspond to the SRS form aring. This suggests that the orientation on the ring (the “angular coordinate”) is the mostimportant variable of the system, as it has the largest variation in the measurement space.Thus, in the following we investigate the dynamics of this coordinate. Note that investigatingthe stability and dynamics of the SRS has been a topic of interest for more than a decade[BA07, BA08a, BA09, SBHT14].To this end, for each embedded point ξ • ,j ∈ R we compute its azimuthal angle θ j = arg( ξ • ,j ).Since we want to investigate the long time dynamics, we need to get rid of any eﬀects caused by Please see [WA11b] for the detection criteria of of both states. π . Any larger changes are due to winding and thus we unwind θ j by adding or removing multiple of 2 π until − π < ( θ j − θ j − ) ≤ π .Now, for a given oﬀset s ∈ N , we calculate ∆ θ j ( s ) = θ j + s − θ j and from this the correspondingmean displacement (cid:104) ∆ θ j ( s ) (cid:105) and its variance (cid:104) ∆ θ j ( s ) (cid:105) − (cid:104) ∆ θ j ( s ) (cid:105) . Here the average (cid:104)·(cid:105) is doneover all times, i.e., over all j . Non-rotating convection cylinder.

For experiment D1 with ε = 5 · − , we analyse the datafor the ( ξ , ξ ) embedding and show the result in ﬁg. 5. Fig. 5a shows the mean displacement(drift) and variance as a function of the lagtime ∆ := τ s on a linear scale, whereas the samedata are plotted on the double-log scale in (b). (a) (b) ∆ [sec]variancedisplacement 10 -3 -2 -1 (a) (b) ∆ [sec] Figure 5: Mean displacement (yellow triangles) and variance (blue bullets) plotted as a functionof the lagtime ∆ for the case without rotation (data set: D1). Diﬀusion maps werecalculated using ε = 5 · − . (a): linear scaled axis, (b): double-logarithmic axis.The green solid line marks a parabolic ﬁt to variance for ∆ <

30 s (coeﬃcient a =(2 . ± . × − rad /s ). The red dashed line marks a linear ﬁt to the varianceresulting in a coeﬃcient D = (9 . ± . × − rad /s. A ﬁt to the drift reveals anaverage drift of ω = (7 . ± . × − rad/s (ﬁt not shown in the plot).We see in ﬁg. 5a that the average displacement (yellow triangles) is negligible compared tothe variance. This is expected, since there is no source for a constant mean drift of θ with time.The mean drift is not exactly zero as we average over a ﬁnite number of samples (i.e., shortsections in time) and would further decrease for longer time traces. We will see below that thesame system exhibits a nearly constant drift when the convection cylinder is rotated around itsvertical axis with a constant rotation rate.The variance (blue bullets in ﬁg. 5) increases monotonically with increasing ∆. From ﬁg. 5bwe see that the increase appears to be linear for ∆ (cid:38) D = (9 . ± . × − rad /s. For small lag-times (∆ <

50 s) the variance does not follow12 linear trend. Its slope in the log-log plot (ﬁg. 5b) is close to 2, which suggests a ballisticbehaviour, i.e., (cid:104) (∆ θ ) (cid:105) − (cid:104) ∆ θ (cid:105) ∝ at for some a >

0. Such dynamics, i.e., a ballistic motionfor small time scales and a diﬀusive behaviour for large time scales is characteristic for examplefor molecular motion and can be well described by a Langevin equation dθ ( t ) dt = ω ( t ) + ω (7) dω ( t ) dt = − γω ( t ) + η ( t ). (8)Here, ω is a constant angular drift that might occur, for example in a rotating frame and γ isa resistance caused by friction. The stochastic noise η ( t ) describes contributions of turbulentﬂuctuations to the dynamics and is assumed to be δ -correlated, with (cid:104) η ( t ) η ( t ) (cid:105) = σ δ ( t − t ).These ﬂuctuations lead to increments of the form η ( t ) dt = σdW ( t ), with dW ( t ) being a standardWiener process.For eq. (7) the mean displacement and its variance can be calculated [Gar04], giving (cid:104) ∆ θ ( t ) (cid:105) = ω t , (9)and (cid:104) ∆ θ (cid:105) − (cid:104) ∆ θ (cid:105) ∝  at := σ γ t , t → ,Dt := σ γ t, t → ∞ . (10)Thus, the drag and forcing coeﬃcients γ and σ can be computed from the diﬀusion coeﬃcient D and the coeﬃcient a of the parabola, yielding γ = 2 a/D = (4 . ± . × − s − and σ = γ D = (2 . ± . × − rad /s .A Langevin equation has already been suggested for the orientation of the LSC in turbulentthermal convection in cylinders with aspect ratio Γ = 1 by Brown and Ahlers [BA08b]. In orderto compare our results with their results, we have to compensate for the diﬀerence in Rayleighnumber. This can be done, as our cell height was similar to theirs and they provide ﬁtted powerlaw relations between Ra and D θ . While the ﬁts in [BA08b] and thus our estimate includessigniﬁcant uncertainty, we estimate a diﬀusion coeﬃcient of D θ = 0 . × − rad /s , whichis more than an order of magnitude smaller than for our case. This result reﬂects the largerﬂuctuations of the ﬂow and smaller stability of the LSC in Γ = 1 / Rotating convection cylinder.

While for the non-rotating case, discussed above, the net driftwas very small and just a statistical feature that would shrink even more with longer time series,the drift becomes signiﬁcant when the cylinder is rotated around its cylinder axis. Figure 6shows an analysis of the embedded data acquired in a cylinder under rotation with a rotationrate of 0.088 rad/s (data set D2). As a result due to Coriolis forces the internal convectionstructure also rotates with respect to the side walls and thus with respect to the temperatureprobes deployed in them.We see that now ∆ θ increases linearly with ∆. A linear ﬁt to the date reveals a slope of ω = (1222 ± . × − ) rad/s. The variance of ∆ θ on the other hand looks very similarto the non-rotating case in ﬁg. 5. The variance increases initially (∆ (cid:46)

30 s) quadraticallywith a coeﬃcient a = (1 . ± . × − rad /s and for larger ∆ linearly with a diﬀusioncoeﬃcient D = (6 . ± . × − rad /s, corresponding to γ = (4 . ± . × − s − and σ = (1 . ± . × − rad /s . 13 (a) (b) ∆ [sec]variancedisplacement 10 -3 -2 -1 ∆ [sec] Figure 6: Mean displacement (yellow triangles) and variance (blue bullets) as a function of thelagtime ∆ for the rotating case (dataset D2). (a) shows the data plotted against linearscaled axis. (b) shows only the variance plotted against logarithmic axis. The linesare ﬁts to the data. Black: linear ﬁt to the drift for ∆ <

500 s resulting in a slope ω = (1 . × − ) rad/s. Green: Parabolic ﬁt to the variance for ∆ <

20 s resultingin a coeﬃcient a = (1 . ± . × − rad /s . Red dashed: Linear ﬁt to the variancefor 100 s < ∆ <

500 s resulting in a diﬀusion coeﬃcient D = (6 . ± . × − rad /s.It is quite interesting that while the drag coeﬃcient γ is very similar to the non-rotatingcase, σ and thus the diﬀusion coeﬃcient are signiﬁcantly reduced, by almost 40%. This showshow slow rotation suppresses turbulent ﬂuctuations, resulting in a much more stable large scalecirculation. This stabilising eﬀect has also been observed in other statistical quantities suchas the frequency of transitions between the double and the single roll state, the width of theprobability density function of the LSC amplitude, or the number of Fourier modes determinedfrom sidewall measurements [WA11a].We note that Ra was roughly 15% smaller for the rotating data set (D2) as for the non-rotating one (D1). The diﬀerence in Ra is too small to have any signiﬁcant inﬂuence on either γ or D , as we have observed from analysing other non-rotating data sets with Ra = 7 . × that were however much shorter and thus less suitable for a rigorous statistical analysis. A note on the radius evolution.

We have seen in section 3.2 that the radius r ξ = (cid:112) ξ + ξ corresponds to the amplitude of the LSC. We therefore also want to analyse the stochasticbehaviour of r ξ . Similarly to θ , we now look at displacements ∆ r ξ (∆) for given lagtimes ∆ andcalculate their variance v r := (cid:104) (∆ r ξ ) (cid:105) − (cid:104) ∆ r ξ (cid:105) .We plot in ﬁg. 7 v r as a function of ∆. For the non-rotating case (ﬁg. 7a) we see for ∆ < a = (1 . ± . × − . Forlarger ∆ there is a short range where the data follow a linear function of ∆. The correspondingslope is D = (4 . ± . × − . For even larger ∆, the slope of the data decreases again.This decrease is expected, as r ξ can not take arbitrarily large values as also the amplitude of14he LSC is conﬁned. We note that Brown and Ahlers [BA08b] have modeled the dynamics ofthe amplitude of the LSC as a Brownian motion inside a potential well. As ﬁtting the completemodel would be more involved, we defer this to future work. With the more general methodpresented in the next section, the statistical behaviour of r ξ is captured as well. Note, thatinformation about the exact amplitude of the LSC can not be calculated from r ξ but onlyqualitative features of it.Fig. 7b shows a very similar analysis for the rotating case (dataset D2). There, the coeﬃcients a and D are signiﬁcantly smaller, suggesting that also in this quantity the stabilising eﬀect ofrotation can be seen. We do not elaborate on this ﬁnding any further, as clearly a simpleLangevin model (eg. (7)) is only valid for very small deviations, and v r is a nonlinear functionof ∆, just as the relationship between r ξ and the amplitude of the LSC is assumed to benonlinear. Furthermore, the relationship between the LSC amplitude and r ξ might be evendiﬀerent for the non-rotating and the rotating case. (a) x10 −5 v r Δ [sec] (b) x10 −6 v r Δ [sec]

Figure 7: Variance v r of the change of the radius ∆ r ξ as a function of lagtime ∆ (blue bullets).Subplot (a) shows the non-rotating case (dataset D1). The red line marks a parabolicﬁt for ∆ <

20 s with coeﬃcient a = (1 . ± . × − and the green line marks a linearﬁt to the data with 50 s < ∆ <

100 s, resulting in a slope of D = (4 . ± . × − .Subplot (b) shows the rotating case (dataset D2). The red line marks a parabolic ﬁtfor ∆ <

20 s with coeﬃcient a = (3 . ± . × − and the green line marks a linearﬁt to the data with 40 s < ∆ <

60 s, resulting in a slope of D = (7 . ± . × − .The ∆-range for which we assumed a linear behaviour was chosen by eye. In the previous subsection we have discussed mainly the angular dynamics of the single convec-tion roll (SRS) by considering the dynamics of the polar coordinate in the ( ξ , ξ ) parameterspace. However, the dynamics could have other important features not readily revealed by theembedding geometry. In examples below we will investigate the switching between SRS andnon-SRS, and also the dynamics observed for a small- ε embedding in three dimensions, wherethe transient excursions of the dynamics are still dominating the geometry.The tool for this is going to be the transition matrix, which describes the redistribution ofstates under the dynamics between subsets of the state space. As such, it is an approximation of15he so-called transfer operator that describes the evolution of distributions due to the dynamicsof the system [LM94, DJ99].More precisely, we consider the m embedded data points in an n -dimensional eigenvectorspace, i.e., Y := { ξ • , , . . . , ξ • ,m } ⊂ R n . Let P = { B , . . . , B p } be a partition of a domain D ⊂ R n (i.e., D = (cid:83) pb =1 B b ), such that D covers Y (i.e., Y ⊂ D ), and every box is necessary for thecovering, i.e., B b ∩ Y (cid:54) = ∅ for every b = 1 , . . . , p .Recall that the data points were obtained from a long simulation, and were sampled at anearly constant rate with time period τ = 3 . ξ • ,j ) j =1 ,...,m ,represents the dynamical time series (cid:0) x ( t j ) (cid:1) j =1 ,...,m observed through ξ • ◦ h . Consequently, onecan view ξ • ,j + s as the image of ξ • ,j under the dynamics after a lagtime ∆ = sτ , where s isthe chosen oﬀset . Let J = { , . . . , m − s } . The discretization of the dynamics is then done byconstructing the so-called transition matrix T = T ( s ) ∈ R p × p by T i(cid:96) = { j ∈ J : ξ • ,j ∈ B i and ξ • ,j + s ∈ B (cid:96) } { j ∈ J : ξ • ,j ∈ B i } (11) ≈ Prob (cid:2) ξ • ( z (∆)) ∈ B (cid:96) (cid:12)(cid:12) ξ • ( z (0)) ∈ B i (cid:3) , counting the relative number of transitions from box i to box (cid:96) , within the time from step j to j + s . Note that we can use an arbitrary oﬀset s ∈ N in (11), and arbitrary partition P of thedata set, but should consider the following aspects such that the transition matrix representsthe dynamical properties of the system well [PWS +

11, SNL + B i , otherwise mostpoints do not leave the box they start in, and thus the dynamical features remain invisible.(b) The oﬀset should not be too large with respect to the number of data points in the boxes,otherwise the Monte Carlo estimate (11) might be strongly erroneous.(c) On a similar note, the boxes should not be too small with respect to the data density,otherwise there are too few data points per box, and again one obtains too high samplingerrors.By construction, T is a row-stochastic matrix. Its elements approximate the probabilities thata point from a certain box B i is mapped into another box B (cid:96) by the dynamics. For determiningthe box partition and assembling the transition matrix we use the MATLAB-based softwarepackage GAIO [DFJ01]; cf. https://github.com/gaioguy/GAIO/.Note that in principle this analysis can be applied directly to the 24-dimensional param-eter space in which our original data points x k are observed. However, partitioning a high-dimensional space suﬀers from the curse of dimensionality, and becomes quickly computation-ally intractable. Also, with the diﬀusion maps embedding we can represent our data pointsbased on a few chosen most important independent features and neglect everything else. In thenext section we will elaborate how spectral analysis of T can uncover the long-term dominantdynamical behavior of the underlying process.Assembling the transition matrix by (11) scales linearly with the number of data points,making it numerically tractable for a large amount of dynamical data. Because for diﬀusionmaps the computational bottleneck is to acquire the pairwise distances between (close-by) datapoints and to solve the eigenvalue problem, it would be desirable to do this computation on data Usually the B j are non-overlapping axis-parallel n -dimensional “boxes”; that is, B b = I b, × . . . × I b,n , for someintervals I b,i , i = 1 , . . . , n . m (cid:46) . These seemingly opposing attributes can be brought together by the out-of-sample extension of diﬀusion maps [CSSS08]. On the one hand, for a representative subset Y (cid:48) ⊂ Y of data points the diﬀusion maps embedding of the dominant geometric features of thedata set is not improved by increasing the size of Y (cid:48) . On the other hand, having computed adiﬀusion map embedding for some Y (cid:48) , there is a substantially cheaper way to approximate theembedding coordinates of additional points (i.e., Y \ Y (cid:48) ) than to compute the embedding for thefull data set. This method is described in Appendix A.3 and it is utilized in the following tocompute the embedding on less than 5 × data points, and to “interpolate” the embedding ofup to 3 × additional data points. These are then used to calculate the transition matrix T . Stationary distribution.

The transition matrix provides access to the long-time behavior ofthe dynamics. For instance, a stochastic dynamics governed by the matrix T (i.e., a Markovchain jumping from box to box) will be ergodic if T is irreducible, i.e., every box can be reachedfrom any other box with a positive probability, thus T ki(cid:96) > k . Then therelative time duration that the process spends in box B b is given by the b -th component of the stationary distribution µ = ( µ , . . . , µ p ) that is given by µ T T = µ T with (cid:80) pa =1 µ a = 1. It isstraightforward to see that for our transition matrix holds µ b = { j ∈ J : ξ • ,j ∈ B b } J .µ b represents the probability that a random data point lies within box B b and it is µ b =lim k →∞ T kib . Almost invariant behavior.

If the transition matrix was not irreducible, there would be disjointsets I ∪ . . . ∪ I (cid:96) = { , . . . , p } such that T ab = 0 whenever a ∈ I i and b ∈ I j , i (cid:54) = j . The indexsets I i , and equivalently the sets (cid:83) b ∈ I i B b , are then called invariant. This means that there arediﬀerent sets of boxes, between which transitions are impossible. A point ξ • ,j can only transferto other boxes that are part of its own box set, but not to boxes in other sets. One can showthat in this case T has an (cid:96) -fold eigenvalue 1. There would be right eigenvectors ρ k satisfying (cid:0) ρ k (cid:1) i = (cid:26) , i ∈ I k , , i / ∈ I k . (12)Whether the transition matrix is irreducible or not highly depends on the time duration consid-ered here, i.e., the oﬀset s . For example, for s = 0 every box is decoupled from the other boxes,while for s → ∞ we gain the stationary solution T ab = µ b (naturally, one would also need anarbitrarily long data set to be able to set up T ).Now, if for some s the transition matrix is not irreducible but close to an irreducible matrix(with respect to some matrix norm) with (cid:96) invariant sets, then T has (cid:96) eigenvalues close to 1,and the corresponding eigenvectors are close to those in (12). That means, in turn, if T haseigenvalues close to one, we can identify regions (unions of boxes belonging to the same indexset I i ) that the system is unlikely to leave within time ∆, i.e., they are almost invariant [Dav82,GS98, DJ99, DW04, GS06].In ﬁg. 8(a) we show the eigenvector of T (obtained with s = 10) corresponding to the largestreal eigenvalue, λ = 0 . τ ≈ .

45 s, the decay rate of this eigenvalue is κ = log( λ )∆ = − .

020 s − , -0.03-0.02-0.0100.010.020.03 -0.04-0.03-0.02-0.0100.010.020.030.04 jump time [sec] -3 -2 -1 r e l a t i v e f r equen cy - ++ - (a) (b) Figure 8: Analysis of dataset D1. (a) Eigenvector ρ at the largest subdominant purely realeigenvalue of the transition matrix constructed with ε = 5 · − , ξ • = ( ξ , ξ ), s = 10,and with an initial 32 ×

32 box covering of the bounding box of the data leading to p =754. The color scale has been adjusted for better visual distinguishability of positiveand negative regions. The circle represents the boundary in the embedding spacebetween SRS and DRS as determined from ﬁg. 4(d). (b) Switching time distributionbetween “outer” (yellow, positive) and “inner” (blue, negative) regions of the boxpartition. The gray lines are log-linear ﬁts to the respective distribution up to jumptimes of 1000 s. The switching time distributions seem to be exponential with meanswitching times t − → + = (145 ±

5) s and t + → − = (168 ±

5) s.indicating transitions between the blue and yellow regions of the ﬁgure (positive and negativeparts of the eigenvector) happening on the order of timescale κ − ≈

50 s.As the sign structure of the eigenvector distinguishes regions well described by some level setof r ξ , this means that the largest almost invariance of the dynamics is in the radial direction.Comparison with ﬁg. 4 suggests that the dynamical feature found here approximately separatesthe SRS from non-SRS of the convection. Note that this is not the same as separating SRS fromDRS, as this separation would be done by the gray circle in ﬁg 8, as obtained from ﬁg. 4(d).Computing expected lifetimes supports this observation. In [WA11b] expected lifetimes of53 s for the DRS and 270 s for the SRS were measured, while both seemed to be exponentiallydistributed random variables. Fig. 8(b) shows the empirical distribution of jump times betweenthe almost invariant sets. They seem exponentially distributed too, suggesting the dynamics onthis level of coarseness (i.e., only observing whether r ξ > c for some appropriate constant c ) isMarkovian. The expected transition time between the yellow and the blue areas are (145 ±

5) s(blue → yellow) and (168 ±

5) s (yellow → blue). The numerical values in these two studies areof the same order of magnitude, however the discrepancy between them is not surprising. Itstems from the fact that [WA11b] was measuring lifetimes of SRS and DRS, however there isa transition region between the two which does not belong to either. Our analysis assigns thetransition region, the DRSs and some SRSs with smaller r ξ to one almost-invariant set (blueregion), and the rest to the other. Almost cyclic behavior.

Let us consider an cyclic permutation σ : { , . . . , p } → { , . . . , p } with period p on the set of boxes. The associated transition matrix satisﬁes S ab = 1 if b = σ ( a ),otherwise S ab = 0. It is well known that the eigenvalues of S are the p roots of unity, λ k = ω k ,18here ω = e π i /p and i is the imaginary unit. The corresponding right eigenvectors ρ k are (cid:0) ρ k (cid:1) σ ( a ) = ω k (cid:0) ρ k (cid:1) a . (13)It follows that S p = Id, the identity, and thus a permutation induces a cyclic behavior. Ifall we know is S , we can deduce the dynamics from its eigenvectors by (13). Note that everypermutation matrix is also a stochastic matrix.With a similar reasoning one can also relax the permutation requirement on S in the followingways:(a) Let S be a stochastic matrix with permuting blocks, i.e., there are I , . . . , I (cid:96) with I ∪ . . . ∪ I (cid:96) = { , . . . , p } and a permutation σ : { , . . . , (cid:96) } → { , . . . , (cid:96) } such that if a ∈ I i then S ab = 0 forall b / ∈ I σ ( i ) . This means that S is a permutation viewed block-wise as given by the indexsets I i . Then, e π i k/(cid:96) are among the eigenvalues of S , and the corresponding eigenvectorsare constant on the I i .(b) If the matrix S is stochastic, and close to a permutation matrix, then also its eigenvaluesand eigenvectors are close to those of a permutation matrix. In this sense, one can speakof almost-cyclic behavior.To summarize, if the transition matrix T shows eigenvalues close to those of a permutationmatrix, i.e., eigenvalues close to the unit circle in the complex plane, then the time-series isexpected to have an almost cyclic component in it [DJ99].Via transition matrix analysis one is able to ﬁnd the hidden cyclic behavior even when consid-ering an embedding that does not map the data on a disc where an angular coordinate can readilybe identiﬁed. To show this, we use the automated ε -selection procedure from Appendix A.2for the data set D2, where the convection cylinder is rotating. This gives ε = 7 . · − . Wesubsample the data set, such that only every second measurement is used, and “interpolate” theremaining data points as described in Appendix A.3. For this proximity parameter, the largebut rare excursions dominate the geometry, and we choose ξ • = ( ξ , ξ , ξ ), where the disc-likegeometry is not obvious. To compute the transition matrix, we subdivide the bounding box ofthe embedded, three-dimensional data into 128 × ×

128 congruent boxes, by keeping onlythose that contain data points; leaving us with p = 856 boxes. With the oﬀset s = 5 we computethe transition matrix T , and ﬁnd the second eigenvalue with decay rate κ = − . ± . . The real part of the corresponding eigenvector is shown in ﬁg. 9(b). It has negligibly smallvalues on the excursing paths, and shows a sinusoidal pattern on the part of the embeddingspace that can be attributed to the SRS; cf. ﬁg. 9(a). Application of T turns the pattern ofthe eigenvector counterclockwise with period t per = π Im ( κ ) = 540 s. Note that in section 4.1we calculated for the same dataset (D2) a rotation period of the structure of 2 π/ . ε and due to thediscretization error from the transition matrix computation on a ﬁnite partition. In summary,the transition matrix method reveals the cyclic behavior in the dynamics even for complicatedembedding scenarios. Transition matrix analysis summary.

In ﬁgures 10 and 11 we depict the real parts of the ﬁrst12 left eigenvectors of the transition matrix for the data sets D1 and D2, respectively. Theireigenvectors can be separated roughly in two classes: the ﬁrst showing sinusoidal patterns in19 a) -0.150.02 -0.1 -0.2 0 -0.02 -0.05-0.15 -0.04-0.06 0-0.1 -0.050 (b) Figure 9: (a) Embedding of D2 with ε = 7 . · − , coloring represents the orientation of the LSCas measured in [WA11a]. (b) Real part of the second eigenvector of the transitionmatrix on a 856-box covering with oﬀset s = 5, giving ω = 1 . · − and a periodof 538 s.the angular direction, and the second showing variation primarily only in the radial direction.This suggests that—at least on long time scales—the dynamics in the orientation and the radialdirection are independent; otherwise we would expect a mixed-mode eigenvector, i.e., one thatcannot be written as a product φ ang( θ ) φ rad( r ) of purely angular and radial modes.If the dynamics were a pure noisy rotation of the orientation, the drift and diﬀusion coeﬃcients ω and D from section 4.1 could be approximated from the eigenvalues of the transition matrixdirectly, as described in Appendix C. This works well for the angular frequency, as shown abovefor the data set D2, is however more defective for the diﬀusion coeﬃcient. There is increasing activity on the dynamical analysis of ﬂow ﬁelds; in particular the lastdecade witnessed the development of diﬀerent novel tools. We brieﬂy discuss two such methods.The ﬁrst one seems to be the most widespread to date. The second is in spirit the closestto ours. Then we draw a conceptual comparison with the theory of eﬀective (or reduced)transfer operators and reaction coordinates [BKK + + Dynamic mode decomposition.

Dynamic mode decomposition (DMD), ﬁrst introduced in [SS08],was devised to reveal long-term dynamical features of a system (eq. (6)), when all that is avail-able are observables z , . . . , z m sampled at a constant rate. With some oﬀset s , building the20igure 10: Dataset D1. First 12 eigenvectors (left to right, row-wise from top to bottom) of thetransition matrix constructed with ε = 5 · − , ξ • = ( ξ , ξ ), s = 10, and with aninitial 64 ×

64 box covering of the bounding box leading to p = 2891. The decayrates log( λ k ) sτ are shown directly above the eigenvectors.21igure 11: Dataset D2. First 12 eigenvectors (left to right, row-wise from top to bottom) ofthe transition matrix constructed with ε = 5 · − , ξ • = ( ξ , ξ ), s = 5, and withan initial 32 ×

32 box covering of the bounding box leading to p = 714. The decayrates log( λ k ) sτ are shown directly above the eigenvectors.22ata matrices X =  | | z z · · ·| |  and Y =  | | z s z s · · ·| |  , of the same size, one seeks a matrix (linear transformation) A such that AX ≈ Y , where thisequality is solved in a least-squares fashion (by minimizing the Frobenius norm); i.e., A = Y X + with X + being the pseudoinverse of X .In [WKR15], a connection of DMD and the so-called Koopman operator has been revealed,which has been extended to the Perron–Frobenius operator in [KKS16], which is dual to theKoopman operator. In particular, DMD converges to a Galerkin projection of the Koopman(and, by a similarity transformation, to the Perron–Frobenius) operator onto the space spannedby functions linear in the observable vector z ; cf. [KKS16].The main connection to our work here is that the transition matrix T (eq. (11)) is a dis-cretization of the Perron–Frobenius operator as well, as described in [KKS16, Section 3.2]. Weexpect this discretization to be superior to DMD, as it is a projection onto hundreds of boxesin our example cases, while DMD is a projection merely onto the 24-dimensional observablespace. Nevertheless, an application of DMD with oﬀset s = 5 to the data set D2 gives a modewith decay rate κ = − . ± . (cid:80) k =1 e πk i / (( z ) k + ( z ) k +8 + ( z ) k +16 ), where z = (( z ) , . . . , ( z ) ) T ∈ R . However,applying DMD to the data set D1, we do not ﬁnd any mode with decay rate close to κ = − . z .Thus, DMD can capture dynamical features that can be characterized by linear functions ofthe observable, but is oblivious to other dynamical behavior. Direct Koopman analysis by diﬀusion maps.

Berry, Giannakis, and Harlim recently developeda method [BGH15] with strong connections to the one we presented here. They approximatethe so-called (stochastic) Koopman generator L , i.e., the generator of the stochastic diﬀerentialequations assumed to be underlying the dynamics that governs the data, directly on the diﬀusionmaps eigenvectors Ξ i . To this end they use the approximation L Ξ( z j ) ≈ τ (cid:0) Ξ( z j +1 ) − Ξ( z j ) (cid:1) to obtain a converging (Galerkin) approximation as the number of dynamical samples grows.Our transition matrix T can be seen as an approximation of e τ L on a discretization of thedomain of this operator; see [FJK13] for connections between the transition matrix and dis-cretization of the generator. The main diﬀerence lies in the discretization of the approximationspace: while we are aggregating data points in a ﬁxed number of partition elements, the approx-imation L ∈ R m × m of [BGH15] grows in size with the size of the data set, and makes it thusnumerically intractable at some point. It should be nevertheless remarked that by combiningtheir method with the out-of-sample extension of diﬀusion maps, as used here, it is possible For functions f : X → C , the Koopman operator K τ associated with the dynamical system F τ is given bythe linear mapping f (cid:55)→ f ◦ F τ in the deterministic, and by f (cid:55)→ E [ f ◦ F τ ] in the stochastic dynamical case.Here, E [ · ] denotes expectation. The Perron–Frobenius operator P τ , usually considered on the space L ofintegrable functions, is its dual, and is uniquely deﬁned by (cid:82) P τ f g = (cid:82) f K τ g . It describes the propagation ofprobability distributions under the (deterministic or stochastic) dynamics F τ ; see [LM94], for instance. Forboth of these operators we use the general name transfer operators .

23o compute more accurate (Galerkin) approximations of L on a ﬁxed tractable set of diﬀusionmaps eigenfuctions [TGDW19]. Eﬀective transfer operators and Koopman Mode Decomposition.

Eﬀective (reduced) trans-fer operators arise in the context when the state of the system is only observed partially, throughsome non-linear observation function ξ : X → R r . They propagate distributions or observables f ξ that are functions of ξ , i.e., f ξ = ˜ f ◦ ξ with ˜ f : R r → C , describing the eﬀect of the dynamicsas seen through the observation function ξ , conditional to the system being in equilibrium.Thus, they are given by the conditional expectations P τξ f ξ ( x ) = E (cid:2) P τ f ξ ( z ) (cid:12)(cid:12) ξ ( z ) = ξ ( x ) (cid:3) , K τξ f ξ ( x ) = E (cid:2) K τ f ξ ( z ) (cid:12)(cid:12) ξ ( z ) = ξ ( x ) (cid:3) , (14)where E [ · ] denotes expectation with respect to the invariant measure of the system. It isimmediate that our transition matrix T , deﬁned by (11) in section 4.2, is a discretization of P τξ with observation function ξ = ξ • ◦ h . In fact, it can be seen as a Galerkin projection of thisoperator on the space spanned by piecewise constant functions over the boxes B i , cf. [KKS16]for further details. Its transpose T T is an analogous approximation of the eﬀective Koopmanoperator K τξ , which is the adjoint of P τξ . It is shown for reversible dynamics in [BKK +

18] thatthe dominant timescales of the original system are still well retained in the system observedthrough ξ , i.e., the dominant spectra of P τ and P τξ are close, if the dominant eigenvectors φ j of P τ are well parametrized by the observation function, i.e., there exist ˜ φ j : R r → C , such that φ j ≈ ˜ φ j ◦ ξ . This non-linear representation property is often present in molecular systems wherethe long time scales are connected to transitions between regions of state space (“reactions”),hence such a ξ is called reaction coordinate .Koopman Mode Decomposition (KMD) [RMB + ξ .It assumes that the observation function can be decomposed in the eigenfunctions of K τ , i.e., ξ ( · ) = (cid:80) ∞ k =0 v j φ j ( · ), where v j ∈ C r are the vector-valued coeﬃcients of this decomposition (theKoompan modes). The evolution of the observation ξ ( x ( t )) of the true systems state x ( t ) canthen be described as E (cid:2) ξ ( x ( t )) (cid:12)(cid:12) x (0) = x (cid:3) = K t ξ ( x ) = ∞ (cid:88) k =0 λ t/τj v j φ j ( x ) , or some suitable truncation of this series.Given some observation function ξ , both KMD and spectral analysis of the eﬀective transferoperator are model reduction tools for the original, complex, possibly high-dimensional deter-ministic or random system. While KMD aims at reconstructing the (expected) dynamics point-wise, spectral analysis of the eﬀective transfer operator gives information about the dynamicalprocesses with the slowest time scales. Requirements for good performance of both methods arein some sense quite opposite: for KMD to perform eﬃciently, one requires ξ to be (componen-twise) representable through linear combinations of a reasonable number of eigenfunctions ofthe Koopman operator, while the eﬀective transfer operators require the dominant eigenfunc-tions of the full transfer operators P τ or K τ to be approximately non-linearly parametrizableby a low-dimensional observation function. From the perspective of having a complex (chaotic)system at hand, it appears more suitable to ask for the long-term dominant statistical behaviorof the system, as delivered by analyzing the eﬀective transfer operators, than to consider singletrajectories, as given by KMD. Also, once we have given a fair parametrization ξ of the (approx-imate) attractor of a system, unless important dynamical processes happen in dimensions of24he attractor not seen by this parametrization, the eigenfunctions of the full transfer operatorsare non-linear functions of ξ .

5. Summary and discussion

In this paper, we have analysed geometric and dynamical features of turbulent Rayleigh–B´enardconvection using the diﬀusion maps embedding approach. The state of the system was measuredby the temperature at 24 diﬀerent locations in the sidewall of the convection cylinder, and theapproach embeds the unordered set of measurements into a space of chosen dimensionality.We applied this embedding to the data of two diﬀerent data sets, one where the convectioncylinder was at rest and another one where it rotates with a constant angular speed. Wefound that in both cases the transformed data set forms a disc in the space spanned by the twodominant diﬀusion-map coordinates ξ and ξ . We found that this disc represents the large-scalecirculation (LSC) in the convection container, with the orientation of the LSC being representedby the azimuthal location on the disc, and the strength of the LSC being represented by theradial distance from the disc center.We furthermore investigated the double and single roll states and their representation inthe ( ξ , ξ )-embedding. We found that points corresponding to the DRS were located at thecenter of the disc, while points belonging to the SRS were located at the perimeter of the disc.Analysing the azimuthal dynamics of the embedded data shows a diﬀusive process on largetime scales and a ballistic process on short time scales. This behavior supports a model for thelarge-scale circulation based on a Langevin equation suggested by Brown and Ahlers [BA08b].While we clearly see the diﬀusive azimuthal motion of the LSC in our approach, we did notdetect any clear signatures of the torsional or the sloshing mode of the LSC (see e.g., [ZXZ + .

5, causingthe SRS to be rather unstable with frequent switching between SRS and DRS. Due to thiserratic behaviour of the SRS, periodic torsional and sloshing oscillations are diﬃcult to detect.However, we do expect to observe the torsional mode for Γ = 1 cells when the SRS is morestable and the torsional mode better detectable.If the embedding does not represent the bulk dynamical behaviour (because, e.g., large butrare dynamical excursions dominate the geometry in state space), a transition matrix analysisin the low-dimensional embedding space can reveal hidden large-scale dynamical features, e.g.,cyclic dynamics; shown in Figure 9.The proposed approach belongs to the class of transfer (or Koopman) operator methods to ap-proximate properties of dynamical systems. While having direct connections to other methods ofthis class [RMB +

09, WKR15, BGH15], it is a novel composition of the so-called Ulam discretiza-tion of transfer operators [Ula60, Fro98, DJ99, KKS16] and the diﬀusion maps [CL06b, CL06c]manifold learning approaches to approximate a so-called eﬀective transfer operator [BKK + P τξ from (14) inthe correct limit of inﬁnite data points and vanishing-diameter box-covering. This should followby combining results from [VLBB08, DJ99] and [KKS16, Appendix C], under the assumptionthat the data lies exactly on a ﬁnite-dimensional smooth manifold.We note that many of the ﬁndings reported here, can also be calculated by using a classicalapproach, where the amplitude and orientation of the LSC is determined from ﬁtting a cosinefunction to the thermistors at a given vertical position [BA07, FBA08, BA09, BA08b, WA11c].While this approach assumes already the existence of an LSC, the diﬀusion maps approach in25ontrast is suitable to detect unknown structure in data in the ﬁrst place.In summary, our approach was able to provide geometric intuition about the dynamicalstate space (in a general sense, the stochastic “attractor”) of the inﬁnite-dimensional Rayleigh–B´enard convection system in a cylinder. Furthermore, the long-term dynamical behavior on thisspace was revealed, and found to be in good quantitative agreement with previous experimentalresults. While these experimental results heavily relied on the knowledge of the physical space’sgeometry and model assumptions, our analysis is not utilizing such knowledge at any point.We believe that this and similar analysis techniques can help to improve the understanding andreduced modeling of high- or inﬁnite-dimensional complex systems.

Acknowledgement

We acknowledge support by the Priority Program 1881 “Turbulent Superstructures” of theGerman Research Foundation (DFG). We thank Michael Wilczek and J¨org Schumacher forvaluable discussions.

A. More on diﬀusion maps

A.1. Diﬀusion distance

To understand in which sense does diﬀusion maps retain the geometric properties of the datamanifold after embedding, let us recall that P is a stochastic matrix and thus P ij can be viewedas the probability that a random walker jumps from the data point z i to the data point z j .Now deﬁne a diﬀusion distance as follows: D ( z i , z j ) = (cid:88) k | P ik − P jk | π k . (15)Here, π = ( π , . . . , π m ) T the stationary distribution of the Markov chain, and satisﬁes π i = d ( z i ) (cid:80) k d ( z k ) , such that π i P ij = π j P ji . In this deﬁnition, the diﬀusion distance D ( z i , z j ) is small, if the transition probabilities from z i are similar to the transition probabilities from z j . This can be written as P i − ≈ P j − , where P i − denotes the i -th row of P .Thus, we note that for ˆ ξ i := P i − we have (cid:13)(cid:13) ˆ ξ ( z i ) − ˆ ξ ( z j ) (cid:13)(cid:13) /π = D ( z i , z j ) , i.e., a weighted Euclidean distance in the space of distributions on the data set corresponds tothe diﬀusion distance on the data manifold. The important observation in [CL06b] is that thisnow can be reformulated using the eigenvalue decomposition of P to yield D ( z i , z j ) = m (cid:88) (cid:96) =1 Λ (cid:96) (Ξ (cid:96) ( z i ) − Ξ (cid:96) ( z j )) . (16)26his now means that with the embedding ξ ( z i ) :=  Λ Ξ ( z i )Λ Ξ ( z i )...Λ m Ξ m ( z i )  (17)satisﬁes (cid:13)(cid:13) ξ ( z i ) − ξ ( z j ) (cid:13)(cid:13) = D ( z i , z j ) , (18)where (cid:107) · (cid:107) is the usual Euclidean norm.Eq. (18) holds only for the full m -dimensional embedding. Assuming that the eigenvaluesΛ (cid:96) decay suﬃciently fast (after sorting them accordingly), we can represent the data pointssuﬃciently well by just a few eigenvectors, as in (5), i.e., in a lower-dimensional space. Alsoif this is not the case, one still obtains a valuable parametrization of the data manifold byjust a few (usually the number of eigenvectors to take is the dimension of the data manifold)selected eigenvectors, as the other eigenvector do not contain additional topological information (these are so-called higher order harmonics, cf. Appendix B for an example). Of course, thequantitative property (18) is lost, but on a qualitative level one still obtains a low-dimensionalone-to-one embedding of the data manifold.Let us now consider the statement from section 3.2, claiming that a larger proximity parameteremphasizes the one-dimensional transient loops less in the embedding. Note that the data pointson the transient arc are sparsely spaced along a one-dimensional line. This means, for a small ε there will be non-negligible transition probability P ij essentially only between neighboringpoints z i and z j . Thus, the random walker needs many steps to transition from one part ofthe arc to another. Meanwhile, for a larger ε , transitions jumping over several neighbors arepossible, speeding up transitions strongly. In the bulk, for both small and large values of ε there are plenty neighbors present, allowing for a fast diﬀusive spreading. Thus, the diﬀusiondistance between points in the bulk stays low in ε is changed from large to small, while on thetransient arc it becomes large. If the pairwise mutual distance between the arc points is large,they need to ﬁll in more space in the embedding. A.2. Optimal choice of the proximity parameter ε Here we will consider the task of automatically determining a “good” proximity parametervalue ε (sometimes also called bandwidth ) in the diﬀusion maps method. The ideas presentedhere stem from [CSSS08], and have been later reﬁned in [BH16].Let us consider the entries K ij ( ε ) = exp (cid:0) − ε − (cid:107) x i − x j (cid:107) (cid:1) of the similarity matrix in thediﬀusion maps construction. If there are m data points sampling the manifold M , then, byinterpreting the following sum as Monte Carlo approximation of an integral, we obtain S ( ε ) := (cid:88) i,j K ij ( ε ) ( ∗ ) ≈ m vol( M ) (cid:82) M (cid:82) M exp (cid:0) − ε − (cid:107) x − y (cid:107) (cid:1) dx dy ( ∗∗ ) ≈ m vol( M ) (cid:82) M (cid:82) R d exp (cid:0) − ε − (cid:107) x − y (cid:107) (cid:1) dx dy = m vol( M ) (2 πε ) d/ . Here, on the one hand, ( ∗ ) works if ε is suﬃciently large, such that the point cloud { x i } mi =1 “resolves” the functions exp (cid:0) − ε − (cid:107) · − x j (cid:107) (cid:1) properly, such that the Monte Carlo estimation27s valid. On the other hand, ( ∗∗ ) is a good approximation, if ε is suﬃciently small, such thatthe integral (cid:82) M exp (cid:0) − ε − (cid:107) · − y (cid:107) (cid:1) is well approximated by the same integral on the tangentialspace R d of M at y . For this, the “Gaussian bell” should be suﬃciently localized, i.e., ε small. It is assumed, that in between, there is a sweet spot for the values of ε that the aboveapproximations hold. It follows thatlog( S ( ε )) ≈ d ε ) + log (cid:32) (2 π ) d/ m vol( M ) (cid:33) . (19)We note the two limiting behaviors: • As ε →

0, we have K ij ( ε ) → δ ij , the Kronecker delta. Thus, S ( ε ) → m . • As ε → ∞ , we have K ij ( ε ) →

1, thus S ( ε ) → m .In between these two extremes there should be a region of linear growth in the double-logarithmicplot S ( ε ) versus ε , according to (19). This is suggested to be determined by maximizing d log( S ( ε )) d log( ε ) .The idea is that such an ε is neither too small (compared with the data point density), nor toolarge (compared with the diameter of the data point cloud). Note that the procedure also givesan estimate of the dimension of the manifold M through the slope computed in (19). A.3. Out-of-sample extension of diﬀusion maps

Once we have computed the diﬀusion map matrix P and its eigenpairs (Λ , Ξ) for a given ﬁxedset of data points, z , . . . , z m , we would like to embed a new data point z without repeating thewhole diﬀusion map computation anew on the entire data set augmented by z . Especially, wewould like to avoid solving the numerically expensive eigenvalue problem.To this end, let us write the eigenvalue equation P Ξ = ΛΞ of the diﬀusion map matrix in thefollowing functional form, Ξ( z i ) = 1Λ m (cid:88) j =1 p ( z i , z j )Ξ( z j ) , (20)where p ( z i , z j ) = P ij . The main idea of the “interpolation” is to use (20) to evaluate Ξ in anarbitrary point z , cf. [CL06c, Def. 2]. For this we need that the function z (cid:55)→ p ( z, z j )—for ﬁxeddata points z , . . . , z m and given z j —can be evaluated at any point of the space, not just indata points z i . This is possible by simply repeating the construction in (1)–(3) with replacing z i by z . There, the summation in (1) and (2) is carried out over the original data points only.Thus, (3) gives the values p ( z, z j ), and by (20) we setΞ( z ) = 1Λ m (cid:88) j =1 p ( z, z j )Ξ( z j ) , (21)where the Ξ( z j ) have already been computed in advance.Naturally, one can vectorize this procedure for a whole set { ¯ z , . . . , ¯ z M } of new data points.Then we need to compute the interpolating matrix ¯ P ∈ R M × m with ¯ P ij = p (¯ z i , z j ), and¯Ξ := (Ξ(¯ z ) , . . . , Ξ(¯ z M )) T is obtained by ¯Ξ = ¯ P Ξ.28 . Eigenmodes of the Laplacian on a disc

To be able to better understand the diﬀusion-maps embedding, we will brieﬂy consider the eigen-value problem of the Laplacian on the disc D = { ( x, y ) ∈ R : x + y ≤ } with homogeneousNeumann boundary conditions.It turns out [GN13, section 3.2] that eigenfunctions are all separable in the sense that theycan be written in polar coordinates ( r, φ ) as ψ (1) n,m ( r, φ ) = cos( nφ ) J n ( (cid:112) − λ n,m r ) ,ψ (2) n,m ( r, φ ) = sin( nφ ) J n ( (cid:112) − λ n,m r ) , where n ∈ N , m ∈ N , J n is the n -th Bessel function of ﬁrst kind, and (cid:112) − λ n,m is the m -thpositive root of their derivatives J (cid:48) n . Due to the rotational symmetry in φ , the eigenspacesfor n > ψ (1) n,m , ψ (2) n,m form a basis for them. If n = 0, there is no ψ (2)0 ,m , because the eigenspace is non-degenerate. The λ n,m are the associated eigenvalues. Forinstance: λ , = − . , λ , = − . , λ , = − . . The corresponding eigenfunctions are shown in ﬁg. 12 as surface plots. If the disc has radius R instead of one, the eigenvalues are scaled by R . (a) (b) (c) Figure 12: Eigenfunctions of the Laplace operator. (a): ψ , . (b): ψ (1)1 , . (c): ψ (1)2 , .Note that depending on the kernel function used in the diﬀusion maps method, one approxi-mates diﬀerent multiples of the Laplacian, i.e., in the appropriate sense P ( ε ) − Id ε ≈ C ∇ , where for our kernel function k ( x, y ) = exp( − ε − (cid:107) x − y (cid:107) ) we have C = , cf. [HAL07, The-orem 25]. Thus, the transformed eigenvalues ( λ − /ε (or log( λ ) /ε ) of the diﬀusion mapapproximate λ n,m . As for high-dimensional data we cannot directly compare the diﬀusionmap eigenfunctions with those of the Laplacian on a disc, an indication whether the data is analmost-isometric embedding of a disc in high dimensions is given by the following:(i) The eigenvalues themselves depend on the radius of the circle, but their ratios should stayclose to the ratios given by the λ n,m above.(ii) Another indicator for a disc-like manifold is if the embedding by diﬀerent diﬀusion-mapeigenfunctions resembles an associated embedding by the eigenfunctions of the true Lapla-cian on a disc. Some of these are shown in ﬁg. 13. Thy are to be compared with the29mbedding of the convection data in ﬁg. 3. That these three-dimensional embeddingsshow two-dimensional surface, is an indication of degeneracy (non-linear interdependence)between the eigenfunctions. This is expected, as the manifold we consider is itself merelytwo-dimensional. (a) (b) (c) Figure 13: Embeddings of the disc D into R by diﬀerent combinations of Laplacian eigen-functions. The cone-shaped (a), saddle-shaped (b), and ﬁgure-eight shaped (c) em-beddings appear for the convection-roll data set as well, indicating a disc-like datamanifold. C. Noisy rotation on a circle

Let us consider the SDE dθ ( t ) = ω dt + √ D dW ( t ) , (22)which describes a noisy uniform rotation on the circle with circumference L , that we identifywith the (periodic) interval [0 , L ). The associated backward Kolmogorov equation reads as ∂ t u = D ∂ θθ u + ω ∂ θ u for u = u ( t, θ ) . (23)The associated eigenvalues (and eigenfunctions) are those of the right-hand side, which is asecond order diﬀerential operator; called the generator . Expressing the Kolmogorov equationfor a Fourier basis with basis functions φ k ( θ ) = exp (cid:0) i πkθL (cid:1) , k ∈ Z , we directly obtain that thebasis functions are eigenfunctions at eigenvalues λ k = − π k L D + i 2 πkL ω. Thus, λ k , λ − k are complex conjugate eigenpairs, and the corresponding eigenspace can bespanned by the real functions sin (cid:0) πkθL (cid:1) and cos (cid:0) πkθL (cid:1) .The forward Kolmogorov (or Fokker–Planck) equation associated with (22) reads as ∂ t u = D ∂ θθ u − ω ∂ θ u , and has thus the very same eigenfunctions as the backward equation at complexconjugate eigenvalues, i.e., λ fwdk = ¯ λ bwdk .Thus, the noise and drift coeﬃcients show up separately in the real and imaginary parts ofthe eigenvalues, and so we can estimate them from those.30 . Azimuthal drift and diﬀusion as a function of the appliedrotation rate We have explained in section 4.1 how one can extract information of the azimuthal orientationand their time dependence from the embedded data. We have shown that the azimuthal orien-tation can be modeled with a Langevin equation that includes a constant azimuthal drift ω .This drift is zero (or small for a single time trace) for a non-rotating system, but becomesimportant under rotation.Fig. 14 shows the azimuthal diﬀusion D , the ballistic coeﬃcient s and the average drift ω for diﬀerent rotation rates (expressed dimensionless as the inverse Rossby number 1/ Ro ). Letus ﬁrst have a look at the drift in ﬁg. 14b. Note that using the diﬀusion maps embedding andcalculating the azimuthal position from it, does not preserve the sign of the azimuthal angle. Asa result we cannot distinguish between cyclonic and anti-cyclonic motion of the ﬂow structure.Thus, we assigned the sign from the classical analysis to ω . The result in ﬁg. 14b shows thatthe in this way calculated azimuthal velocity (in fact these are not ﬂow velocities but rathervelocities of the thermal structure) agree very well with the classically calculated values. | ω | increase ﬁrst with increasing 1/ Ro reach a maximum at around 1/ Ro ≈ . Ro ≈ . | ω | increases again, with anegative sign.The diﬀusion rate D and the ballistic coeﬃcient s also change as a function of 1/ Ro , as canbe seen in ﬁg. 14a. Interestingly, both values decrease initially with increasing 1/ Ro , they reacha minimum at roughly 1/ Ro ≈ . Ro . Note that a change in the heattransport was observed at 1/ Ro =0.8. For faster rotation rates, the vertical heat transport isenhanced compared to the non-rotating case.We interpret the results found here, such that for suﬃciently small rotation rates, the ﬂow issomehow stabilised, turbulent ﬂuctuations are reduced or have less inﬂuence on the large-scalecirculation and thus the diﬀusion coeﬃcient is reduced. The ﬁnding further support that at1/ Ro =0.8 a transition occurs and that the ﬂuid is in a diﬀerent state for larger 1/ Ro . While itis not clear what exactly happens at this point, there are evidences from previous studies thata large convection roll is replaced by multiple vortices and columnar structures in which coldand warm ﬂuid is transported from the boundaries into the bulk (see e.g., [WSZ +

10, SOLC11]).The diﬀusion of the structures is then no longer driven by the turbulent plume emission, butrather by the motion and interaction of the vortices.

References [AM17] H. Arbabi and I. Mezi´c. Study of dynamics in post-transient ﬂows using Koopmanmode decomposition.

Physical Review Fluids , 2(12):124402, 2017.[AS19] H. Arbabi and T. Sapsis. Data-driven modeling of strongly nonlinear chaoticsystems with non-gaussian statistics. Preprint: , 2019.[BA07] E. Brown and G. Ahlers. Large-scale circulation model of turbulent Rayleigh-B´enard convection.

Phys. Rev. Lett. , 98:134501–1–4, 2007.[BA08a] E. Brown and G. Ahlers. Azimuthal asymmetries of the large-scale circulation inturbulent Rayleigh-B´enard convection.

Phys. Fluids , 20:105105–1–15, 2008.31 (a)(b) D [ - r ad / s ] s [ r ad / s ] sD-0.03-0.02-0.0100.010.02 0 0.5 1 1.5 2 2.5 3 3.5 (a)(b) w [ r ad / s ] Figure 14: Dynamics of the LSC as a function of the dimensionless rotation rate 1 / Ro . (a) Thediﬀusion coeﬃcient (green diamonds, left y-axis) and the ballistic coeﬃcient (redsquares, right y-axis). (b) Drift ω as a function 1/ Ro calculated via diﬀusion mapsas explained above (blue bullets) and via the classical way (red x, see [WA11a]).One should note that since we cannot determine the direction of the drift, i.e., thesign in front of ω , we assigned the sign from the classical analysis to it. Thus ifthe classical analysis shows cyclonic motion, we add assumed that ω was positive,while we assumed negative ω when the classical analysis shows anti-cyclonic mo-tion. The black vertical line marks 1 / Ro = 0 .

8, where there onset of heat transportenhancement due to rotation was observed (see [WA11c]).32BA08b] E. Brown and G. Ahlers. A model of diﬀusion in a potential well for the dynamics ofthe large-scale circulation in turbulent Rayleigh-B´enard convection.

Phys. Fluids ,20:075101–1–16, 2008.[BA09] E. Brown and G. Ahlers. The origin of oscillations of the large-scale circulation ofturbulent Rayleigh-B´enard convection.

J. Fluid Mech. , 638:383–400, 2009.[B´en00] H. B´enard. Les tourbillons cellularies dans une nappe liquide.

Rev. Gen. Sci. PureAppl. , 11:1261–1309, 1900.[BGH15] T. Berry, D. Giannakis, and J. Harlim. Nonparametric forecasting of low-dimensional dynamical systems.

Physical Review E , 91(3):032915, 2015.[BH16] T. Berry and J. Harlim. Variable bandwidth diﬀusion kernels.

Applied and Com-putational Harmonic Analysis , 40(1):68–96, 2016.[BKK +

18] A. Bittracher, P. Koltai, S. Klus, R. Banisch, M. Dellnitz, and C. Sch¨utte. Tran-sition manifolds of complex metastable systems.

Journal of Nonlinear Science ,28(2):471–512, Apr 2018.[BN03] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction anddata representation.

Neural computation , 15(6):1373–1396, 2003.[Bou03] J. Boussinesq.

Theorie analytique de la chaleur, Vol. 2 . Gauthier-Villars, Paris,1903.[BWWA13] O. Barkan, J. Weill, L. Wolf, and H. Aronowitz. Fast high dimensional vector mul-tiplication face recognition. In

The IEEE International Conference on ComputerVision (ICCV) , December 2013.[CL06a] R. R. Coifman and S. Lafon. Diﬀusion maps.

Applied and Computational HarmonicAnalysis , 21(1):5 – 30, 2006. Special Issue: Diﬀusion Maps and Wavelets.[CL06b] R. R. Coifman and S. Lafon. Diﬀusion maps.

Applied and Computational HarmonicAnalysis , 21(1):5–30, 2006.[CL06c] R. R. Coifman and S. Lafon. Geometric harmonics: a novel tool for multiscale out-of-sample extension of empirical functions.

Applied and Computational HarmonicAnalysis , 21(1):31–52, 2006.[CSSS08] R. R. Coifman, Y. Shkolnisky, F. J. Sigworth, and A. Singer. Graph Laplaciantomography from unknown random projections.

IEEE Transactions on Image Pro-cessing , 17(10):1891–1899, 2008.[Dav82] E. B. Davies. Metastable states of symmetric Markov semigroups ii.

J. LondonMath. Soc. , s2-26(3):541–556, 1982.[Dav09] P. Davidson.

Turbulence - An Introduction for Scientists and Engineers . OxfordUniversity Press, 2009.[DFJ01] M. Dellnitz, G. Froyland, and O. Junge. The algorithms behind GAIO–Set orientednumerical methods for dynamical systems. In

Ergodic theory, analysis, and eﬃcientsimulation of dynamical systems , pages 145–174. Springer, 2001.33DG03] D. L. Donoho and C. Grimes. Hessian eigenmaps: Locally linear embedding tech-niques for high-dimensional data.

Proceedings of the National Academy of Sciences ,100(10):5591–5596, 2003.[DJ99] M. Dellnitz and O. Junge. On the approximation of complicated dynamical behav-ior.

SIAM J. Numer. Anal. , 36:491–515, 1999.[DvMZ16] M. Dellnitz, M. H. von Molo, and A. Ziessler. On the computation of attractorsfor delay diﬀerential equations.

Journal of Computational Dynamics , 3(1):93–112,2016.[DW04] P. Deuﬂhard and M. Weber. Robust Perron cluster analysis in conformation dy-namics.

Linear Algebra Appl. , 398:161–184, 2004. Special Issue on Matrices andMathematical Biology.[FBA08] D. Funfschilling, E. Brown, and G. Ahlers. Torsional oscillations of the large-scalecirculation in turbulent Rayleigh-B´enard convection.

J. Fluid Mech. , 607:119–139,2008.[FFL10] Z. Farbman, R. Fattal, and D. Lischinski. Diﬀusion maps for edge-aware imageediting.

ACM Trans. Graph. , 29(6):145:1–145:10, December 2010.[FJK13] G. Froyland, O. Junge, and P. Koltai. Estimating long term behavior of ﬂowswithout trajectory integration: the inﬁnitesimal generator approach.

SIAM J.Numer. Anal. , 51(1):223–247, 2013.[Fri18] J. Friedman. An algorithm for ﬁnding best matches in logarithmic expectedtime. Technical report, SLAC National Accelerator Lab., Menlo Park, CA (UnitedStates), 2018.[Fro98] G. Froyland. Approximating physical invariant measures of mixing dynamical sys-tems.

Nonlinear Analysis, Theory, Methods, & Applications , 32:831–860, 1998.[Gar04] C. Gardiner.

Handbook of Stochastic Methods . Springer, 2004.[GKD19] R. Gerlach, P. Koltai, and M. Dellnitz. Revealing the intrinsic geometry of ﬁnitedimensional invariant sets of inﬁnite dimensional dynamical systems. arXiv preprintarXiv:1902.08824 , 2019.[GKKS18] D. Giannakis, A. Kolchinskaya, D. Krasnov, and J. Schumacher. Koopman anal-ysis of the long-term evolution in a turbulent convection cell.

Journal of FluidMechanics , 847:735–767, 2018.[GN13] D. S. Grebenkov and B.-T. Nguyen. Geometrical structure of Laplacian eigenfunc-tions.

SIAM Review , 55(4):601–667, 2013.[GS98] B. Gaveau and L. S. Schulman. Theory of nonequilibrium ﬁrst-order phase transi-tions for stochastic dynamics.

J. Math. Phys. , 39(3):1517–1533, 1998.[GS06] B. Gaveau and L. S. Schulman. Multiple phases in stochastic dynamics: Geometryand probabilities.

Phys. Rev. E , 73(3), 2006.34HAL07] M. Hein, J.-Y. Audibert, and U. v. Luxburg. Graph Laplacians and their conver-gence on random neighborhood graphs.

Journal of Machine Learning Research ,8(Jun):1325–1368, 2007.[Hsu87] C. S. Hsu.

Cell-to-cell mapping. A Method of Global Analysis for Nonlinear Sys-tems , volume 64 of

Applied Mathematical Sciences . Springer-Verlag, New York,1987.[Jol86] I. Jolliﬀe.

Principal component analysis . Springer-Verlag New York, 1986.[KKS16] S. Klus, P. Koltai, and C. Sch¨utte. On the numerical approximation of the Perron–Frobenius and Koopman operator.

J. Comput. Dyn. , 3(1):51–79, 2016.[KNC +

14] E. Kaiser, B. R. Noack, L. Cordier, A. Spohn, M. Segond, M. Abel, G. Daviller,J. ¨Osth, S. Krajnovi´c, and R. K. Niven. Cluster-based reduced-order modelling ofa mixing layer.

Journal of Fluid Mechanics , 754:365–414, 2014.[KNK +

18] S. Klus, F. N¨uske, P. Koltai, H. Wu, I. Kevrekidis, C. Sch¨utte, and F. No´e. Data-driven model reduction and transfer operator approximation.

Journal of NonlinearScience , 28(3):985–1010, 2018.[Kru64] J. B. Kruskal. Multidimensional scaling by optimizing goodness of ﬁt to a nonmetrichypothesis.

Psychometrika , 29(1):1–27, 1964.[KSN +

09] E. King, S. Stellmach, J. Noir, U. Hansen, and J. Aurnou. Boundary layer controlof rotating convection systems.

Nature , 457:301–304, 2009.[Laf04] S. S. Lafon.

Diﬀusion Maps and Geometric Harmonics . PhD thesis, Yale University,2004.[LE09] Y. Liu and R. Ecke. Heat transport measurements in turbulent rotating Rayleigh-B´enard convection.

Phys. Rev. E , 80:036314, 2009.[LM94] A. Lasota and M. C. Mackey.

Chaos, Fractals, and Noise . Springer-Verl., 2. edition,1994.[Obe79] A. Oberbeck. ¨uber die W¨armeleitung der Fl¨ussigkeiten bei Ber¨ucksichtigung derStr¨omungen infolge von Temperaturdiﬀerenzen.

Ann. Phys. Chem. , 243:271–292,1879.[Pea01] K. Pearson. Liii. on lines and planes of closest ﬁt to systems of points in space.

TheLondon, Edinburgh, and Dublin Philosophical Magazine and Journal of Science ,2(11):559–572, 1901.[PSS18] A. Pandey, J. D. Scheel, and J. Schumacher. Turbulent superstructures in rayleigh-b´enard convection.

Nature Communications , 9(1):2118, 2018.[PWB +

11] K. Petschel, M. Wilczek, M. Breuer, R. Friedrich, and U. Hansen. Statisticalanalysis of global wind dynamics in vigorous rayleigh-b´enard convection.

Phys.Rev. E , 84:026309, Aug 2011.[PWS +

11] J. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. Chodera, C. Sch¨utte,and F. No´e. Markov models of molecular kinetics: Generation and validation.

J.Chem. Phys. , 134:174105, 2011. 35Ray16] L. Rayleigh. On convection currents in a horizontal layer of ﬂuid, when the highertemperature is on the under side.

Philosophical Magazin , 32:529, 1916.[RMB +

09] C. W. Rowley, I. Mezi´c, S. Bagheri, P. Schlatter, and D. S. Henningson. Spectralanalysis of nonlinear ﬂows.

Journal of Fluid Mechanics , 641:115–127, 2009.[Rob05] J. C. Robinson. A topological delay embedding theorem for inﬁnite-dimensionaldynamical systems.

Nonlinearity , 18(5):2135, 2005.[Row05] C. W. Rowley. Model reduction for ﬂuids, using balanced proper orthogonal de-composition.

International Journal of Bifurcation and Chaos , 15(03):997–1013,2005.[RS00] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linearembedding.

Science , 290(5500):2323–2326, 2000.[SBHT14] H. Song, E. Brown, R. Hawkins, and P. Tong. Dynamics of large-scale circulation ofturbulent thermal convection in a horizontal cylinder.

Journal of Fluid Mechanics ,740:136–167, 2014.[Shl14] J. Shlens. A tutorial on principal component analysis.

CoRR , abs/1404.1100, 2014.[SMH05] T. R. Smith, J. Moehlis, and P. Holmes. Low-dimensional modelling of turbulenceusing the proper orthogonal decomposition: a tutorial.

Nonlinear Dynamics , 41(1-3):275–307, 2005.[SNL +

11] C. Sch¨utte, F. No´e, J. Lu, M. Sarich, and E. Vanden-Eijnden. Markov state modelsbased on milestoning.

The Journal of chemical physics , 134(20):05B609, 2011.[SNS +

10] K. Sugiyama, R. Ni, R. J. A. M. Stevens, T. S. Chan, S.-Q. Zhou, H.-D. Xi,C. Sun, S. Grossmann, K.-Q. Xia, and D. Lohse. Flow reversals in thermallydriven turbulence.

Phys. Rev. Lett. , 105(3):034503, Jul 2010.[SOLC11] R. J. A. M. Stevens, J. Overkamp, D. Lohse, and H. J. H. Clercx. Eﬀect ofaspect ratio on vortex distribution and heat transfer in rotating rayleigh-b´enardconvection.

Phys. Rev. E , 84:056313, Nov 2011.[SS08] P. Schmid and J. Sesterhenn. Dynamic Mode Decomposition of numerical andexperimental data. In .American Physical Society, 2008.[Ste02] G. W. Stewart. A Krylov–Schur algorithm for large eigenproblems.

SIAM Journalon Matrix Analysis and Applications , 23(3):601–614, 2002.[SV60] E. A. Spiegel and G. Veronis. On the Boussinesq approximation for a compressibleﬂuid.

Astrophys. J. , 131:442–447, 1960.[Tak81] F. Takens. Detecting strange attractors in turbulence. In

Dynamical systems andturbulence, Warwick 1980 , pages 366–381. Springer, 1981.[TdSL00] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global geometric frameworkfor nonlinear dimensionality reduction.

Science , 290:2319, 2000.36TGDW19] E. H. Thiede, D. Giannakis, A. R. Dinner, and J. Weare. Galerkin approximationof dynamical quantities using trajectory data.

The Journal of Chemical Physics ,150(24):244111, 2019.[Ula60] S. M. Ulam.

A Collection of Mathematical Problems . Interscience Publisher NY,1960.[VLBB08] U. Von Luxburg, M. Belkin, and O. Bousquet. Consistency of spectral clustering.

The Annals of Statistics , pages 555–586, 2008.[WA11a] S. Weiss and G. Ahlers. The large-scale ﬂow structure in turbulent rotatingRayleigh–B´enard convection.

J. Fluid Mech. , 688:461–492, 2011.[WA11b] S. Weiss and G. Ahlers. Turbulent Rayleigh-B´enard convection in a cylindricalcontainer with aspect ratio Γ = 0 .

50 and Prandtl number Pr=4.38.

J. Fluid Mech. ,676:5–40, 2011.[WA11c] S. Weiss and G. Ahlers. Heat transport by turbulent rotating Rayleigh-B´enardconvection and its dependence on the aspect ratio.

J. Fluid Mech. , 684:407–426,2011.[WKR15] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley. A data–driven approximationof the Koopman operator: Extending dynamic mode decomposition.

Journal ofNonlinear Science , 25(6):1307–1346, 2015.[WSZ +

10] S. Weiss, R. J. A. M. Stevens, J.-Q. Zhong, H. J. H. Clercx, D. Lohse, and G. Ahlers.Finite-size eﬀects lead to supercritical bifurcations in turbulent rotating rayleigh-b´enard convection.

Phys. Rev. Lett. , 105(22):224501, Nov 2010.[WWA16] S. Weiss, P. Wei, and G. Ahlers. Heat-transport enhancement in rotating turbulentRayleigh-B´enard convection.

Phys. Rev. E , 93:043102, Apr 2016.[XX08] H.-D. Xi and K.-Q. Xia. Flow mode transition in turbulent thermal convection.

Phys. Fluids , 20:055104–1–15, 2008.[ZA10] J.-Q. Zhong and G. Ahlers. Heat transport and the large-scale circulation in ro-tating turbulent Rayleigh-B´enard convection.

J. Fluid Mech. , 665:300–333, 2010.[ZDG18] A. Ziessler, M. Dellnitz, and R. Gerlach. The numerical computation of unstablemanifolds for inﬁnite dimensional dynamical systems by embedding techniques. arXiv preprint arXiv:1808.08787 , 2018.[ZXZ +

09] Q. Zhou, H.-D. Xi, S.-Q. Zhou, C. Sun, and K.-Q. Xia. Oscillations of the large-scalecirculation in turbulent Rayleigh -B´enard convection: The sloshing mode and itsrelationship with the torsional mode.

Journal of Fluid Mechanics , 630(-1):367–390,2009.[ZZ04] Z. Zhang and H. Zha. Principal manifolds and nonlinear dimensionality reductionvia tangent space alignment.