Decomposing spectral and phasic differences in non-linear features between datasets

Abstract

When employing non-linear methods to characterise complex systems, it is important to determine to what extent they are capturing genuine non-linear phenomena that could not be assessed by simpler spectral methods. Specifically, we are concerned with the problem of quantifying spectral and phasic effects on an observed difference in a non-linear feature between two systems (or two states of the same system). Here we derive, from a sequence of null models, a decomposition of the difference in an observable into spectral, phasic, and spectrum-phase interaction components. Our approach makes no assumptions about the structure of the data and adds nuance to a wide range of time series analyses.



Pedro A.M. Mediano,1,∗ Fernando E. Rosas,2,3,4,∗ Adam B. Barrett,5,† and Daniel Bor1,†

1 Department of Psychology, University of Cambridge, Cambridge CB2 3EB
2 Center for Psychedelic Research, Department of Medicine, Imperial College London, London SW7 2DD
3 Data Science Institute, Imperial College London, London SW7 2AZ
4 Center for Complexity Science, Imperial College London, London SW7 2AZ
5 Sackler Center for Consciousness Science, Department of Informatics, University of Sussex, Brighton BN1 9RH


Non-linear methods are useful for characterising differences between various states of a complex system, and have found applications in a wide range of scientific domains. For example, Lempel-Ziv (LZ) complexity [1] and multiscale entropy [2] have been successful in discriminating between conscious and unconscious brain activity [3], and have yielded insights into physiological pathologies [4] and price dynamics [5]. However, more refined conclusions could be obtained if there were a principled way to assess how much of the differences in such measures are due to genuine non-linear effects, and how much is explainable by changes in the power spectrum.

A popular approach to study the effect of spectral and phasic contributions on an observable is via surrogate data methods [6], which examine whether its value is representative of a null distribution obtained from surrogate data. Such surrogate methods are regarded as a basic constituent of the data analyst's toolkit [7], and have been extended to a range of scenarios including multivariate time series [8], non-stationary data [9], and many others. However, surrogate methods are typically designed to be applied on a single dataset, and it is not straightforward to use them to disentangle spectral and phasic contributions on differences in an observable between two datasets (e.g. how much of the difference in LZ complexity between two neurological conditions simply reflects the known spectral changes between them [10]). The crux of why this is challenging, and why naive applications of typical surrogate methods fail, is that the difference between two null models is not necessarily a good null model of the difference (see Supp. Mat. for a detailed example).

To deal with this issue, here we present a novel decomposition of the difference in an observable between two time series datasets into spectral, phasic, and spectrum-phase interaction components.
The decomposition makes no assumptions about the structure of the data, and is applicable to a broad range of scenarios of interest. We illustrate our method by analysing LZ complexity on neuroimaging data, where our decomposition identifies phasic and spectrum-phase interaction components that take the opposite sign to the predominantly spectral overall effect, and which would not have been detectable by previously existing methods.

The decomposition.

Let us consider a scientist who is interested in an observed difference in some quantity $f$ between data recorded in two different conditions, denoted by $X$ and $Y$. The data consist of time series recordings, and a set of time series segments is obtained from each condition. Each segment could correspond to data recorded from, e.g., different participants in an experiment, or different time periods from the same participant. The whole dataset from the first condition is denoted as $x^N$, where $N$ is the population size of these data, and the $N$ time series segments within $x^N$ as $x_1, x_2, \dots, x_N$. Similarly, for the second condition one has $y^M = \{y_1, \dots, y_M\}$. Our goal is to decompose the difference in $f$ between $X$ and $Y$ into spectral, phasic and spectrum-phase interaction components, i.e. to decompose

$$\Delta(x^N, y^M) := \bar{f}(x^N) - \bar{f}(y^M), \tag{1}$$

where $\bar{f}(x^N) = \frac{1}{N}\sum_{j=1}^{N} f(x_j)$ and $\bar{f}(y^M) = \frac{1}{M}\sum_{k=1}^{M} f(y_k)$ are the empirical ensemble averages of the function in question, $f$. This is achieved by a series of comparisons between expected $f$ values on the data and those on a set of progressively more constrained null models for the stochastic processes underlying the data.

arXiv preprint [stat.ME]

Formally, we consider $x_1, \dots, x_N$ to be independent and identically distributed (i.i.d.) realisations of a stochastic process sampled under condition $X$, and $y_1, \dots, y_M$ to be i.i.d. realisations of another stochastic process sampled under condition $Y$, with $x_j, y_k \in \mathbb{R}^T$, where $T$ is the length of each time series. The decomposition utilises the discrete Fourier transform, which is denoted by $\hat{x} = \mathcal{F}\{x\} \in \mathbb{C}^T$, given a time series $x$. The amplitudes of the Fourier components are denoted by $A(\hat{x}) = \{A_1(\hat{x}), \dots, A_T(\hat{x})\} \in \mathbb{R}^T$, and their phases by $\phi(\hat{x}) = \{\phi_1(\hat{x}), \dots, \phi_T(\hat{x})\} \in [0, 2\pi]^T$.
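The ensemble averages and the total difference of Eq. (1) can be sketched as follows. This is a minimal illustration rather than the authors' code: the feature $f$ is assumed to be the LZ76 complexity of a mean-binarised series (the feature used in the Example section), computed with a simple, slow exhaustive-history parser, and the helper names (`lz76`, `lz_feature`, `ensemble_mean`, `total_difference`) and the synthetic data are hypothetical.

```python
import numpy as np

def lz76(bits):
    """Number of phrases in the LZ76 exhaustive-history parsing of a binary sequence."""
    s = "".join("1" if b else "0" for b in bits)
    n, i, c = len(s), 0, 0
    while i < n:
        k = 1
        # Grow the current phrase while it can still be copied from earlier
        # text (the copied source may overlap the phrase itself).
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        c += 1
        i += k
    return c

def lz_feature(x):
    """Assumed f(x): LZ76 complexity of x binarised around its mean."""
    x = np.asarray(x, dtype=float)
    return lz76(x > x.mean())

def ensemble_mean(f, segments):
    """Empirical ensemble average f_bar over the rows of `segments`."""
    return float(np.mean([f(seg) for seg in segments]))

def total_difference(f, x_segs, y_segs):
    """Delta(x^N, y^M) of Eq. (1)."""
    return ensemble_mean(f, x_segs) - ensemble_mean(f, y_segs)

# Synthetic stand-ins for two conditions: white noise versus random walks.
rng = np.random.default_rng(0)
x_segs = rng.standard_normal((20, 256))
y_segs = np.cumsum(rng.standard_normal((15, 256)), axis=1)
delta = total_difference(lz_feature, x_segs, y_segs)
print(delta)
```

Any other scalar feature can be passed in place of `lz_feature`; the decomposition itself places no restriction on $f$.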
The data for $X$ can thus be represented in the frequency domain as i.i.d. phase-amplitude tuples $\big(A(\hat{x}_j), \phi(\hat{x}_j)\big)$, following a distribution $p_X(A, \phi)$ induced by $X$, and similarly for the $y_k$.

We begin by considering a null model $\mathcal{M}_i$ in which amplitudes and phases have no interaction, i.e. are statistically independent. Accordingly, we construct new time series $x^{(w)}_j$ that satisfy this null model by combining the spectrum of each $x_j$ with the phases from some other randomly chosen time series from within condition $X$ (and similarly for the $y_k$). That is, we construct $x^{(w)}_j = \mathcal{F}^{-1}\{A(\hat{x}_j)\, e^{i\phi(\hat{x}_{\alpha_j})}\}$ and $y^{(w)}_k = \mathcal{F}^{-1}\{A(\hat{y}_k)\, e^{i\phi(\hat{y}_{\beta_k})}\}$, where $\alpha_j$ and $\beta_k$ are distributed uniformly over $\{1, \dots, N\}$ and $\{1, \dots, M\}$, respectively. We then consider the mean value of $f$ on these phase-shuffled data, given by $\nu_i(x^N) := \frac{1}{N}\sum_{j=1}^{N} f\big(x^{(w)}_j\big)$. The spectrum-phase interaction contribution to the value of $f$ in condition $X$ is then calculated as

$$\Delta_i(x^N) := \bar{f}(x^N) - \mathbb{E}\{\nu_i(x^N) \mid x^N\}, \tag{2}$$

where the conditional expectation averages the effect of the random integers $\alpha_j$ on $\nu_i$. Similarly, $\Delta_i(y^M)$ can be calculated for $Y$. When estimating $\Delta_i(x^N)$ and $\Delta_i(y^M)$ in practice, one will approximate the expectation of $\nu_i$ by averaging multiple realisations of it.

The quantity $\Delta_i(x^N)$ measures the extent to which the expected value of $f$ would be affected if one were to break any dependence that exists between the amplitudes and phases of the $\hat{x}_j$. Equivalently, $\Delta_i(x^N)$ accounts for the deviation in the mean value of $f$ in condition $X$ from that which would be expected if the null model $\mathcal{M}_i$ holds.
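The within-condition phase shuffling and the estimate of Eq. (2) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses NumPy's one-sided real FFT (`rfft`/`irfft`) so that the reconstructed series stay real, approximates the conditional expectation by a Monte Carlo average over `n_real` shufflings, and all function names are hypothetical.

```python
import numpy as np

def amp_phase(segments):
    """Amplitudes A and phases phi of the one-sided DFT of each row."""
    spec = np.fft.rfft(segments, axis=1)
    # np.angle returns phases in (-pi, pi], equivalent to [0, 2*pi) modulo 2*pi.
    return np.abs(spec), np.angle(spec)

def nu_i(f, segments, rng):
    """One realisation of nu_i: each amplitude spectrum is paired with the
    phases of a uniformly chosen donor segment from the same condition."""
    n_seg, T = segments.shape
    amp, phase = amp_phase(segments)
    donors = rng.integers(0, n_seg, size=n_seg)  # the random indices alpha_j
    shuffled = np.fft.irfft(amp * np.exp(1j * phase[donors]), n=T, axis=1)
    return float(np.mean([f(s) for s in shuffled]))

def delta_i(f, segments, n_real=200, seed=0):
    """Monte Carlo estimate of Eq. (2): f_bar minus the average of nu_i."""
    rng = np.random.default_rng(seed)
    f_bar = float(np.mean([f(s) for s in segments]))
    nu = float(np.mean([nu_i(f, segments, rng) for _ in range(n_real)]))
    return f_bar - nu

# Sanity check on synthetic data: the variance depends only on the amplitudes
# (Parseval's theorem), so its interaction term should vanish up to rounding.
rng = np.random.default_rng(1)
x_segs = rng.standard_normal((30, 128))
print(abs(delta_i(np.var, x_segs)) < 1e-9)
```

A purely spectral feature such as the variance is a convenient check because shuffling phases cannot change it; a non-zero $\Delta_i$ can only arise for features that are sensitive to the joint amplitude-phase structure.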
For large $N$, the law of large numbers guarantees that

$$\Delta_i(x^N) \to \mathbb{E}_{p_X(A, \phi)}\{f(A, \phi)\} - \mathbb{E}_{p_X(A)\,p_X(\phi)}\{f(A, \phi)\},$$

and hence that in the absence of any dependency between the phases and spectra, i.e. when $p_X(A, \phi) = p_X(A)\,p_X(\phi)$, $\lim_{N \to \infty} \Delta_i(x^N) = 0$.

Next, we focus on the phasic effect on $f$, i.e. the effect of differences between the phase distributions of $X$ and $Y$. For this, we consider a second null model $\mathcal{M}_\phi$ under which phases are not only independent from amplitude but also follow the same distribution in each of the conditions $X$ and $Y$. We construct phase-shuffled time series $x^{(a)}_j, y^{(a)}_k$ that satisfy this null model by replacing the phases of each time series with those from another randomly chosen time series from the whole set of data $\{x^N, y^M\}$. That is, we construct $x^{(a)}_j = \mathcal{F}^{-1}\{A(\hat{x}_j)\, e^{i\phi(\hat{w}_j)}\}$ and $y^{(a)}_k = \mathcal{F}^{-1}\{A(\hat{y}_k)\, e^{i\phi(\hat{z}_k)}\}$, where $\hat{w}_j, \hat{z}_k$ are the discrete Fourier transforms of independently randomly chosen time series that are each drawn from $X$ with probability 1/2, and from $Y$ with probability 1/2. Then, we consider the mean value of $f$ on these phase-shuffled data, $\nu_\phi(x^N \mid y^M) := \frac{1}{N}\sum_{j=1}^{N} f\big(x^{(a)}_j\big)$, and introduce

$$\Delta_\phi(x^N) := \mathbb{E}\big\{\nu_i(x^N) - \nu_\phi(x^N \mid y^M) \,\big|\, x^N, y^M\big\}.$$

We define $\Delta_\phi(y^M)$ analogously. Again, when estimating these quantities in practice, one can approximate the expectations of $\nu_i$ and $\nu_\phi$, for each condition, by averaging multiple realisations of them.

The quantity $\Delta_\phi(x^N)$ measures the expected effect on the mean value of $f$ in condition $X$ if $\mathcal{M}_\phi$ holds, i.e. the effect of changing the probability distribution of the phases from $p_X(\phi)$ to the mixture $\big(p_X(\phi) + p_Y(\phi)\big)/2$. If $p_X(\phi) = p_Y(\phi)$, then the law of large numbers guarantees that

$$\lim_{N \to \infty} \Delta_\phi(x^N) = \lim_{M \to \infty} \Delta_\phi(y^M) = 0. \tag{3}$$

Finally, we consider the effect of spectral differences between the conditions on the difference in $f$. For this, we consider the deviation of the phase-shuffled data above from a further constrained null model $\mathcal{M}_A$, in which both amplitudes and phases are statistically independent and distributed identically in $X$ and $Y$. Specifically, we consider $\nu_A(x^N, y^M) := \nu_\phi(x^N \mid y^M) - \nu_\phi(y^M \mid x^N)$. Since $x^{(a)}_j$ and $y^{(a)}_k$ have, by definition, the same phase statistics, $\nu_\phi(x^N \mid y^M)$ and $\nu_\phi(y^M \mid x^N)$ will, on average, differ only because of differences between the distributions of the spectrum of $X$ and $Y$. Therefore, we introduce

$$\Delta_A(x^N, y^M) := \mathbb{E}\big\{\nu_A(x^N, y^M) \,\big|\, x^N, y^M\big\}$$

as a metric of the spectral effect. If the distribution of the spectrum is the same for both conditions, then $\lim_{N, M \to \infty} \Delta_A(x^N, y^M) = 0$ by the law of large numbers. Again, when estimating this quantity in practice, one will approximate the expectation of $\nu_A(x^N, y^M)$ by obtaining multiple realisations.

With these quantities at hand, via a telescopic sum we can obtain a decomposition of the total difference in $f$ between the two conditions into spectral, phasic, and spectrum-phase interaction terms. We have that the difference in mean $f$ values between the conditions decomposes into

$$\Delta(x^N, y^M) = \underbrace{\Delta_A(x^N, y^M)}_{\text{Spectrum}} + \underbrace{\Delta_\phi(x^N) - \Delta_\phi(y^M)}_{\text{Phase}} + \underbrace{\Delta_i(x^N) - \Delta_i(y^M)}_{\text{Interaction}}. \tag{4}$$

Of these, the first term is the difference in $f$ that persists on data modified so the phases have the same distribution across conditions, and so corresponds to the difference attributable to spectral changes only.
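The three terms of Eq. (4) can be estimated together, as in the sketch below. It is a hedged illustration, not the authors' code: expectations are replaced by Monte Carlo averages over `n_real` shufflings, both conditions are assumed to share the same segment length $T$, the feature `crossing_rate` is a stand-in non-linear observable chosen only for speed, and every function name is hypothetical.

```python
import numpy as np

def crossing_rate(x):
    """Illustrative non-linear feature: fraction of steps at which x crosses its mean."""
    above = x > x.mean()
    return float(np.mean(above[1:] != above[:-1]))

def _amp_phase(segs):
    spec = np.fft.rfft(segs, axis=1)
    return np.abs(spec), np.angle(spec)

def _mean_f(f, amp, phase, T):
    """Mean of f over series rebuilt from the given amplitudes and phases."""
    series = np.fft.irfft(amp * np.exp(1j * phase), n=T, axis=1)
    return float(np.mean([f(s) for s in series]))

def decompose(f, x_segs, y_segs, n_real=100, seed=0):
    """Monte Carlo estimate of the spectrum, phase and interaction terms of Eq. (4)."""
    rng = np.random.default_rng(seed)
    (N, T), M = x_segs.shape, y_segs.shape[0]
    ax, px = _amp_phase(x_segs)
    ay, py = _amp_phase(y_segs)
    f_x = float(np.mean([f(s) for s in x_segs]))
    f_y = float(np.mean([f(s) for s in y_segs]))

    def pooled(n):
        # Donor phases for M_phi: condition X or Y with probability 1/2 each,
        # then a uniformly chosen segment within that condition.
        from_x = rng.random(n) < 0.5
        return np.where(from_x[:, None], px[rng.integers(0, N, n)], py[rng.integers(0, M, n)])

    sums = np.zeros(4)  # nu_i(x), nu_i(y), nu_phi(x|y), nu_phi(y|x)
    for _ in range(n_real):
        sums += np.array([
            _mean_f(f, ax, px[rng.integers(0, N, N)], T),  # within-condition shuffle
            _mean_f(f, ay, py[rng.integers(0, M, M)], T),
            _mean_f(f, ax, pooled(N), T),                  # pooled-phase shuffle
            _mean_f(f, ay, pooled(M), T),
        ])
    nu_ix, nu_iy, nu_px, nu_py = sums / n_real
    return {
        "total": f_x - f_y,
        "spectrum": nu_px - nu_py,
        "phase": (nu_ix - nu_px) - (nu_iy - nu_py),
        "interaction": (f_x - nu_ix) - (f_y - nu_iy),
    }

rng = np.random.default_rng(2)
x_segs = rng.standard_normal((20, 128))
y_segs = np.cumsum(rng.standard_normal((20, 128)), axis=1)
print(decompose(crossing_rate, x_segs, y_segs, n_real=50))
```

Because the same realisations of $\nu_i$ and $\nu_\phi$ are reused across terms, the three estimated components sum to the total difference exactly (up to floating-point rounding), mirroring the telescopic sum in the text.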
Similarly, the second term of Eq. (4) measures the difference in $f$ attributable to phase changes, by comparing the data with the observed phase distributions against data with identically distributed phases. Finally, the third term compares the observed data with phase-shuffled time series to account for changes due to the phase-spectrum interaction in both conditions.

Accordingly, each of the $\Delta$'s can be considered to be comparing expected $f$ values on the data against $f$ values on a set of increasingly restrictive null models, $\mathcal{M}_i \to \mathcal{M}_\phi \to \mathcal{M}_A$. We note that this decomposition is invariant to the order in which it is constructed, i.e. it makes no difference whether phasic effects are considered before spectral contributions (as described here) or vice versa (proof in Supp. Mat.).

FIG. 1. Decomposition of the difference in Lempel-Ziv (LZ) complexity between task and rest conditions in the CAMCAN MEG dataset. Panels: (a) Total difference; (b) Spectral effect; (c) Phase effect and Phase-amplitude interaction, with rows Task and Rest. The LZ complexity computed on sensor-level data during task minus that during rest indicates a pronounced reduction in complexity in frontal areas during task (a). This effect is mostly driven by spectral changes, $\Delta_A(x^N, y^M)$ (b). Nevertheless, the decomposition (4) also reveals substantial differences in phase and phase-amplitude interaction contributions between conditions (c). These have a different spatial profile and show the opposite trend to the spectral component (top row shows $\Delta_\phi(x^N)$ and $\Delta_i(x^N)$ and bottom row shows $\Delta_\phi(y^M)$ and $\Delta_i(y^M)$). Due to the large number of participants, each quantity is significantly non-zero for most channels (i.e. t-test across participants gives $p \ll$ …).

Example.

As an illustration, we present an analysis of the entropy rate of binarised magnetoencephalographic (MEG) signals, as measured with LZ complexity. We use the Cambridge Centre for Ageing and Neuroscience (CAMCAN) dataset [11], which includes a large-scale MEG dataset of participants undergoing several cognitive tasks, and study the differences in Lempel-Ziv complexity [1] between participants in wakeful rest, and participants performing a simple cognitive stop/no-go task [11]. This measure (or minor variations of it) has been widely used in the neuroscience literature [12–15], showing a remarkable performance in discriminating between different states of consciousness, for instance normal wakefulness versus sleep [3].

In this application, we consider data for 131 participants in both "task" and "rest" conditions. The data from each participant were divided up into 100 non-overlapping windows of length $T = 1024$ (which corresponds to approximately 4 s given the sampling rate of 250 Hz). To compute the LZ complexity, time series were binarised, and then the original (1976) version of the LZ complexity described in Ref. [1] was computed. Binarisation was carried out based on the mean value of the time series in question, so the binarised time series contained ones where the raw value was greater than the mean, and zeros where the raw value was less than the mean.

For each of the 204 MEG channels of each participant, the decomposition in Eq. (4) was applied considering $x^N$ to be the windowed data during task and $y^M$ to be the windowed data during rest, and using 500 realisations of the random variables involved (i.e. 500 random phase shufflings). Thus, a set of $\Delta$'s was obtained for each channel, for each participant. Then, to assess whether differences were significant at the group level, 1-sample t-tests were carried out across participants, for each of the $\Delta$'s and for each channel. The mean value of each of the $\Delta$'s at each MEG channel is shown in Fig. 1.

Our decomposition reveals information about the relation between task and rest that is not captured by other statistical tools. First, by studying the direct difference between LZ complexity in task versus rest, our results show a reduction of complexity in frontal regions, and an increase in the rest of the brain during the task (Fig. 1a). Our decomposition shows that the vast majority of this difference (approximately 7.5 out of 8 units) can be explained by spectral effects (Fig. 1b). Interestingly, contrasting effects are found in the phase and interaction components. In particular, during task there is a strong and heavily localised phase-amplitude interaction component, which becomes much weaker and spatially homogeneous during rest (Fig. 1c). Both of these show the opposite trend from the direct difference, with an increase in frontal regions and reductions elsewhere during task. The neurobiological implications of these findings will be developed in a separate publication.

Conclusion.

In this paper we have tackled the problem of determining to what extent a measured difference in some quantity between two time series datasets can be attributed to differences between their power spectra. For this, we introduced a decomposition that uses a sequence of null models to disentangle spectral, phasic, and phase-amplitude interaction effects. Our decomposition requires no assumptions on the data (beyond that distinct samples within the data are independent), and is easy to compute. As a proof of concept, we provided an example of the decomposition yielding novel results on some neuroimaging data, more nuanced than what was previously possible with a standard analysis of LZ complexity.

Since this decomposition can be applied to any observed difference between two datasets, it promises to be a valuable tool for practitioners in multiple scientific disciplines. Moreover, it will help to deepen our understanding of the behaviour of non-linear properties on datasets describing complex systems.

The authors thank Lionel Barnett and Anil Seth for valuable discussions, and two anonymous referees for comments on earlier versions of this manuscript. We also thank Aleksi Ikkala and Darren Price for vital background work, and Yike Guo for supporting this research. P.M. and D.B. are funded by the Wellcome Trust (grant no. 210920/Z/18/Z). F.R. is supported by the Ad Astra Chandaria foundation. D.B. conceptualised the work. A.B.B. guided the writing of the paper.

∗ P.M. and F.R. contributed equally to this work.
† A.B.B. and D.B. are joint senior authors.

[1] A. Lempel and J. Ziv, IEEE Transactions on Information Theory, 75 (1976).
[2] M. Costa, A. L. Goldberger, and C.-K. Peng, Phys. Rev. Lett., 068102 (2002).
[3] A. G. Casali, O. Gosseries, M. Rosanova, M. Boly, S. Sarasso, K. R. Casali, S. Casarotto, M.-A. Bruno, S. Laureys, G. Tononi, and M. Massimini, Science Translational Medicine, 198ra105 (2013).
[4] M. Costa and J. Healey, in Computers in Cardiology, 2003 (IEEE, 2003) pp. 705–708.
[5] E. Martina, E. Rodriguez, R. Escarela-Perez, and J. Alvarez-Ramirez, Energy Economics, 936 (2011).
[6] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer, Physica D, 77 (1992).
[7] G. Lancaster, D. Iatsenko, A. Pidde, V. Ticcinelli, and A. Stefanovska, Physics Reports, 1 (2018).
[8] D. Prichard and J. Theiler, Physical Review Letters, 951 (1994).
[9] J. H. Lucio, R. Valdés, and L. R. Rodríguez, Physical Review E, 056202 (2012).
[10] N. D. Schiff, T. Nauvel, and J. Victor, Current Opinion in Neurobiology, 7-14 (2012).
[11] M. A. Shafto, L. K. Tyler, M. Dixon, J. R. Taylor, J. B. Rowe, R. Cusack, A. J. Calder, W. D. Marslen-Wilson, J. Duncan, et al., BMC Neurology, 204 (2014).
[12] X.-S. Zhang, R. J. Roy, and E. W. Jensen, IEEE Transactions on Biomedical Engineering, 1424 (2001).
[13] M. M. Schartner, A. Pigorini, S. A. Gibbs, G. Arnulfo, S. Sarasso, L. Barnett, L. Nobili, M. Massimini, et al., Neuroscience of Consciousness, niw022 (2017).
[14] D. Dolan, H. J. Jensen, P. Mediano, M. Molina-Solana, H. Rajpal, F. Rosas, and J. A. Sloboda, Frontiers in Psychology, 1341 (2018).
[15] In these references, all channels are concatenated before the computation. The version used here is what is commonly referred to as "LZs" in the neuroscience literature.

Supplementary Material to "Decomposing spectral and phasic differences in non-linear features between datasets"

NAIVE APPLICATIONS OF CONVENTIONAL SURROGATE METHODS FAIL AT TWO-SAMPLE COMPARISONS

Let us start by considering the problem of testing if an observable of interest $f(x) \in \mathbb{R}$ depends only on the power spectrum of a single dataset composed of the time series $x = (x_1, \dots, x_T) \in \mathbb{R}^T$. The discrete Fourier transform of $x$ is denoted as $\hat{x} = \mathcal{F}\{x\} \in \mathbb{C}^T$, with amplitude $A(\hat{x}) = |\hat{x}| \in \mathbb{R}^T$ and phase $\phi(\hat{x}) = \arctan\big(\mathrm{Im}(\hat{x}) / \mathrm{Re}(\hat{x})\big) \in [0, 2\pi]^T$. A simple procedure, known as phase randomisation [1], is to compare the value of $f(x)$ against a null distribution given by the random variable $f(x_{\mathrm{pr}})$, where $x_{\mathrm{pr}}$ is surrogate data obtained by taking the Fourier transform of $x$, adding an independent random phase to each component, and then taking the inverse Fourier transform. Hence, $x_{\mathrm{pr}} = \mathcal{F}^{-1}\{A(\hat{x})\, e^{i\phi}\}$, where $\phi$ is a random vector of uniformly distributed phases, denoted by $\phi \sim \mathcal{R}$. (Technically, half of the entries of $\phi$ are uniformly distributed over $[0, 2\pi]$ while the other half are fixed by requiring the Fourier coefficients to come in complex-conjugate pairs, so that $x_{\mathrm{pr}}$ is a real vector [2].) Accordingly, the null hypothesis that there is no genuine non-linear structure is rejected if the quantity $f(x) - \mathbb{E}\{f(x_{\mathrm{pr}})\}$ is significantly different from zero [3].

Let us now consider a slightly more complex scenario with two time series $x$ and $y$, and consider whether the difference $\delta := f(x) - f(y)$ can be attributed solely to differences in their spectra. A naive approach to address this problem, considered here for illustration purposes, would be to compare $\delta$ against the null distribution that corresponds to the random variable

$$\tilde{\delta} = f(x_{\mathrm{pr}}) - f(y_{\mathrm{pr}}), \tag{1}$$

where $x_{\mathrm{pr}} = \mathcal{F}^{-1}\{A(\hat{x})\, e^{i\phi_1}\}$ and $y_{\mathrm{pr}} = \mathcal{F}^{-1}\{A(\hat{y})\, e^{i\phi_2}\}$, with $\phi_1, \phi_2 \sim \mathcal{R}$ being statistically independent of $\hat{x}$, $\hat{y}$, and of each other. One would then reject the null hypothesis if $\delta - \mathbb{E}\{\tilde{\delta}\}$ is significantly different from zero.
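Phase randomisation and the surrogate-corrected quantity $f(x) - \mathbb{E}\{f(x_{\mathrm{pr}})\}$ can be sketched as follows. This is a minimal illustration under stated assumptions: it uses NumPy's one-sided real FFT so that conjugate symmetry is enforced automatically, leaves the zero-frequency and Nyquist bins untouched so the surrogate stays real with the same mean, and approximates the expectation with `n_surr` surrogates; the function names are hypothetical.

```python
import numpy as np

def phase_randomise(x, rng):
    """Surrogate with the amplitude spectrum of x and uniformly random phases."""
    x = np.asarray(x, dtype=float)
    T = x.size
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.size)
    new_spec = np.abs(spec) * np.exp(1j * phases)
    new_spec[0] = spec[0]        # keep the zero-frequency (mean) term unchanged
    if T % 2 == 0:
        new_spec[-1] = spec[-1]  # the Nyquist bin must stay real for even T
    return np.fft.irfft(new_spec, n=T)

def surrogate_gap(f, x, n_surr=200, seed=0):
    """Monte Carlo estimate of f(x) - E{f(x_pr)}."""
    rng = np.random.default_rng(seed)
    null = [f(phase_randomise(x, rng)) for _ in range(n_surr)]
    return f(x) - float(np.mean(null))

def mean_abs_increment(x):
    """Illustrative feature chosen only for the demo."""
    return float(np.mean(np.abs(np.diff(x))))

rng = np.random.default_rng(4)
x = rng.standard_normal(256)
y = np.cumsum(rng.standard_normal(256))
# The naive two-sample statistic of Eq. (1), delta - E{delta_tilde}, equals the
# difference between the two surrogate-corrected values.
naive_stat = surrogate_gap(mean_abs_increment, x) - surrogate_gap(mean_abs_increment, y, seed=1)
print(naive_stat)
```

The last lines compute the naive two-sample statistic only to make the construction concrete; the surrounding text explains why this statistic is a poor basis for a test.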
The naive test is equivalent to testing the difference between the "corrected" values $f(x) - \mathbb{E}\{f(x_{\mathrm{pr}})\}$ and $f(y) - \mathbb{E}\{f(y_{\mathrm{pr}})\}$.

Unfortunately, this test generally fails because the difference between two null models is not necessarily a good null model of the difference. More specifically, the differences $\tilde{\delta}$ that are seen when injecting randomised phases may not be representative of the differences that are seen when injecting phases from other (plausibly more realistic) distributions.

FIG. 1. Naive application of typical surrogate methods to two-sample comparisons results in false positives. a) Two spectra, shown in blue (solid) and red (dashed), were used to generate time series via inverse Fourier transform. b) To generate phases we used a simple model with a roughness parameter $c$, that interpolates between constant phase ($c = 0$) and fully random phases ($c = 1$). The Lempel-Ziv complexity [5] of signals generated with both spectra and the same phase is shown in blue (bottom) and red (top). c) Naive applications of typical surrogate methods, like the one described in Eq. (1), incorrectly reject the null hypothesis when applied to two time series with identical phases but different spectra. (Axis labels: PSD, LZ, Type I error, and the roughness parameter $c$.)

To illustrate these ideas, let us consider two given power spectra, shown in Fig. 1a, and generate random phases via a simple model equipped with a roughness parameter $c \in [0, 1]$ that interpolates between a constant phase ($c = 0$) and fully random phases ($c = 1$) [4]. With this, we can build $(x_c, y_c)$ pairs by combining the two spectra in Fig. 1a and the same phase $\phi_c$ generated with a given $c \in [0, 1]$, so that $x_c$ and $y_c$ only differ in their spectra. However, as shown in Fig. 1c, the test in Eq. (1) results in a large number of false positives.

To understand this result, note that the value of $\mathbb{E}\{f(x_c) - f(y_c)\}$ (i.e. the expected value of $\delta$ under the null hypothesis) roughly corresponds to the gap between the point clouds in Fig. 1b. However, the test in Eq. (1) is related to the above quantity when $c = 1$. The crux is that the two spectra "max out" at different values of $f$, making the quantity $\mathbb{E}\{\tilde{\delta}\}$ non-zero and, in turn, introducing a bias in the test for $c < 1$.

THE DECOMPOSITION IS INVARIANT TO ORDERING OF THE NULL MODELS

Here we prove that our proposed decomposition does not depend on the ordering in which phasic and spectral effects are considered. In particular, instead of the studied sequence of null models given by $\mathcal{M}_i \to \mathcal{M}_\phi \to \mathcal{M}_A$, one could also consider $\mathcal{M}_i \to \mathcal{M}_A \to \mathcal{M}_\phi$, where the effect of spectrum is considered before the effect of phase. In the rest of this section, we prove that this second sequence of null models gives the same decomposition.

For this purpose, let us define

$$x^{\phi}_j = \mathcal{F}^{-1}\{A(\hat{u}_j)\, e^{i\phi(\hat{x}_j)}\} \quad \text{and} \quad y^{\phi}_k = \mathcal{F}^{-1}\{A(\hat{v}_k)\, e^{i\phi(\hat{y}_k)}\},$$

where $\hat{u}_j, \hat{v}_k$ are the discrete Fourier transforms of independently randomly chosen time series that are each drawn from $X$ with probability 1/2, and from $Y$ with probability 1/2. With this, let us introduce $\nu'_A(x^N \mid y^M) := \frac{1}{N}\sum_{j=1}^{N} f(x^{\phi}_j)$, and define

$$\Delta'_A(x^N) := \mathbb{E}\big\{\nu_i(x^N) - \nu'_A(x^N \mid y^M) \,\big|\, x^N, y^M\big\}$$

and $\Delta'_A(y^M)$ analogously. Hence, the difference $\Delta'_A(x^N, y^M) := \Delta'_A(x^N) - \Delta'_A(y^M)$ is the spectral component of the difference when assessed on the second stage of the alternative decomposition given by $\mathcal{M}_i \to \mathcal{M}_A \to \mathcal{M}_\phi$.

Proposition 1. The decomposition given by

$$\mathcal{D}_1: \mathcal{M}_i \to \mathcal{M}_\phi \to \mathcal{M}_A \tag{2}$$

is equivalent to the decomposition given by

$$\mathcal{D}_2: \mathcal{M}_i \to \mathcal{M}_A \to \mathcal{M}_\phi. \tag{3}$$

Proof. Because both decompositions start with $\mathcal{M}_i$, it is clear that the term corresponding to spectral-phasic interaction is the same in both. Therefore, it is enough to show that either the spectral or the phasic contribution is the same, as proving one would imply the other. Our strategy is to prove that the spectral component assessed in the third step of $\mathcal{D}_1$, given by $\Delta_A(x^N, y^M)$, is equal to the spectral component estimated in the second stage of $\mathcal{D}_2$, given by $\Delta'_A(x^N, y^M)$.

To prove this, let us introduce

$$x^{y}_j = \mathcal{F}^{-1}\{A(\hat{x}_j)\, e^{i\phi(\hat{y}_\beta)}\} \quad \text{and} \quad y^{x}_k = \mathcal{F}^{-1}\{A(\hat{y}_k)\, e^{i\phi(\hat{x}_\alpha)}\},$$

where $\alpha, \beta$ are integers sampled at random from $\{1, \dots, N\}$ and $\{1, \dots, M\}$, respectively. Put simply, $x^{y}_j$ and $y^{x}_k$ have the spectrum of one process combined with a randomly sampled phase from the other. Furthermore, let us use the notation $\nu_c(x^N) = \frac{1}{N}\sum_{j=1}^{N} f(x^{y}_j)$ and $\nu_c(y^M) = \frac{1}{M}\sum_{k=1}^{M} f(y^{x}_k)$. With this and the definition of $\nu'_A$, one can show that

$$\mathbb{E}\big\{\nu'_A(x^N \mid y^M) \,\big|\, x^N, y^M\big\} = \mathbb{E}\Big\{\frac{\nu_i(x^N) + \nu_c(y^M)}{2} \,\Big|\, x^N, y^M\Big\},$$

and therefore

$$\Delta'_A(x^N) = \mathbb{E}\Big\{\frac{\nu_i(x^N) - \nu_c(y^M)}{2} \,\Big|\, x^N, y^M\Big\}.$$

An analogous calculation gives that

$$\Delta'_A(y^M) = \mathbb{E}\Big\{\frac{\nu_i(y^M) - \nu_c(x^N)}{2} \,\Big|\, x^N, y^M\Big\}.$$

Finally, combining all this one finds that

$$\begin{aligned}
\Delta'_A(x^N, y^M) &= \Delta'_A(x^N) - \Delta'_A(y^M) \\
&= \mathbb{E}\Big\{\frac{\nu_i(x^N) + \nu_c(x^N)}{2} \,\Big|\, x^N, y^M\Big\} - \mathbb{E}\Big\{\frac{\nu_i(y^M) + \nu_c(y^M)}{2} \,\Big|\, x^N, y^M\Big\} \\
&= \mathbb{E}\big\{\nu_\phi(x^N \mid y^M) - \nu_\phi(y^M \mid x^N) \,\big|\, x^N, y^M\big\} \\
&= \Delta_A(x^N, y^M),
\end{aligned}$$

which concludes our proof.

[1] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer, Physica D, 77 (1992).
[2] R. N. Bracewell, The Fourier Transform and its Applications, Vol. 31999 (McGraw-Hill New York, 1986).
[3] It is well known that this test corresponds to comparing $x$ against time series generated by a linear model with the same autocorrelation as $x$.
[4] The details of the phase-generating model are irrelevant; it only matters that there is a method for generating phases that are unlike samples from $\mathcal{R}$ when $c < 1$.