Note on AR(1)-characterisation of stationary processes and model fitting
Marko Voutilainen∗, Lauri Viitasaari† and Pauliina Ilmonen∗

October 1, 2018
Abstract
It was recently proved that any strictly stationary stochastic process can be viewed as an autoregressive process of order one with coloured noise. Furthermore, it was proved that, using this characterisation, one can define closed form estimators for the model parameter based on autocovariance estimators for several different lags. However, this estimation procedure may fail in some special cases. In this article we provide a detailed analysis of these special cases. In particular, we prove that these cases correspond to degenerate processes.
AMS 2010 Mathematics Subject Classification: (Primary) 60G10, (Secondary) 62M10
Keywords: stationary processes, covariance functions
∗ Department of Mathematics and Systems Analysis, Aalto University School of Science, Finland
† Department of Mathematics and Statistics, University of Helsinki, Finland

1 Introduction

Stationary processes are an important tool in many practical applications of time series analysis, and the topic is extensively studied in the literature. Traditionally, stationary processes are modelled by autoregressive moving average processes or linear processes (see the monographs [2, 4] for details). One of the simplest examples of an autoregressive moving average process is an autoregressive process of order one, that is, a process $(X_t)_{t\in\mathbb{Z}}$ defined by
\[
X_t = \phi X_{t-1} + \varepsilon_t, \qquad t \in \mathbb{Z}, \tag{1}
\]
where $\phi \in (-1,1)$ and $(\varepsilon_t)_{t\in\mathbb{Z}}$ is a sequence of independent and identically distributed square integrable random variables. The continuous time analogue of (1) is the Ornstein-Uhlenbeck process, which can be defined as the stationary solution of the Langevin-type stochastic differential equation
\[
dU_t = -\phi U_t\,dt + dW_t, \tag{2}
\]
where $\phi > 0$ and $(W_t)_{t\in\mathbb{R}}$ is a two-sided Brownian motion. Such equations also have applications in mathematical physics.

Statistical inference for the AR(1) process and the Ornstein-Uhlenbeck process is well-established in the literature. Furthermore, a generalised continuous time Langevin equation, in which the Brownian motion $W$ in (2) is replaced with a more general driving force $G$, has recently been a subject of active study. In particular, the so-called fractional Ornstein-Uhlenbeck processes introduced in [3] have been studied extensively. For parameter estimation in such models, we mention the recent monograph [5] dedicated to the subject, and the references therein.

When the model becomes more complicated, the number of parameters increases and estimation may become a challenging task. For example, it may happen that standard maximum likelihood estimators cannot be expressed in closed form [2].
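As a concrete point of reference for the models above, the classical AR(1) equation (1) is easy to simulate. The following minimal sketch (our own illustration; the parameter value $\phi = 0.6$, standard normal innovations and the run length are assumed choices, not from the paper) checks that the sample variance of a long stationary run is close to the theoretical value $1/(1-\phi^2)$.

```python
import random

# Minimal simulation sketch of the AR(1) equation (1): X_t = phi * X_{t-1} + eps_t
# with i.i.d. standard normal innovations. The value phi = 0.6 and the run length
# are illustrative choices. Started from the stationary distribution, the sample
# variance should be close to 1 / (1 - phi^2) = 1.5625.

random.seed(0)
phi, n = 0.6, 200_000

x = random.gauss(0.0, (1.0 / (1.0 - phi ** 2)) ** 0.5)  # stationary initial value
xs = []
for _ in range(n):
    x = phi * x + random.gauss(0.0, 1.0)
    xs.append(x)

mean = sum(xs) / n
var = sum((v - mean) ** 2 for v in xs) / n
print(round(var, 2))  # close to 1.5625
```
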
Even worse, it may happen that classical estimators such as maximum likelihood or least squares estimators are biased and not consistent (cf. [1] for a discussion of a generalised ARCH model with fractional Brownian motion driven liquidity). One way to tackle such problems is to consider a one parameter model and to replace the white noise in (1) with some other stationary noise. It was proved in [7] that each discrete time strictly stationary process can be characterised by
\[
X_t = \phi X_{t-1} + Z_t, \tag{3}
\]
where $\phi \in (0,1)$ and $(Z_t)_{t\in\mathbb{Z}}$ is stationary. This representation can be viewed as a discrete time analogue of the fact that the Langevin-type equation characterises strictly stationary processes in continuous time [6].

The authors in [7] applied the characterisation (3) to model fitting and parameter estimation. The presented estimation procedure is straightforward to apply, with the exception of certain special cases. The purpose of this paper is to provide a comprehensive analysis of these special cases. In particular, we show that such cases do not provide very useful models. This highlights the wide applicability of the characterisation (3) and the corresponding estimation procedure.

The rest of the paper is organised as follows. In Section 2 we briefly discuss the motivating estimation procedure of [7]. We also present and discuss our main results together with some illustrative figures. All the proofs and technical lemmas are postponed to Section 3.

2 Estimation procedure and main results

Let $X = (X_t)_{t\in\mathbb{Z}}$ be a stationary process. It was shown in [7] that the equation
\[
X_t = \phi X_{t-1} + Z_t, \tag{4}
\]
where $\phi \in (0,1)$ and $Z_t$ is another stationary process, characterises all discrete time (strictly) stationary processes. Throughout this paper we suppose that $X$ and $Z$ are square integrable processes with autocovariance functions $\gamma(\cdot)$ and $r(\cdot)$, respectively. Using Equation (4), one can derive Yule-Walker type equations for the parameter $\phi$, which can be solved in explicit form.
Namely, for any $m \in \mathbb{Z}$ such that $\gamma(m) \neq 0$ we have
\[
\phi = \frac{\gamma(m+1) + \gamma(m-1) \pm \sqrt{\left(\gamma(m+1)+\gamma(m-1)\right)^2 - 4\gamma(m)\left(\gamma(m)-r(m)\right)}}{2\gamma(m)}. \tag{5}
\]
The estimation of the parameter $\phi$ is obvious from (5), provided that one can determine which sign, plus or minus, one should choose. In practice, this can be done by choosing different lags $m$ for which to estimate the covariance $\gamma(m)$. One can then determine the correct value $\phi$ by comparing the two signs in (5) across the different lags $m$ (we refer to [7, p. 387] for a detailed discussion). However, this approach fails, i.e. one cannot find suitably chosen lags leading to the correct choice of the sign and only one value $\phi$, if, for every $m \in \mathbb{Z}$ such that $\gamma(m) = 0$ we also have $r(m) = 0$, and for every $m \in \mathbb{Z}$ such that $\gamma(m) \neq 0$ the ratio
\[
a_m = \frac{r(m)}{\gamma(m)} = a \tag{6}
\]
for some constant $a \in (0,1)$. The latter is equivalent [7, p. 387] to the fact that
\[
\frac{\gamma(m+1)+\gamma(m-1)}{\gamma(m)} = b \tag{7}
\]
for some constant $b$ with $\phi < b < \phi + \phi^{-1}$. This leads to
\[
\gamma(m+1) = b\gamma(m) - \gamma(m-1). \tag{8}
\]
Moreover, if $\gamma(m) = r(m) = 0$ for some $m$, it is straightforward to verify that (8) holds in this case as well. Thus (8) holds for all $m \in \mathbb{Z}$. Since covariance functions are necessarily symmetric, we obtain an "initial" condition $\gamma(1) = \frac{b}{2}\gamma(0)$. Thus (8) admits a unique symmetric solution.

From $\gamma(1) = \frac{b}{2}\gamma(0)$ it is clear that (8) does not define a covariance function for $b > 2$. Furthermore, since $\phi > 0$, it suffices to study the regime $b \in [0,2]$ (we include the trivial case $b = 0$). For $b = 2$ this corresponds to the case $X_t = X_0$ for all $t \in \mathbb{Z}$, which is hardly interesting. Similarly, the case $b = 0$ leads to a process $(\ldots, X_0, X_1, -X_0, -X_1, X_0, X_1, \ldots)$, which again does not provide a practical model. On the other hand, it is not clear whether some other value $b \in (0,2)$ in Equation (8) can lead to a non-trivial model in which the estimation procedure explained above cannot be applied.
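The mechanics of (5) and (6) can be illustrated with a small sketch. It is our own toy code, not from [7]; we assume a classical AR(1) with $\phi = 0.6$ and i.i.d. unit-variance noise, so that $\gamma(m) = \phi^{|m|}/(1-\phi^2)$ and $r(m) = 0$ for $m \neq 0$. In this special case both roots of (5) happen to be lag-independent, namely $\phi$ and $1/\phi$, and the constraint $\phi \in (0,1)$ picks the right one; for general stationary noise the spurious root varies with the lag, which is what the sign-comparison across lags exploits.

```python
import math

# Toy illustration of the closed-form candidates (5), under an assumed classical
# AR(1) with phi = 0.6 and i.i.d. unit-variance noise, so that
# gamma(m) = phi^|m| / (1 - phi^2) and r(m) = 0 for m != 0.

phi = 0.6
gamma = lambda m: phi ** abs(m) / (1.0 - phi ** 2)
r = lambda m: 1.0 if m == 0 else 0.0

# Sanity check of the underlying Yule-Walker type relation behind (5):
# r(m) = (1 + phi^2) gamma(m) - phi (gamma(m+1) + gamma(m-1)).
for m in range(0, 6):
    assert abs(r(m) - ((1 + phi ** 2) * gamma(m)
                       - phi * (gamma(m + 1) + gamma(m - 1)))) < 1e-9

def phi_candidates(m):
    """Both roots of gamma(m) p^2 - (gamma(m+1) + gamma(m-1)) p + gamma(m) - r(m) = 0."""
    s = gamma(m + 1) + gamma(m - 1)
    d = math.sqrt(s ** 2 - 4.0 * gamma(m) * (gamma(m) - r(m)))
    return ((s + d) / (2.0 * gamma(m)), (s - d) / (2.0 * gamma(m)))

for m in (1, 2, 3):
    print([round(c, 6) for c in phi_candidates(m)])  # the root 0.6 appears at every lag
```
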
It turns out that, for any $b \in [0,2]$, Equation (8) indeed defines a covariance function. On the other hand, the resulting covariance function, denoted by $\gamma_b$, leads to a model that is not very interesting.

Theorem 2.1.
Let $b \in (0,2)$ and let $\gamma_b$ be the (unique) symmetric function satisfying (8). Then:

1. Let $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$, where $k$ and $l$ are strictly positive integers such that $\frac{k}{l} \in (0,1]$. Then $\gamma_b(m)$ is periodic.

2. Let $b = 2\sin\left(\frac{r\pi}{2}\right)$, where $r \in (0,1)\setminus\mathbb{Q}$. Then for any $M \geq 0$, the set $\{\gamma_b(M+m) : m \geq 0\}$ is dense in $[-\gamma(0), \gamma(0)]$.

3. For any $b \in [0,2]$, $\gamma_b(\cdot)$ is a covariance function.

In many applications of stationary processes, it is assumed that the covariance function $\gamma(\cdot)$ vanishes at infinity, or that $\gamma(\cdot)$ is periodic. Note that the latter case corresponds simply to the analysis of finite-dimensional random vectors with identically distributed components. Indeed, $\gamma(m) = \gamma(0)$ implies $X_{n+m} = X_n$ almost surely, so periodicity of $\gamma(\cdot)$ with period $N$ implies that there exist at most $N$ random variables as the source of randomness. By items (2) and (3) of Theorem 2.1, we observe that, for suitable values of $b$, (8) can be used to construct covariance functions that are neither periodic nor vanishing at infinity. On the other hand, in this case there are arbitrarily large lags $m$ such that $\gamma_b(m)$ is arbitrarily close to $\gamma_b(0)$. Consequently, it is expected that different estimation procedures fail; indeed, even the standard covariance estimators are not consistent. A consequence of Theorem 2.1 is that only a little structure in the noise $Z$ is needed in order to apply the estimation procedure for the parameter $\phi$ introduced in [7], provided that one has consistent estimators for the covariances of $X$. The following is a precise mathematical formulation of this observation.

Theorem 2.2.
Let $X$ be given by (4) for some $\phi \in (0,1)$ and noise $Z$. Assume that there exist $\epsilon > 0$ and $M \in \mathbb{N}$ such that $r(m) \leq r(0)(1-\epsilon)$ or $r(m) \geq -r(0)(1-\epsilon)$ for all $m \geq M$. Then the covariance function $\gamma$ of $X$ does not satisfy (8) for any $b \in [0,2]$.

We end this section with visual illustrations of the covariance functions defined by (8). In these examples we have set $\gamma_b(0) = 1$. Figures 1 and 2 illustrate the case of item (1) of Theorem 2.1. Note that in Figure 1a we have $b = 2\sin\left(\frac{\pi}{6}\right) = 1$. Figure 2 demonstrates how $k$ can affect the shape of the covariance function. Finally, Figure 3b illustrates the case of item (2) of Theorem 2.1.

3 Proofs

Throughout this section, without loss of generality, we assume $\gamma_b(0) = 1$. We also drop the sub-index and simply write $\gamma$. The first result gives an explicit formula for the solution to (8).

Proposition 3.1.
The unique symmetric solution to (8) is given by
\[
\gamma(m) =
\begin{cases}
(-1)^{m/2}\cos\left(m\arcsin\left(\frac{b}{2}\right)\right), & m \text{ even}, \\[4pt]
(-1)^{(m-1)/2}\sin\left(m\arcsin\left(\frac{b}{2}\right)\right), & m \text{ odd}.
\end{cases} \tag{9}
\]
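Before the formal proof, the closed form (9) is easy to sanity-check numerically against the defining recursion (8). The following is our own small sketch (function names are ours):

```python
import math

# Check numerically that the closed form (9), together with gamma(0) = 1 and
# gamma(1) = b/2, satisfies the recursion (8): gamma(m+1) = b*gamma(m) - gamma(m-1).

def gamma_closed(m: int, b: float) -> float:
    """Candidate solution (9); gamma is symmetric, so we may use |m|."""
    m = abs(m)
    A = math.asin(b / 2.0)
    if m % 2 == 0:
        return (-1) ** (m // 2) * math.cos(m * A)
    return (-1) ** ((m - 1) // 2) * math.sin(m * A)

for b in (0.5, 1.0, 1.5, 1.9):
    assert abs(gamma_closed(0, b) - 1.0) < 1e-12
    assert abs(gamma_closed(1, b) - b / 2.0) < 1e-12
    for m in range(0, 60):
        lhs = gamma_closed(m + 1, b)
        rhs = b * gamma_closed(m, b) - gamma_closed(m - 1, b)
        assert abs(lhs - rhs) < 1e-9, (b, m)
print("closed form (9) satisfies recursion (8)")
```
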
Figure 1: Examples of covariance functions corresponding to $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$; (a) $k = 1$ and $l = 3$, (b) $k = 5$ and $l = 7$. Axes: lag vs. autocovariance.

Proof.
Clearly, $\gamma(m)$ given by (9) is symmetric, and thus it suffices to consider $m \geq 0$. Moreover, $\gamma(0) = 1$ and $\gamma(1) = \frac{b}{2}$. We use the short notation $A = \arcsin\left(\frac{b}{2}\right)$, so that $\sin A = \frac{b}{2}$. Assume first $m + 2 \equiv 2 \pmod 4$. Then
\[
\begin{aligned}
\gamma(m+2) &= -\cos((m+2)A) \\
&= -\cos(mA)\cos(2A) + \sin(mA)\sin(2A) \\
&= -\cos(mA)\left(1 - 2\sin^2 A\right) + 2\sin(mA)\sin A\cos A \\
&= -\cos(mA)(1 - b\sin A) + b\sin(mA)\cos A \\
&= b\left(\cos(mA)\sin A + \sin(mA)\cos A\right) - \cos(mA) \\
&= b\sin((m+1)A) - \cos(mA) \\
&= b\gamma(m+1) - \gamma(m).
\end{aligned}
\]
Similarly, for $m + 2 \equiv 3 \pmod 4$ we observe
\[
\begin{aligned}
\gamma(m+2) &= -\sin((m+2)A) \\
&= -\sin(mA)\cos(2A) - \sin(2A)\cos(mA) \\
&= -\sin(mA)\left(1 - 2\sin^2 A\right) - 2\sin A\cos A\cos(mA) \\
&= -b\left(\cos A\cos(mA) - \sin(mA)\sin A\right) - \sin(mA) \\
&= -b\cos((m+1)A) - \sin(mA) \\
&= b\gamma(m+1) - \gamma(m).
\end{aligned}
\]
Treating the cases $m + 2 \equiv 0 \pmod 4$ and $m + 2 \equiv 1 \pmod 4$ similarly, we deduce that (9) satisfies (8).

Figure 2: Examples of covariance functions corresponding to $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$; (a) $k = 1$ and $l = 3371$, (b) $k = 3367$ and $l = 3371$. Axes: lag vs. autocovariance.

Remark 3.2.
Using (8) directly, one can also express $\gamma(m)$ as an explicit polynomial in $b$. Indeed, the recursion $\gamma(m+1) = b\gamma(m) - \gamma(m-1)$ with $\gamma(0) = 1$ and $\gamma(1) = \frac{b}{2}$ is solved by the Chebyshev polynomials of the first kind, $\gamma(m) = T_m\left(\frac{b}{2}\right)$, so that for $m \geq 1$
\[
\gamma(m) = \frac{m}{2}\sum_{n=0}^{\lfloor m/2 \rfloor} (-1)^n\,\frac{(m-n-1)!}{n!\,(m-2n)!}\, b^{m-2n}. \tag{10}
\]
Such formulas are finite polynomial expansions, in the variable $b$, of the functions presented in (9), and they could also have been deduced using some well-known trigonometric identities.

Figure 3: Examples of covariance functions corresponding to $b = 2\sin\left(\frac{r\pi}{2}\right)$; (a) $b = 0.\ldots$, (b) $b = 1.\ldots$ Axes: lag vs. autocovariance.

Before proving our main theorems we need several technical lemmas.
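Items (1) and (3) of Theorem 2.1 can also be probed numerically before going through the proofs. The sketch below (our own illustration, not from the paper; it assumes numpy is available) takes $k = 1$ and $l = 3$, so $b = 2\sin(\pi/6) = 1$: it checks that the solution of (8) is $4l$-periodic and that the $4l \times 4l$ Toeplitz matrix built from it is positive semidefinite of rank 2, with both non-zero eigenvalues equal to $2l$.

```python
import math
import numpy as np

# Numerical probe of Theorem 2.1, items (1) and (3), via the recursion (8),
# with the illustrative choice k = 1, l = 3, i.e. b = 2*sin((1/3)*(pi/2)) = 1.

def gamma_seq(b: float, n: int) -> list:
    """First n values of the symmetric solution of (8), gamma(0)=1, gamma(1)=b/2."""
    g = [1.0, b / 2.0]
    while len(g) < n:
        g.append(b * g[-1] - g[-2])
    return g

k, l = 1, 3
b = 2.0 * math.sin((k / l) * (math.pi / 2.0))  # = 2*sin(pi/6) = 1
g = gamma_seq(b, 10 * l)

# Item (1): the sequence is periodic with period 4*l = 12.
assert all(abs(g[m + 4 * l] - g[m]) < 1e-9 for m in range(len(g) - 4 * l))

# Item (3): the 4l x 4l Toeplitz matrix of gamma is a valid covariance matrix,
# i.e. symmetric positive semidefinite; here it has rank 2, with both non-zero
# eigenvalues equal to 2*l.
C = np.array([[g[abs(i - j)] for j in range(4 * l)] for i in range(4 * l)])
ev = np.linalg.eigvalsh(C)          # eigenvalues in ascending order
assert ev[0] > -1e-8                # positive semidefinite (up to rounding)
assert abs(ev[-1] - 2 * l) < 1e-8 and abs(ev[-2] - 2 * l) < 1e-8
assert abs(ev[-3]) < 1e-8           # the remaining eigenvalues vanish: rank 2
print("Theorem 2.1 checks passed")
```
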
Definition 3.3.
We denote by $\mathcal{Q}$ the subset of rationals defined by
\[
\mathcal{Q} := \left\{ \tfrac{k}{l} : k, l \in \mathbb{N},\ \tfrac{k}{l} \in (0,1),\ k - l \equiv 1 \pmod 2 \right\}.
\]

Remark 3.4.
The modulo condition above means simply that either $k$ is even and $l$ is odd, or vice versa.

Lemma 3.5.
Let $x = \frac{k}{l}\pi$, where $\frac{k}{l} \in \mathcal{Q}$. Then
\[
\sum_{j=1}^{2l-1} \cos(jx)(-1)^j = -1.
\]

Proof.
We write
\[
\sum_{j=1}^{2l-1} \cos(jx)(-1)^j = \cos(lx)(-1)^l + \sum_{j=1}^{l-1} \cos(jx)(-1)^j + \sum_{j=l+1}^{2l-1} \cos(jx)(-1)^j.
\]
The substitution $t = j - l$ gives
\[
\sum_{j=l+1}^{2l-1} \cos(jx)(-1)^j = \sum_{t=1}^{l-1} \cos((t+l)x)(-1)^{t+l} = \sum_{t=1}^{l-1} \cos(tx + k\pi)(-1)^{t+l}.
\]
Since $\cos(tx + k\pi) = (-1)^k\cos(tx)$ and $k - l$ is odd, we have $(-1)^k(-1)^{t+l} = -(-1)^t$, so the last sum equals $-\sum_{t=1}^{l-1}\cos(tx)(-1)^t$ and cancels the middle sum. Consequently,
\[
\sum_{j=1}^{2l-1} \cos(jx)(-1)^j = \cos(lx)(-1)^l = \cos(k\pi)(-1)^l = (-1)^{k+l} = -1,
\]
where the last equality again follows from the fact that $k - l$ is odd.

Lemma 3.6.
Let $\gamma(\cdot)$ be given by (9) with $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$ for some $\frac{k}{l} \in \mathcal{Q}$. Then the non-zero eigenvalues of the $4l \times 4l$ matrix
\[
C :=
\begin{pmatrix}
\gamma(0) & \gamma(1) & \gamma(2) & \cdots & \gamma(4l-1) \\
\gamma(1) & \gamma(0) & \gamma(1) & \cdots & \gamma(4l-2) \\
\gamma(2) & \gamma(1) & \gamma(0) & \cdots & \gamma(4l-3) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\gamma(4l-2) & \gamma(4l-3) & \gamma(4l-4) & \cdots & \gamma(1) \\
\gamma(4l-1) & \gamma(4l-2) & \gamma(4l-3) & \cdots & \gamma(0)
\end{pmatrix} \tag{12}
\]
are either $2l$ with multiplicity two or $4l$ with multiplicity one.

Proof. Let $c_i$ denote the $i$th column of $C$. Then, by the defining equation (8), $c_i = bc_{i-1} - c_{i-2}$ for any $i \geq 3$. Consequently, there exist at most two linearly independent columns. Thus $\operatorname{rank}(C) \leq 2$, which in turn implies that there exist at most two non-zero eigenvalues $\lambda_1$ and $\lambda_2$. In order to compute $\lambda_1$ and $\lambda_2$, we recall the following identities:
\[
\operatorname{tr}(C) = \lambda_1 + \lambda_2 = 4l, \tag{13}
\]
\[
\operatorname{tr}(C^2) = \lambda_1^2 + \lambda_2^2 = \|C\|_F^2, \tag{14}
\]
where $\|\cdot\|_F$ is the Frobenius norm. If $\operatorname{rank}(C) = 1$, then $\lambda_2 = 0$ and $\lambda_1 = 4l$, implying the second part of the claim. Suppose then $\operatorname{rank}(C) = 2$. Observing that the squared sum of the diagonal is $4l$ and, for $j = 1, 2, \ldots, 4l-1$, a term $\gamma(j)^2$ appears in $C$ exactly $2(4l-j)$ times, we obtain
\[
\|C\|_F^2 = 4l + 2\sum_{j=1}^{4l-1} (4l-j)\gamma(j)^2.
\]
Dividing the sum into odd and even lags, writing $A = \arcsin\left(\frac{b}{2}\right) = \frac{x}{2}$ with $x = \frac{k}{l}\pi$, and using the half-angle identities $\sin^2\theta = \frac{1}{2}(1-\cos(2\theta))$ and $\cos^2\theta = \frac{1}{2}(1+\cos(2\theta))$, we have
\[
\begin{aligned}
\|C\|_F^2 &= 4l + 2\sum_{j=0}^{2l-1}\left(4l-(2j+1)\right)\sin^2((2j+1)A) + 2\sum_{j=1}^{2l-1}(4l-2j)\cos^2(2jA) \\
&= 4l + \sum_{j=0}^{2l-1}\left(4l-(2j+1)\right)\left(1 - \cos((2j+1)x)\right) + \sum_{j=1}^{2l-1}(4l-2j)\left(1 + \cos(2jx)\right) \\
&= 4l + 4l^2 + (4l^2 - 2l) + \sum_{j=1}^{4l-1}(4l-j)\cos(jx)(-1)^j \\
&= 8l^2 + 2l + \sum_{j=1}^{4l-1}(4l-j)\cos(jx)(-1)^j,
\end{aligned}
\]
where in the third equality we have used
\[
\sum_{j=0}^{2l-1}\left(4l-(2j+1)\right) = 2l(4l-1) - 2l(2l-1) = 4l^2
\quad\text{and}\quad
\sum_{j=1}^{2l-1}(4l-2j) = 4l^2 - 2l.
\]
Now
\[
\sum_{j=1}^{4l-1}(4l-j)\cos(jx)(-1)^j = 2l + \sum_{j=1}^{2l-1}(4l-j)\cos(jx)(-1)^j + \sum_{j=2l+1}^{4l-1}(4l-j)\cos(jx)(-1)^j, \tag{15}
\]
where we used $\cos(2lx) = \cos(2k\pi) = 1$, and the substitution $j = 4l - t$ yields
\[
\sum_{j=2l+1}^{4l-1}(4l-j)\cos(jx)(-1)^j = \sum_{t=1}^{2l-1} t\cos((4l-t)x)(-1)^{4l-t} = \sum_{t=1}^{2l-1} t\cos(4k\pi - tx)(-1)^t = \sum_{t=1}^{2l-1} t\cos(tx)(-1)^t. \tag{16}
\]
Now (15), (16), and Lemma 3.5 imply
\[
\sum_{j=1}^{4l-1}(4l-j)\cos(jx)(-1)^j = 2l + 4l\sum_{j=1}^{2l-1}\cos(jx)(-1)^j = 2l - 4l = -2l,
\]
and hence $\|C\|_F^2 = 8l^2$. Together with $\operatorname{tr}(C) = 4l$ this gives
\[
\lambda_1^2 + (4l - \lambda_1)^2 - 8l^2 = 2\lambda_1^2 - 8l\lambda_1 + 8l^2 = 2(\lambda_1 - 2l)^2 = 0.
\]
Hence $\lambda_1 = \lambda_2 = 2l$.

We are now ready to prove Theorem 2.1 and Theorem 2.2.

Proof of Theorem 2.1.
Throughout the proof we denote $a_1 \equiv a_2 \pmod{2\pi}$ if $a_1 = a_2 + 2k\pi$ for some $k \in \mathbb{Z}$. That is, $a_1$ and $a_2$ are identified when regarded as points on the unit circle. By $a \in (a_1, a_2) \pmod{2\pi}$ we mean that $a \equiv a_3 \pmod{2\pi}$ for some $a_3 \in (a_1, a_2)$.

1. Since $\arcsin\left(\frac{b}{2}\right) = \frac{k}{l}\frac{\pi}{2}$, the first claim follows from Proposition 3.1 together with the fact that the functions $\sin(\cdot)$ and $\cos(\cdot)$ are periodic. In particular, we have $\gamma(4l+m) = \gamma(m)$ for every $m \in \mathbb{Z}$.

2. Denote $A = \arcsin\left(\frac{b}{2}\right) = \frac{r\pi}{2}$. By Proposition 3.1, $mA$ is the corresponding angle for $\gamma(m)$ on the unit circle. Note first that, due to the periodic nature of the cosine and sine functions, it suffices to prove the claim in the case $M = 0$. In what follows, we assume that $m \geq 0$. We show that $\gamma(m)$, $m \equiv 0 \pmod 4$, is dense in $[-1,1]$; a similar argument applies to the other equivalence classes as well. That is, we show that $\cos(mA)$, $m \equiv 0 \pmod 4$, is dense in $[-1,1]$. Essentially this follows from the observation that, as $r \notin \mathbb{Q}$, the function $m \mapsto \cos(mA)$ is injective. Indeed, if $\cos(\tilde{m}A) = \cos(mA)$ for some $\tilde{m}, m \geq 0$ with $\tilde{m} \neq m$, it follows that
\[
\tilde{m}A = \frac{\tilde{m}r\pi}{2} \equiv \pm\frac{mr\pi}{2} + 2k\pi = \pm mA + 2k\pi
\]
for some $k \in \mathbb{Z}$. This implies $r = \frac{4k}{\tilde{m} \mp m}$, which contradicts $r \notin \mathbb{Q}$. Since $\cos(mA)$ is injective, it is intuitively clear that $\cos(mA)$, $m \equiv 0 \pmod 4$, is dense in $[-1,1]$. For a precise argument, we argue by contradiction and assume that there exists an interval $(c_1, d_1) \subset [-1,1]$ such that $\cos(mA) \notin (c_1, d_1)$ for any $m \equiv 0 \pmod 4$. This implies that there exists an interval $(c, d) \subset [0, \pi]$ such that for every $m \equiv 0 \pmod 4$ it holds that $mA \notin (c, d) \pmod{2\pi}$.
Without loss of generality, we can assume $c = 0$ and that for some $m_0 \equiv 0 \pmod 4$ we have $m_0A \in (0, \pi) \pmod{2\pi}$. Let $m_n = m_0 + 4n$ with $n \in \mathbb{N}$, and denote by $\lfloor\cdot\rfloor$ the standard floor function. Suppose that for some $n \in \mathbb{N}$ and $p_n \in (-d, 0)$ we have $m_nA \equiv p_n \pmod{2\pi}$. Since by injectivity $\frac{2\pi}{|p_n|} \notin \mathbb{N}$, we get $m_n\left\lfloor\frac{2\pi}{|p_n|}\right\rfloor A \in (0, d) \pmod{2\pi}$, leading to a contradiction. This implies that for every $n \in \mathbb{N}$ we have $m_nA \notin (-d, d) \pmod{2\pi}$ (for a visual illustration, see Figure 4). Similarly, assume next that $m_{n_1}A \equiv p_{n_1} \pmod{2\pi}$ and $m_{n_1+n_2}A - m_{n_1}A \in (-d, d) \pmod{2\pi}$. Then $4n_2A = (m_{n_1+n_2} - m_{n_1})A \in (-d, d) \pmod{2\pi}$, and applying the previous argument to the lag $4n_2 \equiv 0 \pmod 4$ again leads to a contradiction (see Figure 5). This means that for an arbitrary point $p_n$ on the unit circle such that $m_nA \equiv p_n \pmod{2\pi}$, we get an interval $(p_n - d, p_n + d)$ (understood as angles on the unit circle) that cannot be visited later. As the whole unit circle is eventually covered, we obtain the desired contradiction.

Figure 4: Example of the excluded interval $(-d, d)$ around zero. Here $n^* = \left\lfloor\frac{2\pi}{|p_n|}\right\rfloor$, and the points on the unit circle corresponding to the steps $n$, $2n$, $(n^*-1)n$ and $n^*n$ are visualised.

3. Consider first the case $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$, where $\frac{k}{l} \in \mathcal{Q}$. By Lemma 3.6, the symmetric matrix $C$ defined by (12) has non-negative eigenvalues, and thus $C$ is the covariance matrix of some random vector $(X_0, X_1, \ldots, X_{4l-1})$. Now it suffices to extend this vector to a process $X = (X_t)_{t\in\mathbb{Z}}$ by the relation $X_{4l+t} = X_t$. Indeed, it is straightforward to verify that $X$ has the covariance function $\gamma$.

Assume next $b = 2\sin\left(\frac{r\pi}{2}\right)$, where $r \in (0,1)\setminus\mathbb{Q}$.
We argue by contradiction and assume that there exist $k \in \mathbb{N}$ and vectors $t = (t_1, t_2, \ldots, t_k)^T \in \mathbb{Z}^k$ and $a = (a_1, a_2, \ldots, a_k)^T \in \mathbb{R}^k$ such that
\[
\sum_{i,j=1}^{k} a_i\gamma(t_i - t_j)a_j = -\epsilon
\]
for some $\epsilon > 0$, where $\gamma(\cdot)$ is the covariance function corresponding to the value $b$. Since $\mathcal{Q}$ is dense in $[0,1]$, it follows that there exists a sequence $(q_n)_{n\in\mathbb{N}} \subset \mathcal{Q}$ such that $q_n \to r$. Denote the corresponding sequence of covariance functions by $(\gamma_n(\cdot))_{n\in\mathbb{N}}$. By definition,
\[
\sum_{i,j=1}^{k} a_i\gamma_n(t_i - t_j)a_j \geq 0 \quad \text{for every } n.
\]
On the other hand, continuity implies $\gamma_n(m) \to \gamma(m)$ for every $m$. This leads to
\[
\lim_{n\to\infty}\sum_{i,j=1}^{k} a_i\gamma_n(t_i - t_j)a_j = \sum_{i,j=1}^{k} a_i\gamma(t_i - t_j)a_j = -\epsilon,
\]
which is a contradiction.

Figure 5: Example of two excluded intervals and an angle $m_{n_1+n_2}A$.

Remark 3.7.
Note that in the periodic case the covariance matrix $C$ defined by (12) satisfies $\operatorname{rank}(C) \leq 2$. Thus, in this case, the process $(X_t)_{t\in\mathbb{Z}}$ is driven linearly by only two random variables $Y_1$ and $Y_2$. In other words, we have
\[
X_t = a_1(t)Y_1 + a_2(t)Y_2, \qquad t \in \mathbb{Z},
\]
for some deterministic coefficients $a_1(t)$ and $a_2(t)$.

Proof of Theorem 2.2. Suppose that $\gamma$ satisfies (8) and that $r(m) \leq r(0)(1-\epsilon)$ for all $m \geq M$. By Theorem 2.1, there exists $m^* \geq M$ such that $\gamma(m^*) \geq \gamma(0)\left(1 - \frac{\epsilon}{2}\right)$. Furthermore, (8) implies (6) for every $m$ such that $\gamma(m) \neq 0$. Now
\[
a_{m^*} = \frac{r(m^*)}{\gamma(m^*)} \leq \frac{r(0)(1-\epsilon)}{\gamma(0)\left(1-\frac{\epsilon}{2}\right)} < \frac{r(0)}{\gamma(0)} = a_0,
\]
leading to a contradiction with (6). Treating the case $r(m) \geq -r(0)(1-\epsilon)$ for all $m \geq M$ similarly concludes the proof.

References

[1] M. Bahamonde, S. Torres, and C.A. Tudor. ARCH model with fractional Brownian motion. Statistics and Probability Letters, 134:70–78, 2018.

[2] P.J. Brockwell and R.A. Davis. Time Series: Theory and Methods. Springer Science & Business Media, 2013.

[3] P. Cheridito, H. Kawaguchi, and M. Maejima. Fractional Ornstein-Uhlenbeck processes. Electronic Journal of Probability, 8:no. 3, 14 pp. (electronic), 2003.

[4] J.D. Hamilton. Time Series Analysis, volume 2. Princeton University Press, Princeton, 1994.

[5] K. Kubilius, Y. Mishura, and K. Ralchenko. Parameter Estimation in Fractional Diffusion Models. Springer, 2018.

[6] L. Viitasaari. Representation of stationary and stationary increment processes via Langevin equation and self-similar processes. Statistics and Probability Letters, 115:45–53, 2016.

[7] M. Voutilainen, L. Viitasaari, and P. Ilmonen. On model fitting and estimation of strictly stationary processes.