An R Package for Normality in Stationary Processes
Izhar Asael Alonzo Matamoros
Universidad de Cantabria
[email protected]
Alicia Nieto-Reyes
Universidad de Cantabria
[email protected]
Abstract
Normality is the main assumption for analyzing dependent data in several time series models, and tests of normality have been widely studied in the literature; however, the implementations of these tests are limited. The nortsTest package performs the tests of Lobato and Velasco, Epps, Psaradakis and Vávra, and the random projection test for normality of stationary processes. In addition, the package offers visual diagnostics for checking the stationarity and normality assumptions of the time series models most used in R packages. The aim of this work is to show the functionality of the package, presenting the performance of each test with simulated examples, and the utility of the package for model diagnostics in time series analysis.

Keywords: Gaussian process, hypothesis test, stochastic process.
1. Introduction
Normality (a set of observations being sampled from a Gaussian process) is an important assumption in a wide variety of statistical models. Therefore, developing procedures for testing this assumption is a topic that has gained popularity over several years. Most of the existing literature, and implementations, are dedicated to independent and identically distributed random variables (D'Agostino and Stephens 1986), and there are no results showing that these tests are consistent in the context of stationary processes. For this context, a small number of tests have been proposed over the years but, as far as we know, there exists no R package or consistent implementation of them.

The proposed nortsTest package provides four test implementations for normality in stationary processes. The aim of this work is to present a review of these tests and introduce the package functionality. The implemented tests are: (i) the Epps test (Epps 1987) based on the characteristic function, (ii) the corrected Skewness-Kurtosis (SK) test implemented by Lobato and Velasco (2004), (iii) the random projection test proposed by Nieto-Reyes, Cuesta-Albertos, and Gamboa (2014) and (iv) the Psaradakis and Vávra test (Psaradakis and Vávra 2017) that uses a bootstrap approximation of the Anderson and Darling (1952) test statistic for stationary linear processes. Additionally, we propose the check_residuals() function for checking the assumptions of time series models, which returns a report of tests for stationarity, seasonality, and normality, as well as diagnostic plots for visual checking. This function supports models from the most used packages for time series analysis, such as the forecast (Hyndman and Khandakar 2008) and aTSA (Qiu 2015) packages, and even functions in base R (R Core Team 2018); for instance, it supports the HoltWinters function (stats R package) for the Holt and Winters method (Holt 2004).

Section 2 provides the theoretical background, including preliminary concepts and results. Section 3 introduces the normality tests for stationary processes, each subsection introducing a test framework and including examples of the test functions with simulated data. Section 4 provides numerical experiments with simulated data and a real data application: Subsection 4.1 reports a simulation study for all the implemented tests, and Subsection 4.2 shows the functionality of the package for model checking in a real data application, where the carbon dioxide data measured at the Mauna Loa Observatory (Stoffer 2020) is analyzed using a state space model from the forecast package and the model assumptions are evaluated using the proposed check_residuals() function. Section 5 discusses the package functionality, provides our conclusions and outlines our future work on the package.
2. Preliminary concepts
This section provides some theoretical aspects of stochastic processes that form the necessary theoretical framework for the following sections. The presented definitions and results can be found in Shumway and Stoffer (2010) and Tsay (2010).

For the purpose of this work, T is a set of real values denoted as time, T ⊆ R; for instance, T = N or T = Z, the natural or integer numbers respectively. We denote by X := {X_t}_{t∈T} a stochastic process with X_t a real random variable for each t ∈ T. Following this notation, a time series is just a finite collection of ordered observations of X (Shumway and Stoffer 2010).

An important measure for a stochastic process is its mean function, µ(t) := E[X_t] for each t ∈ T, where E[·] denotes the usual expected value of a random variable. A generalization of this measure is the k-th order centered moment function, µ_k(t) := E[(X_t − µ(t))^k] for each t ∈ T and k > 1, the process variance function being the second order centered moment, σ²(t) := µ_2(t). Other important indicators are the auto-covariance and auto-correlation functions, which measure the linear dependency between two different time points of a given process. For any t, s ∈ T, they are, respectively,

γ(t, s) := E[(X_t − µ(t))(X_s − µ(s))] and ρ(t, s) := γ(t, s) / (√µ_2(t) √µ_2(s)).

Other widely used indicator functions for the analysis of processes are the skewness and kurtosis functions, defined for each t ∈ T as s(t) := µ_3(t)/[µ_2(t)]^{3/2} and k(t) := µ_4(t)/[µ_2(t)]², respectively.

A generally used assumption for stochastic processes is stationarity. It has a key role in forecasting procedures of classic time series modeling (Tsay 2010) and is a principal assumption in de-noising methods for signal theory (Wasserman 2006).

Definition 1
A stochastic process X is said to be strictly stationary if, for every collection τ = {t_1, t_2, ..., t_k} ⊂ T and h > 0, the joint distribution of {X_t}_{t∈τ} is identical to that of {X_{t+h}}_{t∈τ}.

The previous definition is strong for applications. A milder version of it, which makes use of only the first two moments of the process, is weak stationarity.
Definition 2
A stochastic process X is said to be weakly stationary if its mean function is constant in time, µ(t) = µ; its auto-covariance function only depends on the difference between times, γ(s, t) = σ(|t − s|) for some function σ; and it has a finite variance function, µ_2(t) = µ_2 < ∞.

For the rest of this work, the term stationary will be used to specify a weakly stationary process. A direct consequence of the stationarity assumption is that the previous indicator functions get simplified: given a stationary stochastic process X, its mean function, k-th order centered moment, for k > 1, and auto-covariance function are, respectively,

µ = E[X_t], µ_k = E[(X_t − µ)^k] and γ(h) = E[(X_{t+h} − µ)(X_t − µ)],

which are independent of t ∈ T. Given a sample x_1, ..., x_n, n ∈ N, of equally spaced observations of X, the corresponding estimators (sample mean, sample k-th order centered moment and sample auto-covariance) are, respectively,

µ̂ := n^{−1} Σ_{i=1}^n x_i, µ̂_k := n^{−1} Σ_{i=1}^n (x_i − µ̂)^k and γ̂(h) := n^{−1} Σ_{i=1}^{n−h} (x_{i+h} − µ̂)(x_i − µ̂).

A particular case in which stationarity implies strict stationarity is that of Gaussian processes.
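As an illustration, the estimators above can be computed directly in base R; the helper below is our own sketch for this document, not a nortsTest function:

```r
# Sample estimators defined above: sample mean, k-th order centered
# moment and sample auto-covariance at lag h.
sample_moments <- function(x, k = 2, h = 1) {
  n   <- length(x)
  mu  <- mean(x)                                  # sample mean
  muk <- mean((x - mu)^k)                         # k-th centered moment
  gh  <- sum((x[(1 + h):n] - mu) * (x[1:(n - h)] - mu)) / n
  list(mean = mu, moment_k = muk, gamma_h = gh)   # gh: auto-covariance
}

set.seed(1)
x <- as.numeric(arima.sim(n = 500, model = list(ar = 0.5)))
m <- sample_moments(x, k = 2, h = 1)
```

Note that the auto-covariance estimator divides by n rather than n − h, matching the definition above.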
Definition 3
A stochastic process X is said to be a Gaussian process if, for every finite collection τ = {t_1, t_2, ..., t_k} ⊂ T, the joint distribution of {X_t}_{t∈τ} has a multivariate normal distribution.

A series of mean zero uncorrelated random variables with finite constant variance is known as white noise. If, additionally, it is formed of independent and identically distributed (i.i.d.) normal random variables, it is known as Gaussian white noise, which is a particular case of a stationary Gaussian process. For the rest of the work, X_t ∼ N(µ, σ²) denotes that the random variable X_t is normally distributed with mean µ and variance σ², and χ²(v) denotes the chi-squared distribution with v degrees of freedom. Other classes of stochastic processes can be defined using collections of white noise; for instance, the linear process.

Definition 4
Let X be a stochastic process. X is said to be linear if it can be written as

X_t = µ + Σ_{i∈Z} φ_i ε_{t−i},

where {ε_i}_{i∈Z} is a collection of white noise random variables and {φ_i}_{i∈Z} is a set of real values such that Σ_{i∈Z} |φ_i| < ∞.

An important class of processes is that of the auto-regressive moving average (ARMA) processes. Box and Jenkins (1990) introduced them for time series analysis and forecasting, and they became very well known in the 90s and early 21st century.
Definition 5
For any non-negative integers p, q, a stochastic process X is an ARMA(p, q) process if it is a stationary process and

X_t = Σ_{i=1}^p φ_i X_{t−i} + Σ_{i=0}^q θ_i ε_{t−i}, (1)

where {φ_i}_{i=1}^p and {θ_i}_{i=0}^q are sequences of real values with φ_p ≠ 0, θ_0 = 1 and θ_q ≠ 0, and {ε_i}_{i∈Z} is a collection of white noise random variables.

Particular cases of ARMA processes are the auto-regressive (AR(p) := ARMA(p, 0)) and moving average (MA(q) := ARMA(0, q)) processes. Additionally, a random walk is a non-stationary process satisfying (1) with p = 1, φ_1 = 1 and q = 0.

Several properties of an ARMA process can be extracted from its structure. For that, the AR and MA polynomials are introduced:

AR: φ(z) = 1 − Σ_{i=1}^p φ_i z^i and MA: θ(z) = Σ_{i=0}^q θ_i z^i,

where z is a complex number and, as before, φ_p ≠ 0, θ_0 = 1 and θ_q ≠ 0. Conditions for stationarity, order selection and process behavior are properties studied from these two polynomials.

For modeling volatility in financial data, Bollerslev (1986) proposed the generalized auto-regressive conditional heteroscedastic (GARCH) class of processes as a generalization of the auto-regressive conditional heteroscedastic (ARCH) processes (Engle 1982).
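For illustration, the stationarity condition on the AR polynomial (all roots outside the unit circle) can be checked numerically in base R; this helper is our own sketch, not a package function:

```r
# Stationarity check for an AR(p) process via the roots of the AR
# polynomial phi(z) = 1 - phi_1 z - ... - phi_p z^p: the process is
# stationary iff all roots lie outside the unit circle.
ar_is_stationary <- function(phi) {
  roots <- polyroot(c(1, -phi))   # coefficients of the AR polynomial
  all(Mod(roots) > 1)
}

ar_is_stationary(c(0.5, 0.2))   # TRUE: a stationary AR(2)
ar_is_stationary(1)             # FALSE: random walk, unit root at z = 1
```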
Definition 6
For any p, q ∈ N, a stochastic process X is a GARCH(p, q) process if it satisfies

X_t = µ + σ_t ε_t, with σ_t² = α_0 + Σ_{i=1}^p α_i ε²_{t−i} + Σ_{i=1}^q β_i σ²_{t−i},

where µ is the process mean, α_0 is a positive constant value, {α_i}_{i=1}^p and {β_i}_{i=1}^q are non-negative sequences of real values and {ε_t}_{t∈T} is a collection of i.i.d. random variables.

A more general class of processes are the state-space models (SSMs), which have gained popularity over the years because they do not impose on the process common restrictions such as linearity or stationarity, and are flexible in incorporating the different characteristics of the process (Petris, Petrone, and Campagnoli 2007). They are widely used for smoothing (West and Harrison 2006) and forecasting (Hyndman and Khandakar 2008) in time series analysis. The main idea is to model the process dependency with two equations: the state equation, which models how the parameters change over time, and the innovation equation, which models the process in terms of the parameters. Some particular SSMs that analyze the level, trend and seasonal components of the process are known as error, trend and seasonal (ETS) models. There are over 32 different variations of ETS models (Hyndman, Koehler, Ord, and Snyder 2008). One of them is the multiplicative error, additive trend-seasonality (ETS(M, A, A)) model.
Definition 7
A SSM process X follows an ETS(M, A, A) model if the process accepts

X_t = [L_{t−1} + T_{t−1} + S_{t−m}](1 + ε_t)

as innovation equation and

L_t = L_{t−1} + T_{t−1} + α(L_{t−1} + T_{t−1} + S_{t−m}) ε_t,
T_t = T_{t−1} + β(L_{t−1} + T_{t−1} + S_{t−m}) ε_t,
S_t = S_{t−m} + γ(L_{t−1} + T_{t−1} + S_{t−m}) ε_t,

as state equations, where α, β, γ ∈ [0, 1], m ∈ N denotes the period of the series and {ε_t} are i.i.d. normal random variables. For each t ∈ Z, L_t, T_t and S_t represent, respectively, the level, trend and seasonal components.
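The ETS(M, A, A) recursions can be simulated directly from Definition 7; the following base R sketch uses assumed parameter values, for illustration only:

```r
# Simulate one ETS(M,A,A) trajectory from the innovation and state
# equations of Definition 7 (parameter values are assumptions made
# for illustration).
sim_ets_maa <- function(n, m = 12, alpha = 0.1, beta = 0.05, gamma = 0.01,
                        level0 = 100, trend0 = 1, sigma = 0.02) {
  L <- level0; Tr <- trend0; S <- rep(0, m)   # initial states
  x <- numeric(n)
  for (t in 1:n) {
    e     <- rnorm(1, 0, sigma)               # i.i.d. normal innovation
    mu    <- L + Tr + S[1]                    # L_{t-1} + T_{t-1} + S_{t-m}
    x[t]  <- mu * (1 + e)                     # innovation equation
    Lnew  <- L + Tr + alpha * mu * e          # state equations
    Trnew <- Tr + beta * mu * e
    Snew  <- S[1] + gamma * mu * e
    L <- Lnew; Tr <- Trnew
    S <- c(S[-1], Snew)                       # rotate the seasonal buffer
  }
  ts(x, frequency = m)
}

set.seed(123)
y <- sim_ets_maa(120)
```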
3. Normality tests for stationary processes
Extensive literature exists on goodness-of-fit tests for normality under the assumption of independent and identically distributed random variables, including Pearson's chi-squared test (Pearson and Henrici 1895), the Kolmogorov-Smirnov test (Smirnov 1948), the Anderson-Darling test (Anderson and Darling 1952), the SK test (Jarque and Bera 1980) and the Shapiro-Wilk test (Shapiro and Wilk 1965; Royston 1982), among others. These procedures have been widely used in many studies and applications; see D'Agostino and Stephens (1986) for further details. There are no results, however, showing that the above tests are consistent in the context of stationary processes, a case in which the independence assumption is violated. For instance, Gasser (1975) provides a simulation study where Pearson's chi-squared test has an excessive rejection rate under the null hypothesis for dependent data. For this matter, several tests have been proposed over the years, a selection of which we reference here. Epps (1987) provides a test based on the characteristic function, and a similar test is proposed by Hinich (1982) based on the process' spectral density function (see Berg, Paparoditis, and Politis 2010, for further insight). Gasser (1975) gives a correction of the SK test, with several modifications made in Lobato and Velasco (2004), Bai and Ng (2005) and Psaradakis (2017), which are popular in many financial applications. Bontemps and Meddahi (2005) construct a test based on Stein's characterization of a Gaussian distribution. Using the random projection method (Cuesta-Albertos, del Barrio, Fraiman, and Matrán 2007), Nieto-Reyes et al. (2014) build a test that upgrades the performance of the Epps (1987) and Lobato and Velasco (2004) procedures. Furthermore, Psaradakis and Vávra (2017) proposed a bootstrap approximation of the Anderson and Darling (1952) test statistic for stationary linear processes.

Despite the existing literature, consistent implementations of goodness-of-fit tests for normality of stationary processes in programming languages such as R or Python are limited. We present here the nortsTest package: it performs the tests proposed in Epps (1987), Lobato and Velasco (2004), Nieto-Reyes et al. (2014) and Psaradakis and Vávra (2017). To install the latest release version of nortsTest from CRAN, type install.packages("nortsTest") within R. The current development version can be installed from GitHub using the following code:
R> if (!requireNamespace("remotes")) install.packages("remotes")
R> remotes::install_github("asael697/nortsTest", dependencies = TRUE)
Additionally, the package offers visualization functions for descriptive time series analysis and several diagnostic methods for checking the stationarity and normality assumptions of the most used time series models of several R packages. To elaborate on this, Subsection 3.1 introduces the package functionality and software, and Subsection 3.2 provides an overview of the methods used for checking stationarity and seasonality. Finally, Subsections 3.3-3.6 present a general framework for each of the implemented tests and their functionality by providing simulated data examples.

The package works as an extension of the nortest package (Gross and Ligges 2015), which performs normality tests in random samples but for independent data. The building block functions of the nortsTest package are:

• epps.test(), which implements the test of Epps,
• lobato.test(), which implements the test of Lobato and Velasco,
• rp.test(), which implements the random projection test of Nieto-Reyes, Cuesta-Albertos and Gamboa, and
• vavra.test(), which implements the test of Psaradakis and Vávra.

Each of these functions accepts a numeric (numeric) or ts (time series) class object for storing data, and returns an htest (hypothesis test) class object with the main results of the test. To guarantee the accuracy of the results, each test performs unit root tests for checking stationarity and seasonality (see Subsection 3.2) and displays a warning message if any of them is not satisfied.

For visual diagnostics, the package offers different plot functions based on the ggplot2 package (Wickham 2009): the autoplot() function plots numeric, ts and mts (multivariate time series) classes, while the gghist() and ggnorm() functions are for plotting histograms and qq-plots respectively; and, based on the forecast package (Hyndman and Khandakar 2008), the ggacf() and ggPacf() functions display the auto-correlation and partial auto-correlation functions respectively.

Furthermore, inspired by the checkresiduals() function of the forecast package, we provide the check_residuals() function for checking the assumptions of a model using its estimated residuals. This function checks stationarity, seasonality (see Subsection 3.2) and normality, presenting a report of the used tests and conclusions. If the plot option is TRUE, the function displays several plots for visual checking. An illustration of these functions is provided in Subsection 4.2, where we show the functions' details and their utility for checking the assumptions commonly made in time series modeling.
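As a quick orientation, the four building-block functions can be applied to a single simulated series as follows (the calls are guarded so the sketch degrades gracefully if nortsTest is not installed):

```r
# Apply the four building-block tests of nortsTest to one simulated
# Gaussian AR(1) series; each call returns an object of class htest.
set.seed(298)
x <- arima.sim(n = 250, model = list(ar = 0.3))

if (requireNamespace("nortsTest", quietly = TRUE)) {
  print(nortsTest::epps.test(x))    # Epps test
  print(nortsTest::lobato.test(x))  # Lobato and Velasco test
  print(nortsTest::rp.test(x))      # random projection test
  print(nortsTest::vavra.test(x))   # Psaradakis and Vavra test
}
```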
For checking stationarity, the nortsTest package uses unit root and seasonal unit root tests. These tests work similarly, checking whether a specific process follows a random-walk model, which is clearly a non-stationary process.
Unit root tests
A stochastic process X is non-stationary if it follows a random-walk model. This statement is equivalent to saying that the AR(1) polynomial (φ(z) = 1 − φ_1 z) of X has a unit root: if φ_1 = 1, then φ(z) = 1 − z, whose only root is one. The most commonly used tests for unit root testing are the Augmented Dickey-Fuller (Said and Dickey 1984), Phillips-Perron (Perron 1988), KPSS (Kwiatkowski, Phillips, Schmidt, and Shin 1992) and Ljung-Box (Box and Pierce 1970) tests. The uroot.test() and check_residuals() functions perform these tests, making use of the tseries package (Trapletti and Hornik 2019).

Seasonal unit root tests
Let X be a stationary process and m be its period; for observed data, m is the number of observations per unit of time. X follows a seasonal random walk if it can be written as

X_t = X_{t−m} + ε_t,

where {ε_t} is a collection of i.i.d. random variables. In a similar way, the process X is non-stationary if it follows a seasonal random walk or, equivalently, if the seasonal AR(1) polynomial (φ_m(z) = 1 − φ_1 z^m) has a unit root. The seasonal() and check_residuals() functions perform the OCSB test (Osborn, Chui, Smith, and Birchenhall 1988) from the forecast package, and the HEGY (Beaulieu and Miron 1993) and CH (Canova and Hansen 1995) tests from the uroot package (de Lacalle 2019).

The test of Epps

The χ² test for normality proposed by Epps (1987) compares the empirical characteristic function of the one-dimensional marginal of the process with that of a normally distributed random variable, evaluated at certain points on the real line. Several authors, such as Lobato and Velasco (2004) and Psaradakis and Vávra (2017), point out that the greatest challenge of this test is its implementation procedure.

Let X be a stationary stochastic process that satisfies

Σ_{t=−∞}^{∞} |t|^k |γ(t)| < ∞ for some k > 0. (2)

The null hypothesis is that the one-dimensional marginal distribution of X is normally distributed. As we see in what follows, the procedure for constructing the test consists of defining a function g, estimating its inverse spectral matrix function, minimizing the generated quadratic function in terms of the unknown parameters of the random variable and, finally, obtaining the test statistic, which converges in distribution to a χ². Given N ∈ N with N > 0, let

Λ := {λ := (λ_1, ..., λ_N) ∈ R^N : λ_i ≤ λ_{i+1} and λ_i > 0, for i = 1, 2, ..., N}

and let g: R × Λ → R^{2N} be a measurable function, where

g(x, λ) := [cos(λ_1 x), sin(λ_1 x), ..., cos(λ_N x), sin(λ_N x)]^t.

Additionally, let g_θ: Λ → R^{2N} be the function defined by

g_θ(λ) := [Re(Φ_θ(λ_1)), Im(Φ_θ(λ_1)), ..., Re(Φ_θ(λ_N)), Im(Φ_θ(λ_N))]^t,

where Re(·) and Im(·) are the real and imaginary components of a complex number and Φ_θ is the characteristic function of a normal random variable with parameters θ = (µ, σ²) ∈ Θ, an open bounded set contained in R × R⁺. For any λ ∈ Λ, let us also denote

ĝ(λ) := (1/n) Σ_{t=1}^n [cos(λ_1 X_t), sin(λ_1 X_t), ..., cos(λ_N X_t), sin(λ_N X_t)]^t.
Let f(v; θ, λ) be the spectral density matrix of {g(X_t, λ)}_{t∈Z} at frequency v. Then, for v = 0, it can be estimated by

f̂(0; θ, λ) := (2πn)^{−1} [ Σ_{t=1}^n Ĝ(X_{t,0}, λ) + 2 Σ_{i=1}^{⌊n^{2/5}⌋} (1 − i/⌊n^{2/5}⌋) Σ_{t=1}^{n−i} Ĝ(X_{t,i}, λ) ],

where Ĝ(X_{t,i}, λ) := (g(X_t, λ) − ĝ(λ))(g(X_{t+i}, λ) − ĝ(λ))^t and ⌊·⌋ denotes the floor function. The general form of the test statistic under H_0 is

Q_n(λ) := min_{θ∈Θ} {Q_n(θ, λ)}, with Q_n(θ, λ) := (ĝ(λ) − g_θ(λ))^t G_n^+(λ) (ĝ(λ) − g_θ(λ)),

where G_n^+ is the generalized inverse of the spectral density matrix 2π f̂(0; θ, λ). Let θ̂ = arg min_{θ∈Θ} {Q_n(θ, λ)} be the argument that minimizes Q_n(θ, λ), such that θ̂ is in a neighborhood of θ̂_n = (µ̂, γ̂(0)). To guarantee its existence and uniqueness, the following assumptions, referred to as (A.), are required.

(A.) Let θ_0 be the true value of θ = (µ, σ²) under H_0; then, for every λ ∈ Λ, the following conditions are satisfied:
– f(0; θ_0, λ) is positive definite.
– Φ_θ(λ) is twice differentiable with respect to θ in a neighborhood of θ_0.
– The matrix D(θ_0, λ) = ∂Φ_θ(λ)/∂θ |_{θ=θ_0} ∈ R^{2N×2}, for N ≥ 2, has rank 2.
– The set Θ_0(λ) := {θ ∈ Θ : Φ_θ(λ_i) = Φ_{θ_0}(λ_i), i = 1, ..., N} is a finite bounded set in Θ, and Θ is a bounded subset of R × R⁺.
– f(0; θ, λ) = f(0; θ_0, λ) and D(θ, λ) = D(θ_0, λ) for all θ ∈ Θ_0(λ).

Under these assumptions, Epps's main result is presented as follows.

Theorem 1 (Epps (1987), Theorem 2.1)
Let X be a stationary Gaussian process such that (2) and (A.) are satisfied; then nQ_n(λ) →_d χ²(2N − 2) for every λ ∈ Λ.

For the current nortsTest version, we define Λ := {λ_0}, with λ_0 ∈ R² a fixed vector of two points scaled by the sample variance γ̂(0). Therefore, the implemented test statistic converges to a χ² distribution with two degrees of freedom. In the next version of the package, the user will be able to set Λ as desired, with the current value as default.

Example 1
A stationary AR(2) process is drawn using a beta distribution with shape1 = 9 and shape2 = 1 parameters, and the implementation of the test of Epps, epps.test(), is performed. At significance level α = 0.05, the null hypothesis of normality is correctly rejected.

R> set.seed(298)
R> x = arima.sim(250, model = list(ar = c(0.5, 0.2)), rand.gen = rbeta,
+               shape1 = 9, shape2 = 1)
R> epps.test(x)

Epps test

data: x
epps = 32.614, df = 2, p-value = 8.278e-08
alternative hypothesis: x does not follow a Gaussian Process
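To illustrate the quantities the test compares, the following base R sketch (our own illustration, not the package's implementation) computes the empirical vector ĝ(λ) and its normal counterpart g_θ(λ) at two points λ:

```r
# Empirical vs. normal characteristic function at the points lambda:
# the vectors g-hat(lambda) and g-theta(lambda) compared by the test.
emp_cf <- function(x, lambda) {
  c(rbind(sapply(lambda, function(l) mean(cos(l * x))),
          sapply(lambda, function(l) mean(sin(l * x)))))
}
norm_cf <- function(mu, s2, lambda) {
  phi <- exp(1i * lambda * mu - s2 * lambda^2 / 2)   # normal ch. function
  c(rbind(Re(phi), Im(phi)))
}

set.seed(1)
x <- rnorm(1000, mean = 2, sd = 1)
lambda <- c(1, 2) / sqrt(var(x))    # two points scaled by the variance
emp_cf(x, lambda) - norm_cf(mean(x), var(x), lambda)  # near zero here
```

For non-Gaussian marginals the difference between the two vectors is what drives the Q_n statistic away from zero.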
The test of Lobato and Velasco

Lobato and Velasco (2004) provide a consistent estimator of the corrected SK test statistic, also known as the Jarque-Bera test (Jarque and Bera 1980), for stationary processes (Lomnicki 1961; Gasser 1975, for further insight). Contrary to the test of Epps, it does not require additional parameters for the approximation of the sample test statistic. The general framework of the test is presented in what follows.

Let X be a stationary stochastic process that satisfies

Σ_{t=0}^{∞} |γ(t)| < ∞. (3)

The null hypothesis is that the one-dimensional marginal distribution of X is normally distributed, that is, H_0: X_t ∼ N(µ, σ²) for all t ∈ T. Let k_q(j_1, j_2, ..., j_{q−1}) denote the q-th order cumulant of X_0, X_{j_1}, ..., X_{j_{q−1}}. H_0 is fulfilled if all the marginal cumulants above the second order are zero. In practice, only the third and fourth order marginal cumulants are tested; equivalently, in terms of moments, the marginal distribution is normal by testing whether µ_3 = 0 and µ_4 = 3µ_2². For non-correlated data, the SK test compares the SK statistic against upper critical values of a χ²(2) distribution (Bai and Ng 2005). For a Gaussian process X satisfying (3), the following limiting result holds:

√n (µ̂_3, µ̂_4 − 3µ̂_2²)^t →_d N[0, Σ_F],

where 0 := (0, 0)^t and Σ_F := diag(6F^(3), 24F^(4)) ∈ R^{2×2} is a diagonal matrix with F^(k) := Σ_{j=−∞}^{∞} γ(j)^k for k = 3, 4. F^(k) is estimated by

F̂^(k) := Σ_{t=1−n}^{n−1} γ̂(t)[γ̂(t) + γ̂(n − |t|)]^{k−1}

to build the generalized SK test statistic

G := n µ̂_3² / (6 F̂^(3)) + n (µ̂_4 − 3µ̂_2²)² / (24 F̂^(4)).

Similar to the SK test for non-correlated data, the G statistic is compared against upper critical values of a χ²(2) distribution. This is seen in the result below, which establishes the asymptotic properties of the test statistic so that the general test procedure can be constructed. The result requires the following assumptions, denoted by (B.), for the process X:

(B.)
– E[X_t^16] < ∞ for t ∈ T.
– Σ_{j_1=−∞}^{∞} ··· Σ_{j_{q−1}=−∞}^{∞} |k_q(j_1, ..., j_{q−1})| < ∞ for q = 2, 3, ..., 16.
– Σ_{j=1}^{∞} (E[E[(X_0 − µ)^k | B_{−j}] − µ_k]²)^{1/2} < ∞ for k = 3, 4, where B_{−j} denotes the σ-field generated by X_t, t ≤ −j.
– E[(X_0 − µ)^k − µ_k]² + 2 Σ_{j=1}^{∞} E([(X_0 − µ)^k − µ_k][(X_j − µ)^k − µ_k]) > 0 for k = 3, 4.

Note that these assumptions imply that the higher-order spectral densities up to order 16 are continuous and bounded.
Theorem 2 (Lobato and Velasco (2004), Theorem 1)
Let X be a stationary process. If X is Gaussian and satisfies (3), then G →_d χ²(2); under assumption (B.), the test statistic G diverges whenever µ_3 ≠ 0 or µ_4 ≠ 3µ_2².

Example 2
A stationary MA(3) process is drawn using a gamma distribution with rate = 3 and shape = 6 parameters, and the test of Lobato and Velasco is performed using the function lobato.test() of the proposed nortsTest package. At significance level α = 0.05, the null hypothesis of normality is correctly rejected.

R> set.seed(298)
R> x = arima.sim(250, model = list(ma = c(0.2, 0.3, -0.4)), rand.gen = rgamma,
+               rate = 3, shape = 6)
R> lobato.test(x)

Lobato and Velasco's test

data: x
lobato = 62.294, df = 2, p-value = 2.972e-14
alternative hypothesis: x does not follow a Gaussian Process
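For intuition on the statistic G, the following base R sketch computes it from the sample moments and autocovariances; this is our own simplified illustration, not the package code:

```r
# Illustrative computation of the Lobato-Velasco G statistic from the
# sample moments and autocovariances defined above.
lv_statistic <- function(x) {
  n  <- length(x)
  mu <- mean(x)
  m2 <- mean((x - mu)^2); m3 <- mean((x - mu)^3); m4 <- mean((x - mu)^4)
  g  <- sapply(0:(n - 1), function(h)           # gamma-hat(h), h = 0..n-1
    sum((x[(1 + h):n] - mu) * (x[1:(n - h)] - mu)) / n)
  g2   <- c(g, 0)                               # treat gamma-hat(n) as 0
  absu <- abs((1 - n):(n - 1))                  # |t| for t = 1-n, ..., n-1
  gam  <- g2[absu + 1]                          # gamma-hat(t) (even in t)
  gnmt <- g2[n - absu + 1]                      # gamma-hat(n - |t|)
  F3 <- sum(gam * (gam + gnmt)^2)               # F-hat(3)
  F4 <- sum(gam * (gam + gnmt)^3)               # F-hat(4)
  n * m3^2 / (6 * F3) + n * (m4 - 3 * m2^2)^2 / (24 * F4)
}

set.seed(298)
G <- lv_statistic(rnorm(500))   # compare against chi-squared(2) quantiles
G
```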
The random projection test

The previous two proposals only test for normality of the one-dimensional marginal distribution of the process, which makes them inconsistent against alternatives whose one-dimensional marginal is Gaussian. Nieto-Reyes et al. (2014) provide a procedure to fully test normality of a stationary process using a Cramér-Wold type result (Cuesta-Albertos et al. 2007) that works on l², the space of square summable sequences over N, with inner product ⟨·, ·⟩.
Let η be a dissipative distribution on l² and Z an l²-valued random element; then Z is Gaussian if and only if η{h ∈ l² : ⟨Z, h⟩ has a Gaussian distribution} > 0.
A dissipative distribution (Nieto-Reyes et al. 2014) generalizes to infinite-dimensional spaces the concept of a genuinely d-dimensional distribution. To construct a dissipative distribution on l², use is made of the Dirichlet process (Gelman, Carlin, Stern, Dunson, Vehtari, and Rubin 2013). In practice, the h ∈ l² is drawn with a stick-breaking process that makes use of beta distributions.

Let X = {X_t}_{t∈Z} be a stationary process. As X is normally distributed if the process X^(t) := {X_k}_{k≤t} is Gaussian for each t ∈ Z, using the result above, Nieto-Reyes et al. (2014) provide a procedure for testing that X is a Gaussian process by testing whether the process Y^h = {Y_t^h}_{t∈Z} is Gaussian, where

Y_t^h := Σ_{i=0}^{∞} h_i X_{t−i} = ⟨X^(t), h⟩, (4)

⟨X^(t), h⟩ being a real random variable for each t ∈ Z and h ∈ l². Thus, Y^h is a stationary process constructed by the projection of X^(t) on the space generated by h. Therefore, X is a Gaussian process if and only if the marginal distribution of Y^h is normally distributed. Additionally, the hypotheses of the tests of Lobato and Velasco or Epps, such as (2), (3), (A.) and (B.), imposed on X are inherited by Y^h; then, those tests can be applied to evaluate the normality of the marginal distribution of Y^h. Further considerations, such as the specific beta parameters used to construct the distribution from which to draw h, or the number of projections required to improve the method's performance, have to be taken into account; all of these details are discussed in Nieto-Reyes et al. (2014).

Next, we summarize the test of random projections in practice:

1. Select k, the number of independent projections to be used (by default, k = 64).
2. Half of the random elements on which to project are drawn from a dissipative distribution that makes use of a particular beta distribution (β(2, 7) by default). Then, the test of Lobato and Velasco is applied to the odd-numbered projected processes, and the Epps test to the even-numbered ones.
3. The other half are drawn analogously but using another beta distribution (β(100, 1) by default). Then, again, the test of Lobato and Velasco is applied to the odd-numbered projected processes, and the Epps test to the even-numbered ones.
4. The k obtained p-values are combined using the false discovery rate (Benjamini and Yekutieli 2001).

The rp.test() function implements the above procedure. The user may provide optional parameters such as the number of projections k, the parameters of the first beta distribution pars1 and those of the second, pars2. In the next example, rp.test() is applied to a stationary GARCH(1,1) process drawn using normal random variables.

Example 3
A stationary GARCH(1,1) process is drawn using a standard normal distribution and the parameters α_1 = 0.2 and β_1 = 0.3, via the garchSpec() and garchSim() functions of the fGarch package. Note that a GARCH(1,1) process is stationary if its parameters satisfy α_1 + β_1 < 1.

R> set.seed(3466)
R> spec = garchSpec(model = list(alpha = 0.2, beta = 0.3))
R> x = ts( garchSim(spec, n = 300) )
R> rp.test(x, k = 250)

k random projections test

data: x
k = 250, lobato = 1.1885, epps = 3.1659, p-value = 0.8276
alternative hypothesis: x does not follow a Gaussian Process
The random projection test is applied to the simulated data with k = 250 projections (as recommended by the authors). At significance level α = 0.05, there is no evidence to reject the null hypothesis of normality.

The random.projection() function upgrades the lobato.test() and epps.test() functions for fully testing normality. This function generates the projected process Y^h as in (4); the function's shape1 and shape2 arguments are the parameters of the beta distribution used to generate the stick-breaking process h. Then, the lobato.test() or epps.test() functions can be applied to the resulting Y^h process for fully testing.

Example 4
We use the AR(2) process simulated in Example 1 to fully check normality using the epps.test() and random.projection() functions, where shape1 = 100 and shape2 = 1 are the arguments for generating the new projected process Y^h. At significance level α = 0.05, the null hypothesis of normality is again correctly rejected.

R> set.seed(298)
R> x = arima.sim(250, model = list(ar = c(0.5, 0.2)), rand.gen = rbeta,
+               shape1 = 9, shape2 = 1)
R> y = random.projection(x, shape1 = 100, shape2 = 1, seed = 298)
R> epps.test(y)

Epps test

data: x
epps = 11.645, df = 2, p-value = 0.002961
alternative hypothesis: x does not follow a Gaussian Process
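The stick-breaking construction behind random.projection() can be sketched as follows; this is our own illustrative version (the function names and the truncation at a finite number of terms are assumptions), not the package's exact algorithm:

```r
# Draw h in l^2 via a stick-breaking process: beta draws split the
# remaining "stick" into positive weights summing to at most 1, and
# sqrt() turns the weights into a square-summable sequence.
stick_breaking_h <- function(n_terms, shape1 = 100, shape2 = 1) {
  b <- rbeta(n_terms, shape1, shape2)       # stick-breaking proportions
  w <- b * cumprod(c(1, 1 - b[-n_terms]))   # weight of each piece
  sqrt(w)                                   # h with sum(h^2) < 1
}

# Project a series on h as in (4): Y_t = sum_i h_i * x_{t-i} (truncated).
project_series <- function(x, h) {
  n <- length(x); k <- length(h)
  sapply(k:n, function(t) sum(h * x[t:(t - k + 1)]))
}

set.seed(298)
x <- as.numeric(arima.sim(250, model = list(ar = c(0.5, 0.2))))
h <- stick_breaking_h(10)
y <- project_series(x, h)   # ready for epps.test() or lobato.test()
```

With shape1 = 100 and shape2 = 1, most of the weight falls on the first coordinate, so the projected process stays close to the original series.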
Psaradakis and VÃąvra (2017) proposed a distance test for normality of the one-dimensionalmarginal distribution of a stationary process. The test is based on the Anderson and Darling(1952) test statistic and makes use of an auto-regressive sieve bootstrap approximation tothe null distribution of the sample test statistic. Although the test is said to be applicableto a wider class of non-stationary processes, by transforming them into stationary by meansof a fractional difference operator, no theoretic result was apparently provided to sustain3this transformation. Therefore, here we restrict the presentation and implementation of theprocedure to stationary processes.Let X be a stationary process satisfying X t = ∞ X i =0 θ i (cid:15) t − i + µ , t ∈ Z , (5)where µ ∈ R , { θ i } ∞ i =0 ∈ l with θ = 1 and { (cid:15) t } ∞ i =0 a collection of mean zero i.i.d randomvariables. The null hypothesis is that the one-dimensional marginal distribution of X isnormally distributed, H : F ( µ + q γ (0) x ) − F N ( x ) = 0 , for all x ∈ R , where F is the cumulative distribution function of X , and F N denotes the standard normalcumulative distribution function. Note that if (cid:15) is normally distributed, then the null hypoth-esis is satisfied. Conversely, if the null hypothesis is satisfied, then (cid:15) is normally distributedand consequently X .The considered test for H is based on the Anderson-Darling distance statistic A d = Z ∞−∞ [ F n ( b µ + pb γ (0) x ) − F N ( x )] F N ( x )[1 − F N ( x )] dF N ( x ) , (6)where F n ( · ) is the empirical distribution function associated to F based on a simple randomsample of size n. Psaradakis and VÃąvra (2017) propose an auto-regressive sieve bootstrapprocedure to approximate the sampling properties of A d arguing that making use of classicalasymptotic inference for A d is problematic and involved. 
This scheme is motivated by the fact that, under some assumptions on X including (5), \epsilon_t admits the representation

    \epsilon_t = \sum_{i=1}^{\infty} \phi_i (X_{t-i} - \mu), \qquad t \in \mathbb{Z},    (7)

for a certain sequence \{\phi_i\}_{i=1}^{\infty} \in \ell^1. The main idea behind this approach is to generate a bootstrap sample \epsilon_t^* that approximates \epsilon_t by means of a finite-order auto-regressive model, because the distributions of the processes \epsilon_t and \epsilon_t^* coincide asymptotically if the order of the auto-regressive approximation grows simultaneously with n at an appropriate rate (Bühlmann 1997). The procedure makes use of the \epsilon_t^*'s to obtain the X_t^*'s through the bootstrap analog of (7). Then, a bootstrap sample of the A_d statistic, A_d^*, is generated making use of the bootstrap analog of (5).

This test is implemented in the vavra.test() function; 1,000 sieve-bootstrap replications are used by default. The presented values are Monte Carlo estimates of the A_d statistic and the corresponding p.value.

Example 5
A stationary ARMA(1,1) process is simulated using a standard normal distribution, and the test of Psaradakis and Vávra is performed. At significance level α = 0.05, there is no evidence to reject the null hypothesis of normality.

R> set.seed(298)
R> x = arima.sim(250, model = list(ar = 0.2, ma = 0.34))
R> vavra.test(x)

	Psaradakis-Vavra test

data: x
bootstrap A = 1.5798, p-value = 0.796
alternative hypothesis: x does not follow a Gaussian Process
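The sieve bootstrap scheme described above can be sketched in a few lines of base R. This is a simplified illustration of the idea behind vavra.test(), not its actual implementation: here the AR order is chosen by AIC via ar(), centred residuals are resampled with replacement, and a burn-in of 100 observations is discarded; the number of replications B and the burn-in length are our assumptions.

```r
# Minimal AR-sieve bootstrap sketch for the Anderson-Darling statistic A_d;
# a simplified illustration, not the actual vavra.test() code.
ad.stat <- function(x) {
  z <- sort((x - mean(x)) / sd(x))   # standardized sample, cf. (6)
  n <- length(z)
  u <- pnorm(z)
  -n - mean((2 * seq_len(n) - 1) * (log(u) + log(1 - rev(u))))
}

sieve.bootstrap.ad <- function(x, B = 200, burn = 100) {
  fit <- ar(x, aic = TRUE)                 # finite-order AR approximation, cf. (7)
  res <- as.numeric(na.omit(fit$resid))
  res <- res - mean(res)                   # centred residuals to resample
  replicate(B, {
    eps.star <- sample(res, length(x) + burn, replace = TRUE)
    x.star <- stats::filter(eps.star, fit$ar, method = "recursive")
    ad.stat(tail(x.star, length(x)))       # bootstrap statistic A_d*
  })
}

set.seed(298)
x <- arima.sim(250, model = list(ar = 0.2, ma = 0.34))
mean(sieve.bootstrap.ad(x) >= ad.stat(x))  # Monte Carlo p-value
```

The Monte Carlo p-value is the proportion of bootstrap statistics at least as large as the observed one, mirroring the logic of the reported bootstrap A value.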
4. Simulations and data analysis
Inspired by the simulation studies of Psaradakis and Vávra (2017) and Nieto-Reyes et al. (2014), this work proposes a similar procedure. The study involves drawing data from the AR(1) process

    X_t = \phi X_{t-1} + \epsilon_t, \qquad t \in \mathbb{Z}, \quad \text{for } \phi \in \{0, \pm 0.25, \pm 0.4\},    (8)

where the \{\epsilon_t\}_{t \in \mathbb{Z}} are i.i.d. random variables. For the distribution of the \epsilon_t we consider different scenarios: standard normal (N), standard log-normal (logN), Student t with 3 degrees of freedom (t3), chi-squared with 10 degrees of freedom (χ²(10)) and beta with parameters (7, 1). As in Psaradakis and Vávra (2017), m = 1,000 independent draws of the above process are generated for each pair of parameter φ and distribution. Each is taken of length past + n, with past = 500 and n ∈ {100, 250, 500, 1000}. The first 500 data points of each realization are then discarded in order to eliminate start-up effects. The n remaining data points are used to compute the value of the test statistic of interest. In each particular scenario, the rejection rate is obtained by computing the proportion of times that the test is rejected among the m trials. Tables 1 and 2 report the results: the columns represent the sample size n and the used AR(1) parameter, and the rows the distribution used to draw the process.

The obtained results are consistent with those obtained in the publications where the different tests were proposed. As expected, rejection rates are around 0.05 when the data are drawn making use of the standard normal distribution, as in this case the data come from a Gaussian process. Conversely, high rejection rates are registered for the other distributions. Although low rejection rates are observed, for instance, for the χ²(10) distribution in the cases of the Epps and random projection tests, they consistently tend to 1 as the length of the process, n, increases. Furthermore, for the random projection test, the number of projections used in this study is k = 10, which is by far a lower number than the one recommended by Nieto-Reyes et al. (2014). However, even in these conditions, the obtained results are satisfactory, having even better performance than the tests of Epps (1987) or Psaradakis and Vávra (2017).

As an illustrative example, we analyze the monthly mean carbon dioxide, in parts per million (ppm), measured at the Mauna Loa Observatory, in Hawaii, from March 1958 to November 2018.
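A single cell of the rejection-rate computation just described can be sketched as follows (shown for the Epps test with log-normal innovations, φ = 0.25 and n = 250; the reduced number of trials m, the seed, and the use of epps.test()'s p-value are our illustrative choices — the full study loops over every test, φ, distribution and n):

```r
# Sketch of one cell of the simulation study: AR(1) with phi = 0.25,
# log-normal innovations, n = 250; reduced m for speed.
library(nortsTest)

phi <- 0.25; n <- 250; past <- 500; m <- 100; alpha <- 0.05
set.seed(298)
rejections <- replicate(m, {
  x <- arima.sim(past + n, model = list(ar = phi), rand.gen = rlnorm)
  x <- x[-seq_len(past)]              # discard start-up effects
  epps.test(x)$p.value < alpha        # assumes a standard htest return value
})
mean(rejections)                      # estimated rejection rate
```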
The carbon dioxide data, measured as the mole fraction in dry air on Mauna Loa, constitute the longest record of direct measurements of CO2. The data set is available in the astsa package (Stoffer 2020) under the name cardox and is displayed in the left panel of Figure 1.

                                    n = 100                            n = 250
distribution \ φ      -0.4  -0.25    0.0   0.25    0.4     -0.4  -0.25    0.0   0.25    0.4
Lobato and Velasco
  χ²(10)             0.553  0.685  0.776  0.667  0.559    0.968  0.995  0.997  0.996  0.964
  beta(7,1)          0.962  0.995  0.999  0.996  0.958    1.000  1.000  1.000  1.000  1.000
Epps
  χ²(10)             0.319  0.465  0.548  0.461  0.361    0.631  0.836  0.917  0.841  0.729
  beta(7,1)          0.781  0.953  0.991  0.960  0.887    0.996  1.000  1.000  1.000  0.999
Random Projections, k = 10
  χ²(10)             0.223  0.231  0.201  0.131  0.086    0.954  0.993  0.992  0.949  0.840
  beta(7,1)          0.605  0.533  0.363  0.184  0.108    1.000  1.000  1.000  1.000  1.000
Psaradakis and Vávra
  χ²(10)             0.500  0.692  0.800  0.660  0.542    0.911  0.985  0.995  0.985  0.922
  beta(7,1)          0.956  1.000  1.000  0.998  0.972    1.000  1.000  1.000  1.000  0.999

Table 1: Rejection rate estimates over m = 1,000 trials of the four studied goodness of fit tests for the null hypothesis of normality. The data is drawn using the process defined in (8) for the different values of φ and n displayed in the columns and the different distributions for ε_t in the rows; φ ∈ {0, ±0.25, ±0.4}, n ∈ {100, 250}.

The objective of this subsection is to propose a model to analyze this time series and check the assumptions on the residuals of the model using our implemented check_residuals() function. The time series clearly has trend and seasonal components (see left panel of Figure 1); therefore, an adequate model that filters both components has to be selected. We propose an ETS model. For its implementation, we make use of the ets() function from the forecast package (Hyndman and Khandakar 2008). This function fits 32 different ETS models and selects the best one according to information criteria such as Akaike's information criterion (AIC) or the Bayesian information criterion (BIC) (Chen and Chen 2008). The results provided by the ets() function are:
                                    n = 500                            n = 1,000
distribution \ φ      -0.4  -0.25    0.0   0.25    0.4     -0.4  -0.25    0.0   0.25    0.4
Lobato and Velasco
  χ²(10)             0.902  0.965  0.996  0.976  0.880    1.000  1.000  1.000  1.000  1.000
  beta(7,1)          1.000  1.000  1.000  1.000  1.000    1.000  1.000  1.000  1.000  1.000
Epps
  N                  0.063  0.077  0.078  0.072  0.073    0.051  0.048  0.052  0.056  0.062
  logN               0.989  1.000  1.000  1.000  0.999    1.000  1.000  1.000  1.000  1.000
  t3                 0.569  0.705  0.781  0.694  0.587    0.999  1.000  1.000  1.000  0.999
  χ²(10)             0.534  0.745  0.859  0.740  0.611    0.999  1.000  1.000  1.000  1.000
  beta(7,1)          0.983  0.998  1.000  1.000  0.989    1.000  1.000  1.000  1.000  1.000
Random Projections, k = 10
  N                  0.016  0.015  0.012  0.019  0.017    0.015  0.016  0.019  0.018  0.018
  logN               1.000  1.000  1.000  1.000  1.000    1.000  1.000  1.000  1.000  1.000
  t3                 1.000  1.000  1.000  1.000  0.999    1.000  1.000  1.000  1.000  1.000
  χ²(10)             1.000  1.000  1.000  1.000  0.993    1.000  1.000  1.000  1.000  1.000
  beta(7,1)          1.000  1.000  1.000  1.000  1.000    1.000  1.000  1.000  1.000  1.000
Psaradakis and Vávra
  N                  0.064  0.046  0.048  0.038  0.050    0.055  0.049  0.045  0.057  0.042
  logN               1.000  1.000  1.000  1.000  0.998    1.000  1.000  1.000  1.000  1.000
  t3                 0.908  0.972  0.982  0.958  0.896    1.000  1.000  1.000  1.000  1.000
  χ²(10)             0.824  0.954  0.988  0.958  0.856    1.000  1.000  1.000  1.000  1.000
  beta(7,1)          1.000  1.000  1.000  0.998  1.000    1.000  1.000  1.000  1.000  1.000

Table 2: Rejection rate estimates over m = 1,000 trials of the four studied goodness of fit tests for the null hypothesis of normality. The data is drawn using the process defined in (8) for the different values of φ and n displayed in the columns and the different distributions for ε_t in the rows; φ ∈ {0, ±0.25, ±0.4}, n ∈ {500, 1000}.

R> library(forecast)
R> library(astsa)
R> model = ets(cardox)
R> summary(model)
ETS(M,A,A)

Call:
 ets(y = cardox)

Smoothing parameters:
    alpha = 0.5591
    beta  = 0.0072
    gamma = 0.1061

Initial states:
    l = 314.6899
    b = 0.0696
    s = 0.6611 0.0168 -0.8536 -1.9095 -3.0088 -2.7503
        -1.2155 0.6944 2.1365 2.7225 2.3051 1.2012

sigma: 9e-04

     AIC     AICc      BIC
3136.280 3137.140 3214.338

Training set error measures:
                  ME   RMSE     MAE       MPE     MAPE      MASE       ACF1
Training set 0.02324 0.3120 0.24308 0.0063088 0.068840 0.1559102 0.07275949
The resulting model proposed for analyzing the carbon dioxide data at Mauna Loa is an ETS(M,A,A) model. The parameters α, β and γ (see Definition 7) have been estimated using the least squares method. If the assumptions on the model are satisfied, then the errors of the model behave like a stationary Gaussian process. To check this, we make use of the check_residuals() function. For more details on the compatibility of this function with the models obtained from other packages, see the nortsTest repository. In the following, we display the results of using the Augmented Dickey-Fuller test to check the stationarity assumption and the random projection test with k = 64 projections to check the normality assumption. For the other test options, see the function's documentation.
R> check_residuals(model, unit_root = "adf", normality = "rp", plot = TRUE)

***************************************************

Unit root test for stationarity:

	Augmented Dickey-Fuller Test

data: y
Dickey-Fuller = -9.7249, Lag order = 8, p-value = 0.01
alternative hypothesis: stationary

Conclusion: y is stationary
***************************************************

Goodness of fit test for Gaussian Distribution:

	k random projections test

data: y
k = 64, lobato = 3.679, epps = 1.3818, p-value = 0.5916
alternative hypothesis: y does not follow a Gaussian Process

Conclusion: y follows a Gaussian Process
***************************************************
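The same diagnostics can also be run piecewise on the extracted residuals; the choice of tseries::adf.test() for the unit root and vavra.test() for normality below is our own illustrative one:

```r
# Equivalent piecewise checks on the extracted residuals; check_residuals()
# wraps this kind of workflow, so the pairing of tests here is illustrative.
library(tseries)     # for adf.test()
library(nortsTest)

e <- residuals(model)     # model fitted above with ets(cardox)
adf.test(e)               # stationarity: Augmented Dickey-Fuller test
vavra.test(e)             # normality of the stationary residuals
```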
The obtained results indicate that the null hypothesis of non-stationarity is rejected at significance level α = 0.05. Additionally, there is no evidence to reject the null hypothesis of normality at significance level α = 0.05. Consequently, we conclude that the residuals follow a stationary Gaussian process, so the resulting ETS(M,A,A) model adjusts well to the carbon dioxide data at Mauna Loa.

In the check_residuals() call displayed above, the plot argument is set to TRUE; the resulting plots are shown in Figure 2. The plot in the top panel and the auto-correlation plots in the bottom panels suggest that the residuals have a stationary behaviour: the top panel shows slight oscillations around zero, and the auto-correlation functions in the bottom panels have values close to zero at every lag. The histogram and qq-plot in the middle panels suggest that the marginal distribution of the residuals is normally distributed. Therefore, Figure 2 agrees with the reported results, indicating that the assumptions of the model are satisfied.

As the assumptions of the model have been checked, it can be used, for instance, to forecast. The result of applying the following function is displayed in the right panel of Figure 1. It presents the carbon dioxide data for the last 8 years and a forecast for the next 12 months. It is observable from the plot that the model captures the trend and periodicity of the process.
R> autoplot(forecast(model, h = 12), include = 100, xlab = "years",
+           ylab = "CO2 (ppm)", main = "Forecast: Carbon Dioxide Levels at Mauna Loa")
5. Conclusions
This work gives a general overview of a careful selection of tests for normality in stationary processes, covering the majority of the available types of test for this matter. It additionally provides examples that illustrate each of the test implementations.

For independent data, the nortest package (Gross and Ligges 2015) provides five different tests for normality, the mvnormtest package (Jarek 2012) performs the Shapiro-Wilk test for multivariate data, and the MissMech package (Jamshidian, Jalal, and Jansen 2014) provides tests for normality in multivariate incomplete data. To test normality of dependent data, some authors, such as Psaradakis and Vávra (2017) and Nieto-Reyes et al. (2014), have made undocumented Matlab code available, mainly useful only for reproducing their simulation studies. To our knowledge, however, no consistent implementation or package of a selection of tests for normality has been done before. Therefore, nortsTest is the first package that provides implementations of tests for normality in stationary processes.

For checking model assumptions, the forecast and astsa packages contain functions for diagnostic visualization. Following the same idea, nortsTest provides similar diagnostic methods, in addition to a report of the results of testing stationarity and normality, the main assumptions on the residuals in time series analysis.

Future work and projects
The second version of the nortsTest package will incorporate (i) additional tests, such as the Bispectral (Hinich 1982) and Stein's characterization (Bontemps and Meddahi 2005) tests, (ii) upgrades in the optimization and bootstrap procedures of the Epps and Psaradakis & Vávra tests, for faster performance, and (iii) different implementations of the Skewness-Kurtosis test besides the one of Lobato & Velasco. Further future work will include a Bayesian version of a residuals check procedure that makes use of the random projection method.
Acknowledgments
I.A.M. thanks the Carolina Foundation for the fellowship that has led to this work. A.N-R. is partially supported by the Spanish Ministry of Science, Innovation and Universities grant MTM2017-86061-C2-2-P.
References
Anderson TW, Darling DA (1952). "Asymptotic Theory of Certain Goodness of Fit Criteria Based on Stochastic Processes." Annals of Mathematical Statistics, (2), 193–212. doi:10.1214/aoms/1177729437.

Bai J, Ng S (2005). "Tests for Skewness, Kurtosis, and Normality for Time Series Data." Journal of Business & Economic Statistics, (1), 49–60. doi:10.1198/073500104000000271.

Beaulieu J, Miron JA (1993). "Seasonal Unit Roots in Aggregate U.S. Data." Journal of Econometrics, (1), 305–328. ISSN 0304-4076. doi:10.1016/0304-4076(93)90018-Z.

Benjamini Y, Yekutieli D (2001). "The Control of the False Discovery Rate in Multiple Testing under Dependency." The Annals of Statistics, (4), 1165–1188. ISSN 0090-5364.

Berg A, Paparoditis E, Politis DN (2010). "A Bootstrap Test for Time Series Linearity." Journal of Statistical Planning and Inference, (12), 3841–3857. ISSN 0378-3758. doi:10.1016/j.jspi.2010.04.047.

Bollerslev T (1986). "Generalized Autoregressive Conditional Heteroskedasticity." Journal of Econometrics, (3), 307–327. ISSN 0304-4076. doi:10.1016/0304-4076(86)90063-1.

Bontemps C, Meddahi N (2005). "Testing Normality: A GMM Approach." Journal of Econometrics, (1), 149–186. ISSN 0304-4076. doi:10.1016/j.jeconom.2004.02.014.

Box G, Pierce DA (1970). "Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models." Journal of the American Statistical Association, (332), 1509–1526. doi:10.1080/01621459.1970.10481180.

Box GEP, Jenkins G (1990). Time Series Analysis, Forecasting and Control. Holden-Day, Inc., USA. ISBN 0816211043.

Bühlmann P (1997). "Sieve Bootstrap for Time Series." Bernoulli, (2), 123–148. ISSN 1350-7265.

Canova F, Hansen BE (1995). "Are Seasonal Patterns Constant Over Time? A Test for Seasonal Stability." Journal of Business & Economic Statistics, (3), 237–252. doi:10.1080/07350015.1995.10524598.

Chen J, Chen Z (2008). "Extended Bayesian Information Criteria for Model Selection With Large Model Spaces." Biometrika, (3), 759–771. ISSN 0006-3444. doi:10.1093/biomet/asn034.

Cuesta-Albertos J, del Barrio E, Fraiman R, Matrán C (2007). "The Random Projection Method in Goodness of Fit for Functional Data." Computational Statistics & Data Analysis, (10), 4814–4831. ISSN 0167-9473. doi:10.1016/j.csda.2006.09.007.

D'Agostino RB, Stephens MA (1986). "Goodness-of-Fit Techniques." Quality and Reliability Engineering International, (1), 71–71. doi:10.1002/qre.4680030121.

de Lacalle JL (2019). uroot: Unit Root Tests for Seasonal Time Series. R package version 2.1-0, URL https://CRAN.R-project.org/package=uroot.

Engle RF (1982). "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation." Econometrica, (4), 987–1007. ISSN 0012-9682.

Epps TW (1987). "Testing That a Stationary Time Series is Gaussian." The Annals of Statistics, (4), 1683–1698. doi:10.1214/aos/1176350618.

Gasser T (1975). "Goodness-of-Fit Tests for Correlated Data." Biometrika, (3), 563–570. ISSN 0006-3444.

Gelman A, Carlin J, Stern H, Dunson D, Vehtari A, Rubin D (2013). Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis. ISBN 9781439840955. URL https://books.google.nl/books?id=ZXL6AQAAQBAJ.

Gross J, Ligges U (2015). nortest: Tests for Normality. R package version 1.0-4, URL https://CRAN.R-project.org/package=nortest.

Hinich MJ (1982). "Testing For Gaussianity and Linearity of a Stationary Time Series." Journal of Time Series Analysis, (3), 169–176. doi:10.1111/j.1467-9892.1982.tb00339.

Holt CC (2004). "Forecasting Seasonals and Trends by Exponentially Weighted Moving Averages." International Journal of Forecasting, (1), 5–10. ISSN 0169-2070. doi:10.1016/j.ijforecast.2003.09.015.

Hyndman R, Khandakar Y (2008). "Automatic Time Series Forecasting: The forecast Package for R." Journal of Statistical Software, Articles, (3), 1–22. ISSN 1548-7660. doi:10.18637/jss.v027.i03.

Hyndman RJ, Koehler AB, Ord JK, Snyder RD (2008). Forecasting with Exponential Smoothing: The State Space Approach. Springer. ISBN 9783540719168. doi:10.1111/j.1751-5823.2009.00085_17.

Jamshidian M, Jalal S, Jansen C (2014). "MissMech: An R Package for Testing Homoscedasticity, Multivariate Normality, and Missing Completely at Random (MCAR)." Journal of Statistical Software, (6), 1–31.

Jarek S (2012). mvnormtest: Normality Test for Multivariate Variables. R package version 0.1-9, URL https://CRAN.R-project.org/package=mvnormtest.

Jarque CM, Bera AK (1980). "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals." Economics Letters, (3), 255–259. ISSN 0165-1765. doi:10.1016/0165-1765(80)90024-5.

Kwiatkowski D, Phillips PC, Schmidt P, Shin Y (1992). "Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root: How Sure Are We that Economic Time Series Have a Unit Root?" Journal of Econometrics, (1), 159–178. ISSN 0304-4076. doi:10.1016/0304-4076(92)90104-Y.

Lobato I, Velasco C (2004). "A Simple Test of Normality for Time Series." Econometric Theory, 671–689. doi:10.1017/S0266466604204030.

Lomnicki Z (1961). "Tests for Departure from Normality in the Case of Linear Stochastic Processes." Metrika: International Journal for Theoretical and Applied Statistics, (1), 37–62. URL https://EconPapers.repec.org/RePEc:spr:metrik:v:4:y:1961:i:1:p:37-62.

Nieto-Reyes A, Cuesta-Albertos JA, Gamboa F (2014). "A Random-Projection Based Test of Gaussianity for Stationary Processes." Computational Statistics & Data Analysis, 124–141. ISSN 0167-9473. doi:10.1016/j.csda.2014.01.013.

Osborn DR, Chui APL, Smith JP, Birchenhall CR (1988). "Seasonality and the Order of Integration for Consumption." Oxford Bulletin of Economics and Statistics, (4), 361–377. doi:10.1111/j.1468-0084.1988.mp50004002.x.

Pearson K, Henrici OMFE (1895). "X. Contributions to the Mathematical Theory of Evolution. Skew Variation in Homogeneous Material." Philosophical Transactions of the Royal Society of London (A), 343–414. doi:10.1098/rsta.1895.0010.

Perron P (1988). "Trends and Random Walks in Macroeconomic Time Series: Further Evidence From a New Approach." Journal of Economic Dynamics and Control, (2), 297–332. ISSN 0165-1889. doi:10.1016/0165-1889(88)90043-7.

Petris G, Petrone S, Campagnoli P (2007). "Dynamic Linear Models with R." doi:10.1111/j.1751-5823.2010.00109_26.x.

Psaradakis Z (2017). "Normality Tests for Dependent Data." Working and Discussion Papers WP 12/2017, Research Department, National Bank of Slovakia. URL https://ideas.repec.org/p/svk/wpaper/1053.html.

Psaradakis Z, Vávra M (2017). "A Distance Test of Normality for a Wide Class of Stationary Processes." Econometrics and Statistics, 50–60. ISSN 2452-3062. doi:10.1016/j.ecosta.2016.11.005.

Qiu D (2015). aTSA: Alternative Time Series Analysis. R package version 3.1.2, URL https://CRAN.R-project.org/package=aTSA.

R Core Team (2018). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Royston JP (1982). "An Extension of Shapiro and Wilk's W Test for Normality to Large Samples." Journal of the Royal Statistical Society. Series C (Applied Statistics), (2), 115–124. ISSN 0035-9254.

Said SE, Dickey DA (1984). "Testing for Unit Roots in Autoregressive-Moving Average Models of Unknown Order." Biometrika, (3), 599–607. ISSN 0006-3444. doi:10.1093/biomet/71.3.599.

Shapiro SS, Wilk MB (1965). "An Analysis of Variance Test for Normality (Complete Samples)." Biometrika, (3-4), 591–611. ISSN 0006-3444. doi:10.1093/biomet/52.3-4.591.

Shumway R, Stoffer D (2010). Time Series Analysis and Its Applications: With R Examples. Springer Texts in Statistics. Springer New York. ISBN 9781441978646. URL https://books.google.es/books?id=dbS5IQ8P5gYC.

Smirnov N (1948). "Table for Estimating the Goodness of Fit of Empirical Distributions." Annals of Mathematical Statistics, (2), 279–281. doi:10.1214/aoms/1177730256.

Stoffer D (2020). astsa: Applied Statistical Time Series Analysis. R package version 1.10, URL https://CRAN.R-project.org/package=astsa.

Trapletti A, Hornik K (2019). tseries: Time Series Analysis and Computational Finance. R package version 0.10-47, URL https://CRAN.R-project.org/package=tseries.

Tsay R (2010). Analysis of Financial Time Series. Second edition. Wiley-Interscience, Chicago. ISBN 978-0470414354. doi:10.1002/0471264105.

Wasserman L (2006). All of Nonparametric Statistics. Springer, New York. ISBN 9780387251455. doi:10.1007/0-387-30623-4.

West M, Harrison J (2006). Bayesian Forecasting and Dynamic Models. Springer Series in Statistics. Springer New York. ISBN 9780387227771. URL https://books.google.nl/books?id=0mPgBwAAQBAJ.

Wickham H (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-0-387-98140-6. URL http://ggplot2.org.