Goodness-of-fit tests for parametric regression models with circular response
Andrea Meilán-Vila, Mario Francisco-Fernández, Rosa M. Crujeiras
Andrea Meilán-Vila (Universidade da Coruña) ∗, Mario Francisco-Fernández (Universidade da Coruña) ∗, Rosa M. Crujeiras (Universidade de Santiago de Compostela) †

Abstract
Testing procedures for assessing a parametric regression model with circular response and $\mathbb{R}^d$-valued covariate are proposed and analyzed in this work, both for independent and for spatially correlated data. The test statistics are based on a circular distance comparing a (non-smoothed or smoothed) parametric circular estimator and a nonparametric one. Properly designed bootstrap procedures for calibrating the tests in practice are also presented. The finite sample performance of the tests in different scenarios, with independent and spatially correlated samples, is analyzed by simulations.

Keywords: Model checking, Circular data, Local polynomial regression, Spatial correlation, Bootstrap
∗ Research group MODES, CITIC, Department of Mathematics, Faculty of Computer Science, Universidade da Coruña, Campus de Elviña s/n, 15071, A Coruña, Spain
† Department of Statistics, Mathematical Analysis and Optimization, Faculty of Mathematics, Universidade de Santiago de Compostela, Rúa Lope Gómez de Marzoa s/n, 15782, Santiago de Compostela, Spain
arXiv preprint [stat.ME]

In many scientific fields, such as oceanography, meteorology or biology, data are angular measurements (points on the unit circle), which are accompanied by auxiliary observations of other Euclidean random variables. The joint behavior of these circular and Euclidean variables can be analyzed by considering a regression model, which both explains the possible relation between the variables and allows predictions of the variable of interest. Parametric regression estimators for linear-circular models (circular response and Euclidean covariates) with independent data have been studied by Fisher and Lee (1992), Presnell et al. (1998), and Kim and SenGupta (2017), among others. In the presence of spatial correlation, Jona-Lasinio et al. (2012), Wang and Gelfand (2014), Lagona et al. (2015) and Mastrantonio et al. (2016), for instance, employed parametric methods to model circular spatial processes. Alternatively, nonparametric regression approaches can be used to deal with these inference problems. For this purpose, kernel-type estimators of the regression function in a model with a circular response and an $\mathbb{R}^d$-valued covariate have been introduced by Meilán-Vila et al. (2020a,c). Notice that if the bandwidth matrix is appropriately chosen, these methods provide more flexible estimators than those obtained using parametric approaches, avoiding misspecification problems. However, if a parametric regression model is assumed and it holds, parametric methods usually provide estimators which are more efficient and easier to interpret.
In this context, goodness-of-fit tests can be designed, providing a tool for assessing a general class of parametric linear-circular regression models.

There is a substantial literature on testing parametric regression models involving Euclidean data, including Kozek (1991), Härdle and Mammen (1993), González Manteiga and Vilar Fernández (1995), Biedermann and Dette (2000), Park et al. (2015), Meilán-Vila et al. (2020b), and Meilán-Vila et al. (2020d), among others. See González-Manteiga and Crujeiras (2013) for a review. These testing procedures are based on measuring differences between a suitable parametric estimator under the null hypothesis and a nonparametric one. Specifically, $L_2$-norm or supremum-norm tests, among others, can be employed for regression models with Euclidean responses and covariates. In the context of regression models with directional response and directional or Euclidean explanatory variables, the literature on goodness-of-fit tests is relatively scarce. In this setting, Deschepper et al. (2008) proposed an exploratory tool and a lack-of-fit test for circular-linear regression models (Euclidean response and circular covariates). The same problem was studied by García-Portugués et al. (2016), using nonparametric methods. The authors proposed a testing procedure based on the weighted squared distance between a smooth and a parametric regression estimator, where the smooth regression estimator was obtained by a projected local regression on the sphere.
However, to the best of the authors' knowledge, the problem of assessing a certain class of parametric linear-circular regression models, that is, regression models with circular response and $\mathbb{R}^d$-valued covariates, has not yet been considered in the statistical literature, neither for independent nor for spatially dependent observations.

In this work, new approaches for testing a linear-circular parametric regression model (circular response and $\mathbb{R}^d$-valued covariate) are proposed and analyzed, both for independent and for spatially correlated errors. The test statistics employed in these procedures are based on a comparison between a (non-smoothed or smoothed) parametric fit under the null hypothesis and a nonparametric estimator of the circular regression function. More specifically, two different test statistics are considered. In the first one, the parametric estimator of the regression function under the null hypothesis is directly used, while in the second one, a smooth version of this estimator is employed. Notice that, in this framework, a suitable measure of circular distance must be employed (see Jammalamadaka and SenGupta, 2001, Section 1.3.2). The null hypothesis that the regression function belongs to a certain parametric family is rejected if the distance between both fits exceeds a certain threshold. To perform the parametric estimation, a circular analog to least squares regression is used (see Fisher and Lee, 1992; Lund, 1999). For the nonparametric alternative, kernel-type regression estimators (Meilán-Vila et al., 2020a,c) are considered.

For the practical application of the proposals, the test statistics should be accompanied by a calibration procedure. In this case, calibration is not based on the asymptotic distribution, given that the convergence to the limit distribution under the null hypothesis will presumably be too slow. Different bootstrap methods are designed, and their performance is analyzed and compared in empirical experiments.
For independent data, standard resampling procedures adapted to the context of regression models with circular response are used: a parametric circular residual bootstrap (PCB) and a nonparametric circular residual bootstrap (NPCB). The PCB approach uses the residuals obtained from the parametric fit in the bootstrap algorithm. If the circular regression function belongs to the parametric family considered in the null hypothesis, then the residuals will tend to be quite similar to the theoretical errors and, therefore, the PCB method is expected to perform well. Following the proposal by González-Manteiga and Cao (1993), the NPCB method aims to increase the power of the test and, for this purpose, the residuals obtained from the nonparametric fit are the ones employed in the bootstrap procedure.

The previous resampling procedures (PCB and NPCB) for independent data must be properly adapted to handle spatial correlation. Two specific procedures for test calibration which take the spatial correlation into account are also introduced: a parametric spatial circular residual bootstrap (PSCB) and a nonparametric spatial circular residual bootstrap (NPSCB). Similarly to the PCB, but now for spatially correlated errors, the PSCB considers the residuals obtained from the parametric fit under the null hypothesis. The relevant difference between PCB and PSCB is that, in order to mimic the dependence structure of the errors, a spatial circular process is fitted to the residuals in PSCB. Samples coming from the fitted process are employed in the bootstrap algorithm. The steps followed in NPSCB are similar to those employed in PSCB, but the residuals are obtained from the nonparametric regression estimator.

This paper is organized as follows. Section 2 presents some ideas on goodness-of-fit tests for circular regression models.
The parametric and nonparametric circular regression estimators employed in the test statistics are presented in Sections 2.1 and 2.2, respectively. Section 2.3 introduces the testing problem and the proposed test statistics. A description of the calibration algorithms considered is given in Section 2.4. Section 2.5 contains a simulation study for assessing the performance of the tests when using the PCB and NPCB resampling approaches to approximate the sampling distribution of the test statistics. The extension of the testing procedures to spatially correlated data is presented in Section 3. Section 3.1 contains bootstrap approaches to calibrate the tests in this spatial framework. A simulation study for assessing the performance of the tests using the PSCB and NPSCB methods is provided in Section 3.2. Finally, some conclusions and ideas for further research are provided in Section 4.

Let $\{(\mathbf{X}_i, \Theta_i)\}_{i=1}^{n}$ be a random sample from $(\mathbf{X}, \Theta)$, where $\Theta$ is a circular random variable taking values on $\mathbb{T} = [0, 2\pi)$, and $\mathbf{X}$ is a random variable with density $f$ and support on $D \subseteq \mathbb{R}^d$. Assume that the following regression model holds:

$$\Theta_i = [m(\mathbf{X}_i) + \varepsilon_i] \ (\mathrm{mod}\ 2\pi), \quad i = 1, \ldots, n, \tag{1}$$

where $m$ is a circular regression function, and $\varepsilon_i$, $i = 1, \ldots, n$, is an independent sample of a circular variable $\varepsilon$, with zero mean direction and finite concentration. This implies that $E[\sin(\varepsilon) \mid \mathbf{X} = \mathbf{x}] = 0$. Additionally, the following notation is used: $\ell(\mathbf{x}) = E[\cos(\varepsilon) \mid \mathbf{X} = \mathbf{x}]$, $\sigma_1^2(\mathbf{x}) = \mathrm{Var}[\sin(\varepsilon) \mid \mathbf{X} = \mathbf{x}]$, $\sigma_2^2(\mathbf{x}) = \mathrm{Var}[\cos(\varepsilon) \mid \mathbf{X} = \mathbf{x}]$ and $\sigma_{12}(\mathbf{x}) = E[\sin(\varepsilon)\cos(\varepsilon) \mid \mathbf{X} = \mathbf{x}]$.

Considering the regression model (1), one of the aims of the present research is to propose and study different testing procedures to assess the suitability of a general class of parametric circular regression models. Specifically, in this work, we focus on the following testing problem:

$$H_0: m \in \mathcal{M}_{\boldsymbol\beta} = \{\beta_0 + g(\boldsymbol\beta_1^{T}\mathbf{X}),\ \beta_0 \in \mathbb{R},\ \boldsymbol\beta_1 \in \mathbb{R}^d\} \quad \text{vs.} \quad H_a: m \notin \mathcal{M}_{\boldsymbol\beta}, \tag{2}$$

where $g$ is a link function mapping the real line onto the circle. As pointed out in Section 1, the procedure proposed in this work consists in comparing a (non-smoothed or smoothed) parametric fit with a nonparametric estimator of the circular regression function $m$, measuring the circular distance between both fits and employing this distance as a test statistic. The parametric and nonparametric estimation methods considered in this proposal are described in what follows.

As mentioned in the Introduction, our proposal requires a parametric estimator of the circular regression function $m$, once a parametric family is set as the null hypothesis. Notice that, for instance, least squares procedures for Euclidean data are not appropriate when the response variable is of circular nature. Minimizing the sum of squared differences between the observed and predicted values may lead to erroneous results, since the squared difference is not an appropriate measure on the circle.

A circular analog to least squares regression for models with a circular response and a set of Euclidean covariates was presented by Lund (1999). Specifically, assume the regression model (1) holds and consider $m \in \mathcal{M}^{c}_{\boldsymbol\beta} = \{m_{\boldsymbol\beta},\ \boldsymbol\beta \in \mathcal{B}\}$, where $m_{\boldsymbol\beta}$ is a certain parametric circular regression model with parameter vector $\boldsymbol\beta$. A parameter estimate of $\boldsymbol\beta$ can be obtained by minimizing the sum of the circular distances between the observed and predicted values as follows:

$$\hat{\boldsymbol\beta} = \arg\min_{\boldsymbol\beta} \sum_{i=1}^{n} \{1 - \cos[\Theta_i - m_{\boldsymbol\beta}(\mathbf{X}_i)]\}. \tag{3}$$

The value of the parameter minimizing the previous expression will be used to construct the parametric circular regression estimator, namely, $m_{\hat{\boldsymbol\beta}}$.

An equivalent parameter estimator can be obtained using a maximum likelihood approach (Lund, 1999).
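As a rough illustration of the criterion in (3), the following Python sketch fits the one-covariate family $m_{\boldsymbol\beta}(x) = \beta_0 + 2\,\mathrm{atan}(\beta_1 x)$ used later in the simulations. The derivative-free coordinate search, the function names `circular_loss` and `fit_circular_ls`, and the starting values are assumptions made for this example only; they are not an algorithm taken from the cited references.

```python
import math


def circular_loss(beta, x, theta):
    """Sum of circular distances 1 - cos(theta_i - m_beta(x_i)), as in (3)."""
    b0, b1 = beta
    return sum(
        1.0 - math.cos(t - (b0 + 2.0 * math.atan(b1 * xi)))
        for xi, t in zip(x, theta)
    )


def fit_circular_ls(x, theta, start=(0.0, 0.5), step=0.5, tol=1e-6):
    """Minimize the circular loss by a simple coordinate (compass) search.

    The search moves one coordinate at a time by +/- step, keeps any move
    that lowers the loss, and halves the step when no move helps.
    This optimizer is an illustrative assumption, not the authors' method.
    """
    beta = list(start)
    best = circular_loss(beta, x, theta)
    while step > tol:
        improved = False
        for j in range(len(beta)):
            for delta in (step, -step):
                cand = beta[:]
                cand[j] += delta
                val = circular_loss(cand, x, theta)
                if val < best:
                    beta, best, improved = cand, val, True
        if not improved:
            step /= 2.0
    return beta, best


# Noise-free angles generated from beta0 = 0, beta1 = 1 on the unit interval.
x = [i / 49 for i in range(50)]
theta = [(2.0 * math.atan(xi)) % (2.0 * math.pi) for xi in x]
beta_hat, loss = fit_circular_ls(x, theta)
```

Because the loss is periodic in the angle, a poor starting value may lead such a local search to a different stationary point, so in practice several starting values would be advisable.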
If it is assumed that the response variable (conditional on $\mathbf{X}$) has a von Mises distribution with mean direction given by $m_{\boldsymbol\beta}$ and concentration parameter $\kappa$, the maximum likelihood estimator of $m_{\boldsymbol\beta}$ maximizes the following expression:

$$\sum_{i=1}^{n} \cos[\Theta_i - m_{\boldsymbol\beta}(\mathbf{X}_i)]. \tag{4}$$

Notice that the objective in (3) equals $n$ minus the expression (4), so the circular least squares estimator given in (3) also maximizes (4). Therefore, assuming a von Mises distribution, the circular least squares estimator coincides with the maximum likelihood estimator. For further details see Lund (1999).

Assuming that the response variable follows a von Mises distribution and taking as $\mathcal{M}^{c}_{\boldsymbol\beta}$ the parametric family $\mathcal{M}_{\boldsymbol\beta}$ given in (2), an iteratively reweighted least squares algorithm can be used to compute the maximum likelihood estimators of $\kappa$, $\beta_0$ and $\boldsymbol\beta_1$ (see Lund, 1999; Fisher and Lee, 1992). The extension of these results to the case of a generic parametric family has not been explicitly considered.

A nonparametric regression estimator for $m$ in model (1) is presented in this section. Notice that the circular regression function $m$ is the conditional mean direction of $\Theta$ given $\mathbf{X}$ which, at a point $\mathbf{x}$, can be defined as the minimizer of the risk $E\{1 - \cos[\Theta - m(\mathbf{X})] \mid \mathbf{X} = \mathbf{x}\}$. Specifically, the minimizer of this cosine risk is given by $m(\mathbf{x}) = \mathrm{atan2}[m_1(\mathbf{x}), m_2(\mathbf{x})]$, where $m_1(\mathbf{x}) = E[\sin(\Theta) \mid \mathbf{X} = \mathbf{x}]$ and $m_2(\mathbf{x}) = E[\cos(\Theta) \mid \mathbf{X} = \mathbf{x}]$. Therefore, replacing $m_1$ and $m_2$ by appropriate estimators, an estimator for $m$ can be directly obtained. In particular, a whole class of kernel-type estimators for $m$ at $\mathbf{x} \in D$ can be defined by considering local polynomial estimators for $m_1(\mathbf{x})$ and $m_2(\mathbf{x})$. Specifically, estimators of the form

$$\hat m_{\mathbf{H}}(\mathbf{x}; p) = \mathrm{atan2}[\hat m_{1,\mathbf{H}}(\mathbf{x}; p),\ \hat m_{2,\mathbf{H}}(\mathbf{x}; p)] \tag{5}$$

are considered, where for any integer $p \geq 0$, $\hat m_{1,\mathbf{H}}(\mathbf{x}; p)$ and $\hat m_{2,\mathbf{H}}(\mathbf{x}; p)$ denote the $p$th order local polynomial estimators (with bandwidth matrix $\mathbf{H}$) of $m_1(\mathbf{x})$ and $m_2(\mathbf{x})$, respectively. The special cases $p = 0$ and $p = 1$ yield a Nadaraya–Watson (or local constant) type estimator and a local linear type estimator of $m(\mathbf{x})$, respectively. Asymptotic properties of these estimators, considering model (1), have been studied by Meilán-Vila et al. (2020c).

In this section, goodness-of-fit tests are presented in order to check whether the circular regression function belongs to a general class of parametric models. We consider the testing problem (2).

Test statistics to address (2) are proposed and studied. The first approach considers a weighted circular distance between the nonparametric and parametric fits:

$$T^{1}_{n,p} = \int_{D} \{1 - \cos[\hat m_{\mathbf{H}}(\mathbf{x}; p) - m_{\hat{\boldsymbol\beta}}(\mathbf{x})]\}\, w(\mathbf{x})\, d\mathbf{x}, \tag{6}$$

for $p = 0, 1$, where $w$ is a weight function that helps in mitigating possible boundary effects. The estimators $\hat m_{\mathbf{H}}(\mathbf{x}; p)$, for $p = 0, 1$, are the Nadaraya–Watson or the local linear type estimators of the circular regression function $m$, given in (5). The parametric estimator $m_{\hat{\boldsymbol\beta}}$ was described in Section 2.1.

The second approach is similar to the first one, but considers a smooth version of the parametric fit:

$$T^{2}_{n,p} = \int_{D} \{1 - \cos[\hat m_{\mathbf{H}}(\mathbf{x}; p) - \hat m_{\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; p)]\}\, w(\mathbf{x})\, d\mathbf{x}, \tag{7}$$

where $\hat m_{\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; p)$, for $p = 0, 1$, are smooth versions of the parametric estimator $m_{\hat{\boldsymbol\beta}}$, which are given by

$$\hat m_{\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; p) = \mathrm{atan2}[\hat m_{1,\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; p),\ \hat m_{2,\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; p)],$$

with

$$\hat m_{j,\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; 0) =
\begin{cases}
\dfrac{\sum_{i=1}^{n} K_{\mathbf{H}}(\mathbf{X}_i - \mathbf{x}) \sin[m_{\hat{\boldsymbol\beta}}(\mathbf{X}_i)]}{\sum_{i=1}^{n} K_{\mathbf{H}}(\mathbf{X}_i - \mathbf{x})} & \text{if } j = 1, \\[2ex]
\dfrac{\sum_{i=1}^{n} K_{\mathbf{H}}(\mathbf{X}_i - \mathbf{x}) \cos[m_{\hat{\boldsymbol\beta}}(\mathbf{X}_i)]}{\sum_{i=1}^{n} K_{\mathbf{H}}(\mathbf{X}_i - \mathbf{x})} & \text{if } j = 2,
\end{cases}$$

$$\hat m_{j,\mathbf{H},\hat{\boldsymbol\beta}}(\mathbf{x}; 1) =
\begin{cases}
\mathbf{e}_1^{T} (\mathbf{X}_{\mathbf{x}}^{T} \mathbf{W}_{\mathbf{x}} \mathbf{X}_{\mathbf{x}})^{-1} \mathbf{X}_{\mathbf{x}}^{T} \mathbf{W}_{\mathbf{x}} \hat{\mathbf{S}} & \text{if } j = 1, \\
\mathbf{e}_1^{T} (\mathbf{X}_{\mathbf{x}}^{T} \mathbf{W}_{\mathbf{x}} \mathbf{X}_{\mathbf{x}})^{-1} \mathbf{X}_{\mathbf{x}}^{T} \mathbf{W}_{\mathbf{x}} \hat{\mathbf{C}} & \text{if } j = 2,
\end{cases}$$

where $\hat{\mathbf{S}} = (\sin[m_{\hat{\boldsymbol\beta}}(\mathbf{X}_1)], \ldots, \sin[m_{\hat{\boldsymbol\beta}}(\mathbf{X}_n)])^{T}$ and $\hat{\mathbf{C}} = (\cos[m_{\hat{\boldsymbol\beta}}(\mathbf{X}_1)], \ldots, \cos[m_{\hat{\boldsymbol\beta}}(\mathbf{X}_n)])^{T}$.

In order to formally address problem (2) using the test statistics $T^{1}_{n,p}$ and $T^{2}_{n,p}$ given in (6) and (7), respectively, it is essential to approximate the distribution of the test statistic under the null hypothesis. Deriving the asymptotic distribution of the statistics is out of the scope of this work. However, some guidelines to compute these expressions are provided in Section 4. For the practical application of our proposal, the distribution of the tests under the null hypothesis is approximated using bootstrap procedures and analyzed through an empirical study.

If the null hypothesis in the testing problem (2) holds, then the (non-smoothed or smoothed) parametric fit and the nonparametric circular regression estimator will be similar and, therefore, the values of the test statistics $T^{1}_{n,p}$ and $T^{2}_{n,p}$ will be relatively small. Conversely, if the null hypothesis does not hold, the fits will be different and the values of $T^{1}_{n,p}$ and $T^{2}_{n,p}$ will be fairly large. So, the null hypothesis will be rejected if the circular distance between both fits exceeds a critical value.
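To make the two statistics more concrete, the sketch below evaluates (6) and (7) for $d = 1$ and $p = 0$ (Nadaraya–Watson type weights) with a triweight kernel, approximating the integral by a midpoint rule. The helper names (`nw_circular`, `circular_distance_integral`, `gof_statistics`), the grid size and the example data are assumptions of this illustration; the indicator weight on $[1/\sqrt{n}, 1 - 1/\sqrt{n}]$ mirrors the one used in the simulation study, and the alternative curve is only an example of a departure from the null.

```python
import math


def triweight(u):
    """Triweight kernel K(u) = (35/32)(1 - u^2)^3 on [-1, 1]."""
    return 35.0 / 32.0 * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0


def nw_circular(x0, x, angles, h):
    """Nadaraya-Watson type estimator (5) with p = 0 and d = 1.

    The common denominator sum(w) of the sine and cosine smoothers
    cancels inside atan2, so it is not computed explicitly.
    """
    w = [triweight((xi - x0) / h) for xi in x]
    s = sum(wi * math.sin(a) for wi, a in zip(w, angles))
    c = sum(wi * math.cos(a) for wi, a in zip(w, angles))
    return math.atan2(s, c)


def circular_distance_integral(f, g, lo, hi, n_grid=200):
    """Midpoint-rule approximation of the integral of 1 - cos(f - g) on [lo, hi]."""
    dx = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        t = lo + (k + 0.5) * dx
        total += 1.0 - math.cos(f(t) - g(t))
    return total * dx


def gof_statistics(x, theta, m_par, h):
    """Return (T1, T2): statistic (6) and its smoothed counterpart (7)."""
    n = len(x)
    lo, hi = 1.0 / math.sqrt(n), 1.0 - 1.0 / math.sqrt(n)  # support of w
    fitted = [m_par(xi) for xi in x]
    m_np = lambda t: nw_circular(t, x, theta, h)    # nonparametric fit
    m_sm = lambda t: nw_circular(t, x, fitted, h)   # smoothed parametric fit
    t1 = circular_distance_integral(m_np, m_par, lo, hi)
    t2 = circular_distance_integral(m_np, m_sm, lo, hi)
    return t1, t2


# Noise-free example: data on the null curve versus a shifted alternative.
n = 100
x = [(i + 0.5) / n for i in range(n)]
m0 = lambda t: 2.0 * math.atan(t)
theta_null = [m0(xi) % (2.0 * math.pi) for xi in x]
theta_alt = [(m0(xi) + math.asin(2.0 * xi - 1.0)) % (2.0 * math.pi) for xi in x]
t1_null, t2_null = gof_statistics(x, theta_null, m0, 0.2)
t1_alt, t2_alt = gof_statistics(x, theta_alt, m0, 0.2)
```

In the noise-free null case the smoothed statistic compares two identical smoothers, so it vanishes, while the non-smoothed one retains the smoothing bias of the nonparametric fit; this is one way to see why a smoothed parametric fit may be preferred.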
For a visual illustration of the performance of the tests (where, for the sake of simplicity, a model with a single covariate is considered), suppose that a sample of size $n = 100$ is generated following model (1), with regression function (10), with $c = 0$, and random errors $\varepsilon_i$ drawn from a von Mises distribution with zero mean direction. In order to test if $m(X) \in \{\beta_0 + 2\,\mathrm{atan}(\beta_1 X),\ \beta_0, \beta_1 \in \mathbb{R}\}$, using the test statistics given in (6) and (7), the estimators $\hat m_h(x; p)$, $m_{\hat{\boldsymbol\beta}}(x)$ and $\hat m_{h,\hat{\boldsymbol\beta}}(x; p)$ (denoting by $h$ the smoothing parameter when $d = 1$) must be computed. In this case, the estimator obtained from (3) is considered for the parametric fit. The local linear type estimator $\hat m_h(x; 1)$ given in (5) is employed to compute the nonparametric counterpart. A triweight kernel and the optimal bandwidth obtained by minimizing the circular average squared error (CASE), defined as

$$\mathrm{CASE}[\hat m_{\mathbf{H}}(\mathbf{x}; p)] = \frac{1}{n} \sum_{i=1}^{n} \{1 - \cos[m(\mathbf{X}_i) - \hat m_{\mathbf{H}}(\mathbf{X}_i; p)]\}, \tag{8}$$

for $p = 0, 1$, with $d = 1$, are considered to compute $\hat m_h(x; 1)$ and $\hat m_{h,\hat{\boldsymbol\beta}}(x; 1)$. Figure 1 shows in red lines the local linear type regression estimator (left panel), the parametric fit (center panel) and the smooth version of the parametric fit (right panel), with sample points and the circular regression function (black lines). All estimates appear very similar and, therefore, the values of the test statistics $T^{1}_{n,p}$ and $T^{2}_{n,p}$ should be small. Consequently, there would likely be no evidence against the assumption that the circular regression function belongs to the parametric family $m_{\boldsymbol\beta}(X) = \beta_0 + 2\,\mathrm{atan}(\beta_1 X)$.

Figure 1: Red lines: local linear type regression estimator (left), parametric fit (center) and smooth version of the parametric fit (right), with sample points and circular regression function (black lines). Sample of size $n = 100$ generated on the unit interval, following model (1), with regression function (10), for $c = 0$, and circular errors $\varepsilon_i$ drawn from a von Mises distribution with zero mean direction.

Notice that the test statistics given in (6) and (7) depend on the bandwidth matrix $\mathbf{H}$ (or on the bandwidth parameter $h$, if $d = 1$). A non-trivial problem in goodness-of-fit testing is the bandwidth choice, since the optimal bandwidth for estimation may not be the optimal one for testing (it is not even clear what optimal means). For instance, Fan et al. (2001), Eubank et al. (2005) and Hart (2013) gave some strategies for bandwidth selection in testing problems. This issue was also discussed in further detail by Sperlich (2013). As usual in the context of goodness-of-fit tests for regression based on nonparametric smoothers, the performance of the test statistics will be analyzed for a range of bandwidths, in order to evaluate the impact of this parameter on the numerical results.

Once a suitable test statistic is available, a procedure for the calibration of critical values is required in order to solve the testing problem (2). This task can be done by means of bootstrap resampling algorithms.

In what follows, two different bootstrap proposals (PCB and NPCB), designed to approximate the distribution (under the null hypothesis) of the test statistics given in (6) and (7) for independent data, are described. The main difference between them is the mechanism employed to obtain the residuals. As noted in the Introduction, the residuals in PCB come from the parametric regression estimator. On the other hand, for the NPCB algorithm, the residuals are obtained from the nonparametric regression estimator.
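The residual mechanism just described can be sketched in a few lines of Python. The function names are hypothetical, and drawing the bootstrap residuals with replacement from the empirical residuals is an assumption of this sketch. Passing parametric fitted values as `fitted_for_residuals` corresponds to the PCB idea, while passing nonparametric fitted values corresponds to NPCB; in both cases the bootstrap responses are rebuilt around the parametric fit under the null hypothesis.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def circular_residuals(theta, fitted):
    """Residuals [theta_i - fitted_i] (mod 2*pi)."""
    return [(t - f) % TWO_PI for t, f in zip(theta, fitted)]


def bootstrap_responses(theta, fitted_for_residuals, fitted_null, rng):
    """One bootstrap replicate Theta*_i = [m_null(X_i) + eps*_i] (mod 2*pi).

    fitted_for_residuals: parametric (PCB) or nonparametric (NPCB) fitted values.
    fitted_null: parametric fitted values under the null hypothesis.
    Resampling with replacement is an assumption of this sketch.
    """
    resid = circular_residuals(theta, fitted_for_residuals)
    resampled = [rng.choice(resid) for _ in resid]
    return [(f + e) % TWO_PI for f, e in zip(fitted_null, resampled)]


# Tiny example: angles that coincide with the fit give zero residuals,
# so the replicate reproduces the null fit wrapped onto [0, 2*pi).
rng = random.Random(0)
theta = [0.1, 1.0, 6.0]
replicate = bootstrap_responses(theta, theta, [0.2, 1.1, 7.0], rng)
```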
In order to present the PCB and NPCB resampling methods, a generic bootstrap algorithm is described. Whichever method is used, $\hat m$ denotes the parametric or the nonparametric circular regression estimator.

Algorithm 1
1. Compute the parametric or the nonparametric regression estimates (described in Sections 2.1 and 2.2, respectively), namely $\hat m(\mathbf{X}_i)$, $i = 1, \ldots, n$, depending on whether a parametric (PCB) or a nonparametric (NPCB) bootstrap procedure is employed.
2. From the residuals $\hat\varepsilon_i = [\Theta_i - \hat m(\mathbf{X}_i)] \ (\mathrm{mod}\ 2\pi)$, $i = 1, \ldots, n$, draw independent bootstrap residuals, $\hat\varepsilon^{*}_i$, $i = 1, \ldots, n$.
3. Bootstrap samples are $\{(\mathbf{X}_i, \Theta^{*}_i)\}_{i=1}^{n}$ with $\Theta^{*}_i = [m_{\hat{\boldsymbol\beta}}(\mathbf{X}_i) + \hat\varepsilon^{*}_i] \ (\mathrm{mod}\ 2\pi)$, where $m_{\hat{\boldsymbol\beta}}(\mathbf{X}_i)$ is the parametric regression estimator under $H_0$.
4. Using the bootstrap sample $\{(\mathbf{X}_i, \Theta^{*}_i)\}_{i=1}^{n}$, the bootstrap test statistics $T^{1,*}_{n,p}$ and $T^{2,*}_{n,p}$ are computed as in (6) and (7).
5. Repeat Steps 2-4 a large number of times $B$.

In Step 1 of the previous algorithm, in the PCB approach, the circular regression function is estimated parametrically, employing the procedure described in Section 2.1. Alternatively, the NPCB tries to avoid possible misspecification problems by using more flexible regression estimation methods than those employed in PCB. Then, using the same arguments as in González-Manteiga and Cao (1993) to increase the power of the test, in the NPCB method, the nonparametric circular regression estimator given in (5) is employed in Step 1 of Algorithm 1.

Notice that the empirical distribution of the $B$ bootstrap test statistics can be employed to approximate the finite sample distribution of the test statistics $T^{1}_{n,p}$ and $T^{2}_{n,p}$ under the null hypothesis. Denoting by $\{T^{j,*}_{n,p,1}, \ldots, T^{j,*}_{n,p,B}\}$ (for $j = 1, 2$) the sample of the $B$ bootstrap test statistics given in (6) and (7), and defining its $(1 - \alpha)$ quantile $t^{j,*}_{\alpha,p}$, the null hypothesis in (2) will be rejected if $T^{j}_{n,p} > t^{j,*}_{\alpha,p}$. Additionally, the $p$-values of the test statistics ($j = 1, 2$) can be approximated by:

$$p\text{-value} = \frac{1}{B} \sum_{b=1}^{B} \mathbb{I}_{\{T^{j,*}_{n,p,b} > T^{j}_{n,p}\}}. \tag{9}$$

The finite sample performance of the proposed tests, using the bootstrap approaches described in Algorithm 1 for their calibration, is illustrated in this section with a simulation study, considering a regression model with a single real-valued covariate and also with a bidimensional one.

2.5.1 Simulation experiment with a single covariate
In order to study the performance of the proposed tests for a regression model with a circular response and a single real-valued covariate, the parametric regression family $\mathcal{M}_{1,\boldsymbol\beta} = \{\beta_0 + 2\,\mathrm{atan}(\beta_1 X),\ \beta_0, \beta_1 \in \mathbb{R}\}$ is chosen. For different values of $c$, the regression function

$$m(X) = 2\,\mathrm{atan}(X) + c \cdot \mathrm{asin}(2X - 1) \tag{10}$$

is considered. Therefore, the parameter $c$ controls whether the null ($c = 0$) or the alternative ($c \neq 0$) hypothesis holds in problem (2). Values $c = 0$, 1, and 2 are considered in the study. For each value of $c$, 500 samples of sizes $n = 50$, 100 and 200 are generated on the unit interval, following model (1) with regression function (10) and circular errors $\varepsilon_i$ drawn independently from a von Mises distribution $vM(0, \kappa)$, for different values of $\kappa$ (5, 10 and 15).

To analyze the behavior of the test statistics given in (6) and (7) in the different scenarios, the bootstrap procedures described in Section 2.4 are applied, using $B = 500$ replications. The non-smoothed or smoothed parametric fits used for constructing (6) and (7) are computed using the procedures given in Sections 2.1 and 2.3, respectively. The nonparametric fit is obtained using the estimator given in (5), for $p = 0, 1$, with a triweight kernel. We address the bandwidth selection problem by using the same procedure as in Härdle and Mammen (1993), Alcalá et al. (1999), or Opsomer and Francisco-Fernández (2010), among others, applying the tests on a grid of several bandwidths. In order to use a reasonable grid of bandwidths, the optimal bandwidth selected by minimizing the CASE given in (8), for $d = 1$, is calculated for each scenario. In this case, the values of the CASE optimal bandwidths are in the interval [0.…, …]. Therefore, the values $h$ = 0.…, … are considered to compute both test statistics (6) and (7). The weight function used in both tests is $w(x) = \mathbb{I}_{\{x \in [1/\sqrt{n},\, 1 - 1/\sqrt{n}]\}}$, to avoid possible boundary effects.

Effect of sample size.
Proportions of rejections of the null hypothesis, for a significance level $\alpha = 0.05$, considering $\kappa = 10$ and different sample sizes, are shown in Tables 1 and 2, when using $T^{1}_{n,p}$ and $T^{2}_{n,p}$, respectively. If $c = 0$ (null hypothesis) and the Nadaraya–Watson type estimator is used, the proportions of rejections are much lower than the expected values. Using this estimator, the test works fairly well when PCB is employed, whereas NPCB provides very poor results. When the local linear type estimator is used, the proportions of rejections are similar to the theoretical level, although these proportions are quite affected by the value of $h$. For alternative assumptions ($c = 1$ and $c = 2$), as expected, the proportions of rejections become larger as the sample size increases, and they also increase with $c$. As pointed out before, substantial differences have been found when the local linear type estimator is employed, which provides more satisfactory results than those obtained with the Nadaraya–Watson fit. Using the local linear estimator, NPCB presents a slightly better performance than PCB. Although both test statistics lead to a similar behavior of the testing procedure, one of them seems to give slightly better results.

Effect of $\kappa$. The performance of the tests (for $\alpha = 0.05$) is studied for $n = 200$ and for different values of the concentration parameter $\kappa$ in Tables 3 and 4, when using $T^{1}_{n,p}$ and $T^{2}_{n,p}$, respectively. If $c = 0$ and considering the local linear type estimator, the proportions of rejections are similar to the theoretical level when using both bootstrap approaches (PCB and NPCB). Results obtained when the Nadaraya–Watson fit is used are quite poor, especially when NPCB is employed. For alternative assumptions, as expected, large values of the concentration parameter $\kappa$ lead to an increase in power, which supports the correct performance of the bootstrap procedures. Considerable differences have been found depending on whether the Nadaraya–Watson or the local linear type estimator is employed in the test statistics, especially when NPCB is used.

The extension to regression models with a circular response and two covariates is analyzed in this section. For this purpose, the parametric regression family $\mathcal{M}_{2,\boldsymbol\beta} = \{\beta_0 + 2\,\mathrm{atan}(\beta_1 X_1 + \beta_2 X_2),\ \beta_0, \beta_1, \beta_2 \in \mathbb{R}\}$ is chosen, and for different values of $c$ the regression function

$$m(\mathbf{X}) = 2\,\mathrm{atan}(-X_1 + X_2) + c \cdot \mathrm{asin}(2X_{\ldots} - 1), \tag{11}$$

with $\mathbf{X} = (X_1, X_2)$, is considered. For each value of $c$ ($c = 0$, 1, and 2), 500 samples of sizes $n = 100$, 225 and 400 are generated on a bidimensional regular grid in the unit square, following model (1), with regression function (11) and circular errors $\varepsilon_i$ drawn from a von Mises distribution $vM(0, \kappa)$, for $\kappa = 5$, 10 and 15. The bootstrap procedures described in Section 2.4 are applied, using $B = 500$ replications. The non-smoothed or smoothed parametric fits used for constructing (6) and (7) are computed using the procedures given in Sections 2.1 and 2.3, respectively. The nonparametric fit is obtained using the estimator given in (5), for $p = 0, 1$, with a multiplicative triweight kernel. In order to simplify the calculations, the bandwidth matrix is restricted to the class of diagonal matrices with equal diagonal elements. In this case, the diagonal elements of the CASE optimal bandwidths are in the interval [0.…, …]. Therefore, matrices $\mathbf{H} = \mathrm{diag}(h, h)$ with different values of $h$, $h$ = 0.…, …, are considered to compute both test statistics (6) and (7). The weight function used in both tests is $w(\mathbf{x}) = \mathbb{I}_{\{\mathbf{x} \in [1/\sqrt{n},\, 1 - 1/\sqrt{n}] \times [1/\sqrt{n},\, 1 - 1/\sqrt{n}]\}}$, to avoid possible boundary effects.

Effect of sample size.
Proportions of rejections of the null hypothesis, for a significance level α = 0 .
05, considering κ = 10, and different sample sizes, are shown Tables 5, when using T n,p . Itcan be observed that using both bootstrap methods (PCB and NPCB), the test has a reasonablebehavior. If c = 0 (null hypothesis) and considering the local linear estimator, the proportionsof rejections are similar to the theoretical level. As for a single covariate, results using the11adaraya–Watson type estimator and NPCB are really bad. For alternative assumptions ( c = 1and c = 2), NPCB presents a slightly better performance than the PCB, when using the locallinear type estimator. Notice that, in most of the cases, an increasing power of the test whenthe values of h increase is observed. For all the scenarios, the power of the test becomes largeras the value of c increases. Again, considerable differences have been found when the local lineartype estimator is employed. Similar conclusions to those given for T n,p were obtained when thetest statistic T n,p was employed (see Table 6). Effect of κ . The performance of the bootstrap procedures is analyzed for n = 400 and fordifferent values of the concentration parameter κ when using T n,p , for α = 0 .
05, in Table 7. If c = 0, the proportions of rejections are similar to the theoretical level when using both bootstrapapproaches (PCB and NPCB). For larger values of the concentration parameter κ , the bandwidthvalues providing an effective calibration must be smaller. For alternative assumptions, if thevalue of the concentration parameter κ is larger, an increasing power is obtained. In almostall scenarios, some differences have been found if the Nadaraya–Watson or the local linear typeestimators are employed in the test statistics. Results considering the test statistic T n,p aresummarized in Table 8. Similar conclusions to those provided for T n,p were obtained. The testing problem (2) is addressed in Section 2 for independent data, by constructing weightedcircular test statistics. In this section, these test statistics are also analyzed considering a linear-circular regression model with spatially correlated errors.Assume the linear-circular regression model given in (1), but supposing that the circularerrors are spatially correlated. More specifically, we consider the linear-circular regression modelgiven in (1): Θ i = [ m ( X i ) + ε i ]( mod π ) , i = 1 , . . . , n, (12)where m is a smooth trend or regression function and the ε i are random angles, such that, E [sin( ε i ) | X = x ] = 0, and additionally, satisfying in this dependence framework that C ov[sin( ε i ) , sin( ε j ) | X i , X j ] = σ ρ ,n ( X i − X j ) , C ov[cos( ε i ) , cos( ε j ) | X i , X j ] = σ ρ ,n ( X i − X j ) , C ov[sin( ε i ) , cos( ε j ) | X i , X j ] = σ ρ ,n ( X i − X j ) , with σ k < ∞ for k = 1 ,
2, and σ < ∞. The continuous stationary correlation functions $\rho_{k,n}$ satisfy $\rho_{k,n}(\mathbf{0}) = 1$, $\rho_{k,n}(\mathbf{x}) = \rho_{k,n}(-\mathbf{x})$, and $|\rho_{k,n}(\mathbf{x})| \le 1$, for any $\mathbf{x} \in D \subset \mathbb{R}^d$ and $k = 1, 2, 3$. The subscript $n$ in $\rho_{k,n}$ indicates that the correlation functions vary with $n$ (specifically, the correlation functions are assumed to be short-range and to shrink as $n$ goes to infinity). Note also that the subscript $k$ does not index an integer sequence; it just indicates whether the correlation corresponds to the sine process ($k = 1$), the cosine process ($k = 2$) or the cross-correlation between them ($k = 3$).

To solve the testing problem (2) in this context, the estimator described in Section 2.1 is again employed for the parametric fit. More accurate results would probably be obtained with an estimator that takes the spatial dependence structure into account. However, to our knowledge, the problem of parametrically estimating the regression function while accounting for the dependence structure has not been considered in the statistical literature. Some guidelines about a possible iterative least squares estimator (taking the possible spatial dependence structure into account) are provided in Section 4. Kernel-type estimators given in (5) are employed for the nonparametric fit. These nonparametric estimators have been studied in Meilán-Vila et al. (2020a) in the context of spatially correlated data.

For illustration purposes, a sample of size $n = 400$ is generated on a bidimensional regular grid in the unit square, assuming the linear-circular regression model (12), with regression function (11) and $c = 0$. The circular spatially correlated errors $\varepsilon_i$, $i = 1, \ldots, n$, are drawn from a wrapped Gaussian spatial process (Jona-Lasinio et al., 2012) as follows:

$$\varepsilon_i = Y_i \ (\mathrm{mod}\ 2\pi), \quad i = 1, \ldots, n,$$

where $\{Y_i = Y(\mathbf{X}_i), i = 1, \ldots, n\}$ is a realization of a real-valued Gaussian spatial process, in which each observation can be decomposed as

$$Y_i = \mu + w_i, \quad i = 1, \ldots, n, \qquad (13)$$

with $\mu = \mu(\mathbf{X}_i)$ the mean and $w_i$ random variables of a zero mean Gaussian spatial process with $\mathrm{Cov}(w_i, w_j \mid \mathbf{X}_i, \mathbf{X}_j) = \sigma^2 \rho_n(\mathbf{X}_i - \mathbf{X}_j)$. The variance of $w_i$ is denoted by $\sigma^2$, and $\rho_n$ is a continuous stationary correlation function satisfying $\rho_n(\mathbf{0}) = 1$, $\rho_n(\mathbf{x}) = \rho_n(-\mathbf{x})$, and $|\rho_n(\mathbf{x})| \le 1$, $\forall \mathbf{x}$. Note that a realization of this wrapped Gaussian spatial process can be written in vector form as $\boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_n)^T$, with mean direction vector $\mu \mathbf{1}_n$, where $\mathbf{1}_n$ is an $n \times 1$ vector of ones, and covariance matrix $\sigma^2 \mathbf{R}_n$, where $\mathbf{R}_n(i, j) = \rho_n(\mathbf{X}_i - \mathbf{X}_j)$ is the $(i, j)$-entry of the correlation matrix $\mathbf{R}_n$.

In this particular example, the circular spatially correlated errors are drawn from a wrapped Gaussian spatial process with, in (13), $\mu = 0$ and $w_i$ a zero mean process with exponential covariance structure:

$$\mathrm{Cov}(w_i, w_j \mid \mathbf{X}_i, \mathbf{X}_j) = \sigma^2 \exp(-\|\mathbf{X}_i - \mathbf{X}_j\| / a_e), \qquad (14)$$

with $\sigma^2 = 1$ and $a_e = 0.3$. In order to test if $m(\mathbf{X}) \in \{\beta_0 + 2\,\mathrm{atan}(\beta_1 X_1 + \beta_2 X_2),\ \beta_0, \beta_1, \beta_2 \in \mathbb{R}\}$, with $\mathbf{X} = (X_1, X_2)$, using the test statistics given in (6) and in (7), the fits $\hat{m}_{\mathbf{H}}(\mathbf{x}; p)$, $m_{\hat{\beta}}(\mathbf{x})$ and $\hat{m}_{\mathbf{H}, \hat{\beta}}(\mathbf{x}; p)$ must be computed.

Figure 2: Circular regression function (top left), the local linear type regression estimator (top right), the parametric fit (bottom left) and the smooth version of the parametric fit (bottom right). Sample of size $n = 400$ generated on a bidimensional regular grid in the unit square, following model (1), with regression function (11), for $c = 0$, and circular errors $\varepsilon_i$ drawn from wrapped Gaussian spatial processes with zero mean and exponential covariance structure, given in (14), with $\sigma^2 = 1$ and $a_e = 0.3$.

For the parametric counterpart, the estimator obtained from (3) is employed, while local linear type estimators are used for the nonparametric smoothers. A multiplicative triweight kernel and the optimal bandwidth obtained by minimizing the CASE, given in (8), of the local linear type estimator are considered to compute $\hat{m}_{\mathbf{H}}(\mathbf{x}; 1)$ and $\hat{m}_{\mathbf{H}, \hat{\beta}}(\mathbf{x}; 1)$. Figure 2 shows the theoretical circular regression function (11), with $c = 0$ (top left panel), the local linear type regression estimator (top right panel), the parametric fit (bottom left panel) and the smooth version of the parametric fit (bottom right panel). The estimates in the top right, bottom left and bottom right panels are very similar and, therefore, the values of the test statistics $T^1_{n,p}$ and $T^2_{n,p}$ should be small.
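The error-generating mechanism in (13) and (14) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names (`regular_grid`, `exponential_covariance`, `wrapped_gaussian_errors`), the grid layout (boundary points included) and the small numerical jitter are illustrative choices, not taken from the paper, whose simulations were run in R.

```python
import numpy as np

def regular_grid(n_side):
    """Return the n_side**2 points of a regular grid in the unit square.
    Assumption: the grid includes the boundary of the square."""
    axis = np.linspace(0.0, 1.0, n_side)
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])

def exponential_covariance(coords, sigma2, a_e):
    """Covariance matrix sigma2 * exp(-||X_i - X_j|| / a_e), as in (14)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma2 * np.exp(-dist / a_e)

def wrapped_gaussian_errors(coords, mu=0.0, sigma2=1.0, a_e=0.3, rng=None):
    """Draw Y = mu + w as in (13) and wrap it: eps = Y (mod 2*pi)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(coords)
    cov = exponential_covariance(coords, sigma2, a_e)
    # Cholesky factor of the covariance; the tiny jitter is only for
    # numerical stability and is not part of the model.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n))
    y = mu + chol @ rng.standard_normal(n)
    return np.mod(y, 2.0 * np.pi)

coords = regular_grid(20)  # 20 x 20 grid, n = 400
eps = wrapped_gaussian_errors(coords, rng=np.random.default_rng(1))
```

Drawing the Gaussian vector through a Cholesky factor is one standard way to impose a prescribed covariance matrix; any other exact Gaussian simulation method would serve equally well here.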
Since the estimates shown in Figure 2 are very similar, the formal application of the tests will probably lead to the conclusion that there is no evidence against the assumption that the regression function belongs to the parametric family $m_{\beta}(\mathbf{X}) = \beta_0 + 2\,\mathrm{atan}(\beta_1 X_1 + \beta_2 X_2)$, with $\mathbf{X} = (X_1, X_2)$.

Practical methods to calibrate the test statistics $T^1_{n,p}$ and $T^2_{n,p}$, given in (6) and in (7), for spatially correlated data are presented in the following section.

3.1 Calibration in practice

This section presents bootstrap resampling methods to calibrate in practice the test statistics $T^1_{n,p}$ and $T^2_{n,p}$, given in (6) and in (7), respectively, under the linear-circular regression model (12) with spatially correlated errors.

The bootstrap Algorithm 1, which was designed for independent data, should not be used for spatial processes, as it does not account for the correlation structure. The aim of this section is to describe two proposals for test calibration that take the dependence of the data into account (PSCB and NPSCB). The main difference between the proposals is how the resampling residuals (required for mimicking the dependence structure of the errors) are computed. In PSCB (similarly to PCB), the residuals are obtained from the parametric regression estimator, while in NPSCB (analogously to NPCB), the residuals are obtained from the nonparametric regression estimator. In both approaches, an appropriate spatial circular process model is fitted to the residuals in order to imitate the dependence structure of the errors.

Next, a generic bootstrap algorithm is introduced to present the PSCB and NPSCB resampling approaches. As in Algorithm 1, $\hat{m}$ denotes either the parametric or the nonparametric circular regression estimator, depending on the method used.

Algorithm 2
1. Compute the parametric or the nonparametric regression estimates (described in Sections 2.1 and 2.2, respectively), namely $\hat{m}(\mathbf{X}_i)$, $i = 1, \ldots, n$, depending on whether a parametric (PSCB) or a nonparametric (NPSCB) bootstrap procedure is employed.
2. From the residuals $\hat{\varepsilon}_i = [\Theta_i - \hat{m}(\mathbf{X}_i)]\ (\mathrm{mod}\ 2\pi)$, $i = 1, \ldots, n$, fit a spatial circular process.
3. Generate a random sample from the fitted model, $\hat{\varepsilon}^*_i$, $i = 1, \ldots, n$.
4. Form the bootstrap sample $\{(\mathbf{X}_i, \Theta^*_i)\}_{i=1}^n$ with $\Theta^*_i = [m_{\hat{\beta}}(\mathbf{X}_i) + \hat{\varepsilon}^*_i]\ (\mathrm{mod}\ 2\pi)$, where $m_{\hat{\beta}}(\mathbf{X}_i)$ is the parametric regression estimator.
5. Using the bootstrap sample $\{(\mathbf{X}_i, \Theta^*_i)\}_{i=1}^n$, compute the bootstrap test statistics $T^{1,*}_{n,p}$ and $T^{2,*}_{n,p}$ as in (6) and in (7).
6. Repeat Steps 3-5 a large number of times $B$.

Notice that Algorithm 2 is a modification of Algorithm 1. Two additional steps (Steps 2 and 3) are included in Algorithm 2 in order to mimic properly the spatial dependence structure of the circular errors in the bootstrap procedure.

As pointed out in Section 2.4 for independent data, considering the test statistics $T^j_{n,p}$ ($j = 1, 2$), the null hypothesis is rejected if $T^j_{n,p} > t^{j,*}_{\alpha,p}$, where $t^{j,*}_{\alpha,p}$ is the $(1 - \alpha)$ quantile of the sample of the $B$ bootstrap test statistics $\{T^{j,*}_{n,p,1}, \ldots, T^{j,*}_{n,p,B}\}$. Moreover, the $p$-values of the test statistics can be approximated as in (9).

3.2 Simulation experiment

The performance of the proposed test statistics and of the bootstrap procedures described in Algorithm 2 is analyzed in a simulation study. The parametric circular regression family given in Section 2.5.2 is chosen and, for different values of $c$ ($c = 0, 1, 2$), samples of sizes $n = 100$, 225 and 400 are generated on a bidimensional regular grid in the unit square, assuming the linear-circular regression model (1), with regression function (11), but considering circular spatially correlated errors generated from wrapped Gaussian spatial processes (Jona-Lasinio et al., 2012). The realizations of the circular error process $\{\varepsilon_i, i = 1, \ldots, n\}$ are generated considering a zero mean process with the exponential covariance structure given in (14). The variance $\sigma^2$ is fixed equal to one, and different values of the range parameter $a_e$ are considered.

The performance of Algorithm 2 is analyzed in this section. Notice that Algorithm 1, which was designed for independent observations, should not be used in a spatial framework. To illustrate this issue, Tables 9 and 10 show the proportions of rejections of the null hypothesis for different sample sizes and $\alpha = 0.05$, when using $T^1_{n,p}$ and $T^2_{n,p}$, respectively, for $p = 0, 1$. Results for $n = 400$ and different spatial dependence degrees (controlled by the range parameter, $a_e$) are summarized in Tables 11 and 12, when using $T^1_{n,p}$ and $T^2_{n,p}$, respectively, for $p = 0, 1$. The bootstrap calibration is based on $B = 500$ replications. As pointed out previously, the test statistics $T^1_{n,p}$ and $T^2_{n,p}$, given in (6) and in (7), are computed using the non-smoothed or smoothed parametric fits given in Sections 2.1 and 2.3, respectively, while the nonparametric fit is obtained using the estimator given in (5), for $p = 0, 1$, with a multiplicative triweight kernel.

In practice, in order to implement the bootstrap Algorithm 2, a wrapped Gaussian spatial process model is employed in Step 2. Following the proposal by Jona-Lasinio et al. (2012), the model is fitted within a Bayesian framework using a Markov Chain Monte Carlo method. Assuming a linear Gaussian spatial process of the form (13), a Bayesian fit of the model requires priors for the model parameters. The authors suggest a normal prior for $\mu$, a truncated inverse gamma prior for $\sigma^2$, and a uniform prior (with support allowing small ranges up to ranges a bit larger than the maximum distance over the region) for the decay parameter $3/a_e$. More specifically, the prior of $\mu$ is a Gaussian distribution with zero mean and variance one. For $\sigma^2$, we consider an inverse gamma distribution, $\mathrm{IG}(a_\sigma, b_\sigma)$, with $a_\sigma = 2$ and $b_\sigma = 1$, so that its mean is $b_\sigma / (a_\sigma - 1) = 1$. A continuous uniform distribution on a subinterval of $(0, 1)$ with upper endpoint 1 is used as the prior for the decay parameter. The parameters are updated using a Metropolis–Hastings algorithm (Hastings, 1970). For further details on the wrapped Gaussian spatial model fitting we refer to Jona-Lasinio et al. (2012). The posterior means of the parameters are used in Step 3 of Algorithm 2. Notice that, in this case, the circular spatially correlated errors are generated from wrapped Gaussian spatial processes and, in Step 2 of Algorithm 2, a wrapped Gaussian spatial process model is also employed for model fitting. This model only allows symmetric marginal distributions. Therefore, if the errors were drawn using another procedure, such as a projected Gaussian spatial process (with asymmetric marginals), an alternative approach would be more convenient.

In order to analyze the effect of the bandwidth matrix on the test statistics, $T^1_{n,p}$ and $T^2_{n,p}$ are computed on a grid of several bandwidths. As in the experiment shown in Section 2.5.2 for the independence framework, the bandwidth matrix is restricted to be diagonal with equal elements, $\mathbf{H} = \mathrm{diag}(h, h)$. In this case, eight different values of $h$ are considered, and the weight function $w$ of Section 2.5.2 is used here.

Table 13 shows the proportions of rejections of the null hypothesis for different sample sizes and $\alpha = 0.05$, when using $T^1_{n,p}$. Under the null hypothesis ($c = 0$), the test has an acceptable performance with both bootstrap approaches, PSCB and NPSCB. The proportions of rejections are similar to the theoretical level considered, namely $\alpha = 0.05$, although they depend on the value of $h$. For alternative assumptions ($c = 1$ and $c = 2$), the performance of the test is satisfactory. As expected, the power of the test is larger when the value of $c$ is also larger. A slightly better performance of the test is obtained when considering the test statistic $T^2_{n,p}$. In this case, the proportions of rejections of the null hypothesis are presented in Table 14.

Results for $n = 400$ and different spatial dependence degrees (three values of the range parameter, including $a_e = 0.6$) are shown in Table 15, when using $T^1_{n,p}$. PSCB and NPSCB approaches provide good results for both the null and the alternative hypotheses. As expected, the power of the test is larger when the dependence structure is weaker. In these scenarios, results considering the test statistic $T^2_{n,p}$ are summarized in Table 16.

Testing procedures for assessing a parametric circular regression model (with circular response and $\mathbb{R}^d$-valued covariate) were proposed and analyzed in this work for independent and for spatially correlated data. Specifically, the test statistics were constructed by measuring a circular distance between a parametric fit (non-smoothed or smoothed) and a nonparametric estimator of the circular regression function. For the parametric approach, taking into account that the classical least squares regression method is not appropriate when the response variable is of circular nature, a circular analog was used (Fisher and Lee, 1992; Lund, 1999). Other parametric fitting approaches, such as maximum likelihood methods, could be used instead. Regarding the nonparametric fit, local polynomial type estimators were considered in the test statistics.

Although the asymptotic distribution of the tests, under the null and under local alternatives, is out of the scope of this work, some guidelines to derive it are provided in this section. As pointed out by Kim and SenGupta (2017), using Taylor series expansions, the function $1 - \cos(\Theta)$ can be approximated by $\Theta^2/2$, for $\Theta \in [0, \pi)$. Therefore, the expressions $1 - \cos[\hat{m}_{\mathbf{H}}(\mathbf{x}; p) - m_{\hat{\beta}}(\mathbf{x})]$ and $1 - \cos[\hat{m}_{\mathbf{H}}(\mathbf{x}; p) - \hat{m}_{\mathbf{H}, \hat{\beta}}(\mathbf{x}; p)]$ in the test statistics $T^1_{n,p}$ and $T^2_{n,p}$, given in (6) and in (7), respectively, can be approximated by $\frac{1}{2}[\hat{m}_{\mathbf{H}}(\mathbf{x}; p) - m_{\hat{\beta}}(\mathbf{x})]^2$ and $\frac{1}{2}[\hat{m}_{\mathbf{H}}(\mathbf{x}; p) - \hat{m}_{\mathbf{H}, \hat{\beta}}(\mathbf{x}; p)]^2$, respectively. Consequently, $T^1_{n,p}$ and $T^2_{n,p}$ can be approximated by test statistics similar to the ones used, for example, in Härdle and Mammen (1993) or in Meilán-Vila et al. (2020d), for regression models with Euclidean response and covariates. Notice that the regression estimators involved in the test statistics $T^1_{n,p}$ and $T^2_{n,p}$ have more complicated expressions than those in Härdle and Mammen (1993) or in Meilán-Vila et al. (2020d). Therefore, as intuition suggests, it will be more difficult to obtain closed-form expressions of their asymptotic distributions.

For practical implementation, bootstrap resampling methods were used to calibrate the tests. For independent data, two procedures have been designed and compared: PCB and NPCB. Both methods are based on computing the residuals and generating independent bootstrap resamples. The main difference between them is the mechanism employed to obtain the residuals. In PCB, the residuals come from the parametric regression estimator, whereas in NPCB they are obtained from the nonparametric regression estimator. For dependent data, in order to imitate the distribution of the (spatially correlated) errors, new bootstrap procedures were proposed: PSCB and NPSCB. Again, the main difference between both approaches is how the residuals are obtained. In PSCB, the residuals come from the parametric fit, whereas in NPSCB they are obtained from the nonparametric estimator. In practice, in order to implement the procedure, a wrapped Gaussian spatial process model (Jona-Lasinio et al., 2012) was fitted to the residuals to mimic the dependence structure. This wrapped Gaussian spatial process model was fitted within a Bayesian framework; therefore, some prior parameter values must be provided to use the Markov Chain Monte Carlo model fitting. For further details on wrapped Gaussian model fitting, we refer to Jona-Lasinio et al. (2012).
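The resampling scheme shared by PSCB and NPSCB (Algorithm 2) can be outlined in code. The Python sketch below is only illustrative and rests on several assumptions: all function names are hypothetical; the test statistic and the regression fits are passed in as user-supplied callables, with crude toy stand-ins used here instead of (3), (5), (6) and (7); the error model is fitted by a simple moment-based rule with a fixed range, not by the Bayesian MCMC fit used in the paper; and the $p$-value is taken as the proportion of bootstrap statistics at least as large as the observed one, which is an assumption about the form of (9).

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def circular_mean(theta):
    """Mean direction of a sample of angles."""
    return np.arctan2(np.sin(theta).mean(), np.cos(theta).mean())

def constant_fit(x, theta):
    """Toy 'parametric' fit: the same mean direction at every location."""
    return np.full(len(theta), circular_mean(theta))

def toy_statistic(x, theta):
    """Placeholder statistic: mean circular distance between a crude
    two-piece fit (left/right half of the region) and the constant fit."""
    left = x[:, 0] < 0.5
    m_np = np.where(left, circular_mean(theta[left]), circular_mean(theta[~left]))
    return np.mean(1.0 - np.cos(m_np - constant_fit(x, theta)))

def fit_toy_error_model(x, resid, a_e=0.3):
    """Moment-based wrapped Gaussian fit with a FIXED range a_e (assumption)."""
    r_bar = np.hypot(np.cos(resid).mean(), np.sin(resid).mean())
    sigma2 = -2.0 * np.log(np.clip(r_bar, 1e-6, 1.0 - 1e-6))
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    cov = sigma2 * np.exp(-dist / a_e) + 1e-10 * np.eye(len(x))
    chol = np.linalg.cholesky(cov)
    return lambda rng: np.mod(chol @ rng.standard_normal(len(x)), TWO_PI)

def spatial_bootstrap_test(x, theta, fit_parametric, fit_for_residuals,
                           fit_error_model, test_statistic,
                           n_boot=500, alpha=0.05, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    t_obs = test_statistic(x, theta)
    m_beta = fit_parametric(x, theta)            # parametric fit, used in Step 4
    m_hat = fit_for_residuals(x, theta)          # Step 1 (PSCB or NPSCB)
    resid = np.mod(theta - m_hat, TWO_PI)        # Step 2: residuals ...
    sampler = fit_error_model(x, resid)          # ... and fitted error model
    t_boot = np.empty(n_boot)
    for b in range(n_boot):                      # Step 6: repeat Steps 3-5
        eps_star = sampler(rng)                  # Step 3
        theta_star = np.mod(m_beta + eps_star, TWO_PI)   # Step 4
        t_boot[b] = test_statistic(x, theta_star)        # Step 5
    crit = np.quantile(t_boot, 1.0 - alpha)      # (1 - alpha) bootstrap quantile
    p_value = np.mean(t_boot >= t_obs)           # assumed form of the p-value
    return t_obs, crit, p_value, bool(t_obs > crit)

rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 10)
x = np.array([(a, b) for a in axis for b in axis])       # n = 100 grid points
theta = np.mod(1.0 + 0.3 * rng.standard_normal(len(x)), TWO_PI)
t_obs, crit, p_value, reject = spatial_bootstrap_test(
    x, theta, constant_fit, constant_fit, fit_toy_error_model,
    toy_statistic, n_boot=200, rng=rng)
```

Passing the same callable as `fit_parametric` and `fit_for_residuals` corresponds to the parametric-residual variant; supplying a nonparametric smoother as `fit_for_residuals` would correspond to the nonparametric-residual variant, while Step 4 always adds the resampled errors to the parametric fit.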
As an alternative to the wrapped Gaussian model, other spatial-circular process models, such as asymmetric wrapped Gaussian spatial processes (Mastrantonio et al., 2016) or projected Gaussian spatial processes (Wang and Gelfand, 2014), could be employed to model the residuals and thus imitate the dependence structure of the errors. Notice that once the model is fitted, bootstrap error samples are generated from it. These bootstrap error samples could also be employed to design a parametric iterative least squares estimator, accounting for the possible spatial dependence structure, that could be used in the tests for spatially correlated data (instead of the parametric fit described in Section 2.1). Specifically, using the bootstrap error samples, the variance-covariance matrix of the circular errors can be approximated. Then, applying a Cholesky decomposition of this matrix, the original circular responses and the $\mathbb{R}^d$-valued covariate are transformed, as is done in the generalized least squares method. Finally, the parameter estimate is obtained by applying (3) to the transformed observations. This algorithm could be applied iteratively. Although we have not applied this method in practice, we do not believe that it would provide great improvements over the circular least squares method described in Section 2.1, even when the data are indeed dependent. The possible benefits of taking the correlation of the data into account could be offset by the difficulty of adequately estimating the variance-covariance matrix of the circular errors.

For independent data, in the majority of scenarios considered in the simulation study, results obtained with NPCB improve on those achieved by PCB, especially for alternative assumptions. Moreover, a better behavior is observed when $T^2_{n,p}$, given in (7), is employed. In general, the local linear type estimator seems to show a slightly better performance.
For spatially correlated data, it is observed that both tests do not work properly under the null hypothesis when using PCB and NPCB, which were designed for independence. Regarding PSCB and NPSCB, the use of nonparametric residuals in the bootstrap procedure provides the best results. As expected, the power of the test is larger when the spatial dependence structure is weaker. More satisfactory results are achieved when $T^2_{n,p}$ is used. In both frameworks (independent and spatially correlated data), the proportions of rejections of the null hypothesis clearly depend on the bandwidth matrix considered.

The procedures used in the simulation study were implemented in the statistical environment R (R Development Core Team, 2020), using functions included in the npsp and CircSpaceTime packages (Fernández-Casal, 2019; Jona-Lasinio et al., 2019).
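The covariance approximation and Cholesky transformation mentioned in this section for a possible iterative least squares estimator can be sketched as follows. This is a hypothetical Python illustration, not the authors' R code and not a method they applied: the function names are invented, the bootstrap errors are treated as linear quantities centred at zero and wrapped to $[-\pi, \pi)$ before computing a sample covariance, and a small jitter is added to keep the matrix positive definite. How the transformation should be applied to the circular responses before using (3) is left open in the text and is not addressed here.

```python
import numpy as np

def cholesky_from_bootstrap_errors(eps_boot, jitter=1e-8):
    """Approximate the error covariance from a (B, n) array of bootstrap
    error samples and return its lower-triangular Cholesky factor."""
    eps_boot = np.asarray(eps_boot, dtype=float)
    cov = np.cov(eps_boot, rowvar=False)           # (n, n) sample covariance
    cov = cov + jitter * np.eye(cov.shape[0])      # numerical safeguard only
    return np.linalg.cholesky(cov)

def whiten(chol, values):
    """Apply the GLS-style transformation L^{-1} v, where L is the factor."""
    return np.linalg.solve(chol, values)

# Illustration with synthetic 'bootstrap' errors of known covariance.
rng = np.random.default_rng(42)
n, n_boot = 25, 2000
coords = rng.uniform(size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
true_cov = 0.25 * np.exp(-dist / 0.3)              # exponential structure
raw = rng.multivariate_normal(np.zeros(n), true_cov, size=n_boot)
eps_boot = np.mod(raw + np.pi, 2.0 * np.pi) - np.pi   # wrap to [-pi, pi)

chol = cholesky_from_bootstrap_errors(eps_boot)
whitened = whiten(chol, eps_boot.T)                # shape (n, n_boot)
```

After the transformation, the sample covariance of the whitened errors is numerically the identity matrix, which is the property the generalized least squares step relies on. The quality of the result depends entirely on how well the covariance matrix is approximated, which is the difficulty pointed out above.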
Acknowledgements
The authors acknowledge the support from the Xunta de Galicia grant ED481A-2017/361 and the European Union (European Social Fund - ESF). This research has been partially supported by MINECO grants MTM2016-76969-P and MTM2017-82724-R, and by the Xunta de Galicia (Grupo de Referencia Competitiva ED431C-2017-38, and Centro de Investigación del SUG ED431G 2019/01), all of them through the ERDF.

Table 1: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes and $\kappa = 10$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and six bandwidth values $h$.]

Table 2: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes and $\kappa = 10$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and six bandwidth values $h$.]

Table 3: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $\kappa$ and $n = 400$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $\kappa$, Method, and six bandwidth values $h$.]

Table 4: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $\kappa$ and $n = 400$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $\kappa$, Method, and six bandwidth values $h$.]

Table 5: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes and $\kappa = 10$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and seven bandwidth values $h$.]

Table 6: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes and $\kappa = 10$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and seven bandwidth values $h$.]

Table 7: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $\kappa$ and $n = 400$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $\kappa$, Method, and seven bandwidth values $h$.]

Table 8: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $\kappa$ and $n = 400$. Test statistic computed for $p = 0, 1$. [Columns: Estimator, $c$, $\kappa$, Method, and seven bandwidth values $h$.]

Table 9: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes. Model parameters: $\sigma^2 = 0.16$ and $a_e = 0.3$. Test statistic $T^1_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and eight bandwidth values $h$.]

Table 10: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes. Model parameters: $\sigma^2 = 0.16$ and $a_e = 0.3$. Test statistic $T^2_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and eight bandwidth values $h$.]

Table 11: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $a_e$. Model parameters: $\sigma^2 = 0.16$ and $n = 400$. Test statistic $T^1_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $a_e$, Method, and eight bandwidth values $h$.]

Table 12: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $a_e$. Model parameters: $\sigma^2 = 0.16$ and $n = 400$. Test statistic $T^2_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $a_e$, Method, and eight bandwidth values $h$.]

Table 13: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes. Model parameters: $\sigma^2 = 0.16$ and $a_e = 0.3$. Test statistic $T^1_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and eight bandwidth values $h$.]

Table 14: Proportions of rejections of the null hypothesis for the parametric family considered, with different sample sizes. Model parameters: $\sigma^2 = 0.16$ and $a_e = 0.3$. Test statistic $T^2_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and eight bandwidth values $h$.]

Table 15: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $a_e$. Model parameters: $\sigma^2 = 0.16$ and $n = 400$. Test statistic $T^1_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and eight bandwidth values $h$.]

Table 16: Proportions of rejections of the null hypothesis for the parametric family considered, with different values of $a_e$. Model parameters: $\sigma^2 = 0.16$ and $n = 400$. Test statistic $T^2_{n,p}$ for $p = 0, 1$. [Columns: Estimator, $c$, $n$, Method, and eight bandwidth values $h$.]

References
Alcalá J, Cristóbal J, González-Manteiga W (1999) Goodness-of-fit test for linear models based on local polynomials. Statistics & Probability Letters 42:39–46
Biedermann S, Dette H (2000) Testing linearity of regression models with dependent errors by kernel based methods. TEST 9:417–438
Deschepper E, Thas O, Ottoy JP (2008) Tests and diagnostic plots for detecting lack-of-fit for circular-linear regression models. Biometrics 64(3):912–920
Eubank RL, Li CS, Wang S (2005) Testing lack-of-fit of parametric regression models using nonparametric regression techniques. Statistica Sinica 15:135–152
Fan J, Zhang C, Zhang J (2001) Generalized likelihood ratio statistics and Wilks phenomenon. The Annals of Statistics 29(1):153–193
Fernández-Casal R (2019) npsp: Nonparametric spatial (geo)statistics. URL http://cran.r-project.org/package=npsp, R package version 0.7-5
Fisher NI, Lee AJ (1992) Regression models for an angular response. Biometrics 48(3):665–677
García-Portugués E, Van Keilegom I, Crujeiras RM, González-Manteiga W (2016) Testing parametric models in linear-directional regression. Scandinavian Journal of Statistics 43(4):1178–1191
González-Manteiga W, Cao R (1993) Testing the hypothesis of a general linear model using nonparametric regression estimation. TEST 2(1-2):161–188
González-Manteiga W, Crujeiras RM (2013) An updated review of Goodness-of-Fit tests for regression models. TEST 22(3):361–411
González Manteiga W, Vilar Fernández J (1995) Testing linear regression models using nonparametric regression estimators when errors are non-independent. Computational Statistics & Data Analysis 20(5):521–541
Härdle W, Mammen E (1993) Comparing nonparametric versus parametric regression fits. The Annals of Statistics 21:1926–1947
Hart J (2013) Nonparametric Smoothing and Lack-of-Fit Tests. Springer Science & Business Media
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109
Jammalamadaka SR, SenGupta A (2001) Topics in Circular Statistics, vol 5. World Scientific
Jona-Lasinio G, Gelfand A, Jona-Lasinio M (2012) Spatial analysis of wave direction data using wrapped Gaussian processes. The Annals of Applied Statistics 6(4):1478–1498
Jona-Lasinio G, Mastrantonio G, Santoro M (2019) CircSpaceTime: Spatial and Spatio-Temporal Bayesian Model for Circular Data. URL http://cran.r-project.org/package=CircSpaceTime, R package version 0.9.0
Kim S, SenGupta A (2017) Multivariate-multiple circular regression. Journal of Statistical Computation and Simulation 87(7):1277–1291
Kozek AS (1991) A nonparametric test of fit of a parametric model. Journal of Multivariate Analysis 37(1):66–75
Lagona F, Picone M, Maruotti A (2015) A hidden Markov model for the analysis of cylindrical time series. Environmetrics 26(8):534–544
Lund U (1999) Least circular distance regression for directional data. Journal of Applied Statistics 26(6):723–733
Mastrantonio G, Gelfand AE, Lasinio GJ (2016) The wrapped skew Gaussian process for analyzing spatio-temporal data. Stochastic Environmental Research and Risk Assessment 30(8):2231–2242
Meilán-Vila A, Crujeiras R, Francisco-Fernández M, Panzera A (2020a) Nonparametric circular regression estimation with application to wave directions. (Submitted)
Meilán-Vila A, Fernández-Casal R, Crujeiras RM, Francisco-Fernández M (2020b) A computational validation for nonparametric assessment of spatial trends. arXiv 2002.05489
Meilán-Vila A, Francisco-Fernández M, Crujeiras R, Panzera A (2020c) Nonparametric multivariate regression estimation for circular responses. arXiv 2001.10317
Meilán-Vila A, Opsomer JD, Francisco-Fernández M, Crujeiras RM (2020d) A goodness-of-fit test for regression models with spatially correlated errors. TEST 29:728–749
Opsomer J, Francisco-Fernández M (2010) Finding local departures from a parametric model using nonparametric regression. Statistical Papers 51(1):69
Park C, Kim TY, Ha J, Luo ZM, Hwang SY (2015) Using a bimodal kernel for a nonparametric regression specification test. Statistica Sinica 25:1145–1161
Presnell B, Morrison SP, Littell RC (1998) Projected multivariate linear models for directional data. Journal of the American Statistical Association 93(443):1068–1077
R Development Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, URL