Featured Research

Statistics Theory

A Note On Inference for the Mixed Fractional Ornstein-Uhlenbeck Process with Drift

This paper is devoted to parameter estimation for the mixed fractional Ornstein-Uhlenbeck process with drift. Large-sample asymptotic properties of the maximum likelihood estimator are deduced using Laplace transform computations and the Cameron-Martin formula, building on results from \cite{CK19}.
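
As a rough illustration of drift estimation in this family, the sketch below simulates a classical (non-fractional) Ornstein-Uhlenbeck process and recovers its drift parameter with the standard continuous-time MLE, discretized over the sample path; the mixed fractional setting and the Cameron-Martin machinery of the paper are not reproduced here.

```python
import numpy as np

# Simulate dX = -theta * X dt + dW by Euler-Maruyama, then estimate theta via
# the discretized continuous-time MLE  theta_hat = -sum(X dX) / sum(X^2 dt).
rng = np.random.default_rng(0)
theta_true, dt, n = 1.0, 0.01, 50_000

x = np.empty(n + 1)
x[0] = 0.0
noise = rng.normal(scale=np.sqrt(dt), size=n)
for i in range(n):
    x[i + 1] = x[i] - theta_true * x[i] * dt + noise[i]

increments = np.diff(x)
theta_hat = -np.sum(x[:-1] * increments) / (np.sum(x[:-1] ** 2) * dt)
print(theta_hat)  # close to theta_true = 1 for a long sample path
```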

Statistics Theory

A Note on Likelihood Ratio Tests for Models with Latent Variables

The likelihood ratio test (LRT) is widely used for comparing the relative fit of nested latent variable models. Following Wilks' theorem, the LRT is conducted by comparing the LRT statistic with its asymptotic distribution under the restricted model, a χ²-distribution with degrees of freedom equal to the difference in the number of free parameters between the two nested models under comparison. For models with latent variables, however, such as factor analysis, structural equation models, and random effects models, it is often found that the χ² approximation does not hold. In this note, we show how the regularity conditions of Wilks' theorem may be violated, using three examples of models with latent variables. In addition, a more general theory for the LRT is given that provides the correct asymptotics for these tests. This general theory was first established in Chernoff (1954) and discussed in both van der Vaart (2000) and Drton (2009), but it does not seem to have received enough attention. We illustrate this general theory with the three examples.
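
To see the regular case that Wilks' theorem does cover, the following sketch (not taken from the note) simulates the LRT for H0: μ = 0 against a free mean in an N(μ, 1) model, where the statistic reduces to n·x̄² and its null distribution should match χ² with one degree of freedom; the note's point is that latent variable models can break exactly this approximation.

```python
import numpy as np

# Under H0: mu = 0 in an N(mu, 1) model with known variance, the LRT statistic
# is n * xbar^2, which Wilks' theorem says is asymptotically chi-square(1).
rng = np.random.default_rng(1)
n_rep, n = 20_000, 200
xbar = rng.normal(size=(n_rep, n)).mean(axis=1)
lrt = n * xbar ** 2

# Empirical 95th percentile vs. the chi-square(1) critical value 3.841.
q95 = np.quantile(lrt, 0.95)
print(q95)
```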

Statistics Theory

A Note on Online Change Point Detection

We investigate sequential change point estimation and detection in univariate nonparametric settings, where a stream of independent observations from sub-Gaussian distributions with a common variance factor and piecewise-constant but otherwise unknown means is collected. We develop a simple CUSUM-based methodology that provably controls the probability of false alarms or the average run length while minimizing, in a minimax sense, the detection delay. We allow all the model parameters to vary in order to capture a broad range of levels of statistical hardness for the problem at hand. We further show how our methodology applies to the case in which multiple change points are to be estimated sequentially.
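
A minimal single-change CUSUM sketch in the spirit of this methodology; the tuning constants `drift` and `threshold` below are illustrative choices, not the paper's calibrated values, and in practice they govern the false-alarm rate versus the detection delay.

```python
import numpy as np

def cusum_detect(stream, drift=0.5, threshold=8.0):
    """One-sided CUSUM: alarm when the cumulative exceedance statistic
    crosses `threshold`; `drift` discounts the in-control mean."""
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + x - drift)
        if s > threshold:
            return t  # alarm time
    return None

rng = np.random.default_rng(2)
pre = rng.normal(0.0, 1.0, 200)    # in-control: mean 0
post = rng.normal(2.0, 1.0, 200)   # out-of-control: mean 2 from time 200
alarm = cusum_detect(np.concatenate([pre, post]))
print(alarm)  # alarm shortly after the true change at time 200
```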

Statistics Theory

A Note on the Likelihood Ratio Test in High-Dimensional Exploratory Factor Analysis

The likelihood ratio test is widely used in exploratory factor analysis to assess the model fit and determine the number of latent factors. Despite its popularity and clear statistical rationale, researchers have found that when the dimension of the response data is large compared to the sample size, the classical chi-square approximation of the likelihood ratio test statistic often fails. Theoretically, it has remained an open problem to determine when this failure occurs as the dimension of the data increases; practically, the effect of high dimensionality has been little examined in exploratory factor analysis, and a clear statistical guideline on the validity of the conventional chi-square approximation is lacking. To address this problem, we investigate the failure of the chi-square approximation of the likelihood ratio test in high-dimensional exploratory factor analysis and derive the necessary and sufficient condition for the chi-square approximation to be valid. The results yield simple quantitative guidelines to check in practice and provide useful statistical insight into the practice of exploratory factor analysis.

Statistics Theory

A Note on the Sum of Non-Identically Distributed Doubly Truncated Normal Distributions

It is proved that the sum of n independent but non-identically distributed doubly truncated normal random variables converges in distribution to a normal distribution. It is also shown how the result can be applied to estimating a constrained mixed effects model.
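
A quick Monte Carlo sketch of the stated limit: standardized sums of independent, non-identically distributed doubly truncated normals are checked against a Gaussian benchmark. The truncation bounds, means, and standard deviations below are arbitrary illustrative choices, and sampling is by simple rejection.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 5000

mus = np.linspace(-1, 1, n)          # heterogeneous means
sds = np.linspace(0.5, 1.5, n)       # heterogeneous scales
lo, hi = mus - 2 * sds, mus + 2 * sds  # doubly truncating bounds

def sample_truncated(size):
    """Rejection-sample `size` rows of the n truncated normal summands."""
    out = np.empty((size, n))
    for j in range(n):
        draws = []
        while len(draws) < size:
            cand = rng.normal(mus[j], sds[j], size)
            keep = cand[(cand > lo[j]) & (cand < hi[j])]
            draws.extend(keep[: size - len(draws)])
        out[:, j] = draws
    return out

sums = sample_truncated(reps).sum(axis=1)
z = (sums - sums.mean()) / sums.std()
frac = float(np.mean(np.abs(z) < 1))  # near 0.68 if the limit is normal
print(frac)
```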

Statistics Theory

A Portmanteau-type test for detecting serial correlation in locally stationary functional time series

The portmanteau test is the standard method for detecting serial correlation in classical univariate time series analysis. Here the method is extended to observations from a locally stationary functional time series. Asymptotic critical values are obtained by a suitable block multiplier bootstrap procedure. The test is shown to asymptotically hold its level and to be consistent against general alternatives.
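
For reference, the classical univariate portmanteau (Ljung-Box) statistic that this work extends can be sketched as follows; this is the textbook i.i.d. version, not the functional, locally stationary extension of the paper.

```python
import numpy as np

def ljung_box(x, h=10):
    """Ljung-Box portmanteau statistic: a weighted sum of squared sample
    autocorrelations up to lag h; approximately chi-square(h) under the
    white-noise null."""
    n = len(x)
    x = x - x.mean()
    acov0 = np.dot(x, x) / n
    q = 0.0
    for k in range(1, h + 1):
        rk = np.dot(x[:-k], x[k:]) / n / acov0
        q += rk ** 2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(4)
white = rng.normal(size=500)          # no serial correlation
ar1 = np.empty(500)                   # AR(1), strong serial correlation
ar1[0] = 0.0
for t in range(1, 500):
    ar1[t] = 0.6 * ar1[t - 1] + rng.normal()

crit = 18.31  # chi-square(10) upper 5% point
q_white, q_ar1 = ljung_box(white), ljung_box(ar1)
print(q_white < crit, q_ar1 > crit)
```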

Statistics Theory

A Power Analysis for Knockoffs with the Lasso Coefficient-Difference Statistic

In a linear model with possibly many predictors, we consider variable selection procedures of the form {1 ≤ j ≤ p : |β̂_j(λ)| > t}, where β̂(λ) is the Lasso estimate of the regression coefficients and where λ and t may be data dependent. Ordinary Lasso selection is captured by taking t = 0, which allows one to control only λ, whereas thresholded-Lasso selection allows one to control both λ and t. The potential advantages of the latter over the former in terms of power (figuratively, opening up the possibility to look further down the Lasso path) have been quantified recently by leveraging advances in approximate message passing (AMP) theory, but the implications are actionable only when substantial knowledge of the underlying signal is assumed. In this work we theoretically study the power of a knockoffs-calibrated counterpart of thresholded Lasso that enables us to control the FDR in the realistic situation where no prior information about the signal is available. Although the basic AMP framework remains the same, our analysis requires a significant technical extension of existing theory in order to handle the pairing between original variables and their knockoffs. Relying on this extension, we obtain exact asymptotic predictions for the true positive proportion achievable at a prescribed type I error level. In particular, we show that the knockoffs version of thresholded Lasso can perform much better than ordinary Lasso selection when λ is chosen by cross-validation on the augmented matrix.
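
A plain sketch of the selection rule {j : |β̂_j(λ)| > t}: the Lasso is fit by cyclic coordinate descent and the coefficients are thresholded at a user-chosen t. The knockoff calibration of t studied in the paper is not reproduced here; this only illustrates how thresholding looks further down the Lasso path than the t = 0 rule.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - n * lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(5)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.5, 2.0]          # three true signals
y = X @ beta_true + rng.normal(size=n)

beta_hat = lasso_cd(X, y, lam=0.1)
selected = {j for j in range(p) if abs(beta_hat[j]) > 0.5}  # t = 0.5
print(sorted(selected))
```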

Statistics Theory

A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration

In this paper, we develop novel perturbation bounds for the higher-order orthogonal iteration (HOOI) [DLDMV00b]. Under mild regularity conditions, we establish blockwise tensor perturbation bounds for HOOI with guarantees for both tensor reconstruction in the Hilbert-Schmidt norm $\|\widehat{\mathcal{T}} - \mathcal{T}\|_{\mathrm{HS}}$ and mode-k singular subspace estimation in the Schatten-q norm $\|\sin\Theta(\widehat{U}_k, U_k)\|_q$ for any q ≥ 1. We show that the upper bounds for mode-k singular subspace estimation are unilateral and converge linearly to a quantity characterized by the blockwise errors of the perturbation and the signal strength. For the tensor reconstruction error bound, we express the bound through a simple quantity ξ, which depends only on the perturbation and the multilinear rank of the underlying signal. A rate-matching deterministic lower bound for tensor reconstruction, which demonstrates the optimality of HOOI, is also provided. Furthermore, we prove that one-step HOOI (i.e., HOOI with only a single iteration) is also optimal in terms of tensor reconstruction and can be used to lower the computational cost. The perturbation results are also extended to the case in which only some modes of $\mathcal{T}$ have low-rank structure. We support our theoretical results with extensive numerical studies. Finally, we apply the novel perturbation bounds for HOOI to two applications from machine learning and statistics, tensor denoising and tensor co-clustering, which demonstrates the superiority of the new perturbation results.

Statistics Theory

A Small-Uniform Statistic for the Inference of Functional Linear Regressions

We propose a "small-uniform" statistic for the inference of the functional PCA (FPCA) estimator in a functional linear regression model. The literature has shown two extreme behaviors: on the one hand, the FPCA estimator does not converge in distribution in its norm topology; on the other hand, the FPCA estimator does have a pointwise asymptotic normal distribution. Our statistic takes a middle ground between these two extremes: after a suitable rate normalization, the small-uniform statistic is constructed as the maximizer of a fractional programming problem of the FPCA estimator over a finite-dimensional subspace whose dimension grows with the sample size. We show the rate at which our scalar statistic converges in probability to the supremum of a Gaussian process. The small-uniform statistic has applications in hypothesis testing. Simulations show our statistic has power comparable to, or slightly better than, the two statistics of Cardot, Ferraty, Mas and Sarda (2003).

Statistics Theory

A Statistical Learning Assessment of Huber Regression

As one of the triumphs and milestones of robust statistics, Huber regression plays an important role in robust inference and estimation, and it has found a great variety of applications in machine learning. In a parametric setup, it has been extensively studied. However, in the statistical learning context, where a function is typically learned in a nonparametric way, there is still a lack of theoretical understanding of how Huber regression estimators learn the conditional mean function and why they work in the absence of light-tailed noise assumptions. To address these fundamental questions, we conduct an assessment of Huber regression from a statistical learning viewpoint. First, we show that the usual risk consistency property of Huber regression estimators, which is typically pursued in machine learning, cannot guarantee their learnability in mean regression. Second, we argue that Huber regression should be implemented adaptively to perform mean regression, implying that one needs to tune the scale parameter in accordance with the sample size and the moment condition of the noise. Third, with an adaptive choice of the scale parameter, we demonstrate that Huber regression estimators can be asymptotically mean-regression calibrated under (1+ε)-moment conditions (ε > 0). Last but not least, under the same moment conditions, we establish almost sure convergence rates for Huber regression estimators. Note that the (1+ε)-moment conditions accommodate the special case where the response variable has infinite variance, so the established convergence rates justify the robustness of Huber regression estimators. In these senses, the present study provides a systematic statistical learning assessment of Huber regression estimators and justifies their merits in terms of robustness from a theoretical viewpoint.
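
A hedged sketch of the adaptive idea in the simplest setting: estimate a mean under heavy-tailed, infinite-variance noise by minimizing the Huber loss, with a scale parameter that grows with the sample size. The rule τ = √(n / log n) below is one common adaptive choice (with the noise scale taken as 1 for illustration), not the paper's prescription.

```python
import numpy as np

def huber_mean(x, tau, n_steps=500, lr=0.1):
    """Minimize the Huber loss (1/n) * sum rho_tau(x_i - mu) over mu by
    gradient descent; the gradient is minus the mean of the clipped residuals."""
    mu = float(np.median(x))  # robust starting point
    for _ in range(n_steps):
        r = x - mu
        grad = -np.mean(np.clip(r, -tau, tau))
        mu -= lr * grad
    return mu

rng = np.random.default_rng(6)
n = 2000
# Student-t noise with 1.5 df: finite (1+eps)-moments for eps < 0.5,
# but infinite variance, matching the heavy-tailed regime discussed above.
x = 1.0 + rng.standard_t(df=1.5, size=n)

tau = np.sqrt(n / np.log(n))  # scale parameter grown with the sample size
m = huber_mean(x, tau)
print(m)  # close to the true mean 1.0 despite infinite variance
```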

