Featured Research

Statistics Theory

Explicit expressions for joint moments of n-dimensional elliptical distributions

Inspired by Stein's lemma, we derive two expressions for the joint moments of elliptical distributions. We use two different methods to derive E[X_1^2 f(X)] for any measurable function f satisfying some regularity conditions. Then, by applying this result, we obtain new formulae for expectations of products of normally distributed random variables, and also present simplified expressions of E[X_1^2 f(X)] for multivariate Student-t, logistic and Laplace distributions.

Statistics Theory

Exponential confidence interval based on the recursive Wolverton-Wagner density estimation

We derive unimprovable, exponentially decreasing tail estimates, expressed through Grand Lebesgue Space norms, for the distribution of the exactly normed deviation of the well-known recursive Wolverton-Wagner multivariate density estimator. We consider both the pointwise error and the Lebesgue-Riesz norm error of the density estimate.
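For readers unfamiliar with the estimator, the recursive update can be sketched as follows. This is a minimal one-dimensional illustration, not the paper's multivariate setting: the Gaussian kernel, the bandwidth sequence h_i = i^(-1/5), and the names `RecursiveDensity` and `gaussian_kernel` are all assumptions chosen for the example.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


class RecursiveDensity:
    """One-dimensional sketch of a recursive kernel density estimate on a fixed grid.

    After n observations the stored value at grid point x equals
        (1/n) * sum_{i=1..n} K((x - X_i) / h_i) / h_i,
    and it is updated one observation at a time without storing past data.
    """

    def __init__(self, grid, bandwidth=lambda i: i ** -0.2):
        self.grid = list(grid)
        self.bandwidth = bandwidth  # hypothetical choice: h_i = i^(-1/5)
        self.n = 0
        self.values = [0.0] * len(self.grid)

    def update(self, x_new):
        """Fold one new observation into the running estimate."""
        self.n += 1
        h = self.bandwidth(self.n)
        for j, x in enumerate(self.grid):
            contrib = gaussian_kernel((x - x_new) / h) / h
            # Running-mean form of f_n = ((n-1)/n) * f_{n-1} + (1/n) * contrib
            self.values[j] += (contrib - self.values[j]) / self.n


est = RecursiveDensity(grid=[-1.0, 0.0, 1.0])
for obs in [0.1, -0.3, 0.4, 0.0]:
    est.update(obs)
```

Each observation uses the bandwidth that was current when it arrived, which is what distinguishes the recursive estimator from the classical one that applies a single bandwidth to all observations.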

Statistics Theory

Exponential inequalities for sampling designs

In this work we introduce a general approach, based on the martingale representation of a sampling design and the Azuma-Hoeffding inequality, to derive exponential inequalities for the difference between a Horvitz-Thompson estimator and its expectation. Applying this idea, we establish such inequalities for Chao's procedure, Tillé's elimination procedure, the generalized Midzuno method and Brewer's method. As a by-product, we prove that the first three sampling designs are (conditionally) negatively associated. For such sampling designs, we show that the inequality we obtain is usually sharper than the one obtained by applying known results for negatively associated random variables.

Statistics Theory

Factor Modelling for Clustering High-dimensional Time Series

We propose a new unsupervised learning method for clustering a large number of time series based on a latent factor structure. Each cluster is characterized by its own cluster-specific factors in addition to some common factors which affect all the time series concerned. Our setting also offers the flexibility that some time series may not belong to any cluster. Consistency with explicit convergence rates is established for the estimation of the common factors, the cluster-specific factors and the latent clusters. A numerical illustration with both simulated data and a real data example is also reported. As a spin-off, the proposed approach also significantly advances statistical inference for the factor model of Lam and Yao (2012).

Statistics Theory

Factorization and discrete-time representation of multivariate CARMA processes

In this paper we show that, under some mild assumptions, stationary and non-stationary multivariate continuous-time ARMA (MCARMA) processes can be represented as a sum of multivariate complex-valued Ornstein-Uhlenbeck processes. The proof benefits from properties of rational matrix polynomials. One consequence is an alternative description of the autocovariance function of a stationary MCARMA process. Moreover, this representation is used to show that the discrete-time sampled MCARMA(p,q) process is a weak VARMA(p,p-1) process if second moments exist. That result complements the weak VARMA(p,p-1) representation derived in Chambers and Thornton (2012). In particular, it relates the right solvents of the autoregressive polynomial of the MCARMA process to the right solvents of the autoregressive polynomial of the VARMA process; in the one-dimensional case the right solvents are the zeros of the autoregressive polynomial. Finally, a factorization of the sample autocovariance function of the noise sequence is presented, which is useful for statistical inference.

Statistics Theory

False discovery rate control with e-values

E-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. In brief, e-values are realized by random variables with expectation at most one under the null; examples include betting scores, (point null) Bayes factors, likelihood ratios and stopped supermartingales. We design a natural analog of the Benjamini-Hochberg (BH) procedure for false discovery rate (FDR) control that utilizes e-values, called the e-BH procedure, and compare it with the standard procedure for p-values. One of our central results is that, unlike the usual BH procedure, the e-BH procedure controls the FDR at the desired level, with no correction, for any dependence structure between the e-values. We illustrate that the new procedure is convenient in various settings of complicated dependence, structured and post-selection hypotheses, and multi-armed bandit problems. Moreover, the BH procedure is a special case of the e-BH procedure through calibration between p-values and e-values. Overall, the e-BH procedure is a novel, powerful and general tool for multiple testing under dependence that is complementary to the BH procedure, each being an appropriate choice in different applications.
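The abstract does not spell out the rejection rule, so the following is a minimal sketch of the base e-BH step as it is commonly stated: with K hypotheses and level alpha, sort the e-values in decreasing order, find the largest k such that k times the k-th largest e-value is at least K / alpha, and reject the hypotheses with the k largest e-values. The function name `e_bh` and the example inputs are illustrative assumptions, not taken from the paper.

```python
def e_bh(e_values, alpha):
    """Return the sorted indices of hypotheses rejected by the base e-BH rule.

    e_values: one non-negative e-value per hypothesis.
    alpha:    target FDR level in (0, 1).
    """
    K = len(e_values)
    # Indices ordered from the largest e-value to the smallest.
    order = sorted(range(K), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Condition k * e_[k] / K >= 1 / alpha, rearranged to avoid dividing by K.
        if k * e_values[idx] >= K / alpha:
            k_star = k
    return sorted(order[:k_star])


# Hypothetical e-values for five hypotheses, tested at level 0.1.
rejected = e_bh([60.0, 1.0, 30.0, 0.5, 20.0], alpha=0.1)
```

With these inputs the threshold K / alpha is 50, the three largest e-values (60, 30, 20) satisfy the condition at k = 1, 2, 3, and the remaining two do not, so hypotheses 0, 2 and 4 are rejected.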

Statistics Theory

Families of discrete circular distributions with some novel applications

Motivated by some cutting-edge circular data, such as those from smart home technologies and roulette spins from online and physical casinos, we construct some new rich classes of discrete distributions on the circle. We give four new general methods of construction, namely (i) maximum entropy, (ii) centered wrapping, (iii) marginalized and (iv) conditionalized methods. We motivate these methods on the line and then work on the circular case and provide some properties to gain insight into these constructions. We mainly focus on the last two methods, (iii) and (iv), in the context of circular location families, as they are amenable to general methodology. We show that the marginalized and conditionalized discrete circular location families inherit important properties from their parent continuous families. In particular, with the von Mises and wrapped Cauchy distributions as parents, we examine their properties, including the maximum likelihood estimators and the hypothesis test for uniformity, and give a test of serial independence. Using our discrete circular distributions, we demonstrate how to detect a changepoint when the data arise in a sequence and how to fit mixtures of these distributions. Illustrative examples that motivated the work are given. For example, for roulette data, we test for uniformity (unbiasedness), test for serial correlation, detect a changepoint in streaming roulette-spin data, and fit mixtures. We analyse a smart home data set using our mixtures. We examine the effect of ignoring the discreteness of the underlying population, and discuss marginalized versus conditionalized approaches. We give various extensions of the families, to those with skewness and kurtosis and to those supported on an irregular lattice, and discuss a potential extension to general manifolds by showing a construction on the torus.

Statistics Theory

Fast Non-Asymptotic Testing And Support Recovery For Large Sparse Toeplitz Covariance Matrices

We consider n independent p-dimensional Gaussian vectors with covariance matrix having Toeplitz structure. We test that these vectors have independent components against a stationary distribution with sparse Toeplitz covariance matrix, and also select the support of non-zero entries. We assume that the non-zero values can occur in the recent past (time-lag less than p/2). We build test procedures that combine a sum-type and a scan-type procedure but are computationally fast, and show their non-asymptotic behaviour under one-sided (only positive correlations) and two-sided alternatives, respectively. We also exhibit a selector of significant lags and bound the Hamming-loss risk of the estimated support. These results can be extended to the case of nearly Toeplitz covariance structure and to sub-Gaussian vectors. Numerical results illustrate the excellent behaviour of both test procedures and support selectors: the larger the dimension p, the faster the rates.

Statistics Theory

Fast and Asymptotically Powerful Detection for Filamentary Objects in Digital Images

Given an inhomogeneous chain embedded in a noisy image, we consider the conditions under which such an embedded chain is detectable. Many applications, such as detecting moving objects or ship wakes, can be abstracted as detecting the existence of chains. In this work, we provide a detection algorithm with low computational complexity and establish the optimal theoretical detectability in terms of the signal-to-noise ratio (SNR) under the normal distribution model. Specifically, we derive an analytical threshold that specifies what is detectable. We design a longest-significant-chain detection algorithm with computational complexity of order O(n log n). We also prove that our proposed algorithm is asymptotically powerful, meaning that as the dimension n → ∞, the probability of false detection vanishes. We further provide some simulated examples and a real data example, which validate our theory.

Statistics Theory

Feasible Inference for Stochastic Volatility in Brownian Semistationary Processes

This article studies the finite sample behaviour of a number of estimators for the integrated power volatility process of a Brownian semistationary process in the non-semimartingale setting. We establish three consistent feasible estimators for the integrated volatility, two derived from parametric methods and one from a non-parametric method. We then use a simulation study to compare the convergence properties of the estimators to one another, and to a benchmark of an infeasible estimator. We further establish bounds for the asymptotic variance of the infeasible estimator and assess whether a central limit theorem which holds for the infeasible estimator can be translated into a feasible limit theorem for the non-parametric estimator.
