Featured Research

Statistics Theory

Functions with average smoothness: structure, algorithms, and learning

We initiate a program of average smoothness analysis for efficiently learning real-valued functions on metric spaces. Rather than using the Lipschitz constant as the regularizer, we define a local slope at each point and gauge the function complexity as the average of these values. Since the mean can be dramatically smaller than the maximum, this complexity measure can yield considerably sharper generalization bounds -- assuming that these admit a refinement where the Lipschitz constant is replaced by our average of local slopes. Our first major contribution is to obtain just such distribution-sensitive bounds. This required overcoming a number of technical challenges, perhaps the most formidable of which was bounding the empirical covering numbers, which can be much worse-behaved than the ambient ones. Our combinatorial results are accompanied by efficient algorithms for smoothing the labels of the random sample, as well as guarantees that the extension from the sample to the whole space will continue to be, with high probability, smooth on average. Along the way we discover a surprisingly rich combinatorial and analytic structure in the function class we define.
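
As a rough numerical illustration of the gap between the two complexity measures (not the paper's exact definitions, whose local slope is defined more carefully), the sketch below compares the empirical Lipschitz constant with the empirical average of pointwise slopes for a step function sampled on [0, 1]; the function, sample size, and choice of metric are illustrative assumptions.

```python
import numpy as np

def local_slopes(x, y):
    """Largest slope |f(x_i) - f(x_j)| / d(x_i, x_j) seen at each sample point
    (here the metric space is an interval with d(a, b) = |a - b|)."""
    d = np.abs(x[:, None] - x[None, :])
    np.fill_diagonal(d, np.inf)                 # ignore the zero self-distance
    return (np.abs(y[:, None] - y[None, :]) / d).max(axis=1)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 500))
y = (x >= 0.5).astype(float)                    # a single sharp step

slopes = local_slopes(x, y)
print("empirical Lipschitz constant:", slopes.max())   # ~ 1 / (smallest gap across the step)
print("empirical average slope     :", slopes.mean())  # much smaller
```

The worst-case constant is driven entirely by the two sample points straddling the step, while the average remains moderate, which is the kind of gap the distribution-sensitive bounds are meant to exploit.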

Statistics Theory

Gaussian distributions on Riemannian symmetric spaces in the large N limit

We consider Gaussian distributions on certain Riemannian symmetric spaces. In contrast to the Euclidean case, it is challenging to compute the normalization factors of such distributions, which we refer to as partition functions. In some cases, such as the space of Hermitian positive definite matrices or hyperbolic space, it is possible to compute them exactly using techniques from random matrix theory. However, in most cases which are important to applications, such as the space of symmetric positive definite (SPD) matrices or the Siegel domain, this is only possible numerically. Moreover, when we consider, for instance, high-dimensional SPD matrices, the known algorithms for computing partition functions can become exceedingly slow. Motivated by notions from theoretical physics, we will discuss how to approximate the partition functions in the large N limit: an approximation that gets increasingly better as the dimension of the underlying symmetric space (more precisely, its rank) gets larger. We will give formulas for leading order terms in the case of SPD matrices and related spaces. Furthermore, we will characterize the large N limit of the Siegel domain through a singular integral equation arising as a saddle-point equation.

Statistics Theory

Gaussian linear approximation for the estimation of the Shapley effects

In this paper, we address the estimation of the sensitivity indices called "Shapley effects". These sensitivity indices can handle dependent input variables. The Shapley effects are generally difficult to estimate, but they are easily computable in the Gaussian linear framework. The aim of this work is to use the values of the Shapley effects in an approximated Gaussian linear framework as estimators of the true Shapley effects corresponding to a non-linear model. First, we assume that the input variables are Gaussian with small variances. We provide rates of convergence of the estimated Shapley effects to the true Shapley effects. Then, we focus on the case where the inputs are given by a non-Gaussian empirical mean. We prove that, under some mild assumptions, as the number of terms in the empirical mean increases, the difference between the true Shapley effects and the estimated Shapley effects given by the Gaussian linear approximation converges to 0. Our theoretical results are supported by numerical studies, which show that the Gaussian linear approximation is accurate and significantly reduces the computational time.
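
For small input dimension, the Gaussian linear Shapley effects mentioned above can be computed exactly by brute force over subsets. The sketch below assumes the linear model Y = beta^T X with X ~ N(mu, Sigma) and the standard value function Var(E[Y | X_S]); the function names are mine and the enumeration is exponential in the dimension, so this is only a toy check of the closed-form Gaussian linear case, not the paper's estimation procedure.

```python
import numpy as np
from itertools import combinations
from math import factorial

def var_explained(beta, Sigma, S):
    """Var(E[Y | X_S]) for Y = beta^T X with X ~ N(mu, Sigma) (Gaussian linear case)."""
    S = list(S)
    T = [j for j in range(len(beta)) if j not in S]
    total = beta @ Sigma @ beta
    if not S:
        return 0.0
    if not T:
        return total
    # Var(Y | X_S) = beta_T' (Sigma_TT - Sigma_TS Sigma_SS^{-1} Sigma_ST) beta_T
    cond = Sigma[np.ix_(T, T)] - Sigma[np.ix_(T, S)] @ np.linalg.solve(
        Sigma[np.ix_(S, S)], Sigma[np.ix_(S, T)])
    return total - beta[T] @ cond @ beta[T]

def shapley_effects(beta, Sigma):
    """Brute-force Shapley effects, normalized so that they sum to 1."""
    d = len(beta)
    eff = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                w = factorial(k) * factorial(d - k - 1) / factorial(d)
                eff[i] += w * (var_explained(beta, Sigma, list(S) + [i])
                               - var_explained(beta, Sigma, S))
    return eff / (beta @ Sigma @ beta)

beta = np.array([1.0, 1.0, 0.5])
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
print(shapley_effects(beta, Sigma))   # the effects sum to 1 by the efficiency property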

Statistics Theory

General Hannan and Quinn Criterion for Common Time Series

This paper aims to study data-driven model selection criteria for a large class of time series, which includes ARMA and AR(∞) processes, as well as GARCH, ARCH(∞), APARCH and many other processes. We tackle the challenging issue of designing adaptive criteria that enjoy the strong consistency property: when the observations are generated from one of the aforementioned models, the new criteria select the true model almost surely asymptotically. The proposed criteria are based on the minimization of a penalized contrast akin to the Hannan and Quinn criterion; the penalty involves a term that is known for most classical time series models, while for more complex models this term can be calibrated in a data-driven way. Monte Carlo experiments and an illustrative example on the CAC 40 index are performed to highlight the obtained results.
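
The criteria studied here generalize the classical Hannan and Quinn penalty. As a point of reference only, the sketch below applies the classical form n·log(σ̂²_p) + 2·c·p·log(log n), with c > 1, to select the order of an AR(p) model fitted by ordinary least squares; the constant c and the OLS fitting routine are illustrative choices, not the paper's data-driven calibration.

```python
import numpy as np

def fit_ar_ols(x, p):
    """Fit AR(p) with intercept by ordinary least squares; return the residual variance."""
    n = len(x)
    if p == 0:
        resid = x - x.mean()
        return resid @ resid / n
    X = np.column_stack([x[p - k - 1:n - k - 1] for k in range(p)])  # lags 1..p
    X = np.column_stack([np.ones(n - p), X])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid / (n - p)

def hannan_quinn_order(x, p_max=10, c=2.0):
    """Select the AR order minimizing HQ(p) = n*log(sigma2_p) + 2*c*p*log(log n), c > 1."""
    n = len(x)
    scores = [n * np.log(fit_ar_ols(x, p)) + 2.0 * c * p * np.log(np.log(n))
              for p in range(p_max + 1)]
    return int(np.argmin(scores)), scores

# Simulate an AR(2) process and check that the criterion recovers p = 2.
rng = np.random.default_rng(1)
n = 2000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()
print(hannan_quinn_order(x)[0])
```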

Statistics Theory

Generalization error of random features and kernel methods: hypercontractivity and kernel matrix concentration

Consider the classical supervised learning problem: we are given data (y_i, x_i), i ≤ n, with y_i a response and x_i ∈ X a covariate vector, and try to learn a model f: X → R to predict future responses. Random features methods map the covariate vector x_i to a point φ(x_i) in a higher-dimensional space R^N, via a random featurization map φ. We study the use of random features methods in conjunction with ridge regression in the feature space R^N. This can be viewed as a finite-dimensional approximation of kernel ridge regression (KRR), or as a stylized model for neural networks in the so-called lazy training regime. We define a class of problems satisfying certain spectral conditions on the underlying kernels, and a hypercontractivity assumption on the associated eigenfunctions. These conditions are verified by classical high-dimensional examples. Under these conditions, we prove a sharp characterization of the error of random features ridge regression. In particular, we address two fundamental questions: (1) What is the generalization error of KRR? (2) How big should N be for the random features approximation to achieve the same error as KRR? In this setting, we prove that KRR is well approximated by a projection onto the top ℓ eigenfunctions of the kernel, where ℓ depends on the sample size n. We show that the test error of random features ridge regression is dominated by its approximation error and is larger than the error of KRR as long as N ≤ n^{1-δ} for some δ > 0. We characterize this gap. For N ≥ n^{1+δ}, random features achieve the same error as the corresponding KRR, and further increasing N does not lead to a significant change in test error.
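
As a quick numerical companion, the sketch below compares the test error of kernel ridge regression with that of random features ridge regression as N grows, using standard random Fourier features for the Gaussian kernel (which may differ from the featurization maps analyzed in the paper); the data-generating function, ridge parameter, and kernel width are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, gamma = 300, 5, 1e-2, 0.5            # sample size, dimension, ridge, RBF width

X = rng.standard_normal((n, d))
Xte = rng.standard_normal((1000, d))
def f(A):
    return np.sin(A[:, 0]) + 0.5 * A[:, 1] * A[:, 2]
y, yte = f(X) + 0.1 * rng.standard_normal(n), f(Xte)

def rbf(A, B):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Kernel ridge regression with the Gaussian (RBF) kernel.
alpha = np.linalg.solve(rbf(X, X) + lam * n * np.eye(n), y)
krr_err = np.mean((rbf(Xte, X) @ alpha - yte) ** 2)

def rff_ridge_error(N):
    """Ridge regression on N random Fourier features approximating the RBF kernel."""
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, N))
    b = rng.uniform(0, 2 * np.pi, N)
    phi = lambda A: np.sqrt(2.0 / N) * np.cos(A @ W + b)
    Z = phi(X)
    w = np.linalg.solve(Z.T @ Z + lam * n * np.eye(N), Z.T @ y)
    return np.mean((phi(Xte) @ w - yte) ** 2)

print("KRR test error:", krr_err)
for N in (50, 300, 2000):
    print(f"N = {N:4d} random features:", rff_ridge_error(N))
```

For small N the random features error stays well above the KRR error, and once N is comfortably larger than n the two essentially coincide, mirroring the N ≤ n^{1-δ} versus N ≥ n^{1+δ} dichotomy described above.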

Statistics Theory

Generalized Spacing-Statistics and a New Family of Non-Parametric Tests

Random divisions of an interval arise in various contexts, including statistics, physics, and geometric analysis. For testing the uniformity of a random partition of the unit interval [0,1] into k disjoint subintervals of sizes (S_k[1], …, S_k[k]), Greenwood (1946) suggested using the squared ℓ_2-norm of this size vector as a test statistic, prompting a number of subsequent studies. Despite much progress on understanding its power and asymptotic properties, attempts to find its exact distribution have so far succeeded only for small values of k. Here, we develop an efficient method to compute the distribution of the Greenwood statistic and more general spacing-statistics for an arbitrary value of k. Specifically, we consider random divisions of {1, 2, …, n} into k subsets of consecutive integers and study ∥S_{n,k}∥_{p,w}^p, the p-th power of the weighted ℓ_p-norm of the subset size vector S_{n,k} = (S_{n,k}[1], …, S_{n,k}[k]) for arbitrary weights w = (w_1, …, w_k). We present an exact and quickly computable formula for its moments, as well as a simple algorithm to accurately reconstruct a probability distribution from its moment sequence. We also study various scaling limits, one of which corresponds to the Greenwood statistic in the case of p = 2 and w = (1, …, 1), and this connection allows us to obtain information about the regularity, monotonicity and local behavior of its distribution. Lastly, we devise a new family of non-parametric tests using ∥S_{n,k}∥_{p,w}^p and demonstrate that they exhibit substantially improved power for a large class of alternatives, compared to existing popular methods such as the Kolmogorov-Smirnov, Cramér-von Mises, and Mann-Whitney/Wilcoxon rank-sum tests.
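
A Monte Carlo version of the classical Greenwood test, i.e. the p = 2, unit-weight special case applied to the spacings generated by a sample (rather than the paper's exact moment-based reconstruction), might look as follows; the sample sizes, Monte Carlo budget, and two-sided p-value convention are illustrative choices.

```python
import numpy as np

def greenwood_statistic(u):
    """Sum of squared spacings induced by a sample u in [0, 1] (Greenwood, 1946)."""
    s = np.diff(np.concatenate(([0.0], np.sort(u), [1.0])))
    return np.sum(s ** 2)

def greenwood_test(u, n_mc=20000, rng=None):
    """Monte Carlo p-value for H0: the sample is i.i.d. Uniform(0, 1)."""
    if rng is None:
        rng = np.random.default_rng(0)
    obs = greenwood_statistic(u)
    null = np.array([greenwood_statistic(rng.uniform(size=len(u)))
                     for _ in range(n_mc)])
    # Two-sided: both unusually even and unusually clustered spacings count as evidence.
    p_hi = (null >= obs).mean()
    p_lo = (null <= obs).mean()
    return obs, 2 * min(p_hi, p_lo)

rng = np.random.default_rng(2)
print(greenwood_test(rng.uniform(size=50), rng=rng))      # uniform sample: large p-value
print(greenwood_test(rng.beta(5, 5, size=50), rng=rng))   # clustered near 0.5: small p-value
```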

Statistics Theory

Generating Sparse Stochastic Processes Using Matched Splines

We provide an algorithm to generate trajectories of sparse stochastic processes that are solutions of linear ordinary differential equations driven by Lévy white noises. A recent paper showed that these processes are limits in law of generalized compound-Poisson processes. Based on this result, we derive an off-the-grid algorithm that generates arbitrarily close approximations of the target process. Our method relies on a B-spline representation of generalized compound-Poisson processes. We illustrate numerically the validity of our approach.
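
The paper's off-the-grid B-spline construction is not reproduced here, but the flavor of a compound-Poisson approximation can be illustrated with a toy first-order example: a linear ODE X' + θX = w driven by compound-Poisson white noise has a closed-form trajectory (a sum of decaying exponentials anchored at the jump times) that can be evaluated at arbitrary query points, with no simulation grid. The jump rate, amplitude law, and parameter values below are arbitrary assumptions for illustration.

```python
import numpy as np

def compound_poisson_ode(theta=1.0, rate=20.0, T=1.0, t_eval=None, rng=None):
    """Trajectory of X' + theta * X = w on [0, T], X(0) = 0, where w is compound-Poisson
    white noise with Poisson jump times and i.i.d. Gaussian jump amplitudes."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_jumps = rng.poisson(rate * T)
    t_jump = np.sort(rng.uniform(0.0, T, n_jumps))      # jump locations
    a_jump = rng.standard_normal(n_jumps)               # jump amplitudes
    if t_eval is None:
        t_eval = np.linspace(0.0, T, 512)
    # Each jump contributes a decaying exponential a_k * exp(-theta * (t - t_k)) for t >= t_k.
    active = t_eval[:, None] >= t_jump[None, :]
    decay = np.exp(-theta * np.clip(t_eval[:, None] - t_jump[None, :], 0.0, None))
    return t_eval, (active * a_jump * decay).sum(axis=1)

t, X = compound_poisson_ode()
print(X[:5])
```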

Statistics Theory

Geometric ergodicity of Gibbs samplers for the Horseshoe and its regularized variants

The Horseshoe is a widely used and popular continuous shrinkage prior for high-dimensional Bayesian linear regression. Recently, regularized versions of the Horseshoe prior have also been introduced in the literature. Various Gibbs sampling Markov chains have been developed in the literature to generate approximate samples from the corresponding intractable posterior densities. Establishing geometric ergodicity of these Markov chains provides crucial technical justification for the accuracy of asymptotic standard errors for Markov chain based estimates of posterior quantities. In this paper, we establish geometric ergodicity for various Gibbs samplers corresponding to the Horseshoe prior and its regularized variants in the context of linear regression. First, we establish geometric ergodicity of a Gibbs sampler for the original Horseshoe posterior under strictly weaker conditions than existing analyses in the literature. Second, we consider the regularized Horseshoe prior introduced in Piironen and Vehtari (2017), and prove geometric ergodicity for a Gibbs sampling Markov chain to sample from the corresponding posterior without any truncation constraint on the global and local shrinkage parameters. Finally, we consider a variant of this regularized Horseshoe prior introduced in Nishimura and Suchard (2020), and again establish geometric ergodicity for a Gibbs sampling Markov chain to sample from the corresponding posterior.
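
For concreteness, a standard auxiliary-variable Gibbs sampler for the original Horseshoe posterior (in the style of Makalic and Schmidt, 2016, which may differ in details from the chains whose geometric ergodicity is analyzed here) can be sketched as follows; the regularized variants of Piironen and Vehtari (2017) and Nishimura and Suchard (2020) would modify the local-scale update. The flat prior on σ² and the hyperparameter choices are the usual defaults, assumed here for illustration.

```python
import numpy as np

def horseshoe_gibbs(X, y, n_iter=2000, rng=None):
    """Auxiliary-variable Gibbs sampler for linear regression with the Horseshoe prior:
    beta_j | lam_j, tau, sigma ~ N(0, lam_j^2 tau^2 sigma^2), lam_j, tau ~ half-Cauchy(0, 1),
    p(sigma^2) ~ 1/sigma^2.  Half-Cauchy scales are handled via inverse-gamma auxiliaries."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    beta, sigma2, tau2 = np.zeros(p), 1.0, 1.0
    lam2, nu, xi = np.ones(p), np.ones(p), 1.0
    inv_gamma = lambda shape, rate: 1.0 / rng.gamma(shape, 1.0 / rate)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma^2 A^{-1}),  A = X'X + diag(1 / (lam2 * tau2))
        A = XtX + np.diag(1.0 / (lam2 * tau2))
        L = np.linalg.cholesky(A)
        beta = np.linalg.solve(A, Xty) + np.sqrt(sigma2) * np.linalg.solve(L.T, rng.standard_normal(p))
        # sigma^2 | rest ~ InvGamma((n + p)/2, (RSS + sum_j beta_j^2 / (lam2_j tau2)) / 2)
        resid = y - X @ beta
        sigma2 = inv_gamma(0.5 * (n + p), 0.5 * (resid @ resid + np.sum(beta**2 / (lam2 * tau2))))
        # Local shrinkage scales and their auxiliaries.
        lam2 = inv_gamma(1.0, 1.0 / nu + beta**2 / (2.0 * tau2 * sigma2))
        nu = inv_gamma(1.0, 1.0 + 1.0 / lam2)
        # Global shrinkage scale and its auxiliary.
        tau2 = inv_gamma(0.5 * (p + 1), 1.0 / xi + np.sum(beta**2 / lam2) / (2.0 * sigma2))
        xi = inv_gamma(1.0, 1.0 + 1.0 / tau2)
        draws[it] = beta
    return draws

# Toy run: 3 strong signals among 20 coefficients.
rng = np.random.default_rng(3)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = (3.0, -2.0, 1.5)
y = X @ beta_true + rng.standard_normal(n)
print(horseshoe_gibbs(X, y)[1000:].mean(axis=0).round(2))
```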

Statistics Theory

Gibbs sampler and coordinate ascent variational inference: a set-theoretical review

One of the fundamental problems in Bayesian statistics is the approximation of the posterior distribution. The Gibbs sampler and coordinate ascent variational inference are widely used approximation techniques that rely on stochastic and deterministic approximations, respectively. In this paper, we define fundamental sets of densities frequently used in Bayesian inference and clarify the two schemes from a set-theoretical point of view. This perspective provides an alternative mechanism for analyzing the two schemes, along with pedagogical insights.
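
As a concrete toy contrast between the two schemes (stochastic versus deterministic approximation), the sketch below runs both on the conjugate normal model with unknown mean and precision; the model, hyperparameters, and update formulas follow the standard textbook treatment (e.g., Bishop, Chapter 10) and are illustrative assumptions rather than anything specific to the paper's set-theoretical framework.

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(2.0, 1.5, size=200)
n, ybar = len(y), y.mean()
# Conjugate model: y_i ~ N(mu, 1/tau), mu | tau ~ N(mu0, 1/(kappa0 * tau)), tau ~ Gamma(a0, b0).
mu0, kappa0, a0, b0 = 0.0, 1.0, 1.0, 1.0

def gibbs(n_iter=5000):
    """Stochastic approximation: alternate draws from the exact full conditionals."""
    mu, tau, out = ybar, 1.0, []
    for _ in range(n_iter):
        kn = kappa0 + n
        mu = rng.normal((kappa0 * mu0 + n * ybar) / kn, np.sqrt(1.0 / (kn * tau)))
        rate = b0 + 0.5 * (np.sum((y - mu) ** 2) + kappa0 * (mu - mu0) ** 2)
        tau = rng.gamma(a0 + 0.5 * (n + 1), 1.0 / rate)
        out.append((mu, tau))
    return np.array(out)

def cavi(n_iter=50):
    """Deterministic approximation: coordinate ascent on the factorization q(mu) q(tau)."""
    E_tau = 1.0
    mu_n = (kappa0 * mu0 + n * ybar) / (kappa0 + n)   # fixed point of the q(mu) mean update
    for _ in range(n_iter):
        lam_n = (kappa0 + n) * E_tau                  # q(mu) = N(mu_n, 1 / lam_n)
        a_n = a0 + 0.5 * (n + 1)
        E_quad = (np.sum((y - mu_n) ** 2) + n / lam_n
                  + kappa0 * ((mu_n - mu0) ** 2 + 1.0 / lam_n))
        b_n = b0 + 0.5 * E_quad                       # q(tau) = Gamma(a_n, b_n)
        E_tau = a_n / b_n
    return mu_n, 1.0 / lam_n, a_n, b_n

samples = gibbs()[1000:]
print("Gibbs posterior mean of mu:", samples[:, 0].mean())
print("CAVI variational mean of mu:", cavi()[0])
```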

Statistics Theory

Global jump filters and realized volatility

For a semimartingale with jumps, we propose a new estimation method for integrated volatility, i.e., the quadratic variation of the continuous martingale part, based on the global jump filter proposed by Inatsugu and Yoshida [8]. To decide whether each increment of the process has jumps, the global jump filter adopts the upper α-quantile of the absolute increments as the threshold. This jump filter is called global since it uses all the observations to classify one increment. We give a rate of convergence and prove asymptotic mixed normality of the global realized volatility and its variant, the "Winsorized global volatility". By simulation studies, we show that our estimators outperform previous realized volatility estimators that use a few adjacent increments to mitigate the effects of jumps.
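
A minimal sketch of the global-threshold idea (not the paper's exact estimators, which treat the truncation and its Winsorized variant more carefully) is given below: all absolute increments are pooled, their upper-α quantile is used as a single global threshold, and the squared increments below it are summed. The path model, α, and jump sizes are arbitrary illustrative choices; note that this raw truncated sum slightly underestimates the integrated volatility because it also discards the largest diffusive increments.

```python
import numpy as np

def global_jump_filtered_rv(X, alpha=0.05):
    """Truncated realized volatility with a *global* threshold: the upper-alpha quantile
    of all absolute increments.  A local filter would instead threshold each increment
    using only a few adjacent increments."""
    dX = np.diff(X)
    threshold = np.quantile(np.abs(dX), 1.0 - alpha)   # upper-alpha quantile
    keep = np.abs(dX) <= threshold                     # increments classified as jump-free
    return np.sum(dX[keep] ** 2)

# Toy check: Brownian path plus a few large jumps.
rng = np.random.default_rng(5)
n, sigma = 10_000, 0.8
dW = sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)
jumps = np.zeros(n)
jumps[rng.choice(n, size=5, replace=False)] = rng.normal(0.0, 2.0, 5)
X = np.concatenate(([0.0], np.cumsum(dW + jumps)))

print("true integrated volatility:", sigma ** 2)
print("plain realized volatility :", np.sum(np.diff(X) ** 2))
print("global-jump-filtered RV   :", global_jump_filtered_rv(X))
```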
