Featured Research

Statistics Theory

Maximum likelihood estimation of regularisation parameters in high-dimensional inverse problems: an empirical Bayesian approach. Part II: Theoretical Analysis

This paper presents a detailed theoretical analysis of the three stochastic approximation proximal gradient algorithms proposed in our companion paper [49] to set regularisation parameters by marginal maximum likelihood estimation. We prove the convergence of a more general stochastic approximation scheme that includes the three algorithms of [49] as special cases. This includes asymptotic and non-asymptotic convergence results under natural and easily verifiable conditions, as well as explicit bounds on the convergence rates. Importantly, the theory is general in that it can be applied to other intractable optimisation problems. A main novelty of the work is that the stochastic gradient estimates of our scheme are constructed from inexact proximal Markov chain Monte Carlo samplers. This allows the use of samplers that scale efficiently to large problems and for which we have precise theoretical guarantees.
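As a rough illustration of the kind of scheme analysed here, the following numpy sketch runs a stochastic-approximation loop on a toy conjugate Gaussian model, using an unadjusted Langevin (ULA) chain as the inexact MCMC sampler. All model choices (the dimension, noise level, step sizes, and the projection interval) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Toy model: y = x + noise, Gaussian prior with unknown precision theta
# playing the role of the regularisation parameter (illustrative only).
rng = np.random.default_rng(0)
d, sigma = 200, 1.0
x_true = rng.normal(0.0, 0.5, size=d)       # true signal, prior precision 4
y = x_true + sigma * rng.normal(size=d)

def grad_log_posterior(x, theta):
    # gradient of log p(x | y, theta) for the Gaussian model above
    return (y - x) / sigma**2 - theta * x

theta, x, gamma = 1.0, y.copy(), 0.02
for _ in range(100):                        # burn-in of the ULA chain
    x = x + gamma * grad_log_posterior(x, theta) \
          + np.sqrt(2 * gamma) * rng.normal(size=d)

for n in range(5000):
    # one inexact MCMC (ULA) step targeting p(x | y, theta)
    x = x + gamma * grad_log_posterior(x, theta) \
          + np.sqrt(2 * gamma) * rng.normal(size=d)
    # stochastic estimate of d/dtheta log p(y | theta)
    g = d / (2 * theta) - 0.5 * np.dot(x, x)
    # projected stochastic approximation update with decreasing steps
    theta = np.clip(theta + (5.0 / (n + 200)) * g, 0.01, 10.0)
```

For this conjugate toy model the fixed point can be checked in closed form: 1/θ* = ||y||²/d − σ², so the loop can be sanity-checked against the analytic marginal maximum likelihood estimate.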

Read more
Statistics Theory

Measuring association with Wasserstein distances

Let π∈Π(μ,ν) be a coupling between two probability measures μ and ν on a Polish space. In this article we propose and study a class of nonparametric measures of association between μ and ν. The analysis is based on the Wasserstein distance between ν and the disintegration π_{x_1} of π with respect to the first coordinate. We also establish basic statistical properties of this new class of measures: we develop a statistical theory for strongly consistent estimators and determine their convergence rate. Throughout our analysis we make use of the so-called adapted/causal Wasserstein distance; in particular, we rely on results established in [Backhoff, Bartl, Beiglböck, Wiesel. Estimating processes in adapted Wasserstein distance. 2020]. Our class of measures offers an alternative to the correlation coefficients proposed by [Dette, Siburg and Stoimenov (2013). A copula-based non-parametric measure of regression dependence. Scandinavian Journal of Statistics 40(1), 21-41] and [Chatterjee (2020). A new coefficient of correlation. Journal of the American Statistical Association, 1-21]. In contrast to these works, our approach also applies to probability laws on general Polish spaces.
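A naive empirical version of such a measure can be sketched with numpy and scipy. Note the caveats: this uses the ordinary W1 distance and a crude quantile-binning disintegration rather than the adapted/causal distance of the paper, it is not normalised, and the function name and tuning choices are illustrative assumptions.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def wasserstein_association(x, y, n_bins=10):
    # Crude plug-in version: average W1 distance between the empirical law
    # of Y within each X-quantile bin and the marginal empirical law of Y.
    x, y = np.asarray(x), np.asarray(y)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    dists = [wasserstein_distance(y[idx == b], y)
             for b in range(n_bins) if np.any(idx == b)]
    return float(np.mean(dists))

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
dep = wasserstein_association(x, 2.0 * x + 0.1 * rng.normal(size=2000))
indep = wasserstein_association(x, rng.normal(size=2000))
```

A strongly dependent pair yields a much larger score than an independent one, which is the qualitative behaviour such association measures are designed to capture.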

Read more
Statistics Theory

Minimax Efficient Finite-Difference Stochastic Gradient Estimators Using Black-Box Function Evaluations

Standard approaches to stochastic gradient estimation, with only noisy black-box function evaluations, use the finite-difference method or its variants. While natural, to our knowledge it has been an open question whether their statistical accuracy is the best possible. This paper answers the question in the affirmative: we show that the central finite-difference scheme is a nearly minimax optimal zeroth-order gradient estimator, for a suitable class of objective functions and mean squared risk, among both the class of linear estimators and the much larger class of all (nonlinear) estimators.
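The central finite-difference estimator under discussion can be sketched as follows; the quadratic test function and noise level are illustrative assumptions.

```python
import numpy as np

def central_fd_gradient(f, x, h=1e-3):
    # Zeroth-order gradient estimate of f at x from (noisy) function
    # evaluations, using central finite differences coordinate by coordinate.
    x = np.asarray(x, dtype=float)
    g = np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# noisy black-box quadratic: the true gradient at x is 2x
rng = np.random.default_rng(0)
f = lambda x: np.dot(x, x) + 1e-6 * rng.normal()
x0 = np.array([1.0, -2.0, 3.0])
g = central_fd_gradient(f, x0)
```

For a quadratic the central scheme has no discretisation bias, so the only error here comes from the evaluation noise divided by 2h.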

Read more
Statistics Theory

Minimax estimation of norms of a probability density: I. Lower bounds

The paper deals with the problem of nonparametric estimation of the L_p-norm, p∈(1,∞), of a probability density on R^d, d≥1, from independent observations. The unknown density is assumed to belong to a ball in the anisotropic Nikolskii space. We adopt the minimax approach and derive lower bounds on the minimax risk. In particular, we demonstrate that the accuracy of estimation procedures essentially depends on whether p is an integer or not. Moreover, we develop a general technique for the derivation of lower bounds on the minimax risk in problems of estimating nonlinear functionals. The proposed technique is applicable to a broad class of nonlinear functionals, and it is used to derive the lower bounds for L_p-norm estimation.

Read more
Statistics Theory

Minimax estimation of norms of a probability density: II. Rate-optimal estimation procedures

In this paper we develop rate-optimal estimation procedures for the problem of estimating the L_p-norm, p∈(0,∞), of a probability density from independent observations. The density is assumed to be defined on R^d, d≥1, and to belong to a ball in the anisotropic Nikolskii space. We adopt the minimax approach and construct rate-optimal estimators in the case of integer p≥2. We demonstrate that, depending on the parameters of the Nikolskii class and the norm index p, the risk asymptotics ranges from inconsistency to √n-consistent estimation. The results in this paper complement the minimax lower bounds derived in the companion paper (Part I).
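To give a concrete feel for the integer-p case, here is a standard U-statistic (leave-one-out kernel) estimator of the squared L_2-norm ∫f² in dimension d=1. The Gaussian kernel, bandwidth, and sample size are illustrative choices, not the paper's rate-optimal construction.

```python
import numpy as np

def l2_norm_squared_estimate(x, h=0.2):
    # U-statistic estimate of the integral of f^2 in d=1:
    #   (n(n-1)h)^{-1} * sum_{i != j} K((X_i - X_j)/h),
    # with a Gaussian kernel K; excluding the diagonal (i = j)
    # removes the O(1/(nh)) bias term of the naive plug-in.
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(k, 0.0)
    return k.sum() / (n * (n - 1) * h)

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
est = l2_norm_squared_estimate(x)
# for the standard normal density, the true value is 1/(2*sqrt(pi))
```

For the standard normal the target ∫φ² = 1/(2√π) ≈ 0.282, so the estimate can be checked directly.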

Read more
Statistics Theory

Minimax optimal estimator in the stochastic inverse problem for exponential Radon transform

In this article, we consider the problem of inverting the exponential Radon transform of a function in the presence of noise. We propose a kernel estimator of the true function, analogous to the one proposed by Korostelëv and Tsybakov in their article 'Optimal rates of convergence of estimators in a probabilistic setup of tomography problem', Problems of Information Transmission, 27:73-81, 1991. We then show that the proposed estimator converges to the true function at a minimax optimal rate.

Read more
Statistics Theory

Minimax rates without the fixed sample size assumption

We generalize the notion of minimax convergence rate. In contrast to the standard definition, we do not assume that the sample size is fixed in advance. Allowing for varying sample size results in time-robust minimax rates and estimators. These can be either strongly adversarial, based on the worst-case over all sample sizes, or weakly adversarial, based on the worst-case over all stopping times. We show that standard and time-robust rates usually differ by at most a logarithmic factor, and that for some (and we conjecture for all) exponential families, they differ by exactly an iterated logarithmic factor. In many situations, time-robust rates are arguably more natural to consider. For example, they allow us to simultaneously obtain strong model selection consistency and optimal estimation rates, thus avoiding the "AIC-BIC dilemma".

Read more
Statistics Theory

Minimax-robust forecasting of sequences with periodically stationary long memory multiple seasonal increments

We introduce stochastic sequences ζ(k) with periodically stationary generalized multiple increments of fractional order, which combine cyclostationary, multi-seasonal, integrated and fractionally integrated patterns. We solve the problem of optimal estimation of linear functionals constructed from unobserved values of the sequences ζ(k) based on their observations at points k<0. For sequences with known spectral densities, we obtain formulas for calculating the mean square errors and the spectral characteristics of the optimal estimates of the functionals. We also propose formulas that determine the least favourable spectral densities and the minimax (robust) spectral characteristics of the optimal linear estimates of the functionals in the case where the spectral densities of the sequences are not known exactly, while some sets of admissible spectral densities are given.

Read more
Statistics Theory

Mixing convergence of LSE for supercritical Gaussian AR(2) processes using random scaling

We prove mixing convergence of the least squares estimator of the autoregressive parameters for supercritical Gaussian autoregressive processes of order 2 having real characteristic roots with different absolute values. We use an appropriate random scaling such that the limit distribution is a two-dimensional normal distribution concentrated on a one-dimensional ray determined by the characteristic root having the larger absolute value.
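A minimal simulation of the setting (of the least squares estimator only, not of the random scaling or the mixing limit): a supercritical Gaussian AR(2) with real characteristic roots 1.3 and 1.1. The roots and sample size are illustrative choices.

```python
import numpy as np

# Supercritical AR(2) with real characteristic roots 1.3 and 1.1
# (different absolute values, both explosive):
#   X_k = a1 X_{k-1} + a2 X_{k-2} + eps_k,
# with a1 = 1.3 + 1.1 = 2.4 and a2 = -(1.3 * 1.1) = -1.43.
rng = np.random.default_rng(0)
a1, a2, n = 2.4, -1.43, 80
x = np.zeros(n)
eps = rng.normal(size=n)
for k in range(2, n):
    x[k] = a1 * x[k - 1] + a2 * x[k - 2] + eps[k]

# least squares: regress X_k on (X_{k-1}, X_{k-2})
Z = np.column_stack([x[1:-1], x[:-2]])
ahat, *_ = np.linalg.lstsq(Z, x[2:], rcond=None)
```

In the explosive regime the least squares estimator converges very fast, so even this short trajectory recovers (a1, a2) to high accuracy.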

Read more
Statistics Theory

Mixture of Conditional Gaussian Graphical Models for unlabelled heterogeneous populations in the presence of co-factors

Conditional correlation networks, within Gaussian Graphical Models (GGM), are widely used to describe the direct interactions between the components of a random vector. In the case of an unlabelled heterogeneous population, Expectation Maximisation (EM) algorithms for Mixtures of GGM have been proposed to estimate both each sub-population's graph and the class labels. However, we argue that, with most real data, class affiliation cannot be described by a Mixture of Gaussians, which mostly groups data points according to their geometrical proximity. In particular, there often exist external co-features whose values affect the features' average value, scattering data points belonging to the same sub-population across the feature space. Additionally, if the co-features' effect on the features is heterogeneous, then the estimation of this effect cannot be separated from the sub-population identification. In this article, we propose a Mixture of Conditional GGM (CGGM) that subtracts the heterogeneous effects of the co-features in order to regroup the data points into clusters corresponding to the sub-populations. We develop a penalised EM algorithm to estimate graph-sparse model parameters. We demonstrate on synthetic and real data how this method fulfils its goal and succeeds in identifying the sub-populations in settings where Mixtures of GGM are disrupted by the effect of the co-features.
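A numpy-only caricature of the two ingredients (subtracting a co-feature effect, then clustering the residuals) can look as follows. It is deliberately much simpler than the paper's penalised EM: it assumes a single homogeneous co-feature effect, spherical unit-variance components, equal mixture weights, and no graph-sparsity penalty; all of these simplifications are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 4
u = rng.normal(size=n)                     # co-feature
beta = np.array([2.0, -1.0, 0.5, 1.5])     # its (homogeneous) effect
labels = rng.integers(0, 2, size=n)        # hidden sub-populations
means = np.array([[-2.0, 0, 0, 0], [2.0, 0, 0, 0]])
x = means[labels] + np.outer(u, beta) + rng.normal(size=(n, d))

# Step 1: subtract the co-feature effect (per-feature OLS on u)
bhat = (u @ x) / (u @ u)
r = x - np.outer(u, bhat)

# Step 2: EM for a 2-component spherical unit-variance Gaussian mixture
# on the residuals; initialise the centres at the extremes of coordinate 0.
mu = r[[np.argmin(r[:, 0]), np.argmax(r[:, 0])]]
for _ in range(50):
    d2 = ((r[:, None, :] - mu[None, :, :]) ** 2).sum(-1)  # sq. distances
    w = np.exp(-0.5 * (d2 - d2.min(1, keepdims=True)))    # responsibilities
    w /= w.sum(1, keepdims=True)
    mu = (w.T @ r) / w.sum(0)[:, None]                    # M-step: centres
```

Without step 1, the spread induced by u along beta would dominate the geometry and a plain Gaussian mixture would cluster by u rather than by sub-population.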

Read more
