Featured Research

Statistics Theory

Modified estimator for the proportion of true null hypotheses under discrete setup with proven FDR control by the adaptive Benjamini-Hochberg procedure

Some crucial issues concerning a recently proposed estimator of the proportion of true null hypotheses (π₀) under a discrete setup are discussed. A new estimator of π₀ is introduced under the same setup. The estimator may be seen as a modification of a very popular estimator of π₀ originally proposed under the assumption of continuous test statistics. It is shown that the adaptive Benjamini-Hochberg procedure remains conservative when the new estimator of π₀ is plugged in.
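To make the plug-in idea concrete, here is a minimal Python sketch of an adaptive Benjamini-Hochberg step. It assumes a Storey-type estimator of π₀ for continuous p-values; the abstract does not name the estimator it modifies, and the discrete-setup estimator it proposes is not reproduced here. The function names and the tuning value `lam=0.5` are illustrative choices.

```python
import numpy as np

def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true nulls (assumed, not the paper's estimator)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    # Count p-values above lam, add 1 as a finite-sample correction, cap at 1.
    return min(1.0, (np.sum(p > lam) + 1.0) / (m * (1.0 - lam)))

def adaptive_bh(pvalues, alpha=0.05, lam=0.5):
    """Benjamini-Hochberg step-up procedure with the pi0 estimate plugged in."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    pi0_hat = storey_pi0(p, lam)
    order = np.argsort(p)
    # Plugging in pi0_hat replaces m by m * pi0_hat in the BH thresholds.
    thresholds = alpha * np.arange(1, m + 1) / (m * pi0_hat)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest sorted index passing its threshold
        reject[order[:k + 1]] = True
    return reject, pi0_hat

if __name__ == "__main__":
    pvals = [0.001, 0.002, 0.003, 0.6, 0.7, 0.8, 0.9, 0.95]
    print(adaptive_bh(pvals, alpha=0.05))
```

When the estimate is capped at 1, the procedure coincides with the ordinary Benjamini-Hochberg procedure; smaller estimates loosen the thresholds.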

Statistics Theory

Moments of the doubly truncated selection elliptical distributions with emphasis on the unified multivariate skew-t distribution

In this paper, we compute doubly truncated moments for the selection elliptical (SE) class of distributions, which includes multivariate asymmetric versions of well-known elliptical distributions such as the normal, Student's t, and slash distributions, among others. We address the moments of doubly truncated members of this family, establishing neat formulations for high-order moments as well as for the first two moments. We establish necessary and sufficient conditions for the existence of these truncated moments. Further, we propose optimized methods able to deal with extreme parameter settings, partitions with almost zero volume, or no truncation, and we validate them with a brief numerical study. Finally, we present some results useful in interval censoring models. All results have been particularized to the unified skew-t (SUT) distribution, a complex multivariate asymmetric heavy-tailed distribution which includes the extended skew-t (EST), extended skew-normal (ESN), skew-t (ST) and skew-normal (SN) distributions as particular and limiting cases.

Statistics Theory

Monotonicity preservation properties of kernel regression estimators

Three common classes of kernel regression estimators are considered: the Nadaraya-Watson (NW) estimator, the Priestley-Chao (PC) estimator, and the Gasser-Müller (GM) estimator. It is shown that (i) the GM estimator has a certain monotonicity preservation property for any kernel K, (ii) the NW estimator has this property if and only if the kernel K is log concave, and (iii) the PC estimator does not have this property for any kernel K. Other related properties of these regression estimators are discussed.
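As an illustration of case (ii), the following Python sketch implements the NW estimator with a Gaussian kernel, which is log concave, and checks numerically that a monotone response yields a monotone fit. The function name, the bandwidth, and the test data are assumptions made for this example; the abstract does not spell out the exact form of the property, so this is a plausibility check rather than a restatement of the result.

```python
import numpy as np

def nadaraya_watson(x_query, x, y, bandwidth):
    """Nadaraya-Watson estimate at x_query using a Gaussian (log-concave) kernel."""
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise scaled distances between query points (rows) and design points (columns).
    u = (x_query[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    # Locally weighted average of the responses.
    return (weights @ y) / weights.sum(axis=1)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 20)
    y = x ** 3                      # nondecreasing responses
    grid = np.linspace(0.0, 1.0, 50)
    fit = nadaraya_watson(grid, x, y, bandwidth=0.1)
    print(bool(np.all(np.diff(fit) >= -1e-12)))
```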

Statistics Theory

Motif-based tests for bipartite networks

Bipartite networks are a natural representation of the interactions between entities of two different types. The organization (or topology) of such networks gives insight into the systems they describe as a whole. Here, we rely on motifs, which provide a meso-scale description of the topology. Moreover, we consider the bipartite expected degree distribution (B-EDD) model, which accounts for both the density of the network and possible imbalances between the degrees of the nodes. Under the B-EDD model, we prove the asymptotic normality of the count of any given motif, considering sparsity conditions. We also provide closed-form expressions for the mean and the variance of this count. This avoids computationally prohibitive resampling procedures. Based on these results, we define a goodness-of-fit test for the B-EDD model and propose a family of tests for network comparisons. We assess the asymptotic normality of the test statistics and the power of the proposed tests on synthetic experiments and illustrate their use on ecological data sets.
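For readers unfamiliar with motif counts, a small Python sketch of the simplest case may help: counting two-edge star motifs directly from the incidence matrix of a bipartite network. The function name and the choice of motif are assumptions for illustration; the mean and variance formulas under the B-EDD model are not reproduced here.

```python
import numpy as np

def count_two_stars(incidence):
    """Count two-edge stars in a bipartite network.

    incidence[i, j] = 1 if top node i interacts with bottom node j.
    Returns (stars centered on a top node, stars centered on a bottom node).
    """
    a = np.asarray(incidence, dtype=np.int64)
    top_deg = a.sum(axis=1)
    bottom_deg = a.sum(axis=0)
    # A node of degree d is the center of d * (d - 1) / 2 two-edge stars.
    centered_on_top = int(np.sum(top_deg * (top_deg - 1) // 2))
    centered_on_bottom = int(np.sum(bottom_deg * (bottom_deg - 1) // 2))
    return centered_on_top, centered_on_bottom

if __name__ == "__main__":
    incidence = [[1, 1, 0],
                 [1, 0, 1]]
    print(count_two_stars(incidence))
```

Counts of this kind depend only on the degree sequences, which is why the sketch needs no enumeration of node subsets; larger motifs generally require more work.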

Statistics Theory

Multi-Gaussian random variables

A generalization of the classic Gaussian random variable to the family of Multi-Gaussian (MG) random variables, characterized by a shape parameter M > 0 in addition to the mean and the standard deviation, is introduced. The probability density function (PDF) of the MG family members is an alternating series of Gaussian functions with suitably chosen heights and widths. In particular, for integer values of M the series has a finite number of terms and leads to flattened profiles, while reducing to the classic Gaussian density for M = 1. For non-integer, positive values of M, a convergent infinite series of Gaussian functions is obtained that can be truncated in practical problems. While for all M > 1 the MG PDF has flattened profiles, for 0 < M < 1 it leads to cusped profiles. Moreover, the multivariate extension of the MG random variable is obtained and the Log-Multi-Gaussian (LMG) random variable is introduced.

Statistics Theory

Multi-dimensional parameter estimation of heavy-tailed moving averages

In this paper we present a parametric estimation method for certain multi-parameter heavy-tailed Lévy-driven moving averages. The theory relies on recent multivariate central limit theorems obtained in [3] via Malliavin calculus on Poisson spaces. Our minimal contrast approach is related to the papers [14, 15], which propose to use the marginal empirical characteristic function to estimate the one-dimensional parameter of the kernel function and the stability index of the driving Lévy motion. We extend their work to a multi-parametric framework that in particular includes the important examples of the linear fractional stable motion, the stable Ornstein-Uhlenbeck process, certain CARMA(2, 1) models, and Ornstein-Uhlenbeck processes with a periodic component, among other models. We present both the consistency and the associated central limit theorem of the minimal contrast estimator. Furthermore, we present a numerical analysis of the finite sample performance of our method.

Statistics Theory

Multiple Testing in Nonparametric Hidden Markov Models: An Empirical Bayes Approach

Given a nonparametric Hidden Markov Model (HMM) with two states, the question of constructing efficient multiple testing procedures is considered, treating one of the states as an unknown null hypothesis. A procedure is introduced, based on nonparametric empirical Bayes ideas, that controls the False Discovery Rate (FDR) at a user-specified level. Guarantees on power are also provided, in the form of a control of the true positive rate. One of the key steps in the construction requires supremum-norm convergence of preliminary estimators of the emission densities of the HMM. We establish the existence of such estimators, with convergence at the optimal minimax rate, for the case of an HMM with J ≥ 2 states, which is of independent interest.

Statistics Theory

Multiplier U-processes: sharp bounds and applications

The theory for multiplier empirical processes has been one of the central topics in the development of the classical theory of empirical processes, due to its wide applicability to various statistical problems. In this paper, we develop theory and tools for studying multiplier U-processes, a natural higher-order generalization of the multiplier empirical processes. To this end, we develop a multiplier inequality that quantifies the moduli of continuity of the multiplier U-process in terms of those of the (decoupled) symmetrized U-process. The new inequality finds a variety of applications including (i) multiplier and bootstrap central limit theorems for U-processes, (ii) general theory for bootstrap M-estimators based on U-statistics, and (iii) theory for M-estimation under general complex sampling designs, again based on U-statistics.

Statistics Theory

Multivariate sparse clustering for extremes

Identifying directions where extreme events occur is a major challenge in multivariate extreme value analysis. In this paper, we use the concept of sparse regular variation introduced by Meyer and Wintenberger to infer the tail dependence of a random vector X. This approach relies on the Euclidean projection onto the simplex, which exhibits the sparsity structure of the tail of X better than the standard methods. Our procedure, based on a rigorous methodology, aims at capturing clusters of extremal coordinates of X. It also includes the identification of a threshold above which the values taken by X are considered extreme. We provide an efficient and scalable algorithm called MUSCLE and apply it in numerical experiments to highlight the relevance of our findings. Finally, we illustrate our approach with wind speed data and financial return data.
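The Euclidean projection onto the simplex mentioned above can be computed with a standard sort-based routine. The Python sketch below shows that step only, under the assumption that a generic implementation is acceptable; it is not the MUSCLE algorithm, and the function name and the `z` parameter (the simplex radius) are illustrative.

```python
import numpy as np

def project_onto_simplex(v, z=1.0):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = z} via sorting."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                 # coordinates in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, v.size + 1)
    # Largest k such that the k-th sorted coordinate stays positive after shifting.
    cond = u - (css - z) / ks > 0
    rho = ks[cond][-1]
    theta = (css[rho - 1] - z) / rho     # common shift applied to all coordinates
    return np.maximum(v - theta, 0.0)

if __name__ == "__main__":
    # A vector dominated by one coordinate projects onto a sparse point.
    print(project_onto_simplex([2.0, 0.1, 0.0]))
```

Small coordinates are set exactly to zero by the projection, which is the feature that makes the sparsity structure visible.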

Statistics Theory

Near-Optimal Confidence Sequences for Bounded Random Variables

Many inference problems, such as sequential decision problems like A/B testing and adaptive sampling schemes like bandit selection, are online in nature. The fundamental problem for online inference is to provide a sequence of confidence intervals that are valid uniformly over sample sizes growing to infinity. To address this question, we provide a near-optimal confidence sequence for bounded random variables by utilizing Bentkus' concentration results. We show that it improves on the existing approaches that use the Cramér-Chernoff technique, such as the Hoeffding, Bernstein, and Bennett inequalities. The resulting confidence sequence is confirmed to be favorable in both synthetic coverage problems and an application to adaptive stopping algorithms.
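To show what a confidence sequence looks like in practice, the Python sketch below builds a simple Hoeffding-based baseline for variables in [0, 1]. This is one of the Cramér-Chernoff style approaches the abstract compares against, not the Bentkus-based construction. It assumes independent observations and spends the error budget as alpha / (t (t + 1)) across sample sizes, a union-bound choice made here for illustration.

```python
import numpy as np

def hoeffding_confidence_sequence(samples, alpha=0.05):
    """Time-uniform confidence bounds for the mean of [0, 1]-valued samples."""
    x = np.asarray(samples, dtype=float)
    t = np.arange(1, x.size + 1)
    means = np.cumsum(x) / t
    # Error budget per sample size; these values sum to at most alpha over all t.
    alpha_t = alpha / (t * (t + 1.0))
    radius = np.sqrt(np.log(2.0 / alpha_t) / (2.0 * t))
    # Intersect intervals over time: all of them hold simultaneously.
    lower = np.maximum.accumulate(np.maximum(means - radius, 0.0))
    upper = np.minimum.accumulate(np.minimum(means + radius, 1.0))
    return lower, upper

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.binomial(1, 0.3, size=2000)
    lower, upper = hoeffding_confidence_sequence(data, alpha=0.05)
    print(lower[-1], upper[-1])
```

Because the bounds hold at every sample size at once, an analyst may stop collecting data at any data-dependent time without invalidating coverage.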
