Nicolas Verzelen
Institut national de la recherche agronomique
Publication
Featured research published by Nicolas Verzelen.
Statistical Science | 2012
Christophe Giraud; Sylvie Huet; Nicolas Verzelen
We review recent results for high-dimensional sparse linear regression in the practical case of unknown variance. Different sparsity settings are covered, including coordinate-sparsity, group-sparsity and variation-sparsity. The emphasis is put on nonasymptotic analyses and feasible procedures. In addition, a small numerical study compares the practical performance of three schemes for tuning the lasso estimator and some references are collected for some more general models, including multivariate regression and nonparametric regression.
Annals of Statistics | 2017
Nicolas Verzelen; Ery Arias-Castro
We consider Gaussian mixture models in high dimensions and concentrate on the twin tasks of detection and feature selection. Under sparsity assumptions on the difference in means, we derive information bounds and establish the performance of various procedures, including the top sparse eigenvalue of the sample covariance matrix and other projection tests based on moments, such as the skewness and kurtosis tests of Malkovich and Afifi (1973), and other variants which we were better able to control under the null.
Statistical Applications in Genetics and Molecular Biology | 2012
Christophe Giraud; Sylvie Huet; Nicolas Verzelen
Applications to the inference of biological networks have raised a strong interest in the problem of graph estimation in high-dimensional Gaussian graphical models. To handle this problem, we propose a two-stage procedure which first builds a family of candidate graphs from the data, and then selects one graph among this family according to a dedicated criterion. This estimation procedure is shown to be consistent in a high-dimensional setting, and its risk is controlled by a non-asymptotic oracle-like inequality. The procedure is tested on a real data set concerning gene expression data, and its performance is assessed on the basis of a large numerical study. The procedure is implemented in the R package GGMselect, available on CRAN.
International Symposium on Information Theory | 2017
Jess Banks; Cristopher Moore; Roman Vershynin; Nicolas Verzelen; Jiaming Xu
We study the problem of detecting a structured, low-rank signal matrix corrupted with additive Gaussian noise. This includes clustering in a Gaussian mixture model, sparse PCA, and submatrix localization. Each of these problems is conjectured to exhibit a sharp information-theoretic threshold, below which the signal is too weak for any algorithm to detect. We derive upper and lower bounds on these thresholds by applying the first and second moment methods to the likelihood ratio between these “planted models” and null models where the signal matrix is zero. For sparse PCA and submatrix localization, we determine this threshold exactly in the limit where the number of blocks is large or the signal matrix is very sparse; for the clustering problem, our bounds differ by a factor √2 when the number of clusters is large. Moreover, our upper bounds show that for each of these problems there is a significant regime where reliable detection is information-theoretically possible but where known algorithms such as PCA fail completely, since the spectrum of the observed matrix is uninformative. This regime is analogous to the conjectured ‘hard but detectable’ regime for community detection in sparse graphs.
Probability Theory and Related Fields | 2018
Olga Klopp; Nicolas Verzelen
Consider the twin problems of estimating the connection probability matrix of an inhomogeneous random graph and the graphon of a W-random graph. We establish the minimax estimation rates with respect to the cut metric for classes of block constant matrices and step function graphons. Surprisingly, our results imply that, from the minimax point of view, the raw data, that is, the adjacency matrix of the observed graph, is already optimal and more involved procedures cannot improve the convergence rates for this metric. This phenomenon contrasts with optimal rates of convergence with respect to other classical distances for graphons such as the
Annals of Statistics | 2014
Ery Arias-Castro; Nicolas Verzelen
Annals of Statistics | 2017
Olga Klopp; Alexandre B. Tsybakov; Nicolas Verzelen
arXiv: Statistics Theory | 2013
Ery Arias-Castro; Nicolas Verzelen
Annals of Applied Probability | 2015
Nicolas Verzelen; Ery Arias-Castro
Annals of Statistics | 2018
Olivier Collier; Laëtitia Comminges; Alexandre B. Tsybakov; Nicolas Verzelen