Gérard Biau
Pierre-and-Marie-Curie University
Publications
Featured research published by Gérard Biau.
IEEE Transactions on Information Theory | 2005
Gérard Biau; Florentina Bunea; Marten H. Wegkamp
Let X be a random variable taking values in a separable Hilbert space 𝒳, with label Y ∈ {0,1}. We establish universal weak consistency of a nearest neighbor-type classifier based on n independent copies (X_i, Y_i) of the pair (X, Y), extending the classical result of Stone to infinite-dimensional Hilbert spaces. Under a mild condition on the distribution of X, we also prove strong consistency. We reduce the infinite dimension of 𝒳 by considering only the first d coefficients of a Fourier series expansion of each X_i, and then we perform k-nearest neighbor classification in ℝ^d. Both the dimension and the number of neighbors are automatically selected from the data using a simple data-splitting device. An application of this technique to a signal discrimination problem involving speech recordings is presented.
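The dimension-reduction-plus-k-NN recipe described above is easy to prototype. Below is a minimal Python sketch, not the authors' code: it maps each discretized signal to its first d Fourier coefficients, runs k-nearest neighbor classification in ℝ^d, and selects (d, k) on a held-out half of the sample. The grids, the use of coefficient magnitudes, and the helper names are illustrative assumptions.

import numpy as np
from numpy.fft import rfft
from sklearn.neighbors import KNeighborsClassifier

def fourier_features(signals, d):
    # first d Fourier coefficient magnitudes of each discretized signal (one signal per row)
    return np.abs(rfft(signals, axis=1))[:, :d]

def fit_by_splitting(signals, labels, d_grid=(5, 10, 20), k_grid=(1, 3, 5, 7)):
    # select (d, k) on the second half of the sample, then refit on the full sample
    n = len(signals)
    train, valid = np.arange(n // 2), np.arange(n // 2, n)
    best, best_err = (d_grid[0], k_grid[0]), np.inf
    for d in d_grid:
        X = fourier_features(signals, d)
        for k in k_grid:
            clf = KNeighborsClassifier(n_neighbors=k).fit(X[train], labels[train])
            err = np.mean(clf.predict(X[valid]) != labels[valid])
            if err < best_err:
                best, best_err = (d, k), err
    d, k = best
    return d, KNeighborsClassifier(n_neighbors=k).fit(fourier_features(signals, d), labels)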
IEEE Transactions on Information Theory | 2008
Gérard Biau; Luc Devroye; Gábor Lugosi
Based on randomly drawn vectors in a separable Hilbert space, one may construct a k-means clustering scheme by minimizing an empirical squared error. We investigate the risk of such a clustering scheme, defined as the expected squared distance of a random vector X from the set of cluster centers. Our main result states that, for an almost surely bounded X, the expected excess clustering risk is O(√(1/n)). Since clustering in high (or even infinite)-dimensional spaces may lead to severe computational problems, we examine the properties of a dimension reduction strategy for clustering based on Johnson-Lindenstrauss-type random projections. Our results reflect a tradeoff between accuracy and computational complexity when one uses k-means clustering after random projection of the data to a low-dimensional space. We argue that random projections work better than other simplistic dimension reduction schemes.
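A minimal sketch of the projection-then-cluster pipeline discussed above, using scikit-learn's GaussianRandomProjection and KMeans as stand-ins; the toy data, the target dimension and the number of clusters are arbitrary placeholders rather than choices from the paper.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 500))              # toy high-dimensional sample

proj = GaussianRandomProjection(n_components=20, random_state=0)
X_low = proj.fit_transform(X)                 # Johnson-Lindenstrauss-type random projection

km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X_low)
risk = km.inertia_ / len(X_low)               # empirical clustering risk in the projected space
print(f"mean squared distance to the nearest center: {risk:.3f}")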
Annals of Statistics | 2015
Erwan Scornet; Gérard Biau; Jean-Philippe Vert
Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5–32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical performance, little is known about the mathematical properties of the procedure. This disparity between theory and practice originates in the difficulty of simultaneously analyzing both the randomization process and the highly data-dependent tree structure. In the present paper, we take a step forward in forest exploration by proving a consistency result for Breiman's [Mach. Learn. 45 (2001) 5–32] original algorithm in the context of additive regression models. Our analysis also sheds an interesting light on how random forests can nicely adapt to sparsity.

1. Introduction. Random forests are an ensemble learning method for classification and regression that constructs a number of randomized decision trees during the training phase and predicts by averaging the results. Since its publication in the seminal paper of Breiman (2001), the procedure has become a major data analysis tool that performs well in practice in comparison with many standard methods. What has greatly contributed to the popularity of forests is the fact that they can be applied to a wide range of prediction problems and have few parameters to tune. Aside from being simple to use, the method is generally recognized for its accuracy and its ability to deal with small sample sizes, high-dimensional feature spaces and complex data structures. The random forest methodology has been successfully applied to many practical problems, including air quality prediction (winning code of the EMC data science global hackathon in 2012, see http://www.kaggle.com/c/dsg-hackathon), chemoinformatics [Svetnik et al. (2003)], ecology [Prasad, Iverson and Liaw (2006), Cutler et al. (2007)], 3D
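As a purely illustrative companion to the setting above, the following sketch fits scikit-learn's RandomForestRegressor, used here as a stand-in for Breiman's original algorithm, to a sparse additive regression model; the model, sample size and forest size are arbitrary choices, not those of the paper.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 2000
X = rng.uniform(size=(n, 5))
# additive model with two active coordinates (sparsity): Y = sin(2*pi*X1) + X2^2 + noise
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=n)

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[:1500], y[:1500])
mse = np.mean((forest.predict(X[1500:]) - y[1500:]) ** 2)
print(f"hold-out mean squared error: {mse:.4f}")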
Statistical Inference for Stochastic Processes | 2004
Gérard Biau; Benoît Cadre
Let (ℕ*)^N be the integer lattice points in the N-dimensional Euclidean space. We define a nonparametric spatial predictor for the values of a random field indexed by (ℕ*)^N using a kernel method. We first examine the general problem of regression estimation for random fields. Then we show the uniform consistency on compact sets of our spatial predictor, as well as its asymptotic normality.
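The kernel regression step at the heart of the predictor can be sketched as follows; this is a plain Nadaraya-Watson estimator and deliberately ignores the lattice indexing and dependence structure of the random field, so it is only a rough illustration of the kernel method mentioned above.

import numpy as np

def kernel_regression(x0, X, Y, h):
    # Nadaraya-Watson estimate at x0 from observed pairs (X_i, Y_i), Gaussian kernel, bandwidth h
    w = np.exp(-0.5 * (np.linalg.norm(X - x0, axis=1) / h) ** 2)
    return float(np.sum(w * Y) / np.sum(w)) if w.sum() > 0 else float(np.mean(Y))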
Electronic Journal of Statistics | 2011
Gérard Biau; Frédéric Chazal; David Cohen-Steiner; Luc Devroye; Carlos C. Rodriguez
Motivated by a broad range of potential applications in topological and geometric inference, we introduce a weighted version of the k-nearest neighbor density estimate. Various pointwise consistency results of this estimate are established. We present a general central limit theorem under the lightest possible conditions. In addition, a strong approximation result is obtained and the choice of the optimal set of weights is discussed. In particular, the classical k-nearest neighbor estimate is not optimal in a sense described in the manuscript. The proposed method has been implemented to recover level sets in both simulated and real-life data.
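For reference, here is a sketch of the classical (unweighted) k-nearest neighbor density estimate that the paper generalizes, f_n(x) = k / (n v_d R_k(x)^d), with R_k(x) the distance from x to its k-th nearest sample point and v_d the volume of the unit ball in ℝ^d; the weighted variant and the optimal weights discussed in the paper are not reproduced here.

import numpy as np
from math import gamma, pi
from sklearn.neighbors import NearestNeighbors

def knn_density(points, sample, k):
    # classical k-NN density estimate evaluated at each row of `points`
    n, d = sample.shape
    v_d = pi ** (d / 2) / gamma(d / 2 + 1)                 # volume of the unit ball in R^d
    nn = NearestNeighbors(n_neighbors=k).fit(sample)
    r_k = nn.kneighbors(np.atleast_2d(points))[0][:, -1]   # distance to the k-th neighbor
    return k / (n * v_d * r_k ** d)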
IEEE Transactions on Information Theory | 2005
Gérard Biau; László Györfi
We present two simple and explicit procedures for testing homogeneity of two independent multivariate samples of size n. The nonparametric tests are based on the statistic T_n, which is the L_1 distance between the two empirical distributions restricted to a finite partition. Both tests reject the null hypothesis of homogeneity if T_n becomes large, i.e., if T_n exceeds a threshold. We first discuss Chernoff-type large deviation properties of T_n. This results in a distribution-free strongly consistent test of homogeneity. Then the asymptotic null distribution of the test statistic is obtained, leading to an asymptotically α-level test procedure.
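The statistic itself is straightforward to compute; the sketch below forms a finite product partition from equal-width bins and returns the L_1 distance between the two empirical measures on that partition. The partition choice is an arbitrary placeholder, and the rejection thresholds from the large-deviation and asymptotic analyses are not reproduced.

import numpy as np

def l1_statistic(sample1, sample2, bins=8):
    # L1 distance between the empirical distributions of the two samples on a product partition
    lo = np.minimum(sample1.min(axis=0), sample2.min(axis=0))
    hi = np.maximum(sample1.max(axis=0), sample2.max(axis=0))
    edges = [np.linspace(lo[j], hi[j], bins + 1) for j in range(sample1.shape[1])]
    p1, _ = np.histogramdd(sample1, bins=edges)
    p2, _ = np.histogramdd(sample2, bins=edges)
    return float(np.abs(p1 / len(sample1) - p2 / len(sample2)).sum())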
Canadian Journal of Statistics-revue Canadienne De Statistique | 2003
Christophe Abraham; Gérard Biau; Benoît Cadre
The authors consider an estimate of the mode of a multivariate probability density using a kernel estimate drawn from a random sample. The estimate is defined by maximizing the kernel estimate over the set of sample values. The authors show that this estimate is strongly consistent and give an almost sure rate of convergence. This rate depends on the sharpness of the density near the true mode, which is measured by a peak index.
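The estimate has a one-line description, maximize a kernel density estimate over the sample values, which the following sketch mirrors using SciPy's gaussian_kde; the bandwidth rule here is SciPy's default, not the one analyzed in the paper.

import numpy as np
from scipy.stats import gaussian_kde

def sample_mode(sample):
    # kernel density estimate built from the sample (gaussian_kde expects shape (dim, n))
    kde = gaussian_kde(sample.T)
    values = kde(sample.T)                # evaluate the estimate at every sample point
    return sample[np.argmax(values)]      # sample value where the estimate is largest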
Journal of Nonparametric Statistics | 2010
Gérard Biau; Kevin Bleakley; László Györfi; György Ottucsák
Time series prediction covers a vast field of everyday statistical applications in medical, environmental and economic domains. In this paper, we develop nonparametric prediction strategies based on the combination of a set of ‘experts’ and show the universal consistency of these strategies under a minimum of conditions. We perform an in-depth analysis of real-world data sets and show that these nonparametric strategies are more flexible, faster and generally outperform ARMA methods in terms of normalised cumulative prediction error.
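The combination device can be sketched independently of the particular experts: weight each expert by an exponential function of its cumulative loss and average their forecasts. The abstract experts, the learning rate and the squared loss below are illustrative assumptions, not the exact strategies of the paper.

import numpy as np

def aggregate(past, experts, cum_losses, eta=1.0):
    # exponentially weighted average of the experts' forecasts given the observed past
    preds = np.array([expert(past) for expert in experts])
    w = np.exp(-eta * np.asarray(cum_losses))
    w /= w.sum()
    return float(w @ preds), preds

def update_losses(cum_losses, preds, outcome):
    # accumulate squared prediction loss for each expert once the outcome is revealed
    return np.asarray(cum_losses) + (preds - outcome) ** 2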
Archive | 2015
Gérard Biau; Luc Devroye
This text presents a wide-ranging and rigorous overview of nearest neighbor methods, one of the most important paradigms in machine learning. Now in one self-contained volume, this book systematically covers key statistical, probabilistic, combinatorial and geometric ideas for understanding, analyzing and developing nearest neighbor methods. Gérard Biau is a professor at Université Pierre et Marie Curie (Paris). Luc Devroye is a professor at the School of Computer Science at McGill University (Montreal).
IEEE Transactions on Information Theory | 2012
Gérard Biau; Aurélie Fischer
Principal curves are nonlinear generalizations of the notion of first principal component. Roughly, a principal curve is a parameterized curve in ℝ^d which passes through the “middle” of a data cloud drawn from some unknown probability distribution. Depending on the definition, a principal curve relies on some unknown parameters (number of segments, length, turn, etc.) which have to be properly chosen to recover the shape of the data without interpolating. In this paper, we consider the principal curve problem from an empirical risk minimization perspective and address the parameter selection issue using the point of view of model selection via penalization. We offer oracle inequalities and implement the proposed approach to recover the hidden structures in both simulated and real-life data.
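A heavily simplified sketch of the model-selection viewpoint: for each candidate number of segments k, build a polygonal curve (here, crudely, by ordering k-means centers along the first principal component, which is not the fitting procedure of the paper), compute the empirical risk as the mean squared distance of the data to the curve, and select k by penalizing complexity. The penalty constant is an arbitrary placeholder rather than the calibrated penalty behind the oracle inequalities.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def squared_dist_to_segment(p, a, b):
    # squared distance from point p to the segment [a, b]
    ab, ap = b - a, p - a
    t = np.clip(ap @ ab / (ab @ ab + 1e-12), 0.0, 1.0)
    return float(np.sum((ap - t * ab) ** 2))

def polyline_risk(X, vertices):
    # empirical risk: mean squared distance from each data point to the polygonal line
    segments = list(zip(vertices[:-1], vertices[1:]))
    return np.mean([min(squared_dist_to_segment(p, a, b) for a, b in segments) for p in X])

def select_curve(X, k_grid=(2, 3, 5, 8, 12), c=0.05):
    # penalized selection of the number of segments k
    best = None
    for k in k_grid:
        centers = KMeans(n_clusters=k + 1, n_init=10, random_state=0).fit(X).cluster_centers_
        order = PCA(n_components=1).fit_transform(centers)[:, 0]
        centers = centers[np.argsort(order)]                 # crude ordering into a polyline
        crit = polyline_risk(X, centers) + c * k / len(X)    # empirical risk + toy penalty
        if best is None or crit < best[0]:
            best = (crit, k, centers)
    return best[1], best[2]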