Stéphane Gaïffas
École Polytechnique
Publications
Featured research published by Stéphane Gaïffas.
Electronic Journal of Statistics | 2012
Stéphane Gaïffas; Agathe Guilloux
Abstract: We consider a general high-dimensional additive hazards model in a non-asymptotic setting, including regression for censored data. In this context, we consider a Lasso estimator with a fully data-driven ℓ1 penalization, tuned for the estimation problem at hand. We prove sharp oracle inequalities for this estimator. Our analysis involves a new "data-driven" Bernstein inequality, of independent interest, in which the predictable variation is replaced by the optional variation.
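The proximal step behind any weighted ℓ1 penalization of this kind is a soft-thresholding operator with per-coordinate, data-driven thresholds. A minimal hypothetical sketch of that operator (not the paper's additive hazards estimator itself):

```python
# Soft-thresholding: the proximal operator of a weighted l1 penalty.
# Per-coordinate weights stand in for the data-driven tuning discussed
# in the abstract; this is a generic illustration, not the estimator
# from the paper.

def soft_threshold(x, threshold):
    """Shrink x toward zero by `threshold`, setting small values to 0."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0

def weighted_lasso_prox(coeffs, weights, lam):
    """Apply per-coordinate soft-thresholding with weights `weights`."""
    return [soft_threshold(c, lam * w) for c, w in zip(coeffs, weights)]

# Example: coefficients below their threshold are zeroed out (sparsity).
print(weighted_lasso_prox([2.0, -0.3, 1.5], [1.0, 1.0, 2.0], 0.5))
# -> [1.5, 0.0, 0.5]
```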
Annales de l'Institut Henri Poincaré, Probabilités et Statistiques | 2011
Fabienne Comte; Stéphane Gaïffas; Agathe Guilloux
In this work we propose an original estimator of the conditional intensity of a marker-dependent counting process, that is, a counting process with covariates. We use model selection methods and provide a non-asymptotic bound for the risk of our estimator on a compact set. We show that our estimator automatically achieves the convergence rate over a functional class with a given (unknown) anisotropic regularity. We then prove a lower bound establishing that this rate is optimal. Finally, we give a short illustration of how the estimator works in the context of conditional hazard estimation.
Electronic Journal of Statistics | 2007
Stéphane Gaïffas; Guillaume Lecué
We want to recover the regression function in the single-index model. Using an aggregation algorithm with local polynomial estimators, we answer in particular the second part of Question 2 from Stone (1982) on the optimal convergence rate. The procedure constructed here has strong adaptation properties: it adapts both to the smoothness of the link function and to the unknown index. Moreover, the procedure adapts locally to the distribution of the design. We propose new upper bounds for the local polynomial estimator (results of independent interest) that allow a fairly general design. The behavior of this algorithm is studied through numerical simulations. In particular, we show empirically that it improves strongly over empirical risk minimization.
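A toy sketch of the local polynomial building block used by such aggregation procedures, in its simplest degree-0 form (a Gaussian-kernel local average, i.e. the Nadaraya-Watson estimator); the paper aggregates higher-degree local fits over candidate indices, which is not reproduced here:

```python
# A minimal local polynomial estimator of degree 0 (Nadaraya-Watson):
# kernel-weighted local averaging of the responses near the query point.
# A toy building block only, not the paper's aggregation procedure.
import math

def local_constant_fit(x0, xs, ys, bandwidth):
    """Estimate the regression function at x0 by a Gaussian-kernel
    weighted average of nearby responses."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * y for w, y in zip(weights, ys)) / total

# Example: noiseless linear data on a symmetric grid around x0 = 0.5,
# so the local average recovers the true value 2 * 0.5 = 1.0 exactly.
xs = [i / 10 for i in range(11)]
ys = [2 * x for x in xs]
print(local_constant_fit(0.5, xs, ys, 0.05))
```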
IEEE Transactions on Information Theory | 2011
Stéphane Gaïffas; Guillaume Lecué
We observe (X_i, Y_i), i = 1, …, n, where the Y_i are real-valued outputs and the X_i are m × T matrices. Given a new entry X, we want to predict the associated output Y. We focus on the high-dimensional setting, where mT ≫ n. This includes the matrix completion problem with noise, as well as other problems. We consider linear prediction procedures based on different penalizations involving a mixture of several norms: the nuclear norm, the Frobenius norm, and the ℓ1-norm. For these procedures, we prove sharp oracle inequalities from a statistical learning theory point of view. A surprising fact in our results is that the rates of convergence do not depend on m and T directly. The analysis is conducted without the usual incoherence condition on the unknown matrix or restricted isometry condition on the sampling operator. Moreover, our results are the first to give for this problem an analysis of penalization (such as nuclear norm penalization) as a regularization algorithm: our oracle inequalities prove that these procedures achieve a prediction accuracy close to that of the deterministic oracle, provided that the regularization parameters are well chosen.
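Nuclear norm penalization acts by shrinking singular values. A minimal sketch of that mechanism: on a diagonal matrix with nonnegative entries, the singular values are exactly the diagonal entries, so the proximal operator of the nuclear norm reduces to soft-thresholding the diagonal (a general matrix gets the same shrinkage applied to its SVD; this is a generic illustration, not the paper's procedure):

```python
# Proximal operator of the nuclear norm, illustrated on a diagonal
# matrix with nonnegative entries: its singular values are exactly the
# diagonal entries, so the prox soft-thresholds them. For a general
# matrix, the same shrinkage is applied to its singular values.

def nuclear_prox_diagonal(diag_entries, lam):
    """Shrink each (nonnegative) singular value by lam, flooring at 0."""
    return [max(s - lam, 0.0) for s in diag_entries]

# Example: small singular values are removed entirely, so the penalized
# solution has lower rank than the input.
print(nuclear_prox_diagonal([3.0, 1.2, 0.4], 0.5))
```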
Journal of Physics A | 2016
Emmanuel Bacry; Stéphane Gaïffas; Iacopo Mastromatteo; Jean-François Muzy
We propose a fast and efficient estimation method that is able to accurately recover the parameters of a d-dimensional Hawkes point process from a set of observations. We exploit a mean-field approximation that is valid when the fluctuations of the stochastic intensity are small. We show that this is notably the case when interactions are sufficiently weak, when the dimension of the system is high, or when the fluctuations are self-averaging due to the large number of past events they involve. In such a regime, the estimation of a Hawkes process can be mapped onto a least-squares problem for which we provide an analytic solution. Though this estimator is biased, we show that its precision can be comparable to that of the maximum likelihood estimator, while its computation is considerably faster. We give theoretical control on the accuracy of our new approach and illustrate its efficiency on synthetic datasets, in order to assess the statistical estimation error of the parameters.
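The computational point is that, under the mean-field approximation, estimation reduces to a least-squares problem with a closed-form solution. A stdlib sketch of that final step for a two-parameter design, via the normal equations (the design matrix, built from the event history in the paper, is assumed given; this is not the full pipeline):

```python
# Closed-form least-squares solution via the normal equations,
# illustrating the "maps onto a least-squares problem" step. The design
# matrix A (built from the event history in the paper) is assumed given.

def lstsq_2d(A, b):
    """Solve min_theta ||A theta - b||^2 for a two-column design A."""
    # Normal equations: (A^T A) theta = A^T b, solved by 2x2 inversion.
    s00 = sum(row[0] * row[0] for row in A)
    s01 = sum(row[0] * row[1] for row in A)
    s11 = sum(row[1] * row[1] for row in A)
    t0 = sum(row[0] * y for row, y in zip(A, b))
    t1 = sum(row[1] * y for row, y in zip(A, b))
    det = s00 * s11 - s01 * s01
    return [(s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det]

# Example: data generated exactly by theta = [2, -1] is recovered.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
b = [2.0 * x0 - 1.0 * x1 for x0, x1 in A]
print(lstsq_2d(A, b))
```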
Statistical Methods in Medical Research | 2018
Simon Bussy; Agathe Guilloux; Stéphane Gaïffas; Anne-Sophie Jannot
We introduce a supervised learning mixture model for censored durations (C-mix) to simultaneously detect subgroups of patients with different prognoses and order them by their risk. Our method is applicable in a high-dimensional setting, i.e. with a large number of biomedical covariates. Indeed, we penalize the negative log-likelihood with the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the covariates relevant for survival prediction. Inference is achieved using an efficient quasi-Newton Expectation-Maximization algorithm, for which we provide convergence properties. The statistical performance of the method is examined in an extensive Monte Carlo simulation study and illustrated on three publicly available genetic cancer datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art survival models in this context, namely the CURE and Cox proportional hazards models penalized by the Elastic-Net, in terms of C-index, AUC(t), and survival prediction. We thus propose a powerful tool for personalized medicine in oncology.
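A hypothetical miniature of the EM machinery behind such mixture models, for an uncensored two-component exponential mixture of durations (the C-mix model additionally handles censoring, covariates, and Elastic-Net penalization, none of which appear here):

```python
# EM for a two-component exponential mixture of durations, no censoring:
#   f(t) = pi * lam1 * exp(-lam1 * t) + (1 - pi) * lam2 * exp(-lam2 * t)
# A toy stand-in for the mixture machinery; the paper's C-mix model
# additionally handles censoring and Elastic-Net penalized covariates.
import math

def em_exp_mixture(durations, n_iter=200):
    """Fit (pi, lam1, lam2) by Expectation-Maximization."""
    pi, lam1, lam2 = 0.5, 1.0, 0.1   # crude initialization
    for _ in range(n_iter):
        # E-step: posterior probability of component 1 for each duration.
        resp = []
        for t in durations:
            p1 = pi * lam1 * math.exp(-lam1 * t)
            p2 = (1 - pi) * lam2 * math.exp(-lam2 * t)
            resp.append(p1 / (p1 + p2))
        # M-step: update the mixture weight and the two rates.
        r1 = sum(resp)
        pi = r1 / len(durations)
        lam1 = r1 / sum(r * t for r, t in zip(resp, durations))
        lam2 = (len(durations) - r1) / sum(
            (1 - r) * t for r, t in zip(resp, durations))
    return pi, lam1, lam2

# Example: half short durations (high risk), half long (low risk);
# EM separates the two subgroups and recovers one high and one low rate.
durations = [0.1, 0.2, 0.15, 0.1, 8.0, 10.0, 12.0, 9.0]
pi, lam1, lam2 = em_exp_mixture(durations)
print(round(pi, 2), lam1 > lam2)
```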
IEEE Transactions on Information Theory | 2015
Mokhtar Z. Alaya; Stéphane Gaïffas; Agathe Guilloux
We consider the problem of learning the inhomogeneous intensity of a counting process, under a sparse segmentation assumption. We introduce a weighted total-variation penalization, using data-driven weights that correctly scale the penalization along the observation interval. We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequalities with fast rates of convergence, and consistency for change-point detection. This provides the first theoretical guarantees for segmentation with a convex proxy beyond the standard i.i.d. signal-plus-white-noise setting. We introduce a fast algorithm to solve this convex problem. Numerical experiments illustrate our approach on simulated data and on a high-frequency genomics dataset.
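To convey the role of weighted penalties on successive differences, here is a deliberately crude change-point detector that thresholds jumps in a piecewise-constant signal with per-position weights; the paper instead solves a convex weighted total-variation penalized problem with data-driven weights, which this naive stand-in does not implement:

```python
# A crude change-point detector: flag indices where the jump in a
# piecewise-constant signal exceeds a per-position weighted threshold.
# A naive stand-in for intuition only; the paper solves a convex
# weighted total-variation penalized problem with data-driven weights.

def detect_changes(signal, weights, lam):
    """Return indices i with |signal[i] - signal[i-1]| > lam * weights[i]."""
    return [i for i in range(1, len(signal))
            if abs(signal[i] - signal[i - 1]) > lam * weights[i]]

# Example: a piecewise-constant intensity with jumps at indices 3 and 6.
signal = [1.0, 1.0, 1.1, 4.0, 4.1, 4.0, 0.5, 0.5]
weights = [1.0] * len(signal)
print(detect_changes(signal, weights, 1.0))  # [3, 6]
```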
Journal of Machine Learning Research | 2014
Emile Richard; Stéphane Gaïffas; Nicolas Vayatis
arXiv: Information Theory | 2011
Stéphane Gaïffas; Guillaume Lecué
Journal of Machine Learning Research | 2011
Stéphane Gaïffas; Guillaume Lecué