Guillaume Obozinski
École des ponts ParisTech
Publications
Featured research published by Guillaume Obozinski.
international conference on machine learning | 2009
Laurent Jacob; Guillaume Obozinski; Jean-Philippe Vert
We propose a new penalty function which, when used as regularization for empirical risk minimization procedures, leads to sparse estimators. The support of the sparse vector is typically a union of potentially overlapping groups of covariates defined a priori, or a set of covariates which tend to be connected to each other when a graph of covariates is given. We study theoretical properties of the estimator, and illustrate its behavior on simulated and breast cancer gene expression data.
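A useful property of this overlapping-group penalty (the "latent" group Lasso of this paper) is that it reduces to an ordinary, non-overlapping group Lasso once each covariate is duplicated into every group that contains it. A minimal numpy sketch of that duplication trick, with hypothetical groups, is:

```python
import numpy as np

def expand_overlapping_groups(X, groups):
    """Duplicate columns of X so that overlapping groups become disjoint.

    Each covariate gets one copy per group containing it; a standard
    (non-overlapping) group Lasso on the expanded design then induces
    a support that is a union of the original groups.
    """
    cols, new_groups, start = [], [], 0
    for g in groups:
        cols.append(X[:, g])  # fresh copy of the columns in group g
        new_groups.append(list(range(start, start + len(g))))
        start += len(g)
    return np.hstack(cols), new_groups

# Hypothetical example: 5 covariates, two groups sharing covariate 2.
X = np.random.randn(10, 5)
X_dup, disjoint = expand_overlapping_groups(X, [[0, 1, 2], [2, 3, 4]])
print(X_dup.shape, disjoint)  # (10, 6) [[0, 1, 2], [3, 4, 5]]
```

Any standard group Lasso solver run on the expanded design then selects a set of covariates that is a union of the original groups.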
arXiv: Learning | 2012
Francis R. Bach; Rodolphe Jenatton; Julien Mairal; Guillaume Obozinski
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate nonsmooth norms. The goal of this monograph is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted ℓ2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view.
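As a concrete instance of the proximal methods the monograph covers, here is a minimal sketch of ISTA for the ℓ1-regularized least-squares (Lasso) problem; the fixed step size 1/L and the iteration count are illustrative choices, not the monograph's recommendations.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iter=500):
    """Proximal gradient (ISTA) for min_w 0.5*||y - Xw||^2 + lam*||w||_1."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1/L, L = Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)            # gradient of the smooth least-squares term
        w = soft_threshold(w - step * grad, step * lam)
    return w
```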
Statistics and Computing | 2010
Guillaume Obozinski; Ben Taskar; Michael I. Jordan
We address the problem of recovering a common set of covariates that are relevant simultaneously to several classification problems. By penalizing the sum of ℓ2 norms of the blocks of coefficients associated with each covariate across different classification problems, similar sparsity patterns in all models are encouraged. To take computational advantage of the sparsity of solutions at high regularization levels, we propose a blockwise path-following scheme that approximately traces the regularization path. As the regularization coefficient decreases, the algorithm maintains and updates concurrently a growing set of covariates that are simultaneously active for all problems. We also show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace. We present theoretical results showing that this random projection approach converges to the solution yielded by trace-norm regularization. Finally, we present a variety of experimental results exploring joint covariate selection and joint subspace selection, comparing the path-following approach to competing algorithms in terms of prediction accuracy and running time.
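The row-wise ℓ1/ℓ2 penalty used here has a closed-form proximal operator that zeroes a covariate's entire coefficient block across all problems at once, which is what encourages the shared sparsity pattern. A short numpy sketch of that operator (not the paper's path-following algorithm itself):

```python
import numpy as np

def block_soft_threshold(B, t):
    """Proximal operator of t * sum_j ||B[j, :]||_2 for B of shape (p, K).

    Rows whose l2 norm falls below t are zeroed jointly across all K
    problems; surviving rows are shrunk toward zero.
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return B * scale
```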
Genome Biology | 2008
Lourdes Peña-Castillo; Murat Tasan; Chad L. Myers; Hyunju Lee; Trupti Joshi; Chao Zhang; Yuanfang Guan; Michele Leone; Andrea Pagnani; Wan-Kyu Kim; Chase Krumpelman; Weidong Tian; Guillaume Obozinski; Yanjun Qi; Guan Ning Lin; Gabriel F. Berriz; Francis D. Gibbons; Gert R. G. Lanckriet; Jian-Ge Qiu; Charles E. Grant; Zafer Barutcuoglu; David P. Hill; David Warde-Farley; Chris Grouios; Debajyoti Ray; Judith A. Blake; Minghua Deng; Michael I. Jordan; William Stafford Noble; Quaid Morris
Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.
Statistical Science | 2012
Francis R. Bach; Rodolphe Jenatton; Julien Mairal; Guillaume Obozinski
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through regularization by the ℓ1-norm. In this paper, we consider situations where we are not only interested in sparsity, but where some structural prior knowledge is available as well. We show that the ℓ1-norm can then be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures. We present applications to unsupervised learning, for structured sparse principal component analysis and hierarchical dictionary learning, and to supervised learning in the context of non-linear variable selection.
information processing in medical imaging | 2007
Julien Lefèvre; Guillaume Obozinski; Sylvain Baillet
computer vision and pattern recognition | 2015
Mateusz Kozinski; Raghudeep Gadde; Sergey Zagoruyko; Guillaume Obozinski; Renaud Marlet
SIAM Journal on Imaging Sciences | 2017
Loic Landrieu; Guillaume Obozinski
asian conference on computer vision | 2014
Mateusz Kozinski; Guillaume Obozinski; Renaud Marlet
allerton conference on communication, control, and computing | 2008
Guillaume Obozinski; Martin J. Wainwright; Michael I. Jordan
In the problem of multivariate regression, a K-dimensional response vector is regressed upon a common set of p covariates, with a matrix B* ∈ ℝ^(p×K) of regression coefficients. We study the behavior of the group Lasso using ℓ1/ℓ2 regularization for the union support problem, meaning that the set of s rows for which B* is non-zero is recovered exactly. Studying this problem under high-dimensional scaling, we show that the group Lasso recovers the exact row pattern with high probability over the random design and noise for scalings of (n, p, s) such that the sample complexity parameter θ(n, p, s) := n/[2ψ(B*) log(p − s)] exceeds a critical threshold. Here n is the sample size, p is the ambient dimension of the regression model, s is the number of non-zero rows, and ψ(B*) is a sparsity-overlap function that measures a combination of the sparsities and overlaps of the K regression coefficient vectors that constitute the model. This sparsity-overlap function reveals that, if the design is uncorrelated on the active rows, block ℓ1/ℓ2 regularization for multivariate regression never harms performance relative to an ordinary Lasso approach, and can yield substantial improvements in sample complexity (up to a factor of K) when the regression vectors are suitably orthogonal. For more general designs, it is possible for the ordinary Lasso to outperform the group Lasso.
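To make that closing comparison concrete, here is a toy simulation sketch contrasting block ℓ1/ℓ2 estimation (scikit-learn's MultiTaskLasso) with independent per-task Lassos on a shared union support; all problem sizes and the regularization level alpha=0.1 are arbitrary assumptions, not values from the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso, MultiTaskLasso

rng = np.random.default_rng(0)
n, p, s, K = 100, 50, 5, 4                 # samples, covariates, active rows, tasks
B = np.zeros((p, K))
B[:s, :] = rng.normal(size=(s, K))         # shared union support on the first s rows
X = rng.normal(size=(n, p))
Y = X @ B + 0.1 * rng.normal(size=(n, K))

# Block l1/l2 regularization: one support shared across all K responses.
mtl = MultiTaskLasso(alpha=0.1).fit(X, Y)
rows_group = np.flatnonzero(np.linalg.norm(mtl.coef_.T, axis=1) > 1e-8)

# Ordinary Lasso fit independently per task; supports are then unioned.
rows_lasso = set()
for k in range(K):
    w = Lasso(alpha=0.1).fit(X, Y[:, k]).coef_
    rows_lasso |= set(np.flatnonzero(np.abs(w) > 1e-8))

print("group Lasso rows:", rows_group)
print("per-task Lasso union:", sorted(rows_lasso))
```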