Alexandre Belloni
Duke University
Publications
Featured research published by Alexandre Belloni.
Biometrika | 2011
Alexandre Belloni; Victor Chernozhukov; Lie Wang
We propose a pivotal method for estimating high-dimensional sparse linear regression models, where the overall number of regressors p is large, possibly much larger than n, but only s regressors are significant. The method is a modification of the lasso, called the square-root lasso. The method is pivotal in that it neither relies on the knowledge of the standard deviation σ nor does it need to pre-estimate σ. Moreover, the method does not rely on normality or sub-Gaussianity of noise. It achieves near-oracle performance, attaining the convergence rate σ{(s/n) log p}^{1/2} in the prediction norm, and thus matching the performance of the lasso with known σ. These performance results are valid for both Gaussian and non-Gaussian errors, under some mild moment restrictions. We formulate the square-root lasso as a solution to a convex conic programming problem, which allows us to implement the estimator using efficient algorithmic methods, such as interior-point and first-order methods.
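Because the square-root lasso is a conic program, a generic convex solver can compute it directly. Below is a minimal sketch using cvxpy, not the authors' implementation; the simulated data and the penalty level lam are illustrative assumptions, with lam chosen in the spirit of the paper's pivotal rule in that it involves only n and p, never σ.

```python
# A minimal sketch of the square-root lasso as a convex conic program,
# using cvxpy (not the authors' implementation). Data and penalty level
# are illustrative assumptions.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, p, s = 100, 500, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 1.0
y = X @ beta_true + rng.standard_normal(n)

# Pivotal-style penalty: lam depends only on (n, p), not on sigma.
lam = 1.1 * np.sqrt(2.0 * np.log(p) / n)

beta = cp.Variable(p)
# sqrt-lasso objective: ||y - X beta||_2 / sqrt(n) + lam * ||beta||_1
objective = cp.norm(y - X @ beta, 2) / np.sqrt(n) + lam * cp.norm(beta, 1)
problem = cp.Problem(cp.Minimize(objective))
problem.solve()  # cvxpy reformulates this as a second-order cone program

support = np.flatnonzero(np.abs(beta.value) > 1e-6)
print("estimated support:", support)
```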
Econometrica | 2010
Alexandre Belloni; Daniel L. Chen; Victor Chernozhukov; Christian Hansen
We develop results for the use of LASSO and Post-LASSO methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p, that apply even when p is much larger than the sample size, n. We rigorously develop asymptotic distribution and inference theory for the resulting IV estimators and provide conditions under which these estimators are asymptotically oracle-efficient. In simulation experiments, the LASSO-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the LASSO-based IV estimator substantially reduces estimated standard errors allowing one to draw much more precise conclusions about the economic effects of these decisions. Optimal instruments are conditional expectations; and in developing the IV results, we also establish a series of new results for LASSO and Post-LASSO estimators of non-parametric conditional expectation functions which are of independent theoretical and practical interest. Specifically, we develop the asymptotic theory for these estimators that allows for non-Gaussian, heteroscedastic disturbances, which is important for econometric applications. By innovatively using moderate deviation theory for self-normalized sums, we provide convergence rates for these estimators that are as sharp as in the homoscedastic Gaussian case under the weak condition that log p = o(n^{1/3}). Moreover, as a practical innovation, we provide a fully data-driven method for choosing the user-specified penalty that must be provided in obtaining LASSO and Post-LASSO estimates and establish its asymptotic validity under non-Gaussian, heteroscedastic disturbances.
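A simplified sketch of the two-step idea follows: the first stage predicts the endogenous regressor from many instruments by lasso, and the fitted values then serve as the constructed instrument. Here scikit-learn's LassoCV stands in for the paper's data-driven penalty, and the simulated design and the plain sample-moment IV formula are illustrative assumptions; Post-LASSO would additionally refit OLS on the selected instruments.

```python
# A minimal sketch of lasso-based optimal instruments (not the authors'
# code): lasso first stage, then IV with the fitted values as instrument.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p_z = 200, 100
Z = rng.standard_normal((n, p_z))
# endogenous regressor: depends on a few instruments plus an error u
u = rng.standard_normal(n)
d = Z[:, :3] @ np.array([1.0, 0.8, 0.6]) + u
# outcome: structural effect 0.5 of d; v correlated with u creates endogeneity
v = 0.5 * u + rng.standard_normal(n)
y = 0.5 * d + v

# First stage: lasso; cross-validation replaces the paper's penalty rule here.
first_stage = LassoCV(cv=5).fit(Z, d)
d_hat = first_stage.predict(Z)

# IV estimator with the constructed instrument d_hat:
# alpha_hat = (d_hat' d)^{-1} d_hat' y
alpha_hat = (d_hat @ y) / (d_hat @ d)
print("IV estimate of the structural effect:", alpha_hat)
```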
Annals of Statistics | 2009
Alexandre Belloni; Victor Chernozhukov
We consider median regression and, more generally, quantile regression in high-dimensional sparse models. In these models the overall number of regressors p is very large, possibly larger than the sample size n, but only s of these regressors have non-zero impact on the conditional quantile of the response variable, where s grows slower than n. Since in this case the ordinary quantile regression is not consistent, we consider quantile regression penalized by the L1-norm of coefficients (L1-QR). First, we show that L1-QR is consistent at the rate of the square root of (s/n) log p, which is close to the oracle rate of the square root of (s/n), achievable when the minimal true model is known. The overall number of regressors p affects the rate only through the log p factor, thus allowing nearly exponential growth in the number of zero-impact regressors. The rate result holds under relatively weak conditions, requiring that s/n converges to zero at a super-logarithmic speed and that the regularization parameter satisfies certain theoretical constraints. Second, we propose a pivotal, data-driven choice of the regularization parameter and show that it satisfies these theoretical constraints. Third, we show that L1-QR correctly selects the true minimal model as a valid submodel, when the non-zero coefficients of the true model are well separated from zero. We also show that the number of non-zero coefficients in L1-QR is of the same stochastic order as s, the number of non-zero coefficients in the minimal true model. Fourth, we analyze the rate of convergence of a two-step estimator that applies ordinary quantile regression to the selected model. Fifth, we evaluate the performance of L1-QR in a Monte Carlo experiment, and provide an application to the analysis of international economic growth.
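The two-step procedure described above can be sketched with off-the-shelf tools: scikit-learn's QuantileRegressor solves the same L1-penalized check-loss program, although it is not the paper's implementation, and the penalty level below is an illustrative choice rather than the paper's pivotal data-driven rule.

```python
# A minimal sketch of L1-penalized median regression (L1-QR) and the
# two-step post-selection refit, on simulated data (illustrative only).
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(2)
n, p, s = 200, 400, 4
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 2.0
y = X @ beta_true + rng.standard_normal(n)

# L1-QR at the median (quantile = 0.5); alpha scales the L1 penalty.
l1_qr = QuantileRegressor(quantile=0.5, alpha=0.1, solver="highs").fit(X, y)
selected = np.flatnonzero(np.abs(l1_qr.coef_) > 1e-8)
print("selected regressors:", selected)

# Step two: ordinary (unpenalized) quantile regression on the selected model.
post_qr = QuantileRegressor(quantile=0.5, alpha=0.0, solver="highs")
post_qr.fit(X[:, selected], y)
print("post-selection coefficients:", post_qr.coef_)
```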
Bernoulli | 2013
Alexandre Belloni; Victor Chernozhukov
In this paper we study post-model selection estimators which apply ordinary least squares (ols) to the model selected by first-step penalized estimators, typically lasso. It is well known that lasso can estimate the non-parametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the ols post lasso estimator performs at least as well as lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the lasso-based model selection “fails” in the sense of missing some components of the “true” regression model. By the “true” model we mean here the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, the ols post lasso estimator can perform strictly better than lasso, in the sense of a strictly faster rate of convergence, if the lasso-based model selection correctly includes all components of the “true” model as a subset and also achieves sufficient sparsity. In the extreme case, when lasso perfectly selects the “true” model, the ols post lasso estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by lasso which guarantees that this dimension is at most of the same order as the dimension of the “true” model. Our rate results are non-asymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the lasso estimator acting as selector in the first step, but also applies to any other estimator, for example various forms of thresholded lasso, with good rates and good sparsity properties. Our analysis covers both traditional thresholding and a new practical, data-driven thresholding scheme that induces maximal sparsity subject to maintaining a certain goodness-of-fit. The latter scheme has theoretical guarantees similar to those of lasso or ols post lasso, but it dominates these procedures as well as traditional thresholding in a wide variety of experiments. First arXiv version: December 2009.
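The estimator itself is simple to state: lasso acts as a model selector, and ordinary least squares is refit on the selected support. A minimal sketch follows, with scikit-learn's cross-validated penalty as an illustrative stand-in for the penalty choices studied in the paper.

```python
# A minimal sketch of the ols-post-lasso estimator: lasso selects the
# model, then OLS on the selected support removes the shrinkage bias.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(3)
n, p, s = 150, 300, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 1.5
y = X @ beta_true + rng.standard_normal(n)

lasso = LassoCV(cv=5).fit(X, y)
support = np.flatnonzero(lasso.coef_)  # model selected in the first step

# Second step: OLS refit on the selected model.
post = LinearRegression().fit(X[:, support], y)
print("selected support:", support)
print("ols-post-lasso coefficients:", post.coef_)
```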
Management Science | 2008
Alexandre Belloni; Robert M. Freund; Matthew Selove; Duncan Simester
We take advantage of recent advances in optimization methods and computer hardware to identify globally optimal solutions of product line design problems that are too large for complete enumeration. We then use this guarantee of global optimality to benchmark the performance of more practical heuristic methods. We use two sources of data: (1) a conjoint study previously conducted for a real product line design problem, and (2) simulated problems of various sizes. For both data sources, several of the heuristic methods consistently find optimal or near-optimal solutions, including simulated annealing, divide-and-conquer, product-swapping, and genetic algorithms.
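To make the heuristics concrete, here is a toy sketch of simulated annealing with a product-swapping move on a stylized share-of-choice objective; the part-worth utilities, the outside option, the neighborhood move, and the cooling schedule are all illustrative assumptions, not the paper's actual problem instances.

```python
# A toy simulated-annealing sketch for a stylized product line design
# problem (illustrative assumptions throughout, not the paper's data).
import numpy as np

rng = np.random.default_rng(4)
n_customers, n_candidates, line_size = 50, 20, 3
# part-worth utility of each candidate product for each customer
utility = rng.standard_normal((n_customers, n_candidates))

def share_of_choice(line):
    # a customer buys if the best product in the line beats an outside option 0
    return np.mean(utility[:, line].max(axis=1) > 0.0)

line = list(rng.choice(n_candidates, size=line_size, replace=False))
best_line, best_val = line[:], share_of_choice(line)
temp = 1.0
for step in range(2000):
    # product-swapping move: replace one product in the line with a new one
    candidate = line[:]
    out = rng.integers(line_size)
    candidate[out] = rng.choice([j for j in range(n_candidates) if j not in line])
    delta = share_of_choice(candidate) - share_of_choice(line)
    # accept improvements always, deteriorations with probability exp(delta/temp)
    if delta > 0 or rng.random() < np.exp(delta / temp):
        line = candidate
    if share_of_choice(line) > best_val:
        best_line, best_val = line[:], share_of_choice(line)
    temp *= 0.995  # geometric cooling

print("best line found:", sorted(best_line), "share of choice:", best_val)
```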
Annals of Operations Research | 2003
Alexandre Belloni; Andre L. Diniz Souto Lima; Maria Elvira Piñeiro Maceira; Claudia A. Sagastizábal
We consider the inclusion of commitment of thermal generation units in the optimal management of the Brazilian power system. By means of Lagrangian relaxation we decompose the problem and obtain a nondifferentiable dual function that is separable. We solve the dual problem with a bundle method. Our purpose is twofold: first, bundle methods are the methods of choice in nonsmooth optimization when it comes to solve large-scale problems with high precision. Second, they give good starting points for recovering primal solutions. We use an inexact augmented Lagrangian technique to find a near-optimal primal feasible solution. We assess our approach with numerical results.
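The decomposition idea can be illustrated on a toy commitment problem: dualizing the coupling demand constraint makes the relaxed problem separable by unit, and the concave dual can then be maximized. The sketch below uses a plain projected subgradient method rather than the bundle method the paper employs, and the costs, capacities, and demand are made-up illustrative numbers.

```python
# A toy sketch of Lagrangian relaxation for a commitment-style problem.
# The dual is maximized by a simple subgradient method here; the paper
# uses a bundle method, which is more robust on large instances.
import numpy as np

cost = np.array([3.0, 2.0, 5.0, 4.0])     # per-unit generation cost
cap = np.array([10.0, 15.0, 20.0, 12.0])  # output if the unit is committed
demand = 30.0

lam, step0 = 0.0, 1.0
for k in range(1, 200):
    # separable subproblem: commit unit i iff its reduced cost is negative
    commit = (cost - lam) < 0.0
    output = cap * commit
    subgrad = demand - output.sum()              # dual subgradient at lam
    lam = max(0.0, lam + (step0 / k) * subgrad)  # projected subgradient step

print("dual multiplier:", lam)
print("committed units at the dual solution:", np.flatnonzero(commit))
# A primal-recovery step (the paper uses an inexact augmented Lagrangian)
# would then repair any remaining infeasibility in the committed schedule.
```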
arXiv: Applications | 2011
Alexandre Belloni; Victor Chernozhukov
In this chapter we discuss conceptually high dimensional sparse econometric models as well as estimation of these models using ℓ1-penalization and post-ℓ1-penalization methods. Focusing on linear and nonparametric regression frameworks, we discuss various econometric examples, present basic theoretical results, and illustrate the concepts and methods with Monte Carlo simulations and an empirical application. In the application, we examine and confirm the empirical validity of the Solow-Swan model for international economic growth.
Annals of Statistics | 2007
Alexandre Belloni; Victor Chernozhukov
In this paper we examine the implications of the statistical large sample theory for the computational complexity of Bayesian and quasi-Bayesian estimation carried out using Metropolis random walks. Our analysis is motivated by the Laplace-Bernstein-Von Mises central limit theorem, which states that in large samples the posterior or quasi-posterior approaches a normal density. Using this observation, we establish polynomial bounds on the computational complexity of general Metropolis random walk methods in large samples. Our analysis covers cases where the underlying log-likelihood or extremum criterion function is possibly nonconcave, discontinuous, and of increasing dimension. However, the central limit theorem restricts the deviations from continuity and log-concavity of the log-likelihood or extremum criterion function in a very specific manner.
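For concreteness, here is a minimal random-walk Metropolis sampler of the kind the paper analyzes, targeting a one-dimensional posterior; the toy model, flat prior, and proposal scale are illustrative assumptions. In large samples the posterior is nearly normal, which is exactly the structure the paper's complexity bounds exploit.

```python
# A minimal random-walk Metropolis sketch on a toy posterior
# (illustrative model and tuning, not the paper's setting).
import numpy as np

rng = np.random.default_rng(5)
data = rng.normal(loc=1.0, scale=1.0, size=500)

def log_post(theta):
    # log-likelihood of a N(theta, 1) model with a flat prior; by the
    # Bernstein-von Mises theorem this posterior is nearly normal here
    return -0.5 * np.sum((data - theta) ** 2)

theta, chain = 0.0, []
for _ in range(5000):
    proposal = theta + 0.1 * rng.standard_normal()  # random walk step
    if np.log(rng.random()) < log_post(proposal) - log_post(theta):
        theta = proposal  # Metropolis accept/reject
    chain.append(theta)

burned = np.array(chain[1000:])
print("posterior mean and sd:", burned.mean(), burned.std())
```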
Journal of Business & Economic Statistics | 2016
Alexandre Belloni; Victor Chernozhukov; Ying Wei
This article considers generalized linear models in the presence of many controls. We lay out a general methodology to estimate an effect of interest based on the construction of an instrument that immunizes against model selection mistakes and apply it to the case of the logistic binary choice model. More specifically, we propose new methods for estimating and constructing confidence regions for a regression parameter of primary interest α0, a parameter in front of the regressor of interest, such as the treatment variable or a policy variable. These methods allow one to estimate α0 at the root-n rate when the total number p of other regressors, called controls, potentially exceeds the sample size n using sparsity assumptions. The sparsity assumption means that there is a subset of s < n controls that suffices to accurately approximate the nuisance part of the regression function. Importantly, the estimators and the resulting confidence regions are valid uniformly over s-sparse models satisfying s^2 log^2 p = o(n) and other technical conditions. These procedures do not rely on traditional consistent model selection arguments for their validity. In fact, they are robust with respect to moderate model selection mistakes in variable selection. Under suitable conditions, the estimators are semi-parametrically efficient in the sense of attaining the semi-parametric efficiency bounds for the class of models in this article.
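A heavily simplified double-selection sketch in the spirit of the paper's immunized construction (not its exact procedure) follows: one lasso-type step selects controls predictive of the outcome, another selects controls predictive of the treatment, and the logistic model for the effect of interest is refit on the union. The simulated design and all tuning choices are illustrative assumptions.

```python
# A simplified double-selection sketch for a logistic model with many
# controls (in the spirit of, not identical to, the paper's procedure).
import numpy as np
from sklearn.linear_model import LogisticRegression, LassoCV

rng = np.random.default_rng(6)
n, p = 400, 200
X = rng.standard_normal((n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)  # confounded treatment
logits = 0.5 * d + X[:, 0] - X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Step 1: L1-logistic selects controls predictive of the outcome.
sel_y = LogisticRegression(penalty="l1", C=0.1, solver="liblinear").fit(X, y)
# Step 2: lasso selects controls predictive of the treatment.
sel_d = LassoCV(cv=5).fit(X, d)
controls = np.union1d(np.flatnonzero(sel_y.coef_.ravel()),
                      np.flatnonzero(sel_d.coef_))

# Step 3: logistic fit of y on d and the union of selected controls;
# a very large C makes the default ridge penalty negligible here.
W = np.column_stack([d, X[:, controls]])
final = LogisticRegression(C=1e6, max_iter=1000).fit(W, y)
print("estimated treatment coefficient:", final.coef_.ravel()[0])
```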
arXiv: Methodology | 2011
Alexandre Belloni; Victor Chernozhukov; Denis Chetverikov; Ivan Fernandez-Val
Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop the nonparametric QR-series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the QR-series coefficient process, namely we obtain uniform strong approximations to the QR-series coefficient process by conditionally pivotal and Gaussian processes. Based on these strong approximations, or couplings, we develop four resampling methods (pivotal, gradient bootstrap, Gaussian, and weighted bootstrap) that can be used for inference on the entire QR-series coefficient function. We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence and show how to use the four resampling methods mentioned above for inference on the functionals. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and the covariate value, and covering the pointwise case as a by-product. We demonstrate the practical utility of these results with an example, where we estimate the price elasticity function and test the Slutsky condition of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption.
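The core estimation step of the QR-series framework can be sketched directly: approximate the conditional quantile function by quantile regression on a series basis, with quantile-specific coefficients estimated over a grid of quantile indices. The polynomial basis, the quantile grid, and the data-generating process below are illustrative assumptions, and the paper's resampling-based inference is omitted.

```python
# A minimal QR-series sketch: quantile regression on a polynomial basis,
# fit separately over a grid of quantile indices (illustrative setup).
import numpy as np
from sklearn.linear_model import QuantileRegressor
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(7)
n = 500
x = rng.uniform(-1, 1, size=n)
y = np.sin(np.pi * x) + (0.5 + 0.3 * x) * rng.standard_normal(n)

basis = PolynomialFeatures(degree=4, include_bias=False)
Z = basis.fit_transform(x[:, None])  # series terms z(x)

taus = [0.1, 0.25, 0.5, 0.75, 0.9]
coefs = {}
for tau in taus:
    fit = QuantileRegressor(quantile=tau, alpha=0.0, solver="highs").fit(Z, y)
    coefs[tau] = (fit.intercept_, fit.coef_)  # quantile-specific coefficients

# Evaluate the estimated conditional quantile function Q(tau | x) on a grid.
grid = np.linspace(-1, 1, 5)
Zg = basis.transform(grid[:, None])
for tau in taus:
    b0, b = coefs[tau]
    print(f"tau={tau}:", np.round(b0 + Zg @ b, 2))
```

Linear functionals such as partial derivatives of the conditional quantile function could then be read off by applying the corresponding linear map (for example, differentiating the series basis) to the estimated coefficients.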