Stefan Wager | Researchain

Publication

Featured research published by Stefan Wager.

Journal of the American Statistical Association | 2018

Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

Stefan Wager; Susan Athey

Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this article, we develop a nonparametric causal forest for estimating heterogeneous treatment effects that extends Breiman’s widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.

Proceedings of the National Academy of Sciences of the United States of America | 2016

High-dimensional regression adjustments in randomized experiments.

Stefan Wager; Wenfei Du; Jonathan Taylor; Robert Tibshirani

Significance: As datasets get larger and more complex, there is a growing interest in using machine-learning methods to enhance scientific analysis. In many settings, considerable work is required to make standard machine-learning methods useful for specific scientific applications. We find, however, that in the case of treatment effect estimation with randomized experiments, regression adjustments via machine-learning methods designed to minimize test set error directly induce efficient estimates of the average treatment effect. Thus, machine-learning methods can be used out of the box for this task, without any special-case adjustments.

We study the problem of treatment effect estimation in randomized experiments with high-dimensional covariate information and show that essentially any risk-consistent regression adjustment can be used to obtain efficient estimates of the average treatment effect. Our results considerably extend the range of settings where high-dimensional regression adjustments are guaranteed to provide valid inference about the population average treatment effect. We then propose cross-estimation, a simple method for obtaining finite-sample–unbiased treatment effect estimates that leverages high-dimensional regression adjustments. Our method can be used when the regression model is estimated using the lasso, the elastic net, subset selection, etc. Finally, we extend our analysis to allow for adaptive specification search via cross-validation and flexible nonparametric regression adjustments with machine-learning methods such as random forests or neural networks.
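The regression-adjustment idea in this abstract can be illustrated with a minimal sketch. This is not the paper's exact cross-estimation procedure: it is a generic cross-fitted adjustment that uses ridge regression as the outcome model, and the function names (`ridge_predict`, `cross_fitted_ate`), the simulated data, and all parameter values are assumptions made only for this example.

```python
import numpy as np


def ridge_predict(X_tr, y_tr, X_te, ridge):
    """Fit ridge regression with an unpenalized intercept; predict on X_te."""
    x_mean, y_mean = X_tr.mean(axis=0), y_tr.mean()
    Xc = X_tr - x_mean
    gram = Xc.T @ Xc + ridge * np.eye(X_tr.shape[1])
    beta = np.linalg.solve(gram, Xc.T @ (y_tr - y_mean))
    return y_mean + (X_te - x_mean) @ beta


def cross_fitted_ate(X, y, w, n_folds=5, ridge=1.0, seed=0):
    """Regression-adjusted average treatment effect for a randomized experiment.

    Outcome models for each arm are fit on the other folds, so every unit's
    prediction comes from a model that never saw that unit's outcome.
    Returns the point estimate and a plug-in standard error.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds
    mu0, mu1 = np.empty(n), np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        for arm, mu in ((0, mu0), (1, mu1)):
            idx = train & (w == arm)
            mu[test] = ridge_predict(X[idx], y[idx], X[test], ridge)
    pi = w.mean()  # treated fraction, used as the (assumed constant) propensity
    scores = mu1 - mu0 + w * (y - mu1) / pi - (1 - w) * (y - mu0) / (1 - pi)
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n)


# Simulated randomized experiment with a true average effect of 2.0 (assumed).
rng = np.random.default_rng(1)
n, p = 2000, 10
X = rng.normal(size=(n, p))
w = rng.integers(0, 2, size=n)
y = X @ np.ones(p) + 2.0 * w + rng.normal(size=n)

tau_hat, se = cross_fitted_ate(X, y, w)
print(f"adjusted ATE estimate: {tau_hat:.3f} (se {se:.3f})")
```

In this simulation the covariates explain most of the outcome variance, so the adjusted estimate should be noticeably more precise than a plain difference in means between the two arms.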

Annals of Statistics | 2018

High-dimensional asymptotics of prediction: Ridge regression and classification

Edgar Dobriban; Stefan Wager

We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model. We work in a high-dimensional asymptotic regime where $p, n \to \infty$ and $p/n \to \gamma \in (0, \infty)$, and allow for arbitrary covariance among the features. For both methods, we provide an explicit and efficiently computable expression for the limiting predictive risk, which depends only on the spectrum of the feature-covariance matrix, the signal strength, and the aspect ratio $\gamma$. Especially in the case of regularized discriminant analysis, we find that predictive accuracy has a nuanced dependence on the eigenvalue distribution of the covariance matrix, suggesting that analyses based on the operator norm of the covariance matrix may not be sharp. Our results also uncover several qualitative insights about both methods: for example, with ridge regression, there is an exact inverse relation between the limiting predictive risk and the limiting estimation risk given a fixed signal strength. Our analysis builds on recent advances in random matrix theory.

The American Statistician | 2015

Teaching Statistics at Google-Scale

Nicholas Chamandy; Omkar Muralidharan; Stefan Wager

Modern data and applications pose very different challenges from those of the 1950s or even the 1980s. Students contemplating a career in statistics or data science need to have the tools to tackle problems involving massive, heavy-tailed data, often interacting with live, complex systems. However, despite the deepening connections between engineering and modern data science, we argue that training in classical statistical concepts plays a central role in preparing students to solve Google-scale problems. To this end, we present three industrial applications where significant modern data challenges were overcome by statistical thinking. [Received December 2014. Revised August 2015.]

Journal of The Royal Statistical Society Series B-statistical Methodology | 2018

Approximate residual balancing: debiased inference of average treatment effects in high dimensions

Susan Athey; Guido W. Imbens; Stefan Wager

international conference on conceptual structures | 2015

Stable Autoencoding: A Flexible Framework for Regularized Low-rank Matrix Estimation

Julie Josse; Stefan Wager

Journal of Multivariate Analysis | 2014

Subsampling extremes: From block maxima to smooth tail estimation

Stefan Wager

PLOS ONE | 2016

Prevalence and Predictors of Malnutrition among Guatemalan Children at 2 Years of Age

Jason M. Nagata; James Gippetti; Stefan Wager; Alejandro Chavez; Paul H. Wise

The Annals of Applied Statistics | 2015

Weakly supervised clustering: Learning fine-grained signals from coarse labels

Stefan Wager; Alexander Blocker; Niall Cardin

PLOS ONE | 2014

Quantifying and Exploiting the Age Dependence in the Effect of Supplementary Food for Child Undernutrition

Milinda Lakkam; Stefan Wager; Paul H. Wise; Lawrence M. Wein
