Stefan Wager
Stanford University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Stefan Wager.
Journal of the American Statistical Association | 2018
Stefan Wager; Susan Athey
ABSTRACT Many scientific and engineering challenges—ranging from personalized medicine to customized marketing recommendations—require an understanding of treatment effect heterogeneity. In this article, we develop a nonparametric causal forest for estimating heterogeneous treatment effects that extends Breiman’s widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.
Proceedings of the National Academy of Sciences of the United States of America | 2016
Stefan Wager; Wenfei Du; Jonathan Taylor; Robert Tibshirani
Significance As datasets get larger and more complex, there is a growing interest in using machine-learning methods to enhance scientific analysis. In many settings, considerable work is required to make standard machine-learning methods useful for specific scientific applications. We find, however, that in the case of treatment effect estimation with randomized experiments, regression adjustments via machine-learning methods designed to minimize test set error directly induce efficient estimates of the average treatment effect. Thus, machine-learning methods can be used out of the box for this task, without any special-case adjustments. We study the problem of treatment effect estimation in randomized experiments with high-dimensional covariate information and show that essentially any risk-consistent regression adjustment can be used to obtain efficient estimates of the average treatment effect. Our results considerably extend the range of settings where high-dimensional regression adjustments are guaranteed to provide valid inference about the population average treatment effect. We then propose cross-estimation, a simple method for obtaining finite-sample–unbiased treatment effect estimates that leverages high-dimensional regression adjustments. Our method can be used when the regression model is estimated using the lasso, the elastic net, subset selection, etc. Finally, we extend our analysis to allow for adaptive specification search via cross-validation and flexible nonparametric regression adjustments with machine-learning methods such as random forests or neural networks.
Annals of Statistics | 2018
Edgar Dobriban; Stefan Wager
We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model. We work in a high-dimensional asymptotic regime where
The American Statistician | 2015
Nicholas Chamandy; Omkar Muralidharan; Stefan Wager
p, n \to \infty
Journal of The Royal Statistical Society Series B-statistical Methodology | 2018
Susan Athey; Guido W. Imbens; Stefan Wager
and
international conference on conceptual structures | 2015
Julie Josse; Stefan Wager
p/n \to \gamma \in (0, \, \infty)
Journal of Multivariate Analysis | 2014
Stefan Wager
, and allow for arbitrary covariance among the features. For both methods, we provide an explicit and efficiently computable expression for the limiting predictive risk, which depends only on the spectrum of the feature-covariance matrix, the signal strength, and the aspect ratio
PLOS ONE | 2016
Jason M. Nagata; James Gippetti; Stefan Wager; Alejandro Chavez; Paul H. Wise
\gamma
The Annals of Applied Statistics | 2015
Stefan Wager; Alexander Blocker; Niall Cardin
. Especially in the case of regularized discriminant analysis, we find that predictive accuracy has a nuanced dependence on the eigenvalue distribution of the covariance matrix, suggesting that analyses based on the operator norm of the covariance matrix may not be sharp. Our results also uncover several qualitative insights about both methods: for example, with ridge regression, there is an exact inverse relation between the limiting predictive risk and the limiting estimation risk given a fixed signal strength. Our analysis builds on recent advances in random matrix theory.
PLOS ONE | 2014
Milinda Lakkam; Stefan Wager; Paul H. Wise; Lawrence M. Wein
Modern data and applications pose very different challenges from those of the 1950s or even the 1980s. Students contemplating a career in statistics or data science need to have the tools to tackle problems involving massive, heavy-tailed data, often interacting with live, complex systems. However, despite the deepening connections between engineering and modern data science, we argue that training in classical statistical concepts plays a central role in preparing students to solve Google-scale problems. To this end, we present three industrial applications where significant modern data challenges were overcome by statistical thinking. [Received December 2014. Revised August 2015.]