Featured Research

Econometrics

Causal Inference for Spatial Treatments

I propose a framework, estimators, and inference procedures for the analysis of causal effects in a setting with spatial treatments. Many events and policies (treatments), such as the opening of businesses, the building of hospitals, and sources of pollution, occur at specific spatial locations, with researchers interested in their effects on nearby individuals or businesses (outcome units). However, the existing treatment effects literature primarily considers treatments that could be assigned directly at the level of the outcome units, potentially with spillover effects. I approach the spatial treatment setting from a similar experimental perspective: What ideal experiment would we design to estimate the causal effects of spatial treatments? This perspective motivates a comparison between individuals near realized treatment locations and individuals near unrealized candidate locations, which is distinct from current empirical practice. Furthermore, I show how to find such candidate locations and apply the proposed methods with observational data. I apply the proposed methods to study the causal effects of grocery stores on foot traffic to nearby businesses during COVID-19 lockdowns.

Econometrics

Causal Inference in Case-Control Studies

We investigate partial identification of causal relative and attributable risk (the ratio of two counterfactual proportions and the difference between them, respectively) in case-control and case-population studies. The odds ratio is shown to be a sharp upper bound on causal relative risk under the monotone treatment response and monotone treatment selection assumptions, without resorting to strong ignorability or to the rare-disease assumption. Sharp bounds on causal attributable risk are also obtained under the same assumptions. Paying special attention to the (conditional) odds ratio, we propose a semiparametrically efficient estimator of the aggregated (log) odds ratio. Further, we develop easy-to-implement causal inference procedures for relative and attributable risk. Finally, we showcase our methodology by applying it to two unique datasets in the literature. We find that attending private school may have little effect on entering a very selective university in Pakistan and that dropping out of school could substantially increase the relative and attributable risk of joining a criminal gang in Brazil.
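To make the central quantity concrete, the following is a minimal sketch of the sample odds ratio from a 2x2 case-control table. The function name and the counts are hypothetical, and the sketch ignores covariates; it is not the semiparametrically efficient estimator of the aggregated (log) odds ratio that the abstract refers to. Under the monotone treatment response and monotone treatment selection assumptions stated above, the population odds ratio would bound the causal relative risk from above.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Sample odds ratio from a 2x2 case-control table (no covariates)."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    # Odds of exposure among cases divided by odds of exposure among controls
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts, for illustration only
or_hat = odds_ratio(exposed_cases=40, unexposed_cases=60,
                    exposed_controls=20, unexposed_controls=80)
print(f"sample odds ratio: {or_hat:.3f}")
```

Because the odds ratio is identified from case-control sampling while the relative risk itself is not, an upper bound of this form is informative without a rare-disease approximation.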

Econometrics

Causal Inference in Possibly Nonlinear Factor Models

This paper develops a general causal inference method for treatment effects models with noisily measured confounders. The key feature is that a large set of noisy measurements are linked with the underlying latent confounders through an unknown, possibly nonlinear factor structure. The main building block is a local principal subspace approximation procedure that combines K-nearest neighbors matching and principal component analysis. Estimators of many causal parameters, including average treatment effects and counterfactual distributions, are constructed based on doubly-robust score functions. Large-sample properties of these estimators are established, which only require relatively mild conditions on the principal subspace approximation. The results are illustrated with an empirical application studying the effect of political connections on stock returns of financial firms, and a Monte Carlo experiment. The main technical and methodological results regarding the general local principal subspace approximation method may be of independent interest.
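The idea of combining nearest-neighbor matching with principal component analysis can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the paper's procedure: the function `local_pca`, the Euclidean distance, and the choices of `k` and `r` are all hypothetical, and any sample splitting or tuning the paper may use is omitted.

```python
import numpy as np


def local_pca(X, k, r):
    """For each row of X, fit a rank-r PCA on its k nearest rows
    (Euclidean distance) and return the row projected onto that
    local principal subspace."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    X_hat = np.empty_like(X)
    for i in range(n):
        # Indices of the k closest rows (row i itself is normally among them)
        nbrs = np.argsort(d2[i])[:k]
        local = X[nbrs]
        mean = local.mean(axis=0)
        # Leading right singular vectors span the local principal subspace
        _, _, vt = np.linalg.svd(local - mean, full_matrices=False)
        basis = vt[:r]
        X_hat[i] = mean + (X[i] - mean) @ basis.T @ basis
    return X_hat


# Hypothetical example: 10 noisy nonlinear measurements of one latent confounder
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=200)
loadings = rng.normal(size=10)
X = np.sin(np.outer(latent, loadings)) + 0.1 * rng.normal(size=(200, 10))
X_denoised = local_pca(X, k=20, r=1)
print(X_denoised.shape)
```

The local fit is what allows a nonlinear factor structure: within a small neighborhood, a smooth nonlinear map from latent confounders to measurements is approximately linear, so a low-rank PCA can capture it there even when a single global PCA could not.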

Econometrics

Causal Spillover Effects Using Instrumental Variables

I set up a potential-outcomes framework to analyze spillover effects using instrumental variables. I characterize the population compliance types in a setting in which spillovers can occur on both treatment take-up and outcomes, and provide conditions for identification of the marginal distribution of compliance types. I show that intention-to-treat (ITT) parameters aggregate multiple direct and spillover effects for different compliance types, and hence do not have a clear link to causally interpretable parameters. Moreover, rescaling ITT parameters by first-stage estimands generally recovers a weighted combination of average effects where the sum of weights is larger than one. I then analyze identification of causal direct and spillover effects under one-sided noncompliance, and show that these effects can be estimated by 2SLS. I illustrate the proposed methods using data from an experiment on social interactions and voting behavior.
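For reference, the rescaling of an ITT parameter by a first-stage estimand mentioned above corresponds, in the textbook setting without spillovers, to the Wald ratio sketched below. The function name and the toy data are hypothetical. The sketch only shows the mechanics of the ratio; as the abstract cautions, when spillovers are present this ratio generally does not have a clean causal interpretation.

```python
import numpy as np


def wald_ratio(y, d, z):
    """Return (ITT, first stage, ITT / first stage) for a binary instrument z.

    y: outcomes, d: treatment take-up, z: binary assignment (0 or 1).
    """
    y, d, z = (np.asarray(a, dtype=float) for a in (y, d, z))
    itt = y[z == 1].mean() - y[z == 0].mean()          # effect of assignment on outcome
    first_stage = d[z == 1].mean() - d[z == 0].mean()  # effect of assignment on take-up
    if first_stage == 0:
        raise ValueError("first stage is zero; the ratio is undefined")
    return itt, first_stage, itt / first_stage


# Hypothetical toy data with one-sided noncompliance (no take-up when z = 0)
z = [0, 0, 1, 1]
d = [0, 0, 1, 0]
y = [1.0, 1.0, 3.0, 1.0]
print(wald_ratio(y, d, z))
```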

Econometrics

Causal mediation analysis with double machine learning

This paper combines causal mediation analysis with double machine learning to control for observed confounders in a data-driven way under a selection-on-observables assumption in a high-dimensional setting. We consider the average indirect effect of a binary treatment operating through an intermediate variable (or mediator) on the causal path between the treatment and the outcome, as well as the unmediated direct effect. Estimation is based on efficient score functions, which possess a multiple robustness property with respect to misspecifications of the outcome, mediator, and treatment models. This property is key for selecting these models by double machine learning, which is combined with data splitting to prevent overfitting in the estimation of the effects of interest. We demonstrate that the direct and indirect effect estimators are asymptotically normal and root-n consistent under specific regularity conditions and investigate the finite-sample properties of the suggested methods in a simulation study using lasso as the machine learner. We also provide an empirical application to the U.S. National Longitudinal Survey of Youth, assessing the indirect effect of health insurance coverage on general health operating via routine checkups as the mediator, as well as the direct effect. We find a moderate short-term effect of health insurance coverage on general health which is, however, not mediated by routine checkups.

Econometrics

Cointegrated Solutions of Unit-Root VARs: An Extended Representation Theorem

This paper establishes an extended representation theorem for unit-root VARs. A specific algebraic technique is devised to recover stationarity from the solution of the model in the form of a cointegrating transformation. Closed forms of the results of interest are derived for integrated processes up to the fourth order. An extension to higher-order processes turns out to be within reach of an induction argument.

Econometrics

Cointegrating Polynomial Regressions with Power Law Trends: A New Angle on the Environmental Kuznets Curve

The Environmental Kuznets Curve (EKC) predicts an inverted U-shaped relationship between economic growth and environmental pollution. Current analyses frequently employ models that restrict the nonlinearities in the data to be explained by the economic growth variable only. We propose a Generalized Cointegrating Polynomial Regression (GCPR) with flexible time trends to proxy time effects such as technological progress and/or environmental awareness. More specifically, a GCPR includes flexible powers of deterministic trends and integer powers of stochastic trends. We estimate the GCPR by nonlinear least squares and derive its asymptotic distribution. Endogeneity of the regressors can introduce nuisance parameters into this limiting distribution, but a simulated approach nevertheless enables us to conduct valid inference. Moreover, a subsampling KPSS test can be used to check the stationarity of the errors. A comprehensive simulation study shows good performance of the simulated inference approach and the subsampling KPSS test. We illustrate the GCPR approach on a dataset of 18 industrialised countries containing GDP and CO2 emissions. We conclude that: (1) the evidence for an EKC is significantly reduced when a nonlinear time trend is included, and (2) a linear cointegrating relation between GDP and CO2 around a power law trend also provides an accurate description of the data.
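One way to write down a regression with "flexible powers of deterministic trends and integer powers of stochastic trends" is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $y_t$ is the pollution measure, $x_t$ is an integrated regressor such as log GDP, $\theta_i$ are real-valued trend powers, and $u_t$ is a stationary error.

```latex
y_t \;=\; \sum_{i=1}^{d} \phi_i \, t^{\theta_i}
      \;+\; \sum_{j=1}^{p} \beta_j \, x_t^{\,j} \;+\; u_t ,
\qquad x_t = x_{t-1} + v_t .
```

In a specification of this kind, the powers $\theta_i$ enter nonlinearly, which is consistent with estimation by nonlinear least squares. Setting $p = 2$ would give the quadratic form usually associated with an inverted U shape, while $p = 1$ with a single trend term $t^{\theta_1}$ would correspond to a linear cointegrating relation around a power law trend.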

Econometrics

Cointegration in large VARs

The paper analyses cointegration in vector autoregressive processes (VARs) for the cases when both the number of coordinates, N, and the number of time periods, T, are large and of the same order. We propose a way to examine a VAR for the presence of cointegration based on a modification of the Johansen likelihood ratio test. The advantage of our procedure over the original Johansen test and its finite sample corrections is that our test does not suffer from over-rejection. This is achieved through novel asymptotic theorems for eigenvalues of matrices in the test statistic in the regime of proportionally growing N and T. Our theoretical findings are supported by Monte Carlo simulations and an empirical illustration. Moreover, we find a surprising connection with multivariate analysis of variance (MANOVA) and explain why it emerges.

Econometrics

Combining Observational and Experimental Data Using First-stage Covariates

Randomized controlled trials generate experimental variation that can credibly identify causal effects, but often suffer from limited scale, while observational datasets are large, but often violate desired identification assumptions. To improve estimation efficiency, I propose a method that combines experimental and observational datasets when 1) units from these two datasets are sampled from the same population and 2) some characteristics of these units are observed. I show that if these characteristics can partially explain treatment assignment in the observational data, they can be used to derive moment restrictions that, in combination with the experimental data, improve estimation efficiency. I outline three estimators (weighting, shrinkage, or GMM) for implementing this strategy, and show that my methods can reduce variance by up to 50% in typical experimental designs; therefore, only half of the experimental sample is required to attain the same statistical precision. If researchers are allowed to design experiments differently, I show that they can further improve the precision by directly leveraging this correlation between characteristics and assignment. I apply my method to a search listing dataset from Expedia that studies the causal effect of search rankings, and show that the method can substantially improve the precision.

Econometrics

Combining Shrinkage and Sparsity in Conjugate Vector Autoregressive Models

Conjugate priors allow for fast inference in large-dimensional vector autoregressive (VAR) models but, at the same time, introduce the restriction that each equation features the same set of explanatory variables. This paper proposes a straightforward means of post-processing posterior estimates of a conjugate Bayesian VAR to effectively perform equation-specific covariate selection. Compared to existing techniques using shrinkage alone, our approach combines shrinkage and sparsity in both the VAR coefficients and the error variance-covariance matrices, greatly reducing estimation uncertainty in large dimensions while maintaining computational tractability. We illustrate our approach by means of two applications. The first application uses synthetic data to investigate the properties of the model across different data-generating processes; the second application analyzes the predictive gains from sparsification in a forecasting exercise for US data.
