Journal of Computational and Graphical Statistics | 2019

Using Approximation Algorithms to Build Evidence Factors and Related Designs for Observational Studies

 
 
 

Abstract


Observational or nonrandomized studies of treatment effects are often constructed with the aid of polynomial-time algorithms that optimally form matched treatment-control pairs or matched sets. Because each observational comparison may be affected by bias, investigators often reinforce a single comparison with an additional comparison that is unlikely to be affected by the same biases, for instance using multiple control groups, evidence factors, or control + instrument designs. Use of two comparisons affected by different biases may detect bias if the two comparisons disagree, or may show that two comparisons with different weaknesses concur in their conclusions. Even this simplest addition, a second comparison, creates design problems without polynomial-time solutions. For a problem that no polynomial-time algorithm can solve, a so-called approximation algorithm is a type of compromise: it provides a solution in polynomial time that is provably not much worse than the unattainable optimal solution. Building upon existing techniques for related problems in operations research, we develop an approximation algorithm for minimum distance matching with near-fine balance for three comparison groups. This algorithm is a practical approach to most observational designs that add a second comparison. The method is applied to an observational study of the effects of side airbags on injury severity in the U.S. Fatality Analysis Reporting System. For many car makes and models, side airbags were initially unavailable, then later available as optional equipment for an additional fee, then still later provided as standard equipment. Within sets matched for make and model of car, for safety belt use, for direction of impact, and for other covariates, we compare crashes in these three periods, where each comparison has different limitations. The method is implemented in the R package approxmatch, whose example reproduces some of the calculations.
Supplementary materials for this article are available online.
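
As a toy illustration of the compromise the abstract describes, the following Python sketch contrasts an exhaustive search for the best matched triples across three groups with a two-step heuristic that chains two pairwise assignments. It is not the article's algorithm and does not use the approxmatch package: the covariate values, the function names, the absolute-difference distance, and the equal group sizes are all assumptions made for illustration, and near-fine balance is not addressed. Each pairwise step is solved here by brute force only to keep the sketch free of dependencies; in practice it is the kind of assignment problem that has polynomial-time solvers. Because the distance obeys the triangle inequality, the chained solution costs at most twice the optimum in this toy setting.

```python
from itertools import permutations


def pair_cost(x, y):
    """Distance between two units on a single covariate (assumed metric)."""
    return abs(x - y)


def triple_cost(a, b, c):
    """Cost of a matched triple: sum of its three pairwise distances."""
    return pair_cost(a, b) + pair_cost(b, c) + pair_cost(a, c)


def total_cost(g1, g2, g3, p, q):
    """Total cost when g1[i] is matched with g2[p[i]] and g3[q[i]]."""
    return sum(triple_cost(g1[i], g2[p[i]], g3[q[i]]) for i in range(len(g1)))


def best_assignment(left, right):
    """Minimum-cost one-to-one pairing of two equal-sized groups.

    Brute force stands in for a polynomial-time assignment solver.
    Returns a tuple p such that left[i] is paired with right[p[i]].
    """
    n = len(left)
    return min(
        permutations(range(n)),
        key=lambda p: sum(pair_cost(left[i], right[p[i]]) for i in range(n)),
    )


def exact_triples(g1, g2, g3):
    """Exhaustive search over all triples; cost grows like (n!)^2."""
    idx = range(len(g1))
    return min(
        ((p, q) for p in permutations(idx) for q in permutations(idx)),
        key=lambda pq: total_cost(g1, g2, g3, pq[0], pq[1]),
    )


def two_step_triples(g1, g2, g3):
    """Heuristic: pair g1 with g2, pair g2 with g3, then chain the pairs."""
    p = best_assignment(g1, g2)   # g1[i] <-> g2[p[i]]
    r = best_assignment(g2, g3)   # g2[j] <-> g3[r[j]]
    q = tuple(r[p[i]] for i in range(len(g1)))
    return p, q


# Hypothetical covariate values for three comparison groups of equal size.
g1 = [0.1, 0.4, 0.8, 1.3]
g2 = [0.0, 0.5, 0.9, 1.0]
g3 = [0.2, 0.3, 0.7, 1.5]

opt_p, opt_q = exact_triples(g1, g2, g3)
heu_p, heu_q = two_step_triples(g1, g2, g3)
opt_cost = total_cost(g1, g2, g3, opt_p, opt_q)
heur_cost = total_cost(g1, g2, g3, heu_p, heu_q)

print("exact cost:    ", opt_cost)
print("two-step cost: ", heur_cost)
```

The exhaustive search is feasible only for a handful of units per group, whereas the chained version scales with the pairwise solver. That trade of a bounded loss in quality for tractable running time is the general idea behind an approximation algorithm.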

Volume 28
Pages 698 - 709
DOI 10.1080/10618600.2019.1584900
Language English
Journal Journal of Computational and Graphical Statistics
