The American Journal of Drug and Alcohol Abuse | 2019

The clinical consequences of variable selection in multiple regression models: a case study of the Norwegian Opioid Maintenance Treatment program

 
 
 

Abstract


Background: Selecting which variables to include in multiple regression models is a pervasive problem in medical research.

Objectives: Using questionnaire data (n = 18538, 69.9% men) from the Norwegian Opioid Maintenance Treatment Program, this study compares the performance of different variable selection methods and the potential clinical consequences of the choice of method. The effect of missing data is also explored.

Methods: The dependent variable was engagement in criminal behavior while in treatment. Twenty-nine potential covariates covering demographics, psychosocial factors and drug use were tested for inclusion in a multiple logistic regression model. Both complete case and multiply imputed data were considered. We compared the results from variable selection methods ranging from expert-based and purposeful variable selection, through stepwise methods, to more recently developed penalized regression using the Least Absolute Shrinkage and Selection Operator (LASSO).

Results: The variable selection methods produced regression models with between 9 and 22 covariates. The stepwise selection procedures generated the models with the most covariates. The choice of variable selection method directly affected the estimated regression coefficients, in both effect size and statistical significance. For several variables, the expert-based approach disagreed with all data-driven methods.

Conclusions: The choice of variable selection method may strongly affect the resulting regression model, along with the accompanying effect sizes and confidence intervals. This may in turn affect clinical conclusions, so the selection process should be given sufficient consideration in model building. We recommend combining expert knowledge with a data-driven variable selection method to explore the models’ robustness.
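To illustrate how penalized regression performs variable selection, the following is a minimal sketch of LASSO-penalized logistic regression. It is not the authors' implementation: the abstract does not name the software used, the data here are synthetic (six hypothetical covariates rather than the study's twenty-nine), and the fitting routine is a simple proximal gradient method chosen only because it needs nothing beyond the Python standard library. The function names, step size, iteration count and penalty values are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink toward zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic(X, y, lam, step=0.5, n_iter=200):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    Minimizes mean log-loss + lam * sum(|beta_j|); the intercept is unpenalized.
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            # Residual = predicted probability minus observed outcome.
            resid = sigmoid(intercept + sum(b * v for b, v in zip(beta, xi))) - yi
            grad0 += resid
            for j in range(p):
                grad[j] += resid * xi[j]
        intercept -= step * grad0 / n
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        beta = [soft_threshold(b - step * g / n, step * lam)
                for b, g in zip(beta, grad)]
    return intercept, beta


# Synthetic data (hypothetical): 6 standardized covariates, of which only
# the first two truly affect the binary outcome.
rng = random.Random(0)
true_beta = [1.5, -1.0, 0.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in true_beta] for _ in range(300)]
y = [1 if rng.random() < sigmoid(-0.5 + sum(b * v for b, v in zip(true_beta, xi))) else 0
     for xi in X]

# Fit at three illustrative penalty strengths and report which covariates survive.
fits = {lam: lasso_logistic(X, y, lam) for lam in (0.01, 0.1, 1.0)}
for lam, (_, beta) in fits.items():
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print(f"lambda={lam}: selected covariate indices {selected}")
```

A covariate counts as selected when its coefficient is not exactly zero. Stronger penalties push more coefficients to zero, which is why the size of the final model depends on the penalty; in practice the penalty is commonly chosen by cross-validation rather than fixed by hand as it is here.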

Volume 46
Pages 13-21
DOI 10.1080/00952990.2019.1648484
Language English
Journal The American Journal of Drug and Alcohol Abuse
