International Journal of Medical Informatics | 2021
Consultation length and no-show prediction for improving appointment scheduling efficiency at a cardiology clinic: A data analytics approach
Abstract
BACKGROUND
The observed consultation length at specialty clinics, such as cardiology care, is represented by two underlying groups: one with zero service time due to patient no-shows, and the other characterized by positive values with high variance. This variability affects the scheduler's ability to accurately estimate consultation length, which, in turn, hinders effective utilization of the clinic's resources and timely access to care. The objectives of this study were to: (i) predict the consultation length by accounting for its semicontinuous nature (i.e., zero in case of no-shows and positive otherwise), using machine learning (ML) algorithms, (ii) identify important features for predicting no-shows and non-zero consultation length, and (iii) assess the impact of integrating the ML-based prediction with the appointment scheduling system.

METHODS
We used two years of data extracted from the electronic medical records of a cardiology clinic. By leveraging 16 predictors pertaining to the patient, appointment, and doctor, a two-part ML-based approach was developed to handle the semicontinuous consultation length. Supervised classification models were employed to predict no-shows (i.e., categorize the consultation length as zero or positive), and regression algorithms were developed for estimating non-zero consultation lengths. Three algorithms, namely, random forests, stochastic gradient boosting, and deep neural networks, were individually employed for both no-show classification and positive consultation length prediction. Finally, the best performing classification and regression models were combined to establish the complete two-part model, and its prediction error on new data was benchmarked against the clinic's current performance. The evaluation metrics for classification models were area under the receiver operating characteristic curve (AUC-ROC) and area under the precision-recall curve (AUC-PR). The prediction performance of regression algorithms was evaluated by mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). A simulation modeling approach was adopted to ascertain the effectiveness of using ML-based prediction for scheduling decisions as opposed to the clinic's current strategy.

RESULTS
Among the classification models tested, the stochastic gradient boosted classification tree (SGBCT) demonstrated the best performance (AUC-ROC = 0.85, AUC-PR = 0.64). For positive consultation length prediction, the deep neural network regressor (DNNR) yielded the lowest prediction error (MAE = 8.55, RMSE = 6.88, MAPE = 12.24). The complete two-part model (SGBCT + DNNR) outperformed the clinic's approach to consultation length estimation by achieving 50 % and 52 % reductions in RMSE and MAE, respectively. Furthermore, adopting it for appointment scheduling could reduce the patient waiting time and doctor idle time by 56 % and 52 %, respectively. In addition, the proof-of-concept study identified several clinical insights, along with critical features for no-show and consultation length prediction.

CONCLUSION
This study demonstrates that ML algorithms can accurately perform routine clinical tasks such as estimating consultation length and predicting no-shows, and that the resulting predictions can be integrated into the clinical scheduling system to improve resource utilization and reduce patient waiting time.
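
The two-part structure described in the methods can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state how the classifier and regressor outputs are combined, so the hard gate below (predict zero when the no-show probability reaches a cut-off, otherwise use the regression estimate) is an assumption, as are the 0.5 threshold, the function names, and the toy numbers. MAPE is computed over attended appointments only, since it is undefined when the observed length is zero.

```python
import math

def combine_two_part(p_no_show, minutes_if_show, threshold=0.5):
    """Hard-gated two-part prediction (assumed combination rule).

    p_no_show: classifier probabilities that the patient does not attend.
    minutes_if_show: regressor estimates of length given attendance.
    threshold: hypothetical cut-off, not a value taken from the study.
    """
    return [0.0 if p >= threshold else m
            for p, m in zip(p_no_show, minutes_if_show)]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def mape_positive(y_true, y_pred):
    """Mean absolute percentage error over positive observations only."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t > 0]
    return 100.0 * sum(abs(t - p) / t for t, p in pairs) / len(pairs)

# Toy data (invented for illustration): two no-shows, two attended visits.
observed = [0.0, 20.0, 30.0, 0.0]
p_no_show = [0.8, 0.1, 0.2, 0.6]
minutes_if_show = [25.0, 18.0, 33.0, 22.0]

predicted = combine_two_part(p_no_show, minutes_if_show)
print("predicted:", predicted)
print("MAE:", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("MAPE (attended only):", mape_positive(observed, predicted))
```

An alternative combination, also an assumption, would multiply the attendance probability by the regression estimate to give an expected length; the hard gate is shown here because the abstract describes the classifier as categorizing the consultation length as zero or positive.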