Biostatistics | 2021

Testing calibration of phenotyping models using positive-only electronic health record data.

 
 
 
 

Abstract


Validation of phenotyping models using Electronic Health Records (EHRs) data conventionally requires gold-standard case and control labels. The labeling process requires clinical experts to retrospectively review patients medical charts, therefore is labor intensive and time consuming. For some disease conditions, it is prohibitive to identify the gold-standard controls because routine clinical assessments are performed for selective patients who are deemed to possibly have the condition. To build a model for phenotyping patients in EHRs, the most readily accessible data are often for a cohort consisting of a set of gold-standard cases and a large number of unlabeled patients. Hereby, we propose methods for assessing model calibration and discrimination using such positive-only EHR data that does not require gold-standard controls, provided that the labeled cases are representative of all cases. For model calibration, we propose a novel statistic that aggregates differences between model-free and model-based estimated numbers of cases across risk subgroups, which asymptotically follows a Chi-squared distribution. We additionally demonstrate that the calibration slope can also be estimated using such positive-only data. We propose consistent estimators for discrimination measures and derive their large sample properties. We demonstrate performances of the proposed methods through extensive simulation studies and apply them to Penn Medicine EHRs to validate two preliminary models for predicting the risk of primary aldosteronism.

Volume None
Pages None
DOI 10.1093/biostatistics/kxab003
Language English
Journal Biostatistics

Full Text