PLoS ONE | 2019

The effect of uncertainty in patient classification on diagnostic performance estimations

 
 
 

Abstract


Background: The performance of a new diagnostic test is typically evaluated against a comparator which is assumed to correspond closely to some true state of interest. Judgments about the new test’s performance are based on the differences between the outputs of the test and comparator. It is commonly assumed that a small amount of uncertainty in the comparator’s classifications will negligibly affect the measured performance of a diagnostic test.

Methods: Simulated datasets were generated to represent typical diagnostic scenarios. Comparator noise was introduced in the form of random misclassifications, and the effect on the apparent performance of the diagnostic test was determined. An actual dataset from a clinical trial on a new diagnostic test for sepsis was also analyzed.

Results: We demonstrate that as little as 5% misclassification of patients by the comparator can be enough to statistically invalidate performance estimates such as sensitivity, specificity and area under the receiver operating characteristic curve, if this uncertainty is not measured and taken into account. This distortion effect is found to increase non-linearly with comparator uncertainty, under some common diagnostic scenarios. For clinical populations exhibiting high degrees of classification uncertainty, failure to measure and account for this effect will introduce significant risks of drawing false conclusions. The effect of classification uncertainty is magnified further for high performing tests that would otherwise reach near-perfection in diagnostic evaluation trials. A requirement of very high diagnostic performance for clinical adoption, such as a 99% sensitivity, can be rendered nearly unachievable even for a perfect test, if the comparator diagnosis contains even small amounts of uncertainty. This paper and an accompanying online simulation tool demonstrate the effect of classification uncertainty on the apparent performance of tests across a range of typical diagnostic scenarios. Both simulated and real datasets are used to show the degradation of apparent test performance as comparator uncertainty increases.

Conclusions: Overall, a 5% or greater misclassification rate by the comparator can lead to significant underestimation of true test performance. An online simulation tool allows researchers to explore this effect using their own trial parameters (https://imperfect-gold-standard.shinyapps.io/classification-noise/) and the source code is freely available (https://github.com/ksny/Imperfect-Gold-Standard).
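The mechanism summarized in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or their online tool; it is a minimal Python illustration under stated assumptions: the comparator label is the true state flipped independently with a fixed probability `flip_rate`, the test's errors are independent of the comparator's errors given the true state, and all function and parameter names are hypothetical. Under those assumptions, a perfect test evaluated at 50% prevalence against a comparator with 5% random misclassification has an expected apparent sensitivity of 0.5 × 0.95 / (0.5 × 0.95 + 0.5 × 0.05) = 0.95, so a 99% sensitivity target would be out of reach.

```python
import random


def expected_apparent_sensitivity(prevalence, test_sens, test_spec, flip_rate):
    """Analytic P(test positive | comparator positive).

    Assumes the comparator flips the true state with probability flip_rate,
    independently of the test result (an illustrative assumption).
    """
    labelled_pos_truly_pos = prevalence * (1 - flip_rate)
    labelled_pos_truly_neg = (1 - prevalence) * flip_rate
    hits = (labelled_pos_truly_pos * test_sens
            + labelled_pos_truly_neg * (1 - test_spec))
    return hits / (labelled_pos_truly_pos + labelled_pos_truly_neg)


def simulate_apparent_performance(n, prevalence, test_sens, test_spec,
                                  flip_rate, seed=0):
    """Return (apparent sensitivity, apparent specificity) measured
    against a noisy comparator rather than against the true state."""
    rng = random.Random(seed)
    tp = fp = tn = fn = 0
    for _ in range(n):
        truth = rng.random() < prevalence
        # Test result depends only on the true state.
        p_test_pos = test_sens if truth else 1 - test_spec
        test_pos = rng.random() < p_test_pos
        # Comparator label is the true state, randomly flipped.
        flipped = rng.random() < flip_rate
        comparator_pos = truth != flipped
        if comparator_pos and test_pos:
            tp += 1
        elif comparator_pos and not test_pos:
            fn += 1
        elif not comparator_pos and test_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # A perfect test (sensitivity = specificity = 1.0) at 50% prevalence.
    for flip_rate in (0.0, 0.05, 0.10):
        sens, spec = simulate_apparent_performance(
            200_000, 0.5, 1.0, 1.0, flip_rate, seed=1)
        print(f"flip_rate={flip_rate:.2f}  "
              f"apparent sensitivity={sens:.3f}  "
              f"apparent specificity={spec:.3f}")
```

Changing `prevalence` shows how the apparent sensitivity and specificity respond differently to the same amount of comparator noise; at low prevalence the comparator-positive group contains proportionally more mislabelled true negatives, so apparent sensitivity falls further.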

Volume 14
DOI 10.1371/journal.pone.0217146
Language English
Journal PLoS ONE

Full Text