
Publications


Featured research published by Patrick M. Bossuyt.


Annals of Internal Medicine | 2011

QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies.

Penny F Whiting; Anne Wilhelmina Saskia Rutjes; Marie Westwood; Susan Mallett; Jonathan J Deeks; Johannes B. Reitsma; Mariska M.G. Leeflang; Jonathan A C Sterne; Patrick M. Bossuyt

In 2003, the QUADAS tool for systematic reviews of diagnostic accuracy studies was developed. Experience, anecdotal reports, and feedback suggested areas for improvement; therefore, QUADAS-2 was developed. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias. The QUADAS-2 tool is applied in 4 phases: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability. This tool will allow for more transparent rating of bias and applicability of primary diagnostic accuracy studies.
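The domain structure described above lends itself to a simple record per study. Below is a minimal, hypothetical Python sketch of how a review team might capture per-domain judgments; it is not an official QUADAS-2 implementation, and the class names, field names, and the overall-rating convention are illustrative assumptions.

```python
# Hypothetical sketch of a QUADAS-2 style record; names are illustrative,
# not part of any official tool.
from dataclasses import dataclass, field
from typing import Optional

DOMAINS = ("patient selection", "index test", "reference standard", "flow and timing")
JUDGEMENTS = ("low", "high", "unclear")

@dataclass
class DomainJudgement:
    domain: str                                   # one of DOMAINS
    risk_of_bias: str                             # one of JUDGEMENTS
    applicability_concern: Optional[str] = None   # not rated for "flow and timing"

@dataclass
class Quadas2Assessment:
    study_id: str
    judgements: list = field(default_factory=list)

    def overall_risk_of_bias(self) -> str:
        # A common convention (an assumption here): "low" overall only if
        # every domain is judged low; any "high" domain makes the study high risk.
        ratings = {j.risk_of_bias for j in self.judgements}
        if ratings == {"low"}:
            return "low"
        return "high" if "high" in ratings else "unclear"

# Example: a study judged low risk in all domains except flow and timing.
study = Quadas2Assessment("Smith 2010", [
    DomainJudgement("patient selection", "low", "low"),
    DomainJudgement("index test", "low", "low"),
    DomainJudgement("reference standard", "low", "low"),
    DomainJudgement("flow and timing", "high"),
])
print(study.overall_risk_of_bias())  # -> "high"
```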


Clinical Chemistry | 2003

The STARD Statement for Reporting Studies of Diagnostic Accuracy: Explanation and Elaboration

Patrick M. Bossuyt; Johannes B. Reitsma; David E. Bruns; Constantine Gatsonis; Paul Glasziou; Les Irwig; David Moher; Drummond Rennie; Henrica C.W. de Vet; Jeroen G. Lijmer

The quality of reporting of studies of diagnostic accuracy is less than optimal. Complete and accurate reporting is necessary to enable readers to assess the potential for bias in the study and to evaluate the generalizability of the results. A group of scientists and editors has developed the STARD (Standards for Reporting of Diagnostic Accuracy) statement to improve the quality of reporting of studies of diagnostic accuracy. The statement consists of a checklist of 25 items and a flow diagram that authors can use to ensure that all relevant information is present. This explanatory document aims to facilitate the use, understanding, and dissemination of the checklist. The document contains a clarification of the meaning, rationale, and optimal use of each item on the checklist, as well as a short summary of the available evidence on bias and applicability. The STARD statement, checklist, flowchart, and this explanation and elaboration document should be useful resources to improve reporting of diagnostic accuracy studies. Complete and informative reporting can only lead to better decisions in health care.


Annals of Internal Medicine | 2003

The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration.

Patrick M. Bossuyt; Johannes B. Reitsma; David E. Bruns; Constantine Gatsonis; Paul Glasziou; Les Irwig; David Moher; Drummond Rennie; Henrica C.W. de Vet; Jeroen G. Lijmer

Introduction

In studies of diagnostic accuracy, results from one or more tests are compared with the results obtained with the reference standard on the same subjects. Such accuracy studies are a vital step in the evaluation of new and existing diagnostic technologies (1, 2). Several factors threaten the internal and external validity of a study of diagnostic accuracy (3-8). Some of these factors have to do with the design of such studies, others with the selection of patients, the execution of the tests, or the analysis of the data. In a study involving several meta-analyses, a number of design deficiencies were shown to be related to overly optimistic estimates of diagnostic accuracy (9). Exaggerated results from poorly designed studies can trigger premature adoption of diagnostic tests and can mislead physicians into incorrect decisions about the care of individual patients. Reviewers and other readers of diagnostic studies must therefore be aware of the potential for bias and a possible lack of applicability. A survey of studies of diagnostic accuracy published in four major medical journals between 1978 and 1993 revealed that the methodological quality was mediocre at best (8). Furthermore, this review showed that information on key elements of design, conduct, and analysis of diagnostic studies was often not reported (8). To improve the quality of reporting of studies of diagnostic accuracy, the Standards for Reporting of Diagnostic Accuracy (STARD) initiative was started. The objective of the STARD initiative is to improve the quality of reporting of studies of diagnostic accuracy. Complete and accurate reporting allows the reader to detect the potential for bias in the study and to judge the generalizability and applicability of the results. For this purpose, the STARD project group has developed a single-page checklist. Where possible, the decision to include items in the checklist was based on evidence linking these items to bias, variability in results, or limitations of the applicability of results to other settings. The checklist can be used to verify that all essential elements are included in the report of a study. This explanatory document aims to facilitate the use, understanding, and dissemination of the checklist. The document contains a clarification of the meaning, rationale, and optimal use of each item on the checklist, as well as a short summary of the available evidence on bias and applicability. The first part of this document contains a summary of the design and terminology of diagnostic accuracy studies. The second part contains an item-by-item discussion with examples.

Studies of Diagnostic Accuracy

Studies of diagnostic accuracy have a common basic structure (10). One or more tests are evaluated, with the purpose of detecting or predicting a target condition. The target condition can refer to a particular disease, a disease stage, a health status, or any other identifiable condition within a patient, such as staging a disease already known to be present, or a health condition that should prompt clinical action, such as the initiation, modification, or termination of treatment. Here, test refers to any method for obtaining additional information on a patient's health status. This includes laboratory tests, imaging tests, function tests, pathology, history, and physical examination. In a diagnostic accuracy study, the test under evaluation, referred to here as the index test, is applied to a series of subjects.
The results obtained with the index test are compared with the results of the reference standard, obtained in the same subjects. In this framework, the reference standard is the best available method for establishing the presence or absence of the target condition. The reference standard can be a single test, or a combination of methods and techniques, including clinical follow-up of tested subjects. The term accuracy refers to the amount of agreement between the results from the index test and those from the reference standard. Diagnostic accuracy can be expressed in a number of ways, including sensitivity and specificity pairs, likelihood ratios, diagnostic odds ratios, and areas under ROC [receiver operating characteristic] curves (11, 12); a worked sketch of these measures follows this excerpt.

Study Question, Design, and Potential for Bias

Early in the evaluation of a test, the author may simply want to know if the test is able to discriminate. The appropriate early question may be: "Do the test results in patients with the target condition differ from the results in healthy people?" If preliminary studies answer this question affirmatively, the next study question is: "Are patients with specific test results more likely to have the target disorder than similar patients with other test results?" The usual study design to answer this is to apply the index test and the reference standard to a number of patients who are suspected of having the target condition. Some study designs are more prone to bias and have more limited applicability than others. In this article, the term bias refers to the difference between the observed measures of test performance and the true measures. No single design is guaranteed to be both feasible and able to provide valid, informative, and relevant answers with optimal precision to all study questions. For each study, the reader must judge the relevance, the potential for bias, and the limitations to applicability, making full and transparent reporting critical. For this reason, checklist items refer to the research question that prompted the study of diagnostic accuracy and ask for an explicit and complete description of the study design and results.

Variability

Measures of test accuracy may vary from study to study. Variability may reflect differences in patient groups, differences in setting, differences in the definition of the target condition, and differences in test protocols or in criteria for test positivity (13). For example, bias may occur if a test is evaluated under circumstances that do not correspond to those of the research question. Examples are evaluating a screening test for early disease in patients with advanced stages of the disease and evaluating a physician's office test device in the specialty department of a university hospital. The checklist contains a number of items to make sure that a study report contains a clear description of the inclusion criteria for patients, the testing protocols, and the criteria for positivity, as well as an adequate account of the subjects included in the study and their results. These items will enable readers to judge whether the study results apply to their circumstances.

Items in the Checklist

The next section contains a point-by-point discussion of the items on the checklist. The order of the items corresponds to the sequence used in many publications of diagnostic accuracy studies. Specific requirements made by journals could lead to a different order.
Item 1. Identify the Article as a Study of Diagnostic Accuracy (Recommend MeSH Heading "Sensitivity and Specificity")

Example (an excerpt from a structured abstract): "Purpose: To determine the sensitivity and specificity of computed tomographic colonography for colorectal polyp and cancer detection by using colonoscopy as the reference standard" (14).

Electronic databases have become indispensable tools to identify studies. To facilitate retrieval of their study, authors should explicitly identify it as a report of a study of diagnostic accuracy. We recommend the use of the term "diagnostic accuracy" in the title or abstract of a report that compares the results of one or more index tests with the results of a reference standard. In 1991, the National Library of Medicine's MEDLINE database introduced a specific keyword (MeSH heading) for diagnostic studies: "Sensitivity and Specificity". Using this keyword to search for studies of diagnostic accuracy remains problematic (15-19). In a selected set of MEDLINE journals covering publications from 1992 through 1995, the use of the MeSH heading "Sensitivity and Specificity" identified only 51% of all studies of diagnostic accuracy and incorrectly identified many articles that were not reports of studies on diagnostic accuracy (18). In the example, the authors used the more general term "Performance Characteristics of CT Colonography" in the title. The purpose section of the structured abstract explicitly mentions sensitivity and specificity. The MEDLINE record for this paper contains the MeSH heading "Sensitivity and Specificity".

Item 2. State the Research Questions or Study Aims, Such as Estimating Diagnostic Accuracy or Comparing Accuracy between Tests or across Participant Groups

Example: "Invasive x-ray coronary angiography remains the gold standard for the identification of clinically significant coronary artery disease ... A noninvasive test would be desirable. Coronary magnetic resonance angiography performed while the patient is breathing freely has reached sufficient technical maturity to allow more widespread application with a standardized protocol. Therefore, we conducted a study to determine the [accuracy] of coronary magnetic resonance angiography in the diagnosis of native-vessel coronary artery disease" (20).

The Helsinki Declaration states that biomedical research involving people should be based on a thorough knowledge of the scientific literature (21). In the introduction of scientific reports, authors describe the scientific background, previous work on the subject, the remaining uncertainty, and, hence, the rationale for their study. Clearly specified research questions help readers judge the appropriateness of the study design and data analysis. A single general description, such as "diagnostic value" or "clinical usefulness", is usually not very helpful to readers. In the example, the authors use the introduction section of their paper to describe the potential of coronary magnetic resonance angiography as a non-invasive alternative to conventional x-ray angiography in the diagnosis of coronary artery disease.
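As a concrete companion to the measures listed in the excerpt above (sensitivity and specificity, likelihood ratios, and the diagnostic odds ratio), here is a minimal Python sketch that computes them from a 2x2 table of index-test results against the reference standard. The counts and function name are invented for illustration.

```python
# Minimal sketch of common diagnostic accuracy measures, computed from a
# 2x2 table. Assumes all four cells are nonzero (no guard against
# division by zero is included).

def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    sensitivity = tp / (tp + fn)               # positives detected among diseased
    specificity = tn / (tn + fp)               # negatives among non-diseased
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # diagnostic odds ratio = (tp*tn)/(fp*fn)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "LR+": lr_pos, "LR-": lr_neg, "DOR": dor}

# Example: 90 true positives, 10 false negatives, 80 true negatives,
# 20 false positives gives sensitivity 0.90, specificity 0.80,
# LR+ 4.5, LR- 0.125, and DOR 36.
print(accuracy_measures(tp=90, fp=20, fn=10, tn=80))
```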


Annals of Internal Medicine | 2008

Systematic reviews of diagnostic test accuracy

Mariska M.G. Leeflang; Jonathan J Deeks; Constantine Gatsonis; Patrick M. Bossuyt

Diagnosis is a critical component of health care, and clinicians, policymakers, and patients routinely face a range of questions regarding diagnostic tests. They want to know whether testing improves outcome; what test to use, purchase, or recommend in practice guidelines; and how to interpret test results. Well-designed diagnostic test accuracy studies can help in making these decisions, provided that they transparently and fully report their participants, tests, methods, and results, as facilitated, for example, by the STARD (Standards for Reporting of Diagnostic Accuracy) statement (1). That 25-item checklist was published in many journals and has now been adopted by more than 200 scientific journals worldwide. As in other areas of science, systematic reviews and meta-analyses of accuracy studies can be used to obtain more precise estimates when small studies addressing the same test and patients in the same setting are available. Reviews can also be useful to establish whether and how scientific findings vary by particular subgroups, and may provide summary estimates with stronger generalizability than estimates from a single study. Systematic reviews may help identify the risk for bias that may be present in the original studies and can be used to address questions that were not directly considered in the primary studies, such as comparisons between tests. The Cochrane Collaboration is the largest international organization preparing, maintaining, and promoting systematic reviews to help people make well-informed decisions about health care (2). The Collaboration decided in 2003 to make preparations for including systematic reviews of diagnostic test accuracy in its Cochrane Database of Systematic Reviews. To enable this, a working group (Appendix) was formed to develop methodology, software, and a handbook. The first diagnostic test accuracy review was published in the Cochrane Database in October 2008. In this paper, we review recent methodological developments concerning problem formulation, location of literature, quality assessment, and meta-analysis of diagnostic accuracy studies, using our experience from the work on the Cochrane Handbook. The information presented here is based on the recent literature and updates previously published guidelines by Irwig and colleagues (3).

Definition of the Objectives of the Review

Diagnostic test accuracy refers to the ability of a test to distinguish between patients with disease (or, more generally, a specified target condition) and those without. In a study of test accuracy, the results of the test under evaluation, the index test, are compared with those of the reference standard determined in the same patients. The reference standard is an agreed-on and accurate method for identifying patients who have the target condition. Test results are typically categorized as positive or negative for the target condition. With such binary test outcomes, accuracy is most often expressed as the test's sensitivity (the proportion of patients with positive results on the reference standard that are also positive on the index test) and specificity (the proportion of patients with negative results on the reference standard that are also negative on the index test). Other measures have been proposed and are in use (4-6). It has long been recognized that test accuracy is not a fixed property of a test.
It can vary between patient subgroups, with their spectrum of disease, with the clinical setting, or with the test interpreters, and may depend on the results of previous testing. For this reason, inclusion of these elements in the study question is essential. To make a policy decision to promote use of a new index test, evidence is required that using the new test increases test accuracy over other testing options, including current practice, or that the new test has equivalent accuracy but offers other advantages (7-9). As with the evaluation of interventions, systematic reviews need to include comparative analyses between alternative testing strategies and should not focus solely on evaluating the performance of a test in isolation.

In relation to the existing situation, 3 possible roles for a new test can be defined: replacement, triage, and add-on (7). If a new test is to replace an existing test, then comparing the accuracy of both tests on the same population and with the same reference standard provides the most direct evidence. In triage, the new test is used before the existing test or testing pathway, and only patients with a particular result on the triage test continue the testing pathway. When a test is needed to rule out disease in patients who then need no further testing, a test that gives a minimal proportion of false-negative results, and thus a relatively high sensitivity, should be used; a small worked example follows this excerpt. Triage tests may be less accurate than existing ones, but they have other advantages, such as simplicity or low cost. A third possible role of a new test is add-on. The new test is then positioned after the existing testing pathway to identify false-positive or false-negative results of the existing pathway. The review should provide data to assess the incremental change in accuracy made by adding the new test.

An example of a replacement question can be found in a systematic review of the diagnostic accuracy of urinary markers for primary bladder cancer (10). Clinicians may use cytology to triage patients before they undergo invasive cystoscopy, the reference standard for bladder cancer. Because cytology combines high specificity with low sensitivity (11), the goal of the review was to identify a tumor marker with sufficient accuracy to either replace cytology or be used in addition to cytology. For a marker to replace cytology, it has to achieve equally high specificity with improved sensitivity. New markers that are sensitive but not specific may have roles as adjuncts to conventional testing. The review included studies in which the test under evaluation (several different tumor markers and cytology) was evaluated against cystoscopy or histopathology. Included studies compared 1 or more of the markers, cytology only, or a combination of markers and cytology.

Although information on accuracy can help clinicians make decisions about tests, good diagnostic accuracy is a desirable but not sufficient condition for the effectiveness of a test (8). To demonstrate that using a new test does more good than harm to the patients tested, randomized trials of test-and-treatment strategies and reviews of such trials may be necessary. However, with the possible exception of screening, such randomized trials are in most cases not available, and systematic reviews of test accuracy may provide the most useful evidence available to guide clinical and health policy decision making and to use as input for decision and cost-effectiveness analyses (12).
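To make the triage rule-out argument above concrete, the following minimal Python sketch (my own illustration, with invented numbers) computes the post-test probability of disease after a negative triage result using Bayes' theorem.

```python
# Why a rule-out triage test needs high sensitivity: compute
# P(disease | negative result) from sensitivity, specificity, and prevalence.

def post_test_prob_after_negative(sensitivity: float, specificity: float,
                                  prevalence: float) -> float:
    # Per unit of population: P(disease | negative) = FN / (FN + TN)
    fn = prevalence * (1 - sensitivity)        # diseased but test-negative
    tn = (1 - prevalence) * specificity        # healthy and test-negative
    return fn / (fn + tn)

# With 10% prevalence, a triage test with sensitivity 0.98 and specificity
# 0.50 leaves roughly a 0.44% chance of disease after a negative result,
# which may be low enough to spare those patients the invasive reference test.
print(round(post_test_prob_after_negative(0.98, 0.50, 0.10), 4))
```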
Identification and Selection of Studies

Identifying test accuracy studies is more difficult than searching for randomized trials (13). There is no clear, unequivocal keyword or indexing term for an accuracy study in literature databases comparable with the term "randomized controlled trial". The Medical Subject Heading "Sensitivity and Specificity" may look suitable but is inconsistently applied in most electronic bibliographic databases. Furthermore, data on diagnostic test accuracy may be hidden in studies that did not have test accuracy estimation as their primary objective. This complicates the efficient identification of diagnostic test accuracy studies in electronic databases, such as MEDLINE. Until indexing systems properly code studies of test accuracy, searching for them will remain challenging and may require additional manual searches, such as screening reference lists.

In the development of a comprehensive search strategy, review authors can use search strings that refer to the test(s) under evaluation, the target condition, and the patient description, or a subset of these. For tests with a clear name that are used for a single purpose, searching for publications in which those tests are mentioned may suffice. For other reviews, adding the patient description may be necessary, although this is also often poorly indexed. A search strategy in MEDLINE should contain both Medical Subject Headings and free-text words. A search strategy for articles about tests for bladder cancer, for example, should include as many synonyms for bladder cancer as possible, including neoplasm, carcinoma, transitional cell, and hematuria (an illustrative strategy follows this excerpt). Several methodological electronic search filters for diagnostic test accuracy studies have been developed, each attempting to restrict the search to articles that are most likely to be test accuracy studies (13-16). These filters rely on indexing terms for research methodology and text words used in reporting results, but they often miss relevant studies and are unlikely to decrease the number of articles one needs to screen. Therefore, they are not recommended for systematic reviews (17, 18). The incremental value of searching in languages other than English and in the gray literature has not yet been fully investigated.

In systematic reviews of intervention studies, publication bias is an important and well-studied form of bias in which the decision to report and publish studies is linked to their findings. For clinical trials, the magnitude and determinants of publication bias have been identified by tracing the publication history of cohorts of trials reviewed by ethics committees and research boards (19). A consistent observation has been that studies with significant results are more likely to be published than studies with nonsignificant findings (19). Investigating publication bias for diagnostic tests is problematic, because many studies are done without ethical review or study registration; therefore, identification of cohorts of studies from registration to final publication status is not feasible.
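The bladder-cancer search described in the excerpt might be drafted along the following lines. This is a hypothetical sketch in Ovid MEDLINE style syntax, written here as a Python list of query lines so it stays in the same language as the other sketches; the exact terms, and the strategy actually used in the cited review, would differ.

```python
# Hypothetical Ovid MEDLINE style search fragment for the bladder-cancer
# example; terms are illustrative, not the published review's strategy.
bladder_cancer_query = [
    'exp Urinary Bladder Neoplasms/',                                   # MeSH heading
    '(bladder adj3 (cancer or carcinoma or neoplasm* or tumor*)).tw.',  # free text
    'transitional cell carcinoma.tw.',
    '(hematuria or haematuria).tw.',
    'or/1-4',   # combine the synonym lines with OR
]
print("\n".join(f"{i + 1}. {line}" for i, line in enumerate(bladder_cancer_query)))
```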


Clinical Chemistry | 2015

STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies

Patrick M. Bossuyt; Johannes B. Reitsma; David E. Bruns; Constantine Gatsonis; Paul Glasziou; Les Irwig; Jeroen G. Lijmer; David Moher; Drummond Rennie; Henrica C.W. de Vet; Herbert Y. Kressel; Nader Rifai; Robert M. Golub; Douglas G. Altman; Lotty Hooft; Daniël A. Korevaar; Jérémie F. Cohen

Incomplete reporting has been identified as a major source of avoidable waste in biomedical research. Essential information is often not provided in study reports, impeding the identification, critical appraisal, and replication of studies. To improve the quality of reporting of diagnostic accuracy studies, the Standards for Reporting Diagnostic Accuracy (STARD) statement was developed. Here we present STARD 2015, an updated list of 30 essential items that should be included in every report of a diagnostic accuracy study. This update incorporates recent evidence about sources of bias and variability in diagnostic accuracy and is intended to facilitate the use of STARD. As such, STARD 2015 may help to improve completeness and transparency in reporting of diagnostic accuracy studies.


Radiology | 2015

STARD 2015: An Updated List of Essential Items for Reporting Diagnostic Accuracy Studies

Patrick M. Bossuyt; Johannes B. Reitsma; David E. Bruns; Constantine Gatsonis; Paul Glasziou; Les Irwig; Jeroen G. Lijmer; David Moher; Drummond Rennie; Henrica C.W. de Vet; Herbert Y. Kressel; Nader Rifai; Robert M. Golub; Douglas G. Altman; Lotty Hooft; Daniël A. Korevaar; Jérémie F. Cohen

Incomplete reporting has been identified as a major source of avoidable waste in biomedical research. Essential information is often not provided in study reports, impeding the identification, critical appraisal, and replication of studies. To improve the quality of reporting of diagnostic accuracy studies, the Standards for Reporting of Diagnostic Accuracy Studies (STARD) statement was developed. Here we present STARD 2015, an updated list of 30 essential items that should be included in every report of a diagnostic accuracy study. This update incorporates recent evidence about sources of bias and variability in diagnostic accuracy and is intended to facilitate the use of STARD. As such, STARD 2015 may help to improve completeness and transparency in reporting of diagnostic accuracy studies.


Clinical Chemistry and Laboratory Medicine | 2003

Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative.

Patrick M. Bossuyt; Johannes B. Reitsma; David E. Bruns; Constantine Gatsonis; Paul Glasziou; Les Irwig; Jeroen G. Lijmer; David Moher; Drummond Rennie; Henrica C.W. de Vet

Objective: To improve the accuracy and completeness of reporting of studies of diagnostic accuracy, to allow readers to assess the potential for bias in the study and to evaluate its generalisability.

Methods: The Standards for Reporting of Diagnostic Accuracy (STARD) steering committee searched the literature to identify publications on the appropriate conduct and reporting of diagnostic studies and extracted potential items into an extensive list. Researchers, editors, and members of professional organisations shortened this list during a two-day consensus meeting, with the goal of developing a checklist and a generic flow diagram for studies of diagnostic accuracy.

Results: The search for published guidelines regarding diagnostic research yielded 33 previously published checklists, from which we extracted a list of 75 potential items. At the consensus meeting, participants shortened the list to 25 items, using evidence on bias whenever available. A prototypical flow diagram provides information about the method of patient recruitment, the order of test execution, and the numbers of patients undergoing the test under evaluation, the reference standard, or both.

Conclusions: Evaluation of research depends on complete and accurate reporting. If medical journals adopt the checklist and the flow diagram, the quality of reporting of studies of diagnostic accuracy should improve, to the advantage of clinicians, researchers, reviewers, journals, and the public.


BMC Medical Research Methodology | 2006

Reproducibility of the STARD checklist: an instrument to assess the quality of reporting of diagnostic accuracy studies

N. Smidt; Anne Wilhelmina Saskia Rutjes; Danielle van der Windt; Raymond Ostelo; Patrick M. Bossuyt; Johannes B. Reitsma; L.M. Bouter; Henrica C.W. de Vet

Background: In January 2003, STAndards for the Reporting of Diagnostic accuracy studies (STARD) were published in a number of journals to improve the quality of reporting in diagnostic accuracy studies. We designed a study to investigate the inter-assessment reproducibility, and the intra- and inter-observer reproducibility, of the items in the STARD statement.

Methods: Thirty-two diagnostic accuracy studies published in 2000 in medical journals with an impact factor of at least 4 were included. Two reviewers independently evaluated the quality of reporting of these studies using the 25 items of the STARD statement. A consensus evaluation was obtained by discussing and resolving disagreements between reviewers. Almost two years later, the same studies were evaluated by the same reviewers. For each item, percentage agreement and Cohen's kappa between the first and second consensus assessments (inter-assessment) were calculated. The intraclass correlation coefficient (ICC) was calculated to evaluate the reliability of the overall assessment.

Results: The overall inter-assessment agreement for all items of the STARD statement was 85% (Cohen's kappa 0.70) and varied from 63% to 100% for individual items. The largest differences between the two assessments were found for the reporting of the rationale of the reference standard (kappa 0.37), the number of included participants that underwent tests (kappa 0.28), the distribution of the severity of disease (kappa 0.23), a cross-tabulation of the results of the index test by the results of the reference standard (kappa 0.33), and how indeterminate results, missing data, and outliers were handled (kappa 0.25). Large differences for these items were also observed within and between reviewers. The inter-assessment reliability of the STARD checklist was satisfactory (ICC = 0.79 [95% CI: 0.62 to 0.89]).

Conclusion: Although the overall reproducibility of the quality of reporting of diagnostic accuracy studies using the STARD statement was found to be good, substantial disagreements were found for specific items. These disagreements were caused not so much by differences in interpretation of the items by the reviewers as by difficulties in assessing the reporting of these items due to lack of clarity within the articles. Including a flow diagram in all reports of diagnostic accuracy studies would be very helpful in reducing confusion between readers and among reviewers.
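For readers unfamiliar with the agreement statistics reported above, the following minimal Python sketch (with invented ratings, not the study's data) shows how percentage agreement and Cohen's kappa are computed for two assessments of the same items.

```python
# Percentage agreement and Cohen's kappa between two assessments of the
# same items. Ratings below are invented for illustration.

def percent_agreement(a: list, b: list) -> float:
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a: list, b: list) -> float:
    po = percent_agreement(a, b)  # observed agreement
    cats = set(a) | set(b)
    # Expected chance agreement from each assessment's marginal frequencies;
    # assumes pe != 1 (i.e., the raters do not both use a single category).
    pe = sum((a.count(c) / len(a)) * (b.count(c) / len(b)) for c in cats)
    return (po - pe) / (1 - pe)

first  = ["yes", "yes", "no", "yes", "no", "yes", "no", "no"]
second = ["yes", "yes", "no", "no",  "no", "yes", "yes", "no"]
print(percent_agreement(first, second), cohens_kappa(first, second))
# -> 0.75 agreement, kappa 0.5 (agreement corrected for chance)
```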


JAMA | 1999

Empirical evidence of design-related bias in studies of diagnostic tests.

Jeroen G. Lijmer; Ben Willem J. Mol; Siem H. Heisterkamp; Gouke J. Bonsel; Martin H. Prins; Jan van der Meulen; Patrick M. Bossuyt


The Lancet | 2003

The STARD initiative.

Patrick M. Bossuyt; Johannes B. Reitsma

Collaboration


Dive into Patrick M. Bossuyt's collaboration.

Top Co-Authors

Drummond Rennie

American Medical Association


David Moher

Ottawa Hospital Research Institute


Henrica C.W. de Vet

VU University Medical Center
