
Publications


Featured research published by Johannes B. Reitsma.


Annals of Internal Medicine | 2011

QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies.

Penny F Whiting; Anne Wilhelmina Saskia Rutjes; Marie Westwood; Susan Mallett; Jonathan J Deeks; Johannes B. Reitsma; Mariska M.G. Leeflang; Jonathan A C Sterne; Patrick M. Bossuyt

In 2003, the QUADAS tool for systematic reviews of diagnostic accuracy studies was developed. Experience, anecdotal reports, and feedback suggested areas for improvement; therefore, QUADAS-2 was developed. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias. The QUADAS-2 tool is applied in 4 phases: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability. This tool will allow for more transparent rating of bias and applicability of primary diagnostic accuracy studies.
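The domain structure described above maps naturally onto a small data record per study. Below is a minimal sketch in Python, assuming an invented representation (the Judgment enum, class name, and helper method are illustrative, not part of the QUADAS-2 publication), of how a review team might record the four risk-of-bias judgments and the three applicability judgments:

from dataclasses import dataclass, field
from enum import Enum

class Judgment(Enum):
    LOW = "low"
    HIGH = "high"
    UNCLEAR = "unclear"

# The four QUADAS-2 domains; only the first three carry applicability concerns.
DOMAINS = ["patient selection", "index test", "reference standard", "flow and timing"]
APPLICABILITY_DOMAINS = DOMAINS[:3]

@dataclass
class Quadas2Assessment:
    study_id: str
    risk_of_bias: dict = field(default_factory=dict)   # domain -> Judgment
    applicability: dict = field(default_factory=dict)  # domain -> Judgment

    def all_domains_low_risk(self) -> bool:
        """True only if every domain is judged at low risk of bias."""
        return all(self.risk_of_bias.get(d) == Judgment.LOW for d in DOMAINS)

a = Quadas2Assessment("Smith 2010")
a.risk_of_bias = {d: Judgment.LOW for d in DOMAINS}
a.risk_of_bias["patient selection"] = Judgment.UNCLEAR
a.applicability = {d: Judgment.LOW for d in APPLICABILITY_DOMAINS}
print(a.all_domains_low_risk())  # False: one domain was judged unclear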


BMC Medical Research Methodology | 2003

The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews

Penny Whiting; Anne Wilhelmina Saskia Rutjes; Johannes B. Reitsma; Patrick M. Bossuyt; Jos Kleijnen

Background: In the era of evidence-based medicine, with systematic reviews as its cornerstone, adequate quality assessment tools should be available. There is currently a lack of a systematically developed and evaluated tool for the assessment of diagnostic accuracy studies. The aim of this project was to combine empirical evidence and expert opinion in a formal consensus method to develop a tool to be used in systematic reviews to assess the quality of primary studies of diagnostic accuracy. Methods: We conducted a Delphi procedure to develop the quality assessment tool by refining an initial list of items. Members of the Delphi panel were experts in the area of diagnostic research. The results of three previously conducted reviews of the diagnostic literature were used to generate a list of potential items for inclusion in the tool and to provide an evidence base upon which to develop the tool. Results: A total of nine experts in the field of diagnostics took part in the Delphi procedure. The Delphi procedure consisted of four rounds, after which agreement was reached on the items to be included in the tool, which we have called QUADAS. The initial list of 28 items was reduced to 14 items in the final tool. Items covered patient spectrum, reference standard, disease progression bias, verification bias, review bias, clinical review bias, incorporation bias, test execution, study withdrawals, and indeterminate results. The QUADAS tool is presented together with guidelines for scoring each of the items included in the tool. Conclusions: This project has produced an evidence-based quality assessment tool to be used in systematic reviews of diagnostic accuracy studies. Further work to determine the usability and validity of the tool continues.
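The item-reduction step at the heart of a Delphi round can be pictured as a simple filter over panel votes. The sketch below is hypothetical (the 75% endorsement threshold, the vote data, and the function are invented; the paper does not publish such an algorithm):

def delphi_round(votes: dict[str, list[bool]], threshold: float = 0.75) -> list[str]:
    """Keep candidate items endorsed by at least `threshold` of the panel."""
    return [item for item, ballots in votes.items()
            if sum(ballots) / len(ballots) >= threshold]

# Nine panelists, as in the QUADAS Delphi procedure; the votes are invented.
votes = {
    "appropriate patient spectrum":  [True] * 9,
    "acceptable reference standard": [True] * 8 + [False],
    "funding source reported":       [True] * 4 + [False] * 5,
}
print(delphi_round(votes))  # the third item falls below the threshold and is dropped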


The American Journal of Gastroenterology | 2006

Polyp miss rate determined by tandem colonoscopy: a systematic review

Jeroen C. van Rijn; Johannes B. Reitsma; Jaap Stoker; Patrick M. Bossuyt; Sander J. H. van Deventer; Evelien Dekker

BACKGROUND AND AIMS: Colonoscopy is the best available method to detect and remove colonic polyps and therefore serves as the gold standard for less invasive tests such as virtual colonoscopy. Although gastroenterologists agree that colonoscopy is not infallible, there is no clarity on the numbers and rates of missed polyps. The purpose of this systematic review was to obtain summary estimates of the polyp miss rate as determined by tandem colonoscopy. METHODS: An extensive search was performed within PUBMED, EMBASE, and the Cochrane Library databases to identify studies in which patients had undergone two same-day colonoscopies with polypectomy. Random-effects models based on the binomial distribution were used to calculate pooled estimates of miss rates. RESULTS: Six studies with a total of 465 patients could be included. The pooled miss rate for polyps of any size was 22% (95% CI: 19–26%; 370/1,650 polyps). The adenoma miss rate by size was 2.1% (95% CI: 0.3–7.3%; 2/96 adenomas ≥10 mm), 13% (95% CI: 8.0–18%; 16/124 adenomas 5–10 mm), and 26% (95% CI: 27–35%; 151/587 adenomas 1–5 mm). Three studies reported data on nonadenomatous polyps: zero of eight nonadenomatous polyps ≥10 mm were missed (0%; 95% CI: 0–36.9%) and 83 of 384 nonadenomatous polyps <10 mm were missed (22%; 95% CI: 18–26%). CONCLUSIONS: Colonoscopy rarely misses polyps ≥10 mm, but the miss rate increases significantly for smaller polyps. The available evidence is based on a small number of studies with heterogeneous study designs and inclusion criteria.
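As a quick plausibility check on the headline figure, the pooled any-size estimate is essentially 370/1,650. The sketch below recomputes it with an exact binomial (Clopper-Pearson) interval via statsmodels; because this ignores the between-study heterogeneity that the authors' random-effects models account for, the interval is not the published one:

from statsmodels.stats.proportion import proportion_confint

missed, total = 370, 1650
rate = missed / total
lo, hi = proportion_confint(missed, total, alpha=0.05, method="beta")  # Clopper-Pearson
print(f"pooled miss rate {rate:.1%} (exact 95% CI {lo:.1%} to {hi:.1%})")
# ~22.4%, consistent with the reported 22% (95% CI: 19-26%)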


BMC Medical Research Methodology | 2006

Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies

Penny F Whiting; Marie E Weswood; Anne Wilhelmina Saskia Rutjes; Johannes B. Reitsma; Patrick M. Bossuyt; Jos Kleijnen

Background: A quality assessment tool for diagnostic accuracy studies, named QUADAS, has recently been developed. Although QUADAS has been used in several systematic reviews, it has not been formally validated. The objective was to evaluate the validity and usefulness of QUADAS. Methods: Three reviewers independently rated the quality of 30 studies using QUADAS. We assessed the proportion of agreements between each reviewer and the final consensus rating. This was done for all QUADAS items combined and for each individual item. Twenty reviewers who had used QUADAS in their reviews completed a short structured questionnaire on their experience of QUADAS. Results: Over all items, the agreements between each reviewer and the final consensus rating were 91%, 90% and 85%. The results for individual QUADAS items varied between 50% and 100% with a median value of 90%. Items related to uninterpretable test results and withdrawals led to the most disagreements. The feedback on the content of the tool was generally positive with only small numbers of reviewers reporting problems with coverage, ease of use, clarity of instructions and validity. Conclusion: Major modifications to the content of QUADAS itself are not necessary. The evaluation highlighted particular difficulties in scoring the items on uninterpretable results and withdrawals. Revised guidelines for scoring these items are proposed. It is essential that reviewers tailor guidelines for scoring items to their review, and ensure that all reviewers are clear on how to score studies. Reviewers should consider whether all QUADAS items are relevant to their review, and whether additional quality items should be assessed as part of their review.
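The agreement statistic used here is plain proportion agreement with the final consensus rating. A minimal sketch with invented item ratings:

def agreement(reviewer: list[str], consensus: list[str]) -> float:
    """Proportion of item ratings that match the consensus rating."""
    return sum(r == c for r, c in zip(reviewer, consensus)) / len(consensus)

consensus  = ["yes", "no", "unclear", "yes", "yes"]
reviewer_a = ["yes", "no", "yes",     "yes", "yes"]
print(f"{agreement(reviewer_a, consensus):.0%}")  # 80%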


Annals of Internal Medicine | 2015

Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration.

Karel G.M. Moons; Douglas G. Altman; Johannes B. Reitsma; John P. A. Ioannidis; Petra Macaskill; Ewout W. Steyerberg; Andrew J. Vickers; David F. Ransohoff; Gary S. Collins

In medicine, numerous decisions are made by care providers, often in shared decision making, on the basis of an estimated probability that a specific disease or condition is present (diagnostic setting) or that a specific event will occur in the future (prognostic setting) in an individual. In the diagnostic setting, the probability that a particular disease is present can be used, for example, to inform the referral of patients for further testing, to initiate treatment directly, or to reassure patients that a serious cause for their symptoms is unlikely. In the prognostic context, predictions can be used for planning lifestyle or therapeutic decisions on the basis of the risk for developing a particular outcome or state of health within a specific period (1–3). Such estimates of risk can also be used to risk-stratify participants in therapeutic intervention trials (4–7). In both the diagnostic and prognostic setting, probability estimates are commonly based on combining information from multiple predictors observed or measured from an individual (1, 2, 8–10). Information from a single predictor is often insufficient to provide reliable estimates of diagnostic or prognostic probabilities or risks (8, 11). In virtually all medical domains, diagnostic and prognostic multivariable (risk) prediction models are being developed, validated, updated, and implemented with the aim to assist doctors and individuals in estimating probabilities and potentially influence their decision making. A multivariable prediction model is a mathematical equation that relates multiple predictors for a particular individual to the probability of or risk for the presence (diagnosis) or future occurrence (prognosis) of a particular outcome (10, 12). Other names for a prediction model include risk prediction model, predictive model, prognostic (or prediction) index or rule, and risk score (9). Predictors are also referred to as covariates, risk indicators, prognostic factors, determinants, test results, or (more statistically) independent variables. They may range from demographic characteristics (for example, age and sex), medical history taking, and physical examination results to results from imaging, electrophysiology, blood and urine measurements, pathologic examinations, and disease stages or characteristics, or results from genomics, proteomics, transcriptomics, pharmacogenomics, metabolomics, and other new biological measurement platforms that continuously emerge.

Diagnostic and Prognostic Prediction Models. Multivariable prediction models fall into 2 broad categories: diagnostic and prognostic prediction models (Box A). In a diagnostic model, multiple (that is, 2 or more) predictors (often referred to as diagnostic test results) are combined to estimate the probability that a certain condition or disease is present (or absent) at the moment of prediction (Box B). They are developed from, and to be used for, individuals suspected of having that condition. Box A. Schematic representation of diagnostic and prognostic prediction modeling studies. The nature of the prediction in diagnosis is estimating the probability that a specific outcome or disease is present (or absent) within an individual at this point in time, that is, the moment of prediction (T = 0). In prognosis, the prediction is about whether an individual will experience a specific event or outcome within a certain time period.
In other words, in diagnostic prediction the interest is in principle a cross-sectional relationship, whereas prognostic prediction involves a longitudinal relationship. Nevertheless, in diagnostic modeling studies, for logistical reasons, a time window between predictor (index test) measurement and the reference standard is often necessary. Ideally, this interval should be as short as possible without starting any treatment within this period. Box B. Similarities and differences between diagnostic and prognostic prediction models. In a prognostic model, multiple predictors are combined to estimate the probability of a particular outcome or event (for example, mortality, disease recurrence, complication, or therapy response) occurring in a certain period in the future. This period may range from hours (for example, predicting postoperative complications [13]) to weeks or months (for example, predicting 30-day mortality after cardiac surgery [14]) or years (for example, predicting the 5-year risk for developing type 2 diabetes [15]). Prognostic models are developed, and are to be used, in individuals at risk for developing that outcome. They may be models for either ill or healthy individuals. For example, prognostic models include models to predict recurrence, complications, or death in a certain period after being diagnosed with a particular disease. But they may also include models for predicting the occurrence of an outcome in a certain period in individuals without a specific disease: for example, models to predict the risk for developing type 2 diabetes (16) or cardiovascular events in middle-aged nondiseased individuals (17), or the risk for preeclampsia in pregnant women (18). We thus use prognostic in the broad sense, referring to the prediction of an outcome in the future in individuals at risk for that outcome, rather than the narrower definition of predicting the course of patients who have a particular disease, with or without treatment (1). The main difference between a diagnostic and prognostic prediction model is the concept of time. Diagnostic modeling studies are usually cross-sectional, whereas prognostic modeling studies are usually longitudinal. In this document, we refer to both diagnostic and prognostic prediction models as prediction models, highlighting issues that are specific to either type of model.

Development, Validation, and Updating of Prediction Models. Prediction model studies may address the development of a new prediction model (10), a model evaluation (often referred to as model validation) with or without updating of the model (19–21), or a combination of these (Box C and Figure 1). Box C. Types of prediction model studies. Figure 1. Types of prediction model studies covered by the TRIPOD statement. D = development data; V = validation data. Model development studies aim to derive a prediction model by selecting predictors and combining them into a multivariable model. Logistic regression is commonly used for cross-sectional (diagnostic) and short-term (for example, 30-day mortality) prognostic outcomes, and Cox regression for long-term (for example, 10-year risk) prognostic outcomes. Studies may also focus on quantifying the incremental or added predictive value of a specific predictor (for example, a newly discovered one) (22) to a prediction model.
Quantifying the predictive ability of a model on the same data from which the model was developed (often referred to as apparent performance [Figure 1]) will tend to give an optimistic estimate of performance, owing to overfitting (too few outcome events relative to the number of candidate predictors) and the use of predictor selection strategies (23–25). Studies developing new prediction models should therefore always include some form of internal validation to quantify any optimism in the predictive performance (for example, calibration and discrimination) of the developed model and adjust the model for overfitting. Internal validation techniques use only the original study sample and include such methods as bootstrapping or cross-validation. Internal validation is a necessary part of model development (2). After developing a prediction model, it is strongly recommended to evaluate the performance of the model in participant data other than that used for the model development. External validation (Box C and Figure 1) (20, 26) requires that, for each individual in the new participant data set, outcome predictions are made using the original model (that is, the published model or regression formula) and compared with the observed outcomes. External validation may use participant data collected by the same investigators, typically using the same predictor and outcome definitions and measurements, but sampled from a later period (temporal or narrow validation); by other investigators in another hospital or country (though disappointingly rare [27]), sometimes using different definitions and measurements (geographic or broad validation); in similar participants, but from an intentionally different setting (for example, a model developed in secondary care and assessed in similar participants, but selected from primary care); or even in other types of participants (for example, a model developed in adults and assessed in children, or developed for predicting fatal events and assessed for predicting nonfatal events) (19, 20, 26, 28–30). In case of poor performance (for example, systematic miscalibration) when evaluated in an external validation data set, the model can be updated or adjusted (for example, recalibrated, or extended with a new predictor) on the basis of the validation data set (Box C) (2, 20, 21, 31). Randomly splitting a single data set into model development and model validation data sets is frequently done to develop and validate a prediction model; this is often, yet erroneously, believed to be a form of external validation. However, this approach is a weak and inefficient form of internal validation, because not all available data are used to develop the model (23, 32). If the available development data set is sufficiently large, splitting by time and developing a model using data from one period and evaluating its performance using the data from the other period (temporal validation) is a stronger approach. With a single data set, temporal splitting and model validation can be considered intermediate between internal and external validation.

Incomplete and Inaccurate Reporting. Prediction models are becoming increasingly abundant in the medical literature (9, 33, 34), and policymakers are incre…
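The bootstrap internal validation described above fits in a few lines of code. The sketch below is illustrative only (synthetic data, scikit-learn, and AUC as the discrimination measure are all assumptions, not choices taken from TRIPOD): the apparent AUC is corrected by the average optimism observed when models refitted in bootstrap samples are re-evaluated on the original data.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p = 300, 5
X = rng.normal(size=(n, p))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1]))))  # synthetic outcome

def fit_auc(X_fit, y_fit, X_eval, y_eval):
    model = LogisticRegression().fit(X_fit, y_fit)
    return roc_auc_score(y_eval, model.predict_proba(X_eval)[:, 1])

apparent = fit_auc(X, y, X, y)  # "apparent performance": evaluated on the development data
optimism = []
for _ in range(200):  # bootstrap replicates
    idx = rng.integers(0, n, size=n)                # resample with replacement
    boot = fit_auc(X[idx], y[idx], X[idx], y[idx])  # apparent AUC in the bootstrap sample
    test = fit_auc(X[idx], y[idx], X, y)            # same model applied to the original data
    optimism.append(boot - test)

print(f"apparent AUC {apparent:.3f}, optimism-corrected {apparent - np.mean(optimism):.3f}")

The same logic explains why a random development/validation split of one data set is only weak internal validation: both halves come from the same sample, and the split uses the available data less efficiently than bootstrapping.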


Annals of Surgery | 2007

Extended Transthoracic Resection Compared With Limited Transhiatal Resection for Adenocarcinoma of the Mid/distal Esophagus: Five-year Survival of a Randomized Clinical Trial

Jikke M. T. Omloo; Sjoerd M. Lagarde; Jan B. F. Hulscher; Johannes B. Reitsma; Paul Fockens; Herman van Dekken; Fiebo J. ten Kate; Huug Obertop; Hugo W. Tilanus; J. Jan B. van Lanschot

Objective: To determine whether extended transthoracic esophagectomy for adenocarcinoma of the mid/distal esophagus improves long-term survival. Background: A randomized trial was performed to compare surgical techniques. Complete 5-year survival data are now available. Methods: A total of 220 patients with adenocarcinoma of the distal esophagus (type I) or gastric cardia involving the distal esophagus (type II) were randomly assigned to limited transhiatal esophagectomy or to extended transthoracic esophagectomy with en bloc lymphadenectomy. Patients with peroperatively irresectable/incurable cancer were excluded from this analysis (n = 15). A total of 95 patients underwent transhiatal esophagectomy and 110 patients underwent transthoracic esophagectomy. Results: After transhiatal and transthoracic resection, 5-year survival was 34% and 36%, respectively (P = 0.71, per-protocol analysis). In a subgroup analysis based on the location of the primary tumor according to the resection specimen, no overall survival benefit for either surgical approach was seen in 115 patients with a type II tumor (P = 0.81). In 90 patients with a type I tumor, a survival benefit of 14% was seen with the transthoracic approach (51% vs. 37%, P = 0.33). There was evidence that the treatment effect differed depending on the number of positive lymph nodes in the resection specimen (test for interaction P = 0.06). In patients (n = 55) without positive nodes, locoregional disease-free survival after transhiatal esophagectomy was comparable to that after transthoracic esophagectomy (86% and 89%, respectively). The same was true for patients (n = 46) with more than 8 positive nodes (0% in both groups). Patients (n = 104) with 1 to 8 positive lymph nodes in the resection specimen showed a 5-year locoregional disease-free survival advantage if operated on via the transthoracic route (23% vs. 64%, P = 0.02). Conclusion: There is no significant overall survival benefit for either approach. However, compared with limited transhiatal resection, extended transthoracic esophagectomy for type I esophageal adenocarcinoma shows an ongoing trend toward better 5-year survival. Moreover, patients with a limited number of positive lymph nodes in the resection specimen seem to benefit from an extended transthoracic esophagectomy.
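For readers less familiar with the survival statistics quoted above, the sketch below shows the general shape of such a comparison: Kaplan-Meier estimates of 5-year survival in two arms plus a log-rank test. It assumes the lifelines package and uses synthetic data; none of the numbers come from this trial.

import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(1)
t_a = rng.exponential(scale=4.5, size=95)    # invented survival times, arm A (years)
t_b = rng.exponential(scale=5.0, size=110)   # invented survival times, arm B (years)
e_a, t_a = (t_a < 5).astype(int), np.minimum(t_a, 5.0)  # administrative censoring at 5 years
e_b, t_b = (t_b < 5).astype(int), np.minimum(t_b, 5.0)

km = KaplanMeierFitter().fit(t_a, e_a)
print("5-year survival, arm A:", float(km.survival_function_at_times(5.0).iloc[0]))
print("log-rank p:", logrank_test(t_a, t_b, e_a, e_b).p_value)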


Clinical Chemistry | 2015

STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies

Patrick M. Bossuyt; Johannes B. Reitsma; David E. Bruns; Constantine Gatsonis; Paul Glasziou; Les Irwig; Jeroen G. Lijmer; David Moher; Drummond Rennie; Henrica C.W. de Vet; Herbert Y. Kressel; Nader Rifai; Robert M. Golub; Douglas G. Altman; Lotty Hooft; Daniël A. Korevaar; Jérémie F. Cohen

Incomplete reporting has been identified as a major source of avoidable waste in biomedical research. Essential information is often not provided in study reports, impeding the identification, critical appraisal, and replication of studies. To improve the quality of reporting of diagnostic accuracy studies, the Standards for Reporting Diagnostic Accuracy (STARD) statement was developed. Here we present STARD 2015, an updated list of 30 essential items that should be included in every report of a diagnostic accuracy study. This update incorporates recent evidence about sources of bias and variability in diagnostic accuracy and is intended to facilitate the use of STARD. As such, STARD 2015 may help to improve completeness and transparency in reporting of diagnostic accuracy studies.


European Urology | 2015

Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD Statement.

Gary S. Collins; Johannes B. Reitsma; Douglas G. Altman; Karel G.M. Moons

CONTEXT: Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. OBJECTIVE: The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. EVIDENCE ACQUISITION: This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. EVIDENCE SYNTHESIS: The resulting TRIPOD Statement is a checklist of 22 items deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. CONCLUSIONS: To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PATIENT SUMMARY: The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes.


Canadian Medical Association Journal | 2006

Evidence of bias and variation in diagnostic accuracy studies

Anne Wilhelmina Saskia Rutjes; Johannes B. Reitsma; Marcello Di Nisio; Nynke Smidt; Jeroen C. van Rijn; Patrick M. Bossuyt

Background: Studies with methodologic shortcomings can overestimate the accuracy of a medical test. We sought to determine and compare the direction and magnitude of the effects of a number of potential sources of bias and variation in studies on estimates of diagnostic accuracy. Methods: We identified meta-analyses of the diagnostic accuracy of tests through an electronic search of the databases MEDLINE, EMBASE, DARE and MEDION (1999–2002). We included meta-analyses with at least 10 primary studies without preselection based on design features. Pairs of reviewers independently extracted study characteristics and original data from the primary studies. We used a multivariable meta-epidemiologic regression model to investigate the direction and strength of the association between 15 study features on estimates of diagnostic accuracy. Results: We selected 31 meta-analyses with 487 primary studies of test evaluations. Only 1 study had no design deficiencies. The quality of reporting was poor in most of the studies. We found significantly higher estimates of diagnostic accuracy in studies with nonconsecutive inclusion of patients (relative diagnostic odds ratio [RDOR] 1.5, 95% confidence interval [CI] 1.0–2.1) and retrospective data collection (RDOR 1.6, 95% CI 1.1–2.2). The estimates were highest in studies that had severe cases and healthy controls (RDOR 4.9, 95% CI 0.6–37.3). Studies that selected patients based on whether they had been referred for the index test, rather than on clinical symptoms, produced significantly lower estimates of diagnostic accuracy (RDOR 0.5, 95% CI 0.3–0.9). The variance between meta-analyses of the effect of design features was large to moderate for type of design (cohort v. case–control), the use of composite reference standards and the use of differential verification; the variance was close to zero for the other design features. Interpretation: Shortcomings in study design can affect estimates of diagnostic accuracy, but the magnitude of the effect may vary from one situation to another. Design features and clinical characteristics of patient groups should be carefully considered by researchers when designing new studies and by readers when appraising the results of such studies. Unfortunately, incomplete reporting hampers the evaluation of potential sources of bias in diagnostic accuracy studies.
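The RDOR figures quoted above are ratios of diagnostic odds ratios, where DOR = (TP/FN)/(FP/TN) for a single 2x2 accuracy table. A toy computation with invented counts shows how a biased design inflates the ratio:

def dor(tp, fp, fn, tn):
    """Diagnostic odds ratio of a single 2x2 accuracy table."""
    return (tp / fn) / (fp / tn)

# Invented counts: a "severe cases vs. healthy controls" design vs. a consecutive series.
dor_biased   = dor(tp=90, fp=10, fn=10, tn=90)
dor_unbiased = dor(tp=80, fp=25, fn=20, tn=75)
print(f"RDOR = {dor_biased / dor_unbiased:.1f}")  # >1 means the biased design overestimates accuracy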


The American Journal of Gastroenterology | 2003

Acute upper GI bleeding: did anything change? Time trend analysis of incidence and outcome of acute upper GI bleeding between 1993/1994 and 2000

M E van Leerdam; E M Vreeburg; E. A. J. Rauws; A. A.M. Geraedts; Jan G.P. Tijssen; Johannes B. Reitsma; G. N. J. Tytgat

OBJECTIVES: The aim of this study was to examine recent time trends in the incidence and outcome of upper GI bleeding. METHODS: Prospective data were collected on all patients presenting with acute upper GI bleeding from a defined geographical area in 1993/1994 and in 2000. RESULTS: Incidence decreased from 61.7/100,000 persons annually in 1993/1994 to 47.7/100,000 in 2000, corresponding to a 23% decrease in incidence after age adjustment (95% CI = 15–30%). The incidence was higher among patients of more advanced age. Rebleeding (16% vs 15%) and mortality (14% vs 13%) did not differ between the two time periods. Ulcer bleeding was the most frequent cause of bleeding, at 40% (1993/1994) and 46% (2000). Incidence remained stable for both duodenal and gastric ulcer bleeding. Almost one half of all patients with peptic ulcer bleeding were using nonsteroidal anti-inflammatory drugs or aspirin. Among patients with ulcer bleeding, rebleeding (22% vs 20%) and mortality (15% vs 14%) likewise did not differ between the two time periods. Increasing age, presence of severe and life-threatening comorbidity, and rebleeding were associated with higher mortality. CONCLUSIONS: Between 1993/1994 and 2000, the incidence rate of acute upper GI bleeding decreased significantly, but no improvement was seen in the risk of rebleeding or mortality. The incidence rate of ulcer bleeding remained stable. Prevention of ulcer bleeding is important.
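The reported 23% figure is age-adjusted, but the crude decline can be checked directly from the two incidence rates:

rate_1993, rate_2000 = 61.7, 47.7  # per 100,000 persons per year
print(f"crude decline: {1 - rate_2000 / rate_1993:.1%}")  # ~22.7%, close to the adjusted 23%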

Collaboration


Dive into Johannes B. Reitsma's collaborations.

Top Co-Authors

Gouke J. Bonsel

Erasmus University Rotterdam


Miranda Olff

University of Amsterdam
