Anne Wilhelmina Saskia Rutjes
University of Amsterdam
Publication
Featured research published by Anne Wilhelmina Saskia Rutjes.
Annals of Internal Medicine | 2011
Penny F Whiting; Anne Wilhelmina Saskia Rutjes; Marie Westwood; Susan Mallett; Jonathan J Deeks; Johannes B. Reitsma; Mariska M.G. Leeflang; Jonathan A C Sterne; Patrick M. Bossuyt
In 2003, the QUADAS tool for systematic reviews of diagnostic accuracy studies was developed. Experience, anecdotal reports, and feedback suggested areas for improvement; therefore, QUADAS-2 was developed. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias. The QUADAS-2 tool is applied in 4 phases: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability. This tool will allow for more transparent rating of bias and applicability of primary diagnostic accuracy studies.
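The domain structure described in this abstract can be sketched as a small data model. This is an illustration only, not part of the published tool: the class name `DomainJudgement` and the rating labels in `RATINGS` are assumptions, since the abstract does not state the rating scale.

```python
from dataclasses import dataclass
from typing import Optional

# The four QUADAS-2 domains, in the order given in the abstract.
DOMAINS = ("patient selection", "index test", "reference standard", "flow and timing")
# Per the abstract, only the first three domains are also assessed for applicability.
APPLICABILITY_DOMAINS = DOMAINS[:3]
# Assumed rating labels (not stated in the abstract).
RATINGS = ("low", "high", "unclear")


@dataclass
class DomainJudgement:
    """One reviewer judgement for one domain of one primary study (hypothetical model)."""
    domain: str
    risk_of_bias: str
    applicability_concern: Optional[str] = None

    def __post_init__(self):
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain}")
        if self.risk_of_bias not in RATINGS:
            raise ValueError(f"unknown risk-of-bias rating: {self.risk_of_bias}")
        if self.domain in APPLICABILITY_DOMAINS:
            # The first three domains need an applicability judgement.
            if self.applicability_concern not in RATINGS:
                raise ValueError("applicability concern required for this domain")
        elif self.applicability_concern is not None:
            # "Flow and timing" is judged for risk of bias only.
            raise ValueError("flow and timing has no applicability judgement")
```

Encoding the rule in `__post_init__` means a record for "flow and timing" cannot accidentally carry an applicability rating.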
BMC Medical Research Methodology | 2003
Penny Whiting; Anne Wilhelmina Saskia Rutjes; Johannes B. Reitsma; Patrick M. Bossuyt; Jos Kleijnen
Background: In the era of evidence based medicine, with systematic reviews as its cornerstone, adequate quality assessment tools should be available. There is currently a lack of a systematically developed and evaluated tool for the assessment of diagnostic accuracy studies. The aim of this project was to combine empirical evidence and expert opinion in a formal consensus method to develop a tool to be used in systematic reviews to assess the quality of primary studies of diagnostic accuracy. Methods: We conducted a Delphi procedure to develop the quality assessment tool by refining an initial list of items. Members of the Delphi panel were experts in the area of diagnostic research. The results of three previously conducted reviews of the diagnostic literature were used to generate a list of potential items for inclusion in the tool and to provide an evidence base upon which to develop the tool. Results: A total of nine experts in the field of diagnostics took part in the Delphi procedure. The Delphi procedure consisted of four rounds, after which agreement was reached on the items to be included in the tool which we have called QUADAS. The initial list of 28 items was reduced to fourteen items in the final tool. Items included covered patient spectrum, reference standard, disease progression bias, verification bias, review bias, clinical review bias, incorporation bias, test execution, study withdrawals, and indeterminate results. The QUADAS tool is presented together with guidelines for scoring each of the items included in the tool. Conclusions: This project has produced an evidence based quality assessment tool to be used in systematic reviews of diagnostic accuracy studies. Further work to determine the usability and validity of the tool continues.
BMC Medical Research Methodology | 2006
Penny F Whiting; Marie E Westwood; Anne Wilhelmina Saskia Rutjes; Johannes B. Reitsma; Patrick M M Bossuyt; Jos Kleijnen
Background: A quality assessment tool for diagnostic accuracy studies, named QUADAS, has recently been developed. Although QUADAS has been used in several systematic reviews, it has not been formally validated. The objective was to evaluate the validity and usefulness of QUADAS. Methods: Three reviewers independently rated the quality of 30 studies using QUADAS. We assessed the proportion of agreements between each reviewer and the final consensus rating. This was done for all QUADAS items combined and for each individual item. Twenty reviewers who had used QUADAS in their reviews completed a short structured questionnaire on their experience of QUADAS. Results: Over all items, the agreements between each reviewer and the final consensus rating were 91%, 90% and 85%. The results for individual QUADAS items varied between 50% and 100% with a median value of 90%. Items related to uninterpretable test results and withdrawals led to the most disagreements. The feedback on the content of the tool was generally positive with only small numbers of reviewers reporting problems with coverage, ease of use, clarity of instructions and validity. Conclusion: Major modifications to the content of QUADAS itself are not necessary. The evaluation highlighted particular difficulties in scoring the items on uninterpretable results and withdrawals. Revised guidelines for scoring these items are proposed. It is essential that reviewers tailor guidelines for scoring items to their review, and ensure that all reviewers are clear on how to score studies. Reviewers should consider whether all QUADAS items are relevant to their review, and whether additional quality items should be assessed as part of their review.
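The agreement measure used in this evaluation, the proportion of ratings on which a reviewer matches the final consensus, can be sketched as follows. The function name and the example ratings are hypothetical; the abstract reports only the resulting percentages.

```python
def proportion_agreement(reviewer, consensus):
    """Fraction of ratings where the reviewer matches the consensus rating."""
    if len(reviewer) != len(consensus):
        raise ValueError("rating lists must have the same length")
    if not reviewer:
        raise ValueError("at least one rating is required")
    # Count positions where both ratings are identical.
    matches = sum(r == c for r, c in zip(reviewer, consensus))
    return matches / len(reviewer)


# Hypothetical ratings for four items of a single study.
reviewer_ratings = ["yes", "no", "unclear", "yes"]
consensus_ratings = ["yes", "no", "yes", "yes"]
agreement = proportion_agreement(reviewer_ratings, consensus_ratings)  # 3 of 4 match
```

The same function works per item (ratings of one item across all studies) or over all items combined, which are the two views reported in the abstract.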
Canadian Medical Association Journal | 2006
Anne Wilhelmina Saskia Rutjes; Johannes B. Reitsma; Marcello Di Nisio; Nynke Smidt; Jeroen C. van Rijn; Patrick M. Bossuyt
Background: Studies with methodologic shortcomings can overestimate the accuracy of a medical test. We sought to determine and compare the direction and magnitude of the effects of a number of potential sources of bias and variation in studies on estimates of diagnostic accuracy. Methods: We identified meta-analyses of the diagnostic accuracy of tests through an electronic search of the databases MEDLINE, EMBASE, DARE and MEDION (1999–2002). We included meta-analyses with at least 10 primary studies without preselection based on design features. Pairs of reviewers independently extracted study characteristics and original data from the primary studies. We used a multivariable meta-epidemiologic regression model to investigate the direction and strength of the association between 15 study features and estimates of diagnostic accuracy. Results: We selected 31 meta-analyses with 487 primary studies of test evaluations. Only 1 study had no design deficiencies. The quality of reporting was poor in most of the studies. We found significantly higher estimates of diagnostic accuracy in studies with nonconsecutive inclusion of patients (relative diagnostic odds ratio [RDOR] 1.5, 95% confidence interval [CI] 1.0–2.1) and retrospective data collection (RDOR 1.6, 95% CI 1.1–2.2). The estimates were highest in studies that had severe cases and healthy controls (RDOR 4.9, 95% CI 0.6–37.3). Studies that selected patients based on whether they had been referred for the index test, rather than on clinical symptoms, produced significantly lower estimates of diagnostic accuracy (RDOR 0.5, 95% CI 0.3–0.9). The variance between meta-analyses of the effect of design features was large to moderate for type of design (cohort v. case–control), the use of composite reference standards and the use of differential verification; the variance was close to zero for the other design features.
Interpretation: Shortcomings in study design can affect estimates of diagnostic accuracy, but the magnitude of the effect may vary from one situation to another. Design features and clinical characteristics of patient groups should be carefully considered by researchers when designing new studies and by readers when appraising the results of such studies. Unfortunately, incomplete reporting hampers the evaluation of potential sources of bias in diagnostic accuracy studies.
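To make the relative diagnostic odds ratio (RDOR) reported above concrete, the sketch below computes a diagnostic odds ratio from a 2×2 table and then the ratio of two such values. It is a simplification: the study estimated RDORs with a multivariable regression model, whereas this example only divides two crude odds ratios, and all counts are invented.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN) for a single 2x2 table."""
    if fp == 0 or fn == 0:
        # A zero cell makes the ratio undefined; a continuity correction
        # would be needed, which this sketch does not attempt.
        raise ValueError("FP and FN must be non-zero")
    return (tp * tn) / (fp * fn)


def relative_dor(dor_with_feature, dor_without_feature):
    """RDOR > 1 means studies with the design feature report higher accuracy."""
    return dor_with_feature / dor_without_feature


# Invented counts for two hypothetical groups of studies.
dor_flawed = diagnostic_odds_ratio(tp=90, fp=10, fn=10, tn=90)     # 81.0
dor_reference = diagnostic_odds_ratio(tp=80, fp=20, fn=20, tn=80)  # 16.0
rdor = relative_dor(dor_flawed, dor_reference)                     # 5.0625
```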
BMJ | 2002
Peter Jüni; Anne Wilhelmina Saskia Rutjes; Paul Dieppe
Selective cyclo-oxygenase 2 (COX 2) inhibitors, including celecoxib (Celebrex) and rofecoxib (Vioxx), are hypothesised to have a lower risk of gastrointestinal complications than traditional non-steroidal anti-inflammatory drugs.1 In September 2000 the celecoxib long term arthritis safety study, better known as CLASS, was published in JAMA.2 This trial, widely cited and distributed, concluded that a COX 2 inhibitor was associated with a lower incidence of complications than traditional non-steroidal anti-inflammatory drugs. What was much less widely publicised were criticisms that contradicted this conclusion. CLASS was reported as a three arm trial comparing celecoxib 800 mg/day with ibuprofen 2400 mg/day and diclofenac 150 mg/day in osteoarthritis or rheumatoid arthritis. Clinically relevant upper gastrointestinal ulcer complications (bleeding, perforation, or obstruction) and symptomatic ulcers during the first six months of treatment were described as the two main outcome measures, comparing incidence rates for celecoxib and a traditional non-steroidal anti-inflammatory drug (fig 1). It was concluded that, compared with the traditional non-steroidal anti-inflammatory drug, celecoxib “was associated with a lower incidence of symptomatic ulcers and ulcer complications combined.”3 The trial was funded by celecoxib's manufacturer Pharmacia. An article in the Washington Post in August 2001 3 and two letters published in JAMA in November 2001 4 5 drew attention to the fact that complete information available to the United States Food and Drug Administration contradicted these conclusions. The paper reporting CLASS2 actually referred to the combined analysis of the results …
Journal of Thrombosis and Haemostasis | 2007
M. Di Nisio; Alessandro Squizzato; Anne Wilhelmina Saskia Rutjes; H. R. Büller; A. H. Zwinderman; Patrick M. Bossuyt
Summary. Background: The reported diagnostic accuracy of the D‐dimer test for exclusion of deep vein thrombosis (DVT) and pulmonary embolism (PE) varies. It is unknown to what extent this is due to differences in study design or patient groups, or to genuine differences between D‐dimer assays. Methods: Studies evaluating the diagnostic accuracy of the D‐dimer test in the diagnosis of venous thromboembolism were systematically searched for in the MEDLINE and EMBASE databases up to March 2005. Reference lists of all included studies and of reviews related to the topic of the present meta‐analysis were manually searched for other additional potentially eligible studies. Two reviewers independently extracted study characteristics using standardized forms. Results: In total, 217 D‐dimer test evaluations for DVT and 111 for PE were analyzed. Several study design characteristics were associated with systematic differences in diagnostic accuracy. After adjustment for these features, the sensitivities of the D‐dimer enzyme‐linked immunofluorescence assay (ELFA) (DVT 96%; PE 97%), microplate enzyme‐linked immunosorbent assay (ELISA) (DVT 94%; PE 95%), and latex quantitative assay (DVT 93%; PE 95%) were superior to those of the whole‐blood D‐dimer assay (DVT 83%; PE 87%), latex semiquantitative assay (DVT 85%; PE 88%) and latex qualitative assay (DVT 69%; PE 75%). The latex qualitative and whole‐blood D‐dimer assays had the highest specificities (DVT 99%, 71%; PE 99%, 69%). Conclusions: Compared to other D‐dimer assays, the ELFA, microplate ELISA and latex quantitative assays have higher sensitivity but lower specificity, resulting in a more confident exclusion of the disease at the expense of more additional imaging testing. These conclusions are based on the most up‐to‐date and extensive systematic review of the topic area, including 184 articles, with 328 D‐dimer test evaluations.
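The trade-off described in the conclusions (higher sensitivity at the cost of lower specificity) follows directly from how the two measures are computed from a 2×2 table. A minimal sketch, with invented counts rather than data from the review:

```python
def sensitivity(tp, fn):
    """Proportion of diseased patients with a positive test: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-diseased patients with a negative test: TN / (TN + FP)."""
    return tn / (tn + fp)


# Invented counts for a hypothetical high-sensitivity assay:
# 100 patients with the disease, 100 without.
sens = sensitivity(tp=96, fn=4)   # few missed cases
spec = specificity(tn=40, fp=60)  # many false positives needing further imaging
```

With these numbers a negative result excludes disease with some confidence (only 4 of 100 cases missed), while 60 of 100 disease-free patients would still go on to additional testing.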
Annals of Internal Medicine | 2012
Anne Wilhelmina Saskia Rutjes; Peter Jüni; Bruno R. da Costa; Sven Trelle; Eveline Nüesch; Stephan Reichenbach
BACKGROUND Viscosupplementation, the intra-articular injection of hyaluronic acid, is widely used for symptomatic knee osteoarthritis. PURPOSE To assess the benefits and risks of viscosupplementation for adults with symptomatic knee osteoarthritis. DATA SOURCES MEDLINE (1966 to January 2012), EMBASE (1980 to January 2012), the Cochrane Central Register of Controlled Trials (1970 to January 2012), and other sources. STUDY SELECTION Randomized trials in any language that compared viscosupplementation with sham or nonintervention control in adults with knee osteoarthritis. DATA EXTRACTION Primary outcomes were pain intensity and flare-ups. Secondary outcomes included function and serious adverse events. Reviewers used duplicate abstractions, assessed study quality, pooled data by using a random-effects model, examined funnel plots, and explored heterogeneity by using meta-regression. DATA SYNTHESIS Eighty-nine trials involving 12 667 adults met inclusion criteria. Sixty-eight had a sham control, 40 had a follow-up duration greater than 3 months, and 22 used cross-linked forms of hyaluronic acid. Overall, 71 trials (9617 patients) showed that viscosupplementation moderately reduced pain (effect size, -0.37 [95% CI, -0.46 to -0.28]). There was important between-trial heterogeneity and an asymmetrical funnel plot: Trial size, blinded outcome assessment, and publication status were associated with effect size. Five unpublished trials (1149 patients) showed an effect size of -0.03 (CI, -0.14 to 0.09). Eighteen large trials with blinded outcome assessment (5094 patients) showed a clinically irrelevant effect size of -0.11 (CI, -0.18 to -0.04). Six trials (811 patients) showed that viscosupplementation increased, although not statistically significantly, the risk for flare-ups (relative risk, 1.51 [CI, 0.84 to 2.72]). Fourteen trials (3667 patients) showed that viscosupplementation increased the risk for serious adverse events (relative risk, 1.41 [CI, 1.02 to 1.97]). 
LIMITATIONS Trial quality was generally low. Safety data were often not reported. CONCLUSION In patients with knee osteoarthritis, viscosupplementation is associated with a small and clinically irrelevant benefit and an increased risk for serious adverse events. PRIMARY FUNDING SOURCE Arco Foundation.
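The abstract states that data were pooled with a random-effects model but does not name the estimator. The sketch below assumes the commonly used DerSimonian-Laird method; the function name and the example inputs are hypothetical and do not reproduce the review's data.

```python
def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled effect, standard error, between-trial variance tau^2).
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need at least two trials with matching variances")
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each trial's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = (1.0 / sum_w_star) ** 0.5
    return pooled, se, tau2


# Two hypothetical trials with equal precision but different effect sizes.
pooled, se, tau2 = pool_random_effects([0.0, 1.0], [0.01, 0.01])
```

When trials disagree more than their sampling error explains, `tau2` grows and the weights become more equal, so small trials gain influence. That is one reason the authors also examined large trials with blinded outcome assessment separately.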
Journal of Clinical Epidemiology | 2009
Johannes B. Reitsma; Anne Wilhelmina Saskia Rutjes; Khalid S. Khan; Arri Coomarasamy; Patrick M. Bossuyt
OBJECTIVE In diagnostic accuracy studies, the reference standard may be imperfect or not available in all patients. We systematically reviewed the proposed solutions for these situations and generated methodological guidance. STUDY DESIGN AND SETTING Review of methodological articles. RESULTS We categorized the solutions into four main groups. The first group includes methods that impute or adjust for missing data on the reference standard. The second group consists of methods that correct estimates of accuracy obtained with an imperfect reference standard. In the third group a reference standard is constructed by combining multiple test results through a predefined rule, based on a consensus procedure, or through statistical modeling. In the fourth group, the diagnostic accuracy paradigm is abandoned in favor of validation studies that relate index test results to relevant clinical data, such as history, future clinical events, and response to therapy. CONCLUSION Most of the methods try to impute, adjust, or construct a reference standard. In situations that deviate only marginally from the classical diagnostic accuracy paradigm, these are valuable methods. In cases where an acceptable reference standard does not exist, the concept of clinical test validation may provide an alternative paradigm to evaluate a diagnostic test.
Neurology | 2006
N. Smidt; Anne Wilhelmina Saskia Rutjes; D.A.W.M. van der Windt; Raymond Ostelo; Patrick M. Bossuyt; Johannes B. Reitsma; L.M. Bouter; H.C.W. de Vet
Objective: To assess whether the quality of reporting of diagnostic accuracy studies has improved since the publication of the Standards for the Reporting of Diagnostic Accuracy studies (STARD statement). Methods: The quality of reporting of diagnostic accuracy studies published in 12 medical journals in 2000 (pre-STARD) and 2004 (post-STARD) was evaluated by two reviewers independently. For each article, the number of reported STARD items was counted (range 0 to 25). Differences in completeness of reporting between articles published in 2000 and 2004 were analyzed, using multilevel analyses. Results: We included 124 articles published in 2000 and 141 articles published in 2004. Mean number of reported STARD items was 11.9 (range 3.5 to 19.5) in 2000 and 13.6 (range 4.0 to 21.0) in 2004, an increase of 1.81 items (95% CI: 0.61 to 3.01). Articles published in 2004 reported the following significantly more often: methods for calculating test reproducibility of the index test (16% vs 35%); distribution of the severity of disease and other diagnoses (23% vs 53%); estimates of variability of diagnostic accuracy between subgroups (39% vs 60%); and a flow diagram (2% vs 12%). Conclusions: The quality of reporting of diagnostic accuracy studies has improved slightly over time, without a more pronounced effect in journals that adopted the STARD statement. As there is still room for improvement, editors should mention the use of the STARD statement as a requirement in their guidelines for authors, and instruct reviewers to check the STARD items. Authors should include a flow diagram in their manuscript.
BMJ | 2014
Stephan Windecker; Stefan Stortecky; Giulio G. Stefanini; Bruno R da Costa; Anne Wilhelmina Saskia Rutjes; Marcello Di Nisio; Maria G Siletta; Ausilia Maione; Fernando Alfonso; Peter Clemmensen; Jean-Philippe Collet; Jochen Cremer; Volkmar Falk; Gerasimos Filippatos; Christian W. Hamm; Stuart J. Head; Arie Pieter Kappetein; Adnan Kastrati; Juhani Knuuti; Ulf Landmesser; Günther Laufer; Franz-Joseph Neumann; Dimitri Richter; Patrick Schauerte; Miguel Sousa Uva; David P. Taggart; Lucia Torracca; Marco Valgimigli; William Wijns; Adam Witkowski
Objective To investigate whether revascularisation improves prognosis compared with medical treatment among patients with stable coronary artery disease. Design Bayesian network meta-analyses to combine direct within trial comparisons between treatments with indirect evidence from other trials while maintaining randomisation. Eligibility criteria for selecting studies A strategy of initial medical treatment compared with revascularisation by coronary artery bypass grafting or Food and Drug Administration approved techniques for percutaneous revascularization: balloon angioplasty, bare metal stent, early generation paclitaxel eluting stent, sirolimus eluting stent, and zotarolimus eluting (Endeavor) stent, and new generation everolimus eluting stent, and zotarolimus eluting (Resolute) stent among patients with stable coronary artery disease. Data sources Medline and Embase from 1980 to 2013 for randomised trials comparing medical treatment with revascularisation. Main outcome measure All cause mortality. Results 100 trials in 93 553 patients with 262 090 patient years of follow-up were included. Coronary artery bypass grafting was associated with a survival benefit (rate ratio 0.80, 95% credibility interval 0.70 to 0.91) compared with medical treatment. New generation drug eluting stents (everolimus: 0.75, 0.59 to 0.96; zotarolimus (Resolute): 0.65, 0.42 to 1.00) but not balloon angioplasty (0.85, 0.68 to 1.04), bare metal stents (0.92, 0.79 to 1.05), or early generation drug eluting stents (paclitaxel: 0.92, 0.75 to 1.12; sirolimus: 0.91, 0.75 to 1.10; zotarolimus (Endeavor): 0.88, 0.69 to 1.10) were associated with improved survival compared with medical treatment. Coronary artery bypass grafting reduced the risk of myocardial infarction compared with medical treatment (0.79, 0.63 to 0.99), and everolimus eluting stents showed a trend towards a reduced risk of myocardial infarction (0.75, 0.55 to 1.01). 
The risk of subsequent revascularisation was noticeably reduced by coronary artery bypass grafting (0.16, 0.13 to 0.20) followed by new generation drug eluting stents (zotarolimus (Resolute): 0.26, 0.17 to 0.40; everolimus: 0.27, 0.21 to 0.35), early generation drug eluting stents (zotarolimus (Endeavor): 0.37, 0.28 to 0.50; sirolimus: 0.29, 0.24 to 0.36; paclitaxel: 0.44, 0.35 to 0.54), and bare metal stents (0.69, 0.59 to 0.81) compared with medical treatment. Conclusion Among patients with stable coronary artery disease, coronary artery bypass grafting reduces the risk of death, myocardial infarction, and subsequent revascularisation compared with medical treatment. All stent based coronary revascularisation technologies reduce the need for revascularisation to a variable degree. Our results provide evidence for improved survival with new generation drug eluting stents but no other percutaneous revascularisation technology compared with medical treatment.