Publication


Featured research published by Jonathan J Deeks.


BMJ | 2003

Measuring inconsistency in meta-analyses

Julian P. T. Higgins; Simon G. Thompson; Jonathan J Deeks; Douglas G. Altman

Cochrane Reviews have recently started including the quantity I² to help readers assess the consistency of the results of studies in meta-analyses. What does this new quantity mean, and why is assessment of heterogeneity so important to clinical practice? Systematic reviews and meta-analyses can provide convincing and reliable evidence relevant to many aspects of medicine and health care.1 Their value is especially clear when the results of the studies they include show clinically important effects of similar magnitude. However, the conclusions are less clear when the included studies have differing results. In an attempt to establish whether studies are consistent, reports of meta-analyses commonly present a statistical test of heterogeneity. The test seeks to determine whether there are genuine differences underlying the results of the studies (heterogeneity), or whether the variation in findings is compatible with chance alone (homogeneity). However, the test is susceptible to the number of trials included in the meta-analysis. We have developed a new quantity, I², which we believe gives a better measure of the consistency between trials in a meta-analysis. Assessment of the consistency of effects across studies is an essential part of meta-analysis. Unless we know how consistent the results of studies are, we cannot determine the generalisability of the findings of the meta-analysis. Indeed, several hierarchical systems for grading evidence state that the results of studies must be consistent or homogeneous to obtain the highest grading.2–4 Tests for heterogeneity are commonly used to decide on methods for combining studies and for concluding consistency or inconsistency of findings.5 6 But what does the test achieve in practice, and how should the resulting P values be interpreted? A test for heterogeneity examines the null hypothesis that all studies are evaluating the same effect. The usual test statistic …
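The abstract stops before the definition, but I² has a simple closed form: I² = 100% × (Q − df)/Q, truncated at zero, where Q is Cochran's heterogeneity statistic and df is the number of studies minus one. Below is a minimal Python sketch, assuming an inverse-variance fixed-effect pooled estimate; the function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def i_squared(effects, variances):
    """Cochran's Q and the I^2 statistic for a meta-analysis (illustrative sketch).

    effects: per-study effect estimates (e.g. log odds ratios);
    variances: their within-study variances.
    """
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)     # inverse-variance weights
    pooled = np.sum(weights * effects) / np.sum(weights)   # fixed-effect pooled estimate
    q = np.sum(weights * (effects - pooled) ** 2)          # Cochran's Q
    df = len(effects) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0  # I^2 as a percentage
    return q, i2
```

I² describes the percentage of total variation across studies that is due to heterogeneity rather than chance, which is why it is less sensitive to the number of trials than the P value of the heterogeneity test.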


Annals of Internal Medicine | 2011

QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies

Penny F Whiting; Anne Wilhelmina Saskia Rutjes; Marie Westwood; Susan Mallett; Jonathan J Deeks; Johannes B. Reitsma; Mariska M.G. Leeflang; Jonathan A C Sterne; Patrick M. Bossuyt

In 2003, the QUADAS tool for systematic reviews of diagnostic accuracy studies was developed. Experience, anecdotal reports, and feedback suggested areas for improvement; therefore, QUADAS-2 was developed. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias. The QUADAS-2 tool is applied in 4 phases: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability. This tool will allow for more transparent rating of bias and applicability of primary diagnostic accuracy studies.
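As a rough illustration of the tool's structure only (this is not an official encoding; the class and field names below are hypothetical), the four domains with their two judgement dimensions could be represented like this:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Judgement(Enum):
    """Judgement levels used throughout QUADAS-2."""
    LOW = "low"
    HIGH = "high"
    UNCLEAR = "unclear"

# The four QUADAS-2 domains; the first three are also rated for applicability.
DOMAINS = ("patient selection", "index test", "reference standard", "flow and timing")

@dataclass
class DomainAssessment:
    """Hypothetical record of one domain's assessment for one study."""
    domain: str
    risk_of_bias: Judgement
    applicability_concern: Optional[Judgement] = None  # None for "flow and timing"
```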


BMJ | 2011

Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials

Jonathan A C Sterne; Alex J. Sutton; John P. A. Ioannidis; Norma Terrin; David R. Jones; Joseph Lau; James Carpenter; Gerta Rücker; Roger Harbord; Christopher H. Schmid; Jennifer Tetzlaff; Jonathan J Deeks; Jaime Peters; Petra Macaskill; Guido Schwarzer; Sue Duval; Douglas G. Altman; David Moher; Julian P. T. Higgins

Funnel plots, and tests for funnel plot asymmetry, have been widely used to examine bias in the results of meta-analyses. Funnel plot asymmetry should not be equated with publication bias, because it has a number of other possible causes. This article describes how to interpret funnel plot asymmetry, recommends appropriate tests, and explains the implications for choice of meta-analysis model.
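The article itself sets out which asymmetry tests are appropriate under which circumstances; as an illustration only, here is the classic Egger regression test in Python (function name and interface are mine). It regresses the standardized effect on precision, and an intercept far from zero suggests small-study effects:

```python
import numpy as np
from scipy import stats

def egger_test(effects, ses):
    """Egger's regression test for funnel plot asymmetry (illustrative sketch).

    Regresses the standardized effect (effect/SE) on precision (1/SE);
    an intercept far from zero suggests small-study effects.
    """
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    z = effects / ses                                    # standardized effects
    X = np.column_stack([np.ones_like(ses), 1.0 / ses])  # intercept + precision
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)         # ordinary least squares fit
    resid = z - X @ beta
    n = len(z)
    sigma2 = resid @ resid / (n - 2)                     # residual variance
    se_intercept = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])
    t_stat = beta[0] / se_intercept
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)      # two-sided P for the intercept
    return beta[0], p_value
```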


BMJ | 2001

Systematic reviews of evaluations of diagnostic and screening tests

Jonathan J Deeks

This is the third in a series of four articles. Tests are routinely used in medicine to screen for, diagnose, grade, and monitor the progression of disease. Diagnostic information is obtained from a multitude of sources, including imaging and biochemical technologies, pathological and psychological investigations, and signs and symptoms elicited during history taking and clinical examinations.1 Each of these items of information can be regarded as a result of a separate diagnostic or screening “test.” Systematic reviews of evaluations of tests are undertaken for the same reasons as systematic reviews of treatment interventions: to produce estimates of test performance and impact based on all available evidence, to evaluate the quality of published studies, and to account for variation in findings between studies.2–5 Reviews of studies of diagnostic accuracy involve the same key stages of defining questions, searching the literature, evaluating studies for eligibility and quality, and extracting and synthesising data. However, studies that evaluate the accuracy of tests have a unique design requiring different criteria to appropriately assess the quality of studies and the potential for bias. Additionally, each study reports a pair of related summary statistics (for example, sensitivity and specificity) rather than a single statistic (such as a risk ratio) and hence requires different statistical methods to pool the results of the studies. This article concentrates on the dimensions of study quality and the advantages and disadvantages of different summary statistics for combining studies in meta-analysis. Other aspects, including searching the literature and further technical details, are discussed elsewhere.6

Summary points:
- Systematic reviews of studies of diagnostic accuracy differ from other systematic reviews in the assessment of study quality and the statistical methods used to combine results
- Important aspects of study quality include the selection of a clinically relevant cohort, the consistent use of a …
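To make the "pair of related summary statistics" concrete, here is a minimal Python sketch (names are illustrative) computing sensitivity and specificity from a single study's 2×2 table, together with the diagnostic odds ratio, one way of collapsing the pair into a single number:

```python
def accuracy_from_2x2(tp: int, fp: int, fn: int, tn: int):
    """Paired accuracy statistics from one study's 2x2 table (illustrative sketch).

    Assumes no zero cells; real analyses often add a 0.5 continuity correction.
    tp/fp/fn/tn = true/false positives and negatives against the reference standard.
    """
    sensitivity = tp / (tp + fn)  # diseased patients correctly testing positive
    specificity = tn / (tn + fp)  # non-diseased patients correctly testing negative
    dor = (tp / fn) / (fp / tn)   # diagnostic odds ratio: a single-number summary
    return sensitivity, specificity, dor

# Example: 90 true positives, 10 false negatives, 20 false positives, 80 true negatives
print(accuracy_from_2x2(tp=90, fp=20, fn=10, tn=80))  # (0.9, 0.8, 36.0)
```

Because sensitivity and specificity move together as the test threshold shifts, pooling each one separately can mislead, which is why meta-analysis of accuracy studies needs methods that respect the pairing.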


BMJ | 2003

Validity of indirect comparison for estimating efficacy of competing interventions: empirical evidence from published meta-analyses

Fujian Song; Douglas G. Altman; Anne-Marie Glenny; Jonathan J Deeks

Objective: To determine the validity of adjusted indirect comparisons by using data from published meta-analyses of randomised trials.
Design: Direct comparison of different interventions in randomised trials and adjusted indirect comparison in which two interventions were compared through their relative effect versus a common comparator. The discrepancy between the direct and adjusted indirect comparison was measured by the difference between the two estimates.
Data sources: Database of abstracts of reviews of effectiveness (1994-8), the Cochrane database of systematic reviews, Medline, and references of retrieved articles.
Results: 44 published meta-analyses (from 28 systematic reviews) provided sufficient data. In most cases, results of adjusted indirect comparisons were not significantly different from those of direct comparisons. A significant discrepancy (P<0.05) was observed in three of the 44 comparisons between the direct and the adjusted indirect estimates. There was moderate agreement between the statistical conclusions from the direct and adjusted indirect comparisons (κ = 0.51). The direction of discrepancy between the two estimates was inconsistent.
Conclusions: Adjusted indirect comparisons usually but not always agree with the results of head to head randomised trials. When there is no or insufficient direct evidence from randomised trials, the adjusted indirect comparison may provide useful or supplementary information on the relative efficacy of competing interventions. The validity of the adjusted indirect comparisons depends on the internal validity and similarity of the included trials.
What is already known on this topic:
- Many competing interventions have not been compared in randomised trials
- Indirect comparison of competing interventions has been carried out in systematic reviews, often implicitly
- Indirect comparison adjusted by a common control can partially take account of prognostic characteristics of patients in different trials
What this study adds:
- Results of adjusted indirect comparison usually, but not always, agree with those of head to head randomised trials
- The validity of adjusted indirect comparisons depends on the internal validity and similarity of the trials involved
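The adjusted indirect comparison evaluated here contrasts two interventions through their relative effects against a common comparator, with the variances of the two estimates adding. A minimal Python sketch on the log odds ratio scale (the function name and interface are mine):

```python
import math

def adjusted_indirect_comparison(log_or_ab: float, se_ab: float,
                                 log_or_cb: float, se_cb: float):
    """Adjusted indirect comparison of A vs C via common comparator B
    (illustrative sketch on the log odds ratio scale).
    """
    log_or_ac = log_or_ab - log_or_cb           # contrast the two effects vs B
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)  # variances of the estimates add
    lower = math.exp(log_or_ac - 1.96 * se_ac)  # 95% CI back on the OR scale
    upper = math.exp(log_or_ac + 1.96 * se_ac)
    return math.exp(log_or_ac), (lower, upper)
```

The added variance term is why indirect estimates are less precise than head to head trials of the same size, and why the method's validity rests on the similarity of the trials being bridged.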


BMJ | 2011

Interpretation of random effects meta-analyses

Richard D Riley; Julian P. T. Higgins; Jonathan J Deeks

Summary estimates of treatment effect from random effects meta-analysis give only the average effect across all studies. Inclusion of prediction intervals, which estimate the likely effect in an individual setting, could make it easier to apply the results to clinical practice
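The paper's proposal can be stated compactly: with k studies, a random effects summary estimate mu with standard error SE(mu), and between-study variance tau², an approximate 95% prediction interval is mu ± t(k−2) × sqrt(tau² + SE(mu)²). A minimal Python sketch of that formula (function name mine):

```python
import math
from scipy import stats

def prediction_interval_95(mu: float, se_mu: float, tau2: float, k: int):
    """Approximate 95% prediction interval for the effect in a new setting,
    following mu +/- t(k-2) * sqrt(tau2 + se_mu^2). Requires k >= 3 studies.
    """
    t = stats.t.ppf(0.975, df=k - 2)               # t quantile with k-2 degrees of freedom
    half_width = t * math.sqrt(tau2 + se_mu ** 2)  # heterogeneity + estimation uncertainty
    return mu - half_width, mu + half_width
```

Unlike the confidence interval for the average effect, this interval widens with tau², so it can include no effect even when the summary estimate is conventionally significant.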


Annals of Internal Medicine | 2008

Systematic reviews of diagnostic test accuracy

Mariska M.G. Leeflang; Jonathan J Deeks; Constantine Gatsonis; Patrick M. Bossuyt

Diagnosis is a critical component of health care, and clinicians, policymakers, and patients routinely face a range of questions regarding diagnostic tests. They want to know whether testing improves outcome; what test to use, purchase, or recommend in practice guidelines; and how to interpret test results. Well-designed diagnostic test accuracy studies can help in making these decisions, provided that they transparently and fully report their participants, tests, methods, and results as facilitated, for example, by the STARD (Standards for Reporting of Diagnostic Accuracy) statement (1). That 25-item checklist was published in many journals and is now adopted by more than 200 scientific journals worldwide. As in other areas of science, systematic reviews and meta-analysis of accuracy studies can be used to obtain more precise estimates when small studies addressing the same test and patients in the same setting are available. Reviews can also be useful to establish whether and how scientific findings vary by particular subgroups, and may provide summary estimates with a stronger generalizability than estimates from a single study. Systematic reviews may help identify the risk for bias that may be present in the original studies and can be used to address questions that were not directly considered in the primary studies, such as comparisons between tests. The Cochrane Collaboration is the largest international organization preparing, maintaining, and promoting systematic reviews to help people make well-informed decisions about health care (2). The Collaboration decided in 2003 to make preparations for including systematic reviews of diagnostic test accuracy in their Cochrane Database of Systematic Reviews. To enable this, a working group (Appendix) was formed to develop methodology, software, and a handbook. The first diagnostic test accuracy review was published in the Cochrane Database in October 2008. In this paper, we review recent methodological developments concerning problem formulation, location of literature, quality assessment, and meta-analysis of diagnostic accuracy studies by using our experience from the work on the Cochrane Handbook. The information presented here is based on the recent literature and updates previously published guidelines by Irwig and colleagues (3).

Definition of the Objectives of the Review

Diagnostic test accuracy refers to the ability of a test to distinguish between patients with disease (or more generally, a specified target condition) and those without. In a study of test accuracy, the results of the test under evaluation, the index test, are compared with those of the reference standard determined in the same patients. The reference standard is an agreed-on and accurate method for identifying patients who have the target condition. Test results are typically categorized as positive or negative for the target condition. By using such binary test outcomes, the accuracy is most often expressed as the test's sensitivity (the proportion of patients with positive results on the reference standard that are also positive on the index test) and specificity (the proportion of patients with negative results on the reference standard that are also negative on the index test). Other measures have been proposed and are in use (4-6).

It has long been recognized that test accuracy is not a fixed property of a test. It can vary between patient subgroups, with their spectrum of disease, with the clinical setting, or with the test interpreters and may depend on the results of previous testing. For this reason, inclusion of these elements in the study question is essential. In order to make a policy decision to promote use of a new index test, evidence is required that using the new test increases test accuracy over other testing options, including current practice, or that the new test has equivalent accuracy but offers other advantages (7-9). As with the evaluation of interventions, systematic reviews need to include comparative analyses between alternative testing strategies and should not focus solely on evaluating the performance of a test in isolation.

In relation to the existing situation, 3 possible roles for a new test can be defined: replacement, triage, and add-on (7). If a new test is to replace an existing test, then comparing the accuracy of both tests on the same population and with the same reference standard provides the most direct evidence. In triage, the new test is used before the existing test or testing pathway, and only patients with a particular result on the triage test continue the testing pathway. When a test is needed to rule out disease in patients who then need no further testing, a test that gives a minimal proportion of false-negative results and thus a relatively high sensitivity should be used. Triage tests may be less accurate than existing ones, but they have other advantages, such as simplicity or low cost. A third possible role of a new test is add-on. The new test is then positioned after the existing testing pathway to identify false-positive or false-negative results after the existing pathway. The review should provide data to assess the incremental change in accuracy made by adding the new test.

An example of a replacement question can be found in a systematic review of the diagnostic accuracy of urinary markers for primary bladder cancer (10). Clinicians may use cytology to triage patients before they undergo invasive cystoscopy, the reference standard for bladder cancer. Because cytology combines high specificity with low sensitivity (11), the goal of the review was to identify a tumor marker with sufficient accuracy to either replace cytology or be used in addition to cytology. For a marker to replace cytology, it has to achieve equally high specificity with improved sensitivity. New markers that are sensitive but not specific may have roles as adjuncts to conventional testing. The review included studies in which the test under evaluation (several different tumor markers and cytology) was evaluated against cystoscopy or histopathology. Included studies compared 1 or more of the markers, cytology only, or a combination of markers and cytology.

Although information on accuracy can help clinicians make decisions about tests, good diagnostic accuracy is a desirable but not sufficient condition for the effectiveness of a test (8). To demonstrate that using a new test does more good than harm to patients tested, randomized trials of test-and-treatment strategies and reviews of such trials may be necessary. However, with the possible exception of screening, in most cases, such randomized trials are not available and systematic reviews of test accuracy may provide the most useful evidence available to guide clinical and health policy decision making and use as input for decision and cost-effectiveness analysis (12).

Identification and Selection of Studies

Identifying test accuracy studies is more difficult than searching for randomized trials (13). There is not a clear, unequivocal keyword or indexing term for an accuracy study in literature databases comparable with the term "randomized, controlled trial." The Medical Subject Heading "sensitivity and specificity" may look suitable but is inconsistently applied in most electronic bibliographic databases. Furthermore, data on diagnostic test accuracy may be hidden in studies that did not have test accuracy estimation as their primary objective. This complicates the efficient identification of diagnostic test accuracy studies in electronic databases, such as MEDLINE. Until indexing systems properly code studies of test accuracy, searching for them will remain challenging and may require additional manual searches, such as screening reference lists.

In the development of a comprehensive search strategy, review authors can use search strings that refer to the test(s) under evaluation, the target condition, and the patient description or a subset of these. For tests with a clear name that are used for a single purpose, searching for publications in which those tests are mentioned may suffice. For other reviews, adding the patient description may be necessary, although this is also often poorly indexed. A search strategy in MEDLINE should contain both Medical Subject Headings and free text words. A search strategy for articles about tests for bladder cancer, for example, should include as many synonyms for bladder cancer as possible in the search strategy, including neoplasm, carcinoma, transitional cell, and hematuria. Several methodological electronic search filters for diagnostic test accuracy studies have been developed, each attempting to restrict the search to articles that are most likely to be test accuracy studies (13-16). These filters rely on indexing terms for research methodology and text words used in reporting results, but they often miss relevant studies and are unlikely to decrease the number of articles one needs to screen. Therefore, they are not recommended for systematic reviews (17, 18). The incremental value of searching in languages other than English and in the gray literature has not yet been fully investigated.

In systematic reviews of intervention studies, publication bias is an important and well-studied form of bias in which the decision to report and publish studies is linked to their findings. For clinical trials, the magnitude and determinants of publication bias have been identified by tracing the publication history of cohorts of trials reviewed by ethics committees and research boards (19). A consistent observation has been that studies with significant results are more likely to be published than studies with nonsignificant findings (19). Investigating publication bias for diagnostic tests is problematic, because many studies are done without ethical review or study registration; therefore, identification of cohorts of studies from registration to final publication status is …
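To illustrate the add-on role described above: if a second test is applied after the first and a patient is called positive when either test is positive ("believe the positive"), then under the strong and often unrealistic assumption that the two tests err independently, the combined accuracy is easy to sketch. The function name and the independence assumption are mine, not from the paper:

```python
def believe_the_positive(sens1: float, spec1: float, sens2: float, spec2: float):
    """Combined accuracy of a two-test 'believe the positive' strategy,
    assuming conditional independence of test errors (illustration only)."""
    sens = sens1 + (1 - sens1) * sens2  # positive if either test is positive
    spec = spec1 * spec2                # negative only if both tests are negative
    return sens, spec

# Example: an add-on test raises sensitivity but costs specificity
print(believe_the_positive(0.7, 0.95, 0.8, 0.9))  # (0.94, 0.855)
```

The trade-off visible here is the incremental change in accuracy that, as the authors note, a review of an add-on test should provide data to assess.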


The Lancet | 2003

Comparison of T-cell-based assay with tuberculin skin test for diagnosis of Mycobacterium tuberculosis infection in a school tuberculosis outbreak

Katie Ewer; Jonathan J Deeks; Lydia Alvarez; Gerry Bryant; Sue Waller; Peter Andersen; Philip Monk; Ajit Lalvani

Background: The diagnosis of latent tuberculosis infection relies on the tuberculin skin test (TST), which has many drawbacks. However, to find out whether new tests are better than TST is difficult because of the lack of a gold standard test for latent infection. We developed and assessed a sensitive enzyme-linked immunospot (ELISPOT) assay to detect T cells specific for Mycobacterium tuberculosis antigens that are absent from Mycobacterium bovis BCG and most environmental mycobacteria. We postulated that if the ELISPOT is a more accurate test of latent infection than TST, it should correlate better with degree of exposure to M tuberculosis.
Methods: A large tuberculosis outbreak in a UK school resulted from one infectious index case. We tested 535 students for M tuberculosis infection with TST and ELISPOT. We compared the correlation of these tests with degree of exposure to the index case and BCG vaccination.
Findings: Although agreement between the tests was high (89% concordance, kappa=0.72, p<0.0001), ELISPOT correlated significantly more closely with M tuberculosis exposure than did TST on the basis of measures of proximity (p=0.03) and duration of exposure (p=0.007) to the index case. TST was significantly more likely to be positive in BCG-vaccinated than in non-vaccinated students (p=0.002), whereas ELISPOT results were not associated with BCG vaccination (p=0.44).
Interpretation: ELISPOT offers a more accurate approach than TST for identification of individuals who have latent tuberculosis infection and could improve tuberculosis control by more precise targeting of preventive treatment.
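The "kappa=0.72" reported above is Cohen's kappa, the chance-corrected agreement between the two tests. A minimal Python sketch for two binary tests (interface mine):

```python
def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for agreement between two binary tests (illustrative sketch).

    a: both tests positive, b: test 1 positive only,
    c: test 2 positive only, d: both tests negative.
    """
    n = a + b + c + d
    observed = (a + d) / n                    # proportion of concordant results
    p1 = (a + b) / n                          # marginal positive rate, test 1
    p2 = (a + c) / n                          # marginal positive rate, test 2
    expected = p1 * p2 + (1 - p1) * (1 - p2)  # agreement expected by chance
    return (observed - expected) / (1 - expected)
```

Kappa discounts the agreement that two tests would show by chance alone, which is why it is reported alongside raw concordance.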


BMJ | 2016

ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions

Jonathan A C Sterne; Miguel A. Hernán; Barnaby C Reeves; Jelena Savovic; Nancy D Berkman; Meera Viswanathan; David Henry; Douglas G. Altman; Mohammed T Ansari; Isabelle Boutron; James Carpenter; An-Wen Chan; Rachel Churchill; Jonathan J Deeks; Asbjørn Hróbjartsson; Jamie Kirkham; Peter Jüni; Yoon K. Loke; Theresa D Pigott; Craig Ramsay; Deborah Regidor; Hannah R. Rothstein; Lakhbir Sandhu; Pasqualina Santaguida; Holger J. Schunemann; B. Shea; Ian Shrier; Peter Tugwell; Lucy Turner; Jeffrey C. Valentine

Non-randomised studies of the effects of interventions are critical to many areas of healthcare evaluation, but their results may be biased. It is therefore important to understand and appraise their strengths and weaknesses. We developed ROBINS-I (“Risk Of Bias In Non-randomised Studies - of Interventions”), a new tool for evaluating risk of bias in estimates of the comparative effectiveness (harm or benefit) of interventions from studies that did not use randomisation to allocate units (individuals or clusters of individuals) to comparison groups. The tool will be particularly useful to those undertaking systematic reviews that include non-randomised studies.

Collaboration


Dive into Jonathan J Deeks's collaboration.

Top Co-Authors:

Jane P Daniels (University of Birmingham)
Ajit Lalvani (National Institutes of Health)
Pallavi Latthe (University of Birmingham)
Pelham Barton (University of Birmingham)