
Publication


Featured research published by Carole Redding Flamm.


Annals of Internal Medicine | 2005

Challenges in systematic reviews of diagnostic technologies.

Athina Tatsioni; Deborah A. Zarin; Naomi Aronson; David J. Samson; Carole Redding Flamm; Christopher H. Schmid; Joseph Lau

Diagnostic tests, broadly construed, consist of any method of gathering information that may change a clinician's belief about the probability that a patient has a particular condition. Diagnosis is not an end in itself; rather, the purpose of a diagnostic test is to guide patient management decisions and thus improve patient outcomes. Because they are pivotal to health care decision making, diagnostic tests should be evaluated as rigorously as therapeutic interventions. A cursory search of the literature for a diagnostic technology may reveal many articles dealing with various aspects of the test, but these rarely include reports of trials that assess the outcomes of using the test to guide patient management. In the mid-1970s, several groups (1-4) developed a now widely adopted framework to evaluate diagnostic technologies by categorizing studies into 6 levels (5). This framework is hierarchical: level 1 consists of studies that address technical feasibility, and level 6 consists of those that address societal impact. Table 1 summarizes the framework and the key questions addressed by studies in each level.

[Table 1. Hierarchy of Diagnostic Evaluation and the Number of Studies Available for Different Levels of Diagnostic Test in a Technology Assessment of Magnetic Resonance Spectroscopy for Brain Tumors]

Evidence-based Practice Centers (EPCs) have produced several evidence reports and technology assessments of diagnostic technologies (www.ahrq.gov/clinic/techix.htm). This article uses 3 reports produced by the EPCs to illustrate the challenges involved in evaluating diagnostic technologies. The first assessed the use of magnetic resonance spectroscopy (MRS) to evaluate and manage brain masses; it exemplifies the challenges of identifying relevant studies and assessing the methodologic quality of diagnostic accuracy studies (6). The second, a report on technologies to diagnose acute cardiac ischemia, illustrates the problem of synthesizing studies that assess tests in different patient populations and report different outcomes (7); in particular, this report highlights the challenges of quantitatively combining data on test accuracy. The third report, on positron emission tomography (PET) for diagnosing and managing Alzheimer disease and dementia, exemplifies the challenges of assessing the societal impact of a diagnostic test (8). Finally, we discuss the problem of publication bias, which may bias the conclusions of a systematic review and meta-analysis.

Challenge: Identifying Relevant Published and Unpublished Studies

A report that assessed the value of MRS to diagnose and manage patients with space-occupying brain tumors demonstrates that there are few higher-level diagnostic test studies (8). Table 1 shows the number of studies and patients available for systematic review at each of the 6 levels of evaluation. Among the 97 studies that met the inclusion criteria, 85 were level 1 studies that addressed technical feasibility and optimization. In contrast, only 8 level 2 studies evaluated the ability of MRS to differentiate tumors from nontumors, assign tumor grades, and detect intracranial cystic lesions, or assessed the incremental value of MRS added to magnetic resonance imaging (MRI). These indications were sufficiently different that the studies could not be combined or compared. Three studies provided evidence that assessed impact on diagnostic thinking (level 3) or therapeutic choice (level 4). No studies assessed patient outcomes or societal impact (levels 5 and 6).

The case of MRS for the diagnosis and management of brain tumors illustrates a threshold problem in systematic reviews of diagnostic technologies: the availability of studies providing at least level 2 evidence, since diagnostic accuracy studies are the minimum level relevant to assessing the outcomes of using the test to guide patient management. Although direct evidence is preferred, robust diagnostic accuracy studies can be used to create a causal chain linking these studies with evidence on treatment effectiveness, thereby allowing an estimate of the effect on outcomes. The example of PET for Alzheimer disease, described later in this article, shows how decision analysis models can quantify the outcomes to be expected from use of a diagnostic technology to manage treatment.

The reliability of a systematic review hinges on the completeness of the information used in the assessment, so identifying all relevant data poses another challenge. The Hedges Team at McMaster University developed and tested special MEDLINE search strategies that retrieved up to 99% of scientifically strong studies of diagnostic tests (9). Although these search strategies are useful, they do not identify grey literature publications, which by their nature are not easily accessible. The Grey Literature Report is the first step in an initiative of the New York Academy of Medicine (www.nyam.org/library/grey.shtml) to collect grey literature items, which may include theses, conference proceedings, technical specifications and standards, noncommercial translations, bibliographies, technical and commercial documentation, and official documents not published commercially (10).

When diagnostic studies with poor test performance results go unpublished, a systematic review may exaggerate a test's true sensitivity and specificity. Because there are typically few studies in the categories of clinical impact, unpublished studies showing no benefit from the use of a diagnostic test have even greater potential to bias a review of the evidence. Of note, the problem of publication bias in randomized, controlled trials has been extensively studied, and several visual and statistical methods have been proposed to detect and correct for unpublished studies (11). Funnel plots, which assume symmetrical scattering of studies around a common estimate, are popular for assessing publication bias in randomized, controlled trials. However, the appearance of the funnel plot has been shown to depend on the choices of weight and metric (12). Funnel plots are being used in systematic reviews of diagnostic tests without adequate empirical assessment, so their use and interpretation should be viewed with caution; the validity of using a funnel plot to detect publication bias remains uncertain. Statistical models to detect and correct for publication bias in randomized trials also have limitations (13). One solution to the problem of publication bias is the mandatory registration of all clinical trials before patient enrollment; for therapeutic trials, considerable progress has already been made in this area. Such a clinical trials registry could readily apply to studies of the clinical outcomes of diagnostic tests (14).

Challenge: Assessing Methodologic Quality

Diagnostic test evaluations often have methodologic weaknesses (15-17). Of the 8 diagnostic accuracy studies of MRS, half had small sample sizes. Of the larger studies, all had limitations related to patient selection or potential for observer bias. The methodologic quality of a study has been defined as the extent to which all aspects of a study's design and conduct can be shown to protect against systematic bias, nonsystematic bias that may arise in poorly performed studies, and inferential error (18). Test accuracy studies often have important biases, which may result in unreliable estimates of the accuracy of a diagnostic test (19-22). Several proposals have been advanced for assessing the quality of a study evaluating diagnostic accuracy (23-25). Partly because of the lack of a true reference standard, there is no consensus on a single approach to assessing study quality (26). The lack of consistent relationships between specific quality elements and the magnitude of outcomes further complicates the task (27, 28). In addition, quality is assessed on the basis of reported information that does not necessarily reflect how the study was actually performed and analyzed. The Standards for Reporting of Diagnostic Accuracy (STARD) group recently published a 25-item checklist as a guide to improving the quality of reporting of all aspects of a diagnostic study (29). The STARD checklist was not developed as a tool to assess the quality of diagnostic studies; however, many of its items are included in a recently developed tool for quality assessment of diagnostic accuracy studies, the QUADAS tool. The QUADAS tool consists of 14 items that cover patient spectrum, reference standard, disease progression bias, verification and review bias, clinical review bias, incorporation bias, test execution, study withdrawals, and intermediate results (28, 30).

Challenge: Assessing Applicability of Study Populations

Studies beyond the level of technical feasibility must include both diseased and nondiseased individuals who reflect the use of the diagnostic technologies in actual clinical settings. Because of the need to understand the relationship between test sensitivity and specificity, a study that reports only sensitivity (that is, evaluates the test only in a diseased population) or only specificity (that is, evaluates the test only in a healthy population) cannot be used for this evaluation. In this section, we base our discussion on the evidence report on diagnostic technologies for acute cardiac ischemia in the emergency department (7). When the spectrum of disease ranges widely within a diseased population, the results of a diagnostic accuracy study may be affected if study participants possess only certain characteristics of the diseased population (15, 21). For example, patients in cardiac care units are more likely to have acute cardiac ischemia than patients in the emergency department. When patients with more severe illness are analyzed, the false-positive rate is reduced and sensitivity is overestimated. For example, biomar…
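The funnel-plot idea discussed above translates directly into a few lines of analysis code. The Python sketch below plots each study's log diagnostic odds ratio against its standard error; the per-study 2x2 counts are invented for illustration, and the visual symmetry check is the informal reading described above, not a method taken from the report itself.

    # Hedged sketch of a funnel plot for diagnostic accuracy studies.
    # The (TP, FP, FN, TN) counts are hypothetical; 0.5 is added to each
    # cell as the usual continuity correction to avoid division by zero.
    import numpy as np
    import matplotlib.pyplot as plt

    studies = np.array([[45, 5, 8, 42],
                        [20, 4, 6, 25],
                        [90, 12, 10, 88],
                        [15, 2, 7, 11]], dtype=float) + 0.5
    tp, fp, fn, tn = studies.T

    log_dor = np.log((tp * tn) / (fp * fn))      # log diagnostic odds ratio
    se = np.sqrt(1/tp + 1/fp + 1/fn + 1/tn)      # approximate standard error

    # Small, imprecise studies (large SE) should scatter symmetrically around
    # the pooled estimate; a one-sided gap suggests unpublished unfavorable studies.
    plt.scatter(log_dor, se)
    plt.axvline(np.average(log_dor, weights=1/se**2))  # inverse-variance weighted mean
    plt.gca().invert_yaxis()                            # convention: precise studies at top
    plt.xlabel("log diagnostic odds ratio")
    plt.ylabel("standard error")
    plt.show()

As the text cautions, how asymmetric such a plot looks depends on the chosen metric and weights, so it is a screening aid rather than a test.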


Academic Radiology | 2002

Should FDG PET be used to decide whether a patient with an abnormal mammogram or breast finding at physical examination should undergo biopsy?

David J. Samson; Carole Redding Flamm; Etta D. Pisano; Naomi Aronson

RATIONALE AND OBJECTIVES The purpose of this systematic review was to assess the performance of fluorodeoxyglucose positron emission tomography (PET) in differentiating benign from malignant lesions among patients with abnormal mammograms or a palpable breast mass, and to examine the effects of PET findings on patient care and health outcomes.

MATERIALS AND METHODS A search of the MEDLINE and CancerLit databases covered articles entered between January 1966 and March 2001. Thirteen articles met the selection criteria. Each article was assessed for study quality characteristics. Meta-analysis was performed with a random effects model and a summary receiver operating characteristic curve.

RESULTS A point on the summary receiver operating characteristic curve was selected that reflected average performance, with an estimated sensitivity of 89% and a specificity of 80%. When the prevalence of malignancy is 50%, 40% of all patients (those with true-negative PET results) would benefit by avoiding the harm of a biopsy. The risk of a false-negative result, leading to delayed diagnosis and treatment, is 5.5% of all patients. The negative predictive value is 87.9%; thus, the false-negative risk among patients with negative scans is 12.1%. For a patient with a negative PET scan, a 12% chance of a missed or delayed diagnosis of breast cancer is probably too high to make it worth the 88% chance of avoiding biopsy of a benign lesion.

CONCLUSION The evidence does not favor the use of fluorodeoxyglucose PET to help decide whether to perform biopsy. Available studies omit a critical segment of the biopsy population, those with indeterminate mammograms or nonpalpable masses, for which no conclusions can be reached.
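The figures in the results follow from the standard definitions of predictive values. Below is a minimal Python sketch, assuming the abstract's operating point (sensitivity 89%, specificity 80%) and 50% prevalence; the function name and output layout are illustrative, not from the original report.

    # Per-patient probabilities for a cohort screened with PET before biopsy.
    def negative_scan_risk(sens: float, spec: float, prev: float) -> dict:
        true_neg = spec * (1 - prev)     # benign lesion, correctly negative scan
        false_neg = (1 - sens) * prev    # malignancy missed by the scan
        npv = true_neg / (true_neg + false_neg)
        return {
            "avoid_biopsy": true_neg,    # fraction of all patients spared biopsy
            "missed_cancer": false_neg,  # fraction of all patients with delayed diagnosis
            "npv": npv,
            "risk_given_negative_scan": 1 - npv,
        }

    print(negative_scan_risk(sens=0.89, spec=0.80, prev=0.50))
    # -> avoid_biopsy 0.40, missed_cancer 0.055, npv 0.879, risk 0.121

Rerunning the same function at a lower prevalence raises the negative predictive value, which is one reason the conclusion depends on which segment of the biopsy population a study enrolls.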


Gastrointestinal Endoscopy | 2002

Evidence-based assessment of ERCP in the treatment of pancreatitis

David H. Mark; Frank Lefevre; Carole Redding Flamm; Naomi Aronson

This article reports the results of an evidence-based assessment of ERCP for the treatment of pancreatitis [1]. Pancreatitis encompasses a number of distinct entities with differing etiologies, clinical expression, and treatment options; each is addressed separately to the extent allowed by the available literature. In addition, a number of different endoscopic techniques are used in varying clinical situations. For the purposes of this paper, "ERCP" refers to the spectrum of interventional endoscopic techniques used in the treatment of pancreatitis.

METHODS The methods used in the systematic review are summarized in the Methods article included in this supplement [2] and are described in detail in the full evidence report [1]. The protocol for this systematic review prospectively defined the study objectives; search strategy; patient populations of interest; study selection criteria; outcomes of interest; data elements to be abstracted and methods for abstraction; and methods for study quality assessment. Briefly, the initial selection criteria for this systematic review were full-length reports of comparative studies published in English in peer-reviewed journals, with a minimum of 25 patients per treatment arm. Assessment of study quality was adapted from that of the U.S. Preventive Services Task Force [3]. Outcomes of interest included measures of technical success, clinical success, resource utilization, and procedure-related morbidity. Because of a paucity of literature, especially the lack of comparative trials, very few studies on the treatment of recurrent or chronic pancreatitis met the initial selection criteria. The selection criteria were therefore relaxed so that this question could be examined. Concurrently controlled studies comparing ERCP with a therapeutic alternative were included regardless of sample size. Single-arm observational studies (in which each subject serves as his or her own control) of ERCP treatment of chronic pancreatitis with a minimum of 25 patients were included if the study selected a well-defined population, used appropriate outcome measures, and reported baseline evaluation and 6-month follow-up data. Single-arm studies of ERCP in pancreas divisum were also included subject to the same conditions, but regardless of sample size.
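The staged selection criteria described above amount to a simple screening rule. Here is a minimal Python sketch of the initial criteria, assuming a hypothetical Study record whose field names are inventions for illustration, not part of the original report's tooling.

    # Screening rule for the initial (pre-relaxation) selection criteria.
    from dataclasses import dataclass

    @dataclass
    class Study:
        full_length: bool        # full-length report, not an abstract
        english: bool            # published in English
        peer_reviewed: bool      # appeared in a peer-reviewed journal
        comparative: bool        # compares ERCP with an alternative
        arm_sizes: list          # patients enrolled in each treatment arm

    def meets_initial_criteria(s: Study, min_per_arm: int = 25) -> bool:
        return (s.full_length and s.english and s.peer_reviewed
                and s.comparative
                and all(n >= min_per_arm for n in s.arm_sizes))

    # Example: a two-arm comparative trial with 30 and 28 patients passes.
    print(meets_initial_criteria(Study(True, True, True, True, [30, 28])))  # True

The relaxed criteria for recurrent or chronic pancreatitis would drop the sample-size floor for concurrently controlled studies and admit qualifying single-arm designs, which is straightforward to express as a second predicate.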


Journal of the National Cancer Institute | 2001

Epoetin Treatment of Anemia Associated With Cancer Therapy: A Systematic Review and Meta-analysis of Controlled Clinical Trials

Jerome Seidenfeld; Margaret Piper; Carole Redding Flamm; Victor Hasselblad; James O. Armitage; Charles L. Bennett; Michael S. Gordon; Allan E. Lichtin; James L. Wade; Steven H. Woolf; Naomi Aronson


Gastrointestinal Endoscopy | 2002

Evidence-based assessment of ERCP approaches to managing pancreaticobiliary malignancies

Carole Redding Flamm; David H. Mark; Naomi Aronson


Gastrointestinal Endoscopy | 2002

Evidence-based assessment: patient, procedure, or operator factors associated with ERCP complications

Naomi Aronson; Carole Redding Flamm; Rhonda L. Bohn; David H. Mark; Theodore Speroff


Gastrointestinal Endoscopy | 2002

Evidence-based assessment of diagnostic modalities for common bile duct stones

David H. Mark; Carole Redding Flamm; Naomi Aronson


Journal of the American College of Radiology | 2005

Wireless Capsule Endoscopy in Patients with Obscure Small-Intestinal Bleeding

Kathleen M Ziegler; Carole Redding Flamm; Naomi Aronson


Gastrointestinal Endoscopy | 2002

Evidence-based review of ERCP: introduction and description of systematic review methods.

Carole Redding Flamm; David H. Mark; Naomi Aronson


Evidence Report/Technology Assessment (Summary) | 2001

Uses of epoetin for anemia in oncology.

Jerome Seidenfeld; Naomi Aronson; Margaret Piper; Carole Redding Flamm; Vic Hasselblad; Kathleen M Ziegler

Collaboration


Dive into Carole Redding Flamm's collaboration.

Top Co-Authors

Naomi Aronson (Blue Cross Blue Shield Association)
Kathleen M Ziegler (Blue Cross Blue Shield Association)
David H. Mark (Blue Cross Blue Shield Association)
Margaret Piper (University of Illinois at Chicago)
David J. Samson (Blue Cross Blue Shield Association)