Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where William DuMouchel is active.

Publication


Featured research published by William DuMouchel.


Journal of the American Medical Informatics Association | 1996

A Meta-analysis of 16 Randomized Controlled Trials to Evaluate Computer-Based Clinical Reminder Systems for Preventive Care in the Ambulatory Setting

Steven Shea; William DuMouchel; Lisa Bahamonde

OBJECTIVE Computer-based reminder systems have the potential to change physician and patient behaviors and to improve patient outcomes. We performed a meta-analysis of published randomized controlled trials to assess the overall effectiveness of computer-based reminder systems in ambulatory settings directed at preventive care. DESIGN Meta-analysis. SEARCH STRATEGY Searches of the Medline (1966-1994), Nursing and Allied Health (1982-1994), and Health Planning and Administration (1975-1994) databases identified 16 randomized, controlled trials of computer-based reminder systems in ambulatory settings. STATISTICAL METHODS A weighted mixed effects model regression analysis was used to estimate intervention effects for computer and manual reminder systems for six classes of preventive practices. MAIN OUTCOME MEASURE Adjusted odds ratio for preventive practices. RESULTS Computer reminders improved preventive practices compared with the control condition for vaccinations (adjusted odds ratio [OR] 3.09; 95% confidence interval [CI] 2.39-4.00), breast cancer screening (OR 1.88; 95% CI 1.44-2.45), colorectal cancer screening (OR 2.25; 95% CI 1.74-2.91), and cardiovascular risk reduction (OR 2.01; 95% CI 1.55-2.61) but not cervical cancer screening (OR 1.15; 95% CI 0.89-1.49) or other preventive care (OR 1.02; 95% CI 0.79-1.32). For all six classes of preventive practices combined the adjusted OR was 1.77 (95% CI 1.38-2.27). CONCLUSION Evidence from randomized controlled studies supports the effectiveness of data-driven computer-based reminder systems to improve prevention services in the ambulatory care setting.
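
To make the pooling step concrete, here is a minimal sketch in the spirit of the weighted analysis above: it pools per-trial odds ratios on the log scale with DerSimonian-Laird random-effects weights, a simpler stand-in for the paper's mixed-effects regression model. The per-trial ORs and CIs below are invented, not the trials from the paper.

    import math

    def pool_odds_ratios(ors, cis):
        """Pool per-trial odds ratios given their (lower, upper) 95% CIs."""
        # Recover the standard error of each log OR from its 95% CI width.
        logs = [math.log(o) for o in ors]
        ses = [(math.log(hi) - math.log(lo)) / (2 * 1.96) for lo, hi in cis]
        w = [1 / se ** 2 for se in ses]                 # fixed-effect weights
        fixed = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        # DerSimonian-Laird estimate of the between-trial variance tau^2.
        q = sum(wi * (li - fixed) ** 2 for wi, li in zip(w, logs))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(ors) - 1)) / c)
        # Random-effects weights give the pooled estimate and its CI.
        w_re = [1 / (se ** 2 + tau2) for se in ses]
        pooled = sum(wi * li for wi, li in zip(w_re, logs)) / sum(w_re)
        se_p = math.sqrt(1 / sum(w_re))
        return tuple(math.exp(x) for x in
                     (pooled, pooled - 1.96 * se_p, pooled + 1.96 * se_p))

    # Hypothetical per-trial ORs and 95% CIs for illustration only.
    est, lo, hi = pool_odds_ratios([2.5, 1.8, 3.2],
                                   [(1.5, 4.2), (1.1, 2.9), (1.9, 5.4)])
    print(f"pooled OR {est:.2f} (95% CI {lo:.2f}-{hi:.2f})")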


The American Statistician | 1999

Bayesian Data Mining in Large Frequency Tables, with an Application to the FDA Spontaneous Reporting System

William DuMouchel

Abstract A common data mining task is the search for associations in large databases. Here we consider the search for “interestingly large” counts in a large frequency table, having millions of cells, most of which have an observed frequency of 0 or 1. We first construct a baseline or null hypothesis expected frequency for each cell, and then suggest and compare screening criteria for ranking the cell deviations of observed from expected count. A criterion based on the results of fitting an empirical Bayes model to the cell counts is recommended. An example compares these criteria for searching the FDA Spontaneous Reporting System database maintained by the Division of Pharmacovigilance and Epidemiology. In the example, each cell count is the number of reports combining one of 1,398 drugs with one of 952 adverse events (total of cell counts = 4.9 million), and the problem is to screen the drug-event combinations for possible further investigation.
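
As a concrete illustration of the screening setup, the sketch below builds the baseline expected count for each cell under row/column independence and ranks cells by a shrunk observed-to-expected ratio. The pseudo-count shrinkage is a crude stand-in for the paper's empirical Bayes gamma-Poisson model, and the counts are invented.

    from collections import defaultdict

    counts = {  # hypothetical (drug, event) -> report count
        ("drugA", "rash"): 12, ("drugA", "nausea"): 3,
        ("drugB", "rash"): 1,  ("drugB", "nausea"): 25,
    }

    row = defaultdict(int); col = defaultdict(int); total = 0
    for (d, e), n in counts.items():
        row[d] += n; col[e] += n; total += n

    def shrunk_ratio(n, expected, a=0.5):
        # Pseudo-count shrinkage pulls small-count cells toward 1; the real
        # model instead posterior-averages over a fitted gamma mixture prior.
        return (n + a) / (expected + a)

    scores = {}
    for (d, e), n in counts.items():
        expected = row[d] * col[e] / total   # baseline under independence
        scores[(d, e)] = shrunk_ratio(n, expected)

    for cell, s in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(cell, round(s, 2))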


Annals of Internal Medicine | 1995

Unlocking Clinical Data from Narrative Reports: A Study of Natural Language Processing

George Hripcsak; Carol Friedman; Philip O. Alderson; William DuMouchel; Stephen B. Johnson; Paul D. Clayton

The use of automated systems and electronic databases to enhance the quality, reduce the cost, and improve the management of health care has become common. Recent examples include using these systems to prevent adverse drug events [1, 2] and to encourage efficient treatment [3]. To function properly, automated systems require accurate, complete data. Although laboratory results are routinely available in electronic form, the most important clinical information (symptoms, signs, and assessments) remains largely inaccessible to automated systems. Investigators have attempted to use data from nonclinical sources to fill in the gaps, but such data have been found to be unreliable [4]. Much clinical data is locked up in departmental word-processor files, clinical databases, and research databases in the form of narrative reports such as discharge summaries, radiology reports, pathology reports, admission histories, and reports of physical examinations. Untold volumes of data are deleted every day after word-processor files are printed for the paper chart and for the mailing of reports.

Exploiting this information is not trivial, however. Sentences that are easy for a person to understand are difficult for a computer to sort out. Problems include the many ways in which the same concept can be expressed (for example, heart failure, congestive heart failure, CHF, and so forth); ambiguities in interpreting grammatical constructs ("possible worsening infiltrate" may refer to a definite infiltrate that may be worsening or to an uncertain infiltrate that, if present, is worsening); and negation ("lung fields are unremarkable" implies a lack of infiltrate). To be accurate, automated systems require coded data: the concepts must come from a well-defined, finite vocabulary, and the relations among the concepts must be expressed in an unambiguous, formal structure.

How do we unlock the contents of narrative reports? Human coders can be trained to read and manually structure reports [5]. Few institutions have been willing to invest in the personnel necessary for manual coding (other than for billing purposes), and the human coders can introduce an additional delay in obtaining coded data. The producers of reports (for example, radiologists for radiology reports) can be trained to directly create coded reports. Unfortunately, because manual coding systems do not match the speed and simplicity of dictating narrative reports, this approach has not attained widespread use. It also does not address the large number of reports already available in institutions.

Natural language processing offers an automated solution [6-11]. The processor converts narrative reports that are available in electronic form (either through word processors or electronic scanning) to coded descriptions that are appropriate for automated systems. The promise of efficient, accurate extraction of coded clinical data from narrative reports is certainly enticing. The question is whether natural language processors are up to the task: just how efficient and accurate are they, and how easy is it to use their coded output?

Methods

We evaluated a general-purpose processor [12] that is intended to cover various clinical reports. To be used in a particular domain (for example, radiology) and subdomain (chest radiograph), the processor must have initial programming under the supervision of an appropriate expert (radiologist).
This programming process involves enumerating the vocabulary of the domain (for example, patchy infiltrate) and formulating the grammar rules that are specific to the domain.

The natural language processor works as follows. The narrative report is fed into a preprocessor, which uses its vocabulary to recognize words and phrases in the report (for example, lungs, CHF), map them to standard terms (lung, congestive heart failure), and classify them into semantic categories (bodylocation, finding). The parser then matches sequences of semantic categories in the report to structures defined in the grammar. For example, if the original report read "infiltrate in lung," then the phrase might match this structure: finding, in, bodylocation. Far more complex semantic structures are also supported through the grammar. This structure is then mapped to the processor's result: a set of findings, each of which is associated with its own descriptive modifiers, such as certainty, status, location, quantity, degree, and change.

For example, the following is an excerpt from a narrative report: "Probable mild pulmonary vascular congestion with new left pleural effusion, question mild congestive changes." From this report, the natural language processor generated the following three coded findings:

Pulmonary vascular congestion (certainty: high; degree: low)
Pleural effusion (region: left; status: new)
Congestive changes (certainty: moderate; degree: low)

The processor attempts to encode all clinical information available in reports, including the clinical indication, description, and impression. These findings are stored in a clinical database, where they can be exploited for automated decision support and clinical research.

At Columbia-Presbyterian Medical Center, New York, New York, the processor has been trained to handle chest radiograph and mammogram reports. In normal operation, the radiologist dictates a report, which is then transcribed by a clerk with a word processor. The word-processor files are printed for the paper chart, stored in the clinical database in their narrative form for on-line review by clinicians, and transmitted to the natural language processor for coding. The coded data produced by the processor are exploited for automated decision support by the use of a computer program called a clinical event monitor [13]. The event monitor generates alerts, reminders, and interpretations that are based on the Arden Syntax for Medical Logic Modules [14]. The event monitor follows all clinical events (for example, admissions and laboratory results) in the medical center that can be tracked by computer. Whenever a clinically important situation is detected, the event monitor sends a message to the health care provider. For example, the storage of a low serum potassium level prompts the monitor to check whether the patient is receiving digoxin; if so, the monitor warns the health care provider that the hypokalemia may potentiate cardiac arrhythmias.

Our study was designed and conducted by an evaluation team that was separate from the development team responsible for the natural language processor. At the time of the evaluation, members of the evaluation team had no knowledge of the operation of the processor or of its strengths and weaknesses. They knew that the processor accepted chest radiograph reports and produced some coded result.
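
A toy sketch of the pipeline described above might look like the following: a preprocessor maps report words to standard terms and semantic categories, and a parser matches category sequences against grammar patterns to emit structured findings. The vocabulary, categories, and single pattern here are invented for illustration; the actual processor's grammar and modifier handling are far richer.

    VOCAB = {  # surface form -> (standard term, semantic category)
        "infiltrate": ("infiltrate", "finding"),
        "effusion": ("pleural effusion", "finding"),
        "lungs": ("lung", "bodylocation"),
        "lung": ("lung", "bodylocation"),
        "in": ("in", "prep"),
    }

    def preprocess(report):
        # Map recognized words to (standard term, semantic category) pairs.
        tokens = []
        for raw in report.lower().split():
            word = raw.strip(".,;:")
            if word in VOCAB:
                tokens.append(VOCAB[word])
        return tokens

    def parse(tokens):
        # Match the single pattern finding-prep-bodylocation ("infiltrate in lung").
        findings = []
        for (t1, c1), (t2, c2), (t3, c3) in zip(tokens, tokens[1:], tokens[2:]):
            if (c1, c2, c3) == ("finding", "prep", "bodylocation"):
                findings.append({"finding": t1, "location": t3})
        return findings

    print(parse(preprocess("Infiltrate in lungs.")))
    # -> [{'finding': 'infiltrate', 'location': 'lung'}]
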
Two hundred admission chest radiograph reports were randomly selected from among those of all adult patients discharged from the inpatient service of Columbia-Presbyterian Medical Center during a particular week. An admission chest radiograph was defined as the first chest radiograph obtained during the hospital stay, even if it was not obtained on the first day. Chest radiographs were chosen because they display a broad range of disease, vocabulary, and grammatical variation. To better assess true performance, no corrections were made to reports, despite misspellings and even the inclusion of other types of reports in the same electronic files as the chest radiograph reports.

Study subjects (humans and automated methods) detected the presence or absence of six clinical conditions (Table 1). To ensure that the conditions were reasonable candidates for automated decision support, they were selected from an independent published list of automated protocols that exploited chest radiographs [15]. An internist on the evaluation team selected the six conditions, thus ensuring that the conditions were common enough to be reasonably expected to appear several times in a set of 200 reports and that overlap would be minimized.

Table 1. Conditions

The 200 reports were processed by the natural language processor, and the resulting coded data were fed into the clinical event monitor. For each clinical condition, the monitor had a rule expressed as a Medical Logic Module [14] to detect the condition on the basis of the processor's coded output. The Medical Logic Modules concluded true (present) or false (absent). For example, the Medical Logic Module that detected pneumothorax was the simplest and used the following logic:

    if finding is in (pneumothorax; hydropneumothorax)
      and certainty-modifier is not in (no; rule out; cannot evaluate)
      and status-modifier is not in (resolved)
    then conclude true;
    endif;

The Medical Logic Module looks for reports with appropriate findings but eliminates reports that are actually stating that the finding is absent, unknown, or resolved. The Medical Logic Modules were written by a member of the evaluation team who was given access to the six condition definitions (Table 1), a sample of the natural language processor's output based on an independent set of chest radiographs, and a complete list of all vocabulary terms that the processor could generate in its output. No changes were made to the natural language processor, its grammar, or its vocabulary for the entire duration of the study (including the design phase). Once written, Medical Logic Modules were also held constant.

Human participants were recruited as follows. Six board-certified radiologists and six board-certified internists were selected as experts. All 12 physicians actively practice medicine in their respective fields at Columbia-Presbyterian Medical Center. Six professional lay persons without experience in the practice of medicine were selected as additional controls. Each human participant analyzed 100 reports; the time required to analyze all 200 reports (about 4 hours) would have been a disincentive to participate in the study and might have led participants
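
Transcribed into Python, the pneumothorax logic quoted above amounts to a simple filter over the processor's coded findings. The dictionary layout below is an assumed representation of that coded output, not the actual Arden Syntax runtime.

    def pneumothorax_present(coded_findings):
        # Conclude true when a pneumothorax finding is present and is not
        # negated, unevaluable, or already resolved (mirroring the MLM above).
        for f in coded_findings:
            if (f.get("finding") in {"pneumothorax", "hydropneumothorax"}
                    and f.get("certainty") not in {"no", "rule out", "cannot evaluate"}
                    and f.get("status") != "resolved"):
                return True
        return False

    # Hypothetical coded output for two reports.
    print(pneumothorax_present([{"finding": "pneumothorax", "certainty": "high"}]))  # True
    print(pneumothorax_present([{"finding": "pneumothorax", "certainty": "no"}]))    # False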


Clinical Pharmacology & Therapeutics | 2012

Novel Data-Mining Methodologies for Adverse Drug Event Discovery and Analysis

Rave Harpaz; William DuMouchel; Nigam H. Shah; David Madigan; Patrick B. Ryan; Carol Friedman

An important goal of the health system is to identify new adverse drug events (ADEs) in the postapproval period. Data‐mining methods that can transform data into meaningful knowledge to inform patient safety have proven essential for this purpose. New opportunities have emerged to harness data sources that have not been used within the traditional framework. This article provides an overview of recent methodological innovations and data sources used to support ADE discovery and analysis.


Knowledge Discovery and Data Mining | 1999

Squashing flat files flatter

William DuMouchel; Chris Volinsky; Theodore Johnson; Corinna Cortes; Daryl Pregibon

A feature of data mining that distinguishes it from “classical” machine learning (ML) and statistical modeling (SM) is scale. The community seems to agree on this, yet progress to this point has been limited. We present a methodology that addresses scale in a novel fashion that has the potential for revolutionizing the field. While the methodology applies most directly to flat (row by column) data sets, we believe that it can be adapted to other representations. Our approach to the problem is not to scale up individual ML and SM methods. Rather, we prefer to leverage the entire collection of existing methods by scaling down the data set. We call the method squashing. Our method demonstrably outperforms random sampling, and a theoretical argument suggests how and why it works well. Squashing consists of three modular steps: grouping, momentizing, and generating (GMG). These three steps describe the squashing pipeline, whereby the original (very large) data set is sectioned off into mutually exclusive groups (or bins); within each group a series of low-order moments are computed; and finally these moments are passed off to a routine that generates pseudo data that accurately reproduce the moments. The result of the GMG squashing pipeline is a squashed data set that has the same structure as the original data, with the addition of a weight for each pseudo data point that reflects the distribution of the original data into the initial groups. Any ML or SM method that accepts weights can be used to analyze the weighted pseudo data. By construction, the resulting analyses will mimic the corresponding analyses on the original data set. Squashing should appeal to many of the sub-disciplines of
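
A minimal one-dimensional sketch of the GMG pipeline: bin the data (grouping), compute each bin's first two moments (momentizing), and emit two weighted pseudo points per bin that reproduce those moments exactly (generating). The real method handles many columns and higher-order cross moments; this illustration runs on made-up data.

    import random, statistics

    random.seed(0)
    data = [random.gauss(5, 2) for _ in range(100_000)]  # the "very large" set

    def squash(values, n_bins=10):
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0
        bins = [[] for _ in range(n_bins)]
        for v in values:                                 # grouping
            bins[min(int((v - lo) / width), n_bins - 1)].append(v)
        pseudo = []                                      # (value, weight) pairs
        for b in bins:
            if len(b) < 2:
                pseudo.extend((v, 1.0) for v in b)
                continue
            m, s = statistics.fmean(b), statistics.pstdev(b)  # momentizing
            # Two points m +/- s with weight n/2 each match the bin's
            # mean and variance exactly.                     # generating
            pseudo += [(m - s, len(b) / 2), (m + s, len(b) / 2)]
        return pseudo

    sq = squash(data)
    w_mean = sum(v * w for v, w in sq) / sum(w for _, w in sq)
    print(len(sq), "pseudo points; weighted mean", round(w_mean, 3),
          "vs original", round(statistics.fmean(data), 3))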


Clinical Pharmacology & Therapeutics | 2013

Performance of Pharmacovigilance Signal‐Detection Algorithms for the FDA Adverse Event Reporting System

Rave Harpaz; William DuMouchel; Paea LePendu; Anna Bauer-Mehren; Patrick B. Ryan; Nigam H. Shah

Signal‐detection algorithms (SDAs) are recognized as vital tools in pharmacovigilance. However, their performance characteristics are generally unknown. By leveraging a unique gold standard recently made public by the Observational Medical Outcomes Partnership (OMOP) and by conducting a unique systematic evaluation, we provide new insights into the diagnostic potential and characteristics of SDAs that are routinely applied to the US Food and Drug Administration (FDA) Adverse Event Reporting System (AERS). We find that SDAs can attain reasonable predictive accuracy in signaling adverse events. Two performance classes emerge, indicating that the class of approaches that address confounding and masking effects benefits safety surveillance. Our study shows that not all events are equally detectable, suggesting that specific events might be monitored more effectively using other data sources. We provide performance guidelines for several operating scenarios to inform the trade‐off between sensitivity and specificity for specific use cases. We also propose an approach and demonstrate its application in identifying optimal signaling thresholds, given specific misclassification tolerances.
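
The threshold-selection idea can be sketched as a sweep over candidate signal-score cutoffs against a labeled reference set, computing sensitivity and specificity at each so an operating point can be chosen for a given misclassification tolerance. The scores and gold-standard labels below are invented.

    def sens_spec(scores, labels, threshold):
        tp = sum(s >= threshold and y for s, y in zip(scores, labels))
        fn = sum(s < threshold and y for s, y in zip(scores, labels))
        tn = sum(s < threshold and not y for s, y in zip(scores, labels))
        fp = sum(s >= threshold and not y for s, y in zip(scores, labels))
        return tp / (tp + fn), tn / (tn + fp)

    scores = [3.1, 0.4, 2.2, 1.0, 4.8, 0.9, 1.7, 0.2]  # e.g. EBGM-like scores
    labels = [True, False, True, False, True, False, True, False]

    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        print(f"threshold {t:.1f}: sensitivity {sens:.2f}, specificity {spec:.2f}")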


Journal of the American Medical Informatics Association | 2013

Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions.

Rave Harpaz; Santiago Vilar; William DuMouchel; Hojjat Salmasian; Krystl Haerian; Nigam H. Shah; Herbert S. Chase; Carol Friedman

OBJECTIVE Data-mining algorithms that can produce accurate signals of potentially novel adverse drug reactions (ADRs) are a central component of pharmacovigilance. We propose a signal-detection strategy that combines the adverse event reporting system (AERS) of the Food and Drug Administration and electronic health records (EHRs) by requiring signaling in both sources. We claim that this approach leads to improved accuracy of signal detection when the goal is to produce a highly selective ranked set of candidate ADRs. MATERIALS AND METHODS Our investigation was based on over 4 million AERS reports and information extracted from 1.2 million EHR narratives. Well-established methodologies were used to generate signals from each source. The study focused on ADRs related to three high-profile serious adverse reactions. A reference standard of over 600 established and plausible ADRs was created and used to evaluate the proposed approach against a comparator. RESULTS The combined signaling system achieved a statistically significant large improvement over AERS (baseline) in the precision of top ranked signals. The average improvement ranged from 31% to almost threefold for different evaluation categories. Using this system, we identified a new association between the agent, rasburicase, and the adverse event, acute pancreatitis, which was supported by clinical review. CONCLUSIONS The results provide promising initial evidence that combining AERS with EHRs via the framework of replicated signaling can improve the accuracy of signal detection for certain operating scenarios. The use of additional EHR data is required to further evaluate the capacity and limits of this system and to extend the generalizability of these results.
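
A minimal sketch of the replicated-signaling idea: keep only candidate pairs that exceed a signaling threshold in both sources, rank them by the weaker of the two scores, and compare precision of the top-ranked signals against AERS alone. All scores, the threshold, and the reference set are invented stand-ins.

    aers = {"drugX-rash": 4.0, "drugX-ards": 2.5, "drugX-nausea": 1.2}
    ehr  = {"drugX-rash": 3.1, "drugX-ards": 0.8, "drugX-nausea": 2.0}
    reference = {"drugX-rash"}      # stand-in for the established-ADR set
    SIGNAL = 2.0                    # hypothetical signaling threshold

    # Require signaling in both sources; rank by the weaker score.
    combined = {k: min(aers[k], ehr[k]) for k in aers
                if aers[k] >= SIGNAL and ehr[k] >= SIGNAL}

    def precision_at_k(ranked, k):
        top = sorted(ranked, key=ranked.get, reverse=True)[:k]
        return sum(pair in reference for pair in top) / len(top)

    print("AERS alone:", precision_at_k(aers, 2))
    print("combined:  ", precision_at_k(combined, min(2, len(combined))))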


American Journal of Epidemiology | 2013

Evaluating the Impact of Database Heterogeneity on Observational Study Results

David Madigan; Patrick B. Ryan; Martijn J. Schuemie; Paul E. Stang; J. Marc Overhage; Abraham G. Hartzema; Marc A. Suchard; William DuMouchel; Jesse A. Berlin

Clinical studies that use observational databases to evaluate the effects of medical products have become commonplace. Such studies begin by selecting a particular database, a decision that published papers invariably report but do not discuss. Studies of the same issue in different databases, however, can and do generate different results, sometimes with strikingly different clinical implications. In this paper, we systematically study heterogeneity among databases, holding other study methods constant, by exploring relative risk estimates for 53 drug-outcome pairs and 2 widely used study designs (cohort studies and self-controlled case series) across 10 observational databases. When holding the study design constant, our analysis shows that estimated relative risks range from a statistically significant decreased risk to a statistically significant increased risk in 11 of 53 (21%) of drug-outcome pairs that use a cohort design and 19 of 53 (36%) of drug-outcome pairs that use a self-controlled case series design. This exceeds the proportion of pairs that were consistent across databases in both direction and statistical significance, which was 9 of 53 (17%) for cohort studies and 5 of 53 (9%) for self-controlled case series. Our findings show that clinical studies that use observational databases can be sensitive to the choice of database. More attention is needed to consider how the choice of data source may be affecting results.
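
The consistency check can be sketched by classifying each database's relative-risk estimate for a drug-outcome pair by the direction and statistical significance of its 95% CI, then flagging pairs whose estimates span a significant decrease and a significant increase. The CI bounds below are invented.

    def classify(ci_lo, ci_hi):
        if ci_hi < 1.0:
            return "significant decrease"
        if ci_lo > 1.0:
            return "significant increase"
        return "not significant"

    estimates = [(0.6, 0.9), (0.9, 1.3), (1.2, 1.8)]  # per-database 95% CIs
    labels = {classify(lo, hi) for lo, hi in estimates}
    if {"significant decrease", "significant increase"} <= labels:
        print("estimates span significant decrease to significant increase")
    print(labels)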


Drug Safety | 2006

Comparative performance of two quantitative safety signalling methods: implications for use in a pharmacovigilance department.

June S. Almenoff; Karol K. LaCroix; Nancy Yuen; David Fram; William DuMouchel

Background and objectives: There is increasing interest in using disproportionality-based signal detection methods to support postmarketing safety surveillance activities. Two commonly used methods, empirical Bayes multi-item gamma Poisson shrinker (MGPS) and proportional reporting ratio (PRR), perform differently with respect to the number and types of signals detected. The goal of this study was to compare and analyse the performance characteristics of these two methods, to understand why they differ and to consider the practical implications of these differences for a large, industry-based pharmacovigilance department. Methods: We compared the numbers and types of signals of disproportionate reporting (SDRs) obtained with MGPS and PRR using two postmarketing safety databases and a simulated database. We recorded signal counts and performed a qualitative comparison of the drug-event combinations signalled by the two methods as well as a sensitivity analysis to better understand how the thresholds commonly used for these methods impact their performance. Results: PRR detected more SDRs than MGPS. We observed that MGPS is less subject to confounding by demographic factors because it employs stratification and is more stable than PRR when report counts are low. Simulation experiments performed using published empirical thresholds demonstrated that PRR detected false-positive signals at a rate of 1.1%, while MGPS did not detect any statistical false positives. In an attempt to separate the effect of choice of signal threshold from more fundamental methodological differences, we performed a series of experiments in which we modified the conventional threshold values for each method so that each method detected the same number of SDRs for the example drugs studied. This analysis, which provided quantitative examples of the relationship between the published thresholds for the two methods, demonstrates that the signalling criterion published for PRR has a higher signalling frequency than that published for MGPS. Discussion and conclusion: The performance differences between the PRR and MGPS methods are related to (i) greater confounding by demographic factors with PRR; (ii) a higher tendency of PRR to detect false-positive signals when the number of reports is small; and (iii) the conventional thresholds that have been adapted for each method. PRR tends to be more ‘sensitive’ and less ‘specific’ than MGPS. A high-specificity disproportionality method, when used in conjunction with medical triage and investigation of critical medical events, may provide an efficient and robust approach to applying quantitative methods in routine postmarketing pharmacovigilance.
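
For concreteness, here is a sketch of the PRR side of the comparison: compute the proportional reporting ratio from the standard 2x2 report table and apply the commonly published signaling criterion (PRR >= 2, chi-squared >= 4, at least 3 reports). The counts are invented; MGPS, by contrast, shrinks the observed/expected ratio with an empirical Bayes prior and stratifies by demographics, which this sketch does not attempt.

    def prr_signal(a, b, c, d):
        """2x2 report counts: a = target drug & target event,
        b = target drug & other events, c = other drugs & target event,
        d = other drugs & other events."""
        prr = (a / (a + b)) / (c / (c + d))
        n = a + b + c + d
        # Pearson chi-squared for the 2x2 table (without continuity correction).
        chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
        return prr, chi2, (prr >= 2 and chi2 >= 4 and a >= 3)

    prr, chi2, flagged = prr_signal(a=12, b=988, c=50, d=98_950)
    print(f"PRR {prr:.1f}, chi2 {chi2:.1f}, signal: {flagged}")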


Clinical Therapeutics | 2004

Association of asthma therapy and Churg-Strauss syndrome: an analysis of postmarketing surveillance data.

William DuMouchel; Eric T. Smith; Richard Beasley; Harold S. Nelson; Xionghu Yang; David Fram; June S. Almenoff

BACKGROUND Churg-Strauss syndrome (CSS), also known as allergic granulomatous angiitis (AGA), is a rare vasculitis that occurs in patients with bronchial asthma. The nature of the association of CSS with various asthma therapies is unclear. OBJECTIVE This study investigated the associations of different multidrug asthma therapy regimens and the reporting of AGA (the preferred code for CSS in the coding dictionary for the Adverse Event Reporting System [AERS]) by applying an iterative method of disproportionality analysis to the AERS database maintained by the US Food and Drug Administration. METHODS The public-release version of the AERS database was used to identify reports of AGA in patients receiving asthma therapy. Reporting of AGA was examined using iterative disproportionality methods in patients receiving > or =1 of the following drug classes: inhaled corticosteroid (ICS), leukotriene receptor antagonist (LTRA), short-acting beta(2)-agonist (SABA), or long-acting beta(2)-agonist (LABA). The Bayesian data-mining algorithm known as the multi-item gamma Poisson shrinker was used to determine the relative reporting rates by calculation of the empirical Bayes geometric mean (EBGM) and its 90% CI (EB05 = lower limit and EB95 = upper limit) for each drug. Subset analyses were performed for each drug with different medication combinations to differentiate the relative reporting of AGA for each. RESULTS A strong association was found between LTRA use and AGA (EBGM = 104.0, EB05 = 95.0, EB95 = 113.8) that persisted with all combinations of therapy studied. AGA was also associated with the ICS, SABA and LABA classes (EBGM values of 27.8, 14.6 and 40.4, respectively). However, the latter associations were mostly dependent on the presence of concurrent LTRA and, to a lesser extent, oral corticosteroid therapy and became negligible (ie, EB05 < 2) for patients who were not receiving these concurrent treatments. CONCLUSIONS Differences based on relative reporting were observed in the patterns of association of AGA with LTRA, ICS, and beta(2)-agonist therapies. A strong association between LTRA use and AGA was present regardless of the use of other asthma drugs.
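
A minimal sketch of the subset-analysis idea described above: recompute a simple reporting ratio for one drug class after restricting to reports without concurrent LTRA, to see whether the association persists. The reports, drug classes, and plain observed/expected ratio here are invented stand-ins for the paper's MGPS-based EBGM computation.

    reports = [  # hypothetical AERS-like reports: (drug classes on report, event)
        ({"ICS", "LTRA"}, "AGA"), ({"ICS", "LTRA"}, "AGA"), ({"ICS", "LTRA"}, "AGA"),
        ({"LTRA"}, "AGA"), ({"ICS"}, "other"), ({"ICS"}, "other"),
        ({"SABA"}, "AGA"), ({"SABA"}, "other"), ({"SABA"}, "other"),
        ({"LABA"}, "other"), ({"LABA"}, "other"),
    ]

    def reporting_ratio(subset, drug, event="AGA"):
        # Proportion of the drug's reports naming the event, relative to the
        # event's proportion among all reports in the subset.
        with_drug = [ev for classes, ev in subset if drug in classes]
        p_drug = sum(ev == event for ev in with_drug) / len(with_drug)
        p_all = sum(ev == event for _, ev in subset) / len(subset)
        return p_drug / p_all

    print("ICS, all reports:", round(reporting_ratio(reports, "ICS"), 2))
    no_ltra = [r for r in reports if "LTRA" not in r[0]]
    print("ICS, no concurrent LTRA:", round(reporting_ratio(no_ltra, "ICS"), 2))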

Collaboration


Dive into William DuMouchel's collaborations.
