
Publication


Featured research published by George Hripcsak.


Journal of the American Medical Informatics Association | 2003

Detecting Adverse Events Using Information Technology

David W. Bates; R. Scott Evans; Harvey J. Murff; Peter D. Stetson; Lisa Pizziferri; George Hripcsak

CONTEXT Although patient safety is a major problem, most health care organizations rely on spontaneous reporting, which detects only a small minority of adverse events. As a result, problems with safety have remained hidden. Chart review can detect adverse events in research settings, but it is too expensive for routine use. Information technology techniques can detect some adverse events in a timely and cost-effective way, in some cases early enough to prevent patient harm. OBJECTIVE To review methodologies of detecting adverse events using information technology, reports of studies that used these techniques to detect adverse events, and study results for specific types of adverse events. DESIGN Structured review. METHODOLOGY English-language studies that reported using information technology to detect adverse events were identified using standard techniques. Only studies that contained original data were included. MAIN OUTCOME MEASURES Adverse events, with specific focus on nosocomial infections, adverse drug events, and injurious falls. RESULTS Tools such as event monitoring and natural language processing can inexpensively detect certain types of adverse events in clinical databases. These approaches already work well for some types of adverse events, including adverse drug events and nosocomial infections, and are in routine use in a few hospitals. In addition, it appears likely that these techniques will be adaptable in ways that allow detection of a broad array of adverse events, especially as more medical information becomes computerized. CONCLUSION Computerized detection of adverse events will soon be practical on a widespread basis.


Journal of the American Medical Informatics Association | 2005

Agreement, the F-Measure, and Reliability in Information Retrieval

George Hripcsak; Adam S. Rothschild

Information retrieval studies that involve searching the Internet or marking phrases usually lack a well-defined number of negative cases. This prevents the use of traditional interrater reliability metrics like the kappa statistic to assess the quality of expert-generated gold standards. Such studies often quantify system performance as precision, recall, and F-measure, or as agreement. It can be shown that the average F-measure among pairs of experts is numerically identical to the average positive specific agreement among experts and that kappa approaches these measures as the number of negative cases grows large. Positive specific agreement, or the equivalent F-measure, may be an appropriate way to quantify interrater reliability and therefore to assess the reliability of a gold standard in these studies.
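To see the identity concretely, here is a small numerical check (our illustration, not code from the paper): with a, b, and c counting phrases marked by both experts, by expert 1 only, and by expert 2 only, the F-measure computed by treating one expert as the gold standard equals positive specific agreement, and no count of negative cases is needed.

    def f_measure(a, b, c):
        # Treat expert 2 as the gold standard for expert 1's marks.
        precision = a / (a + b)  # expert 1's marks confirmed by expert 2
        recall = a / (a + c)     # expert 2's marks found by expert 1
        return 2 * precision * recall / (precision + recall)

    def positive_specific_agreement(a, b, c):
        # Agreement computed over positive decisions only.
        return 2 * a / (2 * a + b + c)

    # Both reduce algebraically to 2a / (2a + b + c).
    assert abs(f_measure(40, 10, 5) - positive_specific_agreement(40, 10, 5)) < 1e-12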


Journal of the American Medical Informatics Association | 2004

Automated Encoding of Clinical Documents Based on Natural Language Processing

Carol Friedman; Lyudmila Shagina; Yves A. Lussier; George Hripcsak

OBJECTIVE The aim of this study was to develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers and to quantitatively evaluate the method. METHODS An existing NLP system, MedLEE, was adapted to automatically generate codes. The method involves matching of structured output generated by MedLEE consisting of findings and modifiers to obtain the most specific code. Recall and precision applied to Unified Medical Language System (UMLS) coding were evaluated in two separate studies. Recall was measured using a test set of 150 randomly selected sentences, which were processed using MedLEE. Results were compared with a reference standard determined manually by seven experts. Precision was measured using a second test set of 150 randomly selected sentences from which UMLS codes were automatically generated by the method and then validated by experts. RESULTS Recall of the system for UMLS coding of all terms was .77 (95% CI .72-.81), and for coding terms that had corresponding UMLS codes recall was .83 (.79-.87). Recall of the system for extracting all terms was .84 (.81-.88). Recall of the experts ranged from .69 to .91 for extracting terms. The precision of the system was .89 (.87-.91), and precision of the experts ranged from .61 to .91. CONCLUSION Extraction of relevant clinical information and UMLS coding were accomplished using a method based on NLP. The method appeared to be comparable to or better than six experts. The advantage of the method is that it maps text to codes along with other related information, rendering the coded output suitable for effective retrieval.
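Intervals like those quoted above can be reproduced from raw counts; a minimal sketch, assuming a normal-approximation (Wald) binomial interval (the paper does not state which interval method it used):

    import math

    def recall_with_ci(tp, fn, z=1.96):
        # tp: reference-standard terms the system coded; fn: terms it missed.
        n = tp + fn
        r = tp / n
        half = z * math.sqrt(r * (1 - r) / n)  # Wald half-width
        return r, (max(0.0, r - half), min(1.0, r + half))

    # e.g., recall_with_ci(77, 23) -> (0.77, (0.69, 0.85)), rounded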


Human Factors in Computing Systems | 2011

Design lessons from the fastest Q&A site in the west

Lena Mamykina; Bella Manoim; Manas Mittal; George Hripcsak; Björn Hartmann

This paper analyzes a Question & Answer site for programmers, Stack Overflow, that dramatically improves on the utility and performance of Q&A systems for technical domains. Over 92% of Stack Overflow questions about expert topics are answered, in a median time of 11 minutes. Using a mixed methods approach that combines statistical data analysis with user interviews, we seek to understand this success. We argue that it is due not only to an a priori superior technical design, but also to the high visibility and daily involvement of the design team within the community they serve. This model of continued community leadership presents challenges both to CSCW systems research and to attempts to apply the Stack Overflow model to other specialized knowledge domains.


Annals of Internal Medicine | 1995

Unlocking Clinical Data from Narrative Reports: A Study of Natural Language Processing

George Hripcsak; Carol Friedman; Philip O. Alderson; William DuMouchel; Stephen B. Johnson; Paul D. Clayton

The use of automated systems and electronic databases to enhance the quality, reduce the cost, and improve the management of health care has become common. Recent examples include using these systems to prevent adverse drug events [1, 2] and to encourage efficient treatment [3]. To function properly, automated systems require accurate, complete data. Although laboratory results are routinely available in electronic form, the most important clinical information (symptoms, signs, and assessments) remains largely inaccessible to automated systems. Investigators have attempted to use data from nonclinical sources to fill in the gaps, but such data have been found to be unreliable [4]. Much clinical data is locked up in departmental word-processor files, clinical databases, and research databases in the form of narrative reports such as discharge summaries, radiology reports, pathology reports, admission histories, and reports of physical examinations. Untold volumes of data are deleted every day after word-processor files are printed for the paper chart and for the mailing of reports.

Exploiting this information is not trivial, however. Sentences that are easy for a person to understand are difficult for a computer to sort out. Problems include the many ways in which the same concept can be expressed (for example, "heart failure," "congestive heart failure," "CHF," and so forth); ambiguities in interpreting grammatical constructs ("possible worsening infiltrate" may refer to a definite infiltrate that may be worsening or to an uncertain infiltrate that, if present, is worsening); and negation ("lung fields are unremarkable" implies a lack of infiltrate). To be accurate, automated systems require coded data: the concepts must come from a well-defined, finite vocabulary, and the relations among the concepts must be expressed in an unambiguous, formal structure.

How do we unlock the contents of narrative reports? Human coders can be trained to read and manually structure reports [5]. Few institutions have been willing to invest in the personnel necessary for manual coding (other than for billing purposes), and the human coders can introduce an additional delay in obtaining coded data. The producers of reports (for example, radiologists for radiology reports) can be trained to directly create coded reports. Unfortunately, because manual coding systems do not match the speed and simplicity of dictating narrative reports, this approach has not attained widespread use. It also does not address the large number of reports already available in institutions.

Natural language processing offers an automated solution [6-11]. The processor converts narrative reports that are available in electronic form (either through word processors or electronic scanning) to coded descriptions that are appropriate for automated systems. The promise of efficient, accurate extraction of coded clinical data from narrative reports is certainly enticing. The question is whether natural language processors are up to the task: just how efficient and accurate are they, and how easy is it to use their coded output?

Methods

We evaluated a general-purpose processor [12] that is intended to cover various clinical reports. To be used in a particular domain (for example, radiology) and subdomain (chest radiograph), the processor must have initial programming under the supervision of an appropriate expert (radiologist).
This programming process involves enumerating the vocabulary of the domain (for example, "patchy infiltrate") and formulating the grammar rules that are specific to the domain.

The natural language processor works as follows. The narrative report is fed into a preprocessor, which uses its vocabulary to recognize words and phrases in the report (for example, "lungs," "CHF"), map them to standard terms ("lung," "congestive heart failure"), and classify them into semantic categories (bodylocation, finding). The parser then matches sequences of semantic categories in the report to structures defined in the grammar. For example, if the original report read "infiltrate in lung," then the phrase might match this structure: finding, in, bodylocation. Far more complex semantic structures are also supported through the grammar. This structure is then mapped to the processor's result: a set of findings, each of which is associated with its own descriptive modifiers, such as certainty, status, location, quantity, degree, and change. For example, the following is an excerpt from a narrative report: "Probable mild pulmonary vascular congestion with new left pleural effusion, question mild congestive changes." From this report, the natural language processor generated the following three coded findings:

    Pulmonary vascular congestion (certainty: high; degree: low)
    Pleural effusion (region: left; status: new)
    Congestive changes (certainty: moderate; degree: low)

The processor attempts to encode all clinical information available in reports, including the clinical indication, description, and impression. These findings are stored in a clinical database, where they can be exploited for automated decision-support and clinical research.

At Columbia-Presbyterian Medical Center, New York, New York, the processor has been trained to handle chest radiograph and mammogram reports. In normal operation, the radiologist dictates a report, which is then transcribed by a clerk with a word processor. The word-processor files are printed for the paper chart, stored in the clinical database in their narrative form for on-line review by clinicians, and transmitted to the natural language processor for coding. The coded data produced by the processor are exploited for automated decision-support by the use of a computer program called a clinical event monitor [13]. The event monitor generates alerts, reminders, and interpretations that are based on the Arden Syntax for Medical Logic Modules [14]. The event monitor follows all clinical events (for example, admissions and laboratory results) in the medical center that can be tracked by computer. Whenever a clinically important situation is detected, the event monitor sends a message to the health care provider. For example, the storage of a low serum potassium level prompts the monitor to check whether the patient is receiving digoxin; if so, the monitor warns the health care provider that the hypokalemia may potentiate cardiac arrhythmias.

Our study was designed and conducted by an evaluation team that was separate from the development team responsible for the natural language processor. At the time of the evaluation, members of the evaluation team had no knowledge of the operation of the processor or of its strengths and weaknesses. They knew that the processor accepted chest radiograph reports and produced some coded result.
Two hundred admission chest radiograph reports were randomly selected from among those of all adult patients discharged from the inpatient service of Columbia-Presbyterian Medical Center during a particular week. An admission chest radiograph was defined as the first chest radiograph obtained during the hospital stay, even if it was not obtained on the first day. Chest radiographs were chosen because they display a broad range of disease, vocabulary, and grammatical variation. To better assess true performance, no corrections were made to reports, despite misspellings and even the inclusion of other types of reports in the same electronic files as the chest radiograph reports.

Study subjects (humans and automated methods) detected the presence or absence of six clinical conditions (Table 1). To ensure that the conditions were reasonable candidates for automated decision-support, they were selected from an independent published list of automated protocols that exploited chest radiographs [15]. An internist on the evaluation team selected the six conditions, thus ensuring that the conditions were common enough to be reasonably expected to appear several times in a set of 200 reports and that overlap would be minimized.

Table 1. Conditions

The 200 reports were processed by the natural language processor, and the resulting coded data were fed into the clinical event monitor. For each clinical condition, the monitor had a rule expressed as a Medical Logic Module [14] to detect the condition on the basis of the processor's coded output. The Medical Logic Modules concluded true (present) or false (absent). For example, the Medical Logic Module that detected pneumothorax was the simplest and used the following logic:

    if finding is in (pneumothorax; hydropneumothorax)
        and certainty-modifier is not in (no; rule out; cannot evaluate)
        and status-modifier is not in (resolved)
    then conclude true;
    endif;

The Medical Logic Module looks for reports with appropriate findings but eliminates reports that are actually stating that the finding is absent, unknown, or resolved. The Medical Logic Modules were written by a member of the evaluation team who was given access to the six condition definitions (Table 1), a sample of the natural language processor's output based on an independent set of chest radiographs, and a complete list of all vocabulary terms that the processor could generate in its output. No changes were made to the natural language processor, its grammar, or its vocabulary for the entire duration of the study (including the design phase). Once written, Medical Logic Modules were also held constant.

Human participants were recruited as follows. Six board-certified radiologists and six board-certified internists were selected as experts. All 12 physicians actively practice medicine in their respective fields at Columbia-Presbyterian Medical Center. Six professional lay persons without experience in the practice of medicine were selected as additional controls. Each human participant analyzed 100 reports; the time required to analyze all 200 reports (about 4 hours) would have been a disincentive to participate in the study and might have led participants
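For readers who prefer executable form, the pneumothorax Medical Logic Module above can be rendered roughly in Python (our sketch; the study used the Arden Syntax, and the finding structure below is hypothetical, mirroring the coded findings shown earlier):

    # Each coded finding: a name plus descriptive modifiers (hypothetical
    # structure, not MedLEE's actual output format).
    report_findings = [
        {"finding": "pleural effusion", "region": "left", "status": "new"},
        {"finding": "pneumothorax", "certainty": "rule out"},
    ]

    def pneumothorax_rule(findings):
        # Conclude true on an appropriate finding, unless the report actually
        # states that the finding is absent, unknown, or resolved.
        for f in findings:
            if (f["finding"] in ("pneumothorax", "hydropneumothorax")
                    and f.get("certainty") not in ("no", "rule out", "cannot evaluate")
                    and f.get("status") != "resolved"):
                return True
        return False

    print(pneumothorax_rule(report_findings))  # False: the pneumothorax is ruled out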


Journal of the American Medical Informatics Association | 2013

Next-generation phenotyping of electronic health records

George Hripcsak; David J. Albers

The national adoption of electronic health records (EHR) promises to make an unprecedented amount of data available for clinical research, but the data are complex, inaccurate, and frequently missing, and the record reflects complex processes aside from the patient's physiological state. We believe that the path forward requires studying the EHR as an object of interest in itself, and that new models, learning from data, and collaboration will lead to efficient use of the valuable information currently locked in health records.


Journal of the American Medical Informatics Association | 1994

Knowledge-based Approaches to the Maintenance of a Large Controlled Medical Terminology

James J. Cimino; Paul D. Clayton; George Hripcsak; Stephen B. Johnson

OBJECTIVE Develop a knowledge-based representation for a controlled terminology of clinical information to facilitate creation, maintenance, and use of the terminology. DESIGN The Medical Entities Dictionary (MED) is a semantic network, based on the Unified Medical Language System (UMLS), with a directed acyclic graph to represent multiple hierarchies. Terms from four hospital systems (laboratory, electrocardiography, medical records coding, and pharmacy) were added as nodes in the network. Additional knowledge about terms, added as semantic links, was used to assist in integration, harmonization, and automated classification of disparate terminologies. RESULTS The MED contains 32,767 terms and is in active clinical use. Automated classification was successfully applied to terms for laboratory specimens, laboratory tests, and medications. One benefit of the approach has been the automated inclusion of medications into multiple pharmacologic and allergenic classes that were not present in the pharmacy system. Another benefit has been the reduction of maintenance efforts by 90%. CONCLUSION The MED is a hybrid of terminology and knowledge. It provides domain coverage, synonymy, consistency of views, explicit relationships, and multiple classification while preventing redundancy, ambiguity (homonymy) and misclassification.
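As an illustration of the multiple-hierarchy idea (our sketch, not the MED's actual implementation), a directed acyclic graph lets one term carry several parents, so a medication is automatically classified under every relevant class; the term and class names below are hypothetical examples:

    # A DAG terminology: each term may have several parents, so a medication
    # falls into multiple pharmacologic classes at once.
    parents = {
        "ibuprofen": ["NSAID", "antipyretic"],
        "NSAID": ["anti-inflammatory agent"],
        "antipyretic": [],
        "anti-inflammatory agent": [],
    }

    def ancestors(term):
        # Every class the term belongs to, following all parent links.
        seen = set()
        stack = list(parents.get(term, []))
        while stack:
            p = stack.pop()
            if p not in seen:
                seen.add(p)
                stack.extend(parents.get(p, []))
        return seen

    print(ancestors("ibuprofen"))  # {'NSAID', 'antipyretic', 'anti-inflammatory agent'}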


Journal of the American Medical Informatics Association | 2005

Automated detection of adverse events using natural language processing of discharge summaries.

Genevieve B. Melton; George Hripcsak

OBJECTIVE To determine whether natural language processing (NLP) can effectively detect adverse events defined in the New York Patient Occurrence Reporting and Tracking System (NYPORTS) using discharge summaries. DESIGN An adverse event detection system for discharge summaries using the NLP system MedLEE was constructed to identify 45 NYPORTS event types. The system was first applied to a random sample of 1,000 manually reviewed charts. The system then processed all inpatient cases with electronic discharge summaries for two years. All system-identified events were reviewed, and performance was compared with traditional reporting. MEASUREMENTS System sensitivity, specificity, and predictive value, with manual review serving as the gold standard. RESULTS The system correctly identified 16 of 65 events in 1,000 charts. Of 57,452 total electronic discharge summaries, the system identified 1,590 events in 1,461 cases, and manual review verified 704 events in 652 cases, resulting in an overall sensitivity of 0.28 (95% confidence interval [CI]: 0.17-0.42), specificity of 0.985 (CI: 0.984-0.986), and positive predictive value of 0.45 (CI: 0.42-0.47) for detecting cases with events and an average specificity of 0.9996 (CI: 0.9996-0.9997) per event type. Traditional event reporting detected 322 events during the period (sensitivity 0.09), of which the system identified 110 as well as 594 additional events missed by traditional methods. CONCLUSION NLP is an effective technique for detecting a broad range of adverse events in text documents and outperformed traditional and previous automated adverse event detection methods.
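For reference, the three reported figures relate to a 2x2 confusion matrix over cases as follows (a minimal sketch with purely illustrative counts, not the study's data):

    def screening_metrics(tp, fp, fn, tn):
        return {
            "sensitivity": tp / (tp + fn),  # share of true event cases flagged
            "specificity": tn / (tn + fp),  # share of event-free cases not flagged
            "ppv": tp / (tp + fp),          # share of flags that were real events
        }

    print(screening_metrics(tp=8, fp=10, fn=20, tn=962))  # hypothetical counts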


Journal of the American Medical Informatics Association | 2009

Active Computerized Pharmacovigilance Using Natural Language Processing, Statistics, and Electronic Health Records: A Feasibility Study

Xiaoyan Wang; George Hripcsak; Marianthi Markatou; Carol Friedman

OBJECTIVE It is vital to detect the full safety profile of a drug throughout its market life. Current pharmacovigilance systems still have substantial limitations, however. The objective of our work is to demonstrate the feasibility of using natural language processing (NLP), the comprehensive Electronic Health Record (EHR), and association statistics for pharmacovigilance purposes. DESIGN Narrative discharge summaries were collected from the Clinical Information System at New York Presbyterian Hospital (NYPH). MedLEE, an NLP system, was applied to the collection to identify medication events and entities which could be potential adverse drug events (ADEs). Co-occurrence statistics with adjusted volume tests were used to detect associations between the two types of entities, to calculate the strengths of the associations, and to determine their cutoff thresholds. Seven drugs/drug classes (ibuprofen, morphine, warfarin, bupropion, paroxetine, rosiglitazone, ACE inhibitors) with known ADEs were selected to evaluate the system. RESULTS One hundred thirty-two potential ADEs were found to be associated with the 7 drugs. Overall recall and precision were 0.75 and 0.31 for known ADEs, respectively. Importantly, qualitative evaluation using a historic rollback design suggested that novel ADEs could be detected using our system. CONCLUSIONS This study provides a framework for the development of active, high-throughput and prospective systems which could potentially unveil drug safety profiles throughout their entire market life. Our results demonstrate that the framework is feasible although there are some challenging issues. To the best of our knowledge, this is the first study using comprehensive unstructured data from the EHR for pharmacovigilance.
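As a generic illustration of co-occurrence screening (the study's adjusted volume tests are more involved than this plain statistic), a Pearson chi-square over the 2x2 table of drug mentions versus candidate-event mentions across summaries:

    def cooccurrence_chi2(n_both, n_drug_only, n_event_only, n_neither):
        # 2x2 table over discharge summaries: drug mentioned vs. event mentioned.
        n = n_both + n_drug_only + n_event_only + n_neither
        drug = n_both + n_drug_only      # summaries mentioning the drug
        event = n_both + n_event_only    # summaries mentioning the event
        chi2 = 0.0
        for observed, row, col in [
            (n_both, drug, event),
            (n_drug_only, drug, n - event),
            (n_event_only, n - drug, event),
            (n_neither, n - drug, n - event),
        ]:
            expected = row * col / n
            chi2 += (observed - expected) ** 2 / expected
        return chi2

    # e.g., cooccurrence_chi2(30, 70, 120, 780): a large value suggests the drug
    # and event co-occur more (or less) often than chance alone would predict.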


Medical Care | 2013

Caveats for the use of operational electronic health record data in comparative effectiveness research.

William R. Hersh; Mark Weiner; Peter J. Embi; Judith R. Logan; Philip R. O. Payne; Elmer V. Bernstam; Harold P. Lehmann; George Hripcsak; Timothy H. Hartzog; James J. Cimino; Joel H. Saltz

The growing amount of data in operational electronic health record systems provides unprecedented opportunity for its reuse for many tasks, including comparative effectiveness research. However, there are many caveats to the use of such data. Electronic health record data from clinical settings may be inaccurate, incomplete, transformed in ways that undermine their meaning, unrecoverable for research, of unknown provenance, of insufficient granularity, and incompatible with research protocols. However, the quantity and real-world nature of these data provide impetus for their use, and we develop a list of caveats to inform would-be users of such data as well as provide an informatics roadmap that aims to ensure that this opportunity to augment comparative effectiveness research can be best leveraged.

Collaboration


Dive into George Hripcsak's collaboration.

Top Co-Authors

James J. Cimino

National Institutes of Health


Santiago Vilar

University of Santiago de Compostela
