Publication


Featured research published by Henk Harkema.


Journal of Biomedical Informatics | 2009

ConText: An algorithm for determining negation, experiencer, and temporal status from clinical reports

Henk Harkema; John N. Dowling; Tyler Thornblade; Wendy W. Chapman

In this paper we describe an algorithm called ConText for determining whether clinical conditions mentioned in clinical reports are negated, hypothetical, historical, or experienced by someone other than the patient. The algorithm infers the status of a condition with regard to these properties from simple lexical clues occurring in the context of the condition. The discussion and evaluation of the algorithm presented in this paper address the questions of whether a simple surface-based approach which has been shown to work well for negation can be successfully transferred to other contextual properties of clinical conditions, and to what extent this approach is portable among different clinical report types. In our study we find that ConText obtains reasonable to good performance for negated, historical, and hypothetical conditions across all report types that contain such conditions. Conditions experienced by someone other than the patient are very rarely found in our report set. A comprehensive solution to the problem of determining whether a clinical condition is historical or recent requires knowledge above and beyond the surface clues picked up by ConText.
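The surface-based approach described above can be illustrated with a minimal, hypothetical sketch: trigger terms occurring in a small window to the left of a condition mention set its contextual status. The trigger lists and window size below are illustrative assumptions, not the published ConText lexicon or scope rules.

```python
import re

# Illustrative trigger lists (assumptions, not the ConText lexicon).
TRIGGERS = {
    "negated": ["no", "denies", "without", "ruled out"],
    "historical": ["history of", "h/o", "previous"],
    "hypothetical": ["if", "return for", "should"],
}

def context_status(sentence, condition):
    """Infer status flags for a condition from lexical clues preceding it."""
    status = {prop: False for prop in TRIGGERS}
    idx = sentence.lower().find(condition.lower())
    if idx == -1:
        return status
    # scope: the six words immediately preceding the condition mention
    window = " ".join(sentence.lower()[:idx].split()[-6:])
    for prop, cues in TRIGGERS.items():
        if any(re.search(r"\b" + re.escape(cue) + r"\b", window) for cue in cues):
            status[prop] = True
    return status

print(context_status("Patient denies chest pain.", "chest pain"))
# {'negated': True, 'historical': False, 'hypothetical': False}
```

A sketch like this also makes the paper's caveat concrete: a left-context lexical window cannot distinguish a historical condition described without any explicit cue, which is why the authors note that a comprehensive historical/recent classifier needs knowledge beyond surface clues.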


Journal of the American Medical Informatics Association | 2011

Developing a natural language processing application for measuring the quality of colonoscopy procedures

Henk Harkema; Wendy W. Chapman; Melissa I. Saul; Evan S. Dellon; Robert E. Schoen; Ateev Mehrotra

OBJECTIVE: The quality of colonoscopy procedures for colorectal cancer screening is often inadequate and varies widely among physicians. Routine measurement of quality is limited by the costs of manual review of free-text patient charts. Our goal was to develop a natural language processing (NLP) application to measure colonoscopy quality. MATERIALS AND METHODS: Using a set of quality measures published by physician specialty societies, we implemented an NLP engine that extracts 21 variables for 19 quality measures from free-text colonoscopy and pathology reports. We evaluated the performance of the NLP engine on a test set of 453 colonoscopy reports and 226 pathology reports, considering accuracy in extracting the values of the target variables from text, and the reliability of the outcomes of the quality measures as computed from the NLP-extracted information. RESULTS: The average accuracy of the NLP engine over all variables was 0.89 (range: 0.62-1.0) and the average F measure over all variables was 0.74 (range: 0.49-0.89). The average agreement score, measured as Cohen's κ, between the manually established and NLP-derived outcomes of the quality measures was 0.62 (range: 0.09-0.86). DISCUSSION: For nine of the 19 colonoscopy quality measures, the agreement score was 0.70 or above, which we consider a sufficient score for the NLP-derived outcomes of these measures to be practically useful for quality measurement. CONCLUSION: The use of NLP for information extraction from free-text colonoscopy and pathology reports creates opportunities for large-scale, routine quality measurement, which can support quality improvement in colonoscopy care.
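The variable-extraction step can be sketched with simple rules: each target variable is mapped to patterns searched over the report text. The two variables and patterns below are illustrative assumptions only; the actual engine extracts 21 variables for 19 quality measures.

```python
import re

# Illustrative patterns for two hypothetical variables (assumptions,
# not the published engine's rules).
PATTERNS = {
    "prep_quality": re.compile(
        r"prep(?:aration)?\s+(?:was|is)\s+(excellent|good|fair|poor)", re.I),
    "cecum_reached": re.compile(
        r"cecum|ileocecal valve|appendiceal orifice", re.I),
}

def extract_variables(report):
    """Map a free-text report to variable values for quality measures."""
    m = PATTERNS["prep_quality"].search(report)
    return {
        "prep_quality": m.group(1).lower() if m else None,
        "cecum_reached": bool(PATTERNS["cecum_reached"].search(report)),
    }

report = "Bowel preparation was good. The cecum was reached and photographed."
print(extract_variables(report))
# {'prep_quality': 'good', 'cecum_reached': True}
```

Extracted values like these can then be aggregated across reports to compute measure outcomes such as documentation rates, which is the kind of reliability the paper evaluates with Cohen's κ against manual review.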


Gastrointestinal Endoscopy | 2012

Applying a Natural Language Processing Tool to Electronic Health Records to Assess Performance on Colonoscopy Quality Measures

Ateev Mehrotra; Evan S. Dellon; Robert E. Schoen; Melissa I. Saul; Faraz Bishehsari; Carrie M. Farmer; Henk Harkema

BACKGROUND: Gastroenterology specialty societies have advocated that providers routinely assess their performance on colonoscopy quality measures. Such routine measurement has been hampered by the costs and time required to manually review colonoscopy and pathology reports. Natural language processing (NLP) is a field of computer science in which programs are trained to extract relevant information from text reports in an automated fashion. OBJECTIVE: To demonstrate the efficiency and potential of NLP-based colonoscopy quality measurement. DESIGN: In a cross-sectional study design, we used a previously validated NLP program to analyze colonoscopy reports and associated pathology notes. The resulting data were used to generate provider performance on colonoscopy quality measures. SETTING: Nine hospitals in the University of Pittsburgh Medical Center health care system. PATIENTS: The study sample consisted of 24,157 colonoscopy reports and associated pathology reports from 2008 to 2009. MAIN OUTCOME MEASUREMENTS: Provider performance on 7 quality measures. RESULTS: Performance on the colonoscopy quality measures was generally poor, and there was a wide range of performance. For example, across hospitals, the adequacy of preparation was noted overall in only 45.7% of procedures (range 14.6%-86.1% across 9 hospitals), cecal landmarks were documented in 62.7% of procedures (range 11.6%-90.0%), and the adenoma detection rate was 25.2% (range 14.9%-33.9%). LIMITATIONS: Our quality assessment was limited to a single health care system in western Pennsylvania. CONCLUSIONS: Our study illustrates how NLP can mine free-text data in electronic records to measure and report on the quality of care. Even within a single academic hospital system, there is considerable variation in the performance on colonoscopy quality measures, demonstrating the need for better methods to regularly and efficiently assess quality.


Journal of Biomedical Informatics | 2012

Building an automated SOAP classifier for emergency department reports

Danielle L. Mowery; Janyce Wiebe; Shyam Visweswaran; Henk Harkema; Wendy W. Chapman

Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance, with F(1) scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
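The task can be illustrated with a minimal, hypothetical sketch that scores a sentence against bag-of-words cue sets for each SOAP class. The cue sets and fallback class below are illustrative assumptions, not the trained linguistic features reported in the paper.

```python
# Illustrative cue sets per SOAP class (assumptions of this sketch).
CUES = {
    "subjective": {"complains", "reports", "denies", "states"},
    "objective": {"exam", "bp", "temperature", "labs", "afebrile"},
    "assessment": {"likely", "consistent", "impression", "diagnosis"},
    "plan": {"discharge", "admit", "prescribe", "start", "follow-up"},
}

def classify_soap(sentence):
    """Assign a sentence to the SOAP class with the most cue hits."""
    tokens = set(sentence.lower().rstrip(".").split())
    scores = {label: len(tokens & cues) for label, cues in CUES.items()}
    best = max(scores, key=scores.get)
    # arbitrary fallback class when no cue fires (an assumption here)
    return best if scores[best] > 0 else "objective"

print(classify_soap("Patient reports abdominal pain since yesterday."))
# subjective
```

A trained classifier over richer linguistic features, as in the study, replaces the hand-picked cue sets with learned weights, but the sentence-level classification setup is the same.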


North American Chapter of the Association for Computational Linguistics (NAACL) | 2009

ONYX: A System for the Semantic Analysis of Clinical Text

Lee M. Christensen; Henk Harkema; Peter J. Haug; Jeannie Yuhaniak Irwin; Wendy W. Chapman

This paper introduces ONYX, a sentence-level text analyzer that implements a number of innovative ideas in syntactic and semantic analysis. ONYX is being developed as part of a project that seeks to translate spoken dental examinations directly into chartable findings. ONYX integrates syntax and semantics to a high degree. It interprets sentences using a combination of probabilistic classifiers, graphical unification, and semantically annotated grammar rules. In this preliminary evaluation, ONYX shows inter-annotator agreement scores with humans of 86% for assigning semantic types to relevant words, 80% for inferring relevant concepts from words, and 76% for identifying relations between concepts.


North American Chapter of the Association for Computational Linguistics (NAACL) | 2009

Distinguishing Historical from Current Problems in Clinical Reports -- Which Textual Features Help?

Danielle L. Mowery; Henk Harkema; John N. Dowling; Jonathan L. Lustgarten; Wendy W. Chapman

Determining whether a condition is historical or recent is important for accurate results in biomedicine. In this paper, we investigate four types of information found in clinical text that might be used to make this distinction. We conducted a descriptive, exploratory study using annotation on clinical reports to determine whether this temporal information is useful for classifying conditions as historical or recent. Our initial results suggest that few of these feature values can be used to predict temporal classification.


Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing | 2008

Temporal Annotation of Clinical Text

Danielle L. Mowery; Henk Harkema; Wendy W. Chapman

We developed a temporal annotation schema that provides a structured method to capture contextual and temporal features of clinical conditions found in clinical reports. In this poster we describe the elements of the annotation schema and provide results of an initial annotation study on a document set comprising six different types of clinical reports.


International Health Informatics Symposium | 2010

Leveraging the semantic web and natural language processing to enhance drug-mechanism knowledge in drug product labels

Richard D. Boyce; Henk Harkema; Mike Conway

Multiple studies indicate that drug-drug interactions are a significant source of preventable adverse drug events. Factors contributing to the occurrence of preventable ADEs resulting from DDIs include a lack of knowledge of the patients concurrent medications and inaccurate or inadequate knowledge of interactions by health care providers. FDA-approved drug product labeling is a major source of information intended to help clinicians prescribe drugs in a safe and effective manner. Unfortunately, drug product labeling has been identified as often lagging behind emerging drug knowledge; especially when it has been several years since a drug has been released to the market. In this paper we report on a novel approach that explores employing Semantic Web technology and natural language processing to identify drug mechanism information that may update or expand upon statements present in product labeling.


BioNLP '12: Workshop on Biomedical Natural Language Processing | 2012

Using natural language processing to identify pharmacokinetic drug-drug interactions described in drug package inserts

Richard D. Boyce; Gregory Gardner; Henk Harkema


American Medical Informatics Association (AMIA) Annual Symposium | 2013

Semantic annotation of clinical events for generating a problem list.

Danielle L. Mowery; Pamela W. Jordan; Janyce Wiebe; Henk Harkema; John N. Dowling; Wendy W. Chapman

Collaboration

Top co-authors:

Evan S. Dellon, University of North Carolina at Chapel Hill
Janyce Wiebe, University of Pittsburgh
Peter J. Haug, Intermountain Healthcare