Publication


Featured research published by Kin Wah Fung.


Meeting of the Association for Computational Linguistics | 2007

From indexing the biomedical literature to coding clinical text: experience with MTI and machine learning approaches

Alan R. Aronson; Olivier Bodenreider; Dina Demner-Fushman; Kin Wah Fung; Vivian K. Lee; James G. Mork; Aurélie Névéol; Lee B. Peters; Willie J. Rogers

This paper describes the application of an ensemble of indexing and classification systems, which have been shown to be successful in information retrieval and classification of medical literature, to a new task of assigning ICD-9-CM codes to the clinical history and impression sections of radiology reports. The basic methods used are: a modification of the NLM Medical Text Indexer system, SVM, k-NN and a simple pattern-matching method. The basic methods are combined using a variant of stacking. Evaluated in the context of a Medical NLP Challenge, fusion produced an F-score of 0.85 on the Challenge test set, which is considerably above the mean Challenge F-score of 0.77 for 44 participating groups.
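The F-scores quoted above summarize how well assigned code sets match the gold-standard code sets. The abstract does not say how the Challenge averaged its score, so the following is only a minimal sketch that assumes a micro-averaged F-score over multi-label assignments. The function name `micro_f_score` and the toy code labels are hypothetical.

```python
def micro_f_score(gold, predicted):
    """Micro-averaged F-score over per-document code sets.

    gold, predicted: equal-length lists of sets, one set of codes per report.
    Assumption: micro-averaging; the Challenge's exact metric may differ.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = fn = 0
    for gold_codes, predicted_codes in zip(gold, predicted):
        tp += len(gold_codes & predicted_codes)   # codes correctly assigned
        fp += len(predicted_codes - gold_codes)   # codes assigned in error
        fn += len(gold_codes - predicted_codes)   # codes that were missed

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, not real ICD-9-CM codes.
gold = [{"A", "B"}, {"C"}]
predicted = [{"A"}, {"C", "D"}]
score = micro_f_score(gold, predicted)
```

Micro-averaging pools the counts across all reports before computing precision and recall, so frequent codes weigh more than rare ones.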


Journal of the American Medical Informatics Association | 2005

Integrating SNOMED CT into the UMLS: An Exploration of Different Views of Synonymy and Quality of Editing

Kin Wah Fung; William T. Hole; Stuart J. Nelson; Suresh Srinivasan; Tammy Powell; Laura Roth

OBJECTIVE The integration of SNOMED CT into the Unified Medical Language System (UMLS) involved the alignment of two views of synonymy that were different because the two vocabulary systems have different intended purposes and editing principles. The UMLS is organized according to one view of synonymy, but its structure also represents all the individual views of synonymy present in its source vocabularies. Despite progress in knowledge-based automation of development and maintenance of vocabularies, manual curation is still the main method of determining synonymy. The aim of this study was to investigate the quality of human judgment of synonymy. DESIGN Sixty pairs of potentially controversial SNOMED CT synonyms were reviewed by 11 domain vocabulary experts (six UMLS editors and five noneditors), and scores were assigned according to the degree of synonymy. MEASUREMENTS The synonymy scores of each subject were compared to the gold standard (the overall mean synonymy score of all subjects) to assess accuracy. Agreement between UMLS editors and noneditors was measured by comparing the mean synonymy scores of editors to noneditors. RESULTS Average accuracy was 71% for UMLS editors and 75% for noneditors (difference not statistically significant). Mean scores of editors and noneditors showed significant positive correlation (Spearman's rank correlation coefficient 0.654, two-tailed p < 0.01) with a concurrence rate of 75% and an interrater agreement kappa of 0.43. CONCLUSION The accuracy in the judgment of synonymy was comparable for UMLS editors and nonediting domain experts. There was reasonable agreement between the two groups.
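The concurrence rate and kappa reported above are related: kappa corrects the observed agreement for the agreement expected by chance. The abstract does not state which kappa variant was used or how the synonymy scores were categorized, so this sketch assumes Cohen's kappa for two raters over categorical labels. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: categorical labels (for example "syn" / "not"); the study's
    own categorization of synonymy scores is not given in the abstract.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


editors = ["syn", "syn", "not", "not"]
noneditors = ["syn", "not", "not", "not"]
kappa = cohen_kappa(editors, noneditors)  # observed 0.75, expected 0.5
```

In this toy example the raters agree on 75% of items, but half of that agreement is expected by chance, which is why kappa comes out well below the raw agreement rate.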


Journal of the American Medical Informatics Association | 2013

Extracting drug indication information from structured product labels using natural language processing

Kin Wah Fung; Chiang S Jao; Dina Demner-Fushman

OBJECTIVE To extract drug indications from structured drug labels and represent the information using codes from standard medical terminologies. MATERIALS AND METHODS We used MetaMap and other publicly available resources to extract information from the indications section of drug labels. Drugs and indications were encoded by RxNorm and UMLS identifiers respectively. A sample was manually reviewed. We also compared the results with two independent information sources: National Drug File-Reference Terminology and the Semantic Medline project. RESULTS A total of 6797 drug labels were processed, resulting in 19 473 unique drug-indication pairs. Manual review of 298 most frequently prescribed drugs by seven physicians showed a recall of 0.95 and precision of 0.77. Inter-rater agreement (Fleiss κ) was 0.713. The precision of the subset of results corroborated by Semantic Medline extractions increased to 0.93. DISCUSSION Correlation of a patient's medical problems and drugs in an electronic health record has been used to improve data quality and reduce medication errors. Authoritative drug indication information is available from drug labels, but not in a format readily usable by computer applications. Our study shows that it is feasible to use publicly available natural language processing resources to extract drug indications from drug labels. The same method can be applied to other sections of the drug label, for example, adverse effects and contraindications. CONCLUSIONS It is feasible to use publicly available natural language processing tools to extract indication information from freely available drug labels. Named entity recognition sources (eg, MetaMap) provide reasonable recall. Combination with other data sources provides higher precision.


Contemporary Clinical Trials | 2008

Heterogeneous but "standard" coding systems for adverse events: Issues in achieving interoperability between apples and oranges.

Rachel L. Richesson; Kin Wah Fung; Jeffrey P. Krischer

Monitoring adverse events (AEs) is an important part of clinical research and a crucial target for data standards. The representation of adverse events themselves requires the use of controlled vocabularies with thousands of needed clinical concepts. Several data standards for adverse events currently exist, each with a strong user base. The structure and features of these current adverse event data standards (including terminologies and classifications) are different, so comparisons and evaluations are not straightforward, nor are strategies for their harmonization. Three different data standards - the Medical Dictionary for Regulatory Activities (MedDRA) and the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) terminologies, and Common Terminology Criteria for Adverse Events (CTCAE) classification - are explored as candidate representations for AEs. This paper describes the structural features of each coding system, their content and relationship to the Unified Medical Language System (UMLS), and unsettled issues for future interoperability of these standards.


Annals of Emergency Medicine | 2013

Comparison of Electronic Pharmacy Prescription Records With Manually Collected Medication Histories in an Emergency Department

Kin Wah Fung; Mehmet Kayaalp; Fiona M. Callaghan; Clement J. McDonald

STUDY OBJECTIVE Medication history is an essential part of patient assessment in emergency care. Patient-reported medication history can be incomplete. We study whether an electronic pharmacy-sourced prescription record can supplement the patient-reported history. METHODS In a community hospital, we compared the patient-reported history obtained by triage nurses to a proprietary electronic pharmacy record in all emergency department (ED) patients during 3 months. RESULTS Of 9,426 triaged patients, 5,001 (53%) had at least 1 (mean 7.7) prescription medication in the full-year electronic pharmacy record. Counting only recent prescription medications (supply lasting to at least 7 days before the ED visit), 3,688 patients (39%) had at least 1 (mean 4.0) recent medication. After adjustment for possible false-positive results, recent electronic prescription medication record enriched the patient-reported history by 28% (adding 1.1 drugs per patient). However, only 60% of patients with any active prescription medications from either source had any recent prescription medications in their electronic pharmacy record. CONCLUSION The electronic pharmacy prescription record augments the manually collected history.


World Congress on Medical and Health Informatics, MEDINFO | 2013

Semantic interoperation and electronic health records: context sensitive mapping from SNOMED CT to ICD-10.

James R. Campbell; Hazel Brear; Rita A Scichilone; Susan White; Kathy Giannangelo; Brian Carlsen; Harold R. Solbrig; Kin Wah Fung

An important case for successful deployment of a lifetime electronic health record is reuse of clinical data from the electronic health record (EHR) for epidemiology, reimbursement, and research. We report a collaboration between the IHTSDO and the WHO to develop knowledge-based tools supporting translation of data from SNOMED CT to the ICD-10 classification. These tools have been vetted by an international community and are available for system vendors to enhance the interoperability of their products. The maps we created are also informing the development of the next generation of classifications which will employ a common ontology base between SNOMED CT and ICD-11 to promote interoperability.


Journal of the American Medical Informatics Association | 2015

An exploration of the properties of the CORE problem list subset and how it facilitates the implementation of SNOMED CT

Kin Wah Fung; Julia Xu

Objective Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is the emergent international health terminology standard for encoding clinical information in electronic health records. The CORE Problem List Subset was created to facilitate the terminology’s implementation. This study evaluates the CORE Subset’s coverage and examines its growth pattern as source datasets are being incorporated. Methods Coverage of frequently used terms and the corresponding usage of the covered terms were assessed by “leave-one-out” analysis of the eight datasets constituting the current CORE Subset. The growth pattern was studied using a retrospective experiment, growing the Subset one dataset at a time and examining the relationship between the size of the starting subset and the coverage of frequently used terms in the incoming dataset. Linear regression was used to model that relationship. Results On average, the CORE Subset covered 80.3% of the frequently used terms of the left-out dataset, and the covered terms accounted for 83.7% of term usage. There was a significant positive correlation between the CORE Subset’s size and the coverage of the frequently used terms in an incoming dataset. This implies that the CORE Subset will grow at a progressively slower pace as it gets bigger. Conclusion The CORE Problem List Subset is a useful resource for the implementation of Systematized Nomenclature of Medicine Clinical Terms in electronic health records. It offers good coverage of frequently used terms, which account for a high proportion of term usage. If future datasets are incorporated into the CORE Subset, it is likely that its size will remain small and manageable.
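The "leave-one-out" analysis described above can be sketched as follows. This is a simplification under stated assumptions: the subset is taken to be the plain union of terms from the remaining datasets, and each dataset is a dictionary of term usage counts. The abstract does not describe how the real CORE Subset selects frequently used terms, and the function name and toy data are hypothetical.

```python
def leave_one_out_coverage(datasets):
    """For each dataset, measure how well the other datasets cover it.

    datasets: dict mapping dataset name -> dict of {term: usage count}.
    Returns {name: (term_coverage, usage_coverage)}.
    Assumption: the subset is the union of all terms in the other datasets.
    """
    results = {}
    for name, usage in datasets.items():
        total_usage = sum(usage.values())
        if not usage or total_usage == 0:
            raise ValueError(f"dataset {name!r} has no usage data")

        # Build the subset from every dataset except the one left out.
        subset = set()
        for other_name, other_usage in datasets.items():
            if other_name != name:
                subset.update(other_usage)

        covered = [term for term in usage if term in subset]
        term_coverage = len(covered) / len(usage)
        usage_coverage = sum(usage[term] for term in covered) / total_usage
        results[name] = (term_coverage, usage_coverage)
    return results


# Toy data: three hypothetical sites with made-up term identifiers.
datasets = {
    "site_a": {"t1": 50, "t2": 30, "t3": 20},
    "site_b": {"t1": 10, "t4": 10},
    "site_c": {"t2": 5, "t4": 15},
}
coverage = leave_one_out_coverage(datasets)
```

Reporting both figures matters because a subset can miss some terms yet still cover most actual usage when the missed terms are rarely used.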


Journal of the American Medical Informatics Association | 2017

Comparison of three commercial knowledge bases for detection of drug-drug interactions in clinical decision support

Kin Wah Fung; Joan Kapusnik-Uner; Jean Cunningham; Stefanie Higby-Baker; Olivier Bodenreider

Objective To compare 3 commercial knowledge bases (KBs) used for detection and avoidance of potential drug-drug interactions (DDIs) in clinical practice. Methods Drugs in the DDI tables from First DataBank (FDB), Micromedex, and Multum were mapped to RxNorm. The KBs were compared at the clinical drug, ingredient, and DDI rule levels. The KBs were evaluated against a reference list of highly significant DDIs from the Office of the National Coordinator for Health Information Technology (ONC). The KBs and the ONC list were applied to a prescription data set to simulate their use in clinical decision support. Results The KBs contained 1.6 million (FDB), 4.5 million (Micromedex), and 4.8 million (Multum) clinical drug pairs. Altogether, there were 8.6 million unique pairs, of which 79% were found only in 1 KB and 5% in all 3 KBs. However, there was generally more agreement than disagreement in the severity rankings, especially in the contraindicated category. The KBs covered 99.8-99.9% of the alerts of the ONC list and would have generated 25 (FDB), 145 (Micromedex), and 84 (Multum) alerts per 1000 prescriptions. Conclusion The commercial KBs differ considerably in size and quantity of alerts generated. There is less variability in severity ranking of DDIs than suggested by previous studies. All KBs provide very good coverage of the ONC list. More work is needed to standardize the editorial policies and evidence for inclusion of DDIs to reduce variation among knowledge sources and improve relevance. Some DDIs considered contraindicated in all 3 KBs might be possible candidates to add to the ONC list.
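The overlap figures above (pairs found in only 1 KB versus all 3) come from counting, for each unique drug pair, how many knowledge bases contain it. A minimal sketch of that counting step follows. It assumes drug pairs are unordered and already normalized to shared identifiers, which the study did through RxNorm. The function name, KB names, and drug identifiers are hypothetical.

```python
from collections import Counter


def overlap_profile(kbs):
    """Share of unique drug pairs found in exactly 1, 2, ... n knowledge bases.

    kbs: dict mapping KB name -> iterable of (drug_a, drug_b) tuples.
    Pairs are treated as unordered, so (a, b) and (b, a) are the same pair.
    """
    kb_count_per_pair = Counter()
    for pairs in kbs.values():
        # Deduplicate within a KB so each KB counts at most once per pair.
        unique_pairs = {frozenset(pair) for pair in pairs}
        kb_count_per_pair.update(unique_pairs)

    total = len(kb_count_per_pair)
    if total == 0:
        return {}
    pairs_by_kb_count = Counter(kb_count_per_pair.values())
    return {n: pairs_by_kb_count[n] / total for n in range(1, len(kbs) + 1)}


# Toy data with made-up drug identifiers.
kbs = {
    "kb1": [("d1", "d2"), ("d3", "d4")],
    "kb2": [("d2", "d1"), ("d5", "d6")],
    "kb3": [("d1", "d2"), ("d7", "d8")],
}
profile = overlap_profile(kbs)  # {1: 0.75, 2: 0.0, 3: 0.25}
```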


Journal of the American Medical Informatics Association | 2018

A value set for documenting adverse reactions in electronic health records

Foster R. Goss; Kenneth H. Lai; Maxim Topaz; Warren W. Acker; Leigh Kowalski; Joseph M. Plasek; Kimberly G. Blumenthal; Diane L. Seger; Sarah P. Slight; Kin Wah Fung; Frank Y. Chang; David W. Bates; Li Zhou

Objective To develop a comprehensive value set for documenting and encoding adverse reactions in the allergy module of an electronic health record. Materials and Methods We analyzed 2 471 004 adverse reactions stored in Partners Healthcare's Enterprise-wide Allergy Repository (PEAR) of 2.7 million patients. Using the Medical Text Extraction, Reasoning, and Mapping System, we processed both structured and free-text reaction entries and mapped them to Systematized Nomenclature of Medicine - Clinical Terms. We calculated the frequencies of reaction concepts, including rare, severe, and hypersensitivity reactions. We compared PEAR concepts to a Federal Health Information Modeling and Standards value set and University of Nebraska Medical Center data, and then created an integrated value set. Results We identified 787 reaction concepts in PEAR. Frequently reported reactions included: rash (14.0%), hives (8.2%), gastrointestinal irritation (5.5%), itching (3.2%), and anaphylaxis (2.5%). We identified an additional 320 concepts from Federal Health Information Modeling and Standards and the University of Nebraska Medical Center to resolve gaps due to missing and partial matches when comparing these external resources to PEAR. This yielded 1106 concepts in our final integrated value set. The presence of rare, severe, and hypersensitivity reactions was limited in both external datasets. Hypersensitivity reactions represented roughly 20% of the reactions within our data. Discussion We developed a value set for encoding adverse reactions using a large dataset from one health system, enriched by reactions from 2 large external resources. This integrated value set includes clinically important severe and hypersensitivity reactions. Conclusion This work contributes a value set, harmonized with existing data, to improve the consistency and accuracy of reaction documentation in electronic health records, providing the necessary building blocks for more intelligent clinical decision support for allergies and adverse reactions.


eGEMs (Generating Evidence & Methods to improve patient outcomes) | 2016

Preparing for the ICD-10-CM Transition: Automated Methods for Translating ICD Codes in Clinical Phenotype Definitions.

Kin Wah Fung; Rachel L. Richesson; Michelle Smerek; Katherine Pereira; Beverly B. Green; Ashwin A. Patkar; Megan Clowse; Alan Bauck; Olivier Bodenreider

Background: The national mandate for health systems to transition from ICD-9-CM to ICD-10-CM in October 2015 has an impact on research activities. Clinical phenotypes defined by ICD-9-CM codes need to be converted to ICD-10-CM, which has nearly four times more codes and a very different structure than ICD-9-CM. Methods: We used the Centers for Medicare & Medicaid Services (CMS) General Equivalent Maps (GEMs) to translate, using four different methods, condition-specific ICD-9-CM code sets used for pragmatic trials (n=32) into ICD-10-CM. We calculated the recall, precision, and F score of each method. We also used the ICD-9-CM and ICD-10-CM value sets defined for electronic quality measure as an additional evaluation of the mapping methods. Results: The forward-backward mapping (FBM) method had higher precision, recall and F-score metrics than simple forward mapping (SFM). The more aggressive secondary (SM) and tertiary mapping (TM) methods resulted in higher recall but lower precision. For clinical phenotype definition, FBM was the best (F=0.67), but was close to SM (F=0.62) and TM (F=0.60), judging on the F-scores alone. The overall difference between the four methods was statistically significant (one-way ANOVA, F=5.749, p=0.001). However, pairwise comparisons between FBM, SM, and TM did not reach statistical significance. A similar trend was found for the quality measure value sets. Discussion: The optimal method for using the GEMs depends on the relative importance of recall versus precision for a given use case. It appears that for clinically distinct and homogenous conditions, the recall of FBM is sufficient. The performance of all mapping methods was lower for heterogeneous conditions. Since code sets used for phenotype definition and quality measurement can be very similar, there is a possibility of cross-fertilization between the two activities. Conclusion: Different mapping approaches yield different collections of ICD-10-CM codes. All methods require some level of human validation.
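The abstract names the mapping methods without defining them, so the following is only one plausible reading, offered as a sketch. It assumes that simple forward mapping (SFM) looks up each ICD-9-CM code in the forward (9-to-10) GEM, and that forward-backward mapping (FBM) additionally includes every ICD-10-CM code whose backward (10-to-9) GEM entry points into the source code set. The function names, the dictionary representation of the GEMs, and the toy code identifiers are all hypothetical.

```python
def simple_forward_map(icd9_codes, forward_gem):
    """Collect ICD-10-CM targets of each source code in the forward GEM."""
    targets = set()
    for code in icd9_codes:
        targets.update(forward_gem.get(code, ()))
    return targets


def forward_backward_map(icd9_codes, forward_gem, backward_gem):
    """Forward mapping plus a reverse lookup in the backward GEM.

    Assumption: an ICD-10-CM code is added when any of its backward-GEM
    targets is one of the source ICD-9-CM codes.
    """
    source = set(icd9_codes)
    targets = simple_forward_map(source, forward_gem)
    for icd10_code, icd9_targets in backward_gem.items():
        if source & set(icd9_targets):
            targets.add(icd10_code)
    return targets


# Toy maps with made-up identifiers, not real ICD codes or GEM entries.
forward_gem = {"9A": ["10A"], "9B": ["10B"]}
backward_gem = {"10A": ["9A"], "10C": ["9A"], "10D": ["9Z"]}

sfm = simple_forward_map({"9A"}, forward_gem)                  # {"10A"}
fbm = forward_backward_map({"9A"}, forward_gem, backward_gem)  # {"10A", "10C"}
```

Under this reading, FBM can only add codes to the SFM result, which is consistent with the higher recall reported for it. The resulting code set would still need the human validation that the abstract calls for.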

Collaboration


Dive into Kin Wah Fung's collaborations.

Top Co-Authors

Olivier Bodenreider
National Institutes of Health

Dina Demner-Fushman
National Institutes of Health

Suresh Srinivasan
National Institutes of Health

Clement J. McDonald
National Institutes of Health

Foster R. Goss
University of Colorado Boulder

Maxim Topaz
Brigham and Women's Hospital