Publication


Featured research published by Chunhua Weng.


Journal of the American Medical Informatics Association | 2013

Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research

Nicole Gray Weiskopf; Chunhua Weng

Objective: To review the methods and dimensions of data quality assessment in the context of electronic health record (EHR) data reuse for research. Materials and methods: A review of the clinical research literature discussing data quality assessment methodology for EHR data was performed. Using an iterative process, the aspects of data quality being measured were abstracted and categorized, as well as the methods of assessment used. Results: Five dimensions of data quality were identified, which are completeness, correctness, concordance, plausibility, and currency, and seven broad categories of data quality assessment methods: comparison with gold standards, data element agreement, data source agreement, distribution comparison, validity checks, log review, and element presence. Discussion: Examination of the methods by which clinical researchers have investigated the quality and suitability of EHR data for research shows that there are fundamental features of data quality, which may be difficult to measure, as well as proxy dimensions. Researchers interested in the reuse of EHR data for clinical research are recommended to consider the adoption of a consistent taxonomy of EHR data quality, to remain aware of the task-dependence of data quality, to integrate work on data quality assessment from other fields, and to adopt systematic, empirically driven, statistically based methods of data quality assessment. Conclusion: There is currently little consistency or potential generalizability in the methods used to assess EHR data quality. If the reuse of EHR data for clinical research is to become accepted, researchers should adopt validated, systematic methods of EHR data quality assessment.


Journal of Biomedical Informatics | 2010

Formal representation of eligibility criteria

Chunhua Weng; Samson W. Tu; Ida Sim; Rachel L. Richesson

Standards-based, computable knowledge representations for eligibility criteria are increasingly needed to provide computer-based decision support for automated research participant screening, clinical evidence application, and clinical research knowledge management. We surveyed the literature and identified five aspects of eligibility criteria knowledge representation that contribute to the various research and clinical applications: the intended use of computable eligibility criteria, the classification of eligibility criteria, the expression language for representing eligibility rules, the encoding of eligibility concepts, and the modeling of patient data. We consider three of these aspects (expression language, codification of eligibility concepts, and patient data modeling) to be essential constructs of a formal knowledge representation for eligibility criteria. The requirements for each of the three knowledge constructs vary for different use cases, which therefore should inform the development and choice of the constructs toward cost-effective knowledge representation efforts. We discuss the implications of our findings for standardization efforts toward knowledge representation for sharable and computable eligibility criteria.


Journal of Biomedical Informatics | 2013

Defining and measuring completeness of electronic health records for secondary use

Nicole Gray Weiskopf; George Hripcsak; Sushmita Swaminathan; Chunhua Weng

We demonstrate the importance of explicit definitions of electronic health record (EHR) data completeness and how different conceptualizations of completeness may impact findings from EHR-derived datasets. This study has important repercussions for researchers and clinicians engaged in the secondary use of EHR data. We describe four prototypical definitions of EHR completeness: documentation, breadth, density, and predictive completeness. Each definition dictates a different approach to the measurement of completeness. These measures were applied to representative data from NewYork-Presbyterian Hospital's clinical data warehouse. We found that according to any definition, the number of complete records in our clinical database is far lower than the nominal total. The proportion that meets criteria for completeness is heavily dependent on the definition of completeness used, and the different definitions generate different subsets of records. We conclude that the concept of completeness in EHR is contextual. We urge data consumers to be explicit in how they define a complete record and transparent about the limitations of their data.
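
As a rough illustration of how the chosen definition changes which records count as complete, the sketch below applies two simplified checks to toy records. The record layout, the required data types, the threshold, and the function names are all assumptions made for illustration; they are not the paper's operational definitions or its data.

```python
from datetime import date

# Hypothetical toy records: patient id -> list of (data_type, observation_date).
records = {
    "p1": [("diagnosis", date(2012, 1, 5)), ("lab", date(2012, 1, 5)),
           ("medication", date(2012, 3, 1)), ("lab", date(2012, 6, 9))],
    "p2": [("lab", date(2012, 2, 1)), ("lab", date(2012, 2, 2))],
    "p3": [("diagnosis", date(2012, 4, 4)), ("lab", date(2012, 4, 4)),
           ("medication", date(2012, 4, 4))],
}

REQUIRED_TYPES = {"diagnosis", "lab", "medication"}  # assumed breadth requirement
MIN_DISTINCT_DAYS = 3                                # assumed density threshold


def is_broad(observations):
    """Breadth-style check: every required data type appears at least once."""
    return REQUIRED_TYPES <= {data_type for data_type, _ in observations}


def is_dense(observations):
    """Density-style check: data were recorded on enough distinct days."""
    return len({day for _, day in observations}) >= MIN_DISTINCT_DAYS


broad = {pid for pid, obs in records.items() if is_broad(obs)}
dense = {pid for pid, obs in records.items() if is_dense(obs)}
print(sorted(broad), sorted(dense), sorted(broad & dense))
```

In this toy data the two checks select different subsets of patients, which is the kind of definition dependence the abstract describes.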


Journal of Biomedical Informatics | 2009

A review of auditing methods applied to the content of controlled biomedical terminologies

Xinxin Zhu; Jung Wei Fan; David M. Baorto; Chunhua Weng; James J. Cimino

Although controlled biomedical terminologies have been with us for centuries, it is only in the last couple of decades that close attention has been paid to the quality of these terminologies. The result of this attention has been the development of auditing methods that apply formal methods to assessing whether terminologies are complete and accurate. We have performed an extensive literature review to identify published descriptions of these methods and have created a framework for characterizing them. The framework considers manual, systematic and heuristic methods that use knowledge (within or external to the terminology) to measure quality factors of different aspects of the terminology content (terms, semantic classification, and semantic relationships). The quality factors examined included concept orientation, consistency, non-redundancy, soundness and comprehensive coverage. We reviewed 130 studies that were retrieved based on keyword search on publications in PubMed, and present our assessment of how they fit into our framework. We also identify which terminologies have been audited with the methods and provide examples to illustrate each part of the framework.


Conference on Computer Supported Cooperative Work | 2004

Asynchronous collaborative writing through annotations

Chunhua Weng; John H. Gennari

Annotation is central to iterative reviewing and revising activities in asynchronous collaborative writing. Currently most digital annotation models and systems assume static context information and provide far less functionality than physical annotations. We extend prior annotation research by Marshall and Cadiz and design an activity-oriented annotation model to mimic the rich functionality of physical annotations for an enhanced collaborative writing process. In this model, we define an annotation life cycle and support annotation version control. We implement a collaborative writing system that supports improved in-situ communication and cross-role feedback based on our annotation model.


Journal of Biomedical Informatics | 2011

Combining PubMed knowledge and EHR data to develop a weighted Bayesian network for pancreatic cancer prediction

Di Zhao; Chunhua Weng

In this paper, we propose a novel method that combines PubMed knowledge and Electronic Health Records to develop a weighted Bayesian Network Inference (BNI) model for pancreatic cancer prediction. We selected 20 common risk factors associated with pancreatic cancer and used PubMed knowledge to weigh the risk factors. A keyword-based algorithm was developed to extract and classify PubMed abstracts into three categories that represented positive, negative, or neutral associations between each risk factor and pancreatic cancer. Then we designed a weighted BNI model by adding the normalized weights into a conventional BNI model. We used this model to extract the EHR values for patients with or without pancreatic cancer, which then enabled us to calculate the prior probabilities for the 20 risk factors in the BNI. The software iDiagnosis was designed to use this weighted BNI model for predicting pancreatic cancer. In an evaluation using a case-control dataset, the weighted BNI model significantly outperformed the conventional BNI and two other classifiers (k-Nearest Neighbor and Support Vector Machine). We conclude that the weighted BNI using PubMed knowledge and EHR data shows remarkable accuracy improvement over existing representative methods for pancreatic cancer prediction.


Genetics in Medicine | 2013

Opportunities for genomic clinical decision support interventions

Casey Lynnette Overby; Isaac S. Kohane; Joseph Kannry; Marc S. Williams; Justin Starren; Erwin P. Bottinger; Omri Gottesman; Joshua C. Denny; Chunhua Weng; Peter Tarczy-Hornoch; George Hripcsak

The development and availability of genomic applications for use in clinical care is accelerating rapidly. The routine use of genomic information, however, is beyond most health-care providers’ formal training, and the challenges of understanding and interpreting genomic data are compounded by the demands of clinical practice. Nearly all physicians, for example, agree that genetic variations may influence drug response, but only a small fraction feel adequately informed about pharmacogenomic testing [1]. Clinical decision support (CDS) embedded into clinical information systems, such as the electronic health record (EHR) and the personal health record (PHR), is recognized as being necessary to facilitate the appropriate use of genomic applications [2–4]. CDS provides clinical knowledge and patient-specific information, filtered or presented at particular times to enhance clinical care [5]. CDS solutions can assist clinical-care providers with personalizing care and can incorporate the preferences of health-care consumers. EHRs and PHRs theoretically may support access to and storage of genetic data. These systems may also support data exchange between repositories and enable CDS embedment and linkage. The use of EHRs and PHRs in this manner depends on characteristics of the underlying health information technology (IT) infrastructure. This article seeks to provide a common ground for discussing CDS for genetic testing and for data access processes among heterogeneous health IT infrastructures. There are many lessons learned from more than five decades of experience with CDS that can be applied to CDS implementation in the era of genomic data. Indeed, existing CDS technologies already play a role in supporting genetic testing and data access processes.
In the following sections, we provide an overview of existing frameworks for local evaluation of health IT infrastructures for CDS, processes for genetic testing and data access, and the rationale behind the Electronic Medical Records and Genomics (eMERGE) Network’s [6] work on establishing a common ground for discussing CDS solutions among heterogeneous IT infrastructures. We also provide examples from eMERGE to illustrate that we can characterize genomic CDS using frameworks from the pregenomic CDS era, and outline lessons learned from implementing pregenomic CDS that can account for variation in health IT infrastructure. Finally, we propose a framework to describe opportunities for genomic CDS that can support provider- and consumer-initiated genetic testing and data access processes. The work in this article is complementary to that of the Clinical Sequence Exploratory Research Electronic Records Working Group, also in this special issue [7]. The Working Group’s manuscript surveys the six current Clinical Sequence Exploratory Research sites on the processes used for variant annotation, curation, report generation, and integration into the EHR, in order to determine commonalities, determine gaps, and to suggest future directions. This article takes a more top–down approach to system desiderata.


Journal of the American Medical Informatics Association | 2012

Using EHRs to integrate research with patient care: promises and challenges

Chunhua Weng; Paul S. Appelbaum; George Hripcsak; Ian M. Kronish; Linda Busacca; Karina W. Davidson; J. Thomas Bigger

Clinical research is the foundation for advancing the practice of medicine. However, the lack of seamless integration between clinical research and patient care workflow impedes recruitment efficiency, escalates research costs, and hence threatens the entire clinical research enterprise. Increased use of electronic health records (EHRs) holds promise for facilitating this integration but must surmount regulatory obstacles. Among the unintended consequences of current research oversight are barriers to accessing patient information for prescreening and recruitment, coordinating scheduling of clinical and research visits, and reconciling information about clinical and research drugs. We conclude that the EHR alone cannot overcome barriers in conducting clinical trials and comparative effectiveness research. Patient privacy and human subject protection policies should be clarified at the local level to exploit optimally the full potential of EHRs, while continuing to ensure participant safety. Increased alignment of policies that regulate the clinical and research use of EHRs could help fulfill the vision of more efficiently obtaining clinical research evidence to improve human health.


Journal of the American Medical Informatics Association | 2013

A collaborative approach to developing an electronic health record phenotyping algorithm for drug-induced liver injury

Casey Lynnette Overby; Jyotishman Pathak; Omri Gottesman; Krystl Haerian; Adler J. Perotte; Sean P. Murphy; Kevin Bruce; Stephanie M. Johnson; Jayant A. Talwalkar; Yufeng Shen; Steve Ellis; Iftikhar J. Kullo; Christopher G. Chute; Carol Friedman; Erwin P. Bottinger; George Hripcsak; Chunhua Weng

Objective: To describe a collaborative approach for developing an electronic health record (EHR) phenotyping algorithm for drug-induced liver injury (DILI). Methods: We analyzed types and causes of differences in DILI case definitions provided by two institutions (Columbia University and Mayo Clinic); harmonized two EHR phenotyping algorithms; and assessed the performance, measured by sensitivity, specificity, positive predictive value, and negative predictive value, of the resulting algorithm at three institutions, except that sensitivity was measured only at Columbia University. Results: Although these sites had the same case definition, their phenotyping methods differed by selection of liver injury diagnoses, inclusion of drugs cited in DILI cases, laboratory tests assessed, laboratory thresholds for liver injury, exclusion criteria, and approaches to validating phenotypes. We reached consensus on a DILI phenotyping algorithm and implemented it at three institutions. The algorithm was adapted locally to account for differences in populations and data access. Implementations collectively yielded 117 algorithm-selected cases and 23 confirmed true positive cases. Discussion: Phenotyping for rare conditions benefits significantly from pooling data across institutions. Despite the heterogeneity of EHRs and varied algorithm implementations, we demonstrated the portability of this algorithm across three institutions. The performance of this algorithm for identifying DILI was comparable with other computerized approaches to identify adverse drug events. Conclusions: Phenotyping algorithms developed for rare and complex conditions are likely to require adaptive implementation at multiple institutions. Better approaches are also needed to share algorithms. Early agreement on goals, data sources, and validation methods may improve the portability of the algorithms.
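
For readers unfamiliar with the four performance measures named in the abstract, the following minimal sketch shows their standard definitions in terms of confusion-matrix counts. The function name and the example counts are hypothetical and are not results from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # None signals an undefined metric (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # true cases the algorithm found
        "specificity": ratio(tn, tn + fp),  # non-cases correctly excluded
        "ppv": ratio(tp, tp + fp),          # selected cases that are true cases
        "npv": ratio(tn, tn + fn),          # excluded patients that are non-cases
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity needs the number of true cases the algorithm missed, which usually means reviewing patients the algorithm did not select; that extra review effort is one plausible reason a study might measure it at fewer sites than the other metrics.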


Journal of Biomedical Informatics | 2007

User-centered semantic harmonization: a case study

Chunhua Weng; John H. Gennari; Douglas B. Fridsma

Semantic interoperability is one of the great challenges in biomedical informatics. Methods such as ontology alignment or use of metadata neither scale nor fundamentally alleviate semantic heterogeneity among information sources. In the context of the Cancer Biomedical Informatics Grid program, the Biomedical Research Integrated Domain Group (BRIDG) has been making an ambitious effort to harmonize existing information models for clinical research from a variety of sources and to model agreed-upon semantics shared by the technical harmonization committee and the developers of these models. This paper provides some observations on this user-centered semantic harmonization effort and its inherent technical and social challenges. The authors also compare BRIDG with related efforts to achieve semantic interoperability in healthcare, including UMLS, InterMed, the Semantic Web, and the Ontology for Biomedical Investigations initiative. The BRIDG project demonstrates the feasibility of user-centered collaborative domain modeling as an approach to semantic harmonization, but also highlights a number of technology gaps in support of collaborative semantic harmonization that remain to be filled.

Collaboration



Top Co-Authors

Zhe He

Florida State University
