Publication


Featured research published by Yizhao Ni.


Journal of the American Medical Informatics Association | 2015

Desiderata for computable representations of electronic health records-driven phenotype algorithms.

Huan Mo; William K. Thompson; Luke V. Rasmussen; Jennifer A. Pacheco; Guoqian Jiang; Richard C. Kiefer; Qian Zhu; Jie Xu; Enid Montague; David Carrell; Todd Lingren; Frank D. Mentch; Yizhao Ni; Firas H. Wehbe; Peggy L. Peissig; Gerard Tromp; Eric B. Larson; Christopher G. Chute; Jyotishman Pathak; Joshua C. Denny; Peter Speltz; Abel N. Kho; Gail P. Jarvik; Cosmin Adrian Bejan; Marc S. Williams; Kenneth M. Borthwick; Terrie Kitchner; Dan M. Roden; Paul A. Harris

Background Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM). Methods A team of clinicians and informaticians reviewed common features for multisite phenotype algorithms published in PheKB.org and existing phenotype representation platforms. We also evaluated well-known diagnostic criteria and clinical decision-making guidelines to encompass a broader category of algorithms. Results We propose 10 desired characteristics for a flexible, computable PheRM: (1) structure clinical data into queryable forms; (2) recommend use of a common data model, but also support customization for the variability and availability of EHR data among sites; (3) support both human-readable and computable representations of phenotype algorithms; (4) implement set operations and relational algebra for modeling phenotype algorithms; (5) represent phenotype criteria with structured rules; (6) support defining temporal relations between events; (7) use standardized terminologies and ontologies, and facilitate reuse of value sets; (8) define representations for text searching and natural language processing; (9) provide interfaces for external software algorithms; and (10) maintain backward compatibility. Conclusion A computable PheRM is needed for true phenotype portability and reliability across different EHR products and healthcare systems. These desiderata are a guide to inform the establishment and evolution of EHR phenotype algorithm authoring platforms and languages.
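
Desiderata 4 and 5 above call for phenotype criteria to be expressed as structured rules combined through set operations and relational algebra. The following minimal sketch illustrates that idea in Python; the field names, ICD-9 code, medication, lab threshold, and the toy diabetes-like rule are all hypothetical and are not drawn from PheKB or any published algorithm.

```python
# A minimal, hypothetical sketch of desiderata 4-5: phenotype criteria as
# structured rules over queryable data, combined with set operations.
# The field names, codes, and the toy rule below are illustrative only and
# are not taken from PheKB or any published algorithm.

patients = {
    "p1": {"icd9": {"250.00"}, "meds": {"metformin"}, "labs": {"hba1c": 7.2}},
    "p2": {"icd9": {"401.9"},  "meds": {"lisinopril"}, "labs": {"hba1c": 5.4}},
    "p3": {"icd9": {"250.00"}, "meds": set(),          "labs": {"hba1c": 8.1}},
}

def with_code(code):
    """Set of patient IDs carrying a given ICD-9 code (a 'queryable form')."""
    return {pid for pid, rec in patients.items() if code in rec["icd9"]}

def with_med(name):
    return {pid for pid, rec in patients.items() if name in rec["meds"]}

def with_lab_at_least(lab, threshold):
    return {pid for pid, rec in patients.items()
            if rec["labs"].get(lab, float("-inf")) >= threshold}

# Structured rule expressed with set algebra:
# (diagnosis code AND (medication OR elevated lab value))
cohort = with_code("250.00") & (with_med("metformin") | with_lab_at_least("hba1c", 6.5))
print(sorted(cohort))  # ['p1', 'p3']
```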


Journal of the American Medical Informatics Association | 2014

Phenotyping for patient safety: algorithm development for electronic health record based automated adverse event and medical error detection in neonatal intensive care

Qi Li; Kristin Melton; Todd Lingren; Eric S. Kirkendall; Eric S. Hall; Haijun Zhai; Yizhao Ni; Megan Kaiser; Laura Stoutenborough; Imre Solti

Background Although electronic health records (EHRs) have the potential to provide a foundation for quality and safety algorithms, few studies have measured their impact on automated adverse event (AE) and medical error (ME) detection within the neonatal intensive care unit (NICU) environment. Objective This paper presents two phenotyping AE and ME detection algorithms (ie, IV infiltrations, narcotic medication oversedation and dosing errors) and describes manual annotation of airway management and medication/fluid AEs from NICU EHRs. Methods From 753 NICU patient EHRs from 2011, we developed two automatic AE/ME detection algorithms, and manually annotated 11 classes of AEs in 3263 clinical notes. Performance of the automatic AE/ME detection algorithms was compared to trigger tool and voluntary incident reporting results. AEs in clinical notes were double annotated and consensus achieved under neonatologist supervision. Sensitivity, positive predictive value (PPV), and specificity are reported. Results Twelve severe IV infiltrates were detected. The algorithm identified one more infiltrate than the trigger tool and eight more than incident reporting. One narcotic oversedation was detected demonstrating 100% agreement with the trigger tool. Additionally, 17 narcotic medication MEs were detected, an increase of 16 cases over voluntary incident reporting. Conclusions Automated AE/ME detection algorithms provide higher sensitivity and PPV than currently used trigger tools or voluntary incident-reporting systems, including identification of potential dosing and frequency errors that current methods are unequipped to detect.


Resuscitation | 2014

Developing and evaluating a machine learning based algorithm to predict the need of pediatric intensive care unit transfer for newly hospitalized children

Haijun Zhai; Patrick W. Brady; Qi Li; Todd Lingren; Yizhao Ni; Derek S. Wheeler; Imre Solti

BACKGROUND Early warning scores (EWS) are designed to identify early clinical deterioration by combining physiologic and/or laboratory measures to generate a quantified score. Current EWS leverage only a small fraction of Electronic Health Record (EHR) content. The planned widespread implementation of EHRs brings the promise of abundant data resources for prediction purposes. The three specific aims of our research are: (1) to develop an EHR-based automated algorithm to predict the need for Pediatric Intensive Care Unit (PICU) transfer in the first 24 h of admission; (2) to evaluate the performance of the new algorithm on a held-out test data set; and (3) to compare the effectiveness of the new algorithm with those of two published Pediatric Early Warning Scores (PEWS). METHODS The cases comprised 526 encounters with PICU transfer within the first 24 h of admission. In addition to the cases, we randomly selected 6772 control encounters from 62516 inpatient admissions that were never transferred to the PICU. We used 29 variables in a logistic regression and compared our algorithm against two published PEWS on a held-out test data set. RESULTS The logistic regression algorithm achieved 0.849 (95% CI 0.753-0.945) sensitivity, 0.859 (95% CI 0.850-0.868) specificity, and 0.912 (95% CI 0.905-0.919) area under the curve (AUC) in the test set. Our algorithm's AUC was significantly higher than those of the two published PEWS, by 11.8% and 22.6% in the test set. CONCLUSION The novel algorithm achieved higher sensitivity, specificity, and AUC than the two PEWS reported in the literature.
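
The modeling setup described above, a logistic regression over case and control encounters evaluated on a held-out test set with sensitivity, specificity, and AUC, can be sketched as follows. This is a toy reconstruction using synthetic data and scikit-learn; the paper's 29 EHR variables and patient data are not reproduced.

```python
# Sketch of the evaluation setup described above: logistic regression trained on
# case/control encounters and scored on a held-out test set. The features here
# are synthetic stand-ins; the paper's 29 EHR variables are not reproduced.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(0)
n_cases, n_controls, n_features = 526, 6772, 29

# Synthetic data: cases drawn from a slightly shifted distribution.
X = np.vstack([rng.normal(0.5, 1.0, (n_cases, n_features)),
               rng.normal(0.0, 1.0, (n_controls, n_features))])
y = np.concatenate([np.ones(n_cases), np.zeros(n_controls)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
prob = model.predict_proba(X_test)[:, 1]
pred = (prob >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
print(f"sensitivity={tp / (tp + fn):.3f}",
      f"specificity={tn / (tn + fp):.3f}",
      f"AUC={roc_auc_score(y_test, prob):.3f}")
```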


Journal of Biomedical Informatics | 2014

Preparing an annotated gold standard corpus to share with extramural investigators for de-identification research

Louise Deléger; Todd Lingren; Yizhao Ni; Megan Kaiser; Laura Stoutenborough; Keith Marsolo; Michal Kouril; Katalin Molnar; Imre Solti

OBJECTIVE The current study aims to fill the gap in available healthcare de-identification resources by creating a new sharable dataset with realistic Protected Health Information (PHI) without reducing the value of the data for de-identification research. By releasing the annotated gold standard corpus under a Data Use Agreement, we would like to encourage other computational linguists to experiment with our data and develop new machine learning models for de-identification. This paper describes: (1) the modifications required by the Institutional Review Board before sharing the de-identification gold standard corpus; (2) our efforts to keep the PHI as realistic as possible; and (3) the tests to show the effectiveness of these efforts in preserving the value of the modified data set for machine learning model development. MATERIALS AND METHODS In a previous study we built an original de-identification gold standard corpus annotated with true PHI from 3503 randomly selected clinical notes for the 22 most frequent clinical note types at our institution. In the current study we modified the original gold standard corpus to make it suitable for external sharing by replacing HIPAA-specified PHI with newly generated realistic PHI. Finally, we evaluated the research value of this new dataset by comparing the performance of an existing published in-house de-identification system, when trained on the new de-identification gold standard corpus, with the performance of the same system, when trained on the original corpus. We assessed the potential benefits of using the new de-identification gold standard corpus to identify PHI in the i2b2 and PhysioNet datasets that were released by other groups for de-identification research. We also measured the effectiveness of the i2b2 and PhysioNet de-identification gold standard corpora in identifying PHI in our original clinical notes. RESULTS Performance of the de-identification system trained on the new gold standard corpus was very close to that of the system trained on the original corpus (92.56 vs. 93.48 overall F-measure). The best i2b2/PhysioNet/CCHMC cross-training performances were obtained when training on the new shared CCHMC gold standard corpus, although performances were still lower than corpus-specific training. DISCUSSION AND CONCLUSION We successfully modified a de-identification dataset for external sharing while preserving its de-identification research value, with only a limited drop in machine learning de-identification performance.
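
The core idea of the corpus modification, replacing annotated PHI spans with newly generated realistic surrogates while keeping the surrounding text intact, can be sketched as below. The note text, annotation offsets, and surrogate generators are invented for illustration and do not reflect the actual IRB-approved procedure.

```python
# Toy illustration of the surrogate-replacement idea: annotated PHI spans are
# swapped for newly generated, realistic-looking values so the text stays useful
# for training de-identification models. The generators and note text below are
# invented; they do not reflect the actual IRB-approved procedure or real data.
import random

random.seed(42)

SURROGATE_NAMES = ["Alex Morgan", "Jamie Rivera", "Taylor Brooks"]

def surrogate(phi_type, original):
    if phi_type == "NAME":
        return random.choice(SURROGATE_NAMES)
    if phi_type == "DATE":
        # A real pipeline would shift dates consistently; here we just fake one.
        return f"0{random.randint(1, 9)}/1{random.randint(0, 9)}/2010"
    if phi_type == "MRN":
        return str(random.randint(1000000, 9999999))
    return "[REDACTED]"

note = "Seen on 03/14/2009, John Smith (MRN 5550123) tolerated the procedure well."
annotations = [  # (start, end, PHI type), as a gold-standard corpus would provide
    (8, 18, "DATE"), (20, 30, "NAME"), (36, 43, "MRN"),
]

# Replace right-to-left so earlier character offsets stay valid.
for start, end, phi_type in sorted(annotations, reverse=True):
    note = note[:start] + surrogate(phi_type, note[start:end]) + note[end:]

print(note)
```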


Journal of the American Medical Informatics Association | 2015

Automated clinical trial eligibility prescreening: increasing the efficiency of patient identification for clinical trials in the emergency department

Yizhao Ni; Stephanie Kennebeck; Judith W. Dexheimer; Constance McAneney; Huaxiu Tang; Todd Lingren; Qi Li; Haijun Zhai; Imre Solti

Objectives (1) To develop an automated eligibility screening (ES) approach for clinical trials in an urban tertiary care pediatric emergency department (ED); (2) to assess the effectiveness of natural language processing (NLP), information extraction (IE), and machine learning (ML) techniques on real-world clinical data and trials. Data and methods We collected eligibility criteria for 13 randomly selected, disease-specific clinical trials actively enrolling patients between January 1, 2010 and August 31, 2012. In parallel, we retrospectively selected data fields including demographics, laboratory data, and clinical notes from the electronic health record (EHR) to represent profiles of all 202795 patients visiting the ED during the same period. Leveraging NLP, IE, and ML technologies, the automated ES algorithms identified patients whose profiles matched the trial criteria to reduce the pool of candidates for staff screening. The performance was validated on both a physician-generated gold standard of trial–patient matches and a reference standard of historical trial–patient enrollment decisions, where workload, mean average precision (MAP), and recall were assessed. Results Compared with the case without automation, the workload with automated ES was reduced by 92% on the gold standard set, with a MAP of 62.9%. The automated ES achieved a 450% increase in trial screening efficiency. The findings on the gold standard set were confirmed by large-scale evaluation on the reference set of trial–patient matches. Discussion and conclusion By exploiting the text of trial criteria and the content of EHRs, we demonstrated that NLP-, IE-, and ML-based automated ES could successfully identify patients for clinical trials.
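
The evaluation described above ranks candidate patients per trial and scores the ranking against gold-standard matches with mean average precision (MAP). A small sketch of that calculation follows; the keyword-overlap scorer is only a stand-in for the paper's NLP/IE/ML pipeline, and the trials and patients are invented.

```python
# Sketch of the evaluation idea: for each trial, candidate patients are ranked by
# an automated score and compared against gold-standard matches using mean
# average precision (MAP). The keyword-overlap scorer is a placeholder for the
# NLP/IE/ML pipeline described in the paper; trials and patients are invented.

def score(trial_terms, patient_profile):
    """Toy relevance score: fraction of trial terms found in the patient profile."""
    return len(trial_terms & patient_profile) / len(trial_terms)

def average_precision(ranked_ids, relevant_ids):
    hits, precisions = 0, []
    for rank, pid in enumerate(ranked_ids, start=1):
        if pid in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

trials = {
    "asthma_trial": {"asthma", "wheezing", "albuterol"},
    "fracture_trial": {"fracture", "cast"},
}
patients = {
    "p1": {"asthma", "albuterol", "fever"},
    "p2": {"fracture", "cast", "fall"},
    "p3": {"wheezing", "cough"},
}
gold = {"asthma_trial": {"p1", "p3"}, "fracture_trial": {"p2"}}

ap_values = []
for trial, terms in trials.items():
    ranked = sorted(patients, key=lambda pid: score(terms, patients[pid]), reverse=True)
    ap_values.append(average_precision(ranked, gold[trial]))

print(f"MAP = {sum(ap_values) / len(ap_values):.3f}")
```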


PLOS ONE | 2014

The effect of inversion at 8p23 on BLK association with lupus in Caucasian population.

Bahram Namjou; Yizhao Ni; Isaac T. W. Harley; Iouri Chepelev; Beth L. Cobb; Leah C. Kottyan; Patrick M. Gaffney; Joel M. Guthridge; Kenneth M. Kaufman; John B. Harley

To explore the potential influence of the polymorphic 8p23.1 inversion on known autoimmune susceptibility risk at or near the BLK locus, we validated a new bioinformatics method that utilizes SNP data to enable accurate, high-throughput genotyping of the 8p23.1 inversion in a Caucasian population. Methods: Principal components analysis (PCA) was performed using markers inside the inversion territory, followed by k-means cluster analyses on 7416 European-derived samples and 267 HapMap CEU and TSI samples. A logistic regression conditional analysis was performed. Results: Three subgroups were identified: inversion homozygous, heterozygous, and non-inversion homozygous. The inversion status was further validated using HapMap samples that had previously undergone fluorescence in situ hybridization (FISH) assays, with a concordance rate above 98%. Conditional analyses based on inversion status were performed. We found that the overall association signals in the BLK region remained significant after controlling for inversion status. The proportion of lupus cases to controls (cases/controls) in each subgroup was 0.97 for the inverted homozygous group (1067 cases and 1095 controls), 1.12 for the inverted heterozygous group (1935 cases and 1717 controls), and 1.36 for the non-inverted subgroup (924 cases and 678 controls). After calculating the linkage disequilibrium between inversion status and the lupus risk haplotype, we found that the lupus risk haplotype tends to reside on the non-inversion background. As a result, a new association effect between non-inversion status and the lupus phenotype was identified (p = 8.18×10⁻⁷, OR = 1.18, 95% CI = 1.10–1.26). Conclusion: Our results demonstrate that both the known lupus risk haplotype and inversion status act additively in the pathogenesis of lupus. Since the inversion regulates expression of many genes in its territory, altered expression of other genes might also be involved in the development of lupus.
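
The genotyping approach, PCA over SNPs inside the inversion territory followed by k-means clustering into three dosage groups, can be sketched as follows. The genotype matrix is simulated for illustration; real 8p23.1 SNP data and the study's exact parameters are not reproduced.

```python
# Sketch of the genotyping approach described above: PCA over SNP genotypes
# inside the inversion territory, followed by k-means with k = 3 to separate
# inversion homozygous, heterozygous, and non-inversion homozygous samples.
# The genotype matrix here is simulated; real 8p23.1 SNP data are not used.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_snps = 200

def simulate(n, inversion_dosage):
    """Simulate 0/1/2 genotype counts whose allele frequencies track inversion dosage."""
    freq = 0.2 + 0.3 * inversion_dosage  # crude haplotype effect
    return rng.binomial(2, freq, size=(n, n_snps))

genotypes = np.vstack([simulate(300, 0),   # non-inversion homozygous
                       simulate(300, 1),   # heterozygous
                       simulate(300, 2)])  # inversion homozygous

# Two principal components usually suffice to separate the three dosage groups.
pcs = PCA(n_components=2).fit_transform(genotypes.astype(float))
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(pcs)

for cluster in range(3):
    print(f"cluster {cluster}: {np.sum(labels == cluster)} samples")
```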


PLOS ONE | 2016

Electronic Health Record Based Algorithm to Identify Patients with Autism Spectrum Disorder

Todd Lingren; Pei Chen; Joseph Bochenek; Finale Doshi-Velez; Patty Manning-Courtney; Julie Bickel; Leah Wildenger Welchons; Judy Reinhold; Nicole Bing; Yizhao Ni; William J. Barbaresi; Frank D. Mentch; Melissa A. Basford; Joshua C. Denny; Lyam Vazquez; Cassandra Perry; Bahram Namjou; Haijun Qiu; John J. Connolly; Debra J. Abrams; Ingrid A. Holm; Beth A. Cobb; Nataline Lingren; Imre Solti; Hakon Hakonarson; Isaac S. Kohane; John B. Harley; Guergana Savova

Objective Cohort selection is challenging for large-scale electronic health record (EHR) analyses, as International Classification of Diseases, 9th edition (ICD-9) diagnostic codes are notoriously unreliable disease predictors. Our objective was to develop, evaluate, and validate an automated algorithm for determining an Autism Spectrum Disorder (ASD) patient cohort from the EHR. We demonstrate its utility via the largest investigation to date of the co-occurrence patterns of medical comorbidities in ASD. Methods We extracted ICD-9 codes and concepts derived from the clinical notes. A gold standard patient set was labeled by clinicians at Boston Children’s Hospital (BCH) (N = 150) and Cincinnati Children’s Hospital Medical Center (CCHMC) (N = 152). Two algorithms were created: (1) a rule-based algorithm implementing the ASD criteria from the Diagnostic and Statistical Manual of Mental Disorders, 4th edition; and (2) a predictive classifier. The positive predictive values (PPV) achieved by these algorithms were compared to an ICD-9 code baseline. We clustered the patients based on grouped ICD-9 codes and evaluated subgroups. Results The rule-based algorithm produced the best PPV: (a) BCH: 0.885 vs. 0.273 (baseline); (b) CCHMC: 0.840 vs. 0.645 (baseline); (c) combined: 0.864 vs. 0.460 (baseline). A validation at Children’s Hospital of Philadelphia yielded a PPV of 0.848. Clustering analyses of comorbidities on the large three-site cohort (N = 20,658 ASD patients) identified psychiatric, developmental, and seizure disorder clusters. Conclusions In a large cross-institutional cohort, co-occurrence patterns of comorbidities in ASD provide further hypothesis-generating evidence for distinct courses in ASD. The proposed automated algorithms for cohort selection open avenues for other large-scale EHR studies and individualized treatment of ASD.
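
A simplified sketch of the cohort-selection comparison follows: an ICD-9-only baseline versus a stricter rule that also requires an ASD concept extracted from clinical notes, each scored by PPV against chart-reviewed gold labels. The two-part rule and toy records are illustrative assumptions; the published rule-based algorithm encodes DSM-IV criteria and is considerably more elaborate.

```python
# Sketch of the cohort-selection comparison: an ICD-9-only baseline versus a
# stricter rule that also requires an ASD concept extracted from clinical notes,
# scored by positive predictive value (PPV) against chart-reviewed gold labels.
# The two-part rule and the toy records are illustrative, not the published
# DSM-IV-based algorithm.

gold_asd = {"p1", "p3"}  # chart-review confirmed ASD patients

records = {
    "p1": {"icd9": {"299.00"}, "note_concepts": {"autism spectrum disorder"}},
    "p2": {"icd9": {"299.80"}, "note_concepts": set()},          # code only
    "p3": {"icd9": {"299.00"}, "note_concepts": {"autistic disorder"}},
    "p4": {"icd9": {"314.01"}, "note_concepts": set()},          # ADHD, not ASD
}

ASD_CODES = {"299.00", "299.80", "299.90"}
ASD_CONCEPTS = {"autism spectrum disorder", "autistic disorder"}

def icd9_baseline(rec):
    return bool(rec["icd9"] & ASD_CODES)

def rule_based(rec):
    return bool(rec["icd9"] & ASD_CODES) and bool(rec["note_concepts"] & ASD_CONCEPTS)

def ppv(selector):
    selected = {pid for pid, rec in records.items() if selector(rec)}
    return len(selected & gold_asd) / len(selected) if selected else 0.0

print(f"ICD-9 baseline PPV: {ppv(icd9_baseline):.2f}")   # 2/3 = 0.67
print(f"Rule-based PPV:     {ppv(rule_based):.2f}")      # 2/2 = 1.00
```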


BMC Medical Informatics and Decision Making | 2015

Increasing the efficiency of trial-patient matching: automated clinical trial eligibility pre-screening for pediatric oncology patients

Yizhao Ni; Jordan Wright; John P. Perentesis; Todd Lingren; Louise Deléger; Megan Kaiser; Isaac S. Kohane; Imre Solti

Background Manual eligibility screening (ES) for a clinical trial typically requires a labor-intensive review of patient records that utilizes many resources. Leveraging state-of-the-art natural language processing (NLP) and information extraction (IE) technologies, we sought to improve the efficiency of physician decision-making in clinical trial enrollment. In order to markedly reduce the pool of potential candidates for staff screening, we developed an automated ES algorithm to identify patients who meet core eligibility characteristics of an oncology clinical trial. Methods We collected narrative eligibility criteria from ClinicalTrials.gov for 55 clinical trials actively enrolling oncology patients in our institution between 12/01/2009 and 10/31/2011. In parallel, our ES algorithm extracted clinical and demographic information from the Electronic Health Record (EHR) data fields to represent profiles of all 215 oncology patients admitted to cancer treatment during the same period. The automated ES algorithm then matched the trial criteria with the patient profiles to identify potential trial-patient matches. Matching performance was validated on a reference set of 169 historical trial-patient enrollment decisions, and workload, precision, recall, negative predictive value (NPV), and specificity were calculated. Results Without automation, an oncologist would need to review 163 patients per trial on average to replicate the historical patient enrollment for each trial. This workload is reduced by 85% to 24 patients when using automated ES (precision/recall/NPV/specificity: 12.6%/100.0%/100.0%/89.9%). Without automation, an oncologist would need to review 42 trials per patient on average to replicate the patient-trial matches that occur in the retrospective data set. With automated ES this workload is reduced by 90% to four trials (precision/recall/NPV/specificity: 35.7%/100.0%/100.0%/95.5%). Conclusion By leveraging NLP and IE technologies, automated ES could dramatically increase the trial screening efficiency of oncologists and enable participation of small practices, which are often left out of trial enrollment. The algorithm has the potential to significantly reduce the effort required to execute clinical research at a time when new initiatives of the cancer care community intend to greatly expand both access to trials and the number of available trials.
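
The reported workload reduction and confusion-matrix metrics (precision, recall, NPV, specificity) can be reproduced arithmetically from screening counts, as in the sketch below. The counts are invented, chosen only so the derived values loosely echo the per-trial averages quoted above.

```python
# Worked sketch of the metrics reported above (workload reduction, precision,
# recall, NPV, specificity) computed from a confusion matrix of screening
# decisions. The counts below are invented to show the arithmetic only; they
# are not the study's data.

def screening_metrics(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    flagged = tp + fp  # patients the algorithm asks staff to review
    return {
        "workload_reduction": 1 - flagged / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "npv": tn / (tn + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts for one trial: 163 candidate patients, 3 true enrollees,
# all 3 captured among 24 algorithm-flagged patients.
metrics = screening_metrics(tp=3, fp=21, tn=139, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```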


Journal of Biomedical Informatics | 2015

Automated detection of medication administration errors in neonatal intensive care

Qi Li; Eric S. Kirkendall; Eric S. Hall; Yizhao Ni; Todd Lingren; Megan Kaiser; Nataline Lingren; Haijun Zhai; Imre Solti; Kristin Melton

OBJECTIVE To improve neonatal patient safety through automated detection of medication administration errors (MAEs) in high alert medications including narcotics, vasoactive medication, intravenous fluids, parenteral nutrition, and insulin using the electronic health record (EHR); to evaluate rates of MAEs in neonatal care; and to compare the performance of computerized algorithms to traditional incident reporting for error detection. METHODS We developed novel computerized algorithms to identify MAEs within the EHR of all neonatal patients treated in a level four neonatal intensive care unit (NICU) in 2011 and 2012. We evaluated the rates and types of MAEs identified by the automated algorithms and compared their performance to incident reporting. Performance was evaluated by physician chart review. RESULTS In the combined 2011 and 2012 NICU data sets, the automated algorithms identified MAEs at the following rates: fentanyl, 0.4% (4 errors/1005 fentanyl administration records); morphine, 0.3% (11/4009); dobutamine, 0 (0/10); and milrinone, 0.3% (5/1925). We found higher MAE rates for other vasoactive medications including: dopamine, 11.6% (5/43); epinephrine, 10.0% (289/2890); and vasopressin, 12.8% (54/421). Fluid administration error rates were similar: intravenous fluids, 3.2% (273/8567); parenteral nutrition, 3.2% (649/20124); and lipid administration, 1.3% (203/15227). We also found 13 insulin administration errors with a resulting rate of 2.9% (13/456). MAE rates were higher for medications that were adjusted frequently and fluids administered concurrently. The algorithms identified many previously unidentified errors, demonstrating significantly better sensitivity (82% vs. 5%) and precision (70% vs. 50%) than incident reporting for error recognition. CONCLUSIONS Automated detection of medication administration errors through the EHR is feasible and performs better than currently used incident reporting systems. Automated algorithms may be useful for real-time error identification and mitigation.
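
One plausible style of check behind such algorithms is comparing each administered dose against the active order and flagging deviations beyond a tolerance, sketched below. The drugs, doses, and 10% tolerance are invented for illustration and are not the paper's actual detection rules or clinical thresholds.

```python
# Illustrative sketch of one kind of check such algorithms can apply: comparing
# each administered dose against the active order and flagging deviations beyond
# a tolerance. The drugs, doses, and 10% tolerance are invented; the published
# algorithms and their clinical thresholds are not reproduced here.

TOLERANCE = 0.10  # allowed fractional deviation from the ordered dose

orders = {  # patient -> drug -> ordered dose (mcg/kg for this toy example)
    "p1": {"fentanyl": 1.0},
    "p2": {"morphine": 0.05},
}

administrations = [  # (patient, drug, administered dose)
    ("p1", "fentanyl", 1.05),   # within tolerance
    ("p1", "fentanyl", 2.0),    # double dose -> flag
    ("p2", "morphine", 0.05),   # exact
    ("p2", "midazolam", 0.1),   # no matching order -> flag
]

def detect_errors(orders, administrations):
    errors = []
    for patient, drug, given in administrations:
        ordered = orders.get(patient, {}).get(drug)
        if ordered is None:
            errors.append((patient, drug, given, "no active order"))
        elif abs(given - ordered) / ordered > TOLERANCE:
            errors.append((patient, drug, given, f"deviates from ordered {ordered}"))
    return errors

errors = detect_errors(orders, administrations)
for error in errors:
    print(error)
print(f"error rate: {len(errors) / len(administrations):.1%} "
      f"({len(errors)}/{len(administrations)} records)")
```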


BMC Medical Informatics and Decision Making | 2015

An end-to-end hybrid algorithm for automated medication discrepancy detection.

Qi Li; Stephen Andrew Spooner; Megan Kaiser; Nataline Lingren; Jessica Robbins; Todd Lingren; Huaxiu Tang; Imre Solti; Yizhao Ni

Background In this study we developed and implemented state-of-the-art machine learning (ML) and natural language processing (NLP) technologies and built a computerized algorithm for medication reconciliation. Our specific aims are: (1) to develop a computerized algorithm for medication discrepancy detection between patients’ discharge prescriptions (structured data) and medications documented in free-text clinical notes (unstructured data); and (2) to assess the performance of the algorithm on real-world medication reconciliation data. Methods We collected clinical notes and discharge prescription lists for all 271 patients enrolled in the Complex Care Medical Home Program at Cincinnati Children’s Hospital Medical Center between 1/1/2010 and 12/31/2013. A double-annotated, gold-standard set of medication reconciliation data was created for this collection. We then developed a hybrid algorithm consisting of three processes: (1) an ML algorithm to identify medication entities from clinical notes, (2) a rule-based method to link medication names with their attributes, and (3) an NLP-based, hybrid approach to match medications with structured prescriptions in order to detect medication discrepancies. The performance was validated on the gold-standard medication reconciliation data, where precision (P), recall (R), F-value (F), and workload were assessed. Results The hybrid algorithm achieved 95.0%/91.6%/93.3% of P/R/F on medication entity detection and 98.7%/99.4%/99.1% of P/R/F on attribute linkage. The medication matching achieved 92.4%/90.7%/91.5% (P/R/F) on identifying matched medications in the gold standard and 88.6%/82.5%/85.5% (P/R/F) on discrepant medications. By combining all processes, the algorithm achieved 92.4%/90.7%/91.5% (P/R/F) and 71.5%/65.2%/68.2% (P/R/F) on identifying the matched and the discrepant medications, respectively. The error analysis of the algorithm outputs identified challenges to be addressed in order to improve medication discrepancy detection. Conclusion By leveraging ML and NLP technologies, an end-to-end, computerized algorithm achieves promising performance in reconciling medications between clinical notes and discharge prescriptions.
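
The matching step (process 3) can be sketched as comparing medications extracted from note text against the structured discharge prescription list and flagging unmatched or dose-mismatched entries as discrepancies. Exact name matching after normalization stands in here for the paper's hybrid NLP matching; all data are toy examples.

```python
# Sketch of the matching step (process 3): medications extracted from note text
# are compared with the structured discharge prescription list, and unmatched or
# dose-mismatched entries are flagged as discrepancies. Exact name matching after
# normalization stands in for the paper's hybrid NLP matching; all data are toy.

def normalize(name):
    return name.strip().lower()

structured_rx = {  # discharge prescriptions (structured data)
    "amoxicillin": "250 mg twice daily",
    "albuterol": "2 puffs as needed",
}

note_medications = [  # entities + attributes extracted from clinical notes
    {"name": "Amoxicillin", "sig": "250 mg twice daily"},
    {"name": "albuterol",   "sig": "2 puffs every 4 hours"},
    {"name": "cetirizine",  "sig": "5 mg daily"},
]

matched, discrepant = [], []
for med in note_medications:
    name = normalize(med["name"])
    if name not in structured_rx:
        discrepant.append((med["name"], "not on discharge prescription list"))
    elif normalize(med["sig"]) != normalize(structured_rx[name]):
        discrepant.append((med["name"], f"sig differs: note '{med['sig']}' "
                                        f"vs prescription '{structured_rx[name]}'"))
    else:
        matched.append(med["name"])

print("matched:", matched)
print("discrepant:", discrepant)
```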

Collaboration


Dive into Yizhao Ni's collaborations.

Top Co-Authors

Imre Solti, Cincinnati Children's Hospital Medical Center
Todd Lingren, Cincinnati Children's Hospital Medical Center
Qi Li, Cincinnati Children's Hospital Medical Center
Haijun Zhai, Cincinnati Children's Hospital Medical Center
Megan Kaiser, Cincinnati Children's Hospital Medical Center
Kristin Melton, Cincinnati Children's Hospital Medical Center
Eric S. Hall, Cincinnati Children's Hospital Medical Center
Eric S. Kirkendall, Cincinnati Children's Hospital Medical Center
Drew H. Barzman, Cincinnati Children's Hospital Medical Center
Judith W. Dexheimer, Cincinnati Children's Hospital Medical Center