Publication


Featured research published by Kenney Ng.


Journal of Biomedical Informatics | 2014

PARAMO: A PARAllel predictive MOdeling platform for healthcare analytic research using electronic health records

Kenney Ng; Amol Ghoting; Steven R. Steinhubl; Walter F. Stewart; Bradley Malin; Jimeng Sun

OBJECTIVE Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: (1) cohort construction, (2) feature construction, (3) cross-validation, (4) feature selection, and (5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. METHODS To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which (1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, (2) schedules the tasks in a topological ordering of the graph, and (3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. RESULTS We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000-patient data set in 3 hours when run in parallel, compared with 9 days when run sequentially. CONCLUSION This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed up the research workflow and reuse of health information.
This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines that are specialized for health data researchers.
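
The scheduling idea described under METHODS (build a dependency graph, order it topologically, run independent tasks in parallel) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PARAMO's implementation: it uses a local thread pool instead of Map-Reduce, and the task names in `PIPELINE` and the `run_task` helper are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "cohort": set(),
    "features": {"cohort"},
    "cv_split": {"features"},
    "select_fold1": {"cv_split"},
    "select_fold2": {"cv_split"},
    "classify_fold1": {"select_fold1"},
    "classify_fold2": {"select_fold2"},
}


def run_task(name):
    # Placeholder for real work (cohort query, model fit, ...); assumed helper.
    return name


def run_pipeline(graph, max_workers=4):
    """Run tasks in dependency order, executing independent tasks concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = set()
        while sorter.is_active():
            # Submit every task whose dependencies have all completed.
            for task in sorter.get_ready():
                pending.add(pool.submit(run_task, task))
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = future.result()
                finished.append(name)
                sorter.done(name)  # unblocks tasks that depend on this one
    return finished
```

In this sketch the two per-fold branches become ready at the same time once `cv_split` completes, so they can run side by side, which is the kind of independence the abstract says the platform exploits.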


International Conference of the IEEE Engineering in Medicine and Biology Society | 2015

Early detection of heart failure with varying prediction windows by structured and unstructured data in electronic health records

Yajuan Wang; Kenney Ng; Roy J. Byrd; Jianying Hu; Shahram Ebadollahi; Zahra Daar; Christopher R. deFilippi; Steven R. Steinhubl; Walter F. Stewart

Heart failure (HF) is increasing in prevalence and is among the most costly diseases to society. Early detection of HF would provide the means to test lifestyle and pharmacologic interventions that may slow disease progression and improve patient outcomes. This study used structured and unstructured data from electronic health records (EHRs) to predict onset of HF, with a particular focus on how prediction accuracy varied in relation to time before diagnosis. EHR data were extracted from a single health care system and used to identify incident HF among primary care patients who received care between 2001 and 2010. A total of 1,684 incident HF cases were identified and 13,525 controls were selected from the same primary care practices. Models were compared by varying the beginning of the prediction window from 60 to 720 days before HF diagnosis. As the prediction window decreased, the performance [AUC (95% CIs)] of the predictive HF models increased from 65% (63%-66%) to 74% (73%-75%) for the unstructured, from 73% (72%-75%) to 81% (80%-83%) for the structured, and from 76% (74%-77%) to 83% (77%-85%) for the combined data.


Circulation: Cardiovascular Quality and Outcomes | 2016

Early Detection of Heart Failure Using Electronic Health Records: Practical Implications for Time Before Diagnosis, Data Diversity, Data Quantity, and Data Density.

Kenney Ng; Steven R. Steinhubl; Christopher R. deFilippi; Sanjoy Dey; Walter F. Stewart

Background: Using electronic health records data to predict events and onset of diseases is increasingly common. Relatively little is known, however, about the tradeoffs between data requirements and model utility. Methods and Results: We examined the performance of machine learning models trained to detect prediagnostic heart failure in primary care patients using longitudinal electronic health records data. Model performance was assessed in relation to data requirements defined by the prediction window length (time before clinical diagnosis), the observation window length (duration of observation before prediction window), the number of different data domains (data diversity), the number of patient records in the training data set (data quantity), and the density of patient encounters (data density). A total of 1,684 incident heart failure cases and 13,525 sex, age-category, and clinic matched controls were used for modeling. Model performance improved as (1) the prediction window length decreased, especially when <2 years; (2) the observation window length increased, leveling off after 2 years; (3) the training data set size increased, leveling off after 4,000 patients; (4) more diverse data types were used, with the combination of diagnosis, medication order, and hospitalization data being most important, in that order; and (5) data were confined to patients who had ≥10 phone or face-to-face encounters in 2 years. Conclusions: These empirical findings suggest possible guidelines for the minimum amount and type of data needed to train effective disease onset predictive models using longitudinal electronic health records data.


IEEE Transactions on Visualization and Computer Graphics | 2018

Clustervision: Visual Supervision of Unsupervised Clustering

Bum Chul Kwon; Ben Eysenbach; Janu Verma; Kenney Ng; Christopher De Filippi; Walter F. Stewart; Adam Perer

Clustering, the process of grouping together similar items into distinct partitions, is a common type of unsupervised machine learning that can be useful for summarizing and aggregating complex multi-dimensional data. However, data can be clustered in many ways, and there exists a large body of algorithms designed to reveal different patterns. While having access to a wide variety of algorithms is helpful, in practice it is quite difficult for data scientists to choose and parameterize algorithms to get the clustering results relevant for their dataset and analytical tasks. To alleviate this problem, we built Clustervision, a visual analytics tool that helps data scientists find the right clustering among the large number of techniques and parameters available. Our system clusters data using a variety of clustering techniques and parameters and then ranks clustering results using five quality metrics. In addition, users can guide the system to produce more relevant results by providing task-relevant constraints on the data. Our visual user interface allows users to find high-quality clustering results, explore the clusters using several coordinated visualization techniques, and select the clustering result that best suits their task. We demonstrate this novel approach using a case study with a team of researchers in the medical domain and showcase that our system empowers users to choose an effective representation of their complex data.
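
The "cluster many ways, then rank by quality" step can be illustrated with a small sketch. The abstract does not name its five quality metrics, so the example below assumes a single one, the mean silhouette coefficient, and ranks candidate labelings that are supplied by hand rather than produced by real clustering algorithms. The function names and the toy data are hypothetical.

```python
from math import dist
from statistics import mean


def silhouette(points, labels):
    """Mean silhouette coefficient of one labeling (higher is better)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # convention for singleton or single-cluster cases
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the sum includes the zero distance from the point to itself).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(point, q) for q in other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


def rank_clusterings(points, candidates):
    """Sort candidate labelings from best to worst silhouette score."""
    scored = [(name, silhouette(points, labels)) for name, labels in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
candidates = {
    "mixed": [0, 1, 0, 1],      # splits each natural group across clusters
    "separated": [0, 0, 1, 1],  # keeps nearby points together
}
ranking = rank_clusterings(points, candidates)
```

A tool that combines several metrics would need some way to reconcile rankings that disagree; this sketch sidesteps that by using one score.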


Journal of Patient-Centered Research and Reviews | 2016

Physician Documentation Behaviors in Electronic Health Records as a Potential Source of Noise for Early Detection of Heart Failure

Sanjoy Dey; Kenney Ng; Jianying Hu; Roy J. Byrd; Steven Steinhubl; Christopher R. deFilippi; Alice Pressman; Walter F. Stewart

Introduction: Electronic health records (EHRs) are a potentially rich source of data for developing predictive models for early detection of heart failure (HF). However, EHR documentation can vary because of both patient health and variation in clinical practice and behavior among physicians. Hypothesis: The “noise” contributed by variable physician behaviors, such as differences in the frequency and detail of documentation, could potentially confound predictive models for detection of HF. In this study, we characterized the documentation behaviors of primary care physicians (PCPs) in an effort to better understand this potential source of noise. Methods: We used longitudinal EHR data of a stratified random sample of 5,187 patients who were 1) ≥50 years of age and 2) without a history of HF. PCPs (n=144) were identified and paired with the patients that they had treated for a minimum of 6 months. We derived 28 measures to characterize PCP behaviors: documentation frequencies of assertions and denials of selected Framingham HF signs and symptoms (FHFSS) in office visit encounter notes, adjusted for patient comorbidities. Hierarchical clustering analyses (HCA) were performed on PCP documentation behaviors. Results: Based on HCA, PCPs were clustered into 3 groups with distinct documentation behaviors. Group 1 PCPs (n=63) documented 10 out of 15 assertions and 11 out of 13 denials of FHFSS significantly more frequently than Group 3 (n=20), while Group 2 PCPs (n=61) had significantly more frequent denial documentation behaviors than the other two (see Figure 1). No significant differences were found among patients’ chronic, episodic and cardiometabolic chronic disease counts (comorbidities) in each of the groups (p Conclusions: This study identified PCP groups with distinct documentation behaviors unrelated to patient complexity. This source of noise and potential confounder should be taken into account for predictive modeling.


Journal of Cardiac Failure | 2014

Prevalence of Heart Failure Signs and Symptoms in a Large Primary Care Population Identified Through the Use of Text and Data Mining of the Electronic Health Record

Rajakrishnan Vijayakrishnan; Steven R. Steinhubl; Kenney Ng; Jimeng Sun; Roy J. Byrd; Brent A. Williams; Christopher R. deFilippi; Shahram Ebadollahi; Walter F. Stewart


Archive | 2017

Identifying and Ranking Risk Factors Using Trained Predictive Models

Josua Krause; Kenney Ng; Adam Perer


Journal of Patient-Centered Research and Reviews | 2017

Early Detection of Heart Failure Using Electronic Health Records: Practical Implications for Time Before Diagnosis, Data Diversity, Data Quantity, and Data Density

Kenney Ng; Steven Steinhubl; Christopher R. deFilippi; Sanjoy Dey; Walter F. Stewart


Journal of Patient-Centered Research and Reviews | 2017

Loop Diuretic Use in the Months and Years Preceding a Heart Failure Diagnosis: A Case-Control Study

David Knorek; Steven R. Steinhubl; Christopher R. deFilippi; Kenney Ng; Roy J. Byrd; Zahra Daar; Walter F. Stewart


Journal of Patient-Centered Research and Reviews | 2016

Data-Driven Modeling of Electronic Health Record Data to Predict Prediagnostic Heart Failure in Primary Care

Kenney Ng; Walter F. Stewart; Christopher R. deFilippi; Sanjoy Dey; Roy J. Byrd; Steven R. Steinhubl; Heather Law; Alice R. Pressman; Jianying Hu
