
Publications


Featured research published by Edward Choi.


Journal of the American Medical Informatics Association | 2016

Using recurrent neural network models for early detection of heart failure onset

Edward Choi; Andy Schuetz; Walter F. Stewart; Jimeng Sun

Objective: We explored whether use of deep learning to model temporal relations among events in electronic health records (EHRs) would improve model performance in predicting initial diagnosis of heart failure (HF) compared to conventional methods that ignore temporality. Materials and Methods: Data were from a health system's EHR on 3884 incident HF cases and 28,903 controls, identified as primary care patients, between May 16, 2000, and May 23, 2013. Recurrent neural network (RNN) models using gated recurrent units (GRUs) were adapted to detect relations among time-stamped events (e.g., disease diagnoses, medication orders, procedure orders) with a 12- to 18-month observation window of cases and controls. Model performance metrics were compared to regularized logistic regression, neural network, support vector machine, and K-nearest neighbor classifier approaches. Results: Using a 12-month observation window, the area under the curve (AUC) for the RNN model was 0.777, compared to AUCs for logistic regression (0.747), multilayer perceptron (MLP) with 1 hidden layer (0.765), support vector machine (SVM) (0.743), and K-nearest neighbor (KNN) (0.730). When using an 18-month observation window, the AUC for the RNN model increased to 0.883 and was significantly higher than the 0.834 AUC for the best of the baseline methods (MLP). Conclusion: Deep learning models adapted to leverage temporal relations appear to improve performance of models for detection of incident heart failure with a short observation window of 12–18 months.
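The GRU update at the heart of the RNN model above can be sketched in a few lines. This is a generic numpy illustration, not the paper's implementation: the weight names, dimensions, toy visit encoding, and readout are all invented for the sketch, with each visit reduced to a multi-hot vector of codes and the final hidden state mapped to a risk score.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU update: gates decide how much of the previous
    hidden state to keep versus overwrite with new evidence."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)   # candidate state
    return (1.0 - z) * h + z * h_tilde

# Toy parameters (illustrative only, not from the paper).
rng = np.random.default_rng(0)
d_in, d_h = 8, 4
params = [rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h),
          rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h),
          rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h)]

h = np.zeros(d_h)
for _ in range(5):                          # five time-stamped visits
    x = (rng.random(d_in) < 0.3).astype(float)   # multi-hot codes for this visit
    h = gru_step(x, h, params)
risk = sigmoid(rng.standard_normal(d_h) @ h)     # read out an HF risk score
```

Because the update gate interpolates between the old state and the candidate, the hidden state can carry information across many visits, which is what lets the model exploit temporal relations among events.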


Knowledge Discovery and Data Mining (KDD) | 2016

Multi-layer Representation Learning for Medical Concepts

Edward Choi; Mohammad Taha Bahadori; Elizabeth Searles; Catherine Coffey; Michael Thompson; James Bost; Javier Tejedor-Sojo; Jimeng Sun

Proper representations of medical concepts such as diagnosis, medication, and procedure codes and visits from Electronic Health Records (EHR) have broad applications in healthcare analytics. Patient EHR data consists of a sequence of visits over time, where each visit includes multiple medical concepts, e.g., diagnosis, procedure, and medication codes. This hierarchical structure provides two types of relational information, namely the sequential order of visits and the co-occurrence of codes within a visit. In this work, we propose Med2Vec, which not only learns representations for both medical codes and visits from large EHR datasets with over a million visits, but also allows us to interpret the learned representations, which were positively confirmed by clinical experts. In the experiments, Med2Vec shows significant improvement in prediction accuracy in clinical applications compared to baselines such as Skip-gram, GloVe, and stacked autoencoder, while providing clinically meaningful interpretation.
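The two-level structure described above (codes within a visit, visits within a sequence) can be illustrated with a minimal forward pass. This is a hedged sketch with weight shapes and names invented for illustration; the actual Med2Vec objective also includes a skip-gram-style code-level loss and a visit-level neighbor-prediction loss, both omitted here.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def visit_representation(x_t, Wc, bc, Wv, bv):
    """Two-level embedding: multi-hot codes -> intermediate
    code-level representation -> visit-level representation."""
    u_t = relu(Wc @ x_t + bc)   # code-level layer
    v_t = relu(Wv @ u_t + bv)   # visit-level layer
    return v_t

# Toy weights (illustrative only, not from the paper).
rng = np.random.default_rng(1)
n_codes, d_code, d_visit = 20, 6, 4
Wc = rng.standard_normal((d_code, n_codes)) * 0.1
bc = np.zeros(d_code)
Wv = rng.standard_normal((d_visit, d_code)) * 0.1
bv = np.zeros(d_visit)

x_t = np.zeros(n_codes)
x_t[[2, 5, 11]] = 1.0           # a visit containing three medical codes
v_t = visit_representation(x_t, Wc, bc, Wv, bv)
```

Training the two levels jointly is what lets the model exploit both kinds of relational information at once: code co-occurrence shapes the lower layer, visit order shapes the upper one.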


Knowledge Discovery and Data Mining (KDD) | 2017

GRAM: Graph-based Attention Model for Healthcare Representation Learning

Edward Choi; Mohammad Taha Bahadori; Le Song; Walter F. Stewart; Jimeng Sun

Deep learning methods exhibit promising performance for predictive modeling in healthcare, but two important challenges remain. Data insufficiency: often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results. Interpretation: the representations learned by deep learning methods should align with medical knowledge. To address these challenges, we propose GRaph-based Attention Model (GRAM) that supplements electronic health records (EHR) with hierarchical information inherent to medical ontologies. Based on the data volume and the ontology structure, GRAM represents a medical concept as a combination of its ancestors in the ontology via an attention mechanism. We compared predictive performance (i.e. accuracy, data needs, interpretability) of GRAM to various methods including the recurrent neural network (RNN) in two sequential diagnoses prediction tasks and one heart failure prediction task. Compared to the basic RNN, GRAM achieved 10% higher accuracy for predicting diseases rarely observed in the training data and 3% improved area under the ROC curve for predicting heart failure using an order of magnitude less training data. Additionally, unlike other methods, the medical concept representations learned by GRAM are well aligned with the medical ontology. Finally, GRAM exhibits intuitive attention behaviors by adaptively generalizing to higher level concepts when facing data insufficiency at the lower level concepts.
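The core idea, representing a concept as an attention-weighted convex combination of itself and its ontology ancestors, can be sketched as follows. This is a deliberate simplification: the paper computes attention scores with a small MLP over concatenated embeddings, whereas this sketch uses a single dot product, and all embeddings and indices are toy values.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gram_embedding(concept_idx, ancestors, E, att_w):
    """Represent a leaf concept as an attention-weighted convex
    combination of its own basic embedding and its ancestors'."""
    idxs = [concept_idx] + ancestors
    # Score each (concept, candidate) pair; the paper uses an MLP here.
    scores = np.array([att_w @ np.concatenate([E[concept_idx], E[j]])
                       for j in idxs])
    alpha = softmax(scores)          # attention over self + ancestors
    return alpha @ E[idxs], alpha

# Toy ontology embeddings (illustrative only, not from the paper).
rng = np.random.default_rng(2)
d = 5
E = rng.standard_normal((10, d))     # basic embeddings for all ontology nodes
att_w = rng.standard_normal(2 * d)   # simplified attention parameters

g, alpha = gram_embedding(3, ancestors=[7, 9], E=E, att_w=att_w)
```

When a leaf code is rare in the training data, the attention weights can shift mass toward its ancestors, which is exactly the "generalize to higher-level concepts under data insufficiency" behavior the abstract describes.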


International Conference on Data Mining (ICDM) | 2015

Constructing Disease Network and Temporal Progression Model via Context-Sensitive Hawkes Process

Edward Choi; Nan Du; Robert Chen; Le Song; Jimeng Sun

Modeling disease relationships and temporal progression are two key problems in health analytics, which have not been studied together due to data and technical challenges. Thanks to the increasing adoption of Electronic Health Records (EHR), rich patient information is being collected over time. Using EHR data as input, we propose a multivariate context-sensitive Hawkes process, or cHawkes, which simultaneously infers the disease relationship network and models the temporal progression of patients. Besides learning the disease network and the temporal progression model, cHawkes is able to predict when a specific patient might develop other related diseases in the future given the patient's history, which in turn has many potential applications in predictive health analytics, public health policy development, and customized patient care. Extensive experiments on real EHR data demonstrate that cHawkes not only can uncover meaningful disease relations and model accurate temporal progression of patients, but also has significantly better predictive performance compared to several baseline models.
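The building block of cHawkes is a multivariate Hawkes intensity, in which each past event excites related future events with exponential decay. Below is a minimal sketch of the plain (context-free) multivariate intensity; the context-sensitive extension and all parameter values are beyond this illustration, and the numbers here are invented.

```python
import numpy as np

def hawkes_intensity(d, t, history, mu, A, w):
    """Intensity of disease d at time t: a base rate plus exponentially
    decaying excitation from every earlier event (t_i, d_i)."""
    lam = mu[d]
    for t_i, d_i in history:
        if t_i < t:
            lam += A[d, d_i] * np.exp(-w * (t - t_i))
    return lam

# Toy parameters (illustrative only, not learned from data).
mu = np.array([0.10, 0.05])          # base rates for two diseases
A = np.array([[0.0, 0.3],            # A[d, d']: how strongly d' excites d
              [0.5, 0.0]])
w = 1.0                              # decay rate of the excitation

history = [(0.0, 0), (1.0, 1)]       # (time, disease) events for one patient
lam = hawkes_intensity(0, 2.0, history, mu, A, w)
```

The matrix `A` is the disease-relationship network the abstract refers to: a nonzero `A[d, d']` means an occurrence of disease `d'` raises the near-term rate of disease `d`, which is what supports the "when might this patient develop a related disease" prediction.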


Journal of the American Medical Informatics Association | 2018

Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review

Cao Xiao; Edward Choi; Jimeng Sun

Objective: To conduct a systematic review of deep learning models for electronic health record (EHR) data, and illustrate various deep learning architectures for analyzing different data sources and their target applications. We also highlight ongoing research and identify open challenges in building deep learning models of EHRs. Design/Method: We searched PubMed and Google Scholar for papers on deep learning studies using EHR data published between January 1, 2010, and January 31, 2018. We summarize them according to these axes: types of analytics tasks, types of deep learning model architectures, special challenges arising from health data and tasks and their potential solutions, as well as evaluation strategies. Results: We surveyed and analyzed multiple aspects of the 98 articles we found and identified the following analytics tasks: disease detection/classification, sequential prediction of clinical events, concept embedding, data augmentation, and EHR data privacy. We then studied how deep architectures were applied to these tasks. We also discussed some special challenges arising from modeling EHR data and reviewed a few popular approaches. Finally, we summarized how performance evaluations were conducted for each task. Discussion: Despite the early success in using deep learning for health analytics applications, there still exist a number of issues to be addressed. We discuss them in detail, including data and label availability, the interpretability and transparency of the model, and ease of deployment.


Empirical Methods in Natural Language Processing (EMNLP) | 2014

Balanced Korean Word Spacing with Structural SVM

Changki Lee; Edward Choi; Hyunki Kim

Most studies on statistical Korean word spacing do not utilize the spacing information already present in the input sentence and instead assume that it is completely concatenated. This makes the word spacer ignore the correctly spaced parts of the input sentence and erroneously alter them. To overcome this limitation, this paper proposes a structural SVM-based Korean word spacing method that can utilize the space information of the input sentence. The experiment on sentences with 10% spacing errors showed that our method achieved a 96.81% F-score, while the basic structural SVM method achieved only a 92.53% F-score. The more correctly the input sentence was spaced, the more accurately our method performed.
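One way to picture how observed spacing can be folded into sequence labeling is a Viterbi decode in which the input's existing spaces enter as a strongly weighted feature. This is a generic sketch, not the paper's structural SVM (which learns joint feature weights with a max-margin objective); all weights and features here are toy values.

```python
import numpy as np

def viterbi_spacing(char_feats, space_hints, W_emit, w_hint, W_trans):
    """Decode space/no-space labels per character boundary with Viterbi.
    space_hints carries the input sentence's own spacing as a feature,
    so already-correct spaces are not erroneously altered."""
    n = len(space_hints)
    # Emission score per position and label (0 = no space, 1 = space).
    emit = char_feats @ W_emit
    emit[:, 1] += w_hint * space_hints       # trust the observed spaces
    score = np.full((n, 2), -np.inf)
    back = np.zeros((n, 2), dtype=int)
    score[0] = emit[0]
    for i in range(1, n):
        for y in (0, 1):
            cand = score[i - 1] + W_trans[:, y]
            back[i, y] = int(np.argmax(cand))
            score[i, y] = cand[back[i, y]] + emit[i, y]
    labels = [int(np.argmax(score[-1]))]
    for i in range(n - 1, 0, -1):
        labels.append(int(back[i, labels[-1]]))
    return labels[::-1]

# Toy inputs (illustrative only).
rng = np.random.default_rng(3)
n, d = 6, 4
char_feats = rng.standard_normal((n, d))
space_hints = np.array([0., 0., 1., 0., 0., 1.])  # input already spaced here
W_emit = rng.standard_normal((d, 2)) * 0.1
W_trans = np.zeros((2, 2))
labels = viterbi_spacing(char_feats, space_hints, W_emit, 5.0, W_trans)
```

With a large hint weight, positions the user already spaced correctly are very likely to keep their spaces, which is the behavior the abstract argues a word spacer should have.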


arXiv: Learning | 2016

Doctor AI: Predicting Clinical Events via Recurrent Neural Networks

Edward Choi; Mohammad Taha Bahadori; Andy Schuetz; Walter F. Stewart; Jimeng Sun


Neural Information Processing Systems (NIPS) | 2016

RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism

Edward Choi; Mohammad Taha Bahadori; Jimeng Sun; Joshua Kulas; Andy Schuetz; Walter F. Stewart


arXiv: Learning | 2016

Medical Concept Representation Learning from Electronic Health Records and its Application on Heart Failure Prediction

Edward Choi; Andy Schuetz; Walter F. Stewart; Jimeng Sun


arXiv: Learning | 2017

Generating Multi-label Discrete Patient Records using Generative Adversarial Networks

Edward Choi; Siddharth Biswal; Bradley Malin; Jon Duke; Walter F. Stewart; Jimeng Sun

Collaboration


Dive into Edward Choi's collaborations.

Top Co-Authors

Jimeng Sun (Georgia Institute of Technology)
Mohammad Taha Bahadori (University of Southern California)
Jina J. Dcruz (Centers for Disease Control and Prevention)
Herman D. Tolentino (Centers for Disease Control and Prevention)
Le Song (Georgia Institute of Technology)