
Publication


Featured research published by Andrew M. Dai.


Conference on Computational Natural Language Learning | 2016

Generating Sentences from a Continuous Space

Samuel R. Bowman; Luke Vilnis; Oriol Vinyals; Andrew M. Dai; Rafal Jozefowicz; Samy Bengio

The standard recurrent neural network language model (RNNLM) generates sentences one word at a time and does not work from an explicit global sentence representation. In this work, we introduce and study an RNN-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences. This factorization allows it to explicitly model holistic properties of sentences such as style, topic, and high-level syntactic features. Samples from the prior over these sentence representations remarkably produce diverse and well-formed sentences through simple deterministic decoding. By examining paths through this latent space, we are able to generate coherent novel sentences that interpolate between known sentences. We present techniques for solving the difficult learning problem presented by this model, demonstrate its effectiveness in imputing missing words, explore many interesting properties of the model's latent sentence space, and present negative results on the use of the model in language modeling.
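To make the architecture concrete, here is a minimal PyTorch sketch of an RNN-based variational autoencoder over sentences in the spirit of this paper; the GRU cells, layer sizes, and teacher-forced decoding are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a sentence VAE: an encoder RNN maps a sentence to a
# Gaussian latent code, and a decoder RNN reconstructs the sentence from a
# sample of that code. Sizes and cell choices here are illustrative only.
import torch
import torch.nn as nn

class SentenceVAE(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.latent_to_hidden = nn.Linear(latent_dim, hidden_dim)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # Encode the whole sentence into one distributed representation.
        emb = self.embed(tokens)
        _, h = self.encoder(emb)                       # h: (1, batch, hidden)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        # Reparameterisation trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Decode conditioned on z, teacher-forced on the input tokens.
        h0 = torch.tanh(self.latent_to_hidden(z)).unsqueeze(0)
        dec, _ = self.decoder(emb, h0)
        logits = self.out(dec)
        # ELBO = reconstruction term minus KL divergence to the N(0, I) prior.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return logits, kl
```

Interpolation between two known sentences then amounts to decoding from points along the line segment between their latent means.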


arXiv: Computers and Society | 2018

Scalable and accurate deep learning with electronic health records

Alvin Rajkomar; Eyal Oren; Kai Chen; Andrew M. Dai; Nissan Hajaj; Michaela Hardt; Peter J. Liu; Xiaobing Liu; Jake Marcus; Mimi Sun; Patrik Sundberg; Hector Yee; Kun Zhang; Yi Zhang; Gerardo Flores; Gavin E. Duggan; Jamie Irvine; Quoc V. Le; Kurt Litsch; Alexander Mossin; Justin Tansuwan; De Wang; James Wexler; Jimbo Wilson; Dana Ludwig; Samuel L. Volchenboum; Katherine Chou; Michael Pearson; Srinivasan Madabushi; Nigam H. Shah

Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93–0.94), 30-day unplanned readmission (AUROC 0.75–0.76), prolonged length of stay (AUROC 0.85–0.86), and all of a patient’s final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient’s chart.

Artificial intelligence: Algorithm predicts clinical outcomes for hospital inpatients

Artificial intelligence outperforms traditional statistical models at predicting a range of clinical outcomes from a patient’s entire raw electronic health record (EHR). A team led by Alvin Rajkomar and Eyal Oren from Google in Mountain View, California, USA, developed a data processing pipeline for transforming EHR files into a standardized format. They then applied deep learning models to data from 216,221 adult patients hospitalized for at least 24 h each at two academic medical centers, and showed that their algorithm could accurately predict risk of mortality, hospital readmission, prolonged hospital stay and discharge diagnosis. In all cases, the method proved more accurate than previously published models. The authors provide a case study to serve as a proof-of-concept of how such an algorithm could be used in routine clinical practice in the future.
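As a toy illustration of the sequential representation described above, the sketch below unrolls a patient's timestamped, FHIR-style events into one chronological token sequence that a sequence model could consume; the event schema and tokenisation are assumptions for illustration, not the paper's actual pipeline.

```python
# Sketch: turn a patient's raw, timestamped events into one time-ordered
# token sequence. Resource and code names below are made up for the example.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Event:
    time: datetime      # when the event was recorded
    resource: str       # FHIR-style resource type, e.g. "Observation"
    code: str           # coded content, e.g. a lab test or diagnosis code
    value: str = ""     # optional value, e.g. a lab result

def to_token_sequence(events: list[Event]) -> list[str]:
    """Unroll a patient's events into one chronological token sequence."""
    tokens = []
    for e in sorted(events, key=lambda e: e.time):
        tokens.append(f"{e.resource}:{e.code}")
        if e.value:
            tokens.append(f"value:{e.value}")
    return tokens

record = [
    Event(datetime(2018, 1, 2, 8, 0), "Observation", "heart_rate", "92"),
    Event(datetime(2018, 1, 1, 9, 30), "Condition", "pneumonia"),
]
print(to_token_sequence(record))
# ['Condition:pneumonia', 'Observation:heart_rate', 'value:92']
```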


International Conference on Artificial Neural Networks | 2011

The grouped author-topic model for unsupervised entity resolution

Andrew M. Dai; Amos J. Storkey

This paper describes a generative approach for tackling the problem of identity resolution in a completely unsupervised context with no fixed assumption regarding the true number of identities. The problem of entity resolution involves associating different references to authors (in a paper's author list, for example) with real underlying identities. The references may be written in differing forms or may have errors, and identical references may refer to different real identities. The approach taken here uses a generative model of both the abstract of a document and its list of authors to resolve identities in a corpus of documents. In the model, authors and topics are associated with latent groups. For each document, an abstract and an author list are generated conditioned on a given group. Results are presented on real-world datasets and outperform the best-performing unsupervised methods.
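The generative story can be illustrated by ancestral sampling from a heavily simplified, fixed-size version of the model: a latent group is drawn per document, and both the author list and the abstract words are generated conditioned on that group. All sizes and distributions below are toy assumptions; the paper's actual model is nonparametric, with no fixed number of identities.

```python
# Toy ancestral sampling from a simplified grouped author-topic model.
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_topics, n_authors, vocab = 3, 4, 10, 50

group_probs = rng.dirichlet(np.ones(n_groups))                    # P(group)
authors_given_group = rng.dirichlet(np.ones(n_authors), size=n_groups)
topics_given_group = rng.dirichlet(np.ones(n_topics), size=n_groups)
words_given_topic = rng.dirichlet(np.ones(vocab), size=n_topics)

def generate_document(n_words=20, n_auth=2):
    g = rng.choice(n_groups, p=group_probs)           # latent group
    authors = rng.choice(n_authors, size=n_auth, replace=False,
                         p=authors_given_group[g])    # author references
    topics = rng.choice(n_topics, size=n_words, p=topics_given_group[g])
    words = [rng.choice(vocab, p=words_given_topic[t]) for t in topics]
    return g, authors, words

print(generate_document())
```

Inference then runs this story in reverse: documents whose abstracts and author references are best explained by the same latent group get their author references resolved to the same identities.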


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

The Supervised Hierarchical Dirichlet Process

Andrew M. Dai; Amos J. Storkey

We propose the supervised hierarchical Dirichlet process (sHDP), a nonparametric generative model for the joint distribution of a group of observations and a response variable directly associated with that whole group. We compare the sHDP with another leading method for regression on grouped data, the supervised latent Dirichlet allocation (sLDA) model. We evaluate our method on two real-world classification problems and two real-world regression problems. Bayesian nonparametric regression models based on the Dirichlet process, such as the Dirichlet process generalised linear model (DP-GLM), have previously been explored; these models allow flexibility in modelling nonlinear relationships. However, until now, hierarchical Dirichlet process (HDP) mixtures have not seen significant use in supervised problems with grouped data, since a straightforward application of the HDP to grouped data results in learnt clusters that are not predictive of the responses. The sHDP solves this problem by allowing clusters to be learnt jointly from the group structure and from the label assigned to each group.
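Schematically, the generative process combines a standard HDP over grouped observations with a per-group response. The last line below is an sLDA-style assumption about how the response depends on the group's cluster usage, included for illustration; the paper's exact response model may differ.

```latex
% Standard HDP for grouped data, plus a group-level response (schematic).
\begin{align*}
  G_0 \mid \gamma, H &\sim \mathrm{DP}(\gamma, H)
    && \text{global measure over cluster parameters} \\
  G_j \mid \alpha, G_0 &\sim \mathrm{DP}(\alpha, G_0)
    && \text{mixing measure for group } j \\
  \theta_{ji} \mid G_j &\sim G_j
    && \text{cluster parameter for observation } i \text{ of group } j \\
  x_{ji} \mid \theta_{ji} &\sim F(\theta_{ji})
    && \text{observation} \\
  y_j \mid \bar{z}_j &\sim \mathrm{GLM}\!\left(\eta^{\top}\bar{z}_j\right)
    && \text{response from the group's empirical cluster proportions } \bar{z}_j
\end{align*}
```

Because the response \(y_j\) enters the joint likelihood, posterior inference is pushed toward clusterings that are predictive of the group labels rather than ones that merely explain the observations.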


npj Digital Medicine | 2018

Reply: metrics to assess machine learning models

Alvin Rajkomar; Andrew M. Dai; Mimi Sun; Michaela Hardt; Kai Chen; Kathryn Rough; Jeffrey Dean

We thank Prof. Pinker for bringing up important points on how to assess the performance of machine learning models. The central finding of our work is that a machine learning pipeline operating on an open-source data format for electronic health records can render accurate predictions across multiple tasks in a way that works for multiple health systems. To demonstrate this, we selected three commonly used binary prediction tasks, inpatient mortality, 30-day unplanned readmission, and length of stay, as well as the task of predicting every discharge diagnosis. The main metric we used for the binary predictions was the area under the receiver-operator curve (AUROC).

We would first like to clarify a few issues. We would highlight that in our results section we did report the number-needed-to-evaluate, or work-up to detection ratio, for the inpatient mortality model and baseline model, which is (1/PPV) and commonly accepted as a clinically relevant metric. Also, as described in the “Study Cohort” section, we only included hospitalizations of 24 h or longer, and Table 1 reports the inpatient mortality rates of the hospitals to be approximately 2% in that cohort. This should not be confused with 2.3% of patients dying within 24 h.

Prof. Pinker states that the public could be misled by the way the mainstream media had reported the results of our paper. We observed that many reports incorrectly conflated accuracy with AUROC. We take seriously our responsibility to clearly explain our results to a more general audience, and had simultaneously released a public blog post. In that post, we talked explicitly about the AUROC: “The most common way to assess accuracy is by a measure called the area-under-the-receiver-operator curve, which measures how well a model distinguishes between a patient who will have a particular future outcome compared to one who will not. In this metric, 1.00 is perfect, and 0.50 is no better than random chance, so higher numbers mean the model is more accurate.”

We agree that the AUROC has its limitations, although we would note that no single metric conveys a complete picture of the performance of a model. The AUROC has the advantage of being a commonly reported metric in both clinical and recent machine-learning papers. We did caution in our manuscript that direct comparison of AUROCs from studies using different cohorts is problematic. However, we do agree that the area under the precision-recall curve (AUPRC) is relevant for prediction tasks and can be particularly helpful for clinical tasks with high class imbalance. Therefore, we report the AUPRC for each of the binary prediction tasks for the primary models reported in the manuscript, the clinical baselines, and the enhanced baselines that we described in the supplemental materials (Table 1). The confidence intervals are calculated by stratified bootstrapping of the positive and negative classes, as is common for this metric. It is worth noting that the models evaluated here were tuned to optimize the AUROC, and it is well known that a model tuned to optimize AUROC does not necessarily optimize AUPRC (and vice versa). The size of the test set (9624 for Hospital A and 12,127 for Hospital B) limits the power to make comparisons between models, although the point estimates are higher for the deep learning models in each case.
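A small sketch of the evaluation described above: AUROC and AUPRC for an imbalanced binary task, with confidence intervals from stratified bootstrapping (positives and negatives resampled independently, as the reply describes). The data below is synthetic and only scikit-learn's standard metric functions are assumed.

```python
# AUROC and AUPRC with stratified-bootstrap confidence intervals.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)
n = 10_000
y_true = (rng.random(n) < 0.02).astype(int)                  # ~2% positive rate
y_score = np.clip(rng.normal(0.2 + 0.4 * y_true, 0.15), 0, 1)

def stratified_bootstrap_ci(metric, y, s, n_boot=1000, alpha=0.05):
    """CI by resampling the positive and negative classes independently."""
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    stats = []
    for _ in range(n_boot):
        idx = np.concatenate([rng.choice(pos, size=pos.size),
                              rng.choice(neg, size=neg.size)])
        stats.append(metric(y[idx], s[idx]))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

for name, metric in [("AUROC", roc_auc_score), ("AUPRC", average_precision_score)]:
    lo, hi = stratified_bootstrap_ci(metric, y_true, y_score)
    print(f"{name}: {metric(y_true, y_score):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

The number-needed-to-evaluate mentioned in the reply is simply 1/PPV at a chosen operating threshold, i.e. how many flagged patients must be worked up per true positive.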


Neural Information Processing Systems | 2015

Semi-supervised Sequence Learning

Andrew M. Dai; Quoc V. Le


arXiv: Computation and Language | 2015

Document embedding with paragraph vectors

Andrew M. Dai; Christopher Olah; Quoc V. Le


International Conference on Learning Representations | 2017

HyperNetworks

David Ha; Andrew M. Dai; Quoc V. Le


Meeting of the Association for Computational Linguistics | 2011

Language-independent compound splitting with morphological operations

Klaus Macherey; Andrew M. Dai; David Talbot; Ashok C. Popat; Franz Josef Och


International Conference on Learning Representations | 2017

Adversarial Training Methods for Semi-Supervised Text Classification

Takeru Miyato; Andrew M. Dai; Ian J. Goodfellow

Collaboration


Dive into Andrew M. Dai's collaborations.

Top Co-Authors

Ashish Vaswani

University of Southern California
