Publication


Featured research published by Finale Doshi-Velez.


Pediatrics | 2014

Comorbidity clusters in autism spectrum disorders: an electronic health record time-series analysis.

Finale Doshi-Velez; Yaorong Ge; Isaac S. Kohane

OBJECTIVE: The distinct trajectories of patients with autism spectrum disorders (ASDs) have not been extensively studied, particularly regarding clinical manifestations beyond the neurobehavioral criteria from the Diagnostic and Statistical Manual of Mental Disorders. The objective of this study was to investigate the patterns of co-occurrence of medical comorbidities in ASDs. METHODS: International Classification of Diseases, Ninth Revision codes from patients aged at least 15 years with a diagnosis of ASD were obtained from electronic medical records. These codes were aggregated by using phenotype-wide association studies categories and processed into 1350-dimensional vectors describing the counts of the most common categories in 6-month blocks between the ages of 0 and 15. Hierarchical clustering was used to identify subgroups with distinct courses. RESULTS: Four subgroups were identified. The first was characterized by seizures (n = 120, subgroup prevalence 77.5%). The second (n = 197) was characterized by multisystem disorders including gastrointestinal disorders (prevalence 24.3%) and auditory disorders and infections (prevalence 87.8%), and the third was characterized by psychiatric disorders (n = 212, prevalence 33.0%). The last group (n = 4316) could not be further resolved. The prevalence of psychiatric disorders was uncorrelated with seizure activity (P = .17), but a significant correlation existed between gastrointestinal disorders and seizures (P < .001). The correlation results were replicated by using a second sample of 496 individuals from a different geographic region. CONCLUSIONS: Three distinct patterns of medical trajectories were identified by unsupervised clustering of electronic health record diagnoses. These may point to distinct etiologies with different genetic and environmental contributions. Additional clinical and molecular characterizations will be required to further delineate these subgroups.
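The vector construction described in the methods can be sketched roughly as follows. This is an illustration only: it assumes the 1350 dimensions arise as 45 categories times 30 six-month blocks (an inference from the arithmetic, not something the abstract states), and the function name, the block-by-block layout, and the toy events are hypothetical.

```python
N_BLOCKS = 30          # ages 0 to 15 in 6-month blocks
N_CATEGORIES = 45      # assumed: 1350 / 30; not stated in the abstract


def trajectory_vector(events, n_categories=N_CATEGORIES, n_blocks=N_BLOCKS):
    """Count diagnosis categories per 6-month age block.

    events: iterable of (age_in_years, category_index) pairs.
    Returns a flat list of length n_categories * n_blocks, laid out block by block.
    """
    vector = [0] * (n_categories * n_blocks)
    for age_years, category in events:
        if age_years < 0:
            continue                        # ignore records dated before birth
        block = int(age_years * 2)          # 0.0-0.49 -> 0, 0.5-0.99 -> 1, ...
        if block < n_blocks and 0 <= category < n_categories:
            vector[block * n_categories + category] += 1
    return vector


# Toy patient: two codes in category 3 before age 1, one code in category 7 at age 14.6.
toy_events = [(0.2, 3), (0.4, 3), (14.6, 7)]
vec = trajectory_vector(toy_events)
```

Vectors of this shape, one per patient, are the kind of input a hierarchical clustering routine would then group into subgroups.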


national conference on artificial intelligence | 2011

A Bayesian nonparametric approach to modeling motion patterns

Joshua Mason Joseph; Finale Doshi-Velez; Albert S. Huang; Nicholas Roy

The most difficult, and often most essential, aspect of many interception and tracking tasks is constructing motion models of the targets. Experts rarely can provide complete information about a target’s expected motion pattern, and fitting parameters for complex motion patterns can require large amounts of training data. Specifying how to parameterize complex motion patterns is in itself a difficult task. In contrast, Bayesian nonparametric models of target motion are very flexible and generalize well with relatively little training data. We propose modeling target motion patterns as a mixture of Gaussian processes (GP) with a Dirichlet process (DP) prior over mixture weights. The GP provides an adaptive representation for each individual motion pattern, while the DP prior allows us to represent an unknown number of motion patterns. Both automatically adjust the complexity of the motion model based on the available data. Our approach outperforms several parametric models on a helicopter-based car-tracking task on data collected from the greater Boston area.
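How a DP prior leaves the number of motion patterns open can be illustrated with the Chinese restaurant process, a standard sequential view of DP cluster assignments. The sketch below draws assignments from the prior only; the per-pattern Gaussian process and the posterior inference used in the paper are omitted, and the function name and parameters are hypothetical.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process (alpha > 0)."""
    rng = random.Random(seed)
    counts = []        # counts[k] = number of items already in cluster k
    labels = []
    for _ in range(n_items):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)       # open a new cluster
        else:
            counts[k] += 1         # join an existing cluster
        labels.append(k)
    return labels, counts


labels, counts = crp_assignments(n_items=50, alpha=1.0)
```

The number of clusters is not fixed in advance: it grows as more trajectories arrive, at a rate controlled by `alpha`.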


international conference on machine learning | 2009

Accelerated sampling for the Indian Buffet Process

Finale Doshi-Velez; Zoubin Ghahramani

We often seek to identify co-occurring hidden features in a set of observations. The Indian Buffet Process (IBP) provides a non-parametric prior on the features present in each observation, but current inference techniques for the IBP often scale poorly. The collapsed Gibbs sampler for the IBP has a running time cubic in the number of observations, and the uncollapsed Gibbs sampler, while linear, is often slow to mix. We present a new linear-time collapsed Gibbs sampler for conjugate likelihood models and demonstrate its efficacy on large real-world datasets.
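For readers unfamiliar with the IBP, the following minimal sketch draws a binary feature matrix from the IBP prior using the usual "customers and dishes" construction. It is not the accelerated sampler the paper proposes, and it performs no posterior inference; the function names and the concentration parameter `alpha` are illustrative.

```python
import math
import random


def sample_poisson(rng, rate):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(n_observations, alpha, seed=0):
    """Draw a binary feature matrix (list of rows) from the IBP prior."""
    rng = random.Random(seed)
    feature_counts = []          # how many observations own each feature so far
    rows = []
    for n in range(1, n_observations + 1):
        # Reuse existing feature k with probability m_k / n.
        row = [1 if rng.random() < m / n else 0 for m in feature_counts]
        for k, owned in enumerate(row):
            feature_counts[k] += owned
        # Add Poisson(alpha / n) brand-new features.
        n_new = sample_poisson(rng, alpha / n)
        row.extend([1] * n_new)
        feature_counts.extend([1] * n_new)
        rows.append(row)
    # Pad earlier rows with zeros so the matrix is rectangular.
    width = len(feature_counts)
    return [row + [0] * (width - len(row)) for row in rows]


Z = sample_ibp(n_observations=10, alpha=2.0)
```

Each row of `Z` is one observation and each column one latent feature; the number of columns is itself random, which is what makes the prior nonparametric.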


Inflammatory Bowel Diseases | 2015

Prevalence of Inflammatory Bowel Disease among Patients with Autism Spectrum Disorders

Finale Doshi-Velez; Paul Avillach; Nathan Palmer; Athos Bousvaros; Yaorong Ge; Kathe Fox; Greg Steinberg; Claire M. Spettell; Iver Juster; Isaac S. Kohane

Background: The objective of this study was to measure the prevalence of inflammatory bowel disease (IBD) among patients with autism spectrum disorders (ASD), which has not been well described previously. Methods: The rates of IBD among patients with and without ASD were measured in 4 study populations with distinct modes of ascertainment: a health care benefits company, 2 pediatric tertiary care centers, and a national ASD repository. The rates of IBD (established through International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM] codes) were compared with respective controls and combined using a Stouffer meta-analysis. Clinical charts were also reviewed for IBD among patients with ICD-9-CM codes for both IBD and ASD at one of the pediatric tertiary care centers. This expert-verified rate was compared with the rate in the repository study population (where IBD diagnoses were established by expert review) and with nationally reported rates for pediatric IBD. Results: In all case–control study populations, the rates of IBD-related ICD-9-CM codes for patients with ASD were significantly higher than those of their respective controls (Stouffer meta-analysis, P < 0.001). Expert-verified rates of IBD among patients with ASD were 7 of 2728 patients in one study population and 16 of 7201 in a second study population. The age-adjusted prevalence of IBD among patients with ASD was higher than that of their respective controls and than nationally reported rates of pediatric IBD. Conclusions: Across all populations, each with a different mode of ascertainment, there was a consistent and statistically significant increase in the prevalence of IBD in patients with ASD relative to their respective controls and to nationally reported rates for pediatric IBD.
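As a rough illustration of how a Stouffer meta-analysis combines evidence across study populations, the sketch below implements the unweighted Z-score method for one-sided p-values. The abstract does not say whether weights or two-sided tests were used, so both choices here are assumptions, and the input p-values are made up for the example rather than taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values):
    """Combine one-sided p-values (each strictly between 0 and 1) with Stouffer's Z method."""
    std_normal = NormalDist()
    # Convert each p-value to a Z-score, then average on the sqrt(k) scale.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(z_scores) / sqrt(len(z_scores))
    p_combined = 1.0 - std_normal.cdf(z_combined)
    return z_combined, p_combined


# Hypothetical p-values from four independent populations.
z, p = stouffer_combine([0.04, 0.01, 0.20, 0.03])
```

Several individually modest p-values that point in the same direction yield a combined p-value smaller than any single one, which is the behaviour a meta-analysis across differently ascertained populations relies on.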


international joint conference on artificial intelligence | 2017

Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations

Andrew Slavin Ross; Michael C. Hughes; Finale Doshi-Velez

Neural networks are among the most accurate supervised learning methods in use today, but their opacity makes them difficult to trust in critical applications, especially when conditions in training differ from those in test. Recent work on explanations for black-box models has produced tools (e.g. LIME) to show the implicit rules behind predictions, which can help us identify when models are right for the wrong reasons. However, these methods do not scale to explaining entire datasets and cannot correct the problems they reveal. We introduce a method for efficiently explaining and regularizing differentiable models by examining and selectively penalizing their input gradients, which provide a normal to the decision boundary. We apply these penalties both based on expert annotation and in an unsupervised fashion that encourages diverse models with qualitatively different decision boundaries for the same classification problem. On multiple datasets, we show our approach generates faithful explanations and models that generalize much better when conditions differ between training and test.
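The idea of penalizing input gradients on annotated features can be sketched on a model small enough to differentiate by hand. The example below swaps in binary logistic regression for a neural network so that the input gradient has a closed form, and it assumes a squared penalty on the gradient of the summed log-probabilities; the abstract does not specify the exact form of the penalty, and all names (`rrr_style_loss`, `masks`, `penalty_strength`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rrr_style_loss(weights, bias, inputs, labels, masks, penalty_strength):
    """Cross-entropy plus a squared penalty on masked input gradients.

    masks[n][d] = 1 marks feature d of example n as one the model should not rely on.
    """
    cross_entropy = 0.0
    gradient_penalty = 0.0
    for x, y, mask in zip(inputs, labels, masks):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        cross_entropy -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        # For logistic regression, d/dx_d [log p + log(1 - p)] = (1 - 2p) * w_d.
        for w, m in zip(weights, mask):
            gradient_penalty += (m * (1.0 - 2.0 * p) * w) ** 2
    return cross_entropy + penalty_strength * gradient_penalty


weights, bias = [2.0, 0.0], 0.0
inputs, labels = [[1.0, 1.0], [-1.0, 1.0]], [1, 0]
no_mask = rrr_style_loss(weights, bias, inputs, labels, [[0, 0], [0, 0]], 1.0)
mask_used_feature = rrr_style_loss(weights, bias, inputs, labels, [[1, 0], [1, 0]], 1.0)
mask_unused_feature = rrr_style_loss(weights, bias, inputs, labels, [[0, 1], [0, 1]], 1.0)
```

Masking a feature the model actually uses raises the loss, while masking a feature with zero weight leaves it unchanged, so minimizing this objective pushes the model away from the annotated features.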


PLOS ONE | 2016

Electronic Health Record Based Algorithm to Identify Patients with Autism Spectrum Disorder

Todd Lingren; Pei Chen; Joseph Bochenek; Finale Doshi-Velez; Patty Manning-Courtney; Julie Bickel; Leah Wildenger Welchons; Judy Reinhold; Nicole Bing; Yizhao Ni; William J. Barbaresi; Frank D. Mentch; Melissa A. Basford; Joshua C. Denny; Lyam Vazquez; Cassandra Perry; Bahram Namjou; Haijun Qiu; John J. Connolly; Debra J. Abrams; Ingrid A. Holm; Beth A. Cobb; Nataline Lingren; Imre Solti; Hakon Hakonarson; Isaac S. Kohane; John B. Harley; Guergana Savova

Objective: Cohort selection is challenging for large-scale electronic health record (EHR) analyses, as International Classification of Diseases 9th edition (ICD-9) diagnostic codes are notoriously unreliable disease predictors. Our objective was to develop, evaluate, and validate an automated algorithm for determining an Autism Spectrum Disorder (ASD) patient cohort from EHR. We demonstrate its utility via the largest investigation to date of the co-occurrence patterns of medical comorbidities in ASD. Methods: We extracted ICD-9 codes and concepts derived from the clinical notes. A gold standard patient set was labeled by clinicians at Boston Children’s Hospital (BCH) (N = 150) and Cincinnati Children’s Hospital and Medical Center (CCHMC) (N = 152). Two algorithms were created: (1) a rule-based algorithm implementing the ASD criteria from the Diagnostic and Statistical Manual of Mental Disorders 4th edition, and (2) a predictive classifier. The positive predictive values (PPV) achieved by these algorithms were compared to an ICD-9 code baseline. We clustered the patients based on grouped ICD-9 codes and evaluated subgroups. Results: The rule-based algorithm produced the best PPV: (a) BCH: 0.885 vs. 0.273 (baseline); (b) CCHMC: 0.840 vs. 0.645 (baseline); (c) combined: 0.864 vs. 0.460 (baseline). A validation at Children’s Hospital of Philadelphia yielded 0.848 (PPV). Clustering analyses of comorbidities on the three-site large cohort (N = 20,658 ASD patients) identified psychiatric, developmental, and seizure disorder clusters. Conclusions: In a large cross-institutional cohort, co-occurrence patterns of comorbidities in ASDs provide further hypothetical evidence for distinct courses in ASD. The proposed automated algorithms for cohort selection open avenues for other large-scale EHR studies and individualized treatment of ASD.


Artificial Intelligence | 2012

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

Finale Doshi-Velez; Joelle Pineau; Nicholas Roy

Acting in domains where an agent must plan several steps ahead to achieve a goal can be a challenging task, especially if the agent's sensors provide only noisy or partial information. In this setting, Partially Observable Markov Decision Processes (POMDPs) provide a planning framework that optimally trades between actions that contribute to the agent's knowledge and actions that increase the agent's immediate reward. However, the task of specifying the POMDP's parameters is often onerous. In particular, setting the immediate rewards to achieve a desired balance between information-gathering and acting is often not intuitive. In this work, we propose an approximation based on minimizing the immediate Bayes risk for choosing actions when transition, observation, and reward models are uncertain. The Bayes-risk criterion avoids the computational intractability of solving a POMDP with a multi-dimensional continuous state space; we show it performs well in a variety of problems. We use policy queries, in which we ask an expert for the correct action, to infer the consequences of a potential pitfall without experiencing its effects. More important for human-robot interaction settings, policy queries allow the agent to learn the reward model without the reward values ever being specified.
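A minimal sketch of Bayes-risk action selection with an option to query the expert might look like the following. It assumes the model uncertainty is represented by a small set of weighted sampled models with precomputed action values, and that the agent asks a policy query whenever even the least risky action is expected to lose more than a fixed query cost. The thresholding rule, the function name, and all numbers are assumptions for illustration, not details given in the abstract.

```python
def bayes_risk_action(q_values, model_weights, query_cost):
    """Pick an action by expected loss over sampled models, or ask the expert.

    q_values[m][a]: value of action a under sampled model m (at the current belief).
    model_weights[m]: posterior weight of model m (weights sum to 1).
    Returns ("act", action_index) or ("query", None).
    """
    n_actions = len(q_values[0])
    risks = []
    for a in range(n_actions):
        # Expected shortfall of action a relative to each model's own best action.
        risk = sum(w * (max(q) - q[a]) for q, w in zip(q_values, model_weights))
        risks.append(risk)
    best = min(range(n_actions), key=risks.__getitem__)
    if risks[best] > query_cost:
        return ("query", None)     # every action is too risky: ask for the correct one
    return ("act", best)


# Two sampled models that agree on the best action: the agent simply acts.
agree = bayes_risk_action([[1.0, 0.0], [0.9, 0.0]], [0.5, 0.5], query_cost=0.5)
# Two sampled models that disagree sharply: the agent asks the expert instead.
disagree = bayes_risk_action([[10.0, 0.0], [0.0, 10.0]], [0.5, 0.5], query_cost=0.5)
```

When the sampled models agree, the risk of the shared best action is near zero and no query is needed; queries are reserved for situations where a wrong guess could be costly.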


Journal of the American Medical Informatics Association | 2016

Understanding vasopressor intervention and weaning: Risk prediction in a public heterogeneous clinical time series database.

Mike Wu; Marzyeh Ghassemi; Mengling Feng; Leo Anthony Celi; Peter Szolovits; Finale Doshi-Velez

Background: The widespread adoption of electronic health records allows us to ask evidence-based questions about the need for and benefits of specific clinical interventions in critical-care settings across large populations. Objective: We investigated the prediction of vasopressor administration and weaning in the intensive care unit. Vasopressors are commonly used to control hypotension, and changes in timing and dosage can have a large impact on patient outcomes. Materials and Methods: We considered a cohort of 15 695 intensive care unit patients without orders for reduced care who were alive 30 days post-discharge. A switching-state autoregressive model (SSAM) was trained to predict the multidimensional physiological time series of patients before, during, and after vasopressor administration. The latent states from the SSAM were used as predictors of vasopressor administration and weaning. Results: The unsupervised SSAM features were able to predict patient vasopressor administration and successful patient weaning. Features derived from the SSAM achieved areas under the receiver operating characteristic curve of 0.92, 0.88, and 0.71 for predicting ungapped vasopressor administration, gapped vasopressor administration, and vasopressor weaning, respectively. We also demonstrated many cases where our model predicted weaning well in advance of a successful wean. Conclusion: Models that used SSAM features increased performance on both predictive tasks. These improvements may reflect an underlying, and ultimately predictive, latent state detectable from the physiological time series.


international conference on robotics and automation | 2012

A Bayesian nonparametric approach to modeling battery health

Joshua Mason Joseph; Finale Doshi-Velez; Nicholas Roy

The batteries of many consumer products are both a substantial portion of the product's cost and commonly a first point of failure. Accurately predicting remaining battery life can lower costs by reducing unnecessary battery replacements. Unfortunately, battery dynamics are extremely complex, and we often lack the domain knowledge required to construct a model by hand. In this work, we take a data-driven approach and aim to learn a model of battery time-to-death from training data. Using a Dirichlet process prior over mixture weights, we learn an infinite mixture model for battery health. The Bayesian aspect of our model helps to avoid over-fitting while the nonparametric nature of the model allows the data to control the size of the model, preventing under-fitting. We demonstrate our model's effectiveness by making time-to-death predictions using real data from nickel-metal hydride battery packs.


Statistics and Computing | 2017

Restricted Indian buffet processes

Finale Doshi-Velez; Sinead A. Williamson

Latent feature models are a powerful tool for modeling data with globally-shared features. Nonparametric distributions over exchangeable sets of features, such as the Indian Buffet Process, offer modeling flexibility by letting the number of latent features be unbounded. However, current models impose implicit distributions over the number of latent features per data point, and these implicit distributions may not match our knowledge about the data. In this work, we demonstrate how the restricted Indian buffet process circumvents this restriction, allowing arbitrary distributions over the number of features in an observation. We discuss several alternative constructions of the model and apply the insights to develop Markov Chain Monte Carlo and variational methods for simulation and posterior inference.

Collaboration


Dive into Finale Doshi-Velez's collaborations.

Top Co-Authors

Nicholas Roy

Massachusetts Institute of Technology

Been Kim

Massachusetts Institute of Technology