Franck Dernoncourt

Massachusetts Institute of Technology

Publication

Featured research published by Franck Dernoncourt.

North American Chapter of the Association for Computational Linguistics | 2016

Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks

Ji Young Lee; Franck Dernoncourt

Recent approaches based on artificial neural networks (ANNs) have shown promising results for short-text classification. However, many short texts occur in sequences (e.g., sentences in a document or utterances in a dialog), and most existing ANN-based systems do not leverage the preceding short texts when classifying a subsequent one. In this work, we present a model based on recurrent neural networks and convolutional neural networks that incorporates the preceding short texts. Our model achieves state-of-the-art results on three different datasets for dialog act prediction.

Journal of the American Medical Informatics Association | 2016

De-identification of patient notes with recurrent neural networks

Franck Dernoncourt; Ji Young Lee; Özlem Uzuner; Peter Szolovits

Objective: Patient notes in electronic health records (EHRs) may contain critical information for medical investigations. However, the vast majority of medical investigators can only access de-identified notes, in order to protect the confidentiality of patients. In the United States, the Health Insurance Portability and Accountability Act (HIPAA) defines 18 types of protected health information that needs to be removed to de-identify patient notes. Manual de-identification is impractical given the size of electronic health record databases, the limited number of researchers with access to non-de-identified notes, and the frequent mistakes of human annotators. A reliable automated de-identification system would consequently be of high value.

Materials and Methods: We introduce the first de-identification system based on artificial neural networks (ANNs), which requires no handcrafted features or rules, unlike existing systems. We compare the performance of the system with state-of-the-art systems on two datasets: the i2b2 2014 de-identification challenge dataset, which is the largest publicly available de-identification dataset, and the MIMIC de-identification dataset, which we assembled and is twice as large as the i2b2 2014 dataset.

Results: Our ANN model outperforms the state-of-the-art systems. It yields an F1-score of 97.85 on the i2b2 2014 dataset, with a recall of 97.38 and a precision of 98.32, and an F1-score of 99.23 on the MIMIC de-identification dataset, with a recall of 99.25 and a precision of 99.21.

Conclusion: Our findings support the use of ANNs for de-identification of patient notes, as they show better performance than previously published systems while requiring no manual feature engineering.
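The F1-scores quoted in this abstract follow from the reported precision and recall, since F1 is their harmonic mean. The short check below is only an illustrative sketch; the function name `f1_score` is ours and is not taken from the paper's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as reported in the abstract (in percent).
i2b2_f1 = f1_score(precision=98.32, recall=97.38)
mimic_f1 = f1_score(precision=99.21, recall=99.25)

print(round(i2b2_f1, 2))   # 97.85
print(round(mimic_f1, 2))  # 99.23
```

Both values match the F1-scores stated in the abstract after rounding to two decimals.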

arXiv: Computation and Language | 2017

Robust Dialog State Tracking for Large Ontologies

Franck Dernoncourt; Ji Young Lee; Trung Bui; Hung Hai Bui

The Dialog State Tracking Challenge 4 (DSTC 4) differentiates itself from the previous three editions as follows: the number of slot-value pairs present in the ontology is much larger, no spoken language understanding output is given, and utterances are labeled at the subdialog level. This paper describes a novel dialog state tracking method designed to work robustly under these conditions, using elaborate string matching, coreference resolution tailored for dialogs and a few other improvements. The method can correctly identify many values that are not explicitly present in the utterance. On the final evaluation, our method came in first among 7 competing teams and 24 entries. The F1-score achieved by our method was 9 and 7 percentage points higher than that of the runner-up for the utterance-level evaluation and for the subdialog-level evaluation, respectively.

Archive | 2016

Improving Patient Cohort Identification Using Natural Language Processing

Raymond Francis Sarmiento; Franck Dernoncourt

Retrieving information from structured data tables in a large database may be performed with little to no difficulty, but structured data may not always contain all that is needed to retrieve accurate information compared to narratives from clinical notes. The large volume of clinical notes, however, requires special processing to access the information contained in their unstructured format. In this case study, we present a comparison of two techniques (structured data extraction and natural language processing) and we evaluate their utility in identifying a specific patient cohort from a large clinical database.

Genetic and Evolutionary Computation Conference | 2013

Imprecise selection and fitness approximation in a large-scale evolutionary rule based system for blood pressure prediction

Erik Hemberg; Kalyan Veeramachaneni; Franck Dernoncourt; Mark Wagy; Una-May O'Reilly

We present how we have strategically allocated fitness evaluations in a large-scale rule-based evolutionary system called ECStar. We describe a strategy that culls potentially weaker solutions early and later lets the survivors compete only with solutions that have had an equivalent number of fitness evaluations, as they are evaluated on more fitness cases. Despite incurring some imprecision in fitness comparison, which arises from not evaluating on all the fitness cases or even the same ones, the strategy allows our system to make effective progress when the resources at its disposal are unpredictably available.

Genetic and Evolutionary Computation Conference | 2013

Efficient training set use for blood pressure prediction in a large scale learning classifier system

Erik Hemberg; Kalyan Veeramachaneni; Franck Dernoncourt; Mark Wagy; Una-May O'Reilly

We define a machine learning problem to forecast arterial blood pressure. Our goal is to solve this problem with a large-scale learning classifier system. Because learning classifier systems are extremely computationally intensive and this problem's eventually large training set will be very costly to execute, we address how to use less of the training set while not negatively impacting learning accuracy. Our approach is to allow competition among solutions which have not been evaluated on the entire training set. The best of these solutions are then evaluated on more of the training set while their offspring start off being evaluated on less of the training set. To keep selection fair, we divide competing solutions according to how many training examples they have been tested on.
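The idea of keeping selection fair by grouping solutions according to how many training examples they have seen can be sketched as follows. This is a minimal illustration under our own assumptions, not the system described in the paper: the `Solution` class, the `bin_size` and `keep_fraction` parameters, and the convention that higher fitness is better are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    fitness: float     # assumed: higher is better
    n_evaluated: int   # training examples this solution has been tested on


def select_within_groups(population, bin_size, keep_fraction=0.5):
    """Group solutions by evaluation count, then keep the fittest of each group.

    Solutions only compete against others in the same bin, so a solution
    tested on few examples is never ranked against one tested on many.
    """
    groups = defaultdict(list)
    for s in population:
        groups[s.n_evaluated // bin_size].append(s)

    survivors = []
    for key in sorted(groups):
        ranked = sorted(groups[key], key=lambda s: s.fitness, reverse=True)
        n_keep = max(1, int(len(ranked) * keep_fraction))
        survivors.extend(ranked[:n_keep])
    return survivors


population = [
    Solution("a", 0.9, 100),
    Solution("b", 0.4, 120),
    Solution("c", 0.6, 1000),
    Solution("d", 0.7, 1100),
]
survivors = select_within_groups(population, bin_size=500)
print([s.name for s in survivors])  # ['a', 'd']
```

In this toy run, "a" and "b" compete in one group and "c" and "d" in another, so "d" survives even though its fitness is lower than that of "a".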

Archive | 2016

Trend Analysis: Evolution of Tidal Volume Over Time for Patients Receiving Invasive Mechanical Ventilation

Anuj Mehta; Franck Dernoncourt; Allan Walkey

Since the publication of the original landmark trial detailing the mortality benefits of low tidal volume ventilation among patients with the acute respiratory distress syndrome (non-cardiogenic pulmonary edema) (Amato et al. in The New England Journal of Medicine 342(18):1301–1308), epidemiological studies have demonstrated that tidal volumes used for mechanically ventilated patients in medical intensive care units have become lower over time (Esteban et al. in American Journal of Respiratory and Critical Care Medicine 177(2):170–177; Esteban et al. in American Journal of Respiratory and Critical Care Medicine 188(2):220). Because patients with heart failure (cardiogenic pulmonary edema) have been systematically excluded from studies investigating low tidal volume mechanical ventilation, the benefit of a low tidal volume strategy among cardiac patients is unclear. We sought to determine whether evidence supporting use of low tidal volumes in patients with non-cardiogenic edema has been generalized into the care of patients with cardiogenic pulmonary edema.

Archive | 2013