Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Noémie Elhadad is active.

Publication


Featured researches published by Noémie Elhadad.


Journal of Artificial Intelligence Research | 2002

Inferring strategies for sentence ordering in multidocument news summarization

Regina Barzilay; Noémie Elhadad; Kathleen R. McKeown

The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done on a corpus of multiple acceptable orderings we developed for the task. Based on these experiments, we implemented a strategy for ordering information that combines constraints from chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement of the ordering over two baseline strategies.


empirical methods in natural language processing | 2003

Sentence alignment for monolingual comparable corpora

Regina Barzilay; Noémie Elhadad

We address the problem of sentence alignment for monolingual corpora, a phenomenon distinct from alignment in parallel corpora. Aligning large comparable corpora automatically would provide a valuable resource for learning of text-to-text rewriting rules. We incorporate context into the search for an optimal alignment in two complementary ways: learning rules for matching paragraphs using topic structure and further refining the matching through local alignment to find good sentence pairs. Evaluation shows that our alignment method outperforms state-of-the-art systems developed for the same task.


Journal of the American Medical Informatics Association | 2014

A review of approaches to identifying patient phenotype cohorts using electronic health records

Chaitanya Shivade; Preethi Raghavan; Eric Fosler-Lussier; Peter J. Embi; Noémie Elhadad; Stephen B. Johnson; Albert M. Lai

Objective To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses.


knowledge discovery and data mining | 2015

Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission

Rich Caruana; Yin Lou; Johannes Gehrke; Paul Koch; Marc Sturm; Noémie Elhadad

In machine learning often a tradeoff must be made between accuracy and intelligibility. More accurate models such as boosted trees, random forests, and neural nets usually are not intelligible, but more intelligible models such as logistic regression, naive-Bayes, and single decision trees often have significantly worse accuracy. This tradeoff sometimes limits the accuracy of models that can be applied in mission-critical applications such as healthcare where being able to understand, validate, edit, and trust a learned model is important. We present two case studies where high-performance generalized additive models with pairwise interactions (GA2Ms) are applied to real healthcare problems yielding intelligible models with state-of-the-art accuracy. In the pneumonia risk prediction case study, the intelligible model uncovers surprising patterns in the data that previously had prevented complex learned models from being fielded in this domain, but because it is intelligible and modular allows these patterns to be recognized and removed. In the 30-day hospital readmission case study, we show that the same methods scale to large datasets containing hundreds of thousands of patients and thousands of attributes while remaining intelligible and providing accuracy comparable to the best (unintelligible) machine learning methods.


international conference on human language technology research | 2001

Sentence ordering in multidocument summarization

Regina Barzilay; Noémie Elhadad; Kathleen R. McKeown

The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. In this paper, we describe two naive ordering techniques and show that they do not perform well. We present an integrated strategy for ordering information, combining constraints from chronological order of events and cohesion. This strategy was derived from empirical observations based on experiments asking humans to order information. Evaluation of our augmented algorithm shows a significant improvement of the ordering over the two naive techniques we used as baseline.


meeting of the association for computational linguistics | 2011

Putting it Simply: a Context-Aware Approach to Lexical Simplification

Or Biran; Samuel Brody; Noémie Elhadad

We present a method for lexical simplification. Simplification rules are learned from a comparable corpus, and the rules are applied in a context-aware fashion to input sentences. Our method is unsupervised. Furthermore, it does not require any alignment or correspondence among the complex and simple corpora. We evaluate the simplification according to three criteria: preservation of grammaticality, preservation of meaning, and degree of simplification. Results show that our method outperforms an established simplification baseline for both meaning preservation and simplification, while maintaining a high level of grammaticality.


Artificial Intelligence in Medicine | 2005

Customization in a unified framework for summarizing medical literature

Noémie Elhadad; Min-Yen Kan; Judith L. Klavans; Kathleen R. McKeown

OBJECTIVE We present the summarization system in the PErsonalized Retrieval and Summarization of Images, Video and Language (PERSIVAL) medical digital library. Although we discuss the context of our summarization research within the PERSIVAL platform, the primary focus of this article is on strategies to define and generate customized summaries. METHODS AND MATERIAL Our summarizer employs a unified user model to create a tailored summary of relevant documents for either a physician or lay person. The approach takes advantage of regularities in medical literature text structure and content to fulfill identified user needs. RESULTS The resulting summaries combine both machine-generated text and extracted text that comes from multiple input documents. Customization includes both group-based modeling for two classes of users, physician and lay person, and individually driven models based on a patient record. CONCLUSIONS Our research shows that customization is feasible in a medical digital library.


Journal of the American Medical Informatics Association | 2015

Evaluating the state of the art in disorder recognition and normalization of the clinical narrative

Sameer Pradhan; Noémie Elhadad; Brett R. South; David Martinez; Lee M. Christensen; Amy Vogel; Hanna Suominen; Wendy W. Chapman; Guergana Savova

Objective The ShARe/CLEF eHealth 2013 Evaluation Lab Task 1 was organized to evaluate the state of the art on the clinical text in (i) disorder mention identification/recognition based on Unified Medical Language System (UMLS) definition (Task 1a) and (ii) disorder mention normalization to an ontology (Task 1b). Such a community evaluation has not been previously executed. Task 1a included a total of 22 system submissions, and Task 1b included 17. Most of the systems employed a combination of rules and machine learners. Materials and methods We used a subset of the Shared Annotated Resources (ShARe) corpus of annotated clinical text—199 clinical notes for training and 99 for testing (roughly 180 K words in total). We provided the community with the annotated gold standard training documents to build systems to identify and normalize disorder mentions. The systems were tested on a held-out gold standard test set to measure their performance. Results For Task 1a, the best-performing system achieved an F1 score of 0.75 (0.80 precision; 0.71 recall). For Task 1b, another system performed best with an accuracy of 0.59. Discussion Most of the participating systems used a hybrid approach by supplementing machine-learning algorithms with features generated by rules and gazetteers created from the training data and from external resources. Conclusions The task of disorder normalization is more challenging than that of identification. The ShARe corpus is available to the community as a reference standard for future studies.


meeting of the association for computational linguistics | 2007

Mining a Lexicon of Technical Terms and Lay Equivalents

Noémie Elhadad; Komal Sutaria

We present a corpus-driven method for building a lexicon of semantically equivalent pairs of technical and lay medical terms. Using a parallel corpus of abstracts of clinical studies and corresponding news stories written for a lay audience, we identify terms which are good semantic equivalents of technical terms for a lay audience. Our method relies on measures of association. Results show that, despite the small size of our corpus, a promising number of pairs are identified.


acm ieee joint conference on digital libraries | 2003

Leveraging a common representation for personalized search and summarization in a medical digital library

Kathleen R. McKeown; Noémie Elhadad; Vasileios Hatzivassiloglou

Despite the large amount of online medical literature, it can be difficult for clinicians to find relevant information at the point of patient care. We present techniques to personalize the results of search, making use of the online patient record as a sophisticated, preexisting user model. Our work in PERSIVAL, a medical digital library, includes methods for reranking the results of search to prioritize those that better match the patient record. It also generates summaries of the reranked results, which highlight information that is relevant to the patient under the physicians care. We focus on the use of a common representation for the articles returned by search and the patient record, which facilitates both the reranking and the summarization tasks. This common approach to both tasks has a strong positive effect on the ability to personalize information.

Collaboration


Dive into the Noémie Elhadad's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Guergana Savova

Boston Children's Hospital

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge