Lucy Vanderwende | Researchain

Publication

Featured research published by Lucy Vanderwende.

meeting of the association for computational linguistics | 1998

MindNet: Acquiring and Structuring Semantic Information from Text

Stephen D. Richardson; William B. Dolan; Lucy Vanderwende

As a lexical knowledge base constructed automatically from the definitions and example sentences in two machine-readable dictionaries (MRDs), MindNet embodies several features that distinguish it from prior work with MRDs. It is, however, more than this static resource alone. MindNet represents a general methodology for acquiring, structuring, accessing, and exploiting semantic information from natural language text. This paper provides an overview of the distinguishing characteristics of MindNet, the steps involved in its creation, and its extension beyond dictionary text.

international acm sigir conference on research and development in information retrieval | 2006

A compositional context sensitive multi-document summarizer: exploring the factors that influence summarization

Ani Nenkova; Lucy Vanderwende; Kathleen R. McKeown

The usual approach for automatic summarization is sentence extraction, where key sentences from the input documents are selected based on a suite of features. While word frequency often is used as a feature in summarization, its impact on system performance has not been isolated. In this paper, we study the contribution to summarization of three factors related to frequency: content word frequency, composition functions for estimating sentence importance from word frequency, and adjustment of frequency weights based on context. We carry out our analysis using datasets from the Document Understanding Conferences, studying not only the impact of these features on automatic summarizers, but also their role in human summarization. Our research shows that a frequency based summarizer can achieve performance comparable to that of state-of-the-art systems, but only with a good composition function; context sensitivity improves performance and significantly reduces repetition.
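The three frequency factors named in this abstract (content word frequency, a composition function, and context-based adjustment of weights) can be illustrated with a minimal sketch. This is not the authors' system: the stopword list, the tokenizer, the two composition functions offered, and the choice to square the probability of already-used words are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "to"}


def content_words(sentence):
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    words = [w.strip(".,;:!?\"'") for w in sentence.lower().split()]
    return [w for w in words if w and w not in STOPWORDS]


def summarize(sentences, max_sentences=2, compose="average"):
    """Greedy frequency-based sentence extraction.

    compose: "average" or "sum" of content-word probabilities per sentence.
    """
    tokenized = [content_words(s) for s in sentences]
    counts = Counter(w for words in tokenized for w in words)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}

    def score(words):
        if not words:
            return 0.0
        s = sum(prob[w] for w in words)
        return s / len(words) if compose == "average" else s

    chosen = []
    remaining = list(range(len(sentences)))
    while remaining and len(chosen) < max_sentences:
        best = max(remaining, key=lambda i: score(tokenized[i]))
        chosen.append(best)
        remaining.remove(best)
        # Context adjustment (assumed rule): down-weight words already covered,
        # so later picks are less likely to repeat the same content.
        for w in set(tokenized[best]):
            prob[w] = prob[w] ** 2
    return [sentences[i] for i in chosen]
```

In this sketch the composition function changes which sentence is picked first, and the down-weighting step steers the second pick away from a sentence that repeats words already covered.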

Information Processing and Management | 2007

Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion

Lucy Vanderwende; Hisami Suzuki; Chris Brockett; Ani Nenkova

In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.

international conference on computational linguistics | 1994

Algorithm for automatic interpretation of noun sequences

Lucy Vanderwende

This paper describes an algorithm for automatically interpreting noun sequences in unrestricted text. This system uses broad-coverage semantic information which has been acquired automatically by analyzing the definitions in an on-line dictionary. Previously, computational studies of noun sequences made use of hand-coded semantic information, and they applied the analysis rules sequentially. In contrast, the task of analyzing noun sequences in unrestricted text strongly favors an algorithm according to which the rules are applied in parallel and the best interpretation is determined by weights associated with rule applications.

international conference on machine learning | 2005

What syntax can contribute in the entailment task

Lucy Vanderwende; William B. Dolan

We describe our submission to the PASCAL Recognizing Textual Entailment Challenge, which attempts to isolate the set of Text-Hypothesis pairs whose categorization can be accurately predicted based solely on syntactic cues. Two human annotators examined each pair, showing that a surprisingly large proportion of the data – 34% of the test items – can be handled with syntax alone, while adding information from a general-purpose thesaurus increases this to 48%.

international conference on computational linguistics | 1992

Structural patterns vs. string patterns for extracting semantic information from dictionaries

Simonetta Montemagni; Lucy Vanderwende

This chapter presents evidence for preferring to extract semantic information from a syntactic analysis of a dictionary definition rather than directly from the definition string itself when the information to be extracted is found in the differentiae. We present examples of how very complex information can be extracted from the differentiae of the definition using structural analysis patterns, and why string patterns would fail to do the same.

language and technology conference | 2006

Effectively Using Syntax for Recognizing False Entailment

Rion Snow; Lucy Vanderwende; Arul Menezes

Recognizing textual entailment is a challenging problem and a fundamental component of many applications in natural language processing. We present a novel framework for recognizing textual entailment that focuses on the use of syntactic heuristics to recognize false entailment. We give a thorough analysis of our system, which demonstrates state-of-the-art performance on a widely-used test set.

north american chapter of the association for computational linguistics | 2016

A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories

Nasrin Mostafazadeh; Nathanael Chambers; Xiaodong He; Devi Parikh; Dhruv Batra; Lucy Vanderwende; Pushmeet Kohli; James F. Allen

Representation and learning of commonsense knowledge is one of the foundational problems in the quest to enable deep language understanding. This issue is particularly challenging for understanding causal and correlational relationships between events. While this topic has received a lot of interest in the NLP community, research has been hindered by the lack of a proper evaluation framework. This paper attempts to address this problem with a new framework for evaluating story understanding and script learning: the 'Story Cloze Test'. This test requires a system to choose the correct ending to a four-sentence story. We created a new corpus of 50k five-sentence commonsense stories, ROCStories, to enable this evaluation. This corpus is unique in two ways: (1) it captures a rich set of causal and temporal commonsense relations between daily events, and (2) it is a high quality collection of everyday life stories that can also be used for story generation. Experimental evaluation shows that a host of baselines and state-of-the-art models based on shallow language understanding struggle to achieve a high score on the Story Cloze Test. We discuss these implications for script and story learning, and offer suggestions for deeper language understanding.
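The evaluation setup described in this abstract, choosing the correct ending for a four-sentence story, can be sketched as follows. The data layout (context, candidate endings, index of the correct ending), the use of two candidate endings in the usage example, and the word-overlap scorer are assumptions for illustration; the scorer is only a stand-in for a shallow baseline, not a model from the paper.

```python
def overlap_score(context, ending):
    """Count distinct words shared between the story context and an ending."""
    context_words = set(" ".join(context).lower().split())
    return len(context_words & set(ending.lower().split()))


def choose_ending(context, endings, scorer=overlap_score):
    """Return the index of the candidate ending with the highest score."""
    return max(range(len(endings)), key=lambda i: scorer(context, endings[i]))


def accuracy(items, scorer=overlap_score):
    """items: list of (context_sentences, candidate_endings, gold_index)."""
    correct = sum(
        1
        for context, endings, gold in items
        if choose_ending(context, endings, scorer) == gold
    )
    return correct / len(items)
```

A hypothetical item would look like `(["Anna baked a cake", "She put it in the oven", "The timer rang", "She opened the oven door"], ["Anna took the cake out", "Bob drove to work"], 0)`; any scoring function with the same signature can be passed as `scorer`.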

Journal of the American Medical Informatics Association | 2012

Pneumonia identification using statistical feature selection

Cosmin Adrian Bejan; Fei Xia; Lucy Vanderwende; Mark M. Wurfel; Meliha Yetisgen-Yildiz

OBJECTIVE: This paper describes a natural language processing system for the task of pneumonia identification. Based on the information extracted from the narrative reports associated with a patient, the task is to identify whether or not the patient is positive for pneumonia. DESIGN: A binary classifier was employed to identify pneumonia from a dataset of multiple types of clinical notes created for 426 patients during their stay in the intensive care unit. For this purpose, three types of features were considered: (1) word n-grams, (2) Unified Medical Language System (UMLS) concepts, and (3) assertion values associated with pneumonia expressions. System performance was greatly increased by a feature selection approach which uses statistical significance testing to rank features based on their association with the two categories of pneumonia identification. RESULTS: Besides testing our system on the entire cohort of 426 patients (unrestricted dataset), we also used a smaller subset of 236 patients (restricted dataset). The performance of the system was compared with the results of a baseline previously proposed for these two datasets. The best results achieved by the system (85.71 and 81.67 F1-measure) are significantly better than the baseline results (50.70 and 49.10 F1-measure) on the restricted and unrestricted datasets, respectively. CONCLUSION: Using a statistical feature selection approach that allows the feature extractor to consider only the most informative features from the feature space significantly improves the performance over a baseline that uses all the features from the same feature space. Extracting the assertion value for pneumonia expressions further improves the system performance.
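Ranking features by their statistical association with the two classes can be sketched as below. The abstract does not say which significance test was used; this sketch assumes a chi-square statistic over a 2x2 contingency table of feature presence versus class, and the feature names in the usage note are invented.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 table.

    a: positive cases with the feature    b: negative cases with the feature
    c: positive cases without it          d: negative cases without it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # feature or class is constant; no association measurable
    return n * (a * d - b * c) ** 2 / denom


def rank_features(docs, labels):
    """Rank features from most to least associated with the class label.

    docs: list of sets of features, one set per patient record.
    labels: list of booleans, True for a positive case.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    vocab = set().union(*docs)
    scores = {}
    for feature in vocab:
        a = sum(1 for d, y in zip(docs, labels) if y and feature in d)
        b = sum(1 for d, y in zip(docs, labels) if not y and feature in d)
        scores[feature] = chi_square(a, b, n_pos - a, n_neg - b)
    # Highest statistic first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))
```

With toy records such as `[{"infiltrate", "fever"}, {"infiltrate", "cough"}, {"clear", "fever"}, {"clear", "cough"}]` and labels `[True, True, False, False]`, the class-specific words rank above the words that occur equally in both classes; a classifier would then be trained on only the top-ranked features.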

meeting of the association for computational linguistics | 2016

Generating Natural Questions About an Image

Nasrin Mostafazadeh; Ishan Misra; Jacob Devlin; Margaret Mitchell; Xiaodong He; Lucy Vanderwende

There has been an explosion of work in the vision & language community during the past few years from image captioning to video transcription, and answering questions about images. These tasks have focused on literal descriptions of the image. To move beyond the literal, we choose to explore how questions about an image are often directed at commonsense inference and the abstract events evoked by objects in the image. In this paper, we introduce the novel task of Visual Question Generation (VQG), where the system is tasked with asking a natural and engaging question when shown an image. We provide three datasets which cover a variety of images from object-centric to event-centric, with considerably more abstract training data than provided to state-of-the-art captioning systems thus far. We train and test several generative and retrieval models to tackle the task of VQG. Evaluation results show that while such models ask reasonable questions for a variety of images, there is still a wide gap with human performance which motivates further work on connecting images with commonsense knowledge and pragmatics. Our proposed task offers a new challenge to the community which we hope furthers interest in exploring deeper connections between vision & language.
