Elizabeth Boschee
BBN Technologies
Publication
Featured research published by Elizabeth Boschee.
International Conference on Human Language Technology Research | 2001
Lance A. Ramshaw; Elizabeth Boschee; Sergey Bratus; Scott Miller; Rebecca Stone; Ralph M. Weischedel; Alex Zamanian
Unlike earlier information extraction research programs, the new ACE (Automatic Content Extraction) program calls for entity extraction by identifying and linking all of the mentions of an entity in the source text, including names, descriptions, and pronouns. Coreference is therefore a key component. BBN has developed statistical coreference models for this task, including one for pronoun coreference that we describe here in some detail. In addition, ACE calls for extraction not just from clean text, but also from noisy speech and OCR input. Since speech recognizer output includes neither case nor punctuation, we have extended our statistical parser to perform sentence breaking integrated with parsing in a probabilistic model.
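The sentence-breaking idea above can be illustrated with a toy sketch (our own, not BBN's system): score candidate boundaries in a caseless, unpunctuated token stream by combining a boundary prior with simple lexical evidence. The probability tables and the single-boundary search are hypothetical; a real system would estimate such statistics from treebank data and integrate them with the parsing model.

```python
# Toy illustration: score candidate sentence boundaries in a
# caseless, unpunctuated token stream. All probabilities here are
# invented for demonstration.
import math

# Hypothetical unigram statistics: P(token starts a sentence) and
# P(token ends a sentence).
START_PROB = {"the": 0.30, "president": 0.05, "he": 0.25, "said": 0.01}
END_PROB = {"today": 0.40, "said": 0.02, "president": 0.10, "the": 0.001}

def boundary_score(tokens, i, prior=0.1):
    """Log-probability that a sentence boundary falls after tokens[i]."""
    p_end = END_PROB.get(tokens[i], 0.05)
    p_start = START_PROB.get(tokens[i + 1], 0.05)
    return math.log(prior) + math.log(p_end) + math.log(p_start)

def best_boundary(tokens):
    """Return the single most likely boundary position (toy search)."""
    positions = range(len(tokens) - 1)
    return max(positions, key=lambda i: boundary_score(tokens, i))

tokens = "the president spoke today he said nothing new".split()
i = best_boundary(tokens)
print(tokens[: i + 1], tokens[i + 1 :])
```

A real model would search over all boundary configurations jointly with parse structure rather than picking one boundary greedily.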
Archive | 2013
Elizabeth Boschee; Premkumar Natarajan; Ralph M. Weischedel
Automated analysis of news reports is a significant empowering technology for predictive models of political instability. To date, the standard approach to this analytic task has been embodied in systems such as KEDS/TABARI [1], which use manually-generated rules and shallow parsing techniques to identify events and their participants in text. In this chapter we explore an alternative to event extraction based on BBN SERIF™ and BBN OnTopic™, two state-of-the-art statistical natural language processing engines. We empirically compare this new approach to existing event extraction techniques on five dimensions: (1) Accuracy: when an event is reported by the system, how often is it correct? (2) Coverage: how many events are correctly reported by the system? (3) Filtering of historical events: how well are historical events (e.g. 9/11) correctly filtered out of the current event data stream? (4) Topic-based event filtering: how well do systems filter out red herrings based on document topic, such as sports documents mentioning “clashes” between two countries on the playing field? (5) Domain shift: how well do event extraction models perform on data originating from diverse sources? In all dimensions we show significant improvement to the state-of-the-art by applying statistical natural language processing techniques. It is our hope that these results will lead to greater acceptance of automated coding by creators and consumers of social science models that depend on event data and provide a new way to improve the accuracy of those predictive models.
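The first two evaluation dimensions correspond to standard precision ("accuracy": of the events the system reports, how many are correct) and recall ("coverage": of the true events, how many the system reports). A minimal sketch of computing them, with invented example events:

```python
# Illustrative sketch (not from the chapter): precision and recall
# over sets of extracted (event_type, location) tuples.

def precision_recall(system_events, gold_events):
    """Compute precision and recall over sets of extracted events."""
    system, gold = set(system_events), set(gold_events)
    true_positives = len(system & gold)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical system output vs. gold annotations for one document.
system = {("protest", "cairo"), ("clash", "gaza"), ("meeting", "paris")}
gold = {("protest", "cairo"), ("clash", "gaza"), ("attack", "mosul")}
p, r = precision_recall(system, gold)
print(f"accuracy (precision) = {p:.2f}, coverage (recall) = {r:.2f}")
# → accuracy (precision) = 0.67, coverage (recall) = 0.67
```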
Empirical Methods in Natural Language Processing | 2014
Ferhan Türe; Elizabeth Boschee
When documents and queries are presented in different languages, the common approach is to translate the query into the document language. While there are a variety of query translation approaches, recent research suggests that combining multiple methods into a single "structured query" is the most effective. In this paper, we introduce a novel approach for producing a unique combination recipe for each query, motivated by the observation that the optimal combination weights differ substantially across queries and other task specifics. Our query-specific combination method generates statistically significant improvements over other combination strategies presented in the literature, such as uniform and task-specific weighting. An in-depth empirical analysis presents insights about the effect of data size, domain differences, labeling and tuning on the end performance of our approach.
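A hypothetical sketch of the core idea (method names, translations, and weights are all invented): each translation method proposes weighted alternatives for a query term, a structured query merges them, and instead of one global mixture, the mixture weights are chosen per query.

```python
# Illustrative sketch, not the paper's implementation: merge
# alternative translations from several methods into one weighted
# alternative set (a "structured query" term).

def combine_translations(per_method_alts, method_weights):
    """Weighted merge of per-method translation distributions."""
    combined = {}
    for method, alts in per_method_alts.items():
        w = method_weights[method]
        for translation, prob in alts.items():
            combined[translation] = combined.get(translation, 0.0) + w * prob
    total = sum(combined.values())
    return {t: p / total for t, p in combined.items()}

# Alternatives for one hypothetical query term from three methods.
alts = {
    "dictionary": {"banco": 0.7, "orilla": 0.3},
    "smt_10best": {"banco": 0.9, "ribera": 0.1},
    "context": {"banco": 0.5, "orilla": 0.5},
}

# Query-specific weights: in the paper's setting a separate model
# would predict these per query; here they are fixed for illustration.
weights = {"dictionary": 0.2, "smt_10best": 0.5, "context": 0.3}
print(combine_translations(alts, weights))
```

Uniform weighting is the special case where every method gets the same weight; the paper's contribution is choosing the weights per query rather than once per task.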
Empirical Methods in Natural Language Processing | 2016
Ferhan Türe; Elizabeth Boschee
In multilingual question answering, either the question needs to be translated into the document language, or vice versa. In addition to direction, there are multiple methods to perform the translation, four of which we explore in this paper: word-based, 10-best, context-based, and grammar-based. We build a feature for each combination of translation direction and method, and train a model that learns optimal feature weights. On a large forum dataset consisting of posts in English, Arabic, and Chinese, our novel learn-to-translate approach was more effective than a strong baseline (p<0.05): translating all text into English, then training a classifier based only on English (original or translated) text.
International Conference on Big Data | 2014
Elizabeth Boschee; Marjorie Freedman; Saurabh Khanwalkar; Anoop Kumar; Amit Srivastava; Ralph M. Weischedel
We describe a pilot experiment building a capability to automatically read documents, develop a knowledge base, support analytics, and visualize the information found. The capability allows someone researching a topic of interest to focus on analysis and synthesis rather than on reading. We show how information from multiple modalities (speech, text, structured databases) and multiple approaches (ontology-driven and open information extraction) can be fused to create a resource about both previously known and novel entities. We describe an extensible framework for language understanding tools that allows for scalability, plug-and-play of alternative components, and incorporation of additional input streams, including video, images, and foreign language text.
Archive | 2008
Alex Baron; Marjorie Freedman; Ralph M. Weischedel; Elizabeth Boschee
Archive | 2011
Elizabeth Boschee; Michael Levit; Marjorie Freedman
Empirical Methods in Natural Language Processing | 2011
Marjorie Freedman; Lance A. Ramshaw; Elizabeth Boschee; Ryan Gabbard; Gary Kratkiewicz; Nicolas Ward; Ralph M. Weischedel
Conference of the International Speech Communication Association | 2007
Michael Levit; Elizabeth Boschee; Marjorie Freedman
Biologically Inspired Cognitive Architectures | 2008
Elizabeth Boschee; Vasin Punyakanok; Ralph M. Weischedel