Elizabeth Boschee
BBN Technologies
Publication
Featured research published by Elizabeth Boschee.
International Conference on Human Language Technology Research | 2001
Lance A. Ramshaw; Elizabeth Boschee; Sergey Bratus; Scott Miller; Rebecca Stone; Ralph M. Weischedel; Alex Zamanian
Unlike earlier information extraction research programs, the new ACE (Automatic Content Extraction) program calls for entity extraction by identifying and linking all of the mentions of an entity in the source text, including names, descriptions, and pronouns. Coreference is therefore a key component. BBN has developed statistical coreference models for this task, including one for pronoun coreference that we describe here in some detail. In addition, ACE calls for extraction not just from clean text, but also from noisy speech and OCR input. Since speech recognizer output includes neither case nor punctuation, we have extended our statistical parser to perform sentence breaking integrated with parsing in a probabilistic model.
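The sentence-breaking idea above can be illustrated with a toy sketch (our own, not BBN's system): score candidate boundaries in a caseless, unpunctuated token stream by combining a boundary prior with simple lexical evidence. The probability tables and the single-boundary search are hypothetical; a real system would estimate such statistics from treebank data and integrate them with the parsing model.

```python
# Toy illustration: score candidate sentence boundaries in a
# caseless, unpunctuated token stream. All probabilities here are
# invented for demonstration.
import math

# Hypothetical unigram statistics: P(token starts a sentence) and
# P(token ends a sentence).
START_PROB = {"the": 0.30, "president": 0.05, "he": 0.25, "said": 0.01}
END_PROB = {"today": 0.40, "said": 0.02, "president": 0.10, "the": 0.001}

def boundary_score(tokens, i, prior=0.1):
    """Log-probability that a sentence boundary falls after tokens[i]."""
    p_end = END_PROB.get(tokens[i], 0.05)
    p_start = START_PROB.get(tokens[i + 1], 0.05)
    return math.log(prior) + math.log(p_end) + math.log(p_start)

def best_boundary(tokens):
    """Return the single most likely boundary position (toy search)."""
    positions = range(len(tokens) - 1)
    return max(positions, key=lambda i: boundary_score(tokens, i))

tokens = "the president spoke today he said nothing new".split()
i = best_boundary(tokens)
print(tokens[: i + 1], tokens[i + 1 :])
```

A real model would search over all boundary configurations jointly with parse structure rather than picking one boundary greedily.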
Archive | 2013
Elizabeth Boschee; Premkumar Natarajan; Ralph M. Weischedel
Automated analysis of news reports is a significant empowering technology for predictive models of political instability. To date, the standard approach to this analytic task has been embodied in systems such as KEDS/TABARI [1], which use manually-generated rules and shallow parsing techniques to identify events and their participants in text. In this chapter we explore an alternative to event extraction based on BBN SERIF™ and BBN OnTopic™, two state-of-the-art statistical natural language processing engines. We empirically compare this new approach to existing event extraction techniques on five dimensions: (1) Accuracy: when an event is reported by the system, how often is it correct? (2) Coverage: how many events are correctly reported by the system? (3) Filtering of historical events: how well are historical events (e.g. 9/11) correctly filtered out of the current event data stream? (4) Topic-based event filtering: how well do systems filter out red herrings based on document topic, such as sports documents mentioning “clashes” between two countries on the playing field? (5) Domain shift: how well do event extraction models perform on data originating from diverse sources? In all dimensions we show significant improvement to the state-of-the-art by applying statistical natural language processing techniques. It is our hope that these results will lead to greater acceptance of automated coding by creators and consumers of social science models that depend on event data and provide a new way to improve the accuracy of those predictive models.
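The first two evaluation dimensions correspond to standard precision ("accuracy": of the events the system reports, how many are correct) and recall ("coverage": of the true events, how many the system reports). A minimal sketch of computing them, with invented example events:

```python
# Illustrative sketch (not from the chapter): precision and recall
# over sets of extracted (event_type, location) tuples.

def precision_recall(system_events, gold_events):
    """Compute precision and recall over sets of extracted events."""
    system, gold = set(system_events), set(gold_events)
    true_positives = len(system & gold)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical system output vs. gold annotations for one document.
system = {("protest", "cairo"), ("clash", "gaza"), ("meeting", "paris")}
gold = {("protest", "cairo"), ("clash", "gaza"), ("attack", "mosul")}
p, r = precision_recall(system, gold)
print(f"accuracy (precision) = {p:.2f}, coverage (recall) = {r:.2f}")
# → accuracy (precision) = 0.67, coverage (recall) = 0.67
```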
Empirical Methods in Natural Language Processing | 2014
Ferhan Türe; Elizabeth Boschee
When documents and queries are presented in different languages, the common approach is to translate the query into the document language. While there are a variety of query translation approaches, recent research suggests that combining multiple methods into a single "structured query" is the most effective. In this paper, we introduce a novel approach for producing a unique combination recipe for each query, motivated by the observation that the optimal combination weights differ substantially across queries and other task specifics. Our query-specific combination method generates statistically significant improvements over other combination strategies presented in the literature, such as uniform and task-specific weighting. An in-depth empirical analysis presents insights about the effect of data size, domain differences, labeling and tuning on the end performance of our approach.
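A hypothetical sketch of the core idea (method names, translations, and weights are all invented): each translation method proposes weighted alternatives for a query term, a structured query merges them, and instead of one global mixture, the mixture weights are chosen per query.

```python
# Illustrative sketch, not the paper's implementation: merge
# alternative translations from several methods into one weighted
# alternative set (a "structured query" term).

def combine_translations(per_method_alts, method_weights):
    """Weighted merge of per-method translation distributions."""
    combined = {}
    for method, alts in per_method_alts.items():
        w = method_weights[method]
        for translation, prob in alts.items():
            combined[translation] = combined.get(translation, 0.0) + w * prob
    total = sum(combined.values())
    return {t: p / total for t, p in combined.items()}

# Alternatives for one hypothetical query term from three methods.
alts = {
    "dictionary": {"banco": 0.7, "orilla": 0.3},
    "smt_10best": {"banco": 0.9, "ribera": 0.1},
    "context": {"banco": 0.5, "orilla": 0.5},
}

# Query-specific weights: in the paper's setting a separate model
# would predict these per query; here they are fixed for illustration.
weights = {"dictionary": 0.2, "smt_10best": 0.5, "context": 0.3}
print(combine_translations(alts, weights))
```

Uniform weighting is the special case where every method gets the same weight; the paper's contribution is choosing the weights per query rather than once per task.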
Empirical Methods in Natural Language Processing | 2016
Ferhan Türe; Elizabeth Boschee
In multilingual question answering, either the question needs to be translated into the document language, or vice versa. In addition to direction, there are multiple methods to perform the translation, four of which we explore in this paper: word-based, 10-best, context-based, and grammar-based. We build a feature for each combination of translation direction and method, and train a model that learns optimal feature weights. On a large forum dataset consisting of posts in English, Arabic, and Chinese, our novel learn-to-translate approach was more effective than a strong baseline (p<0.05): translating all text into English, then training a classifier based only on English (original or translated) text.
International Conference on Big Data | 2014
Elizabeth Boschee; Marjorie Freedman; Saurabh Khanwalkar; Anoop Kumar; Amit Srivastava; Ralph M. Weischedel
We describe a pilot experiment building a capability to automatically read documents, develop a knowledge base, support analytics, and visualize the information found. The capability allows someone researching a topic of interest to focus on analysis and synthesis rather than on reading. We show how information from multiple modalities (speech, text, structured databases) and multiple approaches (ontology-driven and open information extraction) can be fused to create a resource about both previously known and novel entities. We describe an extensible framework for language understanding tools that allows for scalability, plug-and-play of alternative components, and incorporation of additional input streams, including video, images, and foreign language text.
Archive | 2008
Alex Baron; Marjorie Freedman; Ralph M. Weischedel; Elizabeth Boschee
Archive | 2011
Elizabeth Boschee; Michael Levit; Marjorie Freedman
Empirical Methods in Natural Language Processing | 2011
Marjorie Freedman; Lance A. Ramshaw; Elizabeth Boschee; Ryan Gabbard; Gary Kratkiewicz; Nicolas Ward; Ralph M. Weischedel
Conference of the International Speech Communication Association | 2007
Michael Levit; Elizabeth Boschee; Marjorie Freedman
Biologically Inspired Cognitive Architectures | 2008
Elizabeth Boschee; Vasin Punyakanok; Ralph M. Weischedel