Marc Light
Mitre Corporation
Publication
Featured research published by Marc Light.
Meeting of the Association for Computational Linguistics | 1999
Lynette Hirschman; Marc Light; Eric Breck; John D. Burger
This paper describes initial work on Deep Read, an automated reading comprehension system that accepts arbitrary text input (a story) and answers questions about it. We have acquired a corpus of 60 development and 60 test stories of 3rd to 6th grade material; each story is followed by short-answer questions (an answer key was also provided). We used these to construct and evaluate a baseline system that uses pattern matching (bag-of-words) techniques augmented with additional automated linguistic processing (stemming, name identification, semantic class identification, and pronoun resolution). This simple system retrieves the sentence containing the answer 30-40% of the time.
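The bag-of-words matching mentioned in this abstract can be illustrated with a minimal sketch. It is not the Deep Read implementation: the stopword list, the tokenizer, the sentence splitter, and the function names `bag_of_words` and `best_sentence` are all assumptions made for illustration, and the stemming, name identification, semantic class identification, and pronoun resolution steps are omitted.

```python
import re

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {
    "a", "an", "the", "of", "to", "in", "on", "and", "is", "was",
    "did", "who", "what", "when", "where", "why", "how",
}


def bag_of_words(text):
    """Lowercase the text, extract word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tok for tok in tokens if tok not in STOPWORDS}


def best_sentence(story, question):
    """Return the story sentence that shares the most words with the question."""
    # Naive sentence splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", story) if s.strip()]
    question_words = bag_of_words(question)
    # On ties, max() keeps the earliest sentence.
    return max(sentences, key=lambda s: len(bag_of_words(s) & question_words))


story = (
    "Tom lives in Boston. He found a lost dog in the park. "
    "The dog belonged to his neighbor."
)
print(best_sentence(story, "Who found the lost dog?"))
```

In this toy story the second sentence shares three content words with the question (found, lost, dog), so it is the one returned.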
Natural Language Engineering | 2001
Marc Light; Gideon S. Mann; Ellen Riloff; Eric Breck
In this paper, we take a detailed look at the performance of components of an idealized question answering system on two different tasks: the TREC Question Answering task and a set of reading comprehension exams. We carry out three types of analysis: inherent properties of the data, feature analysis, and performance bounds. Based on these analyses we explain some of the performance results of the current generation of Q/A systems and make predictions on future work. In particular, we present four findings: (1) Q/A system performance is correlated with answer repetition; (2) relative overlap scores are more effective than absolute overlap scores; (3) equivalence classes on scoring functions can be used to quantify performance bounds; and (4) perfect answer typing still leaves a great deal of ambiguity for a Q/A system because sentences often contain several items of the same type.
Cognitive Science | 2002
Marc Light; Warren R. Greiff
Selectional preferences have a long history in both generative and computational linguistics. However, since the publication of Resnik's dissertation in 1993, a new approach has surfaced in the computational linguistics community. This new line of research combines knowledge represented in a pre-defined semantic class hierarchy with statistical tools including information theory, statistical modeling, and Bayesian inference. These tools are used to learn selectional preferences from examples in a corpus. Instead of simple sets of semantic classes, selectional preferences are viewed as probability distributions over various entities. We survey research that extends Resnik's initial work, discuss the strengths and weaknesses of each approach, and show how they together form a cohesive line of research.
Communications of the ACM | 2002
Marc Light; Mark T. Maybury
Ask questions, get personalized answers.
arXiv: Computation and Language | 2001
Eric Breck; Marc Light; Gideon S. Mann; Ellen Riloff; Brianne Brown; Pranav Anand; Mats Rooth; Michael Thelen
In this paper we analyze two question answering tasks: the TREC-8 question answering task and a set of reading comprehension exams. First, we show that Q/A systems perform better when there are multiple answer opportunities per question. Next, we analyze common approaches to two subproblems: term overlap for answer sentence identification, and answer typing for short answer extraction. We present general tools for analyzing the strengths and limitations of techniques for these subproblems. Our results quantify the limitations of both term overlap and answer typing to distinguish between competing answer candidates.
ACM Conference on Hypertext | 2007
Shannon Bradshaw; Marc Light
We present a study of the degree to which annotations overlap when several researchers read the same set of scientific articles. Our objective is to determine whether there is sufficient evidence to suggest that information about which passages initial readers tend to annotate might be used to recommend important passages to later readers of the same material. We found that readers exhibit a high degree of overlap in the passages they annotate, that these passages account for a small but significant fraction of the total document, and that such passages are distributed throughout a document rather than concentrated in the same few sections in each paper (e.g., the results section). These findings indicate that work on developing a passage recommendation model based on annotation is warranted.
Software Engineering, Testing, and Quality Assurance for Natural Language Processing | 2008
Terry Heinze; Marc Light
Computing precision and recall metrics for named entity tagging and resolution involves classifying text spans as true positives, false positives, or false negatives. There are many factors that make this classification complicated for real world systems. We describe an evaluation system that attempts to control this complexity through a set of rules and a forward chaining inference engine.
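The span classification described in this abstract can be sketched in its simplest form. The sketch assumes exact matching of `(start, end, label)` tuples, which is a simplification chosen for illustration. It does not reproduce the rule set or the forward chaining inference engine the abstract refers to, and the function name `score_spans` is hypothetical.

```python
def score_spans(gold, predicted):
    """Classify spans by exact match and compute precision and recall.

    Each span is a (start, end, label) tuple. A predicted span counts as a
    true positive only if an identical tuple appears in the gold set.
    """
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)   # predicted and correct
    fp = len(predicted - gold)   # predicted but not in the gold set
    fn = len(gold - predicted)   # in the gold set but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return tp, fp, fn, precision, recall


gold = [(0, 4, "PER"), (10, 16, "ORG"), (20, 26, "LOC")]
predicted = [(0, 4, "PER"), (10, 16, "LOC"), (30, 35, "ORG")]
print(score_spans(gold, predicted))
```

Here the span at offsets 10 to 16 has the right boundaries but the wrong label, so under exact matching it counts as both a false positive and a false negative. Cases like this are one example of why the classification becomes complicated in real-world systems.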
Language Resources and Evaluation | 2000
Eric Breck; John D. Burger; Lisa Ferro; Lynette Hirschman; David House; Marc Light; Inderjeet Mani
Text Retrieval Conference | 1999
Eric Breck; John D. Burger; Lisa Ferro; David House; Marc Light; Inderjeet Mani
Archive | 1999
Eric Breck; John D. Burger; Marc Light