
Publication


Featured research published by Warren R. Greiff.


Cognitive Science | 2002

Statistical models for the induction and use of selectional preferences

Marc Light; Warren R. Greiff

Selectional preferences have a long history in both generative and computational linguistics. However, since the publication of Resnik's dissertation in 1993, a new approach has surfaced in the computational linguistics community. This new line of research combines knowledge represented in a pre-defined semantic class hierarchy with statistical tools including information theory, statistical modeling, and Bayesian inference. These tools are used to learn selectional preferences from examples in a corpus. Instead of simple sets of semantic classes, selectional preferences are viewed as probability distributions over various entities. We survey research that extends Resnik's initial work, discuss the strengths and weaknesses of each approach, and show how they together form a cohesive line of research.
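As a concrete illustration of the probabilistic view described above, the following is a minimal sketch of Resnik-style selectional preference strength and selectional association, computed from verb/semantic-class co-occurrence counts. The counts below are hypothetical toy data, not figures from the survey.

```python
import math
from collections import Counter

# Hypothetical counts of verb arguments, mapped to semantic classes.
pair_counts = Counter({("drink", "beverage"): 50, ("drink", "liquid"): 30,
                       ("drink", "artifact"): 5,
                       ("see", "beverage"): 5, ("see", "artifact"): 60,
                       ("see", "person"): 40})

class_totals, verb_totals = Counter(), Counter()
for (verb, cls), n in pair_counts.items():
    class_totals[cls] += n
    verb_totals[verb] += n
total = sum(pair_counts.values())

def selectional_preference_strength(verb):
    """KL divergence between P(class | verb) and the prior P(class)."""
    s = 0.0
    for cls, cn in class_totals.items():
        p_c = cn / total
        p_c_v = pair_counts[(verb, cls)] / verb_totals[verb]
        if p_c_v > 0:
            s += p_c_v * math.log(p_c_v / p_c)
    return s

def selectional_association(verb, cls):
    """Resnik's association: the class's share of the preference strength."""
    p_c = class_totals[cls] / total
    p_c_v = pair_counts[(verb, cls)] / verb_totals[verb]
    if p_c_v == 0:
        return 0.0
    return p_c_v * math.log(p_c_v / p_c) / selectional_preference_strength(verb)

print(selectional_association("drink", "beverage"))  # high: preferred class
print(selectional_association("see", "beverage"))    # low: weak preference
```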


North American Chapter of the Association for Computational Linguistics | 2004

Direct maximization of average precision by hill-climbing, with a comparison to a maximum entropy approach

William T. Morgan; Warren R. Greiff; John C. Henderson

We describe an algorithm for choosing term weights to maximize average precision. The algorithm performs successive exhaustive searches through single directions in weight space. It makes use of a novel technique for considering all possible values of average precision that arise in searching for a maximum in a given direction. We apply the algorithm and compare it to a maximum entropy approach.
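A minimal sketch of the coordinate-wise hill-climbing idea follows. It scans a fixed grid of candidate weights in each direction rather than enumerating the exact average-precision breakpoints the paper describes, and the feature matrix and relevance labels are randomly generated stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_terms = 200, 5
X = rng.random((n_docs, n_terms))          # per-document term features
relevant = rng.random(n_docs) < 0.1        # hypothetical relevance labels

def average_precision(scores, rel):
    order = np.argsort(-scores)
    rel_sorted = rel[order]
    hits = np.cumsum(rel_sorted)
    ranks = np.arange(1, len(scores) + 1)
    return np.sum((hits / ranks) * rel_sorted) / max(rel.sum(), 1)

def coordinate_ascent(X, rel, grid=np.linspace(-2, 2, 81), sweeps=5):
    w = np.ones(X.shape[1])
    for _ in range(sweeps):
        for j in range(len(w)):            # one direction at a time
            best_ap, best_v = -1.0, w[j]
            for v in grid:                 # exhaustive search in that direction
                w[j] = v
                ap = average_precision(X @ w, rel)
                if ap > best_ap:
                    best_ap, best_v = ap, v
            w[j] = best_v
    return w, average_precision(X @ w, rel)

w, ap = coordinate_ascent(X, relevant)
print("learned weights:", np.round(w, 2), "AP:", round(ap, 3))
```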


User Modeling and User-Adapted Interaction | 2004

Personalcasting: Tailored Broadcast News

Mark T. Maybury; Warren R. Greiff; Stanley Boykin; Jay M. Ponte; Chad McHenry; Lisa Ferro

Broadcast news sources and newspapers provide society with the vast majority of real-time information. Unfortunately, cost efficiencies and real-time pressures demand that producers, editors, and writers select and organize content for stereotypical audiences. In this article we illustrate how content understanding, user modeling, and tailored presentation generation promise personalcasts on demand. Specifically, we report on the design and implementation of a personalized version of a broadcast news understanding system, MITRE's Broadcast News Navigator (BNN), that tracks and infers user content interests and media preferences. We report on the incorporation of Local Context Analysis both to expand the user's original query to the most related terms in the corpus and to allow the user to provide interactive feedback to enhance the relevance of selected news stories. We describe an empirical study of the search for stories on ten topics from a video corpus. By personalizing both the selection of stories and the form in which they are delivered, we provide users with tailored broadcast news. This individual news personalization provides more fine-grained content tailoring than current personalized television program-level recommenders and does not rely on externally provided program metadata.
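The sketch below illustrates the general shape of the query-expansion step with a simplified pseudo-relevance-feedback loop: expand a query with terms that co-occur with the query terms in the top-ranked passages. This is only a rough stand-in for Local Context Analysis, and the corpus and scoring are hypothetical, not BNN's actual retrieval components.

```python
from collections import Counter

corpus = [
    "election results announced in the capital amid heavy security",
    "voters went to the polls as election officials counted ballots",
    "storm damage closed roads across the region",
]

def expand(query_terms, corpus, top_k_docs=2, top_k_terms=3):
    # Rank passages by simple term overlap with the query.
    scored = sorted(corpus,
                    key=lambda d: sum(t in d.split() for t in query_terms),
                    reverse=True)
    # Count candidate expansion terms co-occurring in the top passages.
    counts = Counter()
    for doc in scored[:top_k_docs]:
        for tok in doc.split():
            if tok not in query_terms and len(tok) > 3:
                counts[tok] += 1
    return list(query_terms) + [t for t, _ in counts.most_common(top_k_terms)]

print(expand({"election"}, corpus))
```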


Natural Language Engineering | 2006

Reading comprehension tests for computer-based understanding evaluation

Ben Wellner; Lisa Ferro; Warren R. Greiff; Lynette Hirschman

Reading comprehension (RC) tests involve reading a short passage of text and answering a series of questions pertaining to that text. We present a methodology for evaluation of the application of modern natural language technologies to the task of responding to RC tests. Our work is based on ABCs (Abduction Based Comprehension system), an automated system for taking tests requiring short answer phrases as responses. A central goal of ABCs is to serve as a testbed for understanding the role that various linguistic components play in responding to reading comprehension questions. The heart of ABCs is an abductive inference engine that provides three key capabilities: (1) first-order logical representation of relations between entities and events in the text, and rules to perform inference over such relations; (2) graceful degradation due to the inclusion of abduction in the reasoning engine, which avoids the brittleness that can be problematic in knowledge representation and reasoning systems; and (3) system transparency, such that the types of abductive inferences made over an entire corpus provide cues as to where the system is performing poorly and indications as to where existing knowledge is inaccurate or new knowledge is required. ABCs, with certain sub-components not yet automated, finds the correct answer phrase nearly 35 percent of the time using a strict evaluation metric and 45 percent of the time using a looser, inexact metric on held-out evaluation data. Performance varied for the different question types, ranging from over 50 percent on who questions to over 10 percent on what questions. We present analysis of the roles of individual components and analysis of the impact of various characteristics of the abductive proof procedure on overall system performance.
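To make the graceful-degradation point concrete, here is a minimal sketch of cost-based abductive inference: a goal is proved from facts and rules where possible, and assumed at a fixed cost where not. The facts, rule, and cost are hypothetical; ABCs' actual engine and knowledge base are far richer.

```python
facts = {"holds(passport, maria)"}
# Each rule: (head, [body literals]); proving the head reduces to the body.
rules = [("traveled(maria)", ["holds(passport, maria)", "bought(ticket, maria)"])]
ASSUME_COST = 1.0  # cost of assuming a literal instead of proving it

def prove(goal, depth=0):
    """Return the cheapest proof cost: 0 if a known fact, the summed body
    cost if derivable by a rule, else the cost of assuming the goal."""
    if goal in facts:
        return 0.0
    best = ASSUME_COST                      # abduction: assume at a cost
    if depth < 10:                          # guard against rule cycles
        for head, body in rules:
            if head == goal:
                best = min(best, sum(prove(b, depth + 1) for b in body))
    return best

# 'bought(ticket, maria)' is not provable, so it is assumed (cost 1.0);
# the proof degrades gracefully instead of failing outright.
print(prove("traveled(maria)"))  # -> 1.0
```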


International Conference on Human Language Technology Research | 2001

Fine-grained hidden Markov modeling for broadcast-news story segmentation

Warren R. Greiff; Alex Morgan; Randall K. Fish; Marc Richards; Amlan Kundu

We present the design and development of a Hidden Markov Model for the division of news broadcasts into story segments. Model topology and the textual features used are discussed, together with the non-parametric estimation techniques employed to obtain estimates of both transition and observation probabilities. Visualization methods developed for the analysis of system performance are also presented.
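The following is a minimal Viterbi sketch of the segmentation setup: two states (STORY vs. BOUNDARY) decoded over per-sentence cue observations. The transition and emission probabilities below are hypothetical toy values; the paper estimates them non-parametrically from broadcast-news data.

```python
import math

states = ["STORY", "BOUNDARY"]
# Transition probabilities: boundaries are rare and short.
trans = {"STORY": {"STORY": 0.9, "BOUNDARY": 0.1},
         "BOUNDARY": {"STORY": 0.8, "BOUNDARY": 0.2}}
# Observation likelihoods for a binary cue (e.g., a sign-off phrase seen).
emit = {"STORY": {True: 0.05, False: 0.95},
        "BOUNDARY": {True: 0.6, False: 0.4}}

def viterbi(obs):
    V = [{s: math.log(0.5) + math.log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        row, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p] + math.log(trans[p][s]))
            row[s] = V[-1][prev] + math.log(trans[prev][s]) + math.log(emit[s][o])
            ptr[s] = prev
        V.append(row)
        back.append(ptr)
    # Trace back the best state sequence.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

cues = [False, False, True, False, False]   # sign-off cue at sentence 3
print(viterbi(cues))
```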


North American Chapter of the Association for Computational Linguistics | 2004

Audio hot spotting and retrieval using multiple features

Qian Hu; Fred J. Goodman; Stanley Boykin; Randy Fish; Warren R. Greiff

This paper reports our ongoing efforts to exploit multiple features derived from an audio stream, using source material such as broadcast news, teleconferences, and meetings. These features are derived from algorithms including automatic speech recognition, automatic speech indexing, speaker identification, and prosodic and audio feature extraction. We describe our research prototype -- the Audio Hot Spotting System -- which allows users to query and retrieve data from multimedia sources using these multiple features. The system aims to accurately find segments of user interest, i.e., audio hot spots, within seconds of the actual event. In addition to spoken keywords, the system also retrieves audio hot spots by speaker identity, words spoken by a specific speaker, a change of speech rate, and other non-lexical features, including applause and laughter. Finally, we discuss our approach to semantic, morphological, and phonetic query expansion to improve audio retrieval performance and to access cross-lingual data.
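A minimal sketch of one such capability follows: locating spoken keywords, optionally restricted to a given speaker, in a time-aligned transcript. The transcript tuples are hypothetical stand-ins for the output of the recognition and speaker-identification components described above.

```python
transcript = [
    # (start_sec, end_sec, speaker, word)
    (12.1, 12.4, "spk1", "budget"),
    (12.4, 12.9, "spk1", "vote"),
    (44.0, 44.5, "spk2", "budget"),
]

def hot_spots(keyword, speaker=None, pad=2.0):
    """Return padded (start, end) windows where the keyword was spoken."""
    return [(max(0.0, s - pad), e + pad)
            for s, e, spk, w in transcript
            if w == keyword and (speaker is None or spk == speaker)]

print(hot_spots("budget"))                  # all speakers
print(hot_spots("budget", speaker="spk2"))  # word spoken by a specific speaker
```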


Conference on Information and Knowledge Management | 2002

The role of variance in term weighting for probabilistic information retrieval

Warren R. Greiff; William T. Morgan; Jay M. Ponte

In probabilistic approaches to information retrieval, the occurrence of a query term in a document contributes to the probability that the document will be judged relevant. It is typically assumed that the weight assigned to a query term should be based on the expected value of that contribution. In this paper we show that the degree to which observable document features such as term frequencies are expected to vary is also important. By means of stochastic simulation, we show that increased variance results in degraded retrieval performance. We further show that by decreasing term weights in the presence of variance, this degradation can be reduced. Hence, probabilistic models of information retrieval must take into account not only the expected value of a query term's contribution but also the variance of document features.
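The following is a minimal stochastic simulation in the spirit of the experiment above: per-document score contributions are drawn with increasing variance around a fixed mean, and average precision is measured. The distributions and collection sizes are hypothetical, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def average_precision(scores, rel):
    order = np.argsort(-scores)
    rel_sorted = rel[order]
    hits = np.cumsum(rel_sorted)
    return np.sum(hits / np.arange(1, len(rel) + 1) * rel_sorted) / rel.sum()

n_docs = 1000
rel = np.zeros(n_docs, dtype=bool)
rel[:50] = True                       # relevant docs get a higher mean score
mean = np.where(rel, 1.0, 0.0)

for sigma in (0.5, 1.0, 2.0):
    aps = [average_precision(mean + rng.normal(0, sigma, n_docs), rel)
           for _ in range(200)]
    print(f"sigma={sigma}: mean AP = {np.mean(aps):.3f}")
# AP falls as the variance of the score contribution grows, even though
# the expected contribution is unchanged.
```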


Archive | 2003

Contributions of Language Modeling to the Theory and Practice of Information Retrieval

Warren R. Greiff; William T. Morgan

This paper presents an analysis of what language modeling (LM) is in the context of information retrieval (IR). We argue that there are two principal contributions of the language modeling approach. First, that it brings the thinking, theory, and practical knowledge of research in related fields to bear on the retrieval problem. Second, that it makes patent that parameter estimation is important for probabilistic IR approaches. In particular, it has brought to the attention of the IR community the idea that explicit consideration needs to be given to variance reduction in the design of statistical estimators. We describe a simulation environment which has been developed for the study of theoretical issues in information retrieval. Results obtained from the simulation are presented, which show quantitatively how variance reduction techniques applied to parameter estimation can improve performance for the ad-hoc retrieval task.
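As one concrete instance of the variance-reduction idea in estimation (not a reproduction of the paper's simulation environment), the sketch below scores documents by query likelihood with Dirichlet smoothing, a shrinkage estimator that pulls sparse per-document term estimates toward the collection model, trading a little bias for reduced variance. The toy corpus is hypothetical.

```python
import math
from collections import Counter

docs = ["the quick brown fox", "the lazy dog sleeps", "quick dog runs fast"]
tokenized = [d.split() for d in docs]
collection = Counter(w for d in tokenized for w in d)
coll_len = sum(collection.values())

def query_likelihood(query, doc, mu=10.0):
    """log P(query | doc) with Dirichlet-smoothed term estimates."""
    tf = Counter(doc)
    score = 0.0
    for q in query.split():
        p_coll = collection[q] / coll_len            # background estimate
        p = (tf[q] + mu * p_coll) / (len(doc) + mu)  # shrunk toward background
        score += math.log(p) if p > 0 else math.log(1e-12)
    return score

for d in tokenized:
    print(" ".join(d), "->", round(query_likelihood("quick dog", d), 3))
```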


Archive | 2002

The use of Exploratory Data Analysis in Information Retrieval Research

Warren R. Greiff

We report on a line of work in which techniques of Exploratory Data Analysis (EDA) have been used as a vehicle for better understanding of the issues confronting the researcher in information retrieval (IR). EDA is used for visualizing and studying data for the purpose of uncovering statistical regularities that might not be apparent otherwise. The analysis is carried out in terms of the formal notion of Weight of Evidence (WOE). As a result of this analysis, a novel theory in support of the use of inverse document frequency (idf) for document ranking is presented, and experimental evidence is given in favor of a modification of the classical idf formula motivated by the analysis. This approach is then extended to other sources of evidence commonly used for ranking in information retrieval systems.
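To make the central quantities concrete, the sketch below computes classical idf alongside the weight of evidence that a term's occurrence lends to relevance, WOE(relevance : t) = log [O(relevance | t) / O(relevance)], where O denotes odds. The document and relevance counts are hypothetical.

```python
import math

N = 1000          # documents in the collection
n_t = 100         # documents containing term t
R = 20            # relevant documents
r_t = 12          # relevant documents containing t

idf = math.log(N / n_t)

def odds(p):
    return p / (1 - p)

p_rel = R / N                 # P(relevance)
p_rel_t = r_t / n_t           # P(relevance | t occurs)
woe = math.log(odds(p_rel_t) / odds(p_rel))   # evidence t gives for relevance

print(f"idf = {idf:.3f}, WOE(relevance : t) = {woe:.3f}")
```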


Information Processing and Management | 2010

An application of document filtering in an operational system

Paul E. Lehner; Charles A. Worrell; Chrissy Vu; Janet Shipley Mittel; Stephen Snyder; Eric Schulte; Warren R. Greiff

This paper describes an applied document filtering system embedded in an operational watch center that monitors disease outbreaks worldwide. At the time of this writing, the system effectively supported monitoring of 23 geographic regions by filtering documents from several thousand daily news sources in 11 different languages. This paper describes the filtering algorithm and statistical procedures for estimating Precision and Recall in an operational environment, summarizes operational performance data, and suggests lessons learned for other applications of document filtering technology. Overall, these results are interpreted as supporting the general utility of document filtering and information retrieval technology, and we offer recommendations for future applications of this technology.
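The sketch below illustrates one plausible shape of such operational estimation: precision from analyst review of retrieved documents, and recall by extrapolating from a judged random sample of the much larger unretrieved stream. All counts are hypothetical, not figures from the paper.

```python
retrieved_relevant, retrieved_total = 180, 240   # judged retrieved documents
unretrieved_total = 50_000                       # documents the filter dropped
sample_size, sample_hits = 500, 2                # judged sample of dropped docs

precision = retrieved_relevant / retrieved_total
est_missed = sample_hits / sample_size * unretrieved_total  # scaled-up misses
recall = retrieved_relevant / (retrieved_relevant + est_missed)

print(f"precision = {precision:.3f}, estimated recall = {recall:.3f}")
```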
