Featured Researches

Computation And Language

Building Natural-Language Generation Systems

This is a very short paper that briefly discusses some of the tasks that NLG systems perform. It is of no research interest, but I have occasionally found it useful as a way of introducing NLG to potential project collaborators who know nothing about the field.

Read more
Computation And Language

Building Probabilistic Models for Natural Language

In this thesis, we investigate three problems involving the probabilistic modeling of language: smoothing n-gram models, statistical grammar induction, and bilingual sentence alignment. These three problems employ models at three different levels of language; they involve word-based, constituent-based, and sentence-based models, respectively. We describe techniques for improving the modeling of language at each of these levels, and surpass the performance of existing algorithms for each problem. We approach the three problems using three different frameworks. We relate each of these frameworks to the Bayesian paradigm, and show why each framework used was appropriate for the given problem. Finally, we show how our research addresses two central issues in probabilistic modeling: the sparse data problem and the problem of inducing hidden structure.

Read more
Computation And Language

Building a Generation Knowledge Source using Internet-Accessible Newswire

In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which uses a client-server architecture to extract noun-phrase descriptions of entities such as people, places, and organizations. The system serves two purposes: as an information extraction tool, it allows users to search for textual descriptions of entities; as a utility to generate functional descriptions (FD), it is used in a functional-unification based generation system. We present an evaluation of the approach and its applications to natural language generation and summarization.

Read more
Computation And Language

CLEARS - An Education and Research Tool for Computational Semantics

The CLEARS (Computational Linguistics Education and Research for Semantics) tool provides a graphical interface allowing interactive construction of semantic representations in a variety of different formalisms, and using several construction methods. CLEARS was developed as part of the FraCaS project which was designed to encourage convergence between different semantic formalisms, such as Montague-Grammar, DRT, and Situation Semantics. The CLEARS system is freely available on the WWW from this http URL

Read more
Computation And Language

CLiFF Notes: Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

Short abstracts by computational linguistics researchers at the University of Pennsylvania describing ongoing individual and joint projects.

Read more
Computation And Language

Can Subcategorisation Probabilities Help a Statistical Parser?

Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy.

Read more
Computation And Language

Centering in Dynamic Semantics

Centering theory posits a discourse center, a distinguished discourse entity that is the topic of a discourse. A simplified version of this theory is developed in a Dynamic Semantics framework. In the resulting system, the mechanism of center shift allows a simple, elegant analysis of a variety of phenomena involving sloppy identity in ellipsis and ``paycheck pronouns''.

Read more
Computation And Language

Centering in Italian

This paper explores the correlation between centering and different forms of pronominal reference in Italian, in particular zeros and overt pronouns in subject position. Such correlations, that I had proposed in earlier work (COLING 90), are verified through the analysis of a corpus of naturally occurring texts. In the process, I extend my previous analysis in several ways, for example by taking possessives and subordinates into account. I also provide a more detailed analysis of the "continue" transition: more specifically, I show that pronouns are used in a markedly different way in a "continue" preceded by another "continue" or by a "shift", and in a "continue" preceded by a "retain".

Read more
Computation And Language

Centering in Japanese Discourse

In this paper we propose a computational treatment of the resolution of zero pronouns in Japanese discourse, using an adaptation of the centering algorithm. We are able to factor language-specific dependencies into one parameter of the centering algorithm. Previous analyses have stipulated that a zero pronoun and its cospecifier must share a grammatical function property such as {\sc Subject} or {\sc NonSubject}. We show that this property-sharing stipulation is unneeded. In addition we propose the notion of {\sc topic ambiguity} within the centering framework, which predicts some ambiguities that occur in Japanese discourse. This analysis has implications for the design of language-independent discourse modules for Natural Language systems. The centering algorithm has been implemented in an HPSG Natural Language system with both English and Japanese grammars.

Read more
Computation And Language

Centering in-the-large: Computing referential discourse segments

We specify an algorithm that builds up a hierarchy of referential discourse segments from local centering data. The spatial extension and nesting of these discourse segments constrain the reachability of potential antecedents of an anaphoric expression beyond the local level of adjacent center pairs. Thus, the centering model is scaled up to the level of the global referential structure of discourse. An empirical evaluation of the algorithm is supplied.

Read more

Ready to get started?

Join us today