Publications


Featured research published by Ralph Grishman.


Meeting of the Association for Computational Linguistics | 2004

Discovering Relations among Named Entities from Large Corpora

Takaaki Hasegawa; Satoshi Sekine; Ralph Grishman

Discovering the significant relations embedded in documents would be very useful not only for information retrieval but also for question answering and summarization. Prior methods for relation discovery, however, needed large annotated corpora, which cost a great deal of time and effort. We propose an unsupervised method for relation discovery from large corpora. The key idea is clustering pairs of named entities according to the similarity of context words intervening between the named entities. Our experiments using one year of newspapers reveal not only that the relations among named entities could be detected with high recall and precision, but also that appropriate labels could be automatically provided for the relations.
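The core idea of the abstract, clustering entity pairs by the words that appear between them, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the greedy single-link clustering, the threshold, and all example data are assumptions for demonstration.

```python
from collections import defaultdict
from math import sqrt

def context_vector(contexts):
    """Bag-of-words vector over the words seen between a pair of entities."""
    vec = defaultdict(int)
    for ctx in contexts:
        for word in ctx.split():
            vec[word] += 1
    return dict(vec)

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_pairs(pair_contexts, threshold=0.5):
    """Greedy single-link clustering of entity pairs by context similarity."""
    vectors = {pair: context_vector(ctxs) for pair, ctxs in pair_contexts.items()}
    clusters = []
    for pair, vec in vectors.items():
        for cluster in clusters:
            if any(cosine(vec, vectors[other]) >= threshold for other in cluster):
                cluster.append(pair)
                break
        else:
            clusters.append([pair])
    return clusters
```

With toy input, pairs sharing acquisition-style contexts end up in one cluster while an unrelated pair forms its own; the most frequent shared context words would then serve as a candidate label for the relation.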


Meeting of the Association for Computational Linguistics | 2005

Extracting Relations with Integrated Information Using Kernel Methods

Shubin Zhao; Ralph Grishman

Entity relation detection is a form of information extraction that finds predefined relations between pairs of entities in text. This paper describes a relation detection approach that combines clues from different levels of syntactic processing using kernel methods. Information from three different levels of processing is considered: tokenization, sentence parsing and deep dependency analysis. Each source of information is represented by kernel functions. Then composite kernels are developed to integrate and extend individual kernels so that processing errors occurring at one level can be overcome by information from other levels. We present an evaluation of these methods on the 2004 ACE relation detection task, using Support Vector Machines, and show that each level of syntactic processing contributes useful information for this task. When evaluated on the official test data, our approach produced very competitive ACE value scores. We also compare the SVM with KNN on different kernels.
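The composite-kernel idea can be sketched as below. The token and dependency-path representations here are toy stand-ins, not the paper's actual kernels; the point is only that a nonnegative weighted sum of valid kernels is itself a valid kernel, so evidence from different processing levels can be combined before classification.

```python
def token_kernel(x, y):
    """Surface level: number of shared tokens (toy stand-in for a
    sequence kernel over the tokenized sentence)."""
    return float(len(set(x["tokens"]) & set(y["tokens"])))

def path_kernel(x, y):
    """Syntactic level: exact match on the dependency path between
    the two entity mentions (toy stand-in for a parse-tree kernel)."""
    return 1.0 if x["dep_path"] == y["dep_path"] else 0.0

def composite_kernel(x, y, alpha=0.5):
    """Weighted sum of level-specific kernels, so an error at one
    level can be compensated by evidence from the other."""
    return alpha * token_kernel(x, y) + (1 - alpha) * path_kernel(x, y)

def gram_matrix(examples, kernel):
    """Precomputed Gram matrix for a kernel classifier."""
    return [[kernel(a, b) for b in examples] for a in examples]
```

Such a precomputed Gram matrix could then be handed to a kernel classifier, e.g. scikit-learn's `SVC(kernel='precomputed')`, to train the relation detector.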


International Conference on Computational Linguistics | 1994

Comlex Syntax: building a computational lexicon

Ralph Grishman; Catherine Macleod; Adam Meyers

We describe the design of Comlex Syntax, a computational lexicon providing detailed syntactic information for approximately 38,000 English headwords. We consider the types of errors which arise in creating such a lexicon, and how such errors can be measured and controlled.


Meeting of the Association for Computational Linguistics | 1981

PARSING

Ralph Grishman

One reason for the wide variety of views on many subjects in computational linguistics (such as parsing) is the diversity of objectives which lead people to do research in this area. Some researchers are motivated primarily by potential applications, such as the development of natural language interfaces for computer systems. Others are primarily concerned with the psychological processes which underlie human language, and view the computer as a tool for modeling and thus improving our understanding of these processes. Since, as is often observed, man is our best example of a natural language processor, these two groups do have a strong commonality of research interest. Nonetheless, their divergence of objective must lead to differences in the way they regard the component processes of natural language understanding. (If when human processing is better understood it is recognized that the simulation of human processes is not the most effective way of constructing a natural language interface, there may even be a deliberate divergence in the processes themselves.) My work, and this position paper, reflect an applications orientation; those with different research objectives will come to quite different conclusions.


MUC6 '95: Proceedings of the 6th Conference on Message Understanding | 1995

The NYU system for MUC-6 or where's the syntax?

Ralph Grishman

Over the past five MUCs, New York University has clung faithfully to the idea that information extraction should begin with a phase of full syntactic analysis, followed by a semantic analysis of the syntactic structure. Because we have a good, broad-coverage English grammar and a moderately effective method for recovering from parse failures, this approach held us in fairly good stead.


Meeting of the Association for Computational Linguistics | 2003

An Improved Extraction Pattern Representation Model for Automatic IE Pattern Acquisition

Kiyoshi Sudo; Satoshi Sekine; Ralph Grishman

Several approaches have been described for the automatic unsupervised acquisition of patterns for information extraction. Each approach is based on a particular model for the patterns to be acquired, such as a predicate-argument structure or a dependency chain. The effect of these alternative models has not been previously studied. In this paper, we compare the prior models and introduce a new model, the Subtree model, based on arbitrary subtrees of dependency trees. We describe a discovery procedure for this model and demonstrate experimentally an improvement in recall using Subtree patterns.
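The Subtree model's pattern space, all connected subtrees of a dependency tree, can be sketched with a small enumerator. This is an illustration under assumed representations (a tree as a parent-to-children dict, subtrees as node sets), not the paper's discovery procedure, which also ranks the extracted subtrees.

```python
def subtrees(tree, root, max_size=3):
    """Enumerate connected subtrees (as frozensets of nodes) containing
    `root`, up to max_size nodes. `tree` maps node -> list of children.
    The frontier ordering guarantees each subtree is generated once."""
    results = set()

    def grow(current, frontier):
        results.add(frozenset(current))
        if len(current) == max_size:
            return
        for i, node in enumerate(frontier):
            # extend with `node`; later growth may use nodes after it
            # in the frontier, plus node's own children
            grow(current | {node},
                 frontier[i + 1:] + tree.get(node, []))

    grow({root}, list(tree.get(root, [])))
    return results
```

Chains (as in a dependency-chain model) and predicate-argument structures are both special cases of these subtrees, which is what lets the Subtree model subsume the earlier pattern models.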


Conference on Applied Natural Language Processing | 2000

Unsupervised Discovery of Scenario-Level Patterns for Information Extraction

Roman Yangarber; Ralph Grishman; Pasi Tapanainen

Information Extraction (IE) systems are commonly based on pattern matching. Adapting an IE system to a new scenario entails the construction of a new pattern base---a time-consuming and expensive process. We have implemented a system for finding patterns automatically from un-annotated text. Starting with a small initial set of seed patterns proposed by the user, the system applies an incremental discovery procedure to identify new patterns. We present experiments with evaluations which show that the resulting patterns exhibit high precision and recall.
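The incremental discovery loop described in the abstract can be sketched as follows. This is a toy model, not the authors' system: documents are reduced to sets of candidate patterns, and the scoring function (relevant-document ratio weighted by log frequency) is an assumed simplification of duality-based pattern scoring.

```python
from math import log

def bootstrap_patterns(documents, seeds, iterations=5, per_round=1):
    """Grow a pattern set from seeds: documents matching an accepted
    pattern count as relevant, and candidate patterns concentrated in
    relevant documents are promoted each round."""
    accepted = set(seeds)
    for _ in range(iterations):
        relevant = [d for d in documents if d & accepted]

        def score(p):
            rel = sum(1 for d in relevant if p in d)
            tot = sum(1 for d in documents if p in d)
            return (rel / tot) * log(rel + 1) if rel else 0.0

        candidates = {p for d in documents for p in d} - accepted
        ranked = sorted(candidates, key=score, reverse=True)
        new = [p for p in ranked[:per_round] if score(p) > 0]
        if not new:
            break
        accepted.update(new)
    return accepted
```

Starting from a single management-succession seed pattern, the loop pulls in patterns that co-occur with it in the same documents while ignoring patterns confined to irrelevant documents.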


International Conference on Computational Linguistics | 2002

Unsupervised learning of generalized names

Roman Yangarber; Winston Lin; Ralph Grishman

We present an algorithm, NOMEN, for learning generalized names in text. Examples of these are names of diseases and infectious agents, such as bacteria and viruses. These names exhibit certain properties that make their identification more complex than that of regular proper names. NOMEN uses a novel form of bootstrapping to grow sets of textual instances and of their contextual patterns. The algorithm makes use of competing evidence to boost the learning of several categories of names simultaneously. We present results of the algorithm on a large corpus. We also investigate the relative merits of several evaluation strategies.
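The competing-evidence idea can be sketched as one bootstrapping round over rival categories. This is an assumed simplification of NOMEN, not its actual scoring: here a candidate name is accepted only if one category's contexts support it strictly more than every competitor's.

```python
def nomen_step(categories, contexts):
    """One bootstrapping round over competing name categories.
    `categories` maps a label to its accepted names; `contexts` maps a
    candidate name to the set of context patterns it occurs with.
    Assumes at least two competing categories."""
    # contexts already seen with each category's accepted names
    cat_contexts = {
        label: {c for name in names for c in contexts.get(name, set())}
        for label, names in categories.items()
    }
    for cand, ctxs in contexts.items():
        if any(cand in names for names in categories.values()):
            continue  # already classified
        scores = {label: len(ctxs & cc) for label, cc in cat_contexts.items()}
        best = max(scores, key=scores.get)
        runner_up = max(s for l, s in scores.items() if l != best)
        # competing-evidence check: accept only a clear winner
        if scores[best] > 0 and scores[best] > runner_up:
            categories[best].add(cand)
    return categories
```

Because each category's evidence counts against the others, an ambiguous candidate whose contexts support two categories equally is left unassigned rather than guessed.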


Journal of Biomedical Informatics | 2002

Information extraction for enhanced access to disease outbreak reports

Ralph Grishman; Silja Huttunen; Roman Yangarber

Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
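The extraction engine's pattern-and-word-class approach can be sketched as below. The pattern, the word classes, and the example documents are all invented for illustration; the actual system used far richer pattern sets, partly produced by the automated discovery tools mentioned above.

```python
import re

# Toy word classes standing in for the system's curated classes.
DISEASES = {"cholera", "ebola", "measles"}
COUNTRIES = {"Peru", "Uganda", "Romania"}

# One toy extraction pattern with named slots for the table columns.
PATTERN = re.compile(
    r"(?P<count>\d+) cases of (?P<disease>\w+) "
    r"(?:were )?reported in (?P<location>\w+)")

def extract_events(documents):
    """Fill one table row per outbreak event matched by the pattern,
    keeping a link back to the source document for browsing."""
    rows = []
    for doc_id, text in documents:
        for m in PATTERN.finditer(text):
            if m.group("disease") in DISEASES and m.group("location") in COUNTRIES:
                rows.append({"disease": m.group("disease"),
                             "location": m.group("location"),
                             "cases": int(m.group("count")),
                             "source": doc_id})
    return rows
```

The resulting rows form the tabular database: selecting or sorting on the `disease` and `location` columns leads straight back to the documents via `source`, which is the access model the user study compared against conventional Web search.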


International Conference on Computational Linguistics | 1994

Generalizing automatically generated selectional patterns

Ralph Grishman; John Sterling

Frequency information on co-occurrence patterns can be automatically collected from a syntactically analyzed corpus; this information can then serve as the basis for selectional constraints when analyzing new text from the same domain. This information, however, is necessarily incomplete. We report on measurements of the degree of selectional coverage obtained with different sizes of corpora. We then describe a technique for using the corpus to identify selectionally similar terms, and for using this similarity to broaden the selectional coverage for a fixed corpus size.
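The broadening technique can be sketched as follows: words are deemed similar when they fill the same (head, relation) slots, and an unseen triple is accepted if a similar word has been observed in that slot. The Jaccard similarity and all example triples are assumptions for illustration, not the paper's measurements.

```python
from collections import Counter, defaultdict

def build_triples(parsed_corpus):
    """Count (head, relation, argument) triples from an analyzed corpus."""
    return Counter(parsed_corpus)

def similar_words(counts, word, top=3):
    """Words are similar if they fill the same (head, relation) slots;
    ranked by Jaccard overlap of their slot sets."""
    slots = defaultdict(set)
    for (head, rel, arg) in counts:
        slots[arg].add((head, rel))
    target = slots.get(word, set())
    scored = [(len(target & s) / len(target | s), w)
              for w, s in slots.items() if w != word]
    return [w for score, w in sorted(scored, reverse=True)[:top] if score > 0]

def plausible(counts, head, rel, arg):
    """Accept a triple if seen directly, or seen with a similar argument,
    broadening coverage beyond the finite corpus."""
    if counts[(head, rel, arg)] > 0:
        return True
    return any(counts[(head, rel, w)] > 0 for w in similar_words(counts, arg))
```

So even if "brew water" never occurred in the corpus, it is accepted because "water" patterns like "coffee" and "tea", which did occur as objects of "brew"; an argument with no overlapping slots is still rejected.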
