
Publication


Featured research published by Stephen Soderland.


MUC6 '95 Proceedings of the 6th conference on Message understanding | 1995

Description of the UMass system as used for MUC-6

David Fisher; Stephen Soderland; Fangfang Feng; Wendy G. Lehnert

Information extraction research at the University of Massachusetts is based on portable, trainable language processing components. Some components are more effective than others, some have been under development longer than others, but in all cases, we are working to eliminate manual knowledge engineering. Although UMass has participated in previous MUC evaluations, all of our information extraction software has been redesigned and rewritten since MUC-5, so we are evaluating a completely new system this year.


MUC3 '91 Proceedings of the 3rd conference on Message understanding | 1991

University of Massachusetts: description of the CIRCUS system as used for MUC-3

Wendy G. Lehnert; Claire Cardie; David Fisher; J. McCarthy; Ellen Riloff; Stephen Soderland

In 1988 Professor Wendy Lehnert completed the initial implementation of a semantically-oriented sentence analyzer named CIRCUS [1]. The original design for CIRCUS was motivated by two basic research interests: (1) we wanted to increase the level of syntactic sophistication associated with semantically-oriented parsers, and (2) we wanted to integrate traditional symbolic techniques in natural language processing with connectionist techniques in an effort to exploit the complementary strengths of these two computational paradigms.


MUC5 '93 Proceedings of the 5th conference on Message understanding | 1993

UMass/Hughes: description of the CIRCUS system used for MUC-5

Wendy G. Lehnert; J. McCarthy; Stephen Soderland; Ellen Riloff; Claire Cardie; J. Peterson; Fangfang Feng; Charles P. Dolan; Seth R. Goldman

The primary goal of our effort is the development of robust and portable language processing capabilities for information extraction applications. The system under evaluation here is based on language processing components that have demonstrated strong performance capabilities in previous evaluations [Lehnert et al. 1992a]. Having demonstrated the general viability of these techniques, we are now concentrating on the practicality of our technology by creating trainable system components to replace hand-coded data and manually-engineered software.


Journal of Artificial Intelligence Research | 1994

Wrap-Up: a trainable discourse module for information extraction

Stephen Soderland; Wendy G. Lehnert

The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher level IE processing. Wrap-Up is a trainable IE discourse component that makes intersentential inferences and identifies logical relations among information extracted from the text. Previous corpus-based approaches were limited to lower level processing such as part-of-speech tagging, lexical disambiguation, and dictionary construction. Wrap-Up is fully trainable, and not only automatically decides what classifiers are needed, but even derives the feature set for each classifier automatically. Performance equals that of a partially trainable discourse module requiring manual customization for each domain.
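
To make the idea of a trainable discourse component concrete, here is a minimal, purely illustrative sketch of one kind of decision such a module might learn: whether two extracted items should be linked into the same output object. The features, training pairs, and use of scikit-learn's DecisionTreeClassifier are assumptions made for this example, not Wrap-Up's actual classifiers or feature sets, which the paper says are derived automatically from the training corpus.

```python
# Hypothetical illustration of one discourse decision a trainable module
# might learn: should two extracted case-frame slots be merged into the
# same output object? The features and data below are invented for the
# example; Wrap-Up derives its own classifiers and features from training.

from sklearn.tree import DecisionTreeClassifier

# Each row describes a pair of extracted items:
# [same_sentence, sentence_distance, types_compatible]
X_train = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 4, 0],
    [0, 6, 1],
    [1, 0, 0],
]
# 1 = the pair was linked in the hand-filled answer key, 0 = not linked.
y_train = [1, 1, 0, 0, 0]

clf = DecisionTreeClassifier().fit(X_train, y_train)

# A new pair extracted from adjacent sentences with compatible types.
print(clf.predict([[0, 1, 1]]))   # -> [1]: link the pair
```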


MUC3 '91 Proceedings of the 3rd conference on Message understanding | 1991

University of Massachusetts: MUC-3 test results and analysis

Wendy G. Lehnert; Claire Cardie; David Fisher; J. McCarthy; Ellen Riloff; Stephen Soderland

We believe that the score reports we obtained for TST2 provide an accurate assessment of our system's capabilities, insofar as they are consistent with the results of our own internal tests conducted near the end of phase 2. The required TST2 score reports indicate that our system achieved the highest combined scores for recall (51%) and precision (62%), as well as the highest recall score of all the MUC-3 systems under the official MATCHED/MISSING scoring profile.


Journal of Experimental and Theoretical Artificial Intelligence | 1995

Inductive Text Classification for Medical Applications

Wendy G. Lehnert; Stephen Soderland; David B. Aronow; Fangfang Feng; Avinoam Shmueli

Text classification poses a significant challenge for knowledge-based technologies because it touches on all the familiar demons of artificial intelligence: the knowledge engineering bottleneck, problems of scale, easy portability across multiple applications, and cost-effective system construction. Information retrieval (IR) technologies traditionally avoid all of these issues by defining a document in terms of a statistical profile of its lexical items. The IR community is willing to exploit a superficial type of knowledge found in dictionaries and thesauri, but anything that requires customization, application-specific engineering, or any amount of manual tinkering is thought to be incompatible with practical cost-effective system designs. In this paper those assumptions are challenged, and it is shown how machine learning techniques can operate as an effective method for automated knowledge acquisition when applied to a representative training corpus and leveraged against a few hours o...


Integrated Computer-Aided Engineering | 1994

Evaluating an Information Extraction System

Wendy G. Lehnert; Claire Cardie; David Fisher; J. McCarthy; Ellen Riloff; Stephen Soderland

Many natural language researchers are now turning their attention to a relatively new task orientation known as information extraction. Information extraction systems are predicated on an I/O orientation that makes it possible to conduct formal evaluations and meaningful cross-system comparisons. This article presents the challenge of information extraction and shows how information extraction systems are currently being evaluated. We describe a specific system developed at the University of Massachusetts, identify key research issues of general interest, and conclude with some observations about the role of performance evaluations as a stimulus for basic research.


International Joint Conference on Artificial Intelligence | 1996

Issues in inductive learning of domain-specific text extraction rules

Stephen Soderland; David Fisher; Jonathan Aseltine; Wendy G. Lehnert

Domain-specific text analysis requires a dictionary of linguistic patterns that identify references to relevant information in a text. This paper describes CRYSTAL, a fully automated tool that induces such a dictionary of text extraction rules. We discuss some key issues in developing an automatic dictionary induction system, using CRYSTAL as a concrete example. CRYSTAL derives text extraction rules from training instances and generalizes each rule as far as possible, testing the accuracy of each proposed rule on the training corpus. An error tolerance parameter allows CRYSTAL to manipulate a trade-off between recall and precision. We discuss issues involved with creating training data, defining a domain ontology, and allowing a flexible and expressive representation while designing a search control mechanism that avoids intractability.
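
The generalize-and-test loop described in the abstract can be illustrated with a small, hypothetical example. The rule representation (a set of feature constraints), the toy instances, and the drop-one-constraint relaxation below are assumptions made purely for illustration; CRYSTAL's actual rules constrain syntactic buffers and semantic classes, and it generalizes by unifying similar rules rather than simply deleting constraints. The error_tolerance parameter plays the role of the recall/precision trade-off mentioned above.

```python
# Hypothetical sketch of a CRYSTAL-style generalize-and-test loop.
# A "rule" is a set of constraints; an instance is a set of features plus
# a label saying whether it is a positive example of the target concept.

from dataclasses import dataclass

@dataclass
class Instance:
    features: frozenset   # e.g. {"subj:patient", "verb:complain", "obj:symptom"}
    positive: bool        # does this instance express the target concept?

def covers(rule, inst):
    """A rule covers an instance if every constraint in the rule is satisfied."""
    return rule <= inst.features

def error_rate(rule, corpus):
    """Fraction of covered training instances that are false hits."""
    covered = [i for i in corpus if covers(rule, i)]
    if not covered:
        return 1.0
    return sum(not i.positive for i in covered) / len(covered)

def generalize(seed, corpus, error_tolerance):
    """Start from a maximally specific rule (all features of the seed
    instance) and drop constraints while the error measured on the
    training corpus stays within the tolerance.  A higher tolerance
    trades precision for recall."""
    rule = frozenset(seed.features)
    changed = True
    while changed:
        changed = False
        for constraint in sorted(rule):
            candidate = rule - {constraint}
            if candidate and error_rate(candidate, corpus) <= error_tolerance:
                rule = candidate
                changed = True
                break
    return rule

corpus = [
    Instance(frozenset({"subj:patient", "verb:complain", "obj:symptom"}), True),
    Instance(frozenset({"subj:patient", "verb:report", "obj:symptom"}), True),
    Instance(frozenset({"subj:doctor", "verb:report", "obj:finding"}), False),
]
print(generalize(corpus[0], corpus, error_tolerance=0.0))
```

In this toy corpus the rule relaxes down to a single verb constraint while staying error-free on the training instances; the tolerance parameter only becomes important when no constraint set cleanly separates positives from negatives, which is the usual situation on a real training corpus.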


Proceedings of the TIPSTER Text Program: Phase I | 1993

UMass/Hughes: description of the CIRCUS system used for TIPSTER text

Wendy G. Lehnert; J. McCarthy; Stephen Soderland; Ellen Riloff; Claire Cardie; J. Peterson; Fangfang Feng

The primary goal of our effort is the development of robust and portable language processing capabilities for information extraction applications. The system under evaluation here is based on language processing components that have demonstrated strong performance capabilities in previous evaluations [Lehnert et al. 1992a]. Having demonstrated the general viability of these techniques, we are now concentrating on the practicality of our technology by creating trainable system components to replace hand-coded data and manually-engineered software.


International Joint Conference on Artificial Intelligence | 1995

CRYSTAL: inducing a conceptual dictionary

Stephen Soderland; David Fisher; Jonathan Aseltine; Wendy G. Lehnert

Collaboration


Dive into Stephen Soderland's collaborations.

Top Co-Authors

Wendy G. Lehnert, University of Massachusetts Amherst
David Fisher, University of Massachusetts Amherst
J. McCarthy, University of Massachusetts Amherst
Fangfang Feng, University of Massachusetts Amherst
Jonathan Aseltine, University of Massachusetts Amherst