Kevin Humphreys | Researchain

Publication

Featured research published by Kevin Humphreys.

conference on applied natural language processing | 1997

GATE - a General Architecture for Text Engineering

Hamish Cunningham; Kevin Humphreys; Robert J. Gaizauskas; Yorick Wilks

This paper presents the design, implementation and evaluation of GATE, a General Architecture for Text Engineering. GATE lies at the intersection of human language computation and software engineering, and constitutes an infrastructural system supporting research and development of language processing software.

MUC6 '95 Proceedings of the 6th conference on Message understanding | 1995

University of Sheffield: description of the LaSIE system as used for MUC-6

Robert J. Gaizauskas; Kevin Humphreys; Hamish Cunningham; Yorick Wilks

The LaSIE (Large Scale Information Extraction) system has been developed at the University of Sheffield as part of an ongoing research effort into information extraction and, more generally, natural language engineering.

pacific symposium on biocomputing | 1999

Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures.

Kevin Humphreys; George Demetriou; Robert J. Gaizauskas

Information extraction technology, as defined and developed through the U.S. DARPA Message Understanding Conferences (MUCs), has proved successful at extracting information primarily from newswire texts and primarily in domains concerned with human activity. In this paper we consider the application of this technology to the extraction of information from scientific journal papers in the area of molecular biology. In particular, we describe how an information extraction system designed to participate in the MUC exercises has been modified for two bioinformatics applications: EMPathIE, concerned with enzyme and metabolic pathways; and PASTA, concerned with protein structure. Progress to date provides convincing grounds for believing that IE techniques will deliver novel and effective ways for scientists to make use of the core literature which defines their disciplines.

conference on applied natural language processing | 1997

Software Infrastructure for Natural Language Processing

Hamish Cunningham; Kevin Humphreys; Robert J. Gaizauskas; Yorick Wilks

We classify and review current approaches to software infrastructure for research, development and delivery of NLP systems. The task is motivated by a discussion of current trends in the field of NLP and Language Engineering. We describe a system called GATE (a General Architecture for Text Engineering) that provides a software infrastructure on top of which heterogeneous NLP processing modules may be evaluated and refined individually, or may be combined into larger application systems. GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE promotes reuse of component technology, permits specialisation and collaboration in large-scale projects, and allows for the comparison and evaluation of alternative technologies. The first release of GATE is now available.
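
The idea of combining independently developed modules into a larger application can be illustrated with a minimal Python sketch. It is not GATE's actual API: the names `Document`, `tokenise`, `tag_capitalised` and `run_pipeline` are hypothetical, and the sketch assumes only that modules communicate by adding annotations to a shared document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    """Hypothetical shared document: raw text plus annotations added by modules."""
    text: str
    annotations: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)


# A module is any callable that reads a document and adds annotations to it.
Module = Callable[[Document], None]


def tokenise(doc: Document) -> None:
    """Toy tokeniser: records (start, end) offsets of whitespace-separated tokens."""
    spans, pos = [], 0
    for token in doc.text.split():
        start = doc.text.index(token, pos)
        spans.append((start, start + len(token)))
        pos = start + len(token)
    doc.annotations["token"] = spans


def tag_capitalised(doc: Document) -> None:
    """Toy tagger: marks tokens that start with an upper-case letter.

    It depends on the 'token' annotations produced by an earlier module.
    """
    doc.annotations["capitalised"] = [
        (s, e) for s, e in doc.annotations["token"] if doc.text[s].isupper()
    ]


def run_pipeline(doc: Document, modules: List[Module]) -> Document:
    """Run the modules in order; each can be replaced or evaluated on its own."""
    for module in modules:
        module(doc)
    return doc


doc = run_pipeline(Document("GATE was built in Sheffield"), [tokenise, tag_capitalised])
capitalised = [doc.text[s:e] for s, e in doc.annotations["capitalised"]]
print(capitalised)
```

Because every module sees only the shared document, a different tokeniser or tagger could be substituted without changing the rest of the pipeline, which is the kind of reuse and comparison the abstract describes.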

international conference on tools with artificial intelligence | 1996

GATE: an environment to support research and development in natural language engineering

Robert J. Gaizauskas; Hamish Cunningham; Yorick Wilks; Peter Rodgers; Kevin Humphreys

We describe a software environment to support research and development in natural language (NL) engineering. This environment, GATE (General Architecture for Text Engineering), aims to advance research in the area of machine processing of natural languages by providing a software infrastructure on top of which heterogeneous NL component modules may be evaluated and refined individually or may be combined into larger application systems. Thus, GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE will promote reuse of component technology, permit specialisation and collaboration in large-scale projects, and allow for the comparison and evaluation of alternative technologies. The first release of GATE is now available.

ANARESOLUTION '97 Proceedings of a Workshop on Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts | 1997

Event coreference for information extraction

Kevin Humphreys; Robert J. Gaizauskas; Saliha Azzam

We propose a general approach for performing event coreference and for constructing complex event representations, such as those required for information extraction tasks. Our approach is based on a representation which allows a tight coupling between world or conceptual modelling and discourse modelling. The representation and the coreference mechanism are fully implemented within the LaSIE information extraction system, where the mechanism is used for both object (noun phrase) and event coreference resolution. Indirect evaluation of the approach shows a small but significant benefit for information extraction tasks.

meeting of the association for computational linguistics | 1998

Evaluating a Focus-Based Approach to Anaphora Resolution

Saliha Azzam; Kevin Humphreys; Robert J. Gaizauskas

We present an approach to anaphora resolution based on a focusing algorithm, and implemented within an existing MUC (Message Understanding Conference) Information Extraction system, allowing quantitative evaluation against a substantial corpus of annotated real-world texts. Extensions to the basic focusing mechanism can be easily tested, resulting in refinements to the mechanism and resolution rules. Results show that the focusing algorithm is highly sensitive to the quality of syntactic-semantic analyses, when compared to a simpler heuristic-based approach.

international workshop/conference on parsing technologies | 2005

SUPPLE: A Practical Parser for Natural Language Engineering Applications

Robert J. Gaizauskas; Mark Hepple; Horacio Saggion; Mark A. Greenwood; Kevin Humphreys

We describe SUPPLE, a freely-available, open source natural language parsing system, implemented in Prolog, and designed for practical use in language engineering (LE) applications. SUPPLE can be run as a stand-alone application, or as a component within the GATE General Architecture for Text Engineering. SUPPLE is distributed with an example grammar that has been developed over a number of years across several LE projects. This paper describes the key characteristics of the parser and the distributed grammar.

Natural Language Engineering | 1997

Using a semantic network for information extraction

Robert J. Gaizauskas; Kevin Humphreys

This paper describes the approach to knowledge representation taken in the LaSIE Information Extraction (IE) system. Unlike many IE systems that skim texts and use large collections of shallow, domain-specific patterns and heuristics to fill in templates, LaSIE attempts a fuller text analysis, first translating individual sentences to a quasi-logical form, and then constructing a weak discourse model of the entire text from which template fills are finally derived. Underpinning the system is a general ‘world model’, represented as a semantic net, which is extended during the processing of a text by adding the classes and instances described in that text. In the paper we describe the system's knowledge representation formalisms, their use in the IE task, and how the knowledge represented in them is acquired, including experiments to extend the system's coverage using the WordNet general purpose semantic network. Preliminary evaluations of our approach, through the Sixth DARPA Message Understanding Conference, indicate comparable performance to shallower approaches. However, we believe its generality and extensibility offer a route towards the higher precision that is required of IE systems if they are to become genuinely usable technologies.
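
A world model held as a semantic net, and extended while a text is processed, might look roughly like the following Python sketch. This is an illustration only and not LaSIE's actual formalism: the class `SemanticNet`, its methods, and the example classes and instance identifiers are all hypothetical.

```python
from typing import Dict, Optional, Set


class SemanticNet:
    """Hypothetical world model: classes in an is-a hierarchy plus instances."""

    def __init__(self) -> None:
        self.parent: Dict[str, Optional[str]] = {}  # class -> superclass
        self.instance_of: Dict[str, str] = {}       # instance -> class

    def add_class(self, name: str, parent: Optional[str] = None) -> None:
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown superclass: {parent}")
        self.parent[name] = parent

    def add_instance(self, name: str, cls: str) -> None:
        if cls not in self.parent:
            raise KeyError(f"unknown class: {cls}")
        self.instance_of[name] = cls

    def ancestors(self, cls: str) -> Set[str]:
        """All classes reached by following is-a links upwards, including cls."""
        found: Set[str] = set()
        current: Optional[str] = cls
        while current is not None:
            found.add(current)
            current = self.parent[current]
        return found

    def is_a(self, instance: str, cls: str) -> bool:
        return cls in self.ancestors(self.instance_of[instance])


# Initial, general world model (illustrative classes only).
net = SemanticNet()
net.add_class("entity")
net.add_class("organisation", "entity")
net.add_class("person", "entity")

# Processing a text extends the model with the classes and instances it mentions.
net.add_class("company", "organisation")
net.add_instance("e1", "company")

print(net.is_a("e1", "organisation"))
print(net.is_a("e1", "person"))
```

In a design like this, template fills would be derived by querying the extended net, for example by asking which instances fall under a class the extraction task cares about.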

Lecture Notes in Computer Science | 1997

Concepticons vs. Lexicons: An Architecture for Multilingual Information Extraction

Robert J. Gaizauskas; Kevin Humphreys

Given an information extraction (IE) system that performs an extraction task against texts in one language, it is natural to consider how to modify the system to perform the same task against texts in a different language. More generally, there may be a requirement to do the extraction task against texts in an arbitrary number of different languages and to present results to a user who has no knowledge of the source language from which the information has been extracted. To minimise the language-specific alterations that need to be made in extending the system to a new language, it is important to separate the task-specific conceptual knowledge the system uses, which may be assumed to be language independent, from the language-dependent lexical knowledge the system requires, which unavoidably must be extended for each new language. In this paper we describe how the architecture of the LaSIE system, an IE system designed to do monolingual extraction from English texts, has been modified to support a clean separation between conceptual and lexical information. This separation allows hard-to-acquire, domain-specific conceptual knowledge to be represented only once, and hence to be reused in extracting information from texts in multiple languages, while standard lexical resources can be used to extend language coverage. Preliminary experiments with extending the system to French are described.
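
The separation between conceptual and lexical knowledge can be sketched in a few lines of Python. The data and function names below (`CONCEPTS`, `LEXICONS`, `concept_for`, `render`) are hypothetical and are not taken from the LaSIE system; the sketch assumes only that concepts are stored once and that each language contributes its own word-to-concept mapping.

```python
from typing import Dict, Optional

# Language-independent conceptual knowledge, represented once (illustrative only).
CONCEPTS: Dict[str, Dict[str, str]] = {
    "COMPANY": {"is_a": "ORGANISATION"},
    "PERSON": {"is_a": "ENTITY"},
}

# Language-dependent lexicons mapping surface words to concept identifiers.
LEXICONS: Dict[str, Dict[str, str]] = {
    "en": {"company": "COMPANY", "firm": "COMPANY", "person": "PERSON"},
    "fr": {"société": "COMPANY", "entreprise": "COMPANY", "personne": "PERSON"},
}


def concept_for(word: str, language: str) -> Optional[str]:
    """Return the concept a word denotes, or None if the lexicon lacks the word."""
    return LEXICONS[language].get(word.lower())


def render(concept: str, language: str) -> Optional[str]:
    """Return the first word listed for a concept in the target-language lexicon."""
    for word, word_concept in LEXICONS[language].items():
        if word_concept == concept:
            return word
    return None


# A French word is mapped to a shared concept, then presented in English.
concept = concept_for("Entreprise", "fr")
print(concept, render(concept, "en"))
```

Adding a new language in this sketch means adding one more entry to `LEXICONS`; `CONCEPTS` is left untouched, which mirrors the reuse argument made in the abstract.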
