Publication


Featured research published by Collin F. Baker.


meeting of the association for computational linguistics | 2007

SemEval-2007 Task 19: Frame Semantic Structure Extraction

Collin F. Baker; Michael Ellsworth; Katrin Erk

This task consists of recognizing words and phrases that evoke semantic frames as defined in the FrameNet project (http://framenet.icsi.berkeley.edu), and their semantic dependents, which are usually, but not always, their syntactic dependents (including subjects). The training data consisted of FrameNet (FN) annotated sentences. In testing, participants automatically annotated three previously unseen texts to match gold standard (human) annotation, including predicting previously unseen frames and roles. Precision and recall were measured both for matching of labels of frames and frame elements (FEs) and for matching of semantic dependency trees based on the annotation.
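The label-matching part of such an evaluation can be pictured as a set comparison between gold and predicted labels. The sketch below is only an illustration under stated assumptions, not the task's official scorer: the function name `precision_recall`, the label layout (sentence id, target token span, frame name), and the sample labels are all hypothetical.

```python
def precision_recall(gold, predicted):
    """Return (precision, recall) for two collections of hashable labels."""
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)  # labels that agree exactly
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical labels: (sentence id, target token span, frame name).
gold = [
    (1, (0, 1), "Commerce_buy"),
    (1, (3, 4), "Motion"),
    (2, (2, 3), "Statement"),
]
pred = [
    (1, (0, 1), "Commerce_buy"),
    (1, (3, 4), "Self_motion"),  # right target, wrong frame: no match
]

p, r = precision_recall(gold, pred)
print(f"precision={p:.2f} recall={r:.2f}")
```

Exact matching is the simplest choice here; the abstract does not say whether the actual scoring gave partial credit, so none is modeled.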


linguistic annotation workshop | 2009

WordNet and FrameNet as Complementary Resources for Annotation

Collin F. Baker; Christiane Fellbaum

WordNet and FrameNet are widely used lexical resources, but they are very different from each other and are often used in completely different ways in NLP. In a case study in which a short passage is annotated in both frameworks, we show how the synsets and definitions of WordNet and the syntagmatic information from FrameNet can complement each other, forming a more complete representation of the lexical semantics of a text than either could alone. Close comparisons between them also suggest ways in which they can be brought into alignment.


meeting of the association for computational linguistics | 2003

The FrameNet Data and Software

Collin F. Baker; Hiroaki Sato

The FrameNet project has developed a lexical knowledge base providing a unique level of detail as to the possible syntactic realizations of the specific semantic roles evoked by each predicator, for roughly 7,000 lexical units, on the basis of annotating more than 100,000 example sentences extracted from corpora. An interim version of the FrameNet data was released in October 2002 and is being widely used. A new, more portable version of the FrameNet software is also being made available to researchers elsewhere, including the Spanish FrameNet project. This demo and poster will briefly explain the principles of Frame Semantics and demonstrate the new unified tools for lexicon building and annotation and also FrameSQL, a search tool for finding patterns in annotated sentences. We will discuss the content and format of the data releases and how the software and data can be used by other NLP researchers.


Natural Language Engineering | 2009

The FrameNet model and its applications

Birte Lönneker-Rodman; Collin F. Baker

The FrameNet database comprises an English lexicon, organized in terms of semantic frames. Frames describe situations or entities, along with their participants and props, termed frame elements. The frames are organized in an ontology-like network. For the lexical units, corpus annotations illustrate which frame elements are typically realized, and how they behave syntactically. Texts where all content words are annotated with FrameNet information offer a detailed, structured semantic representation with a variety of uses in Natural Language Processing applications, in particular in retrieving and meaningfully organizing texts written by humans, or in making human–computer interaction more natural. Also, the FrameNet English lexicon can be replaced by lexical data from other languages, while maintaining frame information, so the model is attractive for cross-lingual resources and applications. Manual annotation produced by FrameNet and similar projects for other languages is used to train automatic frame semantic annotation systems, which add rich semantic information to any type of text, and are important components for more sophisticated semantic processing applications.
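The organization described above (frames with frame elements, lexical units attached to frames, and a lexicon that can be swapped for another language while the frames stay fixed) can be sketched as a small data structure. This is a minimal illustration, not the FrameNet database schema: the class names, field names, and the `with_lexicon` helper are assumptions, and the sample frame and words are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    name: str
    frame_elements: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class LexicalUnit:
    lemma: str
    pos: str
    frame: str  # name of the frame this unit evokes


class Lexicon:
    """Frames plus the lexical units of one language."""

    def __init__(self, frames: Dict[str, Frame], units: List[LexicalUnit]):
        for unit in units:
            if unit.frame not in frames:
                raise KeyError(f"unknown frame: {unit.frame}")
        self.frames = frames
        self.units = list(units)

    def units_for(self, frame_name: str) -> List[str]:
        return [f"{u.lemma}.{u.pos}" for u in self.units if u.frame == frame_name]

    def with_lexicon(self, units: List[LexicalUnit]) -> "Lexicon":
        # Hypothetical helper: keep the frame inventory, replace the words.
        return Lexicon(self.frames, units)


frames = {"Commerce_buy": Frame("Commerce_buy", ["Buyer", "Goods", "Seller", "Money"])}
english = Lexicon(frames, [
    LexicalUnit("buy", "v", "Commerce_buy"),
    LexicalUnit("purchase", "v", "Commerce_buy"),
])
spanish = english.with_lexicon([LexicalUnit("comprar", "v", "Commerce_buy")])

print(english.units_for("Commerce_buy"))
print(spanish.units_for("Commerce_buy"))
```

Sharing one `frames` dictionary between the two lexicons mirrors the idea that frame information is kept when the lexical data changes.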


international semantic web conference | 2003

FrameNet Meets the Semantic Web: Lexical Semantics for the Web

Srini Narayanan; Collin F. Baker; Charles J. Fillmore; Miriam R. L. Petruck

This paper describes FrameNet [9,1,3], an online lexical resource for English based on the principles of frame semantics [5,7,2]. We provide a data category specification for frame semantics and FrameNet annotations in an RDF-based language. More specifically, we provide an RDF markup for lexical units, defined as a relation between a lemma and a semantic frame, and frame-to-frame relations, namely Inheritance and Subframes. The paper includes simple examples of FrameNet annotated sentences in an XML/RDF format that references the project-specific data category specification.


Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929-2014) | 2014

FrameNet: A Knowledge Base for Natural Language Processing

Collin F. Baker

Prof. Charles J. Fillmore had a lifelong interest in lexical semantics, and this culminated in the latter part of his life in a major research project, the FrameNet Project at the International Computer Science Institute in Berkeley, California (http://framenet.icsi.berkeley.edu). This paper reports on the background of this ongoing project and its connections to Fillmore’s other research interests, and briefly outlines applications and current directions of growth for FrameNet, including FrameNets in languages other than English.


meeting of the association for computational linguistics | 2003

Putting FrameNet Data into the ISO Linguistic Annotation Framework

Srinivas Narayanan; Miriam R. L. Petruck; Collin F. Baker; Charles J. Fillmore

This paper describes FrameNet (Lowe et al., 1997; Baker et al., 1998; Fillmore et al., 2002), an online lexical resource for English based on the principles of frame semantics (Fillmore, 1977a; Fillmore, 1982; Fillmore and Atkins, 1992), and considers the FrameNet database in reference to the proposed ISO model for linguistic annotation of language resources (ISO TC37 SC4) (ISO, 2002; Ide and Romary, 2001b). We provide a data category specification for frame semantics and FrameNet annotations in an RDF-based language. More specifically, we provide a DAML+OIL markup for lexical units, defined as a relation between a lemma and a semantic frame, and frame-to-frame relations, namely Inheritance and Subframes. The paper includes simple examples of FrameNet annotated sentences in an XML/RDF format that references the project-specific data category specification.


linguistic annotation workshop | 2015

Scaling Semantic Frame Annotation

Nancy Chang; Praveen Paritosh; David Francois Huynh; Collin F. Baker

Large-scale data resources needed for progress toward natural language understanding are not yet widely available and typically require considerable expense and expertise to create. This paper addresses the problem of developing scalable approaches to annotating semantic frames and explores the viability of crowdsourcing for the task of frame disambiguation. We present a novel supervised crowdsourcing paradigm that incorporates insights from human computation research designed to accommodate the relative complexity of the task, such as exemplars and real-time feedback. We show that non-experts can be trained to perform accurate frame disambiguation, and can even identify errors in gold data used as the training exemplars. Results demonstrate the efficacy of this paradigm for semantic annotation requiring an intermediate level of expertise.

1 The semantic bottleneck

Behind every great success in speech and language lies a great corpus, or at least a very large one. Advances in speech recognition, machine translation and syntactic parsing can be traced to the availability of large-scale annotated resources (Wall Street Journal, Europarl and Penn Treebank, respectively) providing crucial supervised input to statistically learned models. Semantically annotated resources have been comparatively harder to come by: representing meaning poses myriad philosophical, theoretical and practical challenges, particularly for general purpose resources that can be applied to diverse domains. If these challenges can be addressed, however, semantic resources hold significant potential for fueling progress beyond shallow syntax and toward deeper language understanding.

This paper explores the feasibility of developing scalable methodologies for semantic annotation, inspired by three strands of work. First, frame semantics, and its instantiation in the Berkeley FrameNet project (Fillmore and Baker, 2010), offers a principled approach to representing meaning. FrameNet is a lexicographic resource that captures syntactic and semantic generalizations that go beyond surface form and part of speech, famously including the relationships among words like buy, sell, purchase and price. These rich structural relations provide an attractive foundation for work in deeper natural language understanding and inference, as attested by the breadth of applications at the Workshop in Honor of Chuck Fillmore at ACL 2014 (Petruck and de Melo, 2014). But FrameNet was not designed to support scalable language technologies; indeed, it is perhaps a paradigm example of a hand-curated knowledge resource, one that has required significant expertise, training, time and expense to create and that remains under development.

Second, the task of automatic semantic role labeling (ASRL) (Gildea and Jurafsky, 2002) serves as an applied counterpart to the ideas of frame semantics. Recent progress has demonstrated the viability of training automated models using frame-annotated data (Das et al., 2013; Das et al., 2010; Johansson and Nugues, 2006). Results based on FrameNet data have been limited by its incomplete


Linguistics | 2013

Comparing and harmonizing different verb classifications in light of a semantic annotation task

Christiane Fellbaum; Collin F. Baker

The verb lexicon can be classified in different and complementary ways; each approach faces challenges and limits. We focus on two large-scale lexical resources, WordNet (Miller et al. 1990; Fellbaum 1998) and FrameNet (Baker et al. 2003). WordNet is a semantic network where lexical meaning is represented in terms of relations among word forms. In contrast to WordNet's paradigmatically organized entries, FrameNet's Lexical Units are embedded in corpus-derived contexts, which serve as a basis for semantic distinctions. The classificatory perspective that each resource contributes to the analysis of the verb lexicon is illustrated with specific examples. Constructing verb typologies, even when based on attested data, requires the lexicographer's introspection and judgments, and no two classifications are completely alike. In a case-based approach to compare and harmonize WordNet and FrameNet, we ask human annotators to select the context-appropriate senses from each resource that best fit tokens in a corpus. A typology of alignments is proposed.


workshop on graph based methods for natural language processing | 2017

Graph Methods for Multilingual FrameNets.

Collin F. Baker; Michael Ellsworth

This paper introduces a new, graph-based view of the data of the FrameNet project, which we hope will make it easier to understand the mixture of semantic and syntactic information contained in FrameNet annotation. We show how English FrameNet and other Frame Semantic resources can be represented as sets of interconnected graphs of frames, frame elements, semantic types, and annotated instances of them in text. We display examples of the new graphical representation based on the annotations, which combine Frame Semantics and Construction Grammar, thus capturing most of the syntax and semantics of each sentence. We consider how graph theory could help researchers to make better use of FrameNet data for tasks such as automatic Frame Semantic role labeling, paraphrasing, and translation. Finally, we describe the development of FrameNet-like lexical resources for other languages in the current Multilingual FrameNet project, which seeks to discover cross-lingual alignments, both in the lexicon (for frames and lexical units within frames) and across parallel or comparable texts. We conclude with an example showing graphically the semantic and syntactic similarities and differences between parallel sentences in English and Japanese. We will release software for displaying such graphs from the current data releases.
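One way to picture a graph view of this kind is as a set of labeled edges linking lexical units, frames, and frame elements, with a traversal that finds what is connected to a given node. The sketch below is an assumption-laden illustration and not the software the authors describe: the node naming scheme, the relation labels (`evokes`, `has_fe`, `inherits_from`), the sample edges, and both function names are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (source, relation, target) triples."""
    graph = defaultdict(set)
    for source, relation, target in edges:
        graph[source].add((relation, target))
        graph[target].add((relation + "^-1", source))  # inverse edge
    return graph


def reachable(graph, start, max_hops):
    """Return all nodes within max_hops edges of start (breadth-first search)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue
        for _, neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return seen - {start}


# Illustrative edges only; identifiers are made up for this example.
edges = [
    ("lu:en:buy.v", "evokes", "frame:Commerce_buy"),
    ("lu:ja:kau.v", "evokes", "frame:Commerce_buy"),
    ("frame:Commerce_buy", "has_fe", "fe:Buyer"),
    ("frame:Commerce_buy", "has_fe", "fe:Goods"),
    ("frame:Commerce_buy", "inherits_from", "frame:Getting"),
]
graph = build_graph(edges)

print(sorted(reachable(graph, "lu:en:buy.v", 1)))
print(sorted(reachable(graph, "lu:en:buy.v", 2)))
```

In this toy graph, two lexical units from different languages become neighbors at two hops because they evoke the same frame, which is one simple reading of a cross-lingual alignment in the lexicon.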

Collaboration


Dive into Collin F. Baker's collaborations.

Top Co-Authors

Srini Narayanan

International Computer Science Institute

Martha Palmer

University of Colorado Boulder
