Seth Kulick | Researchain

Publication

Featured research published by Seth Kulick.

language and technology conference | 2006

Fully Parsing the Penn Treebank

Ryan Gabbard; Seth Kulick; Mitchell P. Marcus

We present a two-stage parser that recovers Penn Treebank style syntactic analyses of new sentences including skeletal syntactic structure, and, for the first time, both function tags and empty categories. The accuracy of the first-stage parser on the standard Parseval metric matches that of the (Collins, 2003) parser on which it is based, despite the data fragmentation caused by the greatly enriched space of possible node labels. This first stage simultaneously achieves near state-of-the-art performance on recovering function tags with minimal modifications to the underlying parser, modifying fewer than ten lines of code. The second stage achieves state-of-the-art performance on the recovery of empty categories by combining a linguistically-informed architecture and a rich feature set with the power of modern machine learning methods.

meeting of the association for computational linguistics | 2005

Simple Algorithms for Complex Relation Extraction with Applications to Biomedical IE

Ryan T. McDonald; Fernando Pereira; Seth Kulick; R. Scott Winters; Yang Jin; Peter S. White

A complex relation is any n-ary relation in which some of the arguments may be unspecified. We present here a simple two-stage method for extracting complex relations between named entities in text. The first stage creates a graph from pairs of entities that are likely to be related, and the second stage scores maximal cliques in that graph as potential complex relation instances. We evaluate the new method against a standard baseline for extracting genomic variation relations from biomedical text.
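The two-stage idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pairwise scorer `pair_score`, the `threshold` value, and the use of the geometric mean of edge scores as the clique score are all assumptions made for illustration, and the entity names in the usage note are toy placeholders.

```python
from itertools import combinations
from math import prod


def build_graph(entities, pair_score, threshold=0.5):
    """Stage 1: keep an edge for each entity pair whose score passes the threshold.

    `pair_score` is a hypothetical stand-in for a trained binary classifier.
    """
    edges = {}
    for a, b in combinations(entities, 2):
        score = pair_score(a, b)
        if score >= threshold:
            edges[frozenset((a, b))] = score
    return edges


def maximal_cliques(nodes, edges):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    adjacent = {n: set() for n in nodes}
    for edge in edges:
        a, b = tuple(edge)
        adjacent[a].add(b)
        adjacent[b].add(a)

    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacent[v], excluded & adjacent[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(nodes), set())
    return found


def score_clique(clique, edges):
    """Stage 2: score a clique by the geometric mean of its edge scores (an assumption)."""
    pairs = [frozenset(p) for p in combinations(sorted(clique), 2)]
    return prod(edges[p] for p in pairs) ** (1.0 / len(pairs))


def extract(entities, pair_score, threshold=0.5):
    """Return (clique, score) pairs for cliques of two or more entities, best first."""
    edges = build_graph(entities, pair_score, threshold)
    cliques = [c for c in maximal_cliques(entities, edges) if len(c) >= 2]
    return sorted(((c, score_clique(c, edges)) for c in cliques),
                  key=lambda item: item[1], reverse=True)
```

With toy entities `A`, `B`, `C`, `D` and a scorer that rates the pairs A-B, A-C, B-C, and C-D above the threshold, `extract` would propose {A, B, C} and {C, D} as candidate relation instances. Because a maximal clique need not cover every argument slot, this representation naturally allows relation instances with unspecified arguments.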

Linguistics and Philosophy | 1997

Partial proof trees as building blocks for a categorial grammar

Aravind K. Joshi; Seth Kulick

We describe a categorial system (PPTS) based on partial proof trees (PPTs) as the building blocks of the system. The PPTs are obtained by unfolding the arguments of the type that would be associated with a lexical item in a simple categorial grammar. The PPTs are the basic types in the system and a derivation proceeds by combining PPTs together. We describe the construction of the finite set of basic PPTs and the operations for combining them. PPTS can be viewed as a categorial system incorporating some of the key insights of lexicalized tree adjoining grammar, namely the notion of an extended domain of locality and the consequent factoring of recursion from the domain of dependencies. PPTS therefore inherits the linguistic and computational properties of that system, and so can be viewed as a ‘middle ground’ between a categorial grammar and a phrase structure grammar. We also discuss the relationship between PPTS, natural deduction, and linear logic proof-nets, and argue that natural deduction rather than a proof-net system is more appropriate for the construction of the PPTs. We also discuss how the use of PPTs allows us to ‘localize’ the management of resources, thereby freeing us from this management as the PPTs are combined.

Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora 2006 | 2006

Issues in Synchronizing the English Treebank and PropBank

Olga Babko-Malaya; Ann Bies; Ann Taylor; Szu-ting Yi; Martha Palmer; Mitch Marcus; Seth Kulick; Libin Shen

The PropBank primarily adds semantic role labels to the syntactic constituents in the parsed trees of the Treebank. The goal is for automatic semantic role labeling to be able to use the domain of locality of a predicate in order to find its arguments. In principle, this is exactly what is wanted, but in practice the PropBank annotators often make choices that do not actually conform to the Treebank parses. As a result, the syntactic features extracted by automatic semantic role labeling systems are often inconsistent and contradictory. This paper discusses in detail the types of mismatches between the syntactic bracketing and the semantic role labeling that can be found, and our plans for reconciling them.

workshop on events definition detection coreference and representation | 2015

From Light to Rich ERE: Annotation of Entities, Relations, and Events

Zhiyi Song; Ann Bies; Stephanie M. Strassel; Tom Riese; Justin Mott; Joe Ellis; Jonathan Wright; Seth Kulick; Neville Ryant; Xiaoyi Ma

We describe the evolution of the Entities, Relations and Events (ERE) annotation task, created to support research and technology development within the DARPA DEFT program. We begin by describing the specification for Light ERE annotation, including the motivation for the task within the context of DEFT. We discuss the transition from Light ERE to a more complex Rich ERE specification, enabling more comprehensive treatment of phenomena of interest to DEFT.

workshop on events definition detection coreference and representation | 2015

Event Nugget Annotation: Processes and Issues

Teruko Mitamura; Yukari Yamakawa; Susan E Holm; Zhiyi Song; Ann Bies; Seth Kulick; Stephanie M. Strassel

This paper describes the processes and issues of annotating event nuggets based on DEFT ERE Annotation Guidelines v1.3 and TAC KBP Event Detection Annotation Guidelines 1.7. Using Brat Rapid Annotation Tool (brat), newswire and discussion forum documents were annotated. One of the challenges arising from human annotation of documents is annotators’ disagreement about the way of tagging events. We propose using Event Nuggets to help meet the definitions of the specific type/subtypes which are part of this project. We present case studies of several examples of event annotation issues, including discontinuous multi-word events representing single events. Annotation statistics and consistency analysis are provided to characterize the interannotator agreement, considering single term events and multi-word events which are both continuous and discontinuous. Consistency analysis is conducted using a scorer to compare first pass annotated files against adjudicated files.

meeting of the association for computational linguistics | 2005

Parallel Entity and Treebank Annotation

Ann Bies; Seth Kulick; Mark A. Mandel

We describe a parallel annotation approach for PubMed abstracts. It includes both entity/relation annotation and a treebank containing syntactic structure, with a goal of mapping entities to constituents in the treebank. Crucial to this approach is a modification of the Penn Treebank guidelines and the characterization of entities as relation components, which allows the integration of the entity annotation with the syntactic structure while retaining the capacity to annotate and extract more complex events.

logical aspects of computational linguistics | 1996

Partial Proof Trees, Resource Sensitive Logics, and Syntactic Constraints

Aravind K. Joshi; Seth Kulick

We discuss the relationship between a categorial system (PPTS) based on partial proof trees (PPTs) as the building blocks of the system, resource sensitive logics and the nature of syntactic constraints. PPTS incorporates some of the key insights of lexicalized tree adjoining grammar, namely the notion of an extended domain of locality and the consequent factoring of recursion from the domain of dependencies. PPTS therefore inherits the linguistic and computational properties of that system. We discuss the relationship between PPTS, natural deduction, and linear logic proof-nets, and argue that a natural deduction system rather than a proof-net system is more appropriate for the construction of the PPTs. We also show how the use of PPTs allows us to ‘localize’ the management of resources, thereby freeing us from this management as the PPTs are combined.

ACM Transactions on Asian Language Information Processing | 2011

Exploiting Separation of Closed-Class Categories for Arabic Tokenization and Part-of-Speech Tagging

Seth Kulick

Research on the problem of morphological disambiguation of Arabic has noted that techniques developed for lexical disambiguation in English do not easily transfer over, since the affixation present in Arabic creates a very different tag set than for English, encoding both inflectional morphology and more complex tokenization sequences. This work takes a new approach to this problem based on a distinction between the open-class and closed-class categories of tokens, which differ both in their frequencies and in their possible morphological affixations. This separation simplifies the morphological analysis problem considerably, making it possible to use a Conditional Random Field model for joint tokenization and “core” part-of-speech tagging of the open-class items, while the closed-class items are handled by regular expressions. This work is therefore situated between data-driven approaches and those that use a morphological analyzer. For the tasks of tokenization and core part-of-speech tagging, the resulting system outperforms, on the given test set, a system that incorporates a morphological analyzer. We also evaluate the effects of the differences on parser performance when the tagger output is used for parser input.

meeting of the association for computational linguistics | 2014

The Penn Parsed Corpus of Modern British English: First Parsing Results and Analysis

Seth Kulick; Anthony S. Kroch; Beatrice Santorini

This paper presents the first results on parsing the Penn Parsed Corpus of Modern British English (PPCMBE), a million-word historical treebank with an annotation style similar to that of the Penn Treebank (PTB). We describe key features of the PPCMBE annotation style that differ from the PTB, and present some experiments with tree transformations to better compare the results to the PTB. First steps in parser analysis focus on problematic structures created by the parser.
