Keith Suderman | Researchain

Publication

Featured research published by Keith Suderman.

linguistic annotation workshop | 2007

GrAF: A Graph-based Format for Linguistic Annotations

Nancy Ide; Keith Suderman

In this paper we describe the Graph Annotation Format (GrAF) and show how it is used to represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized using standard graph algorithms and tools. We also discuss how, as a standard graph representation, it allows for the application of well-established graph traversal and analysis algorithms to produce information about interactions and commonalities among merged annotations. GrAF is an extension of the Linguistic Annotation Framework (LAF) (Ide and Romary, 2004, 2006) developed within ISO TC37 SC4 and as such, implements state-of-the-art best practice guidelines for representing linguistic annotations.

language resources and evaluation | 2014

The Linguistic Annotation Framework: a standard for annotation interchange and merging

Nancy Ide; Keith Suderman

This paper gives an overview of the International Standards Organization–Linguistic Annotation Framework (ISO–LAF) developed in ISO TC37 SC4. We describe the XML serialization of ISO–LAF, the Graph Annotation Format (GrAF), and discuss the rationale behind the various decisions that were made in determining the standard. We describe the structure of the GrAF headers in detail and provide multiple examples of GrAF representation for text and multi-media. Finally, we discuss the next steps for standardization of interchange formats for linguistic annotations.

NLPXML '06 Proceedings of the 5th Workshop on NLP and XML: Multi-Dimensional Markup in Natural Language Processing | 2006

Layering and merging linguistic annotations

Keith Suderman; Nancy Ide

The American National Corpus and its annotations are represented in a stand-off XML format compliant with the specifications of ISO TC37 SC4 WG1's Linguistic Annotation Framework. Because few systems that enable search and access of the corpus currently support stand-off markup, the project has developed a SAX-like parser that generates ANC data with annotations in-line, in a variety of output formats.

WLSI 2015 Revised Selected Papers of the Second International Workshop on Worldwide Language Service Infrastructure - Volume 9442 | 2015

The LAPPS Interchange Format

Marc Verhagen; Keith Suderman; Di Wang; Nancy Ide; Chunqi Shi; Jonathan Wright; James Pustejovsky

We describe and motivate the LAPPS Interchange Format, a JSON-LD format that is used for data transfer between language services in the Language Application Grid. The LAPPS Interchange Format enables syntactic and semantic interoperability of language services by providing a uniform syntax for common linguistic data and by using the Linked Data aspect of JSON-LD to refer to external definitions of linguistic categories. It is tightly integrated with the Web Services Exchange Vocabulary, which specifies a terminology for a core of linguistic objects and features exchanged by services.

linguistic annotation workshop | 2017

Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs

Richard Eckart de Castilho; Nancy Ide; Emanuele Lapponi; Stephan Oepen; Keith Suderman; Erik Velldal; Marc Verhagen

For decades, most self-respecting linguistic engineering initiatives have designed and implemented custom representations for various layers of, for example, morphological, syntactic, and semantic analysis. Despite occasional efforts at harmonization or even standardization, our field today is blessed with a multitude of ways of encoding and exchanging linguistic annotations of these types, both at the levels of ‘abstract syntax’, naming choices, and of course file formats. To a large degree, it is possible to work within and across design plurality by conversion, and often there may be good reasons for divergent design reflecting differences in use. However, it is likely that some abstract commonalities across choices of representation are obscured by more superficial differences, and conversely there is no obvious procedure to tease apart what actually constitute contentful vs. mere technical divergences. In this study, we seek to conceptually align three representations for common types of morpho-syntactic analysis, pinpoint what in our view constitute contentful differences, and reflect on the underlying principles and specific requirements that led to individual choices. We expect that a more in-depth understanding of these choices across designs may lead to increased harmonization, or at least to more informed design of future representations.

language resources and evaluation | 2004