Publication


Featured research published by Ulrich Heid.


Behavior Research Methods, Instruments, & Computers | 2003

The NITE XML Toolkit: Flexible annotation for multimodal language data

Jean Carletta; Stefan Evert; Ulrich Heid; Jonathan Kilgour; Judy Robertson; Holger Voormann

Multimodal corpora that show humans interacting via language are now relatively easy to collect. Current tools allow one either to apply sets of time-stamped codes to the data and consider their timing and sequencing or to describe some specific linguistic structure that is present in the data, built over the top of some form of transcription. To further our understanding of human communication, the research community needs code sets with both timings and structure, designed flexibly to address the research questions at hand. The NITE XML Toolkit offers library support that software developers can call upon when writing tools for such code sets and, thus, enables richer analyses than have previously been possible. It includes data handling, a query language containing both structural and temporal constructs, components that can be used to build graphical interfaces, sample programs that demonstrate how to use the libraries, a tool for running queries, and an experimental engine that builds interfaces on the basis of declarative specifications.


Language Resources and Evaluation | 2005

The NITE XML Toolkit: Data Model and Query Language

Jean Carletta; Stefan Evert; Ulrich Heid; Jonathan Kilgour

The NITE XML Toolkit (NXT) is open source software for working with language corpora, with particular strengths for multimodal and heavily cross-annotated data sets. In NXT, annotations are described by types and attribute value pairs, and can relate to signal via start and end times, to representations of the external environment, and to each other via either an arbitrary graph structure or a multi-rooted tree structure characterized by both temporal and structural orderings. Simple queries in NXT express variable bindings for n-tuples of objects, optionally constrained by type, and give a set of conditions on the n-tuples combined with boolean operators. The defined operators for the condition tests allow full access to the timing and structural properties of the data model. A complex query facility passes variable bindings from one query to another for filtering, returning a tree structure. In addition to describing NXT's core data handling and search capabilities, we explain the stand-off XML data storage format that it employs and illustrate its use with examples from an early adopter of the technology.
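
The idea of a simple query, as summarized in this abstract, can be illustrated with a minimal Python sketch. It is not NXT's API or its query language syntax: the `Annotation` class, the `query` and `overlaps` helpers, and the toy corpus are all hypothetical names and data invented for illustration. The sketch binds one variable per requested type, enumerates the resulting n-tuples, and keeps those that satisfy a boolean condition over attributes and timing.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for an annotation: a type, attribute-value pairs, signal timing."""
    id: str
    type: str
    start: float
    end: float
    attrs: tuple = ()  # (name, value) pairs

    def attr(self, name):
        # Look up an attribute value; returns None when the attribute is absent.
        return dict(self.attrs).get(name)


def overlaps(a, b):
    """Temporal test: the two annotations share some stretch of signal time."""
    return a.start < b.end and b.start < a.end


def query(corpus, var_types, condition):
    """Yield each n-tuple whose i-th member has type var_types[i]
    (None means any type) and for which condition(*members) is true."""
    pools = [[a for a in corpus if t is None or a.type == t] for t in var_types]
    for combo in product(*pools):
        if condition(*combo):
            yield combo


# Toy data, invented for this example.
corpus = [
    Annotation("w1", "word", 0.0, 0.4, (("pos", "DT"),)),
    Annotation("w2", "word", 0.4, 0.9, (("pos", "NN"),)),
    Annotation("g1", "gesture", 0.5, 1.2, (("kind", "point"),)),
]

# Pairs (word, gesture) where the word is a noun and overlaps the gesture in time.
matches = list(
    query(corpus, ["word", "gesture"],
          lambda w, g: w.attr("pos") == "NN" and overlaps(w, g))
)
print([(w.id, g.id) for w, g in matches])
```

A complex query in the sense of the abstract would feed the tuples returned here into a second `query` call as a filter; that step is left out to keep the sketch short.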


Machine Translation | 1992

Interactions between linguistic constraints: Procedural vs. declarative approaches

Martin C. Emele; Ulrich Heid; Stefan Momma; Rémi Zajac

The traditional approach to generation is to derive a surface string from a semantic structure through various intermediate levels using a carefully ordered set of transformation steps. We show by some examples that this approach involves a lot of specific control decisions which cannot be generalized across several languages. We present a constraint-based approach where all levels of linguistic information are represented in a single structure. All levels introduce constraints on the linguistic structure stated as a set of feature type definitions. Relationships between levels are modelled as a set of (partial) relational constraints which apply simultaneously on all levels of the linguistic structure.


Lexicographica | 2012

Dictionary and corpus data in a common portal: state of the art and requirements for the future

Ulrich Heid; Daan J. Prinsloo; T.J.D. Bothma

Some recent dictionaries include corpus lines, links to concordances or to internet pages, or other links to dictionary-external data. Lexicographers present such external data to the user either to complement their lexicographic descriptions, or even instead of such lexicographically processed material, when the latter would be redundant to an authoritative source, or when it is simply not available. In this article, we critically review some dictionaries that offer such devices, and we make an attempt at a classification of the ways in which dictionary-internal data and dictionary-external material are related. On this basis, we come up with some proposals both for future research on the topic and for future lexicographic realizations.


International Conference on Computational Linguistics | 1990

Organizing linguistic knowledge for multilingual generation

Martin C. Emele; Ulrich Heid; Stefan Momma; Rémi Zajac

We propose an architecture for the organisation of linguistic knowledge which allows to (1) separately formulate generalizations for different types of linguistic information, and (2) state interrelations between partial information belonging to different levels of description. We use typed feature structures for encoding linguistic knowledge. We show the application of this representational device for the architecture of linguistic knowledge sources for multilingual generation. As an example, we describe the use of interacting collocational and syntactic constraints in the generation of French and German sentences.


Language Resources and Evaluation | 2009

Multilingual language resources and interoperability

Andreas Witt; Ulrich Heid; Felix Sasaki; Gilles Sérasset

This article introduces the topic of “Multilingual language resources and interoperability”. We start with a taxonomy and parameters for classifying language resources. Later we provide examples and issues of interoperability, and resource architectures to solve such issues. Finally we discuss aspects of linguistic formalisms and interoperability.


Proceedings of the First Workshop on Language Technologies for African Languages | 2009

Part-of-Speech Tagging of Northern Sotho: Disambiguating Polysemous Function Words

Gertrud Faass; Ulrich Heid; Elsabé Taljard; D. J. Prinsloo

A major obstacle to part-of-speech (POS) tagging of Northern Sotho (Bantu, S 32) is its ambiguous function words. Many are highly polysemous and very frequent in texts, and their local context is not always distinctive. With certain taggers, this issue leads to comparatively poor results (between 88 and 92% accuracy), especially when sizeable tagsets (over 100 tags) are used. We use the RF-tagger (Schmid and Laws, 2008), which is particularly designed for the annotation of fine-grained tagsets (e.g. including agreement information), and we restructure the 141 tags of the tagset proposed by Taljard et al. (2008) in a way to fit the RF tagger. This leads to over 94% accuracy. Error analysis in addition shows which types of phenomena cause trouble in the POS-tagging of Northern Sotho.


Lexikos | 2012

Devices for information presentation in electronic dictionaries

D. J. Prinsloo; Ulrich Heid; T.J.D. Bothma; Gertrud Faaß

Electronic dictionaries should support dictionary users by giving them guidance in text production and text reception, alongside a user-definable offer of lexicographic data for cognitive purposes. In this article, we sketch the principles of an interactive and dynamic electronic dictionary aimed at text production and text reception guiding users in innovative ways, especially with respect to difficult, complicated or confusing issues. The lexicographer has to do a very careful analysis of the nature of the possible problems to suggest an optimal solution for a specific problem. We are of the opinion that there are numerous complex situations where users need more detailed support than currently available in e-dictionaries, enabling them to make valid and correct choices. For highly complex situations, we suggest guidance through a decision tree-like device. We assume that the solutions proposed here are not specific to one language only but can, after careful analysis, be applied to e-dictionaries in different languages across the world.


Southern African Linguistics and Applied Language Studies | 2008

Designing a verb guesser for part of speech tagging in Northern Sotho

D. J. Prinsloo; Gertrud Faaß; Elsabé Taljard; Ulrich Heid

The aim of this article is to describe the design and implementation of a verb guesser that will enhance the results of statistical part of speech (POS) tagging of verbs in Northern Sotho. It will be illustrated that verb stems in Northern Sotho can successfully be recognised by examining their suffixes and combinations of suffixes. Two approaches to verbal derivation analysis will be utilised, namely morphological analysis and corpus querying of suffixes and combinations of suffixes.
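
Suffix-based recognition of this kind can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not the authors' guesser: the suffix list, the final ending, the function names `strip_suffixes` and `looks_like_verb`, and the sample tokens are all placeholders chosen for the example, and the article's actual suffix inventory and corpus queries are not reproduced here.

```python
# Placeholder endings for illustration only; not the inventory used in the article.
SUFFIXES = ("el", "iš", "an", "w")
FINAL_ENDING = "a"


def strip_suffixes(token, suffixes=SUFFIXES, final=FINAL_ENDING):
    """Peel the final ending and then any stacked suffixes off a token.

    Returns (candidate_stem, suffixes_found_outermost_first),
    or None when the token does not carry the final ending.
    """
    if not token.endswith(final):
        return None
    rest = token[: -len(final)]
    found = []
    changed = True
    while changed:
        changed = False
        # Try longer suffixes first so a short suffix cannot shadow a longer one.
        for s in sorted(suffixes, key=len, reverse=True):
            # Keep at least two characters of stem after stripping.
            if rest.endswith(s) and len(rest) > len(s) + 1:
                rest = rest[: -len(s)]
                found.append(s)
                changed = True
                break
    return rest, found


def looks_like_verb(token):
    """Guess 'verb' when at least one derivational suffix was found."""
    result = strip_suffixes(token)
    return result is not None and bool(result[1])


for tok in ("rekelana", "monna", "batho"):
    print(tok, strip_suffixes(tok), looks_like_verb(tok))
```

A token that merely ends in the final ending is not flagged, since the sketch requires at least one suffix from the list; a real guesser would also need to check candidates against corpus evidence, as the abstract indicates.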


Computational Linguistics in the Netherlands | 2002

A Dutch Chunker as a Basis for the Extraction of Linguistic Knowledge

Kristina Spranger; Ulrich Heid

We have developed a fully automatic recursive chunker for unrestricted Dutch text to be used as a basis for the extraction of linguistic and terminological information. The chunker is based on the approach adopted for the analysis of German in the YAC-chunker. Our tool builds up flat annotations of (maximal) syntactic constituents, using a multi-pass algorithm. We describe the chunking procedure and the coverage of the chunker with examples, e.g. PPs/NPs with prenominal modification, tegen de uit ioniserende stralingen voortspruitende gevaren or de te fuseren vennootschappen. We also illustrate its use in term candidate extraction from about 20 million words of social security documents from Flanders.

Collaboration


Dive into Ulrich Heid's collaborations.

Top Co-Authors

Stefan Evert

University of Erlangen-Nuremberg

Kurt Eberle

University of Stuttgart

Anita Gojun

University of Stuttgart
