Bernd Bohnet | Researchain

Publication

Featured research published by Bernd Bohnet.

conference on computational natural language learning | 2009

Efficient Parsing of Syntactic and Semantic Dependency Structures

Bernd Bohnet

In this paper, we describe our system for the 2009 CoNLL shared task on joint parsing of syntactic and semantic dependency structures in multiple languages. Our system combines and implements efficient parsing techniques to achieve high accuracy as well as very good parsing and training times. For applications of syntactic and semantic parsing, parsing time and memory footprint are very important. We think that system development can also profit from this, since one can perform more experiments in the given time. For the subtask of syntactic dependency parsing, we reached second place with an average accuracy of 85.68, which is only 0.09 points behind the first-ranked system. For this task, our system has the highest accuracy for English with 89.88, German with 87.48, and the out-of-domain data with an average of 78.79. The semantic role labeler does not work as well as our parser, and we therefore reached fourth place (ranked by the macro F1 score) in the joint task for syntactic and semantic dependency parsing.

international conference on natural language generation | 2000

A Development Environment for an MTT-Based Sentence Generator

Bernd Bohnet; Andreas Langjahr; Leo Wanner

With the rising standard of the state of the art in text generation and the increase of the number of practical generation applications, it becomes more and more important to provide means for the maintenance of the generator, i.e. its extension, modification, and monitoring by grammarians who are not familiar with its internals. However, only a few sentence and text generators developed to date actually provide these means. One of these generators is KPML (Bateman, 1997). KPML comes with a Development Environment and there is no doubt about the contribution of this environment to the popularity of the systemic approach in generation.

international joint conference on natural language processing | 2015

Inverted indexing for cross-lingual NLP

Anders Søgaard; Željko Agić; Héctor Martínez Alonso; Barbara Plank; Bernd Bohnet; Anders Johannsen

We present a novel, count-based approach to obtaining inter-lingual word representations based on inverted indexing of Wikipedia. We present experiments applying these representations to 17 datasets in document classification, POS tagging, dependency parsing, and word alignment. Our approach has the advantage that it is simple, computationally efficient and almost parameter-free, and, more importantly, it enables multi-source crosslingual learning. In 14/17 cases, we improve over using state-of-the-art bilingual embeddings.
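The core idea of the abstract above, representing a word by counts over the language-independent concepts in whose articles it occurs, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the concept identifiers (`Q_dog`, `Q_cat`), the toy article texts, the whitespace tokenization, and the helper names `inverted_index` and `cosine`. The paper's actual preprocessing and weighting are not reproduced.

```python
import math
from collections import Counter, defaultdict


def inverted_index(articles):
    """Map each token to a count vector over concept ids.

    `articles` is an iterable of (concept_id, text) pairs, where articles in
    different languages about the same topic share one concept id.
    """
    index = defaultdict(Counter)
    for concept_id, text in articles:
        for token in text.lower().split():
            index[token][concept_id] += 1
    return index


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical toy data: two concepts, each with an English and a German article.
articles = [
    ("Q_dog", "the dog barks"),
    ("Q_dog", "der hund bellt"),
    ("Q_cat", "the cat purrs"),
    ("Q_cat", "die katze schnurrt"),
]
index = inverted_index(articles)
```

Because words from every language share the same concept dimensions, `cosine(index["dog"], index["hund"])` is higher than `cosine(index["dog"], index["katze"])` on this toy data, which is the property that makes multi-source cross-lingual learning possible.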

Applications of Graph Transformations with Industrial Relevance | 2008

Generation of Sierpinski Triangles: A Case Study for Graph Transformation Tools

Gabriele Taentzer; Enrico Biermann; Dénes Bisztray; Bernd Bohnet; Iovka Boneva; Artur Boronat; Leif Geiger; Rubino Geiß; Ákos Horváth; Ole Kniemeyer; Tom Mens; Benjamin Ness; Tamás Vajk

In this paper, we consider a large variety of solutions for the generation of Sierpinski triangles, one of the case studies for the AGTIVE graph transformation tool contest [15]. A Sierpinski triangle shows a well-known fractal structure. This case study is mostly a performance benchmark, involving the construction of all triangles up to a certain number of iterations. Both time and space performance are involved. The transformation rules themselves are quite simple.
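To make the benchmark concrete, the following is a minimal sketch of the construction step, written as plain geometric subdivision rather than as a graph transformation rule (none of the contest tools is used or modeled here). The function names and the starting coordinates are assumptions chosen for illustration. Each iteration replaces every triangle with three smaller ones, so the number of triangles grows as 3^n, which is why both time and space performance matter.

```python
from fractions import Fraction


def midpoint(p, q):
    """Exact midpoint of two points given as (x, y) pairs of Fractions."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def subdivide(triangle):
    """Replace one triangle with the three corner triangles of its subdivision."""
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c)]


def sierpinski(iterations):
    """Return all triangles present after the given number of iterations."""
    start = (
        (Fraction(0), Fraction(0)),
        (Fraction(1), Fraction(0)),
        (Fraction(1, 2), Fraction(1)),
    )
    triangles = [start]
    for _ in range(iterations):
        triangles = [small for tri in triangles for small in subdivide(tri)]
    return triangles
```

Calling `len(sierpinski(n))` returns `3 ** n`, so even a moderate number of iterations produces a structure large enough to stress a tool's memory use.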

Applied Artificial Intelligence | 2010

MARQUIS: GENERATION OF USER-TAILORED MULTILINGUAL AIR QUALITY BULLETINS

Leo Wanner; Bernd Bohnet; Nadjet Bouayad-Agha; François Lareau; Daniel Nicklass

Air pollution has a major influence on health. It is thus not surprising that air quality (AQ) increasingly becomes a central issue in the environmental information policy worldwide. The most common way to deliver AQ information is in terms of graphics, tables, pictograms, or color scales that display either the concentrations of the pollutant substances or the corresponding AQ indices. However, all of these presentation modi lack the explanatory dimension; nor can they be easily tailored to the needs of the individual users. MARQUIS is an AQ information generation service that produces user-tailored multilingual bulletins on the major measured and forecasted air pollution substances and their relevance to human health in five European regions. It incorporates modules for the assessment of pollutant time series episodes with respect to their relevance to a given addressee, for planning of the discourse structure of the bulletins and the selection of the adequate presentation mode, and for generation proper. The positive evaluation of the bulletins produced by MARQUIS by users shows that the use of automatic text generation techniques in such a complex and sensitive application is feasible.

Computer Speech & Language | 2006

Making sense of collocations

Leo Wanner; Bernd Bohnet; Mark Giereth

Lexico-semantic collocations (LSCs) are a prominent type of multiword expressions. Over the last decade, the automatic compilation of LSCs from text corpora has been addressed in a significant number of works. However, very often, the output of an LSC-extraction program is a plain list of LSCs. While useful as raw material for dictionary construction, plain lists of LSCs are of rather limited use in NLP applications. For NLP, LSCs must be assigned syntactic and, especially, semantic information. Our goal is to develop an "off-the-shelf" LSC-acquisition program that annotates each LSC identified in the corpus with its syntax and semantics. In this article, we address the annotation task as a classification task, viewing it as a machine learning problem. The LSC typology we use is that of the lexical functions from Explanatory Combinatorial Lexicology; as lexico-semantic resource, EuroWordNet has been used. The applied machine learning technique is a variant of the nearest-neighbor family, which is defined over lexico-semantic features of the elements of LSCs. The technique has been tested on Spanish verb–noun bigrams. © 2005 Elsevier Ltd. All rights reserved.

natural language generation | 2001

On using a parallel graph rewriting formalism in generation

Bernd Bohnet; Leo Wanner

In this paper, we present a parallel context sensitive graph rewriting formalism for a dependency-oriented generation grammar. The parallel processing of the input structure makes an explicit presentation of all alternative options for its mapping onto the output structure possible. This allows for the selection of the linguistic realization that suits best the communicative and contextual criteria available.

international conference on natural language generation | 2008

The fingerprint of human referring expressions and their surface realization with graph transducers

Bernd Bohnet

The algorithm IS-FP takes up the idea from the IS-FBN algorithm developed for the shared task 2007. Both algorithms learn the individual attribute selection style for each human that provided referring expressions to the corpus. The IS-FP algorithm was developed with two additional goals: (1) to improve the identification time, which was poor for the FBN algorithm, and (2) to push the Dice score even higher. In order to generate a word string for the selected attributes, we build, based on individual preferences, a surface syntactic dependency tree as input. We derive the individual preferences from the training set. Finally, a graph transducer maps the input structure to a deep morphologic structure.
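The Dice score mentioned above measures the overlap between the attribute set a system selects and the set a human used. A minimal sketch of the standard Dice coefficient follows; the function name, the attribute names in the usage note, and the convention of returning 1.0 for two empty sets are assumptions for illustration, not details taken from the paper.

```python
def dice(selected, reference):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two attribute sets."""
    selected, reference = set(selected), set(reference)
    if not selected and not reference:
        # Assumed convention: two empty selections count as a perfect match.
        return 1.0
    return 2 * len(selected & reference) / (len(selected) + len(reference))
```

For example, a system that selects `{"colour", "type"}` where the human used `{"colour", "type", "size"}` shares two attributes out of five in total, giving 2 * 2 / 5 = 0.8.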

north american chapter of the association for computational linguistics | 2015

Data-driven sentence generation with non-isomorphic trees

Miguel Ballesteros; Bernd Bohnet; Simon Mille; Leo Wanner

Abstract structures from which the generation naturally starts often do not contain any functional nodes, while surface-syntactic structures or a chain of tokens in a linearized tree contain all of them. Therefore, data-driven linguistic generation needs to be able to cope with the projection between non-isomorphic structures that differ in their topology and number of nodes. So far, such a projection has been a challenge in data-driven generation and was largely avoided. We present a fully stochastic generator that is able to cope with projection between non-isomorphic structures. The generator, which starts from PropBank-like structures, consists of a cascade of SVM-classifier based submodules that map in a series of transitions the input structures onto sentences. The generator has been evaluated for English on the Penn-Treebank and for Spanish on the multi-layered Ancora-UPF corpus.

international conference on acoustics, speech, and signal processing | 2010

Evaluation of semantic role labeling and dependency parsing of automatic speech recognition output

Benoit Favre; Bernd Bohnet; Dilek Hakkani-Tür

Semantic role labeling (SRL) is an important module of spoken language understanding systems. This work extends the standard evaluation metrics for joint dependency parsing and SRL of text in order to be able to handle speech recognition output with word errors and sentence segmentation errors. We propose metrics based on word alignments and bags of relations, and compare their results on the output of several SRL systems on broadcast news and conversations of the OntoNotes corpus. We evaluate and analyze the relation between the performance of the subtasks that lead to SRL, including ASR, part-of-speech tagging or sentence segmentation. The tools are made available to the community.
