Florian Laws | Researchain

Publication

Featured research published by Florian Laws.

International Conference on Computational Linguistics | 2008

Estimation of Conditional Probabilities With Decision Trees and an Application to Fine-Grained POS Tagging

Helmut Schmid; Florian Laws

We present an HMM part-of-speech tagging method which is particularly suited for POS tagsets with a large number of fine-grained tags. It is based on three ideas: (1) splitting of the POS tags into attribute vectors and decomposition of the contextual POS probabilities of the HMM into a product of attribute probabilities, (2) estimation of the contextual probabilities with decision trees, and (3) use of high-order HMMs. In experiments on German and Czech data, our tagger outperformed state-of-the-art POS taggers.

International Conference on Computational Linguistics | 2008

Stopping Criteria for Active Learning of Named Entity Recognition

Florian Laws; Hinrich Schütze

Active learning is a proven method for reducing the cost of creating the training sets that are necessary for statistical NLP. However, there has been little work on stopping criteria for active learning. An operational stopping criterion is necessary to be able to use active learning in NLP applications. We investigate three different stopping criteria for active learning of named entity recognition (NER) and show that one of them, gradient-based stopping, (i) reliably stops active learning, (ii) achieves near-optimal NER performance, and (iii) needs only about 20% as much training data as exhaustive labeling.

North American Chapter of the Association for Computational Linguistics | 2009

On Proper Unit Selection in Active Learning: Co-Selection Effects for Named Entity Recognition

Katrin Tomanek; Florian Laws; Udo Hahn; Hinrich Schütze

Active learning is an effective method for creating training sets cheaply, but it is a biased sampling process and fails to explore large regions of the instance space in many applications. This can result in a missed cluster effect, which significantly lowers recall and slows down learning for infrequent classes. We show that missed clusters can be avoided in sequence classification tasks by using sentences as natural multi-instance units for labeling. Co-selection of other tokens within sentences provides an implicit exploratory component since we found for the task of named entity recognition on two corpora that entity classes co-occur with sufficient frequency within sentences.

Proceedings of the Workshop on Geometrical Models of Natural Language Semantics | 2009

A Graph-Theoretic Algorithm for Automatic Extension of Translation Lexicons

Beate Dorow; Florian Laws; Lukas Michelbacher; Christian Scheible; Jason Utt

This paper presents a graph-theoretic approach to the identification of yet-unknown word translations. The proposed algorithm is based on the recursive SimRank algorithm and relies on the intuition that two words are similar if they establish similar grammatical relationships with similar other words. We also present a formulation of SimRank in matrix form and extensions for edge weights, edge labels and multiple graphs.
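
For readers unfamiliar with SimRank, the following is a minimal sketch of the standard algorithm in matrix form, not the authors' implementation. It covers only the plain, unweighted, single-graph case; the edge-weight, edge-label and multiple-graph extensions mentioned in the abstract are not shown. The function name `simrank`, the parameters `decay` and `iterations`, and the toy graph are assumptions made for illustration.

```python
def simrank(adjacency, decay=0.8, iterations=10):
    """Plain SimRank in matrix form (illustrative sketch).

    adjacency[i][j] == 1 means there is an edge from node i to node j.
    Two nodes are similar if their in-neighbours are similar.
    """
    n = len(adjacency)

    # Column-normalise: w[i][j] = 1 / in_degree(j) if i -> j, else 0.
    in_degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    w = [[adjacency[i][j] / in_degree[j] if in_degree[j] else 0.0
          for j in range(n)] for i in range(n)]
    w_t = [list(col) for col in zip(*w)]  # transpose of w

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    # Start from the identity: every node is fully similar to itself only.
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        # S <- decay * W^T S W, then reset the diagonal to 1.
        propagated = matmul(matmul(w_t, s), w)
        s = [[1.0 if i == j else decay * propagated[i][j] for j in range(n)]
             for i in range(n)]
    return s


# Toy graph (hypothetical): node 0 points to nodes 1 and 2.
graph = [
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
]
scores = simrank(graph)
```

In this toy graph, nodes 1 and 2 share their only in-neighbour, so their similarity equals the decay factor, while node 0 has no in-neighbours and is similar only to itself.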

Empirical Methods in Natural Language Processing | 2011