Florian Laws
University of Stuttgart
Publication
Featured research published by Florian Laws.
international conference on computational linguistics | 2008
Helmut Schmid; Florian Laws
We present an HMM part-of-speech tagging method which is particularly suited for POS tagsets with a large number of fine-grained tags. It is based on three ideas: (1) splitting of the POS tags into attribute vectors and decomposition of the contextual POS probabilities of the HMM into a product of attribute probabilities, (2) estimation of the contextual probabilities with decision trees, and (3) use of high-order HMMs. In experiments on German and Czech data, our tagger outperformed state-of-the-art POS taggers.
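Idea (1) above can be illustrated with a chain-rule factorization. This is only a sketch of one way to write the decomposition, not necessarily the paper's exact formulation: the notation is assumed here, with $t_i$ the tag at position $i$, $a_{i,1},\dots,a_{i,m}$ its attributes (for example word class, case, number), and $n$ the order of the HMM.

```latex
% Assumed notation: t_i = (a_{i,1}, ..., a_{i,m}) is a tag split into m attributes,
% and the context is the n-1 preceding tags of an order-n HMM.
\[
  P(t_i \mid t_{i-n+1}, \dots, t_{i-1})
  \;=\;
  \prod_{k=1}^{m}
  P\bigl(a_{i,k} \mid a_{i,1}, \dots, a_{i,k-1},\; t_{i-n+1}, \dots, t_{i-1}\bigr)
\]
```

Each factor on the right predicts a single attribute, so it has far fewer outcomes than a full fine-grained tag. That is what makes estimating the factors separately, as in idea (2), attractive.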
international conference on computational linguistics | 2008
Florian Laws; Hinrich Schütze
Active learning is a proven method for reducing the cost of creating the training sets that are necessary for statistical NLP. However, there has been little work on stopping criteria for active learning. An operational stopping criterion is necessary to be able to use active learning in NLP applications. We investigate three different stopping criteria for active learning of named entity recognition (NER) and show that one of them, gradient-based stopping, (i) reliably stops active learning, (ii) achieves near-optimal NER performance, and (iii) needs only about 20% as much training data as exhaustive labeling.
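The abstract does not spell out how gradient-based stopping is computed. The following is a minimal sketch of one plausible reading: stop once a smoothed per-iteration score curve has flattened. The function name `should_stop`, the choice of score, the window size, and the threshold are all hypothetical and are not taken from the paper.

```python
def should_stop(scores, window=3, epsilon=0.001):
    """Return True once the smoothed score curve has flattened.

    scores  -- one value per active-learning iteration (hypothetical: e.g. an
               estimate of model performance or confidence on unlabeled data)
    window  -- number of iterations averaged to smooth out noise (assumed)
    epsilon -- largest change in the smoothed score still treated as "flat" (assumed)
    """
    # Two non-overlapping windows are needed to estimate a slope.
    if len(scores) < 2 * window:
        return False
    recent = sum(scores[-window:]) / window
    earlier = sum(scores[-2 * window:-window]) / window
    # The difference of the two window means approximates the gradient.
    return abs(recent - earlier) < epsilon
```

In an active-learning loop, a check like this would run after each retraining round, with the newest score appended to `scores`; labeling stops the first time it returns `True`.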
north american chapter of the association for computational linguistics | 2009
Katrin Tomanek; Florian Laws; Udo Hahn; Hinrich Schütze
Active learning is an effective method for creating training sets cheaply, but it is a biased sampling process and fails to explore large regions of the instance space in many applications. This can result in a missed cluster effect, which significantly lowers recall and slows down learning for infrequent classes. We show that missed clusters can be avoided in sequence classification tasks by using sentences as natural multi-instance units for labeling. Co-selection of other tokens within sentences provides an implicit exploratory component: for the task of named entity recognition on two corpora, we found that entity classes co-occur with sufficient frequency within sentences.
Proceedings of the Workshop on Geometrical Models of Natural Language Semantics | 2009
Beate Dorow; Florian Laws; Lukas Michelbacher; Christian Scheible; Jason Utt
This paper presents a graph-theoretic approach to the identification of yet-unknown word translations. The proposed algorithm is based on the recursive SimRank algorithm and relies on the intuition that two words are similar if they establish similar grammatical relationships with similar other words. We also present a formulation of SimRank in matrix form and extensions for edge weights, edge labels and multiple graphs.
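For orientation, plain SimRank can be iterated in matrix form as S ← max(C · Wᵀ S W, I), where W is the column-normalized adjacency matrix and C is a decay factor. The sketch below covers only this standard, unweighted, single-graph version. It does not implement the paper's extensions for edge weights, edge labels, or multiple graphs. The function name, the default decay of 0.8, and the iteration count are assumptions.

```python
def simrank(adjacency, decay=0.8, iterations=10):
    """Plain SimRank on a directed graph given as a 0/1 adjacency matrix.

    adjacency[i][j] == 1 means there is an edge i -> j.
    Returns an n x n list of similarity scores.
    """
    n = len(adjacency)
    # Column-normalize: w[i][j] = 1 / |in-neighbors of j| if i -> j, else 0.
    in_degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    w = [[adjacency[i][j] / in_degree[j] if in_degree[j] else 0.0
          for j in range(n)] for i in range(n)]
    # Start from the identity: every node is maximally similar to itself.
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        # t = S W
        t = [[sum(s[i][k] * w[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # s_new = decay * W^T t
        s_new = [[decay * sum(w[k][i] * t[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        # Self-similarity is pinned to 1 after every step.
        for i in range(n):
            s_new[i][i] = 1.0
        s = s_new
    return s
```

On a toy graph with edges 0 → 1 and 0 → 2, nodes 1 and 2 share their only in-neighbor, so their similarity equals the decay factor, while node 0 (no in-neighbors) has similarity 0 to both.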
empirical methods in natural language processing | 2011
Florian Laws; Christian Scheible; Hinrich Schütze
international conference on computational linguistics | 2010
Florian Laws; Lukas Michelbacher; Beate Dorow; Christian Scheible; Ulrich Heid; Hinrich Schütze
language resources and evaluation | 2010
Lukas Michelbacher; Florian Laws; Beate Dorow; Ulrich Heid; Hinrich Schütze
international conference on computational linguistics | 2010
Christian Scheible; Florian Laws; Lukas Michelbacher; Hinrich Schütze
north american chapter of the association for computational linguistics | 2012
Florian Laws; Florian Heimerl; Hinrich Schütze
international conference on document analysis and recognition | 2017
Rasmus Berg Palm; Ole Winther; Florian Laws