André Kempe
Xerox
Publication
Featured research published by André Kempe.
international conference on computational linguistics | 1996
André Kempe; Lauri Karttunen
This paper extends the calculus of regular expressions with new types of replacement expressions that enhance the expressiveness of the simple replace operator defined in Karttunen (1995). Parallel replacement allows multiple replacements to apply simultaneously to the same input without interfering with each other. We also allow a replacement to be constrained by any number of alternative contexts. With these enhancements, the general replacement expressions are more versatile than two-level rules for the description of complex morphological alternations.
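As a rough string-level illustration of what "simultaneously, without interfering" means, the sketch below applies several literal replacements in one left-to-right pass and contrasts that with applying them one after another. It is not the finite-state construction from the paper: the helper names `parallel_replace` and `sequential_replace`, the leftmost longest-match policy, and the restriction to non-empty literal strings are all assumptions made for the example.

```python
import re

def parallel_replace(text, rules):
    """Apply all literal replacements in `rules` in a single pass.

    Hypothetical helper for illustration only: matches are taken left to
    right, longer patterns are preferred, and replaced output is never
    rescanned, so one rule cannot feed another.
    """
    if not rules:
        return text
    # Longer patterns first so that "ab" wins over "a" at the same position.
    patterns = sorted(rules, key=len, reverse=True)
    matcher = re.compile("|".join(re.escape(p) for p in patterns))
    return matcher.sub(lambda m: rules[m.group(0)], text)

def sequential_replace(text, rules):
    """Apply the same rules one after another, for contrast."""
    for old, new in rules.items():
        text = text.replace(old, new)
    return text

swap = {"a": "b", "b": "a"}
print(parallel_replace("abba", swap))    # baab
print(sequential_replace("abba", swap))  # aaaa
```

In the sequential version the output of the first rule is consumed by the second, which is the kind of interference that parallel replacement is meant to avoid.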
meeting of the association for computational linguistics | 1997
André Kempe
This paper describes the conversion of a Hidden Markov Model into a sequential transducer that closely approximates the behavior of the stochastic model. This transformation is especially advantageous for part-of-speech tagging because the resulting transducer can be composed with other transducers that encode correction rules for the most frequent tagging errors. The speed of tagging is also improved. The described methods have been implemented and successfully tested on six languages.
international conference on implementation and application of automata | 2003
André Kempe; Christof Baeijs; Tamás Gaál; Franck Guingne; Florent Nicart
This article presents a new tool, WFSC, for creating, manipulating, and applying weighted finite state automata. It inherits some powerful features from Xerox's non-weighted XFST tool and represents a continuation of Xerox's work in the field of finite state automata over two decades. The design is generic: algorithms work on abstract components of automata and on a generic abstract semiring, and are independent of their concrete realizations. Applications can access WFSC's functions through an API or create automata through an end-user interface, either from an enumeration of their states and transitions or from rational expressions.
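The idea of writing an algorithm once against an abstract semiring can be sketched as follows. This is not WFSC's API: the `Semiring` class, the `REAL` and `TROPICAL` instances, and the `total_weight` function are hypothetical names chosen for the example, and paths are given directly as lists of arc weights rather than read from an automaton.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass(frozen=True)
class Semiring:
    """Hypothetical abstract semiring (not a WFSC type)."""
    plus: Callable[[float, float], float]   # combines alternative paths
    times: Callable[[float, float], float]  # extends a path by one arc
    zero: float                             # identity element for plus
    one: float                              # identity element for times

# Two concrete realizations of the same abstract interface.
REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
TROPICAL = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)

def total_weight(paths: Iterable[List[float]], sr: Semiring) -> float:
    """Combine the weights of several paths using only semiring operations."""
    total = sr.zero
    for arc_weights in paths:
        path_weight = sr.one
        for w in arc_weights:
            path_weight = sr.times(path_weight, w)
        total = sr.plus(total, path_weight)
    return total

paths = [[1.0, 2.0], [3.0, 0.5]]
print(total_weight(paths, REAL))      # 3.5
print(total_weight(paths, TROPICAL))  # 3.0
```

The same `total_weight` code yields a sum of products under `REAL` and a cheapest-path cost under `TROPICAL`, which is the sense in which an algorithm can be independent of the concrete semiring.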
international conference on implementation and application of automata | 2005
André Kempe; Jean-Marc Champarnaud; Jason Eisner; Franck Guingne; Florent Nicart
Weighted finite-state machines with n tapes describe n-ary rational string relations. The join of n-ary relations is very important in applications. It is shown how to compute it via a simpler operation, the auto-intersection. Join and auto-intersection generally do not preserve rationality. We define a class of triples (A,i,j) such that the auto-intersection of the machine A on tapes i and j can be computed by a delay-based algorithm. We point out how to extend this class and hope that it is sufficient for many practical applications.
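To make the relationship between join and auto-intersection concrete, here is a minimal sketch on finite, unweighted relations represented as Python sets of string tuples. It assumes the decomposition "cross product, then auto-intersection on the two joined tapes, then removal of the duplicated tape"; the function names and 0-based tape indices are hypothetical. The delay-based algorithm of the paper, which operates on weighted multi-tape machines describing possibly infinite relations, is not reproduced here.

```python
def auto_intersection(relation, i, j):
    """Keep only the tuples whose strings on tapes i and j are equal."""
    return {t for t in relation if t[i] == t[j]}

def cross_product(r1, r2):
    """Concatenate every tuple of r1 with every tuple of r2."""
    return {a + b for a in r1 for b in r2}

def drop_tape(relation, k):
    """Remove tape k from every tuple."""
    return {t[:k] + t[k + 1:] for t in relation}

def join(r1, r2, i, j):
    """Join r1 and r2 on tape i of r1 and tape j of r2 (0-based indices)."""
    if not r1 or not r2:
        return set()
    n = len(next(iter(r1)))  # arity of r1; tape j of r2 becomes tape n + j
    combined = cross_product(r1, r2)
    matched = auto_intersection(combined, i, n + j)
    return drop_tape(matched, n + j)

r1 = {("ab", "x"), ("cd", "y")}
r2 = {("x", "1"), ("z", "2")}
print(join(r1, r2, 1, 0))  # {('ab', 'x', '1')}
```

On finite sets this always terminates. The difficulty addressed in the abstract arises for rational relations, where the auto-intersection need not be rational.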
international conference on implementation and application of automata | 2000
André Kempe
We present a method of constructing and using a cascade consisting of a left- and a right-sequential finite-state transducer (FST), T1 and T2, for part-of-speech (POS) disambiguation. Compared to a Hidden Markov model (HMM), this FST cascade has the advantage of significantly higher processing speed, but at the cost of slightly lower accuracy. Applications such as Information Retrieval, where the speed can be more important than accuracy, could benefit from this approach.
Theoretical Computer Science | 2004
André Kempe
Much attention has been paid to determinization and ε-removal in previous work. This article describes an algorithm for extracting all ε-cycles, which are a particular type of nondeterminism, from an arbitrary finite-state transducer (FST). The algorithm decomposes the FST, T, into two FSTs, T1 and T2, such that T1 contains no ε-cycles and T2 contains all ε-cycles of T. The article also proposes an alternative approach where each ε-cycle of T is replaced by a single transition with a complex label that describes the output of the cycle. Since ε-cycles are an obstacle for some algorithms such as the decomposition of ambiguous FSTs, the proposed approaches allow us to bypass this problem. ε-Cycles can be extracted or recoded before and re-inserted (by composition) after such algorithms.
conference on computational natural language learning | 1998
André Kempe
This paper describes the conversion of a Hidden Markov Model into a finite state transducer that closely approximates the behavior of the stochastic model. In some cases the transducer is equivalent to the HMM. This conversion is especially advantageous for part-of-speech tagging because the resulting transducer can be composed with other transducers that encode correction rules for the most frequent tagging errors. The speed of tagging is also improved. The described methods have been implemented and successfully tested.
finite state methods and natural language processing | 2005
André Kempe; Jean-Marc Champarnaud; Franck Guingne; Florent Nicart
The join of two n-ary string relations is a central operation in applications. n-Ary rational string relations are realized by weighted finite-state machines with n tapes. We provide an algorithm that computes the join of two machines via a simpler operation, the auto-intersection. The two operations generally do not preserve rationality. A delay-based algorithm is described for the case of a single tape pair, as well as the class of auto-intersections that it handles. It is generalized to multiple tape pairs and some enhancements are discussed.
International Journal of Foundations of Computer Science | 2007
Florent Nicart; Jean-Marc Champarnaud; Tibor Csáki; Tamás Gaál; André Kempe
Rational relations are a powerful model used in many domains such as natural language processing. In this article, we propose a new model of finite state automata: multi-tape automata with symbol classes and identity or non-identity constraints. This model generalizes classical multi-tape automata, as well as automata and transducers with extended alphabet. We define this model in terms of a constraint satisfaction problem and discuss a problem occurring when handling the projection operation. Finally, we describe its implementation and results of a performance test.
conference on implementation and application of automata | 2004
Franck Guingne; Florent Nicart; André Kempe
This article estimates the worst-case running time complexity for traversing and printing all successful paths of a normalized trim acyclic automaton. First, we show that the worst-case structure is a festoon. Then, we prove that the complexity is maximal when we have a distribution of e (Napier's constant) outgoing arcs per state on average, and that it can be exponential in the number of arcs.
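The role of e can be illustrated with a simplified count. The sketch assumes, for the sake of the example, that a festoon is a chain of states in which each state is connected to the next by the same number k of parallel arcs. With a budget of m arcs there are m/k links, so the number of successful paths is k^(m/k), and maximizing (m/k)·ln k over real k gives k = e. The function names are hypothetical, and the count ignores the cost of printing each path.

```python
import math

def festoon_paths(total_arcs, arcs_per_state):
    """Number of successful paths in a uniform festoon (simplified model)."""
    if total_arcs % arcs_per_state != 0:
        raise ValueError("arcs_per_state must divide total_arcs")
    links = total_arcs // arcs_per_state
    # Each link offers arcs_per_state independent choices.
    return arcs_per_state ** links

def continuous_bound(total_arcs):
    """Value of k**(m/k) at its real-valued maximum k = e, i.e. e**(m/e)."""
    return math.exp(total_arcs / math.e)

m = 12
counts = {k: festoon_paths(m, k) for k in (1, 2, 3, 4, 6, 12)}
best = max(counts, key=counts.get)
print(counts)  # {1: 1, 2: 64, 3: 81, 4: 64, 6: 36, 12: 12}
print(best)    # 3
```

Among the integer choices the maximum falls at k = 3, the integer nearest e, and the bound e^(m/e) grows exponentially with the number of arcs m.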