Stephan Kepser
University of Tübingen
Publication
Featured research published by Stephan Kepser.
Theoretical Computer Science | 2006
Stephan Kepser; Uwe Mönnich
Context-free tree grammars, originally introduced by Rounds [Math. Systems Theory 4(3) (1970) 257-287], are powerful grammar devices for the definition of tree languages. The properties of the class of context-free tree languages have been studied for more than three decades now. Particularly important here is the work by Engelfriet and Schmidt [J. Comput. System Sci. 15(3) (1977) 328-353, 16(1) (1978) 67-99]. In the present paper, we consider a subclass of the class of context-free tree languages, namely the class of linear context-free tree languages. A context-free tree grammar is linear, if no rule permits the copying of subtrees. For this class of linear context-free tree languages we show that the grammar derivation mode, which is very important for the general class of context-free tree languages, is immaterial. The main result we present is the closure of the class of linear context-free tree languages under linear frontier-to-root tree transduction mappings. Two further results are the closure of this class under linear root-to-frontier tree transduction mappings and under intersection with regular tree languages.
The results of the first part of the paper are applied to the formalisation of optimality theory. Optimality theory (OT), introduced by Prince and Smolensky [Tech. Report 1993], is a linguistic framework in which the mapping of one level of linguistic representation to another is based on rules and filters. The rules generate candidate expressions in the target representation, which are subsequently checked against the filters, so that only those candidates remain that survive this filtering process. A proposal to formalise the description of OT using formal language theory and in particular automata theory was presented by Karttunen [Proceedings of International Workshop on Finite-State Methods in Natural Language Processing, 1998, pp. 1-12] and Frank and Satta [Comput. Linguistics 24 (1998) 307-315]. The main result of these papers is that if the generator is defined as a finite-state string transducer and the filters are defined by finite-state string automata, then the whole OT-system can be defined by means of a finite-state string transducer. Considering the fact that most parts of linguistics have trees as their underlying data structures instead of strings, we show here that generators can be extended to linear frontier-to-root tree transducers on linear context-free tree languages (with constraints being regular tree languages) while the computation of optimal candidates can still be performed using finite-state techniques (over trees).
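The generate-and-filter architecture described in this abstract can be illustrated with a minimal Python sketch. It is not the construction from the paper: finite sets of strings stand in for transducers and tree languages, the generator and the two constraints (`onset`, `no_coda`) are hypothetical toy examples, and the lenient treatment of a constraint that no candidate satisfies is an assumption made here for illustration.

```python
from itertools import product

VOWELS = set("aeiou")

def generate(word):
    """Hypothetical generator: every way of placing syllable boundaries ('.') inside a non-empty word."""
    candidates = set()
    for marks in product(["", "."], repeat=len(word) - 1):
        # Pair each letter with the optional boundary that follows it.
        candidates.add("".join(ch + m for ch, m in zip(word, marks + ("",))))
    return candidates

def onset(candidate):
    """Toy filter: every syllable starts with a consonant."""
    return all(s[0] not in VOWELS for s in candidate.split("."))

def no_coda(candidate):
    """Toy filter: every syllable ends with a vowel."""
    return all(s[-1] in VOWELS for s in candidate.split("."))

def optimal(word, ranked_constraints):
    """Apply the filters in ranking order; skip a filter that would remove every candidate."""
    survivors = generate(word)
    for constraint in ranked_constraints:
        passing = {c for c in survivors if constraint(c)}
        if passing:  # assumption: a filter nobody passes is ignored
            survivors = passing
    return survivors

winners = optimal("pasta", [onset, no_coda])
```

Here `winners` is the set of candidates that survive both filters. For an input such as "at", no candidate passes either filter, so both filters are skipped and all candidates remain.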
Conference of the European Chapter of the Association for Computational Linguistics | 2003
Stephan Kepser
Finite structure query (fsq for short) is a tool for querying syntactically annotated corpora. fsq employs a query language of high expressive power, namely full first-order logic. It can be used to query arbitrary finite structures, not just trees.
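To make the idea of querying a finite structure with first-order logic concrete, here is a minimal Python sketch of a brute-force evaluator. It does not reproduce fsq's query syntax or implementation; the structure, the relation names (`dom`, `prec`, `NP`, `VP`), and the tuple encoding of formulas are all assumptions chosen for illustration.

```python
from itertools import product

# A finite structure: a universe plus named relations (sets of tuples).
# Hypothetical toy annotation: node 0 dominates 1 and 2; node 1 precedes node 2.
universe = {0, 1, 2}
relations = {
    "dom": {(0, 1), (0, 2)},
    "prec": {(1, 2)},
    "NP": {(1,)},
    "VP": {(2,)},
}

def holds(formula, env):
    """Evaluate a first-order formula (nested tuples) under a variable assignment."""
    op = formula[0]
    if op == "rel":                       # ("rel", name, var1, ..., varn)
        _, name, *variables = formula
        return tuple(env[v] for v in variables) in relations[name]
    if op == "not":
        return not holds(formula[1], env)
    if op == "and":
        return all(holds(f, env) for f in formula[1:])
    if op == "or":
        return any(holds(f, env) for f in formula[1:])
    if op == "exists":                    # ("exists", var, body)
        _, var, body = formula
        return any(holds(body, {**env, var: a}) for a in universe)
    if op == "forall":                    # ("forall", var, body)
        _, var, body = formula
        return all(holds(body, {**env, var: a}) for a in universe)
    raise ValueError(f"unknown operator: {op}")

def answers(formula, free_vars):
    """Return all assignments to free_vars that satisfy the formula."""
    return [
        dict(zip(free_vars, values))
        for values in product(sorted(universe), repeat=len(free_vars))
        if holds(formula, dict(zip(free_vars, values)))
    ]

# Query: x is an NP that precedes some VP.
query = ("and", ("rel", "NP", "x"),
         ("exists", "y", ("and", ("rel", "VP", "y"), ("rel", "prec", "x", "y"))))
result = answers(query, ["x"])
```

Because relations are arbitrary sets of tuples, the same evaluator works on structures that are not trees, for example graphs with crossing or secondary edges.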
Journal of Logic, Language and Information | 2011
Stephan Kepser; James Rogers
The equivalence of leaf languages of tree adjoining grammars and monadic linear context-free grammars was shown about a decade ago. This paper presents a proof of the strong equivalence of these grammar formalisms. Non-strict tree adjoining grammars and monadic linear context-free grammars define the same class of tree languages. We also present a logical characterisation of this tree language class showing that a tree language is a member of this class iff it is the two-dimensional yield of an MSO-definable three-dimensional tree language.
Journal of Logic, Language and Information | 2004
Stephan Kepser
In recent years large amounts of electronic text have become available. While the first of these corpora had only a low level of annotation, the more recent ones are annotated with refined syntactic information. To make these rich annotations accessible to linguists, the development of query systems has become an important goal. One of the main difficulties in this task lies in the choice of the right query language: a language that is powerful enough to let users formulate the queries they want and that can be evaluated efficiently to keep query response times short. There is a widespread belief that such a query language does not exist. It is therefore the aim of this paper to show that there is indeed a powerful query language that can be efficiently evaluated. We propose the use of monadic second-order logic as a query language. We show that a query in this language can be evaluated in linear time in the size of a tree in the corpus. We also provide examples of complicated linguistic queries expressed in monadic second-order logic, thereby demonstrating the high expressive power of the language.
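The linear-time claim rests on the classical correspondence between monadic second-order logic over trees and tree automata. The following Python sketch is not the paper's construction: it hand-codes a hypothetical deterministic bottom-up automaton for one fixed query ("some VP node properly dominates an NP node") and shows why running such an automaton visits each node only once. The `Node` class, the state names, and the query are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

# States of the hypothetical automaton:
#   "none"  - the subtree contains no NP
#   "np"    - the subtree contains an NP, but no VP dominating an NP
#   "match" - the subtree contains a VP that properly dominates an NP
def step(label, child_states):
    """Transition function: state of a node from its label and its children's states."""
    if "match" in child_states:
        return "match"
    has_np_below = "np" in child_states
    if label == "VP" and has_np_below:
        return "match"
    if label == "NP" or has_np_below:
        return "np"
    return "none"

def run(root):
    """Run the automaton bottom-up, computing each node's state once (iterative post-order)."""
    state = {}
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            state[id(node)] = step(node.label, [state[id(c)] for c in node.children])
        else:
            stack.append((node, True))
            stack.extend((c, False) for c in node.children)
    return state[id(root)]

def accepts(root):
    """True if the tree satisfies the query encoded by the automaton."""
    return run(root) == "match"

tree = Node("S", [
    Node("NP", [Node("D"), Node("N")]),
    Node("VP", [Node("V"), Node("NP", [Node("D"), Node("N")])]),
])
found = accepts(tree)
```

The work done at each node is proportional to its number of children, so the total running time is linear in the size of the tree. Translating an arbitrary formula into such an automaton is a separate step that this sketch does not cover.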
Electronic Notes in Theoretical Computer Science | 2004
Stephan Kepser
In this paper we present a computability and a complexity result on Relational Speciate Reentrant Logic (RSRL). RSRL is a description logic designed to formalise the linguistic framework and theory Head-Driven Phrase Structure Grammar. We show, by reduction from the Post Correspondence Problem, that given an RSRL-formula and a finite RSRL-interpretation it is in general undecidable whether the formula is true in the given interpretation. For so-called chainless RSRL, a semantically weaker version in which the expressive power of RSRL is significantly reduced, we show that if a class of finite structures is definable in chainless RSRL, it is decidable by a Turing machine polynomially time-bounded in the size of the input structures.
Empirical Methods in Natural Language Processing | 2005
Stephan Kepser
MONA is an automata toolkit providing a compiler that translates formulae of monadic second-order logic on strings or trees into string automata or tree automata. In this paper, we evaluate the option of using MONA as a treebank query tool. Unfortunately, we find that MONA is not an option. There are several reasons why, the main one being unsustainable query answer times. If the treebank contains larger trees with more than 100 nodes, then even the processing of simple queries may take hours.
Archive | 2010
Stephan Kepser; Uwe Mönnich; Frank Morawietz
In this contribution we propose a query method for XML documents that provides a well-chosen balance between expressive power of the query language and query complexity using methods derived from logic. Since XML documents are basically regular tree languages, it is appealing to use monadic second-order logic as a query language. But MSO is incapable of querying secondary relations in XML documents introduced via the ID-IDREF mechanism. We therefore show how a well-defined subclass of these ID-IDREF pairs can be queried using MSO, signature translations, and MSO-definable transductions. The ID-IDREF pairs are coded by linear context-free tree grammars. Any query result is intersected with the coding of the ID-IDREF pairs to ensure that only those matches are retained that respect the ID-IDREF information contained in the document. The advantage of this method is that it uses regular techniques only. In consequence, every query is computable.
Rewriting Techniques and Applications | 1999
Stephan Kepser; Jörn Richts
Equational unification algorithms can be used in resolution-based theorem provers [9] and rewriting engines [6] to improve their handling of equality. Originally, the requirements of these theorem provers and rewrite engines were such that the unification algorithms had to compute complete sets of unifiers. But with the advent of constraint-based approaches to theorem proving [4] and rewriting [8], interest grew in unification algorithms that work merely as decision procedures, because minimal complete sets of unifiers can be very large (e.g., doubly exponential in the number of variables of the problem in the case of the theory AC) and are hence costly to compute. Because actual unification problems usually contain function symbols from several different signatures, the following combination problem is an important task in unification theory: given unification algorithms for equational theories E1, E2, ..., En over pairwise disjoint signatures, provide a general method that gives a unification algorithm for the union E1 ∪ E2 ∪ ... ∪ En of these theories. Solutions for this problem were provided by Schmidt-Schauß [10] and Boudet [3] for the combination of algorithms calculating complete sets of unifiers and by Baader and Schulz [1] for combining decision procedures. The combination algorithm presented in [1] is mostly of theoretical interest: it contains many non-deterministic decisions, so the search space it spans is too large for practical implementations. Therefore the authors developed optimisation methods [7] for the combination algorithm by Baader and Schulz to obtain an implementation that can be used in practice. This implementation is UniMoK, which stands for Unification Module for Keim. It contains algorithms for unification in certain equational theories and provides several combination methods for them. All combination algorithms in UniMoK are extensions and optimisations of the combination method by Baader and Schulz [1].
Mathematics of Language | 2007
Stephan Kepser
Multidominance structures (MDSes) were introduced by Kracht [4] to provide a data structure for the formalisation of various aspects of GB-Theory. In [5], Kracht studied the PDL-theory of MDSes and showed that this theory is decidable, in fact 2EXPTIME-complete. He goes on to conjecture that the MSO-theory of MDSes should therefore be decidable, too. We show here the contrary: both the MSO-theory over vertices only and the MSO-theory over vertices and edges turn out to be undecidable.
Extremes | 2004
Stephan Kepser