Martin Berglund
Umeå University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Martin Berglund.
language and automata theory and applications | 2011
Martin Berglund; Henrik Björklund; Johanna Högberg
Language models that use interleaving, or shuffle, operators have applications in various areas of computer science, including system verification, plan recognition, and natural language processing. We study the complexity of the membership problem for such models, i.e., how difficult it is to determine if a string belongs to a language or not. In particular, we investigate how interleaving can be introduced into models that capture the context-free languages.
international conference on implementation and application of automata | 2016
Nicolaas Weideman; Brink van der Merwe; Martin Berglund; Bruce W. Watson
We apply results from ambiguity of non-deterministic finite automata to the problem of determining the asymptotic worst-case matching time, as a function of the length of the input strings, when attempting to match input strings with a given regular expression, where the matcher being used is a backtracking regular expression matcher.
automata and formal languages | 2014
Martin Berglund; Frank Drewes; Brink van der Merwe
We consider in some detail how regular expression matching happens in Java, as a popular representative of the category of regex-directed matching engines. We extract a slightly idealized algorithm ...
Theoretical Computer Science | 2017
Martin Berglund; Brink van der Merwe
We introduce prioritized transducers to formalize capturing groups in regular expression matching in a way that permits straightforward modelling of and comparison with real-world regular expression matching library behaviors. The broader questions of parsing semantics and performance are discussed, and also the complexity of deciding equivalence of regular expressions with capturing groups.
developments in language theory | 2013
Martin Berglund; Henrik Björklund; Frank Drewes; Brink van der Merwe; Bruce W. Watson
Most software packages with regular expression matching engines offer operators that extend the classical regular expressions, such as counting, intersection, complementation, and interleaving. Some of the most popular engines, for example those of Java and Perl, also provide operators that are intended to control the nondeterminism inherent in regular expressions. We formalize this notion in the form of the cut and iterated cut operators. They do not extend the class of languages that can be defined beyond the regular, but they allow for exponentially more succinct representation of some languages. Membership testing remains polynomial, but emptiness testing becomes PSPACE-hard.
Algorithms | 2016
Henrik Björklund; Martin Berglund; Petter Ericson
Parsing for mildly context-sensitive language formalisms is an important area within natural language processing. While the complexity of the parsing problem for some such formalisms is known to be polynomial, this is not the case for all of them. This article presents a series of results regarding the complexity of parsing for linear context-free rewriting systems and deterministic tree-walking transducers. We discuss the difference between uniform and nonuniform complexity measures and how parameterized complexity theory can be used to investigate how different aspects of the formalisms influence how hard the parsing problem is. The main results we survey are all hardness results and indicate that parsing is hard even for relatively small values of parameters such as rank and fan-out in a rewriting system.
international conference on implementation and application of automata | 2015
Martin Berglund; Brink van der Merwe
We introduce prioritized transducers to formalize capturing groups in regular expression matching in a way that permits straightforward modelling of and comparison with real-world regular expression matching library behaviors. The broader questions of parsing semantics and performance are discussed, and also the complexity of deciding equivalence of regular expressions with capturing groups.
Theoretical Computer Science | 2013
Martin Berglund; Henrik Björklund; Johanna Björklund
Language models that use interleaving, or shuffle, operators have applications in various areas of computer science, including system verification, plan recognition, and natural language processing ...
Archive | 2018
Martin Berglund; Willem Bester; Brink van der Merwe
Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the posix standard are prescribed leftmost-longest semantics. However, the posix standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various posix matchers. The Boost library has an interesting take on the posix standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather, with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, and we analyse the complexity of regular expression matching when using Boost semantics.
mathematics of language | 2013
Martin Berglund; Henrik Björklund; Frank Drewes