Martin Berglund | Researchain

Publication

Featured research published by Martin Berglund.

Language and Automata Theory and Applications | 2011

Recognizing shuffled languages

Martin Berglund; Henrik Björklund; Johanna Högberg

Language models that use interleaving, or shuffle, operators have applications in various areas of computer science, including system verification, plan recognition, and natural language processing. We study the complexity of the membership problem for such models, i.e., how difficult it is to determine if a string belongs to a language or not. In particular, we investigate how interleaving can be introduced into models that capture the context-free languages.
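As a rough illustration of what a shuffle membership question looks like, the sketch below handles only the simplest case: deciding whether a string w is an interleaving of two fixed strings u and v. This is not the construction from the paper, which concerns much richer language models; the function name `is_shuffle` and its signature are hypothetical and chosen for this example only.

```python
from functools import lru_cache


def is_shuffle(w: str, u: str, v: str) -> bool:
    """Return True if w is an interleaving (shuffle) of u and v."""
    if len(w) != len(u) + len(v):
        return False

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> bool:
        # i symbols of u and j symbols of v have been consumed so far,
        # so the next symbol of w to explain is at index i + j.
        k = i + j
        if k == len(w):
            return True
        # Either the next symbol of w comes from u ...
        if i < len(u) and u[i] == w[k] and go(i + 1, j):
            return True
        # ... or it comes from v.
        return j < len(v) and v[j] == w[k] and go(i, j + 1)

    return go(0, 0)


if __name__ == "__main__":
    print(is_shuffle("acbd", "ab", "cd"))  # order within u and v is preserved
    print(is_shuffle("bacd", "ab", "cd"))  # 'b' before 'a' breaks the order of u
```

Memoizing on the pair (i, j) keeps this special case polynomial in the lengths of u and v.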

International Conference on Implementation and Application of Automata | 2016

Analyzing Matching Time Behavior of Backtracking Regular Expression Matchers by Using Ambiguity of NFA

Nicolaas Weideman; Brink van der Merwe; Martin Berglund; Bruce W. Watson

We apply results on the ambiguity of non-deterministic finite automata to the problem of determining the asymptotic worst-case matching time, as a function of the length of the input string, of a backtracking regular expression matcher applied to a given regular expression.
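The link between ambiguity and backtracking time can be seen in a toy experiment. The sketch below is not the analysis from the paper; it is a minimal depth-first search over a hand-written NFA that counts how many (state, position) configurations are visited. The function `count_backtracking_steps` and the two example automata are assumptions made for illustration.

```python
def count_backtracking_steps(transitions, start, accepting, text):
    """Depth-first (backtracking) search over an NFA.

    transitions maps a state to an ordered list of (symbol, target) pairs.
    Returns (matched, steps), where steps counts visited configurations.
    """
    steps = 0

    def explore(state, pos):
        nonlocal steps
        steps += 1
        if pos == len(text):
            return state in accepting
        # Try the outgoing transitions in order, backtracking on failure.
        for symbol, target in transitions.get(state, []):
            if symbol == text[pos] and explore(target, pos + 1):
                return True
        return False

    return explore(start, 0), steps


# Hypothetical NFA with two distinct 'a' loops on state 0: a^n b has 2^n
# accepting paths, so the automaton is exponentially ambiguous.
AMBIGUOUS = {0: [("a", 0), ("a", 0), ("b", 1)]}
# The same language with a single 'a' loop: every string has at most one path.
UNAMBIGUOUS = {0: [("a", 0), ("b", 1)]}

if __name__ == "__main__":
    for n in (5, 10, 15):
        text = "a" * n  # no trailing 'b', so every path must fail
        _, slow = count_backtracking_steps(AMBIGUOUS, 0, {1}, text)
        _, fast = count_backtracking_steps(UNAMBIGUOUS, 0, {1}, text)
        print(n, slow, fast)
```

On failing inputs of the form a^n, the ambiguous automaton visits 2^(n+1) - 1 configurations, while the unambiguous one visits n + 1.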

Automata and Formal Languages | 2014

Analyzing Catastrophic Backtracking Behavior in Practical Regular Expression Matching

Martin Berglund; Frank Drewes; Brink van der Merwe

We consider in some detail how regular expression matching happens in Java, as a popular representative of the category of regex-directed matching engines. We extract a slightly idealized algorithm ...

Theoretical Computer Science | 2017

On the semantics of regular expression parsing in the wild

Martin Berglund; Brink van der Merwe

We introduce prioritized transducers to formalize capturing groups in regular expression matching in a way that permits straightforward modelling of and comparison with real-world regular expression matching library behaviors. The broader questions of parsing semantics and performance are discussed, and also the complexity of deciding equivalence of regular expressions with capturing groups.

Developments in Language Theory | 2013

Cuts in Regular Expressions

Martin Berglund; Henrik Björklund; Frank Drewes; Brink van der Merwe; Bruce W. Watson

Most software packages with regular expression matching engines offer operators that extend the classical regular expressions, such as counting, intersection, complementation, and interleaving. Some of the most popular engines, for example those of Java and Perl, also provide operators that are intended to control the nondeterminism inherent in regular expressions. We formalize this notion in the form of the cut and iterated cut operators. They do not extend the class of languages that can be defined beyond the regular, but they allow for exponentially more succinct representation of some languages. Membership testing remains polynomial, but emptiness testing becomes PSPACE-hard.

Algorithms | 2016

Uniform vs. Nonuniform Membership for Mildly Context-Sensitive Languages: A Brief Survey

Henrik Björklund; Martin Berglund; Petter Ericson

Parsing for mildly context-sensitive language formalisms is an important area within natural language processing. While the complexity of the parsing problem for some such formalisms is known to be polynomial, this is not the case for all of them. This article presents a series of results regarding the complexity of parsing for linear context-free rewriting systems and deterministic tree-walking transducers. We discuss the difference between uniform and nonuniform complexity measures and how parameterized complexity theory can be used to investigate how different aspects of the formalisms influence how hard the parsing problem is. The main results we survey are all hardness results and indicate that parsing is hard even for relatively small values of parameters such as rank and fan-out in a rewriting system.

International Conference on Implementation and Application of Automata | 2015

On the Semantics of Regular Expression Parsing in the Wild

Martin Berglund; Brink van der Merwe

Theoretical Computer Science | 2013

Shuffled languages: representation and recognition

Martin Berglund; Henrik Björklund; Johanna Björklund

Archive | 2018

Formalising Boost POSIX Regular Expression Matching

Martin Berglund; Willem Bester; Brink van der Merwe

Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the POSIX standard are prescribed leftmost-longest semantics. However, the POSIX standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various POSIX matchers. The Boost library has an interesting take on the POSIX standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, and we analyse the complexity of regular expression matching when using Boost semantics.
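The difference between leftmost-greedy and leftmost-longest semantics shows up already in the overall match. The sketch below uses Python's `re` module, a backtracking matcher with ordered alternation, as a stand-in for the Perl-compatible side, and emulates leftmost-longest by brute force. It says nothing about how POSIX or Boost assign submatches to subexpressions or capturing groups, which is the subject of the paper; the helper `leftmost_longest` is hypothetical.

```python
import re


def leftmost_longest(pattern: str, text: str):
    """Brute-force emulation of the leftmost-longest overall match.

    Tries start positions from left to right and, for each start,
    end positions from longest to shortest. Returns the matched
    substring, or None if nothing matches.
    """
    rx = re.compile(pattern)
    for start in range(len(text) + 1):
        for end in range(len(text), start - 1, -1):
            if rx.fullmatch(text, start, end):
                return text[start:end]
    return None


if __name__ == "__main__":
    pattern, text = r"a|ab", "xab"
    # Ordered alternation: the first alternative that succeeds wins.
    print(re.search(pattern, text).group())
    # Leftmost-longest: the longest match at the leftmost position wins.
    print(leftmost_longest(pattern, text))
```

The brute-force loop is quadratic in the number of candidate spans and is meant only to make the two semantics comparable on small inputs.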

Mathematics of Language | 2013