Publication


Featured research published by Dominik Kempa.


Combinatorial Pattern Matching | 2013

Linear Time Lempel-Ziv Factorization: Simple, Fast, Small

Juha Kärkkäinen; Dominik Kempa; Simon J. Puglisi

Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in many diverse applications, including data compression, text indexing, and pattern discovery. We describe new linear time LZ factorization algorithms, some of which require only \(2n \log n + O(\log n)\) bits of working space to factorize a string of length \(n\). These are the most space efficient linear time algorithms to date, using \(n \log n\) bits less space than any previous linear time algorithm. The algorithms are also simple to implement, very fast in practice, and amenable to streaming implementation.
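For readers unfamiliar with the object being computed, the following is a minimal sketch of the LZ77 factorization itself. It is a naive quadratic-time baseline written for illustration, not one of the linear time algorithms described in the abstract. The function names and the phrase encoding (`(char, 0)` for a new letter, `(source_position, length)` for a copied phrase) are assumptions made for this example.

```python
def lz77_factorize(text):
    """Naive LZ77 factorization (quadratic time, illustration only).

    Each phrase is either (char, 0) for a letter with no earlier occurrence,
    or (source_position, length) for the longest prefix of text[i:] that also
    starts at some earlier position. Sources may overlap the phrase itself.
    """
    phrases = []
    i, n = 0, len(text)
    while i < n:
        best_len, best_pos = 0, -1
        for j in range(i):  # try every earlier starting position
            length = 0
            while i + length < n and text[j + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len, best_pos = length, j
        if best_len == 0:
            phrases.append((text[i], 0))
            i += 1
        else:
            phrases.append((best_pos, best_len))
            i += best_len
    return phrases


def lz77_decode(phrases):
    """Rebuild the text; copying one character at a time handles overlaps."""
    out = []
    for head, length in phrases:
        if length == 0:
            out.append(head)
        else:
            for k in range(length):
                out.append(out[head + k])
    return "".join(out)


if __name__ == "__main__":
    print(lz77_factorize("abababab"))
```

The sketch only fixes the definition; the space and time bounds quoted above come from replacing the inner search with suffix-array-based machinery, which is not attempted here.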


Symposium on Experimental and Efficient Algorithms | 2013

Lightweight Lempel-Ziv Parsing

Juha Kärkkäinen; Dominik Kempa; Simon J. Puglisi

We introduce a new approach to LZ77 factorization that uses \(O(n/d)\) words of working space and \(O(dn)\) time for any \(d \geq 1\) (for polylogarithmic alphabet sizes). We also describe carefully engineered implementations of alternative approaches to lightweight LZ77 factorization. Extensive experiments show that the new algorithm is superior, and particularly so at the lowest memory levels and for highly repetitive data. As a part of the algorithm, we describe new methods for computing matching statistics which may be of independent interest.


Algorithm Engineering and Experimentation | 2013

Lempel-Ziv factorization: simple, fast, practical

Dominik Kempa; Simon J. Puglisi

For decades the Lempel-Ziv (LZ77) factorization has been a cornerstone of data compression and string processing algorithms, and uses for it are still being uncovered. For example, LZ77 is central to several recent text indexing data structures designed to search highly repetitive collections. However, in many applications computation of the factorization remains a bottleneck in practice. In this paper we describe simple and fast algorithms for computing the LZ77 factorization. These new methods consistently outperform all previous approaches in practice, use less memory, and still offer strong worst-case performance guarantees. A common feature of the new algorithms is their avoidance of the longest-common-prefix array, essential to nearly all prior art.


Data Compression Conference | 2014

Hybrid Compression of Bitvectors for the FM-Index

Juha Kärkkäinen; Dominik Kempa; Simon J. Puglisi

Compressed bit vectors supporting rank and select operations are the workhorse of compressed data structures. We propose a hybrid scheme for implementing compressed bit vectors, which divides the bit vector into blocks and then chooses the encoding of each block separately from a number of different encoding methods. Hybrid encoding is particularly suitable for bit vectors that have lots of local and regional variation, such as those present in the FM-index, a popular compressed data structure for pattern matching. We propose a specific hybrid combination of three simple encoding methods for FM-index bit vectors achieving superior space-time tradeoffs in experiments.
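The block-wise idea can be sketched as follows. This is a toy illustration under stated assumptions, not the scheme from the paper: the three encodings used here (`plain`, `sparse`, `runs`), the bit-cost model, the class name `HybridBitVector`, and the default block size are all hypothetical choices made for the example. Only `rank1` is shown; `select` is omitted.

```python
from bisect import bisect_left


def _runs(block):
    """Return (first_bit, lengths of maximal runs) for a non-empty block."""
    lengths = [1]
    for prev, cur in zip(block, block[1:]):
        if cur == prev:
            lengths[-1] += 1
        else:
            lengths.append(1)
    return block[0], lengths


def encode_block(block):
    """Pick the cheapest of three toy encodings under an assumed cost model."""
    pos_width = max(1, (len(block) - 1).bit_length())  # bits per position
    len_width = len(block).bit_length()                # bits per run length
    ones = [k for k, bit in enumerate(block) if bit]
    first_bit, lengths = _runs(block)
    candidates = [
        (len(block), "plain", list(block)),
        (len(ones) * pos_width, "sparse", ones),
        (1 + len(lengths) * len_width, "runs", (first_bit, lengths)),
    ]
    _, kind, payload = min(candidates, key=lambda c: c[0])
    return kind, payload


def rank_in_block(kind, payload, off):
    """Count ones among the first `off` bits of an encoded block."""
    if kind == "plain":
        return sum(payload[:off])
    if kind == "sparse":
        return bisect_left(payload, off)
    bit, lengths = payload  # kind == "runs"
    count, pos = 0, 0
    for length in lengths:
        take = min(length, off - pos)
        if take <= 0:
            break
        if bit:
            count += take
        pos += take
        bit ^= 1
    return count


class HybridBitVector:
    def __init__(self, bits, block_size=8):
        self.block_size = block_size
        self.n = len(bits)
        self.blocks = []        # one (kind, payload) pair per block
        self.prefix_ones = [0]  # number of ones before each block
        for start in range(0, len(bits), block_size):
            block = bits[start:start + block_size]
            self.blocks.append(encode_block(block))
            self.prefix_ones.append(self.prefix_ones[-1] + sum(block))

    def rank1(self, i):
        """Number of ones in bits[0:i], for 0 <= i <= n."""
        b, off = divmod(i, self.block_size)
        if off == 0:
            return self.prefix_ones[b]
        return self.prefix_ones[b] + rank_in_block(*self.blocks[b], off)
```

Because each block is encoded independently, a query touches one precomputed block counter plus one block decoder, which is what lets different regions of the bit vector use different encodings.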


Journal of Discrete Algorithms | 2014

A subquadratic algorithm for minimum palindromic factorization

Gabriele Fici; Travis Gagie; Juha Kärkkäinen; Dominik Kempa

We give an \(O(n \log n)\)-time, \(O(n)\)-space algorithm for factoring a string into the minimum number of palindromic substrings. That is, given a string \(S[1..n]\), in \(O(n \log n)\) time our algorithm returns the minimum number of palindromes \(S_1, \ldots, S_\ell\) such that \(S = S_1 \cdots S_\ell\). We also show that the time complexity is \(O(n)\) on average and \(\Omega(n \log n)\) in the worst case. The last result is based on a characterization of the palindromic structure of Zimin words.
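To make the problem statement concrete, here is a minimal sketch of the straightforward quadratic-time dynamic program for minimum palindromic factorization. It is the simple baseline that a subquadratic algorithm improves on, not the algorithm from the paper; the function name is an assumption made for this example.

```python
def min_palindromic_factorization(s):
    """Minimum number of palindromes whose concatenation equals s.

    Quadratic time and space baseline, for illustration only.
    """
    n = len(s)
    # is_pal[i][j] is True when s[i..j] (inclusive) is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
    # best[j] is the minimum number of palindromes covering the prefix s[:j].
    best = [0] + [n + 1] * n
    for j in range(1, n + 1):
        for i in range(j):
            if is_pal[i][j - 1] and best[i] + 1 < best[j]:
                best[j] = best[i] + 1
    return best[n]


if __name__ == "__main__":
    # "abaab" splits as "a" + "baab".
    print(min_palindromic_factorization("abaab"))
```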


Data Compression Conference | 2014

Lempel-Ziv Parsing in External Memory

Juha Kärkkäinen; Dominik Kempa; Simon J. Puglisi

In the 35 years since its discovery, the Lempel-Ziv factorization (or LZ77 parsing) has become a fundamental method for data compression and string processing. In many applications, computation of the factorization is a time-space bottleneck. However, and despite the increasing need to apply LZ77 to massive data sets (for both storage and indexing), no algorithm to date scales to inputs that exceed the size of RAM. In this paper we describe the first algorithms for computing the LZ77 parsing efficiently using external memory.


Mathematics in Computer Science | 2017

Engineering a Lightweight External Memory Suffix Array Construction Algorithm

Juha Kärkkäinen; Dominik Kempa

We describe an external memory suffix array construction algorithm based on constructing suffix arrays for blocks of text and merging them into the full suffix array. The basic idea goes back over 20 years and there have been a couple of later improvements, but we describe several further improvements that make the algorithm much faster. In particular, we reduce the I/O volume of the algorithm by a factor


Data Compression Conference | 2012

Slashing the Time for BWT Inversion

Juha Kärkkäinen; Dominik Kempa; Simon J. Puglisi


European Symposium on Algorithms | 2016

Faster External Memory LCP Array Construction

Juha Kärkkäinen; Dominik Kempa

\(\mathcal{O}(\log_\sigma n)\)


Developments in Language Theory | 2015

Diverse Palindromic Factorization Is NP-complete

Hideo Bannai; Travis Gagie; Shunsuke Inenaga; Juha Kärkkäinen; Dominik Kempa; Marcin Piątkowski; Simon J. Puglisi; Shiho Sugimoto

Collaboration


Dive into Dominik Kempa's collaborations.

Top Co-Authors

Travis Gagie

Diego Portales University

Simon Gog

University of Melbourne
