Péter Burcsi | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Péter Burcsi is active.

Explore More

Publication

Featured researches published by Péter Burcsi.

International Journal of Foundations of Computer Science | 2012

ALGORITHMS FOR JUMBLED PATTERN MATCHING IN STRINGS

Péter Burcsi; Ferdinando Cicalese; Gabriele Fici; Zsuzsanna Lipták

The Parikh vector p(s) of a string s over a finite ordered alphabet Σ = {a1, …, aσ} is defined as the vector of multiplicities of the characters, p(s) = (p1, …, pσ), where pi = |{j | sj = ai}|. Parikh vector q occurs in s if s has a substring t with p(t) = q. The problem of searching for a query q in a text s of length n can be solved simply and worst-case optimally with a sliding window approach in O(n) time. We present two novel algorithms for the case where the text is fixed and many queries arrive over time. The first algorithm only decides whether a given Parikh vector appears in a binary text. It uses a linear size data structure and decides each query in O(1) time. The preprocessing can be done trivially in Θ(n2) time. The second algorithm finds all occurrences of a given Parikh vector in a text over an arbitrary alphabet of size σ ≥ 2 and has sub-linear expected time complexity. More precisely, we present two variants of the algorithm, both using an O(n) size data structure, each of which can be constructed in O(n) time. The first solution is very simple and easy to implement and leads to an expected query time of , where m = ∑i qi is the length of a string with Parikh vector q. The second uses wavelet trees and improves the expected runtime to , i.e., by a factor of log m.

fun with algorithms | 2012

On Approximate Jumbled Pattern Matching in Strings

Péter Burcsi; Ferdinando Cicalese; Gabriele Fici; Zsuzsanna Lipták

Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of a Parikh vector q in the text s requires finding a substring t of s with p(t)=q. This can be viewed as the task of finding a jumbled (permuted) version of a query pattern, hence the term Jumbled Pattern Matching. We present several algorithms for the approximate version of the problem: Given a string s and two Parikh vectors u,v (the query bounds), find all maximal occurrences in s of some Parikh vector q such that u≤q≤v. This definition encompasses several natural versions of approximate Parikh vector search. We present an algorithm solving this problem in sub-linear expected time using a wavelet tree of s, which can be computed in time O(n) in a preprocessing phase. We then discuss a Scrabble-like variation of the problem, in which a weight function on the letters of s is given and one has to find all occurrences in s of a substring t with maximum weight having Parikh vector p(t)≤v. For the case of a binary alphabet, we present an algorithm which solves the decision version of the Approximate Jumbled Pattern Matching problem in constant time, by indexing the string in subquadratic time.

combinatorial pattern matching | 2014

On Combinatorial Generation of Prefix Normal Words

Péter Burcsi; Gabriele Fici; Zsuzsanna Lipták; Frank Ruskey; Joe Sawada

A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present an efficient algorithm for exhaustively listing the prefix normal words with a fixed length. The algorithm is based on the fact that the language of prefix normal words is a bubble language, a class of binary languages with the property that, for any word w in the language, exchanging the first occurrence of 01 by 10 in w results in another word in the language. We prove that each prefix normal word is produced in O(n) amortized time, and conjecture, based on experimental evidence, that the true amortized running time is O(log(n)).

fun with algorithms | 2014

Normal, Abby Normal, Prefix Normal

Péter Burcsi; Gabriele Fici; Zsuzsanna Lipták; Frank Ruskey; Joe Sawada

A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present results about the number \(\textit{pnw}(n)\) of prefix normal words of length n, showing that \(\textit{pnw}(n) =\Omega\left(2^{n - c\sqrt{n\ln n}}\right)\) for some c and \(\textit{pnw}(n) = O \left(\frac{2^n (\ln n)^2}{n}\right)\). We introduce efficient algorithms for testing the prefix normal property and a “mechanical algorithm” for computing prefix normal forms. We also include games which can be played with prefix normal words. In these games Alice wishes to stay normal but Bob wants to drive her “abnormal” – we discuss which parameter settings allow Alice to succeed.

Theoretical Computer Science | 2017

On prefix normal words and prefix normal forms

Péter Burcsi; Gabriele Fici; Zsuzsanna Lipták; Frank Ruskey; Joe Sawada

Abstract A 1-prefix normal word is a binary word with the property that no factor has more 1s than the prefix of the same length; a 0-prefix normal word is defined analogously. These words arise in the context of indexed binary jumbled pattern matching, where the aim is to decide whether a word has a factor with a given number of 1s and 0s (a given Parikh vector). Each binary word has an associated set of Parikh vectors of the factors of the word. Using prefix normal words, we provide a characterization of the equivalence class of binary words having the same set of Parikh vectors of their factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We give enumeration results on pnw ( n ) , the number of prefix normal words of length n , showing that, for sufficiently large n , 2 n − 4 n lg ⁡ n ≤ pnw ( n ) ≤ 2 n − lg ⁡ n + 1 . For fixed density (number of 1s), we show that the ordinary generating function of the number of prefix normal words of length n and density d is a rational function. Finally, we give experimental results on pnw ( n ) , discuss further properties, and state open problems.

combinatorial pattern matching | 2016

Reconstruction of Trees from Jumbled and Weighted Subtrees.

Dénes Bartha; Péter Burcsi; Zsuzsanna Lipták

Let T be an edge-labeled graph, where the labels are from a finite alphabet Sigma. For a subtree U of T the Parikh vector of U is a vector of length |Sigma| which specifies the multiplicity of each label in U. We ask when T can be reconstructed from the multiset of Parikh vectors of all its subtrees, or all of its paths, or all of its maximal paths. We consider the analogous problems for weighted trees. We show how several well-known reconstruction problems on labeled strings, weighted strings and point sets on a line can be included in this framework. We present reconstruction algorithms and non-reconstructibility results, and extend the polynomial method, previously applied to jumbled strings [Acharya et al., SIAM J. on Discr. Math, 2015] and weighted strings [Bansal et al., CPM 2004], to deal with general trees and special tree classes.

Computers & Mathematics With Applications | 2007

On the importance of cache tuning in a cache-aware algorithm: A case study

Péter Burcsi; Attila L. Kovács

In the present paper we describe and analyse a sieving algorithm for determining prime numbers. This external memory algorithm contains several parameters which are related to the sizes of the levels in the memory hierarchy. We examine how we should choose the values of these parameters in order to obtain an optimal running time. We compare the running times obtained by varying the parameters. We conclude that in this specific problem fine tuning pays off as we got a speed-up of almost 40%.

fun with algorithms | 2010