Pablo Ariel Heiber
Facultad de Ciencias Exactas y Naturales
Publication
Featured research published by Pablo Ariel Heiber.
Bioinformatics | 2009
Verónica Becher; Alejandro Deymonnaz; Pablo Ariel Heiber
MOTIVATION: There is significant ongoing research to identify the number and types of repetitive DNA sequences. As more genomes are sequenced, efficiency and scalability in computational tools become mandatory. Existing tools fail to find distant repeats because they can accommodate only segments, not whole chromosomes. Also, a quantitative framework for repetitive elements inside a genome or across genomes is still missing. RESULTS: We present a new efficient algorithm and its implementation as a software tool to compute all perfect repeats in inputs of up to 500 million nucleotide bases, possibly containing many genomes. Our algorithm is based on a suffix array construction and a novel procedure to extract all perfect repeats in the entire input, which can be arbitrarily distant and have no bound on their length. We tested the software on the Homo sapiens DNA genome NCBI 36.49. We computed all perfect repeats of at least 40 bases occurring in any two chromosomes with exact matching. We found that each H. sapiens chromosome shares approximately 10% of its full sequence with every other human chromosome, distributed more or less evenly among the chromosome surfaces. We give statistics including a quantification of repeats by diversity, length and number of occurrences. We compared the computed repeats against all biological repeats currently obtainable from Ensembl, enlarged with the output of the dust program and all elements identified by TRF and RepeatMasker (ftp://ftp.ebi.ac.uk/pub/databases/ensembl/jherrero/.repeats/all_repeats.txt.bz2). We report novel repeats as well as new occurrences of repeats matching known biological elements. AVAILABILITY: The source code, results and visualization of some statistics are accessible from http://kapow.dc.uba.ar/patterns/.
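To make the suffix-array idea concrete, here is a minimal Python sketch. It is not the authors' algorithm or tool: it builds the suffix array by naive sorting (quadratic or worse, so unsuitable for genome-scale input) and only reports the longest shared prefix of each pair of adjacent suffixes, which is a subset of all perfect repeats. The names `common_prefix_len` and `adjacent_repeats` are hypothetical, chosen for this illustration.

```python
from collections import defaultdict


def common_prefix_len(s: str, a: int, b: int) -> int:
    """Length of the longest common prefix of the suffixes s[a:] and s[b:]."""
    n = len(s)
    k = 0
    while a + k < n and b + k < n and s[a + k] == s[b + k]:
        k += 1
    return k


def adjacent_repeats(s: str, min_len: int) -> dict:
    """Map each repeat found between adjacent suffixes to its start positions.

    Assumption: a naive suffix array (sorting suffix start indices by the
    suffix text) is good enough for a small illustrative input.
    """
    order = sorted(range(len(s)), key=lambda i: s[i:])  # naive suffix array
    found = defaultdict(set)
    for a, b in zip(order, order[1:]):
        k = common_prefix_len(s, a, b)
        if k >= min_len:
            # s[a:a+k] occurs exactly at both a and b.
            found[s[a:a + k]].update((a, b))
    return {rep: sorted(pos) for rep, pos in found.items()}


if __name__ == "__main__":
    print(adjacent_repeats("ACGTACGT", 4))
```

Because suffixes that share a prefix are contiguous in sorted order, repeated substrings show up as common prefixes of neighbouring entries, which is why a suffix array is a natural starting point for finding repeats that are arbitrarily far apart in the input.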
Information & Computation | 2013
Verónica Becher; Pablo Ariel Heiber; Theodore A. Slaman
We give an algorithm to compute an absolutely normal number so that the first n digits in its binary expansion are obtained in time polynomial in n; in fact, just above quadratic. The algorithm uses combinatorial tools to control divergence from normality. Speed of computation is achieved at the sacrifice of speed of convergence to normality.
Journal of Computer and System Sciences | 2015
Verónica Becher; Olivier Carton; Pablo Ariel Heiber
We prove that finite-state transducers with injective behavior, deterministic or not, real-time or not, with no extra memory or a single counter, cannot compress any normal word. We exhaust all combinations of determinism, real-time, and additional memory in the form of counters or stacks, identifying which models can compress normal words. The case of deterministic push-down transducers is the only one still open. We also present results on the preservation of normality by selection with finite automata. Complementing Agafonov's theorem for prefix selection, we show that suffix selection preserves normality. However, there are simple two-sided selection rules that do not. We prove that non-real-time 1-counter transducers cannot compress any normal number. We prove that real-time k-counter transducers cannot compress any normal number. We prove that there exist pushdown transducers that can compress a normal number. We prove that normality is preserved by suffix selection by finite automata.
Information Processing Letters | 2011
Verónica Becher; Pablo Ariel Heiber
We give a complete proof of the following theorem: Every de Bruijn sequence of order n in at least three symbols can be extended to a de Bruijn sequence of order n+1. Every de Bruijn sequence of order n in two symbols cannot be extended to order n+1, but it can be extended to order n+2.
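As a reminder of the object involved, the following Python sketch checks the defining property of a de Bruijn sequence. It assumes the cyclic convention (a sequence of length k^n over k symbols, read circularly, containing every word of length n exactly once) and does not implement or test the paper's notion of extension, whose precise definition is not given in the abstract. The function name `is_de_bruijn` is hypothetical.

```python
from itertools import product


def is_de_bruijn(seq: str, alphabet: str, n: int) -> bool:
    """Check whether seq, read cyclically, is a de Bruijn sequence of order n.

    Assumptions: `alphabet` lists distinct single-character symbols, and the
    cyclic convention is used (length must be len(alphabet) ** n).
    """
    k = len(alphabet)
    if len(seq) != k ** n:
        return False
    wrapped = seq + seq[: n - 1]  # append the wrap-around for cyclic windows
    windows = [wrapped[i:i + n] for i in range(len(seq))]
    expected = {"".join(w) for w in product(alphabet, repeat=n)}
    # Every word of length n must appear, and none may appear twice.
    return len(set(windows)) == len(windows) and set(windows) == expected


if __name__ == "__main__":
    print(is_de_bruijn("0011", "01", 2))
    print(is_de_bruijn("001102122", "012", 2))
```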
Theoretical Computer Science | 2013
Verónica Becher; Pablo Ariel Heiber
We give an elementary and direct proof of the following theorem: A real number is normal to a given integer base if, and only if, its expansion in that base is incompressible by lossless finite-state compressors (these are finite automata augmented with an output transition function such that the automata input-output behaviour is injective; they are also known as injective finite-state transducers). As a corollary we obtain V. N. Agafonov's theorem on the preservation of normality on subsequences selected by finite automata.
Mathematics of Computation | 2015
Verónica Becher; Pablo Ariel Heiber; Theodore A. Slaman
Affiliation: Becher, Verónica Andrea. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Information & Computation | 2015
Olivier Carton; Pablo Ariel Heiber
We prove that two-way transducers (both deterministic and non-deterministic) cannot compress normal numbers. To achieve this, we first show that it is possible to generalize compressibility from one-way transducers to two-way transducers. These results extend a known result: normal infinite words are exactly those that cannot be compressed by lossless finite-state transducers, and, more generally, by bounded-to-one non-deterministic finite-state transducers. We also argue that such a generalization cannot be extended to two-way transducers with unbounded memory, even in the simple form of a single counter.
Theoretical Computer Science | 2012
Verónica Becher; Pablo Ariel Heiber
We present a measure of string complexity, called I-complexity, computable in linear time and space. It counts the number of different substrings in a given string. The least complex strings are the runs of a single symbol, the most complex are the de Bruijn strings. Although the I-complexity of a string is not the length of any minimal description of the string, it satisfies many basic properties of classical description complexity. In particular, the number of strings with I-complexity up to a given value is bounded, and most strings of each length have high I-complexity.
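The abstract ties the measure to the number of different substrings of a string. The Python sketch below counts distinct substrings only; it is not the paper's definition of I-complexity and it is not linear-time, since it sorts suffixes naively. It relies on a standard fact: a string of length n has n(n+1)/2 substring occurrences, and the duplicates are exactly the common prefixes of adjacent suffixes in sorted order. The name `distinct_substrings` is hypothetical.

```python
def distinct_substrings(s: str) -> int:
    """Count the distinct non-empty substrings of s.

    Assumption: a naive suffix sort is acceptable for short inputs; this is
    an illustration, not a linear-time implementation.
    """
    n = len(s)
    order = sorted(range(n), key=lambda i: s[i:])  # naive suffix array
    total = n * (n + 1) // 2  # all substring occurrences, with repetition
    for a, b in zip(order, order[1:]):
        # Prefixes shared by adjacent suffixes were already counted once.
        k = 0
        while a + k < n and b + k < n and s[a + k] == s[b + k]:
            k += 1
        total -= k
    return total


if __name__ == "__main__":
    for word in ("aaaa", "abab", "0011"):
        print(word, distinct_substrings(word))
```

On these inputs the count behaves as the abstract suggests for the extremes: the run `aaaa` has the fewest distinct substrings among the three examples, and `0011`, a binary de Bruijn string of order 2 under the cyclic convention, has the most.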
Fundamenta Mathematicae | 2014
Verónica Becher; Pablo Ariel Heiber; Theodore A. Slaman
Discrete Mathematics & Theoretical Computer Science | 2013
Pablo Barenbaum; Verónica Becher; Alejandro Deymonnaz; Melisa Halsband; Pablo Ariel Heiber