Gad M. Landau
University of Haifa
Publication
Featured research published by Gad M. Landau.
Journal of Algorithms | 1989
Gad M. Landau; Uzi Vishkin
Consider the string matching problem, where differences between characters of the pattern and characters of the text are allowed. Each difference is due to either a mismatch between a character of the text and a character of the pattern, or a superfluous character in the text, or a superfluous character in the pattern. Given a text of length n, a pattern of length m and an integer k, we present parallel and serial algorithms for finding all occurrences of the pattern in the text with at most k differences. The parallel algorithm requires O(log m + k) time using n processors. The serial algorithm runs in O(nk) time for an alphabet whose size is fixed.
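To make the "k differences" problem concrete, here is a minimal Python sketch. It is not the O(nk) algorithm of the paper; it is the classical O(nm) dynamic program, swapped in because it is short. The function name `k_differences_end_positions` is hypothetical and chosen only for illustration.

```python
def k_differences_end_positions(text, pattern, k):
    """Return 0-based end indices j such that some substring of `text`
    ending at j matches `pattern` with at most k differences.

    Classical O(n*m) dynamic program, NOT the O(nk) method of the paper.
    """
    m = len(pattern)
    # col[i] = fewest differences between pattern[:i] and some substring
    # of text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, tc in enumerate(text):
        prev_diag = col[0]   # value for (i-1, j-1)
        col[0] = 0           # an occurrence may start at any text position
        for i in range(1, m + 1):
            cur = col[i]     # value for (i, j-1)
            cost = 0 if pattern[i - 1] == tc else 1
            col[i] = min(
                prev_diag + cost,  # match or mismatch
                cur + 1,           # superfluous character in the text
                col[i - 1] + 1,    # superfluous character in the pattern
            )
            prev_diag = cur
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(k_differences_end_positions("abcdefg", "bxd", 1))
```

The sketch reports end positions only; recovering the start of each occurrence would require keeping the full table or tracing back.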
Algorithmica | 1988
Alberto Apostolico; Costas S. Iliopoulos; Gad M. Landau; Baruch Schieber; Uzi Vishkin
Many string manipulations can be performed efficiently on suffix trees. In this paper a CRCW parallel RAM algorithm is presented that constructs the suffix tree associated with a string of n symbols in O(log n) time with n processors. The algorithm requires Θ(n^2) space. However, the space needed can be reduced to O(n^(1+ε)) for any 0 < ε ≤ 1, with a corresponding slow-down proportional to 1/ε. Efficient parallel procedures are also given for some string problems that can be solved with suffix trees.
Theoretical Computer Science | 1986
Gad M. Landau; Uzi Vishkin
Given a text of length n, a pattern of length m, and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k mismatches. The algorithm runs in O(k(m log m + n)) time.
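As an illustration of the "k mismatches" problem statement only, the following Python sketch checks every alignment directly. It runs in O(nm) time in the worst case and is not the O(k(m log m + n)) algorithm of the paper; the function name `k_mismatch_positions` is an assumption made for this example.

```python
def k_mismatch_positions(text, pattern, k):
    """Return 0-based start indices where `pattern` aligns with `text`
    with at most k mismatching characters (Hamming distance <= k).

    Brute-force baseline, not the algorithm described in the abstract.
    """
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        mismatches = 0
        for i in range(m):
            if text[start + i] != pattern[i]:
                mismatches += 1
                if mismatches > k:
                    break  # this alignment already exceeds the budget
        if mismatches <= k:
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(k_mismatch_positions("abcabd", "abd", 1))
```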
Foundations of Computer Science | 1988
Gad M. Landau; Uzi Vishkin
Consider the string matching problem where differences between characters of the pattern and characters of the text are allowed. Each difference is due to either a mismatch between a character of the text and a character of the pattern or a superfluous character in the text or a superfluous character in the pattern. Given a text of length n, a pattern of length m, and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k differences. It runs in O(m + nk^2) time for an alphabet whose size is fixed. For general input the algorithm requires O(m log m + nk^2) time. In both cases the space requirement is O(m).
Symposium on the Theory of Computing | 1986
Gad M. Landau; Uzi Vishkin
Consider the string matching problem, where differences between characters of the pattern and characters of the text are allowed. Each difference is due to either a mismatch between a character of the text and a character of the pattern or a superfluous character in the text or a superfluous character in the pattern. Given a text of length n, a pattern of length m and an integer k, we present parallel and serial algorithms for finding all occurrences of the pattern in the text with at most k differences. The first part of the parallel algorithm consists of analysis of the pattern and takes O(log m) time using m^2 processors. The rest of the algorithm consists of handling the text. The text han… Footnote 1: The research of this author was supported by NSF grants NSF-DCR-8318874 and NSF-DCR-8413359 and ONR grant …
Combinatorial Pattern Matching | 1993
Gad M. Landau; Jeanette P. Schmidt
A perfect tandem repeat within a string S is a substring r = r_1 ... r_{2l} of S for which r_1 ... r_l = r_{l+1} ... r_{2l}. An approximate tandem repeat is a substring r = r_1 ... r_{l′} ... r_l for which r_1 ... r_{l′} and r_{l′+1} ... r_l are similar. In this paper we consider two criteria of similarity: the Hamming distance (k mismatches) and the edit distance (k differences). For a string S of length n and an integer k, our algorithm reports all locally optimal approximate repeats r = ūu for which the Hamming distance of ū and u is at most k in O(nk log(n/k)) time, or all those for which the edit distance of ū and u is at most k in O(nk log k log n) time.
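The definition of an approximate tandem repeat under the Hamming criterion can be illustrated with a brute-force Python sketch. It assumes the two halves have equal length, enumerates every candidate rather than only the locally optimal ones, and takes roughly cubic time, so it is not the O(nk log(n/k)) algorithm of the paper. The name `approximate_tandem_repeats` and the `min_half` parameter are hypothetical.

```python
def approximate_tandem_repeats(s, k, min_half=1):
    """Return (start, half_length, distance) for every substring
    s[start:start + 2*half_length] whose two halves differ in at most
    k positions. Brute force, for illustration only.
    """
    n = len(s)
    found = []
    for half in range(min_half, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            left = s[start:start + half]
            right = s[start + half:start + 2 * half]
            # Hamming distance between the two equal-length halves
            dist = sum(a != b for a, b in zip(left, right))
            if dist <= k:
                found.append((start, half, dist))
    return found


if __name__ == "__main__":
    print(approximate_tandem_repeats("abcabd", 1, min_half=3))
```

With k = 0 the same function lists perfect tandem repeats.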
Bioinformatics | 2002
Olga G. Troyanskaya; Ora Arbell; Yair Koren; Gad M. Landau; Alexander Bolshoy
MOTIVATION: One of the major features of genomic DNA sequences, distinguishing them from texts in most spoken or artificial languages, is their high repetitiveness. Variation in the repetitiveness of genomic texts reflects the presence and density of different biologically important messages. Thus, deviation from an expected number of repeats in both directions indicates a possible presence of a biological signal. Linguistic complexity corresponds to repetitiveness of a genomic text, and potential regulatory sites may be discovered through construction of typical patterns of complexity distribution. RESULTS: We developed software for fast calculation of linguistic sequence complexity of DNA sequences. Our program utilizes suffix trees to compute the number of subwords present in genomic sequences, thereby allowing calculation of linguistic complexity in time linear in genome size. The measure of linguistic complexity was applied to the complete genome of Haemophilus influenzae. Maps of complexity along the entire genome were obtained using sliding windows of 40, 100, and 2000 nucleotides. This approach provided an efficient way to detect simple sequence repeats in this genome. In addition, local profiles of complexity distribution around the starts of translation were constructed for 21 complete prokaryotic genomes. We hypothesize that complexity profiles correspond to evolutionary relationships between organisms. We found principal differences in profiles of the GC-rich and other (non-GC-rich) genomes. We also found characteristic differences in profiles of AT genomes, which probably reflect individual species variations in translational regulation. AVAILABILITY: The program is available upon request from Alexander Bolshoy or at http://csweb.haifa.ac.il/library/#complex.
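A small Python sketch may help fix the idea of counting subwords. It assumes one common definition of linguistic complexity, namely the number of distinct subwords observed divided by the maximum number possible for a sequence of the same length over the same alphabet; the abstract does not spell out the formula, so treat this as an assumption. It also uses plain sets instead of suffix trees, so it is quadratic rather than linear and suits short windows only. The function name is hypothetical.

```python
def linguistic_complexity(seq, alphabet_size=4):
    """Ratio of distinct subwords in `seq` to the maximum possible count.

    Assumed definition; set-based counting, not the suffix-tree method
    described in the abstract.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    observed = 0
    maximum = 0
    for w in range(1, n + 1):
        # distinct subwords of length w that actually occur
        observed += len({seq[i:i + w] for i in range(n - w + 1)})
        # at most alphabet_size**w words exist, and at most n-w+1 fit
        maximum += min(alphabet_size ** w, n - w + 1)
    return observed / maximum


if __name__ == "__main__":
    for window in ("ACGT", "AAAA"):
        print(window, linguistic_complexity(window))
```

A repetitive window such as "AAAA" scores lower than a window with no repeated subwords, which is the behaviour the sliding-window maps rely on.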
Information Processing Letters | 1993
Vincent A. Fischetti; Gad M. Landau; Peter H. Sellers; Jeanette P. Schmidt
Consider a template P of size m in which each character matches many different characters with various degrees of perfection. Given a text T of size n, we present a simple and practical algorithm that finds the substring of T which best matches some substring of P^n (P^n is the concatenation of an arbitrary number of copies of P). The algorithm produces the matched pair and their alignment in O(mn) time.
Foundations of Computer Science | 1985
Gad M. Landau; Uzi Vishkin
Consider the string matching problem where differences between characters of the pattern and characters of the text are allowed. Each difference is due to either a mismatch between a character of the text and a character of the pattern or a superfluous character in the text or a superfluous character in the pattern. Given a text of length n, a pattern of length m and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k differences. The algorithm runs in O(m^2 + k^2 n) time. Given the same input we also present an algorithm for finding all occurrences of the pattern in the text, each with at most k mismatches (superfluous characters in either the text or the pattern are not allowed). This algorithm runs in O(k(m log m + n)) time.
Journal of Algorithms | 2000
Amihood Amir; Dmitry Keselman; Gad M. Landau; Moshe Lewenstein; Noa Lewenstein; Michael Rodeh
The indexing problem is where a text is preprocessed and subsequent queries of the form “Find all occurrences of pattern P in the text” are answered in time proportional to the length of the query and the number of occurrences. In the dictionary matching problem a set of patterns is preprocessed and subsequent queries of the form “Find all occurrences of dictionary patterns in text T” are answered in time proportional to the length of the text and the number of occurrences. There exist efficient worst-case solutions for the indexing problem and the dictionary matching problem, but none that find approximate occurrences of the patterns, i.e., where the pattern is within a bound edit (or Hamming) distance from the appropriate text location. In this paper we present a uniform deterministic solution to both the indexing and the general dictionary matching problem with one error. We preprocess the data in time O(n log^2 n), where n is the text size in the indexing problem and the dictionary size in the dictionary matching problem. Our query time for the indexing problem is O(m log n log log n + tocc), where m is the query string size and tocc is the number of occurrences. Our query time for the dictionary matching problem is O(n log^3 d log log d + tocc), where n is the text size and d the dictionary size. The time bounds above apply to both bounded and unbounded alphabets.