Ely Porat | Researchain

Publication

Featured researches published by Ely Porat.

symposium on discrete algorithms | 2000

Faster algorithms for string matching with k mismatches

Amihood Amir; Moshe Lewenstein; Ely Porat

The string matching with mismatches problem is that of finding the number of mismatches between a pattern P of length m and every length-m substring of the text T. Currently, the fastest algorithms for this problem are the following. The Galil-Giancarlo algorithm finds all locations where the pattern has at most k errors (where k is part of the input) in time O(nk). The Abrahamson algorithm finds the number of mismatches at every location in time O(n√(m log m)). We present an algorithm that is faster than both. Our algorithm finds all locations where the pattern has at most k errors in time O(n√(k log k)). We also show an algorithm that solves the above problem in time O((n + (nk^3)/m) log k).
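
To make the problem statement concrete, here is a minimal Python sketch of the quadratic-time baseline: it counts mismatches at every alignment directly from the definition. It is not the algorithm of the paper, and the function names `mismatch_counts` and `k_mismatch_locations` are chosen here for illustration only.

```python
def mismatch_counts(text: str, pattern: str) -> list[int]:
    """Hamming distance between the pattern and every length-m window of the text.

    Runs in O(nm) time; this is only the definition, not a fast algorithm.
    """
    n, m = len(text), len(pattern)
    return [
        sum(1 for j in range(m) if text[i + j] != pattern[j])
        for i in range(n - m + 1)
    ]


def k_mismatch_locations(text: str, pattern: str, k: int) -> list[int]:
    """Start positions where the pattern occurs with at most k mismatches."""
    return [i for i, c in enumerate(mismatch_counts(text, pattern)) if c <= k]
```

The faster bounds quoted in the abstract come from avoiding this per-window character comparison; the sketch is useful mainly as a reference for checking a faster implementation.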

IEEE Transactions on Information Theory | 2011

Explicit Nonadaptive Combinatorial Group Testing Schemes

Ely Porat; Amir Rothschild

Group testing is a long-studied problem in combinatorics: a small set of r ill people should be identified out of the whole population (n people) by using only queries (tests) of the form “Does set X contain an ill human?” In this paper we provide an explicit construction of a testing scheme which is better (smaller) than any known explicit construction. This scheme has Θ(min[r^2 ln n, n]) tests, which is as many as the best nonexplicit schemes have. In our construction, we use a fact that may be of value in its own right: linear error-correction codes with parameters [m, k, δm]_q meeting the Gilbert-Varshamov bound may be constructed quite efficiently, in Θ(q^k m) time.
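
The query model can be illustrated with a small Python sketch of nonadaptive testing and decoding. The decoder uses the standard rule that anyone appearing in a negative test is healthy. The design `bit_tests` is a toy scheme that only handles r = 1 and is an assumption made for this example; it is not the code-based construction described in the abstract.

```python
def run_tests(tests: list[set[int]], ill: set[int]) -> list[bool]:
    """A test is positive exactly when it contains at least one ill person."""
    return [bool(t & ill) for t in tests]


def decode(n: int, tests: list[set[int]], outcomes: list[bool]) -> set[int]:
    """Discard everyone who appears in some negative test; return the rest."""
    candidates = set(range(n))
    for t, positive in zip(tests, outcomes):
        if not positive:
            candidates -= t
    return candidates


def bit_tests(n: int) -> list[set[int]]:
    """Toy design for r = 1: for each bit position, one test per bit value."""
    bits = max(1, (n - 1).bit_length())
    tests = []
    for b in range(bits):
        tests.append({i for i in range(n) if (i >> b) & 1})
        tests.append({i for i in range(n) if not (i >> b) & 1})
    return tests
```

All tests are fixed before any outcome is seen, which is what makes the scheme nonadaptive. With a single ill person, every other person differs from them in some bit and is therefore removed by a negative test.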

symposium on the theory of computing | 2010

Approximate sparse recovery: optimizing time and measurements

Anna C. Gilbert; Yi Li; Ely Porat; M. Strauss

A Euclidean approximate sparse recovery system consists of parameters k, N, an m-by-N measurement matrix Φ, and a decoding algorithm D. Given a vector x, the system approximates x by x̂ = D(Φx), which must satisfy ||x̂ - x||_2 ≤ C ||x - x_k||_2, where x_k denotes the optimal k-term approximation to x. (The output x̂ may have more than k terms.) For each vector x, the system must succeed with probability at least 3/4. Among the goals in designing such systems are minimizing the number m of measurements and the runtime of the decoding algorithm D. In this paper, we give a system with m = O(k log(N/k)) measurements (matching a lower bound up to a constant factor) and decoding time k log^(O(1)) N, matching a lower bound up to log(N) factors. We also consider the encode time (i.e., the time to multiply Φ by x), the time to update measurements (i.e., the time to multiply Φ by a 1-sparse x), and the robustness and stability of the algorithm (adding noise before and after the measurements). Our encode and update times are optimal up to log(k) factors. The columns of Φ have at most O(log^2(k) log(N/k)) non-zeros, each of which can be found in constant time. Our full result, an FPRAS, is as follows. If x = x_k + ν_1, where ν_1 and ν_2 (below) are arbitrary vectors (regarded as noise), then, setting x̂ = D(Φx + ν_2), and for properly normalized ν, we get ||x̂ - x||_2^2 ≤ (1+ε)||ν_1||_2^2 + ε||ν_2||_2^2, using O((k/ε) log(N/k)) measurements and (k/ε) log^(O(1))(N) time for decoding.

foundations of computer science | 2009

Exact and Approximate Pattern Matching in the Streaming Model

Benny Porat; Ely Porat

We present a fully online randomized algorithm for the classical pattern matching problem that uses merely O(log m) space, breaking the O(m) barrier that held for this problem for a long time. Our method can be used as a tool in many practical applications, including monitoring Internet traffic and firewall applications. In our online model we first receive the pattern P of size m and preprocess it. After the preprocessing phase, the characters of the text T of size n arrive one at a time in an online fashion. For each index of the text input we indicate whether the pattern matches the text at that index or not. Clearly, for index i, an indication can only be given once all characters from index i to index i+m-1 have arrived. Our goal is to provide such answers while using minimal space, and while spending as little time as possible on each character (time and space which are in O(poly(log n))). We present an algorithm whereby both false positive and false negative answers are allowed with probability of at most 1/n^3. Thus, overall, the correct answer for all positions is returned with probability at least 1 - 1/n^2. The time which our algorithm spends on each input character is bounded by O(log m), and the space complexity is O(log m) words. We also present a solution in the same model for the pattern matching with k mismatches problem. In this problem, a match means allowing up to k symbol mismatches between the pattern and the subtext beginning at index i. We provide an algorithm in which the time spent on each character is bounded by O(k^2*poly(log m)), and the space complexity is O(k^3*poly(log m)) words.
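
The online model above can be sketched with a rolling Karp-Rabin fingerprint, shown here in Python. This is a simplified illustration and not the algorithm of the paper: it keeps the last m characters in a buffer, so it uses O(m) space rather than O(log m), and it can only err with false positives. The modulus, the `base` parameter, and the name `stream_matches` are assumptions made for this sketch.

```python
import random
from collections import deque

MOD = (1 << 61) - 1  # a Mersenne prime; an arbitrary choice for this sketch


def stream_matches(pattern, text_stream, base=None):
    """Yield start indices whose window fingerprint equals the pattern's.

    Characters of the text are consumed one at a time, as in the online model.
    """
    m = len(pattern)
    if m == 0:
        return
    if base is None:
        base = random.randrange(256, MOD)  # random base gives the probabilistic guarantee
    top = pow(base, m - 1, MOD)  # weight of the oldest character in the window

    target = 0
    for ch in pattern:
        target = (target * base + ord(ch)) % MOD

    window = deque()  # the O(m) buffer that the paper's algorithm avoids
    h = 0
    for pos, ch in enumerate(text_stream):
        if len(window) == m:
            # Slide the window: remove the contribution of the oldest character.
            h = (h - ord(window.popleft()) * top) % MOD
        window.append(ch)
        h = (h * base + ord(ch)) % MOD
        if len(window) == m and h == target:
            yield pos - m + 1
```

An answer for index i is produced only after character i+m-1 has arrived, matching the constraint stated in the abstract. Removing the character buffer is exactly the part that requires the paper's additional ideas.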

Information and Computation archive | 2003

Overlap matching

Amihood Amir; Richard Cole; Ramesh Hariharan; Moshe Lewenstein; Ely Porat

We propose a new paradigm for string matching, namely structural matching. In structural matching, the text and pattern contents are not important. Rather, some areas in the text and patterns are singled out, say intervals. A “match” is a text location where a specified relation between the text and pattern areas is satisfied. In particular we define the structural matching problem of Overlap (Parity) Matching. We seek the text locations where all overlaps of the given pattern and text intervals have even length. We show that this problem can be solved in time O(n log m), where the text length is n and the pattern length is m. As an application of overlap matching, we show how to reduce the String Matching with Swaps problem to the overlap matching problem. The String Matching with Swaps problem is the problem of string matching in the presence of local swaps. The best known deterministic upper bound for this problem was O(n m^(1/3) log m log σ) for a general alphabet Σ, where σ = min(m, |Σ|). Our reduction provides a solution to the pattern matching with swaps problem in time O(n log m log σ).

international colloquium on automata languages and programming | 2008

Explicit Non-adaptive Combinatorial Group Testing Schemes

Ely Porat; Amir Rothschild

Group testing is a long-studied problem in combinatorics: a small set of r ill people should be identified out of the whole population (n people) by using only queries (tests) of the form “Does set X contain an ill human?” In this paper we provide an explicit construction of a testing scheme which is better (smaller) than any known explicit construction. This scheme has Θ(min[r^2 ln n, n]) tests, which is as many as the best nonexplicit schemes have. In our construction, we use a fact that may be of value in its own right: linear error-correction codes with parameters [m, k, δm]_q meeting the Gilbert-Varshamov bound may be constructed quite efficiently, in Θ(q^k m) time.

symposium on the theory of computing | 2011

Fast moment estimation in data streams in optimal space

Daniel M. Kane; Jelani Nelson; Ely Porat; David P. Woodruff

We give a space-optimal streaming algorithm with update time O(log^2(1/ε) log log(1/ε)) for approximating the pth frequency moment, 0 < p < 2, of a length-n vector updated in a data stream up to a factor of 1 ± ε. This provides a nearly exponential improvement over the previous space-optimal algorithm of [Kane-Nelson-Woodruff, SODA 2010], which had update time Ω(1/ε^2). When combined with the work of [Harvey-Nelson-Onak, FOCS 2008], we also obtain the first algorithm for entropy estimation in turnstile streams which simultaneously achieves near-optimal space and fast update time.

international colloquium on automata languages and programming | 2003

Function matching: algorithms, applications, and a lower bound

Amihood Amir; Yonatan Aumann; Richard Cole; Moshe Lewenstein; Ely Porat

We introduce a new matching criterion - function matching - that captures several different applications. The function matching problem has as its input a text T of length n over alphabet Σ_T and a pattern P = P[1]P[2] ... P[m] of length m over alphabet Σ_P. We seek all text locations i for which, for some function f : Σ_P → Σ_T (f may also depend on i), the m-length substring that starts at i is equal to f(P[1])f(P[2]) ... f(P[m]). We give a randomized algorithm which, for any given constant k, solves the function matching problem in time O(n log n) with probability 1/n^k of declaring a false positive. We give a deterministic algorithm whose time is O(n|Σ_P| log m) and show that it is almost optimal in the newly formalized convolutions model. Finally, a variant of the third problem is solved by means of two-dimensional parameterized matching, for which we also give an efficient algorithm.
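
The matching criterion can be checked directly from its definition, as in the following Python sketch. It is a straightforward O(nm) baseline, not either of the algorithms from the paper, and the name `function_matches` is chosen here for illustration.

```python
def function_matches(text: str, pattern: str) -> list[int]:
    """Start positions i where some function f maps the pattern onto text[i:i+m].

    f is rebuilt for every position and need not be injective.
    """
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        f = {}  # partial function from pattern symbols to text symbols
        # setdefault records the first image of each pattern symbol and
        # returns the recorded image on later occurrences.
        if all(f.setdefault(p, t) == t for p, t in zip(pattern, text[i:i + m])):
            hits.append(i)
    return hits
```

For the pattern "aba" and the text "xyxxx", position 2 is a match even though both pattern symbols map to "x"; this is the difference from parameterized matching, where the mapping must be one-to-one.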

computer science symposium in russia | 2009

An Optimal Bloom Filter Replacement Based on Matrix Solving

Ely Porat

We suggest a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom filters. The space requirements of the dictionary we suggest are much smaller than those of a hash table. We allow storing n keys, each mapped to a value which is a string of k bits. Our suggested method requires nk + o(n) bits of space to store the dictionary and O(n) time to produce the data structure, and allows answering a membership query in O(1) memory probes. The dictionary size does not depend on the size of the keys. However, reducing the space requirements of the data structure comes at a certain cost. Our dictionary has a small probability of a one-sided error. When attempting to obtain the value for a key that is stored in the dictionary we always get the correct answer. However, when testing for membership of an element that is not stored in the dictionary, we may get an incorrect answer, and when requesting the value of such an element we may get a certain random value. Our method is based on solving equations in GF(2^k) and using several hash functions. Another significant advantage of our suggested method is that we do not require sophisticated hash functions; we only require pairwise independent hash functions. We also suggest a data structure that requires only nk bits of space, has O(n^2) preprocessing time, and has O(log n) query time. However, this data structure requires uniform hash functions. In order to replace a Bloom filter of n elements with an error probability of 2^(-k), we require nk + o(n) memory bits, O(1) query time, O(n) preprocessing time, and only pairwise independent hash functions. Even the most advanced previously known Bloom filter would require nk + O(n) space and uniform hash functions, so our method is significantly less space consuming, especially when k is small. Our suggested dictionary can replace Bloom filters, and has many applications. A few application examples are dictionaries for storing bad passwords, differential files in databases, Internet caching and distributed storage systems.
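
The idea of storing values as solutions of linear equations can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not from the paper: each key is hashed to three table cells with BLAKE2b, coefficients are 0/1 so that a lookup is the XOR of three cells, the table has about 1.3n cells, and construction uses plain Gaussian elimination rather than a linear-time method. The names `build`, `lookup`, and `_positions` are hypothetical.

```python
import hashlib


def _positions(key: str, seed: int, size: int) -> list[int]:
    """Three table cells for a key (hash choice is an assumption of this sketch)."""
    digest = hashlib.blake2b(
        key.encode(), digest_size=12, salt=seed.to_bytes(8, "little")
    ).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "little") % size for i in range(3)]


def build(mapping: dict[str, int], slack: float = 1.3, max_tries: int = 50):
    """Find a table such that XOR of each key's cells equals its value."""
    size = int(slack * len(mapping)) + 3
    for seed in range(max_tries):
        pivots = {}  # highest set bit -> (row bitmask, right-hand side)
        consistent = True
        for key, value in mapping.items():
            row = 0
            for p in _positions(key, seed, size):
                row ^= 1 << p  # repeated cells cancel, as they do in lookup
            while row:
                top = row.bit_length() - 1
                if top not in pivots:
                    pivots[top] = (row, value)
                    break
                prow, pval = pivots[top]
                row ^= prow
                value ^= pval
            else:
                # Row reduced to zero: solvable only if the value also cancelled.
                if value != 0:
                    consistent = False
                    break
        if not consistent:
            continue  # retry with different hash functions
        table = [0] * size  # free variables stay 0
        for top in sorted(pivots):  # lower cells are fixed before higher ones
            row, value = pivots[top]
            rest = row ^ (1 << top)
            while rest:
                low = rest & -rest
                value ^= table[low.bit_length() - 1]
                rest ^= low
            table[top] = value
        return seed, table
    raise RuntimeError("no solvable system found; increase slack or max_tries")


def lookup(key: str, seed: int, table: list[int]) -> int:
    """Value for a stored key; an arbitrary value for any other key."""
    value = 0
    for p in _positions(key, seed, len(table)):
        value ^= table[p]
    return value
```

The table stores no keys, so its size is independent of key length, and a key that was never inserted simply yields whatever the XOR of its three cells happens to be. Supporting membership queries with a bounded error, as the abstract describes, would require storing a fingerprint as part of the value; that step is omitted here.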

symposium on discrete algorithms | 2006

Pattern matching with address errors: rearrangement distances

Amihood Amir; Yonatan Aumann; Gary Benson; Avivit Levy; Ohad Lipsky; Ely Porat; Steven Skiena; Uzi Vishne

Historically, approximate pattern matching has mainly focused on coping with errors in the data, while the order of the text/pattern was assumed to be more or less correct. In this paper we consider a class of pattern matching problems where the content is assumed to be correct, while the locations may have shifted/changed. We formally define a broad class of problems of this type, capturing situations in which the pattern is obtained from the text by a sequence of rearrangements. We consider several natural rearrangement schemes, including the analogues of the ℓ_1 and ℓ_2 distances, as well as two distances based on interchanges. For these, we present efficient algorithms to solve the resulting string matching problems.
