Alexandr Andoni | Researchain

Publication

Featured research published by Alexandr Andoni.

Communications of the ACM | 2008

Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions

Alexandr Andoni; Piotr Indyk

We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of $O(dn^{1/c^2+o(1)})$ and space $O(dn + n^{1+1/c^2+o(1)})$. This almost matches the lower bound for hashing-based algorithm recently obtained in (R. Motwani et al., 2006). We also obtain a space-efficient version of the algorithm, which uses $dn + n\log^{O(1)} n$ space, with a query time of $dn^{O(1/c^2)}$. Finally, we discuss practical variants of the algorithms that utilize fast bounded-distance decoders for the Leech lattice.

Foundations of Computer Science | 2006

Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions

Alexandr Andoni; Piotr Indyk

We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of $O(dn^{1/c^2+o(1)})$ and space $O(dn + n^{1+1/c^2+o(1)})$. This almost matches the lower bound for hashing-based algorithm recently obtained in [27]. We also obtain a space-efficient version of the algorithm, which uses $dn + n\log^{O(1)} n$ space, with a query time of $dn^{O(1/c^2)}$. Finally, we discuss practical variants of the algorithms that utilize fast bounded-distance decoders for the Leech Lattice.

Symposium on the Theory of Computing | 2015

Optimal Data-Dependent Hashing for Approximate Near Neighbors

Alexandr Andoni; Ilya P. Razenshteyn

We show an optimal data-dependent hashing scheme for the approximate near neighbor problem. For an n-point dataset in a d-dimensional space our data structure achieves query time $O(d \cdot n^{\rho+o(1)})$ and space $O(n^{1+\rho+o(1)} + d \cdot n)$, where $\rho = 1/(2c^2-1)$ for the Euclidean space and approximation $c > 1$. For the Hamming space, we obtain an exponent of $\rho = 1/(2c-1)$. Our result completes the direction set forth in (Andoni, Indyk, Nguyen, Razenshteyn 2014) who gave a proof-of-concept that data-dependent hashing can outperform classic Locality Sensitive Hashing (LSH). In contrast to (Andoni, Indyk, Nguyen, Razenshteyn 2014), the new bound is not only optimal, but in fact improves over the best (optimal) LSH data structures (Indyk, Motwani 1998) (Andoni, Indyk 2006) for all approximation factors $c > 1$. From the technical perspective, we proceed by decomposing an arbitrary dataset into several subsets that are, in a certain sense, pseudo-random.
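
To make the size of the improvement concrete, the following minimal sketch tabulates the exponent ρ from this abstract against the classic LSH exponents: 1/c² for Euclidean space, as in the Andoni and Indyk bound listed above, and 1/c for Hamming space, the standard Indyk and Motwani exponent, which is not stated on this page. The function names are hypothetical and chosen only for illustration; the code evaluates the formulas and does not implement any hashing scheme.

```python
def lsh_exponent(c, space="euclidean"):
    """Classic (data-independent) LSH exponent rho for approximation c > 1.

    Assumption: 1/c^2 for Euclidean space and 1/c for Hamming space.
    """
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    return 1.0 / c**2 if space == "euclidean" else 1.0 / c


def data_dependent_exponent(c, space="euclidean"):
    """Data-dependent exponent rho as stated in the abstract above."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    if space == "euclidean":
        return 1.0 / (2.0 * c**2 - 1.0)
    return 1.0 / (2.0 * c - 1.0)


if __name__ == "__main__":
    # Print both exponents side by side for a few approximation factors.
    for space in ("euclidean", "hamming"):
        for c in (1.5, 2.0, 3.0):
            classic = lsh_exponent(c, space)
            improved = data_dependent_exponent(c, space)
            print(f"{space:9s} c={c}: LSH rho={classic:.4f}, "
                  f"data-dependent rho={improved:.4f}")
```

For every c > 1 the data-dependent exponent is strictly smaller, since 2c² − 1 > c² and 2c − 1 > c.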

Symposium on the Theory of Computing | 2007

Testing k-wise and almost k-wise independence

Noga Alon; Alexandr Andoni; Tali Kaufman; Kevin Matulef; Ronitt Rubinfeld; Ning Xie

In this work, we consider the problems of testing whether a distribution over $\{0,1\}^n$ is k-wise (resp. (ε,k)-wise) independent using samples drawn from that distribution. For the problem of distinguishing k-wise independent distributions from those that are δ-far from k-wise independence in statistical distance, we upper bound the number of required samples by $\tilde{O}(n^k/\delta^2)$ and lower bound it by $\Omega(n^{(k-1)/2}/\delta)$ (these bounds hold for constant k, and essentially the same bounds hold for general k). To achieve these bounds, we use Fourier analysis to relate a distribution's distance from k-wise independence to its biases, a measure of the parity imbalance it induces on a set of variables. The relationships we derive are tighter than previously known, and may be of independent interest. To distinguish (ε,k)-wise independent distributions from those that are δ-far from (ε,k)-wise independence in statistical distance, we upper bound the number of required samples by $O(k \log n / (\delta^2 \varepsilon^2))$ and lower bound it by $\Omega\left(\frac{\sqrt{k \log n}}{2^k(\varepsilon+\delta)\sqrt{\log\left(1/(2^k(\varepsilon+\delta))\right)}}\right)$. Although these bounds are an exponential improvement (in terms of n and k) over the corresponding bounds for testing k-wise independence, we give evidence that the time complexity of testing (ε,k)-wise independence is unlikely to be poly(n,1/ε,1/δ) for k = Θ(log n), since this would disprove a plausible conjecture concerning the hardness of finding hidden cliques in random graphs. Under the conjecture, our result implies that for, say, k = log n and $\varepsilon = 1/n^{0.99}$, there is a set of (ε,k)-wise independent distributions, and a set of distributions at distance $\delta = 1/n^{0.51}$ from (ε,k)-wise independence, which are indistinguishable by polynomial time algorithms.
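
The notion of bias mentioned in this abstract can be illustrated with a small sketch. It assumes the usual Fourier-analytic definition: the bias of a set S of coordinates is the expectation of (−1) raised to the parity of the bits indexed by S, estimated here as an empirical average over samples. For a uniform k-wise independent distribution, every non-empty set of at most k coordinates has bias zero. The function names are hypothetical, and this is not the tester from the paper, only a brute-force computation of the quantity it reasons about.

```python
from itertools import combinations


def empirical_bias(samples, subset):
    """Average of (-1)^(parity of the bits in `subset`) over the samples."""
    total = 0
    for x in samples:
        parity = sum(x[i] for i in subset) % 2
        total += 1 if parity == 0 else -1
    return total / len(samples)


def max_bias(samples, k):
    """Largest absolute empirical bias over non-empty subsets of size <= k.

    Enumerates all subsets, so the cost grows roughly like n^k;
    intended for tiny illustrative inputs only.
    """
    n = len(samples[0])
    return max(
        abs(empirical_bias(samples, subset))
        for size in range(1, k + 1)
        for subset in combinations(range(n), size)
    )


if __name__ == "__main__":
    # Two perfectly correlated bits: each bit alone looks uniform,
    # but the pair has maximal parity imbalance.
    correlated = [(0, 0), (1, 1)]
    print(max_bias(correlated, 1), max_bias(correlated, 2))
```

In the example, the correlated pair passes a 1-wise check (each bit is unbiased) but fails the 2-wise check, which is exactly the kind of imbalance the biases measure.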

Foundations of Computer Science | 2006

On the Optimality of the Dimensionality Reduction Method

Alexandr Andoni; Piotr Indyk; Mihai Patrascu

We investigate the optimality of $(1+\varepsilon)$-approximation algorithms obtained via the dimensionality reduction method. We show that: any data structure for the $(1+\varepsilon)$-approximate nearest neighbor problem in Hamming space, which uses constant number of probes to answer each query, must use $n^{\Omega(1/\varepsilon^2)}$ space; any algorithm for the $(1+\varepsilon)$-approximate closest substring problem must run in time exponential in $1/\varepsilon^{2-\gamma}$ for any $\gamma > 0$ (unless 3SAT can be solved in sub-exponential time). Both lower bounds are (essentially) tight.

Symposium on the Theory of Computing | 2009

Approximating edit distance in near-linear time

Alexandr Andoni; Krzysztof Onak

We show how to compute the edit distance between two strings of length n up to a factor of $2^{\tilde{O}(\sqrt{\log n})}$ in $n^{1+o(1)}$ time. This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time, improving on the state-of-the-art $n^{1/3+o(1)}$ approximation. Previously, approximation of $2^{\tilde{O}(\sqrt{\log n})}$ was known only for embedding edit distance into $\ell_1$, and it is not known if that embedding can be computed in less than quadratic time.
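
For context on the quantity being approximated, here is a minimal sketch of the textbook dynamic program that computes edit distance exactly in quadratic time. It is the classic baseline, not the near-linear-time approximation algorithm described in the abstract, and the function name is chosen only for illustration.

```python
def edit_distance(a, b):
    """Exact edit (Levenshtein) distance via the textbook O(len(a) * len(b)) DP.

    Allowed operations: insert, delete, or substitute one character.
    Only two rows of the DP table are kept, so memory is O(len(b)).
    """
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

The quadratic cost of this table is what makes near-linear-time approximation attractive for long strings.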

Foundations of Computer Science | 2009

Efficient Sketches for Earth-Mover Distance, with Applications

Alexandr Andoni; Khanh Do Ba; Piotr Indyk; David P. Woodruff

We provide the first sub-linear sketching algorithm for estimating the planar Earth-Mover Distance with a constant approximation. For sets living in the two-dimensional grid

Symposium on Discrete Algorithms | 2006