Piotr Indyk

Massachusetts Institute of Technology

Publication

Featured research published by Piotr Indyk.

Communications of The ACM | 2008

Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions

Alexandr Andoni; Piotr Indyk

We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of $O(dn^{1/c^2+o(1)})$ and space $O(dn + n^{1+1/c^2+o(1)})$. This almost matches the lower bound for hashing-based algorithms recently obtained in (R. Motwani et al., 2006). We also obtain a space-efficient version of the algorithm, which uses $dn + n\log^{O(1)} n$ space, with a query time of $dn^{O(1/c^2)}$. Finally, we discuss practical variants of the algorithms that utilize fast bounded-distance decoders for the Leech lattice.
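
As a rough illustration of the hashing idea behind the abstract above, the following Python sketch builds a small index from quantized random projections. It is a simpler, standard locality-sensitive hashing scheme for Euclidean distance, not the near-optimal construction or the Leech lattice variant the paper describes. The class name `EuclideanLSH` and the parameters `k`, `num_tables`, and `width` are assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Toy LSH index (illustrative, not the paper's algorithm).

    Each table hashes a point with k functions of the form
    floor((a . p + b) / width), where a is a random Gaussian vector.
    """

    def __init__(self, dim, k=4, num_tables=8, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.points = []
        self.tables = []
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                for _ in range(k)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, p):
        # Concatenate the k quantized projections into one bucket key.
        return tuple(
            math.floor((sum(ai * pi for ai, pi in zip(a, p)) + b) / self.width)
            for a, b in funcs
        )

    def add(self, p):
        idx = len(self.points)
        self.points.append(p)
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, p)].append(idx)
        return idx

    def query(self, q):
        # Collect points that share a bucket with q in any table,
        # then return the index of the closest candidate (or None).
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), ()))
        if not candidates:
            return None
        return min(candidates, key=lambda i: math.dist(self.points[i], q))
```

Querying with a point that was already added returns that point's own index, since it lands in the same bucket in every table. For other queries the result is only a candidate near neighbor, and `query` may return `None` when no stored point shares a bucket with the query.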

international conference on management of data | 1998

Enhanced hypertext categorization using hyperlinks

Soumen Chakrabarti; Byron Dom; Piotr Indyk

A major challenge in indexing unstructured hypertext databases is to automatically extract meta-data that enables structured search using topic taxonomies, circumvents keyword ambiguity, and improves the quality of search and profile-based routing and filtering. Therefore, an accurate classifier is an essential component of a hypertext database. Hyperlinks pose new problems not addressed in the extensive text classification literature. Links clearly contain high-quality semantic clues that are lost upon a purely term-based classifier, but exploiting link information is non-trivial because it is noisy. Naive use of terms in the link neighborhood of a document can even degrade accuracy. Our contribution is to propose robust statistical models and a relaxation labeling technique for better classification by exploiting link information in a small neighborhood around documents. Our technique also adapts gracefully to the fraction of neighboring documents having known topics. We experimented with pre-classified samples from Yahoo! and the US Patent Database. In previous work, we developed a text classifier that misclassified only 13% of the documents in the well-known Reuters benchmark; this was comparable to the best results ever obtained. This classifier misclassified 36% of the patents, indicating that classifying hypertext can be more difficult than classifying text. Naively using terms in neighboring documents increased error to 38%; our hypertext classifier reduced it to 21%. Results with the Yahoo! sample were more dramatic: the text classifier showed 68% error, whereas our hypertext classifier reduced this to only 21%.

SIAM Journal on Computing | 2002

Maintaining Stream Statistics over Sliding Windows

Mayur Datar; Aristides Gionis; Piotr Indyk; Rajeev Motwani

We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model. We consider the following basic problem: Given a stream of bits, maintain a count of the number of 1s in the last N elements seen from the stream. We show that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, we can estimate the number of 1s to within a factor of $1 + \epsilon$. We also give a matching lower bound of $\Omega(\frac{1}{\epsilon}\log^2 N)$ memory bits for any deterministic or randomized algorithms. We extend our scheme to maintain the sum of the last N positive integers and provide matching upper and lower bounds for this more general problem as well. We also show how to efficiently compute the $L_p$ norms (

foundations of computer science | 2006

Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions

Alexandr Andoni; Piotr Indyk

Journal of the ACM | 2006

Stable distributions, pseudorandom generators, embeddings, and data stream computation

Piotr Indyk

allerton conference on communication, control, and computing | 2008

Combining geometry and combinatorics: A unified approach to sparse signal recovery

Radu Berinde; Anna C. Gilbert; Piotr Indyk; Howard J. Karloff; M. Strauss

symposium on the theory of computing | 2002

Approximate clustering via core-sets

Mihai Bādoiu; Sariel Har-Peled; Piotr Indyk

Proceedings of the IEEE | 2010

Sparse Recovery Using Sparse Matrices

Anna C. Gilbert; Piotr Indyk

international conference on cluster computing | 2001

Algorithmic applications of low-distortion geometric embeddings

Piotr Indyk
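
The basic counting problem from the abstract of "Maintaining Stream Statistics over Sliding Windows" (count the 1s among the last N bits of a stream) can be sketched with a bucket structure in the spirit of an exponential histogram. This is a simplified illustration under stated assumptions, not the paper's exact data structure or its bit-level memory accounting. The class name `SlidingWindowCounter` and the choice of keeping at most `ceil(1/eps) + 1` buckets per size are assumptions made for this sketch.

```python
import math


class SlidingWindowCounter:
    """Approximate count of 1s in the last `window` bits (illustrative sketch)."""

    def __init__(self, window, eps):
        self.window = window
        # Assumed cap on the number of buckets sharing the same size.
        self.max_same = math.ceil(1 / eps) + 1
        # Newest bucket first; each bucket is [time of its most recent 1, size].
        self.buckets = []
        self.time = 0

    def add(self, bit):
        self.time += 1
        # Drop the oldest bucket once its most recent 1 leaves the window.
        while self.buckets and self.buckets[-1][0] <= self.time - self.window:
            self.buckets.pop()
        if not bit:
            return
        self.buckets.insert(0, [self.time, 1])
        start, size = 0, 1
        while True:
            # Find the run of buckets that currently have this size.
            end = start
            while end < len(self.buckets) and self.buckets[end][1] == size:
                end += 1
            if end - start <= self.max_same:
                break
            # Too many buckets of this size: merge the two oldest into one
            # bucket of double size, keeping the newer timestamp.
            self.buckets[end - 2][1] = 2 * size
            del self.buckets[end - 1]
            start, size = end - 2, 2 * size

    def count(self):
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        # Only the oldest bucket may straddle the window boundary,
        # so subtract half of its size as the estimate of what has expired.
        return total - self.buckets[-1][1] // 2
```

Only the oldest bucket can be partly outside the window, which is why the estimate's error is limited to half of that bucket's size. Bucket sizes are powers of two and at most a bounded number of buckets share each size, so the number of buckets grows logarithmically with the window length.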

symposium on the theory of computing | 2002