Adam Kirsch | Researchain

Publication

Featured research published by Adam Kirsch.

european symposium on algorithms | 2006

Less hashing, same performance: building a better bloom filter

Adam Kirsch; Michael Mitzenmacher

A standard technique from the hashing literature is to use two hash functions h1(x) and h2(x) to simulate additional hash functions of the form gi(x) = h1(x) + i h2(x). We demonstrate that this technique can be usefully applied to Bloom filters and related data structures. Specifically, only two hash functions are necessary to effectively implement a Bloom filter without any loss in the asymptotic false positive probability. This leads to less computation and potentially less need for randomness in practice.
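The double-hashing construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `DoubleHashBloomFilter`, the use of SHA-256 to derive the two base hashes, and the byte-per-bit storage are all assumptions made for readability.

```python
import hashlib


class DoubleHashBloomFilter:
    """Bloom filter whose k probe positions come from two base hashes.

    Probe i uses g_i(x) = h1(x) + i * h2(x) (mod m), as in the abstract.
    Deriving h1 and h2 from one SHA-256 digest is an assumption of this sketch.
    """

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.m = num_bits
        self.k = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity only

    def _base_hashes(self, item: str):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return h1, h2

    def _positions(self, item: str):
        h1, h2 = self._base_hashes(item)
        # Simulate k hash functions from the two base hashes.
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


bf = DoubleHashBloomFilter(num_bits=1024, num_hashes=4)
bf.add("alpha")
print("alpha" in bf)  # True: a Bloom filter has no false negatives
```

Only two hash evaluations are needed per operation regardless of k, which is the saving in computation that the abstract refers to.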

IEEE ACM Transactions on Networking | 2010

The power of one move: hashing schemes for hardware

Adam Kirsch; Michael Mitzenmacher

In a standard multiple-choice hashing scheme, each item is stored in one of d ≥ 2 hash table buckets. The availability of choice in where items are stored improves space utilization. These schemes are often very amenable to a hardware implementation, such as in a router. Recently, researchers have discovered powerful variants where items already in the hash table may be moved during the insertion of a new item. Unfortunately, these schemes occasionally require a large number of items to be moved to perform an insertion, making them inappropriate for a hardware implementation. We show that it is possible to significantly increase the space utilization of multiple-choice hashing schemes by allowing at most one item to be moved during an insertion. Furthermore, our schemes can be effectively analyzed, optimized, and compared using numerical methods based on fluid limit arguments, without resorting to much slower simulations.
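To make the idea of "at most one move per insertion" concrete, here is a rough sketch of one possible scheme. It is not the specific scheme analyzed in the paper: the class name `OneMoveTable`, the one-item-per-bucket layout with d sub-tables, the SHA-256-based bucket choice, and the `overflow` list for items that cannot be placed are all assumptions of this illustration.

```python
import hashlib


class OneMoveTable:
    """d sub-tables, one item per bucket; an insertion relocates at most one item."""

    def __init__(self, d: int, buckets_per_table: int) -> None:
        self.d = d
        self.n = buckets_per_table
        self.tables = [[None] * buckets_per_table for _ in range(d)]
        self.overflow = []  # assumption: stand-in for a small overflow memory

    def _bucket(self, i: int, key: str) -> int:
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.n

    def _first_free(self, key: str):
        """Index of the first sub-table whose candidate bucket for key is empty."""
        for i in range(self.d):
            if self.tables[i][self._bucket(i, key)] is None:
                return i
        return None

    def insert(self, key: str) -> str:
        i = self._first_free(key)
        if i is not None:
            self.tables[i][self._bucket(i, key)] = key
            return "placed"
        # All d candidates are full: try to relocate exactly one resident item.
        for i in range(self.d):
            pos = self._bucket(i, key)
            resident = self.tables[i][pos]
            j = self._first_free(resident)
            if j is not None:
                self.tables[j][self._bucket(j, resident)] = resident
                self.tables[i][pos] = key
                return "moved"
        self.overflow.append(key)
        return "overflow"

    def contains(self, key: str) -> bool:
        in_tables = any(
            self.tables[i][self._bucket(i, key)] == key for i in range(self.d)
        )
        return in_tables or key in self.overflow


table = OneMoveTable(d=3, buckets_per_table=8)
print(table.insert("flow-1"))  # "placed": every bucket is empty at this point
```

Because the work per insertion is bounded by d lookups plus one relocation, the worst-case cost stays fixed, which is the property that matters for the hardware setting the abstract describes.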

Archive | 2010

Hash-Based Techniques for High-Speed Packet Processing

Adam Kirsch; Michael Mitzenmacher; George Varghese

Hashing is an extremely useful technique for a variety of high-speed packet-processing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary.

IEEE Transactions on Information Theory | 2010

Directly Lower Bounding the Information Capacity for Channels With I.I.D. Deletions and Duplications

Adam Kirsch; Eleni Drinea

In this paper, we directly lower bound the information capacity for channels with independent identically distributed (i.i.d.) deletions and duplications. Our approach differs from previous work in that we focus on the information capacity using ideas from renewal theory, rather than focusing on the transmission capacity by analyzing the error probability of some randomly generated code using a combinatorial argument. Of course, the transmission and information capacities are equal, but our change of perspective allows for a much simpler analysis that gives more general theoretical results. We then apply these results to the binary deletion channel to improve existing lower bounds on its capacity.

european symposium on algorithms | 2008

More Robust Hashing: Cuckoo Hashing with a Stash

Adam Kirsch; Michael Mitzenmacher; Udi Wieder

Cuckoo hashing holds great potential as a high-performance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, requiring an expensive rehashing of all items in the table. In this paper, we show that this failure probability can be dramatically reduced by the addition of a very small constant-sized stash. We demonstrate both analytically and through simulations that stashes of size equivalent to only three or four items yield tremendous improvements, enhancing cuckoo hashing's practical viability in both hardware and software. Our analysis naturally extends previous analyses of multiple cuckoo hashing variants, and the approach may prove useful in further related schemes.
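The role of the stash can be illustrated with a small sketch of two-table cuckoo hashing. This is a simplified illustration rather than the construction analyzed in the paper: the class name `CuckooWithStash`, the SHA-256-based slot choice, the `max_kicks` eviction limit, and the decision to report failure with a return value instead of rehashing are assumptions.

```python
import hashlib


class CuckooWithStash:
    """Two-table cuckoo hashing with a small constant-sized stash (sketch).

    Keys must not be None, since None marks an empty slot.
    """

    def __init__(self, table_size: int, stash_size: int = 4, max_kicks: int = 50) -> None:
        self.n = table_size
        self.tables = [[None] * table_size, [None] * table_size]
        self.stash = []
        self.stash_size = stash_size
        self.max_kicks = max_kicks

    def _slot(self, which: int, key: str) -> int:
        digest = hashlib.sha256(f"{which}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.n

    def contains(self, key: str) -> bool:
        return (
            self.tables[0][self._slot(0, key)] == key
            or self.tables[1][self._slot(1, key)] == key
            or key in self.stash
        )

    def insert(self, key: str) -> bool:
        if self.contains(key):
            return True
        which = 0
        for _ in range(self.max_kicks):
            pos = self._slot(which, key)
            evicted = self.tables[which][pos]
            self.tables[which][pos] = key
            if evicted is None:
                return True
            # The evicted item must go to its slot in the other table.
            key = evicted
            which = 1 - which
        # Eviction chain too long: park the homeless item in the stash.
        if len(self.stash) < self.stash_size:
            self.stash.append(key)
            return True
        # Stash full: a real table would rehash everything here.
        # In this sketch the homeless item is dropped and failure is reported.
        return False


table = CuckooWithStash(table_size=64)
table.insert("10.0.0.1")
print(table.contains("10.0.0.1"))  # True
```

Without the stash, the first overlong eviction chain would force a full rehash; with it, a rehash is needed only after the stash itself fills, which is the failure event whose probability the paper shows to be much smaller.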

IEEE ACM Transactions on Networking | 2008

Simple summaries for hashing with choices

Adam Kirsch; Michael Mitzenmacher

In a multiple-choice hashing scheme, each item is stored in one of d ≥ 2 possible hash table buckets. The availability of these multiple choices allows for a substantial reduction in the maximum load of the buckets. However, a lookup may now require examining each of the d locations. For applications where this cost is undesirable, Song et al. propose keeping a summary that allows one to determine which of the d locations is appropriate for each item, where the summary may allow false positives for items not in the hash table. We propose alternative, simple constructions of such summaries that use less space for both the summary and the underlying hash table. Moreover, our constructions are easily analyzable and tunable.

international conference on computer communications | 2008

The Power of One Move: Hashing Schemes for Hardware

Adam Kirsch; Michael Mitzenmacher

In a standard multiple choice hashing scheme, each item is stored in one of d ≥ 2 hash table buckets. The availability of choice in where items are stored improves space utilization. These schemes are often very amenable to a hardware implementation, such as in a router. Recently, however, researchers have discovered powerful variants where items already in the hash table may be moved during the insertion of a new item. Unfortunately, these schemes occasionally require a large number of items to be moved during an insertion operation, making them inappropriate for a hardware implementation. We show that it is possible to significantly increase the space utilization of a multiple choice hashing scheme by allowing at most one item to be moved during an insertion. Furthermore, our schemes can be effectively analyzed, optimized, and compared using numerical methods based on fluid limit arguments, without resorting to much slower simulations.

symposium on principles of database systems | 2009

An efficient rigorous approach for identifying statistically significant frequent itemsets

Adam Kirsch; Michael Mitzenmacher; Andrea Pietracaprina; Geppino Pucci; Eli Upfal; Fabio Vandin

As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In this work, we address significance in the context of frequent itemset mining. Specifically, we develop a novel methodology to identify a meaningful support threshold s* for a dataset, such that the number of itemsets with support at least s* represents a substantial deviation from what would be expected in a random dataset with the same number of transactions and the same individual item frequencies. These itemsets can then be flagged as statistically significant with a small false discovery rate. Our methodology hinges on a Poisson approximation to the distribution of the number of itemsets in a random dataset with support at least s, for any s greater than or equal to a minimum threshold s_min. We obtain this result through a novel application of the Chen-Stein approximation method, which is of independent interest. Based on this approximation, we develop an efficient parametric multi-hypothesis test for identifying the desired threshold s*. A crucial feature of our approach is that, unlike most previous work, it takes into account the entire dataset rather than individual discoveries. It is therefore better able to distinguish between significant observations and random fluctuations. We present extensive experimental results to substantiate the effectiveness of our methodology.

international symposium on information theory | 2007

Directly Lower Bounding the Information Capacity for Channels with I.I.D. Deletions and Duplications

Eleni Drinea; Adam Kirsch

electronic commerce | 2007

On threshold behavior in query incentive networks

Esteban Arcaute; Adam Kirsch; Ravi Kumar; David Liben-Nowell; Sergei Vassilvitskii

Motivated by the role of incentives in large-scale information systems, Kleinberg and Raghavan (FOCS 2005) studied strategic games in decentralized information networks. Given a branching process that specifies the network, the rarity of answers to a specific question, and a desired probability of success, how much reward does the root node need to offer so that it receives an answer with this probability, when all of the nodes are playing strategically? For a specific family of branching processes and a constant failure probability, they showed that the reward function exhibited a threshold behavior that depends on the branching parameter b. In this paper we study two factors that can contribute to this transition behavior, namely, the branching process itself and the failure probability. On one hand we show that the threshold behavior is robust with respect to the branching process: for all branching processes and any constant failure probability, if b > 2 then the required reward is linear in the expected depth of the search tree, and if b < 2 then the required reward is exponential in that depth. On the other hand we show that the threshold behavior is fragile with respect to the failure probability σ: if σ is inversely polynomial in the rarity of the answer, then all branching processes require rewards exponential in the depth of the search tree.
