Justin Thaler
Yahoo!
Publication
Featured research published by Justin Thaler.
Very Large Data Bases | 2011
Graham Cormode; Justin Thaler; Ke Yi
When computation is outsourced, the data owner would like to be assured that the desired computation has been performed correctly by the service provider. In theory, proof systems can give the necessary assurance, but prior work is not sufficiently scalable or practical. In this paper, we develop new proof protocols for verifying computations which are streaming in nature: the verifier (data owner) needs only logarithmic space and a single pass over the input, and after observing the input follows a simple protocol with a prover (service provider) that takes logarithmic communication spread over a logarithmic number of rounds. These ensure that the computation is performed correctly: that the service provider has not made any errors or missed out some data. The guarantee is very strong: even if the service provider deliberately tries to cheat, there is only vanishingly small probability of doing so undetected, while a correct computation is always accepted. We first observe that some theoretical results can be modified to work with streaming verifiers, showing that there are efficient protocols for problems in the complexity classes NP and NC. Our main results then seek to bridge the gap between theory and practice by developing usable protocols for a variety of problems of central importance in streaming and database processing. All these problems require linear space in the traditional streaming model, and therefore our protocols demonstrate that adding a prover can exponentially reduce the effort needed by the verifier. Our experimental results show that our protocols are practical and scalable.
ACM Transactions on Algorithms | 2014
Amit Chakrabarti; Graham Cormode; Andrew McGregor; Justin Thaler
The central goal of data stream algorithms is to process massive streams of data using sublinear storage space. Motivated by work in the database community on outsourcing database and data stream processing, we ask whether the space usage of such algorithms can be further reduced by enlisting a more powerful “helper” that can annotate the stream as it is read. We do not wish to blindly trust the helper, so we require that the algorithm be convinced of having computed a correct answer. We show upper bounds that achieve a nontrivial tradeoff between the amount of annotation used and the space required to verify it. We also prove lower bounds on such tradeoffs, often nearly matching the upper bounds, via notions related to Merlin-Arthur communication complexity. Our results cover the classic data stream problems of selection, frequency moments, and fundamental graph problems such as triangle-freeness and connectivity. Our work is also part of a growing trend—including recent studies of multipass streaming, read/write streams, and randomly ordered streams—of asking more complexity-theoretic questions about data stream processing. It is a recognition that, in addition to practical relevance, the data stream model raises many interesting theoretical questions in its own right.
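To make the annotation idea concrete, here is a minimal sketch of one way a helper could assist with selection (here, the median). It is not one of the protocols from the paper: it uses a standard polynomial multiset fingerprint, the helper replays the whole stream in sorted order (linear annotation), and the verifier keeps only a constant number of words. The function name `verified_median`, the prime `PRIME`, and the restriction to integer items in the range [0, PRIME) are all assumptions made for illustration.

```python
import random

# Assumed modulus: a Mersenne prime; items must be integers in [0, PRIME).
PRIME = (1 << 61) - 1


def verified_median(stream, annotation, rng=random):
    """Return the median claimed by a sorted annotation, or None if rejected.

    The verifier stores only a random point r, two fingerprints, a counter,
    the previous annotated item, and the candidate median.
    """
    r = rng.randrange(PRIME)

    # Pass over the input: fingerprint the multiset as prod (r - x) mod PRIME.
    fp_stream, n = 1, 0
    for x in stream:
        fp_stream = fp_stream * ((r - x) % PRIME) % PRIME
        n += 1

    # Pass over the annotation: check it is sorted and fingerprint it too.
    fp_annotation, count, prev, median = 1, 0, None, None
    for y in annotation:
        if prev is not None and y < prev:
            return None  # annotation is not in nondecreasing order
        if count == n // 2:
            median = y  # item at the middle rank of the claimed sorted order
        fp_annotation = fp_annotation * ((r - y) % PRIME) % PRIME
        prev = y
        count += 1

    # Accept only if the annotation is a rearrangement of the stream.
    if count != n or fp_annotation != fp_stream:
        return None
    return median
```

An honest annotation is always accepted. A dishonest one with a different multiset of at most n items passes the fingerprint check only if r happens to be a root of a nonzero polynomial of degree at most n, which occurs with probability at most n / PRIME.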
Information & Computation | 2015
Mark Bun; Justin Thaler
The \(\varepsilon\)-approximate degree of a Boolean function \(f: \{-1,1\}^n \rightarrow \{-1,1\}\) is the minimum degree of a real polynomial that approximates \(f\) to within error \(\varepsilon\) in the \(\ell_\infty\) norm. We prove several lower bounds on this important complexity measure by explicitly constructing solutions to the dual of an appropriate linear program. Our first result resolves the \(\varepsilon\)-approximate degree of the two-level AND-OR tree for any constant \(\varepsilon > 0\). We show that this quantity is \(\Theta(\sqrt{n})\), closing a line of incrementally larger lower bounds. The same lower bound was recently obtained independently by Sherstov (Theory Comput. 2013) using related techniques. Our second result gives an explicit dual polynomial that witnesses a tight lower bound for the approximate degree of any symmetric Boolean function, addressing a question of Špalek (2008). Our final contribution is to reprove several Markov-type inequalities from approximation theory by constructing explicit dual solutions to natural linear programs. These inequalities underlie the proofs of many of the best-known approximate degree lower bounds, and have important uses throughout theoretical computer science.
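For readers unfamiliar with the dual formulation, the definition and the standard linear-programming duality statement behind "dual polynomials" can be written out as follows. The symbols \(\psi\), \(p\), and \(d\) are notation introduced here for illustration, not taken from the abstract.

```latex
\[
  \widetilde{\deg}_{\varepsilon}(f)
  \;=\;
  \min\bigl\{ \deg p \;:\; p \in \mathbb{R}[x_1,\dots,x_n],\;
  |p(x) - f(x)| \le \varepsilon \ \text{for all } x \in \{-1,1\}^n \bigr\}.
\]

% Dual characterization: the approximate degree exceeds d if and only if
% some function psi satisfies all three conditions below.
\[
  \widetilde{\deg}_{\varepsilon}(f) > d
  \iff
  \exists\, \psi : \{-1,1\}^n \to \mathbb{R} \ \text{such that}
  \begin{cases}
    \sum_{x} |\psi(x)| = 1, \\[2pt]
    \sum_{x} \psi(x)\, f(x) > \varepsilon, \\[2pt]
    \sum_{x} \psi(x)\, p(x) = 0 \quad \text{for every polynomial } p \text{ with } \deg p \le d.
  \end{cases}
\]
```

A function \(\psi\) meeting these conditions is what the abstract calls a dual polynomial: it correlates with \(f\) while being orthogonal to all low-degree polynomials, which certifies the lower bound.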
European Symposium on Algorithms | 2010
Graham Cormode; Michael Mitzenmacher; Justin Thaler
Motivated by the trend to outsource work to commercial cloud computing services, we consider a variation of the streaming paradigm where a streaming algorithm can be assisted by a powerful helper that can provide annotations to the data stream. We extend previous work on such annotation models by considering a number of graph streaming problems. Without annotations, streaming algorithms for graph problems generally require significant memory; we show that for many standard problems, including all graph problems that can be expressed with totally unimodular integer programming formulations, only constant memory is needed for single-pass algorithms given linear-sized annotations. We also obtain a protocol achieving optimal tradeoffs between annotation length and memory usage for matrix-vector multiplication; this result contributes to a trend of recent research on numerical linear algebra in streaming models.
Allerton Conference on Communication, Control, and Computing | 2009
Nicholas Ruozzi; Justin Thaler; Sekhar Tatikonda
We formulate a new approach to understanding the behavior of the min-sum algorithm by exploiting the properties of graph covers. First, we present a new, natural characterization of scaled diagonally dominant matrices in terms of graph covers; this result motivates our approach because scaled diagonal dominance is a known sufficient condition for the convergence of min-sum in the case of quadratic minimization. We use our understanding of graph covers to characterize the periodic behavior of the min-sum algorithm on a single cycle. Lastly, we explain how to extend the single cycle results to understand the 2-periodic behavior of min-sum for general pairwise MRFs. Some of our techniques apply more broadly, and we believe that by capturing the notion of indistinguishability, graph covers represent a valuable tool for understanding the abilities and limitations of general message-passing algorithms.
International Colloquium on Automata, Languages and Programming | 2013
Mark Bun; Justin Thaler
The \(\varepsilon\)-approximate degree of a Boolean function \(f: \{-1,1\}^n \rightarrow \{-1,1\}\) is the minimum degree of a real polynomial that approximates \(f\) to within \(\varepsilon\) in the \(\ell_\infty\) norm. We prove several lower bounds on this important complexity measure by explicitly constructing solutions to the dual of an appropriate linear program. Our first result resolves the \(\varepsilon\)-approximate degree of the two-level AND-OR tree for any constant \(\varepsilon > 0\). We show that this quantity is \(\Theta(\sqrt{n})\), closing a line of incrementally larger lower bounds [3,11,21,30,32]. The same lower bound was recently obtained independently by Sherstov using related techniques [25]. Our second result gives an explicit dual polynomial that witnesses a tight lower bound for the approximate degree of any symmetric Boolean function, addressing a question of Špalek [34]. Our final contribution is to reprove several Markov-type inequalities from approximation theory by constructing explicit dual solutions to natural linear programs. These inequalities underlie the proofs of many of the best-known approximate degree lower bounds, and have important uses throughout theoretical computer science.
Allerton Conference on Communication, Control, and Computing | 2012
Michael Mitzenmacher; Justin Thaler
International Colloquium on Automata, Languages and Programming | 2015
Mark Bun; Justin Thaler
ACM Symposium on Parallel Algorithms and Architectures | 2014
Jiayang Jiang; Michael Mitzenmacher; Justin Thaler
The analysis of several algorithms and data structures can be reduced to the analysis of the following greedy “peeling” process: start with a random hypergraph; find a vertex of degree at most k, and remove it and all of its adjacent hyperedges from the graph; repeat until there is no suitable vertex. This specific process finds the k-core of a hypergraph, and variations on this theme have proven useful in analyzing for example decoding from low-density parity-check codes, several hash-based data structures such as cuckoo hashing, and algorithms for satisfiability of random formulae. This approach can be analyzed several ways, with two common approaches being via a corresponding branching process or a fluid limit family of differential equations. In this paper, we make note of an interesting aspect of these types of processes: the results are generally the same when the randomness is structured in the manner of double hashing. This phenomenon allows us to use less randomness and simplify the implementation for several hash-based data structures and algorithms. We explore this approach from both an empirical and theoretical perspective, examining theoretical justifications as well as simulation results for specific problems.
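The greedy peeling process described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not code from the paper: the function name `peel`, the representation of hyperedges as tuples of vertex indices, and the `max_degree` parameter are all hypothetical. The threshold follows the abstract's wording (remove vertices of degree at most `max_degree`); under the other common convention, where the k-core keeps vertices of degree at least k, pass `max_degree = k - 1`.

```python
from collections import defaultdict


def peel(num_vertices, hyperedges, max_degree):
    """Greedy peeling of a hypergraph.

    Repeatedly removes a vertex whose current degree is at most max_degree,
    along with every hyperedge incident to it, until no such vertex remains.
    Returns (surviving_edge_indices, surviving_vertices).
    """
    # incident[v] holds the indices of live hyperedges that contain v.
    incident = defaultdict(set)
    for idx, edge in enumerate(hyperedges):
        for v in set(edge):
            incident[v].add(idx)

    alive_edges = set(range(len(hyperedges)))
    removed = set()
    # Worklist of vertices that are currently peelable.
    stack = [v for v in range(num_vertices) if len(incident[v]) <= max_degree]

    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        # Delete every live hyperedge touching v and update its other endpoints.
        for idx in list(incident[v]):
            alive_edges.discard(idx)
            for u in set(hyperedges[idx]):
                incident[u].discard(idx)
                if u not in removed and len(incident[u]) <= max_degree:
                    stack.append(u)

    survivors = set(range(num_vertices)) - removed
    return alive_edges, survivors
```

For example, on a triangle with a two-edge tail attached, peeling with `max_degree=1` strips the tail and leaves the triangle, while a hypergraph whose edges can all be removed one at a time peels to the empty set. Because degrees only decrease, each vertex and hyperedge is handled a bounded number of times, so the running time is linear in the total size of the hyperedges.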
Symposium on Principles of Database Systems | 2016
Edo Liberty; Michael Mitzenmacher; Justin Thaler; Jonathan Ullman
We establish a generic form of hardness amplification for the approximability of constant-depth Boolean circuits by polynomials. Specifically, we show that if a Boolean circuit cannot be pointwise approximated by low-degree polynomials to within constant error in a certain one-sided sense, then an OR of disjoint copies of that circuit cannot be pointwise approximated even with very high error. As our main application, we show that for every sequence of degrees \(d(n)\), there is an explicit depth-three circuit \(F: \{-1,1\}^n \rightarrow \{-1,1\}\) of polynomial size such that any degree-\(d\) polynomial cannot pointwise approximate \(F\) to error better than \(1-\exp (-\tilde{\Omega }(nd^{-3/2}))\). As a consequence of our main result, we obtain an \(\exp (-\tilde{\Omega }(n^{2/5}))\) upper bound on the discrepancy of a function in AC\(^{0}\), and an \(\exp (\tilde{\Omega }(n^{2/5}))\) lower bound on the threshold weight of AC\(^{0}\), improving over the previous best results of \(\exp (-\Omega (n^{1/3}))\) and \(\exp (\Omega (n^{1/3}))\), respectively.