Samuel B. Hopkins
Cornell University
Publication
Featured research published by Samuel B. Hopkins.
symposium on the theory of computing | 2016
Samuel B. Hopkins; Tselil Schramm; Jonathan Shi; David Steurer
We consider two problems that arise in machine learning applications: the problem of recovering a planted sparse vector in a random linear subspace and the problem of decomposing a random low-rank overcomplete 3-tensor. For both problems, the best known guarantees are based on the sum-of-squares method. We develop new algorithms inspired by analyses of the sum-of-squares method. Our algorithms achieve the same or similar guarantees as sum-of-squares for these problems but the running time is significantly faster. For the planted sparse vector problem, we give an algorithm with running time nearly linear in the input size that approximately recovers a planted sparse vector with up to constant relative sparsity in a random subspace of $\mathbb{R}^n$ of dimension up to $\Omega(\sqrt{n})$. These recovery guarantees match the best known ones of Barak, Kelner, and Steurer (STOC 2014) up to logarithmic factors. For tensor decomposition, we give an algorithm with running time close to linear in the input size (with exponent ≈ 1.125) that approximately recovers a component of a random 3-tensor over $\mathbb{R}^n$ of rank up to $\Omega(n^{4/3})$. The best previous algorithm for this problem due to Ge and Ma (RANDOM 2015) works up to rank $\Omega(n^{3/2})$ but requires quasipolynomial time.
foundations of computer science | 2016
Boaz Barak; Samuel B. Hopkins; Jonathan A. Kelner; Pravesh Kothari; Ankur Moitra; Aaron Potechin
We prove that with high probability over the choice of a random graph $G$ from the Erdős-Rényi distribution $G(n,1/2)$, the $n^{O(d)}$-time degree $d$ Sum-of-Squares semidefinite programming relaxation for the clique problem will give a value of at least $n^{1/2-c(d/\log n)^{1/2}}$ for some constant $c > 0$. This yields a nearly tight $n^{1/2-o(1)}$ bound on the value of this program for any degree $d = o(\log n)$. Moreover we introduce a new framework that we call pseudo-calibration to construct Sum-of-Squares lower bounds. This framework is inspired by taking a computational analogue of Bayesian probability theory. It yields a general recipe for constructing good pseudo-distributions (i.e., dual certificates for the Sum-of-Squares semidefinite program), and sheds further light on the ways in which this hierarchy differs from others.
symposium on the theory of computing | 2018
Samuel B. Hopkins; Jerry Li
We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Our contributions are: Mixture models with separated means: We study mixtures of $\mathrm{poly}(k)$-many $k$-dimensional distributions where the means of every pair of distributions are separated by at least $k^{\varepsilon}$. In the special case of spherical Gaussian mixtures, we give a $k^{O(1/\varepsilon)}$-time algorithm that learns the means assuming separation at least $k^{\varepsilon}$, for any $\varepsilon > 0$. This is the first algorithm to improve on greedy (“single-linkage”) and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation $k^{1/4}$. Robust estimation: When an unknown $(1-\varepsilon)$-fraction of $X_1,\ldots,X_n$ are chosen from a sub-Gaussian distribution with mean $\mu$ but the remaining points are chosen adversarially, we give an algorithm recovering $\mu$ to error $\varepsilon^{1-1/t}$ in time $k^{O(t)}$, so long as sub-Gaussian-ness up to $O(t)$ moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than $\varepsilon^{1/2}$. As a corollary, we achieve similar results for robust covariance estimation. Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given $X_1,\ldots,X_n \in \mathbb{R}^k$ for large $k$ and $n = \mathrm{poly}(k)$ with the promise that a subset of $X_1,\ldots,X_n$ were sampled from a probability distribution with bounded moments, recover some information about that distribution.
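To make the robust estimation setting in this abstract concrete, the sketch below draws a sample in which a small fraction of points is corrupted and compares two simple estimators of the mean. It is not the Sum of Squares algorithm from the paper: the coordinate-wise median is a swapped-in textbook baseline, and the sample sizes, the function names, and the choice of placing all corrupted points at one far-away location are assumptions made only for illustration. A real adversary may choose the corrupted points far more carefully.

```python
import math
import random
import statistics

def corrupted_sample(n, dim, eps, rng, outlier=100.0):
    """Return n points in R^dim: roughly a (1 - eps) fraction drawn from N(0, I),
    the rest placed at a single far-away point (a deliberately simple corruption)."""
    n_bad = int(eps * n)
    good = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n - n_bad)]
    bad = [[outlier] * dim for _ in range(n_bad)]
    return good + bad

def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))

rng = random.Random(0)
points = corrupted_sample(n=2000, dim=5, eps=0.1, rng=rng)
columns = list(zip(*points))  # one tuple of values per coordinate

mean_est = [statistics.mean(col) for col in columns]
median_est = [statistics.median(col) for col in columns]

# The uncorrupted distribution has mean zero, so each norm is the estimation error.
mean_err = l2_norm(mean_est)
median_err = l2_norm(median_est)
print(f"sample mean error: {mean_err:.2f}")
print(f"coordinate-wise median error: {median_err:.2f}")
```

In this toy setup the sample mean is dragged toward the corrupted points while the coordinate-wise median stays close to the true mean, which is the basic reason robust estimators are needed in this setting.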
ACM Transactions on Algorithms | 2018
Samuel B. Hopkins; Pravesh Kothari; Aaron Henry Potechin; Prasad Raghavendra; Tselil Schramm
The problem of finding large cliques in random graphs and its “planted” variant, where one wants to recover a clique of size $\omega > \log(n)$ added to an Erdős-Rényi graph $G \sim G(n,1/2)$, have been intensely studied. Nevertheless, existing polynomial time algorithms can only recover planted cliques of size $\omega = \Omega(\sqrt{n})$. By contrast, information theoretically, one can recover planted cliques so long as $\omega > \log(n)$. In this work, we continue the investigation of algorithms from the Sum of Squares hierarchy for solving the planted clique problem begun by Meka, Potechin, and Wigderson [2] and Deshpande and Montanari [25]. Our main result is that degree four SoS does not recover the planted clique unless $\omega > \sqrt{n} / \mathrm{polylog}\, n$, improving on the bound $\omega > n^{1/3}$ due to Reference [25]. An argument of Kelner shows that this result cannot be proved using the same certificate as prior works. Rather, our proof involves constructing and analyzing a new certificate that yields the nearly tight lower bound by “correcting” the certificate of References [2, 25, 27].
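The planted clique model described in this abstract can be sketched in a few lines. The example below samples a graph from G(n, 1/2), plants a clique, and tries to recover it with the simple degree heuristic (take the vertices of largest degree), which is known to work only when the clique is somewhat larger than the square root of n, on the order of √(n log n). This is not the Sum of Squares relaxation studied in the paper. The function names and the sizes n = 400 and k = 200 are hypothetical choices that make the degree gap easy to see.

```python
import random

def planted_clique_graph(n, k, rng):
    """Sample G(n, 1/2) and plant a clique on k uniformly random vertices.

    Returns (adjacency sets, set of planted vertices)."""
    clique = set(rng.sample(range(n), k))
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            # Pairs inside the clique are always joined;
            # every other pair is joined by a fair coin flip.
            if (u in clique and v in clique) or rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    return adj, clique

def top_degree_vertices(adj, k):
    """Degree heuristic: return the k vertices of largest degree."""
    order = sorted(range(len(adj)), key=lambda v: len(adj[v]), reverse=True)
    return set(order[:k])

rng = random.Random(0)
n, k = 400, 200  # hypothetical sizes; k is far above sqrt(n) = 20
adj, clique = planted_clique_graph(n, k, rng)
recovered = top_degree_vertices(adj, k)
print(len(recovered & clique), "of", k, "planted vertices recovered")
```

With a clique this large, planted vertices have expected degree about k/2 higher than the rest, so sorting by degree separates them. As k shrinks toward √n the gap disappears into the random fluctuations of the degrees, which is the regime the abstract is concerned with.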
foundations of computer science | 2017
Samuel B. Hopkins; David Steurer
We propose an efficient meta-algorithm for Bayesian inference problems based on low-degree polynomials, semidefinite programming, and tensor decomposition. The algorithm is inspired by recent lower bound constructions for sum-of-squares and related to the method of moments. Our focus is on sample complexity bounds that are as tight as possible (up to additive lower-order terms) and often achieve statistical thresholds or conjectured computational thresholds. Our algorithm recovers the best known bounds for partial recovery in the stochastic block model, a widely-studied class of inference problems for community detection in graphs. We obtain the first partial recovery guarantees for the mixed-membership stochastic block model (Airoldi et al.) for constant average degree, up to what we conjecture to be the computational threshold for this model. Our algorithm also captures smooth trade-offs between sample and computational complexity, for example, for tensor principal component analysis. We show that our algorithm exhibits a sharp computational threshold for the stochastic block model with multiple communities beyond the Kesten–Stigum bound, giving evidence that this task may require exponential time. The basic strategy of our algorithm is strikingly simple: we compute the best-possible low-degree approximation for the moments of the posterior distribution of the parameters and use a robust tensor decomposition algorithm to recover the parameters from these approximate posterior moments.
conference on learning theory | 2015
Samuel B. Hopkins; Jonathan Shi; David Steurer
arXiv: Computational Complexity | 2015
Samuel B. Hopkins; Pravesh Kothari; Aaron Potechin
symposium on discrete algorithms | 2016
Samuel B. Hopkins; Pravesh Kothari; Aaron Henry Potechin; Prasad Raghavendra; Tselil Schramm
foundations of computer science | 2017
Samuel B. Hopkins; Pravesh Kothari; Aaron Potechin; Prasad Raghavendra; Tselil Schramm; David Steurer
Chicago Journal of Theoretical Computer Science | 2013
Eric Allender; George Davie; Luke Friedman; Samuel B. Hopkins; Iddo Tzameret