Sushant Sachdeva
Yale University
Publication
Featured research published by Sushant Sachdeva.
Symposium on the Theory of Computing | 2016
Rasmus Kyng; Yin Tat Lee; Richard Peng; Sushant Sachdeva; Daniel A. Spielman
We introduce the sparsified Cholesky and sparsified multigrid algorithms for solving systems of linear equations. These algorithms accelerate Gaussian elimination by sparsifying the nonzero matrix entries created by the elimination process. We use these new algorithms to derive the first nearly linear time algorithms for solving systems of equations in connection Laplacians---a generalization of Laplacian matrices that arise in many problems in image and signal processing. We also prove that every connection Laplacian has a linear sized approximate inverse. This is an LU factorization with a linear number of nonzero entries that is a strong approximation of the original matrix. Using such a factorization one can solve systems of equations in a connection Laplacian in linear time. Such a factorization was unknown even for ordinary graph Laplacians.
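To illustrate the last point, here is a minimal, generic sketch (not the paper's factorization routine; the function name and setup are assumptions of this sketch) of why such a factor is useful: once a sparse lower triangular L satisfies L L^T ≈ A, solving A x = b reduces to two triangular solves, each costing time proportional to the number of nonzeros of L.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve_triangular

def solve_with_factor(L: csr_matrix, b: np.ndarray) -> np.ndarray:
    """Given a sparse lower-triangular L with L @ L.T ~ A, return x ~ A^{-1} b."""
    y = spsolve_triangular(L, b, lower=True)            # forward substitution, O(nnz(L))
    x = spsolve_triangular(L.T.tocsr(), y, lower=False)  # back substitution, O(nnz(L))
    return x

In practice an approximate factor of this kind is used as a preconditioner inside an iterative method, which is how a strong, linear-sized approximation translates into a fast solver.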
Foundations of Computer Science | 2016
Rasmus Kyng; Sushant Sachdeva
We show how to perform sparse approximate Gaussian elimination for Laplacian matrices. We present a simple, nearly linear time algorithm that approximates a Laplacian by the product of a sparse lower triangular matrix with its transpose. This gives the first nearly linear time solver for Laplacian systems that is based purely on random sampling, and does not use any graph theoretic constructions such as low-stretch trees, sparsifiers, or expanders. Our algorithm performs a subsampled Cholesky factorization, which we analyze using matrix martingales. As part of the analysis, we give a proof of a concentration inequality for matrix martingales where the differences are sums of conditionally independent variables.
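As a rough illustration of the flavor of subsampled elimination (a toy sketch, not the algorithm analyzed in the paper; the dict-of-dicts graph representation, function name, and sample count are assumptions of this sketch): eliminating a vertex of a weighted graph creates a clique on its neighbors, and instead of adding the whole clique one can add a few sampled, reweighted edges that match it in expectation.

import random

def eliminate_and_sparsify(graph, v, num_samples=10):
    """Eliminate vertex v from graph (dict: vertex -> {neighbor: weight}) and
    approximate the clique created on v's neighbors by num_samples edges."""
    nbrs = dict(graph[v])                 # neighbors of v with edge weights
    deg = sum(nbrs.values())              # weighted degree of v
    for u in nbrs:                        # delete v and its incident edges
        del graph[u][v]
    del graph[v]
    # Exact elimination adds an edge of weight w(v,u) * w(v,w) / deg for every
    # pair {u, w} of neighbors.  Instead, sample both endpoints with probability
    # proportional to their edge weight to v, and give each sampled edge weight
    # deg / (2 * num_samples), which matches the exact clique in expectation.
    verts = list(nbrs)
    wts = [nbrs[u] for u in verts]
    for _ in range(num_samples):
        u = random.choices(verts, weights=wts)[0]
        w = random.choices(verts, weights=wts)[0]
        if u == w:
            continue
        add = deg / (2 * num_samples)
        graph[u][w] = graph[u].get(w, 0.0) + add
        graph[w][u] = graph[u][w]
    return graph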
Foundations and Trends in Theoretical Computer Science | 2014
Sushant Sachdeva; Nisheeth K. Vishnoi
This monograph presents techniques to approximate real functions such as $x^s$, $x^{-1}$ and $e^{-x}$ by simpler functions and shows how these results can be used for the design of fast algorithms. The key lies in the fact that such results imply faster ways to approximate primitives such as $A^s v$, $A^{-1}v$ and $\exp(-A)v$, and to compute matrix eigenvalues and eigenvectors. Indeed, many fast algorithms reduce to the computation of such primitives, which have proved useful for speeding up several fundamental computations such as random walk simulation, graph partitioning and solving linear systems of equations.
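A short sketch of the simplest instance of this theme: even the plain truncated Taylor polynomial of $e^{-x}$ (a much weaker approximation than the ones the monograph develops) turns $\exp(-A)v$ into a handful of matrix-vector products with $A$. The function name and degree below are illustrative choices, not part of the monograph.

import numpy as np

def expm_times_vector(A: np.ndarray, v: np.ndarray, degree: int = 20) -> np.ndarray:
    """Approximate exp(-A) @ v by the degree-`degree` Taylor polynomial of e^{-x}."""
    result = v.copy()
    term = v.copy()
    for k in range(1, degree + 1):
        term = -(A @ term) / k      # builds (-1)^k A^k v / k! incrementally
        result = result + term
    return result

For a graph Laplacian $A$, products of this form are exactly the heat-kernel computations that arise in random walk simulation and graph partitioning.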
Conference on Innovations in Theoretical Computer Science | 2018
Rina Panigrahy; Ali Rahimi; Sushant Sachdeva; Qiuyi Zhang
We study whether a depth two neural network can learn another depth two network using gradient descent. Assuming a linear output node, we show that the question of whether gradient descent converges to the target function is equivalent to the following question in electrodynamics: Given $k$ fixed protons in $\mathbb{R}^d$, and $k$ electrons, each moving due to the attractive force from the protons and repulsive force from the remaining electrons, whether at equilibrium all the electrons will be matched up with the protons, up to a permutation. Under the standard electrical force, this follows from the classic Earnshaw's theorem. In our setting, the force is determined by the activation function and the input distribution. Building on this equivalence, we prove the existence of an activation function such that gradient descent learns at least one of the hidden nodes in the target network. Iterating, we show that gradient descent can be used to learn the entire network one node at a time.
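For concreteness, here is a toy version of the training setup (purely illustrative, not the paper's construction: the ReLU activation, Gaussian inputs, and all parameter choices below are assumptions of this sketch). A depth two student with a linear output node is trained by gradient descent on squared error to fit a depth two teacher of the same shape.

import numpy as np

rng = np.random.default_rng(0)
d, k, n, lr, steps = 5, 3, 2000, 0.05, 3000

teacher_W = rng.normal(size=(k, d))          # hidden weights of the target network
student_W = rng.normal(size=(k, d))          # hidden weights being learned

def net(W, X):
    """Depth-two network: linear output node summing ReLU hidden units."""
    return np.maximum(W @ X.T, 0.0).sum(axis=0)

X = rng.normal(size=(n, d))                  # inputs drawn from a standard Gaussian
y = net(teacher_W, X)

for _ in range(steps):
    hidden = student_W @ X.T                 # (k, n) pre-activations
    err = np.maximum(hidden, 0.0).sum(axis=0) - y   # residuals on the sample
    grad = ((hidden > 0) * err) @ X / n      # gradient of 0.5 * mean squared error
    student_W -= lr * grad

print("training loss:", 0.5 * np.mean((net(student_W, X) - y) ** 2))

Whether the student's hidden units end up matched with the teacher's, up to a permutation, is the equilibrium question that the electrodynamics reformulation captures.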
Symposium on the Theory of Computing | 2017
David Durfee; Rasmus Kyng; John Peebles; Anup Rao; Sushant Sachdeva
Symposium on Discrete Algorithms | 2017
Rasmus Kyng; Jakub W. Pachocki; Richard Peng; Sushant Sachdeva
Operations Research Letters | 2016
Sushant Sachdeva; Nisheeth K. Vishnoi
Symposium on Discrete Algorithms | 2018
Amey Bhangale; Subhash Khot; Swastik Kopparty; Sushant Sachdeva; Devanathan Thiruvenkatachari
International Colloquium on Automata, Languages and Programming | 2015
Amey Bhangale; Swastik Kopparty; Sushant Sachdeva
Conference on Learning Theory | 2015
Rasmus Kyng; Anup Rao; Sushant Sachdeva; Daniel A. Spielman