
Publication


Featured research published by Anima Anandkumar.


Neural Information Processing Systems | 2012

A Spectral Algorithm for Latent Dirichlet Allocation

Anima Anandkumar; Yi-Kai Liu; Daniel J. Hsu; Dean P. Foster; Sham M. Kakade

Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. The increased representational power comes at the cost of a more challenging unsupervised learning problem for estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method is based on an efficiently computable orthogonal tensor decomposition of low-order moments.
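
As a rough illustration of the core primitive in this procedure, here is a minimal NumPy sketch of tensor power iteration on a symmetric third-order moment tensor, which extracts one component of an orthogonal decomposition. The function and the toy tensor are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def tensor_power_iteration(T, n_iter=100, seed=0):
        # Recover one robust eigenvector of a symmetric third-order tensor T
        # (shape k x k x k) by repeated tensor-vector contractions.
        rng = np.random.default_rng(seed)
        v = rng.normal(size=T.shape[0])
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            # Contract T along two modes: u_i = sum_{j,l} T[i,j,l] v[j] v[l]
            u = np.einsum('ijl,j,l->i', T, v, v)
            v = u / np.linalg.norm(u)
        eigval = np.einsum('ijl,i,j,l->', T, v, v, v)
        return eigval, v

    # Toy check: T = sum_r w_r * a_r (x) a_r (x) a_r with orthonormal a_r,
    # the form produced by the whitening step of the spectral procedure.
    k = 3
    A = np.linalg.qr(np.random.default_rng(1).normal(size=(k, k)))[0]
    w = np.array([0.5, 0.3, 0.2])
    T = np.einsum('r,ir,jr,kr->ijk', w, A, A, A)
    eigval, v = tensor_power_iteration(T)
    print(eigval, v)  # eigval approximates some w_r, v approximates +/- a_r

In the full algorithm, deflation and repeated restarts recover all k components, and an un-whitening step maps them back to topic-word distributions.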


Computer Vision and Pattern Recognition | 2017

Tensor Contraction Layers for Parsimonious Deep Nets

Jean Kossaifi; Aran Khanna; Zachary C. Lipton; Tommaso Furlanello; Anima Anandkumar

Tensors offer a natural representation for many kinds of data frequently encountered in machine learning. Images, for example, are naturally represented as third-order tensors, where the modes correspond to height, width, and channels. In particular, tensor decompositions are noted for their ability to discover multi-dimensional dependencies and to produce compact low-rank approximations of data. In this paper, we explore the use of tensor contractions as neural network layers and investigate several ways to apply them to activation tensors. Specifically, we propose the Tensor Contraction Layer (TCL), the first attempt to incorporate tensor contractions as end-to-end trainable neural network layers. Applied to existing networks, TCLs reduce the dimensionality of the activation tensors and thus the number of model parameters, and the resulting models remain trainable end-to-end. We evaluate the TCL on the task of image recognition, augmenting popular networks (AlexNet, VGG) and using the CIFAR100 and ImageNet datasets to study the effect of parameter reduction via tensor contraction on performance. We demonstrate significant model compression without significant impact on accuracy and, in some cases, improved performance.
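
The layer's central operation is a multilinear contraction of the activation tensor with small factor matrices along each non-batch mode. Below is a minimal NumPy sketch of just this forward contraction, under assumed shapes; the paper's layer is trained end-to-end inside a deep network, which this sketch does not attempt.

    import numpy as np

    def tensor_contraction_layer(X, factors):
        # X: activations of shape (batch, H, W, C).
        # factors: matrices of shape (R_h, H), (R_w, W), (R_c, C) that
        # contract each non-batch mode, shrinking the activations to
        # (batch, R_h, R_w, R_c) before any fully connected layer.
        U_h, U_w, U_c = factors
        return np.einsum('bhwc,ph,qw,rc->bpqr', X, U_h, U_w, U_c)

    # Toy example: contract 32x32x64 activations down to 8x8x16, cutting
    # the input size (and hence parameters) of the subsequent dense layer.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 32, 32, 64))
    factors = [rng.normal(size=(8, 32)),
               rng.normal(size=(8, 32)),
               rng.normal(size=(16, 64))]
    print(tensor_contraction_layer(X, factors).shape)  # (4, 8, 8, 16)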


International Conference on Computer Communications | 2013

FCD: Fast-concurrent-distributed load balancing under switching costs and imperfect observations

Furong Huang; Anima Anandkumar

The problem of distributed load balancing among m agents operating in an n-server slotted system is considered. A randomized local search mechanism, the FCD (fast, concurrent and distributed) algorithm, is implemented concurrently by each agent associated with a user. It involves switching to a different server with a certain exploration probability and then backtracking with a probability proportional to the ratio of the measured loads in the two servers (in consecutive time slots). The exploration and backtracking operations are executed concurrently by users in local alternating time slots. To ensure that users do not switch to other servers asymptotically, each user chooses the exploration probability to decay polynomially with time, with decay rate β ∈ [0.5, 1]. The backtracking decision is then based on an estimate of the server load computed from local information. Thus, the FCD algorithm does not require synchronization or coordination among users. The main contribution of this work, besides the FCD algorithm itself, is the analysis of the convergence time for the system to become approximately balanced, i.e., to reach an ϵ-Nash equilibrium. We show that the system reaches an ϵ-Nash equilibrium in expected time O(max{n log(n/ϵ) + n^(1/β), (n^3/m^3 · log(n^2/ϵ))^(1/β)}) when m > n^2. This implies that the convergence rate is robust at large system scale (large user populations) and is not affected by imperfect measurements of the server load. We also extend our analysis to open systems where users arrive and depart from a system with an initial load of m users. We allow for general time-dependent arrival processes (including heavy-tailed processes) and consider both uniform and load-oblivious routing of the arrivals to the servers. A wide class of departure processes, including load-dependent departures from the servers, is also allowed. Our analysis demonstrates that it is possible to design fast, concurrent and distributed load balancing mechanisms in large multi-agent systems via randomized local search.
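
As a toy reading of the mechanism described above (our simplification, not the authors' code), the sketch below simulates users who explore with a polynomially decaying probability and keep or undo a move based on the ratio of observed server loads; the exact keep-probability is an assumption.

    import numpy as np

    def fcd_style_simulation(m=200, n=10, beta=0.75, n_slots=2000, seed=0):
        # m users balance themselves over n servers in discrete time slots.
        rng = np.random.default_rng(seed)
        server = rng.integers(n, size=m)              # initial assignment
        for t in range(1, n_slots + 1):
            p_explore = t ** (-beta)                  # decaying exploration rate
            loads = np.bincount(server, minlength=n)  # start-of-slot snapshot
            for u in range(m):
                if rng.random() < p_explore:
                    old, new = server[u], rng.integers(n)
                    # Keep the move with probability favoring the less loaded
                    # server; otherwise backtrack to the old server.
                    keep = loads[old] / (loads[old] + loads[new])
                    if rng.random() < keep:
                        server[u] = new
        return np.bincount(server, minlength=n)

    print(fcd_style_simulation())  # loads should settle near m/n = 20 each

Decaying the exploration rate is what lets the system settle: early slots mix aggressively, while later slots rarely disturb an already balanced assignment.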


arXiv: Learning | 2017

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

Majid Janzamin; Hanie Sedghi; Anima Anandkumar


arXiv: Learning | 2014

Provable Methods for Training Neural Networks with Sparse Connectivity

Hanie Sedghi; Anima Anandkumar


Neural Information Processing Systems | 2012

Learning Mixtures of Tree Graphical Models

Anima Anandkumar; Furong Huang; Daniel J. Hsu; Sham M. Kakade


Archive | 2014

Analyzing Tensor Power Method Dynamics: Applications to Learning Overcomplete Latent Variable Models

Anima Anandkumar; Rong Ge; Majid Janzamin


International Conference on Artificial Intelligence and Statistics | 2016

Provable Tensor Methods for Learning Mixtures of Generalized Linear Models

Hanie Sedghi; Majid Janzamin; Anima Anandkumar


Archive | 2015

Generalization Bounds for Neural Networks through Tensor Factorization

Majid Janzamin; Hanie Sedghi; Anima Anandkumar


Neural Information Processing Systems | 2014

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition

Hanie Sedghi; Anima Anandkumar; Edmond A. Jonckheere

Collaboration


Dive into Anima Anandkumar's collaborations.

Top Co-Authors

Hanie Sedghi (University of Southern California)
Majid Janzamin (University of California)
Tommaso Furlanello (University of Southern California)
Zachary C. Lipton (Carnegie Mellon University)
Furong Huang (University of California)
Edmond A. Jonckheere (University of Southern California)
Sham M. Kakade (University of Washington)
Dean P. Foster (University of Pennsylvania)