Rajat Monga
Publication
Featured research published by Rajat Monga.
computer vision and pattern recognition | 2015
Joe Yue-Hei Ng; Matthew J. Hausknecht; Sudheendra Vijayanarasimhan; Oriol Vinyals; Rajat Monga; George Toderici
Convolutional neural networks (CNNs) have been extensively applied to image recognition problems, giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep neural network architectures that combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handling full-length videos. The first explores various convolutional temporal feature pooling architectures, examining the design choices that must be made when adapting a CNN to this task. The second explicitly models the video as an ordered sequence of frames, employing a recurrent neural network with Long Short-Term Memory (LSTM) cells connected to the output of the underlying CNN. Our best networks exhibit significant performance improvements over previously published results on the Sports-1M dataset (73.1% vs. 60.9%) and on UCF-101 both with (88.6% vs. 88.0%) and without (82.6% vs. 73.0%) additional optical flow information.
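The second approach can be sketched concisely in modern TensorFlow. This is a minimal illustration rather than the authors' code; the backbone network, frame count, layer widths, and class count below are assumptions, not the paper's exact configuration:

```python
# A minimal sketch (not the authors' code) of the CNN-plus-LSTM approach:
# per-frame CNN features are fed to an LSTM that aggregates them over time.
import tensorflow as tf

NUM_FRAMES, HEIGHT, WIDTH, NUM_CLASSES = 30, 224, 224, 101  # assumed; 101 classes as in UCF-101

frames = tf.keras.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, 3))

# Per-frame convolutional feature extractor (a stand-in for the paper's backbones).
cnn = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg", weights=None)
per_frame_features = tf.keras.layers.TimeDistributed(cnn)(frames)

# LSTM connected to the CNN outputs, treating the video as an ordered frame sequence.
video_embedding = tf.keras.layers.LSTM(512)(per_frame_features)
logits = tf.keras.layers.Dense(NUM_CLASSES)(video_embedding)

model = tf.keras.Model(frames, logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True))
```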
international conference on acoustics, speech, and signal processing | 2013
Matthew D. Zeiler; Marc'Aurelio Ranzato; Rajat Monga; Mark Mao; Ke Yang; Quoc V. Le; Patrick Nguyen; Andrew W. Senior; Vincent Vanhoucke; Jeffrey Dean; Geoffrey E. Hinton
Deep neural networks have recently become the gold standard for acoustic modeling in speech recognition systems. The key computational unit of a deep network is a linear projection followed by a point-wise non-linearity, which is typically a logistic function. In this work, we show that we can improve generalization and make training of deep networks faster and simpler by substituting the logistic units with rectified linear units. These units are linear when their input is positive and zero otherwise. In a supervised setting, we can successfully train very deep nets from random initialization on a large vocabulary speech recognition task achieving lower word error rates than using a logistic network with the same topology. Similarly in an unsupervised setting, we show how we can learn sparse features that can be useful for discriminative tasks. All our experiments are executed in a distributed environment using several hundred machines and several hundred hours of speech data.
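The change the abstract describes is a swap of the point-wise non-linearity. Below is a minimal sketch with assumed feature dimension, depth, width, and output-state count standing in for the paper's acoustic model:

```python
# A minimal sketch contrasting logistic (sigmoid) and rectified linear hidden units.
# The input dimension, depth, width, and number of output states are assumptions.
import tensorflow as tf

def hidden_layer(x, units, use_relu=True):
    """Linear projection followed by a point-wise non-linearity."""
    y = tf.keras.layers.Dense(units)(x)
    # ReLUs are linear for positive input and zero otherwise; the paper reports
    # faster, simpler training and better generalization than with logistic units.
    return tf.nn.relu(y) if use_relu else tf.sigmoid(y)

inputs = tf.keras.Input(shape=(440,))           # assumed stacked acoustic frames
h = inputs
for _ in range(4):                              # assumed depth
    h = hidden_layer(h, 2048, use_relu=True)    # set False for the logistic baseline
outputs = tf.keras.layers.Dense(8000, activation="softmax")(h)  # assumed state targets
model = tf.keras.Model(inputs, outputs)
```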
european conference on computer systems | 2018
Yuan Yu; Martín Abadi; Paul Barham; Eugene Brevdo; Mike Burrows; Andy Davis; Jeffrey Dean; Sanjay Ghemawat; Tim Harley; Peter Hawkins; Michael Isard; Manjunath Kudlur; Rajat Monga; Derek Gordon Murray; Xiaoqiang Zheng
Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.
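TensorFlow exposes this in-graph dynamic control flow through public ops such as tf.cond and tf.while_loop. Below is a toy sketch of a data-dependent recurrence expressed with tf.while_loop and differentiated end to end; the shapes and the simple RNN-style update are illustrative assumptions, not the paper's benchmark models:

```python
# A toy sketch of in-graph dynamic control flow: a hand-rolled recurrence whose
# loop bound depends on the data, built with tf.while_loop and differentiated through.
import tensorflow as tf

batch, hidden, features = 4, 16, 8                        # assumed toy sizes
W_x = tf.Variable(tf.random.normal([features, hidden]))
W_h = tf.Variable(tf.random.normal([hidden, hidden]))

inputs = tf.random.normal([batch, 20, features])          # [batch, time, features]
num_steps = tf.shape(inputs)[1]                           # loop bound taken from the data

def condition(t, state):
    return t < num_steps

def body(t, state):
    # Each iteration is expressed as dataflow; the runtime can partition the body
    # across devices and, with non-strict semantics, overlap iterations and I/O.
    state = tf.tanh(inputs[:, t, :] @ W_x + state @ W_h)
    return t + 1, state

with tf.GradientTape() as tape:
    _, final_state = tf.while_loop(
        condition, body, [tf.constant(0), tf.zeros([batch, hidden])])
    loss = tf.reduce_sum(final_state ** 2)

# Automatic differentiation propagates gradients through every loop iteration.
grads = tape.gradient(loss, [W_x, W_h])
```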
neural information processing systems | 2012
Jeffrey Dean; Greg Corrado; Rajat Monga; Kai Chen; Matthieu Devin; Mark Mao; Marc'Aurelio Ranzato; Andrew W. Senior; Paul A. Tucker; Ke Yang; Quoc V. Le; Andrew Y. Ng
international conference on machine learning | 2012
Marc'Aurelio Ranzato; Rajat Monga; Matthieu Devin; Kai Chen; Greg Corrado; Jeffrey Dean; Quoc V. Le; Andrew Y. Ng
arXiv: Distributed, Parallel, and Cluster Computing | 2015
Martín Abadi; Ashish Agarwal; Paul Barham; Eugene Brevdo; Zhifeng Chen; Craig Citro; Gregory S. Corrado; Andy Davis; Jeffrey Dean; Matthieu Devin; Sanjay Ghemawat; Ian J. Goodfellow; Andrew Harp; Geoffrey Irving; Michael Isard; Yangqing Jia; Rafal Jozefowicz; Lukasz Kaiser; Manjunath Kudlur; Josh Levenberg; Dan Mané; Rajat Monga; Sherry Moore; Derek Gordon Murray; Chris Olah; Mike Schuster; Jonathon Shlens; Benoit Steiner; Ilya Sutskever; Kunal Talwar
operating systems design and implementation | 2016
Martín Abadi; Paul Barham; Zhifeng Chen; Andy Davis; Jeffrey Dean; Matthieu Devin; Sanjay Ghemawat; Geoffrey Irving; Michael Isard; Manjunath Kudlur; Josh Levenberg; Rajat Monga; Sherry Moore; Derek Gordon Murray; Benoit Steiner; Paul A. Tucker; Vijay Vasudevan; Pete Warden; Martin Wicke; Yuan Yu; Xiaoqiang Zheng
conference of the international speech communication association | 2014
Hasim Sak; Oriol Vinyals; Georg Heigold; Andrew W. Senior; Erik McDermott; Rajat Monga; Mark Z. Mao
arXiv: Learning | 2017
Xinghao Pan; Rajat Monga; Samy Bengio; Rafal Jozefowicz
arXiv: Neural and Evolutionary Computing | 2015
Sudheendra Vijayanarasimhan; Jonathon Shlens; Rajat Monga; Jay Yagnik