Rajat Monga
Publication
Featured research published by Rajat Monga.
computer vision and pattern recognition | 2015
Joe Yue-Hei Ng; Matthew J. Hausknecht; Sudheendra Vijayanarasimhan; Oriol Vinyals; Rajat Monga; George Toderici
Convolutional neural networks (CNNs) have been extensively applied to image recognition problems, giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep neural network architectures that combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handling full-length videos. The first explores various convolutional temporal feature pooling architectures, examining the design choices that must be made when adapting a CNN to this task. The second explicitly models the video as an ordered sequence of frames, employing a recurrent neural network with Long Short-Term Memory (LSTM) cells connected to the output of the underlying CNN. Our best networks exhibit significant performance improvements over previously published results on the Sports-1M dataset (73.1% vs. 60.9%) and on UCF-101 both with (88.6% vs. 88.0%) and without (82.6% vs. 73.0%) additional optical flow information.
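The second approach can be sketched concisely in modern TensorFlow. This is a minimal illustration rather than the authors' code; the backbone network, frame count, layer widths, and class count below are assumptions, not the paper's exact configuration:

```python
# A minimal sketch (not the authors' code) of the CNN-plus-LSTM approach:
# per-frame CNN features are fed to an LSTM that aggregates them over time.
import tensorflow as tf

NUM_FRAMES, HEIGHT, WIDTH, NUM_CLASSES = 30, 224, 224, 101  # assumed; 101 classes as in UCF-101

frames = tf.keras.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, 3))

# Per-frame convolutional feature extractor (a stand-in for the paper's backbones).
cnn = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg", weights=None)
per_frame_features = tf.keras.layers.TimeDistributed(cnn)(frames)

# LSTM connected to the CNN outputs, treating the video as an ordered frame sequence.
video_embedding = tf.keras.layers.LSTM(512)(per_frame_features)
logits = tf.keras.layers.Dense(NUM_CLASSES)(video_embedding)

model = tf.keras.Model(frames, logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True))
```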
international conference on acoustics, speech, and signal processing | 2013
Matthew D. Zeiler; Marc'Aurelio Ranzato; Rajat Monga; Mark Mao; Ke Yang; Quoc V. Le; Patrick Nguyen; Andrew W. Senior; Vincent Vanhoucke; Jeffrey Dean; Geoffrey E. Hinton
Deep neural networks have recently become the gold standard for acoustic modeling in speech recognition systems. The key computational unit of a deep network is a linear projection followed by a point-wise non-linearity, which is typically a logistic function. In this work, we show that we can improve generalization and make training of deep networks faster and simpler by substituting the logistic units with rectified linear units. These units are linear when their input is positive and zero otherwise. In a supervised setting, we can successfully train very deep nets from random initialization on a large vocabulary speech recognition task achieving lower word error rates than using a logistic network with the same topology. Similarly in an unsupervised setting, we show how we can learn sparse features that can be useful for discriminative tasks. All our experiments are executed in a distributed environment using several hundred machines and several hundred hours of speech data.
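The change the abstract describes is a swap of the point-wise non-linearity. Below is a minimal sketch with assumed feature dimension, depth, width, and output-state count standing in for the paper's acoustic model:

```python
# A minimal sketch contrasting logistic (sigmoid) and rectified linear hidden units.
# The input dimension, depth, width, and number of output states are assumptions.
import tensorflow as tf

def hidden_layer(x, units, use_relu=True):
    """Linear projection followed by a point-wise non-linearity."""
    y = tf.keras.layers.Dense(units)(x)
    # ReLUs are linear for positive input and zero otherwise; the paper reports
    # faster, simpler training and better generalization than with logistic units.
    return tf.nn.relu(y) if use_relu else tf.sigmoid(y)

inputs = tf.keras.Input(shape=(440,))           # assumed stacked acoustic frames
h = inputs
for _ in range(4):                              # assumed depth
    h = hidden_layer(h, 2048, use_relu=True)    # set False for the logistic baseline
outputs = tf.keras.layers.Dense(8000, activation="softmax")(h)  # assumed state targets
model = tf.keras.Model(inputs, outputs)
```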
european conference on computer systems | 2018
Yuan Yu; Martín Abadi; Paul Barham; Eugene Brevdo; Mike Burrows; Andy Davis; Jeffrey Dean; Sanjay Ghemawat; Tim Harley; Peter Hawkins; Michael Isard; Manjunath Kudlur; Rajat Monga; Derek Gordon Murray; Xiaoqiang Zheng
Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.
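TensorFlow exposes this in-graph dynamic control flow through public ops such as tf.cond and tf.while_loop. Below is a toy sketch of a data-dependent recurrence expressed with tf.while_loop and differentiated end to end; the shapes and the simple RNN-style update are illustrative assumptions, not the paper's benchmark models:

```python
# A toy sketch of in-graph dynamic control flow: a hand-rolled recurrence whose
# loop bound depends on the data, built with tf.while_loop and differentiated through.
import tensorflow as tf

batch, hidden, features = 4, 16, 8                        # assumed toy sizes
W_x = tf.Variable(tf.random.normal([features, hidden]))
W_h = tf.Variable(tf.random.normal([hidden, hidden]))

inputs = tf.random.normal([batch, 20, features])          # [batch, time, features]
num_steps = tf.shape(inputs)[1]                           # loop bound taken from the data

def condition(t, state):
    return t < num_steps

def body(t, state):
    # Each iteration is expressed as dataflow; the runtime can partition the body
    # across devices and, with non-strict semantics, overlap iterations and I/O.
    state = tf.tanh(inputs[:, t, :] @ W_x + state @ W_h)
    return t + 1, state

with tf.GradientTape() as tape:
    _, final_state = tf.while_loop(
        condition, body, [tf.constant(0), tf.zeros([batch, hidden])])
    loss = tf.reduce_sum(final_state ** 2)

# Automatic differentiation propagates gradients through every loop iteration.
grads = tape.gradient(loss, [W_x, W_h])
```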
neural information processing systems | 2012
Jeffrey Dean; Greg Corrado; Rajat Monga; Kai Chen; Matthieu Devin; Mark Mao; Marc'Aurelio Ranzato; Andrew W. Senior; Paul A. Tucker; Ke Yang; Quoc V. Le; Andrew Y. Ng
international conference on machine learning | 2012
Marc'Aurelio Ranzato; Rajat Monga; Matthieu Devin; Kai Chen; Greg Corrado; Jeffrey Dean; Quoc V. Le; Andrew Y. Ng
arXiv: Distributed, Parallel, and Cluster Computing | 2015
Martín Abadi; Ashish Agarwal; Paul Barham; Eugene Brevdo; Zhifeng Chen; Craig Citro; Gregory S. Corrado; Andy Davis; Jeffrey Dean; Matthieu Devin; Sanjay Ghemawat; Ian J. Goodfellow; Andrew Harp; Geoffrey Irving; Michael Isard; Yangqing Jia; Rafal Jozefowicz; Lukasz Kaiser; Manjunath Kudlur; Josh Levenberg; Dan Mané; Rajat Monga; Sherry Moore; Derek Gordon Murray; Chris Olah; Mike Schuster; Jonathon Shlens; Benoit Steiner; Ilya Sutskever; Kunal Talwar
operating systems design and implementation | 2016
Martín Abadi; Paul Barham; Zhifeng Chen; Andy Davis; Jeffrey Dean; Matthieu Devin; Sanjay Ghemawat; Geoffrey Irving; Michael Isard; Manjunath Kudlur; Josh Levenberg; Rajat Monga; Sherry Moore; Derek Gordon Murray; Benoit Steiner; Paul A. Tucker; Vijay Vasudevan; Pete Warden; Martin Wicke; Yuan Yu; Xiaoqiang Zheng
conference of the international speech communication association | 2014
Hasim Sak; Oriol Vinyals; Georg Heigold; Andrew W. Senior; Erik McDermott; Rajat Monga; Mark Z. Mao
arXiv: Learning | 2017
Xinghao Pan; Rajat Monga; Samy Bengio; Rafal Jozefowicz
arXiv: Neural and Evolutionary Computing | 2015
Sudheendra Vijayanarasimhan; Jonathon Shlens; Rajat Monga; Jay Yagnik