
Publication


Featured research published by Tushar Deepak Chandra.


ACM Transactions on Computer Systems | 2008

Bigtable: A Distributed Storage System for Structured Data

Fay W. Chang; Jeffrey Dean; Sanjay Ghemawat; Wilson C. Hsieh; Deborah A. Wallach; Michael Burrows; Tushar Deepak Chandra; Andrew Fikes; Robert Gruber

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.
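
Concretely, the paper defines Bigtable as a sparse, distributed, persistent, multi-dimensional sorted map indexed by (row key, column key, timestamp). Below is a minimal in-memory sketch of just that map shape, for illustration only; it is not Google's implementation, and the row/column names echo the paper's web-table example.

```python
from collections import defaultdict

class SketchTable:
    """Minimal in-memory sketch of Bigtable's data model: a sparse, sorted
    map from (row key, column key, timestamp) to an uninterpreted string
    value. Illustration only, not Google's implementation."""

    def __init__(self):
        # row -> column -> list of (timestamp, value), kept newest-first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row: str, column: str, timestamp: int, value: bytes):
        cells = self._rows[row][column]
        cells.append((timestamp, value))
        cells.sort(key=lambda tv: -tv[0])  # keep versions newest-first

    def get(self, row: str, column: str, timestamp: int = None):
        """Return the newest cell, or the newest at or before `timestamp`."""
        cells = self._rows[row].get(column, [])
        for ts, value in cells:  # scanned newest-first
            if timestamp is None or ts <= timestamp:
                return ts, value
        return None

# Usage: multiple timestamped versions of one cell, as in the paper's example.
t = SketchTable()
t.put("com.cnn.www", "contents:", 3, b"<html>...v3")
t.put("com.cnn.www", "contents:", 1, b"<html>...v1")
print(t.get("com.cnn.www", "contents:"))     # (3, b'<html>...v3')
print(t.get("com.cnn.www", "contents:", 2))  # (1, b'<html>...v1')
```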


international conference on machine learning | 2008

Efficient projections onto the ℓ1-ball for learning in high dimensions

John C. Duchi; Shai Shalev-Shwartz; Yoram Singer; Tushar Deepak Chandra

We describe efficient algorithms for projecting a vector onto the ℓ1-ball. We present two methods for projection. The first performs exact projection in O(n) expected time, where n is the dimension of the space. The second works on vectors in which k elements are perturbed outside the ℓ1-ball, projecting in O(k log(n)) time. This setting is especially useful for online learning in sparse feature spaces such as text categorization applications. We demonstrate the merits and effectiveness of our algorithms in numerous batch and online learning tasks. We show that variants of stochastic gradient projection methods augmented with our efficient projection procedures outperform interior point methods, which are considered state-of-the-art optimization techniques. We also show that in online settings gradient updates with ℓ1 projections outperform the exponentiated gradient algorithm while obtaining models with high degrees of sparsity.
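
For intuition, the classic sort-and-threshold projection onto the ℓ1-ball runs in O(n log n); the paper's contribution is the faster O(n) expected-time and O(k log n) variants. The sketch below implements the standard sort-based method, not the paper's code.

```python
import numpy as np

def project_l1_ball(v: np.ndarray, z: float = 1.0) -> np.ndarray:
    """Euclidean projection of v onto the l1-ball of radius z via the
    classic O(n log n) sort-based method. The paper's algorithms reach
    O(n) expected / O(k log n) time instead."""
    if np.abs(v).sum() <= z:
        return v.copy()               # already inside the ball
    u = np.sort(np.abs(v))[::-1]      # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index rho with u[rho] > (css[rho] - z) / (rho + 1)
    rho = np.nonzero(u > (css - z) / np.arange(1, len(u) + 1))[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)
    # soft-threshold: shrink magnitudes by theta, keep signs
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

# Example: project a weight vector back onto the unit l1-ball after a step.
w = np.array([0.8, -0.5, 0.3, 0.1])
w = project_l1_ball(w, z=1.0)
print(w, np.abs(w).sum())  # l1 norm is now <= 1
```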


principles of distributed computing | 2007

Paxos made live: an engineering perspective

Tushar Deepak Chandra; Robert Griesemer; Joshua Redstone

We describe our experience in building a fault-tolerant database using the Paxos consensus algorithm. Despite the existing literature in the field, building such a database proved to be non-trivial. We describe selected algorithmic and engineering problems encountered, and the solutions we found for them. Our measurements indicate that we have built a competitive system.
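
For orientation, the single-decree Paxos at the heart of such a system is a two-phase protocol between proposers and acceptors. Below is a toy in-memory sketch of that skeleton, in which direct method calls stand in for messages; it is not the fault-tolerant production system the paper describes.

```python
class Acceptor:
    """One Paxos acceptor: remembers the highest ballot it promised
    and the last (ballot, value) pair it accepted."""
    def __init__(self):
        self.promised = -1
        self.accepted = None  # (ballot, value) or None

    def prepare(self, ballot):
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, ballot, value):
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return "accepted"
        return "nack"

def propose(acceptors, ballot, value):
    """Single-decree Paxos proposer: phase 1 (prepare/promise), then
    phase 2 (accept), adopting the highest previously accepted value
    if one exists. Returns the chosen value, or None to retry."""
    quorum = len(acceptors) // 2 + 1
    promises = [a.prepare(ballot) for a in acceptors]
    granted = [acc for tag, acc in promises if tag == "promise"]
    if len(granted) < quorum:
        return None  # retry with a higher ballot
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior)[1]  # must adopt highest-ballot acceptance
    acks = [a.accept(ballot, value) for a in acceptors]
    return value if acks.count("accepted") >= quorum else None

acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, ballot=1, value="x"))  # 'x' is chosen
print(propose(acceptors, ballot=2, value="y"))  # still 'x': the choice is stable
```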


conference on recommender systems | 2016

Wide & Deep Learning for Recommender Systems

Heng-Tze Cheng; Levent Koc; Jeremiah Harmsen; Tal Shaked; Tushar Deepak Chandra; Hrishi Aradhye; Glen Anderson; Gregory S. Corrado; Wei Chai; Mustafa Ispir; Rohan Anil; Zakaria Haque; Lichan Hong; Vihan Jain; Xiaobing Liu; Hemal Shah

Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations is effective and interpretable, while generalization requires more feature engineering effort. With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features. However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank. In this paper, we present Wide & Deep learning (jointly trained wide linear models and deep neural networks) to combine the benefits of memorization and generalization for recommender systems. We productionized and evaluated the system on Google Play, a commercial mobile app store with over one billion active users and over one million apps. Online experiment results show that Wide & Deep significantly increased app acquisitions compared with wide-only and deep-only models. We have also open-sourced our implementation in TensorFlow.
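
Structurally, the combined model predicts P(y=1|x) = sigmoid(wide logit + deep logit + bias), with the two parts trained jointly. Below is a minimal forward-pass sketch with made-up sizes and random weights; the feature names and dimensions are hypothetical, and this is not the paper's TensorFlow implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, for illustration only.
N_WIDE = 1000          # cross-product / raw binary features (wide part)
VOCAB, EMB = 500, 8    # sparse vocabulary size and embedding dim (deep part)
HIDDEN = 32

w_wide = rng.normal(0, 0.01, N_WIDE)
emb_table = rng.normal(0, 0.01, (VOCAB, EMB))
W1 = rng.normal(0, 0.1, (2 * EMB, HIDDEN))  # two sparse fields concatenated
W2 = rng.normal(0, 0.1, HIDDEN)
b = 0.0

def predict(wide_idx, sparse_ids):
    """P(y=1|x) = sigmoid(wide logit + deep logit + bias): the wide part
    memorizes cross-product features via a sparse dot product; the deep
    part generalizes via learned embeddings fed through an MLP."""
    wide_logit = w_wide[wide_idx].sum()          # sparse linear model
    deep_in = np.concatenate([emb_table[i] for i in sparse_ids])
    hidden = np.maximum(deep_in @ W1, 0.0)       # one ReLU layer
    deep_logit = hidden @ W2
    return 1.0 / (1.0 + np.exp(-(wide_logit + deep_logit + b)))

# e.g. active cross-feature ids plus two sparse categorical ids
print(predict(wide_idx=[3, 17, 256], sparse_ids=[42, 301]))
```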


principles of distributed computing | 2016

An Algorithm for Replicated Objects with Efficient Reads

Tushar Deepak Chandra; Vassos Hadzilacos; Sam Toueg

The problem. We consider the problem of implementing a consistent replicated object in a partially synchronous message-passing distributed system susceptible to process and communication failures. The object is a generic shared resource, such as a data structure, a file, or a lock. The processes implementing the replicated object access it by applying operations to it at unpredictable times and potentially concurrently. The object should be linearizable: it should behave as if each operation applied to it takes effect at a distinct instant in time during the interval between its invocation and its response. The main reason for replicating an object is fault tolerance: if a copy of the object becomes inaccessible, the object can still be accessed through the other copies. Another reason for replicating an object is performance: a process that requires the object can access its local copy. The benefit of accessing a local copy, however, must be weighed against the cost of keeping the copies consistent.
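
As a toy illustration of this setting (not the paper's efficient-read algorithm): state machine replication keeps the copies consistent by funneling every operation through a single total order, and a purely local read that skips this ordering can return stale state. The list standing in for a consensus protocol below is an assumption for the sketch.

```python
class ReplicatedCounter:
    """Toy state-machine replication: every operation is appended to one
    totally ordered log (a plain list standing in for a consensus protocol
    such as Paxos), and each replica applies the log prefix in order.
    Illustration only, not the paper's algorithm."""

    def __init__(self, n_replicas: int):
        self.log = []                    # shared total order of operations
        self.state = [0] * n_replicas    # each replica's local copy
        self.applied = [0] * n_replicas  # log prefix applied per replica

    def submit(self, op: int):
        self.log.append(op)              # the ordering ("consensus") point

    def sync(self, r: int):
        while self.applied[r] < len(self.log):
            self.state[r] += self.log[self.applied[r]]
            self.applied[r] += 1

    def linearizable_read(self, r: int) -> int:
        self.sync(r)                     # pay the cost of catching up
        return self.state[r]

    def local_read(self, r: int) -> int:
        return self.state[r]             # cheap, but possibly stale

c = ReplicatedCounter(n_replicas=3)
c.submit(+5)
print(c.local_read(1))          # 0: stale local copy
print(c.linearizable_read(1))   # 5: consistent after syncing
```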


web search and data mining | 2015

Regressing Towards Simpler Prediction Systems

Tushar Deepak Chandra

This talk will focus on our experience in managing the complexity of Sibyl, a large-scale machine learning system that is widely used within Google. We believe that a large fraction of the challenges faced by Sibyl are inherent to large-scale production machine learning and that other production systems are likely to encounter them as well [1]. Thus, these challenges present interesting opportunities for future research.

The Sibyl system is complex for a number of reasons. We have learnt that a complete end-to-end machine learning solution has to have subsystems to address a variety of different needs: data ingestion, data analysis, data verification, experimentation, model analysis, model serving, configuration, data transformations, support for different kinds of loss functions and modeling, machine learning algorithm implementations, etc. Machine learning algorithms themselves constitute a relatively small fraction of the overall system. Each subsystem consists of a number of distinct components to support the variety of product needs. For example, Sibyl supports more than 5 different model serving systems, each with its own idiosyncrasies and challenges. In addition, Sibyl configuration contains more lines of code than the core Sibyl learner itself. Finally, existing solutions for some of the challenges don't feel adequate, and we believe these challenges present opportunities for future research.

Though the overall system is complex, our users need to be able to deploy solutions quickly. This is because a machine learning deployment is typically an iterative process of model improvements. At each iteration, our users experiment with new features, find those that improve the model's prediction capability, and then launch a new model with those improved features. A user may go through 10 or more such productive launches. Not only is speed of iteration crucial to our users, but they are often willing to sacrifice the improved prediction quality of a high-quality but cumbersome system for the speed of iteration of a lower-quality but nimble system. In this talk I will give an example of how simplification drives systems design and sometimes the design of novel algorithms.


operating systems design and implementation | 2006

Bigtable: A Distributed Storage System for Structured Data (Awarded Best Paper!)

Fay W. Chang; Jeffrey Dean; Sanjay Ghemawat; Wilson C. Hsieh; Deborah A. Wallach; Michael Burrows; Tushar Deepak Chandra; Andrew Fikes; Robert Gruber


Archive | 2007

Paxos Made Live - An Engineering Perspective (2006 Invited Talk)

Tushar Deepak Chandra; Robert Griesemer; Joshua Redstone


Archive | 2011

Generating models based on user behavior

Tushar Deepak Chandra; Tal Shaked; Tomas Lloret Llinares; Jim McFadden; Andrew Tomkins; Saurabh Mathur; Danny Wyatt

Collaboration


Dive into Tushar Deepak Chandra's collaborations.

Top Co-Authors

Yoram Singer, Hebrew University of Jerusalem
Deborah A. Wallach, Massachusetts Institute of Technology
Robert Gruber, Massachusetts Institute of Technology