Roberto J. Bayardo
Publication
Featured research published by Roberto J. Bayardo.
international world wide web conferences | 2007
Roberto J. Bayardo; Yiming Ma; Ramakrishnan Srikant
Given a large collection of sparse vector data in a high dimensional space, we investigate the problem of finding all pairs of vectors whose similarity score (as determined by a function such as cosine distance) is above a given threshold. We propose a simple algorithm based on novel indexing and optimization strategies that solves this problem without relying on approximation methods or extensive parameter tuning. We show the approach efficiently handles a variety of datasets across a wide setting of similarity thresholds, with large speedups over previous state-of-the-art approaches.
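To make the problem statement concrete, here is a minimal sketch of exact all-pairs search over sparse vectors using a plain inverted index. It is not the paper's algorithm and includes none of its indexing or pruning optimizations. The function names, the dict-based vector representation, and the choice of cosine similarity with a "greater than or equal to" threshold test are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def normalize(vec):
    """Scale a sparse vector (feature -> weight) to unit length.

    Assumes the vector has at least one non-zero weight.
    """
    norm = sqrt(sum(w * w for w in vec.values()))
    return {feat: w / norm for feat, w in vec.items()}


def all_pairs(vectors, threshold):
    """Return {(i, j): similarity} for every i < j whose cosine
    similarity is at least `threshold`.

    Each vector is scored against an inverted index of the vectors
    seen so far, then added to the index itself, so only pairs that
    share at least one feature are ever compared.
    """
    index = defaultdict(list)  # feature -> [(vector id, weight)]
    results = {}
    for j, raw in enumerate(vectors):
        vec = normalize(raw)
        scores = defaultdict(float)  # earlier vector id -> partial dot product
        for feat, w in vec.items():
            for i, wi in index[feat]:
                scores[i] += w * wi
        for i, score in scores.items():
            if score >= threshold:
                results[(i, j)] = score
        for feat, w in vec.items():
            index[feat].append((j, w))
    return results


docs = [{"a": 1.0, "b": 1.0}, {"a": 1.0, "b": 1.0}, {"c": 1.0}]
pairs = all_pairs(docs, threshold=0.9)
```

In this toy input only the first two vectors share features, so only that pair is scored and reported. The result is exact rather than approximate, which is the property the abstract emphasizes.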
very large data bases | 2009
Biswanath Panda; Joshua Seth Herbach; Sugato Basu; Roberto J. Bayardo
Classification and regression tree learning on massive datasets is a common data mining task at Google, yet many state-of-the-art tree learning algorithms require training data to reside in memory on a single machine. While more scalable implementations of tree learning have been proposed, they typically require specialized parallel computing architectures. In contrast, the majority of Google's computing infrastructure is based on commodity hardware. In this paper, we describe PLANET: a scalable distributed framework for learning tree models over large datasets. PLANET defines tree learning as a series of distributed computations, and implements each one using the MapReduce model of distributed computation. We show how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models. We discuss the benefits and challenges of using a MapReduce compute cluster for tree learning, and demonstrate the scalability of this approach by applying it to a real-world learning task from the domain of computational advertising.
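The idea of expressing one tree-learning step as a map phase and a reduce phase can be sketched in a single process as follows. This is an illustrative assumption about how such a step might look, not PLANET's actual implementation: the function names, the single numeric feature, and the use of squared error as the split criterion are all hypothetical. Each mapper summarizes its partition as (count, sum of y, sum of y squared) per candidate threshold, and the reducer merges those summaries to score every split without holding the rows in memory.

```python
from collections import defaultdict


def map_partition(rows, thresholds):
    """Map step: summarize one data partition.

    rows are (x, y) pairs. For each candidate threshold t, accumulate
    statistics of the rows with x < t; the key "all" covers every row.
    """
    stats = defaultdict(lambda: [0, 0.0, 0.0])  # key -> [n, sum y, sum y^2]
    for x, y in rows:
        for key in ["all"] + [t for t in thresholds if x < t]:
            s = stats[key]
            s[0] += 1
            s[1] += y
            s[2] += y * y
    return stats


def reduce_stats(partials):
    """Reduce step: add up the per-partition statistics key by key."""
    merged = defaultdict(lambda: [0, 0.0, 0.0])
    for part in partials:
        for key, (n, sy, syy) in part.items():
            m = merged[key]
            m[0] += n
            m[1] += sy
            m[2] += syy
    return merged


def sse(n, sy, syy):
    """Sum of squared errors around the mean, from sufficient statistics."""
    return syy - sy * sy / n if n else 0.0


def best_split(partitions, thresholds):
    """Pick the threshold that minimizes left SSE plus right SSE."""
    merged = reduce_stats(map_partition(p, thresholds) for p in partitions)
    n, sy, syy = merged["all"]
    best = None
    for t in thresholds:
        ln, lsy, lsyy = merged[t]  # zeros if no row fell left of t
        rn, rsy, rsyy = n - ln, sy - lsy, syy - lsyy
        if ln == 0 or rn == 0:
            continue  # a split must put rows on both sides
        cost = sse(ln, lsy, lsyy) + sse(rn, rsy, rsyy)
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


partitions = [[(1, 1.0), (2, 1.0)], [(8, 5.0), (9, 5.0)]]
split = best_split(partitions, thresholds=[1.5, 5, 8.5])
```

Because the statistics are additive, the partitions could live on different machines and only the small per-threshold summaries would need to be shuffled to the reducer.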
knowledge discovery and data mining | 2009
D. Sculley; Robert G. Malkin; Sugato Basu; Roberto J. Bayardo
This paper explores an important and relatively unstudied quality measure of a sponsored search advertisement: bounce rate. The bounce rate of an ad can be informally defined as the fraction of users who click on the ad but almost immediately move on to other tasks. A high bounce rate can lead to poor advertiser return on investment, and suggests search engine users may be having a poor experience following the click. In this paper, we first provide quantitative analysis showing that bounce rate is an effective measure of user satisfaction. We then address the question, can we predict bounce rate by analyzing the features of the advertisement? An affirmative answer would allow advertisers and search engines to predict the effectiveness and quality of advertisements before they are shown. We propose solutions to this problem involving large-scale learning methods that leverage features drawn from ad creatives in addition to their keywords and landing pages.
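The informal definition of bounce rate above can be written as a small calculation. The sketch below assumes that "almost immediately" is approximated by a post-click dwell time below a fixed cutoff, and that each click is counted once. The 5-second default and the function name are hypothetical choices for illustration, not values taken from the paper.

```python
def bounce_rate(dwell_seconds, cutoff=5.0):
    """Fraction of ad clicks whose dwell time is below `cutoff` seconds.

    dwell_seconds: one post-click dwell time per click.
    cutoff: hypothetical stand-in for "almost immediately".
    """
    if not dwell_seconds:
        raise ValueError("bounce rate is undefined with no clicks")
    bounces = sum(1 for d in dwell_seconds if d < cutoff)
    return bounces / len(dwell_seconds)


rate = bounce_rate([1.0, 2.0, 30.0, 60.0])
```

Here two of the four clicks fall under the cutoff, so the rate is 0.5.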
Archive | 2008
Mayur Datar; Roberto J. Bayardo
Archive | 2008
Weipeng Yan; Nicholas C. Fox; Roberto J. Bayardo; David Chen Chang; Monica Chawathe
Archive | 2008
Roberto J. Bayardo; Yiming Ma; Ramakrishnan Srikant
Archive | 2008
Roberto J. Bayardo; Rajat Jain; Ramakrishnan Srikant; Diane L. Tang
Archive | 2012
Yifang Liu; Roberto J. Bayardo; Guangyu Zhu
Archive | 2011
Biswanath Panda; Joshua Seth Herbach; Sugato Basu; Roberto J. Bayardo
Archive | 2011
Roberto J. Bayardo; Uma Mahadevan; Giao Nguyen; Shivakumar Venkataraman; Adam Isaac Juda