Publication


Featured research published by Scott Gaffney.


Conference on Information and Knowledge Management | 2009

Improving web page classification by label-propagation over click graphs

Soo-Min Kim; Patrick Pantel; Lei Duan; Scott Gaffney

In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled similar documents. Current state-of-the-art classifiers are supervised and require large amounts of manually labeled data. We hypothesize that unlabeled documents similar to our positive and negative labeled documents tend to be clicked through by the same user queries. Our proposed method leverages this hypothesis and augments our training set by modeling the similarity between documents in a click graph. We experiment with three different web page classifiers and show empirical evidence that our proposed approach outperforms state-of-the-art methods and reduces the amount of human effort to label training data.
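The propagation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `propagate_labels`, the averaging update rule, and the toy click log are all assumptions chosen for illustration. Labeled seed documents hold scores of +1 or -1, and unlabeled documents repeatedly take the average score of the queries that led to clicks on them.

```python
from collections import defaultdict


def propagate_labels(clicks, seeds, iterations=10):
    """Spread seed labels over a bipartite query-document click graph.

    clicks: iterable of (query, doc) pairs taken from a click log.
    seeds:  dict mapping a labeled doc to +1.0 (positive) or -1.0 (negative).
    Returns a dict mapping every clicked doc to a score in [-1, 1].
    The averaging rule is an illustrative assumption, not the paper's method.
    """
    docs_by_query = defaultdict(set)
    queries_by_doc = defaultdict(set)
    for query, doc in clicks:
        docs_by_query[query].add(doc)
        queries_by_doc[doc].add(query)

    # Unlabeled documents start at 0 (no evidence either way).
    scores = {doc: seeds.get(doc, 0.0) for doc in queries_by_doc}
    for _ in range(iterations):
        # A query's score is the mean score of the documents clicked for it.
        query_scores = {
            q: sum(scores[d] for d in ds) / len(ds)
            for q, ds in docs_by_query.items()
        }
        updated = {}
        for doc, qs in queries_by_doc.items():
            if doc in seeds:
                updated[doc] = seeds[doc]  # keep manual labels fixed
            else:
                updated[doc] = sum(query_scores[q] for q in qs) / len(qs)
        scores = updated
    return scores


# Hypothetical toy click log: documents "b" and "d" are unlabeled.
clicks = [
    ("cheap flights", "a"), ("cheap flights", "b"),
    ("python tutorial", "c"), ("python tutorial", "d"),
]
scores = propagate_labels(clicks, seeds={"a": 1.0, "c": -1.0})
```

In this toy log, "b" shares a query with the positive seed and "d" with the negative seed, so their scores drift toward the corresponding labels. Documents whose scores pass a chosen threshold could then be added to the training set.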


International World Wide Web Conferences | 2010

A large-scale active learning system for topical categorization on the web

Suju Rajan; Dragomir Yankov; Scott Gaffney; Adwait Ratnaparkhi

Many web applications such as ad matching systems, vertical search engines, and page categorization systems require the identification of a particular type or class of pages on the Web. The sheer number and diversity of the pages on the Web, however, make it hard to obtain a good sample of the class of interest. In this paper, we describe a successfully deployed end-to-end system that starts from a biased training sample and makes use of several state-of-the-art machine learning algorithms working in tandem, including a powerful active learning component, in order to achieve a good classification system. The system is evaluated on traffic from a real-world ad-matching platform and is shown to achieve high categorization effectiveness with a significant reduction in editorial effort and labeling time.


International Conference on Data Mining | 2010

Learning Document Labels from Enriched Click Graphs

Lan Nie; Zhigang Hua; Xiaofeng He; Scott Gaffney

Document classification plays an increasingly important role in extracting and organizing knowledge. However, Web document classification is hindered by the huge number of Web documents and the limited human judgment resources available for labeling training data. To obtain sufficient training data cost-efficiently, we propose a semi-supervised learning approach that predicts a document’s class label by mining the click graph. To overcome the sparseness of the click graph, we enrich it with hyperlinks between Web documents. Content-based constraints are further added to regularize the graph. The resulting graph unifies three data sources: click-through data, hyperlinks, and content relevance. Starting from a very small seed set of manually labeled documents, we automatically discover a large number of relevant documents by applying a Markov random walk model to the enriched click graph. The top pages with high confidence scores are added to the current training data for classifier training. We investigate various combinations of the three sources and conduct extensive experiments on six typical web classification tasks. The experimental results show that the click graph enriched with hyperlink and content information can significantly improve classification quality across multiple tasks with only minimal human labeling cost.
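A rough sketch of the pipeline this abstract describes: merge several edge sources into one weighted graph, then score documents by a walk that starts from the labeled seeds. The abstract does not specify the walk, so a random walk with restart is swapped in here as an assumption. The function names, mixing weights, restart probability, and toy graphs are likewise hypothetical.

```python
def combine_graphs(graphs, weights):
    """Merge weighted adjacency dicts (node -> {neighbor: weight}) into one."""
    combined = {}
    for graph, w in zip(graphs, weights):
        for src, neighbors in graph.items():
            row = combined.setdefault(src, {})
            for dst, value in neighbors.items():
                row[dst] = row.get(dst, 0.0) + w * value
    return combined


def random_walk_with_restart(graph, seeds, restart=0.15, iterations=50):
    """Score nodes by a walk that jumps back to the seed set with prob. restart.

    Assumes every seed appears in the graph. This is an illustrative stand-in
    for the Markov random walk model named in the abstract.
    """
    nodes = set(graph)
    for neighbors in graph.values():
        nodes.update(neighbors)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        nxt = {n: restart * start[n] for n in nodes}
        for src in nodes:
            neighbors = graph.get(src, {})
            total = sum(neighbors.values())
            if total == 0:
                # Dangling node: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * scores[src] * start[n]
                continue
            for dst, value in neighbors.items():
                nxt[dst] += (1 - restart) * scores[src] * value / total
        scores = nxt
    return scores


# Hypothetical toy data: three edge sources over documents a-d.
click_edges = {"a": {"b": 1.0}, "b": {"a": 1.0}}
link_edges = {"b": {"c": 1.0}, "c": {"b": 1.0}}
content_edges = {"d": {"d": 1.0}}
enriched = combine_graphs(
    [click_edges, link_edges, content_edges], [1.0, 0.5, 0.5]
)
scores = random_walk_with_restart(enriched, seeds={"a"})
```

Here "c" is reachable from the seed only through a hyperlink edge, which shows how enrichment can connect documents that the click graph alone leaves isolated. The highest-scoring unlabeled documents would be the candidates for addition to the training data.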


Climate Dynamics | 2007

Probabilistic clustering of extratropical cyclones using regression mixture models

Scott Gaffney; Andrew W. Robertson; Padhraic Smyth; Suzana J. Camargo; Michael Ghil


Archive | 2007

Granular Data for Behavioral Targeting

John Canny; Shi Zhong; Scott Gaffney; Chad Brower; Pavel Berkhin; George H. John


International Conference on Computational Linguistics | 2010

Resolving Surface Forms to Wikipedia Topics

Yiping Zhou; Lan Nie; Omid Rouhani-kalleh; Flavian Vasile; Scott Gaffney


Archive | 2007

Granular data for behavioral targeting using predictive models

John Canny; Shi Zhong; Scott Gaffney; Chad Brower; Pavel Berkhin; George H. John


Archive | 2011

Method and System for Generating A Linear Machine Learning Model for Predicting Online User Input Actions

John Canny; Shi Zhong; Scott Gaffney; Chad Brower; Pavel Berkhin; George H. John


Archive | 2013

Method and System for Measuring User Engagement Using Click/Skip in Content Stream

Nathan Liu; Scott Gaffney; Jean-Marc Langlois


Archive | 2008

Measuring Topical Coherence of Keyword Sets

Suju Rajan; Scott Gaffney
