Robert L. Rounthwaite

Publication

Featured research published by Robert L. Rounthwaite.

Journal of Machine Learning Research | 2001

Dependency networks for inference, collaborative filtering, and data visualization

David Heckerman; David Maxwell Chickering; Christopher Meek; Robert L. Rounthwaite; Carl M. Kadie

We describe a graphical model for probabilistic relationships--an alternative to the Bayesian network--called a dependency network. The graph of a dependency network, unlike a Bayesian network, is potentially cyclic. The probability component of a dependency network, like a Bayesian network, is a set of conditional distributions, one for each node given its parents. We identify several basic properties of this representation and describe a computationally efficient procedure for learning the graph and probability components from data. We describe the application of this representation to probabilistic inference, collaborative filtering (the task of predicting preferences), and the visualization of acausal predictive relationships.
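To make the representation concrete, here is a minimal sketch of a two-node cyclic network whose probability component is one conditional distribution per node given its parent. The abstract does not say which inference procedure the authors use; Gibbs sampling is assumed here as one common way to draw marginals from a set of local conditionals. The node names, probabilities, and the `gibbs_marginals` helper are all hypothetical.

```python
import random
from typing import Callable, Dict

State = Dict[str, int]

# Hypothetical cyclic network: A depends on B, and B depends on A.
# Each function returns P(node = 1 | current values of its parents).
CONDITIONALS: Dict[str, Callable[[State], float]] = {
    "A": lambda s: 0.8 if s["B"] == 1 else 0.2,
    "B": lambda s: 0.8 if s["A"] == 1 else 0.2,
}


def gibbs_marginals(
    conditionals: Dict[str, Callable[[State], float]],
    sweeps: int,
    burn_in: int,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate P(node = 1) for each node by resampling nodes in a fixed order."""
    rng = random.Random(seed)
    state: State = {name: 0 for name in conditionals}
    counts = {name: 0 for name in conditionals}
    for sweep in range(burn_in + sweeps):
        # Resample every node from its own conditional, given the others.
        for name, p_one in conditionals.items():
            state[name] = 1 if rng.random() < p_one(state) else 0
        # Only count samples drawn after the burn-in period.
        if sweep >= burn_in:
            for name in conditionals:
                counts[name] += state[name]
    return {name: counts[name] / sweeps for name in conditionals}


marginals = gibbs_marginals(CONDITIONALS, sweeps=20000, burn_in=1000)
```

These two conditionals happen to be consistent with a single joint distribution in which each node equals 1 half the time, so both estimates should land close to 0.5.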

Electronic Commerce | 2004

Stopping outgoing spam

Joshua T. Goodman; Robert L. Rounthwaite

We analyze the problem of preventing outgoing spam. We show that some conventional techniques for limiting outgoing spam are likely to be ineffective. We show that while imposing per-message costs would work, less annoying techniques also work. In particular, it is only necessary that the average cost to the spammer over the lifetime of an account exceed his profits, meaning that not every message need be challenged. We develop three techniques: one based on additional HIP challenges, one based on computational challenges, and one based on paid subscriptions. Each system is designed to impose minimal costs on legitimate users while being too costly for spammers. We also show that maximizing complaint rates is a key factor, and suggest new standards to encourage high complaint rates.
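The lifetime-cost condition can be illustrated with a small arithmetic sketch. The function name and all of the figures below are hypothetical and are not taken from the paper; the only idea carried over from the abstract is that total cost over the account's lifetime must exceed total profit, so challenges can be spread across many messages.

```python
def account_is_unprofitable(
    messages_sent: int,
    profit_per_message: float,
    challenges_issued: int,
    cost_per_challenge: float,
) -> bool:
    """Return True when lifetime challenge cost exceeds lifetime spam profit."""
    total_profit = messages_sent * profit_per_message
    total_cost = challenges_issued * cost_per_challenge
    return total_cost > total_profit


# Assumed figures: 1000 messages at $0.001 profit each, $0.05 per challenge.
# One challenge per 10 messages: cost 5.00 vs. profit 1.00.
frequent = account_is_unprofitable(1000, 0.001, 100, 0.05)
# One challenge per 100 messages: cost 0.50 vs. profit 1.00.
sparse = account_is_unprofitable(1000, 0.001, 10, 0.05)
```

Under these assumed numbers, challenging one message in ten is already enough to make the account a net loss, while one in a hundred is not.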

International World Wide Web Conferences | 2004

Filtering spam e-mail on a global scale

Geoff Hulten; Joshua T. Goodman; Robert L. Rounthwaite

In this paper we analyze a very large junk e-mail corpus generated by a hundred thousand volunteer users of the Hotmail e-mail service. We describe how the corpus is being collected and analyze the geographic origins of the e-mail, who the e-mail is targeting, and what the e-mail is selling.

International Conference on Data Mining | 2001

Efficient determination of dynamic split points in a decision tree

David Maxwell Chickering; Christopher Meek; Robert L. Rounthwaite

We consider the problem of choosing split points for continuous predictor variables in a decision tree. Previous approaches to this problem typically either: (1) discretize the continuous predictor values prior to learning, or (2) apply a dynamic method that considers all possible split points for each potential split. We describe a number of alternative approaches that generate a small number of candidate split points dynamically with little overhead. We argue that these approaches are preferable to pre-discretization, and provide experimental evidence that they yield probabilistic decision trees with the same prediction accuracy as the traditional dynamic approach. Furthermore, because the time to grow a decision tree is proportional to the number of split points evaluated, our approach is significantly faster than the traditional dynamic approach.
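The contrast between evaluating every possible split point and evaluating a small candidate set can be sketched as follows. The abstract does not name the authors' candidate-generation methods, so evenly spaced quantiles are used here purely as a stand-in; both function names are hypothetical.

```python
from typing import List, Sequence


def all_midpoint_splits(values: Sequence[float]) -> List[float]:
    """Traditional dynamic approach: one candidate between each pair of
    adjacent distinct values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def quantile_splits(values: Sequence[float], k: int) -> List[float]:
    """Stand-in cheap approach: at most k thresholds taken at evenly spaced
    ranks of the sorted data. A row goes left when x <= threshold."""
    ordered = sorted(values)
    n = len(ordered)
    picks = {ordered[(i * n) // (k + 1)] for i in range(1, k + 1)}
    # A threshold equal to the maximum would send every row left.
    picks.discard(ordered[-1])
    return sorted(picks)


values = list(range(1, 101))            # 100 distinct predictor values
exhaustive = all_midpoint_splits(values)
candidates = quantile_splits(values, 3)
```

For these 100 distinct values the exhaustive method scores 99 thresholds and the quantile stand-in scores 3, which is where the proportional speedup described in the abstract would come from.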

Web Search and Data Mining | 2011

Searchable web sites recommendation

Yang Song; Nam Nguyen; Li-wei He; Scott K. Imig; Robert L. Rounthwaite

In this paper, we propose a new framework for searchable web sites recommendation. Given a query, our system will recommend a list of searchable web sites ranked by relevance, which can be used to complement the web page results and ads from a search engine. We model the conditional probability of a searchable web site being relevant to a given query in terms of three main components: the language model of the query, the language model of the content within the web site, and the reputation of the web site's searching capability (static rank). The language models for queries and searchable sites are built using information mined from client-side browsing logs. The static rank for each searchable site leverages features extracted from these client-side logs, such as the number of queries that are submitted to this site, and features extracted from general search engines, such as the number of web pages that are indexed for this site, the number of clicks per query, and the dwell time that a user spends on the search result page and on the clicked result web pages. We also learn a weight for each kind of feature to optimize the ranking performance. In our experiment, we discover 10.5 thousand searchable sites and use 5 million unique queries, extracted from one week of log data, to build and demonstrate the effectiveness of our searchable web site recommendation system.
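One way to picture how three components and learned weights could combine into a ranking is a weighted sum of log scores. The abstract does not give the functional form, so the log-linear combination, the component names, the site names, and every number below are assumptions made for illustration only.

```python
import math
from typing import Dict, List

Components = Dict[str, float]


def site_score(components: Components, weights: Dict[str, float]) -> float:
    """Weighted sum of log component scores (assumed log-linear form)."""
    return sum(weights[name] * math.log(value) for name, value in components.items())


def rank_sites(sites: Dict[str, Components], weights: Dict[str, float]) -> List[str]:
    """Return site names ordered from highest to lowest score."""
    return sorted(sites, key=lambda name: site_score(sites[name], weights), reverse=True)


# Hypothetical per-query component scores in (0, 1] for two searchable sites.
sites = {
    "site_a": {"query_lm": 0.20, "content_lm": 0.10, "static_rank": 0.50},
    "site_b": {"query_lm": 0.05, "content_lm": 0.10, "static_rank": 0.90},
}

equal_weights = {"query_lm": 1.0, "content_lm": 1.0, "static_rank": 1.0}
reputation_heavy = {"query_lm": 1.0, "content_lm": 1.0, "static_rank": 5.0}

ranking_equal = rank_sites(sites, equal_weights)
ranking_reputation = rank_sites(sites, reputation_heavy)
```

With equal weights the site that matches the query better ranks first; raising the static-rank weight flips the order, which shows why the weights are worth learning rather than fixing by hand.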

Conference on Human Information Interaction and Retrieval | 2018

Characterizing Search Behavior in Productivity Software

Horatiu Bota; Adam Fourney; Susan T. Dumais; Tomasz L. Religa; Robert L. Rounthwaite

Complex software applications expose hundreds of commands to users through intricate menu hierarchies. One of the most popular productivity software suites, Microsoft Office, has recently developed functionality that allows users to issue free-form text queries to a search system to quickly find commands they want to execute, retrieve help documentation or access web results in a unified interface. In this paper, we analyze millions of search sessions originating from within Microsoft Office applications, collected over one month of activity, in an effort to characterize search behavior in productivity software. Our research brings together previous efforts in analyzing command usage in large-scale applications and efforts in understanding search behavior in environments other than the web. Our findings show that users engage primarily in command search, and that re-accessing commands through search is a frequent behavior. Our work represents the first large-scale analysis of search over command spaces and is an important first step in understanding how search systems integrated with productivity software can be successfully developed.

Archive | 2004

Feedback loop for spam prevention

Robert L. Rounthwaite; Joshua T. Goodman; David Heckerman; John D. Mehr; Nathan D. Howell; Micah C. Rupersburg; Dean A. Slawson

Archive | 2004

Origination/destination features and lists for spam prevention

Joshua T. Goodman; Robert L. Rounthwaite; Daniel Gwozdz; John D. Mehr; Nathan D. Howell; Micah C. Rupersburg; Bryan T. Starbuck

Archive | 2005

Phishing detection, prevention, and notification

Joshua T. Goodman; Paul S. Rehfuss; Robert L. Rounthwaite; Manav Mishra; Geoffrey J. Hulten; Kenneth G. Richards; Aaron H. Averbuch; Anthony P. Penta; Roderict C. Deyo

Archive | 2002