Robert Cooley
University of Minnesota
Publication
Featured research published by Robert Cooley.
SIGKDD Explorations | 2000
Jaideep Srivastava; Robert Cooley; Mukund Deshpande; Pang Ning Tan
Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in detail. Given its application potential, Web usage mining has seen a rapid increase in interest, from both the research and practice communities. This paper provides a detailed taxonomy of the work in this area, including research efforts as well as commercial offerings. An up-to-date survey of the existing work is also provided. Finally, a brief overview of the WebSIFT system as an example of a prototypical Web usage mining system is given.
Knowledge and Information Systems | 1999
Robert Cooley; Bamshad Mobasher; Jaideep Srivastava
The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and of simply navigating through a Web site has increased along with this growth. An important input to these design tasks is the analysis of how a Web site is being used. Usage analysis includes straightforward statistics, such as page access frequency, as well as more sophisticated forms of analysis, such as finding the common traversal paths through a Web site. Web Usage Mining is the application of data mining techniques to usage logs of large Web data repositories in order to produce results that can be used in the design tasks mentioned above. However, there are several preprocessing tasks that must be performed prior to applying data mining algorithms to the data collected from server logs. This paper presents several data preparation techniques in order to identify unique users and user sessions. Also, a method to divide user sessions into semantically meaningful transactions is defined and successfully tested against two other methods. Transactions identified by the proposed methods are used to discover association rules from real-world data using the WEBMINER system [15].
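The user and session identification step described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a user can be approximated by an (IP address, user agent) pair and that a session ends after a fixed idle gap. The function name `sessionize`, the tuple layout of the log entries, and the 30-minute timeout are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Assumption: the abstract states no timeout; 30 minutes is an illustrative value.
SESSION_TIMEOUT_SECONDS = 30 * 60


def sessionize(log_entries, timeout=SESSION_TIMEOUT_SECONDS):
    """Group (ip, user_agent, timestamp, url) tuples into per-user sessions.

    A "user" is approximated by the (ip, user_agent) pair (an assumption).
    A new session starts whenever the gap between two consecutive requests
    from the same user exceeds `timeout` seconds.
    Returns a list of (user, [url, ...]) pairs.
    """
    hits_by_user = defaultdict(list)
    for ip, agent, timestamp, url in log_entries:
        hits_by_user[(ip, agent)].append((timestamp, url))

    sessions = []
    for user, hits in hits_by_user.items():
        hits.sort()  # order each user's requests by timestamp
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Idle gap too long: close the current session, start a new one.
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions
```

Calling `sessionize` on a list of parsed log tuples yields one entry per session, which could then be split further into transactions before rule mining.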
Communications of The ACM | 2000
Bamshad Mobasher; Robert Cooley; Jaideep Srivastava
The ease and speed with which business transactions can be carried out over the Web have been a key driving force in the rapid growth of electronic commerce. Business-to-business e-commerce is the focus of much attention today, mainly due to its huge volume. While there are certainly gains to be made in this arena, most of it is the implementation of much more efficient supply management, payments, etc. On the other hand, e-commerce activity that involves the end user is undergoing a significant revolution. The ability to track users’ browsing behavior down to individual mouse clicks has brought the vendor and end customer closer than ever before. It is now possible for a vendor to personalize his product message for individual customers at a massive scale, a phenomenon that is being referred to as mass customization.
International Conference on Tools with Artificial Intelligence | 1997
Robert Cooley; Bamshad Mobasher; Jaideep Srivastava
Application of data mining techniques to the World Wide Web, referred to as Web mining, has been the focus of several recent research projects and papers. However, there is no established vocabulary, leading to confusion when comparing research efforts. The term Web mining has been used in two distinct ways. The first, called Web content mining in this paper, is the process of information discovery from sources across the World Wide Web. The second, called Web usage mining, is the process of mining for user browsing and access patterns. We define Web mining and present an overview of the various research issues, techniques, and development efforts. We briefly describe WEBMINER, a system for Web usage mining, and conclude the paper by listing research issues.
Proceedings 1999 Workshop on Knowledge and Data Engineering Exchange (KDEX'99) (Cat. No.PR00453) | 1999
Bamshad Mobasher; Robert Cooley; Jaideep Srivastava
We describe an approach to usage-based Web personalization taking into account both the offline tasks related to the mining of usage data, and the online process of automatic Web page customization based on the mined knowledge. Specifically, we propose an effective technique for capturing common user profiles based on association rule discovery and usage-based clustering. We also propose techniques for combining this knowledge with the current status of an ongoing Web activity to perform real-time personalization. Finally, we provide an experimental evaluation of the proposed techniques using real Web usage data.
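The online step, combining offline-mined knowledge with the current state of a visit, might look roughly like the sketch below. It covers only the association-rule side and omits the clustering the abstract also mentions. The rule representation (antecedent page set, consequent page, confidence), the function name `recommend`, and the confidence threshold are hypothetical, not taken from the paper.

```python
def recommend(rules, active_session, min_confidence=0.5):
    """Rank unvisited pages suggested by rules that match the active session.

    `rules` is an iterable of (antecedent, page, confidence) triples, where
    `antecedent` is a frozenset of page URLs mined offline (assumed format).
    A rule fires when every antecedent page was already visited.
    """
    visited = set(active_session)
    scores = {}
    for antecedent, page, confidence in rules:
        if confidence < min_confidence or page in visited:
            continue  # skip weak rules and pages the user has already seen
        if antecedent <= visited:
            # Keep the strongest rule that recommends this page.
            scores[page] = max(scores.get(page, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))
```

Because the rules are mined offline, the per-request work is a single pass over the rule list, which is one plausible way to keep the online component cheap.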
IEEE Transactions on Knowledge and Data Engineering | 2004
Chris Clifton; Robert Cooley; Jason D. M. Rennie
TopCat (topic categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. We present a novel method for identifying related items based on traditional data mining techniques. Frequent itemsets are generated from the groups of items, followed by clusters formed with a hypergraph partitioning scheme. We present an evaluation against a manually categorized ground truth news corpus; it shows this technique is effective in identifying topics in collections of news articles.
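The frequent-itemset stage, with each article represented as a set of entities, can be sketched as follows. This is a brute-force count under stated assumptions, not the TopCat implementation: it enumerates small itemsets directly rather than using a pruning algorithm, and it does not attempt the hypergraph partitioning step. The function name `frequent_itemsets` and the `max_size` cap are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(articles, min_support, max_size=3):
    """Return {itemset: count} for itemsets found in >= min_support articles.

    `articles` is an iterable of entity collections, one per article.
    Itemsets are sorted tuples of up to `max_size` entities.
    """
    counts = Counter()
    for entities in articles:
        items = sorted(set(entities))  # deduplicate and fix a canonical order
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}
```

The resulting itemsets would be the input to a clustering stage that groups overlapping itemsets into topics.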
Knowledge Discovery and Data Mining | 1999
Robert Cooley; Pang Ning Tan; Jaideep Srivastava
Web Usage Mining is the application of data mining techniques to large Web data repositories in order to extract usage patterns. As with many data mining application domains, the identification of patterns that are considered interesting is a problem that must be solved in addition to simply generating them. A necessary step in identifying interesting results is quantifying what is considered uninteresting in order to form a basis for comparison. Several research efforts have relied on manually generated sets of uninteresting rules. However, manual generation of a comprehensive set of evidence about beliefs for a particular domain is impractical in many cases. Generally, domain knowledge can be used to automatically create evidence for or against a set of beliefs. This paper develops a quantitative model based on support logic for determining the interestingness of discovered patterns. For Web Usage Mining, there are three types of domain information available: usage, content, and structure. This paper also describes algorithms for using these three types of information to automatically identify interesting knowledge. These algorithms have been incorporated into the Web Site Information Filter (WebSIFT) system, and examples of interesting frequent itemsets automatically discovered from real Web data are presented.
European Conference on Principles of Data Mining and Knowledge Discovery | 1999
Chris Clifton; Robert Cooley
TopCat (topic categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. We present a novel method for identifying related items based on traditional data mining techniques. Frequent itemsets are generated from the groups of items, followed by clusters formed with a hypergraph partitioning scheme. We present an evaluation against a manually categorized ground truth news corpus; it shows this technique is effective in identifying topics in collections of news articles.
ACM Multimedia | 1998
Brian P. Bailey; Joseph A. Konstan; Robert Cooley; Moses Dejong
This chapter discusses a multimedia synchronization toolkit, called Nsync, to address the complicated issues inherent in designing flexible, interactive multimedia presentations. The toolkit consists of two primary components: a declarative synchronization definition language and a run-time presentation management system. The synchronization definition language supports the specification of synchronous interaction, asynchronous interaction, fine-grained relationships, and combinations of each through the use of conjunctive and disjunctive operators. Precomputed playout schedules are too inflexible to deal with asynchronous interaction, and a more adaptive presentation management system is required. Nsync's run-time system uses a novel predictive logic to predict the future behavior of a presentation. As the viewer makes decisions, the presentation is updated and new predictions are made to maintain consistency with the viewer's wishes and the integrity of the presentation's message. The Nsync toolkit has been completely implemented in the Tcl/Tk scripting language. The total implementation effort was about 3,500 lines of Tcl code over a six-month period. Although Nsync can model any granularity of skew relationship, it cannot currently enforce them. To address this issue, some parts of the system would need to be reimplemented in a lower-level programming language, such as C or C++.
Archive | 1999
Robert Cooley; Pang Ning Tan; Jaideep Srivastava