Publication


Featured research published by Jason D. M. Rennie.


Information Retrieval | 2000

Automating the Construction of Internet Portals with Machine Learning

Andrew McCallum; Kamal Nigam; Jason D. M. Rennie; Kristie Seymore

Domain-specific internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search. For example, www.campsearch.com allows complex queries by age, location, cost and specialty over summer camps. This functionality is not possible with general, Web-wide search engines. Unfortunately these portals are difficult and time-consuming to maintain. This paper advocates the use of machine learning techniques to greatly automate the creation and maintenance of domain-specific Internet portals. We describe new research in reinforcement learning, information extraction and text classification that enables efficient spidering, the identification of informative text segments, and the population of topic hierarchies. Using these techniques, we have built a demonstration system: a portal for computer science research papers. It already contains over 50,000 papers and is publicly available at www.cora.justresearch.com. These techniques are widely applicable to portal creation in other domains.


IEEE Transactions on Knowledge and Data Engineering | 2004

TopCat: data mining for topic identification in a text corpus

Chris Clifton; Robert Cooley; Jason D. M. Rennie

TopCat (topic categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. We present a novel method for identifying related items based on traditional data mining techniques. Frequent itemsets are generated from the groups of items, followed by clusters formed with a hypergraph partitioning scheme. We present an evaluation against a manually categorized ground truth news corpus; it shows this technique is effective in identifying topics in collections of news articles.
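As a rough illustration of the frequent-itemset step described in this abstract, the following minimal sketch counts groups of entities that co-occur in several articles. It is not the authors' implementation: the function name `frequent_itemsets`, the `min_support` and `max_size` parameters, and the sample entities are all hypothetical, the counting is brute force rather than an optimized miner, and the hypergraph partitioning step that forms the final clusters is not shown.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(articles, min_support=2, max_size=3):
    """Count entity sets that co-occur in at least `min_support` articles.

    `articles` is an iterable of entity collections, one per article.
    Brute-force enumeration, suitable only for small illustrative inputs.
    """
    counts = Counter()
    for entities in articles:
        items = sorted(set(entities))  # de-duplicate entities within an article
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    # Keep only the itemsets that meet the support threshold.
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical articles, each already reduced to a set of named entities.
sample_articles = [
    {"NATO", "Kosovo", "Belgrade"},
    {"NATO", "Kosovo"},
    {"Microsoft", "Justice Department"},
    {"Microsoft", "Justice Department", "Gates"},
]

if __name__ == "__main__":
    for itemset, n in frequent_itemsets(sample_articles).items():
        print(sorted(itemset), n)
```

In this sketch each article is treated as one transaction, so an itemset's support is the number of articles that mention all of its entities together.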


Neural Information Processing Systems | 2004

Maximum-Margin Matrix Factorization

Nathan Srebro; Jason D. M. Rennie; Tommi S. Jaakkola


International Conference on Machine Learning | 2003

Tackling the Poor Assumptions of Naive Bayes Text Classifiers

Jason D. M. Rennie; Lawrence Shih; Jaime Teevan; David R. Karger


International Conference on Machine Learning | 1999

Using Reinforcement Learning to Spider the Web Efficiently

Jason D. M. Rennie; Andrew McCallum


Archive | 2000

ifile: An Application of Machine Learning to E-Mail Filtering

Jason D. M. Rennie


International Joint Conference on Artificial Intelligence | 1999

A machine learning approach to building domain-specific search engines

Andrew McCallum; Kamal Nigam; Jason D. M. Rennie; Kristie Seymore


Archive | 1999

Building Domain-Specific Search Engines with Machine Learning Techniques

Andrew McCallum; Kamal Nigam; Jason D. M. Rennie; Kristie Seymore


Archive | 2001

Improving Multiclass Text Classification with the Support Vector Machine

Jason D. M. Rennie; Ryan Rifkin


Archive | 2001

Improving Multi-class Text Classification with Naive Bayes

Jason D. M. Rennie

Collaboration


Dive into Jason D. M. Rennie's collaborations.

Top Co-Authors

Andrew McCallum

University of Massachusetts Amherst

Tommi S. Jaakkola

Massachusetts Institute of Technology

Kamal Nigam

Carnegie Mellon University

Kristie Seymore

Carnegie Mellon University

David R. Karger

Massachusetts Institute of Technology

Lawrence Shih

Massachusetts Institute of Technology

Nathan Srebro

Toyota Technological Institute at Chicago

Yu-Han Chang

University of Southern California
