Jason D. M. Rennie
Massachusetts Institute of Technology
Publication
Featured research published by Jason D. M. Rennie.
Information Retrieval | 2000
Andrew McCallum; Kamal Nigam; Jason D. M. Rennie; Kristie Seymore
Domain-specific Internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search. For example, www.campsearch.com allows complex queries by age, location, cost and specialty over summer camps. This functionality is not possible with general, Web-wide search engines. Unfortunately, these portals are difficult and time-consuming to maintain. This paper advocates the use of machine learning techniques to greatly automate the creation and maintenance of domain-specific Internet portals. We describe new research in reinforcement learning, information extraction and text classification that enables efficient spidering, the identification of informative text segments, and the population of topic hierarchies. Using these techniques, we have built a demonstration system: a portal for computer science research papers. It already contains over 50,000 papers and is publicly available at www.cora.justresearch.com. These techniques are widely applicable to portal creation in other domains.
IEEE Transactions on Knowledge and Data Engineering | 2004
Chris Clifton; Robert Cooley; Jason D. M. Rennie
TopCat (topic categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: identifying related groups of items. We present a novel method for identifying related items based on traditional data mining techniques. Frequent itemsets are generated from the groups of items and then clustered with a hypergraph partitioning scheme. We present an evaluation against a manually categorized ground-truth news corpus; it shows this technique is effective in identifying topics in collections of news articles.
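The frequent-itemset step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses brute-force enumeration of small itemsets rather than an optimized miner, it omits the hypergraph partitioning step entirely, and the function name `frequent_itemsets`, the `min_support` threshold, and the toy entity sets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(articles, min_support, max_size=2):
    """Return itemsets (as sorted tuples) found in at least min_support articles.

    Each article is a collection of items, e.g. the key entities extracted from it.
    Brute-force enumeration; only suitable for small item sets per article.
    """
    counts = Counter()
    for items in articles:
        unique = sorted(set(items))  # sort so each itemset has one canonical tuple
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical toy corpus: each article reduced to its set of key entities.
articles = [
    {"NATO", "Kosovo", "Belgrade"},
    {"NATO", "Kosovo"},
    {"Microsoft", "Justice Department"},
    {"Microsoft", "Justice Department", "Kosovo"},
]

frequent = frequent_itemsets(articles, min_support=2)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, support)
```

In this toy corpus, the pairs that survive the support threshold group entities that co-occur across articles, which is the raw material a clustering step would then merge into topics.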
Neural Information Processing Systems | 2004
Nathan Srebro; Jason D. M. Rennie; Tommi S. Jaakkola
International Conference on Machine Learning | 2003
Jason D. M. Rennie; Lawrence Shih; Jaime Teevan; David R. Karger
International Conference on Machine Learning | 1999
Jason D. M. Rennie; Andrew McCallum
Archive | 2000
Jason D. M. Rennie
International Joint Conference on Artificial Intelligence | 1999
Andrew McCallum; Kamal Nigam; Jason D. M. Rennie; Kristie Seymore
Archive | 1999
Andrew McCallum; Kamal Nigam; Jason D. M. Rennie; Kristie Seymore
Archive | 2001
Jason D. M. Rennie; Ryan Rifkin
Archive | 2001
Jason D. M. Rennie