Kat Hagedorn
University of Michigan
Publication
Featured research published by Kat Hagedorn.
Library Hi Tech | 2003
Kat Hagedorn
OAIster, at the University of Michigan, University Libraries, Digital Library Production Service (DLPS), is an Andrew W. Mellon Foundation grant‐funded project designed to test the feasibility of using the Open Archives Initiative Protocol for Metadata Harvesting (OAI‐PMH) to harvest digital object metadata from multiple and varied digital object repositories and develop a service to allow end‐users to access that metadata. This article describes in depth the development of our system to harvest and store the metadata, transform it into Digital Library eXtension Service (DLXS) Bibliographic Class format, build indexes, and make the metadata searchable through an interface using the XPAT search engine. Results of the testing of our service and statistics on usage are reported, as well as the issues that we have encountered during our harvesting and transformation operations. The article closes by discussing the future improvements and potential of OAIster and the OAI‐PMH protocol.
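The harvesting step described above can be illustrated with a minimal sketch of an OAI-PMH ListRecords exchange. It is not the OAIster code: the base URL, the sample response, and the function names `list_records_url` and `parse_list_records` are hypothetical, and only the protocol elements (the `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the OAI and Dublin Core namespaces) come from the OAI-PMH specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, next_token) from one ListRecords response page."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    # An empty or missing resumptionToken means this was the last page.
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Made-up response page, used only to exercise the parser offline.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, next_token = parse_list_records(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=next_token)
```

A harvester would repeat the request with each returned token until no token comes back, storing the records from every page before transforming them.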
acm/ieee joint conference on digital libraries | 2007
David Newman; Kat Hagedorn; Chaitanya Chemudugunta; Padhraic Smyth
Creating a collection of metadata records from disparate and diverse sources often results in uneven, unreliable and variable quality subject metadata. Having uniform, consistent and enriched subject metadata allows users to more easily discover material, browse the collection, and limit keyword search results by subject. We demonstrate how statistical topic models are useful for subject metadata enrichment. We describe some of the challenges of metadata enrichment on a huge scale (10 million metadata records from 700 repositories in the OAIster Digital Library) when the metadata is highly heterogeneous (metadata about images and text, and both cultural heritage material and scientific literature). We show how to improve the quality of the enriched metadata, using both manual and statistical modeling techniques. Finally, we discuss some of the challenges of the production environment, and demonstrate the value of the enriched metadata in a prototype portal.
international conference theory and practice digital libraries | 2003
Martin Halbert; Joanne Kaczmarek; Kat Hagedorn
Findings are reported from four projects initiated through funding by the Andrew W. Mellon Foundation in 2001 to explore applications of metadata harvesting using the OAI-PMH. Metadata inconsistencies among providers have been encountered and strategies for normalization have been studied. Additional findings concerning harvesting are format conflicts, harvesting problems, provider system development, and questions regarding the entire cycle of metadata production, dissemination, and use (termed metadata gardening, rather than harvesting).
D-lib Magazine | 2007
Kat Hagedorn; Suzanne Chapman; David Newman
The Web puzzle of online information resources often hinders end-users from effective and efficient access to these resources. Clustering resources into appropriate subject-based groupings may help alleviate these difficulties, but will it work with heterogeneous material? The University of Michigan and the University of California Irvine joined forces to test automatically enhancing metadata records using the Topic Modeling algorithm on the varied OAIster corpus. We created labels for the resulting clusters of metadata records, matched the clusters to an in-house classification system, and developed a prototype that would showcase methods for search and retrieval using the enhanced records. Results indicated that while the algorithm was somewhat time-intensive to run and using a local classification scheme had its drawbacks, precise clustering of records was achieved and the prototype interface proved that faceted classification could be powerful in helping end-users find resources.
Journal of Web Librarianship | 2013
Suzanne Chapman; Shevon Desai; Kat Hagedorn; Kenneth J. Varnum; Sonali Mishra; Julie Piacentine
The University of Michigan Library wanted to learn more about the kinds of searches its users were conducting through the “one search” search box on the Library Web site. Library staff conducted two investigations. A preliminary investigation in 2011 involved the manual review of the 100 most frequently occurring queries conducted through the site search box over the course of a month. Those 100 search terms accounted for 16 percent of total queries and were largely one-word searches for databases. In the follow-up investigation, the Library embarked on a more ambitious exploration of the 454,443 searches conducted during the winter 2011 semester, devising a method for selecting, categorizing, and summarizing user search queries. A sample of 1,201 searches from the search query logs was examined; after eliminating duplicate searches, there were 992 unique terms available for categorization. Using a non-overlapping sample of queries, a rubric was developed for categorizing user searches. Each of seven library staff members reviewed all unique terms in the sample to categorize them into the best fitting category from the rubric. After establishing a threshold for reliability among the individuals categorizing the queries, 862 unique search terms were analyzed. Based on this analysis, the most frequent kinds of searches conducted in the winter semester in 2011 on the University of Michigan Library's Web site were specific databases (28 percent), topical/exploratory types of queries (28 percent), and books (including searches by title, ISBN, call number, or a combination thereof) (16 percent). Known-item searches comprised nearly half (44 percent) of searches in the sample. Another fifth (20 percent) of total searches were categorized as “exploratory,” supporting the need to provide broader, subject-based paths to information through the site.
Somewhat surprisingly, there were a small number of article searches (article titles, or mixed searches of journal names and authors and/or title words) in the search box—an indication that users understand the University of Michigan Library primary search box is not for articles.
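The categorization procedure in this abstract (several raters label each unique term, terms below a reliability threshold are dropped, and the rest are tallied) can be sketched as follows. The abstract does not state the threshold or how it was computed, so the `min_agree` parameter, the function names, and the toy ratings are all assumptions made for illustration.

```python
from collections import Counter


def consensus_categories(ratings, min_agree):
    """Keep terms whose most common label was chosen by at least min_agree raters.

    ratings maps each unique search term to a list of category labels,
    one per rater. min_agree is an assumed stand-in for the study's
    unspecified reliability threshold.
    """
    kept = {}
    for term, labels in ratings.items():
        category, votes = Counter(labels).most_common(1)[0]
        if votes >= min_agree:
            kept[term] = category
    return kept


def category_shares(kept):
    """Return each category's share of the kept terms, in whole percent."""
    counts = Counter(kept.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: round(100 * n / total) for cat, n in counts.items()}


# Toy data: three terms, each labeled by seven hypothetical raters.
ratings = {
    "jstor": ["database"] * 7,
    "climate change": ["exploratory"] * 5 + ["book"] * 2,
    "hamlet": ["book"] * 3 + ["exploratory"] * 2 + ["database"] * 2,
}

kept = consensus_categories(ratings, min_agree=5)
shares = category_shares(kept)
```

With these toy ratings, "hamlet" is dropped because no label reaches five of seven votes, which mirrors how the study's analyzed set (862 terms) is smaller than its set of unique terms (992).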
Science & Technology Libraries | 2007
Muriel Foulonneau; Timothy W. Cole; Charles Blair; Peter C. Gorman; Kat Hagedorn; Jenn Riley
The CIC consortium includes 12 major Midwestern universities. Their libraries have decided to share the cost of a joint project (2003-2006) aimed at better understanding the mechanisms by which emerging technologies and standards can facilitate metadata sharing and the creation of value-added services for their users. The CIC metadata portal project has performed advanced work in the area of Open Archives Initiative Protocol for Metadata Harvesting, collection-level descriptions, metadata transformation and enrichment, and practices and usability of metadata standards. It has provided an opportunity for increased collaboration between CIC academic libraries and a way to highlight the wealth of digital resources held by the participating libraries. This article describes the project and enumerates project accomplishments. The project has helped to better the way in which partner institutions share information about digital content and provide access to digital resources. Four content providers of the project highlight different aspects of the project and the practical benefits they found in the collaboration.
D-lib Magazine | 2011
Kat Hagedorn; Michael Kargela; Youn Noh; David Newman
Using a topic modeling algorithm to find relevant materials in a large corpus of textual items is not new; however, to date there has been little investigation into its usefulness to end-users. This article describes two methods we used to research this issue. In both methods, we used an instance of HathiTrust containing a snapshot of art, architecture and art history records from early 2010 that was populated with navigable terms generated using the topic modeling algorithm. In the first method, we created an unmoderated environment in which people navigated this instance on their own without supervision. In the second method, we talked to expert users as they navigated this same HathiTrust instance. Our unmoderated testing environment produced some conflicting results (use of topic facets was high, but satisfaction rating was somewhat low), while our one-on-one sessions with expert users gave us reason to believe that topics and other subject terms (LCSH) are best used in conjunction with each other. This is a possibility we are interested in researching further.
acm/ieee joint conference on digital libraries | 2012
David Newman; Youn Noh; Kat Hagedorn; Arun Balagopalan
The number of books available online is increasing, but user interfaces may not be taking full advantage of advances in machine learning techniques that could help users navigate, explore, discover and understand interesting and useful content in books. Using a group of ten students and over one thousand crowdsourced judgments, we conducted multiple user studies to evaluate topics and related passages in books, all learned by topic modeling. Using ten books, selected from humanities (e.g. Plato's Republic), social sciences (e.g. Marx's Capital) and sciences (e.g. Einstein's Relativity), and four different evaluation experiments, we show that users agree that the learned topics are coherent and important to the book, and related to the automatically generated passages. We show how crowdsourced evaluations are useful, and can complement more focused evaluations using students who have studied the texts. This work provides a framework for (1) learning topics and related passages in books, and (2) evaluating those learned topics and passages, and moves one step toward automatic annotation to support topic navigation of books.
Library Trends | 2005
Sarah L. Shreeves; Thomas G. Habing; Kat Hagedorn; Jeffrey A. Young
D-lib Magazine | 2008
Kat Hagedorn; Joshua Santelli