Mat Kelly | Researchain

Publication

Featured research published by Mat Kelly.

international conference theory and practice digital libraries | 2013

On the Change in Archivability of Websites Over Time

Mat Kelly; Justin F. Brunelle; Michele C. Weigle; Michael L. Nelson

As web technologies evolve, web archivists work to keep up so that our digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts that load data without a referential identifier or that require user interaction (e.g., content loading when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. Because of the evolving schemes of publishing web pages along with the progressive capability of web preservation tools, the archivability of pages on the web has varied over time. In this paper we show that the archivability of a web page can be deduced from the type of page being archived, which aligns with that page’s accessibility in respect to dynamic content. We show concrete examples of when these technologies were introduced by referencing mementos of pages that have persisted through a long evolution of available technologies. Identifying these reasons for the inability of these web pages to be archived in the past in respect to accessibility serves as a guide for ensuring that content that has longevity is published using good practice methods that make it available for preservation.

International Journal on Digital Libraries | 2016

The impact of JavaScript on archivability

Justin F. Brunelle; Mat Kelly; Michele C. Weigle; Michael L. Nelson

As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in top level Universal Resource Identifier (URI) or require user interaction (e.g., content loading via Ajax when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. In an effort to understand why mementos (archived versions of live resources) in today’s archives vary in completeness and sometimes pull content from the live web, we present a study of web resources and archival tools. We used a collection of URIs shared over Twitter and a collection of URIs curated by Archive-It in our investigation. We created local archived versions of the URIs from the Twitter and Archive-It sets using WebCite, wget, and the Heritrix crawler. We found that only 4.2 % of the Twitter collection is perfectly archived by all of these tools, while 34.2 % of the Archive-It collection is perfectly archived. After studying the quality of these mementos, we identified the practice of loading resources via JavaScript (Ajax) as the source of archival difficulty. Further, we show that resources are increasing their use of JavaScript to load embedded resources. By 2012, over half (54.5 %) of pages use JavaScript to load embedded resources. The number of embedded resources loaded via JavaScript has increased by 12.0 % from 2005 to 2012. We also show that JavaScript is responsible for 33.2 % more missing resources in 2012 than in 2005. This shows that JavaScript is responsible for an increasing proportion of the embedded resources unsuccessfully loaded by mementos. JavaScript is also responsible for 52.7 % of all missing embedded resources in our study.

acm/ieee joint conference on digital libraries | 2014

Mink: integrating the live and archived web viewing experience using web browsers and memento

Mat Kelly; Michael L. Nelson; Michele C. Weigle

We describe Mink, a new web browser extension that provides a different model for integration of the live and archived web. While a user browses the live web, Mink actively queries the archives and reports other instances of the page in the archives without requiring active querying by the user. Further, by querying the archives dynamically and asynchronously, a user can view the extent to which the currently viewed page on the live web has been archived and proactively submit a request to various archives using an overlay on the live web page and a simple interface.

international conference theory and practice digital libraries | 2016

InterPlanetary Wayback: Peer-To-Peer Permanence of Web Archives

Mat Kelly; Sawood Alam; Michael L. Nelson; Michele C. Weigle

We have integrated Web ARChive (WARC) files with the peer-to-peer content addressable InterPlanetary File System (IPFS) to allow the payload content of web archives to be easily propagated. We also provide an archival replay system extended from pywb to fetch the WARC content from IPFS and re-assemble the originally archived HTTP responses for replay. From a 1.0 GB sample Archive-It collection of WARCs containing 21,994 mementos, we show that extracting and indexing the HTTP response content of WARCs containing IPFS lookup hashes takes 66.6 min inclusive of dissemination into IPFS.

acm/ieee joint conference on digital libraries | 2016

InterPlanetary Wayback: The Permanent Web Archive

Sawood Alam; Mat Kelly; Michael L. Nelson

To facilitate permanence and collaboration in web archives, we built InterPlanetary Wayback to disseminate the contents of WARC files into the IPFS network. IPFS is a peer-to-peer content-addressable file system that inherently allows deduplication and facilitates opt-in replication. We split the header and payload of WARC response records before disseminating into IPFS to leverage the deduplication, build a CDXJ index, and combine them at the time of replay. From a 1.0 GB sample Archive-It collection of WARCs containing 21,994 mementos, we found that, on average, 570 files can be indexed and disseminated into IPFS per minute. We also found that in our naive prototype implementation, replay took on average 370 milliseconds per request.
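The split-then-recombine idea in this abstract can be illustrated with a minimal sketch. It is not the InterPlanetary Wayback implementation: `ContentStore` is a hypothetical in-memory stand-in for IPFS (keyed by SHA-256 rather than real IPFS hashes), the index is a plain dictionary rather than CDXJ, and the record is assumed to be a raw HTTP response whose header and payload are separated by a blank line.

```python
import hashlib


def split_http_record(raw: bytes):
    """Split a raw HTTP response into (header, payload) at the first blank line."""
    header, sep, payload = raw.partition(b"\r\n\r\n")
    return header + sep, payload


class ContentStore:
    """Hypothetical in-memory stand-in for a content-addressable store such as IPFS."""

    def __init__(self):
        self.blocks = {}

    def add(self, data: bytes) -> str:
        # The key is derived from the content, so identical data is stored once.
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


def index_record(store, index, uri, timestamp, raw):
    """Store header and payload separately and record both keys in the index."""
    header, payload = split_http_record(raw)
    entry = {"header": store.add(header), "payload": store.add(payload)}
    index[(uri, timestamp)] = entry
    return entry


def replay(store, index, uri, timestamp):
    """Recombine the stored header and payload into the original response."""
    entry = index[(uri, timestamp)]
    return store.get(entry["header"]) + store.get(entry["payload"])
```

In this sketch, two captures of the same page whose headers differ only in the `Date` field share a single stored payload, which is the deduplication benefit the abstract attributes to splitting before dissemination.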

acm/ieee joint conference on digital libraries | 2018

Unobtrusive and Extensible Archival Replay Banners Using Custom Elements

Sawood Alam; Mat Kelly; Michele C. Weigle; Michael L. Nelson

We compare and contrast three different ways to implement an archival replay banner. We propose an implementation that utilizes Custom Elements and adds some unique behaviors, not common in existing archival replay systems, to enhance the user experience. Our approach has a minimal user interface footprint and resource overhead while still providing rich interactivity and extended on-demand provenance information about the archived resources.

acm/ieee joint conference on digital libraries | 2018

A Framework for Aggregating Private and Public Web Archives

Mat Kelly; Michael L. Nelson; Michele C. Weigle

Personal and private Web archives are proliferating due to the increase in the tools to create them and the realization that Internet Archive and other public Web archives are unable to capture personalized (e.g., Facebook) and private (e.g., banking) Web pages. We introduce a framework to mitigate issues of aggregation in private, personal, and public Web archives without compromising potential sensitive information contained in private captures. We amend Memento syntax and semantics to allow TimeMap enrichment to account for additional attributes to be expressed inclusive of the requirements for dereferencing private Web archive captures. We provide a method to involve the user further in the negotiation of archival captures in dimensions beyond time. We introduce a model for archival querying precedence and short-circuiting, as needed when aggregating private and personal Web archive captures with those from public Web archives through Memento. Negotiation of this sort is novel to Web archiving and allows for the more seamless aggregation of various types of Web archives to convey a more accurate picture of the past Web.
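One possible reading of "querying precedence and short-circuiting" is sketched below. This is an illustration under stated assumptions, not the framework from the paper: the `aggregate` function, the tuple layout for an archive, and the rule "stop once a short-circuiting archive returns captures" are all hypothetical choices made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical archive description: (name, lookup function, short_circuit flag).
# The lookup function takes a URI and returns a list of capture identifiers.
Archive = Tuple[str, Callable[[str], List[str]], bool]


def aggregate(uri: str, archives: List[Archive]) -> List[Tuple[str, str]]:
    """Query archives in precedence order and collect (archive, capture) pairs.

    If an archive flagged as short-circuiting returns any captures, the
    remaining lower-precedence archives are not queried.
    """
    results = []
    for name, lookup, short_circuit in archives:
        captures = lookup(uri)
        results.extend((name, capture) for capture in captures)
        if captures and short_circuit:
            break
    return results
```

Listing a private archive first with its flag set means a public archive is only consulted when the private one has no capture for the requested URI.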

acm/ieee joint conference on digital libraries | 2018

ArchiveNow: Simplified, Extensible, Multi-Archive Preservation

Mohamed Aturban; Mat Kelly; Sawood Alam; John A. Berlin; Michael L. Nelson; Michele C. Weigle

ArchiveNow is a Python module for preserving web pages in on-demand web archives. This module allows a user to submit a URI of a web page for archiving at several configured web archives. Once the web page is captured, ArchiveNow provides the user with links to the archived copies of the web page. ArchiveNow is initially configured to use four archives but is easily configurable to add or remove other archives. In addition to pushing web pages to public archives, ArchiveNow, through the use of Wget and Squidwarc, allows users to generate local WARC files, enabling them to create their own personal and private archives.

acm/ieee joint conference on digital libraries | 2015

Mobile Mink: Merging Mobile and Desktop Archived Webs

Wesley Jordan; Mat Kelly; Justin F. Brunelle; Laura Vobrak; Michele C. Weigle; Michael L. Nelson

We describe the mobile app Mobile Mink, which extends Mink, a browser extension that integrates the live and archived web. Mobile Mink discovers mobile and desktop URIs and provides the user an aggregated TimeMap of both mobile and desktop mementos. Mobile Mink also allows users to submit mobile and desktop URIs for archiving at the Internet Archive and Archive.today. Mobile Mink helps to increase the archival coverage of the growing mobile web.

military communications conference | 2014

A Retasking Framework for Wireless Sensor Networks

Michael Ruffing; Yangyang He; Mat Kelly; Jason O. Hallstrom; Stephan Olariu; Michele C. Weigle

Wireless sensor networks have been widely used in scientific research, industrial manufacturing, and environmental monitoring over the past decade. Using pre-existing networks to assist in responding to disaster events can be cost-effective. In this paper, we present Alert, a software framework for retasking wireless sensor networks, enabling these networks to respond rapidly to unexpected events without neglecting their originally assigned tasks. Alert, built upon Deluge [1], is a wireless network code distribution protocol enabling node group management, selective node and group reprogramming, and network state monitoring. We used a testbed of 25 Tmote Sky nodes to evaluate the reprogramming performance and space overhead of Alert under different network sizes and densities.
