Simeon Warner
Cornell University
Publication
Featured research published by Simeon Warner.
International Conference on Data Mining | 2006
Daria Sorokina; Johannes Gehrke; Simeon Warner; Paul Ginsparg
We describe a large-scale application of methods for finding plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14-year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.
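The abstract does not say which overlap measure the authors use. As a generic illustration only, and not the paper's method, text reuse between two documents is often estimated by comparing their sets of word n-grams. The function names, the choice of n = 7, and the normalization by the smaller set are all assumptions made for this sketch.

```python
def word_ngrams(text, n=7):
    """Return the set of word n-grams (as tuples) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(doc_a, doc_b, n=7):
    """Fraction of the smaller document's n-grams that also occur in the other.

    Hypothetical helper for illustration; returns 0.0 when either
    document is too short to contain a single n-gram.
    """
    a = word_ngrams(doc_a, n)
    b = word_ngrams(doc_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

A screen built on such a measure would flag document pairs whose overlap exceeds some threshold and then apply heuristics, as the abstract describes, to discard false positives such as shared boilerplate.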
Library Hi Tech | 2003
Simeon Warner
The Open Archives Initiative (OAI) was created as a practical way to promote interoperability between e‐print repositories. Although the scope of the OAI has been broadened, e‐print repositories still represent a significant fraction of OAI data providers. This article presents a brief survey of OAI e‐print repositories, and of services using metadata harvested from e‐print repositories using the OAI protocol for metadata harvesting (OAI‐PMH). It then discusses several situations where metadata harvesting may be used to further improve the utility of e‐print archives as a component of the scholarly communication infrastructure.
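As a minimal sketch of what a metadata harvest looks like in practice, the snippet below builds an OAI-PMH `ListRecords` request URL. The verb and the `metadataPrefix`, `from`, `set`, and `resumptionToken` arguments come from the OAI-PMH specification; the function name and the base URL in the usage note are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, set_spec=None, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH data provider.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, no other argument besides the verb may be sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # selective harvesting by datestamp
        if set_spec:
            params["set"] = set_spec    # selective harvesting by set
    return base_url + "?" + urlencode(params)
```

For example, `list_records_url("https://example.org/oai", from_date="2003-01-01")` asks the (hypothetical) repository for Dublin Core records added or changed since that date; a harvester would keep calling with each returned resumption token until the list is complete.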
Learned Publishing | 2007
Edwin A. Henneken; Michael J. Kurtz; Günther Eichhorn; Alberto Accomazzi; Carolyn S. Grant; Donna M. Thompson; Elizabeth H. Bohlen; Stephen S. Murray; Paul Ginsparg; Simeon Warner
Are the e‐prints (electronic preprints) from the arXiv repository being used instead of journal articles? We show that the e‐prints have not undermined the usage of journal papers from the four core journals in astrophysics. As soon as the journal article is published, the astronomical community prefers to read it and the use of e‐prints through the NASA Astrophysics Data System drops to zero. This suggests that most astronomers have access to institutional subscriptions and that they choose to read the journal article. In other words, the e‐prints have not undermined journal use in this community and thus currently do not pose a financial threat to publishers. Furthermore, we show that the half‐life (the point at which the use of an article drops to half the use of a newly published article) for an e‐print is shorter than for a journal paper.
International Journal on Digital Libraries | 2007
Simeon Warner; Jeroen Bekaert; Carl Lagoze; Xiaoming Liu; Sandy Payette; Herbert Van de Sompel
In the emerging eScience environment, repositories of papers, datasets, software, etc., should be the foundation of a global and natively-digital scholarly communications system. The current infrastructure falls far short of this goal. Cross-repository interoperability must be augmented to support the many workflows and value-chains involved in scholarly communication. This will not be achieved through the promotion of single repository architecture or content representation, but instead requires an interoperability framework to connect the many heterogeneous systems that will exist. We present a simple data model and service architecture that augments repository interoperability to enable scholarly value-chains to be implemented. We describe an experiment that demonstrates how the proposed infrastructure can be deployed to implement the workflow involved in the creation of an overlay journal over several different repository systems (Fedora, aDORe, DSpace and arXiv).
Concurrency and Computation: Practice and Experience | 2012
Carl Lagoze; Herbert Van de Sompel; Michael L. Nelson; Simeon Warner; Robert Sanderson; Pete Johnston
Digital scholarship offers the opportunity to move beyond the limitations of traditional scholarly publication. Rather than limiting scholarly communication to text‐based static documents, the Web makes it possible for scholars to expose and share the full evidence of their research including data, images, video, and other genres of material. These aggregations of evidence, or compound documents, can then be integrated into a linked data cloud, the basis of Scholarship 2.0, an open environment in which scholars collaborate and build new knowledge on the existing scholarship. We present Open Archives Initiative–Object Reuse and Exchange (OAI–ORE), a set of standards to identify and describe aggregations of Web resources, thereby making the Scholarship 2.0 vision possible.
Learned Publishing | 2005
Simeon Warner
Recent debate on the reform of scholarly communication has focused on access issues. Although important, access is only one dimension in which the scholarly process can be transformed. Scholars are embracing highly collaborative and data‐intensive standards of practice influenced by powerful computing and network technologies. This dramatic transformation of scholarship demands a natively digital, network‐based scholarly communication system that is able to capture the scholarly record, make it accessible, and preserve it over time. I will offer a technological perspective on how these demands might be met.
International World Wide Web Conferences | 2013
Bernhard Haslhofer; Simeon Warner; Carl Lagoze; Martin Klein; Robert Sanderson; Michael L. Nelson; Herbert Van de Sompel
Many applications need up-to-date copies of collections of changing Web resources. Such synchronization is currently achieved using ad-hoc or proprietary solutions. We propose ResourceSync, a general Web resource synchronization protocol that leverages XML Sitemaps. It provides a set of capabilities that can be combined in a modular manner to meet local or community requirements. We report on work to implement this protocol for arXiv.org and also provide an experimental prototype for the English Wikipedia as well as a client API.
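To illustrate the Sitemap-based idea in the simplest possible form, the sketch below reads a plain XML Sitemap and reports which resources a local copy should re-fetch. It uses only the standard Sitemap `<loc>` and `<lastmod>` elements and does not implement the ResourceSync capabilities themselves; the function name and the `local_state` dictionary (URL mapped to the last `lastmod` value seen) are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard XML Sitemap format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def resources_to_fetch(sitemap_xml, local_state):
    """Return URLs whose <lastmod> differs from what the local copy recorded.

    sitemap_xml: Sitemap document as a string.
    local_state: dict mapping URL -> lastmod string from the previous sync.
    A URL that is new, or whose lastmod string changed, is reported.
    """
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = (url.findtext("sm:loc", namespaces=SITEMAP_NS) or "").strip()
        lastmod = (url.findtext("sm:lastmod", namespaces=SITEMAP_NS) or "").strip()
        if not loc:
            continue  # skip malformed entries with no location
        if local_state.get(loc) != lastmod:
            stale.append(loc)
    return stale
```

Comparing timestamp strings for inequality is a deliberate simplification; a real client would also need to handle deletions and very large resource lists, which is the kind of requirement the modular capabilities mentioned in the abstract are meant to address.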
International Journal on Digital Libraries | 2007
Simeon Warner; Jeroen Bekaert; Carl Lagoze; Xiaoming Liu; Sandy Payette; Herbert Van de Sompel
Due to a processing error, the name of one of the authors is incorrect in the HTML version of this article and should read: Herbert Van de Sompel.
Computer Physics Communications | 2000
Simeon Warner; Simon Catterall; Eric B. Gregory; Edward D. Lipson
SimScience is a collaboration between Cornell University and Syracuse University. It comprises four interactive educational modules on crack propagation, crackling noise, fluid flow, and membranes. Computer simulations are at the forefront of current research in all of these topics. Our aim is to explain some elements of each subject and to show the relevance of computer simulations. The crack propagation module explores the mechanisms of dam failure. The crackling noise module uses everyday sounds to illustrate types of noise, and links this to noise created by jumps in magnetization processes. The fluid flow module describes various properties of flows and explains phenomena such as a curve ball in baseball. The membranes module leverages everyday experience with membranes such as soap bubbles to help explain biological membranes and the relevance of membranes to theories of gravity. We have used Java not only to produce small-scale versions of research simulations but also to provide models illustrating simpler concepts underlying the main subject matter. Web technology allows us to deliver SimScience both over the Internet and on CD-ROM. To accommodate a target audience spanning K-12 and university general science students, we have created three levels for each module. Efforts are underway to assess the SimScience modules with the help of teachers and students.
European Conference on Research and Advanced Technology for Digital Libraries | 2005
Simeon Warner
I present a summary of recent use of the Open Archives Initiative (OAI) registration and validation services for data-providers. The registration service has seen a steady stream of registrations since its launch in 2002, and there are now over 220 registered repositories. I examine the validation logs to produce a breakdown of reasons why repositories fail validation. This breakdown highlights some common problems and will be used to guide work to improve the validation service.