Donna Bergmark
Cornell University
Publications
Featured research published by Donna Bergmark.
IEEE Transactions on Professional Communication | 1979
Gerard Salton; Donna Bergmark
The bibliographic references and citations that exist among documents in a given document collection can be used to study the history and scope of particular subject areas and to assess the importance of individual authors, documents, and journals. A clustering study of the computer science literature is described, using bibliographic citations as the clustering criterion, and conclusions are drawn regarding the scope of computer science and the characteristics of individual documents in the area. In particular, the clustering characteristics lead to a distinction between core and fringe areas in the field and to the identification of particularly influential articles.
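As a minimal sketch of the idea, assuming a toy corpus and a Jaccard similarity over reference lists (the paper's own similarity measures and clustering procedure differ), citation-based clustering might look like this:

```python
# Sketch: clustering documents by shared bibliographic citations.
# The similarity measure (Jaccard over reference lists) and the greedy
# single-link grouping are illustrative choices, not the paper's method.

def jaccard(refs_a: set, refs_b: set) -> float:
    """Bibliographic-coupling similarity: overlap of two reference lists."""
    if not refs_a or not refs_b:
        return 0.0
    return len(refs_a & refs_b) / len(refs_a | refs_b)

def cluster_by_citations(docs: dict, threshold: float = 0.2):
    """Two documents join the same cluster when their reference
    lists are sufficiently similar."""
    clusters = []
    for doc_id, refs in docs.items():
        placed = False
        for cluster in clusters:
            if any(jaccard(refs, docs[m]) >= threshold for m in cluster):
                cluster.append(doc_id)
                placed = True
                break
        if not placed:
            clusters.append([doc_id])
    return clusters

# Toy corpus: document id -> set of cited document ids.
corpus = {
    "d1": {"r1", "r2", "r3"},
    "d2": {"r2", "r3", "r4"},
    "d3": {"r9"},          # shares no citations with the others
}
print(cluster_by_citations(corpus))   # [['d1', 'd2'], ['d3']]
```

Documents that land in dense clusters correspond to the "core" of the field; isolated singletons like d3 suggest the "fringe."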
ACM International Conference on Digital Libraries | 2000
Steve Hitchcock; Les Carr; Zhuoan Jiao; Donna Bergmark; Wendy Hall; Carl Lagoze; Stevan Harnad
The rapid growth of scholarly information resources available in electronic form and their organisation by digital libraries is proving fertile ground for the development of sophisticated new services, of which citation linking will be one indispensable example. Many new projects, partnerships and commercial agreements have been announced to build citation linking applications. This paper describes the Open Citation (OpCit) project, which will focus on linking papers held in freely accessible eprint archives such as the Los Alamos physics archives and other distributed archives, and which will build on the work of the Open Archives initiative to make the data held in such archives available to compliant services. The paper emphasises the work of the project in the context of emerging digital library information environments, explores how a range of new linking tools might be combined and identifies ways in which different linking applications might converge. Some early results of linked pages from the OpCit project are reported.
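The Open Archives work the project builds on later crystallized into the OAI-PMH harvesting protocol. As an illustrative sketch only (the endpoint URL is an assumed example, and OpCit itself predates OAI-PMH 2.0), a compliant service might harvest eprint metadata like this:

```python
# Sketch: harvesting Dublin Core metadata from an OAI-compliant archive.
# The base URL is an example endpoint assumed here; the verb and
# metadataPrefix parameters follow the OAI-PMH protocol that grew out of
# the Open Archives initiative discussed in the paper.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE = "http://export.arxiv.org/oai2"   # example endpoint, assumed
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def list_records(base_url: str, prefix: str = "oai_dc"):
    """Fetch one page of ListRecords and yield (identifier, title) pairs."""
    query = urllib.parse.urlencode({"verb": "ListRecords",
                                    "metadataPrefix": prefix})
    with urllib.request.urlopen(f"{base_url}?{query}") as resp:
        root = ET.fromstring(resp.read())
    for record in root.iter(f"{OAI}record"):
        ident = record.findtext(f"{OAI}header/{OAI}identifier", default="")
        title = record.findtext(f".//{DC}title", default="(no title)")
        yield ident, title

for ident, title in list_records(BASE):
    print(ident, "->", title)
```

A production harvester would also follow resumptionToken elements to page through large archives.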
ACM/IEEE Joint Conference on Digital Libraries | 2002
Donna Bergmark
The invention of the hyperlink and the HTTP transmission protocol caused an amazing new structure to appear on the Internet: the World Wide Web. With the Web came spiders, robots, and Web crawlers, which go from one link to the next checking Web health, ferreting out information and resources, and imposing organization on the huge collection of information (and dross) residing on the net. This paper reports on the use of one such crawler to synthesize document collections on various topics in science, mathematics, engineering and technology. Such collections could be part of a digital library.
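A topic-focused crawler of this kind is essentially best-first search over the link graph. The sketch below uses an in-memory toy "web" and a keyword-overlap relevance score; both are assumptions standing in for real HTTP fetching and the crawler's actual topic classification:

```python
# Sketch of a focused crawler: best-first search over a link graph,
# with the frontier kept as a priority queue ordered by topic relevance.
import heapq

WEB = {  # toy web: url -> (page text, outgoing links)
    "seed": ("survey of numerical methods", ["a", "b"]),
    "a":    ("numerical linear algebra lecture notes", ["c"]),
    "b":    ("celebrity gossip of the week", ["c"]),
    "c":    ("finite element methods in engineering", []),
}
TOPIC = {"numerical", "methods", "engineering", "algebra"}

def relevance(text: str) -> float:
    """Crude stand-in for a topic classifier: query-term overlap."""
    return len(set(text.split()) & TOPIC) / len(TOPIC)

def focused_crawl(seed: str, limit: int = 10, cutoff: float = 0.25):
    frontier = [(-1.0, seed)]        # max-heap via negated scores
    seen, collection = {seed}, []
    while frontier and len(collection) < limit:
        _, url = heapq.heappop(frontier)
        text, links = WEB[url]
        score = relevance(text)
        if score >= cutoff:
            collection.append((url, score))   # keep on-topic pages
        for link in links:
            if link not in seen:
                seen.add(link)
                # prioritize links discovered on relevant pages
                heapq.heappush(frontier, (-score, link))
    return collection

print(focused_crawl("seed"))   # off-topic page "b" is visited but not kept
```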
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2001
Donna Bergmark; Paradee Phempoonpanich; Shumin Zhao
As part of a larger project to automatically reference-link the online scholarly literature, an attempt to analyze PDF documents was undertaken. The ACM Digital Library was used as the corpus for these experiments. With the current PDF and HTML analysis tools, roughly 80% accuracy was obtained in the automatic extraction of reference-linking information.
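The flavor of the task can be shown with a small heuristic extractor: locate the references section in text recovered from a PDF, split it into entries, and pull out rough fields. The toy input and regexes below are illustrative assumptions, and the messiness of real reference strings is exactly the kind of reason accuracy plateaus near 80%:

```python
# Sketch: heuristic extraction of reference strings from PDF-derived text.
import re

RAW = """
... body of the paper ...
References
[1] G. Salton and D. Bergmark. Clustering the computer science
literature. IEEE Trans. Prof. Comm., 1979.
[2] S. Hitchcock et al. Citation linking for open eprint archives.
ACM DL, 2000.
"""  # toy reference strings, loosely modeled on entries in this listing

def extract_references(text: str):
    _, _, tail = text.partition("References")
    # each entry starts with a bracketed number, e.g. [1]
    entries = re.split(r"\n(?=\[\d+\])", tail.strip())
    refs = []
    for entry in entries:
        entry = " ".join(entry.split())          # undo PDF line breaks
        year = re.search(r"\b(19|20)\d{2}\b", entry)
        refs.append({"raw": entry, "year": year.group() if year else None})
    return refs

for ref in extract_references(RAW):
    print(ref["year"], "|", ref["raw"][:60])
```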
IEEE Communications Magazine | 2000
Donna Bergmark; Srinivasan Keshav
Convergence between the existing telephone networks and data transfer over the Internet not only demands that new software be written to handle telephony applications that span both networks, but also makes new and innovative applications possible. Rather than writing these applications from the ground up, it would be helpful to have a relatively high-level API on which to prototype new applications. In this article we describe a set of Java packages developed at Cornell University to accomplish this purpose. The software's name is ITX; it is available for download at no charge, and includes sample applications and documentation.
Parallel Computing | 1981
Gerard Salton; Donna Bergmark
Conventional information retrieval processes are largely based on data movement, pointer manipulations and integer arithmetic; more refined retrieval algorithms may in addition benefit from substantial computational power. In the present study a number of parallel processing methods are described that serve to enhance retrieval services. In conventional retrieval environments parallel list processing and parallel search facilities are of greatest interest. In more advanced systems, the use of array processors also proves beneficial. Various information retrieval processes are examined and evidence is given to demonstrate the usefulness of parallel processing and fast computational facilities in information retrieval.
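Parallel search over a partitioned collection is one of the facilities discussed. A minimal sketch, assuming a toy term-overlap scorer in place of a real retrieval function: partition the documents across worker processes, score each shard independently, and merge the partial results.

```python
# Sketch: data-parallel search. Each worker scores its shard of the
# collection; the partial hit lists are then merged and ranked.
from multiprocessing import Pool

DOCS = [
    "parallel list processing for retrieval",
    "array processors in advanced systems",
    "integer arithmetic and pointer manipulation",
    "parallel search facilities for conventional retrieval",
]
QUERY = {"parallel", "retrieval"}

def score_shard(shard):
    """Score every document in one shard by query-term overlap."""
    return [(len(set(doc.split()) & QUERY), doc) for doc in shard]

def parallel_search(docs, workers=2):
    shards = [docs[i::workers] for i in range(workers)]  # round-robin split
    with Pool(workers) as pool:
        partials = pool.map(score_shard, shards)
    merged = [hit for part in partials for hit in part]
    return sorted(merged, reverse=True)   # best-scoring documents first

if __name__ == "__main__":
    for score, doc in parallel_search(DOCS):
        print(score, doc)
```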
Scientific Programming | 1996
Bill Appelbe; Donna Bergmark
Applications programming for high-performance computing is notoriously difficult. Although parallel programming is intrinsically complex, the principal reason why high-performance computing is difficult is the lack of effective software tools. We believe that the lack of tools is in turn largely due to market forces rather than our inability to design and build such tools. Unfortunately, the poor availability and utilization of parallel tools hurt the entire supercomputing industry and the U.S. high performance computing initiative, which is focused on applications. A disproportionate amount of resources is being spent on faster hardware and architectures, while tools are being neglected. This article introduces a taxonomy of tools, analyzes the major factors that contribute to this situation, suggests ways in which the imbalance could be redressed, and considers the likely evolution of tools.
International Conference on Parallel Processing | 2002
Donna Bergmark
Nothing is more distributed than the Web, with its content spread across thousands of servers. High-performance hardware and software are essential for effective download, analysis, and organization of this content. We describe our experience with a highly parallel Web crawling system (Mercator) used to automatically construct collections of scientific resources for the National Science Digital Library.
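The worker-pool structure of such a crawler can be sketched as follows; the stubbed fetch() and in-memory link graph are assumptions standing in for real HTTP download and link extraction, not Mercator's actual implementation:

```python
# Sketch: many threads share a frontier queue, each repeatedly dequeuing
# a URL, "downloading" it, and enqueuing newly discovered links.
import queue
import threading

LINKS = {  # stub link graph standing in for the live Web
    "http://seed/":  ["http://seed/a", "http://seed/b"],
    "http://seed/a": ["http://seed/c"],
    "http://seed/b": [],
    "http://seed/c": [],
}

def fetch(url):
    """Stub for HTTP download + link extraction."""
    return LINKS.get(url, [])

def worker(frontier, seen, seen_lock, results):
    while True:
        url = frontier.get()
        if url is None:              # poison pill: shut this worker down
            frontier.task_done()
            return
        results.append(url)
        for link in fetch(url):
            with seen_lock:          # avoid enqueuing the same URL twice
                if link not in seen:
                    seen.add(link)
                    frontier.put(link)
        frontier.task_done()

def crawl(seeds, workers=4):
    frontier, results = queue.Queue(), []
    seen, seen_lock = set(seeds), threading.Lock()
    for url in seeds:
        frontier.put(url)
    threads = [threading.Thread(target=worker,
                                args=(frontier, seen, seen_lock, results))
               for _ in range(workers)]
    for t in threads:
        t.start()
    frontier.join()                  # wait until every queued URL is done
    for _ in threads:
        frontier.put(None)           # release the workers
    for t in threads:
        t.join()
    return results

print(crawl(["http://seed/"]))
```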
International Conference on Supercomputing | 1995
Donna Bergmark
We compare two different approaches to the parallelization of Fortran programs. The first approach is to optimize the serial code so that it runs as fast as possible on a single processor and then to optimize the parallel version. In this paper a variety of parallel programming tools are used to obtain an optimal, parallel version of an economic policy modelling application for the IBM SP1. We apply a new technique called Data Access Normalization; we use an extended ParaScope as our parallel programming environment; we use FORGE 90 as our parallelizer; and we use KAP as our optimizer. We make a number of observations about the effectiveness of these tools. Both strategies obtain a working, parallel program, but use different tools to get there. On this occasion, both KAP and Data Access Normalization lead to the same critical transformation of inverting four of the twelve loop nests in the original program. The next most important optimization is parallel I/O, one of the few transformations that had to be done by hand. Speedups are obtained on the SP1 (using MPLp communication over the High Speed Switch).
Keywords: multiprocessors, program transformations, parallel programming tools, data access normalization, ParaScope, Lambda Toolkit, Fortran, HPF, FORGE, SP1, SPMD, KAP, parallel I/O, PED, LAMBDA, data parallel, loop distribution, loop fusion, trace analyzers
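The abstract credits inverting (interchanging) loop nests as the critical transformation. A minimal illustration of why, transplanted from Fortran into Python/NumPy for convenience: traversing an array against its memory layout forces strided access, and interchanging the loops restores contiguous, cache-friendly sweeps. The array size and timing harness here are arbitrary choices, not from the paper.

```python
# Sketch: loop interchange as a locality optimization. In a C-ordered
# (row-major) array, row slices are contiguous in memory while column
# slices stride across it; swapping the loop order changes which access
# pattern the inner traversal uses.
import time
import numpy as np

a = np.random.rand(4000, 4000)   # C (row-major) layout: rows contiguous

def column_sweep(a):
    """Inner traversal strides across memory (cache-unfriendly)."""
    return sum(float(a[:, j].sum()) for j in range(a.shape[1]))

def row_sweep(a):
    """Loops interchanged: each slice is contiguous, stride-1 access."""
    return sum(float(a[i, :].sum()) for i in range(a.shape[0]))

for sweep in (column_sweep, row_sweep):
    t0 = time.perf_counter()
    sweep(a)                     # both compute the same total
    print(f"{sweep.__name__}: {time.perf_counter() - t0:.3f}s")
```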
European Conference on Research and Advanced Technology for Digital Libraries | 2002
Donna Bergmark; Carl Lagoze; Alex Sbityakov