Deborah Paul | Researchain

Publication

Featured research published by Deborah Paul.

ZooKeys | 2012

Five task clusters that enable efficient and effective digitization of biological collections

Gil Nelson; Deborah Paul; G. Riccardi; Austin R. Mast

This paper describes and illustrates five major clusters of related tasks (herein referred to as task clusters) that are common to efficient and effective practices in the digitization of biological specimen data and media. Examples of these clusters come from the observation of diverse digitization processes. The staff of iDigBio (The U.S. National Science Foundation’s National Resource for Advancing Digitization of Biological Collections) visited active biological and paleontological collections digitization programs for the purpose of documenting and assessing current digitization practices and tools. These observations identified five task clusters that comprise the digitization process leading up to data publication: (1) pre-digitization curation and staging, (2) specimen image capture, (3) specimen image processing, (4) electronic data capture, and (5) georeferencing locality descriptions. While not all institutions are completing each of these task clusters for each specimen, these clusters describe a composite picture of digitization of biological and paleontological specimens across the programs that were observed. We describe these clusters and three workflow patterns that dominate their implementation, and offer a set of workflow recommendations for digitization programs.

Applications in Plant Sciences | 2015

Digitization Workflows for Flat Sheets and Packets of Plants, Algae, and Fungi

Gil Nelson; Patrick W. Sweeney; Lisa E. Wallace; Richard K. Rabeler; Dorothy Allard; Herrick Brown; J. Richard Carter; Michael W. Denslow; Elizabeth R. Ellwood; Charlotte C. Germain-Aubrey; Ed Gilbert; Emily L. Gillespie; Leslie R. Goertzen; Ben Legler; D. Blaine Marchant; Travis D. Marsico; Ashley B. Morris; Zack E. Murrell; Mare Nazaire; Chris Neefus; Shanna Oberreiter; Deborah Paul; Brad R. Ruhfel; Thomas Sasek; Joey Shaw; Pamela S. Soltis; Kimberly Watson; Andrea Weeks; Austin R. Mast

Effective workflows are essential components in the digitization of biodiversity specimen collections. To date, no comprehensive, community-vetted workflows have been published for digitizing flat sheets and packets of plants, algae, and fungi, even though latest estimates suggest that only 33% of herbarium specimens have been digitally transcribed, 54% of herbaria use a specimen database, and 24% are imaging specimens. In 2012, iDigBio, the U.S. National Science Foundation’s (NSF) coordinating center and national resource for the digitization of public, nonfederal U.S. collections, launched several working groups to address this deficiency. Here, we report the development of 14 workflow modules with 7–36 tasks each. These workflows represent the combined work of approximately 35 curators, directors, and collections managers representing more than 30 herbaria, including 15 NSF-supported plant-related Thematic Collections Networks and collaboratives. The workflows are provided for download as Portable Document Format (PDF) and Microsoft Word files. Customization of these workflows for specific institutional implementation is encouraged.

Applications in Plant Sciences | 2018

Herbarium data: Global biodiversity and societal botanical needs for novel research

Shelley A. James; Pamela S. Soltis; Lee Belbin; Arthur Chapman; Gil Nelson; Deborah Paul; Matthew Collins

Building on centuries of research based on herbarium specimens gathered through time and around the globe, a new era of discovery, synthesis, and prediction using digitized collections data has begun. This paper provides an overview of how aggregated, open access botanical and associated biological, environmental, and ecological data sets, from genes to the ecosystem, can be used to document the impacts of global change on communities, organisms, and society; predict future impacts; and help to drive the remediation of change. Advocacy for botanical collections and their expansion is needed, including ongoing digitization and online publishing. The addition of non‐traditional digitized data fields, user annotation capability, and born‐digital field data collection enables the rapid access of rich, digitally available data sets for research, education, informed decision‐making, and other scholarly and creative activities. Researchers are receiving enormous benefits from data aggregators including the Global Biodiversity Information Facility (GBIF), Integrated Digitized Biocollections (iDigBio), the Atlas of Living Australia (ALA), and the Biodiversity Heritage Library (BHL), but effective collaboration around data infrastructures is needed when working with large and disparate data sets. Tools for data discovery, visualization, analysis, and skills training are increasingly important for inspiring novel research that improves the intrinsic value of physical and digital botanical collections.

BioScience | 2018

Worldwide Engagement for Digitizing Biocollections (WeDigBio): The Biocollections Community's Citizen-Science Space on the Calendar

Elizabeth R. Ellwood; Paul Kimberly; Robert P. Guralnick; Paul Flemons; Kevin Love; Shari Ellis; Julie M. Allen; Jason H. Best; Richard Carter; Simon Chagnoux; Robert Costello; Michael W. Denslow; Betty A. Dunckel; Meghan M Ferriter; Edward Gilbert; Christine Goforth; Quentin Groom; Erica R Krimmel; Raphael LaFrance; Joann Lacey Martinec; Andrew N. Miller; Jamie Minnaert-Grote; Thomas H. Nash; Peter T. Oboyski; Deborah Paul; Katelin D. Pearson; N. Dean Pentcheff; Mari A Roberts; Carrie E Seltzer; Pamela S. Soltis

The digitization of biocollections is a critical task with direct implications for the global community who use the data for research and education. Recent innovations to involve citizen scientists in digitization increase awareness of the value of biodiversity specimens; advance science, technology, engineering, and math literacy; and build sustainability for digitization. In support of these activities, we launched the first global citizen-science event focused on the digitization of biodiversity specimens: Worldwide Engagement for Digitizing Biocollections (WeDigBio). During the inaugural 2015 event, 21 sites hosted events where citizen scientists transcribed specimen labels via online platforms (DigiVol, Les Herbonautes, Notes from Nature, the Smithsonian Institution’s Transcription Center, and Symbiota). Many citizen scientists also contributed off-site. In total, thousands of citizen scientists around the world completed over 50,000 transcription tasks. Here, we present the process of organizing an international citizen-science event, an analysis of the event’s effectiveness, and future directions, content now foundational to the growing WeDigBio event.

Archive | 2013

Improving the Character of Optical Character Recognition (OCR): iDigBio Augmenting OCR Working Group Seeks Collaborators and Strategies to Improve OCR Output and Parsing of OCR Output ...

Robert Anglin; Jason H. Best; Renato Figueiredo; Edward Gilbert; Nathan Gnanasambandam; Stephen Gottschalk; Elspeth Haston; P. Bryan Heidorn; Daryl Lafferty; Peter Lang; Gil Nelson; Deborah Paul; William Ulate; Kimberly Watson; Qianjin Zhang

There are an estimated 2–3 billion museum specimens worldwide (OECD 1999, Ariño 2010). In an effort to increase the research value of their collections, institutions across the U.S. have been seeking new ways to cost-effectively transcribe the label information associated with these specimen collections. Current digitization methods are still relatively slow, labor-intensive, and therefore expensive. New methods, such as optical character recognition (OCR), natural language processing, and human-in-the-loop assisted parsing are being explored to reduce these costs. The National Science Foundation (NSF), through the Advancing Digitization of Biodiversity Collections (ADBC) program, funded Integrated Digitized Biocollections (iDigBio) in 2011 to create a Home Uniting Biodiversity Collections (HUB) cyberinfrastructure to aggregate and collectively integrate specimen data and find ways to digitize specimen data faithfully and faster and disseminate the knowledge of how to achieve this. The iDigBio Augmenting OCR Working Group is part of this national effort.

Archive | 2013

Augmenting optical character recognition (OCR) for improved digitization: Strategies to access scientific data in natural history collections

Deborah Paul; P. Bryan Heidorn

The Augmenting OCR Working Group (A-OCR WG) at Integrated Digitized Biocollections (iDigBio) seeks to improve community OCR strategies and algorithms for faster, better parsing of OCR output derived from valuable data on natural history collection specimen labels. This task is exceedingly difficult because museum labels are often annotated, and vary in content, form and font. Under the National Science Foundation’s (NSF) Advancing Digitization of Biological Collections (ADBC) program, iDigBio is building a cyberinfrastructure to aggregate quality data from museum specimens housed in collections across the United States for use by researchers, educators, environmentalists and the public. Since March of 2012, the A-OCR WG formed from community consensus to begin its role in this endeavor, defining reachable goals including setting up a hackathon concurrent with iConference 2013. This paper reports on the definition of some key problems identified by the A-OCR WG since these science problems will drive research and cyberinfrastructure development.

BioScience | 2015

Accelerating the Digitization of Biodiversity Research Specimens through Online Public Participation

Elizabeth R. Ellwood; Betty A. Dunckel; Paul Flemons; Robert P. Guralnick; Gil Nelson; Greg Newman; Sarah Newman; Deborah Paul; Greg Riccardi; Nelson Rios; Katja C. Seltmann; Austin R. Mast

Biodiversity Information Science and Standards | 2018