Publication


Featured research published by Matthew Christy.


ACM Journal on Computing and Cultural Heritage | 2017

Mass Digitization of Early Modern Texts With Optical Character Recognition

Matthew Christy; Anshul Gupta; Elizabeth Grumbach; Laura Mandell; Richard Furuta; Ricardo Gutierrez-Osuna

Optical character recognition (OCR) engines work poorly on texts published with premodern printing technologies. Engaging the key technological contributors from the IMPACT project, an earlier project attempting to solve the OCR problem for early modern and modern texts, the Early Modern OCR Project (eMOP) of Texas A&M received funding from the Andrew W. Mellon Foundation to improve OCR outputs for early modern texts from the Eighteenth Century Collections Online (ECCO) and Early English Books Online (EEBO) proprietary database products, some 45 million pages. Added to print problems is the poor quality of the page images in these collections, which would be too time consuming and expensive to reimage. This article describes eMOP's attempts to OCR 307,000 documents digitized from microfilm to make our cultural heritage available for current and future researchers. We describe the reasoning behind our choices as we undertook the project based on other relevant studies; discoveries we made; the data and the system we developed for processing it; the software, algorithms, training procedures, and tools that we developed; and future directions that should be taken for further work in developing OCR engines for cultural heritage materials.


Digital Scholarship in the Humanities | 2015

Navigating the storm: IMPACT, eMOP, and Agile Steering Standards

Laura Mandell; Clemens Neudecker; Apostolos Antonacopoulos; Elizabeth Grumbach; Loretta Auvil; Matthew Christy; Jacob A. Heil; Todd Samuelson

This article discusses two major initiatives tasked with developing tools to improve optical character recognition (OCR) or the mechanical keying of texts that are digitally available only as page images. The two initiatives are the IMProving ACcess to Text Project in Europe and the Early Modern OCR Project in the USA. Because they dealt with a multilayered problem like OCR technologies and had to collaborate with radically interdisciplinary and international team members, the two projects developed techniques that we call Agile Project Management, outlined in this essay with rationales for their use.


National Conference on Artificial Intelligence | 2015

Automatic assessment of OCR quality in historical documents

Anshul Gupta; Ricardo Gutierrez-Osuna; Matthew Christy; Boris Capitanu; Loretta Auvil; Liz Grumbach; Richard Furuta; Laura Mandell


DH | 2014

Diagnosing Page Image Problems with Post-OCR Triage for eMOP

Matthew Christy; Loretta Auvil; Ricardo Gutierrez-Osuna; Boris Capitanu; Anshul Gupta; Elizabeth Grumbach


arXiv: Computer Vision and Pattern Recognition | 2016

Font Identification in Historical Documents Using Active Learning

Anshul Gupta; Ricardo Gutierrez-Osuna; Matthew Christy; Richard Furuta; Laura Mandell


Archive | 2015

Considering Frameworks for the Ideal Digital Research Community: The Past and Present of 18thConnect

Liz Grumbach; Matthew Christy


2015 Texas Conference on Digital Libraries | 2015

Expanding and Improving Access to Early English Books Online (EEBO)

Matthew Christy; Elizabeth Grumbach; Laura Mandell


2015 Texas Conference on Digital Libraries | 2015

Beyond the Early Modern OCR Project

Matthew Christy; Elizabeth Grumbach; Laura Mandell


DH | 2014

Book History and Software Tools: Examining Typefaces for OCR Training in eMOP

Matthew Christy; Todd Samuelson; Katayoun Torabi; Bryan Tarpley; Elizabeth Grumbach


DH | 2014

The Early Modern OCR Project (eMOP): Fostering Access to Early Modern Cultural Materials

Elizabeth Grumbach; Laura Mandell; Matthew Christy
