Matteo Romanello | Researchain

Publication

Featured research published by Matteo Romanello.

European Conference on Research and Advanced Technology for Digital Libraries | 2009

Improving OCR accuracy for classical critical editions

Federico Boschetti; Matteo Romanello; Alison Babeu; David Bamman; Gregory R. Crane

This paper describes a workflow designed to populate a digital library of ancient Greek critical editions with highly accurate OCR-scanned text. While the most recently available OCR engines are now able, after suitable training, to deal with the polytonic Greek fonts used in 19th and 20th century editions, further improvements can also be achieved with postprocessing. In particular, the progressive multiple alignment method applied to different OCR outputs based on the same images is discussed in this paper.

ACM/IEEE Joint Conference on Digital Libraries | 2009

Collecting fragmentary authors in a digital library

Monica Berti; Matteo Romanello; Alison Babeu; Gregory R. Crane

This paper discusses new work to represent, in a digital library of classical sources, authors whose works themselves are lost and who survive only where surviving authors quote, paraphrase or allude to them. It describes initial works from a digital collection of such fragmentary authors designed not only to capture but to extend the ontologies that traditional scholarship has developed over generations: the aim is representing every nuance of print conventions while using the capabilities of digital libraries to extend our ability to identify fragments, to represent what we have identified, and to render the results of that work intellectually and physically more accessible than was possible in print culture.

Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries | 2009

Citations in the Digital Library of Classics: Extracting Canonical References by Using Conditional Random Fields

Matteo Romanello; Federico Boschetti; Gregory R. Crane

Scholars of Classics cite ancient texts by using abridged citations called canonical references. In the scholarly digital library, canonical references create a complex textile of links between ancient and modern sources reflecting the deep hypertextual nature of texts in this field. This paper aims to demonstrate the suitability of Conditional Random Fields (CRF) for extracting this particular kind of reference from unstructured texts in order to enhance the capabilities of navigating and aggregating scholarly electronic resources. In particular, we developed a parser which recognizes word level n-grams of a text as being canonical references by using a CRF model trained with both positive and negative examples.

Proceedings of the 1st International Workshop on Collaborative Annotations in Shared Environment | 2013

Citations and annotations in classics: old problems and new perspectives

Matteo Romanello; Michele Pasin

Annotations have played a major role in Classics since the very beginning of the discipline. Some of the first attested examples of philological work, the so-called scholia, were in fact marginalia, namely comments written in the margins of a text. Over the centuries this kind of scholarship evolved until it became a genre of its own, the classical commentary, thus moving away from the text, with the result that philologists had to devise a way of linking the commented and the commenting text. The solution to this problem is the system of canonical citations, a special kind of bibliographic reference that is at the same time very precise and highly interoperable. In this paper we present HuCit, an ontology that models in depth the semantics of canonical citations. We discuss how it can be used to a) support the automatic extraction of canonical citations from texts and b) publish them in machine-readable format on the Semantic Web. Finally, we describe how HuCit's machine-generated citation data can also be expressed as annotations by using the Open Annotation Collaboration (OAC) ontology, with the aim of increasing reuse and semantic interoperability.

International Journal on Digital Libraries | 2018

The references of references: a method to enrich humanities library catalogs with citation data

Giovanni Colavizza; Matteo Romanello; Frédéric Kaplan

The advent of large-scale citation indexes has greatly impacted the retrieval of scientific information in several domains of research. The humanities have largely remained outside of this shift, despite their increasing reliance on digital means for information seeking. Given that publications in the humanities have a longer than average life-span, mainly due to the importance of monographs for the field, this article proposes to use domain-specific reference monographs to bootstrap the enrichment of library catalogs with citation data. Reference monographs are works considered to be of particular importance in a research library setting, and likely to possess characteristic citation patterns. The article shows how to select a corpus of reference monographs, and proposes a pipeline to extract the network of publications they refer to. Results using a set of reference monographs in the domain of the history of Venice show that only 7% of extracted citations are made to publications already within the initial seed. Furthermore, the resulting citation network suggests the presence of a core set of works in the domain, cited more frequently than average.

ACM Conference on Hypertext | 2009

When printed hypertexts go digital: information extraction from the parsing of indices

Matteo Romanello; Monica Berti; Alison Babeu; Gregory R. Crane

Modern critical editions of ancient works generally include manually created indices of other sources quoted in the text. Since indices can be considered as a form of domain specific language, the paper presents a parsing-based approach to the problem of extracting information from them to support the creation of a collection of fragmentary texts. This paper first considers the characteristics and structure of quotation indices and their importance when dealing with fragmentary texts. It then presents the results of applying a fuzzy parser to the OCR transcription of an index of quotations to extract information from potentially noisy input.

International Conference on Electronic Publishing | 2008