Greta Franzini
University of Göttingen
Publications
Featured research published by Greta Franzini.
EuroVis (STARs) | 2015
Stefan Jänicke; Greta Franzini; Muhammad Faisal Cheema; Gerik Scheuermann
We present an overview of the last ten years of research on visualizations that support close and distant reading of textual data in the digital humanities. We look at various works published within both the visualization and digital humanities communities. We provide a taxonomy of applied methods for close and distant reading, and illustrate approaches that combine both reading techniques to provide a multifaceted view of the data. Furthermore, we list toolkits and potentially beneficial visualization approaches for research in the digital humanities. Finally, we summarize collaboration experiences when developing visualizations for close and distant reading, and give an outlook on future challenges in that research area.
Computer Graphics Forum | 2017
Stefan Jänicke; Greta Franzini; Muhammad Faisal Cheema; Gerik Scheuermann
In 2005, Franco Moretti introduced Distant Reading to analyse entire literary text collections. This was a rather revolutionary idea compared to the traditional Close Reading, which focuses on the thorough interpretation of an individual work. Both reading techniques are the primary means of Visual Text Analysis. We present an overview of the research conducted since 2005 on supporting text analysis tasks with close and distant reading visualizations in the digital humanities. To this end, we classify the observed papers according to a taxonomy of text analysis tasks, categorize the close and distant reading techniques applied to support the investigation of these tasks, and illustrate approaches that combine both reading techniques in order to provide a multi-faceted view of the textual data. In addition, we take a look at the text sources used and at the typical data transformation steps required for the proposed visualizations. Finally, we summarize collaboration experiences when developing visualizations for close and distant reading, and we give an outlook on future challenges in that research area.
ACM Journal on Computing and Cultural Heritage | 2016
Maria Moritz; Barbara Pavlek; Greta Franzini; Gregory Crane
We present an approach to shortening Ancient Greek sentences by using the morpho-syntactic information attached to each word in a sentence. This work underpins the content of our eLearning application, AncientGeek, whose unique teaching technique draws from primary Greek sources. By applying a technique that skips the clausal dependents of a main verb, 89% of the shortened sentences remained well formed.
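The core idea of the shortening technique described above can be sketched in a few lines. The sketch below is illustrative, not the paper's implementation: the token fields and dependency-relation labels (Universal Dependencies-style names such as `ccomp` and `advcl`) are assumptions, and the pruning simply removes each clausal dependent of the root verb together with its subtree.

```python
# Illustrative sketch: shorten a sentence by dropping the clausal
# dependents of the main verb from its dependency tree.
# Relation labels are UD-style assumptions, not the paper's own.

CLAUSAL_RELATIONS = {"advcl", "ccomp", "xcomp", "acl", "csubj"}

def shorten(tokens):
    """tokens: list of dicts with 'id', 'head', 'deprel', 'form'.
    Returns the surface forms kept after pruning every clausal
    dependent of the root verb, along with its whole subtree."""
    children = {}
    for t in tokens:
        children.setdefault(t["head"], []).append(t)
    root = next(t for t in tokens if t["head"] == 0)  # main verb

    dropped = set()
    def mark(tok):                      # drop a token and its subtree
        dropped.add(tok["id"])
        for c in children.get(tok["id"], []):
            mark(c)

    for c in children.get(root["id"], []):
        if c["deprel"] in CLAUSAL_RELATIONS:
            mark(c)

    return [t["form"] for t in tokens if t["id"] not in dropped]

sentence = [
    {"id": 1, "head": 2, "deprel": "nsubj", "form": "he"},
    {"id": 2, "head": 0, "deprel": "root",  "form": "said"},
    {"id": 3, "head": 5, "deprel": "mark",  "form": "that"},
    {"id": 4, "head": 5, "deprel": "nsubj", "form": "she"},
    {"id": 5, "head": 2, "deprel": "ccomp", "form": "left"},
]
print(shorten(sentence))  # the ccomp subtree "that she left" is removed
```

For real Ancient Greek treebanks the same pruning would operate on richer annotations, but the tree walk is the same.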
Archive | 2014
Marco Büchler; Philip R. Burns; Martin Müller; Emily Franzini; Greta Franzini
Text re-use describes the spoken and written repetition of information. Historical text re-use, with its longer time span, embraces a larger set of morphological, linguistic, syntactic, semantic and copying variations, thus complicating text re-use detection. Furthermore, it increases the chances of redundancy in a Digital Library. In Natural Language Processing it is crucial to remove these redundancies before applying any kind of machine learning techniques to the text. In the Humanities, these redundancies are the foundation of textual criticism and allow scholars to identify lines of transmission. This chapter investigates two aspects of the historical text re-use detection process, based on seven English editions of the Holy Bible. First, we measure the performance of several techniques. For this purpose, when considering a verse, such as the book of Genesis, Chapter 1, Verse 1, that is present in two editions, one verse is always understood as a paraphrase of the other. It is worth noting that paraphrasing is considered a hyponym of text re-use. Depending on the intention with which the new version was created, verses tend to differ significantly in the wording, but not in the meaning. Secondly, this chapter explains and evaluates a way of extracting paradigmatic relations. As regards historical languages, however, there is a lack of language resources (for example, WordNet) that makes non-literal text re-use and paraphrases much more difficult to identify. These differences are present in the form of replacements, corrections, varying writing styles, etc. For this reason, we introduce both the aforementioned and other correlated steps as a method to identify text re-use, including language acquisition to detect changes that we call paradigmatic relations. The chapter concludes with the recommendation to move from a "single run" detection to an iterative process by using the acquired relations to run a new task.
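As a minimal illustration of what a literal text re-use score between two verse editions looks like, the sketch below compares word bigrams with Jaccard overlap, a common baseline; it is not the chapter's actual pipeline, and the sample verses are only stand-ins for two editions of the same passage.

```python
# Illustrative baseline for literal text re-use scoring:
# word-bigram sets compared with the Jaccard coefficient.

def ngrams(text, n=2):
    """Return the set of word n-grams of a text (lowercased)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reuse_score(a, b, n=2):
    """Jaccard overlap of the two texts' word n-gram sets, in [0, 1]."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

# Two renderings of the same verse, as in parallel Bible editions:
kjv = "In the beginning God created the heaven and the earth"
web = "In the beginning God created the heavens and the earth"
print(reuse_score(kjv, web))
```

Such a surface measure catches literal re-use but, as the chapter notes, fails on non-literal paraphrase in historical languages, which is why paradigmatic relations are needed on top of it.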
Frontiers in Digital Humanities | 2018
Greta Franzini; Mike Kestemont; Gabriela Rotari; Melina Jander; Jeremi K. Ochab; Emily Franzini; Joanna Byszuk; Jan Rybicki
This article presents the results of a multidisciplinary project aimed at better understanding the impact of different digitization strategies on computational text analysis. More specifically, it describes an effort to automatically discern the authorship of Jacob and Wilhelm Grimm in a body of uncorrected correspondence processed by HTR (Handwritten Text Recognition) and OCR (Optical Character Recognition), reporting on the effect this noise has on the analyses necessary to computationally identify the different writing styles of the two brothers. In summary, our findings show that OCR digitization serves as a reliable proxy for the more painstaking process of manual digitization, at least when it comes to authorship attribution. Our results suggest that attribution is viable even when using training and test sets from different digitization pipelines. With regard to HTR, this research demonstrates that even though automated transcription significantly increases the risk of text misclassification when compared to OCR, a transcription cleanliness above ≈ 20% is already sufficient to achieve a higher-than-chance probability of correct binary attribution.
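The principle behind binary attribution of the kind described above can be sketched with character n-gram profiles and cosine similarity. This is a simplified illustration under assumed toy inputs, not the study's actual feature set or classifier; character n-grams are, however, a standard stylometric choice precisely because they degrade gracefully under OCR/HTR noise.

```python
# Simplified sketch of binary authorship attribution:
# character-trigram frequency profiles compared by cosine similarity.

from collections import Counter
from math import sqrt

def profile(text, n=3):
    """Character n-gram frequency profile of a text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    """Cosine similarity between two frequency profiles."""
    num = sum(p[g] * q[g] for g in set(p) & set(q))
    den = sqrt(sum(v * v for v in p.values())) * \
          sqrt(sum(v * v for v in q.values()))
    return num / den if den else 0.0

def attribute(unknown, author_a, author_b):
    """Assign the unknown text to the closer reference profile."""
    pu = profile(unknown)
    sim_a = cosine(pu, profile(author_a))
    sim_b = cosine(pu, profile(author_b))
    return "A" if sim_a >= sim_b else "B"

# Toy reference samples standing in for each brother's known writing:
author_a = "the quick brown fox jumps over the lazy dog " * 3
author_b = "colorless green ideas sleep furiously " * 3
print(attribute("quick brown fox over the lazy dog", author_a, author_b))
```

In the study's setting, the reference profiles would be built from cleanly transcribed letters of each brother, and the unknown text from the noisy HTR or OCR output.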
IEEE International Conference on Big Data | 2016
Marco Büchler; Greta Franzini; Emily Franzini; Thomas Eckart
From 2004 to 2016, the Leipzig Linguistic Services (LLS) existed as a SOAP-based cyberinfrastructure of atomic micro-services for the Wortschatz project, which covered textual corpora of different sizes in more than 230 languages. The LLS were developed in 2004 and went live in 2005 in order to provide a webservice-based API to these corpus databases. In 2006, the LLS infrastructure began to systematically log and store requests made to the text collection, and in August 2016 the LLS were shut down. This article summarises the experience of the past ten years of running such a cyberinfrastructure, with a total of nearly one billion requests. It includes an explanation of the technical decisions and limitations, but also provides an overview of how the services were used.
Digital Scholarship in the Humanities | 2015
Stefan Jänicke; Annette Geßner; Greta Franzini; Melissa Terras; Simon Mahony; Gerik Scheuermann
Journal of the Text Encoding Initiative | 2014
Monica Berti; Bridget Almas; David Dubin; Greta Franzini; Simona Stoyanova; Gregory R. Crane
Language Resources and Evaluation | 2014
Frederik Baumgardt; Giuseppe G. A. Celano; Gregory R. Crane; Stella Dee; Maryam Foradi; Emily Franzini; Greta Franzini; Monica Lent; Maria Moritz; Simona Stoyanova
DH | 2018
Marco Büchler; Greta Franzini; Mike Kestemont; Enrique Manjavacas