Rita Marquilhas
University of Lisbon
Publication
Featured research published by Rita Marquilhas.
Language Technology for Cultural Heritage | 2011
Iris Hendrickx; Michel Généreux; Rita Marquilhas
In this investigation we aim to reduce manual workload by automatically processing a corpus of historical letters for pragmatic research. We focus on two consecutive subtasks: the first is automatic segmentation of the letters into formal and informal parts using a statistical n-gram based technique. The second is semantic labeling of the formal parts of the letters using supervised machine learning. The main stumbling block in our investigation is data sparsity, caused by the small size of the data set and aggravated by the spelling variation present in the historical letters. We try to address the latter problem with a text normalization step based on dictionary lookup and edit distance. We achieve a micro-averaged F-score of 86% for the text segmentation task and 66.3% for the semantic labeling task. Even though these scores are not high enough to completely replace manual annotation with automatic annotation, our results are promising and demonstrate that an automatic approach based on such a small data set is feasible.
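The abstract gives no implementation details for the normalization step, so the following is only a minimal sketch of how a dictionary lookup combined with edit distance could work. The function names, the toy lexicon, and the `max_distance` threshold are all illustrative assumptions and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(token: str, lexicon: set[str], max_distance: int = 2) -> str:
    """Map a historical spelling to a lexicon entry (hypothetical helper).

    Step 1: dictionary lookup; known forms are returned unchanged.
    Step 2: otherwise pick the closest lexicon entry by edit distance,
            but only if it lies within max_distance (an assumed threshold).
    """
    if not lexicon or token in lexicon:
        return token
    # sorted() makes tie-breaking deterministic (alphabetical order).
    best = min(sorted(lexicon), key=lambda w: edit_distance(token, w))
    return best if edit_distance(token, best) <= max_distance else token


# Toy lexicon, invented for illustration only.
lexicon = {"senhor", "merce", "carta"}
print(normalize("snor", lexicon))   # variant mapped to the nearest entry
print(normalize("carta", lexicon))  # already in the lexicon
```

A small threshold keeps unrelated words from being collapsed into one another, at the cost of leaving more distant spelling variants unnormalized.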
Archive | 2017
Martyn Lyons; Rita Marquilhas
The Introduction argues that written documents must be examined not just as historical testimony, but as historical objects in their own right. The use of writing in any given society is a key to its workings and power structures. The history of scribal culture must treat written objects in three dimensions: as material object, as social practice and as text. This book emphasises the materiality of writing, and sees literacy as a social practice embedded in everyday life.
Journal of Historical Sociolinguistics | 2015
Rita Marquilhas
This paper discusses the methods that historical sociolinguists can use in order to avoid anachronism. It is argued that there are four practical ways of triggering a sense of scale, both for external variables that correlate with past language use and for the linguistic data that we inherited from past societies: by learning from social and cultural historians, by visiting judicial archives, by making scholarly digital editions, and by using corpus linguistic statistical procedures. The case study focused on here is Portuguese in the Early Modern period, from the sixteenth century to the early nineteenth century. The size of an ideal sample of informants is discussed, based on the demographic history of Portugal. Furthermore, social categories relevant to Portuguese cultural history are established, taking into account the social world that made sense to Early Modern people. Next, I introduce a corpus of Early Modern letters containing a close-to-conversational register, and discuss two case studies. An analysis of spelling variation in the corpus shows the diachronic dialectal spread of a merger of sibilants. The statistical analysis of keywords shows the pervasiveness of register markers as well as some typical uses found in epistolary communication by social actors from different social strata.
International Journal of Humanities and Arts Computing | 2014
Rita Marquilhas; Iris Hendrickx
The CARDS-FLY project aims to collect and transcribe, in a digital format, a diverse sample of historical personal letters from the 16th to the 20th century, in order to create a linguistic resource for the historical study of the Portuguese language and society. The letters were written by people from all layers of society, and their historical, social and pragmatic contexts are documented in the digital format. Here we study one particular aspect of this collection, namely its spelling variation. On the basis of this analysis, we improved a statistical spelling normalisation tool that we aim to use to automatically normalise the spelling in the full collection of digitised letters.
Archive | 1991
Ivo Castro; Rita Marquilhas; J Léon Acosta
JLCL | 2011
Iris Hendrickx; Rita Marquilhas
Archive | 2012
M.W.C. Reynaert; Iris Hendrickx; Rita Marquilhas; F. Mambrini; M. Passarotti; C. Sporleder
Archive | 2012
Rita Marquilhas
Archive | 2003
Rita Marquilhas
Written Language and Literacy | 2015
Rita Marquilhas