M. Antònia Martí
University of Barcelona
Publication
Featured research published by M. Antònia Martí.
Archive | 2009
Marta Recasens; M. Antònia Martí; Mariona Taulé
Traditional linguistic theories of definiteness have characterized the definite article in terms of uniqueness or familiarity, inclusiveness or identifiability. From this perspective, anaphoric uses of definite noun phrases (NPs) are seen as the paradigm case, while non-anaphoric or first-mention uses are treated as exceptions deserving no special attention. The main weaknesses of such an approach are its tendency to be based on constructed examples and its focus on a single language, English. When natural data is taken into account, classical treatments of definites collapse.
Archive | 2016
Naymé Salas; Anna Llauradó; Cristina Castillo; Mariona Taulé; M. Antònia Martí
Holistic scoring of written texts is a widely favored procedure for evaluating text quality in both the teaching and research of writing. However, the text properties that educators take into account to perform those evaluations have rarely been investigated. In this paper we examined the extent to which a series of linguistic markers obtained from written narrative texts contributed to explaining variation in the holistic scores assigned by independent raters. The written texts were produced by 80 participants divided into four age groups (9-, 12-, and 16-year-olds, and adults), who were asked to write about the topic of a silent video showing conflicts at school. Linguistic markers were organized into three domains: syntactic complexity, cohesion, and vocabulary use. Our findings suggest that linguistic features are fundamental to perceptions of text quality in Spanish, though only a few text-based measures contributed significantly to the models for each age group. Educators took into account modality and genre constraints, and adjusted their criteria to the educational level of the writers.
language resources and evaluation | 2015
Marta Vila; Manuel Bertran; M. Antònia Martí; Horacio Rodríguez
Paraphrase corpora annotated with the types of paraphrases they contain constitute an essential resource for the understanding of the phenomenon of paraphrasing and the improvement of paraphrase-related systems in natural language processing. In this article, a new annotation scheme for paraphrase-type annotation is set out, together with newly created measures for the computation of inter-annotator agreement. Three corpora different in nature and in two languages have been annotated using this infrastructure. The annotation results and the inter-annotator agreement scores for these corpora are proof of the adequacy and robustness of our proposal.
international conference on web engineering | 2015
Mariona Taulé; M. Antònia Martí; Ann Bies; Montserrat Nofre; Aina Garí; Zhiyi Song; Stephanie M. Strassel; Joe Ellis
This paper presents the Latin American Spanish Discussion Forum Treebank (LAS-DisFo). This corpus consists of 50,291 words and 2,846 sentences that are part-of-speech tagged, lemmatized and syntactically annotated with constituents and functions. We describe how it was built, the methodology followed for its annotation, and the annotation scheme and criteria applied to deal with the most problematic phenomena commonly encountered in this kind of informal, unedited web text. This is the first available Latin American Spanish corpus of non-standard language that has been morphologically and syntactically annotated. It is a valuable linguistic resource that can be used for the training and evaluation of parsers and PoS taggers.
language resources and evaluation | 2018
Salud María Jiménez-Zafra; Mariona Taulé; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; M. Antònia Martí
In this paper, we present SFU ReviewSP-NEG, the first freely available, wide-coverage Spanish corpus annotated with negation. We describe the methodology applied in the annotation of the corpus, including the tagset, the linguistic criteria and the inter-annotator agreement tests. We also include a complete typology of negation patterns in Spanish. This typology has the advantage that it is easy to express in terms of a tagset for corpus annotation: the types are clearly defined, which avoids ambiguity in the annotation process, and they provide wide coverage (i.e. they resolved all the cases occurring in the corpus). We use the SFU ReviewSP as the base for the annotations. The corpus consists of 400 reviews, 221,866 words and 9455 sentences, out of which 3022 sentences contain at least one negation structure.
Language, cognition and neuroscience | 2014
Marta Recasens; Liliana Tolchinsky; M. Antònia Martí
Coreference has traditionally been defined dichotomously as identity-of-reference or non-identity-of-reference. Here, we consider the existence of a near-identity category for referential relations that are neither coreferent nor non-coreferent. We present a three-task experiment on the interpretation of the identity relationship between the referents of noun phrases in English and in Catalan. The experiment collected the judgements and reaction times (RTs) of 70 English speakers and 34 Catalan speakers to investigate the reality of a near-identity category that undermines the traditional dichotomous approach to coreference. The results show that whereas some referential relations are classified as either identity or non-identity by the majority of participants, there is a third class of relations for which the judgements are split between identity and non-identity. This third class of relations also shows a longer RT. This evidence supports the conclusion that it does not suffice to distinguish between identity-of-reference and non-identity-of-reference, but that it is psychologically plausible to assume a middle-ground category of near-identity to include those referential relations on which participants disagree as to whether they are coreferent. In addition, the results allow us to conclude that near-identity relations involve higher processing complexity. The fact that this is true for both English and Catalan points towards the cross-linguistic nature of near-identical referents.
Digithum | 2003
Mariona Taulé; M. Antònia Martí
In this paper we present an overview of the international exercise SENSEVAL, the basic purpose of which is to organize and run evaluations of Word Sense Disambiguation (WSD) algorithms and systems with respect to different words and languages. To do this, we first introduce the main goals and the organization of the SENSEVAL exercise; we follow with the methodology developed, the tasks and the linguistic resources created (corpora and dictionaries); and we end with some general reflections on the effects this methodology has had on Linguistics and, specifically, on Lexicography. We draw on the resources developed for the Spanish SENSEVAL-2 to illustrate this methodology.
language resources and evaluation | 2010
Marta Recasens; M. Antònia Martí
Lingua | 2011
Marta Recasens; Eduard H. Hovy; M. Antònia Martí
Procesamiento Del Lenguaje Natural | 2011
Marta Vila; M. Antònia Martí; Horacio Rodríguez