Publication


Featured research published by Cristina Mota.


ElectricDict '04 Proceedings of the Workshop on Enhancing and Using Electronic Dictionaries | 2004

Multiword lexical acquisition and dictionary formalization

Cristina Mota; Paula Carvalho; Elisabete Ranchhod

In this paper, we present the current state of development of a large-scale lexicon built at LabEL for Portuguese. We will concentrate on multiword expressions (MWE), particularly on multiword nouns, (i) illustrating their most relevant morphological features, and (ii) pointing out the methods and techniques adopted to generate the inflected forms from lemmas. Moreover, we describe a corpus-based approach for the acquisition of new multiword nouns, which led to a significant enlargement of the existing lexicon. Evaluation results concerning lexical coverage in the corpus are also discussed.


North American Chapter of the Association for Computational Linguistics | 2009

Relation detection between named entities: report of a shared task

Cláudia Freitas; Diana Santos; Cristina Mota; Hugo Gonçalo Oliveira; Paula Carvalho

In this paper we describe the first evaluation contest (track) for Portuguese whose goal was to detect and classify relations between named entities in running text, called ReRelEM. Given a collection annotated with named entities belonging to ten different semantic categories, we marked all relationships between them within each document. We used the following fourfold relationship classification: identity, included-in, located-in, and other (which was later explicitly detailed into twenty different relations). We provide a quantitative description of this evaluation resource, describe the evaluation architecture, and summarize the results of the participating systems in the track.


Meeting of the Association for Computational Linguistics | 2009

Updating a Name Tagger Using Contemporary Unlabeled Data

Cristina Mota; Ralph Grishman

For many NLP tasks, including named entity tagging, semi-supervised learning has been proposed as a reasonable alternative to methods that require annotating large amounts of training data. In this paper, we address the problem of analyzing new data given a semi-supervised NE tagger trained on data from an earlier time period. We will show that updating the unlabeled data is sufficient to maintain quality over time, and outperforms updating the labeled data. Furthermore, we will also show that augmenting the unlabeled data with older data in most cases does not result in better performance than simply using a smaller amount of current unlabeled data.


Processing of the Portuguese Language | 2003

ANELL: a web system for Portuguese corpora annotation

Cristina Mota; Pedro Moura

In this paper, we briefly describe a system for annotating corpora via the web which offers two operating modes: a fully automatic mode and a supervised mode. The linguistic analysis of the corpora is performed by the INTEX system using the LabEL linguistic resources. The motivation, the architecture and some examples of its behavior are presented, with special emphasis on the several output formats it allows.


International Conference Natural Language Processing | 2002

Complex Lexical Units and Automata

Paula Carvalho; Cristina Mota; Elisabete Ranchhod

This paper discusses the problem of disambiguating noun phrases that contain compound words ambiguous with free simple word combinations. Finite-state transducers will be used to both represent the noun phrase ambiguities and to formalize the linguistic constraints that allow the elimination or reduction of the incorrect analyses.


International Conference on Automatic Processing of Natural-Language Electronic Texts with NooJ | 2015

Generating Paraphrases of Human Intransitive Adjective Constructions with Port4NooJ

Cristina Mota; Paula Carvalho; Francisco Raposo; Anabela Barreiro

This paper details the integration into Port4NooJ of 15 lexicon-grammar tables describing the distributional properties of 4,248 human intransitive adjectives. The properties described in these tables enable the recognition and generation of adjectival constructions where the adjective has a predicative function. These properties also establish semantic relationships between adjective, noun and verb predicates, allowing new paraphrasing capabilities that were described in NooJ grammars. The new dictionary of human intransitive adjectives created by merging the information on those tables with the Port4NooJ homograph adjectives is comprised of 5,177 entries. The enhanced Port4NooJ is being used in eSPERTo, a NooJ-based paraphrase generation platform.


Archive | 2010

Journalistic corpus similarity over time

Cristina Mota

We used the method proposed in Kilgarriff (2001) to assess corpus similarity over a short period of time both within topic and cross topic. The corpus samples were drawn from a Portuguese journalistic corpus. The corpus spans eight years (from 1991 to 1998) and comprises article extracts marked with the year segment, half-year segment, and newspaper section of publication. We analyzed the corpus, taking as reference each text in the time interval and comparing it with all texts published in different periods. We observed that (i) the similarity between two texts within the same topic generally decreases as the time gap between them increases, being more significant for some topics, and (ii) in some cases the texts on one topic over time become as different as two texts from different topics. Since the ultimate goal of our work is to understand how the changes in corpus similarity affect the performance of a named entity tagger, we also measured similarity based on frequency lists containing only capitalized words and containing only lowercase words. The former similarity aims at comparing the corpora from the viewpoint of the named entity content, whereas the latter approximately compares the surrounding contexts of the named entities. The results show that the similarity values based on these lists also generally decrease over time, even though the decreasing profiles are topic-dependent.
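
The abstract above does not give implementation details, so the following is only a minimal sketch of a frequency-list comparison of the kind it describes. It assumes a chi-square statistic computed over the most frequent words of the two samples combined, where lower values suggest more similar corpora. The function names, the `top_n` cutoff, and the `keep` filter (used here to restrict the lists to capitalized or lowercase words) are hypothetical choices for illustration, not the author's code.

```python
from collections import Counter


def is_capitalized(token):
    """True if the token starts with an uppercase letter (rough proxy for names)."""
    return token[:1].isupper()


def is_lowercase(token):
    """True if the token starts with a lowercase letter (rough proxy for context words)."""
    return token[:1].islower()


def chi_square_distance(tokens_a, tokens_b, top_n=500, keep=None):
    """Chi-square statistic over the top_n most frequent words of the joint sample.

    tokens_a, tokens_b: lists of word tokens from the two corpus samples.
    keep: optional predicate; only tokens for which it returns True are counted.
    Lower values suggest more similar samples; 0.0 means identical relative frequencies.
    """
    if keep is not None:
        tokens_a = [t for t in tokens_a if keep(t)]
        tokens_b = [t for t in tokens_b if keep(t)]

    size_a, size_b = len(tokens_a), len(tokens_b)
    if size_a == 0 or size_b == 0:
        raise ValueError("both samples must contain at least one counted token")

    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    joint = freq_a + freq_b
    total = size_a + size_b

    chi2 = 0.0
    for word, joint_count in joint.most_common(top_n):
        # Expected counts if the word were spread in proportion to sample size.
        expected_a = joint_count * size_a / total
        expected_b = joint_count * size_b / total
        chi2 += (freq_a[word] - expected_a) ** 2 / expected_a
        chi2 += (freq_b[word] - expected_b) ** 2 / expected_b
    return chi2


if __name__ == "__main__":
    # Toy samples standing in for article extracts from two time periods.
    sample_1991 = "o governo de Lisboa anunciou o acordo".split()
    sample_1998 = "o governo de Bruxelas aprovou o acordo".split()
    print(chi_square_distance(sample_1991, sample_1998))
    print(chi_square_distance(sample_1991, sample_1998, keep=is_capitalized))
    print(chi_square_distance(sample_1991, sample_1998, keep=is_lowercase))
```

Running the same function with the capitalized-only and lowercase-only filters gives two separate scores, loosely mirroring the abstract's distinction between named entity content and surrounding context.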


International Conference on Automatic Processing of Natural-Language Electronic Texts with NooJ | 2017

Integrating the Lexicon-Grammar of Predicate Nouns with Support Verb fazer into Port4NooJ

Cristina Mota; Lucília Chacoto; Anabela Barreiro

This paper describes the ongoing process of integrating approximately 3,000 predicate nouns into Port4NooJ, the Portuguese module for NooJ. The integration of these resources enables us to further extend the paraphrastic capabilities of the eSPERTo paraphrasing system, developed in the scope of a project with the same name. The integrated predicate nouns co-occur with the support verb fazer (do or make) and their syntactic and distributional properties are formalized in lexicon-grammar tables. These lexicon-grammar tables resulted in a standalone dictionary of predicate noun constructions and a few new grammars that can be used in paraphrase analysis and generation.


International NooJ Conference | 2016

eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization

Cristina Mota; Anabela Barreiro; Francisco Raposo; Ricardo Ribeiro; Sérgio Curto; Luísa Coheur

This paper reports our first attempt at integrating eSPERTo’s paraphrastic engine, which is based on the NooJ platform, with two application scenarios: a conversational agent and a summarization system. We briefly describe eSPERTo’s base resources, and the necessary modifications to these resources that enabled the production of paraphrases required to feed both systems. Although the improvement observed in both scenarios is not significant, we present a detailed error analysis to further improve the achieved results in future experiments.


Processing of the Portuguese Language | 2012

SIGA, a system to manage information retrieval evaluations

Luís Fernando Costa; Cristina Mota; Diana Santos

This paper provides an overview of the current version of SIGA, a system that supports the organization of information retrieval (IR) evaluations. SIGA was recently used in Pagico, an evaluation contest where both automatic and human participants competed to find answers to 150 topics in the Portuguese Wikipedia, and we describe its new capabilities in this context as well as provide preliminary results from Pagico.

Collaboration


Dive into Cristina Mota's collaborations.

Top Co-Authors

Cláudia Freitas

Pontifical Catholic University of Rio de Janeiro
