Carlos Gómez-Rodríguez
Grupo México
Publication
Featured research published by Carlos Gómez-Rodríguez.
Association for Information Science and Technology | 2015
David Vilares; Miguel A. Alonso; Carlos Gómez-Rodríguez
Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.
Knowledge Based Systems | 2017
David Vilares; Carlos Gómez-Rodríguez; Miguel A. Alonso
We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules. On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as a black box and (2) their robustness across different corpora and domains. On the other hand, by introducing the concept of compositional operations and exploiting syntactic information in the form of universal dependencies, we tackle one of their main drawbacks: their rigidity on data that are differently structured depending on the language. Experiments show an improvement both over existing unsupervised methods, and over state-of-the-art supervised models when evaluating outside their corpus of origin. The system is freely available.
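As a rough illustration of what compositional operations over a dependency tree can look like, here is a minimal sketch. It is not the system described in the abstract: the lexicon, the intensifier and negator tables, the `Node` class, and the `score` function are all hypothetical, and the only structural assumption is that modifiers attach to the word they modify with an `advmod` relation.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Toy resources: illustrative assumptions, not the resources used in the paper.
LEXICON = {"good": 2.0, "bad": -2.0, "great": 3.0}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATORS = {"not", "never"}


@dataclass
class Node:
    """One word in a dependency tree, with the relation to its head."""
    word: str
    deprel: str = "root"
    children: list[Node] = field(default_factory=list)


def score(node: Node) -> float:
    """Compute a polarity score bottom-up over the tree.

    Intensifiers scale and negators flip the score of the subtree
    rooted at the word they modify; other children simply add up.
    """
    total = LEXICON.get(node.word, 0.0)
    scale, negate = 1.0, False
    for child in node.children:
        if child.deprel == "advmod" and child.word in INTENSIFIERS:
            scale *= INTENSIFIERS[child.word]
        elif child.deprel == "advmod" and child.word in NEGATORS:
            negate = True
        else:
            total += score(child)
    total *= scale
    return -total if negate else total


# "not very good": both modifiers attach to "good".
tree = Node("good", children=[Node("not", "advmod"), Node("very", "advmod")])
result = score(tree)
```

Because every rule is an explicit operation on a subtree, the final score can be traced back to the words and relations that produced it, which is the interpretability argument the abstract makes for unsupervised approaches.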
TAGRF '06 Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms | 2006
Carlos Gómez-Rodríguez; Miguel A. Alonso; Manuel Vilares
In this paper, a generic system that generates parsers from parsing schemata is applied to the particular case of the XTAG English grammar. In order to be able to generate XTAG parsers, some transformations are made to the grammar, and TAG parsing schemata are extended with feature structure unification support and a simple tree filtering mechanism. The generated implementations allow us to study the performance of different TAG parsers when working with a large-scale, wide-coverage grammar.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies | 2017
David Vilares; Carlos Gómez-Rodríguez
The LyS-FASTPARSE team presents BIST-COVINGTON, a neural implementation of the Covington (2001) algorithm for non-projective dependency parsing. The bidirectional LSTM approach by Kiperwasser and Goldberg (2016) is used to train a greedy parser with a dynamic oracle to mitigate error propagation. The model participated in the CoNLL 2017 UD Shared Task. In spite of not using any ensemble methods and using the baseline segmentation and PoS tagging, the parser obtained good results on both macro-average LAS and UAS in the big treebanks category (55 languages), ranking 7th out of 33 teams. In the all treebanks category (LAS and UAS) we ranked 16th and 12th. The gap between the all and big categories is mainly due to the poor performance on four parallel PUD treebanks, suggesting that some 'suffixed' treebanks (e.g. Spanish-AnCora) perform poorly on cross-treebank settings, which does not occur with the corresponding 'unsuffixed' treebank (e.g. Spanish). By changing that, we obtain the 11th best LAS among all runs (official and unofficial). The code is publicly available.
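The control flow of a Covington-style parser can be sketched in a few lines. The version below is only an illustration under stated assumptions: the BiLSTM scorer and the dynamic oracle from the abstract are replaced by a lookup in a gold tree, and the function name `covington_parse` and its argument layout are hypothetical. It shows the pairwise strategy (each word is compared with every preceding word, subject to single-head and acyclicity constraints), which is what allows non-projective trees to be built.

```python
def covington_parse(gold_heads):
    """Rebuild a dependency tree with Covington-style pairwise linking.

    gold_heads[d] is the head of word d; index 0 is the artificial
    root and its entry is None. The gold tree stands in for a trained
    scorer (an assumption made to keep the sketch self-contained).
    """
    n = len(gold_heads) - 1
    heads = [None] * (n + 1)

    def creates_cycle(head, dep):
        # Adding head -> dep closes a cycle if dep is already an ancestor of head.
        node = head
        while node is not None and node != 0:
            node = heads[node]
            if node == dep:
                return True
        return False

    for j in range(1, n + 1):
        # Compare word j with every earlier position, nearest first.
        for i in range(j - 1, -1, -1):
            if (i != 0 and heads[i] is None and gold_heads[i] == j
                    and not creates_cycle(j, i)):
                heads[i] = j      # left arc: j is the head of i
            elif (heads[j] is None and gold_heads[j] == i
                    and not creates_cycle(i, j)):
                heads[j] = i      # right arc: i is the head of j
    return heads


# Non-projective example: the arc 4 -> 2 crosses the arc 0 -> 3.
gold = [None, 3, 4, 0, 3]
parsed = covington_parse(gold)
```

The nested loop inspects every word pair, so the strategy takes quadratic time in sentence length; in exchange, crossing arcs such as the one in the example need no special handling.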
Proceedings of the Workshop on Noisy User-generated Text | 2015
Yerai Doval Mosquera; Jesús Vilares; Carlos Gómez-Rodríguez
In this article we describe the microtext normalization system we have used to participate in the Normalization of Noisy Text Task of the ACL W-NUT 2015 Workshop. Our normalization system was originally developed for text mining tasks on Spanish tweets. Our main goals during its development were flexibility, scalability and maintainability, in order to test a wide variety of approximations to the problem at hand with minimum effort. We will pay special attention to the process of adapting the components of our system to deal with English tweets which, as we will show, was achieved without major modifications of its base structure.
Archive | 2006
Carlos Gómez-Rodríguez; Miguel A. Alonso; Manuel Vilares
TASS@SEPLN | 2015
David Vilares; Yerai Doval; Miguel A. Alonso; Carlos Gómez-Rodríguez
Archive | 2017
Carlos Gómez-Rodríguez; Iago Alonso-Alonso; David Vilares
Taller de NEGación en ESpañol (NEGES-2017) | 2017
David Vilares Calvo; Miguel A. Alonso; Carlos Gómez-Rodríguez
Archive | 2006
Jesús Vilares; Carlos Gómez-Rodríguez; Miguel A. Alonso