Miguel A. Alonso | Researchain

Publication

Featured research published by Miguel A. Alonso.

Conference of the European Chapter of the Association for Computational Linguistics | 1999

Tabular algorithms for TAG parsing

Miguel A. Alonso; Éric Villemonte de la Clergerie; David Cabrero; Manuel Vilares

We describe several tabular algorithms for Tree Adjoining Grammar parsing, creating a continuum from simple pure bottom-up algorithms to complex predictive algorithms and showing what transformations must be applied to each one in order to obtain the next one in the continuum.

Association for Information Science and Technology | 2015

On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages

David Vilares; Miguel A. Alonso; Carlos Gómez-Rodríguez

Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.

Natural Language Engineering | 2015

A syntactic approach for opinion mining on Spanish reviews

David Vilares; Miguel A. Alonso; Carlos Gómez-Rodríguez

We describe an opinion mining system which classifies the polarity of Spanish texts. We propose an NLP approach that undertakes pre-processing, tokenisation and POS tagging of texts to then obtain the syntactic structure of sentences by means of a dependency parser. This structure is then used to address three of the most significant linguistic constructions for the purpose in question: intensification, subordinate adversative clauses and negation.
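
As a rough illustration of how a dependency structure can drive the treatment of negation and intensification, the following is a minimal sketch and not the system described in the abstract. The lexicons, the weights, the `Node` class and the shape of the example tree are all hypothetical, and adversative clauses are left out for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicons (assumptions, not taken from the paper).
POLARITY = {"bueno": 2.0, "malo": -2.0, "caro": -1.0}
INTENSIFIERS = {"muy": 0.5, "poco": -0.5}  # relative boost applied to the head
NEGATORS = {"no", "nunca"}


@dataclass
class Node:
    """A word in a dependency tree together with its dependents."""
    word: str
    children: list = field(default_factory=list)


def score(node):
    """Return the polarity of the subtree rooted at `node`.

    Leaf intensifiers scale the score of their head, leaf negators flip
    its sign, and every other dependent contributes its own subtree score.
    """
    total = POLARITY.get(node.word, 0.0)
    boost = 0.0
    negated = False
    for child in node.children:
        if child.word in INTENSIFIERS and not child.children:
            boost += INTENSIFIERS[child.word]
        elif child.word in NEGATORS and not child.children:
            negated = True
        else:
            total += score(child)
    total *= 1.0 + boost
    return -total if negated else total


# "no es muy bueno" with "bueno" as head (an assumed analysis).
tree = Node("bueno", [Node("no"), Node("es"), Node("muy")])
print(score(tree))
```

Because the modifiers are attached to the word they affect, the scope of "no" and "muy" comes from the tree rather than from word order, which is the main point of a syntactic approach over a purely lexical one.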

International Conference on Computational Linguistics | 2002

Using Syntactic Dependency-Pairs Conflation to Improve Retrieval Performance in Spanish

Jesús Vilares Ferro; Francisco-Mario Barcala; Miguel A. Alonso

This article presents two new approaches for term indexing which are particularly appropriate for languages with a rich lexis and morphology, such as Spanish, and need few resources to be applied. At word level, productive derivational morphology is used to conflate semantically related words. At sentence level, an approximate grammar is used to conflate syntactic and morphosyntactic variants of a given multiword term into a common base form. Experimental results show remarkable improvements with regard to classical indexing methods.

International Conference on Computational Linguistics | 2001

Applying Productive Derivational Morphology to Term Indexing of Spanish Texts

Jesús Vilares; David Cabrero; Miguel A. Alonso

This paper deals with the application of natural language processing techniques to the field of information retrieval. To be precise, we propose the application of morphological families for single term conflation in order to reduce the linguistic variety of indexed documents written in Spanish. A system for automatic generation of morphological families by means of Productive Derivational Morphology is discussed. The main characteristics of this system are the use of a minimum of linguistic resources, a low computational cost, and the independence with respect to the indexing engine.
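
The idea of conflating single terms through morphological families can be sketched as follows. This is only a toy under stated assumptions: the suffix rules, the lexicon and the function names `build_families` and `conflate` are hypothetical, and real derivational morphology involves many more rules and phonological adjustments than shown here.

```python
# Hypothetical derivation rules: (suffix removed from base, suffix added).
SUFFIX_RULES = [("ar", "ación"), ("ar", "ador"), ("ar", "able")]


def build_families(lexicon):
    """Map each word to a family representative.

    A derived form is accepted only if it is attested in the lexicon,
    so the rules may overgenerate without polluting the families.
    """
    family_of = {}
    for base in sorted(lexicon):
        for old, new in SUFFIX_RULES:
            if base.endswith(old):
                derived = base[: -len(old)] + new
                if derived in lexicon:
                    family_of.setdefault(base, base)
                    family_of.setdefault(derived, family_of[base])
    return family_of


def conflate(term, family_of):
    """Return the index term for `term`: its family representative, or itself."""
    return family_of.get(term, term)


lexicon = {"publicar", "publicación", "publicable", "mesa"}
families = build_families(lexicon)
print(conflate("publicación", families))
```

Since the families are computed once, offline, the indexing engine only needs a dictionary lookup per term, which fits the low computational cost and engine independence mentioned in the abstract.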

International Conference on Artificial Intelligence | 2002

On the Usefulness of Extracting Syntactic Dependencies for Text Indexing

Miguel A. Alonso; Jesús Vilares Ferro; Victor M. Darriba

In recent years, there has been a considerable amount of interest in using Natural Language Processing in Information Retrieval research, with specific implementations varying from word-level morphological analysis to syntactic parsing to conceptual-level semantic analysis. In particular, different degrees of phrase-level syntactic information have been incorporated in information retrieval systems working on English or Germanic languages such as Dutch. In this paper we study the impact of using such information, in the form of syntactic dependency pairs, on the performance of a text retrieval system for a Romance language, Spanish.

Database and Expert Systems Applications | 2002

Tokenization and proper noun recognition for information retrieval

F.M. Barcala; Jesús Vilares; Miguel A. Alonso; Jorge Graña; Manuel Vilares

In this paper we consider a set of natural language processing techniques that can be used to analyze large amounts of texts, focusing on the advanced tokenizer which accounts for a number of complex linguistic phenomena, as well as for pre-tagging tasks such as proper noun recognition. We also show the results of several experiments performed in order to study the impact of the strategy chosen for the recognition of proper nouns.

Journal of Information Science | 2015

The megaphone of the people? Spanish SentiStrength for real-time analysis of political tweets

David Vilares; Mike Thelwall; Miguel A. Alonso

Twitter is an important platform for sharing opinions about politicians, parties and political decisions. These opinions can be exploited as a source of information to monitor the impact of politics on society. This article analyses the sentiment of 2,704,523 tweets referring to Spanish politicians and parties from a month in 2014–2015. The article makes three specific contributions: (a) enriching SentiStrength, a fast unsupervised sentiment strength detection system, for Spanish political tweeting; (b) analysing how linguistic phenomena such as negation, idioms and character duplication influence Spanish sentiment strength detection accuracy; and (c) analysing Spanish political tweets to rank political leaders, parties and personalities for popularity. Sentiment in Twitter for key politicians broadly reflects the main official polls for popularity but not for voting intention. In addition, the data suggests that the primary role of Twitter in politics is to select and amplify political events published by traditional media.

International Conference on Implementation and Application of Automata | 2001

Compilation Methods of Minimal Acyclic Finite-State Automata for Large Dictionaries

Jorge Graña; Francisco-Mario Barcala; Miguel A. Alonso

We present a reflection on the evolution of the different methods for constructing minimal deterministic acyclic finite-state automata from a finite set of words. We outline the most important methods, including the traditional ones (which consist of the combination of two phases: insertion of words and minimization of the partial automaton) and the incremental algorithms (which add new words one by one and minimize the resulting automaton on-the-fly, being much faster and having significantly lower memory requirements). We analyze their main features in order to provide some improvements for incremental constructions, and a general architecture that is needed to implement large dictionaries in natural language processing (NLP) applications.
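
To make the incremental idea concrete, here is a minimal sketch of one well-known scheme for building a minimal acyclic automaton on the fly from lexicographically sorted words. It is not necessarily the variant or the improvements discussed in the paper, and all names (`State`, `build_minimal_dafsa`, `accepts`, `count_states`) are illustrative assumptions.

```python
class State:
    """One automaton state: outgoing edges plus a final-state flag."""

    def __init__(self):
        self.edges = {}  # char -> State
        self.final = False

    def signature(self):
        # States are equivalent when they agree on finality and on every
        # (label, target) pair; targets are already canonical at this point.
        return (self.final,
                tuple(sorted((c, id(s)) for c, s in self.edges.items())))


def _minimize(unchecked, register, down_to):
    """Replace states on the unchecked path by registered equivalents."""
    while len(unchecked) > down_to:
        parent, ch, child = unchecked.pop()
        parent.edges[ch] = register.setdefault(child.signature(), child)


def build_minimal_dafsa(sorted_words):
    """Add words one by one, minimizing the finished suffix after each step."""
    root, register, unchecked, previous = State(), {}, [], ""
    for word in sorted_words:
        if word < previous:
            raise ValueError("input must be lexicographically sorted")
        common = 0
        while (common < min(len(word), len(previous))
               and word[common] == previous[common]):
            common += 1
        _minimize(unchecked, register, common)
        node = unchecked[-1][2] if unchecked else root
        for ch in word[common:]:
            child = State()
            node.edges[ch] = child
            unchecked.append((node, ch, child))
            node = child
        node.final = True
        previous = word
    _minimize(unchecked, register, 0)
    return root


def accepts(root, word):
    """Follow the word's characters from the root and test finality."""
    node = root
    for ch in word:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final


def count_states(root):
    """Count the distinct states reachable from the root."""
    seen, stack = set(), [root]
    while stack:
        state = stack.pop()
        if id(state) not in seen:
            seen.add(id(state))
            stack.extend(state.edges.values())
    return len(seen)


root = build_minimal_dafsa(["tap", "taps", "top", "tops"])
print(count_states(root), accepts(root, "tops"), accepts(root, "to"))
```

Only the path of the most recently added word is kept unminimized, so a full trie never has to be held in memory, which is where the lower memory requirements of incremental construction come from.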

Information Processing and Management | 2017

Supervised sentiment analysis in multilingual environments

David Vilares; Miguel A. Alonso; Carlos Gómez-Rodríguez

This article tackles the problem of performing multilingual polarity classification on Twitter, comparing three techniques: (1) a multilingual model trained on a multilingual dataset, obtained by fusing existing monolingual resources, that does not need any language recognition step, (2) a dual monolingual model with perfect language detection on monolingual texts and (3) a monolingual model that acts based on the decision provided by a language identification tool. The techniques were evaluated on monolingual, synthetic multilingual and code-switching corpora of English and Spanish tweets. In the latter case we introduce the first code-switching Twitter corpus with sentiment labels. The samples are labelled according to two well-known criteria used for this purpose: the SentiStrength scale and a trinary scale (positive, neutral and negative categories). The experimental results show the robustness of the multilingual approach (1) and also that it outperforms the monolingual models on some monolingual datasets.