Saul León

Benemérita Universidad Autónoma de Puebla

Publication

Featured research published by Saul León.

international conference on computational linguistics | 2014

BUAP: Evaluating Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment

Saul León; Darnes Vilariño; David Pinto; Mireya Tovar; Beatriz Beltrán

This paper presents the results obtained by the BUAP team at Task 1 of SemEval 2014. The submitted run is a supervised approach based on two classification models: 1) logistic regression for determining the semantic relatedness between a pair of sentences, and 2) support vector machines for identifying the degree of textual entailment between the two sentences. The system performed much better on the second subtask (textual entailment) than on the first (relatedness), placing our approach 7th among the 18 teams that participated in the competition.

international conference on computational linguistics | 2014

BUAP: Evaluating Features for Multilingual and Cross-Level Semantic Textual Similarity

Darnes Vilariño; David Pinto; Saul León; Mireya Tovar; Beatriz Beltrán

In this paper we present the evaluation of different features for multilingual and cross-level semantic textual similarity. Three different types of features were used: lexical, knowledge-based and corpus-based. The results obtained at the SemEval competition rank our approaches above the average of the rest of the teams, highlighting the usefulness of the features presented in this paper.

mexican conference on pattern recognition | 2012

A machine-translation method for normalization of SMS

Darnes Vilariño; David Pinto; Beatriz Beltrán; Saul León; Esteban Castillo; Mireya Tovar

Normalization of SMS is a very important task that must be addressed by the computational community because of the tremendous growth of services based on mobile devices, which make use of this kind of message. The automatic treatment of SMS texts faces many limitations derived from the particular writing style used. Although dealing with this kind of text is already difficult enough, we are also interested in tasks that require understanding the meaning of documents in different languages, which increases the complexity of such tasks. Our approach proposes to normalize SMS texts employing machine translation techniques. For this purpose, we use a statistical bilingual dictionary calculated on the basis of the IBM-4 model for determining the best translation for a given SMS term. We have compared the presented approach with a traditional probabilistic method of information retrieval, observing that the normalization model proposed here highly improves the performance of the probabilistic one.

international conference on computational linguistics | 2014

BUAP: Polarity Classification of Short Texts

David Pinto; Darnes Vilariño; Saul León; Miguel Jasso; Cupertino Lucero

We report the results we obtained in subtask B (Message Polarity Classification) of SemEval 2014 Task 9. The features used for representing the messages were basically character trigrams, PoS trigrams and a number of words selected by means of a graph mining tool. Our approach performed slightly below the overall average, except on a corpus of tweets with sarcasm, where we performed quite well, obtaining around 6% above the overall average.

INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval | 2010

BUAP: a first approach to the data-centric track of INEX 2010

Darnes Vilariño; David Pinto; Carlos Balderas; Mireya Tovar; Saul León

In this paper we present the results of the evaluation of an information retrieval system constructed in the Faculty of Computer Science, BUAP. This system was used in the Data-Centric track of the Initiative for the Evaluation of XML retrieval (INEX 2010). This track is focused on the extensive use of a very rich structure of the documents beyond the content. We have considered topics (queries) in two variants: Content Only (CO) and Content And Structure (CAS) of the information need. The obtained results are shown and compared with those presented by other teams in the competition.

mexican conference on pattern recognition | 2015

A Graph-Based Textual Entailment Method Aware of Real-World Knowledge

Saul León; Darnes Vilariño; David Pinto; Mireya Tovar; Beatriz Beltrán

In this paper we propose an unsupervised methodology to solve the textual entailment task that extracts facts associated with a pair of sentences. The extracted facts are represented as a graph. The graph-based representations of the two sentences may then be compared in order to determine the type of textual entailment judgment that they hold. The comparison method is based on graph-based algorithms for finding subgraph structures inside another graph, but generalizing the concepts by means of a real-world knowledge database. The performance of the approach presented in this paper has been evaluated using the data provided in Task 1 of the SemEval 2014 competition, obtaining 79% accuracy.

FIRE | 2013

Two Models for the SMS-Based FAQ Retrieval Task of FIRE 2011

Darnes Vilariño; David Pinto; Saul León; Esteban Castillo; Mireya Tovar

In this paper we propose a normalization model in order to standardize the terms used in SMS. For this purpose, we use a statistical bilingual dictionary calculated on the basis of the IBM-4 model for determining the best translation for a given SMS term. In order to compare our proposal with another method of document retrieval, we submitted to the FIRE 2011 competition forum a second run, obtained by using a probabilistic information retrieval model that employs the same statistical dictionaries used by our normalization method.

CLEF (Working Notes) | 2014