Esteban Castillo

Benemérita Universidad Autónoma de Puebla

Publication

Featured research published by Esteban Castillo.

North American Chapter of the Association for Computational Linguistics | 2015

UDLAP: Sentiment Analysis Using a Graph-Based Representation

Esteban Castillo; Ofelia Cervantes; Darnes Vilariño; David Báez; J. Alfredo Sánchez

We present an approach for tackling the Sentiment Analysis problem in SemEval 2015. The approach is based on the use of a cooccurrence graph to represent existing relationships among terms in a document with the aim of using centrality measures to extract the most representative words that express the sentiment. These words are then used in a supervised learning algorithm as features to obtain the polarity of unknown documents. The best results obtained for the different datasets are: 77.76% for positive, 100% for negative and 68.04% for neutral, showing that the proposed graph-based representation could be a way of extracting terms that are relevant to detect a sentiment.
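
The graph step described in this abstract can be illustrated with a minimal sketch. It assumes a sliding window of following tokens to define cooccurrence and uses degree centrality as the ranking measure; the abstract does not say which window size or centrality measures the authors used, so `window`, `cooccurrence_graph`, `degree_centrality`, and `top_terms` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=1):
    """Link each term to the `window` terms that follow it (undirected)."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:  # ignore self-loops
                graph[term].add(other)
                graph[other].add(term)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def top_terms(tokens, k=3, window=1):
    """Return the k most central terms, breaking ties alphabetically."""
    scores = degree_centrality(cooccurrence_graph(tokens, window))
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Made-up example sentence, not taken from the SemEval data.
    tokens = "the movie was great and the cast was great".split()
    print(top_terms(tokens, k=2))
```

In a pipeline like the one the abstract outlines, the selected terms would then serve as features for a supervised classifier; that step is omitted here.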

International Conference on Electronics, Communications, and Computers | 2015

Author attribution using a graph based representation

Esteban Castillo; Darnes Vilariño; Ofelia Cervantes; David Pinto

Authorship attribution is the task of determining the real author of a given anonymous document. Even though different approaches exist in the literature, this problem has barely been addressed using document representations that employ graph structures. Most research works in the literature handle this problem by employing simple sequences of n words (n-grams), such as bigrams and trigrams. In this paper, an exploration of the use of graphs for representing document sentences is presented. These structures are used in experiments on the authorship attribution problem. The experiments presented here attain approximately 79% accuracy, showing that the graph-based representation could be a way of encapsulating various levels of natural language description in a simple structure.

Mexican Conference on Pattern Recognition | 2012

A machine-translation method for normalization of SMS

Darnes Vilariño; David Pinto; Beatriz Beltrán; Saul León; Esteban Castillo; Mireya Tovar

Normalization of SMS is a very important task that must be addressed by the computational community because of the tremendous growth of services based on mobile devices, which make use of this kind of message. There are many limitations on the automatic treatment of SMS texts, derived from the particular writing style used. Although dealing with this kind of text is already problematic enough, we are also interested in tasks that require understanding the meaning of documents in different languages, which increases the complexity of such tasks. Our approach proposes to normalize SMS texts employing machine translation techniques. For this purpose, we use a statistical bilingual dictionary calculated on the basis of the IBM-4 model for determining the best translation for a given SMS term. We have compared the presented approach with a traditional probabilistic method of information retrieval, observing that the normalization model proposed here greatly improves on the performance of the probabilistic one.
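
As a rough illustration of dictionary-based normalization, the sketch below picks, for each SMS term, the candidate with the highest translation probability. The dictionary entries and their probabilities are invented for the example; in the paper they would come from an IBM-4 alignment model, which is not reproduced here. The function name `normalize` is likewise hypothetical.

```python
def normalize(sms, dictionary):
    """Replace each term with its most probable translation, if one is known."""
    normalized = []
    for term in sms.lower().split():
        candidates = dictionary.get(term)
        if candidates:
            # Choose the candidate with the highest translation probability.
            normalized.append(max(candidates, key=candidates.get))
        else:
            # Unknown terms are passed through unchanged.
            normalized.append(term)
    return " ".join(normalized)


if __name__ == "__main__":
    # Hypothetical probabilities, for illustration only.
    toy_dictionary = {
        "u": {"you": 0.9, "your": 0.1},
        "2moro": {"tomorrow": 0.95, "today": 0.05},
    }
    print(normalize("see u 2moro", toy_dictionary))
```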

North American Chapter of the Association for Computational Linguistics | 2016

UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation

Esteban Castillo; Ofelia Cervantes; Darnes Vilariño; David Báez

We present an approach for tackling the tweet quantification problem in SemEval 2016. The approach is based on the creation of a cooccurrence graph per sentiment from the training dataset and a graph per topic from the test dataset, with the aim of comparing each topic graph against the sentiment graphs and evaluating the similarity between them. A heuristic is applied to those similarities to calculate the percentage of positive and negative texts. The overall result obtained for the test dataset according to the proposed task score (KL divergence) is 0.261, showing that the graph-based representation and heuristic could be a way of quantifying the percentage of tweets that are positive and negative in a given set of texts about a topic.
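
For readers unfamiliar with the score mentioned in this abstract, the sketch below computes the KL divergence between a true and a predicted class distribution; lower values mean the predicted proportions are closer to the true ones. The additive smoothing constant `eps` is an assumption made so that zero predictions do not cause a division by zero, and may differ from what the official task scorer does. The example proportions are invented.

```python
import math


def kl_divergence(true_dist, pred_dist, eps=1e-9):
    """KL(true || pred) over class proportions, with additive smoothing."""
    classes = sorted(set(true_dist) | set(pred_dist))

    def smoothed(dist):
        # Add eps to every class and renormalize so the values sum to 1.
        total = sum(dist.get(c, 0.0) for c in classes) + eps * len(classes)
        return {c: (dist.get(c, 0.0) + eps) / total for c in classes}

    p, q = smoothed(true_dist), smoothed(pred_dist)
    return sum(p[c] * math.log(p[c] / q[c]) for c in classes)


if __name__ == "__main__":
    # Hypothetical proportions for one topic.
    true_dist = {"positive": 0.5, "negative": 0.5}
    pred_dist = {"positive": 0.25, "negative": 0.75}
    print(round(kl_divergence(true_dist, pred_dist), 4))
```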

International Workshop of the Initiative for the Evaluation of XML Retrieval | 2011

BUAP: A Recursive Approach to the Data-Centric Track of INEX 2011

Darnes Vilariño Ayala; David Pinto; Saúl León Silverio; Esteban Castillo; Mireya Tovar Vidal

A recursive approach for keyword search on XML data for the Ad-Hoc Search Task of INEX 2011 is presented in this paper. The aim of this approach was to detect the concrete part (in the representation tree) of the XML document containing the expected answer. For this purpose, we initially obtain a tree structure, which represents an XML document, tagged by levels. A typical search engine based on posting lists is used in order to determine those documents that match to some degree the terms appearing in the given query (topic). Thereafter, in a recursive process, we navigate into the tree structure until we find the best match for the topic. The obtained results are shown and compared with the best overall submission score obtained in the competition.
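
The recursive navigation described in this abstract might look roughly like the following sketch, which descends into an XML tree for as long as some child still covers every query term matched by its parent. The scoring function (a count of distinct query terms present in the subtree text) and the names `score` and `best_match` are assumptions for illustration; the abstract does not specify how the authors ranked subtrees, and the posting-list retrieval stage is left out.

```python
import xml.etree.ElementTree as ET


def score(element, terms):
    """Count how many query terms occur in the text of this subtree."""
    words = " ".join(element.itertext()).lower().split()
    return sum(1 for term in terms if term in words)


def best_match(element, terms):
    """Return the deepest element that keeps all terms matched by its parent."""
    current = score(element, terms)
    for child in element:
        if current > 0 and score(child, terms) == current:
            # This child alone accounts for every match, so narrow the search.
            return best_match(child, terms)
    return element


if __name__ == "__main__":
    # Made-up document, not taken from the INEX collection.
    root = ET.fromstring(
        "<movie><title>Alien</title><cast>"
        "<actor>Sigourney Weaver</actor><actor>Tom Skerritt</actor>"
        "</cast></movie>"
    )
    print(best_match(root, ["sigourney", "weaver"]).tag)
```

When the matching terms are spread across sibling subtrees, no single child retains the full score and the function returns the parent element instead.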

FIRE | 2013

Two Models for the SMS-Based FAQ Retrieval Task of FIRE 2011

Darnes Vilariño; David Pinto; Saul León; Esteban Castillo; Mireya Tovar

In this paper we propose a normalization model in order to standardize the terms used in SMS. For this purpose, we use a statistical bilingual dictionary calculated on the basis of the IBM-4 model for determining the best translation for a given SMS term. In order to compare our proposal with another method of document retrieval, we submitted to the FIRE 2011 competition forum a second run, obtained using a probabilistic information retrieval model that employs the same statistical dictionaries used by our normalization method.

CLEF (Working Notes) | 2014