Beatriz Beltrán

Benemérita Universidad Autónoma de Puebla

Publication

Featured research published by Beatriz Beltrán.

INEX'09 Proceedings of the Focused retrieval and evaluation, and 8th international conference on Initiative for the evaluation of XML retrieval | 2009

BUAP: performance of K-Star at the INEX'09 clustering task

David Pinto; Mireya Tovar; Darnes Vilariño; Beatriz Beltrán; Héctor Jiménez-Salazar; Basilia Campos

The aim of this paper is to use unsupervised classification techniques in order to group the documents of a given huge collection into clusters. We approached this challenge by using a simple clustering algorithm (K-Star) in a recursive clustering process over subsets of the complete collection. The presented approach is a scalable algorithm which may automatically discover the number of clusters. The obtained results outperformed different baselines presented in the INEX 2009 clustering task.

mexican conference on pattern recognition | 2014

Use of Lexico-Syntactic Patterns for the Evaluation of Taxonomic Relations

Mireya Tovar; David Pinto; Azucena Montes; Gabriel González; Darnes Vilariño; Beatriz Beltrán

In this paper we present an approach for the evaluation of taxonomic relations of restricted domain ontologies. We use the evidence found in corpora associated with the ontology domain to determine the validity of the taxonomic relations. Our approach employs lexico-syntactic patterns for evaluating taxonomic relations in which the concepts are totally different, and it uses a particular technique based on subsumption for those relations in which one concept is completely included in the other one. The integration of these two techniques has allowed us to automatically evaluate taxonomic relations for two ontologies of restricted domain. The performance obtained was about 70% for one ontology of the e-learning domain, whereas we obtained around 88% for the ontology associated with the artificial intelligence domain.
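As a rough illustration of how lexico-syntactic patterns can provide corpus evidence for a taxonomic (is-a) relation, the sketch below counts sentences that match a few Hearst-style templates for a given concept pair. The templates, the function name `evidence`, and the sample sentences are assumptions made for this example; the abstract does not list the patterns the authors actually used.

```python
import re

# Hypothetical Hearst-style templates (assumed for illustration, not taken
# from the paper). {hypo} is the narrower concept, {hyper} the broader one.
TEMPLATES = [
    r"\b{hypo}s? (?:is|are) (?:an? )?(?:kind|type)s? of {hyper}",
    r"\b{hypo}s? (?:is|are) an? {hyper}",
    r"\b{hyper}s? such as (?:\w+, )*{hypo}",
    r"\b{hypo}s? and other {hyper}",
]


def evidence(hypo, hyper, sentences):
    """Count the sentences in which at least one template matches the pair."""
    patterns = [
        re.compile(
            t.format(hypo=re.escape(hypo), hyper=re.escape(hyper)),
            re.IGNORECASE,
        )
        for t in TEMPLATES
    ]
    return sum(1 for s in sentences if any(p.search(s) for p in patterns))


# Toy corpus, invented for the example.
SENTENCES = [
    "A neural network is a kind of machine learning model.",
    "Machine learning models such as neural network are popular.",
    "Puebla is a city.",
]
```

A relation could then be accepted when its evidence count reaches some threshold; the choice of threshold is likewise an assumption and is not stated in the abstract.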

international conference on computational linguistics | 2014

BUAP: Evaluating Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment

Saul León; Darnes Vilariño; David Pinto; Mireya Tovar; Beatriz Beltrán

The results obtained by the BUAP team at Task 1 of SemEval 2014 are presented in this paper. The submitted run is a supervised approach based on two classification models: 1) we used logistic regression to determine the semantic relatedness between a pair of sentences, and 2) we employed support vector machines to identify the degree of textual entailment between the two sentences. The second subtask (textual entailment) obtained much better performance than the first subtask (relatedness), ranking our approach 7th among the 18 teams that participated in the competition.

international conference on computational linguistics | 2014

BUAP: Evaluating Features for Multilingual and Cross-Level Semantic Textual Similarity

Darnes Vilariño; David Pinto; Saul León; Mireya Tovar; Beatriz Beltrán

In this paper we present the evaluation of different features for multilingual and cross-level semantic textual similarity. Three different types of features were used: lexical, knowledge-based and corpus-based. The results obtained at the SemEval competition rank our approaches above the average of the rest of the teams, highlighting the usefulness of the features presented in this paper.

mexican conference on pattern recognition | 2012

A machine-translation method for normalization of SMS

Darnes Vilariño; David Pinto; Beatriz Beltrán; Saul León; Esteban Castillo; Mireya Tovar

Normalization of SMS is a very important task that must be addressed by the computational community because of the tremendous growth of services based on mobile devices, which make use of this kind of message. There exist many limitations on the automatic treatment of SMS texts derived from the particular writing style used. Even though dealing with this kind of text already poses sufficient problems, we are also interested in tasks that require understanding the meaning of documents in different languages, which increases the complexity of such tasks. Our approach proposes to normalize SMS texts employing machine translation techniques. For this purpose, we use a statistical bilingual dictionary calculated on the basis of the IBM-4 model for determining the best translation for a given SMS term. We have compared the presented approach with a traditional probabilistic method of information retrieval, observing that the normalization model proposed here highly improves the performance of the probabilistic one.

mexican conference on pattern recognition | 2010

A Naïve bayes approach to cross-lingual word sense disambiguation and lexical substitution

David Pinto; Darnes Vilariño; Carlos Balderas; Mireya Tovar; Beatriz Beltrán

Word Sense Disambiguation (WSD) is considered one of the most important problems in Natural Language Processing [1]. It is claimed that WSD is essential for those applications that require language comprehension modules such as search engines, machine translation systems, automatic answer machines, second life agents, etc. Moreover, with the huge amounts of information on the Internet and the fact that this information is continuously growing in different languages, we are encouraged to deal with cross-lingual scenarios where WSD systems are also needed. On the other hand, Lexical Substitution (LS) refers to the process of finding a substitute word for a source word in a given sentence. The LS task needs to be approached by first disambiguating the source word; therefore, these two tasks (WSD and LS) are somehow related. In this paper, we present a naive approach to tackle the problem of cross-lingual WSD and cross-lingual lexical substitution. We use a bilingual statistical dictionary, which is calculated with Giza++ by using the EUROPARL parallel corpus, in order to calculate the probability of a source word being translated to a target word (which is assumed to be the correct sense of the source word but in a different language). Two versions of the probabilistic model are tested: unweighted and weighted. The results were compared with those of an international competition, obtaining a good performance.

mexican conference on pattern recognition | 2015

A Graph-Based Textual Entailment Method Aware of Real-World Knowledge

Saul León; Darnes Vilariño; David Pinto; Mireya Tovar; Beatriz Beltrán

In this paper we propose an unsupervised methodology to solve the textual entailment task that extracts facts associated with a pair of sentences. Those extracted facts are represented as a graph. Then, the two graph-based representations of the two sentences may be further compared in order to determine the type of textual entailment judgment that they hold. The comparison method is based on graph-based algorithms for finding sub-graph structures inside another graph, but generalizing the concepts by means of a real-world knowledge database. The performance of the approach presented in this paper has been evaluated using the data provided in Task 1 of the SemEval 2014 competition, obtaining 79% accuracy.

mexican conference on pattern recognition | 2011

Use of elliptic curves in term discrimination

Darnes Vilariño; David Pinto; Carlos Balderas; Mireya Tovar; Beatriz Beltrán; Sofía Paniagua

Detection of discriminant terms allows us to improve the performance of natural language processing systems. The goal is to be able to find the possible term contribution in a given corpus and, thereafter, to use the terms of high contribution for representing the corpus. In this paper we present various experiments that use elliptic curves with the purpose of discovering discriminant terms of a given textual corpus. Different experiments led us to use the mean and variance of the corpus terms for determining the parameters of a Weierstrass reduced equation (elliptic curve). We use the elliptic curves in order to graphically visualize the behavior of the corpus vocabulary. Thereafter, we use the elliptic curve parameters in order to cluster those terms that share characteristics. These clusters are then used as discriminant terms in order to represent the original document collection. Finally, we evaluated all these corpus representations in order to determine those terms that best discriminate each document.
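For reference, the reduced (short) Weierstrass form of an elliptic curve and its non-singularity condition are usually written as below. The abstract does not say how the mean and variance of the corpus terms are mapped to the parameters a and b, so that mapping is left unspecified here.

```latex
y^2 = x^3 + a x + b, \qquad \Delta = -16\,(4a^3 + 27b^2) \neq 0
```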

mexican international conference on artificial intelligence | 2010

A probabilistic model based on n-grams for bilingual word sense disambiguation

Darnes Vilariño; David Pinto; Mireya Tovar; Carlos Balderas; Beatriz Beltrán

Word Sense Disambiguation (WSD) is considered one of the most important problems in Natural Language Processing. The problem of WSD is difficult in itself, and its bilingual version is much more complex. In this case, it is necessary not only to find the correct translation, but this translation must also consider the contextual senses of the original sentence (in a source language), in order to find the correct sense (in the target language) of the source word. In this paper we propose a model based on n-grams (3-grams and 5-grams) that significantly outperforms the previous results that we presented at the cross-lingual word sense disambiguation task at the SemEval-2 forum. We use a naive Bayes classifier for determining the probability of a target sense (in a target language) given a sentence which contains the ambiguous word (in a source language). For this purpose, we use a bilingual statistical dictionary, which is calculated with Giza++ by using the EUROPARL parallel corpus, in order to determine the probability of a source word being translated to a target word (which is assumed to be the correct sense of the source word but in a different language). As we mentioned, the results were compared with those of an international competition, obtaining a good performance.
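The naive Bayes scoring described in the abstract above can be sketched as follows, under strong simplifying assumptions. The probability table stands in for a bilingual statistical dictionary and its values are invented; the uniform prior, the smoothing constant, and the names `P_SOURCE_GIVEN_TARGET` and `best_sense` are also assumptions of this example rather than details taken from the paper.

```python
import math

# Toy stand-in for a bilingual statistical dictionary: P(source word | target
# sense). All probabilities are invented for illustration.
P_SOURCE_GIVEN_TARGET = {
    "banco": {"bank": 0.5, "money": 0.3, "river": 0.01},
    "orilla": {"bank": 0.3, "river": 0.4, "money": 0.01},
}
SMOOTHING = 1e-3  # assumed probability for word pairs missing from the table


def best_sense(context_words, candidates):
    """Return the candidate target sense with the highest naive Bayes score.

    score(t) = log P(t) + sum_i log P(w_i | t), with a uniform prior P(t).
    """
    log_prior = math.log(1.0 / len(candidates))
    scores = {}
    for target in candidates:
        table = P_SOURCE_GIVEN_TARGET.get(target, {})
        scores[target] = log_prior + sum(
            math.log(table.get(word, SMOOTHING)) for word in context_words
        )
    return max(scores, key=scores.get)
```

With this toy table, a context window containing "money" favors "banco" for the ambiguous word "bank", while a window containing "river" favors "orilla". The n-gram windows (3-grams and 5-grams) mentioned in the abstract would correspond to how `context_words` is selected around the ambiguous word.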

INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval | 2010

The BUAP participation at the web service discovery track of INEX 2010

María J. Somodevilla; Beatriz Beltrán; David Pinto; Darnes Vilariño; José Cruz Aaron

A first approach to web service discovery based on techniques from Information Retrieval (IR), Natural Language Processing (NLP) and XML Retrieval was developed in order to use texts contained in WSDL files. It calculates the degree of similarity between words and their relative importance to support the task of web service discovery. The first algorithm uses the information contained in the WSDL (Web Service Description Language) specifications and clusters web services based on their similarity. A second approach, based on an information retrieval system that indexes terms by using an inverted index structure, was also used. Both algorithms are applied in order to evaluate 25 topics in a set of 1947 real web services (all of them provided by INEX).
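The inverted index mentioned in the abstract above can be illustrated with a minimal sketch. It maps each term to the set of documents containing it and ranks documents by the number of query terms they match. The tokenizer, the ranking rule, the function names, and the sample service descriptions are all assumptions made for the example, not the system described in the paper.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many distinct query terms they contain."""
    hits = Counter()
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    # Sort by descending match count, then by id for a stable order.
    return [doc_id for doc_id, _ in sorted(hits.items(), key=lambda x: (-x[1], x[0]))]


# Hypothetical text extracted from service descriptions.
DOCS = {
    "weather.wsdl": "get weather forecast by city",
    "currency.wsdl": "convert currency rate",
    "geo.wsdl": "get city coordinates",
}
INDEX = build_index(DOCS)
```

Querying `search(INDEX, "city weather")` ranks the weather service first because it matches both query terms.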