Noé Alejandro Castro-Sánchez
Instituto Politécnico Nacional
Publication
Featured research published by Noé Alejandro Castro-Sánchez.
Mexican International Conference on Artificial Intelligence | 2012
Grigori Sidorov; Sabino Miranda-Jiménez; Francisco Viveros-Jiménez; Alexander F. Gelbukh; Noé Alejandro Castro-Sánchez; Francisco Velasquez; Ismael Díaz-Rangel; Sergio Suárez-Guerra; Alejandro Treviño; Juan Gordon
Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of machine learning algorithms. We experimented with Naive Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (Probability Factor of Affective use, PFA).
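To make the role of the n-gram size setting concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over word n-grams. It is an illustration only, not the experimental setup of the paper: the class name `NaiveBayes`, the whitespace tokenization, the add-one smoothing, and the four training tweets are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Toy multinomial Naive Bayes with add-one smoothing (assumed setup)."""

    def __init__(self, n=1):
        self.n = n                                  # n-gram size
        self.class_counts = Counter()               # documents per class
        self.feature_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            feats = ngrams(text.lower().split(), self.n)
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocabulary.update(feats)
        return self

    def predict(self, text):
        feats = ngrams(text.lower().split(), self.n)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(doc_count / total_docs)
            for feat in feats:
                score += math.log((counts[feat] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy tweets, for illustration only.
train_texts = [
    "me encanta este teléfono",
    "qué buen servicio",
    "odio este teléfono",
    "qué mal servicio",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes(n=1).fit(train_texts, train_labels)
print(model.predict("me encanta el servicio"))  # positive
print(model.predict("odio el servicio"))        # negative
```

Changing `n` from 1 to 2 switches the features from unigrams to bigrams, which is one of the settings the abstract says was varied.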
Applications of Natural Language to Data Bases | 2010
Noé Alejandro Castro-Sánchez; Grigori Sidorov
Because of the importance of verbs in language, identifying their actants (obligatory complements) is important for understanding the meaning of sentences. Usually, this problem is solved in natural language processing with machine learning approaches trained on large sets of tagged texts. We show that it is possible to work with another kind of source, namely explanatory dictionaries. Dictionary definitions have patterns that provide enough information for identifying actants. We develop a heuristic approach to obtain this information and an algorithm for the detection of actants in texts.
Mexican Conference on Pattern Recognition | 2011
Noé Alejandro Castro-Sánchez; Grigori Sidorov
In this paper we present an automatic method for extracting synonyms of verbs from an explanatory dictionary, based only on the hyponym/hyperonym relations between the verbs defined and the genus used in their definitions. The set of verb-genus pairs can be considered a directed graph, so we applied an algorithm to identify cycles in this kind of structure. We found that some cycles represent chains of synonyms. We obtain high precision and low recall.
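The graph view described in this abstract can be sketched as follows. The abstract does not say which cycle-detection algorithm was used, so this example substitutes Tarjan's strongly connected components algorithm: every component with more than one verb is a set of verbs that lie on a common cycle. The function name `synonym_candidates` and the verb-genus pairs are invented for illustration.

```python
def synonym_candidates(pairs):
    """Group verbs that lie on a common cycle of verb -> genus edges.

    Uses Tarjan's strongly connected components algorithm (an assumed
    stand-in for the cycle detection mentioned in the abstract).
    """
    graph = {}
    for verb, genus in pairs:
        graph.setdefault(verb, []).append(genus)
        graph.setdefault(genus, [])

    index, low = {}, {}
    stack, on_stack = [], set()
    groups = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                low[node] = min(low[node], low[nxt])
            elif nxt in on_stack:
                low[node] = min(low[node], index[nxt])
        if low[node] == index[node]:
            # node is the root of a component: pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:  # single nodes are not cycles
                groups.append(sorted(component))

    for node in list(graph):
        if node not in index:
            visit(node)
    return sorted(groups)


# Invented verb -> genus pairs, for illustration only.
pairs = [
    ("acabar", "terminar"),
    ("terminar", "concluir"),
    ("concluir", "acabar"),
    ("correr", "mover"),
    ("mover", "desplazar"),
]
print(synonym_candidates(pairs))  # [['acabar', 'concluir', 'terminar']]
```

The chain correr, mover, desplazar never returns to its start, so it yields no candidate group. The recursive `visit` is fine for a sketch, but a graph built from a full dictionary could exceed Python's recursion limit and would need an iterative version.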
Polibits | 2012
Noé Alejandro Castro-Sánchez; Grigori Sidorov
In this work we propose the application of symbolic methods for the extraction of the semantic valences of verbs, describing them under the Government Pattern concept of the Meaning-Text Theory. The method is based on the automatic processing of the definitions of verbs in explanatory dictionaries and the analysis of the semantic relationships, such as inclusion and synonymy, that hold among them. We believe that the lexicographic definitions of explanatory dictionaries supply enough information for identifying verb actants. The results show that even when a definition contains no information about the argument structure of a verb, that structure can be deduced by identifying and analyzing other definitions with which semantic relationships are established.
Mexican International Conference on Artificial Intelligence | 2016
Germán Ríos-Toledo; Grigori Sidorov; Noé Alejandro Castro-Sánchez; Alondra Nava-Zea; Liliana Chanona-Hernández
Named entities (NE) are words that refer to names of people, locations, organizations, etc. NE are present in every kind of document: e-mails, letters, essays, novels, poems. Automatic detection of these words is a very important task in natural language processing. Sometimes, NE are used in authorship attribution studies as a stylometric feature. The goal of this paper is to evaluate the effect of the presence of NE in texts on the authorship attribution task: are we really detecting the style of an author, or are we just discovering the appearance of the same NE? We used a corpus of 91 novels by 7 authors of the XVIII century. These authors spoke and wrote English, their native language. All novels belong to the fiction genre. The stylometric features used were character n-grams, word n-grams, and n-grams of POS tags of various sizes (2-grams, 3-grams, etc.). Five novels were selected for each author; these novels contain between 4% and 7% NE. All novels were divided into blocks of 10,000 terms each. Two kinds of experiments were conducted: automatic classification of blocks containing NE and of the same blocks without NE. In some cases, we use only the most frequent n-grams (500, 2,000, and 4,000 n-grams). Three machine learning algorithms were used for the classification task: NB, SVM (SMO), and J48. The results show that, as a tendency, the presence of NE helps classification (improvements from 5% to 20%), but for specific authors NE do not help and even make the classification worse (about 10% of the experimental data).
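The data preparation outlined in this abstract (fixed-size blocks, the same blocks with NE removed, n-gram counts, most frequent n-grams) can be sketched as below. This is a hedged illustration, not the authors' code: all function names are invented, NE are assumed to be given as a set of tokens, and dropping a short trailing block is an assumption the abstract does not confirm.

```python
from collections import Counter


def split_into_blocks(tokens, block_size=10_000):
    """Cut tokens into consecutive full blocks (a short tail is dropped: assumption)."""
    last_start = len(tokens) - block_size
    return [tokens[i:i + block_size] for i in range(0, last_start + 1, block_size)]


def remove_named_entities(tokens, named_entities):
    """Return the same token sequence without the tokens listed as NE."""
    return [tok for tok in tokens if tok not in named_entities]


def word_ngrams(tokens, n):
    """Count word n-grams, represented as tuples of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_ngrams(text, n):
    """Count character n-grams of a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def most_frequent(counts, k):
    """Keep only the k most frequent n-grams as features."""
    return [gram for gram, _ in counts.most_common(k)]


# Invented toy sentence and NE set, for illustration only.
tokens = "Emma went to London and Emma met Jane in London".split()
named_entities = {"Emma", "London", "Jane"}

with_ne = word_ngrams(tokens, 2)
without_ne = word_ngrams(remove_named_entities(tokens, named_entities), 2)
print(len(with_ne), len(without_ne))  # 9 4
```

Each block would be turned into one feature vector twice, once with and once without NE, so that the two classification runs differ only in the presence of NE.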
CLEI Electronic Journal | 2017
Roberto Villarejo-Martínez; Noé Alejandro Castro-Sánchez; Gerardo Sierra-Martinez
This paper describes the creation of two resources for the problem of double entendre and humour recognition in Mexican Spanish: a morphological dictionary and a semantic dictionary. These were created from two sources: a corpus of albures (drawn from “Antología del albur”) and a Mexican slang dictionary (“El chilangonario”). The morphological dictionary consists of 410 word forms that correspond to 350 lemmas. The semantic dictionary consists of 27 synsets that are associated with lemmas of the morphological dictionary. Since both resources are based on the Freeling library, they are easy to use in Natural Language Processing tasks. The motivation for this work comes from the need to address problems such as double entendre and computational humour. The usefulness of these disciplines has been discussed many times, and it has been shown that they have a direct impact on user interfaces and, mainly, on human-computer interaction. This work aims to encourage the scientific community to generate more resources on informal language in Spanish and other languages.
Revista Signos | 2015
Noé Alejandro Castro-Sánchez; Irasema Cruz Domínguez; Grigori Sidorov; Alicia Martínez Rebollar
Abstract: In this article we present a method for automatically identifying collocations in verb definitions extracted from the explanatory dictionary of the Real Academia Española (RAE). The aim is to show that collocations can be identified by applying simple heuristics that consider only semantic criteria in well-structured textual contexts, such as lexicographic definitions. The collocation candidates are characterized by being located at the beginning of the definitions and by the fact that the base of the candidate collocation belongs to the lexical family of the defined verb (1,347 cases). The word combinations obtained were evaluated semi-automatically, considering statistical and syntactic-semantic criteria. The evaluation showed that 61% of the word combinations extracted in this way are collocations, reaching a coverage of 36%. Keywords: collocations, phraseological units, explanatory dictionary, automatic collocation extraction.
Research on computing science | 2016
Felipe Ojeda-Cruz; Noé Alejandro Castro-Sánchez; Héctor Jiménez-Salazar
Research on computing science | 2016
Roberto Villarejo-Martínez; Noé Alejandro Castro-Sánchez
Research on computing science | 2015
Noé Alejandro Castro-Sánchez; Sadher Abelardo Vázquez-Cámara; Grigori Sidorov