Gabriela Ramírez-de-la-Rosa

Publication

Featured research published by Gabriela Ramírez-de-la-Rosa.

Mexican International Conference on Artificial Intelligence | 2014

Towards Automatic Detection of User Influence in Twitter by Means of Stylistic and Behavioral Features

Gabriela Ramírez-de-la-Rosa; Esaú Villatoro-Tello; Héctor Jiménez-Salazar; Christian Sánchez-Sánchez

Online communities are filled with comments from loyal readers and first-time viewers who constantly create and share information at an unprecedented level, resulting in millions of messages containing the opinions, ideas, needs and beliefs of Internet users. Businesses are therefore very interested in finding influential users and encouraging them to create positive influence. Influential users are those able to influence other individuals’ attitudes in a desired way with relative frequency. We present an empirical analysis of the influential user identification problem in Twitter. Our proposed approach assumes that a user’s level of influence can be detected from their communication patterns, by means of particular writing style features as well as behavioral features. Experiments on more than 7000 user profiles indicate that it is possible to automatically identify influential users among the members of a social networking community, and that the approach obtains competitive results against several state-of-the-art methods.

Information Retrieval Journal | 2018

Retrieving and classifying instances of source code plagiarism

Debasis Ganguly; Gareth J. F. Jones; Aarón Ramírez-de-la-Cruz; Gabriela Ramírez-de-la-Rosa; Esaú Villatoro-Tello

Automatic detection of source code plagiarism is an important research field for both the commercial software industry and the research community. Existing methods of plagiarism detection primarily involve exhaustive pairwise document comparison, which does not scale well to large software collections. To achieve scalability, we approach the problem from an information retrieval (IR) perspective. We retrieve a ranked list of candidate documents in response to a pseudo-query representation constructed from each source code document in the collection. The challenge in source code document retrieval is that the standard bag-of-words (BoW) representation model for such documents is likely to result in many false positives being retrieved, because of the use of identical programming-language-specific constructs and keywords. To address this problem, we make use of an abstract syntax tree (AST) representation of the source code documents. While the IR approach is efficient, it is essentially unsupervised in nature. To further improve its effectiveness, we apply a supervised classifier (pre-trained with features extracted from sample plagiarized source code pairs) to the top-ranked retrieved documents. We report experiments on the SOCO-2014 dataset comprising 12K Java source files with almost 1M lines of code. Our experiments confirm that the AST-based approach produces significantly better retrieval effectiveness than a standard BoW representation, i.e., the AST-based approach is able to identify a higher number of plagiarized source code documents at top ranks in response to a query source code document. The supervised classifier, trained on features extracted from sample plagiarized source code pairs, is shown to effectively filter and thus further improve the ranked list of retrieved candidate plagiarized documents.
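
To make the contrast between a token-level and a structural representation concrete, here is a minimal sketch. It is not the authors' method: the paper works on Java files, whereas this example uses Python's standard `ast` module, and the "bag of AST node types" feature, the function names and the two toy programs are all assumptions chosen for illustration. It shows only one property of a structural view, namely that it is unaffected by identifier renaming.

```python
import ast
import re
from collections import Counter
from math import sqrt


def bow_tokens(source: str) -> Counter:
    """Bag of words over the identifiers and keywords found in the raw text."""
    return Counter(re.findall(r"[A-Za-z_]\w*", source))


def ast_node_types(source: str) -> Counter:
    """Bag of AST node-type names; identifier spellings are ignored.
    (Illustrative feature only, not the representation used in the paper.)"""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy programs: the second is the first with every identifier renamed.
original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def acc(items):\n    r = 0\n    for i in items:\n        r += i\n    return r\n"

bow_sim = cosine(bow_tokens(original), bow_tokens(renamed))
ast_sim = cosine(ast_node_types(original), ast_node_types(renamed))
print(f"BoW similarity: {bow_sim:.2f}, AST node-type similarity: {ast_sim:.2f}")
```

In this toy case the two programs share only language keywords at the token level, while their node-type bags are identical, so the structural similarity is higher than the token similarity.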

Text, Speech and Dialogue | 2016

From Dialogue Corpora to Dialogue Systems: Generating a Chatbot with Teenager Personality for Preventing Cyber-Pedophilia

Ángel Callejas-Rodríguez; Esaú Villatoro-Tello; Ivan Meza; Gabriela Ramírez-de-la-Rosa

A conversational agent, also known as a chatbot, is a machine conversational system that interacts with human users via natural language. Traditionally, chatbot technology has been built on a set of manually crafted conversational rules. However, given the availability of large, real examples of human interactions on the web, automatically generating these rules is becoming a more feasible option. In this paper we describe an approach for building and training a conversational agent that has a teenager’s personality and is able to converse in Mexican Spanish. By means of this chatbot we aim to assist law enforcement officers in the prevention of cyber-pedophilia. Our experiments demonstrate that the chatbot is able to produce lexical and syntactic constructions comparable to those a teenager would produce. As an additional contribution, we compile and release a large dialogue corpus containing real examples of conversations among teenagers.

Mexican International Conference on Artificial Intelligence | 2016

A Compact Representation for Cross-Domain Short Text Clustering

Alba Núñez-Reyes; Esaú Villatoro-Tello; Gabriela Ramírez-de-la-Rosa; Christian Sánchez-Sánchez

Nowadays, Twitter is a rich source of on-line reviews, ratings, recommendations, and other forms of opinion expression. This scenario has created a compelling demand for innovative mechanisms to store, search, organize and analyze all this data automatically. Unfortunately, enough labeled Twitter data is seldom available, either because of the cost of the labeling process or because labels are impossible to obtain given the rapid growth and change of this kind of media. To avoid such limitations, unsupervised categorization strategies are employed. In this paper we address the problem of cross-domain short text clustering through a compact representation that avoids the problems that arise from the high dimensionality and sparseness of the vocabulary. Our experiments, conducted on a cross-domain scenario using very short texts, indicate that the proposed representation yields high-quality groups, according to the value of the Silhouette coefficient obtained.
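
For readers unfamiliar with the Silhouette coefficient mentioned above, the following is a minimal sketch of how it can be computed. The Euclidean distance, the function name `silhouette` and the four toy points are assumptions made for illustration; the abstract does not state which distance or implementation the authors used. For each point, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster; the point's score is (b - a) / max(a, b).

```python
from math import dist
from statistics import mean


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).
    Requires at least two distinct cluster labels."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so it does not affect the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(p, q) for q in other)
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Hypothetical 2-D points forming two well-separated pairs.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])  # grouping that matches the geometry
bad = silhouette(points, [0, 1, 0, 1])   # grouping that cuts across it
print(f"good grouping: {good:.2f}, bad grouping: {bad:.2f}")
```

Values close to 1 indicate compact, well-separated groups, while values below 0 indicate that points sit closer to another group than to their own.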

Ibero-American Conference on Artificial Intelligence | 2016

Enhancing Semi-supervised Text Classification Using Document Summaries

Esaú Villatoro-Tello; Emmanuel Anguiano; Manuel Montes-y-Gómez; Luis Villaseñor-Pineda; Gabriela Ramírez-de-la-Rosa

The vast amount of electronic documents available on the Internet demands automatic tools that help people find, organize and easily access all this information. Although current text classification methods have alleviated some of these problems, such strategies depend on having a large and reliable set of labeled data. To overcome this limitation, this work proposes an alternative approach to semi-supervised text classification, based on a new strategy that uses automatic text summarization to reduce sensitivity to the noise contained in labeled data. Experimental results show that our proposed approach outperforms traditional semi-supervised text classification techniques; additionally, our results indicate that our approach is suitable for learning from only one labeled example per category.

Mexican International Conference on Artificial Intelligence | 2015

The Role of n-grams in Firstborns Identification

Gabriela Ramírez-de-la-Rosa; Verónica Reyes-Meza; Esaú Villatoro-Tello; Héctor Jiménez-Salazar; Manuel Montes-y-Gómez; Luis Villaseñor-Pineda

Psychologists have long theorized about the effects of birth order on intellectual development and verbal abilities. Several studies within the field of psychology have tried to prove such theories; however, no concrete evidence has been found yet. In this paper we therefore present an empirical analysis of the pertinence of traditional Author Profiling techniques, re-formulating the problem of identifying the language abilities developed by firstborns as a classification problem. In particular, we measure the importance of lexical and syntactic features extracted from a set of 129 speech transcriptions, which were gathered from videos of approximately three minutes each. The results obtained indicate that both bag-of-words n-grams and bag-of-part-of-speech n-grams provide useful information for accurately characterizing the language properties employed by firstborns and later-borns. Consequently, our experiments helped to validate the presence of distinct language abilities among firstborns and later-borns.
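
As a rough illustration of the two feature families named in this abstract, the sketch below counts word n-grams and part-of-speech n-grams over the same utterance. The sentence, its hand-assigned coarse POS tags and the helper name `bag_of_ngrams` are hypothetical; the abstract does not say which tagger, tag set or n-gram sizes the authors used.

```python
from collections import Counter


def bag_of_ngrams(sequence, n):
    """Count contiguous n-grams in a sequence of tokens or tags."""
    return Counter(tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))


# Hypothetical transcript fragment, tagged by hand with coarse POS labels.
tagged = [("i", "PRON"), ("like", "VERB"), ("my", "DET"), ("brother", "NOUN"),
          ("and", "CONJ"),
          ("i", "PRON"), ("like", "VERB"), ("my", "DET"), ("dog", "NOUN")]

words = [word for word, _ in tagged]
tags = [tag for _, tag in tagged]

word_bigrams = bag_of_ngrams(words, 2)  # lexical features
pos_bigrams = bag_of_ngrams(tags, 2)    # syntactic features

print(word_bigrams.most_common(2))
print(pos_bigrams.most_common(2))
```

The resulting counts could then be fed to any standard classifier as feature vectors. The POS n-grams abstract away from the specific vocabulary, which is why they are usually treated as a syntactic rather than a lexical signal.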

FIRE Workshops | 2015

High Level Features for Detecting Source Code Plagiarism across Programming Languages.

Aarón Ramírez-de-la-Cruz; Gabriela Ramírez-de-la-Rosa; Christian Sánchez-Sánchez; Héctor Jiménez-Salazar; Carlos Rodríguez-Lucatero; Wulfrano Arturo Luna-Ramírez

CLEF (Working Notes) | 2014

UAMCLyR at RepLab 2014: Author Profiling Task.

Esaú Villatoro-Tello; Gabriela Ramírez-de-la-Rosa; Christian Sánchez-Sánchez; Héctor Jiménez-Salazar; Wulfrano Arturo Luna-Ramírez; Carlos Rodríguez-Lucatero

Journal of Intelligent and Fuzzy Systems | 2018