María del Pilar Salas-Zárate

Publication

Featured research published by María del Pilar Salas-Zárate.

Journal of Information Science | 2017

Feature-based opinion mining in financial news: An ontology-driven approach

María del Pilar Salas-Zárate; Rafael Valencia-García; Antonio Ruiz-Martínez; Ricardo Colomo-Palacios

Financial news plays a significant role with regard to predicting the behaviour of financial markets. However, the exponential growth of financial news on the Web has led to a need for new technologies that automatically collect and categorise large volumes of information in a fast and easy manner. Sentiment analysis, or opinion mining, is the field of study that analyses people’s opinions, moods and evaluations using written text on Web platforms. In recent research, a substantial effort has been made to develop sophisticated methods with which to classify sentiments in the financial domain. However, there is a lack of approaches that analyse the positive or negative orientation of each aspect contained in a document. In this respect, we propose a new sentiment analysis method for feature and news polarity classification. The method presented is based on an ontology-driven approach that makes it possible to semantically describe relations between concepts in the financial news domain. The polarity of the features in each document is also calculated by taking into account the words from around the linguistic expression of the feature. These words are obtained by using the ‘N_GRAM After’, ‘N_GRAM Before’, ‘N_GRAM Around’ and ‘All_Phrase’ methods. The effectiveness of our method has been proved by carrying out a set of experiments on a corpus of 1000 financial news items. Our proposal obtained encouraging results with an accuracy of 66.7% and an F-measure of 64.9% for feature polarity classification and an accuracy of 89.8% and an F-measure of 89.7% for news polarity classification. The experimental results additionally show that the N_GRAM Around method provides the best average results.
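The window-based idea in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: the toy lexicon, the function names, and the sign-of-sum scoring rule are hypothetical and are not taken from the paper, and `all_phrase` is read here simply as "every other token in the phrase".

```python
from typing import List

# Hypothetical toy sentiment lexicon (an assumption, not the paper's resource).
LEXICON = {"rise": 1, "gain": 1, "strong": 1, "fall": -1, "loss": -1, "weak": -1}


def context_window(tokens: List[str], index: int, n: int, mode: str) -> List[str]:
    """Return the context tokens of tokens[index] for the given mode."""
    before = tokens[max(0, index - n):index]      # up to n tokens before the feature
    after = tokens[index + 1:index + 1 + n]       # up to n tokens after the feature
    if mode == "before":
        return before
    if mode == "after":
        return after
    if mode == "around":
        return before + after
    if mode == "all_phrase":
        return tokens[:index] + tokens[index + 1:]
    raise ValueError(f"unknown mode: {mode}")


def feature_polarity(tokens: List[str], feature: str, n: int = 2, mode: str = "around") -> int:
    """Return 1, -1 or 0 from the summed lexicon scores of the context words."""
    index = tokens.index(feature)                 # first occurrence of the feature
    score = sum(LEXICON.get(t, 0) for t in context_window(tokens, index, n, mode))
    return (score > 0) - (score < 0)


tokens = "strong gain in shares despite weak loss in bonds".split()
print(feature_polarity(tokens, "shares", mode="before"))  # 1
print(feature_polarity(tokens, "shares", mode="after"))   # -1
print(feature_polarity(tokens, "shares", mode="around"))  # 0
```

In this toy sentence the three window modes give different polarities for the same feature, which is why the choice of window is worth evaluating separately.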

Journal of Information Science | 2014

A study on LIWC categories for opinion mining in Spanish reviews

María del Pilar Salas-Zárate; Estanislao López-López; Rafael Valencia-García; Nathalie Aussenac-Gilles; Ángela Almela; Giner Alor-Hernández

With the exponential growth of social media, that is, blogs and social networks, organizations and individuals are increasingly relying on the reviews published in these media when making decisions about a product or service. Opinion mining detects whether the emotions of an opinion expressed by a user on Web platforms in natural language are positive or negative. This paper presents extensive experiments to study the effectiveness of the classification of Spanish opinions in five categories: highly positive, highly negative, positive, negative and neutral, using the combination of the psychological and linguistic features of LIWC (Linguistic Inquiry and Word Count). LIWC is text analysis software that enables the extraction of different psychological and linguistic features from natural language text. For this study, two corpora have been used, one about movies and one about technological products. Furthermore, we conducted a comparative assessment of the performance of various classification techniques, J48, SMO and BayesNet, using precision, recall and F-measure metrics. The findings revealed that the positive and negative categories provide better results than the other categories. Finally, experiments on both corpora indicated that SMO produces better results than the BayesNet and J48 algorithms, obtaining F-measures of 90.4% and 87.2% in the two domains.
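For readers unfamiliar with the evaluation metrics named in this abstract, the following minimal sketch computes per-class precision, recall and F-measure from gold and predicted labels. The function name and the example labels are assumptions made for illustration; the study itself reports using J48, SMO and BayesNet, not this code.

```python
from typing import Dict, List, Tuple


def per_class_prf(gold: List[str], predicted: List[str]) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
        fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
        fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F-measure is the harmonic mean of precision and recall.
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        scores[label] = (precision, recall, f_measure)
    return scores


gold = ["positive", "positive", "negative", "neutral"]
predicted = ["positive", "negative", "negative", "neutral"]
for label, (p, r, f) in per_class_prf(gold, predicted).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```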

Computational and Mathematical Methods in Medicine | 2017

Sentiment Analysis on Tweets about Diabetes: An Aspect-Level Approach

María del Pilar Salas-Zárate; José Medina-Moreira; Katty Lagos-Ortiz; Harry Luna-Aveiga; Miguel Ángel Rodríguez-García; Rafael Valencia-García

In recent years, some methods of sentiment analysis have been developed for the health domain; however, the diabetes domain has not been explored yet. In addition, there is a lack of approaches that analyze the positive or negative orientation of each aspect contained in a document (a review, a piece of news, and a tweet, among others). Based on this understanding, we propose an aspect-level sentiment analysis method based on ontologies in the diabetes domain. The sentiment of the aspects is calculated by considering the words around the aspect which are obtained through N-gram methods (N-gram after, N-gram before, and N-gram around). To evaluate the effectiveness of our method, we obtained a corpus from Twitter, which has been manually labelled at aspect level as positive, negative, or neutral. The experimental results show that the best result was obtained through the N-gram around method with a precision of 81.93%, a recall of 81.13%, and an F-measure of 81.24%.

Scientific Programming | 2017

Sentiment Analysis in Spanish for Improvement of Products and Services: A Deep Learning Approach

Mario Andrés Paredes-Valverde; Ricardo Colomo-Palacios; María del Pilar Salas-Zárate; Rafael Valencia-García

Sentiment analysis is an important area that makes it possible to gauge users' opinions about several aspects. This information helps organizations to understand customer satisfaction. Social networks such as Twitter are important information channels because information can be obtained from them and processed in real time. In this sense, we propose a deep-learning-based approach that allows companies and organizations to detect opportunities for improving the quality of their products or services through sentiment analysis. This approach is based on a convolutional neural network (CNN) and word2vec. To determine the effectiveness of this approach for classifying tweets, we conducted experiments with different sizes of a Twitter corpus composed of 100000 tweets. We obtained encouraging results with a precision of 88.7%, a recall of 88.7%, and an F-measure of 88.7% considering the complete dataset.
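The core operation of a CNN over word embeddings can be shown without any deep learning library. The sketch below slides a single filter over a sentence represented as a list of embedding vectors and keeps the largest ReLU activation (max-over-time pooling). It is a simplified illustration only: the tiny two-dimensional vectors stand in for word2vec embeddings, and the function name, filter values and architecture details are assumptions, not the configuration used in the paper.

```python
from typing import List

Vector = List[float]


def conv_max_pool(embeddings: List[Vector], kernel: List[Vector], bias: float = 0.0) -> float:
    """Apply one convolution filter over a sentence and max-pool the result.

    embeddings: one vector per word (stand-ins for word2vec vectors).
    kernel: filter weights, one row per word position in the window.
    """
    width = len(kernel)
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the filter and the current window of word vectors.
        total = bias + sum(
            w * x
            for k_row, e_row in zip(kernel, window)
            for w, x in zip(k_row, e_row)
        )
        activations.append(max(0.0, total))  # ReLU
    # Max-over-time pooling; sentences shorter than the filter yield 0.0.
    return max(activations) if activations else 0.0


sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy word vectors
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]         # filter spanning two words
print(conv_max_pool(sentence, bigram_filter))    # 2.0
```

A full model would learn many such filters of several widths and feed the pooled values to a classification layer.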

World Conference on Information Systems and Technologies | 2018

Machine Learning Based Sentiment Analysis on Spanish Financial Tweets

José Antonio García-Díaz; María del Pilar Salas-Zárate; Maria Luisa Hernández-Alcaraz; Rafael Valencia-García; Juan Miguel Gómez-Berbís

Nowadays, financial data on social networks play an important role in predicting the stock market. However, the exponential growth of financial information on social networks such as Twitter has led to a need for new technologies that automatically collect and categorise large volumes of information in a fast and easy manner. Natural Language Processing (NLP) and sentiment analysis can address this problem. In this respect, we propose a supervised machine learning method to detect the polarity of financial tweets. The method employs a set of lexico-morphological and semantic features, which were extracted with the UMTextStats tool. Furthermore, we have conducted a comparison of the performance of three classification algorithms (J48, BayesNet, and SMO). The results showed that SMO provides better results than the BayesNet and J48 algorithms, obtaining an F-measure of 73.2%.

Science of Computer Programming | 2018

An ontology-based approach with which to assign human resources to software projects

Mario Andrés Paredes-Valverde; María del Pilar Salas-Zárate; Ricardo Colomo-Palacios; Juan Miguel Gómez-Berbís; Rafael Valencia-García

Human resources play a critical role in the success of software projects. Ensuring their correct assignment to a specific project is, therefore, an immediate requirement for software development organizations. Within this context, this work explores the use of ontologies in the building of a decision support system that will help human resources managers or project leaders to select those employees who are best suited to participating in a new software development project. Ontologies allow the system to discover semantic relatedness among new and previous software projects by means of their requirements specifications. The system can, therefore, suggest those people who have participated in similar projects. We have proved the effectiveness of our approach by conducting an evaluation in a software development organization. Our findings confirm the success of our approach and reveal that it may bring considerable benefits to the software development process.

International Conference Computer Science and Applied Mathematics | 2017

Sentiment Polarity Detection in Social Networks: An Approach for Asthma Disease Management

Harry Luna-Aveiga; José Medina-Moreira; Katty Lagos-Ortiz; Oscar Apolinario; Mario Andrés Paredes-Valverde; María del Pilar Salas-Zárate; Rafael Valencia-García

Asthma is a serious health problem that affects all age groups. Asthma-related hospitalizations and deaths have declined in some countries. However, the number of patients with symptoms has increased in recent years. Even though asthma patients have contact with health professionals, they must be an active part of the treatment team. Meanwhile, there has been an exponential growth of information about healthcare and disease management on social networks such as Twitter. Aiming to benefit from this information, in this work we propose a method for detecting the emotional reaction of patients to asthma domain concepts such as physical activities and drugs, among others. The findings obtained from the analysis of such information can help other patients to avoid habits that could harm their health. Our proposal was evaluated with a corpus of Twitter messages, obtaining a precision of 82.95%, a recall of 82.27%, and an F-measure of 82.36% in sentiment polarity identification.

Current Trends on Knowledge-Based Systems | 2017

im4Things: An Ontology-Based Natural Language Interface for Controlling Devices in the Internet of Things

José Ángel Noguera-Arnaldos; Mario Andrés Paredes-Valverde; María del Pilar Salas-Zárate; Miguel Ángel Rodríguez-García; Rafael Valencia-García; José Luis Ochoa

The Internet of Things (IoT) offers opportunities for new applications and services that enable users to access and control their working and home environment from local and remote locations, aiming to perform daily life activities in an easy way. However, the IoT also introduces new challenges, some of which arise from the large range of devices currently available and the heterogeneous interfaces provided for their control. The control and management of this variety of devices and interfaces represent a new challenge for non-expert users, instead of making their life easier. Based on this understanding, in this work we present a natural language interface for the IoT, which takes advantage of Semantic Web technologies to allow non-expert users to control their home environment through an instant messaging application in an easy and intuitive way. We conducted several experiments with a group of end users aiming to evaluate the effectiveness of our approach to control home appliances by means of natural language instructions. The evaluation results proved that without the need for technicalities, the user was able to control the home appliances in an efficient way.

Distributed Computing and Artificial Intelligence | 2014

LIWC-Based Sentiment Analysis in Spanish Product Reviews

Estanislao López-López; María del Pilar Salas-Zárate; Ángela Almela; Miguel Ángel Rodríguez-García; Rafael Valencia-García; Giner Alor-Hernández

Opinion mining is the study of opinions and emotions of authors about specific topics on the Web. Opinion mining identifies whether the opinion about a given topic, expressed in a document, is positive or negative. Nowadays, with the exponential growth of social media, i.e., blogs and social networks, organizations and individuals are increasingly relying on the reviews published in these media when making decisions about a product or service. This paper investigates the mining of technological product reviews using the psychological and linguistic features obtained through the text analysis software LIWC (Linguistic Inquiry and Word Count). Furthermore, an analysis of the classification techniques J48, SMO, and BayesNet has been performed by using WEKA (Waikato Environment for Knowledge Analysis). This analysis aims to evaluate the classifying potential of the LIWC dimensions on written opinions in Spanish. All in all, findings have revealed that the combination of the four LIWC dimensions provides better results than the other combinations and individual dimensions, and that SMO is the algorithm which has obtained the best results.
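A dictionary-based feature extractor in the spirit of LIWC can be sketched as follows. This is a simplified illustration, not LIWC itself: the two-category toy dictionary, its Spanish word lists, and the function name are assumptions, and the real software uses a much larger proprietary dictionary. The idea shown is only that each category becomes one feature holding the proportion of words in the text that belong to it.

```python
from typing import Dict, Set

# Hypothetical toy dictionary; real LIWC dictionaries are far larger.
TOY_DICTIONARY: Dict[str, Set[str]] = {
    "posemo": {"bueno", "excelente", "genial"},
    "negemo": {"malo", "terrible", "lento"},
}


def category_proportions(text: str, dictionary: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, per category, the share of tokens that appear in its word list."""
    tokens = text.lower().split()
    total = len(tokens)
    return {
        category: (sum(1 for t in tokens if t in words) / total if total else 0.0)
        for category, words in dictionary.items()
    }


features = category_proportions("Excelente producto pero terrible servicio", TOY_DICTIONARY)
print(features)  # {'posemo': 0.2, 'negemo': 0.2}
```

The resulting dictionary of proportions can be passed as a feature vector to any classifier.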

Procesamiento Del Lenguaje Natural | 2018

Detección de Patrones Psicolingüísticos para el Análisis de Lenguaje Subjetivo en Español

María del Pilar Salas-Zárate

AIMS OF THE THESIS. The automatic classification of opinions requires a multidisciplinary effort in which linguistics and natural language processing play an important role. Figurative language, such as irony, sarcasm and satire, is an important aspect to consider in the classification of opinions, because the double meaning expressed in an opinion or comment can reverse its polarity. The main goal of this thesis is to detect psycholinguistic patterns for the analysis of subjective language in Spanish. Four specific aims are established: 1) design of a method for detecting psycholinguistic patterns for sentiment analysis; 2) design of a method for detecting psycholinguistic patterns for the analysis of satirical and non-satirical texts; 3) validation of the method for sentiment analysis in different domains, namely tourism and movies; 4) validation of the method for automatic detection of satire in the news domain.

METHODOLOGY. First, the state of the art is studied, covering natural language processing technologies, sentiment analysis, and subjective language. Specifically, this study addresses the different levels of natural language processing, the main sentiment analysis approaches, the levels of opinion processing, knowledge bases, available linguistic resources, and the main techniques for the detection of figurative language. Subsequently, a method based on psycholinguistic features for sentiment analysis and satire detection is designed and implemented. Finally, the proposal is validated in different domains. Specifically, the sentiment analysis method is applied to the tourism and movies domains, and the satire detection method is applied to the news domain in social networks.

RESULTS. The main contributions of this work are: 1) a method for sentiment classification and satire detection, which classifies opinions as positive, negative, neutral, very positive and very negative, and tweets as satirical and non-satirical; 2) a process for the pre-processing of tweets in Spanish; 3) a corpus in the tourism domain, containing 1600 reviews about hotels, restaurants and museums, among other topics, each labelled with its polarity (positive, negative, neutral, very positive, very negative); 4) a corpus of 10000 tweets tagged as satirical and non-satirical, extracted from different Twitter accounts; 5) a set of psycholinguistic features for sentiment classification and satire detection.

CONCLUSIONS. The automatic classification of opinions requires an effort in which linguistics and natural language processing play an important role. These disciplines made it possible to better understand human language, classify opinions, and summarize the sentiments expressed in texts. Figurative language, however, is one of the most difficult topics in NLP because, unlike literal language, the writer takes advantage of linguistic figures such as metaphor, analogy and ambiguity, among others, to convey more complex meanings. This kind of language is difficult to understand not only for computers but also for humans. This thesis described a method for the detection of psycholinguistic patterns for sentiment analysis and the automatic detection of satire. The psycholinguistic features, in conjunction with natural language processing and data mining techniques, proved to be effective for the detection of sentiments and satire. In addition, the validation of the methods in different domains demonstrated the effectiveness of the approach for classifying opinions and tweets.
