Marcelo Mendoza | Researchain

Publication

Featured research published by Marcelo Mendoza.

Knowledge Discovery and Data Mining | 2010

Twitter under crisis: can we trust what we RT?

Marcelo Mendoza; Barbara Poblete; Carlos Castillo

In this article we explore the behavior of Twitter users under an emergency situation. In particular, we analyze the activity related to the 2010 earthquake in Chile and characterize Twitter in the hours and days following this disaster. Furthermore, we perform a preliminary study of certain social phenomena, such as the dissemination of false rumors and confirmed news. We analyze how this information propagated through the Twitter network, with the purpose of assessing the reliability of Twitter as an information source under extreme circumstances. Our analysis shows that the propagation of tweets that correspond to rumors differs from that of tweets that spread news, because rumors tend to be questioned more than news by the Twitter community. This result shows that it is possible to detect rumors by using aggregate analysis on tweets.
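The aggregate signal described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the stance labels ("affirms", "questions"), the 0.3 threshold, and the toy data are all hypothetical, chosen only to show how a per-topic share of questioning tweets could flag a possible rumor.

```python
from collections import Counter

def questioning_ratio(labels):
    """Share of tweets about one topic that question it.

    `labels` holds one hypothetical stance label per tweet,
    e.g. "affirms" or "questions".
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts["questions"] / total

def flag_possible_rumors(topics, threshold=0.3):
    """Return the topics whose questioning ratio reaches the threshold.

    The threshold is an assumption for illustration, not a value
    taken from the paper.
    """
    return [name for name, labels in topics.items()
            if questioning_ratio(labels) >= threshold]

# Toy data, invented for this example only.
topics = {
    "confirmed_news": ["affirms"] * 9 + ["questions"],
    "false_rumor": ["affirms"] * 5 + ["questions"] * 5,
}
flagged = flag_possible_rumors(topics)
print(flagged)  # ['false_rumor']
```

In practice the stance labels would have to come from manual annotation or a trained classifier, and the threshold would need to be tuned on labeled data.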

Extending Database Technology | 2004

Query recommendation using query logs in search engines

Ricardo A. Baeza-Yates; Carlos A. Hurtado; Marcelo Mendoza

In this paper we propose a method that, given a query submitted to a search engine, suggests a list of related queries. The related queries are based on previously issued queries, and can be issued by the user to the search engine to tune or redirect the search process. The proposed method is based on a query clustering process in which groups of semantically similar queries are identified. The clustering process uses the content of historical preferences of users registered in the query log of the search engine. The method not only discovers the related queries, but also ranks them according to a relevance criterion. Finally, we show with experiments over the query log of a search engine the effectiveness of the method.

Internet Research | 2013

Predicting information credibility in time-sensitive social media

Carlos Castillo; Marcelo Mendoza; Barbara Poblete

Purpose – Twitter is a popular microblogging service which has proven, in recent years, its potential for propagating news and information about developing events. The purpose of this paper is to focus on the analysis of information credibility on Twitter. The purpose of our research is to establish if an automatic discovery process of relevant and credible news events can be achieved. Design/methodology/approach – The paper follows a supervised learning approach for the task of automatic classification of credible news events. A first classifier decides if an information cascade corresponds to a newsworthy event. Then a second classifier decides if this cascade can be considered credible or not. The paper undertakes this effort training over a significant amount of labeled data, obtained using crowdsourcing tools. The paper validates these classifiers under two settings: the first, a sample of automatically detected Twitter “trends” in English, and second, the paper tests how well this model transfers to...

Latin American Web Congress | 2005

Modeling user search behavior

Ricardo A. Baeza-Yates; Carlos A. Hurtado; Marcelo Mendoza; Georges Dupret

Web usage mining is a major research area in Web mining focused on learning about Web users and their interactions with Web sites. The main challenges in Web usage mining are the efficient application of data mining techniques to Web data and the discovery of non-trivial user behaviour patterns. In this paper we focus on search engines, analyzing query log data and presenting several models of how users search and how they use search engine results.

Conference on Information and Knowledge Management | 2011

Do all birds tweet the same?: characterizing twitter around the world

Barbara Poblete; Ruth Garcia; Marcelo Mendoza; Alejandro Jaimes

Social media services have spread throughout the world in just a few years. They have become not only a new source of information, but also new mechanisms for societies world-wide to organize themselves and communicate. Therefore, social media has a very strong impact in many aspects: at the personal level, in business, and in politics, among many others. In spite of its fast adoption, little is known about social media usage in different countries, and whether patterns of behavior remain the same or not. A deep understanding of differences between countries can be useful in many ways, e.g., to improve the design of social media systems (which features work best for which country?) and to influence marketing and political campaigns. Moreover, this type of analysis can provide relevant insight into how societies might differ. In this paper we present a summary of a large-scale analysis of Twitter for an extended period of time. We analyze in detail various aspects of social media for the ten countries we identified as most active. We collected one year's worth of data and report differences and similarities in terms of activity, sentiment, use of languages, and network structure. To the best of our knowledge, this is the first on-line social network study of such characteristics.

Knowledge Based Systems | 2014

Meta-level sentiment models for big social data analysis

Felipe Bravo-Marquez; Marcelo Mendoza; Barbara Poblete

People react to events, topics and entities by expressing their personal opinions and emotions. These reactions can correspond to a wide range of intensities, from very mild to strong. An adequate processing and understanding of these expressions has been the subject of research in several fields, such as business and politics. In this context, Twitter sentiment analysis, which is the task of automatically identifying and extracting subjective information from tweets, has received increasing attention from the Web mining community. Twitter provides an extremely valuable insight into human opinions, as well as new challenging Big Data problems. These problems include the processing of massive volumes of streaming data, as well as the automatic identification of human expressiveness within short text messages. In that area, several methods and lexical resources have been proposed in order to extract sentiment indicators from natural language texts at both syntactic and semantic levels. These approaches address different dimensions of opinions, such as subjectivity, polarity, intensity and emotion. This article is the first study of how these resources, which are focused on different sentiment scopes, complement each other. With this purpose we identify scenarios in which some of these resources are more useful than others. Furthermore, we propose a novel approach for sentiment classification based on meta-level features. This supervised approach boosts existing sentiment classification of subjectivity and polarity detection on Twitter. Our results show that the combination of meta-level features provides significant improvements in performance. However, we observe that there are important differences that rely on the type of lexical resource, the dataset used to build the model, and the learning strategy. Experimental results indicate that manually generated lexicons are focused on emotional words, being very useful for polarity prediction. On the other hand, lexicons generated with automatic methods include neutral words, introducing noise in the detection of subjectivity. Our findings indicate that polarity and subjectivity prediction are different dimensions of the same problem, but they need to be addressed using different subspace features. Lexicon-based approaches are recommendable for polarity, and stylistic part-of-speech based approaches are meaningful for subjectivity. With this research we offer a more global insight of the resource components for the complex task of classifying human emotion and opinion.

Atlantic Web Intelligence Conference | 2004

Query Clustering for Boosting Web Page Ranking

Ricardo A. Baeza-Yates; Carlos A. Hurtado; Marcelo Mendoza

Over the past few years, there has been a great deal of research on the use of content and links of Web pages to improve the quality of Web page rankings returned by search engines. However, few formal approaches have considered the use of search engine logs to improve the rankings. In this paper we propose a ranking algorithm that uses the logs of search engines to boost their retrieval quality. The relevance of Web pages is estimated using the historical preferences of users that appear in the logs. The algorithm is based on a clustering process in which groups of semantically similar queries are identified. The proposed method is simple and has low computational cost, and we show with experiments that it achieves good results.

IFIP World Computer Congress (WCC) | 2006

Automatic Query Recommendation using Click-Through Data

Georges Dupret; Marcelo Mendoza

We present a method to help a user redefine a query by suggesting a list of similar queries. The proposed method is based on click-through data, from which sets of similar queries can be identified. Scientific literature shows that similar queries are useful for the identification of different information needs behind a query. Unlike most previous work, in this paper we focus on the discovery of better queries rather than related queries. We show with experiments over real data that the identification of better queries is useful for query disambiguation and query specialization.

String Processing and Information Retrieval | 2006

A statistical model of query log generation

Georges Dupret; Benjamin Piwowarski; Carlos A. Hurtado; Marcelo Mendoza

Query logs record past query sessions across a time span. A statistical model is proposed to explain the log generation process. Within a search engine list of results, the model explains the document selection (a user's click) by taking into account both a document's position and its popularity. We show that it is possible to quantify the influence of position and consequently estimate "un-biased" document popularities. Among other applications, this allows re-ordering the result list to match user preferences more closely and using the logs as feedback to improve search engines.
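The idea of separating position from popularity can be illustrated with a minimal sketch. It assumes, purely for illustration, that the probability of a click is the product of a position effect and a document popularity; the abstract does not state the model's actual form, and the function name, the bias values, and the toy log are all hypothetical.

```python
from collections import defaultdict

def debiased_popularity(impressions, position_bias):
    """Estimate position-corrected popularity per document.

    impressions: iterable of (doc_id, position, clicked) tuples.
    position_bias: dict mapping a result position to the assumed
        probability that a user examines that position.
    Returns a dict mapping doc_id to clicks divided by the expected
    number of examinations, so a click earned at a rarely examined
    position counts for more.
    """
    clicks = defaultdict(float)
    exposure = defaultdict(float)
    for doc_id, position, clicked in impressions:
        exposure[doc_id] += position_bias[position]
        if clicked:
            clicks[doc_id] += 1.0
    return {d: clicks[d] / exposure[d] for d in exposure if exposure[d] > 0}

# Toy log, invented for this example: both documents have the same raw
# click-through rate, but "b" is always shown at the weaker position 2.
log = [
    ("a", 1, True), ("a", 1, False),
    ("b", 2, True), ("b", 2, False),
]
bias = {1: 1.0, 2: 0.5}  # hypothetical examination probabilities

scores = debiased_popularity(log, bias)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['b', 'a']
```

Here the position bias is supplied by hand. Estimating it from the log itself is the harder part of the problem and is not attempted in this sketch.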

Conference on Information and Knowledge Management | 2010

Visual-semantic graphs: using queries to reduce the semantic gap in web image retrieval

Barbara Poblete; Benjamin Bustos; Marcelo Mendoza; Juan Manuel Barrios

We explore the application of a graph representation to model similarity relationships that exist among images found on the Web. The resulting similarity-induced graph allows us to model in a unified way different types of content-based similarities, as well as semantic relationships. Content-based similarities include different image descriptors, and semantic similarities can include relevance user feedback from search engines. The goal of our representation is to provide an experimental framework for combining apparently unrelated metrics into a unique graph structure, which allows us to enhance the results of Web image retrieval. We evaluate our approach by re-ranking Web image search results.
