Publication


Featured research published by Daniel Gayo-Avello.


Social Science Computer Review | 2013

A Meta-Analysis of State-of-the-Art Electoral Prediction From Twitter Data

Daniel Gayo-Avello

Electoral prediction from Twitter data is an appealing research topic. It seems relatively straightforward and the prevailing view is overly optimistic. This is problematic because while simple approaches are assumed to be good enough, core problems are not addressed. Thus, this article aims to (1) provide a balanced and critical review of the state of the art; (2) cast light on the presumed predictive power of Twitter data; and (3) propose some considerations to push forward the field. Hence, a scheme to characterize Twitter prediction methods is proposed. It covers every aspect from data collection to performance evaluation, through data processing and vote inference. Using that scheme, prior research is analyzed and organized to explain the main approaches taken to date but also their weaknesses. This is the first meta-analysis of the whole body of research regarding electoral prediction from Twitter data. It reveals that the presumed predictive power of Twitter data for electoral prediction has been somewhat exaggerated: Social media may provide a glimpse of electoral outcomes but, up to now, research has not provided strong evidence that it can currently replace traditional polls. Nevertheless, there are some reasons for optimism and, hence, further work on this topic is required, along with tighter integration with traditional electoral forecasting research.


Communications of The ACM | 2011

Don't turn social media into another 'Literary Digest' poll

Daniel Gayo-Avello

The power to predict outcomes based on Twitter data is greatly exaggerated, especially for political elections.


Proceedings of the 2009 workshop on Web Search Click Data | 2009

Survey and evaluation of query intent detection methods

David J. Brenes; Daniel Gayo-Avello; Kilian Pérez-González

User interactions with search engines reveal three main underlying intents, namely navigational, informational, and transactional. By providing more accurate results depending on such query intents, the performance of search engines can be greatly improved. Therefore, query classification has been an active research topic in recent years. However, while query topic classification has had a dedicated bakeoff, no evaluation campaign has been devoted to the study of automatic query intent detection. In this paper some of the available query intent detection techniques are reviewed, an evaluation framework is proposed, and it is used to compare those methods in order to shed light on their relative performance and drawbacks. As will be shown, manually prepared gold-standard files are much needed, and traditional pooling is not the most feasible evaluation method. In addition, future lines of work in both query intent detection and its evaluation are proposed.


Information Processing and Management | 2013

Nepotistic relationships in Twitter and their impact on rank prestige algorithms

Daniel Gayo-Avello

Micro-blogging services such as Twitter allow anyone to publish anything, anytime. Needless to say, much of the available content can be dismissed as babble or spam. However, given the number and diversity of users, some valuable pieces of information should arise from the stream of tweets. Thus, such services can develop into valuable sources of up-to-date information (the so-called real-time web) provided a way to find the most relevant/trustworthy/authoritative users is available. Hence, this makes a highly pertinent question for which graph centrality methods can provide an answer. In this paper the author offers a comprehensive survey of feasible algorithms for ranking users in social networks, examines their vulnerabilities to linking malpractice in such networks, and suggests an objective criterion against which to compare such algorithms. Additionally, he suggests a first step towards “desensitizing” prestige algorithms against cheating by spammers and other abusive users.


Information Sciences | 2009

Stratified analysis of AOL query log

David J. Brenes; Daniel Gayo-Avello

Characterizing users' intent and behaviour while using an information retrieval tool (e.g. a search engine) is a key question in web research, as it holds the key to knowing how users interact, what they are expecting, and how we can provide them with information in the most beneficial way. Previous research has focused on identifying the average characteristics of user interactions. This paper proposes a stratified method for analyzing query logs that groups queries and sessions according to their hit frequency and analyzes the characteristics of each group in order to find how representative the average values are. Findings show that behaviours typically associated with the average user do not fit most of the aforementioned groups.


Latin American Web Congress | 2012

Opinion Dynamics of Elections in Twitter

Felipe Bravo-Marquez; Daniel Gayo-Avello; Marcelo Mendoza; Barbara Poblete

In this work we conduct an empirical study of opinion time series created from Twitter data regarding the 2008 U.S. elections. The focus of our proposal is to establish whether a time series is appropriate or not for generating a reliable predictive model. We analyze time series obtained from Twitter messages related to the 2008 U.S. elections using ARMA/ARIMA and GARCH models. The first models are used in order to assess the conditional mean of the process and the second ones to assess the conditional variance or volatility. The main argument we discuss is that opinion time series that exhibit volatility should not be used for long-term forecasting purposes. We present an in-depth analysis of the statistical properties of these time series. Our experiments show that these time series are not fit for predicting future opinion trends. Due to the fact that researchers have not provided enough evidence to support the alleged predictive power of opinion time series, we discuss how more rigorous validation of predictive models generated from time series could benefit the opinion mining field.


IEEE MultiMedia | 2015

Social Media, Democracy, and Democratization

Daniel Gayo-Avello

The confluence of social media with political action is a complex field raising important questions. Is social media a realm for democratic deliberation? Can we ascertain public opinion from social media outlets? How are people using social media for political participation? Can social media boost democracy in authoritarian regimes? Here, the author considers these questions and contemplates the future of social media and politics.


Lecture Notes in Computer Science | 2004

Naïve Algorithms for Keyphrase Extraction and Text Summarization from a Single Document Inspired by the Protein Biosynthesis Process

Daniel Gayo-Avello; Darío Álvarez-Gutiérrez; José Gayo-Avello

Keywords are a simple way of describing a document, giving the reader some clues about its contents. However, sometimes they only assign the text to a topic, in which case a summary is more useful. Keywords and abstracts are common in scientific and technical literature but most of the documents available (e.g., web pages) lack such help, so automatic keyword extraction and summarization tools are fundamental to fight against “information overload” and improve the users’ experience. Therefore, this paper describes a new technique to obtain keyphrases and summaries from a single document. With this technique, inspired by the process of protein biosynthesis, a sort of “document DNA” can be extracted and translated into a “significance protein” which both produces a set of keyphrases and acts on the document, highlighting the most relevant passages. These ideas have been implemented in a prototype, publicly available on the Web, which has obtained really promising results.


International Conference Natural Language Processing | 2004

One Size Fits All? A Simple Technique to Perform Several NLP Tasks

Daniel Gayo-Avello; Darío Álvarez-Gutiérrez; José Gayo-Avello

Word fragments or n-grams have been widely used to perform different Natural Language Processing tasks such as information retrieval [1] [2], document categorization [3], automatic summarization [4] or, even, genetic classification of languages [5]. All these techniques share some common aspects: (1) documents are mapped to a vector space where n-grams are used as coordinates and their relative frequencies as vector weights, (2) many of them compute a context which plays a role similar to stop-word lists, and (3) cosine distance is commonly used for document-to-document and query-to-document comparisons. blindLight is a new approach related to these classical n-gram techniques, although it introduces two major differences: (1) relative frequencies are no longer used as vector weights but are replaced by n-gram significances, and (2) cosine distance is abandoned in favor of a new metric inspired by sequence alignment techniques but less computationally expensive. This new approach can be simultaneously used to perform document categorization and clustering, information retrieval, and text summarization. In this paper we will describe the foundations of such a technique and its application to both a particular categorization problem (i.e., language identification) and information retrieval tasks.
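
The classical baseline that this abstract contrasts with blindLight can be sketched as follows. This is only an illustration of the generic n-gram vector space model with relative-frequency weights and a cosine comparison; it is not the blindLight method, whose n-gram significances and alignment-inspired metric are not specified here. The function names, the choice of character trigrams, and the lowercasing step are assumptions made for the example. The code computes cosine similarity, which is the usual form of what the abstract calls cosine distance.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def relative_frequencies(counts: Counter) -> dict:
    """Turn raw n-gram counts into relative-frequency vector weights."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(gram, 0.0) for gram, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def ngram_similarity(doc_a: str, doc_b: str, n: int = 3) -> float:
    """Compare two documents in the n-gram vector space (hypothetical helper)."""
    vec_a = relative_frequencies(char_ngrams(doc_a, n))
    vec_b = relative_frequencies(char_ngrams(doc_b, n))
    return cosine_similarity(vec_a, vec_b)
```

The same function serves document-to-document and query-to-document comparisons, since a query is simply treated as a short document.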


Cross Language Evaluation Forum | 2004

Application of variable length N-gram vectors to monolingual and bilingual information retrieval

Daniel Gayo-Avello; Darío Álvarez-Gutiérrez; José Gayo-Avello

Our group in the Department of Informatics at the University of Oviedo has participated, for the first time, in two tasks at CLEF: monolingual (Russian) and bilingual (Spanish-to-English) information retrieval. Our main goal was to test the application to IR of a modified version of the n-gram vector space model (codenamed blindLight). This new approach has been successfully applied to other NLP tasks such as language identification or text summarization and the results achieved at CLEF 2004, although not exceptional, are encouraging. There are two major differences between the blindLight approach and classical techniques: (1) relative frequencies are no longer used as vector weights but are replaced by n-gram significances, and (2) cosine distance is abandoned in favor of a new metric inspired by sequence alignment techniques but less computationally expensive. In order to perform cross-language IR we have developed a naive n-gram pseudo-translator similar to those described by McNamee and Mayfield or Pirkola et al.
