Publication


Featured research published by Víctor Fresno.


conference on information and knowledge management | 2011

Classifying trending topics: a typology of conversation triggers on Twitter

Arkaitz Zubiaga; Damiano Spina; Víctor Fresno; Raquel Martínez

Twitter summarizes the large volume of messages posted by users in the form of trending topics that reflect the top conversations being discussed at a given moment. These trending topics tend to be connected to current affairs. Different happenings can give rise to these trending topics, for instance a sports event broadcast on TV or a viral meme introduced by a community of users. Detecting the type of origin can facilitate information filtering, enhance real-time data processing, and improve user experience. In this paper, we introduce a typology to categorize the triggers that give rise to trending topics: news, current events, memes, and commemoratives. We define a set of straightforward language-independent features that rely on the social spread of the trends to discriminate among those types of trending topics. Our method provides an efficient way to immediately and accurately categorize trending topics without the need for external data, outperforming a content-based approach.


document engineering | 2009

Getting the most out of social annotations for web page classification

Arkaitz Zubiaga; Raquel Martínez; Víctor Fresno

User-generated annotations on social bookmarking sites can provide interesting and promising metadata for web document management tasks like web page classification. These user-generated annotations include diverse types of information, such as tags and comments. Nonetheless, each kind of annotation has a different nature and popularity level. In this work, we analyze and evaluate the usefulness of each of these social annotations to classify web pages over a taxonomy like that proposed by the Open Directory Project. We compare each of them separately to content-based classification, and also combine the different types of data to improve performance. Our experiments show encouraging results with the use of social annotations for this purpose, and we found that combining these metadata with web page content further improves classifier performance.


association for information science and technology | 2015

Real‐time classification of Twitter trends

Arkaitz Zubiaga; Damiano Spina; Raquel Martínez; Víctor Fresno

In this work, we explore the types of triggers that spark trends on Twitter, introducing a typology with the following 4 types: news, ongoing events, memes, and commemoratives. While previous research has analyzed trending topics over the long term, we look at the earliest tweets that produce a trend, with the aim of categorizing trends early on. This allows us to provide a filtered subset of trends to end users. We experiment with a set of straightforward language‐independent features based on the social spread of trends and categorize them using the typology. Our method provides an efficient way to accurately categorize trending topics without the need for external data, enabling news organizations to discover breaking news in real time, or to quickly identify viral memes that might inform marketing decisions, among other uses. The analysis of social features also reveals patterns associated with each type of trend, such as tweets about ongoing events being shorter, as many were likely sent from mobile devices, or memes having more retweets originating from a few trend‐setters.
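As a rough illustration of what language-independent features based on social spread might look like, the sketch below computes three simple statistics over the tweets of one trend. The `Tweet` record, the feature names, and the choice of features are assumptions made for this example; the abstract does not list the actual feature set used in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tweet:
    """Minimal tweet record; the fields are hypothetical, chosen for illustration."""
    user: str
    text: str
    is_retweet: bool


def spread_features(tweets: List[Tweet]) -> Dict[str, float]:
    """Compute a few language-independent features for the tweets of one trend."""
    n = len(tweets)
    if n == 0:
        raise ValueError("a trend needs at least one tweet")
    retweets = sum(1 for t in tweets if t.is_retweet)
    users = {t.user for t in tweets}
    return {
        # Share of tweets that are retweets.
        "retweet_ratio": retweets / n,
        # Average message length in characters (no tokenization needed).
        "avg_length": sum(len(t.text) for t in tweets) / n,
        # Distinct users per tweet; low values mean a few users dominate.
        "user_diversity": len(users) / n,
    }


# Hypothetical earliest tweets of a trend.
early = [
    Tweet("a", "goal!", False),
    Tweet("b", "RT goal!", True),
    Tweet("a", "what a match", False),
    Tweet("c", "RT goal!", True),
]
features = spread_features(early)
```

A feature dictionary like this could then be fed to any standard classifier trained on trends labeled with the four types; that step is omitted here because the abstract does not specify the classifier.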


Proceedings of the 3rd International Semantic Search Workshop | 2010

Using BM25F for semantic search

José R. Pérez-Agüera; Javier Arroyo; Jane Greenberg; Joaquin Perez Iglesias; Víctor Fresno

Information Retrieval (IR) approaches for Semantic Web search engines have become very popular in recent years. The popularization of IR libraries such as Lucene, which allow IR implementations almost out of the box, has made it easier to integrate IR into Semantic Web search engines. However, one of the most important features of Semantic Web documents is their structure, since this structure allows us to represent semantics in a machine-readable format. In this paper we analyze the specific problems of structured IR and how to adapt weighting schemes for semantic document retrieval.
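To make the idea of a field-aware weighting scheme concrete, here is a minimal sketch of the simplified BM25F formulation commonly given in the IR literature: per-field term frequencies are length-normalized, weighted, and summed before a single saturation step. The field names, weights, and the `b` and `k1` values are illustrative assumptions, and the exact variant studied in the paper is not stated in the abstract.

```python
import math
from typing import Dict, List

Doc = Dict[str, List[str]]  # field name -> list of tokens


def bm25f_score(query: List[str], doc: Doc, docs: List[Doc],
                weights: Dict[str, float], b: Dict[str, float],
                k1: float = 1.2) -> float:
    """Score one document against a query with a simplified BM25F."""
    n_docs = len(docs)
    # Average length of each field over the collection.
    avg_len = {f: sum(len(d.get(f, [])) for d in docs) / n_docs for f in weights}
    score = 0.0
    for term in set(query):
        # Weighted, length-normalized pseudo-frequency across fields.
        tf = 0.0
        for f, w in weights.items():
            tokens = doc.get(f, [])
            if not tokens or avg_len[f] == 0:
                continue
            norm = 1.0 - b[f] + b[f] * len(tokens) / avg_len[f]
            tf += w * tokens.count(term) / norm
        # Document frequency: documents containing the term in any field.
        df = sum(1 for d in docs if any(term in d.get(f, []) for f in weights))
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        # Saturation is applied once, after fields are combined.
        score += idf * tf / (k1 + tf)
    return score


# Hypothetical two-field collection.
docs = [
    {"title": ["semantic", "search"], "body": ["ranking", "structured", "documents"]},
    {"title": ["cooking"], "body": ["semantic", "recipes", "and", "more", "recipes"]},
    {"title": ["gardening"], "body": ["plants", "and", "soil"]},
]
weights = {"title": 3.0, "body": 1.0}
b = {"title": 0.5, "body": 0.75}
scores = [bm25f_score(["semantic"], d, docs, weights, b) for d in docs]
```

With these assumed weights, a match in the title counts for more than a match in the body, which is the kind of structural signal a flat bag-of-words score would ignore.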


advances in social networks analysis and mining | 2009

Content-Based Clustering for Tag Cloud Visualization

Arkaitz Zubiaga; Alberto Pérez García-Plaza; Víctor Fresno; Raquel Martínez

Social tagging systems are becoming an interesting way to retrieve web information from previously annotated data. These sites present a tag cloud made up of the most popular tags, where neither tag grouping nor the corresponding content is considered. We present a methodology to obtain and visualize a cloud of related tags based on self-organizing maps, in which the relations among tags are established by taking into account the textual content of tagged documents. Each map unit can be represented by the most relevant terms of the tags it contains, so that it is possible to study and analyze the groups as well as to visualize and navigate through the relevant terms and tags.


Applied Soft Computing | 2012

Learning a taxonomy from a set of text documents

Mari-Sanna Paukkeri; Alberto Pérez García-Plaza; Víctor Fresno; Raquel Martínez Unanue; Timo Honkela

We present a methodology for learning a taxonomy from a set of text documents, each of which describes one concept. The taxonomy is obtained by clustering the concept definition documents with a hierarchical approach to the Self-Organizing Map. In this study, we compare three different feature extraction approaches with varying degrees of language independence. The feature extraction schemes include fuzzy logic-based feature weighting and selection, statistical keyphrase extraction, and the traditional tf-idf weighting scheme. The experiments are conducted for English, Finnish, and Spanish. The results show that while the rule-based fuzzy logic systems have an advantage in automatic taxonomy learning, taxonomies can also be constructed with tolerable results using statistical methods without domain- or style-specific knowledge.
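Of the three feature extraction schemes mentioned, tf-idf is the simplest to show in a few lines. The sketch below uses one common variant (relative term frequency times the natural log of inverse document frequency); the abstract does not say which tf-idf variant the study used, so the formula and the toy documents are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one term -> weight mapping per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append({
            # Relative frequency in the document times log inverse document frequency.
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weighted


# Hypothetical concept-definition documents, already tokenized.
docs = [
    ["map", "cluster", "map"],
    ["cluster", "taxonomy"],
    ["taxonomy", "concept"],
]
weights = tf_idf(docs)
```

Terms that are frequent in one document but rare across the collection receive the highest weights, and the resulting vectors could serve as input to a clustering step such as the Self-Organizing Map described above.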


IEEE Transactions on Knowledge and Data Engineering | 2013

Harnessing Folksonomies to Produce a Social Classification of Resources

Arkaitz Zubiaga; Víctor Fresno; Raquel Martínez; Alberto Pérez García-Plaza

In our daily lives, organizing resources like books or webpages into a set of categories to ease future access is a common task. These collections are usually so large that organizing them manually requires vast effort and considerable expense. As an approach to effectively produce an automated classification of resources, we consider the immense amounts of annotations provided by users on social tagging systems in the form of bookmarks. In this paper, we deal with the utilization of these user-provided tags to perform a social classification of resources. For this purpose, we have created three large-scale social tagging data sets including tagging data for different types of resources: webpages and books. Those resources are accompanied by categorization data from sound expert-driven taxonomies. We analyze the characteristics of the three social tagging systems and evaluate the usefulness of social tags for performing a social classification of resources that resembles the classification by experts as much as possible. We analyze six different representations using tags and compare them to other data sources by using three different settings of SVM classifiers. Finally, we explore combinations of different data sources with tags using classifier committees to best classify the resources.


intelligent information systems | 2004

An Analytical Approach to Concept Extraction in HTML Environments

Víctor Fresno; Angela Ribeiro

The core of the Internet and World Wide Web revolution comes from their capacity to efficiently share huge quantities of data, but the rapid and chaotic growth of the Net has greatly complicated the task of sharing or mining useful information. Any inference process based on Internet information requires an adequate characterization of Web pages. The textual part of a page is one of the most important aspects to consider when characterizing a page. The textual characterization should be made by extracting an appropriate set of relevant concepts that properly represent the text included in the Web page. This paper presents a method to obtain such a set of relevant concepts from a Web page, essentially based on a relevance estimation of each word in the text of the page. The word relevance is defined by a combination of criteria that take into account characteristics of the HTML language as well as more classical measures such as the frequency and the position of a word in a document. In addition, heuristic rules for the most suitable fusion of criteria are obtained via a statistical study. Several experiments are conducted to test the performance of the proposed concept extraction method compared to other approaches, including a commercial tool. The results show that the proposed technique is more successful at concept extraction than the other tested methods.
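The combination of criteria described above (HTML emphasis, frequency, and position) can be sketched as follows. The tag weights, the position factor, and the way the criteria are multiplied and summed are all hypothetical choices for illustration; the paper derives its own fusion rules from a statistical study that the abstract does not detail.

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Dict, List, Tuple

# Hypothetical emphasis weights per HTML tag (assumed values).
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "b": 1.5, "em": 1.5}


class WordCollector(HTMLParser):
    """Collect each word together with the strongest emphasis of its enclosing tags."""

    def __init__(self) -> None:
        super().__init__()
        self.open_tags: List[str] = []
        self.words: List[Tuple[str, float]] = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recent matching tag and anything left open inside it.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx:]

    def handle_data(self, data):
        weight = max((TAG_WEIGHTS.get(t, 1.0) for t in self.open_tags), default=1.0)
        for word in data.lower().split():
            self.words.append((word, weight))


def word_relevance(html: str) -> Dict[str, float]:
    """Sum, per word, emphasis times a position factor over all occurrences."""
    parser = WordCollector()
    parser.feed(html)
    parser.close()
    total = len(parser.words)
    scores: Dict[str, float] = defaultdict(float)
    for position, (word, emphasis) in enumerate(parser.words):
        # Earlier words count slightly more; the factor stays within (0.5, 1].
        position_factor = 1.0 - 0.5 * position / total
        # Summing over occurrences accounts for frequency.
        scores[word] += emphasis * position_factor
    return dict(scores)


page = ("<html><head><title>fuzzy web</title></head>"
        "<body><p>web pages and <b>fuzzy</b> logic and pages</p></body></html>")
scores = word_relevance(page)
```

In this toy page, words that appear in the title or in bold rank above words that only occur in plain body text, which mirrors the intuition that markup carries relevance signals beyond raw frequency.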


meeting of the association for computational linguistics | 2006

Multilingual Document Clustering: An Heuristic Approach Based on Cognate Named Entities

Soto Montalvo; Raquel Martínez; Arantza Casillas; Víctor Fresno

This paper presents an approach for Multilingual Document Clustering in comparable corpora. The algorithm is heuristic in nature, and its only evidence for clustering is the identification of cognate named entities between both sides of the comparable corpora. One of the main advantages of this approach is that it does not depend on bilingual or multilingual resources. However, it depends on the possibility of identifying cognate named entities between the languages used in the corpus. An additional advantage of the approach is that it does not need any information about the right number of clusters; the algorithm calculates it. We have tested this approach with a comparable corpus of news written in English and Spanish. In addition, we have compared the results with a system that translates selected document features. The results obtained are encouraging.
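One simple way to approximate cognate identification between named entities is string similarity after accent removal. The sketch below uses Python's standard `difflib.SequenceMatcher` with an assumed threshold of 0.8; this is a stand-in technique for illustration, not the identification procedure used in the paper, which the abstract does not describe.

```python
import unicodedata
from difflib import SequenceMatcher
from typing import List, Tuple


def normalize(name: str) -> str:
    """Lowercase a name and strip diacritics so 'Vladímir' matches 'Vladimir'."""
    decomposed = unicodedata.normalize("NFKD", name.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def are_cognates(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two names as cognates if their similarity ratio reaches the threshold."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def shared_cognates(entities_a: List[str], entities_b: List[str],
                    threshold: float = 0.8) -> List[Tuple[str, str]]:
    """List all cross-language entity pairs judged to be cognates."""
    return [(a, b) for a in entities_a for b in entities_b
            if are_cognates(a, b, threshold)]


# Hypothetical named entities from an English and a Spanish news document.
english = ["Brazil", "Vladimir Putin", "London"]
spanish = ["Brasil", "Vladímir Putin", "Madrid"]
pairs = shared_cognates(english, spanish)
```

The number of shared cognate pairs between two documents could then serve as evidence that they belong to the same cluster. Note that a surface-similarity threshold misses cognates with larger spelling differences, so the threshold is a tunable assumption.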


Intelligent exploration of the web | 2003

A fuzzy system for the web page representation

Angela Ribeiro; Víctor Fresno; Maria C. Garcia-Alegre; Domingo Guinea

This paper addresses the issue of an adequate representation of a web page for subsequent classification and data mining. The approach focuses on the textual part of web pages, which are represented by a two-dimensional vector. The vector components are sorted by the relevance of each word in the text. Two approaches, analytical and fuzzy, that take advantage of characteristics of the HTML language are presented to compute the word relevance. Both models are compared in learning and classification tasks to evaluate the suitability of each approach. The experiments show a clear improvement of the fuzzy method over the analytical one. The analytical and fuzzy approaches presented here are general, in the sense that any characteristic of web pages could be easily integrated without additional cost.

Collaboration


Dive into Víctor Fresno's collaborations.

Top Co-Authors

Raquel Martínez
National University of Distance Education

Soto Montalvo
King Juan Carlos University

Alberto Pérez García-Plaza
National University of Distance Education

Arkaitz Zubiaga
National University of Distance Education

Agustín D. Delgado
National University of Distance Education

Arantza Casillas
University of the Basque Country

Joaquín Pérez-Iglesias
National University of Distance Education

Angela Ribeiro
Spanish National Research Council

José R. Pérez-Agüera
Complutense University of Madrid