Victor Carneiro
University of A Coruña
Publication
Featured research published by Victor Carneiro.
ACM Transactions on the Web | 2011
Fidel Cacheda; Victor Carneiro; Diego Fernández; Vreixo Formoso
The technique of collaborative filtering is especially successful in generating personalized recommendations. More than a decade of research has resulted in numerous algorithms, although no comparison of the different strategies has been made. In fact, a universally accepted way of evaluating a collaborative filtering algorithm does not exist yet. In this work, we compare different techniques found in the literature, and we study the characteristics of each one, highlighting their principal strengths and weaknesses. Several experiments have been performed, using the most popular metrics and algorithms. Moreover, two new metrics designed to measure the precision on good items have been proposed. The results have revealed the weaknesses of many algorithms in extracting information from user profiles especially under sparsity conditions. We have also confirmed the good results of SVD-based techniques already reported by other authors. As an alternative, we present a new approach based on the interpretation of the tendencies or differences between users and items. Despite its extraordinary simplicity, in our experiments, it obtained noticeably better results than more complex algorithms. In fact, in the cases analyzed, its results are at least equivalent to those of the best approaches studied. Under sparsity conditions, there is more than a 20% improvement in accuracy over the traditional user-based algorithms, while maintaining over 90% coverage. Moreover, it is much more efficient computationally than any other algorithm, making it especially adequate for large amounts of data.
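The abstract reports results in terms of accuracy and coverage. As a minimal sketch of how these two quantities are commonly computed, the Python below measures mean absolute error (MAE) and coverage for an arbitrary predictor. The function name `mae_and_coverage`, the toy item-mean predictor, and the sample data are assumptions made for illustration; they are not the paper's evaluation code, and the paper's two new "precision on good items" metrics are not reproduced here.

```python
from typing import Callable, Optional

# (user, item, true rating); a hypothetical layout chosen for this sketch
Rating = tuple[str, str, float]


def mae_and_coverage(
    test: list[Rating],
    predict: Callable[[str, str], Optional[float]],
) -> tuple[float, float]:
    """Return (MAE, coverage) of `predict` over the test ratings.

    `predict` returns None when it cannot produce a prediction;
    coverage is the fraction of test ratings that did get one.
    """
    errors = []
    for user, item, true_rating in test:
        predicted = predict(user, item)
        if predicted is not None:
            errors.append(abs(predicted - true_rating))
    coverage = len(errors) / len(test) if test else 0.0
    mae = sum(errors) / len(errors) if errors else float("nan")
    return mae, coverage


# Toy predictor (assumption): the item's mean training rating, if known.
item_means = {"i1": 4.0, "i2": 2.0}
test_set = [("u1", "i1", 5.0), ("u2", "i2", 2.0),
            ("u3", "i3", 3.0), ("u1", "i2", 3.0)]

mae, cov = mae_and_coverage(test_set, lambda user, item: item_means.get(item))
print(f"MAE={mae:.3f} coverage={cov:.2f}")
```

Reporting both numbers together matters because an algorithm can lower its error simply by declining to predict the hard cases, which shows up as reduced coverage.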
Information Processing and Management | 2007
Fidel Cacheda; Victor Carneiro; Vassilis Plachouras; Iadh Ounis
The increasing number of documents that have to be indexed in different environments, particularly on the Web, and the lack of scalability of a single centralised index lead to the use of distributed information retrieval systems to effectively search for and locate the required information. In this study, we present several improvements over the two main bottlenecks in a distributed information retrieval system (the network and the brokers). We extend a simulation network model in order to represent a switched network. The new simulation model is validated by comparing the estimated response times with those obtained using a real system. We show that the use of a switched network reduces the saturation of the interconnection network, especially in a replicated system, and some improvements may be achieved using multicast messages and faster connections with the brokers. We also demonstrate that reducing the partial results sets will improve the response time of a distributed system by 53%, with a negligible probability of changing the system's precision and recall values. Finally, we present a simple hierarchical distributed broker model that will reduce the response times for a distributed system by 55%.
European Conference on Information Retrieval | 2007
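The idea of reducing the partial result sets can be illustrated with a small sketch: each query server sends the broker only its best few hits, and the broker merges those sorted lists into a global top k. The abstract does not say how far the sets are reduced or how the broker merges them, so the functions `server_top` and `broker_merge`, the limits, and the scores below are all hypothetical.

```python
import heapq
from itertools import islice

# (score, doc_id); a hypothetical hit layout chosen for this sketch
Hit = tuple[float, str]


def server_top(hits: list[Hit], limit: int) -> list[Hit]:
    """Query server side: return only the `limit` best hits, best first."""
    return heapq.nlargest(limit, hits)


def broker_merge(partials: list[list[Hit]], k: int) -> list[Hit]:
    """Broker side: merge descending-sorted partial lists, keep the top k."""
    merged = heapq.merge(*partials, reverse=True)
    return list(islice(merged, k))


servers = [
    [(0.9, "d1"), (0.2, "d4"), (0.5, "d7")],
    [(0.8, "d2"), (0.7, "d5"), (0.1, "d8")],
]

# Every server sends 3 hits versus only 2 hits; the broker wants the top 3.
full = broker_merge([server_top(s, 3) for s in servers], 3)
reduced = broker_merge([server_top(s, 2) for s in servers], 3)
print(full == reduced, [doc for _, doc in reduced])
```

In this toy case the smaller partial sets produce the same final ranking while sending fewer hits over the network. The final ranking changes only when one server holds more of the global top k than it is allowed to send.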
Fidel Cacheda; Victor Carneiro; Vassilis Plachouras; Iadh Ounis
The amount of information available over the Internet is increasing daily, as are the importance and magnitude of Web search engines. Systems based on a single centralised index present several problems (such as lack of scalability), which lead to the use of distributed information retrieval systems to effectively search for and locate the required information. A distributed retrieval system can be clustered and/or replicated. In this paper, using simulations, we present a detailed performance analysis, both in terms of throughput and response time, of a clustered system compared to a replicated system. In addition, we consider the effect of changes in the query topics over time. We show that the performance obtained for a clustered system does not improve on the performance obtained by the best replicated system. Indeed, the main advantage of a clustered system is the reduction of network traffic. However, the use of a switched network eliminates the bottleneck in the network, markedly improving the performance of the replicated systems. Moreover, we illustrate the negative performance effect of the changes over time in the query topics when a distributed clustered system is used. In contrast, the performance of a distributed replicated system is query independent.
European Conference on Information Retrieval | 2003
Fidel Cacheda; Victor Carneiro; Carmen Guerrero; Ángel Viña
The need for efficient tools to manage, retrieve and filter the information on the WWW is clear. Web directories are taxonomies for the classification of Web documents. This kind of information retrieval system presents a specific type of search where the document collection is restricted to one area of the category graph. This paper introduces a specific data architecture for Web directories that improves the performance of restricted searches. That architecture is based on a hybrid data structure composed of an inverted file with multiple embedded signature files. Two variants are presented: hybrid architecture with total information and with partial information. This architecture has been analyzed by developing both variants and comparing them with a basic model. The performance of the restricted queries was clearly improved, especially with the hybrid model with partial information, which yielded a positive response under any load of the search system.
World Wide Web | 2015
Vreixo Formoso; Diego Fernández; Fidel Cacheda; Victor Carneiro
Collaborative filtering is one of the most popular recommendation techniques. While the quality of the recommendations has improved significantly in recent years, most approaches present poor efficiency and scalability. In this paper, we study several factors that affect the performance of a k-Nearest Neighbors algorithm, and we propose a distributed architecture that significantly improves both throughput and response time. Two techniques for distributing recommender systems, user and item partition, were proposed and evaluated using a simulation model. We have found that user partition is generally better, with a faster response time and higher throughput.
European Conference on Information Retrieval | 2005
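The difference between user partition and item partition can be shown with a rough sketch that assigns each rating to a node either by its user or by its item. The abstract does not describe the actual partitioning scheme, so the CRC-based assignment, the function names `node_for` and `partition`, and the sample ratings are assumptions for illustration only.

```python
import zlib
from collections import defaultdict

# (user, item, rating); a hypothetical layout chosen for this sketch
Rating = tuple[str, str, float]


def node_for(key: str, n_nodes: int) -> int:
    """Stable node assignment (CRC32 rather than Python's salted hash())."""
    return zlib.crc32(key.encode("utf-8")) % n_nodes


def partition(ratings: list[Rating], n_nodes: int, by: str) -> dict[int, list[Rating]]:
    """Split ratings across nodes by user ("user") or by item ("item")."""
    if by not in ("user", "item"):
        raise ValueError("by must be 'user' or 'item'")
    field = 0 if by == "user" else 1
    nodes: dict[int, list[Rating]] = defaultdict(list)
    for rating in ratings:
        nodes[node_for(rating[field], n_nodes)].append(rating)
    return dict(nodes)


ratings = [("ana", "m1", 4.0), ("ana", "m2", 3.0), ("bo", "m1", 5.0),
           ("cy", "m3", 2.0), ("bo", "m3", 1.0)]

by_user = partition(ratings, 2, by="user")  # each user's profile on one node
by_item = partition(ratings, 2, by="item")  # each item's ratings on one node
print(by_user)
print(by_item)
```

Under user partition a whole user profile lives on a single node, whereas under item partition a profile may be spread over several nodes. Which layout yields better throughput and response time is the question the paper evaluates by simulation.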
Fidel Cacheda; Victor Carneiro; Vassilis Plachouras; Iadh Ounis
In this study, we present the analysis of the interconnection network of a distributed Information Retrieval (IR) system, by simulating a switched network versus a shared access network. The results show that the use of a switched network improves the performance, especially in a replicated system because the switched network prevents the saturation of the network, particularly when using a large number of query servers.
IEEE Latin America Transactions | 2007
Fidel Cacheda; Vreixo Formoso; Victor Carneiro
The importance and size of Web search engines are increasing daily. Information retrieval systems based on a single centralized index present several problems, which lead to the use of distributed information retrieval systems to effectively search for and locate the required information. In this study, we analyze two improvements over the brokers’ bottlenecks in a distributed information retrieval system. We demonstrate that reducing the partial results sets will improve the response time of a distributed system by 53%, with a negligible probability of changing the system’s precision and recall values. Finally, we present a simple hierarchical distributed broker model that will reduce the response times for a distributed system by 55%.
Conference on Information and Knowledge Management | 2011
Fidel Cacheda; Victor Carneiro; Diego Fernández; Vreixo Formoso
In recent years, recommender systems have achieved great popularity. Many different techniques have been developed and applied to this field. However, in many cases the algorithms do not obtain the expected results. In particular, when the applied model does not fit the real data, the results are especially bad. This happens because models are often applied directly to a domain without a prior analysis of the data. In this work we study the most popular datasets in the movie recommendation domain, in order to understand how the users behave in this particular context. We have found some remarkable facts that question the utility of the similarity measures traditionally used in k-Nearest Neighbors (kNN) algorithms. These findings can be useful for developing new algorithms. In particular, we modify traditional kNN algorithms by introducing a new similarity measure especially suited for sparse contexts, where users have rated very few items. Our experiments show slight improvements in prediction accuracy, which proves the importance of a thorough dataset analysis as a prior step to any algorithm development.
Global Communications Conference | 1998
Victor Carneiro; Ángel Viña; C. Guerrero
We introduce an alarm management system based on the management by delegation (MbD) paradigm, which provides the operator with an integrated and homogeneous environment in which different types of alarms exist. The platform chosen was Java owing to its special features (code mobility, platform independence, distributed capabilities, etc.). This system provides the programmer with a flexible, modular and robust environment where the functionality of the system can be increased dynamically without having to alter any part of it. This system overcomes most of the limitations inherent to centralised systems. Some of the key characteristics of the system are: protocol integration, the use of an RDBMS to enhance the information about alarms, and multi-user monitoring through an intuitive GUI applet with several permission constraints.
Proceedings of IEEE Enterprise Networking Mini-Conference (ENM-97) in conjunction with ICC 97 | 1997
C. Guerrero; D. Sanchez; Victor Carneiro; Ángel Viña; J. Coego
We address the problem of integrating proprietary managed technology in a corporate TMN system by using TMN-based platforms support facilities. A prototype that integrates a proprietary managed PDH network in a fully TMN corporate management system has been designed and developed using three different TMN platforms in parallel: Solstice Enterprise Manager, NetView/6000 TMN Support Facility and OpenView DM. This experimental prototype helped us (1) to understand how the new emerging management platforms support the engineering of solutions to integrate proprietary protocols and (2) to identify potential problems that can arise when trying to apply the platform functionality to the real network elements.