Josep M. Pujol | Researchain

Publication

Featured research published by Josep M. Pujol.

PLOS ONE | 2012

Social Features of Online Networks: The Strength of Intermediary Ties in Online Social Media

Przemyslaw A. Grabowicz; José J. Ramasco; Esteban Moro; Josep M. Pujol; Víctor M. Eguíluz

An increasing fraction of today's social interactions occur using online social media as communication channels. Recent worldwide events, such as social movements in Spain or revolts in the Middle East, highlight their capacity to boost people's coordination. Online networks generally display a rich internal structure where users can choose among different types and intensities of interaction. Despite this, there are still open questions regarding the social value of online interactions. For example, the existence of users with millions of online friends casts doubt on the relevance of these relations. In this work, we focus on Twitter, one of the most popular online social networks, and find that the network formed by the basic type of connections is organized in groups. The activity of the users conforms to the landscape determined by such groups. Furthermore, Twitter's distinction between different types of interactions allows us to establish a parallelism between online and offline social networks: personal interactions are more likely to occur on internal links to the groups (the weakness of strong ties); events transmitting new information go preferentially through links connecting different groups (the strength of weak ties) or even more through links connecting to users belonging to several groups that act as brokers (the strength of intermediary ties).

International Conference on User Modeling, Adaptation, and Personalization | 2009

I Like It... I Like It Not: Evaluating User Ratings Noise in Recommender Systems

Xavier Amatriain; Josep M. Pujol; Nuria Oliver

Recent growing interest in predicting and influencing consumer behavior has generated a parallel increase in research efforts on Recommender Systems. Many of the state-of-the-art Recommender Systems algorithms rely on obtaining user ratings in order to later predict unknown ratings. An underlying assumption in this approach is that the user ratings can be treated as ground truth of the users' taste. However, users are inconsistent in giving their feedback, thus introducing an unknown amount of noise that challenges the validity of this assumption. In this paper, we tackle the problem of analyzing and characterizing the noise in user feedback through ratings of movies. We present a user study aimed at quantifying the noise in user ratings that is due to inconsistencies. We measure RMSE values that range from 0.557 to 0.8156. We also analyze how factors such as item sorting and time of rating affect this noise.
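As a rough illustration of how inconsistency between two rating passes can be expressed as an RMSE, the sketch below compares two sets of ratings given by the same user to the same items. The function name, the item identifiers, and the 1-5 ratings are hypothetical; the abstract does not describe the study's actual data or procedure.

```python
import math


def rmse(first_trial, second_trial):
    """Root-mean-square error between two rating passes over the same items."""
    # Only items rated in both trials can be compared.
    common = first_trial.keys() & second_trial.keys()
    if not common:
        raise ValueError("no items were rated in both trials")
    squared_errors = [(first_trial[i] - second_trial[i]) ** 2 for i in common]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings on a 1-5 scale (illustrative only, not study data).
trial_1 = {"movie_a": 4, "movie_b": 2, "movie_c": 5, "movie_d": 3}
trial_2 = {"movie_a": 4, "movie_b": 3, "movie_c": 4, "movie_d": 3}

print(rmse(trial_1, trial_2))
```

An RMSE of zero would mean the user repeated every rating exactly; larger values indicate more disagreement between the two passes.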

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2009

The wisdom of the few: a collaborative filtering approach based on expert opinions from the web

Xavier Amatriain; Neal Lathia; Josep M. Pujol; Haewoon Kwak; Nuria Oliver

Nearest-neighbor collaborative filtering provides a successful means of generating recommendations for web users. However, this approach suffers from several shortcomings, including data sparsity and noise, the cold-start problem, and scalability. In this work, we present a novel method for recommending items to users based on expert opinions. Our method is a variation of traditional collaborative filtering: rather than applying a nearest neighbor algorithm to the user-rating data, predictions are computed using a set of expert neighbors from an independent dataset, whose opinions are weighted according to their similarity to the user. This method promises to address some of the weaknesses in traditional collaborative filtering, while maintaining comparable accuracy. We validate our approach by predicting a subset of the Netflix data set. We use ratings crawled from a web portal of expert reviews, measuring results both in terms of prediction accuracy and recommendation list precision. Finally, we explore the ability of our method to generate useful recommendations, by reporting the results of a user-study where users prefer the recommendations generated by our approach.
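The idea of weighting expert opinions by their similarity to the user can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: cosine similarity over co-rated items, the `min_similarity` threshold, and the plain weighted average are all choices made here for illustration, and the ratings are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the items rated by both parties (assumed measure)."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(user_ratings, experts, item, min_similarity=0.0):
    """Similarity-weighted average of expert ratings for `item`.

    Returns None when no sufficiently similar expert has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for expert_ratings in experts.values():
        if item not in expert_ratings:
            continue
        sim = cosine_similarity(user_ratings, expert_ratings)
        if sim <= min_similarity:
            continue
        weighted_sum += sim * expert_ratings[item]
        weight_total += sim
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


# Hypothetical data: the experts come from a separate source than the user.
user = {"a": 5, "b": 1}
experts = {
    "expert_1": {"a": 5, "b": 1, "c": 4},
    "expert_2": {"a": 1, "b": 5, "c": 2},
}
print(predict(user, experts, "c"))
```

Because `expert_1` agrees with the user on the co-rated items, that expert's rating of item `c` dominates the prediction.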

Recommender Systems Handbook | 2011

Data Mining Methods for Recommender Systems

Xavier Amatriain; Alejandro Jaimes; Nuria Oliver; Josep M. Pujol

In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.

Conference on Recommender Systems | 2009

Rate it again: increasing recommendation accuracy by user re-rating

Xavier Amatriain; Josep M. Pujol; Nava Tintarev; Nuria Oliver

A common approach to designing Recommender Systems (RS) consists of asking users to explicitly rate items in order to collect feedback about their preferences. However, users have been shown to be inconsistent and to introduce a non-negligible amount of natural noise in their ratings that affects the accuracy of the predictions. In this paper, we present a novel approach to improve RS accuracy by reducing the natural noise in the input data via a preprocessing step. In order to quantitatively understand the impact of natural noise, we first analyze the response of common recommendation algorithms to this noise. Next, we propose a novel algorithm to denoise existing datasets by means of re-rating, i.e., by asking users to rate previously rated items again. This denoising step yields very significant accuracy improvements. However, re-rating all items in the original dataset is impractical. Therefore, we study the accuracy gains obtained when re-rating only some of the ratings. In particular, we propose two partial denoising strategies: data-dependent and user-dependent denoising. Finally, we compare the value of adding a rating of an unseen item vs. re-rating an item. We conclude with a proposal for RS to improve the quality of their user data and hence their accuracy: asking users to re-rate items might, in some circumstances, be more beneficial than asking users to rate unseen items.

Passive and Active Network Measurement | 2009

Monitoring the Bittorrent Monitors: A Bird's Eye View

Georgos Siganos; Josep M. Pujol; Pablo Rodriguez

Detecting clients with deviant behavior in the Bittorrent network is a challenging task that has not received the attention it deserves. Typically, this question is seen as not politically correct, since it is associated with the controversial issue of detecting agencies that monitor Bittorrent for copyright infringement. However, deviant behavior detection and its associated blacklists might prove crucial for the well-being of Bittorrent, as there are other deviant entities in Bittorrent besides monitors. Our goal is to provide some initial heuristics that can be used to automatically detect deviant clients. We analyze the top 600 torrents of Pirate Bay over 45 days. We show that the empirical observation of Bittorrent clients can be used to detect deviant behavior, and consequently, it is possible to automatically build dynamic blacklists.

PLOS Computational Biology | 2010

Informing optimal environmental influenza interventions: how the host, agent, and environment alter dominant routes of transmission.

Ian H. Spicknall; James S. Koopman; Mark Nicas; Josep M. Pujol; Sheng Li; Joseph N. S. Eisenberg

Influenza can be transmitted through respirable (small airborne particles), inspirable (intermediate size), direct-droplet-spray, and contact modes. How these modes are affected by features of the virus strain (infectivity, survivability, transferability, or shedding profiles), host population (behavior, susceptibility, or shedding profiles), and environment (host density, surface area to volume ratios, or host movement patterns) has only recently come under investigation. A discrete-event, continuous-time, stochastic transmission model was constructed to analyze the environmental processes through which a virus passes from one person to another via different transmission modes, and to explore which factors increase or decrease different modes of transmission. With the exception of the inspiratory route, each route on its own can cause high transmission in isolation from other modes. Mode-specific transmission was highly sensitive to parameter values. For example, droplet and respirable transmission usually required high host density, while the contact route had no such requirement. Depending on the specific context, one or more modes may be sufficient to cause high transmission, while in other contexts no transmission may result. Because of this, when making intervention decisions that involve blocking environmental pathways, generic recommendations applied indiscriminately may be ineffective; instead, intervention choice should be contextualized, depending on the specific features of people, virus strain, or venue in question.

IEEE/ACM Transactions on Networking | 2012

The little engine(s) that could: scaling online social networks

Josep M. Pujol; Vijay Erramilli; Georgos Siganos; Xiaoyuan Yang; Nikolaos Laoutaris; Parminder Chhabra; Pablo Rodriguez

The difficulty of partitioning social graphs has introduced new system design challenges for scaling of online social networks (OSNs). Vertical scaling by resorting to full replication can be a costly proposition. Scaling horizontally by partitioning and distributing data among multiple servers using, e.g., distributed hash tables (DHTs), can suffer from expensive interserver communication. Such challenges have often caused costly rearchitecting efforts for popular OSNs like Twitter and Facebook. We design, implement, and evaluate SPAR, a Social Partitioning and Replication middleware that mediates transparently between the application and the database layer of an OSN. SPAR leverages the underlying social graph structure in order to minimize the required replication overhead for ensuring that users have their neighbors' data colocated on the same machine. The gains from this are multifold: Application developers can assume local semantics, i.e., develop as they would for a single machine; scalability is achieved by adding commodity machines with low memory and network I/O requirements; and N+K redundancy is achieved at a fraction of the cost. We provide a complete system design, extensive evaluation based on datasets from Twitter, Orkut, and Facebook, and a working implementation. We show that SPAR incurs minimum overhead, can help a well-known Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, a popular key-value store database.
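To make the notion of replication overhead concrete, the sketch below counts the replicas a given assignment of users to servers would need so that every user finds all neighbors' data on the server holding that user's own master copy. It only measures the overhead of a fixed assignment; it is not SPAR's partitioning algorithm, and the function name, toy graph, and both assignments are hypothetical.

```python
def replicas_needed(edges, master):
    """Return the set of (user, server) replicas required for local semantics.

    edges:  iterable of undirected friendships (u, v)
    master: dict mapping each user to the server holding its master copy
    """
    replicas = set()
    for u, v in edges:
        if master[u] != master[v]:
            # Each endpoint needs a copy of the other on its own server.
            replicas.add((v, master[u]))
            replicas.add((u, master[v]))
    return replicas


# Toy social graph (illustrative only).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

# Assignment that ignores graph structure, e.g. hash-like placement.
scattered = {"a": 0, "b": 1, "c": 0, "d": 1}
# Assignment that keeps the tightly connected users a, b, c together.
clustered = {"a": 0, "b": 0, "c": 0, "d": 1}

print(len(replicas_needed(edges, scattered)))
print(len(replicas_needed(edges, clustered)))
```

On this toy graph the structure-aware assignment needs fewer replicas than the scattered one, which is the kind of saving the abstract attributes to exploiting social graph structure.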

Journal of the Royal Society Interface | 2011

A dynamic dose-response model to account for exposure patterns in risk assessment: A case study in inhalation anthrax

Bryan T. Mayer; James S. Koopman; Edward L. Ionides; Josep M. Pujol; Joseph N. S. Eisenberg

The most commonly used dose–response models implicitly assume that accumulation of dose is a time-independent process where each pathogen has a fixed risk of initiating infection. Immune particle neutralization of pathogens, however, may create strong time dependence; i.e. temporally clustered pathogens have a better chance of overwhelming the immune particles than pathogen exposures that occur at lower levels for longer periods of time. In environmental transmission systems, we expect different routes of transmission to elicit different dose–timing patterns and thus potentially different realizations of risk. We present a dose–response model that captures time dependence in a manner that incorporates the dynamics of initial immune response. We then demonstrate the parameter estimation of our model in a dose–response survival analysis using empirical time-series data of inhalational anthrax in monkeys in which we find slight dose–timing effects. Future dose–response experiments should include varying the time pattern of exposure in addition to varying the total doses delivered. Ultimately, the dynamic dose–response paradigm presented here will improve modelling of environmental transmission systems where different systems have different time patterns of exposure.
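For context, the time-independence assumption mentioned at the start of the abstract can be seen in the standard exponential (independent-action) dose-response model, written out below. This is the textbook baseline, not the dynamic model the paper proposes; the symbols $r$, $d_i$, and $D$ are notation introduced here for illustration.

```latex
% Exponential (independent-action) dose-response model.
% r   : assumed fixed probability that a single pathogen initiates infection
% d_i : dose received in exposure event i, for i = 1, ..., n
% D   : total accumulated dose
\begin{align}
P(\text{infection}) &= 1 - \prod_{i=1}^{n} (1 - r)^{d_i} \\
                    &= 1 - (1 - r)^{\sum_{i=1}^{n} d_i} \\
                    &\approx 1 - e^{-r D},
                    \qquad D = \sum_{i=1}^{n} d_i, \quad r \ll 1 .
\end{align}
```

Since the result depends only on the total dose $D$, this baseline assigns the same risk to a single large exposure and to the same dose spread over many small exposures, which is the time independence the abstract questions.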

Human Computer Interaction with Mobile Devices and Services | 2010

MobileHCI'10 workshop summary: social mobile web

Karen Church; Josep M. Pujol; Barry Smyth; Noshir Contractor

The mobile space is evolving at an astonishing rate with over 4.1 billion subscribers in existence. The world is also witnessing an explosion in social web services with more users seeking novel ways of interacting with friends and family. We are interested in the combination of these two exciting research spaces: the social web and the mobile web. We believe that the social mobile web is going to be a highly influential research area in the near future and given the huge growth that both these fields have experienced in recent times we feel that now is an excellent time to discuss this nascent research space. This workshop continues the successful social mobile web workshop held as part of SocialCom in 2009. The workshop explores the current state of the social mobile web and combines technical presentations, demos and position papers to drive interaction and discussion among participants.