Alessandro Epasto
Publication
Featured research published by Alessandro Epasto.
ieee symposium on security and privacy | 2013
Lorenzo Alvisi; Allen Clement; Alessandro Epasto; Silvio Lattanzi; Alessandro Panconesi
Sybil attacks, in which an adversary forges a potentially unbounded number of identities, are a danger to distributed systems and online social networks. The goal of sybil defense is to accurately identify sybil identities. This paper surveys the evolution of sybil defense protocols that leverage the structural properties of the social graph underlying a distributed system to identify sybil identities. We make two main contributions. First, we clarify the deep connection between sybil defense and the theory of random walks. This leads us to identify a community detection algorithm that, for the first time, offers provable guarantees in the context of sybil defense. Second, we advocate a new goal for sybil defense that addresses the more limited, but practically useful, goal of securely white-listing a local region of the graph.
international world wide web conferences | 2015
Alessandro Epasto; Silvio Lattanzi; Mauro Sozio
Densest subgraph computation has emerged as an important primitive in a wide range of data analysis tasks such as community and event detection. Social media such as Facebook and Twitter are highly dynamic, with new friendship links and tweets being generated incessantly, calling for efficient algorithms that can handle very large and highly dynamic input data. While either scalable or dynamic algorithms for finding densest subgraphs have been proposed, a viable and satisfactory solution for addressing both the dynamic aspect of the input data and its large size is still missing. We study the densest subgraph problem in the dynamic graph model, for which we present the first scalable algorithm with provable guarantees. In our model, edges are added adversarially while they are removed uniformly at random from the current graph. We show that at any point in time we are able to maintain a 2(1+ε)-approximation of a current densest subgraph, while requiring O(polylog(n+r)) amortized cost per update (with high probability), where r is the total number of update operations executed and n is the maximum number of nodes in the graph. In contrast, a naive algorithm that recomputes a dense subgraph every time the graph changes requires Ω(m) work per update, where m is the number of edges in the current graph. Our theoretical analysis is complemented with an extensive experimental evaluation on large real-world graphs showing that (approximate) densest subgraphs can be maintained efficiently within hundreds of microseconds per update.
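For orientation, the static version of the problem can be sketched as follows. This is the classic greedy peeling heuristic, which gives a 2-approximation of the densest subgraph when density is defined as edges divided by nodes. It is an illustrative recompute-from-scratch baseline of the kind the abstract contrasts against, not the dynamic algorithm of the paper, and the function name is hypothetical.

```python
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling: repeatedly drop a minimum-degree node and remember
    the densest intermediate subgraph (density = #edges / #nodes)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_density, best_nodes = 0.0, set()

    while nodes:
        density = m / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)
        # A linear scan keeps the sketch short; a bucket queue would make
        # the whole procedure run in time linear in the graph size.
        u = min(nodes, key=lambda x: len(adj[x]))
        for w in adj[u]:
            adj[w].discard(u)
        m -= len(adj[u])
        del adj[u]
        nodes.remove(u)

    return best_density, best_nodes


# A 4-clique with a short tail attached: the clique is the densest part.
example = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
density, members = greedy_densest_subgraph(example)
```

Rerunning this after every edge update costs at least time proportional to the number of edges, which is the per-update cost the dynamic algorithm is designed to avoid.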
knowledge discovery and data mining | 2016
Lorenzo De Stefani; Alessandro Epasto; Matteo Riondato; Eli Upfal
We present TRIEST, a suite of one-pass streaming algorithms to compute unbiased, low-variance, high-quality approximations of the global and local (i.e., incident to each vertex) number of triangles in a fully-dynamic graph represented as an adversarial stream of edge insertions and deletions. Our algorithms use reservoir sampling and its variants to exploit the user-specified memory space at all times. This is in contrast with previous approaches, which require hard-to-choose parameters (e.g., a fixed sampling probability) and offer no guarantees on the amount of memory they use. We analyze the variance of the estimations and show novel concentration bounds for these quantities. Our experimental results on very large graphs demonstrate that TRIEST outperforms state-of-the-art approaches in accuracy and exhibits a small update time.
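The reservoir-sampling idea can be illustrated with a simplified, insertion-only estimator of the global triangle count. This is a sketch in the spirit of the abstract, not the authors' implementation: the class name is hypothetical, edge deletions and local counts are omitted, and the stream is assumed to contain no self-loops or repeated edges.

```python
import random
from collections import defaultdict


class ReservoirTriangleCounter:
    """Keeps at most `memory` edges and estimates the global triangle count."""

    def __init__(self, memory, seed=None):
        if memory < 3:
            raise ValueError("memory must be at least 3 edges")
        self.memory = memory
        self.rng = random.Random(seed)
        self.sample = []               # edges currently in the reservoir
        self.adj = defaultdict(set)    # adjacency of the sampled graph
        self.t = 0                     # number of edges seen so far
        self.tau = 0                   # triangles inside the sampled graph

    def _count(self, u, v, sign):
        # Every common neighbour of u and v closes one triangle with (u, v).
        self.tau += sign * len(self.adj[u] & self.adj[v])

    def add_edge(self, u, v):
        self.t += 1
        if self.t > self.memory:
            # Standard reservoir step: keep the new edge with prob. memory / t.
            if self.rng.random() >= self.memory / self.t:
                return
            i = self.rng.randrange(len(self.sample))
            a, b = self.sample[i]
            self.sample[i] = self.sample[-1]
            self.sample.pop()
            self.adj[a].discard(b)
            self.adj[b].discard(a)
            self._count(a, b, -1)
        self._count(u, v, +1)
        self.sample.append((u, v))
        self.adj[u].add(v)
        self.adj[v].add(u)

    def estimate(self):
        # Rescale by the inverse probability that all three edges of a
        # triangle are in the reservoir at time t.
        m, t = self.memory, self.t
        scale = max(1.0, t * (t - 1) * (t - 2) / (m * (m - 1) * (m - 2)))
        return scale * self.tau


counter = ReservoirTriangleCounter(memory=10, seed=0)
for edge in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]:
    counter.add_edge(*edge)
```

The only parameter is the memory budget, which matches the abstract's point: the user fixes the space, and the sampling probability adapts as the stream grows. When the stream fits in memory the estimate equals the exact count.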
knowledge discovery and data mining | 2015
Flavio Chierichetti; Alessandro Epasto; Ravi Kumar; Silvio Lattanzi; Vahab S. Mirrokni
We introduce the public-private model of graphs. In this model, we have a public graph and each node in the public graph has an associated private graph. The motivation for studying this model stems from social networks, where the nodes are the users, the public graph is visible to everyone, and the private graph at each node is visible only to the user at the node. From each node's viewpoint, the graph is just a union of its private graph and the public graph. We consider the problem of efficiently computing various properties of the graphs from each node's point of view, with a minimal amount of recomputation on the public graph. To illustrate the richness of our model, we explore two powerful computational paradigms for studying large graphs, namely, sketching and sampling, and focus on some key problems in social networks and show efficient algorithms in the public-private graph model. In the sketching model, we show how to efficiently approximate the neighborhood function, which in turn can be used to approximate various notions of centrality. In the sampling model, we focus on all-pair shortest path distances, node similarities, and correlation clustering.
conference on online social networks | 2014
Paweł Brach; Alessandro Epasto; Alessandro Panconesi; Piotr Sankowski
In this paper we tackle the following question: is it possible to predict the characteristics of the evolution of an epidemic process in a social network on the basis of the degree distribution alone? We answer this question affirmatively for several diffusion processes (Push-Pull, Broadcast, and SIR) by showing that it is possible to predict with good accuracy their average evolution. We do this by developing a space-efficient predictor that makes it possible to handle very large networks with very limited computational resources. Our experiments show that the prediction is surprisingly good for many instances of real-world networks. The class of real-world networks for which this happens can be characterized in terms of their neighbourhood function, which turns out to be similar to that of random networks. Finally, we analyse real instances of rumour spreading in Twitter and observe that our model describes qualitatively well their evolution.
international world wide web conferences | 2017
Alessandro Epasto; Silvio Lattanzi; Sergei Vassilvitskii; Morteza Zadimoghaddam
Maximizing submodular functions under cardinality constraints lies at the core of numerous data mining and machine learning applications, including data diversification, data summarization, and coverage problems. In this work, we study this question in the context of data streams, where elements arrive one at a time, and we want to design low-memory and fast update-time algorithms that maintain a good solution. Specifically, we focus on the sliding window model, where we are asked to maintain a solution that considers only the last W items. In this context, we provide the first non-trivial algorithm that maintains a provable approximation of the optimum using space sublinear in the size of the window. In particular, we give a (1/3 - ε)-approximation algorithm that uses space polylogarithmic in the spread of the values of the elements, δ, and linear in the solution size k for any constant ε > 0. At the same time, processing each element only requires a polylogarithmic number of evaluations of the function itself. When a better approximation is desired, we show a different algorithm that, at the cost of using more memory, provides a (1/2 - ε)-approximation, and allows a tunable trade-off between average update time and space. This algorithm matches the best known approximation guarantees for submodular optimization in insertion-only streams, a less general formulation of the problem. We demonstrate the efficacy of the algorithms on a number of real-world datasets, showing that their practical performance far exceeds the theoretical bounds. The algorithms preserve high-quality solutions in streams with millions of items, while storing a negligible fraction of them.
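To make the underlying optimization problem concrete, here is a minimal sketch of the classic offline greedy algorithm for one submodular objective, set coverage, under a cardinality constraint k. It sees all elements at once and is therefore not the sliding-window algorithm of the paper; the function name and the toy data are assumptions chosen for illustration.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k named sets, each time taking the one that covers the
    most not-yet-covered elements (largest marginal gain)."""
    chosen, covered = [], set()
    for _ in range(min(k, len(sets))):
        best = max(
            (name for name in sets if name not in chosen),
            key=lambda name: len(sets[name] - covered),
            default=None,
        )
        # Stop early once no remaining set adds anything new.
        if best is None or not (sets[best] - covered):
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


candidates = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1},
}
picked, covered = greedy_max_coverage(candidates, k=2)
```

The offline greedy needs every candidate in memory and re-evaluates all of them for each pick, which is exactly what a streaming or sliding-window algorithm cannot afford.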
international world wide web conferences | 2014
Alessandro Epasto; Jon Feldman; Silvio Lattanzi; Stefano Leonardi; Vahab S. Mirrokni
We study the problem of computing similarity rankings in large-scale multi-categorical bipartite graphs, where the two sides of the graph represent actors and items, and the items are partitioned into an arbitrary set of categories. The problem has several real-world applications, including identifying competing advertisers and suggesting related queries in an online advertising system or finding users with similar interests and suggesting content to them. In these settings, we are interested in computing on-the-fly rankings of similar actors, given an actor and an arbitrary subset of categories of interest. Two main challenges arise: First, the bipartite graphs are huge and often lopsided (e.g. the system might receive billions of queries while presenting only millions of advertisers). Second, the sheer number of possible combinations of categories prevents the pre-computation of the results for all of them. We present a novel algorithmic framework that addresses both issues for the computation of several graph-theoretical similarity measures, including the number of common neighbors and Personalized PageRank. We show how to tackle the imbalance in the graphs to speed up the computation and provide efficient real-time algorithms for computing rankings for an arbitrary subset of categories. Finally, we show experimentally the accuracy of our approach with real-world data, using both public graphs and a very large dataset from Google AdWords.
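The query itself can be stated in a few lines. The brute-force sketch below ranks actors by the number of common neighbors with a query actor, counting only items in the requested categories. It illustrates the problem definition only and does not reflect the paper's framework; the function name, data layout, and toy data are assumptions.

```python
def rank_similar_actors(actor_items, item_category, query_actor, categories):
    """Rank other actors by how many items they share with `query_actor`,
    restricted to items whose category is in `categories`."""
    allowed = {
        item
        for item in actor_items[query_actor]
        if item_category[item] in categories
    }
    scores = {}
    for actor, items in actor_items.items():
        if actor == query_actor:
            continue
        common = len(allowed & items)
        if common:
            scores[actor] = common
    # Highest count first; ties broken by actor name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


actor_items = {
    "adv1": {"shoes", "boots", "laptop"},
    "adv2": {"shoes", "boots"},
    "adv3": {"laptop", "phone"},
    "adv4": {"boots", "phone"},
}
item_category = {
    "shoes": "apparel",
    "boots": "apparel",
    "laptop": "electronics",
    "phone": "electronics",
}
ranking = rank_similar_actors(actor_items, item_category, "adv1", {"apparel"})
```

Scanning every actor per query is what becomes infeasible at the scale the abstract describes, and precomputing the answer for every subset of categories is ruled out by the number of subsets.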
Internet Mathematics | 2014
Lorenzo Alvisi; Allen Clement; Alessandro Epasto; Silvio Lattanzi; Alessandro Panconesi
Sybil attacks, in which an adversary forges a potentially unbounded number of identities, are a danger to distributed systems and online social networks. The goal of sybil defense is to accurately identify sybil identities. This article surveys the evolution of sybil defense protocols that leverage the structural properties of the social graph underlying a distributed system to identify sybil identities. We make two main contributions. First, we clarify the deep connection between sybil defense and the theory of random walks. This leads us to identify a community detection algorithm that, for the first time, offers provable guarantees in the context of sybil defense. Second, we advocate a new goal for sybil defense that addresses the more limited, but practically useful, goal of securely white-listing a local region of the graph.
international world wide web conferences | 2017
David Stück; Haraldur Tómas Hallgrímsson; Greg Ver Steeg; Alessandro Epasto; Luca Foschini
Many behaviors that lead to worsened health outcomes are modifiable, social, and visible. Social influence thus has the potential to foster adoption of habits that promote health and improve disease management. In this study, we consider the evolution of the physical activity of 44.5 thousand Fitbit users as they interact on the Fitbit social network, in relation to their health status. The users collectively recorded 9.3 million days of steps over the period of a year through a Fitbit device. 7,515 of the users also self-reported whether they were diagnosed with a major chronic condition. A time-aggregated analysis shows that ego net size, average alter physical activity, gender, and body mass index (BMI) are significantly predictive of ego physical activity. For users who self-reported chronic conditions, the direction and effect size of associations varied depending on the condition, with diabetic users specifically showing almost a 6-fold increase in additional daily steps for each additional social tie. Subsequently, we consider the co-evolution of activity and friendship longitudinally on a month-by-month basis. We show that the fluctuations in average alter activity significantly predict fluctuations in ego activity. By leveraging a class of novel non-parametric statistical tests, we investigate the causal factors in these fluctuations. We find that under certain stationarity assumptions, non-null causal dependence exists between ego and alters' activity, even in the presence of unobserved stationary individual traits. We believe that our findings provide evidence that the study of online social networks has the potential to improve our understanding of factors affecting adoption of positive habits, especially in the context of chronic condition management.
acm symposium on parallel algorithms and architectures | 2017
Alessandro Epasto; Vahab S. Mirrokni; Morteza Zadimoghaddam
We study the problem of efficiently optimizing submodular functions under cardinality constraints in a distributed setting. Recently, several distributed algorithms for this problem have been introduced which either achieve a sub-optimal solution or run in a super-constant number of rounds of computation. Unlike previous work, we aim to design distributed algorithms in multiple rounds with almost optimal approximation guarantees at the cost of outputting a larger number of elements. Toward this goal, we present a distributed algorithm that, for any ε > 0 and any constant r, outputs a set S of O(rk/ε^(1/r)) items in r rounds, and achieves a (1-ε)-approximation of the value of the optimum set with k items. This is the first distributed algorithm that achieves an approximation factor of (1-ε) running in fewer than log(1/ε) rounds. We also prove a hardness result showing that the output of any (1-ε)-approximation distributed algorithm limited to one distributed round must have at least Ω(k/ε) items. In light of this hardness result, our distributed algorithm in one round, r = 1, is asymptotically tight in terms of the output size. We support the theoretical guarantees with an extensive empirical study of our algorithm showing that achieving almost optimum solutions is indeed possible in a few rounds for large-scale real datasets.