Matthieu Latapy
Centre national de la recherche scientifique
Publication
Featured research published by Matthieu Latapy.
Theoretical Computer Science | 2008
Matthieu Latapy
Finding, counting and/or listing triangles (three vertices with three edges) in massive graphs are natural fundamental problems, which have recently received much attention because of their importance in complex network analysis. Here we provide a detailed survey of proposed main-memory solutions to these problems, in a unified way. We note that previous authors have paid surprisingly little attention to the space complexity of main-memory solutions, despite its fundamental and practical interest. We therefore detail space complexities of known algorithms and discuss their implications. We also present new algorithms which are time optimal for triangle listing and beat previous algorithms concerning space needs. They have the additional advantage of performing better on power-law graphs, which we also detail. We finally show with an experimental study that these two algorithms perform very well in practice, allowing us to handle cases which were previously out of reach.
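To make the triangle listing problem concrete, here is a minimal Python sketch of a common degree-ordered approach. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function name `list_triangles` and the edge-list input format are assumptions made for illustration, and vertex labels are assumed to be mutually comparable (for example, integers).

```python
from collections import defaultdict

def list_triangles(edges):
    """Return every triangle exactly once, as a sorted tuple of vertices."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Rank vertices by (degree, label); the label breaks ties deterministically.
    order = sorted(adj, key=lambda v: (len(adj[v]), v))
    rank = {v: i for i, v in enumerate(order)}

    # Orient each edge from its lower-ranked to its higher-ranked endpoint.
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    triangles = []
    for u in out:
        for v in out[u]:
            # A common out-neighbour of u and v closes a triangle.
            for w in out[u] & out[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return sorted(triangles)
```

Because each edge is oriented towards the higher-ranked endpoint, every triangle is reported only from its lowest-ranked vertex, so no deduplication step is needed.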
ACM Journal of Experimental Algorithms | 2009
Clémence Magnien; Matthieu Latapy; Michel Habib
The diameter of a graph is among its most basic parameters. In recent years, computing it for massive graphs has moreover become a key issue in the context of complex network analysis. However, known algorithms, including the ones producing approximate values, have too high a time and/or space complexity to be used in such cases. We propose here a new approach relying on very simple and fast algorithms that compute (upper and lower) bounds for the diameter. We show empirically that, on various real-world cases representative of complex networks studied in the literature, the obtained bounds are very tight (and even equal in some cases). This leads to rigorous and very accurate estimations of the actual diameter in cases which were previously intractable in practice.
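The idea of bounding the diameter with a few breadth-first searches can be sketched as follows. This is only an illustration under stated assumptions, not the paper's exact heuristics: the graph is assumed connected and undirected, given as a dict mapping each vertex to its neighbours, and the names `bfs_distances` and `diameter_bounds` are hypothetical. It relies on two standard facts: the eccentricity of any vertex is a lower bound on the diameter, and twice the eccentricity of any vertex is an upper bound.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter_bounds(adj, start):
    """Return (lower, upper) bounds on the diameter using two BFS runs."""
    d_start = bfs_distances(adj, start)
    ecc_start = max(d_start.values())

    # Second sweep from a vertex farthest from the start.
    far = max(d_start, key=d_start.get)
    ecc_far = max(bfs_distances(adj, far).values())

    lower = max(ecc_start, ecc_far)      # any eccentricity <= diameter
    upper = 2 * min(ecc_start, ecc_far)  # diameter <= 2 * any eccentricity
    return lower, upper
```

On a path of five vertices started from the middle, both bounds coincide with the true diameter; on a cycle they do not, which shows why further sweeps or tighter tree-based bounds can be useful.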
International Conference on Computer Communications | 2008
Matthieu Latapy; Clémence Magnien
Complex networks, modeled as large graphs, have received much attention in recent years. However, topological information on these networks is only available through intricate measurement procedures. Until recently, most studies assumed that these procedures eventually lead to samples large enough to be representative of the whole, at least concerning some key properties. This has a crucial impact on network modeling and simulation, which rely on these properties. Recent contributions proved that this assumption may be misleading, but no solution has been proposed. We provide here the first practical methodology to distinguish between cases where it is indeed misleading, and cases where the observed properties may be trusted. It consists of studying how the properties of interest evolve when the sample grows, and in particular whether they reach a steady state or not. In order to illustrate this method and to demonstrate its relevance, we apply it to datasets on complex network measurements that are representative of the ones commonly used. The obtained results show that the method fulfills its goals very well. We moreover identify some properties which seem easier to evaluate in practice, thus opening interesting perspectives.
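Tracking a property as the sample grows might look like the sketch below. It assumes the sample is an ordered list of distinct undirected edges, in the order they were measured, and uses average degree as the property of interest. Both choices, and the name `average_degree_evolution`, are illustrative assumptions rather than the paper's setup.

```python
def average_degree_evolution(edges, step):
    """Average degree after every `step` edges of a growing sample.

    Returns a list of (number_of_edges, average_degree) pairs.
    """
    nodes = set()
    points = []
    for count, (u, v) in enumerate(edges, start=1):
        nodes.update((u, v))
        if count % step == 0 or count == len(edges):
            # Each edge contributes 2 to the total degree.
            points.append((count, 2 * count / len(nodes)))
    return points
```

Plotting the returned pairs shows whether the value flattens out as more edges are observed or keeps drifting, in which case the current sample should not be trusted for that property.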
International Conference on Computer Communications | 2005
Jean-Loup Guillaume; Matthieu Latapy
Internet maps are generally constructed using the traceroute tool from a few sources to many destinations. It appeared recently that this exploration process gives a partial and biased view of the real topology, which leads to the idea of increasing the number of sources to improve the quality of the maps. In this paper, we present a set of experiments we have conducted to evaluate the relevance of this approach. It appears that the statistical properties of the underlying network have a strong influence on the quality of the obtained maps, which can be improved using massively distributed explorations. Conversely, we show that the exploration process induces some properties on the maps. We validate our analysis using real-world data and experiments and we discuss its implications.
Computer Networks | 2008
Fabien Viger; Brice Augustin; Xavier Cuvellier; Clémence Magnien; Matthieu Latapy; Timur Friedman; Renata Teixeira
Traceroute is widely used, from the diagnosis of network problems to the assemblage of internet maps. Unfortunately, there are a number of problems with traceroute methodology, which lead to the inference of erroneous routes. This paper studies particular structures arising in nearly all traceroute measurements. We characterize them as loops, cycles, and diamonds. We identify load balancing as a possible cause for the appearance of false loops, cycles, and diamonds, i.e., artifacts that do not represent the internet topology. We provide a new publicly available traceroute, called Paris traceroute, which, by controlling the packet header contents, provides a truer picture of the actual routes that packets follow. We performed measurements, from the perspective of a single source tracing towards multiple destinations, and Paris traceroute allowed us to show that many of the particular structures we observe are indeed traceroute measurement artifacts.
International Parallel and Distributed Processing Symposium | 2009
Frédéric Aidouni; Matthieu Latapy; Clémence Magnien
This paper presents a capture of the queries managed by an eDonkey server during almost 10 weeks, leading to the observation of almost 9 billion messages involving almost 90 million users and more than 275 million distinct files. Acquisition and management of such data raise several challenges, which we discuss as well as the solutions we developed. We obtain a very rich dataset, orders of magnitude larger than previously available ones, which we provide for public use. We finally present a basic analysis of the obtained data, which already gives evidence of non-trivial features.
Theoretical Computer Science | 2011
Pascal Pons; Matthieu Latapy
Dense sub-graphs of sparse graphs (communities), which appear in most real-world complex networks, play an important role in many contexts. Most existing community detection algorithms produce a hierarchical structure of communities and seek a partition into communities that optimizes a given quality function. We propose new methods to improve the results of any of these algorithms. First we show how to optimize a general class of additive quality functions (containing the modularity, the performance, and a new similarity based quality function which we propose) over a larger set of partitions than the classical methods. Moreover, we define new multi-scale quality functions which make it possible to detect different scales at which meaningful community structures appear, while classical approaches find only one partition.
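As a concrete example of an additive quality function, the sketch below computes the standard modularity of a partition as a sum of one term per community, e_c / m - (d_c / 2m)^2, where e_c is the number of edges inside community c, d_c the total degree of its vertices, and m the number of edges. It assumes a simple undirected graph with at least one edge and no self-loops. The function name `modularity` and the `community_of` mapping are illustrative choices; the optimization and multi-scale methods of the paper are not reproduced here.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity of a partition, summed community by community."""
    m = len(edges)
    internal = defaultdict(int)    # e_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: sum of degrees of vertices in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )
```

For two triangles joined by a single edge and split into their two natural communities, this evaluates to 5/14, while putting every vertex in one community gives 0.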
Conference on Computer Communications Workshops | 2011
Oussama Allali; Clémence Magnien; Matthieu Latapy
Many real-world complex networks, like client-product or file-provider relations, have a bipartite nature and evolve over time. Predicting links that will appear in them is one of the main approaches to understanding their dynamics. Only a few works address the bipartite case, though, despite its high practical interest and the specific challenges it raises. We define in this paper the notion of internal links in bipartite graphs and propose a link prediction method based on them. We describe the method and experimentally compare it to a basic collaborative filtering approach. We present results obtained for two typical practical cases. We reach the conclusion that our method performs very well, and that internal links play an important role in bipartite graphs and their dynamics.
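The abstract does not spell out what an internal link is. The sketch below assumes one plausible reading: a candidate link between a bottom node u and a top node t is internal if adding it leaves the projection onto the bottom nodes unchanged, that is, u already shares a top neighbour with every bottom node attached to t. This definition, the `tops_of` data layout, and the name `is_internal_pair` are assumptions for illustration, and the prediction method itself is not reproduced.

```python
def is_internal_pair(tops_of, u, t):
    """Check whether adding link (u, t) would leave the bottom projection unchanged.

    tops_of maps each bottom node to the set of top nodes it is linked to.
    """
    if t in tops_of[u]:
        return False  # the link already exists, so it is not a candidate

    # Bottom nodes currently attached to t; adding (u, t) would connect
    # u to each of them in the projection.
    attached = [b for b, tops in tops_of.items() if t in tops and b != u]

    # The projection is unchanged if u already shares a top with all of them.
    return all(tops_of[u] & tops_of[b] for b in attached)
```

For instance, with clients as bottom nodes and products as top nodes, a client-product pair is internal under this reading when the client already has a product in common with every current buyer of that product.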
Information Processing and Management | 2013
Matthieu Latapy; Clémence Magnien; Raphaël Fournier
Increasing knowledge of paedophile activity in P2P systems is a crucial societal concern, with important consequences on child protection, policy making, and internet regulation. Because of a lack of traces of P2P exchanges and rigorous analysis methodology, however, current knowledge of this activity remains very limited. We consider here a widely used P2P system, eDonkey, and focus on two key statistics: the fraction of paedophile queries entered in the system and the fraction of users who entered such queries. We collect hundreds of millions of keyword-based queries; we design a paedophile query detection tool for which we establish false positive and false negative rates using assessment by experts; with this tool and these rates, we then estimate the fraction of paedophile queries in our data; finally, we design and apply methods for quantifying users who entered such queries. We conclude that approximately 0.25% of queries are paedophile, and that more than 0.2% of users enter such queries. These statistics are by far the most precise and reliable ever obtained in this domain.
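Estimating a true fraction from detector output and known error rates can be illustrated with a standard correction. The sketch assumes the false positive rate is the probability that a negative item is flagged and the false negative rate is the probability that a positive item is missed; the paper may define its rates differently, and the function name `corrected_fraction` is hypothetical. Under these assumptions the observed flagged fraction is p(1 - fn) + (1 - p)fp, which can be solved for p.

```python
def corrected_fraction(observed, false_positive_rate, false_negative_rate):
    """Estimate the true positive fraction from the observed flagged fraction.

    Solves observed = p * (1 - fn) + (1 - p) * fp for p.
    """
    denominator = 1.0 - false_positive_rate - false_negative_rate
    if denominator <= 0:
        raise ValueError("error rates too high for the correction to apply")
    estimate = (observed - false_positive_rate) / denominator
    # Sampling noise can push the raw estimate slightly outside [0, 1].
    return min(1.0, max(0.0, estimate))
```

The correction matters most when the true fraction is small, since even a low false positive rate can then account for a large share of the flagged items.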
Computer Networks | 2007
Jérémie Leguay; Matthieu Latapy; Timur Friedman; Kavé Salamatian
This contribution deals with actual routes followed by packets in the Internet at the IP level. We first propose a set of statistical properties to analyse such routes. We then use the results to suggest and evaluate methods for generating artificial routes suitable for simulation purposes. The proposed approach also leads to insight into various network models. The present work is based on large data sets provided mainly by CAIDA's skitter infrastructure.