Michael Hamann
Karlsruhe Institute of Technology
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Michael Hamann.
advances in social networks analysis and mining | 2015
Gerd Lindner; Christian L. Staudt; Michael Hamann; Henning Meyerhenke; Dorothea Wagner
Sparsification reduces the size of networks while preserving structural and statistical properties of interest. Various sparsifying algorithms have been proposed in different contexts. We contribute the first systematic conceptual and experimental comparison of edge sparsification methods on a diverse set of network properties. It is shown that they can be understood as methods for rating edges by importance and then filtering globally by these scores. In addition, we propose a new sparsification method (Local Degree) which preserves edges leading to local hub nodes. All methods are evaluated on a set of 100 Facebook social networks with respect to network properties including diameter, connected components, community structure, and multiple node centrality measures. Experiments with our implementations of the sparsification methods (using the open-source network analysis tool suite NetworKit) show that many network properties can be preserved down to about 20% of the original set of edges. Furthermore, the experimental results allow us to differentiate the behavior of different methods and show which method is suitable with respect to which property. Our Local Degree method is fast enough for large-scale networks and performs well across a wider range of properties than previously proposed methods.
arXiv: Data Structures and Algorithms | 2018
Michael Hamann; Ben Strasser
We introduce FlowCutter, a novel algorithm to compute a set of edge cuts or node separators that optimize cut size and balance in the Pareto sense. Our core algorithm heuristically solves the balanced connected st-edge-cut problem, where two given nodes s and t must be separated by removing edges to obtain two connected parts. Using the core algorithm as a subroutine, we build variants that compute node separators that are independent of s and t. From the computed Pareto set, we can identify cuts with a particularly good tradeoff between cut size and balance that can be used to compute contraction and minimum fill-in orders, which can be used in Customizable Contraction Hierarchies (CCHs), a speed-up technique for shortest-path computations. Our core algorithm runs in O(c∣E∣) time, where E is the set of edges and c is the size of the largest outputted cut. This makes it well suited for separating large graphs with small cuts, such as road graphs, which is the primary application motivating our research. For road graphs, we present an extensive experimental study demonstrating that FlowCutter outperforms the current state of the art in terms of both cut sizes and CCH performance. By evaluating FlowCutter on a standard graph partitioning benchmark, we further show that FlowCutter also finds small, balanced cuts on nonroad graphs. Another application is the computation of small tree decompositions. To evaluate the quality of our algorithm in this context, we entered the PACE 2016 challenge [13] and won first place in the corresponding sequential competition track. We can therefore conclude that our FlowCutter algorithm finds small, balanced cuts on a wide variety of graphs.
Studies in computational intelligence | 2016
Christian L. Staudt; Michael Hamann; Ilya Safro; Alexander Gutfraind; Henning Meyerhenke
Research on generative models plays a central role in the emerging field of network science, studying how statistical patterns found in real networks can be generated by formal rules. During the last two decades, a variety of models has been proposed with an ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how models can be fitted to an original network to produce a structurally similar replica, and (c) aim for producing much larger networks than the original exemplar. In a comparative experimental study, we find ReCoN often superior to many other stateof- the-art network generation methods. Our design yields a scalable and effective tool for replicating a given network while preserving important properties at both microand macroscopic scales and (optionally) scaling the replica by orders of magnitude in size. We recommend ReCoN as a general practical method for creating realistic test data for the engineering of computational methods on networks, verification, and simulation studies. We provide scalable open-source implementations of most studied methods, including ReCoN.
algorithm engineering and experimentation | 2017
Michael Hamann; Ulrich Meyer; Manuel Penschuck; Dorothea Wagner
LFR is a popular benchmark graph generator used to evaluate community detection algorithms. We present EM-LFR, the first external memory algorithm able to generate massive complex networks following the LFR benchmark. Its most expensive component is the generation of random graphs with prescribed degree sequences which can be divided into two steps: the graphs are first materialized deterministically using the Havel-Hakimi algorithm, and then randomized. Our main contributions are EM-HH and EM-ES, two I/Oefficient external memory algorithms for these two steps. We also propose EM-CM/ES, an alternative sampling scheme using the Configuration Model and rewiring steps to obtain a random simple graph. In an experimental evaluation we demonstrate their performance; our implementation is able to handle graphs with more than 37 billion edges on a single machine, is competitive with a massive parallel distributed algorithm, and is faster than a state-of-theart internal memory implementation even on instances fitting in main memory. EM-LFR’s implementation is capable of generating large graph instances orders of magnitude faster than the original implementation. We give evidence that both implementations yield graphs with matching properties by applying clustering algorithms to generated instances. Similarly, we analyse the evolution of graph properties as EM-ES is executed on networks obtained with EM-CM/ES and find that the alternative approach can accelerate the sampling process. ∗This work was partially supported by the DFG under grants ME 2088/3-2, WA 654/22-2. Parts of this paper were published as [21]. 1 ar X iv :1 60 4. 08 73 8v 3 [ cs .D S] 1 4 Ju n 20 17
european symposium on algorithms | 2015
Ulrik Brandes; Michael Hamann; Ben Strasser; Dorothea Wagner
We introduce Quasi-Threshold Mover (QTM), an algorithm to solve the quasi-threshold (also called trivially perfect) graph editing problem with a minimum number of edge insertions and deletions. Given a graph it computes a quasi-threshold graph which is close in terms of edit count, but not necessarily closest as this edit problem is NP-hard. We present an extensive experimental study, in which we show that QTM performs well in practice and is the first heuristic that is able to scale to large real-world graphs in practice. As a side result we further present a simple linear-time algorithm for the quasi-threshold recognition problem.
Bioinformatics | 2018
Rudolf Biczok; Peter Bozsoky; Peter Eisenmann; Johannes Ernst; Tobias Ribizel; Fedor Scholz; Axel Trefzer; Florian Weber; Michael Hamann; Alexandros Stamatakis
Abstract Motivation The presence of terraces in phylogenetic tree space, i.e. a potentially large number of distinct tree topologies that have exactly the same analytical likelihood score, was first described by Sanderson et al. However, popular software tools for maximum likelihood and Bayesian phylogenetic inference do not yet routinely report, if inferred phylogenies reside on a terrace, or not. We believe, this is due to the lack of an efficient library to (i) determine if a tree resides on a terrace, (ii) calculate how many trees reside on a terrace and (iii) enumerate all trees on a terrace. Results In our bioinformatics practical that is set up as a programming contest we developed two efficient and independent C++ implementations of the SUPERB algorithm by Constantinescu and Sankoff (1995) for counting and enumerating trees on a terrace. Both implementations yield exactly the same results, are more than one order of magnitude faster, and require one order of magnitude less memory than a previous thirrd party python implementation. Availability and implementation The source codes are available under GNU GPL at https://github.com/terraphast. Supplementary information Supplementary data are available at Bioinformatics online.
arXiv: Social and Information Networks | 2017
Christian L. Staudt; Michael Hamann; Alexander Gutfraind; Ilya Safro; Henning Meyerhenke
Research on generative models plays a central role in the emerging field of network science, studying how statistical patterns found in real networks could be generated by formal rules. Output from these generative models is then the basis for designing and evaluating computational methods on networks including verification and simulation studies. During the last two decades, a variety of models has been proposed with an ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica, (c) use ReCoN to produce networks much larger than the original exemplar, and finally (d) discuss open problems and promising research directions. In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods. We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.
european conference on parallel processing | 2018
Michael Hamann; Ben Strasser; Dorothea Wagner; Tim Zeitz
We study large-scale, distributed graph clustering. Given an undirected, weighted graph, our objective is to partition the nodes into disjoint sets called clusters. Each cluster should contain many internal edges. Further, there should only be few edges between clusters. We study two established formalizations of this internally-dense-externally-sparse principle: modularity and map equation. We present two versions of a simple distributed algorithm to optimize both measures. They are based on Thrill, a distributed big data processing framework that implements an extended MapReduce model. The algorithms for the two measures, DSLM-Mod and DSLM-Map, differ only slightly. Adapting them for similar quality measures is easy. In an extensive experimental study, we demonstrate the excellent performance of our algorithms on real-world and synthetic graph clustering benchmark graphs.
Journal of Experimental Algorithmics | 2018
Michael Hamann; Ulrich Meyer; Manuel Penschuck; Hung Tran; Dorothea Wagner
LFR is a popular benchmark graph generator used to evaluate community detection algorithms. We present EM-LFR, the first external memory algorithm able to generate massive complex networks following the LFR benchmark. Its most expensive component is the generation of random graphs with prescribed degree sequences which can be divided into two steps: the graphs are first materialized deterministically using the Havel-Hakimi algorithm, and then randomized. Our main contributions are EM-HH and EM-ES, two I/O-efficient external memory algorithms for these two steps. We also propose EM-CM/ES, an alternative sampling scheme using the Configuration Model and rewiring steps to obtain a random simple graph. In an experimental evaluation, we demonstrate their performance; our implementation is able to handle graphs with more than 37 billion edges on a single machine, is competitive with a massively parallel distributed algorithm, and is faster than a state-of-the-art internal memory implementation even on instances fitting in main memory. EM-LFR’s implementation is capable of generating large graph instances orders of magnitude faster than the original implementation. We give evidence that both implementations yield graphs with matching properties by applying clustering algorithms to generated instances. Similarly, we analyze the evolution of graph properties as EM-ES is executed on networks obtained with EM-CM/ES and find that the alternative approach can accelerate the sampling process.
Algorithms | 2017
Michael Hamann; Eike Röhrs; Dorothea Wagner
Community detection aims to find dense subgraphs in a network. We consider the problem of finding a community locally around a seed node both in unweighted and weighted networks. This is a faster alternative to algorithms that detect communities that cover the whole network when actually only a single community is required. Further, many overlapping community detection algorithms use local community detection algorithms as basic building block. We provide a broad comparison of different existing strategies of expanding a seed node greedily into a community. For this, we conduct an extensive experimental evaluation both on synthetic benchmark graphs as well as real world networks. We show that results both on synthetic as well as real-world networks can be significantly improved by starting from the largest clique in the neighborhood of the seed node. Further, our experiments indicate that algorithms using scores based on triangles outperform other algorithms in most cases. We provide theoretical descriptions as well as open source implementations of all algorithms used.