Distributed Evolutionary k-way Node Separators

Abstract

Computing high quality node separators in large graphs is necessary for a variety of applications, ranging from divide-and-conquer algorithms to VLSI design. In this work, we present a novel distributed evolutionary algorithm tackling the k-way node separator problem. A key component of our contribution includes new k-way local search algorithms based on maximum flows. We combine our local search with a multilevel approach to compute an initial population for our evolutionary algorithm, and further show how to modify the coarsening stage of our multilevel algorithm to create effective combine and mutation operations. Lastly, we combine these techniques with a scalable communication protocol, producing a system that is able to compute high quality solutions in a short amount of time. Our experiments against competing algorithms show that our advanced evolutionary algorithm computes the best result on 94% of the chosen benchmark instances.


Peter Sanders
Karlsruhe Institute of Technology, Karlsruhe, Germany

Christian Schulz
Karlsruhe Institute of Technology, Karlsruhe, Germany and University of Vienna, Vienna

Darren Strash
Colgate University, Hamilton

Robert Williger
Karlsruhe Institute of Technology, Karlsruhe, Germany

KEYWORDS

graph partitioning, node separators, max-flow min-cut

Given a graph $G = (V, E)$, the k-way node separator problem is to find a small separator $S \subset V$ and k disjoint subsets of V, $V_1, \ldots, V_k$, called blocks, such that no edges exist between two different blocks $V_i$ and $V_j$ ($i \neq j$) and $V = \bigcup_i V_i \cup S$. The objective is to minimize the size of the separator S or, depending on the application, the cumulative weight of its nodes, while the blocks $V_i$ are balanced. Note that removing the set S from the graph results in at least k connected components.

Many algorithms rely on small node separators. For example, small balanced separators are a popular tool in divide-and-conquer strategies [2, 18, 20], are useful to speed up the computation of shortest paths [7, 8, 28], and are necessary in scientific computing to compute fill-reducing orderings with nested dissection algorithms [13] or in VLSI design [2, 18].

∗ This work was partially supported by DFG grants SA 933/11-1.
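The problem definition above can be turned into a small validity check. The following is a minimal sketch, assuming nodes are numbered 0 to n-1, edges are given as pairs, and blocks and the separator are plain Python sets; the function name `is_valid_separator` is hypothetical and not part of the paper or of KaHIP. It checks only the structural conditions (disjointness, coverage, no edge between different blocks), not the balance constraint.

```python
from typing import Iterable, List, Set, Tuple


def is_valid_separator(
    n: int,
    edges: Iterable[Tuple[int, int]],
    blocks: List[Set[int]],
    separator: Set[int],
) -> bool:
    """Check that blocks V_1..V_k and separator S satisfy the definition.

    Hypothetical helper for illustration: nodes are assumed to be 0..n-1.
    """
    owner = {}
    for idx, block in enumerate(blocks):
        for v in block:
            # Blocks must be pairwise disjoint and disjoint from S.
            if v in owner or v in separator:
                return False
            owner[v] = idx
    # V must equal the union of all blocks and the separator.
    if set(owner) | set(separator) != set(range(n)):
        return False
    # No edge may connect two different blocks.
    for u, v in edges:
        if u in owner and v in owner and owner[u] != owner[v]:
            return False
    return True
```

For a path 0-1-2, the blocks {0} and {2} with separator {1} pass the check, while the blocks {0, 1} and {2} with an empty separator fail because the edge (1, 2) connects two different blocks.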

Finding a balanced node separator is NP-hard for general graphs even if the maximum node degree is three [4, 12]. Therefore, heuristic and approximation algorithms are used in practice to find small node separators. The most commonly used method to solve the node separator problem on large graphs in practice is the multilevel approach. During a coarsening phase, a multilevel algorithm reduces the graph size by iteratively contracting the nodes and edges of G until the graph is small enough to compute a node separator by some other (presumably time-consuming) algorithm. A node separator of the input graph is then constructed by iteratively uncontracting the graph, transferring the solution to this finer graph, and then applying local search algorithms to improve the solution.

Although current solvers are typically fast enough for most applications, they unfortunately produce separators of low solution quality. This may be acceptable in applications that use a separator just once; however, many applications first compute a separator as a preprocessing step and then rely on a high-quality separator for speed in later stages. This is true in VLSI design [2, 18], where even small improvements in separator size can have a large impact on computation time and production costs. High-quality node separators can also speed up shortest path queries in road networks, for example, in customizable contraction hierarchies [8], where smaller node separators yield better node orderings that are repeatedly used to answer shortest path queries. The cost for computing one high-quality node separator is then amortized over many shortest path queries. Hence, our focus is on solution quality in this work.

The main contribution of this paper is a technique that integrates an evolutionary search algorithm with a novel multilevel k-node separator algorithm and its scalable parallelization.
We present novel mutation and combine operators for the problem which are based on the multilevel scheme. Due to the coarse-grained parallelization, our system is able to compute separators that have high quality within a few minutes for graphs of moderate size.

Throughout this paper, we consider an undirected graph $G = (V = \{0, \ldots, n-1\}, E)$ with $n = |V|$ and $m = |E|$. $\Gamma(v) := \{u : \{v, u\} \in E\}$ denotes the neighborhood of a node v. A graph $S = (V', E')$ is said to be a subgraph of $G = (V, E)$ if $V' \subseteq V$ and $E' \subseteq E \cap (V' \times V')$. We call S an induced subgraph when $E' = E \cap (V' \times V')$. For a set of nodes $U \subseteq V$, $G[U]$ denotes the subgraph induced by U.

The graph partitioning problem, which is closely related to the node separator problem, asks for blocks of nodes $V_1, \ldots, V_k$ that partition V (i.e., $V_1 \cup \cdots \cup V_k = V$ and $V_i \cap V_j = \emptyset$ for $i \neq j$). A balancing constraint demands that $\forall i \in \{1..k\}: |V_i| \le L_{\max} := (1 + \epsilon) \lceil |V|/k \rceil$ for some parameter $\epsilon$. In this case, the objective is often to minimize the total cut $\sum_i$ [...]

[...] and describe algorithms that are able to balance solutions, e.g., solutions that contain blocks with too many vertices. These algorithms are used to create the initial population of our evolutionary algorithm as well as to provide the combine and mutation operations. Our algorithm is called Adv and the evolutionary algorithm that is introduced later AdvEvo.

k-way Local Search

Our k-way local search builds on top of the flow-based search which is intended for improving a separator with k = 2. The main idea is to find pairs of adjoint blocks and then perform local search on the subgraph induced by adjoint block pairs.

Preprocessing.

In order to find pairs of adjoint blocks, we look at separator nodes which directly separate two different blocks, meaning these separator nodes are adjacent to nodes from at least two different blocks not including the separator. In general, directly separating nodes do not have to exist (see Figure 1). In other words, it may be that a separator disconnects two blocks, but the shortest path distance between the blocks is greater than or equal to two. Using a preprocessing step, we first make sure that each separator node is adjacent to at least two blocks, i.e., each separator node is directly separating.

The preprocessing step works as follows: we iterate over all separator nodes and try to remove them from the separator if they do not directly separate two blocks. The order in which we look at the separator nodes is given by the number of adjacent blocks (highest first). Let s be the current separator node under consideration. If it has two or more non-separator neighbors in different blocks, it already directly separates at least two blocks and we continue. If s only has neighbors in a single block in addition to the separator, we move it into that block. Lastly, if s only has other separator nodes as neighbors, we put it into a block having smallest overall weight. In each step, we update the priorities of adjacent neighboring separator nodes. Note that nodes are only removed from the separator and never added. Moreover, removing a node from the separator can increase the priority of an adjacent separator node only by one. As soon as the priority of a node is larger than one, it is directly separating and we do not have to look at the vertex again. After the algorithm is done, each separator node is directly separating at least two blocks and we can build the quotient graph in order to find adjoint blocks. Our preprocessing can introduce imbalance to the solution. Hence, we run the balance routine defined below after preprocessing.

Figure 1: Illustration of the preprocessing for a graph with blocks $V_1$ (green), $V_2$ (blue) and separator S (red).

Pair-wise Local Search.

Subsequent to the preprocessing step, we identify the set of all adjoint block pairs P by iterating through all separator nodes and their adjacent nodes. We iterate through all pairs $p = (A, B) \in P$ and build the subgraph $G_p$. $G_p$ is induced by the set of nodes consisting of all nodes in A and B as well as all separator nodes that directly separate the blocks. After building $G_p$, we run local search designed for 2-way separators on this subgraph. Note that an improvement in the separator size between A and B directly corresponds to an improvement to the overall separator size.

To obtain even smaller separators, and because the solution is potentially modified by local search, we repeat local search multiple times in the following way. The algorithm is organized in rounds. In each round, we iterate over the elements in P and perform local search on each induced subgraph. If local search has not been successful, we remove p from P. Otherwise, we keep p for the next round.

To guarantee the balance constraint, we use a balance operation. Given an imbalanced separator of a graph, the algorithm returns a balanced node separator. Roughly speaking, we move nodes from imbalanced blocks towards the blocks that can take nodes without becoming overloaded (see Figure 2 for an illustrating example). As long as there are imbalanced blocks, we iteratively repeat the following steps:

First, we find a path p from the heaviest block H to the lightest block L in the quotient graph. If there is no such path, we directly move a node to the lightest block and make its neighbors separator nodes. Next, we iterate through the edges $(A, B) \in p$ and move nodes from A into B. In general, we move $\min(L_{\max} - |L|, |H| - L_{\max})$ nodes along the path, i.e., as many nodes as the lightest block can take without getting overloaded and as few nodes as necessary so that the heaviest block is balanced.
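The number of nodes moved along a balancing path follows directly from the formula above. The sketch below is only an illustration under stated assumptions: block sizes are unweighted node counts, $L_{\max}$ is rounded down to an integer because block sizes are integral, and the function names `max_block_size` and `nodes_to_move` are hypothetical.

```python
import math


def max_block_size(n: int, k: int, eps: float) -> int:
    """L_max = (1 + eps) * ceil(n / k), rounded down (assumption: unit node weights)."""
    return math.floor((1 + eps) * math.ceil(n / k))


def nodes_to_move(l_max: int, heaviest: int, lightest: int) -> int:
    """Nodes to send along the path from the heaviest to the lightest block.

    min(L_max - |L|, |H| - L_max): as many as the lightest block can absorb,
    and as few as needed to bring the heaviest block back to L_max.
    The result is clamped at zero for blocks that are already balanced.
    """
    return max(0, min(l_max - lightest, heaviest - l_max))
```

For example, with n = 100, k = 4 and eps = 0.03 the bound is 25; a heaviest block of 30 nodes and a lightest block of 18 nodes lead to 5 nodes being moved, since the heavy block has an excess of 5 and the light block could take up to 7.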
Moving nodes is based on the gain of the movement as defined in Section 2.3. Basically, we use a priority queue of separator nodes that directly separate A and B, with the key being set to the gain. Note that these movements create new separator nodes and can potentially worsen solution quality. We use the gain definition because our primary target is to minimize the increase of the separator size.

Then we dequeue nodes from our priority queue until A is balanced. We move each dequeued node s to B and move its neighbors in A into the separator and the priority queue. The priorities of the nodes in the queue are also updated. After moving the nodes, A will be balanced. If B is imbalanced, we continue with the next pair in the path, i.e., sending the same amount of nodes. If B is also balanced, we are done with this path and do not move any more nodes. Our algorithm continues with the next imbalanced block.

Figure 2: Illustration of the quotient graph and the balancing path (blue) from an imbalanced block (red) to the lightest block (green).

K-WAY NODE SEPARATORS

Our evolutionary algorithm (EA) starts with a population of individuals (in our case the node separator of the graph) and evolves the population into different populations over several rounds. In each round, the EA uses a selection rule based on the fitness of the individuals (in our case the size of the separator) of the population to select good individuals and combine them to obtain improved offspring. Note that we can use the size/weight of the separator as a fitness function since our algorithm always generates separators fulfilling the given balance constraint, i.e., there is no need to use a penalty function to ensure that the final separator is feasible. When an offspring is generated, an eviction rule is used to select a member of the population and replace it with the new offspring.
In general, one has to take both into consideration, the fitness of an individual and the distance between individuals in the population [1]. Our algorithm generates only one offspring per generation.

We now describe the combine operator. Our combine operator assures that the offspring has an objective at least as good as the best of both parents. Roughly speaking, the combine operator combines an individual/separator $P = (V_1^P, \ldots, V_k^P, S^P)$ (which has to fulfill a balance constraint) with a second individual/separator $C = (V_1^C, \ldots, V_k^C, S^C)$. Let P be the individual with better fitness.

The algorithm begins with selecting two individuals from the population. The selection process is based on the tournament selection rule [22], i.e., P is the fittest out of two random individuals $R_1$, $R_2$ from the population, and the same is done to select C. Both node separators are used as input for our multi-level algorithm in the following sense. Let $\mathcal{E}$ be the set of edges that are cut edges, i.e., edges that run between blocks and the separator, in either P or C. All edges in $\mathcal{E}$ are blocked during the coarsening phase, i.e., they are not contracted during the coarsening phase. In other words, these edges are not eligible for the matching algorithm used during the coarsening phase and therefore are not part of any matching.

The stopping criterion of the multi-level algorithm is modified such that it stops when no contractable edge is left. As soon as the coarsening phase is stopped, we apply the separator P to the coarsest graph and use this as initial separator. This is possible since we did not contract any edge running between the blocks and the separator in P. Note that due to the specialized coarsening phase and this specialized initial phase, we obtain a high-quality initial solution on a very coarse graph which is usually not discovered by conventional algorithms that compute an initial solution. Since our local search algorithms guarantee no worsening of the input solution and we use random tie breaking, we can assure non-decreasing quality.
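The selection step and the set of blocked edges described above can be sketched as follows. This is an illustrative sketch, not the KaHIP implementation: individuals are represented only by their separator (a frozenset of nodes), fitness is the unweighted separator size, the two tournament candidates are assumed to be drawn without replacement, and the names `tournament_select`, `cut_edges` and `blocked_edges` are hypothetical.

```python
import random
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def tournament_select(population: List[FrozenSet[int]], rng: random.Random) -> FrozenSet[int]:
    """Return the fitter (smaller separator) of two randomly drawn individuals."""
    r1, r2 = rng.sample(population, 2)
    return r1 if len(r1) <= len(r2) else r2


def cut_edges(edges: Iterable[Edge], separator: FrozenSet[int]) -> Set[Edge]:
    """Edges that run between a block and the separator."""
    return {(u, v) for u, v in edges if (u in separator) != (v in separator)}


def blocked_edges(edges: List[Edge], sep_p: FrozenSet[int], sep_c: FrozenSet[int]) -> Set[Edge]:
    """Edges excluded from matching: cut edges of either parent P or C."""
    return cut_edges(edges, sep_p) | cut_edges(edges, sep_c)
```

On the path 0-1-2-3-4-5 with parent separators {1} and {4}, only the edge (2, 3) remains contractable; all other edges touch one of the two separators and are blocked, so both parent separators survive in the coarsest graph.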
Note that the local search algorithms can effectively exchange good parts of the solution on the coarse levels by moving only a few vertices.

Also note that this combine operator can be extended to be a multi-point combine operator, i.e., the operator would use $\ell$ instead of two parents. However, during the course of the algorithm a sequence of two-point combine steps is executed which somehow "emulates" a multi-point combine step. Therefore, we restrict ourselves to the case $\ell = 2$.

When the offspring is generated, we have to decide which solution should be evicted from the current population. We evict the solution that is most similar to the offspring among those individuals in the population that have an objective worse than or equal to that of the offspring itself. Here, we define the similarity $\sigma$ of two node separators $S_1$ and $S_2$ as the cardinality of the symmetric difference of both separators: $\sigma = |(S_1 \setminus S_2) \cup (S_2 \setminus S_1)|$. Therefore, $\sigma$ denotes the number of nodes contained in one separator but not in the other, so the most similar individual is the one with the smallest $\sigma$. This ensures some diversity in the population and hence makes the evolutionary algorithm more effective.

The mutation operation works similarly to the combine operation. The main difference is that there is only one input individual to the multi-level algorithm and that the offspring can be less fit compared to the input individual. Hence, only edges that run between the blocks and the separator of that individual are not eligible for the matching algorithm. This way the input individual can be transferred downwards in the hierarchy. Additionally, the solution is not used as initial separator, but the initial algorithm is performed to find an initial separator. Note, however, that due to the way the coarsening process is defined, the input separator is still contained in the coarsest graph.
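The eviction rule for a newly combined offspring, described earlier in this section, can be written down compactly. The following is a minimal sketch under the assumption that separators are Python sets and that the objective is the unweighted separator size; `similarity` and `evict_index` are hypothetical names used only for illustration.

```python
from typing import List, Optional, Set


def similarity(s1: Set[int], s2: Set[int]) -> int:
    """sigma = |(S1 \\ S2) u (S2 \\ S1)|, the size of the symmetric difference."""
    return len(s1 ^ s2)


def evict_index(population: List[Set[int]], offspring: Set[int]) -> Optional[int]:
    """Index of the individual to replace, or None if the offspring is not inserted.

    Candidates are individuals whose separator is no smaller than the offspring's;
    among them, the one with the smallest sigma (most similar) is evicted.
    """
    candidates = [i for i, s in enumerate(population) if len(s) >= len(offspring)]
    if not candidates:
        return None
    return min(candidates, key=lambda i: similarity(population[i], offspring))
```

With the population [{1, 2, 3}, {4, 5, 6}, {7}] and the offspring {1, 2}, the first individual is evicted: it is no better than the offspring and differs from it in only one node.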

We now explain parallelization and describe how everything is put together to form our full evolutionary algorithm AdvEvo. We use a parallelization scheme that has been successfully used in graph partitioning [25]. Each processing element (PE) basically performs the same operations using different random seeds (see Algorithm 1). First we estimate the population size S: each PE creates an individual and measures the time t spent. We then choose S such that the time for creating S node separators is approximately $t_{total}/f$, where the fraction f is a tuning parameter and $t_{total}$ is the total running time that the algorithm is given to produce a node separator of the graph. Each PE then builds its own population, i.e., our multi-level algorithm is called several times to create S individuals/separators. Afterwards the algorithm proceeds in rounds as long as time is left. With corresponding probabilities, mutation or combine operations are performed and the new offspring is inserted into the population.

We choose a parallelization/communication protocol that is quite similar to randomized rumor spreading [9], which has been shown to be scalable in an evolutionary algorithm for graph partitioning [25]. We follow their description closely. Let p denote the number of PEs used. A communication step is organized in rounds. In each round, a PE chooses a communication partner and sends it the currently best node separator P of the local population. The selection of the communication partner is done uniformly at random among those PEs to which P has not already been sent. Afterwards, a PE checks if there are incoming individuals and, if so, inserts them into the local population using the eviction strategy described above. If P is improved, all PEs are again eligible. This is repeated log p times. The algorithm is implemented completely asynchronously, i.e., there is no need for a global synchronization.
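The population size estimate and the partner selection of the communication protocol can be sketched in a few lines. This is a single-process illustration, not the asynchronous message-passing implementation: no messages are actually sent, the logarithm is assumed to be base 2 (the text only says "log p"), and the names `estimate_population_size` and `RumorSpreader` are hypothetical.

```python
import math
import random
from typing import Optional


def estimate_population_size(t_individual: float, t_total: float, f: float) -> int:
    """Choose S so that creating S individuals takes roughly t_total / f."""
    return max(1, int((t_total / f) / t_individual))


class RumorSpreader:
    """Tracks which PEs have not yet received the current best separator."""

    def __init__(self, rank: int, num_pes: int, rng: random.Random) -> None:
        self.rank = rank
        self.num_pes = num_pes
        self.rng = rng
        self.eligible = set(range(num_pes)) - {rank}

    def on_improved_best(self) -> None:
        """A new best individual makes all other PEs eligible again."""
        self.eligible = set(range(self.num_pes)) - {self.rank}

    def choose_partner(self) -> Optional[int]:
        """Pick a partner uniformly at random among PEs not yet sent to."""
        if not self.eligible:
            return None
        partner = self.rng.choice(sorted(self.eligible))
        self.eligible.remove(partner)
        return partner

    def rounds_per_step(self) -> int:
        """Number of rounds per communication step (assumption: log base 2)."""
        return max(1, math.ceil(math.log2(self.num_pes)))
```

For instance, if creating one individual takes 2 seconds, the total time budget is 7200 seconds and f = 10, the estimate yields a population of 360 individuals per PE.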
Algorithm 1 locallyEvolve
    estimate population size S
    while time left
        if elapsed time < t_total / f then
            create individual, insert into local population
        else
            flip coin c with corresponding probabilities
            if c shows head then
                perform mutation
            else
                perform combine
            insert offspring into population if possible
            communicate according to comm. protocol

Besides Adv and AdvEvo, we also use two more algorithms to compare solution quality. The first one is a sequential algorithm that starts by computing a k-way partition using KaFFPa-Strong and derives a k-way separator by pair-wise decoupling, using the method of Pothen and Fan [24] on each adjacent pair of blocks. The main idea of Pothen and Fan is to compute a minimum vertex cover in the bipartite subgraph induced by the set of cut edges between a pair of blocks. The union of the computed separator nodes is a k-way separator. In our experiments, the algorithm is called Simple. The second algorithm is a modification of KaFFPaE [25], which is an evolutionary algorithm to compute graph partitions. We modify the fitness function to be the size of the separator that can be derived using the Simple approach, but keep the rest of the algorithm. More precisely, this means that the population of the algorithm still consists of graph partitions instead of separators, but, for example, selection is based on the size of the derivable separator. Additionally, the combine operations in KaFFPaE still optimize for cuts instead of separators. This algorithm is called SimpleEvo.

Methodology.

We have implemented the algorithm described above within the KaHIP framework using C++ and compiled all algorithms using gcc 4.8.3 with full optimizations turned on (-O3 flag). We integrated our algorithms in KaHIP v1.00. Our new codes will also be included into the KaHIP graph partitioning framework. Each run was made on a machine that has four Octa-Core Intel Xeon E5-2670 processors running at 2.6 GHz with 64 GB local memory. Our main objective is the cardinality of node separators on the input graph. In our experiments, we use the imbalance parameter $\epsilon = 3\%$ since this is one of the default values in the Metis graph partitioning framework. Our full algorithm is not too sensitive to the precise choice of most of the parameters. However, we performed a number of experiments to evaluate the influence and choose the parameters of our algorithms. We omit details here and refer the reader to [30]. We mark the instances that have been used for the parameter tuning in Table 1 with a * and exclude these graphs from our experiments.

We present multiple views on the data: average values (geometric mean) as well as convergence plots that show quality achieved by the algorithms over time and performance plots.

Graph             n          Graph              n
Walshaw Graphs               Walshaw Graphs
add20             2 395      bcsstk32           44 609
data              2 851      fe_body*           45 087
3elt              4 720      t60k               60 005
uk                4 824      wing               62 032
add32             4 960      brack2             62 631
bcsstk33*         8 738      finan512           74 752
whitaker3         9 800      fe_tooth*          78 136
crack             10 240     fe_rotor           99 617
wing_nodal        10 937     UF Graphs
fe_4elt2          11 143     cop20k_A           99 843
vibrobox*         12 328     2cubes_sphere*     101 492
bcsstk29          13 992     thermomech_TC      102 158
4elt              15 606     cfd2               123 440
fe_sphere         16 386     boneS01            127 224
cti               16 840     Dubcova3           146 689
memplus           17 758     bmwcra_1           148 770
cs4               22 499     G2_circuit*        150 102
bcsstk30          28 924     c-73               169 422
bcsstk31          35 588     shipsec5           179 860
fe_pwt            36 519     cont-300           180 895

Table 1: Walshaw graphs and Florida Sparse Matrix graphs from [26]. Basic properties of the instances. Graphs with a * have been used for parameter tuning and are excluded from the evaluation.

[Figure 3 plots, one panel each for k = 2, 4, 8, 16, 32, 64; x-axis: normalized time t_n; y-axis: mean minimum separator size; curves: AdvReps, AdvEvo, SimReps, SimEvo]

Figure 3: Convergence plots for different values of k for different algorithms.

We now explain how we compute the convergence plots. We start by explaining how we compute them for a single instance I: whenever a PE creates a separator, it reports a pair (t, separator size), where the timestamp t is the currently elapsed time on the particular PE and separator size refers to the size of the separator that has been created. When performing multiple repetitions, we report average values (t, avg. separator size) instead. After the completion of the algorithm, we are left with P sequences of pairs (t, separator size) which we now merge into one sequence. The merged sequence is sorted by the timestamp t. The resulting sequence is called $T^I$. Since we are interested in the evolution of the solution quality, we compute another sequence $T^I_{\min}$. For each entry (in sorted order) in $T^I$, we insert the entry $(t, \min_{t' \le t} \text{separator size}(t'))$ into $T^I_{\min}$. Here, $\min_{t' \le t} \text{separator size}(t')$ is the minimum separator size that occurred until time t. $N^I_{\min}$ refers to the normalized sequence, i.e., each entry (t, separator size) in $T^I_{\min}$ is replaced by ($t_n$, separator size), where $t_n = t/t_I$ and $t_I$ is the average time that the sequential algorithm needs to compute a separator for the instance I.

To obtain average values over multiple instances, we do the following: for each instance we label all entries in $N^I_{\min}$, i.e., ($t_n$, separator size) is replaced by ($t_n$, separator size, I). We then merge all sequences $N^I_{\min}$ and sort by $t_n$. The resulting sequence is called $S$. The final sequence $S^g$ presents event-based geometric average values. We start by computing the geometric mean value $\mathcal{G}$ using the first value of all $N^I_{\min}$ (over I). To obtain $S^g$, we basically sweep through $S$: for each entry (in sorted order) ($t_n$, c, I) in $S$, we update $\mathcal{G}$, i.e., the separator size of I that took part in the computation of $\mathcal{G}$ is replaced by the new value c, and insert ($t_n$, $\mathcal{G}$) into $S^g$. Note that c can only be smaller than or equal to the old value of I.

Instances.

We use the small and Florida Sparse Matrix graphs from [26], which are from various sources, to test our algorithm. Small graphs have been obtained from Chris Walshaw's benchmark archive [29]. Graphs derived from sparse matrices have been taken from the Florida Sparse Matrix Collection [6]. Basic properties of the instances can be found in Table 1.
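The convergence-plot computation described under Methodology (running minimum, normalization, and the event-based geometric mean sweep) can be sketched as follows. This is an illustrative sketch under stated assumptions: events are given as (t, separator size) pairs with positive sizes, every instance has at least one event, and the function names are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Tuple

Event = Tuple[float, float]


def running_minimum(events: Iterable[Event]) -> List[Event]:
    """T_min: for each timestamp, the smallest separator size seen so far."""
    out: List[Event] = []
    best = math.inf
    for t, size in sorted(events):
        best = min(best, size)
        out.append((t, best))
    return out


def normalize(t_min: List[Event], t_sequential: float) -> List[Event]:
    """N_min: replace t by t_n = t / t_I, with t_I the sequential running time."""
    return [(t / t_sequential, size) for t, size in t_min]


def geometric_mean(values: Iterable[float]) -> float:
    vals = list(values)
    return math.exp(sum(math.log(v) for v in vals) / len(vals))


def geometric_mean_sweep(n_min: Dict[str, List[Event]]) -> List[Event]:
    """S^g: sweep over all labeled events and track the geometric mean G."""
    # Start from the first value of every instance.
    current = {inst: seq[0][1] for inst, seq in n_min.items()}
    merged = sorted((t, size, inst) for inst, seq in n_min.items() for t, size in seq)
    out: List[Event] = []
    for t_n, size, inst in merged:
        current[inst] = size  # replace the value of instance I by the new value c
        out.append((t_n, geometric_mean(current.values())))
    return out
```

With two instances whose normalized sequences are [(0.5, 4), (2.0, 1)] and [(1.0, 9)], the sweep starts at a geometric mean of 6 and drops to 3 once the first instance improves from 4 to 1.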

In this section we compare our algorithms in a setting where each one gets the same (fairly large) amount of time to compute a separator. We do this on the graphs from our benchmark set. We use all 16 cores per run of our machine (basically one node of the cluster) and two hours of time per instance when we use the evolutionary algorithm to create separators. We parallelized repeated executions of the sequential algorithms (embarrassingly parallel, different seeds) and also gave them 16 PEs and two hours of time to compute a separator. We look at $k \in \{2, 4, 8, 16, 32, 64\}$ and performed three repetitions per instance. To see how the solution quality of the different algorithms evolves over time, we use convergence plots. Figure 3 shows convergence plots for $k \in \{2, 4, 8, 16, 32, 64\}$. Additionally, Tables 2 and [...]

k         AdvEvo    AdvReps   SimEvo    SimReps
2         159.4     0.0       +3.8%     +5.0%
4         373.9     +1.7%     +5.6%     +7.7%
8         664.4     +2.7%     +8.4%     +11.1%
16        1097.9    +5.1%     +10.4%    +13.4%
32        1694.2    +6.8%     +11.8%    +14.4%
64        2601.8    +8.1%     +15.3%    +17.8%
overall   +0%       4.1%      9.2%      11.6%

Table 2: Average of AdvEvo and average increase in separator size for different algorithms.

Algorithm   ≤      <
AdvEvo      181    122
AdvReps     65     7
SimEvo      33     3
SimReps     29     0

Table 3: Number of instances where algorithm X is best w.r.t. ≤ and <. The total number of instances is 192.

[...] k. This is due to the fact that the problems become more difficult when increasing the number of blocks k. For larger values of k, the quality gap between the evolutionary algorithm AdvEvo and SimEvo as well as the other algorithms increases with more time invested. On the other hand, for k = 2 there is almost no difference between the results produced by the evolutionary algorithm AdvEvo and the non-evolutionary version AdvReps. Overall, the experimental data indicates that AdvEvo is the best algorithm. Separators produced by AdvReps, SimEvo, and SimReps are on average 4.1%, 9.2%, and 11.6% larger than those produced by AdvEvo (Table 2). Additionally, our advanced evolutionary algorithm computes the best result on 181 out of 192 instances.

Note that single executions of the simple algorithms are much faster. However, the results of our experiments performed in this section emphasize that one cannot simply use the best result out of multiple repetitions of a faster algorithm to obtain the same solution quality. Yet it is interesting to see that SimpleEvo, where only the fitness function of the evolutionary algorithm is modified and the combine operation still optimizes for edge cuts of partitions, computes better solutions than its non-evolutionary counterpart SimReps.

In this work, we derived a new approach to find small node separators in large graphs which combines an evolutionary search algorithm with a multilevel method. We combined these techniques with a scalable communication protocol and obtained a system that is able to compute high-quality solutions in a short amount of time. Experiments show that our advanced evolutionary algorithm computes the best result on 94% of the benchmark instances. In future work, we aim to look at different types of applications, in particular those applications in which the running time may not be considered a drawback when the node separator has the highest quality.

Acknowledgements.

The authors acknowledge support by the state of Baden-Württemberg through bwHPC.

REFERENCES [1] T. Bäck. 1996.

Evolutionary Algorithms in Theory and Prac-tice: Evolution Strategies, Evolutionary Programming, GeneticAlgorithms . Ph.D. Dissertation.[2] S. N. Bhatt and F. T. Leighton. 1984. A framework for solvingVLSI graph layout problems.

J. Comput. System Sci.

28, 2 (1984),300 – 343.

DOI: http://dx.doi.org/10.1016/0022-0000(84)90071-0[3] C. Bichot and P. Siarry (Eds.). 2011.

Graph Partitioning . Wiley.[4] T. N. Bui and C. Jones. 1992. Finding Good Approximate Vertexand Edge Partitions is NP-hard.

Inform. Process. Lett.

42, 3(1992), 153–159.[5] A. Buluç, H. Meyerhenke, I. Safro, P. Sanders, and C. Schulz.2016. Recent Advances in Graph Partitioning. In

AlgorithmEngineering – Selected Results (LNCS) , Vol. 9920. Springer,117–158.[6] T. Davis. 2017. The University of Florida Sparse Matrix Collec-tion. (2017).[7] D. Delling, M. Holzer, K. Müller, F. Schulz, and D. Wagner.2009. High-performance multi-level routing.

The Shortest PathProblem: Ninth DIMACS Implementation Challenge

74 (2009),73–92.[8] J. Dibbelt, B. Strasser, and D. Wagner. 2014. Customizablecontraction hierarchies. In . Springer, 271–282.[9] B. Doerr and M. Fouz. 2011. Asymptotically Optimal Random-ized Rumor Spreading. In

| graph | k | AdvEvo min | AdvEvo avg | AdvReps min | AdvReps avg | SimpleEvo min | SimpleEvo avg | SimpleReps min | SimpleReps avg |
|---|---|---|---|---|---|---|---|---|---|
| 3elt | 2 | 43 | 43 | 43 | 43 | 43 | 43 | 43 | 43 |
| 3elt | 4 | 97 | 97 | 97 | 97 | 97 | 97 | 97 | 97 |
| 3elt | 8 | 158 | 158 | 159 | 159 | 159 | 159 | 160 | 160 |
| 3elt | 16 | 269 | 269 | 270 | 270 | 271 | 271 | 272 | 272 |
| 3elt | 32 | 452 | 452 | 455 | 456 | 458 | 459 | 466 | 466 |
| 3elt | 64 | 706 | 710 | 720 | 721 | 733 | 734 | 746 | 747 |
| 4elt | 2 | 68 | 68 | 68 | 68 | 68 | 68 | 68 | 68 |
| 4elt | 4 | 157 | 157 | 157 | 157 | 157 | 157 | 157 | 157 |
| 4elt | 8 | 253 | 253 | 254 | 254 | 256 | 256 | 254 | 255 |
| 4elt | 16 | 438 | 439 | 443 | 445 | 443 | 445 | 442 | 444 |
| 4elt | 32 | 737 | 743 | 755 | 757 | 744 | 746 | 749 | 750 |
| 4elt | 64 | 1 221 | 1 227 | 1 257 | 1 259 | 1 234 | 1 238 | 1 253 | 1 253 |
| add20 | 2 | 26 | 26 | 25 | 25 | 28 | 28 | 28 | 28 |
| add20 | 4 | 37 | 37 | 37 | 37 | 45 | 47 | 49 | 49 |
| add20 | 8 | 67 | 67 | 69 | 70 | 82 | 86 | 94 | 94 |
| add20 | 16 | 95 | 96 | 106 | 108 | 116 | 119 | 133 | 133 |
| add20 | 32 | 110 | 111 | 140 | 140 | 166 | 169 | 170 | 174 |
| add20 | 64 | 138 | 138 | 161 | 164 | 219 | 225 | 218 | 221 |
| add32 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| add32 | 4 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| add32 | 8 | 11 | 11 | 11 | 11 | 12 | 12 | 12 | 12 |
| add32 | 16 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
| add32 | 32 | 33 | 33 | 33 | 33 | 33 | 33 | 33 | 33 |
| add32 | 64 | 114 | 115 | 118 | 119 | 127 | 128 | 131 | 131 |
| bcsstk29 | 2 | 180 | 180 | 180 | 180 | 180 | 180 | 180 | 180 |
| bcsstk29 | 4 | 528 | 528 | 528 | 528 | 534 | 536 | 534 | 534 |
| bcsstk29 | 8 | 948 | 954 | 966 | 968 | 1 173 | 1 202 | 1 086 | 1 094 |
| bcsstk29 | 16 | 1 512 | 1 530 | 1 578 | 1 578 | 2 120 | 2 127 | 2 019 | 2 062 |
| bcsstk29 | 32 | 2 231 | 2 250 | 2 316 | 2 326 | 2 891 | 2 898 | 2 892 | 2 899 |
| bcsstk29 | 64 | 3 130 | 3 157 | 3 361 | 3 371 | 4 073 | 4 089 | 4 065 | 4 096 |
| bcsstk30 | 2 | 206 | 206 | 206 | 206 | 206 | 206 | 206 | 206 |
| bcsstk30 | 4 | 549 | 549 | 549 | 549 | 573 | 573 | 573 | 573 |
| bcsstk30 | 8 | 1 121 | 1 121 | 1 123 | 1 123 | 1 138 | 1 145 | 1 138 | 1 140 |
| bcsstk30 | 16 | 2 128 | 2 146 | 2 183 | 2 201 | 2 446 | 2 455 | 2 430 | 2 452 |
| bcsstk30 | 32 | 3 195 | 3 249 | 3 321 | 3 338 | 3 985 | 3 987 | 3 956 | 4 008 |
| bcsstk30 | 64 | 4 709 | 4 836 | 5 045 | 5 111 | 5 846 | 5 940 | 5 994 | 6 013 |
| bcsstk31 | 2 | 308 | 308 | 308 | 308 | 317 | 321 | 317 | 319 |
| bcsstk31 | 4 | 767 | 767 | 767 | 767 | 802 | 803 | 798 | 800 |
| bcsstk31 | 8 | 1 433 | 1 434 | 1 441 | 1 442 | 1 529 | 1 538 | 1 534 | 1 545 |
| bcsstk31 | 16 | 2 353 | 2 399 | 2 437 | 2 446 | 2 592 | 2 630 | 2 624 | 2 633 |
| bcsstk31 | 32 | 3 635 | 3 695 | 3 837 | 3 874 | 4 338 | 4 361 | 4 361 | 4 376 |
| bcsstk31 | 64 | 5 102 | 5 167 | 5 323 | 5 394 | 6 090 | 6 136 | 6 169 | 6 205 |
| bcsstk32 | 2 | 297 | 297 | 297 | 297 | 322 | 328 | 321 | 321 |
| bcsstk32 | 4 | 569 | 569 | 569 | 569 | 633 | 637 | 627 | 627 |
| bcsstk32 | 8 | 1 145 | 1 152 | 1 177 | 1 183 | 1 312 | 1 336 | 1 315 | 1 326 |
| bcsstk32 | 16 | 2 080 | 2 102 | 2 122 | 2 131 | 2 391 | 2 434 | 2 443 | 2 468 |
| bcsstk32 | 32 | 3 422 | 3 449 | 3 498 | 3 524 | 4 102 | 4 118 | 4 114 | 4 142 |
| bcsstk32 | 64 | 5 386 | 5 469 | 5 621 | 5 677 | 6 293 | 6 321 | 6 348 | 6 412 |
| bmwcra_1 | 2 | 657 | 657 | 657 | 657 | 684 | 684 | 683 | 683 |
| bmwcra_1 | 4 | 1 668 | 1 673 | 1 656 | 1 659 | 1 683 | 1 685 | 1 659 | 1 666 |
| bmwcra_1 | 8 | 3 918 | 3 970 | 4 002 | 4 013 | 4 080 | 4 126 | 4 112 | 4 133 |
| bmwcra_1 | 16 | 10 011 | 10 037 | 9 846 | 9 921 | 10 099 | 10 199 | 10 190 | 10 250 |
| bmwcra_1 | 32 | 16 089 | 16 300 | 16 798 | 16 979 | 16 863 | 16 947 | 16 725 | 16 775 |
| bmwcra_1 | 64 | 23 586 | 24 279 | 25 707 | 25 809 | 24 979 | 25 087 | 24 885 | 25 042 |

Table 4: Detailed per-instance results.

| graph | k | AdvEvo min | AdvEvo avg | AdvReps min | AdvReps avg | SimpleEvo min | SimpleEvo avg | SimpleReps min | SimpleReps avg |
|---|---|---|---|---|---|---|---|---|---|
| boneS01 | 2 | 1 524 | 1 524 | 1 524 | 1 524 | 1 563 | 1 563 | 1 563 | 1 565 |
| boneS01 | 4 | 3 357 | 3 357 | 3 357 | 3 357 | 3 465 | 3 469 | 3 471 | 3 477 |
| boneS01 | 8 | 5 112 | 5 128 | 5 139 | 5 148 | 5 316 | 5 351 | 5 358 | 5 371 |
| boneS01 | 16 | 7 728 | 7 781 | 7 776 | 7 826 | 8 139 | 8 179 | 8 142 | 8 189 |
| boneS01 | 32 | 11 082 | 11 127 | 11 400 | 11 440 | 11 619 | 11 670 | 11 758 | 11 770 |
| boneS01 | 64 | 15 264 | 15 271 | 16 495 | 16 683 | 16 173 | 16 219 | 16 296 | 16 314 |
| brack2 | 2 | 183 | 183 | 183 | 183 | 206 | 208 | 205 | 205 |
| brack2 | 4 | 796 | 796 | 796 | 796 | 829 | 831 | 829 | 829 |
| brack2 | 8 | 1 740 | 1 741 | 1 759 | 1 761 | 1 896 | 1 899 | 1 906 | 1 910 |
| brack2 | 16 | 2 794 | 2 806 | 2 868 | 2 874 | 3 108 | 3 119 | 3 117 | 3 120 |
| brack2 | 32 | 4 160 | 4 184 | 4 282 | 4 289 | 4 632 | 4 647 | 4 646 | 4 651 |
| brack2 | 64 | 6 053 | 6 084 | 6 361 | 6 381 | 6 808 | 6 841 | 6 809 | 6 820 |
| cfd2 | 2 | 1 030 | 1 030 | 1 030 | 1 030 | 1 036 | 1 036 | 1 036 | 1 036 |
| cfd2 | 4 | 2 543 | 2 546 | 2 548 | 2 551 | 2 684 | 2 688 | 2 645 | 2 667 |
| cfd2 | 8 | 4 304 | 4 313 | 4 304 | 4 312 | 4 569 | 4 581 | 4 551 | 4 591 |
| cfd2 | 16 | 7 068 | 7 095 | 7 018 | 7 036 | 7 416 | 7 440 | 7 374 | 7 392 |
| cfd2 | 32 | 10 723 | 10 873 | 11 165 | 11 272 | 12 066 | 12 088 | 11 924 | 11 956 |
| cfd2 | 64 | 16 521 | 17 138 | 17 829 | 18 021 | 18 093 | 18 179 | 18 014 | 18 067 |
| cont-300 | 2 | 598 | 598 | 598 | 598 | 598 | 598 | 598 | 598 |
| cont-300 | 4 | 1 041 | 1 042 | 1 063 | 1 065 | 1 184 | 1 184 | 1 184 | 1 184 |
| cont-300 | 8 | 1 786 | 1 814 | 1 807 | 1 825 | 2 188 | 2 192 | 2 188 | 2 194 |
| cont-300 | 16 | 2 863 | 2 874 | 2 893 | 2 916 | 3 526 | 3 528 | 3 534 | 3 537 |
| cont-300 | 32 | 4 299 | 4 340 | 4 413 | 4 450 | 5 466 | 5 483 | 5 504 | 5 506 |
| cont-300 | 64 | 6 452 | 6 474 | 6 667 | 6 679 | 8 094 | 8 105 | 8 168 | 8 170 |
| cop20k_A | 2 | 620 | 620 | 620 | 620 | 620 | 620 | 620 | 620 |
| cop20k_A | 4 | 1 673 | 1 675 | 1 676 | 1 676 | 1 733 | 1 741 | 1 716 | 1 724 |
| cop20k_A | 8 | 2 919 | 2 934 | 2 939 | 2 942 | 2 997 | 2 999 | 2 993 | 2 996 |
| cop20k_A | 16 | 4 721 | 4 744 | 4 765 | 4 780 | 4 842 | 4 864 | 4 849 | 4 858 |
| cop20k_A | 32 | 7 241 | 7 333 | 7 525 | 7 645 | 7 481 | 7 502 | 7 423 | 7 465 |
| cop20k_A | 64 | 10 757 | 10 837 | 11 721 | 12 107 | 11 135 | 11 147 | 11 102 | 11 155 |
| crack | 2 | 73 | 73 | 73 | 73 | 75 | 75 | 74 | 74 |
| crack | 4 | 145 | 145 | 145 | 145 | 152 | 152 | 152 | 152 |
| crack | 8 | 257 | 257 | 258 | 258 | 280 | 282 | 285 | 286 |
| crack | 16 | 420 | 420 | 421 | 422 | 461 | 465 | 474 | 474 |
| crack | 32 | 636 | 639 | 648 | 648 | 722 | 724 | 735 | 738 |
| crack | 64 | 939 | 943 | 958 | 959 | 1 098 | 1 100 | 1 123 | 1 124 |
| cs4 | 2 | 287 | 287 | 287 | 287 | 319 | 319 | 322 | 322 |
| cs4 | 4 | 727 | 729 | 738 | 740 | 824 | 825 | 832 | 836 |
| cs4 | 8 | 1 108 | 1 109 | 1 133 | 1 134 | 1 244 | 1 245 | 1 267 | 1 267 |
| cs4 | 16 | 1 548 | 1 558 | 1 623 | 1 630 | 1 812 | 1 818 | 1 827 | 1 830 |
| cs4 | 32 | 2 132 | 2 148 | 2 273 | 2 286 | 2 512 | 2 516 | 2 537 | 2 541 |
| cs4 | 64 | 2 864 | 2 909 | 3 179 | 3 202 | 3 433 | 3 435 | 3 461 | 3 463 |
| cti | 2 | 266 | 266 | 266 | 266 | 266 | 266 | 266 | 266 |
| cti | 4 | 756 | 758 | 761 | 761 | 808 | 808 | 807 | 807 |
| cti | 8 | 1 243 | 1 270 | 1 311 | 1 315 | 1 537 | 1 539 | 1 539 | 1 539 |
| cti | 16 | 1 821 | 1 845 | 1 897 | 1 925 | 2 287 | 2 298 | 2 319 | 2 324 |
| cti | 32 | 2 426 | 2 457 | 2 646 | 2 647 | 3 257 | 3 267 | 3 258 | 3 267 |
| cti | 64 | 3 234 | 3 242 | 3 581 | 3 596 | 4 489 | 4 491 | 4 481 | 4 491 |
| data | 2 | 51 | 51 | 51 | 51 | 51 | 51 | 51 | 51 |
| data | 4 | 96 | 96 | 96 | 96 | 96 | 96 | 97 | 97 |
| data | 8 | 165 | 165 | 165 | 165 | 172 | 172 | 173 | 173 |
| data | 16 | 275 | 275 | 276 | 276 | 298 | 301 | 300 | 300 |
| data | 32 | 448 | 450 | 454 | 454 | 489 | 492 | 499 | 502 |
| data | 64 | 688 | 690 | 695 | 697 | 757 | 760 | 764 | 764 |

Table 5: Detailed per-instance results.

| graph | k | AdvEvo min | AdvEvo avg | AdvReps min | AdvReps avg | SimpleEvo min | SimpleEvo avg | SimpleReps min | SimpleReps avg |
|---|---|---|---|---|---|---|---|---|---|
| Dubcova3 | 2 | 383 | 383 | 383 | 383 | 383 | 383 | 383 | 383 |
| Dubcova3 | 4 | 765 | 765 | 765 | 765 | 765 | 765 | 765 | 765 |
| Dubcova3 | 8 | 1 433 | 1 437 | 1 436 | 1 437 | 1 463 | 1 475 | 1 463 | 1 464 |
| Dubcova3 | 16 | 2 295 | 2 309 | 2 319 | 2 322 | 2 373 | 2 398 | 2 346 | 2 355 |
| Dubcova3 | 32 | 3 581 | 3 621 | 3 707 | 3 711 | 3 887 | 3 893 | 3 821 | 3 856 |
| Dubcova3 | 64 | 5 448 | 5 492 | 5 706 | 5 732 | 5 754 | 5 772 | 5 763 | 5 766 |
| fe_4elt2 | 2 | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| fe_4elt2 | 4 | 163 | 163 | 162 | 162 | 168 | 169 | 167 | 167 |
| fe_4elt2 | 8 | 283 | 285 | 288 | 288 | 290 | 291 | 293 | 293 |
| fe_4elt2 | 16 | 478 | 479 | 482 | 483 | 482 | 484 | 486 | 487 |
| fe_4elt2 | 32 | 759 | 762 | 780 | 781 | 773 | 774 | 783 | 785 |
| fe_4elt2 | 64 | 1 149 | 1 153 | 1 185 | 1 190 | 1 189 | 1 191 | 1 206 | 1 208 |
| fe_pwt | 2 | 116 | 116 | 116 | 116 | 116 | 116 | 116 | 116 |
| fe_pwt | 4 | 236 | 236 | 236 | 236 | 236 | 236 | 236 | 236 |
| fe_pwt | 8 | 473 | 473 | 473 | 473 | 474 | 474 | 476 | 476 |
| fe_pwt | 16 | 925 | 925 | 929 | 929 | 930 | 930 | 929 | 929 |
| fe_pwt | 32 | 1 834 | 1 839 | 1 909 | 1 910 | 1 862 | 1 864 | 1 872 | 1 873 |
| fe_pwt | 64 | 2 846 | 2 919 | 3 025 | 3 043 | 3 458 | 3 472 | 3 447 | 3 462 |
| fe_rotor | 2 | 460 | 460 | 460 | 460 | 464 | 464 | 464 | 464 |
| fe_rotor | 4 | 1 540 | 1 545 | 1 543 | 1 554 | 1 575 | 1 593 | 1 573 | 1 580 |
| fe_rotor | 8 | 2 833 | 2 838 | 2 844 | 2 848 | 2 891 | 2 891 | 2 898 | 2 899 |
| fe_rotor | 16 | 4 404 | 4 448 | 4 483 | 4 489 | 4 605 | 4 632 | 4 550 | 4 589 |
| fe_rotor | 32 | 6 809 | 6 898 | 6 943 | 7 013 | 7 024 | 7 065 | 7 037 | 7 062 |
| fe_rotor | 64 | 10 196 | 10 289 | 10 534 | 10 615 | 10 249 | 10 293 | 10 242 | 10 261 |
| fe_sphere | 2 | 192 | 192 | 192 | 192 | 192 | 192 | 192 | 192 |
| fe_sphere | 4 | 379 | 379 | 380 | 380 | 380 | 380 | 380 | 380 |
| fe_sphere | 8 | 570 | 570 | 575 | 577 | 570 | 570 | 570 | 570 |
| fe_sphere | 16 | 804 | 804 | 852 | 854 | 835 | 835 | 839 | 839 |
| fe_sphere | 32 | 1 177 | 1 184 | 1 262 | 1 264 | 1 208 | 1 213 | 1 216 | 1 218 |
| fe_sphere | 64 | 1 657 | 1 667 | 1 803 | 1 805 | 1 722 | 1 723 | 1 745 | 1 750 |
| finan512 | 2 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 |
| finan512 | 4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| finan512 | 8 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| finan512 | 16 | 400 | 400 | 400 | 400 | 400 | 400 | 400 | 400 |
| finan512 | 32 | 800 | 800 | 800 | 800 | 800 | 800 | 800 | 800 |
| finan512 | 64 | 3 210 | 3 216 | 3 259 | 3 263 | 3 200 | 3 200 | 3 200 | 3 200 |
| memplus | 2 | 70 | 70 | 70 | 70 | 90 | 103 | 107 | 107 |
| memplus | 4 | 90 | 90 | 91 | 91 | 127 | 131 | 123 | 126 |
| memplus | 8 | 106 | 106 | 106 | 106 | 154 | 158 | 142 | 144 |
| memplus | 16 | 132 | 132 | 151 | 153 | 234 | 240 | 239 | 242 |
| memplus | 32 | 178 | 179 | 194 | 195 | 265 | 268 | 264 | 268 |
| memplus | 64 | 181 | 182 | 205 | 206 | 392 | 412 | 428 | 436 |
| shipsec5 | 2 | 1 203 | 1 203 | 1 203 | 1 203 | 1 227 | 1 231 | 1 227 | 1 233 |
| shipsec5 | 4 | 3 681 | 3 681 | 3 681 | 3 681 | 3 783 | 3 793 | 3 801 | 3 803 |
| shipsec5 | 8 | 6 078 | 6 112 | 6 198 | 6 216 | 6 486 | 6 509 | 6 531 | 6 549 |
| shipsec5 | 16 | 8 826 | 8 881 | 8 850 | 8 903 | 9 612 | 9 650 | 9 570 | 9 651 |
| shipsec5 | 32 | 12 861 | 12 983 | 13 521 | 13 601 | 14 208 | 14 236 | 14 316 | 14 340 |
| shipsec5 | 64 | 17 304 | 17 398 | 18 114 | 18 301 | 21 482 | 21 594 | 21 549 | 21 611 |
| t60k | 2 | 70 | 70 | 70 | 70 | 71 | 71 | 71 | 71 |
| t60k | 4 | 202 | 202 | 202 | 202 | 203 | 203 | 203 | 203 |
| t60k | 8 | 447 | 447 | 449 | 449 | 448 | 448 | 448 | 448 |
| t60k | 16 | 800 | 803 | 807 | 810 | 792 | 793 | 802 | 804 |
| t60k | 32 | 1 330 | 1 331 | 1 339 | 1 343 | 1 307 | 1 308 | 1 335 | 1 336 |
| t60k | 64 | 2 040 | 2 042 | 2 104 | 2 114 | 2 031 | 2 043 | 2 098 | 2 103 |