Efficiently Computing Maximum Flows in Scale-Free Networks
Thomas Bläsius, Tobias Friedrich, Christopher Weyand
Abstract
We study the maximum-flow/minimum-cut problem on scale-free networks, i.e., graphs whose degree distribution follows a power-law. We propose a simple algorithm that capitalizes on the fact that often only a small fraction of such a network is relevant for the flow. At its core, our algorithm augments Dinitz's algorithm with a balanced bidirectional search. Our experiments on a scale-free random network model indicate sublinear run time. On scale-free real-world networks, we outperform the commonly used highest-label Push-Relabel implementation by up to two orders of magnitude. Compared to Dinitz's original algorithm, our modifications reduce the search space, e.g., by a factor of 275 on an autonomous systems graph.

Beyond these good run times, our algorithm has an additional advantage compared to Push-Relabel. The latter computes a preflow, which makes the extraction of a minimum cut potentially more difficult. This is relevant, for example, for the computation of Gomory-Hu trees. On a social network with 70 000 nodes, our algorithm computes the Gomory-Hu tree in 3 seconds compared to 12 minutes when using Push-Relabel.
The maximum flow problem is arguably one of the most fundamental graph problems and regularly appears as a subtask in various applications [2, 32, 35]. The go-to general-purpose algorithm for computing flows in practice is the highest-label Push-Relabel algorithm by Cherkassky and Goldberg [10], which is also part of the boost graph library [33]. Beyond that, the BK-algorithm by Boykov and Kolmogorov [7] or its later iteration [17] should be used for instances appearing in computer vision. Our main goal in this paper is to provide a flow algorithm tailored towards scale-free networks. Such networks are characterized by their heavy-tailed degree distribution resembling a power-law, i.e., they are sparse with few vertices of comparatively high degree and many vertices of low degree.

At its core, our algorithm is a variant of Dinitz's algorithm [12]. Dinitz's algorithm is an augmenting path algorithm that iteratively increases the flow along collections of shortest paths in the residual network. In each iteration, at least one edge on every shortest path gets saturated, thereby increasing the distance between source and sink in the residual network. To exploit the structure of scale-free networks, we make use of the facts that, firstly, shortest paths tend to span only a small fraction of such networks, and secondly, a balanced bidirectional breadth-first search is able to find the shortest paths very efficiently [6, 5]. Using a bidirectional search to compute the collection of shortest paths in Dinitz's algorithm directly translates this efficiency to the first iteration, as the residual network initially coincides with the flow network.
Though the structure of the residual network changes in later iterations, our experiments show that the run time improvements achieved by using a bidirectional search remain high. Scaling experiments with geometric inhomogeneous random graphs (GIRGs) [8] in fact indicate that the flow computation of our algorithm runs in sublinear time. In comparison, previous algorithms (Push-Relabel, BK, and unidirectional Dinitz) require slightly super-linear time. This is also reflected in the high speedups we achieve on real-world scale-free networks.

With the flow computation itself being so efficient, the total run time for computing the maximum flow for a single source-sink pair in a scale-free network is heavily dominated by loading the graph and building data structures. Thus, our algorithm is particularly relevant when we have to compute multiple flows in the same network. This is, e.g., the case when computing the Gomory-Hu tree [20] of a network. The Gomory-Hu tree is a compact representation of the minimum s-t cuts for all source-sink pairs (s, t). It can be computed with Gusfield's algorithm [21] using n − 1 flow computations. Using our bidirectional flow algorithm as the subroutine for flow computations in Gusfield's algorithm lets us compute the Gomory-Hu tree of, e.g., the soc-slashdot instance with 70 k nodes and 360 k edges in only a few seconds.

[Footnote: GIRGs are a generative network model closely related to hyperbolic random graphs [25]. They resemble real-world networks in regards to important properties such as degree distribution, clustering, and distances.]

Unsurprisingly, our algorithm is outperformed by the BK-algorithm on a segmentation instance from computer vision. Moreover, Push-Relabel performs best on a layered network that was specifically constructed to evaluate flow algorithms. However, we would argue that this type of instance is rather artificial. Our findings can be summarized in the following main contributions.
• We provide a simple and efficient flow algorithm that significantly outperforms previous algorithms on scale-free networks.
• Its efficiency on non-scale-free instances makes it a potential replacement for the Push-Relabel algorithm for general-purpose flow computations.
• Our algorithm is well suited to compute the Gomory-Hu tree of comparatively large instances.
• There are situations where computing a flow with the Push-Relabel algorithm is significantly more expensive than computing a preflow. This stands in contrast to previous observations [10, 11].
The maximum flow problem has long been, and still is, a subject of active research. In the following, we briefly discuss only the work most related to our result. For a more extensive overview on the topic of flows, we refer to the survey by Goldberg and Tarjan [19].

Our algorithm is based on Dinitz's algorithm [12], which belongs to the family of augmenting path algorithms originating from the Ford-Fulkerson algorithm [16]. Augmenting path algorithms use the residual network to represent the remaining capacities and iteratively increase the flow by augmenting it with paths from source to sink in the residual network, until no such path exists. At every point in time, a valid flow is known, and at the end of execution, non-reachability in the residual network certifies maximality.

From this perspective, the
Push-Relabel algorithm [18] does the reverse. At every point in time, the sink is not reachable from the source in the residual network, thereby guaranteeing maximality, while the object maintained throughout the algorithm is a so-called preflow, and the algorithm stops once the preflow is actually a flow. This is achieved using the two operations push and relabel; hence the name. Different variants of the Push-Relabel algorithm mainly differ with regards to the order in which operations are applied. A strategy performing well in practice is the highest-label strategy [10]. The extensive empirical study by Ahuja et al. [1] on ten different algorithms shows that the highest-label Push-Relabel algorithm indeed performs the best out of the ten. The only small caveat with these experiments is the fact that they are based on artificial networks that are specifically generated to pose difficult instances. Our experiments show that the structure of the instance matters in the sense that it impacts different algorithms differently, potentially yielding different rankings on different types of instances. The so-called pseudoflow algorithm by Hochbaum [23] was later shown to slightly outperform (low single-digit speedups on most instances) the highest-label Push-Relabel algorithm, again based on artificial instances [9].

Boykov and Kolmogorov [7] gave an algorithm tailored specifically towards instances that appear in computer vision, outperforming Push-Relabel on these instances. It was later refined by Goldberg et al. [17]. Most related to our studies is the work by Halim et al. [22], who developed a distributed flow algorithm for MapReduce to compute flows on huge social networks.
In this section we introduce the concept of network flow and describe Dinitz's algorithm [12].
Network Flows.
A flow network is a directed graph G = (V, E) with source and sink vertices s, t ∈ V, and a capacity function c : V × V → ℕ with c(u, v) = 0 if (u, v) ∉ E. A flow f on G is a function over vertex pairs f : V × V → ℤ satisfying three constraints: (I) capacity: f(u, v) ≤ c(u, v), (II) antisymmetry: f(u, v) = −f(v, u), and (III) conservation: ∑_{v∈V} f(u, v) = 0 for u ∈ V \ {s, t}. We call an edge (u, v) ∈ E saturated if f(u, v) = c(u, v). Denote the value of a flow f as ∑_{v∈V} f(s, v). The maximum flow problem, max-flow for short, is the problem of finding a flow of maximum value.

Given a flow f in G, we define a network G_f called the residual network. G_f has the same set of nodes and contains the directed edge (u, v) if f(u, v) < c(u, v). The capacity c′ of edges in G_f is given by the residual capacity in the original network, i.e., c′(u, v) = c(u, v) − f(u, v). An s-t path in G_f is called an augmenting path.

Dinitz's Algorithm.
One can solve max-flow by iteratively increasing the flow on augmenting paths, yielding the famous Ford-Fulkerson algorithm [16]. Dinitz's algorithm belongs to the family of augmenting path algorithms [2]. In contrast to the Ford-Fulkerson algorithm, Dinitz groups augmentations into rounds.

Let d_s(v) be the distance from s to vertex v in G_f. We define a subgraph of G_f called the layered network by restricting the edge set to edges (u, v) of G_f for which d_s(u) + 1 = d_s(v), i.e., edges that increase the distance from the source. We call a flow of some network a blocking flow if every s-t path contains at least one edge that is saturated by this flow, i.e., there is no augmenting path.

Each round, Dinitz's algorithm (see Algorithm 1) augments a set of edges that constitutes a blocking flow of the layered network. One can find such a set of edges by iteratively augmenting s-t paths in the layered network until source and sink become disconnected. After augmenting a blocking flow, the distance between the terminals in the residual network strictly increases.

Algorithm 1:
Dinitz's Algorithm.

    while there is an s-t path in the residual network do
        build the layered network
        while there is an s-t path in the layered network do
            augment the flow with an s-t path

Asymptotic Running Time.
To better understand how our modifications impact the run time, we briefly sketch how Dinitz's running time of O(n^2 m) is obtained. Since d_s(t) increases each round, the number of rounds is bounded by n − 1. Each round consists of two stages: building the layered network and augmenting a blocking flow. To build the layered network, the distances from the source to every vertex in the residual network are needed. The layered network can be constructed in O(m) using a breadth-first search (BFS). Asymptotically, however, this is dominated by the time to find the blocking flow. Finding the paths of the blocking flow is done with a repeated graph traversal, usually using a depth-first search (DFS). The number of found paths is bounded by m, because each found path saturates at least one edge, removing it from the layered network. A single DFS can be done in amortized O(n) time as follows. Edges that are not part of an s-t path in the layered network do not need to be looked at more than once during one round. This is achieved by remembering for each node which edges of the layered network were already found to have no remaining path to the sink. Each subsequent DFS will start where the last one left off. Thus, per round, the depth-first searches have a combined search space of O(m), while each individual search additionally visits the nodes on one s-t path, which is O(n).

Efficient Dinitz Implementation.
Typical implementations represent the graph by adding a reversed twin for each edge. Furthermore, neither the residual network nor the layered network are constructed explicitly. The residual network is implicitly defined by the capacities and flow values on edges, and the layered network by a distance labeling. This conveniently eliminates the need to modify the network structure during the algorithm. When, e.g., saturating an edge during augmentation, this implicitly removes the edge from the residual network and layered network. However, with this representation, the BFS and DFS are performed on all edges and must check if edges are part of the residual or layered network when they are encountered. The bound for the BFS is unaffected, and the amortization argument for the DFS extends to edges that are not part of the layered and/or residual network. During the augmentation of the blocking flow, a counter into the adjacency list of each vertex indicates which outgoing edges were already processed this round.
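The implicit representation described above can be sketched as follows. This is a minimal Python sketch, not the authors' implementation: twin edges sit at adjacent indices (edge id e and its twin e ^ 1), cap stores residual capacities directly so flow values remain implicit, the distance labels from the BFS define the layered network, and the per-vertex counter ptr realizes the amortization argument.

```python
from collections import deque


class Dinitz:
    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]  # adjacency lists of edge ids
        self.to = []                       # edge target
        self.cap = []                      # remaining (residual) capacity

    def add_edge(self, u, v, c, c_rev=0):
        # Each edge gets a reversed twin; edge id e and e ^ 1 are twins.
        self.adj[u].append(len(self.to)); self.to.append(v); self.cap.append(c)
        self.adj[v].append(len(self.to)); self.to.append(u); self.cap.append(c_rev)

    def _bfs(self, s, t):
        # Distance labels implicitly define the layered network.
        self.dist = [-1] * self.n
        self.dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for e in self.adj[u]:
                v = self.to[e]
                if self.cap[e] > 0 and self.dist[v] == -1:
                    self.dist[v] = self.dist[u] + 1
                    queue.append(v)
        return self.dist[t] != -1

    def _dfs(self, u, t, pushed):
        if u == t:
            return pushed
        # ptr[u] remembers which outgoing edges are already exhausted this
        # round, giving the amortized bound for the repeated searches.
        while self.ptr[u] < len(self.adj[u]):
            e = self.adj[u][self.ptr[u]]
            v = self.to[e]
            if self.cap[e] > 0 and self.dist[u] + 1 == self.dist[v]:
                got = self._dfs(v, t, min(pushed, self.cap[e]))
                if got > 0:
                    self.cap[e] -= got      # saturating an edge implicitly
                    self.cap[e ^ 1] += got  # removes it from the layered network
                    return got
            self.ptr[u] += 1
        return 0

    def max_flow(self, s, t):
        flow = 0
        while self._bfs(s, t):              # one round per distance value
            self.ptr = [0] * self.n
            while True:
                pushed = self._dfs(s, t, float("inf"))
                if pushed == 0:
                    break
                flow += pushed
        return flow
```

Saturating an edge only updates the residual capacities of the edge and its twin, which removes it from the residual and layered network without any change to the graph structure, exactly as described above.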
Practical Performance.
The practical performance of Dinitz's algorithm is far better than its worst-case bound. Actually, O(n) as the length of the found augmenting path is very unrealistic. In our experiments, d_s(t) remains mostly below 10, implying that the number of rounds is significantly lower than n − 1. Also, the number of found augmenting paths during one round is far below O(m). In unweighted networks, for example, a DFS saturates all edges of the found path, resulting in a bound of O(m) to find a blocking flow. In fact, Dinitz's algorithm has a tight upper bound of O(n^{2/3} m) in unweighted networks [14, 24].

We adapt a common Dinitz implementation to exploit the specific structure of scale-free networks. We achieve a significant speedup by using the fact that a flow and cut, respectively, often depend only on a small fraction of the network. The following three modifications each tackle a performance bottleneck.

Bidirectional Search.
Recently, sublinear running time was shown for balanced bidirectional search in a scale-free network model [5, 6]. We use a bidirectional breadth-first search to compute the distances that define the layered network during each round of Dinitz's algorithm. A forward search is performed from the source and a backward search from the sink, each time advancing the search that incurs the lower cost to advance one layer. A shortest s-t path is found when a vertex is discovered that was already seen from the other direction. Note that, for our purpose, the bidirectional search has to finish the current layer when such a vertex is discovered, because all shortest paths must be found. Figure 1 visualizes the difference in explored vertices between a normal and a bidirectional BFS. The augmentations with DFS are restricted to the visited part of the layered network, meaning the search space of the BFS plus the next layer.

The distance labeling obtained by the bidirectional BFS requires a change to the DFS. The purpose of the layered network is to contain all edges on shortest s-t paths. The DFS identifies edges (u, v) of the layered network by checking if they increase the distance from the source, i.e., d_s(u) + 1 = d_s(v). However, we no longer obtain the distances from the source for all relevant vertices. For vertices processed by the backward search, distances to the sink d_t(v) are known instead.

[Footnote: https://cp-algorithms.com/graph/dinic.html]

Figure 1: Search space of a breadth-first search from a source s to a sink t, unidirectional (left) and bidirectional (right). The blue area represents the vertices that are explored, i.e., whose outgoing edges were scanned, by the forward search, and the green area the backward search. In the gray area are vertices that are seen during exploration of the last layer, but not yet explored. Vertices in the intersection of the upcoming layers of the backward and forward search are marked orange.
To resolve the problem, we allow edges that either increase the distance from the source or decrease the distance to the sink, i.e., d_s(u) + 1 = d_s(v) or d_t(u) − 1 = d_t(v). This deviates from the definition of the layered network. But since edges on shortest s-t paths must both increase the distance from the source and decrease the distance to the sink, we do not miss any relevant edges.

Time Stamps.
The bidirectional search reduces the search space of the breadth-first search and depth-first search substantially, potentially to sublinear. The initialization, however, still requires linear time. It includes the following. For the BFS, distances from the source and to the sink must be initialized to infinity. For the augmentations, one counter per node has to be initialized to zero.

To avoid the linear initializations, we introduce time stamps to indicate if a vertex was seen during the current round. The initialization of distances and counters is done lazily as vertices are discovered during the BFS. Another detail of our implementation is that we use begin and end indices into an array instead of a dynamically growing queue for the BFS. We allocate this memory in advance and override the data each round.
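The balanced bidirectional layering together with the time-stamped lazy initialization might look as follows. This is an illustrative Python sketch under simplifying assumptions, not the paper's implementation: it operates on a plain adjacency list rather than the residual network, estimates the cost of advancing a frontier by its summed degree, and uses a round counter as time stamp so that no per-vertex reset is needed between searches.

```python
class BiBFS:
    def __init__(self, n):
        self.n = n
        self.ds = [0] * n                         # distance from source
        self.dt = [0] * n                         # distance to sink
        self.seen_s = [0] * n                     # time stamps, forward
        self.seen_t = [0] * n                     # time stamps, backward
        self.round = 0

    def run(self, adj, s, t):
        """Balanced bidirectional BFS: advance, one full layer at a time,
        the search whose frontier is cheaper to expand. Stop once the
        frontiers meet, but finish the current layer so that all shortest
        paths are found. Returns True iff s and t are connected."""
        self.round += 1                           # invalidates old labels lazily
        fs, ft = [s], [t]
        self.seen_s[s] = self.round; self.ds[s] = 0
        self.seen_t[t] = self.round; self.dt[t] = 0
        if s == t:
            return True
        met = False
        while fs and ft and not met:
            cost_s = sum(len(adj[u]) for u in fs)
            cost_t = sum(len(adj[u]) for u in ft)
            if cost_s <= cost_t:                  # advance the cheaper side
                frontier, seen, other, dist = fs, self.seen_s, self.seen_t, self.ds
            else:
                frontier, seen, other, dist = ft, self.seen_t, self.seen_s, self.dt
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if seen[v] != self.round:
                        seen[v] = self.round      # lazy init via time stamp
                        dist[v] = dist[u] + 1
                        nxt.append(v)
                        if other[v] == self.round:
                            met = True            # finish the layer anyway
            if cost_s <= cost_t:
                fs = nxt
            else:
                ft = nxt
        return met
```

After a successful run, every explored vertex carries either a valid d_s or a valid d_t label for the current round, which is exactly the labeling the modified DFS admissibility check works on.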
Skip Next Forward Layer.
Recall that we identify edges of the layered network by checking if they increase the distance from the source or decrease the distance to the sink. Therefore, the DFS proceeds along edges outgoing from the last forward search layer independent of whether the target vertex was seen only by the forward search (gray in Figure 1) or also by the backward search (orange in Figure 1). However, the former type of vertex cannot be part of a shortest s-t path. By saving the number of explored layers of the forward search, we can avoid the exploration of such vertices, thus limiting the DFS to vertices colored blue, green, or orange in Figure 1. With this optimization, the combined search space during augmentation (lines 3, 4 in Algorithm 1) is almost limited to the search space of the BFS. The only additional edges that are visited originate from the intersection of the forward and backward search.

In this section, we investigate the performance of our algorithm
DinitzOPT. First, we compare it to established approaches on real-world networks in Section 4.1. We additionally examine the scaling behavior and how the comparison is affected by problem size, i.e., is there an asymptotic improvement over other algorithms? Then, Section 4.2 evaluates to which extent the different optimizations contribute to better run times and search space. In Section 4.3 we analyze the algorithms in a specific application (Gomory-Hu trees) and compare their usability beyond the speed of the actual flow computation. To this end, we test three different approaches to obtain a cut with the Push-Relabel algorithm. Lastly, we extend our considerations to other types of networks in Section 4.4 and discuss why the results on scale-free networks differ from previous studies. Recall that bidirectional search was found to perform particularly well on heterogeneous networks.

In this section we compare our new approach to three existing algorithms: Dinitz [12], Push-Relabel [18], and the Boykov-Kolmogorov (BK) algorithm [7]. We modified their respective implementations to support our experiments. This also includes some minor performance-relevant changes listed in the appendix (see Section A.1). The experiments include two synthetic and eight real-world networks. All networks are undirected and all but visualize-us and actors are unweighted. Further details regarding the datasets can be found in Table 2. We restrict our experiments in this section to the flow computation only. That is, the measurements exclude the time it takes to initialize intermediate data structures before and after flow computations as well as the creation of the graph structure. For Push-Relabel we only measure the computation of the preflow, which is sufficient to determine the value of the flow/cut.

[Footnote: The code will be available upon publication.]

Figure 2: Runtime comparison of flow computations. The 20 computed flows per instance are divided into low and high terminal pairs. For low, the terminal degree is between 0.75 and 1.25 times the average degree. For high, it is between 10 and 100 times the average degree. Pairs are chosen uniformly at random from all vertices with the respective degree.

Figure 2 shows the resulting run times. For this plot, the terminals were chosen uniformly at random from the set of vertices with degree close to the average (low) or considerably higher degree (high).

One can see that Dinitz and Push-Relabel display comparable times while BK is slightly slower on most large instances. DinitzOPT consistently outperforms the other algorithms by one to three orders of magnitude. The variance is also higher for DinitzOPT, with low pairs approximately one order of magnitude faster on average than high pairs. This is best seen in the girg100000 instance and suggests that DinitzOPT is able to better exploit easy problem instances. For all other algorithms the effect of the terminal degree on the run time is barely noticeable. Another observation is that all algorithms display drastically lower run times than their respective worst-case bounds would suggest.

The times in our experiments are close to what one might expect from linear algorithms. For example, Dinitz computes a flow on the as-skitter instance in one second. Considering the tight O(m n^{2/3}) bound in unweighted networks and assuming the throughput per second to be around 10^8 (a generous guess for graph algorithms) would result in an estimate of 30 minutes per flow. In this context, there are also experimental results that appear to conflict with our results. Earlier studies found Dinitz to be slower than Push-Relabel and both algorithms clearly super-linear on a series of synthetic instances [1]. However, these synthetic instances exhibit specifically crafted hard structures that are placed between designated source and sink vertices.
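The back-of-the-envelope estimate can be reproduced in a few lines. The as-skitter sizes below are the commonly reported ones, and both the Even-Tarjan O(m · n^{2/3}) unit-capacity bound (with constant factor 1) and the throughput of 10^8 edge operations per second are assumptions for illustration.

```python
# Rough worst-case estimate for one Dinitz flow computation on as-skitter.
n, m = 1_696_415, 11_095_298   # as-skitter (commonly reported size)
throughput = 1e8               # assumed edge operations per second

ops = m * n ** (2 / 3)         # O(m * n^(2/3)) unit-capacity bound
minutes = ops / throughput / 60
print(f"about {minutes:.0f} minutes per flow")
```

Compared to the measured one second per flow, this illustrates how far below the worst-case bound the practical behavior lies.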
These instances thus present substantially more challenging flow problems. We assume the low times in our experiments to be caused by the scale-free network structure and, to a lesser degree, the simplicity of the problem instances when choosing a random pair of nodes as terminals. Furthermore, most of our instances are unweighted and undirected.

Effect of the Terminal Degree.
In the following, we discuss the effect of the terminal degree and the structure of the cut on the run time of Dinitz and DinitzOPT. Note that the terminal degree is an upper bound on the size of the cut in unweighted networks. Moreover, the terminal degree in our experiments is based on the average degree, which is assumed to be constant in many real-world networks [3]. Thus, the O(mC) bound for augmenting path based algorithms, with C being the size of the cut, implies not only a linear bound for the eight unweighted networks in our experiments, but would also explain faster low pairs. Surprisingly, DinitzOPT exploits low terminal degrees much more than Dinitz. Another explanation for faster low pairs is that many cuts are close around one terminal, which is consistent with previous observations about cuts in scale-free networks [29, 34]. Moreover, Dinitz tends to perform well when the source side of the cut is small [30]. Although this does not fully explain why DinitzOPT is more sensitive to the terminal degree, we observe in Section 4.3 that Dinitz slows down massively when the source degree is high, even with a low sink degree. Since DinitzOPT always advances the side with smaller volume during the bidirectional search, it does not matter which terminal has the higher degree.

Scaling.
We perform additional experiments to analyze the scaling behavior of the algorithms. Since real networks are scarce and fixed in size, we generate synthetic networks to gradually increase the size while keeping the relevant structural properties fixed. Geometric Inhomogeneous Random Graphs (GIRGs) [8], a generalization of Hyperbolic Random Graphs [25], are a scale-free generative network model that captures many properties of real-world networks. The efficient generator [4] allows us to benchmark our algorithms on differently sized networks with similar structure. Figure 3 and Figure 4 show the results.

Figure 3: Runtime scaling of flow algorithms. The plot shows the average time per flow over multiple GIRGs and terminal pairs. Two linear and a quadratic function were added for reference.

We measure the run time over a series of GIRGs with the number of nodes growing exponentially from 1000 to 1 024 000 with 10 iterations each. In each iteration, we sample a new random graph with average degree 10, power-law exponent 2.8, dimension 1, and temperature 0. The run time for each algorithm is then averaged over 10 uniform random pairs of vertices with degree between 10 and 20. Standard deviation is shown as error bars. The lower half of the symmetric error bars seems longer due to the log-axis. We add five functions in black as reference: a quadratic and two linear functions in Figure 3 and two sublinear power functions in Figure 4.

Dinitz, Push-Relabel, and BK show a near-linear running time. Compared to the linear functions in Figure 3, Dinitz and Push-Relabel seem to scale slightly worse than linear, while DinitzOPT scales better than linear. In a construction with super-sink and super-source, a similar scaling was observed for Push-Relabel on the Yahoo Instant Messenger graph [27]. We added one of the two power functions to Figure 4 because it is the theoretical upper bound for the bidirectional search on hyperbolic random graphs with the chosen power-law exponent [5]. Also, it appears to be a good estimate for Dinitz's running time with just the first optimization of bidirectional search (DinitzBi). It was previously observed that bidirectional search on hyperbolic random graphs with the chosen parameters usually scales like the second power function
[5], which fits the run time of DinitzOPT in our experiments.

Figure 4: Scaling of Dinitz variants. This plot differs from Figure 3 only in the set of displayed algorithms.

Finally, the standard deviation and shape of the curve confirm our claim that the run time of DinitzOPT is more sensitive to the graph structure. In fact, a comparison with our intermediate versions of Dinitz shows that, while the bidirectional search improved the run time the most, each successive optimization increased the sensitivity to the graph structure.
In this section we evaluate the performance impact of the changes discussed in Section 3. We present a search space analysis and in-depth profiler results. In addition to the unmodified Dinitz, we consider four incrementally more optimized versions of the algorithm: DinitzBi, DinitzReset, DinitzStamp, and DinitzOPT. Each algorithm corresponds to adding one optimization to the previous ones.

Experimental Setup.
All optimizations can be applied in any order and combination. Instead of considering all combinations, we individually add them in a specific order, such that the next change always tackles a performance bottleneck. In fact, additional benchmarks reveal that the current optimization speeds up the computation more than enabling all other remaining changes together.

The experiments and benchmarks in this section consider 1000 uniform random terminal pairs close to the average degree on the as-skitter instance. The average distance between source and sink in the initial network is 4.2. The average number of rounds until a maximum flow is found is 4.8, where the last round runs only the BFS to verify that no augmenting path exists. Only counting rounds before the last round, 2.9 units of flow are found on average per round. Out of the 1000 cuts, 882 have value equal to the degree of the smaller terminal. Table 1 shows profiler results and search space for Dinitz and the optimized versions of the algorithm.

[Footnote: We used the Intel VTune profiler.]

Table 1: Total run times and search space of visited edges for the five intermediate versions of our Dinitz implementation during the computation of 1000 flows in as-skitter. Terminals are chosen like low pairs in Figure 2. The first seven columns show times in seconds accumulated over all flow computations. BUILD is the construction of the residual network that is reused for all flow computations, RESET means clearing flow on edges between computations, INIT includes initialization of distances and counters per round, BFS and DFS refer to the respective subroutines, FLOW is the summed time during flow computations (sum of BFS, DFS, INIT), and TOTAL is the run time of the whole application including reading the graph from file. The last three columns contain the search space relative to the number of edges in the graph in percent. Search space columns for BFS and DFS are per round, while the FLOW column lists the search space per flow, e.g., Dinitz visits on average 65.66% of all edges per BFS and every edge is visited about 5.58 times on average in one flow computation.

                              MaxFlow [s]                        Search Space [%]
             BUILD  RESET   INIT     BFS     DFS    FLOW   TOTAL    BFS    DFS    FLOW
Dinitz        0.50  56.79  14.87  405.46  426.80  847.13  904.85  65.66  63.64  558.04
DinitzBi      0.55  58.15  21.02    2.78    8.94   32.73   91.82   0.26   1.87    8.38
DinitzReset   0.50      —  20.73    2.47    8.01   31.20   32.06   0.26   1.87    8.38
DinitzStamp   0.55      —      —    2.51   10.30   12.81   13.72   0.26   1.87    8.38
DinitzOPT     0.55      —      —    2.40    1.06    3.46    4.22   0.26   0.20    2.03

Additionally, Figure 5 compares the search space with and without bidirectional search.
Bidirectional Search.
Dinitz takes 15 minutes to compute the 1000 flows, and the search space per flow is more than five times the number of edges on average. Almost all of that time is spent in BFS or DFS. The bidirectional Dinitz reduces the flow-time from 14 minutes to 30 seconds, an improvement by a factor of 25. The search space is reduced by factors of 252 for BFS, 34 for DFS, and 67 per flow. It is interesting to note that the search space of the BFS during the last round of each flow changes even more. In this round the BFS finds no s-t path. The bidirectional search visits 39 edges on average, while the normal breadth-first search visits 44% of the graph. This not only emphasizes that the cuts are close around one terminal, but also shows that the bidirectional search heavily exploits this structure.

The run time does not fully reflect this drastic reduction in search space, because DFS and BFS no longer dominate the flow computation. The initialization time per round increased by 50%, which can be explained by the additional distance label per node to store the distance to the sink (now 3 ints instead of 2). Although the initialization is a simple linear operation in the number of nodes, it takes twice as long as BFS and DFS combined. Actually, the performance of initialization heavily depends on the data layout. We decided to store node data interleaved instead of in separate buffers. This data layout reduces memory loads and facilitates cache locality because all data for one node is fetched at once. On the other hand, the choice hinders efficient initialization with SIMD instructions.

The real bottleneck, however, is to reset the flow values between computations. RESET takes almost a full minute, which is twice as long as computing the flows.
Reset Flow Between Computations.
Between flow computations, the residual capacity of all edges has to be reset before another flow can be found. After changing the BFS to a bidirectional search, resetting the flow on all edges between computations dominates the run time. To reduce the time of our benchmarks, and to make the code more efficient in situations where multiple flows are computed in the same network, we address this bottleneck. Instead of explicitly resetting flow values for all edges, we remember the edges that contain flow and reset only those. The number of edges with positive flow is typically very small in comparison to the whole network. Additionally, edges that contain flow are visited during the algorithm anyway. By storing changed edges during the DFS, resetting the flow takes at most as long as augmenting the flow in the first place. In fact, the time to reset the flow is so low that it is not detected by the profiler. This change is not mentioned in Section 3 because it does not speed up a single flow computation.

This change completely eliminates the time for RESET, while other operations are not affected. The total time to compute all 1000 flows is thus three times lower, with the flow computation making up almost all spent time. The slowest part of the flow computation itself is still the initialization, with 21 of the 31 seconds.
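The bookkeeping behind this optimization is simple. The following is a hypothetical Python sketch with invented names: record an edge the first time its flow deviates from zero during a computation, and clear exactly those entries afterwards.

```python
class FlowReset:
    """Sketch: instead of clearing flow on all m edges between flow
    computations, remember which edges changed and reset only those."""

    def __init__(self, m):
        self.flow = [0] * m   # flow per directed edge
        self.touched = []     # edges changed since the last reset

    def augment_edge(self, e, amount):
        if self.flow[e] == 0:          # first change this computation
            self.touched.append(e)     # costs O(1) on an edge we visit anyway
        self.flow[e] += amount

    def reset(self):
        # Cost proportional to the edges actually carrying flow,
        # not to the size of the whole network.
        for e in self.touched:
            self.flow[e] = 0
        self.touched.clear()
```

Since every recorded edge was visited during augmentation anyway, the reset can never take longer than finding the flow did in the first place.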
Time Stamps.
The distance labels and counters per node are initialized each round. Using time stamps eliminates the need for this initialization completely, while adding a small overhead to the DFS. The flow computation gets 2.4 times faster, taking 13 seconds instead of 31. After introducing the time stamps, the DFS is the new bottleneck and makes up about 80% of the flow time.
Figure 5: Average number of edges visited per flow computation for the terminal pairs used in Table 1, partitioned as in Figure 1. Forward/Backward Search represent the edges explored by the respective search. Next Forward/Backward Layer denote the edges that would be explored in the next step of the BFS. Edges in the Intersection originate from vertices in both upcoming BFS layers. The BFS and DFS bars show the edges that are actually visited by the algorithm. The shaded area indicates the edges skipped by our last optimization (from DinitzStamp to DinitzOPT in Table 1) and is excluded from the sum on the right.
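The time-stamp trick can be sketched as follows (illustrative names, not our exact code): instead of refilling the distance array at the start of every round, each label carries the round in which it was written, and labels from older rounds read as infinity.

```cpp
#include <limits>
#include <vector>

// Hypothetical time-stamped label array: avoids O(n) clearing per round.
struct Labels {
    std::vector<int> dist, stamp;
    int now = 0;
    explicit Labels(int n) : dist(n, 0), stamp(n, -1) {}

    // O(1) per round instead of filling dist with infinity.
    void newRound() { ++now; }

    bool visited(int v) const { return stamp[v] == now; }

    // Unvisited nodes implicitly read as "infinity".
    int get(int v) const {
        return visited(v) ? dist[v] : std::numeric_limits<int>::max();
    }
    void set(int v, int d) { dist[v] = d; stamp[v] = now; }
};
```

The small overhead is the extra stamp comparison on every read, which is the price paid in the DFS.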
Skip Next Forward Layer.
As discussed in Section 3, this change prevents the DFS from visiting vertices beyond the last layer of the forward search that are not also seen by the backward search. In Figure 5 the skipped part is shaded. This optimization reduces the average search space of the DFS during one round from almost 2% of all edges to just 0.2%. The improvement in search space is reflected by the profiler results. The DFS is sped up from 10 seconds to just one second, which is faster than the BFS. The resulting time to compute all 1000 flows is 3.46 seconds, which is only 7 times slower than building the adjacency list in the beginning. In total, the time to compute the flows with the optimized Dinitz is 245 times faster than with the unmodified Dinitz.
Misc.
Since the BFS is the slowest part of the final algorithm, we add another low-level optimization for undirected networks. Line-by-line load analysis shows that more time is spent during the backward search than the forward search. The backward search from the sink has to consider incoming instead of outgoing edges, but our implementation only maintains an adjacency list of outgoing edges. However, for each incoming edge there is an outgoing twin edge with a reference to the incoming edge. This reference is used to determine the residual capacity of the incoming edge, to check whether the incoming edge is part of the residual network.
We can save a memory lookup in the hot code of the algorithm by determining the residual capacity of the incoming edge without loading it into memory. The residual capacity of an edge is obtained by subtracting the flow from the capacity. In undirected networks, the capacity of an edge is the same as that of its twin. Additionally, flow consistency links the flow of both edges. Thus we can compute the residual capacity of incoming edges by looking only at the outgoing edges. This change improves performance by 20 to 40 percent on undirected networks. Note that a similar optimization is possible for directed networks: one can cache the capacity of the back edge in each twin. This concept is known and was applied in previous flow implementations; however, we only use the optimization for undirected networks.
In the last sections we observed that the heterogeneous network structure yields easy flow problems that can be solved significantly faster than the construction of the adjacency list. This performance becomes important in applications that require multiple flows to be found in the same network. Gomory-Hu trees [20] fit this setting and have applications in graph clustering [15].
A Gomory-Hu tree (GH-tree) of a network is a weighted tree on the same set of vertices that preserves minimum cuts, i.e., each minimum cut between any two vertices s and t in the tree is also a minimum s-t cut in the original network. Thus, GH-trees compactly represent s-t cuts for all vertex pairs of a graph. For the construction of a GH-tree, there exists a very simple algorithm by Gusfield [21] that requires n − 1 maximum flow computations.
Flow Computation on Gusfield Pairs.
Figure 6 shows the same networks and algorithms as in Figure 2, but with terminal pairs sampled from the n − 1 pairs used by Gusfield's algorithm. Our implementation is available at https://github.com/Zagrosss/maxflow.
Figure 6: Runtime comparison of flow computations. The 10 terminal pairs per instance are uniformly chosen out of the n − 1 gh pairs.
Some gh pairs measured for the soc-slashdot instance are solved by DinitzOPT and Push-Relabel in less than one microsecond, which is the precision of our measurements. This suggests that these algorithms are more sensitive to the varying difficulty of the flow computations for gh pairs. Our speedup over the Push-Relabel algorithm on gh pairs is not as pronounced as for the random pairs in Section 4.1. On the dogster instance, PR is even faster than DinitzOPT.
To further investigate why gh pairs are this easy to solve, we analyze a complete run of all pairs needed by Gusfield's algorithm on the soc-slashdot instance. In Gusfield's algorithm each vertex is the source once, thus the average degree of the source is the average degree of the graph (10.24). In contrast, the average degree of the sink is ca. 1500, which diminishes the benefit of the bidirectional search. Uni-directional Dinitz slows down by a factor of 15 when computing the flows with switched terminals. The average distance between two vertices in the original network is 4.16, but interestingly the average distance from source to sink is only 1.78. Out of the 70 k flow computations, 56 k are trivial cuts around one terminal. Computing a flow for a single s-t pair takes 2.76 rounds on average, with the last round only confirming that the flow is optimal. Before the last round, 5.56 units of flow are found per round on average.
DinitzOPT and Push-Relabel are both extremely fast on gh pairs. DinitzOPT takes 2.5 seconds to compute all n = 70 k required flows, while PR needs 5 seconds. To obtain the 5 seconds for PR we exclusively measured the preflow computation, but PR is not limited by the time to compute the preflow. Actually, the entire computation of the Gomory-Hu tree on the soc-slashdot instance takes 12 minutes with Push-Relabel and 2.6 seconds with DinitzOPT.
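For reference, Gusfield's construction [21] can be sketched with the min-cut computation hidden behind an oracle. The interface and names here are illustrative, not our actual code; the oracle stands in for any max-flow/min-cut routine (e.g., a Dinitz or Push-Relabel variant) that returns the cut value and the source side of a minimum s-t cut.

```cpp
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical oracle interface: oracle(s, t) returns the value of a
// minimum s-t cut and the source side S of that cut (with s in S).
using CutOracle =
    std::function<std::pair<double, std::set<int>>(int s, int t)>;

// Gusfield's algorithm: returns parent[] and weight[] of the Gomory-Hu
// tree (parent[0] and weight[0] are unused).
std::pair<std::vector<int>, std::vector<double>>
gusfield(int n, const CutOracle& oracle) {
    std::vector<int> parent(n, 0);
    std::vector<double> weight(n, 0.0);
    for (int i = 1; i < n; ++i) {            // n - 1 min-cut computations
        auto [f, S] = oracle(i, parent[i]);
        weight[i] = f;
        // Re-hang later vertices that share i's parent and fell on i's side.
        for (int j = i + 1; j < n; ++j)
            if (parent[j] == parent[i] && S.count(j)) parent[j] = i;
    }
    return {parent, weight};
}
```

Note that each vertex i appears as the source exactly once, which is why the source degree averages to the graph's average degree while the sink degree does not.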
The bottleneck when using PR as a cut oracle is not the flow computation but the initialization and the extraction of the cut; the Gusfield logic itself makes up less than 3% of the run time when using DinitzOPT as the oracle. The drastic difference in run time is in part due to the optimizations we added to DinitzOPT to reduce the time between flow computations, while the Push-Relabel implementation recreates its auxiliary data structures, except the adjacency list, before each flow. However, in the following we will see that a large amount of Push-Relabel's run time is actually necessary to extract the cuts for Gusfield's algorithm.
Measuring Gusfield's Algorithm.
In Gusfield's algorithm we have to iterate over all vertices in the source side of the cut. Extracting these with the Push-Relabel algorithm is slower than with Dinitz. We outline the three approaches to extract the cut with Push-Relabel and show that each has major drawbacks.
The PR algorithm is executed in two stages. The first stage computes a preflow and the second stage converts the preflow into a flow. Often, computing a preflow is sufficient, because one obtains the value of the max-flow/min-cut and can determine a cut by finding all sink-reaching vertices in the residual network. Since Gusfield requires the source side of the cut, the complement of the found set of vertices can be used. This approach is computationally expensive because of the high sink degree.
Given a max-flow, one partition of a min-cut can also be identified by reachability from the source in the residual network. Since the source usually has a smaller degree during Gusfield's algorithm, the source side of the cut is small. This approach is efficient and can be used for Dinitz. However, for PR it requires the preflow to be converted into a flow. Asymptotically, the first stage (preflow) dominates the second (convert) stage, but in practice this is not always the case. In the paper that proposed the current PR implementation [10], the authors experiment with different implementations of the conversion and find a method whose "running time [...] is a small fraction of the running time of the first stage". Other works find that 95% of the time is spent in stage one [11]. Our experiments in Section 4.1 are in line with these findings and thus only the time for the first stage of PR is shown there. However, Gusfield pairs pose easily solvable flow instances due to the low distance between source and sink. This simplicity causes the second stage of PR to dominate the first.
The drawbacks of the two previous approaches can be avoided in undirected networks by computing the preflow from sink to source. Without preflow conversion, a cut can then be extracted by determining the vertices that can reach the original source in the residual network. The drawback of this method is that the preflow computation slows down massively.
Figure 7: Distribution of spent time during Gusfield's algorithm on the soc-slashdot instance with three approaches to use the Push-Relabel algorithm as a min-cut oracle. We split the measurements into initialization, preflow, conversion, and cut identification. The time overhead for measurement, logging, and the logic of Gusfield's algorithm is included in the numbers on the right but excluded in the bars.
In short, the three approaches to extract the source side of a min-cut with the Push-Relabel algorithm are:
Convert.
Compute a preflow from the source, convert it into a flow, then run a BFS from the source.
T-Side.
Compute a preflow from the source, run a BFS backwards from the sink, then take the complement.
Swap.
Compute a preflow from the sink to the source, then run a BFS backwards from the source.
Figure 7 shows the distribution of run time when using these approaches to run Gusfield's algorithm on the soc-slashdot instance. The convert approach is the fastest with just above 12 minutes, followed by
T-Side with 18 minutes and Swap with almost an hour. The initialization time provides a reference, as it is approximately 7 minutes for all approaches. We note here that the initialization for PR creates some Boost-related data and performs an operation linear in the number of edges.
We see that the flow computation is actually really fast for convert: it takes only about 5 seconds of these 12 minutes. The initialization dominates this time, and the conversion is also far slower than the flow itself. Surprisingly, the flow takes twice as long for T-Side than for convert, although only the way to identify the cut was changed. This is because we find other min-cuts, and thus obtain a different GH-tree, while processing different terminal pairs. We also implemented the T-Side approach for DinitzOPT to verify the correctness of the computed cuts and trees. Interestingly, running this takes 4.5 minutes, which is a factor of 100 slower than identifying the cut via the source side for DinitzOPT. Similarly for PR, we observe that the cut identification, which was almost unnoticeable for convert, makes up most of the computation time for T-Side.
Lastly, the Swap approach takes more than 4 times as long as the convert approach. As the degree of the sink is significantly larger than that of the source, the flow computation slows down massively; it goes from 5 seconds to 47 minutes. Recall that the unmodified Dinitz slows down by a factor of 15 when switching source and sink.
In conclusion, all three methods perform significantly worse than DinitzOPT, not because PR flow computations are slow, but because the initialization and cut identification already take orders of magnitude longer than the complete process with DinitzOPT. Both methods that avoid the four-minute run time of stage two of the Push-Relabel algorithm imply an even worse performance cost: either a breadth-first search that has to traverse almost the whole graph (T-Side) or significantly slower preflow computations (Swap).
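For Dinitz, extracting the source side of the cut is the simple residual-network BFS sketched below (illustrative adjacency layout and names, not our exact code): given a maximum flow, the source side is the set of vertices reachable from s via edges with positive residual capacity.

```cpp
#include <queue>
#include <vector>

// Hypothetical edge record; adj[u] holds indices of u's outgoing edges.
struct Edge { int to; double cap, flow; };

std::vector<int> sourceSideCut(int s, int n,
                               const std::vector<std::vector<int>>& adj,
                               const std::vector<Edge>& edges) {
    std::vector<char> seen(n, 0);
    std::queue<int> q;
    seen[s] = 1;
    q.push(s);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int e : adj[u]) {
            const Edge& ed = edges[e];
            if (!seen[ed.to] && ed.cap - ed.flow > 0) {  // residual edge
                seen[ed.to] = 1;
                q.push(ed.to);
            }
        }
    }
    std::vector<int> side;
    for (int v = 0; v < n; ++v)
        if (seen[v]) side.push_back(v);
    return side;
}
```

Since the source side is small on Gusfield pairs, this BFS touches only a tiny part of the network, which is exactly what makes the Dinitz-based cut extraction cheap.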
After evaluating the performance on heterogeneous networks, we extend our experiments to networks of different structure. We consider the following networks: an Erdős-Rényi random graph [13] (er100000), an Erdős-Rényi random graph with uniform random edge weights (er100000_weighted), an Erdős-Rényi random graph with super terminals (er100000_super), a generated layered network [1] (layered10000), the road network of Pennsylvania (roadNet-PA), and a liver CT scan as a regular 6-connected grid (liver.n6c100). Further details regarding the datasets can be found in Section A.2.
Figure 8: Run time of max-flow computations for various networks. Each point corresponds to one s-t flow. For each instance we computed 50 s-t flows. The instances er100000_super, layered10000, and liver.n6c100 have designated terminals. For er100000, er100000_weighted, and roadNet-PA, terminals are chosen uniformly at random. Unlike the experiments in Section 4.1, the algorithms rebuild their internal data structures, including the adjacency list, before each flow computation. This was necessary to prevent the BK-algorithm from reusing search trees, which makes the instances with given terminal pairs trivial after the first run.
Figure 8 shows the performance of the flow algorithms on these instances. The performance on the Erdős-Rényi graphs is similar to our results for heterogeneous networks; the BK-algorithm is the slowest, followed by Dinitz, Push-Relabel, and DinitzOPT in this order. Note that a running time close to O(√n) was shown for bidirectional search on Erdős-Rényi random graphs [6]. Neither weights nor higher-degree terminals change how the algorithms compare among each other.
The layered network, which is specifically constructed to produce a computationally difficult flow instance [1], is indeed more difficult than the others. In the layered network, Push-Relabel is at least five times faster than Dinitz. DinitzOPT is 10-20% slower than Dinitz. After all, our optimizations trade a small overhead during flow computation for the possibility of sublinear running time on particularly easy instances.
For the road network, the choice of the algorithm does not matter as much as for the other instances. The choice of the terminal pair, however, affects the performance immensely. With a diameter of almost 800 and a very homogeneous degree distribution, the uniform random choice of terminal pairs produces problems of varying difficulty. Dinitz, BK, and DinitzOPT capitalize on the easier pairs, while Push-Relabel shows less variance between pairs.
Lastly, the liver scan produces different results than the previous instances. The BK-algorithm was specifically designed for this kind of network structure and application.
Unsurprisingly, the BK-algorithm performs best, followed by Push-Relabel, Dinitz, and DinitzOPT.
We presented a modified version of Dinitz's algorithm with greatly improved run time and search space on real-world and generated scale-free networks. The scaling behavior appears to be sublinear, which matches previous theoretical and empirical observations about the running time of balanced bidirectional search in scale-free random networks. While these theoretical bounds apply during the first round of our algorithm, it is still unknown whether the analysis can be extended to account for the changes in the residual network. Our experiments, however, indicate that the search space remains small in subsequent rounds.
We observe that the low diameter and heterogeneous degree distribution lead to small and unbalanced cuts that our algorithm finds very efficiently. The flow computations required to compute a Gomory-Hu tree are even easier, making usually insignificant parts of the tested algorithms the bottleneck. For example, the preflow conversion leads to Push-Relabel being greatly outperformed by our algorithm in this setting.
Our results on other types of instances show that their structural properties play a huge role when comparing flow algorithms. It is not surprising that our algorithm is outperformed by the BK-algorithm, which was specifically designed for vision problems, on liver.n6c100. Moreover, the experiments on the artificial layered10000 instance indicate that Push-Relabel is more robust regarding hard instances. On scale-free networks, however, we drastically improve performance over existing algorithms.
References
[1] Ravindra K. Ahuja, Murali Kodialam, Ajay K. Mishra, and James B. Orlin. Computational investigations of maximum flow algorithms.
European Journal of Operational Research, 97(3):509–542, 1997. doi:10.1016/S0377-2217(96)00269-X.
[2] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows: Theory, Algorithms and Applications. Prentice-Hall, Inc., 1993.
[3] Albert-László Barabási. Network Science. Cambridge University Press, 2016.
[4] Thomas Bläsius, Tobias Friedrich, Maximilian Katzmann, Ulrich Meyer, Manuel Penschuck, and Christopher Weyand. Efficiently Generating Geometric Inhomogeneous and Hyperbolic Random Graphs. In ESA 2019, volume 144 of Leibniz International Proceedings in Informatics (LIPIcs), pages 21:1–21:14. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.ESA.2019.21.
[5] Thomas Bläsius, Cedric Freiberger, Tobias Friedrich, Maximilian Katzmann, Felix Montenegro-Retana, and Marianne Thieffry. Efficient Shortest Paths in Scale-Free Networks with Underlying Hyperbolic Geometry. In ICALP 2018, volume 107 of Leibniz International Proceedings in Informatics (LIPIcs), pages 20:1–20:14. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.20.
[6] Michele Borassi and Emanuele Natale. KADABRA is an ADaptive Algorithm for Betweenness via Random Approximation. In ESA 2016, volume 57 of Leibniz International Proceedings in Informatics (LIPIcs), pages 20:1–20:18. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ESA.2016.20.
[7] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(9):1124–1137, 2004. doi:10.1109/TPAMI.2004.60.
[8] Karl Bringmann, Ralph Keusch, and Johannes Lengler. Geometric inhomogeneous random graphs. Theoretical Computer Science, 760:35–54, 2019. doi:10.1016/j.tcs.2018.08.014.
[9] Bala G. Chandran and Dorit S. Hochbaum. A computational study of the pseudoflow and push-relabel algorithms for the maximum flow problem. Operations Research, 57(2):358–376, 2009.
[10] B. V. Cherkassky and A. V. Goldberg. On implementing the push-relabel method for the maximum flow problem. Algorithmica, 19(4):390–410, 1997. doi:10.1007/pl00009180.
[11] U. Derigs and W. Meier. Implementing Goldberg's max-flow-algorithm: a computational investigation.
Zeitschrift für Operations Research, 33(6):383–403, 1989. doi:10.1007/BF01415937.
[12] Yefim Dinitz. Algorithm for Solution of a Problem of Maximum Flow in Networks with Power Estimation. Soviet Mathematics Doklady, 11:1277–1280, 1970.
[13] Paul Erdős and Alfréd Rényi. On random graphs, I. Publicationes Mathematicae (Debrecen), 6:290–297, 1959.
[14] Shimon Even and R. Endre Tarjan. Network flow and testing graph connectivity. SIAM Journal on Computing, 4(4):507–518, 1975. doi:10.1137/0204043.
[15] Gary William Flake, Robert E. Tarjan, and Kostas Tsioutsiouliklis. Graph clustering and minimum cut trees. Internet Mathematics, 1(4):385–408, 2004. doi:10.1080/15427951.2004.10129093.
[16] L. R. Ford and D. R. Fulkerson. Maximal flow through a network. Canadian Journal of Mathematics, 8:399–404, 1956. doi:10.4153/CJM-1956-045-5.
[17] Andrew V. Goldberg, Sagi Hed, Haim Kaplan, Robert E. Tarjan, and Renato F. Werneck. Maximum Flows by Incremental Breadth-First Search. In ESA 2011, Lecture Notes in Computer Science, pages 457–468. Springer, 2011. doi:10.1007/978-3-642-23719-5_39.
[18] Andrew V. Goldberg and Robert E. Tarjan. A new approach to the maximum-flow problem.
Journal of the ACM, 35(4):921–940, 1988. doi:10.1145/48014.61051.
[19] Andrew V. Goldberg and Robert E. Tarjan. Efficient maximum flow algorithms. Communications of the ACM, 57(8):82–89, 2014. doi:10.1145/2628036.
[20] R. E. Gomory and T. C. Hu. Multi-Terminal Network Flows. Journal of the Society for Industrial and Applied Mathematics, 9(4):551–570, 1961.
[21] Dan Gusfield. Very simple methods for all pairs network flow analysis. SIAM Journal on Computing, 19(1):143–155, 1990. doi:10.1137/0219009.
[22] Felix Halim, Roland H.C. Yap, and Yongzheng Wu. A MapReduce-Based Maximum-Flow Algorithm for Large Small-World Network Graphs. In ICDCS 2011, pages 192–202, 2011. doi:10.1109/ICDCS.2011.62.
[23] Dorit S. Hochbaum. The pseudoflow algorithm: A new algorithm for the maximum-flow problem. Operations Research, 56(4):992–1009, 2008. doi:10.1287/opre.1080.0524.
[24] Alexander V. Karzanov. On finding a maximum flow in a network with special structure and some applications. Matematicheskie Voprosy Upravleniya Proizvodstvom, 5:81–94, 1973.
[25] Dmitri Krioukov, Fragkiskos Papadopoulos, Maksim Kitsak, Amin Vahdat, and Marián Boguñá. Hyperbolic geometry of complex networks. Physical Review E, 82(3), 2010. doi:10.1103/physreve.82.036106.
[26] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013. URL: http://konect.cc/.
[27] Kevin Lang. Finding good nearly balanced cuts in power law graphs. Technical Report YRL-2004-036, Yahoo! Research Labs, 2004.
[28] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, 2014.
[29] Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters.
Internet Mathematics, 6(1):29–123, 2009. doi:10.1080/15427951.2009.10129177.
[30] Lorenzo Orecchia and Zeyuan Allen Zhu. Flow-based algorithms for local graph clustering. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2014. doi:10.1137/1.9781611973402.94.
[31] Ryan A. Rossi and Nesreen K. Ahmed. The network data repository with interactive graph analytics and visualization. In AAAI, 2015. URL: http://networkrepository.com.
[32] Satu Elisa Schaeffer. Graph clustering. Computer Science Review, 1(1):27–64, 2007. doi:10.1016/j.cosrev.2007.05.001.
[33] Boris Schäling. The Boost C++ Libraries. Boris Schäling, 2011. URL: https://theboostcpplibraries.com/.
[34] S.-W. Son, H. Jeong, and J. D. Noh. Random field Ising model and community structure in complex networks. The European Physical Journal B, 50(3):431–437, 2006. doi:10.1140/epjb/e2006-00155-4.
[35] Tanmay Verma and Dhruv Batra. MaxFlow revisited: An empirical comparison of maxflow algorithms for dense vision problems. In Proceedings of the British Machine Vision Conference 2012. British Machine Vision Association, 2012. doi:10.5244/c.26.61.
Appendix
A.1 Implementation Details.
Experiments were done on a Dell XPS 15 9570 laptop with an Intel Core i7-8750H CPU.
BK-Algorithm.
As a BK implementation we use the one that was written for the original paper [7], provided on the web page of Vladimir Kolmogorov (http://pub.ist.ac.at/~vnk/software.html). For each s-t flow we add edges with huge capacity between s, t and the virtual terminals. After the flow is computed, we remove these edges again. This O(1) work is included in the time measurements. We apply the reuse-trees feature and mark the changed terminals between flow computations accordingly. Internal memory is allocated on network construction and not per flow. There is a BK implementation available in Boost. We found the original one easier to use, because its interface is tailored towards multiple flow computations and provides easy and efficient access to the found cut.
Push-Relabel.
The original implementation, used for example in [35], is no longer available. We use the C++ version of the original implementation provided in Boost. The Boost version is mostly the same code (up to identical variable names) ported to C++, but it is data structure agnostic. Therefore, we had to reimplement the linearized adjacency list data structure used in the original implementation.
Dinitz and DinitzOPT.
Our implementation is based on a version of Dinitz that is commonly used in programming competitions (https://cp-algorithms.com/graph/dinic.html). We changed the graph representation to a linear adjacency list of outgoing edges. Edges are sorted by originating vertex in linear time. Each node stores a range of edges into this list. This is the same structure used for the Push-Relabel implementation. Performance-wise, the data structure significantly reduces the time to build large networks, but the flow time remains the same. We use an array of size n as a queue, because during BFS each vertex is pushed at most once. We allocate memory for distance labels, counters, and the queue in advance when the network is built instead of per flow. In the unidirectional BFS, one could break when the sink is encountered, but we finish the current layer for the purpose of measuring the search space.
Undirected Networks.
We support flow on undirected networks. A simple way to do this is to represent each undirected edge as two directed edges, which is what we did for Push-Relabel. However, each directed edge already implies two edges in the residual network: one with the given capacity, and a reversed twin edge with no capacity. To avoid storing four times the amount of edges, the twin edge can be used to implement undirected flow. By giving the twin edge the same capacity as its counterpart, the exact same implementation can be used for undirected as well as directed networks.
Non-Integer Capacities.
We use 64-bit floating point numbers instead of integers to represent flow values and capacities, because some applications use non-integer capacities. The same implementation can be used, but it requires more memory and additional checks to handle floating point imprecision. We applied this to Dinitz, PR, and BK and observed a performance drop of approximately 10% for all algorithms. Note that the range in which 64-bit floats exactly represent integral numbers even exceeds the range of 32-bit integers. Precision issues are caused by the infinite-capacity edges introduced for BK. To resolve this, the representation of infinity on these edges must be chosen according to the range of capacities.
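The twin-edge representation for undirected networks described above, together with the residual-capacity shortcut for incoming edges, can be sketched as follows (illustrative names and layout, not our exact code; twins are assumed to be stored at adjacent indices).

```cpp
#include <vector>

// Hypothetical edge record; edge e and its twin e ^ 1 are stored adjacently.
struct Edge { int to; double cap, flow; };
std::vector<Edge> g;

void addUndirectedEdge(int u, int v, double c) {
    // Both directions get the full capacity; flow consistency keeps
    // g[e].flow == -g[e ^ 1].flow at all times.
    g.push_back({v, c, 0.0});
    g.push_back({u, c, 0.0});
}

// Residual capacity of edge e itself.
double residual(int e) { return g[e].cap - g[e].flow; }

// Residual capacity of the twin of e without loading g[e ^ 1]:
// in undirected networks cap(twin) == cap(e) and flow(twin) == -flow(e),
// so residual(twin) == cap(e) + flow(e). This saves a memory lookup in
// the hot code of the backward search.
double residualOfTwin(int e) { return g[e].cap + g[e].flow; }
```

For directed networks the same shortcut would require caching the twin's capacity in each edge record, as noted in Section 4.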
A.2 Data.
We obtained the datasets from the University of Koblenz (KONECT) [26], the Network Repository website [31], as well as the Stanford Network Analysis Project (SNAP) [28]. Furthermore, we used the GIRG generator by Bläsius et al. [4], mostly with default parameters. We implemented the ER model and the layered network construction from Ahuja et al. [1]. The parameters for ER are n = 100000 and p = 0.02. The parameters for the layered network are taken from the largest instance in their paper (W=71, L=141, d=10). Lastly, the liver.n6c100 instance is from the University of Western Ontario. It is a regular 3D grid with 170x170x144 nodes, 6 edges per node, capacities up to 100, and a super sink/source. We converted all instances to a text-based edge list with zero-based indices. In Section 4.4 we use the directed DIMACS format instead.
Table 2: Instances used in this paper. The road network was undirected and is converted to the directed DIMACS format. In this case, the number of edges refers to the undirected version.
instance         directed  weighted    nodes    edges  avg. degree  source
fb-pages-tvshow                           4K      17K         8.87  Network Repository
girg10000                                10K      60K        11.99  generated
soc-slashdot                             70K     360K        10.24  Network Repository
girg100000                              100K     600K        12.00  generated
soc-flickr                              514K     3.2M        12.42  Network Repository
visualize-us        ✓         ✓            …        …            …  …
…                                        10K     100K         9.96  generated
roadNet-PA         (✓)                  1.1M     1.5M         2.83  U. Stanford
liver.n6c100        ✓         ✓            …        …            …  …