Memory Efficient Max Flow for Multi-label Submodular MRFs
Thalaiyasingam Ajanthan, Student Member, IEEE, Richard Hartley, Fellow, IEEE, and Mathieu Salzmann, Member, IEEE
Abstract — Multi-label submodular Markov Random Fields (MRFs) have been shown to be solvable using max-flow based on an encoding of the labels proposed by Ishikawa, in which each variable X_i is represented by ℓ nodes (where ℓ is the number of labels) arranged in a column. However, this method in general requires ℓ² edges for each pair of neighbouring variables. This makes it inapplicable to realistic problems with many variables and labels, due to excessive memory requirements. In this paper, we introduce a variant of the max-flow algorithm that requires much less storage. Consequently, our algorithm makes it possible to optimally solve multi-label submodular problems involving large numbers of variables and labels on a standard computer.

Index Terms — Max-flow, Multi-label submodular, Memory efficiency, Flow encoding, Graphical models.
1 INTRODUCTION
Ishikawa [1] introduced a max-flow-based method to globally minimize the energy of multi-label MRFs with convex edge terms. In [2], this method was extended to energy functions satisfying the multi-label submodularity condition, analogous to the submodularity condition for MRFs with binary labels. In the general case, however, this method requires ℓ² directed edges for each pair of neighbouring variables. For instance, for a 1000 × 1000, 4-connected image with 256 labels, it would require approximately 2 × 1000 × 1000 × 256 × 256 × 4 bytes ≈ 500 GB of memory to store the edges (assuming 4 bytes per edge). Clearly, this is beyond the storage capacity of most computers.

In this paper, we introduce a variant of the max-flow algorithm that requires storing only two ℓ-dimensional vectors per variable pair instead of the ℓ² edge capacities of the standard max-flow algorithm. In the example discussed above, our algorithm would therefore use only 4 GB of memory for the edges. As a result, our approach lets us optimally solve much larger problems.

More specifically, in contrast to the usual augmenting path algorithm [3], we do not store the residual edge capacities at each iteration. Instead, our algorithm records two ℓ-dimensional flow-related quantities for every pair of neighbouring variables. We show that, at any stage of the algorithm, the residual edge capacities can be computed from these flow-related quantities and the initial edge capacities. This, of course, assumes that the initial capacities can be computed by some memory-efficient routine, which is almost always the case in computer vision.

• T. Ajanthan and R. Hartley are with the College of Engineering and Computer Science, Australian National University, Canberra and Data61, CSIRO, Canberra. E-mail: [email protected]
• M. Salzmann is with the Computer Vision Laboratory, École Polytechnique Fédérale de Lausanne.
• Data61 (formerly NICTA) is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council (ARC) through the ICT Centre of Excellence program.
The optimality of Ishikawa's formalism made it a method of choice as a subroutine in many approximate energy minimization algorithms, such as multi-label moves [4], [5] and IRGC [6]. Since our approach can simply replace the standard max-flow algorithm [7] in Ishikawa-type graphs, it also allows us to minimize the energy of much larger non-submodular MRFs in such approximate techniques. Furthermore, due to the similarity to standard max-flow, our algorithm can easily be extended to handle dynamic MRFs [8] and can also be accelerated using the parallel max-flow technique [9].

We demonstrate the effectiveness of our algorithm on the problems of stereo correspondence estimation and image inpainting. Our experimental evaluation shows that our method can solve much larger problems than standard max-flow on a standard computer and is an order of magnitude faster than state-of-the-art message-passing algorithms [10], [11], [12]. Our code is available at https://github.com/tajanthan/memf.

A preliminary version of this paper appeared in [13]. This extended version contains a polynomial time version of the MEMF algorithm, a discussion of the equivalence with min-sum message passing, and an experiment to evaluate the empirical time complexity.
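The memory figures quoted above can be checked with a back-of-envelope calculation. The sketch below is ours, not the paper's; the 1000 × 1000 grid size is an assumption consistent with the ~4 GB figure for our encoding.

```python
# Back-of-envelope storage comparison (a sketch; the 1000x1000 grid size
# is an assumption consistent with the ~4 GB figure quoted above).
height = width = 1000
num_labels = 256          # l
bytes_per_value = 4

# A 4-connected grid has height*(width-1) + (height-1)*width variable pairs.
pairs = height * (width - 1) + (height - 1) * width

# Standard max-flow on the Ishikawa graph: l^2 residual capacities per pair.
standard_bytes = pairs * num_labels ** 2 * bytes_per_value

# MEMF: two l-dimensional vectors (exit-flows) per pair.
memf_bytes = pairs * 2 * num_labels * bytes_per_value

print(f"standard: {standard_bytes / 1e9:.0f} GB, MEMF: {memf_bytes / 1e9:.1f} GB")
```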
2 PRELIMINARIES
Let X_i be a random variable taking label x_i ∈ L. A pairwise MRF defined over a set of such random variables can be represented by an energy of the form

E(x) = Σ_{i∈V} θ_i(x_i) + Σ_{(i,j)∈E} θ_{ij}(x_i, x_j) ,   (1)

where θ_i and θ_{ij} denote the unary potentials (i.e., data costs) and pairwise potentials (i.e., interaction costs), respectively. Here, V is the set of vertices, e.g., corresponding to pixels or superpixels in an image, and E is the set of edges in the MRF, e.g., encoding a 4-connected or 8-connected grid over the image pixels.

Fig. 1:
Example of an Ishikawa graph. The graph incorporates edges with infinite capacity from U_{i:λ} to U_{i:λ+1}, not shown in the graph. Here the cut corresponds to a labeling over the label set L = {0, 1, 2, 3}.

In this work, we consider a pairwise MRF with an ordered label set L = {0, 1, ..., ℓ−1}, and we assume that the pairwise terms satisfy the multi-label submodularity condition [2]:

θ_{ij}(λ′, µ) + θ_{ij}(λ, µ′) − θ_{ij}(λ, µ) − θ_{ij}(λ′, µ′) ≥ 0 ,   (2)

for all λ, λ′, µ, µ′ ∈ L, where λ < λ′ and µ < µ′. Furthermore, we assume that the pairwise potentials can be computed either by some routine or can be stored in an efficient manner. In other words, we assume that we do not need to store each individual pairwise term. Note that, in computer vision, this comes at virtually no loss of generality.

Ishikawa [1] introduced a method to represent the multi-label energy function (1) in a graph. The basic idea behind the Ishikawa construction is to encode the label X_i = x_i of a vertex i ∈ V using binary-valued random variables U_{i:λ}, one for each label λ ∈ L. In particular, the encoding is defined as u_{i:λ} = 1 if and only if x_i ≥ λ, and 0 otherwise. The Ishikawa graph is then an st-graph Ĝ = (V̂ ∪ {0, 1}, Ê), where the set of nodes and the set of edges are defined as follows:

V̂ = {U_{i:λ} | i ∈ V, λ ∈ {1, ..., ℓ−1}} ,   (3)
Ê = Ê_v ∪ Ê_c ,
Ê_v = {(U_{i:λ}, U_{i:λ±1}) | i ∈ V, λ ∈ {1, ..., ℓ−1}} ,
Ê_c = {(U_{i:λ}, U_{j:µ}), (U_{j:µ}, U_{i:λ}) | (i,j) ∈ E, U_{i:λ}, U_{j:µ} ∈ V̂} ,

where Ê_v is the set of vertical edges and Ê_c is the set of cross edges. Note that the nodes U_{i:ℓ} and U_{i:0} are identified as node 0 and node 1, respectively. We denote the Ishikawa edges by e_{ij:λµ} ∈ Ê (this contains edges in both directions) and their capacities by φ_{ij:λµ}.
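The encoding just defined is easy to sketch: a label x_i ∈ {0, ..., ℓ−1} maps to a monotone binary column, and is recovered by counting the 1s. The helper names below are ours, not the paper's.

```python
# Illustrative sketch of Ishikawa's label encoding (helper names are ours).
def encode(x_i, num_labels):
    """Binary column [u_{i:1}, ..., u_{i:l-1}] with u = 1 iff x_i >= lam."""
    return [1 if x_i >= lam else 0 for lam in range(1, num_labels)]

def decode(u):
    """The column is monotone non-increasing, so the label is the count of 1s."""
    return sum(u)

assert encode(2, 4) == [1, 1, 0]
assert all(decode(encode(x, 4)) == x for x in range(4))
```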
We also denote by e_{i:λ} the downward edge (U_{i:λ+1}, U_{i:λ}). An example of an Ishikawa graph is shown in Fig. 1.

In an st-graph, a labeling x is represented by a "cut" in the graph (a "cut" partitions the nodes in the graph into two disjoint subsets V̂_0 and V̂_1, with 0 ∈ V̂_0 and 1 ∈ V̂_1). Then, the value of the energy function E(x) is equal to the sum of the capacities on the edges from V̂_0 to V̂_1. In an Ishikawa graph, if the downward edge e_{i:λ} is in the "cut", then vertex i takes label λ. In MRF energy minimization, each vertex i takes exactly one label x_i, which means that exactly one edge e_{i:λ} must be in the min-cut of the Ishikawa graph. This is ensured by having infinite capacity for each upward edge e_{ii:λλ+1}, i.e., φ_{ii:λλ+1} = ∞ for all i ∈ V and λ ∈ L. Note that, by construction of the Ishikawa graph, the capacities φ and the energy parameters θ are related according to the following formula:

θ_i(λ) = φ_{ii:λ+1λ} = φ_{i:λ} ,   (4)
θ_{ij}(λ, µ) = Σ_{λ′>λ, µ′≤µ} φ_{ij:λ′µ′} + Σ_{λ′≤λ, µ′>µ} φ_{ji:µ′λ′} .

Finding the minimum energy labeling is a min-cut problem, which can be solved optimally using the max-flow algorithm [3] when the edge capacities are non-negative. As shown in [2], a multi-label submodular energy function can be represented by an Ishikawa graph with non-negative edge capacities φ and can therefore be minimized optimally by max-flow.

The most popular max-flow algorithm in computer vision [7] is an augmenting path algorithm that finds a path from node 0 to node 1 through positive edges (called an augmenting path) and then pushes the maximum flow without exceeding the edge capacities (called augmentation). The augmentation operation changes the edge capacities in the graph, and therefore, the residual graph needs to be stored. That is, when applied to the Ishikawa graph, the max-flow algorithm stores ℓ² values per pair of neighbouring variables. For large numbers of labels and variables, the memory requirement is high and, in many practical problems, exceeds the capacity of most computers.

Let us assume that the max-flow algorithm is applied to the Ishikawa graph.

1. Some authors denote the nodes 0 and 1 by s and t.
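The pairwise part of relation (4) can be checked numerically. The sketch below uses dense matrices for the cross capacities of a single pair (i, j), a layout of our choosing for illustration only.

```python
# Sketch of the pairwise part of relation (4): theta_ij(lam, mu) as a double
# sum of cross capacities (dense matrices indexed [lam][mu]; layout is ours).
def theta_pairwise(phi_ij, phi_ji, lam, mu, num_labels):
    total = 0
    for lp in range(lam + 1, num_labels):      # lam' > lam
        for mp in range(mu + 1):               # mu' <= mu
            total += phi_ij[lp][mp]
    for lp in range(lam + 1):                  # lam' <= lam
        for mp in range(mu + 1, num_labels):   # mu' > mu
            total += phi_ji[mp][lp]
    return total
```

With non-negative capacities φ, any θ obtained this way satisfies the multi-label submodularity condition (2), which is the direction used by [2].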
As the algorithm proceeds, the capacities on the edges in the graph change in response to the flow. Here, instead of storing the residual graph, we propose recording the flow that has been applied to the graph. However, since storing the flow would also require ℓ² values per variable pair, we propose recording two ℓ-dimensional quantities related to the flow between pairs of variables. More precisely, for each directed edge (i,j) ∈ E⁺, we record the sum of outgoing flows from each node U_{i:λ} to the nodes U_{j:µ} for all µ ∈ {1, ..., ℓ−1}. We call this quantity an exit-flow, denoted by Σ_{ij:λ} (defined below in Eq. 6). We show that these exit-flows allow us to reconstruct a permissible flow (defined below in Def. 3.3), which in turn lets us compute the residual edge capacities from the initial ones. Importantly, while flow reconstruction is not unique, we show that all such reconstructions are equivalent up to a null flow (Def. 3.4), which does not affect the energy function. Note that this idea can be applied to any augmenting path algorithm, as long as the residual graph can be rapidly constructed.

For increased efficiency, we then show how finding an augmenting path can be achieved in a simplified Ishikawa graph (called block-graph) that amalgamates the nodes in each column into blocks. We then perform augmentation, which translates to updating our exit-flows, in this block-graph. As a side effect, since an augmenting path in our block-graph corresponds to a collection of augmenting paths in the Ishikawa graph, our algorithm converges in fewer iterations than the standard max-flow implementation of [7].

2. E⁺ denotes the set of directed edges between the vertices in the MRF, i.e., if (i,j) ∈ E then (i,j) ∈ E⁺ and (j,i) ∈ E⁺.

3 MEMORY EFFICIENT FLOW ENCODING
Before we introduce our memory efficient max flow algorithm, let us describe how the cumulative flow can be stored in a memory efficient manner. This technique can be used in any augmenting path flow algorithm, by reconstructing the residual edge capacities whenever needed.

Let us assume that the max-flow algorithm is applied to the Ishikawa graph. At some point in the algorithm, flow has passed along many of the edges of the graph.
Definition 3.1.
A flow is a mapping ψ : Ê → ℝ, denoted by ψ_{ij:λµ} for the edges e_{ij:λµ}, that satisfies the anti-symmetry condition ψ_{ij:λµ} = −ψ_{ji:µλ} for all e_{ij:λµ} ∈ Ê.

A flow is called conservative if the total flow into a node is zero for all nodes, except for the source and the terminal, i.e.,

Σ_{j,µ | e_{ji:µλ} ∈ Ê} ψ_{ji:µλ} = 0   ∀ U_{i:λ} ∈ V̂ .   (5)

Given ψ, the residual capacities of the Ishikawa graph are updated as φ̄ = φ − ψ, where φ represents the initial edge capacities. Furthermore, we call the flow restricted to each column the column-flows, which we denote by ψ_{i:λ}; i ∈ V, λ ∈ L.

At first sight, it might seem that, to apply the max-flow algorithm, it is necessary to keep track of all the values ψ_{ij:λµ}, which would require the same order of storage as recording all the edge capacities. Below, however, we show that it is necessary to store only O(ℓ) values for each (i,j) ∈ E, instead of O(ℓ²).

To this end, the flow values that we store in our algorithm, namely source-flows and exit-flows, are defined below.

Definition 3.2.
1) For each i ∈ V, the flow out from the source node, ψ_{i:ℓ−1}, is called a source-flow.
2) For each (i,j) ∈ E⁺ and λ ∈ {1, ..., ℓ−1}, we define an exit-flow as

Σ_{ij:λ} = Σ_µ ψ_{ij:λµ} .   (6)

We will show that these source-flows and exit-flows permit the flow ψ to be reconstructed up to equivalence. Now, let us define some additional properties of flow, which will be useful in our exposition.

Definition 3.3.
A flow ψ is called permissible if φ_{ij:λµ} − ψ_{ij:λµ} ≥ 0 for all e_{ij:λµ} ∈ Ê.
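For one variable pair (i, j), the exit-flows of Definition 3.2 and the anti-symmetry of Definition 3.1 can be illustrated with a dense cross-flow matrix, a layout the algorithm itself never stores.

```python
# Sketch of exit-flows (Definition 3.2): Sigma_{ij:lam} = sum_mu psi_{ij:lam,mu}.
# psi[lam][mu] holds the cross-flow of one pair (i, j) -- purely illustrative,
# since the point of the paper is to avoid storing psi itself.
def exit_flows(psi):
    return [sum(row) for row in psi]

# By anti-symmetry (Definition 3.1), psi_{ji:mu,lam} = -psi_{ij:lam,mu}, so
# the opposite-direction exit-flows are the negated column sums.
def reverse_exit_flows(psi):
    return [-sum(psi[lam][mu] for lam in range(len(psi)))
            for mu in range(len(psi[0]))]

psi = [[2, 0, 0],
       [1, -1, 0],
       [0, 0, 3]]
assert exit_flows(psi) == [2, 0, 3]
assert reverse_exit_flows(psi) == [-3, 1, -3]
```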
3. A conservative flow is often referred to as a flow in the literature.

(a) ψ  (b) ψ′  (c) ψ ≡ ψ′

Fig. 2:
An example of two equivalent flow representations with the same exit-flows. Note that each red arrow represents the value ψ_{ij:λµ}; the opposite arrows ψ_{ji:µλ} are not shown. Furthermore, the exit-flows Σ are shown next to the nodes, and the initial edges φ are not shown. In (c), the flow ψ′ is obtained from ψ by passing flow around a loop.
A flow ψ is called null if the total flow into a node is zero for all nodes, including the source and the terminal, i.e., it satisfies Eq. 5 for all U_{i:λ} ∈ V̂ ∪ {0, 1}.

Note that a null flow does not change the energy function represented by the st-graph and is identical to passing flow around loops. Also, if ψ is a null flow then so is −ψ. Furthermore, note that the energy function encoded by an st-graph is a quadratic pseudo-boolean function [14], and a reparametrization of such a function is identical to a null flow in the corresponding st-graph.

Lemma 3.1.
Two sets of capacities φ and φ′ represent the same energy function exactly (not up to a constant), written as E_φ ≡ E_{φ′}, if and only if φ′ − φ is a null flow.

Proof. This lemma is a restatement of the reparametrization lemma of [10], [15] in the context of st-graphs.

Let φ̄ and φ̄′ be two sets of residual capacities obtained from an initial set of capacities φ by passing two flows ψ and ψ′, i.e., φ̄ = φ − ψ and φ̄′ = φ − ψ′. If φ̄ and φ̄′ are equivalent, then, by Lemma 3.1, (φ − ψ) − (φ − ψ′) = ψ′ − ψ is a null flow. Hence ψ′ can be obtained from ψ by passing flow around loops in the graph. See Fig. 2.

We can now state our main theorem.

Theorem 3.1.
Let φ be the initial capacities of an Ishikawa graph, and let Σ be a set of exit-flows. Suppose that ψ and ψ′ are two conservative flows compatible with Σ, meaning that (6) holds for both ψ and ψ′, and that ψ and ψ′ have identical source-flows. Then E_{φ−ψ} ≡ E_{φ−ψ′}.

The idea is then as follows. If a permissible conservative flow ψ is obtained during an augmenting path flow algorithm, but only the exit-flows Σ_{ij:λ} are retained for each (i,j) ∈ E⁺ and label λ, then one wishes, when required, to reconstruct the flow ψ on a given edge (i,j) ∈ E. Although the reconstructed flow ψ′ may not be identical to the flow ψ, the two will result in equivalent energy functions (not just equal up to a constant, but exactly equal for all assignments). In the augmenting path algorithm, the current flow values are only needed temporarily, one edge at a time, to find a new augmenting path, and hence do not need to be stored, as long as they can be rapidly computed.

Now we prove Theorem 3.1.

Proof.
First we will prove that ψ and ψ′ have identical column-flows. For a conservative flow,

ψ_{i:λ} − ψ_{i:λ−1} + Σ_{(i,j)∈E⁺} Σ_{ij:λ} = 0 ,   (7)

Fig. 3:
Given φ and Σ (left), flow reconstruction is formulated as a max-flow problem (right). Here the nodes with positive exit-flows are connected to the source (0) and those with negative exit-flows are connected to the terminal (1).

for all i ∈ V and λ ∈ {1, ..., ℓ−1}. Since ψ and ψ′ are compatible with Σ and have identical source-flows, ψ_{i:λ} = ψ′_{i:λ} for all i ∈ V and λ ∈ L. Hence they have identical column-flows.

Now we will prove the equivalence. Given a flow ψ, let us denote its restriction to the edges e_{ij:λµ} for all λ, µ ∈ {1, ..., ℓ−1} for some (i,j) ∈ E by ψ_{ij}, i.e., its restriction to the cross edges only. Since both ψ_{ij} and ψ′_{ij} satisfy Eq. 6, ψ′_{ij} − ψ_{ij} is a null flow. Furthermore, since both ψ and ψ′ have identical column-flows, ψ′ − ψ = (φ − ψ) − (φ − ψ′) is a null flow and, by Lemma 3.1, E_{φ−ψ} ≡ E_{φ−ψ′}.

Note that, from Eq. 7, it is clear that, given the source-flows ψ_{i:ℓ−1}; i ∈ V, the column-flows ψ_{i:λ}; i ∈ V, λ ∈ L can be computed in a top-down fashion. Now we turn to the problem of finding the flows along the cross-edges e_{ij:λµ}. Given the set of exit-flows Σ, the objective is to find a permissible flow ψ′ satisfying Eq. 6. Note that there exists a permissible conservative flow ψ compatible with Σ, and hence we find ψ′ such that ψ′ − ψ is a null flow. We do this by considering one edge (i,j) ∈ E at a time and reconstructing the flow by formulating a small max-flow problem.

Considering all the nodes U_{i:λ} and U_{j:µ} for a given pair (i,j), we join them with edges with initial capacities φ_{ij:λµ}. Nodes with positive exit-flow Σ_{ij:λ} are joined to the source with edges of capacities |Σ_{ij:λ}|. Similarly, those with negative exit-flow are joined to the terminal. See Fig. 3. Note that, in this network, the edges from the source can be thought of as "supply" and the edges to the terminal can be thought of as "demand".
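The small max-flow problem of Fig. 3 can be prototyped directly. The sketch below is a hypothetical minimal implementation (not the paper's) using plain Edmonds-Karp, simplified to the case where flow crosses only from column i to column j: `supply` holds the positive exit-flows on the U_{i:λ} side, `demand` the magnitudes of the negative exit-flows on the U_{j:µ} side, and the totals are assumed equal.

```python
from collections import deque

def reconstruct_residuals(phi_ij, supply, demand):
    """Recover non-negative residual cross capacities phi_ij - psi' for one
    pair (i, j) by solving the small max-flow problem of Fig. 3 with
    Edmonds-Karp (a simplified sketch, not the paper's implementation)."""
    n = len(phi_ij)                  # left nodes 0..n-1, right nodes n..2n-1
    S, T = 2 * n, 2 * n + 1          # source (0) and terminal (1) of Fig. 3
    V = 2 * n + 2
    cap = [[0] * V for _ in range(V)]
    for lam in range(n):
        cap[S][lam] = supply[lam]        # "supply" edges from the source
        cap[n + lam][T] = demand[lam]    # "demand" edges to the terminal
        for mu in range(n):
            cap[lam][n + mu] = phi_ij[lam][mu]

    def bfs_path():
        parent = [-1] * V
        parent[S] = S
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v in range(V):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    if v == T:
                        return parent
                    queue.append(v)
        return None

    while (parent := bfs_path()) is not None:
        bottleneck, v = float("inf"), T
        while v != S:                    # bottleneck of the augmenting path
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = T
        while v != S:                    # update the residual capacities
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]

    # Residual cross capacities phi_ij - psi'_ij, non-negative by construction.
    return [[cap[lam][n + mu] for mu in range(n)] for lam in range(n)]
```

The reverse arcs cap[n+µ][λ] then hold one reconstructed flow ψ′, which by Theorem 3.1 is equivalent to the true flow up to a null flow.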
Since the total supply equals the total demand in this network and there exists a permissible flow ψ_{ij} compatible with Σ (i.e., satisfying the supply-demand equality), the maximum flow solution of this network, ψ′_{ij}, is compatible with Σ, i.e., satisfies Eq. 6. In fact, we are interested in the non-negative residual capacities φ′_{ij} = φ_{ij} − ψ′_{ij}, which are readily available in this network. This problem can be solved using a greedy augmenting path algorithm. While this graph has O(ℓ) nodes and O(ℓ²) edges, this remains perfectly tractable, since we only consider one edge (i,j) at a time. Therefore, ultimately, flow reconstruction can be done efficiently.

At this point, given the initial capacities φ, the source-flows ψ_{i:ℓ−1}; i ∈ V, and the set of exit-flows Σ, we have shown how to reconstruct the non-negative residual edge capacities φ′. This requires O(|V| + |E|ℓ) values to be stored.

4 POLYNOMIAL TIME MEMORY EFFICIENT MAX FLOW
We now introduce our polynomial time memory efficient max flow algorithm, which minimizes multi-label submodular MRF energies with pairwise interactions. Our algorithm follows a similar procedure to the standard Edmonds-Karp algorithm [16], in that it iteratively finds the shortest augmenting path and then pushes the maximum flow through it without exceeding the edge capacities. However, instead of storing the residual graph, we store exit-flows as proposed in Section 3, which, at any stage of the algorithm, allow us to compute the residual graph. Below, we discuss how one can find an augmenting path and update the exit-flows, i.e., perform augmentation, without storing the full Ishikawa graph.
Our algorithm finds an augmenting path in a subgraph of the Ishikawa graph, called the lower-graph. In particular, the lower-graph contains only the subset of Ishikawa edges that satisfy the lowest-cross-edge property.
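With dense capacity matrices (an illustrative layout of our choosing), the lowest-cross-edge of each node can be read off in a single scan:

```python
# Sketch of the lowest-cross-edge selection: for each node U_{i:lam}, keep
# the edge e_{ij:lam,mu} with the smallest mu such that phi_ij[lam][mu] > 0
# (None if the node has no positive cross edge towards column j).
def lowest_cross_edges(phi_ij):
    return [next((mu for mu, c in enumerate(row) if c > 0), None)
            for row in phi_ij]

phi_ij = [[0, 3, 1],
          [0, 0, 2],
          [0, 0, 0]]
assert lowest_cross_edges(phi_ij) == [1, 2, None]
```

Only these O(ℓ) entries per directed variable pair are kept in the lower-graph, instead of the full O(ℓ²) capacity matrix.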
Definition 4.1.
Consider a directed edge (i,j) ∈ E⁺. For each node U_{i:λ}, the lowest-cross-edge is defined as the edge e_{ij:λµ} where µ is the smallest value such that φ_{ij:λµ} > 0.

More specifically, in addition to the vertical edges Ê_v, the lower-graph contains the lowest-cross-edges. Therefore, we only store O(ℓ) edges per variable pair (i,j). Now, the relationship between augmenting paths in the original Ishikawa graph and the lower-graph can be characterized by the following theorem.

Theorem 4.1.
Given the Ishikawa graph, there is an augmenting path in the lower-graph if and only if there exists an augmenting path in the Ishikawa graph.

Proof.
Since the lower-graph is a subgraph of the Ishikawa graph, if there is an augmenting path in the lower-graph, then there exists an augmenting path in the Ishikawa graph.

We will now prove the converse. Consider a directed edge (i,j) ∈ E⁺. Let e_{ij:λµ} and e_{ij:λµ′} be two positive capacity edges from U_{i:λ}, with e_{ij:λµ′} the lowest-cross-edge. Then, due to the upward infinite capacity edges from U_{j:µ′} to U_{j:µ}, there is a positive capacity path from U_{i:λ} to U_{j:µ} through the lowest-cross-edge e_{ij:λµ′}. This proves the theorem.

This enables us to find all the augmenting paths in the Ishikawa graph by searching in a smaller graph that has O(ℓ) edges per variable pair (i,j).

Note that, as mentioned earlier, we find the shortest augmenting path in this lower-graph. However, by contrast to the Edmonds-Karp algorithm [16], the path distance is computed considering zero distance for the infinite capacity edges and unit distance for the other edges, instead of unit distance for all the edges. The intuition for this modification is that the infinite capacity edges will never become saturated (or eliminated from the graph) during the entire course of the algorithm. Note that, with this definition of path distance, the augmenting paths in both the lower-graph and the Ishikawa graph have the same length. This will enable us to prove the polynomial time bound of our algorithm in a similar manner to the standard Edmonds-Karp algorithm. Note that, even in this case, the shortest augmenting path can be found using a Breadth First Search (BFS) scheme.

Algorithm 1
Memory Efficient Max Flow (MEMF) - Polynomial Time Version
Require: φ                                ▷ Initial Ishikawa capacities
  Σ ← 0                                   ▷ Initialize exit-flows
  φ̄ ← lower-graph(φ)                      ▷ Store the lowest-cross-edges
  repeat
    p ← shortest augmenting path(φ̄)       ▷ Sec. 4.1
    (φ̄, Σ) ← augment(p, φ̄)                ▷ Sec. 4.2
    for each edge e_{ij:λµ} that becomes saturated do
      φ_{ij} ← compute edges(φ, Σ, i, j)  ▷ Sec. 3.1
      φ̄_{ij} ← lower-graph(φ_{ij}, i, j)  ▷ Sec. 4.1
  until no augmenting paths possible
  return get labelling(φ̄)                 ▷ Find the cut using BFS
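The shortest augmenting path subroutine uses the modified path distance described above (zero for infinite-capacity edges, one otherwise). One standard way to realize such a search is a 0-1 BFS with a deque; the sketch below assumes that choice and an ad-hoc adjacency-list layout, neither of which is prescribed by the paper.

```python
from collections import deque

INF = float("inf")

def zero_one_bfs(adj, source):
    """Distances under MEMF's path length: an infinite-capacity edge costs 0,
    any other positive-capacity edge costs 1. adj[u] = list of (v, capacity).
    (A sketch; the paper only states that a BFS scheme suffices.)"""
    dist = {source: 0}
    dq = deque([source])
    while dq:
        u = dq.popleft()
        for v, c in adj[u]:
            if c <= 0:
                continue                 # saturated edges are unusable
            w = 0 if c == INF else 1
            if dist[u] + w < dist.get(v, INF):
                dist[v] = dist[u] + w
                if w == 0:
                    dq.appendleft(v)     # zero-cost edges go to the front
                else:
                    dq.append(v)
    return dist

adj = {0: [(1, INF), (2, 3)], 1: [(3, 2)], 2: [(3, INF)], 3: []}
assert zero_one_bfs(adj, 0) == {0: 0, 1: 0, 2: 1, 3: 1}
```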
Now, given an augmenting path p, we want to push the maximum permissible flow through it. The edges in the augmenting path p are updated in a similar manner as in the usual max-flow algorithm. In addition to that, for each cross edge e_{ij:λµ} ∈ Ê_c that is in the augmenting path, the exit-flows are updated as follows:

Σ_{ij:λ} = Σ_{ij:λ} + α ,   (8)
Σ_{ji:µ} = Σ_{ji:µ} − α ,

where α is the maximum possible flow along the path p.

After the flow augmentation, the lower-graph needs to be updated to maintain the lowest-cross-edge property. Note that the lowest-cross-edge property may be violated for the following reasons:
1) A new lowest-cross-edge e_{ij:λµ} is created due to a flow along the edge e_{ji:µλ}.
2) A new lowest-cross-edge e_{ij:λµ} is created due to a saturating flow along e_{ij:λµ′} for some µ′ < µ, i.e., the edge e_{ij:λµ′} disappears from the Ishikawa graph.

Note that, during an augmentation, if a new lowest-cross-edge is created due to a flow in the opposite direction (case 1 above), then the new lowest-cross-edge is known and the lower-graph can be updated directly, i.e., the new lowest-cross-edge can be stored.

On the other hand, if a cross edge becomes saturated (case 2), then we need to run the flow reconstruction algorithm to find the new lowest-cross-edge and update the lower-graph. This can be done in a memory efficient manner, since it only involves one edge (i,j) ∈ E at a time.

Our polynomial time memory efficient max flow is summarized in Algorithm 1. Let us briefly explain the subroutines below.

lower-graph: Given the initial Ishikawa edge capacities φ, this subroutine constructs the lower-graph (with edge capacities φ̄) by retaining the lowest-cross-edges from each node U_{i:λ} ∈ V̂, for each directed edge (i,j) ∈ E⁺ (see Sec. 4.1). If the input to this subroutine is the Ishikawa capacities φ_{ij} corresponding to the edge (i,j) ∈ E, then it retains the lowest-cross-edges φ̄_{ij}.
shortest augmenting path: Given the lower-graph parameters φ̄, this subroutine finds the shortest augmenting path p using BFS, as discussed in Section 4.1.

augment: Given the path p, this subroutine finds the maximum possible flow through the path and updates the lower-graph and the set of exit-flows, as discussed in Section 4.2. In addition, if a new lowest-cross-edge is created due to a flow in the opposite direction (case 1 in Sec. 4.2), then it also updates the lower-graph capacities φ̄.

compute edges: Given the initial Ishikawa edge capacities φ and the set of exit-flows Σ, this subroutine computes the non-negative residual Ishikawa capacities φ_{ij} corresponding to the given edge (i,j). This is accomplished by solving a small max-flow problem (see Sec. 3.1).

get labelling: This subroutine finds the partition of the lower-graph by running BFS.

As discussed above, the exit-flows Σ require O(ℓ) storage for each edge (i,j) ∈ E. In addition, the lower-graph can have at most O(|V|ℓ) nodes and O(|E|ℓ) edges. Furthermore, recall that we assume that the initial Ishikawa edge capacities φ can be stored efficiently. Therefore, ultimately, the space complexity of our algorithm is O((|V| + |E|)ℓ) = O(|E|ℓ). Let us now prove the polynomial time bound of our algorithm.

We follow the time complexity analysis of the standard Edmonds-Karp algorithm [16] to derive a polynomial time bound on our algorithm. In particular, the analysis first proves that the shortest path distance from the source (node 0) to any node is monotonically increasing with each flow augmentation. Then, it derives a bound on the number of augmentations. In fact, the number of augmentations of our MEMF algorithm has the same bound as the Edmonds-Karp algorithm.
Theorem 4.2.
If the MEMF algorithm is run on the Ishikawa graph Ĝ = (V̂ ∪ {0, 1}, Ê) with source 0 and terminal 1, then the total number of augmentations performed by the algorithm is O(|V̂||Ê|).

Proof. The proof follows the steps of the standard proof of the Edmonds-Karp algorithm. See Appendix A for details.

Let us analyze the time complexity of each subroutine below. Note that both the subroutines shortest augmenting path and augment run in O(|E|ℓ) time, since, in the worst case, both subroutines need to check each edge in the lower-graph. However, compute edges requires running the flow reconstruction algorithm, which takes O(ℓ³) time for each variable pair (i,j) (assuming a small max-flow problem with ℓ nodes and ℓ² edges, solved using the most efficient algorithm [17], see Sec. 3.1). Also, lower-graph requires O(ℓ²) time for each variable pair (i,j), since it needs to check each of the Ishikawa edges. Hence, the worst case running time of each iteration (i.e., augmentation step) is O(|E|ℓ + K(ℓ³ + ℓ²)) = O(|E|ℓ + |E|ℓ³) = O(|E|ℓ³), where K is the maximum number of flow-reconstructions (i.e., saturated cross edges) at an augmentation step. Since the number of augmentations is bounded by O(|V̂||Ê|), the worst case running time of the entire execution of the MEMF algorithm is O(|V|ℓ · |E|ℓ² · |E|ℓ³) = O(|V||E|²ℓ⁶). This is O(ℓ) slower than the standard Edmonds-Karp algorithm on the Ishikawa graph. Note, however, that MEMF requires O(ℓ) less memory.

Fig. 4:
To find an augmenting path in a memory efficient manner, we propose a simplified representation of the Ishikawa graph in terms of blocks corresponding to consecutive non-zero edges in each column i.

5 EFFICIENT ALGORITHM
In the previous section, we have provided a general purpose polynomial time max-flow algorithm that is also memory efficient. However, for computer vision applications, the BK algorithm [7] has been shown to be significantly faster than standard max-flow implementations, even though it lacks the polynomial time guarantee. The basic idea is to maintain a search tree throughout the algorithm instead of building the search tree from scratch at each iteration.

Motivated by this, we also propose doing search-tree-recycling similar to the BK algorithm. Since we lose the polynomial time guarantee, for increased efficiency, we further simplify the Ishikawa graph. In particular, we find an augmenting path in a block-graph, which amalgamates the nodes in each column into blocks. Since an augmenting path in our block-graph corresponds to a collection of augmenting paths in the Ishikawa graph, our algorithm converges in fewer iterations than the BK algorithm.
As mentioned above, we find an augmenting path in a block-graph, whose construction is detailed below.

Given the parameters φ, we rely on the fact that there exists a label λ such that φ_{i:λ} = 0 for each i ∈ V. In fact, it is easy to see that in each column i, if all φ_{i:λ} are positive, then there exists a trivial augmenting path from U_{i:ℓ} to U_{i:0}, and the minimum along the column can be subtracted from each φ_{i:λ}. Now, at each column i, we partition the nodes U_{i:λ} for all λ ∈ {1, ..., ℓ−1} into a set of blocks, such that the nodes in a block are connected by positive edges φ_{i:λ}. Let us denote these blocks by B_{i:γ}, where γ is indexed from bottom to top starting from 0. Note that there is no edge between B_{i:γ} and B_{i:γ±1}. As depicted by Fig. 4, our block-graph then contains only the blocks and the edges between the blocks.

The edges in the block-graph are obtained as follows. Let us consider a directed edge (i,j) ∈ E⁺. We add an edge B_{i:γ} → B_{j:δ}, where δ is the smallest value such that φ_{ij:λµ} > 0 for some U_{i:λ} ∈ B_{i:γ} and U_{j:µ} ∈ B_{j:δ}. While doing this,
4. We called this a simplified graph in [13].

we also enforce that there is no edge B_{i:γ′} → B_{j:δ′} such that γ′ > γ and δ′ < δ. The reasoning behind this is that, because of the upward infinite-capacity edges between the nodes U_{i:λ} and U_{i:λ+1}, we have the following:
1) If a node U_{j:µ} can be reached from U_{i:λ} through positive edges, then the nodes U_{j:µ′}, for all µ′ ≥ µ, can also be reached.
2) If a node U_{j:µ} can be reached from U_{i:λ} through positive edges, then it can also be reached from the nodes U_{i:λ′}, for all λ′ ≤ λ.

Hence, an edge B_{i:γ} → B_{j:δ} indicates the fact that there is some positive flow possible from any node U_{i:λ} ∈ B_{i:γ′}, for all γ′ ≤ γ, to any node U_{j:µ} ∈ B_{j:δ′}, for all δ′ ≥ δ. In other words, the set of edges obtained by this procedure is sufficient.

Now, the relationship between augmenting paths in the original Ishikawa graph and in our block-graph can be characterized by the following theorem.

Theorem 5.1.
Given the set of Ishikawa graph parameters φ, there is an augmenting path in the block-graph if and only if there exists an augmenting path in the Ishikawa graph.

Proof. The basic idea of this proof is the same as the proof of Theorem 4.1. See Appendix B.

Note that the block-graph can only be used to find an augmenting path; the quantity of the maximum permissible flow cannot be determined in this graph. Therefore, the capacity of an edge B_{i:γ} → B_{j:δ} is not important, but it is important to have these edges. Note also that the block-graph is constructed incrementally for each edge (i,j) ∈ E. Hence, it only requires us to store the Ishikawa graph parameters φ_{ij} corresponding to the edge (i,j). Furthermore, since the block-graph G_b is sparse, an augmenting path can be found fairly quickly.

Furthermore, similar to the BK algorithm, we find an augmenting path P_b using BFS and maintain the search tree throughout the algorithm, by repairing it whenever the block-graph is updated. However, since the block-graph needs to be reconstructed after each augmentation, for simplicity, we maintain a single tree. More specifically, we grow the search tree from the source (node 0), in a breadth-first manner, and if the sink (node 1) is reached, then the augmenting path P_b is found.

Now, given an augmenting path P_b in the block-graph, we want to push the maximum permissible flow through it. More specifically, since P_b corresponds to a set of augmenting paths {p_b} in the Ishikawa graph, we will push the maximum flow through each path p_b, until no such path exists. This could be achieved by constructing the subgraph Ĝ_p of the Ishikawa graph corresponding to the augmenting path P_b, and then finding each of the augmenting paths p_b by searching in Ĝ_p. This would require us to either store Ĝ_p (not memory efficient) or call the flow reconstruction algorithm too many times.
5. The BK algorithm maintains two trees, a source-tree and a sink-tree, but we only maintain the source-tree.
Fig. 5:
An example flow-loop ˜m(1, ·, α_ij) in the block-graph (left) is equivalent to the summation of two flow-loops m(3, ·, α₁) and m(4, ·, α₂) in the Ishikawa graph (right), with α_ij = α₁ + α₂.

Instead, we propose breaking down the augmentation operation in the block-graph into a sequence of flow-loops and a subtraction along a column. The maximum flow through the path can then be pushed in a greedy manner, by pushing the maximum flow through each flow-loop. Before describing this procedure in detail, we introduce the following definitions.
Definition 5.1.
A flow-loop m(λ, µ, α) in the Ishikawa graph is defined as the following sequence of operations: first, a value α is pushed down the left column from U_{i:ℓ} to U_{i:λ}, then across from U_{i:λ} to U_{j:µ}, and finally up the right column from U_{j:µ} to U_{j:ℓ}. Thus, applying the flow-loop m(λ, µ, α) corresponds to replacing φ by φ + ∆, where

∆_{i:λ′} = −α ∀λ′ ≥ λ,   ∆_{ij:λµ} = −α,
∆_{ji:µλ} = α,   ∆_{j:µ′} = α ∀µ′ ≥ µ.

Definition 5.2.
A flow-loop ˜m(γ, δ, α) in the block-graph G_b is defined by the following sequence of operations: first, a value α is pushed down the left column from U_{i:ℓ} to B_{i:γ}, then across from B_{i:γ} to B_{j:δ}, and finally up the right column from B_{j:δ} to U_{j:ℓ}.

Note that, for a flow-loop ˜m(γ, δ, α) to be permissible, the block B_{i:γ} must contain the node U_{i:ℓ−1}. Note also that the flow-loop ˜m(γ, δ, α) can be thought of as a summation of flow-loops m(λ, µ, α′), where U_{i:λ} ∈ B_{i:γ} and U_{j:µ} ∈ B_{j:δ′}, for all δ′ ≥ δ (see Fig. 5).

Given these definitions, one can easily see that the augmentation operation along the path P_b can be broken down into a sequence of flow-loops ˜m(γ, δ, α) and a subtraction along the last column k, as illustrated in Fig. 6. Now, we push the maximum permissible flow through P_b using the following greedy approach.

For each edge B_{i:γ} → B_{j:δ} that is part of the path P_b, we apply a flow-loop ˜m(γ, δ, α_ij), where α_ij is the maximum permissible flow through the edge B_{i:γ} → B_{j:δ}. In fact, applying this flow-loop translates to reconstructing the Ishikawa edge capacities φ_ij corresponding to edge (i, j) and then applying flow-loops m(λ, µ, α′) for all λ ≥ ˇλ and

Fig. 6: An augmentation operation is broken down into a sequence of flow-loops ˜m(γ, δ, α), and a subtraction along the column k. The augmenting path P_b is highlighted in red.

Algorithm 2
Memory Efficient Max Flow (MEMF) - Efficient Version
Require: φ ▷ Initial Ishikawa capacities
Σ ← 0, T ← ∅ ▷ Initialize exit-flows and search tree
G_b ← block-graph(φ) ▷ Initial block-graph
repeat
  (T, P_b) ← augmenting-path(G_b, T) ▷ Sec. 5.1
  Σ ← augment(P_b, φ, Σ) ▷ Sec. 5.2
  for each edge (i, j) ∈ E affected by the augmentation do
    φ_ij ← compute-edges(φ, Σ, i, j) ▷ Sec. 3.1
    G_b^{ij} ← block-graph(φ_ij, i, j) ▷ Sec. 5.1
  T ← repair-tree(T, G_b) ▷ Repair search tree
until no augmenting paths possible
return get-labelling(T) ▷ Read from search tree

µ ≥ ˇµ, starting from ˇλ and ˇµ, until no permissible flow-loop m(λ, µ, α′) exists, with ˇλ and ˇµ the smallest values such that U_{i:λ} ∈ B_{i:γ} and U_{j:µ} ∈ B_{j:δ}. Finally, in the last column k, all the values φ_{k:λ} are positive, and the minimum along column k is subtracted from each φ_{k:λ}. It is easy to see that this approach pushes the maximum permissible flow through the path P_b.

Since, for each edge (i, j), we do not store all the ℓ² capacities, but only the 2ℓ exit-flows Σ, augmentation must then also update these values. Fortunately, there is a direct relation between the flow-loops and Σ. To see this, let us consider the example flow-loop ˜m(1, ·, α_ij) shown in Fig. 5. Applying this flow-loop updates the corresponding exit-flows as

Σ_{ij:3} = Σ_{ij:3} + α₁,   Σ_{ji:1} = Σ_{ji:1} − α₁,   (9)
Σ_{ij:4} = Σ_{ij:4} + α₂,   Σ_{ji:4} = Σ_{ji:4} − α₂.

Similar updates can be done for all flow-loops in our procedure. Note that the edge B_{i:γ} → B_{j:δ} represents a collection of possible paths from all the nodes U_{i:λ} ∈ B_{i:γ} to all the nodes U_{j:µ} ∈ B_{j:δ′}, for all δ′ ≥ δ. Therefore, unlike in the full Ishikawa graph, after applying a flow-loop, the portion of the graph G_b^{ij} corresponding to edge (i, j) ∈ E needs to be reconstructed. This, however, can be done in a memory-efficient manner, since it only involves one edge (i, j) at a time.
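For illustration, applying a flow-loop m(λ, µ, α) of Definition 5.1 and recording it in the exit-flows can be sketched as below. The data layout (lists for the column capacities, dictionaries keyed by label pairs for the cross edges) and the function name are assumptions made for this sketch, not the paper's implementation.

```python
def apply_flow_loop(phi_i, phi_j, phi_ij, phi_ji, sigma_ij, sigma_ji,
                    lam, mu, alpha):
    """Apply the flow-loop m(lam, mu, alpha) of Definition 5.1 to the
    Ishikawa capacities and record it in the exit-flows.
    phi_i[k] holds phi_{i:k}; phi_ij[(a, b)] holds phi_{ij:ab}."""
    ell = len(phi_i)
    # Push alpha down the left column: phi_{i:lam'} -= alpha for lam' >= lam.
    for k in range(lam, ell):
        phi_i[k] -= alpha
    # Across the cross edge, and back along its reverse edge.
    phi_ij[(lam, mu)] -= alpha
    phi_ji[(mu, lam)] += alpha
    # Up the right column: phi_{j:mu'} += alpha for mu' >= mu.
    for k in range(mu, ell):
        phi_j[k] += alpha
    # Record the flow in the exit-flows, following the pattern of Eq. (9).
    sigma_ij[lam] += alpha
    sigma_ji[mu] -= alpha
```

Each flow-loop thus touches only one pair of columns, which is what allows the per-edge capacities to be discarded and later reconstructed from Σ.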
Our memory efficient max-flow (MEMF) method is summarized in Algorithm 2. Let us briefly explain the subroutines below.

block-graph: Given the initial Ishikawa parameters φ, this subroutine constructs the block-graph by amalgamating nodes into blocks, as described in Section 5.1. If the input to the subroutine is the Ishikawa capacities φ_ij corresponding to the edge (i, j) ∈ E, then it constructs the block-graph portion G_b^{ij}.

augmenting-path: Given the block-graph G_b and the search tree T, this subroutine finds an augmenting path P_b by growing the search tree, as discussed in Section 5.1.

augment: Given the path P_b, this subroutine pushes the maximum permissible flow through it by applying flow-loops ˜m(γ, δ, α) and then subtracting the minimum from the last column, as discussed in Section 5.2.

compute-edges: This is the same subroutine as in Algorithm 1 (see Sec. 4.3).

repair-tree: This subroutine is similar to the adoption stage of the BK algorithm. Given the reconstructed block-graph, the search tree T is repaired by checking for a valid parent for each orphan node. See Section 3.2.3 in [7] for more detail.

get-labelling: This subroutine directly reads the optimal labelling from the search tree T.

As discussed in Section 4.3, the space complexity of our algorithm is O(|E|ℓ). For the rest of the paper, this efficient version of the algorithm is referred to as MEMF.

6 EQUIVALENCE WITH MESSAGE PASSING
In this section, we give a more insightful interpretation of our max-flow algorithm, by showing its equivalence with the min-sum message passing algorithm. In particular, we first characterize the notion of an augmenting path in the message passing context. Then, we show that the max-flow algorithm is, in spirit, equivalent to min-sum message passing for multi-label submodular MRFs. Finally, we observe the relationship between the set of exit-flows in our algorithm and the set of messages in the message passing algorithm. To this end, let us first define the multi-label graph, which will be used to explain the equivalence.
An alternative way of representing the multi-label energy function (1) is by defining indicator variables x_{i:λ} ∈ {0, 1}, where x_{i:λ} = 1 if and only if x_i = λ. For a given i, exactly one of the x_{i:λ}, λ ∈ L, can have value 1. In terms of the indicator variables, the energy function (1) may be written as

E_θ(x) = Σ_{i∈V} Σ_{λ∈L} θ_{i:λ} x_{i:λ} + Σ_{(i,j)∈E} Σ_{λ,µ∈L} θ_{ij:λµ} x_{i:λ} x_{j:µ},   (10)

where the values θ are a particular set of parameters determining the energy function. One may define a graph, called a multi-label graph, with nodes denoted by X_{i:λ}, i ∈ V, λ ∈ L, as shown in Fig. 7. This graph represents the energy function. Given a labelling x, the value of the energy function is obtained by summing the weights on all nodes

Fig. 7: Example of a multi-label graph. Here, the nodes represent the unary potentials θ_{i:λ} and the edges represent the pairwise potentials θ_{ij:λµ}.

with x_{i:λ} = 1 (in other words, x_i = λ), plus the weights θ_{ij:λµ} such that x_{i:λ} = 1 and x_{j:µ} = 1.

Furthermore, given the Ishikawa edge capacities φ, the parameters θ can be calculated using Eq. 4. Note that, in this case, E_θ(x) = E_φ(x) for all labellings x.

The energy function (10) can be written in different ways as a sum of unary and pairwise terms. In particular, there may exist a different set of parameters θ′ such that E_θ(x) = E_θ′(x) for all x, denoted as E_θ ≡ E_θ′. The conditions under which E_θ ≡ E_θ′ are well known [10], [15].

Lemma 6.1.
Two energy functions E_θ and E_θ′ are equivalent if and only if there exist values m_{ij:λ} and m_{ji:µ} for (i, j) ∈ E and λ, µ ∈ L such that

θ′_{ij:λµ} = θ_{ij:λµ} − m_{ij:λ} − m_{ji:µ},
θ′_{i:λ} = θ_{i:λ} + Σ_{(k,i)∈E} m_{ik:λ}.

The values m_{ij:λ} constitute a message m_ij passed from the edge (i, j) to the node i; it may be thought of as a message vector (indexed by λ). A message m_ij causes the values m_{ij:λ} to be swept out of all the edges θ_{ij:λµ} and added onto the nodes θ_{i:λ}. Messages are passed in both directions from an edge (i, j).

Note that reparametrization provides an alternative way of implementing the min-sum message passing algorithm. In particular, the objective of min-sum message passing is to reparametrize the energy function so that the constant term θ_c is maximized, while keeping the parameters θ non-negative, where the constant term is defined as follows:

θ_c = Σ_{i∈V} min_{λ∈L} θ_{i:λ}.   (11)

For more details, see [10], [18].

We will now give an equivalent notion of an augmenting path in the context of the multi-label graph. To this end, we will first understand the motivation behind finding an augmenting path in the Ishikawa graph.

Let us consider an augmenting path in the Ishikawa graph. If it is a trivial augmenting path, i.e., an augmenting path along a column from node U_{i:ℓ} to U_{i:0}, then pushing the maximum flow along the path translates into subtracting the minimum value from each φ_{i:λ}, λ ∈ L. In the multi-label graph, this is trivially equivalent to subtracting the minimum from each θ_{i:λ}, λ ∈ L.

Let us now consider a more interesting augmenting path in the Ishikawa graph, which contains at least one cross edge e_{ij:λµ} ∈ Ê_c. Similar to the discussion in Section 5.2, one can easily see that the augmentation operation can be broken down into a sequence of flow-loops m(λ, µ, α) (see Def. 5.1) and a subtraction along a column.
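Before moving on, the reparametrization of Lemma 6.1 can be made concrete with a minimal two-variable sketch: sweeping message vectors out of the pairwise terms and onto the unaries leaves the energy of every labelling unchanged. All names here are illustrative, not from the paper.

```python
def reparametrize(theta_i, theta_j, theta_ij, m_ij, m_ji):
    """Reparametrize one edge (i, j) with message vectors m_ij, m_ji
    (Lemma 6.1): m_ij[l] + m_ji[u] is swept out of every pairwise term
    theta_ij[l][u] and the messages are added onto the unary terms."""
    num_labels = len(theta_i)
    theta_ij2 = [[theta_ij[l][u] - m_ij[l] - m_ji[u] for u in range(num_labels)]
                 for l in range(num_labels)]
    theta_i2 = [theta_i[l] + m_ij[l] for l in range(num_labels)]
    theta_j2 = [theta_j[u] + m_ji[u] for u in range(num_labels)]
    return theta_i2, theta_j2, theta_ij2

def energy(theta_i, theta_j, theta_ij, xi, xj):
    """Energy of a two-variable instance of Eq. (10)."""
    return theta_i[xi] + theta_j[xj] + theta_ij[xi][xj]
```

For any labelling (xi, xj), the terms m_ij[xi] and m_ji[xj] cancel between the unary and pairwise parts, so E_θ ≡ E_θ′ as the lemma states.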
This intuitively suggests that an augmenting path in the Ishikawa graph can be translated into a trivial augmenting path by passing flow around loops, i.e., they differ by a null flow. Therefore, the motivation for finding an augmenting path is to pass flow around loops so as to obtain a trivial augmenting path.

Note that the notion of a trivial augmenting path in the multi-label graph is

θ_{i:λ} > 0 ∀λ ∈ L,   (12)

for some i ∈ V. Furthermore, by Lemma 3.1 and Lemma 6.1, a flow-loop (or null flow) corresponds to a reparametrization of the multi-label energy function. Hence, the notion of an augmenting path in the multi-label graph can be characterized as finding a set of reparametrizations that makes θ_{i:λ} positive for all λ ∈ L for some i ∈ V.

Note that, as we have discussed above, an augmenting path in the Ishikawa graph can be translated into a trivial augmenting path by passing flow around loops. Furthermore, pushing the maximum flow through a trivial augmenting path is simply a subtraction of the minimum value min_{λ∈L} φ_{i:λ} from each φ_{i:λ}. In fact, one can accumulate the total flow passed from source to sink, which is exactly the constant term defined in Eq. 11. Hence, max-flow tries to maximize θ_c by passing flow around loops (i.e., reparametrizing), while keeping the edge capacities φ non-negative. This is, in spirit, equivalent to the min-sum message passing algorithm. Note that the optimality of min-sum message passing for the case of multi-label submodular MRFs is observed in [15], [19].

As mentioned above, from Lemma 3.1 and Lemma 6.1, it is clear that a flow-loop corresponds to a reparametrization of the multi-label energy function. In this section, we will find the equivalent reparametrization of a flow-loop m(λ, µ, α). This will later allow us to understand the relationship between the set of exit-flows and the set of messages. Let us now state and prove our theorem.

Fig. 8: A flow-loop m(2, ·, α) in the Ishikawa graph (left) and its equivalent reparametrization in the multi-label graph (right). Note that the exit-flow vectors (Σ_ij, Σ_ji) and the corresponding message vectors (m_ij, m_ji) are shown next to the nodes.
Applying a flow-loop m(λ, µ, α) in the Ishikawa graph is equivalent to a reparametrization in the multi-label graph, with messages

m_{ij:λ′} = −α ∀λ′ ≥ λ,   (13)
m_{ji:µ′} = α ∀µ′ ≥ µ.

Proof.
Let the Ishikawa parameters be φ and the multi-label graph parameters be θ, and assume that the flow-loop is applied between columns i and j. Also, let the parameters after the flow be φ′ and θ′, respectively. Since θ can be calculated from φ using Eq. 4, E_θ ≡ E_φ. Similarly, E_φ′ ≡ E_θ′. Also, from Lemma 3.1, E_φ ≡ E_φ′. Hence, E_θ ≡ E_φ ≡ E_φ′ ≡ E_θ′. Now, from Definition 5.1,

φ′_{i:λ′} = φ_{i:λ′} − α ∀λ′ ≥ λ,   (14)
φ′_{j:µ′} = φ_{j:µ′} + α ∀µ′ ≥ µ,
φ′_{ij:λµ} = φ_{ij:λµ} − α,
φ′_{ji:µλ} = φ_{ji:µλ} + α.

Substituting in Eq. 4,

θ′_{i:λ′} = θ_{i:λ′} − α ∀λ′ ≥ λ,   (15)
θ′_{j:µ′} = θ_{j:µ′} + α ∀µ′ ≥ µ,
θ′_{ij:λ′µ′} = θ_{ij:λ′µ′} − α ∀λ′ < λ, µ′ ≥ µ,
θ′_{ij:λ′µ′} = θ_{ij:λ′µ′} + α ∀λ′ ≥ λ, µ′ < µ.

Now, since E_θ ≡ E_θ′, by Lemma 6.1, there exist messages m_{ij:λ} and m_{ji:µ} such that

θ′_{ij:λµ} = θ_{ij:λµ} − m_{ij:λ} − m_{ji:µ},   (16)
θ′_{i:λ} = θ_{i:λ} + Σ_{(k,i)∈E} m_{ik:λ}.

With a little bit of calculation, one can see that the messages take the following form:

m_{ij:λ′} = −α ∀λ′ ≥ λ,   (17)
m_{ji:µ′} = α ∀µ′ ≥ µ.

Note that, for a permissible flow-loop m(λ, µ, α), the parameters φ′ and θ′ are non-negative.

This equivalence is shown in Fig. 8 for an example flow-loop m(2, ·, α). Note that, as shown in the figure, the flow α through an edge φ_{ij:λµ} may be recorded in the set of exit-flows Σ. Furthermore, as shown in the figure, the relationship between the set of exit-flows and the set of messages can be written as

Σ_{ij:λ} = m_{ij:λ−1} − m_{ij:λ} ∀λ ∈ {0, ···, ℓ−1},   (18)
Σ_{ji:µ} = m_{ji:µ−1} − m_{ji:µ} ∀µ ∈ {0, ···, ℓ−1}.

7 RELATED WORK
The approaches that have been proposed to minimize multi-label submodular MRFs can be roughly grouped into two categories: those based on max-flow and those based on an LP relaxation of the problem. Below, we briefly review representative techniques in each category.
The most popular method to minimize a multi-label submodular MRF energy is to construct the Ishikawa graph [1] and then apply a max-flow algorithm to find the min-cut solution. Broadly speaking, there are three different kinds of max-flow algorithms: those relying on finding augmenting paths [3], the push-relabel approach [20] and the pseudo-flow techniques [21]. Even though numerous implementations are available, the BK method [7] is arguably the fastest implementation for 2D and sparse 3D graphs. Recently, for dense problems, the IBFS algorithm [22] was shown to outperform the BK method in a number of experiments [23]. All the above-mentioned algorithms, however, require the same order of storage as the Ishikawa graph and hence scale poorly. Two approaches have nonetheless been studied to scale the max-flow algorithms. The first one explicitly relies on the N-D grid structure of the problem at hand [24], [25]. The second one makes use of distributed computing [9], [26], [27]. Unfortunately, both these approaches require additional resources (disk space or clusters) to run max-flow on an Ishikawa graph. By contrast, our algorithm lets us efficiently minimize the energy of much larger Ishikawa-type graphs on a standard computer. Furthermore, using the method of [9], it can also be parallelized.
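For reference, the augmenting-path family mentioned above can be sketched by the classical Edmonds-Karp scheme (shortest augmenting paths found by BFS). This is a generic textbook sketch over a dict-of-dicts residual graph, not the BK, IBFS or MEMF algorithm.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly find a shortest augmenting path by BFS
    and saturate it. `cap` maps u -> {v: residual capacity}; returns the
    total flow pushed from s to t."""
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total            # no augmenting path left: flow is maximum
        # Bottleneck (minimum residual capacity) along the path.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v])
            v = u
        # Augment: update forward and reverse residual capacities.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap.setdefault(v, {})[u] = cap.get(v, {}).get(u, 0) + bottleneck
            v = u
        total += bottleneck
```

The memory bottleneck discussed in this paper is visible here: `cap` stores every residual edge explicitly, which for an Ishikawa graph means O(ℓ²) entries per variable pair.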
One memory-efficient way to minimize a multi-label submodular MRF energy consists of formulating the problem as a linear program and then maximizing the dual using message-passing techniques [18]. Many such algorithms have been studied [10], [11], [12], [15]. Even though these algorithms are good at approximating the optimal solution (and are also theoretically optimal for multi-label submodular MRFs [19]), as evidenced by the comparison of [28] and by our experiments, they usually take much longer to converge to the optimal solution than max-flow-based techniques.
8 EXPERIMENTS
We evaluated our algorithm on the problems of stereo correspondence estimation and image inpainting. For stereo correspondence estimation, we employed six instances from the Middlebury dataset [29], [30]: Tsukuba, Venus, Sawtooth, Map, Cones and Teddy, and one instance from the KITTI dataset [31] (see Fig. 9). For Tsukuba and Venus, we used the unary potentials of [32], and for all other stereo

Fig. 9:
Left and right images of the stereo instance from the KITTI dataset, with the number of labels set to 40. This image pair was chosen arbitrarily as a representative of the dataset.
Fig. 10:
Lengths of augmenting paths found by our algorithm for the Tsukuba stereo instance (see Sec. 8.1). Each bar indicates the proportion of paths of a certain length (median = 5, mean = 14, maximum = 218). The red arrow indicates the median length.

cases, those of [33]. For inpainting, we used the Penguin and House images employed in [32], and we used the same unary potentials as in [32]. In all the above cases, we used pairwise potentials that can be expressed as

θ_ij(x_i, x_j) = w_ij θ(|x_i − x_j|),   (19)

where, unless stated otherwise, the regularizer θ(|x_i − x_j|) is the quadratic function. Furthermore, we employed a 4-connected neighbourhood structure in all our experiments.

We compare our results with two max-flow implementations: the BK algorithm [7] and Excesses Incremental Breadth First Search (EIBFS) [34] (which we ran on the Ishikawa graph), and three LP relaxation-based algorithms: Tree Reweighted Message Passing (TRWS) [10], Subgradient-based Dual Decomposition (DDSG) [11] and the Adaptive Diminishing Smoothing algorithm (ADSal) [12]. For DDSG and ADSal, we used the OpenGM [35] implementations. For the other algorithms, we employed the respective authors' implementations.

In practice, we only ran the BK algorithm and EIBFS if the graph could be stored in RAM. Otherwise, we provide an estimate of their memory requirement. For the LP relaxation-based methods, unless they converged, we ran the algorithms either for 10000 iterations or for 50000 seconds, whichever occurred first. Note that the running times reported for our algorithm include graph construction. All our experiments were conducted on a 3.4 GHz i7-4770 CPU with 16 GB RAM.

The memory consumption and running times of the algorithms are provided in Table 1. Altogether, our algorithm lets us solve much larger problems than the BK algorithm and EIBFS, and is an order of magnitude faster than state-of-the-art message-passing algorithms.
TABLE 1: Memory consumption and runtime comparison with state-of-the-art baselines for the quadratic regularizer (see para. 2 of Sec. 8 for details on the algorithms). The columns report memory [MB] and time [s] for BK, EIBFS, DDSG, ADSal, TRWS and MEMF on the stereo and inpainting instances. A "*" indicates a memory estimate, and ">" indicates that the algorithm did not converge to the optimum within the specified time. Note that our algorithm has a memory consumption O(ℓ) times lower than the max-flow-based methods and is an order of magnitude faster than message-passing algorithms. Compared to EIBFS, our algorithm is only 4 – 7 times slower, but requires 12 – 23 times less memory, which makes it applicable to more realistic problems. In all stereo problems, TRWS cached the pairwise potentials in an array for faster retrieval, but in the case of inpainting, this was not possible due to the excessive memory requirement.

In this section, we empirically analyze various properties of our algorithm. First, note that, at each iteration, i.e., at each augmentation step, our algorithm performs more computation than standard max-flow. Therefore, we would like our algorithm to find short augmenting paths and to converge in fewer iterations than standard max-flow. Below, we analyze these two properties empirically.

In Fig. 10, we show the distribution of the lengths of the augmenting paths found by our algorithm for the Tsukuba stereo instance. Note that the median length is only 5. As a matter of fact, the maximum length observed over all our experiments was 1073 for the KITTI data. Nevertheless, even in that image, the median length was only 15. Note that, since our algorithm finds augmenting paths in the block-graph, the path lengths are not directly comparable to those found by other max-flow-based methods. In terms of the number of augmentations, we found that our algorithm only required between 35% and 50% of the total number of augmentations of the BK algorithm.

Next, we fixed the number of labels but varied the image size and compared the running times of the max-flow algorithms, for the Tsukuba and Penguin instances, in Fig. 11a, 11b. Similarly, we fixed the image size but varied the number of labels and report the running times in Fig. 11c, 11d.
By doing this, we try to estimate the empirical time complexity of our algorithm. Note that, similar to the other max-flow algorithms, MEMF exhibited near-linear performance with respect to the image size and near-cubic performance with respect to the number of labels in these experiments.

Finally, we report the percentage of time taken by each subroutine of our algorithm, for the Tsukuba and Penguin instances, in Fig. 12. Note that the individual time complexities of the subroutines compute-edges and block-graph are O(ℓ²) each. Therefore, they become dominant when the number of labels is large, and hence the corresponding percentages of time are high, particularly for Penguin.

Since our algorithm can simply replace standard max-flow in Ishikawa-type graphs, we replaced the BK method with our MEMF procedure in the IRGC algorithm [6], which minimizes MRFs with some non-convex pairwise potentials (or regularizers) by iteratively building and solving an Ishikawa graph. This lets us tackle much larger non-submodular problems. In particular, we computed inpainting results on Penguin by using all labels, as opposed to the down-sampled label sets used in [6]. The results of the IRGC+expansion algorithm, with the BK method and with MEMF, are summarized in Table 2.

TABLE 2: Memory consumption and runtime comparison of IRGC+expansion with either the BK method or our MEMF algorithm as subroutine (see Sec. 8.2). Here, "Penguin-128/10" corresponds to the Penguin problem with 128 labels and the truncated quadratic function with truncation value 10 as regularizer. A "*" indicates a memory estimate. Compared to the BK method, MEMF is only 4 – 11 times slower but requires 13 – 18 times less memory, which makes it applicable to much larger MRFs.
Since robust regularizers are highly effective in computer vision, we tested our algorithm by choosing the Huber loss function [36] as the regularizer,

θ(|x_i − x_j|) = |x_i − x_j|² if |x_i − x_j| ≤ δ,
θ(|x_i − x_j|) = δ(2|x_i − x_j| − δ) otherwise,   (20)

where δ is the Huber value. The results are summarized in Table 3. In this experiment, the Huber value was set to 4 for Tsukuba, Venus and Sawtooth, 6 for Map, 20 for Cones and Teddy, 10 for KITTI, and 25 for Penguin and House. Note that the Ishikawa graph for a Huber regularizer is significantly smaller, i.e., the number of edges per variable pair is O(δℓ) instead of O(ℓ²). Even in this case, our algorithm lets us solve much larger problems than the BK algorithm and EIBFS, and is an order of magnitude faster than state-of-the-art message-passing algorithms.

We parallelized our algorithm based on the dual-decomposition technique of [9] and evaluated it on the same six stereo instances from the Middlebury dataset [29], [30]. The relative times t_m/t_s, where t_m stands for the multi-thread time and t_s for the single-thread one, are shown
Fig. 11: Running time plots (in logarithmic scale) obtained by changing the image size (a, b) and by changing the number of labels (c, d), for Tsukuba and Penguin (see Sec. 8.1). The dashed lines provide the reference slopes. Note that all algorithms exhibited near-linear performance with respect to the number of pixels and near-cubic performance with respect to the number of labels, but MEMF required O(ℓ) times less memory. The plots of the BK algorithm and EIBFS are not complete, since we could not run them due to their excessive memory requirement.
TABLE 3: Memory consumption and runtime comparison with state-of-the-art baselines for the Huber regularizer (see Sec. 8.3). The columns report memory [MB] and time [s] for BK, EIBFS, TRWS and MEMF on the stereo and inpainting instances. A "*" indicates a memory estimate, and ">" indicates that the algorithm did not converge to the optimum within the specified time. Note that our algorithm has a much lower memory consumption than the max-flow-based methods and is an order of magnitude faster than message-passing algorithms. Compared to EIBFS, our algorithm is 7 – 11 times slower, but requires 7 – 10 times less memory, which makes it applicable to more realistic problems. In all stereo problems, TRWS cached the pairwise potentials in an array for faster retrieval, but in the case of inpainting, this was not possible due to the excessive memory requirement.
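The Huber-type regularizer of Eq. (20) can be sketched as follows. The exact form of the linear branch is partly reconstructed here (the standard continuous quadratic-then-linear Huber form is assumed), so treat it as an assumption rather than the paper's precise definition.

```python
def huber_regularizer(d, delta):
    """Huber-type regularizer: quadratic for small label differences
    d = |x_i - x_j|, linear beyond the Huber value delta (assumed form)."""
    d = abs(d)
    return d * d if d <= delta else delta * (2 * d - delta)
```

Both branches agree at d = δ (value δ²), so the regularizer is continuous; the linear growth beyond δ is what keeps the corresponding Ishikawa graph sparse, with O(δℓ) edges per variable pair.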
Fig. 12:
Percentage of time taken by each subroutine (augmenting-path, augment, compute-edges, block-graph and repair-tree), for Tsukuba (16 labels) and Penguin (256 labels) (see Sec. 8.1). Note that, in Penguin, due to the large number of labels, the percentages of time spent on compute-edges and block-graph are high.

in Fig. 13 for two and four threads. In this experiment, for all problems, the image grid was split vertically into two and four equally-sized blocks, respectively. Note that this splitting strategy is fairly arbitrary, and may affect the performance of the multi-threaded algorithm. In fact, finding better splits may itself be a possible future direction.
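One simple way to produce the equally-sized vertical blocks used in the multi-threaded experiments can be sketched as below. The paper does not specify how a remainder of columns is handled, so assigning it to the last block is an assumption of this sketch.

```python
def vertical_splits(width, num_blocks):
    """Split the image columns [0, width) into num_blocks vertical blocks
    of (nearly) equal size, returned as (start, end) column ranges."""
    size = width // num_blocks
    blocks = [(b * size, (b + 1) * size) for b in range(num_blocks)]
    # Any remaining columns are absorbed by the last block (assumption).
    blocks[-1] = (blocks[-1][0], width)
    return blocks
```

In a dual-decomposition setting, each block would then be solved in its own thread, with the variables on block boundaries coordinated through dual variables as in [9].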
9 CONCLUSION
We have introduced a variant of the max-flow algorithm that can minimize multi-label submodular MRF energies optimally, while requiring much less storage. Furthermore, our experiments have shown that our algorithm is an order of magnitude faster than state-of-the-art message-passing methods. We therefore believe that our algorithm constitutes the method of choice to minimize Ishikawa-type graphs when the complete graph cannot be stored in memory.

Fig. 13:
Our algorithm can be accelerated using the parallel max-flow technique (see Sec. 8.4). The relative times ranged from 0.56 to 0.99 with 2 threads and from 0.39 to 0.83 with 4 threads. In Teddy, in the case of 2 threads, the multi-threaded algorithm performs almost the same as the single-threaded algorithm, which may be due to bad splits.

APPENDIX A: TIME COMPLEXITY ANALYSIS OF THE POLYNOMIAL-TIME VERSION OF MEMF
Let us denote the Ishikawa graph by Ĝ = (V̂ ∪ {0, 1}, Ê) and the lower-graph by Ĝ^s = (V̂ ∪ {0, 1}, Ê^s). In this section, we denote the nodes by u, v, etc. The notation u₁ ≤ u₂ means that the nodes u₁ and u₂ are in the same column, with u₁ below u₂. Let Ĝ^s_f denote the residual graph of the lower-graph after the flow f, and similarly let Ê^s_f denote the set of non-zero residual edges. Let d_f(u, v) denote the shortest-path distance from u to v calculated by MEMF.
6. The superscript s is used to restate the fact that the lower-graph is a subgraph of the Ishikawa graph.
If the MEMF algorithm is run on the Ishikawa graph Ĝ = (V̂ ∪ {0, 1}, Ê) with source 0 and terminal 1, then for any node v ∈ V̂, the shortest-path distance d_f(0, v) in the residual lower-graph Ĝ^s_f increases monotonically with each flow augmentation.

Proof. We will suppose that for some node v ∈ V̂, there is a flow augmentation that causes the shortest-path distance from 0 to v to decrease, and then we will derive a contradiction. Let f be the flow just before the first augmentation that decreases some shortest-path distance, and let f′ be the flow just afterward. Let v be the node with the minimum d_{f′}(0, v) whose distance was decreased by the augmentation, so that d_{f′}(0, v) < d_f(0, v). Let p = 0 ⇝ u → v be a shortest path from 0 to v in Ĝ^s_{f′}, so that (u, v) ∈ Ê^s_{f′} and

d_{f′}(0, v) = d_{f′}(0, u) if u < v (infinite edge),
d_{f′}(0, v) = d_{f′}(0, u) + 1 otherwise.   (21)

Because of how we chose v, we know that the distance of node u from the source did not decrease, i.e.,

d_{f′}(0, u) ≥ d_f(0, u).   (22)

We claim that (u, v) ∉ Ê^s_f. Why? If we had (u, v) ∈ Ê^s_f, then we would also have

d_f(0, v) ≤ d_f(0, u) + 1 ≤ d_{f′}(0, u) + 1 = d_{f′}(0, v),   (23)

which contradicts our assumption that d_{f′}(0, v) < d_f(0, v). The above argument follows even if (u, v) is an infinite-capacity edge. Hence (u, v) ∉ Ê^s_f.

How can we have (u, v) ∉ Ê^s_f and (u, v) ∈ Ê^s_{f′}? Note that, in this case, (u, v) cannot be an infinite-capacity edge. There can be two reasons:
1) A new lowest edge (u, v) is created due to the flow from v to u. That means the augmentation must have increased the flow from v to u. The MEMF algorithm always augments flow along shortest paths, and therefore the shortest path from 0 to u in Ĝ^s_f has (v, u) as its last edge.
Therefore,

d_f(0, v) = d_f(0, u) − 1 ≤ d_{f′}(0, u) − 1 = d_{f′}(0, v) − 2,   (24)

which contradicts our assumption that d_{f′}(0, v) < d_f(0, v).

Theorem. If the MEMF algorithm is run on the Ishikawa graph Ĝ = (V̂ ∪ {0, 1}, Ê) with source 0 and sink 1, then the total number of augmentations performed by the algorithm is O(|V̂||Ê|).

Proof. We say that an edge (u, v) in a residual lower-graph Ĝ^s_f is critical on an augmenting path p if the residual capacity of p is the residual capacity of (u, v), i.e., if c_f(p) = c_f(u, v). After we have augmented flow along an augmenting path, any critical edge on the path disappears from the residual graph. Moreover, at least one edge on any augmenting path must be critical. We will show that each of the |Ê| edges can become critical at most |V̂|/2 times. Furthermore, note that an infinite-capacity edge cannot be critical at any point of the algorithm.

Let u and v be nodes in V̂ ∪ {0, 1} that are connected by an edge in Ê^s. Since augmenting paths are shortest paths, when (u, v) is critical for the first time, we have

d_f(0, v) = d_f(0, u) + 1.   (26)

Once the flow is augmented, the edge (u, v) disappears from the residual graph. Since we maintain the lowest-cross-edge property, there cannot be an edge (u, v₁) in Ĝ^s_f for some v₁ < v. Therefore, the edge (u, v) cannot reappear later on another augmenting path until after the flow from u to v₁, for some v₁ ≤ v, is decreased, which occurs only if (v₁, u) appears on an augmenting path. If f′ is the flow when this event occurs, then we have

d_{f′}(0, u) = d_{f′}(0, v₁) + 1.   (27)

Since d_{f′}(0, v) ≤ d_{f′}(0, v₁), due to the upward infinite-capacity edges, and d_f(0, v) ≤ d_{f′}(0, v) by Lemma A.1, we have

d_{f′}(0, u) = d_{f′}(0, v₁) + 1 ≥ d_{f′}(0, v) + 1 ≥ d_f(0, v) + 1 = d_f(0, u) + 2.   (28)
Consequently, from the time $(u, v)$ becomes critical to the time when it next becomes critical, the distance of $u$ from the source increases by at least 2. The distance of $u$ from the source is initially at least 0. The intermediate nodes on a shortest path from $0$ to $u$ cannot contain $0$, $u$ or $1$ (since $(u, v)$ on an augmenting path implies that $u \ne 1$). Therefore, until $u$ becomes unreachable from the source, if ever, its distance is at most $|\hat{V}|$. Thus, after the first time that $(u, v)$ becomes critical, it can become critical at most $|\hat{V}|/2$ times more, for a total of $|\hat{V}|/2 + 1$ times. Since there are $O(|\hat{E}|)$ pairs of nodes that can have an edge between them in a residual graph, the total number of critical edges during the entire execution of the MEMF algorithm is $O(|\hat{V}||\hat{E}|)$. Each augmenting path has at least one critical edge, and hence the theorem follows.

APPENDIX B
PROOF OF THEOREM

Theorem. Given the set of Ishikawa graph parameters $\phi$, there is an augmenting path in the block-graph if and only if there exists an augmenting path in the Ishikawa graph.

Proof. First, we will prove that, if there is an augmenting path in the block-graph, then there exists an augmenting path in the Ishikawa graph. It is clear that an augmenting path in the block-graph contains an edge from node 0 to a block, then a sequence of edges $B_{i:\gamma} \to B_{j:\delta}$, and finally an edge from a block to node 1. Note that an edge from node 0 to a block $B_{i:\gamma}$ corresponds to a positive edge $e_{i:\ell-1}$ in the Ishikawa graph; similarly, an edge from a block $B_{j:\delta}$ to node 1 corresponds to a positive edge $e_{j:0}$. Now, consider an edge $B_{i:\gamma} \to B_{j:\delta}$ in the augmenting path. Corresponding to this, there exists a positive edge $e_{ij:\lambda\mu}$ such that $U_{i:\lambda} \in B_{i:\gamma'}$ for some $\gamma' \ge \gamma$ and $U_{j:\mu} \in B_{j:\delta}$ in the Ishikawa graph.
Also, along the column $i$, there are upward infinite capacity edges, and the nodes corresponding to a block are also connected with positive bidirectional edges. Hence, there exists an augmenting path in the Ishikawa graph corresponding to the augmenting path in the block-graph.

Now, we will prove the converse. Consider an augmenting path in the Ishikawa graph. The path may contain a sequence of positive edges $e_{i:\lambda}$, $e_{ij:\lambda\mu}$ and infinite capacity edges $e_{ii:\lambda,\lambda+1}$. Note that, by construction, the $e_{i:\lambda}$ edges either will be in the same block $B_{i:\gamma}$ in the block-graph, or will be between a block and node 0 or node 1. Furthermore, the infinite capacity edges either will be in the same block, or there will be an edge $B_{i:\gamma} \to B_{j:\delta}$ in the block-graph to represent them. Finally, if $e_{ij:\lambda\mu}$ is a positive edge, then, by construction of the block-graph, there exists an edge $B_{i:\gamma} \to B_{j:\delta'}$ where $U_{i:\lambda} \in B_{i:\gamma}$ and $U_{j:\mu} \in B_{j:\delta}$ with $\delta' \le \delta$. Hence, if there is an augmenting path in the Ishikawa graph, then there exists an augmenting path in the block-graph.

REFERENCES

[1] H. Ishikawa, "Exact optimization for Markov random fields with convex priors," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 25, no. 10, pp. 1333–1336, 2003.
[2] D. Schlesinger and B. Flach, Transforming an arbitrary minsum problem into a binary one. TU, Fak. Informatik, 2006.
[3] L. Ford and D. R. Fulkerson, Flows in Networks. Princeton University Press, 1962.
[4] P. Torr and M. Kumar, "Improved moves for truncated convex models," in Advances in Neural Information Processing Systems, 2009, pp. 889–896.
[5] O. Veksler, "Multi-label moves for MRFs with truncated convex priors," International Journal of Computer Vision, vol. 98, no. 1, pp. 1–14, 2012.
[6] T. Ajanthan, R. Hartley, M. Salzmann, and H. Li, "Iteratively reweighted graph cut for multi-label MRFs with non-convex priors," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[7] Y. Boykov and V. Kolmogorov, "An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 26, no. 9, pp. 1124–1137, 2004.
[8] P. Kohli and P. H. Torr, "Efficiently solving dynamic Markov random fields using graph cuts," in Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, vol. 2. IEEE, 2005, pp. 922–929.
[9] P. Strandmark and F. Kahl, "Parallel and distributed graph cuts by dual decomposition," in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010, pp. 2085–2092.
[10] V. Kolmogorov, "Convergent tree-reweighted message passing for energy minimization," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 28, no. 10, pp. 1568–1583, 2006.
[11] N. Komodakis, N. Paragios, and G. Tziritas, "MRF energy minimization and beyond via dual decomposition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 33, no. 3, pp. 531–552, 2011.
[12] B. Savchynskyy, S. Schmidt, J. H. Kappes, and C. Schnörr, "Efficient MRF energy minimization via adaptive diminishing smoothing," in Uncertainty in Artificial Intelligence, 2012, pp. 746–755.
[13] T. Ajanthan, R. Hartley, and M. Salzmann, "Memory efficient max flow for multi-label submodular MRFs," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[14] E. Boros and P. L. Hammer, "Pseudo-Boolean optimization," Discrete Applied Mathematics, vol. 123, no. 1, pp. 155–225, 2002.
[15] T. Werner, "A linear programming approach to max-sum problem: A review," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 29, no. 7, pp. 1165–1179, 2007.
[16] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. MIT Press, Cambridge, 2001.
[17] J. B. Orlin, "Max flows in O(nm) time, or better," in Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing. ACM, 2013, pp. 765–774.
[18] M. J. Wainwright, T. S. Jaakkola, and A. S. Willsky, "MAP estimation via agreement on trees: message-passing and linear programming," Information Theory, IEEE Transactions on, vol. 51, no. 11, pp. 3697–3717, 2005.
[19] V. Kolmogorov and M. Wainwright, "On the optimality of tree-reweighted max-product message-passing," Uncertainty in Artificial Intelligence, 2005.
[20] A. V. Goldberg and R. E. Tarjan, "A new approach to the maximum-flow problem," Journal of the ACM (JACM), vol. 35, no. 4, pp. 921–940, 1988.
[21] B. G. Chandran and D. S. Hochbaum, "A computational study of the pseudoflow and push-relabel algorithms for the maximum flow problem," Operations Research, vol. 57, no. 2, pp. 358–376, 2009.
[22] A. V. Goldberg, S. Hed, H. Kaplan, R. E. Tarjan, and R. F. Werneck, "Maximum flows by incremental breadth-first search," in Algorithms–ESA 2011. Springer, 2011, pp. 457–468.
[23] T. Verma and D. Batra, "MaxFlow revisited: An empirical comparison of maxflow algorithms for dense vision problems," in BMVC, 2012, pp. 1–12.
[24] A. Delong and Y. Boykov, "A scalable graph-cut algorithm for N-D grids," in Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on. IEEE, 2008, pp. 1–8.
[25] O. Jamriška, D. Sýkora, and A. Hornung, "Cache-efficient graph cuts on structured grids," in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012, pp. 3673–3680.
[26] A. Shekhovtsov and V. Hlaváč, "A distributed mincut/maxflow algorithm combining path augmentation and push-relabel," International Journal of Computer Vision, vol. 104, no. 3, pp. 315–342, 2013.
[27] V. Vineet and P. Narayanan, "CUDA cuts: Fast graph cuts on the GPU," in Computer Vision and Pattern Recognition Workshops, 2008. CVPRW'08. IEEE Computer Society Conference on. IEEE, 2008, pp. 1–8.
[28] J. H. Kappes, B. Andres, F. A. Hamprecht, C. Schnörr, S. Nowozin, D. Batra, S. Kim, B. X. Kausler, T. Kröger, J. Lellmann, N. Komodakis, B. Savchynskyy, and C. Rother, "A comparative study of modern inference techniques for structured discrete energy minimization problems," International Journal of Computer Vision, pp. 1–30, 2015. [Online]. Available: http://dx.doi.org/10.1007/s11263-015-0809-x
[29] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7–42, 2002.
[30] ——, "High-accuracy stereo depth maps using structured light," in Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, vol. 1. IEEE, 2003, pp. I–195.
[31] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The KITTI dataset," The International Journal of Robotics Research, p. 0278364913491297, 2013.
[32] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother, "A comparative study of energy minimization methods for Markov random fields with smoothness-based priors," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 6, pp. 1068–1080, 2008.
[33] S. Birchfield and C. Tomasi, "A pixel dissimilarity measure that is insensitive to image sampling," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 20, no. 4, pp. 401–406, 1998.
[34] A. V. Goldberg, S. Hed, H. Kaplan, P. Kohli, R. E. Tarjan, and R. F. Werneck, "Faster and more dynamic maximum flow by incremental breadth-first search," in Algorithms–ESA 2015. Springer, 2015, pp. 619–630.
[35] B. Andres, T. Beier, and J. H. Kappes, "OpenGM: A C++ library for discrete graphical models," arXiv preprint arXiv:1206.0111, 2012.
[36] P. J. Huber, "Robust estimation of a location parameter," The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.
Thalaiyasingam Ajanthan obtained his BSc degree in Electronics and Telecommunication Engineering from the University of Moratuwa, Sri Lanka, in 2013. He is now a PhD student at the College of Engineering and Computer Science, Australian National University (ANU). He is also a member of the Analytics Group at Data61, CSIRO. He is a student member of IEEE.

Richard Hartley is a member of the Computer Vision Group in the Research School of Engineering at ANU, where he has been since January 2001. He is also a member of the Analytics Group in Data61, CSIRO. He worked at the GE Research and Development Center from 1985 to 2001, working first in VLSI design and later in computer vision. He became involved with image understanding and scene reconstruction while working with GE's Simulation and Control Systems Division. He is an author (with A. Zisserman) of the book Multiple View Geometry in Computer Vision.