A Potential Reduction Inspired Algorithm for Exact Max Flow in Almost $\tilde{O}(m^{4/3})$ Time
Tarun Kathuria ∗ September 8, 2020
Abstract
We present an algorithm for computing $s$-$t$ maximum flows in directed graphs in $\tilde{O}(m^{4/3+o(1)} U^{1/3})$ time. Our algorithm is inspired by potential reduction interior point methods for linear programming. Instead of using scaled gradient/Newton steps of a potential function, we take the step which maximizes the decrease in the potential value subject to advancing a certain amount on the central path, which can be efficiently computed. This allows us to trace the central path with our progress depending only on $\ell_\infty$ norm bounds on the congestion vector (as opposed to the $\ell_4$ norm required by previous works), and runs in $O(\sqrt{m})$ iterations. To improve the number of iterations by establishing tighter bounds on the $\ell_\infty$ norm, we then consider the weighted central path framework of Madry [Mad13, Mad16, CMSV17] and Liu-Sidford [LS20b]. Instead of changing weights to maximize energy, we consider finding weights which maximize the decrease in potential value. Finally, similar to finding weights which maximize energy as done in [LS20b], this problem can be solved by the iterative refinement framework for smoothed $\ell_2$-$\ell_p$ norm flow problems [KPSW19], completing our algorithm. We believe our potential reduction based viewpoint provides a versatile framework which may lead to faster algorithms for max flow.

∗U.C. Berkeley, [email protected], supported by NSF Grant CCF 1718695.
Introduction
The $s$-$t$ maximum flow problem and its dual, the $s$-$t$ minimum cut problem, are amongst the most fundamental problems in combinatorial optimization, with a wide range of applications. Furthermore, they serve as a testbed for new algorithmic concepts which have found uses in other areas of theoretical computer science and optimization. This is because the max-flow and min-cut problems demonstrate the prototypical primal-dual relation in linear programs. In the well-known $s$-$t$ maximum flow problem we are given a graph $G = (V, E)$ with $m$ edges and $n$ vertices with edge capacities $u_e \le U$, and aim to route as much flow as possible from $s$ to $t$ while restricting the magnitude of the flow on each edge to its capacity.

Several decades of work on combinatorial algorithms for this problem led to a large set of results, culminating in the work of Goldberg-Rao [GR98], which gives a running time bound of $O(m \min\{m^{1/2}, n^{2/3}\} \log(n^2/m) \log U)$. This bound remained unimproved for many years. In a breakthrough paper, Christiano et al. [CKM+11] show how to compute approximate maximum flows in $\tilde{O}(mn^{1/3} \log(U)\,\mathrm{poly}(1/\varepsilon))$ time. Their new approach uses electrical flow computations, which are Laplacian linear system solves and can be carried out in nearly-linear time [ST14], to take steps that minimize a softmax approximation of the congestion of edges via a second order approximation. A straightforward analysis leads to an $O(\sqrt{m})$ iteration algorithm. However, by trading off against another potential function, they show that $O(m^{1/3})$ iterations suffice. This work led to an extensive line of work exploiting Laplacian system solving and continuous optimization techniques for faster max flow algorithms. Lee et al. [LRS13] also present another $O(n^{1/3}\,\mathrm{poly}(1/\varepsilon))$ iteration algorithm for unit-capacity graphs, again using electrical flow primitives. Finally, Kelner et al. [KLOS14] and Sherman [She13, She17b] present algorithms achieving $O(m^{o(1)}\,\mathrm{poly}(1/\varepsilon))$ iterations for max-flow and its variants, which are based on congestion approximators and oblivious routing schemes as opposed to electrical flow computations. This has now been improved to near linear time [Pen16, She17a]. Crucially, this line of work can only guarantee weak approximations to max flow due to the $\mathrm{poly}(1/\varepsilon)$ term in the iteration complexity.

In order to get highly accurate solutions, which depend only polylogarithmically on $1/\varepsilon$, work has relied on second-order optimization techniques which use first and second-order information (the Hessian of the optimization function). To solve the max flow problem to high accuracy, several works have used interior point methods (IPMs) for linear programming [NN94, Ren01]. These algorithms handle non-negativity/$\ell_\infty$ constraints by approximating them with a self-concordant barrier, an approximation to an indicator function of the set which satisfies local smoothness and strong convexity properties and hence can be optimized using Newton's method.
In particular, Daitch and Spielman [DS08] show how to combine standard path-following IPMs and Laplacian linear system solves to obtain an $\tilde{O}(m^{3/2} \log(U/\varepsilon))$ time algorithm, matching Goldberg and Rao up to logarithmic factors. The $O(\sqrt{m})$ iteration count is a crucial bottleneck here, due to the $\ell_\infty$ norm being approximated by the $\ell_2$ norm up to a factor of $\sqrt{m}$. Lee and Sidford [LS14] then devised a faster IPM using weighted logarithmic barriers to achieve an $\tilde{O}(m\sqrt{n} \log(U/\varepsilon))$ time algorithm. Madry [Mad13, Mad16] opened up the weighted barrier based IPM algorithms for max flow to show that instead of the $\ell_2$ norm governing the progress of each iteration, one can make progress while only maintaining bounds on the $\ell_4$ norm. Combining this with insights from [CKM+11], he obtains an $\tilde{O}(m^{3/7})$ iteration algorithm, which leads to an $\tilde{O}(m^{10/7} U^{1/7} \log(m/\varepsilon))$ time bound. Note that the algorithm depends polynomially on the maximum edge capacity $U$ and hence is mainly an improvement for mildly large edge capacities. This work can also be used to solve min cost flow problems in the same running time [CMSV17].

Another line of work beyond IPMs is to solve $p$-norm regression problems on graphs. Such problems interpolate between electrical flow problems ($p = 2$), maximum flow problems ($p = \infty$) and transshipment problems ($p = 1$). While these problems can also be solved in $O(\sqrt{m})$ iterations to high accuracy using IPMs [NN94], it was unclear if this iteration complexity could be improved depending on the value of $p$. Bubeck et al. [BCLL18] showed that for any self-concordant barrier for the $\ell_p$ ball, the iteration complexity has to be at least $\Omega(\sqrt{m})$, thus making progress using IPMs unlikely. They however showed that another homotopy-based method, a family which also includes IPMs, can be used to solve the problem in $\tilde{O}_p(m^{|1/2 - 1/p|} \log(1/\varepsilon))$ iterations, where $\tilde{O}_p$ hides dependencies on $p$ in the runtime. This leads to improvements in the runtime for constant values of $p$. Next, Adil et al.
[AKPS19], inspired by the work of [BCLL18], showed that one can measure the change in the $p$-norm using a second order term based on a different function, which allows them to obtain approximations to the $p$-norm function in different norms with strong condition number bounds. These results can be viewed in the framework of relative convexity [LFN18]. Thus, they can focus on just solving the optimization problem arising from the residual. Using insights from [CKM+11], they arrive at an $\tilde{O}_p(m^{4/3} \log(1/\varepsilon))$-time algorithm. Follow-up work by Kyng et al. [KPSW19] opened up the tools used by Spielman and Teng [ST14] for $\ell_2$-norm flow problems to show that one can construct strong preconditioners for the residual problems of mixed $\ell_2$-$\ell_p$-norm flow problems, a generalization of $\ell_p$-norm flow, and obtain an $\tilde{O}_p(m^{1+o(1)} \log(1/\varepsilon))$ algorithm. These results however do not lead to faster max flow algorithms, due to their large dependence on $p$.

However, Liu and Sidford [LS20b], improving on Madry [Mad16], showed that instead of carefully tuning the weights based on the electrical energy, one can consider the separate problem of finding a new set of weights under a certain budget constraint to maximize the energy. They showed that a version of this problem reduces to solving $\ell_2$-$\ell_p$ norm flow problems and hence can be solved in almost-linear time using the work of [KPSW19, AS20]. This leads to an $O(m^{11/8+o(1)} U^{1/4})$-time algorithm for max flow. However, this result still relies on the amount of progress one can take in each iteration being limited by the bounds one can ensure on the $\ell_4$ norm of the congestion vector, as opposed to the ideal $\ell_\infty$ norm. We remark here that there are IPMs for linear programming which only measure centrality in the $\ell_\infty$ norm, as opposed to the $\ell_2$ or $\ell_4$ norm. In particular, [CLS19, LSZ19, vdBLSS20] show how to take a step with respect to a softmax function of the duality gap and trace the central path only maintaining $\ell_\infty$ norm bounds.
[Tun95, Tun94] also designed potential reduction based IPMs which trace the central path while only maintaining centrality in $\ell_\infty$.

In this paper, we devise a faster interior point method for $s$-$t$ maximum flow in directed graphs. Precisely, our algorithm runs in time $\tilde{O}(m^{4/3+o(1)} U^{1/3})$. During the process of writing this paper, we were informed by Yang Liu and Aaron Sidford [LS20a] that they have also obtained an algorithm achieving the same runtime. They also end up solving the same subproblems that we solve, although they arrive at them from the perspective of the Bregman divergence of the barrier, as opposed to the potential function that is the inspiration for our work. Our algorithm builds on top of both Madry [Mad16] and Liu-Sidford [LS20b] and is arguably simpler than both in some regards.

In particular, our algorithm is based on potential reduction algorithms, a kind of interior point method for linear programs. These algorithms are based on a potential function which measures both the duality gap as well as closeness to the boundary via a barrier function. They differ from path-following IPMs in that they need not follow the central path closely but only trace it loosely, which is also experimentally observed. Usually, the step taken is a scaled gradient step/Newton step on the potential function. Provided that we can guarantee sufficient decrease of the potential function and relate the potential function to closeness to optimality, we can show convergence. We refer to [Ans96, Tod96, NN94] for excellent introductions to potential reduction IPMs.

We will however use a different step; instead of a Newton step, we consider taking the step, subject to augmenting a certain amount of flow in each iteration, which maximizes the decrease in the potential function after taking the step.
We then show that this optimization problem can be efficiently solved in $\tilde{O}(m)$ time using electrical flow computations. While we can show that the potential function decreases by a large amount, which would guarantee that we can solve the max flow problem in $O(\sqrt{m})$ iterations, we forego writing it in this manner as we are unable to argue such a statement when the weights, and hence the potential function, are also changed. Instead, we stick to keeping track of the centrality of our flow vector while making sufficient progress. Crucially however, the amount of progress made by our algorithm only depends on bounds on the $\ell_\infty$ norm of the congestion vector of the update step, rather than the traditional $\ell_2$ or $\ell_4$ norm bounds of [Mad16, LS20b]. In order to improve the iteration complexity by obtaining stronger bounds on the $\ell_\infty$ norm of the congestion vector, we show that, as in Liu-Sidford [LS20b], we can change the weights on the barrier term for each edge. Instead of using energy as the potential function to be maximized, inspired by oracles designed for multiplicative weights algorithms, we use the change in the potential function itself as the quantity to be maximized, subject to an $\ell_1$ budget constraint on the change in weights. While we are unaware of how to maximize the $\ell_1$ constrained problem, we relax it to an $\ell_q$ constrained problem, which we solve via a mixed $\ell_2$-$\ell_p$ norm flow problem using the work of [KPSW19, AS20]. Combining this with an application of Hölder's inequality gives us sufficiently good control on the $\ell_1$-norm of the weight change, while ensuring that our step has significantly better $\ell_\infty$ norm bounds on the congestion vector. We believe our potential reduction framework, as well as the concept of changing weights based on the update step, might be useful in designing faster algorithms for max flow beyond our $m^{4/3}$ running time.

Preliminaries

Throughout this paper, we will view graphs as having both forward and backward capacities.
Specifically, we will denote by $G = (V, E, u)$ a directed graph with vertex set $V$ of size $n$, edge set $E$ of size $m$, and two non-negative capacities $u^-_e$ and $u^+_e$ for each edge $e \in E$. For the purpose of this paper, all edge capacities are bounded by $U$. Each edge $e = (u, v)$ has a head vertex $u$ and a tail vertex $v$. For a vector $v \in \mathbb{R}^m$, we define $\|v\|_p = \left(\sum_{i=1}^m |v_i|^p\right)^{1/p}$ and $\|v\|_\infty = \max_{i=1}^m |v_i|$, and refer to $\mathrm{Diag}(v) \in \mathbb{R}^{m \times m}$ as the diagonal matrix with the $i$th diagonal entry equal to $v_i$.

Maximum Flow Problem
Given a graph $G$, we call any assignment of real values to the edges of $E$, i.e., $f \in \mathbb{R}^m$, a flow. For a flow vector $f$, we view $f_e$ as the amount of flow on edge $e$, and if this value is negative, we interpret it as a flow of $|f_e|$ flowing in the direction opposite to the edge's orientation. We say that a flow $f$ is a $\sigma$-flow, for some demands $\sigma \in \mathbb{R}^n$, iff it satisfies the flow conservation constraints with respect to those demands. That is, we have
$$\sum_{e \in E^+(v)} f_e - \sum_{e \in E^-(v)} f_e = \sigma_v$$
for every vertex $v \in V$, where $E^+(v)$ and $E^-(v)$ are the sets of edges of $G$ entering and leaving vertex $v$ respectively. We will require $\sum_{v \in V} \sigma_v = 0$.

Furthermore, we say that a $\sigma$-flow $f$ is feasible in $G$ iff $f$ satisfies the capacity constraints $-u^-_e \le f_e \le u^+_e$ for each edge $e \in E$.

One type of flow that will be of interest to us are $s$-$t$ flows, where $s$ (the source) and $t$ (the sink) are two distinguished vertices of $G$. Formally, an $s$-$t$ flow is a $\sigma$-flow whose demand vector is $\sigma = F\chi_{s,t}$, where $F$ is the value of the flow and $\chi_{s,t}$ is the vector with $-1$ and $+1$ at the coordinates corresponding to $s$ and $t$ respectively and zero elsewhere.

Now, the maximum flow problem corresponds to the problem in which we are given a directed graph $G = (V, E, u)$ with integer capacities as well as a source vertex $s$ and a sink vertex $t$, and want to find a feasible $s$-$t$ flow of maximum value. We will denote this maximum value by $F^*$.

Residual Graphs
A fundamental object in many maximum flow algorithms is the notion of a residual graph. Given a graph $G$ and a feasible $\sigma$-flow $f$ in that graph, we define the residual graph $G_f = (V, E, \hat{u}(f))$ over the same vertex and edge set as $G$, such that, for each edge $e = (u, v)$, its forward and backward residual capacities are defined as
$$\hat{u}^+_e(f) = u^+_e - f_e \quad \text{and} \quad \hat{u}^-_e(f) = u^-_e + f_e.$$
We will also denote $\hat{u}_e(f) = \min\{\hat{u}^+_e(f), \hat{u}^-_e(f)\}$. When the value of $f$ is clear from context, we will omit writing it explicitly. Observe that the feasibility of $f$ implies that all residual capacities are always non-negative.

Electrical Flows and Laplacian Systems
Let $G$ be a graph and let $r \in \mathbb{R}^m_{++}$ be a vector of edge resistances, where the resistance of edge $e$ is denoted by $r_e$. For a flow $f \in \mathbb{R}^E$ on $G$, we define the energy of $f$ to be $\mathcal{E}_r(f) = f^\top R f = \sum_{e \in E} r_e f_e^2$, where $R = \mathrm{Diag}(r)$. For a demand $\chi$, we define the electrical $\chi$-flow $f_r$ to be the $\chi$-flow which minimizes energy, $f_r = \arg\min_{B^\top f = \chi} \mathcal{E}_r(f)$, where $B \in \mathbb{R}^{m \times n}$ is the edge-vertex incidence matrix. This flow is unique, as the energy is a strictly convex function. The Laplacian of a graph $G$ with resistances $r$ is defined as $L = B^\top R^{-1} B$. The electrical $\chi$-flow is given by the formula $f_r = R^{-1} B L^\dagger \chi$. We also define the electrical potentials $\phi = L^\dagger \chi$. There is a long line of work, starting from Spielman and Teng, which shows how to solve $L\phi = \chi$ in nearly linear time [ST14, KMP14, KOSA13, PS14, CKM+14, KS16, KLP+…].

$p$-Norm Flows

As mentioned above, a line of work [BCLL18, AKPS19, KPSW19] shows how to solve more general $p$-norm flow problems. Precisely, given a "gradient" vector $g \in \mathbb{R}^E$, resistances $r \in \mathbb{R}^E_+$ and a demand vector $\chi$, the problem under consideration is
$$OPT = \min_{B^\top f = \chi} \sum_{e \in E} g_e f_e + r_e f_e^2 + |f_e|^p.$$
[KPSW19] call such a problem a mixed $\ell_2$-$\ell_p$-norm flow problem and denote the expression inside the min as $val(f)$. The main result of the paper is

Theorem 2.1 (Theorem 1.1 in [KPSW19]). For any even $p \in [\omega(1), o(\log^{2/3-o(1)} n)]$ and an initial solution $f^{(0)}$ such that all parameters are bounded by $\mathrm{poly}(\log(n))$, we can compute a flow $\tilde{f}$ satisfying the demands $\chi$ such that
$$val(\tilde{f}) - OPT \le \frac{1}{2^{\mathrm{poly}(\log m)}}\left( val(f^{(0)}) - OPT \right) + \frac{1}{2^{\mathrm{poly}(\log m)}}$$
in $2^{O(p^{3/2})} m^{1+O(1/\sqrt{p})}$ time.
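Returning to the electrical flow primitives above, the formula $f_r = R^{-1} B L^\dagger \chi$ can be sketched numerically. The following is a minimal illustration on a toy graph (all data here is assumed for illustration; the sign convention $B_{e,u} = -1$, $B_{e,v} = +1$ for $e = (u,v)$ is one of the two standard choices):

```python
import numpy as np

# Toy illustration of the electrical chi-flow f_r = R^{-1} B L^dagger chi
# with L = B^T R^{-1} B.  Sign convention assumed: B[e, u] = -1, B[e, v] = +1
# for edge e = (u, v), so B^T f gives the net demand met at each vertex.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
n, m = 4, len(edges)
B = np.zeros((m, n))
for e, (u, v) in enumerate(edges):
    B[e, u], B[e, v] = -1.0, 1.0

r = np.array([1.0, 2.0, 1.0, 3.0, 1.0])            # resistances
chi = np.array([-1.0, 0.0, 0.0, 1.0])              # one unit from vertex 0 to 3
L = B.T @ np.diag(1.0 / r) @ B                     # graph Laplacian
f_r = (1.0 / r) * (B @ (np.linalg.pinv(L) @ chi))  # electrical chi-flow

print(np.allclose(B.T @ f_r, chi))                 # demands are met

# Energy minimality: adding a circulation can only increase f^T R f.
c = np.array([1.0, -1.0, 1.0, 0.0, 0.0])           # a circulation (B^T c = 0)
print(f_r @ (r * f_r) <= (f_r + 0.1 * c) @ (r * (f_r + 0.1 * c)))
```

In a real solver the dense pseudo-inverse would of course be replaced by a nearly-linear-time Laplacian system solve, which is the entire point of the works cited above.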
We remark that, strictly speaking, the theorem in [KPSW19] states the error to be polynomially small, but [LS20b] observe that their proof actually implies the quasi-polynomially small error stated above. While the subproblems that we need to solve to change weights cannot be put exactly into this form, we show that mild modifications of their techniques allow us to use their algorithm as a black box. Hence, we elaborate on their approach below.

One main thing established in their paper is how the $p$-norm changes when we move from $f$ to $f + \delta$.

Lemma 2.2 (Lemma in [KPSW19]). For any $f \in \mathbb{R}^E$ and $\delta \in \mathbb{R}^E$ we have, coordinate-wise,
$$f_i^p + p f_i^{p-1} \delta_i + 2^{-O(p)} h_p(f_i^{p-2}, \delta_i) \le (f_i + \delta_i)^p \le f_i^p + p f_i^{p-1} \delta_i + 2^{O(p)} h_p(f_i^{p-2}, \delta_i),$$
where $h_p(x, \delta) = x\delta^2 + |\delta|^p$.

Hence, given an initial solution, it suffices to solve a residual problem of the form
$$\min_{B^\top \delta = 0} \; g(f)^\top \delta + \sum_{e \in E} h_p(f_e^{p-2}, \delta_e)$$
where $g(f)_i = p f_i^{p-1}$. Next, they notice that bounding the condition number with respect to the function $h_p(\cdot, \cdot)$ actually suffices to get linear convergence and hence to tolerate quasi-polynomially low errors. The rest of the paper goes into designing good preconditioners which allow them to solve the above subproblem quickly.

We will also need some basics about min-max saddle point problems [BNO03]. Consider a function $f(x, y)$ with $\mathrm{dom}(f, x) = \mathcal{X}$ and $\mathrm{dom}(f, y) = \mathcal{Y}$. The problem we will be interested in is of the form
$$\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} f(x, y).$$
Define the functions $f_y(y) = \min_{x \in \mathcal{X}} f(x, y)$ for every fixed $y \in \mathcal{Y}$, and $f_x(x) = \max_{y \in \mathcal{Y}} f(x, y)$ for every fixed $x \in \mathcal{X}$. We have the following theorem from Section 2.6 in [BNO03].

Theorem 2.3. Suppose $f(x, y)$ is convex in $x$ and concave in $y$, and let $\mathcal{X}$, $\mathcal{Y}$ be convex and closed. Then $f_x$ is a convex function and $f_y$ is a concave function.

$\sqrt{m}$ Iteration Algorithm
In this section, we first set up our IPM framework and show how to recover the $\sqrt{m}$ iteration bound for max flow. In the next section, we will change the weights to obtain our improved runtime. Our framework is largely inspired by [Mad16] and [LS20b], and indeed a lot of the arguments can be reused with some modifications.

For every edge $e = (u, v)$, we consider assigning two non-negative weights $w^+_e$ and $w^-_e$ to the forward and backward directions. Based on the weights and the edge capacities, for any feasible flow, we define a barrier functional
$$\phi_w(f) = -\sum_{e \in E} \left[ w^+_e \log(u^+_e - f_e) + w^-_e \log(u^-_e + f_e) \right].$$
The method maintains a flow satisfying $B^\top f = F\chi$, with the proximity of the point to the constraints measured through the barrier $\phi_w(f)$, known as centrality. Previous IPMs proceed by taking a Newton step with respect to the barrier, with a step size which ensures that we increase the value of the flow $F$ by a certain amount. Since a Newton step is the minimization of a second order optimization problem, it can be shown that the step can be computed via electrical flow computations. Typically, taking a Newton step can be decomposed into progress and centering steps: one first takes a progress step which increases the flow value, causing us to lose centrality by some amount; then one takes a centering step which improves the centrality without increasing the flow value. The amount of progress we can make in each iteration while still being able to recenter determines the number of iterations our algorithm will take. [Mad16, LS20b] follow this prototype, and loosely speaking the amount of flow value we can add in each iteration's progress step depends on the $\ell_\infty$ norm of the congestion vector, which measures how much flow we can add before we saturate an edge.
However, the bottleneck ends up being the centering step, which requires that the flow value only be increased by an amount depending on the $\ell_2$ norm of the congestion vector, a stronger requirement than an $\ell_\infty$ norm bound.

Madry [Mad13, Mad16] notes that when the $\ell_\infty$ and $\ell_2$ norms of the congestion vector are large, then increasing the resistances of the congested edges increases the energy of the resulting electrical flow. So he repeatedly increases the weights of the congested edges (called boosting) until the congestion vector has sufficiently small norm. By using the electrical energy of the resulting step as a global potential function and analyzing how it evolves over the progress, centering and boosting steps, he can control the amount of weight change and the number of boosting steps necessary to reduce the norm of the congestion vector. Carefully trading off these quantities yields his runtime of $\tilde{O}(m^{10/7})$. To improve on this, Liu and Sidford [LS20b] consider the problem of finding a set of weight increases which maximizes the energy of the resulting flow. As they need to ensure that the weights don't increase by too much, they place a budget constraint on the weight vector, and show that a small amount of weight change suffices to obtain good bounds on the congestion vector. Fortunately, this optimization problem ends up being efficiently solvable in almost linear time by using the mixed $\ell_2$-$\ell_p$ norm flow algorithm of [KPSW19]. However, this step still essentially requires $\ell_4$-norm bounds to ensure centering is possible.

In this paper, we will consider taking steps with respect to a potential function. The potential function $\Phi_w$ comes from potential reduction IPM schemes and trades off the duality gap with the barrier:
$$\Phi_w(f, s) = m \log\left(\frac{f^\top s}{m}\right) + \phi_w(f).$$
For self-concordant barriers such as weighted log barriers, the negative gradient $-\nabla \phi_w(f)$ is feasible for the dual [Ren01], and so for any $f'$ feasible for the primal, we have $f'^\top(-\nabla \phi_w(f)) \ge 0$.
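The barrier and the potential above are simple to evaluate. The following toy sketch (all data assumed for illustration) checks the gradient formula $(\nabla\phi_w(f))_e = \frac{w^+_e}{u^+_e - f_e} - \frac{w^-_e}{u^-_e + f_e}$ by finite differences, and evaluates $\Phi_w$ at the dual slack $s = -\nabla\phi_w(f)$:

```python
import numpy as np

# Toy evaluation (assumed data) of the barrier phi_w and potential
# Phi_w(f, s) = m*log(f^T s / m) + phi_w(f) at s = -grad phi_w(f).
u_p = np.ones(4); u_m = np.ones(4)                 # unit capacities
w_p = np.array([1.0, 2.0, 1.0, 3.0])               # forward weights
w_m = np.array([3.0, 1.0, 2.0, 1.0])               # backward weights
f = np.array([0.3, -0.2, 0.1, -0.4])               # a feasible flow
m = len(f)

phi = lambda g: -np.sum(w_p * np.log(u_p - g) + w_m * np.log(u_m + g))
grad = w_p / (u_p - f) - w_m / (u_m + f)           # closed-form gradient

eps = 1e-6                                         # finite-difference check
fd = np.array([(phi(f + eps * np.eye(m)[e]) - phi(f - eps * np.eye(m)[e])) / (2 * eps)
               for e in range(m)])
print(np.allclose(grad, fd, atol=1e-5))

s = -grad                                          # candidate dual slack
gap = f @ s                                        # duality-gap term, here > 0
Phi = m * np.log(gap / m) + phi(f)
print(gap > 0 and np.isfinite(Phi))
```

The data was chosen so that $f^\top s > 0$ and the logarithm in $\Phi_w$ is defined; the nonnegativity of the gap for feasible primal points is exactly the duality fact quoted above.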
We will consider dual "potential" variables $y \in \mathbb{R}^V$. Now, as in [Mad16, LS20b], we consider a centrality condition
$$y_v - y_u = \frac{w^+_e}{u^+_e - f_e} - \frac{w^-_e}{u^-_e + f_e} \quad \text{for all } e = (u, v). \tag{3.1}$$
If $(f, y, w)$ satisfy the above condition, we call the triple well-coupled. Also, given a tuple $(f, y, w)$ and a candidate step $\hat{f}$, define the forward and backward congestion vectors $\rho^+, \rho^- \in \mathbb{R}^E$ as
$$\rho^+_e = \frac{|\hat{f}_e|}{u^+_e - f_e} \quad \text{and} \quad \rho^-_e = \frac{|\hat{f}_e|}{u^-_e + f_e} \quad \text{for all } e \in E. \tag{3.2}$$
We can now assume via binary search that we know the optimal flow value $F^*$ [Mad16]. [Mad16, LS20b] consider preconditioning the graph, which allows them to ensure sufficient progress from a well-coupled point. The preconditioning strategy is to add $m$ extra (undirected) edges between $s$ and $t$ of capacity $U$ each, so the max flow value increases by at most $mU$. The following lemma can be seen from the proof of Lemma 4.5 in [LS20b].

Theorem 3.1.
Let $(f, y, w)$ be a well-coupled point for flow value $F$ in a preconditioned graph $G$. Then for every preconditioned edge $e$ we have
$$\hat{u}_e(f) = \min\{u^+_e - f_e,\; u^-_e + f_e\} \ge \frac{F^* - F}{\|w\|_1}.$$
In particular, if $\|w\|_1 \le 2m$, then we have $\hat{u}_e(f) \ge \frac{F^* - F}{2m}$. If we also have $F^* - F \ge m^{1/2 - \eta}$, then $\hat{u}_e(f) \ge m^{-(1/2 + \eta)}/2$.

Now that our setup is complete, we can focus on the step that we will be taking. In this section, we will keep the weights all fixed to 1, i.e., $w^+_e = w^-_e = 1$ for all $e \in E$; hence $\|w\|_1 = 2m$. Consider the change in the potential function when we move from $f$ to $f + \hat{f}$ while keeping the dual slack $-\nabla \phi_w(f) = By$ fixed. This change is
$$m \log\left(\frac{-(f + \hat{f})^\top \nabla \phi_w(f)}{m}\right) - m \log\left(\frac{-f^\top \nabla \phi_w(f)}{m}\right) + \phi_w(f + \hat{f}) - \phi_w(f).$$
We are interested in minimizing this quantity, which corresponds to maximizing the decrease in the potential function value while guaranteeing that we send, say, $\delta$ more units of flow $\hat{f}$. Hence the problem is
$$\arg\min_{B^\top \hat{f} = \delta\chi} \; m \log\left(\frac{-(f + \hat{f})^\top \nabla \phi_w(f)}{m}\right) + \phi_w(f + \hat{f}).$$
Unfortunately, this problem is not convex, as the duality gap term is concave in $\hat{f}$. However, we can instead minimize an upper bound to this term which is convex:
$$\arg\min_{B^\top \hat{f} = \delta\chi} \; \phi_w(f + \hat{f}) - (f + \hat{f})^\top \nabla \phi_w(f) = \arg\min_{B^\top \hat{f} = \delta\chi} \; \sum_{e \in E} \left[ -w^+_e \log\left(1 - \frac{\hat{f}_e}{u^+_e - f_e}\right) - w^-_e \log\left(1 + \frac{\hat{f}_e}{u^-_e + f_e}\right) - \hat{f}_e \left( \frac{w^+_e}{u^+_e - f_e} - \frac{w^-_e}{u^-_e + f_e} \right) \right],$$
as $\log(1 + x) \le x$ for non-negative $x$, which applies here by the duality mentioned above. We will refer to the value of the problem in the last line as the potential decrement, and will henceforth denote the function inside the minimization by $\Delta\Phi_w(f, \hat{f})$.
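The per-edge structure of the potential decrement is worth noting: each edge term vanishes at $\hat{f}_e = 0$, is convex, and is nonnegative, so only the constraint $B^\top \hat{f} = \delta\chi$ forces progress. A toy sketch (data assumed for illustration):

```python
import numpy as np

# Sketch (toy data) of the per-edge potential decrement Delta Phi_w(f, fhat).
# Each edge term equals 0 at fhat_e = 0 and is convex and nonnegative.
def delta_phi(w_p, w_m, uhat_p, uhat_m, fhat):
    return (-w_p * np.log(1 - fhat / uhat_p)
            - w_m * np.log(1 + fhat / uhat_m)
            - fhat * (w_p / uhat_p - w_m / uhat_m))

w_p = w_m = np.ones(5)                          # unit weights, as in this section
uhat_p = np.array([0.5, 1.0, 0.7, 0.9, 0.6])    # residual capacities u+ - f
uhat_m = np.array([1.5, 1.0, 1.3, 1.1, 1.4])    # residual capacities u- + f
fhat = np.array([0.1, -0.2, 0.05, 0.3, -0.1])   # a candidate step

terms = delta_phi(w_p, w_m, uhat_p, uhat_m, fhat)
print(np.all(terms >= 0))
print(np.allclose(delta_phi(w_p, w_m, uhat_p, uhat_m, 0 * fhat), 0))
```

The nonnegativity follows because each term has zero derivative at the origin and is convex there, which is also why the decrement serves as a useful progress measure below.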
It is instructive to first see how the coupling condition changes if we take the optimal step of the above problem while remaining feasible. From the optimality conditions of the above program, there exists a $\hat{y}$ such that for all $e = (u, v)$
$$\hat{y}_v - \hat{y}_u = \left( \frac{w^+_e}{u^+_e - f_e - \hat{f}_e} - \frac{w^-_e}{u^-_e + f_e + \hat{f}_e} \right) - \left( \frac{w^+_e}{u^+_e - f_e} - \frac{w^-_e}{u^-_e + f_e} \right) = \left( \frac{w^+_e}{u^+_e - f_e - \hat{f}_e} - \frac{w^-_e}{u^-_e + f_e + \hat{f}_e} \right) - (y_v - y_u).$$
Hence, if we update $y$ to $y + \hat{y}$ and $f$ to $f + \hat{f}$, we get a flow of value $F + \delta$ such that the coupling condition with respect to the new $y$ and $f$ still holds.

Hence, we can now focus on actually computing the step and showing what $\delta$ we can take while still satisfying feasibility, i.e., bounds on the $\ell_\infty$ norm of the congestion vector. The function we are trying to minimize comprises a self-concordant barrier term and a linear term. Unfortunately, we cannot control the condition number of such a function in order to optimize it efficiently over the entire space, as this is arguably as hard as the original problem itself. However, due to self-concordance, the function behaves smoothly enough (good condition number) in a box around the origin, but that seemingly doesn't help us solve the problem over the entire space. Fortunately, a fix for this was already found in [BCLL18]. In particular, they (smoothly) extend the function quadratically outside a box, ensuring that the (global) smoothness and strong convexity properties inside the box carry over to the outside as well, while also arguing that the minimizer is unchanged provided the minimizer of the original problem was inside the box. Specifically, the following lemma can be inferred from Section 2.2 of [BCLL18].

Lemma 3.2.
Given a function $f(x)$ which is $L$-smooth and $\mu$-strongly convex inside an interval $[-\ell, \ell]$, we define the quadratic extension of $f$ as
$$f_\ell(x) = \begin{cases} f(x), & -\ell \le x \le \ell \\ f(-\ell) + f'(-\ell)(x + \ell) + \frac{f''(-\ell)}{2}(x + \ell)^2, & x < -\ell \\ f(\ell) + f'(\ell)(x - \ell) + \frac{f''(\ell)}{2}(x - \ell)^2, & x > \ell. \end{cases}$$
The function $f_\ell$ is $C^1$, $L$-smooth and $\mu$-strongly convex. Furthermore, for any convex function $\psi(x)$, provided $x^* = \arg\min_{x \in \mathcal{X}} \psi(x) + \sum_{i=1}^n f(x_i)$ lies inside $\prod_{i=1}^n [-\ell_i, \ell_i]$, we have $\arg\min_{x \in \mathcal{X}} \psi(x) + \sum_{i=1}^n f_{\ell_i}(x_i) = x^*$.

Hence, it suffices to consider a $\delta$ small enough that the minimizer is the same as for the original problem, and we can focus on minimizing this quadratic extension of the function. For the minimization, we can use Accelerated Gradient Descent or Newton's method.

Theorem 3.3 ([Nes04]). Given a convex function $f$ which satisfies $D \preceq \nabla^2 f(x) \preceq \kappa D$ for all $x \in \mathbb{R}^n$, for some fixed diagonal matrix $D$ and some fixed $\kappa$, an initial point $x_0$ and an error parameter $0 < \varepsilon < 1/2$, accelerated gradient descent (AGD) outputs $x$ such that
$$f(x) - \min_{x'} f(x') \le \varepsilon \left( f(x_0) - \min_{x'} f(x') \right)$$
in $O(\sqrt{\kappa} \log(\kappa/\varepsilon))$ iterations. Each iteration involves computing $\nabla f$ at some point $x$, projecting onto the subspace defined by the constraints, and some linear-time calculations.

Notice that the Hessian of the function in the potential decrement problem is a diagonal matrix with the $e$th entry being
$$\frac{w^+_e}{(u^+_e - f_e - \hat{f}_e)^2} + \frac{w^-_e}{(u^-_e + f_e + \hat{f}_e)^2}.$$
So provided $\rho^+_e, \rho^-_e$ are less than some small constant, the condition number $\kappa$ of the Hessian is constant with respect to the diagonal matrix $\nabla^2 \phi_w(f)$, and hence we can use Theorem 3.3 to solve the problem in $\tilde{O}(1)$ iterations to quasi-polynomially good error.
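The quadratic extension of Lemma 3.2 is mechanical to implement. The sketch below applies it to the scalar barrier piece $f(x) = -\log(1-x)$ (an illustrative choice, not the full objective) and numerically checks that the extension agrees with $f$ inside the box and is $C^1$ at the boundary:

```python
import numpy as np

# Quadratic extension of Lemma 3.2: outside [-l, l] the function continues as
# its second-order Taylor expansion at the nearest boundary point, preserving
# smoothness and strong convexity while staying C^1.
def quad_ext(func, dfunc, d2func, l):
    def fl(x):
        b = np.clip(x, -l, l)            # nearest point of the box
        return func(b) + dfunc(b) * (x - b) + 0.5 * d2func(b) * (x - b) ** 2
    return fl

f   = lambda x: -np.log(1 - x)           # illustrative barrier piece
df  = lambda x: 1 / (1 - x)
d2f = lambda x: 1 / (1 - x) ** 2

fl = quad_ext(f, df, d2f, 0.5)
xs = np.linspace(-2, 0.49, 500)
inside = np.abs(xs) <= 0.5
print(np.allclose(fl(xs)[inside], f(xs[inside])))   # agrees inside the box

eps = 1e-6                                          # C^1 across the boundary
print(abs((fl(0.5 + eps) - fl(0.5 - eps)) / (2 * eps) - df(0.5)) < 1e-3)
```

Note that `quad_ext` is defined for any scalar function with first and second derivatives, matching the per-coordinate use in the potential decrement problem.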
Furthermore, notice that the algorithm is just computing a gradient and then doing a projection, so each iteration can be carried out using a Laplacian linear system solve and hence runs in nearly linear time. Quasi-polynomially small error will suffice for our purposes [Mad13, Mad16, LS20b].

Now, we just need to ensure that we can control the $\ell_\infty$-norm of the congestion vector, as that controls how much flow we can still send without violating constraints. Note further that we need to set $\ell$ while solving the quadratic extension of the potential decrement problem so that it is greater than the $\ell_\infty$ norm that we can guarantee. We will want both of these to be constants.

As mentioned above, the point of preconditioning the graph is to ensure that the preconditioned edges themselves can facilitate sufficient progress. To bound the congestion, we show an analog of Lemma 3.9 in [Mad16].

Lemma 3.4.
Let $(f, y, w)$ be a well-coupled solution with value $F$ and let $\delta = \frac{F^* - F}{20\sqrt{m}}$. Let $\hat{f}$ be the solution to the potential decrement problem. Then we have $\rho^+_e, \rho^-_e \le 1/2$ for all edges $e$.

Proof. Consider the flow $f'$ which sends $\delta/m$ units of flow on each of the $m$ preconditioned edges. Certainly the potential decrement flow $\hat{f}$ has a smaller potential decrement than $f'$, whose decrement is
$$\Delta\Phi_w(f, f') = \sum_{e \in E}\left[ -w^+_e \log\left(1 - \frac{f'_e}{u^+_e - f_e}\right) - w^-_e \log\left(1 + \frac{f'_e}{u^-_e + f_e}\right) - f'_e\left(\frac{w^+_e}{u^+_e - f_e} - \frac{w^-_e}{u^-_e + f_e}\right)\right] \le \sum_{e \in E} w^+_e \left(\frac{f'_e}{\hat{u}^+_e(f)}\right)^2 + w^-_e \left(\frac{f'_e}{\hat{u}^-_e(f)}\right)^2 \le \|w\|_1 \left(\frac{2\delta}{F^* - F}\right)^2 = \frac{1}{50},$$
where the first inequality follows from $-\log(1 - x) \le x + x^2$ and $-\log(1 + x) \le -x + x^2$ for $0 \le x \le 1/2$, and the second follows from plugging in the value of the flow on the preconditioned edges and using Theorem 3.1, which gives $f'_e / \hat{u}_e(f) \le \frac{\delta}{m} \cdot \frac{2m}{F^* - F} = \frac{2\delta}{F^* - F}$. Finally we use $\|w\|_1 = 2m$.

Now it suffices to prove a lower bound on the potential decrement in terms of the congestion vector. For this, we start by considering the inner product of $\hat{f}$ with the gradient of $\Delta\Phi_w(f, \hat{f})$, assuming for notational simplicity that $\hat{f}_e \ge 0$ (the other case is symmetric, with the roles of $\hat{u}^+$ and $\hat{u}^-$ swapped):
$$\sum_{e \in E} \hat{f}_e \left( \frac{w^+_e}{u^+_e - f_e - \hat{f}_e} - \frac{w^-_e}{u^-_e + f_e + \hat{f}_e} - \frac{w^+_e}{u^+_e - f_e} + \frac{w^-_e}{u^-_e + f_e} \right) = \sum_{e \in E}\left( \frac{w^+_e \hat{f}_e^2}{(\hat{u}^+_e - \hat{f}_e)\hat{u}^+_e} + \frac{w^-_e \hat{f}_e^2}{(\hat{u}^-_e + \hat{f}_e)\hat{u}^-_e}\right) \le \sum_{e \in E}\left( \frac{2 w^+_e \hat{f}_e^2}{(\hat{u}^+_e)^2} + \frac{w^-_e \hat{f}_e^2}{(\hat{u}^-_e)^2}\right) \le 4\,\Delta\Phi_w(f, \hat{f}) \le 4\,\Delta\Phi_w(f, f') \le \frac{4}{50},$$
where the second-to-last inequality follows from $x + x^2/2 \le -\log(1 - x)$ and $-x + x^2/3 \le -\log(1 + x)$. Strictly speaking, the first inequality only holds for $|\hat{f}_e| \le \hat{u}_e(f)/2$.
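The elementary logarithm bounds invoked in this argument can be checked on a grid; a quick sketch (the particular constants $1/2$, $1$, $1/3$ are choices that suffice on $0 < x \le 1/2$, not canonical ones):

```python
import numpy as np

# Grid check of the -log(1 -/+ x) bounds used in the proof, for 0 < x <= 1/2.
x = np.linspace(1e-6, 0.5, 100000)
print(all([
    np.all(x + x**2 / 2 <= -np.log(1 - x)),    # lower bound, forward terms
    np.all(-np.log(1 - x) <= x + x**2),        # upper bound, forward terms
    np.all(-x + x**2 / 3 <= -np.log(1 + x)),   # lower bound, backward terms
    np.all(-np.log(1 + x) <= -x + x**2),       # upper bound, backward terms
]))
```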
However, instead of considering the inner product of $\hat{f}$ with the gradient of $\Delta\Phi_w(f, \hat{f})$ itself, we consider its quadratic extension with $\ell_e = \hat{u}_e(f)/2$ for each edge $e$. It is easy to see that if $\hat{f}$ is outside the box, the desired inequality still holds (by computing the value the quadratic extension takes on $f'$ in the cases outside the box). To finish the proof,
$$\sum_{e \in E} \hat{f}_e \left( \frac{w^+_e}{u^+_e - f_e - \hat{f}_e} - \frac{w^-_e}{u^-_e + f_e + \hat{f}_e} - \frac{w^+_e}{u^+_e - f_e} + \frac{w^-_e}{u^-_e + f_e} \right) = \sum_{e \in E}\left( \frac{w^+_e \hat{f}_e^2}{(\hat{u}^+_e - \hat{f}_e)\hat{u}^+_e} + \frac{w^-_e \hat{f}_e^2}{(\hat{u}^-_e + \hat{f}_e)\hat{u}^-_e}\right) \ge \frac{2}{3}\sum_{e \in E}\left( \frac{w^+_e \hat{f}_e^2}{(\hat{u}^+_e)^2} + \frac{w^-_e \hat{f}_e^2}{(\hat{u}^-_e)^2}\right) \ge \frac{2}{3}\|\rho\|_\infty^2.$$
Hence, combining the two bounds, we get that $\|\rho\|_\infty^2$ is bounded by a small constant, and in particular $\|\rho\|_\infty \le 1/2$.

Notice that since $\|\rho\|_\infty \le 1/2$, the minimizer of the quadratically smoothed function is the same as that of the function without smoothing, and hence the new step is well-coupled as per the argument above. Hence, in every iteration, we decrease the amount of flow left to send multiplicatively by a factor of $1 - \Omega(1/\sqrt{m})$, and so within $\tilde{O}(\sqrt{m})$ iterations the remaining flow is small enough that we can round to an exact maximum flow using augmenting paths. This completes our $\sqrt{m}$ iteration algorithm.

$m^{4/3+o(1)} U^{1/3}$ Time Algorithm
In this section, we show how to change weights to improve the number of iterations of our algorithm. We follow the framework of Liu and Sidford [LS20b]: find a set of weights to add, under a norm constraint, such that the step one would take with respect to the new weights maximizes a potential function. In their case, since the step they take is an electrical flow, the potential function considered is the energy of that flow. As our step is different, we instead take the potential decrement with respect to the new weights as the potential function. Perhaps surprisingly, almost all of their arguments go through with minor modifications. Let the initial weights be $w$ and say we would like to add a set of weights $w'$. Then we are interested in maximizing the potential decrement with respect to the new weights. This can be seen as similar to designing oracles for multiplicative weight algorithms for two-player games, where a player plays the move which penalizes the other player the most given the current move. Our algorithm first finds a new set of weights and then takes the potential decrement step with respect to the new weights. Finally, for better control of the congestion vector, we show that one can decrease some of the weight increase, as in [LS20b]. We first focus on the problem of finding the new set of weights. We introduce a vector $r' \in \mathbb{R}^E_{++}$ of "resistances", optimize over these resistances, and then obtain the weights from them. Let $w$ be the current set of weights and $w'$ the set of desired changes. Without loss of generality, assume that $\hat u_e(f) = \hat u^+_e(f)$. Given a resistance vector $r'$, we define the weight changes as
$$(w^+_e)' = r'_e\,(\hat u^+_e(f))^2 \qquad \text{and} \qquad (w^-_e)' = (w^+_e)'\,\frac{\hat u^-_e(f)}{\hat u^+_e(f)}.$$
This is the same set of weight changes as in [LS20b], in the context of energy maximization.
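These weight changes have a simple invariant worth making explicit: both ratios $(w^+_e)'/\hat u^+_e$ and $(w^-_e)'/\hat u^-_e$ equal $r'_e\,\hat u^+_e$, which is exactly what preserves well-coupling. A minimal numeric sketch with hypothetical values:

```python
# hypothetical residual capacities and resistance increment for one edge
u_plus, u_minus, r_prime = 3.0, 5.0, 0.25

w_plus_new = r_prime * u_plus ** 2             # (w_e^+)' = r_e' * (u_e^+)^2
w_minus_new = w_plus_new * u_minus / u_plus    # (w_e^-)' = (w_e^+)' * u_e^- / u_e^+

# both ratios equal r_e' * u_e^+, so the coupling condition is preserved
assert abs(w_plus_new / u_plus - w_minus_new / u_minus) < 1e-12
assert abs(w_plus_new / u_plus - r_prime * u_plus) < 1e-12
```

The invariant holds identically in the residual capacities, so any nonnegative choice of $r'$ keeps the point well-coupled.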
This set of weights ensures that our point $(f, y, w)$ is well-coupled with respect to $w + w'$ as well, i.e.,
$$\frac{(w^+_e)'}{\hat u^+_e(f)} = \frac{(w^-_e)'}{\hat u^-_e(f)}.$$
We would like to solve
$$g(W) = \max_{r' > 0,\ \|r'\|_1 \le W}\ \min_{B^\top \hat f = \delta\chi}\ \Delta\Phi_{w + w'}(f, \hat f),$$
where $w'$ is obtained from $r'$ in the form written above. While this is the optimization problem we would like to solve, we are unable to do so due to the $\ell_1$ norm constraint on the resistances. We will however be able to solve a relaxed $q$-norm version of the problem:
$$g_q(W) = \max_{r' > 0,\ \|r'\|_q \le W}\ \min_{B^\top \hat f = \delta\chi}\ \Delta\Phi_{w + w'}(f, \hat f) = \max_{r' > 0,\ \|r'\|_q \le W}\ \min_{B^\top \hat f = \delta\chi}\ \Delta\Phi_w(f, \hat f) + \sum_{e \in E}\left[-(w^+_e)'\log\left(1 - \frac{\hat f_e}{\hat u^+_e}\right) - (w^-_e)'\log\left(1 + \frac{\hat f_e}{\hat u^-_e}\right) - \hat f_e\left(\frac{(w^+_e)'}{\hat u^+_e} - \frac{(w^-_e)'}{\hat u^-_e}\right)\right].$$
Notice that the objective is a linear (and hence concave) function of $w'$, and hence of $r'$, and is closed and convex in $\hat f$; the constraints are convex, as they are only linear and norm-ball constraints. Hence, using Theorem 2.3, we can say that the minimum over $B^\top \hat f = \delta\chi$ of the expression above is concave in $r'$, and the maximum over $r' > 0$, $\|r'\|_q \le W$ of the same expression is convex in $\hat f$. Now, as in [LS20b], we use Sion's minimax lemma to get
$$\min_{B^\top \hat f = \delta\chi}\ \Delta\Phi_w(f, \hat f) + \max_{r' > 0,\ \|r'\|_q \le W} \sum_{e \in E}\left[-(w^+_e)'\log\left(1 - \frac{\hat f_e}{\hat u^+_e}\right) - (w^-_e)'\log\left(1 + \frac{\hat f_e}{\hat u^-_e}\right) - \hat f_e\left(\frac{(w^+_e)'}{\hat u^+_e} - \frac{(w^-_e)'}{\hat u^-_e}\right)\right] = \min_{B^\top \hat f = \delta\chi}\ \Delta\Phi_w(f, \hat f) + W\left[\sum_{e \in E} g_e(\hat f)^p\right]^{1/p} \qquad (4.1)$$
where $g_e(\hat f) = -(\hat u^+_e(f))^2\log\left(1 - \frac{\hat f_e}{\hat u^+_e(f)}\right) - \hat u^+_e(f)\,\hat u^-_e(f)\log\left(1 + \frac{\hat f_e}{\hat u^-_e(f)}\right)$, and we plugged in the value of $w'$ in terms of $r'$ and used that $\max_{\|x\|_q \le W} y^\top x = W\|y\|_p$ with $1/p + 1/q = 1$. As mentioned above, the function inside the minimization problem is convex. Furthermore, from the proof of Theorem 2.3, it can be inferred that any smoothness and strong convexity properties of the function inside the min-max carry over to the function after the maximization. Hence, as in Section 3, we will consider the quadratic extension of the function inside the min-max (as a function of $\hat f$) with $\ell_e = \hat u_e(f)/2$; this is just the quadratic extension of $\Delta\Phi_w(f, \hat f)$ plus the quadratic extension of $g_e(\hat f)$. The strategy will be to add flow using this step while the remaining flow to be routed satisfies $F^* - F \ge m^{1/2 - \eta}$, after which running $m^{1/2 - \eta}$ iterations of augmenting paths gets us to the optimal solution. We will need to ensure that the $\ell_1$ norm of the weights does not get too large throughout the course of the algorithm. For that, we will first compute the weight changes and then do a weight reduction procedure [LS20b] so as to always ensure that $\|w\|_1 \le 3m$. We will take $\eta = 1/6 - o(1) - \frac{1}{3}\log_m U$ and $W = m^{4\eta}$. Provided we can ensure that $\|w\|_1 \le 3m$ throughout the course of the algorithm, that the $\ell_\infty$ norm of the congestion vector is always bounded by a constant, and that we can solve the resulting step in almost-linear time, we will obtain an algorithm which runs in $m^{4/3 + o(1)}U^{1/3}$ time.

Theorem 4.1. There exists an algorithm for solving $s$-$t$ maximum flow in directed graphs in $m^{4/3 + o(1)}U^{1/3}$ time.

To summarize, our algorithm starts off with $(f, y) = (0, 0)$ and $w^+_e = w^-_e = 1$ for all edges $e$.
Then, in each iteration, starting with a well-coupled $(f, y, w)$ with flow value $F$, we set $\delta = \frac{F^* - F}{10\,m^{1/2 - \eta}}$ and $W = m^{4\eta}$, and solve Equation 4.1 (which is the potential decrement problem with the new weights) to obtain $\hat f$, which is the step we take (it advances the flow value to $F + \delta$). All that remains is to actually find the weight updates $w'$, which have a closed-form expression in terms of $\hat f$; we then perform a weight reduction step to obtain the new weights, which ensure that we remain well-coupled for $\hat f$, and repeat while $F^* - F \ge m^{1/2 - \eta}$. Finally, we round the remaining flow using $m^{1/2 - \eta}$ iterations of augmenting paths. We first state the following lemma, whose proof is similar to that of Lemma 3.4.

Lemma 4.2.
Let $(f, y, w)$ be a well-coupled solution with value $F$ and let $\delta = \frac{F^* - F}{10\,m^{1/2 - \eta}}$. Let $\hat f$ be the solution to the potential decrement problem considered in Equation 4.1. Then, for all edges $e$, we have $\rho^+_e, \rho^-_e \le 1/2$ and $|\hat f_e| \le m^{-\eta}$.

We prove this lemma in Appendix A. Next, notice that $(f, y)$ is still a well-coupled solution with respect to the new weights $w + w'$, as the weights were chosen to ensure that the coupling condition is unchanged.

Lemma 4.3.
Our new weights, after weight reduction, satisfy $\|w''\|_1 \le m^{3\eta + o(1)}U \le m/2$, and $(f + \hat f, y + \hat y)$ is well-coupled with respect to $w + w''$.

Proof.
Using the optimality conditions of the program in Equation 4.1, we see that there exists a $\hat y$ such that
$$\hat y_v - \hat y_u = \hat f_e\left(\frac{w^+_e}{(\hat u^+_e - \hat f_e)\,\hat u^+_e} + \frac{w^-_e}{(\hat u^-_e + \hat f_e)\,\hat u^-_e}\right) + W\,\frac{g_e^{p-1}}{\|g\|_p^{p-1}}\,\hat u^+_e\left(\frac{\hat u^+_e}{\hat u^+_e - \hat f_e} - \frac{\hat u^-_e}{\hat u^-_e + \hat f_e}\right),$$
where $g \in \mathbb{R}^E$ is the vector whose $e$-th coordinate is $g_e(\hat f)$. We will take
$$r'_e = W\,\frac{g_e^{p-1}}{\|g\|_p^{p-1}}, \qquad (w^+_e)' = W\,\frac{g_e^{p-1}}{\|g\|_p^{p-1}}\,(\hat u^+_e)^2, \qquad (w^-_e)' = W\,\frac{g_e^{p-1}}{\|g\|_p^{p-1}}\,\hat u^+_e\,\hat u^-_e,$$
which satisfies the well-coupling condition we want to ensure. Also notice that $\|r'\|_q = W$, so we satisfy the norm-ball condition as well. Now, we need to upper bound the $\ell_1$ norm of $w'$. We will take $p = \sqrt{\log m}$:
$$\|w'\|_1 \le m^{1/p}\,\|w'\|_q \le m^{o(1)}\left[\sum_{e \in E} \left((w^+_e)' + (w^-_e)'\right)^q\right]^{1/q} \le m^{o(1)}\,W\,U^2 = O(m^{4\eta + o(1)}U^2),$$
as $\hat u^+_e, \hat u^-_e \le 2U$. Plugging in the value of $\eta$, we get that this is less than $m/2$. Now, we perform weight reduction to obtain a new set of weights $w''$ which still ensures that the coupling condition does not change, while establishing better control on the weights. The weight reduction procedure is the same as that in [LS20b]: we find the smallest non-negative $w''$ such that for all edges
$$\frac{(w^+_e)'}{\hat u^+_e - \hat f_e} - \frac{(w^-_e)'}{\hat u^-_e + \hat f_e} = \frac{(w^+_e)''}{\hat u^+_e - \hat f_e} - \frac{(w^-_e)''}{\hat u^-_e + \hat f_e}.$$
Since
$$\frac{(w^+_e)'}{\hat u^+_e} = \frac{(w^-_e)'}{\hat u^-_e} \qquad \text{and} \qquad \frac{\hat u^+_e - \hat f_e}{\hat u^-_e + \hat f_e} = \left(1 \pm O(\max\{\rho^+_e, \rho^-_e\})\right)\frac{\hat u^+_e}{\hat u^-_e},$$
it follows that
$$(w^+_e)'' + (w^-_e)'' \le O\left(\max\{\rho^+_e, \rho^-_e\}\right)\left((w^+_e)' + (w^-_e)'\right).$$
As $|\hat f_e| \le m^{-\eta}$ from Lemma 4.2, we get
$$\|w''\|_1 \le m^{-\eta}\sum_{e \in E} W\,\frac{g_e^{p-1}}{\|g\|_p^{p-1}}\,O(\hat u^+_e + \hat u^-_e) \le O(m^{3\eta + o(1)}U) \le m/2.$$
As before, while this argument is carried out for the non-quadratically-extended function while we are optimizing the quadratically extended one, since our $\rho^+_e, \rho^-_e \le 1/2$
the minimizers are the same, and hence the above argument works.

Now, provided that we can show how to solve Equation 4.1 in almost-linear time, we are done. This is because we run the algorithm for $m^{1/2 - \eta}$ iterations and the $\ell_1$ norm of the weights increases by at most $m^{3\eta + o(1)}U$ in each iteration. Hence the final weights satisfy $\|w\|_1 \le 2m + m^{1/2 + 2\eta + o(1)}U \le 3m$, so we can use Lemma 3.1 throughout the course of our algorithm. Also, as mentioned above, notice that the flow $\hat f$ that we augment by in every iteration is just the solution to the potential decrement problem with the new weights. Hence, from the argument in Section 3, we always maintain the well-coupling condition.

To show that we can solve the problem in Equation 4.1, we appeal to the work of [KPSW19]. As mentioned above, their work establishes Lemma 2.2 and then shows that for any function which can be sandwiched in that form, plus a quadratic term which is the same on both sides, one can just minimize the resulting upper bound to get a solution to the optimization problem with quasi-polynomially low error. Hence, we will focus on showing that the objective function of our problem can also be sandwiched into terms of this form; appealing to their algorithm then gives a high-accuracy solution to our problem in almost-linear time. The first issue that arises is that, strictly speaking, their algorithm only works for minimizing objectives of the form
$$OPT = \min_{B^\top f = \chi} \sum_{e \in E} g_e f_e + r_e f_e^2 + |f_e|^p,$$
whereas in our objective the $p$-norm part is not raised to the $p$-th power but is just the $p$-norm itself. The solution for this, however, was already given by Liu-Sidford [LS20b], who show (Lemma B.3 in their paper) that for sufficiently nice functions, minimizers of problems of the form $\min f(x) + h(g(x))$ can be obtained to high accuracy if we can obtain minimizers of functions of the form $f(x) + g(x)$.
The conditions they require on the functions are also satisfied by ours; verifying this is a straightforward calculation following the proof in their paper [LS20b]. Hence, we can focus on just showing how to solve the following problem:
$$OPT = \min_{B^\top \hat f = \chi} \sum_{e \in E} \left(- w^+_e \log\left(1 - \frac{\hat f_e}{\hat u^+_e}\right) - w^-_e \log\left(1 + \frac{\hat f_e}{\hat u^-_e}\right) - \hat f_e\left(\frac{w^+_e}{\hat u^+_e} - \frac{w^-_e}{\hat u^-_e}\right)\right) + W\left(\sum_{e \in E} g_e(\hat f)^p\right)^{1/p},$$
where we solve the quadratically smoothed function with box size $\hat u_e(f)/2$ for each $e$, and $g_e(\hat f) = -(\hat u^+_e(f))^2 \log\left(1 - \frac{\hat f_e}{\hat u^+_e(f)}\right) - \hat u^+_e(f)\,\hat u^-_e(f)\log\left(1 + \frac{\hat f_e}{\hat u^-_e(f)}\right)$. Call the term in the sum for a given edge $e$ $val_e(\hat f)$, and the overall objective function $val(\hat f)$. In particular, we consider a single edge and prove the following lemma.

Lemma 4.4.
We have the following for any feasible $f$ and $\delta \ge 0$:
$$val_e(f) + \delta\,\partial_f val_e(f) + \frac{9}{20}\,\delta^2\left(\frac{w^+_e}{(\hat u^+_e - f)^2} + \frac{w^-_e}{(\hat u^-_e + f)^2}\right) + 2^{-O(p)}\left(f_e^{p-2}\delta^2 + \delta^p\right) \le val_e(f + \delta)$$
and
$$val_e(f + \delta) \le val_e(f) + \delta\,\partial_f val_e(f) + \frac{11}{20}\,\delta^2\left(\frac{w^+_e}{(\hat u^+_e - f)^2} + \frac{w^-_e}{(\hat u^-_e + f)^2}\right) + 2^{O(p)}\left(f_e^{p-2}\delta^2 + \delta^p\right),$$
where $\partial_x$ denotes the derivative of a function with respect to $x$.

We prove this lemma in Appendix A. Let $r_e = \frac{w^+_e}{(\hat u^+_e - f)^2} + \frac{w^-_e}{(\hat u^-_e + f)^2}$.

Lemma 4.5.
Given an initial point $f_0$ such that $B^\top f_0 = \chi$ and an almost-linear-time solver for the problem
$$\min_{B^\top \delta = 0} \sum_{e \in E} \delta_e \alpha_e + \frac{11}{20}\,2^{O(p)}\left((r_e + f_e^{p-2})\,\delta_e^2 + \delta_e^p\right),$$
where the vector $\alpha$ is the gradient of $val$ at the given point $f$, we can obtain an $\hat f$ in $\tilde O_p(1)$ calls to the solver such that $val(\hat f) \le OPT + 1/\mathrm{poly}\log(m)$.

The proof is similar to the proof of the iteration complexity of gradient descent for smooth and strongly convex functions, and follows from [LFN18, KPSW19]. Note that [KPSW19] give an almost-linear-time solver for exactly the subproblem in the above lemma provided the resistances are quasipolynomially bounded, so we are done: Section D.1 of [LS20b] already proves that the resistances are quasipolynomially bounded.
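The loop behind Lemma 4.5 is ordinary descent with a regularized subproblem oracle. A one-dimensional caricature, with no flow constraint and illustrative constants (ternary search stands in for the almost-linear-time solver of [KPSW19]; `K` is not their constant):

```python
# Minimize h(x) = g*x + r*x^2 + |x|^p by repeatedly solving the
# regularized subproblem  min_d  h'(x)*d + K*(r*d^2 + |d|^p).
g_coef, r, p, K = 1.0, 2.0, 4.0, 120.0

def h(x):
    return g_coef * x + r * x * x + abs(x) ** p

def h_prime(x):
    return g_coef + 2 * r * x + p * abs(x) ** (p - 1) * (1 if x >= 0 else -1)

def solve_subproblem(alpha):
    # ternary search on the strictly convex 1-D subproblem
    lo, hi = -10.0, 10.0
    sub = lambda d: alpha * d + K * (r * d * d + abs(d) ** p)
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if sub(m1) < sub(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

x = 5.0                                   # feasible starting point
for _ in range(1000):
    x += solve_subproblem(h_prime(x))

# compare against a brute-force grid minimum
x_opt = min((h(t / 1000.0), t / 1000.0) for t in range(-2000, 2001))[1]
assert abs(h(x) - h(x_opt)) < 1e-3
```

Because the regularizer majorizes the Taylor remainder of $h$ on the relevant range, every step decreases $h$, giving linear convergence at a rate governed by `K`; this is the smooth/strongly-convex mechanism the lemma invokes via [LFN18].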
In this paper, we showed how to use steps inspired by potential reduction IPMs to solve max flow in directed graphs in $m^{4/3 + o(1)}U^{1/3}$ time. We believe our framework of taking the step corresponding to the maximum decrease of the potential function may be useful for other problems, including $\ell_p$ norm minimization. In particular, can one set up a homotopy path for which steps are taken according to a potential function? Presumably, if this can be done, it might also offer hints for how to use ideas corresponding to different homotopy paths induced by other potential functions (rather than the central path we consider) to solve max flow faster. Finally, there is no reason to believe that selecting weight changes so as to maximize the potential decrement is the best way to change weights; one might obtain a faster algorithm from another strategy which establishes tighter control on weight changes. A question along the way to such a strategy might be to understand how the potential decrement optimum changes as we change weights/resistances. Such an analog for the change in energy of an electrical flow as we change resistances is used in [CKM+
11, Mad16, LS20b]. Another open problem that remains is obtaining faster algorithms for max flow on weighted graphs with logarithmic dependence on $U$, as opposed to the polynomial dependence in this paper.

Acknowledgements

We would like to thank Jelena Diakonikolas, Yin Tat Lee, Yang Liu, Aaron Sidford and Daniel Spielman for helpful discussions. We also thank Jelani Nelson for several helpful suggestions regarding the presentation of the paper.
References

[AKPS19] Deeksha Adil, Rasmus Kyng, Richard Peng, and Sushant Sachdeva, Iterative refinement for ℓp-norm regression, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2019.
[Ans96] Kurt Anstreicher, Potential reduction algorithms.
[AS20] Deeksha Adil and Sushant Sachdeva, Faster p-norm minimizing flows, via smoothed q-norm problems, Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2020.
[BCLL18] Sébastien Bubeck, Michael B. Cohen, Yin Tat Lee, and Yuanzhi Li, An homotopy method for ℓp regression provably beyond self-concordance and in input-sparsity time, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC, 2018.
[BNO03] Dimitri P. Bertsekas, Angelia Nedić, and Asuman E. Ozdaglar, Convex analysis and optimization, Athena Scientific, 2003.
[CKM+11] Paul Christiano, Jonathan A. Kelner, Aleksander Madry, Daniel A. Spielman, and Shang-Hua Teng, Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs, Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC, 2011.
[CKM+14] Michael B. Cohen, Rasmus Kyng, Gary L. Miller, Jakub W. Pachocki, Richard Peng, Anup B. Rao, and Shen Chen Xu, Solving SDD linear systems in nearly m log^{1/2} n time, Symposium on Theory of Computing, STOC, 2014.
[CLS19] Michael B. Cohen, Yin Tat Lee, and Zhao Song, Solving linear programs in the current matrix multiplication time, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC, 2019.
[CMSV17] Michael B. Cohen, Aleksander Madry, Piotr Sankowski, and Adrian Vladu, Negative-weight shortest paths and unit capacity minimum cost flow in Õ(m^{10/7} log W) time, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2017.
[DS08] Samuel I. Daitch and Daniel A. Spielman, Faster approximate lossy generalized flow via interior point algorithms, Proceedings of the 40th Annual ACM Symposium on Theory of Computing, STOC, 2008.
[GR98] Andrew V. Goldberg and Satish Rao, Beyond the flow decomposition barrier, J. ACM 45 (1998), no. 5, 783–797.
[KLOS14] Jonathan A. Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford, An almost-linear-time algorithm for approximate max flow in undirected graphs, and its multicommodity generalizations, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2014.
[KLP+16] Rasmus Kyng, Yin Tat Lee, Richard Peng, Sushant Sachdeva, and Daniel A. Spielman, Sparsified cholesky and multigrid solvers for connection laplacians, Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC, 2016.
[KMP14] Ioannis Koutis, Gary L. Miller, and Richard Peng, Approaching optimality for solving SDD linear systems, SIAM J. Comput. (2014), no. 1, 337–354.
[KOSA13] Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, and Zeyuan Allen Zhu, A simple, combinatorial algorithm for solving SDD systems in nearly-linear time, Symposium on Theory of Computing Conference, STOC, 2013.
[KPSW19] Rasmus Kyng, Richard Peng, Sushant Sachdeva, and Di Wang, Flows in almost linear time via adaptive preconditioning, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC, 2019.
[KS16] Rasmus Kyng and Sushant Sachdeva, Approximate gaussian elimination for laplacians: fast, sparse, and simple, IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS, 2016.
[LFN18] Haihao Lu, Robert M. Freund, and Yurii Nesterov, Relatively smooth convex optimization by first-order methods, and applications, SIAM Journal on Optimization 28 (2018), no. 1, 333–354.
[LRS13] Yin Tat Lee, Satish Rao, and Nikhil Srivastava, A new approach to computing maximum flows using electrical flows, Symposium on Theory of Computing Conference, STOC, 2013.
[LS14] Yin Tat Lee and Aaron Sidford, Path finding methods for linear programming: Solving linear programs in Õ(√rank) iterations and faster algorithms for maximum flow, 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS, 2014.
[LS20a] Yang Liu and Aaron Sidford, Faster divergence maximization for faster maximum flow, arXiv preprint, 2020.
[LS20b] Yang P. Liu and Aaron Sidford, Faster energy maximization for faster maximum flow, Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC, 2020.
[LSZ19] Yin Tat Lee, Zhao Song, and Qiuyi Zhang, Solving empirical risk minimization in the current matrix multiplication time, Conference on Learning Theory, COLT, 2019.
[Mad13] Aleksander Madry, Navigating central path with electrical flows: From flows to matchings, and back, 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS, 2013.
[Mad16] Aleksander Madry, Computing maximum flow with augmenting electrical flows, IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS, 2016.
[Nes04] Yurii Nesterov, Introductory lectures on convex optimization: A basic course, Kluwer Academic Publishers, 2004.
[NN94] Yurii E. Nesterov and Arkadii Nemirovskii, Interior-point polynomial algorithms in convex programming, SIAM Studies in Applied Mathematics, vol. 13, SIAM, 1994.
[Pen16] Richard Peng, Approximate undirected maximum flows in O(m polylog(n)) time, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2016, pp. 1862–1867.
[PS14] Richard Peng and Daniel A. Spielman, An efficient parallel solver for SDD linear systems, Symposium on Theory of Computing, STOC, 2014.
[Ren01] James Renegar, A mathematical view of interior-point methods in convex optimization, MPS-SIAM Series on Optimization, SIAM, 2001.
[She13] Jonah Sherman, Nearly maximum flows in nearly linear time, 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS, 2013.
[She17a] Jonah Sherman, Area-convexity, ℓ∞ regularization, and undirected multicommodity flow, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC, 2017.
[She17b] Jonah Sherman, Generalized preconditioning and undirected minimum-cost flow, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2017.
[ST14] Daniel A. Spielman and Shang-Hua Teng, Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems, SIAM J. Matrix Analysis Applications (2014), no. 3, 835–885.
[Tod96] Michael J. Todd, Potential-reduction methods in mathematical programming, Math. Program. (1996), 3–45.
[Tun94] Levent Tunçel, Constant potential primal-dual algorithms: A framework, Math. Program. (1994), 145–159.
[Tun95] Levent Tunçel, On the convergence of primal-dual interior-point methods with wide neighborhoods, Comp. Opt. and Appl. (1995), no. 2, 139–158.
[vdBLSS20] Jan van den Brand, Yin Tat Lee, Aaron Sidford, and Zhao Song, Solving tall dense linear programs in nearly linear time, STOC, 2020.
A Missing Proofs
Proof. [of Lemma 4.2] We follow the strategy used in the proof of Lemma 3.4. Recall that the problem we are trying to understand is
$$\min_{B^\top \hat f = \delta\chi}\ \Delta\Phi_w(f, \hat f) + W\left[\sum_{e \in E} g_e(\hat f)^p\right]^{1/p}$$
where $g_e(\hat f) = -(\hat u^+_e(f))^2\log\left(1 - \frac{\hat f_e}{\hat u^+_e(f)}\right) - \hat u^+_e(f)\,\hat u^-_e(f)\log\left(1 + \frac{\hat f_e}{\hat u^-_e(f)}\right)$. As in Lemma 3.4, we consider a flow $f'$ which sends $2\delta/m$ units of flow on each of the $m/2$ preconditioned edges. Certainly, the objective value of the above function at $\hat f$ is no larger than its value at $f'$. For the first term $\Delta\Phi_w(f, f')$, running the same argument as in Lemma 3.4, we get that
$$\Delta\Phi_w(f, f') \le \|w\|_1\left(\frac{\delta}{F^* - F}\right)^2 \le 0.03\, m^{2\eta}.$$
For the second term, we use $-\log(1-x) \le x + x^2$ and $-\log(1+x) \le -x + x^2$ to get that $g_e(f') \le (f'_e)^2\left(1 + \frac{\hat u^+_e(f)}{\hat u^-_e(f)}\right) \le 2(f'_e)^2$, where we have used that $\hat u^+_e(f) \le \hat u^-_e(f)$. Now, since there is non-zero flow only on the preconditioned edges, we get that
$$W\left[\sum_{e \in E} g_e(f')^p\right]^{1/p} \le W\left(\frac{2\delta}{m}\right)^2 m^{o(1)} \le m^{4\eta + o(1)}\left(\frac{F^* - F}{m \cdot m^{1/2 - \eta}}\right)^2 \le 0.01\, m^{2\eta},$$
using $p = \sqrt{\log m}$, the fact that $F^* - F \le 2mU$, the value of $\delta$, and the value of $\eta$. Hence, combining the two, we get that the objective value at $\hat f$ is at most $0.04\, m^{2\eta}$. As the objective function is made up of two non-negative quantities, we can obtain two inequalities from this upper bound by dropping one term from the objective value each time. For the second part of the lemma, we ignore the first term of the objective function and lower bound the second using $x + x^2/4 \le -\log(1-x)$ and $-x + x^2/4 \le -\log(1+x)$:
$$0.04\, m^{2\eta} \ge W\left[\sum_{e \in E} g_e(\hat f)^p\right]^{1/p} \ge W\,|g_e(\hat f)| \ge W\,\frac{\hat f_e^2}{4}\left(1 + \frac{\hat u^+_e(f)}{\hat u^-_e(f)}\right) \ge W\,\frac{\hat f_e^2}{4}.$$
This gives us that $|\hat f_e| \le 0.4\, m^{-\eta}$ by plugging in the value of $W = m^{4\eta}$. For the first part, assume for the sake of contradiction that $\rho_e > 1/2$
; otherwise we are done. Now, dropping the second term, we want to establish that $1/\hat u_e(f) \le 2\,m^{\eta}$, which we do by a proof similar to that of Lemma 4.3 in [Mad16]. Using the argument as in Lemma 3.4, we get for an edge $e = (u, v)$,
$$0.04\, m^{2\eta} \ge \Delta\Phi_w(f, \hat f) \ge \frac{1}{8}\sum_{e \in E} \hat f_e\left(\frac{w^+_e}{u^+_e - f_e - \hat f_e} - \frac{w^-_e}{u^-_e + f_e + \hat f_e} - \frac{w^+_e}{u^+_e - f_e} + \frac{w^-_e}{u^-_e + f_e}\right) = \frac{1}{8}\,\hat f^\top B\hat y = \frac{\delta}{8}\,\chi^\top \hat y,$$
where the equalities follow from the optimality and feasibility conditions of the potential decrement problem respectively, and $\delta \ge 1/10$ by the condition that we run the program while the flow left to augment is at least $m^{1/2 - \eta}$. Moreover, for the edge in question,
$$\hat y_u - \hat y_v = \hat f_e\left(\frac{w^+_e}{(\hat u^+_e - \hat f_e)\,\hat u^+_e} + \frac{w^-_e}{(\hat u^-_e + \hat f_e)\,\hat u^-_e}\right) \ge \frac{\rho_e}{2\,\hat u_e(f)} > \frac{1}{4\,\hat u_e(f)}$$
under the assumption $\rho_e > 1/2$. Combining these bounds with $\hat y_s - \hat y_t \ge \hat y_u - \hat y_v$ (argued below) and the value of $\eta$, this implies that $1/\hat u_e(f) \le 2\,m^{\eta}$. Multiplying this with $|\hat f_e| \le 0.4\, m^{-\eta}$, we get that $\rho_e \le 1/2$, a contradiction, which finishes the proof. We still need to argue the inequality $\hat y_s - \hat y_t \ge \hat y_u - \hat y_v$. The optimality conditions give
$$\hat y_u - \hat y_v = \hat f_e\left(\frac{w^+_e}{(u^+_e - f_e - \hat f_e)(u^+_e - f_e)} + \frac{w^-_e}{(u^-_e + f_e)(u^-_e + f_e + \hat f_e)}\right),$$
and noticing that the quantity in brackets on the right-hand side is non-negative tells us that the potential falls along the direction of the flow. This, along with the observation that the sum of potential differences around a directed cycle is zero, tells us that the graph induced by the flow $\hat f$ alone is a DAG. Since it is a DAG, it can be decomposed into disjoint $s$-$t$ paths along which flow is sent, and every edge belongs to one of these paths.
Hence, the potential difference across an edge is at most the potential difference across the whole path, which is the potential difference between $s$ and $t$, and we are done. As before, all these arguments go through with the quadratically smoothed function in place of the original one and still yield the same bounds, and since $\rho_e \le 1/2$, the minimizers of the two are the same, which completes the proof.

Proof. [of Lemma 4.4] Note that while we are solving the quadratically smoothed version of the problem, we can assume we solve the non-smoothed version in the box corresponding to a congestion of at most $1/2$, as the extension is $C^2$ and ensures that any inequalities we need henceforth (up to the second-order terms) are bounded as well. There are two terms, one corresponding to the potential decrement and the other a similar expression raised to the $p$-th power. We tackle the first term first; this is easily done using Taylor's theorem. The function is $g(x + y) = -\log(1 - (x+y)/u) - (x+y)/u$. Computing the first two derivatives with respect to $y$, we get $g'(x+y) = \frac{1}{u - x - y} - \frac{1}{u}$ and $g''(x+y) = \frac{1}{(u - x - y)^2}$. Now, using Taylor's theorem, we get that
$$g(x + y) = g(x) + g'(x)\,y + \frac{1}{2}\,g''(x + \zeta)\,y^2 = g(x) + y\left(\frac{1}{u - x} - \frac{1}{u}\right) + \frac{y^2}{2}\cdot\frac{1}{(u - x - \zeta)^2}$$
for some $\zeta$ between $0$ and $y$, which easily gives us the bound
$$g(x) + y\left(\frac{1}{u - x} - \frac{1}{u}\right) + \frac{9}{20}\,\frac{y^2}{(u - x)^2} \le g(x + y) \le g(x) + y\left(\frac{1}{u - x} - \frac{1}{u}\right) + \frac{11}{20}\,\frac{y^2}{(u - x)^2}.$$
Similarly for $-\log(1 + x/u) + x/u$. Now, for the second term, we largely follow the strategy of [KPSW19]. For the $p$-th order term, we have the function $g(x) = -(u^+)^2\log(1 - x/u^+) - u^+u^-\log(1 + x/u^-)$.
We first use Lemma 2.2 with $f_i = g(x)$ and $\delta_i = g(x+y) - g(x)$ to get
$$g(x+y)^p \le g(x)^p + p\,g(x)^{p-1}\left(g(x+y) - g(x)\right) + 2^{O(p)}\left(g(x)^{p-2}(g(x+y) - g(x))^2 + (g(x+y) - g(x))^p\right).$$
Now, adding and subtracting $p\,g(x)^{p-1}\,y\,g'(x)$ on the right-hand side, and noticing that $g(x+y) - g(x) - y\,g'(x)$ is second order in $y$ and can be absorbed into the error term, we get
$$g(x+y)^p \le g(x)^p + p\,y\,g(x)^{p-1}g'(x) + 2^{O(p)}\left(g(x)^{p-2}(g(x+y) - g(x))^2 + (g(x+y) - g(x))^p\right).$$
Now, notice that using the inequalities for $\log(1 - x/u)$ and $\log(1 + x/u)$, we get $x^2/4 \le g(x) \le 2x^2$, and we also use Taylor's theorem to get $g(x+y) - g(x) \le 2(|xy| + y^2)$, so that
$$g(x+y)^p \le g(x)^p + p\,y\,g(x)^{p-1}g'(x) + 2^{O(p)}\left(x^{2(p-2)}(x^2y^2 + y^4) + 2^{p-1}(x^py^p + y^{2p})\right) \le g(x)^p + p\,y\,g(x)^{p-1}g'(x) + 2^{O(p)}\left(x^{2p-2}y^2 + y^{2p}\right),$$
where we have used $(a+b)^p \le 2^{p-1}(a^p + b^p)$ and that $|y| \le |x|$.