Monochromatic Triangles, Intermediate Matrix Products, and Convolutions
Andrea Lincoln † MIT [email protected]
Adam Polak ‡ Jagiellonian University [email protected]
Virginia Vassilevska Williams § MIT [email protected]
Abstract
The most studied linear algebraic operation, matrix multiplication, has surprisingly fast O(n^ω) time algorithms for ω < 2.373. On the other hand, the (min,+) matrix product, which is at the heart of many fundamental graph problems such as All-Pairs Shortest Paths, has received only minor n^{o(1)} improvements over its brute-force cubic running time and is widely conjectured to require n^{3−o(1)} time. There is a plethora of matrix products and graph problems whose complexity seems to lie in the middle of these two problems. For instance, the Min-Max matrix product, the Minimum Witness matrix product, All-Pairs Shortest Paths in directed unweighted graphs and determining whether an edge-colored graph contains a monochromatic triangle, can all be solved in Õ(n^{(3+ω)/2}) time. While slight improvements are sometimes possible using rectangular matrix multiplication, if ω = 2, the best runtimes for these “intermediate” problems are all Õ(n^{2.5}).

A similar phenomenon occurs for convolution problems. Here, using the FFT, the usual (+,×)-convolution of two n-length sequences can be computed in O(n log n) time, while the (min,+)-convolution is conjectured to require n^{2−o(1)} time, the brute-force running time for convolution problems. There are analogous intermediate problems that can be solved in O(n^{1.5}) time, but seemingly not much faster: Min-Max convolution, Minimum Witness convolution, etc.

Can one improve upon the running times for these intermediate problems, in either the matrix product or the convolution world? Or, alternatively, can one relate these problems to each other and to other key problems in a meaningful way?

This paper makes progress on these questions by providing a network of fine-grained reductions.
We show for instance that APSP in directed unweighted graphs and the Minimum Witness product can be reduced to both the Min-Max product and a variant of the monochromatic triangle problem, so that a significant improvement over n^{(3+ω)/2} time for any of the latter problems would result in a similar improvement for both of the former problems. We also show that a natural convolution variant of monochromatic triangle is fine-grained equivalent to the famous 3SUM problem. As this variant is solvable in O(n^{1.5}) time and 3SUM is in O(n^2) time (and is conjectured to require n^{2−o(1)} time), our result gives the first fine-grained equivalence between natural problems of different running times. We also relate 3SUM to monochromatic triangle, and a coin change problem to monochromatic convolution, and thus to 3SUM.

∗ We would like to thank Amir Abboud for fruitful discussions at an early stage of our research. Part of the research was done when the second author was visiting MIT. A preliminary version of this paper was presented at ITCS 2020. † Partially supported by NSF Grant CCF-1909429. ‡ Partially supported by the National Science Center, Poland under grants 2017/27/N/ST6/01334 and 2018/28/T/ST6/00305. § Supported by an NSF CAREER Award, NSF Grants CCF-1528078, CCF-1514339 and CCF-1909429, a BSF Grant BSF:2012338, a Google Research Fellowship and a Sloan Research Fellowship.

Introduction
Matrix multiplication is arguably the most fundamental linear algebraic operation. It is an important primitive for an enormous variety of applications. Within algorithmic research it has a very special role since it is one of the few problems for which we have surprisingly fast and completely counter-intuitive algorithms. Starting with Strassen’s breakthrough [37] in 1969, a long line of research culminated in the current bound ω < 2.373 [43, 31], where ω is the smallest real number so that n × n matrix multiplication can be performed in O(n^{ω+ε}) time for all ε > 0.

Fast matrix multiplication, however, does not directly apply to matrix products that are not over a ring (i.e., products not of the form C_ij = Σ_k A_ik · B_kj). Such examples include matrix products over semirings, such as the (min,+)-product (often called the distance product), which is over the tropical ((min,+)) semiring, and the Max-Min product, which is over the (max,min)-semiring. Both these products are equivalent to certain types of path optimization problems in graphs. The distance product of n × n matrices is equivalent to the All-Pairs Shortest Paths (APSP) problem in n-node graphs, so that a T(n) time algorithm for one problem would imply an O(T(n)) time algorithm for the other [22]. Similarly, the Max-Min product is equivalent to the so-called All-Pairs Bottleneck Paths (APBP) problem in graphs (e.g. [36]).

There seems to be a distinct complexity difference between APSP and APBP (and hence the corresponding matrix products), however. The fastest algorithms for APSP and the distance product run in n^3/2^{Θ(√log n)} time [47], which is only better by an n^{o(1)} factor than the trivial cubic time algorithm for the distance product. Meanwhile, as was first shown by [39, 40], APBP and the Max-Min product admit a much faster than cubic time algorithm via a reduction to (normal) matrix multiplication; the fastest running time is O(n^{(3+ω)/2}) [20].

APSP is in fact conjectured to not admit any truly subcubic, O(n^{3−ε}) time algorithms for ε > 0. Meanwhile, the best known running time for the n × n Max-Min product, Õ(n^{(3+ω)/2}), while nontrivially subcubic, seems difficult to improve upon. In fact, Õ(n^{(3+ω)/2}) is the best known running time for many other matrix and graph problems besides the Max-Min product: the Dominance product [33] and Equality product [48, 30], All-Pairs Nondecreasing Paths (APNP) and the (min,≤)-product [38, 42, 19].
For some of these problems [50, 25] one can obtain slightly improved running times using rectangular matrix multiplication [24]. However, the closer ω is to 2, the smaller the improvements, and when ω = 2, the Õ(n^{(3+ω)/2}) = Õ(n^{2.5}) running time is the best known for all of these problems. Since their running time exponent is essentially the average of the brute-force exponent 3 and the fast matrix multiplication exponent ω, we will call these problems “intermediate”.

The next two problems, which are intermediate if ω = 2, are the Minimum Witness product, which is related to the problem of computing All-Pairs Least Common Ancestors in a DAG, and All-Pairs Shortest Paths (APSP) in unweighted directed graphs. For both problems we know algorithms running in Õ(n^{(3+ω)/2}) ≤ Õ(n^{2.687}) time [6, 2], and both algorithms can be improved upon by using rectangular matrix multiplication [16, 51]. The improvement is already seen in a naive implementation, i.e. cutting rectangular matrices into square blocks, which gives an Õ(n^{2+1/(4−ω)}) ≤ Õ(n^{2.615}) time. Employing a specialized rectangular matrix multiplication algorithm [24] brings the runtime down to Õ(n^{2.529}). When ω = 2, however, all the improvements vanish and those running times become Õ(n^{2.5}).

Is the 2.5 running time exponent (for ω = 2) for all of these problems a coincidence, or can we relate all of them via fine-grained reductions, and use plausible hypotheses to explain it?

This is a question that many have asked, but unfortunately there are only two partial answers. First, it is known that the Equality product and Dominance product are equivalent ([48, 30], also follows from Proposition 3.4 in [46]), and that they are equivalent to All-Pairs ℓ_{2p+1} Distances [30]. The second result is that the Max-Min product is equivalent to approximate APSP in weighted graphs without scaling [10].
The main question above remains wide open.

Parallel to the world of matrix products, there is a very similar landscape of convolution problems. While it is well-known that the (+,×)-convolution of two n-length vectors can be computed in O(n log n) time using the Fast Fourier Transform (FFT), these techniques no longer work for the (min,+)-convolution, and this problem is conjectured to require n^{2−o(1)} time (see e.g. [15]). Similar to the “intermediate” matrix product problems, there are analogous “intermediate” convolution problems, all in Õ(n^{3/2}) time: Max-Min convolution, Dominance convolution, Minimum Witness convolution, etc.

The convolution landscape is even somewhat cleaner than the matrix product one. As the normal ((+,×)) convolution is already in (near-)linear time, there are no analogues of rectangular matrix multiplication speedups, and all intermediate problems happen to have exactly the same running time (up to polylogarithmic factors). Still, there is no real formal explanation of why they have the same running time. The only reductions between these convolutions are analogous to the matrix product ones: Dominance convolution is equivalent to Equality convolution [30], and approximate (min,+)-convolution is equivalent to exact Max-Min convolution [10].

In this paper we provide new fine-grained reductions between several intermediate matrix product and all-pairs graph problems, and between intermediate convolution problems, also relating these to other key problems from fine-grained complexity such as 3SUM. See Figure 1 for a pictorial representation of our results.
Reductions for Graph Problems and Matrix Products.
Several of our reductions concern the All-Edges Monochromatic Triangle (AE-Mono∆) problem: Given an n-node graph in which each edge has a color from 1 to n, decide for each edge whether it belongs to a monochromatic triangle, a triangle whose all three edges have the same color. Vassilevska, Williams and Yuster [41] studied the decision variant of AE-Mono∆ in which one asks whether the given graph contains a monochromatic triangle. They provided an O(n^{(3+ω)/2}) time algorithm for the decision problem, but that algorithm is in fact strong enough to also solve the all-edges variant AE-Mono∆, making AE-Mono∆ one of the “intermediate” problems of interest.

To obtain their O(n^{(3+ω)/2}) time algorithm, Vassilevska, Williams and Yuster [41] implicitly reduce AE-Mono∆ (in a black-box way) to the AE-Sparse∆ problem of deciding for every edge e in an m-edge graph whether e is in a triangle. The fastest known algorithm for AE-Sparse∆ is by Alon, Yuster and Zwick [3], running in O(m^{2ω/(ω+1)}) time, and the problem is known to be runtime equivalent to the problem of listing up to m triangles in an m-edge graph [21]. The black-box reduction of [41] from AE-Mono∆ to AE-Sparse∆ implies that a significant improvement over the O(m^{2ω/(ω+1)}) time for AE-Sparse∆ would translate to an improvement over O(n^{(3+ω)/2}) for AE-Mono∆.

(The (+,×)-convolution of two vectors a and b is the vector c such that c_i = Σ_j a_j · b_{i−j}.)

Figure 1: Our results. An arrow pointing from problem A to problem B means that problem A reduces to problem B in the fine-grained sense. Solid arrows denote reductions which are tight with respect to the best currently known running times, i.e. improving by a polynomial factor over the best known running time for one problem implies a polynomial improvement over the best known running time for the other. Dashed arrows denote reductions which become tight when ω = 2. The reduction from CoinChange to MonoConvolution, denoted by a dotted arrow, is not tight.

Theorem 1 (implicit in [41]). If AE-Sparse∆ is in O(m^{2ω/(ω+1)−ε}) time, for some ε > 0, then AE-Mono∆ is in O(n^{(3+ω)/2−δ}) time, for some δ > 0.

Our first set of results shows that AE-Mono∆ is powerful enough to capture two well-studied intermediate problems: the Minimum Witness product of two Boolean matrices and the All-Pairs Shortest Paths problem in directed unweighted graphs.

The Minimum Witness product (
MinWitness) C of two Boolean matrices A and B is defined as C_ij = min{k | A_ik = B_kj = 1} (where the minimum is defined to be ∞ if there is no witness k). MinWitness is used, e.g., for determining for every pair u, v of vertices in a DAG the least common ancestor of u and v, i.e. solving the All-Pairs Least Common Ancestors problem [16]. The fastest known algorithm for MinWitness runs in O(n^{2.529}) time using rectangular matrix multiplication, and in O(n^{2+1/(4−ω)}) time just using square matrix multiplication [16].

The All-Pairs Shortest Paths (APSP) problem in unweighted graphs is very well-studied. While in undirected graphs the problem is known to be solvable in Õ(n^ω) time [35], the problem in directed graphs is one of our intermediate problems. Its fastest algorithm (similarly to MinWitness) runs in O(n^{2.529}) time using rectangular matrix multiplication, and in Õ(n^{2+1/(4−ω)}) time just using square matrix multiplication [51]. We will refer to the APSP problem in directed unweighted graphs as UnweightedAPSP.

We present reductions from
MinWitness and
UnweightedAPSP to AE-Mono∆ with only polylogarithmic overhead.
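To pin down the two problems involved, here is a minimal brute-force reference implementation of AE-Mono∆ and of the Minimum Witness product. This is only a cubic-time definition check, not the paper's algorithms; the function names and the edge-dictionary input format are our own conventions.

```python
from collections import defaultdict

def ae_mono_triangle(edges):
    """Brute-force AE-Mono∆: `edges` maps (u, v) with u < v to a color.
    Returns the set of edges that lie in a monochromatic triangle."""
    adj = defaultdict(dict)
    for (u, v), c in edges.items():
        adj[u][v] = c
        adj[v][u] = c
    in_mono = set()
    for (u, v), c in edges.items():
        for w in adj[u]:
            # (u, v, w) is a monochromatic triangle iff both other edges
            # exist and carry the same color c
            if adj[u][w] == c and adj[v].get(w) == c:
                in_mono.add((u, v))
                break
    return in_mono

def min_witness(A, B):
    """Brute-force Minimum Witness product of Boolean matrices:
    C[i][j] = min{k : A[i][k] = B[k][j] = 1}, or None if no witness."""
    n = len(A)
    C = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):  # scan witnesses in increasing order
                if A[i][k] and B[k][j]:
                    C[i][j] = k
                    break
    return C
```

For example, in a graph where edges (0,1), (1,2), (0,2) all have color 1 and edge (2,3) has color 2, exactly the three color-1 edges belong to a monochromatic triangle.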
Theorem 2. If AE-Mono∆ is in T(n) time, then MinWitness is in O(T(n) log n) time.

Theorem 3. If AE-Mono∆ is in T(n) time, then UnweightedAPSP is in O(T(n) log² n) time.

The above reductions tightly relate
MinWitness and
UnweightedAPSP to AE-Mono∆ if ω = 2, showing that any improvement over the 2.5 exponent for AE-Mono∆ gives the same improvement for
MinWitness and
UnweightedAPSP. Due to the tight reduction (Theorem 1) from AE-Mono∆ to AE-Sparse∆, we also obtain that an O(m^{4/3−ε}) time algorithm, with ε >
0, for AE-Sparse∆ would give O(n^{2.5−δ}) time algorithms, for δ >
0, for
MinWitness and
UnweightedAPSP, presenting another tight relationship for the case when ω = 2.

Our next result is that improving over the exponent 2.5 for AE-Mono∆ is at least as hard as obtaining a truly subquadratic time algorithm for the 3SUM problem.
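For context, 3SUM (formally recalled in the preliminaries: given lists A, B, C of n integers, decide whether a + b = c for some a ∈ A, b ∈ B, c ∈ C) has an easy quadratic-time baseline. A minimal sketch, with a hash set over the target list (the function name is ours):

```python
def three_sum(A, B, C):
    """Decide whether some a in A, b in B, c in C satisfy a + b = c.
    O(n^2) expected time: try all pairs (a, b), look up a + b in a set."""
    targets = set(C)
    for a in set(A):
        for b in set(B):
            if a + b in targets:
                return True
    return False
```

The fine-grained question is whether the quadratic exponent of this baseline can be polynomially beaten; the 3SUM Hypothesis posits it cannot.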
Theorem 4. If AE-Mono∆ is in O(n^{5/2−ε}) time, for some ε > 0, then 3SUM is in (randomized) Õ(n^{2−(4/5)ε}) time.

In 3SUM one is given n integers and is asked whether three of them sum to 0. The problem is easy to solve in O(n^2) time, and slightly subquadratic time algorithms exist [4, 11]. 3SUM is a central problem in fine-grained complexity [44]. It is hypothesized to require n^{2−o(1)} time (on a word-RAM with O(log n) bit words), and many fine-grained hardness results are conditioned on this hypothesis (see [23, 44]). Our reduction shows that, under the 3SUM Hypothesis, the exponent 2.5 for AE-Mono∆ cannot be beaten, and this is tight if ω = 2. We note that before our work no intermediate matrix, graph, or convolution problem was known to be 3SUM-hard.

Next, we consider the Min-Max product (Min-Max) of two matrices A and B, defined as C_ij = min_k max(A_ik, B_kj). The Min-Max product is equivalent to the aforementioned Max-Min product (just negate the matrix entries) and the All-Pairs Bottleneck Paths problem, and is thus solvable in O(n^{(3+ω)/2}) time [20].

A very simple folklore reduction shows that Min-Max on n × n integer matrices is at least as hard as MinWitness on n × n Boolean matrices, giving a tight relationship when ω = 2.

Theorem 5 (folklore). If Min-Max is in T(n) time, then MinWitness is in O(T(n)) time.

Our next result states that the All-Pairs Shortest Paths problem in directed unweighted graphs (
UnweightedAPSP ) is also tightly reducible to
Min-Max. This gives a second intermediate problem that is at least as hard as both
MinWitness and
UnweightedAPSP.

Theorem 6. If Min-Max is in T(n) time, then UnweightedAPSP is in O(T(n) log n) time.

The above theorem also follows from a recent independent result by Barr, Kopelowitz, Porat and Roditty [5]. In particular, they reduce All-Pairs Shortest Paths in directed graphs with edge weights from {−1, 0, 1} to Min-Max. Interestingly, they use a substantially different approach than ours. While their argument can be seen as inspired by Seidel’s algorithm for unweighted APSP in undirected graphs [35], ours resembles Zwick’s algorithm for directed graphs [51].
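The folklore reduction behind Theorem 5 is simple enough to sketch: encode each witness index k into the matrix entries, so that a single Min-Max product recovers the minimum witness. The code below is our illustration, with a naive cubic Min-Max product standing in for a fast one, and n playing the role of ∞ (any value larger than every witness index works):

```python
def min_max_product(A, B):
    """Brute-force Min-Max product: C[i][j] = min_k max(A[i][k], B[k][j])."""
    n = len(A)
    return [[min(max(A[i][k], B[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def min_witness_via_min_max(A, B):
    """Compute the Minimum Witness product of Boolean A, B by one call to
    Min-Max.  Entry k is kept as the value k when the bit is 1, and as
    INF otherwise, so max(A'[i][k], B'[k][j]) = k exactly when
    A[i][k] = B[k][j] = 1, and the min over k is the minimum witness."""
    n = len(A)
    INF = n
    Ap = [[k if A[i][k] else INF for k in range(n)] for i in range(n)]
    Bp = [[k if B[k][j] else INF for j in range(n)] for k in range(n)]
    C = min_max_product(Ap, Bp)
    return [[c if c < INF else None for c in row] for row in C]
```

The reduction makes one Min-Max call on matrices of the same dimensions, hence the O(T(n)) bound in Theorem 5.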
Reductions for Convolution Problems.
Our main result for convolution problems concerns the convolution version of AE-Mono∆, which we call
MonoConvolution: Given three integer sequences a, b, c, decide for each index i if there exists j such that a_j = b_{i−j} = c_i. We show that MonoConvolution is actually fine-grained equivalent to 3SUM.
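A quadratic-time brute force makes the definition of MonoConvolution concrete; the interesting algorithmic question is how far below this baseline one can go, down to the known Õ(n^{3/2}). A minimal sketch (our own reference code):

```python
def mono_convolution(a, b, c):
    """Brute-force MonoConvolution reference:
    d[i] = 1 iff there exists j with a[j] = b[i-j] = c[i]."""
    n = len(a)
    d = [0] * n
    for i in range(n):
        for j in range(i + 1):  # keep both indices j and i-j in range
            if a[j] == b[i - j] == c[i]:
                d[i] = 1
                break
    return d
```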
Theorem 7. If MonoConvolution is in O(n^{3/2−ε}) time, for some ε > 0, then 3SUM is in (randomized) Õ(n^{2−δ}) time, for some δ > 0.

Theorem 8. If 3SUM is in O(n^{2−ε}) time, then MonoConvolution is in Õ(n^{3/2−ε/(8−ε)}) time.

Theorems 7 and 8 together give a fine-grained equivalence between natural problems with different running time complexities: MonoConvolution is a problem in Õ(n^{3/2}) time, whereas 3SUM is in O(n^2) time, and a polynomial improvement on one of these running times would result in a polynomial improvement over the other. All previous fine-grained equivalences were between problems with the same running time exponent: the problems equivalent to APSP [45, 1] are all solvable in O(N^{1.5}) time where N is the size of their input, the problems equivalent to Orthogonal Vectors [13] or to (min,+)-convolution [15] are all in quadratic time, the problems equivalent to CNF-SAT [14] are all in O(2^n) time, etc. While tight fine-grained reductions between problems with different running times are well-known, there was no such equivalence until our result, largely since it often seems difficult to reduce a problem with a smaller asymptotic running time to one with a larger running time, something our Theorem 8 overcomes. Note that the same apparent difficulty is overcome by the reduction from AE-Mono∆ to AE-Sparse∆ in Theorem 1, as well as by the reductions from
MinWitness and
UnweightedAPSP to AE-Sparse∆, which follow from combining Theorems 2 and 3 with Theorem 1.

Theorem 8 together with Theorem 4 gives a reduction from
MonoConvolution to AE-Mono∆. Previously, reductions from a convolution problem to the corresponding graph/matrix problem were known only for problems with best known algorithms running in brute-force time, i.e. quadratic time for convolution and cubic time for product; e.g. (min,+)-convolution reduces to (min,+)-product [7].

Finally, we relate MonoConvolution to an unweighted variant of a coin change problem [49, 29] that is related to the minimum word break problem [8, 12]. Given a set of coin values from {1, 2, . . . , n}, the CoinChange problem asks to determine for each integer value up to n what is the minimum number of coins (allowing repetitions) that sum to that value. We reduce CoinChange to MonoConvolution with only a polylogarithmic overhead. A simple algorithm solves
CoinChange in Õ(n^{3/2}) time [9], and our reduction implies that any improvement over the known running times of MonoConvolution or 3SUM would also improve over the above running time for
CoinChange.

Following the publication of the conference version of this paper, Chan and He [12] gave a faster Õ(n^{4/3}) time algorithm for CoinChange. Therefore, our reduction is no longer tight with respect to the best currently known running times. In order to improve over Chan and He’s running time using our reduction one would need an O(n^{4/3−ε}) time algorithm for MonoConvolution.

Theorem 9. If MonoConvolution is in T(n) time, then CoinChange is in O(T(n) log² n) time.

Preliminaries

In this section we first recall formal definitions of all the problems involved in the reductions presented in the paper. We split these problems by their time complexity. At the end of the section we recall the self-reducibility property of 3SUM.

Õ(n^{(3+ω)/2}) time

Definition 10 (All-Edges Monochromatic Triangle, AE-Mono∆). Given an n-node graph G in which each edge has a color from 1 to n, decide for each edge whether it belongs to a monochromatic triangle, a triangle where all three edges have the same color.

Definition 11 (Min-Max matrix product,
Min-Max). Given two n × n matrices A and B, compute the matrix C such that C_ij = min_k max(A_ik, B_kj).

Definition 12 (Minimum Witness matrix product, MinWitness). Given two n × n Boolean matrices A and B, compute the matrix C such that C_ij = min({k | A_ik = B_kj = 1} ∪ {∞}).

Definition 13 (All-Pairs Shortest Paths in directed unweighted graphs,
UnweightedAPSP). Given an n-node unweighted directed graph G = (V, E), compute for each pair of vertices u, v ∈ V the length of a shortest path from u to v. Note that all path lengths will be in {0, 1, . . . , n − 1} ∪ {∞}.

O(m^{2ω/(ω+1)}) time

Definition 14 (All-Edges Sparse Triangle, AE-Sparse∆). Given an m-edge graph G, decide for each edge whether it belongs to a triangle.

O(n^2) time

Definition 15 (3SUM). Given three lists, A, B and C, of n integers, determine if there exist a ∈ A, b ∈ B, and c ∈ C such that a + b = c.

Let us note that the 3SUM problem is defined in several different ways in the literature. The variants differ as to whether the input is split into three lists or all the numbers are in a single list, and whether one looks for a + b = c or a + b + c = 0. All these variants are equivalent by simple folklore reductions.

Õ(n^{1.5}) time

Definition 16 (MonoConvolution). Given three sequences a, b, c, all of length n, compute the sequence d such that d_i = 1 if there exists j such that a_j = b_{i−j} = c_i, and d_i = 0 otherwise.

Õ(n^{4/3}) time

Definition 17 (CoinChange). Given a set of coin values C ⊆ {1, 2, . . . , n}, assume you have for each c ∈ C an infinite supply of coins of value c, and determine for each v ∈ {1, 2, . . . , n} the minimum number of coins that sums up to v.

CoinChange can be easily solved in Õ(n^{1.5}) time [9]. The algorithm splits the coins into heavy coins, with value at least √n, and light coins, with value less than √n. The minimum sum for a value can use at most √n heavy coins. By running FFT √n times the algorithm produces a vector with the minimum number of heavy coins needed to sum to every value. That takes O(n^{1.5} log n) time in total. Then a classical dynamic programming algorithm is run for the √n light coins and n values, in O(n^{1.5}) time.

For a more involved Õ(n^{4/3}) time algorithm refer to [12].
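The heavy/light algorithm sketched above can be made concrete as follows. This is our illustrative sketch, not the paper's code: for simplicity the boolean convolutions of the heavy phase use big-integer bitset shifts instead of FFT, which is asymptotically slower but preserves the structure of the algorithm (at most about √n heavy-coin counts, then a classical dynamic program over the light coins).

```python
import math

def coin_change_all_targets(coins, n):
    """For every v in 0..n, the minimum number of coins summing to v
    (float('inf') if v is unreachable)."""
    s = max(1, math.isqrt(n))               # light/heavy threshold ~ sqrt(n)
    heavy = [c for c in set(coins) if c >= s]
    light = [c for c in set(coins) if 0 < c < s]
    INF = float('inf')
    best = [INF] * (n + 1)
    best[0] = 0

    # Heavy phase: any value <= n uses at most ~n/s heavy coins.  Bit v of
    # `reach` is set iff value v is reachable with the heavy coins used so far.
    limit = (1 << (n + 1)) - 1
    reach = 1
    for k in range(1, n // s + 2):          # k = number of heavy coins used
        nxt = 0
        for c in heavy:
            nxt |= reach << c               # boolean convolution with coin c
        newly = nxt & ~reach & limit        # values first reachable at step k
        v = newly
        while v:
            low = (v & -v).bit_length() - 1
            best[low] = k                   # k heavy coins suffice for `low`
            v &= v - 1
        reach |= newly

    # Light phase: classical DP over the < sqrt(n) light coins.
    for v in range(1, n + 1):
        for c in light:
            if best[v - c] + 1 < best[v]:
                best[v] = best[v - c] + 1
    return best
```

For instance, with coins {3, 5} and n = 10, value 9 requires three coins (3+3+3) and value 10 requires two (5+5), while 7 is unreachable.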
In our proofs of Theorems 4 and 7 we use the following fact about 3SUM.
Lemma 18.
For any α ∈ [0, 1], a single instance of 3SUM of size n can be reduced to O(n^{2α}) instances of 3SUM of size O(n^{1−α}) each. The reduction runs in time linear in the total size of the produced instances, and the original instance is a yes-instance if and only if at least one of the produced instances is a yes-instance.

First, let us recall the algorithm of Vassilevska, Williams and Yuster [41] for AE-Mono∆. We rephrase the argument so that it not only shows how to solve AE-Mono∆ in O(n^{(3+ω)/2}) time, but also proves that any polynomial improvement over the O(m^{2ω/(ω+1)}) time algorithm of Alon, Yuster and Zwick [3] for AE-Sparse∆ translates to a polynomial improvement for AE-Mono∆.

Theorem 1 (implicit in [41]). If AE-Sparse∆ is in O(m^{2ω/(ω+1)−ε}) time, for some ε > 0, then AE-Mono∆ is in O(n^{(3+ω)/2−δ}) time, for some δ > 0.

Proof. Assume AE-Sparse∆ is in O(m^α) time. Take an AE-Mono∆ instance. For each color consider the subgraph composed of all the edges of that color. Each such subgraph constitutes an independent instance of AE-Sparse∆. However, simply using the O(m^α) time algorithm on all of these instances is not efficient enough. Intuitively, some of the instances might be too dense.

Instead, for a parameter t to be determined later, take the t largest subgraphs (in terms of the number of edges). For each of them solve the problem by using fast matrix multiplication to compute the square of the adjacency matrix. This takes O(t n^ω) time in total. Let m_i denote the number of edges in the i-th of the remaining subgraphs. Clearly, ∀i m_i ≤ n²/t, and Σ_i m_i ≤ n². On each of those subgraphs use the O(m^α) time AE-Sparse∆ algorithm. This takes on the order of

Σ_i m_i^α = Σ_i m_i · m_i^{α−1} ≤ Σ_i m_i · (n²/t)^{α−1} ≤ n² · (n²/t)^{α−1}

time. The total runtime is thus O(t n^ω + n^{2α}/t^{α−1}).
Optimize by setting t = n^{(2α−ω)/α}, and get an O(n^{ω+2−(ω/α)}) time bound. Observe that for α = 2ω/(ω+1) the runtime is O(n^{(3+ω)/2}). Moreover, for α < 2ω/(ω+1) the exponent in the runtime becomes strictly smaller.

Now we proceed to show how to use AE-Mono∆ to solve two popular intermediate problems. We start with
MinWitness, and reduce a single instance of that problem to log n instances of AE-Mono∆.

Theorem 2. If AE-Mono∆ is in T(n) time, then MinWitness is in O(T(n) log n) time.

Proof. The main idea is to use a parallel binary search. For each entry of the output matrix C we will keep an interval which that entry is guaranteed to lie in. With a single call to AE-Mono∆ we will be able to halve all the intervals.

W.l.o.g. assume the last column of A and the last row of B are all ones, so that the output is always finite. For ℓ ∈ [log n], let C^{(ℓ)} denote the matrix pointing to 2^ℓ-length intervals in which entries of C lie, that is, C^{(ℓ)}_ij is the unique integer such that C_ij ∈ [2^ℓ · C^{(ℓ)}_ij, 2^ℓ · (C^{(ℓ)}_ij + 1)).

We will compute C^{(ℓ)} for ℓ = ⌈log n⌉, . . . , 1, 0. Observe that C^{(⌈log n⌉)} is the zero matrix. Knowing C^{(ℓ+1)}, we compute C^{(ℓ)} as follows. We create a tripartite graph G = (I ∪ J ∪ K, E), with each of I, J, K containing n vertices. We add edges between I and K according to the matrix A. Edges from the k-th column get the label ⌊k/2^ℓ⌋. We add edges between K and J according to the matrix B. Edges from the k-th row get the label ⌊k/2^ℓ⌋. Finally, we add the full bipartite clique between I and J. The edge between the i-th vertex of I and the j-th vertex of J gets the label 2 · C^{(ℓ+1)}_ij. That edge forms a monochromatic triangle if and only if C_ij ∈ [2^ℓ · 2 · C^{(ℓ+1)}_ij, 2^ℓ · (2 · C^{(ℓ+1)}_ij + 1)), i.e. C^{(ℓ)}_ij = 2 · C^{(ℓ+1)}_ij. Otherwise, it must be that C_ij ∈ [2^ℓ · (2 · C^{(ℓ+1)}_ij + 1), 2^ℓ · (2 · C^{(ℓ+1)}_ij + 2)), i.e. C^{(ℓ)}_ij = 2 · C^{(ℓ+1)}_ij + 1. Therefore, solving AE-Mono∆ on G suffices to compute C^{(ℓ)}. Finally, observe that C = C^{(0)}.

With a slightly more involved argument we show how to solve
UnweightedAPSP with O(log² n) calls to AE-Mono∆.

Theorem 3. If AE-Mono∆ is in T(n) time, then UnweightedAPSP is in O(T(n) log² n) time.

Proof. We solve
UnweightedAPSP in log n rounds; in the i-th round we compute the matrix D_i of lengths of shortest paths of length up to 2^i (other entries equal to ∞). Each round will consist of a parallel binary search, similar to the one we use in our reduction from MinWitness to AE-Mono∆ (Theorem 2). The algorithm is based on the fact that in unweighted graphs every path can be split roughly in half, i.e. if the distance from u to v equals k, then there must exist a vertex w such that the distances from u to w and from w to v equal ⌊k/2⌋ + {0, 1}.

To start, note that D_0 is a {0, 1, ∞}-matrix that can be easily obtained from the adjacency matrix of the input graph. Now, assume we already computed D_i and let us proceed to compute D_{i+1}. To avoid excessive indexing, let A denote D_i, and B denote D_{i+1}. For each entry of the output matrix B we will keep an interval which that entry is guaranteed to lie in. With a single call to AE-Mono∆ we will be able to halve all the intervals.

For ℓ ∈ {0, 1, . . . , i+2}, let B^{(ℓ)} denote the matrix pointing to 2^ℓ-length intervals in which entries of B lie, that is, B^{(ℓ)}_uv equals the unique integer such that B_uv ∈ [2^ℓ · B^{(ℓ)}_uv, 2^ℓ · (B^{(ℓ)}_uv + 1) − 1], or infinity in case B_uv is infinite.

We will iterate over ℓ from i + 2 down to 0. First, we need to compute B^{(i+2)}, whose entries are either zeros or infinities. Recall that we already know the matrix A = D_i. Consider a pair of nodes u and v that are at distance at most 2^{i+1}. There must exist a node w such that A_uw ≤ 2^i and A_wv ≤ 2^i, that is, equivalently, both A_uw and A_wv are finite. We obtain the matrix B^{(i+2)} by squaring the (0, 1) matrix obtained from A by putting ones at the finite entries and zeros elsewhere. That single Boolean matrix multiplication can be easily simulated by a single call to AE-Mono∆, using just two colors.

Once we have the matrix B^{(ℓ+1)} we want to compute B^{(ℓ)}. For this we first note that if B^{(ℓ+1)}_uv = j then B^{(ℓ)}_uv is either 2j or 2j + 1. If B^{(ℓ)}_uv = 2j, then there must exist a vertex w such that

A_uw ∈ [2^{ℓ−1} · (2j), 2^{ℓ−1} · (2j + 1)), and A_wv ∈ [2^{ℓ−1} · (2j), 2^{ℓ−1} · (2j + 1)].   (1)

Furthermore, if B^{(ℓ)}_uv > 2j, then there is no w such that the above condition holds. This will allow us to distinguish between the 2j and 2j + 1 cases by coloring the matrix A based on which range the entries fall in. Note that the ranges in Condition (1) do not overlap with the corresponding ranges for different integer values j′ ≠ j. Thus we will be able to use a single call to AE-Mono∆ to check in parallel for all values of B^{(ℓ)}_uv if they are the smaller even value 2 · B^{(ℓ+1)}_uv or the larger odd value 2 · B^{(ℓ+1)}_uv + 1.

We construct an AE-Mono∆ instance with a tripartite graph with the vertex set U ⊔ V ⊔ W, where U, V and W are disjoint copies of the original vertex set. The edges between U and V correspond to our desired output. If B^{(ℓ+1)}_uv = j then we color the edge (u, v) ∈ U × V with j. The edges between U and W correspond to the first part of Condition (1), i.e. if A_uw ∈ [2^{ℓ−1} · (2j), 2^{ℓ−1} · (2j + 1)), then we add the edge (u, w) in U × W with color j. The edges between W and V correspond to the second part of Condition (1), i.e. if A_wv ∈ [2^{ℓ−1} · (2j), 2^{ℓ−1} · (2j + 1)], then we add the edge (w, v) in W × V with color j. Any edge (u, v) in U × V that is in a monochromatic triangle implies B^{(ℓ)}_uv = 2 · B^{(ℓ+1)}_uv.
Conversely, any edge (u, v) that is not a part of any monochromatic triangle implies B^{(ℓ)}_uv = 2 · B^{(ℓ+1)}_uv + 1.

We iterate down until B^{(0)}, and observe that B^{(0)} = B. Thus, with O(log n) calls we can compute B = D_{i+1} from A = D_i. To solve UnweightedAPSP the total number of calls we need to make to AE-Mono∆ is O(log²(n)). Therefore, if AE-Mono∆ can be solved in T(n) time, then UnweightedAPSP can be solved in O(T(n) log²(n)) time.

Now we show that AE-Mono∆ is 3SUM-hard. In our proof we use as a black-box the following reduction from 3SUM to AE-Sparse∆.

Lemma 19 (Kopelowitz, Pettie, Porat [28]). A single instance of 3SUM of size n can be reduced to a single instance of AE-Sparse∆ with Θ(n log n) vertices and Θ(n^{3/2} log n) edges.

Theorem 4. If AE-Mono∆ is in O(n^{5/2−ε}) time, for some ε > 0, then 3SUM is in (randomized) Õ(n^{2−(4/5)ε}) time.

Proof. Given an instance of 3SUM of size N, we use the self-reduction (Lemma 18), and reduce it to O(N^{2/5}) instances of size O(N^{4/5}) each. Then, we reduce each of these instances to an AE-Sparse∆ instance with n = Θ(N^{4/5} log N) vertices and m = Θ(N^{6/5} log N) edges, using Lemma 19. Now we will show how to combine these O(N^{2/5}) AE-Sparse∆ instances to form polylogarithmically many AE-Mono∆ instances, each with O(N^{4/5} log N) vertices, which will finish the proof.

Assume w.l.o.g. that all the created graphs are over the same vertex set [n]. If we were lucky enough and the edge sets of the created AE-Sparse∆ instances were disjoint, the reduction would be essentially done. Indeed, we could simply union the edge sets to create a single graph, and use colors to track from which graph every edge originates. Solving that one AE-Mono∆ instance would provide answers to all AE-Sparse∆ instances.
Sadly, the chances of such a favorable collision-free scenario are very slim. The remaining part of the proof shows how to deal with multiple AE-Sparse∆ instances containing the same edge.

We randomly permute the vertex sets, for each graph independently. For a fixed (u, v) ∈ [n]² such that u ≠ v, the probability that a fixed graph contains the edge (u, v) equals p = m / C(n, 2) = O((N^(2/5) log N)^(−1)). The expected number of (u, v) edges across all graphs is O(N^(2/5) · p) = O(1/log N). By a Chernoff bound, the probability that the number of (u, v) edges exceeds c log n is less than (1/e)^(Θ(c log n)). We take c large enough so that, by the union bound over all possible C(n, 2) edges, with probability at least 2/3 no pair (u, v) appears as an edge more than c log n times across all graphs. For each (u, v) ∈ [n]² we arbitrarily number all (u, v) edges with consecutive positive integers from 1 up to at most c log n. We iterate over all triples (i, j, k) ∈ [c log n]³. For every triple we create a tripartite graph with the vertex set V₁ ⊔ V₂ ⊔ V₃, for V₁ = V₂ = V₃ = [n]. We create an edge (u, v) between V₁ and V₂ if there exists an edge (u, v) with number i assigned to it in any of the AE-Sparse∆ instances. Note that there is at most one such instance. We set the color of the newly created edge to the identifier of the instance it originates from. Similarly, we create edges between V₂ and V₃ using edges with number j assigned, and between V₁ and V₃ using number k. That gives us (c log n)³ instances of AE-Mono∆. Note that every triangle present in any of the AE-Sparse∆ instances corresponds to a single monochromatic triangle in one of the AE-Mono∆ instances, and vice versa. We solve all AE-Mono∆ instances and combine the outputs in order to get the output for all AE-Sparse∆ instances, and eventually for the 3SUM instance.

The next two theorems use techniques similar to Theorems 2 and 3 to give reductions to
Min-Max.

Theorem 5 (folklore). If Min-Max is in T(n) time, then MinWitness is in O(T(n)) time.

Proof. Given two (0,
1) matrices A and B, we construct matrices A′ and B′ such that

A′_ik = k if A_ik = 1, and A′_ik = ∞ if A_ik = 0;  B′_kj = k if B_kj = 1, and B′_kj = ∞ if B_kj = 0.

Observe that the (min, max)-product of A′ and B′ equals the minimum witness product of A and B.

Theorem 6. If Min-Max is in T(n) time, then UnweightedAPSP is in O(T(n) log n) time.

Proof. The reduction is similar to the reduction from
UnweightedAPSP to AE-Mono∆ (Theorem 3) in that we also have log n rounds, and in the i-th round we compute the matrix D_i of lengths of shortest paths of length up to 2^i (other entries equal to ∞). The key difference is that, in each round, instead of performing a binary search and issuing log n calls to AE-Mono∆, we issue just two calls to
Min-Max.

As before, first note that D_0 is a {0, 1, ∞}-matrix that can be easily obtained from the adjacency matrix of the input graph. Now, assume we already computed D_i and let us proceed to compute D_{i+1}. Let ℓ = 2^i, and write D^ℓ for D_i, i.e., the matrix of shortest path lengths up to ℓ. Naturally, D^{2ℓ} is the (min, +)-product of D^ℓ with itself, but this sole observation is not enough for our purposes. We will exploit the fact that D^ℓ is not an arbitrary matrix – but a (truncated) matrix of shortest paths in an unweighted graph – in order to compute that specific (min, +)-product using a Min-Max algorithm. Let A ⋆ B denote the (min, max)-product of matrices A and B.

First, we handle even-length paths. We compute E = 2 · (D^ℓ ⋆ D^ℓ). Note that D^{2ℓ}_uv ≤ E_uv for all u, v ∈ V, because for any two integers a, b we have a + b ≤ 2 · max(a, b). Moreover, if D^{2ℓ}_uv = 2k, then there must exist w ∈ V such that D^ℓ_uw = D^ℓ_wv = k, and thus D^ℓ_uw + D^ℓ_wv = 2 · max(D^ℓ_uw, D^ℓ_wv) and D^{2ℓ}_uv = E_uv.

For odd-length paths we proceed in a similar manner, just the formulas become slightly more obscure. We compute O = 2 · (D^ℓ ⋆ (D^ℓ − 1)) + 1, where D^ℓ − 1 denotes decreasing every entry by one. Note that D^{2ℓ}_uv ≤ O_uv for all u, v ∈ V, because for any two integers a, b we have a + b ≤ 2 · max(a, b −
1) + 1. Moreover, if D^{2ℓ}_uv = 2k + 1, then there must exist w ∈ V such that D^ℓ_uw = k and D^ℓ_wv = k + 1, and thus D^ℓ_uw + D^ℓ_wv = 2 · max(D^ℓ_uw, D^ℓ_wv −
1) + 1 and D^{2ℓ}_uv = O_uv.

Consequently, we compute D^{2ℓ}_uv = min(E_uv, O_uv), for all u, v ∈ V.

In this section we provide two reductions which together show that
MonoConvolution is fine-grained equivalent to 3SUM. Recall that the best known algorithms for
MonoConvolution require time n^(3/2 − o(1)), and the best algorithms for 3SUM require time n^(2 − o(1)), so this is an equivalence between problems of different time complexity. At the end of the section we reduce CoinChange to MonoConvolution.

First, let us recall the All-Integers variant of 3SUM, which parallels the All-Edges variants of our graph problems. That variant is easier to work with than the original 3SUM problem for our purposes. Luckily, if either variant has a subquadratic algorithm then they both do [45].
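For concreteness, here is a quadratic-time reference implementation of the natural formalization of MonoConvolution (the formal definition appears earlier in the paper; the function name and exact indexing convention are ours):

```python
def mono_convolution(a, b, c):
    """Naive O(n^2) MonoConvolution: given three length-n sequences,
    output d with d[k] = 1 iff there is an index i with
    a[i] == b[k - i] == c[k], and d[k] = 0 otherwise."""
    n = len(a)
    d = [0] * n
    for k in range(n):
        if any(a[i] == b[k - i] == c[k] for i in range(k + 1)):
            d[k] = 1
    return d
```

Fast algorithms must beat this quadratic brute force; the reductions below show that doing so substantially is as hard as beating quadratic time for 3SUM.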
Definition 20 (All-Integers 3SUM). Given three lists
A, B, C of n integers each, output the list of all integers c ∈ C such that there exist a ∈ A and b ∈ B with a + b = c.

Lemma 21 (Vassilevska Williams, Williams [45]). If 3SUM is in O(n^(2 − ε)) time, then All-Integers 3SUM is in O(n^(2 − ε/2)) time.

An important ingredient of our reduction from 3SUM to MonoConvolution (Theorem 7) is the following range reduction for 3SUM.
Lemma 22 (Baran, Demaine, Pătrașcu, rephrased, see Section 2.1 of [4]). For every positive integer output size s, there exists a family of hash functions H such that:

1. Every hash function h ∈ H hashes to the range {0, 1, . . . , R − 1} for R = 2^s.

2. For all integers a, b, c ∈ Z and all hash functions h ∈ H, if a + b = c, then h(a) + h(b) ≡ h(c) + {−1, 0, 1} mod R.
3. Given an integer c and two lists of n integers A and B such that there are no a ∈ A, b ∈ B with a + b = c, the probability, over hash functions h drawn uniformly at random from H, that there exist a ∈ A, b ∈ B such that h(a) + h(b) ≡ h(c) + {−1, 0, 1} mod R is at most O(n²/R).

We are now ready to show that 3SUM can be solved efficiently with a
MonoConvolution algorithm. Our reduction uses the fact that we can rewrite a 3SUM instance with n integers in {−R, . . . , R} as a convolution of O(R)-length (0,
1) vectors, where a one in the i-th position corresponds to the number i in the original 3SUM instance. We will combine several such instances into one MonoConvolution instance by giving each instance its own number. A one in position i in a convolution instance labelled j will result in the MonoConvolution instance having j in position i.

Theorem 7. If MonoConvolution is in O(n^(3/2 − ε)) time, then 3SUM is in (randomized) Õ(n^(2 − (4/3)ε)) time.

Proof. Given an instance of 3SUM of size n, we reduce it to O(n^(2/3)) instances of size O(n^(2/3)) each, using the self-reduction (Lemma 18). Although for the self-reduction itself it would be sufficient just to solve 3SUM on each of these instances – i.e., decide if there exist a, b, c with a + b = c – we are going to solve the All-Integers 3SUM variant – i.e., decide for each c if there exist a and b with a + b = c.

To each created instance we apply a hashing scheme of Lemma 22 in order to reduce the universe size down to R = n^(4/3). This introduces false positives for each element with probability O((n^(2/3))²/R) = O(1). Note that the hashing has one-sided error, i.e., if for some element c there are no a and b such that h(a) + h(b) ≡ h(c) + {−1, 0, 1} mod R, then with certainty there are no a and b such that a + b = c. To mitigate the effect of false positives we create O(log n) copies of each instance, each copy using an independently drawn hash function. Note that for every fixed element c, if there are no a, b with a + b = c, then the probability that in each of the independent O(log n) copies we detect that h(a) + h(b) ≡ h(c) + {−1, 0, 1} mod R for some h(a), h(b) is 1/poly(n), and we can make the degree of the polynomial arbitrarily large by choosing an appropriate multiplicative constant for the number of copies.
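The near-additivity that Lemma 22 asks of the hash family can be observed concretely with multiplicative shift hashing in the spirit of Dietzfelbinger [18], on which the Baran–Demaine–Pătrașcu construction builds; the snippet below is our illustrative sketch, not the exact family of [4]:

```python
import random

W = 64  # word length used by the sketch

def make_hash(s):
    """Random multiplicative shift hash into {0, ..., 2^s - 1}:
    multiply by a random odd word, keep the top s of W bits."""
    r = random.randrange(1, 1 << W, 2)  # random odd multiplier
    return lambda x: ((r * x) & ((1 << W) - 1)) >> (W - s)

s = 10
R = 1 << s
h = make_hash(s)
for _ in range(10_000):
    a = random.randrange(1 << 40)
    b = random.randrange(1 << 40)
    # if a + b = c, then h(a) + h(b) lands in h(c) + {-1, 0} modulo R,
    # which is within the {-1, 0, 1} slack allowed by Lemma 22
    assert (h(a) + h(b) - h(a + b)) % R in (0, R - 1)
```

The property holds deterministically for this family: multiplication by r is linear modulo 2^W, and discarding the low W − s bits loses at most one unit of carry.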
Therefore we can use the union bound to argue that, with probability at least 2/3, no false positive survives in all of its copies. Suppose that for an instance with lists A, B, C of size n^(2/3) we learned, for every one of the O(log n) hashed instance copies, for every t ∈ [R] such that there is some c with h(c) = t, whether there are some h(a), h(b) with h(a) + h(b) ≡ h(c) + {−1, 0, 1} mod R. Then, we can go through every c ∈ C and if for every copy the answer for h(c) was YES, we can conclude that (with high probability) there exists a ∈ A, b ∈ B with a + b = c, and if the answer was NO at least once, then we can conclude that there is no pair that sums to c.

Here an important point is that we need to solve all of the O(n^(2/3) log n) instances of All-Integers 3SUM above, each on n^(2/3) integers over a range [O(n^(4/3))]. We will embed solving all instances simultaneously into solving a small (polylogarithmic) number of MonoConvolution instances.

Each of the above O(n^(2/3) log n) instances of All-Integers 3SUM easily reduces to an (OR, AND)-convolution of (0,
1) vectors of length O(n^(4/3)), each with only O(n^(2/3)) nonzero entries, and with only O(n^(2/3)) relevant output coordinates one needs to compute. If only we had no collisions – i.e., two instances with the same nonzero input coordinate or the same relevant output coordinate – we could easily combine all the convolution instances into a single instance of MonoConvolution, with O(n^(2/3) log n) different colors/values. However, the collisions are unavoidable. In order to circumvent these collisions, we will add small random shifts, and use a similar analysis as in the 3SUM-to-AE-Mono∆ reduction of Theorem 4.

Specifically, for each 3SUM (sub-)instance we choose a shift s uniformly at random from a range of size O(n^(4/3)); we add s to all elements in A, add s to all elements in B, and add 2s to all elements in C. These shifts do not change whether for a given triplet a, b, c the condition a + b = c holds or not. Let the numbers after the shift lie in {−R′, . . . , R′} where R′ = O(n^(4/3)). For a fixed value v ∈ {−R′, . . . , R′} the expected number of instances containing v is O(log n). Indeed, for each particular instance, the probability that one of its numbers lands at v after the shift is O(n^(2/3)/R′) = O(1/n^(2/3)); then summing over all the instances gives an expectation of O(log n). Since the shifts are independent, we can use a Chernoff bound to bound the probability that the number of instances containing v exceeds c log n by (1/e)^(Θ(c log n)). We take c large enough so that, by union bound, the probability that no value is contained in more than c log n instances is at least 2/3. We then create (c log n)³ instances of MonoConvolution as follows.

For each value r ∈ {−R′, . . . , R′}, let the instances that contain r in their A sets be inA(r)[1], . . . , inA(r)[c log n]. Define inB(r)[1], . . . , inB(r)[c log n] and inC(r)[1], . . .
, inC(r)[c log n] analogously. We now create an instance of MonoConvolution for each choice of (x, y, z) ∈ [c log n]³. In instance (x, y, z) we create vectors a, b, c, where for each r ∈ {−R′, . . . , R′}, we set a_r = inA(r)[x], b_r = inB(r)[y] and c_r = inC(r)[z]. Then for any instance i that contains r in A, s in B and t in C, we would have inA(r)[x] = inB(s)[y] = inC(t)[z] = i for some x, y, z, and so we will place i in a_r, b_s, and c_t for that choice of x, y, z.

This next reduction finishes the equivalence between MonoConvolution and 3SUM. It uses a high-frequency/low-frequency split. For elements that appear at a high frequency we use FFT. For elements of low frequency we make calls to All-Integers 3SUM. Recall that a subquadratic algorithm for 3SUM implies a subquadratic algorithm for All-Integers 3SUM (Lemma 21).
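The FFT step used for the frequent values computes a standard (+, ×)-convolution of (0, 1) vectors and then reads off where it is nonzero. As an illustrative stand-in for an FFT routine, the sketch below packs the vectors into big integers (Python multiplies these subquadratically), giving the exact convolution:

```python
def binary_convolution(a, b):
    """(+, x)-convolution of two 0/1 vectors via big-integer multiplication,
    a stand-in for FFT; each coefficient gets enough bits that carries
    from neighboring coefficients cannot overlap."""
    # any convolution coefficient of 0/1 vectors is at most min(len(a), len(b))
    width = min(len(a), len(b)).bit_length() + 1
    pack = lambda v: sum(x << (i * width) for i, x in enumerate(v))
    prod = pack(a) * pack(b)
    mask = (1 << width) - 1
    return [(prod >> (i * width)) & mask
            for i in range(len(a) + len(b) - 1)]
```

The positions with a nonzero output are exactly the sums i + j realizable with a_i = b_j = 1, which is what the frequent-value pass needs.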
Theorem 8. If 3SUM is in O(n^(2 − ε)) time, then MonoConvolution is in Õ(n^(3/2 − ε/(8 − 2ε))) time.

Proof. For a parameter t to be determined later, consider the t most frequent values. For each of these values use FFT to calculate the standard (+, ×)-convolution of two (0,
1) vectors formed from vectors a and b by putting ones everywhere that value appears, and zeros everywhere else. Examine where the outputs of these convolutions are nonzero, in order to determine the part of the output to MonoConvolution corresponding to occurrences of the frequent values in vector c. This takes Õ(tn) time in total.

Let n_i denote the number of occurrences of the i-th of the remaining values in all three sequences. Clearly, n_i ≤ 3n/t for all i, and Σ_i n_i ≤ 3n. For each value v out of those remaining values construct sets of indices at which it appears in vectors a, b, c, i.e., A = {j : a_j = v}, B = {j : b_j = v}, C = {j : c_j = v}, and solve All-Integers 3SUM on these sets. For each element j reported by the All-Integers 3SUM algorithm assign the corresponding output of MonoConvolution d_j = 1. By Lemma 21, solving these All-Integers 3SUM instances takes on the order of

Σ_i n_i^(2 − ε/2) = Σ_i n_i · n_i^(1 − ε/2) ≤ Σ_i n_i · (3n/t)^(1 − ε/2) ≤ 3n · (3n/t)^(1 − ε/2)

time. The total time is thus Õ(tn + n · (n/t)^(1 − ε/2)). Optimize by setting t = n^(1/2 − ε/(8 − 2ε)), and get the desired runtime.

Our final theorem connects the CoinChange problem to our network of reductions. The proof uses the same structure and techniques as the reduction from
UnweightedAPSP to AE-Mono∆ in Theorem 3.
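For reference, the baseline dynamic program for CoinChange, as the problem is used here (S[v] is the minimum number of coins, with repetition, summing to v), runs in O(n · |C|) time; the reduction below targets the regime where this is too slow. A minimal sketch (function name ours):

```python
def coin_change_all(C, n):
    """Baseline DP for CoinChange: S[v] = minimum number of coins from C
    (with repetition) summing to v, for all v in [0, n].
    None plays the role of infinity (value v is not reachable)."""
    S = [None] * (n + 1)
    S[0] = 0
    for v in range(1, n + 1):
        options = [S[v - c] + 1 for c in C
                   if c <= v and S[v - c] is not None]
        S[v] = min(options) if options else None
    return S
```

The theorem below recovers the same array S with polylogarithmically many MonoConvolution calls instead of the |C| inner loop.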
Theorem 9. If MonoConvolution is in T(n) time, then CoinChange is in O(T(n) log² n) time.

Proof. Let S denote the array of output values, i.e., S[v] equals the minimum number of coins that sum to v. Parallel to the proof of Theorem 3, let S_i[v] be infinity if S[v] > 2^i, and otherwise equal to S[v]. We solve CoinChange in log n rounds; in the i-th round we compute S_i.

Note that S_0[0] = 0, and, for v > 0, S_0[v] = 1 if v ∈ C and S_0[v] = ∞ otherwise. Further note that S = S_{log n}. We will show how to compute S_{i+1} given S_i. We will then iterate i from 0 up to log n.

Following the style of Theorem 3, to simplify notation we set A = S_i and B = S_{i+1}. Let B^(ℓ) be an array pointing to 2^ℓ-length intervals in which entries of B lie, i.e., B^(ℓ)[v] = j if B[v] ∈ [2^ℓ j, 2^ℓ (j + 1) − 1]. We iterate ℓ from i + 2 down to 0 to compute B from A.

First we show how to compute B^(i+2) from A. If there is a way to sum to v with at most 2^(i+1) coins, then there must be a u ∈ [0, n] such that both A[u] and A[v − u] are at most 2^i. Conversely, if there is no way to sum to v with at most 2^(i+1) coins, then there will be no u that meets the above criteria. Therefore we create a (0,
1) vector a with a_v = 1 if and only if A[v] is finite. Then, we compute the (+, ×)-convolution of a with itself, in near-linear time using FFT. We set B^(i+2)[v] = 0 where the convolution output is non-zero and B^(i+2)[v] = ∞ everywhere else.

Now we show how to compute B^(ℓ) from A and B^(ℓ+1). Note that if B^(ℓ+1)[v] = j then B^(ℓ)[v] ∈ {2j, 2j + 1}. Next, note that if B^(ℓ)[v] = 2j, then there must exist an integer u ∈ [0, n] such that

A[u] ∈ [2^(ℓ−1) · (2j), 2^(ℓ−1) · (2j + 1)), and A[v − u] ∈ [2^(ℓ−1) · (2j), 2^(ℓ−1) · (2j + 1)].   (2)

Furthermore, if B^(ℓ)[v] > 2j, then there is no u such that the above condition holds. This will allow us to distinguish between the 2j and 2j + 1 cases. Note that the ranges in Condition (2) do not overlap with the corresponding ranges for different integer values j′ ≠ j. Thus, we will be able to use a single call to MonoConvolution to check in parallel for all values v if B^(ℓ)[v] is the smaller even value 2B^(ℓ+1)[v] or the larger odd value 2B^(ℓ+1)[v] + 1.

We construct a MonoConvolution instance with three input vectors a, b, c. The first input vector corresponds to the first part of Condition (2), i.e., if A[v] ∈ [2^(ℓ−1) · (2j), 2^(ℓ−1) · (2j + 1)), then a_v = j. Any entries a_v unset by this condition are given the special value a_v = −
1. The second vector corresponds to the second part of Condition (2), i.e., if A[v] ∈ [2^(ℓ−1) · (2j), 2^(ℓ−1) · (2j + 1)], then b_v = j. Similarly, any entries b_v unset by this condition are given the special value b_v = −
1. The last vector corresponds to our desired output, i.e., c_v = B^(ℓ+1)[v]. Let d denote the vector output by this MonoConvolution call. Now, if d_v = 1 then B^(ℓ)[v] = 2B^(ℓ+1)[v], else B^(ℓ)[v] = 2B^(ℓ+1)[v] + 1.

We iterate down until B^(0), and observe that B^(0) = B. With O(log n) calls to MonoConvolution we can thus compute B = S_{i+1} from A = S_i. To solve CoinChange the total number of
MonoConvolution calls is O(log² n). Therefore, if MonoConvolution can be solved in T(n) time, then CoinChange can be solved in O(T(n) log² n) time.

References

[1] Amir Abboud, Fabrizio Grandoni, and Virginia Vassilevska Williams. Subcubic equivalences between graph centrality problems, APSP and diameter. In
Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 1681–1697, 2015. doi:10.1137/1.9781611973730.112.

[2] Noga Alon, Zvi Galil, and Oded Margalit. On the exponent of the all pairs shortest path problem.
Journal of Computer and System Sciences, 54(2):255–262, 1997. doi:10.1006/jcss.1997.1388.

[3] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles.
Algorithmica, 17(3):209–223, Mar 1997. doi:10.1007/BF02523189.

[4] Ilya Baran, Erik D. Demaine, and Mihai Pătrașcu. Subquadratic algorithms for 3SUM.
Algorithmica, 50(4):584–596, Apr 2008. doi:10.1007/s00453-007-9036-3.

[5] Hodaya Barr, Tsvi Kopelowitz, Ely Porat, and Liam Roditty. {−1, 0, 1}-APSP and (min, max)-product problems, 2019. arXiv:1911.06132.

[6] Michael A. Bender, Giridhar Pemmasani, Steven Skiena, and Pavel Sumazin. Finding least common ancestors in directed acyclic graphs. In Proceedings of the Twelfth Annual Symposium on Discrete Algorithms, January 7-9, 2001, Washington, DC, USA, pages 845–854, 2001. URL: http://dl.acm.org/citation.cfm?id=365411.365795.

[7] David Bremner, Timothy M. Chan, Erik D. Demaine, Jeff Erickson, Ferran Hurtado, John Iacono, Stefan Langerman, and Perouz Taslakian. Necklaces, convolutions, and X + Y. In
Proceedings of the 14th Conference on Annual European Symposium – Volume 14, ESA'06, pages 160–171, London, UK, 2006. Springer-Verlag. doi:10.1007/11841036_17.

[8] Karl Bringmann, Allan Grønlund, and Kasper Green Larsen. A dichotomy for regular expression membership testing. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, pages 307–318, 2017. doi:10.1109/FOCS.2017.36.

[9] Karl Bringmann and Tomasz Kociumaka. Personal communication, 2019.

[10] Karl Bringmann, Marvin Künnemann, and Karol Wegrzycki. Approximating APSP without scaling: Equivalence of approximate min-plus and exact min-max. In
Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, pages 943–954, New York, NY, USA, 2019. ACM. doi:10.1145/3313276.3316373.

[11] Timothy M. Chan. More logarithmic-factor speedups for 3SUM, (median, +)-convolution, and some geometric 3SUM-hard problems. In
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 881–897, 2018. doi:10.1137/1.9781611975031.57.

[12] Timothy M. Chan and Qizheng He. More on Change-Making and Related Problems. In 28th Annual European Symposium on Algorithms, ESA 2020, volume 173 of
Leibniz International Proceedings in Informatics (LIPIcs), pages 29:1–29:14, Dagstuhl, Germany, 2020. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ESA.2020.29.

[13] Lijie Chen and Ryan Williams. An equivalence class for orthogonal vectors. In
Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 21–40, 2019. doi:10.1137/1.9781611975482.2.

[14] Marek Cygan, Holger Dell, Daniel Lokshtanov, Dániel Marx, Jesper Nederlof, Yoshio Okamoto, Ramamohan Paturi, Saket Saurabh, and Magnus Wahlström. On problems as hard as CNF-SAT.
ACM Transactions on Algorithms, 12(3):41:1–41:24, 2016. doi:10.1145/2925416.

[15] Marek Cygan, Marcin Mucha, Karol Wegrzycki, and Michal Wlodarczyk. On problems equivalent to (min,+)-convolution.
ACM Transactions on Algorithms, 15(1):14:1–14:25, January 2019. doi:10.1145/3293465.

[16] Artur Czumaj, Mirosław Kowaluk, and Andrzej Lingas. Faster algorithms for finding lowest common ancestors in directed acyclic graphs.
Theoretical Computer Science, 380(1):37–46, 2007. Automata, Languages and Programming. doi:10.1016/j.tcs.2007.02.053.

[17] Artur Czumaj and Andrzej Lingas. Finding a heaviest triangle is not harder than matrix multiplication. In
Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '07, pages 986–994, Philadelphia, PA, USA, 2007. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=1283383.1283489.

[18] Martin Dietzfelbinger. Universal hashing and k-wise independent random variables via integer arithmetic without primes. In
STACS 96, pages 567–580, Berlin, Heidelberg, 1996. Springer Berlin Heidelberg. doi:10.1007/3-540-60922-9_46.

[19] Ran Duan, Ce Jin, and Hongxun Wu. Faster algorithms for all pairs non-decreasing paths problem. In 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, pages 48:1–48:13, 2019. doi:10.4230/LIPIcs.ICALP.2019.48.

[20] Ran Duan and Seth Pettie. Fast algorithms for (max, min)-matrix multiplication and bottleneck shortest paths. In
Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, New York, NY, USA, January 4-6, 2009, pages 384–391, 2009. doi:10.1137/1.9781611973068.43.

[21] Lech Duraj, Krzysztof Kleiner, Adam Polak, and Virginia Vassilevska Williams. Equivalences between triangle and range query problems. In
Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 30–47. SIAM, 2020. doi:10.1137/1.9781611975994.3.

[22] Michael J. Fischer and Albert R. Meyer. Boolean matrix multiplication and transitive closure. In 12th Annual Symposium on Switching and Automata Theory, SWAT 1971, pages 129–131. IEEE, 1971. doi:10.1109/SWAT.1971.4.

[23] Anka Gajentaan and Mark H. Overmars. On a class of O(n²) problems in computational geometry. Computational Geometry, 5:165–185, 1995. doi:10.1016/0925-7721(95)00022-2.

[24] François Le Gall and Florent Urrutia. Improved rectangular matrix multiplication using powers of the Coppersmith-Winograd tensor. In
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 1029–1046, 2018. doi:10.1137/1.9781611975031.67.

[25] Omer Gold and Micha Sharir. Dominance Product and High-Dimensional Closest Pair under L∞. In Yoshio Okamoto and Takeshi Tokuyama, editors, 28th International Symposium on Algorithms and Computation, ISAAC 2017, volume 92 of Leibniz International Proceedings in Informatics (LIPIcs), pages 39:1–39:12, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ISAAC.2017.39.

[26] Allan Grønlund and Seth Pettie. Threesomes, degenerates, and love triangles.
Journal of the ACM, 65(4):22:1–22:25, April 2018. doi:10.1145/3185378.

[27] MohammadTaghi Hajiaghayi, Silvio Lattanzi, Saeed Seddighin, and Cliff Stein. MapReduce meets fine-grained complexity: MapReduce algorithms for APSP, matrix multiplication, 3-SUM, and beyond, 2019. arXiv:1905.01748.

[28] Tsvi Kopelowitz, Seth Pettie, and Ely Porat. Higher lower bounds from the 3SUM conjecture. In
Proceedings of the Twenty-seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '16, pages 1272–1287, Philadelphia, PA, USA, 2016. Society for Industrial and Applied Mathematics. doi:10.1137/1.9781611974331.ch89.

[29] Marvin Künnemann, Ramamohan Paturi, and Stefan Schneider. On the Fine-Grained Complexity of One-Dimensional Dynamic Programming. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, volume 80 of
Leibniz International Proceedings in Informatics (LIPIcs), pages 21:1–21:15, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2017.21.

[30] Karim Labib, Przemyslaw Uznanski, and Daniel Wolleb-Graf. Hamming Distance Completeness. In Nadia Pisanti and Solon P. Pissis, editors, 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, volume 128 of
Leibniz International Proceedings in Informatics (LIPIcs), pages 14:1–14:17, Dagstuhl, Germany, 2019. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CPM.2019.14.

[31] François Le Gall. Powers of tensors and fast matrix multiplication. In
Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 296–303, New York, NY, USA, 2014. ACM. doi:10.1145/2608628.2608664.

[32] Andrea Lincoln, Virginia Vassilevska Williams, Joshua R. Wang, and R. Ryan Williams. Deterministic Time-Space Trade-Offs for k-SUM. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, volume 55 of
Leibniz International Proceedings in Informatics (LIPIcs), pages 58:1–58:14, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2016.58.

[33] Jiří Matoušek. Computing dominances in E^n. Information Processing Letters, 38(5):277–278, 1991. doi:10.1016/0020-0190(91)90071-O.

[34] Mihai Pătrașcu. Towards polynomial lower bounds for dynamic problems. In
Proceedings of the Forty-second ACM Symposium on Theory of Computing, STOC '10, pages 603–610, New York, NY, USA, 2010. ACM. doi:10.1145/1806689.1806772.

[35] Raimund Seidel. On the all-pairs-shortest-path problem in unweighted undirected graphs.
Journal of Computer and System Sciences, 51(3):400–403, 1995. doi:10.1006/jcss.1995.1078.

[36] Asaf Shapira, Raphael Yuster, and Uri Zwick. All-pairs bottleneck paths in vertex weighted graphs.
Algorithmica, 59(4):621–633, 2011. doi:10.1007/s00453-009-9328-x.

[37] Volker Strassen. Gaussian elimination is not optimal.
Numerische Mathematik, 13(4):354–356, 1969. doi:10.1007/BF02165411.

[38] Virginia Vassilevska. Nondecreasing paths in a weighted graph or: how to optimally read a train schedule. In
Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2008, San Francisco, California, USA, January 20-22, 2008, pages 465–472, 2008. URL: http://dl.acm.org/citation.cfm?id=1347082.1347133.

[39] Virginia Vassilevska, Ryan Williams, and Raphael Yuster. All-pairs bottleneck paths for general graphs in truly sub-cubic time. In
Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San Diego, California, USA, June 11-13, 2007, pages 585–589, 2007. doi:10.1145/1250790.1250876.

[40] Virginia Vassilevska, Ryan Williams, and Raphael Yuster. All pairs bottleneck paths and max-min matrix products in truly subcubic time.
Theory of Computing, 5(1):173–189, 2009. doi:10.4086/toc.2009.v005a009.

[41] Virginia Vassilevska, Ryan Williams, and Raphael Yuster. Finding heaviest H-subgraphs in real weighted graphs, with applications.
ACM Transactions on Algorithms, 6(3):44:1–44:23, July 2010. doi:10.1145/1798596.1798597.

[42] Virginia Vassilevska Williams. Nondecreasing paths in a weighted graph or: How to optimally read a train schedule.
ACM Transactions on Algorithms, 6(4):70:1–70:24, 2010. doi:10.1145/1824777.1824790.

[43] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. In
Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing, STOC '12, pages 887–898, 2012. doi:10.1145/2213977.2214056.

[44] Virginia Vassilevska Williams. On some fine-grained questions in algorithms and complexity. In
Proceedings of the International Congress of Mathematicians (ICM 2018), pages 3447–3487, 2018. doi:10.1142/9789813272880_0188.

[45] Virginia Vassilevska Williams and R. Ryan Williams. Subcubic equivalences between path, matrix, and triangle problems.
Journal of the ACM, 65(5):27:1–27:38, August 2018. doi:10.1145/3186893.

[46] Virginia Vassilevska Williams and Ryan Williams. Finding, minimizing, and counting weighted subgraphs.
SIAM Journal on Computing, 42(3):831–854, 2013. doi:10.1137/09076619X.

[47] Ryan Williams. Faster all-pairs shortest paths via circuit complexity. In
Proceedings of the Forty-sixth Annual ACM Symposium on Theory of Computing, STOC '14, pages 664–673, New York, NY, USA, 2014. ACM. doi:10.1145/2591796.2591811.

[48] Virginia V. Williams. Problem Set 2 in Stanford's class CS367, Oct. 15, 2015. http://theory.stanford.edu/~virgi/cs367/hw2.pdf, 2015.

[49] J. W. Wright. The change-making problem.
Journal of the ACM, 22(1):125–128, January 1975. doi:10.1145/321864.321874.

[50] Raphael Yuster. Efficient algorithms on sets of permutations, dominance, and real-weighted APSP. In
Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, New York, NY, USA, January 4-6, 2009, pages 950–957, 2009. doi:10.1137/1.9781611973068.103.

[51] Uri Zwick. All pairs shortest paths using bridging sets and rectangular matrix multiplication.
Journal of the ACM, 49(3):289–317, May 2002. doi:10.1145/567112.567114.