Conditional Hardness of Earth Mover Distance
Dhruv Rohatgi
[email protected]

September 25, 2019
Abstract
The Earth Mover Distance (EMD) between two sets of points A, B ⊆ R^d with |A| = |B| is the minimum total Euclidean distance of any perfect matching between A and B. One of its generalizations is asymmetric EMD, which is the minimum total Euclidean distance of any matching of size |A| between sets of points A, B ⊆ R^d with |A| ≤ |B|. The problems of computing EMD and asymmetric EMD are well-studied and have many applications in computer science, some of which also ask for the EMD-optimal matching itself. Unfortunately, all known algorithms require at least quadratic time to compute EMD exactly. Approximation algorithms with nearly linear time complexity in n are known (even for finding approximately optimal matchings), but suffer from exponential dependence on the dimension.

In this paper we show that significant improvements in exact and approximate algorithms for EMD would contradict conjectures in fine-grained complexity. In particular, we prove the following results:

• Under the Orthogonal Vectors Conjecture, there is some c > 1 such that EMD in Ω(c^{log* n}) dimensions cannot be computed in truly subquadratic time.

• Under the Hitting Set Conjecture, for every δ > 0, no truly subquadratic time algorithm can find a (1 + 1/n^δ)-approximate EMD matching in ω(log n) dimensions.

• Under the Hitting Set Conjecture, for every η = 1/ω(log n), no truly subquadratic time algorithm can find a (1 + η)-approximate asymmetric EMD matching in ω(log n) dimensions.

1 Introduction

In the
Earth Mover Distance (EMD) problem, we are given two sets A and B, each with n vectors in R^d, and want to find the minimum cost of any perfect matching between A and B, where an edge between a ∈ A and b ∈ B has cost ‖a − b‖. In a harder variant of the problem ("EMD matching"), we want to actually find a perfect matching with the optimal cost. This is a special case of the geometric transportation problem, in which each vector of A has a positive supply and each vector of B has a positive demand, and the goal is to find an optimal "transportation map", i.e., match each unit of supply with a unit of demand while minimizing the total distance, summed over all units of supply.

A more general variant of the EMD problem (with an analogous extension to arbitrary supplies/demands) allows for the possibility that |A| < |B|, and requires the map from A to B to be an injection. We refer to this variant as the asymmetric EMD problem.

Earth Mover Distance is a discrete analogue of the Monge-Kantorovich metric for probability measures, which has connections to various areas of mathematics [26]. Furthermore, computing the distance between probability measures is an important problem in machine learning [23, 20, 7, 13] and computer vision [22, 10, 25], to which Earth Mover Distance is often applied. To give a few specific examples, computing geometric transportation cost has applications in image retrieval [22], where asymmetric EMD allows the distance measure to deal with occlusions and clutter. In computer graphics, computing the actual transportation map is useful for interpolation between distributions, though the metric may be non-Euclidean [10].

For the exact geometric transportation problem, the best known algorithm simply formulates the problem in terms of minimum cost flow, yielding a runtime of O(n^{2.5} · polylog(U)), where U is the total supply (assuming that d is subpolynomial in n) [18, 19].
Even for EMD, the best known algorithm follows directly from the general graph algorithms for maximum matching, running in O(m√n) time [14].

The situation is better for approximation algorithms. There has been considerable work both on estimating the transportation cost [15, 6] and on computing the actual map [24, 3, 5] in time nearly linear in n but exponential in the dimension d. Most recently, it was shown [17] that there is an O(n ǫ^{-O(d)} log(U)^{O(d)} log n)-time algorithm which outputs a transportation map with cost at most (1 + ǫ) times the optimum. This algorithm is very efficient when the dimension d is constant or nearly constant, and when ǫ is not too small, say constant or O(1/polylog(n)). However, when d = ω(log n), the algorithm is not guaranteed to find even a constant-factor approximation in quadratic time.

Despite considerable progress on improving the algorithms for geometric matching problems over the last two decades, little is known about lower bounds on their computational complexity. In particular, we do not have any evidence that a running time of the form O(n · poly(d, log n, 1/ǫ)) is not achievable. This is the question we address in this paper.

In this paper we provide evidence that geometric transportation problems in high-dimensional spaces cannot be solved in (truly) subquadratic time. This applies to both exact and approximate variants of the problem, and even in the special case of unit supplies. In particular, we show conditional quadratic hardness for the exact EMD problem, as well as for the approximate variant of EMD in which the (approximately) optimal matching must be reported.

Our hardness results are based on two well-studied conjectures in fine-grained complexity: the Orthogonal Vectors Conjecture and the Hitting Set Conjecture (see [29] for a comprehensive survey).
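To make the problem statement concrete, the unit-supply EMD of a tiny instance can be computed by brute force over all n! perfect matchings. The sketch below is purely illustrative (the function name is ours); practical exact algorithms use the minimum cost flow formulation discussed above.

```python
import itertools
import math

def emd_exact(A, B):
    """Exact EMD between equal-size point sets, by brute force over all
    n! perfect matchings. Only feasible for very small n; real solvers
    formulate this as minimum cost flow instead."""
    assert len(A) == len(B)

    def dist(a, b):
        # Euclidean distance between two points given as tuples.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(
        sum(dist(a, b) for a, b in zip(A, perm))
        for perm in itertools.permutations(B)
    )

A = [(0, 0), (2, 0)]
B = [(0, 1), (2, 1)]
print(emd_exact(A, B))  # 2.0: each point matches the point directly above it
```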
The Orthogonal Vectors (OV) problem takes as input two sets A, B ⊆ {0,1}^{d(n)} where |A| = |B| = n, and asks whether there are vectors a ∈ A and b ∈ B such that a · b = 0. The popular Orthogonal Vectors Conjecture hypothesizes that, in sufficiently large dimensions, the obvious quadratic-time algorithm for OV is nearly optimal:
Orthogonal Vectors Conjecture.
Let d(n) = ω(log n). For every constant ǫ > 0, no randomized algorithm can solve d(n)-dimensional OV in O(n^{2−ǫ}) time.

A plethora of problems have been shown to have nontrivial lower bounds under the Orthogonal Vectors Conjecture; often these lower bounds are essentially tight (e.g. [1, 2, 8, 11, 28]; see [29] for a comprehensive survey). It is known that if the conjecture fails, then the Strong Exponential Time Hypothesis (SETH) fails as well [27], providing evidence for the hardness of OV, and by extension of the problems to which OV can be reduced.

Our first result shows that EMD in "nearly constant" dimension is hard to compute exactly in truly subquadratic time, under the Orthogonal Vectors Conjecture:
Theorem 1.1.
There is a constant c > 1 under which the following holds. If there exist ǫ > 0 and d(n) = Ω(c^{log* n}) such that EMD on O(log n)-bit vectors in d(n) dimensions can be computed in O(n^{2−ǫ}) time, then the Orthogonal Vectors Conjecture is false.

Using techniques similar to those for the above theorem, we also address a question raised in [9] about the complexity of the maximum/minimum weighted assignment problem when the weight matrix has low rank. The minimum weighted assignment problem is defined as follows: given an n × n weight matrix which determines a complete bipartite graph, find the cost of the minimum weight perfect matching. Motivated by the observation that the problem can be solved in O(n log n) time if the weight matrix is rank-1, it was asked whether there is an O(nr log n)-time algorithm for rank-r matrices [9]. We answer this question in the negative, under the Orthogonal Vectors Conjecture. In fact, we show something stronger (see Appendix A for the proof):

Theorem 1.2.
There is a constant c > 1 under which the following holds. If there exist ǫ > 0 and r(n) = Ω(c^{log* n}) such that the minimum assignment problem with rank-r weight matrices can be solved in O(n^{2−ǫ}) time, then the Orthogonal Vectors Conjecture is false.

[Figure] (a) Structure of Theorem 1.1. (b) Structure of Theorem 1.3: Hitting Set → approximate Find-OV → approximate Maximum Orthogonal Matching → approximate EMD matching.
Figure 1: Summary of reductions
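The rank-1 observation from [9] is easy to make concrete: if the weight matrix factors as W[i][j] = u[i] · v[j], then by the rearrangement inequality the minimum-weight perfect matching pairs u in ascending order with v in descending order. Below is an illustrative sketch (function names are ours), with a brute-force check:

```python
import itertools

def min_assignment_rank1(u, v):
    """Minimum-weight perfect matching for the rank-1 weight matrix
    W[i][j] = u[i] * v[j], in O(n log n) time by sorting: pairing u
    ascending with v descending minimizes the sum, by the
    rearrangement inequality."""
    return sum(x * y for x, y in zip(sorted(u), sorted(v, reverse=True)))

def min_assignment_brute(u, v):
    """Reference O(n!) check over all permutations."""
    n = len(u)
    return min(
        sum(u[i] * v[p[i]] for i in range(n))
        for p in itertools.permutations(range(n))
    )

u, v = [3, 1, 2], [5, 4, 6]
print(min_assignment_rank1(u, v))                                # 28
print(min_assignment_rank1(u, v) == min_assignment_brute(u, v))  # True
```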
The second conjecture on which we base some of our results is the hardness of the Hitting Set (HS) problem. This problem, similar to OV, takes two sets of vectors A, B ⊆ {0,1}^{d(n)} as input, and asks whether there exists some a ∈ A such that a · b ≠ 0 for every b ∈ B.

Hitting Set Conjecture.
Let d(n) = ω(log n). For every constant ǫ > 0, no randomized algorithm can solve d(n)-dimensional HS in O(n^{2−ǫ}) time.

It is known that HS reduces to OV, but the reverse reduction is not known, so the Hitting Set Conjecture is "stronger" than the Orthogonal Vectors Conjecture [2]. The Hitting Set Conjecture has been used to prove conditional hardness of the Radius problem in sparse graphs [2]. The utility of the Hitting Set problem in conditional hardness results comes from the difference between its "∃∀" logical structure and the "∃∃" logical structure of the Orthogonal Vectors problem, which makes it more natural for some types of problems.

Under the Hitting Set Conjecture, we prove hardness of approximation for the EMD matching problem (in which we want to find the optimal or a nearly-optimal matching). Simultaneously we obtain stronger hardness of approximation for asymmetric EMD matching.

Theorem 1.3.
For any δ > 0 and d(n) = ω(log n), if (1 + 1/n^δ)-approximate EMD matching can be solved in d(n) dimensions in truly subquadratic time, then the Hitting Set Conjecture is false.

Theorem 1.4.
For any d(n) = ω(log n) and η = 1/ω(log n), if (1 + η)-approximate asymmetric EMD matching can be solved in d(n) dimensions in truly subquadratic time, then the Hitting Set Conjecture is false.

Finally, motivated by the question of how hard Hitting Set really is compared to Orthogonal Vectors, we generalize the result that Hitting Set reduces to Orthogonal Vectors by finding a family of approximation problems that lie between Orthogonal Vectors and Hitting Set in difficulty. For a positive integer function k(n) ≤ n/2, we define the (k, 2k)-Find-OV problem: given two sets A, B ⊆ {0,1}^{d(n)} with |A| = |B| = n and the guarantee that there exist at least 2k orthogonal pairs between A and B, find k pairs {(a_i, b_i)}_{i=1}^k such that a_i · b_i = 0 for every i. We prove the following theorem in Appendix C.

Theorem 1.5.

Let k(n) ≤ n/2. If (k, 2k)-Find-OV can be solved in truly subquadratic time, then the Hitting Set Conjecture is false.

See Figure 1 for an overview of the structure of our main results (Theorems 1.1 and 1.3 respectively; the proof of Theorem 1.4 has the same structure as the latter). We provide the remaining definitions of the relevant problems in the next section.
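Both OV and HS have obvious O(n²d)-time exact algorithms, which the conjectures above assert are essentially optimal. For concreteness, a minimal sketch (function names are illustrative, not from the paper):

```python
def has_orthogonal_pair(A, B):
    """Naive O(n^2 d) decision for Orthogonal Vectors: is there a pair
    a in A, b in B with a . b = 0?"""
    return any(
        all(x * y == 0 for x, y in zip(a, b))
        for a in A for b in B
    )

def has_hitting_vector(A, B):
    """Naive O(n^2 d) decision for Hitting Set: is there an a in A with
    a . b != 0 for every b in B?"""
    return any(
        all(any(x * y != 0 for x, y in zip(a, b)) for b in B)
        for a in A
    )

A = [(1, 0, 1), (1, 1, 0)]
B = [(0, 1, 0), (1, 1, 1)]
print(has_orthogonal_pair(A, B))  # True: (1,0,1) . (0,1,0) = 0
print(has_hitting_vector(A, B))   # True: (1,1,0) hits both vectors of B
```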
2 Preliminaries
Before diving into the reductions, we formally define the remaining problems that we study. Each problem takes sets of vectors as input, so one parameter of a problem is the dimension d, which is a function of the input size n. That is, every function d : N → N defines a d(n)-dimensional EMD problem, a d(n)-dimensional OV problem, and so forth. We gloss over this choice of d in the subsequent definitions.

The Earth Mover Distance (EMD) problem is defined as follows: given two sets A, B ⊆ R^{d(n)} with |A| = |B|, find

min_{π : A → B} Σ_{a ∈ A} ‖a − π(a)‖,

where π is a bijection. We restrict our attention to the special cases where A, B ⊆ Z^{d(n)} with polynomially bounded entries (for hardness of exact EMD) and A, B ⊆ {0,1}^{d(n)} (for hardness of approximate EMD).

We define the asymmetric EMD problem as above, except that we relax the constraint |A| = |B| = n to |A| ≤ |B| = n, and require π to be an injection rather than a bijection.

The EMD matching problem is the variant of the EMD problem in which the desired output is the optimal matching π. Similarly, we define the asymmetric EMD matching problem. An algorithm "solves" EMD matching (or its asymmetric variant) up to a certain additive or multiplicative factor if the cost of the bijection it outputs differs from the optimal cost by at most that additive or multiplicative factor.

The reduction from Hitting Set to approximate EMD matching will go through the variants of OV defined next.

The
Maximum Orthogonal Matching (MOM) problem is defined as follows: given two sets
A, B ⊆ {0,1}^{d(n)} with |A| ≤ |B| = n, find an injection π : A → B which maximizes |{a ∈ A : a · π(a) = 0}|.

And the Find-OV problem is defined as follows: given two sets A, B ⊆ {0,1}^{d(n)} with |A| = |B| = n, find the set S ⊆ A of vectors a ∈ A such that there exists some b ∈ B with a · b = 0. An algorithm solves Find-OV up to an additive error of t if it returns a set S′ ⊆ S for which |S′| ≥ |S| − t.

We will apply the following theorem from [12] to our low-dimensional hardness result for exact EMD:
Theorem 2.1 ([12]). Assuming OVC, there is a constant c > 1 such that Bichromatic ℓ2-Closest Pair in c^{log* n} dimensions requires n^{2−o(1)} time, with vectors of O(log n)-bit entries.

3 Hardness of Exact EMD

To prove hardness of the exact EMD problem under the Orthogonal Vectors Conjecture, we reduce from the bichromatic closest pair problem, and then apply Theorem 2.1 due to [12]. The intuition for the reduction is as follows: given two sets A and B of n vectors, we would like to augment set A with n − 1 copies of a single vector that is equidistant from all vectors of B, and much closer to B than A is. Similarly, we would like to augment set B with n − 1 copies of a vector that is equidistant from all vectors of A, and much closer to A than B is. If this were possible, then the minimum cost matching between the augmented sets would match only one pair of the original sets: the desired closest pair.

Unfortunately, it is in general impossible to find a vector equidistant from n vectors in d ≪ n dimensions. But this can be circumvented by embedding the vectors in a slightly higher-dimensional space, and adjusting coordinates in the "free" dimensions to ensure that an equidistant vector exists. So long as the free dimensions used to adjust set A are disjoint from the free dimensions used to adjust set B, the inner products between A and B are unaffected, and the distances change in an accountable way.

Since we are working in the ℓ2 norm, we will need the following simple lemma, which shows that any integer can be efficiently decomposed as a sum of a constant number of perfect squares.

Lemma 3.1.
For any ρ > 0 and any positive integer m, there is an O(m^ρ)-time algorithm to decompose m as a sum of O(log(1/ρ)) perfect squares.

Proof. Here is the algorithm: repeatedly find the largest square which does not push the total above m, until the remainder does not exceed m^{ρ/2}. Then compute a minimal square decomposition of the remainder by dynamic programming.

The first, greedy phase takes O(polylog(m)) time and finds O(log(1/ρ)) perfect squares which sum to some m′ with m − m^{ρ/2} ≤ m′ ≤ m. The second, dynamic programming phase takes O(m^ρ) time (even naively). By Lagrange's four-square theorem, a decomposition of m − m′ into at most four perfect squares is found. □

Now we describe the main reduction of this section. We use shorthand notation to define vectors more concisely: for example, a^x b^y c^z refers to an (x + y + z)-dimensional vector with value a in the first x dimensions, b in the next y dimensions, and c in the next z dimensions.

Theorem 3.2.
Let d = d(n) ≤ n be a dimension, and let k > 1 be a constant. There is a constant c = c(k) for which the following holds. Suppose that there is an algorithm which computes the ℓ2 earth mover distance between sets A′, B′ ⊆ [0, n^{4k}]^{2d+2c+2} of size O(n) in O(n^{2−ǫ}) time. Then bichromatic closest pair between sets A, B ⊆ [1, n^k]^d of size n can be computed in O(n^{2−ǫ}) time as well.

Proof. Set ρ = 1/(16k), and let c = O(log(1/ρ)) be the constant from Lemma 3.1 bounding the number of perfect squares in a decomposition. Let A and B be two sets of vectors from {1, ..., n^k}^d, and let N = n^{4k}. Our goal is to compute min_{a∈A, b∈B} ‖a − b‖.

We can assume without loss of generality that ‖a‖² and ‖b‖² are odd for all a ∈ A and b ∈ B: for instance, we can replace each vector z = (z_1, ..., z_d) by (2z_1, ..., 2z_d, 1), which makes every squared norm odd, increases the dimension by one, and scales all squared distances by exactly 4.

Now we construct sets A′ and B′ of (2d + 2c + 2)-dimensional vectors as follows. Let u = 0^d (1 0^c) 0^{c+1} 0^d and let v = N^d 0^{c+1} (1 0^c) 0^d (parentheses for clarity). Add n − 1 copies of u to B′ and n − 1 copies of v to A′. For each a ∈ A, add the following vector to A′, where we will define the vector adj_a ∈ Z^{c+1} later:

a′ = f(a) = 0^d (adj_a) 0^{c+1} a.

Similarly, for each b ∈ B, add the following vector to B′, where we will define adj_b ∈ Z^{c+1} later:

b′ = g(b) = N^d 0^{c+1} (adj_b) b.

Now pick any a ∈ A. We construct adj_a so that the following equalities are both satisfied:

‖a′ − u‖² = n^{4k} d² = ‖adj_a‖².

Define the first coordinate adj_a(0) = (‖a‖² + 1)/2. Since ‖a‖² ≤ n^{2k} d, we can then use Lemma 3.1 to find c integers adj_a(1), ..., adj_a(c) so that ‖adj_a‖² = n^{4k} d². Furthermore,

‖a′ − u‖² = ‖adj_a − (1 0^c)‖² + ‖a‖² = ‖adj_a‖² − 2 adj_a(0) + 1 + ‖a‖² = n^{4k} d².

For each b ∈ B, we can similarly construct adj_b so that ‖b′ − v‖² = ‖adj_b‖² = n^{4k} d². We claim that

EMD(A′, B′) = 2(n − 1) n^{2k} d + min_{a∈A, b∈B} √(N² d + 2 n^{4k} d² + ‖a − b‖²).
To prove this claim, notice that ‖u − v‖ ≥ N√d and ‖a′ − b′‖ ≥ N√d for every a′ ∈ A′ \ {v} and b′ ∈ B′ \ {u}, whereas ‖a′ − u‖ ≪ N√d/n and ‖b′ − v‖ ≪ N√d/n. This means that the optimal matching between A′ and B′ will minimize the number of (u, v) and (a′, b′) edges. Hence, exactly one element of A′ \ {v} is matched to an element of B′ \ {u}. So if M denotes this optimal matching, and x′ = f(x) ∈ A′ is matched with y′ = g(y) ∈ B′, then the cost of M is

cost(M) = Σ_{a′ ∈ A′ \ {v, x′}} ‖a′ − u‖ + Σ_{b′ ∈ B′ \ {u, y′}} ‖b′ − v‖ + ‖x′ − y′‖
        = 2(n − 1) n^{2k} d + √(N² d + ‖adj_x‖² + ‖adj_y‖² + ‖x − y‖²)
        = 2(n − 1) n^{2k} d + √(N² d + 2 n^{4k} d² + ‖x − y‖²).

The claim follows. So the algorithm is simple: run the EMD algorithm on (A′, B′) and use the computed matching cost to find the closest pair distance, according to the above formula.

The time complexity of constructing A′ and B′ is O(n^{5/4} poly(d)), dominated by computing a square decomposition for each vector. Since A′ and B′ are sets of O(n) vectors in Z^{2d+2c+2} with entries bounded by max(N, n^{2k} d) ≤ n^{4k}, the EMD between A′ and B′ can be computed in O(n^{2−ǫ}) time. Thus, the overall algorithm takes O(n^{2−ǫ}) time. □

Theorem 1.1 follows from the above reduction and Theorem 2.1.

4 Hardness of Approximate EMD Matching

In this section we prove hardness of approximation for the EMD matching problem, in which the approximately optimal matching must be reported. Note that the techniques from the previous section do not immediately generalize to this setting, since the reduction in Theorem 3.2 is not approximation-preserving. A multiplicative error of 1 + ǫ in the EMD algorithm would induce an additive error of Õ(ǫ n^{4k}) in the closest pair algorithm, due to the large integers constructed in the reduction.
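As an aside, the square-decomposition subroutine of Lemma 3.1, used in the reduction above to construct the adj vectors, can be sketched as follows. This is an illustrative implementation with loose, assumed constants, not the paper's exact procedure.

```python
import math

def square_decomposition(m, rho=0.25):
    """Decompose positive integer m into a short list of square roots
    (whose squares sum to m), following the greedy-then-DP outline of
    Lemma 3.1: peel off the largest square until the remainder is small,
    then finish with a dynamic program (Lagrange's four-square theorem
    guarantees the remainder needs at most four squares)."""
    roots = []
    cutoff = max(4, int(m ** (rho / 2)))  # greedy until remainder is small
    while m > cutoff:
        s = math.isqrt(m)
        roots.append(s)
        m -= s * s
    # DP: fewest squares summing to each value up to the remainder.
    best = [0] + [math.inf] * m
    choice = [0] * (m + 1)
    for v in range(1, m + 1):
        s = 1
        while s * s <= v:
            if best[v - s * s] + 1 < best[v]:
                best[v] = best[v - s * s] + 1
                choice[v] = s
            s += 1
    while m > 0:  # reconstruct the DP solution
        roots.append(choice[m])
        m -= choice[m] * choice[m]
    return roots

m = 1234567
roots = square_decomposition(m)
print(sum(s * s for s in roots) == m)  # True
```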
A bucketing scheme, ensuring that the diameter of the input point set is within a constant factor of the closest pair distance, could eliminate the dependence on the values of the input coordinates, yielding a multiplicative error of only 1 + Õ(ǫn). However, (1 + ǫ)-approximate closest pair is only quadratically hard for ǫ = o(1) [21]; for any constant ǫ > 0, there is a subquadratic (1 + ǫ)-approximation algorithm [16, 4]. Thus, the above arguments would only yield (1 + Õ(1/n))-approximate hardness. Furthermore, the factor-of-n loss intuitively feels intrinsic to the approach of reducing from closest pair, since the EMD is a sum of n distances. Thus, a different approach seems necessary if we are to achieve hardness for ǫ = ω(1/n).

Our method broadly consists of two steps. First, we show that EMD can encode orthogonality, by reducing approximate Maximum Orthogonal Matching (the problem of reporting a maximum matching in the implicit graph with an edge for each orthogonal pair) to approximate EMD matching. Second, we show that approximate Maximum Orthogonal Matching can solve an instance (A, B) of Hitting Set, by finding an orthogonal pair (a, b) for every a ∈ A when possible, even if the set of orthogonal pairs does not constitute a matching.

We start by proving that asymmetric EMD matching reduces to EMD matching for the appropriate choices of error bounds. The reduction pads the smaller set of vectors A with a vector that is equidistant from the opposite set B, so that its contribution to the earth mover distance can be accounted for. Of course, it is first necessary to transform the vectors so that an equidistant vector exists.

Lemma 4.1.
Suppose that (1 + ǫ)-approximate EMD matching in 2d dimensions can be solved in T(n, 2d) time. Then (1 + ǫ)-approximate asymmetric EMD matching in d dimensions can be solved, with an additional additive error of nǫ√d, in T(n, 2d) time.

Proof. Let A, B ⊆ {0,1}^d with |A| ≤ |B|. Define sets A′, B′ ⊆ {0,1}^{2d} by mapping each a ∈ A to the vector

(a_1, ..., a_d, 1 − a_1, ..., 1 − a_d)

and similarly mapping each b ∈ B to (b_1, ..., b_d, 1 − b_1, ..., 1 − b_d). Then add |B| − |A| copies of the zero vector to A′. Now |A′| = |B′|, so we can run the approximate EMD algorithm on A′ and B′ to find a bijection π : A′ → B′ such that

Σ_{a′ ∈ A′} ‖a′ − π(a′)‖ ≤ (1 + ǫ) EMD(A′, B′).

Each vector b′ ∈ B′ has ‖b′‖² = d, so the distance from the zero vector to each of its matches is exactly √d. And for any a ∈ A and b ∈ B which map to a′ ∈ A′ and b′ ∈ B′, we have ‖a′ − b′‖² = 2‖a − b‖². Hence, the cost of π is

Σ_{a′ ∈ A′} ‖a′ − π(a′)‖ = (|B| − |A|)√d + √2 · Σ_{a ∈ A} ‖a − π(a)‖,

and the optimal cost is

EMD(A′, B′) = (|B| − |A|)√d + √2 · EMD(A, B).

It follows that

Σ_{a ∈ A} ‖a − π(a)‖ ≤ (ǫ/√2)(|B| − |A|)√d + (1 + ǫ) EMD(A, B),

which is within the stated error bound. □

Next, we reduce approximate Maximum Orthogonal Matching to approximate asymmetric EMD matching. The general idea, given input sets (A, B), is to deform A and B so that orthogonal pairs (a, b) are mapped to pairs (a′′, b′′) at some distance d_1, and all other pairs are mapped to pairs at distance at least d_2 > d_1. Then we add |A| auxiliary vectors to B, each at distance exactly d_2 from all vectors in A. Thus, in an optimal matching, each vector of A is matched either with an orthogonal vector at distance d_1, or with some vector at distance exactly d_2. This introduces a nonlinearity, ensuring that in the additive matching cost, an orthogonal pair's contribution is not "cancelled out" by the contribution of a pair with dot product 2, for instance. A similar trick was used by [8] in the context of edit distance, another "additive" metric.

The following simple lemma will be useful:

Lemma 4.2.
There are maps φ_1, φ_2 : {0,1}^d → {0,1}^{3d} such that for any a, b ∈ {0,1}^d,

φ_1(a) · φ_2(b) = d − (a · b).

Furthermore, the maps can be evaluated in O(d) time.

Proof. Each dimension expands into three dimensions as follows:

a_i ↦ (φ_1(a)_{3i−2}, φ_1(a)_{3i−1}, φ_1(a)_{3i}) = (a_i, 1 − a_i, 1 − a_i),
b_i ↦ (φ_2(b)_{3i−2}, φ_2(b)_{3i−1}, φ_2(b)_{3i}) = (1 − b_i, b_i, 1 − b_i).

Then for each i,

Σ_{j=3i−2}^{3i} φ_1(a)_j φ_2(b)_j = a_i(1 − b_i) + (1 − a_i) b_i + (1 − a_i)(1 − b_i) = 1 − a_i b_i.

Summing over i = 1, ..., d, we get φ_1(a) · φ_2(b) = d − (a · b), as desired. □

Lemma 4.3.
Suppose that (1 + ǫ)-approximate asymmetric EMD matching in D = O(d) dimensions can be solved, with an additional additive error of nǫ√D, in T(n, D) time. Then the Maximum Orthogonal Matching problem in d dimensions can be solved up to an additive error of O(nǫd) in T(2n, O(d)) time.

Proof. Let A, B ⊆ {0,1}^d with |A| ≤ |B| = n. Define A′, B′ ⊆ {0,1}^{3d} by A′ = φ_1(A) and B′ = φ_2(B), where φ_1, φ_2 are as defined in Lemma 4.2. Let d′ = 3d for convenience.

Now we construct sets A′′ and B′′ from A′ and B′ by appending coordinates. First, we add 2d′ dimensions to ensure that ‖a′′‖² = ‖b′′‖² = d′ for every a′′ ∈ A′′ and b′′ ∈ B′′, without changing the inner products (the padding coordinates used for A′′ are disjoint from those used for B′′). Next, we add marker coordinates: extend each a′′ ∈ A′′ with a 1 in one fresh coordinate, and each b′′ ∈ B′′ with a 1 in another fresh coordinate, so that all squared norms become d′ + 1 while the inner products remain unchanged. Finally, augment B′′ with |A| copies of a vector v which is supported on d + 3 additional fresh coordinates (where every a′′ is zero), so that ‖v‖² = d + 3.

Notice that for every a ∈ A and b ∈ B corresponding to a′′ ∈ A′′ and b′′ ∈ B′′,

‖a′′ − b′′‖² = ‖a′′‖² + ‖b′′‖² − 2 a′′ · b′′ = 2(d′ + 1) − 2(d − a · b) = 2 a · b + 4d + 2,

and

‖a′′ − v‖² = ‖a′′‖² + ‖v‖² = (d′ + 1) + (d + 3) = 4d + 4.

Now we run the approximate asymmetric EMD matching algorithm on A′′ and B′′, yielding an injection π : A′′ → B′′ such that

Σ_{a′′ ∈ A′′} ‖a′′ − π(a′′)‖ ≤ |B′′| ǫ O(√d) + (1 + ǫ) EMD(A′′, B′′).

For each a′′ ∈ A′′, if ‖a′′ − π(a′′)‖² > 4d + 4, then we can reset π(a′′) = v, preserving injectivity (there are |A| copies of v) and not increasing the cost of the matching. Therefore we may assume that every edge has cost either √(4d + 2) or √(4d + 4). In particular, if there are m orthogonal pairs in the matching, the total cost is

Σ_{a′′ ∈ A′′} ‖a′′ − π(a′′)‖ = m √(4d + 2) + (|A| − m) √(4d + 4).

By the same argument as above, the minimum cost matching is obtained by maximizing the number of orthogonal pairs. So if the maximum possible number of orthogonal pairs in a matching is m_OPT, then

EMD(A′′, B′′) = m_OPT √(4d + 2) + (|A| − m_OPT) √(4d + 4).

Substituting these expressions into the approximation guarantee and solving, using √(4d + 4) − √(4d + 2) = Θ(1/√d), we get that m ≥ m_OPT − O(ǫnd), as desired. □

In the above lemma we assumed that we are given an algorithm for asymmetric EMD matching which has both a multiplicative error of 1 + ǫ and an additive error of nǫ√d, since this is the error introduced by the reduction to (symmetric) EMD. However, we are also interested in the hardness of (1 + ǫ)-approximate asymmetric EMD matching in its own right. Removing the additive error from the hypothesized algorithm in Lemma 4.3 directly translates into an improved Maximum Orthogonal Matching algorithm, with an additive error of O(ǫ|A|d) instead of O(ǫnd), where n = |A| + |B|:

Lemma 4.4.
Suppose that there is an algorithm which solves (1 + ǫ)-approximate asymmetric EMD matching in T(|A| + |B|, d) time on inputs A, B ⊆ {0,1}^d. Then the Maximum Orthogonal Matching problem can be solved up to an additive error of O(ǫ|A|d) in T(2n, O(d)) time.

Now we could reduce OV to approximate Maximum Orthogonal Matching. The proof of the following theorem is given in Appendix B for completeness.
Theorem 4.5.
Let d = ω(log n). Under the Orthogonal Vectors Conjecture, for any ǫ > 0 and δ ∈ (0, 1), (1 + 1/n^δ)-approximate EMD matching in {0,1}^d cannot be solved in O(n^{2δ−ǫ}) time.

However, Theorem 4.5 does not prove quadratic hardness for any approximation factor larger than (1 + 1/n), and in fact it breaks down completely for (1 + 1/√n)-approximate EMD matching.

Instead, we reduce Hitting Set to approximate Maximum Orthogonal Matching, through approximate Find-OV. These two problems are structurally similar; the technical difficulty is that Find-OV may require finding many orthogonal pairs even when the largest orthogonal matching is small, in which case applying the Maximum Orthogonal Matching algorithm would result in little progress. We resolve this with the following insight: if many vectors in set A are orthogonal to at least one vector in set B but there is no large orthogonal matching, then some vector in set B is orthogonal to many vectors in A. But these vectors can be found efficiently by sampling.

In the proof of the following theorem we formalize this idea.

Theorem 4.6.
Let d = d(n) be a dimension. Suppose that the Maximum Orthogonal Matching problem can be solved up to an additive error of E(|A|, |B|) in O(n^{2−ǫ} poly(d)) time on inputs A, B ⊆ {0,1}^d. Then for any sufficiently small α > 0 there is some γ > 0 such that Find-OV can be solved, with high probability, up to an additive error of E(n, 2n^{1+α}) in O(n^{2−γ} poly(d)) time.

Proof. Let A, B ⊆ {0,1}^d with |A| = |B| = n, and let 0 < α < 1. Let the degree of a vector a ∈ A, denoted d(a), be the number of b ∈ B which are orthogonal to a (and symmetrically for b ∈ B). The algorithm for Find-OV consists of three steps:

1. For every a ∈ A, sample Õ(n^{1−α/2}) vectors from B to get an estimate d̂(a) of d(a). Mark and remove the vectors for which d̂(a) ≥ n^{α/2}.

2. Next, for every b ∈ B, sample Õ(n^{1−α}) vectors from A to get an estimate d̂(b) of d(b). Let B_large ⊆ B be the set of vectors for which d̂(b) ≥ n^α. For each b ∈ B_large, iterate over A, and mark and remove each a ∈ A for which a · b = 0. Then remove B_large from B.

3. Run the Maximum Orthogonal Matching algorithm on the remaining set A and the multiset consisting of 2n^α copies of each remaining b ∈ B. This produces a set of pairs (a_i, b_i) with a_i · b_i = 0. Output the union of {a_i}_i and the set of all vectors marked and removed from A in the previous steps.

In the first step, a Chernoff bound shows that, with high probability, every vector with d(a) ≥ n^{α/2} is marked and removed (and every marked vector was witnessed to be orthogonal to some sampled b, so it indeed belongs to the output set). Summing over the remaining vectors,

Σ_{a ∈ A} d(a) = Σ_{b ∈ B} d(b) ≤ n^{1+α/2}.

In the second step, with high probability B_large contains no b ∈ B for which d(b) ≤ n^α/2, by a Chernoff bound on each such b ∈ B. Therefore |B_large| ≤ O(n^{1−α/2}). Furthermore, with high probability B_large contains every b ∈ B for which d(b) ≥ 2n^α.

So after the first two steps, every remaining vector b ∈ B has degree at most 2n^α. Suppose there are t vectors a ∈ A with positive degree, and t′ of these are found in the first two steps. Then, by the degree bound and Hall's theorem, the remaining t − t′ vectors inject into the multiset of 2n^α copies of B. Therefore there is an orthogonal matching of size at least t − t′. By the approximation guarantee of the Maximum Orthogonal Matching algorithm, we find an orthogonal matching of size at least t − t′ − E(n, 2n^{1+α}) in step 3. Overall, we find at least t − E(n, 2n^{1+α}) vectors with positive degree, which gives the desired approximation guarantee.

The time complexity is O((n^{2−α/2} + n^{(2−ǫ)(1+α)}) poly(d)), which is subquadratic in n for sufficiently small α. □

As the final step of the reduction, we show that approximate Find-OV can solve Hitting Set. Note that exact Find-OV obviously solves Hitting Set. It is also clear that Find-OV with an additive error of n^{1−ǫ} solves Hitting Set: simply run Find-OV, and then exhaustively check the remaining unpaired vectors of A, unless there are more than n^{1−ǫ} unpaired vectors, in which case there must be a hitting vector.

To reduce Hitting Set to Find-OV with an additive error of Θ(n), the essential idea is simply to run Find-OV repeatedly on the remaining unpaired vectors. If the Find-OV algorithm has an additive error of n/2, then on an instance (A, B) with no hitting vector, the algorithm will find orthogonal pairs for at least half of the vectors of A. Naively, we would like to recurse on the remaining half of A. Unfortunately, the set B cannot similarly be halved, so the error bound in the next step would not be halved, and the algorithm might make no further progress.

The workaround is to duplicate every unpaired vector of A before recursing: if every remaining vector of A is duplicated once per phase, then each phase again finds matches for at least half of the remaining copies, and the process terminates within O(log n) phases.

Theorem 4.7. Suppose that Find-OV in d dimensions can be solved up to an additive error of n/2 in T(n, d) time. Then Hitting Set in d dimensions can be solved in O((T(n, d) + nd) log n) time.

Proof. Let
A, B ⊆ { , } d with | A | = | B | = n . Our hitting set algorithm consists of t = ⌈ log n ⌉ + 1 phases.Initialize R = A .In phase i ≥
1, run Find-OV on (2 i − R i , B ), where 2 i R i is the multiset with 2 i copies of each vectorin R i . Let P ⊆ A be the output multiset and let P ′ be the corresponding set (removing duplicates). Set R i +1 = R i \ P ′ . If | R i +1 | > n/ i , report failure (i.e. there is a hitting vector). Otherwise, proceed to thenext phase. If phase t is complete, report success (i.e. no hitting vector).Suppose that the algorithm reports success. Then after phase t , we have R t +1 ≤ n/ t <
1. Then for every a ∈ A there was some phase i in which a was removed from R_i, and therefore was orthogonal to some b ∈ B. So there is no hitting vector.

Suppose that the algorithm reports failure in phase i. Then |R_i| ≤ n/2^{i−1} and |R_{i+1}| > n/2^i, so |P′| < n/2^i. Therefore |P| ≤ 2^{i−1}|P′| < n/
2. By the Find-OV approximation guarantee, not every element of R_i is orthogonal to an element of B. So there is a hitting vector.

The time complexity is dominated by O(log n) applications of Find-OV on inputs of size O(n), along with O(nd) extra processing in each phase. Thus, the time complexity is O((T(n, d) + nd) log n).

The next theorem shows that hardness for approximate EMD matching (conditioned on the Hitting Set Conjecture) follows from chaining together the above reductions.

Theorem 4.8.
If there are any ǫ, δ > 0 such that (1 + 1/n^δ)-approximate EMD matching can be solved in O(n^{2−ǫ}) time for some dimension d = ω(log n), then the Hitting Set Conjecture is false.

Proof. Fix d = ω(log n), and assume without loss of generality that d(n) is polylogarithmic. Let ǫ, δ > 0 be such that (1 + 1/n^δ)-approximate EMD matching can be solved in O(n^{2−ǫ}) time. Then (1 + 1/n^δ)-approximate asymmetric EMD can be solved with an additional additive error of n^{1−δ}√d with the same time complexity, by Lemma 4.1. Hence, the Maximum Orthogonal Matching problem can be solved with an additive error of n^{1−δ}d in the same time, by Lemma 4.3. Applying Theorem 4.6 with parameter α = δ, we get a randomized algorithm for Find-OV with an additive error of O(n^{1−δ}d^δ) and time complexity O(n^{2−γ}) for some γ >
0. For sufficiently large n, the error is at most n/
2. Thus, we can apply Theorem 4.7 to get a randomized algorithm for Hitting Set with time complexity Õ(n^{2−γ}), which contradicts the Hitting Set Conjecture.

Furthermore, we obtain stronger hardness of approximation for asymmetric EMD matching:

Theorem 4.9.
Let d = ω(log n) and η = 1/ω(log n). If there is a truly subquadratic (1 + η)-approximation algorithm for asymmetric EMD matching in d dimensions, then the Hitting Set Conjecture is false.

Proof. Fix d′ = ω(log n) and η = 1/ω(log n) and ǫ >
0. Suppose that there is an O(n^{2−ǫ}) time algorithm which achieves a (1 + η) approximation for asymmetric EMD matching in d′ dimensions. Set d = min(d′, √((log n)/η)). Since R^d embeds isometrically in R^{d′}, the algorithm also achieves a (1 + η) approximation for asymmetric EMD in d dimensions.

By Lemma 4.4, the Maximum Orthogonal Matching problem can be solved up to an additive error of O(η|A|d) in O(d) dimensions and O(n^{2−ǫ}) time. By Theorem 4.6, there is some γ > 0 such that Find-OV can be solved up to an additive error of O(ηnd) in O(d) dimensions and O(n^{2−γ}) time. By choice of d we have ηnd = o(n), so for sufficiently large n the algorithm achieves additive error of at most n/
2. Therefore, by Theorem 4.7, Hitting Set can be solved in O(d) dimensions and Õ(n^{2−γ}) time. Since d = ω(log n), this contradicts the Hitting Set Conjecture.

Acknowledgments.
I want to thank Piotr Indyk and Arturs Backurs for numerous helpful discussions and guidance. I am also grateful to an anonymous reviewer for pointing towards Theorem 1.2 and its proof.

References

[1] Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In
Proceedings of the 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), FOCS '15, pages 59–78, Washington, DC, USA, 2015. IEEE Computer Society. doi:10.1109/FOCS.2015.14.

[2] Amir Abboud, Virginia Vassilevska Williams, and Joshua Wang. Approximation and fixed parameter subquadratic algorithms for radius and diameter in sparse graphs. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '16, pages 377–391, Philadelphia, PA, USA, 2016. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=2884435.2884463.

[3] Pankaj K. Agarwal, Kyle Fox, Debmalya Panigrahi, Kasturi R. Varadarajan, and Allen Xiao. Faster algorithms for the geometric transportation problem. In Boris Aronov and Matthew J. Katz, editors, 33rd International Symposium on Computational Geometry (SoCG 2017), volume 77 of
Leibniz International Proceedings in Informatics (LIPIcs), pages 7:1–7:16, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. URL: http://drops.dagstuhl.de/opus/volltexte/2017/7234, doi:10.4230/LIPIcs.SoCG.2017.7.

[4] Josh Alman, Timothy M. Chan, and Ryan Williams. Polynomial representations of threshold functions and algorithmic applications. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 467–476, Oct 2016. doi:10.1109/FOCS.2016.57.

[5] Jason Altschuler, Jonathan Weed, and Philippe Rigollet. Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS '17, pages 1961–1971, USA, 2017. Curran Associates Inc. URL: http://dl.acm.org/citation.cfm?id=3294771.3294958.

[6] Alexandr Andoni, Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev. Parallel algorithms for geometric graph problems. In
Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, STOC '14, pages 574–583, New York, NY, USA, 2014. ACM. doi:10.1145/2591796.2591805.

[7] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.

[8] Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC '15, pages 51–58, New York, NY, USA, 2015. ACM. doi:10.1145/2746539.2746612.

[9] Amitabh Basu. Open problem: Maximum weighted assignment problem. In Workshop: Combinatorial Optimization, Oberwolfach Report 50/2018, page 44, 2018. doi:10.4171/OWR/2018/50.

[10] Nicolas Bonneel, Michiel van de Panne, Sylvain Paris, and Wolfgang Heidrich. Displacement interpolation using Lagrangian mass transport.
ACM Transactions on Graphics, 30(6):158:1–158:12, December 2011. doi:10.1145/2070781.2024192.

[11] Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Proceedings of the 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), FOCS '15, pages 79–97, Washington, DC, USA, 2015. IEEE Computer Society. doi:10.1109/FOCS.2015.15.

[12] Lijie Chen. On the hardness of approximate and exact (bichromatic) maximum inner product. In Proceedings of the 33rd Computational Complexity Conference, CCC '18, pages 14:1–14:45, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. doi:10.4230/LIPIcs.CCC.2018.14.

[13] Rémi Flamary, Marco Cuturi, Nicolas Courty, and Alain Rakotomamonjy. Wasserstein discriminant analysis.
Machine Learning, 107(12):1923–1945, December 2018. doi:10.1007/s10994-018-5717-1.

[14] John Hopcroft and Richard Karp. An n^{5/2} algorithm for maximum matchings in bipartite graphs. SIAM Journal on Computing, 2(4):225–231, 1973. doi:10.1137/0202019.

[15] Piotr Indyk. A near linear time constant factor approximation for Euclidean bichromatic matching (cost). In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '07, pages 39–42, Philadelphia, PA, USA, 2007. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=1283383.1283388.

[16] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In
Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98, pages 604–613, New York, NY, USA, 1998. ACM. doi:10.1145/276698.276876.

[17] Andrey Boris Khesin, Aleksandar Nikolov, and Dmitry Paramonov. Preconditioning for the geometric transportation problem. In Gill Barequet and Yusu Wang, editors, 35th International Symposium on Computational Geometry (SoCG 2019), volume 129 of Leibniz International Proceedings in Informatics (LIPIcs), pages 15:1–15:14, Dagstuhl, Germany, 2019. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. URL: http://drops.dagstuhl.de/opus/volltexte/2019/10419, doi:10.4230/LIPIcs.SoCG.2019.15.

[18] Yin Tat Lee and Aaron Sidford. Path finding II: An Õ(m√n) algorithm for the minimum cost flow problem. arXiv preprint arXiv:1312.6713, 2013.

[19] Yin Tat Lee and Aaron Sidford. Path finding methods for linear programming: Solving linear programs in Õ(√rank) iterations and faster algorithms for maximum flow. In Proceedings of the 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, FOCS '14, pages 424–433, Washington, DC, USA, 2014. IEEE Computer Society. doi:10.1109/FOCS.2014.52.

[20] Jonas Mueller and Tommi Jaakkola. Principal differences analysis: Interpretable characterization of differences between distributions. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS '15, pages 1702–1710, Cambridge, MA, USA, 2015. MIT Press. URL: http://dl.acm.org/citation.cfm?id=2969239.2969429.

[21] Aviad Rubinstein. Hardness of approximate nearest neighbor search. In
Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, pages 1260–1268, New York, NY, USA, 2018. ACM. doi:10.1145/3188745.3188916.

[22] Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas. The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision, 40(2):99–121, Nov 2000. doi:10.1023/A:1026543900054.

[23] Roman Sandler and Michael Lindenbaum. Nonnegative matrix factorization with earth mover's distance metric for image analysis.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(8):1590–1602, Aug 2011. doi:10.1109/TPAMI.2011.18.

[24] R. Sharathkumar and Pankaj K. Agarwal. A near-linear time ǫ-approximation algorithm for geometric bipartite matching. In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, STOC '12, pages 385–394, New York, NY, USA, 2012. ACM. doi:10.1145/2213977.2214014.

[25] Justin Solomon, Fernando de Goes, Gabriel Peyré, Marco Cuturi, Adrian Butscher, Andy Nguyen, Tao Du, and Leonidas Guibas. Convolutional Wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics, 34(4):66:1–66:11, July 2015. doi:10.1145/2766963.

[26] Cédric Villani. Topics in Optimal Transportation. Number 58 in Graduate Studies in Mathematics. American Mathematical Society, 2003.

[27] Ryan Williams. A new algorithm for optimal 2-constraint satisfaction and its implications.
Theoretical Computer Science, 348(2):357–365, 2005. Automata, Languages and Programming: Algorithms and Complexity (ICALP-A 2004). doi:10.1016/j.tcs.2005.09.023.

[28] Ryan Williams. On the difference between closest, furthest, and orthogonal pairs: Nearly-linear vs barely-subquadratic complexity. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '18, pages 1207–1215, Philadelphia, PA, USA, 2018. Society for Industrial and Applied Mathematics. URL: http://dl.acm.org/citation.cfm?id=3174304.3175348.

[29] Virginia Vassilevska Williams. On some fine-grained questions in algorithms and complexity. In
Proceedings of the ICM, 2018.
A Hardness of Low-Rank Minimum Weighted Assignment
The methods we used to prove hardness of exact EMD in low dimensions can be adapted to prove hardness of minimum weighted assignment with low-rank weight matrices, under the Orthogonal Vectors Conjecture. In particular, we show in the following theorem that bichromatic closest pair in d dimensions can be reduced to minimum weighted assignment with a rank-O(d) weight matrix. The reduction algorithm uses the same input transformation as Theorem 3.2, and then solves minimum weighted assignment on the matrix M with entries M_ij = ‖A′_i − B′_j‖², where A′ and B′ are the transformed input sets. The key is that M has rank O(d), and its minimum weight assignment encodes the squared closest pair distance of the input, just as the EMD of the transformed input in Theorem 3.2 encoded the closest pair distance of the input.

Theorem A.1.
Fix a dimension d = d(n) ≤ n, and let ǫ > 0. Suppose that there is an algorithm which solves minimum weighted assignment in O(n^{2−ǫ}) time whenever the weight matrix has rank at most O(d). Then bichromatic closest pair in d dimensions can be solved in O(n^{2−ǫ}) time.

Proof. Let A and B be two sets of n vectors in d dimensions, with entries in {0, . . . , n^k} for some constant k >
0. Apply the transformation described in Theorem 3.2 to construct sets A′, B′ ⊆ {0, . . . , n^k}^{2d+2c+2}, where c is as defined in the proof of that theorem. Define

SQEMD(A′, B′) = min_σ Σ_{a′ ∈ A′} ‖a′ − σ(a′)‖²,

where σ ranges over all bijections from A′ to B′. Since ‖u − v‖² ≥ N²d and ‖a′ − b′‖² ≥ N²d for every a′ ∈ A′ \ {v} and b′ ∈ B′ \ {u}, whereas ‖a′ − u‖² ≪ N²d/n and ‖b′ − v‖² ≪ N²d/n, the optimal matching σ minimizes the number of (u, v) and (a′, b′) edges. In particular, exactly one element of A′ \ {v} is matched to an element of B′ \ {u}. Thus, paralleling the proof of Theorem 3.2, we get

SQEMD(A′, B′) = 2(n − 1)n^{2k}d + (N²d + 2n^{2k}d + min_{a ∈ A, b ∈ B} ‖a − b‖²).

Hence, to compute the bichromatic closest pair distance between A and B, it suffices to compute SQEMD(A′, B′). Representing A′ and B′ as n × (2d + 2c + 2) matrices, let M be the n × n matrix defined by M_ij = ‖A′_i − B′_j‖². Then, observing that

M_ij = Σ_{k=1}^{2d+2c+2} (A′_ik − B′_jk)² = Σ_{k=1}^{2d+2c+2} (A′_ik)² + Σ_{k=1}^{2d+2c+2} (B′_jk)² − 2 Σ_{k=1}^{2d+2c+2} A′_ik B′_jk,

we can write M as the sum of 2d + 2c + 4 rank-1 matrices, so rank(M) ≤ 2d + 2c + 4. So by assumption, the minimum weight perfect matching in the complete bipartite graph determined by M can be found in O(n^{2−ǫ} poly(d)) time. But the cost of the optimal matching is precisely SQEMD(A′, B′). Applying Theorem 2.1 completes the proof of Theorem 1.2.

B Proof of Theorem 4.5
The theorem follows immediately from the reduction from Maximum Orthogonal Matching to EMD matching shown in Section 4, together with the following proposition.
Proposition B.1.
Suppose the Maximum Orthogonal Matching problem can be solved up to an additive factor of n^δ in O(n^γ) time, where δ < 1/2. Then OV can be solved in O(n^{γ/(1−δ)}) time.

Proof. Let
A, B ⊆ {0, 1}^d with |A| = |B| = n. We construct multisets A′ and B′ which consist of 2n^{δ/(1−δ)} copies of each a ∈ A, and 2n^{δ/(1−δ)} copies of each b ∈ B, respectively. We then run our approximate Maximum Orthogonal Matching algorithm on A′ and B′. If any orthogonal pair is found, we return it; otherwise we return that there is no orthogonal pair.

Since |A′| = |B′| = 2n^{1/(1−δ)}, the time complexity of this algorithm is O(n^{γ/(1−δ)}). It is clear that if A and B have no orthogonal pair, then A′ and B′ have no orthogonal pair, so the algorithm correctly returns "no pair".

Suppose that there are a ∈ A and b ∈ B with a · b = 0 but the algorithm returns "no pair". Then the matching found by the algorithm had no orthogonal pairs. However, there is a matching consisting of 2n^{δ/(1−δ)} pairs. Since |B′|^δ < 2n^{δ/(1−δ)}, this contradicts the approximation guarantee of the Maximum Orthogonal Matching algorithm.

C Hardness of (k, k)-Find-OV

The (k, k)-Find-OV problem provides some sense of the relative "powers" of the Orthogonal Vectors Conjecture and the Hitting Set Conjecture, as well as another example of how the Hitting Set Conjecture can be used to explain hardness of approximation problems. Reducing from OV, we get the following hardness result, and it is not clear how to make any improvement. Note that this proof extends to the (1, k)-Find-OV problem, for which this lower bound is tight, due to a random sampling algorithm.

Proposition C.1.
Fix δ ∈ (0, 1). Assuming OVC, any algorithm for (n^δ, n^δ)-Find-OV requires Ω(n^{2−δ−o(1)}) time.

Proof. Suppose that there exists an O(n^{2−δ−ǫ}) time algorithm find for (n^δ, n^δ)-Find-OV. Here is an algorithm for OV: given sets A, B ⊆ {0, 1}^d with |A| = |B| = n, duplicate each a ∈ A and each b ∈ B exactly 2n^{δ/(2−δ)} times. If the original number of orthogonal pairs was r, then the new number is 4rn^{2δ/(2−δ)}. For r ≥
1, this exceeds 2(n · n^{δ/(2−δ)})^δ, so applying find yields a positive number of orthogonal vectors if and only if r >
0. It is easy to check that the time complexity is subquadratic.

On the other hand, under the Hitting Set Conjecture, we can obtain quadratic hardness. We state the result for k = √n; it extends naturally to any k = n^γ for γ ∈ (0, 1).

Theorem C.2.
If the (√n, √n)-Find-OV problem can be solved in O(n^{2−ǫ}) time for some ǫ > 0, then Hitting Set can be solved in O(n^{2−δ}) time for some δ > 0.

Proof. Let find be the presupposed algorithm for (√n, √n)-Find-OV. Set α = ǫ/
7. Let
A, B ⊆ {0, 1}^d with |A| = |B| = n. Without loss of generality, assume that no vector is all-zeroes. Here is an algorithm:

1. For each a ∈ A, randomly sample n^{1−α} vectors from B. If any of these is orthogonal to a, mark a and remove it from A, replacing it with an all-ones vector.

2. Set k = n^{1/2−α}. Partition A into sets A_1, . . . , A_k of approximately equal size, and similarly partition B into sets B_1, . . . , B_k. For each pair (A_i, B_j):

(a) Apply find to (A_i, B_j).

(b) If the output is not √(n/k) orthogonal pairs, then continue to the next pair (A_i, B_j).

(c) Otherwise, suppose that the output is {(a_m, b_m)}_{m=1}^{√(n/k)}. For each vector a ∈ {a_m}_{m=1}^{√(n/k)}, mark a and remove it from A_i (and from A), replacing it with an all-ones vector.

(d) Go to (a).

3. If the number of unmarked input vectors exceeds 2n^{3/4−α/2}, return "NO" and exit.

4. For each a ∈ A, if a is not the all-ones vector, iterate over all b ∈ B, and mark a if any b ∈ B is orthogonal.

5. Return "YES" if every vector originally in A is now marked, and "NO" otherwise.

We claim that this algorithm solves Hitting Set in strongly subquadratic time. Correctness is relatively simple: a vector a ∈ A is only marked by the above algorithm if some b ∈ B is found for which a · b = 0. Thus, if some a ∈ A is a hitting vector for B, then it is never marked, so the algorithm returns "NO". Conversely, suppose that every a ∈ A is orthogonal to some b ∈ B. Then the number of unmarked input vectors in Step 3 is at most the number of remaining orthogonal pairs. But each (A_i, B_j) contains at most 2√(n/k) orthogonal pairs after Step 2 finishes, so the number of remaining orthogonal pairs in Step 3 is at most k(2√(n/k)) = 2n^{3/4−α/2}. Thus, the algorithm continues to Step 4. Every a ∈ A which has not been marked by the end of Step 2 is tested against every b ∈ B in Step 4. Therefore every vector is marked, so the algorithm returns "YES".

Turning to time complexity, Step 1 takes O(n^{2−α}) time.
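Before bounding the remaining steps, the algorithm above can be sketched concretely. This is an illustration only: `brute_find` is a hypothetical brute-force stand-in for the assumed (√n, √n)-Find-OV algorithm, the partitioned calls of Step 2 are collapsed into a single oracle call, Step 3's early exit is omitted, and the marking/removal bookkeeping with all-ones vectors is replaced by a boolean array (equivalent for correctness).

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def has_hitting_vector(A, B, alpha=0.25, seed=0):
    """Return True iff some a in A is non-orthogonal to every b in B."""
    rng = random.Random(seed)
    n = len(A)
    marked = [False] * n

    # Step 1: for each a, sample roughly n^(1-alpha) vectors of B and
    # mark a if any sampled vector is orthogonal to it.
    sample_size = min(len(B), max(1, round(n ** (1 - alpha))))
    for i, a in enumerate(A):
        if any(dot(a, b) == 0 for b in rng.sample(B, sample_size)):
            marked[i] = True

    # Step 2 (collapsed): ask the Find-OV stand-in for orthogonal pairs
    # among the still-unmarked vectors, and mark the vectors it returns.
    def brute_find(indices):
        # Stand-in for the (sqrt(n), sqrt(n))-Find-OV oracle `find`.
        return [i for i in indices
                if any(dot(A[i], b) == 0 for b in B)]

    for i in brute_find([i for i in range(n) if not marked[i]]):
        marked[i] = True

    # Step 4: exhaustively test any remaining unmarked vectors.
    for i in range(n):
        if not marked[i]:
            marked[i] = any(dot(A[i], b) == 0 for b in B)

    # Step 5: a hitting vector exists iff some vector was never marked.
    return not all(marked)
```

With a real subquadratic `find`, the sampling of Step 1 caps the number of surviving orthogonal pairs, which is exactly what the complexity analysis below exploits.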
The complexity of Step 2 is dominated by the calls to find. For each pair (A_i, B_j) there is at most one call to find for which the output is not √(n/k) orthogonal pairs. Hence, there are at most k² = n^{1−2α} such "failed" calls. To bound the number of "successful" calls to find, for which the output is √(n/k) orthogonal pairs, note that after Step 1, with high probability each a ∈ A is orthogonal to at most n^α vectors b ∈ B, so the total number of orthogonal pairs is at most n^{1+α}. Each successful call eliminates √(n/k) = n^{1/4+α/2} orthogonal pairs, so there are at most n^{3/4+α/2} successful calls. Each call takes time O((n/k)^{2−ǫ}) = O(n^{(1/2+α)(2−ǫ)}), so the time complexity of Step 2 is asymptotically

(n^{1−2α} + n^{3/4+α/2}) · n^{(1/2+α)(2−ǫ)} = O(n^{2−ǫ/2}).

Step 3 takes negligible time. Finally, in Step 4, there are at most 2n^{3/4−α/2} vectors a ∈ A which are not the all-ones vector (since each of these is unmarked), so the complexity is O(n^{7/4−α/2}).

Hence, the overall time complexity is bounded by O(n^{2−ǫ/7}).
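The phase structure used here echoes the duplicate-and-recurse reduction of Theorem 4.7, which can also be sketched briefly. Again this is an illustration: `find_ov` is a hypothetical brute-force stand-in for the assumed approximate Find-OV algorithm; a real instantiation would call that algorithm, which may leave up to half of its input unpaired.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def find_ov(A, B):
    """Brute-force stand-in for approximate Find-OV: pairs every vector
    of A that has an orthogonal partner in B. The reduction tolerates a
    version of this that misses up to half of the input vectors."""
    pairs = []
    for a in A:
        for b in B:
            if dot(a, b) == 0:
                pairs.append((a, b))
                break
    return pairs

def hitting_set(A, B):
    """Theorem 4.7 reduction: True iff some a in A is non-orthogonal to
    every b in B (a hitting vector)."""
    n = len(A)
    t = max(1, math.ceil(math.log2(n))) + 1
    R = list(A)  # R_i: vectors of A with no orthogonal partner found yet
    for i in range(1, t + 1):
        # Phase i: run Find-OV with 2^(i-1) copies of each remaining vector.
        multiset = [a for a in R for _ in range(2 ** (i - 1))]
        paired = {tuple(a) for a, _ in find_ov(multiset, B)}
        R = [a for a in R if tuple(a) not in paired]
        if len(R) > n / 2 ** i:
            return True  # too little progress: a hitting vector must exist
    return False  # every vector of A was matched in some phase
```

With an exact stand-in the duplication is redundant; it matters precisely when `find_ov` may leave up to half of its input unpaired, which is the regime Theorem 4.7 addresses.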