Hardness Amplification of Optimization Problems
aa r X i v : . [ c s . CC ] A ug Hardness Amplification of Optimization Problems
Elazar GoldenbergThe Academic College of Tel Aviv-Yaffo [email protected]
Karthik C. S. ∗ Weizmann Institute of Science [email protected]
Abstract
In this paper, we prove a general hardness amplification scheme for optimizationproblems based on the technique of direct products.We say that an optimization problem Π is direct product feasible if it is possibleto efficiently aggregate any k instances of Π and form one large instance of Π suchthat given an optimal feasible solution to the larger instance, we can efficiently findoptimal feasible solutions to all the k smaller instances. Given a direct product feasi-ble optimization problem Π , our hardness amplification theorem may be informallystated as follows:If there is a distribution D over instances of Π of size n such thatevery randomized algorithm running in time t ( n ) fails to solve Π on α ( n ) fraction of inputs sampled from D ,then, assuming some relationships on α ( n ) and t ( n ) ,there is a distribution D ′ over instances of Π of size O ( n · α ( n )) such that every randomized algorithm running in time t ( n ) poly ( α ( n )) fails to solve Π on / fraction of inputs sampled from D ′ .As a consequence of the above theorem, we show hardness amplification of prob-lems in various classes such as NP -hard problems like Max-Clique, Knapsack, andMax-SAT, problems in P such as Longest Common Subsequence, Edit Distance, Ma-trix Multiplication, and even problems in TFNP such as Factoring and computingNash equilibrium. ∗ This work was partially supported by Irit Dinur’s ERC-CoG grant 772839. Introduction
The widely believed conjecture P = NP asserts that the class NP cannot be decidedefficiently on the worst-case. That is, no polynomial time algorithm can decide the sat-isfiability of a CNF formula on every instance. However, the worst case hardness of NP still does not clarify its average-case hardness: how hard is to decide the satisfiability ona uniformly random instance.Studying the average-case hardness of NP has a two-fold motivation. First, it mayprovide a more meaningful explanation than worst-case complexity about the intractabil-ity of NP -hard instances actually encountered in practice. In other words, if NP is hardonly on the worst-case, then the theory of worst-case complexity that has been exten-sively developed over the last fifty years, might be not be a good reflection of reality.Second, hardness on average is the cornerstone of modern cryptography as the secu-rity of any nontrivial cryptosystem requires some computational problem to be average-case hard (for some nice distribution). Additionally, showing average-case hardness forfunctions is a stepping stone towards proving strong derandomization results and theconstruction of pseudorandom generators.The study of hardness amplification is the task of connecting the worst-case andaverage-case hardness. More specifically, based on a worst-case hardness (assumption)one would like to prove the average-case hardness of the problem. A utopic theorem in the context of hardness amplification would assert that if a functionis hard in the worst-case then it implies the average-case hardness for the same functionagainst algorithms with essentially the same running time complexity. More formally itwould look as follows:
Utopic Theorem 1 (A Utopic Hardness Amplification Theorem) . Let { f n } n ∈ N be a fam-ily of functions. Assume that every algorithm running in time t ( n ) , fails to compute f n on atleast γ ( n ) fraction of inputs. Then there exists a family { g n } n ∈ N of functions, such that everyalgorithm running in time t ′ ( n ) , fails to compute g n on at least γ ′ ( n ) fraction of inputs.Ideally, we would like to achieve the above amplification for the following parameters γ ( n ) = O ( / n ) and γ ′ ( n ) = / − O ( / n ) ,2. t ′ ( n ) ≈ t ( n ) ,3. { f n } n ∈ N = { g n } n ∈ N . We briefly elaborate here why we would like the above three setting of parame-ters in our utopic hardness amplification theorem. Item 1 would yield a worst-case toaverage-case reduction, and therefore extend all the lower bounds and hardness resultsthat have been achieved in the theory of worst-case complexity for f to the average-case In order to succinctly specify the desirable parameters of a hardness amplification theorem, we assumehere that f n and g n are Boolean functions. g . In fact, achieving γ ′ ( n ) = / − O ( / n ) would imply that no algorithmrunning in time t ′ ( n ) can do much better than randomly guessing the output. Item 2would imply that our worst-case complexity lower bounds meaningfully translate tolower bounds in the average-case. Item 3 expresses the notion of self-reducability: ifwe are interested in understanding the average complexity of a problem, our hardnessamplification theorem should enable us to do so by analyzing the worst-case complexityof the same problem. In summary, obtaining a hardness amplification result satisfyingthe three items is in a sense an attempt to bridge the gap between theory and practice.Finally, we remark that our utopic theorems would gain more importance if the family offunctions for which we show hardness amplification are natural (in some broad sense).Specifically, if we prove such a theorem for the family of deciding satisfiability of CNF formulas, then we get that the assumption that P = NP implies that every poly-nomial time algorithm fails to decide satisfiability on slightly more that half of the CNF formulas – a highly non-trivial and very desirable result that would pave the way forthe construction of one-way functions from (weak) worst-case assumptions. However,as we wake up from the dream of a utopia, one may wonder if such a result can even beachieved [BT06b].Remarkably, nearly three decades ago, Lipton [Lip89, CPS99] proved the above typeof (utopic) theorem for the function of computing the permanent (a P -complete prob-lem) against probabilistic polynomial time algorithms. Trevisan and Vadhan [TV07] fol-lowing a line of works [BFNW93, IW97, STV01] were almost able to prove such an am-plification result for the class EXP (they couldn’t achieve Item 3). For the class NP weare far from proving a strong hardness amplification result, and there are some knownbarriers while trying to convert worst-case NP -hardness into average-case hardness (seee.g. [BT06a]). More recently, strong hardness amplification results have been proved forfunctions in P [BRSV17, GR17, GR18]. We also note that hardness amplification resultshave also been shown for one-way functions [Yao82, GIL +
90, BR13].Given the above state-of-the-art picture, we raise a few natural questions and ad-dress them in this paper. There are many problems that are hard in the worst-case buteasy on average. For example, 3-coloring is a well-known NP -hard problem, but it isan easy exercise to show that it can be solved in linear time with high probability ona random graph. This motivates us to distinguish within worst-case hard problems asto which of them remain hard on average. One way to go about this task is to identifywhich worst-case hard problem admits a hardness amplification theorem. For which problems can we amplify hardness?Can we identify a mathematical structure that allows us to amplify hardness?
The latter question has been implicitly addressed in literature (for example, if theproblem has algebraic structure like in the case of computing permanent [Lip89] orcounting k -cliques [GR18]), but are quite specific and not broad enough to capture theclass of problems that we believe are hard on average. In Section 1.2.1 we address theabove two questions.Next, we turn our attention to NP -hard problems. In a beautiful paper, O’Donnell[O’D04] initiated the study of (non-uniform) hardness amplification in NP . His resultwas improved by [HVV06] who showed that if for some function in NP (with n inputs)3e have that any s ( n ) size circuit fails to compute the function correctly on / poly ( n ) frac-tion of the inputs then, the hardness can be amplified to show that there is some functionin NP such that any s ( √ n ) Ω ( ) size circuit fails to compute the function on / − / s ( √ n ) Ω ( ) fraction of the inputs. However, the best uniform hardness amplification results (againstalgorithms as opposed to circuits) that have been achieved do not match the parametersof [HVV06]: Trevisan [Tre05, BKS06] improving on his previous work [Tre03] showedthat we can amplify hardness from / poly ( n ) to / − / polylog ( n ) for NP against random-ized polynomial time algorithms (later extended to deterministic algorithms in [GG11]).However, it is important to note that all these hardness amplification results are for deci-sion problems, and this leads us to our next question, do we gain anything by moving tosearch problems, or more precisely to the focus of this paper, to optimization problems? Can we improve our hardness amplification results for optimization problems?Can we prove stronger uniform hardness amplification results for
MaxSAT ? Arguably, optimization problems are as natural as decision problems, but are strictlyharder from the point of view of computational complexity. Does this mean we caneither give simpler proofs of hardness amplification for optimization problems or provestronger results? We address the above questions in Section 1.2.2.We now shift our focus to the class P . As mentioned earlier, we have strong worst-case to average-case results established for problems in P [BRSV17, GR18]. The draw-back however is that they are all for counting problems. This is indeed inherent as theunderlying technique these works use are the same as the one used to show worst-caseto average-case reduction for the permanent problem. While counting the number of k -cliques (the problem considered in [GR18]) is a natural problem, and therefore hard-ness amplification for that problem is interesting, it still leaves the door open for provinghardness amplification for the search problem of just finding one k -clique in a graph (aneasier problem and thus harder to amplify hardness). Can we prove hardness amplification results for natural search problems in P ? Moreover, there is a barrier [Abb19] to using the algebraic techniques of [BRSV17,GR18] to obtain hardness amplification for important problems studied in fine-grainedcomplexity such as computing the Longest Common Subsequence (
LCS ) and Edit Dis-tance (
Edit - Distance ) for a pair of strings. In particular, if these string problems can berepresented using low-degree polynomials, then we could obtain small speedups byusing the polynomial method [CW16], which would imply new circuit lower bounds[AHWW16]. This suggests we might need to look beyond these algebraic techniquesfor proving hardness amplification for these string problems. Is there a different tech-nique to prove hardness amplification in P ? We address these aforestated questions inSection 1.2.3. Our main contribution is a general hardness amplification theorem for optimizationproblems which we state in Section 1.2.1. Next, we apply our main theorem to vari-ous problems. In Section 1.2.2 we state our hardness amplification theorems for various4 P -hard problems such as Knapsack and MaxSAT . In Section 1.2.3 we state our hardnessamplification theorems for various string problems in P such as LCS and
Edit - Distance
Fi-nally, in Section 1.2.4 we state our hardness amplification theorems for various problemsin
TFNP (believed to not be in P ) such as Factoring and computing Nash equilibrium. Aggregation is a key tool in the field of hardness amplification. If a function f is hardto compute on a tiny fraction of the domain, then, intuitively, computing multiple in-stances of f in one shot should be hard on a larger fraction of the inputs. More formally,for a function f : [ N ] → Σ and k ∈ N , its k -direct product encoding is defined as afunction f ( k ) : [ N ] k → Σ k mapping each tuple ( x , . . . , x k ) into ( f ( x ) , . . . , f ( x k )) . Usingstandard techniques one can show a ‘direct product theorem’ stating that if f is hardagainst t ( n ) running-time algorithms on α ( n ) -fraction of the domain, then f ( k ) is hardagainst t ′ ( n ) running-time algorithms on ≈ k · α ( n ) -fraction of its domain. But in or-der to utilize such a direct product result, we need to be able to stitch k -instances into asingle (larger) instance. To address this task we introduce the following notion of directproduct feasibility. Definition 1.1 (Direct Product Feasibility; Informal statement of Definition 3.3) . Let Π bean optimization problem. We say that Π is ( S , T ) -direct product feasible if the exists a pair ofdeterministic algorithms ( Gen , Dec ) satisfying the following: • Gen takes as input k instances ( I , . . . , I k ) of Π each of size n and outputs an instance I ′ of Π of size S ( n , k ) . • Dec gets as input ( I , . . . , I k ) , the instance I ′ which is the output of Gen on input ( I , . . . , I k ) ,an optimal solution for I ′ , and i ∈ [ k ] . It outputs an optimal solution for the instance I i . • The running time of
Gen and
Dec is bounded by T ( n , k ) . Our main theorem is about hardness amplification for an arbitrary direct productfeasible problem Π . In particular we show that if Π is hard against t ( n ) running timerandomized algorithms on a tiny fraction of the domain, then Π is hard on a much largerfraction of the domain against randomized algorithms with a similar running time. Theorem 1.2 (Informal Statement of Theorem 3.4) . Let Π be ( S , T ) -direct product feasible.Let D ( n ) be an efficiently samplable distribution over the instances of Π of size n. Assume thefollowing: • Any t ( n ) running-time algorithm with success probability at least fails to compute anoptimal solution on at least α ( n ) -fraction of the inputs sampled from D . • Fix k = poly (( α ( n )) − ) . Then we have T ( n , k ) = o ( t ( n )) . In the formal definition of direct product feasibility, it is defined for a pair of optimization problems ( Π , Λ ) for technical reasons which will be addressed later in Section 1.2.3. In the case Π = Λ we formallycall it as self direct product feasible and this notion coincides with the informal definition given here. Formost of the applications given in this paper, self direct product feasibility notion suffices. We can (deterministically) decide the optimality of a given solution to any instance ino ( t ( n )) time.Then there exists an efficiently samplable distribution D ′ ( S ( n , k )) over instances of Π ofsize S ( n , k ) such that every t ( n ) running-time algorithm with success probability at least fails to compute an optimal solution on at least of the inputs sampled from D ′ . Naturally, the distribution D ′ is defined as follows: Draw k independent samples I , . . . , I k from D , and output Gen ( I , . . . , I k ) . The proof of our main theorem is based ona reduction using an oracle access to an algorithm that solves D ′ on 99% of the inputs,we convert it into an algorithm solving D on greater than 1 − α ( n ) fraction of the inputs.The reduction is uniform, so in case that the algorithms ( Gen , Dec ) are uniform we get auniform hardness amplification result.Another key point is that our hardness amplification is a self-reduction, i.e., if aproblem is somewhat hard against one distribution D , then the same problem is muchharder against a different distribution D ′ .To the best of our knowledge, this is the first result to study hardness amplificationfor optimization problems. It opens avenues to prove results in various subclasses as wewill see in subsequent subsections. NP -hard Problems In the NP world, we generalize the results of [O’D04, Tre05] to optimization problems.In particular we show that if MaxSAT is hard to solve on / poly ( n ) fraction of the inputsof samples drawn from some samplable distribution D . Then there exists a samplabledistribution D ′ such that solving MaxSAT on D ′ is hard on at least / -fraction of thesamples (See Corollary 4.3). Theorem 1.3 (Informal Statement of Corollary 4.3.) . Let D ( n ) be a distribution over - CNF formulas with n variables and poly ( n ) clauses, such that for every randomized algorithm A running in time poly ( n ) , we have: Pr Ψ ∼D [ A finds a optimal assignment for Ψ w.p. ≥ ] ≤ − poly ( n ) . Then there exists a distribution D ′ ( n ′ ) over - CNF formulas with n ′ variables poly ( n ′ ) clauses, such that for every randomized algorithm A ′ running in time poly ( n ′ ) , we have: Pr Ψ ′ ∼D ′ (cid:2) A ′ finds a optimal assignment for Ψ ′ w.p. ≥ (cid:3) ≤ Moreover, if D ( n ) is poly ( n ) -samplable then D ′ ( n ′ ) is poly ( n ′ ) -samplable. Observe that the failure probability on D ′ is much larger than in [O’D04, Tre05] andcan even tend to 0 for a proper choice of our parameters. This can be achieved since wedeal with optimization problems instead of decision problems.We also remark that our reduction and the proof correctness are much simpler, andin particular we do not rely the hard core set lemma [Imp95], a powerful and non-trivialkey tool in the previous known proofs. 6ur result easily extends into other NP -hard problems such as finding the largestclique in a graph, or finding smallest dominating set or vertex cover of a graph, etc (seeRemark 4.4).However, there are other NP -hard problems for which establishing a hardness am-plification result through Theorem 1.2 is not easy. A special highlight is that of provingsuch a result for the Knapsack problem, as it isn’t immediately clear if it’s direct productfeasible for reasonable range of parameters. This is because, for the Knapsack problem,when we aggregate instances in the natural way, optimal solutions of one instance mayinterfere with other instances (see Section 4 for details). Nonetheless, with some care,the direct product feasibility of Knapsack problem was established (see Lemma 4.7).The Exponential Time Hypothesis ( ETH ) [IP01, IPZ01, CIP06] asserts that that wecannot decide whether a given 3-
CNF is satisfiable in time which is sub-exponential inthe number of variables. That is a worst case assumption, and it raises a natural questionarises: Can we prove stronger hardness amplification result based on
ETH ? In fact, canwe prove a worst case to an average case hardness amplification based on
ETH ?Our next theorem is a step towards proving such a worst-case to an average casereduction for
MaxSAT . Theorem 1.4 (Informal Statement of Corollary 4.5.) . Let D ( n ) be a distribution over - CNF instances with n variables and O ( n ) -clauses, such that for every randomized algorithm A running in time o ( n ) , we have: Pr Ψ ∼D [ A finds an optimal assignment for Ψ w.p. ≥ ] ≤ − o ( n ) . Then there exists a distribution D ′ ( n ′ ) over - CNF instances with n ′ variables and o ( n ′ ) clauses, such that for every polynomial time randomized algorithm A ′ , we have: Pr Ψ ′ ∼D ′ (cid:2) A ′ finds an optimal assignment for Ψ ′ w.p. ≥ (cid:3) ≤ ETH that isassuming that every uniform algorithm fails on 1/2 o ( n ) -fraction of inputs. While Heallyet al. use similar assumption against non-uniform algorithms. P We investigate hardness amplification in P and can show results for string problems,such as LCS and
Edit - Distance , which were not possible in previous works.
Theorem 1.5 (Informal statement; see Corollaries 5.4 and 5.10 ) . Fix ε > . Let D ( n ) be anefficiently samplable distribution over the instances of LCS / Edit - Distance of length n. Assumethat any n − ε running-time algorithm with success probability at least fails to compute anoptimal alignment on at least n o ( ) -fraction of the inputs sampled from D . Then for some ε ′ > there exists an efficiently samplable distribution D ′ ( n + o ( ) ) over instances of LCS / Edit - Distance of size n + o ( ) such that every n − ε ′ running-time algorithm with success probabilityat least fails to compute an optimal solution on at least of the inputs sampled from D ′ . LCS and
Edit - Distance , is the problem of computingthe Fr´echet distance between two (discrete) curves. Strangely, this problem resists allnatural approaches to show that it is direct product feasible (see Remark 5.11). Therefore,it is an interesting question as to whether it is possible to show that it is direct productfeasible (for relevant range of parameters) or whether it is a candidate for a problem thatis not direct product feasible.Additionally, we show hardness amplification for a very different kind of problem,that of computing the product of two matrices (see Corollary 5.14). We highlight thisproblem, as it does not directly follow from our main theorem (i.e., Theorem 1.2). Elab-orating, a detail that was brushed under the carpet while discussing Theorem 1.2 wasthat, given an instance of an optimization problem and a candidate solution, we need toable to efficiently compute the value of the objective of the candidate solution for that in-stance. This naturally holds for all the problems considered in this paper except the taskof computing the product of two matrices, i.e., we do not know a way to deterministically verify if the product of two matrices is equal to a given third matrix, which is signifi-cantly faster than actually multiplying the two given matrices and checking if it’s equalto the third matrix [K ¨un18, WW18]. Nonetheless, we modify the proof of Theorem 1.2 tohandle this issue.
TFNP
Total problems (with not necessarily efficient verification of totality) are essentially equiv-alent to Optimization problems (see Remark 3.7). The class
TFNP is special as it is in aninformal sense the intersection of Search NP and Optimization problems. Problems in TFNP capture problems in various areas such as game theory, cryptography, computa-tional geometry, etc. We show that our general theorem can be applied to
TFNP prob-lems as well, and as an example show it for the Factoring problem (see Corollary 6.4) andthe End of a Line problem (see Corollary 6.8). The latter hardness amplification resultdirectly implies the hardness amplification of various problems in game theory such ascomputing an approximate Nash equilibrium (see Section 6.2 for details).
Our work leaves open several questions. We state a few of them below.
In Theorem 1.4 we showed that if
MaxSAT is hard to compute on 1 − o ( n ) -fraction ofinputs for sub-exponential time algorithm, then there exists a distribution on which itis hard on a constant fraction of inputs for algorithms running in time n ω ( ) . A naturalopen question is the following: Can we improve Theorem 1.4 and get hardness amplified against sub-exponential timealgorithms (instead of super-polynomial time algorithms)?
8t seems to us that derandomized direct product theorems may serve as the key toolto address the above question (for example, see [IKW12]). In particular, if one can provea (strongly) derandomized version of [FK00] then it might be possible to both aggregatesub-exponentially many instances succinctly and sample from the (derandomized) directproduct distribution efficiently.
In this paper, we were able to show direct product feasibility for certain problems quiteeasily (for example, see Theorems 1.3 and 1.5), but had to work harder to prove them forsome other problems (for example, see Lemmas 4.7 and 5.13), and in some problem(s)were unable to establish the property of direct product feasibility (see Remark 5.11). Thisleads us to the following question.
Can we pinpoint what property of a problem makes it possible to establishdirect product feasibility?
Direct Product theorems are key ingredients for both gap amplification and hardnessamplification. Also, there are many philosophical similarities in the techniques knownin literature of the aforementioned two kinds of amplifications. Thus we can ask thefollowing (ambitious) question:
Can we obtain a trade-off between gap amplification and hardness amplification?
In particular, can we show that if one problem is hard to approximate on worstcase within some factor α >
0, then it is hard to approximate within a factor α /100 onaverage? We note here that Feige [Fei02], did answer the converse of this question, i.e.,he used average case hardness assumptions to prove hardness of approximation resultsfor various problems in NP .It seems to us that analyzing the operation of performing a small perturbation on thegiven instance may be the right direction to proceed. Elaborating, consider a (worst case)hard distribution over gap instances of some problem. If we build a new distribution,which samples from the aforementioned distribution, then performs a small perturba-tion on the sampled gap instance, and outputs the perturbed instance, then we wouldstill retain most of the gap in the instance sampled from the new distribution, but onthe other hand, the fraction of instances on which it is hard to solve the problem shouldincrease significantly. It would be interesting if this intuition/approach could be madeto work.A related question is to ask if we can improve our result in Theorem 1.4 (for example,by making progress on the question detailed in Section 1.3.1) using Gap-ETH [Din16,MR16] (instead of ETH)? 9 .3.4 Average Case Hard Problems in P In this paper, we looked at average case hardness of some problems in P against someefficiently sampleable distribution but one can ask if we can achieve more. Can we show for some natural problem in P that it is hard to solve for theuniform distribution? Another important question stemming from cryptography [BRSV17] is whether wecan construct a fine-grained one way function from worst case assumptions?
In Section 2, we provide the proof overview of our main theorem (Theorem 1.2). InSection 3, we formally state and prove Theorem 1.2. In Section 4, we prove hardnessamplification results for various problems in NP . In Section 5, we prove hardness am-plification results for various problems in P . Finally, in Section 6, we prove hardnessamplification results for various problems in TFNP . We provide a proof overview for our hardness amplification result for the problem offinding the maximum clique in a graph and then in the subsequent section we will showhow our general result (i.e., Theorem 1.2) would follow.
To illustrate the main ideas behind our scheme let us focus on
MaxCLIQUE , the problemof finding the largest clique in a given graph G .Assume the existence of a distribution D over graphs on n vertices which is some-what hard to compute. That is for every randomized algorithm A running in time poly ( n ) ,we have Pr G ∼D [ A finds max-clique in G w.p. ≥ ] ≤ − n . (1)We would like to prove the existence of a new distribution D ′ over graphs on poly ( n ) vertices which is much harder to compute. That is, for every randomized algorithm A ′ running in time poly ( n ′ ) , we have:Pr G ′ ∼D ′ (cid:2) A ′ finds max-clique in G ′ w.p. ≥ (cid:3) ≤ D is poly ( n ) -time samplable, then so is D ′ .10 onstruction of New Distribution: D ′ samples a graph H as follows:1. Independently sample G , . . . , G k from D , where k = poly ( n ) .2. Define V ( H ) = V ( G ) ˙ ∪ · · · ˙ ∪ V ( G k ) .3. For every i ∈ [ k ] , connect the vertices in V ( G i ) using the original edges in G i .4. For every i , j ∈ [ k ] such that i = j , insert all the possible edges between G i and G j .5. Output H .Clearly, if D is poly ( n ) -time samplable, then so is D ′ . Now assume for sake of contradic-tion, that there exists A ′ running in time poly ( n ′ ) , violating Equation (2). We show theexistence of an algorithm A running in time poly ( n ) violating Equation (1).The algorithm A on input graph G with n vertices is defined as follows:1. Let S be an empty set.2. Repeat following O ( n ) times.(a) Pick randomly i ∈ [ k ] .(b) Independently sample G , . . . , G i − , G i + , . . . G k from D .(c) Construct H setting G i to be G .(d) Find clique in H using A ′ .(e) Restrict clique in H to the vertices of G and add it to S .3. Output the largest clique in S .Clearly, the running time of A is poly ( n ) , as n ′ = poly ( n ) and the running time of A ′ is poly ( n ′ ) . Our first observation is that for any graph H constructed by A , and for every i ∈ [ k ] the restriction of a maximal clique in H into G i , is a maximal clique for G i .Let A be one iteration of step 2 of A . If we show that A outputs maximum cliquew.p. Ω ( n ) on 1 − n fraction of samples from D then, A outputs maximum cliquew.p. 2/3 on 1 − n fraction of samples from D .Now, observe that if instead of planting the given input graph G as the i -th subgraphof H , we were planting a uniformly random sample of D , then we get a graph H whichis drawn according to D ′ . Consequently, if that was the case, then the success probabilityof A was equal the probability of A ′ and we were done.Let D ′ G denote the marginal distribution over H , where the graph G is planted at arandom coordinate i ∈ [ k ] . We conclude the proof by showing that for 1 − n -fractionof instances G drawn from D we have:Pr G ′ ∼D ′ G (cid:2) A ′ finds max-clique in G ′ w.p. ≥ (cid:3) ≥
12 Pr G ′ ∼D ′ (cid:2) A ′ finds max-clique in G ′ w.p. ≥ (cid:3) .Towards this goal we use a result by Feige and Kilian [FK00] that was proven in thecontext of parallel repetition. Under minor manipulations their result can be stated asfollows: 11et X be a universe and T be a distribution over X . Let f : X k → {
0, 1 } . Define µ = E x k ∼T k h f (cid:16) x k (cid:17)i , µ x = E i ∈ [ k ] , x ,..., x i − , x i + ,..., x k ∼T [ f ( x , . . . , x i − , x , x i + , . . . x k )] .Pr x ∼T h | µ x − µ | ≥ k − i ≤ k − , (3)To conclude the result, set X as the set of graphs with n vertices, and T be thedistribution D . We have D ′ = D k . Define f : X k → {
0, 1 } by: f ( G ′ ) = ⇐⇒ A ′ finds a maximal clique in G w.p. ≥ µ = Pr G ′ ∼D ′ (cid:2) A ′ finds max-clique in G ′ w.p. ≥ (cid:3) µ x = Pr G ′ ∼D ′ G (cid:2) A ′ finds max-clique in G ′ w.p. ≥ (cid:3) .By an application of (3), and a proper choice of k , we get that for all but at most k -fraction of graphs G drawn according to D , the success probability of A ′ on D ′ G is Ω ( n ) ,as claimed. In the previous subsection, we showed the main ingredients used for proving hardnessamplification for the task of finding a maximal clique in a given graph. What were theproperties of
MaxCLIQUE that we utilized to prove the result?One property that we used was that if we are given k input graphs G , . . . , G k , thereexists an efficient way to construct a large graph H such that a maximal clique in H induces a maximal clique on each of the graphs G i . The second property was that givena maximal clique in H there exists an efficient algorithm to construct a maximal cliqueon each of the graphs G i .These two properties are captured in Definition 1.1: The first property of a problem Π being Direct Product feasible is the existence of an efficient algorithm Gen stitching k instances I , . . . , I k of Π into a larger instance I ′ of Π , such that: an optimal solution for I ′ induces an optimal solution for each of the instances I i . The second property the exis-tence of an efficient algorithm Dec converting an optimal solution for I ′ into an optimalsolution of I ′ .Once we show Π is Direct Product feasible then the rest of the proof goes through.Indeed, assuming the existence of a distribution D on instances of Π for which any effi-cient algorithm fails to compute on 1 − n fraction of inputs, we define the distribution D ′ , D ′ I as follows: • D ′ is the k -product distribution of D , where we pick k random samples from D ′ • D ′ I is the distribution where we pick uniformly at random i ∈ [ k ] , and indepen-dently sample I , . . . , I i − , I i + , . . . I k from D . Finally, we construct I ′ by setting I i tobe I .Now we can use [FK00] to show that for most instances I ∼ D to connect the successprobability of A ′ on D ′ and D ′ I , to conclude the proof. Remark about Direct Product results and Hardness Amplification.
The direct productlemma at the heart of most hardness amplification results is the XOR lemma [Yao82]. Buthere we critically use the fact the problem is total, so at the surface at least, our resultsare incomparable to the hardness amplification results for NP and EXP obtained via XORlemmas.
In this section, we prove our main result. First, we define some notations. We use thedefinition of optimization problems given in [ACG +
99] with additional formalism.
Definition 3.1 (Optimization Problem) . An optimization problem Π is charaterized by thefollowing quadruple of objects ( I Π , Sol Π , ∆ Π , goal Π ) , where: • I Π is the set of instances of Π . In particular for every d ∈ N , I Π ( d ) is the set of instanceof Π of input size at most d (bits); • Sol Π is a function that associates to any input instance x ∈ I Π the set of feasible solutionsof x; • ∆ Π is the measure function , defined for pairs ( x , y ) such that x ∈ I Π and y ∈ Sol Π ( x ) .For every such pair ( x , y ) , ∆ Π ( x , y ) provides a non-negative integer which is the value ofthe feasible solution y; • goal Π ∈ { min, max } specifies whether Π is a maximization or minimization problem. We would like to identify a subset of our solution space which are optimal withrespect to our measure function. To this effect, we define a notion of optimal feasiblesolution.
Definition 3.2 (Optimal Feasible Solution) . Let Π ( I Π , Sol Π , ∆ Π , goal Π ) be an optimizationproblem. For every x ∈ I Π and y ∈ Sol Π ( x ) we say that y is an optimal feasible solution of x if forevery y ′ ∈ Sol Π ( x ) we have ∆ Π ( x , y ) ≥ ∆ Π ( x , y ′ ) if goal Π = max and ∆ Π ( x , y ) ≤ ∆ Π ( x , y ′ ) if goal Π = min . Now, we can formally define the notion of direct product feasibility of a pair ofoptimization problems. We define the measure function only for feasible solutions of an instance. Indeed if an algorithm solvingthe optimization problem outputs a non-feasible solution then, the measure just evaluates to -1 in case ofmaximization problems and ∞ in case of minimization problems. efinition 3.3. Let Π ( I Π , Sol Π , ∆ Π , goal Π ) and Λ ( I Λ , Sol Λ , ∆ Λ , goal Λ ) be two optimizationproblems. Let S , T : N × N → N . We say that the pair ( Π , Λ ) is ( S , T ) -direct product feasibleif there exists a pair of deterministic algorithms ( Gen , Dec ) such that for every k , d ∈ N thefollowing holds: • Gen takes as input x , . . . , x k ∈ I Π ( d ) and outputs x ′ ∈ I Λ ( S ( d , k )) . • For any feasible solution y ′ ∈ Sol Λ ( x ′ ) , Dec takes as input i ∈ [ k ] , x , . . . , x k ∈ I Π ( d ) ,and y ′ , and outputs y ∈ Sol Π ( x i ) . Moreover if y ′ ∈ Sol Λ ( x ′ ) is an optimal feasiblesolution then so is y ∈ Sol Π ( x i ) . • Gen and
Dec run in T ( d , k ) time.Moreover, if Π = Λ then we say that Π is ( S , T ) -self direct product feasible All but two results in this paper use the notion of self direct product feasibility.Only the problems of computing the longest common subsequence and computing theedit distance between two strings require us to define direct product feasibility for apair of problems (instead of a single problem). Even for the aforementioned two prob-lems, direct product feasibility is shown for a pair of essentially same problems (see Lem-mas 5.3 and 5.8 for more details).We now show our main theorem, that direct product feasibility implies hardnessamplification.
Theorem 3.4 (Formal version of Theorem 1.2) . Let p ∈ (
0, 1 ) . Let Π ( I Π , Sol Π , ∆ Π , goal Π ) and Λ ( I Λ , Sol Λ , ∆ Λ , goal Λ ) be two optimization problems. Let S , T : N × N → N be such that ( Π , Λ ) is ( S , T ) -direct product feasible. Let v : N → N be such that there is a deterministicalgorithm V running in time v ( d ) which on input x ∈ I Π ( d ) and y ∈ Sol Π ( x ) always correctlycomputes ∆ Π ( x , y ) . Let s , t : N → N , fail : N → (
0, 1 ] , and D = { D d } d ∈ N be a family ofdistributions such that for every randomized algorithm A running in time t ( d ) , the followingholds for large d ∈ N . • D d is a distribution over I Π ( d ) where an instance of I Π ( d ) can be sampled from D d in s ( d ) time. • Pr x ∼ D d [ A finds an optimal feasible solution for x with probability at least p ] ≤ − fail ( d ) .Let k : = · ( fail ( d )) − and c : =
200 ln ( / − p ) p . If k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c , then thereis a distribution family D ′ = { D ′ d } d ∈ N such that for every randomized algorithm A ′ running intime t ( d ) c , the following holds for all large enough d ∈ N . • D ′ d is a distribution over I Λ ( m ) where m = S ( d , k ) and an instance in I Λ ( m ) can besampled from D ′ d in O ( s ( d ) k + T ( d , k )) time. • Pr x ′ ∼ D ′ d [ A ′ finds an optimal feasible solution for x ′ with probability at least p ] ≤ emma 3.5 (Feige-Kilian Direct Product Lemma [FK00]) . Let X be a finite set and D aprobability distribution on X. Let k ∈ N and f : X k → {
0, 1 } . We define the following twomeasures: µ = E x ,..., x k ∼D [ f ( x , . . . , x k )] = Pr x ,..., x k ∼D [ f ( x , . . . , x k ) = ] , where x , . . . , x k are sampled independently, and for every i ∈ [ k ] and x ∈ X we define µ i , x = E x ,..., x i − , x i + ,..., x k ∼D [ f ( x , . . . . x i − , x , x i + , . . . , x k )] , where x , . . . , x i − , x i + , . . . , x k are sampled independently. Then, we have the following: Pr x ∼D i ∼ [ k ] (cid:20) | µ i , x − µ | ≥ k (cid:21) ≤ k , where we sample i from [ k ] uniformly at random. The proof of the above lemma as stated above may be found in [GO05] and we pro-vide it in Appendix A for completeness. We also note that variants of the above lemmahave previously appeared in direct product testing literature [DG08, IKW12, DS14].
Proof of Theorem 3.4.
First we describe the construction of D ′ and then show the claimsmade in the theorem statement. Construction of D ′ Let ( Gen , Dec ) be the pair of algorithms guaranteed by Definition 3.3 for the pair ( Π , Λ ) .Fix d , k ∈ N . We construct D ′ d from D d as follows. Independently sample k pairs ofinstances, say x , . . . , x k from D d and feed it as input to Gen . The sampling algorithm ofthe distribution D ′ d then outputs the output of Gen .Therefore, the m in the theorem statement, the size of the instances outputted by D ′ d is equal to S ( d , k ) . The time needed to sample from D ′ d is the time needed to sample k independent samples from D d which is k · s ( d ) time, plus the running time of Gen whichis T ( d , k ) . Correctness of the Claim
We will show that if there is a randomized algorithm A ′ with success probability p run-ning in time t ( d ) / c that finds an optimal feasible solution of an instance sampled from D ′ d with probability (over the sampling) greater than 0.01 then, there is a randomized al-gorithm A with success probability p running in time t ( d ) that finds an optimal feasiblesolution for an instance sampled from D d with probability (over the sampling) greaterthan 1 − fail ( d ) , reaching a contradiction. First we describe below the algorithm A . We remark here that the simulation of instances sampled from D ′ d from an instance of D d is similarto the simulation described in the (textbook) proof of showing existence of weak one-way functions implyexistence of strong one-way functions [Yao82, Gol08]. lgorithm A : Input : An instance x ∈ I Π ( d ) . Output : A feasible solution y ∈ Sol Π ( x ) . Procedure :1. Let
S ⊆
Sol Π ( x ) be a subset of feasible solutions initialized to ∅ .2. Repeat the below procedure c times.2.1. Pick i ∈ [ k ] uniformly at random.2.2. Independently sample k − D d say x , . . . , x i − , x i + , . . . , x k .2.3. Define x i = x .2.4. Feed x , . . . , x k as input to Gen . Let x ′ ∈ I Λ ( m ) be the outputof Gen .2.5. Feed x ′ as input to A ′ . Let y ′ ∈ Sol Λ ( x ′ ) be the output of A ′ .2.6. Feed i , x , . . . , x k , and y ′ as input to Dec . Let y be the output of Dec .2.7. Include y in S .3. Run V on ( x , y ) for each y ∈ S and output the feasible solution in S which optimizes Π (depends on goal Π ).Let us first analyze the running time of A . Since we repeat Step 2, c times, it sufficesto analyze the time needed for one iteration. Step 2.2 needs ( k − ) · s ( d ) time, Step 2.4needs T ( d , k ) time, Step 2.5 needs time t ( d ) / c time, and finally Step 2.6 needs time T ( d , k ) .Therefore, the total running time of A is less than c ( k · s ( d ) + T ( d , k ) + t ( d ) / c + v ( d )) ≤ t ( d ) .Finally, we argue on the correctness probability of A . Let A be the same algorithmas A except Step 2 has only one iteration. Therefore it suffices to show that A outputsan optimal feasible solution with probability at least ln ( / − p ) c on 1 − fail ( d ) fraction ofinstances sampled from D d , as this implies that A outputs an optimal feasible solutionwith probability at least (cid:16) − (cid:16) − ln ( / − p ) c (cid:17) c (cid:17) ≥ − e ln ( − p ) = p on greater than 1 − fail ( d ) fraction of instances sampled from D d . Notice that if y ′ is an optimal solution to x ′ then Dec always outputs an optimal solution to x . Therefore, it suffices to show thaton input x ′ , A ′ in Step 2.5. outputs an optimal feasible solution with probability at least ln ( / − p ) c on greater than 1 − fail ( d ) fraction of instances sampled from D d .Consider the Boolean function f : I Λ ( m ) → {
0, 1 } where f ( x ′ ) = A ′ outputs an optimal feasible solution of x ′ with probability at least p . From assumptionon the fraction of sampled inputs that A ′ outputs an optimal feasible solution on, wehave that µ : = E x ′ ∼ D ′ d [ f ( x ′ )] > µ x , i as follows: µ x , i : = Pr x ,..., x i − , x i + ,..., x k ∼ D d [ f ( x , . . . , x i − , x , x i + , . . . , x k ) = ] .16rom Lemma 3.5, we have the following,Pr x ∼ D d i ∈ [ k ] h | µ x , i − µ | ≥ k − i ≤ k − .Call an instance x ∈ I Π ( d ) “bad” if Pr i ∈ [ k ] (cid:2) µ x , i < µ − k − (cid:3) ≤ k − ; otherwise it iscalled “good”. From Markov inequality we have that:Pr x ∼ D d [ x is good ] ≥ − k − .Take any good x then with probability at least 1 − k − over i ∈ [ k ] we havethat µ x , i ≥ µ − k − ≥ − k − . Next, conditioned on picking i ∈ [ k ] such that µ x , i > − k − , then with probability at least 0.01 − k − we have f ( x , . . . , x i − , x , x i + , . . . , x k ) =
1. Conditioned on f ( x , . . . , x i − , x , x i + , . . . , x k ) = A ′ outputs an optimal feasible solution of x ′ with probability at least p .Summarizing, if we sample a good x then with probability at least ( − k − ) · p ≥ · p = ln ( / − p ) c the algorithm A ′ in Step 2.5 outputs an optimal feasible solution of x ′ . The proof concludes by noting that a good x is sampled with probability at least1 − k − = − fail ( d ) > − fail ( d ) .We conclude this section by providing a couple of remarks on the above theoremand proof. Remark 3.6 (Amplification factor) . We would like to note that the amplified hardness for theoptimization problem Λ (which is shown in Theorem 3.4 to be 0.01) can be further amplified toany arbitrarily small positive constant close to 0 by adjusting the parameters in the proof. Remark 3.7 (Total Problems) . One may observe that the class of optimization problems areindeed equivalent to the class of total problems. For some finite alphabet Σ , we call a relationR ⊆ Σ ∗ × Σ ∗ to be total if for every x ∈ Σ ∗ there is always a y ∈ Σ ∗ such that ( x , y ) ∈ R. To seethat every optimization problem Π is also a total problem, it suffices to note that the range of themeasure function ∆ Π is bounded and therefore a maximum/minimum always exists. And to seethat every total problem R is also an optimization problem Π R , it suffices to note we can define ∆ Π R ( x , y ) to be 1 if ( x , y ) ∈ R and 0 otherwise, and set Π R to be a maximization problem (i.e., goal Π R = max ).Given this new outlook at optimization problems, one may see the existence of solutionsas a crucial requirement in the proof of Theorem 3.4. In particular, our algorithm A (designand analysis) would be meaningless without the existence of solutions for any instance of theoptimization problem. Actually, it can be even amplified to sub-constant, but we would have to pay in the running time lowerbound for Λ . Almost Worst Case to Average Case for Problems in NP In this section, we show hardness amplification results for various NP -hard problems,with a focus on MaxSAT and Knapsack.
We recall below the Maximum satisfiability problem in our formalism for optimizationproblems.
Definition 4.1 ( MaxSAT problem) . The Maximum Satisfiability problem (
MaxSAT ) is an opti-mization problem characterized by the following quadruple of objects ( I SAT , Sol
SAT , ∆ SAT , max ) ,where: • For every d ∈ N , I SAT ( d ) is the set of all CNF formulas on n = Ω ( d ) variables and O ( n ) clauses; • For every φ ∈ I SAT we have
Sol
SAT ( φ ) = {
0, 1 } n ; • For every φ ∈ I SAT and every x ∈ {
0, 1 } n we define ∆ SAT ( φ , x ) to be the number of clausesthat are satisfied by the assignment x to the variables of φ . We now show that
MaxSAT is self direct product feasible (in a rather naive way).
Lemma 4.2.
Let S , T : N × N → N , where S ( d , k ) = dk and T ( d , k ) = O ( dk ) . Then we havethat MaxSAT is ( S , T ) -self direct product feasible.Proof. We define the pair of deterministic algorithms ( Gen , Dec ) below.For every φ , . . . , φ k ∈ I SAT ( d ) given as input to Gen , it outputs the instance φ ′ in I SAT ( d ′ ) defined as: φ ′ : = φ ′ ∧ · · · ∧ φ ′ k Where φ ′ i is the formula obtained by replacing each literal l j in φ i by l ( i − ) n + j . It is clearthat the running time of Gen is O ( d ′ ) .Next, for every i ∗ ∈ [ k ] , φ , . . . , φ k ∈ I SAT ( d ) , and a SAT assignment x ′ ∈ {
0, 1 } n ′ for φ ′ given as input to Dec , the algorithm first runs
Gen to compute φ ′ and then outputs x ∈ {
0, 1 } n which is computed as follows: x j = x ′ ( i ∗ − ) n + j .We next show that if x ′ is an optimal SAT assignment for φ ′ then x is an optimalassignment for φ i ∗ . This follows easily by the fact that the φ ′ i s are defined on disjoint setsof variables, hence an optimal solution for φ ′ induces an optimal solution for each of the φ i s. The running times of Gen and
Dec trivially follow.The above rather simple self direct product feasibility has a rather strong conse-quence when combined with our generic hardness amplification theorem.
Corollary 4.3.
Let a ≥ . Let D = { D d } d ∈ N be a family of distributions such that for everyrandomized algorithm A running in time O ( d a ) over inputs of size d, the following holds forlarge d ∈ N . D d is a distribution over I SAT ( d ) where an instance of I SAT ( d ) can be sampled from D d in e O ( d ) time. • Pr φ ∼ D d [ A finds maximizing assignment for φ with probability at least ] ≤ − d .Then for m : = Θ ( d ) , there is a distribution family D ′ = { D ′ d } d ∈ N such that for everyrandomized algorithm A ′ running in time O ( m a /7 ) over inputs of size m, the following holds forall large enough d ∈ N . • D ′ d is a distribution over I SAT ( m ) and an instance in I SAT can be sampled from D ′ d in e O ( m ) time. • Pr φ ′ ∼ D ′ d [ A ′ finds maximizing assignment for φ ′ with probability at least ] ≤ Proof.
We apply Theorem 3.4 by setting, Π = Λ = MaxSAT , p = / , S ( d , k ) = dk , T ( d , k ) = w · dk (for some w ∈ N ), v ( d ) = w ′ · d (for some w ′ ∈ N ), s ( d ) = e O ( d ) , t ( d ) = d a , and fail ( d ) = / d . Then we have that k : = Θ ( d ) and c : =
300 ln 3. We verifythat k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holds by noting that k · s ( d ) + T ( d , k ) + v ( d ) = e O ( d ) and t ( d ) c = Ω ( d a ) = Ω ( d ) (because a ≥ D ′ d is e O ( d ) = e O ( m ) and that the theorem statement holds for any randomizedalgorithm A ′ running in time t ( d ) c = Θ ( d a ) = Θ ( m a /7 ) . Remark 4.4.
The idea of taking k instances disjointly as in Lemma 4.2 can be extended to many NP -hard covering problems such as Vertex Cover, Dominating Set, etc. Consequently, we obtainhardness amplification results similar to Corollary 4.3 for these problems as well. Now we consider the same hardness amplification result of
MaxSAT but againstsubexponential time algorithms. We provide below a slightly informal statement (usingasymptotic notations as opposed to providing specific constants) for clarity.
Corollary 4.5.
Let D = { D d } d ∈ N be a family of distributions such that for every randomizedalgorithm A running in time o ( d ) over inputs of size d, the following holds for large d ∈ N . • D d is a distribution over I SAT ( d ) where an instance of I SAT ( d ) can be sampled from D d in e O ( d ) time. • Pr φ ∼ D d [ A finds maximizing assignment for φ with probability at least ] ≤ − o ( d ) .Then for m : = o ( d ) , there is a distribution family D ′ = { D ′ d } d ∈ N such that for everyrandomized algorithm A ′ running in time m ω ( ) over inputs of size m, the following holds for alllarge enough d ∈ N . • D ′ d is a distribution over I SAT ( m ) and an instance in I SAT can be sampled from D ′ d in e O ( m ) time. • Pr φ ′ ∼ D ′ d [ A ′ finds maximizing assignment for φ ′ with probability at least ] ≤ roof. Let h : N → N be some slowly increasing function such that lim x → ∞ h ( x ) = ∞ and let h : = h ( d ) . We apply Theorem 3.4 by setting, Π = Λ = MaxSAT , p = / , S ( d , k ) = dk , T ( d , k ) = w · dk (for some w ∈ N ), v ( d ) = w ′ · d (for some w ′ ∈ N ), s ( d ) = e O ( d ) , t ( d ) = d / h , and fail ( d ) = / d / ( · h ) . Then we have that k : = Θ ( d / h ) and c : =
300 ln 3. We verify that k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holds by noting that k · s ( d ) + T ( d , k ) + v ( d ) = ( + o ( )) · d / h and t ( d ) c = Ω ( d / h ) . Therefore, we have that thesampling time from D ′ d is 2 ( + o ( )) · d / h = O ( m ) and that the theorem statement holds forany randomized algorithm A ′ running in time t ( d ) c = O ( d / h ) = Θ ( m h ( d ) ) = m ω ( ) .Note that if we assume the (randomized) Exponential Time Hypothesis ( ETH ) for3-
SAT [IP01, IPZ01, CIP06] then after applying the Sparsification lemma, we obtain forsome ε >
0, a family of distributions D = { D d } d ∈ N such that for every randomizedalgorithm A running in time 2 ε d over inputs of size d , the following holds for large d ∈ N . • D d is a distribution over I SAT ( d ) where an instance of I SAT ( d ) can be sampled from D d in e O ( d ) time. • Pr φ ∼ D d [ A finds maximizing assignment for φ with probability at least 2/3 ] ≤ − e O ( d ) .Therefore, our amplification in Corollary 4.5 is an almost worst-case to average-casereduction for MaxSAT (under subexponential time reductions).
In this subsection, we study the direct product feasibility of the Knapsack problem andas we will see, showing that its direct product feasible for reasonable parameters is sig-nificantly more non-trivial than was with the case for
MaxSAT .In the Knapsack problem we are given a target sack weight W , and set of items viapairs ( w , v ) where w is the weight of the item and v is the value of the item, and the goalis to pick a subset of items which maximizes the sum of the values of the picked itemsgiven the constraint that their total weight is at most W . More formally, we describe it asfollows. Definition 4.6 ( KS problem) . The Knapsack problem ( KS ) is an optimization problem charac-terized by the following quadruple of objects ( I KS , Sol KS , ∆ KS , max ) , where: • For every d ∈ N , I KS ( d ) = ( W , { ( v i , w i ) } ni = ) where W , n , v i , w i ∈ N such that d = log W + ∑ ni = ( log v i + log w i ) ; • For every ( W , { ( v i , w i ) } ni = ) ∈ I KS we have Sol KS (( W , { ( v i , w i ) } ni = )) is the set of allsubsets S of [ n ] satisfying ∑ i ∈ S w i ≤ W; • For every S ∈ [ n ] satisfying ∑ i ∈ S w i ≤ W we define ∆ KS ( S ) to be ∑ i ∈ S v i . It is not trivial to show the self direct problem feasibility for Knapsack. To see that,consider the case k =
2. Naively, if the sacks weights are W , W then one may define anew sack of weight W + W , and then take the union of the item sets while leaving their20eights and values untouched. However, in this simple reduction, we may use some ofthe target sack weight of one instance against another instance. Nonetheless we showwith some care, a direct product feasibility result can be obtained. Lemma 4.7.
Let T : = { ℓ | ℓ ∈ N } . Let S , T : N × T → N , where S ( d , k ) = k O ( ) · d andT ( d , k ) = k O ( ) · d. Then we have that KS is ( S , T ) -self direct product feasible.Proof. We first show that there exists a pair of deterministic algorithms ( Gen , Dec ) suchthat the conditions of ( S , T ) -self direct product feasibility in Definition 3.3 are met forthe Knapsack problem for every d ∈ N but when k is fixed to be 2. Using that, we provethe lemma statement for any value of k which is a power of 2. The proof proceeds byfirst creating (recursively) two instances: the first corresponds to the first k /2 instancesand the second to the last k /2 ones. Then we use the result for k = Base Case k = . Let us first present the algorithm
Gen . Let I , I ∈ I KS ( d ) be theinput to Gen where for every j ∈ {
1, 2 } , we have I j = ( W j , { ( v ( j ) i , w ( j ) i ) } i ∈ [ n j ] ) . We firstnormalize the weights (by multiplying the weight of both the sack and each item by thesame factor c ) so that W = W /2, W = W , for some W ∈ N (note that we can achievethis normalization for W = · W · W ).The output of Gen is a new instance I ′ : = ( W ′ , { ( v ′ i , w ′ i ) } i ∈ [ n ′ ] ) ∈ I KS ( d ′ ) , where d ′ = O ( d ) , n ′ : = n + n + log W , and W ′ = W + W /2. We define N = {
1, . . . , n } , N = { n +
1, . . . , n } , and D = { n + n +
1, . . . , n + n + log W } .Now we define the items ( v ′ i , w ′ i ) for all i ∈ [ n ′ ] . The first n items correspond to theitems of I , the next two n items correspond to the items in I , and the last log W aredummy items. Elaborating, we have ∀ i ∈ [ n ′ ] , v ′ i = v ( ) i if i ≤ n v ( ) i − n · ( m + ) W + n < i ≤ n + n ( m + ) · i − n − n − if i > n + n , ∀ i ∈ [ n ′ ] , w ′ i = w ( ) i if i ≤ n w ( ) i − n · W if n < i ≤ n + n W · i − n − n − if i > n + n ,where m : = ∑ i ∈ [ n ] v ( ) i . Note that the size of I ′ is indeed O ( d ) .Now we define Dec : It gets as input an index j ∈ {
1, 2 } , an instance I ′ which wasgenerated by Gen and an optimal solution S ′ for I ′ . It is required to produce an optimalsolution for the instance I j . The algorithm Dec returns S ′ ∩ N if j =
1, otherwise itoutputs S ′ ∩ N (where each element is translated by − n so that the final output residesin [ n ] ).The correctness of Dec follows by the following claim.
Claim 4.8.
Let S ′ be an optimal solution for I ′ , then: . S ′ : = S ′ ∩ N is an optimal solution for I .2. Let S : = S ′ ∩ N and define S ′ as the set of elements in S where each element is translatedby − n , then S ′ is an optimal solution for I .Proof.
1. We first show that for S ′ we have w : = ∑ { i ′ ∈ S ′ | i ′ ∈ N ∪ D } w ′ i = W .Assume not, then either we have w > W or w < W . If it is the former thennotice that since w ′ i is a multiple of W for all i > n , we arrive at a contradictionas w ≤ W ′ = W + W /2. Therefore let us assume that it is the latter. We define a‘better’ solution S ′′ as follows.First include into S ′′ all elements in S ′ ∩ N . Let ρ = W − ∑ i ′ ∈ S ′ ∩ N w ′ i , denote theremaining slack in the sack after inserting the elements in N ∩ S ′ . Next, for each i ∈ D we insert i into S ′′ if in the binary representation of ρ , the i -th bit equals 1.We show that ∆ ( S ′′ ) > ∆ ( S ′ ) , contradicting the optimality of S ′ .Let D ′ = S ′ ∩ D , and let ρ ′ = ∑ i ∈ D ′ i − ( n + n ) − be the number obtained by thebinary representation of the elements in D ′ . Observe that since by our assumption ∑ { i ′ ∈ S ′ | i ′ ∈ N ∪ D } w ′ i < W we get ρ ≥ ρ ′ + ∆ ( (cid:12)(cid:12) S ′′ ∩ D (cid:12)(cid:12) ) − ∆ ( (cid:12)(cid:12) S ′ ∩ D (cid:12)(cid:12) ) = ( ρ − ρ ′ ) · ( m + ) ≥ m + ∆ ( S ′′ ) − ∆ ( S ′ ) = ∆ ( (cid:12)(cid:12) S ′′ ∩ D (cid:12)(cid:12) ) − ∆ ( (cid:12)(cid:12) S ′ ∩ D (cid:12)(cid:12) ) + ∆ ( (cid:12)(cid:12) S ′′ ∩ N (cid:12)(cid:12) ) − ∆ ( (cid:12)(cid:12) S ′ ∩ N (cid:12)(cid:12) ) ≥ m + − ∑ i ∈ S ′ ∩ N v i ′ ≥ ∑ i ∈ S ′ ∩ N v ′ i ≤ m = ∑ i ∈ [ n ] v ( ) i , contradictingthe optimality of S ′ .Now, clearly, if ∑ { i ′ ∈ S ′ | i ′ ∈ N ∪ D } w i = W , then S ′ ∩ N is an optimal solution for I , since otherwise we can improve over the solution S ′ (by taking the same itemsfrom N and D and add the optimal solution for I ).2. Assume for sake of contradiction that, S ′ defined in the claim, is not an optimalsolution for I . Let e S be an optimal solution for I . Let us define S ′′ as follows: Firstwe include the items from e S , then add items in D until the total weight reaches W . Finally, we include the items from S ′ ∩ N .Observe that by the previous item, the set S ′′ is a feasible solution (as the weight ofthe elements in S ′ ∩ N does not exceeds W /2). Since in I ′ for each i ∈ N we set: v ′ i = v ( ) i − n · ( m + ) W + e S we have: ∆ ( S ′′ ∩ N ) − ∆ ( S ′ ∩ N ) ≥ ( m + ) W + S ′ may contain at most log W more elements from D than S ′′ contains.However their value is bounded by ( m + ) W , and hence: ∆ ( S ′′ ) − ∆ ( S ′ ) = ( ∆ ( S ′′ ∩ N ) − ∆ ( S ′ ∩ N )) + ( ∆ ( S ′′ ∩ D ) − ∆ ( S ′ ∩ D )) ≥ ( m + ) W + − ( m + ) W ≥ S ′ .Finally, it is easy to see that running time of Dec is at most O ( d ′ ) . General Case k = ℓ , for some ℓ ∈ N . We will use ( Gen , Dec ) given in the previouscase to show that there exists a pair of deterministic algorithms ( g Gen , g Dec ) such thatthe conditions of ( S , T ) -self direct product feasibility in Definition 3.3 are met for theKnapsack problem for every d ∈ N , k ∈ T .Let us first present the algorithm g Gen . Let I , . . . I k ∈ I KS ( d ) be the input to Gen where for every j ∈ [ k ] , we have I j = ( W j , { ( v ( j ) i , w ( j ) i ) } i ∈ [ n j ] ) . We arbitrarily pair up the k instances, and feed each pair of instances to Gen (described previously in the proof).We obtain k /2 instances of Knapsack problem of size O ( d ) . We repeat the process ofarbitrarily pairing up the instances and feeding it to Gen . After doing this process log k times, we will have a single instance of Knapsack as the output of Gen which will be ofsize at most k O ( ) · d . This is the output of g Gen .Finally, it suffices to note that g Dec simply does a restriction of the solution to thecoordinates of the instance of interest as in the previous case where k was set to 2.Like MaxSAT , Knapsack too admits a hardness amplification result but we skip writ-ing it here for the sake of non-repetitiveness. Also, notice that the above lemma is provenfor functions S , T on domain N × T instead of N × N . This is done for the sake of clearpresentation. The above proof can be extended to prove the direct product feasibility forthe general case as well. Remark 4.9 (Adopting above proof to maximization version of other covering problems) . The idea of taking k instances with appropriate scaling as in Lemma 4.7 can be extended tomany maximization versions of covering problems (that are NP -hard) such as Max-coverage,Clustering etc. Consequently, we obtain hardness amplification results for these problems aswell. P In this section, we look at hardness amplification for two natural and important stringproblems which have been at the center stage of fine-grained complexity in the last fewyears. We also look at the problem of matrix multiplication, which does not fit into ourscheme of hardness amplification given in Section 3 as it is not known to admit efficientdeterministic verification. We also propose the problem of computing Frechet distanceas a natural problem which might not be direct product feasible.23 .1 Longest Common Subsequence
In the Longest Common Subsequence problem we are given two strings and the goalis to find the subsequence of maximum length that is in both the strings. It is indeed anatural maximization problem and fits smoothly into our formalism as follows.
Definition 5.1 ( LCS alignment) . Let Σ be a finite non-empty set and n ∈ N . For every pairof strings ( a , b ) ∈ Σ n × Σ n and every function σ : [ n ] → [ n ] ∪ {⊥} , we say that σ is an LCS alignment for ( a , b ) if the following holds. • Monotonicity : For every i , j ∈ [ n ] , i < j we have that if σ ( i ) = ⊥ and σ ( j ) = ⊥ then σ ( i ) < σ ( j ) . • Matching : For every i ∈ [ n ] , if σ ( i ) = ⊥ then we have a i = b σ ( i ) .The length of an LCS alignment σ is defined as the cardinality of the preimage of [ n ] , i.e., |{ i ∈ [ n ] | σ ( i ) ∈ [ n ] }| . Definition 5.2 ( LCS problem) . Let Σ be a finite non-empty set. The Longest Common Sub-sequence ( LCS Σ ) is an optimization problem charaterized by the following quadruple of objects ( I LCS Σ , Sol
LCS Σ , ∆ LCS Σ , max ) , where: • For every d ∈ N , I LCS Σ ( d ) = Σ n × Σ n where n = d / ( | Σ | ) ; • For every ( a , b ) ∈ I LCS Σ we have Sol
LCS Σ ( a , b ) is the set of all LCS alignments for ( a , b ) ; • For every ( a , b ) ∈ I LCS Σ and every LCS alignment σ for ( a , b ) we define ∆ LCS Σ ( a , b , σ ) tobe the length of σ . Now we show that
LCS on alphabets of some size are direct product feasible with
LCS on alphabets with an additional character.
Lemma 5.3.
Let Σ be a finite non-empty set. Let Ξ be a superset of Σ of cardinality | Σ | + . LetS , T : N × N → N , where S ( d , k ) = | Ξ | ( dk + k − ) and T ( d , k ) = · S ( d , k ) . Then wehave that ( LCS Σ , LCS Ξ ) are ( S , T ) -direct product feasible.Proof. Let Ξ = Σ ∪ { ξ } , where ξ / ∈ Σ . We define the pair of deterministic algorithms ( Gen , Dec ) below.Fix k , d ∈ N (and consequently n ∈ N ). Let ℓ : = + nk , m : = nk + ℓ ( k − ) , and d ′ = m | Ξ | . Let z ∈ Ξ ℓ be the concatenation of ℓ copies of ξ , i.e., z : = ξ ◦ · · · ◦ ξ | {z } ℓ copies . Here we use the unary encoding to encode a symbol in Σ for the ease of presentation, as it circumventsrounding issues. This does not affect the results in this paper as we think of Σ as some small universalconstant (like Σ = {
0, 1 } ). The results in this paper also hold for larger alphabets, but more care needs totaken in the rounding of parameters in the forthcoming proofs while using the binary encoding. ( a , b ) , . . . , ( a k , b k ) ∈ I LCS Σ ( d ) given as input to Gen , it outputs the in-stance ( a , b ) in I LCS Ξ ( d ′ ) where, a : = a ◦ z ◦ a ◦ z ◦ · · · ◦ z ◦ a k ∈ Ξ m , b : = b ◦ z ◦ b ◦ z ◦ · · · ◦ z ◦ b k ∈ Ξ m .It is clear that the running time of Gen is d ′ .Next, for every i ∈ [ k ] , ( a , b ) , . . . , ( a k , b k ) ∈ I LCS Σ ( d ) , and an LCS alignment e σ : [ m ] → [ m ] ∪ {⊥} for ( a , b ) given as input to Dec , the algorithm first runs
Gen to compute ( a , b ) and then outputs σ : [ n ] → [ n ] ∪ {⊥} which is computed as follows: ∀ j ∈ [ n ] , σ ( j ) = (e σ ( j + loc ) if loc < e σ ( j + loc ) ≤ n + loc , ⊥ otherwise,where loc = ( n + ℓ )( i − ) . It is easy to see that σ is an LCS alignment for ( a i , b i ) . Also,it is easy to see that the running time of Dec is running time of
Gen plus d ′ (needed tocompute σ ).We next show that if e σ is an optimal LCS alignment for ( a , b ) then σ is an optimalalignment for ( a , b ) . To show this we first show that if e σ is an optimal LCS alignment thenit has a certain “block structure”. Consider the following
LCS alignment e τ for ( a , b ) : ∀ j ∈ [ m ] , e τ ( j ) = ( j if a j = ξ , ⊥ otherwise. e τ is an LCS alignment because of our construction of ( a , b ) , where we have that for all j ∈ [ m ] if a j = ξ then b j = ξ as well. We have that ∆ LCS Ξ ( a , b , e τ ) = ℓ ( k − ) . Therefore if e σ is an optimal LCS alignment then we should have ∆ LCS Ξ ( a , b , e σ ) ≥ ℓ ( k − ) .Suppose there exists j ∈ [ n ] such that e σ ( j + loc ) = ⊥ and is strictly greater than n + loc . If there is more than one such j then we pick the smallest one. Then we have that e σ ( j + loc ) > n + loc + ℓ as a j + loc ∈ Σ but b n + loc + r = ξ / ∈ Σ for all r ∈ [ ℓ ] . Also we havethat for every j ′ ∈ [ n ] that is strictly less than j either e σ ( j + loc ) = ⊥ or is at most n + loc .In this case, we have that the length of e σ is at most m − ℓ = nk + ℓ ( k − ) = ℓ ( k − ) −
1, acontradiction as we showed earlier that ∆ LCS Ξ ( a , b , e σ ) ≥ ℓ ( k − ) . By following a similarargument we can show that there does not exist j ∈ [ n ] such that e σ ( j + loc ) = ⊥ and isless than or equal to loc . This implies that e σ restricted to the coordinates in the interval [ loc + loc + n ] provides an optimal LCS alignment for pair of contiguous substringof a and b restricted to the coordinates in the interval [ loc + loc + n ] . Thus σ is anoptimal LCS alignment for ( a i , b i ) .Finally, we note that the bound on the functions S and T hold as d ′ = ( nk + ( nk + )( k − )) | Ξ | = | Ξ | ( nk + k − ) ≤ | Ξ | ( dk + k − ) . LCS problem has been of special interest in the last few years thanks to the advance-ments in fine grained complexity [ABW15, AHWW16, Wil15, Wil16, Wil18]. The abovedirect product feasibility immediately implies the below hardness amplification result.
Corollary 5.4.
Let ε > . Let Σ be a finite non-empty set. Let Ξ be a superset of Σ of cardinality | Σ | + . Let D = { D d } d ∈ N be a family of distributions such that for every randomized algorithm A running in time d + ε over inputs of size d, the following holds for large d ∈ N . D d is a distribution over I LCS Σ ( d ) where an instance of I LCS Σ ( d ) can be sampled from D d in e O ( d ) time. • Pr ( a , b ) ∼ D d [ A finds optimal LCS alignment for ( a , b ) with probability at least ] ≤ − d − o ( ) .Then there is some ε ′ > such that there is a distribution family D ′ = { D ′ d } d ∈ N such thatfor every randomized algorithm A ′ running in time d + ε ′ , the following holds for all large enoughd ∈ N . • D ′ d is a distribution over I LCS Ξ ( d + o ( ) ) and an instance in I LCS Ξ can be sampled from D ′ d in d + o ( ) time. • Pr ( a ′ , b ′ ) ∼ D ′ d [ A ′ finds optimal LCS alignment for ( a ′ , b ′ ) with probability at least ] ≤ Proof.
We apply Theorem 3.4 by setting, Π = I LCS Σ , Λ = I LCS Ξ , p = / , S ( d , k ) = | Ξ | ( dk + k − ) , T ( d , k ) = · S ( d , k ) , v ( d ) = w · d (for some w ∈ N ), s ( d ) = e O ( d ) , t ( d ) = d + ε , and fail ( d ) = / d o ( ) . Then we have that k : = d o ( ) and c : =
300 ln 3. Weverify that k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holds by noting that k · s ( d ) + T ( d , k ) + v ( d ) = d + o ( ) and t ( d ) c = Ω ( d + ε ) . Therefore, we have that the sampling time from D ′ d is d + o ( ) and that the theorem statement holds for any randomized algorithm A ′ running in time t ( d ) c = Θ ( d + ε ) = Θ (cid:18)(cid:16) d + o ( ) (cid:17) + ε − δ (cid:19) , for any δ > Remark 5.5 (Hardness Amplification of k - LCS and other parameterized complexity prob-lems) . We remark here that the above proof strategy can be extended to show a hardness amplifi-cation result for the k-
LCS problem (for fixed k) which is of interest in parameterized complexity.In fact, following Remark 4.4, we can obtain hardness amplification theorems for fundamentalproblems in fixed parameter complexity such k-clique and k-set cover.
Recall that the edit distance between a pair of strings a , b ∈ Σ n is defined as the minimalnumber of edit operations needed to convert x into y , where edit operations are characterinsertions/ deletions and substitutions. Definition 5.6 ( ED alignment) . Let Σ be a finite non-empty set and n ∈ N . For every pairof strings ( a , b ) ∈ Σ n × Σ n and every function σ : [ n ] → [ n ] ∪ {⊥} , we say that σ is an LCS alignment for ( a , b ) if the following holds. Monotonicity : For every i , j ∈ [ n ] , i < j we have that if σ ( i ) = ⊥ and σ ( j ) = ⊥ then σ ( i ) < σ ( j ) .The cost of an ED alignment σ is defined as twice the cardinality of the preimage of ⊥ plusthe number of mismatches, i.e., |{ i ∈ [ n ] | σ ( i ) = ⊥}| + |{ i ∈ [ n ] | y σ ( i ) = x i }| .We interpret an alignment σ is as follows: For every i ∈ [ n ] if σ ( i ) = ⊥ then σ matches each symbol x i into y σ ( i ) (while paying an edit operation in case of substitu-tion). In case σ ( i ) = ⊥ , then it means that σ deletes the i -th character of x . In case we26ave σ ( i ) , σ ( i + ) = ⊥ but σ ( i + ) > σ ( i ) + σ inserts the characters y σ ( i )+ , . . . , y σ ( i + ) − . The bound on the edit cost of σ simply follows by the fact that thenumber of characters insertions and deletions is equal. Definition 5.7 ( ED problem) . Let Σ be a finite non-empty set. The Edit Distance ( ED Σ ) is anoptimization problem characterized by the following quadruple of objects ( I ED Σ , Sol ED Σ , ∆ ED Σ , min ) ,where: • For every d ∈ N , I ED Σ ( d ) = Σ n × Σ n where n = d / ( | Σ | ) ; • For every ( a , b ) ∈ I ED Σ we have Sol ED Σ ( a , b ) is the set of all ED alignments for ( a , b ) ; • For every ( a , b ) ∈ I ED Σ and every ED alignment σ for ( a , b ) we define ∆ ED Σ ( a , b , σ ) to bethe cost of σ . Lemma 5.8.
Let Σ be a finite non-empty set. Let Ξ be a superset of Σ of cardinality | Σ | + . LetS , T : N × N → N , where S ( d , k ) = | Ξ | dk and T ( d , k ) = · S ( d , k ) . Then we have that ( ED Σ , ED Ξ ) are ( S , T ) -direct product feasible.Proof. Let Ξ = Σ ∪ { ξ } , where ξ / ∈ Σ . We define the pair of deterministic algorithms ( Gen , Dec ) below.Fix k , d ∈ N (and consequently n ∈ N ). Let ℓ : = + nk , m : = ( n + ℓ ) k , and d ′ = m | Ξ | . Let z ∈ Ξ ℓ be the concatenation of ℓ copies of ξ , i.e., z : = ξ ◦ · · · ◦ ξ | {z } ℓ copies .For every ( a , b ) , . . . , ( a k , b k ) ∈ I ED Σ ( d ) given as input to Gen , it outputs the instance ( a , b ) in I ED Ξ ( d ′ ) where, a : = a ◦ z ◦ a ◦ z ◦ · · · ◦ z ◦ a k ◦ z ∈ Ξ m , b : = b ◦ z ◦ b ◦ z ◦ · · · ◦ z ◦ b k ◦ z ∈ Ξ m .It is clear that the running time of Gen is d ′ .Next, for every i ∈ [ k ] , ( a , b ) , . . . , ( a k , b k ) ∈ I ED Σ ( d ) , and an ED alignment e σ : [ m ] → [ m ] ∪ {⊥} for ( a , b ) given as input to Dec , the algorithm first runs
Gen to compute ( a , b ) and then outputs σ : [ n ] → [ n ] ∪ {⊥} which is computed as follows: ∀ j ∈ [ n ] , σ ( j ) = (e σ ( j + loc ) if loc < e σ ( j + loc ) ≤ n + loc , ⊥ otherwise,where loc = ( n + ℓ )( i − ) . It is easy to see that σ is an ED alignment for ( a i , b i ) . Also,it is easy to see that the running time of Dec is running time of
Gen plus d ′ (needed tocompute σ ).We next show that if e σ is an optimal ED alignment for ( a , b ) then σ is an optimalalignment for ( a , b ) .To show this we first prove that there exists an optimal alignment e σ ′ which is anoptimal ED alignment for ( a , b ) and it is ‘block-consistent’. Similarly to the LCS case, Here again we use the unary encoding to encode a symbol in Σ for the ease of presentation. e σ ( i ) ∈ { i − ℓ /2, . . . , i + ℓ /2 } (as otherwise its costexceeds the identity alignment cost). To simplify the notations we define for each j ∈ [ k ] the starting and the end indices of each block a i , specifically: s ( a j ) = ( ℓ − + n )( j − ) + f ( a j ) = ( ℓ − + n )( j − ) + n .We denote by ∆ ( e σ ) j the number of edit operations made by e σ on the j th block of ( a , b ) , i.e., it equals twice the cardinality of the preimage of ⊥ plus the number of mis-matches for indices in j ∈ { s ( a j ) , . . . s ( a j ) + n + ℓ } Claim 5.9.
There exists an optimal alignment e σ ′ which is an optimal ED alignment for ( a , b ) which is ‘block-consistent’, that is: For all j ∈ [ k ] , if j > then, e σ ′ ( s ( a j ) − ) = s ( a j ) − and e σ ′ ( f ( a j ) + ) = f ( a j ) + Moreover, ∆ ( e σ ′ ) j = ∆ e ( a j , b j ) . Proof of Claim 5.9.
We take any optimal alignment e σ and gradually change it to be ’block-consistent’ while preserving its cost. This is done as follows: We define e σ = e σ . For each j ∈ [ k ] we define e σ j = e σ j − and then convert it to be consistent on the j th block by: • Deleting all the characters in a j that were mapped into z and then matching allthe characters of z (formally, for each i ∈ { s ( a j ) , . . . , f ( a j ) } if e σ j − ( i ) > f ( a j ) set e σ j ( i ) = ⊥ ); • Matching all the characters in j th block of z (formally, for each i ∈ { f ( a j ) +
1, . . . , s ( a j + ) − } set e σ j ( i ) = i ); • Each character in the prefix of a j + that was mapped into z in e σ j − is deleted (for-mally for for each i ∈ { s ( a j + ) , . . . , f ( a j + ) } if e σ j − ( i ) < s ( a j + ) set e σ j ( i ) = ⊥ ).Finally set e σ ′ = e σ k .We next claim that ∆ ( e σ j ) ≤ ∆ ( e σ j − ) (and by the optimality of e σ we get an optimal-ity of e σ j ): Let disp e σ j ( i ) = i − e σ j ( i ) . The proof proceeds by a case analysis: Case 1 – disp e σ j − ( f ( a j ) + ) > : Let us compare ∆ ( e σ j − ) and ∆ ( e σ j ) : In e σ j we deletedisp e σ j − ( f ( a j ) + ) symbols that were not deleted in e σ j − . However, on each such a dele-tion in e σ j − we payed for a mismatch. Next, in both e σ j − , e σ j we pay no edit operationsas long as we match between z characters and then:If disp e σ j − ( s ( a j + )) >
0: In e σ j we pay disp e σ j − ( s ( a j + )) additions the prefix of b j + .On the other hand, in e σ j − we pay this much of substitutions caused by matching z characters into b j + .If disp ( e σ j − ( s ( a j + )) <
0: In this case in e σ j − we pay at least disp ( e σ j − ( s ( a j + )) mismatches and characters insertions till we reach an index i disp ( e σ j − ( i ) ≥ s ( a j + ) . Onthe other hand in e σ j we pay this much of characters deletions to delete the prefix of a j + .Overall, the total costs of the alignments are equal.28 ase 2 – disp ( e σ j − ( f ( a j ) + )) < : We handle similarly using the same arguments.To prove the moreover part, observe that e σ ′ matches all the z blocks to themselves.Therefore, if there exists a block j ∈ [ k ] which e σ ′ does not align optimally a j into b j thenthere exists a better alignment than e σ ′ . Contradicting to the optimality of e σ ′ .To conclude the correctness of our algorithm Dec , observe that it mimics the behav-ior e σ ′ on the j th block of ( a , b ) . Since ∆ ( e σ ′ ) j = ∆ e ( a j , b j ) the proof follows. Finally, wenote that the bound on the functions S and T hold as d ′ = ( nk + nk ) | Ξ | ≤ | Ξ | dk . ED problem has also been of special interest in the last few years thanks to the ad-vancements in fine grained complexity [BI18, AHWW16, Wil15, Wil16, Wil18]. The abovedirect product feasibility immediately implies the below hardness amplification result. Corollary 5.10.
Let ε > . Let Σ be a finite non-empty set. Let Ξ be a superset of Σ of cardinality | Σ | + . Let D = { D d } d ∈ N be a family of distributions such that for every randomized algorithm A running in time d + ε over inputs of size d, the following holds for large d ∈ N . • D d is a distribution over I ED Σ ( d ) where an instance of I ED Σ ( d ) can be sampled from D d in e O ( d ) time. • Pr ( a , b ) ∼ D d [ A finds optimal ED alignment for ( a , b ) with probability at least ] ≤ − d − o ( ) .Then there is some ε ′ > such that there is a distribution family D ′ = { D ′ d } d ∈ N such thatfor every randomized algorithm A ′ running in time d + ε ′ , the following holds for all large enoughd ∈ N . • D ′ d is a distribution over I ED Ξ ( d + o ( ) ) and an instance in I ED Ξ can be sampled from D ′ d ind + o ( ) time. • Pr ( a ′ , b ′ ) ∼ D ′ d [ A ′ finds optimal ED alignment for ( a ′ , b ′ ) with probability at least ] ≤ Proof.
We apply Theorem 3.4 by setting, Π = I ED Σ , Λ = I ED Ξ , p = / , S ( d , k ) = | Ξ | dk , T ( d , k ) = · S ( d , k ) , v ( d ) = w · d (for some w ∈ N ), s ( d ) = e O ( d ) , t ( d ) = d + ε , and fail ( d ) = / d o ( ) . Then we have that k : = d o ( ) and c : =
300 ln 3. We verify that k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holds by noting that k · s ( d ) + T ( d , k ) + v ( d ) = d + o ( ) and t ( d ) c = Ω ( d + ε ) . Therefore, we have that the sampling time from D ′ d is d + o ( ) and thatthe theorem statement holds for any randomized algorithm A ′ running in time t ( d ) c = Θ ( d + ε ) = Θ (cid:18)(cid:16) d + o ( ) (cid:17) + ε − δ (cid:19) , for any δ > Remark 5.11 (Fr´echet Distance) . Note that another well studied subquadratic hard [Bri14,AHWW16] problem of computing Fr´echet Distance does not seem to admit direct product feasi-bility (at least the natural way to aggregate fails). .3 Matrix Multiplication Now we consider the matrix multiplication problem. It may be written as an optimiza-tion problem in the following rather mundane way:
Definition 5.12 (Matrix Multiplication problem) . Let q be some large prime (universal con-stant). The Matrix Multiplication (
Mult ) is an optimization problem charaterized by the follow-ing quadruple of objects ( I Mult , Sol
Mult , ∆ Mult , min ) , where: • For every d ∈ N , I Mult ( d ) is the set of all pairs of n × n matrices with entries in F q ; • For every ( A , B ) ∈ I Mult we have
Sol
Mult ( A , B ) is the set of all n × n matrices with entriesin F q ; • For every ( A , B ) ∈ I Mult and every n × n matrix C we define ∆ Mult ( A , B , C ) to be thenumber of entries in C which differ from AB. Given the above formalism, we show that it is self direct product feasible.
Lemma 5.13.
Let S , T : N × N → N , where S ( d , k ) = ( dk ) and T ( d , k ) = O ( d k ) . Thenwe have that Mult is ( S , T ) -self direct product feasible.Proof. We define the pair of deterministic algorithms ( Gen , Dec ) below.For every ( A , B ) , . . . , ( A k , B k ) ∈ I Mult ( d ) given as input to Gen , it outputs the in-stance ( A ′ , B ′ ) in I Mult ( d ′ ) defined as: A ′ : = A A k and: B ′ : = B B k It is clear that the running time of
Gen is O ( d ′ ) .Next, for every i ∈ [ k ] , ( A , B ) , . . . , ( A k , B k ) ∈ I Mult ( d ) , and a matrix C ′ ∈ Sol
Mult ( A ′ , B ′ ) given as input to Dec , the algorithm first runs
Gen to compute ( A ′ , B ′ ) and then outputs C ∈ Mult ( A i , B i ) which is the i -th block of C ′ .We next show that if C ′ is an optimal Mult matrix for ( A ′ , B ′ ) then C is an optimalalignment for ( A i , B i ) . An optimal solution for ( A ′ , B ′ ) is the matrix C ′ = A ′ · B ′ . Bymatrix multiplication definition its i -th block of C ′ equals A i · B i which is indeed theoptimal solution for ( A i , B i ) .Clearly the running time of Dec is O ( d k ) , the proof follows.It is a well-known open problem [K ¨un18, WW18] to find an efficient way to checkif the product of two matrices is equal to the third matrix. However, it can be checkedin linear time (in the input size) if we allow randomness [Fre77]. But Theorem 3.4 needs30he verification to be deterministic so we cannot invoke it directly. Let us briefly explainhow to modify the proof so it generalizes to the case of matrix multiplication.First, observe that using [Fre77] result, given n × n matrices A , B , and C in F q onecan check whether C = A · B with failure probability less than 2 − m in time mn .Recall the proof of Theorem 3.4 and let us restate it in matrix multiplication ter-minology: Given input matrices A , B , the algorithm repeatedly define matrices A ′ , B ′ such that A , B are nested as a random sub-block of A ′ , B ′ as defined in the proof ofLemma 5.13. This above process is repeated for c -times. The matrices A ′ , B ′ are fedinto algorithm A ′ , and the output of A ′ is the product A ′ · B ′ .We modify the algorithm so that it iterates over its main loop for 2 c iterations. Insuch a way, using the same argument described in the proof of Theorem 3.4, we areguaranteed that, on at least 1 − fail ( d ) fraction of instances sampled from original distri-bution, the probability that none of the matrices output by A ′ equals A ′ · B ′ is at most: (cid:18) − ln ( / − p ) c (cid:19) c ≤ − ( e ln ( − p ) ) ≤ − p .Now suppose that on at least one of the iterations A ′ outputs a matrix C ′ that equals A ′ · B ′ (which now happen with probability at least 2 p ). We run [Fre77] algorithm oneach matrix output by A ′ , so that we are guaranteed that with probability at least 1 − p for each such a matrix, the algorithm outputs ’equal’ iff C ′ = A ′ · B ′ . In total, the successprobability of our modified algorithm is at least 2 p − p = p .Summarizing, on at least 1 − fail ( d ) fraction of inputs sampled from the originaldistribution, our modified algorithm outputs A · B with probability at least p , as claimed.Thus, with a setting of parameters as in the proofs of Corollaries 5.4 and 5.10, we havethe following. Corollary 5.14.
Let ε > . Let D = { D d } d ∈ N be a family of distributions such that for everyrandomized algorithm A running in time d + ε over inputs of size d, the following holds for larged ∈ N . • D d is a distribution over I Mult ( d ) where an instance of I Mult ( d ) can be sampled from D d in e O ( d ) time. • Pr ( A , B ) ∼ D d [ A finds correct product of ( A , B ) with probability at least ] ≤ − d − o ( ) .Then there is some ε ′ > such that there is a distribution family D ′ = { D ′ d } d ∈ N such thatfor every randomized algorithm A ′ running in time d + ε ′ , the following holds for all large enoughd ∈ N . • D ′ d is a distribution over I Mult ( d + o ( ) ) and an instance in I Mult can be sampled from D ′ d in d + o ( ) time. • Pr ( A ′ , B ′ ) ∼ D ′ d [ A ′ finds correct product of ( A ′ , B ′ ) with probability at least ] ≤ Almost Worst Case to Average Case for Problems in
TFNP
In this section, we look at two problems in
TFNP : Factoring and End of a Line. We alsopoint towards future directions of research in the intersection of hardness amplificationand
TFNP . Factoring is the problem of given a number as input, the task of determining all its primefactors. Given a candidate set of prime factors for a number it is possible to efficientlycheck that they are prime numbers, factors of the input number, and exhaustive, i.e.,there are no more prime factors of the input number. This is captured in our formalismfor optimization problems as follows.
Definition 6.1 (Prime factor set of a number) . For every positive integer n > and every set Γ ⊆ [ n ] , we say that Γ is a prime factor set of n if the following holds. • Divisor : For every a ∈ Γ , we have that a divides n. • Prime : For every a ∈ Γ , we have that a is prime (greater than 1). Definition 6.2 (Factoring problem) . The Factoring problem (
Factor ) is an optimization prob-lem charaterized by the following quadruple of objects ( I Factor , Sol
Factor , ∆ Factor , max ) , where: • For every d ∈ N , I Factor ( d ) = {
2, . . . , 2 d + } ; • For every n ∈ I Factor we have
Sol
Factor ( n ) is the set of all prime factor sets of n; • For every n ∈ I Factor and every prime factor set Γ of n, we define ∆ Factor ( n , Γ ) to be thecardinality of Γ . Now, we show that factoring is self direct product feasible.
Lemma 6.3.
Let S , T : N × N → N , where S ( d , k ) = dk and T ( d , k ) = e O ( dk ) . Then we havethat Factor is ( S , T ) -self direct product feasible.Proof. We define the pair of deterministic algorithms ( Gen , Dec ) below.For every n , . . . , n k ∈ I Factor ( d ) given as input to Gen , it outputs the instance n ′ in I Factor ( d ′ ) (where d ′ ≤ dk ) defined as: n ′ : = Π i ∈ k n i .It is clear that the running time of Gen is e O ( d ′ ) .Next, for every i ∈ [ k ] , n , . . . , n k ∈ I Factor ( d ) , and a Factor assignment Γ ′ ∈ Sol
Factor ( n ) for n ′ given as input to Dec , the algorithm first runs
Gen to compute n ′ and then outputs Γ ∈ Sol
Factor ( n ) which is computed as follows: We initialize Γ = ∅ . Then we scan Γ ′ andfor each a ′ ∈ Γ ′ we insert a ′ in Γ if a ′ divides n i .32e next show that if Γ ′ is an optimal Factor factor set for n ′ then Γ is an optimalfactor set for n i . Indeed if Γ ′ is an optimal solution for n ′ then it equals the set of all prime factors of n ′ , which include all prime factors of n i . Hence each prime factor a of n will be inserted into Γ during the scan of Γ ′ .Clearly the running time of Gen is e O ( dk ) , since the check whether each a ∈ Γ ′ divides n i takes e O ( d ) -time, the proof follows.The factoring problem is known to be in the class PPA (under randomized reduc-tions) [Jer16]. It’s also the basis of many cryptosystems. Therefore the following hard-ness amplification result is of interest.
Corollary 6.4.
Let a ≥ . Let D = { D d } d ∈ N be a family of distributions such that for everyrandomized algorithm A running in time O ( d a ) over inputs of size d, the following holds forlarge d ∈ N . • D d is a distribution over I Factor ( d ) where an instance of I Factor ( d ) can be sampled from D d in e O ( d ) time. • Pr n ∼ D d [ A finds largest prime factor set of n with probability at least ] ≤ − d .Then for m : = Θ ( d ) , there is a distribution family D ′ = { D ′ d } d ∈ N such that for everyrandomized algorithm A ′ running in time O ( m a /7 ) over inputs of size m, the following holds forall large enough d ∈ N . • D ′ d is a distribution over I Factor ( m ) and an instance in I Factor can be sampled from D ′ d in e O ( m ) time. • Pr n ′ ∼ D ′ d [ A ′ finds largest prime factor set of n ′ with probability at least ] ≤ Proof.
We apply Theorem 3.4 by setting, Π = Λ = Factor , p = / , S ( d , k ) = dk , T ( d , k ) =( dk ) + o ( ) , v ( d ) = d + o ( ) , s ( d ) = e O ( d ) , t ( d ) = d a , and fail ( d ) = / d . Then we havethat k : = Θ ( d ) and c : =
300 ln 3. We verify that k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holdsby noting that k · s ( d ) + T ( d , k ) + v ( d ) = e O ( d ) and t ( d ) c = Ω ( d a ) = Ω ( d ) (because a ≥ D ′ d is e O ( d ) = e O ( m ) and thatthe theorem statement holds for any randomized algorithm A ′ running in time t ( d ) c = Θ ( d a ) = Θ ( m a /7 ) .We wonder if the above result can have any meaningful, concrete applications incryptography. The End of a Line problem [Pap94, DGP09] is the canonical
PPAD -complete problem.Informally it captures the handshaking lemma in directed graphs on 2 n vertices given asinput through predecessor and successor circuits of poly ( n ) size. It may be written as anoptimization problem under our formalism in the following (mundane) way.33 efinition 6.5 (Size of a Circuit) . For every positive integer n and every circuit C : {
0, 1 } n →{
0, 1 } n , the size of C denoted by | C | is the number of bits needed to describe C. Definition 6.6 (End of a Line problem) . The End of a Line problem (
EoL ) is an optimizationproblem charaterized by the following quadruple of objects ( I EoL , Sol
EoL , ∆ EoL , max ) , where: • For every d ∈ N , I EoL ( d ) = { ( A , B ) | A , B : {
0, 1 } n → {
0, 1 } n and | A | + | B | ≤ d } ; • For every ( A , B ) ∈ I EoL we have
Sol
EoL ( A , B ) = {
0, 1 } n ; • For every ( A , B ) ∈ I EoL and every x ∈ {
0, 1 } n , we define ∆ EoL ( A , B , x ) as follows: ∆ EoL ( A , B , x ) = ( if A ( B ( x )) = x or B ( A ( x )) = x = n otherwise Lemma 6.7.
Let S , T : N × N → N , where S ( d , k ) = dk and T ( d , k ) = O ( dk ) . Then we havethat Factor is ( S , T ) -self direct product feasible.Proof. We define the pair of deterministic algorithms ( Gen , Dec ) below.Fix k , d ∈ N . For every ( A , B ) , . . . , ( A k , B k ) ∈ I EoL ( d ) given as input to Gen , itoutputs the instance ( A ′ , B ′ ) in I EoL ( d ′ ) where d ′ = dk , and A ′ , B ′ : {
0, 1 } nk → {
0, 1 } nk are defined as follows. For all x ′ = ( x ′ , . . . , x ′ k ) ∈ {
0, 1 } nk , where ∀ i ∈ [ k ] , we have x ′ i ∈ {
0, 1 } n , and define A ′ ( x ′ ) and B ′ ( x ′ ) as follows. A ′ ( x ′ ) = ( A ( x ′ ) , . . . , A k ( x ′ k )) B ′ ( x ′ ) = ( B ( x ′ ) , . . . , B k ( x ′ k )) It is clear that the running time of
Gen is O ( d ′ ) .Next, for every i ∗ ∈ [ k ] , ( A , B ) , . . . , ( A k , B k ) ∈ I EoL ( d ) , and a EoL solution x ′ ∈{
0, 1 } nk for ( A ′ , B ′ ) given as input to Dec , the algorithm first runs
Gen to compute ( A ′ , B ′ ) and then outputs x ∈ {
0, 1 } n which is computed as follows: x j = x ′ ( i ∗ − ) n + j .We next show that if x ′ is an optimal EoL solution for ( A ′ , B ′ ) then x is an optimalsolution for ( A i ∗ , B i ∗ ) . This follows easily by the fact that the ( A i , B i ) s are defined on dis-joint union of circuits, hence an optimal solution for ( A ′ , B ′ ) induces an optimal solutionfor each of the ( A i , B i ) s.The running times of Gen and
Dec trivially follow.
Corollary 6.8.
Let a ≥ . Let D = { D d } d ∈ N be a family of distributions such that for everyrandomized algorithm A running in time O ( d a ) over inputs of size d, the following holds forlarge d ∈ N . • D d is a distribution over I EoL ( d ) where an instance of I EoL ( d ) can be sampled from D d in e O ( d ) time. • Pr ( A , B ) ∼ D d [ A finds a solution of ( A , B ) with probability at least ] ≤ − d . hen for m : = Θ ( d ) , there is a distribution family D ′ = { D ′ d } d ∈ N such that for everyrandomized algorithm A ′ running in time O ( m a /7 ) over inputs of size m, the following holds forall large enough d ∈ N . • D ′ d is a distribution over I EoL ( m ) and an instance in I EoL can be sampled from D ′ d in e O ( m ) time. • Pr ( A ′ , B ′ ) ∼ D ′ d [ A ′ finds a solution of ( A ′ , B ′ ) with probability at least ] ≤ Proof.
We apply Theorem 3.4 by setting, Π = Λ = EoL , p = / , S ( d , k ) = dk , T ( d , k ) = w · dk (for some w ∈ N ), v ( d ) = w ′ · d (for some w ′ ∈ N ), s ( d ) = e O ( d ) , t ( d ) = d a ,and fail ( d ) = / d . Then we have that k : = Θ ( d ) and c : =
300 ln 3. We verify that k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holds by noting that k · s ( d ) + T ( d , k ) + v ( d ) = e O ( d ) and t ( d ) c = Ω ( d a ) = Ω ( d ) (because a ≥ D ′ d is e O ( d ) = e O ( m ) and that the theorem statement holds for any randomized algorithm A ′ running in time t ( d ) c = Θ ( d a ) = Θ ( m a /7 ) .Since there is a direct many-one reduction from EoL to the problem of computingapproximate Nash equilibrium in various kinds of games [CDT09, Rub18, Rub16], theabove corollary extends to give hardness amplification results for computing Nash equi-librium as well.
Remark 6.9.
The proof of Lemma 6.7 can be mimicked to get self direct product feasibilityof canonical complete problems of other
TFNP classes like Leaf (complete for
PPA ) [Pap94],LocalOPT (complete for
PLS ) [JPY88, DP11], End of Metered Line (equivalent to
EoPL )[HY17],and Unique EOPL [FGMS18].
Now we consider the same hardness amplification result of
EoL but against subex-ponential time algorithms. We provide below a slightly informal statement (using asymp-totic notations as opposed to providing specific constants) for clarity.
Corollary 6.10.
Let D = { D d } d ∈ N be a family of distributions such that for every randomizedalgorithm A running in time o ( d ) over inputs of size d, the following holds for large d ∈ N . • D d is a distribution over I EoL ( d ) where an instance of I EoL ( d ) can be sampled from D d in e O ( d ) time. • Pr ( A , B ) ∼ D d [ A finds a solution of ( A , B ) with probability at least ] ≤ − o ( d ) .Then for m : = o ( d ) , there is a distribution family D ′ = { D ′ d } d ∈ N such that for everyrandomized algorithm A ′ running in time m ω ( ) over inputs of size m, the following holds for alllarge enough d ∈ N . • D ′ d is a distribution over I EoL ( m ) and an instance in I EoL can be sampled from D ′ d in e O ( m ) time. • Pr ( A ′ , B ′ ) ∼ D ′ d [ A ′ finds a solution of ( A ′ , B ′ ) with probability at least ] ≤ roof. Let h : N → N be some slowly increasing function such that lim x → ∞ h ( x ) = ∞ andlet h : = h ( d ) . We apply Theorem 3.4 by setting, Π = Λ = EoL , p = / , S ( d , k ) = dk , T ( d , k ) = w · dk (for some w ∈ N ), v ( d ) = w ′ · d (for some w ′ ∈ N ), s ( d ) = e O ( d ) , t ( d ) = d / h , and fail ( d ) = / d / ( · h ) . Then we have that k : = Θ ( d / h ) and c : =
300 ln 3.We verify that k · s ( d ) + T ( d , k ) + v ( d ) ≤ t ( d ) c holds by noting that k · s ( d ) + T ( d , k ) + v ( d ) = ( + o ( )) · d / h and t ( d ) c = Ω ( d / h ) . Therefore, we have that the sampling time from D ′ d is 2 ( + o ( )) · d / h = O ( m ) and that the theorem statement holds for any randomizedalgorithm A ′ running in time t ( d ) c = O ( d / h ) = Θ ( m h ( d ) ) = m ω ( ) .Note that if we assume the (randomized) Exponential Time Hypothesis ( ETH ) for
PPAD [BPR16] then we obtain a family of distributions D = { D d } d ∈ N such that for everyrandomized algorithm A running in time 2 d − o ( ) over inputs of size d , the following holdsfor large d ∈ N . • D d is a distribution over I EoL ( d ) where an instance of I EoL ( d ) can be sampled from D d in e O ( d ) time. • Pr φ ∼ D d [ A finds maximizing assignment for φ with probability at least 2/3 ] ≤ − d .Therefore, our amplification in Corollary 6.10 is an almost worst-case to average-casereduction for EoL (under subexponential time reductions).
Acknowledgements
We would like to thank Amir Abboud, Irit Dinur, and Eylon Yogev for discussions andcomments.
References [Abb19] Amir Abboud. Personal communication, 2019.[ABW15] Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tighthardness results for LCS and other sequence similarity measures. In
IEEE56th Annual Symposium on Foundations of Computer Science, FOCS 2015,Berkeley, CA, USA, 17-20 October, 2015 , pages 59–78, 2015.[ACG +
99] G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela,and M. Protasi.
Complexity and Approximation: Combinatorial OptimizationProblems and Their Approximability Properties . Springer-Verlag Berlin Heidel-berg, 1999.[AHWW16] Amir Abboud, Thomas Dueholm Hansen, Virginia Vassilevska Williams,and Ryan Williams. Simulating branching programs with edit distance andfriends: or: a polylog shaved is a lower bound made. In
Proceedings of the48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016,Cambridge, MA, USA, June 18-21, 2016 , pages 375–388, 2016.36BFNW93] L ´aszl ´o Babai, Lance Fortnow, Noam Nisan, and Avi Wigderson. BPP hassubexponential time simulations unless EXPTIME has publishable proofs.
Computational Complexity , 3:307–318, 1993.[BI18] Arturs Backurs and Piotr Indyk. Edit distance cannot be computed instrongly subquadratic time (unless SETH is false).
SIAM J. Comput. ,47(3):1087–1097, 2018.[BKS06] Joshua Buresh-Oppenheim, Valentine Kabanets, and Rahul Santhanam.Uniform hardness amplification in NP via monotone codes.
Electronic Col-loquium on Computational Complexity (ECCC) , 13(154), 2006.[BPR16] Yakov Babichenko, Christos H. Papadimitriou, and Aviad Rubinstein. Canalmost everybody be almost happy? In
Proceedings of the 2016 ACM Con-ference on Innovations in Theoretical Computer Science, Cambridge, MA, USA,January 14-16, 2016 , pages 1–9, 2016.[BR13] Andrej Bogdanov and Alon Rosen. Input locality and hardness amplifica-tion.
J. Cryptology , 26(1):144–171, 2013.[Bri14] Karl Bringmann. Why walking the dog takes time: Frechet distance has nostrongly subquadratic algorithms unless SETH fails. In , pages 661–670, 2014.[BRSV17] Marshall Ball, Alon Rosen, Manuel Sabin, and Prashant Nalini Vasude-van. Average-case fine-grained hardness. In
Proceedings of the 49th AnnualACM SIGACT Symposium on Theory of Computing , STOC 2017, pages 483–496, New York, NY, USA, 2017. ACM.[BT06a] Andrej Bogdanov and Luca Trevisan. Average-case complexity.
Foundationsand Trends in Theoretical Computer Science , 2(1), 2006.[BT06b] Andrej Bogdanov and Luca Trevisan. On worst-case to average-case reduc-tions for NP problems.
SIAM J. Comput. , 36(4):1119–1159, 2006.[CDT09] Xi Chen, Xiaotie Deng, and Shang-Hua Teng. Settling the complexity ofcomputing two-player nash equilibria.
J. ACM , 56(3), 2009.[CIP06] Chris Calabro, Russell Impagliazzo, and Ramamohan Paturi. A dualitybetween clause width and clause density for SAT. In , pages 252–260, 2006.[CPS99] Jin-yi Cai, Aduri Pavan, and D. Sivakumar. On the hardness of permanent.In
STACS 99, 16th Annual Symposium on Theoretical Aspects of Computer Sci-ence, Trier, Germany, March 4-6, 1999, Proceedings , pages 90–99, 1999.[CW16] Timothy M. Chan and Ryan Williams. Deterministic apsp, orthogonal vec-tors, and more: Quickly derandomizing razborov-smolensky. In
Proceedingsof the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms,SODA 2016, Arlington, VA, USA, January 10-12, 2016 , pages 1246–1255, 2016.37DG08] Irit Dinur and Elazar Goldenberg. Locally testing direct product in the lowerror range. In , pages 613–622, 2008.[DGP09] Constantinos Daskalakis, Paul W. Goldberg, and Christos H. Papadim-itriou. The complexity of computing a nash equilibrium.
SIAM J. Comput. ,39(1):195–259, 2009.[Din16] Irit Dinur. Mildly exponential reduction from gap 3SAT to polynomial-gaplabel-cover.
ECCC , 23:128, 2016.[DP11] Constantinos Daskalakis and Christos H. Papadimitriou. Continuous localsearch. In
Proceedings of the Twenty-Second Annual ACM-SIAM Symposiumon Discrete Algorithms, SODA 2011, San Francisco, California, USA, January23-25, 2011 , pages 790–804, 2011.[DS14] Irit Dinur and David Steurer. Direct product testing. In
IEEE 29th Conferenceon Computational Complexity, CCC 2014, Vancouver, BC, Canada, June 11-13,2014 , pages 188–196, 2014.[Fei02] Uriel Feige. Relations between average case complexity and approxima-tion complexity. In
Proceedings on 34th Annual ACM Symposium on Theory ofComputing, May 19-21, 2002, Montr´eal, Qu´ebec, Canada , pages 534–543, 2002.[FGMS18] John Fearnley, Spencer Gordon, Ruta Mehta, and Rahul Savani. Uniqueend of potential line.
CoRR , abs/1811.03841, 2018.[FK00] Uriel Feige and Joe Kilian. Two-prover protocols - low error at affordablerates.
SIAM J. Comput. , 30(1):324–346, 2000.[Fre77] Rusins Freivalds. Probabilistic machines can use less running time. In
IFIPCongress , pages 839–842, 1977.[GG11] Parikshit Gopalan and Venkatesan Guruswami. Hardness amplificationwithin NP against deterministic algorithms.
J. Comput. Syst. Sci. , 77(1):107–121, 2011.[GIL +
90] Oded Goldreich, Russell Impagliazzo, Leonid A. Levin, RamarathnamVenkatesan, and David Zuckerman. Security preserving amplification ofhardness. In , pages 318–326, 1990.[GO05] Venkatesan Guruswami and Ryan O’Donnell.
Lecture 12: Feige-Kilian ”con-fuse/match” games (Part 1) . Lecture notes for CSE 533: The PCP Theoremand Hardness of Approximation. University of Washington, 2005.[Gol08] Oded Goldreich.
Computational Complexity: A Conceptual Perspective . Cam-bridge University Press, New York, NY, USA, 1 edition, 2008.[GR17] Oded Goldreich and Guy N. Rothblum. Worst-case to average-case reduc-tions for subclasses of P.
Electronic Colloquium on Computational Complexity(ECCC) , 24:130, 2017. 38GR18] Oded Goldreich and Guy N. Rothblum. Counting t-cliques: Worst-case toaverage-case reductions and direct interactive proof systems. In , pages 77–88, 2018.[HVV06] Alexander Healy, Salil P. Vadhan, and Emanuele Viola. Using nondeter-minism to amplify hardness.
SIAM J. Comput. , 35(4):903–931, 2006.[HY17] Pavel Hub´aˇcek and Eylon Yogev. Hardness of continuous local search:Query complexity and cryptographic lower bounds. In
Proceedings of theTwenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA2017, Barcelona, Spain, Hotel Porta Fira, January 16-19 , pages 1352–1371, 2017.[IKW12] Russell Impagliazzo, Valentine Kabanets, and Avi Wigderson. New direct-product testers and 2-query pcps.
SIAM J. Comput. , 41(6):1722–1768, 2012.[Imp95] Russell Impagliazzo. Hard-core distributions for somewhat hard problems.In , pages 538–545, 1995.[IP01] Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-sat.
J.Comput. Syst. Sci. , 62(2):367–375, 2001. Preliminary version in CCC’99.[IPZ01] Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which prob-lems have strongly exponential complexity?
J. Comput. Syst. Sci. , 63(4):512–530, 2001. Preliminary version in FOCS’98.[IW97] Russell Impagliazzo and Avi Wigderson.
P = BPP if E requires exponentialcircuits: Derandomizing the XOR lemma. In Proceedings of the Twenty-NinthAnnual ACM Symposium on the Theory of Computing, El Paso, Texas, USA,May 4-6, 1997 , pages 220–229, 1997.[Jer16] Emil Jer´abek. Integer factoring and modular square roots.
J. Comput. Syst.Sci. , 82(2):380–394, 2016.[JPY88] David S. Johnson, Christos H. Papadimitriou, and Mihalis Yannakakis.How easy is local search?
J. Comput. Syst. Sci. , 37(1):79–100, 1988.[K ¨un18] Marvin K ¨unnemann. On nondeterministic derandomization of freivalds’algorithm: Consequences, avenues and algorithmic progress. In , pages 56:1–56:16, 2018.[Lip89] Richard J. Lipton. New directions in testing. In
Distributed Computing AndCryptography, Proceedings of a DIMACS Workshop, Princeton, New Jersey, USA,October 4-6, 1989 , pages 191–202, 1989.[MR16] Pasin Manurangsi and Prasad Raghavendra. A birthday repetition theoremand complexity of approximating dense CSPs.
CoRR , abs/1607.02986, 2016.[O’D04] Ryan O’Donnell. Hardness amplification within np.
J. Comput. Syst. Sci. ,69(1):68–94, 2004. 39Pap94] Christos H. Papadimitriou. On the complexity of the parity argument andother inefficient proofs of existence.
J. Comput. Syst. Sci. , 48(3):498–532,1994.[Raz98] Ran Raz. A parallel repetition theorem.
SIAM J. Comput. , 27(3):763–803,1998.[Rub16] Aviad Rubinstein. Settling the complexity of computing approximate two-player nash equilibria. In
IEEE 57th Annual Symposium on Foundationsof Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, NewBrunswick, New Jersey, USA , pages 258–265, 2016.[Rub18] Aviad Rubinstein. Inapproximability of nash equilibrium.
SIAM J. Comput. ,47(3):917–959, 2018.[STV01] Madhu Sudan, Luca Trevisan, and Salil P. Vadhan. Pseudorandom genera-tors without the XOR lemma.
J. Comput. Syst. Sci. , 62(2):236–266, 2001.[Tre03] Luca Trevisan. List-decoding using the XOR lemma. In , pages 126–135, 2003.[Tre05] Luca Trevisan. On uniform amplification of hardness in NP. In
Proceedingsof the 37th Annual ACM Symposium on Theory of Computing, Baltimore, MD,USA, May 22-24, 2005 , pages 31–38, 2005.[TV07] Luca Trevisan and Salil P. Vadhan. Pseudorandomness and average-casecomplexity via uniform reductions.
Computational Complexity , 16(4):331–364, 2007.[Wil15] Virginia Vassilevska Williams. Hardness of easy problems: Basing hard-ness on popular conjectures such as the strong exponential time hypothesis(invited talk). In
IPEC , pages 17–29, 2015.[Wil16] Virginia Vassilevska Williams. Fine-grained algorithms and complexity (in-vited talk). In
STACS , pages 3:1–3:1, 2016.[Wil18] Virginia Vassilevska Williams. On some fine-grained questions in algo-rithms and complexity. In
Proc. Int. Cong. of Math. , volume 3, pages 3431–3472, 2018.[WW18] Virginia Vassilevska Williams and R. Ryan Williams. Subcubic equivalencesbetween path, matrix, and triangle problems.
J. ACM , 65(5):27:1–27:38,2018.[Yao82] Andrew Chi-Chih Yao. Theory and applications of trapdoor functions (ex-tended abstract). In , pages 80–91, 1982.40
Missing Proofs
An elementary proof of Lemma 3.5 may be found in the lecture notes of Guruswami andO’Donnell [GO05] and we reproduce it below.
Proof of Lemma 3.5.
We first show that the following holds: E x ∼D i ∼ [ k ] [( µ i , x − µ ) ] ≤ k . (4)In order to prove (4), we introduce some notations: • We denote elements of X k by ¯ x , and denote by D k the k -wise product distributionof D . • Fix i ∈ [ k ] , x ∈ X , let D k , − i , x be the distribution over ¯ x ∈ X k obtained by picking x , . . . , x i − , x i + , . . . , x k from D (independently) and setting ¯ x = ( x , . . . , x i − , x , x i + , . . . , x k ) . • Fix i ∈ [ k ] , define F i : X → R and σ i ∈ R as follows: ∀ x ∈ X , F i ( x ) : = E ¯ x ∼D k , − i , x [ f ( ¯ x ) − µ ] and σ i : = E x ∼D [ F i ( x ) ] . • Fix i ∈ [ k ] define f i : X k → R by setting for all ¯ x ∈ X k , f i ( ¯ x ) : = F i ( ¯ x i ) .With these notations setup, we have: E x ∼D i ∼ [ k ] [( µ i , x − µ ) ] = E i ∼ [ k ] (cid:20) E x ∼D h F i ( x ) i(cid:21) = k ∑ i ∈ [ k ] σ i .Therefore, in order to show (4), it suffices to show: ∑ i ∈ [ k ] σ i ≤
1. (5)We observe that from linearity of expectation, the following holds: ∀ i ∈ [ k ] , E x ∼D [ F i ( x )] =
0. (6)Also from the independent choice of coordinates in ¯ x and (6), we have for all i , j ∈ [ k ] such that i = j we have: E ¯ x ∼D k h f i ( ¯ x ) f j ( ¯ x ) i =
0. (7)41inally we define g : X k → R as follows: ∀ ¯ x ∈ X k , g ( ¯ x ) = ∑ i ∈ [ k ] f i ( ¯ x ) .Notice that0 ≤ E ¯ x ∼D k (cid:2) ( f ( ¯ x ) − g ( ¯ x )) (cid:3) = E ¯ x ∼D k (cid:2) f ( ¯ x ) (cid:3) − · E ¯ x ∼D k [ f ( ¯ x ) g ( ¯ x )] + E ¯ x ∼D k (cid:2) g ( ¯ x ) (cid:3) . (8)We bound each of the above terms separately. First, we have that E ¯ x ∼D k (cid:2) f ( ¯ x ) (cid:3) = µ since f is a boolean valued function. Second, we have E ¯ x ∼D k [ f ( ¯ x ) g ( ¯ x )] = E ¯ x ∼D k " f ( ¯ x ) · ∑ i ∈ [ k ] f i ( ¯ x ) = ∑ i ∈ [ k ] E ¯ x ∼D k h f ( ¯ x ) · f i ( ¯ x ) i = ∑ i ∈ [ k ] E x ∼D (cid:20) F i ( x ) · E ¯ x ∼D k , − i , x [ f ( ¯ x )] (cid:21) = ∑ i ∈ [ k ] E x ∼D h F i ( x ) · ( µ + F i ( x )) i (cid:18) because F i ( x ) = E ¯ x ∼D k , − i , x [ f ( ¯ x ) − µ ] (cid:19) = ∑ i ∈ [ k ] µ · E x ∼D h F i ( x ) i + ∑ i ∈ [ k ] E x ∼D (cid:20)(cid:16) F i ( x ) (cid:17) (cid:21) = ∑ i ∈ [ k ] E x ∼D (cid:20)(cid:16) F i ( x ) (cid:17) (cid:21) ( from (6) )= ∑ i ∈ [ k ] σ i .Third, we have E ¯ x ∼D k (cid:2) g ( ¯ x ) (cid:3) = E ¯ x ∼D k ∑ i ∈ [ k ] f i ( ¯ x ) ! = E ¯ x ∼D k ∑ i , j ∈ [ k ] f i ( ¯ x ) · f j ( ¯ x ) = E ¯ x ∼D k ∑ i , j ∈ [ k ] i = j f i ( ¯ x ) · f j ( ¯ x ) + E ¯ x ∼D k " ∑ i ∈ [ k ] (cid:16) f i ( ¯ x ) (cid:17) = E ¯ x ∼D k " ∑ i ∈ [ k ] (cid:16) f i ( ¯ x ) (cid:17) ( from (7) )= ∑ i ∈ [ k ] σ i . 42lugging all together in (8), we get:0 ≤ µ − · ∑ i ∈ [ k ] σ i + ∑ i ∈ [ k ] σ i .Since µ ≤ ∑ i ∈ [ k ] σ i ≤
1, as claimed. Thus we have shown (5) holds andconsequently (4) holds. The proof of the lemma then follows from a simple applicationof Markov inequality:Pr x ∼D i ∼ [ k ] (cid:20) | µ i , x − µ | ≥ k (cid:21) = Pr x ∼D i ∼ [ k ] (cid:20) ( µ i , x − µ ) ≥ k (cid:21) ≤ k · E x ∼D i ∼ [ k ] [( µ i , x − µ ) ] (from Markov inequality) ≤ k k (from (4)) = k1/3