Improving Schroeppel and Shamir's Algorithm for Subset Sum via Orthogonal Vectors
Jesper Nederlof∗   Karol Węgrzycki†

We present an $O^\star(2^{0.5n})$ time and $O^\star(2^{0.249999n})$ space randomized algorithm for solving worst-case Subset Sum instances with $n$ integers. This is the first improvement over the long-standing $O^\star(2^{n/2})$ time and $O^\star(2^{n/4})$ space algorithm due to Schroeppel and Shamir (FOCS 1979). We breach this gap in two steps: (1) We present a space efficient reduction to the Orthogonal Vectors Problem (OV), one of the most central problems in Fine-Grained Complexity. The reduction is established via an intricate combination of the method of Schroeppel and Shamir, and the representation technique introduced by Howgrave-Graham and Joux (EUROCRYPT 2010) for designing Subset Sum algorithms for the average case regime. (2) We provide an algorithm for OV that detects an orthogonal pair among $N$ given vectors in $\{0,1\}^d$ with support size $d/4$ in time $\widetilde{O}\big(N \cdot 2^{d}/\binom{d}{d/4}\big)$. Our algorithm for OV is based on and refines the representative families framework developed by Fomin, Lokshtanov, Panolan and Saurabh (J. ACM 2016). Our reduction uncovers a curious tight relation between Subset Sum and OV, because any improvement of our algorithm for OV would imply an improvement over the runtime of Schroeppel and Shamir, which is also a long-standing open problem.

∗ Utrecht University, The Netherlands, [email protected]. Supported by the project CRACKNP that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 853234). † Saarland University and Max Planck Institute for Informatics, Saarbrücken, Germany, [email protected]. This work is part of the project TIPEA that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 850979). The author was also supported by the Foundation for Polish Science (FNP), by the grants 2016/21/N/ST6/01468 and 2018/28/T/ST6/00084 of the Polish National Science Center, and by the project TOTAL that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 677651).

Introduction
The most natural question in computational complexity is: Can an algorithm be improved, or isthere some fundamental barrier stopping us from doing so? A major theme in contemporary researchhas been to study this question in a fine-grained sense: given an algorithm using T time (and S space) on worst-case instances, can this be improved to T − ε (or S − ε space), for some ε > ?Because of the highly challenging nature of finding such improvements, researchers introducedseveral hypotheses that state that the currently best known algorithms already hit upon a barrier andtherefore cannot be improved in the above sense. Under these hypotheses, many simple algorithmsfor standard problems in P like -SUM, Edit Distance, or Diameter cannot be significantly improved.The fine-grained hardness of the latter two problems is based on hardness of a problem that isparticularly central in the area, called Orthogonal Vectors : Given N vectors in { , } d , detect twoorthogonal vectors. A common hypothesis is that for d = ω (log n ) the problem cannot be solved in O ( N − ε ) time for some constant ε > . See for example the survey [VW18].For NP-complete problems the situation is slightly different: Although similar fine-grained hy-potheses for CNF-SAT and Set Cover have been introduced, they did not prove sufficient yet to ruleout improvements of the currently best algorithms for basic NP-complete problems such as Travel-ing Salesman, Graph Coloring and MAX-3-SAT. See the survey [LMS11] for some hardness resultsin this regime. There may be a good reason for this: While improved polynomial time algorithmscan be naturally used as subroutines for improved exponential time algorithms, the converse is farless natural. Therefore it is quite plausible that finding better exponential time algorithms is mucheasier than finding faster polynomial time algorithms. And indeed, in the last decade improved al-gorithms for basic problems such as Undirected Hamiltonicity [Bjö14] and Graph Coloring [BHK09]were found. This motivates the optimism that for many NP-complete problems the currently bestknown algorithm can still be enhanced.Equipped with this optimism, we study the fine-grained complexity of the following importantclass of NP-complete problems revolving around numbers. Subset Sum, Knapsack and Binary Integer Programming
In the Subset Sum problem, we are given as input a set of integers $\{w_1, \ldots, w_n\}$ and a target $t$. The task is to decide if there exists a subset $S \subseteq \{1, \ldots, n\}$ such that the total sum of integers $w(S) := \sum_{i \in S} w_i$ is equal to $t$. In the 1970's, [HS74] introduced the meet-in-the-middle strategy and solved Subset Sum in $O^\star(2^{n/2})$ time and space. Since then, it has been a notorious open question to improve their result: Question 1:
Can Subset Sum be solved in $O^\star\big(2^{(1/2 - \varepsilon)n}\big)$ time, for $\varepsilon > 0$? A few years later, [SS81] gave an algorithm for Subset Sum using $O^\star(2^{n/2})$ time and only $O^\star(2^{n/4})$ space. In the last section of their paper, they ask the following: Question 2:
Can Subset Sum be solved in $O^\star(2^{n/2})$ time and $O^\star\big(2^{(1/4 - \varepsilon)n}\big)$ space, for $\varepsilon > 0$? Both questions seemed to be out of reach until 2010, when [HJ10] introduced the representation technique and used it to solve random instances of Subset Sum in $O^\star(2^{0.337n})$ time. The main idea behind the representation technique is to artificially expand the search space such that a single solution has an exponential number $r$ of representatives in the new search space. This allows us to subsequently restrict attention to a $1/r$-fraction of the search space, which in some settings can be advantageous. In the context of Subset Sum, this technique has already inspired improved algorithms for large classes of instances [AKKN15, AKKN16], time-space trade-offs [AKKM13, DDKS12] and improved polynomial space algorithms [BGNV18]. But nevertheless, answers to Questions 1 and 2 for worst-case instances still remained elusive. Our main result is a positive answer to the 40-year old open Question 2:
Theorem 1.1.
Every instance of Subset Sum can be solved in $O^\star(2^{n/2})$ time and $O^\star(2^{0.249999n})$ space by a randomized Monte Carlo algorithm with constant success probability. The result implies an analogous space improvement for Knapsack and Binary Integer Programming (see Corollary 1.4). To explain our key ideas and their combination with existing methods, the following problem is instrumental: Weighted Orthogonal Vectors (shorthand notation: WOV$(N, d, h)$) Input:
Families of weighted vectors A , B ⊆ (cid:0) [ d ] d/ (cid:1) × N of Hamming weight d/ , target integer t Task:
Detect ( A, w A ) ∈ A and ( B, w B ) ∈ B such that A and B are disjoint and w A + w B = t The starting point is the O (cid:63) (2 n/ ) time and space algorithm by [HS74]. Their algorithm can beseen as a reduction to an instance of WOV (2 n/ , , . Since d = 0 , this is an instance of 2-SUM ,and the runtime follows by a linear time algorithm for 2-SUM.In contrast, the representation technique by [HJ10] can also be thought of as a reduction frominstances of Subset Sum to WOV, but with the assumption that the Subset Sum instance doesnot have additive structure . In a follow-up work, [AKKN16] loosen the assumption of [HJ10] andshow that their reduction applies whenever there is a small subset of d weights without additivestructure. Their work implies a reduction from every instance of Subset Sum to WOV ( N, d, d/ ,where N = 2 n/ (cid:0) dd/ (cid:1) / d and d/n > is a small (but fixed) positive constant.Note that the two above reductions feature an intriguing trade-off between the size N and the dimension d of the produced instance, and the natural question is how the worst-case complexitiesof solving these instances as quick as possible compare. Our first step towards proving Theorem 1.1is to show that this trade-off is tight , unless Question 1 is answered positively: Key Insight:
There is an algorithm for WOV whose run time dependency in d matches theinstance decrease in d in the reduction from [AKKN16]. In particular, WOV ( N, d, d/ can besolved in (cid:101) O ( N · d / (cid:0) dd/ (cid:1) ) time and (cid:101) O ( N + 2 d ) space (see Theorem 6.1).This insight has two interesting immediate consequences. First, it provides an avenue towardsresolving Question 1, because a positive answer to this question is implied by an improvement of ouralgorithm even for the unweighted version of WOV ( N, d, d/ . To the best of our knowledge, such animprovement is entirely consistent with all the known hypotheses on (low/moderate/sparse) versionof the Orthogonal Vectors problem [CW19, GIKW19]. In fact, to answer Question 1 affirmativelywe only need an improvement for the regime d / (cid:0) dd/ (cid:1) ≤ N ≤ d , while previous hypotheses addressthe regime where d/ log N tends to infinity.Second, a combination of the reduction from [AKKN16] and our algorithm for WOV ( N, d, d/ would give an algorithm for Subset Sum that runs in O (cid:63) (2 n/ ) time and O (cid:63) (2 (1 / − δ ) n ) space, for See Appendix C for definitions of all problems considered in this paper. Specifically, this means that | w (2 [ n ] ) | ≥ (1 − ε ) n for some small ε > , where w (2 [ n ] ) denotes { w ( X ) : X ⊆ [ n ] } . Actually not every instance, but instances where the reduction fails can be solved quickly by other means. δ > . While this is not even close to the memory improvement of [SS81], one mayhope that both methods can be combined to get a better memory usage. Notwithstanding thesignificant hurdles that need to be overcome to make these two methods combine, this is how weget the improvement in Theorem 1.1. We now give a high level proof idea of Theorem 1.1. While our conceptual contribution lies in theaforementioned key insight, our main technical effort lies in showing that indeed the representationtechnique and the algorithm of [SS81] can be combined to get a space efficient reduction from SubsetSum to WOV.Following [HS74], the method from [SS81] can also be seen as a reduction from Subset Sum toan instance of WOV (2 n/ , , , but is an implicit one: The relevant vectors of the instance can beenumerated quickly by decomposing the search space of n/ vectors into a Cartesian product oftwo sets n/ vectors, and generating all vectors in a useful order via priority queues. See Section 3for a further explanation. Thus, to prove Theorem 1.1, we aim to generate the relevant parts ofthe instance I of WOV ( N, d, d/ defined by the representation technique efficiently, using priorityqueues of size at most O (cid:63) (2 . n ) .Unfortunately the vectors from the instance I defined by the representation technique are el-ements of a search space of size (1 / n ; its crux is that there are only (1 / − Ω(1)) n vectorsin the instance because we only have vectors with a fixed inner product with the weight vector ( w , . . . , w n ) . Thus a straightforward decomposition of this space into a Cartesian product willgive priority queues of size (1 / n .To circumvent this issue, we show that we can apply the representation technique again togenerate the vectors of the instance I efficiently using priority queues of size O (cid:63) (2 . n ) . Whilethe representation technique was already used in a multi-level fashion in several earlier works (seee.g. 
[AKKM13, HJ10]), an important ingredient of our algorithm is that we apply the technique indifferent ways at the different levels depending on the structure of the instance. Our route towards Theorem 1.1 as outlined above has the following by-products that may be con-sidered interesting on their own. The first one was already referred to in the ‘key insight’:
An algorithm for Orthogonal Vectors
A key subroutine in this paper is the following algorithm that solves Orthogonal Vectors. We let OV$(N, d, h)$ refer to the problem WOV$(N, d, h)$ restricted to unweighted instances (that is, all involved integers are equal to zero). Theorem 1.2.
There is a Monte-Carlo algorithm solving OV$(N, d, d/4)$ using $\widetilde{O}\big(N \cdot 2^{d}/\binom{d}{d/4}\big)$ time and $\widetilde{O}(N + 2^d)$ space. An easy reduction shows the same runtime and space usage can be obtained for WOV$(N, d, d/4)$. Our algorithm for Orthogonal Vectors uses the general blueprint of an algorithm by [FLPS16] (which in turn builds upon ideas from [Bol65, Mon83]). However, to ensure that the algorithm for Orthogonal Vectors combined with our methods results in an $O^\star(2^{n/2})$ time algorithm for Subset Sum, we need to refine their method and analysis. Some knowledge of the representation technique may be required to understand this in detail; we explain the representation technique in Section 2.
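To make the problem OV$(N, d, h)$ and the $\widetilde{O}(N + 2^d)$ space regime concrete, here is a minimal baseline sketch. It is not the refined representative-families algorithm behind Theorem 1.2; it is the standard "superset counting" approach via a zeta transform over subsets, running in $\widetilde{O}(N + 2^d)$ time and space. The function name and the bitmask encoding are illustrative choices, not taken from the paper.

```python
def has_disjoint_pair(A_masks, B_masks, d):
    """Return True iff some A in A_masks and B in B_masks have disjoint supports.

    Each mask is an integer encoding a subset of [d] as a bitmask."""
    full = (1 << d) - 1
    # cnt[C] = number of masks in B_masks whose support is contained in C
    cnt = [0] * (1 << d)
    for b in B_masks:
        cnt[b] += 1
    # zeta transform over subsets: O(2^d * d) time, O(2^d) space
    for i in range(d):
        bit = 1 << i
        for C in range(1 << d):
            if C & bit:
                cnt[C] += cnt[C ^ bit]
    # A and B are disjoint iff supp(B) is contained in the complement of supp(A)
    return any(cnt[full ^ a] > 0 for a in A_masks)

# Example: {0,1} and {2,3} are disjoint in dimension d = 4.
print(has_disjoint_pair([0b0011], [0b1100, 0b0111], 4))  # True
```

The point of Theorem 1.2 is to beat the per-vector overhead of this kind of baseline in the regime relevant for Subset Sum, while keeping the space at $\widetilde{O}(N + 2^d)$.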
3o facilitate our presentation, we consider a new communication complexity-related parameterof -covers of a matrix that we call the ‘ sparsity ’. We show that a -cover of low sparsity of aspecific Disjointness Matrix implies efficient algorithm for Orthogonal Vectors, and we exhibit a -cover of low sparsity of the disjointness matrix. We also prove that our -cover has nearly optimalsparsity. This means that a new technique has to be developed in order to resolve Question 1.1 via -covers. Additionally, we use several preprocessing techniques to ensure that the space usage ofour algorithm is only (cid:101) O ( N + 2 d ) , which is crucial to Theorem 1.1. Reduction to weighted P While we do not resolve Question 1, our approach provides newavenues by reducing it to typical questions in the study of fine-grained complexity of polynomiallytime problems. Our new reductions enable us to show an interesting connection between SubsetSum and the following graph Problem. In the Exact Node Weighted P problem one is given anundirected graph G = ( V, E ) with vertex weights and the task is to decide whether there exist apath on four vertices with weights summing to (see also Appendix C). Theorem 1.3.
If Exact Node Weighted P on a graph G = ( V, E ) can be solved in O ( | V | . ) time,then Subset Sum can be solved in O (cid:63) (2 (0 . − δ ) n ) randomized time for some δ > . In comparison to the straightforward reduction from Subset Sum to -SUM, our reductioncreates a set of integers with an additional path constraint. Thus a possible attack towards resolvingQuestion 1 is to design a quadratic time algorithm for Exact Node Weighted P (or more particularonly for the instances of the problem generated by our reduction).The naïve algorithm for Exact Node Weighted P works in (cid:101) O ( | V | ) time. To the best of ourknowledge the best algorithm for this problem runs in (cid:101) O ( | V | . ) when ω = 2 [Bri20]. On the lowerbounds side, using the ‘vertex minor’ method from [AL13] it can be shown that the problem ofdetecting triangles in a graph can be reduced to Exact Node Weighted P [Abb20]. This explainsthat obtaining a quadratic time algorithm may be hard (since it is even hard to obtain for detectingtriangles). However, is known to be solvable in (cid:101) O ( | V | ω ) time, where ω is the exponent of currentlythe fastest algorithm for matrix multiplication. Therefore it is still justified to aim for a (cid:101) O ( | V | ω ) time algorithm for Exact Node Weighted P . We leave it as an intriguing open question whetherExact Node Weighted P can be solved in (cid:101) O ( | V | ω ) time. More general problems
Known reductions from [NvLvdZ12] combined with Theorem 1.1 alsoimply the following improved algorithms for generalizations of the Subset Sum problem (see Ap-pendix C for their definitions):
Corollary 1.4.
Any instance of Knapsack on n items can be solved in O (cid:63) (2 n/ ) time and O (cid:63) (2 . n ) space, and any instance of Binary Integer Programming with n variables and d constraints with max-imum absolute integer value m can be solved in O (cid:63) (2 n/ (log( mn ) n ) d ) time and O (cid:63) (2 . n ) space. It was shown in [SS81] that Subset Sum admits a time-space tradeoff, i.e. an algorithm using S space and n / S time for any S ≤ O (cid:63) (2 n/ ) . This tradeoff was improved by [AKKM13] for almostall tradeoff parameters (see also [DDKS12]). We mention in the passing that as direct consequence Briefly described, reduce a problem instance on ( V, E ) to the problem of finding a triangle in an unweightedgraph on | V | edges: One vertex of the triangle represents the two extreme vertices of the path and the sum of theweights of the two first vertices on the path, and the other two vertices of the triangle represent the inner vertex.This instance of unweighted triangle can be solved in (cid:101) O ( | V | . ) with standard methods, assuming ω = 2 .
4f Theorem 1.1, the Subset Sum admits a time-space tradeoff using n / S . / . time and S space, for any S ≤ O (cid:63) (2 . n ) , but the obtained parameters are only better then the previousworks for S chosen closely to its maximum. See Appendix A for a proof.In [AKKN15], the authors considered Subset Sum parametrized by the maximum bin size β and obtained algorithm running in time O (cid:63) (2 . n β ) . Subsequently, [AKKN16] showed that onecan get a faster algorithm for Subset Sum than meet-in-the-middle if β ≤ (0 . − ε ) n or β ≥ . n .Recently, [BGNV18] gave an algorithm for Subset Sum running in O (cid:63) (2 . n ) time and polynomialspace, assuming random access to a random oracle.From the pseudopolynomial algorithms perspective, Subset Sum has also been subject of recentstimulative research [ABHS19, Bri17, CDL +
16, JW19, KX17, KX18, LN10].
Average case complexity and representation technique
In a breakthrough paper, [HJ10] introduced the representation technique and showed an $O^\star(2^{0.337n})$ time and $O^\star(2^{0.256n})$ space algorithm for Subset Sum in the average-case setting. It was improved by [BCJ11], who gave an algorithm running in $O^\star(2^{0.291n})$ time and space. The representation technique has already found several applications in the worst-case setting for other problems (see [MNPW19, Ned16, NPSW21]). Orthogonal Vectors
Naively, the Orthogonal Vectors problem can be solved in $O(N^2 d)$ time. For large $d$, only a slightly faster algorithm, running in $N^{2 - 1/O(\log(d/\log N))}$ time, is known [AWY15, CW16]. The assumption that for $d = \omega(\log N)$ there is no $O(N^{2-\varepsilon})$ time algorithm for any $\varepsilon > 0$ is a central conjecture of fine-grained complexity (see [VW18] for an overview). In this paper, we are mainly interested in linear (in $N$) time algorithms for OV. It was shown in [Wil05] that OV cannot be solved in $O(N^{2-\varepsilon} \cdot 2^{o(d)})$ time for any $\varepsilon > 0$ assuming SETH (see [VW18]). In [BHKK] an algorithm for OV was given that runs in $\widetilde{O}(D)$ time, where $D$ is the total number of vectors whose support is a subset of the support of an input vector. In this paper we heavily build upon previous literature, and in particular the representation technique as developed in [AKKN16, BCJ11]. Therefore, we introduce the reader to this technique in Section 2. At the end of Section 2, we also use the introduced terminology of the representation technique to explain the new steps towards proving Theorem 1.1. The remainder of the paper is devoted to formally supporting all claims made. Necessary preliminaries are provided in Section 3; in Section 4 we present the proof of Theorem 1.1, and the reduction to Exact Node Weighted $P_4$ from Theorem 1.3 is given in Section 5. Section 6 contains the proof of (a generalization of) Theorem 1.2. In Appendices A to C we provide, respectively, various short omitted proofs, an inequality relevant for the runtime of our algorithms, and a list of problem statements. This section is devoted to explaining the representation technique (and its extensions from [AKKN16]) and will serve as a warm up towards the formal proof of Theorem 1.1. Representation Technique with Simplified Assumption
We fix an instance $w_1, \ldots, w_n, t$ of Subset Sum. A perfect mixer is a subset $M \subseteq [n]$ such that for every two distinct subsets $A_1, A_2 \subseteq M$ we have $w(A_1) \neq w(A_2)$. To simplify the explanation in this introductory section, we will make the following assumption about the Subset Sum instance:
Assumption 1. If $w_1, \ldots, w_n, t$ is a YES-instance of Subset Sum, then there is a perfect mixer $M$ with $|M| = \mu n$ for some constant $\mu > 0$, and a set $S$ with $w(S) = t$ such that $|M \cap S| = |M|/2$. A mild variant of Assumption 1 can be made without loss of generality since relatively standard extensions of the method by [SS81] can be used to solve the instance more efficiently if it does not hold. We discuss Assumption 1 in more detail later. We now illustrate the representation technique by outlining a proof of the following statement:
Theorem 2.1.
There is a randomized output-linear time reduction that reduces an instance of Subset Sum satisfying Assumption 1 to an equivalent instance of
WOV$\left(2^{n/2}\binom{|M|}{|M|/4}/2^{|M|},\ |M|,\ |M|/4\right)$. Algorithm:
RepTechnique ( w , . . . , w n , t, M ) Output :
Instance of weighted orthogonal vectors
1 Arbitrarily partition $[n] \setminus M$ into $L$ and $R$ such that $|L| = |R| = (n - |M|)/2$
2 Pick a random prime $p$ of order $2^{|M|/2}$
3 Pick a random $x \in \mathbb{Z}_p$
4 Construct the following sets:
  $\mathcal{L} := \big\{ (A \cap M, w(A)) \;\big|\; A \subseteq L \cup M : |A \cap M| = |M|/4 \text{ and } w(A) \equiv_p x \big\}$
  $\mathcal{R} := \big\{ (A \cap M, w(A)) \;\big|\; A \subseteq R \cup M : |A \cap M| = |M|/4 \text{ and } w(A) \equiv_p t - x \big\}$
5 return the instance $\mathcal{L}, \mathcal{R}, t$ of weighted orthogonal vectors
Algorithm 1:
Pseudocode of Theorem 2.1The reduction from Theorem 2.1 is described in Algorithm 1, and uses the standard notation ≡ p to denote equivalence modulo p . We now describe the intuition of the algorithm. For partitionof [ n ] into L, M, R , it expands the search space by looking for pairs ( A , A ) where both A and A may intersect with M in | M | / elements. This is useful since the assumed solution S is represented by the (cid:0) | M ∩ S || M | / (cid:1) ≈ | M | / partitions ( A , A ) of S that are in the expanded search space. Togetherwith the Assumption 1, this allows us in turn to narrow down the search space by restricting thesearch to look only for pairs ( A , A ) satisfying w ( A ) ≡ p x , and thus w ( A ) ≡ p t − x . Thus, thealgorithm enumerates all candidates for A and A in respectively L and R and the instance ofweighted orthogonal vectors detects a disjoint pair of candidates with weights summing to t .Thus one direction of the correctness of the algorithm follows directly: If the produced instanceof weighted orthogonal vectors is a YES-instance, the union of the two found sets is a solution tothe subset sum instance.Conversely, we claim that if the instance of Subset Sum is a YES-instance and Assumption 1holds, then with good probability the output instance of Weighted Orthogonal Vectors is a YES-instance. Let S be the solution of the subset sum instance, so w ( S ) = t and | S ∩ M | = | M/ | . Notethat W := (cid:12)(cid:12)(cid:8) w ( A ∪ ( L ∩ S )) : A ⊆ M ∩ S (cid:9)(cid:12)(cid:12) = (cid:18) | M | / | M | / (cid:19) , This notion will be generalized to the notion of an ε -mixer in Definition 3.5. (cid:0) | M | / | M | / (cid:1) possibilities for X and w ( A ) is different for each different A by theperfect mixer property of M .By standard properties of hashing modulo random prime numbers (see e.g. Lemma 3.2 for ageneral statement), we have that the expected size of { y mod p : y ∈ W } is also approximately ofcardinality (cid:0) | M | / | M | / (cid:1) . Therefore the probability that x is chosen such that x ≡ p y for some y ∈ W is (cid:0) | M | / | M | / (cid:1) / | M | / ≥ Ω( | M | ) . If we let A ∈ L be the set with w ( A ) = y then since w ( S \ A ) ≡ p t − y , S \ A ∈ R and the pair ( A , S \ A ) is a solution to the weighted orthogonal vectors problem. Ingeneral this happens with probability at least /n .Figure 1: The green and orange regions represent a solution, i.e., a set S such that w ( S ) = t . Thereare (cid:0) | M ∩ S || M ∩ S | / (cid:1) pairs A ∈ L ∪ M and A ∈ R ∪ M , such that A ∪ A = S and A ∩ A = ∅ . Because M is a perfect mixer, the number of possible values that w ( A ) can take is also (cid:0) | M ∩ S || M ∩ S | / (cid:1) .Now we discuss the runtime and output size. At Line 4 we construct L and R . Since thenumber of possibilities of A is | L | (cid:0) | M || M | / (cid:1) and each such A satisfies w ( A ) ≡ p x with probability p (taken over the random choices of x ), we have that the expected sizes of L (and similarly, of R )is n/ (cid:0) | M || M | / (cid:1) / | M | , as claimed. By standard pseudo-polynomial dynamic programming techniques(see e.g. [AKKN16]), Line 4 can be performed in O ( p + |L| + |R| ) time, and thus the claimed runtime follows. Assumption 1 is oversimplifying our actual assumptions, and actually only a weaker assumption isneeded to apply the representation method. We call any set M that satisfies | w (2 M ) | = 2 (1 − ε ) | M | an ε -mixer (see also Definition 3.5). 
Denoting w ( F ) for { w ( X ) : X ∈ F } , the assumption is Assumption 2. If w , . . . , w n , t is a YES-instance of Subset Sum, then there is an ε -mixer M , anda set S with w ( S ) = t such that | ( − ε (cid:48) ) | M | ≤ | M ∩ S | ≤ ( + ε (cid:48) ) | M | , for some small positive ε, ε (cid:48) . To note that the representation technique introduced above still works with these relaxed as-sumptions, we remark that it can be shown that if, M is an ε -mixer, then w ( (cid:0) M ∩ Si (cid:1) ) ≥ (1 − f ( ε,ε (cid:48) )) | M ∩ S | for some ε, ε (cid:48) and unknown i . Thus in the representation technique we can split the solution in sets A , A where | A ∩ M | = i and | A ∩ M | = | S ∩ M | − i and use a prime p of order (1 − f ( ε,ε (cid:48) )) | M ∩ S | for some function f that tends to when ε and ε (cid:48) tend to .The advantage of the relaxed assumptions in Assumption 2 are that the methods from [HS74,SS81] can be extended such that it solves any instance that does not satisfy the assumptionsexponentially better in terms of time and space than in the worst-case. This allows us to make This uses that all numbers are single-exponential in n , but this can be assured with a standard hashing argument. [ n ] = L (cid:93) M L (cid:93) M (cid:93) M R (cid:93) R and decompositionof the solution S = A (cid:93) A (cid:93) A (cid:93) A .these assumptions without loss of generality when aiming for general exponential improvements inthe run time (or space bound).For example, in the approach by [SS81], in some steps of the algorithm we only need to enumeratesubsets of cardinality bounded away from half of the underlying universe; or in some other steps ofthe algorithm we can maintain smaller lists with sums generated by subsets. While these extensionsare not entirely direct, we skip a detailed explanation of them in this introductory section (seeSection 3 for formal statements). Having described the representation technique, we now explain our approach in more detail.
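Before moving to the space-efficient construction, here is a minimal executable sketch of the single-level reduction of Algorithm 1, under the simplified Assumption 1. It is meant only to make the mechanics concrete: the helper names (random_prime, rep_technique) are illustrative, the prime is sampled naively, and we assume $|M|$ is divisible by 4.

```python
import random
from itertools import combinations

def random_prime(lo, hi):
    # Toy rejection sampler; fine for the tiny demo sizes used here.
    def is_prime(x):
        if x < 2:
            return False
        i = 2
        while i * i <= x:
            if x % i == 0:
                return False
            i += 1
        return True
    while True:
        cand = random.randint(lo, hi)
        if is_prime(cand):
            return cand

def rep_technique(w, t, M):
    """Single-level representation-technique reduction (in the spirit of Algorithm 1).

    w: dict mapping index -> weight, M: list of mixer indices (|M| divisible by 4).
    Returns (L_list, R_list, t): a weighted-orthogonal-vectors instance whose elements
    are pairs (A intersected with M, w(A)) for candidate half-solutions A."""
    rest = [i for i in w if i not in M]
    L, R = rest[: len(rest) // 2], rest[len(rest) // 2:]          # arbitrary split of [n] \ M
    p = random_prime(2 ** (len(M) // 2), 2 ** (len(M) // 2 + 1))  # prime of order 2^{|M|/2}
    x = random.randrange(p)

    def build(side, residue):
        out = []
        for m_part in combinations(M, len(M) // 4):               # |A ∩ M| = |M|/4
            m_sum = sum(w[i] for i in m_part)
            for r in range(len(side) + 1):
                for free in combinations(side, r):
                    total = m_sum + sum(w[i] for i in free)
                    if total % p == residue:
                        out.append((frozenset(m_part), total))
        return out

    return build(L, x % p), build(R, (t - x) % p), t
```

A pair $(A, w_A)$ from the first list and $(B, w_B)$ from the second with $A \cap B = \emptyset$ and $w_A + w_B = t$ then reassembles a solution of the original instance, exactly as in the correctness argument above.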
Setting up the representation technique to reduce the space usage.
Now, we present theintuition behind the space reduction of Theorem 1.1. In the previous subsection we constructedan instance L , R of weighted orthogonal vectors of expected size O (cid:63) (2 n/ − Ω( | M | ) ) such that withgood probability there exist A ∈ L and A ∈ R with w ( A ∪ A ) = t and A ∩ A = ∅ (if ananswer to Subset Sum is yes ). We combine this approach with the approach from [SS81] and aimto efficiently enumerate this instance of weighted orthogonal vectors instance. To do so, we applythe representation method two times more, and are able construct sets L , L , R , R with thefollowing properties:(i) With good probability there exist pairwise disjoint A ∈ L , A ∈ L , A ∈ R , A ∈ R , suchthat w ( A ∪ A ∪ A ∪ A ) = t , if the Subset Sum instance is a YES-instance,(ii) E [ |L | ] , E [ |L | ] , E [ |R | ] , E [ |R | ] = O (cid:63) (2 n/ − Ω( | M | ) ) .The sets will have the property that elements in L are formed by pairs from L ×L and elementsin R are formed by pairs from R × R . But in contract to the technique from [SS81], the lists L and R can not be easily decomposed into a direct product of two sets of size (cid:112) |L| and (cid:112) |R| .To overcome this issue, we apply the representation method again to enumerate the elements of L and R quickly. In particular, to construct the sets L , L , R , R , we partition the instanceinto L, M L , M, M R , R . See also Figure 2. Here M is assumed to be an ε -mixer that is used forthe representation technique on ‘first level’: at this level we check whether a pair L × R forms asolution. The set M L is assumed to be an ε L -mixer and the set M R is assumed to be an ε R -mixer,and these sets are used for two applications of the representation technique on the ‘second level’:8t this level we check whether a pair in L × L forms an element of L (and similarly, whether apair in R × R forms an element of R ). If any of the assumptions fail, relatively direct extensionsof the methods from [SS81] can again solve the instance more efficiently by other means, so theseassumptions are without loss of generality. Maintaining the O (cid:63) (2 n/ ) time bound After we construct sets L , L , R , R as claimed inproperty (i) we combine the approach of [SS81] with our Orthogonal Vectors algorithm to obtainthe O (cid:63) (2 n/ ) running time. In particular, we store the elements of L , L , R , R in priority queuesordered by the weight in order to enumerate the elements of L and R in the correct order. Byour applications of the representation technique, we only need to assure disjointness between pairsbetween L × L , R × R and L × R and therefore the time to check disjointness is exactly thesame as in the normal application of the representation technique as described in the beginning ofthis section.Unfortunately, the relaxed assumptions from Assumption 2 cause issues here because we need toconsider unbalanced partitions of M ∩ S , M L ∩ S , M R ∩ S and the constants ε, ε L , ε M give differentprimes. Without additional care, the overhead in the runtime implied by these issues would lead toan undesired time bound of O (cid:63) (2 (0 . ε ) n ) for arbitrarily small constant ε > .To address these complications, we analyse our algorithm in such a way that if | w (2 L ) | or | w (2 R ) | is significantly smaller than | w (2 M ) | we get an improved runtime. Note this can be assumed byswitching the roles of M L , M R , M . 
Additionally, we provide a general runtime for solving WeightedOrthogonal Vectors instances with vectors with general support size, Throughout the paper we use the O (cid:63) notation to hide factors polynomial in the input size and the (cid:101) O notation to hide polylogarithmic factors in the input size; which input this refers to will always beclear from the context. We also use [ n ] to denote the set { , . . . , n } . We use the binomial coefficientnotation for sets, i.e., for a set S the symbol (cid:0) Sk (cid:1) denotes the set of all subsets of the set S of sizeexactly k . For a modulus m ∈ Z ≥ and x, y ∈ Z we write x ≡ m y to indicate that m divides x − y .If X ⊆ [ n ] , we denote w ( X ) := (cid:80) i ∈ X w i , which is extended to set families F ⊆ [ n ] by denoting w ( F ) := { w ( X ) : X ∈ F } . We use A (cid:93) B = C to denote that A, B form a partition of C . Prime numbers and hashing
We use the following folklore theorem on prime numbers:
Lemma 3.1 (Folklore) . For every sufficiently large integer r the following holds. If p is a primebetween r and r selected uniformly at random and x is a nonzero integer, then p divides x withprobability at most (log x ) /r . The following Lemma already appeared in [AKKN16], but since we need slightly different pa-rameters we repeat its proof.
Lemma 3.2 (cf., Proposition 3.5 in [AKKN16]) . Let w , . . . , w n be n integers bounded by O ( n ) .Suppose Q ⊆ [ n ] with | Q | = Θ( n ) . Let W , . . . , W c be integers and let W = (cid:81) ci =1 W i such that W ≤| w (2 Q ) | . For i = 1 , . . . , c , let p i be prime numbers selected uniformly at random from [ W i / , W i ] .Let s be the smallest integer such that (cid:0) | Q | s (cid:1) ≥ | w (2 Q ) | / | Q | . Denoting p := (cid:81) ci =1 p i , we have Pr (cid:104)(cid:12)(cid:12)(cid:12)(cid:110) a mod p : X ⊆ Q, | X | ∈ [ s , | Q | / , w ( X ) = a (cid:111)(cid:12)(cid:12)(cid:12) ≥ Ω (cid:16) pn c (cid:17) (cid:105) ≥ / . roof of Lemma 3.2, [AKKN16]. By the pigeonhole principle, there exists an integer s such that (cid:12)(cid:12)(cid:12) w (cid:16)(cid:0) Qs (cid:1)(cid:17)(cid:12)(cid:12)(cid:12) ≥ | w (2 Q ) | / | Q | . By the assumption (cid:0) | Q | s (cid:1) ≥ | w (2 Q ) | / | Q | we know that s ≥ s . We mayalso assume s ≤ | Q | / because w (cid:16)(cid:0) Qs (cid:1)(cid:17) = w (cid:16)(cid:0) Q | Q |− s (cid:1)(cid:17) by subset complementation.Let F ⊆ (cid:0) Qs (cid:1) be a maximal injective subset, i.e., satisfying |F | = | w ( F ) | = | w (cid:16)(cid:0) Qs (cid:1)(cid:17) | ≥| w (2 Q ) | / | Q | . Let c i = | { Y ∈ F : w ( Y ) ≡ p i } | be the number of sets from F with a sum in the i ’thcongruence class modulo p . Our goal is to lower bound the probability that c i > for a random i ∈ Z p . We can bound the expected (cid:96) norm (e.g., the number of collisions) by E (cid:34)(cid:88) i c i (cid:35) = (cid:88) Y,Z ∈F P [ p divides w ( Y ) − w ( Z )] ≤ |F | + O ( n c |F | /p ) . (1)The last inequality follows by Lemma 3.1 and the assumption that w , . . . , w n ≤ O ( n ) and | Q | = Θ( n ) . Namely, note that if Y (cid:54) = Z , then w ( Y ) (cid:54) = w ( Z ) . Hence Pr[ p divides w ( Y ) − w ( Z )] isat most O ( n c /W ) by applying Lemma 3.1 i times with each p i .By Markov’s inequality, (cid:80) i c i ≤ O ( |F | + n c |F | /p ) with constant probability over the choice of p . We assumed that p ≤ O ( w (2 Q )) , hence |F | ≤ O ( n c |F | /p ) . So (cid:80) i c i ≤ O ( n c |F | /p ) .Conditioned on this, the Cauchy-Schwarz inequality implies that the number of non-zero c i ’s isat least |F | (cid:14) (cid:80) i c i ≥ Ω(min {|F | , p/n c } ) = Ω( p/n c ) as desired. Shroeppel-Shamir’s sumset enumeration
We recall some of the basic building blocks of previous work on Subset Sum. In [SS81] the authors used the following data structure to obtain an $\widetilde{O}(n^2)$ time and $\widetilde{O}(n)$ space algorithm for 4-SUM. Lemma 3.3.
Let
$A, B \subseteq \mathbb{Z}$ be two sets of integers, and let $C := A + B := \{a + b : a \in A, b \in B\}$ be their sumset. Let $c_1, \ldots, c_m$ be the elements of $C$ in increasing order. There is a data structure inc := inc$(A, B)$ that takes $\widetilde{O}(|A| + |B|)$ preprocessing time and supports a query inc.next() that in the $i$'th (for $1 \leq i \leq m$) call outputs $(P_{c_i}, c_i)$, and in the subsequent calls outputs EMPTY. Here $P_{c_i}$ is the set $\{(a, b) : a \in A, b \in B, a + b = c_i\}$. Moreover, the total time needed to execute all $m$ calls to inc.next() is $\widetilde{O}(|A| \cdot |B|)$ and the maximum space usage of the data structure is $\widetilde{O}(|A| + |B|)$. Similarly, there is a data structure dec := dec$(A, B)$ that outputs pairs of elements of $A$ and $B$ in order of their decreasing sum. The data structure crucially relies on priority queues. We included the proof of this Lemma in Appendix A for completeness.
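As a concrete illustration of the inc data structure, here is a minimal priority-queue sketch that streams the pairs of $A \times B$ in nondecreasing order of their sum using only $O(|A| + |B|)$ extra memory. Grouping consecutive pairs with equal sum into the sets $P_{c_i}$, and a dec variant (for instance by negating all inputs), are straightforward wrappers. This is only an illustrative sketch, not the implementation from Appendix A.

```python
import heapq

def inc_sums(A, B):
    """Yield (a + b, (a, b)) for a in A, b in B in nondecreasing order of a + b,
    using O(|A| + |B|) memory beyond the input."""
    A, B = sorted(A), sorted(B)
    if not A or not B:
        return
    # One heap entry per index into A; each entry points at its current partner in B.
    heap = [(A[i] + B[0], i, 0) for i in range(len(A))]
    heapq.heapify(heap)
    while heap:
        s, i, j = heapq.heappop(heap)
        yield s, (A[i], B[j])
        if j + 1 < len(B):
            heapq.heappush(heap, (A[i] + B[j + 1], i, j + 1))

# Example: enumerate sums of A + B in increasing order.
for s, pair in inc_sums([1, 4, 6], [2, 3]):
    print(s, pair)   # 3 (1,2), 4 (1,3), 6 (4,2), 7 (4,3), 8 (6,2), 9 (6,3)
```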
Binomial Coefficients
We will frequently use the binary entropy function $h(p) := -p \log_2(p) - (1-p)\log_2(1-p)$. Its main use is via the following estimate of binomial coefficients:

$\Omega(d^{-1/2}) \cdot 2^{d \cdot h(\alpha)} \leq \binom{d}{\alpha d} \leq 2^{d \cdot h(\alpha)}$.   (2)

We also consider the inverse of the binary entropy. Since $h(\alpha)$ is strictly increasing on $[0, 0.5]$ we can define $h^{-1} : [0,1] \to [0, 0.5]$, with the condition that $h^{-1}(\alpha) = \beta$ iff $h(\beta) = \alpha$. For every $\alpha \in [0, 0.5]$ we have the following inequality on the entropy function:

$1 - 4\alpha^2 \leq h(1/2 - \alpha) \leq 1 - 2\alpha^2/\ln 2$   (3)

We can also use the derivative of the entropy function at $1/4$ to bound its value, i.e., for every $\alpha > 0$:

$h(1/4 + \alpha) \leq h(1/4) + \alpha \log_2 3$   (4)

Moreover, by the concavity of the binary entropy we know that for all $\alpha, x, y \in [0,1]$:

$\alpha h(x) + (1-\alpha) h(y) \leq h(\alpha x + (1-\alpha) y)$   (5)

In particular this means that $h(\sigma\lambda) + h((1-\sigma)\lambda) \leq 2h(\lambda/2)$ for any $0 \leq \sigma \leq 1$. The following standard concentration lemma will be useful to control the intersection of the solution with certain subsets of the weights of the Subset Sum instance:

Lemma 3.4. If $A \subseteq [d]$ with $|A| = \alpha d$, and $B \subseteq [d]$ is uniformly sampled over all subsets with $|B| = \beta d$, then the following holds: $\mathbb{P}[|A \cap B| = \alpha\beta d] \geq \Omega^\star(1)$. Proof.
There are $\binom{d}{|B|}$ possibilities of selecting a random $B$. There are $\binom{|A|}{\alpha\beta d}\binom{d - |A|}{|B| - \alpha\beta d}$ many possibilities of selecting $B$ such that $|A \cap B| = \alpha\beta d$. Hence, for a random $B$, the probability that $|A \cap B| = \alpha\beta d$ is:

$\dfrac{\binom{\alpha d}{\alpha\beta d}\binom{(1-\alpha)d}{\beta(1-\alpha)d}}{\binom{d}{\beta d}} \geq \Omega\!\left(\dfrac{2^{d(\alpha h(\beta) + (1-\alpha)h(\beta))}}{d \cdot 2^{d \cdot h(\beta)}}\right) = \Omega\!\left(\dfrac{1}{d}\right),$

because of (2). Preprocessing Algorithms
We now present several simple procedures that allow us to makeassumptions about the given Subset Sum instance in the proof of Theorem 1.1. Throughout thispaper w , . . . , w n , t denotes an instance of Subset Sum. We can assume that the integers w , . . . , w n , t are positive and w + . . . + w n + t ≤ n (see [AKKN16, Lemma 2.1]). Throughout the paper wewill introduce certain constants close to and assume that n is big enough, so the product of n with these constants is an integer.The following notion that was already discussed in Section 2 indicates how many distinct sumsthe subsets of a given subset of the weights generate: Definition 3.5 ( ε -mixer) . A set M is an ε -mixer if | w (2 M ) | = 2 (1 − ε ) | M | . Lemma 3.6.
Given a set $M$, one can in $O^\star(2^{|M|})$ time and $O^\star(2^{|M|})$ space determine the smallest $\varepsilon$ such that $M$ is an $\varepsilon$-mixer. Proof. Iterate over every possible subset of $M$ and store $w(2^M)$. Afterwards sort $w(2^M)$, determine its size and output $\varepsilon := 1 - \log_2(|w(2^M)|)/|M|$. Lemma 3.7.
For any constants ε > and µ ∈ (0 , / , there is an algorithm that, given a SubsetSet instance w , . . . , w n , t and an ε -mixer M satisfying | M | = µn and ε > ε , solves the instance intime O (cid:63) (2 (1 − ε µ ) n/ ) and O (cid:63) (2 (1 − ε µ ) n/ ) space.Proof of Lemma 3.7. Arbitrarily partition [ n ] \ M = L (cid:93) L (cid:93) R (cid:93) R , such that: | L | = n − µn (cid:18) − ε (cid:19) , | L | = | R | = | R | = (1 − ε µ ) n . | L | > , because µ < / and ε > . Then construct the sets A := w (2 L ) , B := w (2 R ) , C := w (2 R ) . Observe that the space needed to do this is exactly | L | , which is within theboundaries of our algorithm. Next construct D := w (2 L ∪ M ) . Observe that | D | = | w (2 L ∪ M ) | ≤ | w (2 L ) || w (2 M ) | ≤ (1 − ε µ ) n/ . Finally, observe that the Subset Sum instance is equivalent to the 4-SUM instance
A, B, C, D, t ,which we can solve in O (cid:63) ( | A || B | + | C || D | ) time and O (cid:63) ( | A | + | B | + | C | + | D | ) space using Lemma A.1. Lemma 3.8.
Suppose a Subset Sum instance w , . . . , w n , t with promise that there is a solution ofsize λn is given. Then we can find S ⊆ [ n ] with w ( S ) = t in randomized O (cid:63) (2 h ( λ ) n/ + 2 n/ ) timeand O (cid:63) (2 h ( λ ) n/ ) space.Proof of Lemma 3.8. Let S be the solution to Subset Sum such that | S | = λn . Randomly partition [ n ] = A (cid:93) A (cid:93) A (cid:93) A , each of size n/ . By Lemma 3.4 with probability Ω (cid:63) (1) we have that | A i ∩ X | = λn for all i ∈ [4] . Next for all i ∈ [4] we enumerate sets: A i = (cid:26) w ( B ) (cid:12)(cid:12)(cid:12) B ⊆ (cid:18) A i λn/ (cid:19)(cid:27) . We can construct A i in O (cid:63) (2 n/ ) time and O (cid:63) ( |A i | ) space by testing all possible subsets of A i . Fi-nally, we invoke 4-SUM algorithm from Lemma A.1 on instance A , . . . , A , t . It runs in O (cid:63) (cid:16)(cid:0) n/ (cid:96)/ (cid:1) (cid:17) time and O (cid:63) (cid:16)(cid:0) n/ (cid:96)/ (cid:1)(cid:17) space. For correctness, observe that |A i | = O (cid:63) ( (cid:0) n/ λn/ (cid:1) ) and with Ω (cid:63) (1) proba-bility S ∩ A i ∈ A i . This section is devoted to the proof of Theorem 1.1. The main technical effort, done in Subsec-tions 4.1 to 4.2, is to prove the following lemma. The performance of the algorithm will depend on pa-rameters λ ≤ / , and µ , ε L and ε R . It is instructive to think about ε L = ε R = 0 , µ = 1 / , λ = 1 / .Parameter µ is a constant that we will set later to be close to / . Lemma 4.1 (Main Lemma) . Let λ := 0 . , ε := 0 . . Let λ ∈ [ λ , . , ε R ∈ [0 , ε ] , µ ∈ (0 . , . and let M L , M, M R ⊆ [ n ] be disjoint sets such that | M | = | M L | = | M R | = µn . Let ≤ ε ≤ ε L ≤ ε R be such that M L is an ε L -mixer, M is an ε -mixer and M R is an ε R -mixer. Let S ⊆ [ n ] be such that w ( S ) = t and | M L ∩ S | = | M ∩ S | = | M R ∩ S | = λµn .There is a Monte Carlo algorithm for Subset Sum that, given the instance w , . . . , w n , t , the sets M L , M, M R , and λ, ε L , ε R , runs in time O (cid:63) (2 n/ ) and space O (cid:63) (cid:16) (1 / − µ (3 / λ − h (1 / n +0 . µn + 2 µn (2 h (1 / − λ )+0 . µn + 2 µn (cid:17) . The constant . depends on the choice of ε and λ . This constant goes to when ε → and λ → / .First, we prove the main result of the paper assuming Lemma 4.1 by using the elementarypreprocessing algorithms provided in Section 3. 12 roof Theorem 1.1 assuming Lemma 4.1. Set µ := 0 . . With polynomial overhead we can guess | S | = λn . If λ < λ then we use Theorem 3.8 to solve Subset Sum in O (cid:63) (2 h ( λ ) n/ ) ≤ O (cid:63) (2 . n ) space and O (cid:63) (2 n/ ) time. Hence, we can assume that λ ≥ λ . We can also assume that λ ≤ / bylooking for [ n ] \ S instead of S by changing t to w ([ n ]) − t .Next, randomly select pairwise disjoint sets M, M L , M R ⊆ (cid:0) [ n ] µn (cid:1) . For each of them we useLemma 3.6 to determine the ε, ε L , ε R such that M is an ε -mixer, M L is an ε L -mixer and M R isan ε R -mixer. If at least one of ε, ε L , ε R is at least ε , use Theorem 3.7 to solve the instance in O (cid:63) (2 (1 − µε ) n/ ) ≤ O (cid:63) (2 . ) space and O (cid:63) (2 n/ ) time. Hence we can assume ε, ε L , ε R < ε .Finally, Lemma 4.1 applies and it solves the instance in time O (cid:63) (2 n/ ) . For our choice of theparameters we get that the space is at most O (cid:63) (2 . n ) .In total, the space complexity of our algorithm is bounded by O (cid:63) (2 . n ) as claimed.The rest of this section is devoted to the proof of Lemma 4.1. 
This lemma is an extensionof Theorem 2.1 combined with a fast OV algorithm. As mentioned in Subsection 2.3, we applythe representation technique on 2 levels and therefore we need sets M L , M, M R . Moreover, theassumption < ε ≤ ε L , ε R is to avoid the aforementioned undesired O (cid:63) (2 (0 . O ( ε )) n ) running time. Algorithm:
SubsetSum ( w , . . . , w n , t, M L , M, M R , λ, ε L , ε R ) Output :
Set S with w ( S ) = t and | M L ∩ S | , | M ∩ S | , | M R ∩ S | = λ | M | , if it exists Arbitrarily partition [ n ] \ ( M L ∪ M ∪ M R ) = L (cid:93) R such that | L | , | R | satisfy (7) Pick random primes p R ∈ Θ(2 ( λ − ε R ) | M | ) , p (cid:48) ∈ Θ(2 ( ε R − ε L ) | M | ) ; set p L = p (cid:48) · p R Pick random x L ∈ Z p L , x ∈ Z p L , x R ∈ Z p R foreach h − (1 − ε L /λ − log nn ) ≤ σ, σ L , ≤ / , and h − (1 − ε R /λ − log nn ) ≤ σ R ≤ / do Construct L , L , R , R as defined in Equations (8) to (11) if WeightedOV ( L , L , R , R , M, t ) then return true return false Algorithm 2:
Pseudocode of the algorithm for Lemma 4.1The algorithm of Lemma 4.1 is presented with pseudocode in Algorithm 2. The
WeightedOV subroutine decides whether there exists ( A , . . . , A ) ∈ L × L × R × R with w ( A ∪ . . . ∪ A ) = t and A i ∩ A j = ∅ for all i (cid:54) = j . This subroutine will be provided and analysed in Section 4.2.On a high level, the Algorithm 2 has the same structure as Algorithm 1, with one major differ-ence: The sets L and R are generated implicitly. To generate these lists we combine the techniquefrom [SS81] as summarized in Lemma 3.3 with two more applications of the representation methodsused to generate L and R . The algorithm iterates over every possible choice of parameters σ, σ L , σ R ∈ [0 , , such that h ( σ ) , h ( σ L ) , h ( σ R ) ≥ − ε R /λ − log nn in Line 4. The precision of σ, σ R , σ L is polynomial, sincethese parameter describe the size of possible subsets of M, M R , M L . The purpose of one iterationof this loop is summarized in the following lemma, which is also illustrated in Figure 3 : Note that, formally speaking, the list L ( a ) from Algorithm 1 is not the same as the set of elements of list L ofAlgorithm 2 with sum a , but since the two are almost equals we kept the same notation. S = S (cid:93) . . . (cid:93) S as formalized in Lemma 4.2. Lemma 4.2.
Consider an iteration of the loop at Line 4 of Algorithm 2 with parameters σ, σ L , σ R .Suppose there exists a set S ∈ (cid:0) [ n ] λn (cid:1) with w ( S ) = t that has a partition S = S (cid:93) S (cid:93) · · · (cid:93) S satisfying the following properties: S ⊆ L, S ∈ (cid:18) M L σ L λ | M | (cid:19) , S ∈ (cid:18) M L (1 − σ L ) λ | M | (cid:19) , S ∈ (cid:18) Mσλ | M | (cid:19) ,S ⊆ R, S ∈ (cid:18) M R σ R λ | M | (cid:19) , S ∈ (cid:18) M R (1 − σ R ) λ | M | (cid:19) , S ∈ (cid:18) M (1 − σ ) λ | M | (cid:19) , (6) w ( S ∪ S ∪ S ∪ S ) ≡ p L x, w ( S ∪ S ∪ S ∪ S ) ≡ p R t − x,w ( S ∪ S ) ≡ p L x L , w ( S ∪ S ) ≡ p L x − x L ,w ( S ∪ S ) ≡ p R x R , w ( S ∪ S ) ≡ p R t − x − x R . Then during this iteration the algorithm returns true.
The (relatively straightforward) proof of Lemma 4.2 will be given in Subsection 4.3 where weprove the correctness of the algorithm. To obtain a relatively fast algorithm also in the case that ε is bounded away from or λ is bounded away from / , we need to carefully define the lists L , L , R , R in order to not slow down the run time to beyond O (cid:63) (2 n/ ) . To do so, we use thefollowing balance parameter β = β ( λ, σ ) := h ( σλ ) − h ((1 − σ ) λ ) . Intuitively, β expresses the difference of the expected list sizes {L ( a ) } a and {R ( b ) } b that are definedon Line 8 and Line 9 when we would have set | L | = | R | . Observe that if ε = 0 and λ = 1 / , then σ L , σ, σ R = 1 / and indeed β = 0 .All elements of [ n ] not in M L ∪ M ∪ M R are arbitrarily partitioned into L and R on Line 1 where | L | and | R | are chosen to compensate for imbalance caused by ε, σ, λ as follows: | L | = (1 − µ − βµ ) n , | R | = (1 − µ + βµ ) n . (7)Observe that β ≤ , and since µ ≤ / we have that | L | , | R | > .Now we define the four lists that play a similar role in our algorithm as the four lists in theoriginal algorithm of [SS81]. L := (cid:26) S ∪ S (cid:12)(cid:12)(cid:12)(cid:12) w ( S ∪ S ) ≡ p L x L , S ⊆ L, S ∈ (cid:18) M L σ L λ | M | (cid:19)(cid:27) , (8) R := (cid:26) S ∪ S (cid:12)(cid:12)(cid:12)(cid:12) w ( S ∪ S ) ≡ p R t − x − x R , S ⊆ R, S ∈ (cid:18) M R σ R λ | M | (cid:19)(cid:27) , (9)14 := (cid:26) S ∪ S (cid:12)(cid:12)(cid:12)(cid:12) w ( S ∪ S ) ≡ p L x − x L , S ∈ (cid:18) M L σλ | M | (cid:19) , S ∈ (cid:18) M (1 − σ L ) λ | M | (cid:19)(cid:27) , (10) R := (cid:26) S ∪ S (cid:12)(cid:12)(cid:12)(cid:12) w ( S ∪ S ) ≡ p R x R , S ∈ (cid:18) M (1 − σ ) λ | M | (cid:19) , S ∈ (cid:18) M R (1 − σ R ) | M | (cid:19)(cid:27) . (11)Using a straightforward algorithm, we can construct each list using (cid:101) O ( |L | + |L | + |R | + |R | +2 µn ) time and space (see Lemma A.3 in Appendix A.2) Algorithm:
WeightedOV ( L , L , R , R , M, t ) Output : If ∃ disjoint ( A , . . . , A ) ∈ L × L × R × R with w ( A ∪ . . . ∪ A ) = t Initialize inc = inc ( w ( L ) , w ( L )) // Lemma 3.3 Initialize dec = dec ( w ( R ) , w ( R )) // Lemma 3.3 Let ( P rb , b ) = dec . next () foreach ( P la , a ) = inc . next () do // Integers a are increasing while a + b > t do Let ( P rb , b ) = dec . next () // Integers b are decreasing if a + b = t then Construct L ( a ) := (cid:110) Y ∩ M (cid:12)(cid:12)(cid:12) ∃ X ∈ L , Y ∈ L , X ∩ Y = ∅ , ( w ( X ) , w ( Y )) ∈ P la (cid:111) Construct R ( b ) := (cid:110) Y ∩ M (cid:12)(cid:12)(cid:12) ∃ X ∈ R , Y ∈ R , X ∩ Y = ∅ , ( w ( X ) , w ( Y )) ∈ P lb (cid:111) if OV ( L ( a ) , R ( b )) (cid:54) = ∅ then return true // Theorem 6.1 return false Algorithm 3:
Weighted Orthogonal Vectors algorithmWe describe the
WeightedOV subroutine with pseudo-code in Algorithm 3). The algorithm isheavily based on the data structures from [SS81] as described in Lemma 3.3. First we initialize thequeue inc for enumerating w ( L ) + w ( L ) in increasing order and the queue dec for enumerating w ( R ) + w ( R ) in decreasing order. With these queues, we enumerate all groups L ( a ) ⊆ M withthe property that if S ∈ L ( a ) then there exist X ∈ L and Y ⊆ L with Y ∩ M = S , X ∩ Y = ∅ and w ( X ) + w ( Y ) = a . Similarly, we enumerate all groups R ( a ) ⊆ M with the property that if S ∈ R ( b ) then there exist X ∈ R and Y ⊆ R with Y ∩ M = S , X ∩ Y = ∅ and w ( X )+ w ( Y ) = b .At the end we execute a subroutine OV that solves the unweighted orthogonal vectors problem thatwill be described in Theorem 6.1.We now analyse the correctness and space usage of this algorithm. The time analysis will beintertwined with the time analysis of Algorithm 2 and is therefore postponed to Subsection 4.5. Lemma 4.3.
Algorithm
WeightedOV is a correct Monte-Carlo algorithm for the Weighted Orthog-onal Vectors Problem.Proof.
If the algorithm outputs true at Line , there exist A ∈ L , A ∈ L , A ∈ R , A ∈ R such that w ( A ) + w ( A ) + w ( A ) + w ( A ) = t .First, note that by the construction of sets L , L , R , R it has to be that A ∩ A ⊆ M . Sincethe OV algorithm checks for disjointness on M we have that A ∩ A ∩ M = ∅ , hence A ∩ A = ∅ .Also, A ∩ A = ∅ because ( X, Y ) ∈ L ( a ) means X ∩ Y = ∅ . Similarly A ∩ A = ∅ because15 X, Y ) ∈ R ( b ) means that X ∩ Y = ∅ . By the construction of the lists L , L , R , R the sets A , . . . , A are thus mutually disjoint and indeed the instance of Weighted Orthogonal Vectors is aYES-instance.For the other direction, assume the desired A , . . . , A quadruple exists. Let t L := w ( A ∪ A ) .Then t L ∈ w ( L )+ w ( L ) and t − t L = w ( A ∪ A ) ∈ w ( R )+ w ( R ) . By Lemma 3.3 inc enumerates w ( L )+ w ( L ) , and dec enumerates w ( R )+ w ( R ) in decreasing order. Therefore, the loop startingat Line 4 is a basic linear search routine that sets a to t L and b to t − t L in some iteration: If a isset to t L before b is set to t − t L , then b is in this iteration larger than t − t L and it will be decreasedin the next iterations until it is set to t − t L . Similarly, if b is set to t − t L before a is set to t L , inthis iteration a is smaller than t L and it will be increased in the next iterations until it is set to t L .In the iteration with a = t L and b = t − t L we have that P la contains the pair ( w ( A ) , w ( A )) and P rb contains the pair ( w ( A ) , w ( A )) . Therefore L ( a ) contains A ∩ M = S and L ( b ) contains A ∩ M = S , and since S and S these are disjoint, a solution will be detected by the OV subroutinewith at least constant probability on Line 10. Lemma 4.4.
Algorithm
WeightedOV uses at most O (cid:63) ( |L | + |L | + |R | + |R | + 2 | M | ) space.Proof. The datastructures inc and dec use at most (cid:101) O ( |L | + |L | + |R | + |R | ) space by Lemma 3.3,and the sets L ( a ) and R ( b ) are of cardinality at most | M | . The statement follows since, as we willshow in Theorem 6.1, the subroutine OV ( A , B ) uses at most (cid:101) O ( |A| + |B| + 2 | M | ) space. We now focus on the correctness of the entire algorithm. First notice that if the algorithm finds asolution on Line 6, it is always correct since it found pairwise disjoint sets A , A , A , A satisfying w ( A ∪ A ∪ A ∪ A ) = t . Thus S := A ∪ A ∪ A ∪ A is a valid solution. The proof of the reverseimplication is less easy and its proof is therefore split in two parts with the help of Lemma 4.2.Note that because the partition [ n ] = L (cid:93) M L (cid:93) M (cid:93) M R (cid:93) R is selected at random, the solutionis well-balanced in sets L, M L , M, M R , R . The following is a direct consequence of Lemma 3.4: Observation 4.5 (Balanced split with good probability) . Let S be the solution to Subset Sumwith | S | = λn . Then, with Ω (cid:63) (1) probability, the following holds: | S ∩ M L | = λ | M L | , | S ∩ M | = λ | M | , | S ∩ M R | = λ | M R | . Now we show that if the above event was successful, the conditions of Lemma 4.2 apply withgood probability:
Lemma 4.6.
Suppose there exists a solution S ⊆ [ n ] be such that w ( S ) = t and | M L ∩ S | = | M ∩ S | = | M R ∩ S | = λµn . Then with probability Ω (cid:63) (1) , there exists a partition S = S (cid:93) · · · (cid:93) S satisfying all conditions in (6) .Proof. We select S = L ∩ S , S = R ∩ S , and a, b be such that let a ≡ p L w ( S ) and b ≡ p R w ( S ) .Next we prove that, because the subsets of M generate many distinct sums, the same holds for thesolution intersected with M : a good mixer. Claim 4.7.
Set M ∩ S is an ε (cid:48) -mixer for some ε (cid:48) ≤ ε/λ . Similarly, M L ∩ S is an ε (cid:48) L -mixer for ε (cid:48) L ≤ ε L /λ , and M R ∩ S is an ε (cid:48) R -mixer for some ε (cid:48) R ≤ ε R /λ .Proof of Claim 4.7. Let us focus on M ∩ S (the result for M L and M R is analogous). Because M isan ε -mixer, we know that (1 − ε L ) | M | ≤ | w (2 M ) | ≤ | w (2 M ∩ S ) || w (2 M \ S ) | . Since | w (2 M \ S ) | ≤ (1 − λ ) | M | we have that | w (2 M ∩ S ) | ≥ ( λ − ε ) | M | = 2 (1 − ε/λ ) | M ∩ S | .16ow we know that Q = M L ∩ S is a good mixer. We can use Lemma 3.2 for Q = M L ∩ S and p = p L · p (cid:48) , since | w (2 | M L ∩ S | ) | ≥ (1 − ε L /λ ) | M L ∩ S | = 2 ( λ − ε L ) | M L | . Because x L was chosenrandomly, Lemma 3.2 guarantees that with Ω ∗ (1) probability, there exists S ⊆ M L ∩ S , such that w ( S ) ≡ p L x L − a . Moreover Lemma 3.2 guarantees that | S | ∈ [ s , λµn/ , where s is the smallestinteger such that (cid:0) Qs (cid:1) ≥ w (2 Q ) / | Q | . If we take the logarithm of both sides this is equivalent to λµn · h (cid:18) s | Q | (cid:19) ≥ log (cid:16)(cid:12)(cid:12)(cid:12) w (2 ( M L ∩ S ) ) (cid:12)(cid:12)(cid:12)(cid:17) − log nn ≥ (1 − ε L /λ ) λµn − log nn . Because we have checked all σ L that satisfy h ( σ L ) ≥ (1 − ε L /λ ) − log nn the algorithm willeventually guess the correct s (same reasoning holds for σ and σ R ). We select S = ( M L ∩ S ) \ S with | S | = (1 − σ L ) µn .In the similar manner we can prove that with Ω (cid:63) (1) probability there exists S ⊆ M R ∩ S ,such that w ( S ) ≡ p R x R − b with | S | = σ R µn and h ( σ R ) ≥ − ε R /λ − log nn (we need to applyLemma 3.2 with Q = M R ∩ S and prime p R ). Moreover, this probability only depends on x R which isindependent of all other random variables and events. If this happens, we select S = ( M R ∩ S ) \ S with | S | = (1 − σ R ) µn .Conditioned on the existence of S , S , S , S , S , S , now we prove there exist S and S with Ω (cid:63) (1) probability. Let c = w ( S ∪ S ∪ S ) and d = w ( S ∪ S ∪ S ) . We again use Lemma 3.2,but this time with Q = M ∩ S and p = p R · p (cid:48) . It assures that with high probability there exist S ⊆ M ∩ S , with w ( S ) ≡ p L x − c and | S | = σµn with h ( σ ) ≥ − ε L /λ − log nn . And indeed,again this probability only depends on x R which is independent of all other random variables andevents. If this event happens, we select S = ( M ∩ S ) \ S .Now we use the fact that p R divides p L : If x ≡ p L a then x ≡ p R a because ( x − a ) = k · p (cid:48) · p R for some k ∈ Z . Hence w ( S ) + d ≡ p R w ( S ) − x , which means that w ( S ∪ S ∪ S ∪ S ) ≡ p R t − x .Moreover it holds that | S | = (1 − σ ) µn , thus S also satisfies the desired conditions.To conclude observe that the constructed sets S , . . . , S are disjoint.Finally, we prove the Lemma 4.2 that the existence of the tuple ( S , . . . , S ) implies that asolution is detected: Proof of Lemma 4.2.
By the construction of L , L , R , R and the assumed properties of thelemma, we have that A := S ∪ S ∈ L , A := S ∪ S ∈ L , A := S ∪ S ∈ R , and A := S ∪ S ∈ R . Since the sets S , . . . , S are pairwise disjoint and satisfy (cid:80) i =1 w ( S i ) = t , thesets A , . . . , A certify that WeightedOV ( L , L , R , R , M, t ) outputs true.By combining Lemma 4.6 and Lemma 4.2, the correctness of Algorithm 2 directly follows. The bulk of the analysis of the space usage consists of computing the expected sizes of the lists L , L , R , R . This requires us to look in to our setting of the parameters. Useful bounds on parameters
Recall, that we defined the following constants λ := 0 . and ε := 0 . . Then, we assumed that ε, ε L , ε R ≤ ε and λ ∈ [ λ , . . Moreover, we have chosen σ, σ L , σ R , such that: . < − ε /λ − log nn ≤ h ( σ ) , h ( σ L ) , h ( σ R ) ε and λ and large enough n ): σ, σ L , σ R ∈ [0 . , . . (12)because h (0 . h (0 . ≈ . . Next, observe that h ( σλ ) , h ((1 − σ ) λ ) ≤ h (1 /
4) + 0 . . (13)because the entropy function is increasing in [0 , . and h (0 . · . − h (1 / < . . For thenext inequality, recall that β ( σ, λ ) = h ( σλ ) − h ((1 − σ ) λ ) . − . ≤ β ( σ, λ ) ≤ . (14)because | β | < h (0 . · . − h (0 . · λ ) < . . Bounds on the list sizesClaim 4.8. E [ |L | ] ≤ O (cid:63) (cid:0) (1 / − µ (3 / λ − h (1 / − . n (cid:1) .Proof. Let W L be the number of possible different elements from L . It is W L := 2 | L | (cid:18) µnλσ L µn (cid:19) The expected size of |L | over the random choices of x L is E [ |L | ] ≤ W L p L . If we plug in the definition of | L | , we have: (log ( E [ |L | ]) /n ) ≤ / − µ (3 / λ − h ( λσ L )) + µ ( ε L − β/ By Inequality 13 we have that h ( σ L λ ) ≤ h (1 /
4) + 0 . . By Inequality 14 we have that | β | ≤ . and ε L < . . Hence: (log ( E [ |L | ]) /n ) ≤ / − µ (3 / λ − h (1 / . · µ. By symmetry the same bound holds for E [ |R | ] . Claim 4.9. E [ |R | ] ≤ O (cid:63) (cid:0) (1 / − µ (3 / λ − h (1 / − . n (cid:1) . Next we bound |L | and |R | : Claim 4.10. E [ |L | ] ≤ O (cid:63) (2 µn (2 h (1 / − λ )+0 . µn ) . The only difference being that β shows up positively rather than negatively, but this does not matter since webounds its absolute value. roof. Let W L be the number of possibilities of selecting S . It is W L := (cid:18) µnσλµn (cid:19)(cid:18) µn (1 − σ L ) λµn (cid:19) The expected size of |L | over the random choices of x L and p L is E [ |L | ] ≤ W L p L . Hence, (log ( E [ |L | ])) /n ≤ µ ( h ( λσ ) + h ( λ (1 − σ L )) − λ + ε L ) We use Inequality 13 and have h ((1 − σ ) λ ) , h ( σλ ) ≤ h (1 /
4) + 0 . . Hence we can roughly bound: (log ( E [ |L | ])) /n ≤ µ (2 h (1 / − λ ) + 0 . · µ By symmetry, the same bound holds for |R | : Claim 4.11. E [ |R | ] ≤ O (cid:63) (2 µn (2 h (1 / − λ )+0 . µn ) . As mentioned in Subsection 4.1, the subroutine
WeightedOV uses O (cid:63) ( |L | + |L | + |R | + |R | +2 | M | ) space. By the above claims, we see that this is at most O (cid:63) (cid:16) (1 / − µ (3 / λ − h (1 / n +0 . µn + 2 µn (2 h (1 / − λ )+0 . µn + 2 µn (cid:17) , as promised. Remark 4.12.
The constant . actually depends on our choice of ε and λ . When ε → and λ → / it goes to . With more complicated inequalities and a tighter choice of parameters we wereable to get O (cid:63) (2 . n ) space usage. We decided to skip the details for a simplicity of presentation. Now, we prove that the runtime of Algorithm 2 is O (cid:63) (2 n/ ) . By Lemma 3.3, the total runtime ofall queries to inc . next () is O (cid:63) ( |L ||L | ) , and the total runtime of all the queries to dec . next () is O (cid:63) ( |R ||R | ) . This is upper bounded by O (cid:63) (2 n/ ) by the analysis of Subsection 4.4.The main bottleneck of the algorithm comes from all the calls to OV subroutine at Line 10 ofAlgorithm 3. To facilitate the analysis, we define sets A , B that represent the total input to the OV subroutine: For every a ∈ N , such that a ≡ p L x and each X ∈ L ( a ) , add the pair ( X ∩ M, a ) to A (without repetitions). Similarly, for each Y ∈ R ( t − a ) , add the pair ( Y ∩ M, t − a ) to B . Hence thetotal input for OV generated by is: A := (cid:26) ( X, a ) : X ∈ (cid:18) Mσλµn (cid:19) and a − w ( X ) ∈ w (2 L ∪ M L ) and a ≡ p L x (cid:27) , B := (cid:26) ( Y, b ) : Y ∈ (cid:18) M (1 − σ ) λµn (cid:19) and b − w ( Y ) ∈ w (2 R ∪ M R ) and b ≡ p R t − x (cid:27) . Now, let us calculate the expected size of A . The number of possibilities of selecting possibleelements in A is the number of possibilities of selecting X from M and a from w (2 L ∪ M L ) . Since theprobability that a ≡ p L x is /p L , we obtain E [ |A| ] ≤ (cid:18) Mσλµn (cid:19) | w (2 L ∪ M L ) | /p L . b ≡ p R t − x is /p R : x is chosen uniformly at random from Z p L , butsince p L is a multiple of p R , x mod p R will also be uniformly distributed in Z p R . E [ |B| ] ≤ (cid:18) M (1 − σ ) λµn (cid:19) | w (2 R ∪ M R ) | /p R . Recall that M L is an ε L -mixer, hence | w (2 L ∪ M L ) | ≤ | w (2 | L | ) | (1 − ε L ) µn , and similarly M R is an ε R mixer. Hence: log ( E [ |A| ]) ≤ | L | + (1 − ε L ) µn + h ( σλ ) µn − ( λ − ε L ) µn = (cid:18) − µ − βµ µ − λµ + h ( σλ ) µ (cid:19) n, = (cid:18) − µ (cid:18)
12 + λ + β/ − h ( σλ ) (cid:19)(cid:19) n, and similarly: log ( E [ |B| ]) ≤ | R | + (1 − ε R ) µn + h ((1 − σ ) λ ) µn − ( λ − ε R ) µn = (cid:18) − µ + βµ µ − λµ + h ((1 − σ ) λ ) µ (cid:19) n = (cid:18) − µ (cid:18)
12 + λ − β/ − h ((1 − σ ) λ ) (cid:19)(cid:19) n. Now it becomes clear that we have chosen the balancing parameter β in the sizes | L | , | R | to matchthe sizes of A , B : Observe that β/ − h ( σλ ) = − h ( σλ ) + h ((1 − σ ) λ )2 = − β/ − h ((1 − σ ) λ ) , and thus we obtain that log ( E [ |A| ]) , log ( E [ |B| ]) ≤ (cid:18) − µ (cid:18)
12 + λ − h ( σλ ) + h ((1 − σ ) λ )2 (cid:19)(cid:19) n. By the convexity of binary entropy function (see Inequality 5), we know that h ( σλ ) + h ((1 − σ ) λ ) ≤ h ( λ/ . Hence: E [ |A| ] , E [ |B| ] ≤ O (cid:63) (2 n/ − µn (1 / λ − h ( λ/ ) . (15)The OV subroutine (see Theorem 6.1) takes A and B as an input with dimension d = µn . Notethat the condition λ ∈ [0 . , . in Theorem 6.1 is satisfied by the assumption in the Lemma 4.1 and σ ∈ [0 . , . is satisfied because for our choice of parameters σ ∈ [0 . , . (see Inequality 12).Hence the total run time is: O (cid:63) (cid:16) ( |A| + |B| ) 2 µn (1 / λ − h ( λ/ (cid:17) . Thus the algorithm runs in O (cid:63) (2 n/ ) time by Equality (15). Remark 4.13.
Observe that the inequality in Lemma B.1 is tight in the regime that is the worst case for our algorithm, namely when $\sigma = 1/2$ and $\lambda$ is at the top of its allowed range. In particular, any $O^\star(2^{-\delta d})$ factor improvement to our OV algorithm in this regime for some $\delta > 0$ would give an $O^\star(2^{(1/2-\delta')n})$ time algorithm for Subset Sum for some $\delta' > 0$.

5 Reducing from Subset Sum to Exact Node Weighted $P_4$

In this section we discuss an interesting connection between graph problems and Subset Sum. Recall that in the Exact Node Weighted $P_4$ problem we are given a node-weighted graph $G = (V,E)$, and want to find $4$ vertices that form a path whose total weight is equal to $0$. We show that a fast algorithm for this problem would resolve Open Question 1:

Theorem 1.3.
If Exact Node Weighted P on a graph G = ( V, E ) can be solved in O ( | V | . ) time,then Subset Sum can be solved in O (cid:63) (2 (0 . − δ ) n ) randomized time for some δ > .Proof. We choose some constants ε > and λ < / . By Theorem 3.7 and Theorem 3.8, we cansolve Subset Sum in O (cid:63) (2 (0 . − δ (cid:48) ) n ) time for some δ (cid:48) ( ε , δ ) . Hence, from now on we assume that ε < ε and λ > λ .We use the construction from Lemma 4.1. This Lemma, gives us the algorithm that constructs families of sets: L , L , R , R ⊆ [ n ] , with the following properties (see Lemma 4.2 and Lemma 4.1):• If an answer to Subset Sum is positive, then with Ω (cid:63) (1) probability, there exist S ∈ L , S ∈L , S ∈ R , S ∈ R , that are pairwise disjoint and w ( S ∪ S ∪ S ∪ S ) = t (otherwise thereis no such quadruple).• For every S ∈ L , S ∈ L , S ∈ R , S ∈ R we have that S ∩ S = ∅ , S ∩ S = ∅ and S ∩ S = ∅ .• The expected size of these lists is bounded by O (cid:63) (2 (1 / − µ (3 / λ − h (1 / n + µρn +2 µn (2 h (1 / − λ )+ µρn ) ,for some ρ = ρ ( λ, ε ) that goes to when ε → and λ → / (see Remark 4.12).Define constant κ ( ε , λ , µ ) that goes to when ε → and λ → / such that the expected sizeof the lists is: max { E [ |L | ] , E [ |L | ] , E [ |R | ] , E [ |R | ] } ≤ O (cid:63) (2 (1 / − µ (2 − h (1 / n + κn + 2 µn (2 h (1 / − / κn ) . Next we select µ := 1 / (3 + 2 h (1 / ≈ . to minimize the expected size of L , L , R , R . Wehave that: max { E [ |L | ] , E [ |L | ] , E [ |R | ] , E [ |R | ] } ≤ O (cid:63) (2 . n + κn ) . Now, we proceed with the reduction to Exact Node Weighted P . First, we construct a graph. Let M := 100 · w ([ n ]) be sufficiently large integer. For every set A ∈ L create a vertex v A of weight w ( v A ) = M + w ( A ) , for every set B ∈ L create a vertex v B of weight M + w ( B ) , for every set C ∈ R create a vertex v C of weight M + w ( C ) . Finally, for every set D ∈ R create a vertex v D of weight − M − t + w ( D ) .Next for every i ∈ { , , } add an edge between vertices v iX and v i +1 Y iff X ∩ Y = ∅ . Thisconcludes the construction. At the end we run our hypothetical oracle to an algorithm for ExactNode Weighted P and return true if the oracle detects a simple path of vertices with total weight . This concludes description of the reduction.Now we analyse the correctness. If there exist S ∈ L , S ∈ L , S ∈ R and S ∈ R thatare disjoint and sum to t , then vertices v S , v S , v S , v S form a path and their sum is equal to .For the other direction, suppose there exist vertices that form a path and their sum is equal to .Because their sum is equal to and integer M is larger than the rest of the weight, these verticesfrom distinct groups, i.e. vertices v A , v B , v C , v D for some sets A, B, C, D . Moreover vertices v A , v B have to be connected (since vertices in group 1 are connected only to the vertices in group 2), hence A ∩ B = ∅ . Analogously, it can be checked that the rest of the sets B, C, D are disjoint. Observe that21 ( v A ) + w ( v B ) + w ( v C ) + w ( v D ) = 0 hence w ( A ) + w ( B ) + w ( C ) + w ( D ) = t . By the correctness ofconstruction of L , L , R , R we conclude that the answer to the Subset Sum instance is positive.Finally we analyse the runtime of our reduction. The number of vertices is clearly | V | = O (cid:63) ( |L | + |L | + |R | + |R | ) ≤ O (cid:63) (2 . n + κn ) The time needed to construct this graph is | E | = O (cid:63) ( |L ||L | + |L ||R | + |L ||L | ) ≤ O (cid:63) (2 . 
n +2 κn ) (this is also an upper bound on numberof edges).Hence if we would have an algorithm that solves Exact Node Weighted P in time O ( | V | . ) ,then Subset Sum could be solved in randomized time O (cid:63) (2 . . n + κn ) ) ≤ O (cid:63) (2 (0 . . κ ) n ) .Note that κ ( ε , λ ) is some constant that can be selected to be arbitrarly close to . In this section we will prove the following lemma. As discussed in the introduction it should benoted that the proof strategy is similar to the one from [FLPS16] (which is heavily inspired onBollobás’s Theorem [Bol65]), but we obtain improvements that are crucial for the main result ofthis paper. We compare our methods with existing literature in the end of this section.
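Throughout this section, the trivial baseline to keep in mind is the quadratic scan sketched below (a minimal Python illustration of ours; the function name is hypothetical): it checks all pairs in time proportional to $|\mathcal{A}|\cdot|\mathcal{B}|$, which is exactly the product that the sparsity-based method developed in this section replaces by a sum, at the price of a factor of roughly $2^{d(1/2+\lambda-h(\lambda/2))}$.

    # A naive quadratic baseline (ours) for the problem addressed by Theorem 6.1:
    # given two families of small subsets of [d], decide whether some pair is disjoint.
    def has_disjoint_pair_naive(family_A, family_B):
        masks_A = [sum(1 << i for i in A) for A in family_A]   # encode sets as bitmasks
        masks_B = [sum(1 << i for i in B) for B in family_B]
        return any(a & b == 0 for a in masks_A for b in masks_B)

    print(has_disjoint_pair_naive([{0, 1}, {2, 3}], [{1, 2}, {4, 5}]))   # True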
Theorem 6.1 (OV-algorithm, Generalization of Theorem 1.2) . For any λ ∈ [0 . , . and σ ∈ [0 . , . , there is a Monte-Carlo algorithm that is given A ⊆ (cid:0) dσλd (cid:1) and
B ⊆ (cid:0) d (1 − σ ) λd (cid:1) , detects ifthere exists A ∈ A and B ∈ B with A ∩ B = ∅ in time: (cid:101) O (cid:16) ( |A| + |B| ) 2 d (1 / λ − h ( λ/ (cid:17) and space (cid:101) O ( |A| + |B| + 2 d ) . For the purposes within the proof of Lemma 4.1, we can assume that λ ≤ . by a subsetcomplementation trick. The bound σ ∈ [0 . , . is an artifact of technical methods we used in theproof of Lemma B.1. In the proof of this lemma the parameters λ and σ lost their meaning fromSection 4. Hence, to simplify, we let p := σλn and q := (1 − σ ) λn , and let A ⊆ (cid:0) [ d ] p (cid:1) and B ⊆ (cid:0) [ d ] q (cid:1) .We use the following standard definitions from communication complexity (see for example [RY20]): Definition 6.2 ( ( p, q, d ) -Disjointness Matrix) . For integers p, q, d the Disjointness matrix
$\mathrm{Disj}_{p,q,d}$ has its rows indexed by $\binom{[d]}{p}$ and its columns indexed by $\binom{[d]}{q}$. For $A \in \binom{[d]}{p}$ and $B \in \binom{[d]}{q}$ we define
$$\mathrm{Disj}_{p,q,d}[A,B] = \begin{cases} 1 & \text{if } A \cap B = \emptyset,\\ 0 & \text{otherwise.}\end{cases}$$

Definition 6.3 (Monochromatic Rectangle, 1-Cover). A monochromatic rectangle of a matrix $M$ is a subset $X$ of the rows and a subset $Y$ of the columns such that $M[i,j] = M[i',j']$ for every $i,i' \in X$ and $j,j' \in Y$. A family of monochromatic rectangles $\mathcal{M} = (X_1,Y_1),\dots,(X_z,Y_z)$ is called a 1-cover if for every $i,j$ such that $M[i,j] = 1$ there exists $k \in [z]$ such that $i \in X_k$ and $j \in Y_k$.

A natural goal in the field of communication complexity is to find 'good' 1-covers. The natural parameter that quantifies such 'goodness' is $z$ (intuitively, the smaller $z$, the better the 1-cover). The parameter $z$ is sometimes called the Boolean rank and it is known to equal $2^{\mathrm{nc}(M)}$, where $\mathrm{nc}(M)$ is the 'non-deterministic communication complexity' of $M$ (see e.g. [RY20]). The name 'Boolean rank' is used because a 1-cover of $M$ with $z$ rectangles is equivalent to a factorization $M = L \cdot R$ over the Boolean semi-ring of rank $z$.

1-covers of the Disjointness matrix can be used in algorithms for the Orthogonal Vectors problem: an orthogonal pair is a $1$ in the submatrix of the Disjointness matrix induced by the rows and columns from the families $\mathcal{A}$ and $\mathcal{B}$, and we can search for such a $1$ via searching for the associated monochromatic rectangle that covers it (see Lemma 6.5 for a related approach). For the case $p = q$, it is well known that $\mathrm{Disj}_{p,q,d}$ admits a 1-cover with $O(2^{2p}\, p \ln d)$ rectangles [RY20, Claim 1.37]. When applied naïvely, this 1-cover would imply an $\tilde{O}((|\mathcal{A}|+|\mathcal{B}|)2^{d/2})$ time algorithm for the setting of Theorem 1.2 with $p = q = d/4$.

In order to get a faster algorithm we introduce the following new parameter of a 1-cover:

Definition 6.4 (Sparsity). The sparsity of a 1-cover $\mathcal{M} = (X_1,Y_1),\dots,(X_z,Y_z)$ of an $n \times m$ matrix is defined as $\sum_i |X_i|/n + \sum_i |Y_i|/m$.

A 1-cover of sparsity $\Psi$ of a matrix $M$ can be understood as a factorization $M = L \cdot R$ over the Boolean semi-ring such that the average number of $1$'s in a row of $L$ plus the average number of $1$'s in a column of $R$ is at most $\Psi$. Our notion of sparsity is related to the degree of the data structure called an $n$-$p$-$q$-separating collection [FLPS16]. For a further discussion of sparse factorizations see [Ned20, Section 5.1].

We present the algorithmic usefulness of the notion of sparsity of a 1-cover with the following statement.

Lemma 6.5 (Orthogonal Vectors Parameterized by the Sparsity). For any constant integer $c$ and integers $p, q, d$ such that $c$ divides $p, q, d$, there is an algorithm that takes as input a 1-cover $\mathcal{M}$ of $\mathrm{Disj}_{p/c,q/c,d/c}$ of sparsity $\Psi$ and two set families $\mathcal{A} \subseteq \binom{[d]}{p}$, $\mathcal{B} \subseteq \binom{[d]}{q}$, with the following properties: it outputs a pair $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such that $A \cap B = \emptyset$ with constant non-zero probability if such a pair exists. Moreover, it uses $\tilde{O}((|\mathcal{A}|+|\mathcal{B}|)\Psi^c + 2^{d/c})$ time and $\tilde{O}(2^{d/c} + z^c)$ space, where $z$ is the number of rectangles of $\mathcal{M}$.
Input: $\mathcal{A} \subseteq \binom{[d]}{p}$, $\mathcal{B} \subseteq \binom{[d]}{q}$ and a 1-cover $\mathcal{M} = (X_1,Y_1),\dots,(X_z,Y_z)$.
Output: Do there exist $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such that $A \cap B = \emptyset$?
    Randomly partition $[d] = U_1 \uplus \dots \uplus U_c$
    For every $i \in [c]$ and $Q \in \binom{U_i}{p/c}$ construct $L_i(Q) := \{j \in [z] : Q \in X_j\}$    // uses $O^\star(2^{d/c})$ space
    For every $i \in [c]$ and $Q \in \binom{U_i}{q/c}$ construct $R_i(Q) := \{j \in [z] : Q \in Y_j\}$    // uses $O^\star(2^{d/c})$ space
    Initialize $T[i_1,\dots,i_c] = \mathrm{False}$ for every $i_1,\dots,i_c \in [z]$    // uses $\tilde{O}(z^c)$ space
    foreach $A \in \mathcal{A}$ do
        if $\prod_{i=1}^{c} |L_i(A \cap U_i)| \le \tilde{O}(\Psi^c)$ then
            foreach $(i_1,\dots,i_c) \in L_1(A \cap U_1) \times \dots \times L_c(A \cap U_c)$ do set $T[i_1,\dots,i_c] = \mathrm{True}$
    foreach $B \in \mathcal{B}$ do
        if $\prod_{i=1}^{c} |R_i(B \cap U_i)| \le \tilde{O}(\Psi^c)$ then
            foreach $(i_1,\dots,i_c) \in R_1(B \cap U_1) \times \dots \times R_c(B \cap U_c)$ do
                if $T[i_1,\dots,i_c] = \mathrm{True}$ then return True
    return False

Algorithm 4: Pseudocode of Lemma 6.5
Proof. Denote the 1-cover by $\mathcal{M} = (X_1,Y_1),\dots,(X_z,Y_z)$. Observe that if $A \cap B = \emptyset$ then it suffices to find $\ell \in [z]$ such that $A \in X_\ell$ and $B \in Y_\ell$, since $\mathcal{M}$ forms a 1-cover. In a bird's-eye view, the algorithm will find such an $\ell$. We need to make sure that the space usage of our algorithm is low; we will use the parameter $c$ to achieve that (it is instructive for the reader to assume $c = 1$). Algorithm 4 presents an overview of the proof.

First, randomly partition $[d]$ into blocks $U_1,\dots,U_c$ with $|U_i| = d/c$. By Lemma 3.4, if we repeat the algorithm $d^{O(c)}$ times, then with probability at least $1/d^{O(c)}$ this partition is good, i.e., for some orthogonal pair $A, B$ it holds that $|A \cap U_i| = p/c$ and $|B \cap U_i| = q/c$ for every $i$. Next, we map the given factorization $X_1,\dots,X_z, Y_1,\dots,Y_z$ of $\mathrm{Disj}_{p/c,q/c,d/c}$ to the set $U_i$ by unifying $U_i$ with $[d/c]$ via a uniformly random permutation.

Now we describe a preprocessing step of the algorithm. For every $i \in [c]$ we create and store two lists $L_i$ and $R_i$. The purpose of these lists is to give every element of $\mathcal{A}$ and $\mathcal{B}$ fast access to the corresponding rectangles of the 1-cover that contain it (i.e., given $A$ we need to find all $X_j$ such that $A \in X_j$ in $\tilde{O}(z)$ time). Specifically, for every $i \in [c]$ and every set $Q \in \binom{[d/c]}{p/c}$ construct the list $L_i(Q) := \{j \in [z] : Q \in X_j\}$, and similarly for every $i \in [c]$ and every set $Q \in \binom{[d/c]}{q/c}$ construct the list $R_i(Q) := \{j \in [z] : Q \in Y_j\}$. Because $\binom{d/c}{p/c} \le 2^{d/c}$ we can construct and store all $L_i(Q)$ and $R_i(Q)$ in $\tilde{O}(2^{d/c} + 2^{d/c}z)$ time and space. Additionally, initialize a table $T[i_1,\dots,i_c] := \mathrm{False}$ for every $i_1,\dots,i_c \in [z]$. This table will store which tuples of rectangles have been seen by elements of $\mathcal{A}$. Observe that so far we did not look at the input $\mathcal{A}$ and $\mathcal{B}$; we just preprocessed the 1-cover so that the next steps can be computed efficiently.

Now iterate over every element $A \in \mathcal{A}$ and check whether we can afford to process it: if $|L_1(A \cap U_1)| \cdots |L_c(A \cap U_c)| > (4c\Psi)^c$ we simply ignore it (later we prove that for a disjoint pair $A$ and $B$ this situation happens with low probability). If we can afford it, we mark it in the table $T$: for every $(i_1,\dots,i_c) \in L_1(A \cap U_1) \times \dots \times L_c(A \cap U_c)$ we set $T[i_1,\dots,i_c]$ to True. Clearly this step takes $\tilde{O}(|\mathcal{A}|\Psi^c)$ time.

Next, we treat $\mathcal{B}$ in a similar way: we iterate over every element $B \in \mathcal{B}$ and check whether $|R_1(B \cap U_1)| \cdots |R_c(B \cap U_c)| \le (4c\Psi)^c$. If so, we iterate over every $(i_1,\dots,i_c) \in R_1(B \cap U_1) \times \dots \times R_c(B \cap U_c)$ and check whether $T[i_1,\dots,i_c] = \mathrm{True}$. If this happens, then there exists $A \in \mathcal{A}$ that is orthogonal to the current $B$ and we can return True. If this never happens, we return False.

Clearly, the total running time of the algorithm is $\tilde{O}((|\mathcal{A}|+|\mathcal{B}|)\Psi^c)$ and the extra amount of working memory is $\tilde{O}(2^{d/c} + z^c)$. Hence we focus on correctness. Note that if True is returned, there indeed must exist disjoint $A \in \mathcal{A}$ and $B \in \mathcal{B}$, because $\mathcal{M}$ is a 1-cover. For the other direction, suppose that there exist orthogonal $A \in \mathcal{A}$ and $B \in \mathcal{B}$. As mentioned, by Lemma 3.4 with probability $1/d^{O(c)}$ we have that $|A \cap U_i| = p/c$ and $|B \cap U_i| = q/c$ for each $i$. Because we unified $[d/c]$ with $U_i$ via a random permutation, $\mathbb{E}[|L_i(A \cap U_i)|], \mathbb{E}[|R_i(B \cap U_i)|] \le \Psi$, and by Markov's inequality and a union bound, with constant probability there is no $i$ with $|L_i(A \cap U_i)| + |R_i(B \cap U_i)| \ge 4c\Psi$, and therefore $|L_1(A \cap U_1)| \cdots |L_c(A \cap U_c)| \le (4c\Psi)^c$ and $|R_1(B \cap U_1)| \cdots |R_c(B \cap U_c)| \le (4c\Psi)^c$. If this happens, the orthogonal pair will be detected since $(X_1,Y_1),\dots,(X_z,Y_z)$ is a 1-cover.
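To make the mechanics concrete, the following minimal Python sketch (ours; all helper names are hypothetical) implements the special case $c = 1$ on toy parameters, paired with the simplest deterministic variant of the certificate-set construction behind Lemma 6.6 below; the affordability check against $(4c\Psi)^c$ is omitted for brevity.

    from itertools import combinations

    def build_one_cover(d, p, q, x):
        """Rectangles (X_S, Y_S) indexed by 'certificate' sets S of size x:
        X_S = all p-subsets of S, Y_S = all q-subsets of [d] \\ S.
        Taking every such S (instead of a random sample as in Lemma 6.6) yields a
        valid 1-cover of Disj_{p,q,d} whenever p <= x <= d - q, without the sparsity guarantee."""
        universe = set(range(d))
        cover = []
        for S in combinations(range(d), x):
            S = set(S)
            X = [frozenset(A) for A in combinations(sorted(S), p)]
            Y = [frozenset(B) for B in combinations(sorted(universe - S), q)]
            cover.append((X, Y))
        return cover

    def detect_disjoint_pair(family_A, family_B, cover):
        """Lemma 6.5 with c = 1: index rectangles by the sets they contain,
        mark the rectangles touched by family_A, then probe them from family_B."""
        L, R = {}, {}
        for j, (X, Y) in enumerate(cover):
            for Q in X:
                L.setdefault(Q, []).append(j)
            for Q in Y:
                R.setdefault(Q, []).append(j)
        seen = [False] * len(cover)          # the table T[.] of Algorithm 4
        for A in family_A:
            for j in L.get(frozenset(A), []):
                seen[j] = True
        for B in family_B:
            for j in R.get(frozenset(B), []):
                if seen[j]:
                    return True              # some A in this rectangle is disjoint from B
        return False

    # Tiny usage example: d = 8, sets of size 2, certificate sets of size 4.
    cover = build_one_cover(d=8, p=2, q=2, x=4)
    print(detect_disjoint_pair([{0, 1}, {2, 3}], [{1, 2}, {4, 5}], cover))   # True: {0,1} vs {4,5}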
Lemma 6.6 (Construction of a 1-cover with small sparsity). Let $p, q$ and $d$ be integers such that $p \le q$ and $p + q \le d/2$. There is a randomized algorithm that in $O(2^d)$ time and space constructs $X_1,\dots,X_z \subseteq \binom{[d]}{p}$ and $Y_1,\dots,Y_z \subseteq \binom{[d]}{q}$, where $z$ is at most $2^d$. All pairs of sets $(X_1,Y_1),\dots,(X_z,Y_z)$ form monochromatic rectangles in $\mathrm{Disj}_{p,q,d}$ and, with probability at least $1/2$, $(X_1,Y_1),\dots,(X_z,Y_z)$ is a 1-cover of $\mathrm{Disj}_{p,q,d}$ with sparsity
$$d^{O(1)} \cdot 2^{\,d/2 + p + q - d\cdot h\left(\frac{p+q}{2d}\right)}.$$

Proof.
Let l = p + q and let A ∈ A , B ∈ B be an orthogonal pair. Let x be some parameter thatwe will determine later (think about x ≈ d/ ). Note that (cid:12)(cid:12)(cid:12)(cid:12)(cid:26) S ∈ (cid:18) [ d ] x (cid:19) : A ⊆ S and S ∩ B = ∅ (cid:27)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:18) d − lx − p (cid:19) . Let S := { S , . . . , S z } ⊆ (cid:0) [ d ] x (cid:1) be obtained by including each set from (cid:0) [ d ] x (cid:1) with probability d (cid:0) d − lx − p (cid:1) − (assuming x > p + Ω(1) , this probability is indeed in the interval [0 , ).Thus, if A and B are disjoint sets, with good probability there is a certificate set S ∈ S , suchthat A ⊆ S and S ∩ B = ∅ . More formally: P [ (cid:54) ∃ S ∈ S : A ⊆ S and S ∩ B = ∅ | A ∩ B = ∅ ] = (cid:32) − d (cid:18) d − lx − p (cid:19) − (cid:33) ( d − lx − p ) ≤ exp( − d ) , (16)(where the last inequality is due to the standard inequality α ≤ exp( α ) ). Now we define a -cover based on the family S :For every i ∈ [ z ] : X i := (cid:18) S i p (cid:19) and Y i := (cid:18) [ d ] \ S i q (cid:19) . First let us prove that with good probability X , Y . . . , X z , Y z is -cover. There are at most d disjoint pairs A, B . Hence by Equation 16 and the union bound on all disjoint pairs
A, B , we havethat ( X , Y ) , . . . , ( X z , Y z ) is a -cover with probability at least / .Next, we bound the sparsity of ( X , Y ) , . . . , ( X z , Y z ) . By Markov’s inequality, z ≤ d (cid:0) dx (cid:1)(cid:0) d − lx − p (cid:1) − with probability at least / . Hence with probability at least / our -cover has sparsity at most: d (cid:18) dx (cid:19)(cid:18) d − lx − p (cid:19) − (cid:18) | X i | / (cid:18) dp (cid:19) + | Y i | / (cid:18) dq (cid:19)(cid:19) =4 d (cid:18) dx (cid:19)(cid:18) d − lx − p (cid:19) − (cid:32)(cid:18) xp (cid:19)(cid:18) dp (cid:19) − + (cid:18) d − xq (cid:19)(cid:18) dq (cid:19) − (cid:33) =4 d (cid:18)(cid:18) d − px − p (cid:19) + (cid:18) d − qx (cid:19)(cid:19)(cid:18) d − lx − p (cid:19) − , (17)where the second equality follows from using (cid:0) ab (cid:1)(cid:0) bc (cid:1) = (cid:0) ab,c (cid:1) = (cid:0) ac (cid:1)(cid:0) a − cb (cid:1) twice.Next, we use Lemma B.1 (see Appendix B) with: d = n , p = σλn , q = (1 − σ ) λn and p + q = λn .Note that we assumed that σ ∈ [0 . , . and λ ∈ [0 . , . hence conditions for Lemma B.1 aresatisfied. We obtain that for the choice of x := d (1 / σ − / (3) /
2) + (1 / − σ )(1 / − λ )) expression (17) is bounded from above with d O (1) · d/ p + q − d · h ( p + q d ) , as required. 25ow the main statement of this section follows by a straightforward combination of the previouslemmas: Proof of Theorem 6.1.
Let $p := \sigma\lambda d$ and $q := (1-\sigma)\lambda d$. Set $c = 20$ and assume that the integers $p, q, d$ are multiples of $c$ (by padding the instance if needed). Next, use Lemma 6.6 with $d/c$, $p/c$ and $q/c$ to construct a 1-cover $\mathcal{M}$ of sparsity
$$\Psi = d^{O(1)} \cdot 2^{\left(d/2 + p + q - d\cdot h\left(\frac{p+q}{2d}\right)\right)/c},$$
with good probability. Subsequently, apply Lemma 6.5 with this 1-cover $\mathcal{M}$ to detect a disjoint pair $A \in \mathcal{A}$ and $B \in \mathcal{B}$ with constant probability. Note that the runtime is
$$\tilde{O}\!\left((|\mathcal{A}|+|\mathcal{B}|)(4c\Psi)^c + 2^{d/c}\right) = \tilde{O}\!\left((|\mathcal{A}|+|\mathcal{B}|)\, 2^{\,d/2 + p + q - d\cdot h\left(\frac{p+q}{2d}\right)}\right).$$
Hence, the running time is $\tilde{O}\!\left((|\mathcal{A}|+|\mathcal{B}|)\, 2^{d(1/2 + \lambda - h(\lambda/2))}\right)$. The main bottleneck in the space usage comes from the $z^c$ factor in Lemma 6.5, which gives the $2^d$ factor.
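As a quick numerical sanity check (ours; the helper h is simply the binary entropy function), one can evaluate the exponent from the last display at $\lambda = 1/2$ and relate it to the binomial quantity appearing in the lower bound below.

    from math import comb, log2

    def h(x):
        """Binary entropy; h(0) = h(1) = 0."""
        return 0.0 if x in (0.0, 1.0) else -x*log2(x) - (1-x)*log2(1-x)

    lam = 0.5
    print(round(0.5 + lam - h(lam/2), 5))                  # 0.18872 = 1 - h(1/4)

    d = 400                                                # binom(d, d/4) ~ 2^{h(1/4) d} up to poly(d)
    print(round(log2(2**d / comb(d, d // 4)) / d, 5))      # ~0.2, approaching 0.18872 as d grows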
Lower Bounds on sparsity. One might be tempted to try to get even better bounds on the sparsity of the disjointness matrix. Here we show that the sparsity bound from Lemma 6.6 is essentially optimal, with a fairly straightforward counting argument. It means that new techniques would have to be developed in order to improve our algorithm for Orthogonal Vectors in the worst case $\sigma = 1/2$ and $\lambda = 1/2$, and in consequence improve the meet-in-the-middle algorithm for Subset Sum.

Theorem 6.7.
Any 1-cover of $\mathrm{Disj}_{d/4,d/4,d}$ has sparsity at least $\Omega^\star\!\left(2^{d}/\binom{d}{d/4}\right)$.

Proof. Let $(X_1,Y_1),\dots,(X_z,Y_z)$ be a 1-cover of $\mathrm{Disj}_{d/4,d/4,d}$. Next, we define
$$L_i := \bigcup_{A \in X_i} A \quad\text{ and }\quad R_i := \bigcup_{B \in Y_i} B.$$
We say an index $i \in [z]$ is left-heavy if $|L_i| > d/2$ and right-heavy if $|R_i| > d/2$. Note that $i$ cannot be both left-heavy and right-heavy, since otherwise there exist $A \in X_i$ and $B \in Y_i$ that overlap, contradicting that $X_i, Y_i$ is a monochromatic rectangle. By swapping $X_i$ and $Y_i$ we can assume without loss of generality that
$$\sum_{\substack{i=1 \\ i \text{ left-heavy}}}^{z} |X_i||Y_i| \;\ge\; \sum_{\substack{i=1 \\ i \text{ right-heavy}}}^{z} |X_i||Y_i|.$$
Since every disjoint pair of sets $A, B \subseteq [d]$ with $|A| = |B| = d/4$ must be in at least one rectangle, we have the lower bound
$$\binom{d}{d/4,\, d/4} \;\le\; \sum_{i=1}^{z} |X_i||Y_i| \;\le\; 2\!\!\sum_{\substack{i=1 \\ i \text{ not right-heavy}}}^{z}\!\! |X_i||Y_i| \;\le\; 2\!\!\sum_{\substack{i=1 \\ i \text{ not right-heavy}}}^{z}\!\! |X_i|\binom{d/2}{d/4},$$
where the last inequality holds since for such $i$ we have $|R_i| \le d/2$, which implies $|Y_i| \le \binom{d/2}{d/4}$. Thus $\sum_{i=1}^{z} |X_i| \ge 2^{d}/d^{O(1)}$, and the theorem follows because the numbers of rows and columns are both $\binom{d}{d/4}$.
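The counting step above boils down to an exact binomial identity; the following small Python check (helper name ours) confirms it for a concrete $d$.

    from math import comb, factorial

    def multinomial(d, a, b):
        """binom(d; a, b) = d!/(a! b! (d-a-b)!), the number of ordered pairs of
        disjoint sets of sizes a and b drawn from [d]."""
        return factorial(d) // (factorial(a) * factorial(b) * factorial(d - a - b))

    d = 32
    lhs = multinomial(d, d // 4, d // 4) // comb(d // 2, d // 4)
    print(lhs == comb(d, d // 2))   # True: the ratio equals binom(d, d/2) ~ 2^d / poly(d)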
Relation of Techniques in this section with existing methods. The idea for constructing the 1-cover is relatively standard in communication complexity (see e.g. the aforementioned [RY20, Claim 1.37]). It was also used in some proofs of Bollobás's Theorem [Bol65]. The idea of randomly partitioning the universe to get a structured 1-cover is very similar to the derandomization of the color-coding approach from [AYZ95]. Both ideas were also used by [FLPS16]. They also start with a probabilistic construction (cf. [FLPS16, Lemma 4.5]) on a small universe that is repeatedly applied, and use it to set up a data structure of '$n$-$p$-$q$-separating collections' that is similar to our lists. (Additionally, they derandomize their construction by using brute force to find the probabilistic construction, and use splitters to derandomize the step of splitting the universe into $c$ blocks.) The small but crucial difference, however, is that (in our language) they obtain a monochromatic rectangle by sampling a random set $S \subseteq [d]$ (in contrast to our random sampling $S \in \binom{[d]}{d/2}$ in the case $p = q = d/4$), and in the case $p = q = d/4$ this would lead to sparsity $2^{d/2}/2^{d/4} \gg 2^{d}/\binom{d}{d/4}$.

Acknowledgements.
The first author would like to thank Per Austrin, Nikhil Bansal, PetteriKaski, Mikko Koivisto for several inspiring discussions about reductions from Subset Sum to Or-thogonal Vectors. The second author would like to thank Marcin Mucha and Jakub Pawlewicz foruseful discussions.
References [Abb20] Amir Abboud. personal communication, 2020.[ABHS19] Amir Abboud, Karl Bringmann, Danny Hermelin, and Dvir Shabtay. SETH-BasedLower Bounds for Subset Sum and Bicriteria Path. In
Proceedings of the ThirtiethAnnual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019 , pages 41–57,2019.[AKKM13] Per Austrin, Petteri Kaski, Mikko Koivisto, and Jussi Määttä. Space-Time Tradeoffsfor Subset Sum: An Improved Worst Case Algorithm. In
Automata, Languages, andProgramming - 40th International Colloquium, ICALP 2013 , pages 45–56, 2013.[AKKN15] Per Austrin, Petteri Kaski, Mikko Koivisto, and Jesper Nederlof. Subset Sum in theAbsence of Concentration. In , pages 48–61, 2015.[AKKN16] Per Austrin, Petteri Kaski, Mikko Koivisto, and Jesper Nederlof. Dense Subset SumMay Be the Hardest. In , pages 13:1–13:14, 2016.[AL13] Amir Abboud and Kevin Lewi. Exact Weight Subgraphs and the k-Sum Conjecture.In
Automata, Languages, and Programming - 40th International Colloquium, ICALP2013 , volume 7965 of
Lecture Notes in Computer Science , pages 1–12. Springer, 2013.[AWY15] Amir Abboud, Richard Ryan Williams, and Huacheng Yu. More Applications of thePolynomial Method to Algorithm Design. In
Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, pages 218–230. SIAM, 2015.
[AYZ95] Noga Alon, Raz Yuster, and Uri Zwick. Color-Coding. J. ACM, 42(4):844–856, 1995.
[BCJ11] Anja Becker, Jean-Sébastien Coron, and Antoine Joux. Improved Generic Algorithms for Hard Knapsacks. In
Advances in Cryptology - EUROCRYPT 2011 - 30th AnnualInternational Conference on the Theory and Applications of Cryptographic Techniques.Proceedings , pages 364–385, 2011.[BGNV18] Nikhil Bansal, Shashwat Garg, Jesper Nederlof, and Nikhil Vyas. Faster Space-Efficient Algorithms for Subset Sum, k-Sum, and Related Problems.
SIAM J. Comput. ,47(5):1755–1777, 2018.[BHK09] Andreas Björklund, Thore Husfeldt, and Mikko Koivisto. Set Partitioning viaInclusion-Exclusion.
SIAM J. Comput. , 39(2):546–563, 2009.[BHKK] Andreas Björklund, Thore Husfeldt, Petteri Kaski, and Mikko Koivisto. CountingPaths and Packings in Halves. In Amos Fiat and Peter Sanders, editors,
Algorithms -ESA 2009, 17th Annual European Symposium. Proceedings .[Bjö14] Andreas Björklund. Determinant Sums for Undirected Hamiltonicity.
SIAM J. Com-put. , 43(1):280–299, 2014.[Bol65] Béla Bollobás. On generalized graphs.
Acta Mathematica Academiae ScientiarumHungarica , 16(3-4):447–452, 1965.[Bri17] Karl Bringmann. A Near-linear Pseudopolynomial Time Algorithm for Subset Sum.In
Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Al-gorithms, SODA 2017 , 2017.[Bri20] Karl Bringmann. personal communication, 2020.[CDL +
16] Marek Cygan, Holger Dell, Daniel Lokshtanov, Dániel Marx, Jesper Nederlof, YoshioOkamoto, Ramamohan Paturi, Saket Saurabh, and Magnus Wahlström. On Problemsas Hard as CNF-SAT.
ACM Trans. Algorithms , 12(3):41:1–41:24, 2016.[CW16] Timothy M. Chan and Ryan Williams. Deterministic APSP, Orthogonal Vectors, andMore: Quickly Derandomizing Razborov-Smolensky. In
Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016 , pages1246–1255. SIAM, 2016.[CW19] Lijie Chen and Ryan Williams. An Equivalence Class for Orthogonal Vectors. In
Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms,SODA 2019 , 2019.[DDKS12] Itai Dinur, Orr Dunkelman, Nathan Keller, and Adi Shamir. Efficient Dissection ofComposite Problems, with Applications to Cryptanalysis, Knapsacks, and Combina-torial Search Problems. In
Advances in Cryptology - CRYPTO 2012 - 32nd AnnualCryptology Conference. Proceedings , 2012.[FLPS16] Fedor V. Fomin, Daniel Lokshtanov, Fahad Panolan, and Saket Saurabh. EfficientComputation of Representative Families with Applications in Parameterized and ExactAlgorithms.
J. ACM, 63(4):29:1–29:60, 2016.
[GIKW19] Jiawei Gao, Russell Impagliazzo, Antonina Kolokolova, and Ryan Williams. Completeness for First-order Properties on Sparse Structures with Algorithmic Applications.
ACM Trans. Algorithms , 15(2):23:1–23:35, 2019.[HJ10] Nick Howgrave-Graham and Antoine Joux. New Generic Algorithms for Hard Knap-sacks. In
Advances in Cryptology - EUROCRYPT 2010, 29th Annual InternationalConference on the Theory and Applications of Cryptographic Techniques. Proceedings ,pages 235–256, 2010.[HS74] Ellis Horowitz and Sartaj Sahni. Computing Partitions with Applications to the Knap-sack Problem.
J. ACM , 21(2):277–292, 1974.[JW19] Ce Jin and Hongxun Wu. A Simple Near-Linear Pseudopolynomial Time Random-ized Algorithm for Subset Sum. In , pages 17:1–17:6, 2019.[KX17] Konstantinos Koiliaris and Chao Xu. A Faster Pseudopolynomial Time Algorithm forSubset Sum. In
Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium onDiscrete Algorithms, SODA 2017 , 2017.[KX18] Konstantinos Koiliaris and Chao Xu. Subset Sum Made Simple.
CoRR ,abs/1807.08248, 2018.[LMS11] Daniel Lokshtanov, Dániel Marx, and Saket Saurabh. Lower bounds based on the expo-nential time hypothesis.
Bulletin of the European Association for Theoretical ComputerScience EATCS , 105, 01 2011.[LN10] Daniel Lokshtanov and Jesper Nederlof. Saving Space by Algebraization. In
Proceedingsof the 42nd ACM Symposium on Theory of Computing, STOC 2010 , pages 321–330,2010.[MNPW19] Marcin Mucha, Jesper Nederlof, Jakub Pawlewicz, and Karol Węgrzycki. Equal-Subset-Sum Faster Than the Meet-in-the-Middle. In , 2019.[Mon83] Burkhard Monien. The Complexity of Determining Paths of Length k. In
Proceedings ofthe WG ’83, International Workshop on Graphtheoretic Concepts in Computer Science ,pages 241–251, 1983.[Ned16] Jesper Nederlof. Finding Large Set Covers Faster via the Representation Method. In , 2016.[Ned20] Jesper Nederlof. Algorithms for NP-hard problems via Rank-related Parameters ofMatrices. In
Festschrift Dedicated to the 60th Birthday of Hans Bodlaender . Springer,2020.[NPSW21] Jesper Nederlof, Jakub Pawlewicz, Céline M. F. Swennenhuis, and Karol Węgrzycki.A Faster Exponential Time Algorithm for Bin Packing With a Constant Number ofBins via Additive Combinatorics. In
Accepted to SODA 2021 , 2021.[NvLvdZ12] Jesper Nederlof, Erik Jan van Leeuwen, and Ruben van der Zwaan. Reducing a TargetInterval to a Few Exact Queries. In
Mathematical Foundations of Computer Science 2012 - 37th International Symposium, MFCS 2012, pages 718–727, 2012.
[RY20] Anup Rao and Amir Yehudayoff.
Communication Complexity and Applications. Cambridge University Press, 2020.
[SS81] Richard Schroeppel and Adi Shamir. A $T = O(2^{n/2})$, $S = O(2^{n/4})$ Algorithm for Certain NP-Complete Problems.
SIAM J. Comput. , 10(3):456–464, 1981.[VW18] Virginia Vassilevska-Williams. On Some Fine-Grained Questions in Algorithms andComplexity. In
Proceedings of the International Congress of Mathematicians (ICM 2018), pages 3447–3487, 2018.
[Wil05] Ryan Williams. A new algorithm for optimal 2-constraint satisfaction and its implications.
Theor. Comput. Sci. , 348(2-3):357–365, 2005.
A Omitted proofs
A.1 The Approach of Schroeppel and Shamir
In this section we recall the approach from [SS81] in such a way that we can easily reuse parts of it. Their crucial insight is formalized in Lemma 3.3, which we first recall for convenience:
Lemma 3.3.
Let
$A, B \subseteq \mathbb{Z}$ be two sets of integers, and let $C := A + B := \{a + b : a \in A,\ b \in B\}$ be their sumset. Let $c_1,\dots,c_m$ be the elements of $C$ in increasing order. There is a data structure $\mathrm{inc} := \mathrm{inc}(A,B)$ that takes $\tilde{O}(|A|+|B|)$ preprocessing time and supports a query $\mathrm{inc.next()}$ that in the $i$'th call (for $1 \le i \le m$) outputs $(P_{c_i}, c_i)$, and in all subsequent calls outputs EMPTY. Here $P_{c_i}$ is the set $\{(a,b) : a \in A,\ b \in B,\ a + b = c_i\}$. Moreover, the total time needed to execute all $m$ calls to $\mathrm{inc.next()}$ is $\tilde{O}(|A||B|)$ and the maximum space usage of the data structure is $\tilde{O}(|A|+|B|)$. Similarly, there is a data structure $\mathrm{dec} := \mathrm{dec}(A,B)$ that outputs pairs of elements of $A$ and $B$ in order of their decreasing sum.

Preprocessing $\mathrm{inc}(A,B)$:
    Sort $A = \{a_1,\dots,a_k\}$ and $B = \{b_1,\dots,b_l\}$ in increasing order
    Initialize an empty priority queue $Q$
    For every $b_j \in B$ add $(a_1, b_j)$ to $Q$ with priority $a_1 + b_j$

Operation $\mathrm{inc.next()}$:
    if $Q$ is empty then return EMPTY
    Let $w$ be the lowest priority in the queue $Q$
    Initialize $P_w := \emptyset$
    while the lowest priority in $Q$ is $w$ do
        Let $(a_i, b_j)$ be an element with lowest priority in $Q$ and remove it
        if $i < k$ then add $(a_{i+1}, b_j)$ with priority $a_{i+1} + b_j$ to $Q$
        $P_w := P_w \cup \{(a_i, b_j)\}$    // $a_i + b_j = w$
    if $P_w = \emptyset$ then return $\mathrm{inc.next()}$    // seek the next sum if $w \notin C$
    return $(P_w, w)$

Algorithm 5: Pseudocode of Lemma 3.3
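For concreteness, here is a minimal executable Python rendering of this data structure (class and function names are ours); it also shows, looking ahead, the two-pointer scan of Lemma A.1 below that consumes the increasing and decreasing enumerations.

    import heapq

    class SumsetEnumerator:
        """Enumerate the sumset A+B one distinct sum at a time, together with all
        pairs realizing it; increasing order by default, decreasing on request.
        A sketch of the inc/dec structures of Lemma 3.3 (assumes A and B are non-empty)."""

        def __init__(self, A, B, decreasing=False):
            self.sign = -1 if decreasing else 1
            self.A = sorted(A, reverse=decreasing)
            self.B = sorted(B, reverse=decreasing)
            # One heap entry per element of B, paired with an index into A.
            self.heap = [(self.sign * (self.A[0] + b), 0, j) for j, b in enumerate(self.B)]
            heapq.heapify(self.heap)

        def next(self):
            """Return (pairs, value) for the next sum, or None when exhausted."""
            if not self.heap:
                return None
            key = self.heap[0][0]
            pairs = []
            while self.heap and self.heap[0][0] == key:
                _, i, j = heapq.heappop(self.heap)
                pairs.append((self.A[i], self.B[j]))
                if i + 1 < len(self.A):  # advance this 'column' to the next element of A
                    heapq.heappush(self.heap, (self.sign * (self.A[i + 1] + self.B[j]), i + 1, j))
            return pairs, self.sign * key

    def four_sum(A, B, C, D, t):
        """Two-pointer scan over A+B (increasing) and C+D (decreasing), as in Lemma A.1."""
        inc = SumsetEnumerator(A, B)
        dec = SumsetEnumerator(C, D, decreasing=True)
        x, y = inc.next(), dec.next()
        while x is not None and y is not None:
            s = x[1] + y[1]
            if s == t:
                return x[0][0] + y[0][0]   # witness quadruple (a, b, c, d)
            x, y = (inc.next(), y) if s < t else (x, dec.next())
        return None

    print(four_sum([1, 2], [3, 4], [5, 6], [7, 8], 17))   # e.g. (1, 3, 6, 7)

Splitting the $n$ input weights into four quarters and feeding the four subset-weight tables into four_sum is exactly the use made of this routine in Theorem A.2 below.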
Proof. For an overview of the proof see Algorithm 5. During the preprocessing step we sort the sets $A$ and $B$ in increasing order, $A = \{a_1 \le a_2 \le \dots \le a_k\}$ and $B = \{b_1 \le b_2 \le \dots \le b_l\}$. Next we initialize a priority queue $Q$ and for every $b_j \in B$ we add the tuple $(a_1, b_j)$ with priority $a_1 + b_j$. The preprocessing clearly takes $\tilde{O}(|A|+|B|)$ time and space.

Now we explain the implementation of the operation $\mathrm{inc.next()}$. We let $w$ be the priority of the element with the lowest priority in our queue. We go over all $(a_i, b_j)$ in the priority queue that have priority $w$ (and therefore $a_i + b_j = w$) and add them to $P_w$. Namely, we remove every element $(a_i, b_j)$ with $a_i + b_j = w$ from the queue and replace it with $(a_{i+1}, b_j)$. For correctness, note that every pair $(a_i, b_j) \in A \times B$ will eventually be added to and removed from the queue. Moreover, the priority queue outputs elements in increasing order of their sum.

For the space complexity, observe that at any moment, for every $b \in B$ there exists at most one $a \in A$ such that $(a,b) \in Q$. Hence at any moment the priority queue uses $\tilde{O}(|B|)$ space. For the running time, observe that every pair $(a,b) \in A \times B$ is added to and removed from $Q$ exactly once. Hence the total running time of all calls to $\mathrm{inc.next()}$ is $\tilde{O}(|A||B|)$.

The data structure of Lemma 3.3 can be used for an efficient 4-SUM algorithm:
Lemma A.1.
An instance $A, B, C, D, t$ of 4-SUM with $|A| = |B| = |C| = |D| = N$ can be solved using $\tilde{O}(N^2)$ time and $\tilde{O}(N)$ space.

Proof. The idea is to simulate the standard linear-search routine on the sets $A+B$ and $C+D$. Use Lemma 3.3 to enumerate $A+B$ in increasing order and $C+D$ in decreasing order. In every iteration, with current items $x \in A+B$ and $y \in C+D$, compare $x+y$ with $t$. If $x+y = t$, output YES. Otherwise, if $x+y < t$, query the next (larger) element of $A+B$, and if $x+y > t$, query the next (smaller) element of $C+D$, and iterate. If this terminates, output NO. It is easy to see that this is always correct and runs within the required time and space bounds.

Theorem A.2 ([SS81]). Given a set $S$ of $n$ positive integers and a target $t \in \mathbb{N}$, in $O^\star(2^{n/2})$ time and $O^\star(2^{n/4})$ space we can determine whether there exists $S' \subseteq S$ such that $w(S') = t$.

Proof. First, arbitrarily partition the input weights into sets $A_1, A_2, A_3, A_4$ with $|A_1| = |A_2| = |A_3| = |A_4| = n/4$. Next, enumerate and store for every $i \in \{1,\dots,4\}$ the sets $\mathcal{A}_i := \{w(X) \mid X \subseteq A_i\}$. Observe that $|\mathcal{A}_i| = 2^{n/4}$. We can construct and store $\mathcal{A}_i$ for every $i \in \{1,\dots,4\}$ in $O^\star(2^{n/4})$ time and space. Now we solve the 4-SUM instance with sets $\mathcal{A}_1,\dots,\mathcal{A}_4$ and target $t$ using the algorithm from Lemma A.1. Because we can solve an instance of 4-SUM on $N$ integers in $\tilde{O}(N^2)$ time and $\tilde{O}(N)$ space, and the instance size is $N = 2^{n/4}$, the algorithm runs in $O^\star(2^{n/2})$ time and $O^\star(2^{n/4})$ space. For the correctness, assume that $S' \subseteq S$ with $w(S') = t$. Note that $w(A_i \cap S') \in \mathcal{A}_i$ for every $i \in \{1,\dots,4\}$. Therefore the 4-SUM algorithm answers yes if there exists $S' \subseteq S$ with $w(S') = t$. The other direction of the correctness is trivial.

A.2 Enumerating $\mathcal{L}_1, \mathcal{L}_2, \mathcal{R}_1, \mathcal{R}_2$

Lemma A.3.
We can enumerate L , R , L , R in (cid:101) O ( |L | + |L | + |R | + |R | + 2 µn ) time andspace. See Appendix C for problem definitions. roof. We start with enumerating sets S , . . . , S defined in Equation 6. We can do that in timeand space O (cid:63) (2 | L | + 2 | M | + 2 | R | ) . This is bounded by our claimed runtime, because | L | and | R | arebounded by (1 − µ + | β | µ ) n/ < (1 − . µ ) n/ < µ (recall that µ > . and | β | < . ).Next, we construct tables modulo p L , p R , for all i ∈ { , . . . , } and a ∈ [ p L ] : T iL [ a ] = { X | X ∈ S i and w ( X ) ≡ p L a } . And for all i ∈ { , . . . , } and a ∈ [ p R ] : T iR [ a ] = { X | X ∈ S i and w ( X ) ≡ p R a } . Now construct sets L , L , R , R with the dynamic programming according to Equations (8)-(11). For example, to construct L we join all sets X ∈ T L [ a ] with Y ∈ T L [ x L − a ] for all a ∈ [ p L ] .This can be computed in the O (cid:63) ( |L | + |L | + |R | + |R | + 2 µn ) extra time and space (note that p L , p R = O (cid:63) (2 µn ) ). A.3 Improved Time-Space Trade-off
Corollary A.4.
Let S be an integer satisfying S ≤ . n . Then any Subset Sum instance on n integers can be solved by a Monte Carlo algorithm using O (cid:63) ( S ) space and O (cid:63) (2 n / S . ) time.Proof. Let w , . . . , w n , t be such an instance, and set b = n − log S / . . For every subsetof X ⊆ n − b + 1 , . . . , n solve the Subset Sum instance with weights w , . . . , w n − b and target t − (cid:80) i ∈ X w i using Theorem 1.1. This is clearly a correct Monte Carlo algorithm, and it uses S =2 . n − b ) space and time T = O (cid:63) (2 b +( n − b ) / ) . Thus we have that T S . / . n ≤ n . B Inequality in the runtime analysis of algorithm for OV
In this section we will prove the bound on the running time of the Orthogonal Vectors algorithm. Intuitively, it means that the hardest case is when $\sigma = 1/2$. We will use the short binomial notation $\binom{\alpha n}{\beta n} = \binom{\alpha}{\beta}^{n}$. The inequality that we prove is:

Lemma B.1.
For large enough n and λ ∈ [0 . , . and σ ∈ [0 . , . the following inequality holds: min x (cid:40) (cid:0) − λσx − λσ (cid:1) n + (cid:0) − (1 − σ ) λx (cid:1) n (cid:0) − λx − λσ (cid:1) n (cid:41) ≤ n (1 / λ − h ( λ/ n O (1) . The strategy behind the proof is to find an x that is a good approximation (up to a rd orderfactors) of the equation (cid:0) − λσx − λσ (cid:1) n = (cid:0) − (1 − σ ) λx (cid:1) n . We found it with a computer assistance. Next weplug in the x and use the Taylor expansion up to the nd order. It will turn out that all the 0th and2nd order terms cancel out. Moreover we will prove that nd order terms are negative. Because weuse Taylor expansions, we need an extra assumption about closeness of λ, σ to / (recall, that weonly need λ ≤ . ). Observe that nh ( α ) is within polynomial factors from (cid:0) nαn (cid:1) , hence in the proofwe decided to skip factors n O (1) . Proof of Lemma B.1.
First we do the substitution: σ := 1 / − α and λ := 1 / − β . We have that α ∈ [ − , ] and β ∈ [0 , ] by symetry. The inequality that we need to prove is therefore:32 in x (cid:0) − αβ +( α + β ) / x − / − αβ +( α + β ) / (cid:1) n + (cid:0) + αβ − ( α − β ) / x (cid:1) n (cid:0) / βx − / − αβ +( α + β ) / (cid:1) n ≤ n (1 − β − h (1 / − β/ Our choice for the minimizer is x := 1 / σ − / (3) /
2) + (0 . − λ )(0 . − σ ) = 1 / − α (log (3)) / αβ . Moreover define constant c := (log (3) − / . Then our inequality is: (cid:0) + α/ β/ − αβ )1 / − cα + β/ (cid:1) n + (cid:0) − α/ β/ αβ )1 / cα + β/ (cid:1) n (cid:0) / β / β/ − cα (cid:1) n ≤ n (1 − β − h (1 / − β/ Now we will use the following observation:
Claim B.2. If α ∈ [ − , ] , β ∈ (0 , ) then (cid:18) / β / β/ − cα (cid:19) n ≥ n (1 / β − . α ) Hence if we multiply by the divisor, our inequality is simplified to: (cid:18) + α/ β/ − αβ )1 / − cα + β/ (cid:19) n + (cid:18) − α/ β/ αβ )1 / cα + β/ (cid:19) n ≤ n (3 / − . α − h (1 / − β/ Next, we use the inequality (cid:0) n + εk (cid:1) ≤ ε (cid:0) nk (cid:1) twice (for ε = αβ and ε = − αβ ) to simplify to: αβ (cid:18)(cid:18) + α/ β/ / − cα + β/ (cid:19) n + (cid:18) − α/ β/ / cα + β/ (cid:19) n (cid:19) ≤ n (3 / − . α − h (1 / − β/ Next, we use inequality (cid:0) n + εk + ε (cid:1) ≤ ε (cid:0) nk (cid:1) for ε = β/ and simplify it even further: αβ + β/ (cid:18)(cid:18) + α/ / − cα (cid:19) n + (cid:18) − α/ / cα (cid:19) n (cid:19) ≤ n (3 / − . α − h (1 / − β/ Next, we use the following:
Claim B.3.
For every α ∈ [ − , ] , β ∈ [0 , ] it holds: h (1 / − β ) ≤ h (1 / − β/ − αβ. Using this claim, it remains to show that (cid:18) + α/ / − cα (cid:19) n + (cid:18) − α/ / cα (cid:19) n ≤ n (3 / − . α − h (1 / Finally, we use our last claim:
Claim B.4.
For every α ∈ [ − , ] the following holds: (cid:18) − α/ / cα (cid:19) n ≤ (cid:18) / / (cid:19) n − . α n (cid:18) / / (cid:19) n − . α n ≤ n (3 / − . α − h (1 / To see that this holds, note that the α factors cancel out, and that remaining inequality (cid:0) / / (cid:1) n (cid:0) / (cid:1) n ≤ . n is in fact an equality because both sides count the number of partitions of n in three blocks ofsize n/ , n/ and n/ .Now, we will present a proofs of the claims. These are based on the following Taylor expansionsof the entropy function: h (1 / x ) = h (1 /
3) + x −
274 ln 8 x + 278 ln 8 x − O ( x ) (18)We will denote κ := −
274 ln 8 and κ :=
278 ln 8 . h (1 / x ) = h (1 /
4) + (log x − x + O ( x ) (19) Proof of Claim B.2.
Recall, that we put c := (log (3) − / . First, we use the entropy functionand write: (cid:18) / β / β/ − cα (cid:19) n = 2 (1 / β ) h (1 / − cα/ (1 / β )) n . Hence we need to show: (1 / β ) · h (cid:18) / − cα / β (cid:19) ≥ / β − . · α . Next, we use the inequality h (1 / − x ) ≥ − x (Inequality (3)) and have: (1 / β ) · h (cid:18) / − cα / β (cid:19) ≥ (1 / β ) (cid:32) − (cid:18) cα / β (cid:19) (cid:33) = 1 / β − c α / β . Hence, we need to show that: / β − c α / β ≥ / β − . · α Which is equivalent to / β ≥ c . Note that c . ≈ . and the claim follows because β ≥ . Proof of Claim B.3.
From (19), we know that h (1 / − β/ ≤ h (1 / − log β Hence, we need to show that: h (1 / − β/ − αβ ≥ h (1 / − log β Which means that: ≤ (cid:18) log − − α (cid:19) β ≈ (0 . − α ) β This holds when β ≥ and α ≤ . . 34 roof of Claim B.4. This is the moment, when the choice of x is used. First, let us rewrite thebinomial coefficient as an entropy function. (cid:18) / − α/ / cα (cid:19) n ≤ n (3 / − α/ h (cid:16) / − cα / − α/ (cid:17) . Hence, we need to prove (3 / − α/ h (cid:18) / cα / − α/ (cid:19) ≤ h (1 / − . · α . Let us denote φ ( α ) := / cα / − α/ . Therefore, we need to show (3 / − α/ h ( φ ( α )) ≤ h (1 / − . · α . The strategy behind the proof is straightforward. We bound φ ( α ) with Taylor expansion and thenbound h ( φ ( α )) . The inequality is technical, because we need to expand up to the O ( α ) term. Letus use Taylor expansion of fraction inside binary entropy: φ ( α ) := 1 / cα / − α/ / c + 1) α + 427 (6 c + 1) α + 881 (6 c + 1) α + O ( α ) . Let A := (12 c + 2) / and B := (24 c + 4) / . Note, that (6 c + 1) < . , therefore φ ( α ) ≤ / Aα + Bα + 0 . α − α . Because we assumed α < . we can roughly bound: φ ( α ) ≤ / Aα + Bα + 1 / · α . Then we plug in (18), the Taylor expansion of h (1 / x ) ≤ h (1 /
3) + x + κ · x + κ · x and have: h ( φ ( α )) ≤ h (1 /
3) + Aα + (cid:0) B + κ A (cid:1) α + κ α . Next we multiply it by ( − α/ and have: (cid:18) − α/ (cid:19) · h ( φ ( α )) ≤ h (1 / α (cid:18) A − h (1 / / (cid:19) + α (cid:18) B − A/ (cid:19) + α (cid:18) κ · A (cid:19) + 3 κ α . The x and the constant c were chosen in such a way that A = h (1 / / and B = A/ . Hence: (cid:18) − α/ (cid:19) · h ( φ ( α )) ≤ h (1 /
3) + α (cid:18) κ · A (cid:19) + α (cid:18) κ (cid:19) . Moreover κ · A / < − . and κ / < . . Hence (cid:18) − α/ (cid:19) · h ( φ ( α )) ≤ h (1 / − . α + 1 . α Recall that we assumed that α < . , therefore: (cid:18) − α/ (cid:19) · h ( φ ( α )) ≤ h (1 / − . α which we needed to prove. 35 Problems Definitions -SUM Input:
Sets $A, B, C, D$ of integers and a target integer $t$.
Task: Find $a \in A$, $b \in B$, $c \in C$, $d \in D$ such that $a + b + c + d = t$.

Binary Integer Programming (BIP)
Input: Vectors $v, a_1, \dots, a_d \in [m]^n$ and integers $u_1, \dots, u_d \in [m]$.
Task: Find $x \in \mathbb{Z}^n$ that minimizes $\langle v, x\rangle$ subject to $\langle a_j, x\rangle \le u_j$ for all $j \in [d]$ and $x_i \in \{0,1\}$ for all $i \in [n]$.

Orthogonal Vectors (OV)
Input: Two sets of vectors $\mathcal{A}, \mathcal{B} \subseteq \{0,1\}^d$.
Task: Decide whether there exists a pair $a \in \mathcal{A}$ and $b \in \mathcal{B}$ such that $\langle a, b\rangle = 0$.

Exact Node Weighted $P_4$
Input: A node-weighted, undirected graph $G$.
Task: Decide whether there exists a simple path on $4$ vertices with total weight equal to exactly $0$.

Knapsack
Input: A set of $n$ items $\{(v_1, w_1), \dots, (v_n, w_n)\}$ and a capacity $t$.
Task: Find $x \in \mathbb{Z}^n$ that maximizes $\langle v, x\rangle$ subject to $\langle w, x\rangle \le t$ and $x_i \in \{0,1\}$ for all $i \in [n]$.

Subset Sum
Input: A set of $n$ integers $\{w_1, \dots, w_n\}$ and an integer $t$.
Task: Decide whether there exist $x_1, \dots, x_n \in \{0,1\}$ such that $\sum_{i=1}^{n} x_i w_i = t$.
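For reference, a tiny brute-force Python check of the Subset Sum definition (the function name is ours and this is purely illustrative; it runs in exponential time, unlike the algorithms of this paper):

    from itertools import combinations

    def subset_sum(weights, t):
        """Return True iff some subset of `weights` sums to exactly t."""
        return any(sum(c) == t
                   for r in range(len(weights) + 1)
                   for c in combinations(weights, r))

    print(subset_sum([3, 34, 4, 12, 5, 2], 9))    # True  (4 + 5)
    print(subset_sum([3, 34, 4, 12, 5, 2], 30))   # False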