Explicit Extremal Designs and Applications to Extractors
Eshan Chattopadhyay ∗ Cornell University [email protected]
Jesse Goodman ∗ Cornell University [email protected]
July 16, 2020
Abstract
An (n, r, s)-design, or (n, r, s)-partial Steiner system, is an r-uniform hypergraph over n vertices with pairwise hyperedge intersections of size < s. An independent set in a hypergraph G is a subset of vertices covering no hyperedge, and its independence number α(G) is the size of its largest independent set. For all constants r ≥ s ∈ ℕ with r even, we explicitly construct (n, r, s)-designs (G_n)_{n∈ℕ} with independence number α(G_n) ≤ O(n^{2(r−s)/r}). This gives the first derandomization of a result by Rödl and Šinajová (Random Structures & Algorithms, 1994).

By combining our designs with a recent explicit construction of a leakage-resilient extractor that works for low entropy (Chattopadhyay et al., FOCS 2020), we obtain simple and significantly improved low-error explicit extractors for adversarial and small-space sources. In particular, for any constant δ > 0, we extract from (N, K, n, k)-adversarial sources of locality 0, where K ≥ N^δ and k ≥ polylog(n). The previous best result (Chattopadhyay et al., STOC 2020) required K ≥ N^{1/2+o(1)}. As a result, we get extractors for small-space sources over n bits with entropy requirement k ≥ n^{1/2+δ}, whereas the previous best result (Chattopadhyay et al., STOC 2020) required k ≥ n^{2/3+δ}.

∗ Supported by NSF grant CCF-1849899.
Introduction
Let G = (V, E) be an r-uniform hypergraph over n vertices. G is called an (n, r, s)-design, or (n, r, s)-partial Steiner system, if |e ∩ e′| < s for all distinct e, e′ ∈ E. In this paper, we study (n, r, s)-designs that are extremal in the sense that they have small independence number. In 1994, Rödl and Šinajová proved the first general theorem about the existence of such objects:
Theorem 1.1 ([RŠ94]). Given any n ≥ r ≥ s ∈ ℕ with r ≥ 2, there exists an (n, r, s)-design G with independence number α(G) ≤ C_{r,s} · n^{(r−s)/(r−1)} · (log n)^{1/(r−1)}, where C_{r,s} = C(r, s) depends only on r, s.

In fact, they also showed this result is tight up to the term C_{r,s} that depends only on r, s. In order to prove Theorem 1.1, Rödl and Šinajová used the Lovász Local Lemma [Spe77] to show that a random r-uniform hypergraph, where each hyperedge is included independently with some probability p, is an (n, r, s)-design with small independence number. Thus, while their result proves the existence of these extremal designs, it does not provide an explicit way to construct them. Furthermore, later work has focused on improving the term C_{r,s} [EV13, Eus13] or extending the result to more general types of designs [GPR95, KMV14, TL18], while still relying on probabilistic constructions.

To the best of our knowledge, there are no known explicit constructions of (n, r, s)-designs with a nontrivial upper bound on their independence number. However, such constructions are important in both extremal combinatorics and theoretical computer science: not only do they offer insight into the structure of these objects, but they also have recently found applications in the construction of randomness extractors [CGGL20]. Furthermore, explicit designs of a different extremal flavor have been used to construct pseudorandom generators in the seminal work of Nisan and Wigderson [NW94].

Thus, the primary goal - and first main contribution - of this paper is a derandomization of Theorem 1.1. Second (and third), we show that our explicit designs are strong enough to be used, along with a very recent object from [CGG+20], to significantly improve the state-of-the-art extractors for adversarial sources and small-space sources.

Randomness extractors

A randomness extractor is an object that can purify defective sources (distributions) of randomness found in nature.
It is motivated by the fact that most applications of randomness in computing require access to uniformly random bits, yet the random bits harvested from natural phenomena rarely look so pure. An extractor is formally defined as follows.

Definition 1.2.
Let 𝒳 be a family of sources over {0,1}^n. A function Ext : {0,1}^n → {0,1}^m is an extractor for 𝒳 with error ε if for every X ∈ 𝒳, |Ext(X) − U_m| ≤ ε, where U_m is the uniform distribution over {0,1}^m, and |·| denotes total variation distance.

(Footnote: Recall that an independent set in a hypergraph is a subset of vertices that contains no edge, and the independence number of a hypergraph is the size of its largest independent set.)

(Footnote: [NW94] uses (n, r, s)-designs that are extremal in the sense that they have a large number of hyperedges. It is known how to explicitly construct this type of extremal design; see, e.g., the survey [Vad12].)

Many different families 𝒳 and errors ε have been considered, depending on the motivating application. Our designs will give much improved extractors for two important classes 𝒳 of sources.

First, we will give improved extractors for the class 𝒳 of adversarial sources. Motivated by applications in generating a (cryptographic) common random string, and in harvesting randomness from unreliable sources, Chattopadhyay, Goodman, Goyal and Li [CGGL20] introduced this new class of sources:

Definition 1.3.
A source X over ({0,1}^n)^N is an (N, K, n, k)-adversarial source of locality 0 if it has the form X = (X_1, …, X_N), where each X_i is an independent source over {0,1}^n, and there exists some (unknown) subset of K good sources; i.e., there is some S ⊆ [N], |S| ≥ K, such that each X_j, j ∈ S, has min-entropy H_∞(X_j) ≥ k.

The class 𝒳 of adversarial sources generalizes several well-studied models, which will allow us to obtain improved extractors of another type, discussed next.

Second, we will give improved extractors for the class 𝒳 of small-space sources. Introduced by Kamp, Rao, Vadhan and Zuckerman [KRVZ06], each distribution in this class is defined to be samplable by some algorithm with limited memory. To define this class of sources formally, one first defines a branching program of width w and length n to be a directed acyclic graph with n + 1 layers, where the first layer (layer 0) has 1 node, the remaining layers have w nodes each, and such that an edge starting in layer i must terminate in layer i + 1. Then, one formally defines a small-space source as follows.

Definition 1.4.
A distribution X over {0,1}^n is a space s source if it is generated by a random walk, starting on the first layer, of a branching program of width 2^s and length n, where each edge is labeled with an output bit and some transition probability.

These sources were defined following a line of work, initiated by Trevisan and Vadhan [TV00], on extracting uniform bits from distributions that can be sampled by algorithms with some limited computational resource. The motivation behind sources of this type is the idea that since the universe has limited resources, it may be reasonable to assume that imperfect sources of randomness found in nature are generated in this way.

Given these definitions, we are now ready to state our main contributions.
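Before moving on, Definition 1.4 is easy to make concrete in code. The following is our own minimal sketch (not from the paper): a width-2, length-3 branching program (so space s = 1) whose random walk outputs the distribution uniform on {000, 111}, a source with min-entropy 1.

```python
import random

def sample_small_space_source(program, rng):
    """Sample one output by a random walk on a branching program.
    `program[i][v]` lists the (probability, output_bit, next_node) edges
    leaving node v of layer i; probabilities at each node sum to 1."""
    node, out = 0, []
    for layer in program:
        r, acc = rng.random(), 0.0
        for prob, bit, nxt in layer[node]:
            acc += prob
            if r <= acc:
                out.append(bit)
                node = nxt
                break
    return tuple(out)

# Layer 0 emits a uniform first bit; the remaining layers deterministically
# copy it, so the walk outputs 000 or 111 with probability 1/2 each.
layer0 = {0: [(0.5, 0, 0), (0.5, 1, 1)]}
copy_layer = {0: [(1.0, 0, 0)], 1: [(1.0, 1, 1)]}
program = [layer0, copy_layer, copy_layer]

samples = {sample_small_space_source(program, random.Random(i)) for i in range(50)}
print(sorted(samples))  # [(0, 0, 0), (1, 1, 1)]
```

The hypothetical `program` encoding above is just one convenient representation; any labeled DAG of the stated shape works.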
In our first main result, we explicitly construct (n, r, s)-designs with small independence number. In particular, we prove the following theorem.
Theorem 1.
There exists an Algorithm A such that given any n ≥ r ≥ s ∈ ℕ as input with r an even number, A runs in time poly((n choose r)) and outputs an (n, r, s)-design G with independence number α(G) ≤ C_{r,s} · n^{2(r−s)/r}, where C_{r,s} = C(r, s) depends only on r, s.

(Footnote: The min-entropy of a source X over {0,1}^n is defined as H_∞(X) := min_{x ∈ support(X)} log(1/Pr[X = x]).)

For all constants r ≥ s ∈ ℕ where r is even, Theorem 1 gives an explicit family of (n, r, s)-designs (G_n)_{n∈ℕ} with small independence number. The independence number of our explicit designs differs from Theorem 1.1 (known to be optimal, up to constants) by just a quadratic factor. Like the non-explicit designs of [RŠ94], our derandomization focuses on the case where r, s are constant. However, we also give meaningful results when r, s are not constant: we show that we may take C_{r,s} = C · r for some universal constant C > 0, and we argue that for almost all interesting settings of super-constant r, s, our algorithm still runs efficiently. We refer the reader to Remark 3.2 in Section 3, and the discussion thereafter, for more detail.

Our second main theorem gives much improved explicit extractors for (N, K, n, k)-adversarial sources of locality 0:
Theorem 2.
There is a universal constant
C > 0 such that for any fixed δ > 0 and all sufficiently large N, K, n, k ∈ ℕ satisfying k ≥ log^C(n) and K ≥ N^δ, there exists an explicit extractor Ext : ({0,1}^n)^N → {0,1}^m for (N, K, n, k)-adversarial sources of locality 0, with output length m = k^{Ω(1)} and error ε = 2^{−k^{Ω(1)}}.

Previously, the best extractor for this setting [CGGL20] required K ≥ N^{1/2+o(1)} good sources, and was quite involved. We believe our extractor to be much simpler, and furthermore our construction gives a new application of leakage-resilient extractors (LREs). The variant of LREs that we use here was studied in a recent line of work [KMS19, CGG+20], where such objects are known as extractors for cylinder intersections. Most relevant to us is the work of Chattopadhyay et al. [CGG+20], where they gave the first constructions of such objects for entropy k = o(n); in fact, their LREs work for entropy k ≥ polylog(n). Theorem 2 relies heavily on these low-entropy LREs.

Finally, our third main theorem gives new low-error explicit extractors for space s sources. Until recently, the best extractors for such sources [KRVZ06] required entropy k ≥ Cn^{1−γ}s^{γ}, where γ > 0 is a small constant and C is a large one. In [CGGL20], the entropy requirement was improved to k ≥ Cn^{2/3+δ}s^{1/3−δ}. We reduce this entropy requirement even further, and prove the following.

Theorem 3.
For any fixed δ ∈ (0, 1/2), there is a constant C > 0 such that for all n, k, s ∈ ℕ satisfying k ≥ Cn^{1/2+δ}s^{1/2−δ}, there exists an explicit extractor Ext : {0,1}^n → {0,1}^m for space s sources of min-entropy k, with output length m = n^{Ω(1)} and error ε = 2^{−n^{Ω(1)}}.

The line of improvements described above (from [KRVZ06] to [CGGL20] to Theorem 3) is strict, in terms of both entropy and space: we refer the reader to Remark 5.2 in Section 5 for more detail. Furthermore, the entropy requirement in Theorem 3 virtually reaches the known limit of current techniques in low-error extraction from small-space sources, and any further improvements (i.e., beyond k ≥ n^{1/2}) will require significantly new ideas (see Remark 5.7 in Section 5). Finally, we note that, along the way, we provide improved randomness extractors for a class of sources known as total-entropy sources (see Theorem 5.6).

We briefly sketch an overview of our techniques. We refer the reader to Section 2 for some basic preliminaries on notation and probability.
Explicit designs with small independence number
In Section 3, we give our explicit construction of designs with small independence number, proving Theorem 1. In order to construct our (n, r, s)-designs G = (V, E), we start with a linear code Q ⊆ F_2^n of distance d > 2(r − s), and then restrict it to the set Q_r ⊆ Q of elements in Q that have Hamming weight r. Our design G = (V, E) is constructed by identifying V with [n], and by creating a hyperedge for each x ∈ Q_r in the natural way. The distance of the code and the definition of Q_r immediately guarantee that G is an (n, r, s)-design.

In order to upper bound the independence number α(G) of our design, we observe that any independent set in G corresponds to a subcube S ⊆ F_2^n that contains no vector in Q of weight r; in other words, since Q is a linear code, this means that the subspace T* := S ∩ Q has no vector of Hamming weight r. If our linear code Q had very high dimension, then even if the subcube S was relatively small, we would have found a relatively large subspace T* containing no vector of Hamming weight r. But intuitively, it seems that as the dimension of a subspace grows large enough, at some point it must be guaranteed to have such a vector. It turns out this is true, and it follows immediately from Sidorenko's recent bounds [Sid18, Sid20] on the size of sets in F_2^n containing no r elements that sum to zero. Thus if Q has large enough dimension, S cannot be too large, and thus neither can α(G). All that remains is to explicitly construct (the weight-r vectors of) a high-dimensional linear code Q ⊆ F_2^n with distance d > 2(r − s), which can easily be done using BCH codes [BRC60, Hoc59].

Improved extractor for adversarial sources
In Section 4, we show how to combine our designs with leakage-resilient extractors (LREs) in order to obtain better extractors for adversarial sources, proving Theorem 2. Informally, an LRE for r sources offers the guarantee that its output looks uniform even conditioned on the output of many leakage functions, each called on up to r − 1 of the sources. Meanwhile, recall that an (N, K, n, k)-adversarial source of locality 0 consists of N independent sources, where only K of them are guaranteed to be "good" (i.e., contain some entropy). Given such a source, we use Theorem 1 to explicitly construct an (N, r, r − 1)-design G with small independence number α(G) < K, and we identify the vertices of our design with the N independent sources. Then, for each hyperedge in our design, we call a leakage-resilient extractor on the r sources it contains, and finish by taking the bitwise XOR over the outputs of the LRE calls.

This construction successfully outputs uniform bits for the following reasons. Because α(G) < K, some hyperedge must contain only good sources, and thus the corresponding LRE call outputs uniform bits. Then, the bounded pairwise intersections of the (N, r, r − 1)-design, combined with the leakage-resilience property of the LRE, guarantee that these uniform bits still look uniform even after taking their bitwise XOR with the outputs of all other LRE calls. Using these ideas, we actually provide a slightly more general framework to combine (N, r, s)-designs with LREs of various strength. Our framework leverages the "activation vs. fragile correlation" paradigm introduced in [CGGL20], yet it is able to do so in a much more simple, general, and effective way, by combining two very recent explicit pseudorandom objects: the LREs from [CGG+20], and the extremal designs from Theorem 1.

Improved extractor for small-space sources

In Section 5, we show how our adversarial source extractors give better extractors for total-entropy sources and small-space sources, and thereby prove Theorem 3. To do so, we simply import the standard reductions from small-space sources to total-entropy sources [KRVZ06], and from total-entropy sources to adversarial sources [CGGL20].
The results then follow immediately. This application of adversarial source extractors was first provided in [CGGL20], but we include the (slightly optimized) techniques here for completeness, i.e., to demonstrate how the parameters of our improved adversarial source extractors carry over to total-entropy and small-space sources.

We conclude with some remarks and present some open problems in Section 6.

Preliminaries

General notation

Given two strings x, y ∈ {0,1}^m, we let x ⊕ y denote their bitwise XOR. For a number n ∈ ℕ, [n] denotes the interval [1, n] ⊆ ℕ. We let ◦ denote string concatenation, and for a collection {x_i : i ∈ I} indexed by some finite set I, we let (x_i)_{i∈I} denote the concatenation of all strings x_i, i ∈ I. If I is already equipped with some total order, this is used to determine the concatenation order; otherwise, I is arbitrarily identified with [|I|] to induce a total ordering. Given a domain D and some string x ∈ D^N, we let x_i ∈ D denote the value at the i-th coordinate of x. Given a subset S ⊆ [N], we let x_S := (x_i)_{i∈S}. Even if D = R^n for some other domain R and number n ∈ ℕ, the definition of x_S ∈ D^{|S|} does not change.

Basic coding theory and extractor definitions

We let F_2 denote the finite field of size two, and we let F_2^n denote a vector space over this field. The Hamming weight of a vector x ∈ F_2^n is defined as Δ(x) := |{i ∈ [n] : x_i = 1}|, and the Hamming distance between two vectors x, y ∈ F_2^n is defined as Δ(x, y) := Δ(x − y), where the subtraction is over F_2^n. The standard basis of F_2^n is the collection E* := {e_i}_{i∈[n]}, where e_i ∈ F_2^n holds a 1 at coordinate i and 0 everywhere else, and a subcube is a subspace spanned by some subset of E*. An (n, k, d)-code is a subset Q ⊆ F_2^n of size 2^k with the guarantee that any two distinct points x, y ∈ Q have Hamming distance Δ(x, y) ≥ d. A linear [n, k, d]-code is simply an (n, k, d)-code that is also a subspace.
Finally, we say that a source X over {0,1}^n is an (n, k) source if it has min-entropy at least k, and we say that an extractor Ext is an N-source extractor for entropy k if it is an extractor for a family of sources 𝒳, where each X ∈ 𝒳 consists of N independent (n, k) sources.

Discrete probability

In general, for a random variable X : Ω → V, we are only concerned with the distribution over V induced by X. We will therefore typically not define the outcome space Ω, and can assume it has any form we like (so long as the distribution induced by X does not change). Given random variables X, Y and any y ∈ support(Y), we let (X | Y = y) denote a random variable that takes value x with probability Pr[X = x | Y = y]. Given a random variable X and a family of random variables 𝒴, we say that X is a convex combination of random variables from 𝒴 if there exists a random variable Z such that for each z ∈ support(Z), it holds that (X | Z = z) ∈ 𝒴.

We define the statistical distance between two random variables X, Y over V as

|X − Y| := max_{S⊆V} |Pr[X ∈ S] − Pr[Y ∈ S]| = (1/2) Σ_{v∈V} |Pr[X = v] − Pr[Y = v]|,

and we say that X, Y are ε-close if |X − Y| ≤ ε. Given these definitions, the following two standard facts are easy to show, and are extremely useful.

Fact 2.1. For any random variable X ∼ {0,1}^m and any constant c ∈ {0,1}^m, it holds that |X − U_m| = |(X ⊕ c) − U_m|.

Fact 2.2. For any random variables X, Y, where X ∼ {0,1}^m, it holds that |X − U_m| ≤ E_{y∼Y}[|(X | Y = y) − U_m|].

Finally, we will need the following standard lemma about conditional min-entropy.

Lemma 2.3 ([MW97]). Let X, Y be random variables such that Y can take at most ℓ values. Then for any ε > 0, it holds that Pr_{y∼Y}[H_∞(X | Y = y) ≥ H_∞(X) − log ℓ − log(1/ε)] ≥ 1 − ε.
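As a quick illustration of these definitions (our own sketch, not part of the paper), statistical distance and Fact 2.1 can be checked directly on a small distribution:

```python
from itertools import product

def tv_distance(p, q):
    """Statistical (total variation) distance: (1/2) * sum_v |p(v) - q(v)|,
    for distributions given as dicts mapping outcomes to probabilities."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def xor_const(p, c):
    """Distribution of X XOR c when X ~ p over bit-tuples."""
    return {tuple(x ^ b for x, b in zip(v, c)): pr for v, pr in p.items()}

m = 2
uniform = {v: 0.25 for v in product((0, 1), repeat=m)}
X = {(0, 0): 0.5, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.1}

# Fact 2.1: XOR-ing with a fixed constant permutes the outcomes, and the
# uniform distribution is invariant under permutations, so the distance
# to uniform is unchanged.
for c in product((0, 1), repeat=m):
    assert abs(tv_distance(X, uniform) - tv_distance(xor_const(X, c), uniform)) < 1e-12

print(tv_distance(X, uniform))  # 0.3
```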
Explicit extremal designs via slicing codes and zero-sum sets

In this section, we efficiently construct designs with small independence number, and obtain our main theorem about explicit designs:

Theorem 3.1 (Theorem 1, restated). There exists an Algorithm A such that given any n ≥ r ≥ s ∈ ℕ as input with r an even number, A runs in time poly((n choose r)) and outputs an (n, r, s)-design G with independence number

α(G) ≤ C_{r,s} · n^{2(r−s)/r},     (1)

where C_{r,s} = C · r for some universal constant C ≥ 1.

For all constants r ≥ s ∈ ℕ with r even, Theorem 3.1 constructs an explicit family of (n, r, s)-designs (G_n)_{n∈ℕ} with small independence number. Like the non-explicit designs of Theorem 1.1 from [RŠ94], our derandomization focuses on the case where r, s are constant. However, it turns out that even for most super-constant r, s, our algorithm is still efficient. In particular, before proving Theorem 3.1, we make (and quickly prove) the following remark.

Remark 3.2. Let A be the algorithm from Theorem 3.1, and let m = m(n, r, s) be the number of hyperedges in the design produced by A on input (n, r, s). Then for any functions r = r(n), s = s(n), Algorithm A is guaranteed to run in time poly(n, m) over the collection I = {(n, r(n), s(n))}_{n∈ℕ} as long as at least one of the following holds:

• The functions r, s are constant: r(n) = O(1) and s(n) = O(1); or
• There is a constant ε > 0 such that inequality (1) is bounded above by O(n^{1−ε}) for all (n, r, s) ∈ I.

The first bullet in Remark 3.2 reiterates the fact that the algorithm in Theorem 3.1 is efficient when r, s are constant. The second bullet gives a more general remark on the performance of Algorithm A on super-constant r, s: it says that as long as Theorem 3.1 gave a "non-trivial" bound on the independence number in the first place, then the algorithm will run efficiently.
This effectively covers all "interesting" regimes of r, s: indeed, the main application of selecting non-constant r, s would be to achieve independence bounds that are stronger than those achieved by constant r, s (and any constant r, s that achieve α(G) = o(n) in Theorem 3.1 in fact achieve the second bullet).

To prove that Algorithm A is efficient given the condition in the second bullet, we use standard bounds on Turán numbers. The Turán number T(n, β, r) is defined as the fewest number of edges in an r-uniform hypergraph with no independent set of size β, and it is known [Sid95] that T(n, β, r) ≥ (n choose r) / (β choose r). Thus, the second bullet implies that the number of edges, m = m(n, r, s), in the design is at least

T(n, Cn^{1−ε} + 1, r) ≥ T(n, n^{1−ε/2}, r) ≥ (n choose r) / (n^{1−ε/2} choose r) ≥ (n choose r) / (n choose r)^{1−ε/2} = (n choose r)^{ε/2},

where we use the observation that the Turán number is non-increasing in its second argument, the fact that we can assume n, r are sufficiently large (since otherwise the efficiency claim is trivial), and a simple application of Stirling's formula. Thus, Algorithm A runs in time poly((n choose r)) = poly(n, m). In fact, since we gave a lower bound on m based on the independence number, it trivially holds that any algorithm that achieves independence numbers as small as A must output m edges, meaning that the runtime of A is optimal up to constant powers. This completes our discussion of Remark 3.2.

We now turn to proving Theorem 3.1. We start with the simple observation that hypergraphs over n vertices can be identified with subsets of F_2^n. In particular, any subset T ⊆ F_2^n induces a hypergraph G_T = (V, E) in the following way: identify V with [n], and for each x ∈ T add a hyperedge e ⊆ [n] to E that contains exactly the coordinates that take the value 1 in x.
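The correspondence T ↦ G_T, together with the code-slicing strategy carried out below, can be illustrated concretely. The following sketch (our own toy example, not code from the paper) slices the [7, 4, 3] Hamming code at Hamming weight r = 4; since the distance 3 exceeds 2(r − s) = 2 for s = 3, the resulting hypergraph is a (7, 4, 3)-design:

```python
from itertools import product, combinations

# Generator matrix of the [7, 4, 3] Hamming code (rows span the code).
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def codewords(G):
    """Enumerate all codewords spanned by the rows of G over F_2."""
    n = len(G[0])
    for msg in product((0, 1), repeat=len(G)):
        yield tuple(sum(m * g[j] for m, g in zip(msg, G)) % 2 for j in range(n))

# Slice the code at weight r: each weight-r codeword becomes a hyperedge,
# whose vertices are the coordinates holding a 1.
r, s = 4, 3
edges = [frozenset(i for i, b in enumerate(c) if b)
         for c in codewords(G) if sum(c) == r]

# Distance 3 > 2(r - s) forces pairwise hyperedge intersections of size < s.
assert all(len(e & f) < s for e, f in combinations(edges, 2))
print(len(edges))  # 7 hyperedges of a (7, 4, 3)-design
```

(The Hamming code is exactly the t = 1 BCH code used later in the proof of Theorem 3.1, here with m = 3.)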
Using this correspondence, we can instead focus on constructing special subsets of F_2^n, and thereby leverage the tools of linear algebra and coding theory.

To obtain our designs, we will need to explicitly construct a subset T ⊆ F_2^n such that (1) G_T is an (n, r, s)-design; and (2) G_T has small independence number. We can make sure this happens via the following two simple facts, which describe how these hypergraph properties can be identified with properties of subsets of F_2^n.

Fact 3.3. For any subset T ⊆ F_2^n, the hypergraph G_T is an (n, r, s)-design if and only if (i) every x ∈ T has Δ(x) = r; and (ii) any two distinct x, y ∈ T have Δ(x, y) > 2(r − s).

Proof. The two conditions are sufficient because the first one guarantees that G_T will be r-uniform, and the second one guarantees that any two edges in G_T intersect at < s points. They are both necessary because if the first does not hold, G_T will not be r-uniform, and if the first holds but the second does not, then two edges will end up sharing ≥ s points.

Fact 3.4. For any subset T ⊆ F_2^n, the hypergraph G_T has independence number α(G_T) < ℓ if and only if every subcube A ⊆ F_2^n of dimension at least ℓ has at least one point in T.

Proof. If α(G_T) ≥ ℓ, there is an independent set S ⊆ V = [n] of size at least ℓ, and thus the subcube A := span({e_i}_{i∈S}) of dimension ≥ ℓ has no points in T. If there is a subcube A ⊆ F_2^n of dimension ℓ with no points in T, the set S ⊆ [n] indexing the standard basis vectors that span A must have size ℓ and constitute an independent set in G_T.

By Fact 3.3 and Fact 3.4, we see that the task of constructing an (n, r, s)-design G with small independence number is equivalent to the task of constructing a subset T ⊆ F_2^n with the following three properties:

1. T lies in the Hamming slice Δ_r := {x ∈ F_2^n : Δ(x) = r},
2. Points in T have pairwise Hamming distance > 2(r − s), and
3.
Any subcube of relatively small dimension intersects T.

In order to construct a set T ⊆ F_2^n with these three properties, we use connections to coding theory and zero-sum problems. In particular, recall that an (n, k, d)-code is a subset Q ⊆ F_2^n of size 2^k with the guarantee that any two distinct points x, y ∈ Q have Hamming distance Δ(x, y) ≥ d. Thus, if we take any (n, k, d)-code Q ⊆ F_2^n with d > 2(r − s) and intersect it with the Hamming slice Δ_r, we obtain a set T = Q ∩ Δ_r that enjoys properties (1) and (2). In order to endow it with property (3), we will need to start with some code Q such that for any relatively large subcube S, the set S ∩ T = S ∩ (Q ∩ Δ_r) = (S ∩ Q) ∩ Δ_r is non-empty.

The trick here is to start with a linear code Q. A linear [n, k, d]-code Q ⊆ F_2^n is simply an (n, k, d)-code that is also a subspace. The condition (S ∩ Q) ∩ Δ_r ≠ ∅ required for property (3) now becomes more concrete: since Q is a subspace, S ∩ Q is also a subspace, and thus we can make sure it contains some vector of Hamming weight r as long as we can show that every large subspace contains such a vector. In particular, defining Λ_r(n) to be the dimension of the largest subspace R ⊆ F_2^n containing no vector of Hamming weight r, we prove the following lemma.

Lemma 3.5. If Q ⊆ F_2^n is a linear [n, k, d]-code with d > 2(r − s), then the hypergraph G_{Q∩Δ_r} is an (n, r, s)-design with independence number α = α(G_{Q∩Δ_r}) that obeys the following inequality:

α − Λ_r(α) ≤ n − k.

Proof. It follows immediately from Fact 3.3 that G_{Q∩Δ_r} is an (n, r, s)-design. By Fact 3.4, there is a subcube A = span(e_{i_1}, …, e_{i_α}) ⊆ F_2^n of dimension α that doesn't intersect Q ∩ Δ_r. Thus, if we define A′ := A ∩ Q, then A′ contains no vector of Hamming weight r, and furthermore it has dimension dim(A′) = dim(A ∩ Q) ≥ dim(A) + dim(Q) − n = α + k − n. Notice now that if we define the projection π : F_2^n → F_2^α as the map (x_1, …, x_n) ↦ (x_{i_1}, …, x_{i_α}), then the subset π(A′) is still a subspace (albeit now of F_2^α) of dimension dim(π(A′)) ≥ α + k − n containing no vector of Hamming weight r. Thus, by definition of Λ_r, it must hold that α + k − n ≤ dim(π(A′)) ≤ Λ_r(α).

In order to construct an (n, r, s)-design from Lemma 3.5 with the smallest possible independence number α, we will want an explicit linear [n, k, d > 2(r − s)]-code with the largest possible dimension k, along with a strong upper bound on Λ_r(n). We start with the latter.

Getting a good upper bound on Λ_r(n) is closely related to the theory of zero-sum problems. In this field, one parameter of great interest is the (generalized) Erdős–Ginzburg–Ziv constant(s) of a finite abelian group. Given n ≥ r ∈ ℕ where r is even, this parameter is defined for F_2^n as the smallest integer s_r(n) such that any sequence of s_r(n) values in F_2^n contains a subsequence of length r that sums to zero. For our application, it will be more convenient to use an almost identical parameter β_r(n), defined as the size of the largest subset of F_2^n containing no r elements that sum to zero. Using slightly different terminology, the relationship between β_r(n) and Λ_r(n) was shown in [Sid20]. We include it here, in our language, for completeness.

Lemma 3.6 ([Sid20]). For every n ≥ r ∈ ℕ where r is even, β_r(n − Λ_r(n)) ≥ n.

Proof. Let R ⊆ F_2^n be a subspace of dimension k := Λ_r(n), and define d := n − k. Let v_1, …, v_d be a basis for the orthogonal complement of R, and define the matrix M ∈ F_2^{d×n} so that its i-th row is v_i. Notice that R contains exactly the solutions to Mx = 0, and thus R has a vector of Hamming weight r if and only if there are r columns in M that sum to zero. By definition of Λ_r(n), we know R has no such vector, and thus n ≤ β_r(d) = β_r(n − Λ_r(n)).

To get a good upper bound on Λ_r(n), we need a good upper bound on β_r(n).
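As an aside, the parity-check characterization used in the proof of Lemma 3.6 (R = {x : Mx = 0} has a weight-r vector iff some r columns of M sum to zero) is easy to test by brute force on tiny examples; a small illustrative sketch of ours:

```python
from itertools import combinations

def has_weight_r_vector(H, n, r):
    """Decide whether the subspace {x in F_2^n : Hx = 0} contains a vector of
    Hamming weight exactly r, by checking whether some r columns of H sum to
    zero over F_2 (the columns indexed by the support of such a vector)."""
    cols = [tuple(row[j] for row in H) for j in range(n)]
    return any(
        all(sum(c[i] for c in chosen) % 2 == 0 for i in range(len(H)))
        for chosen in combinations(cols, r)
    )

# Example: H = [1 1 1 1] cuts out the even-weight subspace of F_2^4, which
# contains vectors of every even weight and none of odd weight.
H = [[1, 1, 1, 1]]
print([r for r in range(5) if has_weight_r_vector(H, 4, r)])  # [0, 2, 4]
```

This exhaustive check is exponential in r and only meant to make the characterization tangible; the point of Sidorenko's bounds is precisely to avoid such case analysis.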
In 2018, Sidorenko provided a very strong bound of this type:

Theorem 3.7 ([Sid18], Theorem 4.4). There is a universal constant C > 0 such that for every n, r ∈ ℕ where r is even, β_r(n) ≤ C · r · 2^{2n/r}.

By plugging this bound into Lemma 3.6, we get the following corollary.

Corollary 3.8 ([Sid20]). There is a universal constant C > 0 such that for any n ≥ r ∈ ℕ where r is even, the largest subspace S ⊆ F_2^n with no vector of Hamming weight r has dimension Λ_r(n) ≤ n − (r log n − r log r − r log C)/2.

We are finally ready to prove our main design lemma, which reduces the problem of constructing (n, r, s)-designs with small independence number to the problem of constructing high-dimensional linear codes.

Lemma 3.9 (Main design lemma). There is a universal constant C > 0 such that for every n ≥ r ≥ s with r even, if Q ⊆ F_2^n is a linear [n, k, d]-code with d > 2(r − s), then G_{Q∩Δ_r} is an (n, r, s)-design with independence number α(G_{Q∩Δ_r}) ≤ C · r · 2^{2(n−k)/r}.

Proof. Simply plug the bound on Λ_r(α) from Corollary 3.8 into Lemma 3.5.

To complete the proof of Theorem 3.1, we now just need to explicitly construct a linear code with very high dimension. In 1959-1960, Bose, Ray-Chaudhuri [BRC60], and Hocquenghem [Hoc59] explicitly constructed codes of exactly this type (see [GB10] for a great exposition of these codes, which are known as BCH codes). In particular, they proved the following theorem.

Theorem 3.10 ([BRC60, Hoc59]). For every m, t ∈ ℕ, there exists an [n, k, d]-linear code BCH_{m,t} ⊆ F_2^n with block length n = 2^m − 1, dimension k ≥ n − mt, and distance d > 2t. Furthermore, there exists an Algorithm B that given any m, t ∈ ℕ and x ∈ F_2^n as input, checks if x ∈ BCH_{m,t} in poly(n) time.

By instantiating Lemma 3.9 with Theorem 3.10, we can finally prove Theorem 3.1.

Proof of Theorem 3.1. We start by assuming that n = 2^m − 1 for some m ∈ ℕ.
Then, we let t = r − s, and use Theorem 3.10 to define the [n, k, d]-linear code Q := BCH_{m,t} ⊆ F_2^n, where k ≥ n − mt = n − m(r − s) and d > 2t = 2(r − s). By Lemma 3.9, we know that G_{Q∩Δ_r} is an (n, r, s)-design with independence number

α(G_{Q∩Δ_r}) ≤ C · r · 2^{2(n−k)/r} ≤ C · r · 2^{2mt/r} = C · r · (2^m)^{2(r−s)/r} ≤ 3C · r · n^{2(r−s)/r}.

Note that G_{Q∩Δ_r} can be constructed in poly((n choose r)) time if Q ∩ Δ_r can be constructed in poly((n choose r)) time, and this can be done by simply checking (and appropriately including) whether each of the (n choose r) elements in Δ_r belongs to Q, using Algorithm B from Theorem 3.10.

If n is of the form 2^m, we can follow the previous procedure to draw hyperedges around the first n − 1 vertices, which increases the independence number by at most 1, and we again obtain α(G) ≤ 3C · r · n^{2(r−s)/r} (after slightly increasing the universal constant C).

If n is not of the form 2^m − 1 or 2^m, then it can be written as a sum x_0 · 2^0 + ⋯ + x_d · 2^d over ℕ, where d = ⌈log n⌉ and each x_i ∈ {0, 1}. We can then follow the most recent procedure to construct a graph G_i over 2^i vertices separately for each nonzero x_i. The final graph G = ∪_i G_i is clearly still an (n, r, s)-design, and it has independence number

α(G) = Σ_{i : x_i = 1} α(G_i) ≤ Σ_{0 ≤ i ≤ ⌈log n⌉} 3C · r · (2^i)^{2(r−s)/r} = 3C · r · ((2^{⌈log n⌉+1})^{2(r−s)/r} − 1) / (2^{2(r−s)/r} − 1).

It is straightforward to verify that for a large enough universal constant C′, the above expression is bounded above by C′ · r · n^{2(r−s)/r}, which completes the proof.

Improved extractors for adversarial sources

In this section, we will show how to combine our designs from Section 3 with a very recent object, known as a leakage-resilient extractor (LRE), in order to obtain improved extractors for 0-local adversarial sources (as defined by Definition 1.3). We will prove our second main theorem:

Theorem 4.1 (Theorem 2, restated).
There is a universal constant C > 0 such that for any fixed δ > 0 and all sufficiently large N, K, n, k ∈ ℕ satisfying k ≥ log^C(n) and K ≥ N^δ, there exists an explicit extractor Ext : ({0,1}^n)^N → {0,1}^m for (N, K, n, k)-adversarial sources of locality 0, with output length m = k^{Ω(1)} and error ε = 2^{−k^{Ω(1)}}.

Previously, the best explicit extractor for this setting was constructed by Chattopadhyay et al. [CGGL20], and required K ≥ N^{1/2+o(1)} good sources. On the other hand, it is easy to give a non-explicit extractor that requires just K ≥ 2 good sources. Thus, while our explicit constructions greatly improve the state-of-the-art (and most notably break the "√N barrier"), there is still a lot of room for improvement. Further improvement, however, will require significantly new techniques.

Our construction leverages the "activation vs. fragile correlation" paradigm introduced in [CGGL20] for extracting from adversarial sources. This paradigm was first introduced in an attempt to construct a low-error extractor for (N, K, n, k)-adversarial sources of locality 0, given just k ≥ polylog(n) entropy and as few good sources, K, as possible. Since there exists a three-source extractor Ext for k ≥ polylog(n) entropy and exponentially small error [Li15], a natural idea is to somehow employ this object as a subroutine. Using this idea, [CGGL20] proposed an extractor for adversarial sources that works as follows. Given as input an adversarial source X = (X_1, …, X_N), the extractor carefully selects triples of sources, calls Ext over each triple, and XORs the results. [CGGL20] argued that this procedure outputs uniform bits as long as the following two properties hold:

1. Activation: some Ext call is activated, i.e., only given good sources as input.

2.
Fragile correlation: fixing the (XOR of the) output of all other Ext calls does not affect the output of the activated Ext call (with high probability).

It is not hard to see why these conditions suffice: activation guarantees that some Ext call outputs uniform bits, while fragile correlation guarantees that these uniform bits will be propagated through to the overall output of the extractor (by Fact 2.1 and Fact 2.2). Thus, the main challenge considered in [CGGL20] is determining how to select triples such that activation and fragile correlation are guaranteed.

The key idea in [CGGL20] is to select triples using the hyperedges of a 3-uniform hypergraph, G = (V, E). Then, we know that activation is guaranteed as long as the good sources cover some hyperedge e ∈ E, which is guaranteed to happen whenever K > α(G). In order to ensure fragile correlation, [CGGL20] observed that it suffices to require that G = (V, E) is a partial Steiner triple system, also known as an (N, 3, 2)-design: such a design guarantees that any other Ext call shares at most one source with the activated Ext call. Thus, if we start by fixing all sources that are not inputs to the activated Ext call, it is then easy to fix the outputs of all other Ext calls without introducing correlation between the inputs to the activated call. Furthermore, by Lemma 2.3, we can show that this process barely decreases the entropy of the inputs to the activated call, and thus its output remains uniform.

We have shown that the construction above provides a low-error extractor for (N, K, n, k)-adversarial sources of locality 0, where k ≥ polylog n and K > α(G). Thus, the goal becomes to explicitly construct an (N, 3, 2)-design G = (V, E) with small independence number. Chattopadhyay et al. [CGGL20] construct such an object with α(G) < N^{0.923}, and thus gave an explicit extractor when there are K ≥ N^{0.923} good sources.
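The triples-and-XOR outer construction described above is easy to sketch in code. The following toy example is our own illustration (the particular hypergraph and the stand-in subroutine are not taken from [CGGL20]): it builds a small partial Steiner triple system from the lines of the affine plane AG(2, 3), verifies the (9, 3, 2)-design property, and XORs a placeholder three-input function over the hyperedges. In the real construction, the placeholder would be replaced by the three-source extractor Ext of [Li15].

```python
from itertools import product, combinations
from functools import reduce

def affine_plane_triples(q=3):
    """Lines of the affine plane AG(2, q): a (q^2, q, 2)-design,
    since any two distinct lines meet in at most one point."""
    points = list(product(range(q), repeat=2))
    index = {p: i for i, p in enumerate(points)}
    lines = set()
    for a in product(range(q), repeat=2):          # direction vector
        if a == (0, 0):
            continue
        for b in points:                           # base point
            lines.add(frozenset(index[((b[0] + t * a[0]) % q,
                                       (b[1] + t * a[1]) % q)]
                                for t in range(q)))
    return len(points), sorted(sorted(l) for l in lines)

def is_design(edges, r, s):
    """(n, r, s)-design check: r-uniform, pairwise intersections of size < s."""
    return (all(len(e) == r for e in edges) and
            all(len(set(e) & set(f)) < s for e, f in combinations(edges, 2)))

def xor_over_triples(edges, ext3, sources):
    """The outer extractor: XOR a 3-source subroutine over every hyperedge."""
    return reduce(lambda x, y: x ^ y,
                  (ext3(*(sources[i] for i in e)) for e in edges))

n, triples = affine_plane_triples(3)
assert is_design(triples, r=3, s=2)                # a (9, 3, 2)-design with 12 lines
toy_ext3 = lambda a, b, c: a ^ b ^ c               # placeholder; NOT a real extractor
output = xor_over_triples(triples, toy_ext3, list(range(n)))
```

Here the per-triple XOR is purely wiring for illustration; it has none of the extraction properties that the analysis of [CGGL20] requires of the subroutine.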
In order to improve this requirement on K, it is natural to try to construct an (N, 3, 2)-design with an even smaller independence number. (The non-explicit extractor mentioned above calls an optimal two-source extractor over every pair of sources in the adversarial source, and takes the XOR of the results [CL16]. To see why this works, we refer the reader to a similar proof sketch for a slightly more involved construction, provided in the following paragraphs.) However, every (N, 3, 2)-design has independence number at least roughly √N, so even an optimal design could only yield an extractor for K ≥ N^{1/2+o(1)} good sources.

Chattopadhyay et al. [CGGL20] take a different approach. They observed that the above extractor, based on (N, 3, 2)-designs, is actually more robust than required: if Ext is called on good sources X_1, X_2, X_3, then the output still looks uniform even conditioned on the output of several functions, each acting on just one of X_1, X_2, X_3. They suggested that if one could construct a three-source extractor with stronger conditioning properties, then perhaps they could start with a 3-uniform hypergraph from a more general class H than (N, 3, 2)-designs, construct a hypergraph in H with small independence number, and thereby improve the requirement on K.

By applying a strong two-source condenser from [BACDTS19], Chattopadhyay et al. [CGGL20] explicitly construct more robust three-source (and multi-source) extractors with stronger conditioning properties. By combining these objects with various types of explicit hypergraphs, they successfully reduce the requirement on good sources from K ≥ N^{0.923} to K ≥ N^{1/2+o(1)}. Unfortunately, however, the conditioning properties of their robust subroutine extractors are extremely specific, and as a result they can only be combined with very specialized types of hypergraphs. These hypergraphs offer no clean generalization of (N, 3, 2)-designs, and it is not clear how to use them to further improve the requirement on K (and, in particular, break the "√N barrier").

If one hopes to significantly improve K, it appears that one would need a multi-source extractor with even stronger conditioning properties to use as a subroutine. Recently, exactly such an object was defined in [CGG+20], known as a leakage-resilient extractor (LRE). LREs are very general objects with extremely strong conditioning properties.
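To see how stringent such conditioning properties are, consider the following brute-force experiment (our own toy sketch; none of this code comes from [CGG+20] or [KMS19]). It enumerates three uniform one-bit sources and measures the statistical distance between the extractor output and a fresh uniform bit, jointly with the leaks, in the style of the leakage experiment formalized in Definition 4.2 below. Plain XOR of the three bits is perfectly uniform with no leakage, but fails completely once each leak may read a single source:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

def stat_dist(p, q):
    """Statistical (total variation) distance between distributions as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, Fraction(0)) - q.get(k, Fraction(0))) for k in keys) / 2

def leakage_distance(lre, r, leaks):
    """Brute-force | lre(X) o leaks(X) - U_1 o leaks(X) | for r uniform
    one-bit sources X, where each leak is a function of the source tuple."""
    real, ideal = Counter(), Counter()
    xs = list(product((0, 1), repeat=r))
    w = Fraction(1, len(xs))
    for x in xs:
        lk = tuple(f(x) for f in leaks)
        real[(lre(x), lk)] += w
        for u in (0, 1):                  # independent, truly uniform output bit
            ideal[(u, lk)] += w / 2
    return stat_dist(real, ideal)

xor3 = lambda x: x[0] ^ x[1] ^ x[2]
no_leaks = leakage_distance(xor3, 3, [])                 # distance 0: XOR is uniform
one_bit_leaks = [lambda x, i=i: x[i] for i in range(3)]  # each leak sees one source
leaked = leakage_distance(xor3, 3, one_bit_leaks)        # distance 1/2: fully leaked
```

The jump from distance 0 to distance 1/2 shows why nontrivial constructions are needed: even a function of perfectly uniform inputs can be completely determined by leaks that each see fewer than s of its inputs.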
The exact variant that will be useful here is actually a specialization known as extractors for cylinder intersections, first introduced in [KMS19]. Informally, we define an (r, s)-leakage-resilient extractor to be an r-source extractor LRE that outputs bits that look uniform, even conditioned on the output of several functions that each act on fewer than s of the inputs to LRE. Formally, it is defined as follows.

Definition 4.2 ([KMS19, CGG+20]). A function LRE : ({0,1}^n)^r → {0,1}^m is an (r, s)-leakage-resilient extractor for entropy k and error ε if the following holds. Let X := (X_1, ..., X_r) be any r independent (n, k) sources, let T := ([r] choose s−1), and let L := {Leak_T : ({0,1}^n)^{s−1} → {0,1}^m}_{T∈T} be any collection of functions. Then:

|LRE(X) ∘ (Leak_T(X_T))_{T∈T} − U_m ∘ (Leak_T(X_T))_{T∈T}| ≤ ε.

Given such a robust extractor, it is now easy to generalize the original extractor of [CGGL20] in a clean, natural way: instead of calling a three-source extractor over the hyperedges of an (N, 3, 2)-design, we will call an (r, s)-leakage-resilient extractor over the hyperedges of an (N, r, s)-design and XOR the results. Once again, we can ensure activation as long as the number of good sources, K, exceeds the independence number of the design. On the other hand, instead of using Lemma 2.3 to ensure fragile correlation, we simply use the leakage-resilience of our leakage-resilient extractor: to see why this works, simply observe that an (N, r, s)-design guarantees that the intersection of two hyperedges has size < s, while a leakage-resilient extractor guarantees to output uniform bits even conditioned on several leaks that each act on < s of its inputs.

Formally, we prove the following lemma, which provides a framework for combining leakage-resilient extractors with general designs in order to extract from adversarial sources.

Lemma 4.3.
Let G = ([N], E) be an (N, r, s)-design with independence number α, and let Ext : ({0,1}^n)^r → {0,1}^m be an (r, s)-leakage-resilient extractor for entropy k_0 with error ε_0. Then for any K > α and k ≥ k_0, the function Ext_G : ({0,1}^n)^N → {0,1}^m defined as

Ext_G(X) := ⊕_{e ∈ E(G)} Ext(X_e)

is an extractor for (N, K, n, k)-adversarial sources of locality 0 with error ε = ε_0.

Proof. Let X be an (N, K, n, k)-adversarial source. We must show that |Ext_G(X) − U_m| ≤ ε_0. Because K > α, there is some e* ∈ E containing only good sources, i.e., X_i has entropy at least k for each i ∈ e*. Without loss of generality, we assume e* = [r]. We now fix all other sources Z_0 = (X_j)_{j ∉ e*}, using Fact 2.2:

|Ext_G(X) − U_m| ≤ E_{z_0 ∼ Z_0}[ |(Ext_G(X) | Z_0 = z_0) − U_m| ].

Consider any z_0 = (x_j)_{j ∉ e*}. For each e ∈ E(G), we define the restriction Ext_e : ({0,1}^n)^{|e∩e*|} → {0,1}^m as Ext_e(Y_1, ..., Y_{|e∩e*|}) = Ext(Y_1, ..., Y_{|e∩e*|}, (x_j)_{j ∈ e∖e*}), so that we may write

(Ext_G(X) | Z_0 = z_0) = ⊕_{e ∈ E(G)} Ext_e(X_{e∩e*}) = Ext(X_{e*}) ⊕ ⊕_{e ∈ E(G)∖{e*}} Ext_e(X_{e∩e*}).

Because G is an (N, r, s)-design, any two edges share at most s − 1 vertices. Thus, we may partition E(G) ∖ {e*} into (r choose s−1) sets, depending on the intersection behavior of each edge with e*. In particular, for each S ∈ (e* choose s−1), we define:

W_S := {e ∈ E : e ∩ e* ⊆ S}.

If any e ∈ E ends up in more than one W_S, we simply remove it from all but one of these sets. Now, for each S ∈ (e* choose s−1), we define Leak_S : ({0,1}^n)^{s−1} → {0,1}^m such that for any X ∈ ({0,1}^n)^N, Leak_S(X_S) = ⊕_{e ∈ W_S} Ext_e(X_{e∩e*}), which is a valid definition because e ∩ e* is always contained in S, by definition of W_S. We may now write

(Ext_G(X) | Z_0 = z_0) = Ext(X_{e*}) ⊕ ⊕_{S ∈ (e* choose s−1)} Leak_S(X_S).   (2)
To bound the distance of this random variable from uniform, we now define the second random variable we will fix, Z_1 := (Leak_S(X_S))_{S ∈ (e* choose s−1)}. Fixing this random variable, we have:

|(Ext_G(X) | Z_0 = z_0) − U_m| ≤ E_{z_1 ∼ Z_1}[ |(Ext_G(X) | Z_0 = z_0, Z_1 = z_1) − U_m| ]
= E_{z_1 ∼ Z_1}[ |(Ext(X_{e*}) | Z_1 = z_1) − U_m| ]
= |Ext(X_{e*}) ∘ Z_1 − U_m ∘ Z_1|,

where the first and last (in)equalities follow easily from the definition of statistical distance, and the second (in)equality follows from Equation (2) and the fact that XORing a random variable with a constant does not change its distance from uniform. But notice that by definition of Z_1 and the leakage-resilience of Ext, this quantity is bounded above by ε_0, which completes the proof.

In order to highlight the generality of this framework, we observe that by Lemma 2.3, a standard three-source extractor is, in fact, a (3, 2)-leakage-resilient extractor. Thus, by setting r = 3, s = 2, we recover the original extractor and analysis of [CGGL20]. Even better, since Theorem 1.1 tells us that the independence number α of an (N, r, s)-design decreases quickly as r, s grow large together, we see that Lemma 4.3 offers a concrete way to construct extractors for adversarial sources with far fewer good sources, K.

If we want to realize the above plan, we need two explicit objects. First, we need explicit (N, r, s)-designs with independence numbers that decrease quickly as r, s grow together. Theorem 1 of the current paper gives exactly this, and in fact the independence numbers of our designs decrease with r, s almost as quickly as possible, as shown by the tightness of Theorem 1.1.

Second, we need explicit leakage-resilient extractors for polylogarithmic entropy that have exponentially small error. Very recently, these exact objects were constructed:

Theorem 4.4 ([CGG+20]).
There is a universal constant C > 0 such that for any sufficiently large constant r ∈ ℕ and all n, k ∈ ℕ satisfying k ≥ log^C n, there exists an explicit (r, r−1)-leakage-resilient extractor Ext : ({0,1}^n)^r → {0,1}^m for min-entropy k with output length m = k^{Ω(1)} and error ε = 2^{−k^{Ω(1)}}.

By combining these explicit LREs with our explicit designs, we can finally prove Theorem 4.1, which significantly improves the adversarial source extractors of [CGGL20].

Proof of Theorem 4.1. Let C be the same universal constant from Theorem 4.4, and let r ∈ ℕ be a sufficiently large (even) constant such that 2/r < δ, and such that Theorem 4.4 guarantees the existence of an explicit (r, r−1)-leakage-resilient extractor Ext : ({0,1}^n)^r → {0,1}^m for min-entropy k ≥ log^C n with output length m = k^{Ω(1)} and error ε = 2^{−k^{Ω(1)}}. For sufficiently large N ∈ ℕ, Theorem 1 guarantees the existence of an (N, r, r−1)-design G with independence number α < N^δ that is computable in poly((N choose r)) = poly(N) time. The result now follows by instantiating Lemma 4.3 with Ext and G.

In this section, we will show how to use our extractors from Section 4 to obtain better extractors for small-space sources (as defined by Definition 1.4). We will prove our third main theorem:

Theorem 5.1 (Theorem 3, restated). For any fixed δ ∈ (0, 1/2), there is a constant C > 0 such that for all n, k, s ∈ ℕ satisfying k ≥ C · n^{1/2+δ} · s^{1/2−δ}, there exists an explicit extractor Ext : {0,1}^n → {0,1}^m for space s sources of min-entropy k, with output length m = n^{Ω(1)} and error ε = 2^{−n^{Ω(1)}}.

Until very recently, the best explicit extractor for this setting [KRVZ06] required entropy k ≥ C · n^{1−γ} · s^γ, where γ > 0 is a tiny constant and C is a large one. In [CGGL20], this requirement was significantly improved to k ≥ C · n^{2/3+δ} · s^{1/3−δ}, for an arbitrarily small constant δ > 0, and the current paper (Theorem 5.1) further improves this to k ≥ C · n^{1/2+δ} · s^{1/2−δ}. We note that this line of improvements is strict in terms of both entropy and space, via the following remark.
Remark 5.2. For all s = s(n), the requirement k ≥ C · n^{1−ε} · s^ε strictly decreases as ε grows. This is because the requirement is trivial whenever s ≥ n; we may therefore assume s < n, and so shifting the weight of the cumulative power from n to s strictly improves the requirement.

Non-constructively, it is known [KRVZ06] that there exist extractors for space s sources of min-entropy k ≥ O(s + log n) that have error ε = 2^{−Ω(k)}. Furthermore, in the large-error setting, Chattopadhyay and Li [CL16] constructed an extractor for small-space sources that requires just k ≥ n^{o(1)} entropy, but has error ε = n^{−Ω(1)}. Thus, while Theorem 5.1 significantly improves the state-of-the-art in low-error extraction, there is still a lot of room for improvement. However, we note (in Remark 5.7) that any substantial improvements to our low-error extractors will require significantly new techniques.

We now proceed to prove Theorem 5.1. The techniques that follow, which will reduce the task of extracting from small-space sources to the task of extracting from adversarial sources, are just slightly optimized versions of the exact arguments that appear in [CGGL20]. However, we include them here for completeness.

The first step is to reduce small-space sources to a class of sources known as total-entropy sources, defined as follows.

Definition 5.3. A random variable X over ({0,1}^ℓ)^r is an (r, ℓ, k)-total entropy source if X = (X_1, X_2, ..., X_r), where each X_i is an independent source over {0,1}^ℓ, and Σ_{i∈[r]} H∞(X_i) ≥ k.

In [KRVZ06], Kamp et al. showed that upon fixing a few positions in the random walk that generates a small-space source X, it is straightforward to use Lemma 2.3 to show that X becomes a total-entropy source, with high probability. We include the proof for completeness.

Lemma 5.4 ([KRVZ06]). Let X be a space s source over {0,1}^n with min-entropy k.
Then for any α ∈ (0, 1/4] such that r = αk/s and ℓ = ns/(αk) are positive integers, it holds that X is 2^{−k/4}-close to a convex combination of (r, ℓ, k/2)-total entropy sources.

Proof. For each i ∈ [n], let W_i ∼ {0,1}^s be the random variable denoting the state reached in layer i of the branching program in the random walk that generates X. Observe that fixing any W_i breaks X into two independent sources. More generally, observe that if we define W* := (W_{iℓ+1})_{i ∈ [0, r−1]}, then if we condition X on any fixing of W*, it must hold that X becomes an (r, ℓ, Γ)-total entropy source, for some Γ. Furthermore, by Lemma 2.3, we know

Pr_{w ∼ W*}[ H∞(X | W* = w) ≥ k − rs − k/4 = k − αk − k/4 ≥ k/2 ] ≥ 1 − 2^{−k/4}.   (3)

Thus, the random variable (X | W* = w) is an (r, ℓ, k/2)-total entropy source with probability at least 1 − 2^{−k/4} over w ∼ W*, which completes the proof.

The next step is to show that a total-entropy source looks like an adversarial source of locality 0, using a standard Markov-type argument:

Lemma 5.5. Let X be an (r, ℓ, Γ)-total entropy source. Then for any N, K, n, k ∈ ℕ with Nn = rℓ and n a multiple of ℓ, X is also an (N, K, n, k)-adversarial source of locality 0, as long as Kn + Nk ≤ Γ.

Proof. By definition of total-entropy source, X = (X_1, ..., X_r), where each X_i is an independent source over {0,1}^ℓ. By collecting the sources X_i into N buckets containing n/ℓ sources each, we see that X is also an (N, n, Γ)-total entropy source, and may be rewritten as X = (X′_1, ..., X′_N), where each X′_i is an independent source over {0,1}^n. If X were not an (N, K, n, k)-adversarial source of locality 0, then at most K − 1 of the sources X′_i have entropy at least k; these each have entropy at most n, and the remaining sources each have entropy < k. This yields Γ ≤ (K − 1)n + (N − (K − 1))k < Kn + Nk, contradicting the given lower bound on Γ.

Given the above reduction, we can now use our improved adversarial source extractors (from Theorem 2) to give improved extractors for total-entropy sources.

Theorem 5.6.
For any fixed δ > 0 and all sufficiently large r, ℓ, Γ ∈ ℕ with Γ ≥ max{(rℓ)^{1/2+δ}, r^δ · ℓ}, there exists an explicit extractor Ext : ({0,1}^ℓ)^r → {0,1}^m for (r, ℓ, Γ)-total entropy sources, with output length m = (rℓ)^{Ω(1)} and error ε = 2^{−(rℓ)^{Ω(1)}}.

Proof. Fix any N, n ∈ ℕ such that Nn = rℓ and n is a multiple of ℓ. By Lemma 5.5, every (r, ℓ, Γ)-total entropy source is also an (N, K, n, k)-adversarial source of locality 0, provided Kn + Nk ≤ Γ. Thus, by Theorem 2, for any fixed δ_0 > 0 there is an explicit extractor Ext : ({0,1}^ℓ)^r → {0,1}^m for (r, ℓ, Γ)-total entropy sources with output length m = n^{Ω(1)} and error ε = 2^{−n^{Ω(1)}}, provided N^{δ_0} · n + N · n^{δ_0} ≤ Γ and N, n are sufficiently large. To achieve the parameters claimed in the theorem statement, pick δ_0 = δ/2 and N, n as follows: (i) if r ≥ ℓ, set N = n = √(rℓ); (ii) if r < ℓ, set N = r and n = ℓ. We conclude by remarking that this casework was motivated by trying to minimize the requirement on Γ by setting N = n. This is not possible in case (ii), but is possible in case (i) by assuming, without loss of generality, that r = x² · ℓ for some x ∈ ℕ.

Previously, the best low-error explicit extractors for total-entropy sources [CGGL20] required entropy Γ ≥ max{(rℓ)^{2/3+δ}, r^{1/2+δ} · ℓ}. Non-constructively, we know it is possible [KRVZ06] to achieve an entropy requirement of Γ ≥ O(ℓ + log r) and error of 2^{−Ω(Γ)}. Thus, while there is still a lot of room to give improved explicit extractors for total-entropy sources, we remark that the entropy requirement in Theorem 5.6 is, in fact, not far from optimal when ℓ ≫ r.

Finally, we show how to combine our improved explicit extractors for total-entropy sources (Theorem 5.6) with the standard reduction from small-space sources to total-entropy sources (Lemma 5.4) to complete the proof of Theorem 5.1:

Proof of Theorem 5.1.
Fix any δ ∈ (0, 1/2), let α ∈ (0, 1/4] be a sufficiently small constant, and let C > 0 be a sufficiently large constant. Given a space s source X over {0,1}^n with entropy k ≥ C · n^{1/2+δ} · s^{1/2−δ}, we know by Lemma 5.4 that X is ε_1 = 2^{−k/4}-close to a convex combination of (r, ℓ, k/2)-total entropy sources, where r = αk/s and ℓ = ns/(αk). (Here we assume r, ℓ ∈ ℕ, but it is easy to extend the argument when this is not the case.) In particular, this means there is some random variable Y such that with probability at least 1 − ε_1 over y ∼ Y, the random variable (X | Y = y) is an (r, ℓ, k/2)-total entropy source.

Let Ext : ({0,1}^ℓ)^r → {0,1}^m be the extractor from Theorem 5.6 for such total-entropy sources. We will argue that it is also an extractor for the small-space source X. Notice we have

|Ext(X) − U_m| ≤ E_{y ∼ Y}[ |Ext(X | Y = y) − U_m| ] ≤ ε_1 + |Ext(X′) − U_m|,

where X′ is some (r, ℓ, k/2)-total entropy source. Thus, if we can show that r, ℓ, k/2 are sufficiently large and k/2 ≥ max{(rℓ)^{1/2+δ}, r^δ · ℓ}, then Theorem 5.6 tells us that

|Ext(X) − U_m| ≤ ε_1 + |Ext(X′) − U_m| ≤ 2^{−k/4} + 2^{−(rℓ)^{Ω(1)}} = 2^{−n^{Ω(1)}}

and m = (rℓ)^{Ω(1)} = n^{Ω(1)}, which would prove the current theorem. We know that r, ℓ, k/2 are sufficiently large because r = αk/s ≥ αC(n/s)^{1/2+δ} ≥ αC, and ℓ = ns/(αk) ≥ 1/α, and k ≥ C, where α is sufficiently small and C is sufficiently large. Next, we know k/2 ≥ (rℓ)^{1/2+δ} = n^{1/2+δ} by the provided lower bound on k. Finally, to show k/2 ≥ r^δ · ℓ = (αk/s)^δ · ns/(αk), rearrange the inequality to obtain k^{2−δ} ≥ 2α^{δ−1}s^{1−δ}n, plug in the provided lower bound on k to obtain (C · n^{1/2+δ} · s^{1/2−δ})^{2−δ} ≥ 2α^{δ−1}s^{1−δ}n, and observe that it therefore suffices to show

(0.5 · C^{2−δ} · α^{1−δ}) · n^{(1/2+δ)(2−δ)−1} ≥ s^{1−δ−(2−δ)(1/2−δ)},

or rather

(0.5 · C^{2−δ} · α^{1−δ}) · n^{2δ−δ/2−δ²} ≥ s^{2δ−δ/2−δ²}.

This holds because n ≥ s (otherwise the provided lower bound on k gives k > n), because 2δ − δ/2 − δ² ≥ 0 for all δ ∈ (0, 1/2), and because C is sufficiently large.

We conclude this section with a remark about the √n "barrier" in small-space extraction.

Remark 5.7.
Any significant improvement to our extractors for small-space sources, presented in Theorem 5.1, would require significantly new techniques. In particular, if one wishes to construct extractors that can handle min-entropy k ≥ √n, it is not possible to use the standard reduction from small-space sources to total-entropy sources appearing in the proof of Lemma 5.4. This is because if k = √n, then either ℓ > √n, or not. If ℓ > √n, then all the entropy could lie in one chunk of length ℓ > k, and it is impossible to extract from one source. If ℓ ≤ √n, then r ≥ √n, and the application of Lemma 2.3 in Equation (3) always leaves the source (X | W* = w) with 0 entropy, from which extraction is impossible.

In this paper, we give the first derandomization of Rödl and Šinajová's probabilistic designs [RŠ94], and show how they can be used to get better extractors for adversarial sources, total-entropy sources, and small-space sources. The three most natural open problems are as follows, which ask for improvements on each of our three main theorems (Theorems 1 to 3, respectively).

Problem 1. Better explicit designs: improve the constant in the power of n of Theorem 1 from 2 to 1.99. Alternatively, extend Theorem 1 to also work when r is odd.

Problem 2. Better extractors for adversarial sources: improve the requirement on good sources in Theorem 2 from K ≥ N^δ to K ≥ N^{o(1)}, or even K ≥ polylog N.

Problem 3. Better extractors for small-space sources: improve the requirement on min-entropy in Theorem 3 from k ≥ n^{1/2+δ} to k ≥ n^{0.49}.

An answer to Problem 1 would provide explicit designs for regimes in which we currently only have trivial bounds (i.e., when r is odd or r ≥ 2s), while an answer to Problem 2 would likely require extractors that have stronger conditioning properties than even our best leakage-resilient extractors. Finally, an answer to Problem 3 might require an idea that bypasses a reduction to total-entropy sources, as explained in Remark 5.7.
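To make the quantitative target of Problem 1 concrete, the following numeric sanity check (our own illustration; the constants C = 1.0 and C2 = 20.0 are arbitrary stand-ins for the universal constants C and C′ from the proof of Theorem 1) verifies the gluing step used there: summing the per-block bounds 3·C·r·(2^i)^{2(r−s)/r} over the binary decomposition of n stays below C2·r·n^{2(r−s)/r}.

```python
def glued_alpha_bound(n, r, s, C=1.0):
    """Sum the per-block independence-number bounds 3*C*r*(2**i)**(2(r-s)/r)
    over the set bits of the binary decomposition n = sum of 2**i."""
    e = 2 * (r - s) / r
    set_bits = [i for i in range(n.bit_length()) if (n >> i) & 1]
    return sum(3 * C * r * (2 ** i) ** e for i in set_bits)

def direct_bound(n, r, s, C2):
    """The target bound C2 * r * n**(2(r-s)/r) for a fixed constant C2."""
    return C2 * r * n ** (2 * (r - s) / r)

# The glued bound never exceeds the direct one for a modest stand-in constant,
# over a range of n and several (r, s) pairs with r even and r < 2s.
ok = all(glued_alpha_bound(n, r, s) <= direct_bound(n, r, s, C2=20.0)
         for n in range(1, 4097)
         for (r, s) in [(4, 3), (6, 4), (8, 7)])
```

This only checks the geometric-sum step numerically; improving the exponent 2(r−s)/r itself (as Problem 1 asks) would require changing the underlying construction, not the gluing.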
In general, any significant progress on these problems should require significantly new techniques, which would be interesting in their own right.

References

[BACDTS19] Avraham Ben-Aroya, Gil Cohen, Dean Doron, and Amnon Ta-Shma. Two-source condensers with low error and small entropy gap via entropy-resilient functions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.

[BRC60] Raj Chandra Bose and Dwijendra K. Ray-Chaudhuri. On a class of error correcting binary group codes. Information and Control, 3(1):68–79, 1960.

[CGG+20] Eshan Chattopadhyay, Jesse Goodman, Vipul Goyal, Ashutosh Kumar, Xin Li, Raghu Meka, and David Zuckerman. Extractors and secret sharing against bounded collusion protocols. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS, to appear). IEEE, 2020.

[CGGL20] Eshan Chattopadhyay, Jesse Goodman, Vipul Goyal, and Xin Li. Extractors for adversarial sources via extremal hypergraphs. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, pages 1184–1197, New York, NY, USA, 2020. Association for Computing Machinery.

[CL16] Eshan Chattopadhyay and Xin Li. Extractors for sumset sources. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, pages 299–311. ACM, 2016.

[Eus13] Alexander Eustis. Hypergraph Independence Numbers. PhD thesis, UC San Diego, 2013.

[EV13] Alex Eustis and Jacques Verstraëte. On the independence number of Steiner systems. Combinatorics, Probability & Computing, 22(2):241–252, 2013.

[GB10] Venkatesan Guruswami and Eric Blais. Notes 6: Reed-Solomon, BCH, Reed-Muller and concatenated codes. Introduction to Coding Theory, CMU, Spring 2010.

[GPR95] David A. Grable, Kevin T. Phelps, and Vojtěch Rödl. The minimum independence number for designs. Combinatorica, 15(2):175–185, 1995.

[Hoc59] Alexis Hocquenghem.
Codes correcteurs d'erreurs. Chiffres, 2(2):147–156, 1959.

[KMS19] Ashutosh Kumar, Raghu Meka, and Amit Sahai. Leakage-resilient secret sharing against colluding parties. In Proceedings of the 60th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 636–660. IEEE, 2019.

[KMV14] Alexandr Kostochka, Dhruv Mubayi, and Jacques Verstraëte. On independent sets in hypergraphs. Random Structures & Algorithms, 44(2):224–239, 2014.

[KRVZ06] Jesse Kamp, Anup Rao, Salil Vadhan, and David Zuckerman. Deterministic extractors for small-space sources. In Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, pages 691–700. ACM, 2006.

[Li15] Xin Li. Three-source extractors for polylogarithmic min-entropy. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 863–882. IEEE, 2015.

[MW97] Ueli Maurer and Stefan Wolf. Privacy amplification secure against active adversaries. In Annual International Cryptology Conference, pages 307–321. Springer, 1997.

[NW94] Noam Nisan and Avi Wigderson. Hardness vs randomness. Journal of Computer and System Sciences, 49(2):149–167, 1994.

[RŠ94] Vojtěch Rödl and Edita Šinajová. Note on independent sets in Steiner systems. Random Structures & Algorithms, 5(1):183–190, 1994.

[Sha11] Ronen Shaltiel. An introduction to randomness extractors. In International Colloquium on Automata, Languages, and Programming, pages 21–41. Springer, 2011.

[Sid95] Alexander Sidorenko. What we know and what we do not know about Turán numbers. Graphs and Combinatorics, 11(2):179–199, 1995.

[Sid18] Alexander Sidorenko. Extremal problems on the hypercube and the codegree Turán density of complete r-graphs. SIAM Journal on Discrete Mathematics, 32(4):2667–2674, 2018.

[Sid20] Alexander Sidorenko. On generalized Erdős–Ginzburg–Ziv constants for Z_2^d. Journal of Combinatorial Theory, Series A, 174:105254, 2020.

[Spe77] Joel Spencer. Asymptotic lower bounds for Ramsey functions. Discrete Mathematics, 20:69–76, 1977.

[TL18] Fang Tian and Zi-Long Liu. Bounding the independence number in some (n, k, ℓ, λ)-hypergraphs.
Graphs and Combinatorics, 34(5):845–861, 2018.

[TV00] Luca Trevisan and Salil Vadhan. Extracting randomness from samplable distributions. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, pages 32–42. IEEE, 2000.

[Vad12] Salil Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1–3):1–336, 2012.