Configuration-avoiding sets on the Euclidean space and the sphere
Davi Castro-Silva ∗
University of Cologne
February 22, 2021
Abstract
Given finite configurations $P_1, \dots, P_n \subset \mathbb{R}^d$, let us denote by $m_{\mathbb{R}^d}(P_1, \dots, P_n)$ the maximum density a set $A \subseteq \mathbb{R}^d$ can have without containing congruent copies of any $P_i$. In this paper we will study this geometrical parameter, called the independence density of the considered configurations, and give several results we believe are interesting. For instance we show that, under suitable size and non-degeneracy conditions, $m_{\mathbb{R}^d}(t_1 P_1, t_2 P_2, \dots, t_n P_n)$ tends to $\prod_{i=1}^n m_{\mathbb{R}^d}(P_i)$ as the ratios between consecutive dilation parameters $t_{i+1}/t_i$ grow large; this shows an exponential decay of the density when forbidding multiple dilates of a given configuration, and gives a common generalization of theorems by Bourgain and by Bukh in geometric Ramsey theory. We also consider the analogous parameter $m_{S^d}(P_1, \dots, P_n)$ on the more complicated framework of sets on the unit sphere $S^d$, obtaining the corresponding results in this setting.

1 Introduction

Let us start our considerations with a beautiful theorem of Bourgain [2] from 1986:
Theorem 1.
Let $A \subset \mathbb{R}^d$ be a set of positive upper density and $P \subset \mathbb{R}^d$ be a set of $d$ points spanning a $(d-1)$-dimensional hyperplane. There exists some number $l_0$ such that $A$ contains a congruent copy of $l \cdot P$ whenever $l \ge l_0$.

This theorem generalizes an earlier result obtained by Furstenberg, Katznelson and Weiss [10], which gives the same conclusion but for two-point configurations (or equivalently for distances realized on the set $A$).

While Furstenberg, Katznelson and Weiss' theorem was proven using ergodic theoretic methods and was thus irrevocably 'infinitary', Bourgain's proof of Theorem 1 makes use of Fourier analytic methods and proceeds by first establishing the stronger and more 'quantitative' result given in Lemma 1 below. For convenience, we shall say that a configuration $P \subset \mathbb{R}^d$ is admissible if it has at most $d$ points and spans a $(|P|-1)$-dimensional hyperplane.

Lemma 1.
Let $P \subset [0,1]^d$ be an admissible configuration and let $\varepsilon > 0$. There exists $J = J(\varepsilon, P) > 0$ such that, for every set $A \subset [0,1]^d$ of measure at least $\varepsilon$ and every sequence $(t_j)_{j \ge 1} \subset (0,1)$ satisfying $t_{j+1} \le t_j/2$ for $j \ge 1$, the set $A$ must contain a congruent copy of $t_j P$ for some $1 \le j \le J$.

[∗ Electronic address: [email protected]]

[Footnote to the paragraph preceding Lemma 1: The strategy of attacking qualitative problems through quantitative methods is quite typical of Bourgain's work. We refer the reader to Tao's paper [16] for an excellent discussion on this point, and for a clear exposition of Bourgain's proof of the Furstenberg-Katznelson-Weiss theorem.]

In other words, a set having a given positive density can only avoid boundedly many genuinely distinct dilates $t_1 P, \dots, t_n P$ of an admissible configuration $P$. This motivates the introduction of the independence density of a given family of configurations $P_1, P_2, \dots, P_n \subset \mathbb{R}^d$, denoted $m_{\mathbb{R}^d}(P_1, P_2, \dots, P_n)$, as the maximum density of a set $A \subset \mathbb{R}^d$ which does not contain a congruent copy of any of these configurations (we will give a more precise definition later). We similarly define $m_{[0,1]^d}(P_1, P_2, \dots, P_n)$ for when the set $A$ considered is restricted to lie in the cube $[0,1]^d$.

Such functions are very natural geometrical parameters, and have been extensively studied in the case of two-point configurations; in this case they are characterized by the distance $r$ between their points and thus denoted $m_{\mathbb{R}^d}(r_1, \dots, r_n)$ for forbidden distances $r_1, \dots, r_n > 0$. The independence density $m_{\mathbb{R}^d}(1)$ for a single forbidden distance is related to the measurable chromatic number $\chi_m(\mathbb{R}^d)$ by the easy inequality $\chi_m(\mathbb{R}^d) \ge 1/m_{\mathbb{R}^d}(1)$, and it was in the context of studying the chromatic number of geometric graphs that the independence density was first defined by Székely [14]. We refer the reader to [15] for a survey of results related to the independence density of (a collection of) distances and to the measurable chromatic number of $\mathbb{R}^d$; for the best-known bounds on $\chi_m(\mathbb{R}^d)$ and $m_{\mathbb{R}^d}(1)$ we refer the reader to [1] and [5]. In a more recent paper (thus not mentioned in [15]), Bukh [3] showed that $m_{\mathbb{R}^d}(r_1, \dots, r_n)$ tends to $m_{\mathbb{R}^d}(1)^n$ as the ratios $r_{j+1}/r_j$ between consecutive distances get large.

[Footnote: The measurable chromatic number of $\mathbb{R}^d$ is the minimum number of measurable sets needed to partition $\mathbb{R}^d$ so that no two points belonging to the same part are at distance 1 from each other.]

A more combinatorial perspective on our notion of independence density (and the one which explains its denomination) is that $m_{\mathbb{R}^d}(P_1, P_2, \dots, P_n)$ is the natural analogue of the independence number for the (infinite) geometrical hypergraph on $\mathbb{R}^d$ whose edges are all isometric copies of $P_j$, $1 \le j \le n$. This perspective will be quite useful for us later, as one of the main steps in our proofs is the result that such geometrical hypergraphs have a nice supersaturation property: if a set $A$ is just slightly denser than the independence density, then it must contain a positive proportion of all edges.

Using the notation now introduced, one can restate the conclusion of Theorem 1 as the assertion that $m_{\mathbb{R}^d}((l_j P)_{j \ge 1}) = 0$ for all unbounded positive sequences $(l_j)_{j \ge 1}$. Similarly, Lemma 1 asserts that $m_{[0,1]^d}(t_1 P, t_2 P, \dots, t_J P) < \varepsilon$ holds for some $J = J(\varepsilon, P)$ uniformly over all scales $0 < t_1, \dots, t_J < 1$ satisfying $t_{j+1} \le t_j/2$, $1 \le j < J$.

Seen in this light, these results might inspire several further natural questions. For instance:

(Q1) What possible values can be taken by the independence density $m_{\mathbb{R}^d}(t_1 P, t_2 P, \dots, t_n P)$ of $n$ distinct dilates of a given configuration $P$?
(Q2) What is the rate of decay of $m_{\mathbb{R}^d}(t_1 P, t_2 P, \dots, t_n P)$ with $n$ as the ratios $t_{j+1}/t_j$ between consecutive scales get large?
(Q3) Are there analogous results which are valid for other (non-Euclidean) spaces?

The study of these three questions will be the main focus of the present paper.

In Section 2 we will more formally define the independence density of a family of configurations, both for the entire space $\mathbb{R}^d$ and for bounded cubes on $\mathbb{R}^d$, and start [...] Counting Lemma (Lemma 5) and a Supersaturation Theorem (Theorem 2), both of which are conceptually similar to results of the same name in graph and hypergraph theory (see [13, 12, 9]). Intuitively, the Counting Lemma says that the count of admissible configurations inside a given set does not significantly change if we blur the set a little; this will be proven by Fourier-analytic methods. The Supersaturation Theorem (as already mentioned) says that any bounded set $A \subseteq [-R, R]^d$, which is just slightly denser than the independence density of an admissible configuration $P$, must necessarily contain a positive proportion of all congruent copies of $P$ lying in $[-R, R]^d$. This will be proven in Section 2 by using Bukh's combinatorial arguments in conjunction with our Counting Lemma, and will be re-proven in Section 4 by more functional-analytic means (but giving up any control on the bounds obtained).

We will then use these tools to answer questions (Q1) and (Q2) in the case where the considered configuration $P$ is admissible (the same condition as that of Bourgain's results). Regarding question (Q1) we show that, by forbidding $n$ distinct dilates of such a configuration $P$, we can obtain as independence density any real number strictly between $m_{\mathbb{R}^d}(P)^n$ and $m_{\mathbb{R}^d}(P)$, but none smaller than $m_{\mathbb{R}^d}(P)^n$ or larger than $m_{\mathbb{R}^d}(P)$. As for question (Q2), we show that $m_{\mathbb{R}^d}(t_1 P, t_2 P, \dots, t_n P)$ tends to $m_{\mathbb{R}^d}(P)^n$ as the ratios $t_{j+1}/t_j$ get large; this generalizes Bukh's aforementioned result from two-point configurations to $k$-point configurations with $k \le d$, and easily implies the two results of Bourgain already discussed.

In Section 3 we will consider these same questions but related to the more complicated case of sets on the sphere $S^d$.
We will also present (and prove) spherical analogues of Theorem 1 and Lemma 1; this is in line with our question (Q3), as the sphere is the most well-studied non-Euclidean space.

Many of the arguments from the Euclidean setting will be used again in the spherical setting (in particular the reliance on our two main 'combinatorial' tools), but there are also some complications we need to solve that are intrinsic to the sphere. One of them is that harmonic analysis is much more complicated on $S^d$ than it is on $\mathbb{R}^d$, which makes our proof of the spherical Counting Lemma correspondingly harder and more technical than its Euclidean counterpart. Moreover, due to the lack of dilation invariance in the spherical setting, we will not be able to obtain an answer for its analogue of question (Q1) (and the answer to question (Q2) will be somewhat more intricate).

In Section 4 we give several further results related to configuration-avoiding sets and associated parameters, both in the spherical setting and in the Euclidean setting. We will prove:

- A couple of continuity properties of the function which counts copies of a given configuration $P$ on measurable sets (namely $L^{|P|}$ continuity and, if $P$ is admissible, continuity on the weak∗ topology of $L^\infty$);
- A 'zero-measure removal' lemma stating: either a set has a positive proportion of all possible copies of $P$, or it can be made $P$-avoiding by removing a zero-measure subset. This immediately implies a weak supersaturation property which holds [...]

[Footnote to the answer to question (Q1) described above: Whether the boundary values $m_{\mathbb{R}^d}(P)^n$ and $m_{\mathbb{R}^d}(P)$ can be attained is not yet clear.]
The following notational convention is used: if there is (say) some function $f$ which appears in the statement of Lemma 1, Theorem 2 or Corollary 3, in proofs of later results we will refer to this function by writing $f^{(L1)}$, $f^{(T2)}$ or $f^{(C3)}$, respectively. Dependencies on certain parameters will be indicated by using these parameters as indices; it is particularly important to note which parameters are not used as indices of some given constant or function, as this will mean the constant or function under consideration can be chosen independently of those parameters (unless explicitly stated otherwise, of course).

The averaging notation $\mathbb{E}_{x \in X}$ is used to denote the expectation when the variable $x$ is distributed uniformly over the set $X$. When $X$ is (a subset of) a compact group $G$ this measure is (the restriction of) the normalized Haar measure on $G$, which is the unique Borel probability measure on $G$ which is invariant by both left- and right-actions of this group. Similarly, we write $\mathbb{P}_{x \in X}$ to denote the probability under this same distribution.

The same denomination will be used for both a set and its indicator function; for instance, if we are given $A \subseteq \mathbb{R}^d$, then $A(x) = 1$ if $x \in A$ and $A(x) = 0$ otherwise. Given a group $G$ acting on some space $X$ and an element $x$ of this space, we write $\mathrm{Stab}_G(x) := \{g \in G : g.x = x\}$ for the stabilizer subgroup of $x$.

Throughout this section we shall fix an integer $d \ge 2$ [...] $d$-dimensional Euclidean space $\mathbb{R}^d$, equipped with its usual inner product $x \cdot y$ and associated Euclidean norm $\|x\|$. We denote by $m$ the Lebesgue measure on $\mathbb{R}^d$ and by $\mu$ the normalized Haar measure on the orthogonal group $\mathrm{O}(d) = \{O \in \mathbb{R}^{d \times d} : O^t O = I\}$.

Given $x \in \mathbb{R}^d$ and $R > 0$, we denote by $Q(x, R)$ the axis-parallel open cube of side length $R$ centered at $x$. We write $d_{Q(x,R)}(A) := m(A \cap Q(x, R))/R^d$ for the density of $A \subseteq \mathbb{R}^d$ inside the cube $Q(x, R)$. The upper density of a measurable set $A \subseteq \mathbb{R}^d$ is defined as
$$\overline{d}(A) := \limsup_{R \to \infty} \frac{m(A \cap Q(0, R))}{R^d};$$
if the limit exists, we shall instead denote it by $d(A)$.

A configuration $P$ is just a finite subset of $\mathbb{R}^d$, and we define its diameter $\operatorname{diam} P$ as the largest distance between two of its points. Recall that a configuration $P \subset \mathbb{R}^d$ on $k$ points is said to be admissible if $k \le d$ and if $P$ is non-degenerate (that is, if it spans a $(k-1)$-dimensional hyperplane) [...] $\mathbb{R}^d$ with at most $d$ elements.

We say that two configurations $P, Q \subset \mathbb{R}^d$ are congruent, and write $P \simeq Q$, if they can be made equal using only rigid transformations; that is, $P \simeq Q$ if and only if there exist $x \in \mathbb{R}^d$ and $T \in \mathrm{O}(d)$ such that $P = x + T \cdot Q$. Given a configuration $P \subset \mathbb{R}^d$, we say that a set $A \subseteq \mathbb{R}^d$ contains no copies of $P$, or that $A$ avoids $P$, if there is no subset of $A$ which is congruent to $P$.

We can now formally define our main object of study in this section, the independence density of a configuration or family of configurations. There are in fact two closely related versions of this parameter we will need, depending on whether we are considering bounded or unbounded configuration-avoiding sets. Given $n \ge 1$ configurations $P_1, \dots, P_n \subset \mathbb{R}^d$, we then define the quantities
$$m_{\mathbb{R}^d}(P_1, \dots, P_n) := \sup\bigl\{\overline{d}(A) : A \subset \mathbb{R}^d \text{ avoids } P_i \text{ for all } 1 \le i \le n\bigr\}$$
and
$$m_{Q(0,R)}(P_1, \dots, P_n) := \sup\bigl\{d_{Q(0,R)}(A) : A \subset Q(0, R) \text{ avoids } P_i \text{ for all } 1 \le i \le n\bigr\}.$$

Remark.
For the sake of clarity and notational convenience, whenever possible the results we give about independence density will be stated and proved in the case of only one forbidden configuration. It can be easily verified that these results also hold in the case of several (but finitely many) forbidden configurations, with essentially unchanged proofs. Whenever we need this greater generality we will mention how the corresponding statement would read in the case of several configurations.

We start our investigations by proving a simple lemma which relates the two versions of independence density just defined:
Lemma 2.
For all finite configurations $P \subset \mathbb{R}^d$ and all $R > 0$ we have
$$m_{Q(0,R)}(P) \left(1 + \frac{\operatorname{diam} P}{R}\right)^{-d} \le m_{\mathbb{R}^d}(P) \le m_{Q(0,R)}(P).$$

Proof.
For the first inequality, suppose $A \subseteq Q(0, R)$ is any set avoiding $P$ and consider the periodic set $A' := A + (R + \operatorname{diam} P)\mathbb{Z}^d$. This set also avoids $P$, and it has density
$$d(A') = \frac{m(A)}{(R + \operatorname{diam} P)^d} = d_{Q(0,R)}(A) \left(1 + \frac{\operatorname{diam} P}{R}\right)^{-d}.$$
Since we can choose $d_{Q(0,R)}(A)$ arbitrarily close to $m_{Q(0,R)}(P)$, the leftmost inequality follows.

Now let $A \subseteq \mathbb{R}^d$ be any set avoiding $P$, and note that $A \cap Q(x, R)$ also avoids $P$ for every $x \in \mathbb{R}^d$. By fixing $\varepsilon > 0$ and averaging over all $x$ inside a large enough cube $Q(0, R')$ (depending on $A$, $\operatorname{diam} P$ and $\varepsilon$), we conclude there is $x \in \mathbb{R}^d$ for which $m(A \cap Q(x, R)) > (\overline{d}(A) - \varepsilon) R^d$. The rightmost inequality follows.

[Footnote to the definition of congruence: One could also restrict this notion and allow only translations and rotations when defining congruence, giving rise to formally different but very closely related definitions and problems as the ones we consider. All methods used and results obtained in this paper work exactly the same way in this case, with only notational differences.]

As we are interested in the study of sets avoiding certain configurations, it is useful also to have a way of counting how many such configurations there are in a given set. Thus, for a given configuration $P = \{v_1, v_2, \dots, v_k\} \subset \mathbb{R}^d$ and a measurable set $A \subseteq \mathbb{R}^d$ we define
$$I_P(A) := \int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} A(x + T v_1) A(x + T v_2) \cdots A(x + T v_k) \, d\mu(T) \, dx,$$
[...] $P$ are contained in $A$. This quantity $I_P(A)$ can of course be infinite if the set $A$ is unbounded, but we will use it almost exclusively for bounded sets.
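As an illustration of the counting function $I_P$ (a sketch added here, not taken from the source), consider a two-point configuration $P = \{v_1, v_2\}$. The symbols $r := \|v_1 - v_2\|$, $\omega$ and $\sigma$ (the normalized surface measure on the unit sphere $S^{d-1} \subset \mathbb{R}^d$) are notation assumed only for this example.

```latex
% Two-point case P = {v_1, v_2}, with r := \|v_1 - v_2\| (illustrative notation).
\begin{align*}
I_P(A)
 &= \int_{\mathbb{R}^d}\int_{\mathrm{O}(d)} A(x+Tv_1)\,A(x+Tv_2)\,d\mu(T)\,dx \\
 &= \int_{\mathbb{R}^d}\int_{\mathrm{O}(d)} A(x)\,A\bigl(x+T(v_2-v_1)\bigr)\,d\mu(T)\,dx
   && \text{(for each fixed $T$, substitute $x \mapsto x - Tv_1$)} \\
 &= \int_{\mathbb{R}^d} A(x)\int_{S^{d-1}} A(x+r\omega)\,d\sigma(\omega)\,dx
   && \text{($T(v_2-v_1)$ is uniform on the sphere of radius $r$)}
\end{align*}
```

In this case $I_P(A)$ depends on $P$ only through the distance $r$, and $I_P(A) = 0$ whenever $A$ avoids $P$, since the integrand then vanishes for every $x$ and $T$.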
We similarly define its weighted version
$$I_P(f) := \int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} f(x + T v_1) f(x + T v_2) \cdots f(x + T v_k) \, d\mu(T) \, dx$$
whenever $f : \mathbb{R}^d \to \mathbb{R}$ is a measurable function for which this integral makes sense.

Following Bukh [3], for each $\delta > 0$ and $\gamma > 0$ we define the zooming-out operator $Z_\delta(\gamma)$ which takes a measurable set $A \subseteq \mathbb{R}^d$ to the set
$$Z_\delta(\gamma) A := \bigl\{x \in \mathbb{R}^d : d_{Q(x,\delta)}(A) \ge \gamma\bigr\}.$$
Intuitively, $Z_\delta(\gamma) A$ represents the points where $A$ is not too sparse at scale $\delta$. The main reason for us to define this operator is the following lemma:

Lemma 3.
Let $P \subset \mathbb{R}^d$ be a $k$-point configuration in $\mathbb{R}^d$, and denote $\gamma_P := 1/(2^{d+1} k)$. Whenever $A \subseteq \mathbb{R}^d$ is a measurable set for which $Z_\delta(1 - \gamma_P) A$ contains a copy of $P$, we have that $I_P(A) \ge c_P(\delta) > 0$, where $c_P : (0, 1] \to (0, 1]$ depends only on $P$.

Proof. Let us write $P = \{v_1, v_2, \dots, v_k\}$, and suppose $A \subseteq \mathbb{R}^d$ is a measurable set for which $Z_\delta(1 - \gamma_P) A$ contains a copy of $P$. Up to congruence, we may assume that $P \subseteq Z_\delta(1 - \gamma_P) A$.

Note that, if $d_{Q(x,\delta)}(A) \ge 1 - 1/(2^{d+1} k)$ for some $x \in \mathbb{R}^d$, then $d_{Q(y,\delta/2)}(A) \ge 1 - 1/(2k)$ for all $y \in Q(x, \delta/2)$. The assumption $P \subseteq Z_\delta(1 - \gamma_P) A$ thus implies that $d_{Q(y,\delta/2)}(A) \ge 1 - 1/(2k)$ whenever $y \in Q(v_i, \delta/2)$ for some $1 \le i \le k$.

Let $\ell := \max\{\|v_i\| : 1 \le i \le k\}$ be the largest length of a vector in $P$, and let us write $\mathrm{B}(I, \delta/4\ell) := \{T \in \mathrm{O}(d) : \|T - I\|_{2 \to 2} \le \delta/4\ell\}$ for the ball of radius $\delta/4\ell$ (in operator norm) centered on the identity $I$. Then whenever $T \in \mathrm{B}(I, \delta/4\ell)$ we have that $T v_i \in Q(v_i, \delta/2)$ for each $1 \le i \le k$.

By the preceding discussion and using the union bound, we conclude that for all $T \in \mathrm{B}(I, \delta/4\ell)$ we have
$$\begin{aligned}
\int_{\mathbb{R}^d} \prod_{i=1}^k A(x + T v_i) \, dx
&\ge \int_{Q(0,\delta/2)} \prod_{i=1}^k A(x + T v_i) \, dx \\
&= \left(\frac{\delta}{2}\right)^d \mathbb{P}_{x \in Q(0,\delta/2)}\bigl(x + T v_i \in A \text{ for all } 1 \le i \le k\bigr) \\
&\ge \left(\frac{\delta}{2}\right)^d \left(1 - \sum_{i=1}^k \mathbb{P}_{x \in Q(0,\delta/2)}(x + T v_i \notin A)\right) \\
&= \left(\frac{\delta}{2}\right)^d \left(1 - \sum_{i=1}^k \bigl(1 - d_{Q(T v_i, \delta/2)}(A)\bigr)\right) \\
&\ge \frac{1}{2} \left(\frac{\delta}{2}\right)^d.
\end{aligned}$$
We thus immediately conclude that
$$I_P(A) \ge \int_{\mathbb{R}^d} \int_{\mathrm{B}(I, \delta/4\ell)} \prod_{i=1}^k A(x + T v_i) \, d\mu(T) \, dx \ge \frac{\mu(\mathrm{B}(I, \delta/4\ell))}{2} \left(\frac{\delta}{2}\right)^d > 0,$$
finishing the proof.

2.1 Fourier analysis on $\mathbb{R}^d$

We will now show that the count of copies of an admissible configuration $P$ inside a measurable set $A$ does not significantly change if we ignore its fine details and 'blur' the set $A$ a little. The philosophy is similar to the famous regularity method in graph theory, where a large graph can be replaced by a much smaller weighted 'reduced graph' (which is an averaged version of the original graph which ignores its fine details) without significantly changing the count of copies of any small subgraph. The methods we will use are Fourier analytic in nature, drawing from Bourgain's arguments presented in [2].

We define the Fourier transform on $\mathbb{R}^d$ by
$$\widehat{f}(\xi) := \int_{\mathbb{R}^d} f(x) e^{-i \xi \cdot x} \, dx \quad \text{and} \quad \widehat{\sigma}(\xi) := \int_{\mathbb{R}^d} e^{-i \xi \cdot x} \, d\sigma(x)$$
for a suitable function $f$ and a suitable Borel measure $\sigma$ on $\mathbb{R}^d$. We recall that the convolution between two suitable functions $f$, $g$ on $\mathbb{R}^d$ is defined by
$$f * g(x) := \int_{\mathbb{R}^d} f(y) g(x - y) \, dy.$$
We recall also the basic identities $\widehat{f * g}(\xi) = \widehat{f}(\xi) \widehat{g}(\xi)$,
$$\int_{\mathbb{R}^d} f(x) g(x) \, dx = \int_{\mathbb{R}^d} \widehat{f}(\xi) \widehat{g}(-\xi) \, d\xi \quad \text{and} \quad \int_{\mathbb{R}^d} f(x) \, d\sigma(x) = \int_{\mathbb{R}^d} \widehat{f}(\xi) \widehat{\sigma}(-\xi) \, d\xi.$$

Denote $\mathcal{Q}_\delta(x) := \delta^{-d} Q(0, \delta)(x)$. This way, $f * \mathcal{Q}_\delta(x) = \delta^{-d} \int_{Q(x,\delta)} f(y) \, dy$ is the average of a function $f$ on the cube $Q(x, \delta)$.
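The two bounds on the Fourier transform of the normalized cube indicator that are used in the proof of Lemma 4 can be checked directly from the definitions. The following is a sketch of one such verification; the explicit constant $\sqrt{d}/2$ is just one admissible choice and is an assumption of this example, not necessarily the constant intended in the text. It uses only the elementary inequality $|1 - e^{-it}| \le |t|$ for real $t$.

```latex
% Illustrative derivation; \sqrt{d}/2 is one admissible constant, not necessarily the author's.
\begin{align*}
|\widehat{\mathcal{Q}}_\delta(\xi)|
  &= \Bigl|\delta^{-d}\int_{Q(0,\delta)} e^{-i\xi\cdot x}\,dx\Bigr|
   \le \delta^{-d}\, m\bigl(Q(0,\delta)\bigr) = 1, \\
|1-\widehat{\mathcal{Q}}_\delta(\xi)|
  &= \Bigl|\delta^{-d}\int_{Q(0,\delta)} \bigl(1-e^{-i\xi\cdot x}\bigr)\,dx\Bigr|
   \le \delta^{-d}\int_{Q(0,\delta)} |\xi\cdot x|\,dx \\
  &\le \|\xi\| \sup_{x\in Q(0,\delta)}\|x\|
   = \frac{\sqrt{d}}{2}\,\delta\,\|\xi\|.
\end{align*}
```

The first bound says that blurring never amplifies a frequency, while the second quantifies that low frequencies (those with $\|\xi\| \ll 1/\delta$) are almost unaffected by blurring at scale $\delta$.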
Specializing to the indicator function of a set $A \subseteq \mathbb{R}^d$, we obtain $A * \mathcal{Q}_\delta(x) = d_{Q(x,\delta)}(A)$; this represents a 'blurring' of the set $A$ considered. What we wish to obtain is then an upper bound on the quantity $|I_P(A) - I_P(A * \mathcal{Q}_\delta)|$ which goes to zero as $\delta$ goes to zero, uniformly over all measurable sets $A \subseteq Q(0, R)$ (for any fixed $R > 0$). Consider the difference
$$|I_P(f) - I_P(g)| = \left| \int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} \left( \prod_{i=1}^k f(x + T v_i) - \prod_{i=1}^k g(x + T v_i) \right) d\mu(T) \, dx \right|$$
for some given functions $f$, $g$ and some configuration $P = \{v_1, \dots, v_k\}$. Since we can rewrite the term inside the parenthesis above as the telescoping sum
$$\sum_{i=1}^k \prod_{j=1}^{i-1} f(x + T v_j) \, \bigl(f(x + T v_i) - g(x + T v_i)\bigr) \prod_{j=i+1}^k g(x + T v_j),$$
it follows from the triangle inequality that $|I_P(f) - I_P(g)|$ is at most
$$\sum_{i=1}^k \left| \int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} \prod_{j=1}^{i-1} f(x + T v_j) \, \bigl(f(x + T v_i) - g(x + T v_i)\bigr) \prod_{j=i+1}^k g(x + T v_j) \, d\mu(T) \, dx \right|.$$

[Footnote to the opening sentence of this subsection: This can be seen as an analogue of Bukh's Zooming-out Lemma [3], where he shows that one can 'zoom out' from a set $A$ and ignore its small scale details without losing some properties he calls supersaturable.]
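The telescoping decomposition above is perhaps easiest to see in the smallest non-trivial case. The following sketch writes it out for $k = 3$; the shorthand $a_i := f(x + T v_i)$ and $b_i := g(x + T v_i)$ is introduced here only for illustration.

```latex
% Telescoping identity for k = 3, with a_i := f(x + T v_i) and b_i := g(x + T v_i).
\begin{align*}
a_1 a_2 a_3 - b_1 b_2 b_3
  &= (a_1 - b_1)\, b_2 b_3 \;+\; a_1 (a_2 - b_2)\, b_3 \;+\; a_1 a_2 (a_3 - b_3).
\end{align*}
% Check by expanding the right-hand side: consecutive terms cancel in pairs,
%   a_1 b_2 b_3 - b_1 b_2 b_3 + a_1 a_2 b_3 - a_1 b_2 b_3 + a_1 a_2 a_3 - a_1 a_2 b_3,
% leaving exactly a_1 a_2 a_3 - b_1 b_2 b_3.
```

Each summand contains exactly one difference $a_i - b_i$, with factors of $f$ to its left and factors of $g$ to its right, which is the pattern of the general formula.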
To obtain some bound for $|I_P(f) - I_P(g)|$ it then suffices to obtain a similar bound for an expression of the form
$$\left| \int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} h_1(x + T u_1) \cdots h_{k-1}(x + T u_{k-1}) \, \bigl(f(x + T u_k) - g(x + T u_k)\bigr) \, d\mu(T) \, dx \right|$$
whenever each $h_i$ is either $f$ or $g$, and whenever $(u_1, \dots, u_k)$ is a permutation of the points of $P$.

We shall refer to an argument of this form (breaking a difference of products into a telescoping sum, using the triangle inequality and bounding each term of the resulting expression) as the telescoping sum trick. It is frequently used in modern graph and hypergraph theory when estimating the number of subgraphs inside a given large (hyper)graph $G$ with the aid of edge-discrepancy measures such as the cut norm; such results are usually known as counting lemmas, and are an essential part of the regularity method we have already mentioned (see [13, 12] for details).

In our arguments we will also have cause to use the following estimate. We denote by $G_{d,m}$ (for $m < d$) the Grassmannian of $m$-dimensional subspaces of $\mathbb{R}^d$ endowed with the normalized Haar measure, and write $\operatorname{dist}(\xi, F)$ for the Euclidean distance between a point $\xi \in \mathbb{R}^d$ and a subspace $F \in G_{d,m}$.

Lemma 4.
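For all $m < d$ and all $\rho > 0$, we have
$$\int_{G_{d,m}} \int_{\mathbb{R}^d} |\widehat{f}(\xi)|^2 \, |1 - \widehat{\mathcal{Q}}_\delta(\xi)|^2 \, (1 + \operatorname{dist}(\xi, F))^{-\rho} \, d\xi \, dF \le C_d (\delta + \delta^{\rho/2}) \|f\|_2^2.$$

Proof.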
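Using the easily proven bounds $|\widehat{\mathcal{Q}}_\delta(\xi)| \le 1$ and $|1 - \widehat{\mathcal{Q}}_\delta(\xi)| \le C'_d \delta \|\xi\|$, we can bound the expression in the statement by
$$\begin{aligned}
&\int_{G_{d,m}} \left( \int_{\|\xi\| \le \delta^{-1/2}} + \int_{\|\xi\| > \delta^{-1/2}} \right) |\widehat{f}(\xi)|^2 \, |1 - \widehat{\mathcal{Q}}_\delta(\xi)|^2 \, (1 + \operatorname{dist}(\xi, F))^{-\rho} \, d\xi \, dF \\
&\quad \le \int_{\|\xi\| \le \delta^{-1/2}} |\widehat{f}(\xi)|^2 \, C_d'^2 \delta \, d\xi + \int_{\|\xi\| > \delta^{-1/2}} |\widehat{f}(\xi)|^2 \int_{G_{d,m}} 4 (1 + \operatorname{dist}(\xi, F))^{-\rho} \, dF \, d\xi \\
&\quad \le C_d'^2 \delta \|f\|_2^2 + 4 \|f\|_2^2 \sup_{\|\xi\| > \delta^{-1/2}} \int_{G_{d,m}} (1 + \operatorname{dist}(\xi, F))^{-\rho} \, dF \\
&\quad \le C_d (\delta + \delta^{\rho/2}) \|f\|_2^2,
\end{aligned}$$
as wished.

We are now ready to formally state and prove our main technical tool in the Euclidean setting, which by analogy with methods from graph theory we shall call the Counting Lemma. Its proof is mostly contained in Bourgain's paper [2] (though it is not expressed in this form), and here we will follow his arguments.

Lemma 5 (Counting Lemma). For every admissible configuration $P \subset \mathbb{R}^d$ there exists a constant $C_P > 0$ such that the following holds. For every $R > 0$ and any measurable set $A \subseteq Q(0, R)$, we have that
$$|I_P(A) - I_P(A * \mathcal{Q}_\delta)| \le C_P \, \delta^{1/4} R^d \quad \forall \delta \in (0, 1].$$

Proof.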
Let $(v_1, \dots, v_k)$ be a fixed permutation of the points of $P$; by first translating all points by $-v_k$ and using translation invariance we may assume that $v_k = 0$, so that $P \simeq \{0, v_1, \dots, v_{k-1}\}$. We will work a bit more generally and show that a bound as in the statement of the lemma holds for
$$\left| \int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} f_0(x) f_1(x + T v_1) \cdots f_{k-2}(x + T v_{k-2}) \times \bigl(f_{k-1}(x + T v_{k-1}) - f_{k-1} * \mathcal{Q}_\delta(x + T v_{k-1})\bigr) \, d\mu(T) \, dx \right| \tag{1}$$
whenever $f_0, f_1, \dots, f_{k-1} : Q(0, R) \to [0, 1]$ are bounded measurable functions. By our telescoping sum trick, this immediately implies the result.

By considering the variables $y_1 = T v_1, \dots, y_{k-1} = T v_{k-1}$, we can rewrite the expression
$$\int_{\mathbb{R}^d} \int_{\mathrm{O}(d)} f_0(x) f_1(x + T v_1) \cdots f_{k-1}(x + T v_{k-1}) \, d\mu(T) \, dx$$
counting weighted copies of $P$ as
$$\int f_0(x) f_1(x + y_1) \cdots f_{k-1}(x + y_{k-1}) \, d\sigma^{(d-1)}(y_1) \, d\sigma^{(d-2)}_{y_1}(y_2) \cdots d\sigma^{(d-k+1)}_{y_1, \dots, y_{k-2}}(y_{k-1}) \, dx,$$
where $\sigma^{(d-j)}_{y_1, \dots, y_{j-1}}$ is the average over a $(d-j)$-dimensional sphere in $\mathbb{R}^d$ depending on the points $y_1, \dots, y_{j-1}$ already fixed and on $P$. (More precisely, if $y_i = T v_i$ for $1 \le i \le j-1$, then $\sigma^{(d-j)}_{y_1, \dots, y_{j-1}}$ is the uniform probability measure on the $(d-j)$-sphere $\mathrm{Stab}_{\mathrm{O}(d)}(y_1, \dots, y_{j-1}) \cdot T v_j$.) We will not need an explicit description for these measures $\sigma^{(d-j)}_{y_1, \dots, y_{j-1}}$, only the simple Fourier estimate
$$|\widehat{\sigma}^{(d-j)}_{y_1, \dots, y_{j-1}}(\xi)| \le C'_P \bigl(1 + \operatorname{dist}(\xi, [y_1, \dots, y_{j-1}])\bigr)^{-(d-j)/2} \tag{2}$$
which follows from the decay at infinity of the Fourier transform of the $(d-j)$-sphere on $\mathbb{R}^{d-j+1}$.

Let us denote for simplicity
$$d\Omega_j(y_1, \dots, y_j) = d\sigma^{(d-1)}(y_1) \, d\sigma^{(d-2)}_{y_1}(y_2) \cdots d\sigma^{(d-j)}_{y_1, \dots, y_{j-1}}(y_j).$$
Let $G := f_{k-1} - f_{k-1} * \mathcal{Q}_\delta$ and, for $Y = (y_1, \dots, y_{k-2}) \in (\mathbb{R}^d)^{k-2}$ fixed, denote
$$F_Y(x) := f_0(x) f_1(x + y_1) \cdots f_{k-2}(x + y_{k-2}).$$
The expression (1) we wish to bound may then be written as
$$\left| \int \int \int F_Y(x) \, G(x + y_{k-1}) \, d\sigma^{(d-k+1)}_Y(y_{k-1}) \, d\Omega_{k-2}(Y) \, dx \right|,$$
which can in turn be bounded by
$$\begin{aligned}
&\int \left| \int \int F_Y(x) \, G(x + y_{k-1}) \, d\sigma^{(d-k+1)}_Y(y_{k-1}) \, dx \right| d\Omega_{k-2}(Y) \\
&\quad = \int \left| \int \widehat{F}_Y(-\xi) \, \widehat{G}(\xi) \, \widehat{\sigma}^{(d-k+1)}_Y(-\xi) \, d\xi \right| d\Omega_{k-2}(Y) \\
&\quad \le \int \|\widehat{F}_Y\|_2 \left( \int |\widehat{G}(\xi)|^2 \, |\widehat{\sigma}^{(d-k+1)}_Y(\xi)|^2 \, d\xi \right)^{1/2} d\Omega_{k-2}(Y) \\
&\quad \le \left( \sup_Y \|\widehat{F}_Y\|_2 \right) \int \left( \int |\widehat{f}_{k-1}(\xi)|^2 \, |1 - \widehat{\mathcal{Q}}_\delta(\xi)|^2 \, |\widehat{\sigma}^{(d-k+1)}_Y(\xi)|^2 \, d\xi \right)^{1/2} d\Omega_{k-2}(Y).
\end{aligned}$$
Since $\|\widehat{F}_Y\|_2 = \|F_Y\|_2$ and $|F_Y(x)| \le |f_0(x)|$ pointwise for all $Y \in (\mathbb{R}^d)^{k-2}$, we have that the supremum above is at most $\|f_0\|_2$.

Using Cauchy-Schwarz on the first integral of the last expression and the Fourier estimate (2), we conclude that the right-hand side is at most
$$C'_P \|f_0\|_2 \left( \int \int |\widehat{f}_{k-1}(\xi)|^2 \, |1 - \widehat{\mathcal{Q}}_\delta(\xi)|^2 \, (1 + \operatorname{dist}(\xi, [Y]))^{-(d-k+1)} \, d\xi \, d\Omega_{k-2}(Y) \right)^{1/2}.$$
Note that the expression inside the parenthesis above is equal to the left-hand side of the expression in Lemma 4, with $f = f_{k-1}$, $m = k-2$ and $\rho = d-k+1$. It follows that the original expression (1) we wanted to bound (for $d \ge k$ and $0 < \delta \le 1$) is at most $C_P \, \delta^{1/4} \|f_0\|_2 \|f_{k-1}\|_2$, and the result follows.

Remark.
Note that the bound on the constant $C_P$ we obtain in this proof can be made uniform inside a small neighborhood of the considered configuration $P$. That is, there is a small ball $B \subset (\mathbb{R}^d)^k$ around $P$ such that $|I_{P'}(A) - I_{P'}(A * \mathcal{Q}_\delta)| \le C_P \, \delta^{1/4} R^d$ holds uniformly over all configurations $P' \in B$ (and all $R > 0$, $A \subseteq Q(0, R)$ and $0 < \delta \le 1$).

[...] $P$ is 'supersaturable' in the sense given by Bukh [3]. Indeed, Lemma 3 is condition VI of his definition of supersaturable properties and the Counting Lemma quickly implies condition VII, all other conditions being trivial to verify. It then follows from Bukh's results that
$$m_{\mathbb{R}^d}(t_1 P_1, t_2 P_2, \dots, t_n P_n) \to \prod_{i=1}^n m_{\mathbb{R}^d}(P_i)$$
as $t_2/t_1, t_3/t_2, \dots, t_n/t_{n-1} \to \infty$ whenever $P_1, P_2, \dots, P_n \subset \mathbb{R}^d$ are admissible configurations, which answers question (Q2) in this case and is one of our main results in the Euclidean setting (see Theorem 3).

For the reader's convenience, and because we will later need a slight strengthening of a result which follows from Bukh's arguments but is never stated in his paper, we will present his reasoning below.

In this subsection we will outline Bukh's arguments for obtaining the Supersaturation Theorem (Theorem 2 below) from Lemma 3 and the Counting Lemma. We shall not be too worried about providing all details or explicitly giving the parameters needed for the arguments to work, but we will give enough details to convince the reader that the bounds obtained can be made to hold uniformly inside a small neighborhood of the configuration $P$ we are considering (this will be needed later).

Remark.
In Section 4 we will give a different argument for proving the same theorem, which is less elementary but somewhat 'cleaner'. However, that argument uses compactness and so gives no information on the bounds we obtain.

For the rest of this subsection, let us assume $P \subset \mathbb{R}^d$ is a fixed admissible configuration with $k$ points. In order to lighten the notation a little we will omit the dependency of some constants and functions on this configuration $P$.

Our first step will be to show, by a simple averaging argument, that we can bootstrap the conclusion of Lemma 3 in order to obtain a positive proportion of all possible copies of $P$ if the set $Z_\delta(1 - \gamma_P) A$ is denser than its independence density.

Lemma 6. Denote $\gamma_P = 1/(2^{d+1} k)$. For every $\varepsilon > 0$ and every $0 < \delta \le 1$ there is a constant $c = c(\varepsilon, \delta) > 0$ such that the following holds. For all $R > 0$, if $A \subseteq Q(0, R)$ is a measurable set satisfying
$$d_{Q(0,R)}(Z_\delta(1 - \gamma_P) A) \ge m_{Q(0,R)}(P) + \varepsilon,$$
then $I_P(A) \ge c R^d$.

Proof sketch. Take $R_0 \ge \operatorname{diam} P$ large enough (depending on $\operatorname{diam} P$ and $\varepsilon$) so that $m_{Q(0,R_0)}(P) < m_{\mathbb{R}^d}(P) + \varepsilon/3$. We will show that the conclusion of the lemma holds whenever $R \ge 3 d R_0/\varepsilon$; this is enough since for $R < 3 d R_0/\varepsilon$ the conclusion follows immediately from Lemma 3 (for some small enough $c > 0$).

Let then $R \ge 3 d R_0/\varepsilon$, and let $A \subseteq Q(0, R)$ be a set satisfying
$$d_{Q(0,R)}(Z_\delta(1 - \gamma_P) A) \ge m_{Q(0,R)}(P) + \varepsilon.$$
Let us denote $B := Z_\delta(1 - \gamma_P) A$ for convenience. Since $m_{Q(0,R_0)}(P) < m_{\mathbb{R}^d}(P) + \varepsilon/3 \le m_{Q(0,R)}(P) + \varepsilon/3$, we have that
$$m(B \cap Q(0, R)) > \bigl(m_{Q(0,R_0)}(P) + 2\varepsilon/3\bigr) R^d.$$
Denote $K := \lfloor R/R_0 \rfloor$. Since
$$K^d R_0^d > \left(1 - \frac{R_0}{R}\right)^d R^d \ge \left(1 - \frac{d R_0}{R}\right) R^d \ge \left(1 - \frac{\varepsilon}{3}\right) R^d,$$
we conclude that $m(B \cap Q(0, K R_0)) > \bigl(m_{Q(0,R_0)}(P) + \varepsilon/3\bigr) K^d R_0^d$. Dividing $Q(0, K R_0)$ into $K^d$ cubes of side length $R_0$, by averaging we conclude that at least $\varepsilon K^d/4$ of these cubes $Q(x, R_0)$ satisfy
$$m(B \cap Q(x, R_0)) > m_{Q(0,R_0)}(P) R_0^d.$$
Each of these sets $B \cap Q(x, R_0)$ will then contain a copy of $P$.

Using (the conclusion of) Lemma 3 on $B \cap Q(x, R_0)$ for each of these cubes where $B$ has high density, and noting that each copy of $P$ will be counted at most $2^d$ times (on neighboring cubes $Q(x, R_0)$), we conclude that
$$I_P(A) \ge \frac{\varepsilon K^d}{4} \cdot \frac{c^{(L3)}_P(\delta)}{2^d} > \left( \frac{\varepsilon \, c^{(L3)}_P(\delta)}{2^{d+3} R_0^d} \right) R^d,$$
finishing the proof.

This lemma can now be used, in conjunction with the Counting Lemma, to show that the conclusion that $I_P(A) \ge c R^d$ follows also from the simpler condition that $d_{Q(0,R)}(A) \ge m_{Q(0,R)}(P) + \varepsilon$. This is the supersaturation property of the hypergraph encoding copies of $P$ that we alluded to in the Introduction.

Theorem 2 (Supersaturation Theorem). For every $\varepsilon > 0$ there exist $c > 0$ and $R_0 > 0$ such that the following holds. For all $R \ge R_0$, if $A \subseteq Q(0, R)$ satisfies
$$d_{Q(0,R)}(A) \ge m_{Q(0,R)}(P) + \varepsilon,$$
then $I_P(A) \ge c R^d$.

Proof sketch. Note that the claim is trivial if $\varepsilon \ge 1 - m_{Q(0,R)}(P)$, and if we substitute $A$ for $Z_\delta(1 - \gamma_P) A$ in the inequality above (for some $\delta > 0$) then it follows from Lemma 6. Suppose the result is already known to hold for $\varepsilon = (1 + \gamma_P/4) \varepsilon'$ for some $0 < \varepsilon' < 1$. Let us see how to obtain the same conclusion for when $\varepsilon = \varepsilon'$ (with a smaller constant $c$ and larger constant $R_0$). This easily implies the full result.

Let now $A \subseteq Q(0, R)$ satisfy $d_{Q(0,R)}(A) \ge m_{Q(0,R)}(P) + \varepsilon'$, for some large $R$. If in addition $d_{Q(0,R)}(Z_\delta(1 - \gamma_P) A) \ge m_{Q(0,R)}(P) + \varepsilon'/4$, then Lemma 6 gives $I_P(A) \ge c R^d$ with $c = c^{(L6)}(\varepsilon'/4, \delta)$; we may then suppose
$$d_{Q(0,R)}(Z_\delta(1 - \gamma_P) A) < m_{Q(0,R)}(P) + \frac{\varepsilon'}{4}. \tag{3}$$
Suppose also that $R$ is large enough so that border effects become negligible, in the sense that
$$\mathbb{E}_{x \in Q(0,R)}\bigl[d_{Q(x,\delta)}(A)\bigr] > d_{Q(0,R)}(A) - \frac{\varepsilon' \gamma_P}{4} \ge m_{Q(0,R)}(P) + \varepsilon' \left(1 - \frac{\gamma_P}{4}\right). \tag{4}$$
Since for any $\eta > 0$ we have
$$d_{Q(x,\delta)}(A) \le \eta + (1 - \gamma_P) \, Z_\delta(\eta) A(x) + \gamma_P \, Z_\delta(1 - \gamma_P) A(x)$$
pointwise, by averaging over $Q(0, R)$ we obtain
$$\mathbb{E}_{x \in Q(0,R)}\bigl[d_{Q(x,\delta)}(A)\bigr] \le \eta + (1 - \gamma_P) \, d_{Q(0,R)}(Z_\delta(\eta) A) + \gamma_P \, d_{Q(0,R)}(Z_\delta(1 - \gamma_P) A).$$
Taking $\eta := \varepsilon' \gamma_P/4$ and using (3) and (4), we conclude that
$$d_{Q(0,R)}(Z_\delta(\varepsilon' \gamma_P/4) A) > m_{Q(0,R)}(P) + \varepsilon' (1 + \gamma_P/4),$$
and so by the induction hypothesis $I_P(Z_\delta(\varepsilon' \gamma_P/4) A) > c R^d$. Using that $A * \mathcal{Q}_\delta(x) \ge (\varepsilon' \gamma_P/4) \cdot Z_\delta(\varepsilon' \gamma_P/4) A(x)$ pointwise, we obtain
$$I_P(A * \mathcal{Q}_\delta) \ge \left(\frac{\varepsilon' \gamma_P}{4}\right)^k I_P(Z_\delta(\varepsilon' \gamma_P/4) A) > \left(\frac{\varepsilon' \gamma_P}{4}\right)^k c R^d.$$
By the Counting Lemma (Lemma 5) we can choose $\delta > 0$ small enough (depending on $\varepsilon'$, $P$ and $c$) so that $I_P(A) > (\varepsilon' \gamma_P/4)^k c R^d/2$; this finishes the induction step with $c$ substituted by $(\varepsilon' \gamma_P/4)^k c/2$.

Remark.
Exactly the same proof works in the case of several configurations, showing that $I_{P_i}(A) \ge c(\varepsilon) R^d$ holds for some $1 \le i \le n$ whenever the density condition $d_{Q(0,R)}(A) \ge m_{Q(0,R)}(P_1, \dots, P_n) + \varepsilon$ is satisfied (assuming $R$ is large enough and all the configurations $P_i$ are admissible). Moreover, it is easy to see that the bounds for $c$ and $R_0$ obtained by the proof can be made uniform on a small enough neighborhood of the configurations $P_i$ being considered, since both the bounds from Lemma 6 and from the Counting Lemma can be made uniform on such neighborhoods.

Corollary 1.
For every $\varepsilon > 0$ there exist $\delta_0 > 0$ and $R_0 > 0$ such that the following holds. For all $\delta \le \delta_0$ and all $R \ge R_0$, if $A \subseteq \mathbb{R}^d$ satisfies
$$d_{Q(0,R)}(Z_\delta(\varepsilon) A) \ge m_{\mathbb{R}^d}(P) + \varepsilon,$$
then $A$ contains a copy of $P$.

Proof. Assume $R$ is large enough so that $m_{Q(0,R)}(P) \le m_{\mathbb{R}^d}(P) + \varepsilon/2$. By the Supersaturation Theorem we have that
$$d_{Q(0,R)}(Z_\delta(\varepsilon) A) \ge m_{\mathbb{R}^d}(P) + \varepsilon \implies I_P(Z_\delta(\varepsilon) A) \ge c^{(T)}(\varepsilon/2)\, R^d$$
holds for all $\delta >$
0. By the Counting Lemma we then have I P ( A ) ≥ I P ( A ∗ Q δ ) − C ( L ) P δ / R d ≥ ε k I P ( Z δ ( ε ) A ) − C ( L ) P δ / R d ≥ ( ε k c ( T ) ( ε/ − C ( L ) P δ / ) R d Taking δ > I P ( A ) >
0, and so $A$ contains a copy of $P$.

Let us now consider subsets of the Euclidean space avoiding several different configurations. Our main aim is to give an answer to questions (Q1) and (Q2), at least in the case of admissible configurations.

We start with a simple argument giving the lower bound
$$m_{\mathbb{R}^d}(P_1, P_2, \dots, P_n) \ge \prod_{i=1}^n m_{\mathbb{R}^d}(P_i),$$
which is valid for all $n \ge 1$ and all configurations $P_1, \dots, P_n \subset \mathbb{R}^d$. Indeed, fix $\varepsilon > 0$ and take $R$ large enough so that $\min_{1 \le i \le n} (R - \operatorname{diam} P_i)^d \ge (1 - \varepsilon) R^d$. For each $1 \le i \le n$ let $A_i \subseteq Q(0, R - \operatorname{diam} P_i)$ be a set containing no copies of $P_i$ and satisfying $d_{Q(0, R - \operatorname{diam} P_i)}(A_i) > m_{\mathbb{R}^d}(P_i) - \varepsilon$ (which is possible by Lemma 2). We then construct the $R$-periodic set $A'_i := A_i + R\mathbb{Z}^d$, which also avoids $P_i$ and has density
$$d(A'_i) = \frac{(R - \operatorname{diam} P_i)^d}{R^d}\, d_{Q(0, R - \operatorname{diam} P_i)}(A_i) > m_{\mathbb{R}^d}(P_i) - 2\varepsilon.$$
Since each set $A'_i$ is periodic with the same ‘fundamental domain’ $Q(0, R)$, it follows that the average of $d\big(\bigcap_{i=1}^n (x_i + A'_i)\big)$ over independent translates $x_1, \dots, x_n \in Q(0, R)$ is equal to $\prod_{i=1}^n d(A'_i)$. There must then exist some $x_1, \dots, x_n \in Q(0, R)$ for which
$$d\left(\bigcap_{i=1}^n (x_i + A'_i)\right) \ge \prod_{i=1}^n d(A'_i) > \prod_{i=1}^n \big(m_{\mathbb{R}^d}(P_i) - 2\varepsilon\big).$$
Since $\bigcap_{i=1}^n (x_i + A'_i)$ avoids each of the configurations $P_i$ and $\varepsilon > 0$ is arbitrary, the desired lower bound follows.

We can think of $m_{\mathbb{R}^d}(P_1, P_2, \dots, P_n)$ being close to $\prod_{i=1}^n m_{\mathbb{R}^d}(P_i)$ as some sort of independence or lack of correlation between the $n$ constraints of forbidding each configuration $P_i$; in this case, there is no better way to choose a set avoiding all these configurations than simply intersecting optimal $P_i$-avoiding sets for each $i$ (after suitably translating them).
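The averaging step in the lower-bound argument can be illustrated in a toy discrete model. The sketch below is not part of the original argument: it replaces the fundamental domain $Q(0, R)$ by the cyclic group $\mathbb{Z}/12$, and the sets, the period, and the helper names (`translate`, `intersection_density`, `average_and_best`) are all hypothetical choices made for illustration. It checks by brute force that the average density of the intersection over independent translates equals the product of the individual densities, so that some choice of translates does at least as well.

```python
from fractions import Fraction
from itertools import product


def translate(s, shift, period):
    """Translate a subset of Z/period by `shift` (a discrete stand-in for x + A')."""
    return frozenset((a + shift) % period for a in s)


def intersection_density(sets, shifts, period):
    """Density inside Z/period of the intersection of the translated sets."""
    common = set(range(period))
    for s, shift in zip(sets, shifts):
        common &= translate(s, shift, period)
    return Fraction(len(common), period)


def average_and_best(sets, period):
    """Average and maximum intersection density over all tuples of translates."""
    densities = [
        intersection_density(sets, shifts, period)
        for shifts in product(range(period), repeat=len(sets))
    ]
    return sum(densities) / len(densities), max(densities)


# Hypothetical example: three periodic sets with densities 1/3, 1/2 and 1/3.
period = 12
sets = [
    frozenset({0, 1, 2, 3}),
    frozenset({0, 2, 4, 6, 8, 10}),
    frozenset({0, 3, 6, 9}),
]
average, best = average_and_best(sets, period)

expected = Fraction(1)
for s in sets:
    expected *= Fraction(len(s), period)  # product of the individual densities

print(average == expected, best >= expected)
```

Exact rational arithmetic is used so that the identity between the average and the product holds exactly rather than up to rounding; the toy model says nothing about whether the translated sets avoid any configuration, which in the text comes from the choice of the sets $A_i$.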
One might then expect this to happen if the sizes of each $P_i$ are very different from each other, so that each constraint will be relevant in different and largely independent scales.

Our next theorem, which (partially) answers question (Q2) and is one of our main results in the Euclidean setting, says that this is indeed the case whenever the configurations $P_1, \dots, P_n$ considered are all admissible. (We remark that this result is not necessarily true if the configurations considered aren’t admissible; see Section 5 for a discussion.) The proof we present here is based on Bukh’s arguments for supersaturable properties.

Theorem 3. If $P_1, P_2, \dots, P_n \subset \mathbb{R}^d$ are admissible configurations, then
$$m_{\mathbb{R}^d}(t_1 P_1, t_2 P_2, \dots, t_n P_n) \to \prod_{i=1}^n m_{\mathbb{R}^d}(P_i)$$
as the ratios $t_2/t_1, t_3/t_2, \dots, t_n/t_{n-1}$ tend to infinity.

Proof. We have already seen that
$$m_{\mathbb{R}^d}(t_1 P_1, t_2 P_2, \dots, t_n P_n) \ge \prod_{i=1}^n m_{\mathbb{R}^d}(t_i P_i) = \prod_{i=1}^n m_{\mathbb{R}^d}(P_i)$$
always holds, so it suffices to show that $m_{\mathbb{R}^d}(t_1 P_1, t_2 P_2, \dots, t_n P_n)$ is no larger than $\prod_{i=1}^n m_{\mathbb{R}^d}(P_i) + \varepsilon$ whenever $\varepsilon > 0$ and the ratios between consecutive $t_i$ are large enough. We will prove this in the case $n = 2$, the general case being similar but somewhat more involved; we refer the interested reader to Bukh’s paper [3]. (See also the proof of Theorem 7, which is very similar and presented in full.)

By dilation invariance we may assume that $t_2 = 1$; we then wish to show that $m_{\mathbb{R}^d}(t_1 P_1, P_2) \le m_{\mathbb{R}^d}(P_1)\, m_{\mathbb{R}^d}(P_2) + \varepsilon$ whenever $t_1 > 0$ is small enough depending on $P_1$, $P_2$ and $\varepsilon > 0$.

Fix $\varepsilon > 0$ and suppose $A \subset \mathbb{R}^d$ is a measurable set avoiding copies of $t_1 P_1$ and $P_2$, for some $t_1 > 0$ to be chosen small enough (independently of $A$). By Corollary 1, we must then have
$$d(Z_\delta(\varepsilon) A) \le m_{\mathbb{R}^d}(P_2) + \varepsilon \quad \forall\, \delta \le \delta_0^{(C)}(\varepsilon; P_2). \tag{5}$$
Moreover, since $A/t_1$ contains no copies of $P_1$, by the Supersaturation Theorem we see that
$$d_{Q(x,R)}(A/t_1) < m_{Q(0,R)}(P_1) + \varepsilon \quad \forall\, x \in \mathbb{R}^d,\ R \ge R_0^{(T)}(\varepsilon; P_1). \tag{6}$$
Take $R \ge R_0^{(T)}(\varepsilon; P_1)$ large enough so that $m_{Q(0,R)}(P_1) < m_{\mathbb{R}^d}(P_1) + \varepsilon$. Then whenever $t_1 \le \delta_0^{(C)}(\varepsilon; P_2)/R$, equation (5) implies that $d(Z_{t_1 R}(\varepsilon) A) \le m_{\mathbb{R}^d}(P_2) + \varepsilon$, while by equation (6) we have
$$d_{Q(x, t_1 R)}(A) = d_{Q(x,R)}(A/t_1) < m_{\mathbb{R}^d}(P_1) + 2\varepsilon \quad \forall\, x \in \mathbb{R}^d.$$
This means that the density of $A$ inside cubes $Q(x, t_1 R)$ of side length $t_1 R$ is at most $\varepsilon$ (when $x \notin Z_{t_1 R}(\varepsilon) A$) except at a set of upper density at most $m_{\mathbb{R}^d}(P_2) + \varepsilon$, when it is instead no more than $m_{\mathbb{R}^d}(P_1) + 2\varepsilon$.

Taking averages, we conclude that
$$d(A) \le \varepsilon + \big(m_{\mathbb{R}^d}(P_2) + \varepsilon\big)\big(m_{\mathbb{R}^d}(P_1) + 2\varepsilon\big) \le m_{\mathbb{R}^d}(P_1)\, m_{\mathbb{R}^d}(P_2) + 6\varepsilon.$$
This inequality then holds for $m_{\mathbb{R}^d}(t_1 P_1, P_2)$ whenever $0 < t_1 \le \delta_0^{(C)}(\varepsilon; P_2)/R$, finishing the proof.

Remark.
By a simple dilation invariance argument we can obtain a more ‘quantitative’ version of the result, saying that
$$\left| m_{\mathbb{R}^d}(t_1 P_1, t_2 P_2, \dots, t_n P_n) - \prod_{i=1}^n m_{\mathbb{R}^d}(P_i) \right| < \varepsilon$$
whenever $t_2/t_1, t_3/t_2, \dots, t_n/t_{n-1} > C(\varepsilon)$. We leave the details to the reader.

Let us now show how this theorem easily implies the two results of Bourgain we have seen in the Introduction.

Proof of Theorem 1.
Suppose $A \subset \mathbb{R}^d$ is a measurable set not satisfying the conclusion of the theorem; thus there is a sequence $(t_j)_{j \ge 1}$ tending to infinity such that $A$ does not contain a copy of any $t_j P$. This implies that $d(A) \le m_{\mathbb{R}^d}(t_1 P, t_2 P, \dots, t_n P)$ for all $n \in \mathbb{N}$. By taking a suitably fast-growing subsequence, we may then use Theorem 3 to obtain (say) $d(A) \le 2\, m_{\mathbb{R}^d}(P)^n$ for any fixed $n \ge 1$. Since $m_{\mathbb{R}^d}(P) < 1$, this implies that $d(A) = 0$, as wished.

Proof of Lemma 1.
The result is equivalent to showing that m Q (0 , / ( t P, t P, . . . , t J P ) < ε holds for some J = J ( ε, P ) and all sequences ( t j ) j ≥ ⊂ (0 ,
1] satisfying t j +1 < t j / j ≥
1. Let then n ≥ m R d ( P ) n < ε/
4, and fix C = C ( ε, n, P ) > m R d ( s P, s P, . . . , s n P ) ≤ m R d ( P ) n + ε < ε s /s , s /s , . . . , s n − /s n ≥ C .By (the multivariable version of) Lemma 2 we have that m Q (0 , / ( s P, s P, . . . , s n P ) ≤ (1 + 2 s diam P ) d m R d ( s P, s P, . . . , s n P )holds whenever s > s > · · · > s n >
0. Thus, if s ≤ (2 /d − / (2 diam P ) and s i +1 ≤ s i /C for 1 ≤ i < n , we conclude that m Q (0 , / ( s P, s P, . . . , s n P ) < ε . Byour assumption on the decay of ( t j ) j ≥ , we can choose numbers s , . . . , s n satisfyingthese bounds from the first J := (cid:24) log (cid:18) /d − P (cid:19)(cid:25) + ( n − ⌈ log C ⌉ terms of this sequence, finishing the proof.Going back to our study of the independence density for multiple configurations,we will now consider the opposite situation of what we have seen before: when theconstraints of forbidding each individual configuration are so strongly correlated as tobe essentially redundant. One might expect this is the case, for instance, when we areforbidding very close dilates of a given configuration P .We will show that this intuition is correct, whether or not the configuration con-sidered is admissible, and the proof is much simpler than in the case of very distantdilates of P (in particular not needing the results from earlier sections). Lemma 7.
For any given configuration $P \subset \mathbb{R}^d$, we have that
$$m_{\mathbb{R}^d}(t_1 P, t_2 P, \dots, t_n P) \to m_{\mathbb{R}^d}(P)$$
as $t_2/t_1, t_3/t_2, \dots, t_n/t_{n-1} \to 1$.

Proof. Assume by dilation invariance that $t_1 = 1$. By Lemma 2, it suffices to show that the convergence above holds with $m_{\mathbb{R}^d}$ replaced by $m_{Q(0,R)}$ for all fixed $R > 0$. We will then fix an arbitrary $R > 0$ and show that
$$m_{Q(0,R)}(P, t_2 P, \dots, t_n P) \to m_{Q(0,R)}(P)$$
as $t_2, t_3, \dots, t_n \to 1$.

Let $(v_1, v_2, \dots, v_k)$ be an ordering of the points of $P$ and consider the continuous function $g_P : (\mathbb{R}^d)^k \times \mathrm{O}(d) \to \mathbb{R}$ given by
$$g_P(x_1, \dots, x_k, T) := \sum_{j=2}^{k} \big\| (x_j - x_1) - T(v_j - v_1) \big\|.$$
Note that $\min_{T \in \mathrm{O}(d)} g_P(x_1, \dots, x_k, T) = 0$ if and only if $(x_1, \dots, x_k)$ is congruent to $(v_1, \dots, v_k)$.

Let now $A \subset Q(0, R)$ be a measurable set avoiding $P$ with density $d_{Q(0,R)}(A) \ge m_{Q(0,R)}(P) - \varepsilon$, for some given $\varepsilon >$
0. From elementary measure theory we know thereexists a compact set e A ⊆ A with d Q (0 ,R ) ( e A ) ≥ m Q (0 ,R ) ( P ) − ε . As the set e A k × O( d )is then compact, the continuous function g P attains a minimum on this set; let us callthis minimum γ . Since e A avoids P , it follows that γ > e A also avoids tP whenever t is sufficiently close to 1, saywhen | t − | < γ/ ( k · diam P ). Indeed, for all x , . . . , x k ∈ e A and all T ∈ O( d ), by thetriangle inequality we have that k X j =2 k ( x j − x ) − T ( tv j − tv ) k ≥ k X j =2 |k ( x j − x ) − T ( v j − v ) k − | t − |k v j − v k| > k X j =2 k ( x j − x ) − T ( v j − v ) k − k · | t − | diam P ≥ γ − k · | t − | diam P, which is positive if | t − | < γ/ ( k · diam P ). In particular, we see that m Q (0 ,R ) ( P, t P, . . . , t n P ) ≥ d Q (0 ,R ) ( e A ) ≥ m Q (0 ,R ) ( P ) − ε whenever | t j − | < γ/ ( k · diam P ) for 2 ≤ j ≤ n . Since we clearly have that m Q (0 ,R ) ( P, t P, . . . , t n P ) ≤ m Q (0 ,R ) ( P ), the result follows.We can now use these two limit results in order to give an (almost complete)answer to question (Q1) when restricted to admissible configurations. Let us denoteby M n ( P ) the set of all possible independence densities we can obtain by forbidding n distinct dilates of a configuration P , that is M n ( P ) := { m R d ( t P, t P, . . . , t n P ) : 0 < t < t < · · · < t n < ∞} Recall that (Q1) asked for an explicit description of this set M n ( P ). Theorem 4. If P ⊂ R d is admissible, then M n ( P ) = [ m R d ( P ) n , m R d ( P )] .Proof. It is clear that m R d ( t P, t P, . . . , t n P ) ≤ m R d ( t P ) = m R d ( P ) always holds,and we have already seen that m R d ( t P, t P, . . . , t n P ) ≥ n Y i =1 m R d ( t i P ) = m R d ( P ) n Moreover, Lemma 7 implies that m R d ( P ) is an accumulation point of the set M n ( P ),and (since P is admissible) Theorem 3 implies the same about m R d ( P ) n . The resultfollows from continuity of the function( t , t , . . . 
, t_n) ↦ m_{R^d}(t_1 P, t_2 P, …, t_n P), which is an immediate consequence of Theorem 11 in Section 4.

The sphere
In this section we turn to the question of whether the methods and results shown in the Euclidean space setting can also be made to work in the spherical setting.

We shall fix an integer $d \ge 2$ and work on the $d$-dimensional unit sphere $S^d \subset \mathbb{R}^{d+1}$. We denote the uniform probability measure on $S^d$ by $\sigma^{(d)} = \sigma$, and the normalized Haar measure on $\mathrm{O}(d+1)$ by $\mu_{d+1} = \mu$.

The analogue of the axis-parallel cube in the spherical setting will be the spherical cap: given $x \in S^d$ and $\rho > 0$, we denote
$$\mathrm{Cap}(x, \rho) := \{ y \in S^d : \| x - y \|_{\mathbb{R}^{d+1}} \le \rho \}.$$
We say $\mathrm{Cap}(x, \rho)$ is the spherical cap with center $x$ and radius $\rho$. Since its measure $\sigma(\mathrm{Cap}(x, \rho))$ does not depend on the center point $x$, we shall denote this value simply by $\sigma(\mathrm{Cap}_\rho)$. For a given (measurable) set $A \subseteq S^d$ we then write
$$d_{\mathrm{Cap}(x,\rho)}(A) := \frac{\sigma(A \cap \mathrm{Cap}(x, \rho))}{\sigma(\mathrm{Cap}_\rho)}$$
for the density of $A$ inside this cap.

We define a (spherical) configuration on $S^d$ as a finite subset of $\mathbb{R}^{d+1}$ which is congruent to a set on $S^d$; it is convenient to allow for configurations that are not necessarily on the sphere in order to consider dilations. Note that, if $P, Q \subset S^d$ are two configurations which are on the sphere, then $P \simeq Q$ if and only if there is a transformation $T \in \mathrm{O}(d+1)$ for which $P = T \cdot Q$ (translations are no longer necessary in this case).

A spherical configuration $P$ on $S^d$ is said to be admissible if it has at most $d$ points and if it is congruent to a collection $P' \subset S^d$ which is linearly independent. As before, we shall say that some set $A \subseteq S^d$ contains no copies of $P$, or that $A$ avoids $P$, if there is no subset of $A$ which is congruent to $P$.

The natural analogues of the independence density in the spherical setting can now be given. For $n \ge 1$ configurations $P_1, \dots, P_n$ on $S^d$, we define the quantities
$$m_{S^d}(P_1, \dots, P_n) := \sup \big\{ \sigma(A) : A \subset S^d \text{ avoids } P_i \text{ for all } 1 \le i \le n \big\}$$
and
$$m_{\mathrm{Cap}(x,\rho)}(P_1, \dots, P_n) := \sup \big\{ d_{\mathrm{Cap}(x,\rho)}(A) : A \subset \mathrm{Cap}(x, \rho) \text{ avoids } P_i \text{ for all } 1 \le i \le n \big\}.$$
Whenever convenient we will enunciate and prove results in the case of only one forbidden configuration, as the more general case of multiple forbidden configurations follows from the same arguments with only trivial modifications.

The first issue we encounter in the spherical setting is that it is not compatible with dilations: given a collection of points $P \subset S^d$ and some dilation parameter $t > 0$, it is in general not true that there exists a collection $Q \subset S^d$ congruent to $tP$.
However, there is a large class of configurations (including the ones we call admissible) for which this is true whenever $0 < t \le 1$; we shall say that they are contractible.

It is easy to show that any configuration $P \subset S^d$ having at most $d + 1$ points is contractible. Indeed, these points will all be contained in a $d$-dimensional affine hyperplane $H \subset \mathbb{R}^{d+1}$; let $w \in \mathbb{R}^{d+1}$ be a normal vector to $H$ and consider translations $sw + H$ of this hyperplane in the direction of $w$. By elementary geometry, for any given $0 < t \le 1$ there is some $s \ge 0$ for which the set of points in $(sw + H) \cap S^d$ which are closest to $sw + P$ is congruent to $tP$.

Footnote (on the definition of spherical caps): It is more customary to define the spherical cap using angular distance instead of Euclidean distance as we use. There is no meaningful (qualitative) difference between these two choices, but the use of the Euclidean distance will be more convenient for us.

Footnote (on the definition of admissible spherical configurations): Note that this definition is different from the one in the Euclidean setting, where we required the points to be affinely independent instead of linearly independent. The reason behind this difference is that the Euclidean space is translation invariant while the sphere is not, so affine properties on $\mathbb{R}^d$ should translate to linear properties on $S^d$.

Even when the configuration we are considering is contractible, however, there is no easy relationship between the independence density of its distinct dilates. We will then start with the following reassuring lemma, which in a sense assures us the results we wish to obtain aren’t true for only trivial reasons.

Lemma 8.
For any fixed contractible configuration P ⊂ S d we have that inf Denote by δ S d ( γ ) the packing density of S d by caps of radius γ , i.e. the largestpossible density of a collection of interior-disjoint caps each having radius γ . It isclear that δ S d ( γ ) ≥ σ (Cap γ ) is bounded away from zero when γ is bounded away fromzero, and it is well-known that δ S d ( γ ) tends to the sphere packing density of R d when γ → 0. In particular, inf <γ ≤ δ S d ( γ ) > < t ≤ 1, let P t denote the centers of caps on an (arbitrary) optimalcap packing of radius diam tP , and define the set A t := [ x ∈P t Cap (cid:18) x, diam tP (cid:19) It is easy to see that A t does not contain any copy of tP . Moreover, since the inequality σ (Cap ρ/ ) ≥ c d σ (Cap ρ ) holds for some c d > < ρ ≤ 2, we conclude thatinf 0, denote byB( I, δ ) := { R ∈ O( d + 1) : k R − I k → ≤ δ } the ball of radius δ in operator norm centered on the identity I ∈ O( d + 1). Given δ, γ > V δ ( γ ) acting on measurable sets A ⊆ S d by V δ ( γ ) A := ( x ∈ S d : 1 µ (B( I, δ )) Z B( I,δ ) A ( Rx ) dµ ( R ) ≥ γ ) The reason for this choice of operator is that the analogue of Lemma 3 can be easilyproven for V δ ( γ ): Lemma 9. Let P be a k -point configuration on S d . If A ⊆ S d is a measurable set forwhich V δ (1 − / k ) A contains a copy of P , then I P ( A ) ≥ µ (B( I, δ )) / > .Proof. Suppose { u , . . . , u k } ≃ P is a copy of P in V δ (1 − / k ) A ; this means that P R ∈ B( I,δ ) ( Ru i / ∈ A ) ≤ / k for all 1 ≤ i ≤ k . Then I P ( A ) = P R ∈ O( d +1) ( Rv , . . . , Rv k ∈ A ) ≥ µ (B( I, δ )) · P R ∈ B( I,δ ) ( Rv , . . . , Rv k ∈ A ) ≥ µ (B( I, δ )) − k X i =1 P R ∈ B( I,δ ) ( Ru i / ∈ A ) ! ≥ µ (B( I, δ ))2 , as desired. The next thing we need is an analogue of the Counting Lemma in the spherical setting,saying we do not significantly change the count of configurations in a given set A ⊆ S d by blurring this set a little. 
As in the Euclidean setting, we will use Fourier-analytic methods to prove such a result. We now give a quick overview of the definitions and results we need on harmonic analysis for our arguments.

Given an integer $n \ge 0$, we write $\mathcal{H}_n^{d+1}$ for the space of real harmonic polynomials, homogeneous of degree $n$, on $\mathbb{R}^{d+1}$. That is,
$$\mathcal{H}_n^{d+1} = \left\{ f \in \mathbb{R}[x_1, \dots, x_{d+1}] : f \text{ homogeneous},\ \deg f = n,\ \sum_{i=1}^{d+1} \frac{\partial^2}{\partial x_i^2} f = 0 \right\}.$$
The restriction of the elements of $\mathcal{H}_n^{d+1}$ to $S^d$ are called spherical harmonics of degree $n$ on $S^d$. If $Y \in \mathcal{H}_n^{d+1}$, note that $Y(x) = \|x\|^n\, Y(x')$ where $x = \|x\| x'$ and $x' \in S^d$; we can then identify $\mathcal{H}_n^{d+1}$ with the space of spherical harmonics of degree $n$, which by a slight (and common) abuse of notation we also denote $\mathcal{H}_n^{d+1}$.

Harmonic polynomials of different degrees are orthogonal with respect to the standard inner product $\langle f, g \rangle_{S^d} := \int_{S^d} f(x)\, g(x)\, d\sigma(x)$. Moreover, it is a well-known fact that the family of spherical harmonics is dense in $L^2(S^d)$, and so
$$L^2(S^d) = \bigoplus_{n=0}^{\infty} \mathcal{H}_n^{d+1}.$$
Denoting by $\mathrm{proj}_n : L^2(S^d) \to \mathcal{H}_n^{d+1}$ the orthogonal projection onto $\mathcal{H}_n^{d+1}$, what this means is that $f = \sum_{n=0}^{\infty} \mathrm{proj}_n f$ (with equality in the $L^2$ sense) for all $f \in L^2(S^d)$.

There is a family $(P_n^d)_{n \ge 0}$ of polynomials on $[-1, 1]$ which is associated to this decomposition. We use the convention that $\deg P_n^d = n$ and $P_n^d(1) = 1$. These polynomials are then uniquely characterized by the following two properties:

(i) for each fixed $y \in S^d$, the function on $S^d$ given by $x \mapsto P_n^d(x \cdot y)$ is in $\mathcal{H}_n^{d+1}$;

(ii) the projection operator $\mathrm{proj}_n : L^2(S^d) \to \mathcal{H}_n^{d+1}$ is given by
$$\mathrm{proj}_n f(x) = \dim \mathcal{H}_n^{d+1} \int_{S^d} P_n^d(x \cdot y)\, f(y)\, d\sigma(y). \tag{7}$$

Using these two facts we conclude also the useful property
$$\int_{S^d} P_n^d(x \cdot y)\, P_n^d(x \cdot z)\, d\sigma(x) = \frac{1}{\dim \mathcal{H}_n^{d+1}}\, P_n^d(y \cdot z) \quad \forall\, y, z \in S^d. \tag{8}$$
Note that, by orthogonality of the spaces $\mathcal{H}_n^{d+1}$, property (i) implies that
$$\int_{S^d} P_n^d(x \cdot y)\, P_m^d(x \cdot y)\, d\sigma(x) = 0 \quad \text{if } n \ne m$$
(for any fixed $y \in S^d$). Using the change of variables $t = x \cdot y$, this is equivalent to saying that
$$\int_{-1}^{1} P_n^d(t)\, P_m^d(t)\, (1 - t^2)^{(d-2)/2}\, dt = 0 \quad \text{if } n \ne m.$$
This shows that the polynomials $P_n^d$ are, up to multiplicative constants, the Gegenbauer polynomials $C_n^{\lambda}$ with parameter $\lambda = (d-1)/2$: we have $P_n^d(x) = C_n^{(d-1)/2}(x) / C_n^{(d-1)/2}(1)$; we refer the reader to [4] for information on the Gegenbauer polynomials, and for a proof that the $P_n^d$ thus defined indeed satisfy properties (i) and (ii). The following simple facts about $P_n^d$ follow immediately from the corresponding properties of the Gegenbauer polynomials:

Lemma 10. For all integers $d \ge 2$ and $n \ge 0$ the following hold:

• $P_n^d(t) \in [-1, 1]$ for all $t \in [-1, 1]$;

• For any fixed $\gamma > 0$, $\max_{t \in [-1+\gamma,\, 1-\gamma]} |P_n^d(t)|$ tends to zero as $n \to \infty$.

We will follow Dunkl [7] in defining both the convolution operation on the sphere and the spherical analogue of Fourier coefficients. For this we will need to break a little the symmetry of the sphere and distinguish an (arbitrary) point $e$ on $S^d$; we think of this point as being the north pole.

For a given $x \in S^d$, we write $\mathcal{M}(S^d; x)$ for the space of (signed) Borel regular zonal measures on $S^d$ with pole at $x$, that is, those measures which are invariant under the action of $\mathrm{Stab}_{\mathrm{O}(d+1)}(x)$.
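As a numerical sanity check on the normalized polynomials $P_n^d$ described above, the following sketch evaluates them through the standard three-term recurrence for Gegenbauer polynomials and tests the normalization, the weighted orthogonality relation, and the decay away from $\pm 1$ stated in Lemma 10. It is only an illustration, not part of the original text: the function names (`gegenbauer`, `P`, `weighted_inner`) and the midpoint quadrature with 20000 steps are assumptions made here, and the code requires $d \ge 2$ so that $\lambda = (d-1)/2 > 0$.

```python
def gegenbauer(n, lam, x):
    """C_n^lam(x) via the recurrence
    m*C_m = 2*x*(m + lam - 1)*C_{m-1} - (m + 2*lam - 2)*C_{m-2}, valid for lam > 0."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 2.0 * lam * x
    for m in range(2, n + 1):
        prev, cur = cur, (2.0 * x * (m + lam - 1.0) * cur
                          - (m + 2.0 * lam - 2.0) * prev) / m
    return cur


def P(n, d, t):
    """Normalized polynomial P_n^d(t) = C_n^lam(t) / C_n^lam(1) with lam = (d-1)/2."""
    lam = (d - 1) / 2.0
    return gegenbauer(n, lam, t) / gegenbauer(n, lam, 1.0)


def weighted_inner(n, m, d, steps=20000):
    """Midpoint-rule approximation of the integral over [-1, 1] of
    P_n^d(t) * P_m^d(t) * (1 - t^2)^((d-2)/2)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += P(n, d, t) * P(m, d, t) * (1.0 - t * t) ** ((d - 2) / 2.0)
    return total * h


# Degrees 1 and 3 have the same parity, so their orthogonality is not a parity accident.
print(P(3, 2, 1.0), weighted_inner(1, 3, 2), weighted_inner(1, 3, 4))
# Decay at an interior point, in the spirit of Lemma 10 (here d = 2, t = 0.5).
print(abs(P(200, 2, 0.5)))
```

For $d = 2$ the parameter is $\lambda = 1/2$ and the $P_n^2$ are the Legendre polynomials, which gives an easy independent check of individual values such as $P_2^2(1/2) = -1/8$.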
The elements of $\mathcal{M}(S^d; e)$ are referred to simply as the zonal measures.

Given a function $f \in L^2(S^d)$ and a zonal measure $\nu \in \mathcal{M}(S^d; e)$, we define their convolution $f * \nu$ by
$$f * \nu(x) := \int_{S^d} f(y)\, d\varphi_x \nu(y) \quad \forall\, x \in S^d,$$
where $\varphi_x : \mathcal{M}(S^d; e) \to \mathcal{M}(S^d; x)$ is the rotation operator defined by
$$\varphi_x \nu(A) = \nu(T_x^{-1} A), \quad \text{where } T_x \in \mathrm{O}(d+1) \text{ satisfies } T_x e = x.$$
The value $f * \nu(x)$ can be thought of as the average of $f$ according to a measure which acts with respect to $x$ as $\nu$ acts with respect to $e$. It is easy to see that this operation is well-defined, independently of the choice of $T_x$: if $S_x e = T_x e = x$, then $S_x^{-1} T_x \in \mathrm{Stab}(e)$ and so $\nu(S_x^{-1} A) = \nu\big((S_x^{-1} T_x)\, T_x^{-1} A\big) = \nu(T_x^{-1} A)$.

For an integer $n \ge 0$ and a zonal measure $\nu \in \mathcal{M}(S^d; e)$, we define its $n$-th Fourier coefficient as
$$\widehat{\nu}_n := \int_{S^d} P_n^d(e \cdot y)\, d\nu(y).$$
The main property we will need of Fourier coefficients is the following result, which is stated in Dunkl’s paper [7] and can be proven using a straightforward modification of the methods exposed in Chapter 2 of [4]:

Theorem 5. If $f \in L^2(S^d)$ and $\nu \in \mathcal{M}(S^d; e)$, then $f * \nu \in L^2(S^d)$ and
$$\mathrm{proj}_n(f * \nu) = \widehat{\nu}_n\, \mathrm{proj}_n f \quad \forall\, n \ge 0.$$

We will need two zonal measures, each depending on a parameter $\delta > 0$, which are related to our zooming-out operators and will give the ‘blurring’ of the spherical sets we shall consider.

The first, denoted $\mathrm{cap}_\delta$, is the uniform probability measure on $\mathrm{Cap}(e, \delta)$:
$$\mathrm{cap}_\delta(A) = \frac{\sigma(A \cap \mathrm{Cap}(e, \delta))}{\sigma(\mathrm{Cap}(e, \delta))}.$$
The second is the measure $\nu_\delta$ defined by the formula
$$\int_{S^d} f(x)\, d\nu_\delta(x) = \frac{1}{\mu(\mathrm{B}(I, \delta))} \int_{\mathrm{B}(I, \delta)} f(Re)\, d\mu(R).$$
It is easy to check from the definitions that
$$(\widehat{\mathrm{cap}_\delta})_n = \frac{1}{\sigma(\mathrm{Cap}_\delta)} \int_{\mathrm{Cap}(e, \delta)} P_n^d(e \cdot y)\, d\sigma(y), \qquad (\widehat{\nu_\delta})_n = \frac{1}{\mu(\mathrm{B}(I, \delta))} \int_{\mathrm{B}(I, \delta)} P_n^d(e \cdot Re)\, d\mu(R)$$
for all $n \ge 0$, and
$$f * \mathrm{cap}_\delta(x) = \frac{1}{\sigma(\mathrm{Cap}_\delta)} \int_{\mathrm{Cap}(x, \delta)} f(y)\, d\sigma(y), \qquad f * \nu_\delta(x) = \frac{1}{\mu(\mathrm{B}(I, \delta))} \int_{\mathrm{B}(I, \delta)} f(Rx)\, d\mu(R)$$
for all $f \in L^2(S^d)$.

Lemma 11. Let $d \ge 2$ and $\gamma > 0$.
Then, for all f, g ∈ L ( S d ) and all u, v ∈ S d with u · v ∈ [ − γ, − γ ] , we have that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) f ( Ru ) ( g ( Rv ) − g ∗ ν δ ( Rv )) dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ c d,γ ( δ ) k f k k g k and (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) f ( Ru ) ( g ( Rv ) − g ∗ cap δ ( Rv )) dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ c d,γ ( δ ) k f k k g k , where c d,γ ( δ ) → as δ → .Proof. Denote by ˜ µ e the Haar measure on Stab( e ), and assume without loss of gener-ality that u coincides with the north pole e . By symmetry, the expressions we wish tobound may then be written as (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) f ( Re ) h ( Rv ) dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) f ( Re ) Z Stab( e ) h ( RSv ) d ˜ µ e ( S ) ! dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) for h = g − g ∗ ν δ and h = g − g ∗ cap δ .Denote t := e · v . Note that, when S ∈ Stab( e ) is distributed uniformly accordingto ˜ µ e , the point Sv is uniformly distributed on S d − t := { y ∈ S d : e · y = t } . 
Denoteby σ ( d − t the uniform probability measure on S d − t (that is, the unique one which isinvariant under the action of Stab( e )).Making the change of variables y = Sv , we see that Z Stab( e ) h ( RSv ) d ˜ µ e ( S ) = Z S d − t h ( Ry ) dσ ( d − t ( y )= Z S d − t h ( z ) dσ ( d − t ( R − z ) = h ∗ σ ( d − t ( Re ) (9)The expressions we wish to bound are then of the form (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) f ( Re ) h ∗ σ ( d − t ( Re ) dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z S d f ( x ) h ∗ σ ( d − t ( x ) dσ ( x ) (cid:12)(cid:12)(cid:12)(cid:12) Using Parseval’s identity, we can rewrite the right-hand side of this last equality as (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∞ X n =0 Z S d proj n f ( x ) proj n ( h ∗ σ ( d − t )( x ) dσ ( x ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ∞ X n =0 Z S d | proj n f ( x ) | | ( b σ ( d − t ) n | | proj n h ( x ) | dσ ( x ) ≤ ∞ X n =0 | ( b σ ( d − t ) n | k proj n f k k proj n h k Let us consider the case where h = g − g ∗ ν δ , the other choice of h being analogous.In this case the expression above is equal to ∞ X n =0 | ( b σ ( d − t ) n | | − ( b ν δ ) n | k proj n f k k proj n g k = ∞ X n =0 | P dn ( t ) | (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − µ (B( I, δ )) Z B( I,δ ) P dn ( e · Re ) dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) k proj n f k k proj n g k ε > 0. Since t ∈ [ − γ, − γ ] (by hypothesis), from Lemma 10 we obtainthat | P dn ( t ) | ≤ ε/ n ≥ N ( ε, γ ), while (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − µ (B( I, δ )) Z B( I,δ ) P dn ( e · Re ) dµ ( R ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ max − ≤ t ≤ | − P dn ( t ) | = 2always holds. Moreover, since each P dn is a polynomial satisfying P dn (1) = 1, we canchoose δ = δ ( ε, γ ) > | − P dn ( e · Re ) | ≤ ε holds whenever n < N ( ε, γ ) and R ∈ B( I, δ ). 
This implies that the last sum is at most ∞ X n =0 ε k proj n f k k proj n g k ≤ ε k f k k g k whenever δ ≤ δ ( ε, γ ), finishing the proof.Recall that a spherical configuration P on S d is admissible if it has at most d pointsand if it is congruent to a collection P ′ ⊂ S d which is linearly independent. We cannow give the spherical counterpart to the Counting Lemma from last section; as inthe Euclidean setting, we note that the upper bound c P we obtain in our proof canbe made uniform inside a small ball B ⊂ ( R d +1 ) k centered on the configuration P considered. Lemma 12 (Counting Lemma) . For every admissible configuration P on S d thereexists a function c P : (0 , → (0 , with lim δ → + c P ( δ ) = 0 such that the followingholds for all measurable sets A ⊆ S d : | I P ( A ) − I P ( A ∗ ν δ ) | , | I P ( A ) − I P ( A ∗ cap δ ) | ≤ c P ( δ ) ∀ δ ∈ (0 , Proof. Up to congruence, we may assume P ⊂ S d . Similarly to what we did in theEuclidean setting, we will first show that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) f ( T v ) · · · f k − ( T v k − ) ( f k ( T v k ) − f k ∗ ν δ ( T v k )) dµ ( T ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ c ′ P ( δ )whenever 0 ≤ f , . . . , f k ≤ v , v , . . . , v k ) is a permu-tation of the points of P (and similarly with ν δ substituted by cap δ ).Denote by G := Stab O( d +1) ( v , . . . , v k − ) the stabilizer of the first k − P and by H := Stab O( d +1) ( v , . . . , v k − , v k − ) = Stab G ( v k − ) the stabilizer of the first k − P . We can then bound the expression above by Z O( d +1) (cid:12)(cid:12)(cid:12)(cid:12)Z G f k − ( T Sv k − ) ( f k ( T Sv k ) − f k ∗ ν δ ( T Sv k )) dµ G ( S ) (cid:12)(cid:12)(cid:12)(cid:12) dµ ( T ) (10)(we write µ G and µ H for the normalized Haar measures on G and H , respectively).Denote ℓ := d − k + 2 ≥ 2. Since P is non-degenerate, we see that G ≃ O( ℓ + 1)and that both Gv k − and Gv k are spheres of dimension ℓ . 
Morally, we should then beable to apply the last lemma (with d = ℓ , f = f k − ( T · ) and g = f k ( T · )) and easilyconclude. However, the convolution in expression (10) above happens in S d , whilethat on the last lemma would happen in S ℓ ; in particular, if k ≥ ℓ < d ,all of the mass on the average defined by the convolution in (10) lies outside of the ℓ -dimensional sphere Gv k , so this argument cannot work. We will have to work harderto conclude. 23ote that, since Gv k is an ℓ -dimensional sphere while Hv k is an ( ℓ − P is non-degenerate), it follows that there is a point ξ ∈ Gv k which is fixed by H ; this point will work as the north pole of Gv k .It will be more convenient to work on the canonical unit sphere S ℓ instead ofthe ℓ -dimensional sphere Gv k ⊂ S d . We shall then restrict ourselves to the ( ℓ + 1)-dimensional affine hyperplane H determined by H ∩ S d = Gv k and place coordinateson it to identify H with R ℓ +1 and Gv k with S ℓ , noting that G then acts as O( ℓ + 1).More formally, let r > Gv k in R d +1 , so that Gv k is isometric to rS ℓ . Take such an isometry ψ : Gv k → rS ℓ , and define e ∈ S ℓ by e := ψ ( ξ ) /r . Nowwe construct a map φ : G → O( ℓ + 1) defined by φ ( S ) ψ ( x ) = ψ ( Sx ) ∀ x ∈ Gv k for each S ∈ G . It is easy to check that this map is well-defined and gives an isomor-phism between G and O( ℓ + 1) satisfying φ ( H ) = Stab O( ℓ +1) ( e ).For each fixed T ∈ O( d + 1), define the functions g T , h T : S ℓ → [ − , 1] by g T ( Re ) := f k − ( T φ − ( R ) v k − ) and h T ( Re ) := f k ( T φ − ( R ) ξ ) − f k ∗ ν δ ( T φ − ( R ) ξ )for all R ∈ O( ℓ +1). These functions are indeed well-defined on S ℓ , since Stab G ( v k − ) =Stab G ( ξ ) = φ − (Stab O( ℓ +1) ( e )). Note that h T can also be written as a function of x ∈ S ℓ by making use of the isometry ψ − : rS ℓ → Gv k : h T ( x ) = f k ( T ψ − ( rx )) − f k ∗ ν δ ( T ψ − ( rx ))Denote by u := ψ ( v k ) /r the point in S ℓ corresponding to v k . 
Making the changeof variables R = φ ( S ), we obtain Z G f k − ( T Sv k − ) ( f k ( T Sv k ) − f k ∗ ν δ ( T Sv k )) dµ G ( S )= Z O( ℓ +1) g T ( Re ) h T ( Ru ) dµ ℓ +1 ( R )= Z O( ℓ +1) g T ( Re ) Z Stab( e ) h T ( RSu ) d ˜ µ e ( S ) ! dµ ℓ +1 ( R ) , where we write Stab( e ) for Stab O( ℓ +1) ( e ) and ˜ µ e for its Haar measure. Working as wedid in the chain of equalities (9), we see that the expression in parenthesis is equal to h T ∗ σ ( ℓ − e · u ( Re ), where σ ( ℓ − e · u is the uniform probability measure on the ( ℓ − e ) u = { y ∈ S ℓ : e · y = e · u } (and the convolution now takes place in S ℓ with e as the north pole). Making the change of variables x = Re , we then see that theexpression above is equal to Z O( ℓ +1) g T ( Re ) h T ∗ σ ( ℓ − e · u ( Re ) dµ ℓ +1 ( R ) = Z S ℓ g T ( x ) h T ∗ σ ( ℓ − e · u ( x ) dσ ( ℓ ) ( x )We conclude that the expression (10) we wish to bound is equal to Z O( d +1) (cid:12)(cid:12)(cid:12)(cid:12)Z S ℓ g T ( x ) h T ∗ σ ( ℓ − e · u ( x ) dσ ( ℓ ) ( x ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) dµ d +1 ( T ) ≤ Z O( d +1) k h T ∗ σ ( ℓ − e · u k dµ d +1 ( T ) ≤ Z O( d +1) k h T ∗ σ ( ℓ − e · u k dµ d +1 ( T ) ! / e · u , which will be necessary for bounding k h T ∗ σ ( ℓ − e · u k .From the identity k re − ru k R ℓ +1 = k ψ − ( re ) − ψ − ( ru ) k R d +1 = k ξ − v k k R d +1 we conclude that r (2 − e · u ) = 2 − ξ · v k , and so e · u = ( ξ · v k − (1 − r )) /r / ∈ {− , } depends only on P and not our later choices.Now fix an arbitrary ε > 0. By Parseval’s identity, we have that k h T ∗ σ ( ℓ − e · u k = ∞ X n =0 k proj n ( h T ∗ σ ( ℓ − e · u ) k = ∞ X n =0 | ( b σ ( ℓ − e · u ) n | k proj n h T k = ∞ X n =0 P ℓn ( e · u ) k proj n h T k Since e · u / ∈ {− , } is a constant depending only on P , there exists N = N ( ε, P ) ∈ N such that | P ℓn ( e · u ) | ≤ ε for all n > N . 
Using also that − ≤ P ℓn ( t ) ≤ − ≤ t ≤ 1, we conclude that k h T ∗ σ ( ℓ − e · u k ≤ N X n =0 k proj n h T k + X n>N ε k proj n h T k The second term on the right-hand side of the inequality above is bounded by ε k h T k ≤ ε , so let us concentrate on the first term.By identities (7) and (8), we have k proj n h T k = Z S ℓ (cid:18) dim H ℓ +1 n Z S ℓ h T ( y ) P ℓn ( x · y ) dσ ( y ) (cid:19) dσ ( x )= (dim H ℓ +1 n ) Z S ℓ Z S ℓ h T ( y ) h T ( z ) (cid:18)Z S ℓ P ℓn ( x · y ) P ℓn ( x · z ) dσ ( x ) (cid:19) dσ ( y ) dσ ( z )= dim H ℓ +1 n Z S ℓ Z S ℓ h T ( y ) h T ( z ) P ℓn ( y · z ) dσ ( y ) dσ ( z )Since | P ℓn ( y · z ) | ≤ y, z ∈ S ℓ , we conclude that Z O( d +1) k proj n h T k dµ d +1 ( T )= dim H ℓ +1 n Z S ℓ Z S ℓ Z O( d +1) h T ( y ) h T ( z ) dµ d +1 ( T ) ! P ℓn ( y · z ) dσ ( y ) dσ ( z ) ≤ dim H ℓ +1 n Z S ℓ Z S ℓ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) h T ( y ) h T ( z ) dµ d +1 ( T ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) dσ ( y ) dσ ( z ) By Lemma 10 this value of N can be made robust to small perturbations of the value e · u , whichis equivalent to small perturbations of the configuration P . This remark, and others in the samevein, are the reason why the bound obtained in the proof can be made to hold uniformly inside smallneighborhoods of the considered configuration. 25e now divide this last double integral on the sphere into two parts, depending onwhether or not y · z is close to the extremal points 1 or − 1. 
Thus, for some parameter0 < γ < Z S ℓ Z S ℓ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) h T ( y ) h T ( z ) dµ d +1 ( T ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) { y · z < − γ or y · z > − γ } dσ ( y ) dσ ( z )+ Z S ℓ Z S ℓ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z O( d +1) h T ( y ) h T ( z ) dµ d +1 ( T ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) {− γ ≤ y · z ≤ − γ } dσ ( y ) dσ ( z )Since − ≤ h T ≤ 1, the first term is at most2 Z S ℓ Z S ℓ { y · z > − γ } dσ ( y ) dσ ( z ) = 2 σ ( ℓ ) (Cap S ℓ ( e, p γ ))To bound the second term, note that for fixed y, z ∈ S ℓ we have Z O( d +1) h T ( y ) h T ( z ) dµ d +1 ( T )= Z O( d +1) ( f k ( T e y ) − f k ∗ ν δ ( T e y )) ( f k ( T e z ) − f k ∗ ν δ ( T e z )) dµ d +1 ( T )where e y := ψ − ( ry ) and e z := ψ − ( rz ). Moreover, we have k ry − rz k R ℓ +1 = k e y − e z k R d +1 = ⇒ e y · e z = 1 − r (1 − y · z );thus, whenever y · z ∈ [ − γ, − γ ], we have e y · e z ∈ [ − r γ, − r γ ]. UsingLemma 11 (with f = f k − f k ∗ ν δ , g = f k and γ substituted by r γ ) we conclude thatthe second term is bounded by c d,r γ ( δ ).Taking stock of everything, we obtain Z O( d +1) k h T ∗ σ ( ℓ − e · u k dµ d +1 ( T ) ≤ ε + N X n =0 dim H ℓ +1 n (cid:16) σ ( ℓ ) (Cap S ℓ ( e, p γ )) + c d,r γ ( δ ) (cid:17) for any 0 < γ < 1. Choosing γ small enough depending on ℓ , ε and N , and thenchoosing δ small enough depending on d , r γ , ε and N (so ultimately only on ε and P ), we can bound the right-hand side by 4 ε ; the expression (10) is then bounded by2 ε in this case.For such small values of δ we thus conclude from our telescoping sum trick (ex-plained in Section 2.1) that | I P ( A ) − I P ( A ∗ ν δ ) | ≤ kε , finishing the proof since ε > With the Counting Lemma in hand we next provide the other main tool in the spher-ical setting, the Supersaturation Theorem . 
For the rest of this subsection we fix anadmissible configuration P on k points, and in order to lighten notation we will omitthe dependency on P of the constants and functions we define here. Theorem 6 (Supersaturation Theorem) . For every ε > there exists c ( ε ) > suchthat the following holds. If A ⊆ S d has measure σ ( A ) ≥ m S d ( P )+ ε , then I P ( A ) ≥ c ( ε ) . roof. We proceed by downwards induction on ε , the case where ε ≥ − m S d ( P )being obvious (with c ( ε ) = 1).Suppose now that the conclusion of the theorem is true for ε = (1 + 1 / k ) ε ′ , forsome ε ′ > I P ( A ) ≥ c ((1 + 1 / k ) ε ′ ) > σ ( A ) ≥ m S d ( P ) + (1 + 1 / k ) ε ′ . We will show that it must also be true for ε = ε ′ , with somevalue 0 < c ( ε ′ ) ≤ c ((1 + 1 / k ) ε ′ ).Let then A ⊆ S d be a set of measure σ ( A ) ≥ m S d ( P )+ ε ′ . If for some δ = δ ( ε ′ ) > A ) it holds that σ ( V δ (1 − / k ) A ) > m S d ( P ), it follows from Lemma 9 that I P ( A ) ≥ µ (B( I, δ ( ε ′ ))) / > 0. Let us thenassume that σ ( V δ (1 − / k ) A ) ≤ m S d ( P ).For any 0 < γ < − / k fixed and any x ∈ S d , we clearly have A ∗ ν δ ( x ) ≤ γ + (cid:18) − k (cid:19) { A ∗ ν δ ( x ) ≥ γ } + 12 k (cid:26) A ∗ ν δ ( x ) ≥ − k (cid:27) = γ + (cid:18) − k (cid:19) V δ ( γ ) A ( x ) + 12 k V δ (1 − / k ) A ( x )Integrating over all x ∈ S d we obtain σ ( A ) = Z S d A ∗ ν δ ( x ) dσ ( x ) ≤ γ + (cid:18) − k (cid:19) σ ( V δ ( γ ) A ) + 12 k σ ( V δ (1 − / k ) A ) , and so σ ( V δ ( γ ) A ) ≥ σ ( A ) − / k · σ ( V δ (1 − / k ) A ) − γ − / k ≥ m S d ( P ) + ε ′ − / k · m S d ( P ) − γ − / k = m S d ( P ) + ε ′ − γ − / k Taking γ := ε ′ / k , we obtain σ ( V δ ( γ ) A ) ≥ m S d ( P ) + (1 + 1 / k ) ε ′ .Applying the induction hypothesis to the set V δ ( γ ) A we then have I P ( V δ ( γ ) A ) ≥ c ((1 + 1 / k ) ε ′ ). 
By the Counting Lemma (Lemma 12) we conclude that I P ( A ) ≥ I P ( A ∗ ν δ ) − c ( L ) P ( δ ) ≥ γ k I P ( V δ ( γ ) A ) − c ( L ) P ( δ ) ≥ (cid:18) ε ′ k (cid:19) k c ((1 + 1 / k ) ε ′ ) − c ( L ) P ( δ )Choosing δ = δ ( ε ′ ) small enough depending on ε ′ (and also on c ((1 + 1 / k ) ε ′ ) and P ) we conclude that I P ( A ) ≥ min ( µ (B( I, δ ( ε ′ )))2 , (cid:18) ε ′ k (cid:19) k c ((1 + 1 / k ) ε ′ )2 ) Taking this value for c ( ε ′ ) we conclude the inductive step, and the result immediatelyfollows. Remark. Note that the lower bound c ( ε ) obtained in this proof of the SupersaturationTheorem can be made uniform over a small ball around the configuration P considered,since the same is true for the function c ( L ) P .27 orollary 2. For every ε > there exist δ > such that the following holds. Forany δ ≤ δ and any measurable set A ⊆ S d , if σ ( Z δ ( ε ) A ) ≥ m S d ( P ) + ε then A contains a copy of P .Proof. By the Supersaturation Theorem we know that σ ( Z δ ( ε ) A ) ≥ m S d ( P ) + ε = ⇒ I P ( Z δ ( ε ) A ) ≥ c ( T ) ( ε )holds for all δ > 0. By the Counting Lemma we then have I P ( A ) ≥ I P ( A ∗ cap δ ) − c ( L ) P ( δ ) ≥ ε k I P ( Z δ ( ε ) A ) − c ( L ) P ( δ ) ≥ ε k c ( T ) ( ε ) − c ( L ) P ( δ )Since c ( L ) P ( δ ) → δ → 0, there is δ > I P ( A ) > δ ≤ δ . Thisimplies that A contain a copy of P . We must now tackle the problem of obtaining a relationship between the indepen-dence density m S d ( P ) of a given configuration P ⊂ S d and its spherical cap version m Cap( x,ρ ) ( P ), as this will be needed in order to obtain the spherical analogue of The-orem 3.In the Euclidean setting this was very easy to do (see Lemma 2), using the factthat we can tessellate R d with cubes Q ( x, R ) of any given side length R > 0. 
This isno longer the case in the spherical setting, as it is impossible to completely cover S d using non-overlapping spherical caps of some given radius; in fact, this cannot be doneeven approximately if we require the radii of the spherical caps to be the same (as wedid with the side length of the cubes in R d ).We will then need to use a much weaker ‘almost-covering’ result, saying that we cancover almost all of the sphere by using finitely many non-overlapping spherical capsof different radii. For technical reasons we will also want these radii to be arbitrarilysmall. Lemma 13. For every ε > there is a finite cap packing P = { Cap( x i , ρ i ) : 1 ≤ i ≤ N } of S d with density σ ( P ) > − ε and with radii ρ i ≤ ε for all ≤ i ≤ N .Proof. We will use the same notation for both a collection of caps and the set ofpoints on S d which belong to (at least) one of these caps. The desired packing P willbe constructed in several steps, starting with P := { Cap( e, ε ) } .Now suppose P i − has already been constructed (and is finite) for some i ≥ P i . Define C i := (cid:8) Cap ( x, min { ε, dist( x, P i − ) } ) : x ∈ S d \ P i − (cid:9) , and note that C i is a covering of S d \ P i − by caps of positive radii (since P i − is closedon S d ). By Vitali’s Covering Lemma , there is a countable subcollection Q i = ∞ [ j =1 { Cap( x j , r j ) } ⊂ C i Note that spherical caps are exactly the (closed) balls of the separable metric space S d endowedwith the Euclidean distance induced from R d +1 . disjoint caps in C i such that S d \ P i − ⊆ S ∞ j =1 Cap( x j , r j ). In particular1 − σ ( P i − ) = σ ( S d \ P i − ) ≤ ∞ X j =1 σ (Cap( x j , r j )) ≤ K d σ ( Q i ) , where we denote K d := sup r> σ (Cap r ) /σ (Cap r ) < ∞ . 
Taking $N_i \in \mathbb{N}$ such that
\[ \sum_{j=1}^{N_i} \sigma(\mathrm{Cap}(x_j, r_j)) \ge \sigma(\mathcal{Q}_i) - \frac{1 - \sigma(\mathcal{P}_{i-1})}{2K_d}, \]
we see that $\mathcal{P}'_i := \{\mathrm{Cap}(x_j, r_j) : 1 \le j \le N_i\}$ satisfies
\[ \sigma(\mathcal{P}'_i) \ge \frac{1 - \sigma(\mathcal{P}_{i-1})}{2K_d}. \]
Now set $\mathcal{P}_i := \mathcal{P}_{i-1} \cup \mathcal{P}'_i$; this is a finite cap packing with
\[ 1 - \sigma(\mathcal{P}_i) = 1 - \sigma(\mathcal{P}_{i-1}) - \sigma(\mathcal{P}'_i) \le (1 - \sigma(\mathcal{P}_{i-1})) \left(1 - \frac{1}{2K_d}\right) \le (1 - \sigma(\mathrm{Cap}_\varepsilon)) \left(1 - \frac{1}{2K_d}\right)^{i} \]
(where the last inequality follows by induction). Taking $n \ge 1$ such that $(1 - \sigma(\mathrm{Cap}_\varepsilon)) \left(1 - \frac{1}{2K_d}\right)^{n} < \varepsilon$, we see that $\mathcal{P} := \mathcal{P}_n$ satisfies all requirements.

We can now obtain our analogue of Lemma 2, relating the two versions of independence density in the spherical setting:

Lemma 14. For every $\varepsilon > 0$, $\rho > 0$ there is $t_0 > 0$ such that the following holds whenever $P_1, \dots, P_n \subset S^d$ have diameter at most $t_0$:
\[ \left| m_{\mathrm{Cap}(x,\rho)}(P_1, \dots, P_n) - m_{S^d}(P_1, \dots, P_n) \right| < \varepsilon. \]

Proof. If $A \subset S^d$ is a set that does not contain copies of $P_1, \dots, P_n$, then for any $x \in S^d$ the set $A \cap \mathrm{Cap}(x, \rho) \subseteq \mathrm{Cap}(x, \rho)$ also does not contain copies of $P_1, \dots, P_n$, and $\mathbb{E}_{x \in S^d}[d_{\mathrm{Cap}(x,\rho)}(A)] = \sigma(A)$. There must then be $x \in S^d$ such that
\[ d_{\mathrm{Cap}(x,\rho)}(A \cap \mathrm{Cap}(x, \rho)) = d_{\mathrm{Cap}(x,\rho)}(A) \ge \sigma(A), \]
proving that $m_{\mathrm{Cap}(x,\rho)}(P_1, \dots, P_n) \ge m_{S^d}(P_1, \dots, P_n)$.

For the opposite direction, let $\gamma \le \varepsilon/4$ be small enough so that $\sigma(\mathrm{Cap}_{\rho+\gamma}) \le (1 + \varepsilon/4)\, \sigma(\mathrm{Cap}_\rho)$. By Lemma 13 we know there is a cap packing $\mathcal{P} = \{\mathrm{Cap}(x_i, \rho_i) : 1 \le i \le N\}$ of $S^d$ with $\sigma(\mathcal{P}) \ge 1 - \gamma$ and $0 < \rho_1, \dots, \rho_N \le \gamma$. Now let $t_0 > 0$ be small enough so that $\sigma(\mathrm{Cap}_{\rho_i - t_0}) \ge (1 - \varepsilon/4)\, \sigma(\mathrm{Cap}_{\rho_i})$ for all $1 \le i \le N$; note that $t_0$ will ultimately depend only on $\varepsilon$ and $\rho$.

Fixing any configurations $P_1, \dots, P_n \subset S^d$ of diameter at most $t_0$, let $A \subset \mathrm{Cap}(x, \rho)$ be a set avoiding all of them. We shall construct a set $\tilde{A} \subset S^d$ which avoids $P_1, \dots, P_n$ and which satisfies $\sigma(\tilde{A}) > d_{\mathrm{Cap}(x,\rho)}(A) - \varepsilon$; this will finish the proof.

For each $1 \le i \le N$ denote $\tilde{\rho}_i := \rho_i - t_0 < \gamma$.
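We have that
\[ \sigma(A) = \int_{S^d} d_{\mathrm{Cap}(y, \tilde{\rho}_i)}(A)\, d\sigma(y) = \int_{\mathrm{Cap}(x, \rho + \tilde{\rho}_i)} d_{\mathrm{Cap}(y, \tilde{\rho}_i)}(A)\, d\sigma(y) \le \int_{\mathrm{Cap}(x, \rho)} d_{\mathrm{Cap}(y, \tilde{\rho}_i)}(A)\, d\sigma(y) + \sigma(\mathrm{Cap}_{\rho + \tilde{\rho}_i}) - \sigma(\mathrm{Cap}_\rho). \]
Since $\tilde{\rho}_i < \gamma$, dividing by $\sigma(\mathrm{Cap}_\rho)$ we obtain
\[ \mathbb{E}_{y \in \mathrm{Cap}(x,\rho)} \left[ d_{\mathrm{Cap}(y, \tilde{\rho}_i)}(A) \right] \ge \frac{\sigma(A)}{\sigma(\mathrm{Cap}_\rho)} - \frac{\sigma(\mathrm{Cap}_{\rho + \tilde{\rho}_i}) - \sigma(\mathrm{Cap}_\rho)}{\sigma(\mathrm{Cap}_\rho)} > d_{\mathrm{Cap}(x,\rho)}(A) - \frac{\varepsilon}{4}. \]
There must then exist $y_i \in \mathrm{Cap}(x, \rho)$ for which $d_{\mathrm{Cap}(y_i, \tilde{\rho}_i)}(A) > d_{\mathrm{Cap}(x,\rho)}(A) - \varepsilon/4$; fix one such $y_i$ for each $1 \le i \le N$, and let $T_{y_i \to x_i} \in \mathrm{O}(d+1)$ be any rotation taking $y_i$ to $x_i$ (and thus $\mathrm{Cap}(y_i, \tilde{\rho}_i)$ to $\mathrm{Cap}(x_i, \tilde{\rho}_i)$).

We claim that the set
\[ \tilde{A} := \bigcup_{i=1}^{N} T_{y_i \to x_i}\bigl(A \cap \mathrm{Cap}(y_i, \tilde{\rho}_i)\bigr) \]
satisfies our requirements. Indeed, we have
\[ \sigma(\tilde{A}) = \sum_{i=1}^{N} \sigma(A \cap \mathrm{Cap}(y_i, \tilde{\rho}_i)) = \sum_{i=1}^{N} d_{\mathrm{Cap}(y_i, \tilde{\rho}_i)}(A) \cdot \sigma(\mathrm{Cap}_{\tilde{\rho}_i}) > \sum_{i=1}^{N} \left( d_{\mathrm{Cap}(x,\rho)}(A) - \frac{\varepsilon}{4} \right) \left( 1 - \frac{\varepsilon}{4} \right) \sigma(\mathrm{Cap}_{\rho_i}) \ge \left( d_{\mathrm{Cap}(x,\rho)}(A) - \frac{\varepsilon}{2} \right) \sigma(\mathcal{P}) > d_{\mathrm{Cap}(x,\rho)}(A) - \varepsilon. \]
Moreover, since $\mathrm{diam}(P_j) \le t_0$ and the caps $\mathrm{Cap}(x_i, \tilde{\rho}_i)$ are (at least) $2t_0$-distant from each other, we see that any copy of $P_j$ in $\tilde{A} \subset \bigcup_{i=1}^{N} \mathrm{Cap}(x_i, \tilde{\rho}_i)$ must be entirely contained in one of the caps $\mathrm{Cap}(x_i, \tilde{\rho}_i)$. But then it should also be contained (after rotation by $T_{y_i \to x_i}^{-1}$) in $A \cap \mathrm{Cap}(y_i, \tilde{\rho}_i)$; this shows that $\tilde{A}$ does not contain copies of $P_j$ for any $1 \le j \le n$, since $A$ doesn't, and we are done.

We will now consider subsets of the sphere which avoid several different configurations. As in the Euclidean setting, it is easy to show that $m_{S^d}(P_1, \dots, P_n) \ge \prod_{i=1}^{n} m_{S^d}(P_i)$ holds without any assumptions on the configurations $P_1, \dots, P_n$. To do this we choose, for each $1 \le i \le n$, a set $A_i \subset S^d$ which avoids configuration $P_i$. By taking independent rotations $R_i A_i$ of each set $A_i$, we see that $\mathbb{E}_{R_1, \dots, R_n \in \mathrm{O}(d+1)} \left[ \sigma\left( \bigcap_{i=1}^{n} R_i A_i \right) \right]$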
= Z S d n Y i =1 E R i ∈ O( d +1) [ A i ( R − i x )] dσ ( x )= n Y i =1 σ ( A i )30here must then exist R , . . . , R n ∈ O( d + 1) for which σ n \ i =1 R i A i ! ≥ n Y i =1 σ ( A i )Since T ni =1 R i A i avoids all configurations P , . . . , P n and the sets A , . . . , A n werechosen arbitrarily, the result follows.Using supersaturation we can show that this lower bound is essentially tight whenthe configurations considered are all admissible and each one is at a different size scale.Intuitively, this happens because the constraints of avoiding each of these configura-tions will act at distinct scales and thus not correlate with each other. Theorem 7. For every admissible configurations P , . . . , P n ⊂ S d and every ε > there is a positive increasing function f : (0 , → (0 , such that the following holds.Whenever < t , . . . , t n ≤ satisfy t i +1 ≤ f ( t i ) for ≤ i < n , we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m S d ( t P , . . . , t n P n ) − n Y i =1 m S d ( t i P i ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) < ε Proof. We have already seen that m S d ( t P , . . . , t n P n ) ≥ Q ni =1 m S d ( t i P i ), so it suf-fices to show that m S d ( t P , . . . , t n P n ) ≤ Q ni =1 m S d ( t i P i ) + ε for suitably separated t , . . . , t n ≤ 1. We will do so by induction on n , with the base case n = 1 being trivial(and taking f ≡ 1, say).Suppose then n ≥ n − f : (0 , → (0 , 1] be the function promised by the theorem applied to the n − P , . . . , P n and with accuracy ε , so that whenever 0 < t ≤ < t j +1 ≤ ˜ f ( t j ) for all 2 ≤ j < n we have m S d ( t P , . . . , t n P n ) ≤ n Y j =2 m S d ( t j P j ) + ε. By the corollary to the Supersaturation Theorem (Corollary 2), for all 0 < t ≤ δ = δ ( C )0 ( ε ; t P ) > σ ( Z δ ( ε ) A ) ≥ m S d ( t P ) + ε = ⇒ A contains a copy of t P . Applying Lemma 14 with radius ρ = δ , we see there is t = t ( L )0 ( ε, δ ) > m Cap( x,δ ) ( t P , . . . , t n P n ) ≤ m S d ( t P , . . . , t n P n ) + ε whenever 0 < t , . . 
. , t n ≤ t / < t , . . . , t n ≤ t ≤ t ( L )0 ( ε, δ ( C )0 ( ε ; t P )) / t j +1 ≤ ˜ f ( t j ) for all 2 ≤ j < n. If A ⊂ S d does not contain copies of t P , . . . , t n P n , then by the preceding discussionwe must have σ ( Z δ ( ε ) A ) < m S d ( t P ) + ε and, for all x ∈ S d , d Cap( x,δ ) ( A ) ≤ m Cap( x,δ ) ( t P , . . . , t n P n ) ≤ m S d ( t P , . . . , t n P n ) + ε ≤ n Y j =2 m S d ( t j P j ) + 2 ε x ∈ S d we trivially have d Cap( x,δ ) ( A ) ≤ ε + Z δ ( ε ) A ( x ) (cid:18) sup z ∈ S d d Cap( z,δ ) ( A ) (cid:19) , by averaging over S d we conclude that σ ( A ) = E x ∈ S d [ d Cap( x,δ ) ( A )] ≤ ε + σ ( Z δ ( ε ) A ) (cid:18) sup z ∈ S d d Cap( z,δ ) ( A ) (cid:19) ≤ ε + ( m S d ( t P ) + ε ) n Y j =2 m S d ( t j P j ) + 2 ε ≤ ε + n Y i =1 m S d ( t i P i )It thus suffices to take the function f : (0 , → (0 , 1] given by f ( t ) = min ( ˜ f ( t ) , t ( L )0 ( ε/ , δ ( C )0 ( ε/ tP ))2 ) to conclude the induction.Note that this result provides a partial answer to the analogue of question (Q2)in the spherical setting: if P is admissible, then m S d ( t P, t P, . . . , t n P ) decays ex-ponentially with n as the ratios t j +1 /t j between consecutive scales go to zero (recallfrom Lemma 8 that m S d ( tP ) is bounded away from both zero and one for 0 < t ≤ Corollary 3. Let P ⊂ S d be an admissible configuration, and let A ⊂ S d be a set with σ ( A ) > . There exists a number t > such that A contains a congruent copy of tP for all t ≤ t . This corollary can be seen as the counterpart to Theorem 1 in the spherical setting,where it impossible to consider arbitrarily large dilates . One can similarly obtainan analogue of Lemma 1 in this setting, but due to the fact that the independencedensity m S d is not dilation-invariant we will also need to specify a lower bound forconsecutive terms of the dilation sequence. Corollary 4. Let P ⊂ S d be an admissible configuration. 
Let A ⊂ S d , σ ( A ) ≥ ε and < t j ≤ be a sequence satisfying ≤ t ≤ and (cid:18) t j (cid:19) ≤ t j +1 ≤ t j for j ≥ . Then there exists j ≤ J ( ε, P ) such that A contains a congruent copy of t j P .Proof. Let us denote n := (cid:24) log ( ε/ (1 − /k ) (cid:25) , The equivalent result of containing all sufficiently small dilates of a configuration in the Euclideansetting also holds with the same proof, or as an immediate consequence of Lemma 1. 32o that by (the proof of) Lemma 8 we have n Y i =1 m S d ( s i P ) ≤ (cid:18) − k (cid:19) n ≤ ε < s , . . . , s n ≤ f : (0 , → (0 , 1] be the function promised by Theorem 7 with ε substituted by ε/ n copies of P . Thus f is an increasing function and, whenever 0 < s ≤ < s j +1 ≤ f ( s j ) for 1 ≤ j < n , we have m S d ( s P, . . . , s n P ) < n Y i =1 m S d ( s i P ) + ε ≤ ε In particular, the set A from the statement must contain a copy of s i P for some1 ≤ i ≤ n . Now we use the bounds we have on the sequence ( t j ) j ≥ to show that wecan choose a finite sequence ( s i ) ni =1 satisfying these requirements from the first thefirst J ( ε, P ) terms of ( t j ) j ≥ .Let ( ℓ j ) j ≥ be the lower bound sequence defined by ℓ = 1 / and ℓ j +1 =( ℓ j / for j ≥ 1. We then have that ℓ j ≤ t j ≤ − j for all j ≥ 1. If wedefine inductively j (1) = 1 and j ( i + 1) = j ( i ) + ⌈ log ( t j ( i ) /f ( t j ( i ) )) ⌉ , we see that s i := t j ( i ) satisfies our requirements, since t j ( i +1) ≤ t j ( i ) / log ( t j ( i ) /f ( t j ( i ) )) = f ( t j ( i ) ).As j ( i + 1) < j ( i ) + log (2 − j ( i ) /f ( ℓ j ( i ) )) + 1 = 2 + log (1 /f ( ℓ j ( i ) )) is bounded bysomething depending only on ε and P , so is j ( n ) ≤ J ( ε, P ) and we are done.As a final result, and in some ways the direct opposite of Theorem 7, one canshow that forbidding very close dilates of a contractible configuration makes essen-tially no difference to its independence density. 
Indeed, by considering compact inner-approximations of $P$-avoiding sets on the sphere and proceeding in exactly the same way as we did in the proof of Lemma 7, we can prove:

Lemma 15. For any contractible configuration $P \subset S^d$, we have that $m_{S^d}(P, t_1 P, \dots, t_n P) \to m_{S^d}(P)$ as $t_1, t_2, \dots, t_n$ tend to $1$ (from below).

We will see in the next section (see Theorem 10) that the independence density $m_{S^d}(P_1, \dots, P_n)$ varies continuously on the set of $n$ admissible configurations $P_1, \dots, P_n$ on $S^d$. It follows that for all admissible $P$ the set
\[ M_n^{S^d}(P) := \{ m_{S^d}(t_1 P, t_2 P, \dots, t_n P) : 0 < t_1 < t_2 < \dots < t_n \le 1 \} \]
is an interval, which is what we needed in the Euclidean setting to provide an answer to question (Q1) for admissible configurations. However, since the independence density is no longer dilation invariant in the present setting, we cannot easily conclude from this and the previous results a simple description of the set $M_n^{S^d}(P)$.

In this section we continue our study of configuration-avoiding sets, both on the Euclidean space and on the unit sphere, presenting several results on quantities and functions naturally associated to them.

Each result will first be presented in detail in the case of the sphere $S^d$, and afterwards we will state the corresponding result for the Euclidean space $\mathbb{R}^d$ and discuss the (usually very simple) necessary changes in its proof.

4.1 Continuity properties of the counting function $I_P$

Given some $k$-point configuration $P$ on the sphere, it is sometimes important to understand how much the count of congruent copies of $P$ on a set $A \subseteq S^d$ can change if we perturb the set $A$ a little.
An instance of this problem was already considered in the Counting Lemma, where the perturbation was given by blurring and it was seen that the counting function $I_P$ is somewhat robust to small perturbations (in the case of admissible configurations).

The next lemma gives a similar result, but with the perturbation now being measured in the $L^k$ norm.

Lemma 16. For all $k$-point spherical configurations $P \subset S^d$, the function $I_P$ is continuous in the $L^k(S^d)$ norm.

Proof. Let us denote $P = \{v_1, v_2, \dots, v_k\}$, where we choose an arbitrary order for its points. Consider the multilinear operator $\Lambda_P : (L^k(S^d))^k \to \mathbb{R}$ given by
\[ \Lambda_P(f_1, f_2, \dots, f_k) = \int_{\mathrm{O}(d+1)} f_1(T v_1) f_2(T v_2) \cdots f_k(T v_k)\, d\mu(T); \]
note that $I_P(f) = \Lambda_P(f, f, \dots, f)$. Applying Hölder's inequality $k-1$ times to $\Lambda_P$, we obtain
\[ |\Lambda_P(f_1, f_2, \dots, f_k)| \le \prod_{i=1}^{k} \left( \int_{\mathrm{O}(d+1)} |f_i(T v_i)|^k\, d\mu(T) \right)^{1/k}. \]
Since $T v_i$ is uniformly distributed on $S^d$ when $T$ is distributed according to the Haar measure on $\mathrm{O}(d+1)$, it follows that the right-hand side of the inequality above is equal to $\prod_{i=1}^{k} \|f_i\|_k$.

By our usual telescoping sum trick, we then obtain
\[ |I_P(f) - I_P(g)| \le |\Lambda_P(f-g, f, \dots, f)| + \dots + |\Lambda_P(g, g, \dots, f-g)| \le \sum_{i=1}^{k} \|g\|_k^{i-1} \|f-g\|_k \|f\|_k^{k-i} \le k \left( \|f\|_k^{k-1} + \|g\|_k^{k-1} \right) \|f-g\|_k, \]
and the conclusion follows.

When $P$ is admissible, we obtain the following significantly stronger continuity property of $I_P$ when restricting to essentially bounded functions:

Lemma 17. If $P$ is an admissible configuration on $S^d$, then $I_P$ is weak$^*$ continuous on the unit ball of $L^\infty(S^d)$.

Proof. Denote the (closed) unit ball of $L^\infty(S^d)$ by $B_\infty$. Since $B_\infty$ endowed with the weak$^*$ topology is metrizable, it suffices to prove that $I_P$ is sequentially continuous (i.e. that $I_P(f_i) \to I_P(f)$ as $i \to \infty$ whenever $f_i \to f$ in the weak$^*$ topology).

Suppose then $(f_i)_{i \ge 1} \subset B_\infty$ is a sequence weak$^*$-converging to $f \in B_\infty$.
It follows that for every $x \in S^d$ and every $\delta > 0$ we have
\[ f_i * \mathrm{cap}_\delta(x) = \frac{1}{\sigma(\mathrm{Cap}_\delta)} \int_{\mathrm{Cap}(x,\delta)} f_i(y)\, d\sigma(y) \xrightarrow{\, i \to \infty \,} \frac{1}{\sigma(\mathrm{Cap}_\delta)} \int_{\mathrm{Cap}(x,\delta)} f(y)\, d\sigma(y) = f * \mathrm{cap}_\delta(x). \]
Since $f * \mathrm{cap}_\delta$ and each $f_i * \mathrm{cap}_\delta$ are Lipschitz with the same constant (depending only on $\delta$, as $\|f\|_\infty, \|f_i\|_\infty \le 1$) and $S^d$ is compact, this easily implies that $\|f_i * \mathrm{cap}_\delta - f * \mathrm{cap}_\delta\|_\infty \to 0$ as $i \to \infty$. In particular, we see that $\lim_{i \to \infty} I_P(f_i * \mathrm{cap}_\delta) = I_P(f * \mathrm{cap}_\delta)$.

Since $P$ is admissible, by the spherical Counting Lemma (Lemma 12) we have
\[ |I_P(f * \mathrm{cap}_\delta) - I_P(f)| \le c_P^{(L)}(\delta) \quad \text{and} \quad |I_P(f_i * \mathrm{cap}_\delta) - I_P(f_i)| \le c_P^{(L)}(\delta) \quad \forall i \ge 1. \]
Choosing $i_0(\delta) \ge 1$ sufficiently large so that
\[ |I_P(f_i * \mathrm{cap}_\delta) - I_P(f * \mathrm{cap}_\delta)| \le c_P^{(L)}(\delta) \quad \forall i \ge i_0(\delta), \]
we conclude that
\[ |I_P(f) - I_P(f_i)| \le |I_P(f) - I_P(f * \mathrm{cap}_\delta)| + |I_P(f * \mathrm{cap}_\delta) - I_P(f_i * \mathrm{cap}_\delta)| + |I_P(f_i * \mathrm{cap}_\delta) - I_P(f_i)| \le 3 c_P^{(L)}(\delta) \quad \forall i \ge i_0(\delta). \]
Since $\delta > 0$ is arbitrary and $c_P^{(L)}(\delta) \to 0$ as $\delta \to 0$, we conclude that $\lim_{i \to \infty} I_P(f_i) = I_P(f)$, as wished.

In the case of configurations on the Euclidean space we have the following equivalent $L^k$ continuity property, which has essentially the same proof as its spherical counterpart (but now applying Hölder's inequality to the integral over $\mathbb{R}^d$):

Lemma 18. For all $k$-point configurations $P \subset \mathbb{R}^d$, the function $I_P$ is continuous in the $L^k(\mathbb{R}^d)$ norm.

When the considered configuration $P$ is admissible, we can again obtain a stronger continuity property of the counting function $I_P$ restricted to essentially bounded functions. However, since $I_P$ is not well-defined in the entirety of the unit ball of $L^\infty(\mathbb{R}^d)$, we will restrict ourselves to bounded functions of bounded support; the proof is essentially the same in this case.

Lemma 19. If $P \subset \mathbb{R}^d$ is an admissible configuration, then for every fixed $R > 0$ the function $I_P$ is weak$^*$ continuous on the unit ball of $L^\infty(Q(0, R))$.
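For concreteness, the telescoping step used in the proofs of Lemmas 16 and 18 can be written out in the smallest nontrivial case $k = 3$. The display below is only an illustration under that assumption; it uses the multilinear operator $\Lambda_P$ from the proof of Lemma 16 and nothing beyond its multilinearity and the Hölder bound $|\Lambda_P(f_1, f_2, f_3)| \le \|f_1\|_3 \|f_2\|_3 \|f_3\|_3$.

```latex
% Telescoping identity for k = 3: replace one argument at a time.
\begin{align*}
I_P(f) - I_P(g)
  &= \Lambda_P(f, f, f) - \Lambda_P(g, g, g) \\
  &= \Lambda_P(f - g,\, f,\, f) + \Lambda_P(g,\, f - g,\, f) + \Lambda_P(g,\, g,\, f - g), \\
% Apply the Hoelder bound to each of the three terms.
|I_P(f) - I_P(g)|
  &\le \|f - g\|_3 \left( \|f\|_3^2 + \|g\|_3 \|f\|_3 + \|g\|_3^2 \right) \\
% Use ab <= (a^2 + b^2)/2 to absorb the mixed term.
  &\le 3 \left( \|f\|_3^2 + \|g\|_3^2 \right) \|f - g\|_3 .
\end{align*}
```

The final line is the case $k = 3$ of the bound $k(\|f\|_k^{k-1} + \|g\|_k^{k-1}) \|f - g\|_k$; the same computation with the integral over $\mathbb{R}^d$ gives the Euclidean version.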
When a measurable set $A$ (either on $\mathbb{R}^d$ or on $S^d$) contains no copies of some configuration $P$, it is clear from the definition that $I_P(A) = 0$; however, it is also possible for $I_P(A)$ to be zero even when $A$ contains congruent copies of $P$. In intuitive terms, the condition $I_P(A) = 0$ means only that $A$ contains a negligible fraction of all possible copies of $P$. The next result shows that this distinction is essentially irrelevant for most purposes.

Lemma 20. Suppose $P \subset S^d$ is a finite configuration and $A \subseteq S^d$ is measurable. If $I_P(A) = 0$, then we can remove a zero measure subset of $A$ in order to remove all copies of $P$.

Proof. We will first show that
\[ \lim_{\delta \to 0} |A * \nu_\delta(x) - A(x)| = 0 \quad \text{for almost every } x \in S^d. \tag{11} \]
Recall that $A * \nu_\delta(x)$ is the average density of $A$ according to a measure concentrated on $\mathrm{Cap}(x, \delta)$, so identity (11) should remind us of Lebesgue's Density Theorem. In order to be able to formally apply this theorem we will have to change spaces.

Define on $\mathrm{O}(d+1)$ the set $E := \{R \in \mathrm{O}(d+1) : Re \in A\}$, and note that
\[ A * \nu_\delta(Re) = \frac{1}{\mu(\mathrm{B}(R, \delta))} \int_{\mathrm{B}(R,\delta)} E(S)\, d\mu(S). \]
By the Lebesgue Density Theorem on $\mathrm{O}(d+1)$ we have that
\[ \lim_{\delta \to 0} \left| \frac{1}{\mu(\mathrm{B}(R, \delta))} \int_{\mathrm{B}(R,\delta)} E(S)\, d\mu(S) - E(R) \right| = 0 \quad \text{for } \mu\text{-a.e. } R \in \mathrm{O}(d+1). \]
But this means exactly that the measure of the set
\[ F := \left\{ R \in \mathrm{O}(d+1) : \lim_{\delta \to 0} |A * \nu_\delta(Re) - A(Re)| \ne 0 \right\} \]
of 'non-density points' is zero. It is clear from the definition of $F$ that it is invariant under the right-action of $\mathrm{Stab}(e)$; this implies that $\sigma(\{Re : R \in F\}) = \mu(F) = 0$, proving (11).

Now we remove from $A$ all points $x$ for which identity (11) does not hold, thus obtaining a subset $B \subseteq A$ with $\sigma(A \setminus B) = 0$ and $\lim_{\delta \to 0} B * \nu_\delta(x) = 1$ for all $x \in B$. We will show that no copy of $P$ remains on this restricted set $B$.

Suppose for contradiction that $B$ contains a copy $\{u_1, \dots, u_k\}$ of $P$.
Then there exists $\delta > 0$ such that $B * \nu_\delta(u_i) \ge 1 - 1/2k$ for all $1 \le i \le k$, which means that $\{u_1, \dots, u_k\} \subseteq V_\delta(1 - 1/2k) B$. By Lemma 9 we conclude that $I_P(B) \ge \mu(\mathrm{B}(I, \delta))/2 > 0$, contradicting $I_P(A) = 0$ and finishing the proof.

Note that this 'zero-measure removal' lemma immediately implies the following weak supersaturation result, which is then valid for all finite configurations $P \subset S^d$.

Lemma 21 (Weak supersaturation). If $\sigma(A) > m_{S^d}(P)$ then $I_P(A) > 0$.

It is interesting to note the similarity between Lemma 20 (and its proof) and the 'infinitary' Hypergraph Removal Lemma for hypergraph limits obtained by Elek and Szegedy [8]. In their case the role of the configuration $P$ is played by a hypergraph $F$, the counting function $I_P$ being then replaced by the $F$-homomorphism density function $t(F, \cdot)$ which gives the density of copies of $F$ inside a given hypergraph.

Elek and Szegedy showed that to every converging sequence $(H_i)_{i \ge 1}$ of $k$-uniform hypergraphs there is an associated measurable set $W \subseteq [0, 1]^{2^k - 1}$, and if this sequence of hypergraphs has asymptotically negligible density of $F$ as a subgraph then by removing the non-density points from $W$ one also removes all copies of the considered hypergraph $F$ from the limit object. (Here a sequence $(H_i)_{i \ge 1}$ of $k$-uniform hypergraphs is said to converge if the densities of every fixed hypergraph $F$ converge.) Using their notion of convergence (based on ultraproducts), they easily conclude from this the usual Hypergraph Removal Lemma.

Inspired by their derivation of the Hypergraph Removal Lemma from its infinitary version, let us now show how weak supersaturation and weak$^*$ continuity of the counting function $I_P$ easily provide an alternative proof of the Supersaturation Theorem.

Proof of the (spherical) Supersaturation Theorem. Suppose for contradiction that the result is false.
Then there exist $\varepsilon > 0$ and a sequence $(A_i)_{i \ge 1}$ of sets, each of density at least $m_{S^d}(P) + \varepsilon$, which satisfy $\lim_{i \to \infty} I_P(A_i) = 0$.

Since the unit ball $B_\infty$ of $L^\infty(S^d)$ is weak$^*$ compact (and also metrizable in this topology), by possibly restricting to a subsequence we may assume that $(A_i)_{i \ge 1}$ converges in the weak$^*$ topology of $L^\infty(S^d)$; let us denote its limit by $A \in B_\infty$. It is clear that $0 \le A \le 1$ almost everywhere and $\int_{S^d} A(x)\, d\sigma(x) = \lim_{i \to \infty} \sigma(A_i) \ge m_{S^d}(P) + \varepsilon$. By weak$^*$ continuity of $I_P$ we also have $I_P(A) = \lim_{i \to \infty} I_P(A_i) = 0$.

Now let $B := \{x \in S^d : A(x) \ge \varepsilon\}$. Since $\varepsilon B(x) \le A(x) < \varepsilon + B(x)$ for a.e. $x \in S^d$, we conclude that $I_P(B) \le \varepsilon^{-k} I_P(A) = 0$ and
\[ \sigma(B) > \int_{S^d} A(x)\, d\sigma(x) - \varepsilon \ge m_{S^d}(P). \]
But this set $B$ contradicts Lemma 21, finishing the proof.

The equivalent removal result for sets on the Euclidean space follows from the same argument as before, using now the usual Lebesgue Density Theorem on $\mathbb{R}^d$.

Lemma 22. Suppose $P \subset \mathbb{R}^d$ is a finite configuration and $A \subseteq \mathbb{R}^d$ is measurable. If $I_P(A) = 0$, then we can remove a zero measure subset of $A$ in order to remove all copies of $P$.

We immediately conclude that weak supersaturation holds for all finite configurations $P \subset \mathbb{R}^d$:

Lemma 23 (Weak supersaturation). If either $d_{Q(0,R)}(A) > m_{Q(0,R)}(P)$ for some $R > 0$ or $d(A) > m_{\mathbb{R}^d}(P)$, then $I_P(A) > 0$.

The Supersaturation Theorem in the Euclidean setting also follows from weak supersaturation using a similar compactness-contradiction argument to the one used for spherical sets, but we must in addition apply an averaging argument akin to the one employed in Lemma 6. We leave the details to the interested reader.

As our definition of the independence density $m_{S^d}(P)$ involved a supremum over all $P$-avoiding measurable sets $A \subseteq S^d$, it is not immediately clear whether there actually exists a measurable $P$-avoiding set attaining this extremal value of density.
In fact, such a result is false in the case where $d = 1$ and we are considering two-point configurations $\{u, v\} \subset S^1$: if the length of the arc between $u$ and $v$ is not a rational multiple of $\pi$, it was shown by DeCorte and Pikhurko [6] that $m_{S^1}(\{u, v\}) = 1/2$ but there is no $\{u, v\}$-avoiding measurable set of density $1/2$.

Theorem 8. If $P \subset S^d$ is an admissible configuration, then there is a $P$-avoiding measurable set $A \subseteq S^d$ attaining $\sigma(A) = m_{S^d}(P)$.

Proof. Let $A_1, A_2, \dots \subseteq S^d$ be a sequence of $P$-avoiding measurable sets satisfying $\lim_{i \to \infty} \sigma(A_i) = m_{S^d}(P)$. By passing to a subsequence if necessary, we may assume that $(A_i)_{i \ge 1}$ converges to some function $A \in B_\infty$ in the weak$^*$ topology of $L^\infty(S^d)$. We shall prove two things:

(i) the limit function $A$ is $\{0, 1\}$-valued almost everywhere, so we can identify it with its support $\mathrm{supp}\, A$;

(ii) after possibly modifying it on a zero measure set, this set $A$ will not contain any copy of $P$.

With these two results we will be done, since $\sigma(A) = \lim_{i \to \infty} \sigma(A_i) = m_{S^d}(P)$.

By weak$^*$ convergence we know that $0 \le A \le 1$ almost everywhere, and by the weak$^*$ continuity of $I_P$ (Lemma 17) we have $I_P(A) = \lim_{i \to \infty} I_P(A_i) = 0$. From this we easily conclude that $I_P(\mathrm{supp}\, A) = 0$, and also
\[ \sigma(\mathrm{supp}\, A) = \int_{S^d} \mathrm{supp}\, A(x)\, d\sigma(x) \ge \int_{S^d} A(x)\, d\sigma(x) = m_{S^d}(P). \tag{12} \]
But weak supersaturation (Lemma 21) implies that $\sigma(\mathrm{supp}\, A) \le m_{S^d}(P)$, which by (12) and the fact that $0 \le A \le 1$ implies that $A = \mathrm{supp}\, A$ almost everywhere. This proves (i).

Identifying $A$ with its support and using that $I_P(A) = 0$, Lemma 20 implies we can remove a zero measure subset of $A$ in order to remove all copies of $P$. This proves (ii) and finishes the proof of the theorem.
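The last deduction in step (i) can be unpacked as follows. This is only a sketch of the argument already given, with $\mathbf{1}_{\mathrm{supp}\, A}$ written explicitly for the indicator function that the text abbreviates as $\mathrm{supp}\, A$; no new assumptions are introduced.

```latex
% Inequality (12) and weak supersaturation (Lemma 21) sandwich sigma(supp A):
%   m(P) = \int A  <=  sigma(supp A)  <=  m(P).
\[
  \int_{S^d} \bigl( \mathbf{1}_{\operatorname{supp} A}(x) - A(x) \bigr)\, d\sigma(x)
  \;=\; \sigma(\operatorname{supp} A) - m_{S^d}(P) \;=\; 0 .
\]
% The integrand is nonnegative a.e. because 0 <= A <= 1 and A vanishes off its support,
% and a nonnegative function with zero integral vanishes almost everywhere.
\[
  \mathbf{1}_{\operatorname{supp} A}(x) - A(x) \;\ge\; 0 \ \text{ for a.e. } x \in S^d
  \quad\Longrightarrow\quad
  A = \mathbf{1}_{\operatorname{supp} A} \ \text{ almost everywhere.}
\]
```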
For configuration-avoiding sets on R d we also obtain an equivalent result, but the proofis made more complicated due to the fact that R d is not compact.This result can also be obtained from Bukh’s methods (see Corollary 13 in [3]) byusing the fact that avoiding admissible configurations is a supersaturable property (asdiscussed in Section 2.1 of the present paper). One of the main steps in the proof wewill present here is inspired by Bukh’s arguments. Theorem 9. If P ⊂ R d is an admissible configuration, there exists a P -avoiding set A ⊆ R d with well-defined density attaining d ( A ) = m R d ( P ) .Proof. For each integer i ≥ 1, let A i ⊆ Q (0 , i ) be a P -avoiding set with density d Q (0 ,i ) ( A i ) ≥ m Q (0 ,i ) ( P ) − − i . By restricting to a subsequence if necessary, we mayassume that ( A i ) i ≥ converges to some function e A ∈ B ∞ in the weak ∗ topology of L ∞ ( R d ) (where B ∞ now denotes the unit ball of L ∞ ( R d )).For better clarity, we shall denote the indicator function of the cube Q (0 , R ) by χ R (for any fixed R > χ R A i ) i ≥ con-verges to χ R e A in the weak ∗ topology of L ∞ ( Q (0 , R )). By Lemma 11 we concludethat I P ( χ R e A ) = 0, which (by the same argument as in the spherical setting) implies I P (supp χ R e A ) = 0. Since R > I P (supp e A ) = 0. For instance we can proceed as in the second proof of the Supersaturation Theorem, approximating I P (supp A ) by I P ( B ε ) where B ε := { x ∈ S d : A ( x ) ≥ ε } , and noting that I P ( B ε ) ≤ ε − k I P ( A ) = 0for all ε > A := supp e A , we will now prove that A has density d ( A ) = m R d ( P ).Since I P ( A ) = 0, by Lemma 22 we can then remove a zero-measure subset of A inorder to remove all copies of P and conclude the proof.By weak supersaturation, it suffices to show that lim inf R →∞ d Q (0 ,R ) ( A ) ≥ m R d ( P ).Fix some arbitrary ε > R ≥ R + 2diam P ) d < (1 + ε/ R d . 
For any given R ≥ R , take a P -avoiding set B R ⊆ Q (0 , R ) with d Q (0 ,R ) ( B R ) > m Q (0 ,R ) ( P ) − ε/ ≥ m R d ( P ) − ε/ i ≥ R define A ′ i := B R ∪ ( A i \ Q (0 , R + 2diam P )); note that A ′ i avoids P and m ( A ′ i ) = m ( A i ) − m ( A i ∩ ( Q (0 , R + 2diam P ) \ Q (0 , R ))) − m ( A i ∩ Q (0 , R )) + m ( B R ) ≥ m ( A i ) − (( R + 2diam P ) d − R d ) + m ( B R ) − m ( A ∩ Q (0 , R ))+ m ( A ∩ Q (0 , R )) − m ( A i ∩ Q (0 , R )) ≥ (cid:0) m Q (0 ,i ) ( P ) − − i (cid:1) i d − εR d (cid:0) d Q (0 ,R ) ( B R ) − d Q (0 ,R ) ( A ) (cid:1) R d + Z Q (0 ,R ) ( A ( x ) − A i ( x )) dx ≥ (cid:0) m Q (0 ,i ) ( P ) − − i (cid:1) i d − εR d (cid:0) m R d ( P ) − d Q (0 ,R ) ( A ) (cid:1) R d + Z Q (0 ,R ) (cid:16) e A ( x ) − A i ( x ) (cid:17) dx Since m ( A ′ i ) ≤ m Q (0 ,i ) ( P ) i d for all i ≥ R and R Q (0 ,R ) ( e A ( x ) − A i ( x )) dx > − ε for allsufficiently large i , we conclude that for large enough i we have d Q (0 ,R ) ( A ) > m R d ( P ) − i d i R d − ε − εR d > m R d ( P ) − ε, as wished. We will now prove that the independence density function P m S d ( P ) is continuouson the set of admissible configurations of S d . Recall that the equivalent result in theEuclidean setting was used in Theorem 4 to provide a (partial) answer to question(Q1) from the Introduction.Before doing that, it is interesting to note that a similar result does not hold for two-point configurations on the unit circle S (which can be seen as the very first instanceof non-admissible configurations). Indeed, it was shown in [6] that m S ( { u, v } ) is discontinuous at a configuration { u, v } ⊂ S whenever the arc length between u and v is a rational multiple of 2 π with odd denominator.We will first need an equicontinuity property for the family of counting functions P I P ( A ), over all measurable sets A ⊆ S d . In what follows we shall write B( P, r ) ⊂ ( R d +1 ) k for the ball of radius r centered on P = { v , . . . 
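, $v_k\}$, where the distance from $P$ to $Q = \{u_1, \dots, u_k\}$ is given by
\[ \|Q - P\|_\infty := \max \{ \|u_i - v_i\| : 1 \le i \le k \}. \]
(Strictly speaking we would need this set $P$ to be ordered, but since the specific order chosen for its points makes no difference in the argument we will ignore this detail.)

Lemma 24. For every admissible $P \subset S^d$ and every $\varepsilon > 0$ there is $\delta > 0$ such that
\[ \|Q - P\|_\infty \le \delta \implies |I_Q(A) - I_P(A)| \le \varepsilon \quad \forall A \subseteq S^d. \]

Proof. Recall from Section 3.1 that, in a small neighborhood of any fixed admissible configuration $P \subset S^d$, the bound we obtain in the Counting Lemma goes to zero uniformly as the radius of the averaging spherical cap goes to zero. In other words, there is $r > 0$ and a function $c'_P : (0, 1] \to (0, 1]$ with $\lim_{t \to 0} c'_P(t) = 0$ such that $|I_Q(A) - I_Q(A * \mathrm{cap}_\rho)| \le c'_P(\rho)$ for all $Q \in \mathrm{B}(P, r)$ and all $A \subseteq S^d$.

Now, for a given $\rho > 0$ and $0 < \delta < \rho$, we see from the triangle inequality that
\[ \|x - y\| \le \delta \implies \mathrm{Cap}(x, \rho - \delta) \subset \mathrm{Cap}(x, \rho) \cap \mathrm{Cap}(y, \rho) \implies \sigma(\mathrm{Cap}(x, \rho) \setminus \mathrm{Cap}(y, \rho)) \le \sigma(\mathrm{Cap}_\rho) - \sigma(\mathrm{Cap}_{\rho - \delta}). \]
This implies that, for any $A \subseteq S^d$ and any $x, y \in S^d$ with $\|x - y\| \le \delta$, we have
\[ |A * \mathrm{cap}_\rho(x) - A * \mathrm{cap}_\rho(y)| = \frac{|\sigma(A \cap \mathrm{Cap}(x, \rho)) - \sigma(A \cap \mathrm{Cap}(y, \rho))|}{\sigma(\mathrm{Cap}_\rho)} \le \frac{\sigma(\mathrm{Cap}(x, \rho) \setminus \mathrm{Cap}(y, \rho))}{\sigma(\mathrm{Cap}_\rho)} \le \frac{\sigma(\mathrm{Cap}_\rho) - \sigma(\mathrm{Cap}_{\rho - \delta})}{\sigma(\mathrm{Cap}_\rho)}. \]
By the usual telescoping sum argument we conclude that, whenever $\|Q - P\|_\infty \le \delta$, we have
\[ |I_Q(A * \mathrm{cap}_\rho) - I_P(A * \mathrm{cap}_\rho)| \le k\, \frac{\sigma(\mathrm{Cap}_\rho) - \sigma(\mathrm{Cap}_{\rho - \delta})}{\sigma(\mathrm{Cap}_\rho)}. \]
Take $\rho > 0$ small enough so that $c'_P(\rho) \le \varepsilon/3$, and for this value of $\rho$ take $0 < \delta < r$ small enough so that $\sigma(\mathrm{Cap}_{\rho - \delta}) \ge (1 - \varepsilon/3k)\, \sigma(\mathrm{Cap}_\rho)$. Then for all $Q \in \mathrm{B}(P, \delta)$ and any set $A \subseteq S^d$ we have
\[ |I_Q(A) - I_P(A)| \le |I_Q(A) - I_Q(A * \mathrm{cap}_\rho)| + |I_Q(A * \mathrm{cap}_\rho) - I_P(A * \mathrm{cap}_\rho)| + |I_P(A * \mathrm{cap}_\rho) - I_P(A)| \le c'_P(\rho) + k\, \frac{\sigma(\mathrm{Cap}_\rho) - \sigma(\mathrm{Cap}_{\rho - \delta})}{\sigma(\mathrm{Cap}_\rho)} + c'_P(\rho) \le \frac{\varepsilon}{3} + k \cdot \frac{\varepsilon}{3k} + \frac{\varepsilon}{3} = \varepsilon, \]
as wished.

Theorem 10.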
The independence density function $P \mapsto m_{S^d}(P)$ is continuous on the set of admissible configurations on $S^d$. The same is true for the multivariable function $(P_1, \dots, P_n) \mapsto m_{S^d}(P_1, \dots, P_n)$ on the set of $n$ admissible configurations.

Proof. For the sake of better readability, we will prove the result in the case of only one forbidden configuration. The $n$-variable version easily follows from the same argument, fixing $n - 1$ of the configurations.

We use the supersaturation property in a form that holds uniformly near a fixed admissible configuration $P \subset S^d$: there is $r > 0$ and a function $c'_P$ with values in $(0, 1]$ such that, whenever $\|Q - P\|_\infty \le r$ and $A \subseteq S^d$ is a set with $\sigma(A) > m_{S^d}(Q) + \varepsilon$, we have that $I_Q(A) > c'_P(\varepsilon) > 0$.

Fix some $\varepsilon > 0$. By the previous lemma there is $0 < \delta < r$ such that, whenever $\|Q - P\|_\infty \le \delta$, we have $|I_Q(A) - I_P(A)| \le c'_P(\varepsilon)$ for all $A \subseteq S^d$. Now let us fix a configuration $Q \in \mathrm{B}(P, \delta)$ and let $A, \tilde{A} \subseteq S^d$ be sets satisfying:
- $A$ is $P$-avoiding and $\sigma(A) = m_{S^d}(P)$;
- $\tilde{A}$ is $Q$-avoiding and $\sigma(\tilde{A}) = m_{S^d}(Q)$.
We then have:
\[
\begin{aligned}
I_Q(A) &\le I_P(A) + c'_P(\varepsilon) = c'_P(\varepsilon) \implies \sigma(A) \le m_{S^d}(Q) + \varepsilon, \\
I_P(\tilde{A}) &\le I_Q(\tilde{A}) + c'_P(\varepsilon) = c'_P(\varepsilon) \implies \sigma(\tilde{A}) \le m_{S^d}(P) + \varepsilon.
\end{aligned}
\]
Since $\sigma(A) = m_{S^d}(P)$ and $\sigma(\tilde{A}) = m_{S^d}(Q)$, these two inequalities together imply that $|m_{S^d}(Q) - m_{S^d}(P)| \le \varepsilon$, finishing the proof.

The corresponding result for sets on $\mathbb{R}^d$ can be proven in much the same way: we first show that the equicontinuity lemma holds for bounded sets $A \subseteq Q(0, R)$ with error $\varepsilon R^d$, valid uniformly over all $R > 0$, and then repeat our arguments from the proof of Theorem 10 for the restriction of optimal configuration-avoiding sets on $\mathbb{R}^d$ to increasingly larger cubes $Q(0, R)$. We obtain:

Theorem 11. For every $n \ge 1$, the function $(P_1, \dots, P_n) \mapsto m_{\mathbb{R}^d}(P_1, \dots, P_n)$ is continuous on the set of $n$ admissible configurations in $\mathbb{R}^d$.

Our results leave open the question of what happens when the configurations we forbid are not admissible.
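To make the admissibility condition concrete, the following is a minimal Python sketch. It assumes that a configuration in $\mathbb{R}^d$ is admissible when it has at most $d$ points which are affinely independent, and that a configuration on $S^d \subset \mathbb{R}^{d+1}$ is admissible when it has at most $d$ points which are linearly independent. The function names and the choice of exact rational arithmetic are illustrative assumptions and do not come from the paper; the spherical check does not verify that the points have unit norm.

```python
from fractions import Fraction
from typing import List, Sequence


def rank(rows: Sequence[Sequence]) -> int:
    """Rank of a matrix, by exact Gaussian elimination over the rationals."""
    m: List[List[Fraction]] = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_admissible_euclidean(points: Sequence[Sequence], d: int) -> bool:
    """Assumed criterion: between 1 and d points of R^d, affinely independent."""
    k = len(points)
    if not 1 <= k <= d:
        return False
    # Affine independence of the points is linear independence of the
    # differences p - points[0].
    diffs = [[a - b for a, b in zip(p, points[0])] for p in points[1:]]
    return rank(diffs) == k - 1


def is_admissible_spherical(points: Sequence[Sequence], d: int) -> bool:
    """Assumed criterion: between 1 and d points of S^d, linearly independent."""
    k = len(points)
    if not 1 <= k <= d:
        return False
    return rank(points) == k
```

For instance, under these assumptions a triangle is admissible in $\mathbb{R}^3$ but not in $\mathbb{R}^2$ (it has $d + 1$ points there), while three collinear points in $\mathbb{R}^3$ and a pair of antipodal points on $S^2$ fail because they are degenerate.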
There are two different reasons for a given configuration (either on the space or on the sphere) to not be admissible, so let us examine them separately.

The first reason is that $P$ is degenerate, meaning that its points are affinely dependent if we are on $\mathbb{R}^d$ or linearly dependent if we are on $S^d$. In this case, Bourgain [2] showed an example of sets $A_d \subset \mathbb{R}^d$ (for each $d \ge 2$) which have positive density but which avoid arbitrarily large dilates of the degenerate three-point configuration $\{-1, 0, 1\}$. These sets then show that the conclusion of Theorem 1 (and thus also the conclusion of Theorem 3) is false for this degenerate configuration. This was later generalized by Graham [11], who showed that a result like Theorem 1 can only hold if $P$ is contained on the surface of some sphere of finite radius (as is always the case when $P$ is non-degenerate).

The non-degeneracy hypothesis is thus necessary both in Bourgain's result and in our Theorem 3. (Footnote: We believe the same is true for their spherical analogues, namely Theorem 7 and Corollary 3, though we do not know of a counterexample.) It is interesting to note, however, that a more recent result of Ziegler [17, 18] (generalizing a result of Furstenberg, Katznelson and Weiss [10] for three-point configurations) shows that every set $A \subseteq \mathbb{R}^d$ of positive upper density is arbitrarily close to containing all large enough dilates of any finite configuration $P \subset \mathbb{R}^d$. More precisely, denoting by $A_\delta$ the set of all points at distance at most $\delta$ from the set $A$, Ziegler proved the following:

Theorem 12. Let $A \subseteq \mathbb{R}^d$ be a set of positive upper density and $P \subset \mathbb{R}^d$ be a finite set. Then there exists $l_0 > 0$ such that for any $l > l_0$ and any $\delta > 0$ the set $A_\delta$ contains a configuration congruent to $lP$.

The second reason for a configuration $P$ on $\mathbb{R}^d$ or $S^d$ to be non-admissible is that it contains $d + 1$ points (if it has more than $d + 1$ points then it is obviously degenerate).
In this case we cannot apply the same strategy we used to prove the Counting Lemmas, and it is not clear whether they or the analogues of Theorem 1 are true. We conjecture that they are whenever $d \ge 2$, so that we can remove the cardinality condition from the statement of Bourgain's result and of our Theorems 3 and 7.

In particular let us make more explicit the simplest case of this conjecture, which is an obvious question left open since the results of Bourgain and of Furstenberg, Katznelson and Weiss:

Conjecture 1. Let $A \subset \mathbb{R}^2$ be a set of positive upper density and let $u, v, w \in \mathbb{R}^2$ be non-collinear points. Then there exists $l_0 > 0$ such that for any $l > l_0$ the set $A$ contains a configuration congruent to $\{lu, lv, lw\}$.

Another question we ask is related to a suspected 'compatibility condition' between the Euclidean and spherical settings. Since $S^d$ resembles $\mathbb{R}^d$ at small scales, it seems geometrically intuitive that $m_{S^d}(tP)$ should get increasingly close to $m_{\mathbb{R}^d}(P)$ as $t \to 0$, whenever $P$ is a contractible configuration on $S^d$. (It is easy to show that a configuration $P \subset S^d$ is contractible if and only if it is contained in a $d$-dimensional affine subspace, so we can embed it in $\mathbb{R}^d$.) We ask whether this intuition is indeed correct, i.e. is it true that $\lim_{t \to 0} m_{S^d}(tP) = m_{\mathbb{R}^d}(P)$ for all contractible configurations $P$?

Note that the analogous property holds (and is not hard to prove) when we are forbidding a 'thick' family of configurations, like all two-point configurations which span a distance of at most $\gamma > 0$. On the Euclidean space this gives rise to (the centers of) a sphere packing of radius $\gamma/2$, while on the sphere it gives rise to a cap packing of a similar radius.
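The statement that $S^d$ resembles $\mathbb{R}^d$ at small scales can be illustrated numerically. The sketch below is only an illustration under stated assumptions: caps are taken to have Euclidean (chordal) radius, the sphere carries the non-normalized surface measure, and the cap area is computed with the standard formula $|S^{d-1}| \int_0^\theta \sin^{d-1}\varphi \, d\varphi$ for a cap of angular radius $\theta$, integrated by a simple midpoint rule. The function names are hypothetical.

```python
import math


def sphere_area(n: int) -> float:
    """Surface area of the unit sphere S^n in R^(n+1)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


def cap_area(d: int, chord: float, steps: int = 20000) -> float:
    """Non-normalized area of a cap on S^d with chordal radius `chord`."""
    theta = 2 * math.asin(chord / 2)  # angular radius of the cap
    h = theta / steps
    # Midpoint rule for the integral of sin(phi)^(d-1) over [0, theta].
    integral = h * sum(math.sin((j + 0.5) * h) ** (d - 1) for j in range(steps))
    return sphere_area(d - 1) * integral


def ball_volume(d: int, r: float) -> float:
    """Lebesgue measure of a ball of radius r in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d


def cap_to_ball_ratio(d: int, r: float) -> float:
    """Ratio between the cap area on S^d and the ball volume in R^d."""
    return cap_area(d, r) / ball_volume(d, r)
```

Evaluating `cap_to_ball_ratio(3, r)` for decreasing values of `r` gives ratios approaching 1, which is the kind of small-scale agreement behind the packing comparison above; for $d = 2$ the ratio equals 1 up to numerical error, by the classical formula for the area of a spherical cap.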
While these packings are discrete collections of points and have zero measure, the density of these points on optimal configurations multiplied by the normalizing factor $\gamma^d$ will converge to the same number in both settings as $\gamma \to 0$. (Footnote: Assuming we use the (non-normalized) Lebesgue surface measure on the sphere when computing the density of points on $S^d$, instead of its normalized version $\sigma$ we have used throughout the paper.)

Next, consider the geometrical hypergraph whose vertex set is either $\mathbb{R}^d$ or the unit sphere $S^d$, and whose edges are all congruent copies of a given configuration $P$. Note that the independence density of $P$ is just the density of the largest independent set in the corresponding hypergraph.

Let us say that a hypergraph $H = (V, E)$ is strongly removable if the analogue of the Hypergraph Removal Lemma holds for its edges: whenever some subset $A$ of the vertices contains few (i.e. a negligible fraction of $|E|$) edges, we can remove few (a negligible fraction of $|V|$) vertices from $A$ in order to remove all edges from this set. (Footnote: For our $P$-encoding hypergraph on $S^d$, the property would more formally read: whenever $A \subseteq S^d$ satisfies $I_P(A) \le \varepsilon$, there is a subset $E \subset A$ of measure $\sigma(E) \le o_{\varepsilon \to 0}(1)$ such that $A \setminus E$ contains no copies of $P$, where $o_{\varepsilon \to 0}(1)$ denotes a quantity that goes to zero as $\varepsilon \to 0$.) This property immediately implies supersaturation of the considered hypergraph.

Note that the removal lemma of a given $k$-uniform hypergraph $F$ for host hypergraphs with $N$ vertices is equivalent to strong removability of the $F$-encoding hypergraph on $[N] := \{1, 2, \dots, N\}$, i.e. the one whose vertices are the $k$-element subsets of $[N]$ and whose edges are all collections of these subsets which form a hypergraph isomorphic to $F$ on $[N]$. This shows that subgraph-encoding hypergraphs are always strongly removable. The supersaturation property of subgraph-encoding hypergraphs is however much easier to prove than their strong removability, and the latter easily follows from the 'zero-measure removal lemma' of hypergraph limits by Elek and Szegedy [8] which we already discussed in Section 4.2.

Since Lemmas 20 and 22 are the equivalents of this last result in our settings, and since the edge-counting function $I_P$ is (weak$^*$) continuous whenever $P$ is admissible, by analogy we believe that the geometrical hypergraphs encoding copies of a given admissible configuration $P$ are also strongly removable. Unfortunately, the notion of convergence used by Elek and Szegedy (based on ultraproducts) is very different from the one we use (weak$^*$ convergence), so we cannot simply transfer their proof to our setting; we therefore leave this as an open question.

Finally, it would be very interesting to have a way of obtaining good upper and lower bounds for the independence densities of a given configuration. There are several papers (see [1, 5] and the references therein) which consider this question in the case of two-point configurations, drawing on powerful methods from the theory of conic optimization and representation theory, and it is already quite challenging in this simplest case. We believe that the study of the independence density for higher-order configurations in the optimization setting is also worthwhile, since these serve as model problems for symmetric optimization problems depending on higher-order relations, and the new methods developed for them might prove very fruitful.

Acknowledgements

The author would like to thank Fernando de Oliveira Filho, Lucas Slot and Frank Vallentin for many helpful discussions. We also thank F.
de Oliveira Filho and F. Vallentin for several suggestions which improved the presentation of this paper.

This work is supported by the European Union's EU Framework Programme for Research and Innovation Horizon 2020 under the Marie Skłodowska-Curie Actions Grant Agreement No 764759 (MINOA).

References

[1] C. Bachoc, A. Passuello, and A. Thiery, The density of sets avoiding distance 1 in Euclidean space, Discrete Comput. Geom., 53 (2015), pp. 783–808.
[2] J. Bourgain, A Szemerédi type theorem for sets of positive density in $\mathbb{R}^k$, Israel J. Math., 54 (1986), pp. 307–316.
[3] B. Bukh, Measurable sets with excluded distances, Geom. Funct. Anal., 18 (2008), pp. 668–697.
[4] F. Dai and Y. Xu, Approximation theory and harmonic analysis on spheres and balls, Springer Monographs in Mathematics, Springer, New York, 2013.
[5] E. DeCorte, F. M. de Oliveira Filho, and F. Vallentin, Complete positivity and distance-avoiding sets, Mathematical Programming, (2020).
[6] E. DeCorte and O. Pikhurko, Spherical sets avoiding a prescribed set of angles, Int. Math. Res. Not. IMRN, (2016), pp. 6095–6117.
[7] C. F. Dunkl, Operators and harmonic analysis on the sphere, Trans. Amer. Math. Soc., 125 (1966), pp. 250–263.
[8] G. Elek and B. Szegedy, A measure-theoretic approach to the theory of dense hypergraphs, Adv. Math., 231 (2012), pp. 1731–1772.
[9] P. Erdős and M. Simonovits, Supersaturated graphs and hypergraphs, Combinatorica, 3 (1983), pp. 181–192.
[10] H. Furstenberg, Y. Katznelson, and B. Weiss, Ergodic theory and configurations in sets of positive density, in Mathematics of Ramsey theory, vol. 5 of Algorithms Combin., Springer, Berlin, 1990, pp. 184–198.
[11] R. L. Graham, Recent trends in Euclidean Ramsey theory, Discrete Math., 136 (1994), pp. 119–127.
[12] V. Rödl, B. Nagle, J. Skokan, M. Schacht, and Y. Kohayakawa, The hypergraph regularity method and its applications, Proc. Natl. Acad. Sci. USA, 102 (2005), pp. 8109–8113.
[13] V. Rödl and M. Schacht, Regularity lemmas for graphs, in Fete of combinatorics and computer science, vol. 20 of Bolyai Soc. Math. Stud., János Bolyai Math. Soc., Budapest, 2010, pp. 287–325.
[14] L. A. Székely, Measurable chromatic number of geometric graphs and sets without some distances in Euclidean space, Combinatorica, 4 (1984), pp. 213–218.
[15] L. A. Székely, Erdős on unit distances and the Szemerédi-Trotter theorems, in Paul Erdős and his mathematics, II (Budapest, 1999), vol. 11 of Bolyai Soc. Math. Stud., János Bolyai Math. Soc., Budapest, 2002, pp. 649–666.
[16] T. Tao, Exploring the toolkit of Jean Bourgain, Bull. Amer. Math. Soc. (N.S.), electronically published on January 27, 2021, DOI: https://doi.org/10.1090/bull/1716 (to appear in print).
[17] T. Ziegler, An application of ergodic theory to a problem in geometric Ramsey theory, Israel J. Math., 114 (1999), pp. 271–288.
[18] T. Ziegler, Nilfactors of $\mathbb{R}^m$-actions and configurations in sets of positive upper density in $\mathbb{R}^m$