Concentration estimates for functions of finite high-dimensional random arrays
PANDELIS DODOS, KONSTANTINOS TYROS AND PETROS VALETTAS
Abstract.
Let $X$ be a $d$-dimensional random array on $[n]$ whose entries take values in a finite set $\mathcal{X}$, that is, $X = \langle X_s : s \in \binom{[n]}{d}\rangle$ is an $\mathcal{X}$-valued stochastic process indexed by the set $\binom{[n]}{d}$ of all $d$-element subsets of $[n] := \{1, \dots, n\}$. We give sufficient, and easily checked, conditions on the random array $X$ which ensure that for every function $f \colon \mathcal{X}^{\binom{[n]}{d}} \to \mathbb{R}$ which satisfies $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$ for some $p > 1$, the random variable $f(X)$ becomes concentrated after conditioning it on a large subarray of $X$; these conditions cover wide classes of random arrays with not necessarily independent entries. Examples are also given which show the optimality of various aspects of the results.

The proof is based on analytic and probabilistic tools, in particular estimates for martingale difference sequences in $L_p$ spaces, and reduces the conditional concentration of the random variable $f(X)$ to (an approximate form of) the dissociativity of the random array $X$. The latter is then obtained using combinatorial arguments.

Contents
1. Introduction
2. Precise statement of the main results
3. From dissociativity to concentration: proof of Theorem 2.2
4. The box independence condition propagates
5. Proof of Theorem 4.2
6. Proofs of Theorems 1.5 and 2.3
7. Examples
8. Comments and extensions
References
Mathematics Subject Classification: 60E15, 60G09, 60G42.
Key words: concentration inequalities, exchangeable random arrays, spreadable random arrays, martingale difference sequences.

P.V. is supported by Simons Foundation grant 638224.

1. Introduction
1.1. Motivation/Overview.
The concentration of measure refers to the powerful phenomenon asserting that a function which depends smoothly on its variables is essentially constant, as long as the number of the variables is large enough. There are various ways to quantify this "smooth dependence" (e.g., Lipschitz conditions, bounds for the $L_2$ norm of the gradient, etc.). Detailed expositions can be found in [Le01, BLM13].

It is easy to see that this phenomenon is no longer valid if we drop the smoothness assumption. Nevertheless, one can still obtain some form of concentration under a much milder integrability condition (see [DKT16, Theorem 1$'$]).

Theorem. For every $p > 1$ and every $0 < \varepsilon \leq 1$, there exists a constant $c > 0$ with the following property. If $n \geq 1/c$ is an integer, $X = (X_1, \dots, X_n)$ is a random vector with independent entries which take values in a measurable space $\mathcal{X}$, and $f \colon \mathcal{X}^n \to \mathbb{R}$ is a measurable function with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$, then there exists an interval $I$ of $[n]$ with $|I| \geq cn$ such that

(1.1) $\mathbb{P}\big(|\mathbb{E}[f(X) \mid \mathcal{F}_I]| \leq \varepsilon\big) \geq 1 - \varepsilon$

where $\mathbb{E}[f(X) \mid \mathcal{F}_I]$ denotes the conditional expectation of $f(X)$ with respect to the $\sigma$-algebra $\mathcal{F}_I := \sigma(\{X_i : i \in I\})$.

(Here, and in what follows, $[n]$ denotes the discrete interval $\{1, \dots, n\}$.) Roughly speaking, this result asserts that if a function of several variables is sufficiently integrable, then, by integrating out some coordinates, it becomes essentially constant. It was motivated by, and has found several applications in, problems in combinatorics (see [DK16]).
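The shape of the estimate (1.1) is easy to see in the simplest smooth example. The following sketch is ours, not from the paper; the function $f$, the Rademacher entries, and all numerical parameters are chosen purely for illustration.

```python
import random

# A toy illustration of the theorem above.  Take X_1, ..., X_n i.i.d.
# Rademacher signs and f(x) = (x_1 + ... + x_n) / sqrt(n), so that
# E[f(X)] = 0 and ||f(X)||_{L_2} = 1.  Conditioning on F_I integrates out
# the coordinates outside I, so E[f(X) | F_I] = sum_{i in I} X_i / sqrt(n),
# which is small with high probability once |I| is a small fraction of n.
def conditional_expectation_tail(n, interval_len, eps, trials=20000, seed=0):
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        signs = [rng.choice((-1, 1)) for _ in range(interval_len)]
        if abs(sum(signs)) / n ** 0.5 > eps:
            bad += 1
    return bad / trials  # Monte Carlo estimate of P(|E[f(X) | F_I]| > eps)

# With n = 400 and |I| = 16 = n/25, the tail probability falls below eps:
print(conditional_expectation_tail(400, 16, 0.25))
```

The point of the simulation is only that the conditioned random variable lives on the scale $\sqrt{|I|}/\sqrt{n}$, which is exactly the mechanism behind (1.1) in the smooth case.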
1.2. The problem. The main goal of this paper is to extend the concentration estimate (1.1) to functions of random vectors with not necessarily independent entries. Of course, to this end some structural property of $X$ is necessary. We will focus on high-dimensional random arrays whose distribution is invariant under certain symmetries. Besides its analytic and probabilistic interest, the initial motivation to study functions of symmetric random arrays was related to an important combinatorial conjecture of Bergelson [Ber96].

(The question whether there is such an extension was explicitly asked by an anonymous reviewer of [DKT16], as well as by several colleagues in personal communication. We plan to present applications in combinatorics in a separate paper.)

To proceed with our discussion, it is useful at this point to recall the definition of a random array.

Definition 1.1 (Random arrays, and their subarrays/sub-$\sigma$-algebras). Let $d$ be a positive integer, and let $I$ be a set with $|I| \geq d$. A $d$-dimensional random array on $I$ is a stochastic process $X = \langle X_s : s \in \binom{I}{d}\rangle$ indexed by the set $\binom{I}{d}$ of all $d$-element subsets of $I$. If $J$ is a subset of $I$ with $|J| \geq d$, then the subarray of $X$ determined by $J$ is the $d$-dimensional random array $X_J := \langle X_s : s \in \binom{J}{d}\rangle$; moreover, by $\mathcal{F}_J$ we shall denote the $\sigma$-algebra $\sigma(\{X_s : s \in \binom{J}{d}\})$ generated by $X_J$.

Of course, one-dimensional random arrays are just random vectors. On the other hand, two-dimensional random arrays are essentially the same as random symmetric matrices, and their subarrays correspond to principal submatrices. (More generally, higher-dimensional random arrays correspond to random symmetric tensors.) We employ the terminology of random arrays, however, since we are not using linear-algebraic tools.

We are now ready to state the problem addressed in this paper.
Problem 1.2.
Let $n \geq d$ be positive integers, let $X$ be a $d$-dimensional random array on $[n]$ whose entries take values in a measurable space $\mathcal{X}$, let $f \colon \mathcal{X}^{\binom{[n]}{d}} \to \mathbb{R}$ be a measurable function, and assume that $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$ for some $p > 1$. Under what condition on $X$ can we find a large subset $I$ of $[n]$ such that the random variable $\mathbb{E}[f(X) \mid \mathcal{F}_I]$ is concentrated around its mean?

Two comments are in order here. Firstly, the condition we are referring to in Problem 1.2 should be fairly concrete, in the sense that even its negation provides useful information on the random array $X$. In fact, our main result can also be stated as a dichotomy in the spirit of Tao's "structure versus randomness" [Tao08]: either every sufficiently integrable random variable of the form $f(X)$ is conditionally concentrated, or the correlations of $X$ behave in a very specific way.

Secondly, note that we demand that the random variable $f(X)$ becomes concentrated after conditioning it on a subarray of $X$. This is a fairly natural requirement in this context, since subarrays preserve the higher-dimensional structure of arrays; it is also essential for combinatorial applications.

1.3. Our main result provides an affirmative, and essentially optimal, answer to Problem 1.2 for all finite-valued high-dimensional random arrays which are approximately spreadable.

1.3.1. Spreadable and approximately spreadable random arrays.
Let $d$ be a positive integer, and recall that a $d$-dimensional random array $X$ on a (possibly infinite) set $I$ is called spreadable if for every pair $J, K$ of finite subsets of $I$ with $|J| = |K| \geq d$, the subarrays $X_J$ and $X_K$ have the same distribution. Infinite, spreadable, two-dimensional random arrays have been studied by Fremlin and Talagrand [FT85], and, in greater generality, by Kallenberg [Kal92]. Spreadability is placed in the general context of exchangeability, a classical topic in probability which goes back to the work of de Finetti; see [Au08, Kal05] for an exposition of this theory and its applications.

(We point out that "spreadable" is not standard terminology. In particular, in [FT85] spreadable random arrays are referred to as deletion invariant, while in [Kal05] they are called contractable. Recall also that a $d$-dimensional random array $X = \langle X_s : s \in \binom{I}{d}\rangle$ on a (possibly infinite) set $I$ is called exchangeable if for every finite permutation $\pi$ of $I$, the random arrays $X$ and $X_\pi := \langle X_{\pi(s)} : s \in \binom{I}{d}\rangle$ have the same distribution; some authors refer to this property as joint exchangeability. For infinite sequences of random variables exchangeability coincides with spreadability (see [Kal05]), but it is a stronger notion for higher-dimensional random arrays.)

However, as already noted, in this paper we will deal with high-dimensional random arrays which satisfy the following approximate form of spreadability.
Definition 1.3 (Approximate spreadability). Let $X$ be a $d$-dimensional random array on a (possibly infinite) set $I$, and let $\eta \geq 0$. We say that $X$ is $\eta$-spreadable (or approximately spreadable if $\eta$ is understood), provided that for every pair $J, K$ of finite subsets of $I$ with $|J| = |K| \geq d$ we have

(1.2) $\rho_{\mathrm{TV}}(\mathbb{P}_J, \mathbb{P}_K) \leq \eta$

where $\mathbb{P}_J$ and $\mathbb{P}_K$ denote the laws of the random subarrays $X_J$ and $X_K$ respectively, and $\rho_{\mathrm{TV}}$ stands for the total variation distance.

The following proposition justifies Definition 1.3, and shows that approximately spreadable random arrays are the building blocks of arbitrary finite-valued, high-dimensional random arrays. The proof follows by a standard application of Ramsey's theorem [Ra30], taking into account the fact that the space of all probability measures on a finite set equipped with the total variation distance is compact.
Proposition 1.4.
For every triple $m, n, d$ of positive integers with $n \geq d$, and every $\eta > 0$, there exists an integer $N \geq n$ with the following property. If $\mathcal{X}$ is a set with $|\mathcal{X}| = m$ and $X$ is an $\mathcal{X}$-valued, $d$-dimensional random array on a set $I$ with $|I| \geq N$, then there exists a subset $J$ of $I$ with $|J| = n$ such that the random array $X_J$ is $\eta$-spreadable.
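For concreteness, the quantity in Definition 1.3 can be computed exactly for small finite-valued random vectors (the one-dimensional case). The sketch below is ours and purely illustrative: it computes the least $\eta$ for which a given $\{0,1\}$-valued random vector is $\eta$-spreadable, using the finite-set formula $\rho_{\mathrm{TV}}(P, Q) = \max_A |P(A) - Q(A)| = \tfrac12 \sum_x |P(x) - Q(x)|$.

```python
import math
from itertools import combinations, product

# Law of the subarray X_J (coordinates listed in increasing order of J).
def subarray_law(law, J):
    marg = {}
    for x, p in law.items():
        key = tuple(x[i] for i in J)
        marg[key] = marg.get(key, 0.0) + p
    return marg

# Total variation distance between two laws on a finite set.
def tv(P, Q):
    keys = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(k, 0.0) - Q.get(k, 0.0)) for k in keys)

# Least eta such that the vector is eta-spreadable (subsets of a fixed size).
def spreadability_constant(law, n, size):
    laws = [subarray_law(law, J) for J in combinations(range(n), size)]
    return max(tv(P, Q) for P in laws for Q in laws)

# Law of a vector with independent Bernoulli(b_i) entries.
def independent_law(biases):
    return {
        x: math.prod(b if xi else 1 - b for xi, b in zip(x, biases))
        for x in product((0, 1), repeat=len(biases))
    }

# i.i.d. entries are exactly spreadable; perturbing one bias destroys this.
print(spreadability_constant(independent_law([0.5] * 4), 4, 2))
print(spreadability_constant(independent_law([0.5, 0.5, 0.6, 0.5]), 4, 2))
```

In the second example the constant equals the total variation distance between a Bernoulli(0.5) and a Bernoulli(0.6) coordinate, namely $0.1$, which is the kind of defect that the Ramsey argument behind Proposition 1.4 removes by passing to a subset $J$.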
1.3.2. The main result. We are now ready to state the main result of this paper. In this introduction we will restrict our discussion to boolean two-dimensional random arrays, mainly because in this case the condition we obtain is easier to grasp, but at the same time it is quite representative of the higher-dimensional case. The general version is presented in Theorem 2.3 in Section 2.
Theorem 1.5.
Let $1 < p \leq 2$, let $0 < \varepsilon \leq 1$, let $k \geq 2$ be an integer, and set

(1.3) $C = C(p, \varepsilon, k) := \exp\big(\varepsilon^{-4}(p-1)^{-2} \cdot k^2\big)$.

Also let $n \geq C$ be an integer, let $X = \langle X_s : s \in \binom{[n]}{2}\rangle$ be a $\{0,1\}$-valued, $(1/C)$-spreadable, two-dimensional random array on $[n]$, and assume that

(1.4) $\big|\mathbb{E}[X_{\{1,3\}} X_{\{1,4\}} X_{\{2,3\}} X_{\{2,4\}}] - \mathbb{E}[X_{\{1,3\}}]\,\mathbb{E}[X_{\{1,4\}}]\,\mathbb{E}[X_{\{2,3\}}]\,\mathbb{E}[X_{\{2,4\}}]\big| \leq \frac{1}{C}$.

Then for every function $f \colon \{0,1\}^{\binom{[n]}{2}} \to \mathbb{R}$ with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$ there exists an interval $I$ of $[n]$ with $|I| = k$ such that

(1.5) $\mathbb{P}\big(|\mathbb{E}[f(X) \mid \mathcal{F}_I]| \leq \varepsilon\big) \geq 1 - \varepsilon$.
Observe that (1.4) together with the $(1/C)$-spreadability of $X$ imply that for every $i, j, k, \ell \in [n]$ with $i < j < k < \ell$ we have

(1.6) $\big|\mathbb{E}[X_{\{i,k\}} X_{\{i,\ell\}} X_{\{j,k\}} X_{\{j,\ell\}}] - \mathbb{E}[X_{\{i,k\}}]\,\mathbb{E}[X_{\{i,\ell\}}]\,\mathbb{E}[X_{\{j,k\}}]\,\mathbb{E}[X_{\{j,\ell\}}]\big| \leq \frac{6}{C}$.
Though not obvious at first sight, as the parameter $C$ gets bigger, the estimate (1.6) forces the random variables $X_{\{i,k\}}, X_{\{i,\ell\}}, X_{\{j,k\}}, X_{\{j,\ell\}}$ to behave independently. (It also implies that the correlation matrix of $X$ is close to the identity.) Thus, we may view (1.6) as an (approximate) box independence condition for $X$.
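The content of the box independence condition can be seen in a two-line computation. The sketch below is ours, for illustration only: it evaluates the left-hand side of a (1.6)-type estimate exactly, over the box $\{1,2\} \times \{3,4\}$, for two toy boolean models, independent Bernoulli(1/2) entries (for which the defect vanishes) and a product-type array $X_{\{i,j\}} = \xi_i \xi_j$ with i.i.d. Bernoulli(1/2) factors (whose entries are strongly correlated over a box).

```python
import math
from itertools import product

# Exact "box defect" |E[X13 X14 X23 X24] - E[X13] E[X14] E[X23] E[X24]|,
# given a list of equally likely values of the 4-tuple (X13, X14, X23, X24).
def box_defect(samples):
    n = len(samples)
    joint = sum(math.prod(s) for s in samples) / n
    singles = [sum(s[i] for s in samples) / n for i in range(4)]
    return abs(joint - math.prod(singles))

# (a) independent Bernoulli(1/2) entries: all 16 patterns equally likely.
iid = list(product((0, 1), repeat=4))

# (b) product array X_{ij} = xi_i * xi_j, xi_1, ..., xi_4 i.i.d. Bernoulli(1/2).
prod_array = [(x[0] * x[2], x[0] * x[3], x[1] * x[2], x[1] * x[3])
              for x in product((0, 1), repeat=4)]

print(box_defect(iid))        # 0: the box independence condition holds exactly
print(box_defect(prod_array)) # large: the condition fails for this model
```

Note that the product model is dissociated in the sense recalled later (entries over disjoint index sets are independent), yet it fails the box condition; the condition genuinely concerns the overlapping entries $X_{\{i,k\}}, X_{\{i,\ell\}}, X_{\{j,k\}}, X_{\{j,\ell\}}$.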
Figure 1. The box independence condition.

Finally, we note that (1.6) is, essentially, an optimal condition. Specifically, for every integer $n \geq 4$ there exist a boolean, two-dimensional random array $X$ on $[n]$ whose correlation matrix is the identity, and a translated multilinear polynomial $f \colon \mathbb{R}^{\binom{[n]}{2}} \to \mathbb{R}$ of degree 4 with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_\infty} \leq 1$, for which (1.6) and (1.5) do not hold. (See Proposition 7.1 in Section 7; the case "$d \geq 3$" is treated in Proposition 7.3.)

1.4. Related work.
Although Theorem 1.5 (as well as its higher-dimensional extension, Theorem 2.3) is somewhat distinct from the traditional setting of concentration of smooth functions, it is related to several results which we now discuss.

Arguably, the one-dimensional case, that is, the case of random vectors, is the most heavily investigated. It is impossible to give here a comprehensive review; we only mention that concentration estimates for functions of finite exchangeable random vectors have been obtained in [Bob04, Ch06].

The two-dimensional case is also heavily investigated, in particular in the literature around various random matrix models. However, closer to the spirit of this paper is the work of Latala [La06] and the subsequent papers [AdWo15, GSS19, V19], which obtain exponential concentration inequalities for smooth functions (e.g., polynomials) of high-dimensional random arrays whose entries are of the form

(1.7) $X_s = \prod_{i \in s} \xi_i$

where $(\xi_1, \dots, \xi_n)$ is a random vector with independent entries and a well-behaved distribution. Note that all these arrays are dissociated, and are additionally exchangeable if the random variables $\xi_1, \dots, \xi_n$ are identically distributed.

That said, the study of concentration inequalities for functions of more general finite high-dimensional random arrays is hardly developed at all, mainly because the structure of finite high-dimensional random arrays is quite complicated (see, also, [Au13, page 16] for a discussion of this issue). We make a step in this direction in the companion paper [DTV20].

1.5. Brief outline of the argument. The first step of the proof is Theorem 2.2. It is based on estimates for martingale difference sequences in $L_p$ spaces, and it applies to random arrays with arbitrary distributions (in particular, not necessarily approximately spreadable). It shows that the conditional concentration of $f(X)$ is equivalent to an approximate form of the dissociativity of $X$ (see Definition 2.1). The main advantage of this reduction is that it enables us to forget about the function $f$ and focus exclusively on the random array $X$.

The basic step for the verification of the approximate dissociativity of $X$ is Theorem 4.2. It shows that the "box independence condition" propagates and forces all, not too large, subarrays of $X$ to behave independently. Theorem 4.2 is essentially combinatorial, and it is analogous to the phenomenon, discovered in the theory of quasi-random graphs [CGW88, CGW89], that a graph $G$ which contains (roughly) the expected number of 4-cycles must also contain the expected number of copies of any other, not too large, graph $H$.

2. Precise statement of the main results
2.1. From dissociativity to concentration.
Let $d$ be a positive integer, and recall that a $d$-dimensional random array $X$ on a (possibly infinite) subset $I$ of $\mathbb{N}$ is called dissociated if for every $J, K \subseteq I$ with $|J|, |K| \geq d$ and $\max(J) < \min(K)$, the $\sigma$-algebras $\mathcal{F}_J$ and $\mathcal{F}_K$ are independent, that is, for every $A \in \mathcal{F}_J$ and $B \in \mathcal{F}_K$ we have $\mathbb{P}(A \cap B) = \mathbb{P}(A)\,\mathbb{P}(B)$. Dissociativity is a classical concept in probability (see [MS75]); the understanding is better in the one-dimensional case, see [DF80]. In this paper, we will need the following approximate version of this notion.

(In fact, the analogy with quasi-random graphs noted in the introduction is more than an analogy; indeed, it is easy to see that Theorem 4.2 yields the aforementioned property of quasi-random graphs. We discuss further the relation between the "box independence condition" and quasi-randomness of graphs and hypergraphs in [DTV20, Section 8].)

Definition 2.1 (Approximate dissociativity). Let $n, \ell, d$ be positive integers such that $n \geq \ell \geq d$, and let $0 \leq \beta \leq 1$. We say that a $d$-dimensional random array $X$ on $[n]$ is $(\beta, \ell)$-dissociated provided that for every $J, K \subseteq [n]$ with $|J|, |K| \geq d$, $|J| + |K| \leq \ell$ and $\max(J) < \min(K)$, and every pair of events $A \in \mathcal{F}_J$ and $B \in \mathcal{F}_K$, we have

(2.1) $\big|\mathbb{P}(A \cap B) - \mathbb{P}(A)\,\mathbb{P}(B)\big| \leq \beta$.

The following theorem provides the link between conditional concentration and approximate dissociativity. Its proof is given in Section 3.
Theorem 2.2.
Let $d$ be a positive integer, let $1 < p \leq 2$, let $0 < \varepsilon \leq 1$, let $k \geq d$ be an integer, and set

(2.2) $\beta = \beta(p, \varepsilon) := \Big(\frac{\varepsilon^2}{20}\Big)^{\frac{p(p+1)}{p-1}}$

(2.3) $\ell = \ell(p, \varepsilon, k) := \big\lceil 8\,\varepsilon^{-4}(p-1)^{-1}\, k \big\rceil$.

Also let $n \geq \ell$ be an integer, and let $X$ be a $(\beta, \ell)$-dissociated, $d$-dimensional random array on $[n]$ whose entries take values in a measurable space $\mathcal{X}$. Then for every measurable function $f \colon \mathcal{X}^{\binom{[n]}{d}} \to \mathbb{R}$ with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$ there exists an interval $I$ of $[n]$ with $|I| = k$ such that

(2.4) $\mathbb{P}\big(|\mathbb{E}[f(X) \mid \mathcal{F}_I]| \leq \varepsilon\big) \geq 1 - \varepsilon$.

We note that for spreadable arrays there is a converse of Theorem 2.2, namely, approximate dissociativity is in fact necessary in order to have conditional concentration; see Proposition 3.6 in Section 3.

2.2. The concentration estimate. The following theorem, which is proved in Section 6, extends Theorem 1.5. (Also note that the case "$d = 1$" corresponds to random vectors.)

Theorem 2.3.
Let $d, m$ be two positive integers with $m \geq 2$, let $1 < p \leq 2$, let $0 < \varepsilon \leq 1$, let $k \geq d$ be an integer, and set

(2.5) $C = C(d, m, p, \varepsilon, k) := \exp\big(4^d \ln m\; \varepsilon^{-4d}(p-1)^{-d} \cdot k^d\big)$.

Also let $n \geq C$ be an integer, let $\mathcal{X}$ be a set with $|\mathcal{X}| = m$, and let $X = \langle X_s : s \in \binom{[n]}{d}\rangle$ be an $\mathcal{X}$-valued, $(1/C)$-spreadable, $d$-dimensional random array on $[n]$. Set

(2.6) $\mathrm{Box}(d) := \Big\{ s \in \binom{[n]}{d} : |s \cap \{2i-1, 2i\}| = 1 \text{ for all } i \in [d] \Big\}$,

and assume that there exists $\mathcal{S} \subseteq \mathcal{X}$ with $|\mathcal{S}| = |\mathcal{X}| - 1$ such that for every $a \in \mathcal{S}$ we have

(2.7) $\Big|\mathbb{P}\Big(\bigcap_{s \in \mathrm{Box}(d)} [X_s = a]\Big) - \prod_{s \in \mathrm{Box}(d)} \mathbb{P}\big([X_s = a]\big)\Big| \leq \frac{1}{C}$.

Then for every function $f \colon \mathcal{X}^{\binom{[n]}{d}} \to \mathbb{R}$ with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$ there exists an interval $I$ of $[n]$ with $|I| = k$ such that

(2.8) $\mathbb{P}\big(|\mathbb{E}[f(X) \mid \mathcal{F}_I]| \leq \varepsilon\big) \geq 1 - \varepsilon$.
2.3. Organization of the rest of the paper.
We close this section by giving an outline of the contents of the rest of the paper. In Section 3 we present the proof of Theorem 2.2. The next two sections, Sections 4 and 5, are devoted to the fact (already noted in the introduction) that the box independence condition forces all, not too large, subarrays to behave independently. The precise statement is Theorem 4.2 in Section 4; in Section 4 we also present some consequences. The proof of Theorem 4.2 is given in Section 5; this is the most technically demanding part of the paper. In Section 6 we complete the proof of Theorem 2.3, and in Section 7 we present examples which show the optimality of the box independence condition. Finally, in Section 8, we discuss extensions/refinements of Theorem 2.3 for dissociated random arrays (Theorem 8.1), and for vector-valued functions of random arrays (Theorem 8.2).

3. From dissociativity to concentration: proof of Theorem 2.2
3.1. Moment bound.
The following theorem is the main result in this section.
Theorem 3.1.
Let $d, \ell, n$ be positive integers with $n \geq \ell \geq 2d$, let $0 < \beta \leq 1$, and let $X$ be a $d$-dimensional random array on $[n]$ which is $(\beta, \ell)$-dissociated and whose entries take values in a measurable space $\mathcal{X}$. Then, for every $1 < p \leq 2$, every measurable function $f \colon \mathcal{X}^{\binom{[n]}{d}} \to \mathbb{R}$ with $f(X) \in L_p$, every integer $k$ with $d \leq k \leq \lfloor \ell/2 \rfloor$, and every $I \in \binom{[n]}{\ell}$, there exists $J \in \binom{I}{k}$ with the following property. For any $1 \leq r < p$, we have

(3.1) $\big\|\mathbb{E}[f(X) \mid \mathcal{F}_J] - \mathbb{E}[f(X)]\big\|_{L_r} \leq \Big( (p-1)^{-1/2}\sqrt{\tfrac{2k}{\ell}} + 10\,\beta^{\frac{1}{r}-\frac{1}{p}} \Big) \big\|f(X) - \mathbb{E}[f(X)]\big\|_{L_p}$

where $\mathcal{F}_J$ denotes the $\sigma$-algebra generated by the subarray $X_J$. (See Definition 1.1.) Moreover, if $I$ is an interval of $[n]$, then $J$ is an interval too.

Theorem 3.1 easily yields Theorem 2.2. We present the details below.
Proof of Theorem 2.2 assuming Theorem 3.1. Set $r := (p+1)/2$, and note that $1 \leq r < p \leq 2$. Since $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$, by Theorem 3.1 applied for the interval $[\ell]$, there exists an interval $I$ of $[n]$ with $|I| = k$ such that

(3.2) $\big\|\mathbb{E}[f(X) \mid \mathcal{F}_I]\big\|_{L_r} \leq (p-1)^{-1/2}\sqrt{\tfrac{2k}{\ell}} + 10\,\beta^{\frac{1}{r}-\frac{1}{p}}$.

By Markov's inequality, this estimate yields that

(3.3) $\mathbb{P}\big(|\mathbb{E}[f(X) \mid \mathcal{F}_I]| \geq \varepsilon\big) \leq (1/\varepsilon)^r \cdot \Big((p-1)^{-1/2}\sqrt{\tfrac{2k}{\ell}} + 10\,\beta^{\frac{1}{r}-\frac{1}{p}}\Big)^r$.

By (3.3), the choice of $r$, and the choice of $\beta$ and $\ell$ in (2.2) and (2.3) respectively, we conclude that

(3.4) $\mathbb{P}\big(|\mathbb{E}[f(X) \mid \mathcal{F}_I]| \geq \varepsilon\big) \leq \varepsilon$

which clearly implies (2.4). The proof of Theorem 2.2 is completed. $\Box$
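The arithmetic behind the step from (3.3) to (3.4) can be checked mechanically. The sketch below is ours: it uses $r = (p+1)/2$ together with the formulas for $\beta$ and $\ell$ as written in (2.2) and (2.3) of this text (which are reconstructions; the published constants may differ), and verifies that the right-hand side of (3.3) stays below $\varepsilon$ on a grid of parameters.

```python
import math

# Right-hand side of (3.3), with r = (p + 1)/2 and beta, ell taken from
# (2.2)-(2.3) as written above (an assumption of this note).
def markov_bound(p, eps, k):
    r = (p + 1) / 2
    beta = (eps ** 2 / 20) ** (p * (p + 1) / (p - 1))
    ell = math.ceil(8 * eps ** -4 * (p - 1) ** -1 * k)
    inner = (p - 1) ** -0.5 * math.sqrt(2 * k / ell) + 10 * beta ** (1 / r - 1 / p)
    return (inner / eps) ** r

def worst_ratio():
    # largest value of (right-hand side of (3.3)) / eps over a small grid
    return max(markov_bound(p, eps, k) / eps
               for p in (1.1, 1.5, 2.0)
               for eps in (0.05, 0.3, 0.8)
               for k in (2, 10, 100))

print(worst_ratio())  # stays below 1, so (3.4) holds on the whole grid
```

The two terms inside (3.3) are balanced by design: $\ell$ makes the martingale term at most $\varepsilon^2/2$, and $\beta$ makes the mixing term exactly $\varepsilon^2/2$, so the bracket is at most $\varepsilon^2$ and $(\varepsilon^2/\varepsilon)^r \leq \varepsilon$ for $r \geq 1$.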
The rest of this section is devoted to the proof of Theorem 3.1, which is based on inequalities for martingales in $L_p$ spaces. Martingales are, of course, standard tools in the proofs of concentration estimates. Typically, one decomposes a given random variable $X$ into martingale increments, and then controls an appropriate norm of $X$ by controlling the norm of the increments. In the proof of Theorem 3.1 we also decompose a given random variable into martingale increments but, in contrast, we seek to find one of the increments which has controlled norm. This method, known as the energy increment strategy, was introduced in the present probabilistic setting by Tao [Tao06] for "$p = 2$", and then extended to the full range of admissible $p$'s in [DKT16]. Having said that, we also note that the main novelty of the present paper lies in the selection of the filtration.

We now briefly describe the contents of this section. In Subsection 3.2 we present the analytical estimate which is used in the proof of Theorem 3.1. In Subsection 3.3 we prove an orthogonality result for pairs of $\sigma$-algebras which satisfy the estimate (2.1). The proof of Theorem 3.1 is completed in Subsection 3.4. Finally, in Subsection 3.5 we show that, for spreadable random arrays, the assumption of approximate dissociativity in Theorem 2.2 is necessary.

3.2. Martingale difference sequences. It is an elementary, though important, fact that martingale difference sequences are orthogonal in $L_2$. We will need the following extension of this fact.

Proposition 3.2.
Let $1 < p \leq 2$. Then for every martingale difference sequence $(d_i)_{i=1}^m$ in $L_p$ we have

(3.5) $\Big(\sum_{i=1}^m \|d_i\|_{L_p}^2\Big)^{1/2} \leq (p-1)^{-1/2} \Big\|\sum_{i=1}^m d_i\Big\|_{L_p}$.

In particular,

(3.6) $\min_{1 \leq i \leq m} \|d_i\|_{L_p} \leq \frac{1}{\sqrt{m(p-1)}} \Big\|\sum_{i=1}^m d_i\Big\|_{L_p}$.

We note that the constant $(p-1)^{-1/2}$ in (3.5) is optimal; this sharp estimate was proved by Ricard and Xu [RX16], who deduced it from a uniform convexity inequality for $L_p$ spaces; see [Pi11, Lemma 4.32]. (See, also, [DKK16, Appendix A] for an exposition.)

3.3. Mixing and orthogonality.
In what follows, it is convenient to introduce the following terminology. Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space, and let $0 \leq \beta \leq 1$; given two sub-$\sigma$-algebras $\mathcal{A}, \mathcal{B}$ of $\Sigma$, we say that $\mathcal{A}$ and $\mathcal{B}$ are $\beta$-mixing provided that for every $A \in \mathcal{A}$ and every $B \in \mathcal{B}$ we have

(3.7) $\big|\mathbb{P}(A \cap B) - \mathbb{P}(A)\,\mathbb{P}(B)\big| \leq \beta$.

(Square-function estimates could also be used in place of Proposition 3.2, but they do not yield the optimal dependence with respect to the integrability parameter $p$.)

Notice that in the extreme case "$\beta = 0$" the estimate (3.7) is equivalent to saying that the $\sigma$-algebras $\mathcal{A}$ and $\mathcal{B}$ are independent, which in turn implies that for every random variable $X$ with $\mathbb{E}[X] = 0$ we have $\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{A}] \mid \mathcal{B}\big] = 0$. The main result in this subsection (Proposition 3.5 below) is an approximate version of this fact.

We start with the following lemma.

Lemma 3.3.
Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space, let $0 < \beta \leq 1$, and let $\mathcal{A}, \mathcal{B}$ be two sub-$\sigma$-algebras of $\Sigma$ which are $\beta$-mixing. Then for every real-valued, bounded random variable $X$ and every $1 \leq p \leq \infty$ we have

(3.8) $\big\|\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{A}] \mid \mathcal{B}\big] - \mathbb{E}[X]\big\|_{L_p} \leq (4\beta)^{1/p}\, \|X - \mathbb{E}[X]\|_{L_\infty}$.

For the proof of Lemma 3.3 we need the following simple fact.
Fact 3.4.
Let $(X, \Sigma, \mu)$ be a measure space, and let $f \colon X \to \mathbb{R}$ be an integrable function. Then we have

(3.9) $\|f\|_{L_1(\mu)} \leq 2 \sup_{A \in \Sigma} \Big| \int_A f \, d\mu \Big|$.

In particular, if $x_1, \dots, x_m \in \mathbb{R}$, then

(3.10) $\sum_{i=1}^m |x_i| \leq 2 \max_{\emptyset \neq I \subseteq [m]} \Big| \sum_{i \in I} x_i \Big|$.

Proof.
Since $[f \geq 0], [f < 0] \in \Sigma$, we have

$\|f\|_{L_1(\mu)} = \Big|\int_{[f \geq 0]} f \, d\mu\Big| + \Big|\int_{[f < 0]} f \, d\mu\Big| \leq 2 \sup_{A \in \Sigma} \Big|\int_A f \, d\mu\Big|$

as desired. $\Box$

We proceed to the proof of Lemma 3.3.
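The subset-sum inequality (3.10) is also easy to stress-test by brute force; in the sketch below (ours, illustration only), the maximizing subset is always either the positive part or the negative part of the vector, which is exactly the mechanism of the proof above.

```python
import random
from itertools import combinations

# max over nonempty I of |sum_{i in I} x_i|, computed by brute force
def subset_max(xs):
    return max(abs(sum(xs[i] for i in I))
               for size in range(1, len(xs) + 1)
               for I in combinations(range(len(xs)), size))

rng = random.Random(1)
for _ in range(200):
    xs = [rng.uniform(-1, 1) for _ in range(6)]
    # (3.10): sum |x_i| <= 2 max_{I nonempty} |sum_{i in I} x_i|
    assert sum(abs(x) for x in xs) <= 2 * subset_max(xs) + 1e-12
print("(3.10) verified on 200 random vectors")
```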
Proof of Lemma 3.3. We prove the $L_1$-estimate; the $L_p$-estimate for $p > 1$ then follows by interpolating with the trivial $L_\infty$-bound, using the fact that the conditional expectation is a linear contraction on $L_\infty$. Without loss of generality we may assume that $\mathbb{E}[X] = 0$. (If not, then we work with the random variable $X' := X - \mathbb{E}[X]$ instead of $X$.) Set $Z := \mathbb{E}[X \mid \mathcal{A}]$, and observe that $\mathbb{E}[Z] = \mathbb{E}[X] = 0$. Hence, by Fact 3.4, it suffices to obtain an upper bound for $|\mathbb{E}[Z \mathbf{1}_B]|$ for arbitrary $B \in \mathcal{B}$. To this end, note that $\|Z\|_{L_\infty} \leq \|X\|_{L_\infty}$; therefore, by a standard approximation, we may assume that $Z$ is of the form $\sum_{i=1}^N a_i \mathbf{1}_{A_i}$ where $N$ is a positive integer, $|a_i| \leq \|Z\|_{L_\infty}$ for every $i \in [N]$, and the family $\{A_1, \dots, A_N\}$ forms a partition of $\Omega$ into measurable events. Let $B \in \mathcal{B}$ be arbitrary. Using the fact that $\sum_{i=1}^N a_i \mathbb{P}(A_i) = \mathbb{E}[Z] = 0$ and the triangle inequality, we have

(3.11) $|\mathbb{E}[Z \mathbf{1}_B]| = \Big|\sum_{i=1}^N a_i\, \mathbb{P}(A_i \cap B)\Big| \leq \sum_{i=1}^N |a_i| \cdot |\mathbb{P}(A_i \cap B) - \mathbb{P}(A_i)\,\mathbb{P}(B)|$.

If we set $x_i := \mathbb{P}(A_i \cap B) - \mathbb{P}(A_i)\,\mathbb{P}(B)$, we obtain that

(3.12) $|\mathbb{E}[Z \mathbf{1}_B]| \leq \sum_{i=1}^N |a_i| \cdot |x_i| \leq 2\, \|Z\|_{L_\infty} \max_{\emptyset \neq I \subseteq [N]} \Big|\sum_{i \in I} x_i\Big|$

where we have also used the pointwise bound $|a_i| \leq \|Z\|_{L_\infty}$ and Fact 3.4. Finally, setting $A_I := \bigcup_{i \in I} A_i$ for every nonempty $I \subseteq [N]$, we have

(3.13) $\Big|\sum_{i \in I} x_i\Big| = \big|\mathbb{P}(A_I \cap B) - \mathbb{P}(A_I)\,\mathbb{P}(B)\big| \leq \beta$

since the sets $A_1, \dots, A_N$ are pairwise disjoint and $A_I \in \mathcal{A}$. We conclude that

(3.14) $\big|\mathbb{E}\big[\mathbb{E}[Z \mid \mathcal{B}]\, \mathbf{1}_B\big]\big| = |\mathbb{E}[Z \mathbf{1}_B]| \leq 2\beta\, \|X\|_{L_\infty}$.

Since $B \in \mathcal{B}$ was arbitrary, the result follows. $\Box$

We are now ready to state the main result in this subsection.
Proposition 3.5.
Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space, let $0 < \beta \leq 1$, and let $\mathcal{A}, \mathcal{B}$ be two sub-$\sigma$-algebras of $\Sigma$ which are $\beta$-mixing. Let $1 \leq r < p \leq \infty$ and let $X \in L_p$. Then we have

(3.15) $\big\|\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{A}] \mid \mathcal{B}\big] - \mathbb{E}[X]\big\|_{L_r} \leq 10\,\beta^{\frac{1}{r}-\frac{1}{p}}\, \|X - \mathbb{E}[X]\|_{L_p}$.

Proof. We will obtain the estimate by truncating $X$ and employing Lemma 3.3. We lay out the details. As in the proof of Lemma 3.3, we may assume that $\mathbb{E}[X] = 0$. Let $t > 0$ be a truncation level, to be chosen later, and set $X_t := X \mathbf{1}_{[|X| \leq t]}$. Markov's inequality yields that $\mathbb{P}(|X| > t) \leq t^{-p} \|X\|_{L_p}^p$, thus applying Holder's inequality we obtain that

(3.16) $\|X_t - X\|_{L_r}^r = \mathbb{E}\big[|X|^r \mathbf{1}_{[|X| > t]}\big] \leq \|X\|_{L_p}^r\, \mathbb{P}(|X| > t)^{1 - \frac{r}{p}} \leq \frac{\|X\|_{L_p}^p}{t^{p-r}}$

for any $1 \leq r < p$. Therefore,

(3.17) $\big\|\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{A}] \mid \mathcal{B}\big]\big\|_{L_r} \leq \big\|\mathbb{E}\big[\mathbb{E}[X - X_t \mid \mathcal{A}] \mid \mathcal{B}\big]\big\|_{L_r} + \big\|\mathbb{E}\big[\mathbb{E}[X_t \mid \mathcal{A}] \mid \mathcal{B}\big] - \mathbb{E}[X_t]\big\|_{L_r} + \big|\mathbb{E}[X_t]\big| \leq \|X - X_t\|_{L_r} + 2(4\beta)^{1/r}\, t + \|X - X_t\|_{L_1}$

where we have used the contraction property of the conditional expectation, Lemma 3.3 for the random variable $X_t$ (note that $\|X_t - \mathbb{E}[X_t]\|_{L_\infty} \leq 2t$), and the fact that $\mathbb{E}[X] = 0$, respectively. Taking into account (3.16), we conclude that

(3.18) $\big\|\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{A}] \mid \mathcal{B}\big]\big\|_{L_r} \leq \frac{2\|X\|_{L_p}^{p/r}}{t^{p/r - 1}} + 8\beta^{1/r}\, t$.

It remains to optimize the latter with respect to $t$; the choice $t := \beta^{-1/p} \|X\|_{L_p}$ makes the two terms of (3.18) equal to $2\beta^{\frac{1}{r}-\frac{1}{p}}\|X\|_{L_p}$ and $8\beta^{\frac{1}{r}-\frac{1}{p}}\|X\|_{L_p}$ respectively, and yields the assertion. $\Box$
3.4. Proof of Theorem 3.1. After normalizing, we may assume that

(3.19) $\|f(X) - \mathbb{E}[f(X)]\|_{L_p} = 1$.

Fix an integer $k$ with $d \leq k \leq \lfloor \ell/2 \rfloor$ and $I \in \binom{[n]}{\ell}$, and let $\{\iota_1 < \dots < \iota_\ell\}$ denote the increasing enumeration of $I$. Set $m := \lfloor \ell/k \rfloor$. Also let $K_1, \dots, K_m \in \binom{[\ell]}{k}$ be successive intervals with $\min(K_1) = 1$, and set $J_i := \{\iota_\kappa : \kappa \in K_i\}$ for every $i \in [m]$. Thus, the sets $J_1, \dots, J_m$ are successive subsets of $I$, each of cardinality $k$; also notice that if $I$ is an interval of $[n]$, then the sets $J_1, \dots, J_m$ are intervals too.

Next, denote by $(\Omega, \Sigma, \mathbb{P})$ the underlying probability space on which the random array $X$ is defined, and for every $i \in [m]$ let $\mathcal{F}_{J_i}$ be the $\sigma$-algebra generated by the subarray $X_{J_i}$. (See Definition 1.1.) We define a filtration $(\mathcal{A}_i)_{i=0}^m$ by setting $\mathcal{A}_0 = \{\emptyset, \Omega\}$ and

(3.20) $\mathcal{A}_i := \sigma\big(\mathcal{F}_{J_1}, \dots, \mathcal{F}_{J_i}\big)$

for every $i \in [m]$.
Figure 2. The filtration $(\mathcal{A}_i)_{i=0}^m$.

Let $(d_i)_{i=1}^m$ denote the martingale difference sequence of the Doob martingale for $f(X)$ with respect to the filtration $(\mathcal{A}_i)_{i=0}^m$, that is, $d_i := \mathbb{E}[f(X) \mid \mathcal{A}_i] - \mathbb{E}[f(X) \mid \mathcal{A}_{i-1}]$ for every $i \in [m]$. Since $\mathbb{E}[f(X) \mid \mathcal{A}_m] - \mathbb{E}[f(X)] = \sum_{i=1}^m d_i$, the contractive property of the conditional expectation yields that

(3.21) $\Big\|\sum_{i=1}^m d_i\Big\|_{L_p} \leq \|f(X) - \mathbb{E}[f(X)]\|_{L_p} \overset{(3.19)}{=} 1$.

Therefore, by Proposition 3.2, there exists an integer $i_0 \in [m]$ so that

(3.22) $\|d_{i_0}\|_{L_p} \leq \frac{1}{\sqrt{m(p-1)}}$.

We claim that the set $J := J_{i_0}$ is as desired.
To this end, fix $1\leqslant r<p$. First observe that, conditioning further on $\mathcal{F}_{J_{i_0}}$,
(3.23) $\big\|\mathbb{E}[f(X)\,|\,\mathcal{F}_{J_{i_0}}] - \mathbb{E}\big[\mathbb{E}[f(X)\,|\,\mathcal{A}_{i_0-1}]\,\big|\,\mathcal{F}_{J_{i_0}}\big]\big\|_{L_p} = \big\|\mathbb{E}[d_{i_0}\,|\,\mathcal{F}_{J_{i_0}}]\big\|_{L_p} \leqslant \frac{1}{\sqrt{m(p-1)}}$
where we have used the inclusion $\mathcal{F}_{J_{i_0}}\subseteq\mathcal{A}_{i_0}$, the contractive property of the conditional expectation once more, and (3.22). By the triangle inequality, and taking into account (3.23) and the monotonicity of the $L_p$-norms, we obtain that
(3.24) $\big\|\mathbb{E}[f(X)\,|\,\mathcal{F}_{J_{i_0}}]-\mathbb{E}[f(X)]\big\|_{L_r} \leqslant \frac{1}{\sqrt{m(p-1)}} + \big\|\mathbb{E}\big[\mathbb{E}[f(X)\,|\,\mathcal{A}_{i_0-1}]\,\big|\,\mathcal{F}_{J_{i_0}}\big]-\mathbb{E}[f(X)]\big\|_{L_r}.$
Finally, by (3.20) and our assumption that the random array $X$ is $(\beta,\ell)$-dissociated, we see that the $\sigma$-algebras $\mathcal{F}_{J_{i_0}}$ and $\mathcal{A}_{i_0-1}$ are $\beta$-mixing. (See (3.7).) By Proposition 3.5, we conclude that
(3.25) $\big\|\mathbb{E}[f(X)\,|\,\mathcal{F}_{J_{i_0}}]-\mathbb{E}[f(X)]\big\|_{L_r} \leqslant \frac{1}{\sqrt{m(p-1)}} + 10\,\beta^{\frac{1}{r}-\frac{1}{p}}$
and the proof is completed.

3.5. Necessity of approximate dissociativity.
We close this section with the following proposition, which shows that the assumption of approximate dissociativity in Theorem 2.2 is necessary.
Proposition 3.6.
Let $n,d,\ell$ be positive integers with $n\geqslant\ell\geqslant 2d$, let $0<\beta\leqslant 1$, let $X$ be a spreadable, $d$-dimensional random array on $[n]$ whose entries take values in a measurable space $\mathcal{X}$, and assume that $X$ is not $(\beta,\ell)$-dissociated. Then there exists a measurable function $f\colon\mathcal{X}^{\binom{[n]}{d}}\to\{0,1\}$ such that for every $I\in\binom{[n]}{\ell}$ we have
(3.26) $\mathbb{P}\big(\big|\mathbb{E}[f(X)\,|\,\mathcal{F}_I]-\mathbb{E}[f(X)]\big|\geqslant\beta/2\big)\geqslant\beta/2.$

Proof.
Since the random array $X$ is spreadable and not $(\beta,\ell)$-dissociated, there exist two integers $j,k\geqslant d$ with $j+k\leqslant\ell$, and two events $A\in\mathcal{F}_{[j]}$ and $B\in\mathcal{F}_K$, where $K:=\{j+1,\dots,j+k\}$, such that $|\mathbb{P}(A\cap B)-\mathbb{P}(A)\,\mathbb{P}(B)|\geqslant\beta$. We select a measurable subset $A'$ of $\mathcal{X}^{\binom{[j]}{d}}$ such that the events $[X_{[j]}\in A']$ and $A$ agree almost surely, and we set $\widetilde{A}:=\pi^{-1}(A')$, where $\pi\colon\mathcal{X}^{\binom{[n]}{d}}\to\mathcal{X}^{\binom{[j]}{d}}$ denotes the natural projection. Finally, we define $f\colon\mathcal{X}^{\binom{[n]}{d}}\to\{0,1\}$ by $f=\mathbf{1}_{\widetilde{A}}$.

We claim that $f$ is as desired. Indeed, let $I\in\binom{[n]}{\ell}$ be arbitrary. We select $L\in\binom{I}{k}$ with $\min(L)>j$. Invoking the spreadability of $X$ and the choice of $A$ and $B$, we may also select $\Gamma\in\mathcal{F}_L$ such that
(3.27) $\big|\mathbb{P}(A\cap\Gamma)-\mathbb{P}(A)\,\mathbb{P}(\Gamma)\big|\geqslant\beta.$
Observing that $\mathbb{P}(A)=\mathbb{E}[f(X)]$ and $\mathbb{P}(A\cap\Gamma)=\mathbb{E}[f(X)\,\mathbf{1}_\Gamma]$, and using the fact that $\Gamma\in\mathcal{F}_L\subseteq\mathcal{F}_I$, we obtain that
(3.28) $\beta \overset{(3.27)}{\leqslant} \big|\mathbb{E}\big[\big(f(X)-\mathbb{E}[f(X)]\big)\,\mathbf{1}_\Gamma\big]\big| = \big|\mathbb{E}\big[\big(\mathbb{E}[f(X)\,|\,\mathcal{F}_I]-\mathbb{E}[f(X)]\big)\,\mathbf{1}_\Gamma\big]\big|$
which is easily seen to imply (3.26). The proof is completed. $\Box$

Remark. We note that if the random array $X$ in Proposition 3.6 is boolean, then the function $f$ defined above is a polynomial of degree at most $\binom{\ell}{d}$.

4. The box independence condition propagates
4.1. The main result.
We start by introducing some pieces of notation and terminology. Let $n,d$ be positive integers with $n\geqslant 2d$; for every finite sequence $H=(H_1,\dots,H_d)$ of nonempty finite subsets of $[n]$ with $\max(H_i)<\min(H_{i+1})$ for all $i\in[d-1]$, set
(4.1) $\mathrm{Box}(H) := \Big\{ s\in\binom{[n]}{d} : |s\cap H_i|=1 \text{ for all } i\in[d] \Big\}.$
If, in addition, we have $|H_i|=2$ for all $i\in[d]$, then we say that the set $\mathrm{Box}(H)$ is a $d$-dimensional box of $[n]$. As in (2.6), by $\mathrm{Box}(d)$ we shall denote the $d$-dimensional box corresponding to the sequence $(\{1,2\},\dots,\{2d-1,2d\})$, that is,
(4.2) $\mathrm{Box}(d) = \Big\{ s\in\binom{[n]}{d} : |s\cap\{2i-1,2i\}|=1 \text{ for all } i\in[d] \Big\}.$
We proceed with the following definition. (Note that the "$(\vartheta,\mathcal{S})$-box independence" condition introduced below is weaker than (1.6). We will work with this weaker notion since it is more amenable to an inductive argument.)
Let $n,d$ be positive integers with $n\geqslant 2d$, let $\mathcal{X}$ be a nonempty finite set, and let $X=\langle X_s : s\in\binom{[n]}{d}\rangle$ be an $\mathcal{X}$-valued, $d$-dimensional random array on $[n]$. Also let $\mathcal{S}$ be a nonempty subset of $\mathcal{X}$.

(i) (Box independence) Let $\vartheta>0$. We say that $X$ is $(\vartheta,\mathcal{S})$-box independent if for every $d$-dimensional box $B$ of $[n]$ and every $a\in\mathcal{S}$ we have
(4.3) $\mathbb{P}\Big(\bigcap_{s\in B}[X_s=a]\Big) \leqslant \prod_{s\in B}\mathbb{P}\big([X_s=a]\big) + \vartheta.$

(ii) (Approximate independence) Set $\ell:=\binom{\lfloor n/2\rfloor}{d}$, and let $\boldsymbol{\gamma}=(\gamma_k)_{k=1}^{\ell}$ be a finite sequence of positive reals. We say that $X$ is $(\boldsymbol{\gamma},\mathcal{S})$-independent if for every nonempty subset $\mathcal{F}$ of $\binom{[n]}{d}$ such that $\cup\mathcal{F}$ has cardinality at most $n/2$, and every collection $(a_s)_{s\in\mathcal{F}}$ of elements of $\mathcal{S}$, we have
(4.4) $\Big|\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}}[X_s=a_s]\Big) - \prod_{s\in\mathcal{F}}\mathbb{P}\big([X_s=a_s]\big)\Big| \leqslant \gamma_{|\mathcal{F}|}.$
(Note that if $d=1$, then this condition is superfluous.)
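To see that (4.3) is a genuine restriction, it may help to compute it on a toy dependent array. The model below is our own illustrative choice, not taken from the text: the "XOR array" $X_s=\xi_i\oplus\xi_j$ for $s=\{i,j\}$, built from i.i.d. fair bits $\xi_1,\dots,\xi_4$. On the $2$-dimensional box determined by $(\{1,2\},\{3,4\})$ the joint probability exceeds the product of the marginals by $1/16$, so this array is $(\vartheta,\mathcal{S})$-box independent only for $\vartheta\geqslant 1/16$:

```python
import itertools

# Toy 2-dimensional array on [4]: X_{ij} = xi_i XOR xi_j (i < j), with
# xi_1, ..., xi_4 i.i.d. fair bits -- an illustrative choice of ours.
box = [(0, 2), (0, 3), (1, 2), (1, 3)]   # the box of ({1,2}, {3,4}), 0-indexed

outcomes = list(itertools.product([0, 1], repeat=4))
count_joint = 0
marg = {s: 0 for s in box}
for xi in outcomes:
    vals = {s: xi[s[0]] ^ xi[s[1]] for s in box}
    if all(v == 0 for v in vals.values()):  # all four entries equal 0
        count_joint += 1
    for s in box:
        marg[s] += (vals[s] == 0)

P_joint = count_joint / len(outcomes)
P_prod = 1.0
for s in box:
    P_prod *= marg[s] / len(outcomes)

# joint = 1/8 (all bits equal forces it), product of marginals = 1/16
assert abs(P_joint - 0.125) < 1e-12 and abs(P_prod - 0.0625) < 1e-12
print("P(joint) =", P_joint, "product =", P_prod, "defect =", P_joint - P_prod)
```

So even a spreadable array built from i.i.d. data can violate (4.3) with a fixed defect; box independence really does exclude such hidden algebraic dependence.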
We are now ready to state the main result of this section; it is the second main step towards Theorem 2.3, and its proof is given in Section 5. (The definition of the numerical invariants appearing below is given in Subsection 5.2.)
Theorem 4.2.
Let $d,n$ be positive integers with $n\geqslant 2d$, let $0<\eta,\vartheta\leqslant 1$, and set $\ell:=\binom{\lfloor n/2\rfloor}{d}$. Then there exists a sequence $\boldsymbol{\gamma}=(\gamma_k(\eta,\vartheta,d,n))_{k=1}^{\ell}$ of positive reals such that $\gamma_k(\eta,\vartheta,d,n)$ tends to zero as $n$ tends to infinity and $\eta,\vartheta$ tend to zero, and with the following property.

Let $\mathcal{X}$ be a finite set, let $\mathcal{S}$ be a nonempty subset of $\mathcal{X}$, and let $X$ be an $\mathcal{X}$-valued, $\eta$-spreadable, $d$-dimensional random array on $[n]$. If $X$ is $(\vartheta,\mathcal{S})$-box independent, then $X$ is also $(\boldsymbol{\gamma},\mathcal{S})$-independent.

4.2. Consequences.
The rest of this section is devoted to the proof of two consequences of Theorem 4.2. The first consequence—which is one of the main ingredients of the proof of Theorem 2.3—shows that the box independence condition implies approximate dissociativity. Specifically, we have the following corollary.
Corollary 4.3.
For every triple $d,\ell,m$ of positive integers with $\ell\geqslant 2d$ and $m\geqslant 2$, and every $\beta>0$, there exist an integer $N\geqslant\ell$ and two constants $\eta,\vartheta>0$ with the following property.

Let $n\geqslant N$ be an integer, let $\mathcal{X}$ be a set with $|\mathcal{X}|=m$, let $\mathcal{S}$ be a subset of $\mathcal{X}$ with $|\mathcal{S}|=|\mathcal{X}|-1$, and let $X$ be an $\mathcal{X}$-valued, $\eta$-spreadable, $d$-dimensional random array on $[n]$. If $X$ is $(\vartheta,\mathcal{S})$-box independent, then $X$ is $(\beta,\ell)$-dissociated.

The second consequence of Theorem 4.2 shows that box independence forces all subprocesses indexed by $d$-dimensional boxes to behave independently. More precisely, we have the following corollary.

Corollary 4.4.
For every pair $d,m$ of positive integers with $m\geqslant 2$, and every $\gamma>0$, there exist an integer $N\geqslant 2d$ and $\eta,\vartheta>0$ with the following property.

Let $n,\mathcal{X},\mathcal{S},X=\langle X_s : s\in\binom{[n]}{d}\rangle$ be as in Corollary 4.3. If $X$ is $(\vartheta,\mathcal{S})$-box independent, then for every $d$-dimensional box $B$ of $[n]$ and every collection $(a_s)_{s\in B}$ of elements of $\mathcal{X}$ we have
(4.5) $\Big|\mathbb{P}\Big(\bigcap_{s\in B}[X_s=a_s]\Big)-\prod_{s\in B}\mathbb{P}\big([X_s=a_s]\big)\Big|\leqslant\gamma.$

Remark 4.5. Although Corollary 4.4 is weaker than Theorem 4.2, a direct proof of the estimate (4.5) is likely to require the whole machinery presented in Section 5.

The deduction of Corollaries 4.3 and 4.4 from Theorem 4.2 is based on the following lemma.
Lemma 4.6.
For every triple $d,m,\kappa$ of positive integers with $m\geqslant 2$, and every $\gamma>0$, there exist an integer $N\geqslant 2d$ and $\eta,\vartheta>0$ with the following property.

Let $n,\mathcal{X},\mathcal{S},X=\langle X_s : s\in\binom{[n]}{d}\rangle$ be as in Corollary 4.3. If $X$ is $(\vartheta,\mathcal{S})$-box independent, then for every nonempty subset $\mathcal{F}$ of $\binom{[n]}{d}$ with $|\mathcal{F}|\leqslant\kappa$ and every collection $(a_s)_{s\in\mathcal{F}}$ of elements of $\mathcal{X}$ we have
(4.6) $\Big|\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}}[X_s=a_s]\Big)-\prod_{s\in\mathcal{F}}\mathbb{P}\big([X_s=a_s]\big)\Big|\leqslant\gamma.$

We defer the proof of Lemma 4.6 to Subsection 4.3 below. At this point, let us give the proofs of Corollaries 4.3 and 4.4.
Proof of Corollary 4.3. Let $d,\ell,m,\beta$ be as in the statement of the corollary. Let
$N,\eta,\vartheta$ be as in Lemma 4.6 applied for $\kappa:=\binom{\ell}{d}$ and $\gamma:=\frac{\beta}{3}\,m^{-2\kappa}$. (Clearly, we may assume that $N\geqslant\ell$.) We claim that $N,\eta$ and $\vartheta$ are as desired.

Indeed, fix $n,\mathcal{X},\mathcal{S},X$, and recall that we need to show that $X$ is $(\beta,\ell)$-dissociated. (See Definition 2.1.) To this end, let $J,K$ be subsets of $[n]$ with $|J|,|K|\geqslant d$, $|J|+|K|\leqslant\ell$ and $\max(J)<\min(K)$, and let $A\in\mathcal{F}_J$ and $B\in\mathcal{F}_K$. Notice that we have to show that $|\mathbb{P}(A\cap B)-\mathbb{P}(A)\,\mathbb{P}(B)|\leqslant\beta$.

Since $A$ belongs to the $\sigma$-algebra generated by $X_J$, there exists a collection $\mathcal{A}$ of maps of the form $a\colon\binom{J}{d}\to\mathcal{X}$ such that
(4.7) $A=\bigcup_{a\in\mathcal{A}}\,\bigcap_{s\in\binom{J}{d}}[X_s=a(s)].$
Similarly, there exists a collection $\mathcal{B}$ of maps of the form $b\colon\binom{K}{d}\to\mathcal{X}$ such that
(4.8) $B=\bigcup_{b\in\mathcal{B}}\,\bigcap_{t\in\binom{K}{d}}[X_t=b(t)].$
For every $a\in\mathcal{A}$ we set $A_a:=\bigcap_{s\in\binom{J}{d}}[X_s=a(s)]$ and, respectively, for every $b\in\mathcal{B}$ we set $B_b:=\bigcap_{t\in\binom{K}{d}}[X_t=b(t)]$. By Lemma 4.6, for every $a\in\mathcal{A}$ and every $b\in\mathcal{B}$ we have
(4.9) $\Big|\mathbb{P}(A_a\cap B_b)-\prod_{s\in\binom{J}{d}}\mathbb{P}\big([X_s=a(s)]\big)\prod_{t\in\binom{K}{d}}\mathbb{P}\big([X_t=b(t)]\big)\Big|\leqslant\gamma$
(4.10) $\Big|\mathbb{P}(A_a)-\prod_{s\in\binom{J}{d}}\mathbb{P}\big([X_s=a(s)]\big)\Big|\leqslant\gamma$
(4.11) $\Big|\mathbb{P}(B_b)-\prod_{t\in\binom{K}{d}}\mathbb{P}\big([X_t=b(t)]\big)\Big|\leqslant\gamma$;
consequently, $|\mathbb{P}(A_a\cap B_b)-\mathbb{P}(A_a)\,\mathbb{P}(B_b)|\leqslant 3\gamma$. On the other hand, by identities (4.7) and (4.8), we see that $A\cap B=\bigcup_{a\in\mathcal{A},\,b\in\mathcal{B}}A_a\cap B_b$; moreover, the collections $\langle A_a : a\in\mathcal{A}\rangle$ and $\langle B_b : b\in\mathcal{B}\rangle$ consist of pairwise disjoint events.
Thus, we have
(4.12) $\mathbb{P}(A\cap B)=\sum_{a\in\mathcal{A},\,b\in\mathcal{B}}\mathbb{P}(A_a\cap B_b), \quad \mathbb{P}(A)=\sum_{a\in\mathcal{A}}\mathbb{P}(A_a) \quad\text{and}\quad \mathbb{P}(B)=\sum_{b\in\mathcal{B}}\mathbb{P}(B_b).$
Therefore,
(4.13) $|\mathbb{P}(A\cap B)-\mathbb{P}(A)\,\mathbb{P}(B)| \leqslant \sum_{a\in\mathcal{A},\,b\in\mathcal{B}}|\mathbb{P}(A_a\cap B_b)-\mathbb{P}(A_a)\,\mathbb{P}(B_b)| \leqslant 3\gamma\,|\mathcal{A}|\,|\mathcal{B}| \leqslant 3\gamma\,m^{2\binom{\ell}{d}}=\beta$
and the proof is completed. $\Box$

Proof of Corollary 4.4. It follows from Lemma 4.6 applied for "$\kappa=2^d$". $\Box$

4.3. Proof of Lemma 4.6.
The result follows from Theorem 4.2 and the inclusion–exclusion formula. Specifically, let $d,m,\kappa,\gamma$ be as in the statement of the lemma, and set $\gamma':=m^{-\kappa}\gamma$. By Theorem 4.2, there exist an integer $N\geqslant 2d$ and two constants $0<\eta,\vartheta\leqslant 1$ such that
(4.14) $\gamma_k(\eta,\vartheta,d,n)\leqslant\gamma'$
for every integer $n\geqslant N$ and every $k\in[\kappa]$. We claim that $N,\eta$ and $\vartheta$ are as desired.

Indeed, fix $n,\mathcal{X},\mathcal{S}$ and $X=\langle X_s : s\in\binom{[n]}{d}\rangle$. By (4.14) and Theorem 4.2, for every nonempty $\mathcal{F}^*\subseteq\binom{[n]}{d}$ with $|\mathcal{F}^*|\leqslant\kappa$ and every collection $(a_s)_{s\in\mathcal{F}^*}$ of elements of $\mathcal{S}$, we have
(4.15) $\Big|\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}^*}[X_s=a_s]\Big)-\prod_{s\in\mathcal{F}^*}\mathbb{P}\big([X_s=a_s]\big)\Big|\leqslant\gamma'.$

Let $\mathcal{F}$ be a nonempty subset of $\binom{[n]}{d}$ with $|\mathcal{F}|\leqslant\kappa$, and let $(a_s)_{s\in\mathcal{F}}$ be a collection of elements of $\mathcal{X}$. Set $\mathcal{F}':=\{s\in\mathcal{F} : a_s\in\mathcal{S}\}$ and $\mathcal{G}:=\mathcal{F}\setminus\mathcal{F}'$; observe that for every $t\in\mathcal{G}$ the events $\langle [X_t=a] : a\in\mathcal{S}\rangle$ are pairwise disjoint and, moreover,
(4.16) $[X_t=a_t]=\Big(\bigcup_{a\in\mathcal{S}}[X_t=a]\Big)^{\complement}.$
(For any event $E$, by $E^{\complement}$ we denote its complement.) Thus, for every $t\in\mathcal{G}$ we have $\mathbb{P}([X_t=a_t])=1-\sum_{a\in\mathcal{S}}\mathbb{P}([X_t=a])$ and, consequently,
(4.17) $\prod_{s\in\mathcal{F}}\mathbb{P}\big([X_s=a_s]\big) = \prod_{s\in\mathcal{F}'}\mathbb{P}\big([X_s=a_s]\big)\prod_{t\in\mathcal{G}}\Big(1-\sum_{a\in\mathcal{S}}\mathbb{P}\big([X_t=a]\big)\Big) = \sum_{\substack{\mathcal{W}\subseteq\mathcal{G}\\ a\colon\mathcal{W}\to\mathcal{S}}}(-1)^{|\mathcal{W}|}\prod_{t\in\mathcal{W}}\mathbb{P}\big([X_t=a(t)]\big)\prod_{s\in\mathcal{F}'}\mathbb{P}\big([X_s=a_s]\big)$
with the convention that the product over an empty index set is equal to 1.
On the other hand,
(4.18) $\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}}[X_s=a_s]\Big) \overset{(4.16)}{=} \mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\bigcap_{t\in\mathcal{G}}\Big(\bigcup_{a\in\mathcal{S}}[X_t=a]\Big)^{\complement}\Big) = \mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\Big)-\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\Big(\bigcup_{t\in\mathcal{G}}\bigcup_{a\in\mathcal{S}}[X_t=a]\Big)\Big).$
Next observe that for every nonempty subset $\mathcal{W}$ of $\mathcal{G}$ we have
(4.19) $\bigcap_{t\in\mathcal{W}}\bigcup_{a\in\mathcal{S}}[X_t=a]=\bigcup_{a\colon\mathcal{W}\to\mathcal{S}}\Big(\bigcap_{t\in\mathcal{W}}[X_t=a(t)]\Big)$
and the events $\big\langle\bigcap_{t\in\mathcal{W}}[X_t=a(t)] : a\colon\mathcal{W}\to\mathcal{S}\big\rangle$ are pairwise disjoint. Hence, by the inclusion–exclusion formula,
(4.20) $\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\Big(\bigcup_{t\in\mathcal{G}}\bigcup_{a\in\mathcal{S}}[X_t=a]\Big)\Big) = \sum_{\emptyset\neq\mathcal{W}\subseteq\mathcal{G}}(-1)^{|\mathcal{W}|-1}\,\mathbb{P}\Big(\bigcap_{t\in\mathcal{W}}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\Big(\bigcup_{a\in\mathcal{S}}[X_t=a]\Big)\Big)\Big) \overset{(4.19)}{=} \sum_{\emptyset\neq\mathcal{W}\subseteq\mathcal{G}}\sum_{a\colon\mathcal{W}\to\mathcal{S}}(-1)^{|\mathcal{W}|-1}\,\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\bigcap_{t\in\mathcal{W}}[X_t=a(t)]\Big).$
Combining identities (4.18) and (4.20), we see that
(4.21) $\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}}[X_s=a_s]\Big)=\sum_{\substack{\mathcal{W}\subseteq\mathcal{G}\\ a\colon\mathcal{W}\to\mathcal{S}}}(-1)^{|\mathcal{W}|}\,\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\bigcap_{t\in\mathcal{W}}[X_t=a(t)]\Big)$
with the convention that the intersection over an empty index set is equal to the whole sample space.
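The purely algebraic expansion behind (4.17) can be checked mechanically before it is combined with (4.21). The following sketch verifies it with arbitrary numbers $q_{t,a}$ in place of the probabilities $\mathbb{P}([X_t=a])$ (the concrete values are our own illustrative choice):

```python
import itertools, random

random.seed(0)
G = [0, 1, 2]            # stands for the index set G
S = ["a", "b"]           # stands for the set S
# arbitrary "probabilities" q[t][a] = P([X_t = a]) with sum over a < 1
q = {t: {a: random.uniform(0.0, 0.4) for a in S} for t in G}

# left-hand side of (4.17): prod_{t in G} (1 - sum_{a in S} q[t][a])
lhs = 1.0
for t in G:
    lhs *= 1.0 - sum(q[t][a] for a in S)

# right-hand side: sum over W subset of G and maps a: W -> S of
# (-1)^{|W|} * prod_{t in W} q[t][a(t)]
rhs = 0.0
for k in range(len(G) + 1):
    for W in itertools.combinations(G, k):
        for amap in itertools.product(S, repeat=len(W)):
            term = (-1) ** len(W)
            for t, a in zip(W, amap):
                term *= q[t][a]
            rhs += term

assert abs(lhs - rhs) < 1e-12
print("expansion (4.17) verified:", lhs)
```

The identity is just the distributive law: each factor contributes either $1$ or one of the terms $-q_{t,a}$, which is exactly a choice of $\mathcal{W}\subseteq\mathcal{G}$ and a map $a\colon\mathcal{W}\to\mathcal{S}$.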
Finally, by identities (4.17) and (4.21) and the triangle inequality, we conclude that the quantity $\big|\mathbb{P}\big(\bigcap_{s\in\mathcal{F}}[X_s=a_s]\big)-\prod_{s\in\mathcal{F}}\mathbb{P}([X_s=a_s])\big|$ is upper bounded by
(4.22) $\sum_{\substack{\mathcal{W}\subseteq\mathcal{G}\\ a\colon\mathcal{W}\to\mathcal{S}}}\Big|\mathbb{P}\Big(\bigcap_{s\in\mathcal{F}'}[X_s=a_s]\cap\bigcap_{t\in\mathcal{W}}[X_t=a(t)]\Big) - \prod_{s\in\mathcal{F}'}\mathbb{P}\big([X_s=a_s]\big)\prod_{t\in\mathcal{W}}\mathbb{P}\big([X_t=a(t)]\big)\Big| \overset{(4.15)}{\leqslant} m^{\kappa}\gamma'=\gamma.$
The proof of Lemma 4.6 is completed. $\Box$

5. Proof of Theorem 4.2
This section is devoted to the proof of Theorem 4.2, which proceeds by induction on the dimension $d$. In a nutshell, the argument is based on repeated averaging and an appropriate version of the weak law of large numbers in order to gradually upgrade the box independence condition. The combinatorial heart of the matter lies in the selection of this averaging.

5.1. Toolbox.
We begin by presenting three lemmas which are needed for the proof of Theorem 4.2 but are not directly related to the main argument.
Lemma 5.1.
Let $m$ be a positive integer, let $\delta>0$, and let $A_1,\dots,A_m$ be events in a probability space such that for every $i,j\in[m]$ with $i\neq j$ we have
(5.1) $\mathbb{P}(A_i\cap A_j)\leqslant\mathbb{P}(A_i)\,\mathbb{P}(A_j)+\delta.$
Then, setting $Z:=\frac{1}{m}\sum_{i=1}^{m}\mathbf{1}_{A_i}$, we have
(5.2) $\mathrm{Var}(Z)\leqslant\frac{1}{4m}+\delta.$

Proof.
We have
$\mathrm{Var}(Z)=\mathbb{E}\big[(Z-\mathbb{E}[Z])^2\big]=\frac{1}{m^2}\sum_{i,j\in[m]}\mathbb{E}\big[(\mathbf{1}_{A_i}-\mathbb{P}(A_i))(\mathbf{1}_{A_j}-\mathbb{P}(A_j))\big]=\frac{1}{m^2}\Big[\sum_{i=1}^{m}\big(\mathbb{P}(A_i)-\mathbb{P}(A_i)^2\big)+\sum_{\substack{i,j\in[m]\\ i\neq j}}\big(\mathbb{P}(A_i\cap A_j)-\mathbb{P}(A_i)\,\mathbb{P}(A_j)\big)\Big]\leqslant\frac{1}{4m}+\delta$
as desired. $\Box$

Lemma 5.2.
Let $m$ be a positive integer, let $\eta,\delta>0$, and let $E,A_1,\dots,A_m$ be events in a probability space such that for every $i,j\in[m]$ with $i\neq j$ we have
(i) $|\mathbb{P}(A_i)-\mathbb{P}(A_j)|\leqslant\eta$,
(ii) $|\mathbb{P}(E\cap A_i)-\mathbb{P}(E\cap A_j)|\leqslant\eta$, and
(iii) $\mathbb{P}(A_i\cap A_j)\leqslant\mathbb{P}(A_i)\,\mathbb{P}(A_j)+\delta$.
Then for every $i\in[m]$ we have
(5.3) $\big|\mathbb{P}(E\cap A_i)-\mathbb{P}(E)\,\mathbb{P}(A_i)\big|\leqslant 2\eta+\sqrt{\frac{1}{4m}+\delta}.$

Proof.
Set $Z:=\frac{1}{m}\sum_{j=1}^{m}\mathbf{1}_{A_j}$ and let $i\in[m]$. Notice that, by the triangle inequality,
(5.4) $\big|\mathbb{P}(E\cap A_i)-\mathbb{P}(E)\,\mathbb{P}(A_i)\big| = \big|\mathbb{E}[\mathbf{1}_E\mathbf{1}_{A_i}]-\mathbb{E}[\mathbf{1}_E\,\mathbb{P}(A_i)]\big| \leqslant \big|\mathbb{E}[\mathbf{1}_E\mathbf{1}_{A_i}]-\mathbb{E}[\mathbf{1}_E Z]\big| + \big|\mathbb{E}[\mathbf{1}_E Z]-\mathbb{E}[\mathbf{1}_E\,\mathbb{E}[Z]]\big| + \big|\mathbb{E}[\mathbf{1}_E\,\mathbb{E}[Z]]-\mathbb{E}[\mathbf{1}_E\,\mathbb{P}(A_i)]\big|.$
Invoking the triangle inequality again, we have
(5.5) $\big|\mathbb{E}[\mathbf{1}_E\mathbf{1}_{A_i}]-\mathbb{E}[\mathbf{1}_E Z]\big| \leqslant \frac{1}{m}\sum_{j=1}^{m}\big|\mathbb{P}(E\cap A_i)-\mathbb{P}(E\cap A_j)\big| \overset{\text{(ii)}}{\leqslant} \eta$
(5.6) $\big|\mathbb{E}[\mathbf{1}_E\,\mathbb{E}[Z]]-\mathbb{E}[\mathbf{1}_E\,\mathbb{P}(A_i)]\big| \leqslant \mathbb{P}(E)\,\frac{1}{m}\sum_{j=1}^{m}\big|\mathbb{P}(A_j)-\mathbb{P}(A_i)\big| \overset{\text{(i)}}{\leqslant} \eta.$
Finally, by the Cauchy–Schwarz inequality, hypothesis (iii) and Lemma 5.1,
(5.7) $\big|\mathbb{E}[\mathbf{1}_E Z]-\mathbb{E}[\mathbf{1}_E\,\mathbb{E}[Z]]\big| \leqslant \sqrt{\mathbb{P}(E)}\,\|Z-\mathbb{E}[Z]\|_{L_2} \leqslant \sqrt{\frac{1}{4m}+\delta}.$
The estimate (5.3) follows from (5.4)–(5.7). $\Box$
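Both (5.2) and (5.3) can be probed on a concrete exchangeable family: mix two coin biases, so that $\eta=0$ and the pairwise defect $\delta$ equals the variance of the random bias. The model below is our own illustrative choice, not taken from the paper.

```python
# Random bias theta in {0.3, 0.7}, each with probability 1/2; conditionally
# i.i.d. indicator events: E = [X_0 = 1] and A_j = [X_j = 1] for j in [m].
thetas = [0.3, 0.7]
m = 25

def p_ones(k):
    # P(any k distinct events occur simultaneously) = E[theta^k] (exchangeability)
    return sum(t ** k for t in thetas) / len(thetas)

P_A = p_ones(1)                      # = 0.5
delta = p_ones(2) - P_A ** 2         # pairwise defect in (5.1) = Var(theta)

# Lemma 5.1: for exchangeable events, Var(Z) with Z = (1/m) * sum of
# indicators is exactly P(1-P)/m + (1 - 1/m) * delta, at most 1/(4m) + delta.
var_Z = (P_A - P_A ** 2) / m + (1 - 1.0 / m) * delta
assert var_Z <= 1.0 / (4 * m) + delta + 1e-12

# Lemma 5.2 with eta = 0: |P(E ∩ A_i) - P(E) P(A_i)| <= sqrt(1/(4m) + delta)
lhs = abs(p_ones(2) - P_A * P_A)
bound = (1.0 / (4 * m) + delta) ** 0.5
assert lhs <= bound + 1e-12
print("delta =", delta, "; |P(E∩A_i)-P(E)P(A_i)| =", lhs, "<=", bound)
```

Here the left-hand side of (5.3) equals $\delta=0.04$ while the bound is roughly $0.22$; the point of the averaging in the proof is that the bound shrinks with $m$ once $\delta$ is small.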
Lemma 5.3.
Let $m\geqslant 2$ be an integer, let $\eta>0$, and let $(A_i)_{i=1}^m$ be an $\eta$-spreadable sequence of events in a probability space (that is, the random vector $(\mathbf{1}_{A_1},\dots,\mathbf{1}_{A_m})$ is $\eta$-spreadable according to Definition 1.3). Then for every $i,j\in[m]$ with $i\neq j$,
(5.8) $\mathbb{P}(A_i\cap A_j)\geqslant\mathbb{P}(A_i)\,\mathbb{P}(A_j)-\frac{1}{m}-3\eta.$

Proof.
Set $Z:=\frac{1}{m}\sum_{k=1}^{m}\mathbf{1}_{A_k}$ and fix $i,j\in[m]$ with $i\neq j$. Then we have
$\big|\mathbb{P}(A_i\cap A_j)-\mathbb{E}[Z^2]\big| = \Big|\mathbb{P}(A_i\cap A_j)-\frac{1}{m^2}\sum_{k=1}^{m}\mathbb{P}(A_k)-\frac{1}{m^2}\sum_{\substack{k,\ell\in[m]\\ k\neq\ell}}\mathbb{P}(A_k\cap A_\ell)\Big| \leqslant \frac{1}{m}+\eta.$
Also notice that
$\big|\mathbb{P}(A_i)\,\mathbb{P}(A_j)-\mathbb{E}[Z]^2\big| \leqslant \mathbb{E}[Z]\,\big|\mathbb{P}(A_j)-\mathbb{E}[Z]\big| + \mathbb{P}(A_j)\,\big|\mathbb{P}(A_i)-\mathbb{E}[Z]\big| \leqslant 2\eta.$
Since $\mathbb{E}[Z]^2\leqslant\mathbb{E}[Z^2]$, inequality (5.8) follows from the previous two estimates. $\Box$

5.2. Initializing various numerical parameters.
Our goal in this subsection is to define, by recursion on $d$, the numbers $\gamma_k(\eta,\vartheta,d,n)$ as well as some other numerical invariants which are needed for the proof of Theorem 4.2. (The reader is advised to skip this subsection at first reading.)

We start by setting
(5.9) $\gamma_k(\eta,\vartheta,1,n) := (3k-1)\,\eta + (k-1)\sqrt{\frac{1}{4\lfloor n/2\rfloor}+\vartheta}$
for every $0<\eta\leqslant 1$, every $\vartheta>0$, and every pair of positive integers $k,n$ with $n\geqslant 2$ and $k\leqslant\lfloor n/2\rfloor$. Next, let $d\geqslant 2$, and assume that the numbers $\gamma_k(\eta,\vartheta,d-1,n)$ have been defined for every choice of admissible parameters. Fix $0<\eta\leqslant 1$ and $\vartheta>0$, and let $n$ be an integer with $n\geqslant 2d$. We set
(5.10) $\vartheta_1(\eta,\vartheta,d,n) := (n-2d+2)^{-1/2} + (2^{d}+5)\sqrt{\eta} + \sqrt{\vartheta}$
(5.11) $\vartheta_2(\eta,\vartheta,d,n) := \frac{2^{d-1}}{n-d+1} + 3\cdot 2^{d}\,\eta + \vartheta$
(5.12) $\vartheta_3(\eta,\vartheta,d,n) := \frac{2^{d-1}}{(n-2d+2)^{1/2^{d-1}}} + (2^{d}+5)\,\eta^{1/2^{d-1}} + \vartheta^{1/2^{d-1}} + 3\eta.$
Next, for every positive integer $k$ with $k\leqslant\binom{\lfloor(n-1)/2\rfloor}{d-1}$ we set
(5.13) $\gamma^{(1)}_k(\eta,\vartheta,d,n) := \gamma_k\big(\eta,\vartheta_1(\eta,\vartheta,d,n),d-1,n-1\big) + (k+1)\eta$
(5.14) $\gamma^{(2)}_k(\eta,\vartheta,d,n) := \gamma_k\big(\eta,\vartheta_2(\eta,\vartheta,d,n),d-1,n-2\big)$
(5.15) $\gamma^{(3)}_k(\eta,\vartheta,d,n) := 2\gamma^{(1)}_k(\eta,\vartheta,d,n) + \gamma^{(2)}_k(\eta,\vartheta,d,n) + k\,\vartheta_3(\eta,\vartheta,d,n)$
(5.16) $\gamma^{(4)}_k(\eta,\vartheta,d,n) := \big(\gamma^{(3)}_k(\eta,\vartheta,d,n) + \lfloor n/4\rfloor^{-1} + (2k+1)\eta\big)^{1/2} + 2\eta.$
Moreover, for every positive integer $u$ with $u\leqslant n/4$ and every choice $k_1,\dots,k_u$ of positive integers with $k_1,\dots,k_u\leqslant\binom{\lfloor(n-1)/2\rfloor}{d-1}$, set
(5.17) $\gamma^{(5)}\big(\eta,\vartheta,d,n,(k_i)_{i=1}^u\big) := \gamma^{(1)}_{k_1}(\eta,\vartheta,d,n) + \sum_{i=2}^{u}\big(\gamma^{(1)}_{k_i}(\eta,\vartheta,d,n)+\gamma^{(4)}_{k_i}(\eta,\vartheta,d,n)\big)$
with the convention that the sum in (5.17) is equal to 0 if $u=1$. Finally, for every positive integer $k$ with $k\leqslant\binom{\lfloor n/2\rfloor}{d}$ we define
(5.18) $\gamma_k(\eta,\vartheta,d,n) := (k+1)\eta + \max\big\{\gamma^{(5)}\big(\eta,\vartheta,d,n,(k_i)_{i=1}^u\big)\big\}$
where the above maximum is taken over all choices of positive integers $u,k_1,\dots,k_u$ satisfying $u\leqslant n/4-d$, $k_1,\dots,k_u\leqslant\binom{\lfloor(n-1)/2\rfloor}{d-1}$ and $k_1+\dots+k_u=k$.

5.3. The inductive hypothesis.
For every positive integer $d$, by P($d$) we shall denote the following statement.

Let $n\geqslant 2d$ be an integer, let $0<\eta\leqslant 1$, let $\vartheta>0$, let $\mathcal{X}$ be a nonempty finite set, and let $\mathcal{S}$ be a nonempty subset of $\mathcal{X}$. Set $\ell:=\binom{\lfloor n/2\rfloor}{d}$ and let $\boldsymbol{\gamma}=(\gamma_k(\eta,\vartheta,d,n))_{k=1}^{\ell}$ be as in Subsection 5.2. Let $X$ be an $\mathcal{X}$-valued, $\eta$-spreadable, $d$-dimensional random array on $[n]$. If $X$ is $(\vartheta,\mathcal{S})$-box independent, then $X$ is $(\boldsymbol{\gamma},\mathcal{S})$-independent.

It is clear that Theorem 4.2 is equivalent to the validity of P($d$) for every integer $d\geqslant 1$.

5.4. The base case "$d=1$". The initial step of the induction follows from the following lemma.
Lemma 5.4.
Let $n,\eta,\vartheta,\mathcal{X}$ and $\mathcal{S}$ be as in the statement of P(1), and assume that $X=(X_1,\dots,X_n)$ is an $\mathcal{X}$-valued, $\eta$-spreadable random vector. Assume, moreover, that for every $i,j\in[n]$ with $i\neq j$ and every $a\in\mathcal{S}$ we have
(5.19) $\mathbb{P}\big([X_i=a]\cap[X_j=a]\big) \leqslant \mathbb{P}\big([X_i=a]\big)\,\mathbb{P}\big([X_j=a]\big)+\vartheta.$
Then for every nonempty
$F\subseteq[n]$ with $|F|\leqslant n/2$ and every collection $(a_i)_{i\in F}$ of elements of $\mathcal{S}$, we have
(5.20) $\Big|\mathbb{P}\Big(\bigcap_{i\in F}[X_i=a_i]\Big)-\prod_{i\in F}\mathbb{P}\big([X_i=a_i]\big)\Big|\leqslant\gamma_{|F|}(\eta,\vartheta,1,n)$
where $(\gamma_k(\eta,\vartheta,1,n))_{k=1}^{\lfloor n/2\rfloor}$ is as in (5.9). In particular, P(1) holds true.

Proof.
Observe that, by the $\eta$-spreadability of $X$, it is enough to show that for every $k\in\{1,\dots,\lfloor n/2\rfloor\}$ and every $a_1,\dots,a_k\in\mathcal{S}$ we have
(5.21) $\Big|\mathbb{P}\Big(\bigcap_{i=1}^{k}[X_i=a_i]\Big)-\prod_{i=1}^{k}\mathbb{P}\big([X_i=a_i]\big)\Big|\leqslant(k-1)\Big(2\eta+\sqrt{\frac{1}{4\lfloor n/2\rfloor}+\vartheta}\Big).$
To this end, we proceed by induction on $k$. The case "$k=1$" is straightforward. Let $k$ be a positive integer with $k<\lfloor n/2\rfloor$, and assume that (5.21) has been verified up to $k$. Fix $a_1,\dots,a_{k+1}\in\mathcal{S}$. Set $m:=\lfloor n/2\rfloor$ and $E:=\bigcap_{i=1}^{k}[X_i=a_i]$. Also set $A_j:=[X_{k+j}=a_{k+1}]$ for every $j\in[m]$. Using the $\eta$-spreadability of $X$, for every $j,j'\in[m]$ with $j\neq j'$ we have
(i) $|\mathbb{P}(A_j)-\mathbb{P}(A_{j'})|\leqslant\eta$, and
(ii) $|\mathbb{P}(E\cap A_j)-\mathbb{P}(E\cap A_{j'})|\leqslant\eta$.
Moreover, since $a_{k+1}\in\mathcal{S}$, we have
(iii) $\mathbb{P}(A_j\cap A_{j'})\leqslant\mathbb{P}(A_j)\,\mathbb{P}(A_{j'})+\vartheta$.
Applying Lemma 5.2 for "$\delta=\vartheta$" and using the definition of $A_1$, we see that
(5.22) $\big|\mathbb{P}\big(E\cap[X_{k+1}=a_{k+1}]\big)-\mathbb{P}(E)\,\mathbb{P}\big([X_{k+1}=a_{k+1}]\big)\big| \leqslant 2\eta+\sqrt{\frac{1}{4m}+\vartheta}.$
On the other hand, by our inductive assumptions, we have
(5.23) $\Big|\mathbb{P}(E)-\prod_{j=1}^{k}\mathbb{P}\big([X_j=a_j]\big)\Big| \leqslant (k-1)\Big(2\eta+\sqrt{\frac{1}{4m}+\vartheta}\Big).$
Combining (5.22) and (5.23), we see that (5.21) is satisfied, as desired. $\Box$
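A two-point mixture of coin biases gives a quantitative feel for the base case: the resulting $0/1$ sequence is exchangeable (so $\eta=0$), its pairwise defect in (5.19) is $\vartheta=\mathrm{Var}(\theta)$, and the $k$-fold defect is $\mathbb{E}[\theta^k]-\mathbb{E}[\theta]^k$. The sketch below checks this against the bound of (5.21) in the form $(k-1)\big(2\eta+\sqrt{1/(4m)+\vartheta}\big)$ used here; the model and parameters are our own illustrative choices.

```python
# Random bias theta in {0.3, 0.7}, each with probability 1/2, then
# conditionally i.i.d. fair-or-biased coins X_1, X_2, ... (exchangeable).
thetas = [0.3, 0.7]
n = 20
m = n // 2               # as in the proof of Lemma 5.4

def E_theta_pow(k):
    return sum(t ** k for t in thetas) / len(thetas)

vartheta = E_theta_pow(2) - E_theta_pow(1) ** 2    # pairwise defect (5.19)

for k in range(1, m + 1):
    # |P(X_1 = ... = X_k = 1) - prod_i P(X_i = 1)| = E[theta^k] - E[theta]^k
    defect = abs(E_theta_pow(k) - E_theta_pow(1) ** k)
    bound = (k - 1) * (2 * 0 + (1.0 / (4 * m) + vartheta) ** 0.5)  # eta = 0
    assert defect <= bound + 1e-12
print("base-case bound verified for k = 1 ..", m)
```

Note that the defect here does not vanish as $n$ grows (the mixture is genuinely dependent), which is consistent with the theorem: the pairwise hypothesis only gives smallness when $\vartheta$ itself is small.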
5.5. The general inductive step.
We now enter the main part of the proof of Theorem 4.2.
Specifically, fix an integer $d\geqslant 2$. Throughout this subsection, we will assume that P($d-1$) has been proved.

5.5.1. Step 1: preparatory lemmas.
Our goal in this step is to prove two probabilistic lemmas which will be used in the third and fourth steps of the proof, respectively. Strictly speaking, these lemmas are not part of the proof of P($d$), since their proofs do not use the inductive assumptions. (In particular, this subsection can be read independently.)

The first lemma essentially shows that the reverse inequality in (4.3) always holds true.

Lemma 5.5.
Let $n$ be an integer with $n\geqslant 2d$, let $0<\eta\leqslant 1$, let $\mathcal{X}$ be a nonempty finite set, and let $X=\langle X_s : s\in\binom{[n]}{d}\rangle$ be an $\mathcal{X}$-valued, $\eta$-spreadable, $d$-dimensional random array on $[n]$. Then for every $t\in\binom{[n-2]}{d-1}$ and every $a\in\mathcal{X}$ we have
(5.24) $\mathbb{P}\big([X_{t\cup\{n-1\}}=a]\big)\,\mathbb{P}\big([X_{t\cup\{n\}}=a]\big) \leqslant \mathbb{P}\big([X_{t\cup\{n-1\}}=a]\cap[X_{t\cup\{n\}}=a]\big) + \frac{1}{n-d+1} + 6\eta.$

Proof.
Fix $t\in\binom{[n-2]}{d-1}$ and $a\in\mathcal{X}$. Set $t_0:=[d-1]$ and $A_i:=[X_{t_0\cup\{d-1+i\}}=a]$ for every $i\in[n-d+1]$. Observe that the sequence $(A_1,\dots,A_{n-d+1})$ is $\eta$-spreadable; recall that this means that the random vector $(\mathbf{1}_{A_1},\dots,\mathbf{1}_{A_{n-d+1}})$ is $\eta$-spreadable. By Lemma 5.3, we obtain that
(5.25) $\mathbb{P}(A_1)\,\mathbb{P}(A_2) \leqslant \mathbb{P}(A_1\cap A_2) + \frac{1}{n-d+1} + 3\eta.$
By (5.25) and the $\eta$-spreadability of $X$, the estimate (5.24) follows. $\Box$
The second lemma shows that the box independence condition (4.3) is inherited by the two-dimensional faces of $d$-dimensional boxes.

Lemma 5.6.
Let $n$ be an integer with $n\geqslant 2d$, let $0<\eta\leqslant 1$, let $\vartheta>0$, let $\mathcal{X}$ be a nonempty finite set, let $\mathcal{S}$ be a nonempty subset of $\mathcal{X}$, and let $X=\langle X_s : s\in\binom{[n]}{d}\rangle$ be an $\mathcal{X}$-valued, $\eta$-spreadable, $d$-dimensional random array on $[n]$ which is $(\vartheta,\mathcal{S})$-box independent. Then for every $t\in\binom{[n-2]}{d-1}$ and every $a\in\mathcal{S}$ we have
(5.26) $\mathbb{P}\big([X_{t\cup\{n-1\}}=a]\cap[X_{t\cup\{n\}}=a]\big) \leqslant \mathbb{P}\big([X_{t\cup\{n-1\}}=a]\big)\,\mathbb{P}\big([X_{t\cup\{n\}}=a]\big) + \vartheta_3(\eta,\vartheta,d,n)$
where $\vartheta_3(\eta,\vartheta,d,n)$ is as defined in (5.12).

Proof. Fix $a\in\mathcal{S}$. For every $i\in[d]$ set $t_{i-1}:=[i-1]$ (where, by convention, $[0]=\emptyset$) and $H_i:=\{n-2d+2i-1,\,n-2d+2i\}$. We define, recursively, a finite sequence $(\vartheta_r)_{r=0}^{d-1}$ by setting $\vartheta_0=\vartheta$ and
(5.27) $\vartheta_{r+1}=\Big(\frac{1}{n-2d+r+2}+(2^{d-r}+5)\eta+\vartheta_r\Big)^{1/2}.$
By induction on $r\in\{0,\dots,d-1\}$, we will show that
(5.28) $\mathbb{P}\Big(\bigcap_{v\in B_r}[X_{t_r\cup v}=a]\Big) \leqslant \prod_{v\in B_r}\mathbb{P}\big([X_{t_r\cup v}=a]\big) + \vartheta_r$
where $B_r:=\mathrm{Box}\big((H_{r+1},\dots,H_d)\big)$ is the $(d-r)$-dimensional box determined by the sequence $(H_{r+1},\dots,H_d)$. (See (4.1).) The case "$r=0$" follows from the fact that the random array $X$ is $(\vartheta,\mathcal{S})$-box independent. Next, let $r\in\{0,\dots,d-2\}$ and assume that (5.28) has been proved up to $r$. For every $j\in[n-2d+r+2]$ set
(5.29) $A_j:=\bigcap_{v\in B_{r+1}}[X_{t_r\cup\{r+j\}\cup v}=a].$
Since $X$ is $\eta$-spreadable, the sequence $(A_1,\dots,A_{n-2d+r+2})$ is $\eta$-spreadable. Using this observation and the inductive assumptions, we see that
(5.30) $\mathbb{P}(A_1\cap A_2) \leqslant \mathbb{P}(A_{n-2d+r+1}\cap A_{n-2d+r+2}) + \eta = \mathbb{P}\Big(\bigcap_{v\in B_r}[X_{t_r\cup v}=a]\Big) + \eta \leqslant \prod_{v\in B_r}\mathbb{P}\big([X_{t_r\cup v}=a]\big) + \eta + \vartheta_r.$
On the other hand, since $X$ is $\eta$-spreadable, we have
(5.31) $\prod_{v\in B_r}\mathbb{P}\big([X_{t_r\cup v}=a]\big) \leqslant \Big(\prod_{v\in B_{r+1}}\mathbb{P}\big([X_{t_{r+1}\cup v}=a]\big)\Big)^2 + 2^{d-r}\eta.$
Moreover, by Lemma 5.3 applied to the $\eta$-spreadable sequence $(A_j)_{j=1}^{n-2d+r+2}$,
(5.32) $\mathbb{P}(A_1\cap A_2) \geqslant \mathbb{P}(A_1)\,\mathbb{P}(A_2) - \frac{1}{n-2d+r+2} - 3\eta.$
By (5.30)–(5.32), and using the $\eta$-spreadability of the sequence $(A_j)_{j=1}^{n-2d+r+2}$ once again, we obtain that
(5.33) $\mathbb{P}\Big(\bigcap_{v\in B_{r+1}}[X_{t_{r+1}\cup v}=a]\Big)^2 = \mathbb{P}(A_1)^2 \leqslant \mathbb{P}(A_1)\,\mathbb{P}(A_2) + \eta \leqslant \Big(\prod_{v\in B_{r+1}}\mathbb{P}\big([X_{t_{r+1}\cup v}=a]\big)\Big)^2 + \vartheta_{r+1}^2.$
Taking square roots, this estimate completes the inductive proof of (5.28).

Now notice that
(5.34) $\vartheta_{d-1} \leqslant \vartheta^{1/2^{d-1}} + \sum_{j=1}^{d-1}\Big[(n-2d+2)^{-1/2^{j}} + \big((2^{d}+5)\eta\big)^{1/2^{j}}\Big] \leqslant \frac{2^{d-1}}{(n-2d+2)^{1/2^{d-1}}} + (2^{d}+5)\,\eta^{1/2^{d-1}} + \vartheta^{1/2^{d-1}}.$
Setting $s_1:=[d-1]\cup\{n-1\}$ and $s_2:=[d-1]\cup\{n\}$, by (5.28) and (5.34), we have
(5.35) $\mathbb{P}\big([X_{s_1}=a]\cap[X_{s_2}=a]\big) \leqslant \mathbb{P}\big([X_{s_1}=a]\big)\,\mathbb{P}\big([X_{s_2}=a]\big) + \frac{2^{d-1}}{(n-2d+2)^{1/2^{d-1}}} + (2^{d}+5)\,\eta^{1/2^{d-1}} + \vartheta^{1/2^{d-1}}.$
Taking into account the $\eta$-spreadability of $X$ and the definition of $\vartheta_3(\eta,\vartheta,d,n)$, the estimate (5.26) follows from (5.35). $\Box$

5.5.2. Step 2: rewriting the inductive assumptions.
We proceed with the following lemma, which will enable us to use P($d-1$) in a more convenient form.
Lemma 5.7.
Let $n,\eta,\vartheta,\mathcal{X},\mathcal{S}$ be as in the statement of P($d$), and let $X=\langle X_s : s\in\binom{[n]}{d}\rangle$ be an $\mathcal{X}$-valued, $\eta$-spreadable, $d$-dimensional random array on $[n]$ which is $(\vartheta,\mathcal{S})$-box independent. We define $\widetilde{X}=\langle\widetilde{X}_t : t\in\binom{[n-1]}{d-1}\rangle$ by setting
(5.36) $\widetilde{X}_t := X_{t\cup\{n\}}.$
Then the random array $\widetilde{X}$ is $\mathcal{X}$-valued, $\eta$-spreadable and $(\vartheta_1(\eta,\vartheta,d,n),\mathcal{S})$-box independent, where $\vartheta_1(\eta,\vartheta,d,n)$ is as in (5.10).

Proof. Since $X$ is $\mathcal{X}$-valued and $\eta$-spreadable, by (5.36), we see that these properties are inherited by $\widetilde{X}$. Thus, we only need to check that $\widetilde{X}$ is $(\vartheta_1(\eta,\vartheta,d,n),\mathcal{S})$-box independent.

To this end, fix $a\in\mathcal{S}$ and a finite sequence $H=(H_1,\dots,H_{d-1})$ of 2-element subsets of $[n-1]$ with $\max(H_i)<\min(H_{i+1})$ for all $i\in[d-2]$, and let $B:=\mathrm{Box}(H)$ denote the $(d-1)$-dimensional box determined by $H$. Moreover, set $B_0:=\mathrm{Box}\big((\{1,2\},\dots,\{2d-3,2d-2\})\big)$ and $A_r:=\bigcap_{t\in B_0}[X_{t\cup\{2d-2+r\}}=a]$ for every $r\in[n-2d+2]$. Notice that the sequence $(A_1,\dots,A_{n-2d+2})$ is $\eta$-spreadable. Therefore, by Lemma 5.3,
(5.37) $\mathbb{P}(A_1)^2 \leqslant \mathbb{P}(A_1)\,\mathbb{P}(A_2) + \eta \leqslant \mathbb{P}(A_1\cap A_2) + \frac{1}{n-2d+2} + 4\eta.$
Next, set $B':=\mathrm{Box}\big((\{1,2\},\dots,\{2d-3,2d-2\},\{2d-1,2d\})\big)$, and observe that $B'$ is a $d$-dimensional box and $A_1\cap A_2=\bigcap_{s\in B'}[X_s=a]$. Since $X$ is $(\vartheta,\mathcal{S})$-box independent and $a\in\mathcal{S}$, we see that
(5.38) $\mathbb{P}(A_1\cap A_2) \leqslant \prod_{s\in B'}\mathbb{P}\big([X_s=a]\big) + \vartheta = \Big(\prod_{t\in B_0}\mathbb{P}\big([X_{t\cup\{2d-1\}}=a]\big)\Big)\Big(\prod_{t\in B_0}\mathbb{P}\big([X_{t\cup\{2d\}}=a]\big)\Big) + \vartheta \leqslant \Big(\prod_{t\in B_0}\mathbb{P}\big([X_{t\cup\{2d-1\}}=a]\big)\Big)^2 + 2^{d-1}\eta + \vartheta$
where the last inequality follows from the $\eta$-spreadability of $X$. By (5.37), (5.38) and the definition of $A_1$, we obtain
(5.39) $\mathbb{P}\Big(\bigcap_{t\in B_0}[X_{t\cup\{2d-1\}}=a]\Big) \leqslant \prod_{t\in B_0}\mathbb{P}\big([X_{t\cup\{2d-1\}}=a]\big) + \Big(\frac{1}{n-2d+2}+(2^{d-1}+4)\eta+\vartheta\Big)^{1/2} \leqslant \prod_{t\in B_0}\mathbb{P}\big([X_{t\cup\{2d-1\}}=a]\big) + (n-2d+2)^{-1/2} + (2^{d-1}+4)\sqrt{\eta} + \sqrt{\vartheta}.$
On the other hand, using the $\eta$-spreadability of $X$, we have
(5.40) $\Big|\mathbb{P}\Big(\bigcap_{t\in B_0}[X_{t\cup\{2d-1\}}=a]\Big) - \mathbb{P}\Big(\bigcap_{t\in B}[\widetilde{X}_t=a]\Big)\Big| \leqslant \eta \leqslant \sqrt{\eta}$
(5.41) $\Big|\prod_{t\in B_0}\mathbb{P}\big([X_{t\cup\{2d-1\}}=a]\big) - \prod_{t\in B}\mathbb{P}\big([\widetilde{X}_t=a]\big)\Big| \leqslant 2^{d-1}\eta \leqslant 2^{d-1}\sqrt{\eta}.$
Combining (5.39)–(5.41) and invoking the definition of $\vartheta_1(\eta,\vartheta,d,n)$ in (5.10), we conclude that
(5.42) $\mathbb{P}\Big(\bigcap_{t\in B}[\widetilde{X}_t=a]\Big) \leqslant \prod_{t\in B}\mathbb{P}\big([\widetilde{X}_t=a]\big) + \vartheta_1(\eta,\vartheta,d,n).$
Since $a$ and $B$ were arbitrary, the result follows. $\Box$

By Lemma 5.7 and P($d-1$), we obtain the following corollary.

Corollary 5.8.
Let n, η, ϑ, X , S , X be as in Lemma . Then for every nonempty subset G of (cid:0) [ n − d − (cid:1) with | ∪ G| (cid:54) ( n − / , every collection ( a t ) t ∈G of elements of S , and every r ∈ [ n ] with r > max( ∪G ) we have (5.43) (cid:12)(cid:12)(cid:12) P (cid:16) (cid:92) t ∈G [ X t ∪{ r } = a t ] (cid:17) − (cid:89) t ∈G P (cid:0) [ X t ∪{ r } = a t ] (cid:1)(cid:12)(cid:12)(cid:12) (cid:54) γ (1) |G| ( η, ϑ, d, n ) where γ (1) |G| ( η, ϑ, d, n ) is as in (5.13) . Step 3: doubling.
The following lemma complements Lemma 5.7. It is also basedon the inductive hypothesis P( d − Lemma 5.9 (Doubling) . Let n, η, ϑ, X , S be as in the statement of P( d ) , and assumethat X = (cid:104) X s : s ∈ (cid:0) [ n ] d (cid:1) (cid:105) is an X -valued, η -spreadable, d -dimensional random arrayon [ n ] which is ( ϑ, S ) -box independent. We define a ( d − -dimensional random array (cid:102) X (cid:48) = (cid:104) (cid:101) X (cid:48) t : t ∈ (cid:0) [ n − d − (cid:1) (cid:105) by setting (5.44) (cid:101) X (cid:48) t := ( X t ∪{ n − } , X t ∪{ n } ) . Then (cid:102) X (cid:48) is ( X × X ) -valued, η -spreadable and ( ϑ ( η, ϑ, d, n ) , { ( a, a ) : a ∈ S} ) -box inde-pendent, where ϑ ( η, ϑ, d, n ) is as defined in (5.11) .Proof. It is clear that (cid:102) X (cid:48) is ( X × X )-valued and η -spreadable. So, we only need to showthat (cid:102) X (cid:48) is ( ϑ ( η, ϑ, d, n ) , { ( a, a ) : a ∈ S} )-box independent.Let H , . . . , H d − be 2-element subsets of [ n −
2] with max( H i ) < min( H i +1 ) for all i ∈ [ d − a ∈ S . Set (cid:101) B := Box (cid:0) ( H , . . . , H d − ) (cid:1) ; also set H d := { n − , n } and B := Box (cid:0) ( H , . . . , H d − , H d ) (cid:1) . Since X is ( ϑ, S )-box independent, we see that P (cid:16) (cid:92) t ∈ (cid:101) B [ (cid:101) X (cid:48) t = ( a, a )] (cid:17) = P (cid:16) (cid:92) t ∈ (cid:101) B (cid:0) [ X t ∪{ n − } = a ] ∩ [ X t ∪{ n } = a ] (cid:1)(cid:17) (5.45) = P (cid:16) (cid:92) s ∈ B [ X s = a ] (cid:17) (cid:54) (cid:89) s ∈ B P (cid:0) [ X s = a ] (cid:1) + ϑ. By Lemma 5.5, we have (cid:89) s ∈ B P (cid:0) [ X s = a ] (cid:1) = (cid:89) t ∈ (cid:101) B P (cid:0) [ X t ∪{ n − } = a ] (cid:1) P (cid:0) [ X t ∪{ n } = a ] (cid:1) (5.46) (cid:54) (cid:89) t ∈ (cid:101) B P (cid:0) [ X t ∪{ n − } = a ] ∩ [ X t ∪{ n } = a ] (cid:1) + 2 d − n − d + 1 + 2 d − η = (cid:89) t ∈ (cid:101) B P (cid:0) [ (cid:101) X (cid:48) t = ( a, a )] (cid:1) + 2 d − n − d + 1 + 2 d η. By (5.45) and (5.46) and the definition of ϑ ( η, ϑ, d, n ), the result follows. (cid:3) The following corollary—which is an immediate consequence of Lemma 5.9 and theinductive assumption P( d − Corollary 5.10.
Let n, η, ϑ, X , S , X , (cid:102) X (cid:48) be as in Lemma . Then the random ar-ray (cid:102) X (cid:48) is (cid:0) ( γ (2) k ( η, ϑ, d, n )) (cid:96)k =1 , { ( a, a ) : a ∈ S} (cid:1) -independent, where (cid:96) = (cid:0) (cid:98) ( n − / (cid:99) d − (cid:1) and ( γ (2) k ( η, ϑ, d, n )) (cid:96)k =1 is as in (5.14) . ONCENTRATION ESTIMATES FOR FUNCTIONS OF HIGH-DIMENSIONAL ARRAYS 27
Step 4: gluing.
This is the main step of the proof. Specifically, our goal is to prove the following proposition.
Proposition 5.11 (Gluing) . Let n (cid:62) d + 2 be an integer, let η, ϑ, X , S be as in thestatement of P( d ) , and assume that X = (cid:104) X s : s ∈ (cid:0) [ n ] d (cid:1) (cid:105) is an X -valued, η -spreadable, d -dimensional random array on [ n ] which is ( ϑ, S ) -box independent. Finally, let r be aninteger with d < r (cid:54) n/ , let G be a nonempty subset of (cid:0) [ r − d − (cid:1) , let ( a t ) t ∈G be a collectionof elements of S , let F be a nonempty subset of (cid:0) [ r − d (cid:1) , and let ( b s ) s ∈F be a collection ofelements of S . Then we have (cid:12)(cid:12)(cid:12) P (cid:16) (cid:92) s ∈F [ X s = b s ] ∩ (cid:92) t ∈G [ X t ∪{ r } = a t ] (cid:17) − (5.47) − P (cid:16) (cid:92) s ∈F [ X s = b s ] (cid:17) P (cid:16) (cid:92) t ∈G [ X t ∪{ r } = a t ] (cid:17)(cid:12)(cid:12)(cid:12) (cid:54) γ (4) | G | ( η, ϑ, d, n ) where γ (4) |G| ( η, ϑ, d, n ) is as in (5.16) . Proposition 5.11 follows by carefully selecting a sequence of events, and then applyingthe averaging argument presented in Lemma 5.2. In order to do so, we need to control thevariances of the corresponding averages. This is, essentially, the content of the followinglemma.
Lemma 5.12 (Variance estimate) . Let n, η, ϑ, X , S be as in the statement of P( d ) , andassume that X = (cid:104) X s : s ∈ (cid:0) [ n ] d (cid:1) (cid:105) is an X -valued, η -spreadable, d -dimensional randomarray on [ n ] which is ( ϑ, S ) -box independent. Then for every nonempty subset G of (cid:0) [ n − d − (cid:1) with | ∪ G| (cid:54) ( n − / , and every collection ( a t ) t ∈G of elements of S we have P (cid:16) (cid:92) t ∈G [ X t ∪{ n − } = a t ] ∩ (cid:92) t ∈G [ X t ∪{ n } = a t ] (cid:17) (5.48) (cid:54) P (cid:16) (cid:92) t ∈G [ X t ∪{ n − } = a t ] (cid:17) P (cid:16) (cid:92) t ∈G [ X t ∪{ n } = a t ] (cid:17) + γ (3) |G| ( η, ϑ, d, n ) where γ (3) |G| ( η, ϑ, d, n ) is as in (5.15) .Proof. Let G be a subset of (cid:0) [ n − d − (cid:1) with | ∪ G| (cid:54) ( n − /
2, and let ( a t ) t ∈G be a collectionof elements of S . By Corollary 5.10, we have P (cid:16) (cid:92) t ∈G [ X t ∪{ n − } = a t ] ∩ (cid:92) t ∈G [ X t ∪{ n } = a t ] (cid:17) (5.49) (cid:54) (cid:89) t ∈G P (cid:0) [ X t ∪{ n − } = a t ] ∩ [ X t ∪{ n } = a t ] (cid:1) + γ (2) |G| ( η, ϑ, d, n ) . Moreover, by Lemma 5.6, (cid:89) t ∈G P (cid:0) [ X t ∪{ n − } = a t ] ∩ [ X t ∪{ n } = a t ] (cid:1) (5.50) (cid:54) (cid:89) t ∈G P (cid:0) [ X t ∪{ n − } = a t ] (cid:1) P (cid:0) [ X t ∪{ n } = a t ] (cid:1) + |G| ϑ ( η, ϑ, d, n ) . Finally, by Corollary 5.8, we see that (cid:89) t ∈G P (cid:0) [ X t ∪{ n − } = a t ] (cid:1) (cid:54) P (cid:16) (cid:92) t ∈G [ X t ∪{ n − } = a t ] (cid:17) + γ (1) |G| ( η, ϑ, d, n ) , (5.51) (cid:89) t ∈G P (cid:0) [ X t ∪{ n } = a t ] (cid:1) (cid:54) P (cid:16) (cid:92) t ∈G [ X t ∪{ n } = a t ] (cid:17) + γ (1) |G| ( η, ϑ, d, n ) . (5.52)The estimate (5.48) follows by combining (5.49)–(5.52) and invoking the definition of theconstant ( γ (3) |G| ( η, ϑ, d, n ) in (5.15). (cid:3) We are now ready to give the proof of Proposition 5.11.
Proof of Proposition . Set E := (cid:84) s ∈F [ X s = b s ] and A i := (cid:84) t ∈G [ X t ∪{ r − i } ] for every i ∈ { , . . . , (cid:98) n/ (cid:99)} . Since X is η -spreadable, for every i, j ∈ { , . . . , (cid:98) n/ (cid:99)} with i (cid:54) = j wehave(i) | P ( A i ) − P ( A j ) | (cid:54) η , and(ii) | P ( E ∩ A i ) − P ( E ∩ A j ) | (cid:54) η .Moreover, applying Lemma 5.12 and using the η -spreadability of X again, for every i, j ∈ { , . . . , (cid:98) n/ (cid:99)} with i (cid:54) = j we have(iii) P ( A i ∩ A j ) (cid:54) P ( A i ) P ( A j ) + γ (3) |G| ( η, ϑ, d, n ) + (2 |G| + 1) η .By Lemma 5.2 applied for “ δ = γ (3) |G| ( η, ϑ, d, n ) + (2 |G| + 1) η ” and taking into account thedefinition of the constant γ (4) |G| ( η, ϑ, d, n ), we conclude that (5.47) is satisfied. (cid:3) Step 5: completion of the proof.
This is the last step of the proof. Recall that weneed to prove that the statement P( d ) holds true, or equivalently, that the estimate (4.4) issatisfied for the sequence γ = ( γ k ( η, ϑ, d, n )) (cid:96)k =1 defined in Subsection 5.2. As expected,the verification of this estimate will be reduced to Proposition 5.11. To this end, wewill decompose an arbitrary nonempty subset F of (cid:0) [ n ] d (cid:1) into several components whichare easier to handle. The details of this decomposition are presented in the followingdefinition. Definition 5.13 (Slicing profile) . Let n, d be positive integers with n (cid:62) d and let F be anonempty subset of (cid:0) [ n ] d (cid:1) . There exist, unique, • u ∈ [ n ] , • r , . . . , r u ∈ [ n ] with d (cid:54) r < · · · < r u , and • for every i ∈ [ u ] a nonempty subset G i of (cid:0) [ r i − d − (cid:1) ,such that (5.53) F = u (cid:91) i =1 (cid:8) t ∪ { r i } : t ∈ G i (cid:9) . We refer to the triple ( u, ( r i ) ui =1 , ( G i ) ui =1 ) as the slicing of F , and to the sequence ( |G i | ) ui =1 as the slicing profile of F . Finally, we denote by SP( n ) the set of all nonempty finitesequences ( k i ) ui =1 which are the slicing profile of some nonempty subset F of (cid:0) [ n ] d (cid:1) . ONCENTRATION ESTIMATES FOR FUNCTIONS OF HIGH-DIMENSIONAL ARRAYS 29
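In computational terms, Definition 5.13 groups each s ∈ F by its maximum r = max(s), and G_i collects the (d − 1)-element remainders; the slicing profile records the sizes |G_i|. A minimal sketch (the helper `slicing` is ours, not from the paper):

```python
from collections import defaultdict

def slicing(F):
    """Return the slicing ((r_1 < ... < r_u), (G_1, ..., G_u)) of a family F
    of d-element sets: group each s in F by r = max(s); G_i collects the
    (d-1)-element remainders s \\ {r_i}.  The slicing profile is the
    sequence of sizes |G_1|, ..., |G_u|."""
    groups = defaultdict(set)
    for s in F:
        r = max(s)
        groups[r].add(frozenset(s) - {r})
    rs = sorted(groups)
    return rs, [groups[r] for r in rs]

# the family F from the example below (d = 2, n = 6)
F = [{1, 3}, {2, 3}, {1, 5}, {4, 5}, {2, 6}]
rs, Gs = slicing(F)
print(rs)                    # [3, 5, 6]
print([len(G) for G in Gs])  # the slicing profile: [2, 2, 1]
```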
Example 5.14. Let d = 2, n = 6, and let F be the family of 2-element subsets of [6] defined by F := { {1,3}, {2,3}, {1,5}, {4,5}, {2,6} }.

Figure 3. The slicing profile of F.

Then the slicing of F is the triple (3, (r₁, r₂, r₃), (G₁, G₂, G₃)) where r₁ = 3, r₂ = 5, r₃ = 6, G₁ = {{1}, {2}}, G₂ = {{1}, {4}} and G₃ = {{2}}; in particular, the slicing profile of F is the sequence (2, 2, 1).

Lemma 5.15.
Let n, η, ϑ, X , S be as in the statement of P( d ) . Let X = (cid:104) X s : s ∈ (cid:0) [ n ] d (cid:1) (cid:105) be an X -valued, η -spreadable, d -dimensional random array on [ n ] which is ( ϑ, S ) -boxindependent. Also let u (cid:54) ( n/ − d + 1 be a positive integer, and let ( k i ) ui =1 ∈ SP( (cid:98) n/ (cid:99) ) .If F is a nonempty subset of (cid:0) [ (cid:98) n/ (cid:99) ] d (cid:1) with slicing profile ( k i ) ui =1 , then for every collection ( a s ) s ∈F of elements of S , we have (5.54) (cid:12)(cid:12)(cid:12) P (cid:16) (cid:92) s ∈F [ X s = a s ] (cid:17) − (cid:89) s ∈F P (cid:0) [ X s = a s ] (cid:1)(cid:12)(cid:12)(cid:12) (cid:54) γ (5) ( η, ϑ, d, n, ( k i ) ui =1 ) where γ (5) ( η, ϑ, d, n, ( k i ) ui =1 ) is as in (5.17) .Proof. We proceed by induction on u . The case “ u = 1” follows from Corollary 5.8. Let u < ( n/ − d be a positive integer, and assume that (5.54) has been proved up to u . Let( k i ) u +1 i =1 ∈ SP( (cid:98) n/ (cid:99) ), let F be a subset of (cid:0) [ (cid:98) n/ (cid:99) ] d (cid:1) with slicing profile ( k i ) u +1 i =1 , and let( a s ) s ∈F be a collection of elements of S .First observe that n (cid:62) d + 2 since there exists a nonempty subset of (cid:0) [ (cid:98) n/ (cid:99) ] d (cid:1) withslicing profile of length at least 2; in particular, in what follows, Proposition 5.11 can be applied. Let ( u, ( r i ) u +1 i =1 , ( G i ) u +1 i =1 ) denote the slicing of F , and decompose F as F ∪ F where(5.55) F := (cid:8) t ∪ { r i } : t ∈ G i , i ∈ [ u ] (cid:9) and F := (cid:8) t ∪ { r u +1 } : t ∈ G u +1 (cid:9) . Notice that d (cid:54) r < r u +1 (cid:54) n/ G u +1 ⊆ (cid:0) [ r u +1 − d − (cid:1) and |G u +1 | = k u +1 . 
By Proposition 5.11 applied for “r = r_{u+1}”, “G = G_{u+1}”, “(a_t)_{t∈G} = (a_{t∪{r_{u+1}}})_{t∈G_{u+1}}”, “F = F₁” and “(b_s)_{s∈F} = (a_s)_{s∈F₁}”, we have

(5.56)  | P(⋂_{s∈F₁} [X_s = a_s] ∩ ⋂_{s∈F₂} [X_s = a_s]) − P(⋂_{s∈F₁} [X_s = a_s]) · P(⋂_{s∈F₂} [X_s = a_s]) | ⩽ γ⁽⁴⁾_{k_{u+1}}(η, ϑ, d, n).

On the other hand, by our inductive assumptions, we obtain that

(5.57)  | P(⋂_{s∈F₁} [X_s = a_s]) − ∏_{s∈F₁} P([X_s = a_s]) | ⩽ γ⁽⁵⁾(η, ϑ, d, n, (k_i)_{i=1}^{u}).

Moreover, since |G_{u+1}| = k_{u+1}, by Corollary 5.8,

(5.58)  | P(⋂_{s∈F₂} [X_s = a_s]) − ∏_{s∈F₂} P([X_s = a_s]) | ⩽ γ⁽¹⁾_{k_{u+1}}(η, ϑ, d, n).

The inductive step is then completed by combining (5.56)–(5.58) and using the definition of the constant γ⁽⁵⁾(η, ϑ, d, n, (k_i)_{i=1}^{u+1}) in (5.17). □

It is clear that Lemma 5.15 implies that P(d) holds true. This completes the proof of the general inductive step, and so the entire proof of Theorem 4.2 is completed.

6. Proofs of Theorems 1.5 and 2.3
Both results follow immediately from Theorem 2.2 and Corollary 4.3. The estimates in (1.3) and (2.5) follow by combining
• the choice of β and ℓ in (2.2) and (2.3), respectively,
• the definition of the numerical invariants in Subsection 5.2, and
• the choice of the constants in the proof of Lemma 4.6 in Subsection 4.3.
Indeed, the recursive selection in Subsection 5.2 implies that for every pair of positive integers n, d with n ⩾ d, every 0 < η, ϑ ⩽
1, and every positive integer k with k (cid:54) (cid:0) (cid:98) n/ (cid:99) d (cid:1) , γ k ( η, ϑ, d, n ) (cid:54) k d (cid:0) d (cid:112) /n + d √ η + d √ ϑ (cid:1) . (6.1)Using (6.1), (2.2) and (2.3), it is easy to check that C ( p, ε, k ) and C ( d, m, p, ε, k ) respec-tively, are as desired. 7. Examples
Our goal in this section is to present examples which show that the “box independence condition” in Theorems 1.5 and 2.3 is essentially optimal. We focus on boolean random arrays, as this case already covers all underlying phenomena.
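The examples below revolve around d-dimensional boxes (Subsection 4.1): Box(H) consists of all d-element sets obtained by picking one element from each block H_i. A short sketch (the helper `box` is ours, not from the paper):

```python
from itertools import product

def box(H):
    """Box(H): all sets {h_1, ..., h_d} with h_i in H_i, for blocks
    H_1, ..., H_d satisfying max(H_i) < min(H_{i+1})."""
    return [frozenset(choice) for choice in product(*H)]

# a 2-dimensional box of [4]: |Box(H)| = 2^2 = 4 two-element sets
print(sorted(sorted(s) for s in box([{1, 2}, {3, 4}])))
# → [[1, 3], [1, 4], [2, 3], [2, 4]]

# a (d-1)-face for d = 3: exactly one block is a singleton,
# so the face has 2^(d-1) = 4 elements
assert len(box([{1, 2}, {3}, {4, 5}])) == 4
```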
7.1. Boxes and faces.
We start by introducing some terminology which will be used throughout this section. Let d ⩾ 2 be an integer. We say that a family B of d-element subsets of ℕ is a d-dimensional box of ℕ if it is a d-dimensional box of [n] for some integer n ⩾ d. (See Subsection 4.1.) Moreover, we say that a family F of d-element subsets of ℕ is a (d − 1)-face of ℕ if it is of the form Box(H) where H = (H_1, …, H_d) is a finite sequence of nonempty subsets of ℕ of cardinality at most 2 with max(H_i) < min(H_{i+1}) for all i ∈ [d − 1] and ∑_{i=1}^{d} |H_i| = 2d − 1. (Thus, |H_i| = 2 for all but one i ∈ [d].)

7.2. The two-dimensional case.
We have the following proposition.
Proposition 7.1.
There exists a boolean, exchangeable, two-dimensional random array X = (cid:104) X s : s ∈ (cid:0) N (cid:1) (cid:105) on N with the following properties. ( P For every s ∈ (cid:0) N (cid:1) we have E [ X s ] = . ( P For every distinct s, t ∈ (cid:0) N (cid:1) we have E [ X s X t ] = . ( P For every -dimensional box B of N and every nonempty subset G of B with G (cid:54) = B we have E (cid:2) (cid:81) s ∈ G X s (cid:3) = ( ) | G | . ( P For every -dimensional box B of N we have E (cid:2) (cid:81) s ∈ B X s (cid:3) = ( ) . ( P Let n (cid:62) be an integer, and let X n denote the subarray of X determinedby [ n ] . ( See Definition . ) Then there exists a translated multilinear polyno-mial f : R ( [ n ]2 ) → R of degree with E [ f ( X n )] = 0 and (cid:107) f ( X n ) (cid:107) L ∞ (cid:54) , such thatfor every subset I of [ n ] with | I | (cid:62) we have P (cid:0)(cid:12)(cid:12) E [ f ( X n ) | F I ] (cid:12)(cid:12) (cid:62) − (cid:1) (cid:62) − .Proof. We will define the random array X by providing an integral representation of itsdistribution. (Of course, this maneuver is expected by the Aldous–Hoover representationtheorem [Ald81, Hoo79].) Specifically, set V := { , } and A := { (0 , , (1 , } ⊆ V ; weview V as a discrete probability space equipped with the uniform probability measure. Wealso set Ω := { , } ( N ) and we equip Ω with the product σ -algebra which we denote by Σ.Let P be the unique probability measure on (Ω , Σ) which satisfies, for every nonemptyfinite subset F of (cid:0) N (cid:1) , that(7.1) P (cid:16)(cid:8) ( x t ) t ∈ ( N ) ∈ Ω : x s = 1 for all s ∈ F (cid:9)(cid:17) = 12 (cid:16) (cid:17) |F| + 12 (cid:90) (cid:89) s ∈F A ( v s ) d µ ( v )where: (i) µ denotes the product measure on V N obtained by equipping each factorwith the uniform probability measure on V , and (ii) for every v = ( v i ) ∈ V N and every s = { i < i } ∈ (cid:0) N (cid:1) by v s = ( v i , v i ) ∈ V we denote the restriction of v on thecoordinates determined by s . 
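Since V = {0,1} and A = {(0,0), (1,1)} here, the right-hand side of (7.2) can be evaluated exactly by enumerating v: the array is a fair mixture of i.i.d. Bernoulli(1/2) entries and of the "agreement" array X_{{i,j}} = 1[v_i = v_j]. The following brute-force check (our own sketch, on a hypothetical small vertex set [4]) confirms properties (P1)–(P4): every proper subset of a box has the product moment (1/2)^{|G|}, while the full box does not.

```python
from itertools import product, combinations

def moment(F, m):
    """E[prod_{s in F} X_s] under (7.1) with V = {0,1} and
    A = {(0,0), (1,1)}: a 1/2-1/2 mixture of i.i.d. fair bits and of
    the agreement array X_{ij} = 1[v_i = v_j], v uniform on {0,1}^m."""
    iid = 0.5 ** len(F)
    agree = sum(all(v[i - 1] == v[j - 1] for (i, j) in F)
                for v in product((0, 1), repeat=m)) / 2 ** m
    return 0.5 * iid + 0.5 * agree

B = [(1, 3), (1, 4), (2, 3), (2, 4)]        # a 2-dimensional box
assert moment([(1, 2)], 4) == 0.5           # (P1)
assert moment([(1, 2), (3, 4)], 4) == 0.25  # (P2) for disjoint s, t
for k in (1, 2, 3):                         # (P3): proper subsets factor
    for G in combinations(B, k):
        assert moment(list(G), 4) == 0.5 ** k
print(moment(B, 4))                         # (P4): 3/32, not (1/2)^4
```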
Next, for every s ∈ (cid:0) N (cid:1) let X s : Ω → { , } denote theprojection on the s -th coordinate, that is, X s (cid:0) ( x t ) t ∈ ( N ) (cid:1) = x s for every ( x t ) t ∈ ( N ) ∈ Ω.The fact that the set A is symmetric implies that the random array X = (cid:104) X s : s ∈ (cid:0) N (cid:1) (cid:105) is exchangeable; moreover, for every nonempty finite subset F of (cid:0) N (cid:1) we have(7.2) E (cid:104) (cid:89) s ∈F X s (cid:105) = 12 (cid:16) (cid:17) |F| + 12 (cid:90) (cid:89) s ∈F A ( v s ) d µ ( v ) . Using (7.2), properties ( P P
4) follow by a direct computation.In order to verify property ( P
5) we argue as in the proof of Proposition 3.6. Fix aninteger n (cid:62)
8. Let Box(2) be the 2-dimensional box of N defined in (4.2). We define f : R ( [ n ]2 ) → R by setting for every x = ( x t ) t ∈ ( [ n ]2 ) ∈ R ( [ n ]2 ) f ( x ) := (cid:89) s ∈ Box(2) x s − E (cid:104) (cid:89) s ∈ Box(2) X s (cid:105) (7.3) (4.2) = x { , } x { , } x { , } x { , } − E [ X { , } X { , } X { , } X { , } ] . It is clear that f is a translated multilinear polynomial of degree 4 with E [ f ( X n )] = 0 and (cid:107) f ( X n ) (cid:107) L ∞ (cid:54)
1. (Recall that X n denotes the subarray of X determined by [ n ].) Let I be an arbitrary subset of [ n ] with | I | (cid:62)
8. Since | I | (cid:62)
8, there exists a 2-dimensional box B of N with B ⊆ (cid:0) I (cid:1) and such that min( s ) (cid:62) s ∈ B . Set C := (cid:84) s ∈ B [ X s = 1]and observe that C ∈ F I . Hence, by the exchangeability of X , we have E (cid:2) E [ f ( X n ) | F I ] C (cid:3) = E [ f ( X n ) C ](7.4) = E (cid:104) (cid:89) s ∈ Box(2) ∪ B X s (cid:105) − E (cid:104) (cid:92) s ∈ Box(2) X s (cid:105) = 12 which implies that P (cid:0)(cid:12)(cid:12) E [ f ( X n ) | F I ] (cid:12)(cid:12) (cid:62) − (cid:1) (cid:62) − . The proof is completed. (cid:3) The higher-dimensional case.
We proceed to discuss the case “d ⩾ 3”, which is more involved. Our examples can be roughly described as semi-random, in the sense that they are part random and part deterministic.

The following lemma provides us with the random component.
Lemma 7.2.
Let d (cid:62) be an integer, and let ε > . Then there exist a nonempty finiteset V and a symmetric subset A of V d − such that, denoting by A (cid:123) the complementof A , for every pair F, G of disjoint ( possibly empty ) subsets of (cid:0) [2 d ] d − (cid:1) we have (cid:12)(cid:12)(cid:12) (cid:90) (cid:16) (cid:89) s ∈ F A ( v s ) (cid:17) · (cid:16) (cid:89) s ∈ G A (cid:123) ( v s ) (cid:17) d µ ( v ) − (cid:16) (cid:17) | F | + | G | (cid:12)(cid:12)(cid:12) (cid:54) ε (7.5) where: (i) µ denotes the product measure on V N obtained by equipping each factorwith the uniform probability measure on V , (ii) for every v = ( v i ) ∈ V N and every s = { i < · · · < i d − } ∈ (cid:0) N d (cid:1) we have v s = ( v i , . . . , v i d − ) ∈ V d − , and (iii) in (7.5) we use the convention that the product of an empty family of functions is equal to theconstant function . Lemma 7.2 follows from a standard random selection and the Azuma–Hoeffding in-equality . We leave the details to the interested reader.We are now ready to state the higher-dimensional analogue of Proposition 7.1. That is, for every ( v , . . . , v d − ) ∈ V d − and every permutation π of [ d −
1] we have that( v , . . . , v d − ) ∈ A if and only if ( v π (1) , . . . , v π ( d − ) ∈ A . See also [DTV20, Fact 3.3 and Lemma 3.4] for a more general result.
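The random selection behind Lemma 7.2 can be sketched as follows: include each unordered tuple of V^{d−1} in A independently with probability 1/2; for |V| large, Hoeffding/Azuma-type concentration makes each integral in (7.5) close to (1/2)^{|F|+|G|}. A Monte-Carlo illustration for d = 3 (our own sketch; the parameter choices are arbitrary):

```python
import random

random.seed(0)
m = 200                         # |V|; any sufficiently large value works
# a symmetric random subset A of V x V: one fair coin per unordered pair
A = set()
for i in range(m):
    for j in range(i, m):
        if random.random() < 0.5:
            A.add((i, j))
            A.add((j, i))

def estimate(pairs, trials=20000):
    """Monte-Carlo estimate of the integral of prod 1_A(v_s) d mu(v),
    where each entry of `pairs` names two coordinates of v."""
    hits = 0
    for _ in range(trials):
        v = [random.randrange(m) for _ in range(4)]
        hits += all((v[i], v[j]) in A for (i, j) in pairs)
    return hits / trials

# disjoint coordinate pairs behave nearly independently, as in (7.5)
assert abs(estimate([(0, 1)]) - 0.5) < 0.05
assert abs(estimate([(0, 1), (2, 3)]) - 0.25) < 0.05
```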
Proposition 7.3.
Let d (cid:62) be an integer. Also let δ > . Then there exists a boolean,exchangeable, d -dimensional random array X = (cid:104) X s : s ∈ (cid:0) N d (cid:1) (cid:105) on N with the followingproperties. ( P For every s ∈ (cid:0) N d (cid:1) we have (cid:12)(cid:12) E [ X s ] − (cid:12)(cid:12) (cid:54) δ . ( P For every distinct s, t ∈ (cid:0) N d (cid:1) we have (cid:12)(cid:12) E [ X s X t ] − (cid:12)(cid:12) (cid:54) δ . ( P For every ( d − -face F of N we have (cid:12)(cid:12) E (cid:2) (cid:81) s ∈ F X s (cid:3) − ( ) | F | (cid:12)(cid:12) (cid:54) δ . ( P For every d -dimensional box B of N we have (cid:12)(cid:12) E (cid:2) (cid:81) s ∈ B X s (cid:3) − ( ) | B | (cid:12)(cid:12) (cid:54) δ . ( P Set ϑ := 16 − − d +1 . ( Note that ϑ does not dependent on δ . ) Let n (cid:62) d be aninteger, and let X n denote the subarray of X determined by [ n ] . Then there existsa translated multilinear polynomial f : R ( [ n ] d ) → R of degree d with E [ f ( X n )] = 0 and (cid:107) f ( X n ) (cid:107) L ∞ (cid:54) , such that for every subset I of [ n ] with | I | (cid:62) d we have P (cid:0)(cid:12)(cid:12) E [ f ( X n ) | F I ] (cid:12)(cid:12) (cid:62) ϑ (cid:1) (cid:62) ϑ .Remark . We point out that property ( P
3) is rather strong. Indeed, arguing as in theproof of Theorem 4.2, it is not hard to show that if X = (cid:104) X s : s ∈ (cid:0) N d (cid:1) (cid:105) is any boolean,spreadable, d -dimensional random array on N which satisfies properties ( P
1) and ( P
3) ofProposition 7.3, then for every d -dimensional box B of N and every nonempty subset G of B with G (cid:54) = B we have (cid:12)(cid:12)(cid:12) E (cid:104) (cid:89) s ∈ G X s (cid:105) − (cid:16) (cid:17) | G | (cid:12)(cid:12)(cid:12) = o δ → d (1) . (7.6)Note that (7.6) barely misses to imply that X satisfies the box independence condition. Proof of Proposition . Let V and A be the sets obtained by Lemma 7.2 applied for(7.7) ε := min (cid:8) δ − d d , − − ( d +2)2 d (cid:9) and observe that V can be selected so that its cardinality is an even positive integer. Wealso note that in the rest of the proof we follow the notational conventions in Lemma 7.2.First, for every i ∈ [ d ] we define h i , h i : V d → { , } by setting for every v ∈ V d h i ( v ) := A ( v [ d ] \{ i } ) and h i ( v ) := A (cid:123) ( v [ d ] \{ i } ) . (7.8)Next, for every x ∈ { , } d define h x : V d → { , } by h x := d (cid:89) i =1 h x ( i ) i . (7.9)Finally, set(7.10) A := (cid:8) x ∈ { , } d : x (1) + · · · + x ( d ) is even (cid:9) and define H : V d → { , } by H := (cid:88) x ∈ A h x . (7.11) For instance, if d = 3, then H ( v , v , v ) = A ( v , v ) A ( v , v ) A ( v , v ) + A (cid:123) ( v , v ) A (cid:123) ( v , v ) A ( v , v )++ A (cid:123) ( v , v ) A ( v , v ) A (cid:123) ( v , v ) + A ( v , v ) A (cid:123) ( v , v ) A (cid:123) ( v , v ) . Note that the function H is symmetric . In the following series of claims we isolateseveral properties of H which will be used in the proofs of properties ( P P Claim 7.5.
For every k ∈ [ d + 1] set t k := { k, . . . , k + d − } ∈ (cid:0) N d (cid:1) . Then we have (7.12) (cid:12)(cid:12)(cid:12) (cid:90) H ( v t ) d µ ( v ) − (cid:12)(cid:12)(cid:12) (cid:54) d − ε. Moreover, for every k ∈ { , . . . , d + 1 } we have (7.13) (cid:12)(cid:12)(cid:12) (cid:90) H ( v t ) H ( v t k ) d µ ( v ) − (cid:12)(cid:12)(cid:12) (cid:54) d − ε. Proof of Claim . First observe that (7.12) follows from (7.5), the fact that | A | = 2 d − and the definition of H . Next, fix k ∈ { , . . . , d + 1 } . Then, for every v ∈ V N we have(7.14) H ( v t ) H ( v t k ) = (cid:88) x , y ∈ A (cid:16) d (cid:89) i =1 h x ( i ) i ( v t ) (cid:17)(cid:16) d (cid:89) j =1 h y ( j ) j ( v t k ) (cid:17) . Therefore, if k >
2, then (7.13) also follows from (7.5) and the fact that | A | = 2 d − . Soassume that k = 2. By (7.14), for every v ∈ V N we have(7.15) H ( v t ) H ( v t ) = (cid:88) x , y ∈ A (cid:16) d (cid:89) i =2 h x ( i ) i ( v t ) (cid:17)(cid:16) d − (cid:89) j =1 h y ( j ) j ( v t ) (cid:17) h x (1)1 ( v t ) h y ( d ) d ( v t ) . Notice that for every v ∈ V N we have h v t = h v t and h v t = h v t . Thus, setting W := { ( x , y ) ∈ A × A : x (1) = y ( d ) } , we see that h x (1)1 ( v t ) h y ( d ) d ( v t ) = 0 for every( x , y ) ∈ A × A \ W . Combining this information with (7.15), we obtain that(7.16) H ( v t ) H ( v t ) = (cid:88) ( x , y ) ∈W (cid:16) d (cid:89) i =2 h x ( i ) i ( v t ) (cid:17)(cid:16) d − (cid:89) j =1 h y ( j ) j ( v t ) (cid:17) h x (1)1 ( v t ) h y ( d ) d ( v t )for every v ∈ V N . On the other hand, by (7.5), for every ( x , y ) ∈ W we have(7.17) (cid:12)(cid:12)(cid:12) (cid:90) (cid:16) d (cid:89) i =2 h x ( i ) i ( v t ) (cid:17)(cid:16) d − (cid:89) j =1 h y ( j ) j ( v t ) (cid:17) h x (1)1 ( v t ) h y ( d ) d ( v t ) d µ ( v ) − (cid:16) (cid:17) d − (cid:12)(cid:12)(cid:12) (cid:54) ε. Since |W| = 2 d − , we conclude that (7.13) for k = 2 follows from (7.16) and (7.17). Theproof of Claim 7.5 is completed. (cid:3) That is, we have H ( v , . . . , v d ) = H ( v π (1) , . . . , v π ( d ) ) for every ( v , . . . , v d ) ∈ V d and every permu-tation π of [ d ]. ONCENTRATION ESTIMATES FOR FUNCTIONS OF HIGH-DIMENSIONAL ARRAYS 35
Claim 7.6.
Set C := (cid:8) u ∪ { d − } : u ∈ Box( d − (cid:9) where Box( d − is as in (4.2) . ( Notice that C ⊆ (cid:0) N d (cid:1) . ) Then we have (7.18) (cid:12)(cid:12)(cid:12) (cid:90) (cid:89) s ∈ C H ( v s ) d µ ( v ) − (cid:16) (cid:17) | C | (cid:12)(cid:12)(cid:12) (cid:54) ( d + 1)2 d − d − d − ε. Proof of Claim . We start by setting j i := 2 i − j i := 2 i for every i ∈ [ d − (cid:15) = ( (cid:15) i ) d − i =1 ∈ { , } d − set s ( (cid:15) ) := (cid:8) j (cid:15) i i : i ∈ [ d − (cid:9) ∪ { d − } , and noticethat C = (cid:8) s ( (cid:15) ) : (cid:15) ∈ { , } d − (cid:9) . Moreover, by (7.9) and (7.11), we have (cid:90) (cid:89) s ∈ C H ( v s ) d µ ( v ) = (cid:90) (cid:89) (cid:15) ∈{ , } d − H ( v s ( (cid:15) ) ) d µ ( v ) =(7.19) = (cid:88) ( x (cid:15) ) (cid:15) ∈{ , } d − ∈ A { , } d − (cid:90) (cid:89) (cid:15) ∈{ , } d − h x (cid:15) ( v s ( (cid:15) ) ) d µ ( v ) . We define a subset R of A { , } d − by the rule( x (cid:15) ) (cid:15) ∈{ , } d − ∈ R ⇔ for every j ∈ [ d − , every (cid:15) = ( (cid:15) i ) d − i =1 ∈ { , } d − (7.20) and every (cid:15) (cid:48) = ( (cid:15) (cid:48) i ) d − i =1 ∈ { , } d − with (cid:15) i = (cid:15) (cid:48) i for all i ∈ [ d − \ { j } we have x (cid:15) ( j ) = x (cid:15) (cid:48) ( j ) . (cid:15) (0 , , , , x (cid:15) (1) x (cid:15) (2) x (cid:15) (3) Figure 4.
The structure of the set R for d = 3. Connected dots imply equality of the corresponding coordinates.

Observe that for every j ∈ [d −
1] and every (cid:15) = ( (cid:15) i ) d − i =1 , (cid:15) (cid:48) = ( (cid:15) (cid:48) i ) d − i =1 ∈ { , } d − with (cid:15) i = (cid:15) (cid:48) i for all i ∈ [ d − \ { j } we have h j ( v s ( (cid:15) ) ) = h j ( v s ( (cid:15) (cid:48) ) ) and h j ( v s ( (cid:15) ) ) = h j ( v s ( (cid:15) (cid:48) ) )for every v ∈ V N which, in turn, implies that h j ( v s ( (cid:15) ) ) h j ( v s ( (cid:15) (cid:48) ) ) = 0. Consequently, (cid:81) (cid:15) ∈{ , } d − h x (cid:15) ( v s ( (cid:15) ) ) = 0 for every ( x (cid:15) ) (cid:15) ∈{ , } d − ∈ A { , } d − \ R and every v ∈ V N .Therefore, by (7.19), we obtain that(7.21) (cid:90) (cid:89) s ∈ C H ( v s ) d µ ( v ) = (cid:88) ( x (cid:15) ) (cid:15) ∈{ , } d − ∈R (cid:90) (cid:89) (cid:15) ∈{ , } d − h x (cid:15) ( v s ( (cid:15) ) ) d µ ( v ) . On the other hand, by (7.5), for every ( x (cid:15) ) (cid:15) ∈{ , } d − ∈ R we have (cid:12)(cid:12)(cid:12) (cid:90) (cid:89) (cid:15) ∈{ , } d − h x (cid:15) ( v s ( (cid:15) ) ) d µ ( v ) − (cid:16) (cid:17) ( d +1)2 d − (cid:12)(cid:12)(cid:12) (7.22)= (cid:12)(cid:12)(cid:12) (cid:90) d (cid:89) i =1 (cid:89) (cid:15) ∈{ , } d − h x (cid:15) ( i ) i ( v s ( (cid:15) ) ) d µ ( v ) − (cid:16) (cid:17) ( d +1)2 d − (cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12) (cid:90) (cid:16) d − (cid:89) i =1 (cid:89) (cid:15) ∈{ , } d − h x (cid:15) ( i ) i ( v s ( (cid:15) ) ) (cid:17)(cid:16) (cid:89) (cid:15) ∈{ , } d − h x (cid:15) ( d ) d ( v s ( (cid:15) ) ) (cid:17) d µ ( v ) − (cid:16) (cid:17) ( d +1)2 d − (cid:12)(cid:12)(cid:12) (cid:54) ( d +1)2 d − ε. The estimate (7.18) follows from (7.21), (7.22), and the fact that |R| = 2 ( d − d − and | C | = 2 d − . The proof of Claim 7.6 is completed. (cid:3) Claim 7.7.
Let
Box( d ) be as in (4.2) . Then we have (7.23) (cid:12)(cid:12)(cid:12) (cid:90) (cid:89) s ∈ Box( d ) H ( v s ) d µ ( v ) − (cid:16) (cid:17) | Box( d ) | (cid:12)(cid:12)(cid:12) (cid:54) d d +( d − d − ε. Proof of Claim . As in the proof of Claim 7.6, for every i ∈ [ d ] set j i := 2 i − j i := 2 i . Moreover, for every (cid:15) = ( (cid:15) i ) di =1 ∈ { , } d set s ( (cid:15) ) := (cid:8) j (cid:15) i i : i ∈ [ d ] (cid:9) , and observethat Box( d ) = (cid:8) s ( (cid:15) ) : (cid:15) ∈ { , } d (cid:9) . We define a subset Q of A { , } d by setting( x (cid:15) ) (cid:15) ∈{ , } d ∈ Q ⇔ for every j ∈ [ d ] , every (cid:15) = ( (cid:15) i ) di =1 ∈ { , } d (7.24) and every (cid:15) (cid:48) = ( (cid:15) (cid:48) i ) di =1 ∈ { , } d with (cid:15) i = (cid:15) (cid:48) i for all i ∈ [ d ] \ { j } we have x (cid:15) ( j ) = x (cid:15) (cid:48) ( j ) . (cid:15) (0 , , , , , , , , x (cid:15) (1) x (cid:15) (2) x (cid:15) (3)(1 , , , , , , , , Figure 5.
The structure of the set Q for d = 3. As in Figure 4, connected dots imply equality of the corresponding coordinates.
By (7.9), (7.11), the definition of Q and arguing as in the proof of Claim 7.6, we see that (cid:90) (cid:89) s ∈ Box( d ) H ( v s ) d µ ( v ) = (cid:90) (cid:89) (cid:15) ∈{ , } d H ( v s ( (cid:15) ) ) d µ ( v )(7.25) = (cid:88) ( x (cid:15) ) (cid:15) ∈{ , } d ∈ A { , } d (cid:90) (cid:89) (cid:15) ∈{ , } d h x (cid:15) ( v s ( (cid:15) ) ) d µ ( v )= (cid:88) ( x (cid:15) ) (cid:15) ∈{ , } d ∈Q (cid:90) (cid:89) (cid:15) ∈{ , } d h x (cid:15) ( v s ( (cid:15) ) ) d µ ( v ) . By (7.5), for every ( x (cid:15) ) (cid:15) ∈{ , } d − ∈ Q we have(7.26) (cid:12)(cid:12)(cid:12) (cid:90) (cid:89) (cid:15) ∈{ , } d h x (cid:15) ( v s ( (cid:15) ) ) d µ ( v ) − (cid:16) (cid:17) d d − (cid:12)(cid:12)(cid:12) (cid:54) d d − ε. Finally, note that | Q | = 2 ( d − d − +1 . Using this information, (7.23) follows from (7.25),(7.26) and the fact that | Box( d ) | = 2 d . The proof of Claim 7.7 is completed. (cid:3) Claim 7.8.
Let B be a d -dimensional box of N such that min( s ) (cid:62) d +1 for every s ∈ B .Then we have (7.27) (cid:12)(cid:12)(cid:12) (cid:90) (cid:89) s ∈ Box( d ) ∪ B H ( v s ) d µ ( v ) − (cid:16) (cid:17) | Box( d ) | (cid:12)(cid:12)(cid:12) (cid:54) d d +1+( d − d − ε. Proof of Claim . It follows immediately by Claim 7.7. (cid:3)
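Two structural features of H drive these claims: the functions h_x form a partition of unity, so H takes values in {0,1}, and the symmetry of A forces the symmetry of H. Both can be checked by brute force for d = 3 (a sketch with a hypothetical symmetric set A; the parity class is the one appearing in the displayed d = 3 expansion of H):

```python
from itertools import product, permutations

V = range(3)
A = {(0, 0), (1, 2), (2, 1)}        # a hypothetical symmetric subset of V^2

def H(v):
    """H from (7.11) for d = 3: the sum of h_x over the parity class,
    where h_i^1(v) = 1_A(v restricted to the other two coordinates)
    and h_i^0 = 1 - h_i^1."""
    total = 0
    for x in product((0, 1), repeat=3):
        if sum(x) % 2 == 1:         # parity class of the d = 3 expansion
            term = 1
            for i in range(3):
                rest = tuple(v[j] for j in range(3) if j != i)
                mem = 1 if rest in A else 0
                term *= mem if x[i] == 1 else 1 - mem
            total += term
    return total

for v in product(V, repeat=3):
    assert H(v) in (0, 1)           # the h_x partition unity
    for p in permutations(v):
        assert H(p) == H(v)         # H is symmetric, since A is
```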
After this preliminary discussion, we now enter into the main part of the proof. Let $X = \langle X_s : s \in \binom{\mathbb{N}}{d} \rangle$ be a boolean, exchangeable, $d$-dimensional random array on $\mathbb{N}$ whose distribution satisfies

(7.28)  $\displaystyle \mathbb{E}\Big[\prod_{s \in \mathcal{F}} X_s\Big] = \frac{1}{2}\Big(\frac{1}{2}\Big)^{|\mathcal{F}|} + \frac{1}{2}\int \prod_{s \in \mathcal{F}} H(v_s)\, d\mu(v)$

for every nonempty finite subset $\mathcal{F}$ of $\binom{\mathbb{N}}{d}$. (The existence of such a random array follows by arguing precisely as in the proof of Proposition 7.1.)

First, we will show that $X$ satisfies properties (P1) up to (P4). For property (P1), let $s \in \binom{\mathbb{N}}{d}$ be arbitrary and notice that, by the exchangeability of $X$ and (7.28),

(7.29)  $\displaystyle \mathbb{E}[X_s] = \frac{1}{4} + \frac{1}{2}\int H(v_t)\, d\mu(v)$

where, as in Claim 7.5, we have $t = \{1, \dots, d\}$. By (7.12) and the choice of $\varepsilon$ in (7.7), we obtain that $|\mathbb{E}[X_s] - \frac{1}{2}| \le \delta$.

For property (P2), let $s, t \in \binom{\mathbb{N}}{d}$ be distinct, and set $k := d - |s \cap t| + 1$. Since $X$ is exchangeable, by (7.28), we have

(7.30)  $\displaystyle \mathbb{E}[X_s X_t] = \frac{1}{8} + \frac{1}{2}\int H(v_t)\, H(v_{t_k})\, d\mu(v)$

where $t$ and $t_k$ are as in Claim 7.5. By (7.13), (7.30) and invoking (7.7) again, we see that $|\mathbb{E}[X_s X_t] - \frac{1}{4}| \le \delta$.

For property (P3), let $\mathcal{F}$ be a $(d-1)$-dissociated finite subset of $\binom{\mathbb{N}}{d}$. By the exchangeability of $X$, (7.28) and the choice of the set $\mathcal{C}$ in Claim 7.6,

(7.31)  $\displaystyle \mathbb{E}\Big[\prod_{s \in \mathcal{F}} X_s\Big] = \frac{1}{2}\Big(\frac{1}{2}\Big)^{|\mathcal{F}|} + \frac{1}{2}\int \prod_{s \in \mathcal{C}} H(v_s)\, d\mu(v)$

which implies, by (7.18) and the choice of $\varepsilon$ in (7.7), that

(7.32)  $\displaystyle \Big| \mathbb{E}\Big[\prod_{s \in \mathcal{F}} X_s\Big] - \Big(\frac{1}{2}\Big)^{|\mathcal{F}|} \Big| \le \delta.$

Lastly, for property (P4), let $B$ be a $d$-dimensional box of $\mathbb{N}$. Using once again the exchangeability of $X$ and (7.28), we see that

(7.33)  $\displaystyle \mathbb{E}\Big[\prod_{s \in B} X_s\Big] = \frac{1}{2}\Big(\frac{1}{2}\Big)^{|B|} + \frac{1}{2}\int \prod_{s \in \mathrm{Box}(d)} H(v_s)\, d\mu(v)$

and so, by (7.23) and the choice of $\varepsilon$ in (7.7),

(7.34)  $\displaystyle \Big| \mathbb{E}\Big[\prod_{s \in B} X_s\Big] - \Big(\frac{1}{2}\Big)^{|B|} \Big| \le \delta.$

Thus, it remains to verify property (P5). To this end, let $n \ge 2d$, and define $f : \mathbb{R}^{\binom{[n]}{d}} \to \mathbb{R}$ by setting, for every $x = (x_t)_{t \in \binom{[n]}{d}} \in \mathbb{R}^{\binom{[n]}{d}}$,

(7.35)  $\displaystyle f(x) := \prod_{s \in \mathrm{Box}(d)} x_s - \mathbb{E}\Big[\prod_{s \in \mathrm{Box}(d)} X_s\Big].$

Clearly, $f$ is a translated multilinear polynomial of degree $2^d$, and it satisfies $\mathbb{E}[f(X_n)] = 0$ and $\|f(X_n)\|_{L_\infty} \le 1$. On the other hand, if $B$ is a $d$-dimensional box on $\mathbb{N}$ such that $\min(s) \ge 2d + 1$ for every $s \in B$, then, by (7.28),

(7.36)  $\displaystyle \mathbb{E}\Big[\prod_{s \in \mathrm{Box}(d) \cup B} X_s\Big] = \frac{1}{2}\Big(\frac{1}{2}\Big)^{2^{d+1}} + \frac{1}{2}\int \prod_{s \in \mathrm{Box}(d) \cup B} H(v_s)\, d\mu(v)$

and so, by (7.27),

(7.37)  $\displaystyle \Big| \mathbb{E}\Big[\prod_{s \in \mathrm{Box}(d) \cup B} X_s\Big] - \Big(\frac{1}{2}\Big)^{2^{d+1}} \Big| \le C_d\, 2^{-\varepsilon}$

where $C_d > 0$ is a constant depending only on $d$. Using this estimate, property (P4) and arguing as in the proof of Proposition 7.1, it is easy to verify that the function $f$ satisfies property (P5). $\square$

8. Comments and extensions
8.1. Dissociated random arrays.
The following theorem is the analogue of Theorem 2.3 for the case of dissociated random arrays. (The proof follows by arguing as in Section 3, and it is left to the interested reader.)
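Before stating the theorem, it may help to see the phenomenon it describes in the simplest dissociated case. The following Python sketch is our own illustration, not part of the formal development: it takes $d = 2$ and entries $X_{\{i,j\}} = \xi_i \xi_j$ for independent Rademacher signs (an array of the form (1.7)), it takes $f$ to be the normalized sum of the entries, and it assumes the interpretation of $\mathcal{F}_I$ as the $\sigma$-algebra generated by the subarray indexed by the interval $I$. In this case $\mathbb{E}[f(X) \mid \mathcal{F}_I]$ is simply the normalized sum of the entries inside $I$, since every entry meeting the complement of $I$ has vanishing conditional expectation.

```python
import itertools
import math
import random

# Illustration only (parameters are ours, not from the paper): d = 2,
# X_{{i,j}} = xi_i * xi_j for independent Rademacher signs xi_1, ..., xi_n,
# and f(x) = (sum of all entries) / sqrt(binom(n, 2)), so that
# E[f(X)] = 0 and ||f(X)||_{L_2} = 1.
rng = random.Random(0)
n = 100
I = range(10)                      # interval I = {0, ..., 9}, |I| = 10
eps = 0.3
norm = math.sqrt(math.comb(n, 2))  # normalization making ||f(X)||_{L_2} = 1

def cond_exp(xi):
    """E[f(X) | F_I]: only pairs contained in I survive, because
    E[xi_i * xi_j | F_I] = 0 whenever i or j lies outside I."""
    return sum(xi[i] * xi[j] for i, j in itertools.combinations(I, 2)) / norm

trials = 2000
hits = 0
for _ in range(trials):
    xi = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # independent signs
    if abs(cond_exp(xi)) <= eps:
        hits += 1

print(hits / trials)  # close to 1, in line with the bound 1 - eps in (8.2)
```

With these parameters the conditional expectation has $L_2$ norm $\sqrt{\binom{10}{2}/\binom{100}{2}} \approx 0.095$, so the event in (8.2) occurs with probability very close to $1$: conditioning on a small subarray concentrates $f(X)$ near its mean.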
Theorem 8.1.
Let $1 < p \le 2$, let $0 < \varepsilon \le 1$, and set

(8.1)  $\displaystyle c = c(\varepsilon, p) := \frac{1}{4}\, \varepsilon^{\frac{2(p+1)}{p(p-1)}}.$

Also let $n, d$ be positive integers with $n \ge \max\{d, 1/c\}$, and let $X$ be a dissociated, $d$-dimensional random array on $[n]$ whose entries take values in a measurable space $\mathcal{X}$. Then for every measurable function $f : \mathcal{X}^{\binom{[n]}{d}} \to \mathbb{R}$ with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p} = 1$ there exists an interval $I$ of $[n]$ with $|I| \ge cn$ such that

(8.2)  $\displaystyle \mathbb{P}\big( \big|\mathbb{E}[f(X) \mid \mathcal{F}_I]\big| \le \varepsilon \big) \ge 1 - \varepsilon.$

Note that Theorem 8.1 improves upon Theorem 2.3 in two ways. Firstly, in Theorem 8.1 no restriction is imposed on the distributions of the entries of $X$. Secondly, the random variable $f(X)$ becomes concentrated by conditioning it on a subarray whose size is proportional to $n$.

An important class of random arrays for which Theorem 8.1 is applicable, especially from the point of view of applications, consists of those random arrays whose entries are of the form (1.7), where $(\xi_1, \dots, \xi_n)$ is a random vector with independent (but not necessarily identically distributed) entries.

8.2. Vector-valued functions of random arrays.
Recall that a Banach space $E$ is called uniformly convex if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for every $x, y \in E$ with $\|x\|_E = \|y\|_E = 1$ and $\|x - y\|_E \ge \varepsilon$ we have that $\|(x + y)/2\|_E \le 1 - \delta$. It is a classical fact (see [Ja72, GG71]) that for every uniformly convex Banach space $E$ and every $p > 1$ there exist $q \ge 2$ and $C > 0$ such that for every $E$-valued martingale difference sequence $(d_i)_{i=1}^m$ we have

(8.3)  $\displaystyle \Big( \sum_{i=1}^m \|d_i\|_{L_p(E)}^q \Big)^{1/q} \le C\, \Big\| \sum_{i=1}^m d_i \Big\|_{L_p(E)}.$

(See also [Pi11, Pi16] for a proof and a detailed presentation of related material.) Using (8.3) instead of Proposition 3.2 and arguing precisely as in Section 3, we obtain the following vector-valued version of Theorem 2.3.

Theorem 8.2.
For every uniformly convex Banach space $E$, every pair $d, m$ of positive integers with $m \ge 2$, every $p > 1$, every $0 < \varepsilon \le 1$ and every integer $k \ge d$, there exists a constant $C > 0$ with the following property.

Let $n \ge C$ be an integer, let $\mathcal{X}$ be a set with $|\mathcal{X}| = m$, let $X = \langle X_s : s \in \binom{[n]}{d} \rangle$ be an $\mathcal{X}$-valued, $(1/C)$-spreadable, $d$-dimensional random array on $[n]$, and assume that the following holds. There exists $\mathcal{S} \subseteq \mathcal{X}$ with $|\mathcal{S}| = |\mathcal{X}| - 1$ such that for every $a \in \mathcal{S}$ we have

(8.4)  $\displaystyle \Big| \mathbb{P}\Big( \bigcap_{s \in \mathrm{Box}(d)} [X_s = a] \Big) - \prod_{s \in \mathrm{Box}(d)} \mathbb{P}\big( [X_s = a] \big) \Big| \le \frac{1}{C},$

where $\mathrm{Box}(d)$ denotes the $d$-dimensional box defined in (2.6). Then for every function $f : \mathcal{X}^{\binom{[n]}{d}} \to E$ with $\mathbb{E}[f(X)] = 0$ and $\|f(X)\|_{L_p(E)} = 1$ there exists an interval $I$ of $[n]$ with $|I| = k$ such that

(8.5)  $\displaystyle \mathbb{P}\big( \big\|\mathbb{E}[f(X) \mid \mathcal{F}_I]\big\|_E \le \varepsilon \big) \ge 1 - \varepsilon.$

References

[AdWo15] R. Adamczak and P. Wolff,
Concentration inequalities for non-Lipschitz functions with bounded derivatives of higher order, Probab. Theory Related Fields 162 (2015), 531–586.
[Ald81] D. J. Aldous, Representations for partially exchangeable arrays of random variables, J. Multivariate Anal. 11 (1981), 581–598.
[Au08] T. Austin, On exchangeable random variables and the statistics of large graphs and hypergraphs, Probability Surveys 5 (2008), 80–145.
[Au13] T. Austin, Exchangeable random arrays, preprint (2013), available at .
[Ber96] V. Bergelson, Ergodic Ramsey theory—an update, in “Ergodic Theory of Z^d-Actions”, London Mathematical Society Lecture Note Series, Vol. 228, Cambridge University Press, 1996, 1–61.
[Bob04] S. Bobkov, Concentration of normalized sums and a central limit theorem for noncorrelated random variables, Ann. Probab. 32 (2004), 2884–2907.
[BLM13] S. Boucheron, G. Lugosi and P. Massart, Concentration Inequalities. A Nonasymptotic Theory of Independence, Oxford University Press, 2013.
[Ch06] S. Chatterjee, A generalization of the Lindeberg principle, Ann. Probab. 34 (2006), 2061–2076.
[CGW88] F. R. K. Chung, R. L. Graham and R. M. Wilson, Quasi-random graphs, Proc. Natl. Acad. Sci. USA 85 (1988), 969–970.
[CGW89] F. R. K. Chung, R. L. Graham and R. M. Wilson, Quasi-random graphs, Combinatorica 9 (1989), 345–362.
[DF80] P. Diaconis and D. Freedman, Finite exchangeable sequences, Ann. Probab. 8 (1980), 745–764.
[DK16] P. Dodos and V. Kanellopoulos, Ramsey Theory for Product Spaces, Mathematical Surveys and Monographs, Vol. 212, American Mathematical Society, 2016.
[DKK16] P. Dodos, V. Kanellopoulos and Th. Karageorgos, Szemerédi’s regularity lemma via martingales, Electron. J. Combin. 23 (2016), Research Paper P3.11, 1–24.
[DKT16] P. Dodos, V. Kanellopoulos and K. Tyros, A concentration inequality for product spaces, J. Funct. Anal. 270 (2016), 609–620.
[DTV20] P. Dodos, K. Tyros and P. Valettas,
Decompositions of finite high-dimensional random arrays, preprint (2021).
[FT85] D. H. Fremlin and M. Talagrand, Subgraphs of random graphs, Trans. Amer. Math. Soc. 291 (1985), 551–582.
[GSS19] F. Götze, H. Sambale and A. Sinulis, Concentration inequalities for polynomials in α-sub-exponential random variables, preprint (2019), available at https://arxiv.org/pdf/1903.05964.pdf.
[GG71] V. I. Gurarii and N. I. Gurarii, On bases in uniformly convex and uniformly smooth Banach spaces, Izv. Akad. Nauk SSSR Ser. Mat. 35 (1971), 210–215.
[Hoo79] D. N. Hoover,
Relations on probability spaces and arrays of random variables, preprint (1979), available at .
[Ja72] R. C. James, Super-reflexive spaces with bases, Pac. J. Math. 41 (1972), 409–419.
[Kal92] O. Kallenberg,
Symmetries on random arrays and set-indexed processes, J. Theor. Probab. 5 (1992), 727–765.
[Kal05] O. Kallenberg, Probabilistic Symmetries and Invariance Principles, Probability and its Applications (New York), Springer, 2005.
[La06] R. Latała, Estimates of moments and tails of Gaussian chaoses, Ann. Probab. 34 (2006), 2315–2331.
[Le01] M. Ledoux, The Concentration of Measure Phenomenon, Mathematical Surveys and Monographs, Vol. 89, American Mathematical Society, 2001.
[MS75] W. G. McGinley and R. Sibson, Dissociated random variables, Math. Proc. Cambridge Philos. Soc. 77 (1975), 185–188.
[Pi11] G. Pisier,
Martingales in Banach Spaces (in connection with Type and Cotype), preprint (2011), available at .
[Pi16] G. Pisier, Martingales in Banach Spaces, Cambridge Studies in Advanced Mathematics, Vol. 155, Cambridge University Press, 2016.
[Ra30] F. P. Ramsey, On a problem of formal logic, Proc. London Math. Soc. 30 (1930), 264–286.
[RX16] E. Ricard and Q. Xu, A noncommutative martingale convexity inequality, Ann. Probab. 44 (2016), 867–882.
[Tao06] T. Tao, Szemerédi’s regularity lemma revisited, Contrib. Discrete Math. 1 (2006), 8–28.
[Tao08] T. Tao, Structure and Randomness: Pages from Year One of a Mathematical Blog, American Mathematical Society, 2008.
[V19] R. Vershynin, Concentration inequalities for random tensors, Bernoulli 26 (2020), 3139–3162.
Department of Mathematics, University of Athens, Panepistimiopolis 157 84, Athens, Greece
Email address : [email protected] Department of Mathematics, University of Athens, Panepistimiopolis 157 84, Athens, Greece
Email address : [email protected] Mathematics Department, University of Missouri, Columbia, MO, 65211
Email address ::