What is the minimal cardinal of a family which shatters all d-subsets of a finite set?
N. Chevallier and A. Fruchard
May 12, 2018
arXiv [math.CO]
In this note, $d \le n$ are positive integers. Let $S$ be a finite set of cardinal $|S| = n$ and let $2^S$ denote its power set, i.e. the set of its subsets. A $d$-subset of $S$ is a subset of $S$ of cardinal $d$.

Let $\mathcal F \subseteq 2^S$ and $A \subseteq S$. The trace of $\mathcal F$ on $A$ is the family $\mathcal F_A = \{E \cap A \,;\, E \in \mathcal F\}$. One says that $\mathcal F$ shatters $A$ if $\mathcal F_A = 2^A$. The VC-dimension of $\mathcal F$ is the maximal cardinal of a subset of $S$ that is shattered by $\mathcal F$ [7]. The following is well known [7, 4, 5]:

Theorem 1. (Vapnik-Chervonenkis, Sauer, Shelah) If $\mathrm{VC\text{-}dim}(\mathcal F) \le d$ (i.e. if $\mathcal F$ shatters no $(d+1)$-subset of $S$), then $|\mathcal F| \le c(d,n)$, where $c(d,n) = \binom{n}{0} + \cdots + \binom{n}{d}$. Moreover this bound is tight: it is achieved e.g. for $\mathcal F = \binom{S}{\le d}$, the family of all $k$-subsets of $S$, $0 \le k \le d$.

This bound raises a first natural question, about families that are maximal for inclusion.
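Before turning to that question, the tightness statement of Theorem 1 can be checked by brute force on a small case. The sketch below is only an illustration and not part of the original note: the helper names `shatters`, `vc_dim` and `c`, and the choice $n = 5$, $d = 2$, are assumptions made for the example.

```python
from itertools import combinations
from math import comb


def shatters(family, A):
    """True if the trace of `family` on A is the whole power set of A."""
    A = frozenset(A)
    traces = {E & A for E in family}
    return len(traces) == 2 ** len(A)


def vc_dim(family, S):
    """Largest size of a subset of S shattered by `family` (brute force)."""
    best = 0
    for k in range(1, len(S) + 1):
        if any(shatters(family, A) for A in combinations(S, k)):
            best = k
        else:
            # Every subset of a shattered set is shattered, so once no
            # k-subset is shattered, no larger subset can be.
            break
    return best


def c(d, n):
    """The bound c(d, n) = C(n, 0) + ... + C(n, d) of Theorem 1."""
    return sum(comb(n, k) for k in range(d + 1))


# Hypothetical small instance: all subsets of size at most d of an n-set.
n, d = 5, 2
S = range(n)
family = [frozenset(E) for k in range(d + 1) for E in combinations(S, k)]

print(len(family), c(d, n), vc_dim(family, S))
```

On this instance the family has exactly $c(2, 5)$ members and shatters every 2-subset but no 3-subset. The search is exponential in $n$, so it is only meant for very small ground sets.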
Question 1. Assume a family $\mathcal F \subseteq 2^S$ is maximal for inclusion among all families of VC-dimension at most $d$. Does $\mathcal F$ always have the maximal possible cardinal $c(d,n)$?

Let us define the index of $\mathcal F$ as follows:
$$\operatorname{Ind}\mathcal F = \max\{d \in \{0, \dots, n\} \,;\, \mathcal F \text{ shatters all } d\text{-subsets of } S\}.$$
Let $C(d,n) = \min\{|\mathcal F| \,;\, \operatorname{Ind}\mathcal F = d\}$. For instance, we have $C(1,n) = 2$, with the (only possible) choice $\mathcal F = \{\emptyset, S\}$. Of course we have $2^d \le C(d,n) \le 2^n$. The question is:

Question 2. Give the exact value of $C(d,n)$ for $2 \le d \le n$. If this is not possible, give lower and upper bounds as accurate as possible.

A well-known duality yields another formulation of Question 2. Let $\varphi : S \to 2^{\mathcal F}$, $a \mapsto \{E \in \mathcal F \,;\, a \in E\}$, and set $\mathcal S = \varphi(S)$. In this manner, we have for all $a \in S$ and all $E \in \mathcal F$:
$$a \in E \iff E \in \varphi(a). \tag{1}$$
One can check that $\mathcal F$ shatters $A \subseteq S$ if and only if, for every partition $(B, C)$ of $A$ (i.e. $A = B \cup C$ and $B \cap C = \emptyset$), the intersection
$$\Big(\bigcap_{b \in B} \varphi(b)\Big) \cap \Big(\bigcap_{c \in C} \overline{\varphi(c)}\Big)$$
is nonempty, where the notation $\overline{Y}$ stands for $\mathcal F \setminus Y$.

If $\operatorname{Ind}\mathcal F \ge 2$, then $\varphi$ is a one-to-one correspondence from $S$ to $\mathcal S$, hence $n = |\mathcal S| \le 2^{|\mathcal F|}$ and we have $\log n \le C(d,n)$ for all $2 \le d \le n$, where $\log$ denotes the logarithm in base 2.

The case $d = 2$. Using for instance the binary expansion, it is easy to show that the order of magnitude of $C(2,n)$ is actually $\log n$. The next statement refines this.

Proposition 2. If $n = \frac12\binom{2l}{l} = \binom{2l-1}{l-1}$, then $C(2,n) = 2l$.

Proof. (Recall the notation $\overline{A} = \mathcal F \setminus A$.) We first prove by contradiction that $C(2,n) > 2l - 1$. If a family $\mathcal F$ of subsets of $S$ shatters all 2-subsets of $S$, then the image $\mathcal S \subseteq 2^{\mathcal F}$ of $S$ by $\varphi$ must satisfy
$$\forall A \ne B \in \mathcal S, \quad A \cap B,\ A \cap \overline{B},\ \overline{A} \cap B, \text{ and } \overline{A} \cap \overline{B} \text{ are nonempty.} \tag{2}$$
In particular $\mathcal S$ is a Sperner family of $\mathcal F$ (i.e. an antichain for the partial order of inclusion; one finds several other expressions in the literature: 'Sperner system', 'independent system', 'clutter', 'completely separating system', etc.). For a survey on Sperner families and several generalizations, we refer e.g. to [1] and the references therein.

Assume now that $|\mathcal F| = 2l - 1$; it is known [6, 2, 3] that all Sperner families of $\mathcal F$ have a cardinal at most $\binom{2l-1}{l-1}$, and that there are only two Sperner families of maximal cardinal: the families $\binom{\mathcal F}{l-1}$ and $\binom{\mathcal F}{l}$, i.e. the families of all $(l-1)$-subsets, resp. all $l$-subsets, of $\mathcal F$. However, none of these families satisfies both $A \cap B$ and $\overline{A} \cap \overline{B}$ nonempty in (2). As a consequence, we must have $|\mathcal F| \ge 2l$.

Conversely, let $S = \{a_1, \dots, a_n\}$, consider $\binom{\{1, \dots, 2l\}}{l}$, the set of $l$-subsets of $\{1, \dots, 2l\}$, and choose one element in each pair of complementary $l$-subsets. We then obtain a family $\{A_1, \dots, A_n\}$ which satisfies (2). Now we set $\mathcal F = \{E_1, \dots, E_{2l}\}$, with $E_i = \{a_j \,;\, i \in A_j\}$. The characterization (1) shows that $\mathcal F$ shatters every 2-subset of $S$.

The proof of the following statement is straightforward.

Corollary 3. If $\binom{2l-1}{l-1} < n \le \binom{2l+1}{l}$, then $2l \le C(2,n) \le 2l + 2$.

The upper bound can be slightly improved: one can prove that, if $\binom{2l-1}{l-1} < n \le \binom{2l}{l-1}$, then $2l \le C(2,n) \le 2l + 1$.

Question 3. It seems that we have $C(2,n) = k$ if and only if $\binom{k-2}{\lfloor (k-1)/2 \rfloor - 1} < n \le \binom{k-1}{\lfloor k/2 \rfloor - 1}$, where $\lfloor x \rfloor$ denotes the integer part of $x$. Is it true? Is it already known?

The first values are $C(2,2) = C(2,3) = 4$, $C(2,4) = 5$, $C(2,5) = \cdots = C(2,10) = 6$. A computer search seems to be useless, at least with a naive treatment. Already in order to obtain $C(2,11) = 7$, we would have to verify that $C(2,11) > 6$, i.e. to find, for each of the $\binom{2^{11}}{6} \approx 10^{17}$ families $\mathcal F$ in $2^S$, some 2-subset that is not shattered by the family. (Alternatively, in the dual statement, we have to check "only" $\binom{2^6}{11} \approx 7.4 \cdot 10^{11}$ families $\mathcal S$ in $2^{\mathcal F}$.)

The case $d \ge 3$. From now on, we assume $n \ge 4$.

Proposition 4. For all $3 \le d < n$, we have $C(d,n) \le \frac{2^d}{d!} (3 \log n)^d$.

The constant 3 can be improved. The proof below shows that, for all $a > 1$ and $n$ large enough, $C(d,n) \le \frac{2^d}{d!} (a \log n)^d$.

Proof. Let $\mathcal F' \subset 2^S$ be a minimal separating system of $S$, i.e. such that, for all distinct $a, b \in S$, there exists $E_a^b \in \mathcal F'$ which satisfies $b \notin E_a^b \ni a$. Since this amounts to choosing $\mathcal F'$ minimal such that $\mathcal S = \varphi(S)$ is a Sperner family for $\mathcal F'$, we know that $|\mathcal F'| = N$ if and only if $\binom{N-1}{\lfloor (N-1)/2 \rfloor} < n \le \binom{N}{\lfloor N/2 \rfloor}$; in particular $N \le 3 \log n$ […]. For all disjoint subsets $B$ and $C$ of $S$ such that $|B \cup C| = d$, the set $E_B^C = \bigcap_{c \in C} \big( \bigcup_{b \in B} E_b^c \big)$ contains $B$ and does not meet $C$. Let $\mathcal F$ be the collection of all such sets $E_B^C$; then $\mathcal F$ shatters all subsets of $S$ of cardinal at most $d$.

To estimate $|\mathcal F|$, we consider $\mathcal F_k$, the collection of all such sets $E_B^C$ with $|B| = k$ (and thus $|C| = d - k$). We have $|\mathcal F_k| = \binom{N}{k}\binom{N-k}{d-k}$ (with $N = |\mathcal F'|$). Then we choose $\mathcal F = \bigcup_{k=0}^{d} \mathcal F_k$. We obtain
$$|\mathcal F| \le \sum_{k=0}^{d} \binom{N}{k}\binom{N-k}{d-k} = \binom{N}{d} 2^d \le \frac{2^d}{d!} N^d \le \frac{2^d}{d!} (3 \log n)^d.$$

Question 4. Is $(\log n)^{\lfloor d/2 \rfloor \lfloor (d+1)/2 \rfloor}$ the right order of magnitude for $C(d,n)$?

By constructing auxiliary Sperner families from $\mathcal S$, it is possible to give a better lower bound for $C(d,n)$ than only $C(d,n) \ge C(2,n)$. For instance, in the case $d = 3$, for all distinct $A, B, C \in \mathcal S$, we must have $A \cap B \not\subset C$. One can check that this implies that the family $\{A \cap B \,;\, A, B \in \mathcal S,\ A \ne B\}$ is a Sperner family, therefore we obtain $\binom{n}{2} \le \binom{C(3,n)}{\lfloor C(3,n)/2 \rfloor}$. Unfortunately, this does not modify the order of magnitude.
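To get a feeling for the gap between these bounds in the case $d = 3$, one can evaluate them numerically. The sketch below is an illustration under stated assumptions, not part of the original note: the function names are hypothetical, the lower bound is the smallest $m$ with $\binom{n}{2} \le \binom{m}{\lfloor m/2 \rfloor}$, and the two upper values simply evaluate the expressions $2^d\binom{N}{d}$ and $\frac{2^d}{d!}(3\log n)^d$ as they are written in the text, with $N$ the smallest integer such that $n \le \binom{N}{\lfloor N/2 \rfloor}$.

```python
from math import comb, factorial, log2


def min_sperner_ground(m):
    """Smallest N with comb(N, N // 2) >= m, i.e. the smallest ground set
    that carries a Sperner family of size m (Sperner's theorem)."""
    N = 0
    while comb(N, N // 2) < m:
        N += 1
    return N


def lower_bound_d3(n):
    """Smallest value of C(3, n) compatible with the inequality
    comb(n, 2) <= comb(C, C // 2) stated in the text."""
    return min_sperner_ground(comb(n, 2))


def proof_bound(d, n):
    """The quantity 2^d * comb(N, d) appearing in the proof of Proposition 4."""
    N = min_sperner_ground(n)
    return 2 ** d * comb(N, d)


def prop4_bound(d, n):
    """The bound (2^d / d!) * (3 * log2(n))^d of Proposition 4."""
    return 2 ** d / factorial(d) * (3 * log2(n)) ** d


# Columns: n, lower bound for C(3, n), proof bound, Proposition 4 bound.
for n in (10, 100, 1000, 10 ** 6):
    print(n, lower_bound_d3(n), proof_bound(3, n), round(prop4_bound(3, n)))
```

The lower column grows like $2\log n$ while the upper columns grow like $(\log n)^3$, which is the gap that the questions in this section ask about.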
Already in this case $d = 3$, we do not know whether $C(3,n)$ is of order $\log n$, $(\log n)^2$, or an intermediate order of magnitude. Another formulation is:

Question 5. Prove or disprove: There exists $C > 0$ such that, for all $k \in \mathbb N$, if $\mathcal F$ is a finite set of cardinal $k$ and $\mathcal S \subseteq 2^{\mathcal F}$ satisfies $A \cap B \not\subset C$ for all distinct $A, B, C \in \mathcal S$, then $|\mathcal S| \le C\, C^{\sqrt{k}}$.

References

[1] P. Borg, Intersecting families of sets and permutations: a survey. Int. J. Math. Game Theory Algebra 21 (2012) 543–559.
[2] G. Katona, On a conjecture of Erdős and a stronger form of Sperner's theorem. Studia Sci. Math. Hungar. […]
[3], [4] […] J. Combin. Theory […] J. Combin. Theory 25 (1972) 80–83.
[5] S. Shelah, A combinatorial problem, stability and order for models and theories in infinite languages, Pacific J. Math. 41 (1972) 247–261.
[6] E. Sperner, Ein Satz über Untermengen einer endlichen Menge, Math. Zeitschrift 27 (1928) 544–548.
[7] V. N. Vapnik and A. Y. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities