Improving Gebauer's construction of 3-chromatic hypergraphs with few edges

Abstract

In 1964 Erdős proved, by a randomized construction, that the minimum number of edges in a $k$-graph that is not two colorable is $O(k^2 2^k)$. To this day, it is not known whether there exist such $k$-graphs with a smaller number of edges. Known deterministic constructions use a much larger number of edges. The most recent one, by Gebauer, requires $2^{k+\Theta(k^{2/3})}$ edges. Applying a derandomization technique we reduce that number to $2^{k+\widetilde{\Theta}(k^{1/2})}$.


arXiv [cs.DM]

IMPROVING GEBAUER'S CONSTRUCTION OF 3-CHROMATIC HYPERGRAPHS WITH FEW EDGES

JAKUB KOZIK

1. Introduction

In 1964 Erdős proved in [1] that $(1+o(1))\frac{e\ln(2)}{4}k^2 2^k$ edges are sufficient to build a $k$-graph which is not two colorable. To this day that result provides the best known upper bound for the minimum number of edges in such a hypergraph. Erdős' bound results from the fact that a random $k$-graph with that number of edges, built on a set of $k^2/2$ vertices, is with positive probability not two colorable. The most recent deterministic construction of a $k$-graph that is not two colorable has been obtained by Gebauer [3]. It requires $2^{k+\Theta(k^{2/3})}$ edges. It is also the first construction in which the number of edges is $2^{k+o(k)}$. The main result of the current paper is an upgrade of this construction that allows us to cut down the number of edges to $2^{k+\Theta((k\log(k))^{1/2})}$.

Within the whole paper, $\log(\cdot)$ stands for the binary logarithm. We are only concerned with vertex two colorings of hypergraphs. A vertex coloring is proper if no edge is monochromatic. Following common convention we use colors red and blue.

2. Gebauer's construction

We start with recalling the construction of [3], as we are going to modify it. The whole procedure is parametrized by $t = t(k)$ that takes value roughly $k^{\alpha}$ for some optimized positive $\alpha < 1$. It is convenient to organize the vertices of the constructed hypergraph into a rectangular matrix $M$. Slightly abusing the notation, we use $M$ for both the matrix and the set of vertices. We use the same convention for submatrices of $M$. The length of the rows is denoted by $s$. Its value will be a subject of optimization.

2.1. Preliminary choice of rows.

Vertex coloring can be seen as assigning colors to the entries of the matrix. A color is dominating in a row if at least half of its entries are colored with it (there can be two dominating colors). The main part of the construction is designed to work with a submatrix of $t$ rows with the same dominating color. A matrix for which one of the colors is dominating in all rows will be called consistently dominated. We always assume that red is the dominating color in such a matrix.

The ground matrix $M_0$ has $2t-1$ rows, that is, $(2t-1)\cdot s$ vertices. Let $\mathcal{M}$ denote the set of submatrices of $M_0$ built of every $t$ rows. For every $M \in \mathcal{M}$ we apply the main construction described in the next section. The construction outputs a hypergraph $H_M$. The union of the edge sets of these hypergraphs forms the edge set of the resulting hypergraph. For every coloring of $M_0$ at least one submatrix $M \in \mathcal{M}$ is consistently dominated (among $2t-1$ rows, at least $t$ share a dominating color). The main construction guarantees that in such a case, $H_M$ contains a monochromatic edge.

Key words and phrases. Property B, Hypergraph Coloring, Deterministic Constructions.
This work was partially supported by Polish National Science Center (2016/21/B/ST6/02165).
Footnote: a $k$-graph is a $k$-uniform hypergraph.

2.2. Main construction.

Let $M \in \mathcal{M}$; recall that $M$ has $t$ rows. Our goal is to build a hypergraph $H_M$ on the vertex set $M$ such that for every consistently dominated coloring of $M$, there exists a monochromatic edge in $H_M$. For $(\sigma_1, \ldots, \sigma_t) \in [s]^t$, we denote by $M(\sigma_1, \ldots, \sigma_t)$ the matrix $M$ in which, for every $i \in [t]$, the $i$-th row has been cyclically shifted by $\sigma_i$.

The construction proceeds as follows. For every
(1) sequence of shifts $\sigma \in [s]^t$,
(2) and set of indices $I \subset [s]$ of size $k/t$,
add to $H_M$ an edge built from all elements of the columns of $M(\sigma)$ with indices in $I$.

Note that the edges of $H_M$ are of size $k$ as required.

Let us fix a consistently dominated coloring of $M$. We assume without loss of generality that red is the dominating color of the rows. When the sequence of shifts is chosen randomly, the probability that some fixed column is red is at least $2^{-t}$. As a consequence, for $s \geq (k/t)2^t$ the expected number of red columns is at least $k/t$. In particular, for some sequence of shifts, there exists a set of $k/t$ red columns. Hence the edge built for these shifts and columns is monochromatic.

2.3. Counting.
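Before counting, it may help to see the objects being counted on a toy instance. The following Python sketch is an illustration only and is not taken from [3]: it enumerates the pairs $(\sigma, I)$ of the main construction for a small matrix and checks by brute force that every consistently dominated coloring leaves a red edge. The function names, the 0-based indexing of rows, columns and shifts, and the toy parameters $t = 2$, $k = 4$, $s = 8$ are assumptions made for the example.

```python
from itertools import combinations, product
from math import comb


def build_edges(t, s, cols_per_edge):
    """One edge per pair (sigma, I); vertices are (row, original column) pairs."""
    edges = []
    for sigma in product(range(s), repeat=t):
        for cols in combinations(range(s), cols_per_edge):
            # Column c of M(sigma) holds entry (c - sigma[i]) mod s of row i.
            edges.append(frozenset((i, (c - sigma[i]) % s)
                                   for i in range(t) for c in cols))
    return edges


def every_dominated_coloring_has_red_edge(t, s, edges):
    """Brute force over all colorings in which red (1) fills >= half of each row."""
    rows = [r for r in product((0, 1), repeat=s) if 2 * sum(r) >= s]
    distinct = set(edges)  # different (sigma, I) pairs may give the same edge
    return all(any(all(c[i][j] for i, j in e) for e in distinct)
               for c in product(rows, repeat=t))


t, k, s = 2, 4, 8  # toy values (assumed) with s = (k/t) * 2**t
edges = build_edges(t, s, k // t)
ok = every_dominated_coloring_has_red_edge(t, s, edges)
print(len(edges), s ** t * comb(s, k // t))  # pairs enumerated vs. the formula
print(ok)
```

The enumeration is exponential in $t$ and $s$, so the sketch is only usable for very small parameters; its purpose is to make the quantity $s^t \binom{s}{k/t}$ tangible.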

We have
\[ \binom{2t-1}{t} < 2^{2t} \]
choices for the subset of rows in the preliminary step. Then, in the main construction, every sequence of $t$ elements of $[s]$ and a subset of $k/t$ elements of $[s]$ is used to build an edge. The number of choices is
\[ s^t \cdot \binom{s}{k/t} \leq s^t \cdot \left(\frac{e s}{k/t}\right)^{k/t}. \]
For $s = (k/t)2^t$ (we assume for simplicity that it is an integer) we obtain
\[ (k/t)^t\, 2^{t^2} \cdot e^{k/t}\, 2^k = 2^{t\log(k/t) + t^2 + (k/t)\log(e) + k}. \]
The total number of edges is smaller than
\[ 2^{2t + t\log(k/t) + t^2 + (k/t)\log(e) + k}. \]
Finally we choose $t$ so that the above exponent is minimized. That happens for $t = \Theta(k^{1/3})$. In the end we obtain that the total number of edges is $2^{k+\Theta(k^{2/3})}$.

3. Improved construction
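As a numerical baseline for the improvement, the exponent derived in Section 2.3 can be minimized by exhaustive search. The sketch below (Python, with hypothetical helper names) evaluates the overhead $2t + t\log(k/t) + t^2 + (k/t)\log(e)$ above the unavoidable $k$ and locates the minimizing integer $t$. Under these assumptions the minimizer should scale like $k^{1/3}$ and the overhead like $k^{2/3}$.

```python
from math import e, log2


def gebauer_overhead(k, t):
    """Exponent of the edge bound from Section 2.3, with the leading k removed."""
    return 2 * t + t * log2(k / t) + t * t + (k / t) * log2(e)


def best_t(k):
    """Integer t in [1, k] minimizing the overhead, found by exhaustive search."""
    return min(range(1, k + 1), key=lambda t: gebauer_overhead(k, t))


for k in (10**3, 10**4, 10**5):
    t = best_t(k)
    # Ratios to k^(1/3) and k^(2/3) should stay bounded as k grows.
    print(k, t, round(t / k ** (1 / 3), 2),
          round(gebauer_overhead(k, t) / k ** (2 / 3), 2))
```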

We modify only the main construction. Recall that we work with a matrix $M$ with $t$ rows. For a fixed consistently dominated coloring of $M$, a sequence of shifts $\sigma \in [s]^t$ is called good if $M(\sigma)$ contains at least $s 2^{-t}$ red columns. The set of good sequences for a coloring $C$ of $M$ is denoted by $\mathcal{G}(C)$.

If we fix a consistently dominated coloring of $M$ and choose the sequence of shifts $\sigma \in [s]^t$ uniformly at random, the expected number of red columns in $M(\sigma)$ is at least $s 2^{-t}$. That observation was used to justify that there exists a good sequence. However, it also suggests that a large number of shift sequences might be good. For the constructed hypergraph not to be two colorable, it is sufficient that for every consistently dominated coloring of $M$, at least one such sequence is used in the main construction.

We apply derandomization techniques to construct a relatively small set of sequences of shifts that can be used in the main construction instead of $[s]^t$. For a family of sets $\mathcal{F}$, a set that intersects every element of that family is called a hitting set for $\mathcal{F}$. In these terms we are looking for a small hitting set for the family
\[ \mathcal{G}_M = \{\mathcal{G}(C) : C \text{ is a consistently dominated coloring of } M\}. \]
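The notions of good sequences and hitting sets can be made concrete on a toy matrix. The Python sketch below computes $\mathcal{G}(C)$ for every consistently dominated coloring of a $2 \times 4$ matrix and then extracts a hitting set greedily. The greedy heuristic is only a stand-in chosen for illustration; it is not the expander-based construction of [4]. Function names and 0-based shifts are assumptions of the example.

```python
from itertools import product


def red_columns(coloring, sigma):
    """Number of all-red columns of M(sigma); coloring[i][c] == 1 means red."""
    s = len(coloring[0])
    return sum(all(row[(c - sh) % s] for row, sh in zip(coloring, sigma))
               for c in range(s))


def good_sets(t, s):
    """G(C) for every consistently dominated coloring C of a t x s matrix."""
    rows = [r for r in product((0, 1), repeat=s) if 2 * sum(r) >= s]
    shifts = list(product(range(s), repeat=t))
    threshold = s / 2 ** t
    return [frozenset(sig for sig in shifts if red_columns(c, sig) >= threshold)
            for c in product(rows, repeat=t)]


def greedy_hitting_set(family):
    """Repeatedly pick the element contained in the most sets not yet hit.

    Assumes every set in the family is non-empty (true here by averaging).
    """
    unhit, chosen = list(family), []
    while unhit:
        counts = {}
        for g in unhit:
            for x in g:
                counts[x] = counts.get(x, 0) + 1
        best = max(counts, key=counts.get)
        chosen.append(best)
        unhit = [g for g in unhit if best not in g]
    return chosen


family = good_sets(2, 4)
hitting = greedy_hitting_set(family)
print(len(family), len(hitting), 4 ** 2)  # colorings, hitting set size, |[s]^t|
```

On this toy instance the hitting set is much smaller than $[s]^t$, partly because a cyclic shift of the whole matrix does not change the number of red columns.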

3.1. Sequential choice of shifts.

We start with estimating the size of the set of good shift sequences. While it is not directly used in our construction, it provides a good opportunity to introduce some tools. It will also allow us to derive a probabilistic argument that small hitting sets actually exist.

The property of being good is generalized to prefixes in the straightforward way: a sequence of shifts $(\sigma_1, \ldots, \sigma_i)$ is good if the matrix trimmed to the first $i$ rows and shifted according to the sequence has at least $s 2^{-i}$ red columns.

Suppose that $(\sigma_1, \ldots, \sigma_i)$ is good. We want to estimate the number of possible choices of $\sigma_{i+1}$ for which $(\sigma_1, \ldots, \sigma_i, \sigma_{i+1})$ is good as well. If the coloring of the $(i+1)$-th row was "random", then about half of the choices would be right, and almost all of the choices would be almost right. That property does not hold in the worst case scenario and hence we are going to work with relaxed definitions.

For $\varepsilon > 0$, a sequence of shifts $(\sigma_1, \ldots, \sigma_i)$ is $\varepsilon$-good if the number of red columns in the shifted matrix trimmed to the first $i$ rows is at least
\[ s\left(\frac{1-\varepsilon}{2}\right)^i. \]
Then, every $\varepsilon$-good sequence of shifts of length $t$ gives a shifted matrix with at least
\[ s(1-\varepsilon)^t 2^{-t} \]
red columns. For $s \geq e(k/t)2^t$ and $\varepsilon = 1/t$, the number of red columns is at least $k/t$ as needed. In the modified construction we set $s$ to $\lceil e(k/t)2^t \rceil$.

We also define $\mathcal{G}^{\varepsilon}(C)$ as the set of $\varepsilon$-good sequences for a coloring $C$ of $M$ and $\mathcal{G}^{\varepsilon}_M$ as $\{\mathcal{G}^{\varepsilon}(C) : C \text{ is a consistently dominated coloring of } M\}$.

The following proposition is used to derive a lower bound for the number of $\varepsilon$-good sequences. It is formulated in more general terms than needed here, but we are going to use it again later. For a set $A \subset [s]$ and a number $x$, the set $A + x$ is defined as the set $A$ shifted cyclically within $[s]$ by $x$, formally $A + x = \{(a - x) \pmod{s} + 1 : a \in A\}$. The purely technical proof of the proposition is moved to Appendix A.

Proposition 1.

For any positive $\varepsilon < 1$ and sets $A, B \subset [s]$, let $\alpha = |B|/s$. There exist at least
\[ \frac{\varepsilon}{1-(1-\varepsilon)\alpha}\,\alpha s \]
elements $x \in [s]$ for which $|(A+x) \cap B| \geq (1-\varepsilon)\alpha|A|$.

For $|B| \geq s/2$ the proposition gives at least $\frac{\varepsilon}{1+\varepsilon}s$ elements $x \in [s]$ for which $|(A+x) \cap B| \geq (1-\varepsilon)|A|/2$. It follows that the number of $\varepsilon$-good sequences of length $j$ is at least
\[ \left(\frac{\varepsilon}{1+\varepsilon}\,s\right)^j. \]
(For a fixed $j$, and some $\varepsilon$-good sequence $\sigma$ of length $j-1$, let $A$ be the set of indices of the red columns in the matrix trimmed to the first $j-1$ rows and shifted according to $\sigma$, and $B$ be the set of indices of red entries of the $j$-th row.)

For $j = t$ we get a lower bound for the number of $\varepsilon$-good sequences. Once we have that bound, a typical application of the probabilistic method (along the lines of the proof from [1]) allows us to prove that there exists a hitting set for $\mathcal{G}^{\varepsilon}_M$ of size $2^{O(t\log(t))}$ (see Appendix B). We are interested, however, in a deterministic construction.

3.2. Expanders for hitting sets.
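Before turning to expanders, Proposition 1 can be sanity-checked by brute force on small random sets. The Python sketch below counts the shifts $x$ for which $|(A+x) \cap B| \geq (1-\varepsilon)\alpha|A|$ and compares the count with the bound $\frac{\varepsilon\alpha}{1-(1-\varepsilon)\alpha}s$. It assumes 0-based residues and the shift convention $a \mapsto a + x \pmod{s}$; since $x$ ranges over all residues, the count does not depend on that convention. All names are hypothetical.

```python
import random


def shift_set(a, x, s):
    """A shifted cyclically by x, with elements taken as residues 0..s-1."""
    return {(v + x) % s for v in a}


def count_good_shifts(a, b, s, eps):
    """Number of x with |(A + x) & B| >= (1 - eps) * (|B| / s) * |A|."""
    need = (1 - eps) * (len(b) / s) * len(a)
    return sum(len(shift_set(a, x, s) & b) >= need for x in range(s))


def proposition_bound(b_size, s, eps):
    """Lower bound of Proposition 1 on the number of such shifts."""
    alpha = b_size / s
    return eps * alpha * s / (1 - (1 - eps) * alpha)


random.seed(1)
for _ in range(5):
    s = random.randint(8, 24)
    a = set(random.sample(range(s), random.randint(1, s)))
    b = set(random.sample(range(s), random.randint(1, s)))
    eps = random.uniform(0.1, 0.9)
    print(s, count_good_shifts(a, b, s, eps),
          round(proposition_bound(len(b), s, eps), 2))
```

In every printed row the count in the second column should be at least the bound in the third column.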

Linial, Luby, Saks and Zuckerman [4] worked on deterministic constructions of small hitting sets for combinatorial rectangles. We summarize in this section those of their results that are relevant for our developments. We follow closely their definitions.

A graph $G = (V, E)$ is an $(m, \Delta, \alpha)$-expander if it has $m$ vertices, maximum degree $\Delta$ and for any $A \subset V$, the fraction of vertices in $V - A$ that have a neighbor in $A$ is at least $\alpha|A|/m$. For a fixed graph $G$ let $W_r$ denote the set of walks in $G$ of length $r$. Let $W_{r,d}$ be the set of subsequences of elements of $W_r$ of length $d$ (not necessarily subsequences of consecutive elements). A set $R \subset [m]^d$ is a combinatorial rectangle if it is of the form $R_1 \times \ldots \times R_d$ for some $R_1, \ldots, R_d \subset [m]$. The volume of a rectangle $R$, denoted by $\mathrm{vol}(R)$, is defined as $|R|/m^d$.

Lemma 2 ([4]). Let $m, d$ be positive integers and $R$ be a rectangle in $[m]^d$. Suppose $G$ is an $(m, \Delta, \alpha)$-expander with $1/2 > \alpha > 0$. If $r = 1 + (4/\alpha)(d + \log(1/\mathrm{vol}(R)))$, then $W_{r,d}$ contains a point from $R$.

The above lemma implies that a specific set of sequences $W_{r,d}$ hits every combinatorial rectangle in $[m]^d$ of sufficiently large volume.

The following rough estimations for the size of $W_{r,d}$ will be sufficient for our needs. We have
\[ |W_r| \leq m(\Delta+1)^r \quad\text{and}\quad |W_{r,d}| < 2^r |W_r| \leq m(2(\Delta+1))^r. \]

Lemma 2 leaves some space for the choice of the expander graph. The authors of [4] used the construction of Margulis [5] (see also [2]) which allows one to build an expander with $\Delta = 8$ and $\alpha = (2-\sqrt{3})/4$. A minor inconvenience is that the construction requires the number of vertices to be a perfect square. However, as observed already in [4], we can consider the rectangles of our interest as subsets of a larger space $[m']^d$, and apply the lemma in that space. For every $m$ we can choose a number $m'$ that is a perfect square and satisfies $m \leq m' \leq 2m$. While that change affects the volumes of rectangles, they get smaller at most by a factor of $2^{-d}$. For our purposes this cost is negligible.

When we are interested in rectangles of volume at least $V$, Lemma 2 instructs us to take
\[ r = r(d, V) = 1 + (4/\alpha)(d + \log(2^d/V)). \]
For some specific constant $\hat{C}$ and for all positive $d$ and $V$ we have
\[ r(d, V) \leq \hat{C}(d + \log(1/V)). \]

Corollary 3.

There exists a constant $C > 0$ such that, for all integers $m, d$, and $V > 0$ there exists a subset of $[m]^d$ of size at most $m \cdot 2^{C(d+\log(1/V))}$ that intersects every combinatorial rectangle in $[m]^d$ of volume at least $V$.

We apply that result to construct a small hitting set for $\mathcal{G}^{\varepsilon}_M$. That set is then used in the modified main construction instead of the set of all shift sequences.

3.3. Under false assumption.

Unfortunately, for a fixed consistently dominated coloring of $M$, the set of good or $\varepsilon$-good shift sequences does not need to form a combinatorial rectangle. It is instructive to pretend for a moment that it does. We assume (falsely) in this subsection that $\mathcal{G}^{\varepsilon}_M$ contains only combinatorial rectangles.

By the discussion that follows Proposition 1, for every consistently dominated coloring of $M$, the set of $\varepsilon$-good shift sequences has volume at least
\[ \nu = \left(\frac{\varepsilon}{1+\varepsilon}\right)^t. \]
By Corollary 3 there exists a hitting set $HS$ for all rectangles of volume at least $\nu$ of size at most
\[ s \cdot 2^{C(t+\log(1/\nu))}. \]
For $\varepsilon = 1/t$ and $s = \lceil e(k/t)2^t \rceil$, the size of $HS$ is at most $2^{2Ct\log(t)}$ (assuming that $t$ is sufficiently large).

Note that in the original construction all possible shift sequences were used. Using the set $HS$ instead of $[s]^t$ and choosing $t = (k/\log(k))^{1/2}$, the total number of edges becomes
\[ 2^{k+O((k\log(k))^{1/2})}. \]
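The trade-off behind the choice of $t$ can be explored numerically. The Python sketch below balances a hitting-set term $c \cdot t\log(t)$ against the term $(k/t)\log(e)$ coming from the choice of columns. The constant `c` is a hypothetical stand-in for the unspecified constant of Corollary 3, and lower-order terms are ignored, so the numbers are only indicative. Under these assumptions the minimizer should be of order $(k/\log(k))^{1/2}$, the choice of $t$ made in Section 3.6, and the minimum of order $(k\log(k))^{1/2}$.

```python
from math import e, log2, sqrt


def overhead(k, t, c=1.0):
    """c * t * log2(t) for the shift sequences plus (k/t) * log2(e) for columns."""
    return c * t * log2(t) + (k / t) * log2(e)


def best_t(k, c=1.0):
    """Integer t in [2, k] minimizing the simplified overhead."""
    return min(range(2, k + 1), key=lambda t: overhead(k, t, c))


for k in (10**3, 10**4, 10**5):
    t = best_t(k)
    # Ratios to sqrt(k / log k) and sqrt(k log k) should stay bounded.
    print(k, t, round(t / sqrt(k / log2(k)), 2),
          round(overhead(k, t) / sqrt(k * log2(k)), 2))
```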

3.4. Decomposing good shift sequences.

We showed in Section 3.1 that, for every consistently dominated coloring of $M$, the set of $\varepsilon$-good shift sequences is large. While, in general, it does not have the structure of a combinatorial rectangle, in some sense it can be decomposed into a small number of such. We start by altering the way that the sequences of shifts are represented. For the clarity of the exposition we assume that $t$ is a power of 2.

Let $T$ be a rooted plane complete binary tree with $t$ leaves. A subtree rooted at some internal node of $T$ consists of that node and all its descendants. A node of $T$ is at level $j$ if its distance to the set of leaves is $j$. Let $S_j$ be the set of inner nodes at level $j$. Note that $|S_j| = t 2^{-j}$; we denote that value by $d_j$. For $h = \log(t)$, the tree has $h+1$ levels with all the leaves on level 0.

We associate leaves of $T$ with rows of $M$ in such a way that the $i$-th leaf from the left corresponds to the $i$-th row. Inner nodes of the tree are going to be labeled by elements of $[s]$. These labels represent the relative shifts between neighboring rows of $M$. For an inner node $v$, if $l$ is the rightmost leaf of the left subtree of $v$ and $r$ is the leftmost leaf of the right subtree of $v$, then the label of $v$ describes how row $r$ is shifted with respect to row $l$.

A labeling of a subtree rooted at node $v$ is $\varepsilon$-good if, for $r$ being the number of descendant leaves of $v$, the submatrix of the rows that correspond to these leaves, shifted according to the labels of the inner nodes of the subtree, has at least $s((1-\varepsilon)/2)^r$ red columns. Note that $\varepsilon$-good labelings of the whole tree correspond to $\varepsilon$-good sequences (up to a cyclic shift of the whole matrix, which is clearly redundant in the original construction).

We order the nodes of $S_j$ from left to right and represent labelings of the nodes of $S_j$ as elements of $[s]^{d_j}$. We are going to work bottom up and label inner nodes in groups consisting of the nodes of the same level. A labeling of $T$ is $\varepsilon$-good up to level $j$ if all the subtrees rooted at level at most $j$ are $\varepsilon$-good. In all the places where we use this definition, it can be assumed that the labeling is undefined for the nodes of higher levels. Suppose that $\tau$ is a labeling of $T$ that is $\varepsilon$-good up to level $j-1$. Then, a sequence of labels $\sigma \in [s]^{d_j}$ is called an $\varepsilon$-good level $j$ extension (of $\tau$) if the labeling $\tau$ in which the labels of the nodes of level $j$ have been set to $\sigma$ is $\varepsilon$-good up to level $j$.

Proposition 4.

Suppose that a labeling of $T$ is $\varepsilon$-good up to level $j-1$. Then, the set of its $\varepsilon$-good level $j$ extensions forms a combinatorial rectangle of volume at least
\[ \nu_j = \left(\varepsilon\left(\frac{1-\varepsilon}{2}\right)^{2^{j-1}}\right)^{d_j}. \]

Proof.

Fix $j$ and suppose that labeling $\tau$ is $\varepsilon$-good up to level $j-1$. We want to assign labels to the nodes of $S_j$ in such a way that all the subtrees rooted at level $j$ are $\varepsilon$-good as well. Note that for any pair of distinct nodes of level $j$, the properties of the corresponding subtrees of being $\varepsilon$-good are determined by disjoint sets of rows of the underlying matrix. That justifies that the set of $\varepsilon$-good level $j$ extensions forms a combinatorial rectangle.

Let $v$ be a node of $S_j$ and let $A$ and $B$ be the sets of indices of red columns respectively in the shifted submatrices corresponding to the left and right subtrees of $v$. By the assumptions we know that both these sets have cardinality at least $s((1-\varepsilon)/2)^{2^{j-1}}$. We need to estimate the number of $x \in [s]$ for which the set $A \cap (B + x)$ has cardinality at least $s((1-\varepsilon)/2)^{2^j}$. Proposition 1 gives that there exist at least
\[ \varepsilon\left(\frac{1-\varepsilon}{2}\right)^{2^{j-1}} s \]
such values. We obtain that the volume of the combinatorial rectangle of $\varepsilon$-good level $j$ extensions is at least
\[ \left(\varepsilon\left(\frac{1-\varepsilon}{2}\right)^{2^{j-1}}\right)^{d_j}. \qquad \square \]

(Footnote to the definition of $T$: a complete binary tree is one in which all the internal nodes have two children, left and right, and all the leaves are at the same distance from the root.)

By Corollary 3, there exists a set $HS_j$ of cardinality at most $s \cdot 2^{C(d_j+\log(1/\nu_j))}$ that is a hitting set for the family of sets of $\varepsilon$-good level $j$ extensions for labelings that are $\varepsilon$-good up to level $j-1$. That implies the following proposition.

Proposition 5.

The set $HS = HS_1 \times \ldots \times HS_h$ is a hitting set for the family of sets of $\varepsilon$-good labelings of $T$.

It remains to estimate the size of $HS$. We have
\[ |HS| \leq \prod_{j=1\ldots h} s \cdot 2^{C(d_j+\log(1/\nu_j))} = s^{\log(t)} \cdot 2^{C\sum_{j=1\ldots h}(d_j+\log(1/\nu_j))} < s^{\log(t)} \cdot 2^{Ct} \cdot 2^{C\sum_{j=1\ldots h}\log(1/\nu_j)}, \]
and
\[ \sum_{j=1\ldots h}\log(1/\nu_j) = \sum_{j=1\ldots h} d_j\left(\log(1/\varepsilon) + 2^{j-1}\log(2/(1-\varepsilon))\right) < t\log(1/\varepsilon) + t\sum_{j=1\ldots h}\log(4) \quad (\text{for } \varepsilon < 1/2) \]
\[ = t\cdot\log(1/\varepsilon) + 2t\cdot\log(t). \]
Therefore, for our parametrization (i.e. $s = \lceil e(k/t)2^t \rceil$ and $\varepsilon = 1/t$), and for all sufficiently large $t$ we get
\[ |HS| \leq 2^{4t\log(t)}. \]

3.5. Modified main construction.

Let $HS$ be the set from Proposition 5. As we already observed, labelings of $T$ correspond to shift sequences up to a cyclic shift of the whole matrix. For a labeling $\tau$ let $\sigma(\tau)$ be a shift sequence that is compatible with $\tau$. Observe that if $\tau$ is an $\varepsilon$-good labeling, then $\sigma(\tau)$ is an $\varepsilon$-good shift sequence. Recall that we chose $s = \lceil e(k/t)2^t \rceil$ so that if $\sigma$ is an $\varepsilon$-good sequence for some consistently dominated coloring of $M$, then $M(\sigma)$ has at least $k/t$ red columns.

The modified main construction proceeds as follows. For every
(1) labeling of the tree $\tau \in HS$,
(2) and set of indices $I \subset [s]$ of size $k/t$,
add to $H_M$ an edge built from all elements of the columns of $M(\sigma(\tau))$ with indices in $I$.

By Proposition 5, for every consistently dominated coloring of $M$, at least one $\varepsilon$-good labeling $\tau$ is used in the construction. Then, for every such coloring, the matrix $M$ shifted according to $\sigma(\tau)$ has at least $k/t$ red columns. As a consequence at least one of the edges of $H_M$ is monochromatic.
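One way to obtain a shift sequence $\sigma(\tau)$ compatible with a labeling $\tau$ is to read the label of each inner node as the relative shift between two consecutive rows and then accumulate these relative shifts. The Python sketch below does this under several assumptions that are not fixed by the text: rows and shifts are 0-based, the first row is left unshifted (this fixes the redundant cyclic shift of the whole matrix), and a label is added to the shift of the preceding row. The function name and the input layout are hypothetical.

```python
def labels_to_shifts(levels, s):
    """Convert labels of inner nodes into a shift sequence sigma(tau).

    levels[j - 1] lists the labels of the inner nodes at level j, left to
    right, so it has t / 2**j entries for a tree with t >= 2 leaves
    (t a power of 2).
    """
    t = 2 * len(levels[0])
    relative = [0] * t  # relative[i]: shift of row i with respect to row i - 1
    for j, labels in enumerate(levels, start=1):
        assert len(labels) == t >> j, "level sizes must halve"
        for p, label in enumerate(labels):
            # The p-th node of level j spans leaves p*2^j .. (p+1)*2^j - 1;
            # its label relates the last leaf of the left half to the first
            # leaf of the right half.
            first_right_leaf = p * 2 ** j + 2 ** (j - 1)
            relative[first_right_leaf] = label
    shifts, current = [], 0
    for r in relative:
        current = (current + r) % s
        shifts.append(current)
    return shifts


# Tree with 4 leaves: level 1 has labels (1, 2), the root has label 3.
print(labels_to_shifts([[1, 2], [3]], 5))
```

Every pair of consecutive rows is separated by exactly one inner node, so the $t-1$ labels determine the sequence once the shift of the first row is fixed.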

3.6. Counting.

Just like in the original construction, we have less than $2^{2t}$ choices for the subset of rows in the preliminary step. Then, in the modified main construction, we use every sequence of $HS$ with every subset of $k/t$ elements of $[s]$ to build an edge. The number of choices is smaller than
\[ 2^{4t\log(t)} \cdot \binom{s}{k/t} < 2^{4t\log(t)} \cdot \left(\frac{e s}{k/t}\right)^{k/t}. \]
Substituting the value of $s$ we obtain a value that is smaller than
\[ 2 \cdot 2^{4t\log(t)} \cdot e^{2k/t}\, 2^k = 2^{1+4t\log(t)+(2k/t)\log(e)+k}. \]
The bound is multiplied by 2 to compensate for the ceiling in the definition of $s$. Taking into account the preliminary choices of rows, the total number of edges is smaller than
\[ 2^{2t+1+4t\log(t)+(2k/t)\log(e)+k}. \]
For $t = (k/\log(k))^{1/2}$, the total number of edges becomes $2^{k+\Theta((k\log(k))^{1/2})}$.

References

1. Paul Erdős, On a combinatorial problem. II, Acta Mathematica Academiae Scientiarum Hungaricae (1964), 445–447.
2. Ofer Gabber and Zvi Galil, Explicit constructions of linear-sized superconcentrators, J. Comput. System Sci. (1981), no. 3, 407–420, Special issue dedicated to Michael Machtey. MR 633542
3. Heidi Gebauer, On the construction of 3-chromatic hypergraphs with few edges, Journal of Combinatorial Theory. Series A (2013), no. 7, 1483–1490.
4. Nathan Linial, Michael Luby, Michael Saks, and David Zuckerman, Efficient construction of a small hitting set for combinatorial rectangles in high dimension, Combinatorica (1997), no. 2, 215–234. MR 1479299
5. G. A. Margulis, Explicit constructions of expanders, Problemy Peredači Informacii (1973), no. 4, 71–80. MR 0484767

Appendix A. Proof of Proposition 1

Proof.

Let the random variable $X$ denote the size of $(A + x) \cap B$, when $x \in [s]$ is chosen uniformly at random. By the fact that $|B| = \alpha s$ and linearity of expectation we obtain
\[ E(X) = \alpha|A|. \]
From the definition of $X$, we get also $X \leq |A|$. We can observe now that a distribution that minimizes $\Pr[X \geq (1-\varepsilon)\alpha|A|]$ and satisfies the above conditions is supported only by values $(1-\varepsilon)\alpha|A|$ and $|A|$. There is only one such distribution that satisfies $E(X) = \alpha|A|$. Straightforward calculations give
\[ \Pr[X \geq (1-\varepsilon)\alpha|A|] \geq \frac{\varepsilon\alpha}{1-(1-\varepsilon)\alpha}. \qquad \square \]

Appendix B. Small hitting sets exist

Recall that, for a fixed consistently dominated coloring of a matrix $M$ with $t$ rows, the volume of the set of $\varepsilon$-good sequences is at least
\[ p = \left(\frac{\varepsilon}{1+\varepsilon}\right)^t. \]
The volume is exactly the probability that a uniformly random sequence is $\varepsilon$-good. Let $S$ be a set built from $m$ uniformly and independently sampled random sequences from $[s]^t$. (Since the sequences are sampled with repetitions, it may happen that $|S| < m$.) The following formula upper bounds the expected number of consistently dominated colorings of $M$ for which the set of $\varepsilon$-good sequences is not hit by $S$:
\[ 2^{st} \cdot (1-p)^m < \exp(st\ln(2) - mp). \]
Therefore, whenever $st\ln(2) - mp \leq 0$, some set of $m$ sequences hits all the sets of $\varepsilon$-good sequences for consistently dominated colorings. For $s = \lceil e(k/t)2^t \rceil$ and $\varepsilon = 1/t$ it is sufficient to take $m$ of the order $2^{O(t\log(t))}$ to satisfy the inequality. As a consequence there exists a hitting set for $\mathcal{G}^{\varepsilon}_M$ of size $2^{O(t\log(t))}$.

Theoretical Computer Science Department, Faculty of Mathematics and Computer Science, Jagiellonian University, Kraków, Poland
