Induced Ramsey-type theorems
Jacob Fox ∗   Benny Sudakov †

Abstract
We present a unified approach to proving Ramsey-type theorems for graphs with a forbidden induced subgraph which can be used to extend and improve the earlier results of Rödl, Erdős-Hajnal, Prömel-Rödl, Nikiforov, Chung-Graham, and Łuczak-Rödl. The proofs are based on a simple lemma (generalizing one by Graham, Rödl, and Ruciński) that can be used as a replacement for Szemerédi's regularity lemma, thereby giving much better bounds. The same approach can also be used to show that pseudo-random graphs have strong induced Ramsey properties. This leads to explicit constructions for upper bounds on various induced Ramsey numbers.
Ramsey theory refers to a large body of deep results in mathematics concerning partitions of large structures. Its underlying philosophy is captured succinctly by the statement that "In a large system, complete disorder is impossible." This is an area in which a great variety of techniques from many branches of mathematics are used and whose results are important not only to graph theory and combinatorics but also to logic, analysis, number theory, and geometry. Since the publication of the seminal paper of Ramsey [43] in 1930, this subject has grown with increasing vitality, and is currently among the most active areas in combinatorics.

For a graph $H$, the Ramsey number $r(H)$ is the least positive integer $n$ such that every two-coloring of the edges of the complete graph $K_n$ on $n$ vertices contains a monochromatic copy of $H$. Ramsey's theorem states that $r(H)$ exists for every graph $H$. A classical result of Erdős and Szekeres [26], which is a quantitative version of Ramsey's theorem, implies that $r(K_k) \le 2^{2k}$ for every positive integer $k$. Erdős [19] showed using probabilistic arguments that $r(K_k) > 2^{k/2}$ for $k > 2$. Over the last sixty years, there have been several improvements on the lower and upper bounds of $r(K_k)$, the most recent by Conlon [15]. However, despite efforts by various researchers, the constant factors in the exponents of these bounds remain the same.

A subset of vertices of a graph is homogeneous if it is either an independent set (empty subgraph) or a clique (complete subgraph). For a graph $G$, denote by $\hom(G)$ the size of the largest homogeneous subset of vertices of $G$. A restatement of the Erdős-Szekeres result is that every graph $G$ on $n$ vertices satisfies $\hom(G) \ge \frac{1}{2}\log n$, while the Erdős result says that for each $n \ge 2$ there is a graph $G$ on $n$ vertices with $\hom(G) \le 2\log n$. (Here, and throughout the paper, all logarithms are base 2.) The only known proofs of the existence of Ramsey graphs, i.e., graphs for which $\hom(G) = O(\log n)$, come from various models of random graphs with edge density bounded away from 0 and 1. This supports the belief that any graph with small $\hom(G)$ looks 'random' in one sense or another. There are now several results which show that Ramsey graphs have random-like properties.

∗ Department of Mathematics, Princeton, Princeton, NJ. Email: [email protected]. Research supported by an NSF Graduate Research Fellowship and a Princeton Centennial Fellowship.
† Department of Mathematics, Princeton, Princeton, NJ. Email: [email protected]. Research supported in part by NSF CAREER award DMS-0546523, NSF grant DMS-0355497 and by a USA-Israeli BSF grant.

A graph $H$ is an induced subgraph of a graph $G$ if $V(H) \subset V(G)$ and two vertices of $H$ are adjacent if and only if they are adjacent in $G$. A graph is $k$-universal if it contains all graphs on at most $k$ vertices as induced subgraphs. A basic property of large random graphs is that they almost surely are $k$-universal. There is a general belief that graphs which are not $k$-universal are highly structured.
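The definition of $\hom(G)$ above can be made concrete with a small brute-force computation. The following is only an illustrative sketch: the function names `is_homogeneous`, `hom`, and `cycle` are hypothetical, graphs are assumed to be given as dictionaries mapping each vertex to its set of neighbors, and the exhaustive search is exponential in the number of vertices, so it is meant for tiny examples only.

```python
from itertools import combinations


def is_homogeneous(adj, subset):
    """Return True if `subset` spans a clique or an independent set in `adj`."""
    pairs = list(combinations(subset, 2))
    edges = sum(1 for u, v in pairs if v in adj[u])
    return edges == 0 or edges == len(pairs)


def hom(adj):
    """Size of the largest homogeneous vertex subset, found by exhaustive search."""
    vertices = list(adj)
    for size in range(len(vertices), 0, -1):
        if any(is_homogeneous(adj, s) for s in combinations(vertices, size)):
            return size
    return 0


def cycle(n):
    """Adjacency sets of the cycle on vertices 0, ..., n-1 (assumes n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
```

For instance, the 5-cycle has neither a triangle nor an independent set of size 3, so `hom(cycle(5))` evaluates to 2.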
In particular, they should contain a homogeneous subset which is much larger than that guaranteed by the Erdős-Szekeres bound for general graphs.

In the early 1970's, an important generalization of Ramsey's theorem, known as the Induced Ramsey Theorem, was discovered independently by Deuber [16], Erdős, Hajnal, and Pósa [25], and Rödl [44]. It states that for every graph $H$ there is a graph $G$ such that in every 2-edge-coloring of $G$ there is an induced copy of $H$ whose edges are monochromatic. The least positive integer $n$ for which there is an $n$-vertex graph with this property is called the induced Ramsey number $r_{\mathrm{ind}}(H)$. All of the early proofs of the Induced Ramsey Theorem give enormous upper bounds on $r_{\mathrm{ind}}(H)$. It is still a major open problem to prove good bounds on induced Ramsey numbers. Ideally, we would like to understand conditions for a graph $G$ to have the property that in every two-coloring of the edges of $G$, there is an induced copy of the graph $H$ that is monochromatic.

In this paper, we present a unified approach to proving Ramsey-type theorems for graphs with a forbidden induced subgraph which can be used to extend and improve results of various researchers. The same approach is also used to prove new bounds on induced Ramsey numbers. In the next few sections we present our theorems in full detail and compare them with previously obtained results.

H-free graphs

As we already mentioned, there are several results (see, e.g., [27, 47, 2, 10]) which indicate that Ramsey graphs, graphs $G$ with relatively small $\hom(G)$, have random-like properties. The first advance in this area was made by Erdős and Szemerédi [27], who showed that the Erdős-Szekeres bound $\hom(G) \ge \frac{1}{2}\log n$ can be improved for graphs which are very sparse or very dense. The edge density of a graph $G$ is the fraction of pairs of distinct vertices of $G$ that are edges.
The Erdős-Szemerédi theorem states that there is an absolute positive constant $c$ such that $\hom(G) \ge \frac{c \log n}{\epsilon \log (1/\epsilon)}$ for every graph $G$ on $n$ vertices with edge density $\epsilon \in (0, 1/2)$. By a theorem of Rödl, if a graph is not $k$-universal with $k$ fixed, then it contains a linear-sized induced subgraph that is very sparse or very dense. A graph is called $H$-free if it does not contain $H$ as an induced subgraph. More precisely, Rödl's theorem says that for each graph $H$ and $\epsilon \in (0, 1/2)$ there is a positive constant $\delta(\epsilon, H)$ such that every $H$-free graph on $n$ vertices contains an induced subgraph on at least $\delta(\epsilon, H) n$ vertices with edge density either at most $\epsilon$ or at least $1 - \epsilon$. Together with the theorem of Erdős and Szemerédi, it shows that the Erdős-Szekeres bound can be improved by any constant factor for any family of graphs that have a forbidden induced subgraph.

Rödl's proof uses Szemerédi's regularity lemma [48], a powerful tool in graph theory, which was introduced by Szemerédi in his celebrated proof of the Erdős-Turán conjecture on long arithmetic progressions in dense subsets of the integers. The regularity lemma roughly says that every large graph can be partitioned into a small number of parts such that the bipartite subgraph between almost every pair of parts is random-like. To properly state the regularity lemma requires some terminology. The edge density $d(X, Y)$ between two subsets of vertices of a graph $G$ is the fraction of pairs $(x, y) \in X \times Y$ that are edges of $G$, i.e., $d(X, Y) = \frac{e(X,Y)}{|X||Y|}$, where $e(X, Y)$ is the number of edges with one endpoint in $X$ and the other in $Y$. A pair $(X, Y)$ of vertex sets is called $\epsilon$-regular if for every $X' \subset X$ and $Y' \subset Y$ with $|X'| > \epsilon |X|$ and $|Y'| > \epsilon |Y|$ we have $|d(X', Y') - d(X, Y)| < \epsilon$. A partition $V = \bigcup_{i=1}^{k} V_i$ is called equitable if $\big| |V_i| - |V_j| \big| \le 1$ for all $i, j$.

Szemerédi's regularity lemma [48] states that for each $\epsilon > 0$, there is a positive integer $M(\epsilon)$ such that the vertices of any graph $G$ can be equitably partitioned $V(G) = \bigcup_{i=1}^{k} V_i$ into $k$ subsets with $\epsilon^{-1} \le k \le M(\epsilon)$ satisfying that all but at most $\epsilon k^2$ of the pairs $(V_i, V_j)$ are $\epsilon$-regular. For more background on the regularity lemma, see the excellent survey by Komlós and Simonovits [37].

In the regularity lemma, $M(\epsilon)$ can be taken to be a tower of 2's of height proportional to $\epsilon^{-5}$. On the other hand, Gowers [31] proved a lower bound on $M(\epsilon)$ which is a tower of 2's of height proportional to $\epsilon^{-1/16}$. His result demonstrates that $M(\epsilon)$ is inherently large as a function of $\epsilon^{-1}$. Unfortunately, this implies that the bounds obtained by applications of the regularity lemma are often quite poor. In particular, this is a weakness of the bound on $\delta(\epsilon, H)$ given by Rödl's proof of his theorem. It is therefore desirable to find a new proof of Rödl's theorem that does not use the regularity lemma. The following theorem does just that, giving a much better bound on $\delta(\epsilon, H)$. Its proof works as well in a multicolor setting (see concluding remarks).

Theorem 1.1
There is a constant $c$ such that for each $\epsilon \in (0, 1/2)$ and graph $H$ on $k \ge 2$ vertices, every $H$-free graph on $n$ vertices contains an induced subgraph on at least $2^{-ck(\log 1/\epsilon)^2} n$ vertices with edge density either at most $\epsilon$ or at least $1 - \epsilon$.

Nikiforov [41] recently strengthened Rödl's theorem by proving that for each $\epsilon > 0$ and graph $H$ of order $k$, there are positive constants $\kappa = \kappa(\epsilon, H)$ and $C = C(\epsilon, H)$ such that for every graph $G = (V, E)$ that contains at most $\kappa |V|^k$ induced copies of $H$, there is an equitable partition $V = \bigcup_{i=1}^{C} V_i$ of the vertex set such that the edge density in each $V_i$ ($i \ge 1$) is at most $\epsilon$ or at least $1 - \epsilon$. Using the same technique as the proof of Theorem 1.1, we give a new proof of this result without using the regularity lemma, thereby solving the main open problem posed in [41].

Erdős and Hajnal [23] gave a significant improvement on the Erdős-Szekeres bound on the size of the largest homogeneous set in $H$-free graphs. They proved that for every graph $H$ there is a positive constant $c(H)$ such that $\hom(G) \ge 2^{c(H)\sqrt{\log n}}$ for all $H$-free graphs $G$ on $n$ vertices. Erdős and Hajnal further conjectured that every such $G$ contains a complete or empty subgraph of order $n^{c(H)}$. This beautiful problem has received increasing attention from various researchers, and was also featured by Gowers [32] in his list of problems at the turn of the century. For various partial results on the Erdős-Hajnal conjecture see, e.g., [4, 24, 29, 3, 28, 39, 12] and their references.

Recall that a graph is $k$-universal if it contains all graphs on at most $k$ vertices as induced subgraphs. Note that the Erdős-Hajnal bound, in particular, implies that, for every fixed $k$, sufficiently large Ramsey graphs are $k$-universal. This was extended further by Prömel and Rödl [42], who obtained an asymptotically best possible result. They proved that if $\hom(G) \le c_1 \log n$ then $G$ is $c_2 \log n$-universal for some constant $c_2$ which depends on $c_1$.

Let $\hom(n, k)$ be the largest positive integer such that every graph $G$ on $n$ vertices is $k$-universal or satisfies $\hom(G) \ge \hom(n, k)$. The Erdős-Hajnal theorem and the Prömel-Rödl theorem both say that $\hom(n, k)$ is large for fixed or slowly growing $k$. Indeed, from the first theorem it follows that for fixed $k$ there is $c(k) > 0$ such that $\hom(n, k) \ge 2^{c(k)\sqrt{\log n}}$, while the second theorem says that for each $c_1$ there is $c_2 > 0$ such that $\hom(n, c_2 \log n) \ge c_1 \log n$. One would naturally like to have a general lower bound on $\hom(n, k)$ that implies both the Erdős-Hajnal and Prömel-Rödl results. This is done in the following theorem.

Theorem 1.2
There are positive constants $c_3$ and $c_4$ such that for all $n, k$, every graph $G$ on $n$ vertices is $k$-universal or satisfies
$$\hom(G) \ge c_3\, 2^{c_4 \sqrt{\frac{\log n}{k}}} \log n.$$

Theorem 1.1 can also be used to answer a question of Chung and Graham [13], which was motivated by the study of quasirandom graphs. Given a fixed graph $H$, it is well known that a typical graph on $n$ vertices contains many induced copies of $H$ as $n$ becomes large. Therefore if a large graph $G$ contains no induced copy of $H$, its edge distribution should deviate from "typical" in a rather strong way. This intuition was made rigorous in [13], where the authors proved that if a graph $G$ on $n$ vertices is not $k$-universal, then there is a subset $S$ of $\lfloor n/2 \rfloor$ vertices of $G$ such that $|e(S) - \frac{1}{16} n^2| > 2^{-2(k^2+27)} n^2$. For positive integers $k$ and $n$, let $D(k, n)$ denote the largest integer such that every graph $G$ on $n$ vertices that is not $k$-universal contains a subset $S$ of vertices of size $\lfloor n/2 \rfloor$ with $|e(S) - \frac{1}{16} n^2| > D(k, n)$. Chung and Graham asked whether their lower bound on $D(k, n)$ can be substantially improved, e.g., replaced by $c^{-k} n^2$. Using Theorem 1.1 this can be easily done as follows.

A lemma of Erdős, Goldberg, Pach, and Spencer [22] implies that if a graph on $n$ vertices has a subset $R$ that deviates by $D$ edges from having edge density $1/2$, then there is a subset $S$ of size $\lfloor n/2 \rfloor$ that deviates by at least a constant times $D$ edges from having edge density $1/2$. By Theorem 1.1 with $\epsilon = 1/4$, there is a positive constant $C$ such that every graph on $n$ vertices that is not $k$-universal has a subset $R$ of size at least $C^{-k} n$ with edge density at most $1/4$ or at least $3/4$. This $R$ deviates from having edge density $1/2$ by at least
$$\frac{1}{4}\binom{|R|}{2} \ge \frac{1}{16}|R|^2 \ge \frac{1}{16} C^{-2k} n^2$$
edges. Thus, the above mentioned lemma from [22] implies that there is an absolute constant $c$ such that every graph $G$ on $n$ vertices which is not $k$-universal contains a subset $S$ of size $\lfloor n/2 \rfloor$ with $|e(S) - \frac{1}{16} n^2| > c^{-k} n^2$. Chung and Graham also ask for non-trivial upper bounds on $D(k, n)$. In this direction, we show that there are $K_k$-free graphs on $n$ vertices for which $|e(S) - \frac{1}{16} n^2| = O(2^{-k/4} n^2)$ holds for every subset $S$ of $\lfloor n/2 \rfloor$ vertices of $G$. Together with the lower bound it determines the asymptotic behavior of $D(k, n)$ and shows that there are constants $c_1, c_2 > 1$ such that $c_1^{-k} n^2 < D(k, n) < c_2^{-k} n^2$.

Theorem 1.3 Let $H$ be a graph with $k$ vertices and $G = (V, E)$ be a graph with $n$ vertices and at most $(1 - \epsilon) 2^{-\binom{k}{2}} n^k$ labeled induced copies of $H$. Then there is a subset $S \subset V$ with $|S| = \lfloor n/2 \rfloor$ and $|e(S) - \frac{1}{16} n^2| \ge \epsilon c^{-k} n^2$, where $c$ is an absolute constant.

The proof of Theorem 1.3 can be easily adjusted if we replace the "at most" with "at least" and the $(1 - \epsilon)$ factor by $(1 + \epsilon)$. Note that this theorem answers the original question of Chung and Graham in a very strong sense.

Recall that the induced Ramsey number $r_{\mathrm{ind}}(H)$ is the minimum $n$ for which there is a graph $G$ with $n$ vertices such that for every 2-edge-coloring of $G$, one can find an induced copy of $H$ in $G$ whose edges are monochromatic. One of the fundamental results in graph Ramsey theory (see chapter 9.3 of [17]), the Induced Ramsey Theorem, says that $r_{\mathrm{ind}}(H)$ exists for every graph $H$. Rödl [45] noted that a relatively simple proof of the theorem follows from an application of his result discussed in the previous section. However, all of the early proofs of the Induced Ramsey Theorem give poor upper bounds on $r_{\mathrm{ind}}(H)$.

Since these early proofs, there has been a considerable amount of research on induced Ramsey numbers.
Erdős [21] conjectured that there is a constant $c$ such that every graph $H$ on $k$ vertices satisfies $r_{\mathrm{ind}}(H) \le 2^{ck}$. Erdős and Hajnal [20] proved that $r_{\mathrm{ind}}(H) \le 2^{2^{k^{1+o(1)}}}$ holds for every graph $H$ on $k$ vertices. Kohayakawa, Prömel, and Rödl [36] improved this bound substantially and showed that if a graph $H$ has $k$ vertices and chromatic number $\chi$, then $r_{\mathrm{ind}}(H) \le k^{ck \log \chi}$, where $c$ is a universal constant. In particular, their result implies an upper bound of $2^{ck(\log k)^2}$ on the induced Ramsey number of any graph on $k$ vertices. In their proof, the graph $G$ which gives this bound is randomly constructed using projective planes.

There are several known results that provide upper bounds on induced Ramsey numbers for sparse graphs. For example, Beck [8] studied the case when $H$ is a tree; Haxell, Kohayakawa, and Łuczak [35] proved that the cycle of length $k$ has induced Ramsey number linear in $k$; and, settling a conjecture of Trotter, Łuczak and Rödl [40] showed that the induced Ramsey number of a graph with bounded degree is at most polynomial in the number of its vertices. More precisely, they proved that for every integer $d$, there is a constant $c_d$ such that every graph $H$ on $k$ vertices and maximum degree at most $d$ satisfies $r_{\mathrm{ind}}(H) \le k^{c_d}$. Their proof, which also uses random graphs, gives an upper bound on $c_d$ that is a tower of 2's of height proportional to $d^2$.

As noted by Schaefer and Shah [46], all known proofs of the Induced Ramsey Theorem either rely on taking $G$ to be an appropriately chosen random graph or give a poor upper bound on $r_{\mathrm{ind}}(H)$. However, often in combinatorics, explicit constructions are desirable in addition to existence proofs given by the probabilistic method. For example, one of the most famous such problems was posed by Erdős [5], who asked for the explicit construction of a graph on $n$ vertices without a complete or empty subgraph of order $c \log n$.
Over the years, this intriguing problem and its bipartite variant have drawn a lot of attention from various researchers (see, e.g., [30, 1, 6, 9, 7]), but, despite these efforts, it is still open. Similarly, one would like to have an explicit construction for the Induced Ramsey Theorem. We obtain such a construction using pseudo-random graphs.

The random graph $G(n, p)$ is the probability space of all labeled graphs on $n$ vertices, where every edge appears randomly and independently with probability $p$. An important property of $G(n, p)$ is that, with high probability, between any two large subsets of vertices $A$ and $B$, the edge density $d(A, B) = \frac{e(A,B)}{|A||B|}$ is approximately $p$. This observation is one of the motivations for the following useful definition. A graph $G = (V, E)$ is $(p, \lambda)$-pseudo-random if the following inequality holds for all subsets $A, B \subset V$:
$$|d(A, B) - p| \le \frac{\lambda}{\sqrt{|A||B|}}.$$
It is easy to show that if $p < 0.99$, then with high probability, the random graph $G(n, p)$ is $(p, \lambda)$-pseudo-random with $\lambda = O(\sqrt{pn})$. Moreover, there are also many explicit constructions of pseudo-random graphs which can be obtained using the following fact. Let $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$ be the eigenvalues of the adjacency matrix of a graph $G$. An $(n, d, \lambda)$-graph is a $d$-regular graph on $n$ vertices with $\lambda = \max_{i \ge 2} |\lambda_i|$. It was proved by Alon (see, e.g., [5], [38]) that every $(n, d, \lambda)$-graph is in fact $(\frac{d}{n}, \lambda)$-pseudo-random. Therefore to construct good pseudo-random graphs we need regular graphs with $\lambda \ll d$. For more details on pseudo-random graphs, including many constructions, we refer the interested reader to the recent survey [38].

A graph is $d$-degenerate if every subgraph of it has a vertex of degree at most $d$. The degeneracy number of a graph $H$ is the smallest $d$ such that $H$ is $d$-degenerate. This quantity, which is always bounded by the maximum degree of the graph, is a natural measure of its sparseness.
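To illustrate the degeneracy number just defined, here is a minimal sketch that computes it by the standard peeling procedure: repeatedly delete a vertex of minimum degree and record the largest degree seen at the moment of deletion. The function name `degeneracy` and the dictionary-of-neighbor-sets input format are assumptions made for this example and do not come from the paper.

```python
def degeneracy(adj):
    """Return (d, order), where d is the degeneracy number of the graph `adj`
    and `order` is the sequence in which vertices were peeled off.

    `adj` maps each vertex to the set of its neighbors and is assumed symmetric.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    d, order = 0, []
    while remaining:
        # Peel a vertex of minimum degree in the graph that is left.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        order.append(v)
    return d, order
```

On a star with five leaves this returns degeneracy 1 although the maximum degree is 5, which matches the remark that the degeneracy number is bounded by, and can be much smaller than, the maximum degree.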
In particular, in a $d$-degenerate graph every subset $X$ spans at most $d|X|$ edges. The chromatic number $\chi(H)$ of a graph $H$ is the minimum number of colors needed to color the vertices of $H$ such that adjacent vertices get different colors. Using a greedy coloring, it is easy to show that $d$-degenerate graphs have chromatic number at most $d + 1$. The following theorem, which is a special case of a more general result which we prove in Section 5, shows that any sufficiently pseudo-random graph of appropriate density has strong induced Ramsey properties.

Theorem 1.4 There is an absolute constant $c$ such that for all integers $k, d, \chi \ge 2$, every $(\frac{1}{k}, n^{0.9})$-pseudo-random graph $G$ on $n \ge k^{cd \log \chi}$ vertices satisfies that every $d$-degenerate graph on $k$ vertices with chromatic number at most $\chi$ occurs as an induced monochromatic copy in all 2-edge-colorings of $G$. Moreover, all of these induced monochromatic copies can be found in the same color.

This theorem implies that, with high probability, $G(n, p)$ with $p = 1/k$ and $n \ge k^{cd \log \chi}$ satisfies that every $d$-degenerate graph on $k$ vertices with chromatic number at most $\chi$ occurs as an induced monochromatic copy in all 2-edge-colorings of $G$. It gives the first polynomial upper bound on the induced Ramsey numbers of $d$-degenerate graphs. In particular, for bounded degree graphs this is a significant improvement of the above mentioned Łuczak-Rödl result. It shows that the exponent of the polynomial in their theorem can be taken to be $O(d \log d)$, instead of the previous bound of a tower of 2's of height proportional to $d^2$.

Corollary 1.5 There is an absolute constant $c$ such that every $d$-degenerate graph $H$ on $k$ vertices with chromatic number $\chi \ge 2$ has induced Ramsey number $r_{\mathrm{ind}}(H) \le k^{cd \log \chi}$.

A significant additional benefit of Theorem 1.4 is that it leads to explicit constructions for induced Ramsey numbers. One such example can be obtained from a construction of Delsarte and Goethals and also of Turyn (see [38]).
Let $r$ be a prime power and let $G$ be a graph whose vertices are the elements of the two dimensional vector space over the finite field $\mathbb{F}_r$, so $G$ has $r^2$ vertices. Partition the $r + 1$ lines through the origin of the space into two sets $P$ and $N$, where $|P| = t$. Two vertices $x$ and $y$ of the graph $G$ are adjacent if $x - y$ is parallel to a line in $P$. This graph is known to be $t(r - 1)$-regular with eigenvalues, besides the largest one, being either $-t$ or $r - t$. Taking $t = \frac{r^2}{k(r-1)}$, we obtain an $(n, d, \lambda)$-graph with $n = r^2$, $d = n/k$, and $\lambda = r - t \le r \le n^{1/2}$. This gives a $(p, \lambda)$-pseudo-random graph with $p = d/n = 1/k$ and $\lambda \le n^{1/2}$ which satisfies the assertion of Theorem 1.4.

Another well-known explicit construction is the Paley graph $P_n$. Let $n$ be a prime power which is congruent to 1 modulo 4 so that $-1$ is a square in the finite field $\mathbb{F}_n$. The Paley graph $P_n$ has vertex set $\mathbb{F}_n$ and distinct elements $x, y \in \mathbb{F}_n$ are adjacent if $x - y$ is a square. It is well known and not difficult to prove that the Paley graph $P_n$ is $(1/2, \lambda)$-pseudo-random with $\lambda = \sqrt{n}$. This can be used together with the generalization of Theorem 1.4, which we discuss in Section 5, to prove the following result.

Corollary 1.6 There is an absolute constant $c$ such that for prime $n \ge 2^{ck \log^2 k}$, every graph on $k$ vertices occurs as an induced monochromatic copy in all 2-edge-colorings of the Paley graph $P_n$.

This explicit construction matches the best known upper bound on induced Ramsey numbers of graphs on $k$ vertices obtained by Kohayakawa, Prömel, and Rödl [36]. Similarly, we can prove that there is a constant $c$ such that, with high probability, $G(n, 1/2)$ with $n \ge 2^{ck \log^2 k}$ satisfies that every graph on $k$ vertices occurs as an induced monochromatic copy in all 2-edge-colorings of $G$.

Very little is known about lower bounds for induced Ramsey numbers beyond the fact that an induced Ramsey number is at least its corresponding Ramsey number.
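The Paley graph construction described above is easy to experiment with. The sketch below is an illustration under stated assumptions, not part of the paper: it handles only a prime $n$ (the case of Corollary 1.6, where $\mathbb{F}_n$ is arithmetic modulo $n$), it does not verify primality, and the names `paley_graph`, `density_between`, and `max_deviation_on_samples` are hypothetical. The sampling routine only spot-checks the inequality $|d(A, B) - 1/2| \le \sqrt{n}/\sqrt{|A||B|}$ on random disjoint pairs; it is not a proof of pseudo-randomness.

```python
import math
import random


def paley_graph(p):
    """Adjacency sets of the Paley graph on Z_p.

    Assumes p is a prime with p % 4 == 1 (primality is not checked), so that
    -1 is a square modulo p and adjacency is symmetric.
    """
    if p % 4 != 1:
        raise ValueError("p must be congruent to 1 modulo 4")
    squares = {(x * x) % p for x in range(1, p)}  # nonzero squares mod p
    return {x: {y for y in range(p) if y != x and (x - y) % p in squares}
            for x in range(p)}


def density_between(adj, A, B):
    """Edge density d(A, B) = e(A, B) / (|A||B|) for disjoint vertex lists A, B."""
    edges = sum(1 for a in A for b in B if b in adj[a])
    return edges / (len(A) * len(B))


def max_deviation_on_samples(adj, trials=200, seed=0):
    """Largest observed |d(A, B) - 1/2| * sqrt(|A||B|) over random disjoint pairs."""
    rng = random.Random(seed)
    vertices = list(adj)
    worst = 0.0
    for _ in range(trials):
        rng.shuffle(vertices)
        a = rng.randint(1, len(vertices) - 1)   # size of A
        b = rng.randint(1, len(vertices) - a)   # size of B, disjoint from A
        A, B = vertices[:a], vertices[a:a + b]
        worst = max(worst, abs(density_between(adj, A, B) - 0.5) * math.sqrt(a * b))
    return worst
```

For example, `paley_graph(13)` is 6-regular, and the value returned by `max_deviation_on_samples` on it can be compared against $\sqrt{13}$.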
A well-known conjecture of Burr and Erdős [11] from 1973 states that for each positive integer $d$ there is a constant $c(d)$ such that the Ramsey number $r(H)$ is at most $c(d) k$ for every $d$-degenerate graph $H$ on $k$ vertices. As mentioned earlier, Haxell et al. [35] proved that the induced Ramsey number for the cycle on $k$ vertices is linear in $k$. This implies that the induced Ramsey number for the path on $k$ vertices is also linear in $k$. Also, using a star with $2k - 1$ edges, one sees that the induced Ramsey number of a star with $k$ edges is $2k$. It is natural to ask whether the Burr-Erdős conjecture extends to induced Ramsey numbers. The following result shows that this fails already for trees, which are 1-degenerate graphs.

Theorem 1.7 For every $c > 0$ and sufficiently large integer $k$ there is a tree $T$ on $k$ vertices such that $r_{\mathrm{ind}}(T) \ge ck$.

The tree $T$ in the above theorem can be taken to be any sufficiently large tree that contains a matching of linear size and a star of linear size as subgraphs. It is interesting that the induced Ramsey number for a path on $k$ vertices or a star on $k$ vertices is linear in $k$, but the induced Ramsey number for a tree which contains both a path on $k$ vertices and a star on $k$ vertices is superlinear in $k$.

Organization of the paper. In the next section we give short proofs of Theorem 1.1 and Theorem 1.2 which illustrate our methods. Section 3 contains the key lemma that is used as a replacement for Szemerédi's regularity lemma in the proofs of several results. We answer questions of Chung-Graham and Nikiforov on the edge distribution in graphs with a forbidden induced subgraph in Section 4. In Section 5 we show that any sufficiently pseudo-random graph of appropriate density has strong induced Ramsey properties. Combined with known examples of pseudo-random graphs, this leads to explicit constructions which match and improve the best known estimates for induced Ramsey numbers. The proof of the result that there are trees whose induced Ramsey number is superlinear in the number of vertices is in Section 6.
The last section of this paper contains some concluding remarks together with a discussion of a few conjectures and open problems. Throughout the paper, we systematically omit floor and ceiling signs whenever they are not crucial for the sake of clarity of presentation. We also do not make any serious attempt to optimize absolute constants in our statements and proofs.

H-free graphs

In this section, we prove Theorems 1.1 and 1.2. While we obtain more general results later in the paper, the purpose of this section is to illustrate on simple examples the main ideas and techniques that we will use in our proofs. Our theorems strengthen and generalize results from [45] and [42] and the proofs we present here are shorter and simpler than the original ones. We start with the proof of Theorem 1.1, which uses the following lemma of Erdős and Hajnal [23]. We prove a generalization of this lemma in Section 4.

Lemma 2.1 For each $\epsilon \in (0, 1/2)$, graph $H$ on $k$ vertices, and $H$-free graph $G = (V, E)$ on $n \ge 2$ vertices, there are disjoint subsets $A$ and $B$ of $V$ with $|A|, |B| \ge \epsilon^{k-1} \frac{n}{k}$ such that either every vertex in $A$ has at most $\epsilon |B|$ neighbors in $B$, or every vertex in $A$ has at least $(1 - \epsilon)|B|$ neighbors in $B$.

Actually, the statement of the lemma in [23] is a bit weaker than that of Lemma 2.1 but it is easy to get the above statement by analyzing more carefully the proof of Erdős and Hajnal. Lemma 2.1 roughly says that every $H$-free graph contains two large disjoint vertex subsets such that the edge density between them is either very small or very large. However, to prove Theorem 1.1, we need to find a large induced subgraph with such edge density.
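The difference between the bipartite condition of Lemma 2.1 and the density of an induced subgraph can be seen on a toy example. The helper names below (`edge_density`, `pair_density`, `max_neighbor_fraction`) and the small example graph are assumptions introduced purely for illustration; graphs are again dictionaries of neighbor sets. The sketch only evaluates the quantities involved: if every vertex of $A$ has at most $\epsilon|B|$ neighbors in $B$ then $d(A, B) \le \epsilon$, while the densities inside $A$ and inside $B$ are not controlled by that condition.

```python
from itertools import combinations


def edge_density(adj, A):
    """d(A) = e(A) / C(|A|, 2): fraction of pairs inside A that are edges (|A| >= 2)."""
    pairs = list(combinations(A, 2))
    return sum(1 for u, v in pairs if v in adj[u]) / len(pairs)


def pair_density(adj, A, B):
    """d(A, B) = e(A, B) / (|A||B|) for disjoint vertex lists A and B."""
    return sum(1 for a in A for b in B if b in adj[a]) / (len(A) * len(B))


def max_neighbor_fraction(adj, A, B):
    """Largest fraction of B that is adjacent to a single vertex of A."""
    b_set = set(B)
    return max(len(adj[a] & b_set) / len(B) for a in A)


# Hypothetical 6-vertex example: A = [0, 1], B = [2, 3, 4, 5].
example = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}, 4: set(), 5: set()}
A, B = [0, 1], [2, 3, 4, 5]
```

Here each vertex of `A` sees one quarter of `B`, so `pair_density(example, A, B)` is at most `max_neighbor_fraction(example, A, B)`, yet `edge_density(example, A)` equals 1 because the two vertices of `A` are adjacent.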
Our next lemma shows how one can iterate the bipartite density result of Lemma 2.1 in order to establish the complete density result of Theorem 1.1.

For $\epsilon_1, \epsilon_2 \in (0, 1)$ and a graph $H$, define $\delta(\epsilon_1, \epsilon_2, H)$ to be the largest $\delta$ (which may be 0) such that for each $H$-free graph on $n$ vertices, there is an induced subgraph on at least $\delta n$ vertices with edge density at most $\epsilon_1$ or at least $1 - \epsilon_2$. Notice that for $2 \le n_1 \le n_2$, the edge density of a graph on $n_2$ vertices is the average of the edge densities of the induced subgraphs on $n_1$ vertices. Therefore, from the definition of $\delta$, it follows that for every $2 \le n_1 \le \delta(\epsilon_1, \epsilon_2, H) n_2$ and $H$-free graph $G$ on $n_2$ vertices, $G$ contains an induced subgraph on exactly $n_1$ vertices with edge density at most $\epsilon_1$ or at least $1 - \epsilon_2$. Recall that the edge density $d(A)$ of a subset $A$ of $G$ equals $e(A) / \binom{|A|}{2}$, where $e(A)$ is the number of edges spanned by $A$.

Lemma 2.2 Suppose $\epsilon_1, \epsilon_2 \in (0, 1)$ with $\epsilon_1 + \epsilon_2 < 1$ and $H$ is a graph on $k \ge 2$ vertices. Let $\epsilon = \min(\epsilon_1, \epsilon_2)$. We have
$$\delta(\epsilon_1, \epsilon_2, H) \ge (\epsilon/4)^k k^{-1} \min\Big( \delta\big(3\epsilon_1/2, \epsilon_2, H\big),\ \delta\big(\epsilon_1, 3\epsilon_2/2, H\big) \Big).$$

Proof. Let $G$ be an $H$-free graph on $n \ge 2$ vertices. If $n < k$ then we may consider any two-vertex induced subgraph of $G$, which always has density either 0 or 1. Therefore, for $G$ of order less than $k$ we can take $\delta = 2/k$, which is clearly larger than the right hand side of the inequality in the assertion of the lemma. Thus we can assume that $n \ge k$. Applying Lemma 2.1 to $G$ with $\epsilon/4$ in place of $\epsilon$, we find two subsets $A$ and $B$ with $|A|, |B| \ge (\epsilon/4)^{k-1} n/k$, such that either every vertex in $A$ is adjacent to at most $\frac{\epsilon}{4}|B|$ vertices of $B$ or every vertex of $A$ is adjacent to at least $(1 - \frac{\epsilon}{4})|B|$ vertices of $B$.

Consider the first case in which every vertex in $A$ is adjacent to at most $\frac{\epsilon}{4}|B|$ vertices of $B$ (the other case can be treated similarly) and let $G[A]$ be the subgraph of $G$ induced by the set $A$.
By definition of the function $\delta$, $G[A]$ contains a subset $A'$ with
$$|A'| = \delta(3\epsilon_1/2, \epsilon_2, H) \Big(\frac{\epsilon}{4}\Big)^{k} \frac{n}{k} \le \delta(3\epsilon_1/2, \epsilon_2, H)\, |A|,$$
such that the subgraph induced by $A'$ has edge density at most $\frac{3}{2}\epsilon_1$ or at least $1 - \epsilon_2$. If $A'$ has edge density at least $1 - \epsilon_2$ we are done, since $G[A']$ is an induced subgraph of $G$ with at least $(\epsilon/4)^k k^{-1} \delta(3\epsilon_1/2, \epsilon_2, H)\, n$ vertices and edge density at least $1 - \epsilon_2$. So we may assume that the edge density in $A'$ is at most $\frac{3}{2}\epsilon_1$.

Let $B_1 \subset B$ be those vertices of $B$ that have at most $\frac{\epsilon_1}{2}|A'|$ neighbors in $A'$. Since $A' \subset A$, each vertex of $A'$ has at most $\frac{\epsilon}{4}|B| \le \frac{\epsilon_1}{4}|B|$ neighbors in $B$ and the number of edges satisfies $e(A', B) \le \frac{\epsilon_1}{4}|A'||B|$. Therefore $B_1$ has at least $|B|/2$ vertices. Then, by definition of $\delta$, $B_1$ contains a subset $B'$ with
$$|B'| = \delta(3\epsilon_1/2, \epsilon_2, H) \Big(\frac{\epsilon}{4}\Big)^{k} \frac{n}{k} \le \delta(3\epsilon_1/2, \epsilon_2, H)\, |B_1|,$$
such that the induced subgraph $G[B']$ has edge density at most $\frac{3}{2}\epsilon_1$ or at least $1 - \epsilon_2$. If it has edge density at least $1 - \epsilon_2$ we are done, so we may assume that the edge density $d(B')$ is at most $\frac{3}{2}\epsilon_1$.

Finally, to complete the proof note that, since $|A'| = |B'|$, $|A' \cup B'| = 2|A'|$, $d(A'), d(B') \le \frac{3}{2}\epsilon_1$, and $d(A', B') \le \frac{\epsilon_1}{2}$, we have that
$$e(A' \cup B') = e(A') + e(B') + e(A', B') \le \frac{3}{2}\epsilon_1 \binom{|A'|}{2} + \frac{3}{2}\epsilon_1 \binom{|B'|}{2} + \frac{\epsilon_1}{2}|A'||B'| = 2\epsilon_1 |A'|^2 - \frac{3}{2}\epsilon_1 |A'| \le \epsilon_1 \binom{2|A'|}{2}.$$
Therefore, $d(A' \cup B') \le \epsilon_1$. □

From this lemma, the proof of our first result, that every $H$-free graph on $n$ vertices contains a subset of at least $2^{-ck(\log 1/\epsilon)^2} n$ vertices with edge density either $\le \epsilon$ or $\ge 1 - \epsilon$, follows in a few lines.

Proof of Theorem 1.1: Notice that if $\epsilon_1 + \epsilon_2 \ge 1$, then trivially $\delta(\epsilon_1, \epsilon_2, H) = 1$. In particular, if $\epsilon_1 \epsilon_2 \ge \frac{1}{4}$, then $\epsilon_1 + \epsilon_2 \ge 1$ and $\delta(\epsilon_1, \epsilon_2, H) = 1$.
Therefore, by iterating Lemma 2.2 for $t = \log(\epsilon^{-2}) / \log(3/2)$ iterations and using that $\epsilon \le 1/2$, we obtain
$$\delta(\epsilon, \epsilon, H) \ge \Big(\frac{\epsilon^k}{4^k k}\Big)^{t} \ge 2^{-\frac{2}{\log 3/2}\left(k(\log 1/\epsilon)^2 + (2k + \log k)\log 1/\epsilon\right)} \ge 2^{-15k(\log 1/\epsilon)^2},$$
which, by definition of $\delta$, completes the proof of the theorem. □

Recall the Erdős-Szemerédi theorem, which states that there is an absolute constant $c$ such that every graph $G$ on $n$ vertices with edge density $\epsilon \in (0, 1/2)$ has a homogeneous set of size at least $\frac{c \log n}{\epsilon \log (1/\epsilon)}$. Theorem 1.2 follows from a simple application of Theorem 1.1 and the Erdős-Szemerédi theorem.

Proof of Theorem 1.2: Let $G$ be a graph on $n$ vertices which is not $k$-universal, i.e., it is $H$-free for some fixed graph $H$ on $k$ vertices. Fix $\epsilon = 2^{-\frac{1}{5}\sqrt{\frac{\log n}{k}}}$ and apply Theorem 1.1 to $G$. It implies that $G$ contains a subset $W \subset V(G)$ of size at least $2^{-15k(\log 1/\epsilon)^2} n = n^{2/5}$ such that the subgraph induced by $W$ has edge density at most $\epsilon$ or at least $1 - \epsilon$. Applying the Erdős-Szemerédi theorem to the induced subgraph $G[W]$ or its complement and using that $\epsilon \log 1/\epsilon \le 4\epsilon^{1/2}$ for all $\epsilon \le 1$, we obtain a homogeneous subset $W' \subset W$ with
$$|W'| \ge \frac{c \log n^{2/5}}{\epsilon \log (1/\epsilon)} \ge \frac{c}{10} \cdot \frac{\log n}{\epsilon^{1/2}} \ge \frac{c}{10}\, 2^{\frac{1}{10}\sqrt{\frac{\log n}{k}}} \log n,$$
which completes the proof of Theorem 1.2. □

In this section we present our key lemma. We use it as a replacement for Szemerédi's regularity lemma in the proofs of several Ramsey-type results, thereby giving much better estimates. A very special case of this statement was essentially proved in Lemma 2.2 in the previous section. Our key lemma generalizes the result of Graham, Rödl, and Ruciński [34] and has a simpler proof than the one in [34]. Roughly, our result says that if $(G_1, \ldots, G_r)$ is a sequence of graphs on the same vertex set $V$ with the property that every large subset of $V$ contains a pair of large disjoint sets with small edge density between them in at least one of the graphs $G_i$, then every large subset of $V$ contains a large set with small edge density in one of the $G_i$. To formalize this concept, we need a couple of definitions.
To formalize this concept, we need a couple definitions.For a graph G = ( V, E ) and disjoint subsets W , . . . , W t ⊂ V , the density d G ( W , . . . , W t ) betweenthe t ≥ W , . . . , W t is defined by d G ( W , . . . , W t ) = P i If a sequence of graphs ( G , . . . , G r ) with common vertex set V is ( αρ, ρ ′ , ǫ, t ) -sparseand ( α, ρ, ǫ/ , -sparse, then ( G , . . . , G r ) is also ( α, ρρ ′ , ǫ, t ) -sparse. Proof. Since ( G , . . . , G r ) is ( α, ρ, ǫ/ , U ⊂ V with | U | ≥ α | V | , there is i ∈ [ r ]and disjoint subsets X, Y ⊂ U with | X | = | Y | = ρ | U | and d G i ( X, Y ) ≤ ǫ/ 4. Let X be the set ofvertices in X that have at most ǫ | Y | neighbors in Y in graph G i . Then e G i ( X \ X , Y ) ≥ ǫ | X \ X || Y | / e G i ( X, Y ) ≤ ǫ | X || Y | / 4. Therefore | X | ≥ | X | / ≥ ρ | U | and by removing extravertices we assume that | X | = ρ | U | .Since ( G , . . . , G r ) is ( αρ, ρ ′ , ǫ, t )-sparse, then there are positive integers t , . . . , t r such that Q rj =1 t j ≥ t and for each j ∈ [ r ] there are disjoint subsets X j, , . . . , X j,t j ⊂ X of size | X j, | = . . . = | X j,t j | = ρ ′ | X | with density d G j ( X j, , . . . , X j,t j ) ≤ ǫ . Let Y the set of vertices in Y that have atmost ǫ | X i, ∪ . . . ∪ X i,t i | neighbors in X i, ∪ . . . ∪ X i,t i in graph G i . Since every vertex of X is adjacent toat most ǫ | Y | vertices of Y and since X i, ∪ . . . ∪ X i,t i ⊂ X we have that d G i ( X i, ∪ . . . ∪ X i,t i , Y ) ≤ ǫ/ d G i ( X i, ∪ . . . ∪ X i,t i , Y \ Y ) ≥ ǫ . Therefore | Y | ≥ | Y | / 2, so again we can assumethat | Y | = ρ | U | = | X | . Since ( G , . . . , G r ) is ( αρ, ρ ′ , ǫ, t )-sparse, then there are positive integers s , . . . , s r such that Q rj =1 s j ≥ t and for each j ∈ [ r ] there are disjoint subsets Y j, , . . . , Y j,s j ⊂ Y with d G j ( Y j, , . . . , Y j,s j ) ≤ ǫ and | Y j, | = . . . = | Y j,s j | = ρ ′ | Y | .By the above construction, the edge density between X i, ∪ . . . ∪ X i,t i and Y i, ∪ . . . 
∪ Y i,s i isbounded from above by ǫ . We also have that both d G i ( X i, , . . . , X i,t i ) and d G i ( Y i, , . . . , Y i,s i ) are atmost ǫ and these two sets have the same size. Therefore d G i ( X i, , . . . , X i,t i , Y i, , . . . , Y i,s i ) ≤ ǫ , implyingthat ( G , . . . , G r ) is (cid:0) α, ρρ ′ , ǫ, u (cid:1) -sparse with u = ( t i + s i ) Q j ∈ [ r ] \{ i } max( t j , s j ) for some i . By thearithmetic mean-geometric mean inequality, we have t ≤ r Y j =1 t j r Y j =1 s j ≤ t i s i Y j ∈ [ r ] \{ i } max( t j , s j ) = t i s i ( t i + s i ) u ≤ u . Thus u ≥ t . Altogether this shows that ( G , . . . , G r ) is (cid:0) α, ρρ ′ , ǫ, t (cid:1) -sparse, completing the proof. ✷ Rather than using this lemma directly, in applications we usually need the following two corollaries.The first one is obtained by simply applying Lemma 3.2 h − Corollary 3.3 If ( G , . . . , G r ) is ( α, ρ, ǫ/ , -sparse and h is a positive integer, then ( G , . . . , G r ) isalso (cid:16) ( ρ ) h − α, − h ρ h , ǫ, h (cid:17) -sparse. 11f we use the last statement with h = r log ǫ and α = ( ρ ) h − , then we get that there is an index i ∈ [ r ] and disjoint subsets W , . . . , W t ⊂ V with t ≥ h/r = ǫ , | W | = . . . = | W t | = 2 − h ρ h | V | , and d G i ( W , . . . , W t ) ≤ ǫ . Since (cid:0) | W | (cid:1) ≤ ǫt (cid:0) t | W | (cid:1) , even if every W i has edge density one, still the edgedensity in the set W ∪ . . . ∪ W t is at most 2 ǫ . Therefore, (using ǫ/ ǫ ) we have the followingcorollary. Corollary 3.4 If ( G , . . . , G r ) is (( ρ ) h − , ρ, ǫ/ , -sparse where h = r log ǫ , then there is i ∈ [ r ] andan induced subgraph G ′ of G i on ǫ − − h ρ h | V | vertices that has edge density at most ǫ . The key lemma in the paper of Graham, R¨odl, and Rucinski [34] on the Ramsey number of graphs(their Lemma 1) is essentially the r = 1 case of Corollary 3.4. 
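The counting step that closes the proof of Lemma 3.2 can be checked numerically. The sketch below assumes the displayed chain is read as $t^2 \le \prod_j t_j \prod_j s_j \le t_i s_i \prod_{j \ne i} \max(t_j, s_j)^2 = \frac{t_i s_i}{(t_i + s_i)^2} u^2 \le u^2/4$, so that $u \ge 2t$; the helper names are hypothetical and the random trials are only a sanity check, not a proof.

```python
from math import prod
import random


def merged_count(ts, ss, i):
    """u = (t_i + s_i) * prod over j != i of max(t_j, s_j)."""
    rest = prod(max(t, s) for j, (t, s) in enumerate(zip(ts, ss)) if j != i)
    return (ts[i] + ss[i]) * rest


def check_doubling(ts, ss):
    """Check u >= 2t for every index i, where t is a common lower
    bound on prod(ts) and prod(ss)."""
    t = min(prod(ts), prod(ss))
    return all(merged_count(ts, ss, i) >= 2 * t for i in range(len(ts)))


random.seed(0)
trials = [
    ([random.randint(1, 6) for _ in range(r)],
     [random.randint(1, 6) for _ in range(r)])
    for r in range(1, 5)
    for _ in range(200)
]
all_ok = all(check_doubling(ts, ss) for ts, ss in trials)
```

The check passes for every index $i$, which matches the proof: the argument does not need a careful choice of $i$ at this step, only the arithmetic mean-geometric mean inequality $t_i s_i \le (t_i + s_i)^2 / 4$.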
H -free graphs In this section, we obtain several results on the edge distribution of graphs with a forbidden inducedsubgraph which answer open questions by Nikiforov and Chung-Graham. We first prove a strength-ening of R¨odl’s theorem (mentioned in the introduction) without using the regularity lemma. Thenwe present a proof of Theorem 1.3 on the dependence of error terms in quasirandom properties. Weconclude this section with an upper bound on the maximum edge discrepancy in subgraphs of H -freegraphs. To obtain these results we need the following generalization of Lemma 2.1. Lemma 4.1 Let H be a k -vertex graph and let G be a graph on n ≥ k vertices that contains lessthan n k (1 − k n ) Q k − i =1 (1 − δ i ) ǫ k − ii labeled induced copies of H , where ǫ = 1 and ǫ i , δ i ∈ (0 , forall ≤ i ≤ k − . Then there is an index i ≤ k − and disjoint subsets A and B of G with | A | ≥ δ i nk ( k − i ) Q j
Let M denote the number of labeled induced copies of H in G , which by our assumption isat most M < n k (cid:18) − k n (cid:19) k − Y i =1 (1 − δ i ) ǫ k − ii . (1)We may assume that the vertex set of H is [ k ]. Consider a random partition V ∪ . . . ∪ V k of the verticesof G such that each V i has cardinality n/k . Note that for any such partition there are ( n/k ) k ordered k -tuples of vertices of G with the property that the i -th vertex of the k -tuple is in V i for all i ∈ [ k ]. Onthe other hand the total number of ordered k -tuples of vertices is n ( n − · · · ( n − k + 1) and each ofthese k -tuples has the above property with equal probability. This implies that for any given k -tuplethe probability that its i -th vertex is in V i for all i ∈ [ k ] equals Q ki =1 n/kn − i +1 . In particular, by linearityof expectation, the expected number of labeled induced copies of H in G for which the image of everyvertex i ∈ [ k ] is in V i is at most M · Q ki =1 n/kn − i +1 . Using that Q (1 − x i ) ≥ − P x i for any 0 ≤ x i ≤ n ≥ k , we obtain 12 Y i =1 n/kn − i + 1 = k − k k − Y i =0 (1 − i/n ) − ≤ k − k − k − X i =0 i/n ! − = k − k (cid:18) − (cid:18) k (cid:19) /n (cid:19) − < (cid:18) − k n (cid:19) − k − k . This, together with (1), shows that there is a partition V ∪ . . . ∪ V k of G into sets of cardinality n/k such that the total number of labeled induced copies of H in G for which the image of every vertex i ∈ [ k ] is in V i is less than M (cid:18) − k n (cid:19) − k − k < k − k n k k − Y i =1 (1 − δ i ) ǫ k − ii . (2)We use this estimate to construct sets A and B which satisfy the assertion of the lemma. For avertex v ∈ V , the neighborhood N ( v ) is the set of vertices of G that are adjacent to v . For v ∈ V i anda subset S ⊂ V j with i = j , let ˜ N ( v, S ) = N ( v ) ∩ S if ( i, j ) is an edge of H and ˜ N ( v, S ) = S \ N ( v )otherwise. We will try iteratively to build many induced copies of H . After i steps, we will havevertices v , . . . 
, $v_i$ with $v_j \in V_j$ for $j \le i$ and subsets $V_{i+1,i}, V_{i+2,i}, \ldots, V_{k,i}$ such that

1. $V_{\ell,i}$ is a subset of $V_\ell$ of size $|V_{\ell,i}| \ge \frac{n}{k} \prod_{j=1}^{i} \epsilon_j$ for all $i+1 \le \ell \le k$,
2. for $1 \le j < \ell \le i$, $(v_j, v_\ell)$ is an edge of $G$ if and only if $(j, \ell)$ is an edge of $H$,
3. and if $j \le i < \ell$ and $w \in V_{\ell,i}$, then $(v_j, w)$ is an edge of $G$ if and only if $(j, \ell)$ is an edge of $H$.

In the first step, we call a vertex $v \in V_1$ good if $|\tilde N(v, V_i)| \ge \epsilon_1 |V_i|$ for each $i > 1$. If less than a fraction $1 - \delta_1$ of the vertices in $V_1$ are good, then, by the pigeonhole principle, there is a subset $A \subset V_1$ with $|A| \ge \frac{\delta_1}{k-1} |V_1| = \frac{\delta_1}{k(k-1)} n$ and an index $j > 1$ such that $|\tilde N(v, V_j)| < \epsilon_1 |V_j|$ for each $v \in A$. Letting $B = V_j$, one can easily check that $A$ and $B$ satisfy the assertion of the lemma. Hence, we may assume that at least a fraction $1 - \delta_1$ of the vertices $v_1 \in V_1$ are good, choose any good $v_1$ and define $V_{i,1} = \tilde N(v_1, V_i)$ for $i > 1$, completing the first step.

Suppose that after step $i$ the properties 1-3 are satisfied. Then, in step $i+1$, we again call a vertex $v \in V_{i+1,i}$ good if $|\tilde N(v, V_{j,i})| \ge \epsilon_{i+1} |V_{j,i}|$ for each $j > i+1$. If less than a fraction $1 - \delta_{i+1}$ of the vertices of $V_{i+1,i}$ are good, then, by the pigeonhole principle, there is a subset $A \subset V_{i+1,i}$ with $|A| \ge \frac{\delta_{i+1}}{k-i-1} |V_{i+1,i}|$ and an index $j > i+1$ such that $|\tilde N(v, V_{j,i})| < \epsilon_{i+1} |V_{j,i}|$ for each $v \in A$. Letting $B = V_{j,i}$, one can check, using properties 1-3, that $A$ and $B$ satisfy the assertion of the lemma. Hence, we may assume that at least a fraction $1 - \delta_{i+1}$ of the vertices $v_{i+1} \in V_{i+1,i}$ are good, choose any good $v_{i+1}$ and define $V_{j,i+1} = \tilde N(v_{i+1}, V_{j,i})$ for $j \ge i+2$, completing step $i+1$. Notice that after step $i+1$, we have $|V_{j,i+1}| \ge \epsilon_{i+1} |V_{j,i}|$ for $j > i+1$, which guarantees that property 1 is satisfied. The remaining properties (2 and 3) follow from our construction of the sets $V_{j,i+1}$.

Thus if our process fails in one of the first $k-1$ steps, we obtain the desired sets $A$ and $B$.
Suppose nowthat we successfully performed k − i + 1, we had at least (1 − δ i +1 ) | V i +1 ,i | ≥ nk (1 − δ i +1 ) Q ij =1 ǫ j vertices to choose for vertex v i +1 . Also note that, by property 3, after step k − V k,k − to be v k . Moreover, by the property 2, every choice of thevertices v , . . . , v k form a labeled induced copy of H . Altogether, this gives at least | V k,k − | · k − Y i =1 (cid:18) nk (1 − δ i ) Y ≤ j
Let H be a graph with k vertices, α ≥ k /n , ǫ ≤ / , and G be a graph with at most − k ǫ ( k )( αn ) k induced copies of H . Then the pair ( G, ¯ G ) is ( α, ǫ k − k , ǫ, -sparse. The next statement strengthens Theorem 1.1 by allowing for many induced copies of H . It followsfrom Corollary 3.4 with r = 2 , h = 2 log(2 /ǫ ) , ρ = ǫ k − k , combined with the last statement in which weset α = ( ρ/ h − . Corollary 4.3 There is a constant c such that for each ǫ ∈ (0 , / and graph H on k vertices, everygraph G on n vertices with less than − c ( k log ǫ ) n k induced copies of H contains an induced subgraphof size at least − ck (log ǫ ) n with edge density at most ǫ or at least − ǫ . This result demonstrates that for each ǫ ∈ (0 , / 2) and graph H , there exist positive constants δ ∗ = δ ∗ ( ǫ, H ) and κ ∗ = κ ∗ ( ǫ, H ) such that every graph G = ( V, E ) on n vertices with less than κ ∗ n k induced copies of H contains a subset W ⊂ V of size at least δ ∗ n such that the edge density of W is atmost ǫ or at least 1 − ǫ . Furthermore, there is a constant c such that we can take δ ∗ ( ǫ, H ) = 2 − ck (log ǫ ) and κ ∗ ( ǫ, H ) = 2 − c ( k log ǫ ) . Applying Corollary 4.3 recursively one can obtain an equitable partitionof G into a small number of subsets each with low or high density. Theorem 4.4 For each ǫ ∈ (0 , / and graph H on k vertices, there are positive constants κ = κ ( ǫ, H ) and C = C ( ǫ, H ) such that every graph G = ( V, E ) on n vertices with less than κ n k inducedcopies of H , there is an equitable partition V = S ℓi =1 V i such that ℓ ≤ C and the edge density in each V i is at most ǫ or at least − ǫ . This extension of R¨odl’s theorem was proved by Nikiforov [41] using the regularity lemma and thereforeit had quite poor (tower like) dependence of κ and C on ǫ and k . Obtaining a proof without using theregularity lemma was the main open problem raised in [41] . Proof of Theorem 4.4. 
Let κ ( ǫ, H ) = ( ǫ ) k κ ∗ ( ǫ , H ) and C ( ǫ, H ) = 4 / ( ǫδ ∗ ( ǫ , H )), where κ ∗ and δ ∗ were defined above. Take a subset W ⊂ V of size δ ∗ ( ǫ , H ) ǫ n whose edge density is at most ǫ or at14east 1 − ǫ , and set U = V \ W . For j ≥ 1, if | U j | ≥ ǫ n , then by definition of κ we have that the numberof induced copies of H in U j is at most (the number of such copies in G ) κ n k = ( ǫ ) k κ ∗ n k ≤ κ ∗ | U j | k .Therefore by definition of κ ∗ and δ ∗ we can find a subset W j +1 ⊂ U j of size δ ∗ ǫ n ≤ δ ∗ | U j | whose edgedensity is at most ǫ or at least 1 − ǫ , and set U j +1 = U j \ W j +1 .Once this process stops, we have disjoint sets W , . . . , W ℓ , each with the same cardinality, and asubset U ℓ of cardinality at most ǫ n . The number ℓ is at most n/ | W | ≤ / ( ǫδ ∗ ( ǫ , H )) . Partition set U ℓ into ℓ equal parts T , . . . , T ℓ and let V j = W j ∪ T j for 1 ≤ j ≤ ℓ . Notice that V = V ∪ . . . ∪ V ℓ is an equitable partition of V . By definition, | T j | = | U ℓ | /ℓ ≤ ǫ n/ℓ . On the otherhand | W j | = ( n − | U ℓ | ) /ℓ ≥ (1 − ǫ/ n/ℓ . Since 1 − ǫ/ > / 8, this implies that | T j | ≤ ǫ n/ℓ ≤ ǫ (cid:0) − ǫ/ (cid:1) − | W j | ≤ ǫ | W j | . We next look at the edge density in V j . If the edge density in W j is at most ǫ/ 4, then using the abovebound on | T j | , it is easy to check that the number of edges in V j is at most (cid:18) | T j | (cid:19) + | T j || W j | + ǫ (cid:18) | W j | (cid:19) ≤ ǫ (cid:18) | W j | (cid:19) ≤ ǫ (cid:18) | V j | (cid:19) . Hence, the edge density in each such V j is at most ǫ . Similarly, if the edge density in W j is at least1 − ǫ , then the edge density in V j is at least 1 − ǫ . This completes the proof. ✷ We next use Lemma 4.1 to prove that there is a constant c > G on n vertices which contains at most (1 − ǫ )2 − ( k ) n k labeled induced copies of some fixed k -vertex graph H has a subset S of size | S | = ⌊ n/ ⌋ with | e ( S ) − n | ≥ ǫc − k n . Proof of Theorem 1.3. 
For 1 ≤ i ≤ k − 1, let ǫ i = (1 − i − k − ǫ ) and δ i = 2 i − k − ǫ . Notice that forall i ≤ k − Y j − i (3)and also that k − Y i =1 (1 − δ i ) ǫ k − ii = 2 − ( k ) k − Y i =1 (cid:0) − i − k − ǫ (cid:1) k − i +1 = 2 − ( k ) k Y j =2 (cid:0) − ǫ − j − (cid:1) j ≥ − ( k ) (cid:18) − ǫ k X j =2 j j +1 (cid:19) > (cid:16) − ǫ (cid:17) − ( k ) . We may assume that ǫ ≥ k /n since otherwise by choosing constant c large enough we get that ǫc − k n < (cid:18) − k n (cid:19) k − Y i =1 (1 − δ i ) ǫ k − ii ≥ (cid:16) − ǫ (cid:17) − ( k ) > (1 − ǫ )2 − ( k ) , ǫ i and δ i as above to our graph G since it contains at most(1 − ǫ )2 − ( k ) n k labeled induced copies of H . This lemma, together with (3), implies that there is anindex i ≤ k − A and B with | A | ≥ δ i nk ( k − i ) Y j
2. To finish the proofwe will use the lemma of Erd˝os et al. [22], mentioned in the introduction. This lemma says that ifgraph G on n vertices with edge density η has a subset that deviates by D edges from having edgedensity η , then it also has a subset of size n/ D/ η . Note that if the edge density of our graph G is either larger than 1 / ǫk − − k − n / 30 orsmaller than 1 / − ǫk − − k − n / 30 than by averaging over all subsets of size n/ S satisfying our assertion. Otherwise, if the edge density η of G satisfies | η − / | ≤ ǫk − − k − n / R from (4) deviates by at least ǫk − − k − n / − ǫk − − k − n / ≥ ǫk − − k n / η . Then, by the lemma of Erd˝os et al., G has a subset S of cardinality n/ ǫk − − k − n / 20 edges from having edge density η . This S satisfies (cid:12)(cid:12)(cid:12)(cid:12) e ( S ) − | S | (cid:12)(cid:12)(cid:12)(cid:12) ≥ ǫk − − k − n / − ǫk − − k − n / 30 = Ω (cid:16) ǫk − − k n (cid:17) , completing the proof. ✷ For positive integers k and n , recall that D ( k, n ) denotes the largest integer such that everygraph G on n vertices that is H -free for some k -vertex graph H contains a subset S of size n/ | e ( S ) − n | > D ( k, n ). We end this section by proving the upper bound on D ( k, n ). Proposition 4.5 There is a constant c > such that for all positive integers k and n ≥ k/ , thereis a K k -free graph G on n vertices such that for every subset S of n/ vertices of G , (cid:12)(cid:12)(cid:12) e ( S ) − n (cid:12)(cid:12)(cid:12) < c − k/ n . roof. Consider the random graph G ( ℓ, / 2) with ℓ = 2 k/ . For every subset of vertices X in thisgraph the number of edges in X is a binomially distributed random variable with expectation | X | ( | X |− .Therefore by Chernoff’s bound (see, e.g., Appendix A in [5]), the probability that it deviates from thisvalue by t is at most 2 e − t / | X | . Thus choosing t = 1 . 
ℓ / we obtain that the probability that thereis a subset of vertices X such that (cid:12)(cid:12) e ( X ) − | X | ( | X |− (cid:12)(cid:12) > t is at most 2 ℓ · e − t /ℓ ≪ 1. This impliesthat there is graph Γ on ℓ vertices such that every subset X of Γ satisfies (cid:12)(cid:12)(cid:12) e ( X ) − | X | (cid:12)(cid:12)(cid:12) ≤ ℓ / . (5)Let G be the graph obtained by replacing every vertex u of Γ with an independent set I u , of size n/ℓ , and by replacing every edge ( u, v ) of Γ with a complete bipartite graph, whose partition classesare independent sets I u and I v . Clearly, since Γ does not contain K k , then neither does G . We claimthat graph G satisfies the assertion of the proposition. Suppose for contradiction that there is a subset S of n/ G satisfying e ( S ) − n > ℓ / ( n/ℓ ) = 4 ℓ − / n = 2 − k/ n , (the other case when e ( S ) − n / < − ℓ − / n can be treated similarly). For every vertex u ∈ Γlet the size of S ∩ I u be a u n/ℓ . By definition, 0 ≤ a u ≤ S has size n/ P u a u = ℓ/ 2. We also have that e ( S ) = X ( u,v ) ∈ E (Γ) a u a v · ( n/ℓ ) > n + 4 ℓ / ( n/ℓ ) , and therefore X ( u,v ) ∈ E (Γ) a u a v > ℓ / 16 + 4 ℓ / = 14 (cid:16) X u a u (cid:17) + 4 ℓ / . Consider a random subset Y of Γ obtained by choosing every vertex u randomly and independentlywith probability a u . Since all choices were independent we have that E (cid:2) | Y | (cid:3) = X u a u + X u = v a u a v ≤ (cid:0) X u a u (cid:1) + ℓ/ . We also have that the expected number of edges spanned by Y is E (cid:2) e ( Y ) (cid:3) = P ( u,v ) ∈ E (Γ) a u a v . Then,by the above discussion, E (cid:2) e ( Y ) − | Y | / (cid:3) > ℓ / . In particular, there is subset Y of Γ with thisproperty, which contradicts (5). This shows that every subset S of n/ G satisfies (cid:12)(cid:12)(cid:12) e ( S ) − n (cid:12)(cid:12)(cid:12) ≤ − k/ n and completes the proof. 
✷

The main result in this section is Theorem 5.4, which shows that any sufficiently pseudo-random graph of appropriate density has strong induced Ramsey properties. It generalizes Theorem 1.4 and Corollary 1.6 from the introduction. Combined with known examples of pseudo-random graphs, this theorem gives various explicit constructions which match and improve the best known estimates for induced Ramsey numbers.

The idea of the proof of Theorem 5.4 is rather simple. We have a sufficiently large, pseudo-random graph $G$ that is not too sparse or dense. We also have $d$-degenerate graphs $H_1$ and $H_2$, each with vertex set $[k]$ and chromatic number at most $q$. We suppose for contradiction that there is a red-blue edge-coloring of $G$ without an induced red copy of $H_1$ and without an induced blue copy of $H_2$. We may view the red-blue coloring of $G$ as a red-blue-green edge-coloring of the complete graph $K_{|G|}$, in which the edges of $G$ have their original color, and the edges of the complement $\bar G$ are colored green. The fact that in $G$ there is no induced red copy of $H_1$ means that the red-blue-green coloring of $K_{|G|}$ does not contain a particular red-green coloring of the complete graph $K_k$. Then we prove, similar to Lemma 2.1 of Erdős and Hajnal, that any large subset of vertices of $G$ contains two large disjoint subsets for which the edge density in color red between them is small. By using the key lemma from Section 3, we find $k$ large disjoint vertex subsets $V_1, \ldots, V_k$ of $G$ for which the edge density in color red is small between any pair $(V_i, V_j)$ for which $(i, j)$ is an edge of $H_2$.

Next we try to find an induced blue copy of $H_2$ with vertex $i$ in $V_i$ for all $i \in [k]$. Since the edge density between $V_i$ and $V_j$ in color red is sufficiently small for every edge $(i, j)$ of $H_2$, we can build an induced blue copy of $H_2$ one vertex at a time.
At each step of this process we use pseudo-randomness of $G$ to make sure that the existing possible subsets for not yet embedded vertices of $H_2$ are sufficiently large and that the density of red edges does not increase a lot between any pair of subsets corresponding to adjacent vertices of $H_2$. This last part of the proof, embedding an induced blue copy of $H_2$, is the most technically involved and is handled by Lemma 5.5.

Recall that $[i] = \{1, \ldots, i\}$ and that a graph is $d$-degenerate if every subgraph has a vertex of degree at most $d$. For an edge-coloring $\Psi : E(K_k) \to [r]$, we say that another edge-coloring $\Phi : E(K_n) \to [s]$ is $\Psi$-free if, for every subset $W$ of size $k$ of the complete graph $K_n$, the restriction of $\Phi$ to $W$ is not isomorphic to $\Psi$. In the following lemma, we have a coloring $\Psi$ of the edges of the complete graph $K_k$ with colors 1 and 2 such that the graph of color 2 is $d$-degenerate. We also have a $\Psi$-free coloring $\Phi$ of the edges of the complete graph $K_n$ such that between any two large subsets of vertices there are sufficiently many edges of color 1. With these assumptions, we show that there are two large subsets of $K_n$ which in coloring $\Phi$ have few edges of color 2 between them. A graph $G$ is bi-$(\epsilon, \delta)$-dense if $d(A, B) > \epsilon$ holds for all disjoint subsets $A, B \subset V(G)$ with $|A|, |B| \ge \delta |V(G)|$.

Lemma 5.1 Let $d$ and $k$ be positive integers and $\Psi : E(K_k) \to [2]$ be a 2-coloring of the edges of $K_k$ such that the graph of color 2 is $d$-degenerate. Suppose that $q, \epsilon \in (0, 1)$ and $\Phi : E(K_n) \to [s]$ is a $\Psi$-free edge-coloring such that the graph of color 1 is bi-$(q, \epsilon^d q^k k^{-2})$-dense. Then there are disjoint subsets $A$ and $B$ of $K_n$ with $|A|, |B| \ge \epsilon^d q^k k^{-2} n$ such that every vertex of $A$ is connected to at most $\epsilon |B|$ vertices in $B$ by edges of color 2.

Proof. Note that, from the definition, the vertices of every $d$-degenerate graph can be labeled $1, 2, \ldots$ such that for every vertex $\ell$ the number of vertices $j < \ell$ adjacent to it is at most $d$.
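The labeling claimed in the last sentence can be produced greedily by repeatedly deleting a vertex of minimum degree and reading the deletion order backwards. The following is a minimal sketch, assuming the graph is given as a dictionary mapping each vertex to its set of neighbours; `degenerate_order` and `back_degree` are illustrative names, not notation from the paper.

```python
def degenerate_order(adj):
    """Return (order, d): an ordering in which every vertex has at most d
    neighbours appearing earlier, where d is the largest minimum degree
    seen while peeling (the degeneracy of the graph)."""
    remaining = {v: set(nb) for v, nb in adj.items()}
    removed = []
    d = 0
    while remaining:
        # Peel off a vertex of minimum degree in what is left.
        v = min(remaining, key=lambda x: len(remaining[x]))
        d = max(d, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removed.append(v)
    # The vertex removed first goes to the end of the list.
    return removed[::-1], d


def back_degree(adj, order):
    """Largest number of neighbours any vertex has before it in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return max((sum(1 for u in adj[v] if pos[u] < pos[v]) for v in order),
               default=0)


# Illustrative example: a 4-cycle 0-1-2-3 with a pendant vertex 4 attached to 3.
example = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2, 4}, 4: {3}}
order, d = degenerate_order(example)
```

When a vertex is peeled, its neighbours still present are exactly the ones that end up earlier in the final order, which is why the back-degree never exceeds the value `d` returned.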
(Indeed, remove from the graph a vertex of minimum degree, place it at the end of the list and repeat this process in the remaining subgraph.) Therefore we may assume that the labeling $1, \ldots, k$ of vertices of $K_k$ has the property that for every $\ell \in [k]$ there are at most $d$ vertices $j < \ell$ such that the color $\Psi(j, \ell) = 2$.

Partition the vertices of $K_n$ into sets $V_1 \cup \ldots \cup V_k$, each of size $\frac{n}{k}$. For $w \in V_i$ and a subset $S \subset V_j$ with $j \ne i$, let $N(w, S) = \{ s \in S \mid \Phi(w, s) = \Psi(i, j) \}$. For $i < \ell$, let $D(\ell, i)$ denote the number of vertices $j \le i$ such that the color $\Psi(j, \ell) = 2$. By the above assumption, $D(\ell, i) \le d$ for $1 \le i < \ell \le k$.

We will try iteratively to build a copy of $K_k$ with coloring $\Psi$. After $i$ steps, we either find two disjoint subsets of vertices $A, B$ which satisfy the assertion of the lemma or we will have vertices $v_1, \ldots, v_i$ and subsets $V_{i+1,i}, V_{i+2,i}, \ldots, V_{k,i}$ such that

1. $V_{\ell,i}$ is a subset of $V_\ell$ of size $|V_{\ell,i}| \ge \epsilon^{D(\ell,i)} q^{i - D(\ell,i)} |V_\ell|$ for all $i+1 \le \ell \le k$,
2. $\Phi(v_j, v_\ell) = \Psi(j, \ell)$ for $1 \le j < \ell \le i$,
3. and if $j \le i < \ell$ and $w \in V_{\ell,i}$, then $\Phi(v_j, w) = \Psi(j, \ell)$.

In the first step, we call a vertex $w \in V_1$ good if $|N(w, V_j)| \ge \epsilon |V_j|$ for all $j > 1$ with $\Psi(1, j) = 2$ and $|N(w, V_j)| \ge q |V_j|$ for all $j > 1$ with $\Psi(1, j) = 1$. If there is no good vertex in $V_1$, then there is a subset $A \subset V_1$ with $|A| \ge \frac{1}{k-1} |V_1|$ and an index $j > 1$ such that either $\Psi(1, j) = 1$ and every vertex $w \in A$ has fewer than $q |V_j|$ edges of color 1 to $V_j$, or $\Psi(1, j) = 2$ and every vertex $w \in A$ is connected to fewer than $\epsilon |V_j|$ vertices in $V_j$ by edges of color 2. Letting $B = V_j$, we conclude that the first case is impossible since the graph of color 1 is bi-$(q, \epsilon^d q^k k^{-2})$-dense, while in the second case we would be done, since $A$ and $B$ would satisfy the assertion of the lemma. Therefore, we may assume that there is a good vertex $v_1 \in V_1$, and we define $V_{i,1} = N(v_1, V_i)$ for $i > 1$.

Suppose that after step $i$ the properties 1-3 are still satisfied.
Then, in step $i+1$, a vertex $w \in V_{i+1,i}$ is called good if $|N(w, V_{j,i})| \ge \epsilon |V_{j,i}|$ for each $j > i+1$ with $\Psi(i+1, j) = 2$ and $|N(w, V_{j,i})| \ge q |V_{j,i}|$ for each $j > i+1$ with $\Psi(i+1, j) = 1$. If there is no good vertex in $V_{i+1,i}$, then there is a subset $A \subset V_{i+1,i}$ with $|A| \ge \frac{1}{k-i-1} |V_{i+1,i}|$ and $j > i+1$ such that either $\Psi(i+1, j) = 1$ and every vertex $w \in A$ has fewer than $q |V_{j,i}|$ edges of color 1 to $V_{j,i}$, or $\Psi(i+1, j) = 2$ and every vertex $w \in A$ is connected to fewer than $\epsilon |V_{j,i}|$ vertices in $V_{j,i}$ by edges of color 2. Note that even in the last step when $i+1 = k$ the size of $A$ is still at least $|V_{k,k-1}|/k \ge \epsilon^d q^k |V_k|/k \ge \epsilon^d q^k k^{-2} n$. Therefore, letting $B = V_{j,i}$, we conclude that as before the first case is impossible since the graph of color 1 is bi-$(q, \epsilon^d q^k k^{-2})$-dense, while the second case would complete the proof, since $A$ and $B$ would satisfy the assertion of the lemma. Hence, we may assume that there is a good vertex $v_{i+1} \in V_{i+1,i}$, and we define $V_{j,i+1} = N(v_{i+1}, V_{j,i})$ for $j > i+1$. Note that $|V_{j,i+1}| \ge q |V_{j,i}|$ if $\Psi(i+1, j) = 1$ and $|V_{j,i+1}| \ge \epsilon |V_{j,i}|$ if $\Psi(i+1, j) = 2$. This implies that after step $i+1$ we have $|V_{\ell,i+1}| \ge \epsilon^{D(\ell,i+1)} q^{i+1 - D(\ell,i+1)} |V_\ell|$ for all $i+2 \le \ell \le k$.

The iterative process must stop at one of the steps $j \le k-1$, since otherwise the coloring $\Phi$ would not be $\Psi$-free. As we already explained above, when this happens we have two disjoint subsets $A$ and $B$ that satisfy the assertion of the lemma. ✷

Notice that if a coloring $\Phi : E(K_n) \to [s]$ is $\Psi$-free, then so is $\Phi$ restricted to any subset of $K_n$ of size $\alpha n$. Therefore, Lemma 5.1 has the following corollary.

Corollary 5.2 Let $d$ and $k$ be positive integers and $\Psi : E(K_k) \to [2]$ be a 2-coloring of the edges of $K_k$ such that the graph of color 2 is $d$-degenerate.
If q, α, ǫ ∈ (0 , and Φ : E ( K n ) → [ s ] is a Ψ -freeedge-coloring such that the graph of color is bi- ( q, αρ ) -dense with ρ = ǫ d q k k − , then the graph ofcolor is ( α, ρ, ǫ, -sparse. The next statement follows immediately from Corollary 5.2 (with ǫ/ ǫ ) and Corollary 3.3.19 orollary 5.3 Let d , k , and h be positive integers and Ψ : E ( K k ) → [2] be a -coloring of the edgesof K k such that the graph of color is d -degenerate. Suppose that q, α, ǫ ∈ (0 , and Φ : E ( K n ) → [ s ] is a Ψ -free edge-coloring such that the graph of color is bi- ( q, αρ ) -dense with ρ = ( ǫ/ d q k k − . Thenthe graph of color is (( ρ ) h − α, − h ρ h , ǫ, h ) -sparse. Pending one additional lemma, we are now ready to prove the main result of this section, showingthat pseudo-random graphs have strong induced Ramsey properties. Theorem 5.4 Let χ ≥ and G be a ( p, λ ) -pseudo-random graph with < p ≤ / and λ ≤ (( p k ) d − pk ) 20 log χ n . Then every d -degenerate graph on k vertices with chromatic number at most χ occurs as an induced monochromatic copy in every -coloring of the edges of G . Moreover, all ofthese induced monochromatic copies can be found in the same color. Taking p = 1 /k , n = k cd log χ and constant c sufficiently large so that (( p k ) d − pk ) 20 log χ > n − . one can easily see that this result implies Theorem 1.4. To obtain Corollary 1.6, recall that for aprime power n , the Paley graph P n has vertex set F n and distinct vertices x, y ∈ F n are adjacent if x − y is a square. This graph is (1 / , λ )-pseudo-random with λ = √ n (see e.g., [38]). Therefore, forsufficiently large constant c , the above theorem with n = 2 ck log k , p = 1 / d = χ = k impliesthat every graph on k vertices occurs as an induced monochromatic copy in all 2-edge-colorings ofthe Paley graph. 
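The Paley graph recalled above is easy to build explicitly. The sketch below is restricted, for simplicity, to a prime $p \equiv 1 \pmod 4$ (so that arithmetic modulo $p$ gives the field and adjacency is symmetric); general prime powers would need finite-field arithmetic, which is not attempted here. The function name `paley_graph` is illustrative.

```python
def paley_graph(p):
    """Adjacency sets of the Paley graph on Z_p.

    Assumes p is a prime with p % 4 == 1; distinct x, y are adjacent
    when x - y is a nonzero square modulo p.
    """
    assert p % 4 == 1
    assert all(p % q for q in range(2, int(p ** 0.5) + 1)), "p must be prime"
    squares = {(x * x) % p for x in range(1, p)}
    return {
        x: {y for y in range(p) if y != x and (x - y) % p in squares}
        for x in range(p)
    }


# Illustrative example on 13 vertices.
P13 = paley_graph(13)
```

The code does not compute $\lambda$; as a quick sanity check of the regular structure, every vertex of `P13` has degree 6, adjacent vertices have 2 common neighbours and non-adjacent vertices have 3.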
Similarly, one can prove that there is a constant c such that, with high probability,the random graph G ( n, / 2) with n ≥ ck log k satisfies that every graph on k vertices occurs as aninduced monochromatic copy in all 2-edge-colorings of G . Proof of Theorem 5.4. Suppose for contradiction that there is an edge-coloring Φ of G with colorsred and blue, and d -degenerate graphs H and H each having k vertices and chromatic number atmost χ such that there is no induced red copy of H and no induced blue copy of H . Since H , H are d -degenerate graphs on k vertices we may suppose that their vertex set is [ k ] and every vertex i has at most d neighbors less than i in both H and H .Consider the red-blue-green edge-coloring Φ of the complete graph K n , in which the edges of G have their original coloring Φ , and the edges of the complement ¯ G are colored green. Let Ψ be theedge-coloring of the complete graph K k where the red edges form a copy of H and the remainingedges are green. By assumption, the coloring Φ is Ψ-free. Since G is ( p, λ )-pseudo-random, we havethat the density of edges in ¯ G between any two disjoint sets A, B of size at least 6 p − λ is at least d ¯ G ( A, B ) = 1 − d G ( A, B ) ≥ − (cid:16) p + λ p | A || B | (cid:17) ≥ − p. Therefore the green graph in coloring Φ is bi-( q, p − λn )-dense for q = 1 − p/ ǫ = p k , ρ = ( ǫ/ d q k k − , h = log χ , and α = ( ρ/ h − . Using that q = 1 − p/ λ/n ≤ (( p k ) d − pk ) 20 log χ it is straightforward to check that 6 p − λn ≤ − h ρ h = αρ . By Corollary 5.3and Definition 3.1, there are 2 h = χ subsets W , . . . , W χ of K n with | W | = . . . = | W χ | ≥ − h ρ h n ,such that the sum of densities of red edges between all pairs W i and W j is at most (cid:0) χ (cid:1) ǫ . Hence, thedensity between W i and W j is also at most χ ǫ for all 1 ≤ i < j ≤ χ . Partition every set W i into k subsets each of size | W i | /k ≥ k − h ρ h n . 
Since the chromatic number of H is at most χ and it has k vertices, we can choose for every vertex i of H one of these subsets, which we call V i , such that20ll subsets corresponding to vertices of H in the same color class (of a proper χ -coloring) come fromthe same set W ℓ . In particular, for every edge ( i, j ) of H , the corresponding sets V i and V j lie in twodifferent sets { W ℓ } . Since the size of V i ’s is by a factor k smaller than the size of W ℓ ’s the density ofred edges between V i and V j corresponding to an edge in H is at most k χ ǫ ≤ p k (note that itcan increase by a factor at most k compare to density between sets { W ℓ } ). Notice that the subgraph G ′ ⊂ G induced by V ∪ . . . ∪ V k has n ′ ≥ − h ρ h n vertices and is also ( p, λ )-pseudo-random. By thedefinitions of ρ and h , and our assumption on λ , we have that λ/n ′ ≤ h − ρ − h λ/n ≤ h − ρ − h (cid:18)(cid:16) p k (cid:17) d − pk (cid:19) 20 log χ ≤ (cid:18)(cid:16) p k (cid:17) d − pk (cid:19) 10 log χ . Applying Lemma 5.5 below with H = H to the coloring Φ of graph G ′ with partition V ∪ . . . ∪ V k ,we find an induced blue copy of H , completing the proof. ✷ Lemma 5.5 Let H be a d -degenerate graph with vertex set [ k ] such that each vertex i has at most d neighbors less than i . Let G = ( V, E ) be a ( p, λ ) -pseudo-random graph on n vertices with < p ≤ / , λ ≤ (( p k ) d − pk ) n and let V = V ∪ . . . ∪ V k be a partition of its vertices such that each V i has size n/k . Suppose that the edges of G are -colored, red and blue, such that for every edge ( j, ℓ ) of H , thedensity of red edges between the pair ( V j , V ℓ ) is at most β = p k . Then there is an induced blue copyof H in G for which the image of every vertex i ∈ [ k ] lies in V i . Proof. For i < j , let D ( i, j ) denote the number of neighbors of j that are at most i . Let ǫ = k , ǫ = p k , and δ = (1 − p ) k p d . 
Since p ≤ / 4, notice that δ ≥ − pk p d and λ ≤ (cid:18)(cid:16) p k (cid:17) d − pk (cid:19) n ≤ p (10 k ) δ n. (6)We construct an induced blue copy of H one vertex at a time. At the end of step i , we will havevertices v , . . . , v i and subsets V j,i ⊂ V j for j > i such that the following four conditions hold1. for j, ℓ ≤ i , if ( j, ℓ ) is an edge of H , then ( v j , v ℓ ) is a blue edge of G , otherwise v j and v ℓ are notadjacent in G ,2. for j ≤ i < ℓ , if ( j, ℓ ) is an edge of H , then v j is adjacent to all vertices in V ℓ,i by blue edges,otherwise there are no edges of G from v j to V ℓ,i ,3. for i < j , we have | V j,i | ≥ (1 − p − ǫ ) i − D ( i,j ) ( p − ǫ ) D ( i,j ) | V j | ,4. and for j, ℓ > i if ( j, ℓ ) is an edge of H , then the density of red edges between V j,i and V ℓ,i is atmost (1 + ǫ ) i β .Clearly, in the end of the first k steps of this process we obtain a required copy of H . For i = 0and j ∈ [ k ], define V j, = V j . Notice that the above four properties are satisfied for i = 0 (the first twoproperties being vacuously satisfied). We now assume that the above four properties are satisfied atthe end of step i , and show how to complete step i + 1 by finding a vertex v i +1 ∈ V i +1 ,i and subsets V j,i +1 ⊂ V j,i for j > i + 1 such that the conditions 1-4 still hold.We need to introduce some notation. 
For a vertex $w \in V_j$ and a subset $S \subset V_\ell$ with $j \ne \ell$, let

• $N(w, S)$ denote the set of vertices $s \in S$ such that $(s, w)$ is an edge of $G$,
• $B(w, S)$ denote the set of vertices $s \in S$ such that $(s, w)$ is a blue edge of $G$,
• $R(w, S)$ denote the set of vertices $s \in S$ such that $(s, w)$ is a red edge of $G$,
• $\tilde N(w, S) = N(w, S)$ if $(j, \ell)$ is an edge of $H$ and $\tilde N(w, S) = S \setminus N(w, S)$ otherwise,
• $\tilde B(w, S) = B(w, S)$ if $(j, \ell)$ is an edge of $H$ and $\tilde B(w, S) = S \setminus N(w, S)$ otherwise, and
• $p_{j,\ell} = p$ if $(j, \ell)$ is an edge of $H$ and $p_{j,\ell} = 1 - p$ if $(j, \ell)$ is not an edge of $H$.

Note that since the graph $G$ is pseudo-random with edge density $p$, by the above definitions, for every large subset $S \subset V_\ell$ and for most vertices $w \in V_j$ we expect the size of $\tilde N(w, S)$ to be roughly $p_{j,\ell} |S|$. We also have for all $S \subset V_\ell$ and $w \in V_j$ that $\tilde B(w, S) = \tilde N(w, S) \setminus R(w, S)$.

Call a vertex $w \in V_{i+1,i}$ good if for all $j > i+1$, $|\tilde B(w, V_{j,i})| \ge (p_{i+1,j} - \epsilon_2) |V_{j,i}|$ and for every edge $(j, \ell)$ of $H$ with $j, \ell > i+1$, the density of red edges between $\tilde B(w, V_{j,i})$ and $\tilde B(w, V_{\ell,i})$ is at most $(1 + \epsilon_1)^{i+1} \beta$. If we find a good vertex $w \in V_{i+1,i}$, then we simply let $v_{i+1} = w$ and $V_{j,i+1} = \tilde B(w, V_{j,i})$ for $j > i+1$, completing step $i+1$. It therefore suffices to show that there is a good vertex in $V_{i+1,i}$.

We first throw out some vertices of $V_{i+1,i}$, ensuring that the remaining vertices satisfy the first of the two properties of good vertices. For $j > i+1$ and an edge $(i+1, j)$ of $H$, let $R_j$ consist of those $w \in V_{i+1,i}$ for which the number of red edges $(w, w_j)$ with $w_j \in V_{j,i}$ is at least $\frac{\epsilon_2}{2} |V_{j,i}|$. Since the density of red edges between $V_{i+1,i}$ and $V_{j,i}$ is at most $(1 + \epsilon_1)^i \beta$, the set $R_j$ contains at most

$$|R_j| \le \frac{(1 + \epsilon_1)^i \beta \, |V_{i+1,i}| \, |V_{j,i}|}{\frac{\epsilon_2}{2} |V_{j,i}|} = 2 (1 + \epsilon_1)^i \epsilon_2^{-1} \beta \, |V_{i+1,i}|$$

vertices. Let $V'$ be the set of vertices in $V_{i+1,i}$ that are not in any of the $R_j$.
Using that ǫ = 1 /k, ǫ = p k and β = p k we obtain | V ′ | ≥ | V i +1 ,i | − X j>i +1 | R j | ≥ | V i +1 ,i | − k (cid:16) ǫ ) i ǫ − β | V i +1 ,i | (cid:17) ≥ (cid:16) − k (1 + ǫ ) k ǫ − β (cid:17) | V i +1 ,i | ≥ | V i +1 ,i | . For j > i + 1, let S j consist of those w ∈ V ′ for which ˜ N ( w, V j,i ) < ( p i +1 ,j − ǫ ) | V j,i | . Then thedensity of edges of G between S j and V j,i deviates from p by at least ǫ . Since graph G is ( p, λ )-pseudo-random, we obtain that ǫ ≤ λ √ | V j,i || S j | and hence | S j | ≤ λ ǫ | V j,i | . Also using that p ≤ / − p − ǫ = 1 − p − p k ≥ (1 − k )(1 − p ). Therefore, our third condition, combined with δ = (1 − p ) k p d and (1 − x ) t ≥ − xt for all 0 ≤ x ≤ 1, imply that for j ≥ i + 1 | V j,i | ≥ (1 − p − ǫ ) i − D ( i,j ) ( p − ǫ ) D ( i,j ) | V j | ≥ (1 − p − ǫ ) k ( p − ǫ ) d | V j |≥ (cid:18)(cid:16) − k (cid:17) (1 − p ) (cid:19) k (cid:16) p − p k (cid:17) d | V j |≥ (cid:18) − k (cid:19) k (cid:18) − k (cid:19) k (1 − p ) k p d | V j |≥ 12 (1 − p ) k p d | V i | = δn k . (7)22ince λ ≤ pδ k n (see (6)) and ǫ = p k , we therefore have | S j | ≤ λ ǫ | V j,i | ≤ k | V i +1 ,i | . Let V ′′ be theset of vertices in V ′ that are not in any of the sets S j . The cardinality of V ′′ is at least | V ′′ | ≥ | V ′ | − X j>i +1 | S j | ≥ | V ′ | − k · (cid:16) k | V i +1 ,i | (cid:17) ≥ | V ′ | − | V i +1 ,i | ≥ | V i +1 ,i | . Moreover, by definition, for every j > i + 1 and every vertex w ∈ V ′′ there are | R ( w, V j,i ) | ≤ ǫ | V j,i | rededges from w to V j,i if ( i + 1 , j ) is an edge of H and also ˜ N ( w, V j,i ) has size at least ( p i +1 ,j − ǫ ) | V j,i | .This implies that | ˜ B ( w, V j,i ) | = | ˜ N ( w, V j,i ) \ R ( w, V j,i ) | ≥ | ˜ N ( w, V j,i ) | − ǫ | V j,i | ≥ ( p i +1 ,j − ǫ ) | V j,i | and therefore the vertices of V ′′ satisfy the first of the two properties of good vertices.We have reduced our goal to showing that there is an element of V ′′ that has the second propertyof good vertices. 
For i + 1 < j < ℓ ≤ k and ( j, ℓ ) an edge of H , let T j,ℓ denote the set of w ∈ V ′′ suchthat the density of red edges between ˜ B ( w, V j,i ) and ˜ B ( w, V ℓ,i ) is more than (1 + ǫ ) i +1 β . Notice thatany vertex of V ′′ not in any of the sets T j,ℓ is good. Therefore, if we show that T j,ℓ < | V ′′ | k for each T j,ℓ , then there is a good vertex in V ′′ and the proof would be complete. To do so we will assumewithout loss of generality that p i +1 ,j and p i +1 ,ℓ are both p (the other 3 cases can be treated similarlyusing the fact that ¯ G is (1 − p, λ )-pseudo-random). Since by (7) we have that | V ℓ,i | , | V j,i | ≥ δn k and | V ′′ | k ≥ k | V i +1 ,i | ≥ δn k , the result follows from the following claim. Claim 5.6 Let X, Y and Z be three disjoint subsets of our ( p, λ ) -pseudo-random graph G such that | X | ≥ δn k and | Y | , | Z | ≥ δn k . For every w ∈ X let B ( w ) , B ( w ) be the set of vertices in Y and Z respectively connected to w by a blue edge and suppose that | B ( w ) | ≥ ( p − p k ) | Y | and | B ( w ) | ≥ ( p − p k ) | Z | . Also suppose that the density of red edges between Y and Z is at most η for some η ≥ p k . Then there is a vertex w ∈ X such that the density of red edges between B ( w ) and B ( w ) is at most k +1 k η . Proof. Let m denote the number of triangles ( x, y, z ) with x ∈ X, y ∈ Y, z ∈ Z , such that the edge( y, z ) is red. We need an upper bound on m . Let U be the set of vertices in Y that have fewer than p δ (10 k ) − n red edges to Z . So the number m of triangles ( x, y, z ) which have y ∈ U and edge( y, z ) red is clearly at most m ≤ p δ (10 k ) − n . Let W , W denote the subsets of vertices in Y whose number of neighbors in X is at least ( p + p k ) | X | or respectively at most ( p − p k ) | X | . Since thedensity of edges between W i and X deviates from p by more than p k , using ( p, λ )-pseudo-randomnessof G , we have p k ≤ λ √ | X || W i | , or equivalently, | X || W i | ≤ k p − λ . 
Therefore, using the upperbound λ ≤ p (10 k ) δ n from (6), the number m of triangles ( x, y, z ) with y ∈ W = W ∪ W and edge( y, z ) red is at most m ≤ | X || W | n ≤ k p − λ n ≤ (10 k ) − p δ n . For y ∈ Y \ ( U ∪ W ), we have the number of neighbors of y in X satisfy (cid:12)(cid:12) | N ( y,X ) || X | − p (cid:12)(cid:12) ≤ p k and the number of red edges from y to Z is at least p δ (10 k ) − n . Recall that R ( y, Z ) denotes theset of vertices in Z connected to y by red edges, hence we have that | R ( y, Z ) | ≥ p δ (10 k ) − n forevery y ∈ Y \ ( U ∪ W ). We also have that | N ( y, X ) | ≥ p | X | / ≥ pδn k . Since G is ( p, λ )-pseudo-random, we can bound the number of edges between N ( y, X ) and R ( y, Z ) by p | N ( y, X ) || R ( y, Z ) | +23 p | N ( y, X ) || R ( y, Z ) | . Using the above lower bounds on | N ( y, X ) | and | R ( y, Z ) | , and the upper bound(6) for λ , one can easily check that λ p | N ( y, X ) || R ( y, Z ) | ≤ λ q(cid:0) pδn/ (16 k ) (cid:1)(cid:0) p δ (10 k ) − n (cid:1) ≤ p k . Hence the number of edges between N ( y, X ) and R ( y, Z ) is at most ( p + p k ) | N ( y, X ) || R ( y, Z ) | . Recallthat for all y ∈ Y \ ( U ∪ W ) we have that | N ( y, X ) | ≤ (cid:0) p + p k (cid:1) | X | . Also, since the density of rededges between Y and Z is at most η , we have that P y | R ( y, Z ) | ≤ η | Y || Z | . Therefore, the number m of triangles ( x, y, z ) with y ∈ Y \ ( U ∪ W ) , x ∈ X, z ∈ Z such that the edge ( y, z ) is red is at most m ≤ (cid:16) p + p k (cid:17) X y ∈ Y \ ( U ∪ W ) | N ( y, X ) || R ( y, Z ) | ≤ (cid:16) p + p k (cid:17) | X | X y | R ( y, Z ) | ≤ (cid:16) p + p k (cid:17) η | X || Y || Z | . Using the lower bounds on | X | , | Y | , | Z | , η from the assertion of the claim we have that p η | X || Y || Z | ≥ p δ (10 k ) n ≥ (10 k ) max (cid:0) m , m (cid:1) . 
This implies that the total number of triangles ( x, y, z ) with x ∈ X, y ∈ Y, z ∈ Z , such that the edge( y, z ) is red is at most m = m + m + m ≤ p η | X || Y || Z | (10 k ) + (cid:16) p + p k (cid:17) η | X || Y || Z |≤ (cid:0) / (8 k ) (cid:1) p η | X || Y || Z | . Therefore, there is vertex w ∈ X such that the number of these triangles through w is at most(1 + 1 / (8 k )) p η | Y || Z | . Since B ( w ) ⊂ N ( w, Y ) and B ( w ) ⊂ N ( w, Z ), then the number of red edgesbetween B ( w ) and B ( w ) is at most (1 + 1 / (8 k )) p η | Y || Z | . Since we have that | B ( w ) | ≥ ( p − p k ) | Y | and | B ( w ) | ≥ ( p − p k ) | Z | , the density of red edges between B ( w ) and B ( w ) can be at most(1 + 1 / (8 k )) p η | Y || Z || B ( w ) || B ( w ) | ≤ (1 + 1 / (8 k )) p η ( p − p k ) ≤ k + 1 k η, completing the proof. ✷ In this section we prove Theorem 1.7, that there are trees whose induced Ramsey number is superlinearin the number of vertices. The proof uses Szemer´edi’s regularity lemma, which we mentioned in theintroduction.A red-blue edge-coloring of the edges of a graph partitions the graph into two monochromatic sub-graphs, the red graph , which contains all vertices and all red edges, and the blue graph , which containsall vertices and all blue edges. The weak induced Ramsey number r weak ind ( H , H ), introduced byGorgol and Luczak [33], is the least positive integer n such that there is a graph G on n vertices suchthat for every red-blue coloring of the edges of G , either the red graph contains H as an induced24ubgraph or the blue graph contains H as an induced subgraph. Note that this definition is a relax-ation of the induced Ramsey numbers since we allow blue edges between the vertices of red copy of H or red edges between the vertices of blue copy of H . Therefore a weak induced Ramsey numberlies between the usual Ramsey number and the induced Ramsey number. Using this new notion wecan strengthen Theorem 1.7 as follows. 
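To make the definition of the weak induced Ramsey number concrete, here is a minimal brute-force sketch (not taken from the paper) that tests whether a small graph G has the defining property for a pair of graphs. The function names `has_induced_copy` and `is_weak_induced_ramsey` and the edge-set representation are illustrative assumptions. The search is exponential in the number of edges of G, so it is only meant for tiny examples.

```python
from itertools import combinations, permutations, product


def has_induced_copy(n, edges, k, pattern_edges):
    """Return True if the graph on vertices 0..n-1 with edge set `edges`
    contains the k-vertex pattern graph as an induced subgraph.

    Edge sets are sets of frozenset({u, v}) pairs (a representation
    assumed for this sketch only).
    """
    for image in permutations(range(n), k):
        # image[a] is the host vertex playing the role of pattern vertex a.
        if all(
            (frozenset((a, b)) in pattern_edges)
            == (frozenset((image[a], image[b])) in edges)
            for a, b in combinations(range(k), 2)
        ):
            return True
    return False


def is_weak_induced_ramsey(n, edges, h1, h2):
    """Return True if every red-blue coloring of `edges` yields an induced
    copy of h1 in the red graph or an induced copy of h2 in the blue graph.

    h1 and h2 are (number_of_vertices, edge_set) pairs.
    """
    edge_list = sorted(tuple(sorted(e)) for e in edges)
    for colors in product(("red", "blue"), repeat=len(edge_list)):
        red = {frozenset(e) for e, c in zip(edge_list, colors) if c == "red"}
        blue = {frozenset(e) for e, c in zip(edge_list, colors) if c == "blue"}
        if not (has_induced_copy(n, red, *h1) or has_induced_copy(n, blue, *h2)):
            return False  # this coloring avoids both patterns
    return True


# Small examples: P3 is the path on three vertices.
P3 = (3, {frozenset((0, 1)), frozenset((1, 2))})
star = {frozenset((0, leaf)) for leaf in (1, 2, 3)}            # K_{1,3}
triangle = {frozenset(e) for e in combinations(range(3), 2)}   # K_3
```

With these definitions, `is_weak_induced_ramsey(4, star, P3, P3)` holds, since two of the three star edges always share a color and their leaves are non-adjacent. `is_weak_induced_ramsey(3, triangle, P3, P3)` fails, because coloring every edge red leaves a red triangle, which has no induced path on three vertices.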
Recall that K ,k denotes a star with k edges. Theorem 6.1 For each α ∈ (0 , , there is a constant k ( α ) such that if H is a graph on k ≥ k ( α ) vertices with maximum independent set of size less than (1 − α ) k , then r weak ind ( H, K ,k ) ≥ kα . Let T be a tree which is a union of path of length k/ k/ T contains the path P k/ and the star K ,k/ as inducedsubgraphs, then r ind ( T ) ≥ r weak ind ( P k/ , K ,k/ ). By using the above theorem with k/ k , H = P k/ , and sufficiently small α , we obtain that r ind ( T ) /k → ∞ . Moreover the same holds for everysufficiently large tree which contains a star and a matching of linear size as subgraphs. We deduceTheorem 6.1 from the following lemma. Lemma 6.2 For each δ > there is a constant c δ > such that if G = ( V, E ) is a graph on n vertices, then there is a -coloring of the edges of G with colors red and blue such that the red graphhas maximum degree less than δn and for every subset W ⊂ V , either there are at least c δ n blue edgesin the subgraph induced by W or there is an independent set in W in the blue graph of cardinality atleast | W | − δn . Proof. Let ǫ = δ . By Szemer´edi’s regularity lemma, there is a positive integer M ( ǫ ) together withan equitable partition V = S ki =1 V i of vertices of the graph G = ( V, E ) into k parts with ǫ < k < M ( ǫ )such that all but at most ǫk of the pairs ( V i , V j ) are ǫ -regular. Recall that a partition is equitable if (cid:12)(cid:12) | V i | − | V j | (cid:12)(cid:12) ≤ V i , V j ) is called ǫ -regular if for every X ⊂ V i and Y ⊂ V j with | X | > ǫ | V i | and | Y | > ǫ | V j | , we have | d ( X, Y ) − d ( V i , V j ) | < ǫ . Let c δ = ǫM ( ǫ ) − . Notice that to prove Lemma6.2, it suffices to prove it under the assumption that n is sufficiently large. So we may assume that n ≥ ǫ − M ( ǫ ).If a pair ( V i , V j ) is ǫ -regular with density d ( V i , V j ) at least 2 ǫ , then color the edges between V i and V j blue. 
Let G ′ be the subgraph of G formed by deleting the edges of G that are already colored blue.Let V ′ be the vertices of G ′ of degree at least δn . Color blue any edge of G ′ with a vertex in V ′ . Theremaining edges are colored red. First notice that every vertex has red degree less than δn .We next show that | V ′ | is small by showing that G ′ has few edges. There are at most k X i =1 (cid:18) | V i | (cid:19) ≤ n k ≤ ǫn edges ( v, w ) of G with v and w both in the same set V i . Since at most ǫk of the pairs ( V i , V j ) arenot ǫ -regular, then there are at most ǫn edges in such pairs. The ǫ -regular pairs ( V i , V j ) with densityless than 2 ǫ contain at most a fraction 2 ǫ of all possible edges on n vertices. So there are less than ǫn edges of this type. Therefore the number of edges of G ′ is at most 3 ǫn , and therefore there areat most | V ′ | ≤ e ( G ′ ) δn ≤ ǫδ − n < δn vertices of degree at least δn in it.25et W ⊂ V . Let W ′ = W \ V ′ , so W ′ has cardinality at least | W | − δn . Let W i = V i ∩ W ′ . Let W ′′ = S | W i |≥ ǫ | V i | W i . Notice that for any i ∈ [ k ] there are at most ǫ nk vertices in ( W ′ \ W ′′ ) ∩ V i , sothere are at most k ( ǫ nk ) = ǫn = δ n vertices in W ′ \ W ′′ . Therefore, W ′′ has at least | W | − δn vertices.If there are i = j such that | W i | , | W j | ≥ ǫ nk and the pair ( V i , V j ) is ǫ -regular with density at least 2 ǫ ,then there are at least ǫ | W i || W j | ≥ ǫk n ≥ ǫM ( ǫ ) − n = c δ n blue edges between W i and W j . In this case the blue subgraph induced by W has at least c δ n edges.Otherwise, all the edges in W ′′ are red, and W ′′ is an independent set in the blue graph of cardinalityat least | W | − δn . ✷ Proof of Theorem 6.1. Let H be a graph on k vertices with maximum independent set of size lessthan (1 − α ) k . Take δ = α and c δ to be as in Lemma 6.2. Let G = ( V, E ) be any graph on n vertices,where n ≤ kα . 
If $H$ has at least $c_\delta k^2$ edges, consider a random red-blue coloring of the edges of $G$ in which each edge is red independently with probability $\alpha/2$. The expected degree of a vertex in the red graph is at most $\alpha n/2$. Therefore, by the standard Chernoff bound for the binomial distribution, with probability $1 - o(1)$ the degree of every vertex in the red graph is less than $\alpha n \le k$, i.e., the red graph contains no $K_{1,k}$. On the other hand, for $k$ sufficiently large, the probability that the blue graph contains a copy of $H$ is at most

$$n^k (1 - \alpha/2)^{e(H)} \le n^k e^{-\alpha c_\delta k^2/2} \le e^{-\alpha c_\delta k^2/2 + k \log(k/\alpha)} = o(1).$$

Thus with high probability this coloring has no blue copy of $H$ either. Hence we may assume that the number of edges in $H$ is less than $c_\delta k^2$.

By Lemma 6.2, there is a red-blue coloring of the edges of $G$ such that the red graph has maximum degree less than $\delta n$ and every subset $W \subset V$ either contains an independent set in the blue graph of size at least $|W| - \delta n$ or contains at least $c_\delta n^2$ blue edges. Since $\delta n = \alpha^2 n < k$, the red graph does not contain $K_{1,k}$ as a subgraph. Suppose for contradiction that there is an induced copy of $H$ in the blue graph, and let $W$ be the vertex set of this copy. The blue graph induced by $W$ has $e(H) < c_\delta k^2 \le c_\delta n^2$ edges. Therefore it contains an independent set of size at least $|W| - \delta n \ge |W| - \alpha k = (1 - \alpha) k$, contradicting the fact that $H$ has no independent set of size $(1 - \alpha) k$. Therefore, there are no induced copies of $H$ in the blue graph. ✷

• All of the results in this paper concerning induced subgraphs can be extended to many colors. One such multicolor result was already proved in Section 5 (see Lemma 5.1), and we use here the notation from that section. For example, one can obtain the following generalization of Theorem 1.1. For $k \ge 2$, let $\Psi : E(K_k) \to [r]$ be an edge-coloring of the complete graph $K_k$ and $\Phi : E(K_n) \to [s]$ be a $\Psi$-free edge-coloring of the complete graph $K_n$.
Then there is a constant $c$ so that for every $\epsilon \in (0, 1/2)$ there is a set $W \subset K_n$ of size at least $2^{-crk(\log 1/\epsilon)^2} n$ and a color $i \in [r]$ such that the edge density of color $i$ in $W$ is at most $\epsilon$. Since the proofs of this statement and other generalizations can be obtained using our key lemma in essentially the same way as the proofs of the results that we already presented (which correspond to the two color case), we do not include them here.

• It would be very interesting to get a better estimate in Theorem 1.1. This would immediately give an improvement of the best known result for the Erdős-Hajnal conjecture on the size of the maximum homogeneous set in $H$-free graphs. We believe that our bound can be strengthened as follows.

Conjecture 7.1 For each graph $H$, there is a constant $c(H)$ such that if $\epsilon \in (0, 1/2)$ and $G$ is an $H$-free graph on $n$ vertices, then there is an induced subgraph of $G$ on at least $\epsilon^{c(H)} n$ vertices that has edge density either at most $\epsilon$ or at least $1 - \epsilon$.

This conjecture, if true, would imply the Erdős-Hajnal conjecture. Indeed, take $\epsilon = n^{-1/(c(H)+1)}$. Then every $H$-free graph $G$ on $n$ vertices contains an induced subgraph on at least $\epsilon^{c(H)} n = n^{1/(c(H)+1)}$ vertices that has edge density at most $\epsilon$ or at least $1 - \epsilon$. Note that this induced subgraph or its complement has average degree at most 1, which implies that it contains a clique or independent set of size at least $\frac{1}{2} n^{1/(c(H)+1)}$.

• One of the main remaining open problems on induced Ramsey numbers is a beautiful conjecture of Erdős which states that there exists a positive constant $c$ such that $r_{\mathrm{ind}}(H) \le 2^{ck}$ for every graph $H$ on $k$ vertices. This, if true, would show that induced Ramsey numbers in the worst case have the same order of magnitude as ordinary Ramsey numbers. Our results here suggest that one can attack this problem by studying 2-edge-colorings of a random graph with edge probability $1/2$.
It looks very plausible that for a sufficiently large constant $c$, with high probability the random graph $G(n, 1/2)$ with $n \ge 2^{ck}$ has the property that any of its 2-edge-colorings contains every graph on $k$ vertices as an induced monochromatic subgraph. Moreover, maybe this is even true for every sufficiently pseudo-random graph with edge density $1/2$.

• The results on induced Ramsey numbers of sparse graphs naturally lead to the following questions. What is the asymptotic behavior of the maximum of induced Ramsey numbers over all trees on $k$ vertices? We have proved that $r_{\mathrm{ind}}(T)$ is superlinear in $k$ for some trees $T$. On the other hand, Beck [8] proved that $r_{\mathrm{ind}}(T) = O(k^2 \log^2 k)$ for all trees $T$ on $k$ vertices.

For induced Ramsey numbers of bounded degree graphs, we proved a polynomial upper bound with an exponent which is nearly linear in the maximum degree. Can this be improved further, e.g., is it true that the induced Ramsey number of every $n$-vertex graph with maximum degree $d$ is at most a polynomial in $n$ with exponent independent of $d$? It is known that the usual Ramsey numbers of bounded degree graphs are linear in the number of vertices.

Acknowledgment. We'd like to thank Janos Pach and Csaba Tóth for helpful comments on an early stage of this project and Steve Butler and Philipp Zumstein for carefully reading this manuscript.

References

[1] N. Alon, The Shannon capacity of a union, Combinatorica (1998), 301–310.
[2] N. Alon, M. Krivelevich, and B. Sudakov, Induced subgraphs of prescribed size, J. Graph Theory (2003), 239–251.
[3] N. Alon, J. Pach, R. Pinchasi, R. Radoičić, and M. Sharir, Crossing patterns of semi-algebraic sets, J. Combin. Theory Ser. A (2005), 310–326.
[4] N. Alon, J. Pach, and J. Solymosi, Ramsey-type theorems with forbidden subgraphs, Combinatorica (2001), 155–170.
[5] N. Alon and J. H. Spencer, The probabilistic method, Proceedings of the 37th ACM STOC (2005), 1–10.
[7] B. Barak, A. Rao, R. Shaltiel, and A.
Wigderson, 2-Source dispersers for sub-polynomial entropy and Ramsey graphs beating the Frankl-Wilson construction, Proceedings of 38th ACM STOC (2006), 671–680.
[8] J. Beck, On size Ramsey number of paths, trees and circuits II, in: Mathematics of Ramsey theory, Algorithms Combin., 5, Springer, Berlin, 1990, 34–45.
[9] J. Bourgain, More on the sum-product phenomenon in prime fields and its applications, Int. J. Number Theory (2005), 1–32.
[10] B. Bukh and B. Sudakov, Induced subgraphs of Ramsey graphs with many distinct degrees, J. Combin. Theory Ser. B (2007), 612–619.
[11] S. A. Burr and P. Erdős, On the magnitude of generalized Ramsey numbers for graphs, in: Infinite and Finite Sets, Vol. 1, Colloquia Mathematica Societatis János Bolyai, Vol. 10, North-Holland, Amsterdam/London, 1975, 214–240.
[12] M. Chudnovsky and S. Safra, The Erdős-Hajnal conjecture for bull-free graphs, preprint.
[13] F. R. K. Chung and R. L. Graham, On graphs not containing prescribed induced subgraphs, in: A Tribute to Paul Erdős, ed. by A. Baker, B. Bollobas and A. Hajnal, Cambridge University Press (1990), 111–120.
[14] F. R. K. Chung, R. L. Graham, and R. M. Wilson, Quasi-random graphs, Combinatorica (1989), 345–362.
[15] D. Conlon, A new upper bound for diagonal Ramsey numbers, Annals of Math., to appear.
[16] W. Deuber, A generalization of Ramsey's theorem, in: Infinite and Finite Sets, Vol. 1, Colloquia Mathematica Societatis János Bolyai, Vol. 10, North-Holland, Amsterdam/London, 1975, 323–332.
[17] R. Diestel, Graph theory, 2nd edition, Springer, 1997.
[18] N. Eaton, Ramsey numbers for sparse graphs, Discrete Math. (1998), 63–75.
[19] P. Erdős, Some remarks on the theory of graphs, Bull. Amer. Math. Soc. (1947), 292–294.
[20] P. Erdős, On some problems in graph theory, combinatorial analysis and combinatorial number theory, Graph theory and combinatorics (Cambridge, 1983) (B. Bollobás, ed.), Academic Press, London, New York, 1984, 1–17.
[21] P.
Erdős, Problems and results on finite and infinite graphs, in: Recent advances in graph theory (Proc. Second Czechoslovak Sympos., Prague, 1974), Academia, Prague, 1975, 183–192.
[22] P. Erdős, M. Goldberg, J. Pach, and J. Spencer, Cutting a graph into two dissimilar halves, J. Graph Theory (1988), 121–131.
[23] P. Erdős and A. Hajnal, Ramsey-type theorems, Discrete Appl. Math. (1989), 37–52.
[24] P. Erdős, A. Hajnal, and J. Pach, Ramsey-type theorem for bipartite graphs, Geombinatorics (2000), 64–68.
[25] P. Erdős, A. Hajnal, and L. Pósa, Strong embeddings of graphs into colored graphs, in: Infinite and Finite Sets, Vol. 1, Colloquia Mathematica Societatis János Bolyai, Vol. 10, North-Holland, Amsterdam/London, 1975, 585–595.
[26] P. Erdős and G. Szekeres, A combinatorial problem in geometry, Compositio Mathematica (1935), 463–470.
[27] P. Erdős and E. Szemerédi, On a Ramsey type theorem, Period. Math. Hungar. (1972), 295–299.
[28] J. Fox, J. Pach, and Cs. D. Tóth, Intersection patterns of curves, to appear in Israel J. of Math.
[29] J. Fox and B. Sudakov, Density theorems for bipartite graphs and related Ramsey-type results, preprint.
[30] P. Frankl and R. Wilson, Intersection theorems with geometric consequences, Combinatorica (1981), 357–368.
[31] W. T. Gowers, Lower bounds of tower type for Szemerédi's uniformity lemma, Geom. Funct. Anal. (1997), 322–337.
[32] W. T. Gowers, Rough structure and classification, GAFA 2000 (Tel Aviv, 1999), Geom. Funct. Anal. (2000) Special Volume, Part I, 79–117.
[33] I. Gorgol and T. Luczak, On induced Ramsey numbers, Discrete Math. (2002), 87–96.
[34] R. Graham, V. Rödl, and A. Ruciński, On graphs with linear Ramsey numbers, J. Graph Theory (2000), 176–192.
[35] P. E. Haxell, Y. Kohayakawa, and T. Luczak, The induced size-Ramsey number of cycles, Combin. Probab. Comput. (1995), 217–240.
[36] Y. Kohayakawa, H. Prömel, and V. Rödl, Induced Ramsey numbers, Combinatorica (1998), 373–404.
[37] J. Komlós and M.
Simonovits, Szemerédi's regularity lemma and its applications in graph theory. Combinatorics, Paul Erdős is eighty, Vol. 2 (Keszthely, 1993), 295–352, Bolyai Soc. Math. Stud., 2, János Bolyai Math. Soc., Budapest, 1996.
[38] M. Krivelevich and B. Sudakov, Pseudo-random graphs, in: More Sets, Graphs and Numbers, Bolyai Society Mathematical Studies 15, Springer, 2006, 199–262.
[39] D. Larman, J. Matoušek, J. Pach, and J. Törőcsik, A Ramsey-type result for convex sets, Bull. London Math. Soc. (1994), 132–136.
[40] T. Luczak and V. Rödl, On induced Ramsey numbers for graphs with bounded maximum degree, J. Combin. Theory Ser. B (1996), 324–333.
[41] V. Nikiforov, Edge distribution of graphs with few copies of a given graph, Combin. Probab. Comput. (2006), 895–902.
[42] H. Prömel and V. Rödl, Non-Ramsey graphs are c log n-universal, J. Combin. Theory Ser. (1999), 379–384.
[43] F. P. Ramsey, On a problem of formal logic, Proc. London Math. Soc. (1930), 264–286.
[44] V. Rödl, The dimension of a graph and generalized Ramsey theorems, Master's thesis, Charles University, 1973.
[45] V. Rödl, On universality of graphs with uniformly distributed edges, Discrete Math. (1986), 125–134.
[46] M. Schaefer and P. Shah, Induced graph Ramsey theory, Ars Combin. (2003), 3–21.
[47] S. Shelah, Erdős and Renyi conjecture, J. Combin. Theory Ser. A82