A matheuristic approach for the b-coloring problem using integer programming and a multi-start multi-greedy randomized metaheuristic

Abstract

Given a graph G=(V,E), the b-coloring problem consists in attributing a color to every vertex in V such that adjacent vertices receive different colors, every color has a b-vertex, and the number of colors is maximized. A b-vertex is a vertex adjacent to vertices colored with all used colors but its own. The b-coloring problem is known to be NP-Hard and its optimal solution determines the b-chromatic number of G, denoted \chi_b(G). This paper presents an integer programming formulation and a very effective multi-greedy randomized heuristic which can be used in a multi-start metaheuristic. In addition, a matheuristic approach is proposed combining the multi-start multi-greedy randomized metaheuristic with a MIP (mixed integer programming) based local search procedure using the integer programming formulation. Computational experiments establish the proposed multi-start metaheuristic as very effective in generating high quality solutions, along with the matheuristic approach successfully improving several of those results. Moreover, the computational results show that the multi-start metaheuristic outperforms a state-of-the-art hybrid evolutionary metaheuristic for a subset of the large instances which were previously considered in the literature. An additional contribution of this work is the proposal of a benchmark instance set, which consists of newly generated instances as well as others available in the literature for classical graph problems, with the aim of standardizing computational comparisons of approaches for the b-coloring problem in future works.


Rafael A. Melo ∗, Michell F. Queiroz †, Marcio C. Santos ‡

February 22, 2021

Keywords: metaheuristics; graph b-coloring; integer programming; fix-and-optimize; matheuristics.

1 Introduction

Given a simple graph G = (V, E) and a set of colors K = {1, ..., |K|}, define a coloring c : V → K as a function which assigns to each vertex v ∈ V a color i ∈ K. A coloring is said to be proper if c(u) ≠ c(v) for every uv ∈ E.
An example of a proper coloring is illustrated in Figure 1.

Figure 1: Illustration of a proper coloring with seven colors.

Given a coloring c, define v to be a b-vertex if v has at least one neighbor with each color in K \ {c(v)}; more precisely, N(v) ∩ {u ∈ V | c(u) = i} ≠ ∅ for every i ∈ K \ {c(v)}. A coloring is said to be a b-coloring if every color in K has at least one associated b-vertex. Examples of b-colorings are illustrated in Figure 2. Alternatively, define the color classes of c as the parts of a partition of V into independent sets C_i = {v ∈ V | c(v) = i} for each i ∈ K. A vertex v ∈ V with c(v) = j is called a b-vertex for color j if v has a neighbor representing every other color class, i.e., N(v) ∩ C_i ≠ ∅ for all i ∈ K \ {j}. In this alternative definition, a b-coloring is a proper coloring such that every color class has a b-vertex.

Figure 2: Illustrations of b-colorings with four and five colors.
(a) b-coloring with four colors in which the b-vertices are d (color 1), f (color 2), h (color 2), e (color 3), a (color 4), and g (color 4).
(b) b-coloring with five colors in which the b-vertices are d (color 1), f (color 2), g (color 3), e (color 4), and h (color 5).

∗ Universidade Federal da Bahia, Departamento de Ciência da Computação, Computational Intelligence and Optimization Research Lab (CInO), Salvador, Brazil. Corresponding author.
† Universidade Federal da Bahia, Departamento de Ciência da Computação, Computational Intelligence and Optimization Research Lab (CInO), Salvador, Brazil.
‡ Universidade Federal do Ceará, Campus Russas. Rua Felipe Santiago, 411. Russas, CE 62900-000. Brazil.
arXiv [math.OC]

The chromatic number of a graph G, χ(G), is the minimum number of colors needed to properly color G.
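The definitions of a proper coloring, a b-vertex and a b-coloring given above can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: the adjacency-set representation and the function names `is_proper`, `b_vertices` and `is_b_coloring` are assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Set

# Assumed representation: adj maps each vertex to the set of its neighbors,
# and c maps each vertex to its color.
Graph = Dict[Hashable, Set[Hashable]]
Coloring = Dict[Hashable, int]


def is_proper(adj: Graph, c: Coloring) -> bool:
    """A coloring is proper if c(u) != c(v) for every edge uv."""
    return all(c[u] != c[v] for u in adj for v in adj[u])


def b_vertices(adj: Graph, c: Coloring) -> Dict[int, Set[Hashable]]:
    """Map every used color to the set of its b-vertices."""
    used = set(c.values())
    result: Dict[int, Set[Hashable]] = {k: set() for k in used}
    for v in adj:
        seen = {c[u] for u in adj[v]}   # color neighborhood of v
        if used - {c[v]} <= seen:       # v sees every other used color
            result[c[v]].add(v)
    return result


def is_b_coloring(adj: Graph, c: Coloring) -> bool:
    """A b-coloring is a proper coloring in which every color has a b-vertex."""
    return is_proper(adj, c) and all(b_vertices(adj, c).values())
```

On the path a-b-c-d, for instance, the coloring (1, 2, 1, 2) passes the check, whereas (1, 2, 3, 1) is proper but fails it, because neither vertex of color 1 sees both colors 2 and 3.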
The b-chromatic number of a graph G, χ_b(G), is the maximum number of colors for which G admits a b-coloring. The coloring problem consists in finding a proper coloring of a graph that minimizes the number of colors. The b-coloring problem consists in finding a b-coloring of a graph that maximizes the number of colors. The problem of finding χ_b(G) was shown to be NP-hard by Irving and Manlove (1999); thus the b-coloring problem is NP-hard.

Although the b-coloring and the coloring problems appear to be closely related, they have several differences. First of all, their objective functions point in opposite directions, and the difference between their optimal solution values can be arbitrarily large (Kratochvíl, Tuza, & Voigt, 2002). Furthermore, the b-coloring problem can be largely influenced by the girth (length of a shortest cycle) of the graph, which is not exactly the case for the coloring problem (V. Campos, Lima, & Silva, 2013). Besides, a property that is commonly exploited by constructive and enumerative methods for the coloring problem is the fact that one can have solutions with the number of colors ranging from the chromatic number to the cardinality of the vertex set. However, it is not true that one can construct a b-coloring with k colors for every integer k ranging from the chromatic number to the b-chromatic number (Barth, Cohen, & Faik, 2007). Additionally, notice that a proper graph coloring which is not a b-coloring can be trivially improved by the removal of a color, namely, one that does not have a b-vertex. Therefore, when one is trying to minimize the number of colors, b-colorings appear naturally, as otherwise the available coloring could be easily improved. On the other hand, when one is trying to maximize the number of colors, it is a challenging task to increase the number of colors while ensuring that b-vertices are generated for every new color.
This suggests that the search for good quality solutions for the b-coloring problem should explore the structure of feasible solutions in a different manner.

V. A. Campos et al. (2015) presented a motivation for solving the b-coloring problem, namely, finding an upper bound for the b-algorithm, which is a heuristic approach for the coloring problem. The b-algorithm works as follows: it begins with a greedy coloring and afterwards tries to reduce the number of used colors by changing the colors of certain vertices. In this context, a b-vertex represents a vertex that cannot have its color changed and thus forbids further improvements by the b-algorithm. Hence, the b-chromatic number represents the worst case of the b-algorithm.

Let N(v) = {u ∈ V | uv ∈ E} be the open neighborhood (or simply neighborhood) of v in G, N[v] = N(v) ∪ {v} be the closed neighborhood of v in G, N̄(v) = V \ N[v] be the anti-neighborhood of v in G, and N̄[v] = N̄(v) ∪ {v} be the closed anti-neighborhood of v in G. Define N_c(v) = {i ∈ K | c(u) = i for some u ∈ N(v)} to be the set of colors adjacent to v, which we call the color neighborhood of v. Also, let N_c[v] = N_c(v) ∪ {c(v)} be the color closed neighborhood, and N̄_c(v) = K \ N_c[v] be the color anti-neighborhood of v. Denote the degree of v by d(v), which is the size of its neighborhood |N(v)|. Considering ∆(G) to be the maximum degree of a graph, we write ∆ whenever G is clear from the context. The neighborhood of a b-vertex can contain at most ∆ colors. Define the color degree of v by d_c(v), which is the size of its color neighborhood |N_c(v)|. Consider a sorting of the vertices V = {v_1, v_2, ..., v_n} such that d(v_1) ≥ d(v_2) ≥ ... ≥ d(v_n). The invariant m(G) = max{i | i − 1 ≤ d(v_i)} provides an upper bound for the b-chromatic number of G. Let V_m ⊆ V be the subset of vertices with degree at least m(G) − 1, i.e., d(v) ≥ m(G) − 1 for each v ∈ V_m. Denote K_m ⊆ K as the set of colors that were attributed to some vertex in V_m in a given coloring c, i.e., k ∈ K_m if there is a vertex v ∈ V_m such that c(v) = k.

The concept of b-coloring has appeared in different applications. Gaceb, Eglin, Lebourgeois, and Emptoz (2008, 2009) applied b-coloring to improve postal mail sorting systems, which are based on efficient optical recognition of the addresses on envelopes. The authors presented a new approach for address block localization, which is a very important step in the recognition of the addresses. Their approach uses b-coloring to train a classifier in the identification of the address block, and according to the authors a rate of 98% good locations on a set of 750 envelope images was obtained. Elghazel, Deslandres, Hacid, Dussauchoy, and Kheddouci (2006) proposed a new clustering approach based on b-coloring of graphs. The presented cluster validation algorithm evaluates the quality of clusters based on the b-vertex property. The authors use this clustering technique to detect a new typology of hospital stays in the French healthcare system.

Several authors studied properties of b-coloring for special classes of graphs. Kratochvíl et al. (2002) have shown that deciding the b-chromatic number is NP-complete even for bipartite graphs. A graph G is m-tight if it has exactly m(G) vertices with degree exactly equal to m(G) − 1. In this regard, Havet, Sales, and Sampaio (2012) proved that deciding if χ_b(G) = m(G) is NP-complete for tight chordal graphs, while showing that the b-chromatic number of a split graph can be obtained in polynomial time.

Primal bound results were introduced by Irving and Manlove (1999). We can assume that the chromatic number χ(G) is a lower bound, as every b-coloring is also a proper coloring. The upper bound is ∆ + 1, on account of the additional color being the color of a b-vertex itself. This upper bound can be narrowed, since a b-coloring needs a sufficient number of vertices of high degree. Naturally, for a b-coloring with k colors, at least k vertices with degree at least k − 1 are needed; hence m(G) is a reduced upper bound for the problem. A variety of bounds on the b-chromatic number were also presented in Alkhateeb and Kohl (2011); Balakrishnan and Raj (2013); Kouider and Mahéo (2002).

Regular graphs belong to a special class of graphs such that m(G) = ∆ + 1, one of the main reasons why they attract significant study. Kratochvíl et al. (2002) have shown that for every d-regular graph with at least d^4 vertices χ_b(G) = ∆ + 1, establishing that there is only a limited number of d-regular graphs for which χ_b(G) < ∆ + 1. Later, Cabello and Jakovac (2011) proved that for every d-regular graph with at least 2d^3 vertices χ_b(G) = ∆ + 1. A detailed review of the literature related to the b-chromatic number can be found in Jakovac and Peterin (2018).

The b-coloring problem for more general graphs was considered in several works. Corteel, Valencia-Pabon, and Vera (2005) introduced an approximation approach for the b-chromatic number. They have shown that the b-chromatic number cannot be approximated within a factor of 120/113 − ε for any constant ε > 0, unless P = NP. Galčík and Katrenič (2013) settled negatively the question about the existence of a constant-factor approximation algorithm for the b-chromatic number, proving that for graphs with n vertices, there is no ε > 0 for which the problem can be approximated within a factor n^(1/4 − ε), unless P = NP.

Despite the fact that the b-coloring problem has received a lot of attention from the graph theory community, just a few authors considered optimization approaches such as metaheuristics or integer programming. To the best of our knowledge, Fister, Peterin, Mernik, and Črepinšek (2015) were the first authors to propose a metaheuristic algorithm for the b-coloring problem. They proposed a hybrid evolutionary algorithm and tested its performance on a set of small instances composed of d-regular graphs. For the tested d-regular instances, the metaheuristic obtained the optimal solutions, which were attested using a brute force method. Encouraged by those results, the authors also considered larger benchmark instances from the second DIMACS implementation challenge (Johnson & Trick, 1996). As far as our knowledge goes, the only metaheuristic for the b-coloring problem is the one presented in Fister et al. (2015), contrasting with the classical graph coloring problem, for which a diversity of heuristic methods has been proposed in the literature (Avanthay, Hertz, & Zufferey, 2003; Blöchliger & Zufferey, 2008; de Werra, 1990; Lü & Hao, 2010; Mabrouk, Hasni, & Mahjoub, 2009).

Koch and Peterin (2015) introduced an integer linear programming formulation for the b-chromatic index χ′_b(G), the edge version of the problem. The authors also provide bounds and general results for a diversity of direct products of graphs regarding the b-chromatic index. Koch and Marenco (2019) proposed an integer programming approach for the decision version of the b-coloring problem, which consists in determining whether a graph G admits a b-coloring with a given number of colors. The authors also performed a polyhedral study of the proposed formulation, presented valid inequalities and implemented a branch-and-cut algorithm.
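The upper bound m(G) defined earlier in this section depends only on the degree sequence, so it is cheap to compute. The sketch below is an illustration under assumed names (`m_upper_bound` and `high_degree_vertices` do not appear in the paper); it sorts the degrees in nonincreasing order and returns the largest position i with i − 1 ≤ d(v_i), together with the vertex set V_m.

```python
from typing import Dict, Hashable, Iterable, Set


def m_upper_bound(degrees: Iterable[int]) -> int:
    """Return m(G) = max{ i | i - 1 <= d(v_i) } for degrees sorted nonincreasingly."""
    ordered = sorted(degrees, reverse=True)
    m = 0
    for i, d in enumerate(ordered, start=1):  # i is the 1-based position of v_i
        if i - 1 <= d:
            m = i
    return m


def high_degree_vertices(degree_of: Dict[Hashable, int]) -> Set[Hashable]:
    """Return V_m, the vertices with degree at least m(G) - 1."""
    m = m_upper_bound(degree_of.values())
    return {v for v, d in degree_of.items() if d >= m - 1}
```

For a d-regular graph with at least d + 1 vertices the function returns d + 1, in agreement with the remark that m(G) = ∆ + 1 for regular graphs.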
Computational experiments were performed testing whether χ_b(G) = m(G) for the input graphs.

The main contributions of this paper are an integer programming formulation for the b-coloring problem, a very effective multi-start multi-greedy randomized metaheuristic which attempts to explore the problem structure in the search for good quality solutions, and a matheuristic approach obtained by combining the proposed multi-start metaheuristic with a fix-and-optimize local search based on the introduced integer programming formulation. To the best of our knowledge, this paper presents the first matheuristic for the b-coloring problem, and the first integer programming formulation which can be directly applied to its optimization version. Furthermore, we present a benchmark set consisting of newly created instances as well as available ones for coloring and maximum clique problems. The computational experiments show that the newly proposed approaches are very effective, reaching and proving optimality for several of the tested instances. Furthermore, the approaches are able to outperform a state-of-the-art metaheuristic (Fister et al., 2015) for the b-coloring problem when taking into consideration all nine large instances considered by those authors.

The remainder of the paper is organized as follows. Section 2 introduces an integer programming formulation for the b-coloring problem. Section 3 describes the multi-greedy randomized heuristic. Section 4 presents the multi-start multi-greedy randomized metaheuristic, the MIP (mixed-integer programming) based fix-and-optimize local search procedure using the proposed integer programming formulation, and the matheuristic approach which is obtained by combining the first two. Section 5 summarizes the computational experiments. Final considerations are discussed in Section 6.

2 Integer programming formulation

We now describe a formulation by representatives (Campêlo, Corrêa, & Frota, 2004) for the b-coloring problem. Consider the binary variable x_uv to be equal to one if vertex u represents the color of vertex v and to be zero otherwise, defined for every ordered pair (u, v), with u ∈ V and v ∈ N̄[u]. In the proposed formulation, a vertex u ∈ V can only represent the color of another vertex if x_uu = 1, which means that u is the representative and also a b-vertex of that color. Note that a color may have several b-vertices, but only one of them will be the representative.

Define the set of vertices in the anti-neighborhood of u which are not adjacent to other vertices in this anti-neighborhood as N̄*(u) = N̄(u) − {v | ∃ w ∈ N̄(u), vw ∈ E}. Additionally, consider the complement of E as Ē = {uv | uv ∉ E}. The b-coloring problem can be cast as the following integer linear program:

\begin{align}
z = \max \quad & \sum_{u \in V} x_{uu} \tag{1} \\
& \sum_{v \in \bar{N}[u]} x_{vu} = 1, && \forall u \in V, \tag{2} \\
& x_{uv} + x_{uw} \le x_{uu}, && \forall u \in V,\ v, w \in \bar{N}(u) \text{ s.t. } vw \in E, \tag{3} \\
& x_{uv} \le x_{uu}, && \forall u \in V,\ v \in \bar{N}^{*}(u), \tag{4} \\
& \sum_{w \in N(v) \cap \bar{N}(u)} x_{uw} \ge x_{uu} + x_{vv} - 1, && \forall (u, v) \text{ s.t. } uv \in \bar{E}, \tag{5} \\
& x_{uv} \in \{0, 1\}, && \forall u \in V,\ v \in \bar{N}[u]. \tag{6}
\end{align}

The objective function (1) maximizes the number of representative vertices, which are the b-vertices. Constraints (2) ensure that every vertex receives exactly one color. Constraints (3) force the coloring to be proper. Constraints (4) guarantee that a vertex can only give a color if it is a representative (notice that if this constraint is removed, a vertex that has a stable set as anti-neighborhood is allowed to represent all its anti-neighborhood without being a representative). Constraints (5) are the b-coloring restrictions, which imply that if both u and v are b-vertices, then there must be a neighbor of v which is represented by u. This is achieved due to the fact that if both u and v are representatives, the right-hand side is equal to one, implying that the summation in the left-hand side, which is composed of the neighbors of v that can be represented by u, should be at least one. Constraints (6) ensure the integrality requirements on the variables.

Observation 1.

Let z be any valid lower bound for the optimal value z* of formulation (1)-(6), i.e., z ≤ z*. Let V′ ⊂ V be the set of vertices with degree strictly smaller than ⌈z⌉ − 1, i.e., d(u) < ⌈z⌉ − 1 for every u ∈ V′. Therefore, one can set to zero the variables x_uu corresponding to vertices u ∈ V′ without losing optimality, as the vertices in V′ can never be b-vertices in a b-coloring with at least ⌈z⌉ colors. Furthermore, the variables x_uv can also be set to zero for every pair (u, v) such that u ∈ V′ and v ∈ N̄(u).

Observation 2.

Let x̂ be an integer feasible solution for (1)-(6) with objective value ẑ. In any solution which strictly improves x̂, every vertex v ∈ V which is determined to be a representative must have degree d(v) at least ẑ, i.e., d(v) ≥ ẑ. Let V′ ⊂ V be the set of vertices with degree strictly smaller than ẑ, i.e., d(u) < ẑ for every u ∈ V′. Therefore, in order to obtain a solution which strictly improves x̂, one can set to zero the variables x_uu corresponding to vertices u ∈ V′ without losing optimality in case such an improving solution exists. Similarly to Observation 1, the variables x_uv can also be set to zero for every pair (u, v) such that u ∈ V′ and v ∈ N̄(u).

In this section, we present a multi-greedy randomized constructive heuristic for the b-coloring problem. The heuristic follows a two-phase framework similar to the one of Elghazel et al. (2006). In the first phase, an initial proper coloring, not necessarily a b-coloring, is generated. The second phase ensures that a proper b-coloring is obtained starting from the coloring achieved in the first phase. In the remainder of this section, after presenting the pseudo-code of the two-phase framework, we describe the details of the first phase in Subsection 3.1 and of the second phase in Subsection 3.2.

The multi-greedy randomized constructive heuristic runs in O(|V| + |V|∆ log ∆) (as will be shown in Corollary 1), and is described in Algorithm 1. It takes as inputs the graph G = (V, E) and two parameters regarding the sizes of restricted candidate lists (RCL) which will be defined later in this section, namely α and β. The algorithm returns a proper coloring c and a set of colors K_c.
The heuristic uses the following structures:
• c: structure that represents the coloring which assigns a color to each vertex v ∈ V;
• N_c: structure that represents the color neighborhoods of vertices v ∈ V in coloring c;
• K_c: set of colors used in coloring c;
• K_b: set of colors in K_c that have b-vertices.

The first phase of the approach is invoked in procedure INITIAL-COLORING (line 1), which will be detailed in Section 3.1, to obtain an initial proper coloring employing ∆ + 1 available colors. Observe that the upper bound ∆ + 1 was used instead of m(G) with the intention of not being too restrictive and giving more flexibility for the heuristic to use colors that will be removed later in the second phase of the framework. The structures c, N_c, K_c, and K_b are determined by this call to INITIAL-COLORING.

As already mentioned, procedure INITIAL-COLORING does not ensure a b-coloring, as some colors in K_c might not have a b-vertex. In order to obtain a feasible b-coloring, the second phase is invoked in procedure FIND-B-COLORING (line 2), which will be detailed in Section 3.2, to remove colors from K_c until a b-coloring is achieved. The updated structures c and K_c are returned at the end of the execution of FIND-B-COLORING. RANDOMIZED-CONSTRUCTIVE thus returns the obtained b-coloring c as well as the set of used colors K_c (line 3).

Algorithm 1:

RANDOMIZED-CONSTRUCTIVE(G, α, β)
  1  c, N_c, K_c, K_b ← INITIAL-COLORING(G, α);
  2  c, K_c ← FIND-B-COLORING(G, α, β, N_c, K_b, c, K_c);
  3  return b-coloring c, K_c;

An initial coloring is obtained using procedure INITIAL-COLORING, which is detailed in Algorithm 2. In addition to the graph G, the algorithm also takes as input a parameter α related to the size of restricted candidate lists. The structures c, N_c, K_c, and K_b will be returned at the end of its execution. We remark that, for ease of explanation, the pseudo-code which will be presented assumes the graph is connected. An easy way to overcome this limitation will be given once the algorithm is described. The following structures are used by the algorithm:
• K′: initial set of available colors;
• Q: stores the set of vertices which have already been colored;
• Υ_v: keeps the vertices in N(v) which have no attributed color;
• K_m: set of colors in K_c that were attributed to some vertex v ∈ V with degree at least m(G) − 1;
• HEURISTIC-COLOR-VERTEX: described in Algorithm 3, the procedure takes as inputs the graph G, vertices v and u, as well as structures N_c, K′, K_m. The method returns a color to be attributed to vertex u. Firstly, structure LC is initialized as empty (line 1), and will store the set of candidate colors for coloring u. The algorithm then checks if u has degree greater than or equal to m(G) − 1 (line 2). If so, it builds LC with the set of colors k ∈ K′ that belong to the color neighborhood of neither v nor u and have not been assigned to a vertex with degree greater than or equal to m(G) − 1, i.e., k ∉ N_c(v), k ∉ N_c(u) and k ∉ K_m (line 3). The purpose behind this coloring idea is to diversify the colors assigned to both neighborhoods of u and v, while trying to give different colors to vertices with high enough degrees to become b-vertices, in an attempt to increase the probability of finding b-vertices that represent the largest number of color classes.
If LC is still empty (line 4), the algorithm tries to include in LC the colors k ∈ K′ that belong to the color neighborhood of neither v nor u, i.e., k ∉ N_c(v) and k ∉ N_c(u) (line 5). If no such color exists, i.e., LC remains with no elements in line 6, LC is built in line 7 with the colors k ∈ K′ not belonging to the color neighborhood of u, i.e., k ∉ N_c(u). This guarantees at least one color in LC since the algorithm initially works with ∆ + 1 available colors and d(u) ≤ ∆. The color in LC with lowest index is returned in line 8.

Algorithm 2:

INITIAL-COLORING(G, α)
  1  c(v) ← 0, N_c(v) ← ∅ for each v ∈ V;
  2  K_c, K_m, K_b ← ∅;
  3  v_∆ ← argmax_{u ∈ V} {|N(u)|};
  4  K′ ← {1, 2, 3, ..., ∆ + 1};
  5  c(v_∆) ← 1;
  6  Update N_c(v′) for each v′ ∈ N(v_∆), K_c and K_m;
  7  Q ← {v_∆};
  8  while Q ≠ ∅ do
  9      Create RCL_Q(α) with the best elements in Q;
 10      v ← vertex randomly selected from RCL_Q(α);
 11      Υ_v ← {w ∈ N(v) | c(w) = 0};
 12      while Υ_v ≠ ∅ do
 13          Create RCL_Υv(α) with the best elements in Υ_v;
 14          u ← vertex randomly selected from RCL_Υv(α);
 15          c(u) ← HEURISTIC-COLOR-VERTEX(G, v, u, K′, K_m);
 16          Update, if necessary, N_c(v′) for each v′ ∈ N(u), K_c and K_m;
 17          Q ← Q ∪ {u};
 18          Υ_v ← Υ_v \ {u};
 19      Q ← Q \ {v};
 20  K_b ← {c(v) | v ∈ V and N_c[v] = K_c};
 21  return c, N_c, K_c, K_b;

Algorithm 2 first initializes the used structures as follows. For each vertex v ∈ V, the color neighborhood of v is initialized as empty and c(v) is set to 0, which implies that no color is assigned to v (line 1). The sets K_c, K_m and K_b are initialized as empty (line 2). Next, the algorithm sets v_∆ as the maximum degree vertex in G in line 3, where ties are broken arbitrarily. The set K′ is initialized with ∆ + 1 colors in line 4, followed by the coloring of v_∆ with color 1 in line 5. The structures N_c, K_c and K_m are updated in line 6. The neighborhoods of the vertices in Q are yet to be explored, and the set is initialized with v_∆ in line 7. The algorithm then performs a series of iterations to assign colors to the vertices in G while the set Q is not empty in lines 8-19. Elements from a restricted candidate list (RCL) containing the best elements in Q are randomly chosen along the construction of the solution. Given the vertices in Q, the greedy choice criterion for RCL_Q(α) is:
• maximization of the vertex degree: p_max = max_{v ∈ Q} d(v).
RCL_Q(α) is defined as a subset of Q containing all candidates whose evaluation for the greedy criterion lies in an interval of values defined by a parameter α ∈ [0.0, 1.0]. Let p_min = min_{v ∈ Q} d(v); this interval is then given by [p_max − α(p_max − p_min), p_max]. RCL_Q(α) is created in line 9. A vertex v is randomly selected from RCL_Q(α) in line 10. Set Υ_v is built in line 11 with the vertices in N(v) that have no assigned color. Similar to RCL_Q(α), elements from a restricted candidate list containing the best elements in Υ_v are randomly chosen along the construction of the solution.
Given the vertices in Υ_v, the greedy choice criterion for RCL_Υv(α) is:
• maximization of the vertex degree: p_max = max_{v ∈ Υ_v} d(v).
RCL_Υv(α) is defined for Υ_v as RCL_Q(α) was defined for Q. RCL_Υv(α) is created in line 13. A vertex u is randomly selected from RCL_Υv(α) in line 14 and receives a color determined by procedure HEURISTIC-COLOR-VERTEX in line 15. The structures N_c, K_c and K_m are updated in line 16. The neighborhood of u is yet to be explored, so the vertex is inserted into Q in line 17. Vertex u is then removed from Υ_v in line 18 and a new iteration resumes, until set Υ_v becomes empty. Vertex v is then withdrawn from Q in line 19. After all vertices have been colored, i.e., Q is empty, the algorithm updates the list K_b of colors having b-vertices in line 20. The structures c, N_c, K_c and K_b are then returned in line 21. Note that, for ease of explanation, the described pseudo-code assumes the graph is connected. However, in the case of a disconnected graph, this can be overcome by simply inserting into Q the uncolored vertex of highest degree (if there is at least one uncolored vertex) as a last step in the loop of lines 8-19 whenever Q becomes empty.

Proposition 1.

Algorithm 2 runs in O(|V|^2).

Proof. Consider Q and Υ_v to be ordered lists containing vertices sorted in nonincreasing order of vertex degree, which means that every element entering these lists should be inserted into the correct ordered position. Additionally, assume N_c(v) for each v ∈ V, K_c and K_m to be represented as (∆ + 1)-dimensional binary vectors, with each element k representing whether color k belongs to the corresponding set or not. Firstly, consider the running time to perform a single update of the structures. Note that there are O(∆) updates of structures N_c(v′) and each of them can be done in O(1). The updates of K_c and K_m can all be done in O(1). Thus, a single update of all the required structures can be done in O(∆). HEURISTIC-COLOR-VERTEX runs in O(∆), which is implied by the construction of LC and the selection of its minimum value. In Algorithm 2, the instructions of lines 3-7 run in O(|V|). Line 20 can be done in O(|V|∆). In order to determine the complexity of the while loop in lines 8-19, we perform an aggregated analysis. Note that each vertex v ∈ V is inserted into and removed from Q at most once and each insertion into this ordered list can be performed in O(|V|), implying O(|V|^2) for all the insertions. As Q is kept as an ordered list, whenever a vertex is to be removed from Q, line 9 is carried out in O(|V|). At the moment a vertex enters Q, lines 13-18 are executed in O(∆). We omit the entrance of vertices in Υ_v from the analysis as it is directly related to their entrance in Q, i.e., whenever a vertex enters Υ_v in line 11 it will be removed from Υ_v in line 18 just after its entrance in Q. Therefore, the overall running time of Algorithm 2 is O(|V| + |V|(|V| + ∆)), which is O(|V|^2).

Algorithm 3:

HEURISTIC-COLOR-VERTEX(G, v, u, K′, K_m)
  1  LC ← ∅;
  2  if d(u) ≥ m(G) − 1 then
  3      LC ← {k | k ∈ K′, k ∉ N_c(u), k ∉ N_c(v) and k ∉ K_m};
  4  if LC = ∅ then
  5      LC ← {k | k ∈ K′, k ∉ N_c(u) and k ∉ N_c(v)};
  6  if LC = ∅ then
  7      LC ← {k | k ∈ K′ and k ∉ N_c(u)};
  8  return min{k | k ∈ LC};

A feasible b-coloring is obtained using procedure FIND-B-COLORING, which is detailed in Algorithm 4. In addition to the graph G and the RCL size parameters α and β, the algorithm also takes as inputs the sets N_c, K_b, K_c, and the coloring c. Note that the inputs c and K_c will be updated by the algorithm and will be returned at the end of its execution. The following structure is used:
• K̄_b: set of colors that do not have b-vertices.

Algorithm 4:

FIND-B-COLORING(G, α, β, N_c, K_b, c, K_c)
  1  K̄_b ← K_c \ K_b;
  2  while K̄_b ≠ ∅ do
  3      Create RCL_K̄b(β) with the best elements in K̄_b;
  4      r ← color randomly selected from RCL_K̄b(β);
  5      foreach v ∈ V such that c(v) = r do
  6          Create RCL_N̄c(v)(α, β) with the best elements in N̄_c(v);
  7          c(v) ← color randomly selected from RCL_N̄c(v)(α, β);
  8          Update, if necessary, N_c(v′) for each v′ ∈ N(v);
  9      K_c ← K_c \ {r};
 10      foreach v ∈ V such that c(v) ∈ K̄_b do
 11          if N_c(v) ∪ {c(v)} = K_c then
 12              K̄_b ← K̄_b \ {c(v)};
 13      K̄_b ← K̄_b \ {r};
 14  return c, K_c;

FIND-B-COLORING, which is a modification of the b-algorithm mentioned in the introduction, consists in iteratively eliminating colors from the graph by recoloring vertices colored with colors in K̄_b. The set K̄_b is initialized with every color in K_c \ K_b in line 1. The algorithm then performs a series of iterations while K̄_b is not empty (lines 2-13). Elements from a restricted candidate list containing the best elements in K̄_b are randomly chosen along the construction of the solution. Given the colors in K̄_b, the greedy choice criterion for RCL_K̄b(β) is:
• maximization of the color index: p = max_{r ∈ K̄_b} r.
This criterion aims to remove colors with higher index since, after the execution of Algorithm 2, colors with smaller index are presumably closer to having a b-vertex. RCL_K̄b(β) is defined as a subset of K̄_b containing its β best candidates. RCL_K̄b(β) is created in line 3. A color r is randomly selected from RCL_K̄b(β) in line 4. For each vertex v ∈ V colored with r, i.e., c(v) = r, a new color is assigned to v (lines 5-8). Note that any color in N̄_c(v) is available to color v. Elements from a restricted candidate list containing the best elements in N̄_c(v) are randomly chosen along the construction of the solution.
Before explaining the greedy criteria, let $\zeta_{rv}$ be the number of vertices adjacent to $v$ such that color $r \in \bar{N}_c(v)$ is also not in their color neighborhood. Additionally, let $M_{uv} \subseteq \bar{N}_c(v)$ be the set of colors adjacent to neither $u$ nor $v$. Define $M^*_{uv} = \operatorname{argmin}_{M_{uv} : u \in N(v)} |M_{uv}|$ as the minimum cardinality set among all $M_{uv}$ for $u \in N(v)$. Note that $M^*_{uv}$ is the set of colors not adjacent to the vertex with the minimum number of missing colors in its color neighborhood. Given the colors in $\bar{N}_c(v)$, the greedy choice criteria for RCL$_{\bar{N}_c(v)}(\alpha, \beta)$ are:
• maximization of the number of vertices with a new color added to their color neighborhood: $p = \max_{r \in \bar{N}_c(v)} \zeta_{rv}$;
• minimization of the color index considering the colors in $M^*_{uv}$: $p = \min_{r \in M^*_{uv}} r$.

The first criterion intends to increase the color neighborhood of as many vertices as possible, whereas the second aims to predict the vertex which is the closest to becoming a b-vertex. Under the first criterion, RCL$_{\bar{N}_c(v)}(\alpha, \beta)$ is defined as the subset of $\bar{N}_c(v)$ containing all candidates whose evaluation of the greedy criterion lies in an interval of values defined by a parameter $\alpha \in [0, 1]$. Let $\underline{p} = \min_{r \in \bar{N}_c(v)} \zeta_{rv}$; this interval is then given by $[p - \alpha(p - \underline{p}),\ p]$. Under the second criterion, RCL$_{\bar{N}_c(v)}(\alpha, \beta)$ is defined as the subset of $\bar{N}_c(v)$ containing its $\beta$ best candidates.

RCL$_{\bar{N}_c(v)}(\alpha, \beta)$ is created in line 6. Either of the two greedy functions can be chosen for the construction of RCL$_{\bar{N}_c(v)}(\alpha, \beta)$, and they are selected at random with 50% chance each. Note that, as stated previously in the definition of the two criteria, the selected one determines whether RCL$_{\bar{N}_c(v)}(\alpha, \beta)$ uses $\alpha$ or $\beta$. Vertex $v$ receives a color randomly selected from RCL$_{\bar{N}_c(v)}(\alpha, \beta)$ in line 7. The color neighborhoods of the vertices in $N(v)$ are then updated in line 8.
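The two ways of building a restricted candidate list described above can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation: the function names are hypothetical, $\beta$ is interpreted as a fraction of the candidate list (an assumption, since the text only says "its $\beta$ best candidates"), and falling back to the value-based rule when $M^*_{uv}$ is empty is also an assumption.

```python
import math
import random


def rcl_by_value(scores, alpha):
    """Value-based RCL for a maximization criterion.

    scores maps each candidate color to its greedy value (e.g. zeta_rv).
    Keeps the candidates whose value lies in
    [p_max - alpha * (p_max - p_min), p_max].
    """
    p_max = max(scores.values())
    p_min = min(scores.values())
    threshold = p_max - alpha * (p_max - p_min)
    return [cand for cand, value in scores.items() if value >= threshold]


def rcl_by_rank(candidates, beta):
    """Cardinality-based RCL for the minimum-color-index criterion.

    Assumption: beta is a fraction, so the list keeps the
    ceil(beta * n) smallest indices, and never fewer than one.
    """
    ranked = sorted(candidates)
    size = max(1, math.ceil(beta * len(ranked)))
    return ranked[:size]


def choose_color(zeta, m_star, alpha, beta, rng=random):
    """Pick one of the two criteria with probability 1/2, then draw from its RCL.

    Assumption: if m_star is empty, the value-based criterion is used.
    """
    use_value_rule = rng.random() < 0.5 or not m_star
    rcl = rcl_by_value(zeta, alpha) if use_value_rule else rcl_by_rank(m_star, beta)
    return rng.choice(rcl)
```

With $\alpha = 0$ the value-based list contains only the candidates attaining the best value (a purely greedy choice), while $\alpha = 1$ admits every candidate (a purely random choice).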
After all vertices previously colored with $r$ have been assigned a new color, $r$ is removed from set $K_c$ in line 9. The algorithm then checks, in lines 10-12, whether colors in $\bar{K}_b$ now have a b-vertex. Colors that now have a b-vertex are removed from $\bar{K}_b$ in line 12. Lastly, $r$ is removed from $\bar{K}_b$ in line 13. The algorithm terminates when $\bar{K}_b = \emptyset$, which implies $K_b = K_c$, so the resulting b-coloring $c$ and the set of used colors $K_c$ are returned in line 14.

Proposition 2.

Algorithm 4 runs in $O(|V| \Delta^2 \log \Delta)$.

Proof. Observe that Algorithm 4 performs a series of color removals and updates. The while loop of lines 2-13 is executed $O(\Delta)$ times, as each color is removed at most once. Whenever a color $r$ is to be removed from $\bar{K}_b$, line 3 is carried out in $O(\Delta \log \Delta)$. The foreach loop of lines 5-8 is executed $O(|V|)$ times and each iteration is performed in $O(\Delta \log \Delta)$, therefore the complete loop is executed in $O(|V| \Delta \log \Delta)$. The foreach loop of lines 10-12 is also executed $O(|V|)$ times and the verification and possible updates are all performed in $O(1)$ for each iteration, consequently the complete loop is executed in $O(|V|)$. Note that in order to perform the verification of line 11 in $O(1)$, one could keep for each $v \in V$ an indicator vector corresponding to $N_c(v)$ together with the number of nonzero entries in this vector, as well as an indicator vector corresponding to $K_c$ in conjunction with the number of nonzero entries in this vector. The verification could thus be performed by simply comparing the number of nonzero entries in these two indicator vectors. Algorithm 4 thus runs in $O(\Delta(\Delta \log \Delta + |V| \Delta \log \Delta + |V|))$, which is $O(|V| \Delta^2 \log \Delta)$.

Corollary 1.

Algorithm 1 runs in $O(|V| + |V| \Delta^2 \log \Delta)$.

Proof. The result follows from Propositions 1 and 2. Note that the running time of the algorithm is dominated by the calls to Algorithms 2 and 4, and therefore runs in $O(|V| + |V| \Delta^2 \log \Delta)$.

In this section, before describing the matheuristic approach, we present its two main components: (a) the multi-start multi-greedy randomized metaheuristic and (b) the MIP (mixed integer programming) based fix-and-optimize local search procedure. The multi-start metaheuristic consists in performing a predefined number of iterations of the multi-greedy randomized heuristic and is described in Subsection 4.1. The MIP-based fix-and-optimize local search consists in solving a restricted MIP obtained by fixing certain decision variables and is described in Subsection 4.2. Finally, Subsection 4.3 presents the matheuristic approach, which consists in the combination of the multi-start metaheuristic with the MIP-based fix-and-optimize local search procedure.
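As a rough illustration of how the two components compose, the following Python sketch wires a multi-start loop (in the spirit of Algorithm 5) to a single local-search call (in the spirit of the matheuristic of Subsection 4.3). The callables `construct` and `local_search` are hypothetical stand-ins for the randomized constructive heuristic and the MIP-based procedure; neither is implemented here, and the final guard that keeps the better of the two solutions is an added safety assumption rather than part of the described method.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

Coloring = Dict[Hashable, int]
Solution = Tuple[Coloring, Set[int]]


def multistart(construct: Callable[[], Solution], it_max: int, upper_bound: int) -> Solution:
    """Run the constructive heuristic it_max times and keep the coloring
    that uses the most colors; stop early if the upper bound is reached."""
    best_c: Coloring = {}
    best_k: Set[int] = set()
    for _ in range(it_max):
        c, k = construct()
        if len(k) > len(best_k):
            best_c, best_k = c, k
            if len(best_k) == upper_bound:
                break  # the bound is met, so the solution is optimal
    return best_c, best_k


def matheuristic(
    construct: Callable[[], Solution],
    local_search: Callable[[Coloring, Set[int]], Solution],
    it_max: int,
    upper_bound: int,
) -> Solution:
    """Multi-start phase followed by one local-search call on its best solution."""
    c, k = multistart(construct, it_max, upper_bound)
    c2, k2 = local_search(c, k)
    # Assumption: keep the multi-start solution if the local search returns a worse one.
    return (c2, k2) if len(k2) >= len(k) else (c, k)
```

Passing the components as callables keeps the sketch independent of how the graph, the RCL parameters, or the MIP solver are represented.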

The pseudo-code of the multi-start multi-greedy randomized metaheuristic is described in Algorithm 5. In addition to the graph $G$ and two parameters regarding the sizes of restricted candidate lists (RCL), namely $\alpha$ and $\beta$, the algorithm also takes as input $it_{max}$, which represents the maximum number of iterations that the multi-greedy randomized heuristic will be executed.

Algorithm 5:

MULTISTART-B-COL($G, \alpha, \beta, it_{max}$)
 1  $K^*_c \leftarrow \emptyset$;
 2  for $i = 1, \ldots, it_{max}$ do
 3      $c_i, K_{c_i} \leftarrow$ RANDOMIZED-CONSTRUCTIVE($G, \alpha, \beta$);
 4      if $|K_{c_i}| > |K^*_c|$ then
 5          $c^* \leftarrow c_i$;
 6          $K^*_c \leftarrow K_{c_i}$;
 7      if $|K^*_c| = m(G)$ then
 8          return $c^*, K^*_c$;
 9  return $c^*, K^*_c$;

Procedure MULTISTART-B-COL saves in $c^*$ the best obtained coloring. The set of used colors in $c^*$, $K^*_c$, is initialized as empty (Algorithm 5, line 1). The coloring generated at iteration $i \leq it_{max}$ is represented by $c_i$ and the corresponding set of used colors by $K_{c_i}$. $|K_{c_i}|$ represents the solution value, which is the number of used colors in coloring $c_i$. The loop in lines 2-8 performs iterations $i = 1, \ldots, it_{max}$. The construction phase starts by invoking procedure RANDOMIZED-CONSTRUCTIVE to build the solution $c_i$ in line 3. In case an improving solution is obtained, the algorithm updates $c^*$ and $K^*_c$ in lines 5-6. If the solution value of $c^*$ matches the upper bound $m(G)$, the procedure terminates by returning $c^*$ in line 8, as the solution is proven to be optimal. Otherwise, a new iteration begins, until the maximum number of iterations $it_{max}$ is reached. The solution with the highest number of used colors, i.e., the best solution $c^*$ encountered by the multi-start phase, is returned in line 9.

Given an available feasible solution, the MIP-based fix-and-optimize local search procedure consists in generating a subproblem obtained from the original b-coloring problem by fixing certain decision variables at the values they assume in the available feasible solution, which is also offered as a warm start for the MIP solver. With fewer variables remaining to be optimized, it is expected that the resulting subproblem is more tractable by a standard MIP solver than the original problem. In this work, the input feasible solution consists of the best solution generated by MULTISTART-B-COL. The MIP-based fix-and-optimize local search is described in Algorithm 6.
In addition to the graph $G$ and an initial feasible solution, represented by $c$ and $K_c$, the algorithm also takes as input the maximum time allowed for solving the obtained MIP formulation, given by MAXTIME. In our framework, the initial feasible solution offered to MIP-LS is the currently best known solution returned by MULTISTART-B-COL.

Algorithm 6:

MIP-LS($G$, $c$, $K_c$, MAXTIME)
 1  $V_b, V_0 \leftarrow \emptyset$;
 2  foreach $k \in K_c$ do
 3      $u \leftarrow \operatorname{argmax}_{v \in V} \{d(v) \mid c(v) = k,\ N_c(v) = K_c \setminus \{k\}\}$;
 4      $V_b \leftarrow V_b \cup \{u\}$;
 5  foreach $u \in V \setminus V_b$ do
 6      if $d(u) < |K_c|$ then
 7          $V_0 \leftarrow V_0 \cup \{u\}$;
 8  Solve the MIP (1)-(6) with addition of constraints (7)-(8) and $c$ given as warm start, restricted to time limit MAXTIME, in order to obtain coloring $c^*$ using colors $K^*_c$;
 9  return the best solution $c^*, K^*_c$ encountered by the MIP;

The set of representative b-vertices $V_b \subseteq V$ and the set of vertices $V_0 \subseteq V$ that cannot be representatives in an improving solution are initialized as empty in line 1 of Algorithm 6. Set $V_b$ is built according to the input solution in the foreach loop of lines 2-4. Following Observations 1 and 2, set $V_0$ is built from the input solution in the foreach loop of lines 5-7 with all vertices which are not b-vertices in coloring $c$ and have degree strictly smaller than the number of colors $|K_c|$. Line 8 solves a mixed integer program defined by the formulation presented in Section 2, in which all variables corresponding to vertices in $V_b$ are fixed to one (i.e., all corresponding vertices are selected to be representatives in the solution) and all variables corresponding to vertices in $V_0$ are fixed to zero. Additionally, coloring $c$ is provided as a warm start, i.e., as an initial feasible solution. Fixing is achieved by adding the following constraints to the formulation:

$x_{uu} = 1, \quad \forall u \in V_b$, (7)
$x_{uv} = 0, \quad \forall u \in V_0,\ v \in \bar{N}[u]$. (8)

The best solution obtained by the resulting MIP restricted to a maximum time limit MAXTIME is returned in line 9. Note that the input coloring $c$ is always feasible for this MIP subproblem.

Combinations of metaheuristics with exact algorithms from mathematical programming approaches such as mixed integer programming (MIP), called matheuristics, have received considerable attention over the last few years.
It has been acknowledged by the optimization research community that combining efforts from exact and metaheuristic approaches can achieve better solutions when compared with pure classic methods (Raidl & Puchinger, 2008; Dumitrescu & Stützle, 2009). Matheuristics frequently rely on metaheuristics as the main method to compute good quality solutions, with the exact approach used to enhance these solutions by solving subproblems. Motivated by recent successful results obtained by matheuristics (Doi, Nishi, & Voß, 2018; Cunha, Kramer, & Melo, 2019; Perumal, Larsen, Lusby, Riis, & Sørensen, 2019; Melo, Queiroz, & Ribeiro, 2021), we combine the multi-start metaheuristic MULTISTART-B-COL that appears in Algorithm 5 with the MIP-based fix-and-optimize local search procedure presented in Algorithm 6, which produces the matheuristic MSBCOL$^+$:

MSBCOL$^+$
Step 1: $c^*, K^*_c \leftarrow$ MULTISTART-B-COL($G, \alpha, \beta, it_{max}$);
Step 2: $c'^*, K'^*_c \leftarrow$ MIP-LS($G$, $c^*$, $K^*_c$, MAXTIME);
Step 3: Return $c'^*, K'^*_c$.

All computational experiments were carried out on a machine running Ubuntu x86-64 GNU/Linux, with an Intel Core i7-8700 Hexa-Core 3.20GHz processor and 16 GB of RAM. The metaheuristic was coded in C++ and the formulation was solved using CPLEX 12.8 under standard configurations. Each execution of the solver was limited to one hour (3,600s). Subsection 5.1 describes the benchmark instances. Subsection 5.2 lists the tested approaches and reports the parameter settings. Subsections 5.3 and 5.4 summarize the computational results for small and large instances, respectively. Finally, Subsection 5.5 compares some of the obtained computational results with a state-of-the-art metaheuristic presented in Fister et al. (2015), taking into consideration a subset of the large instances.

The tests were carried out on a set of benchmark instances divided into small ($\leq$ …) and large ($>$ …) instances. A new set of instances was generated with ggen (Morgenstern, n.d.) and includes bipartite, geometric and random graphs. Small instances were created with the following parameters: (a) $\{50, 60, 70, 80\}$ vertices; (b) edge probability for random and bipartite graphs and the euclidean distance for geometric graphs lie in $\{0.2, 0.4, 0.6, 0.8\}$. Five instances were generated for each combination of number of vertices and edge probability (or euclidean distance for the geometric graphs); instances with the same characteristics but different seeds are organized into instance groups. Each instance group is identified by $C\_n\_p$, where $C$ represents the class of the graph: random ($R$), bipartite ($B$), and geometric ($G$); $n$ gives the number of vertices and $p$ denotes the edge probability for random and bipartite graphs, and the euclidean distance for geometric graphs. More challenging large bipartite and random instances were also created in a similar fashion but with the number of vertices in $\{500, 600, 700, 800\}$. We remark that all results reported for this set of instances represent average values over the corresponding instance group.

We also use the graphs presented in the benchmark instances from the Second DIMACS Implementation Challenge, as they are widely used in the literature, especially for coloring and maximum clique problems (Avanthay et al., 2003; Lü & Hao, 2010; Moalic & Gondran, 2018; Nogueira, Pinheiro, & Subramanian, 2018; San Segundo, Coniglio, Furini, & Ljubić, 2019). The instances are identified by their original filename and can be obtained from the DIMACS Implementation Challenges website (Trick et al., 2015). We denote the instances for coloring problems as graph coloring instances and those for maximum clique problems as maximum clique instances. The complete benchmark instances along with detailed results for each instance are available in Melo, Queiroz, and Santos (2020) at Mendeley Data.
In this subsection we present the tested approaches and the preliminary experiments carried out to determine the parameters of the proposed techniques. The following approaches were considered in the computational experiments:
(a) MSBCOL: run exclusively MULTISTART-B-COL in parallel using all cores of the target machine;
(b) MSBCOL$^+$: run the matheuristic, using the best solution encountered by the metaheuristic MSBCOL as a warm start for the MIP-based fix-and-optimize local search procedure;
(c) MSBCOL$^*$: run the complete integer programming formulation presented in Section 2, using the best solution encountered by the metaheuristic MSBCOL as a warm start. Following Observations 1 and 2, variables corresponding to vertices with degree less than or equal to this best solution value are fixed to zero, as long as they are not b-vertices in the warm start solution;
(d) IP: run the integer programming formulation presented in Section 2 without any initial solution or fixings of variables.

This test strategy was adopted to evaluate the behavior of the newly proposed methods according to the class and size of the benchmark instances. Furthermore, we wanted to verify the effectiveness of the MIP-based fix-and-optimize local search when compared with the complete formulation.

Define $p(G)$ to be the density of $G$, calculated as $p(G) = \frac{2 \times |E|}{|V| \times (|V| - 1)}$, and let the maximum number of iterations for MULTISTART-B-COL, $it_{max}$, be computed as $it_{max} = 100 + \left\lfloor \frac{\ldots}{\sqrt{|V|} \times \sqrt{p(G)}} \right\rceil$. This formula for $it_{max}$ can be interpreted as follows. The minimum number of iterations that the algorithm executes is given by the first part of the formula, which is the constant 100. The variable number of iterations, given by the rounded fraction, is inversely proportional to the size and density of the graph, as iterations become more time consuming on larger and denser graphs.
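The iteration budget can be sketched in a few lines of Python. The numerator of the fraction is not legible in this copy of the text, so it is exposed here as a hypothetical parameter `numerator` with no default value; the function names are likewise illustrative, and rounding to the nearest integer is an interpretation of the $\lfloor \cdot \rceil$ notation.

```python
import math


def density(num_vertices, num_edges):
    """Density p(G) = 2|E| / (|V| (|V| - 1)) of a simple graph."""
    return 2.0 * num_edges / (num_vertices * (num_vertices - 1))


def iteration_budget(num_vertices, num_edges, numerator, base=100):
    """it_max = base + round(numerator / (sqrt(|V|) * sqrt(p(G)))).

    `numerator` is a placeholder for the constant used in the paper,
    which is an assumption of this sketch and must be supplied by the caller.
    """
    p = density(num_vertices, num_edges)
    variable_part = numerator / (math.sqrt(num_vertices) * math.sqrt(p))
    # Round half up to the nearest integer.
    return base + math.floor(variable_part + 0.5)
```

Whatever the numerator, the variable part shrinks as either $|V|$ or $p(G)$ grows, so large dense graphs get a budget close to the constant 100.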
This choice was made as an attempt to allow a reasonable number of iterations in order to avoid poor performance of the algorithm. The experiments to tune the parameter values are reported in the following. We randomly selected a small subset containing approximately 5.0% of the instances with varying characteristics for parameter tuning. The following values were tested for each parameter:
(a) $\alpha \in \{…\}$;
(b) $\beta \in \{…\}$.

The best obtained parameter values for MULTISTART-B-COL were $\alpha$ = 0.…00 and $\beta$ = 0.….

Tables 1-3 report the results for MSBCOL, MSBCOL$^+$, MSBCOL$^*$ and IP on the new set of generated small instances composed of bipartite, geometric and random graphs. The first column identifies the instance group. Columns 2 to 4 report the number of vertices ($|V|$), the average number of edges ($|E|$) and the average solution upper bound for the instance group ($m(G)$). Columns 5 to 8 give, for MSBCOL, the best encountered solution values ($z_M$), the average solution values for the executed number of iterations ($z_{avg}$), the average running times in seconds (time(s)), and the percentage gap between the solution found by MSBCOL and the best obtained solution ($best$), calculated as $\frac{100 \times (best - z_M)}{best}$ (%best). Columns 9 and 10 give, for MSBCOL$^+$, the encountered solution values ($z_{M^+}$) and the average running times in seconds (time(s)) for the MIP-based local search procedure. Columns 11 to 16 give, for the exact approaches MSBCOL$^*$ and IP, the encountered solution values ($z_{M^*}$ and $z_{IP}$, respectively), the average running times to solve the instances to optimality (time(s)), and the average open gaps (in %) of the unsolved instances (gap), calculated as $\frac{100 \times (ub - lb)}{ub}$, where $lb$ represents the best known integer solution and $ub$ the best upper bound achieved at the end of the execution. The last two lines report the number of best known solutions found by each of the proposed approaches (best), and, for MSBCOL$^*$ and IP, the number of instances solved to optimality (opt).

The value 'n/a' in a cell indicates that, for at least one instance in the group, either the solver exceeded the time limit before obtaining a feasible solution or the execution was halted by the operating system due to memory limitations. The value 't.l.' for column time(s) means that none of the instances in the group were solved to optimality within the time limit of 3,600 seconds using the corresponding integer program. The value '-' for column gap means that all five instances in the group were solved to optimality. The best encountered solution values are shown in bold.

Table 1: Results for MSBCOL conducted on small bipartite graphs.
Instance group | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | MSBCOL$^+$: $z_{M^+}$ | time(s) | MSBCOL$^*$: $z_{M^*}$ | time(s) | gap(%) | IP: $z_{IP}$ | time(s) | gap(%)
bip_50_0.2 | 50 | 128.8 | 8.2 | 7.6 | 6.4 | …
[remaining entries of Table 1 are not recoverable]

Table 2: Results for MSBCOL conducted on small geometric graphs.
Instance group | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | MSBCOL$^+$: $z_{M^+}$ | time(s) | MSBCOL$^*$: $z_{M^*}$ | time(s) | gap(%) | IP: $z_{IP}$ | time(s) | gap(%)
geo_50_0.2 | 50 | 125.8 | 8.8 | 8.4 | 7.9 | …
[remaining entries of Table 2 are not recoverable]

Table 3: Experiments conducted on small random graphs.
Instance group | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | MSBCOL$^+$: $z_{M^+}$ | time(s) | MSBCOL$^*$: $z_{M^*}$ | time(s) | gap(%) | IP: $z_{IP}$ | time(s) | gap(%)
rand_50_0.2 | 50 | 248.8 | 12.8 | 9.6 | 8.1 | …
rand_50_0.6 | 50 | 730.8 | 29.4 | 17.8 | 15.7 | 0.1 | 16.8 | 20.4 | 0.8 | …
[remaining entries of Table 3 are not recoverable]

The results show that MSBCOL performed very efficiently, as the reported running times for all instances in the three classes of graphs were below 0.4 seconds, which is practically negligible. Besides, the solutions encountered by MSBCOL are all within 23.0% of the best encountered solution. Results are particularly noteworthy for geometric graphs, as 12 out of 16 (75.0%) reported solutions were within 5.0% of the best known. Regarding MSBCOL$^+$, the results show that the MIP-based fix-and-optimize local search improved the initial solutions provided by MSBCOL for all 48 instance groups.
MSBCOL$^+$ encountered the best known solutions for 6 out of 18 (33.3%) instance groups for both geometric and bipartite graphs. Furthermore, MSBCOL$^+$ presented notable results for random graphs, as it reported 8 out of 16 (50.0%) best known solutions. MSBCOL$^+$ was also very fast, as 30 out of 48 instance groups (62.5%) were solved in less than one second. The most impressive performance can be seen for geometric graphs, as the method was able to solve 13 out of 18 instance groups (72.2%) in less than 0.1 seconds.

Similarly to MSBCOL$^+$, MSBCOL$^*$ also improved the initial solutions provided by MSBCOL for all 48 instance groups. For bipartite graphs, MSBCOL$^*$ found the best known solution for 12 out of 16 instance groups (75.0%). Additionally, for bipartite graphs, 47 out of 80 instances (58.8%) were optimally solved by the approach. MSBCOL$^*$ presented remarkable results for geometric graphs, reporting the best known and optimal solutions for all 16 instance groups. Regarding random instances, MSBCOL$^*$ performed well, encountering best known solution values for 8 out of 16 (50.0%) instance groups. Furthermore, 26 out of 80 (31.5%) small random instances were solved to optimality. Finally, within the given time limit, 28 out of 48 (58.3%) instance groups were completely solved to optimality using MSBCOL$^*$.

Lastly, IP presented similar results when compared with MSBCOL$^*$, as it reached optimality in 27 out of 48 (56.2%) instance groups. However, MSBCOL$^*$ uses less computational time, which is especially evident in groups bip_70_0.…, bip_80_0.…, geo_80_0.2, and geo_80_0.4. The results show that for bipartite graphs IP obtained 11 out of 16 (68.7%) best known solutions. As for geometric graphs, identically to MSBCOL$^*$, IP achieved the best known and optimal solutions for all 16 instance groups. IP also performed well on random instances, considering that it encountered best known solutions for 9 out of 16 (56.2%) instance groups, and solved to optimality 23 out of 80 instances (28.8%).

Tables 4-5 report the results for MSBCOL, MSBCOL$^+$, MSBCOL$^*$ and IP on the set of small graphs from the Second DIMACS Implementation Challenge. The first column identifies the DIMACS instance. Columns 2 to 4 report the number of vertices of each graph ($|V|$), the number of edges ($|E|$) and the solution upper bound ($m(G)$). Columns 5 to 8 give, for MSBCOL and MSBCOL$^+$, the encountered solution values ($z_M$ and $z_{M^+}$, respectively) and the running time in seconds (time(s)). Columns 9 to 14 give, for MSBCOL$^*$ and IP, the encountered solution values ($z_{M^*}$ and $z_{IP}$, respectively), the running time to solve the instance (time(s)), and the open gap (in %) in case of unsolved instances (gap), defined as before. The last two lines report the number of best known solutions found by each of the proposed approaches (best), and, for MSBCOL$^*$ and IP, the number of instances solved to optimality (opt).

The value 'n/a' in a cell indicates that either the solver exceeded the time limit before obtaining a feasible solution or the execution was halted by the operating system due to memory limitations. The value 't.l.' for column time(s) indicates that the instance was not solved to optimality within the time limit of 3,600 seconds. The value '-' for column gap means that the instance was solved to optimality. The best encountered solution values are shown in bold.

Table 4: Experiments conducted on small DIMACS graphs for coloring problems.
Instance name | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | MSBCOL$^+$: $z_{M^+}$ | time(s) | MSBCOL$^*$: $z_{M^*}$ | time(s) | gap(%) | IP: $z_{IP}$ | time(s) | gap(%)
dsjc125.1.col | 125 | 736 | 17 | 13 | 11.0 | 0.2 | 23.5 | 15 | 11.0 | …
le450_15a.col | 450 | 8,168 | 57 | …
le450_25b.col | 450 | 8,263 | 60 | …
le450_5a.col | 450 | 5,714 | 34 | …
[remaining entries of Table 4 are not recoverable]

Table 5: Experiments conducted on small DIMACS graphs for the maximum clique problem.
Instance name | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | MSBCOL$^+$: $z_{M^+}$ | time(s) | MSBCOL$^*$: $z_{M^*}$ | time(s) | gap(%) | IP: $z_{IP}$ | time(s) | gap(%)
brock200_2.clq | 200 | 9,876 | 100 | 43 | 38.9 | 1.6 | 10.4 | 47 | …
[remaining entries of Table 5 are not recoverable]

The multi-start approach, MSBCOL, achieved notable results, considering that for graph coloring instances (Table 4) the algorithm reported 10 out of 27 (37.0%) best known results, and for maximum clique instances (Table 5) it returned 7 out of 14 (50.0%) best known solution values. MSBCOL achieved the majority of solutions within 20.0% of the best known, with a few exceptions being instances dsjc125.1.col, mulsol.i.2.col and johnson16-2-4.clq. Moreover, several of the reported solution values are within 5.0% of the best known, as can be seen in dsjc250.1.col, fpsol2.i.2.col, fpsol2.i.3.col, mulsol.i.3.col, mulsol.i.5.col, zeroin.i.1.col, and hamming6-2.clq. Lastly, we also mention that these results were generated very efficiently, as the reported times were all under 10.0 seconds for graph coloring instances and 2.0 seconds for maximum clique instances.

The results show that MSBCOL$^+$ improved the initial solutions provided by MSBCOL for 7 out of 27 (25.9%) graph coloring instances and for 5 out of 14 (35.7%) maximum clique instances.
The most outstanding improvements can be seen in dsjc125.5.col, mulsol.i.2.col, brock200_2.clq and keller4.clq. MSBCOL$^+$ encountered best known solutions for 13 out of 27 (48.2%) graph coloring instances and 11 out of 14 (78.6%) maximum clique instances.

One can see from the tables that MSBCOL$^*$ outperformed MSBCOL$^+$, especially for graph coloring instances, where the approach enhanced the initial solutions provided by MSBCOL in 15 out of 27 (55.6%) cases. As for maximum clique instances, 6 out of 14 (42.9%) initial solutions were improved. Note that MSBCOL$^*$ found the best known solution values for a majority of instances, which strongly supports its effectiveness. For graph coloring instances, 24 out of 27 (88.9%) best known solutions were reported, and for maximum clique instances it returned 12 out of 14 (85.7%) best known results. Additionally, MSBCOL$^*$ optimally solved 16 out of 27 (59.3%) graph coloring instances, and 8 out of 14 (57.1%) maximum clique instances.

The results also show that IP reported noticeably inferior results when compared to MSBCOL$^*$, as the approach returned best known solutions for 14 out of 27 (51.9%) graph coloring instances and 8 out of 14 (57.1%) maximum clique instances. Besides, IP did not obtain integer feasible solutions for three graph coloring instances (fpsol2.i.3.col, le450_25a.col, le450_25b.col), which reinforces the importance of the initial solutions provided by MSBCOL. Lastly, IP optimally solved 12 out of 27 (44.4%) graph coloring instances, and 4 out of 14 (28.6%) maximum clique instances. It is noteworthy that MSBCOL$^*$ solved instances to optimality considerably faster than IP, as can be seen in instances mulsol.i.2.col and zeroin.i.2.col, which were solved by MSBCOL$^*$ in around 1.0 second, whereas IP took over 2000.0 seconds to solve them.

Overall, the results show that MSBCOL can generate solutions very quickly, which is advantageous in cases where one values running time over optimality.
Additionally, both MSBCOL$^+$ and MSBCOL$^*$ managed to improve the MSBCOL results. Even though MSBCOL$^+$ was outperformed by MSBCOL$^*$ and IP in terms of solution values, it presents lower computational times, and the idea could be heuristically adapted and used in a combinatorial local search strategy to achieve even better solutions. MSBCOL$^*$ and IP obtained best known solutions for the majority of instances, but it is worth mentioning that they are more viable options when larger computational times are available. One can observe that the optimal solutions found by MSBCOL$^*$ and IP show that, for the tested small instances, the b-chromatic number is equal to or very close to the upper bound $m(G)$. Comparing the performances of MSBCOL$^*$ and IP, MSBCOL$^*$ has much lower computational times in general and optimally solved more instances than IP, which suggests the usefulness of the initial solutions provided by MSBCOL.

Tables 6 and 7 report the results for MSBCOL, MSBCOL$^+$, MSBCOL$^*$, and IP on the new set of more challenging instances composed of large bipartite and random graphs. The structure of these tables is similar to that of Tables 1-3. Note that large geometric instances were not tested, as it was observed for the small instances in Section 5.3 that they are much easier to solve than the bipartite and random ones.

Table 6: Results for MSBCOL conducted on large bipartite graphs.
Instance group | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | …
bip_500_0.2 | 500 | 12,473.0 | 58.4 | 28.6 | 25.8 | 3.1 | 21.4 | …
bip_500_0.6 | 500 | 37,500.4 | 155.0 | …
bip_500_0.8 | 500 | 49,961.0 | 202.8 | …
bip_600_0.2 | 600 | 17,989.6 | 69.4 | …
bip_600_0.4 | 600 | 35,968.8 | 127.8 | …
bip_600_0.6 | 600 | 54,006.0 | 185.4 | …
bip_600_0.8 | 600 | 71,938.4 | 243.0 | …
bip_700_0.2 | 700 | 24,559.2 | 80.6 | …
bip_700_0.4 | 700 | 49,056.2 | 149.2 | …
bip_700_0.6 | 700 | 73,554.2 | 217.0 | …
bip_700_0.8 | 700 | 97,935.4 | 283.8 | …
bip_800_0.2 | 800 | 32,040.8 | 90.8 | …
bip_800_0.4 | 800 | 63,974.2 | 169.2 | …
bip_800_0.6 | 800 | 95,985.8 | 246.6 | …
bip_800_0.8 | 800 | 127,880.0 | 324.0 | …
[entries marked … and the row for bip_500_0.4 are not recoverable; the legible MSBCOL$^+$, MSBCOL$^*$ and IP cells contain only 't.l.' or 'n/a']

Table 7: Experiments conducted on large random graphs.
Instance group | $|V|$ | $|E|$ | $m(G)$ | MSBCOL: $z_M$ | $z_{avg}$ | time(s) | %best | …
rand_500_0.2 | 500 | 24,964.4 | 107.6 | 42.8 | 39.4 | 5.8 | 15.7 | …
rand_500_0.8 | 500 | 99,755.2 | 393.2 | 155.4 | 145.8 | 19.0 | 0.6 | …
rand_600_0.4 | 600 | 71,805.4 | 243.2 | …
rand_600_0.6 | 600 | 107,681.0 | 357.2 | 122.8 | 116.1 | 26.0 | 3.9 | …
rand_700_0.2 | 700 | 48,992.8 | 149.0 | …
rand_700_0.4 | 700 | 97,795.2 | 283.4 | …
rand_700_0.6 | 700 | 146,642.0 | 417.2 | …
rand_700_0.8 | 700 | 195,652.0 | 551.6 | …
rand_800_0.2 | 800 | 63,928.6 | 169.4 | …
rand_800_0.4 | 800 | 127,726.0 | 323.6 | …
rand_800_0.6 | 800 | 191,557.0 | 476.8 | …
rand_800_0.8 | 800 | 255,578.0 | 630.6 | …
[entries marked … and the rows for rand_500_0.4, rand_500_0.6, rand_600_0.2 and rand_600_0.8 are not recoverable; the legible MSBCOL$^+$, MSBCOL$^*$ and IP cells contain only 't.l.' or 'n/a']

The results show that even though the instances are considerably large, MSBCOL can still generate solutions in low computation times. More specifically, its running time was within 30 seconds for the bipartite graphs and within 64 seconds for the random graphs.
On the other hand, the results also show that, differently from what happened for the small bipartite and random instances, MSBCOL$^+$ and MSBCOL$^*$ were not able to consistently improve the quality of the solutions obtained by MSBCOL. Besides, for all the executions of MSBCOL$^+$, MSBCOL$^*$, and IP, either the time limit was reached or the execution was halted due to memory limitations. In that sense, the empirical results show that these two new sets of instances appear to be very challenging for the b-coloring problem.

Furthermore, note that IP was only able to finish its execution without being halted for bipartite instances with 500 vertices and random instances with less than 600 vertices. It is noteworthy that, for most of the cases in which IP reached the time limit, the obtained solutions improved on those achieved by MSBCOL, the only exception being rand_500_0.8. This observation leads to the conclusion that, even though MSBCOL can generate solutions of reasonable quality for these large challenging instances in low computational times, there is still room for improvement.

Tables 8-9 report the results for MSBCOL, MSBCOL$^+$, MSBCOL$^*$ and IP on the set of large graphs from the Second DIMACS Implementation Challenge. The structure of these tables is identical to that of Tables 4-5.

The results show that MSBCOL presented very good results when compared to the other approaches, as the multi-start approach achieved 15 out of 32 (46.9%) best known solutions for graph coloring instances (Table 8). Regarding maximum clique instances (Table 9), MSBCOL obtained 34 out of 64 (53.1%) best known values. We highlight instances in which the reported gaps of the solutions were within 5.0% of the best known: dsjc500.5.col, flat300_28_0.col, inithx.i.1.col, inithx.i.2.col, inithx.i.3.col, le450_15d.col, r1000.1.col, brock400_1.clq, brock400_2.clq, brock400_3.clq, brock400_4.clq, hamming10-2.clq, hamming10-4.clq, hamming8-2.clq, san200_0.9_2.clq, san200_0.9_3.clq, and san400_0.9_1.clq.
For the remaining instances, the solutions found by MSBCOL were all within 28.0% of the best reported values. The reported times for graph coloring instances were all below 3.0 minutes, which is an impressive performance considering that the largest instance in this set (r1000.1c.col) has 1,000 vertices and 485,090 edges. Additionally, MSBCOL also executed in less than 3.0 minutes for most maximum clique instances, with the few exceptions being graphs whose number of vertices is at least 1,500 or whose number of edges is over 800,000 (C2000.5.clq, C2000.9.clq, C4000.5.clq, keller6.clq, MANN_a81.clq, p_hat1500-1.clq, p_hat1500-2.clq and p_hat1500-3.clq).

The results show that MSBCOL+ reasonably improved solutions from MSBCOL, as the method enhanced 10 out of 32 (31.3%) solutions for graph coloring instances, and with respect to maximum clique instances, MSBCOL+ improved 26 out of 64 (37.5%) solution values. The most remarkable improvements obtained by MSBCOL+ can be seen in dsjc500.9.col, dsjr500.5.col, r1000.1c.col, C500.9.clq, gen400_p0.9_55.clq, gen400_p0.9_65.clq and gen400_p0.9_75.clq. For the previously mentioned instances, MSBCOL+ returned a solution with at least 30 more colors than the initial one provided by MSBCOL, which is a strong indication of the advantage of applying such a method. Moreover, MSBCOL+ encountered best known solutions for 16 out of 32 (50.0%) graph coloring instances and 37 out of 64 (57.8%) maximum clique instances.

In contrast with the behaviour observed for small instances, when applied to large instances MSBCOL+ presented slightly better results than MSBCOL*, as the latter was successful in improving solutions for 8 out of 32 (25.0%) graph coloring instances and 18 out of 64 (28.1%) maximum clique instances. In terms of solution values, MSBCOL* returned best known results for 15 out of 32 (46.9%) graph coloring instances and 33 out of 64 (51.6%) maximum clique instances.
Results also show that MSBCOL* displayed difficulty in solving larger instances to optimality, as the method solved only 5 instances in each of the graph coloring (15.6%) and maximum clique (7.8%) sets. These results indicate the difficulty of the MIP solver in solving a problem when the number of variables increases substantially, which explains the better results obtained when the fix-and-optimize approach MSBCOL+ was employed.

Table 8: Experiments conducted on large DIMACS graphs for coloring problems.
Instance name; |V|; |E|; m(G); MSBCOL (z_M, z_avg, time(s), % best); MSBCOL+ (z_M+, time(s)); MSBCOL* (z_M*, time(s), gap(%)); IP (z_IP, time(s), gap(%))
dsjc1000.1.col 1,000 49,629 112 n/a n/a n/a n/a n/a n/a n/a n/a
dsjc1000.5.col 1,000 249,826 501 147 74.7 0.0 n/a n/a n/a n/a n/a n/a n/a n/a
dsjc1000.9.col 1,000 449,449 888 336 96.2 0.0 t.l. t.l. n/a n/a n/a
dsjc250.5.col 250 15,668 126 51 46.9 2.8 10.5 51 t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. n/a n/a n/a n/a n/a n/a n/a n/a
flat1000_60_0.col 1,000 245,830 493 n/a n/a n/a n/a n/a n/a n/a n/a
flat1000_76_0.col 1,000 246,708 494 n/a n/a n/a n/a n/a n/a n/a n/a
flat300_20_0.col 300 21,375 144 t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. n/a n/a n/a
inithx.i.2.col 645 13,979 52 49 49.0 29.1 2.0 n/a n/a n/a
inithx.i.3.col 621 13,969 52 49 49.0 27.9 2.0 t.l. t.l. t.l. n/a n/a n/a
le450_15c.col 450 16,680 93 t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. n/a n/a n/a
r1000.1c.col 1,000 485,090 957 156 146.0 154.6 27.4 t.l. t.l. t.l. t.l. n/a n/a n/a
r1000.5.col 1,000 238,267 472 n/a n/a n/a n/a n/a n/a n/a n/a
r250.1c.col 250 30,227 238 75 71.3 3.1 12.8 81 t.l. t.l. t.l. t.l. t.l. n/a n/a n/a
school1_nsh.col 352 14,612 101 t.l. t.l. t.l.

Table 9: Experiments conducted on large DIMACS graphs for the maximum clique problem.
Instance name; |V|; |E|; m(G); MSBCOL (z_M, z_avg, time(s), % best); MSBCOL+ (z_M+, time(s)); MSBCOL* (z_M*, time(s), gap(%)); IP (z_IP, time(s), gap(%))
brock200_1.clq 200 14,834 146 64 60.2 2.1 12.3 t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. n/a n/a n/a
brock800_2.clq 800 208,166 517 t.l. t.l. n/a n/a n/a
brock800_3.clq 800 207,333 515 t.l. t.l. n/a n/a n/a
brock800_4.clq 800 207,643 515 t.l. t.l. n/a n/a n/a
C1000.9.clq 1,000 450,079 889 t.l. t.l. n/a n/a n/a
C2000.5.clq 2,000 999,836 1,000 n/a n/a n/a n/a n/a n/a n/a n/a
C2000.9.clq 2,000 1,799,532 1,784 n/a n/a n/a n/a n/a n/a n/a n/a
C250.9.clq 250 27,984 220 111 102.3 3.0 12.6 124 2.0 t.l. t.l. n/a n/a n/a n/a n/a n/a n/a n/a
C500.9.clq 500 112,332 442 195 182.9 17.5 22.0 t.l. t.l. t.l. < t.l. < t.l. < t.l. t.l. < t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. < t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. n/a n/a n/a n/a n/a n/a n/a n/a
MANN_a27.clq 378 70,551 365 < < n/a n/a n/a n/a n/a n/a n/a n/a
p_hat1000-2.clq 1,000 244,799 496 n/a n/a n/a n/a n/a n/a n/a n/a
p_hat1000-3.clq 1,000 371,746 694 t.l. t.l. n/a n/a n/a
p_hat1500-1.clq 1,500 284,923 457 n/a n/a n/a n/a n/a n/a n/a n/a
p_hat1500-2.clq 1,500 568,960 760 n/a n/a n/a n/a n/a n/a n/a n/a
p_hat1500-3.clq 1,500 847,244 1,051 n/a n/a n/a n/a n/a n/a n/a n/a
p_hat300-1.clq 300 10,933 91 t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. n/a n/a n/a
p_hat700-2.clq 700 121,728 353 t.l. t.l. n/a n/a n/a
p_hat700-3.clq 700 183,010 487 t.l. t.l. t.l. n/a n/a n/a n/a n/a n/a n/a n/a
san200_0.7_1.clq 200 13,930 138 61 53.7 2.1 25.6 t.l. t.l. t.l. t.l. t.l. t.l.
san200_0.9_1.clq 200 17,910 173 93 87.8 1.9 11.4 96 < t.l. t.l. < t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. t.l. < t.l. t.l. t.l. t.l. n/a n/a n/a
sanr400_0.7.clq 400 55,869 276 t.l. t.l. n/a n/a n/a

The results for IP on larger instances show once more that the initial solutions provided by MSBCOL are very relevant, considering that both approaches that use those solutions as a warm start, i.e., MSBCOL+ and MSBCOL*, reported higher numbers of best known solution values. IP returned 9 out of 32 (28.1%) best known results for graph coloring instances, and 21 out of 64 (32.8%) for maximum clique instances.
Besides, the number of optimal solutions reported by IP is 1 for graph coloring instances (3.1%), and 3 for maximum clique instances (4.7%).

Generally speaking, the results in this section have reinforced the effectiveness of MSBCOL, as the approach obtained several best known values and reported reasonable running times even for very large graphs. Besides, MSBCOL was the only method that reported solutions for numerous instances, such as dsjc1000.1.col. Nevertheless, solutions generated by MSBCOL still have room for improvement, as the results of MSBCOL+ and MSBCOL* have shown. We also point out that a few optimal solutions reported by MSBCOL* and IP differ considerably from the upper bound m(G) (see MANN_a27.clq, MANN_a45.clq, and MANN_a81.clq); however, an in-depth analysis is out of the scope of this work and can be an interesting direction for future theoretical works regarding lower bounds on the b-chromatic number.

For the large set of instances, the heuristic methods (MSBCOL and MSBCOL+) outperformed the exact ones (MSBCOL* and IP), which supports the idea of using MSBCOL to provide initial solutions to more advanced metaheuristics, for example by heuristically adapting MSBCOL+ into a combinatorial local search strategy.

In this subsection, we compare the solutions obtained by our newly proposed approach MSBCOL with the best ones reported in the literature using a state-of-the-art approach, namely the hybrid evolutionary algorithm of Fister et al. (2015) (denoted henceforth as HEA).
We choose to analyze in this subsection only MSBCOL rather than the complete matheuristic approach in order to establish this metaheuristic as a robust and effective method, considering that both MSBCOL and HEA can be classified as pure metaheuristics. Fister et al. (2015) tested their algorithm on a set of small instances composed of d-regular graphs with up to 12 vertices and on nine large graphs from the Second DIMACS Implementation Challenge (which is a subset of the large benchmark set described in Subsection 5.1). As the small instance set used by the authors only includes extremely small graphs and is not publicly available, we do not report results for that set.

Table 10 compares the results obtained by MSBCOL and HEA for all nine large graphs tested in Fister et al. (2015). We remark that the results for these instances using all the approaches proposed in our paper were already presented in Subsection 5.4, as these nine instances represent a subset of the benchmark set described in Subsection 5.1. The first column identifies the instance. Columns 2 to 4 report the number of vertices ($|V|$), the number of edges ($|E|$), and the solution upper bound ($m(G)$). Columns 5 to 8 give, for MSBCOL, the best solution value ($z_M$), the average solution value for the executed number of iterations ($z_{avg}$), the running time in seconds (time(s)), and the percentage improvement over HEA ($imp_M$), calculated as $100 \times \frac{z_M - z_H}{z_H}$, with $z_H$ being the best solution encountered by HEA. Columns 9 and 10 give, for HEA, the solution value ($z_H$) and the running time in seconds (time(s)). The best solution values are shown in bold. We point out that the authors did not report the machine used in the experiments for HEA; therefore, the main goal in this subsection is to compare the quality of the solutions obtained using the two approaches.

Table 10: Results comparing MSBCOL and HEA for a subset of the instances containing nine large graphs.
Instance name; |V|; |E|; m(G); MSBCOL (z_M, z_avg, time(s), imp_M(%)); HEA (Fister et al., 2015) (z_H, time(s))
dsjc250.5.col 250 15,668 126

The reported values show that MSBCOL clearly outperformed HEA for all nine large instances considered in Fister et al. (2015). MSBCOL was able to obtain strictly better solutions for all of them, representing a 100.00% success rate in improving over the previously best known solutions presented by HEA. The most notable performance can be seen in instances dsjr500.1.col, dsjr500.5.col and flat1000_50_0.col, for which MSBCOL achieved improvements over 10.00% when compared to HEA. Results also show that even some of the average solution values were able to improve over the previously best known solution values presented by HEA, since 4 out of 9 (44.4%) outperform HEA's results (see dsjc500.1.col, dsjr500.5.col, flat1000_50_0.col, and r250.5.col).

It is noteworthy that MSBCOL was very effective when it comes to running times, as they were below a minute for all but one instance. The maximum running time was less than 70.0 seconds for the largest instance (flat1000_50_0.col), which is composed of 1,000 vertices and 245,000 edges.
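As an illustration of how the improvement column of Table 10 is computed, the following minimal Python sketch evaluates the percentage improvement 100 × (z_M − z_H) / z_H and counts how many instances are strictly improved. The function names and the instance values are hypothetical and chosen only for illustration; they are not the values reported in Table 10.

```python
from typing import Dict, Tuple


def improvement_pct(z_m: float, z_h: float) -> float:
    """Percentage improvement of z_m over z_h: 100 * (z_m - z_h) / z_h."""
    if z_h <= 0:
        raise ValueError("z_h must be positive")
    return 100.0 * (z_m - z_h) / z_h


def summarize(results: Dict[str, Tuple[float, float]]) -> Tuple[int, Dict[str, float]]:
    """Return the number of strictly improved instances and the per-instance improvement.

    `results` maps an instance name to the pair (z_M, z_H).
    """
    imps = {name: improvement_pct(z_m, z_h) for name, (z_m, z_h) in results.items()}
    strictly_better = sum(1 for value in imps.values() if value > 0)
    return strictly_better, imps


# Hypothetical (z_M, z_H) pairs, NOT taken from the paper.
toy_results = {
    "instance_a": (55, 50),
    "instance_b": (33, 30),
    "instance_c": (40, 40),
}

wins, imps = summarize(toy_results)
for name, value in imps.items():
    print(f"{name}: {value:.2f}%")
print(f"strictly improved: {wins} of {len(toy_results)}")
```

Since the b-coloring problem is a maximization problem, a positive value means that the first approach found a coloring with more colors than the second one.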

In this paper, we considered the b-coloring problem and proposed: the first integer programming formulation for the optimization variant of the problem, which consists in maximizing the number of colors used in a proper b-coloring; a multi-start multi-greedy randomized metaheuristic, which differs from previous (meta)heuristics by taking into account the structure of the problem in its mechanism; and a very effective matheuristic approach combining the multi-start multi-greedy randomized metaheuristic with a fix-and-optimize local search procedure using the proposed integer programming formulation. Moreover, we also proposed a benchmark set of instances to be used in future works.

Computational experiments were performed on a newly proposed benchmark set to analyze the performance of the presented techniques. The multi-greedy randomized heuristic has been shown to be very effective while having very few parameters to be configured. The integer programming formulation was able to provide satisfactory results, but it is considerably compromised as the instance size grows, considering that the number of variables tends to become intractable, leading to memory overflow. The fix-and-optimize local search procedure used in the matheuristic approach improved a significant number of solutions and reported the majority of best results, proving to be a very effective and promising method for the b-coloring and other related problems. The results have also shown that the proposed multi-start metaheuristic outperforms a state-of-the-art evolutionary algorithm for a subset of the instances, namely, all nine large instances which were considered in Fister et al. (2015).
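To make the objective recalled above concrete, the following minimal Python sketch checks whether a given coloring is a b-coloring in the sense of the abstract (proper, and every used color has a b-vertex) and computes an upper bound on the number of colors. It is only an illustration and not the authors' implementation: the function names are hypothetical, and the bound is assumed to be the standard one from Irving & Manlove (1999), namely the largest m such that the graph has at least m vertices of degree at least m − 1.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # adjacency sets of a simple undirected graph


def is_proper(adj: Graph, color: Dict[Hashable, int]) -> bool:
    """A coloring is proper if adjacent vertices receive different colors."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


def is_b_coloring(adj: Graph, color: Dict[Hashable, int]) -> bool:
    """Check properness and that every used color has at least one b-vertex."""
    if not is_proper(adj, color):
        return False
    used = set(color.values())
    colors_with_b_vertex = set()
    for v in adj:
        seen = {color[u] for u in adj[v]}
        # v is a b-vertex if its neighbors cover all used colors but its own
        if used - {color[v]} <= seen:
            colors_with_b_vertex.add(color[v])
    return colors_with_b_vertex == used


def m_upper_bound(adj: Graph) -> int:
    """Largest m such that at least m vertices have degree >= m - 1 (assumed definition)."""
    degrees = sorted((len(neighbors) for neighbors in adj.values()), reverse=True)
    m = 0
    for i, degree in enumerate(degrees, start=1):
        if degree >= i - 1:
            m = i
        else:
            break
    return m


def path_graph(n: int) -> Graph:
    """Path on vertices 0..n-1, used here only as a toy instance."""
    adj: Graph = {i: set() for i in range(n)}
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    return adj


p5 = path_graph(5)
coloring = {0: 2, 1: 0, 2: 1, 3: 2, 4: 0}
print(m_upper_bound(p5), is_b_coloring(p5, coloring))
```

On the five-vertex path, vertices 1, 2 and 3 are b-vertices for colors 0, 1 and 2, respectively, so the coloring uses as many colors as the bound allows on this toy instance.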
Last but not least, the proposed benchmark set features a variety of instances including small and large graphs with different characteristics, which can be used in future computational experiments to verify the performance of both exact and heuristic approaches for the b-coloring problem.

Relevant research directions include the development of combinatorial local search approaches to overcome the memory limitations of the large formulations used. Such combinatorial local search approaches could be [...] b-coloring problem.

The authors did not report the computational resources used for the experiments.

Acknowledgments:

Work of Rafael A. Melo was supported by Universidade Federal da Bahia; the Brazilian Ministry of Science, Technology, Innovation and Communication (MCTIC); the State of Bahia Research Foundation (FAPESB); and the Brazilian National Council for Scientific and Technological Development (CNPq). Work of Michell F. Queiroz was partially supported by a CAPES scholarship. Work of Marcio C. Santos was supported by Universidade Federal do Ceará. The authors would like to thank the editor and the anonymous reviewers for the valuable comments which helped to improve the quality of this paper.

References

Alkhateeb, M., & Kohl, A. (2011). Upper bounds on the b-chromatic number and results for restricted graph classes. Discussiones Mathematicae Graph Theory, (4), 709–735.
Avanthay, C., Hertz, A., & Zufferey, N. (2003). A variable neighborhood search for graph coloring. European Journal of Operational Research, (2), 379–388.
Balakrishnan, R., & Raj, S. F. (2013). Bounds for the b-chromatic number of G − v. Discrete Applied Mathematics, (9), 1173–1179.
Barth, D., Cohen, J., & Faik, T. (2007). On the b-continuity property of graphs. Discrete Applied Mathematics, (13), 1761–1768.
Blöchliger, I., & Zufferey, N. (2008). A graph coloring heuristic using partial solutions and a reactive tabu scheme. Computers & Operations Research, (3), 960–975.
Cabello, S., & Jakovac, M. (2011). On the b-chromatic number of regular graphs. Discrete Applied Mathematics, (13), 1303–1310.
Campos, V., Lima, C., & Silva, A. (2013). b-coloring graphs with girth at least 8. In J. Nešetřil & M. Pellegrini (Eds.), The Seventh European Conference on Combinatorics, Graph Theory and Applications (pp. 327–332). Pisa: Scuola Normale Superiore.
Campos, V. A., Lima, C. V., Martins, N. A., Sampaio, L., Santos, M. C., & Silva, A. (2015). The b-chromatic index of graphs. Discrete Mathematics, (11), 2072–2079.
Campêlo, M., Corrêa, R., & Frota, Y. (2004). Cliques, holes and the vertex coloring polytope. Information Processing Letters, (4), 159–164.
Corteel, S., Valencia-Pabon, M., & Vera, J.-C. (2005). On approximating the b-chromatic number. Discrete Applied Mathematics, (1), 106–110.
Cunha, J. O., Kramer, H. H., & Melo, R. A. (2019). Effective matheuristics for the multi-item capacitated lot-sizing problem with remanufacturing. Computers & Operations Research, 149–158.
de Werra, D. (1990). Heuristics for graph coloring. In Computational graph theory (pp. 191–208). Springer.
Doi, T., Nishi, T., & Voß, S. (2018). Two-level decomposition-based matheuristic for airline crew rostering problems with fair working time. European Journal of Operational Research, (2), 428–438.
Dumitrescu, I., & Stützle, T. (2009). Usage of exact algorithms to enhance stochastic local search algorithms. In Matheuristics (pp. 103–134). Springer.
Elghazel, H., Deslandres, V., Hacid, M.-S., Dussauchoy, A., & Kheddouci, H. (2006). A new clustering approach for symbolic data and its validation: Application to the healthcare data. In International Symposium on Methodologies for Intelligent Systems (pp. 473–482).
Fister, I., Peterin, I., Mernik, M., & Črepinšek, M. (2015). Hybrid evolutionary algorithm for the b-chromatic number. Journal of Heuristics, (4), 501–521.
Gaceb, D., Eglin, V., Lebourgeois, F., & Emptoz, H. (2008). Improvement of postal mail sorting system. International Journal of Document Analysis and Recognition, (2), 67–80.
Gaceb, D., Eglin, V., Lebourgeois, F., & Emptoz, H. (2009). Robust approach of address block localization in business mail by graph coloring. International Arab Journal of Information Technology, (3), 221–229.
Galčík, F., & Katrenič, J. (2013). A note on approximating the b-chromatic number. Discrete Applied Mathematics, (7-8), 1137–1140.
Havet, F., Sales, C. L., & Sampaio, L. (2012). b-coloring of tight graphs. Discrete Applied Mathematics, (18), 2709–2715.
Irving, R. W., & Manlove, D. F. (1999). The b-chromatic number of a graph. Discrete Applied Mathematics, (1-3), 127–141.
Jakovac, M., & Peterin, I. (2018). The b-chromatic number and related topics - A survey. Discrete Applied Mathematics, 184–201.
Johnson, D. S., & Trick, M. A. (1996). Cliques, coloring, and satisfiability: Second DIMACS Implementation Challenge, October 11-13, 1993 (Vol. 26). American Mathematical Society.
Koch, I., & Marenco, J. (2019). An integer programming approach to b-coloring. Discrete Optimization, 43–62.
Koch, I., & Peterin, I. (2015). The b-chromatic index of direct product of graphs. Discrete Applied Mathematics, 109–117.
Kouider, M., & Mahéo, M. (2002). Some bounds for the b-chromatic number of a graph. Discrete Mathematics, (1-2), 267–277.
Kratochvíl, J., Tuza, Z., & Voigt, M. (2002). On the b-chromatic number of graphs. In International Workshop on Graph-Theoretic Concepts in Computer Science (pp. 310–320). Springer.
Lü, Z., & Hao, J.-K. (2010). A memetic algorithm for graph coloring. European Journal of Operational Research, (1), 241–250.
Mabrouk, B. B., Hasni, H., & Mahjoub, Z. (2009). On a parallel genetic–tabu search based algorithm for solving the graph colouring problem. European Journal of Operational Research, (3), 1192–1201.
Melo, R. A., Queiroz, M. F., & Ribeiro, C. C. (2021). Compact formulations and an iterated local search-based matheuristic for the minimum weighted feedback vertex set problem. European Journal of Operational Research, (1), 75–92.
Melo, R. A., Queiroz, M. F., & Santos, M. C. (2020). Data for: A matheuristic approach for the b-coloring problem using integer programming and a multi-start multi-greedy randomized metaheuristic. (Online reference, last access on January 04, 2020, http://dx.doi.org/10.17632/54w6s6f6wr.1)
Moalic, L., & Gondran, A. (2018). Variations on memetic algorithms for graph coloring problems. Journal of Heuristics, (1), 1–24.
Morgenstern, C. (n.d.). Graph generator ggen. (Online reference, last access on May 16, 2019, http://iridia.ulb.ac.be/ fmascia/files/ggen.tar.bz2)
Nogueira, B., Pinheiro, R. G., & Subramanian, A. (2018). A hybrid iterated local search heuristic for the maximum weight independent set problem. Optimization Letters, (3), 567–583.
Perumal, S. S., Larsen, J., Lusby, R. M., Riis, M., & Sørensen, K. S. (2019). A matheuristic for the driver scheduling problem with staff cars. European Journal of Operational Research, (1), 280–294.
Raidl, G. R., & Puchinger, J. (2008). Combining (integer) linear programming techniques and metaheuristics for combinatorial optimization. In Hybrid Metaheuristics (pp. 31–62). Springer.
San Segundo, P., Coniglio, S., Furini, F., & Ljubić, I. (2019). A new branch-and-bound algorithm for the maximum edge-weighted clique problem. European Journal of Operational Research, (1), 76–90.
Trick, M., Chvatal, V., Cook, B., Johnson, D., McGeoch, C., & Tarjan, B. (2015). Benchmark instances from the Second DIMACS Implementation Challenge. (Online reference, last access on May 16, 2019, http://archive.dimacs.rutgers.edu/pub/challenge/graph/benchmarks/