Mixed Integer Programming for Searching Maximum Quasi-Bicliques

Abstract

This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its "almost" complete subgraph. The relaxation of completeness can be understood variously; here, we assume that the subgraph is a γ-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least γ ∈ (0,1]. For a bigraph and fixed γ, the problem of searching for the maximal quasi-biclique consists of finding a subset of vertices of the bigraph such that the induced subgraph is a quasi-biclique and its size is maximal for a given graph. Several models based on Mixed Integer Programming (MIP) to search for a quasi-biclique are proposed and tested for working efficiency. An alternative model inspired by biclustering is formulated and tested; this model simultaneously maximizes both the size of the quasi-biclique and its density, using the least-square criterion similar to the one exploited by triclustering TriBox.


Dmitry I. Ignatov, Polina Ivanova, and Albina Zamaletdinova

Dmitry I. Ignatov: National Research University Higher School of Economics, Moscow, and St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences, Russia (0000-0002-6584-8534)

Polina Ivanova: National Research University Higher School of Economics, Moscow, Russia (0000-0001-6010-7991)

Albina Zamaletdinova: National Research University Higher School of Economics, Moscow, Russia

arXiv preprint [cs.DS]

There are many data sources that can be represented as a bipartite graph; for example, in recommender systems and web stores, users can interact with different items like movies, books, clothes, and other products. The most commonly studied data usually has the structure of a bipartite graph whose vertices form two disjoint sets. Examples include social network data, where a binary relation between two sets shows interactions between people and communities, advertisement data with a set of consumers and a corresponding set of products, and so on.

In this study, we are interested in the analysis of such bipartite data and search for dense communities, where almost all elements are connected. A situation where all elements of a community are involved can be described by the concept of a biclique, or a complete subgraph of a bipartite graph.

Unfortunately, the community completeness requirement excludes almost complete communities frequently met in real-world data. For this reason, we allow some edges to be absent and introduce the concept of a quasi-biclique. In order to bound the size of a quasi-biclique, we can use the minimal density of the subgraph or the maximum number of absent edges needed to complete the subgraph.

The problem of searching for a maximal quasi-clique is NP-hard (Pattillo et al., 2013), as is the problem of searching for a maximal quasi-biclique (Liu et al., 2008b); the maximum edge biclique problem is known to be NP-complete (Peeters, 2003). Many algorithms that solve those problems are being developed (Wang, 2013; Abello et al., 2002; Sim et al., 2006; Liu et al., 2008a). For instance, Veremyev et al. (2016) offered an exact Mixed Integer Programming model for searching for a maximal quasi-clique, but the case of quasi-bicliques in bipartite graphs was not yet considered within the MIP framework.

The aim of this paper is to propose Mixed Integer Programming models for finding maximum quasi-bicliques in a bipartite graph and to compare the results obtained by those models with those of existing algorithms.

The paper is organised as follows. Section 2 introduces several basic definitions, namely bicliques, quasi-bicliques, and their density, and provides a short overview of related work along with important propositions on algorithmic aspects. Section 3 proposes two Mixed Integer Programming models for quasi-biclique search. In Section 4, the chosen datasets are described. Section 5 summarises the experimental results. Section 6 concludes the paper.

Let us introduce several basic notions.

Definition 1

In a graph $G = (V, E)$, a subgraph $G' = (V', E')$ with $V' \subseteq V$ and $E' = \{(u, v) \in E \mid u, v \in V'\}$ is called a vertex-induced subgraph. Let us denote such a graph by $G[V']$.

Definition 2

A complete subgraph of a graph $(V, E)$ is called a clique.

Definition 3

A complete bipartite subgraph in a bipartite graph $(U, V, E \subseteq U \times V)$ is called a biclique.

Definition 4

The density of an arbitrary graph is the ratio of the number of edges to the maximum possible number of edges. The density of a bipartite graph $G = (U, V, E)$ is $\rho = \frac{|E|}{|U|\,|V|}$.

Definition 5

A subgraph $G' = (V', E')$ of a given graph $G = (V, E)$ is called $f(k)$-dense if $G'$ is a subgraph induced by a vertex subset $V' \subseteq V$, $|V'| = k$, and $|E'| \ge f(k)$, where $f : \mathbb{Z}_+ \to \mathbb{R}_+$ is a chosen function.

Definition 6

A $\gamma$-quasi-biclique in a bipartite graph $G = (U, V, E)$ is its bipartite induced subgraph $G' = (U', V', E' \subseteq U' \times V')$ with density at least $\gamma \in (0, 1]$.

Let us consider properties and searching algorithms of quasi-cliques in a graph $G = (V, E)$. For a graph $G = (V, E)$ and a fixed $\gamma \in (0, 1]$, we need to find $V' \subseteq V$ such that $G[V']$ is a $\gamma$-quasi-clique and $|V'|$ is maximal.

The problem of searching for a maximum quasi-clique, as well as the problem of searching for a maximum clique, is NP-hard (Liu et al., 2008b; Pattillo et al., 2013). In addition, relaxing completeness leads to the loss of useful properties of a clique. For instance, the inheritance property, which is used in most maximum clique searching algorithms, does not hold. Namely, if $G[V]$ is a clique, then $G[V']$ is a clique as well for any subset $V' \subseteq V$. This property does not hold for $\gamma$-quasi-cliques: a subset of a $\gamma$-quasi-clique is not necessarily a $\gamma$-quasi-clique. However, for quasi-cliques we can define the property of quasi-inheritance: any $\gamma$-quasi-clique with $|V| > 1$ vertices is a strict superset of a $\gamma$-quasi-clique with $|V| - 1$ vertices.

The maximum quasi-biclique problem in a bipartite graph $G = (U, V, E)$ with fixed $\gamma \in (0, 1]$ is to find $U' \subseteq U$ and $V' \subseteq V$ such that the vertex-induced subgraph $G[U', V']$ is a $\gamma$-quasi-biclique of size $|U'| + |V'|$ that is maximum for this graph. Let us denote the size of a maximum $\gamma$-quasi-biclique in the graph $G$ by $\omega_\gamma(G)$.

Let us consider several commonly met definitions of quasi-biclique. In Liu et al. (2008b), we can find the following definition.

Definition 7

An induced subgraph $G[U', V']$ is called a $\delta$-quasi-biclique ($0 \le \delta \le 0.5$) in a bipartite graph $G = (U, V, E)$ if:

1. $\forall u \in U'$: $d(u, V') = |\{v \in V' \mid (u, v) \in E\}| \ge (1 - \delta) \cdot |V'|$,
2. $\forall v \in V'$: $d(v, U') = |\{u \in U' \mid (u, v) \in E\}| \ge (1 - \delta) \cdot |U'|$.
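The density-based Definition 6 and the degree-based Definition 7 can be checked mechanically on small examples. The following is a minimal Python sketch, assuming the graph is stored as a set of `(u, v)` pairs; the function names and the toy graph are illustrative and not part of the paper.

```python
from itertools import product


def density(U_sub, V_sub, edges):
    """Density of the subgraph induced by U_sub and V_sub: |E'| / (|U'| * |V'|)."""
    if not U_sub or not V_sub:
        return 0.0
    present = sum((u, v) in edges for u, v in product(U_sub, V_sub))
    return present / (len(U_sub) * len(V_sub))


def is_gamma_quasi_biclique(U_sub, V_sub, edges, gamma):
    """Definition 6: the induced subgraph has density at least gamma."""
    return density(U_sub, V_sub, edges) >= gamma


def is_delta_quasi_biclique(U_sub, V_sub, edges, delta):
    """Definition 7: every vertex sees at least (1 - delta) of the opposite side."""
    u_ok = all(
        sum((u, v) in edges for v in V_sub) >= (1 - delta) * len(V_sub)
        for u in U_sub
    )
    v_ok = all(
        sum((u, v) in edges for u in U_sub) >= (1 - delta) * len(U_sub)
        for v in V_sub
    )
    return u_ok and v_ok


# Hypothetical toy graph: a 3 x 3 biclique with the single edge (b, 3) removed.
toy_edges = {
    ("a", 1), ("a", 2), ("a", 3),
    ("b", 1), ("b", 2),
    ("c", 1), ("c", 2), ("c", 3),
}
toy_U, toy_V = {"a", "b", "c"}, {1, 2, 3}
```

On this toy graph the density is 8/9, so the subgraph passes the density test for γ = 0.8 but not for γ = 0.9; it satisfies Definition 7 for δ = 0.4 but not for δ = 0.2, because vertex `b` has only two of three possible neighbours.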

In order to consider the third definition of quasi-biclique (Sim et al., 2006), let us introduce some useful notation. The neighbourhood of a vertex $v \in V$ in a graph $G = (V, E)$ is the set of vertices $\Gamma(v) = \{u \in V \mid (u, v) \in E\}$. For a vertex set $V' \subseteq V$ and a vertex $v \in V \setminus V'$, let us denote the set of vertices from $V'$ adjacent to $v$ by $\Gamma_{V'}(v) = \{u \mid (u, v) \in E \text{ and } u \in V'\}$. By $\Gamma(V') = \bigcup_{v \in V'} \Gamma(v)$ we denote the loose neighbourhood of the subset $V'$.

Definition 8

An induced subgraph $G[U', V']$ of a bipartite graph $G = (U, V, E)$ is called an $\varepsilon$-quasi-biclique if:

1. $\forall u \in U'$: $|\Gamma_{V'}(u)| \ge |V'| - \varepsilon$,
2. $\forall v \in V'$: $|\Gamma_{U'}(v)| \ge |U'| - \varepsilon$.

Definitions 7 and 8 of quasi-bicliques can be reduced to the definition of a $\gamma$-quasi-biclique.

1. In Definition 7, let us sum the first condition over all vertices from $U'$. We get $\sum_{u \in U'} d(u, V') \ge (1 - \delta) \cdot |V'|\,|U'|$, where $\sum_{u \in U'} d(u, V')$ is the number of edges in the $\delta$-quasi-biclique and $|V'|\,|U'|$ is the maximum possible number of edges in a bipartite graph on these vertex sets. Thus a $\delta$-quasi-biclique is a $\gamma$-quasi-biclique with $\gamma = 1 - \delta$. Since $0 \le \delta \le 0.5$, this correspondence covers $\gamma \in [0.5, 1]$.

2. By summing both conditions of Definition 8 over the sets $U'$ and $V'$, respectively, we get

$$\frac{\sum_{u \in U'} |\Gamma_{V'}(u)|}{|U'|\,|V'|} \ge 1 - \frac{\varepsilon}{|V'|}, \qquad \frac{\sum_{v \in V'} |\Gamma_{U'}(v)|}{|U'|\,|V'|} \ge 1 - \frac{\varepsilon}{|U'|}.$$

Since $\sum_{u \in U'} |\Gamma_{V'}(u)| = \sum_{v \in V'} |\Gamma_{U'}(v)|$ is the number of edges in the $\varepsilon$-quasi-biclique $G[U', V']$, the density of $G[U', V']$ satisfies

$$\rho(G[U', V']) \ge 1 - \frac{\varepsilon}{\min(|U'|, |V'|)}.$$

Bounding the sizes of the vertex sets from below, $\omega^{(1)}_l \le |U'|$ and $\omega^{(2)}_l \le |V'|$, we can establish a connection between these definitions. If we let $\gamma = 1 - \frac{\varepsilon}{\min(\omega^{(1)}_l, \omega^{(2)}_l)}$, we obtain that $G[U', V']$ is a $\gamma$-quasi-biclique under the condition $\varepsilon \in [0, \min(\omega^{(1)}_l, \omega^{(2)}_l))$. $\square$

Most properties of quasi-cliques naturally carry over to quasi-bicliques. However, since the density definition of a quasi-biclique differs from that of a quasi-clique, and the maximum number of edges is a non-convex function of two variables, most algorithms searching for a maximum quasi-clique are not directly applicable to the search for a maximum quasi-biclique.
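As a quick sanity check of the second reduction, consider a small worked instance. The numbers $\varepsilon = 1$, $|U'| = 4$, $|V'| = 5$ are purely illustrative and are not taken from the paper; the conditions used are that every vertex of $U'$ misses at most $\varepsilon$ vertices of $V'$, and vice versa.

```latex
\begin{align*}
\sum_{u \in U'} |\Gamma_{V'}(u)| &\ge |U'|\,(|V'| - \varepsilon) = 4 \cdot 4 = 16
  &&\Rightarrow\quad \rho \ge \tfrac{16}{20} = 0.8 = 1 - \tfrac{\varepsilon}{|V'|},\\
\sum_{v \in V'} |\Gamma_{U'}(v)| &\ge |V'|\,(|U'| - \varepsilon) = 5 \cdot 3 = 15
  &&\Rightarrow\quad \rho \ge \tfrac{15}{20} = 0.75 = 1 - \tfrac{\varepsilon}{|U'|}.
\end{align*}
```

Both inequalities hold at once, so in particular $\rho \ge 0.75 = 1 - \varepsilon / \min(|U'|, |V'|)$, which is the weaker of the two bounds and the one that holds regardless of which side is smaller.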

Pattillo et al. (2013) established the following upper bound on the size of a maximum quasi-clique.

Proposition 1

In a graph $G = (V, E)$ with $|V| = n$ and $|E| = m$, the maximum size of a quasi-clique $\omega_\gamma(G)$ satisfies the following inequality:

$$\omega_\gamma(G) \le \frac{\gamma + \sqrt{\gamma^2 + 8\gamma m}}{2\gamma}. \tag{1}$$

In order to obtain a similar bound for a quasi-biclique, we need to impose additional conditions on the quasi-biclique.

Proposition 2

In a bipartite graph $G = (U, V, E)$ with $|U| = n_U$, $|V| = n_V$ and $|E| = m$, the maximum size of a quasi-biclique $\omega_\gamma(G)$ satisfies the following inequalities:

1. $\omega_\gamma(G) \le 2\sqrt{\dfrac{m}{\gamma}}$ for a balanced quasi-biclique (the sizes of its two vertex sets are equal),
2. $\omega_\gamma(G) \le \min\left\{ (2 + \theta) \cdot \sqrt{\dfrac{m}{\gamma (1 - \theta)}},\ \left(1 + \dfrac{1}{1 - \theta}\right) \cdot \sqrt{\dfrac{m (1 + \theta)}{\gamma}} \right\}$ if $\theta \in (0, 1)$ and the sizes of the vertex sets differ from each other by a factor of at most $1 \pm \theta$.

Proof

Let $U'$ and $V'$ be the vertex sets of a maximum $\gamma$-quasi-biclique and let $n_{U'}$ and $n_{V'}$ be their cardinalities, respectively.

1. For a balanced quasi-biclique $n_{U'} = n_{V'}$, hence $\omega_\gamma(G) = 2 \cdot n_{U'}$. The quasi-biclique contains at least $\gamma \cdot n_{U'}^2$ edges, and this number cannot exceed the total number of edges of the graph. Then

$$\gamma \cdot n_{U'}^2 = \gamma \cdot \left(\frac{\omega_\gamma(G)}{2}\right)^2 \le m,$$

which gives the first bound.

2. A $\gamma$-quasi-biclique is "almost" balanced when $(1 - \theta)\, n_{V'} \le n_{U'} \le (1 + \theta)\, n_{V'}$. Thus, $\omega_\gamma(G) = n_{U'} + n_{V'} \le (2 + \theta)\, n_{V'}$, and

$$m \ge \gamma \cdot n_{U'} \cdot n_{V'} \ge \gamma \cdot (1 - \theta) \cdot n_{V'}^2 \ge \gamma \cdot (1 - \theta) \cdot \left(\frac{\omega_\gamma(G)}{2 + \theta}\right)^2 \ \Rightarrow\ \omega_\gamma(G) \le \sqrt{\frac{m (2 + \theta)^2}{\gamma (1 - \theta)}}.$$

Analogously, $\omega_\gamma(G) \le \left(1 + \dfrac{1}{1 - \theta}\right) n_{U'}$ and

$$m \ge \gamma \cdot n_{U'} \cdot n_{V'} \ge \gamma \cdot \frac{1}{1 + \theta} \cdot n_{U'}^2 \ \Rightarrow\ \omega_\gamma(G) \le \left(1 + \frac{1}{1 - \theta}\right) \sqrt{\frac{m (1 + \theta)}{\gamma}}.$$
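The bounds of Propositions 1 and 2 are cheap to evaluate and can be used to tighten the size limits of a search. The sketch below assumes the bounds read as $(\gamma + \sqrt{\gamma^2 + 8\gamma m})/(2\gamma)$, $2\sqrt{m/\gamma}$, and the minimum of $(2+\theta)\sqrt{m/(\gamma(1-\theta))}$ and $(1 + 1/(1-\theta))\sqrt{m(1+\theta)/\gamma}$, as follows from the proof above; the function names are illustrative.

```python
import math


def quasi_clique_bound(m, gamma):
    """Proposition 1: upper bound on the size of a gamma-quasi-clique with m edges available."""
    return (gamma + math.sqrt(gamma ** 2 + 8 * gamma * m)) / (2 * gamma)


def balanced_quasi_biclique_bound(m, gamma):
    """Proposition 2, item 1: both sides of the quasi-biclique have equal size."""
    return 2 * math.sqrt(m / gamma)


def almost_balanced_quasi_biclique_bound(m, gamma, theta):
    """Proposition 2, item 2: side sizes satisfy (1 - theta) n_V <= n_U <= (1 + theta) n_V."""
    if not 0 < theta < 1:
        raise ValueError("theta must lie in the open interval (0, 1)")
    via_v = (2 + theta) * math.sqrt(m / (gamma * (1 - theta)))
    via_u = (1 + 1 / (1 - theta)) * math.sqrt(m * (1 + theta) / gamma)
    return min(via_v, via_u)
```

For instance, with $m = 36$ edges and $\gamma = 1$ the balanced bound equals 12, which is attained by a $6 \times 6$ biclique. Since the true size is an integer, the floor of each value can be used as $\omega_u$ in a model.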

Now let us discuss a few chosen algorithms that implement maximum quasi-biclique search.

A greedy algorithm for searching maximum quasi-bicliques according to Definition 7 is discussed in detail by Liu et al. (2008b). The algorithm uses two parameters: 1) $\delta$ to control the density of the quasi-biclique ($\delta = 1 - \gamma$) and 2) $\tau$ to control the smallest possible number of vertices that belong to one of the partitions of a quasi-biclique. Let us denote by $U'$ and $V'$ the vertex sets of a quasi-biclique of the graph $G = (U, V, E)$. At the beginning of the algorithm we set $U' = \emptyset$ and $V' = V$. From the vertex set $U \setminus U'$ we choose a vertex $u$ of maximum degree, add it to $U'$, and delete from $V'$ all vertices $v$ for which $d(v, U') < (1 - \delta) \cdot |U'|$. This process continues as long as $|U'| < \tau$. However, this procedure can miss possible vertex candidates, so the authors introduce a second step: if there is a vertex $u$ outside of the current vertex set $U'$ such that its degree is maximum in $U \setminus U'$ and the subgraph with $U' \cup \{u\}$ remains a quasi-biclique, then it can be added to $U'$. The same applies to $V'$ as long as there is a vertex to add.

In this section we show how to adapt the model F3 from Veremyev et al. (2016) to the search for maximum quasi-bicliques. Let us consider disjoint sets $U'$ and $V'$, $U' \cap V' = \emptyset$, that form a quasi-biclique of a bipartite graph $G = (U, V, E)$. Using techniques similar to those in Veremyev et al. (2016), we introduce the following variables:

$u_i = 1 \Leftrightarrow i \in U'$,
$v_j = 1 \Leftrightarrow j \in V'$,
$y_{ij} = 1 \Leftrightarrow (i, j) \in E \cap (U' \times V')$,
$z^{(1)}_k = 1 \Leftrightarrow |U'| = k$, $z^{(2)}_k = 1 \Leftrightarrow |V'| = k$,
$\omega^{(1)}_l, \omega^{(1)}_u$ are the lower and upper bounds, respectively, for the size of the vertex set $U'$,
$\omega^{(2)}_l, \omega^{(2)}_u$ are the lower and upper bounds, respectively, for the size of the vertex set $V'$.
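The greedy procedure of Liu et al. (2008b) summarised at the start of this passage can be sketched in a few lines of Python. This is a reading of the prose description above rather than the authors' implementation: it assumes that "degree" means the number of neighbours inside the current opposite side, that ties are broken by sorted vertex order, and that the second step tests only the single maximum-degree candidate on each side per round. All function names are hypothetical.

```python
def satisfies_definition_7(U_sub, V_sub, edges, delta):
    """Degree conditions of Definition 7 for the candidate pair (U_sub, V_sub)."""
    u_ok = all(
        sum((u, v) in edges for v in V_sub) >= (1 - delta) * len(V_sub)
        for u in U_sub
    )
    v_ok = all(
        sum((u, v) in edges for u in U_sub) >= (1 - delta) * len(U_sub)
        for v in V_sub
    )
    return u_ok and v_ok


def greedy_quasi_biclique(U, V, edges, delta, tau):
    """Greedy sketch: grow U' up to tau vertices while pruning V', then try to extend."""
    U_sub, V_sub = set(), set(V)

    # Step 1: add the highest-degree vertex of U \ U', then drop weak vertices from V'.
    while len(U_sub) < tau and U_sub != set(U):
        u = max(sorted(set(U) - U_sub),
                key=lambda x: sum((x, v) in edges for v in V_sub))
        U_sub.add(u)
        V_sub = {v for v in V_sub
                 if sum((x, v) in edges for x in U_sub) >= (1 - delta) * len(U_sub)}

    # Step 2: keep adding the best outside candidate while Definition 7 still holds.
    grown = True
    while grown:
        grown = False
        rest_u = sorted(set(U) - U_sub)
        if rest_u:
            u = max(rest_u, key=lambda x: sum((x, v) in edges for v in V_sub))
            if satisfies_definition_7(U_sub | {u}, V_sub, edges, delta):
                U_sub.add(u)
                grown = True
        rest_v = sorted(set(V) - V_sub)
        if rest_v:
            v = max(rest_v, key=lambda y: sum((x, y) in edges for x in U_sub))
            if satisfies_definition_7(U_sub, V_sub | {v}, edges, delta):
                V_sub.add(v)
                grown = True
    return U_sub, V_sub
```

Each call is deterministic for a fixed input, which makes it convenient as a baseline; a different tie-breaking rule or a different notion of degree may change which quasi-biclique is returned.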
We can refine the sizes of the vertex sets of a quasi-biclique using Proposition 2. Then we build Model 1:

Model 1

$$\omega_\gamma(G) = \max_{u, v, y, z} \left[ \sum_{i \in U} u_i + \sum_{j \in V} v_j \right], \tag{2}$$

under conditions

$$\sum_{(i,j) \in E} y_{ij} \ge \gamma \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} n \cdot m \cdot z^{(1)}_n \cdot z^{(2)}_m, \tag{3}$$

$$\forall i \in U,\ \forall j \in V:\quad y_{ij} \le u_i,\quad y_{ij} \le v_j,\quad y_{ij} \ge u_i + v_j - 1, \tag{4}$$

$$\sum_{i \in U} u_i = \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n, \qquad \sum_{j \in V} v_j = \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m, \tag{5}$$

$$\sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} z^{(1)}_n = 1, \qquad \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} z^{(2)}_m = 1, \tag{6}$$

$$\forall i \in U,\ \forall j \in V:\ u_i \in \{0, 1\},\ v_j \in \{0, 1\}, \qquad \forall (i, j) \in E:\ y_{ij} \in \{0, 1\}, \tag{7}$$

$$\forall n \in \{\omega^{(1)}_l, \ldots, \omega^{(1)}_u\}:\ z^{(1)}_n \ge 0, \qquad \forall m \in \{\omega^{(2)}_l, \ldots, \omega^{(2)}_u\}:\ z^{(2)}_m \ge 0. \tag{8}$$

As in the model F3, we can bound $z^{(1)}_k$ and $z^{(2)}_k$ and recast them from binary into continuous variables.

Suppose that there exists an optimal solution $\left(u^*, v^*, y^*, z^{(1)}, z^{(2)}\right)$ of Model 1, where the vectors $z^{(1)}$ and $z^{(2)}$ are not binary $\left(z^{(1)}_n \ge 0,\ z^{(2)}_n \ge 0\right)$. Let $\widehat{z}^{(1)}$ be a binary vector with $\widehat{z}^{(1)}_k = 1 \Leftrightarrow |U'| = k$ and $\widehat{z}^{(1)}_k = 0$ otherwise, $k \in \{\omega^{(1)}_l, \ldots, \omega^{(1)}_u\}$; analogously, let $\widehat{z}^{(2)}$ be the vector with $\widehat{z}^{(2)}_k = 1 \Leftrightarrow |V'| = k$ and 0 otherwise. Hence, the vectors $\widehat{z}^{(1)}$ and $\widehat{z}^{(2)}$ satisfy constraint (6). Constraints (3) and (5) can be rewritten as follows:

$$\sum_{i \in U} u^*_i = \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, \widehat{z}^{(1)}_n, \qquad \sum_{j \in V} v^*_j = \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, \widehat{z}^{(2)}_m \quad \text{(by definition)},$$

$$\sum_{(i,j) \in E} y^*_{ij} \ge \gamma \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} n \cdot m \cdot z^{(1)}_n \cdot z^{(2)}_m = \gamma \left( \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n \cdot z^{(1)}_n \right) \left( \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m \cdot z^{(2)}_m \right) = \gamma \left( \sum_{i \in U} u^*_i \right) \left( \sum_{j \in V} v^*_j \right) = \gamma \left( \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n \cdot \widehat{z}^{(1)}_n \right) \left( \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m \cdot \widehat{z}^{(2)}_m \right).$$

This means that $\left(u^*, v^*, y^*, \widehat{z}^{(1)}, \widehat{z}^{(2)}\right)$ is also an optimal solution of the problem, which justifies the use of continuous variables $z^{(1)}_n$ and $z^{(2)}_m$ in Model 1.

In the worst case, when $\omega^{(1)}_l = \omega^{(2)}_l = 1$, $\omega^{(1)}_u = |U|$, $\omega^{(2)}_u = |V|$, the model has $|U| + |V|$ binary variables and $|E| + |U| + |V|$ continuous variables.

Remark

In Model 1, condition (3) is not linear, so we can linearise it. Let us introduce a new variable $z_{n,m} = z^{(1)}_n \cdot z^{(2)}_m$. Then the right-hand side of inequality (3) becomes

$$\gamma \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} (n \cdot m) \cdot z_{n,m}.$$

Conditions (5) are changed as follows:

$$\sum_{i \in U} u_i = \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} n\, z_{n,m}, \qquad \sum_{j \in V} v_j = \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z_{n,m},$$

that is, with coefficients $c^{(1)}_{n,m} = n$ and $c^{(2)}_{n,m} = m$.

Using this substitution for the variables $z^{(1)}_n$ and $z^{(2)}_m$, the model becomes a linear integer programming model. In the worst-case scenario, for a dense graph there are $|U| + |V|$ binary variables and $|E| + |U| \cdot |V|$ continuous variables to be optimized. $\square$

Let us look at different maximization criteria for related Mixed Integer Programming models. In papers dedicated to triclustering generation (Ignatov et al., 2015; Mirkin and Kramarenko, 2011), $K = (G, M, B, I)$ is a triadic context with $G$, the set of objects, $M$, the set of attributes, $B$, the set of conditions, and $I \subseteq G \times M \times B$, the ternary relation. The proposed triclustering algorithm searches for triclusters $T = (X, Y, Z)$ that maximize the following criterion:

$$f(T) = \rho(T)^2\, |X|\,|Y|\,|Z|. \tag{9}$$

By narrowing this criterion to binary contexts, it is possible to obtain another maximization criterion for Model 7 GF3($f$) from Veremyev et al. (2016), p. 191.

For a bipartite graph $G = (U, V, E)$ and its induced subgraph $G[C_1, C_2]$, the function $f$ is maximized over the density and size of the subgraph:

$$f(C_1, C_2) = \rho(G[C_1, C_2])^2 \cdot |C_1| \cdot |C_2| = \frac{\left( |\{(i, j) : i \in C_1,\ j \in C_2,\ (i, j) \in E\}| \right)^2}{|C_1| \cdot |C_2|}. \tag{10}$$

Using the variable definitions from the previous model, we can rewrite the function $f$:

$$f(C) = \frac{\left( \sum_{(i,j) \in E} y_{ij} \right)^2}{\left( \sum_{i \in U} u_i \right) \cdot \left( \sum_{j \in V} v_j \right)}.$$

Since the function $f$ is multiplicative, the direct way to transform it into an additive function is to take the logarithm:

$$f_{\log}(C) = 2 \cdot \log |\{(i, j) : i \in C_1,\ j \in C_2,\ (i, j) \in E\}| - \log |C_1| - \log |C_2| = 2 \cdot \log \left( \sum_{(i,j) \in E} y_{ij} \right) - \log \left( \sum_{i \in U} u_i \right) - \log \left( \sum_{j \in V} v_j \right). \tag{11}$$

As in Model 1,

$$\sum_{i \in U} u_i = \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n, \qquad \sum_{j \in V} v_j = \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m.$$

Now we introduce a new variable: $w_k = 1 \Leftrightarrow |\{(i, j) : i \in C_1,\ j \in C_2,\ (i, j) \in E\}| = k$; then $\sum_{(i,j) \in E} y_{ij} = \sum_{k=1}^{|E|} k\, w_k$ and

$$f_{\log}(C) = 2 \cdot \log \left( \sum_{k=1}^{|E|} k\, w_k \right) - \log \left( \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n \right) - \log \left( \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m \right) = 2 \cdot \sum_{k=1}^{|E|} \log(k)\, w_k - \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \log(n)\, z^{(1)}_n - \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} \log(m)\, z^{(2)}_m. \tag{12}$$

The equality $\log \left( \sum_{k=1}^{|E|} k\, w_k \right) = \sum_{k=1}^{|E|} \log(k)\, w_k$ holds because $w_k$ is a binary variable and $\sum_{k=1}^{|E|} w_k = 1$. Thus there exists a unique number $k^*$ such that $w_{k^*} = 1$, and

$$\log \left( \sum_{k=1}^{|E|} k\, w_k \right) = \log(k^*) = w_{k^*} \cdot \log(k^*) = \sum_{k=1}^{|E|} \log(k)\, w_k.$$

A similar statement is true for $\log \left( \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n \right)$ and $\log \left( \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m \right)$.

Without extra conditions on the sizes of the vertex sets of a quasi-biclique and its minimum number of edges, the model has $2 \cdot (|U| + |V|) + 2 \cdot |E|$ variables.

Model 2

$$2 \sum_{k=1}^{|E|} \log(k) \cdot w_k - \sum_{n=1}^{|U|} \log(n)\, z^{(1)}_n - \sum_{m=1}^{|V|} \log(m)\, z^{(2)}_m \ \longrightarrow\ \max_{w,\, z^{(1)},\, z^{(2)}},$$

under conditions

$$\sum_{k=1}^{|E|} k \cdot w_k \ge \gamma \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} n \cdot m \cdot z^{(1)}_n \cdot z^{(2)}_m,$$

$$\sum_{(i,j) \in E} y_{ij} = \sum_{k=1}^{|E|} k \cdot w_k,$$

$$\sum_{i \in U} u_i = \sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n, \qquad \sum_{j \in V} v_j = \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m,$$

$$\sum_{k=1}^{|E|} w_k = 1, \qquad \sum_{n=1}^{|U|} z^{(1)}_n = 1, \qquad \sum_{m=1}^{|V|} z^{(2)}_m = 1,$$

$$\forall i \in U:\ u_i \in \{0, 1\}, \qquad \forall j \in V:\ v_j \in \{0, 1\}, \qquad \forall (i, j) \in E:\ y_{ij} \in \{0, 1\},$$

$$\forall k \in \{1, \ldots, |E|\}:\ w_k \in \{0, 1\},$$

$$\forall n \in \{\omega^{(1)}_l, \ldots, \omega^{(1)}_u\}:\ z^{(1)}_n \ge 0, \qquad \forall m \in \{\omega^{(2)}_l, \ldots, \omega^{(2)}_u\}:\ z^{(2)}_m \ge 0.$$

Remark

In order to simplify the model, we can add extra constraints for the variables $w_k$, $k \in \{1, \ldots, |E|\}$. Let $k$ be a possible number of edges in a quasi-biclique; then:

1. $k \le \omega^{(1)}_u \cdot \omega^{(2)}_u$.
2. If $\gamma \cdot \omega^{(1)}_l \cdot \omega^{(2)}_l \le |E|$, then $k \ge \gamma \cdot \omega^{(1)}_l \cdot \omega^{(2)}_l$.
3. Let us consider $U'$ such that $|U'| = \omega^{(1)}_l$ and $\forall u \in U'$: $\deg(u) \le \min_{x \in U \setminus U'} \{\deg(x)\}$. That is, $U'$ is a subset of $U$ of the minimum possible size consisting of the smallest-degree vertices of $U$. Then $k \ge \gamma \sum_{u \in U'} \deg(u)$.
4. Similarly, for $V' \subseteq V$ with $|V'| = \omega^{(2)}_l$ and $\forall v \in V'$: $\deg(v) \le \min_{x \in V \setminus V'} \{\deg(x)\}$, we have $k \ge \gamma \sum_{v \in V'} \deg(v)$. $\square$

Datasets for testing the performance of the algorithms are mainly taken from (Borgatti et al., 2014; Batagelj and Mrvar, 2014).

1. Southern Women: $|U| = 18$, $|V| = 14$ vertices, and $|E| = 89$ edges; a classic ethnographic dataset with a bipartite graph of 18 women who met in a series of 14 informal social events (Freeman, 2003).
2. Divorce in the US: one of the two vertex sets has $|V| = 50$ vertices, and $|E| = 225$ edges. This graph describes the particular causes of divorce in the United States.
3. Dutch Elite: $|V| = 937$ vertices.
4. Dutch Elite (top200): $|V| = 395$ vertices and $|E| = 877$ edges. The list of people in the first partition of the graph consists of the most influential persons regarding their membership in administrative authorities.
5. Movie-Lens (ml-latest-small): $|V| = 50$ vertices.

The greedy algorithm of searching for a maximal $\gamma$-quasi-biclique in a bipartite graph was implemented in Python 2.7. The MIP models were implemented with the optimization package CPLEX, created by IBM. All computations were carried out on a laptop with the macOS operating system, a 2.7 GHz Intel Core i5 processor, and 8 GB of 1867 MHz RAM.

The search for solutions in the CPLEX package was performed by means of the branch-and-cut method, which is similar to the branch-and-bound algorithmic approach. The method uses a search tree, where each node represents a subproblem that needs to be solved and possibly analysed further.

The branch procedure creates two new nodes from the active parent node. Generally, at this point, the bounds of one variable are applied and stored for the current node and all its child nodes. In its turn, the cut procedure adds a new constraint to the model. As a result of any cut, the solution space for the subproblems represented by the nodes is reduced, and the number of branches that need to be processed decreases. CPLEX processes active nodes in the tree until no more active nodes are available or a certain limit is reached.

By default, CPLEX returns only one of the optimal solutions as the answer. However, it is possible to obtain a set of optimal solutions using the solution pool method, which allows one to find and store several solutions of MIP models.

The generation of multiple solutions works in two steps. The first step is identical to the usual solution search in CPLEX. At this step, the algorithm finds a single optimal solution of the integer programming problem. It also saves nodes in the search tree that could potentially be useful; for example, if not all the variable constraints are taken into account or if all the nodes contain a suitable value, but the target function is not optimal.

In the second step, using the information calculated and stored in the first step, several solutions are generated, and the tree is traversed again, in particular within the branches rooted at the additional nodes stored in the first step.

On a toy example of a graph with 12 vertices, we consider the search results for maximal $\gamma$-quasi-bicliques, $\gamma = 0.8$, using Models 1 and 2, respectively (Fig. 1).

Fig. 1: The results of search for quasi-bicliques using Models 1 and 2.

The results for both models are the same (up to the output order of the solutions). Even for this small-sized problem, the time is tangible: the computations with Model 1 took 2.16 s, and with Model 2, 2.94 s. A comparison of the executed models and the greedy algorithm in terms of computational time is given below for the selected bipartite graphs.
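On toy instances of the size used above, the output of a MIP model can be cross-checked by exhaustive enumeration. The following sketch is not one of the paper's models: it simply tries every pair of non-empty subsets and keeps those of maximum total size whose density is at least γ, so its running time is exponential and it is only usable for graphs with a handful of vertices. The function name and the example graph are hypothetical.

```python
from itertools import combinations


def max_quasi_bicliques(U, V, edges, gamma):
    """Return (best_size, solutions) over all non-empty U' x V' with density >= gamma."""
    U, V = sorted(U), sorted(V)
    best_size, best = 0, []
    for p in range(1, len(U) + 1):
        for q in range(1, len(V) + 1):
            if p + q < best_size:
                continue  # this shape cannot match the current best size
            for U_sub in combinations(U, p):
                for V_sub in combinations(V, q):
                    present = sum((u, v) in edges for u in U_sub for v in V_sub)
                    if present >= gamma * p * q:
                        if p + q > best_size:
                            best_size, best = p + q, []
                        best.append((U_sub, V_sub))
    return best_size, best


# Hypothetical 3 x 4 example with 9 of the 12 possible edges.
example_edges = {
    ("a", 1), ("a", 2), ("a", 3),
    ("b", 1), ("b", 2), ("b", 3),
    ("c", 1), ("c", 2), ("c", 4),
}
size, solutions = max_quasi_bicliques({"a", "b", "c"}, {1, 2, 3, 4}, example_edges, 0.8)
```

Here the whole graph has density 0.75, so for γ = 0.8 the optimum has size 6 and is the single pair `(('a', 'b', 'c'), (1, 2, 3))` with density 8/9. Listing all optimal solutions in this way mirrors what the solution pool is used for in the experiments.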

The search for the maximal quasi-biclique using the CPLEX software package was implemented for Models 1 and 2 (see Section 3) and compared with the greedy algorithm from (Liu et al., 2008b), which we denote as Greedy Algorithm.

There are no comparison results presented for the model F3 (Veremyev et al., 2016): despite its fast work, the algorithm based on this model chose quasi-bicliques of very small size and maximum density (i.e. bicliques). This phenomenon is rather expected, since the model F3 implies a completely different function of the density of the subgraph. Therefore, the comparison in this case is irrelevant. The description of Complete QB in (Sim et al., 2006) lacks important implementation details.

A weakness of the constructed MIP models was identified while solving them. Since enumerating all maximal quasi-bicliques requires considerable time in practice, the software package can discard some solutions if it has already found quite a few optimal ones. First of all, the search is carried out among unbalanced quasi-bicliques (no constraints on the approximately equal size of the quasi-biclique partitions have been given). For large graphs, this means that the number of vertices in one of the parts of the found optimal solution may exceed the number of vertices in the second part by hundreds of times or more.

This issue can be addressed in two ways. Firstly, one can set roughly equal limits on the sizes of the partitions. Secondly, it is possible to adapt the model for finding almost balanced quasi-bicliques, meaning that the sizes of the partitions of a quasi-biclique differ by a factor of at most $1 \pm \theta$. To do this, the following conditions should be added to Model 1 or Model 2:

$$\sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n \ge (1 - \theta) \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m, \tag{13}$$

$$\sum_{n = \omega^{(1)}_l}^{\omega^{(1)}_u} n\, z^{(1)}_n \le (1 + \theta) \sum_{m = \omega^{(2)}_l}^{\omega^{(2)}_u} m\, z^{(2)}_m. \tag{14}$$

Models with the additional conditions (13) and (14) have not been tested.

It has also been noted that small-sized quasi-bicliques can be useless in practice, but their recalculation is costly. Therefore, for each dataset, we can establish minimum bounds on the size of a quasi-biclique (of the order of the smallest vertex degree with respect to the graph partitions).

The results of the algorithms' execution are presented in Table 1 for $\gamma = 0.6$, and in Tables 2 and 3 for two other values of $\gamma$. For each algorithm its main parameters are indicated: the algorithm running time (time), the number of found maximum quasi-bicliques (count), and the maximum size of the found solution (size).

Two found $\gamma$-quasi-bicliques for the dataset Divorce in US are shown in Fig. 2. The size column in Table 3 is given as the sum of $|U'|$ and $|V'|$.

Table 1: Results of maximum $\gamma$-quasi-biclique search. Parameters: $\gamma = 0.6$.

| Data | Model 1 time | Model 1 count | Model 1 size | Model 2 time | Model 2 count | Model 2 size | Greedy time | Greedy count | Greedy size |
|---|---|---|---|---|---|---|---|---|---|
| Southern Women | 678 ms | 4 | (18,4) | 801 ms | 2 | (18,4) | 234 ms | 4 | (17,5) |
| Divorce in US | 1.23 s | 1 | (4,50) | 3.38 s | 1 | (4,50) | 360 ms | 1 | (2,46) |
| DutchElite (top200) | 7602 s | 2 | (26,1) | 181 s | 1 | (11,3) | 3 s | 1 | (10,3) |
| DutchElite | - | - | - | 6968 s | 1 | (45,2) | 1954 s | 1 | (40,2) |
| Movie-Lens (small) | 28068 s | 2 | (692,2) | 13851 s | 5 | (900,3) | 5976 s | 2 | (754,2) |

Table 2: Results of maximum $\gamma$-quasi-biclique search for the second value of $\gamma$.

| Data | Model 1 time | Model 1 count | Model 1 size | Model 2 time | Model 2 count | Model 2 size | Greedy time | Greedy count | Greedy size |
|---|---|---|---|---|---|---|---|---|---|
| Southern Women | 1.29 s | 1 | (16,3) | 1.11 s | 1 | (10,6) | 309 ms | 1 | (16,2) |
| Divorce in US | 1.56 s | 1 | (2,45) | 2.66 s | 3 | (5,36) | 320 ms | 1 | (2,28) |
| DutchElite (top200) | 8497 s | 1 | (23,1) | 1668 s | 3 | (10,3) | 1.63 s | 1 | (10,3) |
| DutchElite | - | - | - | 6166 s | 1 | (20,2) | 1511 s | 1 | (20,1) |
| Movie-Lens (small) | - | - | - | 10719 s | 6 | (800,3) | - | - | - |

Table 3: Results of maximum $\gamma$-quasi-biclique search for the third value of $\gamma$.

| Data | Model 1 time | Model 1 count | Model 1 size | Model 2 time | Model 2 count | Model 2 size | Greedy time | Greedy count | Greedy size |
|---|---|---|---|---|---|---|---|---|---|
| Divorce in US | 8.53 s | 1 | 38 | 1.7 s | 2 | 33 | 313 ms | 1 | 25 |
| DutchElite (top200) | - | - | - | 4834 s | 2 | 13 | 2.5 s | 1 | 13 |
| DutchElite | - | - | - | 7129 s | 1 | 47 | 1718 | 1 | 21 |
| Movie-Lens (small) | - | - | - | 9046 s | 2 | 445 | - | - | - |

Dashes ("-") in the tables mean that the algorithm worked for 10 hours and did not find a solution. If one of the partitions of the maximal quasi-biclique has a unit size, this is marked in the table as $(U', V')$, where $U'$ and $V'$ are the sizes of the partitions.

Fig. 2: Quasi-bicliques obtained by the studied MIP models for the dataset Divorce in US.

One can note that the mixed integer programming models work an order of magnitude slower than the greedy algorithm by Liu et al. (2008b), but they find more quasi-bicliques, and each of them generally has a larger size.

For small graphs, the time for finding the solution by the considered models is acceptable. Model 1 contains fewer variables to be optimized, but its maximization criterion is costly for large graphs. Thus, on large graphs Model 1 works too long (more than 10 hours), especially for high density thresholds $\gamma$. The dependence of the speed and quality of processing on $\gamma$ is also apparent for the two other algorithms: for high thresholds on density, those methods work longer since the number of possible optimal solutions to the problem is reduced. Model 2 showed better results on similar graphs, but the processing time is still quite large. For the DutchElite data, with a large number of vertices and a small number of edges, MIP-based algorithms work much longer than on denser graphs.

If we consider the results in terms of quality rather than speed, then Model 2 was the best one. This model produced more unique and larger quasi-bicliques than the other algorithms.

The following directions for future work seem relevant: 1) further improvement of the proposed models by establishing tighter bounds for different constraints and using optimization tricks; 2) exploration of new optimization criteria; 3) comparison of different MIP solvers with the state-of-the-art approaches to searching for quasi-bicliques in a larger set of experiments.

Another interesting avenue for research could be a study of the connection between various approximations of formal concepts (fault-tolerant concepts (Besson et al., 2006) and object-attribute biclusters (Ignatov et al., 2012, 2017)), Boolean matrix factorization (Miettinen, 2013; Belohlávek et al., 2019), and quasi-bicliques.

Acknowledgements

The work of Dmitry I. Ignatov presented in all the sections except 5 and 6 has been supported by the Russian Science Foundation grant no. 17-11-01276 and performed at the St. Petersburg Department of the Steklov Mathematical Institute of the Russian Academy of Sciences, Russia. The authors would like to thank Boris Mirkin, Vladimir Kalyagin, Panos Pardalos, and Oleg Prokopyev for their advice and inspirational discussions. Last but not least, we are thankful to the anonymous reviewers for their useful feedback.

References

Abello J, Resende MGC, Sudarsky S (2002) Massive quasi-clique detection. In: Rajsbaum S (ed) LATIN 2002: Theoretical Informatics, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 598–612

Batagelj V, Mrvar A (2014) Pajek. In: Encyclopedia of Social Network Analysis and Mining, pp 1245–1256, DOI 10.1007/978-1-4614-6170-8_310, URL https://doi.org/10.1007/978-1-4614-6170-8_310

Belohlávek R, Outrata J, Trnecka M (2019) Factorizing boolean matrices using formal concepts and iterative usage of essential entries. Inf Sci 489:37–49, DOI 10.1016/j.ins.2019.03.001, URL https://doi.org/10.1016/j.ins.2019.03.001

Besson J, Robardet C, Boulicaut J (2006) Mining a new fault-tolerant pattern type as an alternative to formal concept discovery. In: Conceptual Structures: Inspiration and Application, 14th International Conference on Conceptual Structures, ICCS 2006, Aalborg, Denmark, July 16-21, 2006, Proceedings, pp 144–157, DOI 10.1007/11787181_11, URL https://doi.org/10.1007/11787181_11

Borgatti SP, Everett MG, Freeman LC (2014) UCINET. In: Encyclopedia of Social Network Analysis and Mining, pp 2261–2267, DOI 10.1007/978-1-4614-6170-8_316, URL https://doi.org/10.1007/978-1-4614-6170-8_316

Freeman LC (2003) Finding social groups: A meta-analysis of the southern women data. In: Breiger R, Carley K, Pattison P (eds) Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers, National Academies Press

Harper FM, Konstan JA (2015) The movielens datasets: History and context. ACM Trans Interact Intell Syst 5(4):19:1–19:19, DOI 10.1145/2827872, URL http://doi.acm.org/10.1145/2827872

Ignatov DI, Kuznetsov SO, Poelmans J (2012) Concept-based biclustering for internet advertisement. In: 12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, Belgium, December 10, 2012, pp 123–130, DOI 10.1109/ICDMW.2012.100, URL https://doi.org/10.1109/ICDMW.2012.100

Ignatov DI, Gnatyshak DV, Kuznetsov SO, Mirkin BG (2015) Triadic formal concept analysis and triclustering: searching for optimal patterns. Machine Learning 101(1):271–302, DOI 10.1007/s10994-015-5487-y, URL https://doi.org/10.1007/s10994-015-5487-y

Ignatov DI, Semenov A, Komissarova D, Gnatyshak DV (2017) Multimodal clustering for community detection. In: Formal Concept Analysis of Social Networks, pp 59–96, DOI 10.1007/978-3-319-64167-6_4, URL https://doi.org/10.1007/978-3-319-64167-6_4

Liu HB, Liu J, Wang L (2008a) Searching maximum quasi-bicliques from protein-protein interaction network. Journal of Biomedical Science and Engineering 1(03):200

Liu X, Li J, Wang L (2008b) Quasi-bicliques: Complexity and binding pairs. In: Hu X, Wang J (eds) Computing and Combinatorics, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 255–264

Miettinen P (2013) Fully dynamic quasi-biclique edge covers via boolean matrix factorizations. In: Proceedings of the Workshop on Dynamic Networks Management and Mining, ACM, New York, NY, USA, DyNetMM ’13, pp 17–24, DOI 10.1145/2489247.2489250, URL http://doi.acm.org/10.1145/2489247.2489250

Mirkin BG, Kramarenko AV (2011) Approximate bicluster and tricluster boxes in the analysis of binary data. In: Rough Sets, Fuzzy Sets, Data Mining and Granular Computing - 13th International Conference, RSFDGrC 2011, Moscow, Russia, June 25-27, 2011. Proceedings, pp 248–256, DOI 10.1007/978-3-642-21881-1_40, URL https://doi.org/10.1007/978-3-642-21881-1_40

Pattillo J, Veremyev A, Butenko S, Boginski V (2013) On the maximum quasi-clique problem. Discrete Applied Mathematics 161(1):244–257, DOI 10.1016/j.dam.2012.07.019

Peeters R (2003) The maximum edge biclique problem is np-complete. Discrete Applied Mathematics 131(3):651–654, DOI 10.1016/S0166-218X(03)00333-0

Sim K, Li J, Gopalkrishnan V, Liu G (2006) Mining maximal quasi-bicliques to co-cluster stocks and financial ratios for value investment. In: Sixth International Conference on Data Mining (ICDM’06), pp 1059–1063, DOI 10.1109/ICDM.2006.111

Veremyev A, Prokopyev OA, Butenko S, Pasiliao EL (2016) Exact mip-based approaches for finding maximum quasi-cliques and dense subgraphs. Comp Opt and Appl 64(1):177–214, DOI 10.1007/s10589-015-9804-y, URL https://doi.org/10.1007/s10589-015-9804-y