Discrimination of attractors with noisy nodes in Boolean networks
Xiaoqing Cheng (1), Wai-Ki Ching (2), Sini Guo (2), and Tatsuya Akutsu (3,*)

(1) School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, China
(2) Advanced Modelling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong
(3) Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, 611-0011, Japan
* Corresponding author: Tatsuya Akutsu, [email protected]
September 29, 2020
Abstract
Observing the internal state of a whole system using a small number of sensor nodes is important in the analysis of complex networks. Here, we study the problem of determining the minimum number of sensor nodes needed to discriminate attractors under the assumption that each attractor has at most K noisy nodes. We present exact and approximation algorithms for this minimization problem. The effectiveness of the algorithms is also demonstrated by computational experiments using both synthetic data and realistic biological data.

Keywords: Observability; Boolean networks; attractors; genetic networks; biomarkers
It is important for analyzing complex network systems to select a small set of nodes (i.e., sensor nodes) whose measurements can determine all other state variables. Relationships between the structure of complex networks and sensor nodes have been analyzed recently, especially for linear systems [1, 2]. However, biological systems contain non-linear components and thus exhibit switch-like behaviors. Therefore, it is essential to study the observability of non-linear systems. Even though most biological phenomena manifest themselves in a continuous domain, binary expression has shown promising and useful results [3, 4]. The Boolean network (BN) is one of the most studied mathematical models for genetic networks [5, 6], in which the state of each gene is represented by 0 (off) or 1 (on). Observability of BNs has been widely studied [7, 8, 9]. However, it is impossible in most cases to observe all internal states from a small set of sensor nodes because a BN is a highly non-linear network [9]. Therefore, another approach has been proposed: discrimination of attractors, where an attractor is a collection of state cycles. Attractors are classified into singleton attractors and periodic attractors, where the former and the latter correspond to statically steady states and periodically steady states, respectively. The purpose of discrimination of attractors is to determine the minimum set of sensor nodes required to discriminate all given attractors. Since attractors are often interpreted as cell types [10], the discrimination problem corresponds to selecting the minimum number of genes needed to identify types of cells (e.g., types of cancers), which is closely related to the selection of biomarkers or marker genes, a very important topic in the biological and medical sciences [11, 12]. This discrimination problem was proposed in [13] and has been extensively studied [14]. All of these results assume clean input data.
However, gene expression noise is inevitable due to environmental fluctuations and the stochasticity of biochemical reactions such as transcription, chromatin remodeling, and post-translational regulation [15, 16, 17]. Therefore, a robust discrimination model is essential for robust classification of cell types. To this end, we reformulate the discrimination problem on BNs by assuming that the number of noisy nodes is bounded by a number K, where this number is closely related to the Hamming distance, a standard distance measure for binary vectors.

In this paper, we first consider the discrimination problem for attractors with noisy nodes and present an exact algorithm. A polynomial-time approximation algorithm is also proposed in order to balance the trade-off between the size of the target set and the overall time complexity. Discrimination of singleton attractors with noisy nodes is a special case of this general discrimination problem. In this special case, the distance between any pair of attractors equals the Hamming distance between the two attractors' states, and thus it might be possible to develop simpler and/or faster algorithms. Therefore, we subsequently present an exact algorithm and a polynomial-time approximation algorithm specialized for discrimination of singleton attractors with noisy nodes. Finally, we perform computational experiments using synthetic data and realistic biological data. We remark that in our study, we assume a set of attractors is given without knowing the internal structure of a BN. Although enumerating all the singleton attractors is an NP-hard problem, there are algorithms that find all the attractors for networks of moderate size [18, 19, 20]. Furthermore, we assume that those attractors can be given independently of BN structures since they will be directly obtained from the expression data of stable cells. A list of notations used in this paper is given in Table 1.
First, we give a mathematical formulation of finding a minimum discriminator for attractors with noisy nodes (MinDattNN). To this end, we define the distance between a pair of attractors (POA) by observing a set of nodes V^. This new definition is needed because a periodic attractor is a periodically steady time series. For a set V^ ⊆ V and an n-dimensional 0-1 vector v, v_V^ denotes the |V^|-dimensional vector consisting of the elements of v that correspond to V^. For example, if n = 5, v = [1, …, 0], and V^ contains three nodes of V, then v_V^ is the 3-dimensional vector formed by the entries of v at the positions in V^. Let Att_{i1} and Att_{i2} be two periodic attractors and let p(i) be the period of Att_i: Att_{i1} = [v(0), v(1), …, v(p(i1) - 1)], Att_{i2} = [w(0), w(1), …, w(p(i2) - 1)]. Define Ser(Att_{i1}, V^, t) as an infinite sequence of |V^|-dimensional vectors beginning from time step t:

    Ser(Att_{i1}, V^, t) = [v_V^(t), v_V^(t + 1), v_V^(t + 2), …].

Let Dist(Att_{i1}, Att_{i2}, V^) be the distance between Att_{i1} and Att_{i2} by observing V^:

    Dist(Att_{i1}, Att_{i2}, V^) = min_{t = 0, …, p - 1} { Σ_{v_i ∈ V^} I{Ser(Att_{i1}, {v_i}, 0), Ser(Att_{i2}, {v_i}, t)} },

where p = LCM(p(i1), p(i2)) (LCM means the least common multiple) and

    I{Ser(Att_{i1}, {v_i}, 0), Ser(Att_{i2}, {v_i}, t)} = 0 if Ser(Att_{i1}, {v_i}, 0) = Ser(Att_{i2}, {v_i}, t), and 1 otherwise.
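To make the definition concrete, the distance can be evaluated directly by aligning the two cycles over p = LCM(p(i1), p(i2)) and minimizing over all shifts t. The following is a minimal sketch (function and variable names are our own illustrations, not from the paper):

```python
from math import gcd

def dist(att1, att2, vhat):
    """Dist(att1, att2, vhat): minimum over shifts t of the number of
    observed nodes whose periodic time series differ.

    att1, att2: lists of states (each state a sequence of 0/1 values),
    vhat: list of node indices to observe."""
    p1, p2 = len(att1), len(att2)
    p = p1 * p2 // gcd(p1, p2)  # least common multiple of the two periods
    best = len(vhat)            # at most |vhat| nodes can differ
    for t in range(p):
        # count nodes j whose series Ser(att1, {j}, 0) and Ser(att2, {j}, t) differ
        differing = sum(
            1 for j in vhat
            if any(att1[s % p1][j] != att2[(s + t) % p2][j] for s in range(p))
        )
        best = min(best, differing)
    return best
```

For singleton attractors (period 1) this reduces to the Hamming distance between the two states restricted to vhat.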
Dist(Att_{i1}, Att_{i2}, V) = 0 if and only if the two attractors are identical. In the noisy case, it is hypothesized that noisy nodes vary across attractors; thus, for a pair of attractors, the values of at most 2K nodes may flip from 1 to 0 or vice versa. Therefore, in the worst case, a POA is discriminated only if at least 2K + 1 differing nodes are observed, that is, Dist(Att_{i1}, Att_{i2}, V^) > 2K.

Definition 1 (Minimum Discriminator for Attractors with Noisy Nodes [MinDattNN])
Input: A set of m attractors {Att_1, Att_2, …, Att_m}, where Att_i is a p(i) × n binary matrix (n is the number of genes), and an integer K denoting the maximum number of noisy nodes per attractor.
Output: A minimum cardinality set V^ of nodes such that Dist(Att_{i1}, Att_{i2}, V^) ≥ 2K + 1 holds for all i1, i2 with 1 ≤ i1 < i2 ≤ m.

Inspired by Lemma 1 in [14], we consider gene pairs. We first construct a C(m,2) × C(n,2) matrix D by D[T(i1, i2, m), T(j1, j2, n)] = Dist(Att_{i1}, Att_{i2}, {v_{j1}, v_{j2}}), where T(i1, i2, m) = (i1 - 1)m - i1(i1 - 1)/2 + i2 - i1. Then, for each POA (Att_{i1}, Att_{i2}) and a set V^, we construct an undirected graph G_D(T(i1, i2, m), V^) = <V^, E^>, where E^ = {e = (v_{j1}, v_{j2}) | v_{j1}, v_{j2} ∈ V^, D[T(i1, i2, m), T(j1, j2, n)] = 0}.

Table 1: A list of notations used in this paper.

Common notations:
- m: number of attractors
- n: number of genes
- M: number of POAs (i.e., M = C(m,2))
- x_{T(i1,i2,m)}, 1 ≤ T(i1,i2,m) ≤ M: a pair of attractors (POA), (Att_{i1}, Att_{i2})
- U = {x_l, 1 ≤ l ≤ M}: the set of POAs to be discriminated

Notations in MinDattNN:
- v: a 0-1 vector
- V: the set of nodes corresponding to genes
- v_V^, V^ ⊆ V: |V^|-dimensional vector consisting of the elements of v that correspond to V^
- Att_{i1} = [v(0), …, v(p(i1) - 1)], Att_{i2} = [w(0), …, w(p(i2) - 1)]: periodic attractors
- p(i): the period of Att_i
- Ser(Att_{i1}, V^, t) = [v_V^(t), v_V^(t + 1), v_V^(t + 2), …]: an infinite sequence of |V^|-dimensional vectors beginning from time step t
- Dist(Att_{i1}, Att_{i2}, V^): distance between Att_{i1} and Att_{i2} by observing V^
- D[T(i1,i2,m), T(j1,j2,n)]: distance of (Att_{i1}, Att_{i2}) by observing {v_{j1}, v_{j2}}
- s_{T(j1,j2,n)} = {x_{T(i1,i2,m)} | D[T(i1,i2,m), T(j1,j2,n)] ≠ 0}: the set of POAs that can be discriminated by {v_{j1}, v_{j2}}
- S = {s_1, s_2, …, s_{n(n-1)/2}}: the set of all candidate sensor pairs
- r_{T(i1,i2,m)} ≤ Dist(Att_{i1}, Att_{i2}, V^): a dummy variable recording the distance of a POA under the current discriminator
- G_D(T(i1,i2,m), V^): adjacency graph of the T(i1,i2,m)-th row of D restricted to the nodes in V^
- MC(G_D(T(i1,i2,m), V^)): a maximum clique of G_D(T(i1,i2,m), V^)

Notations in MinDSattNN:
- Att: an m × n attractor matrix
- J = {j_1, j_2, …, j_k}: a set of column/row indices
- A[i, -] (resp. A[-, j]): the i-th row (resp. j-th column) of A
- A[i, J] (resp. A[J, j]): the sub-matrix of A[i, -] (resp. A[-, j]) consisting of the j_1, j_2, …, j_k-th columns (resp. rows)
- H(x, y): Hamming distance between vectors x and y
- s_j = {x_{T(i1,i2,m)} | Att[i1, j] ≠ Att[i2, j]}: the set of POAs that can be discriminated by v_j
- S = {s_1, s_2, …, s_n}: the set of candidate sensor nodes
- r_{T(i1,i2,m)} = H(Att[i1, J], Att[i2, J]): a dummy variable recording the distance of a POA under the current discriminator

From Lemma 1 (given below), Dist(Att_{i1}, Att_{i2}, V^) can be calculated by computing a maximum clique of G_D(T(i1, i2, m), V^); we then need to decide whether Dist(Att_{i1}, Att_{i2}, V^) ≥ 2K + 1 holds for all i1, i2 with 1 ≤ i1 < i2 ≤ m. Note that the exceptional case in which all nodes in G_D(T(i1, i2, m), V^) are isolated must be handled according to whether the two attractors can be discriminated by observing every node in V^ (lines 7-9 in Algorithm 1). Example 1 illustrates the calculation of Dist(Att_{i1}, Att_{i2}, V^). The resulting algorithm is given in Algorithm 1, and its time complexity is analyzed in Theorem 1.

Lemma 1
Suppose that G_D(T(i1, i2, m), V^) has at least one edge, and let γ = |MC(G_D(T(i1, i2, m), V^))| be the number of nodes in a maximum clique of G_D(T(i1, i2, m), V^). Then Dist(Att_{i1}, Att_{i2}, V^) = |V^| - γ.

Proof:
Define

    Dif_{(i1,i2)}({v_j}, t) = 1 if Ser(Att_{i1}, {v_j}, 0) = Ser(Att_{i2}, {v_j}, t), and 0 otherwise.

Without considering the order of the nodes in V^, Dif_{(i1,i2)}(V^, t) has the form

    Dif_{(i1,i2)}(V^, t) = 00…0 11…1,

where the leading zeros correspond to the node set V^_0(t) (observed sequences differ) and the trailing ones to V^_1(t) (observed sequences coincide). From the definition, we know

    Dist(Att_{i1}, Att_{i2}, V^) = min_t |V^_0(t)| = |V^| - max_t |V^_1(t)|.

On the other hand, if v_{j1}, v_{j2} ∈ V^_1(t), then Dist(Att_{i1}, Att_{i2}, {v_{j1}, v_{j2}}) = 0; this means that all nodes in V^_1(t) form a clique. Maximizing |V^_1(t)| therefore amounts to computing the number of nodes in a maximum clique of G_D(T(i1, i2, m), V^), which completes the proof. ✷

Figure 1: (a) G_D(T(1, 2, 3), V) (left) and (b) G_D(T(2, 3, 3), V) (right). [Graph drawings omitted.]
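Lemma 1 can be checked on small instances with a brute-force clique search (a sketch under our own naming; Algorithm 1 below uses the Bron-Kerbosch algorithm instead). Here pair_dist(u, w) plays the role of the entry of D for the node pair {u, w}, and edges join nodes whose pairwise distance is 0:

```python
from itertools import combinations

def dist_via_clique(pair_dist, vhat):
    """Dist = |vhat| - gamma, where gamma is the maximum clique size of the
    graph joining u, w whenever pair_dist(u, w) == 0 (Lemma 1; assumes the
    graph has at least one edge or at least one non-discriminating node)."""
    vhat = list(vhat)
    edges = {frozenset(p) for p in combinations(vhat, 2) if pair_dist(*p) == 0}
    def is_clique(nodes):
        return all(frozenset(p) in edges for p in combinations(nodes, 2))
    # brute-force maximum clique (singletons always count, so gamma >= 1)
    gamma = max(len(c) for r in range(1, len(vhat) + 1)
                for c in combinations(vhat, r) if is_clique(c))
    return len(vhat) - gamma
```

The exceptional case in which every observed node discriminates the pair (where the algorithm below treats γ as 0) is handled separately in Algorithm 1.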
Algorithm 1: Exact algorithm for MinDattNN
Input: a set of attractors, a set of nodes V, an integer K
Output: a set of nodes V^
1:  calculate the matrix D
2:  for k = 2K + 1 to n do
3:    for all V^ ⊆ V with |V^| = k do
4:      cnt ← 0
5:      for all 1 ≤ i1 < i2 ≤ m do
6:        sig ← Σ_{v_{j1}, v_{j2} ∈ V^} D[T(i1, i2, m), T(j1, j2, n)]
7:        if sig = (|V^| - 1)|V^| then γ ← 0
8:        else γ ← |MC(G_D(T(i1, i2, m), V^))|
9:        end if
10:       if k - γ ≥ 2K + 1 then cnt ← cnt + 1
11:       else break
12:       end if
13:     end for
14:     if cnt = M then return V^
15:     end if
16:   end for
17: end for

Example 1
Let Att_1 = [00001, …] (a periodic attractor whose remaining states are omitted here), Att_2 = [10100], Att_3 = [11001], and V = {v_1, v_2, v_3, v_4, v_5}. Part of D is shown below, where gene pairs involving v_5 are omitted. G_D(T(1, 2, 3), V) and G_D(T(2, 3, 3), V) are shown in Fig. 1.

    D             (v_1,v_2)  (v_1,v_3)  (v_1,v_4)  (v_2,v_3)  (v_2,v_4)  (v_3,v_4)
    (Att_1,Att_2)     2          2          1          2          1          1
    (Att_1,Att_3)     2          2          1          2          1          1
    (Att_2,Att_3)     1          1          0          2          1          1

Case 1: Consider (Att_2, Att_3) and a three-node set V^ containing v_1 and v_4; then E^ = {(v_1, v_4)} ≠ ∅. From Lemma 1, Dist(Att_2, Att_3, V^) = 3 - |MC(G_D(T(2, 3, 3), V^))| = 1 holds. This can be verified directly, since Dist(Att_2, Att_3, V^) equals the Hamming distance between the states of the two attractors restricted to V^.

Case 2: Consider (Att_1, Att_2) and V^ = {v_1, v_2, v_3}; then E^ = ∅. In this case, Dist(Att_1, Att_2, V^) = 3 since the two attractors are discriminated by observing any node in V^, which corresponds to line 7 in Algorithm 1.

Case 3: Consider (Att_2, Att_3) and V^ = {v_2, v_3, v_4}; then E^ = ∅. In this case, Dist(Att_2, Att_3, V^) = 3 - |MC(G_D(T(2, 3, 3), V^))| = 2 since the two attractors look identical by observing v_4 but are discriminated by observing v_2 or v_3, which corresponds to line 8 in Algorithm 1.

Finally, we can see that V^ = {v_1, v_2, v_3, v_5} is a solution of MinDattNN for K = 1, because the distance between any attractor pair is then 3 or 4.

Theorem 1
Algorithm 1 computes an optimal discriminator V^* in O(3^{|V^*|/3} n^{|V^*|} m^2 + m^2 n^2 p) time, where p = LCM{p(1), p(2), …, p(m)}.

Proof:
The correctness of the algorithm follows from Lemma 1. D can be calculated in O(m^2 n^2 p) time by a naive algorithm. We can apply the Bron-Kerbosch algorithm to calculate the maximum clique, whose time complexity is O(3^{n/3}) for graphs with n nodes. Therefore, the total time complexity is O(3^{|V^*|/3} n^{|V^*|} m^2 + m^2 n^2 p), where |V^*| is the minimum number of needed nodes. ✷

Algorithm 2 is a greedy-type approximation algorithm running in O(poly(m, n, p)) time, which is much more efficient than Algorithm 1. Here, we introduce some notation. x_{T(i1,i2,m)} denotes a POA (Att_{i1}, Att_{i2}), and s_{T(j1,j2,n)} = {x_{T(i1,i2,m)} | D[T(i1,i2,m), T(j1,j2,n)] ≠ 0}. For instance, x_1 denotes (Att_1, Att_2) and s_1 = {x_1, x_2, x_3} in Example 1. Notice that we adopt r_{T(i1,i2,m)} to record the distance between Att_{i1} and Att_{i2}, and r_{T(i1,i2,m)} ≤ Dist(Att_{i1}, Att_{i2}, V^) holds, where V^ is the current discriminator set. In this greedy algorithm, a pair of nodes is added in each iteration instead of a single node. This is because there exist cases in which the single-node addition strategy fails (Proposition 1), whereas the node-pair addition strategy always finds a feasible solution (Proposition 2). An illustrative example is given in Example 2.

Algorithm 2
Approximation algorithm for MinDattNN
Input: U = {x_1, x_2, …, x_{C(m,2)}}, S = {s_1, s_2, …, s_{C(n,2)}}, an integer K
Output: a set of nodes V^
1:  calculate the matrix D
2:  V^ ← ∅; r_l ← 0 for 1 ≤ l ≤ M
3:  while U ≠ ∅ and S ≠ ∅ do
4:    find s_{T(j1,j2,n)} ∈ S maximizing |s_{T(j1,j2,n)} ∩ U|
5:    V^ ← V^ ∪ {v_{j1}, v_{j2}}; S ← S - {s_{T(k,k',n)} | k ∈ {j1, j2} or k' ∈ {j1, j2}}
6:    for all x_l ∈ s_{T(j1,j2,n)} with r_l < 2K + 1 do
7:      r_l ← r_l + D[l, T(j1, j2, n)]
8:      if r_l ≥ 2K + 1 then U ← U - {x_l}
9:      end if
10:   end for
11: end while
12: if U ≠ ∅ then V^ ← V
13: end if
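The pair-addition strategy above can be sketched compactly for the degenerate case of singleton attractors, where the entry of D for a node pair is simply how many of the two observed coordinates differ (names and the fallback behavior are illustrative assumptions, not from the paper):

```python
from itertools import combinations

def greedy_pair_discriminator(atts, K):
    """Greedy pair-addition sketch in the spirit of Algorithm 2, restricted
    to singleton attractors. atts: list of equal-length 0/1 tuples."""
    n = len(atts[0])
    need = 2 * K + 1
    poas = list(combinations(range(len(atts)), 2))
    r = {p: 0 for p in poas}                    # distance lower bounds r_l
    U = set(poas)                               # POAs still to discriminate
    S = set(combinations(range(n), 2))          # candidate node pairs

    def d(poa, pair):                           # entry D[poa, pair]
        (i1, i2), (j1, j2) = poa, pair
        return (atts[i1][j1] != atts[i2][j1]) + (atts[i1][j2] != atts[i2][j2])

    vhat = set()
    while U and S:
        # pick the pair discriminating the most still-undiscriminated POAs
        best = max(S, key=lambda pr: sum(1 for p in U if d(p, pr) > 0))
        j1, j2 = best
        vhat |= {j1, j2}
        S = {pr for pr in S if j1 not in pr and j2 not in pr}
        for p in list(U):
            r[p] += d(p, best)
            if r[p] >= need:
                U.remove(p)
    return vhat if not U else set(range(n))     # fall back to all nodes
```

Note that two nodes are committed per iteration, so the returned set can be up to twice the size a single-node strategy would ideally reach; Propositions 1 and 2 explain why the pair strategy is used nonetheless.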
Proposition 1
Suppose that (Att_{i1}, Att_{i2}) is a pair of attractors and V^ is a discriminator set such that Dist(Att_{i1}, Att_{i2}, V^) = N. Then it is possible that for any node v_j ∈ V - V^, Dist(Att_{i1}, Att_{i2}, V^ ∪ {v_j}) = N, but there exists v_i ∈ V^ such that Dist(Att_{i1}, Att_{i2}, {v_i, v_j}) = 0.

Proof: Let Att_1 = [00101, …] and Att_2 = [00010, …] (periodic attractors whose remaining states are omitted here), with V = {v_1, …, v_5}. Then we have Dist(Att_1, Att_2, V^) = 1 for a suitable three-node set V^, and the distance remains 1 for the four-node extensions V^ ∪ {v_j}; on the other hand, Dist(Att_1, Att_2, {v_i, v_j}) = 0 for certain pairs with v_i ∈ V^. ✷

Proposition 2
If (Att_{i1}, Att_{i2}) is a pair of attractors and V^ is a discriminator such that Dist(Att_{i1}, Att_{i2}, V^) = N, then for any pair of nodes (v_{j1}, v_{j2}) such that v_{j1} ∉ V^, v_{j2} ∉ V^, and x_{T(i1,i2,m)} ∈ s_{T(j1,j2,n)}, Dist(Att_{i1}, Att_{i2}, V̄) ≥ N + D[T(i1,i2,m), T(j1,j2,n)] holds, where V̄ = V^ ∪ {v_{j1}, v_{j2}}.

Table 2: Example of execution of Algorithm 2.
    step 0: U = {x_1, x_2, x_3}; S = {s_1, …, s_15}; V^ = ∅; r = [0, 0, 0]
    step 1: U = {x_1, x_2, x_3}; |S| = 6; |V^| = 2; r_1 = 1
    step 2: |U| = 2; |S| = 1; |V^| = 4; r_1 = 2
    step 3: U = ∅; S = ∅; |V^| = 6; r_1 = 3

Proof:
Let

    Dif_{(i1,i2)}(V^, t) = 00…0 11…1

have the same form as in the proof of Lemma 1 (the 0-block over V^_0(t), the 1-block over V^_1(t)), and let

    β(t) = Σ_{v_i ∈ V^} I{Ser(Att_{i1}, {v_i}, 0), Ser(Att_{i2}, {v_i}, t)}.

Then it is obvious that β(t) = |V^_0(t)|. In the proof of Lemma 1, we established the relationship Dist(Att_{i1}, Att_{i2}, V^) = min_t β(t). Let α = D[T(i1,i2,m), T(j1,j2,n)]. Since x_{T(i1,i2,m)} ∈ s_{T(j1,j2,n)}, α is either 1 or 2. This means

    α = min_t α(t) = min_t Σ_{v_i ∈ {v_{j1}, v_{j2}}} I{Ser(Att_{i1}, {v_i}, 0), Ser(Att_{i2}, {v_i}, t)}.

Similarly, consider

    β̄(t) = Σ_{v_i ∈ V̄} I{Ser(Att_{i1}, {v_i}, 0), Ser(Att_{i2}, {v_i}, t)};

then Dist(Att_{i1}, Att_{i2}, V̄) = min_t β̄(t). Obviously, we have min_t β̄(t) ≥ min_t β(t) + min_t α(t), and equality is achieved when β(t) and α(t) are minimized at the same t, which implies Dist(Att_{i1}, Att_{i2}, V̄) ≥ N + D[T(i1,i2,m), T(j1,j2,n)]. ✷

Example 2
Three attractors are given by Att_1 = [010101, …], Att_2 = [010011, …], and Att_3 = [010001, …] (periodic attractors over n = 6 nodes; the remaining states are omitted here). A detailed run of Algorithm 2 is shown in Table 2, where r = [r_1, r_2, r_3]. After the 3rd step, all nodes have been added to V^ and r_1 = 3 holds, and the algorithm returns V^ since U = ∅.

Besides, it is difficult to analyze the approximation factor in general. Therefore, we add the condition
Dist(Att_{i1}, Att_{i2}, V^_{j-1}) + Dist(Att_{i1}, Att_{i2}, V^* - V^_{j-1}) ≥ 2K + 1 to obtain a guaranteed approximation factor, as shown below. Even though it is difficult to test whether the condition is satisfied before running the approximation algorithm, numerical experiments suggest that it is satisfied in most cases.

Theorem 2
Let V^_j denote the discriminator set obtained after the j-th iteration of Algorithm 2 and let V^* be an optimal solution. Suppose that Dist(Att_{i1}, Att_{i2}, V^_{j-1}) + Dist(Att_{i1}, Att_{i2}, V^* - V^_{j-1}) ≥ 2K + 1 is satisfied for each j. Then Algorithm 2 is a (ln(M(2K + 1)) + 1)-factor polynomial-time approximation algorithm for MinDattNN (M = C(m,2)).

Proof:
Firstly, we consider the approximation ratio when the algorithm terminates with U = ∅. Let t(k) be the index such that the pair of nodes s_{t(k)} is chosen at the k-th iteration. Let V^_{j-1} denote the discriminator set after the (j-1)-th iteration, and let U_j be the set U at the j-th iteration. We assume without loss of generality that V^_{j-1} = {v_1, v_2, …, v_{2(j-1)}} holds. In each iteration a pair of nodes is added, so the distance for a POA can increase by at most 2 per iteration; combining this with the assumed condition and an argument analogous to the proof of Theorem 3, we have

    |V^| / |V^*| ≤ 1 + 1/2 + … + 1/(M(2K + 1)) ≤ ln(M(2K + 1)) + 1.

On the other hand, if the algorithm terminates with S = ∅, the bound on the approximation factor only becomes smaller. Therefore, ln(M(2K + 1)) + 1 gives an upper bound on the approximation ratio.

Next, we analyze the time complexity. Let p = LCM{p(1), p(2), …, p(m)}. The matrix D can be calculated in O(m^2 n^2 p) time, and each iteration of the while loop takes time polynomial in m and n. Therefore, the theorem holds. ✷

As we mentioned before, discrimination of singleton attractors with noisy nodes is a special case of the discrimination problem of the previous section. In this case, each attractor
Att_i is a binary vector, and the distance of a POA (Att_{i1}, Att_{i2}) degenerates to H(Att_{i1}, Att_{i2}), where H(x, y) is the Hamming distance between binary vectors x and y. Therefore, we put all attractors in a binary m × n matrix Att in which each row represents a singleton attractor. Moreover, for a matrix A, let A[i, -] (resp., A[-, j]) denote the i-th row (resp., j-th column). Similarly, for a matrix A and a set of column (resp., row) indices J = {j_1, …, j_k}, A[i, J] (resp., A[J, j]) denotes the submatrix of A[i, -] (resp., A[-, j]) consisting of the j_1, …, j_k-th columns (resp., rows). [14] considered the clean (i.e., noise-free) version of the problem: the task was to find a minimum set of column indices J such that Att[i1, J] ≠ Att[i2, J] holds for all i1, i2 with 1 ≤ i1 < i2 ≤ m. In the noisy case, it is hypothesized that noisy nodes vary across attractors; thus at most 2K nodes are unreliable for a POA, and hence a POA is discriminated only if H(Att[i1, -], Att[i2, -]) > 2K. Hence, we define MinDSattNN as follows.
Definition 2 (Minimum Discriminator for Singleton Attractors with Noisy Nodes [MinDSattNN])
Input: A set of singleton attractors represented by an m × n binary matrix Att and an integer K denoting the maximum number of noisy nodes per attractor.
Output: A minimum cardinality set J of columns such that H(Att[i1, J], Att[i2, J]) ≥ 2K + 1 holds for all i1, i2 with 1 ≤ i1 < i2 ≤ m.

To solve
MinDSattNN, we develop an integer programming (IP)-based exact method. To this end, we construct a matrix C_att of size C(m,2) × n by comparing every two rows of Att:

    C_att[T(i1, i2, m), j] = 1 if Att[i1, j] ≠ Att[i2, j], and 0 otherwise.

In MinDattNN, the matrix D is constructed by calculating the distance of each POA observed on a pair of nodes, whereas in the singleton case observing a single node suffices. Let M denote C(m,2), and let y = [y_1, y_2, …, y_n] be a vector in which y_j takes the value 1 (j ∈ J) or 0 (j ∉ J). Then MinDSattNN can be formulated as a typical IP problem:

    minimize   1 · y^T
    subject to C_att[i, -] · y^T ≥ 2K + 1  (i = 1, 2, …, M),
               y_j ∈ {0, 1}  (j = 1, 2, …, n),

where 1 = [1, 1, …, 1] (n ones). Accordingly, MinDSattNN can be transformed into an integer programming problem with n binary variables and O(m^2) constraints. It should be noted that existing IP solvers (e.g., intlinprog in MATLAB) take exponential time in the worst case, and thus this IP-based method takes exponential time in the worst case. However, this is reasonable because both MinDattNN and MinDSattNN include the problem of discrimination of singleton attractors (MinDiscSatt) [14], which is known to be NP-hard, as a special case and are thus NP-hard. An illustrative example of the IP process is given in Example 3.
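On small instances, the IP can be emulated by exhaustive search over column subsets. The following is a hypothetical stand-in for an IP solver such as intlinprog, useful only as a sanity check of the formulation (an IP solver scales far better):

```python
from itertools import combinations

def min_discriminator_singleton(att, K):
    """Exact solver for MinDSattNN by brute force.

    att: list of m equal-length 0/1 tuples (singleton attractors).
    Returns a minimum-cardinality set J of column indices such that every
    pair of attractors differs in at least 2K + 1 of the chosen columns,
    or None if no such set exists."""
    n = len(att[0])
    pairs = list(combinations(range(len(att)), 2))
    # a discriminator needs at least 2K + 1 columns, so start there
    for size in range(2 * K + 1, n + 1):
        for J in combinations(range(n), size):
            if all(sum(att[i1][j] != att[i2][j] for j in J) >= 2 * K + 1
                   for i1, i2 in pairs):
                return set(J)
    return None
```

The search is exponential in n, mirroring the worst-case behavior noted above for IP solvers.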
Example 3
Let Att be a 3 × 8 binary matrix, i.e., there exist three singleton attractors Att[1, -] = [1, …], Att[2, -] = [1, …], and Att[3, -] = [1, …]. Then C_att is obtained by comparing every two rows, and the parameters passed to intlinprog (in MATLAB) are f = [1, 1, 1, 1, 1, 1, 1, 1], intcon = [1, 2, 3, 4, 5, 6, 7, 8], lb = [0, 0, 0, 0, 0, 0, 0, 0], and ub = [1, 1, 1, 1, 1, 1, 1, 1]; we then obtain the optimal objective value 5 and a solution y with exactly five entries equal to 1, which means that J consists of five column indices.

MinDSattNN is a special case of the set multi-cover problem [21], which is NP-hard. Therefore, in order to balance the trade-off between the size of a target set J and the overall time complexity, we design a simple greedy algorithm, Algorithm 3, based on [21]. As shown in Theorem 3, it has a guaranteed approximation ratio of ln(M(2K + 1)) + 1. Notice that usually m ≪ n and K ≪ n, and thus the ratio is acceptable, as the computational time can be reduced significantly. Similarly to before, x_{T(i1,i2,m)} denotes a POA, and s_j = {x_{T(i1,i2,m)} | Att[i1, j] ≠ Att[i2, j]} denotes the set of POAs that can be discriminated by v_j; for instance, in Example 3, x_1 denotes (Att_1, Att_2), and each s_j collects the POAs whose rows differ in column j. An example of this algorithm is given in Example 4.

Algorithm 3
Approximation algorithm for MinDSattNN
Input: U = {x_l, 1 ≤ l ≤ M}, S = {s_1, s_2, …, s_n}, an integer K
Output: a set of column indices J
1: J ← ∅; r_l ← 0 for 1 ≤ l ≤ M
2: while U ≠ ∅ do
3:   find s_j ∈ S with maximum |s_j ∩ U|; J ← J ∪ {j}; S ← S - {s_j}
4:   for all x_l ∈ s_j with r_l < 2K + 1 do
5:     r_l ← r_l + 1
6:     if r_l = 2K + 1 then U ← U - {x_l}
7:     end if
8:   end for
9: end while

Example 4
The attractors are the same as in Example 3, and detailed execution steps are shown in Table 3. In this case, the final target set consists of five nodes.

Table 3: Example of execution of Algorithm 3.
    step 0: U = {x_1, x_2, x_3}; |S| = 8; J = ∅; r = [0, 0, 0]
    step 1: U = {x_1, x_2, x_3}; |S| = 7; |J| = 1; r_1 = 1
    step 2: U = {x_1, x_2, x_3}; |S| = 6; |J| = 2; r_1 = 2
    step 3: |U| = 2; |S| = 5; |J| = 3; r_1 = 3
    step 4: |U| = 1; |S| = 4; |J| = 4; r_1 = 3
    step 5: U = ∅; |S| = 3; |J| = 5; r_1 = 3
Theorem 3
Algorithm 3 is a (ln(M(2K + 1)) + 1)-factor polynomial-time approximation algorithm for MinDSattNN.

Proof:
Let t(j) be the index such that s_{t(j)} is chosen at the j-th iteration. We say that the price for solving MinDSattNN is N if N nodes are needed for discriminating the given attractors, each with at most K noisy nodes. In each iteration, a column index is added to J, and a price of 1 is added to the total cost. We then assign an average price P(x_l, j) to each x_l ∈ U_j at the j-th iteration, where U_j is the set U at the j-th iteration:

    P(x_l, j) = 1 / |s_{t(j)} ∩ U_j| if x_l ∈ s_{t(j)} ∩ U_j, and 0 otherwise.

We call |s_{t(j)} ∩ U_j| the coverage power of s_{t(j)}. After termination of Algorithm 3, the total number of nodes in the discriminator set is |J|, from which |J| = Σ_{j=1}^{|J|} Σ_{l=1}^{M} P(x_l, j) holds. Here we use the key inequality (A.1),

    P(x_l, j) ≤ |J*| / ( M(2K + 1) - Σ_{k=1}^{j-1} |s_{t(k)} ∩ U_k| ),

where we define the summation in the denominator to be 0 when j = 1 because s_{t(0)} = ∅; the proof is given in Appendix A. Let J* be an optimal solution of MinDSattNN. Then we have

    |J| = Σ_{j=1}^{|J|} Σ_{l=1}^{M} P(x_l, j)
        ≤ Σ_{j=1}^{|J|} |U_j ∩ s_{t(j)}| · |J*| / ( M(2K + 1) - Σ_{k=1}^{j-1} |s_{t(k)} ∩ U_k| )
        ≤ |J*| Σ_{j=1}^{|J|} Σ_{s=i_s}^{i_e} 1 / ( M(2K + 1) - s )
        ≤ |J*| ( 1 + 1/2 + … + 1/(M(2K + 1)) ),

where i_s = Σ_{k=1}^{j-1} |s_{t(k)} ∩ U_k| and i_e = Σ_{k=1}^{j} |s_{t(k)} ∩ U_k| - 1, and the second inequality holds because 1/(M(2K + 1) - i_s) ≤ 1/(M(2K + 1) - s) for every i_s ≤ s ≤ i_e. Therefore, the approximation ratio is bounded by

    |J| / |J*| ≤ 1 + 1/2 + … + 1/(M(2K + 1)) < ln(M(2K + 1)) + 1.

Table 4: Numerical results on discrimination of attractors.
    n     m  K  len  Time of Algorithm 1 (s)  Time of Algorithm 2 (s)  Approximation ratio
    100   3  1  3    63.6160                  0.0100                   2
    100   5  1  5    209.09                   0.0100                   2
    100   5  2  5    260982                   1.3                      1.7
    1000  3  3  1    9902.3                   0.0100                   2
Table 5: Numerical results on discrimination of singleton attractors.
    n      m  K   Time of IP (s)  Time of Algorithm 3 (s)  Approximation ratio
    50     5  1   0.109           0.003                    1.3333
    50     5  3   0.045           0.002                    1.1538
    500    5  1   0.137           0.001                    1.3333
    500    5  3   661.4           0.000                    1.2000
    5000   5  3   7063            0.000                    1.2000
    20000  5  3   7064            0.080                    1.2000
    20000  5  5   6855            0.017                    1.1000
    20000  5  10  7168            0.400                    1.1000
Hereafter, we analyze the time complexity. The number of iterations of the while loop is bounded by min{|J*| · (ln(M(2K + 1)) + 1), n}, and each iteration of the while loop takes at most O(m^2 n log n) time using merge sort. Therefore, the total cost of Algorithm 3 is O(m^2 n log n · min{|J*| · (ln(M(2K + 1)) + 1), n}), which is polynomial in m and n. ✷

All numerical experiments were conducted using MATLAB on a PC with a dual-core 3.4 GHz processor and 8 GB RAM. Firstly, for each of
MinDattNN and
MinDSattNN, we conducted computational experiments using simulated data: we randomly generated sets of attractors, repeated each experiment 10 times per parameter set, and recorded the average time and the maximum approximation ratio. For the discrimination of attractors, the results are listed in Table 4, where len denotes the maximum length of the attractors. It is seen that Algorithm 2 is much faster than Algorithm 1 and that the approximation ratios are not large. For discrimination of singleton attractors, it is seen from Table 5 that Algorithm 3 is much faster than the IP-based method, and the approximation ratios are much smaller than ln(M(2K + 1)) + 1.

Next, we examined the efficiency of the IP-based method and Algorithm 3 for the clean case of MinDSattNN by comparing them with a previous method,
SolveMinDiscSatt [14]. It is seen from Table 6 that the new methods are much more efficient even in the clean case. Then, we conducted computational experiments using BN models of the following four real
Table 6: Numerical results on the clean case of discrimination of singleton attractors.
    n      m  Time of SolveMinDiscSatt (s)  Time of IP (s)  Time of Algorithm 3 (s)  Approximation ratio
    100    5  0.4094                        0.0657          0.0125                   1
    1000   5  7.2797                        0.2703          0.0031                   1
    10000  5  1571.2                        51.6            0.0000                   1
    20000  5  5128.2                        125.5           0.0000                   1

Table 7: Results on the four real biological networks (periodic attractors are indicated by a list of their periods in the m column).
         n   m          CPU time (s): Exact / Approx.  Identified markers: Exact / Approx.
    (1)  40  3+(7)      612.11 / 0.04                  ZAP70, TCR, SLP76, SEK, RLK / ZAP70, TCR, SLP76, SEK, RLK, TCRphosp
    (2)  90  (4, …, 4)  … / …                          CCND2, HDAC9, INPP5D, … / CAV1, CCND2, HDAC9, INPP5D, PAK, …
    (3)  60  5          0.52 / 0.53                    wg1, WG1, EN1, PTC1, PH1, ptc2, PTC2, SMO3 / wg1, WG1, en1, EN1, hh1, en2, EN2, hh2, PTC2
    (4)  9   3          0.05 / 0.06                    Rb, TELase, Cyclin, E2F, ESE2 / p53, p16, Rb, TELase, Snai2, E2F

biological processes, where K = 1 was used in all cases: (1) a logical model analyzing T-cell activation [22]; (2) IGVH mutational status in chronic lymphocytic leukemia [23]; (3) segment polarity genes in Drosophila melanogaster [24]; (4) tumorigenic transformation of human epithelial cells [25]. Note that the algorithms for
MinDattNN were used for (1) and (2), whereas those for
MinDSattNN were used for (3) and (4). The results are summarized in Table 7. In this table, n and m denote the numbers of genes and attractors, where the periodic attractors are shown bya list of their periods.In the BN model (1), there exist 9 periodic attractors [22]. However, distances among someattractors are less than or equal to 2. Therefore, we discarded such attractors. Finally, we onlykept attractors 1, 2, 3 and 9 for verification It is known that ZAP70, TCR, SLP76, SEK, andRLK play important roles in T-cell development and lymphocyte activation or development ofthe nervous system, while genes ZAP70, TCR are the ligand for TCRphosp. Therefore, theapproximation algorithm may have found more important genes. In addition, the approximationalgorithm was much faster than the exact algorithm. These facts suggest the usefulness of theapproximation algorithm.In the BN model (2), the exact and approximation algorithms identified 7 and 10 genes, re-spectively. Therefore, the resulting approximation ratio is 10 /
7, which is much smaller thanln( M (2 K + 1)) + 1. Besides, there are 6 common genes identified, which encode proteins in-volved in critical biological processes or diseases like human child lymphoblastic leukemia, histonedeacetylase and so on. Among the other identified genes, AKAP12 gene functions in binding tothe regulatory subunit of PAK and confining the holoenzyme to discrete locations within the cell.AEBP1 gene encodes proteins that may function as a transcriptional repressor and play a role inadipogenesis and smooth muscle cell differentiation. AICDA encodes a RNA-editing deaminase.AKT3 encodes proteins known to be regulators of cell signaling in response to insulin and growthfactors. These facts suggest the usefulness of both algorithms. However, the approximation algo-rithm was much faster than the exact one. Therefore, these results suggest again the usefulnessof the approximation algorithm.In the BN model (3), there are 10 singleton attractors [24]. However, since the distancesamong attractors 3, 4, and 6 are less than 3, and the distances among attractors 7, 8, 9, and 10are also less than 3, we only keep the five attractors: 1, 2, 3, 6, 7. There are 60 nodes in this BN,including 5 segment polarity genes (en, wg, ptc, ci, hh) and their proteins (EN, WG, PTC, CI,CIR, CIA, SMO, HH), one pair-rule gene (slp) and its protein SLP in one parasegment primordia(4 cells). Most markers identified by two algorithms are the same except that gene hh is notidentified by the exact algorithm and transcription factors PTC and SMO are not identified bythe approximation algorithm. It has been verified experimentally that binding of HH (protein ofhh) would remove the inhibition of SMO. It is seen from Table 7 that the approximation ratio byAlgorithm 3, is 9 / . epithelial cells , senescent cells and mesenchymal stem-like cells .The obtained molecular players can be interpreted as master regulators for each cell. 
It is known that cells with the mesenchymal stem-like phenotype have a strong potential of transforming into carcinomas. Snai2 was chosen as a master regulator by the approximation algorithm (but not by the IP-based method). Taking a closer look at this molecule, we can see that the activation of Snai2 enables cells to sustain proliferative signals and to evade growth suppressors by undergoing a de-differentiation process. Thus it is activated in mesenchymal stem-like cells but not in the other two kinds of cells. This fact suggests the usefulness of the approximation algorithm.

Acknowledgements

XQC was partially supported by the National Science Foundation of China under grant numbers 11801434 and 3115200128. WKC was partially supported by Hong Kong RGC GRF grant no. 17301519, the National Natural Science Foundation of China under grant number 11671158, and the IMR and RAE Research Fund from the Faculty of Science, HKU. TA was partially supported by JSPS KAKENHI grant number 18H04413.
References

[1] G. Yan, G. Tsekenis, B. Barzel, J.-J. Slotine, Y.-Y. Liu, and A.-L. Barabási, "Spectrum of controlling and observing complex networks," Nature Physics, vol. 11, no. 9, p. 779, 2015.
[2] Y.-Y. Liu, J.-J. Slotine, and A.-L. Barabási, "Observability of complex systems," Proceedings of the National Academy of Sciences, vol. 110, no. 7, pp. 2460–2465, 2013.
[3] I. Shmulevich and W. Zhang, "Binary analysis and optimization-based normalization of gene expression data," Bioinformatics, vol. 18, no. 4, pp. 555–565, 2002.
[4] S. Watterson, S. Marshall, and P. Ghazal, "Logic models of pathway biology," Drug Discovery Today, vol. 13, no. 9–10, pp. 447–456, 2008.
[5] S. Kauffman, "Homeostasis and differentiation in random genetic control networks," Nature, vol. 224, no. 5215, p. 177, 1969.
[6] S. A. Kauffman, The Origins of Order: Self-Organization and Selection in Evolution. OUP USA, 1993.
[7] D. Cheng, H. Qi, and Z. Li, Analysis and Control of Boolean Networks: A Semi-tensor Product Approach. Springer Science & Business Media, 2010.
[8] D. Laschov, M. Margaliot, and G. Even, "Observability of Boolean networks: A graph-theoretic approach," Automatica, vol. 49, no. 8, pp. 2351–2362, 2013.
[9] R. Li, M. Yang, and T. Chu, "Controllability and observability of Boolean networks arising from biology," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 25, no. 2, p. 023104, 2015.
[10] S. Huang, "Gene expression profiling, genetic networks, and cellular states: an integrating concept for tumorigenesis and drug discovery," Journal of Molecular Medicine, vol. 77, no. 6, pp. 469–480, 1999.
[11] S. S. Shen-Orr, R. Tibshirani, P. Khatri, D. L. Bodian, F. Staedtler, N. M. Perry, T. Hastie, M. M. Sarwal, M. M. Davis, and A. J. Butte, "Cell type-specific gene expression differences in complex tissues," Nature Methods, vol. 7, no. 4, p. 287, 2010.
[12] D. Bell, D. Roberts, M. Kies, P. Rao, R. S. Weber, and A. K. El-Naggar, "Cell type-dependent biomarker expression in adenoid cystic carcinoma: biologic and therapeutic implications," Cancer, vol. 116, no. 24, pp. 5749–5756, 2010.
[13] Y. Qiu, X. Cheng, W.-K. Ching, H. Jiang, and T. Akutsu, "On observability of attractors in Boolean networks," in Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on, pp. 263–266, IEEE, 2015.
[14] X. Cheng, T. Tamura, W.-K. Ching, and T. Akutsu, "Discrimination of singleton and periodic attractors in Boolean networks," Automatica, vol. 84, pp. 205–213, 2017.
[15] G. Chalancon, C. N. Ravarani, S. Balaji, A. Martinez-Arias, L. Aravind, R. Jothi, and M. M. Babu, "Interplay between gene expression noise and regulatory network architecture," Trends in Genetics, vol. 28, no. 5, pp. 221–232, 2012.
[16] M. I. Love, W. Huber, and S. Anders, "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2," Genome Biology, vol. 15, no. 12, p. 550, 2014.
[17] F. Wu, J. Shim, T. Gong, and C. Tan, "Orthogonal tuning of gene expression noise using CRISPR–Cas," Nucleic Acids Research, vol. 48, no. 13, pp. e76–e76, 2020.
[18] T. Akutsu, S. Kuhara, O. Maruyama, and S. Miyano, "A system for identifying genetic networks from gene expression patterns produced by gene disruptions and overexpressions," Genome Informatics, vol. 9, pp. 151–160, 1998.
[19] A. Veliz-Cuba, B. Aguilar, F. Hinkelmann, and R. Laubenbacher, "Steady state analysis of Boolean molecular network models via model reduction and computational algebra," BMC Bioinformatics, vol. 15, no. 1, p. 221, 2014.
[20] J. G. Zañudo and R. Albert, "An effective network reduction approach to find the dynamical repertoire of discrete dynamic networks," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 23, no. 2, p. 025111, 2013.
[21] S. Rajagopalan and V. V. Vazirani, "Primal-dual RNC approximation algorithms for (multi)-set (multi)-cover and covering integer programs," in Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science, pp. 322–331, IEEE, 1993.
[22] S. Klamt, J. Saez-Rodriguez, J. A. Lindquist, L. Simeoni, and E. D. Gilles, "A methodology for the structural and functional analysis of signaling and regulatory networks," BMC Bioinformatics, vol. 7, no. 1, p. 56, 2006.
[23] M. C. Álvarez-Silva, S. Yepes, M. M. Torres, and A. F. G. Barrios, "Proteins interaction network and modeling of IGVH mutational status in chronic lymphocytic leukemia," Theoretical Biology and Medical Modelling, vol. 12, no. 1, p. 12, 2015.
[24] R. Albert and H. G. Othmer, "The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster," Journal of Theoretical Biology, vol. 223, no. 1, pp. 1–18, 2003.
[25] L. F. Méndez-López, J. Davila-Velderrain, E. Domínguez-Hüttinger, C. Enríquez-Olguín, J. C. Martínez-García, and E. R. Alvarez-Buylla, "Gene regulatory network underlying the immortalization of epithelial cells," BMC Systems Biology, vol. 11, no. 1, p. 24, 2017.
A Proof of Key Inequality (A.1) in Theorem 3
Let $\alpha_j$ denote the average price in the $j$-th iteration, that is, $\alpha_j = P(x_l, j)$, and let $J^*$ denote an optimal discriminator set. When $j = 1$, we need to prove $\alpha_1 \le |J^*| / (M(2K+1))$. Note that in each iteration we choose a node with the maximum coverage power (equivalently, the minimum average price); thus $\alpha_1 \le 1/|s_{k'} \cap U|$ for every $k' \in J^*$, which implies $|J^*| \ge \alpha_1 \left( \sum_{k' \in J^*} |s_{k'} \cap U| \right)$. Let $y^*$ be the binary vector corresponding to $J^*$. Then $\sum_{k' \in J^*} |s_{k'} \cap U| \ge M(2K+1)$ follows from $C_{att}[i, -] \cdot (y^*)^T \ge 2K+1$ for $i = 1, 2, \ldots, M$. Thus the inequality is satisfied when $j = 1$.

Before presenting the proof for the general case, we define a special class of sets (multi-sets) by allowing a set to be a collection of objects in which an object may occur several times. For example, suppose $S_1 = \{x_1, x_1, x_1, x_2, x_2, x_3\}$ and $S_2 = \{x_1, x_3, x_4, x_4\}$. The cardinality $|S_i|$ of a multi-set $S_i$ is the number of its elements counted with multiplicity, so $|S_1| = 6$ and $|S_2| = 4$. Moreover, we let $S_1 \cup S_2 = \{x_1, x_1, x_1, x_1, x_2, x_2, x_3, x_3, x_4, x_4\}$ be the union of the two sets without removing duplications, $S_1 \cap S_2 = \{x_1, x_3\}$ be the intersection keeping each common element with the smaller multiplicity, and $S_1 - S_2 = \{x_1, x_1, x_2, x_2\}$ be the relative complement of $S_2$ in $S_1$, obtained by removing from $S_1$ each common element with its smaller multiplicity. We will apply these definitions in the following analysis.

Let $U' = \{\underbrace{x_1, x_1, \ldots, x_1}_{2K+1}, \ldots, \underbrace{x_M, x_M, \ldots, x_M}_{2K+1}\}$, where $|U'| = M(2K+1)$. Then the greedy algorithm can be rewritten as follows.

Algorithm 1 Greedy algorithm MAPMinDSattNN
Input: set of POAs $U'$, set of nodes $S$
Output: set of nodes $J$
$J \leftarrow \emptyset$
while $U' \neq \emptyset$ do
  Find $s_j \in S$ with maximum $|s_j \cap U'|$; $J \leftarrow J \cup \{j\}$; $S \leftarrow S - \{s_j\}$; $U' \leftarrow U' - s_j$;
end while

Let $J_{j-1} = \{t(1), t(2), \ldots, t(j-1)\}$ and $U'_j$ denote $J$ and $U'$ after the $(j-1)$-th iteration, and let $re(x^T_{(i_1,i_2,m)})$ denote the number of repetitions of $x^T_{(i_1,i_2,m)}$ in $U'_j$. Since $J_{j-1} \cup (J^* - J_{j-1})$ is a feasible solution (it contains the optimal solution $J^*$), it is easy to see that
$$H(Att[i_1, J^* - J_{j-1}], Att[i_2, J^* - J_{j-1}]) \ge re(x^T_{(i_1,i_2,m)}).$$
In the $j$-th iteration, Algorithm 3 chooses $s_{t(j)}$ with the maximum coverage power, which means that the average price is minimized. Then for every $k' \in J^* - J_{j-1}$ we have $\alpha_j \le 1/|s_{k'} \cap U'_j|$, which implies $\alpha_j \left( \sum_{k' \in J^* - J_{j-1}} |s_{k'} \cap U'_j| \right) \le \sum_{k' \in J^* - J_{j-1}} 1$. Thus we have
$$\alpha_j \le \frac{|J^* - J_{j-1}|}{\left| \left( \bigcup_{k' \in J^* - J_{j-1}} s_{k'} \right) \cap U'_j \right|}. \qquad (5)$$
Recall that $H(Att[i_1, J^* - J_{j-1}], Att[i_2, J^* - J_{j-1}]) \ge re(x^T_{(i_1,i_2,m)})$. This means that the number of repetitions of $x^T_{(i_1,i_2,m)}$, counted over all the sets $s_{k'}$ with $k' \in J^* - J_{j-1}$, is at least $re(x^T_{(i_1,i_2,m)})$; hence $\left( \bigcup_{k' \in J^* - J_{j-1}} s_{k'} \right) \cap U'_j = U'_j$, where the union is taken over multi-sets. From this and Ineq. (5), we have
$$\alpha_j \le \frac{|J^* - J_{j-1}|}{|U'_j|} \le \frac{|J^*|}{M(2K+1) - \sum_{k=1}^{j-1} |s_{t(k)} \cap U'_k|} = \frac{|J^*|}{M(2K+1) - \sum_{k=1}^{j-1} |s_{t(k)} \cap U_k|},$$
where $U_k$ is the set $U$ in the $k$-th iteration of Algorithm 3. The last equality holds because $s_{t(k)}$ contains no duplicated elements, and $U'_k$ and $U_k$ consist of the same elements when the number of duplications of each element is ignored. Thus the inequality is proved.
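As a concrete illustration, the multi-set operations and the greedy loop of Algorithm 1 can be sketched in standard-library Python with `collections.Counter`, whose `+`, `&`, and `-` operators match the multi-set union, intersection, and relative complement defined above. The node names and coverage sets below are hypothetical toy data, not the paper's benchmark instances; this is a minimal sketch of the greedy selection, not the authors' implementation.

```python
from collections import Counter

# Multi-set operations from the appendix, realized with collections.Counter:
#   S1 + S2 : union keeping all duplications
#   S1 & S2 : intersection keeping each common element with smaller multiplicity
#   S1 - S2 : relative complement removing shared elements with smaller multiplicity
S1 = Counter({"x1": 3, "x2": 2, "x3": 1})  # {x1,x1,x1,x2,x2,x3}, |S1| = 6
S2 = Counter({"x1": 1, "x3": 1, "x4": 2})  # {x1,x3,x4,x4},       |S2| = 4

def greedy_cover(universe, sets):
    """Greedy sketch of the loop in Algorithm 1: while uncovered POAs remain,
    pick the node whose coverage set has the largest multi-set intersection
    with the remaining universe (i.e., maximum coverage power)."""
    remaining = Counter(universe)
    candidates = dict(sets)
    chosen = []
    while remaining:
        best = max(candidates,
                   key=lambda j: sum((candidates[j] & remaining).values()))
        if not (candidates[best] & remaining):
            raise ValueError("remaining POAs cannot be covered")
        chosen.append(best)
        remaining -= candidates[best]  # multi-set relative complement
        del candidates[best]
    return chosen

# Toy instance: M = 2 POAs, K = 1, so each POA occurs 2K + 1 = 3 times in U'.
U_prime = Counter({"p1": 3, "p2": 3})
coverage = {
    "a": Counter({"p1": 2, "p2": 1}),
    "b": Counter({"p1": 1, "p2": 2}),
    "c": Counter({"p2": 1}),
}
```

Note that Counter subtraction discards non-positive multiplicities, which is exactly the behavior required by the relative-complement definition used in the proof.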