Collectively canalizing Boolean functions
Claus Kadelka, Benjamin Keilty, Reinhard Laubenbacher (2020)
Abstract
This paper studies the mathematical properties of collectively canalizing Boolean functions, a class of functions that has arisen from applications in systems biology. Boolean networks are an increasingly popular modeling framework for regulatory networks, and the class of functions studied here captures a key feature of biological network dynamics, namely that a subset of one or more variables, under certain conditions, can dominate the value of a Boolean function, to the exclusion of all others. These functions have rich mathematical properties to be explored. The paper shows how the number and type of such sets influence a function's behavior and defines a new measure for the canalizing strength of any Boolean function. We further connect the concept of collective canalization with the well-studied concept of the average sensitivity of a Boolean function. The relationship between Boolean functions and the dynamics of the networks they form is important in a wide range of applications beyond biology, such as computer science, and has been studied with statistical and simulation-based methods. But the rich relationship between structure and dynamics remains largely unexplored, and this paper is intended as a contribution to its mathematical foundation.
One of the great advances of biology in the twentieth century is the discovery of genes and their regulatory relationships, now increasingly described as gene regulatory networks that are amenable to a description by dynamic mathematical models. Traditionally, this has been done using systems of ordinary differential equations, one per gene in the network, based on the view of the network as a biochemical reaction network, subject to constraints, such as preservation of mass. An alternative view, arguably closer to biological thinking, is to represent gene regulatory networks as similar to logical switching networks, popular in engineering, employing ON/OFF state representations instead of continuously varying concentrations, introduced by S. Kauffman [1, 2]. This explains the increasing popularity of such models in biology, in particular, since in many cases detailed kinetic measurements are not readily available. A natural next question then is what the biological constraints are on the Boolean functions that occur in this way.

In the 1940s, C. Waddington introduced the concept of canalization in developmental biology as an explanation for the surprising stability of developmental processes in the face of varying environmental conditions [3]. S. Kauffman later adapted this concept and introduced a canalization concept for Boolean functions [4, 5]. A multitude of studies have shown that Boolean networks composed of canalizing functions exhibit more ordered dynamics than random ones, resulting in, e.g., fewer and shorter attractors as well as lower sensitivities to perturbations [4, 5, 6, 7, 8, 9, 10].

A canalizing function possesses at least one input variable such that, if this variable takes on a certain "canalizing" value, then the output value of the function is already determined, regardless of the values of the remaining input variables. If this variable takes on another, non-canalizing, value, and there is a second variable with this same property, the function is 2-canalizing. If k variables follow this pattern, the function is k-canalizing [11], and the number of variables that follow this pattern is called the canalizing depth of the function [12]. If the canalizing depth equals the number of inputs (i.e., if all variables follow the described pattern), the function is also called nested canalizing.

It is straightforward to see that any Boolean function can be represented as a polynomial over the field with two elements, first exploited in [13] to use tools from computational algebra for the inference of Boolean network models from experimental data. He and Macauley showed that any Boolean function can be written in a unique canonical form, from which the number of Boolean functions with a certain canalizing depth can be easily derived [11]. In addition, explicit formulas for the number of various types of Boolean and multi-state canalizing and nested canalizing functions have also been found [14, 15, 16, 17]. Given the stringency of the definition of canalization, it is not surprising that a random Boolean function in several variables is only rarely canalizing, let alone nested canalizing. It is thus remarkable that most functions found in published gene regulatory network models are indeed canalizing or even nested canalizing [18, 19, 20], suggesting that the canalization property does indeed capture an important property of the logic of gene regulatory networks.

Biological observations suggest that, in many cases, a given gene is regulated by a collection of other genes that jointly determine that gene's dynamics. Based on this phenomenon, less stringent definitions of canalization have been considered [21, 22]. Most Boolean functions exhibit some degree of canalization in the sense that a few variables taking on certain "canalizing" input values frequently suffice to determine the output of a function, regardless of the values of the remaining input variables.
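The informal notion used here — that fixing a small set of inputs may already determine a function's output — is easy to experiment with for small n. A minimal Python sketch (the helper name `is_determined` and the example function are ours, not from the paper):

```python
from itertools import product

def is_determined(f, n, fixed):
    """Return True if fixing the inputs in `fixed` (a dict {index: value},
    0-based) already determines the output of the n-variable function f,
    regardless of the values of the remaining variables."""
    outputs = {f(x) for x in product((0, 1), repeat=n)
               if all(x[i] == v for i, v in fixed.items())}
    return len(outputs) == 1

# f(x1, x2, x3) = x1 AND (x2 OR x3): fixing x1 = 0 alone forces f = 0,
# while fixing x2 = 0 alone does not determine the output.
f = lambda x: x[0] & (x[1] | x[2])
print(is_determined(f, 3, {0: 0}))  # True
print(is_determined(f, 3, {1: 0}))  # False
```

The brute-force check enumerates all 2^n inputs, which is only feasible for small n, but suffices to illustrate the definitions that follow.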
This phenomenon was first described and studied by Bassler et al., and has been termed collective canalization [21]. The amount of canalization a particular Boolean function exhibits is described by the set of numbers P_k, k = 0, 1, …, n, the proportions of k-dimensional input sets that are collectively canalizing. Another way to think of these numbers is as the probability that the output of the Boolean function is already determined if k randomly chosen inputs are fixed. Reichhardt and Bassler used results from group theory and isomer chemistry to classify all Boolean functions in n variables based on the set of numbers P_k, k = 0, 1, …, n [22]. In this paper, we study these numbers systematically.

In this section, we review some concepts and definitions, introduce the concept of canalization, and generalize it to collective canalization, following earlier work by Reichhardt and Bassler [22]. Throughout the paper, let ⊕ denote addition modulo 2 when used in a polynomial, and the "exclusive or" (XOR) function when used in a Boolean logical expression.

Definition 2.1.
A Boolean function f(x_1, …, x_n) is essential in the variable x_i if there exists an x ∈ {0,1}^n such that f(x) ≠ f(x ⊕ e_i), where e_i is the i-th unit vector and ⊕ denotes addition modulo 2.

Definition 2.2.
A Boolean function f(x_1, …, x_n) is canalizing if there exists a variable x_i, a Boolean function g(x_1, …, x_{i−1}, x_{i+1}, …, x_n) and a, b ∈ {0,1} such that

    f(x_1, x_2, …, x_n) = { b,                                    if x_i = a,
                          { g(x_1, …, x_{i−1}, x_{i+1}, …, x_n),  if x_i ≠ a.

In that case, we say that x_i canalizes f (to b).

Some authors further require the function g to be non-constant; in this paper, we do not impose this requirement. This is because when defining collectively canalizing functions, it makes sense to include constant g, and it is convenient to have our definition of canalizing functions here correspond to the definition of 1-set canalizing functions in Definition 2.7.

Definition 2.3. [11] A Boolean function f(x_1, …, x_n) is k-canalizing, where 1 ≤ k ≤ n, with respect to the permutation σ ∈ S_n, inputs a_1, …, a_k and outputs b_1, …, b_k, if

    f(x_1, …, x_n) = { b_1,      if x_{σ(1)} = a_1,
                     { b_2,      if x_{σ(1)} ≠ a_1, x_{σ(2)} = a_2,
                     { b_3,      if x_{σ(1)} ≠ a_1, x_{σ(2)} ≠ a_2, x_{σ(3)} = a_3,
                     { ⋮
                     { b_k,      if x_{σ(1)} ≠ a_1, …, x_{σ(k−1)} ≠ a_{k−1}, x_{σ(k)} = a_k,
                     { g ≢ b_k,  if x_{σ(1)} ≠ a_1, …, x_{σ(k−1)} ≠ a_{k−1}, x_{σ(k)} ≠ a_k,

where g = g(x_{σ(k+1)}, …, x_{σ(n)}) is a Boolean function on n − k variables. When g is not canalizing, the integer k is the canalizing depth of f (as in [12]). An n-canalizing function is also called a nested canalizing function (NCF), and we define all Boolean functions to be 0-canalizing.

He and Macauley provided the following powerful stratification theorem.

Theorem 2.4. [11] Every Boolean function f(x_1, …, x_n) ≢ 0 can be uniquely written as

    f(x_1, …, x_n) = M_1(M_2(⋯(M_{r−1}(M_r p_C ⊕ 1) ⊕ 1)⋯) ⊕ 1) ⊕ b,

where each M_i = ∏_{j=1}^{k_i} (x_{i_j} ⊕ a_{i_j}) is a nonconstant extended monomial, p_C is the core polynomial of f, and k = ∑_{i=1}^{r} k_i is the canalizing depth.
Each x_i appears in exactly one of {M_1, …, M_r, p_C}, and the only restrictions are the following "exceptional cases":

1. If p_C ≡ 1 and r ≠ 1, then k_r ≥ 2;
2. If p_C ≡ 1 and r = 1 and k_1 = 1, then b = 0.

When f is a non-canalizing function (i.e., when k = 0), we simply have p_C = f.

Theorem 2.4 shows that any Boolean function has a unique standard monomial form, in which the variables are partitioned into different layers based on their dominance. Any canalizing variable is in the first layer. Any variable that "becomes" canalizing when excluding all variables from the first layer is in the second layer, etc. All remaining variables that never become canalizing are part of the core polynomial. The number of variables that eventually "become" canalizing is the canalizing depth, and NCFs are exactly those functions where all variables eventually "become" canalizing.

Definition 2.5. [9] The layer structure of a Boolean function f(x_1, …, x_n) with canalizing depth k is defined as the vector (k_1, …, k_r), where r is the number of layers and k_i is the size of the i-th layer, i = 1, …, r. The layer structure follows directly from the unique standard monomial form of f (Theorem 2.4).

Example 2.6.
The function f(x_1, x_2, x_3) = x_1 ∧ (x_2 ∨ x_3) is nested canalizing with layer structure (k_1 = 1, k_2 = 2). The unique standard monomial form is f = M_1(M_2 ⊕ 1), where M_1 = x_1 and M_2 = (x_2 ⊕ 1)(x_3 ⊕ 1).

Definition 2.7.
A Boolean function f(x_1, …, x_n) is k-set canalizing, where 0 ≤ k ≤ n, if there exists a permutation σ ∈ S_n, inputs a_1, …, a_k ∈ {0,1} and an output b ∈ {0,1} such that

    f(x_1, x_2, …, x_n) = { b,               if (x_{σ(1)}, x_{σ(2)}, …, x_{σ(k)}) = (a_1, a_2, …, a_k),
                          { g(x_1, …, x_n),  otherwise.

In that case, the input set C_k = {(x_{σ(1)}, a_1), (x_{σ(2)}, a_2), …, (x_{σ(k)}, a_k)} (collectively) canalizes f (to b).

Definition 2.8.
For 0 ≤ k ≤ n, the k-set canalizing proportion of a Boolean function f(x_1, …, x_n), denoted P_k(f), is defined as the proportion of k-sets C_k from Definition 2.7 that collectively canalize f.

Remark 2.9.
These definitions imply the following.

(a) A function f is k-set canalizing if and only if P_k(f) > 0.
(b) For any Boolean function f(x_1, …, x_n), P_n(f) = 1.
(c) P_0(f) = 0 except when f is a constant function, in which case P_0(f) = 1.
(d) Canalizing functions, as defined in Definition 2.2, are exactly the 1-set canalizing functions.
(e) Consider the n-dimensional Boolean cube B^n, with vertices labeled according to f. P_k(f) is the probability that a randomly chosen (n − k)-face of B^n is constant.

Example 2.10.
The function f(x_1, x_2, x_3, x_4) = (x_1 ∨ x_2) ∧ (x_3 ∨ x_4) is not canalizing, i.e., P_1(f) = 0. However, f is 2-set canalizing because if x_1 = 0 and x_2 = 0, then f ≡ 0, regardless of the values of x_3 and x_4. Thus, {(x_1, 0), (x_2, 0)} collectively canalizes f to 0. Similarly, {(x_3, 0), (x_4, 0)} canalizes f to 0, while {(x_1, 1), (x_3, 1)}, {(x_1, 1), (x_4, 1)}, {(x_2, 1), (x_3, 1)}, and {(x_2, 1), (x_4, 1)} canalize f to 1. Thus, P_2(f) = 6/24 = 1/4.

In this section we investigate the k-set canalizing proportion of various types of functions and use it to define the canalizing strength of any Boolean function, a measure which we argue more accurately resembles the biological concept of canalization. We begin by showing that the k-set canalizing proportion can never decrease in k.

Theorem 3.1.
Let f(x_1, …, x_n) be a Boolean function. Then for 1 ≤ k < n,

    P_{k−1}(f) ≤ P_k(f) ≤ (1/2)(1 + P_{k−1}(f)).

Proof. Let [n] = {1, 2, …, n}. Let f(x_1, …, x_n) be a Boolean function, and let C_k denote the set of all k-input sets that collectively canalize f. For an input set C = {(x_{σ(1)}, a_1), (x_{σ(2)}, a_2), …, (x_{σ(k−1)}, a_{k−1})} with |C| = k − 1, define an extended input set

    C*(σ(k), a_k) = {(x_{σ(1)}, a_1), (x_{σ(2)}, a_2), …, (x_{σ(k−1)}, a_{k−1}), (x_{σ(k)}, a_k)},

where σ(i) ≠ σ(j) whenever i ≠ j. Further, let P_C(f) be the proportion of all possible extended input sets C* which collectively canalize f. Clearly,

    P_k(f) = E[P_C(f)],

where the expectation is taken uniformly over all possible input sets C with |C| = k − 1.

Case 1, C ∈ C_{k−1}: If C already collectively canalizes f, then P_C(f) = 1.

Case 2, C ∉ C_{k−1}: We consider all 2(n − (k − 1)) choices of C* and show that P_C(f) ≤ 1/2.

Case 2a, ∃ j ∈ [n] − {σ(1), σ(2), …, σ(k−1)} such that C*(j, 0) and C*(j, 1) both collectively canalize f to the same output: This implies that C already collectively canalizes f, which contradicts C ∉ C_{k−1}.

Case 2b, ∃ j ∈ [n] − {σ(1), σ(2), …, σ(k−1)} such that C*(j, 0) and C*(j, 1) both collectively canalize f to different output values: This implies that C*(j, 0) and C*(j, 1) are the only two choices for C* that collectively canalize f. Since there are n − (k − 1) choices for j and each has two corresponding C*, we have

    P_C(f) = 2 / (2(n − k + 1)) = 1/(n − k + 1) ≤ 1/2.

Case 2c, ∄ j ∈ [n] − {σ(1), σ(2), …, σ(k−1)} such that C*(j, 0) and C*(j, 1) both collectively canalize f: This implies that for each j at most one of the two corresponding C* collectively canalizes f, thus P_C(f) ≤ 1/2.

By definition, P(C ∈ C_{k−1}) = P_{k−1}(f). Therefore, conditioning on C ∈ C_{k−1} yields

    P_k(f) = E[P_C(f)]
           = P(C ∈ C_{k−1}) E[P_C(f) | C ∈ C_{k−1}] + P(C ∉ C_{k−1}) E[P_C(f) | C ∉ C_{k−1}]
           = P_{k−1}(f) · 1 + (1 − P_{k−1}(f)) E[P_C(f) | C ∉ C_{k−1}].

Thus,

    P_{k−1}(f) ≤ P_k(f) ≤ P_{k−1}(f) + (1 − P_{k−1}(f)) · (1/2) = (1/2)(1 + P_{k−1}(f)).

Corollary 3.2.
For a constant Boolean function f(x_1, …, x_n), P_0(f) = P_1(f) = ⋯ = P_n(f) = 1. If f is not constant, then P_k(f) ≤ 1 − 2^{−k} for all 0 ≤ k < n.

Proof. If f(x_1, …, x_n) is constant, then P_0(f) = 1 and P_n(f) = 1 by Remark 2.9. Thus, by Theorem 3.1, P_0(f) = P_1(f) = ⋯ = P_n(f) = 1.

If f(x_1, …, x_n) is not constant, then P_0(f) = 0 = 1 − 2^0. Proceed by induction and assume that P_{k−1}(f) ≤ 1 − 2^{−(k−1)} for some k < n. Then by Theorem 3.1,

    P_k(f) ≤ (1/2)(1 + P_{k−1}(f)) ≤ (1/2)(2 − 2^{−(k−1)}) = 1 − 2^{−k}.

In fact, we can show that the equality P_k(f) = 1 − 2^{−k} only holds if f is a special type of canalizing function, an NCF with exactly one layer (see Theorem 2.4 and Definition 2.5).

Theorem 3.3. If f(x_1, …, x_n) is a Boolean NCF with exactly one layer, then P_k(f) = 1 − 2^{−k} for all 0 ≤ k < n. Further, if for some f(x_1, …, x_n) with n ≥ 3, there exists a k, 0 < k < n, such that P_k(f) = 1 − 2^{−k}, then f is an NCF with exactly one layer.

Proof. Let f be a Boolean NCF with exactly one layer. By Thm. 4.5 in [11], there exist α_1, …, α_n, β ∈ {0,1} such that f can be uniquely written in standard monomial form,

    f(x_1, …, x_n) = β + ∏_{i=1}^{n} (x_i + α_i).

Let C_k = {(x_{σ(1)}, a_{σ(1)}), (x_{σ(2)}, a_{σ(2)}), …, (x_{σ(k)}, a_{σ(k)})} be a randomly chosen input set of size k, 0 < k < n, as in Definition 2.7. Then, C_k collectively canalizes f to β if the fixed inputs force ∏_{i=1}^{n} (x_i + α_i) = 0, i.e., if a_{σ(i)} = α_{σ(i)} for some 1 ≤ i ≤ k. We have P(a_{σ(i)} = α_{σ(i)}) = 1/2 for all 1 ≤ i ≤ k and thus, due to independence,

    P_k(f) = P(∃ i ∈ {1, …, k} : a_{σ(i)} = α_{σ(i)}) = 1 − P(∀ i ∈ {1, …, k} : a_{σ(i)} ≠ α_{σ(i)}) = 1 − 2^{−k}.

Further, by Remark 2.9, P_0(f) = 0 = 1 − 2^0, as f is not constant.

To prove the second part, let f be a Boolean function on n ≥ 3 variables such that P_k(f) = 1 − 2^{−k} for some 0 < k < n. By Theorem 3.1, P_k(f) ≤ (1/2)(1 + P_{k−1}(f)). Thus,

    1 − 2^{−k} ≤ (1/2)(1 + P_{k−1}(f)),   i.e.,   1 − 2^{−(k−1)} ≤ P_{k−1}(f).

However, by Corollary 3.2, P_{k−1}(f) ≤ 1 − 2^{−(k−1)}, so in fact P_{k−1}(f) = 1 − 2^{−(k−1)}. Iteratively, we get P_1(f) = 1/2.

Consider any variable x_i. If both x_i = 0 and x_i = 1 canalize f to the same value, then f is a constant function and P_1(f) = 1, a contradiction. On the other hand, if both x_i = 0 and x_i = 1 canalize f to different values, then no other variable can canalize f; thus P_1(f) = 2/(2n) = 1/n, contradicting P_1(f) = 1/2 for n ≥ 3. Thus, only one of x_i = 0 and x_i = 1 can canalize f for n ≥ 3, and therefore P_1(f) ≤ 1/2. In order for P_1(f) = 1/2, we need every variable x_i to canalize f to the same value b ∈ {0,1} for exactly one input a_i. Thus, we can express f in standard monomial form (Thm. 4.5 in [11]),

    f = b + ∏_{i=1}^{n} (x_i + a_i),

and deduce that f is an NCF with exactly one layer.

Theorem 3.3 provides maximal values for the k-set canalizing proportion of non-constant functions. The following example provides minimal values.

Example 3.4.
For b ∈ {0,1}, let f(x_1, …, x_n) = x_1 ⊕ x_2 ⊕ ⋯ ⊕ x_n ⊕ b be the parity function. Then, P_k(f) = 0 for all 0 ≤ k < n because in any case knowledge of all inputs is required to determine the output.

Given maximal and minimal values for the k-set canalizing proportion of non-constant functions, we are now in a position to define a new robustness measure for any Boolean function with n ≥ 2 inputs, based on its k-set canalizing proportions P_k for all k with 0 < k < n.

Definition 3.5.
The canalizing strength of a non-constant Boolean function f(x_1, …, x_n) with n ≥ 2 is

    c(f) = (1/(n−1)) ∑_{k=1}^{n−1} (2^k / (2^k − 1)) P_k(f) ∈ [0, 1].

Example 3.6. For b ∈ {0,1}, let f(x_1, …, x_n) = x_1 ⊕ x_2 ⊕ ⋯ ⊕ x_n ⊕ b be the parity function as in Example 3.4. Then, c(f) = 0, highlighting that the output of the parity function can only be determined when the values of all inputs are known. On the other hand, if f is a nested canalizing function with exactly one layer, e.g., the AND function f(x_1, …, x_n) = x_1 x_2 ⋯ x_n, then with Theorem 3.3,

    c(f) = (1/(n−1)) ∑_{k=1}^{n−1} (2^k / (2^k − 1)) (1 − 2^{−k}) = (1/(n−1)) ∑_{k=1}^{n−1} 1 = 1.

Remark 3.7.
The weights in Definition 3.5 are chosen such that (i) c(f) ∈ [0, 1] for any non-constant Boolean function and (ii) c(f) = 1 for the "most" canalizing functions (NCFs with exactly one layer), irrespective of n. This results, however, in c(f) > 1 for constant functions f. Alternatively, one could define the canalizing strength of a Boolean function as the unweighted average of P_k(f), k = 1, …, n − 1. This definition would ensure that any function (even constant ones) possesses a canalizing strength in [0, 1], but the maximal value attained by non-constant functions would depend on n. The weighted version thus serves as a measure for the closeness to "perfect" canalization (an NCF with one layer). The next example highlights how the canalizing strength coincides more closely with the biological concept of canalization than, for instance, the canalizing depth or a simple binary measure of the presence/absence of canalizing variables.
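The quantities in Definition 3.5 can be computed exhaustively for small n. The following Python sketch (the function names are ours; the paper provides no code) enumerates all 2^k · C(n, k) input sets of size k to obtain P_k(f), and from these the canalizing strength:

```python
from itertools import combinations, product

def p_k(f, n, k):
    """k-set canalizing proportion: fraction of the 2^k * C(n,k) input
    sets of size k that already determine the output of f."""
    if k == 0:
        outs = {f(x) for x in product((0, 1), repeat=n)}
        return 1.0 if len(outs) == 1 else 0.0
    hits = total = 0
    for vars_ in combinations(range(n), k):
        for vals in product((0, 1), repeat=k):
            fixed = dict(zip(vars_, vals))
            outs = {f(x) for x in product((0, 1), repeat=n)
                    if all(x[i] == v for i, v in fixed.items())}
            hits += (len(outs) == 1)
            total += 1
    return hits / total

def canalizing_strength(f, n):
    """Weighted average of P_1(f), ..., P_{n-1}(f) as in Definition 3.5
    (f non-constant, n >= 2)."""
    return sum(2**k / (2**k - 1) * p_k(f, n, k) for k in range(1, n)) / (n - 1)

# A one-layer NCF (here: AND) has strength 1; the parity function has strength 0.
and3 = lambda x: x[0] & x[1] & x[2]
xor3 = lambda x: x[0] ^ x[1] ^ x[2]
print(round(canalizing_strength(and3, 3), 6))  # 1.0
print(round(canalizing_strength(xor3, 3), 6))  # 0.0
```

The outputs match Example 3.6: the AND function attains the maximal strength 1, the parity function the minimal strength 0. The enumeration is exponential in n, so this is a sanity-check tool rather than an efficient algorithm.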
Example 3.8.
The function f(x_1, …, x_5) = x_1 ∧ (x_2 ⊕ x_3 ⊕ x_4 ⊕ x_5) is canalizing in x_1 (canalizing depth 1). We have P_1(f) = 0.1, P_2(f) = 0.2, P_3(f) = 0.3, P_4(f) = 0.5, resulting in c(f) ≈ 0.34. The threshold function g(x_1, …, x_5) = [(x_1 + ⋯ + x_5) > 1] is not canalizing, i.e., P_1(g) = 0. However, P_2(g) = 0.25, P_3(g) = 0.5, P_4(g) = 0.75, resulting in c(g) ≈ 0.43 > c(f).

This example highlights that functions that are canalizing in the traditional sense (Definition 2.2) may have a lower canalizing strength than functions not canalizing in the traditional sense. We thus investigated the distribution of the k-set canalizing proportion and the canalizing strength for different types of functions.

Definition 3.9. (see, e.g., [6]) A random Boolean function f(x_1, …, x_n) with bias p can be generated by flipping a p-biased coin 2^n times and accordingly filling in the truth table. The bias is not a property of an individual Boolean function; rather, it is a property of the underlying probability space. Setting p = 1/2 yields a uniform distribution on all Boolean functions f : {0,1}^n → {0,1}.

Theorem 3.10.
For a given bias p ∈ (0, 1) and for 0 ≤ k ≤ n, we have

    E[P_k(f)] = (1 − p)^{2^{n−k}} + p^{2^{n−k}}.

In particular, when the expectation is taken uniformly over all f : {0,1}^n → {0,1} (i.e., in the unbiased case of p = 1/2),

    E[P_k(f)] = 1 / 2^{2^{n−k} − 1}.

Figure 1: (A) Probability that a random input canalizes a function (E[P_1]) for different sampling biases and different numbers of inputs (n). (B) Expected k-set canalizing proportion (E[P_k]) for different sampling biases and different values of k ∈ {n − 1, n − 2, n − 3, n − 4}.

Proof.
Let f be a Boolean function in n variables, sampled uniformly at random from the space of p-biased Boolean functions. By Remark 2.9(e), the k-set canalizing proportion P_k(f) is the probability that an (n − k)-face of the n-dimensional Boolean cube, with vertices labeled according to f, is constant. Each (n − k)-face has 2^{n−k} vertices and there are two possible constants, 0 and 1, which are taken on by each vertex with probability 1 − p and p, respectively. Thus,

    E[P_k(f)] = (1 − p)^{2^{n−k}} + p^{2^{n−k}}.

Figure 1 highlights the implications of Theorem 3.10. Unbiased functions (p = 0.5) exhibit the lowest k-set canalizing proportion and thus the lowest canalizing strength, irrespective of k or the number of inputs n. Increased absolute bias leads to increased canalization.

An analysis of all Boolean functions in n = 4 variables revealed that, on average, functions with more canalizing variables have a higher canalizing strength (Figure 2A). There are, however, strong variations, and this result holds only on average, as highlighted by Example 3.8. Functions in n = 4 variables with canalizing depth 3 = n − 1 do not exist, since any Boolean function with canalizing depth n − 1 is automatically nested canalizing [11].

Corollary 3.11.
For any bias p ∈ (0, 1), the expected canalizing strength of a randomly chosen Boolean function approaches 0 as the number of variables increases,

    E[c(f)] → 0 as n → ∞.

Figure 2: Distribution of the canalizing strength for all 2^16 = 65,536 Boolean functions in n = 4 variables with fixed (A) canalizing depth, (B) number of symmetry groups. Horizontal dark lines depict the respective maximal, mean and minimal value (top to bottom).

Proof.
By Definition 3.5, Theorem 3.10 and linearity of the expectation, we have

    E[c(f)] = (1/(n−1)) ∑_{k=1}^{n−1} (2^k/(2^k − 1)) ((1 − p)^{2^{n−k}} + p^{2^{n−k}})
            ≤ (2/(n−1)) ∑_{k=1}^{n−1} ((1 − p)^{2^{n−k}} + p^{2^{n−k}})
            ≤ (4/(n−1)) ∑_{k=1}^{n−1} max(1 − p, p)^{2^{n−k}}
            ≤ (4/(n−1)) · 1/(1 − max(1 − p, p)) → 0 as n → ∞.

An interesting, related question is the following: We know that the set of all canalizing functions is very small compared to the set of all Boolean functions. That is,

    P(P_1(f) > 0 | f : {0,1}^n → {0,1}) → 0 as n → ∞.

Similarly, we know that all Boolean functions except for the parity function and its conjugate have some constant edge in their hypercube representation. That is, all but two Boolean functions are (n − 1)-set canalizing,

    P(P_{n−1}(f) > 0 | f : {0,1}^n → {0,1}) → 1 as n → ∞.

But what happens "in between"? More precisely: For which k does

    P(P_k(f) > 0 | f : {0,1}^n → {0,1}) → 0 as n → ∞?

Corollary 3.12. For any bias p ∈ (0, 1) and any integer k > 0,

    lim_{n→∞} P(P_{n−k}(f) > 0 | f : {0,1}^n → {0,1}) ≠ 0,

while

    lim_{n→∞} P(P_k(f) > 0 | f : {0,1}^n → {0,1}) = 0.

Proof. By Theorem 3.10,

    E[P_{n−k}(f)] = (1 − p)^{2^k} + p^{2^k} > 0,

irrespective of n. This directly yields the first part of the corollary.

To prove the second part, note that all possible values of P_k(f) are by definition fractions: there are 2^k C(n, k) different input sets which contain k of the n variables, where C(n, k) denotes the binomial coefficient. Thus, P_k(f) > 0 implies P_k(f) ≥ 1/(2^k C(n, k)).

Now, assume

    lim_{n→∞} P(P_k(f) > 0 | f : {0,1}^n → {0,1}) = r ≠ 0.

This implies E[P_k(f)] ≥ r/(2^k C(n, k)) for sufficiently large n. We can express C(n, k) as a polynomial in n with degree k and leading coefficient 1/k!, and get

    lim_{n→∞} n^k E[P_k(f)] ≥ lim_{n→∞} r · n^k / (2^k C(n, k)) = r · k!/2^k > 0.

However, by Theorem 3.10 we have for any p ∈ (0, 1) that

    lim_{n→∞} n^k E[P_k(f)] = lim_{n→∞} n^k ((1 − p)^{2^{n−k}} + p^{2^{n−k}}) = 0

by l'Hôpital's rule. This is a contradiction, which completes the proof.

The average sensitivity, introduced in [23], measures how sensitive the output of a function is to input changes, and constitutes one of the most studied properties of a Boolean function [16, 6, 24]. Thus far, average sensitivity and canalization have been two distinct concepts. In this section, we derive bounds for the average sensitivity of a Boolean function in terms of the k-set canalizing proportions, allowing us to connect these two concepts.

Definition 4.1.
The sensitivity of a Boolean function f(x_1, …, x_n) at a vector x ∈ {0,1}^n is defined as the number of Hamming neighbors of x with a different function value than f(x). That is,

    S(f, x) = ∑_{i=1}^{n} χ[f(x) ≠ f(x ⊕ e_i)].

Definition 4.2.
The average sensitivity of a Boolean function f(x_1, …, x_n) is the expected value of S(f, x). Assuming a uniform distribution of x,

    S(f) = E[S(f, x)] = (1/2^n) ∑_{x ∈ {0,1}^n} ∑_{i=1}^{n} χ[f(x) ≠ f(x ⊕ e_i)].

Definition 4.3. Assuming a uniform distribution of x, the normalized average sensitivity of a Boolean function f(x_1, …, x_n) is

    s(f) = S(f)/n = (1/n) ∑_{i=1}^{n} E[f(x) ⊕ f(x ⊕ e_i)] = (1/(n 2^n)) ∑_{x ∈ {0,1}^n} ∑_{i=1}^{n} χ[f(x) ≠ f(x ⊕ e_i)].

Oftentimes, including in this section, it is more convenient to consider the normalized average sensitivity, which can be thought of as the probability that a randomly chosen edge in the n-dimensional hypercube with vertices labeled according to f is non-constant. In order to prove the central theorem of this section, we require the following lemma.

Lemma 4.4.
For all (x, y) ∈ [0, 1]^2, for all n ∈ ℕ and 1 ≤ k ≤ n,

    1 − [ ((n−k)/(2n)) ((1−x)^k + (1−y)^k) + (k/(2n)) ((1−x)^{k−1} + (1−y)^{k−1}) ]^{1/k} ≥ ((n−1)/(2n)) (x + y) + 1/n.

Proof. Consider the function F : [0,1]^2 → ℝ defined by

    F(x, y) = (n−1)/n − ((n−1)/(2n))(x + y) − [ ((n−k)/(2n)) ((1−x)^k + (1−y)^k) + (k/(2n)) ((1−x)^{k−1} + (1−y)^{k−1}) ]^{1/k}.

Clearly, Lemma 4.4 is true if and only if F ≥ 0 on [0,1]^2. Write t = 1 − x and u = 1 − y for readability, and take the derivative with respect to x:

    ∂F/∂x = −(n−1)/(2n) − (1/k) [ ((n−k)/(2n))(t^k + u^k) + (k/(2n))(t^{k−1} + u^{k−1}) ]^{1/k − 1} ( −(k(n−k)/(2n)) t^{k−1} − (k(k−1)/(2n)) t^{k−2} ).

But 0 ≤ t ≤ 1 and 0 ≤ u ≤ 1, so

    ∂F/∂x ≤ −(n−1)/(2n) + (1/k) ( k(n−k)/(2n) + k(k−1)/(2n) ) = −(n−1)/(2n) + ( (n−k)/(2n) + (k−1)/(2n) ) = 0.

Since F is symmetric (i.e., F(x, y) = F(y, x)), we also have ∂F/∂y ≤ 0. Hence, F is non-increasing in both x and y, implying F(x, y) ≥ F(1, 1) = 0 for all (x, y) ∈ [0, 1]^2.

Theorem 4.5.
For any Boolean function f : {0,1}^n → {0,1} and any integer 0 < k ≤ n,

    (1/2^{k−1}) (1 − P_{n−k}(f)) ≤ s(f) ≤ 1 − (P_{n−k}(f))^{1/k}.

Proof. We prove the left inequality using a geometric argument. By Remark 2.9, P_k(f) is the probability that an (n − k)-face of the n-dimensional Boolean cube B^n is constant, where the vertices of B^n are labeled according to f. Thus, 1 − P_{n−k}(f) is the probability that a k-face is not constant. Similarly, s(f) is exactly the probability that a 1-face (i.e., an edge) of B^n is not constant. Let H be a k-face of B^n on which f is not constant. Any vertex in H has k edges that are part of H, and H possesses k 2^{k−1} total edges. Since f is not constant on H, there is at least one vertex in H where f takes on a different value. Thus, H possesses at least k non-constant edges, and by averaging over all (constant and non-constant) k-faces we get

    s(f) ≥ (1 − P_{n−k}(f)) · k/(k 2^{k−1}) + P_{n−k}(f) · 0 = (1/2^{k−1}) (1 − P_{n−k}(f)).

We use induction to prove the right inequality. The n = 1 and n = 2 cases are trivial. Now, assume s(g) ≤ 1 − (P_{(n−1)−j}(g))^{1/j} for all functions g on n − 1 variables and all 0 < j ≤ n − 1. Let f : {0,1}^n → {0,1} be any function on n variables. We seek to prove that s(f) ≤ 1 − (P_{n−k}(f))^{1/k} for all k with 0 < k ≤ n. By definition, the normalized average sensitivity is

    s(f) = (1/n) ∑_{i=1}^{n} E[f(x) ⊕ f(x ⊕ e_i)] = (1/n) ( E[f(x) ⊕ f(x ⊕ e_n)] + ∑_{i=1}^{n−1} E[f(x) ⊕ f(x ⊕ e_i)] ),

where the expectation is taken uniformly over all x ∈ {0,1}^n. Define

    g_0(x_1, x_2, …, x_{n−1}) ≡ f(x_1, x_2, …, x_{n−1}, 0),   g_1(x_1, x_2, …, x_{n−1}) ≡ f(x_1, x_2, …, x_{n−1}, 1),

and split the expectation according to the value of x_n, taken over all x̃ ∈ {0,1}^{n−1}, such that

    s(f) = (1/n) ( E[f(x) ⊕ f(x ⊕ e_n)] + (1/2) ∑_{i=1}^{n−1} ( E[g_0(x̃) ⊕ g_0(x̃ ⊕ e_i)] + E[g_1(x̃) ⊕ g_1(x̃ ⊕ e_i)] ) )
         = (1/n) E[f(x) ⊕ f(x ⊕ e_n)] + ((n−1)/(2n)) s(g_0) + ((n−1)/(2n)) s(g_1)
         ≤ 1/n + ((n−1)/(2n)) (s(g_0) + s(g_1)).

To calculate P_{n−k}(f), consider all input sets A with n − k variables. We distinguish two cases:

Case 1, x_n ∈ A: This occurs with probability (n−k)/n. For a ∈ {0,1}, we have x_n = a with probability 1/2, and A canalizes f if and only if the (n − k − 1)-set A − {x_n} canalizes g_a.

Case 2, x_n ∉ A: This occurs with probability k/n. In this case, A canalizes f if and only if A canalizes both g_0 and g_1 to the same value. Thus, the proportion of (n − k)-input sets A which do not include x_n but canalize f is no greater than min(P_{n−k}(g_0), P_{n−k}(g_1)).

Using the induction hypothesis, we thus get

    P_{n−k}(f) ≤ ((n−k)/(2n)) (P_{n−k−1}(g_0) + P_{n−k−1}(g_1)) + (k/n) min(P_{(n−1)−(k−1)}(g_0), P_{(n−1)−(k−1)}(g_1))
              ≤ ((n−k)/(2n)) ((1 − s(g_0))^k + (1 − s(g_1))^k) + (k/n) min((1 − s(g_0))^{k−1}, (1 − s(g_1))^{k−1})
              ≤ ((n−k)/(2n)) ((1 − s(g_0))^k + (1 − s(g_1))^k) + (k/(2n)) ((1 − s(g_0))^{k−1} + (1 − s(g_1))^{k−1}).

Therefore,

    1 − (P_{n−k}(f))^{1/k} ≥ 1 − [ ((n−k)/(2n)) ((1 − s(g_0))^k + (1 − s(g_1))^k) + (k/(2n)) ((1 − s(g_0))^{k−1} + (1 − s(g_1))^{k−1}) ]^{1/k}
                           ≥ ((n−1)/(2n)) (s(g_0) + s(g_1)) + 1/n,  by Lemma 4.4,
                           ≥ s(f),

which concludes the induction.

Theorem 4.5 provides bounds for the average sensitivity of a Boolean function given only some of its canalizing proportions. In terms of graph theory: given only the proportion of monochromatic higher-dimensional faces of a Boolean cube, we provide upper and lower bounds for the proportion of monochromatic (1-dimensional) edges. Further, the k = 1 case in Theorem 4.5 directly yields the following result, relating the normalized average sensitivity and the (n − 1)-set canalizing proportion.

Corollary 4.6. s(f) = 1 − P_{n−1}(f) for any Boolean function f : {0,1}^n → {0,1}.

Many properties of Boolean functions have been thoroughly studied over the course of the last decades. Most early studies and complexity measures of Boolean functions were motivated by questions arising from theoretical computer science. For example, Nisan used the sensitivity, the block sensitivity and the certificate complexity of a Boolean function to derive bounds for the worst-case time needed to compute a Boolean function using an ideal algorithm [25].
Just recently, Huang showed that all these complexity measures are polynomially related, thereby proving a major open problem in complexity theory [26].

The definition of the $k$-set canalizing proportions $P_k$, $k = 0, 1, \ldots, n-1$, which are the focus of this paper, is reminiscent of the definition of certificate complexity, with one big difference. Nisan defines certificates as sets of inputs to a Boolean function which suffice to determine the output of the function [25]. A certificate is thus exactly a collectively canalizing input set (Definition 2.7). The certificate complexity of an $n$-variable Boolean function, however, is defined as the number of inputs that need to be known in the worst case (i.e., when considering all $2^n$ configurations) to determine the output of the function. The $k$-set canalizing proportion, on the other hand, quantifies the average proportion of $k$-sets that collectively canalize a function. This highlights a general difference in the scope of use of Boolean functions in different areas of application. While theoretical computer science is particularly concerned with the worst-case scenario, the focus of biological studies is the average behavior of a system, which can be described by complexity measures like the average sensitivity of a Boolean function.

The motivation for the complexity measures studied in this paper comes from the biological concept of canalization. Considering canalization as a property of a Boolean function (as in [21, 22]), rather than on the basis of individual variables as traditionally done [4, 5, 6, 11], allows us to define and study the canalizing strength of any Boolean function. With this broader definition of canalization, we can thus distinguish finer differences in the canalization property. Given that the large majority of Boolean functions in several variables is simply not canalizing in the traditional sense (Definition 2.3), this constitutes a biologically relevant advancement.

The $k$-set canalizing proportions $P_k$, $k = 0, 1, \ldots, n-1$, contain valuable information about a Boolean function. From the 1-set canalizing proportion ($P_1$), we can derive the number of canalizing variables.
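The worst-case versus average-case contrast can be made concrete with a small computation. The sketch below is our own illustration with our own helper names, not code or notation from [25]; it computes the certificate complexity of the 3-variable majority function, for which every input has a certificate of size 2, even though on average only half of all 2-sets with fixed inputs canalize the function:

```python
from itertools import combinations, product

def forces(f, n, A, x):
    """True if fixing the coordinates in A to x's values determines f's output."""
    free = [i for i in range(n) if i not in A]
    outputs = set()
    for rest in product((0, 1), repeat=len(free)):
        y = list(x)
        for i, v in zip(free, rest):
            y[i] = v
        outputs.add(f(tuple(y)))
    return len(outputs) == 1

def certificate_complexity(f, n):
    """Worst case, over all inputs x, of the smallest certificate of x."""
    worst = 0
    for x in product((0, 1), repeat=n):
        size = min(k for k in range(n + 1)
                   if any(forces(f, n, A, x) for A in combinations(range(n), k)))
        worst = max(worst, size)
    return worst

maj = lambda x: int(sum(x) >= 2)
assert certificate_complexity(maj, 3) == 2  # two agreeing votes always suffice
```

The worst-case measure thus reports 2 for majority, while the average-case view distinguishes it from functions whose 2-sets all canalize.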
If, on the other hand, we know all but one input of a function, we can derive its average sensitivity, which is thus a bijective function of the $(n-1)$-set canalizing proportion (Corollary 4.6). We also derived bounds for the $k$-set canalizing proportions which, together with Theorem 3.3, allow us to define the most canalizing and least canalizing Boolean functions. The subsequent definition of the canalizing strength (Definition 3.5) yields a novel measure of how close any non-constant Boolean function is to perfect canalization.

References

[1] Stuart A. Kauffman. Metabolic stability and epigenesis in randomly constructed genetic nets. Journal of Theoretical Biology, 22(3):437–467, 1969.
[2] Stuart Kauffman. The large scale structure and dynamics of gene control circuits: an ensemble approach. Journal of Theoretical Biology, 44(1):167–190, 1974.
[3] Conrad H. Waddington. Canalization of development and the inheritance of acquired characters. Nature, 150(3811):563–565, 1942.
[4] Stuart Kauffman, Carsten Peterson, Björn Samuelsson, and Carl Troein. Random Boolean network models and the yeast transcriptional network. Proceedings of the National Academy of Sciences, 100(25):14796–14799, 2003.
[5] Stuart Kauffman, Carsten Peterson, Björn Samuelsson, and Carl Troein. Genetic networks with canalyzing Boolean rules are always stable. Proceedings of the National Academy of Sciences, 101(49):17102–17107, 2004.
[6] Ilya Shmulevich and Stuart A. Kauffman. Activities and sensitivities in Boolean network models. Physical Review Letters, 93(4):048701, 2004.
[7] Fredrik Karlsson and Michael Hörnquist. Order or chaos in Boolean gene networks depends on the mean fraction of canalizing functions. Physica A: Statistical Mechanics and its Applications, 384(2):747–757, 2007.
[8] Tiago P. Peixoto. The phase diagram of random Boolean networks with nested canalizing functions. The European Physical Journal B, 78(2):187–192, 2010.
[9] Claus Kadelka, Jack Kuipers, and Reinhard Laubenbacher. The influence of canalization on the robustness of Boolean networks. Physica D: Nonlinear Phenomena, 353:39–47, 2017.
[10] Elijah Paul, Gleb Pogudin, William Qin, and Reinhard Laubenbacher. The dynamics of canalizing Boolean networks. Complexity, 2020.
[11] Qijun He and Matthew Macauley. Stratification and enumeration of Boolean functions by canalizing depth. Physica D: Nonlinear Phenomena, 314:1–8, 2016.
[12] Lori Layne, Elena Dimitrova, and Matthew Macauley. Nested canalyzing depth and network stability. Bulletin of Mathematical Biology, 74(2):422–433, 2012.
[13] Reinhard Laubenbacher and Brandilyn Stigler. A computational algebra approach to the reverse engineering of gene regulatory networks. Journal of Theoretical Biology, 229(4):523–537, 2004.
[14] Winfried Just, Ilya Shmulevich, and John Konvalina. The number and probability of canalizing functions. Physica D: Nonlinear Phenomena, 197(3-4):211–221, 2004.
[15] David Murrugarra and Reinhard Laubenbacher. The number of multistate nested canalyzing functions. Physica D: Nonlinear Phenomena, 241(10):929–938, 2012.
[16] Yuan Li, John O. Adeyeye, David Murrugarra, Boris Aguilar, and Reinhard Laubenbacher. Boolean nested canalizing functions: A comprehensive analysis. Theoretical Computer Science, 481:24–36, 2013.
[17] Claus Kadelka, Yuan Li, Jack Kuipers, John O. Adeyeye, and Reinhard Laubenbacher. Multistate nested canalizing functions and their networks. Theoretical Computer Science, 675:1–14, 2017.
[18] Stephen E. Harris, Bruce K. Sawhill, Andrew Wuensche, and Stuart Kauffman. A model of transcriptional regulatory networks based on biases in the observed regulation rules. Complexity, 7(4):23–40, 2002.
[19] Bryan C. Daniels, Hyunju Kim, Douglas Moore, Siyu Zhou, Harrison B. Smith, Bradley Karas, Stuart A. Kauffman, and Sara I. Walker. Criticality distinguishes the ensemble of biological regulatory networks. Physical Review Letters, 121(13):138102, 2018.
[20] Claus Kadelka, Taras-Michael Butrie, Evan Hilton, Jack Kinseth, and Haris Serdarevic. A meta-analysis of Boolean network models reveals design principles of gene regulatory networks. arXiv preprint arXiv:2009.01216, 2020.
[21] Kevin E. Bassler, Choongseop Lee, and Yong Lee. Evolution of developmental canalization in networks of competing Boolean nodes. Physical Review Letters, 93(3):038101, 2004.
[22] C. J. Olson Reichhardt and Kevin E. Bassler. Canalization and symmetry in Boolean models for genetic regulatory networks. Journal of Physics A: Mathematical and Theoretical, 40(16):4339, 2007.
[23] Stephen Cook, Cynthia Dwork, and Rüdiger Reischuk. Upper and lower time bounds for parallel random access machines without simultaneous writes. SIAM Journal on Computing, 15(1):87–97, 1986.
[24] Ravi B. Boppana. The average sensitivity of bounded-depth circuits. Information Processing Letters, 63(5):257–261, 1997.
[25] Noam Nisan. CREW PRAMs and decision trees. SIAM Journal on Computing, 20(6):999–1007, 1991.
[26] Hao Huang. Induced subgraphs of hypercubes and a proof of the sensitivity conjecture. Annals of Mathematics, 190(3):949–955, 2019.