Combinatorial and Algorithmic Properties of One Matrix Structure at Monotone Boolean Functions
Valentin Bakoev*

Abstract
One matrix structure in the area of monotone Boolean functions is defined here. Some of its combinatorial, algebraic and algorithmic properties are derived. On the base of these properties three algorithms are built. The first of them generates all monotone Boolean functions of n variables in lexicographic order. The second one determines the first (resp. the last) lexicographically minimal true (resp. maximal false) vector of an unknown monotone function f of n variables. The algorithm uses at most n membership queries and its running time is Θ(n). It serves the third algorithm, which identifies an unknown monotone Boolean function f of n variables by using membership queries only. The experimental results show that for 1 ≤ n ≤ 6, the algorithm determines f by using at most m·n queries, where m is the combined size of the sets of minimal true and maximal false vectors of f.

Keywords: monotone Boolean function; matrix structure properties; generating algorithm; minimal true vector; maximal false vector; identification algorithm
Note (Feb. 15, 2019).
This manuscript was written in 2005 and has not been published till now. This is its original version, in which a few misprints have been corrected and the Internet references have been updated.
The problems in the area of monotone Boolean functions (MBFs) are important not only for the Boolean algebra. Many of them are related to (or have a direct interpretation in) problems arising in various fields, such as graph (hypergraph) theory, threshold logic, circuit theory, artificial intelligence, computational learning theory, game theory, etc. [2, 6, 15]. Some problems concerning MBFs are still not solved in the general case; others have still open complexities. Some well-known scientists consider that the capabilities of the known tools and methods for investigation of MBFs are still not efficient enough, and they recommend that new approaches and tools be searched for and used [5, 15]. This opinion additionally motivated us to do the following investigations and to present them here.

Three of the well-known problems concerning MBFs are:
(1) The Dedekind's problem – enumeration of the MBFs of n variables (or, equivalently, enumeration of all antichains of subsets of an n-set). The problem was set by Dedekind at the end of the 19th century; it is the oldest problem in the area of MBFs [20];
(2) The identification problem – identification of an unknown MBF of n variables by using a given learning model;
(3) Determining at least one minimal true vector and/or at least one maximal false vector of an unknown MBF. This problem is closely related to problem (2).

Here we present our investigations of these three problems. Firstly, in Section 2, we recall some necessary notions and known results. In Section 3 we define one matrix structure, which represents the precedences of the vectors of the n-dimensional Boolean cube. Some combinatorial and algorithmic properties of the structure are derived. On the base of them we build three algorithms. The first of them is presented in Section 4. It generates all MBFs of n variables in lexicographic order and works in polynomial total time. The second one has two versions and is described in Section 5. It determines the first (resp. the last) lexicographically minimal true (resp. maximal false) vector of an unknown MBF of n variables. The algorithm is of the binary search type, it has Θ(n) running time and uses at most n membership queries for each of these vectors. Section 6 presents the third algorithm, which identifies an unknown MBF of n variables by using membership queries only. It uses the second algorithm and follows the "Divide and conquer" strategy. Some comments concerning the realizations, the complexity and the experimental results of the algorithm are also given.

* Faculty of Mathematics and Informatics, University of Veliko Tarnovo, 2 Theodosi Tarnovski St, 5000 Veliko Tarnovo, Bulgaria; email: [email protected]
One of the famous notions in Discrete mathematics is the n-dimensional Boolean cube {0,1}^n – the n-th Cartesian power of the set {0,1}, consisting of all n-dimensional binary vectors, i.e., {0,1}^n = {(a_1, a_2, ..., a_n) | a_i ∈ {0,1}, i = 1, 2, ..., n}. Obviously, these vectors are exactly 2^n. The serial number of the vector α = (a_1, a_2, ..., a_n) ∈ {0,1}^n is the natural number #α = a_1·2^{n−1} + a_2·2^{n−2} + ··· + a_n·2^0, i.e., the natural number whose binary representation is a_1 a_2 ... a_n. The vector α ∈ {0,1}^n precedes lexicographically the vector β ∈ {0,1}^n if either there exists an integer k, 1 ≤ k ≤ n, such that a_k < b_k and a_i = b_i for i < k, or α = β. The vectors of {0,1}^n are in lexicographic order in the sequence α_0, α_1, ..., α_{2^n−1} if α_i precedes lexicographically α_j for 0 ≤ i < j ≤ 2^n − 1. When the vectors of {0,1}^n are in lexicographic order (as we consider henceforth), their serial numbers form the sequence 0, 1, ..., 2^n − 1. The following inductive and constructive definition of the n-dimensional Boolean cube determines a procedure for obtaining its vectors in lexicographic order.

Definition 2.1
1) We call the set {0,1} = {(0), (1)} the one-dimensional Boolean cube. Its elements (0) and (1) are one-dimensional binary vectors and they are in lexicographic order.
2) Let {0,1}^{n−1} = {α_0, α_1, ..., α_{2^{n−1}−1}} be the (n−1)-dimensional Boolean cube and let its elements, the (n−1)-dimensional binary vectors α_0, α_1, ..., α_{2^{n−1}−1}, be in lexicographic order.
3) We build the n-dimensional Boolean cube {0,1}^n from {0,1}^{n−1}, firstly by adding 0 in the beginning of all its vectors, and next by adding 1, i.e., {0,1}^n = {(0α_0), (0α_1), ..., (0α_{2^{n−1}−1}), (1α_0), (1α_1), ..., (1α_{2^{n−1}−1})}, and so the vectors of {0,1}^n are in lexicographic order.

The relation "⪯" is defined over {0,1}^n × {0,1}^n as follows: α ⪯ β (we read "α precedes β") if a_i ≤ b_i for i = 1, 2, ..., n. It is reflexive, antisymmetric and transitive, and so {0,1}^n is a partially ordered set (POSet) with respect to the relation "⪯". When α ⪯ β or β ⪯ α we call α and β comparable; otherwise we call them incomparable.

The mapping f: {0,1}^n → {0,1} is called a Boolean function (or function, in short) of n variables. If α, β ∈ {0,1}^n and α ⪯ β always implies f(α) ≤ f(β), then the function f is called monotone (or positive). We denote by M_n the set of all MBFs of n variables. When we consider the binary constants 0 and 1 as functions, we denote them by ˜0 and ˜1, respectively. They are the unique functions of 0 variables in M_0. Let f ∈ M_n and α ∈ {0,1}^n. If f(α) = 0 (resp. f(α) = 1) then α is called a false vector (resp. true vector) of f. The set of all false vectors (resp. all true vectors) of f is denoted by F(f) (resp. T(f)). The false vector α is called maximal if there is no other vector α′ ∈ F(f) such that α ⪯ α′ and α ≠ α′. The set of all maximal false vectors is denoted by max F(f).
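As a small aside, the inductive construction of Definition 2.1 and the relation ⪯ can be coded directly. The following Python sketch is ours, purely for illustration; it is not part of the original text.

```python
def boolean_cube(n):
    """Vectors of {0,1}^n as tuples, in lexicographic order (Definition 2.1)."""
    if n == 1:
        return [(0,), (1,)]
    prev = boolean_cube(n - 1)
    # Prepend 0 to every (n-1)-dimensional vector, then prepend 1.
    return [(0,) + v for v in prev] + [(1,) + v for v in prev]

def precedes(a, b):
    """The relation "a precedes b": a_i <= b_i for every coordinate i."""
    return all(x <= y for x, y in zip(a, b))
```

Note that the serial number of a vector is then simply its index in the returned list, which matches the sequence 0, 1, ..., 2^n − 1 above.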
Symmetrically, the true vector β is called minimal if there is no other vector β′ ∈ T(f) such that β′ ⪯ β and β′ ≠ β; also, min T(f) denotes the set of all minimal true vectors of f. Obviously, each monotone function f can be determined by only one of the sets min T(f) or max F(f).

The function x^σ is defined as follows: x^σ = x if σ = 1, or x^σ = x̄ if σ = 0. The conjunction K = x_{i_1}^{σ_1} ... x_{i_k}^{σ_k} is called an implicant of the function f if T(K) ⊆ T(f). If K′ and K′′ are implicants of f such that T(K′) ⊂ T(K′′), we say that K′′ absorbs K′. The implicant K of f is called prime if there is no other implicant K′ of f such that T(K) ⊂ T(K′). The disjunction D = x_{j_1}^{τ_1} ∨ ··· ∨ x_{j_r}^{τ_r} is called an implicate (or clause) of the function g if F(D) ⊆ F(g). If D′ and D′′ are implicates of g such that F(D′) ⊂ F(D′′), we say that D′′ absorbs D′. The implicate D of g is called prime if there is no other clause D′ of g such that F(D) ⊂ F(D′).

In [4, 5, 8, 14, 15] it is shown that each monotone function f has a unique irredundant (minimal) disjunctive normal form (IDNF), consisting of all prime implicants of f, and also a unique irredundant conjunctive normal form (ICNF), consisting of all prime implicates of f. In both forms all literals are uncomplemented, so the IDNF and the ICNF of an arbitrary monotone function are superpositions over the set {xy, x ∨ y, ˜0, ˜1}. The existence of a bijection between the set of prime implicants in the IDNF of f and the set min T(f) is also noted – each prime implicant K_i = x_{i_1} x_{i_2} ... x_{i_k} corresponds to the vector α ∈ min T(f) having ones in coordinates i_1, i_2, ..., i_k and zeros in all the rest coordinates. Hence α is a characteristic vector of K_i. Analogously, in [8, 14] the existence of a bijection between the set of all prime implicates in the ICNF of f and the set max F(f) is shown – each prime implicate D_j = x_{j_1} ∨ x_{j_2} ∨ ··· ∨ x_{j_r} in the ICNF of f corresponds to the vector β ∈ max F(f) having zeros in coordinates j_1, j_2, ..., j_r and ones in all the rest coordinates. So β is an anti-characteristic vector of D_j.

Example 2.2
Let us consider the function f(x, y, z) = (0, 0, 1, 1, 0, 1, 1, 1) ∈ M_3, for which we have:
1) The IDNF of f is f(x, y, z) = y ∨ xz, and y, xz are its prime implicants. They correspond bijectively to the vectors (0, 1, 0), (1, 0, 1), and so min T(f) = {(0, 1, 0), (1, 0, 1)};
2) The ICNF of f is f(x, y, z) = (x ∨ y)(y ∨ z), and (x ∨ y), (y ∨ z) are its prime implicates. They correspond bijectively to the vectors (0, 0, 1), (1, 0, 0), and hence max F(f) = {(0, 0, 1), (1, 0, 0)}.

We shall represent the relation "⪯" over the vectors of {0,1}^n by a matrix.

Definition 3.1
We define a matrix of the precedences P_n = ||p_ij|| of dimension 2^n × 2^n as follows: for each pair of vectors α, β ∈ {0,1}^n such that #α = i, #β = j, we put p_ij = 1 if α ⪯ β, or p_ij = 0 otherwise.

The rows and the columns of P_n are numbered from 0 till 2^n − 1, in accordance with the numbers of the vectors in {0,1}^n.

Theorem 3.2
For n = 1 the matrix P_1 is

  ( 1 1 )
  ( 0 1 ).

For any integer n > 1, P_n is a block matrix of the form

  P_n = ( P_{n−1}  P_{n−1} )
        ( O_{n−1}  P_{n−1} ),

where P_{n−1} denotes the same matrix of dimension 2^{n−1} × 2^{n−1}, and O_{n−1} is the zero matrix of dimension 2^{n−1} × 2^{n−1}. P_n represents the precedences of the vectors of {0,1}^n in accordance with Definition 3.1.

Proof.
We shall prove the theorem by induction on n.
1) Obviously, for n = 1 the matrix P_1 is of the given form and it represents the precedences of the vectors (0) and (1) in {0,1}.
2) We suppose that the theorem is true for the matrix P_{n−1}, which represents the precedences of the vectors in {0,1}^{n−1} in accordance with Definition 3.1.
3) Following Definition 2.1, every α ∈ {0,1}^n has the form α = (0, γ) or α = (1, γ), where γ ∈ {0,1}^{n−1}. For arbitrary vectors α, β ∈ {0,1}^n, depending on whether they begin with 0 or with 1, we consider the following four cases:
i) α = (0, γ), β = (0, δ), where γ, δ ∈ {0,1}^{n−1}. Let #α = i, #β = j. Then i = #γ, j = #δ, 0 ≤ i, j ≤ 2^{n−1} − 1, and also α ⪯ β iff γ ⪯ δ. So, for any such i, j the elements p_ij of the matrix P_n have the same values as the elements with the same indices in the matrix P_{n−1}. Therefore the matrix P_{n−1} is placed in the upper left block (quarter) of P_n.
ii) α = (0, γ), β = (1, δ), where γ, δ ∈ {0,1}^{n−1}. Then #α = #γ = i, 0 ≤ i ≤ 2^{n−1} − 1, and #β = 2^{n−1} + #δ = j, 2^{n−1} ≤ j ≤ 2^n − 1. Also α ⪯ β iff γ ⪯ δ. For these values of i and j the elements p_ij of the matrix P_n are the same as the elements p_ik, k = j − 2^{n−1}, of the matrix P_{n−1}. So P_{n−1} is placed in the upper right block of P_n.
iii) α = (1, γ), β = (0, δ), γ, δ ∈ {0,1}^{n−1}. A vector beginning with 1 does not precede a vector beginning with 0. For the numbers of these vectors we have: i = #α, 2^{n−1} ≤ i ≤ 2^n − 1, and j = #β, 0 ≤ j ≤ 2^{n−1} − 1. Hence p_ij = 0 for all such i and j, and so the zero matrix O_{n−1} is placed in the lower left block of P_n.
iv) α = (1, γ), β = (1, δ), γ, δ ∈ {0,1}^{n−1}. The case is analogous to case (i), the difference being only in the numbers of the vectors: 2^{n−1} ≤ #α, #β ≤ 2^n − 1. So P_{n−1} is placed in the lower right block of P_n.
Therefore the matrix P_n has the structure stated in the theorem. Also, P_n represents the precedences of the vectors of {0,1}^n in accordance with Definition 3.1, since the matrix P_{n−1} does this for the vectors of {0,1}^{n−1} (because of the inductive hypothesis). So the theorem is proved. ⋄

Remark 3.3
From the properties of the relation "⪯" and from the theorem it follows that P_n is a triangular matrix, having ones on its major diagonal and zeros under it. The triangle of numbers on and over the major diagonal of P_n is related to other known structures:
1) it is a discrete analog of the fractal structure known as the Sierpinski triangle;
2) the transposed matrix P_n^T coincides with the Pascal's triangle consisting of 2^n rows, where the numbers are taken modulo 2, i.e., over GF(2).
The matrix P_n can be expressed recursively as a Kronecker product: P_n = P_1 ⊗ P_{n−1} = P_1 ⊗ P_1 ⊗ P_{n−2} = ··· = the Kronecker n-th power of P_1.

We denote by R_n = {r_0, ..., r_{2^n−1}} the set of all rows of P_n considered as binary vectors.

Theorem 3.4
Let α = (a_1, a_2, ..., a_n) ∈ {0,1}^n, #α = i, 1 ≤ i ≤ 2^n − 1, and let α have ones in the coordinates i_1, i_2, ..., i_r, 1 ≤ r ≤ n, i.e., let α be the characteristic vector of the conjunction c_i = x_{i_1} x_{i_2} ... x_{i_r} (so it is a monotone function). If we consider c_i as a function of n variables, then the vector of its functional values contains the same values as (corresponds to) the i-th row r_i of the matrix P_n. When #α = 0, the zero row r_0 of P_n corresponds to ˜1.

Proof.
We note that we number the coordinates of the vectors of {0,1}^n from left to right, denoting by x_1, x_2, ..., x_n the variables corresponding to them. Following Definition 2.1, firstly we add zeros and next we add ones in the beginning of each vector of {0,1}^{n−1} to obtain the vectors of {0,1}^n. This is equivalent to adding the variable x_1 in the beginning and increasing the indices of all variables of {0,1}^{n−1} by one.

All elements of the zero row r_0 of P_n are ones, and so it corresponds to the vector of ˜1 as a function of n variables. The rest of the assertion we shall prove by induction on n.
1) Obviously, the assertion is true for the matrix P_1.
2) We suppose that the theorem is true for the matrix P_{n−1}, i.e., for every i, 1 ≤ i ≤ 2^{n−1} − 1, the vector of functional values (or briefly "vector of the function" henceforth) of the conjunction c_i = x_{i_1} x_{i_2} ... x_{i_r}, whose characteristic vector is α ∈ {0,1}^{n−1}, #α = i, coincides with the i-th row r_i of the matrix P_{n−1}.
3) Let α ∈ {0,1}^{n−1}, #α = i = 2^{i_1} + 2^{i_2} + ··· + 2^{i_m}, 1 ≤ i ≤ 2^{n−1} − 1, 1 ≤ m ≤ n − 1, and so α has ones in the coordinates i_1, i_2, ..., i_m. In accordance with the inductive hypothesis, the row r_i of P_{n−1} coincides with the vector of the function c_i = x_{i_1} x_{i_2} ... x_{i_m}. We consider two cases:
i) Let β ∈ {0,1}^n be the vector which is obtained by adding 0 in the beginning of α. Then #β = i, it has ones in the coordinates i_1 + 1, i_2 + 1, ..., i_m + 1, and so β is a characteristic vector of the conjunction c′_i = x_{i_1+1} x_{i_2+1} ... x_{i_m+1}. Following Theorem 3.2, the row r′_i of P_n is obtained by writing the row r_i of P_{n−1} two times one after another (as a concatenation of strings). So r′_i coincides with the vector of a function of n variables which is obtained by adding the fictitious variable x_1 to the function c_i. Therefore the row r′_i of P_n contains the functional values of c′_i.
ii) Let β ∈ {0,1}^n be the vector which is obtained by adding 1 in the beginning of α. Then #β = 2^{n−1} + i = k, it has ones in the coordinates 1, i_1 + 1, ..., i_m + 1, and so β is a characteristic vector of the conjunction c_k = x_1 x_{i_1+1} ... x_{i_m+1}. Following Theorem 3.2, the first half of the row r_k of P_n is a row from the zero matrix O_{n−1}, and its second half is the row r_i of P_{n−1}. We consider r_k as a vector of a function of n variables. It is obtained by adding (in conjunction) the essential variable x_1 to a function of n − 1 variables. On the first half of the vectors of {0,1}^n we have x_1 = 0, and so the values in the first half of r_k are zeros. On the second half of the vectors of {0,1}^n we have x_1 = 1, and so the values in the second half of r_k are the same as these of r_i. Therefore the row r_k of P_n contains the functional values of the conjunction c_k. So the theorem is proved. ⋄

We denote by C_n the set of all conjunctions of n variables without negations.

Remark 3.5
The correspondence in Theorem 3.4 between the conjunction c_i and its characteristic vector α ∈ {0,1}^n, #α = i, is a bijection φ: {0,1}^n → C_n. Theorem 3.4 states the relation between the set C_n and the matrix P_n – this is the bijection ψ: C_n → R_n, the bijection between the formula representation c_i and the vector representation r_i of each conjunction of n variables without negations.

Table 1 illustrates the assertion of Theorem 3.4 for n = 3.

We consider the conjunction and the disjunction over binary vectors as bitwise operations. So the vector of an arbitrary f ∈ M_n can be expressed as a linear combination f(x_1, x_2, ..., x_n) = a_0 r_0 ∨ a_1 r_1 ∨ ··· ∨ a_{2^n−1} r_{2^n−1}, where the coefficients a_0, a_1, ..., a_{2^n−1} ∈ {0,1}, and the trivial combination corresponds to ˜0. When f(x_1, x_2, ..., x_n) = c_{i_1} ∨ c_{i_2} ∨ ··· ∨ c_{i_k} is an IDNF of f, then the rows r_{i_1}, r_{i_2}, ..., r_{i_k} corresponding to the prime implicants are pairwise incomparable (as binary vectors) and the vector of f is a result of r_{i_1} ∨ r_{i_2} ∨ ··· ∨ r_{i_k}.

  α = (x_1, x_2, x_3) | i = #α | row r_i of P_3  | c_i
  (0 0 0)             |   0    | 1 1 1 1 1 1 1 1 | ˜1
  (0 0 1)             |   1    | 0 1 0 1 0 1 0 1 | x_3
  (0 1 0)             |   2    | 0 0 1 1 0 0 1 1 | x_2
  (0 1 1)             |   3    | 0 0 0 1 0 0 0 1 | x_2 x_3
  (1 0 0)             |   4    | 0 0 0 0 1 1 1 1 | x_1
  (1 0 1)             |   5    | 0 0 0 0 0 1 0 1 | x_1 x_3
  (1 1 0)             |   6    | 0 0 0 0 0 0 1 1 | x_1 x_2
  (1 1 1)             |   7    | 0 0 0 0 0 0 0 1 | x_1 x_2 x_3

Table 1: Illustration of the assertion of Theorem 3.4, for n = 3

Let us consider an arbitrary row r_i of the matrix P_n and let the values in positions i_1, i_2, ..., i_k (i = i_1 < i_2 < ··· < i_k) be ones. Then the set of vectors {α_{i_1}, α_{i_2}, ..., α_{i_k}} ⊆ {0,1}^n is actually the set T(c_i). The vector α_{i_1} = α_i precedes all the rest vectors of T(c_i) and therefore min T(c_i) = {α_i}.

Now let j be the position of the rightmost zero in an arbitrary row r_i of P_n. Let us consider the j-th column of P_n and let the values in positions j_1, j_2, ..., j_m (j_1 < j_2 < ··· < j_m = j) of this column be ones.
This means that each vector from the set {α_{j_1}, α_{j_2}, ..., α_{j_m}} ⊆ {0,1}^n precedes the vector α_{j_m} = α_j. Since the row r_i corresponds to the vector of a monotone function, the zero in position j of r_i implies zeros in positions j_1, j_2, ..., j_{m−1}. Hence {α_{j_1}, α_{j_2}, ..., α_{j_m}} ⊆ F(c_i), and only the last of them, α_j, belongs to max F(c_i). So |max F(c_0)| = 0, and |max F(c_i)| ≥ 1 for i > 0. As we have noted, the vector α_j corresponds bijectively to the clause d_j, whose anti-characteristic vector is α_j.

Example 3.6
Let us consider the row r_3 of P_3 (see Table 1). It corresponds to the conjunction c_3 = x_2 x_3 and has ones in positions 3 and 7. Therefore T(c_3) = {(0, 1, 1), (1, 1, 1)} and min T(c_3) = {(0, 1, 1)}. The rightmost zero in r_3 is in position 6. The sixth column of P_3 has ones in positions {0, 2, 4, 6}. Therefore {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)} ⊆ F(c_3) and (1, 1, 0) ∈ max F(c_3). Actually, max F(c_3) = {(1, 1, 0), (1, 0, 1)}; α_6 = (1, 1, 0) corresponds to the clause d_6 = x_3, and α_5 = (1, 0, 1) – to the clause d_5 = x_2.

The following assertion is symmetrical to Theorem 3.4.
Theorem 3.7
Let α = (a_1, a_2, ..., a_n) ∈ {0,1}^n, #α = i, 0 ≤ i ≤ 2^n − 2, and let α have zeros in the coordinates i_1, i_2, ..., i_r, 1 ≤ r ≤ n, i.e., let α be the anti-characteristic vector of the disjunction d_i = x_{i_1} ∨ x_{i_2} ∨ ··· ∨ x_{i_r} (so it is a monotone function). If we consider d_i as a function of n variables, then the values in its vector are the negated values of the i-th column of the matrix P_n. When #α = 2^n − 1, the negated values of the last column of P_n are the values of the vector of ˜0.

The proof is analogous to the proof of Theorem 3.4 and we omit it.
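The assertions of this section are easy to check numerically. The following Python sketch (ours, purely illustrative) builds P_3 literally from Definition 3.1, then confirms the block structure of Theorem 3.2, the Kronecker-power form from Remark 3.3, the row/conjunction correspondence of Theorem 3.4 (Table 1), the column/disjunction correspondence of Theorem 3.7, and the computations of Example 3.6.

```python
from itertools import product

def precedes(a, b):                      # the relation ⪯: a_i <= b_i for all i
    return all(x <= y for x, y in zip(a, b))

def P_direct(n):                         # Definition 3.1, applied literally
    vecs = list(product((0, 1), repeat=n))
    return [[1 if precedes(a, b) else 0 for b in vecs] for a in vecs]

def P_blocks(n):                         # the block structure of Theorem 3.2
    if n == 1:
        return [[1, 1], [0, 1]]
    prev = P_blocks(n - 1)
    return ([row + row for row in prev] +             # P_{n-1} | P_{n-1}
            [[0] * len(prev) + row for row in prev])  # O_{n-1} | P_{n-1}

def kron(A, B):                          # Kronecker product (Remark 3.3)
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

vecs = list(product((0, 1), repeat=3))
P3 = P_direct(3)

def conj_vector(alpha):                  # truth vector of c_i: AND over ones of alpha
    return [min((b for a, b in zip(alpha, beta) if a == 1), default=1)
            for beta in vecs]            # empty conjunction is the constant ~1

def disj_vector(alpha):                  # truth vector of d_i: OR over zeros of alpha
    return [max((b for a, b in zip(alpha, beta) if a == 0), default=0)
            for beta in vecs]            # empty disjunction is the constant ~0
```

For instance, the rightmost zero of row r_3 of P3 is in position 6, and column 6 has ones exactly in positions 0, 2, 4, 6, as computed in Example 3.6.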
Generating MBFs of n variables in lexicographic order

In Section 1 it was mentioned that the oldest problem in the area of MBFs is the Dedekind's problem – enumerating the MBFs of n variables (or, equivalently, counting all antichains of subsets of a given n-element set). The numerous efforts of the researchers to solve this problem have led to a lot of estimations for |M_n| (from above and below) [12, 21]. An exact formula for the number of MBFs of n variables in the general case is not known. Till now this number was known for 0 ≤ n ≤ 8. Many algorithms use the principle of exhaustive search (generating and counting) for a partial solving of the Dedekind's problem. In [19] a computer program (written in C++) for generating MBFs of n variables, 0 ≤ n ≤
7, is presented. The program is used to test sorting networks, and also for solving the Dedekind's problem for these values of n. It realizes an algorithm described in [21] and based on the following property. Let f, g ∈ M_{n−1}, f and g be given by their vectors, and let f ⪯ g (i.e., f ∨ g = g). If h is the function whose vector is a concatenation of the vectors of f and g (considered as strings), then h ∈ M_n. So generating the functions of M_n requires: (1) all functions of M_{n−1} to be generated and stored, and (2) to check whether f ⪯ g, for all pairs f, g ∈ M_{n−1}. These two characterizations decrease the speed of the algorithm.

Here we propose an algorithm, called Gen, for generating the MBFs of n variables. It was created in 1995; its initial purpose was to compute the value of |M_7|. We have done a series of optimizations and experiments, and in 1999 we had generated about 7% of the functions in M_7 for about 150 hours total time, on several 200 MHz computers performing independent subproblems (for comparison, at this time the functions of M_6 were generated for 6 seconds on a 300 MHz computer). The principle "generating and counting" turned out not so powerful for solving this problem as the mixed (analytical and computational) approach of Yovovic etc. [11, 18]. When we got to know about their results, after a comparison of our partial results with the asymptotic estimations in [12, 21], and after evaluating the total time for the generation, our attempts were canceled.

Algorithm Gen generates the vectors of the functions of M_n in lexicographic order, for given n. It is based on the matrix P_n in the sense of Remark 3.5 and the explanations after it. Namely, if the row r_i of P_n has a zero in position j, i < j < 2^n − 1, then r_i and r_j are incomparable and therefore f = r_i ∨ r_j = c_i ∨ c_j ∈ M_n. After that, if the vector of f has a zero in position k, i < k < 2^n − 1, then it and the row r_k are incomparable, so we can put f = f ∨ r_k = f ∨ c_k ∈ M_n, and so on. More precisely, the algorithm works with the rows of P_n consecutively, starting from the last row. For i = 2^n − 1, 2^n − 2, ..., 1, 0, the algorithm puts f = r_i and outputs it. Thereafter, while the vector of f contains zeros in the positions after the i-th one, the algorithm does the following: it determines the position j of the rightmost zero in f, performs the bitwise disjunction f ∨ r_j, assigns the result to f and outputs it. Thus the vectors of the functions of M_n will be generated lexicographically. The formal description of the algorithm is:

Algorithm Gen.
Generates the MBFs of n variables in lexicographic order.
Input: n.
Output: the vectors of the functions of M_n in lexicographic order.
Procedure:
1) Put f = ˜0. Print f.
2) For each row r_i, i = 2^n − 1, 2^n − 2, ..., 0, put f = r_i and:
a) print f;
b) for each j, j = 2^n − 1, 2^n − 2, ..., i + 1, check the j-th position of f. If f[j] = 0, then put (recursively) f = f ∨ r_j and go to step a).
3) End.

The main part of the realization of Gen, written in Pascal, is:
 1) Program Gen;
 2) .....
 3) Procedure Generate (G: BoolFun; i : integer);
 4) var j : integer;
 5) begin
 6)   for j:= i to dim do               { Disjunction between the i-th row }
 7)     if P[i,j]=1 then G[j]:= 1;      { and the current function G. }
 8)   Print (G);
 9)   for j:= dim-1 downto i+1 do       { Searching a zero for }
10)     if G[j]=0 then Generate (G, j); { the next disjunction. }
11) end; { Generate }
12) Begin { Main }
13)   ...
14)   readln (n);                       { Number of variables }
15)   dim:= 1 shl n - 1;                { dim:= 2^n-1 }
16)   Fill_Matrix;                      { Filling in the matrix P_n }
17)   for k:= 0 to dim do F[k]:= 0;     { Initialization - the constant 0 }
18)   Print (F);
19)   for k:= dim downto 0 do Generate (F, k);
20) End.
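For readers who prefer to experiment outside Pascal, here is our Python port of the same procedure (a sketch, not the author's code: BoolFun becomes a plain list, and Pascal's pass-by-value of G is imitated by copying).

```python
from itertools import product

def gen_monotone(n):
    """Truth vectors of all functions of M_n, in lexicographic order."""
    vecs = list(product((0, 1), repeat=n))
    P = [[1 if all(x <= y for x, y in zip(a, b)) else 0 for b in vecs]
         for a in vecs]                              # the matrix P_n
    dim = 2 ** n - 1
    out = [(0,) * (dim + 1)]                         # the constant ~0

    def generate(G, i):
        G = G[:]                                     # pass-by-value, as in Pascal
        for j in range(i, dim + 1):                  # disjunction with row r_i
            if P[i][j] == 1:
                G[j] = 1
        out.append(tuple(G))
        for j in range(dim - 1, i, -1):              # search a zero for the
            if G[j] == 0:                            # next disjunction
                generate(G, j)

    for k in range(dim, -1, -1):
        generate([0] * (dim + 1), k)
    return out
```

For n = 2 and n = 3 this yields 6 and 20 functions respectively (the Dedekind numbers |M_2| and |M_3|), in lexicographic order and without repetitions.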
Comments on the algorithm and its realization:
1) The procedure Fill_Matrix in row 16 generates and stores in the memory the matrix P_n – in accordance with either Theorem 3.2 or Remark 3.3. In our realization (in Borland Pascal 7.0) 1 ≤ n ≤ 7, and this restriction depends on the realization only; it does not concern the nature of the algorithm. When n > 7, parts (blocks) of the matrix P_n have to be used.
2) Gen actually generates the functions lexicographically, since the cycles in rows 9 and 19 have a decreasing step.
3) Gen can generate only these functions which come lexicographically after a given function h ∈ M_n. For this purpose, in row 17 we have to initiate F = h instead of F = ˜0, and also the cycle in row 19 has to start from i – the position of the leftmost one in the vector of h. Alternatively, if we change only the final value in the same cycle – for example, to be m (0 < m < dim) instead of 0 – then Gen will generate only these functions of M_n which precede lexicographically the function whose vector is r_{m+1}.
4) The maximal number of rows of P_n which are pairwise incomparable determines the depth of the recursion in Gen. The disjunction of these rows gives a function having a maximal number of minimal true vectors, namely max_{f ∈ M_n} |min T(f)| = (n choose ⌊n/2⌋) [4, 5] (this is the size of the longest antichain in the POSet {0,1}^n).
5) Gen has an exponential time-complexity – the nature of the problem is such that the size of the output is always exponential in the size of the input. A more precise classification of such algorithms is given in [9].
Gen uses only the last generated function and a part of a certain row of P_n to generate the next function. It does this in an incremental polynomial time, i.e., the new function is generated in time which is polynomial in the combined size of the input, the last generated function and one row of P_n. So the algorithm runs in a polynomial total time [9], i.e., its running time is a polynomial in the combined size of the input and the output.
6) Gen can be modified easily to generate all antichains of a given POSet. For this purpose it is enough to build a new matrix P′_n representing the corresponding relation for each pair of elements of the POSet.

The problem of determining at least one minimal true vector and/or at least one maximal false vector of an unknown monotone function is an important problem in the area of MBFs [7, 10, 13]. It is closely related to another important problem – the identification of such a function. In the publications in Russian these problems are considered under the assumption that an arbitrary function f ∈ M_n is studied by using some operator A_f, such that A_f(α) = f(α) (i.e., it returns the value of f on α), for α ∈ {0,1}^n [7, 10, 16]. In the papers in English the same problems are investigated in the terminology of computational learning theory, whose goals are to define and study useful models of learning phenomena from an algorithmic point of view [1, 2]. In the investigation of these two problems the learning algorithm asks an oracle two types of queries for an unknown function f ∈ M_n:
– membership queries – whether a selected vector α is a true (or a false) vector of f. The oracle answers "Yes" or "No";
– equivalence queries – whether the unknown function f is equivalent to the hypothesis-function g.
The oracle replies either "Yes", or returns a counterexample (an arbitrary vector α such that f(α) ≠ g(α)).

The problem of determining at least one minimal true vector and/or at least one maximal false vector of an unknown monotone function is solved algorithmically. Effective algorithms, which determine the corresponding vector(s) by using a minimal number of membership queries and having a minimal time-complexity, are searched for (created) in the investigations. The considered problem and its generalization (for k-valued monotone functions) are studied by Katherinochkina [10]. The estimations derived by her show that an algorithm for determining an arbitrary maximal false vector of an unknown f ∈ M_n needs at least (n choose ⌊n/2⌋) references to the corresponding operator A_f. For an unknown f ∈ M_n, Gainanov [7] proposes an algorithm which determines a new vector α, such that either α ∈ min T(f) or α ∈ max F(f). The algorithm works as follows. Let α ∈ {0,1}^n, let its coordinates j_1, j_2, ..., j_k be ones, and let A_f(α) = 1. Let e_i ∈ {0,1}^n be the i-th unit vector (only its i-th coordinate is one, and all the rest are zeros). The algorithm builds the sequence f(α_i), i = 1, 2, ..., k, such that

  α_0 = α,  α_i = α_0 ⊕ f(α_1) e_{j_1} ⊕ ··· ⊕ f(α_{i−1}) e_{j_{i−1}} ⊕ e_{j_i},  i = 1, 2, ..., k.

Let p be the maximal index for which f(α_p) = 1. Then the vector β = α_p is a new vector for min T(f). The case A_f(α) = 0 is treated analogously – the coordinates l_1, l_2, ..., l_r of the zeros in α are considered and the algorithm builds the sequence f(α_i), i = 1, 2, ..., r, such that:

  α_0 = α,  α_i = α_0 ⊕ f̄(α_1) e_{l_1} ⊕ ··· ⊕ f̄(α_{i−1}) e_{l_{i−1}} ⊕ e_{l_i},  i = 1, 2, ..., r.

If q is the maximal index for which f(α_q) = 0, then the vector β = α_q is a new vector for max F(f). The Gainanov's algorithm refers to the operator A_f O(n) times, and its time-complexity is O(n). It is among the most effective algorithms of this type.
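The descent just described is easy to sketch in Python. The following is our reconstruction of the true-vector case only (not the original code), with a plain function playing the role of the operator A_f:

```python
def minimal_true_from(alpha, f):
    """Given f(alpha) = 1, descend to a vector of min T(f).

    Flips the 1-coordinates of alpha to 0 one by one, undoing every
    flip after which f becomes false (one membership query per flip).
    """
    beta = list(alpha)
    for i, bit in enumerate(alpha):
        if bit == 1:
            beta[i] = 0                  # try to remove this coordinate
            if f(tuple(beta)) == 0:
                beta[i] = 1              # undo: f became false
    return tuple(beta)
```

For f = y ∨ xz (the function of Example 2.2), starting from (1, 1, 1) the descent yields (0, 1, 0) ∈ min T(f).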
That is why it is so popular and useful for other algorithms – for example, in identification of monotone functions [4, 5, 15].

Definition 5.1
Let f ∈ M_n. The vector α ∈ min T(f) is called the lexicographically first minimal true vector (LFMT vector, in short) if α precedes lexicographically each other vector β ∈ min T(f). Symmetrically, the vector γ ∈ max F(f) is called the lexicographically last maximal false vector (LLMF vector) if each other vector δ ∈ max F(f) precedes lexicographically γ.

Here we propose an algorithm in two versions, called Search_First (resp.
Search_Last), for determining the LFMT (resp. LLMF) vector of an unknown f ∈ M_n. The algorithm is based on the block structure of the matrix P_n and its properties, given in Theorem 3.2, Theorem 3.4 and the explanations after it. The vector of an arbitrary f ∈ M_n \ {˜0} is a componentwise disjunction of some incomparable rows of P_n. Search_First determines the minimal number of a row among these in the disjunction – so it is the number of the LFMT vector of f. Symmetrically, Search_Last determines the zero component in f having a maximal number – so it is the number of the LLMF vector of f. In both versions we use the variables left and right, denoting the left and the right limit (correspondingly) of the interval for search. Their initial values are: left = 0 and right = 2^n − 1. Search_First asks membership queries for the vectors of {0,1}^n having numbers of the type m = (left + right) div 2, i.e., whether f(α_m) = 1? If "Yes", it puts right = m, otherwise it puts left = m + 1. Search_First computes the next value of m (by the same equality), asks a membership query for α_m again and changes the value of either left or right, and so on, while the condition left < right is true. In other words, the algorithm simply performs a binary search. Theorem 3.2 implies the correctness of this approach – all rows from the upper half of P_n contain 1 in position m = ⌊(0 + 2^n − 1)/2⌋ = 2^{n−1} − 1, and all rows from the lower half of P_n contain 0 in the same position. The same is valid for the blocks P_{n−1} and the corresponding positions: m = 2^{n−2} − 1 (when right is changed), or m = 2^{n−1} + 2^{n−2} − 1 (when left is changed). And so on, until some block P_1 is reached. Here is the code of Search_First, written in Pascal.
Function Search_First (left, right : integer) : integer;
var m : integer;
begin
  while left < right do
  begin
    m:= (left + right) div 2;
    if f[m] = 1 { treated as a membership query }
      then right:= m
      else left:= m+1;
  end;
  Search_First:= left;
end;
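For readers who prefer a runnable form, the same binary search can be sketched in Python (a hypothetical rendering: the membership query "f(α_m) = 1?" is modeled by indexing a 0/1 list that holds the vector of f):

```python
def search_first(f, left, right):
    """Number of the LFMT vector of a monotone f given as a 0/1 list.

    Each access f[m] stands for one membership query "is f(alpha_m) = 1?".
    """
    while left < right:
        m = (left + right) // 2
        if f[m] == 1:
            right = m        # alpha_m is true: a minimal true vector lies at m or before
        else:
            left = m + 1     # alpha_m is false: the LFMT vector lies after m
    return left
```

For example, for n = 3 and f = x1 ∧ x3 (vector [0,0,0,0,0,1,0,1]), search_first(f, 0, 7) returns 5, the number of the minimal true vector 101 – even though the vector itself is not sorted, because only positions of the special form guaranteed by Theorem 3.2 are probed.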
Search Last works in an analogous manner; the membership queries are asked for the vectors having numbers of the type m = ((left + right) div 2) + 1, i.e., whether f(α_m) = 0? If "Yes", the algorithm puts left = m, otherwise it puts right = m − 1, and so on. The explanations of the performance and the correctness of Search Last are analogous to those of the previous one, and its code is similar to the code of Search First, so we omit them.
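Although its Pascal code is omitted here, the simplified Search Last admits an equally short Python sketch (again hypothetical, with the query modeled by list indexing); note that the midpoint is biased to the right by the extra +1, which is what makes this mirrored binary search terminate:

```python
def search_last(f, left, right):
    """Number of the LLMF vector of a monotone f given as a 0/1 list."""
    while left < right:
        m = (left + right) // 2 + 1
        if f[m] == 0:
            left = m         # alpha_m is false: the LLMF vector lies at m or after
        else:
            right = m - 1    # alpha_m is true: the LLMF vector lies before m
    return right
```

For f = x1 ∧ x3 (vector [0,0,0,0,0,1,0,1]) it returns 6, the number of the maximal false vector 110.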
Comments on the algorithms and their realizations:
1) Search First determines the LFMT vector, and Search Last – the LLMF vector of an unknown f ∈ M_n \ {˜0, ˜1} by using n membership queries; their running time is Θ(n). When f = ˜0 (resp. f = ˜1), the LFMT (resp. LLMF) vector does not exist and a simple modification of the functions has to be made – for example, one more membership query has to be asked, or (equivalently): if f = ˜0 and Search Last is started after Search First, then the former function determines ˜0 successfully by only one additional query (we use this approach in identification).
2) The algorithm performs integer divisions only (for computing indices); it does not generate any vectors, and it is known in advance what kind of vector will be searched for and found – contrary to the algorithm of Gainanov.
3) For clarity, in this section we presented a simplified version of the algorithm, working on a completely unknown function f (i.e., there is no partial knowledge about it). In its real (extended) version, the algorithm asks a membership query only when the current vector is unknown (i.e., the value of f on it cannot be deduced from the current partial knowledge).

Example 5.2
The following table illustrates the performance of the algorithm on a sample function f ∈ M_4, treated as unknown. The order of the tested positions (of its vector) is shown in the third row of the table. Those tested by Search First are denoted with Arabic numerals, and those tested by Search Last – with Roman numerals.

Table 3: Illustration of the performance of Search First and Search Last on f(x_1, x_2, x_3, x_4)
We use this example to note that if the real version of Search Last (its code is given in the next section) is started after that of Search First, then its third membership query becomes unnecessary – the value of f in position 14 is defined by the prime implicant c, which is already determined by Search First. These versions are included and controlled by an algorithm for identification, presented in the next section. When the size of the vector decreases and/or there is some partial knowledge about the function, the necessary number of queries and the time-complexity of the algorithm decrease.
4) The performance of Search First and Search Last (in their extended versions they register each new prime implicant/implicate which they find) has the following additional properties.
Proposition 5.3
Each new prime implicant (resp. implicate), which the function Search First (resp. Search Last) finds, absorbs the previous one found by the corresponding function. In the general case, the same does not hold for the set of implicates (resp. implicants) which Search First (resp. Search Last) finds.
Proposition 5.4
Let Search First and Search Last be executed one after the other on the function f ∈ M_n, and let the numbers of the determined LFMT and LLMF vectors be i and j, respectively. When the vector of f:
a) is of size 4 (i.e., f ∈ M_2), or
b) is of the type (0, . . . , 0, 1, . . . , 1) (i.e., i = j + 1), or
c) is of the type (0, . . . , 0, 1, 0, 1, . . . , 1) (i.e., i + 1 = j),
then the sets min T(f) and max F(f) are determined completely in the process of searching.
The truth of these assertions follows directly from the given explanations and comments about the performance of the algorithm (see the order of the tested positions). They can also be proved rigorously by induction on n. The problem for
identification of an unknown MBF f ∈ M_n is also solved algorithmically, which means that the learning algorithm outputs the sets min T(f) and max F(f) – so f is specified completely. When the algorithm uses membership queries only, this is an example of exact learning (of a Boolean theory f) by membership queries, in the terminology of computational learning theory [2, 5, 15]. When the learning model allows both membership and equivalence queries to be asked, there are algorithms (for example in [1, 7]) which determine min T(f) of an unknown f ∈ M_n by using O(n|min T(f)|) membership and equivalence queries in general. Here we consider and discuss only the first learning model.
Some details, characteristics, estimations and criteria for a learning algorithm that uses only membership queries are discussed in [5]. For such an algorithm it is argued that:
1) it must determine both the sets min T(f) and max F(f), although one of them can be obtained from the other – this requires exponential time (in the combined size of both sets) in the general case. Later, in [8] it is shown that determining max F(f) from min T(f) is equivalent to determining the set min T(f^d), where f^d(x_1, x_2, . . . , x_n) = ¯f(¯x_1, ¯x_2, . . . , ¯x_n) is the dual function of f (this problem is known as "Dualization" or "Transversal hypergraph"). In [8] it is proved that this problem can be solved in incremental quasi-polynomial time;
2) its complexity (i.e., the number of asked queries and the time-complexity) has to be evaluated in the combined size of the input n and the output m = |min T(f)| + |max F(f)|, including the time for generating the vectors for the queries. The size of m can become as large as C(n, ⌊n/2⌋) + C(n, ⌈n/2⌉) and polynomiality in n only cannot be expected [4, 5].
The existence of a polynomial total time algorithm for solving the problem Identification is equivalent to the existence of such algorithms for solving many other interesting problems in areas such as hypergraph theory, theory of coteries, artificial intelligence and Boolean theory [3, 14]. These problems still have open complexities, in spite of the numerous investigations. The results of Fredman, Khachiyan and Gurvich [6, 8] show that it is unlikely that these problems are NP-hard.
Makino, Ibaraki, Boros, Hammer and others have studied intensively the complexity of the Identification problem [4, 5, 14, 15]. It is closely related to the complexity of the problem for determining a new vector for some of the sets MT ⊆ min T(f) and MF ⊆ max F(f), representing the partial knowledge about the unknown function at a current stage. The authors solve a restricted problem: they propose some algorithms which decide whether the unknown function f is 2-monotonic or not, and if f is 2-monotonic they identify it in polynomial total time and by using a polynomial number of queries. In [14, 15] the notion of maximum latency is introduced as a measure of the difficulty in finding a new vector.
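The duality fact from item 1) is easy to check by brute force for small n. The following Python sketch (illustrative names; f is given by its full 0/1 vector, and vectors are identified with their numbers 0..2^n − 1) verifies that max F(f) consists exactly of the componentwise complements of the minimal true vectors of the dual f^d:

```python
def min_true(fvec, n):
    """Numbers of the minimal true vectors of a monotone f (u <= v iff u & v == u)."""
    return [v for v in range(1 << n) if fvec[v] == 1
            and all(fvec[u] == 0 for u in range(1 << n) if u != v and u & v == u)]

def max_false(fvec, n):
    """Numbers of the maximal false vectors of a monotone f."""
    return [v for v in range(1 << n) if fvec[v] == 0
            and all(fvec[u] == 1 for u in range(1 << n) if u != v and u | v == u)]

def dual(fvec, n):
    """Vector of the dual function f^d(x) = NOT f(complement of x)."""
    mask = (1 << n) - 1
    return [1 - fvec[mask ^ v] for v in range(1 << n)]

# Example: f = x1 x2 V x3 for n = 3 (positions numbered by x1 x2 x3 read as binary).
n, mask = 3, 7
f = [1 if ((v >> 2) & (v >> 1) & 1) or (v & 1) else 0 for v in range(1 << n)]
assert sorted(max_false(f, n)) == sorted(mask ^ v for v in min_true(dual(f, n), n))
```

Here min T(f) = {001, 110}, max F(f) = {010, 100}, and min T(f^d) = {011, 101} – exactly the complements of max F(f).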
It is shown that if the maximum latency of the unknown function f ∈ M_n is a constant, then an unknown vector (i.e., a vector for which the partial knowledge is insufficient to decide whether it is a true or a false vector) can be found in polynomial time, and there is an incrementally polynomial-time algorithm for identification (the algorithm in [15] uses O(nm) time and O(nm) queries). In [14] it is proved that restricted classes of monotone functions have a constant maximum latency. On the basis of these results, in [17] it is proved that almost all MBFs are polynomially learnable by membership queries.
Here we propose an algorithm, called Identify, which identifies an unknown f ∈ M_n by membership queries only. It is based on the properties of the matrix P_n and uses the algorithm for determining the LFMT and LLMF vectors. We consider the problem of identification as the inverse of the problem of generating the MBFs, i.e., during generation the algorithm Gen combines (by disjunctions) some incomparable rows of P_n, whereas in identification of such a function (considered as unknown) the learning algorithm has to determine (or to separate) these rows. The difficulty in this process is due to the fact that the rows have positions where their elements coincide.
Identify determines both min T(f) and max F(f) of an unknown f ∈ M_n, i.e., the knowledge about f is n (which is the input) and its monotonicity. The algorithm works recursively and follows this main idea. First it determines the LFMT and the LLMF vector of f by using Search First and Search Last (their extended versions). After that it splits f into two subfunctions g, h ∈ M_{n−1}, such that g ⪯ h and the vector of f is the concatenation f = gh of the vectors of g and h (considered as strings) – as was mentioned in Section 4. So the LFMT (resp. LLMF) vector of f, which is found, is the LFMT (resp. LLMF) vector of g (resp. h). The algorithm continues by identifying g and h in the same way as f, i.e., it determines the LLMF vector of g and recursively identifies it by splitting it into two subfunctions (having the properties mentioned above), and then it determines the LFMT vector of h and identifies it in the same way. The recursive splitting into subfunctions continues in accordance with Proposition 5.4, i.e., until a subfunction of one of the types in it is obtained – such a subfunction is identified. In addition to Proposition 5.4 we note that if a subfunction of the type f′ = ˜0h′ or f′ = g′˜1 is obtained, then the subfunction ˜0 (resp. ˜1) is already identified and the algorithm continues with the identification of h′ (resp. g′) only. So Identify is a representative of the "
Divide and conquer" strategy. Its main idea can be seen in its clearest form in the following procedure Id, written in Pascal.

Procedure Id (left, right, lm1, rm0 : integer);
{ left is the initial position, and right is the final }
{ position of the vector of f (or some subfunction of it). }
{ lm1 is the position of the LFMT, rm0 is the position }
{ of the LLMF vector of f (or some subfunction of it). }
var m, t : integer;  { Position of the split; auxiliary position. }
    p0,              { Position of the LLMF vector of g. }
    p1 : integer;    { Position of the LFMT vector of h. }
begin { Tests the cases of Proposition 5.4: }
1)  if right-left <= 3 then exit;  { case a), }
2)  if lm1 > rm0 then exit;        { case b), }
3)  if lm1 = rm0+1 then exit;      { case c). }
4)  t:= rm0-1;  { Test for a subfunction of the form (000...01)^k. }
5)  if (rm0+1 = right) and (GetFunValue (t) = -1) then
    begin
      inc (q);            { Counting the queries by q is included. }
      if F[t] = 0 then    { When unknown - membership query, }
        Reg_Clause (t)    { registers a new clause, or }
      else Reg_Impl (t);  { registers a new implicant. }
    end;
6)  m:= (left+right) div 2;  { Computing the position of the split. }
7)  if lm1 > m then
    begin                               { When f'=0h' - }
      Id (m+1, right, lm1, rm0); exit;  { identifies h'. }
    end;
8)  if rm0 <= m then
    begin                               { When f'=g'1 - }
      Id (left, m, lm1, rm0); exit;     { identifies g'. }
    end;                     { In the rest of the cases, f'=g'h': }
9)  p0:= Search_Last (left, m);      { - searching the LLMF vector of g' }
10) Id (left, m, lm1, p0);           {   and identification of g'; }
11) p1:= Search_First (m+1, right);  { - searching the LFMT vector of h' }
12) Id (m+1, right, p1, rm0);        {   and identification of h'. }
end; {Id}

As the comments in the source show, the "if" operators in rows 1), 2) and 3) test the conditions of Proposition 5.4 for finishing the identification. Those in rows 7) and 8) test whether ˜0 or ˜1 is a subfunction of the current function – then the identification continues with the other half of it. The test in row 5) has not been discussed till now. There are functions (subfunctions) whose vector is of the form (0, . . . , 0, 1)^k, i.e., the vector (0, . . . , 0, 1) repeated k times. In the worst case they can have only one prime implicant and two prime implicates, independently of n. If f is such a function, then m = |min T(f)| + |max F(f)| = 3, and the number of queries necessary for its identification can grow non-polynomially in n and m. The test in row 5) checks the third position from the right, so it recognizes such functions by one additional membership query and prevents the number of queries from growing unnecessarily. For example, without this test the algorithm identifies the function f = (0, 0, 0, 1)^4 ∈ M_4 by using 16 queries, and after including the test the number of queries becomes 10. The function GetFunValue and the procedures Reg_Impl and Reg_Clause in Id are discussed in the following comments.

Comments on the algorithm and its realizations:
1) We wrote several versions of Identify and we did many experiments as well, trying to minimize the number of queries and the time-complexity. Since our first goal was to minimize the number of queries, the older versions generate the matrix P_n and use a partial function (i.e., a hypothesis-function h of n variables; the values of its vector are initially marked as unknown; the algorithm asks queries about f and registers the obtained knowledge in the vector of h, as long as it contains unknown values – thereafter h is completely specified and h = f), similarly to the algorithms in [4, 5, 14, 15]. This approach always implies an exponential time-complexity. Generating and using the matrix P_n is justified when all functions of some large enough set have to be identified.
2) In the last version we do not generate the matrix P_n. Instead, the following function GetP determines and returns the value in the cell p[i, j] of the matrix P_n (the values of the array d are set initially to d[k] = 2^k − 1, for k = 0, 1, . . . , n).

Function GetP (i, j, m : integer) : boolean;
{ Determines the value in the cell p[i,j] of the matrix P_m. }
{ GetP returns "true" when p[i,j]=1, or "false" when p[i,j]=0. }
begin
  GetP:= false;
  while m >= 1 do
  begin
    if i > j then       { If p[i,j] is under the main diagonal }
      exit;             { of P_m - returns "false". }
    if (i = j) or (m = 1) then  { If p[i,j] is on the main }
    begin                       { diagonal of P_m, or p[i,j] is }
      GetP:= true; exit;        { over this of P_1 - returns "true". }
    end;
    m:= m-1;          { Checks in which block of P_m the cell p[i,j] is: }
    if i > d[m] then  { if it is in the lower half, }
      i:= i-d[m]-1;
    if j > d[m] then  { if it is in the right half, }
      j:= j-d[m]-1;
  end;                { the corresponding block is chosen. }
end; { GetP }
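For testing purposes, here is a direct Python transcription of GetP (a sketch, with the array d inlined as d[m] = 2^m − 1). The block structure also yields a compact characterization that the transcription can be checked against: p[i, j] = 1 exactly when the binary representation of i is a bitwise subset of that of j, i.e., when i AND j = i.

```python
def get_p(i, j, m):
    """Value of the cell p[i, j] of the matrix P_m (True for 1, False for 0)."""
    while m >= 1:
        if i > j:                 # under the main diagonal of P_m
            return False
        if i == j or m == 1:      # on the main diagonal of P_m, or inside P_1
            return True
        m -= 1
        d = (1 << m) - 1          # d[m] = 2^m - 1
        if i > d:                 # cell lies in the lower half of P_m
            i -= d + 1
        if j > d:                 # cell lies in the right half of P_m;
            j -= d + 1            # descend into the corresponding block
    return False

# Cross-check against the bitwise-subset characterization for P_4:
assert all(get_p(i, j, 4) == ((i & j) == i)
           for i in range(16) for j in range(16))
```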
Theorem 3.2 implies the correctness of the function GetP. The "while" loop in it is repeated at most m times, hence the time-complexity of GetP is O(n) for P_n (of dimension 2^n × 2^n). The recursive version of GetP is more compact, but it runs a bit slower.
3) The second main change in the last version of
Identify is the discarding of the hypothesis-function. It may contain many unnecessary values (for example, those before (resp. after) the first LFMT (resp. last LLMF) vector); many checks and fillings (sometimes of the order of 2^n) have to be done for the registration of each new prime implicant/implicate, which leads to an exponential time-complexity. When Identify needs to know the value of f in its k-th position (i.e., on α_k ∈ {0, 1}^n), it tries to derive it from the partial knowledge about f. Let us denote by TPI(f) (resp. TPC(f)) the set of all temporarily prime (with respect to each other) implicants (resp. clauses) of some subfunctions of f, which are known at the current stage. In accordance with Theorem 3.4 and Theorem 3.7, the algorithm:
a) checks the set TPI(f): if for some c_i ∈ TPI(f), p[i, k] = 1, i.e., GetP(i, k, n) returns "true", then the value in the k-th position is 1; otherwise it goes to step b);
b) checks the set TPC(f): if for some d_j ∈ TPC(f), p[k, j] = 0, i.e., GetP(k, j, n) returns "false", then the value in the k-th position is 0; otherwise it is unknown.
These two steps are realized by the function GetFunValue(k), which returns the value of f in its k-th position: 0, 1, or −1 (unknown). The time-complexity of GetFunValue is O(n(|TPI(f)| + |TPC(f)|)). Although the set TPI(f) (TPC(f)) contains only implicants (clauses) that are prime with respect to each other, we did not succeed in estimating how large they can become. Our experimental results show that there are only a few functions in M_6 for which the maximal size of TPI (or TPC) reached during identification exceeds m of the corresponding function by one. Obviously, when the identification finishes, TPI(f) = min T(f) and TPC(f) = max F(f).
4) Identify analyzes the answer to each membership query and accumulates this partial knowledge by the procedures Reg_Implicant and Reg_Clause. Their parameter, the number k, is that of the just found new implicant or clause. In accordance with Theorem 3.4 and Theorem 3.7, the data representation of the implicants and the implicates consists of their numbers only, which are stored in two arrays corresponding to the sets TPI(f) and TPC(f). Following Proposition 5.3, each new implicant/clause is not necessarily prime, so its registration consists of: (1) including it in the corresponding array, and (2) excluding from the array those elements which precede (are absorbed by) the new one. Step (1) runs in constant time; step (2) uses the function GetP to check the precedences at most |TPI(f)| (resp. |TPC(f)|) times. So the time-complexity of Reg_Implicant (resp. Reg_Clause) is O(n|TPI(f)|) (resp. O(n|TPC(f)|)).
5) Identify uses the real versions of
Search First and Search Last. The algorithm executes them to determine the LFMT and the LLMF vector of f before calling the procedure Id. Here is the real code of Search Last.

Function Search_Last (left, right : integer) : integer;
var m : integer;
    found : boolean;
begin
  found:= false;
  while left < right do
  begin
    m:= (left+right) div 2 + 1;
    case GetFunValue (m) of  { Checks the value of f in the m-th position. }
      0 : left:= m;
      1 : right:= m-1;
     -1 : begin              { When this value is unknown - }
            if f[m] = 0 then { membership query: }
            begin            { - a new prime clause is found; }
              left:= m; found:= true;
            end
            else
            begin            { - a new implicant is found. }
              right:= m-1; Reg_Impl (m);
            end;
            inc (q);         { Counting the queries by q is included. }
          end;
    end;
  end;
  Search_Last:= right;
  if found then Reg_Clause (right);
end; {Search_Last}

As the previous version of
Search Last, this one asks at most n membership queries to determine the LLMF vector of the function f, whose vector is of size 2^n (in fact, the real number of queries can be considerably smaller because of the partial knowledge). The time-complexity of this version is determined by the "while" loop, which is executed exactly n times. Then the function GetFunValue is executed n times, the function Reg_Impl is executed at most n − 1 times and Reg_Clause is executed once. So the time-complexity of Search Last is O(n.O(n(|TPI(f)| + |TPC(f)|)) + (n − 1).O(n|TPI(f)|) + O(n|TPC(f)|)) = O(n^2(|TPI(f)| + |TPC(f)|)). The real code of Search First is analogous to that of Search Last and it has the same complexity. If we registered only the prime implicants/implicates which they find, then always TPI(f) ⊆ min T(f) and TPC(f) ⊆ max F(f), and their sizes would be bounded by m. But then the algorithm would ask membership queries for one and the same vectors more than once, and the number of queries would grow enormously.
These versions of Search First and Search Last do not generate any test-vectors. They compute the numbers of these vectors only, in constant time.
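The two-step lookup performed by GetFunValue (comment 3 above) can also be stated directly in terms of bitmasks instead of the matrix indexing – a hypothetical sketch, where tpi and tpc hold the numbers of the currently known implicants and clauses:

```python
def get_fun_value(k, tpi, tpc):
    """Derive the value of a monotone f at position k from partial knowledge.

    tpi: numbers of known true vectors (implicants); tpc: numbers of known
    false vectors (clauses).  Returns 1, 0, or -1 (unknown).
    """
    for i in tpi:
        if i & k == i:     # alpha_i <= alpha_k componentwise, hence f(alpha_k) = 1
            return 1
    for j in tpc:
        if k & j == k:     # alpha_k <= alpha_j componentwise, hence f(alpha_k) = 0
            return 0
    return -1
```

For n = 2 with partial knowledge tpi = [3], tpc = [1] (i.e., f(11) = 1 and f(01) = 0 are known), position 0 is derivable as 0, position 3 as 1, and position 2 remains unknown, so a membership query would be asked for it.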
Comments on the complexity of the algorithm
For the current version of Identify we have not succeeded in estimating the number of queries used in the general case. We have considered some reasons (concerning the performance of Search First and Search Last and its features) to assume that the number of queries is polynomial in n and m, but we can neither prove nor disprove this. We note that each LFMT and LLMF vector of the function/subfunction which the algorithm finds brings maximal information about it (compared with any other vector). Also, both functions ask a membership query only in case of necessity, i.e., when the value of the unknown function on a given vector cannot be derived from the current partial knowledge about it and its monotonicity.
In the current version of the algorithm we removed some causes of exponential time-complexity. The assertions of Proposition 5.4 and some other considerations above simplify and speed up the identification of many functions. So the time-complexity of the algorithm decreases (compared with the previous versions), but it remains exponential in the general case. The worst cases for the algorithm are functions where the recursion stops only when a subfunction of two variables is reached. Obviously, they contain subfunctions which are one and the same, but the algorithm does not recognize them – so it identifies each of them separately and independently. We note that in such cases some subfunctions can be determined completely by the partial knowledge about the already identified ones. Then Search First and Search Last do not ask a new query. For example, 17 functions in M_3 are identified in the process of determining their LFMT and LLMF vectors, but the remaining three functions have an unknown vector – so they need either Search First or Search Last to be executed one more time.
Experimental results
We have done many experiments for the identification of all functions in M_n, 1 ≤ n ≤ 6, treated as unknown. They are generated in advance by the algorithm Gen and are written in a file; the prime implicants of each function are also written. For example, on a 2.4 GHz processor and for n = 6, the running time for generating these functions is 9 minutes and 2 seconds. For the same parameters, the identification (including some checks and collecting data for statistics) takes 26 minutes and 17 seconds. The experimental results show that for 1 ≤ n ≤ 6, the functions of M_n are identified by no more than nm queries (the unique exception is ˜0, which is identified by nm + 1 queries). Table 4 represents the maximal (q_max) and the average (q_ave) number of membership queries used for identification, depending on n.

Table 4: q_max and q_ave in the identification of the functions of M_n, 1 ≤ n ≤ 6

The diagram in the first figure represents how many MBFs are identified by the corresponding number of queries. The second diagram represents how many MBFs have one and the same ratio q/(nm) (in %), where q is the number of asked queries and m = |min T(f)| + |max F(f)| is obtained in identification, for each f ∈ M_6 \ {˜0}.

Figure 1: Results of identification of all MBFs of 6 variables
Figure 2: MBFs of 6 variables with one and the same ratio q/(nm) (in %) obtained in their identification

In this work we presented our investigations on three important problems concerning MBFs. We introduced one matrix structure and derived some of its combinatorial and algorithmic properties. They were used as a base for building three algorithms, which was the main reason to consider these problems from a common point of view. Solving the second and the third problems raised some questions concerning the complexity of the corresponding algorithms, which remain open. They will be subject to our future investigations. We believe that the proposed approach, matrix structure and algorithms have more (and better) properties and capabilities than those which we succeeded in obtaining and presenting here.

References
[1] D. Angluin, Queries and concept learning, Machine Learning, 2 (1988) 319–342.
[2] D. Angluin, L. Hellerstein and M. Karpinski, Learning read-once formulas with queries,
J. of the ACM, 40 (1993) 185–210.
[3] J. C. Bioch, T. Ibaraki, Complexity of identification and dualization of positive Boolean functions, Inform. and Comput., 123 (1995) 50–63.
[4] E. Boros, P. Hammer, T. Ibaraki, K. Kawakami, Identifying 2-monotonic positive Boolean functions in polynomial time, Lecture Notes in Comp. Sci., 557 (1991) 104–115.
[5] E. Boros, P. Hammer, T. Ibaraki, K. Kawakami, Polynomial time recognition of 2-monotonic positive Boolean functions given by an oracle, SIAM J. Comput., 26 (1) (1997) 93–109.
[6] M. Fredman, L. Khachiyan, On the complexity of dualization of monotone disjunctive normal forms, J. of Algorithms, 21 (1996) 618–621.
[7] D. N. Gainanov, On the criterion of the optimality of an algorithm for evaluating monotonic Boolean functions, USSR Comput. Math. and Math. Physics, 24 (1984) 1250–1257, (or pp. 176–181 in the same issue in English).
[8] V. Gurvich, L. Khachiyan, On generating the irredundant conjunctive and disjunctive normal forms of monotone Boolean functions,
Discrete Appl. Math.
[9] Inform. Proc. Letters, 27 (1988) 119–123.
[10] N. Katerincochkina, Searching of the maximal upper zero for one class of monotone functions in k-valued logic, Reports of AS USSR, 234 (4) (1977) 746–749, (in Russian).
[11] G. Kilibarda, V. Yovovic, On the number of monotone Boolean functions with fixed number of lower units,
Intellektualnye sistemy, 7 (1–4) (2003) 193–217, (in Russian).
[12] A. D. Korshunov, On the number of monotone Boolean functions, Problemy Kibernetiki, 38 (1981) 5–108, (in Russian).
[13] M. Kovalev, P. Milanov, Monotone functions of multivalued logic and supermatroids, USSR Comput. Math. and Math. Physics, 24 (5) (1984) 786–789.
[14] K. Makino, T. Ibaraki, The maximum latency and identification of positive Boolean functions,
SIAM J. Comput., 26 (1997) 1363–1383.
[15] K. Makino, T. Ibaraki, A fast and simple algorithm for identifying 2-monotonic positive Boolean functions,
J. of Algorithms, 26 (1998) 291–305.
[16] N. A. Sokolov, On the optimal identification of monotone Boolean functions, USSR Comput. Math. and Math. Physics, 22 (2) (1982) 449–461.
[17] I. Shmulevich, A. Korshunov, J. Astola, Almost all monotone Boolean functions are polynomially learnable using membership queries, Inform. Proc. Letters, 79 (2001) 211–213.
[18] V. Yovovic, G. Kilibarda, On the number of Boolean functions in the Post classes F_µ, Discrete Math. and Applications, 9 (1999) 563–586.
Internet references
[19] Ron Zeno's site (Ramblings in mathematics and computer science).
[20] http://mathpages.com/home/kmath030.htm – Dedekind's Problem.
[21] Generating the Monotone Boolean Functions.
[22] http://mathpages.com/home/kmath515.htm