arXiv [cs.CC]
Sampling and Complexity of Partition Function ∗
Chuyu Xiong
Independent researcher, New York, USA
Email: [email protected]
February 3, 2021
Abstract
The number partition problem is a well-known problem and one of Karp's 21 NP-complete problems [14]. The partition function is a boolean function that is equivalent to the number partition problem with the range of the numbers restricted. Understanding the computational complexity of the number partition problem and of the partition function is important and hard. It has been suggested that new tools and methods are needed for such problems [17]. In our recent research on universal learning machines [3, 4], we developed some tools, namely fitting extremum, proper sampling set, and boolean function with parameters (used in trial-and-error fashion). We found that these tools can be applied to the partition function. In this article, we discuss the setup of the partition function, its properties, and the tools to be used. This approach leads us to prove that the lower bound of the computational complexity of the partition function, as well as the lower bound of the computational complexity of the number partition problem, is exponential in the size of the problem. This implies $P \neq NP$ [15].

Keywords: Number Partition Problem, Partition Function, Fitting Extremum, Proper Sampling Set, Boolean Function with Parameters, P vs. NP
It is a great pleasant thing to learn and practice often! — Confucius
Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. ...... once you get there, you can move mountains. — Steve Jobs
The number partition problem is a very famous problem [10]. It can be stated in one short sentence: given a set of natural numbers $\Omega$, can we divide $\Omega$ into two subsets $\Omega_1$ and $\Omega_2$ so that the sum of the numbers in $\Omega_1$ equals the sum of the numbers in $\Omega_2$?

Example 1.1 (Number Partition). Given $\Omega$ = { , , , , , , , , }. If we set $\Omega_1$ = { , , , , } and $\Omega_2$ = { , , , }, then $\sum \Omega_1 = \sum \Omega_2 = 9$. In this case, the partition problem has a positive answer. Note that another partition is $\Omega_1$ = { , , , , } and $\Omega_2$ = { , , , }.

For the set $\Omega$ = { , , , , }, the partition problem clearly has a negative answer, since the sum of $\Omega$ is odd.

But for the set $\Omega$ = { , , , , }, the partition problem has a negative answer even though the sum of $\Omega$ is even.

Clearly, a given set of integers $\Omega$ either can be equally partitioned or cannot. Thus we have a boolean function, called the partition function, and the number partition problem is equivalent to the evaluation of the partition function.

∗ Many thanks to my wife for her wholehearted support. Thanks to the Internet and to those who contribute research content to it.
Detailed references on the partition problem can be found in [10]. The number partition problem is NP-complete [14, 12]. Yet the partition problem seems to be relatively easier than other NP-complete problems. It is often referred to as "the easiest hard problem" [9]. There are many studies of this problem. Here, we are particularly interested in the following question: what is the complexity of the partition function? Since a boolean function can be realized by a boolean circuit, the equivalent question is: what is the complexity of the boolean circuit that realizes the partition function?

Such a question is not easy. In order to address it, we need new tools and a new approach. Scott Aaronson said it quite elegantly: "find new, semantically-interesting ways to 'hobble' the classes of polynomial-time algorithms and polynomial-size circuits, besides the ways that have already been studied, such as restricted memory, restricted circuit depth, monotone gates only, arithmetic operations only, and restricted families of algorithms..... Any such restriction that one discovers is effectively a new slope that one can try to ascend up the $P \neq NP$ mountain." [17] We are highly encouraged by this line of thinking. But still, what are the new tools?

Our major research interests in recent years have been the universal learning machine and related problems [1, 3]. A universal learning machine is a machine that can learn any pattern from data without human intervention. In our setting, a universal learning machine has a conceiving space, and inside the conceiving space there are many X-forms. An X-form is nothing but a boolean function together with the subjectivity of the machine. Learning is to generate new X-forms and/or modify existing X-forms to fit the data. Thus, we need to look deeply into the relationship between data and boolean functions. Going along this path, we found a set of tools, called Fitting Extremum and Proper Sampling Set [4].

Another set of tools is also related to the learning machine. We tried to study the subjectivity and the dynamic action of the machine [6]. In this process, we found Boolean Function with Parameters, which can be traced back to Kugel's Putnam-Gold machine [19, 20]. In our studies of these tools, we found that, luckily, they can be used on the partition function and can give us insights into its computational complexity. In this article, we try to use these tools on the complexity of the partition function.

This article is arranged in the following way. In section 2, we define the partition vector and give a detailed definition of the partition function. Here, one very useful lemma, "Uniqueness of Partition Vector", is given, which reflects the deep nature of the partition problem. In section 3, we introduce the tool Boolean Function with Parameters. In section 4, we review Fitting Extremum (FE) and Proper Sampling Set (PSS), and 2 theorems (PSS implies circuit, circuit implies PSS). We also discuss some examples, which can help us to understand the partition function. In section 5, we use the tools on the partition function, which leads us to the conclusion $P \neq NP$. In section 6, we put forward some further thoughts that can help us to understand the methods and tools better.
First, we need an exact definition of the partition function, and we need to establish the relationship between the number partition problem and boolean functions. We will do so in several steps. First, we introduce the partition vector, which helps us to describe the details of a partition of a group of numbers. A partition vector $p$ is a vector with components 1 or $-1$. This is natural: the components 1 and $-1$ divide a group of numbers into 2 groups, i.e. $p$ represents a way to partition.

Definition 2.1 (Partition Vectors). A partition vector $p$ of length $N$ is a vector $p = (p_1, p_2, \ldots, p_N)$ with components $p_i = \pm 1$, $i = 1, 2, \ldots, N$. For a group of natural numbers $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$, the quantity $\sum_{i=1}^{N} p_i \alpha_i$, denoted $\langle p, \Omega \rangle$, represents the result of the partition of $\Omega$ by $p$. It is easy to see that $\Omega$ is equally partitioned by $p$ if and only if $\langle p, \Omega \rangle = 0$. We call the set of all partition vectors of length $N$ the Partition Vector Space, denoted $PV_N$, or just $PV$.

Note that if $p$ equally partitions a group of numbers $\Omega$, so does $-p$. Actually, $p$ and $-p$ represent exactly the same partition. Thus, we will only consider partition vectors with $p_1 = 1$. The simple examples below help us to see how partition vectors are related to partitions of a group of numbers.

Example 2.1 (Partition Vectors).
Given $\Omega$ = { , , , , , , , , }. Consider the partition vector $p = (1, -1, -1, 1, 1, -1, 1, 1, -1)$. Then $\langle p, \Omega \rangle = \sum_i p_i \alpha_i = 0$, i.e. $\Omega$ is equally partitioned by $p$. Another partition vector $p' = (1, 1, 1, 1, 1, -1, -1, -1, -1)$ also equally partitions $\Omega$, since $\langle p', \Omega \rangle = 0$. That is to say, for this $\Omega$ there is more than one partition vector that equally partitions it.

Given $\Omega$ = { , , , , } and the partition vector $p = (1, -1, 1, -1, -1)$, we have $\langle p, \Omega \rangle = \sum_i p_i \alpha_i = -1$. It is also easy to see that for any partition vector $p$, $\langle p, \Omega \rangle \neq 0$.

Given $\Omega$ = { , , , , }. The partition vector $p = (1, 1, 1, -1, -1)$ equally partitions $\Omega$. Moreover, $p$ is the only partition vector that does so; no other partition vector equally partitions $\Omega$. In this case, we say $\Omega$ is uniquely equally partitioned by $p$, or $p$ is the unique partition vector for $\Omega$. Note that when we say unique, we exclude the case $p = (-1, -1, -1, 1, 1)$, since we only consider partition vectors with $p_1 = 1$.

In the definition, the components of a partition vector have values $\pm 1$. This is natural, since we can directly apply multiplication and sum. Note that $\pm 1$ can be identified with boolean values: $1 \leftrightarrow 1$, $-1 \leftrightarrow 0$. So, if $q \in B^N$ is a normal boolean vector, it is equivalent to a partition vector $p \in PV_N$, and in exactly this sense we also say that a boolean vector $q$ is a partition vector. Thus, we can define $\langle q, \Omega \rangle = \langle p, \Omega \rangle$ for a group of numbers $\Omega$ and a boolean vector $q \in B^N$. We are going to use this notation a lot.

Note that when we say "a group of numbers", the numbers are natural numbers. But we want to use boolean functions to study the problem. For this purpose, we need to restrict the range of the numbers, for example "numbers are less than 4", or "numbers are less than $2^N$", etc. Actually, such a range of numbers plays a crucial role in the number partition problem [9].

The partition vector for a group of numbers is important and interesting. Let us consider some examples.

Example 2.2 (Unique Partition Vector). First, consider numbers in this range: integers less than 4. Take a partition vector, say $p = (1, 1, -1, -1, -1)$. It is easy to see that $p$ equally partitions $\Omega$ = { , , , , }. However, this $p$ is not unique for $\Omega$. Another partition vector $p' = (1, -1, 1, -1, -1)$ also equally partitions $\Omega$.

Then consider the range of numbers: integers less than 8. Still consider the same partition vector $p = (1, 1, -1, -1, -1)$. It is easy to see that $p$ equally partitions $\Omega$ = { , , , , }. This time, $p$ is the unique partition vector of $\Omega$.

Above, we first have a partition vector $p$, then we look for a set of numbers $\Omega$ (in a certain range), check whether $p$ equally partitions $\Omega$, and then check whether $p$ is unique. The range of numbers is important. We can also go in the opposite direction, i.e. given a set of numbers $\Omega$, look for a partition vector $p$ so that $\langle p, \Omega \rangle = 0$, and if there is one, ask whether $p$ is unique. As one example, given $\Omega$ = { , , , , }, look for $p$ with $\langle p, \Omega \rangle = 0$. It is easy to see that $p = (1, -1, -1, -1, -1)$ is one, and $p$ is unique, i.e. for any other $p' \neq p$, $\langle p', \Omega \rangle \neq 0$.

The above simple examples present us with the following question: given a partition vector $p$, can we find a set of numbers $\Omega$ so that $p$ equally partitions $\Omega$? For such $\Omega$, is $p$ unique? We can also ask the opposite question: given a set of numbers $\Omega$, can we find a partition vector $p$ that equally partitions $\Omega$? If so, is $p$ unique? We notice one important property: for some ranges of numbers, it is hard to find a unique partition vector, while for other ranges of numbers, the partition vector is often unique. Such a property is deeply related to the number partition problem. In fact, [8] discussed this property in some way. The following lemma addresses these questions and this property.

Lemma 2.1 (Uniqueness of Partition Vector).
For any $N > 2$ and any partition vector $p$ of length $N$, we can find a group of numbers $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$ with range $1 \le \alpha_i < 2^N$ so that $\langle p, \Omega \rangle = 0$ and $p$ is the unique partition vector with this property, i.e. for any other partition vector $p' \neq p$ we must have $\langle p', \Omega \rangle \neq 0$.

Proof: For a given partition vector $p$, we are going to find a group of numbers $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$ that satisfies the 2 conditions.

At first, we consider some special partition vectors, of the form $p = (p_1, p_2, p_3, \ldots, p_N)$ with $p_1 = \ldots = p_J = 1$ and $p_{J+1} = \ldots = p_N = -1$, where $1 \le J \le [N/2]$ and $[N/2]$ is the integer part of $N/2$. That is to say, such a partition vector has its first $J$ components equal to 1 and the rest of its components equal to $-1$. There are $[N/2]$ such partition vectors in total. We consider these partition vectors separately.

For $J = 1$, $p = (1, -1, -1, \ldots, -1)$, i.e. $p_1 = 1$ and all other $p_j = -1$. For this partition vector $p$, we choose the group of numbers $\Omega = \{\alpha_1, 2^{N-1}, 2^{N-2}, \ldots, 2\}$, where $\alpha_1 = 2^N - 2$. Clearly, $\sum_{j=1}^{N-1} 2^j = 2^N - 2 = \alpha_1$. So we have $\langle p, \Omega \rangle = 0$. For uniqueness, consider the sum of all numbers of $\Omega$: $\alpha_1 + 2^{N-1} + 2^{N-2} + \ldots + 2 = 2(2^N - 2)$. So the sum of the positive part (or the negative part) must be $2^N - 2$. Thus, the only possible partition vector is $p$, i.e. if $p' \neq p$, we must have $\langle p', \Omega \rangle \neq 0$.

For $J = 2$, $p = (1, 1, -1, \ldots, -1)$, i.e. $p_1 = p_2 = 1$ and all other $p_j = -1$. For this partition vector $p$, we choose the group of numbers $\Omega = \{\alpha_1, \alpha_2, 2^{N-1}, 2^{N-2}, \ldots, 4\}$, where $\alpha_1 = 2^N - 5$, $\alpha_2 = 1$. Clearly, $\sum_{j=2}^{N-1} 2^j = 2^N - 4 = \alpha_1 + \alpha_2$. So $\langle p, \Omega \rangle = 0$. For uniqueness, consider the sum of all numbers of $\Omega$: $\alpha_1 + \alpha_2 + 2^{N-1} + 2^{N-2} + \ldots + 4 = 2(2^N - 4)$. So the sum of the positive part (or the negative part) must be $2^N - 4$. Thus, the only possible partition vector is $p$, i.e. if $p' \neq p$, we must have $\langle p', \Omega \rangle \neq 0$.

Generally, for any $2 \le J \le [N/2]$, $p = (1, 1, \ldots, 1, -1, \ldots, -1)$, i.e. $p_1 = \ldots = p_J = 1$ and $p_{J+1} = \ldots = p_N = -1$. For this partition vector $p$, we choose the group of numbers $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_J, 2^{N-1}, 2^{N-2}, \ldots, 2^J\}$, where $\alpha_1 = 2^N - 2^J - 2^{J-1} + 1$, $\alpha_2 = 2^{J-2}, \ldots, \alpha_J = 1$. Clearly, $\sum_{j=J}^{N-1} 2^j = 2^N - 2^J = \sum_{j=1}^{J} \alpha_j$. So $\langle p, \Omega \rangle = 0$. For uniqueness, consider the sum of all numbers of $\Omega$: $\alpha_1 + \alpha_2 + \ldots + \alpha_J + 2^{N-1} + 2^{N-2} + \ldots + 2^J = 2(2^N - 2^J)$. So the sum of the positive part (or the negative part) must be $2^N - 2^J$. The part containing $\alpha_1$ still needs $2^{J-1} - 1$, which is smaller than every $2^j$ with $j \ge J$ and can only be obtained by taking all of $\alpha_2, \ldots, \alpha_J$. Thus, the only possible partition vector is $p$, i.e. if $p' \neq p$, we must have $\langle p', \Omega \rangle \neq 0$.

Then we consider a general partition vector $p$ whose number of 1's is $\le [N/2]$, with $p_1 = 1$. We can permute the components of $p$ (keeping the first component unchanged) to get a partition vector $p'$, where $p'$ is one of the special partition vectors discussed above. For example, $p = (1, -1, 1, -1, 1) \to p' = (1, 1, 1, -1, -1)$. For $p'$, we have $\Omega'$ as discussed above. Then, by the reverse permutation on $\Omega'$, we get $\Omega$. It is easy to see that such $p$ and $\Omega$ satisfy: 1) $\langle p, \Omega \rangle = 0$, 2) $p$ is the unique partition vector with this property.

Finally, we consider a general partition vector $p$ whose number of 1's could be bigger than $[N/2]$. In this case we consider $-p$, i.e. we change 1 to $-1$ and $-1$ to 1. We can then get the $\Omega$ for this $p$, with: 1) $\langle p, \Omega \rangle = 0$, 2) $p$ is the unique partition vector with this property. ∎

In the lemma, for one partition vector $p$ we choose one $\Omega$ with the 2 properties. But for $p$ there are many more $\Omega$ with the 2 properties, especially if the range of numbers becomes bigger. Conversely, if the range of numbers becomes smaller, we might not be able to choose an $\Omega$ with the properties. This is related to the discussions in [9].

Here, we should make one note. In the above lemma, we consider the range of numbers $1 \le \alpha_i < 2^N$. But actually we need to consider the range $0 \le \alpha_i < 2^N$, as we will see in the later discussion of the partition problem. This is not a problem: the lemma still holds with only a little modification. If some $\alpha_i = 0$ in $\Omega$, then the partition vector is not unique anymore, since $p_i$ can be either 1 or $-1$. But other than this, everything else remains true.

Partition vectors will make our descriptions of the partition function much easier. However, we also need to set a restriction on the range of the numbers, since we cannot use all integers (of which there are infinitely many).
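The construction in the proof of Lemma 2.1 can be checked by brute force for small $N$. The sketch below is only an illustration, not part of the original argument: the function names (`inner`, `special_omega`, `equal_partition_vectors`) are hypothetical, and the formula for $\alpha_1, \ldots, \alpha_J$ is the one read off from the proof (assumed here to be $\alpha_1 = 2^N - 2^J - 2^{J-1} + 1$, followed by $2^{J-2}, \ldots, 1$ and then $2^{N-1}, \ldots, 2^J$).

```python
from itertools import product


def inner(p, omega):
    """<p, Omega>: the signed sum of the numbers in omega under partition vector p."""
    return sum(pi * a for pi, a in zip(p, omega))


def special_omega(n, j):
    """Numbers for the special vector with j leading +1 entries (1 <= j <= n // 2).

    Assumed reading of the proof: alpha_1 = 2^n - 2^j - 2^(j-1) + 1,
    alpha_2..alpha_j = 2^(j-2), ..., 1, then 2^(n-1), ..., 2^j.
    """
    alpha_1 = 2**n - 2**j - 2**(j - 1) + 1
    small = [2**e for e in range(j - 2, -1, -1)]     # alpha_2 .. alpha_j
    large = [2**e for e in range(n - 1, j - 1, -1)]  # 2^(n-1) .. 2^j
    return [alpha_1] + small + large


def equal_partition_vectors(omega):
    """All partition vectors with first component 1 that equally partition omega."""
    n = len(omega)
    found = []
    for tail in product((1, -1), repeat=n - 1):
        p = (1,) + tail
        if inner(p, omega) == 0:
            found.append(p)
    return found


if __name__ == "__main__":
    n = 6
    for j in range(1, n // 2 + 1):
        omega = special_omega(n, j)
        print(j, omega, equal_partition_vectors(omega))
```

For each $J$ the search is expected to return exactly one vector, the special vector with $J$ leading 1's. The enumeration visits $2^{N-1}$ vectors, so it is only practical for small $N$.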
Definition 2.2 (Partition Function of Integers). For an integer $N > 2$ and a natural number $M \ge 1$, define a function $Par$ as below:

$$Par : \{\alpha_1, \alpha_2, \ldots, \alpha_N\} \to B, \quad 0 \le \alpha_i \le M.$$

The value of $Par$ is: if there is a partition vector $p$ of length $N$ that equally partitions $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$, i.e. $\langle p, \Omega \rangle = 0$, then $Par(\Omega) = 1$; otherwise $Par(\Omega) = 0$. We call such a function the partition function with range $M$.

The partition function captures the number partition problem, i.e. if the partition problem has a positive answer, the partition function $Par = 1$; otherwise, $Par = 0$. Thus, to study the partition function is equivalent to studying the partition problem. But here the partition problem is modified: the numbers are not chosen from all integers, but only from natural numbers not exceeding $M$.

For a given $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$, we can traverse all partition vectors $p \in PV$: if for one $p$, $\langle p, \Omega \rangle = 0$, then we stop, and $Par(\Omega) = 1$; if for every $p \in PV$, $\langle p, \Omega \rangle \neq 0$, then $Par(\Omega) = 0$. Note that $PV$ contains $2^{N-1} - 1$ partition vectors in total, a number exponential in $N$.

Here, we remark on the number of partition vectors in $PV$. As discussed above, we only consider partition vectors whose first component is 1. And we do not consider the partition vector whose components are all 1, since this partition vector does not represent a true partition. Thus, it is easy to see that the number of all possible partition vectors is $2^{N-1} - 1$.

If we take $M$ in Def. 2.2 as $2^K - 1$, i.e. a $K$-bit integer, we then define the partition function of $K$-bit integers. Note that if $\alpha_i$ is a $K$-bit integer, then $0 \le \alpha_i \le 2^K - 1$. In this case, the number equivalent to the number $M$ in Definition 2.2 is $2^K - 1$. We go further and turn a set of numbers into a bit array. There are several ways to turn an integer into bits, e.g. unary representation or binary representation. We will use the normal binary representation. That is to say, a $K$-bit integer is represented as a $K$-bit array. For a set of numbers, by concatenating the bit arrays of all numbers in the set, we get a long bit array. Specifically, for a set of $K$-bit numbers $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$, the corresponding bit array has length $KN$.

For example, for the set of numbers $\Omega = \{3, 1, 2, 7\}$, we have $N = 4$, $K = 3$, and $1 = 001$, $2 = 010$, $3 = 011$, $7 = 111$, so $\Omega$ is represented as the bit array 011001010111. In this way, we can define the partition function as a function on bit arrays.

Definition 2.3 (Partition Function of Bit Array).
For an integer $N > 2$ and an integer $K \ge 1$, define a function $Par_{K,N}$ (or just $Par$) as below:

$$Par_{K,N} : B^{KN} \to B, \quad v = Par_{K,N}(x),$$

where $x \in B^{KN}$ is a bit array of length $KN$, and $v \in B$ is a binary value. The value of $Par_{K,N}$ is defined as follows. First, cut $x$ into a set of $K$-bit numbers $\Omega = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$; the cutting is from left to right. Second, if there is a partition vector $p \in PV_N$ that equally partitions $\Omega$, i.e. $\langle p, \Omega \rangle = 0$, then $Par_{K,N}(x) = 1$; otherwise $Par_{K,N}(x) = 0$. We call such a function the partition function of bit array with size $KN$.

We also write $Par$ for $Par_{K,N}$ when there is no confusion. When $K = N$, we even write $Par_N$ for $Par_{N,N}$.

Example 2.3 (Partition Function of Bit Array). Consider a partition function with size $2 \times 6$, i.e. $f : B^{12} \to B$. For any $x \in B^{12}$, we can cut $x$ into 6 pieces, and each piece is a 2-bit integer (from 0 to 3). For example, x = 1101011010001 → { , , , , , }. For this $x$, clearly, $Par_{2,6}(x) = 1$. Another example, x = 1101011010000 → { , , , , , }. For this $x$, we should have $Par_{2,6}(x) = 0$.

Clearly, the partition function with size $KN$ depends on $K$ and $N$. The combination of $N$ and $K$ determines the properties of the function. One special case is $K = 1$. In this case, only 1 bit is used to represent an integer, so the only possible integers are 0 and 1. It is easy to see that in this case the partition function reduces to the parity function. This case is simple but very useful.

As [8] discussed, the complexity of the partition function is determined by $K$ vs. $N$. In particular, when $K \ge N$, the partition function becomes complicated, hence interesting. Thus, we are going to consider $K = N$. So, for each $N$, we have a boolean function with dimension $N^2$.
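Definition 2.3 can be turned into a direct, exponential-time evaluation. The following is a minimal sketch under stated assumptions: bit arrays are represented as Python strings of `0`/`1`, the function names `cut_bits` and `par` are hypothetical, and the all-ones partition vector is skipped, following the earlier remark on the number of vectors in $PV$.

```python
from itertools import product


def cut_bits(x, k):
    """Cut the bit string x, left to right, into integers of k bits each."""
    if len(x) % k != 0:
        raise ValueError("length of x must be a multiple of k")
    return [int(x[i:i + k], 2) for i in range(0, len(x), k)]


def par(x, k):
    """Brute-force Par_{K,N}(x): 1 if some partition vector equally partitions the cut numbers."""
    omega = cut_bits(x, k)
    n = len(omega)
    for tail in product((1, -1), repeat=n - 1):
        if -1 not in tail:
            continue  # the all-ones vector is not a true partition
        p = (1,) + tail  # only vectors with p_1 = 1 are considered
        if sum(pi * a for pi, a in zip(p, omega)) == 0:
            return 1
    return 0


if __name__ == "__main__":
    print(cut_bits("011001010111", 3))  # the bit array from the text, K = 3
    print(par("011001010111", 3))
    print(par("1100", 1), par("1000", 1))  # K = 1: behaves like a parity test
```

With $K = 1$ the numbers are single bits, so the result is 1 exactly when the number of 1 bits is even, matching the remark that the partition function reduces to the parity function in this case.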
But what about the dimensions between the square numbers $N^2$ and $(N+1)^2$? We can define boolean functions of such dimensions as well. For each natural number $L > 0$, we want to have a boolean function $f_L : B^L \to B$ such that, when $L = N^2$, $f_L$ becomes exactly $Par_{N,N}$. How do we define such a sequence of boolean functions? Here is the definition.
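Definition 2.4 (General Partition Function). For an integer $L > 2$, let $K = [\sqrt{L}]$, i.e. $K$ is the integer part of $\sqrt{L}$, and let $N' = [L/K]$, i.e. the integer part of $L/K$. We define a boolean function $GPar_L$ as below:

$$GPar_L : B^L \to B, \quad v = GPar_L(b),$$

where $b \in B^L$ is a bit array of length $L$, and $v \in B$ is a binary value. Note that we have $KN' \le L < K(N'+1)$. We cut $b$, from left to right, into $N'$ numbers of $K$ bits each, plus one last number formed by the remaining $L - KN'$ bits if $L > KN'$; $GPar_L(b) = 1$ if the resulting set of numbers can be equally partitioned, and $GPar_L(b) = 0$ otherwise. When $L = N^2$, we have $GPar_L = Par_{N,N} = Par_N$.

Example 2.4 (Examples of General Partition Function of Bit Array).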
One example: consider $L = 12$, so $K = [\sqrt{L}] = 3$, $N = [L/K] = 4$, and $L = KN$. Thus, we cut the bit array into 4 pieces of 3 bits each. One case: $x = 110101101000 \to \{6, 5, 5, 0\}$. For this $x$, clearly, $GPar_{12}(x) = 0$. Another example: consider $L = 13$, so $K = [\sqrt{L}] = 3$, $N = [L/K] = 4$, and $L = KN + 1$. Thus, we cut the bit array into 5 pieces: the first 4 pieces have 3 bits, and the last piece has 1 bit. One case: $x = 1001011010111 \to \{4, 5, 5, 3, 1\}$. For this $x$, it is easy to see that $GPar_{13}(x) = 1$. ∎

Such a sequence of boolean functions $GPar_L : B^L \to B$, $L = 2, 3, 4, \ldots$, is the topic for computational complexity. We are going to discuss such a sequence of boolean functions. When $L = N^2$, $N = 2, 3, \ldots$, we have $Par_N$.

In this section, we introduce one tool, the boolean function with parameters. This tool emerged when we studied the subjectivity and dynamic action of the machine [6]. In that research, we noticed Kugel's Putnam-Gold machine [19, 20]. Kugel thought that a Turing machine can be used in trial-and-error fashion, and he argued that in this way a Turing machine could do much more. This thought gave us an inspiration: if we introduce boolean functions with parameters and use them in trial-and-error fashion, we have a new way to write boolean functions, and this way is quite powerful for expressing some functions that are otherwise hard to express. So far, we have not seen any reference for this concept, i.e. boolean functions with parameters used in such a fashion. This tool is not particularly hard, but it is quite useful, so we write this section to explain it.

First, consider one very simple example.
Example 3.1 (A Simple Case). $f(x_1, x_2) = x_1 \wedge x_2$ is a simple boolean function from $B^2$ to $B$. It has no parameters. Based on it, we introduce one boolean function with parameters: $f(x_1, x_2, s, t) = x_1 \,{}^{s}\!\wedge^{t}\, x_2$. Here, the parameter $s$ is a boolean variable behaving like a switch: if $s = 1$, do nothing; if $s = 0$, put in a $\neg$. The same holds for $t$. Thus, with $s = 0$ and $t = 1$ the first input is negated: $f(1, x_2, 0, 1) = \neg 1 \wedge x_2 = 0$ and $f(0, x_2, 0, 1) = \neg 0 \wedge x_2 = x_2$. In general, $f(x_1, x_2, 1, 0) = x_1 \wedge \neg x_2$, and $f(x_1, x_2, 0, 1) = \neg x_1 \wedge x_2$.

Definition 3.1 (Boolean Function with Parameters). A boolean function $\varphi : B^N \times B^J \to B$ is called a boolean function on $B^N$ with $J$ binary parameters. We often write such a function as $\varphi(x, p) : B^N \to B$, where $x \in B^N$, $p \in B^J$.

Boolean functions with parameters are used in trial-and-error fashion. Suppose $\varphi$ is a boolean function with parameters, and we have a list of parameter vectors $P = \{p_1, p_2, \ldots, p_K\}$. We can conduct a trial-and-error process: for a given $x$, first try $\varphi$ with $p_1$; if $\varphi(x, p_1) = 1$, the trial succeeds, we stop and $f(x) = 1$; if $\varphi(x, p_1) = 0$, the trial fails, and we continue to try $\varphi$ with $p_2$ in the same way as for $p_1$; if at some $p_k$, $\varphi(x, p_k) = 1$, the trial succeeds at $p_k$ and $f(x) = 1$; if the trial fails for every parameter vector $p_k \in P$ (i.e. $\varphi(x, p_k) = 0$), then $f(x) = 0$. We can see that in this way we get a new boolean function from $\varphi$ and $P$. This is a quite essential process. We formally define it as follows.

Definition 3.2 (Trial with Parameters). Suppose $\varphi : B^N \times B^J \to B$ is a boolean function with $J$ binary parameters, and $P = \{p_1, p_2, \ldots, p_K\}$ is a list of parameter vectors (each $p_k$ is a $J$-dimensional binary vector). We can form a boolean function $f : B^N \to B$ in this way: for a given input $x \in B^N$, if there is a parameter vector $p_k$ so that $\varphi(x, p_k) = 1$, then $f(x) = 1$; otherwise, $f(x) = 0$ (i.e. for every parameter vector $p_k \in P$, $\varphi(x, p_k) = 0$). Such a function $f$ is called $\varphi$ trial with $P$, and we denote it as $f = \varphi \odot P = \varphi \odot \{p_1, p_2, \ldots, p_K\}$.

Here is a note on the symbol $\odot$. It represents the operation of trying a boolean function with parameters on a parameter list. This operation is analogous to "shooting at a target", and the symbol looks like a simplified target. It is quite intuitive.

Example 3.2 (Simple Case of Trial with Parameters). Here is one very simple example. As in Example 3.1, $\varphi(x_1, x_2, s, t) = x_1 \,{}^{s}\!\wedge^{t}\, x_2$ is a boolean function on $B^2$ with 2 parameters. We then have a list of parameters $S = \{(1, 0), (0, 1)\}$. It is quite easy to see that $\varphi$ trial with $S$ equals the parity function, i.e. $\varphi \odot S = x_1 \oplus x_2$.

In fact, the boolean function with parameters is well suited to the partition function. It is very easy to see that expressing the partition function in trial-and-error fashion is very natural. First, we define a boolean function with parameters that represents the partition by one particular partition vector. Precisely, we define a boolean function with parameters $\varphi$ by the following equation:

$$\varphi : B^{KN} \times B^N \to B, \quad \varphi(x, p) = \begin{cases} 1 & \text{if } \langle p, \Omega_x \rangle = 0 \\ 0 & \text{if } \langle p, \Omega_x \rangle \neq 0 \end{cases} \tag{1}$$

where $x \in B^{KN}$ is a $KN$-dimensional bit array, $p \in B^N$ is a partition vector ($N$-dimensional), $\Omega_x$ is the set of numbers cut from $x$ ($\Omega_x$ has $N$ $K$-bit numbers), and $\langle \cdot, \cdot \rangle$ is the sum defined in the last section (note the conversion $1 \leftrightarrow 1$, $-1 \leftrightarrow 0$). Clearly, $\varphi$ is a boolean function with parameters. The meaning of $\varphi$ is very clear: if $x$ can be equally partitioned by $p$, $\varphi(x, p) = 1$; otherwise $\varphi(x, p) = 0$.

Using this notation, we can restate the Uniqueness of Partition Vector lemma as below. This will be quite useful.

Lemma 3.1 (Uniqueness of Partition Vector).
For any $N > 2$ and any partition vector $p$ of length $N$, we can find a boolean vector $x \in B^{N^2}$ so that $\varphi(x, p) = 1$ and $p$ is the unique partition vector with this property, i.e. for any other partition vector $p' \neq p$ we must have $\varphi(x, p') = 0$.

How do we use $\varphi$? We are going to use it in trial-and-error fashion. That is to say, if $P$ is a set of partition vectors, we can define a boolean function $f = \varphi \odot P$. The purpose of doing so is clearly shown in the next lemma.

Lemma 3.2 (Partition Function Expressed by Trial-and-Error). The partition function of bit array $Par_{K,N} : B^{KN} \to B$ can be expressed in trial-and-error fashion: $Par(x) = \varphi(x, p) \odot PV$, where $\varphi$ is the boolean function with parameters defined in Eq. 1, and $PV_N = PV$ is the partition vector space.

The proof of the lemma follows directly from the definitions of $Par_{K,N}$ and $\varphi$. Thus, we can see that the partition function can be very easily expressed by a boolean function with parameters and trial with parameters.
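The trial operator of Definition 3.2 and the function $\varphi$ of Eq. 1 can be sketched in a few lines. This is only an illustrative sketch: the names `trial`, `switch_and`, `make_phi`, and `partition_vector_space` are hypothetical, bit arrays are represented as strings, and partition vectors are given in boolean form with the conversion $1 \leftrightarrow 1$, $0 \leftrightarrow -1$.

```python
from itertools import product


def trial(phi, params):
    """Return f = phi (.) params: f(x) = 1 iff phi(x, p) = 1 for at least one p in params."""
    params = list(params)

    def f(x):
        return int(any(phi(x, p) for p in params))

    return f


def switch_and(x, p):
    """The switched AND of Example 3.1: a switch value 1 passes its input, 0 negates it."""
    (x1, x2), (s, t) = x, p
    a = x1 if s else 1 - x1
    b = x2 if t else 1 - x2
    return a & b


def make_phi(k):
    """Eq. 1 for K-bit numbers: phi(x, p) = 1 iff p equally partitions the numbers cut from x."""
    def phi(x, p):
        omega = [int(x[i:i + k], 2) for i in range(0, len(x), k)]
        signed = sum((1 if bit else -1) * a for bit, a in zip(p, omega))
        return int(signed == 0)

    return phi


def partition_vector_space(n):
    """Boolean partition vectors of length n with first component 1, excluding all ones."""
    return [(1,) + tail for tail in product((1, 0), repeat=n - 1) if 0 in tail]


if __name__ == "__main__":
    parity = trial(switch_and, [(1, 0), (0, 1)])      # Example 3.2
    print([parity(x) for x in product((0, 1), repeat=2)])
    par_3_4 = trial(make_phi(3), partition_vector_space(4))  # Lemma 3.2 with K = 3, N = 4
    print(par_3_4("011001010111"))
```

Passing only a subset of `partition_vector_space(n)` to `trial` restricts the search to some partition vectors instead of the whole space $PV$.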
Compared with the definition of the partition function in the last section, expressing it as a trial with parameters is very natural and much easier to handle.

Above, when $\varphi$ is tried with the whole partition vector space $PV$, we get the partition function. But what if we only try some of the partition vectors? In that case, we call the result a sub-partition function. The definition is below.
Definition 3.3 (Sub-Partition Function). Suppose $P \subset PV_N$, i.e. $P$ is a set of partition vectors. A sub-partition function on $P$ is a boolean function $SPar_{P,K,N} : B^{KN} \to B$, $SPar_{P,K,N}(x) = \varphi(x, p) \odot P$, where $\varphi$ is the boolean function with parameters defined in Eq. 1.

Note that if $P = PV$, then $SPar_{P,K,N} = Par_{K,N}$.

A boolean function can be expressed by a boolean circuit. We want to show that, similarly, a boolean function with parameters can be expressed by a boolean circuit with parameters.

First, we need to consider how to bring parameters into a circuit. There are several ways. One way is to consider a boolean circuit with switches, constructed from nodes ${}^{s}\!\wedge^{t}$ and ${}^{s}\!\vee^{t}$, where $s$ and $t$ are switches that take the position "pass" or "negation" according to the value of the parameter $s$ or $t$: if $s = 1$, it is "pass"; if $s = 0$, it is "negation"; similarly for $t$. In Example 3.1, we have seen such a boolean circuit with parameters. Another way is to add some "constant nodes" to the circuit, which take a fixed value (but we can switch the value according to a parameter value). For example, let $C$ be a boolean circuit. We add one constant node $o$ and form a new circuit $C'_p = C \wedge o$. This new circuit behaves differently for different values of $o$: if $o = 1$, $C'_p$ becomes $C$; if $o = 0$, $C'_p = 0$. This circuit $C'_p$ is a boolean circuit with parameters.

It is easy to show that the 2 ways are equivalent, but we are not going to discuss it here. In the lemma below, we use the second way.

Lemma 3.3 (Boolean Circuit with Parameters).
Any boolean function with parameters $\varphi$ can be expressed by a boolean circuit with parameters $C$.

Proof: Consider a boolean function with parameters $\varphi : B^N \times B^J \to B$. We are going to show that we can find a boolean circuit with parameters that expresses $\varphi$.

For any given parameters $p \in B^J$, $\varphi(\cdot, p)$ is a boolean function on $B^N$, so we can find a circuit $C_p$ expressing $\varphi(\cdot, p)$. Consider all such circuits $\{C_p \mid p \in B^J\}$. There are $2^J$ such circuits in total. We are going to make a circuit with parameters out of these circuits $\{C_p\}$.

Then, we build a circuit with parameters $O^q_p$ like this: $p \in B^J$ is the parameter, $q \in B^J$ is a structure indicator that indicates the specific structure of this circuit, and the circuit is $O^q_p = s_1 o_1 \wedge s_2 o_2 \wedge \ldots \wedge s_J o_J$, where $o_j$ is a constant node that takes the value $p_j$, and $s_j$ is a switch that is set as follows: if $q_j = 1$, $s_j$ is pass; if $q_j = 0$, $s_j$ is negation. The circuit with parameters $O^q_p$ has this property: if the parameter vector equals the structure indicator, i.e. $p = q$, then $O^q_p = 1$; otherwise, i.e. $p \neq q$, $O^q_p = 0$.

Using $O^q_p$ and $C_p$, we get a circuit with parameters as follows:

$$V(p) = \bigvee_{q \in B^J} \left( C_q \wedge O^q_p \right)$$

We can clearly see that $V(p)$ is a boolean circuit with parameters expressing $\varphi$, since $\forall x \in B^N, \forall p \in B^J$, we have $V(p)(x) = C_p(x)$. ∎

In the above proof, we only consider how to express $\varphi$ and do not consider any other factors. In fact, the above circuit $V(p)$ is often not an efficient one. Quite often, we can make a much more efficient circuit.

Finally, in Eq. 1 we defined a boolean function with parameters. What is the circuit with parameters expressing this function? This is highly related to the partition function. We will see it in the next section.

Now we discuss another set of tools. In recent years, we have been trying to see why and how a machine can learn from data without human intervention (so-called mechanical learning) [1, 3]. A universal learning machine is a theoretical model for this purpose, which contains a conceiving space, and in the conceiving space there are many X-forms. Actually, if we do not consider the subjectivity of the machine, an X-form is a boolean circuit (expressing a boolean function). In the process of understanding the universal learning machine, we found that mechanical learning can be achieved by conducting fitting extremum [4]. Fitting Extremum (FE) means, very briefly, that if a learning machine keeps looking for a boolean circuit that fits the data with a minimal number of nodes ($\wedge$, $\vee$ nodes), then eventually that boolean circuit will express the boolean function to be learned. Such learning is purely mechanical; no human intervention is necessary. By just following this rule, learning can be achieved if there is enough data, no matter what the learning target is. The data sufficient for this purpose is a Proper Sampling Set (PSS). This result is so far only theoretical, but it reveals the nature of learning. We are currently conducting research to push it to applications.

Fortunately, during the research process, we found that Fitting Extremum (FE) and Proper Sampling Set (PSS) form good tools for computational complexity, especially for the partition function. Moreover, FE and PSS work well together with boolean functions with parameters and the trial-and-error fashion.

In this section, we briefly review FE and PSS. We just give the necessary definitions and results. For details, please see [4].

Below, we consider learning a boolean function $f : B^N \to B$ from data.

Definition 4.1 (Sampling Set).
A sampling set S is one subset of B N , i.e. S ⊂ B N . We also justsay sampling. Moreover, over one sampling set, there are assigned values: Sv = { [ x, b ] | x ∈ S, b = 0 or 1 } We call such set Sv as sampling set with assigned values, or sampling with values, or just sampling.For a boolean function f : B N → B , we can have the sampling set with values of f (or just samplingfor f ): Sv = { [ x, f ( x )] | x ∈ S } With sampling, learning is to find a boolean function that fits with sampling. Actually, learningis looking for a boolean circuit that fits with sampling. There are many possible choices for suchboolean functions or boolean circuits. First, the set of all boolean functions on B N has huge size: 2 N .Moreover, if we denote all boolean circuits on B N as C , the size of C is even bigger than 2 N , since aboolean function could have many boolean circuits expressing it. So, what we do?Fitting Extremum is to look for a boolean circuit that has least number of nodes while fitting withsampling. Definition 4.2 ( Fitting Extremum).
For a sampling set with values $Sv$, we define an extremum problem as follows:

$$\text{Min: } d(C), \quad C \in \mathcal{C} \ \&\ \forall [x, b] \in Sv,\ C(x) = b$$

We call this problem the fitting extremum on $Sv$.

In the definition of fitting extremum, we give a sampling set with values. But what if we give a subset of $B^N$ and a boolean function? This surely defines a fitting extremum as well.

Definition 4.3 (Fitting Extremum of a Boolean Function).
For a boolean function $f : B^N \to B$ and a sampling set $S \subset B^N$, we define an extremum problem as follows:

$$\text{Min: } d(C), \quad C \in \mathcal{C} \ \&\ \forall x \in S,\ C(x) = f(x)$$

We call this problem the fitting extremum on $S$ and $f$.
Such a circuit $C$ is called the circuit generated by fitting extremum on sampling $S$ and $f$. That is to say, given a sampling and a boolean function, we can generate a circuit from them by FE. We will say "FE on $S$ to fit $f$", "FE on $S$ by fitting $f$ gets $C$", etc.

Note that, for a given $S$, the circuit generated by FE on $S$ to fit $f$ often does not express $f$, and could be totally different from $f$. However, when $S$ satisfies a certain condition, the circuit will express $f$.

Definition 4.4 (Proper Sampling Set).
For a given boolean function $f : B^N \to B$ and a sampling set $S \subset B^N$, if fitting extremum on $S$ and $f$ generates a boolean circuit $C$, i.e. $C$ fits $f$ on $S$ and $d(C)$ reaches the minimum, and if $C$ expresses $f$ exactly, i.e. $\forall x \in B^N$, $C(x) = f(x)$, we say $S$ is a proper sampling set of $f$, or just proper sampling.

We use FE to stand for fitting extremum and PSS for proper sampling set. In other words, when $S$ is a proper sampling set, the boolean circuit generated by FE on $S$ to fit $f$ will always express $f$. This is one crucial property. We can also say: if $S$ is a PSS, FE on $S$ to fit $f$ will get a circuit expressing $f$.

Lemma 4.1 (Existence of PSS).
For any boolean function $f : B^N \to B$, there is some subset $S \subset B^N$ so that $S$ is a proper sampling set of $f$.

So, for any boolean function $f$, a PSS always exists. The lemma is very easy to prove. The trivial (and worst) case is that the PSS equals the whole boolean space $B^N$. We can think of it this way: given a sampling $S$, if $S$ is not a PSS, we can add more elements to $S$; eventually, $S$ will become a PSS. Of course, we do not want the whole space if it can be avoided. So, we need to consider a PSS as small as possible.

Definition 4.5 (Minimal PSS).
For a given boolean function $f : B^N \to B$, if a sampling $S \subset B^N$ is a proper sampling set and $|S|$ reaches the minimum, we call such a sampling set a minimal proper sampling set.

We use mPSS to stand for minimal PSS. If $S$ is a mPSS of a boolean function $f$, then for any sampling $S'$, if $|S'| < |S|$, $S'$ cannot be a PSS of $f$. Thus, the mPSS of $f$ indeed describes one important property of $f$. It is easy to see that for any boolean function $f : B^N \to B$, a mPSS indeed exists.

Lemma 4.2 (mPSS). For a boolean function $f : B^N \to B$, a mPSS exists, i.e. there is a sampling $S$ such that $S$ is a PSS of $f$ and, for any sampling $S'$, if $|S'| < |S|$, $S'$ cannot be a PSS of $f$.

Proof:
We know that a PSS of $f$ exists, and all possible PSS of $f$ form a finite set. We choose the sampling in this set with the smallest size; it is a mPSS of $f$. ∎

For one $f$, there could be more than one mPSS, i.e. we might have such a situation: $S_1$ and $S_2$ are both mPSS of $f$, and $S_1 \neq S_2$. But all mPSS of $f$ have the same size. Thus, the size of mPSS gives us one important property of $f$.

FE and PSS are deeply related to learning. But here we want to see this relationship: the size of the sampling set is deeply related to the complexity of the boolean circuit. The following 2 theorems state such a relationship.

Theorem 4.3 (PSS implies Circuit). If $f : B^N \to B$ is a boolean function, $S \subset B^N$ is a PSS for $f$, and $|S|$ is the size of the PSS, then there is a circuit $C$ expressing $f$ with $d(C) < N|S|$.

The opposite direction is also true; that is to say, if we have a circuit, we can construct a PSS from it.
Theorem 4.4 (Circuit implies PSS). If $f : B^N \to B$ is a boolean function, and $C$ is a boolean circuit expressing $f$, then there is a PSS for $f$, and the size of the PSS is bounded by $d(C)$.

The 2 theorems tell us that, for a boolean function $f$, if we have a PSS of $f$, we can construct a circuit expressing $f$, and the size of the circuit is proportional to the size of the sampling. Conversely, if there is a circuit expressing $f$, then we can find a PSS by using the circuit, and the size of the sampling is controlled by the size of the circuit. Since the size of the circuit is one good measure of the computational complexity of $f$, so is the size of the PSS. This is a very important property: it means that we can examine the complexity of $f$ by examining the learning process and the PSS of $f$. Particularly, "circuit implies PSS" gives a lower bound on the size of circuits.

Some Examples of Boolean Functions
We are going to see some examples of boolean functions, together with their PSS and circuits. These examples are simple boolean functions, but they are closely related to the partition function and very useful.
Simplest Addition and Subtraction
To illustrate the situation, we first consider the simplest addition: $z = x + y$, where $x = x_1 x_0$ and $y = y_1 y_0$ are 2-bit integers and $z = z_2 z_1 z_0$ is a 3-bit integer. For example, $x = 01, y = 10, z = x + y = 011$; $x = 11, y = 01, z = x + y = 100$. It is easy to see that such addition can be completely described by 3 boolean functions $z_0, z_1, z_2$ (2 adjunct boolean functions $t_0, t_1$ are added for convenience). They are:

$$z_0 = x_0 \oplus y_0,\quad t_0 = x_0 \wedge y_0,\quad z_1 = (x_1 \oplus y_1) \oplus t_0,\quad t_1 = (x_1 \wedge y_1) \vee (x_1 \wedge t_0) \vee (y_1 \wedge t_0),\quad z_2 = t_1 \qquad (2)$$

Here, $t_0, t_1$ are 2 adjunct functions for the carry-over values of $x_0 + y_0$ and $x_1 + y_1 + t_0$. All these 5 boolean functions are on $B^4$. Note that elements of $B^4$ are formed in this way: $x_1 x_0 y_1 y_0$. In other words, an element of $B^4$ is cut into $x, y$, and then the addition is done.

Second, we consider the simplest subtraction: $z = x - y$, where $x = x_1 x_0$ and $y = y_1 y_0$ are 2-bit integers and $z = z_2 z_1 z_0$ is a 3-bit integer with the first digit for sign (0 for positive, 1 for negative). For example, $x = 01, y = 10, z = x - y = 111$ (that is, $-1$ in the two's-complement form that Eq. 3 produces); $x = 11, y = 01, z = x - y = 010$. Such subtraction is fully described by 3 boolean functions $z_0, z_1, z_2$ (and 2 adjunct functions). They are:

$$z_0 = x_0 \oplus y_0,\quad t_0 = \neg x_0 \wedge y_0,\quad z_1 = (x_1 \oplus y_1) \oplus t_0,\quad t_1 = (\neg x_1 \wedge (y_1 \vee t_0)) \vee (x_1 \wedge (y_1 \wedge t_0)),\quad z_2 = t_1 \qquad (3)$$

Here, $t_0, t_1$ are 2 adjunct functions for the carry-over (borrow) values of $x_0 - y_0$ and $x_1 - y_1 - t_0$. All these 5 boolean functions are on $B^4$; an element of $B^4$ is cut into $x, y$, and then the subtraction is done.

For these boolean functions, it is very interesting to see their PSS and circuits. First consider $z_0$ in addition, which is a boolean function ($\oplus$) on $B^4$. Pick a sampling set in $B^4$, say $S = \{\ldots\}$ (four elements). We can easily see that such $S$ is a mPSS of $z_0$. It means: FE on $S$ to fit $z_0$ will get a circuit $C$, and $C$ must express $z_0$. Note that there are more mPSS; for example, another four-element set $S = \{\ldots\}$. Note, $|S| = 4$.

Second, consider $t_0$ in addition. This time, we can pick a sampling in $B^4$ like this: $S = \{\ldots\}$ (three elements). This is a mPSS for $t_0$. Note that any mPSS for $z_0$ is a PSS for $t_0$.

Then, consider $z_1$ in addition. $z_1$ is formed by 2 branches: $x_1 \oplus y_1$ and $t_0$. The sampling for $x_1 \oplus y_1$ could be $S = \{\ldots\}$ (four elements). The sampling for $t_0$ could be $S' = \{\ldots\}$ (four elements). However, $S \cup S'$ is not yet a PSS for $z_1$. We can add $S'' = \{\ldots\}$ (four elements) to make sure to capture the second $\oplus$. Then $\hat{S} = S \cup S' \cup S''$ is a mPSS of $z_1$. Note, $|\hat{S}| = 8$.

Then, consider $t_1$ in addition. We add more to $\hat{S}$. A set $S''' = \{\ldots\}$ (two elements) is good for capturing the $\vee$ and $\wedge$ in $t_1$. So, the set $\bar{S} = \hat{S} \cup S'''$ is a mPSS of $t_1$. Note, $|\bar{S}| = 10$. Since $z_2 = t_1$, $\bar{S}$ is a PSS of $z_2$.

We can easily see that a PSS for $z_2$ is also a PSS for $z_0$ and $z_1$; so $\bar{S}$ is a PSS for all those 5 boolean functions. This sampling set $\bar{S}$ is sufficient to describe the addition. Similar results hold for subtraction.

Then, we consider another boolean function $v : B^4 \to B$. For $x \in B^4$, cut $x$ into two 2-bit integers as above and let $z$ be their difference: if $z = 0$, then $v(x) = 1$; else, i.e. $z \neq 0$, $v(x) = 0$. $v$ can be written by $z_0, z_1, z_2$ in subtraction: $v(x) = \neg z_0(x) \wedge \neg z_1(x) \wedge \neg z_2(x)$. Note, if $S$ is a PSS of $z_0, z_1, z_2$, then it is a PSS of $v$. The boolean function $v$ is the simplest case of the partition function.

These boolean functions above are quite simple. But they are quite illustrative and useful.

Footnote: When we worked on FE and PSS in 2019, we did not know there were works on "partially defined Boolean function" (pdBf) [16]. Our process of learning can be thought of as a series of pdBf that is expanding. However, FE is much different from pdBf: pdBf did not seek fitting extremum, and did not relate pdBf to learning.

Footnote: Conceptually, PSS is related to sampling complexity [13]. But so far we have not seen any work on sampling complexity in the direction of seeking fitting extremum, and hence related to the complexity of circuits.

Addition and Subtraction of K-bits Numbers

Above, we considered addition and subtraction for integers with only 2 bits, and we got a full understanding of their circuits and PSS. It is very easy to generalize to integers with any number of bits.
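The 2-bit formulas of Eq. (2) and Eq. (3) can be checked exhaustively on all 16 elements of $B^4$. The following is a minimal sketch, not part of the original construction; the function names (`add2`, `sub2`, `v`, `check_all`) are hypothetical. It assumes the bit order $x_1 x_0 y_1 y_0$ described above and reads the subtraction result modulo 8, i.e. in 3-bit two's-complement form, which is what the borrow formulas of Eq. (3) yield.

```python
from itertools import product

def add2(x1, x0, y1, y0):
    """2-bit addition via the boolean formulas of Eq. (2); returns (z2, z1, z0)."""
    z0 = x0 ^ y0
    t0 = x0 & y0                                   # carry out of bit 0
    z1 = (x1 ^ y1) ^ t0
    t1 = (x1 & y1) | (x1 & t0) | (y1 & t0)         # carry out of bit 1
    z2 = t1
    return z2, z1, z0

def sub2(x1, x0, y1, y0):
    """2-bit subtraction via the boolean formulas of Eq. (3); z2 is the sign bit."""
    z0 = x0 ^ y0
    t0 = (1 - x0) & y0                             # borrow out of bit 0
    z1 = (x1 ^ y1) ^ t0
    t1 = ((1 - x1) & (y1 | t0)) | (x1 & (y1 & t0)) # borrow out of bit 1
    z2 = t1
    return z2, z1, z0

def to_int(bits):
    """Interpret a tuple of bits (most significant first) as an unsigned integer."""
    value = 0
    for b in bits:
        value = 2 * value + b
    return value

def check_all():
    """Compare both circuits with integer arithmetic on every element of B^4."""
    for x1, x0, y1, y0 in product((0, 1), repeat=4):
        x, y = 2 * x1 + x0, 2 * y1 + y0
        if to_int(add2(x1, x0, y1, y0)) != x + y:
            return False
        # Negative differences appear in 3-bit two's-complement form.
        if to_int(sub2(x1, x0, y1, y0)) != (x - y) % 8:
            return False
    return True

def v(x1, x0, y1, y0):
    """The equality test v: 1 exactly when x - y = 0."""
    z2, z1, z0 = sub2(x1, x0, y1, y0)
    return (1 - z2) & (1 - z1) & (1 - z0)
```

Calling `check_all()` confirms that the five boolean functions in each group reproduce ordinary integer addition and subtraction, and `v` is 1 exactly on the inputs with $x = y$.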
We consider such addition $z = x + y$, where $x = x_{K-1} \ldots x_1 x_0$ and $y = y_{K-1} \ldots y_1 y_0$ are $K$-bit integers and $z = z_K \ldots z_1 z_0$ is a $(K+1)$-bit integer. For example, $x = 1001, y = 1010, z = x + y = 10011$. It is easy to see that such addition can be described by $K + 1$ boolean functions $z_0, z_1, \ldots, z_K$ (and $K$ adjunct boolean functions). They are:

$$
\begin{aligned}
z_0 &= x_0 \oplus y_0, & t_0 &= x_0 \wedge y_0, \\
z_1 &= (x_1 \oplus y_1) \oplus t_0, & t_1 &= (x_1 \wedge y_1) \vee (x_1 \wedge t_0) \vee (y_1 \wedge t_0), \\
&\ldots & &\ldots \\
z_{K-1} &= (x_{K-1} \oplus y_{K-1}) \oplus t_{K-2}, & t_{K-1} &= (x_{K-1} \wedge y_{K-1}) \vee (x_{K-1} \wedge t_{K-2}) \vee (y_{K-1} \wedge t_{K-2}), \\
z_K &= t_{K-1}
\end{aligned}
\qquad (4)
$$

Here, $t_0, t_1, \ldots, t_{K-1}$ are $K$ adjunct functions for the carry-over values of $x_0 + y_0$, $x_1 + y_1 + t_0$, etc. All these $2K + 1$ boolean functions are on $B^{2K}$. Note that elements of $B^{2K}$ are formed in this way: $x_{K-1} \ldots x_1 x_0 y_{K-1} \ldots y_1 y_0$. In other words, an element of $B^{2K}$ is cut into $x, y$, and then the addition is done.

Similarly, we can have subtraction $z = x - y$, where $x = x_{K-1} \ldots x_1 x_0$ and $y = y_{K-1} \ldots y_1 y_0$ are $K$-bit integers and $z = z_K \ldots z_1 z_0$ is a $(K+1)$-bit integer, with the first digit $z_K$ for sign (0 means positive, 1 means negative). For example, $x = 10101, y = 11010, z = x - y = 111011$. We have similar $2K + 1$ boolean functions that fully describe this subtraction: $K + 1$ boolean functions $z_0, z_1, \ldots, z_K$ (and $K$ adjunct boolean functions). Following the pattern of Eq. 3, they are:

$$
\begin{aligned}
z_0 &= x_0 \oplus y_0, & t_0 &= \neg x_0 \wedge y_0, \\
z_1 &= (x_1 \oplus y_1) \oplus t_0, & t_1 &= (\neg x_1 \wedge (y_1 \vee t_0)) \vee (x_1 \wedge (y_1 \wedge t_0)), \\
&\ldots & &\ldots \\
z_{K-1} &= (x_{K-1} \oplus y_{K-1}) \oplus t_{K-2}, & t_{K-1} &= (\neg x_{K-1} \wedge (y_{K-1} \vee t_{K-2})) \vee (x_{K-1} \wedge (y_{K-1} \wedge t_{K-2})), \\
z_K &= t_{K-1}
\end{aligned}
\qquad (5)
$$

Here, $t_0, t_1, \ldots, t_{K-1}$ are $K$ adjunct functions for the carry-over (borrow) values of $x_0 - y_0$, $x_1 - y_1 - t_0$, etc. All these $2K + 1$ boolean functions are on $B^{2K}$. Also, an element of $B^{2K}$ is cut into $x, y$, and then the subtraction is done.

For these boolean functions, it is very interesting to see their circuits and PSS. One immediate observation is: $z_1$ with $t_1$ is almost the same as $z_2$ with $t_2$, \ldots, and almost the same as $z_{K-1}$ with $t_{K-1}$.
They have exactly the same structure; the only difference is the index. This is no surprise. Addition is linear in $K$ (the number of bits of the integers). Such a property makes circuits and PSS much easier.

We already know the PSS for $z_1$ with $t_1$ (see $\hat{S}$ above). The PSS for $z_2$ with $t_2$ is just a linear expansion of $\hat{S}$. This linear expansion continues to $z_{K-1}$ with $t_{K-1}$, and to $z_K$ with $t_{K-1}$. Here, we will not write down the exact PSS for them. But it is very clear that the PSS of these boolean functions has size proportional to $K$. So does the mPSS $S$. Such $S$ is sufficient to describe all $2K + 1$ boolean functions for addition, so the sampling $S$ is sufficient to describe the addition, and $|S|$ is linear in $K$. For subtraction, we have the same conclusion: there is a mPSS $S$, with size linear in $K$, sufficient to describe the subtraction. Notice that $B^{2K}$ has $2^{2K}$ elements in total. So, relatively, $S$ is a very small subset of $B^{2K}$.

With Only One Partition Vector
We have seen addition and subtraction. To compute the partition function, we are going to use Eq. 1. For any given $p$, the computation of $\varphi(x, p)$ is done by $N - 1$ additions or subtractions of $K$-bit integers; $p_j$ determines whether to do addition or subtraction at the $j$-th place. So, for a given $p$, $\varphi(\cdot, p)$ is a boolean function on $B^{KN}$. We want to see the PSS and circuit for this boolean function.

Consider $x \in B^{KN}$ and $p \in PV_N$. As in Eq. 1, we cut $x$ into a group of numbers: $\Omega_x = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$, each number being a $K$-bit number. We first compute $z = \langle p, \Omega_x \rangle$. Here, $z$ is a $(K + N - 1)$-bit integer, $z = z_L \ldots z_2 z_1$, where $L = K + N - 1$. Then $\varphi(x, p)$ is:

$$\varphi(x, p) = \bigwedge_{j = 1, 2, \ldots, L} \neg z_j$$

Note that $z = \langle p, \Omega_x \rangle$ is actually formed by $(N - 1)(2K + 1)$ boolean functions, and all these boolean functions can be fully described by a PSS $S$, whose size is linear in $(N - 1)(2K + 1)$, or linear in $NK$. So, the size of the circuit expressing $\varphi(\cdot, p)$ is also linear in $NK$.

Boolean Function with Parameters
Now, we consider the circuit with parameters expressing the boolean function with parameters in Eq. 1. Suppose $C_p$ is the circuit expressing $\varphi(\cdot, p)$; the method in Lemma 3.3 tells us how to establish a circuit with parameters expressing $\varphi(x, p)$.

To do so, consider a family of circuits with parameters $O^q_p$: $p \in B^J$ is the parameter, $q \in B^J$ is a structure indicator that indicates the specific structure of this circuit, and the circuit is $O^q_p = s_1 o_1 \wedge s_2 o_2 \wedge \ldots \wedge s_J o_J$, where $o_j$ is a constant node that takes the value $p_j$, and $s_j$ is a switch: if $q_j = 1$, $s_j$ is pass; if $q_j = 0$, $s_j$ is negation. The circuit with parameters $O^q_p$ has this property: if the parameter $p = q$, then $O^q_p = 1$; else, i.e. $p \neq q$, $O^q_p = 0$.

Then we have this circuit:

$$V(p) = \bigvee_{q \in B^J} (C_q \wedge O^q_p)$$

This circuit with parameters $V(p)$ is the boolean circuit with parameters expressing $\varphi(x, p)$. Note that the notation $V(p)$ means: without a specified $p$, it is a circuit with parameters; once the parameter is given, $V(p)$ is a circuit (the parameter is now chosen).

Sub-partition Function and Partition Function
With the circuit with parameters $V(p)$, we can have a circuit for a sub-partition function and a circuit for the partition function. Suppose $P = \{p_1, p_2, \ldots, p_J\}$ is a list of partition vectors, and $SPar_{K,N,P}(x) = \varphi(x, p) \odot P$ is a sub-partition function. Using $V(p)$, we have a circuit:

$$D_P = \bigvee_{p \in P} V(p)$$

$D_P$ is the circuit expressing $SPar_{K,N,P}$. If $P$ is the whole partition vector space, then $D_P$ expresses $Par_{K,N}$.

We can see that $V(p)$ has 2 parts. One is $C_q$. For a given $q$, as we discussed above, the size of $C_q$ is proportional to $KN$. The other part joins $C_q$ with $O^q_p$. The range of choices for $p$ is quite big; it is proportional to $2^J$. Questions arise: can we simplify the circuit $V(p)$? Can we reduce the range of choices? The same questions apply to $D_P$. This is what we are going to discuss in the next section.

In this section, we are going to use the tools discussed above (fitting extremum, proper sampling set, boolean function with parameters and the trial-and-error fashion) on sub-partition functions and the partition function. We expect that these tools can help us gain deep insight into the partition function.
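The selector construction $V(p) = \bigvee_{q} (C_q \wedge O^q_p)$ from the previous section can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_family` is a hypothetical stand-in for the circuits $C_q$ (it is not the $\varphi(\cdot, q)$ of Eq. 1), and the names `selector` and `make_V` are invented here. The sketch shows that $O^q_p$ acts as an equality test between $p$ and $q$, and that the disjunction ranges over all $2^J$ values of $q$.

```python
from itertools import product

def selector(q, p):
    """O^q_p: AND over j of (s_j o_j); o_j carries p_j, s_j passes if q_j = 1 and negates if q_j = 0."""
    out = 1
    for qj, pj in zip(q, p):
        out &= pj if qj == 1 else 1 - pj
    return out

def make_V(family, J):
    """Build V(p)(x) = OR over q in B^J of (C_q(x) AND O^q_p) from a family q -> C_q."""
    def V(p, x):
        out = 0
        for q in product((0, 1), repeat=J):   # 2^J terms in the disjunction
            out |= family(q)(x) & selector(q, p)
        return out
    return V

def toy_family(q):
    """Hypothetical stand-in for C_q: parity of the bits of x selected by q."""
    def C(x):
        return sum(qj & xj for qj, xj in zip(q, x)) % 2
    return C
```

With `V = make_V(toy_family, 3)`, the value `V(p, x)` coincides with `toy_family(p)(x)` for every `p` and `x`: only the term with $q = p$ survives, while the loop still visits all $2^J$ terms.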
The partition function $Par_{N,N}$, $N = 2, 3, \ldots$ is what we are interested in. For simplicity, we use $Par_N$ or just $Par$ for $Par_{N,N}$. So, $Par_N : B^{N^2} \to B$. We are also interested in sub-partition functions. $PV_N = PV$ is the partition vector space. For a list of partition vectors $P = \{p_1, p_2, \ldots, p_L\} \subset PV_N$, we have a boolean function $SPar_{N,P} : B^{N^2} \to B$, $SPar_{N,P}(x) = \varphi(x, p) \odot P = \varphi(x, p) \odot \{p_1, p_2, \ldots, p_L\}$. If $P$ is the whole partition vector space, then $SPar_{N,P} = Par_N$.

$PV_N$ is a finite set. In particular, we can order all partition vectors, i.e. all partition vectors can be written in this way: $PV_N = \{p_1, p_2, \ldots, p_J\}$, where $J = 2^{N-1} - 1 = |PV_N|$ is the number of all partition vectors. For such an order, we can form a sequence of subsets of partition vectors: $P_1 = \{p_1\}$, $P_2 = \{p_1, p_2\}$, \ldots. Generally, $P_j = \{p_1, p_2, \ldots, p_j\}$, $j = 1, 2, \ldots, J$. Clearly, $P_1 \subset P_2 \subset \ldots \subset P_J$, and $P_J = PV_N$. For simplicity, we use $Q_j$ to stand for $SPar_{N,P_j}$; so $Q_j$ is a sequence of boolean functions: $Q_j : B^{N^2} \to B$, $Q_j(x) = \varphi(x, p) \odot P_j$, $j = 1, 2, \ldots, J$. We are interested in this sequence of sub-partition functions. Note, $Q_J = Par$.

We are going to study the sequence of sub-partition functions $Q_j$, eventually reaching $Par$. The sequence $Q_j$ of course depends on the choice of the order of the partition vector space: $PV_N = \{p_1, p_2, \ldots, p_J\}$, $J = 2^{N-1} - 1$. There are many possible such orders. But, as we will show below, our results hold for any such order.

First, we define some subsets of $B^{N^2}$:

$$
\begin{aligned}
Z &\subset B^{N^2}, & Z &= \{x \in B^{N^2} \mid Par(x) = 0\}, \\
W &\subset B^{N^2}, & W &= \{x \in B^{N^2} \mid Par(x) = 1\}, \\
W_j &\subset B^{N^2}, & W_j &= \{x \in B^{N^2} \mid Q_j(x) = 1\}, \quad j = 1, 2, \ldots, J
\end{aligned}
$$

It is easy to see that $Z \cup W = B^{N^2}$ and $Z \cap W = \emptyset$. We also have the following lemma.

Lemma 5.1 (Property of $W_j$). For the sequence of subsets $W_j$, $j = 1, 2, \ldots, J$ defined above, we have: $W_1 \subset W_2 \subset \ldots \subset W_J$, and all these inclusions are true (proper) inclusions, i.e. $W_{j+1} \setminus W_j \neq \emptyset$, $j = 1, 2, \ldots, J - 1$, and $W_J = W$.

Proof:
By definition, for any $j$, $Q_j(x) = \varphi(x, p) \odot P_j$ and $Q_{j+1}(x) = \varphi(x, p) \odot P_{j+1}$. It is easy to see that if $x \in W_j$, then $Q_j(x) = 1$, and $Q_{j+1}(x) = 1$ follows, so $x \in W_{j+1}$; that is to say, $W_j \subset W_{j+1}$. From the lemma "Uniqueness of Partition Vector", $W_{j+1} \setminus W_j \neq \emptyset$ immediately follows. Finally, since $Q_J = Par$, $W_J = W$. ∎

Next, we want to show this fact: for a sub-partition function $SPar_{N,P} : B^{N^2} \to B$, $SPar_{N,P}(x) = \varphi(x, p) \odot P = \varphi(x, p) \odot \{p_1, p_2, \ldots, p_L\}$, and for each partition vector $p_j$, a PSS of $SPar_{N,P}$ must have at least one element that is unique to $p_j$.

We start from a sub-partition function over only one partition vector. Here, suppose $P = \{p_1\}$. We also write $\varphi(x, p_1)$ for $\varphi(x, p) \odot \{p_1\}$, etc.

Lemma 5.2 (One Partition Vector).
Suppose $f(x) = SPar_{N,P}(x) = \varphi(x, p) \odot P = \varphi(x, p) \odot \{p_1\}$. If $S$ is a PSS of $f$, $S$ must have at least one element $x \in S$ so that $\varphi(x, p_1) = 1$.

Proof: Use contradiction. Suppose $S$ has no such element. Thus, for any $x \in S$, $f(x) = \varphi(x, p_1) = 0$, so $f(x) = 0$. Let circuit $C0$ be such a circuit: $C0$ expresses the constant 0 and $d(C0) = 0$. So, the circuit $C0$ fits $f$ on $S$, and $d(C0)$ reaches the minimum. Since $S$ is a PSS of $f$, FE on $S$ to fit $f$ must get a circuit expressing $f$. $C0$ is such a circuit, so $C0$ should express $f$; it means: $\forall x \in B^{N^2}$, $f(x) = 0$. But $\exists x \in B^{N^2}$, $\varphi(x, p_1) = 1$, so $f(x) = 1$. The contradiction proves the lemma. ∎

Note that, in the above proof, $SPar_{N,P} = \varphi(x, p_1)$. We then consider a sub-partition function over only 2 partition vectors. Here, suppose $P = \{p_1, p_2\}$.

Lemma 5.3 (Two Partition Vectors).
Suppose $f(x) = SPar_{N,P}(x) = \varphi(x, p) \odot P = \varphi(x, p) \odot \{p_1, p_2\}$. If $S$ is a PSS of $f$, $S$ must have at least one element $x \in S$ so that $\varphi(x, p_1) = 1$, $\varphi(x, p_2) = 0$, and also must have at least another element $y \in S$ so that $\varphi(y, p_1) = 0$, $\varphi(y, p_2) = 1$.

Proof: Still use contradiction. So, let us suppose: $S$ has no element so that $\varphi(x, p_1) = 1$, $\varphi(x, p_2) = 0$. Thus, for any $x \in S$, if $\varphi(x, p_1) = 1$, then $\varphi(x, p_2) = 1$ must hold. Let circuit $C1$ be such a circuit: $C1$ expresses $\varphi(x, p_2)$ and $d(C1)$ reaches the minimum. So, the circuit $C1$ fits $f$ on $S$, and $d(C1)$ reaches the minimum. Since $S$ is a PSS of $f$, FE on $S$ to fit $f$ must get a circuit expressing $f$. $C1$ is such a circuit, so $C1$ should express $f$; it means: $\forall x \in B^{N^2}$, $f(x) = C1(x) = \varphi(x, p_2)$. But this is not true. Due to the lemma "Uniqueness of Partition Vector", there is at least one element $x \in B^{N^2}$ so that $\varphi(x, p_1) = 1$ but $\varphi(x, p_2) = 0$, i.e. $f(x) = 1$, $\varphi(x, p_2) = 0$. The contradiction proves: $\exists x \in S$, $\varphi(x, p_1) = 1$, $\varphi(x, p_2) = 0$. In exactly the same way, we can prove: $\exists y \in S$, $\varphi(y, p_1) = 0$, $\varphi(y, p_2) = 1$. ∎

Note that, in the above proof, $SPar_{N,P} = \varphi(x, p) \odot \{p_1, p_2\}$. We then consider the more general case: $P = \{p_1, p_2, \ldots, p_L\}$, $1 \le L \le J$, and consider the sub-partition function over $P$.

Lemma 5.4 (More Partition Vectors).
Suppose $f(x) = SPar_{N,P}(x) = \varphi(x, p) \odot P = \varphi(x, p) \odot \{p_1, p_2, \ldots, p_L\}$, $1 \le L \le J$. If $S$ is a PSS of $f$, then for any given $1 \le i \le L$, $S$ must have at least one element $x \in S$ so that $\varphi(x, p_i) = 1$ and, for any other $1 \le j \le L$, $j \neq i$, $\varphi(x, p_j) = 0$.

Proof: Still use contradiction. So, suppose: $S$ has no element so that $\varphi(x, p_i) = 1$ and $\varphi(x, p_j) = 0$ for any other $1 \le j \le L$, $j \neq i$. It means: for any $x \in S$, if $\varphi(x, p_i) = 1$, then there is at least one $j$, $1 \le j \le L$, $j \neq i$, so that $\varphi(x, p_j) = 1$.

Let us consider this set of partition vectors: $P' = P \setminus \{p_i\}$, i.e. take $p_i$ out of $P$. And consider a sub-partition function over $P'$: $g = \varphi(x, p) \odot P'$.

So, we can see: for any $x \in S$, if $f(x) = 0$, then $g(x) = 0$. For any $x \in S$, if $f(x) = 1$, then $\varphi(x, p_j) = 1$ for some $j$, $1 \le j \le L$, so there are 2 cases: either $\varphi(x, p_i) = 1$ or $\varphi(x, p_i) = 0$. In the former case, $g(x) = 1$ follows from the supposition. In the latter case, since $\varphi(x, p_i) = 0$, there must be some $j$, $1 \le j \le L$, $j \neq i$, so that $\varphi(x, p_j) = 1$, which means $g(x) = 1$. So, we know: $\forall x \in S$, $f(x) = g(x)$.

Let circuit $CI$ be such a circuit: $CI$ expresses $g$ and $d(CI)$ reaches the minimum. So, the circuit $CI$ fits $f$ on $S$, and $d(CI)$ reaches the minimum. Since $S$ is a PSS of $f$, FE on $S$ to fit $f$ must get a circuit expressing $f$. $CI$ is such a circuit, so $CI$ should express $f$; it means: $\forall x \in B^{N^2}$, $f(x) = CI(x) = g(x)$. But this is not true. Due to the lemma "Uniqueness of Partition Vector", there is at least one element $x \in B^{N^2}$ so that $\varphi(x, p_i) = 1$ but $\varphi(x, p_j) = 0$ for any other partition vector $p_j$, i.e. $f(x) = 1$, $g(x) = 0$. The contradiction proves: $\exists x \in S$, $\varphi(x, p_i) = 1$, and $\varphi(x, p_j) = 0$, $\forall j$, $1 \le j \le L$, $j \neq i$. ∎

In the above lemmas, we have shown: for a sub-partition function $SPar_{N,P} : B^{N^2} \to B$, $SPar_{N,P}(x) = \varphi(x, p) \odot P = \varphi(x, p) \odot \{p_1, p_2, \ldots, p_L\}$, if $S$ is a PSS of $SPar_{N,P}$, then for each partition vector $p_j$, $S$ must have at least one element that is special to $p_j$. We put this into a lemma.

Lemma 5.5 (PSS of $Q_j$). If $S$ is a PSS of $Q_j$, $0 < j \le J$, $S$ must have at least one element in $W_1$, at least one element in $W_2 \setminus W_1$, \ldots, and at least one element in $W_j \setminus W_{j-1}$.

Proof:
It is easy to see: if $\varphi(x, p_1) = 1$ and $\varphi(x, p_i) = 0$ for all $i \le j$, $i \neq 1$, then $x \in W_1$. According to the above lemma "More Partition Vectors", $S$ must have at least one such $x$. Also, it is easy to see: if $\varphi(x, p_2) = 1$ and $\varphi(x, p_i) = 0$ for all $i \le j$, $i \neq 2$, then $x \in W_2 \setminus W_1$. According to the lemma "More Partition Vectors", $S$ must have at least one such $x$. The others follow by the same argument. ∎

The above lemma gives one very essential property of $Q_j$. In particular, for $Par = Q_J$, we have: if $S$ is a PSS for $Par$, it must have at least one element in $W_1$, at least one element in $W_2 \setminus W_1$, \ldots, $W_j \setminus W_{j-1}$, \ldots, and finally at least one element in $W_J \setminus W_{J-1}$.

Note that the proof does not depend on any particular order of $PV_N = \{p_1, p_2, \ldots, p_J\}$. In fact, these are consequences of the lemma "Uniqueness of Partition Vector".

This very essential property is illustrated in Fig. 1.

Fig. 1. Illustration of $S$ and the nested sets $W_j$, $j = 1, 2, \ldots, J$; $S$ is a PSS of $Par$.
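For very small sizes, the layered structure $W_1, W_2 \setminus W_1, \ldots$ can be inspected by brute force. The sketch below is only an illustration and rests on assumptions that are not taken from this section: a partition vector is encoded as a sign vector with first entry $+1$ (the all-plus vector excluded, which gives $2^{N-1} - 1$ vectors), and $\varphi(x, p) = 1$ is read as "the signed sum $\langle p, \Omega_x \rangle$ is zero". The function names are hypothetical. It counts the instances in each layer and searches, for each vector, for an instance accepted by that vector only.

```python
from itertools import product

def partition_vectors(n):
    """Sign vectors with first entry +1, excluding all-plus: 2^(n-1) - 1 vectors (assumed encoding)."""
    vectors = []
    for tail in product((1, -1), repeat=n - 1):
        p = (1,) + tail
        if -1 in p:
            vectors.append(p)
    return vectors

def phi(numbers, p):
    """phi(x, p) = 1 when the signed sum <p, Omega_x> is zero (assumed reading of Eq. 1)."""
    return 1 if sum(s * a for s, a in zip(p, numbers)) == 0 else 0

def layers(n, k):
    """Count instances in W_1, W_2 minus W_1, ..., W_J minus W_{J-1} for the listed order."""
    vectors = partition_vectors(n)
    counts = [0] * len(vectors)
    for numbers in product(range(2 ** k), repeat=n):
        for j, p in enumerate(vectors):
            if phi(numbers, p):
                counts[j] += 1   # first accepting index j: in W_j but not in W_{j-1}
                break
    return counts

def unique_witness(n, k, p):
    """Find an instance accepted by p and rejected by every other vector, or None."""
    vectors = partition_vectors(n)
    for numbers in product(range(2 ** k), repeat=n):
        if phi(numbers, p) and all(phi(numbers, q) == 0 for q in vectors if q != p):
            return numbers
    return None
```

For example, `layers(3, 3)` lists how many triples of 3-bit numbers fall into each layer, and `unique_witness(3, 3, p)` returns a triple that only `p` accepts. The enumeration grows as $2^{KN}$, so this is feasible only for tiny $N$ and $K$.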
By this property, we get a lower bound for circuits expressing $Q_j$, and particularly a lower bound for circuits expressing $Par$.

Lemma 5.6 (Size of mPSS). For the sub-partition function $Q_j$, $j = 1, 2, \ldots, J$, suppose $S$ is a mPSS of it; then $j < |S|$. In particular, for the partition function $Par_N$, if $S$ is a mPSS of it, then $2^{N-1} \le |S|$. So, if $C_N$ is a circuit expressing $Par_N$, then $2^{N-1} \le d(C_N)$.

Proof: By the lemma above, if $S$ is a mPSS of $Q_j$, it is a PSS; thus, $S$ must have at least one element in $W_1$, at least one element in $W_2 \setminus W_1$, \ldots, and at least one element in $W_j \setminus W_{j-1}$. That is to say, it has at least $j$ different elements, so $j < |S|$. In particular, the partition function $Par_N = Q_J$, and $J = 2^{N-1} - 1$, so $2^{N-1} \le |S|$. Since $C_N$ is a circuit expressing $Par_N$, by the theorem "Circuit implies PSS", there is a PSS $S'$ of $Par_N = Q_J$ with $|S'| \le d(C_N)$. Since $S$ is a mPSS, $2^{N-1} \le |S| \le |S'| \le d(C_N)$ follows. ∎

This lemma gives a lower bound for circuits expressing $Par_N$. We put this into the theorem below.

Theorem 5.7 (Computational Complexity of Partition Function). The computational complexity of $Par_{N,N}$ is greater than $\eta 2^N$, where $\eta$ is a constant.

Proof: $Par_{N,N} = Par_N$; by the lemma "Size of mPSS", for any circuit $C$ expressing $Par_N$, we have $2^{N-1} \le d(C)$, so $\frac{1}{2} 2^N \le d(C)$. ∎

Now we understand the complexity of the partition function for a special case, i.e. $K = N$, or $L = N^2$: $Par_{N,N}$, $N = 2, 3, \ldots$. We then turn our attention to the more general partition function $GPar_L$, $L = 2, 3, \ldots$. We have the following theorem.

Theorem 5.8 (Computational Complexity of General Partition Function). For the general partition function $GPar_L$, $L = 2, 3, \ldots$, the computational complexity is greater than $\eta 2^{[\sqrt{L}]}$, where $[w]$ is the integer part of $w$, and $\eta$ is a constant.

Proof: Denote the computational complexity of $GPar_L$ by $\rho_L$. If $L = N^2$, then $GPar_L = Par_{N,N}$, so we have $\eta 2^N < \rho_L$, or $\eta 2^{\sqrt{L}} < \rho_L$, where $\eta$ is a constant, and in this case $\sqrt{L} = [\sqrt{L}]$. If $N^2 \le L < (N+1)^2$, clearly $\rho_{N^2} \le \rho_L$. Then we have $\eta 2^{[\sqrt{L}]} = \eta 2^N < \rho_L$. ∎

We can discuss another issue. For the partition function, we ask this question: if FE fits the partition function on a sampling set, what will be the outcome? For a general boolean function, this question is usually quite hard. But for the partition function, we have the following lemma, which gives a surprisingly straightforward answer.
Lemma 5.9 (FE to fit Partition Function). Suppose $S \subset B^{N^2}$ is a sampling, suppose FE on $S$ to fit the partition function $Par_N$ generates a circuit $C$, suppose $\Psi_S = \{p_1, p_2, \ldots, p_L\}$ is the partition vector set associated with $S$, and suppose $f_t : B^{N^2} \to B$, $f_t(x) = \varphi(x, p) \odot \Psi_S$. If $T \subset S$ is a PSS of all those boolean functions in addition and subtraction (as shown in section 4), then $C$ must express $f_t$.

Proof: Since $T$ is a PSS of all boolean functions in addition and subtraction, so is $S$. So, for a given $p$, FE on $S$ to fit the boolean function $z(x) = \varphi(x, p)$ will get a circuit that expresses $z(x) = \varphi(x, p)$. Thus, FE on $S$ to fit $Par_N$ is equivalent to FE on $S$ to fit $\varphi(x, p) \odot \Psi_S$. This proves the lemma. ∎

This lemma tells us: if $S$ is more than enough to be a PSS of addition and subtraction (which is relatively very small), then FE on $S$ to fit $Par_N$ will generate a sub-partition function over the partition vector set associated with $S$. This is a good and strong property.

Complexity of Number Partition Problem
The number partition problem is to answer whether a set of $N$ natural numbers $\Omega$ can be divided into two subsets $\Omega_1$ and $\Omega_2$ so that the sum of the numbers in $\Omega_1$ equals the sum of the numbers in $\Omega_2$. The size of the problem is $N$. In this problem, the numbers are integers without restriction. In order to put the problem into the framework of boolean functions, we have to put a restriction on the size of the integers, i.e. $K$-bit integers. So, the partition function has 2 sizes: one is the size of the problem, $N$; the other is the size of the integers, $K$. But the size of the problem (i.e. $N$, how many numbers are to be partitioned) is the same in both the number partition problem and the partition function.

In the previous discussions, we set $K = N$. The discussions in [8] indicate that such a setting makes the partition function interesting. Moreover, $K = N$ makes the partition function easier to handle and gives us convenience. In this setting, as Theorem 5.7 tells us, the computational complexity of the partition function $Par_N = Par_{N,N}$ has the lower bound $\eta 2^N$.

It is easy to see that the computational complexity of the original number partition problem is higher than the computational complexity of $Par_N$. Since the lower bound for the complexity of $Par_N$ is $\eta 2^N$, the computational complexity of the number partition problem with size $N$ also has the lower bound $\eta 2^N$.

In our discussions, there is one form that can help us see the complexity more clearly. This form is: $Par_N(x) = \varphi_N(x, p) \odot PV_N$. This form clearly shows that there are 2 kinds of computational complexity. One is the complexity of computing $\varphi_N$ once $p$ is given (which is $N - 1$ additions or subtractions of $N$-bit integers). The other is the complexity of finding the correct partition vector, which is the size of $PV_N$ and is exponential in $N$. We have already shown in the last section that the first complexity is proportional to $N^2$. And, in the above discussions (Lemma 5.1 to 5.5), we have shown that there is no way to reduce the second complexity to be smaller than $\eta 2^N$; it has to be exponential. Such a form tells us well where the computational complexity comes from.

Now, we can conclude: the lower bound of the computational complexity of the number partition problem is exponential in the size $N$. Here, we quote Cook: "Thus to prove $P \neq NP$ it suffices to prove a super-polynomial lower bound on the size of any family of Boolean circuits solving some specific NP-complete problem, such as 3-SAT." [15] Thus, we have shown $P \neq NP$.

In this section, we write down some extended thoughts, which can help to better understand the tools and methods that we used in this study.
Complexity of learning gauges the effort needed to learn a boolean function, which can be measured by the size of the mPSS. Complexity of computing is the lower bound on the size of boolean circuits expressing the boolean function. According to the 2 theorems, "PSS implies circuit" and "circuit implies PSS", the 2 complexities are equivalent. This is the fundamental thought in this study.
This research on the complexity of the partition function gets its inspiration and tools from our studies on the universal learning machine. However, this research can also feed back into learning theory and push learning to a higher level.

In current learning theory, the learning target is quite often just a specific boolean function. This is not good. A much better learning target would be a boolean function with parameters together with a list of parameters. Such learning could be much more effective and efficient. It is the effectiveness of boolean functions with parameters in this research work that inspires us to look back at learning theory and think so. We will continue to work in this direction.
A sequence of boolean functions $\{f_N\}$, $f_N : B^N \to B$, $N = 1, 2, \ldots$, is a powerful computational model, which Avi Wigderson describes as a "hardware analog of an algorithm" [18]. The sequence of partition functions is a special case of a sequence of boolean functions. One key idea used in the current study is to write the partition function in this form: $f(x) = \varphi(x, p) \odot \Psi$. This form reveals the computational complexity clearly. In this form, there are 2 parts: one is $\varphi$, which is polynomial; the other is $\Psi$, and $\Psi$ cannot be reduced.

The success and power of this form prompt us to propose an idea: such a form is canonical for a sequence of boolean functions. What does this mean? Let us give a clear description.

For a sequence of boolean functions $\{f_N\}$, $f_N : B^N \to B$, $N = 1, 2, \ldots$, suppose $f_N$ can be written in this form: $f_N = \varphi_N(x, p) \odot \{p_1, p_2, \ldots, p_T\}$, where $\varphi_N : B^N \times B^K \to B$ is a boolean function with parameters, the parameter dimension is $K$, $K = K(N)$ is a function of $N$, the complexity of $\varphi_N$ is just polynomial in $N$, the parameter list $\Psi_N = \{p_1, p_2, \ldots, p_T\}$ is in $B^K$, $T = T(N)$ is a function of $N$, and this list of parameters cannot be reduced, i.e. $T$ cannot become smaller. We call such a form $\varphi_N \odot \Psi_N$ a canonical form for $\{f_N\}$.

Then, we put forward a conjecture.

Conjecture 6.1 (Canonical Form).
Any sequence of boolean functions $\{f_N\}$, $N = 1, 2, \ldots$, can be written in a canonical form: $f_N = \varphi_N \odot \Psi_N$.

This is just a conjecture. The canonical form has a deep relationship with the so-called non-deterministic Turing machine (NDTM). This relationship needs further study.

The lemma "FE to fit Partition Function" tells us that the canonical form appears very naturally for the partition function.
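Operationally, the form $f = \varphi \odot \Psi$ can be read as trial and error over the parameter list. The sketch below illustrates that reading for number partition; it is not the paper's construction. The function names are hypothetical, and two assumptions are made: $\varphi$ is taken to be "the signed sum of the numbers is zero", and $\Psi_N$ is taken to be the sign vectors with first entry $+1$, the all-plus vector excluded.

```python
from itertools import product

def canonical_eval(phi, params, x):
    """Evaluate f(x) = phi(x, p) (.) Psi by trial and error over the list params.

    Returns (value, tried): value is 1 as soon as some p accepts x, else 0;
    tried is the number of parameters examined, a rough cost measure.
    """
    tried = 0
    for p in params:
        tried += 1
        if phi(x, p):
            return 1, tried
    return 0, tried

def signed_sum_is_zero(numbers, signs):
    """Assumed reading of phi for number partition: the signed sum vanishes."""
    return sum(s * a for s, a in zip(signs, numbers)) == 0

def sign_vectors(n):
    """Sign vectors with first entry +1, all-plus excluded (assumed encoding of Psi_N)."""
    return [(1,) + t for t in product((1, -1), repeat=n - 1) if -1 in t]
```

For instance, `canonical_eval(signed_sum_is_zero, sign_vectors(4), (1, 2, 3, 4))` finds the partition $1 + 4 = 2 + 3$ after trying several vectors, while the instance `(1, 2, 4, 8)` forces a pass through the whole list before returning 0. Passing a one-element list that happens to contain the right vector returns 1 after a single trial.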
Consider a problem $Q(x)$, where $Q$ stands for the problem and $x$ stands for an instance. Call A the task of solving the problem for a particular $x$ in a short time. Call B the task of finding a way to solve the problem for all $x$ mechanically and in a short time. A and B are different. Clearly, if we can do B, then we can do A. But what about the converse: if we can do A, can we do B? In theory, we know this might not be true. Yet people constantly ask: if A, then B? There is a lot of confusion about the two; "there seems to be an invisible fence between them" [17].
If the problem can be written in canonical form, we can see the situation much better. Suppose $f(x) = \varphi(x, p) \odot \Psi_N$, and the problem is: for an instance $x \in B^N$, is $f(x) = 1$? For this problem, A and B are quite clear. Consider a particular $x \in B^N$. If $f(x) = 1$ and we can pick a short list of parameters for this particular $x$, we can do trial-and-error over that short list according to the canonical form. Since the list is short, solving the problem this way is also fast. This is to do A. However, if we want to do B, then, since the form is canonical, there is no mechanical way to pick a short list of parameters for every given instance $x$. Thus the only mechanical way to compute whether $f(x) = 1$ is to do trial-and-error according to the canonical form, going through the whole of $\Psi_N$. No matter how we choose the order of $\Psi_N$, some parameters are always at the end of the list, so the computational cost is high.
Now we know that, in order to do B, there is no better way than trial-and-error through the whole of $\Psi_N$. This gives a universal algorithm, but one that could need a very long time. To do A, however, we do not need to do so. Often, there is a smarter way. To some degree, we can say that intelligence is the ability to do much better than the universal algorithm. The canonical form indicates that to do A intelligently is to intelligently find a short list of parameters. For many such $x$, this can be done.
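The difference between A and B can be illustrated by counting evaluations of $\varphi$. The sketch below is only an illustration under the same assumptions as before: instances are lists of natural numbers, parameters are sign vectors, and the universal list is the unreduced list of all sign vectors with first entry $+1$. The names `phi`, `solve_with_list` and `full_list` are hypothetical, and the short list for A is simply a witness supplied by hand.

```python
from itertools import product


def phi(x, p):
    """Return 1 if the sign vector p splits x into two equal sums, else 0."""
    return int(sum(s * v for s, v in zip(p, x)) == 0)


def solve_with_list(x, params):
    """Trial-and-error over params.

    Returns a pair (answer, trials), where trials is the number of
    phi evaluations performed before stopping.
    """
    trials = 0
    for p in params:
        trials += 1
        if phi(x, p):
            return 1, trials
    return 0, trials


def full_list(n):
    """Hypothetical universal list: all sign vectors with first entry +1."""
    return [(1,) + rest for rest in product((1, -1), repeat=n - 1)]


if __name__ == "__main__":
    # A: a particular instance with a short, hand-picked parameter list.
    x = [3, 1, 1, 2, 2, 1]
    short_list = [(1, -1, -1, 1, -1, -1)]
    print(solve_with_list(x, short_list))

    # B: a mechanical procedure that must work for every instance.
    # For an instance with no equal partition it exhausts the whole list.
    y = [1, 1, 1, 1, 1, 7]
    print(solve_with_list(y, full_list(len(y))))
```

For the instance `x`, one evaluation of `phi` suffices once the right parameter is known. For the instance `y`, which has no equal partition, the mechanical procedure goes through all $2^{5} = 32$ entries of the list before answering.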
We speculate that one way to find such a short list is by inspiration [5]. The study in this direction would be very fruitful.
We think the approach used in this study of the partition function could be extended to other problems. We can summarize the approach as follows. First, find the canonical form of the problem. This form is very natural for the partition function (it follows almost immediately from its definition). For other problems, considerable effort might be needed to find such a form. The canonical form makes things much easier to understand and to handle. Second, make use of mPSS and PSS (and hence the theorems "PSS implies circuit" and "circuit implies PSS"). In order to do so, we need deep knowledge that is specific to the particular problem. For the partition function, such knowledge is presented in the lemma "Uniqueness of Partition Vector". For other problems, finding such deep knowledge is also the key.
For a given computational problem, whether or not the above two steps can be achieved is an open question. However, we believe that this approach might work for some important computational problems. Can such an approach form the "new, semantically-interesting ways" that Scott Aaronson talked about?
Acknowledgment
Special thanks to Dr. Liu, Yu in France. Since 2017, I have had many online discussions with Dr. Liu on the Turing machine, the non-deterministic Turing machine and other topics related to computation. These discussions were very insightful and helped me to think from different angles. Thanks to Dr. Huang, Daiyong in Shanghai and Mr. Huang, Chong in Wuhan for various and very useful discussions. Thanks to the discussion participants in several WeChat groups, which bring together people from all over the world and form a chaotic yet stimulating communication environment for thoughts.
References