On Computation Complexity of True Proof Number Search
Chao Gao
University of Alberta ∗
[email protected]

Abstract
We point out that computing the true proof and disproof numbers for proof number search in arbitrary directed acyclic graphs is NP-hard, an important theoretical result for proof number search. The proof uses a reduction from SAT, which demonstrates that finding the true proof or disproof number for an arbitrary DAG is at least as hard as deciding whether an arbitrary SAT instance is satisfiable, and is thus NP-hard.
Solving games is an important research direction in artificial intelligence. Proof number search (PNS) [Allis, 1994; Allis et al., 1994] was developed specifically for solving games, drawing inspiration from conspiracy number search [McAllester, 1988]. Unlike Alpha-Beta based fixed-depth pruning [Knuth and Moore, 1975], PNS is a best-first search that iteratively adjusts its search path according to proof and disproof numbers, making it a stronger alternative for solving games, especially in the presence of deep and narrow lines of play. The proof and disproof numbers of a node x express the minimum number of descendant leaf nodes that must be solved in order to prove that x is a win or a loss, respectively. In its standard implementation, PNS initializes the proof and disproof numbers of leaf nodes to (1, 1) and then backs these numbers up recursively through a sum operation, though there are enhancements that establish more informative initial proof and disproof numbers at node creation [Breuker, 1998; Breuker et al., 1999; Winands et al., 2004]. Depth-first proof number search (DFPN) [Nagai, 2002] is a PNS variant that adopts two thresholds to avoid unnecessary traversal of the search tree; it behaves the same as PNS in trees but has a lower memory footprint at the expense of re-expansion, and it can be further improved by incorporating various general or game-dependent techniques. Yoshizoe et al. [Yoshizoe et al., 2007] introduced λ search to DFPN to solve capturing problems in Go; threshold controlling and source node detection [Kishimoto, 2010] were introduced to DFPN to deal with a variety of issues from Tsume-Shogi.

∗ Extended work of a section in [Gao, 2020].
Together with other game-specific or game-independent algorithmic developments, PNS and its variants have been used to solve a number of games, e.g., Gomoku [Allis et al., 1996], checkers [Schaeffer et al., 2007], and small-board Hex [Pawlewicz and Hayward, 2013; Gao et al., 2017]. Furthermore, since PNS was developed by modeling game search as AND/OR trees [Nilsson, 1980], the algorithm has also been applied to real-world problems that can be described as a form of AND/OR graph [Pearl, 1984]. A notable example is chemical synthesis [Heifets and Jurisica, 2012; Kishimoto et al., 2019].

One problem in PNS is that the recursively computed proof and disproof numbers are well-defined on AND/OR trees [Allis, 1994], but in practice the AND/OR structured state-space graph of many problems, including many two-player games, is a directed acyclic graph (DAG). In some domains, it has been shown that treating the underlying graph as a tree may cause serious over-counting of both proof and disproof numbers, resulting in a huge proof or disproof number for an easy-to-solve node and consequently preventing PNS from solving the domain for a long time [Kishimoto, 2010; Nagai, 2002]. Algorithms with exponential complexity are known that find the true proof and disproof numbers at each node in DAGs [Schijf et al., 1994], but they are impractical even for toy problems. Heuristic techniques address this by replacing sum-cost with a variant of max-cost at each node [Ueda et al., 2008], or by identifying some specific cases and curing them individually [Kishimoto, 2010; Nagai, 2002]. This over-counting problem has been discussed extensively in [Kishimoto et al., 2012; Kishimoto and Marinescu, 2014]; however, the theoretical complexity of dealing with it has not been presented by previous researchers. In this paper, we aim to formally establish the computational difficulty of true proof number search in DAGs; we show that such a task is NP-hard.
Note that an earlier version of this discussion appeared as a section of [Gao, 2020]. This paper gives a focused account of the hardness of PNS as an independent work, without involving other unrelated topics.
To make this paper self-contained, this section reviews the preliminaries of PNS and then discusses the over-counting issue of PNS in DAGs.

arXiv: [cs.CC] Feb

2.1 Directed Acyclic AND/OR Graphs

A directed acyclic AND/OR graph [Pearl, 1984] is a DAG with the additional property that every edge coming out of a node is labeled either as an OR edge or as an AND edge. A node that has only outgoing OR edges is called an OR node. Conversely, a node emitting only AND edges is called an AND node. Any node emitting both AND and OR edges is called a mixed node. To distinguish them in graphic notation, all AND edges from the same node are often grouped using an arc. It is also easy to see that a mixed node can be replaced with two pure AND and OR nodes; see Figure 1. Thus, for ease of presentation, in the remainder of this paper we assume that a directed acyclic AND/OR graph contains only pure AND and OR nodes. That is, we denote the graph as a tuple G =
Figure 1: Left: A is a mixed node. Right: A is an OR node; E is an AND node. The two graphs are equivalent, but the right one contains only pure AND and OR nodes.
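The replacement of a mixed node illustrated in Figure 1 can be sketched in a few lines of Python. The edge-list encoding of the graph, the function name `to_pure`, and the `_and` suffix for the new AND node are assumptions made for this illustration; the text only states that such a replacement exists.

```python
def to_pure(graph):
    """Split every mixed node into an OR node plus one new AND node.

    `graph` maps a node name to a list of (edge_label, child) pairs with
    edge_label in {"AND", "OR"}; this encoding is an assumption of the sketch.
    Returns a dict mapping each node to (node_type, children).
    """
    pure = {}
    for name, edges in graph.items():
        or_children = [child for label, child in edges if label == "OR"]
        and_children = [child for label, child in edges if label == "AND"]
        if or_children and and_children:
            helper = name + "_and"  # hypothetical name for the new AND node
            if helper in graph:
                raise ValueError(f"helper name {helper!r} is already in use")
            # Solving the mixed node means solving one OR child, or all of
            # the children that were reached through AND edges.
            pure[name] = ("OR", or_children + [helper])
            pure[helper] = ("AND", and_children)
        elif and_children:
            pure[name] = ("AND", and_children)
        else:
            # Pure OR nodes and leaves (no outgoing edges).
            pure[name] = ("OR", or_children)
    return pure


# Hypothetical mixed node A: an OR edge to B, AND edges to C and D.
mixed = {"A": [("OR", "B"), ("AND", "C"), ("AND", "D")],
         "B": [], "C": [], "D": []}
pure = to_pure(mixed)
print(pure["A"], pure["A_and"])
```

In this sketch the former mixed node becomes an OR node and the grouped AND edges move to the new node, which mirrors the roles of A and E in the right-hand graph of Figure 1.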
Figure 2: A directed acyclic AND/OR graph representing problem reduction. B is the only AND node. A and C are OR nodes. D, E, and F are leaf nodes that can be regarded as either AND or OR nodes.

Figure 2 also shows that if we assume A is solvable, then in the best case only one sub-problem, E, needs to be solvable to validate the assumption. Conversely, knowing that E and F are both unsolvable is enough to conclude that A is unsolvable. In other words, the minimum number of leaf nodes to examine is 1 for proving A and 2 for disproving it. The sub-graph used to claim that A is solvable or unsolvable is called a solution-graph. For Figure 2, a solvable solution-graph can be {A → C → E}, assuming E is solvable; an unsolvable solution-graph can be {A → C → E, C → F}, assuming both E and F are unsolvable.

Formally, given a graph $G$, $\forall x \in \{V_o, V_a\}$, define $p(x)$ and $d(x)$ as the minimum number of descendant leaf nodes that must be solved in order to prove and disprove $x$, respectively. When $G$ is an AND/OR tree, the following recursive computation scheme exists [Allis, 1994]:

$$
p(x) = \begin{cases}
1 & x \text{ is a non-terminal leaf node} \\
\min_{x_j \in ch(x)} p(x_j) & x \text{ is an OR node} \\
\sum_{x_j \in ch(x)} p(x_j) & x \text{ is an AND node}
\end{cases}
\qquad
d(x) = \begin{cases}
1 & x \text{ is a non-terminal leaf node} \\
\min_{x_j \in ch(x)} d(x_j) & x \text{ is an AND node} \\
\sum_{x_j \in ch(x)} d(x_j) & x \text{ is an OR node}
\end{cases}
\tag{1}
$$

In Eq. (1), $ch(x)$ represents the set of direct successors of $x$. One scenario Eq. (1) does not cover is when $x$ is a terminal leaf node, in which case the proof and disproof numbers of $x$ are self-evident, as in Eq. (2).

$$
p(x) = \begin{cases}
0 & x \text{ is solvable} \\
\infty & x \text{ is unsolvable}
\end{cases}
\qquad
d(x) = \begin{cases}
0 & x \text{ is unsolvable} \\
\infty & x \text{ is solvable}
\end{cases}
\tag{2}
$$

Equipped with proof and disproof numbers, PNS conducts a best-first search by repeating the following steps:
1. Selection. Starting from the root, at each node x: 1) if x is an OR node, select a child node with the minimum p value; 2) if x is an AND node, select a child node with the minimum d value.
Stop when x is a leaf node.
2. Evaluation and Expansion. An external function is called to check whether the leaf is terminal. If not, the leaf node is expanded and all its new children are assigned $(p, d) \leftarrow (1, 1)$.
3. Backup. The updated proof and disproof numbers of the selected leaf node are back-propagated up the tree according to Eq. (1) and (2).

Given sufficient memory and computation time, PNS and DFPN have been shown to be complete not only for AND/OR trees but also for acyclic AND/OR graphs [Kishimoto and Müller, 2008; Allis, 1994]. That is, PNS terminates when the root node becomes terminal, i.e., its $(p, d)$ becomes either $(0, \infty)$ or $(\infty, 0)$, indicating respectively that the root is solvable or unsolvable.

An AND/OR tree can be used to model a game-tree [Nilsson, 1980], in which case the state nodes where the first player is to play are OR nodes and those of the second player are AND nodes. Compared to general AND/OR trees, the additional regularity of the AND/OR tree of a two-player alternate-turn zero-sum game is that OR and AND nodes appear alternately in layers. That is, if $x$ is an OR node, then every $y \in ch(x)$ must be an AND node, and vice versa.

In this game context, the notions of $p(x)$ and $d(x)$ in Eq. (1) and (2) can be simplified to $\phi(x)$ and $\delta(x)$, which respectively represent the minimum number of non-terminal leaf nodes to solve in order to prove that $x$ is winning and losing. Here, a node is said to be a winning state if the player to play at that node wins (and likewise for losing). It then becomes unnecessary to explicitly discern whether $x$ is an AND or an OR node, since $\phi(x)$ depends solely on the $\delta$ values of $ch(x)$, and $\delta(x)$ equals the sum of the $\phi$ values of $ch(x)$. This simplified computation scheme is expressed precisely in Eq. (3).
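The general scheme described so far, Eq. (1) and (2) together with the three-step loop, can be sketched in Python before it is specialized to games. This is a minimal illustration on an explicit tree, not the implementation of any cited work: the `Node` class, its `hidden` and `solvable` fields, and the example tree are all assumptions introduced here. Every non-terminal node is assumed to have at least one successor.

```python
import math

INF = math.inf


class Node:
    """One node of an explicit AND/OR tree (hypothetical helper class)."""

    def __init__(self, kind="OR", children=(), solvable=None):
        self.kind = kind                # "OR" or "AND"
        self.hidden = list(children)    # successors, revealed on expansion
        self.children = []              # successors generated so far
        self.solvable = solvable        # True/False for terminals, else None
        self.p, self.d = 1, 1           # Eq. (1): value of a fresh leaf


def backup(node):
    """Recompute (p, d) of an expanded node from its children, Eq. (1)."""
    ps = [c.p for c in node.children]
    ds = [c.d for c in node.children]
    if node.kind == "OR":
        node.p, node.d = min(ps), sum(ds)
    else:
        node.p, node.d = sum(ps), min(ds)


def pns(root):
    """Run PNS on a tree; return (root_is_solvable, iterations)."""
    iterations = 0
    while root.p != 0 and root.d != 0:
        # 1. Selection: minimum p at OR nodes, minimum d at AND nodes.
        path = [root]
        while path[-1].children:
            node = path[-1]
            if node.kind == "OR":
                path.append(min(node.children, key=lambda c: c.p))
            else:
                path.append(min(node.children, key=lambda c: c.d))
        leaf = path.pop()
        # 2. Evaluation and expansion.
        if leaf.solvable is not None:
            # Terminal leaf: Eq. (2).
            leaf.p, leaf.d = (0, INF) if leaf.solvable else (INF, 0)
        else:
            leaf.children = leaf.hidden  # new children start at (1, 1)
            backup(leaf)
        # 3. Backup along the selected path, from the leaf's parent upward.
        for node in reversed(path):
            backup(node)
        iterations += 1
    return root.p == 0, iterations


# Hypothetical example tree: an OR root over one AND node and one OR node.
root = Node("OR", [
    Node("AND", [Node(solvable=True), Node(solvable=False)]),
    Node("OR", [Node(solvable=False), Node(solvable=True)]),
])
solved, iterations = pns(root)
print(solved, root.p, root.d)
```

The sketch evaluates a leaf only when it is selected, following step 2 as stated above, so terminal leaves also start at (1, 1) until they are checked. On a DAG the same `backup` would still run, but its sums would no longer be the true numbers.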
$$
\phi(x) = \begin{cases}
1 & x \text{ is a non-terminal leaf node} \\
0 & x \text{ is a terminal winning state} \\
\infty & x \text{ is a terminal losing state} \\
\min_{x_j \in ch(x)} \delta(x_j) & \text{otherwise}
\end{cases}
\qquad
\delta(x) = \begin{cases}
1 & x \text{ is a non-terminal leaf node} \\
\infty & x \text{ is a terminal winning state} \\
0 & x \text{ is a terminal losing state} \\
\sum_{x_j \in ch(x)} \phi(x_j) & \text{otherwise}
\end{cases}
\tag{3}
$$

Figure 3 shows an example game-tree where, according to PNS, node j is selected for the next expansion; afterwards, the ancestors of j are updated to reflect the change in j. Note that, for convenience, in the game context nodes of the first and second player are drawn as squares and circles respectively, eliminating the use of arcs between edges to represent AND nodes.

Figure 3: PNS example in a game-tree. Each node has a pair of evaluations $(\phi, \delta)$, computed bottom up. A bold edge indicates a link where the minimum selection is made at each node, according to the children's $\delta$ values.

In games, the existence of transpositions means that the state-space graph is often a DAG rather than a tree. Although the computational convenience of Eq. (3) still holds due to the alternating regularity, as with Eq. (1) for general AND/OR graphs, $\phi$ and $\delta$ no longer represent the true proof and disproof numbers. In the next section, we discuss the issue in detail.

Figure 4: PNS example in a game DAG. Solving E to be a loss would imply that both B and C are winning, and thus that A is a loss; this implies that the $\delta$ value of A should be 1, but computation according to Eq. (3) gives $\delta(A) = 2$. This is because, when summing the $\phi$ values from B and C, E is counted twice.
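The situation in Figure 4 can be reproduced with a small Python sketch that compares Eq. (3) with a brute-force computation of the true numbers. The adjacency-dict encoding and the function names are assumptions of this illustration, and the graph (A with children B and C, which share the single child E) is inferred from the caption. Only non-terminal leaves are handled, and the exact computation enumerates sets of leaves, so it is exponential in general.

```python
from itertools import product


def eq3(dag, node):
    """(phi, delta) by the recursion of Eq. (3); leaves are non-terminal."""
    children = dag[node]
    if not children:
        return 1, 1
    values = [eq3(dag, child) for child in children]
    phi = min(delta_c for _, delta_c in values)
    delta = sum(phi_c for phi_c, _ in values)
    return phi, delta


def leaf_sets(dag, node):
    """Return (win_sets, loss_sets): families of leaf sets whose resolution
    shows that `node` is winning / losing for the player to move."""
    children = dag[node]
    if not children:
        single = {frozenset([node])}
        return single, single
    per_child = [leaf_sets(dag, child) for child in children]
    # A node is winning if some child is losing for the opponent.
    win = set().union(*(loss for _, loss in per_child))
    # A node is losing if every child is winning: choose one winning set
    # per child and merge them, so shared leaves are counted only once.
    lose = {frozenset().union(*combo)
            for combo in product(*(w for w, _ in per_child))}
    return win, lose


def true_numbers(dag, node):
    """True (phi, delta): sizes of the smallest winning and losing sets."""
    win, lose = leaf_sets(dag, node)
    return min(map(len, win)), min(map(len, lose))


# Game DAG as described in the caption of Figure 4 (structure inferred).
fig4 = {"A": ["B", "C"], "B": ["E"], "C": ["E"], "E": []}
print(eq3(fig4, "A"), true_numbers(fig4, "A"))
```

The difference between the two functions is the merge step: `eq3` adds numbers, while `leaf_sets` takes unions of leaf sets, which is what prevents E from being counted twice.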
[Figure 5 graphic: lattice-shaped graph with node labels 1, 2, 3, 4, 6, 10, and 20.]
Figure 5: An AND/OR graph in lattice form with 7 layers. Every node is an AND node. For such a graph with $n$ layers, the true proof number of the root is 1, but computing it with Eq. (1) gives $\binom{n-1}{k}$, where $k = (n-1)/2$.

In a game-DAG, computing proof and disproof numbers via Eq. (3) can count some non-terminal leaf nodes multiple times. Figure 4 shows an example where E is counted twice at A.

The over-counting problem in general AND/OR graphs can be extremely severe, since a node may be counted an exponential number of times. Figures 5 and 6 are two examples where the drastic difference between the true proof number and the number computed recursively by Eq. (1) can be seen analytically.

A natural question then arises: for a layered DAG rooted at node x, what is the computational difficulty of calculating the true proof and disproof numbers for x? Even though many exact [Schijf et al., 1994; Müller, 2002] and heuristic [Ueda et al., 2008; Kishimoto, 2010] methods have been proposed to address the over-counting issue, a formal proof of the theoretical difficulty of this task has been lacking. In the next section we prove that this sub-task in proof number search is NP-hard.

Figure 6: An AND/OR graph in combinatorial lattice form with 5 layers. For such a graph with $n$ layers, the true proof number of the root is 1, but computing it with Eq. (1) gives $(n-1)!$.

We now show that the computation of true proof and disproof numbers in arbitrary AND/OR DAGs is NP-hard. This proof is closely related to the proof of computational difficulty for the "optimal solution graph" problem in [Sahni, 1974]. Our contribution is to bring this classic result into the context of computing proof and disproof numbers, giving the heuristic search community a definite answer to an important question concerning PNS. First, we have the following observation.
Theorem 1. Deciding whether a SAT instance is satisfiable can be reduced to finding the true proof (or disproof) number of an AND/OR graph.

Proof. Consider a SAT instance in conjunctive normal form $P = \wedge_{i=1}^{k} C_i$, where each clause is a disjunction of literals, $C_i = \vee_j l_j$. Let $x_1, x_2, \ldots, x_n$ be all the variables; each literal $l_j$ is either $x_j$ or $\neg x_j$. Construct an AND/OR graph as follows.
1. Let the start node be an AND node, denoted by $P$.
2. $P$ has $n + k$ successors $C_i$ and $X_j$, $\forall i = 1, \ldots, k$, $\forall j = 1, \ldots, n$. They are all OR nodes.
3. Each $X_j$ has two successors $Tx_j$ and $Fx_j$, representing $x_j$ and $\neg x_j$ respectively. The successors of each $C_i$ are the literals that appear in that clause.

For such a graph, to satisfy the start node $P$, every clause must be satisfied and each variable node $X_j$ must be assigned a value; thus, the minimum possible proof number for $P$ is $n$. In this best case, only one of $Tx_j$ and $Fx_j$ is needed to satisfy each $X_j$, and the $n$ selected leaves must also satisfy every clause, which is possible exactly when the instance is satisfiable. This is equivalent to saying that deciding whether a SAT instance is satisfiable can be transformed into finding the true proof number of $P$ in this AND/OR graph.

For the disproof number, construct another graph by inverting the above graph so that all AND nodes are converted to OR nodes and all OR nodes to AND nodes. Then, to disprove $P$, it is sufficient to disprove any one of $\{C_1, \ldots, C_k, X_1, \ldots, X_n\}$; however, the minimum possible solution could be to disprove $C_i$ if $\exists i = 1, \ldots, k$ such that $C_i$ contains less than literals, otherwise $X_j$, $\forall j = 1, \ldots, n$. In either case, this would render the SAT instance unsatisfiable. So, this is equivalent to saying that deciding whether a SAT instance is unsatisfiable can be converted into finding the true disproof number of $P$ in the constructed AND/OR graph.

An example formula P = (x ∨ x ∨ x) ∧ (¬x ∨ ¬x ∨ ¬x) ∧ (¬x ∨ x) and the constructed AND/OR graphs are shown in Figure 7.

Figure 7: Above: Deciding whether this SAT instance is satisfiable can be reduced to finding the true proof number for P. Below: Deciding whether this SAT instance is unsatisfiable can be reduced to finding the true disproof number for P.

Then, we can derive the following result.

Theorem 2. Given an arbitrary AND/OR DAG rooted at $x$, computing the true proof and disproof numbers for $x$ is NP-hard.

Proof. The graph construction from SAT in Theorem 1 takes polynomial time. It follows that computing the proof or disproof number exactly in an arbitrary DAG is at least as difficult as deciding whether a SAT instance is satisfiable, and is thus NP-hard.
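The proof-number half of the construction in Theorem 1 can be checked on small instances with the following Python sketch. The signed-integer clause encoding (`j` for $x_j$, `-j` for $\neg x_j$), the function names, and the two example formulas are assumptions of this illustration rather than part of the proof. The true proof number is found by enumerating leaf subsets, which takes exponential time, in line with the hardness result.

```python
import math
from itertools import combinations


def build_graph(num_vars, clauses):
    """AND/OR graph of Theorem 1 as a dict: node -> (type, successors)."""
    graph = {}
    for j in range(1, num_vars + 1):
        graph[f"Tx{j}"] = ("LEAF", [])
        graph[f"Fx{j}"] = ("LEAF", [])
        graph[f"X{j}"] = ("OR", [f"Tx{j}", f"Fx{j}"])
    for i, clause in enumerate(clauses, start=1):
        literals = [f"Tx{l}" if l > 0 else f"Fx{-l}" for l in clause]
        graph[f"C{i}"] = ("OR", literals)
    successors = [f"C{i}" for i in range(1, len(clauses) + 1)]
    successors += [f"X{j}" for j in range(1, num_vars + 1)]
    graph["P"] = ("AND", successors)
    return graph


def proved(graph, node, solved):
    """Is `node` proved once every leaf in `solved` has been proved?"""
    kind, children = graph[node]
    if kind == "LEAF":
        return node in solved
    results = (proved(graph, child, solved) for child in children)
    return any(results) if kind == "OR" else all(results)


def true_proof_number(graph, root="P"):
    """Smallest number of distinct leaves whose proof proves the root."""
    leaves = [n for n, (kind, _) in graph.items() if kind == "LEAF"]
    for size in range(len(leaves) + 1):
        for subset in combinations(leaves, size):
            if proved(graph, root, set(subset)):
                return size
    return math.inf  # only reached if some clause is empty


def is_satisfiable(num_vars, clauses):
    """Satisfiable exactly when the true proof number of P equals n."""
    return true_proof_number(build_graph(num_vars, clauses)) == num_vars


# Hypothetical instances: (x1 or x2) and (not x1 or x2); x1 and (not x1).
print(is_satisfiable(2, [[1, 2], [-1, 2]]))
print(is_satisfiable(1, [[1], [-1]]))
```

Applying Eq. (1) directly to this graph would return $n + k$ whenever every clause is non-empty, regardless of satisfiability, because each of the $n + k$ OR successors of $P$ contributes 1 to the sum; only the true proof number distinguishes the two cases.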
We proved that computing the exact proof and disproof numbers for directed acyclic graphs is NP-hard. We expect our discussion to provide useful inspiration for future PNS developments, either for solving games or for real-world AND/OR graphs.

References

[Allis et al., 1994] L. Victor Allis, Maarten van der Meulen, and H. Jaap van den Herik. Proof-number search. Artificial Intelligence, 66(1):91–124, 1994.
[Allis et al., 1996] L. Victor Allis, H. Jaap van den Herik, and M. P. H. Huntjens. GoMoku solved by new search techniques. Computational Intelligence, 12:7–23, 1996.
[Allis, 1994] L. V. Allis. Searching for solutions in games and artificial intelligence. PhD thesis, Universiteit Maastricht, 1994.
[Breuker et al., 1999] Dennis Michel Breuker, Joseph Willem Hubertus Marie Uiterwijk, and Hendrik Jacob Herik. The PN2-search algorithm. Universiteit Maastricht, Department of Computer Science, 1999.
[Breuker, 1998] Dennis M. Breuker. Memory versus search in games. PhD thesis, 1998.
[Gao et al., 2017] Chao Gao, Martin Müller, and Ryan Hayward. Focused depth-first proof number search using convolutional neural networks for the game of Hex. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, pages 3668–3674, 2017.
[Gao, 2020] Chao Gao. Search and Learning Algorithms for Two-Player Games with Application to the Game of Hex. PhD thesis, University of Alberta, 2020.
[Heifets and Jurisica, 2012] Abraham Heifets and Igor Jurisica. Construction of new medicines via game proof search. In Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.
[Kishimoto and Marinescu, 2014] Akihiro Kishimoto and Radu Marinescu. Recursive best-first AND/OR search for optimization in graphical models. In UAI, pages 400–409, 2014.
[Kishimoto and Müller, 2008] Akihiro Kishimoto and Martin Müller. About the completeness of depth-first proof-number search. In International Conference on Computers and Games, pages 146–156. Springer, 2008.
[Kishimoto et al., 2012] Akihiro Kishimoto, Mark H. M. Winands, Martin Müller, and Jahn-Takeshi Saito. Game-tree search using proof numbers: The first twenty years. ICGA Journal, 35(3):131–156, 2012.
[Kishimoto et al., 2019] Akihiro Kishimoto, Beat Buesser, Bei Chen, and Adi Botea. Depth-first proof-number search with heuristic edge cost and application to chemical synthesis planning. In Advances in Neural Information Processing Systems, pages 7224–7234, 2019.
[Kishimoto, 2010] Akihiro Kishimoto. Dealing with infinite loops, underestimation, and overestimation of depth-first proof-number search. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, pages 108–113. AAAI Press, 2010.
[Knuth and Moore, 1975] Donald E. Knuth and Ronald W. Moore. An analysis of alpha-beta pruning. Artificial Intelligence, 6(4):293–326, 1975.
[McAllester, 1988] David Allen McAllester. Conspiracy numbers for min-max search. Artificial Intelligence, 35(3):287–310, 1988.
[Müller, 2002] Martin Müller. Proof-set search. In International Conference on Computers and Games, pages 88–107. Springer, 2002.
[Nagai, 2002] Ayumu Nagai. Df-pn algorithm for searching AND/OR trees and its applications. PhD thesis, Department of Information Science, University of Tokyo, 2002.
[Nilsson, 1980] Nils J. Nilsson. Principles of artificial intelligence. Morgan Kaufmann, 1980.
[Pawlewicz and Hayward, 2013] Jakub Pawlewicz and Ryan B. Hayward. Scalable parallel DFPN search. In International Conference on Computers and Games, pages 138–150. Springer, 2013.
[Pearl, 1984] Judea Pearl. Heuristics: intelligent search strategies for computer problem solving. 1984.
[Sahni, 1974] Sartaj Sahni. Computationally related problems. SIAM Journal on Computing, 3(4):262–279, 1974.
[Schaeffer et al., 2007] Jonathan Schaeffer, Neil Burch, Yngvi Björnsson, Akihiro Kishimoto, Martin Müller, Robert Lake, Paul Lu, and Steve Sutphen. Checkers is solved. Science, 317(5844):1518–1522, 2007.
[Schijf et al., 1994] Martin Schijf, L. Victor Allis, and Jos W. H. M. Uiterwijk. Proof-number search and transpositions. ICGA Journal, 17(2):63–74, 1994.
[Ueda et al., 2008] Toru Ueda, Tsuyoshi Hashimoto, Junichi Hashimoto, and Hiroyuki Iida. Weak proof-number search. In International Conference on Computers and Games, pages 157–168. Springer, 2008.
[Winands et al., 2004] Mark H. M. Winands, Jos W. H. M. Uiterwijk, and H. Jaap van den Herik. An effective two-level proof-number search algorithm. Theoretical Computer Science, 313(3):511–525, 2004.
[Yoshizoe et al., 2007] Kazuki Yoshizoe, Akihiro Kishimoto, and Martin Müller. Lambda depth-first proof number search and its application to go. In