A Typo in the Paterson-Wegman-de Champeaux algorithm

Abstract

We investigate the Paterson-Wegman-de Champeaux linear-time unification algorithm. We show that there is a small mistake in the de Champeaux presentation of the algorithm and we provide a fix.


arXiv [cs.LO]

Valeriu Motroi and Ștefan Ciobâcă
Alexandru Ioan Cuza University, Iași, Romania
{motroival, stefan.ciobaca}@gmail.com

In this paper we investigate the Paterson-Wegman algorithm [3], as improved by de Champeaux [1]. The algorithm has linear-time complexity. In Figure 1 we present the pseudo-code proposed by de Champeaux. We add line numbers and we make some cosmetic changes, which do not affect the algorithm logic. For example, we omit the else branch of an if statement whose then branch ends with an exit statement. The de Champeaux presentation of the algorithm ends with a post-processing step, described in Figure 6.

The issue we identify is that the post-processing step enters an infinite loop. The infinite loop is caused by a bug in the occurs-check test. We give an input producing an infinite loop in the next section. The bug can be fixed syntactically by indenting an assignment statement, i.e., moving it inside the inner code block.

This issue was noticed and fixed by Erik Jacobsen [2] (see footnote on Page 34). However, in this paper we present and analyze a troublesome input in detail.

     1  Procedure Solver(u, v):
     2      Create link (u, v)
     3      While there is a function node r, Finish(r)
     4      While there is a variable node r, Finish(r)
     5      BUILD-SIGMA(SIGMA)
     6  Procedure Finish(r):
     7      if complete(r) then
     8          Exit
     9      if pointer(r) ≠ NIL then
    10          Exit with failure
    11      Create new pushdown stack with operations Push(*) and Pop
    12      pointer(r) := r
    13      Push(r)
    14      while stack ≠ NIL do
    15          s := Pop
    16          if r, s have different function symbols then
    17              Exit with failure
    18          FOR-EACH parent t of s do
    19              Finish(t)
    20          FOR-EACH link (s, t) do
    21              if Complete(t) or t = r then
    22                  Ignore t
    23              else
    24                  if pointer(t) = NIL then
    25                      pointer(t) := r
    26                      Push(t)
    27                  else if pointer(t) ≠ r then
    28                      Exit with failure
    29                  else Ignore t // (since t is already on STACK)
    30          if s ≠ r then
    31              if Variable(s) then
    32                  Subs(s) := r
    33                  Add s to SIGMA (input to BUILD-SIGMA)
    34              else
    35                  Create links { (j-th son(r), j-th son(s)) | 1 ≤ j ≤ q }
    36          Complete(s) := true
    37      end
    38      Complete(r) := true

Figure 1: Paterson-Wegman algorithm as presented by de Champeaux. We add line numbers and we make some cosmetic changes.

We show how the de Champeaux algorithm works when trying to unify the terms X and f(X). The algorithm starts with the DAG representation of the two terms, which we show in Figure 2. As the two terms have maximal sharing between them, there is only one node labeled X. There are two roots, each corresponding to one of the terms to be unified. We use simple arrows to denote the relation between parent and child nodes of the DAG.

Figure 2: The data structures at the start of the algorithm.

The algorithm creates links (undirected edges) between nodes that should be in the same equivalence class. We use dashed lines to denote the links created by the algorithm. The algorithm also maintains stacks (shown graphically on the right) and a set of pointers from nodes to nodes, which are represented by two-headed arrows.

The algorithm starts by creating a link between X and f(X) (Figure 3).

Figure 3: The data structures representation after adding the first link.

The next step is to call Finish on all functional nodes (line 3). In this example we have only one functional node, f. At this step, we have r = f(X). Because complete(r) is false and pointer(r) is NIL, we jump straight to line 12, where we set pointer(r) to r and push it to the stack (Figure 4).

Figure 4: The data structures after pushing the first functional node to the stack.

At the first iteration of the while loop, at line 15, we have s = r. As s and r have the same function symbol, we do not enter the if statement at line 16. As s does not have any parent, we do not enter the loop at line 18. The node s has a link to X and, as a result, at line 21 we have r = f(X), s = f(X), t = X. The node t is not marked complete and is not equal to r, so we enter the if statement at line 24, set pointer(t) to be r and push it on the stack (Figure 5).

After this step, we jump straight to line 30, because there is only one link. We do not enter the if statement at line 30 because s equals r. Then we set complete(s) to true at line 36. Note that s is still f.

In the next iteration of the while loop at line 14 we have s = X. Because of the shared structure of common variables, we call Finish(f) at line 19, but complete(f) is true, so we exit this function call at line 8. Next follows the loop at line 20. We have the initial link between X and f(X), so in this case t = f(X), but complete(t) is true and the node t is ignored (line 22). Moving on, on line 30, we enter the if statement and jump to line 32, because s = X, which is a variable. At line 32 we set subs(X) = f(X) and at line 33 we add X to SIGMA. Then, at line 36, we set complete(s) to true.
The stack is now empty, so we go to line 38, where we set complete(r) to true (this is the second time that complete(f) is set to true). The execution of Finish is done and we call Finish on all variable nodes. We have only one variable, X, which has complete(X) set to true, so we immediately return. Now we call BUILD-SIGMA. One important observation is that we finished the main algorithm and the occurs-check at line 9 did not happen.

Figure 5: The data structures after adding the variable X to the stack.

     1  Procedure BUILD-SIGMA(list-of-variables):
     2      FOR-EACH variable x_i in list-of-variables do
     3          Add to final substitution x_i → EXPLORE-VARIABLE(x_i)
     4  Function EXPLORE-VARIABLE(x_i):
     5      if Ready(x_i) ≠ NIL then
     6          Exit with Ready(x_i)
     7      out := DESCEND(Subs(x_i))
     8      if out = NIL then
     9          out := x_i
    10      Ready(x_i) := out
    11      Exit with out
    12  Function DESCEND(u_i):
    13      if u_i = NIL then
    14          Exit with NIL
    15      if Variable(u_i) then
    16          Exit with EXPLORE-VARIABLE(u_i)
    17      if Constant(u_i) then
    18          Exit with u_i
    19      if Ready(u_i) then
    20          Exit with Ready(u_i)
    21      out := EXPLORE-ARGUMENTS(arguments-of(u_i))
    22      if out = arguments-of(u_i) then
    23          Ready(u_i) := out
    24      else
    25          // Cons gets as first argument a node and as a second argument
    26          // a pointer to a list of nodes and will return a pointer to
    27          // a list of nodes with the first argument in front
    28          // of the second argument.
    29          Ready(u_i) := Cons(Head-of(u_i), out)
    30      Exit with Ready(u_i)
    31  Function EXPLORE-ARGUMENTS(list-of-arguments):
    32      if list-of-arguments = NIL then
    33          Exit with NIL
    34      1st-new := DESCEND(1st(list-of-arguments))
    35      tail-new := EXPLORE-ARGUMENTS(tail(list-of-arguments))
    36      if 1st-new ≠ 1st(list-of-arguments) or tail-new ≠ tail(list-of-arguments) then
    37          Exit with Cons(1st-new, tail-new)
    38      Exit with list-of-arguments

Figure 6: Post-processing step described by de Champeaux.

In Figure 6 we show the implementation of BUILD-SIGMA. The function BUILD-SIGMA creates a substitution from an ordered substitution in linear time. By running the algorithm, we conclude that it enters an infinite loop. In short, below is the order of the function calls.

1. BUILD-SIGMA(list(X)) - at line 1
2. EXPLORE-VARIABLE(X) - at line 3
3. DESCEND(f(X)) - at line 7
4. EXPLORE-ARGUMENTS(list(X)) - at line 21
5. DESCEND(X) - at line 34
6. EXPLORE-VARIABLE(X) - at line 16

Ready(X) is only assigned after DESCEND returns, so the Ready variable is never used to stop the recursion. As a result, we enter an infinite loop.
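The call sequence above can be reproduced with a minimal Python sketch of the post-processing step. It assumes a hypothetical `Term` class whose `subs` and `ready` fields stand in for Subs and Ready, and it replaces EXPLORE-ARGUMENTS with a list comprehension, so it illustrates Figure 6 rather than transcribing it. With the cyclic binding Subs(X) = f(X) left behind by Finish, the recursion never reaches the assignment to `ready`, which Python reports as a RecursionError.

```python
class Term:
    """Hypothetical term node: a variable, a constant, or f(args)."""
    def __init__(self, head, args=(), is_var=False):
        self.head = head
        self.args = list(args)
        self.is_var = is_var
        self.subs = None    # stands in for Subs(x), set by Finish
        self.ready = None   # stands in for Ready(.), the memoised result

def explore_variable(x):
    if x.ready is not None:
        return x.ready
    out = descend(x.subs)
    if out is None:         # unbound variable: it maps to itself
        out = x
    x.ready = out           # only reached after descend returns
    return out

def descend(u):
    if u is None:
        return None
    if u.is_var:
        return explore_variable(u)
    if not u.args:          # constant
        return u
    if u.ready is not None:
        return u.ready
    # Simplification: EXPLORE-ARGUMENTS is folded into one comprehension.
    new_args = [descend(a) for a in u.args]
    unchanged = all(n is a for n, a in zip(new_args, u.args))
    u.ready = u if unchanged else Term(u.head, new_args)
    return u.ready

def build_sigma(variables):
    return {x.head: explore_variable(x) for x in variables}

def show(t):
    if not t.args:
        return t.head
    return f"{t.head}({', '.join(show(a) for a in t.args)})"

def runs_forever(variables):
    """Return True if build_sigma exhausts Python's recursion limit."""
    try:
        build_sigma(variables)
    except RecursionError:
        return True
    return False

x = Term("X", is_var=True)
x.subs = Term("f", [x])     # the cyclic binding X -> f(X)
print(runs_forever([x]))
```

On an acyclic ordered substitution, for example X bound to g(Y) and Y bound to a constant a, the same functions terminate and resolve X to g(a).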

The issue with the pseudo-code presented by de Champeaux is on line 36 in the Finish procedure. Based on the pseudo-code by Paterson-Wegman, the assignment Complete(s) := true at line 36 should be inside the if statement that starts at line 30. We propose a fixed version in Figure 7. This change fixes the pseudo-code; the algorithm remains linear time and there are no further issues.
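The effect of the indentation can be checked with a minimal Python sketch of Finish. The `Node` class, the `link` and `solve` helpers, and the `fixed` flag are hypothetical names introduced for this illustration. The sketch follows the structure of Figure 1, but it is not the authors' Figure 7 and it is not tuned for the linear-time bound.

```python
class UnificationFailure(Exception):
    """Raised where the pseudo-code says 'Exit with failure'."""

class Node:
    """Hypothetical DAG node; shared subterms are shared objects."""
    def __init__(self, symbol, children=(), is_var=False):
        self.symbol = symbol
        self.children = list(children)
        self.is_var = is_var
        self.parents, self.links = [], []
        self.pointer, self.complete, self.subs = None, False, None
        for child in self.children:
            child.parents.append(self)

def link(a, b):
    a.links.append(b)
    b.links.append(a)

def finish(r, sigma, fixed):
    if r.complete:
        return
    if r.pointer is not None:          # occurs-check (line 9)
        raise UnificationFailure("cycle detected")
    r.pointer = r
    stack = [r]
    while stack:
        s = stack.pop()
        both_functions = not s.is_var and not r.is_var
        if both_functions and (s.symbol, len(s.children)) != (r.symbol, len(r.children)):
            raise UnificationFailure("function symbol clash")
        for t in s.parents:
            finish(t, sigma, fixed)
        for t in s.links:
            if t.complete or t is r:
                continue
            if t.pointer is None:
                t.pointer = r
                stack.append(t)
            elif t.pointer is not r:
                raise UnificationFailure("pointer mismatch")
        if s is not r:
            if s.is_var:
                s.subs = r
                sigma.append(s)
            else:
                for a, b in zip(r.children, s.children):
                    link(a, b)
            if fixed:
                s.complete = True      # line 36 moved inside the if
        if not fixed:
            s.complete = True          # line 36 as printed in Figure 1
    r.complete = True

def solve(u, v, nodes, fixed=True):
    """nodes lists every DAG node; function nodes are finished first."""
    sigma = []
    link(u, v)
    for r in nodes:
        if not r.is_var:
            finish(r, sigma, fixed)
    for r in nodes:
        if r.is_var:
            finish(r, sigma, fixed)
    return sigma
```

Calling `solve` on X and f(X) with `fixed=False` returns a substitution list containing X with `subs` pointing at f(X), matching the walkthrough. With `fixed=True` the nested call Finish(f) finds complete(f) still false and pointer(f) set, so the occurs-check raises `UnificationFailure`.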

We investigate the Paterson-Wegman linear-time unification algorithm as improved by de Champeaux. We show an example where the occurs-check test fails to work as expected and results in an infinite loop in the post-processing step. We show that the issue is caused by a misindented statement (line 36) in the pseudo-code. Once the statement is properly indented, the algorithm is correct and works in linear time as claimed.

References

[1] Dennis de Champeaux. About the Paterson-Wegman linear unification algorithm. J. Comput. Syst. Sci., 32(1):79–90, February 1986.
[2] Erik Jacobsen. Unification and anti-unification. Technical report, 1991 (accessed: June 2020). http://erikjacobsen.com/pdf/unification.pdf
[3] M. S. Paterson and M. N. Wegman. Linear unification. In Proceedings of the Eighth Annual ACM Symposium on Theory of Computing, STOC '76, pages 181–186, New York, NY, USA, 1976. ACM.

Procedure Finish(r):
    if complete(r) then
        Exit
    if pointer(r) ≠ NIL then
        Exit with failure
    Create new pushdown stack with operations Push(*) and
