A Theoretical Study of (Full) Tabled Constraint Logic Programming
Under consideration for publication in Theory and Practice of Logic Programming

Joaquín Arias, Manuel Carro
IMDEA Software Institute, Universidad Rey Juan Carlos, Universidad Politécnica de Madrid
[email protected], manuel.carro@{imdea.org,upm.es}

submitted 1 January 2003; revised 1 January 2003; accepted 1 January 2003

Abstract
Logic programming with tabling and constraints (TCLP, tabled constraint logic programming) has been shown to be more expressive and, in some cases, more efficient than LP, CLP, or LP with tabling. In this paper we provide insights regarding the semantics, correctness, completeness, and termination of top-down execution strategies for full TCLP, i.e., TCLP featuring entailment checking in the calls and in the answers. We present a top-down semantics for TCLP and show that it is equivalent to a fixpoint semantics. We study how the constraints that a program generates can effectively impact termination, even for constraint classes that are not constraint compact, generalizing previous results. We also present how different variants of constraint projection impact the correctness and completeness of TCLP implementations. All of the presented characteristics are implemented (or can be experimented with) in Mod TCLP, a modular framework for Tabled Constraint Logic Programming, part of the Ciao Prolog logic programming system.
KEYWORDS : Constraints, Tabling, Logic programming, Foundations, Implementation.
Constraint Logic Programming (CLP) (Jaffar and Maher 1994) extends Logic Programming (LP) with variables that can belong to arbitrary constraint domains and the ability to incrementally solve equations involving these variables. CLP brings additional expressive power to LP, since constraints can very concisely capture complex relationships. Also, shifting from “generate-and-test” to “constraint-and-generate” patterns reduces the search tree and therefore brings additional performance, even if constraint solving is in general more expensive than first-order unification.

Tabling (Tamaki and Sato 1986; Warren 1992) is an execution strategy for logic programs that suspends repeated calls which could cause infinite loops. Answers from non-looping branches are used to resume suspended calls which can, in turn, generate more answers. Only new answers are saved, and evaluation finishes when no new answers can be generated. Tabled evaluation always terminates for calls/programs with the bounded term depth property (those that can only generate terms with a finite bound on their depth) and can improve efficiency for terminating programs that repeat computations.

∗ Work partially supported by EIT Digital, MINECO project TIN2015-67522-C3-1-R (TRACES), MICINN project PID2019-108528RB-C21 (ProCode), and Comunidad de Madrid project S2018/TCS-4339 BLOQUES-CM co-funded by EIE Funds of the European Union.
Fig. 1a (left-recursive Prolog version of dist/3 and its query):

    dist(X, Y, D) :-
        dist(X, Z, D1),
        edge(Z, Y, D2),
        D is D1 + D2.
    dist(X, Y, D) :-
        edge(X, Y, D).

    ?- dist(a,Y,D), D < K.

• Tabling can be used naturally to compute fixpoints (Kanamori and Kawamura 1993; Janssens and Sagonas 1998), but, additionally, by implementing abstract domain operations as constraints (Arias and Carro 2019b), entailment will automatically detect more particular calls and suspend their execution to reuse analysis results from most general calls, thereby speeding up the fixpoint computation. Constraints can also be used to state preconditions to the analysis results before the analysis starts in a powerful yet flexible fashion. These preconditions can propagate during the evaluation and help solve some verification problems faster.

• Reasoning on ontologies: An ontology formalizes types, properties, and interrelationships among entities. Ontologies can be expressed as a lattice constraint system and, with TCLP, evaluation in ontologies can benefit from entailment of instances which are more particular than other entities, in a fashion similar to OWL, but in potentially richer domains and/or more complex scenarios (e.g., stream data analysis (Arias 2016)).

• Constraint-based verification: Verification conditions can be encoded as constraint systems, and the tabling engine can use entailment to guarantee termination and save execution time (Charatonik et al. 2002; Jaffar et al. 2004; Gange et al. 2013).

• Incremental evaluation of aggregates: For aggregates that can be embedded into a lattice (e.g., minimum), the aggregation operation can be expressed based on the partial order of the lattice. In these cases, the aggregate operations in the lattice can be seen as a counterpart of the operations among constraints defined in TCLP (Arias and Carro 2019c).

In order to highlight some of the advantages of TCLP vs.
LP, tabling, and CLP with respect to declarativeness and logical reading, in (Arias and Carro 2019a) we compared how different versions of a program to compute distances between nodes in a graph behave under these three approaches. Each version was adapted to a different paradigm, but trying to stay as close as possible to the original code, so that the additional expressiveness can be solely attributed to the paradigm itself.

• LP: The code in Fig. 1a is the Prolog version of a program used to find the distance between two nodes in a graph. The distance between two nodes is calculated by adding variables D1 and D2, corresponding to distances to and from an intermediate node, once they are instantiated. The figure also shows a query used to determine which node(s) Y is/are within a distance K from node a. This query does not terminate as left recursion makes the recursive clause enter an infinite loop. If we convert the program to a right-recursive version by swapping the calls to edge/3 and dist/3, the program will still not terminate in a cyclic graph.

• CLP(R): Fig. 1b is the CLP(R) version of the same code where addition is modeled as a constraint and placed at the beginning of the clause. Since the total distance D is bounded by the constraint D < K in the query, the search would be expected to be pruned if D exceeds the maximum distance, K. However, the constraints placed before the recursive call do not cause this bound to be violated, and therefore it would enter a loop even for graphs without loops. The right-recursive version of the CLP(R) program in Fig. 1c will however finish because the initial bound to the distance eventually causes the constraint store to become inconsistent, which provokes a failure in the search. Note that this transformation is easy in this case, but it would not have the same effect should the clause be written with a (logically equivalent) double recursion.
This is optional in this example, but it may be necessary or more natural in other cases, such as in parsing applications, language interpreters, algorithms on trees, or divide-and-conquer algorithms.

• Tabling: Tabling records the first occurrence of each call to a tabled predicate (the generator) and its answers. In variant tabling, the most usual form of tabling, when a call equal up to variable renaming to a previous generator is found (a variant), its execution is suspended, and it is marked as a consumer of the generator. For example, dist(a,Y,D) is a variant of dist(a,Z,D) if Y and Z are free variables. When a generator finitely finishes exploring all of its clauses and its answers are collected, its consumers are resumed and are fed the answers of the generator. This may make consumers produce new answers that will in turn cause more resumptions. Tabling is a complete strategy for all programs with the bounded term-depth property, which in turn implies that the Herbrand model is finite. Therefore, left- or right-recursive reachability terminates in finite graphs with or without cycles. However, the program in Fig. 1a has an infinite minimum Herbrand model for cyclic graphs: every cycle can be traversed an unbounded number of times, giving rise to an unlimited number of answers with a different distance each. The query ?- dist(a, Y, D), D < K will therefore not terminate under variant tabling.

• TCLP: The program in Fig. 1b can be executed with tabling and using constraint entailment to suspend calls which are more particular than previous calls and, symmetrically, to keep only the most general answers returned. Entailment can be seen as a generalization of subsumption for the case of general constraints; in turn, subsumption was shown to enhance termination and performance in tabling (Swift and Warren 2010). When a goal $G_1$ entails another goal $G_0$, the solutions for $G_1$ are a subset of the solutions for $G_0$.
To make the entailment relationship explicit, we define a TCLP goal as $(g, c_g)$, where $g$ is the call (a literal) and $c_g$ is the projection of the current constraint store onto the variables of the call. Then, a goal $G_0$ = (dist(X, Y, D), D < 150) is entailed by another goal $G_1$ = (dist(X, Y, D), D > 0 ∧ D < 75) because the solutions for D > 0 ∧ D < 75 are contained in the solutions for D < 150 (D > 0 ∧ D < 75 ⊑ D < 150), and we write $G_1 \sqsubseteq G_0$. We say that $G_1$, the more particular goal, is the consumer, and $G_0$, the most general goal, is the generator. The key observation behind the use of entailment in TCLP is that calls to more particular goals can suspend their execution and later recover the answers collected by the most general call and continue execution. The solutions for the consumer are a subset of those for the generator. However, some answers for a generator may not be valid for a consumer. For example, D > 125 ∧ D < 135 is a solution for $G_0$ but not for $G_1$, since $G_1$ has a more restrictive constraint store than $G_0$. Therefore, the tabling engine should check and filter, via the constraint solver, that answers from generators are consistent with the constraint store of consumers.

Footnote: This is a typical query for the analysis of social networks (Swift and Warren 2010).

Table 1: Termination properties comparison of LP, CLP, tabling and TCLP.

    Graph            Recursion          LP    CLP   TAB   TCLP
    Without cycles   Left recursion     ×     ×     ✓     ✓
                     Right recursion    ✓     ✓     ✓     ✓
    With cycles      Left recursion     ×     ×     ×     ✓
                     Right recursion    ×     ✓     ×     ✓

The use of entailment in calls and answers enhances termination properties. Column “TCLP” in Table 1 summarizes the termination characteristics of dist/3 under TCLP, and shows that a full integration of tabling and CLP makes it possible to find all the solutions and finitely terminate in all the cases.
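The entailment test between calls and the filtering of generator answers by a consumer can be illustrated with a minimal Python sketch. It is not the Mod TCLP implementation: it assumes that every constraint restricts a single variable D to an open interval (lo, hi), and the helper names `entails`, `consistent`, and `filter_answers`, as well as the answer (40, 60), are hypothetical.

```python
from math import inf

# Assumption: a constraint on one variable D is an open interval (lo, hi),
# read as lo < D < hi. Real constraint domains are far richer than this.

def entails(c1, c2):
    """True if every solution of c1 is also a solution of c2 (c1 ⊑ c2)."""
    return c2[0] <= c1[0] and c1[1] <= c2[1]

def consistent(c1, c2):
    """True if the conjunction c1 ∧ c2 has at least one solution."""
    return max(c1[0], c2[0]) < min(c1[1], c2[1])

def filter_answers(answers, consumer_store):
    """Keep only the generator answers compatible with the consumer's store."""
    return [a for a in answers if consistent(a, consumer_store)]

generator = (-inf, 150)   # D < 150
consumer = (0, 75)        # D > 0 ∧ D < 75

# The more particular call entails the generator, so it can be suspended.
is_consumer = entails(consumer, generator)

# (125, 135) is the answer discussed in the text; (40, 60) is hypothetical.
answers = [(125, 135), (40, 60)]
usable = filter_answers(answers, consumer)
```

Under these assumptions the consumer is accepted, and only the hypothetical answer (40, 60) survives the filter, because D > 125 ∧ D < 135 is inconsistent with D > 0 ∧ D < 75.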
Additionally, in (Arias and Carro 2019a) we experimentally show that Mod TCLP, a framework that fully implements entailment in the call and answer phases, can improve performance.

The theoretical basis of Tabled Constraint Logic Programming (TCLP) was established in (Toman 1997) using a framework of bottom-up evaluation of Datalog systems and presenting the basic operations (projection and entailment checking) that are necessary to ensure completeness w.r.t. the declarative semantics. In this work, we present the theoretical basis of TCLP for a top-down execution on which Mod TCLP (Arias and Carro 2019a) is based. In Section 2 we present the operational semantics of a top-down execution of TCLP programs with generic constraint solvers. In Section 3 we extend the soundness, completeness, and termination proofs. In Section 4 we explain the benefits of using entailment checking with more relaxed notions of projection.

In this section we present a bottom-up fixpoint semantics of TCLP that uses constraint entailment for the answers and a top-down semantics that extends (Toman 1997) by explicitly modeling entailment both in the answers and in the calls. This semantics uses objects that mimic the construction of forests of trees in implementations of tabling.

A (tabled) constraint logic program consists of clauses of the form:

    $h$ :- $c, l_1, \dots, l_k$.

where $h$ is an atom, $c$ is an atomic constraint or conjunction of constraints, $l_i$ are literals, ‘:-’ represents the logical implication ‘←’, and ‘,’ represents the logical conjunction ‘∧’. The head of the clause is $h$ and the rest is called the body, denoted by $body(h)$. We will assume throughout this paper that the program has been rewritten so that clause heads are linearized (all the variables are different) and all head unifications take place in $c$. The constraint $c$ or the literals $l_i$ or both may be absent.
In the last case the rule is called a fact and it is customarily written omitting the body. We will assume that we are dealing with definite programs, i.e., programs where the literals in the body are always positive (non-negated) atoms.

A query to a TCLP program is a clause with the head false, usually written ?- $c_q$, $q$, where $c_q$ is an atomic constraint or a conjunction of constraints and $q$ is a literal.

We follow (Jaffar and Maher 1994) in this section. Constraint logic programming introduces constraint solving methods in logic-based programming languages. During the evaluation of a CLP program, the inference engine generates constraints whose consistency with respect to the current constraint store is checked by the constraint solver. If the check fails, the engine backtracks to a previous choice and takes a pending, unexplored branch of the search tree. In the next sections we will review the fixpoint and operational semantics of CLP and will extend them to TCLP.

Definition 1. A constraint solver, CLP(X), is a (partial) executable implementation of a constraint domain (D, L). The parameter X stands for the 4-tuple (Σ, D, L, T) where:
– Σ is a signature which determines the predefined predicates and function symbols and their arities.
– D is a Σ-structure: the constraint domain over which the computation is performed.
– L is the class of Σ-formulas: the class of constraints that can be expressed with Σ. It should be closed under variable renaming, conjunction, and existential quantification.
– T is a first-order Σ-theory: an axiomatization of the properties of D, which determines what constraints hold and what constraints do not hold. D and T should agree on satisfiability of constraints, and every unsatisfiability in D has to be detected by T, i.e., for every constraint $c \in L$, $D \models c$ iff $T \models c$.

A constraint can be an atomic constraint or a conjunction of (simpler) constraints. We denote constraints with lower case letters, e.g.
$c$, and sets of constraints with uppercase letters, e.g. $S$.

Footnote: This covers as well the case of a conjunction of literals since we can always add a rule to that effect to the program.

Example 1. The Herbrand domain CLP(H) used in logic programming is the constraint domain over finite trees, where Σ contains constants, function symbols, and the predicate =/2; D is the set of finite trees, where each node is labeled by a constant (if it does not have children) or a function symbol of arity n (if it has n children). L is the set of constraints generated by the primitive constraints (i.e., equality) between trees (terms). Typical constraints are X=g(a) and X=f(Z, Y) ∧ Z=a.

Definition 2 (Valuation). Let $S = \{X_1, \dots, X_n\}$ be a set of variables. A valuation $v$ is a mapping from variables in $S$ to values in D. We write $v = \{X_1 \mapsto d_1, \dots, X_n \mapsto d_n\}$ to indicate that the value $d_i$ is assigned to variable $X_i$.

For convenience, and where it is not ambiguous, we will denote the value $d_i$ assigned to a variable $X_i$ by the valuation $v$ as $v(X_i)$ (e.g., $X_i \mapsto d_i \in v$). Likewise, for a literal $l$ we will denote by $v(l)$ the literal obtained by substituting the variables in $l$ for their associated values in the valuation $v$ (for those variables that appear in $v$) and, for a constraint $c$, we define $v(c)$ similarly.

Definition 3 (Solution of a constraint). Let $c$ be a constraint, $vars(c)$ the set of variables occurring in $c$, and $v$ a valuation over $vars(c)$ on the constraint domain D. Then $v$ is a solution for the constraint $c$ if $v(c)$ holds in the constraint domain.

Definition 4 (Projection). Let $c$ be a constraint, $S \subseteq vars(c)$ a set of variables occurring in $c$, and $T = vars(c) \setminus S$ the rest of the variables of $c$.
The projection of $c$ over $S$, denoted $Proj(S, c)$, is another constraint $c_s$ such that $c_s \equiv \exists T \cdot c$, i.e.:
– Any solution $v_s$ for $c_s$ can be extended to be a solution for $c$.
– Any solution $v$ for $c$ can be restricted to the variables in $S$ and the restricted valuation is a solution for $c_s$.

The minimal set of operations that we expect a constraint solver to support, in order to interface it successfully with a tabling system (Arias and Carro 2019a), are:
• Test for consistency or satisfiability. A constraint $c$ is consistent in the constraint domain D, denoted $D \models c$, if it has a solution in D.
• Test for entailment ($\sqsubseteq_D$). We say that a constraint $c_0$ is entailed by another constraint $c_1$ ($c_1 \sqsubseteq_D c_0$) if any solution of $c_1$ is also a solution of $c_0$. We extend the notion of constraint entailment to a set of constraints: a set of constraints $C_0$ is entailed (or covered) by another set of constraints $C_1$ (and we write it as $C_1 \sqsubseteq_D C_0$) if $\forall c_i \in C_1 \; \exists c_j \in C_0 \, . \, c_i \sqsubseteq_D c_j$.
• An operation to compute the projection of a constraint $c$ onto a finite set of variables $S$, $Proj(S, c)$.

The canonical model of a Prolog program is the minimal Herbrand model. Similarly, the fixpoint semantics of a CLP program P over a constraint domain D is the least D-S-model, which we define next. The presence of variables in D-S-models makes it possible to use entailment to discard subsumed constraints in the bottom-up construction of the fixpoint.

We can define the least D-S-model of a program using the S-semantics (Falaschi et al. 1989; Jaffar and Maher 1994) for languages with constraints (Gabbrielli and Levi 1991). It differs from the standard model (van Emden and Kowalski 1976) essentially due to the presence of variables in interpretations and models. We may omit the subscript D if there is no ambiguity.

Definition 5 (D-S-interpretation).
Let the pair $(l, c)$ be a constraint literal, where $l$ is a literal and $c \in$ D an atomic constraint or a conjunction of constraints such that $vars(c) \subseteq vars(l)$. A D-S-interpretation is a set of constraint literals.

Definition 6 (D-S-model). Let P be a program. A D-S-model of P is a D-S-interpretation that is logically consistent with the clauses in P.

The CLP fixpoint S-semantics is defined as the smallest fixpoint of the immediate consequence operator, $S^D_P$, where all the operations behave as defined in the constraint domain D.

Definition 7 (Operator $S^D_P$ (Falaschi et al. 1989; Toman 1997)). Let P be a CLP program and I a D-S-interpretation. The immediate consequence operator $S^D_P$ is defined as:

$$
\begin{array}{rl}
S^D_P(I) = I \cup \{\, (h, c) \mid & h \texttt{ :- } c_h, l_1, \dots, l_k \text{ is a clause of } P, \\
 & (a_i, c_i) \in I, \; 0 < i \leq k, \\
 & c' = Proj(vars(h),\; c_h \wedge \bigwedge_{i=1}^{k} (a_i = l_i \wedge c_i)), \\
 & D \models c', \\
 & \text{if } c' \sqsubseteq c'' \text{ for some } (h, c'') \in I \text{ then } c = c'' \text{ else } c = c' \,\}
\end{array}
$$

Note that $S^D_P$ may not add a pair (literal, constraint) when a more general constraint is already present in the interpretation being enlarged. However, to guarantee monotonicity, it does not remove existing more particular constraints. The operational semantics of TCLP (Definition 10) will do that.

In this section we first present a top-down semantics for CLP without tabling/suspension (Jaffar and Maher 1994) and then we extend it to capture the operational semantics of TCLP. The operational semantics is given in terms of a transition system that computes the least model defined by the CLP fixpoint semantics (Section 2.3). The evaluation of a query is a sequence of steps from the initial state to a final state.

Definition 8. A state is a tuple ⟨R, c⟩ where:
– R, the resolvent, is a multiset of literals and constraints that contains the collection of as-yet-unseen literals and constraints of the program.
– c, the constraint store, is an atomic constraint or a conjunction of constraints.
It is actedupon by the solver. In (Jaffar and Maher 1994) the constraint store is divided into a collection of awake constraintsand a collection of asleep constraints. This separation is ultimately motivated by implementationissues and we will not make this distinction here.Given a query ( q , c q ) , the initial state of the evaluation is h{ q } , c q i . Every transition stepbetween states resolves literals of the resolvent against the clauses of the program and addsconstraints to the constraint store. A derivation is successful if it is finite and the final state hasthe form h /0 , c i (i.e., the resolvent becomes empty). The answer for the query is Pro j ( vars ( q ) , c ) .As it is customary, we assume that the transitions due to constraint handling are deterministic(there is only one possible children per node), while the transitions due to literal matching may benon-deterministic (there are as many children as clauses whose head matches some literal in theresolvent). As a result, query evaluation takes the shape of a search tree, constructed followingDef. 9. The order in which literals are selected is not relevant. In practice, implementations would Joaqu´ın Arias and Manuel Carro use a computation rule that is in charge of deciding the new constraint/literal to be resolvedamong the set of pending literals. A common rule is to follow the left-to-right order in whichliterals are written in the body of clauses.In what follows we will assume that variables in clauses are renamed apart before they areused in order to avoid clashes with existing variable names. Definition 9 (CLP tree) . Let P be a CLP definite program and ( q , c q ) a query. A CLP tree of ( q , c q ) for P, denoted by τ P ( q , c q ) , is a tree such that: The root of τ P ( q , c q ) is h{ q } , c q i , the initial state. The nodes of τ P ( q , c q ) are labeled with its corresponding state h L , c i , where L is a setcontaining the constraints and literals pending to be solved. 
3. The child/children of a node ⟨l ∪ L, c⟩, where l is a literal, is/are:
   • A node/nodes ⟨$body(h_i)$ ∪ L, c ∧ (l = $h_i$)⟩ obtained by resolution of l against the matching clause(s) $h_i$ :- $body(h_i)$ in P, where l = $h_i$ is an abbreviation for the conjunction of equations between the arguments of l and $h_i$. There is one node for each matching clause. Matching clauses are assumed to be renamed apart.
   • Or a leaf node fail if there are no clauses in P whose heads match the literal l.
4. The child of a node ⟨c′ ∪ L, c⟩, where c′ is a constraint, is:
   • The node ⟨L, c ∧ c′⟩ if $D \models c \wedge c'$.
   • Or a leaf node fail if $D \not\models c \wedge c'$.
5. A leaf node ⟨∅, c⟩ is the final state of a successful derivation. c is the final constraint store.
6. The set of answers of $\tau_P(q, c_q)$ (i.e., the answers to the query $(q, c_q)$), denoted by $Ans(q, c_q)$, is the set of constraints $c'_i$ obtained as the projection of the final constraint stores $c_i$ onto $vars(q)$:

$$Ans(q, c_q) = \{\, c'_i \mid c'_i = Proj(vars(q), c_i), \; \langle \emptyset, c_i \rangle \in \tau_P(q, c_q) \,\}$$

We denote the set of tabled predicates in a TCLP program by $Tab_P$. The most general calls to predicates in $Tab_P$ are called generators and are resolved against program clauses. The set of generators created during the evaluation of a query $(q, c_q)$ is denoted by $Gen(q, c_q)$. The answers for a generator are collected and associated to that generator; see below how entailment is used to keep only the relevant answers. Calls to tabled predicates that are more particular than a previously created generator become consumers and are not resolved against program clauses. Instead, they are resolved by consuming the answers collected from a generator; this is termed answer resolution.

The execution of a query w.r.t. a TCLP program is represented as a forest of derivation trees, and contains the tree corresponding to the initial query and the trees corresponding to each of the generators.
The evaluation of each generator corresponds to one of the trees of the forest. During execution, call entailment (Def. 10.2.b) detects when a goal is entailed/subsumed by a previous goal (its generator) and if so, it suspends its execution and eventually reuses the answers from the generator. During answer entailment, answers that are entailed by another (more general) answer are discarded/removed (Def. 10.2.f).

Definition 10 (TCLP forest). Let P be a TCLP definite program, $Tab_P$ the set of tabled predicates, and $(q, c_q)$ a query. A TCLP forest of $(q, c_q)$ for P, denoted as $F_P(q, c_q)$, is the set of TCLP trees such that:
1. The initial tree, $\tau_P(q, c_q)$, is the TCLP tree of the query, and the rest of the trees, $\tau_P(g_i, c_{g_i})$, are the TCLP trees of the generators $(g_i, c_{g_i}) \in Gen(q, c_q)$:

$$F_P(q, c_q) = \{\, \tau_P(q, c_q), \; \tau_P(g_i, c_{g_i}), \dots \,\}$$

2. A TCLP tree, denoted by $\tau_P(q, c_q)$ (resp. $\tau_P(g_i, c_{g_i})$), is similar to a CLP tree where:
   (a) The root of the TCLP tree $\tau_P(g, c)$ is ⟨{g}, c⟩, its initial state.
   (b) The descendants of a node ⟨t ∪ L, c⟩ where t is a tabled literal are obtained by obtaining answers for t through answer resolution (i.e., consuming existing answers) in one of the two following ways:
       – If $(t, c)$ is a consumer of a previous generator $(g, c_g) \in Gen(q, c_q)$, we use the answers $c_i \in Ans(g, c_g)$ to construct its children. In this case, g and t match and $(g, c_g)$ is entailed by $(t, c)$, i.e., $c \wedge (t = g) \sqsubseteq_D c_g$. As a reminder, t = g denotes the conjunction of equality constraints between the corresponding arguments of t and g, and $Ans(g, c_g)$ is the set of recorded answers for $(g, c_g)$.
       – Otherwise, $(t, c)$ will produce a new generator $(t, c')$ and we use the answers $c_i \in Ans(t, c')$. In this case, a new TCLP tree $\tau_P(t, c')$, where $c' = Proj(vars(t), c)$, is created and added to the current forest.
The goal $(t, c')$ is then marked as a generator and added to $Gen(q, c_q)$.
       From the possible answers $c_i$ to $(t, c)$, children nodes are constructed as follows:
       • A node ⟨$c_i$ ∪ L, c⟩, one for each answer $c_i$.
       • Or a leaf fail if there is no answer $c_i$.
   (c) The transitions for non-tabled literals and for new generators are as in the CLP tree (Def. 9.3).
   (d) The transitions for constraints are as in the CLP tree (Def. 9.4).
   (e) A leaf node ⟨∅, c⟩ is the final state of a successful derivation and c is its final constraint store.
   (f) The set of answers of $\tau_P(g, c_g)$, the TCLP tree of the generator $(g, c_g)$, denoted by $Ans(g, c_g)$, is the set of constraints $c'_i$ obtained as the projection of the final constraint stores $c_i$ onto $vars(g)$ that do not entail any other constraint $c_j$, i.e., they are the most general answers.

$$
\begin{array}{rl}
Ans(g, c_g) = \{\, c'_i \mid & c'_i = Proj(vars(g), c_i), \; \langle \emptyset, c_i \rangle \in \tau_P(g, c_g), \\
 & \nexists c_j \cdot \langle \emptyset, c_j \rangle \in \tau_P(g, c_g), \; c_i \neq c_j, \; c'_i \sqsubseteq Proj(vars(g), c_j) \,\}
\end{array}
$$

3. The set of the answers of the forest $F_P(q, c_q)$, denoted by $Ans(q, c_q)$, is the set of answers of $\tau_P(q, c_q)$ that are obtained as in the CLP tree (Def. 9.6).

The answer management strategy used in Def. 10.2.f aims at keeping only the most general answers. Since implementations incrementally save answers as they are found, some previous proposals used simpler answer management strategies. For example, (Cui and Warren 2000; Chico de Guzmán et al. 2012) checked entailment when adding answers to the previously generated ones and only discarded answers which were more particular than a previous one. This reduces the number of saved answers, but older answers that are more particular than newer answers were still kept. It could also be possible to remove previous answers that are more particular than new answers but still add answers that are more particular than previous ones.
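The difference between these answer management strategies can be sketched in Python. This is only an illustration under strong assumptions: answers are open intervals (lo, hi) over a single variable, the function names are hypothetical, and the sequence of answers is made up for the example.

```python
def entails(c1, c2):
    """True if interval c1 is contained in interval c2 (c1 ⊑ c2)."""
    return c2[0] <= c1[0] and c1[1] <= c2[1]

def add_discard_only(table, new):
    """Discard a new answer that is more particular than a saved one,
    but never remove previously saved answers."""
    if any(entails(new, old) for old in table):
        return table
    return table + [new]

def add_most_general(table, new):
    """Discard more particular new answers and also remove saved answers
    that the new answer makes redundant (in the spirit of Def. 10.2.f)."""
    if any(entails(new, old) for old in table):
        return table
    return [old for old in table if not entails(old, new)] + [new]

# Hypothetical answers, in the order in which they are found.
found = [(0, 10), (0, 5), (-5, 20)]

discard_only, most_general = [], []
for ans in found:
    discard_only = add_discard_only(discard_only, ans)
    most_general = add_most_general(most_general, ans)
```

With this sequence the discard-only table ends up holding (0, 10) and (-5, 20), while the second strategy keeps only (-5, 20). Both tables cover the same set of solutions.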
The choice among them does not impact soundness or completeness properties. However, discarding and removing redundant answers, despite extra cost, has been shown to greatly increase the efficiency of the implementation (Arias and Carro 2019a).

Example 2. TCLP forest of dist/3

This example illustrates how the algorithm works with mutually dependent generators, i.e., generators that consume answers from each other, and shows why not all the answers from a generator may be directly used by its consumers. Fig. 2 shows the TCLP forest corresponding to querying the right-recursive dist/3 program (Fig. 1c). Unlike the left-recursive version, which generates only one TCLP tree, the right-recursive version generates two TCLP trees, one for each generator. The reason is that the left-recursive version only seeks paths from the node a, but the right-recursive version creates a new TCLP tree at the state s4 to collect the paths from the node b, since edge(a, b) had been previously evaluated at state s3. We explain now how we obtain some of the states; the rest are obtained similarly.

s1: the TCLP tree $\tau_P$(dist(a, V0, V1), V1 < 150) is created.
s4: is obtained by resolving the literal edge(a, Z, D1).
Ans(s5): the tabled literal dist(b, V0, D2) is a new generator and a new TCLP tree $\tau_P$(dist(b, V2, V3), V3 > 0 ∧ V3 < 100) is created (Def. 10.2.b).
s5: is the root node of the new TCLP tree.
s6i/ii: are obtained by resolving the literal dist(b, V2, V3) against the clauses of the program.
s8: is obtained by resolving the literal edge(b, Z, D1). In the state s8, the call (dist(a, V2, D2), D2 > 0 ∧ D2 < 75) is suspended because it entails the former generator (dist(a, V0, V1), V1 < 150).
Ans(s1): the tabled literal dist(a, V2, D2) is resolved with answer resolution (Def.
10.2.f) using the answers from the previous TCLP tree $\tau_P$(dist(a, V0, V1), V1 < 150) because the renamed projection of the current constraint store onto the variable of the literal entails the projected constraint store of the generator: (V1 > 0 ∧ V1 < 75) ⊑ V1 < 150. Since the initial TCLP forest is under construction and depends on itself, the current branch derivation is suspended. This suspension also causes the former generator to suspend at the state s4.
s9: is a final state obtained upon backtracking to the state s6ii.
b1: is the first answer of the second generator. At this point the suspended calls can be resumed by consuming the answer b1 or by evaluating s2ii. The algorithm first tries to evaluate s2ii and then it will resume s4 consuming b1.
s10: is a final state obtained upon backtracking to the state s2ii.
a1: is the first answer of the first generator: V0=b ∧ V1=50.
s11: is a final state obtained from the state s4 by consuming b1.
a2: is the second answer of the first generator: V0=a ∧ V1 > 75 ∧ V1 < 85.
s12: is a final state obtained from the state s8 by consuming a1.
b2: is the second answer of the second generator.

Footnote: This example also appears in the Supplementary Material of (Arias and Carro 2019a).

Footnote: The projection of V3 > 0 ∧ V3 < 100 ∧ D1 > 0 ∧ D2 > 0 ∧ V3=D1+D2 ∧ Z=a ∧ D1 > 25 ∧ D1 < 35 onto D2 is D2 > 0 ∧ D2 < 75. After renaming D2=V1, the resulting projection is V1 > 0 ∧ V1 < 75.
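The projection stated in the last footnote can be checked with a few lines of interval arithmetic in Python. This is a sketch, not a general projection procedure: it assumes that every variable ranges over an open interval (lo, hi) and that the only relation among them is V3 = D1 + D2, and the helper names `sub` and `meet` are hypothetical. A general constraint solver needs a proper variable elimination method.

```python
from math import inf

def sub(x, y):
    """Bounds of x - y when x and y range over the open intervals x and y."""
    return (x[0] - y[1], x[1] - y[0])

def meet(x, y):
    """Intersection of two open intervals."""
    return (max(x[0], y[0]), min(x[1], y[1]))

# Constraint store of state s8, restricted to the numeric variables.
v3 = (0, 100)                      # V3 > 0 ∧ V3 < 100
d1 = meet((0, inf), (25, 35))      # D1 > 0 ∧ D1 > 25 ∧ D1 < 35

# V3 = D1 + D2 gives D2 = V3 - D1, intersected with D2 > 0.
d2 = meet((0, inf), sub(v3, d1))
```

Under these assumptions `d2` is the interval (0, 75), which matches D2 > 0 ∧ D2 < 75 and, after renaming D2=V1, lies inside the generator's store V1 < 150.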
[Fig. 2: TCLP forest of the dist/3 query with right recursion. The figure shows the two TCLP trees of the forest, rooted at the states s1 and s5, with the states s1 to s15 and the answers a1 to a3, b1, and b2.]

s13: is a failed derivation obtained from s8 by consuming a2. It fails because the constraints V0=a ∧ V1 > 75 ∧ V1 < 85 are inconsistent with the current constraint store. Note that the projection of the constraint store of s8 onto V1 is V1 > 0 ∧ V1 < 75.
Its child is a fail node.

s14 is a final state obtained from the state s4 by consuming b2. a3 is the third answer of the first generator: V0=b ∧ V1 > 125 ∧ V1 < 135.

s15 is a failed derivation obtained from s8 by consuming a3. Its child is a fail node.

The comparison of this forest (with two trees) with the forest obtained for the left-recursive version (with one tree) illustrates why left recursion reduces the execution time and memory requirements when using tabling / TCLP: left recursion will usually create fewer generators. We have also seen that using answers from a more general call, as in the answer resolution of state s8 (i.e., the constraint store of the consumer, V1 > 0 ∧ V1 < 75, is more particular than the constraint store of the generator, V1 < 150), makes it necessary to filter the correct ones (i.e., answer resolution for a2 and a3 failed). This is not required in variant tabling because the answers from a generator are always valid for its consumers.

In this section we prove the soundness and completeness of the operational semantics for the top-down execution of tabled constraint logic programs previously presented. Then, we present some additional results on termination properties for arbitrary constraint solvers that are not necessarily constraint-compact, extending the results in (Toman 1997).

(Toman 1997) proves soundness and completeness of SLG_C for TCLP Datalog programs by reduction to soundness and completeness of bottom-up evaluation. It is possible to extend these results to prove the soundness and completeness of our proposal: they only differ in the answer management strategy and the construction of the TCLP forest. The strategy used in SLG_C only discards answers which are more particular than a previous answer, while in our proposal we in addition remove previously existing more particular answers (Def. 10.2.f). The result of this is that only the most general answers are kept.
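The difference between the two answer management strategies can be sketched as follows. This is a hedged Python illustration, not the Mod TCLP implementation: answer constraints are modelled as intervals over a single variable, `entails(a, b)` plays the role of a ⊑ b, and all function names are hypothetical.

```python
# Illustrative sketch: an answer constraint is an interval (lo, hi) over a
# single variable; a entails b (a ⊑ b) when a is contained in b.
def entails(a, b):
    return b[0] <= a[0] and a[1] <= b[1]

def add_answer_discard_only(answers, new):
    """Discard `new` only if it is more particular than a stored answer."""
    if any(entails(new, old) for old in answers):
        return answers
    return answers + [new]

def add_answer_most_general(answers, new):
    """In addition, remove stored answers that are more particular than `new`."""
    if any(entails(new, old) for old in answers):
        return answers
    return [old for old in answers if not entails(old, new)] + [new]

stream = [(1, 2), (5, 6), (0, 3), (1, 3)]
kept_discard_only, kept_most_general = [], []
for ans in stream:
    kept_discard_only = add_answer_discard_only(kept_discard_only, ans)
    kept_most_general = add_answer_most_general(kept_most_general, ans)

print(kept_discard_only)   # (1, 2) stays although (0, 3) covers it
print(kept_most_general)   # only the most general answers remain
```

Both strategies discard (1, 3) because it entails the stored answer (0, 3). Only the second one also drops (1, 2) when (0, 3) arrives, so the stored set contains no answer that is more particular than another.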
In SLG_C, the generation of the forest is modeled as the application of rewriting rules. In our proposal, the TCLP forest is defined as a transition system (Def. 10), where the different cases in the definition can be seen as rules which make the TCLP forest evolve.

The lemma, theorems, and their proofs are reformulated taking these differences into consideration. First we prove that answer resolution using entailment is correct w.r.t. SLD resolution and that, although only the most general answers are kept, it is also complete w.r.t. SLD resolution. Then we use these results to prove soundness and completeness of TCLP with entailment w.r.t. the least fixed point semantics.

Lemma 1 (Application of derivations with most general constraint stores). Let ⟨{l_i, l_{i+1}, …, l_k}, cs_i⟩ ⇝* ⟨{l_{i+1}, …, l_k}, cs_{i+1}⟩ be a derivation and (l_i, c) a goal with cs_i ⊑ c. Then:

    ∃ ⟨{l_i}, c⟩ ⇝* ⟨∅, c′⟩ with cs_{i+1} = cs_i ∧ c′

Intuitively, if there is an SLD derivation that gives a solution for a goal (l_i, cs_i), this solution can also be obtained from the more general goal (l_i, c) without the need to resolve the more particular one.

Proof. We will see that there exists a derivation ⟨{l_i}, c⟩ ⇝* ⟨∅, c′⟩ that follows the same steps as ⟨{l_i, …, l_k}, cs_i⟩ ⇝* ⟨{l_{i+1}, …, l_k}, cs_{i+1}⟩:

(1) If ⟨{l_i, …, l_k}, cs_i⟩ is resolved against a clause l_i :- c_h, then its resulting constraint store is cs_{i+1} = cs_i ∧ c_h (plus head unification). Since cs_i ⊑ c, we can apply the same rule to ⟨{l_i}, c⟩ and its resulting constraint store is c′ = c ∧ c_h. Also, since cs_i ⊑ c, we have cs_i ⇔ cs_i ∧ c. Therefore, cs_{i+1} = cs_i ∧ c ∧ c_h (expanding cs_i) and cs_{i+1} = cs_i ∧ c′ (contracting c ∧ c_h).

(2) If ⟨{l_i, …, l_k}, cs_i⟩ is resolved against a clause l_i :- c_h, a_1, …, a_m, the next state is ⟨{a_1, …, a_m, l_{i+1}, …, l_k}, cs_i ∧ c_h⟩ (resp. ⟨{a_1, …, a_m}, c ∧ c_h⟩).
By induction, since cs_i ⊑ true (resp. c ⊑ true), there exist m derivations ⟨{a_j}, true⟩ ⇝* ⟨∅, c′_{a_j}⟩ such that the resulting constraint store of the path is cs_{i+1} = cs_i ∧ c_h ∧ ⋀_{j=1}^{m} c′_{a_j} (resp. c′ = c ∧ c_h ∧ ⋀_{j=1}^{m} c′_{a_j}). Since cs_i ⊑ c, we have cs_i ⇔ cs_i ∧ c. Therefore, cs_{i+1} = cs_i ∧ c ∧ c_h ∧ ⋀_{j=1}^{m} c′_{a_j} (expanding cs_i) and cs_{i+1} = cs_i ∧ c′ (contracting c ∧ c_h ∧ ⋀_{j=1}^{m} c′_{a_j}).

We will use this lemma to prove correctness of answer resolution. We model the answers obtained for a generator with the derivation ⟨{l_i}, c⟩ ⇝* ⟨∅, c′⟩, while (l_i, cs_i) would be a consumer for the generator (l_i, c). Note that the condition cs_i ⊑ c precisely captures the generator / consumer relationship.

Corollary 1 (Correctness of answer resolution using entailment). As an immediate consequence of Lemma 1, using answer resolution with entailment (Def. 10.2.b) gives correct results. Answer resolution of ⟨{l_i, …, l_k}, cs_i⟩ consumes an answer c′ from a previous derivation ⟨{l_i}, c⟩ ⇝* ⟨∅, c′⟩ where (l_i, c) is the generator of the derivation and, by the definition of generator, cs_i ⊑ c. When D ⊨ cs_i ∧ c′ (Def. 10.2.d), it generates the state ⟨{l_{i+1}, …, l_k}, cs_i ∧ c′⟩.

Corollary 2 (Completeness of answer resolution using entailment). Recall that Ans(l, c) is the set containing the most general answers for a generator goal (l, c) (Def. 10.2.f), and if there are two goals (l, c_a) and (l, c_b) with c_a ⊑ c_b, only the answers for the most general goal c_b need to be kept. Therefore, for any derivation of a generator ⟨{l_i}, c⟩ ⇝* ⟨∅, c_i⟩ we have that ∃ c′_i ∈ Ans(l_i, c′). c_i ⊑ c′_i for some c′ s.t. c ⊑ c′. Let us take a (partial) clause derivation ⟨{l_i, …, l_k}, c⟩ ⇝* ⟨{l_{i+1}, …, l_k}, c ∧ c_i⟩. If c′_i ∈ Ans(l_i, c′) for some c′ s.t.
c ⊑ c′ (which is the entailment condition necessary to use the saved answer constraints), then c_i ⊑ c′_i. If we use c′_i to perform answer resolution with (l_i, c), we have ⟨{l_i, …, l_k}, c⟩ ⇝* ⟨{l_{i+1}, …, l_k}, c ∧ c′_i⟩. Given that c_i ⊑ c′_i, we have that c ∧ c_i ⊑ c ∧ c′_i, and any answer returned by clause resolution is contained in some answer returned by answer resolution with entailment. The same reasoning can be applied to the derivation of l_{i+1} and so on. Therefore, answer resolution with entailment does not lose answers w.r.t. clause resolution even if not all the goals and answers are memorized.

Theorem 1 (Soundness w.r.t. the fixpoint semantics). Let P be a TCLP definite program and (q, c_q) a query. Then for any answer c′ of the TCLP forest F_P(q, c_q):

    c′ ∈ Ans(q, c_q) ⇒ ∃ (q, c) ∈ lfp(S^D_P(∅)). c′ = c_q ∧ c

I.e., any answer derived from the forest construction can also be derived from the bottom-up computation.

Proof. For any answer c′ ∈ Ans(q, c_q) there exists a successful derivation ⟨{q}, c_q⟩ ⇝* ⟨∅, c′⟩. Since c_q ⊑ true, by Lemma 1 there exists ⟨{q}, true⟩ ⇝* ⟨∅, c⟩ such that c′ = c_q ∧ c. We know that for any successful derivation ⟨{q}, true⟩ ⇝* ⟨∅, c⟩ against the clauses of the program there is an answer derived from the bottom-up computation (q, c) ∈ lfp(S^D_P(∅)). Therefore, by Corollary 1, if answer resolution is used instead of clause resolution, the result is also correct and for any answer c′ ∈ Ans(q, c_q) there exists (q, c) ∈ lfp(S^D_P(∅)) such that c′ = c_q ∧ c.

Theorem 2 (Completeness w.r.t. the fixpoint semantics). Let P be a TCLP definite program and (h, true) a query. Then for every (h, c) in lfp(S^D_P): (h, c) ∈ lfp(S^D_P(∅)) ⇒ ∃ c′ ∈ Ans(h, true) .
c ⊑ c′

I.e., all the answers derived from the bottom-up computation are also derived by the forest construction or entailed by answers inferred in the forest.

Proof. We know that for any answer derived from the bottom-up computation (h, c) ∈ lfp(S^D_P(∅)) there exists a successful derivation ⟨{h}, true⟩ ⇝* ⟨∅, c⟩ against the clauses of the program. By Corollary 2, if answer resolution is used instead of clause resolution, the result is also complete. Therefore, since the answer management strategy only keeps the most general answers (Def. 10.2.f), we have that ∃ c′ ∈ Ans(h, true). c ⊑ c′.

The next definition captures a fundamental property of some constraint domains that plays a key role in the termination of the evaluation of queries to TCLP programs (Toman 1997).

Definition 11 (Constraint-compact). Let D be a constraint domain, and 𝒟 the set of all constraints expressible in D. Then D is constraint-compact iff:
– for every finite set of variables S, and
– for every subset C ⊆ 𝒟 such that ∀ c ∈ C. vars(c) ⊆ S,
there is a finite subset C_fin ⊆ C such that ∀ c ∈ C. ∃ c′ ∈ C_fin. c ⊑_D c′

Intuitively speaking, a constraint domain D is constraint-compact if for any (potentially infinite) set of constraints C expressible in D using a finite number of variables, there is a finite set of constraints C_fin ⊆ C that covers C in the sense of ⊑_D. In other words, C_fin is as general as C. Additionally, in a constraint-compact constraint domain, if an infinite set of constraints is unsatisfiable, then there is a finite subset which is unsatisfiable, therefore guaranteeing the existence of finite unsatisfiability proofs.

Example 3. The gap-order constraints (Revesz 1993) form a constraint-compact domain generated from the set C_{<ℤ} = {x < u : u ∈ A} ∪ {u < x : u ∈ A} ∪ {x + k < y : k ∈ ℤ⁺} where A ⊂ ℤ⁺ is finite. First, we see that the set C_{x<u} (resp.
C_{u<x}) of possible constraints of the form x < u (resp. u < x), where x ∈ S, is finite, because A and S are finite. Therefore, it is trivial to define a finite set that covers C_{x<u} ∪ C_{u<x}. Second, for every pair of variables x, y ∈ S, the set C_{x+k<y} of possible constraints of the form x + k < y, k ∈ ℤ⁺, can be covered by a finite subset of itself. Although for a given pair of variables x, y one can generate an infinite number of constraints x + k_i < y choosing different k_i ∈ ℤ⁺, the constraint x + k < y having the smallest k among all the k_i (∀ k_i. k ≤ k_i) subsumes all the rest of the constraints (x + k_i < y ⊑ x + k < y). Note that k always exists, since every non-empty set of values k_i ∈ ℤ⁺ has a minimum. We only have to check this for one given pair x, y, and we can repeat the same process for every pair of variables, since S is finite and there is only a finite number of pairs. Therefore, the infinite set C_{x+k<y} has a finite subset C_fin = {x + k < y} which covers it (C_{x+k<y} ⊑ C_fin).

Example 4. The Herbrand domain is not constraint-compact. Take the infinite set of constraints C = {X = a, X = f(a), X = f(f(a)), …}. No finite subset of C can cover C.

The termination of TCLP Datalog programs under a top-down strategy when the constraint system is constraint-compact is proven in (Toman 1997). In that case, the evaluation will suspend the exploration of a call whose constraint store is less general than or comparable to a previous call. Eventually, the program will generate a set of call constraint stores that can cover any infinite set of constraints in the constraint domain, therefore finishing evaluation.

Many TCLP applications require constraint domains that are not constraint-compact because constraint-compact domains in general have a limited expressiveness.
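The covering argument of Example 3 can be illustrated with a short Python sketch. It is an illustration under an encoding chosen here, in which a gap-order constraint x + k < y over a fixed pair of variables is represented only by its gap k; the function names are hypothetical.

```python
# Illustrative sketch: the constraint x + k < y is represented by its gap k.
# Over the integers, x + k1 < y entails x + k2 < y exactly when k1 >= k2.
def entails(k1, k2):
    return k1 >= k2

def finite_cover(gaps):
    """Return a one-element cover of a non-empty collection of gaps."""
    return [min(gaps)]

gaps = [7, 3, 12, 5, 3, 40]   # finite sample of a possibly infinite family
cover = finite_cover(gaps)
covered = all(any(entails(k, c) for c in cover) for k in gaps)
print(cover, covered)
```

The single constraint with the smallest gap covers every other one. No similar shortcut exists for the Herbrand set of Example 4, where no constraint X = f(…f(a)…) entails a different one.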
We refine here the termination theorem (Toman 1997, Theorem 23) for Datalog programs with constraint-compact domains to cover cases where the constraint domain is not constraint-compact, but in which the program evaluation generates only a constraint-compact subset of all the constraints expressible in the constraint domain.

Theorem 3 (Termination in non constraint-compact domains). Let P be a TCLP(D) definite program and (q, c_q) a query. Then the TCLP execution for that query terminates iff:
• For every goal (g, c_i) in the forest F(q, c_q), the set C_g is constraint-compact, where C_g is the set of all the constraint stores c_i, projected and renamed w.r.t. the arguments of g.
• For every goal (g, c_g) in the forest F(q, c_q), the set A_⟨{g}, c_g⟩ is constraint-compact, where A_⟨{g}, c_g⟩ is the set of all the answer constraints c′, projected and renamed w.r.t. the arguments of g, s.t. c′ results from a successful derivation of (g, c_i) in the forest F(q, c_q).

Proof. (Toman 1997) proves termination by observing that the SLG_C rewriting rules can be applied only finitely many times. We extend this proof to ensure that the TCLP forest generated is finite and therefore the program execution terminates.
1. The execution can only generate a finite number of literals, up to variable renaming, because they are linearized (unifications take place in the constraints in the body) and the number of predicates in the program is finite.
2. The execution can only generate a finite number of TCLP forests τ_P(g, c_g) because the number of possible literals is finite (point 1) and for each literal g, the set C_g of its possible active constraint stores is constraint-compact. That means that, for every subset of active constraint stores C ⊆ C_g, there exists a finite subset C_fin ⊆ C of possible most general calls, such that ∀ c ∈ C. ∃ c′ ∈ C_fin. c ⊑_D c′.
Therefore, at some point every new call will entail some previous generator (this is checked in Def. 10.2.b).
3. The set of answers Ans(g, c_g) (Def. 10.2.f) is finite because the set of possible most general answer constraints is finite. The justification is similar to that in point 2.
4. The number of children from a node resolved against clauses in P (Def. 10.2.c) is finite because the number of clauses in P is finite.
5. The number of children from a node resolved by answer resolution (Def. 10.2.b) is finite because, by point 3, the set of answers Ans(g, c_g) is finite.

The intuition here is that for every subset C from the set of all possible constraint stores C_g that can be generated when evaluating a call to P, if there is a finite subset C_fin ⊆ C that covers (i.e., is as general as) C, then, at some point, any call will entail a previous call, thereby allowing its suspension to avoid loops. Similarly, for every subset A from the set of all possible answer constraints A_⟨{g}, c_g⟩ that can be generated by a call, if there is a finite subset A_fin ⊆ A that covers A, then, at some point, any answer will entail a previous one, ensuring that the class of answers Ans(g, c_g) which are entailed by any other possible answer returned by the program is finite. Note that this result implies the classical result that programs with the bounded term depth property always finish under variant tabling, since the bounded term depth property means that the number of possible constraints is finite and therefore any constraint set covers itself.

(a) Program which finishes under TCLP(H).

    p(X) :- Y = f(X), p(Y).
    p(a).

(b) Natural numbers in TCLP(Q).

    nat(X) :- X #= Y+1, nat(Y).
    nat(0).

(c) Describing infinitely many numbers in TCLP(Q).

    nat_k(X) :- X #= Y+1, nat_k(Y).
    nat_k(0).
    nat_k(X) :- X #> 1000.

Fig. 3: TCLP programs under H and Q.

Example 5.
The Herbrand domain (with constants and function symbols) with syntactic equality is not constraint-compact, and therefore termination of TCLP(H) programs is not guaranteed. However, in the case of programs which have only constants, the number of constraints that can be generated is finite, and therefore termination is ensured. Termination is also ensured (even with variant tabling) when a program can only generate terms with a bounded depth. In this case, the number of distinct terms (and therefore of equality constraints) that can be generated is finite as well.

Example 6. Fig. 3a shows a program which loops in tabled Prolog and under variant tabling. (The unification appears explicitly in the body for clarity.) Although CLP(H) is not constraint-compact, the constraints generated by that program under the query ?- p(X) can make it finish. Let us examine its behavior from two points of view:

Compactness of the call constraint stores. The set of all the constraint stores generated for the predicate p/1 under the query (p(X), true) is C_p(V) = {true, V = f(X), V = f(f(X)), …}. It is constraint-compact because for every subset C there is a finite set, e.g. C_fin = {true}, that covers C.

Compactness of the answer constraints. Additionally, the set of all answer constraints for the query, A(p(V), true) = {V = a}, is also constraint-compact because it is finite. Since both are constraint-compact, the execution terminates.

Suspension due to call entailment. The first recursive call is (p(Y), Y=f(X)) and its projected and renamed constraint store entails the initial store: V=f(X) ⊑ true. Therefore, TCLP evaluation suspends the recursive call, shifts execution to the second clause, and generates the answer X=a. This answer is given to the suspended recursive call, resulting in the inconsistent constraint store Y=f(X) ∧ Y=a, and the execution terminates.

Note that a finite answer set does not imply a finite domain for the answers: the set of answers Ans(q, c_q) = {V > 5} is finite, but the answer domain of V is infinite.

The syntax C_p(V) means that (i) we are projecting all the calls to predicate p/1 on the variables of that call, and (ii) we are renaming these variables to be V in all the calls. We could associate with every constraint store the names of the variables in the call in order to be able to compare different constraint stores (which is unnecessary after projection if there is only one variable in the call, but it would be needed if more than one variable is involved). In order to avoid such an overload, and without loss of generality, we preferred to project and rename to a unique set of variables.

Example 7. Using the previous example (Fig. 3a) under the query ?- p(a), the set of all the generated constraint stores is C_p(V) = {V = a, V = f(a), V = f(f(a)), …}. It is not constraint-compact and the execution does not terminate. Let us examine its behavior:

The call constraint stores are not compact. The first recursive call is (p(Y), X=a ∧ Y=f(X)) and the projection of its constraint store, Y=f(a), does not entail the initial one after renaming: V=f(a) ⋢ V=a. Then this call is evaluated and produces the second recursive call, (p(Y′), X=a ∧ Y=f(X) ∧ Y′=f(f(X))). Its projected constraint store, Y′=f(f(a)), does not entail any of the previous constraint stores, and so on with the rest of the recursive calls. Therefore, the evaluation loops without terminating.

Let us show the termination properties of the examples used in (Arias and Carro 2019a). These examples show under what conditions programs would terminate even if the constraint domain is not constraint-compact.

Example 8. Fig.
3b shows a program which generates all the natural numbers using TCLP(Q). Although CLP(Q) is not constraint-compact, the constraint stores generated by that program for the query ?- X #< 10, nat(X) are constraint-compact and the program finishes. Let us look at its behavior from two points of view:

Compactness of the call constraint stores and answer constraints. The set of all constraint stores generated for the predicate nat/1 under the query (nat(X), X < 10) is C_nat(V) = {V < 10, V < 9, …, V < −1, V < −2, …}. It is constraint-compact because every subset C ⊆ C_nat(V) is covered by C_fin = {V < 10}. The set of all possible answer constraints for the query, A(nat(V), V < 10) = {V = 0, …, V = 9}, is also constraint-compact because it is finite. Therefore, the program terminates.

Suspension due to call entailment. The first recursive call is (nat(Y), X < 10 ∧ X=Y+1) and the projection of its constraint store after renaming entails the initial one since V < 9 ⊑ V < 10. Therefore, TCLP evaluation suspends the recursive call, shifts execution to the second clause and generates the answer X=0. This answer is given to the suspended recursive call, producing the constraint store X < 10 ∧ X=Y+1 ∧ Y=0, which generates the answer X=1. Each new answer X=n is used to feed the recursive call. When the answer X=9 is given, it results in the (inconsistent) constraint store X < 10 ∧ X=Y+1 ∧ Y=9 and the execution terminates.

Example 9. The program in Fig. 3b does not terminate for the query ?- X #> 0, X #< 10, nat(X). Let us examine its behavior:

The call constraint stores are not compact. The set of all constraint stores generated by the query (nat(X), X > 0 ∧ X < 10) is C_nat(V) = {V > 0 ∧ V < 10, V > −1 ∧ V < 9, …, V > −n ∧ V < (10 − n), …}, which is not constraint-compact. Note that V is, in successive calls, restricted to a sliding interval [k, k+10] which starts at k=0 and decreases k in each recursive call.
No finite subset of these intervals can cover the infinite set of possible intervals.

The evaluation loops. The first recursive call is (nat(Y), X > 0 ∧ X < 10 ∧ X=Y+1) and the projection of its constraint store does not entail the initial one after renaming since (V > −1 ∧ V < 9) ⋢ (V > 0 ∧ V < 10). Then this call is evaluated and produces the second recursive call, (nat(Y′), X > 0 ∧ X < 10 ∧ X=Y+1 ∧ Y=Y′+1). Again, the projection of its constraint store, Y′ > −2 ∧ Y′ < 8, does not entail any of the previous constraint stores, and so on. The evaluation therefore loops.

Example 10. The program in Fig. 3b does not terminate with the query ?- nat(X).

Compactness of the call constraint stores. The set of all constraint stores generated by the query (nat(X), true) is C_nat(V) = {true}. The set C_nat(V) is constraint-compact because it is finite.

The answer constraints are not compact. However, the answer constraint set A(nat(V), true) = {V = 0, V = 1, …, V = n, …} is not constraint-compact, and therefore the program does not terminate.

The evaluation does not terminate. The first recursive call is (nat(Y), X=Y+1) and the projection of its constraint store entails the initial store. Therefore, the TCLP evaluation suspends the recursive call, shifts execution to the second clause, and generates the answer X=0. This answer is used to feed the suspended recursive call, resulting in the constraint store X=Y+1 ∧ Y=0 which generates the answer X=1. Each new answer X=n is used to feed the suspended recursive call. Since the projection of the constraint stores on the call variables is true, the execution tries to generate infinitely many natural numbers.

Example 11. Unlike what happens in pure Prolog/variant tabling, adding new clauses to a program under TCLP can make it terminate. As an example, Fig. 3c is the same as Fig. 3b with the addition of the clause nat_k(X) :- X #> 1000.
Let us examine its behavior under the query ?- nat_k(X):

Compactness of call/answer constraint stores. The set of all constraint stores generated remains C_nat_k(V) = {true}. But the new clause makes the answer constraint set become A(nat_k(V), true) = {V = 0, V = 1, …, V = n, …, V > 1000, V > 1001, …, V > n, …}, which is constraint-compact because a constraint of the form V > n is entailed by infinitely many constraints, i.e., it covers the infinite set {V=n+1, …, V > n+1, …}. Therefore, since both sets are constraint-compact, the program terminates.

First search, then consume. The first recursive call (nat_k(Y), X=Y+1) is suspended and the TCLP evaluation shifts to the second clause, which generates the answer X=0. Then, instead of feeding the suspended call, the evaluation continues the search and shifts to the added clause, nat_k(X) :- X #> 1000, and generates the answer X > 1000. Since no more clauses remain to be explored, the answer X=0 is used, generating X=1. Then X > 1000 is used, resulting in the constraint store X=Y+1 ∧ Y > 1000, which generates the answer X > 1001. However, X > 1001 is discarded because X > 1001 ⊑ X > 1000. Then, one by one, each answer X=n is used, generating X=n+1. But when the answer X=1000 is used, the resulting answer X=1001 is discarded because X=1001 ⊑ X > 1000. At this point the evaluation terminates because there are no more answers to be consumed. The resulting set of answers is Ans(nat_k(X), true) = {X=0, X > 1000, X=1, …, X=1000}.

The equation in the body of the clause, X=Y+1, defines a relation between the variables but, since the domain of X is not restricted, its projection onto Y returns no constraints (i.e., Proj(Y, X=Y+1) = true).

The order in which clauses are explored and answers are consumed depends on the strategy used by the TCLP engine to resume suspended goals. An implementation that first gathers all the answers for goals that can produce results, and then uses these answers to feed suspended goals, makes the exploration of the forests proceed in a breadth-first fashion.

The detection of more particular calls and answers is performed by checking entailment of the current constraint store of calls (resp., answers) against the projected constraint store of a previous call. Some previous frameworks (Schrijvers et al. 2008; Cui and Warren 2000) did not implement a precise projection due to performance and implementation issues. Given that in some cases approximate projections can be more efficient and/or easier to implement, it is worth exploring how relaxing projection impacts soundness and completeness. Let c be a constraint store and let c_s be a projection of c on some set of variables S. Let us also recall (Def. 2) that a valuation is a mapping from variables to domain constants and that a solution for a constraint is a valuation that is consistent with the interpretation of the constraint in its domain. We distinguish three possible projection variants:

Precise projection (denoted c ≡ c_s). c_s is a projection of c over some set of variables S, as defined in Def. 4.

Over-approximating projection (denoted c ⊑ c_s). The projected constraint c_s is more general than the precise projection, i.e., some solutions for c_s may not be partial solutions for c. Any solution for c is still a solution for c_s.

Under-approximating projection (denoted c ⊒ c_s). c_s is less general than the precise projection, i.e., there may be solutions for c that are not solutions for c_s. Any solution of c_s is still a (partial) solution for c.

Let us explain how these projection variants interact with the three phases of the operational semantics described in Section 2.4:
• During the call entailment check (see Def. 10.2.b), if a new goal (t, c), where t is a tabled literal, does not entail a previous generator, then a new TCLP forest F_P(t, c_s) is created and (t, c_s) is a new generator, where c_s = Proj(vars(t), c).
Therefore, depending on the projection variant used, we have that:
  – Using a precise projection, as already shown, the evaluation of the generator (t, c_s) would generate the same answers as the evaluation of the goal (t, c).
  – Using an over-approximating projection, the generator (t, c_s) is more general than (t, c), and therefore the evaluation of (t, c_s) may generate answers that are not consistent with the constraint store c. Note, however, that these answers will be filtered: when they are recovered and applied to a consumer (or to their generator) they will be checked for consistency against the constraint store of the call for which they are used.
  – Using an under-approximating projection, the generator (t, c_s) is more particular than the goal (t, c), and, therefore, its evaluation may not generate answers that (t, c) would. Note that all of them would be consistent with c.
On the other hand, if a new goal (t, c′) entails a previous generator (t, c_s), the goal (t, c′) is, as usual, marked as a consumer and would consume the answers generated by (t, c_s). (In all cases the projected constraint store c_s only has the variables in S in common with the original store c.)
• During the answer entailment check (Def. 10.2.f), the final constraint store a of each successful derivation of the evaluation of a generator (t, c_s) is projected to obtain the answer constraint a_s, i.e., a_s = Proj(vars(t), a). Depending on the projection variant used we have that:
  – Using a precise projection (denoted a ≡ a_s), as already proved, the resulting set of answer constraints for a generator does not add or exclude any valuation w.r.t. the set of its final constraint stores.
  – Using an over-approximating projection (denoted a ⊑ a_s), the projected answer constraint a_s may add valuations that are not consistent with the final constraint store a.
  – Using an under-approximating projection (denoted a ⊒ a_s), a_s may exclude valuations that are contained in the constraint store a.
• During the application of the answers (Def. 10.2.d), each answer constraint a_s obtained during the evaluation of a generator is added to the constraint store c of the goal that created the generator and of the goals that were marked as consumers of that generator. If a_s is consistent with c, i.e., D ⊨ c ∧ a_s, the evaluation continues under the constraint store c ∧ a_s. Otherwise, it fails and the next answer constraint is retrieved.

We will now summarize how using non-precise projections impacts the soundness and completeness of TCLP. Tables 2a and 2b summarize whether soundness and completeness (resp.) are preserved when using over- and under-approximations for the projections in the call (column) and answer (row) entailment check: ‘✓’ in a location of each table means that the corresponding combination of projection variants preserves soundness (resp., completeness), while ‘×’ means the opposite. As expected, some combinations do not preserve soundness / completeness. Let us give an intuition behind these tables.
• In the top row of Table 2a, the only combination that may be unsound is the one that uses an over-approximation for the call projection: the answers may be more general than what a precise projection would produce. However, as mentioned before, when an answer is applied to a goal, a conjunction with the call constraint of that goal is made. That balances the use of an over-approximation in the call. This is in fact similar to the case of a consumer that uses answers from a more general generator.
• The combinations in the middle row of Table 2a are not sound because over-approximations can produce answer constraints that allow for more valuations than a correct solution.
• The cases in the bottom row of Table 2a are clearly sound as the projection of the answer constraints is more restrictive than a precise projection, and therefore it cannot introduce unwanted solutions.
• The combinations in the rightmost column and the bottom-most row of Table 2b may not be complete because they either restrict the projected store for a call or they restrict the answers. In both cases, solutions may be missed.
• The rest of the cases in Table 2b may use projections more relaxed than a precise one, so additional solutions can be generated, but no solution should be removed.

Table 2: Soundness and completeness preservation depending on the projection variant (‘≡’, ‘⊑’ and ‘⊒’) used for the call and answer entailment check.

(a) Soundness preservation.

              c ≡ c_s   c ⊑ c_s   c ⊒ c_s
    a ≡ a_s      ✓         ✓         ✓
    a ⊑ a_s      ×         ×         ×
    a ⊒ a_s      ✓         ✓         ✓

(b) Completeness preservation.

              c ≡ c_s   c ⊑ c_s   c ⊒ c_s
    a ≡ a_s      ✓         ✓         ×
    a ⊑ a_s      ✓         ✓         ×
    a ⊒ a_s      ×         ×         ×

Some approximate projections can be more efficient and/or easier to implement than precise projections, and that justifies their use in specific scenarios. For brevity, let us comment on the combinations that preserve soundness and completeness, ≡/≡ and ⊑/≡, and a combination that over-approximates the answers while using a precise projection in the calls, ≡/⊑:
• ≡/≡: Precise projection ‘≡’ in the call and answer entailment check. This is optimal in the sense that it guarantees soundness and completeness, removes redundant answers, and reduces the search space. It has been used in (Arias and Carro 2019a).
• ⊑/≡: Over-approximate projection ‘⊑’ for the calls and precise projection ‘≡’ for the answers. In this case, generators may generate answers that a precise projection would not, since they start with a more relaxed constraint store (which can turn terminating queries into non-terminating ones). This of course preserves completeness. Soundness is preserved because answer constraints that are not consistent with the initial goal constraint store c will be discarded.

Example 12.
Call abstraction (Schrijvers et al. 2008) is an extreme example, where the constraint store associated with the tabled call is not taken into account for the execution of the call (i.e., the projection of a constraint store is always the constraint true). Therefore, a generator with true as constraint store will be entailed by any subsequent call because c ⊑ true for any constraint c. As mentioned above (see Example 10), this loses several benefits of tabling with constraints because we have to compute all the possible results for an unrestricted call and then filter them through the constraint store active at call time. However, soundness is preserved.

• ≡/⊑: Precise projection ‘≡’ for the calls and over-approximate projection ‘⊑’ for the answers. This combination is relevant because applications such as program analyzers based on abstract interpretation can be seen as performing an execution in an abstract domain that over-approximates the values of the concrete domain to guarantee termination. This over-approximation can be implemented with a constraint system that reflects the operations of the abstract domain and whose answer projections are also over-approximated. Such an over-approximation can increase performance because a more general answer is more frequently entailed by other answers, which reduces the number of answers stored and the number of resumptions.

However, using an over-approximation in the answer projections may make answer resolution lose an arbitrary amount of precision. When an answer constraint a for a generator (t, c_s) is projected to obtain the over-approximated answer constraint a_s, this answer is saved in case it can be reused later on. When a (more concrete) consumer (t, c′) performs answer resolution consuming a_s, the resulting answer is c′ ∧ a_s.
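As a minimal sketch of this answer-resolution step, assume a toy constraint domain (not the one used in Mod TCLP) in which a constraint on a single variable X is the finite set of integers it allows, entailment is set inclusion, conjunction is intersection, and the over-approximating projection is the interval hull. The predicate p, the helper names, and the value ranges are all hypothetical. Under these assumptions, the sketch contrasts the ⊑/≡ and ≡/⊑ combinations:

```python
# Toy constraint domain (an assumption for illustration): a constraint on a
# single variable X is the finite set of integer values it allows.
UNIVERSE = frozenset(range(0, 11))      # plays the role of the constraint `true`
P_SOLUTIONS = frozenset({0, 10})        # hypothetical tabled predicate p(X)


def entails(c, d):
    """c ⊑ d: every valuation allowed by c is also allowed by d."""
    return c <= d


def conj(c, d):
    """c ∧ d: valuations allowed by both constraints."""
    return c & d


def hull(c):
    """Over-approximating projection: the smallest interval containing c."""
    return frozenset(range(min(c), max(c) + 1)) if c else c


def evaluate(call_store):
    """Answer constraint obtained by clause resolution under call_store."""
    return conj(call_store, P_SOLUTIONS)


def apply_answer(consumer_store, answer):
    """Answer resolution: conjoin a stored answer with the consumer's store.
    Returns None when the conjunction is inconsistent (answer discarded)."""
    result = conj(consumer_store, answer)
    return result if result else None


# ⊑/≡ : over-approximated call (call abstraction), precise answers.
consumer = frozenset(range(0, 6))
generator_store = UNIVERSE                       # call projected to `true`
assert entails(consumer, generator_store)
stored = evaluate(generator_store)               # {0, 10}
sound_result = apply_answer(consumer, stored)    # {0}, same as direct evaluation

# ≡/⊑ : precise call, over-approximated answers.
narrow_consumer = frozenset({4, 5, 6})
assert entails(narrow_consumer, UNIVERSE)
stored_approx = hull(evaluate(UNIVERSE))         # {0, ..., 10}
unsound_result = apply_answer(narrow_consumer, stored_approx)
direct_result = evaluate(narrow_consumer)        # empty: p has no solution here
```

In this toy domain, the ≡/⊑ consumer ends up with the valuations 4, 5, and 6 although p has no solution among them, whereas the ⊑/≡ consumer obtains exactly what direct evaluation yields.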
Depending on how the over-approximation is performed, c′ ∧ a_s can be arbitrarily less precise than (or even incomparable with) the result of executing (t, c′) against program clauses and then abstracting it. However, in some cases this problem can be worked around by placing conditions on when an answer is reused.

Example 13.
The implementation of PLAI with TCLP presented in (Arias and Carro 2019b) is an example of this option. In that paper, an abstract interpreter is built using TCLP where the abstract domain and its operations are modeled using a constraint system. One of these operations computes the least upper bound of the different abstract substitutions resulting from the analysis of each clause of a predicate, in order to return the abstract substitution corresponding to the predicate. If a_1 and a_2 are the abstract substitutions at the end of the bodies of two (normalized) clauses p_1 and p_2, one would like to calculate Proj(var(p), a_1 ∨ a_2), where Proj may be an over-approximation. When answer substitutions for each clause are projected and stored separately, composing them is done by computing Proj(var(p), a_1) ⊔ Proj(var(p), a_2), which can be less precise than Proj(var(p), a_1 ∨ a_2). As a result, the predicate-level abstract substitution for p may be an over-approximation of the more precise abstract version.

The tabled abstract substitution for goal p can be retrieved and used to compute the exit substitution for another goal p′ when p′ ⊑ p, using answer resolution. In that case, the exit substitution for p′ can be arbitrarily less precise than what would have been obtained by analyzing p′ directly using clause resolution and then abstracting. We worked around this issue by reusing substitutions only when p and p′ correspond to the same point in the lattice, i.e., when their entry substitutions are (semantically) equal modulo variable renaming.
This ensures that the abstract substitution for p can be used for p′ without incurring additional loss of precision, because the analysis results for p′ and p should be the same.

To the best of our knowledge, there are no examples where under-approximate projections ‘⊒’ are used. However, since they preserve soundness (except when an over-approximation is used for answer projection, which is neither sound nor complete), they can be useful in scenarios where the existence of a solution is enough to answer a question. This would be the case, for example, for program verification: a solution for a query to a TCLP program that uses under-approximations and looks for counterexamples to the correctness of a program would demonstrate the existence of an error in the program, even if the answer only shows a subset of the domain of the variables for which the program exhibits a wrong behavior.

We have extended the theoretical basis of tabled constraint logic programming for a top-down execution. We have characterized the properties that the constraint solver should hold in order to guarantee soundness and completeness. For non-constraint-compact constraint systems, we define sufficient conditions for queries to terminate. For constraint domains without a precise implementation of the projection of constraint stores, we evaluate how relaxing the projection impacts soundness, completeness, and termination.

References

ARIAS, J. 2016. Tabled CLP for Reasoning over Stream Data. In Technical Communications of the 32nd Int’l. Conference on Logic Programming. Vol. 52. OASIcs, 1–8. Doctoral Consortium.
ARIAS, J. AND CARRO, M. 2019a. Description, Implementation, and Evaluation of a Generic Design for Tabled CLP. Theory and Practice of Logic Programming 19.
ARIAS, J. AND CARRO, M. 2019b. Evaluation of the Implementation of an Abstract Interpretation Algorithm using Tabled CLP. Theory and Practice of Logic Programming 19.
ARIAS, J. AND CARRO, M. 2019c.
Incremental evaluation of lattice-based aggregates in logic programming using modular TCLP. J. J. Alferes and M. Johansson, Eds. LNCS, vol. 11372. Springer, 98–114.
CHARATONIK, W., MUKHOPADHYAY, S., AND PODELSKI, A. 2002. Constraint-Based Infinite Model Checking and Tabulation for Stratified CLP. In ICLP’02, P. J. Stuckey, Ed. Lecture Notes in Computer Science, vol. 2401. Springer, 115–129.
CHICO DE GUZMÁN, P., CARRO, M., HERMENEGILDO, M. V., AND STUCKEY, P. 2012. A General Implementation Framework for Tabled CLP. T. Schrijvers and P. Thiemann, Eds. LNCS, vol. 7294. Springer Verlag, 104–119.
CUI, B. AND WARREN, D. S. 2000. A system for Tabled Constraint Logic Programming. In Int’l. Conference on Computational Logic. LNCS, vol. 1861. Springer, 478–492.
DAWSON, S., RAMAKRISHNAN, C. R., AND WARREN, D. S. 1996. Practical Program Analysis Using General Purpose Logic Programming Systems – A Case Study. In Proceedings of the ACM SIGPLAN’96 Conference on Programming Language Design and Implementation. ACM Press, New York, USA, 117–126.
FALASCHI, M., LEVI, G., MARTELLI, M., AND PALAMIDESSI, C. 1989. Declarative Modeling of the Operational Behaviour of Logic Programs. Theoretical Computer Science 69, 289–318.
GABBRIELLI, M. AND LEVI, G. 1991. Modeling Answer Constraints in Constraint Logic Programs. In Proc. 8th Int’l Conference on Logic Programming. 238–252.
GANGE, G., NAVAS, J. A., SCHACHTE, P., SØNDERGAARD, H., AND STUCKEY, P. J. 2013. Failure Tabled Constraint Logic Programming by Interpolation. TPLP 13.
JAFFAR, J. AND MAHER, M. 1994. Constraint Logic Programming: A Survey. Journal of Logic Programming 19/20, 503–581.
JAFFAR, J., SANTOSA, A. E., AND VOICU, R. 2004. A CLP Proof Method for Timed Automata. In RTSS. IEEE Computer Society, 175–186.
JANSSENS, G. AND SAGONAS, K. 1998. On the Use of Tabling for Abstract Interpretation: An Experiment with Abstract Equation Systems.
In Tabulation in Parsing and Deduction.
KANAMORI, T. AND KAWAMURA, T. 1993. Abstract Interpretation Based on OLDT Resolution. Journal of Logic Programming 15, 1–30.
RAMAKRISHNA, Y., RAMAKRISHNAN, C., RAMAKRISHNAN, I., SMOLKA, S., SWIFT, T., AND WARREN, D. 1997. Efficient Model Checking Using Tabled Resolution. In Computer Aided Verification. LNCS, vol. 1254. Springer Verlag, 143–154.
REVESZ, P. Z. 1993. A Closed-Form Evaluation for Datalog Queries with Integer (Gap)-Order Constraints. Theoretical Computer Science 116, 1, 117–149.
SCHRIJVERS, T., DEMOEN, B., AND WARREN, D. S. 2008. TCHR: a Framework for Tabled CLP. Theory and Practice of Logic Programming.
SWIFT, T. AND WARREN, D. S. 2010. Tabling with answer subsumption: Implementation, applications and performance. In Logics in Artificial Intelligence. Vol. 6341. 300–312.
TAMAKI, H. AND SATO, M. 1986. OLD Resolution with Tabulation. In Third International Conference on Logic Programming. Lecture Notes in Computer Science, Springer-Verlag, London, 84–98.
TOMAN, D. 1997. Memoing Evaluation for Constraint Extensions of Datalog. Constraints 2.
VAN EMDEN, M. H. AND KOWALSKI, R. A. 1976. The Semantics of Predicate Logic as a Programming Language. Journal of the ACM 23, 733–742.
WARREN, D. S. 1992. Memoing for Logic Programs. Communications of the ACM 35, 3, 93–111.
WARREN, R., HERMENEGILDO, M., AND DEBRAY, S. K. 1988. On the Practicality of Global Flow Analysis of Logic Programs. In Fifth International Conference and Symposium on Logic Programming. MIT Press, 684–699.
ZOU, Y., FININ, T., AND CHEN, H. 2005. F-OWL: An Inference Engine for Semantic Web. In