A Machine Learning guided Rewriting Approach for ASP Logic Programs
Elena Mastria, Jessica Zangari, Simona Perri, Francesco Calimeri
F. Ricca, A. Russo et al. (Eds.): Proc. 36th International Conference on Logic Programming (Technical Communications) 2020 (ICLP 2020), EPTCS 325, 2020, pp. 261–267, doi:10.4204/EPTCS.325.31
© E. Mastria, J. Zangari, S. Perri, F. Calimeri. This work is licensed under the Creative Commons Attribution License.
Elena Mastria
Department of Mathematics and Computer ScienceUniversity of Calabria, Italy [email protected]
Jessica Zangari
Department of Mathematics and Computer ScienceUniversity of Calabria, Italy [email protected]
Simona Perri
Department of Mathematics and Computer ScienceUniversity of Calabria, Italy [email protected]
Francesco Calimeri
Department of Mathematics and Computer ScienceUniversity of Calabria, Italy [email protected]
Abstract. Answer Set Programming (ASP) is a declarative logic formalism that allows encoding computational problems via logic programs. Despite the declarative nature of the formalism, some advanced expertise is generally required for designing an ASP encoding that can be efficiently evaluated by an actual ASP system. A common way of reducing the burden of manually tweaking an ASP program consists in automatically rewriting the input encoding according to suitable techniques, so as to produce alternative, yet semantically equivalent, ASP programs. However, rewriting does not always grant benefits in terms of performance; hence, proper means are needed for predicting its effects in this respect. In this paper we describe an approach based on Machine Learning (ML) for automatically deciding whether to rewrite. In particular, given an ASP program and a set of input facts, our approach chooses whether and how to rewrite input rules based on a set of features measuring their structural properties and domain information. To this end, a Multilayer Perceptron model has been trained to guide the ASP grounder I-DLV in rewriting input rules. We report and discuss the results of an experimental evaluation over a prototypical implementation.
Answer Set Programming (ASP) [4, 12] is a declarative programming paradigm proposed in the area of non-monotonic reasoning and logic programming. With ASP, computational problems are encoded by logic programs whose intended models, called answer sets, correspond one-to-one to solutions of the original problem. After several years of theoretical research, the scientific community reached a general consensus regarding the foundations of ASP computation, and a number of efficient evaluation methods and real systems are available today [11, 6]. Typically, the same computational problem can be encoded by means of many different ASP programs which are semantically equivalent; however, real ASP systems may perform very differently when evaluating each one of them. Indeed, structural properties of a logic program can make computation easier or harder; furthermore, specific aspects and features of the ASP system at hand, such as the adopted algorithms and optimizations, might have a significant impact on performance. As a result, some expert knowledge can be required in order to select the “best” encoding when performance is crucial; this, in a certain sense, conflicts with the declarative nature of ASP that, ideally, should free users from the burden of computational issues.

∗ This work has been partially supported by MIUR under project “Declarative Reasoning over Streams” (CUP H24I17000080001) – PRIN 2017, by MISE under project “S2BDW” (F/050389/01-03/X32) – “Horizon 2020” PON I&C 2014-20, and by Regione Calabria under project “DLV LargeScale” (CUP J28C17000220006) – POR Calabria 2014-20.

For
this reason, ASP systems tend to be endowed with pre-processing means aimed at making performance less encoding-dependent; intuitively, this also fosters the usage of ASP in practice. The idea of transforming logic programs has been explored in past literature to different extents, e.g., for verification or performance improvement (see, e.g., [14, 9] and related works); in this paper, we focus on ASP and sketch preliminary work on a Machine Learning (ML) strategy for automatically optimizing ASP encodings. Such a strategy relies on an adaptation of hypergraph tree decomposition techniques for rewriting rules, and inductively estimates whether decomposition is convenient or not. We devised and experimentally tested a prototypical implementation relying on the system I-DLV [5]. Experimental results show that the approach is promising; indeed, despite the embryonal nature of the system, performance is already comparable to that obtained with well-assessed deductive heuristics.

[Figure 1: Decomposing a rule — (a) the hypergraph HG(r) over vertices P, S, X, Y, Z, D; (b) a tree decomposition TD1(r) with bags {P,Y,Z,S,X} and {D,P,Y,Z}; (c) a tree decomposition TD2(r) with bags {D,Y,Z,S,X} and {D,P,S,X}.]
An ASP rule r can be represented as a hypergraph [2] HG(r) and decomposed, according to a tree decomposition TD(r) of HG(r), into a set RD(r) of new rules that are equivalent to the original one, yet typically shorter. This technique is adopted in lpopt [2] to rewrite a program before it is fed to an ASP system. In more detail, an (undirected) hypergraph is a generalization of an (undirected) graph in which an edge can join two or more vertices. HG(r) has a hyperedge for each literal l in the body and in the head of r, containing all variables in l. A tree decomposition of a hypergraph HG(r) is a tuple (TD(r), χ), where TD(r) = (V(TD(r)), E(TD(r))) is a tree and χ : V(TD(r)) → 2^V(HG(r)) is a function mapping each vertex t of the decomposition tree TD(r) to a set of vertices χ(t) ⊆ V(HG(r)), such that for each e ∈ E(HG(r)) there is a node t ∈ V(TD(r)) with e ⊆ χ(t), and for each v ∈ V(HG(r)) the set {t ∈ V(TD(r)) | v ∈ χ(t)} is connected in TD(r). Intuitively, a tree decomposition TD(r) of HG(r) is a tree in which each vertex is associated with a bag, i.e., a set of nodes of HG(r), such that each hyperedge of HG(r) is covered by some bag and, for each node of HG(r), all vertices of TD(r) whose bag contains it induce a connected subtree of TD(r). A tree decomposition TD(r) induces a set of rules that rewrites r, called rule decomposition and denoted by RD(r), containing a fresh rule for each vertex v of TD(r). Roughly, each body literal l of r such that the set of variables in l is contained in the bag of v is added to the body of the rule generated for v. Moreover, some rules may be generated to guarantee safety. In general, more than one decomposition is possible for each rule. The following example illustrates these ideas.
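The two conditions above (hyperedge coverage and connectedness of the occurrences of each variable) can be checked mechanically. The following Python sketch, using a set-based encoding of our own (not tied to any ASP system), verifies them for a candidate decomposition:

```python
def is_tree_decomposition(hyperedges, bags, tree_edges):
    """Check the two tree-decomposition conditions for HG(r).

    hyperedges: iterable of frozensets of variables, one per literal of r
    bags:       dict mapping each tree node t to its bag chi(t)
    tree_edges: list of (node, node) pairs forming a tree over bags' keys
    """
    hyperedges = list(hyperedges)
    # Condition 1: every hyperedge is covered by some bag.
    if not all(any(e <= bag for bag in bags.values()) for e in hyperedges):
        return False
    # Condition 2: for each variable, the tree nodes whose bag contains
    # it must induce a connected subtree.
    for v in set().union(*hyperedges):
        nodes = {t for t, bag in bags.items() if v in bag}
        seen, frontier = set(), [next(iter(nodes))]
        while frontier:  # flood-fill restricted to `nodes`
            t = frontier.pop()
            seen.add(t)
            for a, b in tree_edges:
                for x, y in ((a, b), (b, a)):
                    if x == t and y in nodes and y not in seen:
                        frontier.append(y)
        if seen != nodes:
            return False
    return True
```

For instance, a two-node decomposition with bags {P,Y,Z,S,X} and {D,P,Y,Z} passes both checks for the hyperedges of the example rule discussed next.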
Let us consider the rule:

r : p(X,Y,Z,S) :- s(S), a(X,Y,S-1), c(D,Y,Z), f(X,P,S-1), P >= D.

Figure 1 depicts the conversion of rule r into the hypergraph HG(r) and two possible decompositions TD1(r) and TD2(r). According to TD1(r), r can be decomposed into the set of rules RD1(r):

r1 : p(X,Y,Z,S) :- s(S), a(X,Y,S-1), f(X,P,S-1), fresh_pred_1(P,Y,Z).
r2 : fresh_pred_1(P,Y,Z) :- c(D,Y,Z), P >= D, fresh_pred_2(P).
r3 : fresh_pred_2(P) :- s(S), f(_,P,S-1).

r1 has the same head as r and, in its body, all literals covered by the first node of TD1(r), with variables {P,Y,Z,S,X}; r2 has in the head the fresh predicate fresh_pred_1, and its body contains the literals having as variables {D,P,Y,Z}, appearing in the other node of TD1(r). The rule r3 ensures safety in r2: fresh_pred_2(P) is added to the body of r2 and to the head of r3, whose body consists of literals coming from r that cover P. Intuitively, a different rewriting could be obtained by handling safety differently: e.g., by adding the literals s(S) and f(_,P,S-1) to the body of r2 and avoiding the introduction of r3. Similarly, according to TD2(r), r can be rewritten into RD2(r):

r1 : p(X,Y,Z,S) :- a(X,Y,S-1), c(D,Y,Z), fresh_pred_1(D,S,X).
r2 : fresh_pred_1(D,S,X) :- s(S), f(X,P,S-1), P >= D, fresh_pred_2(D).
r3 : fresh_pred_2(D) :- c(D,_,_).

The commonly adopted approach for the evaluation of ASP programs relies on a grounding (or instantiation) module (grounder) that generates a propositional theory semantically equivalent to the input program, coupled with a subsequent module (solver) that uses propositional techniques for generating answer sets. There are monolithic systems integrating both computational stages, such as
DLV [1] and clingo [10], as well as systems performing only grounding, or stand-alone solvers. Among grounders, I-DLV [5] employs a heuristic-guided tree decomposition algorithm [7] aimed at optimizing the instantiation process. Roughly, I-DLV possibly decomposes input rules into multiple smaller ones, according to the technique sketched in Section 2, on the basis of a formula that estimates the cost of joining body literals. This deductive heuristic relies on internally computed grounding statistics, such as the number of generated atoms or the selectivities of arguments (cf. [7]). The technique proved beneficial for both grounding and solving performance, permitting to mitigate the so-called grounding bottleneck and to actually instantiate programs that could not be grounded otherwise. In this paper, in contrast with the aforementioned proposal, we present an approach relying on an inductive heuristic. We still aim at properly deciding whether decomposing rules might improve grounding performance, but via a ML heuristic that considers only “static” information on the non-ground structure of the input program. In particular, this new heuristic is based on a predictive model purposefully designed and trained to classify each input rule as: “better to decompose”, “better not to decompose” or “indifferent” (i.e., applying decomposition or not has almost the same effect on performance).
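For intuition, a classical join-size estimate of the kind such cost formulas build on can be sketched in Python as follows; this is a textbook System R-style estimate, not I-DLV's actual formula, which is described in [7]:

```python
def join_size_estimate(n_a, n_b, distinct_a, distinct_b):
    """Estimate the size of the join of two body literals A and B.

    n_a, n_b:               numbers of ground instances of A and B
    distinct_a, distinct_b: dicts mapping each variable of the literal
                            to its number of distinct values, i.e. the
                            kind of argument selectivity statistics
                            mentioned above
    """
    est = n_a * n_b
    # divide the cross product by the larger number of distinct values
    # for every shared (join) variable
    for v in distinct_a.keys() & distinct_b.keys():
        est /= max(distinct_a[v], distinct_b[v])
    return est
```

E.g., joining literals with no shared variable yields the full cross product, while a shared variable with many distinct values shrinks the estimate accordingly.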
As anticipated, we opted for classification. We recall that, in such a task, input data consists of a set of records known as examples; each record (or example) is a tuple of the form (X, y), where X = {x1, · · · , xn}, for n > 0, is a set of attributes and y is a class label, the so-called target attribute. Classification lies in learning a target function f : X ↦ y that maps a set of attributes X to a class label y. In our context, for a rule r, X is the set of features described in Section 3.2 computed on r, while y can be either “better to decompose”, “better not to decompose” or “indifferent”. We chose a set of features by focusing only on easily computable non-ground structural properties and domain information. The set of examples has been created by selecting all decomposable rules from a large set of widely spread ASP benchmarks. We obtained an example from each selected rule by computing the features on it and associating a class label. The association has been done by taking into account the I-DLV grounding times when the considered rule is decomposed or not (see Section 3.3). Once we obtained a consistent data set, an Artificial Neural Network (ANN) has been designed and trained to build the classifier. Eventually, we experimentally evaluated the quality of the resulting model.
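The association of labels to examples (detailed in Section 3.3) can be sketched as follows; the 10% threshold is the one we used, while the function name, argument names, and the normalization by the slower time are illustrative choices of this sketch:

```python
def label_example(t_plain: float, t_decomposed: float,
                  threshold: float = 0.10) -> str:
    """Assign a class label from the I-DLV grounding times of the
    non-decomposed (t_plain) and decomposed (t_decomposed) versions
    of an example program."""
    faster, slower = min(t_plain, t_decomposed), max(t_plain, t_decomposed)
    # relative difference below the threshold: decomposition is irrelevant
    if slower == 0 or (slower - faster) / slower < threshold:
        return "indifferent"
    return "decomp" if t_decomposed < t_plain else "do-not-decomp"
```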
In total, we devised 19 features involving information about non-ground properties, tree decomposition structures and input facts. For brevity, we herein focus on the 6 features that showed the highest correlation with class labels. The features, reported below, are defined via the following notation. Let P be an ASP program; we indicate as Facts_I(P) the set of input facts of P. We denote as EDB(P) the set of all predicates in P defined only by facts, and as IDB(P) the remaining ones. Let r be a rule; H(r) is the set of atoms in the head of r, B(r) the set of literals in the body of r, and RD(r) a possible rule decomposition of r. We denote as sharedVars(r) the number of joins in B(r), i.e., the number of times each pair of atoms in the rule shares the same variable. Let p be a predicate; we indicate as arity(p) the arity of p. The six features are:

(i) |Facts_I(P)|, the number of input facts;
(ii) |B(r)|, the body length;
(iii) |RD(r)|, the number of rules into which r can be decomposed;
(iv) (∑_{ri ∈ RD(r)} |B(ri)|) / |RD(r)|, the average body length in the rule decomposition;
(v) ∑_{ri ∈ RD(r)} sharedVars(ri), the total number of joins in the rule decomposition;
(vi) (∑_{pi ∈ IDB(P)} arity(pi)) / |IDB(P)|, the average arity of IDB predicates.
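Under a simple set-based representation of rules (ours, for illustration; not I-DLV's internal one), the six features can be computed as:

```python
from itertools import combinations

def features(input_facts, idb_arities, body_vars, decomp_bodies_vars):
    """Compute the six features for a rule r of a program P.

    input_facts:        Facts_I(P)
    idb_arities:        list with arity(p) for each p in IDB(P)
    body_vars:          one variable set per literal of B(r)
    decomp_bodies_vars: one such list per rule of RD(r)
    """
    def shared_vars(lits):
        # sharedVars: how many times pairs of literals share a variable
        return sum(len(a & b) for a, b in combinations(lits, 2))

    n_rules = len(decomp_bodies_vars)
    return (
        len(input_facts),                                   # (i)   |Facts_I(P)|
        len(body_vars),                                     # (ii)  |B(r)|
        n_rules,                                            # (iii) |RD(r)|
        sum(len(b) for b in decomp_bodies_vars) / n_rules,  # (iv)  avg body length of RD(r)
        sum(shared_vars(b) for b in decomp_bodies_vars),    # (v)   total joins in RD(r)
        sum(idb_arities) / len(idb_arities),                # (vi)  avg IDB arity
    )
```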
We collected decomposable rules from benchmarks of the 3rd, 4th and 6th official ASP competitions, as well as from the grounding-intensive 2-QBF domain [2]. Since rules often cannot be decomposed, as tree decomposition is applicable only on the basis of intrinsic structural rule properties, we tried to enrich the set of examples. To this end, we performed a preprocessing step in which we applied to the encodings techniques inspired by unfolding [15]. The aim was to obtain additional rules with longer bodies, which are more likely to be decomposable. At this point, we computed the features over the collected decomposable rules and then properly associated a class label to each of them. Labels have been assigned by first generating, for each considered rule, an example program consisting of its non-decomposed version along with the set of required input facts, and then by measuring I-DLV times when asked to instantiate the example program with decomposition disabled and forcibly activated. If the difference in instantiation times of the two versions is lower than 10%, the feature set is labelled as “indifferent”; otherwise, the assigned label is either “decomp” or “do-not-decomp”, depending on which version led to the lowest grounding time. The resulting data set consists of 3852 examples: 3106 have been labeled as “indifferent”, 417 as “do-not-decomp” and the remaining 329 as “decomp”, with class distributions of about 80.6%, 10.8% and 8.5%, respectively. To build the classifier, we adopted a
Multilayer Perceptron (MLP) neural network [8]. MLPs are commonly used for such tasks, as they often permit reaching high accuracy by “learning” complex implicit relationships within data. In classification, the learning process of a neural network is guided by
a loss function that, in general, determines the quality of the model prediction. More specifically, during the training phase, the internal network configuration undergoes a series of transformations aiming at minimizing the loss function. Since it is computed at each training step, the loss value gives a clear indicator of how well the current configuration performs the task for which the network was designed; in our case, a multi-class classification. Given that we wanted to maintain the natural data set configuration, we adopted a cost-sensitive learning method to deal with the imbalance issue. Such an approach, instead of modifying the distribution of the training data, assigns different weights to classes in the loss function, so that misclassifications of minority classes are penalized more; this commonly used adjustment allows dealing with the imbalance directly within the learning algorithm itself. In particular, we implemented as custom loss function the α-balanced focal loss for multi-class classification [13]. Experimentally, this focal loss variant proved suitable for better classifying examples belonging to minority classes. For the loss function minimization process, we used the adaptive learning rate optimization algorithm Adam as an optimizer.

[Figure 2: Model validation results — (a) performance measures: per-class precision and recall (0.97 and 0.96 on the “indifferent” class; 0.86/0.83 and 0.78/0.88 on the two minority classes), macro averages (0.87/0.89) and weighted averages (0.94/0.94), with the corresponding F1-scores; (b) confusion matrix; (c) AUC-ROC curves.]
The model has been trained over 300 epochs. Following common practice, 70% of the examples have been used as training set and the remaining 30% as test set; in this splitting, we carefully maintained the original class distributions. Since we are dealing with an imbalanced data set, accuracy cannot be considered an appropriate performance measure: in this metric, the impact of classification errors on minority classes is hidden by the proper classification of the majority class. Thus, the quality of the model has been assessed by means of the
F1-score, defined as F1 = 2 × (Precision × Recall) / (Precision + Recall), which provides more information about the effectiveness of the model in correctly predicting the instances belonging to minority classes [3]. The F1-score is high only when both Precision and Recall are high; the former indicates the proportion of cases classified as relevant that are actually relevant, while the latter measures the proportion of relevant cases identified among all the relevant ones. The Receiver Operating Characteristic (ROC) curve examines the model's capability of detecting True Positive (TP) instances and compares it with False Positive (FP) predictions: it plots the TP rate and the FP rate on the vertical and horizontal axes, respectively. The larger the Area Under the Curve (AUC), the higher the quality of the model (the ideal value being AUC = 1.0). Figure 2a reports precision, recall and F1 scores, both class by class and aggregated using macro and weighted averaging methods. Despite the imbalance, the model achieves good performance also when dealing with minority classes. Figure 2b shows the confusion matrix summarizing the distribution of model
Table 1: 4th Competition - number of grounded instances and average grounding times in seconds.
Problem (#inst) | I-DLV never (# / time) | I-DLV always (# / time) | I-DLV deduct (# / time) | I-DLV induct (# / time)
Nomystery (30) | 30 / 34.86 | 30 / 18.92 | 30 / 35.66 | 30 / 18.84
Sokoban (30) | 30 / 2.68 | 30 / 2.76 | 30 / 2.77 | 30 / 2.85
Ricochet Robots (30) | 30 / 0.27 | 30 / 0.31 | 30 / 0.31 | 30 / 0.31
Crossing Minimization (30) | 30 / 0.10 | 30 / 0.10 | 30 / 0.10 | 30 / 0.10
Reachability (30) | 30 / 102.46 | 30 / 101.65 | 30 / 102.72 | 30 / 102.72
Strategic Companies (30) | 30 / 0.21 | 30 / 0.22 | 30 / 0.21 | 30 / 0.41
Solitaire (27) | 27 / 0.13 | 27 / 0.19 | 27 / 0.20 | 27 / 0.20
Weighted-Sequence Problem (30) | 30 / 2.83 | 30 / 9.50 | 30 / 2.90 | 30 / 11.16
Stable Marriage (30) | 30 / 27.72 | 30 / 2.54 | 30 / 2.54 | 30 / 2.47
Incremental Scheduling (30) | 12 / 295.62 | 21 / 219.97 | 21 / 222.00 | 21 / 214.59
Qualitative Spatial Reasoning (30) | 30 / 2.84 | 30 / 2.84 | 30 / 2.83 | 30 / 2.85
Chemical Classification (30) | 30 / 88.49 | 30 / 88.50 | 30 / 88.67 | 30 / 87.28
Abstract Dialectical Frameworks (30) | 30 / 0.13 | 30 / 0.13 | 30 / 0.13 | 30 / 0.13
Visit-all (30) | 30 / 0.13 | 30 / 0.13 | 30 / 0.13 | 30 / 0.14
Complex Optimization (29) | 29 / 34.89 | 29 / 34.23 | 29 / 35.15 | 29 / 35.51
Knight Tour with Holes (30) | 20 / 177.76 | 20 / 173.03 | 20 / 174.90 | 20 / 180.98
Maximal Clique (30) | 30 / 0.32 | 30 / 0.32 | 30 / 0.33 | 30 / 0.31
Labyrinth (30) | 30 / 1.47 | 30 / 1.39 | 30 / 1.48 | 30 / 0.71
Minimal Diagnosis (30) | 30 / 2.54 | 30 / 2.22 | 30 / 2.57 | 30 / 2.90
Hanoi Tower (30) | 30 / 0.22 | 30 / 0.23 | 30 / 0.23 | 30 / 0.23
Graph Colouring (30) | 30 / 0.10 | 30 / 0.10 | 30 / 0.10 | 30 / 0.10
Total (696) | 666 / 22.46 | 677 / 22.39 | 677 / 23.07 | 677 / 22.47

predictions. The
AUC-ROC plot in Figure 2c evidences the model's capability of distinguishing among classes, with an AUC very close to 1.

In this section, we analyze the impact of the proposed inductive heuristic on I-DLV performance. Four versions of I-DLV have been compared: (i) I-DLV never, with decomposition disabled; (ii) I-DLV always, with decomposition always enabled; (iii) I-DLV deduct, with decomposition applied according to the internal deductive heuristic; (iv) I-DLV induct, with decomposition guided by the inductive model. The latter version has been implemented externally, thanks to the capability of I-DLV to customize its grounding process via annotations [5]. In more detail, the model communicates with I-DLV via an external module which, given an encoding, for each rule r first invokes the model to determine whether r has to be decomposed and then, accordingly, annotates r as to be decomposed or not. Eventually, these annotated encodings are fed to I-DLV. We report in Table 1 results on the 4th competition. For each version, the table details the number of grounded instances and the average instantiation times per problem. Experiments have been performed on a NUMA machine equipped with two 2.8 GHz AMD Opteron 6320 processors and 128 GiB of main memory, running Linux Ubuntu 14.04.4; the memory limit has been set to 15 GiB and the time limit to 600 seconds per instance. In general, the proposed method behaves consistently with the well-established deductive method used in I-DLV. On the one side, we observe cases such as
Nomystery, in which the inductive heuristic identifies the benefits of applying decomposition: I-DLV deduct rewrites the input encoding in a way similar to I-DLV never, while the I-DLV induct rewriting is comparable to that performed by I-DLV always. An improvement is also gained in problem
Labyrinth: I-DLV induct is faster than the other versions. On the other side, there is the case of
Weighted-Sequence Problem, in which the proposed method causes a performance degradation (cf. Table 1). In the remaining cases, I-DLV induct performance is in line with the others. In summary, one can note that, despite the embryonal nature of the work, performance is already comparable to that obtained with well-assessed methods. Further studies will tell whether the inductive approach is actually effective for improving the performance of ASP grounders. In this respect, we plan to enrich the set of classification features, to experiment with other classification algorithms, and to consider a larger set of benchmarks for both training and testing. Moreover, we plan to consider further rewriting techniques besides tree decomposition, and to extend our implementation with the capability of foreseeing the effects of each single rewriting and/or combinations thereof.
References

[1] Mario Alviano, Francesco Calimeri, Carmine Dodaro, Davide Fuscà, Nicola Leone, Simona Perri, Francesco Ricca, Pierfrancesco Veltri & Jessica Zangari (2017): The ASP System DLV2. In: LPNMR 2017, Espoo, Finland, July 3-6, 2017, Proceedings, LNCS.
[2] Manuel Bichler, Michael Morak & Stefan Woltran (2016): The power of non-ground rules in Answer Set Programming. TPLP.
[3] Paula Branco, Luís Torgo & Rita P. Ribeiro (2016): A survey of predictive modeling on imbalanced domains. ACM Computing Surveys (CSUR).
[4] Gerhard Brewka, Thomas Eiter & Mirosław Truszczyński (2011): Answer set programming at a glance. Communications of the ACM.
[5] Francesco Calimeri, Davide Fuscà, Simona Perri & Jessica Zangari (2017): I-DLV: The new intelligent grounder of DLV. Intelligenza Artificiale.
[6] Francesco Calimeri, Martin Gebser, Marco Maratea & Francesco Ricca (2016): Design and results of the Fifth Answer Set Programming Competition. Artificial Intelligence.
[7] Francesco Calimeri, Simona Perri & Jessica Zangari (2019): Optimizing Answer Set Computation via Heuristic-Based Decomposition. TPLP.
[8] Daniel Graupe: Principles of artificial neural networks. Advanced Series in Circuits and Systems 7, World Scientific, doi:10.1142/8868.
[9] Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi & Maurizio Proietti (2014): VeriMAP: A Tool for Verifying Programs through Transformations. In: TACAS 2014, Held as Part of the ETAPS 2014, Grenoble, France, April 5-13, 2014, Proceedings, LNCS.
[10] Martin Gebser, Roland Kaminski, Benjamin Kaufmann & Torsten Schaub (2019): Multi-shot ASP solving with clingo. TPLP.
[11] Martin Gebser, Nicola Leone, Marco Maratea, Simona Perri, Francesco Ricca & Torsten Schaub (2018): Evaluation Techniques and Systems for Answer Set Programming: a Survey. In Jérôme Lang, editor: IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, ijcai.org, pp. 5450–5456, doi:10.24963/ijcai.2018/769.
[12] Michael Gelfond & Vladimir Lifschitz (1991): Classical Negation in Logic Programs and Disjunctive Databases. New Generation Computing.
[13] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He & Piotr Dollár (2017): Focal loss for dense object detection. In: Proceedings of the IEEE ICCV, pp. 2980–2988, doi:10.1109/ICCV.2017.324.
[14] Alberto Pettorossi & Maurizio Proietti (1996): Rules and Strategies for Transforming Functional and Logic Programs. ACM Comput. Surv.