SMT-based Verification of LTL Specifications with Integer Constraints and its Application to Runtime Checking of Service Substitutability
Marcello M. Bersani, Luca Cavallaro, Achille Frigeri, Matteo Pradella, Matteo Rossi
Marcello M. Bersani
Politecnico di Milano, Milano
Luca Cavallaro
Politecnico di Milano, Milano
Achille Frigeri
Politecnico di Milano, Milano
Matteo Pradella
CNR IEIIT-MI, Milano
Matteo Rossi
Politecnico di Milano, Milano
Abstract: An important problem that arises during the execution of service-based applications concerns the ability to determine whether a running service can be substituted with one with a different interface, for example if the former is no longer available. Standard Bounded Model Checking techniques can be used to perform this check, but they must be able to provide answers very quickly, lest the check hamper the operation of the application instead of aiding it. The problem becomes even more complex when conversational services are considered, i.e., services that expose operations that have input/output data dependencies among them. In this paper we introduce a formal verification technique for an extension of Linear Temporal Logic that allows users to include constraints on integer variables in formulae. Applied to the substitutability problem for conversational services, this technique is shown to be considerably faster, and to have a smaller memory footprint, than existing ones.
Keywords: Bounded Model Checking, SMT-solvers, Service-Oriented Architectures.
I. INTRODUCTION
Service Oriented Architectures (SOAs) are a flexible set of design principles that promote interoperability among loosely coupled services that can be used across multiple business domains. In this context applications are typically composed of services made available by third-party vendors. This opens new scenarios that are unimaginable in traditional applications. On the one hand, an organization does not have total control of every part of the application, hence failures and service unavailability should be taken into account at runtime. On the other hand, during the application execution new services might become available that enable new features or provide equivalent functionalities with better quality. Therefore the ability to support the evolution of service compositions, for example by allowing applications to substitute existing services with others discovered at runtime, becomes crucial.

Most of the frameworks proposed in recent years for the runtime management of service compositions make the assumption that all semantically equivalent services agree on their interface [1], [2]. In practice this assumption turns out to be unfounded. The picture is further complicated when one considers conversational services, i.e., services that expose operations with input/output data dependencies among them. In this case the composition must deal with sequences of operation invocations, i.e., the behavior protocol, instead of single, independent ones.

[3], [4] propose an approach to tackle the substitutability problem, i.e., the problem of deciding when a service can be dynamically substituted by another one discovered at runtime, based on Bounded Model Checking (BMC) techniques. Even if the approach proved to be quite effective, the Propositional Satisfiability (SAT) problem on which traditional BMC relies requires dealing with lengthy constraints, which typically limits the efficiency of the analysis phase. In the setting of the runtime management of service compositions this is not acceptable, as delays incurred when deciding whether services are substitutable or not can hamper the operation of the application.

In this paper we introduce a verification technique, based on Satisfiability Modulo Theories (SMT), for an extension of Propositional Linear Temporal Logic with Both past and future operators (PLTLB). This extension, called CLTLB(DL), allows users to define formulae including Difference Logic (DL) constraints on time-varying integer variables.

Our SMT-based verification technique has two main advantages: (i) unlike in traditional BMC, arithmetic domains are not approximated by means of a finite representation, which proves to be particularly useful in the service substitutability problem; (ii) the implemented prototype is shown to be considerably faster, and to have a smaller memory footprint, than existing ones based on traditional BMC, due to the conciseness of the problem encoding.

The technique exploits decidable arithmetic theories supported by many SMT solvers [5] to natively deal with integer variables (hence, with an infinite domain). This allows us to decide larger substitutability problems than before, in significantly less time: the response times of our prototype tool make it usable also in a runtime checking setting.

This paper is structured as follows: Section II introduces the issues underlying the runtime checking of service substitutability; Sections III and IV present, respectively, CLTLB(DL) and its SMT-based encoding for verification purposes; Section V explains how the approach works on a case study, and Section VI discusses some experimental results; finally, Section VII presents some related work.

II. SUBSTITUTABILITY CHECKING OF CONVERSATIONAL SERVICES
The approach presented in [3] enables service substitution through the automatic definition of suitable mapping scripts. These map the sequences of operations that the client expects to invoke on the expected service into the corresponding sequences made available by the actual service (i.e., the service that will actually be used). Mapping scripts are automatically derived given (i) a description of service interfaces in which input and output parameters are associated with each service operation, and (ii) the behavioral protocol associated with each service, described through an automaton.

The mapping between an expected and an actual service assumes that two compatibility relationships have been previously defined. The first states the compatibility between states of the two automata. The second concerns the compatibility between names and data associated with some operation $o_{exp} \in O_{exp}$ in the expected service and those associated with some operation $o'_{act} \in O_{act}$ in the actual service. For the sake of simplicity, here we assume that states, operation names and data are compatible if they have the same name (more sophisticated compatibility relationships are explored in [6]).

Given these definitions, we say that a sequence of operations in the automaton of the expected service is substitutable by another sequence of operations in the automaton of the actual service if a client designed to use the expected service sequence can use the actual service sequence without noticing the difference. This can happen when the following conditions hold:
1) The sequence in the actual service automaton starts and ends in states that are compatible with the initial and final states of the sequence in the expected service automaton.
2) All data parameters of the operations in the actual service automaton sequence are compatible with those appearing in the expected service automaton sequence.

This substitutability definition allows us to build a reasoning mechanism based on PLTLB that, given an expected service sequence, returns a corresponding actual service sequence. The formal model for reasoning about substitutability includes the behavioral protocols of both the expected and the actual services, represented as Labelled Transition Systems (LTS) and formalized in PLTLB, in which each transition is labelled with the associated operation. Input and output parameters of each operation are also part of the model (Fig. 1 shows the LTS of a service discussed in Section V).

Figure 1. LTS of the ChartLyrics service of Section V. States: start, SearchLyric start, end. Transition labels (operation(inputs):outputs): SearchLyric(song; artist): SongRank; song; artist; ArtistUrl; SongUrl; lyricsId; lyricCheckSum. SearchLyricText(lyricText): SongRank; song; artist; ArtistUrl; SongUrl; lyricsId; lyricCheckSum. GetLyric(lyricCheckSum; lyricsId): Lyric; LyricCorrectUrl; LyricRank; LyricCovertArtUrl; LyricCorrectUrl; artist; song.

In addition, the model includes the definition of two kinds of integer counters. The first is called seen, and it is used to check that the actual service can work using a subset of the input data provided by the client to the expected service. The second is called needed, and it is used to check that the actual service can provide a superset of the data the client expects to receive as output of the expected service. The model includes an instance of seen (resp. needed) for each type of data that can be used as input (resp. output) parameter for an operation.

The model states that each time an operation of the expected service is invoked, the instances of seen for each input parameter and those of needed for each output parameter are all incremented by one. Conversely, when an operation of the actual service is invoked, the instances of the seen counter for each input parameter and those of the needed counter for each output parameter are all decremented by one. Note that an actual service operation can be invoked only if the seen counter for each of its input parameters is $\geq 1$ (i.e., the input parameters have been provided by a client expecting to invoke some operations on the expected service).

Through this model, given a sequence of operations in the expected service automaton, we can formalize the problem of finding a substituting operation sequence in the actual service automaton. More precisely, the actual operation sequence exists if, when the expected operation sequence is finished, the actual and expected services are in compatible states, and each instance of the needed counter has a value $\leq 0$. The rationale behind the latter condition is that when the value of a needed counter is $0$, the actual service has provided enough instances of a certain type of data to fulfill client requests. If, on the other hand, the actual service provides more instances of a type of data than those requested, then the corresponding needed counter is $< 0$.

If the expected service operation sequence analyzed is substitutable by one in the actual service, a mapping script is generated and then interpreted by an adapter that intercepts all service requests issued by the client and transforms them into requests the actual service can understand. Fig. 2 shows the placement of adapters in the infrastructure architecture and highlights their nature as intermediaries (see [3] for details).
Figure 2. The adaptation runtime infrastructure. Elements shown: service composition, proxy, adapter, services S1 and S2, and a mapping script from S1 to S2 (map o1 and o2 on S1 to o1 on S2).
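The counter discipline described in this section can be illustrated with a small Python sketch. It is not the PLTLB formalization used by the approach: the class and method names are hypothetical, and the threshold of 1 in `can_invoke_actual` is our reading of the invocation condition on the seen counters.

```python
from collections import Counter


class SubstitutionCounters:
    """Hypothetical bookkeeping for the seen/needed counters,
    with one counter instance per type of data."""

    def __init__(self):
        self.seen = Counter()    # inputs provided by the client and not yet consumed
        self.needed = Counter()  # outputs the client still expects to receive

    def invoke_expected(self, inputs, outputs):
        # An expected-service operation increments seen for each input
        # parameter and needed for each output parameter.
        for data in inputs:
            self.seen[data] += 1
        for data in outputs:
            self.needed[data] += 1

    def can_invoke_actual(self, inputs):
        # Assumption: an actual operation is enabled only if every input
        # has a seen counter of at least 1.
        return all(self.seen[data] >= 1 for data in inputs)

    def invoke_actual(self, inputs, outputs):
        # An actual-service operation decrements the same counters.
        if not self.can_invoke_actual(inputs):
            raise ValueError("some input was never provided by the client")
        for data in inputs:
            self.seen[data] -= 1
        for data in outputs:
            self.needed[data] -= 1

    def outputs_satisfied(self):
        # Final condition on the data: every needed counter is <= 0.
        return all(value <= 0 for value in self.needed.values())
```

A needed counter below zero records data that the actual service returned without the client having asked for it; state compatibility has to be checked separately on the two automata.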
III. A LOGIC FOR TIME-VARYING COUNTERS
In order to deal with time-varying counters over actual domains (such as seen and needed, discussed in Section II), we introduce an extension of Linear-time Temporal Logic with past operators and non-quantified first-order integer variables. The language we consider, denoted CLTLB(DL), is an extension of PLTLB which combines pure Boolean atoms and formulae with terms defined by DL constraints. Counters can naturally be represented by integer variables over the whole domain, without any approximation due to a propositional encoding. In [7] we prove the decidability of the satisfiability problem in more general cases.

Difference Logic is the structure $\langle \mathbb{Z}, =, (<_d)_{d \in \mathbb{Z}} \rangle$ where each $<_d$ is a binary relation defined as
\[
x <_d y \Leftrightarrow x < y + d.
\]
The notations $x < y$, $x \leq y$, $x \geq y$, $x > y$ and $x = y + d$ are abbreviations for $x <_0 y$, $x <_0 y \vee x = y$, $\neg(x <_0 y)$, $\neg(x <_0 y \vee x = y)$ and $y <_{1-d} x \wedge x <_{d+1} y$, respectively.

Let $AP$ be the set of Atomic Propositions and $V$ the set of variables; the CLTLB(DL) language is defined as follows:
\[
\begin{aligned}
\phi &:= p \mid \varphi \sim \varphi \mid \phi \wedge \phi \mid \neg\phi \mid \mathbf{X}\phi \mid \mathbf{Y}\phi \mid \mathbf{Z}\phi \mid \phi\,\mathbf{U}\,\phi \mid \phi\,\mathbf{S}\,\phi \\
\varphi &:= x \mid \mathbf{X}\varphi \mid \mathbf{Y}\varphi
\end{aligned}
\]
where $p \in AP$, $x \in V$, $\sim$ is any relation in DL, $\mathbf{X}$ is the usual "next", $\mathbf{Y}$ and $\mathbf{Z}$ are "previous" operators, and $\mathbf{U}$ and $\mathbf{S}$ are the usual "until" and "since" operators. Subformulae $\varphi$ are called arithmetic temporal terms (a.t.t.); for such terms, we recursively define the depth $|\varphi|$:
\[
|x| = 0, \qquad |\mathbf{X}(\varphi)| = |\varphi| + 1, \qquad |\mathbf{Y}(\varphi)| = |\varphi| - 1.
\]
Depth extends naturally to formulae as the minimum depth of their a.t.t.'s.

The semantics of a formula $\phi$ of CLTLB(DL) is defined w.r.t. a linear time structure $(S, s_0, I, \pi, L)$ where $S$ is the set of states, $s_0$ is the initial state, $I : [|\phi|, -1] \times V \to \mathbb{Z}$ is an assignment of variables, $\pi$ is an infinite path $\pi = s_0 s_1 \ldots$ endowed with a sequence of valuations $\sigma : \mathbb{N} \times V \to \mathbb{Z}$, and $L : S \to 2^{AP}$ is the labeling function.
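The function $I$ allows a valuation of variables to be defined also for instants preceding zero, and then to be extended to a.t.t.'s. Indeed, if $\varphi$ is such a term, $x$ is the variable in $\varphi$, $s_i$ is a state along the sequence, and $\sigma_i$ is a shorthand for $\sigma(i, \cdot)$, then:
\[
\sigma_i(\varphi) =
\begin{cases}
\sigma_{i+|\varphi|}(x), & \text{if } i + |\varphi| \geq 0; \\
I(i + |\varphi|, x), & \text{if } i + |\varphi| < 0.
\end{cases}
\]
Given a model $\pi_\sigma$, the semantics of a formula $\phi$ is recursively defined as:
\[
\begin{array}{lcl}
\pi^i_\sigma \models p & \Leftrightarrow & p \in L(s_i) \text{ for } p \in AP \\
\pi^i_\sigma \models (\varphi_1 \sim \varphi_2) & \Leftrightarrow & \sigma_{i+|\varphi_1|}(x_{\varphi_1}) \sim \sigma_{i+|\varphi_2|}(x_{\varphi_2}) \\
\pi^i_\sigma \models \neg\phi & \Leftrightarrow & \pi^i_\sigma \not\models \phi \\
\pi^i_\sigma \models \phi \wedge \psi & \Leftrightarrow & \pi^i_\sigma \models \phi \text{ and } \pi^i_\sigma \models \psi \\
\pi^i_\sigma \models \mathbf{X}\phi & \Leftrightarrow & \pi^{i+1}_\sigma \models \phi \\
\pi^i_\sigma \models \mathbf{Y}\phi & \Leftrightarrow & \pi^{i-1}_\sigma \models \phi \wedge i > 0 \\
\pi^i_\sigma \models \mathbf{Z}\phi & \Leftrightarrow & \pi^{i-1}_\sigma \models \phi \vee i = 0 \\
\pi^i_\sigma \models \phi\,\mathbf{U}\,\psi & \Leftrightarrow & \exists j \geq i : \pi^j_\sigma \models \psi \wedge \pi^n_\sigma \models \phi \;\; \forall\, i \leq n < j \\
\pi^i_\sigma \models \phi\,\mathbf{S}\,\psi & \Leftrightarrow & \exists\, 0 \leq j \leq i : \pi^j_\sigma \models \psi \wedge \pi^n_\sigma \models \phi \;\; \forall\, j < n \leq i
\end{array}
\]
where $x_{\varphi_i}$ is the variable that appears in $\varphi_i$ and $\sim$ is any relation in DL. The $\mathbf{R}$ and $\mathbf{T}$ operators, over infinite paths, can be defined as usual: $\phi\,\mathbf{R}\,\psi \equiv \neg(\neg\phi\,\mathbf{U}\,\neg\psi)$ and $\phi\,\mathbf{T}\,\psi \equiv \neg(\neg\phi\,\mathbf{S}\,\neg\psi)$. By means of the previous dualities and De Morgan's rules, it is always possible to rewrite all formulae to positive normal form. From now on, we assume all formulae are in positive normal form. A formula $\phi \in$ CLTLB(DL) is satisfiable if there exist a linear time structure $(S, s_0, I, \pi, L)$ and a sequence of valuations $\sigma$ such that $\pi^0_\sigma \models \phi$, where $\pi_\sigma$ is the sequence built from $\pi$ and the valuations as described before.

Unfortunately, CLTLB(DL) is too expressive, in the sense that the satisfiability problem can be proven to be highly undecidable [8]. However, the satisfiability and the model checking problems for a CLTLB(DL) formula $\phi$ for $k$-partial valuations (i.e., for all computations in which the value of counters is considered only up to $k$ plus the maximum depth of the subformulae of $\phi$ steps) are shown to be decidable [7].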
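As an illustration of the depth and valuation of arithmetic temporal terms defined above, the following Python sketch evaluates a term at a given instant. The tuple representation of terms and the names `depth`, `variable` and `value` are assumptions made for this example only.

```python
def depth(term):
    """Depth of an a.t.t.: |x| = 0, |X t| = |t| + 1, |Y t| = |t| - 1.

    Terms are nested tuples: ("var", "x"), ("X", term) or ("Y", term).
    """
    kind = term[0]
    if kind == "var":
        return 0
    if kind == "X":
        return depth(term[1]) + 1
    if kind == "Y":
        return depth(term[1]) - 1
    raise ValueError(f"unknown term constructor: {kind!r}")


def variable(term):
    """The unique variable occurring in an a.t.t."""
    while term[0] != "var":
        term = term[1]
    return term[1]


def value(term, i, sigma, init):
    """Value of `term` at instant i.

    sigma: dict mapping each variable to its list of values at instants 0, 1, ...
    init:  dict mapping (negative instant, variable) to a value; it plays the
           role of the assignment I for instants preceding zero.
    """
    j = i + depth(term)
    x = variable(term)
    return sigma[x][j] if j >= 0 else init[(j, x)]
```

For example, with `sigma = {"x": [10, 11, 12]}` and `init = {(-1, "x"): 9}`, the term `("Y", ("var", "x"))` evaluated at instant 0 falls before the origin and is read from `init`.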
The satisfiability and model checking problems for $k$-partial valuations reduce, respectively, to the satisfiability and model checking problems over bounded paths of length $k$ with $k$-partial valuations. As in standard BMC (of a property $\phi$), the goal is to look for finite initialized paths of the system that are witnesses of wrong behaviors, i.e., paths along which the negation of the property $\phi$ holds. When the finite path of length $k$ admits a loop, it contains all its infinite periodic behavior; conversely, when a loop does not exist, it represents all its possible extensions. Indeed, it is representative of an infinite path. Formally, paths are words of states $s_i$ which may be periodic: $\pi = uv^\omega$ with $u = s_0 \cdots s_l$ and $v = s_{l+1} \cdots s_k$, where $l \leq k$, if the loop exists; $\pi = uv$ if it does not. Besides the propositional model, the values of the variables up to the state $s_k$ are described by a bounded representation $\pi_{\sigma_k}$ of the model $\pi_\sigma$. This representation is also suitably bordered by values of variables referring to time instants outside the finite path, before $s_0$ and after $s_k$, depending on the depth of the formula. Arithmetic DL constraints may be part of the possibly periodic model $\pi_\sigma$ and, thus, are defined by means of a finite prefix of length $k$. According to [7], [9], we are allowed to use a proper bounded semantics to state reachability properties on the part of the system involving a counting mechanism (e.g., $\mathbf{X}x = y + 1$, where $x$, $y$ are variables). Note that over finite acyclic paths the equivalences $\phi\,\mathbf{R}\,\psi \equiv \neg(\neg\phi\,\mathbf{U}\,\neg\psi)$ and $\phi\,\mathbf{T}\,\psi \equiv \neg(\neg\phi\,\mathbf{S}\,\neg\psi)$ no longer hold. Then, $\mathbf{R}$ (and symmetrically $\mathbf{T}$) is redefined as [10]:
\[
\pi^i_\sigma \models_k \phi\,\mathbf{R}\,\psi \Leftrightarrow \exists\, i \leq j \leq k,\; \pi^j_\sigma \models_k \phi \wedge \pi^n_\sigma \models_k \psi \;\; \forall\, i \leq n \leq j
\]
Based on this assumption, the (existential) reachability problem over infinite paths endowed with a $k$-partial valuation $\sigma_k$, $\pi_{\sigma_k} \models \phi$, can be reduced to the bounded (existential) reachability problem over finite (possibly cyclic) paths with $k$-partial valuation, $\pi_{\sigma_k} \models_k \phi$:

Theorem 1 ([7]). Let $\phi$ be a CLTLB(DL) formula. There exists $k > 0$ such that if $\pi_{\sigma_k}$ is a path endowed with a $k$-partial valuation of variables, then $\pi_{\sigma_k} \models \phi \Leftrightarrow \pi_{\sigma_k} \models_k \phi$.

These results allow us to correctly verify the satisfiability of CLTLB(DL) formulae and also to perform bounded model checking of systems involving DL constraints. In particular, when a counting mechanism is defined, reachability properties of values of variables along paths of finite length can be verified. If the reachability property does not hold within $k$, then $k$ can be refined and increased. As explained later in Section VI, the substitutability problem can be solved effectively by means of a BMC approach by correctly estimating an upper bound for $k$. This is done by using a suitable heuristic based on the size of the automata describing the services and on the length of the traces of invocations. For these reasons, the substitutability problem, which requires checking a counting mechanism over finite paths of invocations of service functions, can be easily encoded as a bounded reachability problem.

IV. ENCODING OF BOUNDED REACHABILITY PROBLEM
In this section the bounded reachability problem is encoded as the satisfiability of a Quantifier-Free Integer Difference Logic formula with Uninterpreted Function and predicate symbols (QF-UFIDL). Such a logic is decidable, and its satisfiability problem is NP-complete, as can be easily proved by applying the Nelson-Oppen theorem. The QF-UFIDL encoding is more succinct and expressive than the Boolean one: lengthy propositional constraints are replaced by more concise DL constraints, and arithmetic (infinite) domains do not require an explicit finite representation. These facts, considering also that the satisfiability problem for QF-UFIDL has the same complexity as SAT, make the SMT-based approach particularly efficient for solving the runtime substitutability problem, as demonstrated by the performance results. In the key work of Biere et al. [9], BMC is reduced to a pure propositional satisfiability problem. This approach, with further refinements [10], [11], [12], has already been implemented in the Zot tool.

A. Encoding the Time
As discussed before, the BMC problem amounts to looking for a finite representation of infinite (possibly periodic) paths. The SAT-based approach encodes finite paths [9] by means of $2k + 3$ propositional variables. The time instant at which the periodic suffix starts is defined by the loop selector variables $l_0, l_1, \ldots, l_k$: $l_i$ holds if and only if the loop starts at instant $i$, i.e., $s_i$ is the successor of $s_k$. Then, the truth (of atomic propositions) in $s_{i-1}$ and $s_k$, defined by the labeling function $L$ introduced in Section III, must be the same. Further propositional variables, $inLoop_i$ ($0 \leq i \leq k$) and $loopEx$, mean, respectively, that time instant $i$ is inside a loop and that there actually exists a loop.

The same temporal behavior can be defined by means of one QF-UFIDL formula involving only one integer loop-selecting variable $loop \in \mathbb{Z}$:
\[
\bigwedge_{i=1}^{k} \left( loop = i \Rightarrow L(s_{i-1}) = L(s_k) \right).
\]
The QF-UFIDL encoding is more concise: it does not require the $2k + 3$ Boolean variables ($l_i$, $inLoop_i$ and $loopEx$). A value of $loop$ between $1$ and $k$ defines whether there exists a loop and its position; the number of variables does not depend on the parameter $k$.

B. Encoding the Arithmetic Temporal Terms
Since CLTLB(DL) formulae consist also of a.t.t.'s, we need to define a suitable semantics for them. An arithmetic formula function, i.e., an uninterpreted function $\boldsymbol{\alpha} : \mathbb{Z} \to \mathbb{Z}$, is associated with each arithmetic temporal subterm of $\Phi$. Let $\alpha$ be such a subterm; then the arithmetic formula function associated with it (denoted by the same name but written in bold face) is recursively defined w.r.t. the sequence of valuations $\sigma$ as:
\[
\begin{array}{l|l}
\alpha & 0 \leq i \leq k \\ \hline
x & \boldsymbol{x}(i) = \sigma_i(x) \\
\mathbf{X}\alpha & \boldsymbol{X\alpha}(i) = \boldsymbol{\alpha}(i+1) \\
\mathbf{Y}\alpha & \boldsymbol{Y\alpha}(i) = \boldsymbol{\alpha}(i-1)
\end{array}
\]
This semantics is well-defined between $0$ and $k$ thanks to the initialization function $I$.

(Zot: a Bounded Satisfiability Checker, http://home.dei.polimi.it/pradella/)

C. Encoding the Propositional Terms
The propositional encoding is inspired by the one studied in [10], but deeply revised to take into account also relations over a.t.t.'s. In the case of the Boolean encoding, the semantics of a PLTLB formula $\Phi$ is defined w.r.t. the truth value of all its subformulae only by means of Boolean variables $t_i$ associated with each of them, for all $0 \leq i \leq k + 1$: if $t_i$ holds then the subformula $t$ holds at instant $i$. The instant $k + 1$ is appended to the path to easily represent the instant in the past where the loop realizes the periodicity; indeed, it turns out to be useful for the encoding. The propositional semantics of a CLTLB(DL) formula $\Phi$ is defined like that of PLTLB. The QF-UFIDL encoding, instead, associates with each propositional subformula a formula predicate, that is, a unary uninterpreted predicate $\boldsymbol{\varphi} \in \mathcal{P}(\mathbb{Z})$. When the subformula $\varphi$ holds at instant $i$, then $\boldsymbol{\varphi}(i)$ holds. As the length of paths is fixed to $k + 1$, and all paths start from $0$, formula predicates are actually subsets of $\{0, \ldots, k+1\}$. Let $\varphi$ be a propositional subformula of $\Phi$, $\alpha$, $\beta$ be two a.t.t.'s and $\sim$ be any relation in DL; then the formula predicate associated with $\varphi$ (denoted by the same name but written in bold face) is recursively defined as:
\[
\begin{array}{l|l}
\varphi & 0 \leq i \leq k + 1 \\ \hline
p & \boldsymbol{p}(i) \Leftrightarrow p \in L(s_i) \\
\alpha \sim \beta & (\boldsymbol{\alpha \sim \beta})(i) \Leftrightarrow \boldsymbol{\alpha}(i) \sim \boldsymbol{\beta}(i) \\
\neg\phi & \boldsymbol{\neg\phi}(i) \Leftrightarrow \neg\boldsymbol{\phi}(i) \\
\phi \wedge \psi & (\boldsymbol{\phi \wedge \psi})(i) \Leftrightarrow \boldsymbol{\phi}(i) \wedge \boldsymbol{\psi}(i)
\end{array}
\]

D. Encoding Temporal Operators

Temporal subformulae constraints define the basic temporal behavior of future and past operators, by using their traditional fixpoint characterizations.
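Let $\phi$ and $\psi$ be propositional subformulae of $\Phi$, then:
\[
\begin{array}{l|l}
\varphi & 0 \leq i \leq k \\ \hline
\mathbf{X}\phi & \boldsymbol{X\phi}(i) \Leftrightarrow \boldsymbol{\phi}(i+1) \\
\phi\,\mathbf{U}\,\psi & (\boldsymbol{\phi U \psi})(i) \Leftrightarrow (\boldsymbol{\psi}(i) \vee (\boldsymbol{\phi}(i) \wedge (\boldsymbol{\phi U \psi})(i+1))) \\
\phi\,\mathbf{R}\,\psi & (\boldsymbol{\phi R \psi})(i) \Leftrightarrow (\boldsymbol{\psi}(i) \wedge (\boldsymbol{\phi}(i) \vee (\boldsymbol{\phi R \psi})(i+1)))
\end{array}
\]
\[
\begin{array}{l|l|l}
\varphi & 0 < i \leq k + 1 & i = 0 \\ \hline
\mathbf{Y}\phi & \boldsymbol{Y\phi}(i) \Leftrightarrow \boldsymbol{\phi}(i-1) & \neg\boldsymbol{Y\phi}(0) \\
\mathbf{Z}\phi & \boldsymbol{Z\phi}(i) \Leftrightarrow \boldsymbol{\phi}(i-1) & \boldsymbol{Z\phi}(0) \\
\phi\,\mathbf{S}\,\psi & (\boldsymbol{\phi S \psi})(i) \Leftrightarrow (\boldsymbol{\psi}(i) \vee (\boldsymbol{\phi}(i) \wedge (\boldsymbol{\phi S \psi})(i-1))) & (\boldsymbol{\phi S \psi})(0) \Leftrightarrow \boldsymbol{\psi}(0) \\
\phi\,\mathbf{T}\,\psi & (\boldsymbol{\phi T \psi})(i) \Leftrightarrow (\boldsymbol{\psi}(i) \wedge (\boldsymbol{\phi}(i) \vee (\boldsymbol{\phi T \psi})(i-1))) & (\boldsymbol{\phi T \psi})(0) \Leftrightarrow \boldsymbol{\psi}(0)
\end{array}
\]
Last state constraints define an equivalence between the truth values at $k + 1$ and those at the instant indicated by $loop$, since the instant $k + 1$ is representative of the instant $loop$ along periodic paths. Otherwise, truth values at $k + 1$ are trivially false. These constraints have a structure similar to the corresponding Boolean ones, but here they are defined by only one DL constraint, for each subformula $\varphi$ of $\Phi$, w.r.t. the variable $loop$:
\[
\left( \bigwedge_{i=1}^{k} \left( loop = i \Rightarrow (\boldsymbol{\varphi}(k+1) \Leftrightarrow \boldsymbol{\varphi}(i)) \right) \right) \wedge \left( \left( \bigwedge_{i=1}^{k} \neg(loop = i) \right) \Rightarrow \neg\boldsymbol{\varphi}(k+1) \right).
\]
Note that if a loop does not exist, then the fixpoint semantics of $\mathbf{R}$ is exactly the one defined over finite acyclic paths in Section III. Finally, to correctly define the semantics of $\mathbf{U}$ and $\mathbf{R}$, their eventualities have to be accounted for. Briefly, if $\phi\,\mathbf{U}\,\psi$ holds at $i$, then $\psi$ eventually holds at some $j \geq i$; if $\phi\,\mathbf{R}\,\psi$ does not hold at $i$, then $\psi$ eventually does not hold at some $j \geq i$. Along finite paths of length $k$, eventualities must hold between $0$ and $k$. If a loop exists, an eventuality may hold within the loop. The original Boolean encoding introduces $k$ propositional variables for each $\phi\,\mathbf{U}\,\psi$ and $\phi\,\mathbf{R}\,\psi$ subformula of $\Phi$ (one per instant up to $k$), which represent the eventuality of $\psi$ implicit in the formula. The interested reader should consult [10]. In the QF-UFIDL encoding, instead, only one variable $j_\psi \in \mathbb{Z}$ is introduced for each $\psi$ occurring in a subformula $\phi\,\mathbf{U}\,\psi$ or $\phi\,\mathbf{R}\,\psi$.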
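To see how the fixpoint, last state and eventuality constraints interact for the until operator, the following Python sketch evaluates $\phi\,\mathbf{U}\,\psi$ explicitly on a bounded path. It is an explicit evaluation written for illustration, not the QF-UFIDL encoding itself; the function names are hypothetical, and a loop is given as an integer between 1 and k, or `None` for an acyclic path.

```python
def until_on_bounded_path(phi, psi, k, loop=None):
    """Truth of (phi U psi) at instants 0..k+1 of a bounded path.

    phi, psi: lists of booleans for instants 0..k.
    loop:     None for an acyclic path; otherwise an integer in 1..k,
              meaning that instant k+1 stands for instant `loop`.
    """
    last = False
    if loop is not None:
        # Truth at instant `loop`, obtained by walking once around the loop:
        # psi must be met before phi fails.
        for j in range(loop, k + 1):
            if psi[j]:
                last = True
                break
            if not phi[j]:
                break
    u = [False] * (k + 2)
    u[k + 1] = last
    for i in range(k, -1, -1):  # fixpoint unrolling, backwards in time
        u[i] = psi[i] or (phi[i] and u[i + 1])
    return u


def satisfies_until_constraints(u, phi, psi, k, loop=None):
    """Check a candidate assignment u against the three kinds of constraints."""
    fixpoint = all(u[i] == (psi[i] or (phi[i] and u[i + 1])) for i in range(k + 1))
    if loop is None:
        return fixpoint and not u[k + 1]
    last_state = u[k + 1] == u[loop]
    # Eventuality: if the formula holds at k, psi holds somewhere in loop..k.
    eventuality = (not u[k]) or any(psi[j] for j in range(loop, k + 1))
    return fixpoint and last_state and eventuality
```

If `phi` is true everywhere and `psi` nowhere, the all-true assignment satisfies the fixpoint and last state constraints on a looping path, and only the eventuality check rejects it; the witness instant found by `any(...)` plays the role of $j_\psi$.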
The eventuality constraints for each such subformula $\varphi$ are:
\[
\begin{array}{l|l}
\varphi & \text{Base} \\ \hline
\phi\,\mathbf{U}\,\psi & \left( \bigvee_{i=1}^{k} loop = i \right) \Rightarrow \left( \boldsymbol{\varphi}(k) \Rightarrow loop \leq j_\psi \leq k \wedge \boldsymbol{\psi}(j_\psi) \right) \\
\phi\,\mathbf{R}\,\psi & \left( \bigvee_{i=1}^{k} loop = i \right) \Rightarrow \left( \neg\boldsymbol{\varphi}(k) \Rightarrow loop \leq j_\psi \leq k \wedge \neg\boldsymbol{\psi}(j_\psi) \right)
\end{array}
\]
The complete encoding of $\Phi$ consists of the logical conjunction of all the above components, together with $\Phi$ evaluated at the first instant along the time structure.

Let $\Phi$ be a pure propositional formula, i.e., a PLTLB formula; then we can compare the size of the SAT-based encoding with that of the QF-UFIDL one. If $m$ is the total number of subformulae and $n$ is the total number of temporal operators $\mathbf{U}$ and $\mathbf{R}$ occurring in $\Phi$, then the SAT-based encoding requires $(2k + 3) + (k + 2)m + (k + 1)n = O(k(m + n))$ fresh propositional variables. The QF-UFIDL encoding, instead, requires only $n + 1$ integer variables ($loop$ and the $j_\psi$) and $m$ unary predicates (one for each subformula).

V. CASE STUDY
To demonstrate our methodology, we use an example concerning two existing conversational services available on the Internet. These two services realize two lyric search engines. One is called ChartLyrics, the other LyricWiki (http://lyrics.wikia.com/Main_Page).

Figure 3. A subset of the behavior protocol automaton of LyricWiki. Operations: searchSongs (sS), checkSongExists (cSE), searchArtists (sA), getArtist (gA), getSong (gS). Parameters: artist (0), song (1), lyricsId (2), item (3), lyricCheckSum (4), SongUrl (5), year (6), album (7), LyricCorrectUrl (8), Lyrics (9), lyricText (10). Transition labels (operation(inputs):outputs) include: cSE(1;0):2;4, sS(0;1):1;0, sA(10):0, gA(0):0;3;5;6;7, gS(2;4;0;1):8;9;1;0.
ChartLyrics is a lyrics database sorted by artists or songs. The WSDL of ChartLyrics (http://api.chartlyrics.com/apiv1.asmx?WSDL) provides three operations: (i) SearchLyric, to search available lyrics; (ii) SearchLyricText, to search a song by means of some text within an available lyric text; and (iii) GetLyric, to retrieve the searched lyric.

LyricWiki is a free site where anyone can go to get reliable lyrics for any song from any artist. The WSDL of LyricWiki (http://lyrics.wikia.com/server.php?wsdl) provides several operations. Five of them are of interest for our purposes: (i) searchSongs, to search for a possible song on LyricWiki and get up to ten close matches; (ii) checkSongExists, to check if a song exists in the LyricWiki database; (iii) getSong, to get the lyrics for a searched LyricWiki song with the exact artist and song match; (iv) searchArtists, to search for a possible artist by name and return up to ten close matches; and (v) getArtist, to get the entire discography for a searched artist.

To get a lyric through ChartLyrics, a client can exploit the following sequence of operation invocations: SearchLyric, GetLyric. Conversely, to get a lyric through LyricWiki, a possible sequence of operation invocations is the following: checkSongExists, searchSongs, getSong (see the representation of the conversational protocols of ChartLyrics and LyricWiki in Fig. 1 and Fig. 3, respectively).

If LyricWiki were part of a web application realized through a service composition, it could happen that, in certain circumstances, it would need to be replaced by ChartLyrics or by any other specialized search engine. This could happen, for instance, to accommodate the preferences of users having their preferred engine, or to handle the cases when LyricWiki is unavailable for any reason. The developer could code, by hand, the instructions to deal with any possible engine and its replacement. However, this approach does not allow the application to deal with search engines unknown at design time. A better solution, which would overcome this problem, is to build a mapping mechanism that dynamically handles the mismatches by automatically synthesizing a behavior protocol mapping script. The adaptation realized by the synthesized mapping script could state, e.g., that the sequence of LyricWiki operations checkSongExists, searchSongs, getSong maps to the sequence of ChartLyrics operations SearchLyric, GetLyric.

Let us consider as an example the expected service operation sequence checkSongExists, searchSongs, getSong, which brings the LyricWiki behavior protocol automaton from state start to state s (see Fig. 3). We assume that a compatibility relation between the services' data has been established. Also, for the sake of brevity, the automata of Fig. 1 and 3 are represented with this relation already established, though in practice this requires an additional mapping step (for more details see [6], [4]). Finally, we establish a state compatibility relation. This defines that state s of the expected service is compatible with state end of the actual service, which means that if the expected service reaches state s, then the actual service should reach state end. The example expected operation sequence starts from the start state and leads the behavior protocol model into state s.

The automata describing the service protocols, the state compatibility relation and the expected service operation sequence are all formalized through suitable CLTLB(DL) formulae expectedService, actualService and expectedOperationSequence. Then, we formulate the problem of checking whether the expected service can be substituted by the actual service in terms of a bounded reachability problem over the automata describing the protocols of the expected and actual services. The problem consists in searching for a finite operation sequence on the actual service automaton which starts (resp. ends) in a state compatible with the start (resp. end) state of the expected service operation sequence. Moreover, the actual service operation sequence should require no more input parameters than those provided to the expected service sequence, and it should provide at least the same parameters provided by the expected service sequence.
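The search problem just stated can be made concrete with a deliberately simplified Python sketch. It replaces the CLTLB(DL) formulation with an explicit breadth-first search and tracks data with sets instead of counters, so multiplicities are ignored. The function name, the toy transition system `TOY_LTS` and the state name `searched` are hypothetical and only loosely modeled on Fig. 1.

```python
from collections import deque


def find_actual_sequence(transitions, start, target, provided, required, max_len):
    """Breadth-first search for an actual-service operation sequence.

    transitions: dict state -> list of (operation, inputs, outputs, next_state)
    provided:    set of data types the client supplies to the expected sequence
    required:    set of data types the client expects to receive
    max_len:     bound on the length of the sequence (the role played by k)
    """
    queue = deque([(start, (), frozenset())])
    while queue:
        state, ops, produced = queue.popleft()
        # Accept when the target state is reached and all outputs are covered.
        if state == target and required <= produced:
            return list(ops)
        if len(ops) == max_len:
            continue
        for op, inputs, outputs, nxt in transitions.get(state, []):
            # An operation is enabled only if the client provided all its inputs.
            if set(inputs) <= provided:
                queue.append((nxt, ops + (op,), produced | frozenset(outputs)))
    return None


# Hypothetical toy LTS, not a faithful copy of the ChartLyrics protocol.
TOY_LTS = {
    "start": [
        ("SearchLyric", ["song", "artist"], ["lyricsId", "lyricCheckSum"], "searched"),
        ("SearchLyricText", ["lyricText"], ["lyricsId", "lyricCheckSum"], "searched"),
    ],
    "searched": [
        ("GetLyric", ["lyricsId", "lyricCheckSum"], ["Lyric"], "end"),
    ],
}
```

With `provided = {"song", "artist", "lyricsId", "lyricCheckSum"}` and `required = {"Lyric"}`, the search on `TOY_LTS` from `start` to `end` selects `SearchLyric` followed by `GetLyric`, since `SearchLyricText` needs an input the client never supplies.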
To ensure that the actual sequence requires no more inputs and provides at least the same outputs, we keep track, through instances of the counters seen and needed (see Section II), of how many parameters of any given kind are provided as input to the expected service operations and of how many parameters of any given kind are returned by each actual service operation (this is formalized through suitable CLTLB(DL) formulae seen and needed). Finally, a solution for the bounded reachability problem can be obtained by checking the satisfiability of the CLTLB(DL) formula expectedService $\wedge$ actualService $\wedge$ expectedOperationSequence $\wedge$ seen $\wedge$ needed.

Considering the example sequence on LyricWiki, a client expecting to invoke this sequence assumes that it will provide a song and an artist as input to the first operation of the sequence. This will set the seen counter to 1 for both provided inputs. Moreover, it expects the invoked operation to return a lyricsId and a lyricCheckSum, which will increment the corresponding instances of the needed counter to 1. Considering the actual service protocol, our approach searches for an operation accepting a subset of the provided input data and providing a superset of the required return data.

The operation to be selected should leave the start state, as the state compatibility relation provided as input to the approach mandates the compatibility of state start of LyricWiki with state start of ChartLyrics. In our example the invocation of checkSongExists makes SearchLyric the only suitable candidate. After the invocation of this actual service operation all instances of seen, and those instances of needed associated with the output parameters of checkSongExists, are reset to 0. The actual service operation also returns some extra data that are not required by the invoked expected service operation (i.e., song, artist, songRank, artistUrl, songUrl). In this case the reasoning mechanism offers two possible choices: extra data can be discarded (hence ignored also in the future), or they can be initially ignored, but stored for possible later use. The former strategy is more conservative, but it may also limit the possibility for the reasoning mechanism to find an adapter. The latter strategy may affect data consistency in some cases, as it allows using as a reply for an operation some data that have been received before the request has actually been issued, but it also opens the possibility of finding adapters in situations in which the former would fail. In this case study we use the latter strategy, hence the needed counters for those data that are not required as a response by the invoked expected service operation are set to $-1$.

After the invocation of SearchLyric the actual service goes to the SearchLyric start state. The next operation to be invoked in the expected sequence is searchSongs, which requires as input the names of the song to be searched and of its author, and provides as return parameters the names of the artist and of the song, if they are found. Since the needed counters for both the name of the artist and of the song are set to $-1$, instances of those data have been previously stored, hence no operation needs to be invoked on the actual service, which remains in state SearchLyric start.

The last operation in the expected sequence is getSong, which requires as input the artist and song names and the id and checkSum returned by the previously invoked checkSongExists. The expected service has again the same three operations of the previous step available, but this time there are two available candidates for selection: searchSongs and GetLyric. In this situation the latter is selected, because of the state compatibility relation provided as input to the adapter search phase. Given the data-flow constraints elicited before, GetLyric is the only available operation that can also satisfy the state compatibility relation. After the invocation of GetLyric the expected and actual services are in compatible states and the needed counter instances are all set to 0. Then, the actual service operation sequence found can be substituted for the expected service sequence.

A mapping script generated for the example sequence in this section is reported in Table I. Each step contains the state in which each of the analyzed automata is, the operations in $seq_{exp}$ and in $seq_{act}$ that should be invoked in that step, and the exchanged data, if any. For each operation in $seq_{exp}$ the adapter expects to receive an invocation for the expected service, and for each operation in $seq_{act}$ the adapter performs an invocation on the actual service. The table also shows the updates to the seen and needed counters.

VI. EVALUATION AND EXPERIMENTAL RESULTS
In order to evaluate the encoding presented in this paper we built a plug-in of Zot and we used it in three sets of experiments: (i) We created adapters for sequences of increasing length related to the case study presented in Section V. This set of experiments was used as a qualitative evaluation of the approach on examples taken from the real world. (ii) We ran the same set of experiments on Zot using three different encodings, namely the traditional SAT-based encoding (PLTL/SAT), the new SMT-based one of the same logic (PLTL/SMT), and the SMT-based encoding of the logic CLTLB(DL) introduced in this paper. We measured elapsed time and occupied memory, and we compared the results to get an estimate of how much the introduction of the SMT-solver speeds up the adapter-building mechanism. (iii) We created some service interface models with a growing number of parameters and tried to solve them with both the original version of the encoding and with the extensions. The purpose of this set of experiments is to assess how well the new encoding scales on models larger than those found in common practice.

All experiments were run using the Common Lisp compiler SBCL 1.0.29.11 on a 2.50GHz Core2 Duo laptop with Linux and 4 GB RAM. We chose to use two different SMT-solvers in our tests: Microsoft Z3 and SRI Yices. For the SAT-based PLTL encoding we used MiniSat.

The first set of experiments was carried out by selecting some operation sequences on the expected service presented in Section V. The set of selected sequences comprises the simple sequence analyzed in the case study plus sequences of growing length obtained by trying to execute up to 5 consecutive searchSongs and checkSongExists operations. We set the time bounds for the experiments using a simple heuristic, based on the sum of the states of the automata of the input services. In those cases in which the abstract sequence featured repeated invocations of the same operation, the time bound was augmented with the number of repetitions of each operation.
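The time-bound heuristic described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in the Zot plug-in: the function name time_bound, the state counts used in the usage lines, and the reading of "number of repetitions" as the invocations beyond the first occurrence of each operation are all hypothetical choices made here.

```python
from collections import Counter


def time_bound(expected_states, actual_states, expected_sequence):
    """Return a hypothetical BMC time bound for one adapter search."""
    # Base bound: sum of the numbers of states of the two input automata.
    base = expected_states + actual_states
    # Assumption: each invocation beyond the first of an operation
    # counts as one repetition and adds one time instant to the bound.
    counts = Counter(expected_sequence)
    repetitions = sum(k - 1 for k in counts.values())
    return base + repetitions


# Hypothetical state counts (4 and 3); the text does not list them.
single = ["checkSongExists", "searchSongs", "getSong"]
repeated = ["checkSongExists", "searchSongs", "searchSongs", "searchSongs"]
print(time_bound(4, 3, single))
print(time_bound(4, 3, repeated))
```

Under this reading, a sequence without repeated operations gets exactly the sum of the state counts, and each repeated invocation lengthens the bound by one instant. Other readings of the heuristic (for instance, adding the full invocation count of every repeated operation) would only change the repetitions term.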
This set of experiments produced a set of mapping scripts that we checked by inspection. Fig. 4(a) and Fig. 4(b) report the overall results. Fig. 4(b) shows that the CLTLB(DL) encoding has lower memory occupation than the SAT-based PLTL encoding for the same problem. Fig. 4(a) shows that the CLTLB(DL) encoding on Z3 performs much better than the others.

Lastly, we tried to push the limits of our technique to check its robustness. To do so, we generated simple service protocols featuring operations with a growing number of parameters. We chose this setting for our experiments based on our experience of common practice, which suggests that services usually exhibit very simple protocols, while operations sometimes have a considerable number of parameters. Note that the models used in these experiments are much bigger than those commonly found in practice. The experiments are based on expected and actual services with 10 states, and a trace bound of 21 time instants. The results are shown in Fig. 4(c) and Fig. 4(d). The number of parameters used in the experiments ranges from 10 (i.e., each operation has 10 input and 10 output parameters) to 90. As shown in the figures, the CLTLB(DL) encoding on Z3 was the only one we managed to push up to 90 parameters, while we stopped experimenting much earlier with the PLTL encoding on Yices, Z3 and MiniSat. Note that in Fig. 4(c)-4(d) the combination CLTLB(DL)/Yices is missing because of its poor performance on this set of experiments (the simplest case took more than 500 seconds to solve).

Footnotes:
The experiment sets are available at http://home.dei.polimi.it/cavallaro/sefm10-experiments.html
Z3: http://research.microsoft.com/en-us/um/redmond/projects/z3/
Yices: http://yices.csl.sri.com/
MiniSat: http://minisat.se/

Table I. Mapping script generated for the example in this section.

Step | Execution trace content | Counters value
1 | LyricWiki State: start; LyricWiki Operation: checkSongExists; LyricWiki Input: song, artist; LyricWiki Output: lyricId, lyricCheckSum; chartLyrics State: start; LyricWiki Operation: checkSongExists | All counters set to 0
2 | LyricWiki State: s; chartLyrics Input: song, artist; chartLyrics Output: song, artist, artistUrl, songRank, lyricsId, lyricChecksum; chartLyrics State: start; chartLyrics Operation: searchLyric | seen(song) = seen(artist) = 1; needed(lyricId) = needed(lyricCheckSum) = 1
3 | LyricWiki State: s; LyricWiki Operation: searchSongs; LyricWiki Input: song, artist; LyricWiki Output: song, artist; chartLyrics State: searchLyric start | seen(song) = seen(artist) = 0; needed(lyricsId) = needed(lyricCheckSum) = 0; needed(artist) = needed(artistUrl) = -1; needed(song) = needed(songRank) = -1
4 | LyricWiki State: s; chartLyrics State: searchLyric start; chartLyrics Operation: None | seen(song) = seen(artist) = 1; needed(song) = needed(artist) = 0
5 | LyricWiki State: s; LyricWiki Operation: getSong; LyricWiki Input: lyricId, song, lyricCheckSum, artist; LyricWiki Output: song, artist, lyricCorrectUrl, Lyric; chartLyrics State: searchLyric start | No changes
6 | LyricWiki State: s; chartLyrics Input: lyricId, lyricCheckSum; chartLyrics Output: song, artist, artistUrl, lyricRank, Lyric, lyricCorrectUrl, lyricCoverArtUrl; chartLyrics State: searchLyric start; chartLyrics Operation: getLyric | seen(song) = seen(artist) = 2; seen(lyricCheckSum) = seen(lyricId) = 1; needed(song) = needed(artist) = 1; needed(lyricCorrectUrl) = needed(Lyric) = 1
7 | LyricWiki State: s; LyricWiki Operation: None; chartLyrics State: end; chartLyrics Operation: None | seen(lyricCheckSum) = seen(lyricId) = 0; needed(song) = needed(artist) = 0; needed(lyricCorrectUrl) = needed(Lyric) = 0; needed(artistUrl) = needed(lyricRank) = -1

VII. RELATED WORK
Our approach is closely related both to works supporting substitution of services and to works about verification using model checking. Many approaches that support the automatic generation of adapters (or equivalent mechanisms) are based on the use of ontologies and focus on non-conversational services (see for instance [6], [13]). They all assume that the usual WSDL definition of a service interface is enriched with some kinds of ontological annotations. At run-time, when a service bound to a composition needs to be substituted, a software agent generates a mapping by parsing such ontological annotations.
SCIROCO [14] offers similar features but focuses on stateful services. It requires all services to be annotated with both a SAWSDL description and a WS-ResourceProperties [15] document, which represents the state of the service. When an invoked service becomes unavailable, SCIROCO exploits the SAWSDL annotations to find a set of candidates that expose a semantically matching interface. Then, the WS-ResourceProperties document associated with each candidate service is analyzed to find out whether it is possible to bring the candidate into a state that is compatible with the state of the unavailable service. If this is possible, then this service is selected to replace the one that is unavailable. All three of these approaches offer full run-time automation for service substitution, but as the services they consider are not conversational, they perform the mapping on a per-operation basis.

An approach that generates adapters covering the case of interaction protocol mismatches is presented in [16]. It starts from a service composition and a service behavioral description, both written in the BPEL language [17]. These are then translated into the YAWL formal language [18] and matched in order to identify an invocation trace in the service behavioral description that matches the one expected by the service composition. The matching algorithm is based on graph exploration and considers both control flow and data flow requirements. The approach presented in [19] offers similar features and has been implemented in an open source tool. While both these approaches appear to fulfill our need for supporting interaction protocol mapping, they present some shortcomings in terms of performance, as shown in [3].

Figure 4. Experimental Results. (a) Elapsed times on the second set of experiments; (b) Memory occupations on the second set of experiments; (c) Elapsed times on the third set of experiments; (d) Memory occupations on the third set of experiments. Panels (a) and (b) plot elapsed time (seconds) and occupied memory (MBytes) per CS-sequence for z3(CLTLB), yices(CLTLB), yices(PLTL/SMT), z3(PLTL/SMT) and MiniSat(PLTL/SAT); panels (c) and (d) plot the same measures against the number of parameters (10 - 100 parameters, 10 states, HL = 21) for z3(CLTLB), z3(PLTL/SMT), yices(PLTL/SMT) and MiniSat(PLTL/SAT).

Although QF-UFIDL involves variables over infinite domains, our particular BMC of CLTLB(DL) formulae became effective because it is not used as an infinite-state model checking procedure. In general, transition systems defined by arithmetic constraints provide a large class of infinite-state systems which are suitable for modeling a large variety of applications.
So, intensive work has been devoted to identifying useful classes with decidable reachability and safety properties [20], [21]. Some implemented procedures [22], [23] rely on a purely operational approach, and the complexity of the decision problem of the underlying arithmetic (3-EXPTIME in the case of Presburger Logic) does not make them appropriate for runtime checking. Much effort is also devoted to studying the decidability and complexity of temporal logics of arithmetic constraints [24], [25], [7], [8]. [26] proposes a semi-decision procedure intended for model checking of an extension of CTL* with Presburger constraints. Finally, an operational approach to BMC which exploits a direct translation of LTL formulae of arithmetic constraints is suggested in [27]. Our approach offers a mixed operational-descriptive BMC based on the satisfiability of CLTLB(DL) formulae, which benefits from the NP-completeness of the decision problem of DL, a complexity significantly lower than that of more complex theories.

Footnote: The Dinapter tool: http://sourceforge.net/projects/dinapter

VIII. CONCLUSION
In this paper we introduced an efficient encoding for a linear temporal logic with arithmetic constraints. Our encoding proved well suited to a real problem taken from the SOA domain and showed better performance and lower memory occupation than the other encodings we compared it with. The research work is still ongoing. For future work we plan to further experiment with our encoding and to investigate its theoretical properties.

ACKNOWLEDGMENTS
Many thanks to Elisabetta Di Nitto, Angelo Morzenti, and Pierluigi San Pietro for the fruitful discussions and their support. This research has been partially funded by the European Commission, Programme IDEAS-ERC, Project 227977-SMScom, and by the Italian Government under the project PRIN 2007 D-ASAP (2007XKEHFA).

REFERENCES

[1] V. De Antonellis, M. Melchiori, L. De Santis, M. Mecella, E. Mussi, B. Pernici, and P. Plebani, “A layered architecture for flexible web service invocation,” Softw. Pract. Exper., vol. 36, no. 2, pp. 191–223, 2006.
[2] K. Verma, K. Gomadam, A. Sheth, J. Miller, and Z. Wu, “The METEOR-S approach for configuring and executing dynamic web processes,” LSDIS Lab, University of Georgia, Athens, Georgia, Tech. Rep. LSDIS Technical Report 05-001, 2005.
[3] L. Cavallaro, E. Di Nitto, and M. Pradella, “An automatic approach to enable replacement of conversational services,” in ICSOC/ServiceWave, 2009, pp. 159–174.
[4] L. Cavallaro, E. Di Nitto, P. Pelliccione, M. Pradella, and M. Tivoli, “Synthesizing adapters for conversational web-services from their WSDL interface,” in SEAMS ’10. ACM, 2010.
[5] S. Ranise and C. Tinelli, “The SMT-LIB standard: Version 1.2,” Tech. Rep., 2006, http://combination.cs.uiowa.edu/smtlib/.
[6] L. Cavallaro, G. Ripa, and M. Zuccalà, “Adapting service requests to actual service interfaces through semantic annotations,” in Proceedings of PESOS, 2009.
[7] M. M. Bersani, A. Frigeri, M. Pradella, M. Rossi, A. Morzenti, and P. San Pietro, “SMT-based Bounded Model Checking with Difference Logic Constraints,” http://arXiv:submit/0018237, Politecnico di Milano, Tech. Rep., 2010.
[8] H. Comon and V. Cortier, “Flatness Is Not a Weakness,” in CSL, 2000, pp. 262–276.
[9] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Y. Zhu, “Bounded model checking,” Advances in Computers, vol. 58, pp. 118–149, 2003.
[10] A. Biere, K. Heljanko, T. A. Junttila, T. Latvala, and V. Schuppan, “Linear encodings of bounded LTL model checking,” Logical Methods in Computer Science, vol. 2, no. 5, 2006.
[11] M. Pradella, A. Morzenti, and P. San Pietro, “The symmetry of the past and of the future: bi-infinite time in the verification of temporal properties,” in ESEC/SIGSOFT FSE, 2007, pp. 312–320.
[12] M. Pradella, A. Morzenti, and P. San Pietro, “A metric encoding for bounded model checking,” in FM 2009: Formal Methods, ser. Lecture Notes in Computer Science, A. Cavalcanti and D. Dams, Eds., vol. 5850. Springer, 2009, pp. 741–756.
[13] C. Drumm, “Improving schema mapping by exploiting domain knowledge,” Ph.D. dissertation, Universität Karlsruhe, Fakultät für Informatik, 2008.
[14] M. Fredj, N. Georgantas, V. Issarny, and A. Zarras, “Dynamic service substitution in service-oriented architectures,” in Proceedings of SERVICES, 2008.
[15] T. Schaeck and R. Thompson, “WS-ResourceProperties,” http://docs.oasis-open.org/wsrp/Misc/, 2003.
[16] A. Brogi and R. Popescu, “Automated generation of BPEL adapters,” in Proceedings of ICSOC, 2006.
[17] OASIS, “Web Services Business Process Execution Language Version 2.0,” http://docs.oasis-open.org/wsbpel/2.0/wsbpel-v2.0.pdf, 2007.
[18] W. M. P. van der Aalst and A. H. M. ter Hofstede, “YAWL: yet another workflow language,” Inf. Syst., vol. 30, no. 4, pp. 245–275, 2005.
[19] J. A. Martín and E. Pimentel, “Automatic generation of adaptation contracts,” in Proceedings of FOCLASA, 2008.
[20] L. Fribourg and H. Olsén, “Proving Safety Properties of Infinite State Systems by Compilation into Presburger Arithmetic,” in CONCUR, 1997, pp. 213–227.
[21] H. Comon and Y. Jurski, “Multiple Counters Automata, Safety Analysis and Presburger Arithmetic,” in CAV, 1998, pp. 268–279.
[22] B. Boigelot, “Symbolic methods for exploring infinite state spaces,” Ph.D. dissertation, Université de Liège, 1998.
[23] A. Annichini, A. Bouajjani, and M. Sighireanu, “TReX: A Tool for Reachability Analysis of Complex Systems,” in CAV, 2001, pp. 368–372.
[24] S. Demri, “LTL over Integer Periodicity Constraints: (Extended Abstract),” in FoSSaCS, 2004, pp. 121–135.
[25] S. Demri and D. D’Souza, “An automata-theoretic approach to constraint LTL,” in FSTTCS, 2002, pp. 121–132.
[26] S. Demri, A. Finkel, V. Goranko, and G. van Drimmelen, “Towards a Model-Checker for Counter Systems,” in ATVA, 2006, pp. 493–507.
[27] L. M. de Moura, H. Rueß, and M. Sorea, “Lazy theorem proving for bounded model checking over infinite domains,” in CADE, 2002, pp. 438–455.