A small-step approach to multi-trace checking against interactions
Erwan Mahe, Boutheina Bannour, Christophe Gaston, Arnault Lapitre, Pascale Le Gall
A small-step approach to multi-trace checking against interactions (long version)
Erwan Mahe, Boutheina Bannour, Christophe Gaston, Arnault Lapitre, and Pascale Le Gall
Laboratoire de Mathématiques et Informatique pour la Complexité et les Systèmes, CentraleSupélec - Université de Paris-Saclay, 9 rue Joliot-Curie, F-91192 Gif-sur-Yvette Cedex
CEA, LIST, Laboratory of Systems Requirements and Conformity Engineering, P.C. 174, Gif-sur-Yvette, 91191, France
Abstract.
Interaction models describe the exchange of messages between the different components of distributed systems. We have previously defined a small-step operational semantics for interaction models. This paper extends this work by presenting an approach for checking the validity of multi-traces against interaction models. A multi-trace is a collection of traces (sequences of emissions and receptions), each representing a local view of the same global execution of the distributed system. We have formally proven our approach, studied its complexity, and implemented it in a prototype tool. Finally, we discuss some observability issues when testing distributed systems via the analysis of multi-traces.
Keywords: interaction · small-step operational semantics · multi-trace analysis · distributed system
Context.
A distributed system (DS) can be viewed as a collection of sub-systems, which are distributed over distinct physical locations and which communicate with each other by exchanging messages [11]. Analyzing the executions of DSs is a key problem to assess their correctness. However, the distributed nature of observations complicates the investigation of bugs and undesirable behaviors. The absence of a global clock makes the classical notion of trace often too strong to represent DS executions. Indeed, a trace fully orders all events occurring in it, while ordering events occurring on remote locations is often impossible. Therefore, multi-traces are better suited to model executions of DSs. A multi-trace is a collection of traces, one per sub-system, each representing the sequence of actions - emissions and receptions of messages - that have been observed at the interface of its sub-system. Contrary to traces, multi-traces do not strongly constrain orderings between actions occurring on different sub-systems. Our work is related to the general problem of the automatic analysis and debugging of DSs based on local logging of traces [15,4,16,13,2]. We are positioned at the intersection of two main issues: (1) that of tracking the causality of actions in traces [15,13] based on the happened-before relation of Lamport [11] and (2) that of checking multi-traces against formal properties [2] or models [16,4].
Contribution.
In a model-based approach, we ground our analysis on models of interactions as the reference of intended DS executions. Models of this kind, which include UML Sequence Diagrams [18], Message Sequence Charts [9] and BPMN Choreographies [17] among others, are widely used to specify DSs. In such models, DS executions are thought of as a coordination of message exchanges between multiple sub-systems. This allows for a specification from a global perspective. We consider interaction models where the execution units are actions (the same as those constituting traces) and can be combined using operators of sequencing, choice, repetition and parallelism. In a previous work [14], we have proposed a small-step operational semantics for interactions, backed by an equivalent algebraic denotational semantics. This paper presents an approach to check the validity of multi-traces against interaction models. Validity refers to the notion of being an accepted multi-trace, intuitively reflecting the fact of fully realizing one of the behaviours prescribed by the reference interaction model, taking into account that interaction models can be non-deterministic. We prove the correctness and discuss the complexity class of our method for analyzing multi-traces w.r.t. interaction model semantics. Moreover, we discuss observability issues arising when testing distributed systems via the analysis of multi-traces.
As part of our contribution, we also developed a prototype implementing the small-step semantics and the multi-trace analysis. This tool is able to render graphical representations detailing the steps taken by the analysis. Images of interactions in this paper were adapted from its outputs.
Related work.
Interaction models have been extensively used to validate DSs using Test Case generation [5,3,12]. Much effort is spent on the generation of local test cases to mitigate the following problems: (1) "observability", i.e. the difficulty in inferring global executions from partial visions of message exchanges, and (2) "controllability", i.e. the difficulty in determining when to apply stimuli in order to realize a targeted global execution. Our work, however, falls within another domain, Passive Testing [2,16] (in which testers are only observers), and discusses other problems such as the Test Oracle Problem [6] (determining expected outputs w.r.t. given stimuli). Both works [2,16] have proposed approaches to check a set of local logs recorded in Service Oriented Systems. The authors of [2] propose a methodology to verify the conservation of invariants during the execution of the system. Both local and global invariants can be checked, although the latter is more costly in terms of computations. Our approach is different in that the reference for the analysis is not a logical property but a model of interaction as in [16,6]. [16] discusses passive testing against choreography models expressed in the language Chor [19]. It differs however from our approach in so far as: (1) Chor is less expressive than the interaction language we propose (particularly w.r.t. the absence of weak sequencing and the nature of loops), (2) [16] only handles synchronous communication between services, which cannot always describe accurately concrete implementations, and (3) the local logs are not directly checked against the model but first pass through a synthesis step in which a global log is reconstituted, and then this global log is checked. The authors of [6] investigate the computational cost of log analysis w.r.t. graphs of MSCs.
This cost is compared in different cases according to the quality of observations (local or tester observability, i.e. whether one has a set of independent local logs or a globally ordered log) and the expressivity of the MSC graphs (presence of choice, loop or parallelism). The work echoes results for "MSC Membership" [1,8] which state that this problem is NP-complete. The main factor of the cost blow-up lies in the fact that distributed actions can be equally re-ordered in multiple ways. Our work is in the lineage of such research but we rather consider richer interaction models (asynchronous communications, weak sequencing, no enforced fork-join, ...). As such, our language is closer to the appealing expressiveness of UML Sequence Diagrams. We therefore expect higher computational costs. Nevertheless, by applying a small-step semantics guided by the reading of the multi-trace, only pertinent parts of the search space are explored.
Plan.
This paper is organized as follows. Sec.2 presents (multi-)traces and the concrete syntax of our interaction language. Sec.3 describes how interaction terms can be rewritten so as to define a small-step semantics in the form of accepted traces or multi-traces. Sec.4 presents our multi-trace analysis as well as some theoretical properties (termination, membership characterization, NP-hardness) and discusses a possible extension of our approach to take into account observability problems. Finally, Sec.5 concludes the paper.
Our goal is to analyse the validity of DS executions collected in the form of sets of local logs called multi-traces, w.r.t. a given interaction model. We now introduce the basic notions required to manipulate those concepts.

The description of a DS requires distinguishing between its distinct independent sub-systems and the different messages those sub-systems can exchange. In this paper, those sub-systems are abstracted as so-called lifelines (as in most interaction-based languages) and we note L the set of all lifelines and M the set of all messages. In the rest of the paper, L and M will be left implicit.

The basic building blocks of both (multi-)traces and interactions are actions. An action is either the emission or the reception of a message m from or towards a lifeline l, noted respectively l!m and l?m. We note the set of all actions Act = { l∆m | l ∈ L, ∆ ∈ {!, ?}, m ∈ M }. When L is reduced to a singleton {l}, we note Act(l). For an action act of the form l∆m, lf(act) denotes l.

A trace characterizes a given execution of a DS as a sequence of actions (from Act*), which appear in the order in which they occurred globally. For a set X, X* denotes the set of sequences of elements of X, with ε being the empty sequence and the dot notation (.) being the concatenation law.

Given L = {l_1, ..., l_n}, a multi-trace is a tuple of traces µ = (σ_1, ..., σ_n) where, for any j ∈ [1, n], σ_j ∈ Act(l_j)*. A multi-trace therefore describes the execution of a DS as the collection of traces locally observed on each sub-system. Multi-traces do not constrain orderings between actions occurring on different lifelines. We note Mult = ∏_{l ∈ L} Act(l)* the set of multi-traces.

We may use the projection operator proj from Def.1 to project any trace ς ∈ Act* into a multi-trace proj(ς) ∈ Mult.

Definition 1 (Trace Projection). proj : Act* → Mult is s.t.:
– proj(ε) = (ε, ..., ε)
– given j ∈ [1, n], act ∈ Act(l_j) and ς ∈ Act*, if proj(ς) = (σ_1, ..., σ_j, ..., σ_n) then proj(act.ς) = (σ_1, ..., act.σ_j, ..., σ_n).

For instance, if we consider the trace ς = a!m.c?m.c!m.d?m defined over M = {m, m} and L = {a, b, c, d} then: proj(ς) = (a!m, ε, c?m.c!m, d?m)

Fig. 1: Example interaction. Interaction term (left): seq(loop_seq(seq(strict(a!m, b?m), seq(alt(strict(b!m, c?m), ∅), b!m))), par(a!m, strict(c!m, a?m))); diagram representation (right).
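As a minimal sketch of Def.1, the projection can be written in Python as follows. The encoding of an action as a (lifeline, kind, message) tuple, the function name proj and the message names m1 and m2 are assumptions made for this illustration; they are not taken from the prototype tool.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed encoding: an action l!m or l?m is the tuple (l, "!", m) or (l, "?", m).
Action = Tuple[str, str, str]
MultiTrace = Tuple[Tuple[Action, ...], ...]


def proj(trace: Sequence[Action], lifelines: Sequence[str]) -> MultiTrace:
    """Project a global trace onto one local trace per lifeline (Def.1)."""
    local: Dict[str, List[Action]] = {l: [] for l in lifelines}
    for act in trace:
        # Each action goes to the component of its own lifeline, keeping
        # the relative order of the actions observed on that lifeline.
        local[act[0]].append(act)
    return tuple(tuple(local[l]) for l in lifelines)


# Hypothetical message names: this trace only mirrors the shape of the
# example in the text (one action on a, none on b, two on c, one on d).
trace = [("a", "!", "m1"), ("c", "?", "m1"), ("c", "!", "m2"), ("d", "?", "m2")]
mu = proj(trace, ["a", "b", "c", "d"])
```

The iterative loop appends actions in reading order, which yields the same tuple as the recursive definition that prepends act to the projection of the remaining trace.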
An interaction is a model which describes a DS by defining which actions it may express and which orderings between those actions are possible. As exemplified on the left of Fig.1, interactions are binary trees whose leaves are actions from Act. Precedence relations between actions at different leaf positions are then determined by the operators found in the inner nodes of the tree that separate those positions.
Definition 2 (Interactions).
The set Int of interactions is s.t.:
– ∅ ∈ Int and Act ⊂ Int,
– for (i_1, i_2) ∈ Int² and f ∈ {strict, seq, alt, par}, f(i_1, i_2) ∈ Int,
– for i ∈ Int and f ∈ {strict, seq, par}, loop_f(i) ∈ Int.

The empty interaction ∅ and any action of Act are basic interactions. seq(i_1, i_2) (weak sequencing) indicates that actions specified by i_1 must occur before those of i_2 iff they occur on the same lifeline. In contrast, strict(i_1, i_2) (strict sequencing) imposes that actions specified by i_1 must occur before those of i_2 in any case. par(i_1, i_2) allows actions from i_1 and i_2 to be fully interleaved while alt(i_1, i_2) (exclusive alternative) specifies that either actions specified by i_1 or by i_2 occur. As for the loop operators loop_f with f ∈ {seq, strict, par}, the index f indicates with which binary operator loop unrollings have to be composed: in other words, loop_f(i) is equivalent to the term alt(∅, f(i, loop_f(i))) (here we detail the choice between not unrolling (∅) and unrolling once).

Interactions can be illustrated by diagrams (cf. right part of Fig.1). Lifelines are depicted as vertical lines and actions l∆m as arrows carrying their specific message m and originating from or pointing towards their specific lifeline l. The passing of a message from a lifeline to another is modelled using the strict operator (e.g. strict(a!m, b?m) to denote the passing of m from a to b). In diagrams, a message passing is depicted as an arrow from source to target lifeline.

Fig. 2: Small example. Interaction term: seq(alt(strict(b!m, c?m), ∅), b!m)
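To make Def.2 and the loop unrolling equivalence concrete, here is a minimal Python sketch. The nested-tuple encoding of terms, the helper names is_interaction and unroll_once, and the message names m1 and m2 are assumptions of this illustration, not part of the formal development.

```python
# Assumed encoding of interaction terms as nested tuples:
#   ("empty",)             the empty interaction
#   ("act", l, kind, m)    the action l!m (kind "!") or l?m (kind "?")
#   (f, i1, i2)            f in {"strict", "seq", "alt", "par"}
#   ("loop", f, i1)        loop_f(i1) with f in {"strict", "seq", "par"}
EMPTY = ("empty",)
BINARY = {"strict", "seq", "alt", "par"}
SCHEDULING = {"strict", "seq", "par"}


def is_interaction(i) -> bool:
    """Check that a nested tuple is a well-formed term of Int (Def.2)."""
    if i == EMPTY:
        return True
    if not isinstance(i, tuple) or not i:
        return False
    if i[0] == "act":
        return len(i) == 4 and i[2] in ("!", "?")
    if i[0] in BINARY:
        return len(i) == 3 and is_interaction(i[1]) and is_interaction(i[2])
    if i[0] == "loop":
        return len(i) == 3 and i[1] in SCHEDULING and is_interaction(i[2])
    return False


def unroll_once(loop):
    """Rewrite loop_f(i) into the equivalent term alt(empty, f(i, loop_f(i)))."""
    tag, f, body = loop
    assert tag == "loop"
    return ("alt", EMPTY, (f, body, loop))


# Term of Fig.2, with hypothetical message names m1 and m2.
fig2 = ("seq",
        ("alt", ("strict", ("act", "b", "!", "m1"), ("act", "c", "?", "m1")), EMPTY),
        ("act", "b", "!", "m2"))
looped = ("loop", "seq", fig2)
unrolled = unroll_once(looped)
```

Tuples keep the sketch short and make structural equality of terms available for free; a production implementation would more likely use a dedicated algebraic data type.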
Let us consider the example from Fig.2 (a subterm of the one from Fig.1). Firstly, b can either send m to c or not send anything. This choice is modelled by the alt alternative operator. Secondly, b must send m to the environment. The implicit sequencing that we have described in natural language with the adverbs "firstly" and "secondly" is modelled by the seq weak sequencing operator, which, unlike the other operators that are drawn explicitly with boxes, is implicitly represented by the top to bottom direction.

The semantics of an interaction i can be defined as a set of global traces Accept(i) or (not equivalently) by a set of multi-traces AccMult(i). Fig.3 enumerates those semantics for the interaction from Fig.2. Let us note that an interleaving between b!m and c?m is noticeable in Accept(i) but is not in AccMult(i).

Accept(i) = { b!m.c?m.b!m, b!m.b!m.c?m, b!m }
AccMult(i) = { (b!m.b!m, c?m), (b!m, ε) }
Fig. 3: Enumerations of finite trace and multi-trace semantics for a simple example
So as to formally define the set of accepted (multi-)traces of an interaction i, we reformulate semantic rules $i \xrightarrow{act} i'$ from [14] without relying on some denotational counterpart (in particular, without referring to notions of precedence relations between actions, as in [10,14]). To do this, in Sec.3.1, we extract information statically from the term structure of interactions. This information is required to define, in Sec.3.2, the small-step interaction execution function χ grounding our operational approach. Finally, in Sec.3.3, we provide interactions with their two semantics: Accept, based on global traces, and AccMult, obtained by projection of Accept.

As an interaction i can contain several occurrences of the same action act, our small-steps do not correspond to transformations of the form $i \xrightarrow{act} i'$ but rather $i \xrightarrow{act@p} i'$ where p indicates the position of a specific occurrence of act within i.

Fig. 4: Positions
To do so, we use positions expressed in the Dewey Decimal Notation [7]. As the arity of our operators is at most 2, positions are defined as elements of {1, 2}*. A sub-interaction of an interaction i at position p is noted i|_p. Fig.4 illustrates positions within the interaction from Fig.2.

Moreover, for any set P ∈ 𝒫({1, 2}*) and x ∈ {1, 2}, we note x.P the set { x.p | p ∈ P }. The set pos(i) of all well-defined positions w.r.t. an interaction i is given below:

Definition 3 (Positions). pos : Int → 𝒫({1, 2}*) is defined as follows:
– pos(∅) = {ε} and for any act ∈ Act, pos(act) = {ε}
– for any i_1 and i_2 in Int:
  • pos(f(i_1, i_2)) = {ε} ∪ 1.pos(i_1) ∪ 2.pos(i_2) with f ∈ {strict, seq, par, alt}
  • pos(loop_f(i_1)) = {ε} ∪ 1.pos(i_1) with f ∈ {strict, seq, par}

We can then unambiguously designate sub-terms of an interaction i (called sub-interactions) by using positions from pos(i). For any p ∈ pos(i), we use the notation i|_p to refer to this sub-interaction of i at position p. The formal definition of this notation is given below:

Definition 4 (Sub-interactions). _|_ : Int × {1, 2}* → Int is a partial function defined over couples (i, p) ∈ Int × {1, 2}* s.t. p ∈ pos(i) as follows:
– i|_ε = i for any i ∈ Int
– for any i_1, i_2 in Int and p_1 ∈ pos(i_1) and p_2 ∈ pos(i_2):
  • (f(i_1, i_2))|_{1.p_1} = (i_1)|_{p_1} for any f ∈ {strict, seq, par, alt}
  • (f(i_1, i_2))|_{2.p_2} = (i_2)|_{p_2} for any f ∈ {strict, seq, par, alt}
  • (loop_f(i_1))|_{1.p_1} = (i_1)|_{p_1} for any f ∈ {strict, seq, par}

The exp_ε function assesses statically whether or not an interaction accepts / expresses the empty trace ε. Naturally ∅ only accepts ε, while interactions act ∈ Act do not (act must be executed).
Similarly, any loop accepts ε because it is possible to repeat 0 times its content. (We use the verb "express" in the name exp_ε so as not to risk confusion between this simple static function and the "accept" semantics defined later.) The treatment of binary operators differs according to their intuitive meaning: for alt, it is sufficient that one of the two direct sub-interactions accepts ε, while for the scheduling operators (seq, strict and par), both have to accept ε.

Definition 5 (Emptiness). exp_ε : Int → bool is the function such that:
– exp_ε(∅) = ⊤ and for any act ∈ Act, exp_ε(act) = ⊥,
– for any i_1 and i_2 in Int:
  • exp_ε(f(i_1, i_2)) = exp_ε(i_1) ∧ exp_ε(i_2) with f ∈ {strict, seq, par}
  • exp_ε(alt(i_1, i_2)) = exp_ε(i_1) ∨ exp_ε(i_2)
  • exp_ε(loop_f(i_1)) = ⊤ with f ∈ {strict, seq, par}

Fig. 5: exp_ε (the term of Fig.2, each node annotated with whether it expresses ε)
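Def.4 and Def.5 translate almost literally into recursive functions. The sketch below assumes that interaction terms are encoded as nested tuples and positions as strings over "1" and "2" (the empty string standing for ε); the names sub and expresses_empty and the message names m1 and m2 are illustrative assumptions.

```python
# Assumed encoding: ("empty",), ("act", l, kind, m), (f, i1, i2) for a binary
# operator f, and ("loop", f, i1) for loop_f(i1).
EMPTY = ("empty",)
BINARY = {"strict", "seq", "alt", "par"}


def sub(i, p: str):
    """Return the sub-interaction of i at position p (Def.4)."""
    for x in p:
        if i[0] in BINARY and x in ("1", "2"):
            i = i[int(x)]          # left child is index 1, right child index 2
        elif i[0] == "loop" and x == "1":
            i = i[2]               # the only child of a loop is its body
        else:
            raise ValueError(f"{p!r} is not a position of this interaction")
    return i


def expresses_empty(i) -> bool:
    """exp_ε of Def.5: can i terminate without executing any action?"""
    tag = i[0]
    if tag in ("empty", "loop"):
        return True
    if tag == "act":
        return False
    if tag == "alt":
        return expresses_empty(i[1]) or expresses_empty(i[2])
    # strict, seq and par are scheduling operators: both children must agree.
    return expresses_empty(i[1]) and expresses_empty(i[2])


# Term of Fig.2, with hypothetical message names m1 and m2.
fig2 = ("seq",
        ("alt", ("strict", ("act", "b", "!", "m1"), ("act", "c", "?", "m1")), EMPTY),
        ("act", "b", "!", "m2"))
root_value = expresses_empty(fig2)            # the whole term
alt_value = expresses_empty(sub(fig2, "1"))   # the alt node
```

On this term the alt node expresses ε thanks to its ∅ branch, whereas the root does not because its right child is a mandatory action.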
Let us consider Fig.5. We can recognize the interaction term of the example from Fig.2. This interaction does not express the empty trace, i.e. exp_ε(i) = ⊥. Fig.5 illustrates how exp_ε(i) is computed. We start from the leaf nodes.

On the leaf node that is the empty interaction at position 12 (i.e. i|_12 = ∅) we have immediately that exp_ε(i|_12) = ⊤. Indeed, the empty interaction describes an execution where no action is expressed, leading to the empty trace ε. We note that on Fig.5 by drawing a ✓ symbol on the immediate right of the root node of this sub-interaction i|_12.

On the leaf nodes hosting actions, at positions 111, 112 and 2, we immediately have exp_ε(i|_111) = exp_ε(i|_112) = exp_ε(i|_2) = ⊥. Indeed, an interaction that is reduced to a single action act describes the mandatory execution of that action, leading to a trace ς = act of length 1, which is not the empty trace. We then decorate on Fig.5 those respective nodes with the ✗ symbol, signifying that those sub-interactions do not express the empty trace.

Let us then consider sub-interaction i|_11 = strict(b!m, c?m). Given that we have computed exp_ε for its child sub-interactions i|_111 and i|_112, we can immediately infer that exp_ε(i|_11) = ⊥. Indeed, the strict operator is a scheduling operator, i.e. it describes executions which are specific interleavings between two executions: one occurring on the left sub-interaction (here i|_111), and one on the right (here i|_112). Hence, if, on at least one child sub-interaction, every execution expresses at least one action, then any execution of the parent interaction also expresses at least one action. Therefore exp_ε(i|_11) = ⊥. On Fig.5, we then decorate the root node of i|_11 (which is the node at position 11) with the ✗ symbol.

Let us then consider sub-interaction i|_1 = alt(i|_11, ∅). By definition, i|_1 accepts the empty trace.
Indeed, given that its root node is an alt operator, i|_1 describes an alternative between the expression of distinct behaviors, each one modelled by one of the two sub-interactions i|_11 and i|_12 = ∅. i|_11 does not express the empty trace (in fact, it describes a single possible execution which leads to the trace b!m.c?m). i|_12 describes the empty execution, leading to the empty trace ε. Therefore i|_1 can either produce the trace b!m.c?m or the empty trace. Hence we have exp_ε(i|_1) = ⊤, which we note by the symbol ✓ on the node at position 1 on Fig.5.

Finally, whether or not the overall interaction i = seq(i|_1, i|_2) expresses ε is determined as for any other sub-interaction. Here the root node is a scheduling seq operator; therefore, so as to express ε, i would require both i|_1 and i|_2 to do so. As a result we have exp_ε(i) = ⊥.

The avoids function states, for an interaction i and a lifeline l, whether or not i accepts an execution that involves no actions occurring on l. The empty interaction ∅ "avoids" every lifeline. An action l'∆m "avoids" l iff it occurs on a different lifeline. Then, as for exp_ε, avoids is defined inductively. Any loop may avoid any lifeline given that, in any case, it is possible to repeat 0 times its content. For an interaction of the form i = alt(i|_1, i|_2), it is sufficient that any one of the two sub-interactions i|_1 or i|_2 avoids l so that i may avoid l. For the scheduling operators (seq, strict and par), both have to avoid l.
Definition 6 (Avoiding).
We define the function avoids : Int × L → bool s.t. for any l ∈ L:
– avoids(∅, l) = ⊤
– avoids(l'∆m, l) = (l' ≠ l), for any act = l'∆m ∈ Act,
– for any i_1 and i_2 in Int:
  • avoids(f(i_1, i_2), l) = avoids(i_1, l) ∧ avoids(i_2, l) with f ∈ {strict, seq, par}
  • avoids(alt(i_1, i_2), l) = avoids(i_1, l) ∨ avoids(i_2, l)
  • avoids(loop_f(i_1), l) = ⊤ with f ∈ {strict, seq, par}

Fig. 6: avoids(i, c) (the term of Fig.2, each node annotated with whether it avoids c)

Fig.6 describes the application of avoids(_, c) on an interaction i (from Fig.2) and its sub-interactions. At the leaf nodes, avoids(_, c) is either ⊤ (on ∅ or on actions which do not occur on c) or ⊥ (on actions which occur on c). In this interaction i, the only action occurring on lifeline c is c?m. At this leaf node we therefore put the ✗ symbol to signify that avoids(c?m, c) = ⊥. On all other leaf nodes, we put the ✓ symbol. Then, avoids(_, c) is computed from bottom to top w.r.t. the interaction term, in the exact same manner as exp_ε would be. The value of avoids(_, c) on a parent interaction is inferred from the values computed on child sub-interactions depending on the nature of the parent operator.

In the following, for any (i, l) ∈ Int × L we will simply note:
– exp_ε(i) when exp_ε(i) = ⊤
– ¬exp_ε(i) when exp_ε(i) = ⊥
– avoids(i, l) when avoids(i, l) = ⊤
– ¬avoids(i, l) when avoids(i, l) = ⊥.

Among all action leaves of i, only some are immediately executable. The function front (for frontier) in Def.7 determines the positions of all such actions.
Definition 7 (Frontier).
front : Int → 𝒫({1, 2}*) is the function s.t.:
– front(∅) = ∅ and for any act ∈ Act, front(act) = {ε},
– for any i_1 and i_2 in Int:
  • front(strict(i_1, i_2)) = 1.front(i_1) ∪ 2.front(i_2) if exp_ε(i_1), and 1.front(i_1) otherwise,
  • front(seq(i_1, i_2)) = 1.front(i_1) ∪ { p | p ∈ 2.front(i_2), avoids(i_1, lf(seq(i_1, i_2)|_p)) },
  • front(f(i_1, i_2)) = 1.front(i_1) ∪ 2.front(i_2) with f ∈ {alt, par}
  • front(loop_f(i_1)) = 1.front(i_1) for f ∈ {strict, seq, par}.
For any p ∈ front(i), i|_p is called a frontier action.

The empty interaction has an empty frontier: front(∅) = ∅. For any action act, front(act) = {ε} (ε is the position of act, which is immediately executable). For i of the form f(i_1, i_2), front(i) is inferred from front(i_1) and front(i_2). In all cases, actions occurring at positions in front(i_1) are immediately executable in i. Indeed, the term being read from left to right, all operators, if they introduce ordering constraints, will only do so on the right sub-interaction i_2. Thus 1.front(i_1) is included in front(i). If f = alt or f = par, 2.front(i_2) is also included in front(i) because no constraint may prevent the execution of actions from i_2. If f = strict, any action from i_2 can only be executed if no action from i_1 is (otherwise it would violate the strict sequencing). Therefore 2.front(i_2) is included in front(i) iff i_1 accepts the empty trace. If f = seq, elements p from 2.front(i_2) are included in front(i) iff i_1 accepts an execution that does not involve the lifeline on which the action i|_p occurs.

Fig. 7: Frontier actions (highlighted)
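The three static functions exp_ε, avoids and front can be sketched together in Python. As assumptions of this illustration, terms are nested tuples, positions are strings over "1" and "2", the frontier is returned as a dictionary from positions to actions (so that lf of a frontier action is directly available), and the message names m1 to m5 are hypothetical.

```python
# Assumed encoding: ("empty",), ("act", l, kind, m), (f, i1, i2), ("loop", f, i1).
EMPTY = ("empty",)


def expresses_empty(i) -> bool:
    """exp_ε (Def.5)."""
    tag = i[0]
    if tag in ("empty", "loop"):
        return True
    if tag == "act":
        return False
    if tag == "alt":
        return expresses_empty(i[1]) or expresses_empty(i[2])
    return expresses_empty(i[1]) and expresses_empty(i[2])


def avoids(i, l: str) -> bool:
    """avoids (Def.6): does i accept an execution with no action on l?"""
    tag = i[0]
    if tag in ("empty", "loop"):
        return True
    if tag == "act":
        return i[1] != l
    if tag == "alt":
        return avoids(i[1], l) or avoids(i[2], l)
    return avoids(i[1], l) and avoids(i[2], l)


def frontier(i) -> dict:
    """front (Def.7), returned as {position: frontier action}."""
    tag = i[0]
    if tag == "empty":
        return {}
    if tag == "act":
        return {"": i}
    if tag == "loop":
        return {"1" + p: a for p, a in frontier(i[2]).items()}
    left = {"1" + p: a for p, a in frontier(i[1]).items()}
    right = {"2" + p: a for p, a in frontier(i[2]).items()}
    if tag == "strict" and not expresses_empty(i[1]):
        right = {}  # the left child must be able to terminate first
    if tag == "seq":
        # keep a right action only if the left child can avoid its lifeline
        right = {p: a for p, a in right.items() if avoids(i[1], a[1])}
    return {**left, **right}


def act(l, kind, m):
    return ("act", l, kind, m)


# Term of Fig.1, with hypothetical message names m1..m5.
fig1 = ("seq",
        ("loop", "seq",
         ("seq",
          ("strict", act("a", "!", "m1"), act("b", "?", "m1")),
          ("seq",
           ("alt", ("strict", act("b", "!", "m2"), act("c", "?", "m2")), EMPTY),
           act("b", "!", "m3")))),
        ("par", act("a", "!", "m4"), ("strict", act("c", "!", "m5"), act("a", "?", "m5"))))
positions = sorted(frontier(fig1))
```

Under this encoding the frontier of the Fig.1 term contains three positions: the first emission inside the loop and the two emissions that start the branches of the par node.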
Fig.7 illustrates the definition of front on the example from Fig.1. Given the different actions on leaves, we have front(i) ⊆ {1111, 1112, 112111, 112112, 1122, 21, 221, 222}. Actions on the right of every strict operator are prevented from being executed by those on their left and as such are not in the frontier. This eliminates {1112, 112112, 222}. b!m and b!m are prevented from being executed by b?m, which is a cousin on their left w.r.t. the seq operator at position 11. This eliminates {112111, 1122}. Then, by elimination, front(i) = {1111, 21, 221}.

We now define the small-step used in our operational semantics. It consists in transforming an interaction i having the position p in its frontier into an interaction i' s.t. i' characterizes in intension all the possible futures of the execution of the action i|_p according to i. At first, we define a function that associates to any interaction i that may avoid l (i.e. s.t. avoids(i, l)) a new interaction, which characterizes exactly all the executions of i that do not involve lifeline l.
Definition 8 (Pruning).
The function prune : Int × L → Int is defined for couples (i, l) in Int × L verifying avoids(i, l) by:
– prune(∅, l) = ∅ and for any act ∈ Act, prune(act, l) = act
– for any (i_1, i_2) ∈ Int², prune(alt(i_1, i_2), l) is equal to:
  • prune(i_1, l) if avoids(i_1, l) ∧ ¬avoids(i_2, l)
  • prune(i_2, l) if avoids(i_2, l) ∧ ¬avoids(i_1, l)
  • alt(prune(i_1, l), prune(i_2, l)) if avoids(i_1, l) ∧ avoids(i_2, l)
– for any (i_1, i_2) ∈ Int² and any f ∈ {strict, seq, par}:
  • prune(f(i_1, i_2), l) = f(prune(i_1, l), prune(i_2, l))
– for any i ∈ Int and any f ∈ {strict, seq, par}:
  • prune(loop_f(i), l) = loop_f(prune(i, l)) if avoids(i, l)
  • prune(loop_f(i), l) = ∅ if ¬avoids(i, l)

For any given lifeline l, prune(_, l) : Int → Int eliminates from a given interaction i (s.t. the precondition avoids(i, l) is satisfied) all actions occurring on lifeline l while preserving as much as possible of the original semantics of i, i.e. so that Accept(prune(i, l)) ⊆ Accept(i) and Accept(prune(i, l)) is the maximum subset of Accept(i) that contains no trace in which there are actions occurring on l.

So as to preserve the semantics, the interaction term i can only be modified in two manners with the aim to eliminate actions: (1) by forcing the choice of a given sub-interaction in alt nodes (illustrated on Fig.8) and (2) by choosing to forbid the repetition of a sub-interaction in loop nodes (illustrated on Fig.9). Those modifications strictly correspond to the elimination of some possible executions of i and therefore we have Accept(prune(i, l)) ⊆ Accept(i).

We describe in the following the mechanism of prune formalised in Def.8. Let us consider a lifeline l. We have prune(∅, l) = ∅ because there is nothing to eliminate. For any action act ∈ Act, prune(act, l) is well defined iff avoids(act, l). Therefore, act is not an action that needs to be eliminated and prune(act, l) = act.
For i = alt(i|_1, i|_2), in order for the precondition avoids(i, l) to be satisfied, at least one of avoids(i|_1, l) and avoids(i|_2, l) must hold. If both branches avoid l they can be pruned and kept in the interaction term. If only a single one does, we only keep the pruned version of this single branch. For any scheduling operator f, if i = f(i|_1, i|_2), in order to have avoids(i, l) we must have both avoids(i|_1, l) and avoids(i|_2, l). Then prune(i, l) is simply defined as the scheduling by f of the pruned versions of i|_1 and i|_2. Finally, for loops, i.e. with i of the form loop_f(i|_1) with f a scheduling operator, we distinguish two cases. (1) If ¬avoids(i|_1, l) then any execution of i|_1 will yield a trace containing actions occurring on l. Therefore it is necessary to forbid the repetition of the loop. This is done by specifying that prune(i, l) = ∅. (2) If avoids(i|_1, l) then it is not necessary to forbid the repetition of the loop, given that sub-interaction i|_1 can be pruned and therefore may not yield traces with actions occurring on l. This being the modification which preserves a maximum amount of traces of the semantics, we have prune(i, l) = loop_f(prune(i|_1, l)). The recursive nature of prune then guarantees that only the minimally required modifications are done on the interaction term so as to eliminate from it undesired actions.

Fig. 8: Illustration of prune (case where a branch of alternative is cut-out). Panels: interaction before pruning, i.e. loop_seq(seq(strict(a!m, b?m), seq(alt(strict(b!m, c?m), ∅), b!m))); pruning with prune(_, c); interaction after pruning.
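The case analysis of Def.8 can be sketched as a short recursive function. The nested-tuple encoding of terms, the function names and the message names m1 to m3 are assumptions of this illustration; the example term is the loop sub-term shown in Fig.8.

```python
# Assumed encoding: ("empty",), ("act", l, kind, m), (f, i1, i2), ("loop", f, i1).
EMPTY = ("empty",)


def avoids(i, l: str) -> bool:
    """avoids (Def.6): does i accept an execution with no action on l?"""
    tag = i[0]
    if tag in ("empty", "loop"):
        return True
    if tag == "act":
        return i[1] != l
    if tag == "alt":
        return avoids(i[1], l) or avoids(i[2], l)
    return avoids(i[1], l) and avoids(i[2], l)


def prune(i, l: str):
    """prune (Def.8): keep exactly the executions of i that do not involve l."""
    assert avoids(i, l), "prune is only defined when avoids(i, l) holds"
    tag = i[0]
    if tag in ("empty", "act"):
        return i  # nothing to eliminate; an action here is not on l
    if tag == "loop":
        # forbid any repetition when the loop body cannot avoid l
        return ("loop", i[1], prune(i[2], l)) if avoids(i[2], l) else EMPTY
    if tag == "alt":
        # keep only the branches that can avoid l (at least one exists)
        kept = [prune(b, l) for b in (i[1], i[2]) if avoids(b, l)]
        return ("alt", kept[0], kept[1]) if len(kept) == 2 else kept[0]
    # scheduling operators: the precondition holds for both children
    return (tag, prune(i[1], l), prune(i[2], l))


def act(l, kind, m):
    return ("act", l, kind, m)


# Loop sub-term of Fig.8, with hypothetical message names m1..m3.
loop = ("loop", "seq",
        ("seq",
         ("strict", act("a", "!", "m1"), act("b", "?", "m1")),
         ("seq",
          ("alt", ("strict", act("b", "!", "m2"), act("c", "?", "m2")), EMPTY),
          act("b", "!", "m3"))))
without_c = prune(loop, "c")  # cuts the alt branch that contains c?m2
without_a = prune(loop, "a")  # forbids the repetition of the loop
```

The sketch performs no term simplification, so the result of pruning c keeps a seq(∅, b!m3) node where a diagram would simply show b!m3.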
Fig.8 illustrates a specific application of the pruning process. We consider an interaction i, drawn on the left part of Fig.8 and whose term is given in the middle part of Fig.8. i is defined over the set L = {a, b, c} of lifelines. We then apply prune(_, c) on i to obtain the interaction drawn on the right part of Fig.8. The blue lines represent the rewriting orchestrated by the prune function. The only action occurring on c in i is c?m. It must be eliminated. As its parent is a scheduling operator (strict), it must also be eliminated. The grand-parent node is an alt operator. The right cousin underneath this alt is ∅, which "avoids" c. Therefore, we can force the choice of the right branch of this alt to solve the pruning. The remaining interaction then does not contain any action occurring on c. As explained earlier, prune made the minimal modifications to i so as to eliminate c?m. For instance, we could have simply (and naively) forbidden the repetition of the loop at the root position; but this would also have eliminated from the semantics of the remaining interaction a number of traces which we do not want to be eliminated.

Fig. 9: Illustration of prune (case where the repetition of a loop is forbidden). Panels: interaction before pruning, i.e. loop_seq(seq(strict(a!m, b?m), seq(alt(strict(b!m, c?m), ∅), b!m))); pruning with prune(_, a); interaction after pruning, i.e. ∅.
Fig.9 illustrates another application of the pruning process. We consider the same interaction i as in Fig.8. However, we consider the pruning w.r.t. lifeline a instead of c. The only action occurring on a in i is a!m. It must be eliminated. Its parent and grand-parent being respectively a strict and a seq operator, they must both be eliminated. Finally, a loop node is reached. At this point, the only choice is to forbid the repetition of this loop. We therefore replace it by the empty interaction ∅ (as indicated in blue) to obtain the interaction on the right.

We can now define the "eχecution" function χ which computes, from a given interaction i and position p ∈ front(i), the interaction i' which characterizes all the continuations of the executions of i which start with the execution of action i|_p at position p.
Definition 9 (Interaction Execution).
The function χ : Int × {1, 2}* → Int × Act, defined for couples (i, p) verifying p ∈ front(i), is s.t.:
– for any act ∈ Act, χ(act, ε) = (∅, act)
– for any i_1, i_2 ∈ Int, p_1 ∈ front(i_1), let us note χ(i_1, p_1) = (i_1', act), then:
  • χ(alt(i_1, i_2), 1.p_1) = (i_1', act),
  • χ(f(i_1, i_2), 1.p_1) = (f(i_1', i_2), act) for f ∈ {strict, seq, par},
  • χ(loop_f(i_1), 1.p_1) = (f(i_1', loop_f(i_1)), act) for f ∈ {strict, seq, par},
– for any i_1, i_2 ∈ Int, p_2 ∈ front(i_2), let us note χ(i_2, p_2) = (i_2', act), then:
  • χ(alt(i_1, i_2), 2.p_2) = (i_2', act)
  • χ(strict(i_1, i_2), 2.p_2) = (i_2', act) iff 2.p_2 ∈ front(strict(i_1, i_2))
  • χ(seq(i_1, i_2), 2.p_2) = (seq(prune(i_1, lf(act)), i_2'), act) iff 2.p_2 ∈ front(seq(i_1, i_2))
  • χ(par(i_1, i_2), 2.p_2) = (par(i_1, i_2'), act).

χ is defined by induction on the cases authorized by its precondition p ∈ front(i). If i ∈ Act, p can only be ε (and vice-versa). In this case χ(i, ε) = (∅, i) because the action i is executed and nothing remains to be executed. In any other case, p is either of the form 1.p_1 or 2.p_2, meaning that the action to be executed is resp. in the left or right sub-interaction. Then the result of χ(i, p) is a reconstruction of the interaction term from resp. the result of χ(i_1, p_1) and i_2 or the result of χ(i_2, p_2) and i_1. The most subtle case occurs when p = 2.p_2 and i = seq(i_1, i_2). The precondition p ∈ front(i) implies that i|_p ∈ Act and that the left child i_1 avoids lf(i|_p). In this case, to construct χ(i, 2.p_2), χ does not use i_1 but rather its pruned version prune(i_1, lf(i|_p)) which eliminates all traces involving lf(i|_p) while preserving all others.

Fig.10 illustrates an application of χ on the example interaction from Fig.1. The action c!m at the frontier position 221 is being executed.
On the left is the original interaction $i$ and on the right the resulting interaction $i'$ s.t. $\chi(i, p) = (i', c!m)$. In the middle is illustrated the process of computing $i'$ from the rewriting of the interaction term $i$. The first step is the elimination of the target action $c!m$, which is done by replacing it with $\varnothing$. Then $i'$ is reconstructed from the bottom to the top. The immediate parent of $c!m$, at position $22$, is a $strict$ operator. On Fig.10 we have added term simplification steps which correspond to the elimination of $\varnothing$ children of scheduling operators. Those simplification steps are immediate and do not incur changes in the model's semantics, as for any scheduling operator $f$ and interaction $i$ we have that $f(i, \varnothing)$ and $f(\varnothing, i)$ are equivalent to $i$. As such, when reaching $i|_{22}$ during the rewriting of $i$ into $i'$, we simply replace the $strict$ node by its right child $a?m$. When reaching the $par$ node at position $2$, we do not write any change. However, when reaching the $seq$ node at root position $\epsilon$, the rewriting of $i$ involves the pruning of the left sub-interaction $i|_1$ (highlighted in blue) w.r.t. the lifeline on which the executed action $c!m$ occurs (i.e. $c$).

Fig. 10: Illustration of e$\chi$ecution (interaction before $\chi$, pruning and e$\chi$ecution steps, interaction after $\chi$)

Our small-step approach then consists in the exploration of an execution tree representing all possible successions of transformations $i \xrightarrow{act@p} i'$, starting from an initial interaction $i_0$. An accepted trace corresponds to a sequence $act_1. \cdots .act_n$ obtained from a path $i_0 \xrightarrow{act_1@p_1} i_1 \cdots \xrightarrow{act_n@p_n} i_n$ with $i_n$ a terminal interaction, i.e. accepting $\epsilon$.
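The case analysis of Def.9 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the HIBOU implementation: the tuple encoding of terms, the string encoding of actions (`"a!m"`) and positions (`"221"`), and all function names are hypothetical. In particular `exp_eps`, `avoids`, `prune` and `front` re-state, as we understand them, definitions that are given earlier in the paper and are not shown in this section.

```python
EMPTY = ("empty",)                      # the empty interaction

def action(a):                          # leaf term, e.g. action("a!m")
    return ("act", a)

def lf(a):
    """Lifeline on which action a occurs ('a!m' -> 'a')."""
    return a.replace("?", "!").split("!")[0]

def sub(i, d):
    """Child of i in direction d ('1' or '2'); a loop has a single child."""
    return i[2] if i[0] == "loop" else i[int(d)]

def action_at(i, p):
    for d in p:
        i = sub(i, d)
    return i[1]

def exp_eps(i):
    """Assumed definition: True iff i accepts the empty trace."""
    if i[0] in ("empty", "loop"):
        return True
    if i[0] == "act":
        return False
    if i[0] == "alt":
        return exp_eps(i[1]) or exp_eps(i[2])
    return exp_eps(i[1]) and exp_eps(i[2])          # strict, seq, par

def avoids(i, l):
    """Assumed definition: True iff i has an execution not involving lifeline l."""
    if i[0] in ("empty", "loop"):
        return True
    if i[0] == "act":
        return lf(i[1]) != l
    if i[0] == "alt":
        return avoids(i[1], l) or avoids(i[2], l)
    return avoids(i[1], l) and avoids(i[2], l)

def prune(i, l):
    """Assumed definition: drop every behaviour involving l (requires avoids(i, l))."""
    if i[0] in ("empty", "act"):
        return i
    if i[0] == "loop":                  # forbid repetition if the body involves l
        return ("loop", i[1], prune(i[2], l)) if avoids(i[2], l) else EMPTY
    if i[0] == "alt":
        keep = [prune(s, l) for s in i[1:] if avoids(s, l)]
        return keep[0] if len(keep) == 1 else ("alt", keep[0], keep[1])
    return (i[0], prune(i[1], l), prune(i[2], l))

def front(i):
    """Assumed definition: positions of the immediately executable actions."""
    if i[0] == "empty":
        return []
    if i[0] == "act":
        return [""]
    if i[0] == "loop":
        return ["1" + p for p in front(i[2])]
    left = ["1" + p for p in front(i[1])]
    right = front(i[2])
    if i[0] == "strict" and not exp_eps(i[1]):
        right = []
    if i[0] == "seq":
        right = [p for p in right if avoids(i[1], lf(action_at(i[2], p)))]
    return left + ["2" + p for p in right]

def execute(i, p):
    """chi(i, p) = (i', act) following Def.9; assumes p is in front(i)."""
    op = i[0]
    if op == "act":
        return EMPTY, i[1]
    if op == "loop":                    # loop_f(i1) -> f(i1', loop_f(i1))
        rest, a = execute(i[2], p[1:])
        return (i[1], rest, i), a
    if p[0] == "1":
        rest, a = execute(i[1], p[1:])
        return (rest if op == "alt" else (op, rest, i[2])), a
    rest, a = execute(i[2], p[1:])
    if op in ("alt", "strict"):
        return rest, a
    if op == "seq":                     # the left child is pruned w.r.t. lf(act)
        return ("seq", prune(i[1], lf(a)), rest), a
    return ("par", i[1], rest), a

# Hypothetical example term: seq(loop_strict(strict(a!m, b?m)), b!x)
loop = ("loop", "strict", ("strict", action("a!m"), action("b?m")))
example = ("seq", loop, action("b!x"))
```

On this hypothetical term, executing `b!x` at position `"2"` prunes the loop w.r.t. lifeline `b` and therefore replaces it by the empty interaction, which mirrors the situation described for Fig.9. The sketch omits the term simplification steps shown in Fig.10.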
By grouping all such paths together, we obtain a tree, called the execution tree, whose nodes are interactions and arcs are labelled by couples $(p, act)$ noted $act@p$. For a node $i$, child nodes are interactions $i'$ obtained via the execution of any frontier action $act = i|_p$ with $p \in front(i)$. Any such child node $i'$ accepts the suffixes of those traces accepted by $i$ that start with the execution of $act$ at position $p$. Let us note that, given the existence of loops, execution trees can be infinite, and traces can be arbitrarily long.

Fig.11 illustrates such an execution tree. It is only partially drawn since, the original interaction having a $loop$ node, this tree is infinite. Let us note that the transformation $i \xrightarrow{c!m@221} \_$, in which the interaction on the top part becomes the one underneath on its right (after transition "$c!m@221$"), corresponds to the e$\chi$ecution illustrated on Fig.10. The 3 child interactions underneath the top one correspond to the e$\chi$ecutions of the frontier actions of $i$ (detailed in Fig.7). The path leading to the empty interaction (white square $\square$) yields the trace $a!m.c!m.a?m$ (which is therefore part of the trace semantics of $i$).

In Def.10 below, we formally define our trace and multi-trace semantics $Accept$ and $AccMult$.

Definition 10 (Semantics). The sets of accepted traces $Accept : Int \rightarrow \mathcal{P}(Act^*)$ and multi-traces $AccMult : Int \rightarrow \mathcal{P}(Mult)$ are s.t. for any $i \in Int$:

$$Accept(i) = empty(i) \cup \{ act.\varsigma' \mid \exists p \in front(i),\ \chi(i, p) = (i', act) \text{ and } \varsigma' \in Accept(i') \}$$
$$AccMult(i) = \{ proj(\varsigma) \mid \varsigma \in Accept(i) \}$$

with $empty(i) = \{\epsilon\}$ if $exp_\epsilon(i)$ and $empty(i) = \varnothing$ otherwise.

We define a process able to decide whether or not a multi-trace $\mu$ is accepted by an interaction $i$. Its key principle is to construct traces accepted by $i$ that project on $\mu$. Constructing those traces is based on elementary steps $(i, \mu) \leadsto (i', \mu')$ s.t. $\chi(i, p) = (i', act_j)$ for some $p \in front(i)$ with $act_j \in Act(l_j)$, and $\mu$ and $\mu'$ being of resp. forms $(\sigma_1, \cdots, act_j.\sigma_j, \cdots, \sigma_n)$ and $(\sigma_1, \cdots, \sigma_j, \cdots, \sigma_n)$.

Any such step $(i, \mu) \leadsto (i', \mu')$ corresponds simultaneously, for a given action $act$, (1) to a small-step e$\chi$ecution in $i$ of $act$ (there can be several of those given that a same action can be present at multiple positions) and (2) to the consumption of $act$ on the component $lf(act)$ of $\mu$ (reducing its size by 1). By considering all possible matches between frontier actions $i|_p$ and the heads of components of $\mu$, and by iterating those steps of computation, the process builds a tree whose paths are of the form $(i_0, \mu_0) \leadsto \cdots \leadsto (i_p, \mu_p) \cdots \leadsto (i_q, \mu_q)$, denoted as $(i_0, \mu_0) \overset{*}{\leadsto} (i_q, \mu_q)$.

At each step $(i, \mu) \leadsto (i', \mu')$, the size of the multi-trace decreases by one. Hence, any path eventually reaches a point where it is no longer possible to find a next step. This halting of the process can occur in 2 cases.

(1) Either the process reaches a state $(i_q, \mu_q)$ where $\mu_q$ is not empty and no frontier action of $i_q$ matches the first element of any component of $\mu_q$.
In that case the sequence of actions that leads to $(i_q, \mu_q)$ cannot be extended into a trace accepted by $i_0$ that projects on $\mu_0$, and a local verdict $UnCov$ (for "multi-trace not covered") is associated to $(i_0, \mu_0) \overset{*}{\leadsto} (i_q, \mu_q)$.

(2) Or the process reaches a state $(i_q, (\epsilon, \cdots, \epsilon))$. Here, all actions of $\mu_0$ have been consumed to form a given global trace $\varsigma$. The process then checks if $\varsigma$ is accepted by $i_0$ (which happens iff $exp_\epsilon(i_q)$). If the answer is yes then $(i_0, \mu_0) \overset{*}{\leadsto} (i_q, (\epsilon, \cdots, \epsilon))$ is associated with a coverage verdict $Cov$ (for "multi-trace covered"). Otherwise, the verdict $UnCov$ is associated to the path.

If there exists a path leading to $Cov$, the global verdict is $Pass$. If no such path exists, the global verdict is $Fail$.

Fig. 12: Principle of multi-trace analysis
Let us consider the illustration on Fig.12. Starting from node $(i_0, \mu_0)$, a number of paths can be explored. From $(i_0, \mu_0)$, there exist $k$ outgoing transitions to other nodes, $k$ being the number of matches between frontier actions of $i_0$ and trace actions at the heads of components from $\mu_0$. Exploration steps (which are not represented but implicitly designated by $\cdots$ in Fig.12) are then repeated (recursively) for every one of those children. Ultimately, every path that is thus created leads to one of the two coverage verdicts $UnCov$ or $Cov$ (given the decreasing size of the multi-trace).

Paths starting from $(i_0, \mu_0)$ may have different lengths and different outcomes. This is explained by the fact that the graph explores how some executions of $i_0$ might (or might not) cover the behavior expressed by the multi-trace $\mu_0$. There may exist several executions of $i_0$ that match $\mu_0$. At the same time there might exist some that do not, and the fact that they do not match $\mu_0$ can become clear after an arbitrary number (bounded by the length of $\mu_0$) of small-step e$\chi$ecutions.

With regard to $Cov$, the fact that several paths might lead to it may be explained by the fact that several global traces can be projected into the same multi-trace (as in Fig.3). Therefore, when trying to reorder $\mu_0$ into a global trace that satisfies $i_0$, we can find several of those.

In the example illustrated in Fig.12, there exists (at least one) such path $[(i_0, \mu_0) \leadsto (i^1_{cov}, \mu^1_{cov}) \leadsto \cdots \leadsto (i^r_{cov}, \mu^r_{cov})]$ that leads to $Cov$. Given that obtaining $Cov$ requires emptying the initial multi-trace $\mu_0$, the length $r$ of this path is equal to that of the multi-trace. The existence of this path then implies that the global verdict $Pass$ will be returned.
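The remark that several global traces can project onto the same multi-trace can be made concrete with a small Python sketch. The encoding of actions as strings such as `"a!m"`, the helper names `lf` and `proj`, and the two example traces are assumptions made for illustration only.

```python
def lf(a):
    """Lifeline on which action a occurs ('a!m' -> 'a')."""
    return a.replace("?", "!").split("!")[0]

def proj(trace, lifelines):
    """Project a global trace onto each lifeline, keeping the local order."""
    return tuple(tuple(a for a in trace if lf(a) == l) for l in lifelines)

lifelines = ("a", "b", "c")
trace_1 = ("a!m", "c!m", "a?m")   # hypothetical global trace
trace_2 = ("c!m", "a!m", "a?m")   # same actions, different interleaving
multi_1 = proj(trace_1, lifelines)
multi_2 = proj(trace_2, lifelines)
```

Both traces yield the same multi-trace, since the relative order of `c!m` with respect to the actions of lifeline `a` is lost by the projection. This is why the analysis may find several paths leading to the $Cov$ verdict.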
Multi-trace analysis relies on 4 rules, denoted $R1$, $R2$, $R3$ and $R4$ and given in Def.11. Those rules define a directed graph $G$ in which vertices are either a tuple $(i, \mu) \in Int \times Mult$ or a coverage verdict $v \in \{Cov, UnCov\}$. We note $V = \{Cov, UnCov\} \cup (Int \times Mult)$ the set of vertices. For $x$ in $\{1, 2, 3, 4\}$, the rule $(Rx)\ \dfrac{v}{v'}\ cond$, with $v \in Int \times Mult$ and $v' \in V$, specifies edges of the form $v \leadsto v'$ of that graph, provided that $v$ satisfies condition $cond$.

Definition 11 (Rules of Multi-Trace Analysis). The analysis relation $\leadsto\ \subseteq V \times V$ is defined as:

$$(R1)\ \frac{(i, (\epsilon, \cdots, \epsilon))}{Cov}\ exp_\epsilon(i) \qquad (R2)\ \frac{(i, (\epsilon, \cdots, \epsilon))}{UnCov}\ \neg exp_\epsilon(i)$$

$$(R3)\ \frac{(i, (\sigma_1, \cdots, act.\sigma_k, \cdots, \sigma_n))}{(i', (\sigma_1, \cdots, \sigma_k, \cdots, \sigma_n))}\ \exists p \in front(i) \text{ s.t. } \chi(i, p) = (i', act)$$

$$(R4)\ \frac{(i, (\sigma_1, \cdots, \sigma_n))}{UnCov}\ (\sigma_1, \cdots, \sigma_n) \neq (\epsilon, \cdots, \epsilon) \wedge \big( \forall j \in [1, n], \forall p \in front(i),\ (\sigma_j \neq \epsilon) \Rightarrow (fst(\sigma_j) \neq i|_p) \big)$$

where $fst(\sigma)$ denotes the first element of a non-empty sequence $\sigma$.

Vertices of the form $(i, \mu)$ are not sinks. Indeed, if $\mu$ is the empty multi-trace, given that $exp_\epsilon(i)$ can either be $true$ or $false$, either $R1$ or $R2$ applies and therefore there exists an outgoing edge from any $(i, (\epsilon, \ldots, \epsilon))$. If $\mu \neq (\epsilon, \ldots, \epsilon)$, one can either have or not have matches between frontier actions and multi-trace component heads. Hence, an outgoing edge exists according to either $R3$ or $R4$. Consequently, coverage verdicts $\{Cov, UnCov\}$ are the only sinks of graph $G$.

Rules $R1$, $R2$ and $R4$ specify edges from vertices of the form $(i, \mu)$ to coverage verdicts. The rule $R3$ specifies edges $(i, \mu) \leadsto (i', \mu')$ such that (1) there exists an action $act$ occurring in $i$ at position $p \in front(i)$ matching a head action $act_j$ of $\mu$, i.e. $\mu = (\sigma_1, \cdots, act_j.\sigma'_j, \cdots, \sigma_n)$, (2) $i'$ is defined by $\chi(i, p) = (i', act_j)$, and (3) $\mu'$ is the multi-trace $\mu$ in which we have removed $act_j$, i.e. $\mu' = (\sigma_1, \cdots, \sigma'_j, \cdots, \sigma_n)$.
Moreover, let us note that for a vertex $(i, \mu)$, there are at most $|front(i)|$ possible applications of the rule $R3$, with $|front(i)|$ bounded by the number of occurrences of actions in $i$.

Let us consider $|\mu|$ the number of actions occurring in a multi-trace $\mu$, i.e. the sum of lengths of its component traces. Let us extend this notation to vertices, that is, $|(i, \mu)|$ defined as $|\mu|$, and $|Cov|$ and $|UnCov|$ defined as $-1$. For any edge $v \leadsto v'$ of $G$, we have $|v'| < |v|$ with $|v'| \geq -1$. Consequently, the successive application of the rules strictly decrements the size of nodes and, from any vertex $(i, \mu)$, any maximal outgoing path is finite and terminates in a coverage verdict in $\{Cov, UnCov\}$ (since vertices $(i, \mu)$ are not sinks of $G$). Thus, $G$ is an acyclic graph. With the notation $v \overset{*}{\leadsto} v'$ to indicate that there is a path from $v$ to $v'$ in $G$, we define multi-trace analysis.

Definition 12 (Multi-Trace Analysis).
We define $\omega : Int \times Mult \rightarrow \{Pass, Fail\}$ such that for any $i \in Int$ and $\mu \in Mult$ we have:
– $\omega(i, \mu) = Pass$ iff there exists a path $(i, \mu) \overset{*}{\leadsto} Cov$
– $\omega(i, \mu) = Fail$ otherwise, i.e. for every path $(i, \mu) \overset{*}{\leadsto} v$ with $v \in \{Cov, UnCov\}$, we have $v = UnCov$
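Def.11 and Def.12 can be read as a depth-first search, sketched below in Python. This is a minimal sketch under assumptions, not the HIBOU implementation: the function `omega` is parameterized by two hypothetical callbacks, `steps(i)`, which enumerates the triples `(act, lifeline, i')` obtained by executing the frontier actions of `i`, and `accepts_empty(i)`, which plays the role of $exp_\epsilon$. To keep the example self-contained, the model used here is a toy one in which an "interaction" is simply a finite set of global traces; it stands in for the term-based $\chi$ of Def.9.

```python
def lf(a):
    """Lifeline on which action a occurs ('a!m' -> 'a')."""
    return a.replace("?", "!").split("!")[0]

def omega(i, mu, steps, accepts_empty):
    """Return 'Pass' iff some path from (i, mu) reaches Cov, else 'Fail'.

    mu maps each lifeline to the tuple of actions still to be consumed."""
    if all(len(sigma) == 0 for sigma in mu.values()):
        return "Pass" if accepts_empty(i) else "Fail"      # rules R1 / R2
    for act, l, nxt in steps(i):                           # rule R3
        if mu.get(l) and mu[l][0] == act:
            rest = dict(mu)
            rest[l] = mu[l][1:]                            # consume the head action
            if omega(nxt, rest, steps, accepts_empty) == "Pass":
                return "Pass"                              # stop at the first Cov
    return "Fail"                                          # rule R4, or only UnCov paths

# Toy model (assumption): an interaction is a frozenset of accepted global traces.
def steps(traces):
    heads = {t[0] for t in traces if t}
    for act in sorted(heads):
        yield act, lf(act), frozenset(t[1:] for t in traces if t and t[0] == act)

def accepts_empty(traces):
    return () in traces

# Stands for strict(a!m, b?m), whose only accepted trace is a!m.b?m.
model = frozenset({("a!m", "b?m")})
verdict_full = omega(model, {"a": ("a!m",), "b": ("b?m",)}, steps, accepts_empty)
verdict_partial = omega(model, {"a": (), "b": ("b?m",)}, steps, accepts_empty)
```

The recursion depth is bounded by $|\mu| + 1$ since each application of $R3$ consumes exactly one action. Returning as soon as a $Cov$ verdict is found corresponds to the early interruption of the traversal.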
The function $\omega$ is well-defined. Indeed, we established that any maximal path from a vertex $(i_0, \mu_0)$ has a maximum length of $|\mu_0| + 1$ and ends on a coverage verdict ($Cov$ or $UnCov$). Besides, each intermediate vertex $(i, \mu)$ between $(i_0, \mu_0)$ and a coverage verdict has a number of children bounded by the number of actions of $i$. Therefore, the number of vertices reachable from $(i_0, \mu_0)$ is finite.

All definitions of Sec.3 are structured by induction on terms and positions and, as such, allow a direct implementation of the execution function $\chi$ (involved in the application of rule $R3$). Similarly, one can implement the $\omega$ function by building on-the-fly the sub-graph originating from $(i, \mu)$ thanks to queues and usual search heuristics. Moreover, in practice, graph traversals can be interrupted as soon as a $Cov$ verdict is reached. We have implemented $\chi$ and $\omega$ in the HIBOU tool, available online at https://github.com/erwanM974/hibou_label. In HIBOU, traceability for end-users is facilitated by the operational nature of our approach. Indeed, whether we are exploring an execution tree (as in Fig.11) or analyzing a multi-trace (as in Fig.13), HIBOU has access to the succession of states and can draw representations of the process.

Fig.13 represents the analysis of multi-trace $\mu = (a!m.a?m, \epsilon, c!m)$ w.r.t. the interaction from Fig.1, which yields the global verdict $Pass$. Here we configured HIBOU to use a Depth First Search heuristic when exploring the graph $G$. Given that $Cov$ was found quickly, paths starting with the executions of frontier action $c!m$ have not been explored.

Fig. 13: Example of multi-trace analysis

We now prove that the function $\omega$ in charge of analysing multi-traces w.r.t. an interaction captures exactly its semantics defined by the step-by-step execution function $\chi$ given in Sec.3.
More precisely, we will prove that for any $(i, \mu)$ in $Int \times Mult$, $\omega(i, \mu) = Pass$ iff $\mu \in AccMult(i)$ (and by extension, $\omega(i, \mu) = Fail$ iff $\mu \notin AccMult(i)$). Given that $AccMult(i)$ is the set of projected global traces of $Accept(i)$, it suffices to prove that every trace $\varsigma \in Accept(i)$ verifies $\omega(i, proj(\varsigma)) = Pass$ and that, conversely, $\omega(i, \mu) = Pass$ implies the existence of a trace $\varsigma \in Accept(i)$ s.t. $proj(\varsigma) = \mu$. Below, Th.1 and Th.2 resp. correspond to the $\Leftarrow$ and $\Rightarrow$ implications of this "iff".

Theorem 1 ($Accept$ implies $Pass$). For any $i \in Int$ and $\varsigma \in Act^*$:
$$(\varsigma \in Accept(i)) \Rightarrow (\omega(i, proj(\varsigma)) = Pass)$$

Proof.
Let us reason by induction on the trace $\varsigma$.
• $\varsigma = \epsilon$. Let us consider an interaction $i$ s.t. $\epsilon \in Accept(i)$. We have $proj(\epsilon) = (\epsilon, \cdots, \epsilon)$. As $\epsilon \in Accept(i)$, then $exp_\epsilon(i) = \top$ and $R1$ is applicable from $(i, (\epsilon, \cdots, \epsilon))$. We obtain $\omega(i, (\epsilon, \cdots, \epsilon)) = Pass$.
• $\varsigma = act.\varsigma'$. Let us consider $i$ s.t. $\varsigma \in Accept(i)$. The induction hypothesis on $\varsigma'$ is: "$\forall i' \in Int, (\varsigma' \in Accept(i')) \Rightarrow (\omega(i', proj(\varsigma')) = Pass)$". As $act.\varsigma' \in Accept(i)$, there exist $i'$ in $Int$ and $p \in front(i)$ s.t. $\chi(i, p) = (i', act)$ and $\varsigma' \in Accept(i')$. Let us consider the index $j$ such that $proj(act.\varsigma') = (\sigma_1, \cdots, act.\sigma_j, \cdots, \sigma_n)$. Given that $\chi(i, p) = (i', act)$, $R3$ can be applied so that $(i, (\sigma_1, \cdots, act.\sigma_j, \cdots, \sigma_n)) \leadsto (i', (\sigma_1, \cdots, \sigma_j, \cdots, \sigma_n))$ with $(\sigma_1, \cdots, \sigma_j, \cdots, \sigma_n) = proj(\varsigma')$. By induction, we have $\omega(i', proj(\varsigma')) = Pass$, i.e. there exists a path $(i', proj(\varsigma')) \overset{*}{\leadsto} Cov$. By preceding this path with $(i, proj(act.\varsigma')) \leadsto (i', proj(\varsigma'))$, we get a path $(i, (\sigma_1, \cdots, act.\sigma_j, \cdots, \sigma_n)) \overset{*}{\leadsto} Cov$ and $\omega(i, proj(\varsigma)) = Pass$. $\square$

Theorem 2 ($Pass$ implies $Accept$). For any $i \in Int$ and $\mu \in Mult$:
$$(\omega(i, \mu) = Pass) \Rightarrow (\exists \varsigma \in Act^* \text{ s.t. } proj(\varsigma) = \mu \text{ and } \varsigma \in Accept(i))$$

Proof.
Let us reason by induction on the size of $\mu$, i.e. on $|\mu|$.
• $|\mu| = 0$. Let us consider $i$ s.t. $\omega(i, \mu) = Pass$. By $|\mu| = 0$, $\mu = (\epsilon, \cdots, \epsilon)$. Since $\omega(i, (\epsilon, \cdots, \epsilon)) = Pass$, $R1$ must apply and this implies that $exp_\epsilon(i) = \top$ and consequently $\epsilon \in Accept(i)$. Therefore the property holds at length 0.
• $|\mu| = z + 1$ with $z \geq 0$. Let us consider $i$ s.t. $\omega(i, \mu) = Pass$. The induction hypothesis states that "for all $(i', \mu') \in Int \times Mult$ with $|\mu'| = z$, $(\omega(i', \mu') = Pass) \Rightarrow (\exists \varsigma' \in Act^* \text{ s.t. } proj(\varsigma') = \mu' \text{ and } \varsigma' \in Accept(i'))$". Since $\omega(i, \mu) = Pass$, there exists a path $(i, \mu) \overset{*}{\leadsto} Cov$. As noticed in Sec. 4.2, each edge of a maximal path consumes exactly one action, with the exception of the last edge leading to the coverage verdict. Thus the path starts with an edge of the form $(i, \mu) \leadsto (i', \mu')$ with $|\mu'| = z$ and we then have $(i', \mu') \overset{*}{\leadsto} Cov$. By definition, $\omega(i', \mu') = Pass$. By induction, there exists a trace $\varsigma'$ s.t. $proj(\varsigma') = \mu'$ and $\varsigma' \in Accept(i')$. The edge $(i, \mu) \leadsto (i', \mu')$ corresponds to the consumption of an action $act$ which matches a frontier action $i|_p$ of $i$, i.e. by rule $R3$ we have $\chi(i, p) = (i', act)$. By Def.10, the trace $\varsigma = act.\varsigma'$ then verifies $\varsigma \in Accept(i)$, and $proj(\varsigma) = \mu$ since $\mu$ is obtained from $\mu'$ by prepending $act$ to its component on $lf(act)$. $\square$

The two theorems demonstrate that $\omega(i, \mu) = Pass$ characterizes the membership of a multi-trace $\mu$ to $AccMult(i)$. Those theorems and all the definitions and lemmas they depend on have been encoded in the Gallina language so as to formally verify our proofs using the Coq automated theorem prover. A Coq proof, which includes the 2 previous demonstrations, is available online.

The computational cost of $\omega$ varies greatly depending on the initial $(i, \mu)$ couple.
In the following we demonstrate the NP-hardness of this membershipproblem through a reduction of the 1-in-3-SAT problem [20]. This discussion isinspired by [1,8,6]. p ≥ boolean variables V = { v , · · · , v p } and a set of q ≥ clauses { C , · · · , C q } in 3-CNF form i.e. s.t. for any j ∈ [1 , q ] , C j = α j ∨ β j ∨ γ j with α j , β j , γ j in V ∪ V , ¯ being the usual negation operator . The 1-in-3-SAT problemon formula φ = C ∧ · · · ∧ C q then consists in finding ρ : V → {(cid:62) , ⊥} s.t. ρ | = φ and s.t. for any clause C j , only one in the three literals α j , β j , or γ j is set to (cid:62) . In the following, we sketch a reduction proof which states that any 1-in-3-SAT problem can be reduced to the multi-trace membership problem for a given ( i, µ ) ∈ Int × M ult (i.e. whether or not µ ∈ AccM ult ( i ) ).Let us consider the reduction of 1-in-3-SAT in the simple case where p = 4 and q = 2 . This approach can then be extended to include any other case.From formula φ = C ∧ C , we define an interaction i via a 1-on-1 trans-formation. This i is of the form exemplified on Fig.14 i.e. a parallelisation of 4alternatives alt ( i v , i v ) s.t. for any x ∈ V ∪ V , i x is s.t. if x occurs: • in C and C then i x = seq ( l ! m, l ! m ) • in C but not in C then i x = l ! m • in C but not in C then i x = l ! m • neither in C nor in C then i x = ∅ For instance, with C = ( v ∨ v ∨ v ) and C = ( v ∨ v ∨ v ) , Fig.14 givesthe corresponding interaction.We affirm that this 1-in-3-SAT problem φ is equivalent to the multi-tracemembership problem µ = ( l ! m, l ! m ) ∈ AccM ult ( i ) . Indeed, in a given execu-tion of i , component σ = l ! m of µ is expressed exactly once iff exactly one ofthe sub-interactions i α , i β or i γ is "chosen" during the execution of i . 
Giventhat the parent interaction (within i ) of sub-interaction i α (same reasoning for i β and i γ ) is of the form alt ( i α , i α ) (or with the order of branches inverted),what we mean by "chosen" is that the exclusive branch that hosts i α is chosenover that which hosts i α .The expression of component σ on lifeline l is therefore equivalent to thesatisfaction of clause C in 1-in-3-SAT. In our example, with C = ( v ∨ v ∨ v ) , https://erwanm974.github.io/coq_hibou_label_multi_trace_analysis/ canonically extended to any set of formulas X as X = { ψ | ψ ∈ X } " ρ | = φ " is the usual satisfaction relation in propositional logic.mall-step multi-trace checking against interactions (long version) 21 the fact that ρ | = C with ρ : [ v → ⊥ , v → (cid:62) , v → (cid:62) , v → (cid:62) ] is equivalent tothe fact that l ! m is expressed exactly once during the execution of i when i v is chosen over i v , i v over i v , i v over i v , and i v over i v . Fig. 14: Reduction
The same reasoning can be applied as for the rela-tionship between clause C and component σ .In other words, during the execution of i , given theuse of exclusive alternative operators in alt ( i v , i ¯ v ) sub-terms, the choice of either one of the alt branch consti-tutes an assignment of Boolean variable v . The overallparallel composition then simulates all possible variableassignments (i.e. the search space for ρ ).Then, the satisfaction of φ as the conjunction ofclauses C and C in 1-in-3-SAT is equivalent to thatof µ = ( σ , σ ) ∈ AccM ult ( i ) . Indeed, the same ρ mustbe used to solve both C and C and the same globalexecution of i must be used to consume both σ and σ exactly.In our example, φ = ( v ∨ v ∨ v ) ∧ ( v ∨ v ∨ v ) is solvable in 1-in-3-SAT by ρ : [ v → ⊥ , v → (cid:62) , v →(cid:62) , v → (cid:62) ] . This is equivalent to the fact that µ =( l ! m, l ! m ) is consumed exactly by the execution of i from Fig.14 when i v is chosen over i v , i v over i v , i v over i v , and i v over i v . For any such 3-CNF formula φ = C ∧ C defined over V = { v , · · · , v } , the 1-in-3-SAT problem can therefore be reduced to that ofthe membership of ( l ! m, l ! m ) w.r.t. the interaction i constructed from φ as above.As explained earlier, this sketch of proof can be ex-tended to include any numbers p and q of resp. variables and clauses. It sufficesto consider q lifelines l , · · · , l q , the multi-trace µ = ( l ! m, · · · , l q ! m ) and p par-allelized sub-interactions alt ( i v , i v ) , · · · , alt ( i v p , i v p ) . This generalized versionis detailed in the annex.Given that we have identified a case of multi-trace membership equivalent toa NP-complete problem, by reduction, multi-trace membership is NP-hard. When testing DSs, the ω function cannot be directly used. Indeed, the ab-sence of global clock in a DS might cause difficulties to synchronize the cessationof observations on the different sub-systems. 
Let us consider for example the term $i = strict(a!m, b?m)$ and a system under test composed of a sub-system $Sys_a$ (implementing lifeline $a$) communicating with a sub-system $Sys_b$ (implementing lifeline $b$). A tester connected to $Sys_a$ might observe the empty execution $\epsilon$ because logging was stopped before the occurrence of emission $a!m$. On the other hand, a tester connected to $Sys_b$ could have logged long enough to observe $b?m$, so that the overall observation would correspond to $\mu = (\epsilon, b?m)$. As $\mu$ is a strict prefix of an accepted multi-trace, but not an accepted multi-trace itself, $\omega(i, \mu)$ would return $Fail$. From a testing perspective one would like to conclude that, even though $\mu$ is not an accepted multi-trace, a longer observation on $Sys_a$ might have enabled the observation of $a!m$, yielding the global observation $(a!m, b?m)$. As such, $\mu$ does not reveal a fault, contrary to observations that are not prefixes of accepted multi-traces.

In all generality, being able to determine whether or not a multi-trace is a prefix of an accepted multi-trace would require techniques that are beyond the scope of this paper. However, we propose a first approach in which we simply adapt $\omega$ to identify couples $(i, \mu)$ for which we cannot make a decision (and therefore provide them with a dedicated verdict of inconclusiveness).

Rules of Def.11 are adapted as follows. For a couple $(i, \mu)$, if $\mu$ is empty and $i$ does not express $\epsilon$, rule $R2$ applies and leads to $UnCov$. However, we could rather return a $TooShort$ verdict (as we did for trace analysis in [14]). Intuitively the existence of an execution (from an initial couple $(i_0, \mu_0)$) leading to this verdict will prove that the trace leading to it is a prefix of a trace accepted by $i_0$, but not a trace accepted by $i_0$. Now, if $\mu$ is not empty and from $(i, \mu)$ no first element of a component of $\mu$ matches a frontier action of $i$, rule $R4$ applies and leads to the $UnCov$ verdict. However we can here distinguish between cases in which an observability problem (as discussed earlier) may arise and cases in which it does not.
(a) If no component of $\mu$ has been emptied, this means that, at this point of the test, observations on all sub-systems are still ongoing. We are therefore sure of having a true error because no further execution complies with the interaction model. We may therefore return an $Out$ verdict.
(b) If at least one component of $\mu$ has been emptied, this means that the tester ceased logging on the corresponding sub-system. It might be that a longer observation on this sub-system would have enabled the application of rule $R3$. However we lack this information and hence return a $LackObs$ verdict.

Those considerations are reflected in the rules by slightly changing $R2$ (replacing $UnCov$ by $TooShort$) and by sub-dividing $R4$ into rules $R4a$ and $R4b$ which discriminate between the 2 aforementioned cases and are s.t.:

$$(R4a)\ \frac{(i, (\sigma_1, \cdots, \sigma_n))}{Out}\ \begin{cases} \forall j \in [1, n],\ \sigma_j \neq \epsilon \\ \forall j \in [1, n], \forall p \in front(i),\ fst(\sigma_j) \neq i|_p \end{cases}$$

$$(R4b)\ \frac{(i, (\sigma_1, \cdots, \sigma_n))}{LackObs}\ (\sigma_1, \cdots, \sigma_n) \neq (\epsilon, \cdots, \epsilon) \wedge (\exists j \in [1, n] \text{ s.t. } \sigma_j = \epsilon) \wedge \big( \forall j \in [1, n], \forall p \in front(i),\ \sigma_j \neq \epsilon \Rightarrow fst(\sigma_j) \neq i|_p \big)$$
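The adapted rules can be summarized by the local verdict attached to a halting state, sketched below in Python. The function name `local_verdict` and the encoding of a multi-trace as a tuple of tuples are assumptions made for illustration. The function is only meant to be called on a state $(i, \mu)$ from which rule $R3$ does not apply, and its second argument stands for $exp_\epsilon(i)$.

```python
def local_verdict(mu, accepts_empty):
    """Verdict for a state (i, mu) where no frontier action matches a head of mu.

    mu is a tuple of component traces; accepts_empty stands for exp_eps(i)."""
    if all(len(sigma) == 0 for sigma in mu):
        return "Cov" if accepts_empty else "TooShort"   # R1 / adapted R2
    if all(len(sigma) > 0 for sigma in mu):
        return "Out"                                    # R4a: every log is still open
    return "LackObs"                                    # R4b: some log was cut short

# Example from the text: i = strict(a!m, b?m) observed as (epsilon, b?m).
# The only frontier action is a!m, so nothing matches, and the log on a is empty.
verdict = local_verdict(((), ("b?m",)), accepts_empty=False)
```

On the example from the text the sketch returns `LackObs`, which reflects the fact that a longer observation on $Sys_a$ might have allowed the analysis to continue.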
We can then formulate a new function $\widetilde{\omega} : Int \times Mult \rightarrow \{Pass, WeakPass, Inconc, Fail\}$ s.t. for any $i \in Int$ and $\mu \in Mult$ we have:
– $\widetilde{\omega}(i, \mu) = Pass$ iff there exists a path $(i, \mu) \overset{*}{\leadsto} Cov$
– $\widetilde{\omega}(i, \mu) = WeakPass$ iff there does not exist a path $(i, \mu) \overset{*}{\leadsto} Cov$ and there exists a path $(i, \mu) \overset{*}{\leadsto} TooShort$
– $\widetilde{\omega}(i, \mu) = Inconc$ iff there does not exist a path of the form $(i, \mu) \overset{*}{\leadsto} v$ with $v \in \{Cov, TooShort\}$ and there exists a path $(i, \mu) \overset{*}{\leadsto} LackObs$
– $\widetilde{\omega}(i, \mu) = Fail$ if all paths lead to $Out$

$\widetilde{\omega}$ is just a first proposal to deal with DS testing, and we will, in future works, propose more precise methods for applying multi-trace analysis to testing. For example, instead of producing $LackObs$ when the conditions of application of $R4b$ hold, we could explore the possible futures characterized by $i$, pursuing the goal of identifying continuations starting with actions of $Act(l_j)$, for some $j \leq n$ such that $\sigma_j = \epsilon$. We could then try to produce more precise verdicts, by reasoning on the existence of such continuations, obtained by simulating some executions of $i$ rather than by consuming actions in $(\sigma_1, \cdots, \sigma_n)$.

We have proposed an approach to decide on the membership of multi-traces w.r.t. semantics defined on interaction models. The analysis consists in applying a non-deterministic reading of the multi-trace using small steps of the operational semantics. This approach has been validated with formal proofs of correctness using Coq, and a study on complexity. Moreover, a prototype tool that implements this analysis method has been developed in line with theoretical claims. Finally, we have discussed how membership analysis can be extended for testing distributed systems where logging of multi-traces is performed under observability limitations. This last subject, together with that of introducing data in models, will be the object of further works.
References
1. Alur, R., Etessami, K., Yannakakis, M.: Realizability and verification of MSC graphs. In: Orejas, F., Spirakis, P.G., van Leeuwen, J. (eds.) Automata, Languages and Programming, 28th International Colloquium, ICALP 2001, Crete, Greece, July 8-12, 2001, Proceedings. Lecture Notes in Computer Science, vol. 2076, pp. 797–808. Springer (2001). https://doi.org/10.1007/3-540-48224-5_65
2. Andrés, C., Cambronero, M., Núñez, M.: Formal passive testing of service-oriented systems. In: 2010 IEEE International Conference on Services Computing, SCC 2010, Miami, Florida, USA, July 5-10, 2010. pp. 610–613. IEEE Computer Society (2010). https://doi.org/10.1109/SCC.2010.62
3. Bannour, B., Gaston, C., Servat, D.: Eliciting unitary constraints from timed sequence diagram with symbolic techniques: Application to testing. In: Thu, T.D., Leung, K.R.P.H. (eds.) 18th Asia Pacific Software Engineering Conference, APSEC 2011, Ho Chi Minh, Vietnam, December 5-8, 2011. pp. 219–226. IEEE Computer Society (2011). https://doi.org/10.1109/APSEC.2011.40
4. Benharrat, N., Gaston, C., Hierons, R.M., Lapitre, A., Gall, P.L.: Constraint-based oracles for timed distributed systems. In: Yevtushenko, N., Cavalli, A.R., Yenigün, H. (eds.) Testing Software and Systems - 29th IFIP WG 6.1 International Conference, ICTSS 2017, St. Petersburg, Russia, October 9-11, 2017, Proceedings. Lecture Notes in Computer Science, vol. 10533, pp. 276–292. Springer (2017). https://doi.org/10.1007/978-3-319-67549-7_17
5. Dan, H., Hierons, R.M.: Conformance testing from message sequence charts. In: Fourth IEEE International Conference on Software Testing, Verification and Validation, ICST 2011, Berlin, Germany, March 21-25, 2011. pp. 279–288. IEEE Computer Society (2011). https://doi.org/10.1109/ICST.2011.29
6. Dan, H., Hierons, R.M.: The oracle problem when testing from MSCs. Comput. J. (7), 987–1001 (2014). https://doi.org/10.1093/comjnl/bxt055
7. Dershowitz, N., Jouannaud, J.P.: Handbook of theoretical computer science (vol. B). chap. Rewrite Systems, pp. 243–320. MIT Press, Cambridge, MA, USA (1990)
8. Genest, B., Muscholl, A.: Pattern matching and membership for hierarchical message sequence charts. Theory Comput. Syst.

A NP-hardness sketch of proof in general case
For practicity, we will use n − ary notations for binary operators f ∈ { strict,seq, par, alt } , with i = f ( i , · · · , i n ) designating the folding of f s.t. i = f ( i , f ( · · · , f ( i n − , i n ) · · · )) . We now reduce the 1-in-3-SAT boolean satisfiability prob-lem to multitrace membership so as to prove the NP-hardness of the latter.Let us consider a 3-CNF formula φ = C ∧ · · · ∧ C q defined over a set V = { v , · · · , v p } of Boolean variables. φ being a 3-CNF formula, for any j ∈ [1 , q ] , C j is a disjunction of 3 literals α j ∨ β j ∨ γ j (i.e. α j , β j and γ j are of the form v or v with v ∈ V and ¯ the negation operator).The 1-in-3 SAT problem consists in finding a solution ρ : V → {(cid:62) , ⊥} s.t.for every clause C j only one of α j , β j or γ j is set to (cid:62) .For any k ∈ [1 , p ] let us define i k = alt ( i v k , i v k ) with i v k = seq ( i v k , · · · , i qv k ) and i v k = seq ( i v k , · · · , i qv k ) s.t. for all j ∈ [1 , q ] , given clause C j we have: • if v k occurs in C j then i jv k = l j ! m and else i jv k = ∅ • if v k occurs in C j then i jv k = l j ! m and else i jv k = ∅ Let us then consider i = par ( i , · · · , i p ) as illustrated on Fig.15. For instance,given C = v ∨ v ∨ v p we have i v = l ! m , i v = ∅ , i v = l ! m , i v = ∅ , i v p = l ! m and i v p = ∅ . Likewise, the other emissions of m drawn in Fig.15correspond to C = v ∨ v ∨ v p and C q = v ∨ v ∨ v p .The 1-in-3-SAT problem φ is then equivalent to the multi-trace membershipproblem µ = ( l ! m, · · · , l q ! m ) ∈ AccM ult ( i ) . Fig. 15: i obtained from φ Indeed, for any component σ j = l j ! m of µ , σ j is expressed exactly once iff exactly oneof the sub-interactions i α j , i β j or i γ j is chosenduring the execution of i (choice w.r.t. their re-spective parent alt operator). The satisfactionof component σ j is therefore equivalent to thatof clause C j . For instance, on the example fromFig.15, σ = l ! 
m is satisfied iff only one of i v , i v or i v p is chosen on their respective alterna-tive branches, which exactly corresponds to thesatisfaction of C = v ∨ v ∨ v p in 1-in-3-SAT,In other words, during the execution of i ,given the use of exclusive alternative operatorsin alt ( i x , i x ) sub-terms, the choice of either oneof the alt branch constitutes an assignment ofboolean variable x . The overall parallel com-position then simulates all possible variable as-signment (i.e. the search space for ρ ).Then, the satisfaction of φ as the conjunc-tion of clauses C j in 1-in-3-SAT is equivalentto that of µ ∈ AccM ult ( i ) , given that the sameexecution of the model i must satisfy conjointly every component σ jj