Verification for Reliable Product Lines
Maxime Cordy, Patrick Heymans, Pierre-Yves Schobbens, Amir Molzam Sharifloo, Carlo Ghezzi, Axel Legay
Maxime Cordy∗, Patrick Heymans, Pierre-Yves Schobbens
PReCISE Research Center, University of Namur, Belgium
Amir Molzam Sharifloo, Carlo Ghezzi
DeepSE Research Group, Politecnico di Milano, Italy
Axel Legay
INRIA Rennes, France
Abstract: Many product lines are critical, and therefore reliability is a vital part of their requirements. Reliability is a probabilistic property. We therefore propose a model for feature-aware discrete-time Markov chains as a basis for verifying probabilistic properties of product lines, including reliability. We compare three verification techniques. The enumerative technique uses PRISM, a state-of-the-art symbolic probabilistic model checker, on each product. The parametric technique exploits our recent advances in parametric model checking. Finally, we propose a new bounded technique that performs a single bounded verification for the whole product line, and thus takes advantage of the common behaviours of the product line. Experimental results confirm the advantages of the last two techniques.
Index Terms: Non-Functional Requirements, Software Product Lines, Markov models, Model Checking, Probability
I. INTRODUCTION
Software Product Line Engineering (SPLE) aims at developing a large number of software systems that share a common and managed set of features. In the past years, it has been an active area in both research and industry. SPLE aims at improving productivity and reducing the time, effort and cost required to develop a family of products (also called variants). The key point to achieve this goal is to manage the variability among the various products of a Software Product Line (SPL). SPLE mainly relies on model-based techniques by which variable features and behaviours are specified. The models are used to derive numerous products, each of which contains a specific set of features.

Each product of the SPL has to satisfy its functional and non-functional requirements. A common approach to assess these requirements consists in building the system and testing it. However, if the system fails to meet the requirements, costly iterations are needed to improve it. This problem is even more crucial in SPLE, where a huge number of products have to be designed. Analysis techniques for single systems are therefore expensive to apply. This is the reason why researchers have recently focused on designing new quality assurance techniques dedicated to SPLs. In particular, model checking is an automated verification technique that systematically explores the whole state space of a model in search of errors. In the last years, several approaches lifted model checking to SPLs [?], [?] (see more in Section VII). However, most of them consider qualitative properties on the sequencing of events only. They ignore important non-functional aspects of systems like reliability, availability, performance, and resource usage.

∗ FNRS Research Fellow

Moreover, today's software is embedded in a wide variety of systems like networks and aircraft, which run in environments where uncontrolled events may occur randomly and affect the system.
For example, a TCP transmission might be perturbed by hardware failures in a network node; a pump motor has to run faster when the flow of water goes beyond a certain threshold. Further, some systems use randomness for their own functioning, like the USB protocol. Markov models are widespread formalisms to model probabilistic properties of systems whose behaviour is driven by random events. Still, they cannot cope with variability. A possible approach to modelling stochastic product lines is to specify them with parametric equations; resolving these equations can be difficult, however. There is thus a need for (1) behavioural models able to represent both variability and randomness, and (2) automated techniques to efficiently verify all the modelled products against probabilistic properties.

In this paper, we propose an approach to model and verify fully observable stochastic SPLs. We enrich Markov chains with an explicit notion of variability. This allows us to derive a new formalism that combines representations of SPL behaviour with Markov models. In particular, we introduce Featured Discrete-Time Markov Chains (FDTMCs), an extension of discrete-time Markov chains that describes a full product line in a single model, thus sharing the common parts. In FDTMCs, transitions are associated with a probability profile that, given a product, returns the probability that the transition is executed in this product. The definition of probability profile is intended to be flexible, and allows features to arbitrarily modify the transition probability.

Next, we present and discuss three verification methods to determine which products modelled in an FDTMC satisfy (or fail to satisfy) a given non-functional property. We focus more particularly on properties such as reliability that can be expressed in Probabilistic Computation Tree Logic (PCTL) [?]. A first solution to achieve this goal is to derive, for each product, the discrete-time Markov chain modelling this product from the FDTMC and to apply single-system algorithms to model check it. This enumerative technique thus performs one exploration per product. We also propose to reduce FDTMC model checking to parametric Markov-model checking, where the parameters encode the features. We can then feed the parametric model into a parametric model checker, which, in a single exploration, yields a parametric expression that encodes the value of the desired properties. Then, for each product, the checker has to evaluate this expression by replacing the parameters by the values corresponding to the features of the product.
Our last proposition is a novel algorithm that exploits probability profiles to determine all the products satisfying the property in a single exploration, which benefits from the compact structure of FDTMCs to reduce the cost of verification. This approach applies a bounded search through the state space of an FDTMC model, and is able to calculate approximate results on the satisfaction of a given property within a desired precision.

To evaluate the efficiency of the three approaches, we carried out two series of experiments based on systematically extended technical examples. The results show that the enumerative approach is inefficient compared to the other two techniques, especially as the number of products grows.

The remainder of the paper is structured as follows. Section II recapitulates background on SPL model checking. In Section III, we recap probability theory and give its featured version. In Section IV, we introduce FDTMCs and FMDPs, and present our compositional modelling approach. The three verification techniques are discussed in Section V. Section VI provides experimental results. We overview related work in Section VII, and sketch future work in Section VIII.

II. FOUNDATIONS
Variability in SPLs is commonly captured by features [?]. A feature (or variation point [?]) is a unit of difference that can differentiate two variants of an SPL. It can model optional components, functionalities of the system, but also cross-cutting behaviour. Dependencies between features may exist, and these are commonly modelled in a feature diagram (FD) [?]. Many different languages are now used for this purpose [?]. A product, sometimes called configuration, variant [?], or model [?], [?], gives a value to each feature of its FD. A product $p$ is valid for $d$, noted $p \models d$, if the values of its features satisfy all the constraints expressed in the FD. In this paper, we abstract from the syntax of FDs and define directly the semantics of an FD $d$ as its set of features (its signature [?]) $\Sigma_d$ and the set of its valid products (its product line [?]) $[\![d]\!]$. We can restrict a product $p$ to a sub-signature $\Sigma'$, noted $p|_{\Sigma'}$, which selects only the values of the features of $\Sigma'$. In other words, FDs form an institution [?].

As an increasing number of SPLs are developed, quality assurance techniques for them become vital and are actively studied. FDs being unable to express behaviour, several approaches for SPL model checking have emerged during the recent years (see more in Section VII). Here we follow the principles of Featured Transition Systems (FTS) [?]. Assume we have an adequate model type for single products; then the semantics of the "featured" variant of this model is a function giving, for each product of the product line, the corresponding model. For instance, Transition Systems (TS) are a well-accepted model type for the qualitative behaviour of a system, and thus the semantics of an FTS maps each product to a TS.

Fig. 1. Mine pump controller FTS (states: ready, run, stopped, emergency; transition labels: [A] methane, [A] no_methane, [W & !A] high, [W] normal, [W] low, [W & A] high ∩ no_methane)
At the syntactic level, the dependence on the products is moved as much as possible inside the model, so that the common parts are described only once. Thus, an FTS is a state machine where transitions are annotated with feature expressions, i.e. formulae defined over the features. Formally, an FTS is a tuple $F = (S, s_0, Act, trans, AP, L, d, \gamma)$ where $S$ is a set of states, $s_0 \in S$ is the initial state, $Act$ is a set of actions, $trans \subseteq S \times Act \times S$ is a set of transitions, $AP$ is a set of atomic propositions, $L : S \to 2^{AP}$ labels a state with the propositions that it satisfies, $d$ is a feature diagram and $\gamma : trans \to ([\![d]\!] \to \{\top, \bot\})$ associates a transition with a feature expression, i.e. a formula that encodes the set of products able to execute the transition. The dependence on the products has thus been moved inside, to the transitions.

An FTS model-checking algorithm takes the feature information into account while looking for errors. It is thus able to keep track of the products able to execute the behaviour currently analysed, and examines the common parts only once. Feature expressions constitute an intuitive and flexible way to represent variability inside behavioural models. In this work, we reuse the principles of such an encoding for modelling behavioural variability in stochastic systems.

Example. We illustrate the concept of FTS by means of the mine pump controller [?], [?]. The objective of this software is to control a pump that drains water from a mine. When the level of water goes beyond a certain threshold, it should turn the pump on, except when there is methane in the mine, since the motor might cause an explosion. Figure 1 presents an FTS modelling the (simplified) controller SPL. Two features influence the behaviour of the controller: the presence of a WaterSensor (W), which reports the water level, and MethaneAlarm (A), which detects the presence or absence of methane. The controller starts in its state ready.
Then, it can reach the state run when the level of water is so high that the pump has to be turned on. As specified by the associated feature expression, a product $p$ can execute this transition $t$ iff $\gamma(t)(p) = \top$, that is, iff feature W is enabled in $p$. Whenever the level of water is low or there is methane, the pump must be turned off. The controller reacts accordingly and reaches state stopped, respectively emergency, provided that the controller is equipped with feature W, respectively A.

Therefore, the behaviour of the controller is determined not only by its features, but also by uncontrolled events (that is, water level and methane presence) which occur randomly. Even though FTSs are convenient for modelling behavioural variability, they cannot represent randomness. Conversely, stochastic models cannot cope with variability. Both abilities are required to model and check non-functional quantitative properties like "what is the probability that eventually the pump is not running even though the water level is high?" on all the products of an SPL. The purpose of this paper is to provide a formal framework for extending stochastic models with variability information.

III. FEATURED MARKOV MODELS
Markov processes are meant to model systems where random events occur. They have been used to analyse quantitative properties in a variety of areas including economics, networks, and software. Throughout the section, we recapitulate the fundamental definitions related to Markov processes, and enrich them with an explicit notion of variability along the principles sketched above.

Markov processes, and more generally any kind of stochastic process, are defined over a probability space and a measurable space. A probability space is a triple $(\Omega, F, P)$ where $\Omega$ is a finite set of outcomes and $F \subseteq 2^{\Omega}$ is a set of events, each of which is a set of outcomes, that forms a $\sigma$-algebra, i.e., $F$ is non-empty and closed under complementation and countable unions. $P : F \to [0, 1]$ is a probability measure function that assigns a probability to events. The probability of a countable union of disjoint sets must be the sum of their probabilities, and the probability of $\Omega$ must be 1. In our mine pump example, the outcomes are the combinations of water level (low, normal, and high) with the presence or absence of methane (methane and no_methane, respectively); an example of event is "there is methane and the water level is not low", that is, $\{(high, methane), (normal, methane)\}$. The state will be represented by a measurable space: it is a couple $(S, \Sigma)$ where $S$ is a set and $\Sigma \subseteq 2^{S}$ is a $\sigma$-algebra. $S$ is called the state space.

Let $T$ be a totally ordered set, called time. A stochastic process is a set $\{X_t \in S \mid t \in T\}$ of indexed, $S$-valued random variables. Random variables are functions $X : \Omega \to S$ that map an outcome to a state. $X_t$ represents the state of the process at time $t$. Given that we consider SPLs instead of single systems, we let this state also depend on the features of the process.
For example, if the outcome $(high, methane)$ occurs while the controller is in state ready, it should reach state run if it has feature W or state emergency if it has feature A; otherwise it remains in state ready. The value of $X_t$ is thus impacted by variability. Accordingly, we revise the definition of random variable, and define it as a function $X : \Omega \to ([\![d]\!] \to S)$ that, given an outcome $\omega$ and a variant $p$, returns the state reached by $p$ following the occurrence of $\omega$. This revised definition leads us to the following extended definition of stochastic process.

Definition 1
Let $(\Omega, F, P)$ be a probability space, $(S, \Sigma)$ a measurable space, $T$ a totally ordered set, and $d$ a feature diagram. A featured stochastic process is a set $\{X_t \in S^{[\![d]\!]} \mid t \in T\}$ where for any $t$, $X_t$ is a random variable.

According to this definition, the state reached by the process at time $t$ is determined by (1) the outcome and (2) the product, i.e. the combination of features of the process.

Markov processes are a particular type of stochastic processes that satisfy the so-called Markov property. Intuitively, this property, often referred to as the memoryless property, implies that the probability of reaching a state at a future point of time can be determined knowing only the current state. Since we revised the definition of random variables and stochastic processes, we extend the definition of the Markov property as well, and require that the property holds for each product.

Definition 2
A featured Markov process is a variability-aware stochastic process $\{X_t \in S^{[\![d]\!]} \mid t \in T\}$ such that for all $p \in [\![d]\!]$, all $t < t' \in T$ and all $s \in S$ we have
\[ Pr(X_{t'}(p) = s \mid X_u(p) = s_u, \forall u \leq t) = Pr(X_{t'}(p) = s \mid X_t(p) = s_t) \]

Assume furthermore that time $T$ is equipped with an addition.

Definition 3
A featured Markov process is time-homogeneous if
\[ Pr(X_{t+t'}(p) = s \mid X_t(p) = s') = Pr(X_{t'}(p) = s \mid X_0(p) = s') \]
When the time is discrete, it is enough to take $t' = 1$.

There exist many types of Markov models. First, we can model time as continuous ($T = \mathbb{R}_{\geq 0}$) or discrete ($T = \mathbb{N}$). Second, we can assume that the process is purely stochastic (Markov chain) or partially under the control of an opponent (Markov decision process). Third, rewards can be added to the model (Markov reward model), etc. Specialized algorithms have been developed for each type of model [?], [?]. In the rest of this paper, we focus on the two most widely used models, namely time-homogeneous discrete-time Markov chains (DTMC) and time-homogeneous discrete-time Markov decision processes (MDP). For each, we derive its featured extension (FDTMC and FMDP). We leave for future work the algorithmic treatment of other featured stochastic processes, but their semantics is already defined in this section.

IV. FEATURED DISCRETE-TIME MARKOV PROCESSES
The above definition of featured Markov processes is the semantic foundation for the definition of formalisms to model probabilistic requirements in SPLs. Here, we specialize it to Featured Discrete-Time Markov Chains (FDTMCs), a formalism that extends Discrete-Time Markov Chains (DTMCs) to allow modelling both variability and stochasticity. In this section, we first briefly illustrate how classical DTMCs model the behaviour of a stochastic environment. Then we present the syntax and the semantics of FDTMCs.

A. Modelling random events with DTMCs
DTMCs are a particular type of Markov processes where (1) the state space of the system is finite, and (2) time elapses in discrete steps. DTMCs can be regarded as transition systems where transitions are annotated with a probability value that describes the likelihood of their occurrence. These probabilities satisfy the usual probability axioms. In particular, for a given state, the probabilities of its outgoing transitions must sum to 1.

An example DTMC is shown in Figure 2. It models the evolution of methane in the mine in which the aforementioned pump is installed. Initially, there is no methane. At the next discrete point of time, either there is methane (probability 0.125) or not (probability 0.875). Then the methane, if present, will disappear spontaneously in 75% of the cases. Note that the probability of staying in the same state (self-loop) can be deduced from the other outgoing transitions, since they have to sum up to 1; we will thus omit self-loops in the rest of the paper. DTMCs are adequate to model systems that evolve spontaneously, without influence from the outside world.

Fig. 2. The DTMC modelling methane evolution (states: no_methane, methane; no_methane to methane 1/8, no_methane self-loop 7/8, methane to no_methane 3/4, methane self-loop 1/4)
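The DTMC of Figure 2 is small enough to be written down and explored by hand. The following is a minimal sketch, assuming a plain dictionary encoding with exact fractions; the names `METHANE_DTMC`, `is_stochastic` and `step` are illustrative and not part of the paper.

```python
from fractions import Fraction as F

# Transition probabilities of Fig. 2, with the self-loops made explicit.
METHANE_DTMC = {
    "no_methane": {"no_methane": F(7, 8), "methane": F(1, 8)},
    "methane": {"no_methane": F(3, 4), "methane": F(1, 4)},
}


def is_stochastic(dtmc):
    """Check that the outgoing probabilities of every state sum to 1."""
    return all(sum(row.values()) == 1 for row in dtmc.values())


def step(dtmc, dist):
    """Push a distribution over states one discrete time step forward."""
    nxt = {s: F(0) for s in dtmc}
    for s, mass in dist.items():
        for t, pr in dtmc[s].items():
            nxt[t] += mass * pr
    return nxt


# Initially there is no methane.
initial = {"no_methane": F(1), "methane": F(0)}
after_one = step(METHANE_DTMC, initial)
after_two = step(METHANE_DTMC, after_one)
```

After one step the probability of methane is 1/8; after two steps it is 7/8 · 1/8 + 1/8 · 1/4 = 9/64.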
B. FDTMC: A featured stochastic formalism
Although DTMCs are convenient to represent stochasticity, they cannot cope with variability. To represent stochastic behaviour in SPLs, we enrich them in the same way as we enhanced transition systems with feature expressions to obtain the FTS formalism [?], [?]. The original intent of feature expressions is to specify that features can add or remove transitions. In DTMCs, however, the addition or removal of a transition leaving a given state may modify the probability of its other outgoing transitions. More generally, features can modify the probability distribution of any transition. The only restriction is the satisfaction of the probability axioms in all the products. Therefore, the probability of a given transition is not a fixed value anymore. To represent this new kind of variability, we propose to annotate transitions with a probability profile $[\![d]\!] \to [0, 1]$, by means of a function $\Pi : S \times S \to ([\![d]\!] \to [0, 1])$ that associates each transition and each product with a probability value. Intuitively, $\Pi(s, s')(p)$ is the probability of occurrence of the transition $(s, s')$ in the product $p$. In the figures, a profile is drawn as several arrows, one per probability value, and each arrow has a guard that describes the set of products that yield this probability value.

Definition 4
A Featured Discrete-Time Markov Chain (FDTMC) is a tuple $(S, s_0, d, \Pi, A, L)$ where:
• $S$ is a finite, non-empty set of states;
• $s_0 \in S$ is the initial state;
• $d$ is a feature diagram;
• $\Pi : S \times S \to ([\![d]\!] \to [0, 1])$ is the transition probability function, which assigns a probability profile to each transition. Equivalently, for each starting state and each product, it gives a probability distribution on the target states. Any probability profile must satisfy the probability axiom for all the products:
\[ \forall p \in [\![d]\!], \forall s \in S, \sum_{s' \in S} \Pi(s, s')(p) = 1 \]
• $A$ is a set of atomic predicates;
• $L : S \to 2^{A}$ labels each state with the set of predicates that hold there.

An FDTMC is a concise representation for a family of DTMCs, one per valid product. The DTMC modelling a particular variant $p$ is obtained by projecting the probability profile of each transition onto $p$. The transition probability function of the resulting DTMC is defined as $P : S \times S \to [0, 1]$ with $P(s, s') = \Pi(s, s')(p)$. The formalism generalises DTMCs since, when $[\![d]\!]$ is a singleton, an FDTMC is simply a DTMC.

All usual operations on probabilities can be extended by considering them as functions of the product. For instance, the product of two (independent) probability profiles $\Pi$ and $\Pi'$ is defined as $(\Pi \otimes \Pi')(p) = \Pi(p) \cdot \Pi'(p)$.

A path of an FDTMC is a sequence of its states. Given that transition probability depends on features, the execution probability of a path does as well. Let $s \in S$ be a state of an FDTMC, and $Paths(s)$ the set of paths starting from $s$. Due to the Markov property, the probability that a finite path $\rho = \rho[0], \rho[1], \ldots, \rho[n] \in Paths(s)$ is executed is given by the product of the probability profiles of its transitions:
\[ \Pi(\rho) = \Pi(\rho[0], \rho[1]) \otimes \cdots \otimes \Pi(\rho[n-1], \rho[n]). \]
This constitutes the base of the unique family of probability measures on paths, by the usual cylinder set construction [?].

For instance, the FDTMC of Figure 3 (where self-loops are omitted) represents a mine that could be equipped with a natural ventilation system, by creating well-placed air entrances. This feature is static: it is selected at the construction of the mine. It cannot prevent the appearance of methane, but it helps dissipate it.

Fig. 3. Methane evolution in a mine with optional ventilation (states: no_methane, methane; no_methane to methane 1/8; methane to no_methane [!V] 3/4, [V] 8/9)
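The definitions above (probability profiles, projection onto a product, and the profile of a path) can be illustrated on the FDTMC of Figure 3. This is a minimal sketch, assuming products are encoded as frozensets of enabled features and profiles as Python functions; all identifiers (`PI`, `project`, `path_profile`, ...) are hypothetical and chosen for illustration only.

```python
from fractions import Fraction as F

# Products are frozensets of enabled features; V is the only feature of Fig. 3.
PRODUCTS = [frozenset(), frozenset({"V"})]
STATES = ["no_methane", "methane"]


def const(x):
    """Probability profile that ignores the product."""
    return lambda p: x


def on_feature(f, if_on, if_off):
    """Probability profile that depends on a single boolean feature."""
    return lambda p: if_on if f in p else if_off


# Pi maps each transition (s, s') to a probability profile (self-loops explicit).
PI = {
    ("no_methane", "methane"): const(F(1, 8)),
    ("no_methane", "no_methane"): const(F(7, 8)),
    ("methane", "no_methane"): on_feature("V", F(8, 9), F(3, 4)),
    ("methane", "methane"): on_feature("V", F(1, 9), F(1, 4)),
}


def well_formed(pi, products, states):
    """Probability axiom: for every product and state, outgoing values sum to 1."""
    return all(
        sum(pi[(s, t)](p) for t in states if (s, t) in pi) == 1
        for p in products
        for s in states
    )


def project(pi, product):
    """DTMC of one product: P(s, s') = Pi(s, s')(p)."""
    return {edge: profile(product) for edge, profile in pi.items()}


def path_profile(pi, path):
    """Profile of a finite path: pointwise product of its transition profiles."""
    def profile(p):
        result = F(1)
        for s, t in zip(path, path[1:]):
            result *= pi[(s, t)](p)
        return result
    return profile


rho = ["no_methane", "methane", "no_methane"]
rho_profile = path_profile(PI, rho)
```

For the path `rho`, the profile evaluates to 1/8 · 3/4 = 3/32 in the product without ventilation and to 1/8 · 8/9 = 1/9 in the product with ventilation.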
FDTMC is a fundamental formalism, which is not meant to be used directly by engineers. Because of the flexibility of probability profiles, FDTMCs may be hard to write down. Many states may be needed to entirely represent the system and its environment. Moreover, many transitions have a variable probability value that depends on the features of the system. When the number of features and transitions grows, it becomes increasingly hard to have a comprehensive view of the stochastic behaviour of every variant. Therefore we introduce ways to decompose the description. A problem that we meet is that (F)DTMCs are meant to describe closed systems: we can decompose a system into (F)DTMCs only if the sub-systems are completely independent and do not communicate. We thus introduce Markov Decision Processes (MDP) as an intermediary to decompose our systems.
C. FMDP: a Composable Stochastic Model
FMDPs are a model adequate to represent composable, communicating, stochastic, featured systems. They generalise MDPs, FTSs and FDTMCs. In this paper, we use FMDPs only to create the final FDTMC that will be model-checked. However, they are of independent interest, and there are tools that deal with MDPs directly [?], [?].

Definition 5
A Featured Discrete-Time Markov Decision Process (FMDP) is a tuple $(S, s_0, Act, d, P, A, L)$ where:
• $S$ is a finite, non-empty set of states;
• $s_0 \in S$ is the initial state;
• $Act$ is a finite set of actions;
• $d$ is a feature diagram;
• $P : S \times Act \times S \to ([\![d]\!] \to [0, 1])$ is the transition probability function, which assigns a probability profile to each transition. Equivalently, for each starting state, each action and each product, it gives either a probability distribution on the target states, or always 0 to indicate that the action is not enabled. It must thus satisfy the consistency axiom:
\[ \forall p \in [\![d]\!], \forall s \in S, \forall a \in Act, \sum_{s' \in S} P(s, a, s')(p) = 1 \text{ or } 0 \qquad (1) \]
• $A$ is a set of atomic predicates;
• $L : S \to 2^{A}$ labels each state with the set of predicates that hold there.

When actions are always enabled, the FMDP is complete:
Definition 6
An FMDP is complete if
\[ \forall p \in [\![d]\!], \forall s \in S, \forall a \in Act, \sum_{s' \in S} P(s, a, s')(p) = 1 \]

When $P$ returns either 0 or 1, an FMDP reduces to a deterministic FTS. When there is a unique action (that we call tick), a complete FMDP reduces to an FDTMC. When $d$ has a unique product, an FMDP reduces to an MDP.

For instance, the FMDP of Fig. 4 (where self-loops are omitted) represents the evolution of the water level in the mine. The water can rise spontaneously (due to flooding) but it can also decrease spontaneously (due to evaporation and infiltration). Running the pump favours the decrease of the water level. Note that here the guards contain dynamic predicates (written in lowercase) and not static features.

Fig. 4. Water evolution in a mine with a pump (states: low, normal, high; transition labels: [run] 3/4, [run] 1/8, [run] 1/4, 3/8, [!run] 1/4, [!run] 1/9, [!run] 1/4)
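The consistency axiom (1) and the completeness condition of Definition 6 are easy to check mechanically. Below is a minimal sketch on a toy water-level model with a single product. The two actions `idle` and `run` stand for the pump being off or on, and every number is invented for illustration: these are not the probabilities of Fig. 4.

```python
from fractions import Fraction as F

STATES = ["low", "normal", "high"]
ACTIONS = ["idle", "run"]
PRODUCTS = [frozenset()]  # a single product: this FMDP is a plain MDP


def const(x):
    """Probability profile that returns x for every product."""
    return lambda product: x


# P[(s, a)] maps each target state to a probability profile.
# Hypothetical numbers, self-loops explicit.
P = {
    ("low", "idle"): {"low": const(F(3, 4)), "normal": const(F(1, 4))},
    ("normal", "idle"): {
        "low": const(F(1, 8)),
        "normal": const(F(5, 8)),
        "high": const(F(1, 4)),
    },
    ("high", "idle"): {"normal": const(F(1, 8)), "high": const(F(7, 8))},
    ("normal", "run"): {"low": const(F(1, 2)), "normal": const(F(1, 2))},
    ("high", "run"): {"normal": const(F(3, 4)), "high": const(F(1, 4))},
    # ("low", "run") is absent: in this toy model "run" is disabled on low water.
}


def total(P, s, a, product):
    """Sum over all targets s' of P(s, a, s')(p); 0 when the action is disabled."""
    return sum((prof(product) for prof in P.get((s, a), {}).values()), F(0))


def is_consistent(P, states, actions, products):
    """Axiom (1): every sum is either 1 (enabled) or 0 (disabled)."""
    return all(
        total(P, s, a, p) in (0, 1)
        for s in states for a in actions for p in products
    )


def is_complete(P, states, actions, products):
    """Definition 6: every action is enabled in every state and product."""
    return all(
        total(P, s, a, p) == 1
        for s in states for a in actions for p in products
    )
```

In this toy model the FMDP is consistent but not complete, because `run` is disabled in state `low`. Restricted to the single action `idle` it is complete, and can therefore be read as a DTMC.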
While most authors (e.g. [?]) give MDPs a semantics under a probabilistic infinite-memory scheduler that chooses the actions, we will not do so here, since our goal is to eliminate this non-determinism by composing the processes.

D. Composition
To make FDTMC descriptions more manageable, we use classical composition operators:
• the synchronized product, borrowed from PRISM reactive modules: actions can be synchronized in the style of CSP, and each process can read the predicates of other processes in its guards, but can only change its own state. We model these guards in the actions. The probabilistic choices of each component are independent.
• the observer product is asymmetric: the second process can immediately observe the predicates of the first one, but not conversely. Adding the converse would create a causality loop. Controllers are usually modelled as observers: it is assumed that their reaction is much faster than the environmental evolution, and can be considered instantaneous.

For instance, the FDTMC of the mine pump can be separated into an FDTMC (i.e. without inputs) for methane (Figure 3), an MDP (i.e. without features) for water (Figure 4), and an FTS (i.e. without probabilities) for the controller (Figure 1). Let us assume that, in a system without ventilation ($\neg V$), there is no_methane and the water level is normal, whereas the controller is ready. At the next discrete point of time, the probability of reaching the state with methane and high water is the product of the probabilities of the two independent transitions: 1/8 for the appearance of methane (Figure 3), multiplied by the probability that the water level rises from normal to high while the pump is not running (Figure 4). In the FTS, there are two outgoing transitions of ready that are labelled by an event including this outcome: one leads to run and is labelled by event high and feature expression $W \wedge \neg A$; the other leads to emergency and is labelled by event methane and feature expression $A$. Hence, the system will reach state run if it is equipped with feature W but not A, will reach state emergency if feature A is enabled, and will stay in state ready otherwise.

When composing featured systems, we must also compose their FDs. We use the operator $d_1 \wedge d_2$ defined by $[\![d_1 \wedge d_2]\!] = \{p \in Mod(\Sigma_1 \cup \Sigma_2) \mid p|_{\Sigma_1} \in [\![d_1]\!], p|_{\Sigma_2} \in [\![d_2]\!]\}$.

Definition 7
The synchronized product of two FMDPs $M_1 = (S_1, i_1, Act_1 \times 2^{A_2}, d_1, P_1, A_1, L_1)$ and $M_2 = (S_2, i_2, Act_2 \times 2^{A_1}, d_2, P_2, A_2, L_2)$ (with $A_1$, $A_2$ disjoint) is the FMDP $M = M_1 \parallel M_2$ given by:
• $S = S_1 \times S_2$
• $s_0 = (i_1, i_2)$
• $Act = Act_1 \cup Act_2$
• $d = d_1 \wedge d_2$
• $A = A_1 \cup A_2$
• $L((s_1, s_2)) = L_1(s_1) \cup L_2(s_2)$
• If $a \in Act_1 \cap Act_2$:
\[ P((s_1, s_2), a, (s'_1, s'_2))(p) = P_1(s_1, (a, L_2(s_2)), s'_1)(p|_{\Sigma_1}) \cdot P_2(s_2, (a, L_1(s_1)), s'_2)(p|_{\Sigma_2}) \]
If $a \in Act_1 \setminus Act_2$:
\[ P((s_1, s_2), a, (s'_1, s'_2))(p) = P_1(s_1, (a, L_2(s_2)), s'_1)(p|_{\Sigma_1}) \cdot (s_2 = s'_2) \]
If $a \in Act_2 \setminus Act_1$:
\[ P((s_1, s_2), a, (s'_1, s'_2))(p) = (s_1 = s'_1) \cdot P_2(s_2, (a, L_1(s_1)), s'_2)(p|_{\Sigma_2}) \]
Above, $(s = s')$ is a function that returns 1 when $s$, $s'$ are equal, and 0 otherwise.

Theorem 1
If $M_1$, $M_2$ are complete FMDPs, then their synchronized product $M_1 \parallel M_2$ is also a complete FMDP.

This definition can also be used between two FDTMCs, by considering them as single-action complete FMDPs. Then, they synchronize on their common unique action, and their synchronized product represents their execution, synchronized on time steps, but probabilistically independent. The theorem above shows that the result is again a single-action complete FMDP, i.e. an FDTMC.

The product of FMDPs without shared actions represents their interleaved execution.

In the observer product, the actions that drive the observer will again be the sets of atomic predicates of the observee, but they are now read immediately (hence the prime in the definition of transitions).
Definition 8
The observer product of an FMDP $M_1 = (S_1, i_1, Act, d_1, P_1, A_1, L_1)$ with an FMDP $M_2 = (S_2, i_2, 2^{A_1}, d_2, P_2, A_2, L_2)$ (with $A_1$, $A_2$ disjoint) is $M = M_1 \otimes> M_2$ given by:
• $S = S_1 \times S_2$;
• $i = (i_1, i_2)$
• $Act = Act$
• $d = d_1 \wedge d_2$
• $P((s_1, s_2), a, (s'_1, s'_2))(p) = P_1(s_1, a, s'_1)(p|_{\Sigma_1}) \cdot P_2(s_2, L_1(s'_1), s'_2)(p|_{\Sigma_2})$
• $A = A_1 \cup A_2$
• $L((s_1, s_2)) = L_1(s_1) \cup L_2(s_2)$

Note that the definition above does not always yield a proper FMDP: the probabilities could sum to a number strictly between 0 and 1.
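The transition functions of the synchronized product (Definition 7) and of the observer product (Definition 8) can be sketched in a few lines. This is an illustrative encoding, not the authors' implementation: the `FMDP` record, the function names, the two-level water chain and its numbers, and the simplified alarm controller are all assumptions made for the example. Components receive their action exactly as in the definitions, i.e. a pair (action, observed labels) in the synchronized product and a set of observed labels for the observer.

```python
from dataclasses import dataclass
from fractions import Fraction as F
from typing import Callable


@dataclass
class FMDP:
    states: list
    actions: set
    signature: set    # features of the component's feature diagram
    label: Callable   # state -> frozenset of atomic predicates
    prob: Callable    # (s, action, s', product) -> probability


def restrict(product, signature):
    """p|Sigma: keep only the features that belong to the signature."""
    return frozenset(product) & frozenset(signature)


def sync_prob(m1, m2, s, a, t, p):
    """Definition 7: a component that does not know action a must not move."""
    (s1, s2), (t1, t2) = s, t
    p1, p2 = restrict(p, m1.signature), restrict(p, m2.signature)
    move1 = (m1.prob(s1, (a, m2.label(s2)), t1, p1)
             if a in m1.actions else F(int(s1 == t1)))
    move2 = (m2.prob(s2, (a, m1.label(s1)), t2, p2)
             if a in m2.actions else F(int(s2 == t2)))
    return move1 * move2


def observer_prob(m1, m2, s, a, t, p):
    """Definition 8: the observer reads the labels of the observee's target."""
    (s1, s2), (t1, t2) = s, t
    p1, p2 = restrict(p, m1.signature), restrict(p, m2.signature)
    return m1.prob(s1, a, t1, p1) * m2.prob(s2, m1.label(t1), t2, p2)


def methane_prob(s, action, t, p):
    """Fig. 3 as a single-action complete FMDP; the action is ignored."""
    leave = F(8, 9) if "V" in p else F(3, 4)
    return {("no_methane", "methane"): F(1, 8),
            ("no_methane", "no_methane"): F(7, 8),
            ("methane", "no_methane"): leave,
            ("methane", "methane"): 1 - leave}[(s, t)]


def water_prob(s, action, t, p):
    """Hypothetical two-level water chain (numbers invented)."""
    return F(1, 4) if s != t else F(3, 4)


def controller_prob(s, observed, t, p):
    """Hypothetical deterministic observer: reacts to methane only with A."""
    if "A" not in p:
        target = s
    else:
        target = "emergency" if "methane" in observed else "ready"
    return F(1) if t == target else F(0)


def own_state(s):
    """Label each state with a single predicate named after the state."""
    return frozenset({s})


METHANE = FMDP(["no_methane", "methane"], {"tick"}, {"V"}, own_state, methane_prob)
WATER = FMDP(["normal", "high"], {"tick"}, set(), own_state, water_prob)
CONTROLLER = FMDP(["ready", "emergency"], set(), {"A"}, own_state, controller_prob)
```

With these toy components, `sync_prob(METHANE, WATER, ("no_methane", "normal"), "tick", ("methane", "high"), frozenset())` multiplies the two independent moves (1/8 · 1/4), and `observer_prob(METHANE, CONTROLLER, ("no_methane", "ready"), "tick", ("methane", "emergency"), frozenset({"A"}))` yields 1/8 because the controller reacts within the same step.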
Theorem 2
Let $M_1$, $M_2$ be FMDPs with $Act_2 = 2^{A_1}$ as above. Then, $M_1 \otimes> M_2$ is consistent for axiom (1) if $M_2$ is complete.

Further, we have:
Theorem 3
Let $M_1$, $M_2$ be FMDPs with $Act_2 = 2^{A_1}$ as above. Then, $M_1 \otimes> M_2$ is a complete FMDP if $M_1$, $M_2$ are complete.

In particular, an FDTMC observed by a complete FMDP is again an FDTMC.

These operators are only a basis for a usable language. We plan to unify fPromela [?] and Probmela [?] to obtain a modelling language that is easy to use (at least for Promela modellers).

The FDTMC modelling the mine pump SPL is obtained by computing the synchronized composition of the FDTMC for methane and the FMDP for water, while the deterministic completed FTS representing the controller is composed as an observer. The theorems above show that the result is indeed an FDTMC. It has three boolean features (W, A, V), 8 products, and 24 states. For comparison, the Linux product line has about 10 000 features.

V. VERIFICATION OF PROBABILISTIC PROPERTIES
By combining Markov models with variability information, we pave the way for automating the verification of probabilistic properties in SPLs. We also need a language to express those properties; here we use PCTL [?], which can express reliability properties. We want to assess the satisfaction of such properties for all the valid products of a given SPL. An example of property to verify is "Which products guarantee that the probability that the pump eventually runs in presence of methane is less than 0.1?". PCTL formulae are defined according to the following syntax:
\[ \Phi ::= true \mid a \mid \Phi \wedge \Phi \mid \neg \Phi \mid P_J(\Psi) \]
\[ \Psi ::= X\,\Phi \mid \Phi\, U^{\leq t}\, \Phi \mid \Phi\, U\, \Phi \]
where $J \subseteq [0, 1]$ is an interval, $t \in \mathbb{N} \cup \{\infty\}$, and $a \in A$ is an atomic proposition. Formulae generated from $\Phi$ are referred to as state formulae, since they can be evaluated to either true or false in every state of a product. $P$ is named the probability operator, and $P_J(\Psi)$ specifies that the probability that $\Psi$ is satisfied from a given state must be within interval $J$. Formulae generated from $\Psi$ are named path formulae, and their truth is to be evaluated for each execution path. The temporal operator $X$ is called Next, and $X\,\Phi$ means that $\Phi$ must hold in the next state. $U$ is called Until. Intuitively, a path satisfies $\Phi_1\, U\, \Phi_2$ iff $\Phi_2$ holds in some state of the path and $\Phi_1$ holds in every preceding state. $U^{\leq t}$ is a bounded variant of the until: a path satisfies $\Phi_1\, U^{\leq t}\, \Phi_2$ iff it satisfies $\Phi_1\, U\, \Phi_2$ in at most $t$ steps. We propose three methods to verify FDTMCs against PCTL formulae.

A. Enumerative Model Checking
Our first method, called enumerative, checks an FDTMC using standard DTMC verification algorithms. To that aim, it computes the projection of the probability profiles of the FDTMC to every product, and then model checks the resulting DTMCs individually. Computing the projection requires applying the probability profile of every transition of the FDTMC. When the FDTMC is modelled as a product of components, one may instead compute the projection of each component and then rebuild the same product from the resulting MDPs (without features). Further, if no feature appears in a component, like the DTMC of the example, the projection does nothing. The correctness of this method is guaranteed by the fact that the projection operator distributes over the synchronized and observer products.
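The enumerative scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PRISM-based tool chain used in the paper: the toy model, its probabilities, and the helper names (`all_products`, `project`, `reach_probability`, `enumerative_check`) are hypothetical, every subset of features is assumed to be a valid product, and a plain value iteration stands in for a real DTMC model checker.

```python
from itertools import combinations

def all_products(features):
    """Every subset of the feature set (assumes no feature-diagram constraints)."""
    return [frozenset(c) for r in range(len(features) + 1)
            for c in combinations(features, r)]

def project(fdtmc, product):
    """Apply each transition's probability profile to one product: a plain DTMC."""
    return {edge: profile(product) for edge, profile in fdtmc.items()}

def reach_probability(dtmc, states, target, start, iterations=200):
    """Lower approximation of P(eventually target) by value iteration."""
    x = {s: 1.0 if s == target else 0.0 for s in states}
    for _ in range(iterations):
        x = {s: 1.0 if s == target
             else sum(p * x[t] for (u, t), p in dtmc.items() if u == s)
             for s in states}
    return x[start]

def enumerative_check(fdtmc, states, features, target, start, bound):
    """For every product, decide whether P(eventually target) < bound."""
    return {p: reach_probability(project(fdtmc, p), states, target, start) < bound
            for p in all_products(features)}

# Hypothetical toy FDTMC: feature "f" lowers the failure probability.
# Each transition carries a profile, i.e. a function from products to probabilities.
STATES = ["s0", "fail", "done"]
FDTMC = {
    ("s0", "s0"): lambda p: 0.5,
    ("s0", "fail"): lambda p: 0.02 if "f" in p else 0.2,
    ("s0", "done"): lambda p: 0.48 if "f" in p else 0.3,
    ("done", "done"): lambda p: 1.0,
}

result = enumerative_check(FDTMC, STATES, ["f"], "fail", "s0", bound=0.1)
```

Each product is projected and checked in isolation, so the loop over `all_products` makes the exponential cost in the number of features explicit.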
Theorem 4. Let $M, N$ be FMDPs. Then $\forall p \in [\![d]\!]$, $M|_p \parallel N|_p = (M \parallel N)|_p$ and $M|_p \otimes_{>} N|_p = (M \otimes_{>} N)|_p$.

An undeniable advantage of the enumerative approach is the possibility to reuse the most efficient state-of-the-art tools for single systems, with their numerous optimisations. In particular, matrix analytic methods [?], [?] cannot be applied to FDTMCs because, in these models, probability distributions cannot be represented as a two-dimensional stochastic matrix; it would be a matrix of profiles. Another advantage is that this method is appropriate for product-sampling-based verification, where only a (small) subset of the products is verified. The enumerative approach, however, does not benefit from the commonalities between the products and thus performs more redundant verifications as the number of products increases.

B. Parametric Model Checking
In this second method, we propose to convert an FDTMC into a parametric DTMC whose parameters are the features, and to reuse existing methods for model checking parametric DTMCs.

In an FDTMC, probability distributions are represented by the aforementioned probability profiles. We defined a probability profile as a function that associates a product $p$ with a probability value. We can encode it as a parametric expression whose parameters are the features. For this purpose, we sum up the values that it can return, each weighted by a parametric expression designating the corresponding products. We denote by $\epsilon(p)$ the parametric expression corresponding to the product $p$. Assume that the features of $d$ are Boolean, i.e. each has a value of either 0 (absent) or 1 (present), which is the most common case. Let $p = \{f_1, \dots, f_k\} \in [\![d]\!]$ be the features of a product and $\{f'_1, \dots, f'_j\}$ the set of features that this product does not have. Then $\epsilon(p)$ is given by

$$\epsilon(p) = b_1 \times \cdots \times b_k \times b_{k+1} \times \cdots \times b_{k+j}, \tag{2}$$

$$b_i = \begin{cases} f_i, & i = 1, \dots, k,\\ 1 - f'_{i-k}, & i = k+1, \dots, k+j. \end{cases} \tag{3}$$

This parametric expression is equal to 1 if we assign the value 1 to all the features of $p$ and the value 0 to all the others. For any other product, the expression is equal to 0.

Using the above encoding, we represent the profile $\Pi$ as the parametric expression

$$\sum_{p_i \in [\![d]\!]} \epsilon(p_i)\,\Pi(p_i). \tag{4}$$

When considering a particular product $p$, every term of this sum except one is equal to 0; the remaining term gives the probability. Note that if several products share the same probability value, we can drastically simplify the above sum. In particular, if the probability value depends only on a feature $f$, then we can rewrite Equation (4) as $f \times \alpha_1 + (1 - f) \times \alpha_2$, where $\alpha_1$ (resp. $\alpha_2$) is the probability of the transition $t$ when the feature $f$ is enabled (resp. disabled).

By applying the above encoding, we reduce FDTMC verification to parametric DTMC verification, where the parameters model the presence or absence of Boolean features. By doing so, we can benefit from efficient parametric model checkers like PARAM [?] and our tool [?]. Given a parametric model, these tools return an expression over the parameters that encodes the probability we want to compute. To determine the actual probability that a given product satisfies the property, we replace, in the expression, each feature by 1 if it belongs to the product, or by 0 otherwise. The $P$ operator then yields a parametric inequality, which we need to transform into a Boolean formula. In the worst case, this can be done by evaluating the expression once per valid product, at the cost of exponential time.

An advantage of this algorithm is that it performs only one exploration to compute the expression. However, this expression becomes increasingly complex as the number of (feature-dependent) transitions to explore grows.

C. Feature-Aware Bounded Search
Our third method relies on a novel algorithm that explores the FDTMC and keeps track of the variability and probability information obtained during the search. Its first step is to decompose the PCTL formula into its parse tree, i.e. a tree where each node is a state subformula of the original formula, such that the root is the formula itself, the leaves are atomic propositions, and child nodes form the direct subformulae of their parent. Similarly to the CTL model checking algorithm for FTS [?], we compute the satisfaction sets of all the subformulae bottom-up. The satisfaction set of a state formula $\Phi$ is the set $Sat(\Phi) \subseteq S \times [\![d]\!]$ such that $(s, p) \in Sat(\Phi)$ iff $p$ satisfies $\Phi$ starting from $s$, noted $p \in [\![(s \models \Phi)]\!]$. Therefore, $s \models \Phi$ can be regarded as a Boolean formula that encodes the set of products for which $s$ satisfies $\Phi$. The satisfaction rules of PCTL formulae are given as follows.

Definition 9. Let $M$ be an FDTMC over an FD $d$, and $s \in S$ one of its states. Then the satisfaction of a PCTL state formula by $s$ is determined according to the following rules:

$$\begin{array}{lcl}
s, p \models \top & \Leftrightarrow & \top\\
s, p \models a & \Leftrightarrow & a \in L(s)\\
s, p \models \Phi_1 \wedge \Phi_2 & \Leftrightarrow & s, p \models \Phi_1 \wedge s, p \models \Phi_2\\
s, p \models \neg \Phi & \Leftrightarrow & \neg (s, p \models \Phi)\\
s, p \models P_J(\Psi) & \Leftrightarrow & \Pi(s \models \Psi)(p) \in J
\end{array}$$

where $\Pi(s \models \Psi)$ is a probability profile defined as $\Pi(s \models \Psi)(p) = \Pi(\{\rho \in Paths(s) \bullet \rho, p \models \Psi\})$. The satisfaction of path formulae is defined as follows:

$$\begin{array}{lcl}
\rho, p \models X\,\Phi & \Leftrightarrow & \rho[1], p \models \Phi\\
\rho, p \models \Phi_1\,U\,\Phi_2 & \Leftrightarrow & \exists j \bullet (\rho[j], p \models \Phi_2 \wedge \forall\, 0 \leq k < j \bullet \rho[k], p \models \Phi_1)\\
\rho, p \models \Phi_1\,U^{\leq t}\,\Phi_2 & \Leftrightarrow & \exists j \leq t \bullet (\rho[j], p \models \Phi_2 \wedge \forall\, 0 \leq k < j \bullet \rho[k], p \models \Phi_1)
\end{array}$$

with $\rho = \rho[0], \rho[1], \dots$

Since we have to use a probability profile for each state to compute the $P_J$ operator, we also encode satisfaction sets this way, using only 0 or 1 as results: $\mathbb{1}_{s \models \Phi}$ with

$$\mathbb{1}_{s \models \Phi}(p) = \begin{cases} 1, & s, p \models \Phi,\\ 0, & \text{otherwise.} \end{cases}$$

Apart from the probability operator, the computation of the satisfaction sets follows the same procedure as in [?]. A solution to determine for which products a state $s$ belongs to the satisfaction set of $P_J(\Psi)$ is to compute the probability profile $\Pi(s \models \Psi)$. If $\Psi = X\,\Phi$, then this profile is computed by:

$$\Pi(s \models X\,\Phi) = \sum_{s' \in S} \Pi(s, s') \otimes \mathbb{1}_{s' \models \Phi}.$$

Indeed, for each product $p$, the probability that $s$ satisfies $X\,\Phi$ is equal to the probability that, in $p$, $s$ reaches a state satisfying $\Phi$ in one transition.

When $\Psi = \Phi_1\,U^{\leq t}\,\Phi_2$, the probability profile is computed by solving the following recursive equations:

\begin{align}
\Pi(s \models \Phi_1\,U^{\leq 0}\,\Phi_2) &= \mathbb{1}_{s \models \Phi_2} \tag{5}\\
\Pi(s \models \Phi_1\,U^{\leq i}\,\Phi_2) &= \mathbb{1}_{s \models \Phi_2} \oplus {} \tag{6}\\
&\phantom{{}={}} (1 - \mathbb{1}_{s \models \Phi_2}) \otimes \mathbb{1}_{s \models \Phi_1} \otimes {} \tag{7}\\
&\phantom{{}={}} \sum_{s' \in S} \bigl(\Pi(s, s') \otimes \Pi(s' \models \Phi_1\,U^{\leq i-1}\,\Phi_2)\bigr) \tag{8}
\end{align}

where $i > 0$. Indeed, according to the until operator, the probability that, in a product $p$, $s$ satisfies $\Phi_1\,U\,\Phi_2$ in zero steps is 1 if $s$ satisfies $\Phi_2$, and 0 otherwise.
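The bounded-until recursion can be sketched as a direct iteration over probability profiles. The snippet below is an illustrative assumption rather than the authors' prototype: profiles are represented as dictionaries from products to numbers, the function name `bounded_until` and the toy model are hypothetical, and the loop simply re-evaluates the recursion k times instead of using the stack-based exploration of Algorithm 1.

```python
def bounded_until(states, products, trans, sat1, sat2, k):
    """Return x with x[s][p] = probability that s satisfies Phi1 U<=k Phi2 in product p.

    trans[(s, t)][p] is the probability of the transition s -> t in product p;
    sat1[s] / sat2[s] are the sets of products in which s satisfies Phi1 / Phi2.
    """
    # Base case: with zero steps, the profile is the indicator of Phi2.
    x = {s: {p: 1.0 if p in sat2[s] else 0.0 for p in products} for s in states}
    for _ in range(k):
        nxt = {}
        for s in states:
            nxt[s] = {}
            for p in products:
                if p in sat2[s]:
                    # Phi2 already holds: the indicator term contributes 1.
                    nxt[s][p] = 1.0
                elif p in sat1[s]:
                    # Phi2 fails but Phi1 holds: weighted sum over direct successors.
                    nxt[s][p] = sum(prof[p] * x[t][p]
                                    for (u, t), prof in trans.items() if u == s)
                else:
                    # Neither holds: the path formula cannot be satisfied from s.
                    nxt[s][p] = 0.0
        x = nxt
    return x

# Hypothetical toy model with one feature "f" that lowers the failure probability.
F, NO_F = frozenset({"f"}), frozenset()
PRODUCTS = [F, NO_F]
STATES = ["s0", "fail", "done"]
TRANS = {
    ("s0", "s0"): {F: 0.5, NO_F: 0.5},
    ("s0", "fail"): {F: 0.02, NO_F: 0.2},
    ("s0", "done"): {F: 0.48, NO_F: 0.3},
}
SAT1 = {s: set(PRODUCTS) for s in STATES}                    # Phi1 = true
SAT2 = {"s0": set(), "fail": set(PRODUCTS), "done": set()}   # Phi2 = failure

x2 = bounded_until(STATES, PRODUCTS, TRANS, SAT1, SAT2, k=2)
```

Because every product is handled pointwise inside one pass over the states, a single run yields the whole profile; a symbolic representation of profiles would additionally share work between products with equal values.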
The probability that $s$ satisfies $\Phi_1\,U\,\Phi_2$ in $i > 0$ steps in $p$ is the probability that it satisfies the formula in zero steps or in $j$ steps, with $0 < j \leq i$. To satisfy the formula in $j$ steps in $p$, $s$ must (1) not satisfy $\Phi_2$ in $p$ (otherwise it satisfies the formula in zero steps), (2) satisfy $\Phi_1$ in $p$ (otherwise it cannot satisfy the formula at all), and (3) reach a direct successor $s'$ that satisfies the formula in at most $i - 1$ steps.

The correctness of the above equations follows directly from the semantics of the bounded until, the definition of projection, and the expansion laws of the bounded until operator [?]. In the case of the unbounded until, the probability value is classically obtained by removing the superscripts in the equations and solving the resulting system of linear equations [?]. But we can also obtain the solution by iterating these equations for an increasingly high bound $k$. We obtain lower approximations of the desired probability values, which increase with $k$ and tend to the exact value.

We iterate these equations by performing a bounded exploration of the FDTMC. Algorithm 1 presents this exploration procedure. Following the principles of FTS model checking [?], the algorithm ensures that a given path is visited only once. The idea of this algorithm is to start from the set of states that satisfy $\Phi_2$ for at least one product. Then we perform a backward exploration to discover new paths that satisfy $\Phi_1\,U^{\leq k}\,\Phi_2$. For each state found along these paths, we record a variable $x_s$, a probability profile that is a lower approximation of $\Pi(s \models \Phi_1\,U\,\Phi_2)$ and will eventually reach a value no lower than $\Pi(s \models \Phi_1\,U^{\leq k}\,\Phi_2)$; therefore it tends to $\Pi(s \models \Phi_1\,U\,\Phi_2)$ with higher $k$.

First, for each state $s$, we record the set of products for which $s$ satisfies $\Phi_2$ (Line 3). As explained above, this can be encoded as the probability profile $\mathbb{1}_{s \models \Phi_2}$.
If there is at least one such product, for every possible predecessor $s'$ of $s$ that can reach $s$ in one of those products, we push the transition from $s'$ to $s$ together with a number indicating that the algorithm analyses this transition as part of a path of length 1 (Lines 4–6). Next, we iteratively analyse the transitions on the stack in order to explore paths of greater lengths. Let $(s, i, s')$ be the top element of the stack (Line 9). If $i$ exceeds $k$, we skip the element since our approximation bound is reached. Otherwise, we apply the equations to recalculate $x_s$ using the new probability values found for $s$ (Line 11). Note that by doing so, we might go above $\Pi(s \models \Phi_1\,U^{\leq k}\,\Phi_2)$, since $x_{s_u}$ might already include paths longer than the current path, thus giving an even better approximation of $\Pi(s \models \Phi_1\,U\,\Phi_2)$. If one of the values changed, we may have to update the value of every predecessor of $s$. To that aim, we push a new triplet $(s'', i+1, s)$ onto the stack, where $s''$ is a predecessor of $s$. The algorithm always terminates since there is a finite number of paths of length bounded by $k$. The correctness is ensured by Equations (5) and (8).

For the bounded until, the algorithm is similar, but it works in $k$ phases. It uses two profiles: $x_s$ contains the result of the previous iteration, $x_s = \Pi(s \models \Phi_1\,U^{\leq i}\,\Phi_2)$, and the algorithm computes a new profile $x'_s = \Pi(s \models \Phi_1\,U^{\leq i+1}\,\Phi_2)$. The "stack" is first emptied of $i$-edges before dealing with $(i+1)$-edges, which amounts to a breadth-first search.

The advantage of this bounded method is that it checks all the products in one exploration. Unlike the parametric method, our feature-aware search does not require evaluating a rational expression for each of the products. Instead, for each PCTL state formula, it returns a Boolean formula encoding which products satisfy it.

VI. EXPERIMENTS
In this section, we report the results obtained by evaluating the performance of the three FDTMC verification techniques in terms of verification time. We consider two technical case studies as our benchmarks, which we systematically extend to obtain larger models. All the models are available at http://info.fundp.ac.be/~pys/fdtmc/.

Algorithm 1 Feature-aware bounded search
Require: An FDTMC, two PCTL state formulae $\Phi_1$ and $\Phi_2$, an integer bound $k \geq 0$.
Ensure: For each $s \in S$, $\Pi(s \models \Phi_1\,U^{\leq k}\,\Phi_2)$.
 1: $Stack \leftarrow []$;
 2: for $s' \bullet \mathbb{1}_{s' \models \Phi_2} \neq 0$ do
 3:   $x_{s'} \leftarrow \mathbb{1}_{s' \models \Phi_2}$;
 4:   for $s \bullet \Pi(s, s') \otimes \mathbb{1}_{s' \models \Phi_2} \neq 0$ do
 5:     $Stack \leftarrow push(Stack, (s, 1, s'))$;
 6:   end for
 7: end for
 8: while $Stack \neq []$ do
 9:   $(s, i, s') \leftarrow pop(Stack)$;
10:   if $i \leq k$ then
11:     $new \leftarrow \mathbb{1}_{s \models \Phi_2} \oplus (1 - \mathbb{1}_{s \models \Phi_2}) \otimes \mathbb{1}_{s \models \Phi_1} \otimes \sum_{s_u \in S} \Pi(s, s_u) \otimes x_{s_u}$;
12:     if $new > x_s$ then
13:       $x_s \leftarrow new$;
14:       for $s'' \bullet \Pi(s'', s) \neq 0$ do
15:         $Stack \leftarrow push(Stack, (s'', i+1, s))$;
16:       end for
17:     end if
18:   end if
19: end while
20: return $x$

The first case study is an abstract model of failure-prone systems. In this model, the system has to go through successive degradation states to eventually reach an absorbing failure state. In every degradation state, however, instead of going to the next degradation state, the system may completely break and reach a second absorbing failure state. In every degradation state, the system may also partially recover and reach the previous degradation state. Apart from the initial state and the two absorbing states, the probability of the transitions leaving each state depends on the presence or absence of specific features. The model is extended by adding new degradation states and features.

The second case study is an abstract model of a service provider system that gives its users the opportunity to invoke different services. The execution of a service is modelled by a sequence of states. During such executions, the system may fail and suddenly reach an (absorbing) failure state. After any service execution, the system may keep executing more services or may go to an absorbing successful-termination state. Each service requires a specific feature to be started, hence the variability within such a system. Unlike in the first model, the behaviours of the features are completely independent in this model. We enlarge the model by gradually adding new services, which also increases the number of features.

For both examples, we checked that the probability that the system reaches a failure state is below 0.1. This reachability property can be expressed in PCTL as $P_{<0.1}(\Diamond\, failure)$, where $\Diamond\,\Phi$ abbreviates $\mathit{true}\,U\,\Phi$.

All benchmarks were run on a Dual Intel(R) Xeon(R) CPU E5530 @2.40GHz with 8 GB of RAM, running GNU/Linux Ubuntu Server 11.04 64-bit. To perform the enumerative verification, all the DTMCs modelling a specific product are first derived from the FDTMC and then verified one by one using the PRISM model checker [?]. For the parametric approach, we use the parametric model checker developed by Filieri et al. [?], and then evaluate the resulting expression using the JEP Java library. For the bounded approach, we use a prototype we developed from scratch. The latter is available at http://info.fundp.ac.be/~pys/fdtmc/ as well.

Fig. 5. The verification time of the failure-recovery case study (x-axis: number of features; y-axis: verification time in seconds).

The total verification time of the enumerative approach is the sum of the model-checking times needed by PRISM to verify each single product. We excluded the time to produce individual DTMCs as well as the time taken to create PRISM input files. As for the parametric approach, the verification time is obtained by summing the parametric verification time and the time spent evaluating the expression for each single product. Since the bounded approach verifies all products together, its verification time equals the total time taken by the prototype tool. In all the experiments, we set the bound of the algorithm such that the maximum precision error always stays below a fixed small threshold.

Figure 5 shows the verification times for the failure-recovery case study. In this case, the number of features $f$ grows from 2 to 16. It turns out that the bounded approach outperforms the others in almost every case. The verification time of the enumerative approach grows exponentially with the number of features, as expected.
We also observed that the parametric approach suffers from the growing complexity of the rational function.

Table I reports the time each of the verification techniques takes to verify the models of the service provider case study. The results show that the enumerative approach takes much longer, while the other two techniques exhibit similar performance. In contrast to the first case, the parametric approach outperforms the bounded technique for most model sizes.

TABLE I
THE VERIFICATION TIME OF SERVICE PROVIDER CASE STUDY (IN SECONDS)

Features   Enumerative   Parametric   Bounded
2          4.207         0.237        0.111
4          6.932         0.267        0.306
6          14.57         0.336        0.314
8          29.224        0.408        0.519
10         57.215        0.53         0.676
12         119.544       0.572        1.636
14         227.91        0.99         1.931
16         466.185       1.126        2.966

The results of our experiments suggest that the enumerative approach is increasingly inefficient as the number of features (and hence of products) increases. Still, if a verification task only deals with a small number of products, the enumerative approach is a reasonable choice. The cost of the parametric approach is highly dependent on the complexity of the rational function that is produced by the parametric model checker. This complexity varies depending on the topology of the verified model. In the first case, where feature-dependent transitions occur sequentially, the verification time grows faster than in the second case, where the features are scattered around the model. Our third algorithm exhibits good performance in both experiments. In the second case, it remains competitive in spite of the high efficiency of the parametric approach. Our theory is that the parametric approach performs better in models where feature-dependent transitions do not often occur in sequence, that is, when there is a limited number of feature interactions. On the contrary, if many of these sequences occur in the model, then the size of the function returned by the parametric algorithm grows sharply. In such cases, our feature-aware bounded search should be used instead.

VII. RELATED WORK
Analyzing non-functional properties of software systems has received increasing interest in recent years. However, only a few works discuss this issue for SPLs [?], [?]. Recent research addressed the problem at the level of feature models or available source code. Our approach instead focuses on the use of behavioural models. Feature models are suitable to specify and organize the variability of SPLs, but they are not expressive and precise enough for quality analysis. On the other hand, analyzing the source code of an SPL is only possible after the implementation, and is not applicable at the early stages of development [?]. Ghezzi and Molzam Sharifloo [?], as well as Nunes [?], propose to use parametric model checking to check PCTL formulae on all the variants of an SPL. Yet, their modelling formalisms do not include an explicit notion of features and they do not propose alternative verification algorithms.

Given the increasing popularity of product lines in various areas including critical systems, SPL verification methods are actively studied, although most of them do not consider non-functional properties. The work surrounding FTS is the most closely related to ours [?], [?], [?], [?]. Several alternatives to FTS exist. Larsen et al. [?] show that I/O automata are convenient for modelling product lines as open systems. Asirelli et al. [?] equip modal transition systems with a logic able to express constraints on variable behaviour. Gruler et al. [?] extend the process algebra CCS with variability operators, which make it possible to model alternative choices between two processes. Li et al. [?] model both the base systems and optional features as finite state machines that connect to each other. Apel et al. [?] specify features as separate units that can be composed; each feature defines safety properties that are subsequently verified using single-system model checkers.

Outside the context of software product lines, we find the notions of Interval Markov chain [?] and Constraint Markov chain [?]. The former is a generalization of Markov chains where the probabilities of transitions are given by probability intervals rather than constant values. Interval Markov chains thus concisely model a potentially infinite set of Markov chains. Then, it is possible to determine whether or not all those Markov chains satisfy a given PCTL property. Constraint Markov chains are a generalization of Interval Markov chains, where probability distributions are defined by parameters. The values that a parameter can take are determined by linear constraints between the parameters (e.g. a linear equality over α + β). This kind of parameterization is more general than that of FDTMCs with Boolean features as presented in this paper. By extending FDTMCs with numeric features [?], we obtain a formalism as expressive as Constraint Markov chains. Moreover, to the best of our knowledge, there exists no algorithm for checking Constraint Markov chains against properties expressed in PCTL.

VIII. CONCLUSION
In this paper, we tackled the verification of non-functional requirements in software product lines. We extended probability theory with notions of variability, and applied this extension to define DTMCs enriched with variability operators, i.e. FDTMCs. We showed that when the inner behaviour of a software system is not stochastic, we can model a stochastic SPL as the composition of an FTS modelling the system and a set of DTMCs modelling the stochasticity of the environment. We discussed and experimented with three methods to model check FDTMCs, including a novel algorithm that directly exploits the variability information contained in the model. Benchmarks suggest that enumerative verification of the whole product line takes a long time compared with the other two methods.

In the future, we plan to develop a complete tool chain that will provide assistance in both the specification and verification of stochastic product lines. We will support the specification of FDTMCs either via a graphical user interface or through high-level textual languages. Moreover, our tool will make use of the compositional modelling approach presented in this paper. To achieve the verification, we plan to implement and optimize our feature-aware bounded algorithm within the tool and to link parametric model checkers to it.

A natural direction for our future work is the extension of the proposed approach to other kinds of Markov models, for example continuous-time Markov chains and Markov reward models, which are widely used to reason about other kinds of non-functional requirements such as performance and energy consumption. The enrichment of Markov processes as presented in Section III provides a formal framework from which these new models can be derived. Moreover, our compositional modelling method can be applied to these models as well. Still, the major challenge remains the design of efficient verification algorithms.

Also, we want to deepen the current approach to answer optimization problems.
Nowadays, optimizing non-functional requirements is an important challenge in a wide range of areas. For instance, we could consider the problem of finding the most reliable and economical products within a product line. Since an FDTMC can be regarded as a parametric model with Boolean variables only (namely the features), addressing this problem leads us into the particular domain of Boolean linear optimization. As an alternative to model checking and algebraic methods, we will also investigate simulation-based methods for computing probabilities and rewards [ ??